
That is it folks, the Mischievous round is done. Once again, I am personally devastated that the Doctor Who representatives (The Master and the Daleks) did not make it beyond the first round. Thanks to those (looking in your direction, sqlrockstar and @emoore@empireiron.com) who kept the dream alive. Perhaps we will take this as a powerful message. Next year, no Doctor Who. Humpf.

 

Who else avoided a first round elimination? Our official results will post shortly, but here is a recap of some of the most popular and hotly contested match ups. 

  • We thought that this match up would be closer by far, but the power of the Dark Side and a choke hold were too much for Voldemort's weak geek cred.
  • Bahlkris thinks these two are a winning sitcom combination (dark sense of humor, indeed)... the pilot episode could be a rematch to determine who gets the bigger bedroom. In the meantime, this round goes to Khan over Hannibal.
  • And Skeletor's Cinderella story continues, with a complete trouncing of Jaws.

 

So, we have a new set of match-ups, likely more random than before.

 

Voting in the Rotten round starts around 10 am CT today. We still need your help to determine the most infamous of the baddies.

We are a solid three months into 2015, the first quarter is nearly out, and we should start seeing the seeds, or even some fruit, of the various ‘2015 Predictions’ made by “industry” folks. Rather than wait until the end of the year to see what might happen versus what might not: what are *you* seeing? You are the community, you are the people for whom the predictions toll, but most importantly you are the ones who actually decide IF they will come to fruition or not.

 

We’ve seen predictions around Cloud adoption and the implementation of Software Defined Data Centers, whether that means components of Software Defined Storage, Software Defined Networking, or other similar capabilities. I don’t even want to bring up the fact that each of the past 10 years has been predicted to be “The Year of VDI,” and every year sadness follows because wide-scale adoption keeps people saying, “one more time…”

 

So what are you seeing? Let’s forget the Analysts, the Trade Press, even my own predictions, because they mean nothing if they don’t actually happen. (No, this isn’t some kind of obscure Nostradamus stuff :)) The best predictions play out as self-fulfilling prophecies, and you are in the driver’s seat of IT to see things start happening, or to *want* things to start happening.

 

So now that we have a few months under our belt, what are some of the predictions you DO see happening, and which others would you *like* to see happen? Fortunately, the industry and marketplace are mature enough that a majority of the solutions already exist to adopt; it’s just a matter of…

 

Which ones are resonating with you?

Management wants to be able to track employees’ productivity and performance for use in periodic employee evaluations. Performance improvements can lead to keeping your job, bonuses, and pay increases, while declines in performance lead to pay decreases, black marks in your HR file, and possibly losing your job. When accurate metrics are used in evaluations, they can be beneficial to both the organization and the individual.

 

One type of metric is the Service Level Agreement, or SLA. This defines how IT, as a group, is going to respond to an issue that a user reports. Some organizations define the SLA as the time it takes to resolve the issue. While some issues can be resolved quickly, like a password change, other issues can take a long time to diagnose and correct. The SLA can also be measured by how long it takes to make the initial response to the request. While I believe this is a better metric than time to resolution, it can be abused: the person assigned the ticket can easily make initial contact but not make any progress in diagnosing or resolving the issue.
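To make the difference between the two measurements concrete, here is a minimal Python sketch using made-up ticket fields (opened, responded, and resolved are illustrative assumptions, not any particular ticketing system's schema):

    from datetime import datetime

    # Hypothetical ticket record -- the field names are placeholders.
    ticket = {
        "opened":    datetime(2015, 3, 2, 9, 0),
        "responded": datetime(2015, 3, 2, 9, 20),
        "resolved":  datetime(2015, 3, 4, 16, 45),
    }

    time_to_first_response = ticket["responded"] - ticket["opened"]  # "initial response" SLA
    time_to_resolution     = ticket["resolved"]  - ticket["opened"]  # "time to resolve" SLA
    print("responded in", time_to_first_response, "- resolved in", time_to_resolution)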

 

Another metric is customer satisfaction. One way of getting this metric is using surveys sent to the requestor after the issue is resolved. Not all surveys make it to their destination, and those that do often get ignored or deleted. The questions on surveys are written as multiple choice for easy analysis, but don’t really do much to collect real feedback from the requestor. And if an issue was handled by multiple people, then who does the survey reflect upon, and does the requestor realize this?

 

Managers want to know how well their employees are doing and a way to accurately measure the employee. Ticketing systems have some metrics that can be used to track how well employees are doing. How do you accurately measure employees, in particular Help Desk employees?  What are the metrics that really matter? Can all of these metrics be tracked in one system? How would you like to be measured?

Amit Panchal was one of the Virtualization Field Day 4 delegates we recently presented to in Austin, and I got to thinking we should get to know our friend from “Blighty” a bit better. So, for this month’s IT Blogger Spotlight series we caught up with him.


SW: Tell us a bit about yourself. How did you get your start in IT?

 

AP: I started in IT as a Helpdesk Analyst for a large solutions provider. I actually completed an accountancy degree, but my passion was in IT, so naturally I worked my way up from there.


SW: I guess you could say your interest in all things IT naturally led you to start blogging?

 

AP: I started blogging because I wanted to share what I was learning in the industry with the wider IT community. I’m particularly interested in virtualization so my blog (Amit’s Technology Blog: Demystifying the World of Virtualisation & Technology) allows me to share my own thoughts, and helps me raise my profile in the community.


SW: So, what keeps you busy during the week?


AP: Monday to Friday is hectic with work and managing a team/projects and ad hoc jobs for a fast-paced global manufacturing company. Outside work, I'm very busy in the evenings juggling home life and carving out time to follow the latest tech trends and keeping my social circles buzzing mainly via Twitter (@amitpanchal76). I also try to make time to stay up-to-date with fellow bloggers by listening to different podcasts.

 

SW: Wow, you’re a busy guy! What blogs are you following these days?

 

AP: Some of my favorite blogs are Yellow Bricks, The Saffa Geek and I regularly follow Frank Denneman, Mike Preston and Hans De Leenheer.


SW: I’d say you have your finger on the “IT” pulse. Do you have any favorite tools you use?

 

AP: I use a number of different tools in my day-to-day, but if I had to pick my top tools I’d say vSphere Ops Manager, System Center Operations Manager, and Quest Active Roles Server. I’ve also used SolarWinds Storage Manager. It’s an excellent product for trending, reporting, and real-time analysis, which has recently been followed up with SolarWinds Storage Resource Monitor.


SW: OK, time to talk football (soccer). I see you’re a Manchester United Fan. I’ve got two questions for you. David Moyes or Louis van Gaal?

 

AP: Louis van Gaal all the way! He's made some good moves this season and hasn’t struggled in the way that Moyes did. He has a good track record, so I have high hopes for him.


SW: Ruud van Nistelrooy or Robin van Persie?

 

AP: Ruud van Nistelrooy is my pick as he was lethal in attack and could score in many different ways. He was also less prone to injury!


SW: Other than football, what else do you enjoy when you’re not “plugged in”?

 

AP: When I'm not working or blogging, I'm usually trying to keep control of my two boys. I love traveling and regularly go abroad with my family to escape the UK weather and catch some sun, sea, and sand. I'm also an avid reader and love both fiction and non-fiction.


SW: Escaping this winter weather sounds good right about now. Changing gears for the final question, what trends are you seeing in the industry?

 

AP: I’ve seen a recent explosion in hyper-converged solutions (1 box solutions that are scalable), SaaS apps, hybrid cloud (as companies seek to experiment with workloads off-premise), and flash storage in the data center.

Incident responders: Build or buy?

There is far more to security management than technology. In fact, one could argue that the human element is more important in a field where intuition is just as valuable as knowledge of the tech. In the world of security management, I have not seen a more hotly debated non-technical issue than the figurative “build or buy” question when it comes to incident responder employees. The polarized camps are the obvious ones:

  • Hire for experience. In this model the desirable candidate is a mid-career or senior-level, experienced incident responder. The pros and cons are debatable:
    • More expensive
    • Potentially set in their ways
    • Can hit the ground running
    • Low management overhead
  • Hire for ability. In this model, a highly motivated but less experienced engineer is hired and molded into as close to what the enterprise requires as they can get. Using this methodology the caveats and benefits are a bit different, as it is a longer-term strategy:
    • Less expensive
    • “Blank slate”
    • Requires more training and attention
    • Initially less efficient
    • More unknowns due to lack of experience
    • Can potentially become exactly what is required
    • May leave after a few years

In my stints managing and being involved with hiring, I have found that it is a difficult task to find a qualified, senior-level engineer or incident responder who has the personality traits conducive to melding seamlessly into an existing environment. That is not to say it isn’t possible, but soft skills are a lost art in technology, especially in development and security. In my travels, sitting on hiring committees and writing job descriptions, I have found that the middle ground is the key. Mid-career, still-hungry incident responders who have a budding amount of intuition have been the blue chips in the job searches and hires I have been involved with. They tend to have the fundamentals and a formed gut instinct that make them incredibly valuable and, at the same time, very open to mentorship. Now, the downside is that 40% of the time they’re going to move on just when they’re really useful, but the 60% that stick around a lot longer? They are often the framework that thinks outside the box and keeps the team fresh.

Better Network Configuration Management promises a lot. Networks that are more reliable, and can respond as quickly as the business needs. But it’s a big jump from the way we've run traditional networks. I'm wondering what’s holding us back from making that jump, and what we can do to make it less scary.


We've all heard stories about the amazing network configuration management at the Big Players (Google, Facebook, Twitter, Amazon, etc.): Zero Touch Provisioning, Google making 30,000 changes per month, auto-magic fine-grained path management, and so on. The network is a part of a broader system, and managed as such. The individual pieces aren't all that important - it's the overall system that matters.


Meanwhile, over here in the real world, most of us are just scraping by. I've seen many networks that didn't even have basic automated network device backups. Even doing something like automated VLAN deployment is crazy talk. Instead we're stuck in a box-by-box mentality, configuring each device independently. We need to think of the network as a system, but we're just not in a place to do that.
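Even the "basic automated backups" piece is only a small script away. Here's a rough sketch using the netmiko library; the device list and credentials are placeholders, and a real job would pull its inventory from elsewhere and handle errors:

    from datetime import date
    from netmiko import ConnectHandler

    # Placeholder inventory -- swap in your own devices and credentials.
    devices = [
        {"device_type": "cisco_ios", "host": "10.0.0.1",
         "username": "backup", "password": "secret"},
    ]

    for dev in devices:
        conn = ConnectHandler(**dev)                       # SSH to the device
        config = conn.send_command("show running-config")  # grab the running config
        conn.disconnect()
        filename = "{}_{}.cfg".format(dev["host"], date.today().isoformat())
        with open(filename, "w") as f:
            f.write(config)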


Why is that? What's stopping us from moving ahead? I think it’s a combination of being nervous of change, and of not yet having a clear path forward.


Are we worried about greater automation because we're worried about a script replacing our job? Or do we have genuine concerns about automation running amok? I hear people say things like "Oh our Change Management team would never let us do automated changes. They insist we make manual changes." But is that still true? For server management, we've had tools like Group Policy, DRS, Puppet/Chef/Ansible/etc for years now. No reasonable-sized organisation would dream of managing each of their servers by hand. Change Management got used to that, so why couldn't we do the same for networking? Maybe we're just blaming Change Management as an excuse?


Maybe the problem is that we need to learn new ways of working, and change our processes, and that’s scary. I’m sure that we can learn new things - we’ve done it before. But A) do we want to? and B) do we even know where to start?


If you’re building an all-new network today you’d bake in some great configuration management. But we, as a wider industry, need to figure out how to improve the lot of existing networks. We can’t rip & replace. We’ve got legacy gear, often with poor interfaces that don’t work well with automation toolsets. We need to figure out transition plans - for both technology & people.


Have you started changing the way you approach network configuration management? Or are you stuck? What’s holding you back? Or if you have changed, what steps did you take? What worked, and what didn’t?

I talk (“brag”) a lot about being a SolarWinds® Head Geek®. Everyone on the Head Geek team (Patrick Hubbard, Thomas LaRock, Kong Yang, Don Jacob, and Praveen Manohar) often gushes about how we feel this is the best job in the world.

 

But what does being a Head Geek really mean? Is it all tweets about bacon, sci-fi references on video, and Raspberry Pi-connected coffee machines?

 

Honestly, all that stuff is part of the job. More to the point, being excited about tech, engaging in geek and nerd culture, and being openly passionate about things is part of what makes people want to listen, friend, and follow us Head Geeks.

 

First and foremost, Head Geeks are experts in, and champions of monitoring and automation—regardless of the tools, techniques, or technology. We know our stuff. We are proud of our accomplishments. And, we love to share everything we know.

 

Our core message is, “Monitoring and automation are awesome, and companies can derive a huge benefit both financially and operationally from having effective tools in place. I’d love to tell you more about it. Oh, and by-the-way, the company where I work happens to make some cool stuff to help you get that done.” Education and inspiration first—sales second.

 

In addition, being a Head Geek means:

  • Finding and building your voice as a trusted and credible source of information about monitoring and automation for both companies and IT professionals.
  • Providing information about new technologies, developments, and techniques, and showing how they impact the current state of technology.
  • Building solid opinions about technology based on facts and experience, and then engaging in dialogue and meaningful interactions to share those opinions with the public.
  • Being a proponent not just for pure technology, but also for cultural issues and initiatives within the IT community.
  • Enjoying the thrill of digging into new products and features (regardless of vendor) and exploring their impact to the IT landscape.
  • Mentoring IT pros on honing their skills in using monitoring and automation tools as well as fostering significant and meaningful value, stability, and reliability.
  • Using our platform to shine a light on the awesome achievements of other IT pros, not ourselves.

 

But how does that look day-to-day?

 

We have a love of writing. Our idea of a great day includes cobbling together 800-2000 words for a magazine or blog post; helping tweak the phrasing on a customer campaign email; and scripting demonstrations that teach customers how to fix a common (or not-so-common) issue, use a feature, or gain insight into the way things work.

 

Our love of writing goes hand-in-hand with a love of reading. We consume bushels of blogs, piles of podcasts, and multitudes of magazine articles.

 

We are at ease in front of a crowd. Head Geeks appear in videos (SolarWinds Lab as well as creative videos, tutorials, and demos). We lead webcasts, participate in podcasts, and perform product demonstrations for potential customers. We staff booths and do presentations at trade shows like VMworld®, MS-Ignite, SQL Pass, and Cisco® Live where we get to interact with thousands of IT pros.

 

We are social media savvy. That goes beyond “the big three.” We seek out the hotbeds of IT discussions—from StackExchange and reddit® to thwack® and SpiceWorks®—and we get involved in the conversations happening there. We contribute, ask, answer, teach, and learn.

 

But there is even more that happens behind the scenes.

 

As much as we are a voice of experience and credibility in the IT professional community, our opinions are equally sought after inside SolarWinds.

  • Our experience helps set the direction for new features and even new products.
  • We interact with the development teams and serve as the voice of the customer.
  • We take issues we hear about on the street and make sure they get attention.
  • We contribute insight during the design of user interfaces and process flows.
  • We push product managers and developers to include features that may be hard to implement, but will honestly delight our users.

 

Head Geeks are true “technical creatives.” We are critical to many departments—especially the marketing arm of our organization. We make sure that our message speaks with a sincere, passionate, human voice to our customers. We ensure that marketing and sales literature is technically accurate, but also crafted to highlight the features we know customers will be excited to use.

 

We love a challenge, and we know that the name of the game is change. Can we film two SolarWinds Lab episodes in one day? Let’s give it a shot! Quick, what can we do to help our European team build awareness and excitement? Let’s host a “mega lab” presented in their time zone. How about attending Cisco Live and then doing a week of SolarWinds Lab filming back to back? We’ll be exhausted, but we’re going to get some great material.

 

Being a Head Geek isn’t for everyone. Not because it takes some magical alchemical mixture of traits, nor because it’s a job that has to be earned with years of sweat and toil. It’s not for everyone because not everyone finds the things I just described to be exciting, rewarding, or fun.

 

But for those who do—and you can put me firmly in that camp—it’s the job of a lifetime.

What seems like a lifetime ago I worked for a few enterprises doing various things like firewall configurations, email system optimizations and hardening of Netware, NT4, AIX and HPUX servers. There were 3 good sized employers, a bank and two huge insurance companies that both had financial components. While working at each and every one of them, I was, subject to their security policy (one of which I helped to craft, but that is a different path all together), and none of which really addressed data retention. When I left those employers, they archived my home directories, remaining email boxes and whatever other artifacts I left behind. None of this was really an issue for me as I never brought any personal or sensitive data in and everything I generated on site was theirs by the nature of what it was. What did not occur to me then, though, was that this was essentially a digital trail of breadcrumbs that could exist indefinitely. What else was left behind and was it also archived? Mind you, this was in the 1990s and network monitoring was fairly clunky, especially at scale, so the likely answer to this question is "nothing", but I assert that the answer to that question has changed significantly in this day and age.

Liability is a hard pill for businesses to swallow. Covering bases is key and that is where data retention is a double edged sword. Thinking like I am playing a lawyer on TV, keeping data on hand is useful for forensic analysis of potentially catastrophic data breaches, but it can also be a liability in that it can prove culpability in employee misbehavior on corporate time, resources and behalf. Is it worth it?

Since that time oh so long ago I have found that the benefit has far outweighed the risk in retaining the information, especially traffic data such proxy, firewall, and network flows.  The real issues I have, as noted in previous posts, is the correlation of said data and, more often than not, the archival method of what can amount to massive amounts of disk space.

If I can offer one nugget of advice, learned through years of having to decide what goes, what stays and for how long, it is this: Buy the disks. Procure the tape systems, do what you need to do to keep as much of the data as you can get away with because once it is gone it is highly unlikely that you can ever get it back.

Double double, toil and trouble… Something wicked this way comes. Our third annual Bracket Battle is upon us.

 

Mwuh huh huh huh ha!

 

On March 23, thirty-three infamous individuals begin the battle to rip worlds apart and crush each other until only one remains as the most despicable of all time. It’s VILLAINS time, people!

 

These head-to-head, villain-versus-villain matchups should once again spark controversy. Trust us… every year, we get something stirred up, whether it is the absence of someone or the overrating of another. Y’all are a hard group to please!

 

We have included a wide range of our worst enemies, including cunning and depraved villains who have tried to rule Middle-earth, Castle Greyskull, Asgard, Springfield and beyond. Draw your weapons; it’s time to decide:

 

  • Demi-god or Immortal?
  • Lightsaber or Wand?
  • Clown Prince or Dr. Fava Bean?
  • The Dragon versus The Ring?

 

We came up with the field and decided where we would start, but the power is yours to decide who will be foiled again and which single scoundrel, in the end, will rule them all.

 

But, in a twist no one saw coming, we are changing things up this year.  We are setting up a little trap, ummm, no… giving you a PREVIEW of the bracket today, even though voting will not begin until Monday.

 

Dastardly plans (AKA Rules of Engagement) are outlined below…

 

MATCH UP ANALYSIS

  • For each combatant, we offer a link to the best Wikipedia reference page; just click the NAME link in the bracket.
  • A breakdown of each match-up is available by clicking on the VOTE link.
  • Anyone can view the bracket and the match-up descriptions, but to comment and VOTE you must be a thwack member (and logged IN). 

 

VOTING

  • Again, you have to be logged in to vote and debate… 
  • You may only vote ONCE for each match up
  • Once you vote on a match, click the link to return to the bracket and vote on the next match up in the series.
  • Each vote gets you 50 thwack points!  So, over the course of the entire battle you have the opportunity to rack up 1550 points.  Not too shabby…

 

CAMPAIGNING

  • Please feel free to campaign for your favorites and debate the merits of our match ups to your heart’s content in the comments section and via Twitter/Facebook/Google+, etc.
  • We even have hashtags… #swbracketbattle and #EvilLaugh… to make it a little bit easier.
  • There will be a PDF version of the bracket available to facilitate debate with your henchmen.
  • And, if you want to post pics of your bracket predictions, we would love to see them on our Facebook page!

 

SCHEDULE

  • Bracket Release is TODAY
  • Every round voting will begin at 10 am CT
  • Play-in Battle OPENS March 23
  • The Mischievous round OPENS March 25
  • The Rotten round OPENS March 30
  • The Wicked round OPENS April 2
  • The Vile round OPENS April 6
  • The Diabolical Battle OPENS April 9
  • And, finally, the one true OVERLORD will be announced on APRIL 13

 

If you have other questions… feel free to drop them below and we will get right back with you!

 

So, which of these villains we love to hate can plot and scheme their way to the top of this despicable heap? Whose dastardly plans will rip apart worlds and crush humanity?

 

OK, we will stop our monologue and let you decide.

A finely-tuned NMS is at the heart of a well-run network. But it’s easy for an NMS to fall into disuse. Sometimes that can happen slowly, without you realizing. You need to keep re-evaluating things, to make sure that the NMS is still delivering.

 

Regular Checkups

Consultants like me move around different customers. When you go back to a customer site after a few weeks/months, you can see step changes in behavior. This usually makes it obvious if people are using the system or not.


If you're working on the same network for a long time, you can get too close to it. If behaviors change slowly over time, it can be difficult to detect. If you use the NMS every day, you know how to get what you want out of it. You think it's great. But casual users might struggle to drive it. If that happens, they'll stop using it.


If you're not paying attention, you might find that usage is declining, and you don't realize until it’s too late. You need to periodically take an honest look at wider usage, and see if you're seeing any of the signs of an unloved NMS.

 

Signs of an unloved NMS

Here are some of the signs of an unloved NMS. Keep an eye out for these:

  1. Too many unacknowledged alarms
  2. No one has the NMS screen running on their PC - they only log in when they have to
  3. New devices not added

 

What if things aren't all rosy?

So what if you've figured out that maybe your NMS isn't as loved as you thought? What now? First, don't panic. It's recoverable. Things you can do include:

  1. Talk. Talk to everyone. Find out what they like, and what’s not working. It might just be a training issue, or it might be something more. Maybe you just need to show them how to set up their homepage to highlight key info.
  2. Check your version. Some products are evolving quickly. Stay current, and take advantage of the new features coming out. This is especially important with usability enhancements.
  3. Check your coverage: Are you missing devices? Are you monitoring all the key elements on those devices? Keep an ear to the ground for big faults and outages: Is there any way your NMS could have helped to identify the problem earlier? If people think that the NMS has gaps in its coverage, they won't trust it.

 

All NMS platforms have a risk of becoming shelfware. Have you taken a good honest look recently to make sure yours is still working for you? What other signs do you look for to check if it's loved/loathed/ignored? What do you do if you think it might be heading in the wrong direction?

This is a conversation I have A LOT with clients. They say, "We want logfile monitoring," and I am not sure what they mean. So I end up having to unwind all the different things it COULD be, so we can get to what they actually need.

 

It's also an important clarification for me to make as a SolarWinds Head Geek because, depending on what the requester means, I might need to point them toward Kiwi Syslog Server, Server & Application Monitor, or Log & Event Manager (LEM).

 

Here’s a handy guide to identify what people are talking about. “Logfile monitoring” is usually applied to 4 different and mutually exclusive areas. Before you allow the speaker to continue, please ask them to clarify which one they are talking about:

  1. Windows Logfile
  2. Syslog
  3. Logfile aggregation
  4. Monitoring individual text files on specific servers

 

More clarification on each of these areas below:

Windows Logfile

Monitoring in this area refers specifically to the Windows event log, which isn’t actually a log “file” at all, but a database unique to Windows machines.

 

In the SolarWinds world, the tool that does this is Server & Application Monitor (SAM). Or if you are looking for a small, quick, and dirty utility, the Eventlog Forwarder for Windows will take Eventlog messages that match a search pattern and pass them via Syslog to another machine.

 

Syslog

Syslog is a protocol that describes how to send a message from one machine to another on UDP port 514. The messages must fit a pre-defined structure. Syslog is different from SNMP traps. It is most often found when monitoring *nix (Unix, Linux) systems, although network and security devices send out their fair share as well.
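To show how lightweight the sending side is, here's a minimal Python sketch that emits a syslog message over UDP 514 using only the standard library (the collector hostname is a placeholder; point it at whatever receives your syslog):

    import logging
    import logging.handlers

    # Placeholder collector address -- e.g. a Kiwi Syslog Server instance.
    handler = logging.handlers.SysLogHandler(address=("syslog.example.com", 514))

    logger = logging.getLogger("demo")
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)

    # This arrives at the collector as a single UDP syslog datagram.
    logger.info("link state change on interface Gi0/1")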

 

In terms of products, this is covered natively by Network Performance Monitor (NPM), but as I've said often, you shouldn't send syslog or traps directly to your NPM primary poller. You should send them into a syslog/trap "filtration" layer first. And that would be the Kiwi Syslog Server (or its freeware cousin).

 

Logfile aggregation

This technique involves sending (or pulling) log files from multiple machines and collecting them on a central server. This collection is done at regular intervals. A second process then searches across all the collected logs, looking for trends or patterns in the enterprise. When the audit and security groups talk about “logfile monitoring,” this is usually what they mean.

 

As you may have already guessed, the SolarWinds tool for this job is Log & Event Manager (LEM). I should point out that LEM will ALSO receive syslog and traps, so you kind of get a twofer if you have this tool. Although, I personally STILL think you should send all of your syslog and trap to a filtration layer, and then send the non-garbage messages to the next step in the chain (NPM or LEM).

 

Monitoring individual text files on specific servers

This activity focuses on watching a specific (usually plain text) file in a specific directory on a specific machine, looking for a string or pattern to appear. When that pattern is found, an alert is triggered. Now it can get more involved than that—maybe not a specific file, but a file matching a specific pattern (like a date); maybe not a specific directory, but the newest sub-directory in a directory; maybe not a specific string, but a string pattern; maybe not ONE string, but 3 occurrences of the string within a 5 minute period; and so on. But the goal is the same—to find a string or pattern within a file.
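To make that last variation concrete, here's a rough Python sketch that tails one file and alerts when a pattern shows up 3 times within 5 minutes. The path and pattern are placeholders, and a real template would also handle log rotation, encodings, and so on:

    import re
    import time
    from collections import deque

    LOGFILE = "/var/log/app/app.log"    # placeholder path
    PATTERN = re.compile("ERROR 1234")  # placeholder pattern
    WINDOW, THRESHOLD = 300, 3          # 3 occurrences within 5 minutes

    hits = deque()
    with open(LOGFILE) as f:
        f.seek(0, 2)                    # start at the end of the file, like tail -f
        while True:
            line = f.readline()
            if not line:
                time.sleep(1)
                continue
            if PATTERN.search(line):
                now = time.time()
                hits.append(now)
                while hits and now - hits[0] > WINDOW:
                    hits.popleft()      # drop hits that fell out of the window
                if len(hits) >= THRESHOLD:
                    print("ALERT: pattern seen", len(hits), "times in", WINDOW, "seconds")
                    hits.clear()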

 

Within the context of SolarWinds, Server & Application Monitor has been the go-to solution for this type of thing. But at the moment, it does so only through a series of Perl, PowerShell, and VBScript templates.

 

We know that’s not the best way to get the job done, but that's a subject for another post.

 

The More You Know…

For now, it's important that you are able to clearly define—for yourself and your colleagues, customers, and consumers—the different meanings of "logfile monitoring" and which tool or technique you need to employ to get the job done.

Remote control software is a huge benefit to all IT staff when troubleshooting an issue. There are big benefits to using a service provider to host this functionality for you. There are also many reasons, mainly security, to not use a service provider and instead host the application internally. However, internally hosting a remote control application can cost more in capital expenditure and overhead.

 

When you host something in the cloud, you are giving that service provider responsibility for a significant portion of your security control. Even for something as simple as remote control software, there are concerns about security. For many solutions you have to rely on the authentication mechanism the provider built, although some will allow you to tie authentication into your internal Active Directory. The provider may allow for two-factor authentication. You have to rely on the provider’s encryption mechanism and trust that all signaling (setup, control, and tear down) and data traffic is encrypted with appropriate algorithms. The remote control service provider services not only your hosts but those of many other organizations, and you have to trust them to keep everyone separated. Also, with all of those combined hosts, the service provider is a larger target for attack than your organization may be on its own. When your organization’s Internet connection goes down, you lose the ability to control any of your end hosts from the internal side of your organization’s network. And when you delete an end host or discontinue service from the provider, your data might not be completely deleted.

 

Hosting a remote control application within your own organization can be difficult in itself. You have to have the infrastructure to host the application. If you want redundancy, the application has to support it and you need even more infrastructure. Then you need to make sure you update the application on your server(s), on top of ensuring the end hosts are up to date, which requires planning, testing, and change control. If you expose your internal remote control application to the Internet, like a service provider would, then you need to monitor it for potential intrusions and attacks, and defend against them. That may require additional infrastructure and add complexity. If your organization’s Internet connection goes down and you are on the inside of your organization, then you lose connectivity to all of the remote hosts. If you are external, then you lose connectivity to all of the internal hosts.

 

There is no one solution that fits everyone’s needs. As a consultant I have seen many different solutions and have ones that I prefer. Do you use a remote control solution from a service provider or do you have one you host yourself? Why did your organization choose that one?

I’ve had a few members ask about my silly Pi Day TwitterBot hack, and why someone would even want to do such a thing.  The real answer is a geek compulsion, but the thinking went something like this:

 

Pi Day 2015 was going to give us Pi via the date to 10 digits, but if you included the milliseconds you could go to 13 digits.  Well, to be fair, you could go to 1,000,000 digits if you had a sufficiently accurate timer to produce a decimal second with enough precision.  But let’s face it: true millisecond accuracy in IT gear is unlikely anyway.   Are you happy with the clarification, real-time programmers?!
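For anyone who wants to check the digit math, a quick sanity test (purely for illustration):

    PI = "3.14159265358979"           # reference digits
    date_and_time = "3.141592653"     # 3/14/15 9:26:53  -> 10 digits
    with_millis   = "3.141592653589"  # ...plus .589     -> 13 digits
    print(PI.startswith(date_and_time), PI.startswith(with_millis))  # True True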

 


I realized that if I could loop tightly enough to trigger at a discrete millisecond boundary, I could do something at that fateful moment.  And because a geek can, a geek should.  So what should happen at that moment? Do a SQL update, save a file, update a config?  No, there was only one thing to do: tweet.  The next trick was to create a bot. I use Raspberry Pis for just about all maker projects now.  After years of playing with microcontrollers I finally switched over.  Pis are cheaper than Arduinos when you consider adding I/O, they run a full Linux OS, and many add-on boards work with them. (And yes, I monitor my Pis at home with Orion, so be sure to check out Wednesday’s SolarWinds Lab, which is all about monitoring Linux.)

 

But one thing neither Arduinos nor Pis have is a real-time clock, which is a little bit of a problem if you’re planning to do time-sensitive processing.  So here’s the general setup for the project. I’ll spare you the actual code because 1) I made it in < 20 minutes, 2) no one will ever need it again, and 3) mostly because it’s so ugly I’m embarrassed.  I used Python because there were libs aplenty.

 

The Hack

 

  1. Find “real enough time,” i.e., an accurate offset.  I’m too cheap to buy a GPS module, so I used NTP. However, a single NTP sync isn’t nearly enough to get millisecond-ish accuracy, plus the Raspberry’s system (CPU) clock drifts a bit.  So first, we need to keep a moving average of the offset; I used ntplib:

     

    >>> import ntplib
    >>> c = ntplib.NTPClient()
    >>> response = c.request('europe.pool.ntp.org', version=3)
    >>> response.offset
    -0.143156766891
    >>> response.root_delay
    0.0046844482421875

     

     

    Next, poll 30 times in a minute and deposit the results into a collections.deque.  It’s a double-ended buffer object, meaning you can add or remove items from either end (it’s easier than implementing a circular buffer).  Adjusting the overall length in 30-sample increments lets you expand the running average beyond a single update cycle.  (There’s a minimal sketch of this rolling-average idea just after this list.)

     

  2. Keep an eye on clock drift.  The actual trigger loop on the Raspberry would need to hammer the CPU, and I didn’t want to get into a situation where I’d be about to trigger on the exact millisecond but get hit with an NTP update pass.  To avoid that, I’d need to fire based on a best guess of the accumulated drift since the previous sync.  So, whenever the NTP sync fired, I saved the previous average offset delta from the internal clock, also into a deque.  On average the Pi was drifting 3.6 secs/day, or 0.0025 secs/min.  Because it was constantly recalculating this value, it corrected for thermal effects and other physical factors, and the drift was remarkably stable.

     

  3. OAuth, the web, and Twitter.  Twitter is REST based, and if I were building an app to make some cash, I’d probably either be really picky about choosing a client library or implement something myself.  But there was no need for that here, so I checked the Twitter API docs and picked tweepy.

     

    import tweepy

    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.secure = True
    auth.set_access_token(access_token, access_token_secret)
    api = tweepy.API(auth)

    # If the authentication was successful, you should
    # see the name of the account print out.
    print(api.me().name)

    # If the application settings are set for "Read and Write" then
    # this line should tweet out the message to your account's timeline.
    api.update_status('Updating using OAuth authentication via Tweepy!')

     

    I gave my app permission to my feed, including updates (DANGER!), generated keys, and that was about it.  Tweepy makes it really easy to tweet, and pretty nicely hides the OAuth foo.

     

  4. The RESTful bit.  As sloppy as NTP really is, it’s nothing compared to the highly variable latency of web transactions.  With a REST call, especially to a SaaS service, there are exactly 10^42 things that can affect round-trip times.  The solution was twofold.  First, make sure the most variable transaction – OAuth – happened well in advance of the actual tweet. Second, you need to know what the average LAN -> gateway -> internet -> Twitter REST service delay is.  Turns out, you guessed it, it’s easy to use a third deque object to do some test polls and keep a moving average to at least guesstimate future web delay.

     

  5. Putting it all together - the ugly bit. The program pseudocode looked a little something like this:

     

    // For all code: twitterTime = time.time() - {offset rolling average} - {predicted accumulated drift}
    // This gives the corrected network time rather than the raw CPU time.
    Do the oAuth
    While (twitterTime < sendTime - 20)
    {
       Do the NTP moving average poll
       Update the clock drift moving average
       Update the REST transaction latency moving average
       Wait 10 minutes
    }
    While (twitterTime < sendTime - 2)
    {
       Do the NTP moving average poll
       Update the clock drift moving average
       Wait 1 minute
    }
    While (twitterTime < sendTime - {REST latency moving average})
    {
       Sleep 1 tick // tight loop
    }
    Send Tweet
    Write tweet time and debug info to a file
    End

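For what it’s worth, here’s a minimal sketch of the rolling-average-in-a-deque idea from step 1 (purely illustrative; the real script stays unpublished, as promised):

    import ntplib
    from collections import deque

    SAMPLES_PER_CYCLE = 30
    # maxlen trims old samples automatically; grow it in 30-sample increments
    # to stretch the running average across more than one update cycle.
    offsets = deque(maxlen=3 * SAMPLES_PER_CYCLE)

    client = ntplib.NTPClient()

    def sample_offset():
        """Poll NTP once and return the current rolling-average offset."""
        response = client.request("europe.pool.ntp.org", version=3)
        offsets.append(response.offset)
        return sum(offsets) / len(offsets)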
 

Move Every Tweet, For Great Justice

 

I watched my Twitter feed Saturday morning from the bleachers at kickball practice, and sure enough at ~9:26 am, there it was.  This morning with a little JSON viewing I confirmed it was officially received in the 53rd second of that minute.

 

Why do geeks do something like this?  Because it’s our mountain, it’s there and we must climb it.  There won’t be another Pi day like this, making it singular and special and in need of remembrance.  So, we do what we do. The only question is how closely did I hit the 589th millisecond?  Maybe if I ask Twitter, really nicely...

Support centers in organizations are under constant pressure due to the increasing volume of service tickets and the growing number of end-users to manage. The complexity and diversity of support cases make it all the more difficult to provide timely resolution, considering lean support staff and tight deadlines. So, how can help desk admins increase the efficiency of the help desk process and ultimately deliver service faster? Considering all the things you do, the question to ask next is: “Where can I save time in all my daily goings-on?” Conserving time on repetitive, less important, and menial tasks can help you gain that time back for actual ticket resolution.

 

Here are 5 useful time-saving strategies for improved help desk productivity:

 

#1 DON’T GO INVENTING FIXES. SOMEONE MIGHT HAVE ALREADY DONE THAT.

Not all service tickets are unique. It is very common for different users to have faced the same issue in the past. The smart move here is to track repeating help desk tickets and their technician assignments, and to capture the best resolution applied in an internal knowledge base. This way, it will never be a new issue to deal with from scratch. Any new technician can look up the fix and resolve the issue quickly.

 

#2 KNOW WHAT YOU ARE DEALING WITH.

Before jumping the gun, assuming you know exactly what problem you are dealing with, and starting to fix it, make sure you have elicited all the details about the issue from the end-user. Sometimes it might just be that the user doesn’t know how to use something, or it is such a simple fix that the user can do it themselves. So, don’t settle for vague ticket descriptions. Make sure you get as many details from the user about the issue as you can before you start providing the solution.

 

#3 PROMOTE END-USER SELF-SERVICE.

If your user base is growing and you are receiving tickets for commonplace issues with easy fixes, it is time to start thinking about building an internal self-service portal with updated how-tos and FAQs to help users resolve Level 0 issues themselves. Password reset is still a top call driver for support teams; automating it through a self-service portal will free up a fair share of IT admins’ time.

 

#4 ESCALATE WHEN YOU CAN’T RESOLVE.

While you might feel capable of resolving any level of support ticket, there will be times when you face technical challenges. Finding the cause of slow database response times may not be your forte. That pesky VM always reports memory exhaustion no matter what you do. These are times you must use your judgment and escalate the issue to another technician or your IT manager. Staying worked up over the same issue (going only on a hunch) will not only delay resolution, but will also result in more tickets piling up. Make sure your help desk has proper escalation, de-escalation, and automated ticket routing functionality for cases where SLAs are not met.

 

#5 DO IT REMOTELY.

Yes, personal human contact is the best possible means of communication. However, it can cost you handsomely in time and money if you start visiting your end-users one by one for desktop support. Many service tickets can be resolved remotely if you open a remote session to the user’s system. And if you have additional remote administration tools, you can master the art of telecommuting for IT support.

 

What other tricks of the trade do you have up your sleeve to help fellow IT pros speed up customer support?


Sour Notes in iTunes

Posted by Leon Adato, Mar 11, 2015

On Monday, iTunes was down. But we all expected that because Apple was holding its “Spring Forward” event and was poised to announce a slate of new products.

 

Today, iTunes was down again (or at least parts of it) and this was very NOT expected.

 

The first report of the outage appeared on TheNextWeb.com. They noted that iTunes Connect was down, that you could see music but not buy it, and that several app pages were dead when you clicked them.

 

As is the case with most short-term outages (Apple responded and resolved it within an hour or two) we will likely never know what really happened. And that’s fine. I’m not on the iTunes internal support team so I don’t need the ugly details.

 

But it's always fun to guess, right? Armchair quarterbacking an outage is the closest some of us IT pros get to sports.

 

First, I ruled out security. A simple DDoS or other targeted hack would have defaced the environment, taken out entire sections (or the whole site), and made a much larger mess of things.

 

Second, I took simple network issues off the list. Having specific apps, song purchasing, and individual pages die is not the profile of a failure in routing, bandwidth, or even load balancing.

 

My first choice was storage: if the storage devices that contain the actual iTunes songs and app downloads were affected, that would explain why we saw failures once we got past those initial pages. It could also explain why the failure was geographic (UK and US), while we didn't hear about failures in other parts of the world.

 

My runner-up vote went to database: corrupt records in the database that houses the CMS, which undoubtedly drives the entire iTunes site. Having specific records corrupted would explain why some pages worked and others didn’t.

 

Then CNBC published a statement from Apple apologizing for the outage and explaining it was an internal DNS problem.

 

Whatever the reason, this failure underscores why today’s complex, inter-connected, cloud and hybrid cloud environments need monitoring that is both specific and holistic.

 

Specific because it needs to pull detailed data about disk and memory IOPS, errored packets, application pool member status, critical service status (like DNS), synthetic tests against key elements (like customer purchase actions), and more.

 

Holistic because we now need a way to view how write errors on a single disk in an array affect the application running on a VM that uses the array in its datastore. We need to see when a DNS resolution fails (before the customer tries it) and correlate that to the systems that depend on those name resolutions.
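A synthetic DNS test, for instance, can be as small as this rough Python sketch (the hostname is a placeholder, and a real check would feed an alerting system rather than print):

    import socket
    import time

    HOSTNAME = "store.example.com"   # placeholder for a name your customers depend on

    start = time.time()
    try:
        addresses = socket.gethostbyname_ex(HOSTNAME)[2]
        elapsed = time.time() - start
        print("resolved %s to %s in %.3f s" % (HOSTNAME, addresses, elapsed))
    except socket.gaierror as err:
        print("DNS resolution FAILED for %s: %s" % (HOSTNAME, err))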

 

That means monitoring that can take in the entire environment top to bottom.

 

Yes, I mean AppStack.

 

Hey, Apple internal support: If you want us to set up a demo for you, give us a call.

Users only call the help desk with problems. Some issues, like password resets, are easy to resolve. Others can get very complex, and then add into the mix a user who doesn’t properly describe the issue they are having or exactly what the error message on their screen says. When helping a user with an issue, have you ever asked them to click on something here or there and let you know what pops up on the screen? How long did you wait before you asked if anything different was on the screen, only to have the user say that something had been displayed several minutes ago?

 

 

I am a very visual person, and I need to see the error, or see how long it took for the error message to pop up. An error message that comes back right away can mean something completely different than one that takes a few seconds, and users cannot really convey that timing well. Years ago, when I first started working in IT, I used a product called PCAnywhere that would let me remote control another machine. I could even do it remotely from home via dial-up!

 

 

The ability to remotely see what is happening on a user’s machine makes a huge difference. Today I use a variety of these applications, depending on what my client will support, but they all have a large set of features beyond just remotely controlling the machine. SolarWinds DameWare lets you remotely reboot machines, start and stop processes, view logs, integrate with AD, and take remote control from a mobile device. Other than remote control of a machine, what other features do you use? Which features make it easier for you to troubleshoot issues from wherever you are?

As an admin, how do you ensure that you don’t run out of disk space? In my opinion, thin provisioning is the best option. It reduces the amount of storage that needs to be purchased for any application to start working. Also, monitoring thin provisioning helps you understand the total available free space, so you can allocate more storage dynamically when needed. In a previous blog, I explained how thin provisioning works and the environments it can be useful in. Now I’d like to discuss the different approaches for converting from fat volumes to thin.


Once you’ve decided to move forward with thin provisioning, you can start implementing all your new projects with minimum investment. With thin provisioning, it’s very important to account for your active data (in the fat volume) and to be aware of challenges you might encounter. For example, when conducting a regular copy of existing data from fat volumes to thin, all the blocks associated with the fat volume will be copied to the thin one, ultimately wasting any benefits from thin provisioning.


There are several ways to approach copying existing data. Let’s look at a few:


File copy approach

This is the oldest approach for migrating data from a fat volume to a thin volume. In this method, the old fat data is backed up at the file level and restored as new thin data. The disadvantage of this type of backup and restore is that it’s very time consuming. In addition, this type of migration can cause interruption to the application. However, an advantage of the file copy approach is that it marks zero-value blocks as available to be overwritten.


Block-by-block copy approach

Another common practice is using a tool that does a block-by-block copy from the old array (fat volume) to a new thin volume. This method offers much higher performance compared to the file copy method. However, the drawback to this method is the zero-detection issue: fat volumes have unused capacity that is filled with zeroes, awaiting the eventual probability of an application writing data to it. So, when you do a general migration by copying data block-by-block from the old array to the new, you receive no benefit from thin provisioning. The copied data will still contain the unused zero-blocks, and you end up with wasted space.


Zero-detection

A tool that can handle zero-block detection can also be used. The tool should remove the zero-valued blocks while copying the old array to the new. This zero-detection technology can be software based or hardware based; both can remove zero blocks during a fat-to-thin conversion. However, the software-based fat-to-thin conversion has a disadvantage: the software needs to be installed on a server, which means it will consume a large amount of server resources and will impact other server activities. The hardware-based fat-to-thin conversion also has a disadvantage: it’s on the expensive side.
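To illustrate the zero-detection idea itself (not any vendor’s implementation), here’s a rough Python sketch that copies a volume image block by block and skips blocks that are entirely zeroes, leaving holes in a sparse destination file instead:

    BLOCK = 1024 * 1024            # 1 MiB blocks; real tools use array-specific sizes
    ZERO_BLOCK = b"\0" * BLOCK

    with open("fat_volume.img", "rb") as src, open("thin_volume.img", "wb") as dst:
        while True:
            chunk = src.read(BLOCK)
            if not chunk:
                break
            if chunk == ZERO_BLOCK[:len(chunk)]:
                dst.seek(len(chunk), 1)   # skip ahead: a hole instead of written zeroes
            else:
                dst.write(chunk)
        dst.truncate()                    # keep the logical size equal to the source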


As discussed, all the methods for converting from fat volumes to thin have advantages and disadvantages. But you cannot continue using traditional (fat) provisioning for storage, since fat provisioning wastes money and results in poor storage utilization. Therefore, I highly advise using thin provisioning in your environment; just make sure you convert your old fat volumes to thin ones as part of the move.

 

After you have implemented thin provisioning, you can start the over-committing of storage space. Be sure to keep an eye out for my upcoming blog where I will discuss the over-commitment of storage. 

If you haven’t heard already, SolarWinds’ Head Geeks are available for daily live chat, Monday through Thursday at 1:00PM CT, for the month of March.  kong.yang, adatole, sqlrockstar, and yes, me too, will be online to help answer any questions you may have about products, best practices, or general IT.  Though unlikely, some chump stumpage may occur, so we’ll also have experts from support and dev to make sure we have the best answer for anything you can throw at us.  You’ll find us on the Office Hours event page in thwack: http://thwack.com/officehours

 

My Question

 

Daily Office Hours is part of a thwack & Head Geek experiment to test new ways for the community to reach product experts.   I’m also testing a new tag line for our fortnightly web TV show, SolarWinds Lab http://lab.solarwinds.com, and would love your feedback before I rebuild the show header graphic.

 

What do you think of: What Will You Solve Next?

 

It means something specific to me, but I’d love to get your feedback before I say what I think it means.  Please leave some comments below.  Do you like it?  Is it the kind of thing we ask each other on thwack?  Is it something SolarWinds does/should ask?  Am I taking liberties with the tm?  After you all chime in and let me know what you think, I’ll reply with what I think it means.

 

Thanks as always, we hope to see you in March for Office Hours!

Of all the security techniques, few garner more polarized views than the interception and decryption of trusted protocols. There are many reasons to do it, and a great deal of legitimate concern about compromising the integrity of a trusted protocol like SSL. SSL is the most common protocol to intercept, unwrap, and inspect, and accomplishing this has become easier and requires far less operational overhead than it did even 5 years ago. Weighing those concerns against the information that can be gained by cracking it open and looking at its content is often a struggle for enterprise security engineers because of the privacy implied. In previous lives I have personally struggled to reconcile this, but ultimately decided that the ethics involved in what I consider a violation of implied security outweighed the benefit of SSL intercept. With other options being few, blocking protocols that obfuscate their content seems to be the next logical option; however, with the prolific increase of SSL-enabled sites over the last 18 months, even this option seems unrealistic and, frankly, clunky. Meanwhile, exfiltration of data, anything from personally identifiable information to trade secrets and intellectual property, is becoming a more and more common "currency," and much more desirable and lucrative to transport out of businesses and other entities. These are hard problems to solve.

Are there options out there that make better sense? Are large and medium sized enterprises doing SSL intercept? How is the data being analyzed and stored?

Is User Experience (UX) monitoring going to be the future of network monitoring? I think that the changing nature of networking is going to mean that our devices can tell us much more about what’s going on. This will change the way we think about network monitoring.


Historically we’ve focused on device & interface stats. Those tell us how our systems are performing, but don't tell us much about the end-user experience. SNMP is great for collecting device & interface counters, but it doesn't say much about the applications.


NetFlow made our lives better by giving us visibility into the traffic mix on the wire. But it couldn't say much about whether the application or the network was the pain point. We need to go deeper into analysing traffic. We've done that with network sniffers, and tools like SolarWinds Quality of Experience help make it accessible. But we could only look at a limited number of points in the network. Typical routers & switches don't look deep into the traffic flows, and can't tell us much.


This is starting to change. The new SD-WAN (Software-Defined WAN) vendors do deep inspection of application performance. They use this to decide how to steer traffic. This means they’ve got all sorts of statistics on the user experience, and they make this data available via API. So in theory we could also plug this data into our network monitoring systems to see how apps are performing across the network. The trick will be in getting those integrations to work, and making sense of it all.
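Conceptually, the integration might look something like this rough Python sketch. The endpoint, fields, and authentication are entirely hypothetical, since every vendor's API is different today:

    import requests

    # Hypothetical SD-WAN controller endpoint -- not a real vendor API.
    URL = "https://sdwan-controller.example.com/api/v1/app-performance"

    response = requests.get(URL, headers={"Authorization": "Bearer <token>"}, timeout=10)
    response.raise_for_status()

    # Assumed shape: a list of per-application performance records.
    for app in response.json():
        print("{name}: latency {latency_ms} ms, loss {loss_pct}%".format(**app))
        # From here, each record could be pushed into the NMS as a custom metric.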


There are many challenges in making this all work. Right now all the SD-WAN vendors will have their own APIs and data exchange formats. We don't yet have standardised measures of performance either. Voice has MOS, although there are arguments about how valid it is. We don't yet have an equivalent for apps like HTTP or SQL.


Standardising around SNMP took time, and it can still be painful today. But I'm hopeful that we'll figure it out. How would it change the way you look at network monitoring if we could measure the user experience from almost any network device? Will we even be able to make sense of all that data? I sure hope so.

What is the Geek's Guide to AppStack? Simply put, it's the central repository for all tech content involving the AppStack. If data and applications are important to you, the AppStack Dashboard is for you. The AppStack Management Bundle enables agility and scalability in monitoring and troubleshooting applications. And this blog post will continue to be updated as new AppStack content is created. So bookmark and favorite this post as your portal to all things AppStack. Also, if there is anything that you would like to discuss around AppStack, please comment below.

 

 

AppStack Is

AppStack How-to

  • Application Relationship Mapping for Fast Root Cause Analysis: Application Stack Dashboard
  • Application Stack Dashboard: How to Set Up, Use and Customize
  • Download the Application Stack Management Bundle

AppStack Reference

  • Hear what the Head Geeks had to say in the AppStack Blog Series
  • Ambassador Blogs
  • Application-Centric Monitoring
  • AppStack Social Trifecta
  • Helpful AppStack Resources
  • SolarWinds Tech Field Day Coverage

There are two ways to get things done: the hard way or the easy way. The same holds true with help desk management. Many micro to small businesses do not have the resources to manage their help desk, and end up spending more time and effort doing the job manually: tracking tickets via email, updating statuses on spreadsheets, walking over to the customer's desk to resolve tickets, etc. This is a tedious, time-consuming, and highly ineffective process that causes delays and SLA lapses.

 

Streamlining the help desk process and employing automation to simplify tasks and provide end-user support is the other way: the smart way. If you know what tools to use, and how to get the best benefits from them, you can achieve help desk automation cost-effectively.

 

Here are a few things you should automate:

  • Ticketing management: from ticket creation and technician assignment through tracking and resolution (see the sketch just after this list)
  • Asset Management: scheduled asset discovery, association of assets to tickets, managing inventory (PO, warranty, parts, etc.)
  • Desktop Support: ability to connect to remote systems from the help desk ticket for accelerated support
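To give one small example of the first item above, here's a minimal sketch of automated technician assignment. The Ticket fields, category names, and technician pools are all hypothetical; a real help desk product would expose this through its own rules engine or API:

```python
# Minimal sketch of automated technician assignment, one of the ticketing
# tasks worth automating. Categories and technician pools are hypothetical.
from dataclasses import dataclass
from itertools import cycle

@dataclass
class Ticket:
    subject: str
    category: str          # e.g. "network", "desktop", "email"
    assignee: str = ""

# Hypothetical technician pools per category, cycled round-robin.
POOLS = {
    "network": cycle(["alice", "bob"]),
    "desktop": cycle(["carol"]),
    "email":   cycle(["dave", "erin"]),
}

def auto_assign(ticket: Ticket) -> Ticket:
    """Assign the ticket to the next technician in its category's pool."""
    pool = POOLS.get(ticket.category, cycle(["helpdesk-triage"]))
    ticket.assignee = next(pool)
    return ticket

print(auto_assign(Ticket("VPN drops every hour", "network")).assignee)  # alice
```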

 

Take a look at this Infographic from SolarWinds to understand the benefits of centralized and organized help desk management.

[Infographic: the benefits of centralized, organized help desk management]

 

Learn how to effectively manage IT service requests »

If you've worked in IT for any amount of time, you are probably aware of this story: an issue arises, the application team blames the database, the database admin blames the systems, the systems admin blames the network, and the network team blames the application. A classic tale of finger pointing!

 

But it's not always the admin's fault. We can't forget about the users, often the weakest link in the network.

 

Over the years, I think I’ve heard it all. Here are some interesting stories that I’ll never forget:

 

Poor wireless range


User:     Since we moved houses, my laptop isn’t finding my wireless signal.

Me:        Did you reconfigure your router at the new location?

User:     Reconfigure…what router?

 

The user had been using their neighbor's signal at their previous house. I guess they just assumed they had free Wi-Fi? In fairness, this was almost a decade ago, when many people didn't know they could secure their Wi-Fi.

 

Why isn’t my Wireless working?


User:     So, I bought a wireless router and configured it, but my desktop isn’t picking up the signal.

Me:        Alright, can you go to ‘Network Connections’ and check if your wireless adapter is enabled?

User:     Wait, I need a wireless adapter?

 

Loop lessons


I was at work, and one of my coworkers…let's call him the hyper-enthusiastic newbie. Anyway, the test lab was under construction, lab devices were being configured, and the production network wasn't supposed to be connected to the lab yet. After hours of downtime, the hyper-enthusiastic newbie came to me and said:

 

Newbie:               I configured the switch, and then I wanted to test it.

Me:                        And?

Newbie:               I connected port 1 from our lab switch to a port on the production switch. It worked.

Me:                        Great.

Newbie:               And then to test the 2nd port, I connected it to another port on the production switch.

 

This is a practical lesson in what a switching loop can do to the network.

 

Not your average VoIP trouble


A marketing team member's VoIP phone went missing. An ARP lookup showed that the phone was at a sales rep's desk. The user had decided to borrow the phone for her calls because hers wasn't working. Like I said, not your average VoIP trouble.

 

One of my personal favorites: Where's my email?


User:     As you can see I haven’t received any email today.

Admin: Can you try expanding the option which says today?

 

Well, at least it was a simple fix.


Dancing pigs over reading warning messages


So, a user saw wallpaper of a 'cute dog' online. They decided to download and install it despite the 101 warnings their system threw at them. Before they knew it…issues started to arise: malware, data corruption, and soon every system was down. Oh my!

 

Bring your own wireless


A self-proclaimed techie plugs in a wireless travel router with DHCP enabled. That rogue DHCP server can answer a client's request for an IP before the legitimate server does. As you all know, this can lead to complete mayhem and is very difficult to troubleshoot.

 

Excuse me, the network is slow


I hear it all the time and for a number of reasons:

 

Me:        What exactly is performing slowly?

User:     This download was fine. But, after I reached the office, it has stopped.

Me:        That is because torrents are blocked in our network.

 

That was an employee with very high expectations.

 

Monitor trouble!


Our office often provides a larger monitor to users who aren't happy with their laptop screen size. That said:

User:     My extra monitor displays nothing but the light is on.

Me:       Er, you need to connect your laptop to the docking station.

User:     But I am on wireless now!

 

Because of incidents like these, user education has become a priority at work. However, these situations continue to happen. What are your stories? We'd love to hear them.

Many IT folks, including yours truly, made technology predictions for 2015. These predictions revolved around the buzzworthy tech trends of the moment: hybrid cloud, software-defined data center, converged infrastructure, and containers. In a recent webinar, I hosted three techies to get their take on these tech constructs and to share their E's for successfully navigating the fear, uncertainty, and doubt, aka the FUD factor.


The three industry SMEs were:

  • Christopher Kusek. Chief Technology Officer of Xiologix, LLC. Christopher is an EMC® Elect, a VMware® vExpert, an accomplished speaker, and an author with five published books. Blog: http://pkguild.com
  • John Troyer. Founder of TechReckoning, an independent IT community. He is the co-host of the Geek Whisperers Podcast and consults with technology vendors on how to work better with geeks. Blog: http://techreckoning.com
  • Dennis Smith. An IT veteran who has held a variety of roles. Most recently, Dennis was the Principal Engagement Manager for EMC social marketing, where he led strategy for the EMC Elect influencer program. Blog: http://thedennissmith.com


The three “E’s” from the panelists that stayed with me long after our webcast was over were:

Empowerment

Christopher stated that these technologies are meant to make it easier for IT pros to fulfill their roles and responsibilities. He pointed to root-cause visibility through the entire stack as a requirement for making these technology constructs viable and successful, but also said that tool makers aren't there yet. Instead of hundreds of management tools, he recommended using a few tools that empower IT pros to manage and monitor their ecosystem as they integrate their existing deployments with new technology constructs. He concluded by saying that IT pros can't be an extra in their IT ecosystem action film. Be the main character in an IT world full of characters!


Empathy

John discussed the shift in IT attitudes toward empathy for customers. He focused on shared goals, shared responsibilities, and end-to-end context, but broke the pieces into consumable bites for the IT pro. In other words, the IT pro isn't responsible for knowing how to maintain the entire ecosystem from application development through operations (DevOps). He also discussed the rise of μ-services (micro-services) and how insanely complex monitoring them becomes. Finally, he shared wisdom about how IT pros need to become π-specialists. No, not that kind of pie; the 3.14159 kind. A π-specialist pairs broad IT generalist skills with deep-rooted specialist know-how.


Evolution

Dennis spoke of continuous learning in the continuous application delivery era. He shared his experience building tech communities, engaging customers, and delivering solutions with the latest and greatest technologies. He felt that IT pros need to be adaptable because technology is ever changing; that which can disrupt often does. His advice to IT pros was to pay it forward, because it will be returned to you many times over. And paying it forward simply means sharing your knowledge and expertise without expecting anything in return.


Bridging to Business with Empowerment, Empathy, and Evolution in Clouds & SDDC

The 2015 IT Prediction webinar centered on technology trends that promise agility and scalability. But to get to consumable simplicity, the equation depends on the IT pros, the processes they put in place, and the tools they use to bridge the technology to successful business outcomes. The first tool IT pros need to bridge that gap is one that provides connected visibility from the application down through the data stack.


Hello AppStack! If this resonates with any of you readers out there, I highly recommend you take a look at this Application Stack Management Bundle. You can access a free trial here:

http://bit.ly/AppStackDownload


Better yet, download the tool and then tune in to the live demonstration of the AppStack dashboard to…


On March 12th from 11 AM to 12 PM CT, SolarWinds will demonstrate how to use the AppStack dashboard to troubleshoot application performance issues.

Troubleshooting Performance Issues across the Application Stack

URL: http://bit.ly/AppStackWebcast

Thursday March 12th from 11AM-12PM


IT Pros can even try out the Application Stack Management Bundle while watching the live demo. Ask questions during the demo as SolarWinds SMEs will be on to answer them live. 

Given the current state of networking and security, with the prevalence of DDoS attacks such as NTP monlist, SNMP and DNS amplification, highly targeted techniques like doxing, and, most importantly to many enterprises, exfiltration of sensitive data, network and security professionals are forced to look at creative and often innovative ways to learn about their networks and traffic patterns. This can mean finding and collecting data from many sources and correlating it, or, in extreme cases, obtaining access to otherwise secure protocols.

Knowing your network and computational environment is absolutely critical for classifying and detecting anomalies and potential security infractions. Today's environments are hostile and have often grown organically over time, and obtaining, analyzing, and storing this information is important but frequently expensive. So what creative ways are being used to accomplish these tasks? How is the correlation being done? Are enterprises and other large networks using techniques like full packet capture at their borders? Are you performing SSL intercept and decryption? How is correlation and cross-referencing of security and log data accomplished in your environment? Is it tied into any performance or outage sources?
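As a simplified illustration of the kind of correlation being asked about, the sketch below joins firewall deny events to flow records by source IP within a time window. The record layouts are hypothetical, and a real SIEM or flow collector would do this at far greater scale:

```python
# Minimal sketch of correlating security log events with flow records by
# source IP within a time window. The record layouts are hypothetical.
from datetime import datetime, timedelta

firewall_denies = [
    {"ts": datetime(2015, 3, 10, 9, 15), "src": "10.1.4.23", "rule": "deny-outbound-ssh"},
]
flow_records = [
    {"ts": datetime(2015, 3, 10, 9, 14), "src": "10.1.4.23", "dst": "203.0.113.9",
     "dport": 22, "bytes": 48_000_000},
    {"ts": datetime(2015, 3, 10, 9, 20), "src": "10.1.7.88", "dst": "198.51.100.4",
     "dport": 443, "bytes": 1_200},
]

def correlate(denies, flows, window=timedelta(minutes=5)):
    """Yield (deny, flow) pairs that share a source IP within the time window."""
    for deny in denies:
        for flow in flows:
            if flow["src"] == deny["src"] and abs(flow["ts"] - deny["ts"]) <= window:
                yield deny, flow

for deny, flow in correlate(firewall_denies, flow_records):
    print(f"{deny['src']} hit {deny['rule']} and moved {flow['bytes']} bytes "
          f"to {flow['dst']}:{flow['dport']}")
```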

Capacity planning is an important part of running a network. To me, it’s all about two things: Fewer surprises, and better business discussions. If you can do that, you'll get a lot more respect.


When I was working for an ISP, we had several challenges:

  • Average user Internet usage is steadily increasing
  • Users are moving to higher-bandwidth access circuits, which means even more usage
  • Upstream WAN bandwidth still costs money. Sometimes lots of money.

I built up a capacity planning model that took into account current & projected usage, and added in marketing estimates of future user changes. It wasn’t a perfect model, but it was useful. It gave me something to use to figure out how we were tracking, and where the pain points would be.
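A trimmed-down version of that kind of model might look like the sketch below: it takes a link's current peak utilization and an assumed monthly growth rate (both made-up numbers here) and estimates when the link will cross a planning threshold. My real model fed from usage reports and marketing forecasts, but the arithmetic is the same idea:

```python
# Minimal capacity-projection sketch: given current peak utilization and an
# assumed monthly growth rate, estimate how many months until a link crosses
# a planning threshold. All numbers below are illustrative, not real data.
import math

def months_until_threshold(current_util, monthly_growth, threshold=0.90):
    """Return months until utilization exceeds the threshold (compounding growth)."""
    if current_util >= threshold:
        return 0
    if monthly_growth <= 0:
        return math.inf
    return math.ceil(math.log(threshold / current_util) / math.log(1 + monthly_growth))

# Example: a WAN link peaking at 62% today, growing roughly 4% per month.
print(months_until_threshold(0.62, 0.04))   # ~10 months before it hits 90%
```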


Fewer Surprises

No-one likes surprises when running a network. If your VM runs out of memory, it's easy enough to allocate more. But if your WAN link reaches 90%, it might take weeks to get more bandwidth from your provider. If you hit that peak due to foreseeable growth, it makes for an awfully uncomfortable discussion with the boss. Those links can be expensive too. You'll be in trouble with the bean-counters if the budgets have been set, and then you tell them that you need another $10,000/month. You can't always get it right. There are always situations where a service proves far more popular than expected, or a marketing campaign takes off. But reducing the surprises helps your sanity, and it improves your credibility.


Better Business Discussions

I like to use capacity planning and modeling tools for answering those “What if?” questions. For example, the marketing team will come to you with questions like these:

  • What if we add another 5,000 users to this service? What will that do to our costs?
  • What if we move 10,000 customers from this legacy service to this new one? How will our traffic patterns change?
  • Do we have capacity to run a campaign in this area? Or should we target some other part of the country?
  • Where should we invest to improve user experience?


If you've been doing your capacity planning, then you've got the data to help answer those questions. You get a lot more respect when you're able to have those sorts of discussions, and answer questions sensibly.
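To make that concrete, here's a toy "what if?" calculation in the same spirit: estimating the extra peak bandwidth and monthly cost of adding users to a service. The per-user figure and per-Mbps price are placeholders you'd replace with your own measurements and contract rates:

```python
# Toy "what if?" estimate: extra peak bandwidth and cost of adding users.
# The per-user peak rate and per-Mbps price are placeholders, not real rates.
def what_if_add_users(new_users, peak_mbps_per_user=0.35, cost_per_mbps=8.0):
    """Return (additional peak Mbps, additional monthly cost) for new users."""
    extra_mbps = new_users * peak_mbps_per_user
    return extra_mbps, extra_mbps * cost_per_mbps

mbps, cost = what_if_add_users(5_000)
print(f"Roughly {mbps:.0f} Mbps more at peak, about ${cost:,.0f}/month extra")
```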


This does take real effort though. Getting the data together and making sense of it can be tough, and tying it to business changes is particularly hard. No capacity planning model fully captures everything. But it doesn't have to be perfect: you can always refine it over time.


Are you actively doing capacity planning? How is it helping? (Assuming it is!) If you're not doing any capacity planning, what’s been holding you back? Or have you had any really nasty surprises, where you've run out of capacity in an embarrassing way?

[Image: WhatColorIsThisRouterSolarWinds.jpg]

Admins across the internet are freaking out about the color of this router. Is it BlueGreen or Greenish Blue? Although the debate has raged in datacenters for years, reliable sources, including Adobe, have weighed in on the matter and indicated the router is in fact supposed to be PANTONE 7477. Cisco, however, has never taken a firm stand on the matter, and has more recently given up entirely and gone grey with new systems.

 

It's believed that a combination of aging fluorescent lighting and overwork tracing dark Ethernet affects the color sensitivity of network administrators, leading to the heated disagreement.  Coupled with the erosive effects of X-Rays from CRTs before the arrival of LCDs, fluctuations in caffeine levels and temporary frustration with a device may also affect IT pro color perception.

 

And for the record, I saw the dress as #838CC3 and #5D4C32.

Last month, I had the pleasure and the privilege of sitting down with Ethan Banks and Greg Ferro of PacketPushers.net to share some of my thoughts on monitoring, business justifications, the new features of NPM 11.5, and life in general. You can listen to the conversation here: Show 225 – SolarWinds on The Cost of Monitoring + NPM 11.5

 

 

I gotta tell you, it was an absolute hoot.

 

Getting IT professionals (i.e., geeks) to speak in public is not always an easy task. But Ethan and Greg are consummate conversationalists. They know that the key to getting a really juicy conversation going is to tap into the love and passion that all IT pros have. So we spent a decent amount of time “warming up.” I'm naturally gregarious, so I didn't feel like I needed the time to loosen up, but I appreciated not hitting the microphone cold.

 

Once the tape started rolling, Ethan and Greg kept the banter up while also helping me stay on track. For an attention-deficit-prone guy like me, that was a huge help.

 

But aside from all the mechanics, the best part of the talk was that they were sincerely interested in the topic and appreciative of the information I brought to the table. These are guys who know networking inside and out. They are IT pros who podcast, not talking heads who know enough tech to get by. And they really REALLY care about monitoring.

 

It's the fastest way to anyone's heart, really. Show me that you care about what I care about, and I'm yours.

 

So here's hoping I have the chance to join the PacketPushers gang again this year and share the monitoring love some more. Until then, I've got episode 225 to cherish, and I will be tuning into their regular podcasts to see what else they have to say.

On Tuesday, February 24, we released new versions of all our core systems management products, including Server & Application Monitor, Virtualization Manager and Web Performance Monitor. We also released a brand new product called Storage Resource Monitor. While this is all exciting in and of itself, what we’re most thrilled with is that all these products now have out-of-the-box integration with our Orion platform and include our application stack (AppStack) dashboard view.

 

The AppStack dashboard view is designed to streamline the slow and painful process of troubleshooting application problems across technology domains and reduce it from hours—and sometimes even days—down to seconds. It does this by providing a logical mapping and status between the application and its underlying infrastructure that is generated and updated automatically as relationships change. This provides a quick, visual way to monitor and troubleshoot application performance from a single dashboard covering the application, servers, virtualization, storage, and end-user experience. What’s key here is that this is all done in the context of the application. This means that from anywhere in the AppStack dashboard, you can understand which application(s) is dependent on that resource.
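As a toy illustration of the general idea (and only that; this is not how the AppStack dashboard is implemented under the hood), an application-centric view boils down to a dependency map from each application to its underlying resources, with status rolled up in the application's context:

```python
# Toy illustration of application-centric dependency mapping; not the
# AppStack implementation, just the general idea of rolling status up to
# the application that depends on each resource. All names are made up.
STATUS_RANK = {"up": 0, "warning": 1, "critical": 2}

# Hypothetical relationships: application -> the infrastructure it depends on.
dependencies = {
    "OrderEntry": ["web01", "sql01", "esx-host-3", "datastore-7"],
    "Payroll":    ["app02", "sql01", "esx-host-4"],
}

# Hypothetical current status of each resource.
resource_status = {
    "web01": "up", "app02": "up", "sql01": "warning",
    "esx-host-3": "up", "esx-host-4": "up", "datastore-7": "critical",
}

def app_status(app):
    """An application's status is the worst status of anything it depends on."""
    return max((resource_status[r] for r in dependencies[app]),
               key=STATUS_RANK.__getitem__)

def apps_affected_by(resource):
    """Answer 'which application(s) depend on this resource?' from any element."""
    return [a for a, deps in dependencies.items() if resource in deps]

print(app_status("OrderEntry"))      # critical (because of datastore-7)
print(apps_affected_by("sql01"))     # ['OrderEntry', 'Payroll']
```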

 

In addition to the immediate value we think this will have for you, our users, it also highlights a shift within SolarWinds towards setting our sights on tackling the bigger problems you struggle with on a daily basis. We’ve always sought to do this for specific situations, such as network performance management, application monitoring, IP address management, storage monitoring, etc., but the new AppStack-enabled products help solve a problem that spans across products and technical domains.

 

However, this doesn’t mean SolarWinds is going to start selling complex, bloated solutions that take forever to deploy and are impossible to use. Rather, by intelligently leveraging data our products already have, we can attack the cross-domain troubleshooting problem in the SolarWinds way—easy to use, easy to deploy and something that solves a real problem.

 

But know that the AppStack dashboard view isn’t the end. Really, it’s just the beginning; step one towards making it easier for you to ensure that applications are performing well. Our goal is to be the best in the industry at helping you answer the questions:

 

  • “Why is my app slow?”
  • “How do I fix it?”
  • “How do I prevent it from happening again?”

 

While integrating these four products with the AppStack dashboard view is a great first step, there's clearly a lot more we can do to reach the goal of being the best in the industry at that. Pulling in other elements of our product portfolio that address network, log and event, and other needs, along with adding more hybrid/cloud visibility and new capabilities to the existing portfolio, are all areas we are considering to reach that goal.

 

Easy to use yet powerful products at an affordable price. No vendor lock-in. End-to-end visibility and integration that doesn’t require a plane-load of consultants to get up and running. That’s our vision.

I hope you’ll take a look for yourself and see just how powerful this can be. Check out this video on the new AppStack dashboard view and read more here.

The core application of any help desk workflow is the ticketing system. It helps all levels track issues and requests, and it documents what has been discovered about each particular issue. Some ticketing systems are really complex, with lots of features, and as more and more features are added, complexity is added as well.

 

 

I was recently speaking with someone who said they had emailed their IT ticket system a list of items that needed addressing, even though they knew each issue was supposed to be opened as its own ticket. A few days later, the IT staff worked on one of the issues, then closed the ticket and marked it as resolved. Only one of the issues on the ticket was addressed; the rest never were.

 

 

Since ticket creation via email is so easy, some users may create more tickets than they normally would. Perhaps instead of doing the research in a knowledge base or on a company intranet, users just send in the email to ask IT. I personally have done this because it was easy.

 

 

Email ticket creation is a convenient feature of a ticketing system. Users can easily send in one request or issue at a time to create a ticket, and they don't have to stay on the phone trying to explain the issue while someone transcribes it into a ticket for them. However, every feature has downsides too. Users will inevitably email multiple items at once and IT staff will overlook some of them. The email system or integration will go down. Users will create tickets via email with very generic requests like 'Internet is slow'.
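One way to blunt the "multiple items in one email" problem is to split an inbound message into separate tickets before a technician ever sees it. The sketch below is a rough illustration that assumes users number or bullet their requests; create_ticket() stands in for whatever API your help desk product actually exposes:

```python
# Rough sketch: split an inbound email body into one ticket per numbered or
# bulleted item. create_ticket() is a stand-in for the real ticketing API,
# and real email parsing would come from the product's email integration.
import re

def split_requests(body):
    """Return each numbered/bulleted line as its own request, else the whole body."""
    items = [m.group(1).strip()
             for m in re.finditer(r"^\s*(?:\d+[.)]|[-*])\s+(.+)$", body, re.MULTILINE)]
    return items or [body.strip()]

def create_ticket(requester, summary):
    # Placeholder for the real ticketing API call.
    print(f"Ticket created for {requester}: {summary}")

email_body = """Hi IT,
1. My second monitor shows nothing.
2. Outlook keeps asking for my password.
3. The printer on the 3rd floor is jammed."""

for item in split_requests(email_body):
    create_ticket("jane.doe", item)
```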

 

 

Does a feature like ticket creation via email improve the user experience? Does the benefit of enabling such a feature outweigh the cost of setting it up and maintaining it? Can you train users to send in only one issue per ticket?
