
The Art of Troubleshooting

Level 13

I wrote about visibility via monitoring being the first step in successful IT change management. As an IT pro’s career progresses, they will encounter many breaks and failures in their IT infrastructure. The only guarantee in IT is that something will break, and IT pros have to be able to fix it ASAP. Experience and a solid process framework, coupled with visibility, are key to successfully troubleshooting IT issues.

Troubleshooting is a skill that consists of two parts: root-cause analysis and taking corrective measures. In the past, troubleshooting would include:

  1. Reading the fabulous manual (RTFM)
  2. Working with wonderful vendors… post sales
  3. Patching together a keep-the-lights-on solution with putty and duct tape
  4. Or rebooting all systems and leaving it in the hands of FM – fate & magic

Fast forward to today, and troubleshooting is all about collaboration, i.e., someone has probably already run into this issue and has blogged about it or shared the knowledge on an IT community website like thwack. So troubleshooting becomes as simple as Googling or Binging it – winner, winner, chicken dinner.

But what if you are the first to encounter a problem? Then you’ll need a framework to troubleshoot issues. If you don’t have one, here’s a template framework that you can leverage. Within that framework, root-cause analysis begins with what is happening (a real-time dashboard) and what has happened (logs). Once the problem is identified and the cause and effect are understood, the prescriptive measures can be determined, tested, verified as viable fixes, and deployed into production. Troubleshooting success is measured by the efficiency and effectiveness of the resolution.
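To make the "what has happened (logs)" step concrete, here is a minimal Python sketch that pulls the warnings and errors logged in the minutes leading up to an incident. It is only an illustration: the function name `events_before_incident`, the log line format (`ISO-timestamp LEVEL message`), the 15-minute window, and the sample hostnames are all assumptions made for this example, not part of any particular monitoring product.

```python
from datetime import datetime, timedelta


def events_before_incident(log_lines, incident_time, window_minutes=15,
                           levels=("ERROR", "WARN")):
    """Return (timestamp, level, message) tuples logged in the window
    leading up to incident_time, oldest first.

    Assumes each line looks like: "2024-05-01T09:58:45 ERROR some message".
    """
    window_start = incident_time - timedelta(minutes=window_minutes)
    matches = []
    for line in log_lines:
        parts = line.strip().split(" ", 2)
        if len(parts) < 3:
            continue  # skip lines that do not fit the assumed format
        stamp, level, message = parts
        try:
            when = datetime.fromisoformat(stamp)
        except ValueError:
            continue  # skip lines whose timestamp cannot be parsed
        if level in levels and window_start <= when <= incident_time:
            matches.append((when, level, message))
    return sorted(matches)


# Hypothetical sample data, invented for illustration only.
sample_log = [
    "2024-05-01T09:40:00 INFO backup completed",
    "2024-05-01T09:52:10 WARN disk latency high on san01",
    "2024-05-01T09:58:45 ERROR datastore unreachable from esx03",
    "2024-05-01T10:05:00 ERROR vm web01 failed health check",
]
incident = datetime(2024, 5, 1, 10, 0, 0)
suspects = events_before_incident(sample_log, incident)
for when, level, message in suspects:
    print(when.isoformat(), level, message)
```

With this sample data, the two entries just before 10:00 come back as candidates, while the later health-check failure is left out because it is an effect rather than a possible cause. A short list like this gives you a starting point for working out cause and effect before you test any fix.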

In closing, troubleshooting is a constantly evolving skill for an IT pro. When you think you’ve mastered your environment, new technology always intervenes. So learn the art of troubleshooting like your career depends on it.

Let me know what you think in the comment section below, and feel free to share your troubleshooting process or tips.

Level 15

The scientific method is always your friend.  It also helps to keep track of each detail, making only one change at a time, documenting that change, and then repeating.  After some experience, you will learn to identify the "variables" that can go wrong with a situation and apply one-off analysis to grow your experience and your skills.  Don't be afraid to be involved with all kinds of troubleshooting.  The greater the quantity of issues the greater the quantity of experience gained.  Never ending cycle.

Level 13

Thanks for chiming in and sharing with the Community! Excellent points on the scientific (one thing at a time) method and experience. It is a constant cycle for IT admins.

About the Author
Mo Bacon Mo Shakin' Mo Money Makin'! vHead Geek. Inventor. So Say SMEs. vExpert. Cisco Champion. Child please. The separation is in the preparation.