Tools for IT Operations Management

My last couple of posts focused more on the people and process side of IT Operations Management, but the right tools are also very important. Choosing the right tool for IT Operations Management might be one of the hardest things to do in IT. You have a lot of different requirements from a lot of different teams (Virtualization, Storage, Network, Apps, Business Management, etc.), which makes settling on a single tool very difficult. Notice that I said "right tool" and not "right tools". I said it that way because we are usually looking for the one tool that serves all our needs. This is one of the areas where I think IT Operations teams limit their decision making when evaluating tools. Chances are you are not going to find one tool that meets all the requirements of multiple teams. Too many times I've seen a virtualization team look for an ITOM tool and want it to report on compute, hypervisor, storage, and network. They then limit their search to tools that can perform all those functions, even if the tool is only sub-par at each of them. They end up with a sub-par tool because they wanted "one tool to rule them all". If they had been open to selecting multiple tools to get the job done, they might have ended up with tools that perform each function very well.

One thing that is important when selecting multiple tools for IT Operations Management is that the tools provide a way to integrate. I'm not saying that the tools have to integrate directly with each other, but you need to be able to integrate them into your own process. The tools you select should have APIs or SDKs that allow you to extract the information you need programmatically. You can then feed that information into the tools other groups use, even when those tools don't integrate directly. This lets you start aggregating information and get more of an end-to-end view from apps to infrastructure.
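As a rough illustration of that kind of aggregation, here is a minimal Python sketch. It assumes two hypothetical tools, one for storage and one for network, whose APIs return JSON. The payload shapes, field names (`hostname`, `latencyMs`, `node`, `loss`), and function names are all invented for the example and do not correspond to any real product. In practice the payloads would come from each tool's API or SDK; sample dictionaries stand in for them here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """One normalized measurement, independent of the tool it came from."""
    source: str   # which tool reported it
    host: str     # key used to join data across tools
    metric: str
    value: float


def from_storage_tool(payload):
    """Normalize a hypothetical storage tool's response (assumed shape)."""
    return [
        Reading("storage", row["hostname"].lower(), "latency_ms", float(row["latencyMs"]))
        for row in payload["volumes"]
    ]


def from_network_tool(payload):
    """Normalize a hypothetical network tool's response (assumed shape)."""
    return [
        Reading("network", row["node"].lower(), "packet_loss_pct", float(row["loss"]))
        for row in payload["interfaces"]
    ]


def end_to_end_view(readings):
    """Group readings by host so each host shows data from every tool."""
    view = defaultdict(dict)
    for r in readings:
        view[r.host][f"{r.source}.{r.metric}"] = r.value
    return dict(view)


# Sample payloads standing in for what each tool's API might return.
storage_payload = {"volumes": [{"hostname": "APP01", "latencyMs": "12.5"}]}
network_payload = {"interfaces": [{"node": "app01", "loss": 0.2},
                                  {"node": "db01", "loss": 0.0}]}

readings = from_storage_tool(storage_payload) + from_network_tool(network_payload)
view = end_to_end_view(readings)
for host, metrics in sorted(view.items()):
    print(host, metrics)
```

The point of the small `Reading` type is that each tool only needs one adapter function. Once everything is in a common shape, joining on the host name gives a per-host picture that no single tool provided on its own.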

I would love to hear from others on what they find important when selecting tools and how you managed integrating multiple tools into your process.

  • A key factor is availability of enterprise quality federal support.  Will all contact with the company always be through level one helpdesk personnel?  Will the salesperson managing the relationship be engaged and stay engaged while we work issues?

    Additionally, customization is huge.  We live and die by custom property integration, and expect similar abilities in other enterprise monitoring tools.

    Traps are heavily leveraged for integration into NetCool.

  • We have a ton of tools in our arsenal; we are not trying to find a way to combine them all into one.

  • In my experience, the only way to get one tool to rule them all is to only go with the #1 or #2 provider for absolutely everything, regardless of whether they fit your needs.

    But then you miss out on much more capable systems.

    A good example would be Palo Alto, who have left Cisco in their dust but are relatively new and have limited support in SolarWinds.

  • As mentioned before...requirements and scalability.

    I tend to find "tool suites" to be a compromise in functionality in order to gain better integration.

    Past experience has led me to first settle on a framework to standardize on (SolarWinds, OpenView, Tivoli, etc.), then adopt a unified event messaging format (for scalability), and then pick point solutions where needed.

  • Not only do I agree with most of the posts so far on this subject regarding APIs, responsible parties, strategy, and scope, but I will also add that I prefer the "single pane of glass" approach.  It is hard enough finding a tool that does exactly what you need, but having it interact with your other troubleshooting and analytical tools and "play nice" is an even better goal!  A goal I am still trying to achieve!  I went through an extensive eval years ago, and the winner of the eval, which included live demos etc., was SolarWinds.  They are the meat of my toolset.  Their "single pane of glass" approach and subsequent acquisitions have made alerting, reporting, and troubleshooting network, user, virtualization, storage, and other issues much simpler.  I have yet to find an "all-in-one" IT Management tool, but NPM, SAM, NCM, VNQM, LEM, Network Engineer's Toolset, VMTurbo, and Riverbed Cascade Express/Pilot/Shark have aided me tremendously in pre- and post-analysis for user as well as network and server issues.  The reporting is also greatly appreciated by IT and upper management within our organization.
