We've all been there at some point or another. For some of us it's a daily ritual, for others, it's what haunts our dreams at night.
I'm speaking, of course, about troubleshooting complex multi-faceted IT issues which can (and often do) transcend functional silos within the overarching IT organization. Cloud, virtualization, hybrid IT, storage area networks, converged infrastructure etc., have all fundamentally transformed IT. Tasks which historically may have taken hours or even days, now take minutes, and some just seconds to complete. These historically, time consuming tasks have become so easy, in fact, that in some organizations they're fully or at least partially automated.
The story doesn't end at the infrastructure either. Instead of monolithic, single purpose, do everything, application servers; distributed application architectures, elasticity, and containerization have increasingly becoming the norm. While all these abstraction layers have helped address application scalability and availability, they have also made it significantly more difficult for those monitoring the health and performance of everything which contributes to the delivery of the services those applications provide.
The troubleshooting process itself is more entangled than ever before, often times requiring collaboration amongst many different functional silos within the organization. From the storage administrator, DBA, Virtualization admin, cloud architect, application designer, network engineer, DevOps, systems administrators, etc., the finger pointing begins right away. But where to begin? The Orion® Platform and its related product modules such as Network Performance Monitor, Server & Application Monitor, etc., can collect millions of different metrics from across the entirety of the IT environment. That's an overwhelming amount of data to parse through as the clock is ticking; the pressure mounts, and your SLA is hanging in the balance.
Many Orion administrators have created elaborate, purpose-built, custom summary dashboards, comprising many various chart resources to help consolidate relevant information related to specific business critical services and to aid in troubleshooting scenarios. The issues with this approach are varied. First, depending upon the complexity and elaborateness of the custom dashboard, the creation process can be both tedious and time consuming. Second, only those with Orion View Management rights have the necessary permissions required to build or modify these custom summary dashboards, placing a tremendous burden on the Orion administrator and making them the bottleneck for efficient troubleshooting. Third, there's the issue of maintaining these custom summary dashboards. Some larger organizations may have dozens or even hundreds of these for various business services, and if not properly maintained to reflect changes made throughout the organization, these dashboards become less and less useful for their intended purpose.
From the end-user's perspective, these dashboards help eliminate the need to navigate through a various multitude of views and dashboards in search of data needed to isolate the root cause. Provided, of course, the issue they're investigating occurred within the default time period shown within the chart resources. If it didn't, then the end-user must click on each chart resource individually to be taken to the Custom Chart View, where they can adjust the timeframe shown to the timeframe desired for that specific chart. This then defeats the entire purpose of the custom summary dashboard as a means to visually correlate data more easily for the specific time period when the issue occurred.
With the release of Network Performance Monitor 12.1 RC1, and the impending release of Server & Application Monitor 6.4 RC1 comes a completely new method for visualizing and correlating data within the Orion web interface. We affectionately refer to this as 'PerfStack'. In part, the name represents the newfound performance analysis and correlation capabilities this feature brings to the Orion® Platform, enriched with relationship data which powers AppStack. The culmination of time series and relationship data is what sets PerfStack apart, allowing you to quickly sift through the massive amounts of data Orion products collect, eliminating noise, and allowing you to focus on what's truly relevant to the issue at hand.
Immediately upon upgrading to any of our latest release candidates you will notice a new option available under the 'Home' menu, entitled 'Performance Analysis', which will take you to the new PerfStack dashboard. By default, you will begin with a new analysis project, whereby you will need to start by adding at least one entity. In the throes of an active investigation into an ongoing issue, the entity selected would typically be the one exhibiting the symptom. This could be the switch, router, virtual machine, host, (AKA: node), or something more specific like the application, LUN, array, or web transaction, to name only a few of the possibilities. Choose whatever the most logical starting point is in your investigation and note that you can add as many entities as you wish to your analysis project.
|Performance Analysis Menu Option||Add Entities|
Once you’ve added your entity/s to PerfStack, hovering your mouse over any of those entities in the list, you will notice two icons that appear to the right of the entity name. When clicked, the first icon brings in all other entities related to that object. This leverages existing AppStack relationship data, as well as additional relationship information not currently expressed within AppStack. This includes items such as Hardware Health sensors, network Interfaces, IIS™ Sites, application pools, etc. Relationship information within PerfStack allows you to dramatically accelerate the troubleshooting process and focus exclusively on what's likely related to the issue at hand. This prevents you from fumbling about, wasting time manually trying to figure out how things are related on your own.
|Add Related Entities||Available Metrics|
Clicking on any entity name itself populates a list of all available metrics associated with that entity in the adjacent column. These metrics are categorized into collapsed groupings based on their type, any of which can be expanded to reveal individual metric tiles. These tiles can then be dragged onto the chart area on the right where the metric data is plotted.
Add as many metrics to the chart area as you desire. You can add multiple metrics to the same chart, as well as stack multiple charts on top of each other. PerfStack is also capable of combining data from disparate data sources into the same chart. For example, combine Storage Resource Monitor's LUN IOPS data with Network Performance Monitor's network latency and bandwidth data into the same chart to troubleshoot iSCSI performance issues. A feat never before possible within the Orion® Platform family of products.
|Correlate Metrics Across Different Orion Product Modules|
PerfStack also allows you to combine integer-based metrics with percentage-based metrics while maintaining the appropriate scale. This is accomplished by maintaining two separate x-axes when metrics of dissimilar types are combined within the same chart.
|Combine Metrics of Dissimilar Units Within The Same Chart|
Data for all metrics displayed within the chart area are automatically aligned across the same time period. Hovering your mouse over any chart area adds a vertical marker which tracks with your mouse movement to visually align all data points across the series. It also displays the date and time that datapoint was collected. As you move your mouse horizontally across the time period of the chart area, the values within the legend update to reflect the values aligned to the vertical marker.
The viewable time period can be adjusted globally, maintaining sync across all charts in PerfStack. Both relative and custom timeframes are available, allowing you to view things such as the last seven days of history, or focus on a particular event that occurred between 8am - 6pm last Tuesday, for example.
|Adjust Time Range||Save your Masterpiece|
Once you've completed your masterpiece, you can optionally save it for future reference. Charted metrics, the entities they're derived from, and the custom or relative time frame are all included as part of your saved project. Saved PerfStack projects can be loaded just as easily as they were saved with a top five most recently used PerfStack list making juggling between projects a snap. You can also save a copy of a project using Save As, or delete PerfStack projects you no longer need from within the More menu.
|Load Saved PerfStack MRU||Delete & Save As|
Each individual Orion user can create, save, load, update, and delete their own works of art within PerfStack. Also, users aren't required to have Orion 'Admin' or 'View Management' rights to do so, either. Any Orion user can create as many PerfStack dashboards as they wish and manage them independently without assistance from the Orion Admin; something I'm sure is music to the ears of every Orion administrator.
Sharing is Caring
The most powerful PerfStack feature of all is the ability to collaborate with others within your IT organization; breaking down the silo walls and allowing teams to triage and troubleshoot problems across functional areas. Anything built in PerfStack is sharable. The only requirement is that the individual you're sharing with has the ability to login to the Orion web interface. Sharing is as simple as copying the URL in your browser and pasting it into email, IM, or even a helpdesk ticket.
When received, the URL provided allows others to instantly see exactly what you've found. They, in turn, can work from that URL, add their own metrics, and send it back without ever affecting your project. Similar behavior is also true if you wish to share a saved PerfStack. Sharing saved PerfStack projects with others allows the recipient access to view your saved PerfStack dashboard and any updates you may make to it in the future. Other users cannot modify your saved PerfStack project, but they can save their own copy, update it as needed, and share back their results.
Orion account limitations are fully respected by PerfStack, so there's no need to worry about users gaining access to information they shouldn't. If for any reason, a user should receive a link to a PerfStack project containing objects not permitted by their account restrictions, then any/all metrics related to those restricted objects would be automatically hidden from that user's perspective of the PerfStack project.
Coming Soon to a Theater (Monitor) Near You
PerfStack is a feature provided with all upcoming Orion product releases which are built atop Orion Platform v2017.1 or later. The list of supported modules include (but not limited to)...
- Network Performance Monitor 12.1
- Server & Application Monitor 6.4
- NetFlow Traffic Analyzer 4.2.2
- Virtualization Manager 7.x
- Storage Resource Monitor 6.4
- Web Performance Monitor 2.2.1
SolarWinds PerfStack Analysis Dashboard was engineered to allow Orion users to troubleshoot, isolate, and identify IT problems in ways never before possible. As our IT environments evolve, so too must the tools we use to monitor them. Cross silo collaboration is an essential ingredient to any successful IT organization, now and for the foreseeable future. PerfStack, like NetPath before it, are just tiny glimpses into what we have in store for the future of Orion, and we sincerely hope you like what you see.