
Product Blog

699 posts

Greetings All!

 

On the SolarWinds Sales Engineering team, my colleagues and I often get requests from customers about how to do something custom. It may be as simple as viewing certain data about a node on its node details page, or something more complex, such as automating putting devices into maintenance mode as part of a workflow or runbook.

 

Over the coming weeks we will publish a series of “primers” to equip you with the skills needed to create and adapt scripts and queries within the Orion® Platform. If you need to address any of these use cases, or something similar, then this is the series for you!

  1. When you need to include information that’s not covered in an out of the box Orion report
  2. If you need to automate the addition of nodes to Orion for monitoring, as part of an onboarding process for new VMs
  3. If you require usage metrics for particular devices, so you can chargeback to other departments or customers

 

To begin with, we will introduce some of the terms and concepts involved, starting with some architecture basics and building through to more hands-on examples that look at custom reports and scripts.

Topics will include:

  1. Intro to API, SDK, & SWQL
  2. SWQL studio
  3. SWQL Walkthrough
  4. Examples of SWQL in reports, alerts, web, etc.
  5. Automating Orion using PowerShell®
  6. Automating Orion from Linux® & some bonus tips ‘n tricks

 

The overall goal here is to enable you to work through, and find solutions for, your particular use cases. What this series will not be is:

  • An introduction to SQL
  • An introduction to scripting/programming
  • Pre-built solutions for every custom use case

 

We want to help you help yourself.

 

Read First!

Before we look at a single piece of code or query, let’s take a moment to cover some important housekeeping. As with any customisation, especially when scripting, there is always a possibility that things may go wrong. While automating manual processes is a most excellent endeavour, accidentally deleting all your nodes is not! So before you begin working on any customisation, here are a few simple best practices.

  • Set up a dev instance of the Orion Platform for experimentation
  • Don’t try untested scripts on production systems
  • Make a backup of your Orion database

 

So, with that, let’s get on with the show.

 

Terms and Concepts

First up we will introduce a few terms. If you are an advanced user you can probably skip ahead at this stage, but if you are new to writing queries and scripts, having a strong understanding of these at the very beginning can save a lot of hardship further down the line.

 

SolarWinds Query Language (SWQL). SWQL (pronounced “swick-le”) is essentially a read-only subset of SQL with some SolarWinds conveniences added, and will be core to many of the topics that will be covered in the following posts. The third post in our series in particular will dive into SWQL in more detail, but at this stage we will look at some high-level points.

Application Programming Interface (API). In software development terms, an API can be thought of as the access point for one piece of software to access another. In an N-tier application it allows different parts of an application to be developed independently. Orion, for example, is N-tier, and its web, polling, reporting, and coordination components communicate via service layers.

In the context of Orion, the API is what allows you to read data using SWQL, as well as to add, delete, and update data by “invoking” commands (which we will examine in more detail in our 5th and 6th posts).

SolarWinds Information Service (SWIS). The actual implementation of the API within the Orion Platform is embodied as SWIS, which manifests as a Windows® service, the SolarWinds® Information Service. It is via SWIS that other Orion Platform products (such as Network Atlas, Enterprise Operations Console (EOC), and Additional Web Servers) communicate. It is also via SWIS that various scripting and programming technologies can be used to access Orion. From a technical perspective, it can be accessed over two ports:

  • 17777 – net.tcp: high performance, but Microsoft® only
  • 17778 – JSON or SOAP over HTTPS: interoperability with other programming languages

 

Software Development Kit (SDK). An SDK is a set of tools and libraries, provided by a vendor, to allow others to more easily consume their API. In relation to Orion, the Orion SDK can be installed on Windows, and provides not only the files needed to use PowerShell scripts, but also includes SWQL Studio, which can be used to build custom SWQL queries and visually browse the available data. It is worth noting that since it’s possible to access the API using REST, you don’t need to have the Orion SDK deployed. Our next post will cover installing the SDK, and some tips for its use.
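To make the REST option a little more tangible, here is a minimal Python sketch that only builds the URL for a read-only SWQL query on port 17778. The host name is hypothetical, and the endpoint path is an assumption based on how the Orion SDK documents the JSON interface, so verify it against your own SDK documentation before relying on it. No request is actually sent.

```python
from urllib.parse import urlencode


def swis_query_url(host: str, swql: str, port: int = 17778) -> str:
    """Build the GET URL for a read-only SWQL query against SWIS."""
    # Assumed endpoint path for the JSON interface; confirm it in the Orion SDK docs.
    base = f"https://{host}:{port}/SolarWinds/InformationService/v3/Json/Query"
    # urlencode escapes spaces and other special characters in the query text.
    return f"{base}?{urlencode({'query': swql})}"


# "orion.example.local" is a placeholder host, not a real server.
url = swis_query_url("orion.example.local", "SELECT TOP 5 Caption FROM Orion.Nodes")
print(url)
```

A real call would also need credentials and certificate handling, which are deliberately left out of this sketch.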

 

Intro to SWQL

SWQL can be hand-written or, more commonly, generated using SWQL Studio. For simplicity, at this early stage, it’s worth noting that constructs from standard SQL such as

  • SELECT x FROM y
  • WHERE
  • GROUP BY
  • ORDER BY
  • JOIN
  • UNION

 

all exist in SWQL, along with functions such as

  • SUM
  • MAX
  • MIN
  • AVG
  • COUNT
  • ISNULL
  • ABS

 

A key point to note here however, is that update, insert and delete are not supported via SWQL itself. Those use cases are supported outside of SWQL and will be covered at a later point.

A major differentiator however is that SWQL automatically links many related objects without joins. This makes writing queries much simpler and more efficient.

 

For example, if we want to select the caption of the nodes in an Orion instance, and also list the interface names for each interface on those devices, using traditional SQL we would end up with something similar to

 

SELECT TOP (5)
    N.[Caption],
    I.[InterfaceName]
FROM [Interfaces] I
LEFT JOIN [Nodes] N ON N.NodeID = I.NodeID

 

Running this would output

Caption          InterfaceName
ORION11          vmxnet3 Ethernet Adapter
mysql01          eno16777984
mysql01          lo
mysql01          eno16777984
bas-2851.local   VoIP-Null0

 

With SWQL, this simply becomes

SELECT TOP 5 N.Caption, N.Interfaces.Name
FROM Orion.Nodes N

 

This gives the same results! Moreover, because SWQL is read-only, you cannot really break anything.
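If you want to experiment with the traditional join without touching an Orion database, the following Python sketch runs it against an in-memory SQLite database. The table layout and sample rows are simplified stand-ins for illustration, not the real Orion schema, and SQLite uses LIMIT where SQL Server uses TOP.

```python
import sqlite3

# Simplified, hypothetical tables; the real Orion schema has many more columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Nodes (NodeID INTEGER PRIMARY KEY, Caption TEXT);
CREATE TABLE Interfaces (InterfaceID INTEGER PRIMARY KEY, NodeID INTEGER, InterfaceName TEXT);
INSERT INTO Nodes VALUES (1, 'ORION11'), (2, 'mysql01');
INSERT INTO Interfaces VALUES
    (10, 1, 'vmxnet3 Ethernet Adapter'),
    (20, 2, 'eno16777984'),
    (21, 2, 'lo');
""")

# The explicit join that the SWQL navigation property (N.Interfaces) spares you from writing.
rows = conn.execute("""
    SELECT N.Caption, I.InterfaceName
    FROM Interfaces I
    LEFT JOIN Nodes N ON N.NodeID = I.NodeID
    ORDER BY I.InterfaceID
    LIMIT 5
""").fetchall()

for caption, interface_name in rows:
    print(f"{caption:10} {interface_name}")
```

The join condition on NodeID is exactly the relationship that SWQL already knows about, which is why the SWQL version needs no JOIN clause.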

 

Wrap Up

With today’s post we’ve laid the foundations for customizing the Orion Platform. We’ve identified some use cases where the API can be used to read information from, or make changes to, your Orion Platform. And to make the series “real”, we’ve seen a short SWQL example that gives a good introduction to the power of using SWQL over SQL within the Orion Platform. In the next post we will begin to get hands-on by installing and navigating through the Orion SDK. But in the meantime, you can discover more about the topics covered in the SolarWinds Lab episode SWIS API PROGRAMMING CLASS.

Companies are moving their email to the cloud in droves.

Let's face it, administering Microsoft Exchange is one of those jobs that when everything goes right, no one knows you exist.  And when things go wrong, everyone knows you exist. The good news is that many companies are offloading their Exchange to Microsoft through the use of Microsoft Office 365.  If you doubt that Office 365 is big, consider that in July of this year Office 365 online workplace tools brought in more revenue than the traditional version of Office that’s installed on people’s computers. When you think about it, e-mail server replacement is the perfect SaaS application.  It's well defined without huge deviation from one organization to the next, scales well across multiple servers, needs to be accessible from anywhere and often needs permanent retention of records.  All things that the cloud is good at.

 

Moving to the cloud means I'll never have to worry about email again, right?

It's important to remember that while moving to the cloud alleviates your responsibility for the servers that run e-mail, you are still responsible for monitoring the e-mail itself and your company's connectivity to the cloud. Monitoring cloud-based applications is different than monitoring on-premises applications. Where you may have been concerned in the past with memory and disk capacity on your servers, or with server-to-server communication, those are not concerns with SaaS. But some potential issues still exist. Here are just a few of the metrics you may need to be concerned with in an Office 365 environment:

 

  • Portal Access - Rather than server availability, it's important to know portal availability. This includes the user portal, the administration portal and the billing portal.  These may each be used by different users in your company but are all important.
  • Forwarded Exchange Users -  Are these mailboxes really necessary?  Are they violating company or government policies? What if a healthcare worker is forwarding messages containing patient information to a personal account, for example? 
  • Inactive Exchange Users - While sometimes you may keep a user's mailbox for a period of time after they are gone, sometimes you just forget to delete them and are paying for unneeded accounts.
  • Groups Accepting External Email - Do you really want external entities to be able to bulk mail these groups?
  • Top Senders - This is a handy metric for telling if your accounts have been hacked and are being used by spammers.
  • Administrative Roles - Did the number of administrators change unexpectedly? 
  • License Usage - Get a handle on how quickly your license usage grows.  How many licenses are being used?  What percentage of my total?  You still need capacity planning for SaaS, just a different type of capacity.
  • Last Password Change - Number of users with a password that is 90 days old or more.  How many users have a password that never expires?
  • User Mailbox Security - How many users have access to a large number of mailboxes?  Should they? 
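As a rough illustration of how a few of these metrics reduce to simple calculations, here is a minimal Python sketch. The user records, license counts, and the reference date are all made-up sample data; the actual templates are PowerShell scripts that pull their data from the Office 365 API.

```python
from datetime import date

# Hypothetical sample data standing in for what the Office 365 API would return.
users = [
    {"name": "alice", "password_changed": date(2017, 1, 10), "never_expires": False},
    {"name": "bob", "password_changed": date(2017, 9, 1), "never_expires": True},
    {"name": "carol", "password_changed": date(2017, 6, 1), "never_expires": False},
]
licenses_total, licenses_used = 50, 42


def stale_passwords(users, today, max_age_days=90):
    """Names of users whose password is max_age_days old or more."""
    return [u["name"] for u in users
            if (today - u["password_changed"]).days >= max_age_days]


today = date(2017, 10, 1)  # fixed date so the example is repeatable
stale = stale_passwords(users, today)
never_expire = [u["name"] for u in users if u["never_expires"]]
license_pct = 100 * licenses_used / licenses_total

print(f"Passwords 90+ days old: {len(stale)}")
print(f"Passwords that never expire: {len(never_expire)}")
print(f"License usage: {license_pct:.0f}%")
```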

 

Earlier this year, in collaboration with Loop1 Systems, we developed a set of templates for Microsoft Office 365 to monitor these and much more.  The templates have been very popular with customers, but there are a few things you can do to improve their implementation and function. Since these templates monitor Software as a Service, they aren't exactly like other templates that we typically provide.

 

Microsoft Office 365 is Software as a Service and it doesn't run on any of your servers.  What node should you apply the template to?

Since these templates are PowerShell scripts that run against a Microsoft URL, the best solution is to create an external node and apply the templates to it.  You can use "outlook.office365.com" as the node.   This is the URL for the mail API requests.  Technically for the Portal, Subscription, Security Statistics and License Statistics templates the scripts use "api.admin.microsoftonline.com", but splitting the Office 365 templates between two nodes can be confusing and forces the SAM user to understand which components of the service reside on each node.

 

You can also use an ICMP node rather than an external node

External nodes don't report status.  By using an ICMP node, you will get a rudimentary status indication on the node icon based on a ping of the URL.  External nodes give no status and always display a purple "arrow" icon without status.  However the URL "api.admin.microsoftonline.com" doesn't seem to respond to ping requests so it will always appear to be down if you point an ICMP node there.  Here is the external icon vs. the ICMP node icon.

Get a real picture of Office 365 availability with NetPath

Another way to determine the responsiveness of the Office 365 application is to set up a NetPath service for "outlook.office365.com".  If you have NetPath, you can use it to get a detailed view of the bottlenecks between your site and the application portal.

 

Improving responsiveness to queries by polling less frequently

Depending on the number of mailboxes in your environment and the number of templates implemented, you can experience throttling of your API requests from the Office 365 API. If you are throttled, the choices are to either run fewer component monitors or reduce your polling frequency on some templates. Most users can actually reduce the polling frequency substantially on most or all Office 365 templates since the majority of the metrics don't change frequently. One thing to keep in mind is that if you want to ensure enough data points to avoid gaps in history, you might want to use less than an hour for your polling frequency, so try setting the frequency to 1200 (20 minutes) rather than the default of 300 (5 minutes). If you want to know more about Microsoft API throttling, see Avoid getting throttled or blocked in SharePoint Online | Microsoft Docs for a description. The article is about SharePoint but the concept is the same for Office 365.
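The effect of that change is easy to quantify. This short Python sketch simply works through the arithmetic for the two polling frequencies mentioned above; it assumes one API request per poll, which is a simplification.

```python
def polls_per_hour(interval_seconds: int) -> float:
    """Number of polls (and data points) per hour at a given polling frequency."""
    return 3600 / interval_seconds


def polls_per_day(interval_seconds: int) -> float:
    """Number of polls per day at a given polling frequency."""
    return 86400 / interval_seconds


default_interval, reduced_interval = 300, 1200  # seconds

# Fraction of requests saved by moving from the default to the reduced frequency.
saving = 1 - polls_per_day(reduced_interval) / polls_per_day(default_interval)

print(f"Default: {polls_per_day(default_interval):.0f} polls/day")
print(f"Reduced: {polls_per_day(reduced_interval):.0f} polls/day")
print(f"Requests saved: {saving:.0%}")
```

At 1200 seconds you still collect three data points per hour per component, while making a quarter of the requests.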

 

I don't like the output of the detailed data from the templates.  Can I make it more readable?

The data comes back from the API in a comma-delimited format which is great for programming but not so readable.  To make the data more readable, you can modify your own copies of the scripts as follows:

Replace:

[string]::Join( ", ", $users) 

With:

[string]::Join( "<br/>", $users)

NOTE: You should be aware that this modification is injecting HTML directly into the output from the PowerShell script.  When viewed on the SAM console it will display correctly.  However, this change could create unexpected results in other areas of SAM that are not displayed on a web server, such as reports.

 

Comparing Exchange 2013/2016 templates with the Office 365 template.  They are both Exchange, why are they so different?

Since Office 365 is SaaS, many of the metrics in our previous Exchange templates are either not available or not meaningful. Metrics like disk I/O and disk latency aren't available for a cloud service where the hardware is abstracted away from the user. Similarly, attempting to monitor processes and services on the hosts is not possible. With Office 365 we primarily monitor application data, which is available through the Office 365 API.

 

There was a MAPI round trip template available for Exchange.  Can I run this template against Office 365?

The MAPI round trip template was intended to check connectivity between multiple Exchange servers.  Since Office 365 is SaaS, you don't control the physical servers that are used for your accounts.  With cloud-based applications, you should check connectivity between your network and the Office 365 website.  You can get a sense of this connectivity by using the portal templates and the ICMP option discussed above.  Also as mentioned above, you can use NetPath to show the actual path your connections take to Microsoft.  Another option is to use Web Performance Monitor to record a typical mail transaction and get perspective on each part of the session.

 

A comprehensive approach to monitoring Microsoft Office 365

Hopefully, this post has given you some ideas about why and how to monitor Office 365. SolarWinds offers many tools to help you, from SAM templates to network tools to user simulation.

I am excited to announce the latest release of Virtualization Manager (VMAN) 8.1, which is now available on the Customer Portal. Continuing the great work done in VMAN 8.0 to migrate VMAN functionality to a unified Orion platform, VMAN 8.1 further extends these capabilities by providing Capacity Planning natively on the Orion platform. With VMAN 8.1, the new and improved capacity planning feature provides a modeling wizard, which is more intuitive and helps the user create various configurations and test variable growth scenarios. Administrators are now empowered to generate reports on the fly and provide accurate growth predictions of their virtual infrastructure to the business. In addition to Capacity Planning, Azure Cloud Monitoring has been added to VMAN's capabilities to monitor hybrid infrastructure.

 

 


What's New in Virtualization Manager 8.1

 

 

Capacity Planning

 

The new and improved Capacity Planning is an Orion-based scenario wizard that helps you create and test variable configurations on the fly. You can also take a more intensive approach, accounting for your environment's historical performance, new system needs, and peak usage times.  When planning your virtual environment infrastructure, resources, and allocation, you may need help determining the best amounts and structure. How will adding a substantial amount of VMs affect your resource usage and performance? Should you add one or more new hosts to keep up with expansion? Do you need to worry about capacity right now, or will your environment be okay for the next 6 months? Capacity Planning uses historical data and trending calculations to better determine the outcome of your infrastructure needs. Run multiple types of expansion scenarios: Simulate adding VMs, adding hosts, or both. Each scenario generates a report with detailed usage statistics covering predicted CPU, memory, and disk usage.

Click here for online demo.

 

Capacity Planning Home

From the Capacity Planning home page, you may select prior capacity planning reports to review or select new to open up the capacity planning wizard to create a new report.

Four different modeling scenarios to choose from


  • Run Checkup Wizard -  A resource depletion report will be run on a cluster designated by you based on existing resource utilization.

 

  • Simulate adding extra VMs - Model a scenario of adding additional VM load to your infrastructure and determine the impact on your existing hosts. Model VM usage based on an existing VM or create your own configuration.

 

  • Simulate adding extra host computers - Determine the impact of adding additional hosts to an existing cluster.

 

  • Simulate adding extra VMs and host computers -  Determine if your infrastructure growth strategy will keep up with expected VM growth by simulating adding both VMs and Hosts to an existing cluster.

 

Profiles

Profiles are used to determine the impact of adding the specified number of VMs or hosts to your infrastructure.

 

  • Predefined Profiles - You can select a predefined VM profile of Small, Medium, or Large.

These profiles are static and not based on an actual VM in your environment.

  • Custom
    • Use an existing VM/Host - Selecting an existing VM will model a scenario of adding a number of VMs based on the resource utilization of the profile.
    • Build your Own - You can manually specify hardware configurations. This is useful if you do not have an existing VM or host to model configurations after.

The use of custom profiles greatly improves capacity planning modeling when you can forecast an upcoming project that will increase the use of resources based on a currently used profile.

Reports

 

The Capacity Planning reporting wizard culminates in an easy to review report that breaks out the current and expected state of your virtual infrastructure.

 

The report can be broken up into 3 sections.

 

Projected Growth

    • Workload - Provides an aggregated view of VM workload utilization in the selected cluster.
    • Resources - Provides an aggregated view of Host resources in the selected cluster
    • Simulated growth - The amount and configuration of VMs or Hosts added

 

 

Charting Thresholds

Provides a graphical representation of the actual vs. simulated utilization of CPU, memory, and disk. Estimates of when the Warning, Critical, and Capacity thresholds will be reached are provided.
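To give a feel for what "estimating when a threshold is reached" involves, here is a minimal Python sketch that fits a straight line to historical utilization and extrapolates it. This is only an illustration of the general idea: the post does not describe VMAN's actual trending calculation, and the sample values and threshold percentages are invented.

```python
def fit_line(values):
    """Least-squares slope and intercept for values sampled at x = 0, 1, 2, ...

    Needs at least two samples.
    """
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    var = sum((x - mean_x) ** 2 for x in range(n))
    slope = cov / var
    return slope, mean_y - slope * mean_x


def periods_until(values, threshold):
    """Periods after the last sample until the trend line reaches threshold.

    Returns None when utilization is flat or falling, so it is never reached.
    """
    slope, intercept = fit_line(values)
    if slope <= 0:
        return None
    x_hit = (threshold - intercept) / slope
    return max(0.0, x_hit - (len(values) - 1))


# Hypothetical weekly CPU utilization (%) for a cluster.
cpu_weekly = [40, 42, 44, 46, 48]

# Hypothetical threshold levels; real values come from your own settings.
for name, level in [("Warning", 80), ("Critical", 90), ("Capacity", 100)]:
    print(name, periods_until(cpu_weekly, level), "weeks")
```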

 

 

Summary

  • Available Virtual Capacity - How many VMs can fit in the environment based on selected profile and all 3 default profiles (small, medium, & large) and what the initial constraint will be for growth (CPU, memory, or Storage).
  • Additional Hosts Recommended - How many more hosts are predicted to be needed in the selected cluster based on the default or custom profile used in the selected time period.
  • Advanced Statistics - These are essentially a roll-up of the resource modeling chosen, usage history that the prediction is made on and the thresholds and resources included in the report.
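The Available Virtual Capacity figure can be pictured as a "how many fit, and what runs out first" calculation. The Python sketch below shows that idea under stated assumptions: the free resources and the Small, Medium, and Large profile sizes are invented numbers, and the real report is based on historical usage rather than this simple division.

```python
# Hypothetical free resources in a cluster.
free = {"cpu_ghz": 24.0, "memory_gb": 96.0, "storage_gb": 2000.0}

# Hypothetical profile sizes; VMAN's predefined profiles may differ.
profiles = {
    "small": {"cpu_ghz": 1.0, "memory_gb": 2.0, "storage_gb": 40.0},
    "medium": {"cpu_ghz": 2.0, "memory_gb": 12.0, "storage_gb": 100.0},
    "large": {"cpu_ghz": 4.0, "memory_gb": 16.0, "storage_gb": 500.0},
}


def available_capacity(free, profile):
    """Return (number of VMs that fit, resource that constrains growth first)."""
    # How many VMs of this profile each resource could hold on its own.
    fits = {res: int(free[res] // need) for res, need in profile.items()}
    # The resource with the smallest count is the initial constraint.
    constraint = min(fits, key=fits.get)
    return fits[constraint], constraint


for name, profile in profiles.items():
    count, constraint = available_capacity(free, profile)
    print(f"{name}: {count} VMs, constrained by {constraint}")
```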

 

Cloud Infrastructure Monitoring for Microsoft Azure

Cloud Monitoring for Azure is now available to both VMAN and SAM customers. Similar to Cloud Infrastructure Monitoring for AWS, the cloud vendor API is used to monitor cloud instances. This basic cloud visibility doesn’t use Orion licenses for the cloud servers. For full details on performance and availability, you can optionally manage cloud servers as Orion nodes, which will consume licenses. Additionally, if you add SAM application component monitors to your cloud instance, this will require SAM licenses.

 

 

What is included

  • Edit AWS/Azure Account using new UI (without editing API polling options)
  • Choose Instances/VMs for AWS/Azure Accounts
  • Azure VM polling, including performance metrics
  • Cloud Summary page accommodating instances for both cloud providers
  • Azure Cloud VM Details page with majority of resources
  • Monitor Azure VM as Orion Node (requires Orion node license)

 



 

 

 

 

Documentation

VMAN 8.1 Release Notes - Virtualization Manager 8.1 Release Notes - SolarWinds Worldwide, LLC. Help and Support

Getting Started - Virtualization Manager (VMAN) Getting Started Guide - SolarWinds Worldwide, LLC. Help and Support

Admin Guide - VMAN 8.1 Administrator Guide - SolarWinds Worldwide, LLC. Help and Support

We’re happy to announce the release of SolarWinds® Storage Performance Monitor, a standalone free tool that provides a consolidated view of all your storage arrays’ performance.

Designed for storage and systems administrators in organizations of any size, Storage Performance Monitor gives your team insight into total IOPS and throughput on storage pools, LUNs, and NAS Volumes.

Arrays supported by Storage Performance Monitor Free Tool:

  • Dell EMC Isilon
  • Dell EMC Unity
  • Dell EMC VNX / CLARiiON
  • Dell EqualLogic PS Series
  • IBM DS-8000
  • IBM FlashSystem A9000
  • IBM FlashSystem V7000/v3700
  • NetApp AFF series
  • NetApp ONTAP
  • PureStorage

 

The Storage Performance Monitor free tool allows you to add as many storage arrays as you wish. Simply click the “ADD STORAGE ARRAY” button, select the appropriate array type, and provide your connection details. Once you’ve added your array to the Storage Performance Monitor free tool, it will load the polling interval, discovery interval, and thresholds from global settings. You can edit these thresholds locally, per entity, by clicking the vertical ellipsis button.

You can then fine-tune your thresholds for this entity. If you do not wish to monitor it, simply keep the threshold empty. You won’t then get any notifications about this entity.

What else does Storage Performance Monitor do?

  • Allows you to add as many arrays as you wish
  • Creates a notification when any of the following conditions are met:
    • a metric exceeds a threshold,
    • the tool cannot connect to the array,
    • the array topology changes.
  • Can write all notifications into Microsoft Event Log.
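The threshold behavior described above, including the "empty threshold means not monitored" rule, can be sketched in a few lines of Python. The entity names, metric names, and values are hypothetical, and this is not the tool's actual implementation.

```python
from typing import Optional


def check_metric(entity: str, metric: str, value: float,
                 threshold: Optional[float]) -> Optional[str]:
    """Return a notification message, or None when nothing should be raised."""
    # An empty threshold means the entity is not monitored for this metric.
    if threshold is None:
        return None
    if value > threshold:
        return f"{entity}: {metric} {value} exceeds threshold {threshold}"
    return None


# Hypothetical samples: (entity, metric, current value, threshold).
samples = [
    ("LUN-01", "IOPS", 5200.0, 5000.0),         # over threshold
    ("LUN-02", "IOPS", 900.0, 5000.0),          # within threshold
    ("Pool-A", "throughput_mbps", 800.0, None),  # threshold left empty
]

notifications = [msg for msg in (check_metric(*s) for s in samples) if msg]
for msg in notifications:
    print(msg)
```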

 

For more detailed information about the Storage Performance Monitor, please see the SolarWinds Storage Performance Monitor Quick Reference guide here on THWACK®: https://thwack.solarwinds.com/docs/DOC-192403

 

Download SolarWinds Storage Performance Monitor: https://www.solarwinds.com/free-tools/storage-performance-monitor

 

For more advanced storage monitoring, try the evaluation version of Storage Resource Monitor, which gives you these features in addition:

  • Store historical data
  • Monitor hardware health
  • Monitor additional performance metric – storage latency
  • Monitor storage capacity
  • Map storage to hosts/VMs/applications
  • Alerting
  • Reporting
  • Runs as a Windows service
  • Web front end

If you're one of the many customers that tried the beta or release candidate, you've already seen many of the exciting new features in SAM 6.5. For the rest of you, here are more details about what's new:

 

Azure Cloud Infrastructure Monitoring

Just as SAM 6.4 directly accessed the world's largest cloud, SAM 6.5 expands your ability to leverage multi-cloud further by adding support for Microsoft Azure accounts in addition to Amazon AWS.  While Amazon Web Services remains the largest cloud infrastructure provider, Microsoft is growing in this market at an incredible rate.  And many customers have a footprint in both.  Now you can see Azure or AWS accounts, or a mixture of both in the same view.

Cloud-related enhancements include:

  • Add or edit your AWS/Azure Account using new UI

  • Choose Instances/VMs for AWS/Azure Accounts
  • Poll Azure VMs for performance metrics
  • A single Cloud Summary page displays data for instances/VMs for multi-cloud environments

  • Monitor Azure VMs as Orion nodes to leverage the extended features of the Orion Platform and SAM for application and OS monitoring, including expanded metrics

PerfStack 2.0 — New Features and Improvements

  • Navigate directly from Application Detail pages to predefined sets of application monitor metrics

  • Use pre-defined links from these pages or create your own custom PerfStack charts from a broad range of metrics such as these from the Azure cloud:

  • Zoom into PerfStack charts to view more details for a selected time period
  • Alert visualization improvements
    • Each individual alert start/end time is now visualized separately in PerfStack
    • Existing aggregate alert visualization against an object is retained
  • Export PerfStack data to Excel
  • PerfStack's sharing functionality is now more discoverable with a new Share button
  • For more on PerfStack improvements, see Orion Platform 2017.3 - PerfStack New Features & Improvements

New Orion Installer

  • Install and upgrade one or more Orion Platform products simultaneously
  • Modern interface with a simplified design and intuitive workflow
  • Download and install only what is needed
  • Reduces download size and accelerates installation
  • For more on the Installer improvements, see New Installer Upgrade Experience

Other Improvements

  • New SAM templates for Microsoft Office 365 are included out of the box.  Microsoft is the leader in Software as a Service, and Microsoft Office 365 is gaining adoption with companies of all sizes.
  • Linux Agent for ARM-based devices such as Raspberry Pi to help you manage your Internet of Things (IoT)
  • Updated SAM templates for many Microsoft products
  • Search function integrated into global navigation
  • Improved dashboard view customization and Manage Nodes page
  • Support for High Availability (HA) 2.0 including multi-subnet deployments (WAN/DR)

 

What Else?

Don't see what you are looking for here? Check out the What We're Working On Beyond SAM 6.5 post for what our dedicated team of database nerds and code jockeys are already looking at.  If you don't see everything you've been wishing for there, add it to the Server & Application Monitor Feature Requests

If you are using Orion, I am sure you have already heard of PerfStack (aka Performance Analysis), SolarWinds' Drag & Drop Answers to Your Toughest IT Questions.  DPA did not make the first cut of PerfStack, but I am happy to announce that with the release of DPA Integration Module 11.1, you can now see DPA goodness in PerfStack too!!! Imagine expanding Using PerfStack with WPM and SAM to Troubleshoot Web Performance Issues to include database and query wait time!  Answer the question "Is it the database or the application?" in just a few clicks, pulling in whatever data you need from both DPA and Orion!

 

 

PerfStack Support for Databases monitored by DPA

DPAIM 11.1 exposes DPA data already available in other Orion views, and adds some new data as well! Data includes:

  • Total Wait Time
  • Wait time by dimension (SQL, Wait Types, Programs, Users, etc.) with drill downs to SQL text and Wait Type descriptions
  • Database metrics
  • VM metrics
  • Custom metrics (yes, even the custom metrics!)

 

Data History and Granularity

PerfStack is able to show all the data available in DPA. So if you have 5 years of history in DPA, you'll see it in PerfStack. As you drill down in PerfStack, you'll get to more granular data (down to 1 second wait time). PerfStack will choose the best granularity for the time range of your current view, and then adjust accordingly as you zoom in or out.

 

Stacking Data

PerfStack allows you to stack different wait time dimensions in the same view to help you solve problems quickly. For example, do you want to know who is writing those bad queries? Just stack SQL and user dimensions into one view and look for the correlation.  You can also stack dimensions from different databases - even if they are on different DPA servers.

 

Data Explorer

Even more DPA data is available in PerfStack's Data Explorer function. The Data Explorer lets you drill down and view the details of the wait time dimension you've highlighted in PerfStack. All Wait Time dimensions will show you a list of metrics ranked by total wait time, but a couple of dimensions show extra information:

  • Total Instance Wait Time (SQL) will show you the SQL text, letting you find the query causing the issue.
  • Wait Time by Wait Type will show you descriptions of the Wait Type, helping you understand why your queries are waiting.

 

Data Explorer - Query Search

When you drill into the Data Explorer for Total Instance Wait Time (SQL), you'll see the SQL text for the queries with the most Wait Time. You can also search the text listed in the Data Explorer (say, for a specific table, GROUP BY, or other parameter) to further filter the result set.

 

Predefined PerfStack Views for Databases

Every database displayed in Orion via DPAIM has a predefined link to a default PerfStack view, which will include both wait time and key metrics for the specific database.

 

Adding Database Data to PerfStack

To add database data to PerfStack, click Add Entity then scroll and select Databases Instance.

PerfStack will list all databases from all integrated DPA servers.

Once you've selected an instance, you can choose multiple wait time metrics to add to PerfStack.

Once you've added wait time metrics, selecting an area will allow you to drill to additional information on some metrics: SQL text (for Total Instance Wait Time) and the wait type description (for Wait Type).

To see a fully operational PerfStack example, check out this PerfStack Application to Database Mapping in the Orion Demo Site.

 

New Resource - Blocking & Deadlocks

A new resource available for every instance monitored by DPA will show you blocking (all databases) and deadlocks (SQL Server only), to let you quickly see if the source of a database problem is blocking or deadlocks.

The resource has three tabs to show which queries are waiting the most (Top Waiters), which queries are blocking the most (Top Root Blockers) and if there are deadlocks.

Clicking on the SQL hash will get you a query detail popup, with a link to the DPA historical view for that SQL.

 

Improved Resource - All Databases Instances

When we first created the resources for the DPA Integration Module, we tried to communicate as much information as possible, combining DPA wait time and performance data as well as data in Orion. However, so much information made it too difficult for users to quickly answer the question "Is the database having a performance issue?" Yes, memory may be high, but if users aren't waiting on queries any longer than normal, then there isn't really a "performance" issue. Based on this feedback, we've made a few changes:

  • In the last release (DPA 11.0), we changed the status shown in Orion to reflect DPA's wait time status only
  • In DPA 11.1, we've simplified resources to only show wait time status and not the status of advisors or metrics (CPU, Memory, etc.)
  • Wait time status now has three states (green, yellow, red) instead of two (green, red)

Basically, we want to make it really clear whether there is a database performance problem (i.e. high wait time) before you drill down to additional data.

 

Database Summary View - All Database Instances Resource

For databases monitored by DPA, the status is determined by the wait time status from DPA, regardless of other indicators in DPA or Orion. For databases monitored by Orion but not DPA, the Orion status is shown.

The new resource has simple filters at the top, allowing you to select one or more statuses.  In this case I've filtered the resource to only see database instances with a red or yellow status.

Other resources have been improved as well, see the DPAIM 11.1 Release Notes for more information.

 

Support for SQL 2017, Oracle 12.2, MariaDB and more...

As usual, all the databases that DPA supports you'll see in Orion too.  Here are the newest databases supported by DPA:

  • SQL Server 2017 on Windows
  • SQL Server 2017 on Linux
  • Oracle 12.2
  • MariaDB 10.0, 10.1 and 10.2
  • IBM DB2 11.1

See a full list of Database versions you can monitor with DPA.

 

DPAIM for ALL DPA Customers!

In the past, since DPAIM was an "integration" module, it was not available as a standalone module, meaning you needed to own another Orion product to use it.  However, with the addition of PerfStack support, we wanted to make sure all DPA customers could enjoy the benefits of DPAIM and Orion too.  So now DPAIM is a standalone module, and doesn't need to be installed alongside another Orion product.   All DPA customers can now install the DPAIM module and take advantage of the great features of the Orion platform.

 

How do I get the DPA Integration Module?

If you own DPA, you can get DPAIM 11.1 in multiple ways.

  • If you have a SAM installation, when you upgrade to v6.5, the DPAIM 11.1 module will be installed and ready to use.
  • If you have other Orion-based products installed (NPM, VMan, SRM, etc.) but not SAM, you can download DPAIM from the Customer Portal and install it on your Orion server.
  • If you own only DPA but no other SolarWinds Orion product, you can take advantage of DPAIM too. Simply download it from the Customer Portal and install the DPA Integration Module (DPAIM) on another server.  If you are using a SQL Server for your DPA repo, you can use the same one for DPAIM.  If not, you can opt for SQL Express bundled with the DPAIM installer.

Once you have DPAIM installed, follow these instructions to integrate it with DPA.

 

But wait, there's more!

See everything else that made it into the release in the DPAIM 11.1 Release Notes, including:

  • Improved Instance View - Database Response Time Resource
  • New Query Popup, with formatted SQL
  • Compatibility Checker

 

What's Next?

Don't see what you are looking for here? Check out the What We Are Working On for DPA (Updated Nov 6, 2017) post for what our dedicated team of database nerds and code jockeys are already looking at.  If you don't see everything you've been wishing for there, add it to the Database Performance Analyzer Feature Requests.

Orion and the modules which run atop the platform provide a tremendous wealth of statistical information at your fingertips for spotting trends and hotspots. The data collected is also helpful for determining whether what you're seeing now is anomalous or normal, consistent behavior based upon historical analysis. Unfortunately, one area where Orion hasn't been quite as strong is helping users troubleshoot active, ongoing issues. Should you find yourself in the throes of a major outage or performance issue, Orion does an outstanding job of ensuring you're alerted to the problem at hand. Where it falls short, however, is in providing tools which aid your ability to diagnose the root cause of the issue in real time.

 

As many of you are keenly aware, the default polling interval for statistic collection in Orion is typically somewhere between 5 and 10 minutes for most Orion product modules. While this normal polling interval is perfectly reasonable for trend analysis, alerting, and reporting, it's less than ideal when you're actively troubleshooting an ongoing issue. Ideally, you'd want the ability to make a change, like restarting a Windows service or Linux daemon, changing a CBQoS policy, or allocating additional resources to a virtual machine, and then immediately see the impact that change has on the issue you're trying to resolve. In these situations, it's simply untenable to wait 5-10 minutes for Orion's next polling cycle to determine whether the changes you made resolved the issue. Doing so significantly limits the number of things you can try, and extends the duration of the outage as you wait for Orion's next scheduled polling interval.

 

Sure, there are alternatives and workarounds which many people leverage in these situations. Some choose to click the 'Poll Now' button feverishly to get updated values ahead of the normal 5-10 minute polling interval, but even this takes a minute or so before data is collected and visible within the Orion web interface. While better, this is still less than optimal for troubleshooting purposes. Others instead use different tools, like command line interfaces on switches, routers, and Linux, or Resource Monitor and Task Manager on Windows, for their firefighting needs. These tools, though, have their own drawbacks, such as requiring you to leave Orion, where you were initially alerted to the problem, and console into the device exhibiting the issue. If the problem potentially spans multiple devices, as with distributed application architectures, clustered or load balanced servers, HSRP, VRRP, etc., then you'll be forced to juggle multiple console sessions with no ability to compare or correlate metrics between devices.

 

 

Enter PerfStack

 

With the release of PerfStack included in Orion Platform 2017.3, these woes are a thing of the past. No more juggling between different tools as your boss watches over your shoulder, breathing down your neck as you scramble to isolate the cause of your next critical performance issue. With our new improvements to PerfStack, we introduce you to real-time polling, which provides up to one second statistic collection granularity when activated. This can be for a single entity like a node, or even multiple disparate entities simultaneously.

 

PerfStack-Real-Time-Polling.gif

 

 

Start Real-Time Polling

To begin using PerfStack's new Real-Time Polling capabilities, start a new project and add a node by clicking 'Add Entities'. Expand the 'Node' category and click on the node you just added to select it. This will populate the metric palette with the list of all available metrics for that entity. Within the metric palette, expand 'CPU/Memory' or 'Response Time' and you will notice a blue rocket ship icon which adorns many of the available metrics listed. This icon denotes that the metric is available for Real-Time Polling. Note that not all metrics for a given entity are pollable in real time. A full listing of all real-time pollable metrics can be found by expanding the 'Real-Time Polling' category in the metric palette of the selected entity.

 

Rocket Ship.png
Real-Time Polling Category.png

 

Once you've identified which real-time metrics you'd like to visualize within your PerfStack project, drag and drop those metric tiles into the chart area the same as you would any other metric. You can of course include both real-time and non-real-time metrics within the same project, but only those denoted with the blue rocket ship icon will be updated within the chart at one second intervals. Other metrics included within the same project will continue to update themselves based upon their normal scheduled polling intervals.

 

Now that you've added some real-time metrics to your PerfStack project, simply click the 'Start Real-Time Polling' icon in the top action bar. This will automatically change the timeframe of the chart to the last 10 minutes, which makes it easier to visualize variations in the charted values at high frequency polling intervals. You may also notice the rocket ships blink when real-time polling is starting. This process takes just a second or two, then the charts begin to move. To stop real-time polling, simply click the 'Stop Real-Time Polling' button in the top action bar.

 

 

Start-Real-Time-Polling.gif

 

 

Real-Time Polling Limits

While real-time polling is active, you can continue to add real-time metrics to your project or remove them. These can be from the same, or entirely different, entities. Real-time polling will continue for the existing metrics on the chart, and any newly added metrics will begin to update in real time. There is a limit of ten unique real-time metrics per project which can be polled. Should you exceed this limit, a toast message will appear in the top right of the window when you attempt to add the eleventh metric to a chart where real-time polling is enabled. This same message will appear if your project contains more than ten real-time pollable metrics and you attempt to enable Real-Time Polling. To resume real-time polling, reduce the number of metrics which can be polled in real time within your PerfStack project to ten or fewer.

 

Session Limit: PerfStack RealTime Exceeded.png
Global Limit: PerfStack Real-Time Global Limit.png

 

In addition to the per-session limit of 10 real-time metrics, there is also a notification if you exceed a global limit of thirty unique metrics across all web interface sessions on the Orion server. Real-time polling uses a shared cache across all sessions, so if you and three of your colleagues are viewing the same ten metrics in real-time within PerfStack this only counts as 10 real-time metrics, not 40. This is because PerfStack is only polling the device in real-time once, and not for each unique user session. This helps reduce overhead on the Orion server, as well as any strain on the monitored device.
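To make the counting concrete, here is a minimal Python sketch of how the two limits interact. It is only a model of the behavior described above, not Orion code: the function names, the session dictionary, and the (entity, metric) pairs are all hypothetical.

```python
# Hypothetical model of the limits described above; these names are
# illustrative only and are not part of any Orion API.
PER_PROJECT_LIMIT = 10   # unique real-time metrics per PerfStack project
GLOBAL_LIMIT = 30        # unique real-time metrics across all sessions


def unique_metrics(sessions):
    """Return the distinct (entity, metric) pairs across all sessions."""
    return set().union(*sessions.values())


def within_limits(sessions):
    """True if each session and the server as a whole stay within the limits."""
    per_project_ok = all(len(m) <= PER_PROJECT_LIMIT for m in sessions.values())
    return per_project_ok and len(unique_metrics(sessions)) <= GLOBAL_LIMIT


# Four users charting the same ten metrics: the shared cache sees 10, not 40.
shared = {("node-1", f"metric-{i}") for i in range(10)}
sessions = {user: set(shared) for user in ("you", "peer-1", "peer-2", "peer-3")}
print(len(unique_metrics(sessions)))  # 10
print(within_limits(sessions))        # True
```

If those four users were each charting ten different metrics instead, the same check would report 40 unique metrics and the global limit would be exceeded.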

 

Polling Methods

In our ever enduring commitment to remain an agentless-first monitoring solution, Real-Time Polling in this release is available only for nodes managed via ICMP, SNMP, or WMI. Nodes which are managed via the Orion Agent cannot as yet utilize Real-Time Polling. Should you select an entity within PerfStack that is managed via the Agent, you will notice the absence of any blue rocket ship icons in that entity's metric tiles, denoting that Real-Time Polling is not available for that entity.

 

Metrics and Entities Supported

As stated above, Real-Time Polling is not yet available for all metrics and entity types. For this release of PerfStack, we focused on what we believe to be the most vital real-time metrics users would need at hand during a firefight. This includes 34 metrics spanning three entity types (nodes, interfaces, and volumes), allowing you to troubleshoot the most common network, storage, and device related performance issues in real time from a single, centralized, web-based interface. If you'd like to see additional real-time metrics supported in future releases, we'd love to know which ones you would find most valuable, and how you would plan to use them.

 

Nodes                       | Interfaces                   | Volumes
Average CPU Load            | Availability                 | Average Disk Queue Length
Average Memory Used         | Received Discards            | Average Disk Reads
Average Percent Memory Used | Received Errors              | Average Disk Transfer
Peak CPU Load               | Transmit Discards            | Average Disk Writes
Peak Memory Used            | Transmit Errors              | Maximum Disk Queue Length
Minimum CPU Used            | Average Receive bps          | Maximum Disk Reads
Minimum Memory Used         | Minimum Receive bps          | Maximum Disk Transfer
Average Response Time       | Peak Receive bps             | Minimum Disk Queue Length
Maximum Response Time       | Receive Percent Utilization  | Minimum Disk Reads
Minimum Response Time       | Average Transmit bps         | Minimum Disk Transfer
                            | Minimum Transmit bps         | Minimum Disk Writes
                            | Peak Transmit bps            |
                            | Transmit Percent Utilization |

 

 

User Restrictions

We at SolarWinds understand that not all Orion administrators may want every user to have access to such an amazing feature. After all, they may be completely mesmerized by the screen and not get any actual work done as a result. With that in mind, you will find a new user or group level permission which controls whether the 'Real-Time Polling' button appears within PerfStack for those users. This new setting can be found under [Settings -> All Settings -> Manage Accounts]. From there, select a group or individual user account and click 'Edit'. Expand 'Performance Analysis Settings' at the bottom of the page and change this setting from 'Allow' to 'Disallow' for any user or group. This will disable Real-Time Polling for those users. By default, all users have permission to launch Real-Time Polling within PerfStack.

 

Real-Time Polling is only one of the latest improvements we've made to PerfStack in the Orion Platform 2017.3 release. If you're interested in what other goodies we've stuffed under the hood, hop on over to my earlier post, entitled Orion Platform 2017.3 - PerfStack New Features & Improvements for the full rundown.

We are excited to share that we've reached GA for Web Help Desk v12.5.2.

 

This service release includes:

 

Clickjacking protection

This release prevents malicious code from redirecting a hyperlink in the Web Help Desk user interface to an unauthorized third-party website or resource.

 

Secure password reset logic

After you click Forgot Password on the Log In screen, Web Help Desk verifies your current email address and redirects you back to the application using a secure connection to reset your password.

 

Improved LDAP security

Web Help Desk now prevents unauthorized LDAP client account users from logging in to an LDAP tech account with an identical user name. In v12.5.1 and earlier, WHD had two ways to handle LDAP authentication: one for techs and one for clients. After you install this release, the tech LDAP authentication functionality is removed, and every tech who used this functionality will have their WHD password reset and will receive an email with steps to log in to WHD.

See Unauthorized clients can log in to a Tech account using LDAP authentication for details.

Before you install this upgrade, ensure that all techs have client accounts (authenticated through LDAP) linked to their tech accounts. Also ensure that no tech username is the same as any client username. After the upgrade, all techs must access their tech account through their client account, or by using the WHD tech username and WHD password (which can be reset using the secure password reset logic).

 

Updated Apache Tomcat

This release supports Apache® Tomcat® 7.0.82 for improved security. See the Apache Tomcat website for details.

 

Notable fixed issues

Tickets linked to a survey now close properly after you change the status to Resolved.

The Office 365 connector now supports subfolders.

Tickets restricted to a location group can no longer be accessed by users in another location group.

 

We encourage all customers to upgrade to this latest release which is available within your customer portal.

Thank you!

SolarWinds Team

To kick off the Q4 releases, I am happy to announce General Availability of Database Performance Analyzer 11.1.   This release continues to build momentum on previous releases by extending our support of Availability Groups, supporting the latest databases, and improving the DPA interface.  We've also added a subscription option when you deploy DPA in the Amazon cloud.

 

Support for New Database Versions

We'd like to announce official support for the following databases:

  • SQL Server 2017 on Windows
  • SQL Server 2017 on Linux
  • Oracle 12.2
  • MariaDB 10.0, 10.1 and 10.2
  • IBM DB2 11.1

When integrated with Orion, these databases will appear in DPAIM.

 

Availability Groups:  Status, Alerts and Annotations!

New in version 11.1, DPA regularly polls the status of all SQL Server Availability Groups (AGs) contained in the monitored instance. The DPA Home page displays a new status icon which tells you that AGs are present in a monitored instance.

 

DPA’s new AG monitoring gives you the ability to:

  • See the status of all your AGs on the home page, including a new filter widget.
  • When you drill down, see the status of AGs, databases, and replicas. This includes synchronization and failover status information.
  • See annotations on trend charts that show when AG failovers have occurred. The annotations show you the previous and current replica (from/to), and allow you to correlate failovers to changes in load.
  • Send an alert email when:
    • An AG failover occurs.
    • An AG status becomes Partially Healthy or Not Healthy.

For a detailed view of the new AG features, see this feature post:  DPA 11.1: Improved monitoring of SQL Server Availability Groups

HomePageAGInfo.png

AG_summary.png

 

Amazon Subscription

Have a lot of databases in the Amazon cloud?  You can now monitor them via your Amazon subscription. Simply start up DPA from the AWS Marketplace, connect a repository, and start monitoring your databases. All currently supported databases can be monitored, and you can integrate DPA with your Orion server and see the data in Orion.

 

Improved Wait Time Status Indicator

The wait time indicator on the home page has been improved in two ways.

  • There are three statuses (Green, Yellow, Red) instead of two (Blue, Red).  Default thresholds are now 1.3x and 1.6x the historical wait time for Yellow and Red, respectively.
  • We evaluate every 10 minutes, instead of once an hour.

The increased frequency and new thresholds allow DPA to show you wait time pressure much more quickly than previous versions.

This new status is propagated to Orion via the integration module.
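As a rough illustration of how the default thresholds play out, the sketch below classifies a wait time against its historical baseline. It is not DPA's actual implementation: the function name is hypothetical, and treating the thresholds as inclusive is an assumption.

```python
def wait_time_status(current_wait, historical_wait, yellow=1.3, red=1.6):
    """Classify wait time relative to its historical baseline.

    Defaults mirror the 1.3x (Yellow) and 1.6x (Red) thresholds above.
    Whether DPA treats the boundaries as inclusive is an assumption here.
    """
    if historical_wait <= 0:
        raise ValueError("historical_wait must be positive")
    ratio = current_wait / historical_wait
    if ratio >= red:
        return "Red"
    if ratio >= yellow:
        return "Yellow"
    return "Green"


# Against a historical wait time of 100 seconds:
print(wait_time_status(100, 100))  # Green  (1.0x)
print(wait_time_status(140, 100))  # Yellow (1.4x)
print(wait_time_status(200, 100))  # Red    (2.0x)
```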

 

And a whole lot more!

  • When you search for a SQL statement while creating an alert or a report, the search results include each SQL statement's total wait time for the last 7 days
    SQLSearchResults.png
  • The improved instance filter includes whether the instance monitor is on or off.
    status indicator.png
  • New icons and images that align better with Orion
  • See the DPA 11.1 Release Notes for the rest!

 

What's Next?

Don't see what you are looking for here? Check out the What We Are Working On for DPA (Updated Nov 6, 2017) post for what our dedicated team of database nerds and code jockeys are already looking at.  If you don't see everything you've been wishing for there, add it to the Database Performance Analyzer Feature Requests.

 

 

Syslog

When PerfStack was initially released, a few key metrics were noticeably absent. Chief among those was the surprising lack of Syslog data. In the Network Performance Monitor 12.2 release, and all Orion product modules which include Orion Platform 2017.3, we rectified this injustice, bringing Syslog into the PerfStack fold. For any node sending Syslog data to Orion, you will find a new 'Syslog' metric tile under the 'Status, Events, Alerts' category. Simply drag that tile to the chart area, and voila! Syslog data charted over time, broken down by severity level.  As with all metric tiles within PerfStack, this metric tile is dynamic and will only appear for nodes which have been configured to send Syslog data to the Orion server. If this tile does not appear for nodes you believe should have it, verify that Orion has received Syslog data from the device using the web-based Syslog viewer within the Orion web interface.

 

 

Hovering your mouse over the charted area will show the total number of syslog messages received at the top of the legend for the time shown below the vertical marker, as well as a breakdown of each severity type contributing to that total beneath it.

 

 

SNMP Traps

Not to be outdone, SNMP Trap data has also been added. Much like Syslog, SNMP Trap data is broken out by severity and is available as a dynamic metric tile for any node which is configured to send SNMP Trap data to Orion. The SNMP Trap metric tile can be found under the same 'Status, Events, Alerts' category as Syslog, when a node is selected from within the metric palette.  If this tile does not appear for nodes you have configured to send SNMP Trap data to Orion, verify that Orion is receiving those SNMP Traps from the device using the web-based SNMP Trap viewer, accessible from within the Orion web interface.

 

 

 

Zoom

When hovering your mouse over the charted area, you may notice your mouse cursor has learned a new trick, changing to crosshairs. Holding down your mouse button while dragging across the charted area allows you to lasso a specific time period of interest, such as a sudden spike in resource usage. The selected area then becomes the focal point of your project: the rest of the chart area surrounding this time period is visually de-emphasized through color desaturation, while the colors remain bright and vibrant within the selected area.  This allows you to focus your attention on the selected area without visual distraction from the surrounding area. The selection applies across all charts within the PerfStack project, making it easier to visually correlate what else occurred during the same time period.

 

 

Once a time period has been selected, new options appear next to the selected area. Clicking the [+] icon highlighted above, zooms into the selected time range where you can view high fidelity detailed data collected during that time period. Similarly, clicking the [-] icon allows you to zoom further out to spot trends, or return to your previous perspective after having zoomed in. Any time after having made a selection, clicking the [X] icon will cancel the selection and return focus to the entire viewable chart area.

 

 

Data Explorer

Great! So now you know how many Syslog messages and SNMP Traps were received from any of your devices, along with their respective severity. You can even zoom into a specific time frame and cross correlate this data against other metrics from the same or different entities monitored by Orion. That's some super powerful stuff! But you know what would make this data even more powerful?

 

What if you could actually view the full details of those Syslog messages or SNMP Traps from within PerfStack itself? That would assuredly accelerate the troubleshooting process and aid in reducing time-to-resolution, both of which are key tenets of PerfStack. Well, that's exactly what we've done!

 

To get started, select a time period in the chart area, the same as described above to zoom in or out of the chart. Next, click the top paper & magnifying glass icon from the options displayed to the right of the selected area. This action will cause the Data Explorer tab to be shown in the left pane, where the Metric Palette normally appears. Switching between the Data Explorer and the Metric Palette is simply a matter of clicking on the appropriate tab.

 

Within the Data Explorer, all Syslog or SNMP Trap data is listed in the chronological order it was received. Each line represents a single message, beginning with its severity and ending with the date and time the message was received. In between is a brief preview of the message body. To view the full message text, simply expand the row.

 

 

If your devices are sending a lot of Syslog or SNMP Traps to Orion, it may be difficult to sift through all the noise and focus on what's truly important, even if you select the tiniest window of time. Since this just so happens to be another key tenet of PerfStack, we added filtering and search to the Data Explorer. This allows you to do things, such as show only Warning or Critical messages that were received during the selected time period, while filtering out the excessive noise generated by Informational and Debug messages.
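Conceptually, the filter combines a time window, a set of severities, and an optional search term. The Python sketch below models that logic on a few made-up messages. The `Message` class, its fields, and the sample data are hypothetical, and this is not how the Data Explorer is implemented.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Message:
    """Hypothetical stand-in for one Syslog or SNMP Trap row."""
    received: datetime
    severity: str
    text: str


def filter_messages(messages, start, end, severities, search=""):
    """Keep messages inside [start, end] with a wanted severity and search text."""
    wanted = {s.lower() for s in severities}
    needle = search.lower()
    return [
        m for m in messages
        if start <= m.received <= end
        and m.severity.lower() in wanted
        and needle in m.text.lower()
    ]


# Made-up sample data, purely for illustration.
messages = [
    Message(datetime(2017, 11, 6, 9, 0), "Informational", "Interface Fa0/1 up"),
    Message(datetime(2017, 11, 6, 9, 5), "Warning", "High CPU on supervisor"),
    Message(datetime(2017, 11, 6, 9, 7), "Critical", "Power supply failure"),
    Message(datetime(2017, 11, 6, 11, 0), "Critical", "Fan failure"),
]

selected = filter_messages(
    messages,
    start=datetime(2017, 11, 6, 9, 0),
    end=datetime(2017, 11, 6, 10, 0),
    severities={"Warning", "Critical"},
)
for m in selected:
    print(m.severity, m.text)
```

Only the Warning and Critical messages received inside the selected hour are printed. The Informational message and the later Critical message are filtered out.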

 

And BOOM! Just like that, the problem is found. If SolarWinds were a leading US based office supply company, right now you'd probably feel compelled to blurt out 'That was easy'. The magic doesn't end there either. This same functionality that's available for both Syslog and SNMP Traps can also be used to view the details of Orion Events. So if you aren't afforded the opportunity to configure devices to send Syslog or SNMP Traps to your Orion server, you can still give this new feature a whirl by adding Orion Events to your PerfStack and following the instructions outlined above.

 

 

Additional Metrics Support

In addition to Syslog & Traps, PerfStack in Orion Platform 2017.3 includes support for a variety of additional Orion product module metrics. Given the feedback we have received since the initial release of PerfStack, we know these will all be extremely welcome additions.

 

Universal Device Pollers

That's right, I said Universal Device Pollers! Unquestionably the single most requested feature I hear about for PerfStack has been support for Universal Device Pollers. With this release, I'm proud to announce you can now add Universal Device Pollers to your PerfStack projects no differently than any other metric. Universal Device Pollers for both Nodes and Interfaces are supported, and appear under their respective entity type. When you select an interface which has Universal Device Pollers assigned, for example, the 'Group' names as defined within the Universal Device Poller Windows application will appear within PerfStack as new metric categories. In the example below, I have a Universal Device Poller Group I've defined in the Win32 application called 'Cisco Switch', which I've assigned to interface 'Fa0/1' on 'lab-transi-sw1'. Similarly, I have a Node based Universal Device Poller called 'sonic Current CPU Util' which is assigned to 'stp-nsa2400'. Please note that for Universal Device Pollers to appear in PerfStack, they must be of type 'Rate' or 'Counter', and 'Keep Historical Data' must be enabled.

 

PerfStack Universal Device Pollers.png

Network Insight for F5 & ASA

This release of PerfStack also includes support for NPM's Network Insight for F5, and the all-new Network Insight for ASA. These metrics appear under their own distinct categories or node entities and are treated no differently than any other metric within PerfStack.

 

F5 ASA.png

 

Voice & Network Quality Manager

A party just isn't a party unless you invite your friends, so in this release we invited VNQM to join the fun by bringing metrics for IPSLA operations into PerfStack. IPSLA operations appear as their own separate entity type within PerfStack. If, however, you find it easier to search by source router, you can select that node entity and click the add related items button, and all IPSLA operations running on that router will be listed in the entities list. Select the IPSLA operation you'd like to see more information about, and the list of all available metrics for that operation appears in the metric palette. From there you know the drill; just drag and drop them onto the chart area and voila!

 

Network Configuration Manager

Also joining the party is NCM, allowing Orion users for the first time ever to visually cross correlate configuration changes to the impact they have on the network. When combined with NTA you can easily see the effect your recent CBQoS policy change is having on the flow of traffic from the moment the change was made. PerfStack also makes it easy for you to not only determine exactly when a change was made, but by whom. Similar to Syslog, Traps, and Events, selecting a time period from the charted area and clicking the Data Explorer button reveals detailed information about the configuration change that occurred, such as the username of the individual who logged into the device and their IP address.

 

 

 

 

Alert Visualization

Visualization of alerts in the premiere release of PerfStack displayed only the total aggregate of all alerts against a given entity, as well as how long all alerts had been active against that object. While useful, this did not let you determine which alerts had triggered, or how long each of those alerts remained active. A lot of that important detail was obscured through aggregation, making this particular area of PerfStack ripe for improvement.

 

In this release of PerfStack we preserved the same total alert aggregate that was available in the first release, but then extended the chart to show the names of the alerts which triggered along with their individual durations. No longer are you left wondering what alert was triggered against the object, or fumbling through other areas of the Orion web interface to track down that information. You'll also know at a glance when the alert triggered and for how long, in a manner which allows you to visually recognize patterns of reoccurrence and correlate specific individual alerts against other metrics collected by Orion.

 

Export

Occasionally it's necessary to share your PerfStack findings with others who may not have access to Orion, or archive those findings in a ticketing system for historical purposes beyond your defined retention period. For some, this information needs to be imported into other systems for chargeback, showback, billing, forensic analysis, or correlation with other non-Orion tools in the environment. Previously, your only recourse was to export data for each metric individually via the Custom Chart View, or write your own series of custom queries against the SQL database backend to obtain the raw values behind these charts.  In this PerfStack release, however, those days are behind us.

 

After adding all metrics of interest to your PerfStack project, simply click the new 'Export' button located in the top menu. This will export your project's content to a Microsoft® Excel friendly comma separated value (CSV) formatted file, which is then downloaded to your local machine via your browser.

 

Double click the file to open in Excel, or upload the file to Google™ Docs to open in Sheets. Inside you will find all the raw data which made up the charts in PerfStack. Each series of the same chart is represented as its own set of columns, complete with the entity and metric name, date and time the data was collected, and its value. If your PerfStack project contained multiple charts, the values for each chart are grouped together in the CSV file the same as they were visually laid out within PerfStack itself. Simply put, the detailed raw data for the second PerfStack chart is directly beneath the end of the series for the first chart data contained within the CSV file.

 

 

The logical layout of the CSV file makes it easy to visually recreate similar charts to what was seen in PerfStack, using the native graphing tools included in Excel or Sheets. The format of the raw data is also well suited for import into 3rd party reporting solutions like Domo or eazyBI. With the data contained in a logical layout and an open format, the possibilities are limitless.

 

 

 

Usability Improvements

 

Share

Not everyone who used PerfStack initially realized that whatever they created could be easily shared with others, simply by copying the URL and sending it to them. Since this is one of PerfStack's most powerful capabilities, we felt it needed to be promoted to the top menu, alongside other important functions like saving and loading a PerfStack.

 

Just click this 'Share' button and the dynamic PerfStack URL is automatically copied to your clipboard and ready to be pasted into an email, instant message, helpdesk ticket, etc. and shared with others.

 

Getting to PerfStack in previous releases meant leaving what you were looking at and starting the troubleshooting process over from the very beginning again in PerfStack if you needed to correlate symptoms with their root cause. For example, if users are complaining about the performance of SharePoint, a logical starting point might be to go to the Node Details view of the SharePoint server in Orion. But what if you then wanted information from the Hypervisor this virtual machine is running on, the storage array, or the SharePoint application being monitored by SAM? Well, you'd probably navigate to PerfStack, add the problem node, and then its relationships, before eventually plotting metrics in the chart area for correlation. This sounds tedious just explaining it and we knew we could do better.

 

Within the 'Management' or 'Details' resource located on the details view of virtually any entity type supported by PerfStack, you will find a new 'Performance Analysis' link. When clicked, it takes you directly to PerfStack, where the entity you came from is pre-populated in the entity selector, along with any of its related items. In addition, the chart area will pre-populate with relevant metrics associated with the entity you came in from.

 

 

For example, if you were to click on the 'Performance Analysis' link from the 'Node Details' view of your SharePoint server, you would be taken to PerfStack where the Node, its volumes, interfaces, etc. are already listed in the entity list, and metrics such as Status, CPU/Memory utilization, response time, alerts, and events are all pre-populated for you. These metrics are dynamic based upon the entity type you enter from, ensuring they are always relevant to what you're investigating.

 

Links To PerfStack.png

We didn't stop simply at having links into PerfStack from other views in Orion. We knew that there were occasions when users viewing data in PerfStack needed additional information only found on entity details views, such as MAC addresses, serial/model numbers, etc. With that in mind, we included direct links to the Details view of any entity shown within PerfStack's entity list. Simply hover your mouse over the entity name, the same as you would to add related items or to remove the entity from the list. There you will notice a new link icon, which when clicked will open a new browser tab that will take you to the details view for that entity.

 

Link to Details from PerfStack.png

 

Drag & Drop Entities

We all know you can drag and drop metrics onto PerfStack's chart area to visualize virtually any combination of metrics desirable, but what if I told you that you could drag and drop entire entities into the chart area? Would that blow your mind?  Well that's exactly what you can do now with PerfStack!

 

Utilizing the same logic derived from 'Links to PerfStack' referenced above, relevant and dynamic metrics associated with a given entity are pre-populated and charted when an entity is dragged into PerfStack's chart area. Simply click and hold your mouse on the drag handle which appears to the left of the entity name when hovering your mouse. Next, drag the entity into the chart area. This will then populate the chart area with the same metrics that would appear if you had entered PerfStack through the 'Performance Analysis' link on that entity's Details view, saving you precious time in the throes of troubleshooting.

Drag and Drop Entities.png

Full Screen Mode

With the initial release of PerfStack we received a lot of feedback from customers wanting to add PerfStack to a wallboard in their NOC. The first issue these customers ran into was a bug where PerfStack did not respect session timeout values defined for the user account. I'm happy to announce that this issue has been resolved, allowing you to have PerfStack running and updating indefinitely if so desired, without ever timing out your session.

 

Next on their wish list was some ability to declutter the UI of extraneous elements which were unnecessary for a non-interactive display. This would allow them to maximize the viewable area for data being displayed in the charts and legend. Being the accommodating bunch that we are, we added a new button to the top right of PerfStack, just above the chart legend. When clicked, all UI elements except the chart and legend are removed, placing PerfStack into full screen mode. To exit full screen mode and return to normal mode, just click the button again or remove all metrics from the chart. Note that this button will only appear when one or more metrics are plotted in the chart area.

 

Maximize.png

 

Sharing is great when you're collaborating on a specific issue with a group of individuals, but what if you as the Orion Administrator want to create custom PerfStack dashboards to share with your users? This can be accomplished with other custom Orion views fairly easily, and PerfStack is no different.

 

To begin, create your custom PerfStack and include all metrics you'd like represented in the chart. When complete, save your changes and click the 'Share' button. You should now have the URL to your saved PerfStack in your copy/paste buffer. Next, go to [Settings -> All Settings -> Customize Menu Bars] and edit the menu bar for the user/s you'd like to have access to your saved PerfStack. At the bottom of the page click the 'Add' button, give your link a name, and paste the saved PerfStack URL into the URL field of the 'Edit Custom menu Item' dialog window that appears. From there, click 'ok' to save your changes and drag your newly created menu item from the 'Available Items' column to the 'Selected Items' column to add this to your menu bar.

 

When users who are assigned that menu bar log in, they will now see a link to your custom saved PerfStack in their navigation. This allows Orion administrators to quickly share custom saved PerfStacks without emailing or instant messaging links to users. You can also use the 'User Links' resource in a similar fashion to provide your users with links to custom saved PerfStack dashboards.

 

 

This represents an incredible amount of awesome jam packed into a single release, and I haven't even mentioned Real-Time polling yet! Let us know what your thoughts are on these PerfStack improvements in the comments section below. We'd love to hear your feedback!

SRM 6.5 was made available in the Customer Portal on September 13th!  The release notes are a great place to get a broad overview of everything in this release. These are exciting times, since SRM joins the other Q3 releases from SolarWinds (NPM, NTA, NCM, UDT, IPAM, DPA, & VMAN). We’ve been attending shows and monitoring customer requests, and our theme for Q3 is expanding our already comprehensive list of device support! In fact, I would say that FLASH was our Primary focus, sprinkled with a bit of HYBRID Array models. We constantly measure various Storage Market Sectors, speak with analysts and rely on customer feedback with regard to Storage architecture and growth. This research has led us to recognize that Flash-Type Array solutions are a rapidly growing market sector and we want to be there when you are looking to renew your Storage environment, providing a path for you to continue using SolarWinds Storage Resource Monitor! Of course, there are some other worthy reasons to upgrade to SRM 6.5 like UI, Installer, and PerfStack enhancements, along with continued additions towards monitoring Array Hardware Health!!

 

Here's a Summary of what was accomplished in 6.5

Support for arrays:

  • IBM FlashSystem A9000/A9000R
  • IBM DS 8xxx
  • NetApp EF
  • NetApp AFF
  • EMC VMAX Flash Family

Hardware health monitoring for:

  • EMC VNX CLARiiON
  • EMC VNX Celerra/VNX Gateway
  • IBM SVC/V7000/V3700

 

Installer Enhancements:

  • Use the new SolarWinds Orion Installer to install and upgrade one or more Orion Platform products simultaneously in your environment. When installing new products into an existing Orion environment, the Orion Installer verifies compatibility between the product versions and notifies you if additional steps are needed.
  • Bottom Line – The installer installs or upgrades all products from a single screen. You do not need to download files for each product.

Read More about the Orion Installer

 

UI Enhancements:

  • Start searching for widgets in two clicks from any customizable page
  • Mark favorite widgets so you can quickly add them to new dashboards or pages
  • Drag and drop widgets directly onto pages or move widgets to new locations

Read more about Dashboards

SolarWinds NCM 7.7 became available for download in the Customer Portal on September 13th. As always, the release notes contain plenty of great information on the new features.  I’d like to dive deeper into our Network Insight for Cisco ASA, building on the great post by Chris O'Brien.

 

Network Insight for Cisco ASA

As Chris pointed out, this is our second installment in the Network Insight series, and the first that I’ve had the pleasure of being involved in. The initial Network Insight release brought together NPM and NCM to deeply manage and monitor F5 BIG-IP devices. NCM 7.5 delivered valuable capabilities including binary configuration support, F5 LTM and GTM configuration support, and new inventory support.

 

For this release, we focused on delivering a set of capabilities around monitoring and management of Cisco’s Adaptive Security Appliance, or Cisco ASA. For SolarWinds NPM, this includes specific features around:

  • Site to Site VPN
  • Remote Access VPN
  • Interfaces

 

For SolarWinds NCM, we focused on the following three areas:

  • Firmware Upgrade
  • Multi-contexts
  • And the most exciting, Access Control List Management

 

Firmware Upgrade Support

With SolarWinds NCM 7.7, we’ve continued to improve the Firmware Upgrade feature, adding support for upgrading the firmware for Cisco ASAs, both in single- and multi-context mode.

  • Multi-context – must be used if the device is in multi-context mode, even if there is only one context. Must be run from the admin context.
  • Single-context – must be used if the device is in single-context mode, or if the device doesn’t support contexts at all.

firmware upgrade for Cisco ASA

 

Multi-Context

In NCM 7.7, we can discover security contexts for Cisco ASAs and easily bring them under management. To take advantage of this, first discover the ASA admin context. NCM will automatically discover additional contexts and list them in the Contexts resource. To manage each context, simply click on the “+” icon. Each additional context counts as a node; its configuration will be stored and managed separately.

Cisco ASA multi-context discovery and management

 

Access Control Lists (ACLs) Management

Saving the best for last, NCM 7.7 automatically discovers ACLs, which zones they are assigned to, and what interfaces are assigned to those zones. Using NCM, you can now ensure that your ACLs are doing what you expect them to do. Gone are the days of laboriously poring over each rule in an ACL in turn, hunting down object and object group definitions, wondering if a particular rule is being hit (and if not, why not?), and if something changed recently, and if so, what changed?

 

To see the list of ACLs for a particular ASA, mouse over the subviews panel and select “Access Lists.”

List of ACLs for a particular Cisco ASA

 

 

NCM tracks the history of ACLs on a particular ASA, including showing the date for the most recent version. And, if there are prior versions, they can be viewed via the expand caret.

 

History of a particular ACL on a Cisco ASA

 

And you can easily compare ACLs to prior versions, other ACLs on the same ASA, or even across ASAs.

Change which ACLs are compared for a Cisco ASA

 

Navigating into a particular ACL, you can see the rules of the ACL, with syntax highlighting.  You can filter rules by type, source, and destination. Each rule shows a count of hits, that is, how many times the firewall has seen traffic that matches a particular rule.

You can also drill into object groups, to see their definition including history.

ACL Detail view, including syntax highlighting, filtering, object group hierarchy traversal for Cisco ASA

 

Finally, with NCM 7.7, we’ve added Overlapping Rule Detection. Overlapping rules are classified in two ways, in terms of completeness of overlap and type of overlap.

With regard to completeness, we can have either partial or complete overlap, which should be self-explanatory. And with regard to type, we have the following:

  • Redundant: a rule earlier in the list overlaps this rule, and does the same action to the matched traffic.
  • Shadowed: a rule earlier in the list overlaps this rule, and does the opposite action.
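
To make the two classifications concrete, here is a minimal Python sketch of how an earlier rule could be compared with a later one. It is a simplified illustration and not NCM's implementation: the `Rule` class and `classify` function are hypothetical, and rules are reduced to an action plus source and destination networks, ignoring protocols, ports, and object groups.

```python
from dataclasses import dataclass
from ipaddress import ip_network


@dataclass
class Rule:
    action: str       # "permit" or "deny"
    source: str       # source network in CIDR notation
    destination: str  # destination network in CIDR notation


def _covers(outer, inner):
    """True if network `outer` contains all of network `inner`."""
    return ip_network(inner).subnet_of(ip_network(outer))


def _intersects(a, b):
    """True if the two networks share at least one address."""
    return ip_network(a).overlaps(ip_network(b))


def classify(earlier, later):
    """Describe how `earlier` overlaps `later`; return None if it does not."""
    if not (_intersects(earlier.source, later.source)
            and _intersects(earlier.destination, later.destination)):
        return None
    # Complete overlap: every packet matching `later` also matches `earlier`.
    complete = (_covers(earlier.source, later.source)
                and _covers(earlier.destination, later.destination))
    completeness = "complete" if complete else "partial"
    # Same action means redundant, opposite action means shadowed.
    kind = "redundant" if earlier.action == later.action else "shadowed"
    return completeness, kind


first = Rule("permit", "10.0.0.0/8", "0.0.0.0/0")
print(classify(first, Rule("permit", "10.1.0.0/16", "192.168.1.0/24")))
print(classify(first, Rule("deny", "10.2.0.0/16", "0.0.0.0/0")))
print(classify(first, Rule("deny", "0.0.0.0/0", "0.0.0.0/0")))
print(classify(first, Rule("deny", "172.16.0.0/12", "0.0.0.0/0")))
```

The four calls cover a completely redundant rule, a completely shadowed rule, a partially shadowed rule (the later rule matches more traffic than the earlier one), and a rule with no overlap at all.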

 

Mousing over the Overlap indicator in the Access List view, you can see a summary of the issues with a particular ACL.

Cisco ASA ACL rule overlap, summary

Drilling into a specific ACL, you can see which rules are overlapping, and clicking on the "Show the details" link will provide even more detail.

Cisco ASA ACL rule overlap, detailed view

 

 

Conclusion

Why are you still reading this? Go get the latest version from the Customer Portal and install it today! And, while you’re waiting for the new installer to work its magic, feel free to click through the new functionality in our online demo. We’d love to hear your feedback, post away below!

A long time ago, in a network far, far away, a post was written called Why Should I Care About Release Candidates?  I say a long time ago because it was 2009, shortly before I joined SolarWinds and while feature phones still ruled the earth.   Since that post, SolarWinds has added thousands of customers, but as Product Managers, we work hard to stay close to you so we deliver real value in every release.  And we are very thankful for your willingness to share your time and thoughts about our products and company.

 

However, we wanted to reach out again, to customers both new and old, to tell you (or remind you) the secret of how to get what you really want in future releases of your favorite products. Ready for the secret?  The main secret to getting what you really want?  It's easy - PARTICIPATE!  Participate as much as you can!  Since we operate on the YAWL principle (You Asked, We Listened), the more we hear from you, the better.  We have a variety of ways for you to participate.

  • Review What We Are Working On: Our What We Are Working On posts detail the features we are currently working on across all our products.  It's a great place to start and see what is on the roadmap.  If you don't see what you want, check the feature request form for your products and vote.
  • Vote on Feature Requests: Each product has a Feature Request forum  (examples:  NPM, SAM, DPA) where you can add and vote on your favorite features.  While product managers review all feature requests, voting helps us prioritize... and when we start working on a feature, we can reach out directly to everyone who votes.
  • Walk Through UX Mockups:  Early in the release process, our excellent UX team puts together functional mockups for you to review.  Feedback here has a big influence on what we implement and how it works. At times it may feel like a therapy session, but we really want to understand what you see, how you process the information and decide what to do next.
  • Install Beta Releases: Betas are early versions of the next release with some features complete and ready to try in your environment.  This is where the rubber meets the road and features come to life - and your feedback is crucial.  Remember - betas are fresh installs only, not suitable for consumption on your production server.
  • Upgrade to Release Candidates:  As emphasized in the old post, Release Candidates are fully supported releases, meaning you can upgrade your production servers and start using new features right away. Our Support and Sales Engineering teams are fully trained in the new version.  Don't hesitate - our new upgrade process for Orion products is amazing.
  • Show Us Your Environment:  Our UX team also loves to do "Show Me" sessions, where we watch you use our products in your environment, and see how you solve problems.
  • Answer Surveys:  From time to time, you will receive requests to fill out surveys, here on Thwack and via email.  These help us understand broad trends of your business, environment, and feature needs.

 

Hopefully you are in an inspired mood, because here is your opportunity to participate: Take this one page survey so we can reach you next time we are looking for volunteers.

REALITY CHECK: Will I really get EVERYTHING I want?

You may be thinking to yourself, "Will I really get everything I want?"  Sadly, no... we can't make all your dreams come true because we have limited resources and no limit of good ideas from you.  But this is why your participation is so important, to help get the most valuable features to the top of the list.  And patience often pays off - check out this feature we implemented earlier this year from a request in 2012 - Silence Alerts While Still Monitoring.

 

Thank you for taking the time to read this, and hopefully we will be hearing from you!

 

The PM Team.

In continuance of the great Q3 release from SolarWinds (NPM, NTA, NCM, UDT, IPAM, DPA, & SRM), I am excited to announce the General Availability of Virtualization Manager (VMAN) 8.0.  Over time, Virtualization Manager has incrementally evolved from a stand-alone virtual appliance to providing advanced integration with the SolarWinds monitoring stack. VMAN Orion allows for a single pane of glass for troubleshooting (AppStack), root cause analysis (PerfStack), and VMAN features found only on Orion (predictive recommendations).  In VMAN 8.0 we have taken this a step further by providing native VMAN functionality in Orion while removing the virtual appliance requirement.  Our goal with the release was to improve ease of use while strengthening VMAN's ability to find and resolve problems in the virtual infrastructure.  The VMAN 8.0 download can be found in the SolarWinds customer portal.

 

Native VMAN Polling on Orion

It is now possible to download and install VMAN much like you would for SAM & NPM without upgrading and maintaining the VMAN appliance.  If you are currently integrated, it is extremely simple to migrate your VMAN polling to Orion by going to Virtualization Settings -> VMware Settings (or Hyper-V Settings).  Just select the vCenter or Hyper-V host to switch polling over and choose VMAN Orion Polling from the Polling Method drop-down menu.  In VMAN 8.0 there are 3 different polling methods that you will notice.

  • Basic - This is the general polling performed by SAM & NPM and does not require a VMAN license but it will require a SAM or NPM node license. ***Note - you can still use the basic polling method if you only have VMAN polling since the VMAN license is allocated node licenses equal to the socket count in your VMAN license (e.g. 64 socket license = 64 node licenses).
  • VMAN Appliance - This is the traditional VMAN polling which occurs from the virtual appliance and requires that the appliance is integrated with Orion to get full VMAN functionality in Orion.
  • VMAN Orion - Full VMAN polling that occurs from Orion and does not require the VMAN appliance but does require a VMAN license

 

 

In addition to the new polling method in VMAN 8.0 we have added the ability to scale out your VMAN polling using the Additional Polling Engine (APE) framework for free.  No additional polling engine license is required to run with VMAN infrastructure monitoring.

 

Batch Execution of Recommendation Actions

The time it takes to optimize your environment has been drastically reduced with the ability to quickly select and batch multiple recommendations to execute immediately or schedule for a later time through just a few clicks.

 

VMAN will automatically configure the correct order of operations in which to perform the actions based on your selections.

 

 

Management action based recommendation policies

 

New policies in Virtualization Manager allow for greater control and precision over the recommendations which optimize your environment.  Create granular policy exclusions based on recommended actions for CPU, Memory, VM & Storage Migrations.

The new policy becomes available by selecting the Disallow Actions for Recommendations and then selecting the scope (vm, host, cluster, or datastore) to apply the policy against.  Then select the actions to exclude from recommendations for the scope chosen above.

  • Move VM - Actions based on VM movement/placement to a different host or datastore.
    • Move VM to a different host
    • Move VM to a different Datastore
  • Configuration - Modification of the VM memory or vCPU configuration
    • Change CPU Resources
    • Change Memory resources
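
As a rough mental model of how such an exclusion policy narrows the recommendation list, here is a minimal Python sketch. Everything in it is hypothetical and for illustration only: the action names, the `Recommendation` class, and the `apply_exclusions` function are not VMAN APIs, and the scope is simplified to individual VMs rather than hosts, clusters, or datastores.

```python
from dataclasses import dataclass

# Hypothetical action names mirroring the options listed above.
MOVE_VM_HOST = "move_vm_to_host"
MOVE_VM_DATASTORE = "move_vm_to_datastore"
CHANGE_CPU = "change_cpu_resources"
CHANGE_MEMORY = "change_memory_resources"


@dataclass(frozen=True)
class Recommendation:
    vm: str      # VM the recommendation applies to
    action: str  # one of the action names above


def apply_exclusions(recommendations, disallowed):
    """Drop recommendations whose action is disallowed for that VM.

    `disallowed` maps a VM name to the set of actions excluded for it.
    """
    return [r for r in recommendations
            if r.action not in disallowed.get(r.vm, set())]


recommendations = [
    Recommendation("sql-prod-01", MOVE_VM_HOST),
    Recommendation("sql-prod-01", CHANGE_MEMORY),
    Recommendation("web-02", MOVE_VM_HOST),
]
# Policy: never move sql-prod-01, but still allow configuration changes.
policy = {"sql-prod-01": {MOVE_VM_HOST, MOVE_VM_DATASTORE}}

remaining = apply_exclusions(recommendations, policy)
for r in remaining:
    print(r.vm, r.action)
```

The point of the sketch is only that a policy pairs a scope with a set of excluded actions, and any recommendation matching both is filtered out while everything else is left untouched.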

 

 

PerfStack 2.0 - New Features & Improvements

With VMAN 8.0 we also get a new and improved version of PerfStack that continues to build on the tenets of SolarWinds to help dig into problems faster and identify true root cause across different IT disciplines.

    • Links from VM & Host Details that populate the PerfStack palette
    • Zoom into PerfStack charts to view more detail for the selected time period
    • Data Explorer
    • Alert visualization improvements
      • Each individual alert start/end time is now visualized separately in PerfStack
      • Existing aggregate alert visualization against an object is retained
    • Export PerfStack data to Excel
    • Easier to share PerfStack dashboards with the new ‘Share’ button.
    • PerfStack 2.0 - Real-Time Polling

You can get to PerfStack from within the VM details page in three different spots (we wanted to make sure you didn't miss it.)

  • Virtualization Manager Tools
  • Virtual Machine Details
  • Resource Utilization

 

 

View of the PerfStack palette pre-populated with metrics of the VM.

 

 

WEB INTERFACE IMPROVEMENTS

 

 

NEW INSTALLER UPGRADE EXPERIENCE

    • Install and upgrade one or more Orion Platform products simultaneously
    • Modern interface with a simplified design and intuitive workflow
    • Downloads and installs only what is needed
      • Reduces download size & accelerates installation

 

 

Windows Authentication & SSL Encryption for Orion Microsoft SQL Database connectivity

 

 

Documentation

Virtualization Manager (VMAN) - SolarWinds Worldwide, LLC. Help and Support

 

VMAN 8.0 Release Notes - SolarWinds Worldwide, LLC. Help and Support

NPM 12.2 was made available in the Customer Portal on September 13th!  The release notes are a great place to get a broad overview of everything in the release.  Here, I'd like to go into greater depth on  Network Insight for ASA including why we built it and how it works.  Knowing that should help you get the most out of the new tech!

 

Network Insight

We live in amazing times.  Every day new technologies are invented that change how we interact, how we build things, how we learn, how we live.  Many (most?) of these technologies are only possible because of the relatively new ability for endpoints to talk to each other over a network.  Networking is a key enabling technology today like electricity was in the 1800s and 1900s, paving the way for a whole wave of new technologies to be built.  The better we build the networks, the more we enable this technological evolution.  That's why I believe in building great networks.

 

A great network does exactly one thing well: connects endpoints.  The definition of "well" has evolved through the years, but essentially it means enabling two endpoints to talk in a way that is high performance, reliable, and secure.  Turns out this is not an easy thing to do, particularly at scale.  When I first started maintaining, and later building networks, I discovered that monitoring was one of the most effective tools I could use to build better networks.  Monitoring tells you how the network is performing so you can improve it.  Monitoring tells you when things are heading south so you can get ahead of the problem.  Monitoring tells you if there is an outage so you can fix it, sometimes even before users notice.  Monitoring reassures you when there is not an outage so you can sleep at night.

 

Over the past two decades, I believe as a company and as an industry we have done a good job of building monitoring to cover routers, switches, and wireless gear.  That's great, but virtually every network today includes a sprinkling of firewalls, load balancers, and maybe some web proxies or WAN optimizers.  These devices are few in number, but absolutely critical.  They're not simple devices either.  Monitoring tools have not done a great job with these other devices.  The problem is that we mostly treat them like just another router or switch.  Sure, there are often a few token extra metrics like connection counts, but that doesn't really represent the device properly, does it?  The data that you need to understand the health and performance of a firewall or a load balancer is just not the same as the data you need for a switch.  This is a huge visibility gap.

 

Network Insight is designed to fill that gap by finally treating these other devices as first class citizens; acquiring and displaying exactly the right data set to understand the health and performance of these critical devices.

 

Network Insight for Cisco ASA

Network Insight for Cisco ASA is our second installment in the Network Insight story, following Network Insight for F5.  As you saw with F5, Network Insight for ASA takes a clean slate approach.  We asked ourselves (and many of you) questions like:

 

  • What role does this device play in connecting endpoints?
  • How can you measure the quality with which the device is performing that role?
  • What is the right way to visualize that data to make it easiest to understand?
  • What are the most common and severe problems that occur with this device?
  • Can we detect those problems?  Can we predict them?

 

With these learnings in hand, we built the best monitoring we could from the ground up.  Let's take a look at what we came up with.

 

Access Lists

 

ACLs define what traffic is allowed or blocked.  This is the most essential task of the firewall and monitoring tools generally don't provide any visibility.

 

The first thing we found here is there's no good way to get all of this data via SNMP.  We have to pull the config and analyze it.  For that reason, we handed this piece off to the NCM team to work on.  Check out more here: Network Configuration Manager 7.7 is now generally available!

 

 

Site to Site VPN

 

Site to site VPN tunnels are the next most important service that ASAs provide.  They are often used to connect offices to data centers, data centers to cloud providers, or one organization to a partner.

 

Yesterday, you could monitor these tunnels by testing connectivity to the other side of the tunnel, for example an ICMP monitor to a node that can only be reached through the tunnel.  Today, we poll the ASA itself via SNMP and API to show a complete picture including:

  • What tunnels are configured?
  • Are my tunnels up or down?
  • If a tunnel is up:
    • How long has the tunnel been up?
    • How much bandwidth is being used by the tunnel?
    • What protocols are securing the traffic transiting the tunnel?
  • If a tunnel is down:
    • How long has the tunnel been down?
    • What phase did the tunnel negotiation fail at?

 

 

 

This means we automatically detect and add VPN tunnels as they're configured or removed and constantly keep an eye on these very important logical connections.  I'll highlight a couple interesting things.

 

Favorites

 

We're introducing a simple new concept called favorites.  Marking a tunnel as a favorite by clicking the star on the right does two things.  First, you can filter and sort based on this attribute.  The page by default shows favorite tunnels first, so you will always see your favorites first until you change the sorting method.  Second, it promotes that tunnel's status to the summary screen.  We found for most ASAs there were a couple of VPN tunnels that were wildly more important than all of the other tunnels.  Here at SolarWinds HQ for example, it's the tunnel to the primary data center.  At the primary data center, it's the tunnel to the secondary data center.  Favorites provide a super easy way to add extra focus to the tunnels that are so important that a big part of the story of the health and performance of the ASA is the health of the tunnels themselves.

 

 

Tunnel Status

What is the status of the tunnel?

 

Turns out this is a harder question to answer than it looks.  Tunnels are established on-demand.  If you just configured a tunnel, but have not sent any interesting traffic so the tunnel is not up, should we show it as down (red)?  That doesn't seem right.  What if the tunnel was up for 3 months, but interesting traffic stopped coming so the tunnel timed out and went back down, but is prepared to come back up as soon as interesting traffic is seen?  The tunnel is definitely "down", but should it be red?  Probably not!  We spent a lot of time thinking about this and talking to you guys to determine the logic that decides if an administrator considers a tunnel down, up, or something in between.  All of that logic is built into the statuses you see presented on this page.
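
To illustrate the kind of reasoning described above, here is a minimal Python sketch. It is not the logic NPM actually uses, which is not spelled out here: the `Tunnel` fields, the `display_status` function, and the "inactive" status name are all assumptions chosen to mirror the two examples in the paragraph.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Tunnel:
    is_up: bool
    # Hypothetical field: the phase (1 or 2) at which the last negotiation
    # failed, or None if no negotiation has failed.
    failed_phase: Optional[int] = None


def display_status(tunnel):
    """Map raw tunnel state to the status an administrator would expect."""
    if tunnel.is_up:
        return "up"
    if tunnel.failed_phase is not None:
        # The ASA tried to bring the tunnel up and could not: a real problem.
        return "down"
    # Never triggered yet, or idle-timed out and waiting for interesting
    # traffic: technically not up, but not something to show in red.
    return "inactive"


print(display_status(Tunnel(is_up=True)))
print(display_status(Tunnel(is_up=False)))
print(display_status(Tunnel(is_up=False, failed_phase=1)))
```

The design choice worth noting is the third state: a tunnel that is not up only counts as down when there is evidence that something actually failed.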

 

Phases

For years, my first troubleshooting step on a tunnel that was down was to review logs and find out what phase negotiation failed at.  This tells you what set of variables you need to review for matching against your peer.  I'm very pleased that this first data point is now right in the monitoring tool that identified the tunnel as down to start with.  I hope it helps you guys get your tunnels back up faster.

 

 

Remote Access VPN

 

When users connect to the office using a software VPN client on their laptop, Cisco calls that Remote Access VPN.  As with Network Insight for F5, we are careful here to use the same terms as the manufacturer so it's easy to understand what we're talking about.

 

Again, we have to use both SNMP and API to get all the data we need to answer the following questions:

  • Who's connected?
  • Who tried to connect in the past, and what was the result?
  • How long have they been connected?
  • How much data have they uploaded and downloaded?
  • What is their session history?

 

 

Again, I'll highlight a few things.

 

List View

One of the challenges is the sheer number of remote access connections there are.  We know we do not do a good enough job of dealing with very large lists today, and our UI Framework team has been working on solving that.  This page is one of the first implementations of the new List View that they created.  This list view gives you the tools to easily deal with very large lists.  The left side of the screen lets you filter on anything shown on the right.  The filters available are considerate of the data and values seen on the right, so we don't have useless filters.  You can stack several filters and remove them individually.  Finally, after filtering your list you can still sort and search through those filtered results to further hone your list.

 

 

You'll see this list view a lot more as time passes.

 

Interfaces

 

Whereas interfaces are the main story on a switch or router, they're an important secondary story on an ASA.  We rebuilt the interfaces view from the ground up based on the List View.  Along the way, we made sure we were building it for a firewall.

 

 

NAMEIF

As my fellow ASA Administrators know, nameif is not a typo.  Nameif is the command you use to specify the name of an interface on an ASA.  A nameif must be configured for an interface to function, and from the moment you specify the nameif onward, every other element in the interface references the nameif.  ACLs, NAT, you name it.  In other words, the identity of an interface on an ASA is its nameif (like CPLANE or OUTSIDE), not its physical name (like GigabitEthernet0/2).  Accordingly, that is the primary name shown here, with the physical interface name shown only if the interface isn't in use and doesn't have a nameif.

 

Access Lists

If you have NCM to pull access lists from configs, we will identify which access list is applied to each interface and provide a link to review the access list.  This is super convenient in practice.

 

Security Level

Security levels have some control over what traffic the ASA allows.  They also provide a quick indicator of how much the administrator trusts the network connected to a specific interface.  Kind of important things for a firewall.

 

Favorites

Again, we're using the simple favorites concept.  I expect a lot of ASAs to have the interface connected to the Internet favorited!

 

 

Platform

 

All of the things described above are technology services that are built on a platform.  The platform must be healthy for the services to have any chance of being healthy.  The platform sub-view helps you understand the health of the platform.

 

 

High Availability

While high availability is a feature of many platforms, it seems to be particularly popular on ASAs.  Additionally, it seems Administrators have to fiddle with it a lot.  Administrators have to fail over to perform software upgrades; some choose to fail over to change circuits, to upgrade hardware, or for all sorts of other reasons.  While I'm concerned we are all using failover so often, it is clear that NPM has to provide great coverage for H/A.

 

In speaking with lots of ASA administrators we found several different behaviors.  Some administrators were unaware of whether their ASAs were really ready for failover or not.  Some check manually every once in a while, but have had an active ASA go down only to discover failover could not occur.  Some expert administrators were checking failover status, but were also checking the quality of failover that would occur by verifying configuration synchronization and state synchronization.

 

Our H/A resource takes the best practices we found were being manually used by expert administrators, automates the monitoring of them, and presents simple conclusions in the UI.  If everything is green, you get simple checks and a phrase explaining what is healthy.  If something goes wrong, you get a red X and a more verbose explanation.  For example, if the standby is ready but the config is not in sync, failover can occur but the behavior of the firewall may change.  Maybe your last ACL change was not copied to the standby, so it doesn't apply if there is a failover.  If the standby is ready but connection state information is not synced, failover can occur but all of your users will have to reestablish their connections.  Not good!
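The reasoning in that paragraph can be written down as a few checks. The Python sketch below is only an illustration of the logic described here, assuming three boolean inputs; the FailoverState fields and the message wording are hypothetical and are not NPM's actual implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FailoverState:
    standby_ready: bool   # is the standby unit ready to take over?
    config_synced: bool   # is the configuration in sync with the standby?
    state_synced: bool    # is connection state in sync with the standby?


def summarize_failover(s: FailoverState) -> List[str]:
    """Turn raw H/A checks into the kind of plain conclusions described above."""
    if not s.standby_ready:
        return ["Standby is not ready: failover cannot occur."]
    findings = []
    if not s.config_synced:
        findings.append(
            "Config is not in sync: failover can occur, "
            "but firewall behavior may change."
        )
    if not s.state_synced:
        findings.append(
            "Connection state is not in sync: failover can occur, "
            "but users will have to reestablish connections."
        )
    return findings or ["Healthy: standby is ready, config and state are in sync."]
```

Checking standby readiness first keeps the output short: if failover cannot happen at all, the sync details matter less.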

 

Of course you can alert on all of these things.

 

Connection Counts

Firewalls store information about each connection that is actively flowing through them at a given moment.  Because of that, there is a limit to how many concurrent connections they can handle, and this is one of the primary values used to determine what size firewall you need to buy.  It's obvious then that it should also be a crucial part of how we understand the load of the device in addition to RAM and CPU, so we've included it here.

 

Aggregating connection failure rates is an interesting way to get an indicator that something is amiss.  Perhaps your firewall is blocking a DDoS attack, or maybe a firewall rule change went awry.  Watching this one value can be a leading indicator of all sorts of specific problems.
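As a rough illustration of the two numbers discussed in this section, here is a Python sketch that computes connection utilization against a platform limit and a failure rate from two samples of a cumulative counter. The function names, and the assumption that failures arrive as an ever-increasing counter polled at a known interval, are mine for the example and do not describe how NPM collects the data.

```python
def connection_utilization(current: int, limit: int) -> float:
    """Percentage of the concurrent connection limit currently in use."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return 100.0 * current / limit


def failure_rate(prev_failed: int, curr_failed: int, interval_seconds: float) -> float:
    """Failed connections per second between two polls of a cumulative counter."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    # A counter that went backwards (e.g. after a reload) is treated as a restart at zero.
    delta = curr_failed - prev_failed if curr_failed >= prev_failed else curr_failed
    return delta / interval_seconds
```

A sudden jump in the second value between polls is the kind of signal worth alerting on, whatever the underlying cause turns out to be.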

 

Summary: Putting it all Together

If we've done our job, we're providing comprehensive coverage of the health and performance of an ASA on all of the sub-views.  Now, we pull all the information together and summarize it on the Summary page.

 

 

Details Widget

One of the things that really weighed down the Node Details page for most nodes was the Details resource.  This resource has historically been a catch-all for lots of little bits of largely static data users have asked us to show on this page.  The problem is that it kept growing and eventually took up nearly half the page with data that actually wasn't that commonly needed.  Here we have rebuilt the resource to focus on the most important data, with the additional data available within the "other details" drop down.  This also allowed us to move away from the archaic pattern of Name:value pairs in our UI.  Instead, we describe the device as your peer would.  You can see how the resource reads more like "this is <hostname>, the <context name> context on a <hardware model> running <software version>".

 

Also, did you know that what we called "resources" in the previous UI framework are called "widgets" in the new UI Framework?  There's your daily dose of useless trivia!

 

PerfStack Widget

Did you notice it?  The Load Summary and Bandwidth widgets on this page are powered by PerfStack charting.  Try clicking around on them.  It's oh so pleasant.  More to come on this later.

 

Favorites

The Bandwidth and Favorite Site-to-Site VPN widgets display information about the components you identified as your favorites on the other pages.  I think it's about time we recognized that all VPN tunnels and all interfaces are not equally important.  Some are so critical that their status alone is a big part of the answer to the question: how is the firewall running?  Favorites makes it easy to give them the attention they deserve.

 

Set Up Network Insight for ASA

 

To get this visibility in your environment, jump on over to the customer portal to download the new version.  After upgrading your NPM instance, the new ASA monitoring should "just work," but here are the specifics just in case.

 

Already monitoring ASAs?

The new monitoring will start up automatically.  Give the new version a couple minutes to poll and jump over to Node Details for one of your ASAs.  You'll get a bunch of new information out of the box.  For complete coverage as seen in the screenshots above, you'll be prompted to edit the node, check the "Advanced ASA" monitoring check box, and enter CLI credentials.  Make sure to look at the sub-views (mouse over to the left)!

 

There is one caveat.  If you've assigned a custom view to your ASAs, we will not overwrite that!  Instead, you will have to choose to manually change the view for your ASAs over to our new view.

 

Adding a new ASA?

 

Simply "Add Node" and select the "Advanced ASA" monitoring check box on the last step to enter CLI credentials.  That's it.  Give it a few minutes and check out the Node Details page for that ASA.

 

Conclusion

 

That does it for now.  You can click through the functionality yourself in our online demo.  I'd love to hear your feedback once you have it running in your environment!
