
Product Blog

721 posts

As organizations move workloads to the cloud or build new apps and services natively within it, the number of hybrid deployments and environments is increasing rapidly. In these hybrid environments, it’s critical to retain a single pane of glass that gives you visibility into your applications and infrastructure.

 

With the recent release of Server & Application Monitor (SAM) v6.7, we’ve built in support for monitoring Docker, Mesos, and Kubernetes containers, giving you infrastructure-wide visibility into physical, virtual, cloud, and now container infrastructure. As cloud services and cloud infrastructure continue to grow and see more adoption, Server & Application Monitor needs to grow with them. In this post, we’ll discuss some of the ways we’re supporting this growth and highlight several new additions that support Microsoft Azure PaaS services. We’ll dive deeply into three of these new additions below. And of course, as always, please let us know about any additional content you’d like to see SAM monitor.

 

The three new templates I am going to cover today for Azure include:

  • Azure App Service
  • Azure SQL Database
  • Azure Event Hub

 

For all three of these templates, be sure to install the following PowerShell modules on the system where SAM is installed. You can install them with these PowerShell commands:

  • Install-Module -Name Azure
  • Install-Module -Name AzureRM

 

Azure App Service:

Application Template can be downloaded here - Microsoft Azure App Service.apm-template

 

Prerequisites:

  1. To connect with your Azure account, the following parameters are required:

     subscriptionID, ApplicationID, TenantID, Secret Key, Application Name

     Note: The Azure app to monitor, with its name and ID, should have its role set to 'Contributor' or 'Reader' in Azure access control.

  2. Application name for which metrics will be calculated.

  3. Time interval for which data has to be fetched (in hours).

  4. PowerShell version 5.1 or later.

 

Script Argument:

  • Login credential to access Azure Portal. Azure details have to be passed in script arguments as per prerequisite #2.

Example:

             <SubscriptionID>,<TenantID>,<ApplicationID>,secretKey=<Enter SecretKey>,<ApplicationName>,TimeRange=<Time in hrs>

  • The ApplicationID with which you are making a connection to the Azure portal (as mentioned in Credential/Prerequisites) must be registered in Azure Active Directory as a contributor role for the monitored application.

        Reference link: https://support.solarwinds.com/Success_Center/Server_Application_Monitor_(SAM)/Knowledgebase_Articles/Add_an_Azure_Active_Directory_app_for_cloud_monitoring_in_the_Orion_Platform
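To make the argument layout above concrete, here is a minimal Python sketch that splits a string in that format into named fields. It is only an illustration of the format shown in the example; the template itself is PowerShell-based, and the function and field names below are hypothetical. It also assumes the secret key contains no commas.

```python
from typing import NamedTuple


class AzureScriptArgs(NamedTuple):
    """Hypothetical container for the six script arguments."""
    subscription_id: str
    tenant_id: str
    application_id: str
    secret_key: str
    application_name: str
    time_range_hours: int


def _value_of(field: str, key: str) -> str:
    """Return the value of a 'key=value' field, or raise if the key is missing."""
    prefix = key + "="
    if not field.startswith(prefix):
        raise ValueError(f"expected '{prefix}<value>', got '{field}'")
    return field[len(prefix):]


def parse_script_args(raw: str) -> AzureScriptArgs:
    """Split the comma-separated argument string into named fields.

    Assumes the order shown in the example above and that no field
    (including the secret key) contains a comma.
    """
    parts = [part.strip() for part in raw.split(",")]
    if len(parts) != 6:
        raise ValueError(f"expected 6 arguments, got {len(parts)}")
    subscription, tenant, app_id, secret, app_name, time_range = parts
    hours = int(_value_of(time_range, "TimeRange"))
    if hours <= 0:
        raise ValueError("TimeRange must be a positive number of hours")
    return AzureScriptArgs(
        subscription, tenant, app_id,
        _value_of(secret, "secretKey"), app_name, hours,
    )


args = parse_script_args("sub-1,tenant-1,app-1,secretKey=s3cr3t,MyWebApp,TimeRange=2")
print(args.application_name, args.time_range_hours)
```

Validating the string up front like this can help catch a swapped or missing argument before the monitor ever tries to authenticate.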

 

Portions of this document were originally created by and are excerpted from the following sources:

https://docs.microsoft.com/en-us/azure/app-service/web-sites-monitor   

https://docs.microsoft.com/en-us/powershell/azure/authenticate-azureps?view=azurermps-6.7.0

https://docs.microsoft.com/en-us/powershell/module/azurerm.insights/?view=azurermps-6.7.0&viewFallbackFrom=azurermps6.7.0#monitor

 

MONITORED COMPONENTS

  • Average number of bytes sent

      This monitor provides the average number of bytes sent for the given app.

      Unit: MB (megabytes)

  • Total number of 2xx requests

      This monitor provides the count of requests resulting in an HTTP status code >= 200 but < 300 for the given app.

      Unit: Count

  • Total number of 3xx requests

      This monitor provides the count of requests resulting in an HTTP status code >= 300 but < 400 for the given app.

      Unit: Count

  • Total number of 401 requests

      This monitor provides the count of requests resulting in HTTP 401 status code for the given app.

      Unit: Count

  • Total number of 403 requests

      This monitor provides the count of requests resulting in HTTP 403 status code for the given app.

      Unit: Count

  • Total number of 404 requests

      This monitor provides the count of requests resulting in HTTP 404 status code for the given app.

      Unit: Count

  • Total number of 406 requests

      This monitor provides the count of requests resulting in HTTP 406 status code for the given app.

      Unit: Count

  • Total number of 4xx requests

      This monitor provides the count of requests resulting in an HTTP status code >= 400 but < 500 for the given app.

      Unit: Count

  • Total number of 5xx requests

      This monitor provides the count of requests resulting in an HTTP status code >= 500 but < 600 for the given app.

      Unit: Count

  • Total number of requests served by the app

      This monitor provides the total number of requests regardless of their resulting HTTP status code for the given app.

      Unit: Count

  • Average number of bytes received

      This monitor provides the average number of bytes received for the given app.

      Unit: MB (megabytes)

  • Average memory used

      This monitor provides the average amount of memory in MBs used by the given app.

      Unit: MB (megabytes)

  • Average response time

      This monitor provides the average time taken for the app to serve requests in milliseconds (ms).

      Unit: ms (milliseconds)
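The request counters above overlap: a 404 is counted both in its own monitor and in the 4xx monitor, and every request counts toward the total. The short Python sketch below shows that bucketing on a made-up list of status codes. It illustrates the definitions only and is not how the template collects its data; the function names are hypothetical.

```python
from collections import Counter

# Status codes that have their own dedicated monitor in the list above.
TRACKED_CODES = (401, 403, 404, 406)


def status_class(code: int) -> str:
    """Map an HTTP status code to its class, e.g. 404 -> '4xx'."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return f"{code // 100}xx"


def summarize(codes) -> Counter:
    """Count requests per class, per tracked code, and in total."""
    counts = Counter()
    for code in codes:
        counts["total"] += 1
        counts[status_class(code)] += 1
        if code in TRACKED_CODES:
            counts[str(code)] += 1
    return counts


sample = [200, 204, 301, 401, 404, 404, 500]  # made-up sample data
summary = summarize(sample)
print(summary["4xx"], summary["404"], summary["total"])
```

Because the 4xx count includes the 401, 403, 404, and 406 counts, the per-code monitors should never be added to the 4xx monitor when totaling errors.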

 

TROUBLESHOOTING STEPS

Detailed troubleshooting steps (common for template):

  • Check that the PowerShell version is 5.1 or later and that the Azure module is installed on the system where the template will run.
  • The template uses PowerShell components; the script should run with administrator privileges.

Detailed troubleshooting steps (specific to components):

  • Components connect to Azure using service principal authentication, for which an application has to be created in the Azure portal. See the link below:

     https://docs.microsoft.com/en-us/azure/azure-stack/azure-stack-create-service-principals

  • Grant Azure IAM permissions to the application created in the previous step. See the link below:

     https://support.solarwinds.com/Success_Center/Server_Application_Monitor_(SAM)/Knowledgebase_Articles/Configure_Azure_IAM_permissions_for_cloud_monitoring_in_the_Orion_Platform

  • The script fetches data based on the time range given as the last script argument. By default, the script fetches data for the past hour. When giving the time range, make sure data is available for the metric during that time; otherwise, the component will be unable to fetch the data.
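As a rough illustration of the time range behavior described in the last bullet, the Python sketch below turns a number of hours into a start and end time, defaulting to the past hour. The function name and the use of UTC are assumptions made for the example; the template's own script may compute its window differently.

```python
from datetime import datetime, timedelta, timezone


def query_window(hours: int = 1, now: datetime = None):
    """Return (start, end) covering the last `hours` hours.

    `now` can be passed in for repeatable results; otherwise the
    current UTC time is used. Defaults to the past hour.
    """
    if hours <= 0:
        raise ValueError("hours must be positive")
    end = now if now is not None else datetime.now(timezone.utc)
    return end - timedelta(hours=hours), end


start, end = query_window(3)
print(f"Fetching metrics from {start.isoformat()} to {end.isoformat()}")
```

If a metric has no data points inside the resulting window, there is nothing to return, which matches the "unable to fetch the data" symptom described above.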

 

Azure SQL Database:

Application Template can be downloaded here: Microsoft Azure SQL Database.apm-template

 

Prerequisites:

  1. To connect with your Azure account, the following parameters are required: subscriptionID, ApplicationID, TenantID, Secret Key.

     Note: Use any Azure app (with its name and ID) that has the 'Read Only' role.

  2. SQL Server database name for which metrics have to be calculated.

  3. Time interval for which data has to be fetched (in hours).

  4. PowerShell version 5.0 or later.

 

Credentials:

  1. Login credential to access your Azure Portal. This has to be passed as script arguments per prerequisite #2, as listed above, e.g., <subscriptionID>, <TenantID>, <ApplicationID>, value=<Secret Key>, <Application Name>, value=<Time Interval>, <Database Name>
  2. Windows Administrator on the machine where the template runs.

 

Portions of this document were originally created by and are excerpted from the following sources:

https://azure.microsoft.com/en-in/blog/windows-azure-sql-database-management-with-powershell/

https://docs.microsoft.com/en-us/azure/monitoring-and-diagnostics/monitoring-supported-metrics

 

MONITORED COMPONENTS

  • Blocked Connections

      This metric provides the average number of firewall blocked connections established for the given SQL database during the time period specified as the polling frequency.

      Unit: Count

  • Failed Connections

      This monitor provides the average number of failed connections established for the given SQL database during the time period specified as the polling frequency.

      Unit: Count

  • Successful Connections

      This metric provides the average number of successful connections established for the given SQL database during the time period specified as the polling frequency.

      Unit: Count

  • Deadlocks

      This metric provides the average number of deadlocks that occurred for the given SQL database during the time period specified as the polling frequency.

      Unit: Count

  • Database throughput units (DTU) limit

      This metric provides the average database throughput limit in units for the given SQL database during the time period specified as the polling frequency.

      Unit: Count

  • Database throughput units (DTU) used

      This metric provides the average database throughput units used for the given SQL database during the time period specified as the polling frequency.

      Unit: Count

  • Sessions percentage

      This metric provides the average percentage of available sessions used for the given SQL database during the time period specified as the polling frequency.

      Unit: Percent

  • Database size percentage

      This metric provides the average percentage of storage used for the given SQL database during the time period specified as the polling frequency.

      Unit: Percent

  • Total database size

      This metric provides the average for the total database size for the given SQL database during the time period specified as the polling frequency.

      Unit: Megabytes

  • Workers percentage

      This metric provides the average percentage of available workers used for the given SQL database during the time period specified as the polling frequency.

      Unit: Percent

  • Average CPU utilization

      This metric provides the average percent CPU used for the given SQL database during the time period specified as the polling frequency.

      Unit: Percent

  • Average IO utilization

      This metric provides the average percentage of data IO used for the given SQL database during the time period specified as the polling frequency.

      Unit: Percent

  • Average log utilization

      This metric provides the average percentage of log IO used for the given SQL database during the time period specified as the polling frequency.

      Unit: Percent

  • In-Memory OLTP storage percent

      This monitor provides the average In-Memory OLTP (Online Transaction Processing) storage percent for the given SQL database during the time period specified as the polling frequency.

      Unit: Percent

  • Database throughput unit (DTU) percentage

      This metric provides the average percentage of database throughput units used for the given SQL database during the time period specified as the polling frequency.

      Unit: Percent
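The three DTU monitors are related. As a minimal sketch, and assuming the percentage is simply DTU used divided by the DTU limit, the relationship can be written in Python as follows. The function name is hypothetical, and the sample numbers are made up.

```python
def dtu_percentage(dtu_used: float, dtu_limit: float) -> float:
    """Return DTU consumption as a percentage of the limit.

    Assumption: percentage = used / limit * 100.
    """
    if dtu_limit <= 0:
        raise ValueError("dtu_limit must be positive")
    return 100.0 * dtu_used / dtu_limit


# Made-up sample: a database with a limit of 100 DTUs averaging 25 DTUs used.
print(dtu_percentage(25, 100))
```

Watching the percentage rather than the raw DTU count keeps thresholds meaningful if the database is later moved to a tier with a different limit.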

 

Azure Event Hub:

Application Template can be downloaded from here: Microsoft Azure Event Hub Namespace.apm-template

 

Prerequisites:

  1. To connect with your Azure account, the following parameters are required: subscriptionID, ApplicationID, TenantID, Secret Key, Application Name

     Note: Use any Azure app (with its name and ID) that has the 'Read Only' role.

  2. Namespace for which metrics have to be calculated.
  3. Time interval for which data has to be fetched (in hours).
  4. PowerShell version 5.0 or later.

 

Credentials:

  1. Login credential to access the Azure Portal. This has to be passed as script arguments per prerequisite #2, listed above, e.g., <subscriptionID>, <TenantID>, <ApplicationID>, value=<Secret Key>, <Application Name>, value=<Time Interval>, <Application Name>
  2. Windows Administrator on the machine where the template runs.

 

Portions of this document were originally created by and are excerpted from the following sources:

https://docs.microsoft.com/en-us/azure/event-hubs/event-hubs-quickstart-powershell

https://docs.microsoft.com/en-us/azure/monitoring-and-diagnostics/monitoring-supported-metrics

https://docs.microsoft.com/en-us/azure/event-hubs/event-hubs-metrics-azure-monitor


MONITORED COMPONENTS

  • Archive backlog messages

      This monitor provides the total archive messages in backlog for the given namespace via a PowerShell cmdlet.

      Unit: Count

  • Archive message throughput

      This monitor provides the total Event Hub archived message throughput for the given namespace via a PowerShell cmdlet.

      Unit: Bytes

  • Archive messages

      This monitor provides the total Event Hub archived messages for the given namespace via a PowerShell cmdlet.

      Unit: Count

  • Incoming bytes

      This monitor provides the total Event Hub incoming message throughput for the given namespace via a PowerShell cmdlet.

      Unit: Bytes

  • Outgoing bytes

      This monitor provides the total Event Hub outgoing message throughput for the given namespace via a PowerShell cmdlet.

      Unit: Bytes

  • Average Disk Seconds per Write

      Average Disk Seconds per Write is the average time of a write of data to the disk.

  • Incoming Messages

      This monitor provides the total incoming messages for the given namespace via a PowerShell cmdlet.

      Unit: Count

  • Incoming Requests

      This monitor provides the total incoming send requests for the given namespace via a PowerShell cmdlet.

      Unit: Count

  • Internal Server Errors

      This monitor provides the total internal server errors for the given namespace via a PowerShell cmdlet.

      Unit: Count

  • Other Errors

      This monitor provides the total failed requests for the given namespace via a PowerShell cmdlet.

      Unit: Count

  • Outgoing Messages

      This monitor provides the total outgoing messages for the given namespace via a PowerShell cmdlet.

      Unit: Count

  • Successful Requests

      This monitor provides the total successful requests for the given namespace via a PowerShell cmdlet.

      Unit: Count

  • Server Busy Errors

      This monitor provides the total server busy errors for the given namespace via a PowerShell cmdlet.

      Unit: Count

 

 

The SolarWinds trademarks, service marks, and logos are the exclusive property of SolarWinds Worldwide, LLC or its affiliates. All other trademarks are the property of their respective owners.

I’m happy to announce the General Availability of Storage Resource Monitor 6.7. This release continues the momentum of supporting arrays that you all requested on THWACK®. It also comes with new functionality, both some that you all have asked for and some that we think you’ll be surprised by and excited to check out. So, without further ado, why don’t we take a look at what’s new? (Note: that rhymed)

 

(Also, don’t forget to check out the SRM 6.7 Release Notes for more information about installing, upgrading, and new fixes.)

 

New Array Support

As you have probably come to expect, the aforementioned support includes all the standard features you know and love: capacity utilization and forecasting, performance monitoring, end-to-end mapping in AppStack™, integrated performance troubleshooting in PerfStack™, and Hardware Health. Now, as of SRM 6.7, we support the following devices:

 

  • Huawei OceanStor v3 Series
  • Huawei OceanStor v5 Series
  • Huawei Dorado V3 Series

 

What’s that? You want to see the screenshots to prove it? We can provide that.

 

Summary View

 

Block Storage

 

File Storage

 

Hardware Health (don’t worry about the critical state – we’re on it.)

 

New Hardware Health Support

What if you were able to extend the previous screenshot to arrays that you’re already using SRM to monitor? And you were able to see details on fans, power supplies, batteries, and more? With SRM 6.7 you can do that for the following array vendors and types:

 

  • EMC Isilon v8
  • NetApp

 

How great is that? And yes, of course, here’s your screenshot:

 

 

 

Support for GPT Format Partitions

This is something we’ve heard from some of you recently. And you all have discussed it on THWACK as well. So, as normal, ask and you shall receive. With SRM 6.7, we have added support for GPT drives on Windows Server 2008 and later. Here’s a screenshot of what it’s going to look like when you’re selecting what to monitor (after that step, it’s going to be the same great experience you’ve come to expect from your MBR partitions):

Support for Storage in Orion Maps

This feature comes with a tremendous amount of capability. So much so, that we have dedicated blog posts on it. But before I link you to that, I’ll write a quick note about what it means for you and SRM.

 

Starting with our newest releases (which include SRM 6.7), Orion® Maps are now going to include data all the way down to your storage environment. That’s huge. Say, for example, if all you think (and want to think) about all day is storage, you can now open an Orion Maps view from your storage device screen and sit back as a map is built for you that shows you what VMs sit on top of that storage. Automatically—you don’t have to do a thing. How great is that? What you do with that information (read “who you go after”) is up to you.

 

And, you want a screenshot? Forget screenshots. This one is animated:

 

 

 

Again, there is a ton here to cover, so I’ll leave you with the short taste above and this link to a wonderfully written post on the topic.

 

What’s Next

That’s it for the meaty functionality of SRM 6.7. There was a lot there and we’re excited to share it all with you.

 

If you don’t see what you’re looking for, head over to our What We’re Working On post to see what our storage team is already working on for our next releases. If you don’t see what you want there, make sure to add it to the Storage Manager (Storage Profiler) Feature Requests page.

 

And of course, we want to know what you think—just let it be known below in the comments.

 


I am pleased and excited to announce the latest addition to the SolarWinds product family on the Orion® Platform: Server Configuration Monitor, or SCM. For those of you who are familiar with our Network Configuration Manager (NCM) product, this new product is its sibling, with a focus on systems and applications.

With this new product, SolarWinds continues to deliver on our unexpected simplicity promise of building simple, powerful, and affordable products.  SCM, built on the Orion Platform, is designed to enable you to capture, visualize, and understand changes in your environment in near real-time.

With this first version, we are focused on Windows systems and enabling customers to version and diff the following elements.

  • Hardware
  • Software
  • Registry
  • Files
  • Microsoft IIS

 

While we have already delivered v1, the team is hard at work to continue to advance and expand this product as outlined in our “What We Are Working On” post.  SCM is licensed by nodes, a simple-to-understand model, and pricing is affordable to many.  Let’s walk through a couple of use cases of how you could use the product.

 

Web Application Outage:

At the same time that you receive an alert from Server & Application Monitor indicating a critical web application is down, you also begin to receive calls from users indicating they cannot access it.

 

Where do you start your investigation to determine where the problem lies?

  • Is this a networking problem?
  • Is the hardware having a problem?
  • Is this a storage or database problem?
  • Is this application running on your virtual platform and is that having an issue?

 

While investigating each of these areas, the clock is ticking and you are getting more and more calls. 

Using the SolarWinds® PerfStack dashboard for a real-time view into multiple different metrics and parameters, as highlighted in the below screenshot, I can see that right before the web application went down, a configuration change was detected.

 

 

In the data explorer, I click on the web.config link and am immediately taken to a comparison or “diff” page showing me what changed. As you can see, someone made some edits to this config file and now the XML structure is broken, which took down the web application. Now you can quickly remote desktop into that machine, change this back, save the file, and get the web application back up and operational. So, what might have taken 30-60 minutes of investigating the infrastructure to find the root cause was isolated and addressed in minutes with SCM.
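To give a feel for what such a comparison surfaces, here is a small Python sketch that diffs two versions of a config file and checks whether the result is still well-formed XML. It uses only the Python standard library, and the sample web.config contents are made up for the example. It is not how SCM produces its diff view.

```python
import difflib
import xml.etree.ElementTree as ET


def config_diff(before: str, after: str, name: str = "web.config") -> str:
    """Return a unified diff between two versions of a config file."""
    lines = difflib.unified_diff(
        before.splitlines(),
        after.splitlines(),
        fromfile=f"{name} (baseline)",
        tofile=f"{name} (current)",
        lineterm="",
    )
    return "\n".join(lines)


def is_well_formed_xml(text: str) -> bool:
    """Return True if the text parses as XML, False otherwise."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# Made-up sample content: the edited version leaves the <add> element unclosed.
BEFORE = """<configuration>
  <appSettings>
    <add key="Mode" value="Production" />
  </appSettings>
</configuration>"""

AFTER = """<configuration>
  <appSettings>
    <add key="Mode" value="Debug" >
  </appSettings>
</configuration>"""

print(config_diff(BEFORE, AFTER))
print("Still valid XML:", is_well_formed_xml(AFTER))
```

The diff points at the single edited line, and the parse check confirms that the edit broke the XML structure, which is the same pair of clues used in the walkthrough above.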

 

 

Let’s briefly touch on a couple of other use cases for Server Configuration Monitor.

  • Unauthorized Software
  • Malware Security Incident
  • Hardware Change

 

THE ARCHITECTURE

So how does it work? It all depends on what you want to monitor. If you are only interested in hardware and software versioning, we can do this without an agent. However, if you want to monitor files for changes, registry, or Microsoft IIS, then you will need to deploy the Orion Agent onto this machine. If you already have the Orion Agent deployed on your machine (for example, for monitoring in SAM) SCM is an add-in package in the Orion Agent that just needs to be enabled. The key point I am making here is this is not a separate agent.

 

 

WHAT ELSE?

Don't see what you are looking for here?

 

Visit the SolarWinds Server Configuration Monitor forum.

Check out the What We're Working On for SolarWinds Server Configuration Monitor post to see what our dedicated team is already looking at.

If you don't see everything you've been wishing for there, add it to the SolarWinds SCM Feature Requests.

 

We are excited to get this out there and begin to gather input and feedback. Don’t forget you can quickly and easily install a free, fully featured 30-day trial to kick the tires yourself.

 


Upgrading SolarWinds Orion Platform Products is Amazing

By Destiny Bertucci

 

 

          I know what you’re thinking right now, “She's out of her darn MIND!” Bear with me for a moment here. I’ve seen a lot of failed upgrades and pushback on upgrading systems to newer OS and application versions. However, I’ve seen many more, and even smoother, upgrades in the past few years, and they have made me want to make sure everyone has the best experience possible when upgrading. This means I’ve gathered information that can help you be more knowledgeable about why you should upgrade and how to get the best features available, all while achieving more secure options for your environment. Without further ado, let’s dive in.

 

          I’d like to start with some necessary information to help you prepare for upgrading, no matter what level you are currently on. I can help guide you to a better environment with the SolarWinds® Orion® Platform while maintaining proper control on how it should be done to help sidestep some “gotcha” moments.

  1. You must know what version you are on, period. When I say that, I mean I’d like for you to have a notepad or an Excel® sheet that allows you to have all the info on your environment readily available. I’ve attached the one I currently use while managing my environments.
  2. You’ll need to know where to find your version and upgrade path:
       • If you are on 12.0 or above, use this: https://support.solarwinds.com/Success_Center/Orion_Platform/Orion_Documentation/SolarWinds_Orion_Installer
       • If you are below 12.0, please use the following: https://customerportal.solarwinds.com/support/product-upgrade-advisor
  3. Check out Windows® version support for each level of SolarWinds Orion Platform products:  https://support.solarwinds.com/Success_Center/Orion_Platform/Knowledgebase_Articles/Windows_Server_2012_2016_and_SQL_Server_2012_2014_2016_and_2017_Support
  4. My favorite information is the migration guide. Because sometimes when you’re behind in the upgrade cycles, you realize you need a complete overhaul of your environment. Again, perfectly fine! Sometimes it’s even best to migrate when upgrading because you can stay up to date more easily on a new platform. So, this guide is one I keep near and dear to my heart: https://support.solarwinds.com/Success_Center/Network_Performance_Monitor_(NPM)/NPM_Documentation/Migration_Guide
  5. DBAs love information about the types of databases needed and/or used. Here’s a link to help everyone on your environment team be aware of the end game with databases: https://support.solarwinds.com/Success_Center/Orion_Platform/Knowledgebase_Articles/Databases_used_by_SolarWinds_modules
  6. SQL Server® requirements: https://support.solarwinds.com/Success_Center/Orion_Platform/Knowledgebase_Articles/Databases_used_by_SolarWinds_modules
  7. Port requirements: https://support.solarwinds.com/Success_Center/Network_Automation_Manager/NAM_Install_Guide/030/020
  8. Look up each module’s requirements, so you’re creating an environment that lasts and is a pleasant environment for users to use. There is nothing worse than waiting for the page to load because the database is underpowered OR the NetFlow database is underpowered for the number of flows you are using. Please acquaint yourself with the SolarWinds Customer Success Center and use it to find the system requirements you need. 
  9. Here is an excellent link from our awesome community members on in-place upgrades for SQL: https://thwack.solarwinds.com/message/398951#398951

        

           Now that you have gathered the information that you need, let’s talk about why you would want to upgrade. With the ability to use configurations within your network devices to visualize data, it’s vital to bring in these devices and use them to stay ahead of issues better and even solve some issues you may not have seen before.

         

          What in the world is she talking about now? Well, how about being able to see interface config snippets for your Cisco® devices on the interface details page? Or visualizing a switch stack for full redundancy, and using NetPath network path analysis  to break through your firewall to show you connection points from end to end? One major reason you may want to upgrade is to simplify your environment’s “break-down” moments. 

        

           SolarWinds has been working one-on-one with IT groups in all departments to understand and work to solve their frustrations, such as being able to visualize those virtual port channel bundles. Instead of waiting for an alert, it would be nice to shake out your monitoring and management environment to allow yourself to see clearly and make decisions based on your baselines that match your unique setup.

         

          Security-wise, let’s be honest… if you’re on an unsupported version of Windows or SQL Server, that’s a security issue, big time. If they’re not patchable, they are NOT on my environment. Security should be a focus for you, especially for older versions of .NET. Let’s get our heads in the game and start visualizing these upgrades and making them happen, you know, for security’s sake and all.

          All the data I provided here SHOULD allow you to have a successful upgrade in your future. If you have any suggestions for upgrading, please drop me a line!

 

 

~Dez~

The latest version of Server & Application Monitor (SAM) is now generally available. SAM v6.7 adds some very exciting and cool new features, which I will walk through in this post. If you are an existing customer, head on over to the customer portal to get the latest bits.

 

Outside of product releases, we’re also working on creating new or enhancing existing application template content for use in SAM. We recently published some content on THWACK®, which I covered here and here.

 

As always, we love to hear your feedback on these features and how you would like to see them enhanced and evolved, to help us continue to make SAM a better product.

 

Container Monitoring:

Available in both SAM and Virtualization Manager (VMAN), we have added our first version of container monitoring for Docker, Kubernetes, and Mesos. With this enhanced visibility, SAM can now provide insight into not just your physical infrastructure, but also virtual, cloud, and now container-based workloads.

 

Serena wrote up a document on this, going step-by-step through the configuration and deployment process, but essentially what we are doing is deploying a monitoring container within the container environment you wish to monitor. The information we are collecting is available in both AppStack and PerfStack for real-time troubleshooting.

 

In a future post, we will walk through how you can use SAM application templates to monitor applications in those containers.

 

Orion Maps and AppMap with ADM

The Orion® Maps team is on a roll coming off the initial release of Orion Maps in Q2, and this quarter they have added a bunch of new features, including support for leveraging the Application Dependency functionality added in SAM 6.6 to illustrate application or service dependencies in the maps.

 

Jeff Blank wrote up a very detailed and terrific document on these map enhancements here.

 

We think that these maps are fantastic because, first, they’re dynamic. Second, besides understanding the infrastructure relationships, the maps can help you understand which services and applications are talking to what and where.

 

Below is a screen grab from Jeff’s article illustrating these dependencies in an Active Directory environment.

 

 

 

SolarWinds APM Integration

For those who have been around SolarWinds for some time, I need you to put on your amnesia hats for a minute. I know back in the early days, SAM used to be called APM as well, but this is application performance management as defined by market analysts to give you code-level visibility into your custom applications.

 

SolarWinds® APM is a new product we are offering. It is designed to offer a very tight integration with SAM, giving users insight into their IIS-based, .NET applications natively within the SAM web console. SolarWinds APM is a cloud-based product based on SolarWinds AppOptics. In the screenshot below, we are pulling that data in real-time from the APM cloud service via API into the SAM console with the SAM look and feel from a charting and visual perspective.

 

When we were first considering this product and integration, we spoke with many SAM customers about their interest in functionality like this and how this typically flowed process-wise in their environments. The overwhelming feedback from folks was that with SAM and the other Orion Platform products, they had solid visibility into the infrastructure and off-the-shelf applications, but limited visibility into custom apps. When end users would report issues about these applications, it was hard for them to determine if it truly was a problem with the application or with something else.

 

If you upgrade to SAM v6.7, under the Settings page in the product, there is a new UI option called APM Deployment Summary. If you are interested in trying this product out, you can sign up for a 30-day trial directly from within SAM. The integration will be set up for you with your SAM deployment. SolarWinds APM can also be used standalone if that is your preference; the option is yours.

 

 

The team is already hard at work on the next version of SAM, as you can see covered here in the “What We are Working On” post. Also, please keep the feedback coming on what you think and what you would like to see in the product in the ideas section of the forum.

Hot on the heels of my previous post on new Server & Application Monitor content for Microsoft SQL and Exchange, as well as SAP HANA, we now also have some new and enhanced content for monitoring your Oracle databases. As mentioned in my previous blog post, there will be a steady drumbeat of new and enhanced monitoring content for SAM, so please keep an eye out on THWACK®, and I will keep you up to date via the product blog as well.

 

First off, as of today, these templates still require additional components to be installed on the Orion® Server and/or poller from which these databases are monitored. We have documentation about this already in the Success Center here - https://support.solarwinds.com/Success_Center/Server_Application_Monitor_(SAM)/Knowledgebase_Articles/Configure_SAM_to_monitor_an_Oracle_Database_Server


Oracle Database: 
https://thwack.solarwinds.com/docs/DOC-203309

This template contains newly added performance and statistics counters for Oracle Database.

 

Prerequisites: Oracle client installed on the Orion SAM server. This is available from the SolarWinds customer portal under Additional Downloads.

Credentials: An Oracle username and password with read access to the Oracle tables.

 

MONITORED COMPONENTS

Components without predetermined threshold values provide guidance such as "Use the lowest threshold possible" or "Use the highest threshold possible" to help you find an appropriate threshold for your application. For more information, see http://knowledgebase.solarwinds.com/kb/questions/2415.

 

  • SGA Size

     This component monitor returns the size of the System Global Area (SGA), the part of system memory (RAM) shared by all the processes belonging to a single Oracle database instance.

     Unit: Bytes

     Source: https://docs.oracle.com/cd/B28359_01/server.111/b28320/dynviews_3028.htm#REFRN30233

  • PGA Size

     Program Global Area (PGA) is a private memory region that contains the data and control information for a server process. Only a server process can access the PGA. Oracle Database reads and writes information in the PGA on behalf of the server process. Oracle Database automatically sizes the PGA by dynamically adjusting the portion of the PGA memory dedicated to work areas, based on 20% of the SGA memory size. The minimum value is 10MB.

     This component monitor returns the PGA memory currently allocated by the process (including free PGA memory not yet released to the operating system by the server process).

     Unit: Bytes

     Source: https://docs.oracle.com/cd/B28359_01/server.111/b28320/dynviews_2098.htm#REFRN30186

  • Buffer Pool Size

     This component monitors the buffer pool size for the Oracle Database. The default buffer pool size is determined by the DB_CACHE_SIZE initialization parameter.

     Unit: Bytes

     Source: https://docs.oracle.com/en/database/oracle/oracle-database/12.2/refrn/V-BUFFER_POOL.html#GUID-1E70B05F-6E52-44B0-AFB3-5ADDA620008D

  • Shared Pool Size

     This component monitors the shared pool area size. The shared pool is a memory area within the System Global Area (SGA) that is created at startup time. Its size depends on the size of your RAM.

     Unit: Bytes

     Source: https://docs.oracle.com/cd/B19306_01/server.102/b14237/dynviews_2106.htm#REFRN30238

  • Buffer Pool Response Time

     This component monitors the buffer pool response time. The value should be low for good performance.

     The template's query calculates the response time for logical reads per second from the buffer over a 15-second interval.

     Unit: Seconds

     Source: https://docs.oracle.com/cd/E11882_01/server.112/e40402/dynviews_3090.htm#REFRN30343

  • Single block read response time

     This component monitors the cumulative single-block read response time at the file level in seconds. This value should be low. A high value means a high latency.

     Unit: Seconds

     Source: https://docs.oracle.com/en/database/oracle/oracle-database/12.2/refrn/V-FILESTAT.html#GUID-9DF61EA4-EF94-4F60-B966-D1B9AFEFF3E0

  • Multi block read response time

     This component monitors the cumulative multi-block read response time at file level in seconds. This value should be low. A high value means a high latency.

     Unit: Seconds

     Source: https://docs.oracle.com/en/database/oracle/oracle-database/12.2/refrn/V-FILESTAT.html#GUID-9DF61EA4-EF94-4F60-B966-D1B9AFEFF3E0

  • Log write response time

     This component monitors log write response time. The response time here includes the write time plus the time the log writer spent waiting.

     Unit: Seconds

     Source: https://docs.oracle.com/cd/E18283_01/server.112/e17110/statviews_4061.htm

     https://docs.oracle.com/cd/B19306_01/server.102/b14237/statviews_3177.htm

  • Physical I/O total rate

     This component monitors the physical I/O total rate. The total rate includes read rate + write rate per sec.

     A high value means better performance.

     Unit: Bytes/second

     Source: https://docs.oracle.com/cd/E11882_01/server.112/e40402/dynviews_3090.htm

  • Physical I/O read rate

     This component monitors the physical I/O read rate per sec.

     A high value means better performance.

     Unit: Bytes/second

     Source: https://docs.oracle.com/cd/E11882_01/server.112/e40402/dynviews_3090.htm

  • Physical I/O write rate

     This component monitors the physical I/O write rate per sec.

     A high value means better performance.

     Unit: Bytes/second

     Source: https://docs.oracle.com/cd/E11882_01/server.112/e40402/dynviews_3090.htm

  • Commit latency

     This component monitors latency for commits by all users. A null value means the number of commits per second is 0.

     Unit: Seconds

     Source: https://docs.oracle.com/cd/E11882_01/server.112/e40402/dynviews_3090.htm

  • SQL*Net receive rate

     This component monitors SQL*Net receive rate (clients + dblinks). In other words, bytes received via SQL*Net from client + bytes received via SQL*Net from dblink.

     Unit: Bytes/second

     Source: https://docs.oracle.com/cd/B28359_01/server.111/b28320/dynviews_3086.htm

  • SQL*Net send rate

     This component monitors SQL*Net send rate (clients + dblinks). In other words, bytes sent via SQL*Net to client + bytes sent via SQL*Net to dblink.

     Unit: Bytes/second

     Source: https://docs.oracle.com/cd/B28359_01/server.111/b28320/dynviews_3086.htm

  • Active sessions total

     This component monitors the total number of active sessions at any moment.

     Unit: Count

     Source: https://docs.oracle.com/cd/B19306_01/server.102/b14237/dynviews_2088.htm

  • Active sessions waiting

     This component monitors the number of active sessions waiting to be run.

     Unit: Count

     Source: https://docs.oracle.com/cd/B19306_01/server.102/b14237/dynviews_2088.htm

  • Active sessions working

     This component monitors the total number of active sessions currently executing on CPU.

     Unit: Count

     Source: https://docs.oracle.com/cd/B19306_01/server.102/b14237/dynviews_2088.htm

  • Blocked sessions

     This component monitors the total number of sessions blocked by other sessions.

     Unit: Count

     Source: https://docs.oracle.com/cd/E11882_01/server.112/e40402/dynviews_3017.htm

  • Connections

     This component monitors the total number of active connections at any point.

     Unit: Count

     Sources:

     https://docs.oracle.com/cd/B19306_01/server.102/b14237/dynviews_2088.htm

     https://docs.oracle.com/cd/B19306_01/server.102/b14237/dynviews_2129.htm

     https://docs.oracle.com/cd/B28359_01/server.111/b28320/dynviews_2098.htm

  • Request rate

     This component monitors the total number of incoming requests per second. A high number of requests might be a reason for slow response.

     Unit: Count/second

     Source: https://docs.oracle.com/cd/E11882_01/server.112/e40402/dynviews_3090.htm

  • Database Size (size of all tablespaces)

     This component monitors the total database size (size of all tablespaces) of the Oracle Database. The value is fetched in bytes.

     Unit: Bytes

     Sources:

     https://docs.oracle.com/cd/B19306_01/server.102/b14237/statviews_3122.htm

     https://docs.oracle.com/cd/B19306_01/server.102/b14237/statviews_3083.htm

  • Database Used Space (amount actually used)

     This component monitors the total database used space.

     Unit: Bytes

     Sources:

     https://docs.oracle.com/cd/B19306_01/server.102/b14237/statviews_3122.htm

     https://docs.oracle.com/cd/B19306_01/server.102/b14237/statviews_3083.htm

  • SQL Parse to execute ratio

     This component monitors SQL parsing to execute ratio.

     The template's query calculates the ratio by dividing parse count by execution count. A higher ratio means better performance.

     Unit: Percent

     Source: https://docs.oracle.com/cd/B28359_01/server.111/b28320/dynviews_3086.htm
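
Several of the components above are simple combinations of other values. As a rough illustration only, the arithmetic described in the list can be sketched in Python. The function names are hypothetical, the multiplication by 100 is an assumption based on the "Percent" unit, and the template's actual Oracle queries are not reproduced here.

```python
def total_io_rate(read_bytes_per_sec, write_bytes_per_sec):
    """Physical I/O total rate = read rate + write rate (bytes/second)."""
    return read_bytes_per_sec + write_bytes_per_sec


def sqlnet_receive_rate(from_client_per_sec, from_dblink_per_sec):
    """SQL*Net receive rate = bytes/sec from clients + bytes/sec from dblinks."""
    return from_client_per_sec + from_dblink_per_sec


def parse_to_execute_ratio(parse_count, execute_count):
    """Parse count divided by execution count, as a percentage.

    Assumption: the result is scaled by 100 to match the 'Percent' unit.
    Returns None when nothing was executed, to avoid dividing by zero.
    """
    if execute_count == 0:
        return None
    return 100.0 * parse_count / execute_count
```

Returning None for an empty interval mirrors the way the Commit latency component reports a null value when there are no commits; that choice is this sketch's, not necessarily the template's.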

 

 

Oracle Automatic Storage Management:
https://thwack.solarwinds.com/docs/DOC-203310

This template contains newly added performance and statistics counters for Oracle ASM.

 

Prerequisites: Oracle client installed on the Orion SAM server. This is available from the SolarWinds customer portal under Additional Downloads.

Credentials: An Oracle username and password with read access to the Oracle tables.

 

MONITORED COMPONENTS

Components without predetermined threshold values provide guidance such as "Use the lowest threshold possible" or "Use the highest threshold possible" to help you find an appropriate threshold for your application. For more information, see http://knowledgebase.solarwinds.com/kb/questions/2415.

 

  • Average Write Throughput

     This component monitor fetches the value for average write throughput for all disks under ASM disk group. The returned value will only show the results since the last polling period.

     Unit: MB/second

     Source: https://docs.oracle.com/cd/B19306_01/server.102/b14237/dynviews_1019.htm#REFRN30170

  • Average Read throughput

     This component monitor fetches the value for average read throughput for all disks under ASM disk group. The returned value will only show the results since the last polling period.

     Unit: MB/second

     Source: https://docs.oracle.com/cd/B19306_01/server.102/b14237/dynviews_1019.htm#REFRN30170

  • Average write latency

     This component monitor fetches the value for average write latency per MB for all disks under ASM disk group, at any time. The returned value will only show the results since the last polling period.

     Unit: Milliseconds

     Source: https://docs.oracle.com/cd/B19306_01/server.102/b14237/dynviews_1019.htm#REFRN30170

  • Average read latency

     This component monitor fetches the value for average read latency per read request for all disks under ASM disk group, at any time. The returned value will only show the results since the last polling period.

     Unit: Milliseconds

     Source: https://docs.oracle.com/cd/B19306_01/server.102/b14237/dynviews_1019.htm#REFRN30170

  • Average I/O read request

     This component monitors average number of I/O read requests for the disk group. The returned value will only show the results since the last polling period.

     Unit: Count

     Source: https://docs.oracle.com/cd/B19306_01/server.102/b14237/dynviews_1019.htm#REFRN30170

  • Average I/O write request

     This component monitors average number of I/O write requests for the disk group. The returned value will only show the results since the last polling period.

     Unit: Count

     Source: https://docs.oracle.com/cd/B19306_01/server.102/b14237/dynviews_1019.htm#REFRN30170
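
Each ASM component reports results "since the last polling period". One common way to get such values is to subtract two snapshots of cumulative counters. The sketch below assumes that approach; the `DiskGroupSample` fields are hypothetical stand-ins rather than the template's real column names, and 1 MB is taken to be 1024 × 1024 bytes.

```python
from dataclasses import dataclass


@dataclass
class DiskGroupSample:
    """Hypothetical snapshot of cumulative counters for one ASM disk group."""
    taken_at: float       # seconds, e.g. from time.time()
    bytes_written: int    # cumulative bytes written
    reads: int            # cumulative read requests
    read_time_ms: float   # cumulative time spent on reads, in milliseconds


def write_throughput_mb_per_sec(prev, curr):
    """Average write throughput between two polls, in MB/second."""
    elapsed = curr.taken_at - prev.taken_at
    if elapsed <= 0:
        return None  # no usable interval between the two samples
    delta_bytes = curr.bytes_written - prev.bytes_written
    return delta_bytes / (1024 * 1024) / elapsed


def read_latency_ms_per_request(prev, curr):
    """Average read latency per read request between two polls, in ms."""
    delta_reads = curr.reads - prev.reads
    if delta_reads <= 0:
        return None  # no reads happened during the interval
    return (curr.read_time_ms - prev.read_time_ms) / delta_reads
```

For example, two samples taken 10 seconds apart with 20 MB written and 100 reads taking 500 ms in total give 2 MB/second and 5 ms per read.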

 

 

Oracle Data Guard:
https://thwack.solarwinds.com/docs/DOC-203308

This template contains performance and statistics counters for Oracle Data Guard.

 

Prerequisites: Oracle client installed on the Orion SAM server. This is available from the SolarWinds customer portal under Additional Downloads.

Credentials: An Oracle username and password with read access to the Oracle tables.

 

MONITORED COMPONENTS

Components without predetermined threshold values provide guidance such as "Use the lowest threshold possible" or "Use the highest threshold possible" to help you find an appropriate threshold for your application. For more information, see http://knowledgebase.solarwinds.com/kb/questions/2415

 

  • LOG APPLY GAP

     This component monitors the number of logs the secondary server has not yet applied.

     The greater the value, the lower the protection. The returned value will only show the results since the last polling period.

     Unit: Count

     Source:

     https://docs.oracle.com/cd/B14117_01/server.101/b10755/dynviews_1126.htm

     https://docs.oracle.com/cd/B14117_01/server.101/b10755/dynviews_1011.htm

     https://docs.oracle.com/cd/B13789_01/server.101/b10755/dynviews_1015.htm

     https://docs.oracle.com/cd/B13789_01/server.101/b10755/dynviews_1054.htm

  • Log Apply LAG

     This component monitors how long it is taking the secondary to apply logs.

     The returned value will only show the results since the last polling period.

     Unit: Seconds

     Source:

     https://docs.oracle.com/cd/B14117_01/server.101/b10755/dynviews_1126.htm

     https://docs.oracle.com/cd/B14117_01/server.101/b10755/dynviews_1011.htm

     https://docs.oracle.com/cd/B13789_01/server.101/b10755/dynviews_1015.htm

     https://docs.oracle.com/cd/B13789_01/server.101/b10755/dynviews_1054.htm

  • Log Destination error

     This component monitors the count of error(s) that have occurred on any of the destinations while applying redo logs.

     Unit: Count

     Source:

     https://docs.oracle.com/cd/B13789_01/server.101/b10755/dynviews_1061.htm

     https://docs.oracle.com/cd/B14117_01/server.101/b10755/dynviews_1011.htm
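
To make the log apply gap concrete, here is a minimal sketch. It assumes the gap is the difference between the highest log sequence number the secondary has received and the highest one it has applied. The function and parameter names are hypothetical, and the template's own query may derive the value differently.

```python
def log_apply_gap(last_received_seq, last_applied_seq):
    """Number of received logs the secondary has not yet applied.

    Assumption: both arguments are log sequence numbers read from the
    secondary. The result is clamped at 0 so it never goes negative.
    """
    return max(0, last_received_seq - last_applied_seq)
```

With this definition, a secondary that has received sequence 1250 but applied only up to 1244 has a gap of 6 logs.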

 

The SolarWinds trademarks, service marks, and logos are the exclusive property of SolarWinds Worldwide, LLC or its affiliates.  All other trademarks are the property of their respective owners.

Content is key as new applications get released to the market, as well as new versions of products that have been out there for some time. Application templates are a critical component of what makes Server & Application Monitor (SAM) great, and we’re constantly taking feedback on how to enhance the content we have and what additional content folks would like to see. This post is the first in a series on net-new and enhanced application monitoring templates for Server & Application Monitor. As always, if you have comments or feedback, or if there are application templates you would like to see that we do not offer today, please let us know.

 

SAP HANA:

SAP HANA is a net-new addition to our library. Unlike many of our other templates, there are some prerequisites to get monitoring to work properly.

 

This template can be found on THWACK® at the following URL, or, if you have SAM, you can look at the application templates page, which connects to SAM.

SAP HANA 2.0.apm-template

From the server that will be polling your HANA instances, you’ll need the 32-bit or 64-bit HANA ODBC drivers. You should be able to download these from the SAP portal. You also need the ODBC credentials to access SAP HANA 2.0 Express Edition. Note that if you install the 64-bit version, you will need to update the template to use the 64-bit job engine vs. default 32-bit.
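
As a rough illustration of how the driver bitness affects the connection, the sketch below builds an ODBC connection string in Python. It assumes the SAP HANA client registers its drivers under the names `HDBODBC` (64-bit) and `HDBODBC32` (32-bit), which is the usual naming on Windows. The host, port, and credentials are placeholders, and the helper function is hypothetical rather than part of the template.

```python
def hana_odbc_connection_string(host, port, user, password, use_64bit=False):
    """Build an ODBC connection string for a SAP HANA instance.

    Assumption: the installed driver is registered as HDBODBC (64-bit)
    or HDBODBC32 (32-bit); check the ODBC Data Source Administrator on
    the polling server for the names actually present.
    """
    driver = "HDBODBC" if use_64bit else "HDBODBC32"
    return (
        f"DRIVER={{{driver}}};"
        f"SERVERNODE={host}:{port};"
        f"UID={user};PWD={password}"
    )


# Placeholder values for illustration only.
conn_str = hana_odbc_connection_string("hana.example.com", 39015, "monitor", "secret")
```

A string like this can be handed to any ODBC-capable client to confirm that the driver and credentials work before assigning the template.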

 

If you have an account for the SAP Support Portal (as a customer or partner; ask your administrator if you are unsure), just enter “SAP HANA client” in the search bar. Choose version 2.0 and select the operating system (such as Windows).

If you don’t have an SAP support account, you can also download the SAP HANA client from the Developer community at https://www.sap.com/developer/trials-downloads.html. This will direct you to the SAP store, which also requires an account, but this one is free.


The metrics we are gathering for HANA include the following. (If you want more details on what the counters mean, how they are calculated, and any reference documentation, please see the links to the templates. In this case, for HANA,
SAP HANA 2.0.apm-template.)

 

  • CPU Utilization %
  • I/O Read Throughput in MB - DATA Volume
  • I/O Read Throughput in MB - LOG Volume
  • I/O Write Throughput in MB - DATA Volume
  • I/O Write Throughput in MB - LOG Volume
  • System Memory Used %
  • Heap Memory Used %
  • Connections
  • Active Statements
  • Active Procedures
  • Table Lock Count
  • Record Lock Count
  • Blocked Transaction Count

 

Here is how this looks in SAM:

 

 

Enhanced Exchange 2016:

Next up is a set of enhancements to an existing template we already offer today, Microsoft Exchange 2016. We just added some new experience monitors as well as some component monitors within the template itself.

https://thwack.solarwinds.com/docs/DOC-203053
https://thwack.solarwinds.com/docs/DOC-203054
https://thwack.solarwinds.com/docs/DOC-203055
https://thwack.solarwinds.com/docs/DOC-203056

 

There are now four templates available for Exchange 2016.

  • Active Sync Connectivity
  • Edge Transport Role Counters & Services
  • Mailbox Role Counters & Services
  • OWA Form Login (PowerShell)

 

 

Prerequisites:

  1. WMI access to the Exchange server.
  2. Credentials: Windows Administrator on the target server.
  3. To run template “Exchange Active Sync Connectivity Template”:
      1. The Exchange 2016 Management Tools should also be installed on the machine. Once installed, load the snap-in in PowerShell with this command:
        Add-PSSnapin Microsoft.Exchange.Management.PowerShell.SnapIn;
      2. Double-click the Exchange Server installer. It will ask for the folder in which to save the extracted files. Once extraction is complete, go to the Scripts folder and run the script “new-testcasconnectivityuser.ps1”. This script creates the test user, which is needed to fetch the output of the “Test-ActiveSyncConnectivity” command used in the script.
      3. “Test-ActiveSyncConnectivity” needs a Client Access Server (CAS). You can find the server name by running the PowerShell command “Get-ExchangeServer” and noting the “Name” value.
      4. Verify that http://<Hostname>/powershell or https://<Hostname>/powershell is working.
  4. To run template “Exchange 2016 OWA Form Login (PowerShell)”:
      1. Resolve the IP address of the node this script will run against, and add an entry for that IP to the etc/hosts file.
      2. Verify that http://<Hostname>/owa or https://<Hostname>/owa is working.
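
The URL checks in the prerequisites can be scripted. The following is a minimal sketch using only the Python standard library; the helper names are hypothetical. It treats any HTTP response, including an authentication challenge such as 401, as proof that the endpoint is reachable, and it reports a failed TLS certificate check as unreachable.

```python
import urllib.error
import urllib.request


def candidate_urls(hostname, path):
    """Return the http and https URLs to try for a host and path."""
    clean_path = path.lstrip("/")
    return [f"{scheme}://{hostname}/{clean_path}" for scheme in ("http", "https")]


def is_reachable(url, timeout=5.0):
    """Return True if the server answers the request at all."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded with an error status (e.g. 401 or 403),
        # which still shows that the endpoint exists.
        return True
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, or TLS problem.
        return False


# Example (placeholder host name):
# any(is_reachable(u) for u in candidate_urls("exch01.example.com", "owa"))
```

Running the same check with the path "powershell" covers the Active Sync template's prerequisite.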

 

SQL 2016 on Windows:

You can read more about and download these two templates here.

https://thwack.solarwinds.com/docs/DOC-203050
https://thwack.solarwinds.com/docs/DOC-203051

 

There are now two templates available for SQL Server 2016 on Windows.

  • Analysis Services
  • Reporting Services

 

For SQL 2016 Analysis Services, we are collecting the following metrics/info.

  • Service: SQL Server Analysis Services
  • Cache: Direct hits/sec
  • Cache: Lookups/sec
  • Cache: Direct hit ratio
  • Cache: Current entries
  • Cache: Current KB
  • Cache: Inserts/sec
  • Cache: Evictions/sec
  • Cache: Misses/sec
  • Cache: KB added/sec
  • Cache: Total direct hits
  • Cache: Total evictions
  • Cache: Total filtered iterator cache hits
  • Cache: Total filtered iterator cache misses
  • Cache: Total inserts
  • Cache: Total lookups
  • Cache: Total misses
  • Connection: Current connections
  • Connection: Current user sessions
  • Connection: Requests/sec
  • Connection: Failures/sec
  • Connection: Successes/sec
  • Connection: Total failures
  • Connection: Total requests
  • Connection: Total successes
  • Data Mining Prediction: Queries/sec
  • Data Mining Prediction: Predictions/sec
  • Locks: Current latch waits
  • Locks: Current lock waits
  • Locks: Current locks
  • Locks: Lock waits/sec
  • Locks: Total deadlocks detected
  • Locks: Latch waits/sec
  • Locks: Lock denials/sec
  • Locks: Lock grants/sec
  • Locks: Lock requests/sec
  • Locks: Unlock requests/sec
  • MDX: Total NON EMPTY unoptimized
  • MDX: Total recomputes
  • MDX: Total Sonar subcubes
  • Memory: Cleaner Memory shrinkable KB
  • Memory: Cleaner Memory nonshrinkable KB
  • Memory: Cleaner Memory KB
  • Memory: Cleaner Balance/sec
  • Memory: Filestore KB
  • Memory: Filestore Writes/sec
  • Memory: Filestore IO Errors/sec
  • Memory: Quota Blocked
  • Memory: Filestore Reads/sec
  • Proactive Caching: Notifications/sec
  • Proactive Caching: Processing Cancellations/sec
  • Proc Aggregations: Temp file bytes written/sec
  • Processing: Rows read/sec
  • Processing: Rows written/sec
  • Processing: Total rows read
  • Processing: Rows converted/sec
  • Processing: Total rows converted
  • Processing: Total rows written
  • Storage Engine Query: Queries from cache direct/sec
  • Storage Engine Query: Queries from cache filtered/sec
  • Storage Engine Query: Queries from file/sec
  • Storage Engine Query: Avg time/query
  • Storage Engine Query: Measure group queries/sec
  • Storage Engine Query: Dimension queries/sec
  • Threads: Processing pool idle I/O job threads
  • Threads: Processing pool busy I/O job threads
  • Threads: Processing pool job queue length
  • Threads: Processing pool job rate

 

Here is how that will look in SAM:

 

For Reporting Services, we are collecting the following metrics/info:

  • MSRS Windows Service: Active Sessions
  • MSRS Windows Service: Cache Flushes/Sec
  • MSRS Windows Service: Cache Hits/Sec
  • MSRS Windows Service: Cache Hits/Sec (Semantic Models)
  • MSRS Windows Service: Cache Misses/Sec
  • MSRS Windows Service: Cache Misses/Sec (Semantic Models)
  • MSRS Windows Service: Delivers/Sec
  • MSRS Windows Service: Events/Sec
  • MSRS Windows Service: Memory Cache Hits/Sec
  • MSRS Windows Service: Memory Cache Miss/Sec
  • MSRS Windows Service: Reports Executed/Sec
  • MSRS Windows Service: Requests/Sec
  • MSRS Windows Service: Snapshot Updates/Sec
  • MSRS Windows Service: Total Processing Failures
  • MSRS Windows Service: Total Rejected Threads
  • MSRS Windows Service: Report Requests
  • MSRS Windows Service: First Session Requests/Sec
  • MSRS Windows Service: Next Session Requests/Sec
  • MSRS Windows Service: Total App Domain Recycles
  • MSRS Windows Service: Total Cache Flushes
  • MSRS Windows Service: Total Cache Hits
  • MSRS Windows Service: Total Cache Hits (Semantic Models)
  • MSRS Windows Service: Total Cache Misses
  • MSRS Windows Service: Total Cache Misses (Semantic Models)
  • MSRS Windows Service: Total Deliveries
  • MSRS Windows Service: Total Events
  • MSRS Windows Service: Total Memory Cache Hits
  • MSRS Windows Service: Total Memory Cache Misses
  • MSRS Windows Service: Total Reports Executed
  • MSRS Windows Service: Total Requests
  • MSRS Windows Service: Total Snapshot Updates
  • Report Server: Active Connections
  • Report Server: Bytes Received/sec
  • Report Server: Bytes Sent/sec
  • Report Server: Errors/sec
  • Report Server: Logon Attempts/sec
  • Report Server: Logon Successes/sec
  • Report Server: Memory Pressure State
  • Report Server: Memory Shrink Amount
  • Report Server: Memory Shrink Notifications/sec
  • Report Server: Requests Executing
  • Report Server: Requests/sec
  • Report Server: Tasks Queued
  • Service: SQL Server Reporting Services
  • Report Server TCP Port
  • Report Server: Bytes Received Total
  • Report Server: Bytes Sent Total
  • Report Server: Errors Total
  • Report Server: Logon Attempts Total
  • Report Server: Logon Successes Total
  • Report Server: Requests Disconnected
  • Report Server: Requests Not Authorized
  • Report Server: Requests Rejected
  • Report Server: Requests Total

 

SQL 2017 on Windows:

You can read more about and download the template here.
https://thwack.solarwinds.com/docs/DOC-203052

 

This template uses Windows performance counters to assess the status and performance of Microsoft SQL Server 2017 Analysis Services.

 

Prerequisites:

 

Below are the metrics and counters we will gather:

  • Service: SQL Server Analysis Services
  • Cache: Direct hits/sec
  • Cache: Lookups/sec
  • Cache: Direct hit ratio
  • Cache: Current entries
  • Cache: Current KB
  • Cache: Inserts/sec
  • Cache: Evictions/sec
  • Cache: Misses/sec
  • Cache: KB added/sec
  • Cache: Total direct hits
  • Cache: Total evictions
  • Cache: Total filtered iterator cache hits
  • Cache: Total filtered iterator cache misses
  • Cache: Total inserts
  • Cache: Total lookups
  • Cache: Total misses
  • Connection: Current connections
  • Connection: Current user sessions
  • Connection: Requests/sec
  • Connection: Failures/sec
  • Connection: Successes/sec
  • Connection: Total failures
  • Connection: Total requests
  • Connection: Total successes
  • Data Mining Prediction: Queries/sec
  • Data Mining Prediction: Predictions/sec
  • Locks: Current latch waits
  • Locks: Current lock waits
  • Locks: Current locks
  • Locks: Lock waits/sec
  • Locks: Total deadlocks detected
  • Locks: Latch waits/sec
  • Locks: Lock denials/sec
  • Locks: Lock grants/sec
  • Locks: Lock requests/sec
  • Locks: Unlock requests/sec
  • MDX: Total NON EMPTY unoptimized
  • MDX: Total recomputes
  • MDX: Total Sonar subcubes
  • Memory: Cleaner Memory shrinkable KB
  • Memory: Cleaner Memory nonshrinkable KB
  • Memory: Cleaner Memory KB
  • Memory: Cleaner Balance/sec
  • Memory: Filestore KB
  • Memory: Filestore Writes/sec
  • Memory: Filestore IO Errors/sec
  • Memory: Quota Blocked
  • Memory: Filestore Reads/sec
  • Proactive Caching: Notifications/sec
  • Proactive Caching: Processing Cancellations/sec
  • Proc Aggregations: Temp file bytes written/sec
  • Proc Aggregations: Current partitions
  • Proc Aggregations: Total partitions
  • Proc Aggregations: Memory size rows
  • Proc Aggregations: Memory size bytes
  • Proc Aggregations: Rows merged/sec
  • Proc Aggregations: Rows created/sec
  • Proc Aggregations: Temp file rows written/sec
  • Processing: Rows read/sec
  • Processing: Rows written/sec
  • Processing: Total rows read
  • Processing: Rows converted/sec
  • Processing: Total rows converted
  • Processing: Total rows written
  • Storage Engine Query: Queries from cache direct/sec
  • Storage Engine Query: Queries from cache filtered/sec
  • Storage Engine Query: Queries from file/sec
  • Storage Engine Query: Avg time/query
  • Storage Engine Query: Measure group queries/sec
  • Storage Engine Query: Dimension queries/sec
  • Threads: Processing pool idle I/O job threads
  • Threads: Processing pool busy I/O job threads
  • Threads: Processing pool job queue length
  • Threads: Processing pool job rate

 


I am happy to announce the General Availability of Database Performance Analyzer (DPA) 12.0. This release focuses on analysis with two major features: Query Performance Analyzer (QPA) and Table Tuning Advisor. We have also improved our integration with the Orion® Platform by adding blocking, deadlocks, and wait time status to the PerfStack feature. In this post, I will cover Table Tuning Advisor, while QPA will be covered in another post.

 

Table Tuning Advisor

Every database has inefficient queries—ones that perform many logical reads but retrieve a relatively small number of rows. In other words, they do a lot of work for a small return. This type of inefficiency can result in higher I/O, longer wait times, greater amounts of blocking, and increased resource contention.
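
To make "a lot of work for a small return" concrete, here is a small Python sketch that scores queries by logical reads per row returned and ranks the worst offenders. This is only an illustration of the idea: DPA's own analysis is proprietary, and the threshold, field names, and sample numbers below are invented for the example.

```python
def reads_per_row(logical_reads, rows_returned):
    """Logical reads spent per row returned.

    A query returning 0 rows is treated as returning 1 row to avoid
    dividing by zero.
    """
    return logical_reads / max(rows_returned, 1)


def rank_inefficient(queries, threshold=1000.0):
    """Return names of queries at or above the threshold, worst first."""
    scored = [
        (reads_per_row(q["logical_reads"], q["rows_returned"]), q["name"])
        for q in queries
    ]
    flagged = [pair for pair in scored if pair[0] >= threshold]
    flagged.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in flagged]


# Invented sample data for illustration only.
sample = [
    {"name": "lookup_by_key", "logical_reads": 40, "rows_returned": 10},
    {"name": "report_scan", "logical_reads": 2_000_000, "rows_returned": 50},
    {"name": "price_filter", "logical_reads": 900_000, "rows_returned": 3},
]
worst_first = rank_inefficient(sample)
```

Here `price_filter` ranks first because it spends about 300,000 reads per row, even though `report_scan` performs more reads overall.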

 

Tuning inefficient queries can be difficult and many questions tend to surface as part of the process. DPA 12.0 with Table Tuning Advisor can help lead you to answers to some of these common questions.

  • Should you tune the query? Add a new index? Or maybe add columns to an existing index?
  • Plans are complex and hard to analyze; which steps are the ones I should pay attention to?
  • Which predicates in the plans are causing inefficient data access and high amount of reads?
  • Are there recommendations I can use as a starting point?
  • Are there other inefficient queries that access the same table and could be affected by indexing decisions?
  • How many indexes currently exist on the table and how are they designed?
  • How much data churn (inserts, deletes, and sometimes updates) does the table undergo?

 

DPA’s Table Tuning Advisor is designed to analyze expensive queries and plans to help identify tables that have inefficient workloads running against them. For each table, the advisor page displays aggregated information about the table and the inefficient queries. You can use this information to make informed decisions about database performance optimization opportunities, and to weigh the potential costs and benefits of adding an index.

 

Navigation

There are two ways to get to the advisor page:

  • A new Tuning super-tab near the top of the page appears after clicking into an instance. This will take you to a page that combines the Query and Table Tuning Advisors.

  • The new Query Performance Analyzer (QPA) page with the Table Tuning Advisors section provides a summary of the advice aggregated to the table level and includes links to the advisor detail page.

Advisor Page Layout

The Table Tuning Advisor page has three main areas:

  • Inefficient SQL – a list of queries accessing the table ranked by relative workload.
  • SQL and Plan Details – SQL and Plan details for the selected query.
  • Table and Index Information – current table information, existing indexes on the table and the table’s columns.

 

 

Table Tuning Advisor Example

Let’s assume we are being proactive and want to tune something that will have a big impact. At a summary level, the tuning tab shows the tables with inefficient queries and ranks them based on workload. The list includes an aggregated view of wait time for each table, the number of queries that have inefficient plan steps on the table, and the number of index recommendations. This list quickly gives insight into the tables that have the highest inefficient workloads executing against them. These are prime opportunities for tuning.

 

Are there any recommendations to use as a starting point? Clicking on the “orders” table takes us to the Table Tuning Advisor page that provides details about inefficient queries accessing the table. This page pulls together what you need to know about the table regarding inefficient usage patterns, statistical information, design of current indexes, and much more. Index recommendations appear near the top of the page and may provide a good starting point for a solution.

 

Which steps in the plans are inefficient and do they align with the recommendation? DPA uses a proprietary algorithm to find plan steps that are inefficient and causing issues. Inefficient “producer” steps (for example, full table/index scans) read data to be processed later by subsequent "consumer" plan steps. While consumer steps (for example, sorts) can have a high plan cost, they are usually affected by a preceding producer step that read too much data. DPA can point out the inefficient producer plan steps that should be the focus of tuning efforts.

 

In this example, DPA identified two steps that are inefficient:

  1. INDEX SCAN – Step 64 – A full scan of the o_totalprice_index index. Notice the predicate value that shows a function named CONVERT. The query is using a CONVERT function against the o_totalprice column which will often negate effective use of an index. An INDEX SCAN reads the entire index, which is why the step shows 15 million rows associated with it.
  2. CLUSTERED INDEX SCAN – Step 69 – A full scan of the orders table. Notice the CONVERT_IMPLICIT function within the predicate value. This indicates an implicit conversion, i.e., data type mismatch, and DPA displays a predicate warning as a result. Click on the warning to get additional information. Other potential warnings include:
    1. Lookup Warning – The plan uses an index but is required to go back to the table to “look up” other needed information. Adding a “covering” index can potentially help tune this issue.
    2. Spool Warning – The plan step is storing data for later use, but large amounts of spooling can cause disk overhead.
    3. Parallel Warning – DPA has detected a parallelism step later in this query's execution, implying that this step's intermediate result set is likely large enough to exceed parallel processing cost thresholds. Look for ways to rewrite the query to reduce the size of intermediate result sets earlier in the query. For example, look for a sub-select that could produce fewer rows.

 

Based on the data shown by DPA in this example, the index recommendation may help tune the clustered index scan in step 69. However, tuning step 64 will likely require a modification to the query to remove the CONVERT function on the o_totalprice column. Gleaning this information via manual plan analysis would probably take hours. Plan analysis is difficult, so let Table Tuning Advisor help get you to a good starting place.
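To see why a function wrapped around an indexed column can defeat the index, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in for SQL Server here, CAST stands in for CONVERT, and the table layout and data are made up for illustration. The exact plan wording varies by SQLite version, so the sketch only looks for the SCAN and SEARCH keywords.

```python
import sqlite3

# Hypothetical miniature of the "orders" table; the column names echo the
# example above, but the schema and data are invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "o_orderkey INTEGER PRIMARY KEY, o_totalprice REAL, o_comment TEXT)"
)
conn.execute("CREATE INDEX o_totalprice_index ON orders (o_totalprice)")
conn.executemany(
    "INSERT INTO orders (o_totalprice, o_comment) VALUES (?, ?)",
    [(float(i), "x") for i in range(1000)],
)


def plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


# Function applied to the indexed column: the index cannot be used for a seek.
wrapped = plan(
    "SELECT * FROM orders WHERE CAST(o_totalprice AS TEXT) = '500.0'"
)

# Bare column compared to a value of the matching type: the index is usable.
bare = plan("SELECT * FROM orders WHERE o_totalprice = 500.0")

print("wrapped:", wrapped)
print("bare:   ", bare)
```

The wrapped predicate falls back to scanning the table, while the bare comparison can seek on the index. SQL Server's optimizer differs in the details, but the rewrite idea is the same: move the conversion to the other side of the comparison so the column is left untouched.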

 

Are there other inefficient queries that access this table? The left pane of the Table Tuning Advisor page shows other inefficient queries, ranked by relative workload, accessing the “orders” table. Pay attention to the queries near the top of this list because they cause more workload against the table. Conversely, you should not spend as much time on queries near the bottom with small relative workloads. These queries could be affected by a new or modified index on the table.

 

How many indexes currently exist on the table and how are they designed? Toward the bottom of the Table Tuning Advisor page, the current indexes and their columns are shown along with information about statistics and usage. Also shown are fragmentation percentages, sizes of the table and indexes, the table’s columns, and more. This is important for several reasons:

  • Is the data churn for the table high? If so, this means insert/delete activity is high and a new index could cause more harm than good.
  • Is there an existing index that already contains the o_shippriority column? If so, can the index be modified to benefit this query versus creating a new index?
  • Were optimizer statistics generated recently? If not, and churn is high, updating the statistics for the table may be a good first step.
  • Are indexes fragmented? If they are and scans are performed against them, defragmenting them may help performance.

 

What Did You Find?

Our development team uses DPA to help make sure our code performs well. When they used the Table Tuning Advisor, it pointed them to a problematic set of tables. Within a couple of hours, they tuned the queries with a simple rewrite and saved hours of database time every night during the cleaning process. If you find interesting stories in your environment, let us know by leaving comments on this blog post.

 

We would love to hear feedback about the following:

  • Does this improve your workflow when tuning a query? How much time does it save you?
  • Are there tuning questions that are not answered by the page?
  • Is all of the assembled data important to you when tuning?

 

What’s Next?

Don’t forget to read Brian’s blog about Query Performance Analyzer (QPA). To learn more about other DPA 12.0 new features, see the DPA Documentation library and visit your SolarWinds Customer Portal to get the new software.

 

If you don't see the features you've been wanting in this release, check out the What We Are Working On for DPA post for what our dedicated team of database nerds are already looking at. If you don't see everything you've been wishing for there, add it to the Database Performance Analyzer Feature Requests.

To kick off the Q3 systems releases, I am happy to announce General Availability of Database Performance Analyzer version 12.0. This release focuses on analysis with two major features: Query Performance Analyzer (QPA) and the Table Tuning Advisor. We've also improved our integration with Orion® Platform by adding blocking, deadlocks, and wait time status to PerfStack™. In this post, I'll cover QPA and the Orion integration. Table Tuning Advisor will be covered in another post.

 

Query Performance Analyzer

QPA is designed to intelligently assemble current and historical data for a query, combining all the information about a query into one place, including the query analysis (summarized per day) and the historical charts (30 days of data, down to 10-minute intervals). QPA analyzes the data about the query and automatically expands sections and selects metrics to show you the most relevant data. It also allows you to change time ranges on the query, and still has the great drill-down capability you are used to. You can use QPA for queries in any database supported by DPA.

 

Since QPA has all the data you previously saw on multiple screens, all links on query hashes and names now go to QPA, keeping your current timeframe. So now, when you go to look at a query in the product, you get QPA!

 

New Charting Capabilities

QPA uses SolarWinds' new Nova GUI components, allowing us to assemble and present data in new ways. We are very excited to have adopted this technology. There are a few nifty features that you'll see in the screenshots below.

  • Charts all have the same x-axis, even if their data is at different frequencies or ranges
  • As you roll over the chart, the values and time are shown both for the chart you are on and all other charts displayed
  • In all charts, you can uncheck one of the items on the legend to remove it
  • When you roll over an item in the chart legend, it is highlighted while other items are grayed out

All of these combine to make it very easy to inspect and correlate data across multiple charts.

 

QPA Layout

QPA has two main areas:

  • The Wait Type Chart and Time Navigation
  • Three tabs showing different data and analysis

 

Top Chart - Wait Types and Navigation (yes it's sticky!)

DPA is all about waits, so the top chart shows the total wait time by wait type, and it is sticky so it stays at the top of the page, making it easy to correlate the waits with the data in the charts below it. The new time navigation at the top of the chart allows you to choose a pre-defined time range or build your own. And now, you can display data further back than 30 days if you need to.

 

Tabs - Intelligent Analysis, SQL Text and Supporting Data

QPA has three tabs which we cover in detail below.

  • Intelligent Analysis: Intelligently assemble and display the most relevant data about this query
  • SQL Text: A nicely formatted version of the SQL text
  • Supporting Data: Additional performance data about this query available in under 24 hours

 

Intelligent Analysis

QPA can intelligently assemble the most important information about a query and allow you to customize your view to meet your needs. Intelligence includes expanding sections to show you relevant data and picking metrics based on the predominant wait type.

 

Sections include:

  • Query Advisor: Latest advice for the query in the current time period
  • Table Tuning Advisor: Latest Table Tuning Advisors for the query in the current time period
  • Statistics: Query statistics, both the actual value and per execution
  • Blocking: Shows blocking info (blockee and blocker) if it sees significant blocking
  • Plans: Shows plan information if more than one plan is used for the current time period
  • Resource Metrics for the Instance: Displays instance resources based on the predominant wait type

Here is a query with both Query Advisors and Table Tuning Advisors.

Keep scrolling to see multiple plans and PLE (and more CPU/memory resources). Note that:

  • The wait type chart shrinks and stays at the top of the page
  • Rolling over a chart shows detailed data on each chart
  • QPA selected which sections to expand and which metrics to show

 

SQL Text

Formatted SQL text that is easy to read, as well as easy to copy.

 

Supporting Data

Supporting data is additional per-query data we collect and is only available at timeframes of 24 hours or less. Sections auto-expand if DPA detects interesting data.

 

 

Analyzing a Query with QPA (Example)

If we look at the following query for 30 days in QPA, we can see that wait time started increasing around April 23. The query advisors show advice for the latest day (just like on the trends page), but instead of drilling, let’s scroll down the page a bit.

I see the number of executions is unchanged, but wait time per execution increased with wait time... so it looks like something changed.

If I keep scrolling, I'll see that the blocking section is closed (so no blocking), but the plans section is open and showing multiple plans. DPA noticed a plan change and displayed this chart automatically. If there is only one plan, DPA closes the chart and just gives you a link to the plan.

Note that the increase in wait time and wait time per execution lines up with the plan change on April 23. BINGO!
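The reasoning in this walkthrough (executions flat, wait time per execution up, plan hash changed on the same day) can be sketched in a few lines of Python. This is only an illustration of the correlation, not how DPA does it internally. The sample values, the field names, and the 1.5x threshold are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class DailySample:
    # Hypothetical per-day summary for one query; not a DPA data structure.
    day: str
    executions: int
    wait_seconds: float
    plan_hash: str


def wait_per_execution(sample):
    """Total wait divided by executions, guarding against a zero count."""
    return sample.wait_seconds / sample.executions if sample.executions else 0.0


def plan_change_regressions(samples, threshold=1.5):
    """Return days where the plan changed from the previous day AND
    wait per execution grew by at least `threshold` times."""
    flagged = []
    for prev, cur in zip(samples, samples[1:]):
        before = wait_per_execution(prev)
        after = wait_per_execution(cur)
        plan_changed = cur.plan_hash != prev.plan_hash
        if plan_changed and before > 0 and after >= threshold * before:
            flagged.append(cur.day)
    return flagged


# Invented data shaped like the example: same executions, new plan on Apr 23.
samples = [
    DailySample("Apr 21", 1000, 50.0, "plan_a"),
    DailySample("Apr 22", 1000, 52.0, "plan_a"),
    DailySample("Apr 23", 1000, 400.0, "plan_b"),
    DailySample("Apr 24", 1000, 410.0, "plan_b"),
]

flagged_days = plan_change_regressions(samples)
print(flagged_days)
```

Dividing by executions matters here: if wait time had risen only because the query ran more often, wait per execution would stay flat and the plan change would be a less likely culprit.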

If I want to see more detail on April 23, I can drill by clicking on the bar chart (just like on the trends page).  I can click it on the top chart, or any other bar chart (like the plans chart).

When I drill into Apr 23, I can see that the change correlates to the plan change. Note that I can also see the instance statistics, and they don't indicate any kind of resource pressure.

 

From here, I can drill down to an hour if I want, or I can click the plan hashes and take a look at the differences between them.

 

Blocking, Deadlocks and Wait Time Status in PerfStack

We don't have a new DPA Integration Module (DPAIM) for the 12.0 release, but PerfStack is so versatile, we can share new data with it and have it available automatically. Now blocking (root blocking and blockee), deadlocks, and wait time status are available in PerfStack.

When you highlight the blocking info, you can see the queries in the data explorer.

 

What did you find in your environment?

We'd love to hear your story about queries and indexes you've improved in your environment. Feel free to post your stories here and commiserate with your fellow admins. For example, during an RC-assisted upgrade, we helped a customer upgrade and walked through the new features, and in just a few minutes, we found a query with over six hours of wait time in QPA. By drilling into the new Table Tuning Advisor, we were able to discover the table was missing an index.

 

What's Next?

Don't forget to read Dean's blog on the Table Tuning Advisor and the DPA 12.0 Release Notes.

 

If you don't see the features you've been wanting in this release, check out the What We Are Working On for DPA (Updated August 29, 2018) post for what our dedicated team of database nerds and code jockeys are already looking at.  If you don't see everything you've been wishing for there, add it to the Database Performance Analyzer Feature Requests.

NetFlow Traffic Analyzer

Faster. Leaner. More Secure.

 

The new NetFlow Traffic Analyzer leverages the power of columnstore technology in MS SQL Server to deliver answers to your flow analysis questions faster than ever before. MS SQL 2016 and later runs in a more efficient footprint than previous flow storage technologies, making better use of your infrastructure. Support for TLS 1.2 communication channels and monitoring of TCP and UDP Port 0 traffic helps to secure your environment.

 

Version 4.4 also introduces a new installation process to confirm that you have the necessary prerequisites, and to guide you through the installation and configuration process.

 

NTA 4.4 is now available in the Customer Portal. Check out the Release Notes for an overview of the features.

 

Faster

The latest release of NTA makes use of Microsoft’s latest version of their SQL columnstore based flow storage database. Columnstore databases organize and query data by column, rather than by row index. They are the optimal technology for large-scale data warehouse repositories, like massive volumes of individual flow records. Our testing and our beta customer experiences indicate that columnstore indexes support substantial performance improvements in both querying data and in data compression efficiency.
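For readers who haven't worked with column-oriented storage, here is a rough Python sketch of the idea. It is a toy model, not how SQL Server implements columnstore indexes, and the flow records and field names are invented. It shows two things: an aggregate only has to touch the one column it needs, and a column of repeated values compresses well (run-length encoding is used here as a simple stand-in for real columnstore compression).

```python
# Row-oriented view: each flow record is stored together (invented data).
rows = [
    {"src": "10.0.0.1", "dst": "10.0.0.9", "port": 443, "bytes": 1200},
    {"src": "10.0.0.2", "dst": "10.0.0.9", "port": 443, "bytes": 800},
    {"src": "10.0.0.3", "dst": "10.0.0.9", "port": 443, "bytes": 500},
    {"src": "10.0.0.1", "dst": "10.0.0.7", "port": 53, "bytes": 90},
    {"src": "10.0.0.2", "dst": "10.0.0.7", "port": 53, "bytes": 110},
]


def to_columns(records):
    """Pivot a list of row dicts into one list per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    encoded = []
    for value in values:
        if encoded and encoded[-1][0] == value:
            encoded[-1][1] += 1
        else:
            encoded.append([value, 1])
    return [tuple(pair) for pair in encoded]


columns = to_columns(rows)

# An aggregate reads a single column instead of every field of every row.
total_bytes = sum(columns["bytes"])

# Repeated values within a column compress into a handful of runs.
port_runs = run_length_encode(columns["port"])

print(total_bytes, port_runs)
```

Flow data tends to have many repeated values per column (the same ports, the same busy hosts), which is part of why this layout suits it.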

 

NTA was an early adopter of columnstore technology to enhance the performance of our flow storage database. As Microsoft’s columnstore solutions have matured, we’ve chosen to adopt the MS SQL 2016 and later versions as the supported flow storage technology. That offers our customers the ability to standardize on MS SQL across the Orion platform, and to manage their monitoring data using a common set of tools with common expertise. We’ve made deployment and support simpler, more robust, and more performant.

 

Leaner

This same columnstore technology also runs more efficiently with the existing resource footprint. This solution builds and maintains columnstore indexes in memory, and then manages bulk record insertions with much less intensive I/O to the disk storage. CPU required to build indexes is also substantially less intensive than our previous versions. As a result, this version will make better use of the same resources to run more efficiently.

 

More Secure

This version of NTA supports TLS 1.2 communication channels, required in many environments to secure communications with client users.

 

Beginning in this version, NTA will explicitly monitor network flows that are destined to TCP or UDP service port 0. Traffic that’s addressed to TCP or UDP port 0 is either malformed or malicious. This port is reserved for internal use, and network traffic on the wire should never appear addressed to this port. By highlighting and tracking flows addressed to port 0, NTA helps network administrators identify sources of malicious traffic that may be attacking hosts in their network, and provides the information they need to shut that traffic down.
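As a rough illustration of what tracking port 0 flows means, the Python sketch below filters a list of flow records for TCP or UDP traffic addressed to port 0 and totals the bytes per source. The record layout, addresses, and byte counts are assumptions invented for this example; this is not NTA's data model or API.

```python
from collections import Counter
from typing import NamedTuple


class Flow(NamedTuple):
    # Hypothetical flow record; field names are illustrative only.
    src_ip: str
    dst_ip: str
    protocol: str
    dst_port: int
    byte_count: int


def port_zero_sources(flows):
    """Total bytes per source for TCP/UDP flows addressed to port 0,
    largest talker first."""
    totals = Counter()
    for flow in flows:
        if flow.protocol in ("TCP", "UDP") and flow.dst_port == 0:
            totals[flow.src_ip] += flow.byte_count
    return totals.most_common()


# Invented sample using documentation address ranges.
flows = [
    Flow("203.0.113.5", "10.0.0.9", "TCP", 0, 1000),
    Flow("192.0.2.10", "10.0.0.9", "TCP", 443, 5000),
    Flow("198.51.100.7", "10.0.0.4", "UDP", 0, 60),
    Flow("203.0.113.5", "10.0.0.4", "UDP", 0, 500),
]

suspects = port_zero_sources(flows)
print(suspects)
```

Sorting by volume puts the noisiest sources first, which is usually where you would start when deciding what to block.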

 

NTA will surface port 0 traffic as a distinct application, so the information is available in all application resources.

NTA Port 0 Traffic

Supported Database Configurations

This version of NTA maintains a separate database for Flow Storage. NPM also maintains the Orion database for device and interface data. Both of these databases are built in MS SQL instances.

 

New installations of NTA and upgrades to version 4.4 and later will require an instance of MS SQL 2016 Service Pack 1 or later version for flow storage. For evaluation, the express edition is supported. For production deployments, we support the Standard and Enterprise editions.

 

When upgrading to this version from an older version that uses the FastBit database, data migration is not supported. This upgrade will build out a new, empty database in the new MS SQL instance. The existing flow data in the FastBit database will not be deleted or modified in any way. That data can be archived for regulatory requirements, and customers can run older product versions in evaluation mode to temporarily access the data.

 

In the current NTA product, we require a separate dedicated server for Flow Storage. The simplest upgrade would use that dedicated server with the new release to install an instance of MS SQL 2016 SP1 or later for flow storage. Many of our customers will be interested in running both the Orion database and the NTA Flow Storage database in the same MS SQL instance. We support that, but for most customers that will take some planning to consolidate and to appropriately size that instance to support both databases.

 

Here's a more detailed discussion of NTA's New MS SQL Based Flow Storage Database. Also, a knowledge base article on NTA 4.4 Adoption is available, with frequently asked questions.

 

We’re doing some testing now to provide performance guidance on the key performance indicators to monitor. One of the benefits of using MS SQL technology for both of these databases is that there are many common tools and techniques available to monitor and tune MS SQL databases. We plan to provide guidance for both monitoring and deployment planning.

 

Conclusion

Please visit the NetFlow Traffic Analyzer Forum on THWACK to discuss your experiences and new feature requests for NTA.

I am very excited to announce that SolarWinds NCM 7.8 is available for download in the Customer Portal! This release brings many valuable features and the release notes are a great resource for these.

 

Network Insight for Cisco Nexus
This is the third iteration in our Network Insight series, and in this release we have extended those insights to Cisco Nexus. We understand that your Cisco Nexus devices are a sizable investment and come with a host of valuable features, and that you expect deeper insight from your SolarWinds monitoring and management tools as a result. This meant that we had to go back and develop some new features and expand on existing ones to ensure that the relevant information you need is presented properly and that your workflows are logical and more time efficient.

 

 

Virtual Port Channels

One of the really awesome features of a Cisco Nexus, that comes with a good deal of complexity, is the ability to create and deploy vPCs. vPCs operate as a single logical interface, but are actually just a group of interfaces working together. What this means is that managing vPCs can become a time drain, as the number of vPCs increases and as the number of interfaces on each vPC pair increases. Network Insight provides a view to show each vPC and the member interfaces in each of those vPCs. This is covered in the NPM v12.3 release blog.

 

In addition to this view, there is another layer of detail that shows the configuration of each vPC and its member interfaces. To see this detail, click "View Configs" on the vPC page. This page displays the configuration details for each side of the vPC and the configurations of each member interface. This allows you to save time by more efficiently identifying configuration errors within the vPC and the member interfaces. I think we can all agree that not having to hop across multiple windows and execute manual searches or commands to find issues is a major workflow improvement!

 

The example below is a vPC with multiple member interfaces:

 

Virtual Device Contexts

As it is covered here, each VDC is essentially a VM on a Cisco Nexus (also Cisco ASAs!) and each context is configured separately and provides its own set of services. These configurations are downloaded and backed up by NCM. They are also referenced for all the features in this release.

 

To manage a context in NCM, just click "Monitor Node" and it will walk you through the node addition process. After that has concluded, each configuration is downloaded and stored separately.

 

Access Control Lists

ACLs define what to do with the network traffic. ACLs are very complicated to manage because within each ACL are rules (Access Control Elements) and within these are object groups. The object groups are containers that house specific information for the given rule, like the interfaces that you might block a particular MAC address from traversing. The layering creates some problems. Manually, you need to verify that the rules are handling traffic by examining the hit counts, and that none of the rules are shadowed or redundant. Lastly, to ensure we met all of your needs for ACLs, we extended the existing functionality of Access Control Lists (ACLs) beyond Port Access Control Lists (PACLs) and VLAN Access Control Lists (VACLs) to include MAC ACLs and non-contiguous subnet masks.

 

ACLs are super easy to add and once the Nexus nodes are added to NCM, it will automatically discover ACLs and grant you access to all the information available inside those ACLs. You won't need to spend copious amounts of time digging into each ACL, determining if changes occurred, and what changes occurred.

 

To see the list of ACLs for a particular Nexus, mouse over the entities on the side panel and select “Access Lists.”

Access Control List Entity View

 

With this view you are able to see the historical record of ACLs, including the date and time of each revision, and whether there are any overlapping rules inside each version of the ACL. To expose the previous version for viewing, just expand the view. From this same screen you are able to view the ACL details and also compare against the next most recent revision, an older revision, or a different node's ACL.

ACL detail view and rule alerts

 

When you navigate into the ACL, each of the rules in that ACL is displayed, including all the syntax for that ACL. In this view each rule provides a hit counter, making it easy to see which rules are impacting traffic and which ones are not. You are also able to drill down into the object groups.

 

Viewing conflicting rules is simple in NCM. Expanding on the alert, you can see the shadowed or redundant rules.

  • Redundant: a rule earlier in the list overlaps this rule, and does the same action to the matched traffic.
  • Shadowed: a rule earlier in the list overlaps this rule, and does the opposite action.
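To make the two definitions concrete, here is a small Python sketch that labels rules in an ordered list. It is a simplification and not NCM's detection logic: rules match on a source CIDR block only, "overlaps" is narrowed to "an earlier rule fully covers this rule", and the non-contiguous masks mentioned earlier are not handled. The Rule type and the sample rules are hypothetical.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    # Hypothetical, heavily simplified ACL entry: action plus source CIDR.
    action: str  # "permit" or "deny"
    source: str  # IPv4 CIDR block, e.g. "10.0.0.0/8"


def classify(rules):
    """Return (index, label) for each rule fully covered by an earlier rule.

    Same action as the covering rule   -> "redundant"
    Opposite action to the covering rule -> "shadowed"
    """
    findings = []
    for index, rule in enumerate(rules):
        network = ipaddress.ip_network(rule.source)
        for earlier in rules[:index]:
            earlier_network = ipaddress.ip_network(earlier.source)
            if network.subnet_of(earlier_network):
                same_action = earlier.action == rule.action
                findings.append(
                    (index, "redundant" if same_action else "shadowed")
                )
                break  # the first covering rule decides the outcome
    return findings


acl = [
    Rule("permit", "10.0.0.0/8"),
    Rule("deny", "10.1.0.0/16"),     # covered by rule 0, opposite action
    Rule("permit", "10.2.0.0/16"),   # covered by rule 0, same action
    Rule("deny", "192.168.0.0/24"),  # not covered by anything earlier
]

findings = classify(acl)
print(findings)
```

Either way, the later rule never changes what happens to the traffic it matches. A shadowed rule is usually the more worrying of the two, because someone intended the opposite action.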

 

Interface Config Snippets

At some point during the course of your day you will have identified one or many interfaces that warrant deeper inspection. Based on feedback from many of you, we discovered that once you reached this point you needed to see more information, specifically information about that interface and its configuration. Normally you would have had to dig into the overall running or startup configs, requiring you to navigate away from the interface screen. This is why we created interface config snippets, and this is probably one of my favorite features in this Network Insight release.

 

These snippets are the running configurations of the specific interface you are viewing.

Interface Config Snippet


Once you have found the snippet on the page, you are able to verify which configuration this snippet is pulled from and the date and time of when it was downloaded.

Interface Config Snippet details + history

 

Conclusion

That is all I have for now on this release but I recommend you go check out our online demo and visit the customer portal to click through this functionality and see all the great features available in this release. My fellow cohort cobrien put together a great blog on Network Performance Monitor's v12.3 release for Network Insight and I highly recommend that you head over and give it a read! I look forward to hearing your feedback once you have this new release up and running in your environment!

 

Starting with NPM 12.2, SolarWinds has embarked on a journey to transform your Orion deployment experience with fast and frequent releases of key deployment components. The first step was revamping the legacy installer to the new and improved SolarWinds Orion installer. The installer was able to deploy a new main poller or upgrade an existing one in one seamless session. The second iteration of the installer released the capability to do the same for your scalability engines. In this release, NTA has been updated to utilize an MS SQL database, allowing us to happily say that the SolarWinds Orion installer is truly an All-in-One installer solution for your Orion deployment. For NPM 12.3, we have made tremendous scalability improvements that allow you to utilize even more scalability engines. As a result, your Orion deployment upgrades gain in complexity, so the installer team is providing additional updates to how you can stage your environment for minimal upgrade time.

 

Normal Upgrade Process

 

Using the All-in-One SolarWinds Orion installer, your upgrade process will look like the following.

 

Step one:

 

Review all system requirements, back up your database, and if possible, snapshot the Orion deployment. This will be especially important in this release, as the NTA Flow Storage database requirements have changed. Note: Flow Storage database refers to the database instance that stores NTA collected flow data. In previous versions, this used a FastBit database, but in this release it has been updated to use MS SQL with a minimum version of 2016. The Orion database is the primary database that stores all polled data from NPM and other Orion products.

 

Step two:

 

Download the NPM 12.3 installer, selecting either the online or the offline variant according to your system requirements. Note: the SolarWinds Orion installer is

 

Step three:

 

Run the installer on your main poller and upgrade it to completion. If you have any other Orion product modules installed, the installer will upgrade this instance to the latest versions of those modules at the same time to maintain compatibility with the new Orion Platform 2018.2. If there are new database instances to be configured, that will be handled during the Configuration Wizard stage of the main poller upgrade. This release of the installer has a new type of preflight check that requires confirmation from you before proceeding. The example below shows one for the NTA upgrade. Click for details to see the confirmation dialog and select yes or no.

 

Configuration Wizard step for NTA:

 

Step four:

 

If you don’t have any scalability engines (e.g., Additional Polling Engines, Additional Websites, or HA Backups), you’re ready to explore all of the new features available in this version!

 

Scalability Engines

 

For those environments utilizing scalability engines or for those who are looking to try them out, this section will guide you through the process of deployment. Even if you have not utilized scalability engines previously, trying them out to test the scale improvements is incredibly easy. Like every SolarWinds Orion product, they are available for an unlimited 30-day free evaluation.

 

Deploying a fresh scalability engine is handled with the same installer that you downloaded for the main poller.

 

1. Copy the installer to your intended server and click “Run as Administrator.”

 

Note: If you downloaded the offline installer, which is about 2 GB, the download process to your server can take some time and does not currently stage the scalability engine for faster upgrade. In the future, this is something we’d like to improve but is not an available feature for this release. If you’d like to shorten the initial download of the installer file to the server, you can always use the online installer to set up your scalability engine. This installer file is about 40 MB, so the time to download it to the server is much shorter. This will still meet offline requirements because when you select the “Add a Scalability Engine” option, it will choose to download from the main poller to maintain version compatibility and does not require internet access. As always, the 40 MB scalability engines installer is also available for download from the All Settings -> Polling Engines page.

 

2. Select the “Add a Scalability Engine” option.

 

first screen of installer

 

3. Similar to the main poller upgrade process, at this point system checks that are specific to scalability engines will be run.

 

Note: Anything tagged as a blocker may need confirmation or action from you before proceeding.  If this is the case, address those issues and run the installer again. Things that are tagged as a warning or informational message are simply for your awareness and will not prevent your installation from proceeding.

 

4. Select the type of scalability engine that you are looking to deploy, and then complete the steps in the wizard to finish your installation per your normal process.

 

 

Upgrading a scalability engine is also handled through the same installer. However, this is where you have an opportunity to utilize our staging feature.

Note: If you were to proceed with your normal practice of putting the scalability engines installer on each server you need to upgrade, and then manually upgrading, that process will work perfectly well with no changes. Please read through the “Staging Your Environment for your Scalability Engines Upgrade” section below to see the alternative workflow that allows you to stage your environment.

 

Staging Your Environment for Your Scalability Engines Upgrade

 

For customers with more than a handful of scalability engines or with some distributed over WAN links, we noticed that they were occasionally experiencing extremely high download times from their main poller to their scalability engines. In addition, there was no centralized area where one could see the upgraded state of the scalability engines. Navigate to "All Settings", and click "High Availability Deployment Summary" and you will see the foundational pieces for an Orion deployment view.

 

The Servers tab contains the original High Availability Deployment Summary content, and is where you can continue to set up additional HA pools and HA environment.

 

Check out the new Deployment Health tab! You may not have heard of our Active Diagnostics tool, but it comes prepackaged with every install of the Orion Platform, with test suites designed to test for our most common support issues. We've brought that in-depth knowledge to your web console in the new Deployment Health view. With tests run nightly across your Orion Deployment, every time you come to this page you will see if there are any issues that could be a factor in the performance of Orion or your upgrades.

 

You are able to refresh a check if you're working on an issue and wish to see an updated test result. If there are tests that you don't want to address, silence them to hide the results from the web console. Click on the caret to the right and you'll be able to see more details and a link to a KB article that will give you remediation advice.

 

The Updates tab is where you will be able to stage your scalability engines.

 

The first page of the wizard will let you know if there are updates that are available to be installed on your scalability engines. At this point you've upgraded your main poller, so there are definitely updates available!  Click "Start" to get started!

 

The second page is where we are testing the connection to each of the scalability engines. If we are able to determine the status of these engines, we'll give you the green light to proceed to the next step. Common issues that could prevent this from being successful could be that the SolarWinds Administration Service has not been updated to the correct version or is not up and running at this point. Click "Start Preflight Checks" to proceed.

 

Similar to the Deployment Health tab, these are running preflight checks across your Orion Deployment. You'll be able to see all of the same preflight checks that were available through the installer client, except centralized to one view. If there are blockers present on this screen, you can still proceed in this flow if at least one scalability engine is ready to go, but please note down those scalability engines with blockers. You will need to address those blockers before an upgrade can occur on those servers. Click "Start download" to start the staging process.
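The proceed rule described above (blockers stop an individual engine, but staging can continue as long as at least one engine is clear) can be sketched as follows. This is only an illustration of the rule. The engine names, severity labels, and messages are invented, and this is not the installer's actual data format.

```python
def ready_engines(results):
    """Engines with no check tagged "blocker".

    `results` maps an engine name to a list of (severity, message) pairs,
    where severity is "blocker", "warning", or "info" (assumed labels).
    """
    return [
        engine
        for engine, checks in results.items()
        if not any(severity == "blocker" for severity, _ in checks)
    ]


def blocked_engines(results):
    """Engines to note down and fix before they can be upgraded."""
    ready = set(ready_engines(results))
    return [engine for engine in results if engine not in ready]


def can_start_download(results):
    """Staging may proceed when at least one engine is ready."""
    return len(ready_engines(results)) > 0


# Invented preflight results for three hypothetical scalability engines.
preflight = {
    "APE-01": [("info", "Disk space OK")],
    "APE-02": [("warning", "Pending reboot"), ("info", "Disk space OK")],
    "AWS-01": [("blocker", "Unsupported OS version")],
}

print(ready_engines(preflight), blocked_engines(preflight))
```

Warnings and informational messages do not take an engine out of the ready list, which matches how the installer treats them.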

 

 

 

At this point, we are starting the download process of every MSI needed to upgrade your scalability engines. In this example, I'm only staging one scalability engine, but if you have multiple, you can see the benefits in time savings right away! All of the downloads will be triggered in parallel.

Sit back and relax as we stage your environment for you. You can even open up RDP sessions to those servers with one click from this page.
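To get a feel for why triggering the downloads in parallel saves time, here is a toy Python sketch. The download is simulated with a short sleep, and the engine names and timings are made up. It says nothing about how the Orion installer is implemented; it only shows that total time approaches the slowest single download rather than the sum of all of them.

```python
import time
from concurrent.futures import ThreadPoolExecutor

SIMULATED_DOWNLOAD_SECONDS = 0.2  # stand-in for a real package transfer


def stage_engine(name):
    """Pretend to copy the upgrade packages to one engine (hypothetical)."""
    time.sleep(SIMULATED_DOWNLOAD_SECONDS)
    return name, "ready"


def stage_all(engines):
    """Stage every engine at once, one worker thread per engine."""
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        return dict(pool.map(stage_engine, engines))


engines = ["APE-01", "APE-02", "AWS-01", "HA-01"]

start = time.perf_counter()
status = stage_all(engines)
elapsed = time.perf_counter() - start

sequential_estimate = SIMULATED_DOWNLOAD_SECONDS * len(engines)
print(f"parallel: {elapsed:.2f}s, one at a time: ~{sequential_estimate:.2f}s")
```

With a handful of engines the difference is small, but over slow WAN links and dozens of engines the gap between "longest download" and "sum of downloads" adds up quickly.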

 

When everything has finished downloading, we will let you know which servers are ready to install. Click on the "RDP" icon to open your RDP session to the server.

 

On your desktop, you should see the SolarWinds scalability engines installer waiting for you to click on and finish the upgrade.

 

Visually, you will run through the same steps that you normally would in clicking through the installer wizard. However, when you actually get to the installation part, you'll notice that no download appears in the progress bar. Finish your upgrade and move on to the next!

 

I hope you enjoy this update to how you can upgrade your Orion Deployment. I'm always looking for feedback on how we make this as streamlined as possible for you.

NPM 12.3 is available today, May 31st, on the Customer Portal!  The release notes are a great place to get a broad overview of everything in the release.  Here, I'd like to go into greater depth on Network Insight for Cisco Nexus including why we built it and how it works.  Knowing that should help you get the most out of the new tech!

 

Network Insight

What's all this "Network Insight" talk?  If you haven't heard of this big theme we've been building on for a few years, start here.  If you know the story, skip ahead to the Network Insight for Cisco Nexus section.

 

We live in amazing times.  Every day new technologies are invented that change how we interact, how we build things, how we learn, how we live.  Many (most?) of these technologies are only possible because of the relatively new ability for endpoints to talk to each other over a network.  Networking is a key enabling technology today like electricity was in the 1800s and 1900s, paving the way for a whole wave of new technologies to be built.  The better we build the networks, the more we enable this technological evolution.  That's why we believe in building great networks.

 

A great network does exactly one thing well: connects endpoints.  The definition of "well" has evolved through the years, but essentially it means enabling two endpoints to talk in a way that is high performance, reliable, and secure.  Turns out this is not an easy thing to do, particularly at scale.  When I first started maintaining, and later building networks, I discovered that monitoring was one of the most effective tools I could use to build better networks.  Monitoring tells you how the network is performing so you can improve it.  Monitoring tells you when things are heading south so you can get ahead of the problem.  Monitoring tells you if there is an outage so you can fix it, sometimes even before users notice.  Monitoring reassures you when there is not an outage so you can sleep at night.

 

Over the past two decades, we believe as a company and as an industry we have done a good job of building monitoring to cover routers, switches, and wireless gear.  That's great, but virtually every network today includes a sprinkling of firewalls, load balancers, chassis switches, and maybe some web proxies or WAN optimizers.  These devices are few in number, but absolutely critical.  They're not simple devices either.  Monitoring tools have not done a great job with these other devices.  The problem is that we mostly treat them like just another router or switch.  Sure, there are often a few token extra metrics like connection counts, but that doesn't really represent the device properly, does it?  The data that you need to understand the health and performance of a firewall or a load balancer is just not the same as the data you need for a switch.  This is a huge visibility gap.

 

Network Insight is designed to fill that gap by finally treating these other devices as first class citizens; acquiring and displaying exactly the right data set to understand the health and performance of these critical devices.

 

Network Insight for Cisco Nexus

Network Insight for Cisco Nexus is our third installment in the Network Insight story, following Network Insight for F5 and Network Insight for ASA.  Nexus chassis switches are used to build high performance, scalable, and virtually indestructible data center networks.  That's why Nexus are at the heart of many of the largest data centers.  Nexus are switches so our traditional switching data is still important, but a $300k chassis switch has a lot of additional capabilities that a $5k switch does not.  As you saw with F5 and ASA, Network Insight for Cisco Nexus takes a clean slate approach.  We asked ourselves (and many of you) questions like:

 

  • What role does this device play in connecting endpoints?
  • How can you measure the quality with which the device is performing that role?
  • What is the right way to visualize that data to make it easiest to understand?
  • What are the most common problems that occur with this device?  What are the most severe?
  • Can we detect those problems?  Can we predict them?

 

With these learnings in hand, we built the best monitoring we could from the ground up.

 

VDC Aware

 

Similar to ASAs, Nexus can be split into virtual instances.  Nexus calls them Virtual Device Contexts while ASA calls them Contexts.  VDCs are to Nexus what VMs are to servers, allowing a single piece of hardware to be split into several logical nodes.  Each logical node, or VDC, is configured separately and provides a full set of technology services.  All of the features you read about below discover complete information about each VDC.

 

Adding the Admin VDC for a Nexus to monitoring lets NPM map out all of the VDCs, which will then appear on the Node Details screen:

 

Anytime you go to Node Details for any of the VDCs, you'll get this new resource so it's easy to navigate between them.  NCM users will also find it easier than ever to make sure all of their VDCs are backed up.  If you're well setup for catastrophic failures, they're less likely to occur, right?  More info on what NCM is doing for VDCs can be found here.

 

So Many Interfaces

 

The first big difference between Cisco Nexus and most other devices is simple interface count.  Thanks to the distributed nature of a Nexus deployment, particularly Fabric Extenders, a single Nexus 7k is likely to have hundreds or even thousands of ports.  Dealing with thousands of ports on a single device is different than dealing with the usual couple dozen, and we wanted to make sure this fundamental part of Nexus monitoring was done right.

 

First, the Node Details page now contains a simple summary of all of the interfaces:

 

Like Network Insight for ASA, we have a new sub-view for each major technology service provided by the device.  Clicking on Interfaces, in the above resource or on the sub-view tabs on the left, will bring you to the Interfaces sub-view showing all interfaces.  Clicking on any of the status icons or numbers will bring you to a list of only those interfaces.

 

This is built on the relatively new List View that's part of our Unified Interface Framework.  UIF is an important component that makes sure all Orion Platform based tools from SolarWinds have a consistent UI experience, so when you learn how to do something in one tool, you know how to do it in all tools.  The list view is made for management of large lists, including:

  • Multi-level filtering, for example, interfaces with status Up AND (utilization Warning OR Critical).
  • Colored highlighting of values over your thresholds for that specific entity.
  • Sorting
  • Searching
  • Pagination control with up to 100 items per page.

 

I particularly like the search function for looking up ports on a certain module.  Entering a "1/" in the search field will show you all the ports on slot 1.  Easy.
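For readers who think in code, here is a rough Python sketch of what multi-level filtering, searching, and pagination amount to on a list of interfaces. It is not the List View implementation: the Interface class, the status and utilization labels, and the plain substring search are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Interface:
    name: str
    status: str        # "Up" or "Down"
    utilization: str   # "Normal", "Warning", or "Critical"


def up_and_busy(interfaces):
    """Status Up AND (utilization Warning OR Critical)."""
    return [i for i in interfaces
            if i.status == "Up" and i.utilization in ("Warning", "Critical")]


def search(interfaces, text):
    """Assumed plain substring match on the interface name."""
    return [i for i in interfaces if text in i.name]


def page(items, number, size=100):
    """Return one page of results; pages are numbered from 1."""
    start = (number - 1) * size
    return items[start:start + size]


if __name__ == "__main__":
    # Hypothetical sample data.
    ports = [
        Interface("Ethernet1/1", "Up", "Critical"),
        Interface("Ethernet1/2", "Up", "Normal"),
        Interface("Ethernet2/1", "Down", "Warning"),
    ]
    print([i.name for i in up_and_busy(ports)])
    print([i.name for i in search(ports, "1/")])
    print([i.name for i in page(ports, 1, size=2)])
```

Note that a plain substring match on "1/" would also catch a name such as Ethernet11/1, so treat the search part as a simplification.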

 

These are straightforward improvements but I think you'll find it much more pleasant dealing with the large interface counts on your Nexus devices.  And good news: we extended this sub-view to all nodes so you have a super polished interface interaction model on your smaller switches too.

 

Virtual Port Channels

 

A big part of why people are willing to shell out for the huge cost of a Nexus is more reliable connectivity to endpoints like servers.  Nexus should provide an order of magnitude higher reliability connectivity to servers.  Cisco accomplishes this with vPCs, a Multi-Chassis Etherchannel (MCEC) technology that allows a single endpoint to uplink to multiple switches.  Traditional port channels can only connect to a single upstream switch, resulting in a single point of failure.

 

Believe it or not, vPCs are a serious departure from how networking works.  In fact, a pair of Nexus have to "conspire" (a fancy word for lie) to present themselves as a single switch to the endpoint they're connected to.  Cisco has a bunch of technology to make it work, and in our research we found this was making it hard for administrators to understand, monitor, and troubleshoot their vPCs.  When we dug into this, we found that expert administrators will spend several minutes to understand the health of a single vPC.  They do things like:

  • Login to Nexus
  • "show vpc"
  • "show interface port-channel..."
  • "show interface...", repeat 2-4 times
  • "show run interface...", repeat 2-4 times
  • Find peer switch, login, and do all the commands again.

 

When all is said and done, they've mapped 5, 7, 9, or even more different components, each with its own status, performance, and config.  Our goal was to have this expert level data set available to experts and non-expert users in seconds.  The vPC tab accomplishes that:

On the left we see the vPCs.  Each vPC is mapped to the local port-channel.  We find the peer switch and map the vPC to the port-channel on the peer.  Mousing over allows you to see the member ports of each port-channel and navigate to them:

Again we're using the List View, so you have filtering, sorting, searching, pagination, and so forth as expected.  Click to drill into any interface for all the details we have about that interface.  Of course all of this can be alerted upon and reported on to keep you ahead of problems without staring at monitoring all day.  There's some really cool additional stuff you can do with NCM specific to vPCs.  If you're interested, check out their upcoming post.

 

During beta and RC we found environments where folks had spent hundreds of thousands to more than a million dollars and countless hours setting up high resiliency.  Once they pointed NPM at their Nexus, they found that resiliency had deteriorated over time.  They had failures and the redundancy saved them, but it also meant they didn't know the problem existed so they never restored redundancy.  This leaves them one failure away from a catastrophe in a multi-million dollar high redundancy environment.
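To show what "redundancy quietly lost" looks like as data, here is a small Python sketch that models a vPC as a local port-channel plus a peer port-channel, each with member ports, and flags any vPC that is down to a single working leg. This is an illustration only, not NPM's data model; the class names, switch names, and port names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PortChannel:
    switch: str
    name: str
    member_status: dict[str, str] = field(default_factory=dict)  # port -> "Up"/"Down"

    def up_members(self) -> list[str]:
        return [port for port, status in self.member_status.items() if status == "Up"]


@dataclass
class Vpc:
    vpc_id: int
    local: PortChannel
    peer: PortChannel

    def working_legs(self) -> list[PortChannel]:
        """A leg works if its port-channel has at least one member port up."""
        return [pc for pc in (self.local, self.peer) if pc.up_members()]

    def redundancy_lost(self) -> bool:
        """True when fewer than two legs work: traffic may still flow, but one more failure is an outage."""
        return len(self.working_legs()) < 2


if __name__ == "__main__":
    # Hypothetical vPC whose peer leg failed some time ago.
    vpc = Vpc(
        vpc_id=10,
        local=PortChannel("nexus-a", "port-channel10", {"Ethernet1/1": "Up"}),
        peer=PortChannel("nexus-b", "port-channel10", {"Ethernet1/1": "Down"}),
    )
    print("redundancy lost:", vpc.redundancy_lost())
```

The endpoint stays reachable in this state, which is exactly why the problem goes unnoticed unless something is checking both legs.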

 

If you're in IT, you're strapped for time.  Our monitoring tools have to help us do better here.  I'm happy that NPM will now help you keep your vPCs running clean!

 

Access Lists?!

 

One thing that surprised me is how many of you are running ACLs on your Nexus.  There's a trend of moving security closer to the endpoint, and Nexus devices are the access layer for many data center environments.  This results in lots of Port Access Control Lists (PACLs) and VLAN Access Control Lists (VACLs).  Fortunately, we recently worked on this for Cisco ASA.  The latest NCM release extends and enhances the ACL backup and analysis capability, including new support for MAC ACLs and non-contiguous subnet masks.  All of the Access List functionality is based on pulling and analyzing configs, so you'll need the NCM tool to get this feature.  Check out NCM's post - and also, bonus, my favorite part: Interface Config Snippets!

 

Traditional Routing and Switching

 

While working on the enhanced capabilities, we also revisited some core technology of ours to make sure it was covering Nexus well.  Things like routing protocol monitoring and hardware health should work better than ever.  We think we've got everything covered but there's a huge number of combinations of hardware (platform and modules) and software (trains and versions).  If you notice any gaps please shoot me a private message with the data that's not showing up for you and an SNMP walk of your device.

 

Setup

 

I would have started this guide with setup if not for the fact that setup is so darn easy.  To get this feature working, add a node as usual and you'll notice a new check box on the last step of the Add Node Wizard:

 

 

Check that box, enter your CLI creds (read only is fine) and you're good to go.  If you have existing Nexus under monitoring and you'd like to get the enhanced monitoring, head over to manage nodes.  You can edit an individual node and check this box, or you can find all of them with Machine Type and/or search and enable all at once.

 

There's nothing else you need to configure or define.  Simple right?

 

Other Deep Dives

 

We've got a couple other deep dives for new Orion Platform features included in NPM 12.3.  Check 'em out!

 

Orion Platform 2018.2 Improvements - Chapter One

Orion Platform 2018.2 Improvements - Chapter Two - Intelligent Mapping

Orion Platform 2018.2 Improvements - Chapter Three

 

Conclusion

 

That does it for now.  You'll be able to click through the functionality yourself in our online demo starting around June 6th.  If you're on active maintenance for NPM, head over to the Customer Portal to get your upgrade now.  I'd love to hear your feedback once you have it running in your environment!

Starting with VMAN 8.0, and continuing with 8.1, we've streamlined how you deploy and use VMAN.  Virtualization Manager 8.2, the latest edition of these efforts, is now available on your Customer Portal.

 

One of the biggest pain points that surfaced over the last 2 releases was that the process for adding virtualization nodes to be monitored was not intuitive. This is solved with a new simplified workflow!

 

Whether choosing to add a Node or setting up a Discovery job, we've updated those entry points to direct you to the new, separate workflow.

 

Add a Node - Select VMware vCenter or Hyper-V devices
Network Discovery - Add VMware vCenter or Hyper-V devices
All Settings -  Add VMware vCenter or Hyper-V devices

 

Once you click on any of those entry points you'll be able to get started monitoring your environment with a few simple clicks.

 

Add a Virtual Object for Monitoring
See the thresholds that apply to your virtualization manager entities
Click Finish and you're successfully on your way to monitoring your virtualization environment

 

If you identified any thresholds that you'd like to tweak, simply navigate to All Settings -> Virtualization Settings to update your thresholds. Within a few clicks, you're ready to take advantage of capacity planning, recommendations and much more!

 

Get Started with Documentation

VMAN 8.2 Release Notes

VMAN 8.2 Getting Started Guide

VMAN 8.2 Administrator Guide

VMAN 8.2 Deployment Sizing Guide

Applications talk to each other, and you should know who they are talking to

 

Applications constantly rely on communication between different servers to deliver data to end-users. The more applications end-users require to do their job, the greater the complexity of application environments and those communication based relationships.

With the release of Server & Application Monitor 6.6, we introduced an Orion Agent based feature, called Application Dependencies, which enables system administrators to quickly gain an understanding of which applications servers are talking to one another, as well as see related metrics, to help with troubleshooting application performance issues.

 

How do you enable it?

The ability to discover and map Application Dependencies is enabled by default. This allows SAM to actively collect inbound and outbound communication at the application process level. This is paired with an ability to collect connection related metrics (latency and packet loss), which is disabled by default. You can find all of the configuration options in the Application Connection Settings section of the Main Settings & Administration screen.

 

What does it show you?

At its core, Application Dependencies help you understand if application performance issues are associated with server resource utilization or network communication. For example, Microsoft Exchange is heavily dependent on Active Directory for authentication and other services. Application Dependencies show you the relationship, and the communication, by adding a few new resources in SAM.

 

There are two main areas where you can see the Application Dependency information. One area is in a new widget that is available on application and node details pages. This widget will show you the discovered application dependencies, specific to that monitored application or node. Notice in the screen below that you can see where multiple Exchange servers have a dependency on the Active Directory server, ENG-AUS-SAM-62, and more specifically the Active Directory service that is running on it.

 

The second area where you can see Application Dependency information is in the connection details page, which is linked from the above-mentioned connections widget. This will allow you to see all of the application monitors, and associated processes, process resource metrics, and ports, responsible for the discovered communication, between two specific nodes. You will also see the latency and packet loss data, if you have enabled the Connection Quality Polling component. The screen below shows the relationship between ENG-AUS-SAM-62 (Active Directory) and ENG-AUS-SAM63 (Exchange), in greater detail.

What’s going on under the covers?

There are two new Orion Agent plug-ins that help deliver this new functionality. One is the Application Dependency Mapping plug-in, and the other is the Connection Quality Polling plug-in.

The Application Dependency Mapping plug-in is responsible for collecting the active connection data from the server. That information is then sent back to the Orion Server, where it is correlated with component monitor and node data, already being collected by SAM (Note: You must have at least one component monitor, like the process monitor, applied to the server). As SAM matches the collected data from the different application servers, it creates the connection details pages and populates the connection widget.
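As a rough mental model of that correlation step, the Python sketch below groups reported connections into node-to-node dependencies. It is not SAM's actual algorithm: the record layout, the node names, the IP addresses, the process name, and the rule that only monitored processes are kept are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Connection(NamedTuple):
    reporter: str     # node whose agent reported the connection
    process: str      # process that owns the connection
    remote_ip: str
    remote_port: int


def build_dependencies(connections, node_by_ip, monitored):
    """Map (source node, target node) -> set of (process, port) pairs.

    node_by_ip: IP address -> known node name
    monitored:  node name -> set of process names covered by a component monitor
    """
    deps = defaultdict(set)
    for conn in connections:
        target = node_by_ip.get(conn.remote_ip)
        if target is None:
            continue  # remote end is not a node we know about
        if conn.process not in monitored.get(conn.reporter, set()):
            continue  # assumed rule: skip processes with no component monitor
        deps[(conn.reporter, target)].add((conn.process, conn.remote_port))
    return dict(deps)


if __name__ == "__main__":
    # Hypothetical data: an Exchange server talking LDAP (port 389) to a domain controller.
    conns = [
        Connection("exchange-01", "mail-service.exe", "10.0.0.5", 389),
        Connection("exchange-01", "unmonitored.exe", "10.0.0.5", 445),
        Connection("exchange-01", "mail-service.exe", "203.0.113.9", 443),
    ]
    nodes = {"10.0.0.5": "ad-01"}
    monitors = {"exchange-01": {"mail-service.exe"}}
    print(build_dependencies(conns, nodes, monitors))
```

Only the first connection survives in this example, because the second has no matching monitor and the third points at an address that is not a known node.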

 

The Connection Quality Polling plug-in is responsible for a synthetic probe, which measures latency and packet loss. This is accomplished by sending TCP packets to the destination server, on the specific port identified by the active connection information collected by the Application Dependency Mapping plug-in. It is important to note that the Connection Quality Polling plug-in includes the NPCAP driver for use with this synthetic probe.
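If you'd like to approximate this kind of measurement yourself, the Python sketch below times ordinary TCP connection attempts to a host and port and counts failed attempts as loss. This is a simplified stand-in and not the plug-in's method, which works at the packet level through the NPCAP driver; the function name and the returned field names are hypothetical.

```python
import socket
import time


def probe(host: str, port: int, attempts: int = 5, timeout: float = 1.0) -> dict:
    """Time TCP connects to host:port; failed or timed-out attempts count as lost."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    latencies_ms = []
    for _ in range(attempts):
        start = time.perf_counter()
        try:
            # A completed connect means the TCP handshake succeeded.
            with socket.create_connection((host, port), timeout=timeout):
                latencies_ms.append((time.perf_counter() - start) * 1000)
        except OSError:
            pass  # refused, unreachable, or timed out: treat as a lost probe
    lost = attempts - len(latencies_ms)
    average = sum(latencies_ms) / len(latencies_ms) if latencies_ms else None
    return {"avg_latency_ms": average, "packet_loss_pct": 100.0 * lost / attempts}


if __name__ == "__main__":
    # Hypothetical target: a server listening on port 389 on the local machine.
    print(probe("127.0.0.1", 389, attempts=3))
```

Connect time includes more than a single round trip, so treat the numbers as a rough indicator rather than a match for what SAM reports.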

 

If you would like to read more about how this feature works, you can find more information in the SAM administrator guide.

 

Is that it?

Application Dependencies is not the only feature that was released in SAM 6.6. You can read more about the other features in the release notes. You can also check out Application Dependencies, live in action, in the online demo.
