Troubleshooting VMs with SolarWinds Observability SaaS - Part One

Working with VMs hosted by Microsoft Azure  

SolarWinds® Observability tools provide a comprehensive solution for monitoring the performance of your network, applications, databases, and infrastructure, including VMs. SolarWinds Observability is available in two versions: SolarWinds Observability Self-Hosted, which runs on-premises or in a private or public cloud, and SolarWinds Observability SaaS, a hosted cloud-native solution. SolarWinds Observability unifies monitoring for all your services and applications in a single solution, improving cooperation between your software engineers, network engineers, database administrators, ITOps, DevOps, and SRE teams.  

Virtualization helps improve the resource-sharing of physical hardware, reducing costs. However, monitoring VM performance is essential to ensure reliable functionality. Identifying and solving problems quickly for efficient VM management and optimization is critical.  

Note: If you run your VMs in VMware, Nutanix, or Microsoft Hyper-V environments, you can take advantage of a rich set of VM performance management tools with SolarWinds Observability Self-Hosted. In this series, we’ll cover cloud VMs, such as those hosted with Azure. 

SolarWinds Observability software can help you identify and remove bottlenecks in your cloud VMs through performance metrics and alerts. It also provides performance analysis to speed up your troubleshooting efforts. 

In this guide, Part One of a three-part series troubleshooting VMs with SolarWinds Observability SaaS, we’ll review the essential metrics and KPIs associated with VMs. We’ll also explore how to address performance issues for VMs hosted by Azure, using monitoring tools and performance-tuning strategies with SolarWinds Observability SaaS. 

Critical Metrics for Monitoring Virtual Machines 

Key VM metrics give insight into the elements influencing your VM's functions. SolarWinds Observability SaaS captures and monitors metrics from VMs hosted by Azure. To better understand your VM's performance, you should become familiar with the following metrics: 

  • Network utilization: Refers to the percentage of available network bandwidth the VM uses. This metric indicates how efficiently the VM utilizes network resources and helps identify performance bottlenecks. 
  • CPU load: Measures the percentage of the VM's CPU capacity currently in use. Monitoring CPU load helps ensure the VM has sufficient processing power to handle its workload. 
  • Memory usage: Refers to the proportion of RAM actively used by the VM out of the total RAM allocated. Effective memory management will help you avoid performance degradation issues. 
  • Connection state and IOPS: Measures the responsiveness and efficiency of input/output operations, indicating how well the VM communicates with storage. Monitoring the Input/Output Operations Per Second (IOPS) metric can help you identify potential storage-related issues impacting VM performance. 
  • Number of connections: Measures the count of active connections to and from the VM. Monitoring this metric is essential for assessing the workload and ensuring the VM can handle the expected volume of concurrent connections. 
  • Connection latency: Refers to the time delay between sending a request and receiving a response. It indicates the responsiveness of the VM's network connections. This metric can help you maintain optimal user experience and identify potential network issues. 
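Several of the metrics above are simple ratios, and it can help to see the arithmetic spelled out. The following is a minimal sketch, not how SolarWinds Observability SaaS computes its values: the VmSample fields and the summarize helper are hypothetical names chosen for illustration, and the raw readings are assumed to come from whatever source you already collect them from.

```python
from dataclasses import dataclass


@dataclass
class VmSample:
    """One hypothetical point-in-time reading for a VM (field names are illustrative)."""
    bytes_per_sec: float                # observed network throughput
    link_capacity_bytes_per_sec: float  # available network bandwidth
    cpu_busy_seconds: float             # CPU time spent busy, summed over all vCPUs
    interval_seconds: float             # length of the sampling interval
    vcpu_count: int                     # number of vCPUs allocated to the VM
    memory_used_mb: float               # RAM actively in use
    memory_allocated_mb: float          # total RAM allocated to the VM


def percent(part: float, whole: float) -> float:
    """Return part/whole as a percentage, or 0.0 when whole is not positive."""
    return 100.0 * part / whole if whole > 0 else 0.0


def summarize(sample: VmSample) -> dict[str, float]:
    """Derive the ratio-style metrics described in the list above."""
    cpu_capacity_seconds = sample.interval_seconds * sample.vcpu_count
    return {
        "network_utilization_pct": percent(
            sample.bytes_per_sec, sample.link_capacity_bytes_per_sec
        ),
        "cpu_load_pct": percent(sample.cpu_busy_seconds, cpu_capacity_seconds),
        "memory_usage_pct": percent(
            sample.memory_used_mb, sample.memory_allocated_mb
        ),
    }
```

For instance, a VM with two vCPUs that was busy for 30 CPU-seconds during a 60-second interval has used 30 of 120 available CPU-seconds, which is a CPU load of 25 percent.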

Monitor your VM environment further with: 

  • VM capacity: Refers to the VM's ability to handle its assigned workload without resource constraints. Monitoring VM capacity helps ensure the VM has sufficient resources (CPU, memory, and storage) to meet the demands of its applications and services. 
  • Changes in network configurations: Change monitoring is essential for addressing and detecting potential security vulnerabilities, misconfigurations, and performance issues introduced by network adjustments. 
  • Authentication failures: Tracks unsuccessful attempts to access the VM to uncover potential security threats. 
  • Syslog: Captures system logs, provides insights into system events, and detects anomalies or issues within the VM. 

Before identifying KPIs for your VM, it’s important to understand your goals and expectations for application performance. For example, connection latency and the number of connections would make suitable KPIs for a web server application. You might focus on CPU load and memory usage for a compute-intensive workload.  

Tailor your KPIs to reflect the unique characteristics of each application, ensuring your monitoring strategy addresses the specific needs and objectives of your diverse workloads. 
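One lightweight way to keep track of these choices is to write them down as data. The sketch below simply records the two examples from this section; the workload labels, metric keys, and the kpis_for helper are hypothetical names for illustration and are not part of SolarWinds Observability SaaS.

```python
# Hypothetical mapping of workload type to the KPIs worth watching first.
KPIS_BY_WORKLOAD: dict[str, list[str]] = {
    "web_server": ["connection_latency", "number_of_connections"],
    "compute_intensive": ["cpu_load", "memory_usage"],
}


def kpis_for(workload: str) -> list[str]:
    """Return the KPIs recorded for a workload type, or an empty list if none are defined yet."""
    return list(KPIS_BY_WORKLOAD.get(workload, []))
```

An empty result is a prompt to decide which metrics matter for that application before you start building dashboards or alerts for it.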

How to Troubleshoot VMs hosted by Azure 

SolarWinds Observability SaaS tools provide you with insights into virtual server performance. They simplify management by mapping VMs to their underlying hosts, storage, and other related objects. They also help you view and analyze the performance correlations between virtual servers.  

In this section, we’ll look at leveraging the capabilities in SolarWinds Observability SaaS to troubleshoot performance issues for VMs hosted by Azure. 

Add Your Azure Cloud Account to SolarWinds Observability SaaS 

The first step in using SolarWinds Observability SaaS to monitor your VMs hosted by Azure is to integrate SolarWinds Observability SaaS with your Azure account. Create an application as a service principal with Azure AD, using either the Azure CLI or the Azure portal, and assign it the “Monitoring Reader” role. 
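If you take the Azure CLI route, the service principal can be created with the az ad sp create-for-rbac command. The sketch below only builds that command and parses the JSON it is expected to return; it does not run anything. The helper names, the display name, and the subscription ID are placeholders, and the output field names (appId, tenant, password) reflect the Azure CLI at the time of writing, so check them against your installed version.

```python
import json
import shlex


def build_sp_command(name: str, subscription_id: str) -> list[str]:
    """Build the Azure CLI call that creates a service principal with the Monitoring Reader role."""
    return [
        "az", "ad", "sp", "create-for-rbac",
        "--name", name,
        "--role", "Monitoring Reader",
        "--scopes", f"/subscriptions/{subscription_id}",
    ]


def parse_sp_output(raw_json: str) -> dict[str, str]:
    """Pick out the values you will need later from the CLI's JSON output.

    Assumption: the CLI reports the tenant ID under the key "tenant"
    and the generated client secret under the key "password".
    """
    data = json.loads(raw_json)
    return {
        "appId": data["appId"],
        "tenantId": data["tenant"],
        "clientSecret": data["password"],
    }


# Print the command so it can be reviewed before running it in a shell.
# Both arguments are placeholders; substitute your own values.
print(shlex.join(build_sp_command("swo-monitoring", "<subscription-id>")))
```

Building the command as a list keeps the role name, which contains a space, as a single argument if you later pass it to subprocess.run instead of copying it into a shell.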

Then, click the Add Data button at the top in SolarWinds Observability SaaS. Select Infrastructure. 

 

From there, walk through the steps for adding Azure resources. If you used the Azure CLI for setup, you will be asked to provide an appId and tenantId. If you used the Azure portal, then you will provide the client secret. SolarWinds Observability SaaS will walk you through the steps to select the regions and resources associated with your Azure account. 

These Azure resources will show up in the Infrastructure Monitoring area of SolarWinds Observability SaaS. 

The metrics gathered from your Azure resources may depend on individual resource configurations and the subscription pricing tier of your Azure account. These metrics are associated with the Azure Resource Manager. 

When you add Azure VM resources to SolarWinds Observability SaaS, the SolarWinds Observability Agent is installed on the host to provide additional monitoring data via OpenTelemetry Collector. 

Metrics Dashboards and Visualizations 

Within SolarWinds Observability SaaS, you can quickly access dashboards and visualizations for your Azure VMs. In addition to navigating through the Azure page of Infrastructure resources, you can also use Explorer and filter by entity type (set to Azure VM). 

You can drill down to an individual VM to see its Health Score and an overview of several critical metrics. 

SolarWinds Observability SaaS calculates the Health Score based on entity anomalies, status, and alerts. Higher severity alerts may have a more significant impact on the Health Score. A closer look will show past events that affected the Health Score, both positively and negatively. 

Alerts and Notifications 

SolarWinds Observability SaaS lets you create alerts quickly and simply on metrics from VMs hosted by Azure to detect and notify you of common issues in virtual environments. You can create alerts manually on the Alerts Settings page by clicking Create Alert. 

Alerts for VMs hosted by Azure are based on metric conditions, triggering when a specific metric exceeds a certain threshold. 
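To make the idea of a metric condition concrete, here is a minimal sketch of threshold evaluation. It is not the SolarWinds Observability SaaS alerting engine: the MetricCondition class, its fields, and the "several breaches in a row" rule are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class MetricCondition:
    """A hypothetical alert condition: a metric name and the threshold it must exceed."""
    metric: str
    threshold: float
    min_consecutive: int = 1  # breaches required in a row before triggering


def should_trigger(condition: MetricCondition, values: list[float]) -> bool:
    """Return True when the most recent min_consecutive values all exceed the threshold."""
    if condition.min_consecutive < 1:
        raise ValueError("min_consecutive must be at least 1")
    if len(values) < condition.min_consecutive:
        return False  # not enough data points to decide yet
    recent = values[-condition.min_consecutive:]
    return all(value > condition.threshold for value in recent)
```

Requiring more than one consecutive breach is a common way to avoid alerting on a single short-lived spike; with min_consecutive left at 1, the condition fires as soon as the latest value crosses the threshold.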

Alerts can also be created directly from a dashboard in the infrastructure resources overview for a specific VM. 

Creating an alert this way automatically pre-populates the alert creation form. From here, you can further customize threshold values or other conditions. 

 

After configuring alert thresholds, you can define the actions that should be triggered. For example, you can set up an alert to send an email notification. 

Performance Tuning Strategies 

Managing and optimizing VM performance can be complex, time-consuming, and resource-intensive. The process can be fraught with errors and inconsistencies. SolarWinds Observability SaaS provides tools for analyzing and optimizing VM performance metrics. 

You can use the Metrics Explorer to compare metrics across all VMs hosted by Azure. 

For example, you can visually compare CPU utilization across all Azure VMs by filtering down to cloud.platform:azure_vm and grouping metrics by host.name.  

The visual shows how one particular VM hosted by Azure spiked in CPU utilization while that same metric remained relatively low for other VMs. 
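The same comparison can be reasoned about outside the UI. The sketch below groups CPU samples by host.name and flags any host whose peak stands well above its peers. It is an illustration only: the sample layout, the cpu_utilization_pct key, the helper names, and the factor of 2.0 are all assumptions, not something Metrics Explorer exposes in this form.

```python
from collections import defaultdict
from statistics import mean


def group_by_host(samples: list[dict]) -> dict[str, list[float]]:
    """Group CPU utilization readings by the host.name attribute."""
    grouped: dict[str, list[float]] = defaultdict(list)
    for sample in samples:
        grouped[sample["host.name"]].append(sample["cpu_utilization_pct"])
    return dict(grouped)


def find_spiking_hosts(grouped: dict[str, list[float]], factor: float = 2.0) -> list[str]:
    """Return hosts whose peak exceeds factor times the mean peak of the other hosts."""
    peaks = {host: max(values) for host, values in grouped.items()}
    spiking = []
    for host, peak in peaks.items():
        other_peaks = [p for h, p in peaks.items() if h != host]
        # With only one host there is nothing to compare against.
        if other_peaks and peak > factor * mean(other_peaks):
            spiking.append(host)
    return sorted(spiking)
```

Comparing each host against the others, rather than against a fixed threshold, mirrors what the chart shows: one VM spiking while the same metric stays relatively low elsewhere.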

The Metrics Explorer from SolarWinds Observability SaaS allows you to: 

  • Explore historical VM data, providing a comprehensive walkthrough of performance metrics and trends. 
  • Confirm and address resource allocation issues in your cloud environments, helping ensure optimal performance and resource utilization across VM instances. 
  • Use data correlation techniques to troubleshoot network traffic, both incoming and outgoing, for VMs hosted by Azure. 


Conclusion 

The metrics and alerting capabilities in SolarWinds Observability SaaS are a robust solution for monitoring the performance of VMs hosted by Azure. You can collect key metrics to optimize VM performance by connecting your Azure cloud account to SolarWinds Observability and installing the SolarWinds Observability Agent on your resources, such as Azure VMs. You can also leverage tools and visualizations in Metrics Explorer to troubleshoot and fine-tune VM performance, helping ensure your virtualization environment is functioning as expected. 

The second installment of this three-part series will continue our exploration of SolarWinds Observability SaaS for performance tuning of VMs hosted by Azure.  

Sign up for a free trial of SolarWinds Observability SaaS today. 
