As bmrad pointed out in the Beta 1 Post, we've been working really hard to extend the integration with NPM and SAM introduced in Virtualization Manager 6.0. The team has been hyper-focused on simplifying configuration of the integration in order to bring you App-aware infrastructure monitoring, while preserving your flexibility to start with the tool you want (e.g. SAM or VMAN) and leverage the integration in the places that make the most sense. The features I'm going to outline here come directly from you and what you've told us matters most for detecting and remediating problems quickly in your virtualized environment.

3-11-2014 2-59-31 PM.png

 

 

Synchronization Wizard

 

I'm not going to go into elaborate detail here about the Sync Wizard, as bmrad did an excellent job of that in the Beta 1 Post. However, I did want to thank all of our Beta participants for giving us great feedback on the usability of the Wizard workflow. The Product Team completely understands that it doesn't matter how great the integration is: if you can't get the integration set up in the first place, none of the rest matters. With the feedback from Beta 1, we were able to streamline the messaging and workflow in the Wizard to get you up and running with the integration in minutes. We're not saying it's perfect (and definitely let us know where things still don't make sense!), but it should go a long way toward making sure you never see broken integration resources again.

 

 

Baselines (Dynamic Thresholds) on Clusters, Hosts, VM's, and Datastores

 

As we discussed in the Beta 1 Post, the VMAN integration now takes advantage of Dynamic Thresholds, or Baselines, for thresholds and alerting purposes. An IT environment is a dynamic place, and when you add virtualized infrastructure to the mix, complexity leaps an order of magnitude. Isn't it about time that your alerting system recognized that fact? Well, now it can!

 

So what can you set baseline thresholds on? Our conversations with you determined the most important attributes across Datastores, VM's, Hosts, and Clusters for us to add to this release. Obviously, we couldn't talk to everyone, so we are very interested in your feedback if there are other key metrics that would be valuable to baseline in your environment! In 6.1, you can set dynamic thresholds against the following virtual objects:

 

Clusters
  • CPU Load
  • Memory Load
Hosts
  • Network Utilization
  • Memory Load
  • CPU Load
Virtual Machines
  • Memory Load
  • CPU Load
  • CPU Ready
  • IOPS Total
  • IOPS Read
  • IOPS Write
  • Latency Total
  • Latency Read
  • Latency Write
  • Network Usage Rate
Datastores
  • IOPS Total
  • IOPS Read
  • IOPS Write
  • Latency Total
  • Latency Read
  • Latency Write

 

Given that, let's see what it might look like if I want to go set a CPU Load baseline threshold against Virtual Hosts in my environment. CPU Load is an important metric to measure on Hosts with Baselines. I may have Hosts that run heavily loaded all of the time and the VM's perform acceptably on those. Therefore, I may not care to use a static threshold like 80% Warning / 90% Critical, but instead just want my alerting system to tell me "when this host is under abnormally high load." So let's get started....

 

What You'll Do | What You'll See
First step, of course, is to make sure that you've got the integration enabled. Simply go to Settings->Virtualization Settings->Enable Virtualization Manager Integration and enter the IP and credentials for your VMAN appliance. This will launch you directly into our new (VMAN 6.1) Synchronization Wizard. For more information on the Sync Wizard, reference the Beta 1 Blog Post.

3-11-2014 9-06-55 PM.png

3-11-2014 9-10-04 PM.png

3-11-2014 9-10-38 PM.png

3-11-2014 9-11-09 PM.png

3-11-2014 9-15-03 PM.png

Once the wizard is done syncing your environment, the integration is set up and ready to go. You now have access to all of the additional baseline goodness I mentioned above. The first place to head is back to the Settings page. On the main Settings page, you'll see a new sub-heading - Manage Virtual Devices. This was formerly the "Virtualization Polling Settings" menu option for VIM, but we've now extended it for setting Thresholds on your Virtual Devices, hence the name change.

3-11-2014 9-31-11 PM.png

Once you get to the Manage Virtual Devices page, select the Thresholds tab. This will reveal a dropdown menu where you can select virtual object types - VC's, Clusters, Hosts, VM's, Datastores - to view in the selection box below. You can also search here to further refine your selection. Given our example use case, I'm going to select Hosts here to view all the Virtual Hosts in my environment. This will show both VMware and Hyper-V hosts that are enabled by the integration (i.e. visible to VMAN and Orion).

3-11-2014 9-36-03 PM.png

Now that I've filtered the view to see all the Virtual Hosts in my environment, I can multi-select the Hosts on which I want to set a Threshold. This is useful if I want to set Static Thresholds on some subset of my hosts and use automatic Dynamic Thresholds for others. With multi-select I can do either quickly. Once I've selected all the Hosts I need, I click Edit Thresholds.

3-11-2014 9-53-50 PM.png

Now we're into the heart of the matter. The "Edit Properties" screen presents me with several key pieces of information:

  • The Hosts I'm editing properties for
  • The properties that are available to edit. For those of you playing along at home, you'll see the 3 baseline metrics I listed for hosts in the table at the beginning of this section - CPU Load, Memory Usage, and Network Utilization.

 

Our example use case involves setting CPU Load on these Hosts, so I'll select the CPU Load checkbox. This will reveal the current settings for that metric. In order to override the "Global Orion Threshold" with a custom value or baseline, select the checkbox next to "Override Global Orion Threshold or Set Dynamic Threshold."

 

As you can see in the screenshot to the right, the current CPU Load thresholds are static thresholds:

  • Greater than 80% for Warning
  • Greater than 90% for Critical.

 

We want to use Dynamic Baselines for this metric. All I have to do in this case is select the Use Dynamic Baseline Threshold button and it will automatically set these Hosts to use Dynamic Thresholds for this metric.

 

Voila! We won't show an explicit value for these baseline thresholds, because they will be different for each node. If you want to see the baseline history (statistical data over time), you'll need to edit a single node at a time.

3-11-2014 10-01-33 PM.png

3-11-2014 10-10-24 PM.png
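To make the "different for each node" point concrete, here's a rough Python sketch of how a baseline threshold could be derived from a host's own history. The mean-plus-standard-deviations rule, the `baseline_thresholds` function, and the sample data are all assumptions for illustration; this post doesn't describe how Orion actually calculates its baselines.

```python
from statistics import mean, pstdev


def baseline_thresholds(samples, warn_sigmas=2.0, crit_sigmas=3.0):
    """Derive warning/critical thresholds from one host's own CPU Load history.

    ASSUMPTION: "mean + k * standard deviation" is used purely for
    illustration; it is not taken from the product.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples to compute a baseline")
    mu = mean(samples)
    sigma = pstdev(samples)
    # CPU Load is a percentage, so cap the thresholds at 100.
    warning = min(mu + warn_sigmas * sigma, 100.0)
    critical = min(mu + crit_sigmas * sigma, 100.0)
    return warning, critical


# Hypothetical CPU Load history (percent) for two very different hosts.
busy_host = [82, 85, 84, 86, 83, 85, 84, 83]
quiet_host = [12, 15, 11, 14, 13, 12, 16, 11]

for name, history in (("busy_host", busy_host), ("quiet_host", quiet_host)):
    warning, critical = baseline_thresholds(history)
    print(f"{name}: warning above {warning:.1f}%, critical above {critical:.1f}%")
```

With a static 80% Warning, the busy host would sit in a warning state permanently and the quiet host could triple its load without anyone noticing. A per-host baseline puts the line just above what is normal for each one.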

So let's go check out one of the ESX Hosts we set the threshold on. If you look at the Host Details page, you can see that the Resource Utilization graph is showing Yellow and Red bars for the warning and critical thresholds we've now set. Oh BTW - that Resource Utilization sparkline chart? Also new in VMAN 6.1 for all vNodes!

3-11-2014 10-18-18 PM.png

 

Advanced Alerting - Now with a Virtual Twist!

 

OK, so now that I've set my dynamic threshold, how do I actually get alerted? Well, with VMAN 6.1, you can now alert on the VMAN data presented in the integration. So I can use Orion's Advanced Alert Manager to set alerts just as I would any other Orion object. We've even included a subset of the standard VMAN alerts out-of-the-box. These alerts include:

 

Cluster
  • Cluster CPU utilization
  • Cluster memory utilization
  • Cluster storage utilization
Host
  • Host CPU utilization
  • Host memory utilization
  • Host rebooted
VM
  • VM CPU Ready
  • VM CPU Load
  • High VM CPU Utilization
  • VM memory swap
  • VM memory ballooning
  • VM Memory Underallocated
  • High VM Memory Utilization
  • Guest storage space utilization
  • VM Rebooted
  • VM Disk Latency
Datastore/Cluster Shared Volume
  • Datastore Low Free Space
  • Datastore Overallocation
  • Datastore High Latency

 

So continuing our example from above, let's dive into the Orion Advanced Alert Manager and set up an alert on our threshold:

 

What To Do | What You'll See

First, RDP to the Orion Server to access the Orion Advanced Alert Manager. Click on "Configure Alerts," which will launch the Alert Manager application.

3-20-2014 7-52-18 PM.png

In the picture to the right, I've highlighted one of the new out-of-the-box integrated alerts, "Host CPU Utilization." Let's take a look at what this alert's trigger conditions are.

3-20-2014 7-53-00 PM.png

You can see that this alert is set to fire whenever any Virtual Host's CPU Utilization is greater than or equal to 70%. This is a static trigger level. You'll also notice that the alert is set to trigger only after the condition has been sustained for 15 minutes. Default performance collection runs at 10 minute intervals, so this ensures that at least 2 polling intervals have occurred before the alert fires. If you set the sustain condition to less than 10 minutes, you will get noisy alerts that probably won't serve your intentions.

3-20-2014 7-53-49 PM.png

I'd like to take a second to point out that you can set up alerts not only on Virtual Hosts, but also on Clusters, VM's and Datastores. The alert-able metrics available for each can be found by selecting an object type, selecting the metric field, and then navigating through the hover menus.

3-20-2014 7-54-33 PM.png
3-20-2014 8-08-19 PM.png
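If the reasoning behind the 15 minute sustain condition seems abstract, the sketch below works through it in Python. It is only an illustration of the idea: `sustained_breach` and the sample data are hypothetical, and this is not how the Alert Manager is implemented internally.

```python
from datetime import datetime, timedelta


def sustained_breach(samples, threshold, sustain, now):
    """Return True if the metric has stayed at or above `threshold` for `sustain`.

    `samples` is a list of (timestamp, value) tuples, oldest first.
    Hypothetical helper for illustration only.
    """
    run_start = None
    # Walk backwards from the newest sample to find where the current
    # unbroken run of breaching samples began.
    for timestamp, value in reversed(samples):
        if value >= threshold:
            run_start = timestamp
        else:
            break
    return run_start is not None and now - run_start >= sustain


poll = timedelta(minutes=10)  # default collection interval from the post
t0 = datetime(2014, 3, 20, 12, 0)

# One hot poll followed by a normal one: a momentary spike.
spike = [(t0, 95.0), (t0 + poll, 40.0)]
# Two hot polls in a row: a real, sustained condition.
sustained = [(t0, 75.0), (t0 + poll, 78.0)]

fifteen = timedelta(minutes=15)
print(sustained_breach(spike, 70.0, fifteen, t0 + fifteen))
print(sustained_breach(sustained, 70.0, fifteen, t0 + fifteen))

# With a 5 minute sustain, a single hot poll is enough to fire.
five = timedelta(minutes=5)
print(sustained_breach([(t0, 95.0)], 70.0, five, t0 + five))
```

Because a sustain window longer than one polling interval can only be satisfied by two consecutive breaching samples, a single spike never fires the alert. A window shorter than the polling interval is satisfied by any single sample, which is where the noise comes from.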

Now, back to setting up our dynamic alert. It's easiest to just copy an existing alert if you're a beginner, so that's what I've done. I can then hit "Edit" and change the name of the alert. More interesting is the alert condition. I use the same "CPU Utilization" metric, but instead of "Greater than or Equal to" I set it to be a Threshold alert. I want to be alerted when either the Warning or the Critical Threshold is crossed, so I set the alert to trigger if either condition is reached. And that's pretty much it! Hit OK and off you go.

3-20-2014 7-55-58 PM.png3-20-2014 7-56-39 PM.png
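For anyone who thinks better in code, here's a small Python sketch of what a Threshold-type trigger means compared with the static "greater than or equal to 70%" rule. The `HostThresholds` class, the function names, and the threshold values are all made up for the example; they are not Orion objects or real baseline output.

```python
from dataclasses import dataclass


@dataclass
class HostThresholds:
    """Per-host thresholds (hypothetical values, e.g. from a baseline)."""
    warning: float
    critical: float


def threshold_state(cpu_load, thresholds):
    """Classify a reading against that host's own thresholds."""
    if cpu_load > thresholds.critical:
        return "critical"
    if cpu_load > thresholds.warning:
        return "warning"
    return "ok"


def should_alert(cpu_load, thresholds):
    """Trigger if either the Warning or the Critical threshold is crossed."""
    return threshold_state(cpu_load, thresholds) != "ok"


# Hypothetical thresholds for a host that always runs hot and one that idles.
busy_host = HostThresholds(warning=90.0, critical=95.0)
quiet_host = HostThresholds(warning=30.0, critical=45.0)

# The same 85% reading means very different things on each host.
print(threshold_state(85.0, busy_host), should_alert(85.0, busy_host))
print(threshold_state(85.0, quiet_host), should_alert(85.0, quiet_host))
```

The alert definition itself no longer carries a number. It just asks each host whether it has crossed its own line, which is what lets one alert cover hosts with wildly different normal loads.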

 

Web-Based Reporting - Leverage the Newest Orion Core Feature on Your Virtual Data

 

So now you've seen how to set a dynamic threshold and alert on it, all in Orion, all on data collected by VMAN. Pretty cool, huh? Well, the last piece is Orion Reporting. Orion has recently introduced Web-based Reporting, and I'm happy to say that the Web-based Reporting system can now also be used to report on the data presented in the VMAN integration. As a quick example, let me walk through a report I created to show all of the VM's in my environment with High CPU Load.

 

What To Do | What You'll See

First step - Go ahead and navigate to the Orion Reports view on the Home tab. Once you're there, click on Manage Reports, then "Create New Report."

3-20-2014 8-35-48 PM.png
3-20-2014 8-36-17 PM.png

My report is just going to be a simple table, so I'll choose "Custom Table" on the next screen.

3-20-2014 8-36-44 PM.png

I'm going to create a table that relies on a Dynamic Query. This way, as VM's get added and removed from my virtual infrastructure, my reports do not have to be updated to include/exclude them specifically. I can do that if the need arises, but not for this example.

 

I'm going to report on "Virtual Machines." However, I don't want a list of all VM's, because that's not really interesting. I'm only interested in VM's that have high CPU Load. Therefore, I'll add a conditional to the report to only report on VM's where CPU Load is greater than or equal to 50%. From here, you can name your datasource something meaningful, or just hit "Add to Layout."

3-20-2014 8-39-07 PM.png
I'll now add Columns to my table. Since this is a simple report, I'll just add the VM's "Name" and "CPU Load." Since I'm specifically trying to identify problems with CPU Load, I'll also have my report sort the results by CPU Load in Descending order. I don't need to add any filtering conditions here in the "Filter results" section, because the datasource I configured for the whole report is already doing that.

3-20-2014 8-41-42 PM.png

I can now generate a preview of my report to see the data. Note that none of these VM's are Orion "Nodes" - these are all VMAN objects - vNodes - presented in Orion. From here, I can set report properties, like adding it to Favorite Reports, or create report schedules. For our purposes, this is where we'll end, since this is not meant to be a full tutorial on the Orion Report Writer.

3-20-2014 8-42-24 PM.png
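Stripped of the UI, the report built above is a filter, a column selection, and a sort. Here's the same logic as a short Python sketch over made-up data. The VM names, values, and the `high_cpu_report` function are hypothetical, and this isn't the query Orion generates behind the scenes.

```python
def high_cpu_report(vms, min_cpu_load=50.0):
    """Return (name, cpu_load) rows for VMs at or above `min_cpu_load`,
    sorted by CPU Load in descending order.

    Mirrors the report steps: datasource condition, two columns, sort.
    """
    # Datasource condition: only VMs where CPU Load >= threshold.
    matching = [vm for vm in vms if vm["cpu_load"] >= min_cpu_load]
    # Sort: highest CPU Load first.
    matching.sort(key=lambda vm: vm["cpu_load"], reverse=True)
    # Columns: Name and CPU Load only.
    return [(vm["name"], vm["cpu_load"]) for vm in matching]


# Hypothetical inventory. Because the filter runs every time the report
# does, VMs added or removed later are picked up without editing the report.
vms = [
    {"name": "web-01", "cpu_load": 72.5, "host": "esx-01"},
    {"name": "db-01", "cpu_load": 91.0, "host": "esx-02"},
    {"name": "file-01", "cpu_load": 18.0, "host": "esx-01"},
    {"name": "app-02", "cpu_load": 50.0, "host": "esx-03"},
]

for name, cpu_load in high_cpu_report(vms):
    print(f"{name:10s} {cpu_load:5.1f}%")
```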

 

Account Limitations - Role-based Access Control, Orion Style

 

I'll wrap up by briefly discussing Account Limitations, also known as Role-based Access Control. This has been a longstanding request from VMAN customers, and we're very happy to be able to deliver it through the integration in this release by leveraging Orion's Account Limitations feature. This functionality has become more critical as folks use VMware and Hyper-V estates as Private Cloud infrastructure. User access can now be restricted at different levels of the virtual "hierarchy." For example, if I have a large environment with multiple vCenters, I can limit users to see only the virtual objects - hosts, VM's, clusters, datacenters - under a single vCenter. These limitations can now be applied in the following ways:

  • View everything under a Single or Group of:
    • vCenters
    • Datacenters
    • Clusters
    • Hosts
  • View a single or group of VM's
  • Datastores don't fit neatly in the above hierarchy, so they have their own control.
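One way to picture how a limitation on a container covers everything beneath it is the Python sketch below. The object names, the `PARENT` mapping, and the `can_view` function are invented for the example; it models the hierarchy described above and does not reflect how Orion stores or evaluates Account Limitations.

```python
# Hypothetical inventory: each object points at its parent in the
# vCenter > Datacenter > Cluster > Host > VM hierarchy.
PARENT = {
    "dc-east": "vcenter-1",
    "cluster-a": "dc-east",
    "host-01": "cluster-a",
    "vm-web": "host-01",
    "dc-west": "vcenter-2",
    "cluster-b": "dc-west",
    "host-02": "cluster-b",
    "vm-db": "host-02",
}


def self_and_ancestors(obj):
    """Yield the object itself, then each parent up to the vCenter."""
    while obj is not None:
        yield obj
        obj = PARENT.get(obj)


def can_view(obj, allowed):
    """A user may view an object if it, or any container above it,
    is in the set of objects the account is limited to."""
    return any(node in allowed for node in self_and_ancestors(obj))


# An account limited to a single vCenter sees everything under it.
print(can_view("vm-web", {"vcenter-1"}))
print(can_view("vm-db", {"vcenter-1"}))

# An account limited to a single VM sees only that VM.
print(can_view("vm-web", {"vm-web"}))
print(can_view("host-01", {"vm-web"}))
```

Datastores are left out of the sketch on purpose, since they sit outside this hierarchy and get their own control.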

 

Note that the embedded Flex (Flash) views from VMAN shown in the Orion interface (e.g. the Map view) are all or nothing, since the limitations are applied on the Orion side, not on the VMAN side. There's not much to show here, as Account Limitations just limit your view of the infrastructure, so they don't really make for interesting screenshots. Nonetheless, this should be a useful feature for our customers running these products in a private cloud deployment.

 

That's all folks! Don't forget to sign-up for the Beta and give us some feedback!