cancel
Showing results for 
Search instead for 
Did you mean: 
Create Post

Custom Report - VMware Host Disk Device Latencies

Description

This alert will calculate the average host disk device read and write latency metrics for last 10 minutes (based on frequency of alert) and send an email when the value is over 15% (warning) or 30% (critical).

Alert Definition

To create the alert, click on Alerts > Manage Alerts tab and create a Custom Alert of type Custom SQL Alert - Multiple Numeric Return. Configure the alert with values similar to these:

Alert Name: VMware Host Disk Device Latency

Execution Interval: 10 Minutes

Notification Text: The following VMware storage devices have high latencies

SQL Statement:

DECLARE @HostID varchar(5), @VMID varchar(5), @SQL varchar(max);

-- the Ignite metric data is stored in VM and HOST specific tables, and we need the VMID and HOSTID to get there

SELECT @VMID=vmid FROM cond WHERE id = #DBID#;

SELECT @HostID=current_host_id FROM conv_vm WHERE id=@VMID;

-- query the VM metric tables for the last 10 minutes (frequency above) and calculate an average VM Disk Usage Rate

SET @SQL = 'select ''Disk - '' + d.display_name + '' - Metric: '' + m.metric_key, AVG(md.v) latency_ms '+

          +'from conv_metrics m '+

          +'inner join conv_metric_detail_'+@HostID+' md ON md.metric_id = m.id '+

          +'inner join conv_device d on d.id = md.device_id '+

          +'WHERE d.device_type=''Disk'' '+

          +'AND m.metric_key like ''%latency%'' '+

          +'AND md.d >= DATEADD(MINUTE, -#FREQUENCY#, CURRENT_TIMESTAMP) '+

          +'GROUP BY d.display_name, m.metric_key'

EXEC (@SQL)

Execute Against: Repository

Units: ms latency

High Threshold: Min 30, Max empty

Medium Threshold: Min 15, Max 30

Labels (1)
Comments

I ran this on the DPA repository and got an error

Msg 208, Level 16, State 1, Line 4

Invalid object name 'cond'.

Were you logged in as the schema owner? Otherwise, you may have to fully qualify object names...

The main reason you might get this is a schema mismatch.

This line is the issue but, there maybe others. "SELECT @VMID=vmid FROM cond WHERE id = #DBID#;"

If your repository database is in SQL server DPA's default schema maybe set to ignte for all the table names. If you are running this report in something like SQL server management studio as a Sys_admin your default schema would be set to DBO.

You would need to edit to match your tables something like SELECT @VMID=vmid FROM ignite.cond WHERE id = 1 ;  Note: this Alert was made to be run in DPA and the #DBID# and #FREQUENCY# are key item that is replaced in the DPA alerts by a database ID of the registered instances and the frequency setting of the alter as defined.  The user for DPA's repository connection normally has it's default schema set to the ignite schema if that is what the tables are so the alert code should work as it stands in DPA but, would give the result you describe if it was run in SQL server Management Studio as another user.

You would need to perhaps add the ignite to all table names. and  edit the items for #FREQUENCY# and #DBID# out for this to work in SSMS.

-- the Ignite metric data is stored in VM and HOST specific tables, and we need the VMID and HOSTID to get there

SELECT @VMID=vmid FROM ignite.cond WHERE id = 1;

SELECT @HostID=current_host_id FROM ignite.conv_vm WHERE id=@VMID;

-- query the VM metric tables for the last 10 minutes (frequency above) and calculate an average VM Disk Usage Rate

SET @SQL = 'select ''Disk - '' + d.display_name + '' - Metric: '' + m.metric_key, AVG(md.v) latency_ms '+

          +'from ignite.conv_metrics m '+

          +'inner join ignite.conv_metric_detail_'+@HostID+' md ON md.metric_id = m.id '+

          +'inner join ignite.conv_device d on d.id = md.device_id '+

          +'WHERE d.device_type=''Disk'' '+

          +'AND m.metric_key like ''%latency%'' '+

          +'AND md.d >= DATEADD(MINUTE, 10, CURRENT_TIMESTAMP) '+

          +'GROUP BY d.display_name, m.metric_key'

EXEC (@SQL)

Thanks all for the help

Does anyone know of any customer metrics for the guest only. Total host doesn't work for us as we would like to know information for the current VM having storage issues. Then we dont have to wait for the admins to get back to us.

Thanks in advance

Version history
Revision #:
1 of 1
Last update:
‎01-10-2014 12:35 PM
Updated by: