DPA has the following four categories of CPU monitors, shown here with their default settings. I have listed them in what I believe to be "outer" to "inner" order:
| Metric | Critical | Warning | Chart Type |
| --- | --- | --- | --- |
| Host CPU Usage | 80+ | 60-80 | Area |
| VM CPU Usage (MHz) | | | Area |
| VM CPU Usage | 80+ | 60-80 | Area |
| O/S CPU Utilization | 90+ | 80-90 | Column |
| Instance CPU Utilization | 80+ | 70-80 | Column |
Why the difference in naming between Usage and Utilization? Why the different chart types?
Why don't the Critical and Warning levels seem to logically progress? I would expect that we would alert on progressively lower percentages as we progress from "outer" to "inner". The O/S has things other than the Instance that can be using CPU (and there can be more than one Instance). The VM has some overhead, so it would be reasonable to expect that value to always be slightly higher than the value for O/S. Yet that doesn't seem to be reflected in the default settings.
The Host value includes the CPU usage for all of its guest VMs as well as its total CPU capacity (some of which may not be assigned to any VM), so that may have no relationship at all to the other three metrics. I therefore listed it for completeness (and naming comparison) but don't feel it can truly be considered part of an "outer" to "inner" progression in the same way the others can.
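To make the expectation concrete, here is a minimal Python sketch that checks the default thresholds from the table for the "outer" to "inner" progression I described, leaving Host out for the reason given above. This is not anything DPA provides: the names `DEFAULTS` and `progression_breaks` are made up for illustration, and the "inner should not be higher than outer" rule is my own assumption about what a logical progression would look like.

```python
# Default thresholds copied from the table in this post.
# Each entry: (metric, critical lower bound, warning lower bound), ordered outer -> inner.
# Host is omitted because it may have no relationship to the other three.
DEFAULTS = [
    ("VM CPU Usage", 80, 60),
    ("O/S CPU Utilization", 90, 80),
    ("Instance CPU Utilization", 80, 70),
]


def progression_breaks(monitors):
    """Return (outer, inner, level) for each adjacent pair where the inner
    monitor's threshold is higher than the outer monitor's threshold."""
    breaks = []
    for (outer, o_crit, o_warn), (inner, i_crit, i_warn) in zip(monitors, monitors[1:]):
        if i_crit > o_crit:
            breaks.append((outer, inner, "critical"))
        if i_warn > o_warn:
            breaks.append((outer, inner, "warning"))
    return breaks


for outer, inner, level in progression_breaks(DEFAULTS):
    print(f"{level}: {inner} threshold is higher than {outer}")
```

With the defaults above, the only breaks are between VM CPU Usage and O/S CPU Utilization, at both the Critical and Warning levels. The step from O/S to Instance does go down as I would expect.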
The display order of the charts also seems strange to me. It does not show the progression I described, and it groups neither the VM charts nor the Signal Waits charts together (Signal Wait time is off the image but would appear at the bottom left if the image were taller):
There is no feature that would let us rearrange the order of the charts either globally or for a particular instance (although there is a feature request for that: ).
There must be a reason for these various design decisions, but the pattern doesn't seem obvious to me. Does anyone have any thoughts on all this?