Custom SAM Template causes High Host CPU

I'm trying to roll out SAM monitoring to our infrastructure (version 6.4.0). There are a lot of really neat things that can be monitored out of the box, and there's also the flexibility to add whatever we need. I created an application template to monitor all the "gotchas" in our environment. However, when I apply it to VMs, host CPU usage spikes to 100%. I'm sure I just need to optimize the application monitors, but I need some assistance. We have 9 hosts, and if I add this application template to 20 VMs, most of the hosts immediately spike to 100%. Without the application template, the hosts sit between 50% and 80% CPU usage. I already tried switching some of the fetching methods from WMI to RPC, but I'm not noticing a difference.

I also understand this could be tricky, but I'm going to try my best to communicate what I have in the monitors. Any assistance is appreciated.

I've attached the script for the PowerShell monitors. It just reports free disk space, with the drive as an argument.

1. Directory Size Monitor
Path = \\${IP}\c$\Windows\Logs\CBS
Extension Filter = *
Include subdirectories? No
Convert returned value? Yes, XtoMega(${Statistic})
Warning > 1000
Critical > 2000

2. Windows PowerShell Monitor
Execution Mode = Local Host
Convert returned value? Yes, Truncate(XtoMega(${Statistic}),2)
Warning < 2000
Critical < 1000

3. Windows PowerShell Monitor
Execution Mode = Local Host
Convert returned value? Yes, Truncate(XtoMega(${Statistic}),2)
Warning < 2000
Critical < 1000

4. Windows Service Monitor
Fetching method = RPC
Service Name = <specified>

5. Windows Service Monitor
Fetching method = RPC
Service Name = <specified>

6. Process Monitor - Windows
Fetching method = WMI
Process Name = <specified>

7. Windows Event Log Monitor
Fetching method = RPC
Log Source = <specified>
Event ID = Find all IDs

8. Windows Event Log Monitor
Fetching method = RPC
Log Source = <specified>
Event ID = Find all IDs

9. Directory Size Monitor
Path = \\${IP}\c$\$Recycle.Bin
Extension Filter = *
Include subdirectories? Yes
Convert returned value? Yes, XtoMega(${Statistic})
Warning > 1000
Critical > 2000

10. Windows Service Monitor
Fetching method = RPC
Service Name = <specified>

11. Windows Service Monitor
Fetching method = RPC
Service Name = <specified>

12. Directory Size Monitor
Path = \\${IP}\c$\ProgramData\Microsoft\Windows\WER
Extension Filter = *
Include subdirectories? Yes
Convert returned value? Yes, XtoMega(${Statistic})
Warning > 1000
Critical > 2000
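
As a rough illustration of how the conversions and thresholds above combine, here is a minimal PowerShell sketch. It assumes that XtoMega divides a byte count by 1024 × 1024 and that Truncate(x, 2) keeps two decimal places; `Convert-ToMegabytes` and `Get-ThresholdState` are hypothetical helper names, not SAM functions.

```powershell
# Hypothetical helpers that mimic the template's conversion and thresholds.
# Assumption: XtoMega divides bytes by 1024*1024; Truncate(x, 2) keeps two decimals.
function Convert-ToMegabytes {
    param([double]$Bytes)
    $mb = $Bytes / 1MB                        # 1MB is 1048576 in PowerShell
    return [math]::Truncate($mb * 100) / 100  # drop everything past two decimals
}

function Get-ThresholdState {
    param(
        [double]$Value,
        [double]$Warning,
        [double]$Critical,
        [switch]$LowIsBad   # set for free-space monitors (Warning < ..., Critical < ...)
    )
    if ($LowIsBad) {
        if ($Value -lt $Critical) { return 'Critical' }
        if ($Value -lt $Warning)  { return 'Warning' }
    } else {
        if ($Value -gt $Critical) { return 'Critical' }
        if ($Value -gt $Warning)  { return 'Warning' }
    }
    return 'Up'
}

# Directory size monitor: Warning > 1000, Critical > 2000
$dirMb = Convert-ToMegabytes -Bytes 1610612736
Get-ThresholdState -Value $dirMb -Warning 1000 -Critical 2000

# Free-space monitor: Warning < 2000, Critical < 1000
$freeMb = Convert-ToMegabytes -Bytes 2500000000
Get-ThresholdState -Value $freeMb -Warning 2000 -Critical 1000 -LowIsBad
```

Note that the directory monitors alert when the value rises above the threshold, while the free-space monitors alert when it falls below, which is why the Warning and Critical numbers are reversed between the two.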

  • Hard to say without knowing all of the nuances in your environment, but a few definite tips: 

    • WMI (WinRM) is always more efficient than RPC for the fetching method. Use that for everything that you can.
    • How often are these components being polled? You could be polling some of these more frequently than necessary. For example, how often do you truly need to check the size of folders like CBS and Recycle Bin? Hourly would likely be more than enough.
    • These monitors are more than I would use, but still not unreasonable. If they are bringing your hosts to their knees, your next step should be to look at the PowerShell script monitors. Try running just one at a time to find out which one specifically is causing the most load on your hosts. That may highlight some script problems that could be optimized.
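
    One way to act on the last tip, sketched under assumptions: time each script on its own from the Orion server before re-enabling it in the template. `Measure-MonitorScript` is a hypothetical helper, and the script file name in the comment is a placeholder. Elapsed time is only a rough proxy for the load a script creates, so treat the numbers as a way to rank the scripts rather than as a measurement of host CPU.

    ```powershell
    # Hypothetical helper: average wall-clock time of a script block over a few runs.
    function Measure-MonitorScript {
        param(
            [scriptblock]$Body,
            [int]$Runs = 3
        )
        $totalMs = 0.0
        for ($i = 0; $i -lt $Runs; $i++) {
            # Measure-Command returns a TimeSpan; the script's own output is discarded.
            $totalMs += (Measure-Command { & $Body | Out-Null }).TotalMilliseconds
        }
        return [math]::Round($totalMs / $Runs, 1)
    }

    # Stand-in workload so the sketch runs anywhere.
    Measure-MonitorScript -Body { Start-Sleep -Milliseconds 50 } -Runs 2

    # With a real monitor script (placeholder name and argument):
    # Measure-MonitorScript -Body { .\Get-FreeDiskSpace.ps1 'C:' }
    ```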

    Good luck!

  • Thanks for the reply. I'll flip them back to WMI since I'm not noticing a difference anyways (thanks). The template is set to poll every 10 minutes. I'm not aware of any way to poll the monitors at a different interval.

    I attached the PowerShell script I'm using. We have two disks on each VM, and I use the same script to check each disk separately, passing the drive letter as the argument. It essentially runs "Get-WmiObject -Class Win32_LogicalDisk" against the node from the Orion server to determine the amount of free disk space available.
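
    The attached script is not reproduced in this thread. As a stand-in, here is a minimal sketch of what such a monitor script might look like, not the poster's actual code. It assumes Windows PowerShell (Get-WmiObject is not available in PowerShell 6 and later), that the drive letter arrives as the first script argument, and that SAM substitutes the `${IP}` macro with the node address before the script runs. The function names are hypothetical, and credential handling is left out. Filtering by DeviceID and requesting only FreeSpace keeps the WMI query small, which may help if the query itself turns out to be the expensive part.

    ```powershell
    # Hypothetical sketch of a free-disk-space monitor script.
    function Get-FreeBytes {
        param([string]$ComputerName, [string]$Drive)
        # Ask the target for one disk and one property instead of every logical disk.
        $query = @{
            Class        = 'Win32_LogicalDisk'
            ComputerName = $ComputerName
            Filter       = "DeviceID='$Drive'"
            Property     = 'FreeSpace'
            ErrorAction  = 'Stop'
        }
        $disk = Get-WmiObject @query
        if ($null -eq $disk) { throw "Drive $Drive not found on $ComputerName" }
        return [long]$disk.FreeSpace
    }

    function Format-SamOutput {
        param([long]$FreeBytes, [string]$Drive)
        # SAM script monitors read a Statistic line and a Message line from the output.
        "Statistic: $FreeBytes"
        "Message: $Drive has $FreeBytes bytes free"
    }

    # Runs only when a drive letter is supplied, e.g. "C:" as the script argument.
    if ($args.Count -gt 0) {
        $drive  = $args[0]
        $target = '${IP}'   # assumption: SAM replaces this macro with the node address
        try {
            $free = Get-FreeBytes -ComputerName $target -Drive $drive
            Format-SamOutput -FreeBytes $free -Drive $drive
            exit 0          # Up; the template thresholds decide Warning and Critical
        } catch {
            "Statistic: 0"
            "Message: $($_.Exception.Message)"
            exit 1          # Down
        }
    }
    ```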