NPM - CPU Maxed Constantly - False Latency Alerts

I've opened a case with support (Case #940956), but while I'm waiting on a response about the diagnostic file I uploaded, I want to ask here. We're running version 11.5.2 of NPM on a virtual server. The database is on a separate, dedicated SQL server. We have 8 products in total and are using 8 CPUs running at 2.00 GHz with 24 GB of RAM.

On Feb 12th we began getting inundated with latency alerts from various nodes. Investigating showed that the nodes were not actually experiencing latency. We looked at the Orion server to determine the source and found that its CPU was completely pegged. Stopping all SolarWinds services brings it back to almost no utilization; starting them makes it jump right back to 100%. We contacted support and they had us reinstall CoreInstaller, JobEngine, Job Engine.v2, InformationService, and CollectorInstaller. Doing this fixes the problem for a couple of days, but then it comes back.

Our troubleshooting indicates that SolarWinds.JobEngineWorker.v2 is the specific service that causes the load when it starts. We tried increasing the number of CPUs available from 8 to 16, and this worked for a couple of days, but now it's back to the same issue even with double the recommended number.

http://www.solarwinds.com/netperfmon/solarwinds/wwhelp/wwhimpl/common/html/wwhelp.htm#context=SolarWinds&file=orionaghow…

The page above was helpful for understanding how NPM works.

Our problem seems similar to these other issues:

Re: 11.5.2 Upgrade Issues (SQL Connections\High CPU) 
Re: High CPU usage SWjobengineworker2.exe

This is the relevant info about how much stuff we're polling.

Polling Completion: 99.88
Elements: 5325
Network Node Elements: 1086
Volume Elements: 2148
Interface Elements: 2091
SAM Application Polling Rate: 5% of its maximum rate
Routing Polling Rate: 1% of its maximum rate
UnDP Polling Rate: 0% of its maximum rate
Polling Rate: 30% of its maximum rate
VIM.VMware.Polling: 10
IPAM.Dhcp.Polling: 0
SAM Windows Scheduled Tasks Polling Rate: 2% of its maximum rate
Hardware Health Polling Rate: 15% of its maximum rate
Fibre Channel Polling Rate: 0% of its maximum rate
Wireless Polling Rate: 0% of its maximum rate
Wireless Heat Map Polling Rate: 0% of its maximum rate
Total Job Weight: 1890
Number of HW Health Monitors: 478
Number of HW Health Sensors: 10757
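
As a quick sanity check on those figures, the three element categories should add up to the reported element total. The short Python sketch below only redoes that arithmetic with the numbers copied from the list above; the variable names are mine and it does not query Orion in any way.

```python
# Figures copied from the polling summary in this post.
reported_total = 5325
breakdown = {
    "Network Node Elements": 1086,
    "Volume Elements": 2148,
    "Interface Elements": 2091,
}

# Sum of the per-category counts, to compare against the reported total.
computed_total = sum(breakdown.values())

# Share of each category as a percentage of the reported total.
shares = {name: round(100 * count / reported_total, 1)
          for name, count in breakdown.items()}

print("computed total:", computed_total, "reported total:", reported_total)
for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

The categories do sum to the reported total, so the element count itself looks internally consistent.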

This is a capture of all the Job Engine workers. Is it normal to have this many Job Engine Workers?

pastedImage_0.png
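
For anyone who wants to track the number of workers over time instead of relying on a screenshot, here is a minimal Python sketch that counts processes by image name from the output of the Windows `tasklist /FO CSV /NH` command. The image name `SWJobEngineWorker2.exe` is an assumption based on the title of the thread linked above, so check it against Task Manager on your own server; the function names are hypothetical and this is not a SolarWinds tool.

```python
import csv
import io
import subprocess
from collections import Counter

# Assumption: worker image name, taken from the linked thread title.
WORKER_IMAGE = "SWJobEngineWorker2.exe"


def count_images(tasklist_csv: str) -> Counter:
    """Count processes per image name from `tasklist /FO CSV /NH` output."""
    counts = Counter()
    for row in csv.reader(io.StringIO(tasklist_csv)):
        if row:  # skip blank lines
            counts[row[0].lower()] += 1  # first column is the image name
    return counts


def worker_count(tasklist_csv: str, image: str = WORKER_IMAGE) -> int:
    """Return how many processes match the given image name (case-insensitive)."""
    return count_images(tasklist_csv)[image.lower()]


def live_worker_count(image: str = WORKER_IMAGE) -> int:
    """Run tasklist on the local Windows machine and count matching processes."""
    result = subprocess.run(
        ["tasklist", "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=True,
    )
    return worker_count(result.stdout, image)
```

On the Orion server, calling `live_worker_count()` from a scheduled task and logging the result with a timestamp would show whether the worker count climbs in the days before the CPU pegs again.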