High CPU on Windows due to SolarWinds Job Engine and Collector services, resulting in degraded performance

Hi, we are seeing high CPU on our Windows servers, mostly at 100 percent, whenever we open the SolarWinds web console. The website becomes completely unresponsive if we try to run jobs, and SolarWinds performance has degraded significantly: we can't get graph data and can't even change SNMP credentials, because the CPU stays pinned at 100 percent whenever we do anything in SolarWinds.

The problem became visible after our NCM jobs were stuck on Post Processing.

Technical Support has not been of much help; they tried to figure out the problem but could not find out why the Job Engine and Collector services are suddenly spiking the CPU.

Our SolarWinds version is 2023.1, and our server is not under-provisioned from a deployment standpoint.

Whatever TAC did resulted in further degraded performance and other errors, such as licensing errors even though all our licenses are active. Since the troubleshooting, the website also takes much longer to display information than it did before.

Any help or suggestions are highly appreciated.

  • You might want to start using Orion Insights. It gives you background on the environment where your SolarWinds is running and can give you ideas on how to optimize the software.

    I also found this post.

    thwack.solarwinds.com/.../311588

  • An update on this: after the engineer did some more troubleshooting last Thursday, we were able to run the NCM jobs successfully and the CPU on the main Orion server seemed to have stabilized for now. Since then, however, we are seeing new kinds of errors on the SolarWinds portal, such as warnings that the license is expiring soon (although it is good until October 2023) and an error such as the one below. Any idea on this?

  • For any concerns or issues with licensing, reach out to Customer Service or your Maintenance/Renewal rep; they're the only ones with the access to figure out what a licensing warning is about.

    I added a screenshot of this post to the notes I sent on, per our direct message. 

  • This error indicates that not all of your SolarWinds servers are on the same version.

    https://support.solarwinds.com/SuccessCenter/s/article/How-to-Check-NPM-Version?language=en_US

    Go to Resolution 3. That's a sure way to check the versions across servers. Otherwise, follow the advice.

  • Thanks, @donrobert5. Given some of the other issues, I'm very skeptical of this one, but I have asked a couple of support managers to have another look at it.

  • I chatted with one of the support managers I'd been working with this AM. He's going to have someone reach out to you and take another look. I apologize for the delay and the issues you're having. Hopefully, we can get this resolved. 

  • Hi, thank you for your responses. Maybe I spoke too soon on the CPU part: today we are experiencing the same slowness, and the CPU stays high when accessing the SolarWinds console. I checked the SolarWinds version, and all polling engines appear to be on version 2023.1. When I upgraded this SolarWinds instance, a TAC engineer was assisting me because I was facing issues, and after the upgrade they verified that the version is the same on the main engine and the pollers.

  • I also checked the license details under Admin > details > License Details; all the modules are on the same version.
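
For anyone comparing versions by hand across a main engine and several pollers, the check described above can be sketched in a few lines of Python. This is only an illustration: the server names and the `find_version_mismatches` helper are hypothetical, and the version strings are assumed to have been copied manually from each server rather than fetched through any SolarWinds API.

```python
from collections import defaultdict


def find_version_mismatches(versions):
    """Group servers by reported version.

    Returns an empty dict when every server reports the same version,
    otherwise a dict mapping each version to the servers reporting it.
    """
    by_version = defaultdict(list)
    for server, version in versions.items():
        # Strip stray whitespace so "2023.1 " and "2023.1" compare equal.
        by_version[version.strip()].append(server)
    if len(by_version) <= 1:
        return {}
    return {version: sorted(servers) for version, servers in by_version.items()}


# Hypothetical server names; versions entered by hand.
reported = {
    "main-engine": "2023.1",
    "poller-01": "2023.1",
    "poller-02": "2023.1",
}

mismatches = find_version_mismatches(reported)
if mismatches:
    for version, servers in mismatches.items():
        print(f"{version}: {', '.join(servers)}")
else:
    print("All servers report the same version")
```

If the function returns anything other than an empty dict, the servers listed under the odd version are the ones to look at first.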

  • Hi guys, I created Orion Insights reports, and after going through them I found the following critical points:

    1. RAM used by w3wp.exe is 1043 MB. I had already changed the application pool's time interval setting earlier on TAC's recommendation, so I'm not sure why it is still consuming this much RAM.

    2. 2 Alert Actions Account for xx% of all Successful Actions

    3. 4 Failing Alert Actions Found in DB

    4. 1 Alert Action Accounts for xx% of All Alerting related events

    5. Can the busiest alerts or alert actions in the report affect CPU?

    6. I also saw tempdb Transaction Log File Size and Cost Threshold For Parallelism values lower than the recommended values.

    7. Contrary to before, I am now seeing the Administrative Service and Information Service consuming more CPU than the Job Engine and Collector services.

    We don't have a lot of concurrent users, and even when I open just two login pages, the CPU goes high and everything slows down in general.

    After the upgrade I am seeing Log Monitor events. Is this a new feature since the upgrade that will be logging events in our database?

    Also, should I open a separate ticket just for CPU?
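
To put numbers on which service is actually driving the CPU (point 7 above), one option is to take two snapshots of cumulative CPU seconds per process and compare them. The sketch below assumes the snapshots were collected by some other means, for example the CPU column of PowerShell's Get-Process, and typed in by hand. The `cpu_share` helper and the `service-a` process name are hypothetical placeholders, not SolarWinds names.

```python
def cpu_share(before, after, interval_seconds, logical_cpus):
    """Return each process's share of total CPU capacity over the interval.

    before/after map process name -> cumulative CPU seconds at each snapshot.
    The result maps process name -> percent of capacity, busiest first.
    """
    capacity = interval_seconds * logical_cpus
    if capacity <= 0:
        raise ValueError("interval_seconds and logical_cpus must be positive")
    shares = {}
    for name, end_seconds in after.items():
        used = end_seconds - before.get(name, 0.0)
        # A negative delta means the process restarted between snapshots; skip it.
        if used > 0:
            shares[name] = round(100.0 * used / capacity, 1)
    return dict(sorted(shares.items(), key=lambda item: item[1], reverse=True))


# Hypothetical snapshots taken 60 seconds apart on a 4-core server.
before = {"w3wp": 100.0, "service-a": 50.0}
after = {"w3wp": 160.0, "service-a": 170.0}

for name, percent in cpu_share(before, after, 60, 4).items():
    print(f"{name}: {percent}% of total CPU")
```

Running this while reproducing the slowness (for example, while opening the login page) and again while idle gives a simple before/after comparison to attach to a support ticket.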
