High CPU usage on Solarwinds server, Solarwinds Information Service V3 using 100% CPU, empty groups

Normally my SolarWinds Core Server CPU usage is about 40-60%. Last week, after I created about 40 empty groups, CPU usage stayed at 100% constantly. “SolarWinds Information Service V3” was using most of the CPU (always above 60%). After I removed the groups, the system went back to normal.

I found the cause but am not sure why it happens. Any ideas? Or is this a bug in SolarWinds?

Thanks for any help!

The environment is:

Core Server: VM, 2 cores, 16GB RAM, Win 2008 R2,

                    Running: Core 2012.2.2 + NPM 10.4.2 + SAM 5.2 SP1 + NCM 7.1.1. Polling engine for all network devices (900)

Additional Poller: VM, 1 core, 8GB RAM, Win 2008 R2. Polling engine for all servers (900)

Config Management Server: VM, 1 core, 8GB RAM, Win 2008 R2

Database Server: Dell R710, 2 x Intel Xeon L5520 Processor 2.26GHz, 24GB RAM, 6 x 450GB SAS 15k 3.5 HD,

                             Windows 2008 R2 + MS SQL 2008 SP1

                             RAID: 2 x HDDs in RAID1 with 2 partitions (OS on P1 and DB logs on P2) & 4 x HDDs in RAID10 with 2 partitions (DB on P1 and tempdb on P2)

1800 Nodes are monitored

8000 Interfaces are monitored

5500 Volumes are monitored

1000 Applications are monitored

220 groups

12 users accessing daily

  • We are working on some performance changes in the next version. Until then...

    Try changing:

    C:\Program Files (x86)\SolarWinds\Orion\Information Service\3.0\Plugins\SolarWinds.Data.Providers.Orion.Containers.v3.dll.config

        <add key="MaxThreadPoolWorkerThreads" value ="10"/>

    That is probably 100 by default.

    Restart the information service v3 after the change.

    let me know if it helps.
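For anyone who would rather script this edit than change the file by hand, here is a minimal sketch in Python. It assumes the .config file follows the conventional .NET layout, with `<add key=... value=.../>` entries under `<configuration><appSettings>`; the thread only shows the `<add>` line itself, so that layout is an assumption. The function name `set_app_setting` and the sample file content are hypothetical and only illustrate the idea.

```python
import xml.etree.ElementTree as ET


def set_app_setting(config_xml: str, key: str, value: str) -> str:
    """Return config_xml with the given appSettings key set to value.

    Assumes the conventional .NET layout: <configuration><appSettings>.
    """
    root = ET.fromstring(config_xml)
    app_settings = root.find("appSettings")
    if app_settings is None:
        # Create the section if the file does not have one yet.
        app_settings = ET.SubElement(root, "appSettings")

    for entry in app_settings.findall("add"):
        if entry.get("key") == key:
            entry.set("value", value)  # update the existing entry
            break
    else:
        # No entry with this key was found: append a new one.
        ET.SubElement(app_settings, "add", {"key": key, "value": value})

    return ET.tostring(root, encoding="unicode")


# Hypothetical file content, for illustration only.
sample = """<configuration>
  <appSettings>
    <add key="MaxThreadPoolWorkerThreads" value="100"/>
  </appSettings>
</configuration>"""

updated = set_app_setting(sample, "MaxThreadPoolWorkerThreads", "10")
print(updated)
```

Note that ElementTree's default parser drops XML comments, so keep a backup of the original file before writing anything back. The service still has to be restarted afterwards, as described above.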

  • Thank you Karlo.

    I believe the Groups cause the performance issue. Please see the attached screenshot.

    Before I added the 34 empty groups there were about 200 groups, whose members are servers grouped by their roles. Most of them are static groups, with a few (fewer than 20) dynamic groups. After I added the empty groups, CPU usage reached 100%. So I removed them and CPU usage went back to 40-60%. I then removed 5 dynamic groups and CPU usage dropped to 30-40%.

    Looks like Groups cause the performance issue, especially the empty groups & dynamic groups.

    As the CPU usage is low now, I have not applied the change you suggested yet. But I was planning to create another 200 dynamic groups based on node locations. I will add them gradually, watch the performance, and apply the change you suggested if necessary.

    Thanks again!

    CPU usage.jpg

    Message was edited by: Patrick Huang. I also reduced the group refresh rate from 60 seconds to 600 seconds.

  • I seem to have this issue after upgrading to 10.4.2; however, Poller 2 seems to be the only one whose CPU is bouncing against the 100% ceiling.

    I have a Primary Poller and Two Additional Pollers. I am running NPM, NTA, NCM, and VQNM.

  • The groups do cause higher CPU utilization in SWISv3. Dynamic groups are more taxing on SWIS and the database because they require more processing, but we are optimizing this in each release as we find places to improve.

    Also, if you haven't done so already...

    In this file "...\SolarWinds\Orion\Solarwinds.Orion.Core.Dependencies.dll.config"

    Please switch option ContainersEnableCache to true

    <add key="ContainersEnableCache" value="True"/>

    - You need to do that on each engine.

    Restart the Collector DataProcessor service after this is changed (no need to do this if the option is already true.)

    I believe that the above change I recommended for the containers thread count (MaxThreadPoolWorkerThreads) would help your performance even now.
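Since the ContainersEnableCache setting has to be applied on each engine, a quick check of which engines still need the change can save a round of restarts. The sketch below is one way to do that in Python. It assumes the .config file uses the conventional .NET `<configuration><appSettings>` layout, and it compares the value case-insensitively as a convenience of this script, not as a statement about how Orion parses it. The engine names and config contents are made up for illustration.

```python
import xml.etree.ElementTree as ET


def get_app_setting(config_xml, key):
    """Return the value of an appSettings key, or None if it is absent."""
    root = ET.fromstring(config_xml)
    for entry in root.findall("appSettings/add"):
        if entry.get("key") == key:
            return entry.get("value")
    return None


def cache_enabled(config_xml):
    """True if ContainersEnableCache is present and set to true."""
    value = get_app_setting(config_xml, "ContainersEnableCache")
    return value is not None and value.strip().lower() == "true"


# Hypothetical config contents keyed by made-up engine names.
TEMPLATE = (
    "<configuration><appSettings>"
    '<add key="ContainersEnableCache" value="{}"/>'
    "</appSettings></configuration>"
)
engines = {
    "core-server": TEMPLATE.format("True"),
    "additional-poller": TEMPLATE.format("False"),
}

# Engines where the option still has to be switched to true.
needs_change = [name for name, text in engines.items() if not cache_enabled(text)]
print(needs_change)
```

In practice the config text would be read from the file on each engine; only engines listed in `needs_change` would need the edit and the service restart.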

  • Hi M Gibson,

    Depending on what services are causing your CPU to be high, my above recommendations may help your CPU as well.  If it is Job Engine or Job Engine workers, then your CPU utilization will require some more investigation from support and possibly development.

    Thanks

  • Thank you again, Karlo. I have applied the changes you suggested on both pollers. I will let you know the results next week. Thanks, Patrick

  • Any progress here?  Just checking to see if the recommendations have helped your overall CPU utilization.

  • It seems there is no further improvement, as the CPU usage was already low (30-40%). Please see the attached screenshots.

    Another question: the additional poller's CPU usage is quite high. Is that normal? The additional poller is polling about 900 servers. Thanks

    attachments.zip

  • Disappointing that your main poller did not have a reduction in CPU.  What are your empty groups looking like?  Are they a dynamic query on a custom property?

    What processes are using up the CPU on your additional poller?  Collector DataProcessor?  Job Engine Worker?  If the Job Engine Worker, then view the Command Line column in the Task Manager and say which workers are causing the problem (NPM Interfaces, Core, SAM,...)

  • Thanks for the reply. The empty groups were truly empty, with no members at all. On the additional poller, it seems the SAM JobEngineWorker2 uses the most CPU, but not more than 50%. Please see the screenshots. Thanks

    additional-poller-status.jpg, additional-poller-performance.jpg, jobSchedulerV2-delay.jpg, job-engineWorker2-cpu-usage.jpg, job-engineV2-cpu-usage.jpg, collector-data-cpu-usage.jpg