Solarwinds NPM and SAM performance issue

Hi, hoping someone on here can help me since I'm not getting any responses from support on this issue. I have unlimited versions of NPM & SAM, which are currently monitoring 660 nodes, 1800 volumes, and 7600 interfaces. I have NPM and SAM running on the following box:

VMWare 5.1

O/S 2012 R2

16GB RAM

10 virtual CPUs running at 3.47 GHz.

Performance is as follows:

CPU - 10%

memory - 6GB

NIC - high of 10-20 Mbps

My database is on the following:

VMWare 5.1

O/S 2012 R2

SQL 2012

16GB RAM

1 virtual CPU running at 3.47 GHz

Solarwinds is the only database on this server.

Performance is as follows:

CPU - 30%

RAM - 14GB (DBAs have SQL set up to take 12GB)

NIC - high of 550 kbps

NICs are set to Gigabit.

I've also set polling of nodes, interfaces, and volumes to 240 seconds; they were at 60 seconds beforehand. The problem I'm having is that SolarWinds is not able to keep up with the amount of data coming in. My DB team uses Ignite to monitor our databases, and I've had them keep an eye on the SolarWinds DB performance. Over the last week there don't appear to be any issues with DB performance, but my last database update hovers around the 500-minute mark until I reboot the SolarWinds app server. Does anybody have any ideas how I can keep this under 1 minute at all times, including when a switch might go out?
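For a rough sense of what the interval change does to polling load, the element counts in the post can be turned into polls per second. This is a back-of-the-envelope sketch only: it assumes every element is polled exactly once per interval, it ignores SAM application monitors and statistics collection, and `polls_per_second` is a made-up helper, not something Orion exposes.

```python
def polls_per_second(elements: int, interval_s: int) -> float:
    """Average polls per second if each element is polled once per interval."""
    return elements / interval_s


# Nodes + volumes + interfaces, as quoted in the post.
ELEMENTS = 660 + 1800 + 7600

rate_60 = polls_per_second(ELEMENTS, 60)    # previous interval
rate_240 = polls_per_second(ELEMENTS, 240)  # current interval

print(f"{ELEMENTS} elements: {rate_60:.1f}/s at 60 s, {rate_240:.1f}/s at 240 s")
```

Under those assumptions, going from 60 to 240 seconds cuts the average polling load to a quarter.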

  • The last DB update time may be related more to a time difference between the SQL server and the Orion server. That itself does look a little odd, since it's saying that nothing new has been written to the database in 500 minutes. That is definitely a problem, and not at all related to not being able to keep up. If Orion were unable to keep up, it would still be collecting and writing to the database, but things would likely not be polled as frequently as you would like or have configured. The result would be a marginal delay in how often things get polled, but the Orion server would still be polling *something* constantly, trying to keep up.

    I would argue that you are probably right on the cusp of needing an Additional Poller, but I don't think that's related to your database update delay. You can determine how close you are to needing an Additional Poller by looking at [Settings - Polling Engines] and seeing what the following percentages look like. The closer you are to 100% (or over), the more likely you are to need an Additional Poller.

    SAM Application Polling Rate: 11% of its maximum rate.
    Hardware Health Polling Rate: 0% of its maximum rate.
    Polling Rate: 4% of its maximum rate.
    VIM Hyper-V Polling Rate: 1% of its maximum rate.
    SAM Windows Scheduled Tasks Polling Rate: 0% of its maximum rate.
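The "closer to 100%" rule of thumb in the reply above can be sketched as a small check over the percentages shown on the Polling Engines page. The figures are the ones quoted in that reply; the 85% warning level and the `engines_near_limit` helper are assumptions chosen for illustration, not values taken from the thread or from Orion itself.

```python
# Hypothetical warning level; pick whatever margin you are comfortable with.
WARN_AT = 85


def engines_near_limit(rates: dict[str, int], warn_at: int = WARN_AT) -> list[str]:
    """Return the names of polling rates at or above the warning level."""
    return [name for name, pct in rates.items() if pct >= warn_at]


# Percentages of maximum rate, copied from the reply above.
rates = {
    "SAM Application Polling Rate": 11,
    "Hardware Health Polling Rate": 0,
    "Polling Rate": 4,
    "VIM Hyper-V Polling Rate": 1,
    "SAM Windows Scheduled Tasks Polling Rate": 0,
}

flagged = engines_near_limit(rates)
print(flagged or "no polling rate is near its maximum")
```

An empty result only says the engine has headroom; it says nothing about why the last database update time is stale.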
  • Also, I don't see the need for 10 vCPUs on the NPM/SAM server. The DB server could definitely do with a few more vCPUs; 1 is too few. And you will definitely need to check the Polling Engines settings page to see where the polling throughput currently stands.

  • Hey Guys,

    Thank you for the responses. When I originally bought SolarWinds it was just for a small group of servers, so I had the SQL server running on the same box as the app. As we added more, I added more CPU, hence the 10 vCPUs. I'll be cutting that back one of these days. I also added another vCPU to the SQL server to see if that helps any. Below are my current polling rates.

    SAM Application Polling Rate: 51% of its maximum rate.
    Hardware Health Polling Rate: 6% of its maximum rate.
    Polling Rate: 40% of its maximum rate.
    Routing Polling Rate: 1% of its maximum rate.
    UnDP Polling Rate: 0% of its maximum rate.
  • Based on those numbers everything looks good. What does your polling completion rate look like on that same page?

  • Well, I came back from a long weekend and my SolarWinds server was 1226 minutes behind! Below is all the polling info. I'm checking with my DBA team to see if they see any issues on the DB server.

    Last Database Update    1226 minutes 52 seconds ago

    Polling Engine on XXXXXXXXXX

    Engine Status     Status  Polling Engine Active

    Type of Polling Engine    Primary

    Polling Engine Version   SolarWinds Orion Core Services 2014.2

    Installer language selection         English

    Operating system regional setting            English

    IP Address          XXX.XXX.XXX.XXX

    Last Database Sync          23 seconds ago

    Polling Completion          100

    Elements             10342

    Network Node Elements              674

    Volume Elements            1808

    Interface Elements         7860

    SAM Application Polling Rate      52% of its maximum rate.

    Hardware Health Polling Rate     6% of its maximum rate.

    Polling Rate        41% of its maximum rate.

    Routing Polling Rate        1% of its maximum rate.

    UnDP Polling Rate            0% of its maximum rate.

    1. VIM.VMware.Polling      10

    Total Job Weight              3033

    Number of HW Health Monitors               222

    Number of HW Health Sensors 4481
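To make the gap in these figures concrete, here is a minimal sketch that converts the "ago" strings above into seconds and compares them with the one-minute target from the original post. It also checks that the three element counts add up to the reported total. `age_seconds` and `TARGET_S` are illustrative names, and the parser only handles the minute/second wording seen in this thread.

```python
import re


def age_seconds(text: str) -> int:
    """Convert strings like '1226 minutes 52 seconds ago' to seconds."""
    units = {"minute": 60, "second": 1}
    total = 0
    for value, unit in re.findall(r"(\d+)\s+(minute|second)s?", text):
        total += int(value) * units[unit]
    return total


TARGET_S = 60  # the "under 1 minute" goal from the original post

last_update = age_seconds("1226 minutes 52 seconds ago")  # Last Database Update
last_sync = age_seconds("23 seconds ago")                 # Last Database Sync

print("update stale:", last_update > TARGET_S)
print("sync stale:", last_sync > TARGET_S)

# Network nodes + volumes + interfaces should equal the Elements figure.
elements_match = (674 + 1808 + 7860) == 10342
print("element counts add up:", elements_match)
```

By this measure the sync age is within the target while the update age is not, and polling completion is still reported as 100.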

  • I would recommend opening a case with support so we can dig through your log files to determine what's going on. Unfortunately this is a particularly difficult problem to troubleshoot properly via a forum like Thwack.