Roughly 5-minute delay before hosts are marked "down" - normal?

Hi All, new to NPM, so bear with me if I've made a noob error.

My Cisco devices are monitored using SNMP (and, I presume, ICMP?). If I take down my test switch, it sometimes takes 4 to 6 minutes before NPM marks the node as "down". It does alert me to "packet loss" on the affected node very quickly, though.

I've changed the default polling interval from 120 to 60 seconds, with not much luck.

Am I being too impatient, or have I done something wrong?

Thanks in advance,

Ellis

  • Hi,

    On the Settings page you can select Polling Settings. At the bottom of that page you will find "Node Warning Level"; change that value to set how many seconds must pass before a node is considered down.

  • Hi - thanks for your answer.

    As a test I changed the "Node Polling Interval" to 30 seconds and changed the "Node Warning Level" to 10 seconds. It still took around 3 minutes for NPM to mark it as down.

    Am I still being too impatient?!

  • EllisD

    The other thing you can check is under Web Console Settings on the Admin page.  By default, the Orion page only refreshes every 5 minutes.

  • With those settings (assuming you are refreshing constantly, either manually or in the way kweise mentioned), it should take less than a minute for the node to appear as down. Maybe you should open a support case to investigate this issue.

    PS: If you changed the polling interval for all nodes and you have a lot of them, this might overload your database, which in turn might cause the delayed notification.

    By "appear" as down, do you mean the icon turning red instead of staying green, or do you mean an alert being sent out? You also need to adjust the alerts themselves if you want them to trigger sooner or later.

  • I was manually refreshing the web page. Think it's time for a support query to find out what's going on.

    I only changed the polling interval to 30 seconds as a test; it was still slow/sluggish when set to 1 minute, 2 minutes, etc.

    By "appear", I meant turn red :-(

    Thanks for your help.

    Ellis

  • Isn't there also a setting somewhere that requires 3 missed polls before it marks the node down?

  • That sounds vaguely familiar, but I can't find that setting anymore. However, there is a setting under Orion Polling Settings for Node Warning Level. It sets the number of seconds before Orion marks devices as down. I'm guessing the default is 120 seconds. It's at the bottom of the page under Calculations & Thresholds.

  • In the advanced alert for Node Down there are two settings you can tune: one controls how often the Alert Engine checks whether the alert criteria are true, and the other, as Njoylif mentioned, requires the node to be in an alert state for x minutes before the alert is triggered.

    So consider all of the different systems in play...

    1. Polling Interval
    2. How often alert is checked
    3. How long the alert must be tripped before an alert is generated

    If each of these is set to 5 minutes, then your node could be down for up to 15 minutes before an alert is tripped.
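
To make the arithmetic concrete, here is a minimal sketch of that worst case. It assumes the three delays simply add up, which is a simplification and not a description of how the Alert Engine is actually implemented. The function and parameter names are made up for illustration; they are not NPM settings or API calls.

```python
# Hypothetical helper: a simple additive model of the three delays listed above.
# The names are illustrative only and do not correspond to real NPM settings.
def worst_case_alert_delay(polling_interval_s, alert_check_interval_s, alert_hold_time_s):
    """Return an upper bound, in seconds, on the time from failure to alert."""
    return polling_interval_s + alert_check_interval_s + alert_hold_time_s


five_minutes = 5 * 60  # each setting at 5 minutes, as in the example above
delay_s = worst_case_alert_delay(five_minutes, five_minutes, five_minutes)
print(f"Worst case: {delay_s / 60:.0f} minutes")
```

Plugging in your own values for the three settings gives a rough ceiling to compare against the delay you are actually seeing.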

     

    Check these settings on your system; they may be the cause of your problem.

    Hope this helps!

  • Byron,

    You're right, but I don't think he's talking about an alert.  He's just talking about how long it takes for the Node's icon to change from green to red and be classified as down on the Orion web site.



  • That sounds vaguely familiar, but I can't find that setting anymore. However, there is a setting under Orion Polling Settings for Node Warning Level. It sets the number of seconds before Orion marks devices as down. I'm guessing the default is 120 seconds. It's at the bottom of the page under Calculations & Thresholds.



    So SW, does that mean you went to a FAST POLL if a poll is missed, or what is the scenario?
    Thanks