10 Replies Latest reply on Sep 4, 2014 10:24 AM by RichardLetts

    Taking 5 minutes to be notified an AP is down

    sk3l3t0r

      I've monitored the controller and can see that when an AP goes down, the controller knows about it within 30 seconds. I've checked the polling time of the NPM alert for when an AP is down, which is set to the default of 1 minute. So why is it actually taking 5 minutes to be notified via email that it is down?

       

      I would assume that NPM is querying the controller about AP status, rather than actually polling the APs directly, so I'm not sure why it's taking so long to be notified. Are there any debug or similar logs I can monitor to see what's going on in real time?

        • Re: Taking 5 minutes to be notified an AP is down
          contactjt

          There is a fast ping that starts on the first failure. On my system it's 180 seconds. I think it works out to an average time of half the polling interval + 180 seconds; the alert condition is then checked every 1 minute, which could add some delay, and then there is any email delay.
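The arithmetic described above can be sketched as a small calculation. This is only a rough model, written in Python for convenience: it assumes the failure happens at a random point in the polling cycle, that the fast-ping window must elapse in full, and that the alert check adds between zero and one full check interval. The function name and the 120-second polling interval in the example are illustrative assumptions, not values taken from NPM.

```python
def estimate_alert_delay(poll_interval_s, fast_ping_window_s,
                         alert_check_interval_s, email_delay_s=0.0):
    """Return (best, average, worst) seconds from failure to notification.

    Assumed model (not from NPM documentation):
      - the device fails at a uniformly random point in the polling cycle,
        so the first failed poll comes 0..poll_interval_s later;
      - the fast-ping window then has to elapse in full;
      - the alert engine notices 0..alert_check_interval_s after that;
      - email delivery adds a fixed delay.
    """
    fixed = fast_ping_window_s + email_delay_s
    best = fixed
    average = poll_interval_s / 2 + alert_check_interval_s / 2 + fixed
    worst = poll_interval_s + alert_check_interval_s + fixed
    return best, average, worst


if __name__ == "__main__":
    # Hypothetical inputs: 120 s polling, 180 s fast ping, 60 s alert check.
    best, average, worst = estimate_alert_delay(120, 180, 60)
    print(f"best {best:.0f}s, average {average:.0f}s, worst {worst:.0f}s")
```

With those hypothetical inputs the average comes out at 270 seconds, so a delay of several minutes is plausible even when every individual setting looks short.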

          • Re: Taking 5 minutes to be notified an AP is down
            Vinay BY

            You would need to verify the same in two more sections. Did you check the Trigger Condition and Trigger Actions sections of your NPM alert?

            T1.JPG

            If it's not set under Trigger Condition, then please go ahead and check whether there is a delay on your email action under Trigger Actions:

            T2.JPG

            • Re: Taking 5 minutes to be notified an AP is down
              crzyr3d

              Check the timing in your events list to see when the device shows down and the time of the alert, if you haven't done that already. Also, on the alert's first screen, check how often the alert condition is evaluated to see if the device is down. You may already have looked at all this.

              • Re: Taking 5 minutes to be notified an AP is down
                rhidians

                If the trigger conditions are not delayed, the statistical polling is 60 seconds, and the fast polling is set to 0 seconds (aka Node warning level), then I would expect the system to alert you after the first 60 seconds, not 5 minutes. I'd be interested to find out what is causing the delay. If you don't get pointers from someone here, I'd get in touch with support to have a look.

                 

                If the access points send syslogs or traps, then I would suggest having a look at alerting on them.

                • Re: Taking 5 minutes to be notified an AP is down
                  sk3l3t0r

                  The APs do not directly support SNMP, so if I wanted to discover and poll them, I could only do this via ICMP/IP, and as we have several hundred, adding them manually isn't really practical. It is a shame that, as NPM knows about them from the controller, I simply can't import them as nodes and poll them accordingly.

                   

                  NPM only knows about the APs because the controllers are nodes polled via SNMP, so their status is only known through communicating with the controller, which, per the support response below, happens at the Statistics Collection Interval of the node.

                   

                   

                  Had this answer from SW support:

                  If the AP is added as its own Node into Orion then the Node Polling Interval (Default 120 seconds) will be used.

                  If however you're monitoring it through your Wireless LAN Controller then it uses the Statistics Collection Interval (Default 10 minutes).

                  To poll more frequently, shorten the Statistics Collection Interval: go to your Controller in Manage Nodes and Edit Properties on the Controller, then change Collect Statistics to a lower number.
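To put the two defaults quoted by support side by side, here is a minimal Python sketch. It assumes a failure is only noticed at the next poll and that failures occur at a uniformly random point between polls; the function name is made up for illustration and nothing here queries Orion.

```python
def detection_delay(interval_s):
    """Return (average, worst) seconds before the next poll notices a failure.

    Assumes the failure time is uniformly distributed between two polls,
    so the average wait is half the interval and the worst is the full one.
    """
    return interval_s / 2, float(interval_s)


# Defaults as quoted in the support answer above.
DEFAULTS = {
    "Node Polling Interval": 120,
    "Statistics Collection Interval": 600,
}

for name, interval in DEFAULTS.items():
    average, worst = detection_delay(interval)
    print(f"{name}: average {average:.0f}s, worst case {worst:.0f}s")
```

Under that assumption the 10-minute default averages out to roughly 5 minutes, which would be consistent with the delay in the original question, although the thread does not confirm that this is the whole explanation.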

                    • Re: Taking 5 minutes to be notified an AP is down
                      crzyr3d

                      That's actually how we do it, meaning we add the AP by ping as a node itself, and we have them set up with static ap's since they are all lightweight devices. I've found that if something happens to the WLC and it loses its connections, it only sees what's new and what it loses while it's up, not what was lost to begin with. I have alerting set up based on the ping and that works for us.

                    • Re: Taking 5 minutes to be notified an AP is down
                      Craig Norborg

                      The real question might be, if you're looking to be notified quickly of an AP going down, is polling the WLC the best way of doing this?

                       

                      Personally I'd go with an SNMP trap from the controller; I believe the "AP Register" one will let you know if an AP associates or disassociates.

                       

                      If you have good, overlapping wireless coverage I wouldn't think this would be that critical though...

                        • Re: Taking 5 minutes to be notified an AP is down
                          sk3l3t0r

                          Nope.. I would have to agree that it probably isn't!

                           

                          Given that NPM displays info about your thin APs, including their device name and IP, you would think that it would be fairly straightforward to add them as monitored nodes! If I run a discovery and it discovers an "unknown device", why not check that against any other devices that it might know about, even if discovered via an SNMP query of the controller, then put the two pieces of info together to resolve the "unknown device"?
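The matching suggested here amounts to a lookup by IP address. The Python sketch below shows the idea on made-up data; the field names, AP names, and addresses are hypothetical, and this is not something NPM exposes in this form.

```python
def resolve_unknown_devices(unknown_ips, controller_aps):
    """Map each discovered IP to the AP name the controller reports, or None.

    controller_aps is a list of {"name": ..., "ip": ...} dicts, standing in
    for whatever the controller returns over SNMP (hypothetical shape).
    """
    name_by_ip = {ap["ip"]: ap["name"] for ap in controller_aps}
    return {ip: name_by_ip.get(ip) for ip in unknown_ips}


# Hypothetical sample data.
controller_aps = [
    {"name": "ap-floor1", "ip": "10.0.0.11"},
    {"name": "ap-floor2", "ip": "10.0.0.12"},
]
unknown_ips = ["10.0.0.12", "10.0.0.99"]

resolved = resolve_unknown_devices(unknown_ips, controller_aps)
print(resolved)
```

Addresses that the controller does not know about stay unresolved (`None`), so they could still be listed as genuinely unknown devices.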

                        • Re: Taking 5 minutes to be notified an AP is down
                          RichardLetts

                          What type of AP/Controller? Does the controller generate traps indicating the tunnel between the AP and the controller has gone down?

                           

                          I use a script off the TrapReceiver to set the status of the AP automatically (saves increasing the polling frequency). This sets the AP whose IP address matches the command line argument passed in as unavailable and down. I have an equivalent script that flags it as up when it returns to service (via a trap). The periodic polling catches any missed traps.

                           

                          Option Explicit
                          ' Placeholder connection string: substitute your own server and credentials.
                          Const DB_CONNECT_STRING = "Provider=SQLOLEDB.1;Data Source=server;Initial Catalog=SolarWindsOrion;User ID='username';Password='password';"
                          
                          ' The trap action passes the AP's IP address as the first argument.
                          If WScript.Arguments.Count = 0 Then
                              WScript.Echo "Missing parameters"
                              WScript.Quit 1
                          End If
                          
                          Dim myConn, myCommand
                          Set myConn = CreateObject("ADODB.Connection")
                          Set myCommand = CreateObject("ADODB.Command")
                          myConn.Open DB_CONNECT_STRING
                          Set myCommand.ActiveConnection = myConn
                          
                          ' generate an update statement from the input
                          ' myCommand.CommandText = SQL update statement here
                          ' (Execute fails until CommandText is set above)
                          myCommand.Execute
                          myConn.Close
                          

                           

                          You can extend this idea to set custom properties on nodes or interfaces for other trap-directed alerting.

                           

                          aside: I get pretty circumspect about sharing scripts that directly modify the database since they tend to be Orion-version specific and could be dangerous if improperly used. So, I'll leave it to you to figure out the correct update statement to generate in the script if this is an approach you want to take.
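Since the script builds an update statement from a command-line argument, one general precaution is to bind the IP address as a query parameter rather than pasting it into the SQL text. The Python sketch below illustrates that pattern only. It runs against an in-memory SQLite database with a hypothetical `ap_status` table invented for this example; it is not the Orion schema, not the withheld update statement, and not the VBScript/ADO code from the post.

```python
import sqlite3


def set_ap_status(conn, ip_address, status):
    """Set the status of the AP with the given IP; return rows changed.

    The table and column names are hypothetical. The values are bound as
    parameters (the ? placeholders), so the input is never spliced into
    the SQL string itself.
    """
    cur = conn.execute(
        "UPDATE ap_status SET status = ? WHERE ip_address = ?",
        (status, ip_address),
    )
    conn.commit()
    return cur.rowcount


# Throwaway in-memory database standing in for a real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ap_status (ip_address TEXT PRIMARY KEY, status TEXT)")
conn.execute("INSERT INTO ap_status VALUES ('10.0.0.11', 'up')")

changed = set_ap_status(conn, "10.0.0.11", "down")
print(changed)
```

Checking the returned row count is a cheap way to notice a trap that refers to an IP address the table does not contain.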