14 Replies Latest reply on May 26, 2010 2:29 AM by islamm

    Detecting flapping devices

      Hello all :-)

       

      I have my advanced alerts set up to not alert unless a node has been down for 4 minutes, which works well to ward off spurious alerts caused by high WAN utilization, etc.  However, this can mask a flapping circuit and/or device from us if it flaps up and down within the 4 minute window.  Has anyone found a good way to detect this and alert on it?  I've seen several threads on the forum about it, but none that gave a good answer on how to set up the alerts.

       

      Thanks :)
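
      The masking effect described in this post can be illustrated with a small sketch. It assumes a hypothetical timeline of status samples taken every 60 seconds (the sample interval and data are made up for illustration, not taken from Orion) and compares a "down for at least 4 minutes" rule with a simple count of state transitions.

```python
from typing import List, Tuple

Sample = Tuple[int, bool]  # (seconds since start, node is up)


def longest_outage(samples: List[Sample]) -> int:
    """Longest continuous down period, in seconds, across time-ordered samples."""
    longest = 0
    down_since = None
    for ts, is_up in samples:
        if not is_up and down_since is None:
            down_since = ts
        elif is_up and down_since is not None:
            longest = max(longest, ts - down_since)
            down_since = None
    if down_since is not None:  # still down at the last sample
        longest = max(longest, samples[-1][0] - down_since)
    return longest


def count_transitions(samples: List[Sample]) -> int:
    """Number of up->down or down->up changes between consecutive samples."""
    states = [is_up for _, is_up in samples]
    return sum(1 for prev, cur in zip(states, states[1:]) if prev != cur)


# Hypothetical timeline: one sample per minute, alternating up/down for 20 minutes.
samples = [(t * 60, t % 2 == 0) for t in range(21)]
delay_s = 4 * 60  # the 4 minute alert delay from the post

delayed_alert_fires = longest_outage(samples) >= delay_s
transitions = count_transitions(samples)
print(delayed_alert_fires)  # False
print(transitions)          # 20
```

      In this made-up timeline the node never stays down for 4 minutes, so the delayed alert stays silent even though the state changed 20 times.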

        • Re: Detecting flapping devices
          tonyled

          Have you tried setting up traps or a syslog alert?

          I set up an alert via syslog to send me an email whenever this happens, and it works really well.

            • Re: Detecting flapping devices
              DirtySouth

              That would be really helpful. Would you mind posting your filter configuration?

                • Re: Detecting flapping devices

                  X2 - I'd like to see the config.  Are you basing the "flap" on getting X number of "Device/Interface up" traps  in a specified time period?

                    • Re: Detecting flapping devices
                      neilmborilla

                      You have to configure your network device to forward syslog messages or events to the NPM server.

                       

                      From syslog console you can configure the ALERTS.

                       

                      In my experience, NPM does not detect or trigger an alert when a device or interface flaps, especially over a WAN. That is what the syslog alerts are for.

                        • Re: Detecting flapping devices

                          Following on from above.

                          The "link flap" syslog message is not generated for certain end devices.  The switch log only reports up/down status.  Advanced Alert Manager has been configured to show up/down status but does not pick up the flapping.

                          I have also configured syslog alerts to notify me when there is a flapping port, based on the strings "changed to up" and "changed to down".   The problem is that it also reports when users are plugging/unplugging their laptops, etc.
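
                          The string match described above can be sketched as follows. This is a minimal illustration, not the Syslog Viewer filter itself: the regular expression, the `LinkEvent` type, and the sample messages are all assumptions made for the example, and the exact message wording varies by platform.

```python
import re
from typing import NamedTuple, Optional

# Accepts both "changed to up" (the string quoted in the post) and the
# "changed state to up" wording. The pattern is an assumption for this sketch.
STATE_CHANGE = re.compile(
    r"Interface\s+(?P<interface>\S+?),?\s+changed(?:\s+state)?\s+to\s+(?P<state>up|down)\b",
    re.IGNORECASE,
)


class LinkEvent(NamedTuple):
    interface: str
    state: str  # "up" or "down"


def parse_link_event(message: str) -> Optional[LinkEvent]:
    """Return the interface and new state if the message is an up/down change."""
    match = STATE_CHANGE.search(message)
    if match is None:
        return None
    return LinkEvent(match.group("interface"), match.group("state").lower())


# Illustrative messages only; not copied from a real device log.
samples = [
    "%LINK-3-UPDOWN: Interface FastEthernet0/5, changed state to down",
    "%LINK-3-UPDOWN: Interface FastEthernet0/5, changed state to up",
    "%SYS-5-CONFIG_I: Configured from console",
]
events = [parse_link_event(m) for m in samples]
print(events)
```

                          A user unplugging and replugging a laptop once produces the same pair of matches as one cycle of a flapping port, which is why the string match alone cannot tell the two apart.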

                          Has anyone found a way to get this reported correctly?

                            • Re: Detecting flapping devices
                              borgan

                              Try designating the interface as "unpluggable" in Node Management. That should keep it from being included in down interface alerts.

                                • Re: Detecting flapping devices

                                  Thanks for the reply.  I still think the solution you have suggested does not really answer the question of detecting a flapping port/interface.

                                  We would like to detect ANY flapping port, whether it is a:

                                  • Uplink
                                  • User interface
                                  • Serial Connection
                                  • Server Interface, etc.


                                  As I understand it from SW, the Orion polling engine cannot detect flapping, hence they suggested using SNMP traps or syslog to get the information.  But this presents the problem described above.  If I use either the SNMP traps "SNMPv2-MIB:linkUp" and "SNMPv2-MIB:linkDown" or the syslog strings "changed to up" and "changed to down", it will raise erroneous alerts when multiple users are connecting/disconnecting their machines simultaneously.

                                  The other problem is, some of the flapping interfaces do not get written to syslog as 'flap' but only reported as up/down.  So specifying the 'flap' does not really work for syslog alerts.

                                   

                                  If I designate an interface as "unpluggable", how would that work?

                      • Re: Detecting flapping devices
                        r0berth1

                        In Syslog Viewer I set up an alert that matches part of a syslog message containing *flapping* and set it to alert if it happens 10 times per hour. This has been working fine for me and sends me an email on all the flapping that goes on.

                          • Re: Detecting flapping devices

                            Yes, that's all good: if the Cisco switch/router reports it as 'flapping' in its logs, then syslog on SW will pick it up.

                            However, if the logs on the Cisco device do not report it as 'flapping' but only as 'up' and 'down' status changes (10-50 times a minute or more), then syslog does not really work.  If I were to use the method you suggested, say with a threshold of 20, then syslog would only report on the message that hits count 20.  Now, message 20 could be a user plugging in or unplugging, etc.  There appears to be no way of tying hosts/nodes to their own individual syslog counts in the SolarWinds syslog application.
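
                            The difference between one shared count and per-host counts can be sketched like this. It models the behavior as described in this post, not SolarWinds internals; the function names, the host names, and the data are hypothetical.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List, Tuple

Event = Tuple[float, str]  # (timestamp in seconds, source host)


def global_threshold_hits(events: List[Event], threshold: int) -> List[str]:
    """One shared counter: report whichever host sent the message that
    lands on each multiple of the threshold."""
    hits = []
    for count, (_, host) in enumerate(events, start=1):
        if count % threshold == 0:
            hits.append(host)
    return hits


def per_host_flappers(events: List[Event], threshold: int, window_s: float) -> List[str]:
    """Separate sliding window per host: report hosts with at least
    `threshold` messages inside `window_s` seconds."""
    recent: Dict[str, Deque[float]] = defaultdict(deque)
    flagged: List[str] = []
    for ts, host in sorted(events):
        window = recent[host]
        window.append(ts)
        # Drop timestamps that have fallen out of the window.
        while window and ts - window[0] > window_s:
            window.popleft()
        if len(window) >= threshold and host not in flagged:
            flagged.append(host)
    return flagged


# Hypothetical data: switch-a logs 19 'down' messages, then a user on
# switch-b unplugs a laptop once.
events = [(float(i), "switch-a") for i in range(19)] + [(19.0, "switch-b")]

shared = global_threshold_hits(events, threshold=20)
per_host = per_host_flappers(events, threshold=10, window_s=3600)
print(shared)    # ['switch-b']
print(per_host)  # ['switch-a']
```

                            With the shared counter, the 20th message happens to come from the switch with a single unplug event, while the per-host windows point at the switch that actually generated most of the messages.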

                            See the problem I was faced with?

                            However, I have managed to come up with a solution myself after much looking around and sleepless nights!

                            I have created a report with custom SQL which looks in the syslog table and groups the number of 'down' messages by where the message originated (i.e. hosts/nodes).  I then specified a threshold of more than 20 'down' messages within 60 minutes.  This report is then imported into the home page, which gets refreshed every 3 minutes. :)
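
                            A query of the kind described above might look roughly like the sketch below. It runs against an in-memory SQLite database so it can be tried anywhere; the table name, column names, timestamps, and host names are hypothetical, and the real Orion syslog table and its SQL dialect (date functions in particular) will differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema; the actual syslog table and columns may be named differently.
conn.execute("CREATE TABLE syslog (received_at TEXT, hostname TEXT, message TEXT)")

# Made-up rows: 25 'down' messages from one switch, 1 from another.
rows = [
    (f"2010-05-25 10:{i:02d}:00", "core-sw1", "Interface Gi0/1, changed state to down")
    for i in range(25)
]
rows.append(("2010-05-25 10:30:00", "access-sw7", "Interface Fa0/12, changed state to down"))
conn.executemany("INSERT INTO syslog VALUES (?, ?, ?)", rows)

# Count 'down' messages per originating host over the last 60 minutes and
# keep only hosts above the threshold.
QUERY = """
SELECT hostname, COUNT(*) AS down_count
FROM syslog
WHERE message LIKE '%down%'
  AND received_at >= datetime(:now, '-60 minutes')
GROUP BY hostname
HAVING COUNT(*) > :threshold
ORDER BY down_count DESC
"""

flapping = conn.execute(
    QUERY, {"now": "2010-05-25 10:59:00", "threshold": 20}
).fetchall()
print(flapping)  # [('core-sw1', 25)]
```

                            Grouping by the originating host is what keeps a single user unplugging a laptop on one switch from being counted toward another switch's total.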