15 Replies Latest reply on Sep 6, 2012 4:27 PM by Shahram

    Interface up down notification

    Shahram

      Does anyone know how to create an alert notification for when an interface goes up or down? I'm only interested in getting notifications from the interfaces I'm monitoring. For example, I have a switch with 200 ports, but I only want to be alerted on the up/down status of its two uplinks, not on all the user ports. I have about a hundred of these switches/routers.

        • Re: Interface up down notification
          Richard Nicholson

          The way I would do this is to create a Custom Property under Interfaces with the Custom Property Editor on the SolarWinds server. Call it Uplink and use the True/False values to flag your uplinks.

           

          Then use the Advanced Alert Manager to create an alert using the Custom Property you created.

           

          Using Interface as the type of property to monitor, create the following alert:

           

          Trigger Alert when all of the following apply

               Uplink is equal to True

               Interface Status is equal to Down

           

          You can structure the alert how you need or add conditions if you need to filter out devices another way.
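The trigger logic Richard describes can be sketched in a few lines of Python. This is only an illustration of the "all of the following apply" condition, not how the Advanced Alert Manager evaluates it internally; the `Interface` class, the sample node and port names, and the `should_alert` helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Interface:
    node: str
    name: str
    status: str   # e.g. "Up", "Down", "Warning", "Unknown"
    uplink: bool  # mirrors the True/False "Uplink" custom property


def should_alert(iface: Interface) -> bool:
    """Trigger only when ALL conditions apply: Uplink is True AND status is Down."""
    return iface.uplink and iface.status == "Down"


# Hypothetical sample data: one user port and two uplinks on the same switch.
interfaces = [
    Interface("sw-01", "Gi1/0/1", "Down", uplink=False),  # user port, ignored
    Interface("sw-01", "Te1/1/1", "Down", uplink=True),   # uplink that is down
    Interface("sw-01", "Te1/1/2", "Up", uplink=True),     # uplink that is up
]

alerts = [f"{i.node} {i.name}" for i in interfaces if should_alert(i)]
print(alerts)
```

With this sample data, only the down uplink `Te1/1/1` ends up in `alerts`; the down user port is filtered out by the custom property.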

            • Re: Interface up down notification
              Shahram

              Thank you Rich. I might just unmanage the other interfaces and not worry about the custom property, but that is a neat trick.

               

              Playing around with these alerts, I discovered that an interface can have several states other than up and down. Do you happen to know what they mean? Especially "Warning" and "Unknown"?

                • Re: Interface up down notification
                  Richard Nicholson

                  Warning is when the interface is going down and hasn't responded to ICMP for a set amount of time. It's now in a fast poll state where the polling engine is sending ICMP more rapidly to verify the down status.  If it does reply during the fast poll it goes back to an Up status.  If the interface never responds during the fast poll it's assumed down and moved to that status.  You can change the polling and fast poll rates on the polling engines under settings. Unknown is a status that Orion uses if it can't poll the interface for the initial poll after being set up.

                   

                  I also don't monitor any ports other than my uplinks. The rest are almost always user ports, and I don't need statistics from them. Plus, the added job weight on our pollers adds up very quickly: we have about 30 stacks of 48-port switches, with 4 to 5 switches per stack, totaling over 120 switches.

                   

                  If you like the statistics, you can also set the non-uplink ports as "Unpluggable" from the Edit Properties option of the node interface you want to set. This is a great option for continuing to collect stats while ignoring any alerts you built against interfaces. Because you have marked the interface as one that can be unplugged, it can be offline, up, down, warning, or anything else, and it won't trigger an alert.

                    • Re: Interface up down notification
                      Shahram

                      I'm not sure the Warning state has anything to do with ICMP. I have layer 2 interfaces in Warning and Unknown states, and when I go into the details for these interfaces, the poll stats show as intermittent.

                        • Re: Interface up down notification
                          Richard Nicholson

                          Well, I didn't explain it very well. Interfaces aren't technically polled with ICMP, or at least I can't see how they would be, since they aren't Layer 3 devices and don't adhere to any Layer 3 routing on the port. My statement should have said nodes or Layer 3 devices with IP addresses for the example I gave. I would imagine interfaces are polled with SNMP in some form or fashion, but the status should still mean the same thing: Warning = missing/dropping polls, Unknown = has never responded to a poll.

                           

                          I'm not sure how a device would ever be Unknown once it has been polled. Since its status was known at one point, anything other than up would be down.
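As a rough sketch of the status meanings Richard gives here (Warning = missing polls, Unknown = never responded), the classification could look like the Python below. It is one reading of his description, not Orion's actual polling logic; the `classify` function and the `down_after` threshold are assumptions made for illustration.

```python
from typing import Sequence


def classify(poll_history: Sequence[bool], down_after: int = 3) -> str:
    """Classify an interface from its poll history (oldest first, True = answered).

    down_after is a hypothetical number of consecutive missed polls
    after which the interface is treated as Down.
    """
    if not any(poll_history):
        # Never answered a single poll (this also covers an empty history).
        return "Unknown"

    # Count consecutive missed polls at the end of the history.
    misses = 0
    for answered in reversed(poll_history):
        if answered:
            break
        misses += 1

    if misses == 0:
        return "Up"
    if misses >= down_after:
        return "Down"
    return "Warning"  # has answered before, but is currently dropping polls


print(classify([True, True, False]))
```

Under these assumptions, an interface that has answered at least once can only move between Up, Warning, and Down, which matches Richard's point that Unknown should only appear when no poll has ever succeeded.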

                      • Re: Interface up down notification
                        Sohail Bhamani

                        There are global interface thresholds you can set under Settings, NPM Thresholds.  There is one for errors/discards, one for interface utilization, and another.  If your interface has crossed one of these thresholds, it could go to Warning or Critical based on the threshold.
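The threshold idea Sohail describes can be illustrated with a minimal Python sketch. The function name, the status labels returned, and the 75/90 percent utilization values are all assumptions chosen for the example; they are not NPM's defaults.

```python
def threshold_status(value: float, warning: float, critical: float) -> str:
    """Map a measured value onto a status using a warning and a critical threshold."""
    if value >= critical:
        return "Critical"
    if value >= warning:
        return "Warning"
    return "Up"


# Hypothetical utilization thresholds, in percent.
UTIL_WARNING = 75.0
UTIL_CRITICAL = 90.0

for utilization in (40.0, 80.0, 95.0):
    print(utilization, threshold_status(utilization, UTIL_WARNING, UTIL_CRITICAL))
```

The critical check comes first so that a value above both thresholds is reported as Critical rather than Warning.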

                         

                        Sohail

                        http://www.loop1systems.com

                    • Re: Interface up down notification
                      Richard Nicholson

                      Right. What I am saying, though, is that the interfaces in Warning have received and responded to a poll at some point, but are having trouble responding consistently.  I haven't seen a node/interface stay in Warning when it fails to respond after a polling period, and I have only seen the Unknown status in my environment when I have never been able to poll the device I added to NPM or SAM.

                       

                      Run Wireshark on the Orion server and sniff your polling port.  Filter by IP address of the node and see what you are sending and getting back.

                       

                      To me it almost sounds like your poller is getting overloaded and just can't keep up with the polling anymore, so it's dropping polls and responses. Otherwise, somewhere on your network between the switch and SolarWinds you have major congestion, and the SNMP/ICMP polls are being de-prioritized and dropped at a router or a Layer 3 switch.

                      • Re: Interface up down notification
                        Richard Nicholson

                        Remember, the status identifiers indicate whether nodes/interfaces are responding, not responding, or having trouble responding.  A "Warning" or "Unknown" status reflects nothing other than the polling status of a node/interface.  The statuses don't report the health or hardware error state of a node/interface beyond showing Up (it responds) or Down (no response after x amount of time).  They are simply an indication of the Up/Down status, or of an in-between state caused by packet loss or an inability to poll.

                         

                        From what I was reading, it seems you are or were thinking the Warning and Unknown statuses report something other than the availability of the node or interface and its ability to respond to an ICMP or SNMP packet.

                         

                        When a node/interface is responding and up, you can get error and buffer-drop statistics from SNMP. These can show an issue on the node/interface itself, or one coming from the line connected to it, but reporting these errors will not take a node/interface out of Up (green) status unless the node/interface itself stops responding to these requests for information.

                        • Re: Interface up down notification
                          donc1972

                          Hi.  What I do is put a series of keywords in the interface description.  If it is for an agent, we ignore it - we don't "do" port-level management for users.  However, in our descriptions for any uplinked switches or other managed devices, such as servers, we add /* UPLINK TO XYZ @ IP ADDRESS / INTERFACE */ and create a custom trigger to send us alerts / tickets when ports go down.  We tailor this to the situation, but this way the WAN team and Systems team get real-time alerts when things go bump in the datacenter.

                           

                          [Screenshot: Uplink Trigger.PNG]

                           

                          Then we create the alert which emails the team as well as cuts a ticket directly to our queue:

                           

                           

                          [Screenshot: Trigger Email.PNG]

                           

                          You can of course do this for your circuits as well - we use /* L3 MPLS link to xxxxxx */ (or any other carrier) and do the same keyword search for the word MPLS.  Keeps us plenty busy!
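The keyword approach donc1972 describes amounts to a pattern match on the interface description. A minimal Python sketch follows, assuming the keywords UPLINK and MPLS from the post; the `is_monitored` and `down_alerts` helpers, the whole-word and case-insensitive matching, and the sample interface names are illustrative choices, not the alert engine's own behavior.

```python
import re

# Keywords taken from the descriptions quoted in the post.
KEYWORDS = ("UPLINK", "MPLS")

# Match any keyword as a whole word, ignoring case (an assumption for this sketch).
PATTERN = re.compile(r"\b(" + "|".join(KEYWORDS) + r")\b", re.IGNORECASE)


def is_monitored(description: str) -> bool:
    """Return True if the interface description carries one of the keywords."""
    return PATTERN.search(description) is not None


def down_alerts(interfaces):
    """Return the names of down interfaces whose description matches a keyword."""
    return [
        name
        for name, description, status in interfaces
        if status == "Down" and is_monitored(description)
    ]


# Hypothetical sample data: (name, description, status).
sample = [
    ("Gi1/0/1", "/* UPLINK TO XYZ @ IP ADDRESS / INTERFACE */", "Down"),
    ("Gi1/0/2", "user port", "Down"),
    ("Se0/0", "/* L3 MPLS link to xxxxxx */", "Down"),
    ("Gi1/0/3", "/* UPLINK TO XYZ @ IP ADDRESS / INTERFACE */", "Up"),
]

print(down_alerts(sample))
```

Matching whole words keeps a description that merely contains the letters of a keyword inside another word from triggering an alert.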

                            • Re: Interface up down notification
                              Shahram

                              The one drawback to relying on SNMP polls is that you don't get notified until the next poll (two minutes by default). What would be the best way to utilize syslog or SNMP traps?

                                • Re: Interface up down notification
                                  Sohail Bhamani

                                  An interface-down trap from the device to Orion could be matched with a Trap Viewer alert using the "Change the Status of an Interface" action.  From what I can tell, the interface would need to be monitored in Orion, since it's that status which would be changed.  Once the status changes, your Interface Down alert should kick in.
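The trap-driven status change can be sketched in Python as follows. This is not the Trap Viewer's implementation: the `apply_trap` function, the in-memory `status_table`, and the sample address are hypothetical. The `linkDown`/`linkUp` notification OIDs and the `ifIndex` column OID are the standard SNMP values.

```python
LINK_DOWN = "1.3.6.1.6.3.1.1.5.3"         # standard linkDown notification OID
LINK_UP = "1.3.6.1.6.3.1.1.5.4"           # standard linkUp notification OID
IF_INDEX_PREFIX = "1.3.6.1.2.1.2.2.1.1."  # ifIndex column in the interfaces table


def apply_trap(status_table, source_ip, trap_oid, varbinds):
    """Update a monitored interface's status from a link trap.

    status_table maps (node IP, ifIndex) to a status string and stands in
    for the interfaces being monitored. Returns True if a status was changed.
    """
    new_status = {LINK_DOWN: "Down", LINK_UP: "Up"}.get(trap_oid)
    if new_status is None:
        return False  # not a link-status trap

    for oid, value in varbinds.items():
        if oid.startswith(IF_INDEX_PREFIX):
            key = (source_ip, int(value))
            if key in status_table:  # only monitored interfaces can change
                status_table[key] = new_status
                return True
    return False


# Hypothetical monitored uplink: ifIndex 49 on a switch at 192.0.2.1.
status_table = {("192.0.2.1", 49): "Up"}
changed = apply_trap(
    status_table, "192.0.2.1", LINK_DOWN, {IF_INDEX_PREFIX + "49": 49}
)
print(changed, status_table)
```

A trap for an interface that is not in `status_table` is ignored, which mirrors the point that the interface has to be monitored for its status to be changed.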

                                   

                                  Sohail

                                  http://www.loop1systems.com

                                    • Re: Interface up down notification
                                      Shahram

                                      Sounds like a good idea.

                                      So I guess what I'm going to do is make sure I have a unique description on the interfaces I'm concerned about and create my alert using the method donc1972 mentioned (although Richard's way is good also, having descriptions on interfaces is our standard practice) and set the Trap alert to change my interface status immediately. Oh, and I need to make sure my devices send the link-status trap.

                                       

                                      I've yet to find some time to troubleshoot the SNMP packet loss issue...

                                       

                                      Thanks everyone for your help!