7 Replies Latest reply on Apr 8, 2014 3:31 AM by fandresena

    bgpPeerState and Alerting (Need a second set of eyes)

    KMSigma

      Greetings Thwack Community Friends!

      It's been a while since I've written a post seeking assistance, because almost everything I've wanted to find, I've already found.  That said, I'm having issues with one of my Universal Device Pollers (UnDP) and the Advanced Alert bound to it.  I'll explain the situation as best I can and then attach the export of the UnDP and the Alert Definition in case that helps.

      Universal Device Poller Definition
      OID Polled (Numerical): 1.3.6.1.2.1.15.3.1.2
      OID Polled (Long): iso.org.dod.internet.mgmt.mib-2.bgp.bgpPeerTable.bgpPeerEntry.bgpPeerState
      Get Type: GetSubTree / Get Table
      Label Type: Same Table
      Label Detail: 1.3.6.1.2.1.15.3.1.7
      Label Detail "Name": iso.org.dod.internet.mgmt.mib-2.bgp.bgpPeerTable.bgpPeerEntry.bgpPeerRemoteAddr

      This returns (on some routers) two entries, one for each of two BGP neighbors. We currently have only one or two remote site routers set up this way, but it is the direction we are heading in the future.

      Right now, we're only interested in alerting on a BGP down (defined to us as any result that is not equal to 6 [established]) on routers which are part of our MPLS cloud.  All of our internal neighbors would start with a private IP scheme.  For the sake of discussion, let's say that the internal scheme is the 10.0.0.0/8 IP network.
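
A rough Python sketch of the rule described above, just to pin down the intended logic. It assumes each polled row is a (remote address, bgpPeerState) pair; the sample addresses and the `should_alert` helper are made up for illustration and are not part of Orion.

```python
import ipaddress

ESTABLISHED = 6  # bgpPeerState value meaning "established"
# Example internal range from the post; substitute the real scheme.
INTERNAL_NET = ipaddress.ip_network("10.0.0.0/8")


def should_alert(remote_addr: str, peer_state: int) -> bool:
    """Return True for a non-established peer outside the internal range."""
    is_internal = ipaddress.ip_address(remote_addr) in INTERNAL_NET
    return peer_state != ESTABLISHED and not is_internal


# Hypothetical poll results: remote address -> bgpPeerState.
rows = {"10.28.0.1": 3, "203.0.113.5": 6, "198.51.100.9": 1}
alerts = [addr for addr, state in rows.items() if should_alert(addr, state)]
print(alerts)
```

Only the external neighbor that is not established ends up in `alerts`; the internal 10.x neighbor is ignored even though its state is not 6.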

      My alert is defined like this:

      Trigger Condition
      Property Type: Custom Node Poller
       Trigger Alert when all of the following apply
       Poller Name is equal to bgpPeerState
       Numerical Status is not equal to 6

      Reset Condition
      When trigger actions are no longer true

      Alert Suppression
       Suppress Alert when all of the following apply
       Row ID starts with 10.

      I think this will work as intended, but I'd really appreciate any insight anyone can give.  Right now we're showing no alerts, and the only time we would see one is when one of our WAN providers has an issue.  Obviously, I'd rather not have to wait for that or cause it myself.

      Also, I'd love to be able to "fix" the variables that are available for the alerting text itself.  Right now it's pretty basic.  It's just:

      On Trigger Subject: [ALERT] BGP Down on ${Node.Caption} (${Node.IP_Address})
      On Trigger Body:  BGP Status on ${Node.Caption} (${Node.IP_Address}) is not-established for neighbor ${CustomPollerStatus.RowID}.

      On Reset Subject: [RESET] BGP Down on ${Node.Caption} (${Node.IP_Address})
      On Reset Body: BGP Status on ${Node.Caption} (${Node.IP_Address}) is established for neighbor ${CustomPollerStatus.RowID}.

      Lastly, I'd like to change the headers on the rows in the UnDP element on my node detail pages.  Having the remote IP of the peer would be great, but it's only really "useful" if that's what the header is called.  The same applies to bgpPeerState itself, which shows as the numerical value even though the UnDP "knows" what each number 1-6 means.

      Above you can see the private address of 10.28.x.x that is showing a state of "3" which should have the alert suppressed.  The second (216.149.x.x) is at 6 and will not trigger the alert.
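
For reference, the numeric states come from the BGP4-MIB definition of bgpPeerState. The Python sketch below only shows the number-to-name mapping one would want in the display or alert text; `describe_peer` and the sample address are hypothetical, and this is not how Orion renders the resource.

```python
# bgpPeerState enumeration as defined in the BGP4-MIB.
BGP_PEER_STATES = {
    1: "idle",
    2: "connect",
    3: "active",
    4: "opensent",
    5: "openconfirm",
    6: "established",
}


def describe_peer(remote_addr: str, state: int) -> str:
    """Format one poller row with the state name next to its number."""
    name = BGP_PEER_STATES.get(state, "unknown")
    return f"neighbor {remote_addr}: {name} ({state})"


# Made-up address standing in for the internal peer reporting state 3.
print(describe_peer("10.28.0.1", 3))
```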

      Summary

      OK - this has gone way far afield, but it's basically me trying to figure out how to work with multiple elements returned from a UnDP request.

      Any help would be greatly appreciated.

        • Re: bgpPeerState and Alerting (Need a second set of eyes)
          Andy McBride

          That suppression will trigger if any row ID anywhere in the entire database starts with 10. You should move the suppression into the trigger and reverse the logic.

          Unfortunately, "Row ID does not start with X" is not an option today. What you could do is create the alert with the "starts with" condition in the trigger, then go to the database and find the query in the dbo.AlertsDefinitions table. Take that query, flip the logic on the "starts with" part, and paste it into an advanced SQL alert. Once you have the advanced SQL alert, delete the one you made to generate the query. I will also mark this for a PM to consider adding "does not start with" as a feature.

          Note - there may be other things you will need to add to the query, but I'm not a SQL guru so I don't know what those might be.
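
To illustrate the difference described above, here is a small Python simulation. It assumes, as stated in this reply, that the suppression condition is checked against all rows rather than the row that triggered; the rows, node names, and function names are invented for the example and say nothing about Orion's actual internals or schema.

```python
# Hypothetical poller rows: (node, row_id, numeric_status).
rows = [
    ("wan-rtr-1", "10.28.0.1", 3),
    ("wan-rtr-1", "203.0.113.5", 6),
    ("wan-rtr-2", "198.51.100.9", 1),
]


def alerts_with_global_suppression(rows):
    """Suppression checked across all rows: one match silences everything."""
    if any(row_id.startswith("10.") for _, row_id, _ in rows):
        return []
    return [(node, row_id) for node, row_id, status in rows if status != 6]


def alerts_with_row_filter(rows):
    """Reversed logic inside the trigger: each row is judged on its own."""
    return [
        (node, row_id)
        for node, row_id, status in rows
        if status != 6 and not row_id.startswith("10.")
    ]


print(alerts_with_global_suppression(rows))
print(alerts_with_row_filter(rows))
```

With the global check, the down external neighbor on wan-rtr-2 never alerts because an unrelated 10.x row exists; with the per-row filter it does.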

          Andy

            • Re: bgpPeerState and Alerting (Need a second set of eyes)
              fandresena

              Hi,

              I want to monitor BGP peer state and show the address of every neighbor, but the neighbor address is not displayed. I use this:

              BGP Status on ${Node.Caption} (${Node.IP_Address}) is established for neighbor ${CustomPollerStatus.RowID}

               

              I see the node name and its IP address, but I do not see the neighbor's address. Can you help me, please?

              Thanks

                • Re: Re: bgpPeerState and Alerting (Need a second set of eyes)
                  KMSigma

                  As of NPM 10.6 (?), I moved this to a "Routing Neighbors" alert (much easier than the UnDP that I was using).

                  The trigger condition is on "Routing Neighbors"

                  The trigger conditions are:

                  • Display Name <> Established
                  • Protocol ID = 14 (BGP Only - it looks like 16 is EIGRP, but we don't care about that on our WAN Routers)
                  • Node Name contains "WAN" (We only want this watched on our WAN Routers)

                  The reset is the default (when triggers are no longer true).
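
The three trigger conditions above amount to a simple per-neighbor predicate. A minimal Python sketch, assuming a made-up `RoutingNeighbor` record whose field names only mirror the condition labels (they are not Orion's schema), and assuming a case-sensitive "contains":

```python
from dataclasses import dataclass

BGP_PROTOCOL_ID = 14  # protocol ID used for BGP in the conditions above


@dataclass
class RoutingNeighbor:
    node_name: str
    neighbor_ip: str
    protocol_id: int
    display_name: str  # e.g. "Established"


def triggers(n: RoutingNeighbor) -> bool:
    """All three conditions must hold for the alert to fire."""
    return (
        n.display_name != "Established"
        and n.protocol_id == BGP_PROTOCOL_ID
        and "WAN" in n.node_name
    )


# Hypothetical neighbors for illustration.
down_bgp = RoutingNeighbor("SITE1-WAN-RTR", "203.0.113.5", 14, "Idle")
up_bgp = RoutingNeighbor("SITE1-WAN-RTR", "198.51.100.9", 14, "Established")
down_eigrp = RoutingNeighbor("SITE1-WAN-RTR", "10.28.0.1", 16, "Down")
print([triggers(n) for n in (down_bgp, up_bgp, down_eigrp)])
```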

                  I have two trigger actions:

                  NetPerfMon Event Log: BGP Routing on ${Node.Caption} (${Node.IP_Address}) lost with ${NeighborIP}

                  Send Email (as HTML):

                  Subject: [ALERT] Routing Neighbor Down for ${Node.Caption} (${Node.IP_Address})

                  Body (it's all in one block because it has to be for the HTML to format correctly):

                  <html><head><title>Orion Alert: ${AlertName} for ${Caption}</title><style type="text/css">/*<![CDATA[*/.o{BACKGROUND-COLOR: #e0e0e0;} .11{FONT-FAMILY: Verdana; font-size: 7pt; COLOR:#000000; text-align: LEFT; vertical-align:TOP}.12{FONT-FAMILY: Verdana; font-size: 7pt; COLOR:#000000; text-align: LEFT; vertical-align:CENTER}.13{FONT-FAMILY: Verdana; font-size: 7pt; COLOR:#000000; text-align: LEFT; vertical-align:BOTTOM}.21{FONT-FAMILY: Verdana; font-size: 7pt; COLOR:#000000; text-align: CENTER; vertical-align:TOP}.22{FONT-FAMILY: Verdana; font-size: 7pt; COLOR:#000000; text-align: CENTER; vertical-align:CENTER}.23{FONT-FAMILY: Verdana; font-size: 7pt; COLOR:#000000; text-align: CENTER; vertical-align:BOTTOM}.31{FONT-FAMILY: Verdana; font-size: 7pt; COLOR:#000000; text-align: RIGHT; vertical-align:TOP}.32{FONT-FAMILY: Verdana; font-size: 7pt; COLOR:#000000; text-align: RIGHT; vertical-align:CENTER}.33{FONT-FAMILY: Verdana; font-size: 7pt; COLOR:#000000; text-align: RIGHT; vertical-align:BOTTOM}.h1{FONT-FAMILY: Verdana; font-size: 12pt; COLOR:navy; text-align: LEFT; font-weight: BOLD; vertical-align:CENTER}.h2{FONT-FAMILY: Verdana; font-size: 10pt; COLOR:gray; text-align: LEFT; font-weight: BOLD; vertical-align:CENTER}.h3{FONT-FAMILY: Verdana; font-size: 7pt; COLOR:gray; text-align: LEFT; font-weight: BOLD; vertical-align:CENTER}.h4{FONT-FAMILY: Verdana; font-size: 7pt; COLOR:gray; text-align: LEFT; vertical-align:CENTER}.chl{FONT-FAMILY: Verdana; font-size: 7pt; font-weight: BOLD; COLOR:white; text-align: LEFT;}.chc{FONT-FAMILY: Verdana; font-size: 7pt; font-weight: BOLD; COLOR:white; text-align: CENTER;}.chr{FONT-FAMILY: Verdana; font-size: 7pt; font-weight: BOLD; COLOR:white; text-align: RIGHT;}.toc{text-align: LEFT;}/*]]>*/</style></head><body><table cellspacing="0" cellpadding="1" width="792" border="0"><tr><td><table cellspacing="0" cellpadding="0" width="792" border="0"><tr> <td class='h1'>${Node.Caption} (${Node.IP_Address}) has lost IP 
Routing with Neighbor at ${NeighborIP}</td></tr><tr> <td class='h2'>This alert has been triggered because BGP Status on ${Node.Caption} (${Node.IP_Address}) has been lost with Neighbor (${NeighborIP}).</td></tr><tr><td> </td></tr></table><table cellspacing="0" cellpadding="1" width="792" border="1" bordercolor="#003366"><tr><td><table cellspacing="0" cellpadding="3" width="100%" border="0"><tr><td align="left" valign="middle" bgcolor="#003366" class='chl'>Node</td><td width="16" align="left" valign="middle" class='12'><img src="http://ORIONSERVER/Orion/images/StatusIcons/${Node.StatusLED}" alt="${Node.StatusDescription}" width="16" height="16"></td><td align="left" valign="middle" class='12'><a href="http://ORIONSERVER/Orion/NetPerfMon/NodeDetails.aspx?NetObject=N:${Node.NodeID}">${Node.Caption} / ${Node.IP_Address}</a></td></tr><tr> <td align="left" valign="middle" bgcolor="#003366" class='chl'>Neighbor IP</td><td width="16" align="left" valign="middle" class='12'><img src="http://ORIONSERVER/NetPerfMon/images/blank.gif" alt="${StatusDescription}" width="1" height="1"></td><td align="left" valign="middle" class='12'><a href="http://ORIONSERVER/Orion/DetachResource.aspx?ResourceID=1802&NetObject=N:${Node.NodeID}">${NeighborIP}</a></td></tr><tr> <td align="left" valign="middle" bgcolor="#003366" class='chl'>Current Status</td><td width="16" align="left" valign="middle" class='12'><img src="http://ORIONSERVER/NetPerfMon/images/blank.gif" width="1" height="1" alt="${InterfaceTypeDescription}"></td><td align="left" valign="middle" class='12'>${DisplayName} (${ProtocolStatus})</td></tr><tr><td align="left" valign="middle" bgcolor="#003366" class='chl'>DNS Name</td><td width="16" align="left" valign="middle" class='12'> </td><td align="left" valign="middle" class='12'>${Node.DNS}</td></tr></table></td></tr></table></td></tr></table></body></html>

                   

                  My reset actions are pretty much the same thing, but with different wording.