0 Replies Latest reply on Sep 20, 2018 3:34 PM by rschroeder

    How does one create advanced dynamic alerts that report "confidence" about the state of a device or circuit by leveraging router latency, BGP neighbor loss, bandwidth utilization, etc.?

    rschroeder

      I'd like to learn how to create alerts that trigger when any remote router's latency is 20% or more above its average latency.

       

      It would be nice if the alert could be dynamic enough to include only nodes whose names contain certain strings (e.g., Node Name Contains "FRED" or "WILMA" or "PEBBLES").
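To make the condition concrete, here is a minimal Python sketch of the logic described above. It is not NPM alert syntax; the function names, the case-insensitive name match, and the way the average is supplied are all assumptions made for illustration.

```python
# Hypothetical helper names; this only illustrates the intended trigger logic.

NAME_FRAGMENTS = ("FRED", "WILMA", "PEBBLES")  # strings taken from the example above


def latency_breach(current_ms, average_ms, pct_over=20):
    """True when current latency is at least pct_over percent above average."""
    # Compare in scaled form to avoid floating-point rounding at the boundary.
    return current_ms * 100 >= average_ms * (100 + pct_over)


def name_matches(node_name, fragments=NAME_FRAGMENTS):
    """True when the node name contains any fragment (case-insensitive, by assumption)."""
    upper = node_name.upper()
    return any(fragment in upper for fragment in fragments)


def should_alert(node_name, current_ms, average_ms):
    """Alert only for matching nodes whose latency breaches the threshold."""
    return name_matches(node_name) and latency_breach(current_ms, average_ms)
```

For example, a node named "RTR-FRED-01" with an average of 50 ms would trigger at 60 ms or more, while a node named "RTR-BARNEY" would never trigger regardless of latency.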

       

      I'm not finding a custom Dynamic alert building solution, but I believe one is present.  Can you help me with the basic steps?


      If that was easy for you, let's get more advanced!  It would be excellent to discover more information and make it part of the alert to give it context and veracity.  Such an advanced alert should include:

      • BGP Neighbor information from the router on the other end of the circuit.  IF we have high latency, we might not have BGP Neighbor loss.  Including that information in the alert seems like it's possible.  Is there a way for NPM to query logs on the near router and see if it has an entry about a BGP neighbor loss on the circuit facing the remote router?  IF it sees that loss, then one type of alert should go out.  If it DOESN'T see that loss, then a different kind of alert should go out.
      • The alert should be acknowledged within ten minutes.  If it's not acknowledged, then it must be forwarded to a more senior team member, supervisor, or manager.  I know I've seen this functionality in older versions of NPM.  Where does one find this option in the 12.2 or higher line?
      • How can we tie this into our LANDesk Service Desk (Web Help Desk) app to automatically create a new ticket and assign it to the right person or team via an API?
        • I have a backdoor method that leverages an e-mail sent to our Help Desk app.
        • If the alert sent an e-mail there, a ticket would automatically be created.  But it wouldn't be escalated there automatically, and it's not new and improved like using an API.  I'd like to learn that and make it part of the solution!
      • How can we tie bandwidth utilization into this, too?  If a pipe is full, the router won't lose its BGP neighbor and traffic keeps flowing, but ICMP may experience significant loss.
      • The alert should generate a NetPath or Traceroute output when latency is high so we can discover intermediary hops with problems.  It would be awesome if a map were included in the alert that highlights hops with significant latency changes!
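The branching described in the list above can be sketched in plain Python. This is only an illustration of the decision flow, not anything NPM provides: the class, the function names, the alert labels, and the 90% saturation threshold are all assumptions; only the ten-minute acknowledgement window comes from the requirements above.

```python
from dataclasses import dataclass


@dataclass
class CircuitState:
    """Hypothetical snapshot of the signals the alert would combine."""
    latency_high: bool        # latency is 20% or more above average
    bgp_neighbor_lost: bool   # near router logged a BGP neighbor loss
    utilization_pct: float    # bandwidth utilization on the circuit


def classify(state, saturation_pct=90.0):
    """Pick which kind of alert to send; saturation_pct is an assumed threshold."""
    if not state.latency_high:
        return "no alert"
    if state.bgp_neighbor_lost:
        return "high latency with BGP neighbor loss"
    if state.utilization_pct >= saturation_pct:
        return "high latency on a saturated circuit"
    return "high latency only"


def needs_escalation(minutes_since_trigger, acknowledged, limit_minutes=10):
    """Escalate when the alert is still unacknowledged after the time limit."""
    return (not acknowledged) and minutes_since_trigger >= limit_minutes
```

Checking BGP loss before utilization reflects the idea that a lost neighbor is the stronger signal; a full pipe with the neighbor still up points to congestion instead.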

       

      I searched through Thwack for the topic, but never found anything quite this advanced.  And yet it doesn't seem quite so advanced that it's out of the realm of NPM's capabilities.

       

      It seems like NPM, VNQM, or something else in the SW product line should be able to do this.  Are there some folks with experience doing these things who can offer advice?  adatole, ding, serena, sqlrockstar either know how to do this, or they know who I can reach out to for guidance to get this done efficiently, I bet . . .