13 Replies Latest reply: May 21, 2010 5:40 PM by Steven Klassen

Alert Suppression

cfry

I know that you can't build an interface alert and use node details in the trigger conditions, or vice versa. Can you have an interface alert and use node details in alert suppression?

I have interface alerts and node alerts that notify me when the network status is down.  The node alert notifies me when my router goes down, and the interface alert notifies me if the WAN interface goes down, which would also cause a network outage.  I just had a router go down and I got both alerts.  The interface is down because the node is down.  Can I put a suppression on the interface alert that says to suppress the alert if node status is up?  If not, any suggestions so I'm not getting two alerts for the same thing?  I wish SW would let you create an alert using both interface and node details. Is that on the roadmap?

Thanks,

Christina

 
  • Re: Alert Suppression
    cfry

    With the alert suppression, I mean "suppress the interface alert if the node status is down".

  • Re: Alert Suppression
    Steven Klassen

    When a node goes down, the interface goes into unknown, not down. You shouldn't have gotten both alerts. What happened to the device that caused it to become unreachable?

  • Re: Alert Suppression
    Tuchux

     Page errored out but still posted; see next post.

  • Re: Alert Suppression
    Tuchux

    You would get two alerts as SolarWinds would poll the node and see it down but would also poll the interface and see it down.

    If the alert is interface-specific, you could add an exclusion that says to suppress the alert if the node status is down. The only potential issue is that if you have an interface alert that monitors multiple interfaces on multiple nodes, then when the node you added to the suppression goes down, the alert will not trigger for any other interface either until that node comes back up. The suppression is all-encompassing, not specific to each interface trigger.

    Suppression is meant more for cases where you have a remote site with multiple server nodes, say, and if the router connecting that site goes down you do not want to receive the Server Down alerts. Your suppression on that alert would then be router status = down.

    It should be possible in the trigger to add the particular router and require that its node status is Up, but again it depends on the alert and how many different router interfaces it is monitoring. I would have to see the alert and understand a little more about what is being monitored to give a better answer.

     

    Hope this helps

    • Re: Alert Suppression
      Steven Klassen


      You would get two alerts as SolarWinds would poll the node and see it down but would also poll the interface and see it down.

       



      If the node is down, how is SolarWinds polling it to find out the interface is down? Again we're assuming that the interface that's going down is also the one causing the down condition for the device. If it is, you would only get the node down alert. If it isn't and the two are happening close together (e.g., interface gigabitethernet1/0 goes down and shortly afterward the entire router explodes) then two messages would make sense.

      I still think we need to hear more about what's causing the node to go down.

  • Re: Alert Suppression
    Steven Klassen

     



    I know that you can't build an interface alert and use node details in the trigger conditions, or vice versa. Can you have an interface alert and use node details in alert suppression?

     



    I'm confused by this part - if you create an interface alert, you can certainly use node details in the conditions. For example, if you wanted to watch all WAN interfaces, but only on your Cisco routers the trigger condition might look like:

    Interface > CustomProperties > WAN is equal to True

    Node Details > Vendor is equal to Cisco

    What kind of interface/node alert were you trying to create that you had problems with?
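
As a rough illustration of the trigger condition described above, here is a minimal Python sketch of an interface alert whose scope mixes an interface custom property with a node detail. The `Node` and `Interface` classes and the `matches_trigger` function are hypothetical names invented for this example; they only model the logic and are not part of any SolarWinds API.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a monitored node."""
    name: str
    vendor: str
    status: str = "Up"


@dataclass
class Interface:
    """Hypothetical stand-in for a monitored interface and its parent node."""
    name: str
    node: Node
    status: str = "Up"
    custom_properties: dict = field(default_factory=dict)


def matches_trigger(iface: Interface) -> bool:
    """Return True when both conditions hold (an 'all conditions' group)."""
    # Interface > CustomProperties > WAN is equal to True
    is_wan = iface.custom_properties.get("WAN") is True
    # Node Details > Vendor is equal to Cisco
    is_cisco = iface.node.vendor == "Cisco"
    return is_wan and is_cisco


cisco_router = Node(name="rtr-a", vendor="Cisco")
other_router = Node(name="rtr-b", vendor="Juniper")

wan_on_cisco = Interface("Serial0/0", cisco_router, custom_properties={"WAN": True})
lan_on_cisco = Interface("Gi0/1", cisco_router, custom_properties={"WAN": False})
wan_on_other = Interface("ge-0/0/0", other_router, custom_properties={"WAN": True})

selected = [i.name for i in (wan_on_cisco, lan_on_cisco, wan_on_other) if matches_trigger(i)]
```

Because the node is reached through the interface (`iface.node`), the node condition is always evaluated against the parent of the interface being tested, which is the behavior the thread later relies on.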

    • Re: Alert Suppression
      cfry

      Thank you for the responses, everyone.  Based on some of the answers I received, I think my original question needs to be clarified.  I monitor a complex network of 14 global regions and a total of 97 sites within those regions.  The hub of this network is here in Hercules.  I monitor the WAN for each of these sites: the primary MPLS router and the backup internet router.  When a router goes down due to an explosion, power outage, etc., I have an alert that triggers after 5 minutes of downtime (ruling out flapping).  Once I receive that alert, I open a case with our engineers to resolve the issue.  When the router has been restored and is stable for 10 minutes, I receive the reset trigger.
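
The delayed trigger and delayed reset described above can be sketched as a small state machine. This is only a model of the stated timings (5 minutes down before triggering, 10 minutes up before resetting); the `DelayedAlert` class and its `update` method are hypothetical names for this example and do not reflect how the alerting engine is actually implemented.

```python
from typing import Optional

TRIGGER_DELAY = 5 * 60   # seconds the node must stay Down before the alert fires
RESET_DELAY = 10 * 60    # seconds the node must stay Up before the alert resets


class DelayedAlert:
    """Hypothetical model of an alert with trigger and reset delays."""

    def __init__(self) -> None:
        self.active = False
        self.last_status: Optional[str] = None
        self.since: float = 0.0  # time at which the current status streak began

    def update(self, status: str, now: float) -> Optional[str]:
        """Feed one poll result; return 'trigger', 'reset', or None."""
        if status != self.last_status:
            # Status changed, so the streak (and the delay timer) restarts.
            self.last_status = status
            self.since = now
        held = now - self.since
        if not self.active and status == "Down" and held >= TRIGGER_DELAY:
            self.active = True
            return "trigger"
        if self.active and status == "Up" and held >= RESET_DELAY:
            self.active = False
            return "reset"
        return None


alert = DelayedAlert()
polls = [("Up", 0), ("Down", 60), ("Down", 200), ("Up", 220),
         ("Down", 300), ("Down", 600), ("Up", 700), ("Up", 1299), ("Up", 1300)]
events = [alert.update(status, t) for status, t in polls]
```

In this run the brief outage between 60 and 220 seconds is treated as a flap and never fires, while the outage starting at 300 seconds fires once it has lasted 5 minutes and resets only after 10 stable minutes.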

      I don't experience routers being hard-down as often as I experience the MPLS or the Internet link interface going down.  Because of that, I also needed to create an interface alert that would tell me if network performance to the site has been degraded: if it was the MPLS link that went down, service is now being handled by the backup internet router.  Yesterday I had a situation where the site lost power, and SolarWinds sent me an alert that the router was down and then another alert that the interface was down.  The interfaces were showing down, not unknown, in this case.

      So, the purpose for all this?  I'm using the Alarmpoint subscription service so that these remote site business partners can be notified of network status.  If they have network degradation or even total loss, this affects their bottom line, as they use the network to make money.  When issues occur with their network, since we manage it remotely, we want to inform them of the issue and that we're already taking care of it, but that if they have questions they can call a particular number.  I'm in testing mode of this service and it's working BEAUTIFULLY!  Except that I received two e-mails for the site that went down yesterday: one that said "the primary network connection in Chambley, France is down" and then another that said "the primary network connection in Chambley, France is down" right after!  This is because I'm alerting on both the router and the interface.

      The good news is that I think I may have fixed the issue using alert suppression on the interface alert that says to suppress the alert if the node is down.  I won't know until a site goes down again, though, and I really don't want to test that scenario in my production environment. Time to set up my test lab for this!

      • Re: Alert Suppression
        qle


        The good news is that I think I may have fixed the issue using alert suppression on the interface alert that says to suppress the alert if the node is down.  I won't know until a site goes down again, though, and I really don't want to test that scenario in my production environment. Time to set up my test lab for this!

         



        Be sure you have the alert suppression as part of the Trigger Condition tab and not as part of the Alert Suppression tab and it'll work great!

      • Re: Alert Suppression
        Steven Klassen


        I don't experience routers being hard-down as often as I experience the MPLS or the Internet link interface going down.  Because of that, I also needed to create an interface alert that would tell me if network performance to the site has been degraded: if it was the MPLS link that went down, service is now being handled by the backup internet router.  Yesterday I had a situation where the site lost power, and SolarWinds sent me an alert that the router was down and then another alert that the interface was down.  The interfaces were showing down, not unknown, in this case.

         



        I hate to sound like a broken record here, but the interfaces on a down node go into an 'unknown' state, not 'down'. I promise you this is due to some misconfiguration of the alert definition. You don't need suppression in this case. In fact, if you try to suppress it this way you might be missing important alerts.

        What do the trigger conditions for the node down and interface down alerts look like?

      • Re: Alert Suppression
        byrona

        If you could post screenshots of the alerts, mrxinu can probably point out the alerting logic that is causing the problems.

        • Re: Alert Suppression
          cfry

          NODE Alerts

           

          TRIGGER Alerts

           

           

          • Re: Alert Suppression
            qle

            As you have it, your interface alert will never trigger as long as any node is down. The condition in the Alert Suppression tab will be tested against all nodes in NPM, not just the one that triggered this alert. Remove it from the Alert Suppression tab and add it to the Trigger Condition tab and it'll work great.
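
The difference described above can be modeled with a short Python sketch. It assumes, as stated in this thread, that a condition on the Alert Suppression tab is tested against all nodes while a condition on the Trigger Condition tab is tested against the parent node of each interface. The classes and function names are hypothetical and only illustrate the logic; they are not NPM's implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for a monitored node."""
    name: str
    status: str


@dataclass
class Interface:
    """Hypothetical stand-in for a monitored interface and its parent node."""
    name: str
    node: Node
    status: str


def alerts_with_global_suppression(interfaces: List[Interface],
                                   all_nodes: List[Node]) -> List[str]:
    """Suppression tab model: any down node anywhere silences the whole alert."""
    if any(n.status == "Down" for n in all_nodes):
        return []
    return [i.name for i in interfaces if i.status == "Down"]


def alerts_with_trigger_condition(interfaces: List[Interface]) -> List[str]:
    """Trigger tab model: each interface is checked against its own parent node."""
    return [i.name for i in interfaces
            if i.status == "Down" and i.node.status == "Up"]


site_a = Node("site-a-router", "Down")   # whole router is down
site_b = Node("site-b-router", "Up")     # router is up, only its WAN link failed

interfaces = [
    Interface("site-a-wan", site_a, "Down"),
    Interface("site-b-wan", site_b, "Down"),
]

suppressed_result = alerts_with_global_suppression(interfaces, [site_a, site_b])
trigger_result = alerts_with_trigger_condition(interfaces)
```

With the suppression-style check, the outage at site A hides the genuine interface failure at site B. With the per-interface trigger condition, only the duplicate alert for site A is dropped.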

            • Re: Alert Suppression
              Steven Klassen


              As you have it, your interface alert will never trigger as long as any node is down. The condition in the Alert Suppression tab will be tested against all nodes in NPM, not just the one that triggered this alert. Remove it from the Alert Suppression tab and add it to the Trigger Condition tab and it'll work great.

               



              Agreed. This doesn't explain why you would see a node down along with down interfaces, but this would definitely do the trick either way.