This discussion has been locked. The information referenced herein may be inaccurate due to age, software updates, or external references.

Alert Suppressing

Hello thwack community, I would like to hear how you are using Orion to alert on site outages.
 
What I am trying to do is alert on a site being down without getting an alert for every node at that site, while maintaining the alert for any one device that may go down.
In short, I am trying to find a way to have 2 alerts:
            One if the full site is down (all nodes).
            Two if only a number of devices are down but the site is still up.
 
Is anyone doing something like this, or does anyone have advice on how I would attempt it?  I do have custom fields that group each site, but how do I tell SolarWinds to only alert when 100% of the nodes in group XYZ are down, and/or suppress the node alert if group XYZ is down?
Thanks for any help or advice.

  • Good luck with your request, friend!  I've not been able to find any way to have an advanced alert know the status of any node other than the one the alert has been raised for.

    In essence, you want Parent/Child relationships so that you only alert on the Parent node for a site, say the WAN router.  Unfortunately, Orion's alerting system doesn't support this kind of relationship.  It's probably my biggest wish list item.

    Maybe someone will prove me wrong and have an idea for you! I'll certainly be waiting to see!

  • I would like to see that as well...

  • Anyone from SolarWinds: is this possible, or is it on the roadmap?

    You can already display nodes in groups and then show that group as down (RED).  Can this function be moved into an alert?
  • Thanks, I had not found that post.

    But reading through it, I am still a little confused.

    First, it sounds like I would need a report for each of my sites.  This could be 100 or more alerts.

    Next, most of these sites have two routers, and I only want to be alerted when both routers are down (site down hard).  I have every device marked with a custom property that assigns it to its site (group).  I also have an alert right now that simply fires when any router or switch goes down.  But if a site has 25 switches, I don't want to get 25 emails.

    Too many custom properties make upkeep a pain, and they may not always be updated correctly.

    I have thought about setting up the HSRP for each site in SolarWinds and then alerting if any HSRP is down.  The alert would then list the custom fields and tell me which site it is.

    Going further, management is pressuring me to provide this type of information in report form.  I need to be able to run a report at the end of the month that lists the date, time, and duration of every complete site outage over the last month.  Ideally I could run both an alert and a report on the same rule: if 100% of the nodes in group X are down, then that site is down.
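
    The "100% of nodes in group X are down" rule for the monthly report can be sketched outside Orion. This is only a minimal illustration of the logic, assuming you can export node status changes as (timestamp, node, is_up) tuples; the function name, the node names, and the sample dates are all made up, and every node is assumed to be up before the first event.

```python
from datetime import datetime


def site_outages(events, site_nodes):
    """Return (start, end, duration) for each period when every node in the site was down.

    events: iterable of (timestamp, node_name, is_up) status changes.
    site_nodes: names of the nodes that belong to the site.
    An outage that is still open after the last event is not reported.
    """
    site_nodes = set(site_nodes)
    down = set()
    outages = []
    started = None
    for ts, node, is_up in sorted(events):
        if node not in site_nodes:
            continue
        if is_up:
            down.discard(node)
        else:
            down.add(node)
        all_down = down == site_nodes
        if all_down and started is None:
            started = ts  # last node just went down: site outage begins
        elif not all_down and started is not None:
            outages.append((started, ts, ts - started))  # first node came back
            started = None
    return outages


# Made-up sample data for a site with two routers.
events = [
    (datetime(2024, 1, 5, 10, 0), "rtr1", False),
    (datetime(2024, 1, 5, 10, 5), "rtr2", False),
    (datetime(2024, 1, 5, 10, 20), "rtr1", True),
    (datetime(2024, 1, 5, 10, 30), "rtr2", True),
]
outages = site_outages(events, {"rtr1", "rtr2"})
for start, end, duration in outages:
    print(start, end, duration)
```

    Here the site only counts as down from the moment the second router fails until the first one recovers, so a single router outage never shows up in the report.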
  • The solution suggested by Network Guru in that thread is excellent, but you've pointed out the exact reasons why I have not implemented it myself. SolarWinds has indicated that it is on the roadmap to include some kind of intelligence in NPM that will automatically determine dependencies. If my memory serves me correctly, a version with this capability will be delivered by the end of this year.

    Keep in mind that this request is not new and was made at least a few years ago. Every few months, a thread such as this appears and ignites the fire again. Hopefully, *knock on wood*, SolarWinds pulls through and provides this feature this year.

  • Is there any chance that this post could help you? 

    Alternatively, you could use a stored procedure on the SQL server to update custom properties automatically and alert based on the status of the custom property.  Not a SQL guru here, but you could do something like:

    • Configure a custom property named "Down_Metric" for each node in a site.
    • Have a stored procedure check the status of the nodes tagged with the custom property "Tier 1" (your site's dual routers).
    • If one router in "Tier 1" is down, update custom property "Down_Metric" with "10".  If both routers are down, update custom property "Down_Metric" with "20".
    • Configure the alert for the nodes to only alert you if the custom property "Down_Metric" is greater than 10.

    I'm not saying that it's not complicated and messy, but given the lack of many viable options, it might do the trick.  This is one instance where being able to update a custom property as an alert action would be pretty nice.
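
    To make the Down_Metric steps a bit more concrete, here is a minimal Python sketch of the logic only. It does not read or write the Orion database; the node dictionaries, the field names (site, tier, status), and the function names are hypothetical. Treating "all Tier 1 routers down" as the 20 case is also an assumption, so that sites without exactly two routers still get a value.

```python
from collections import defaultdict

# Metric values taken from the steps above.
ONE_ROUTER_DOWN = 10
ALL_ROUTERS_DOWN = 20


def down_metric_by_site(nodes):
    """Return {site: Down_Metric} computed from each site's Tier 1 routers."""
    tier1 = defaultdict(list)
    for node in nodes:
        if node["tier"] == "Tier 1":
            tier1[node["site"]].append(node["status"])

    metrics = {}
    for site, statuses in tier1.items():
        down = sum(1 for status in statuses if status == "Down")
        if down == 0:
            metrics[site] = 0
        elif down == len(statuses):
            metrics[site] = ALL_ROUTERS_DOWN  # assumption: "both" means "all"
        else:
            metrics[site] = ONE_ROUTER_DOWN
    return metrics


def sites_to_alert(metrics, threshold=10):
    """Sites whose Down_Metric is greater than the threshold."""
    return sorted(site for site, metric in metrics.items() if metric > threshold)


# Hypothetical inventory: NYC has lost both routers, BOS only one.
nodes = [
    {"name": "nyc-rtr1", "site": "NYC", "tier": "Tier 1", "status": "Down"},
    {"name": "nyc-rtr2", "site": "NYC", "tier": "Tier 1", "status": "Down"},
    {"name": "nyc-sw1", "site": "NYC", "tier": "Access", "status": "Down"},
    {"name": "bos-rtr1", "site": "BOS", "tier": "Tier 1", "status": "Down"},
    {"name": "bos-rtr2", "site": "BOS", "tier": "Tier 1", "status": "Up"},
]

metrics = down_metric_by_site(nodes)
print(metrics)
print(sites_to_alert(metrics))
```

    With a threshold of 10, only the site that has lost both routers is returned, which matches the "greater than 10" alert condition in the last step.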

  • I haven't looked into this yet, as we are still getting our 9.1 instance off the ground, but I thought the following notation in the release notes for SP3 for 9.1 was quite interesting.

    "- Custom property values may now be defined using variables, or macros, following the ${VARIABLE} format."


    Depending on how these values are treated by the alerting engine, one might be able to suppress alerts based on the result of a SQL query in a custom property field of a given device by defining the custom property as ${SQL:SomeQuery}. Again, I haven't played around with this yet, but my initial thought is that one could probably use this mechanism to suppress alerts for downstream nodes when one or more upstream devices are down, by defining the correct SQL query as a variable for a given node. All of your nodes would need some combination of custom properties that the SQL query can use as qualifiers, which is something we already have in place in our environment.

    Don't know if it will work, but thought it was worth mentioning.
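
    Whatever mechanism ends up evaluating it, the suppression rule itself is simple. The following is a rough Python sketch of that rule only, not of the ${SQL:...} macro or any Orion API; the function name, the upstream_of mapping (which would come from your custom properties), and the node names are hypothetical.

```python
def nodes_to_alert(down_nodes, upstream_of, suppress_when=all):
    """Return the down nodes that should still raise an alert.

    upstream_of maps a node to the list of devices it sits behind.
    suppress_when is the built-in all (suppress only if every upstream
    device is down) or any (suppress if at least one upstream is down).
    """
    down = set(down_nodes)
    alerts = []
    for node in sorted(down):
        parents = upstream_of.get(node, [])
        # Nodes without upstream devices are never suppressed.
        if parents and suppress_when(parent in down for parent in parents):
            continue
        alerts.append(node)
    return alerts


# Hypothetical site: two switches behind a pair of routers.
upstream_of = {"sw1": ["rtr1", "rtr2"], "sw2": ["rtr1", "rtr2"]}

site_down = nodes_to_alert({"rtr1", "rtr2", "sw1", "sw2"}, upstream_of)
one_router = nodes_to_alert({"rtr1", "sw1"}, upstream_of)
one_router_any = nodes_to_alert({"rtr1", "sw1"}, upstream_of, suppress_when=any)
print(site_down, one_router, one_router_any)
```

    The default of all fits sites with redundant routers, where a switch is still reachable while one router is up; passing any gives the stricter "any upstream device down" behaviour.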

    I LOVE custom properties... :)

  • Thanks for the input,

    These options, even though they look time-consuming to set up, just might get me what I need.  I may need to give them a shot.

    I wish SW would just integrate better grouping and relationships rather than relying solely on custom properties.  It seems like management wants more and more alerting and reporting out of the system all the time.

  • This is for sure something we are looking at for a future release.