11 Replies Latest reply on May 11, 2009 4:18 PM by MagnAxiom

    Alert Suppressing

    afsprau

           

      Hello thwack community, I would like to hear how you are using Orion to alert on site outages.

       

      What I am trying to do is alert on a site down without getting an alert for every node at that site, while maintaining the alert for any one device that may go down.

      In short, I am trying to find a way to have two alerts:

                  One if the full site is down (all nodes).

                  Two if only some devices are down but the site is still up.

       

      Is anyone doing something like this, or does anyone have advice on how I would attempt it? I do have custom fields that group each site, but how do I tell SolarWinds to only alert when 100% of nodes in group XYZ are down and/or suppress the node alerts if group XYZ is down?
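      To make the two-alert requirement concrete, here is a minimal Python sketch of the decision rule only. It is not Orion functionality: the function name, the status labels, and the (node, site, status) tuples are assumptions for illustration, with the site value standing in for the custom field that groups each site.

```python
from collections import defaultdict

# Hypothetical status labels; these are not Orion's real status codes.
DOWN = "down"

def classify_sites(nodes):
    """nodes: iterable of (node_name, site, status) tuples.

    Returns {site: (state, down_nodes)} where state is
    "site_down" (every node down), "partial" (some down), or "ok".
    """
    by_site = defaultdict(list)
    for name, site, status in nodes:
        by_site[site].append((name, status))

    result = {}
    for site, members in by_site.items():
        down = [name for name, status in members if status == DOWN]
        if len(down) == len(members):
            result[site] = ("site_down", down)  # one alert for the whole site
        elif down:
            result[site] = ("partial", down)    # one alert per down node
        else:
            result[site] = ("ok", [])
    return result
```

      A "site_down" result would map to the single site alert, and a "partial" result to the per-device alerts.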

      Thanks for any help or advice

        • Re: Alert Suppressing
          MagnAxiom

                  

          Good luck with your request, friend! I've not been able to find any way to have an advanced alert know the status of any node other than the one the alert was raised for.

          In essence you want to have Parent/Child relationships so you can only alert on the Parent node for a site, say the WAN router.  Unfortunately Orion doesn't support this kind of relationship system in the alerting system.  It's probably my biggest wish list item.

          Maybe someone will prove me wrong and have an idea for you! I'll certainly be waiting to see!

          • Re: Alert Suppressing
            afsprau

            Anyone from SolarWinds: is this possible, or is it on the roadmap?

            You can already display nodes in groups and then show that group as down (RED). Can this function be moved into an alert?

            • Re: Alert Suppressing
              bshopp

              Have you seen this thread? Re: Grouping of objects for alerting

                • Re: Alert Suppressing
                  afsprau

                  Thanks, I had not found that post.

                  But reading through it, I am still a little confused.

                  First, it sounds like I would need a report for each of my sites. This could be 100 or more alerts.

                  Next, most of these sites have 2 routers, and I only want to be alerted when both routers are down (site down hard). I have every device marked with a custom property that assigns it to its site (group). I also have alerts right now that simply say "any router or switch down, alert". But if I have a site that has 25 switches, I don't want to get 25 emails.

                  Too many custom properties make upkeep a pain, and they may not always be updated correctly.

                  I have thought about setting up the HSRP for each site in SolarWinds, then alerting if any HSRP is down. The alert would then list the custom fields and tell me which site it is.

                  Going further, I am being pressured by management to provide this type of information in report form. I need to be able to run a report at the end of the month that lists the date, time, and duration of every complete site outage over the last month. Ideally I could run both an alert and a report based on the rule: if 100% of nodes in group X are down, then that site is down.
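                  The monthly report described above comes down to finding the time windows in which every node of a group was down at once. The Python sketch below shows that calculation only. It assumes the down/up events have already been exported from the monitoring system as (timestamp, node, is_down) tuples; that export format, the function name, and the node names are hypothetical.

```python
from datetime import datetime

def site_outages(site_nodes, events):
    """Return (start, end, minutes) for each period when ALL site nodes were down.

    site_nodes: names of the nodes that make up the site.
    events: (timestamp, node, is_down) tuples; nodes are assumed to be up
            before their first event. An outage still open after the last
            event is not reported.
    """
    site_nodes = set(site_nodes)
    down = set()
    outages = []
    start = None
    for ts, node, is_down in sorted(events, key=lambda e: e[0]):
        if node not in site_nodes:
            continue
        if is_down:
            down.add(node)
        else:
            down.discard(node)
        all_down = down == site_nodes
        if all_down and start is None:
            start = ts                      # last node went down: outage begins
        elif not all_down and start is not None:
            minutes = (ts - start).total_seconds() / 60
            outages.append((start, ts, minutes))
            start = None                    # first node recovered: outage ends
    return outages

# Example with two routers at one site (made-up data).
events = [
    (datetime(2009, 5, 1, 10, 0), "xyz-rtr1", True),
    (datetime(2009, 5, 1, 10, 5), "xyz-rtr2", True),
    (datetime(2009, 5, 1, 10, 20), "xyz-rtr1", False),
    (datetime(2009, 5, 1, 11, 0), "xyz-rtr2", False),
]
outages = site_outages({"xyz-rtr1", "xyz-rtr2"}, events)
```

                  In this example the site counts as down only from 10:05 to 10:20, when both routers were down together.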

                    • Re: Alert Suppressing
                      qle

                      The solution suggested by Network Guru in that thread is excellent, but you've pointed out the exact reasons why I have not implemented it myself. SolarWinds has indicated that it is on the roadmap to have some kind of intelligence included in NPM that will automatically determine dependencies. If my memory serves me correctly, a version with this capability will be delivered by the end of this year.

                      Keep in mind that this request is not new and was made at least a few years ago. Every few months, a thread such as this appears and ignites the fire again. Hopefully, *knock on wood*, SolarWinds pulls through and provides this feature this year.

                        • Re: Alert Suppressing
                          bleearg13

                          Is there any chance that this post could help you?  Re: Can I create Advansed Alert with count?

                          Alternatively, you could use a stored procedure on the SQL server to update custom properties automatically and alert based upon the status of the custom property (Stored Procedue to populate custom property).  Not a SQL guru here, but you could do something like:

                          • Configure a custom property for each node in a site for "Down_Metric"
                          • Stored procedure looks for nodes in custom property "Tier 1" (your site's dual routers) to check status.
                          • If one router in "Tier 1" is down, update custom property "Down_Metric" with "10".  If both routers are down, update custom property "Down_Metric" with "20".
                          • Configure alert for the nodes to only alert you if the custom property "Down_Metric" is greater than 10.

                          I'm not saying that it's not complicated and messy, but given the lack of many viable options, it might do the trick.  This is one instance where being able to update a custom property as an alert action would be pretty nice.
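                          The Down_Metric steps above can be sketched as follows. This is a minimal illustration, not a tested Orion stored procedure: it uses Python with an in-memory SQLite table so it runs anywhere, whereas Orion uses SQL Server. The table layout, the Site and Tier columns, and the status code 2 for "down" are all assumptions; check them against your own database before adapting the query.

```python
import sqlite3

DOWN = 2  # assumed status code for "down"; verify against your own database

def refresh_down_metric(conn):
    """Set Down_Metric on every node from the state of its site's Tier 1 routers:
    0 if none are down, 10 if one is down, 20 if two or more are down."""
    conn.execute(
        """
        UPDATE Nodes
        SET Down_Metric = (
            SELECT CASE WHEN COUNT(*) >= 2 THEN 20
                        WHEN COUNT(*) = 1 THEN 10
                        ELSE 0 END
            FROM Nodes AS t
            WHERE t.Site = Nodes.Site AND t.Tier = 'Tier 1' AND t.Status = ?
        )
        """,
        (DOWN,),
    )
    conn.commit()

def sites_to_alert(conn):
    """Sites where Down_Metric is greater than 10, i.e. both routers are down."""
    rows = conn.execute(
        "SELECT DISTINCT Site FROM Nodes WHERE Down_Metric > 10 ORDER BY Site"
    )
    return [row[0] for row in rows]

# Hypothetical stand-in for the node table, with made-up nodes.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Nodes (Caption TEXT, Site TEXT, Tier TEXT, "
    "Status INTEGER, Down_Metric INTEGER DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO Nodes (Caption, Site, Tier, Status) VALUES (?, ?, ?, ?)",
    [
        ("xyz-rtr1", "XYZ", "Tier 1", 2),
        ("xyz-rtr2", "XYZ", "Tier 1", 2),
        ("xyz-sw1", "XYZ", "Tier 2", 2),
        ("abc-rtr1", "ABC", "Tier 1", 2),
        ("abc-rtr2", "ABC", "Tier 1", 1),
        ("abc-sw1", "ABC", "Tier 2", 1),
    ],
)
refresh_down_metric(conn)
sites_down = sites_to_alert(conn)
```

                          Because the threshold is "greater than 10", a site with only one Tier 1 router down (metric 10) does not trigger the site alert.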

                            • Re: Alert Suppressing
                              vhcato

                              I haven't looked into this yet, as we are still getting our 9.1 instance off the ground, but I thought the following notation in the release notes for SP3 for 9.1 was quite interesting.

                              "- Custom property values may now be defined using variables, or macros, following the ${VARIABLE} format."


                              Depending on how these values are treated by the alerting engine, one might be able to suppress alerts based on the value of a SQL query in a custom property field of a given device by defining the custom property as ${SQL:SomeQuery}. Again, I haven't played around with this yet, but my initial thought is that one could probably use this mechanism to suppress alerts for downstream nodes when any one (or multiple) upstream devices were down by defining the correct SQL query as a variable for a given node. You would have to have some combination of custom properties assigned to all of your nodes that would allow them to be used as qualifiers by the SQL query, which is something we actually already have in place in our environment.

                              Don't know if it will work, but thought it was worth mentioning.
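                              Setting aside whether the alerting engine would evaluate such a variable, the suppression rule itself is simple to state: alert on a down node only if none of its upstream devices is down. A minimal Python sketch of that rule follows. The upstream mapping is a hypothetical stand-in for whatever custom property records each node's parent, and the function name is made up for illustration.

```python
def alerts_after_suppression(upstream, down):
    """Return the down nodes that should still alert.

    upstream: {node: parent node, or None for the top of the chain}
    down: set of nodes currently down
    A down node is suppressed if any device upstream of it is also down.
    """
    def has_down_ancestor(node):
        seen = set()                      # guards against loops in the mapping
        parent = upstream.get(node)
        while parent is not None and parent not in seen:
            if parent in down:
                return True
            seen.add(parent)
            parent = upstream.get(parent)
        return False

    return sorted(n for n in down if not has_down_ancestor(n))
```

                              With a WAN router at the top of the chain, a router outage would produce one alert for the router, while a single access switch failing on its own would still alert normally.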

                              I LOVE custom properties... :)

                      • Re: Alert Suppressing
                        afsprau

                        Thanks for the input,

                        These options, even though they look time-consuming to set up, just might get me what I need. I may need to give them a shot.

                        I wish SW would just integrate better grouping and relationships rather than counting solely on custom properties. Seems like management is wanting more and more alerting and reporting out of the system all the time.

                          • Re: Alert Suppressing
                            bshopp

                            This is for sure something we are looking at for a future release.

                              • Re: Alert Suppressing
                                MagnAxiom

                                No doubt there...

                                 

                                I'm not a SQL guru by any means; I don't even think I could call myself proficient in SQL.

                                The idea of having to go insert SQL queries seems like a complete hack!

                                I don't want to write something that becomes so customized that documentation and training become an issue.

                                Come on SolarWinds, step up and add this functionality!!!