6 Replies Latest reply on Feb 8, 2017 7:57 AM by wessens

    Alert - (DownTime) displays as 0 time

    wessens

      Gentlemen,

       

      Please let me know what I am doing wrong. My alert displays the following:

       

      Code:

      ${NodeName}: ${Status} for ${N=Alerting;M=Downtime} minutes.

       

      Output:

      SERVERNAME: Down for 0 minutes.

       

      Currently the "Trigger Condition" is set to:

      [Screenshot: twa.png]

       

      Any hints on this? I expected the alert to show a downtime of at least 5 minutes, or however long the node has been down by the time the alert is sent out.

       

       

      - WessenS

        • Re: Alert - (DownTime) displays as 0 time
          torquenut

          Hi wessens,

           

          Is this output a live alert of a device that has gone DOWN, or a simulated one in the Alert Editing fields?  If you're simulating against a node that is currently UP, it will display a downtime of 0 minutes, since the node is not technically down.

          • Re: Alert - (DownTime) displays as 0 time
            torquenut

            Also, a screenshot of your "Trigger Conditions" page may help us figure out your issue!

            • Re: Alert - (DownTime) displays as 0 time
              wessens

              Hello Torquenut,

               

              It is a portion of our live alert email. The very top line of the alert paragraph is the only part not working properly.

              Everything else is fine...

               

              Code:

              ${NodeName}: ${Status} for ${N=Alerting;M=Downtime} minutes.

              Machine Type: ${MachineType}

              Description: ${description}

              Last Boot: ${N=SwisEntity;M=LastBoot}

              Location: ${location}

               

              ${NodeDetailsURL}

              ${N=Alerting;M=AcknowledgeLink}

               

               

               

              Here is my trigger condition page:

               

              [Screenshot: twaa.png]

               

              Thanks!

              • Re: Alert - (DownTime) displays as 0 time
                torquenut

                Okay, I think I've got it!  TL;DR: This is expected behavior for the first alert.

                 

                I set up my laptop as a test node and threw a generic "node down" alert at it, making sure to include the "downtime" variable.  Here's what I got when I disconnected my laptop (left) and when I plugged it back in:

                 

                The very first alert was sent as soon as the alert noticed my laptop was offline.  Imagine you're walking down the street and you find a penny on the ground.  You don't know how long that penny has been there, but from your perspective, it has existed for all of 10 seconds.  Since the node is down, it cannot report "downtime" the way it can report "uptime", for obvious reasons.  It's up to the polling engine to discover that the device is no longer communicating, and it then uses the alert rule to measure how long it has been since that discovery.
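
                To make the timing concrete, here is a toy model in Python.  It is only a sketch: it assumes the downtime counter starts at the moment the alert triggers, which is what the outputs in this thread suggest but not something I have confirmed from the product's internals.  The function name and the timestamps are made up for illustration.

```python
from datetime import datetime, timedelta


def reported_downtime_minutes(triggered_at, sent_at):
    """Whole minutes between the alert trigger and the message being sent.

    Toy model (assumption): the counter starts when the alert triggers,
    not when the node actually stopped responding.
    """
    elapsed = sent_at - triggered_at
    return max(0, int(elapsed.total_seconds() // 60))


# Hypothetical timeline for one outage.
triggered = datetime(2017, 2, 7, 9, 0, 0)
first_email = triggered + timedelta(seconds=10)               # trigger action
reset_email = triggered + timedelta(minutes=12, seconds=30)   # reset action

first = reported_downtime_minutes(triggered, first_email)
at_reset = reported_downtime_minutes(triggered, reset_email)
print(first, at_reset)
```

                Under that assumption the trigger message always reports 0, while the reset message reports the elapsed time.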

                 

                If you want a running record of how long the device has been offline, I recommend editing your alert to resend its message every couple of minutes.  You can do this by opening the alert, going to the "Trigger Actions" page, editing the message, and entering a value for X in "Repeat this action every X minutes until the alert is acknowledged" under "Execution Settings".
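
                As a rough illustration of what a repeating action would give you, the sketch below lists the minutes after the trigger at which a message would go out.  It assumes the alert is never acknowledged and that repeats fire exactly on schedule; the function name and numbers are hypothetical.

```python
def repeat_send_minutes(repeat_every, outage_minutes):
    """Minutes after the trigger at which a message would be sent.

    Toy model (assumption): the first message goes out at minute 0 and
    the action repeats every `repeat_every` minutes while the node is down.
    """
    if repeat_every <= 0:
        raise ValueError("repeat_every must be positive")
    return list(range(0, outage_minutes + 1, repeat_every))


# Hypothetical 17-minute outage with the action repeating every 5 minutes.
reports = repeat_send_minutes(repeat_every=5, outage_minutes=17)
print(reports)
```

                Each repeated message would then carry a progressively larger downtime value instead of only the initial 0.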

                 

                 

                 

                If you don't need a reminder or a running tally of the downed node, just make sure the < ${N=Alerting;M=Downtime} > variable is in your RESET ACTIONS message for when the node comes back online.

                 

                Clear as mud?  I hope this helps!

                 

                P.S. - I would guess that if you set your polling interval to be longer than your alert's "condition must exist for" setting, your first alert would hold a value greater than 0.  But then you wouldn't get the alert until several minutes after the actual event.

                • Re: Alert - (DownTime) displays as 0 time
                  wessens

                  Clear as mud! Thanks for your response.

                   

                  Initially that is what we thought: that a downed node would display "0" minutes because the node had only just gone down. But since the "Trigger Condition" page is set to "Condition must exist for more than 5 minutes", we were hoping the email would show at least "downtime: 5 minutes", as that is what we set it to.

                   

                  The reset condition is just like yours (screenshot). It correctly reports how long the device was down until it came back up.

                  I'm sure the repeating alerts would help, but the production side and the groups it hits would not like that (repeated emails/alerts for the same node).

                  Either way, you have helped a lot!

                   

                  Thanks!

                   

                  - Wess