
Alert not working

I've created an advanced alert to notify me when a node goes down based on the following:


Status is equal to Down
Vendor is equal to Windows


However, this doesn't seem to work: I'm not getting alerts when the Windows systems go down. I've used the test feature in the alerts area and successfully received the test notifications. I've also ensured that the alert is enabled. Is there something obvious that I'm missing?

Product Manager

"Status" should be "Node Status". See the following screenshot



Yes, I'm sorry, typo on my part. "Node Status" is configured in the query (I don't believe there is a "Status"-only option).


Perhaps some other obvious item I'm missing? Is anyone else successfully using "Vendor" in their Alert condition?


If your alert is as aLTeReGo posted, it should definitely work. I've defined an alert similar to this and it's been working fine for us. Just to double-check, you have both conditions created and the property to monitor (at the top) is set to "Node"?

Well, I think I figured it out and, of course, user error was the likely cause. There was an additional condition in the alert that specified "Vendor = VMware". Thus, the alert logic was waiting for a node to match both 'Windows' AND 'VMware' while in 'Down' status before triggering, which never happens. So I stripped out the Vendor = VMware condition and it seems to be working, albeit not in real time. Which brings me to my next question: where can I lower the response time for detecting that a system is down? Currently, when I disable a particular system's network interface, it takes about 4 minutes for me to receive the alert. A delay that long would miss the majority of system restarts (which is really what I'd like this alert to monitor).
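
For anyone hitting the same thing, here is a minimal Python sketch of why the extra condition blocked the alert. It is only an illustration, not how the Orion alert engine is implemented: it assumes every condition is checked against a single node and that all of them must match (AND). The names `alert_triggers`, `broken_conditions`, and `fixed_conditions` are made up for the example.

```python
def alert_triggers(node, conditions):
    """Return True only if the node matches every (field, expected) pair.

    Assumption: conditions are combined with AND and evaluated per node.
    """
    return all(node.get(field) == expected for field, expected in conditions)


# A Windows node that has gone down.
windows_node = {"Node Status": "Down", "Vendor": "Windows"}

# The alert as originally configured, with the stray VMware condition.
broken_conditions = [
    ("Node Status", "Down"),
    ("Vendor", "Windows"),
    ("Vendor", "VMware"),
]

# The alert after removing the stray condition.
fixed_conditions = [
    ("Node Status", "Down"),
    ("Vendor", "Windows"),
]

if __name__ == "__main__":
    print("broken alert fires:", alert_triggers(windows_node, broken_conditions))
    print("fixed alert fires:", alert_triggers(windows_node, fixed_conditions))
```

With the extra Vendor condition, no node can ever satisfy the alert, because a node has only one Vendor value.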

Thanks again for the support.


When you test it, by disabling the interface, how long does it take for it to show down in Orion?

Well, Orion is still set to the default 4-minute page refresh interval. But I kept refreshing manually after disabling the interface: at about 120 seconds the node went to 'Warning', and after another 60-120 seconds it changed to 'Down' and I received the alert (email) within seconds. So it seems that NPM is the slowest link. I have lowered the NPM poll interval from 120 seconds to 60. Is there a setting that puts the node in the 'Down' state immediately after 60 seconds instead of 'Warning' (as seen in Orion)? (Though I think the 'Warning' is related to packet loss...)


"I manually kept refreshing after disabling the interface and, at about 120 seconds, it went to warning, and then another 60-120 seconds, it changed to 'Down' status and I received an alert (email)."

You got the first part: changing the status polling interval to 60 seconds. The second part of what you're looking for is under 'Advanced Settings' in System Manager. Go to the 'Node Warning Interval' tab. From there you can reduce the amount of time that a node will sit in the Warning state before going to the Down state.

Cool. So I pushed this setting down to 10 seconds and haven't noticed any significant decrease in the delay. Maybe I just need to restart a service?


The Node Warning Interval determines the length of time a node stays in the Warning state. This is configured from the System Administrator Console:
File >>> Advanced Settings >>> Node Warning Interval

Orion pings each node every [Node Polling Interval - default 120 seconds]. If no response is received, the node status switches to Warning and we start polling it every 10 seconds. This continues until either we get a response - node status switches back to Up and we resume normal polling - or the "node warning interval" (default 120 seconds) expires and we mark the node Down.
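
As a rough illustration of that timing, the Python sketch below estimates how long a failed node can take to show as Down. It is a simplified model built only from the description above, not SolarWinds code: the function name `down_detection_window` is hypothetical, and it ignores the 10-second fast-poll granularity and any delay in the alert engine itself.

```python
def down_detection_window(polling_interval_s, warning_interval_s):
    """Return (best_case, worst_case) seconds from failure to Down status.

    Simplified model (an assumption, not Orion's actual code):
    - Best case: the node fails just before a regular poll, so it enters
      Warning almost immediately and goes Down once the warning interval
      expires.
    - Worst case: the node fails just after a regular poll, so a full
      polling interval passes before it enters Warning.
    """
    if polling_interval_s <= 0 or warning_interval_s < 0:
        raise ValueError("intervals must be positive numbers of seconds")
    best_case = warning_interval_s
    worst_case = polling_interval_s + warning_interval_s
    return best_case, worst_case


if __name__ == "__main__":
    scenarios = [
        ("defaults (120 s poll, 120 s warning)", 120, 120),
        ("tuned (60 s poll, 10 s warning)", 60, 10),
    ]
    for label, poll_s, warn_s in scenarios:
        best, worst = down_detection_window(poll_s, warn_s)
        print(f"{label}: Down after {best} to {worst} seconds")
```

With the defaults the worst case is 240 seconds, which matches the roughly 4 minute delay reported earlier in the thread.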

You can configure the node polling interval on a per-node basis and you can set the default node polling interval for new nodes in System Manager under File > NetPerfMon Settings > Polling. You can change the node warning interval in System Manager under File > Advanced NetPerfMon Settings.


Assigning an SNMP trap for the switch interface going down would be the basic answer... but I think you are monitoring virtual machines with this alert (Vendor = VMware)? If so, I would look at the Virtual Center's SNMP trap capabilities.

2 cents dropped, see if wish comes true...

I'm anxious to hear what kind of response you get back from either the Thwack community or SolarWinds themselves. I'd like to do something similar with my alerts, but before I go and break something that's currently working I would like to know if this is supposed to work. Honestly, I don't see any reason why it shouldn't, but then again I just use the code, I don't write it. 🙂

