This discussion has been locked. The information referenced herein may be inaccurate due to age, software updates, or external references.

SLIGHT PACKET LOSS CAUSING NODE DOWN ALERTS

orioncrack over 7 years ago

We're seeing slight packet loss, and whenever it happens a barrage of Node Down alerts comes in.

Where is packet loss configured that would affect our basic Node Down triggers? I'm stumped.

Thanks

Top Replies

0 GoldTipu over 7 years ago

A node is marked as down once it reaches the configured default threshold.
Settings > Polling Settings >
By default, node status is polled every 120 seconds.
Try increasing the status polling frequency for the affected nodes to every 10, 20, or 30 seconds. That way, even if a packet is dropped or the node is briefly unreachable, the next poll will get a response and the threshold will not be triggered.
Edit Node >
You can also try SNMP status polling to see whether that helps; see the following:
Difference between ICMP & SNMP response time

0 Karthik.A over 7 years ago

Hi orioncrack,
Changing the polling frequency will not help if the alerts are caused by frequent packet loss.
A better approach is to set a hold time on the alert.
To set a hold time:
1. Go to the Alert Manager and open the alert you configured for Node Down.
2. On the conditions tab there is an option to not trigger the alert until the condition has existed for xx sec/min.
3. Set the time to xx min. The status is then checked over that period, and the alert triggers only if the device is still not responding at the end of it.
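As a rough illustration of what a hold time does, here is a minimal Python sketch. It is not SolarWinds code: the class `HoldTimeAlert`, the `hold_seconds` parameter, and the sample poll data are all made up for this example, and the 300-second hold is just an assumed value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HoldTimeAlert:
    """Fires only after the 'down' condition has persisted for hold_seconds.

    Hypothetical helper for illustration; not part of any SolarWinds API.
    """
    hold_seconds: float
    down_since: Optional[float] = None  # time the node was first seen down

    def observe(self, timestamp: float, node_is_down: bool) -> bool:
        """Record one status poll; return True if the alert should trigger."""
        if not node_is_down:
            # Any successful poll resets the timer.
            self.down_since = None
            return False
        if self.down_since is None:
            self.down_since = timestamp
        return timestamp - self.down_since >= self.hold_seconds


alert = HoldTimeAlert(hold_seconds=300)
# (seconds, node_is_down) samples at an assumed 120-second polling interval
polls = [(0, False), (120, True), (240, False), (360, True),
         (480, True), (600, True), (720, True)]
results = [alert.observe(t, down) for t, down in polls]
print(results)
```

In this sample the single missed poll at 120 seconds never triggers anything, because the next poll succeeds and resets the timer. Only the sustained outage that starts at 360 seconds produces a trigger, on the last poll.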

0 orioncrack over 7 years ago in reply to Karthik.A

Thanks. I'm aware of these basic steps. I just think there should be more control over this, as the polling warning settings don't seem very fine-grained and, to me, are not exact. An alert should only be generated if there is 100% packet loss. I suppose I can add that statistic as a condition. But I feel more control is needed here.
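To make the "only alert at 100% packet loss" idea concrete, here is a small Python sketch. It is only an illustration of the condition, not how Orion evaluates it; the function names and the sample ping results are assumptions.

```python
from typing import Sequence


def packet_loss_percent(replies: Sequence[bool]) -> float:
    """Percentage of pings in the sample that got no reply."""
    if not replies:
        raise ValueError("need at least one ping result")
    lost = sum(1 for ok in replies if not ok)
    return 100.0 * lost / len(replies)


def should_alert(replies: Sequence[bool]) -> bool:
    """Alert only when every ping in the sample was lost."""
    if not replies:
        raise ValueError("need at least one ping result")
    return not any(replies)


# True means a reply was received, False means the packet was lost.
slight_loss = [True, False, True, True]
total_loss = [False, False, False, False]

print(packet_loss_percent(slight_loss), should_alert(slight_loss))
print(packet_loss_percent(total_loss), should_alert(total_loss))
```

With a condition like this, a sample with one lost packet out of four reports 25% loss and stays quiet, while a sample with no replies at all raises the alert.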

0 GoldTipu over 7 years ago in reply to Karthik.A

Changing the polling frequency is quite normal for nodes such as firewalls, since they have intrusion prevention configured on their interfaces.
In other cases network latency can cause the same issue. So instead of going up to the global alert, it is better to fix the root issue by adjusting the polling frequency, so that the node is only marked as down when it really is down. In that sense, it DOES HELP.
In a large environment, changing the whole alert condition because one or two nodes are causing an issue like this is not really recommended. In this case he only had an issue with a single node, which is being marked as down because it is polled every 120 seconds and the packet is dropped, either because of network latency or at the interface level.

0 anshumaandevmishra over 7 years ago

Go to All Settings --> Orion Thresholds (in the Thresholds and Polling section). From there you can change the polling critical and warning percentages.
OR
Set up an alert with the condition below --
Hope this helps...

0 aLTeReGo over 4 years ago

NPM 12.5 includes support for sustained thresholds, allowing you to alert when a node threshold has been exceeded for more than 'X' consecutive polls, or for 'X' out of 'Y' polls. For more information, see the following post.
Orion Platform 2019.2 - Enhanced Node Status
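For readers who want to see the two sustained-threshold modes side by side, here is a minimal Python sketch of the general idea. It is not the NPM implementation: the `SustainedThreshold` class, its method names, and the sample poll history are hypothetical, and it assumes `x` is at least 1 and no larger than the window.

```python
from collections import deque
from typing import Deque


class SustainedThreshold:
    """Keeps the last `window` poll results and reports sustained breaches.

    Illustrative only; the names here are not taken from any SolarWinds API.
    """

    def __init__(self, window: int) -> None:
        # True means the threshold was breached on that poll.
        self.history: Deque[bool] = deque(maxlen=window)

    def record(self, breached: bool) -> None:
        self.history.append(breached)

    def consecutive(self, x: int) -> bool:
        """True if the most recent x polls all breached the threshold."""
        recent = list(self.history)[-x:]
        return len(recent) == x and all(recent)

    def x_of_y(self, x: int, y: int) -> bool:
        """True if at least x of the most recent y polls breached it."""
        recent = list(self.history)[-y:]
        return sum(recent) >= x


tracker = SustainedThreshold(window=5)
for breached in [True, False, True, True, False]:
    tracker.record(breached)

print(tracker.consecutive(3))  # last three polls were not all breaches
print(tracker.x_of_y(3, 5))    # three of the last five polls breached
```

The same history gives different answers in the two modes: intermittent loss fails the "consecutive" test but can still satisfy "X out of Y", which is why the choice between them matters for flapping links.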