Hello,
I have core nodes that I currently monitor for packet loss. While the alert can be helpful at times, it's causing some confusion among the engineers, depending on the reason for the alert. If a node's packet loss rises above 10%, an alert is sent; the reset alert is sent when packet loss drops below 5%.
The issues we've been having with this setup:
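To make the trigger/reset behavior concrete, here is a rough Python sketch of the logic as I understand it. The function name and the sample values are made up for illustration; this is not the monitoring tool's actual code, only the 10% and 5% thresholds come from my configuration.

```python
def alert_events(loss_samples, trigger=10.0, reset=5.0):
    """Return (sample_index, event) pairs for each alert state change.

    Hypothetical helper: an alert fires when loss goes above `trigger`,
    and a reset fires only once loss has dropped below `reset`.
    """
    alerting = False
    events = []
    for i, loss in enumerate(loss_samples):
        if not alerting and loss > trigger:
            alerting = True
            events.append((i, "ALERT"))
        elif alerting and loss < reset:
            alerting = False
            events.append((i, "RESET"))
    return events


# Made-up packet loss percentages from successive polls
print(alert_events([2, 11, 7, 6, 4, 12]))
```

Note that readings between 5% and 10% change nothing on their own, so an alert can stay active for a while after loss falls back under 10%.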
- The alert does not clearly inform me where the packet loss is occurring (it could be at the node or at the far-end device)
- Packet loss on multiple edge devices (due to coincidence, or some common cause) triggers the alert
- The default alert was configured for 5% and I raised it to 10%; I'd like to know whether you all are using a higher or lower percentage
I'm wondering if anyone has dealt with these issues and found a better method for monitoring their core devices for packet loss. Any advice or comments are appreciated.