Polling and alerting on high CPU for routers and switches

All,

I was lucky enough to recently experience a high-CPU issue (pegged at 100%!) on one of my core routers. I'm still not 100% certain what caused it, but after troubleshooting for an hour, I had to reboot it, and the issue is now "resolved." At least until the CPU climbs back up to 100% and I can troubleshoot further (I'm leaning toward a code bug).

However, this issue also pointed out a gap in my monitoring: I'm not alerting on high CPU (or memory) for network gear at all. After all, most network issues involve connectivity and routing, so CPU and memory are usually the last things I look at. So I'm wondering whether there is a way to alert on CPU and memory, and what thresholds I should consider as a "rule of thumb."

I found this post, which seems like a great solution, but I would prefer to let NPM handle this.

https://supportforums.cisco.com/document/53381/high-cpu-event-detection-methods-cisco-routers-switches

Is anyone else alerting on high CPU and memory on their network gear? What thresholds are you using? What do your alerts look like? I've not upgraded to NPM 10.7 yet, and even if I had, I don't know whether baselining for CPU or memory is configurable.

D