What are the best thresholds for the following network parameters (routers and switches)?
1. Average CPU load
2. Disk usage
3. Percent memory used
4. Percent packet loss
5. Response time
That has a lot of dependencies. You probably need to run a report or something and baseline where the devices are at now. Older switches likely run at a higher utilisation. Response time is measured relative to the poller, so again you will have to baseline what normal looks like. We don't really mess with any of the defaults, to be honest, especially if you have preventative measures in place like loop detection.
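One way to turn a baseline report into thresholds is sketched below. It is only an illustration: the function name `baseline_thresholds`, the mean-plus-standard-deviation rule, and the 2 and 3 sigma multipliers are assumptions of mine, not something a particular monitoring product does by default. The samples would come from whatever history your poller already keeps.

```python
import statistics


def baseline_thresholds(samples, warn_sigma=2.0, crit_sigma=3.0):
    """Derive thresholds from historical samples of one metric on one device.

    Hypothetical helper: warning and critical are placed a number of
    sample standard deviations above the observed mean (an assumed rule).
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples to compute a baseline")
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples)  # sample standard deviation
    return {
        "baseline": mean,
        "warning": mean + warn_sigma * stdev,
        "critical": mean + crit_sigma * stdev,
    }


if __name__ == "__main__":
    # Example: response times in ms as seen from the poller (made-up values).
    history_ms = [10, 12, 11, 13, 14]
    print(baseline_thresholds(history_ms))
```

Because the thresholds follow each device's own history, an older switch that normally runs hot gets higher limits than a lightly loaded one, which is the point of baselining first.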
1. Average CPU load: warning 75%, high 85%, and critical 95%. You can also lower each of them by 5% if the devices are more critical (for example, in a data center).
2. Disk usage: same as above.
3. Percent memory used: same as above.
4. Percent packet loss: once packet loss exceeds 10%, it starts to cause issues. So, 10% / 30% / 45% or 50%.
5. Response time: this depends on location. On a LAN it should be less than 25 ms, and over a WAN it should be between 150 ms and 250 ms for uninterrupted communication. It can differ if you are using an ISP's MPLS connection or your own VPN connection, in which case less than 25 ms is again fine. Response times above these values will start to cause trouble.
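The numbers in the list above can be written down as a small lookup, roughly like this. The names (`THRESHOLDS`, `classify`, `response_time_ok`) are hypothetical, and a few readings are my assumptions: "decrease by 5%" is taken as 5 percentage points and applied only to CPU, disk, and memory; 45% is used as the critical packet loss level (the post also allows 50%); and 250 ms is treated as the upper limit for WAN.

```python
# (warning, high, critical) levels in percent, taken from the list above.
THRESHOLDS = {
    "cpu_load_pct": (75, 85, 95),
    "disk_usage_pct": (75, 85, 95),
    "memory_used_pct": (75, 85, 95),
    "packet_loss_pct": (10, 30, 45),  # assumption: 45 rather than 50
}

# Upper response-time limits in ms per link type (assumed interpretation).
RESPONSE_LIMIT_MS = {"lan": 25, "wan": 250, "mpls_or_vpn": 25}


def classify(metric, value, critical_device=False):
    """Return 'ok', 'warning', 'high' or 'critical' for a metric value."""
    warning, high, critical = THRESHOLDS[metric]
    # Assumption: tighten by 5 percentage points for critical devices,
    # but leave the packet loss levels unchanged.
    if critical_device and metric != "packet_loss_pct":
        warning, high, critical = warning - 5, high - 5, critical - 5
    if value >= critical:
        return "critical"
    if value >= high:
        return "high"
    if value >= warning:
        return "warning"
    return "ok"


def response_time_ok(link_type, millis):
    """True if the response time is within the limit for the link type."""
    return millis <= RESPONSE_LIMIT_MS[link_type]


if __name__ == "__main__":
    print(classify("cpu_load_pct", 80))                        # access switch
    print(classify("cpu_load_pct", 80, critical_device=True))  # data center
    print(response_time_ok("wan", 300))
```

Keeping the levels in one table makes it easy to replace them with baselined values per device later.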
If anybody has other thoughts, they are most welcome.