Hi Guys,
I've recently started at a company and have been tasked with improving the SolarWinds monitoring. I have experience with other monitoring tools, but I'm relatively new to SolarWinds.
Currently we mainly use NPM and NCM, but we only really look for ups and downs. We monitor switches, routers, phone systems and servers. I'd like to branch out and do more than ups and downs, so I'm hoping you helpful people can give me some guidance on what's best to alert on. I'd like to create some new alerts, but I'd rather not add them just for the sake of having more alerts...
The alerts I'm thinking of adding are:
- Node Down
- Node Reboot
- Interface Down
- Disk Space
- CPU Load
- Memory Utilization
- Bandwidth Utilization
- Packet Loss
- Latency
- Hardware Errors
- NCM config backup fails
- Fan speed
- Power supply
- Temperature
- IP address change
- DNS change
- MAC address change
- Switch port status (unplugged/up)
Is this a good start, or can anyone think of other alerts that would be worth adding in SolarWinds?