1 Reply Latest reply on May 28, 2016 10:17 AM by aLTeReGo

    Heads up - Linux/net-snmp volume monitoring


      Just a heads up if you monitor Linux servers with SW/NPM. We learned this the hard way when a volume filled up and crashed an application. Last week I got a call about an application having problems, and when I logged into the server to check on things, I noticed that root was 100% full. SW reported it as only 95% full, and even though I had an alert set to trigger at >= 95%, the alert never fired. This alert has triggered many times before, so we know it works. I'm not sure why the alert didn't trigger, but I think I know why SW reported root as only 95% full when it was in fact 100% full.


      Apparently, there are two ways to poll net-snmp for volume information. SW uses the stock Host Resources MIB, which doesn't appear to account for the (default) 5% of space that the file system reserves for root. That would explain why it reported the volume as 95% full when, as far as non-root users were concerned, it was actually 100% full. I have confirmed this discrepancy on RHEL 5 and Ubuntu 10.04 Server.
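      To make the arithmetic concrete, here is a rough Python sketch of the two ways to compute "percent used" from the same file system counters. This is my own illustration, not SW's or net-snmp's actual code: the function names are made up, and I am assuming the poller divides used blocks by total blocks while df divides used blocks by the space non-root users can actually reach.

```python
import os


def percent_used_raw(f_blocks, f_bfree):
    """Used blocks over TOTAL blocks; reserved-for-root space counts as free."""
    if f_blocks == 0:
        return 0.0
    used = f_blocks - f_bfree
    return 100.0 * used / f_blocks


def percent_used_df(f_blocks, f_bfree, f_bavail):
    """Used blocks over the space non-root users can reach (used + available)."""
    used = f_blocks - f_bfree
    usable = used + f_bavail
    if usable == 0:
        return 0.0
    return 100.0 * used / usable


def report(path="/"):
    """Compare both figures for a mounted path (Unix only)."""
    st = os.statvfs(path)
    return (
        percent_used_raw(st.f_blocks, st.f_bfree),
        percent_used_df(st.f_blocks, st.f_bfree, st.f_bavail),
    )


# Hypothetical volume: 1000 blocks, 5% (50 blocks) reserved for root,
# and non-root users have filled everything they are allowed to.
raw = percent_used_raw(1000, 50)    # 950 used of 1000 total
df = percent_used_df(1000, 50, 0)   # 950 used of 950 usable
```

      With those made-up numbers the raw figure comes out at 95% while the df-style figure is 100%, which matches what we saw. Calling report("/") on one of your own servers should show how far apart the two numbers are there.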


      The latest NPM manual says to use net-snmp 5.5+ (which might fix this, but I haven't tested it), but RHEL 5 and Ubuntu 10.04 ship earlier versions. If, like me, you can't or won't upgrade net-snmp on your distributions, just be aware that SW might report your Linux/net-snmp volumes as having more free space than they really do (5% is the default reservation). We changed the alert to trigger at 90%, knowing that a volume on a Linux server could really be at 95% utilization by then.
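      If you want to work out an adjusted threshold for a different reservation, the conversion is simple. The sketch below is only an illustration (the function name is hypothetical), and it assumes the reserve is a fixed percentage of the total volume and that root has not written into it.

```python
def adjusted_threshold(df_threshold_pct, reserved_pct=5.0):
    """Translate a df-style 'percent full' threshold into the equivalent
    threshold for a poller that divides used space by TOTAL space.

    Assumes reserved_pct of the volume is held back for root and is unused.
    """
    if not 0.0 <= reserved_pct < 100.0:
        raise ValueError("reserved_pct must be in [0, 100)")
    return df_threshold_pct * (100.0 - reserved_pct) / 100.0


# A 95% df-style threshold on a volume with the default 5% reserve
threshold = adjusted_threshold(95.0)  # 95 * 0.95 = 90.25
```

      For the default 5% reserve, a df-style 95% works out to roughly 90% in the poller's terms, which is close to the value we settled on.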