I have nine Linux servers that I have upgraded to NET-SNMP 184.108.40.206. I was previously running version 5.1.x on these servers and was seeing CPU utilization of 100% on some systems. Now, after the upgrade, I am seeing memory utilization of 100% on two of the servers. Those servers are actually using around 80% of real memory and about 50% of swap space. I have read some posts saying that memory utilization is calculated from real memory and swap space combined, which is incorrect. I thought this was fixed in version 9.1 SP2, but apparently it isn't. Is anybody else experiencing this? Also, does anybody know which SNMP OID SolarWinds polls for memory utilization? That would help me narrow down whether the issue is with NET-SNMP or with NPM.
Has anyone found a solution to this one other than hiding the utilisation? We have other business groups that would like to monitor the performance of their servers. Does anyone have an OID to grab the cache amount so I can set up a formula for the business to view accurate usage? Thanks.
I have a case (81128) open for a similar issue which I think is related, but my issue is with Windows servers, not Linux. In my case the memory speedometer on Node Details is not reporting correctly. Here is their response to my case:
"When viewing some of my Windows servers at idle the numbers are very close to if not exact for the correlation between the gauge and table variables. However if I task the server with additional processes that use more memory, the discrepancy surfaces within the gauge and tends to be off by 20-45% from what is displayed in the table.
Here is the OID that is being used to poll all server types for Memory Used per RFC across all network devices.
Space Used (Disk Volumes Table) - HOST-RESOURCES-MIB hrMemorySize 1.3.6.1.2.1.25.2.2 calculation method - percentage (total size / in use)
Memory Used (Gauge) - HOST-RESOURCES-MIB hrSWRunPerfMem 1.3.6.1.2.1.25.5.1.1.2 calculation method - Add all values
Windows process utilization does not report an accurate statistic for its Mem Used, as it is possible using this calculation to exceed the 100% threshold of Mem Used by active processes as found in the Task Manager. You are adding up all of the memory used by all processes as displayed. Unix servers and networking devices do not have this issue as they report more accurate process statistics.
This is due to Microsoft combining physical and virtual memory as one figure for usage.
For 2003 servers, monitor Physical Memory under volumes instead. You should find it is more accurate.
I have no further information on this."
This is a direct quote from support.
Instead of changing the value behind the speedometer to reflect physical memory, SW simply blames it on Microsoft.
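For anyone trying to follow the arithmetic in that reply, here is a rough sketch of the two calculations as support describes them. It is not Orion's actual code. The function names are hypothetical and the sample numbers are made up; the only things taken from the quote are that the gauge adds up every hrSWRunPerfMem value and that the volumes figure is a used/total percentage. Both HOST-RESOURCES-MIB objects report KBytes.

```python
from typing import Iterable


def volume_percent(used_kb: int, total_kb: int) -> float:
    """Percentage used for one storage entry (used / total)."""
    if total_kb <= 0:
        raise ValueError("total_kb must be positive")
    return 100.0 * used_kb / total_kb


def gauge_percent(process_mem_kb: Iterable[int], hr_memory_size_kb: int) -> float:
    """Add every per-process hrSWRunPerfMem value and divide by hrMemorySize.

    Nothing caps the sum at the amount of physical RAM, so the result can
    exceed 100% when the per-process figures include more than physical memory.
    """
    if hr_memory_size_kb <= 0:
        raise ValueError("hr_memory_size_kb must be positive")
    return 100.0 * sum(process_mem_kb) / hr_memory_size_kb


if __name__ == "__main__":
    # Made-up sample: four processes on a server with 4 GB of RAM.
    processes_kb = [1_500_000, 1_200_000, 900_000, 800_000]
    hr_memory_size_kb = 4_194_304
    print(round(gauge_percent(processes_kb, hr_memory_size_kb), 1))
    # Made-up physical memory entry: 3 GB used out of 4 GB.
    print(round(volume_percent(3_145_728, hr_memory_size_kb), 1))
```

With these sample values the summed-process gauge goes over 100% while the physical memory volume sits at 75%, which is the kind of gap described in the case.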
We created 2 views:
Admin - Manage Views (Copy the Node Details)
We have Node Details - nonWindows & Node Details - Windows
On the Windows view we removed CPU Load and Memory Utilization. Then we added just CPU Load - Radial Gauge.
Once this is done, go to Admin - Views by Device Type.
Change the view for all non-Windows systems to the Node Details - nonWindows view and all Windows systems to Node Details - Windows.
This allows us to see the memory gauge on Cisco devices and not on Windows servers.
Hope this helps.
I have disabled resource monitoring on the Linux servers for the time being. As it turns out, this issue is a bit clouded. The OID that Orion is grabbing gives the memory figure including the memory buffers, etc. However, the top command and the system monitor within Linux show the utilized memory minus the buffers. If you use the free -m command in Linux, it gives you the actual used physical memory, which is consistent with what SolarWinds was reporting through NET-SNMP. So, to make a long story short: yes, the problem is real, but is it a real problem? The Linux servers appear to be running fine, and for now we don't have the resources to free up any extra RAM from the VM hosts. Therefore the best option for me is to ignore the problem temporarily.
I've been monitoring this thread because we have the same issue.
Although what SolarWinds is reporting is true for the physical memory used, we would like Orion to monitor what the "-/+ buffers/cache:" line of the "free" command reports.
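In case it helps anyone building their own formula, here is a minimal sketch of the difference between the two figures. It assumes you already have total, free, buffer, and cache values in kB from somewhere. On a net-snmp agent I believe these are exposed under UCD-SNMP-MIB as memTotalReal (.1.3.6.1.4.1.2021.4.5), memAvailReal (.1.3.6.1.4.1.2021.4.6), memBuffer (.1.3.6.1.4.1.2021.4.14), and memCached (.1.3.6.1.4.1.2021.4.15), but verify that with snmpwalk against your own agent. The function name and sample numbers are hypothetical.

```python
from typing import Tuple


def memory_usage_percent(total_kb: int, free_kb: int,
                         buffers_kb: int, cached_kb: int) -> Tuple[float, float]:
    """Return (raw_percent, adjusted_percent) of physical memory in use.

    raw_percent counts buffers and cache as used memory.
    adjusted_percent subtracts buffers and cache, in the spirit of the
    "-/+ buffers/cache:" line printed by older versions of `free`.
    """
    if total_kb <= 0:
        raise ValueError("total_kb must be positive")
    raw_used_kb = total_kb - free_kb
    adjusted_used_kb = raw_used_kb - buffers_kb - cached_kb
    return (100.0 * raw_used_kb / total_kb,
            100.0 * adjusted_used_kb / total_kb)


if __name__ == "__main__":
    # Made-up sample: 8,000,000 kB total, most of the "used" memory is cache.
    raw, adjusted = memory_usage_percent(
        total_kb=8_000_000, free_kb=1_600_000,
        buffers_kb=400_000, cached_kb=3_600_000)
    print(f"raw: {raw:.1f}%  minus buffers/cache: {adjusted:.1f}%")
```

With this sample the raw figure is 80% while the figure minus buffers and cache is 30%, which is the gap people in this thread are seeing between the gauge and what the applications are really using.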