Level 12

Linux Memory Utilization Monitors and You

Our client is wondering why the values in Solarwinds do not reflect the values found on their servers:

top - 17:58:42 up  1:44,  1 user,  load average: 0.03, 0.06, 0.06
Tasks:  94 total,   1 running,  93 sleeping,   0 stopped,   0 zombie
Cpu(s):  3.7%us,  0.2%sy,  0.0%ni, 94.8%id,  1.2%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   8174656k total,  1725996k used,  6448660k free,    39772k buffers
Swap:  8388600k total,        0k used,  8388600k free,   285544k cached

= ~21% Utilization

$ free -m
             total       used       free     shared    buffers     cached
Mem:          7983       1684       6298          0         39        278
-/+ buffers/cache:       1366       6616
Swap:         8191          0       8191

= ~21% Utilization

Solarwinds = 17% utilization

Figuring that this was just a case of SNMP sending slightly different data I tried a basic snmpwalk against memory:

$ snmpwalk -v 2c -c xxxxxxxxxx localhost Memory
UCD-SNMP-MIB::memIndex.0 = INTEGER: 0
UCD-SNMP-MIB::memErrorName.0 = STRING: swap
UCD-SNMP-MIB::memTotalSwap.0 = INTEGER: 8388600
UCD-SNMP-MIB::memAvailSwap.0 = INTEGER: 8388600
UCD-SNMP-MIB::memTotalReal.0 = INTEGER: 8174656
UCD-SNMP-MIB::memAvailReal.0 = INTEGER: 6446020
UCD-SNMP-MIB::memTotalFree.0 = INTEGER: 14834620
UCD-SNMP-MIB::memMinimumSwap.0 = INTEGER: 16000
UCD-SNMP-MIB::memShared.0 = INTEGER: 0
UCD-SNMP-MIB::memBuffer.0 = INTEGER: 42552
UCD-SNMP-MIB::memCached.0 = INTEGER: 285616
UCD-SNMP-MIB::memSwapError.0 = INTEGER: 0
UCD-SNMP-MIB::memSwapErrorMsg.0 = STRING:

1-(memAvailReal/memTotalReal) = ~21%
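
For anyone who wants to double-check the arithmetic, here is a minimal Python sketch of that calculation. The function name is made up for illustration, and the inputs are simply the memTotalReal and memAvailReal values from the snmpwalk output above (in kB).

```python
def naive_used_pct(mem_total_real_kb, mem_avail_real_kb):
    """Percent used when only completely free memory counts as available."""
    return 100.0 * (1 - mem_avail_real_kb / mem_total_real_kb)


# memTotalReal.0 and memAvailReal.0 from the snmpwalk output (kB).
pct = naive_used_pct(8174656, 6446020)
print(f"{pct:.1f}% used")
```

This lands at roughly 21%, matching what top and free show on the box.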

Even when I manually enter the OIDs I receive the same basic results. 

$ snmpwalk -v 2c -c xxxxxxx localhost .1.3.6.1.4.1.2021.4.5.0
UCD-SNMP-MIB::memTotalReal.0 = INTEGER: 8174656
$ snmpwalk -v 2c -c xxxxxxx localhost .1.3.6.1.4.1.2021.4.6.0
UCD-SNMP-MIB::memAvailReal.0 = INTEGER: 6400580

= ~21%

I'm having a hard time explaining to our client why Solarwinds is reporting utilization 4 percentage points lower than what they are seeing on the server itself.  That gap could be the difference between an alert being generated or not, so you can see where the dilemma is coming from.

We have seen similar situations on Linux disk monitors, but in that case we are able to see how the values are being pulled more or less directly from SNMP.  When we can fall back on Solarwinds using the SNMP reported data we are able to explain why utilization levels in Solarwinds do not reflect those on the server itself.  In this case we are really at a loss for an explanation.

Is Solarwinds using a different OID? If so, is there a way to change the OID that is being used to the ones I just showed above without resorting to a Universal Device Poller (UnDP) or something?  Can someone provide me with the formula that is being used to calculate Memory Used on the CPU Load & Memory Utilization module?

Thanks in advance,

Bob

1 Solution
Level 12

A system admin recently sent in a ticket which claims that Solarwinds is not reporting memory data properly for a Linux server.  Wanting to see what the issue was I decided to dive in and see what information I could find. 

This is what the admin wrote in detailing the issue:
Solarwinds is reporting 20% memory used but in reality almost all of the memory is used. There also do not appear to be any alerts sent on this issue.

$ free -m
             total       used       free     shared    buffers     cached
Mem:          7983       7580        402          0        666       5206
-/+ buffers/cache:       1707       6275
Swap:         8191          0       8191


It’s pretty easy to see where he was seeing high utilization:

Used/total = % Utilization
7580/7983 = 0.949 = ~95%

Since Solarwinds was only showing ~20% utilization this is obviously cause for concern… But let’s dig a bit deeper.

I decided to take a look at the same system that the admin was referencing and crunch some numbers of my own…

Solarwinds says there is 18% Memory Utilization…
 
Let’s take a look at free!
$ free -m
             total       used       free     shared    buffers     cached
Mem:          7983       2207       5775          0        306        494
-/+ buffers/cache:       1406       6576
Swap:         8191          0       8191


Hmmm… Using our above formula for memory utilization we get ~28%!  That’s a full 10 percentage points higher than what Solarwinds shows.

Let’s check top instead!

$ top -b | head -8
top - 09:34:01 up 1 day, 17:20,  1 user,  load average: 0.00, 0.00, 0.00
Tasks:  94 total,   1 running,  93 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.3%us,  0.1%sy,  0.0%ni, 99.1%id,  0.4%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   8174656k total,  2261088k used,  5913568k free,   313432k buffers
Swap:  8388600k total,        0k used,  8388600k free,   506568k cached


Uh oh… top is showing the same 28% utilization.  This is not looking good for us.  But, we all know that Solarwinds is just relying on SNMP data that is being returned by the system, right?

$ snmpwalk -v 2c -c xxxxxx localhost memory
UCD-SNMP-MIB::memIndex.0 = INTEGER: 0
UCD-SNMP-MIB::memErrorName.0 = STRING: swap
UCD-SNMP-MIB::memTotalSwap.0 = INTEGER: 8388600
UCD-SNMP-MIB::memAvailSwap.0 = INTEGER: 8388600
UCD-SNMP-MIB::memTotalReal.0 = INTEGER: 8174656
UCD-SNMP-MIB::memAvailReal.0 = INTEGER: 5913956
UCD-SNMP-MIB::memTotalFree.0 = INTEGER: 14302556
UCD-SNMP-MIB::memMinimumSwap.0 = INTEGER: 16000
UCD-SNMP-MIB::memShared.0 = INTEGER: 0
UCD-SNMP-MIB::memBuffer.0 = INTEGER: 313432
UCD-SNMP-MIB::memCached.0 = INTEGER: 506568
UCD-SNMP-MIB::memSwapError.0 = INTEGER: 0
UCD-SNMP-MIB::memSwapErrorMsg.0 = STRING:


Wait a second… There is no value for Memory Used… All SNMP is seeing are the Total and Available!?  Oh well, let’s see what happens when we calculate for utilization using these values…

1 - (memAvailReal / memTotalReal) = ~28%

Hmmmm… that’s not good, is it?  Solarwinds must be doing something different with that data.

We could try Solarwinds’ not so useful SNMPWalk.exe to try and get a huge dump of all the SNMP data that is returned, but that won’t really tell us what OIDs Solarwinds is really polling against…  Let’s take a break and sniff some packets.

Fire up Wireshark and set the display filter: ip.src == xxx.xxx.xxx.xxx

After we force a repoll on the device we get about 15 hits, but we’re only concerned with these four…

1.3.6.1.4.1.2021.4.5.0
1.3.6.1.4.1.2021.4.6.0
1.3.6.1.4.1.2021.4.14.0
1.3.6.1.4.1.2021.4.15.0

What could these possibly relate to?

1.3.6.1.4.1.2021.4.5.0 => memTotalReal
1.3.6.1.4.1.2021.4.6.0 => memAvailReal
1.3.6.1.4.1.2021.4.14.0 => memBuffer
1.3.6.1.4.1.2021.4.15.0 => memCached


So it looks like Solarwinds is pulling not only the data for Total and Available but also Buffer and Cache.  With a little bit of creative formulating we find this….

1 - ((memAvailReal + memBuffer + memCached) / memTotalReal) = ~18%

Wowee!  So it turns out that Solarwinds is actually counting Buffered and Cached memory as unutilized space. 
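
To make the difference between the two calculations concrete, here is a small Python sketch that runs both against the snmpwalk values above. Keep in mind this is the formula inferred from the packet capture, not something confirmed from Solarwinds' own code, and the function names are just illustrative.

```python
def used_pct_free_only(total_kb, avail_kb):
    """Utilization when only memAvailReal counts as unused."""
    return 100.0 * (1 - avail_kb / total_kb)


def used_pct_cache_as_free(total_kb, avail_kb, buffer_kb, cached_kb):
    """Utilization when buffers and cache are also treated as unused."""
    unused_kb = avail_kb + buffer_kb + cached_kb
    return 100.0 * (1 - unused_kb / total_kb)


# memTotalReal, memAvailReal, memBuffer, memCached from the snmpwalk (kB).
total, avail, buf, cached = 8174656, 5913956, 313432, 506568

print(f"free only:     {used_pct_free_only(total, avail):.1f}%")
print(f"cache as free: {used_pct_cache_as_free(total, avail, buf, cached):.1f}%")
```

The first number is what top and free report as "used"; the second is in line with the ~18% that Solarwinds displays.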

But you might be asking yourself (or have an admin asking you)… “Well, why is that number different from the Used values in the free and top commands?”

To answer that question let’s do a little dumpster errm… code diving!

After downloading the source files for procps utilities we find a few little nuggets of wisdom:

sysinfo.c

kb_main_used = kb_main_total - kb_main_free;


Oh… so the only reason that free is showing used is because some dude coded it that way?  Yep.

Digging a little deeper we find this gem:
free.c
printf(
    "-/+ buffers/cache: %10Lu %10Lu\n",
    S(kb_main_used - buffers_plus_cached),
    S(kb_main_free + buffers_plus_cached)
);


You might be wondering why this is important… remember the free command we ran back at the start?
$ free -m
             total       used       free     shared    buffers     cached
Mem:          7983       2207       5775          0        306        494
-/+ buffers/cache:       1406       6576
Swap:         8191          0       8191


Let’s try something creative…

“-/+ buffers/cache.used” / Mem.Total = ~18%

So it turns out the data was there for the admin to use the whole time… he was just looking at the wrong line.  When we run the numbers he originally reported through the same formula (1707 / 7983) we get ~21%, which is about on par with the ~20% he said Solarwinds was showing (and, without a screenshot of what he was actually seeing, well within the margin of error for guesstimations).

Without getting into the details, top and vmstat are also part of the same utility package.  vmstat is nice because it doesn’t do any utilization calculations on its own.

Let’s tie this all together and see where all of these different tools are pulling their memory data from…

$ cat /proc/meminfo
MemTotal:      8174656 kB
MemFree:       5914452 kB
Buffers:        313432 kB
Cached:         506568 kB
SwapCached:          0 kB
Active:        1803944 kB
Inactive:       333024 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:      8174656 kB
LowFree:       5914452 kB
SwapTotal:     8388600 kB
SwapFree:      8388600 kB
Dirty:              68 kB
Writeback:           0 kB
AnonPages:     1317016 kB
Mapped:          30708 kB
Slab:            90220 kB
PageTables:       6608 kB
NFS_Unstable:        0 kB
Bounce:              0 kB
CommitLimit:  12475928 kB
Committed_AS:  1784324 kB
VmallocTotal: 34359738367 kB
VmallocUsed:    267160 kB
VmallocChunk: 34359470839 kB
HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
Hugepagesize:     2048 kB


Wild stuff… /proc/meminfo contains just about the rawest human-readable data relating to memory utilization.  And guess what, no utilization information is included…  It is up to the end user to figure out how they want to calculate utilization.  Most system admins rely on a generic formula that simply subtracts free space from total space.  Solarwinds chose to count buffered and cached space as free space.
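
Since /proc/meminfo leaves the math up to you, here is a rough Python sketch that computes both flavours of utilization from it. The helper names are hypothetical, and the sample text is a trimmed copy of the output above. On a real box you would read the file itself.

```python
SAMPLE_MEMINFO = """\
MemTotal:      8174656 kB
MemFree:       5914452 kB
Buffers:        313432 kB
Cached:         506568 kB
HugePages_Total:     0
"""


def parse_meminfo(text):
    """Map each /proc/meminfo key to its integer value (kB where given)."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            values[key.strip()] = int(parts[0])
    return values


def utilization(info):
    """Return percent used under both conventions discussed in this post."""
    total = info["MemTotal"]
    used = total - info["MemFree"]
    reclaimable = info["Buffers"] + info["Cached"]
    return {
        "free_style": 100.0 * used / total,                     # top / free "used"
        "cache_as_free": 100.0 * (used - reclaimable) / total,  # buffers+cache unused
    }


for name, pct in utilization(parse_meminfo(SAMPLE_MEMINFO)).items():
    print(f"{name}: {pct:.1f}%")
```

To run it against a live system, replace SAMPLE_MEMINFO with the contents of open("/proc/meminfo").read().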

For a little more in depth discussion about why buffered and cached memory are counted as unutilized, visit this handy website: http://www.linuxatemyram.com/

Even if you don’t need it you should visit it just to see the awesome title pic.  And ya, apparently this issue is so common that some dude actually used it as the domain name.

I hope this helps you understand Solarwinds memory utilization monitors on Linux.  Please feel free to relay this information to any admins that get into a huff about Solarwinds showing a different value than they think is correct.  Of course, be sure that SNMP is returning the correct data, but now that you know the proper formula this can be easily calculated from their top or free information.

20 Replies
Level 17

In the interest of leveraging new NPM features, I've created a poller (not UnDP, but actual replacement poller) that you can use INSTEAD OF the built-in SolarWinds RAM poller that uses the "simpler" calculation. You can download it here: linuxatemyram

Enjoy!

Leon Adato | Head Geek
------
"Measure what is measurable,
and make measurable what is not so." - Galileo

Level 16

Hi adatole

By any chance, can you help me with AIX devices?

Level 17

Probably. The issue (if I recall correctly) is that getting SNMP and AIX to behave is tricky business and could require recompiling the SNMP agent if not the kernel itself. (yeah, we all love IBM soooo much!!).

I would start by doing an SNMP walk on the machine itself and outputting the results to a text file and reviewing it.

THEN I would do an SNMP walk from the polling engine (look for "snmpwalk.exe").

If your results differ, you simply have a permission issue that you can fix by changing snmpd.conf (or the AIX equivalent). If they are the same, and you find your memory counter, you can use the instructions above to go get it.

If you DON'T see the memory counter you want, then I created a couple of templates that leverage SAR to pull multiple stats for AIX LPARs. You can probably use those as a starting point for what you need.

AIX_LPAR-disk

AIX_LPAR_non-disk

Leon Adato | Head Geek
------
"Measure what is measurable,
and make measurable what is not so." - Galileo

Level 16

Thanks Leon.. in that case let me get the snmpd conf from the Linux team and I need to check if the memory component is showing or not.

I will update once I get the details...

Level 12

So I did some more investigating.  I ended up resorting to sniffing packets to find out what OIDs showed up during a repoll.

 

1.3.6.1.4.1.2021.4.5.0 => memTotalReal
Value (Integer32): 8174656

1.3.6.1.4.1.2021.4.6.0 => memAvailReal
Value (Integer32): 5961904

1.3.6.1.4.1.2021.4.14.0 => memBuffer
Value (Integer32): 264164

1.3.6.1.4.1.2021.4.15.0 => memCached
Value (Integer32): 472976

Taking this data I was able to approximate the 18% utilization shown by Solarwinds (Current utilization calculated in top was ~27%)...

1 - ((memAvailReal + memBuffer + memCached) / memTotalReal) = ~18%

Is this the correct formula? If so, why was this chosen?  I would prefer to have a formula that can be verified with simple system commands like top or free, but simple confirmation that this is the correct formula would be enough to explain to the administrator why he is seeing different results.

Level 12

It turns out that both SNMP and free are pulling data directly from /proc/meminfo which does not contain actual utilization levels.  free calculates used space by subtracting free memory from total memory.  That is explanation enough for me to give the admin.

I'd still like to know why it was decided to use the above formula for memory utilization in Solarwinds.

Thanks!

Level 16

Hi bobross

Since you mentioned that SW's memory calculation includes the buffer, cache and free components, is there any way that we can poll only the free component and alert based on that?

We have been recently questioned by our Linux admins that SW is showing very high utilization, whereas as per them the free% is very much available and they don't want us to consider the cache and buffer part in it....


Even as Syndrome (in The Incredibles) claimed to be "geeking out" over Mr. Incredible's creative hiding behind a dead super hero, from a robot searching for a live super hero, so too am I geeking out over your techno-sleuthing.


Nicely done!

Product Manager

bobross, you hit it right on the money, though I've never seen it so beautifully laid out and articulated. If you don't mind, I'd love to have one of our technical document writers steal pretty liberally from your posting to create a KB article that describes this in as much detail. This truly is very helpful information for new customers who wonder how memory usage is calculated in Orion NPM and SAM. Thank you for sharing!

Level 12

Feel free... but I doubt that a technical writer will be able to capture the adventurous tone 😄

Level 14

bobross,

We do what we can...sometimes we even go outside to see what that might be like...

Level 20

Outside... what's that?

 

Great post by the way...

Level 10

Great article! 

Level 15

I've been informed about this lovely posting by the SAM PM, AlterEgo. As the SAM tech writer, I will indeed write a KB on this next week, "borrowing" your hard work. And I was a fan of the real Bob Ross...I know how he expresses himself. (Unfortunately, tech-writing doesn't permit a great deal of flair. I'll be sure to beat the devil out of it though.) Thanks.

Level 14

bobross,

Thank you for this excellent piece of code archeology. I'll get with Dev to confirm and determine how best we can provide info more clearly and more in-depth for you and your users/customers.

Thanks,

Level 11

Hi,

This is something that's annoyed and plagued us for ages.

I'm looking at a Linux box now and the CPU and Memory stats show 322 MB used, 751 MB available, but then the volume info shows Physical memory at 964 MB used.

I was told once that the Memory statistics area was more about the amount of memory the running processes were using which wasn't the same info as reported under the Volume information. Linux allocates all physical memory to itself and then dishes it out after and this was what gave the different results.

Hopefully the Solarwinds devs can clarify once and for all the mystery of where all the memory stats come from and which one to trust when deciding whether a server should be upgraded.

Thanks

Jase

Level 12

From my above investigation I concluded (personally of course) that the standard CPU & Memory Utilization monitor is a better indicator of overall memory utilization.  I didn't have a chance to test the Physical Memory 'Volume' monitor, but from your numbers I would assume that it is counting Buffered and/or Cached Memory as "Used".

With that said, I would also like clarification as to whether or not I have the formula correct.  It is handy that we can now tell some of our more difficult system admins about how Net-SNMP is reporting the data, but a firm answer from SW about how SW itself is calculating the output would be great.

Level 7

Hi Bob, I realize this is digging up an old post, but I didn't really see anybody say if your formula was correct ( 1-(memfree+membuffer+memcache)/totalmem = memused% ). Since you dove into pretty great detail, even unleashing the wireshark, I am surprised you did not hit upon the answer (or maybe you did and I missed it).

The formula is correct, and nobody should ever trust what Linux reports as "free memory" since it is being very pedantic. True, that will be the amount of memory not used by anything, but it is not the amount that is available for use by an application if needed. You need to look at Linux's disk caching algorithms, which can be summarized as "Is there memory available? Yes? Well, let's cache some data!"

Memory that is not used for anything is not helping you at all. Linux will use this to cache data from disk. Since the memory was unused to begin with, there is no loss, and if you want to read one of those blocks again, then it is already in memory and will save you a disk read. If an application comes along and needs more memory than the system has 100% free, then it will flush cache pages and hand them over. The buffer value is similar, but contains data waiting for writes. Flushing these values will involve the disk, so it is more costly than just dumping the cache. These are both used by the kernel on an as-needed and as-available basis.

If a server is well and truly running low on memory, then there will be very little in cache if anything. I normally watch these three values and swap utilisation.

...and the reason I ended up on this page in the first place is related, but I did not find the answer here. I am trying to setup an snmp memory alert in solarwinds SAM without using an agent, but I don't see a way to use anything other than what SNMP reports...so I am not able to create anything that says 1-(memfree+membuffer+memcache)/totalmem = memused%. I guess I'll keep looking.

Edit: ...and I think I found what I was looking for under the NPM, which seems more useful for monitoring servers than the SAM.
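
For anyone else who lands here wanting to script this outside of Solarwinds, here is a rough Python sketch of the idea. It assumes the net-snmp snmpget command-line tool is installed on the machine running the script. The host, community string and threshold in the usage comment are placeholders, and the function names are made up. It is not how Solarwinds itself does the polling.

```python
import subprocess

# UCD-SNMP-MIB memory OIDs identified earlier in this thread.
OIDS = {
    "memTotalReal": "1.3.6.1.4.1.2021.4.5.0",
    "memAvailReal": "1.3.6.1.4.1.2021.4.6.0",
    "memBuffer": "1.3.6.1.4.1.2021.4.14.0",
    "memCached": "1.3.6.1.4.1.2021.4.15.0",
}


def snmpget_int(host, community, oid):
    """Fetch one integer via net-snmp's snmpget (value only, no units)."""
    result = subprocess.run(
        ["snmpget", "-v", "2c", "-c", community, "-OqvU", host, oid],
        check=True, capture_output=True, text=True,
    )
    return int(result.stdout.split()[0])


def used_pct(values):
    """1 - (avail + buffer + cached) / total, as a percentage."""
    unused = values["memAvailReal"] + values["memBuffer"] + values["memCached"]
    return 100.0 * (1 - unused / values["memTotalReal"])


def should_alert(values, threshold_pct):
    """True when utilization is at or above the given threshold."""
    return used_pct(values) >= threshold_pct


# Example wiring (placeholder host, community and threshold):
# values = {name: snmpget_int("localhost", "public", oid)
#           for name, oid in OIDS.items()}
# print(should_alert(values, 90.0))
```

Keeping the calculation separate from the SNMP fetch makes it easy to check the math against numbers copied from a packet capture or an snmpwalk.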

Level 8

Thanks for providing a ray of clue in this mob.  I've been working to educate the "but all my ram is gone" webdevs and similar folks for 25 years now - who now stare PAST the "-/+ buffers/cache" line and repeat the same bad math - and I see the work isn't yet done.
