
WMI vs SNMP

Leon (@adatole) has written a lot in the past about WMI vs SNMP, but I have a more technical question.

When I poll using WMI, I get much more detailed statistics than I get when I poll using SNMP.  I mean, literally, in a 30-minute window, WMI will reveal that CPU has been all over the map, from 10% up to 95%.  If I look at the same time period with SNMP, I get a very sedate report of "Steady 20%, no problems here".  Why doesn't the WMI report agree with SNMP?

Also, a lot of people have stated that there is no impact in moving from SNMP to WMI or vice versa (or, that is to say, the impact is minimal).  We run both in our organization, along with the Agent on our Domain Controllers.  Do we get any benefit by comparing them?  Chinese Proverb: "Man who own three clocks, never sure what time it is."  Comparing them is an exercise in futility.  Even when we set up a single workstation to poll using WMI and SNMP simultaneously, the stats don't agree. 

Oh, and one more thing...in this new age of close examination of Event Logs, we've noticed that a server being polled from Orion generates massive numbers of Login/Logoff events.  A machine which is running WMI and ONLY collecting stats every ten minutes will generate 4-8 events on a minute-by-minute basis.  (Our Storage team noticed this on one of their servers.)   For a machine which is monitoring a few Windows Services or a couple of Application monitors, the minute-by-minute count goes up to 15-20 per minute.

 

Does anyone know how the SNMP agent reports statistics back to Orion?  Because our customers are getting sick of being told that CPU usage is at 95%, logging in and seeing that it's not...then they point at SNMP and say, "It was only at 20% the whole time!"  Getting harder to be an Orion admin...

3 Replies

A lot of questions here!   

First off, the values put into both SNMP and WMI counters are generated by algorithms that tell the SNMP or WMI agent how to report the values back.   They could vary quite a bit.  SNMP could be averaged over a period of time, while WMI could be "now" values.

It used to be that WMI would impact your system quite a bit more than SNMP; I'm not sure if that's still the case.   Just from the way they operate, I'd expect WMI to be more of a load overall: SNMP is a pretty discrete system with its own authentication and such, while WMI has hooks into all sorts of things for authentication, like AD, and I think WMI has a lot more that you can monitor.

I think you're assuming that WMI goes out and monitors everything at once, i.e., if you're monitoring 20 things, it goes out every 10 minutes and polls those 20 things all at once.  However, it can spread these across the whole 10 minutes, doing 20 different authentications for the 20 different things, one every half minute or so.   There is no way to sync up the polls that I know of.

You also have to remember that Microsoft deprecated the SNMP agent as of Windows Server 2012, so it is no longer being actively developed.  The way that it reports values back could be out of date.   You're better off using the Agent or WMI.

Between the Agent and WMI, I'd expect the Agent to have less of an impact.  It's not subject to the whims of Windows, where it needs to authenticate and get logged, etc.   It should be much more efficient than WMI polling, not to mention more secure (i.e., it's encrypted) and, for the most part, not prone to authentication or policy changes.


@cnorborg wrote:

Between the Agent and WMI, I'd expect the Agent to have less of an impact.  It's not subject to the whims of Windows, where it needs to authenticate and get logged, etc.   It should be much more efficient than WMI polling, not to mention more secure (i.e., it's encrypted) and, for the most part, not prone to authentication or policy changes.


This statement makes sense on the surface, and I'm curious if you can expound on the theory that the agent is much more efficient than WMI polling. 

The only thing that I would add is that SAM's new capability to use WMI via WinRM, instead of WMI over RPC+DCOM, is already more efficient and more secure: it is always encrypted (regardless of port 5985 vs 5986) and requires only one port to be open, versus the legacy method, which used 3 low ports plus the full ephemeral range for RPC. To complete this enhancement, please vote for this feature request to add support for WinRM polling of [Windows] node statistics, in addition to the current ability to use WinRM for application monitoring. FR: https://thwack.solarwinds.com/t5/NPM-Feature-Requests/Support-Win-RM-for-node-details-polling-in-NPM...

Thanks!



@cnorborg wrote:

First off, the values put into both SNMP and WMI counters are generated by algorithms that tell the SNMP or WMI agent how to report the values back.   They could vary quite a bit.  SNMP could be averaged over a period of time, while WMI could be "now" values.

That was the thinking here, too. The problem was, I couldn't state it without proof. It wasn't until I went to http://www.net-snmp.org/docs/mibs/host.html that I found the answer, in the description of hrProcessorLoad:
"The average, over the last minute, of the percentage of time that this processor was not idle. Implementations may approximate this one minute smoothing period if necessary."

So the value I am getting in Orion when I poll via SNMP is a one-minute average, instead of being the "right now" value I get out of WMI.
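
To make the difference concrete, here is a minimal sketch in Python. It assumes a made-up CPU trace (a 10-second burst at 95% in every minute, 10% otherwise) and assumes the WMI-side reading behaves like a point-in-time sample, as described above. The function names and numbers are illustrative only; none of this comes from Orion or from either agent.

```python
from statistics import mean

def cpu_trace(seconds):
    """Hypothetical per-second CPU load: 95% for the first 10 seconds
    of every minute, 10% for the rest of the minute."""
    return [95 if (t % 60) < 10 else 10 for t in range(seconds)]

def instantaneous(trace, t):
    """Point-in-time reading: whatever the load is at second t."""
    return trace[t]

def one_minute_average(trace, t, window=60):
    """Smoothed reading: mean load over the trailing `window` seconds,
    in the spirit of the one-minute smoothing quoted above."""
    start = max(0, t - window + 1)
    return mean(trace[start:t + 1])

trace = cpu_trace(600)  # ten minutes of made-up data

# Two polls: one lands inside a burst, the other in a quiet stretch.
for t in (65, 100):
    print(t, instantaneous(trace, t), round(one_minute_average(trace, t), 1))
```

With this trace, the point-in-time reading swings between 10 and 95 depending on when the poll lands, while the trailing one-minute mean stays the same at every poll. Same machine, same workload, two very different graphs.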

 


@cnorborg wrote:

I think you're assuming that WMI goes out and monitors everything at once, i.e., if you're monitoring 20 things, it goes out every 10 minutes and polls those 20 things all at once.  However, it can spread these across the whole 10 minutes, doing 20 different authentications for the 20 different things, one every half minute or so.   There is no way to sync up the polls that I know of.


Well, I can do an analysis of the logon/logoff events coming from my poller, and I can show you the number of events in one-minute buckets. What I'm finding is that every minute I get (at minimum) four events (two logon/logoff pairs), then every five minutes I get between two and eight extra (depending on how many 5-minute tests I'm running), and then every 10 minutes I get another two to four on top of the one-minute and five-minute polls. (We poll for statistics every 10 minutes.) I guess I'm saying that I agree, it may not be polling them "all at once," but it certainly has good timing, lined up within each minute.
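
For anyone who wants to repeat the analysis, this is roughly how the one-minute bucketing can be done. It is only a sketch: the timestamps are invented sample data, and `bucket_by_minute` is a hypothetical helper, not something from Orion or the Windows event log tooling. How you export the logon/logoff event timestamps from the Security log is up to you.

```python
from collections import Counter
from datetime import datetime

def bucket_by_minute(timestamps):
    """Count events per calendar minute by truncating each timestamp
    to the start of its minute. Returns a dict in chronological order."""
    counts = Counter(ts.replace(second=0, microsecond=0) for ts in timestamps)
    return dict(sorted(counts.items()))

# Invented sample data: logon/logoff event times from a single poller.
events = (
    [datetime(2021, 1, 1, 12, 0, s) for s in (2, 2, 31, 31, 40, 40)]
    + [datetime(2021, 1, 1, 12, 1, s) for s in (2, 2, 31, 31)]
    + [datetime(2021, 1, 1, 12, 2, s) for s in (2, 2, 31, 31)]
)

for minute, count in bucket_by_minute(events).items():
    print(minute.strftime("%H:%M"), count)
```

Lining the per-minute counts up against the 1-, 5-, and 10-minute polling intervals is what shows the pattern described above.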

 


@cnorborg wrote:

You also have to remember that Microsoft deprecated the SNMP agent as of Windows Server 2012, so it is no longer being actively developed.  The way that it reports values back could be out of date.   You're better off using the Agent or WMI.

Between the Agent and WMI, I'd expect the Agent to have less of an impact.  It's not subject to the whims of Windows, where it needs to authenticate and get logged, etc.   It should be much more efficient than WMI polling, not to mention more secure (i.e., it's encrypted) and, for the most part, not prone to authentication or policy changes.


Meh, not crazy about the Agent.  It is buggy in its own right.  Often it'll just decide that in order to go on working, it wants you to reboot its host server.  (And it doesn't even have the courtesy to tell you it's gone on an extended coffee break.)

 

Thanks for answering. 
