Could SNMP Please Just Die Already?

The Simple Network Management Protocol – SNMP – was originally proposed by way of RFC 1067 back in 1988. While that doesn’t hold a candle to TCP/IP, which is approaching middle age, SNMP, at the grand old age of 27, seems to be having a bit of a mid-life crisis.

Reliability

One problem with SNMP is that it’s based on UDP, or as I like to call it, the “Unreliable Data Protocol”. This was advantageous once upon a time when memory and CPU were at a premium, because the low session overhead and lack of need to retransmit lost packets made it an ideal “best effort” management protocol. These days we don’t have the same constraints, so why do we continue to accept an unreliable protocol for our critical device management? If Telnet were based on UDP, how would you feel about it? I’m guessing you wouldn’t accept that, so why accept UDP for network management?

It’s kind of funny that we’re still using SNMP when you think about it.

Inefficient Design

Querying a device using SNMP is slow, iterative and inefficient. Things were improved slightly with SNMPv2, which added the GetBulk operation to fetch several values per request, but the basic mechanism behind SNMP continues to be clunky and slow. Worse, SNMPv3’s attempt at adding security to this clunkmeister remains laughably unused in most environments.

Alternatives

So if SNMP is the steaming pile of outdated monkey dung that I am suggesting it is, what should we use instead? I’m open to suggestions, because I know there must be something better than this.

SNMP Traps

For traps, rather than waiting for a UDP alert that may or may not get there, how about connecting over TCP and subscribing to an event stream? If you only want certain events, filter the stream (a smart filter would work on the server side, i.e. the device). This is pretty much what the Cisco UCS Manager does; the event stream reliably sends XML-formatted events to anybody that subscribes to them. This also means that you don’t get what I see so often in networks, which is a device wasting time sending traps to destinations that were decommissioned years ago. An event stream requires an active receiver to connect and subscribe, so events are only sent to current receivers.
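
To make the subscribe-and-filter idea concrete, here is a toy Python model of the device side. It is only a sketch: the `EventStream` class, its method names, and the event fields are all invented for illustration, the TCP transport is left out (in real life `deliver` would write to the subscriber's open connection), and none of it reflects the actual UCS Manager API.

```python
from typing import Callable, Dict, Tuple

Event = Dict[str, str]  # hypothetical event shape, e.g. {"severity": ..., "msg": ...}


class EventStream:
    """Toy device-side event stream: only active subscribers receive events."""

    def __init__(self) -> None:
        # subscription id -> (filter predicate, delivery callback)
        self._subscribers: Dict[int, Tuple[Callable[[Event], bool],
                                           Callable[[Event], None]]] = {}
        self._next_id = 0

    def subscribe(self, predicate: Callable[[Event], bool],
                  deliver: Callable[[Event], None]) -> int:
        """Register a receiver together with its server-side filter."""
        sub_id = self._next_id
        self._next_id += 1
        self._subscribers[sub_id] = (predicate, deliver)
        return sub_id

    def unsubscribe(self, sub_id: int) -> None:
        """Drop a receiver, e.g. when its connection closes."""
        self._subscribers.pop(sub_id, None)

    def publish(self, event: Event) -> int:
        """Send an event to every matching subscriber; return how many got it."""
        sent = 0
        for predicate, deliver in list(self._subscribers.values()):
            if predicate(event):  # filtering happens on the device
                deliver(event)
                sent += 1
        return sent
```

A receiver that only cares about critical events would call `stream.subscribe(lambda e: e["severity"] == "critical", handler)`. Once it unsubscribes (or its connection drops), `publish` has nobody to send to, which is exactly the decommissioned-trap-destination problem going away.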

SNMP Polling

The flavors of the month in the Software Defined Networking (SDN) world are things like NETCONF and REST APIs. These are TCP-based mechanisms by which to request and receive data formatted in JSON or XML, for example. You’ve spotted by now, I’m sure, that I’m network centric, but why not poll all devices this way? Rather than connecting each time a poll is requested, why not keep the connection alive and request data each time it’s needed? XML and JSON can seem rather inefficient as transfer mechanisms go, but if we’re running over HTTP, maybe we can use support for GZIP to keep things more efficient?
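
As a rough sketch of what that could look like, the Python below polls over a single kept-alive HTTP connection and asks for gzip. It uses only the standard library (`http.client`, `gzip`, `json`). The hostname and the `/api/interfaces` path in the usage comment are made up, and I'm assuming the device returns JSON and honors `Accept-Encoding: gzip`, which a given device may or may not do.

```python
import gzip
import http.client
import json
from typing import Any, Optional


def decode_body(body: bytes, content_encoding: Optional[str]) -> Any:
    """Decompress the response if the server gzipped it, then parse JSON."""
    if content_encoding and content_encoding.lower() == "gzip":
        body = gzip.decompress(body)
    return json.loads(body)


def poll(conn: http.client.HTTPConnection, path: str) -> Any:
    """Issue one GET on an already-open connection and return the parsed data."""
    conn.request("GET", path, headers={
        "Accept": "application/json",
        "Accept-Encoding": "gzip",  # ask for a compressed response
    })
    resp = conn.getresponse()
    body = resp.read()  # read fully so the connection can be reused
    return decode_body(body, resp.getheader("Content-Encoding"))


# Usage (hypothetical host and path, not run here):
#   conn = http.client.HTTPSConnection("switch1.example.net", timeout=10)
#   while True:
#       data = poll(conn, "/api/interfaces")   # same TCP/TLS session each time
#       time.sleep(60)
```

`http.client` speaks HTTP/1.1, so the connection stays open between requests unless the server closes it; the handshake cost is paid once rather than on every poll.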

Parallel connections could be used to separate polling so that a high-frequency poll request for a few specific polls runs on one connection while the lower-frequency “all ports” poll runs on another.
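
One way to picture this is a thread per polling class, each owning its own connection and its own interval. The sketch below is deliberately minimal and assumption-laden: `poll_loop` and `start_pollers` are names I made up, and the `fetch` callables stand in for whatever actually talks to the device.

```python
import threading
from typing import Callable, List, Tuple

# (fetch callable, interval in seconds, list that collects the results)
Job = Tuple[Callable[[], object], float, List[object]]


def poll_loop(fetch: Callable[[], object], interval: float,
              stop: threading.Event, results: List[object]) -> None:
    """Call fetch() immediately, then every `interval` seconds until stopped."""
    while not stop.is_set():
        results.append(fetch())
        stop.wait(interval)  # sleeps, but wakes early if stop is set


def start_pollers(jobs: List[Job], stop: threading.Event) -> List[threading.Thread]:
    """Start one thread per job so slow polls never delay fast ones."""
    threads = []
    for fetch, interval, results in jobs:
        t = threading.Thread(target=poll_loop,
                             args=(fetch, interval, stop, results),
                             daemon=True)
        t.start()
        threads.append(t)
    return threads
```

In practice each `fetch` would hold its own open connection, say one polling a handful of critical counters every few seconds and another walking all ports every five minutes. Setting the shared `stop` event shuts both down cleanly.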

Screen-Scraping

If we put aside fancy formatted data, have you ever wondered whether it would be easier to just SSH to a device and issue the appropriate show commands once a minute, or once every 5 minutes? SSH can also support compression, so bandwidth use can be kept down easily too. The data is not structured though, which is a huge disadvantage, so unless you’re polling a Junos device where you can request to see the response in XML format, you’ve got some work to do in order to turn that data into something useful. In principle though, this is a plausible if rather unwieldy solution.
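
To give a feel for the "work to do", here is a small Python sketch that runs a command over SSH with compression (`ssh -C`) and scrapes the text into a dictionary. Treat it as an illustration only: the `SAMPLE` output is an abridged, hand-written imitation of an IOS-style `show interfaces`, not a capture from a real device, and the regular expressions match that sample rather than any vendor's full output. The function names are mine.

```python
import re
import subprocess
from typing import Dict

# Illustrative, hand-written sample; real output has many more lines per port.
SAMPLE = """\
GigabitEthernet0/1 is up, line protocol is up
  5 minute input rate 1000 bits/sec, 2 packets/sec
  5 minute output rate 2000 bits/sec, 3 packets/sec
GigabitEthernet0/2 is administratively down, line protocol is down
  5 minute input rate 0 bits/sec, 0 packets/sec
  5 minute output rate 0 bits/sec, 0 packets/sec
"""

HEADER = re.compile(r"^(\S+) is (.+?), line protocol is (\w+)")
RATE = re.compile(r"^\s+5 minute (input|output) rate (\d+) bits/sec")


def run_show(host: str, command: str) -> str:
    """Run one command on the device via the system ssh client, compressed."""
    result = subprocess.run(["ssh", "-C", host, command],
                            capture_output=True, text=True,
                            check=True, timeout=30)
    return result.stdout


def parse_interfaces(text: str) -> Dict[str, dict]:
    """Turn the unstructured text into {interface name: {field: value}}."""
    interfaces: Dict[str, dict] = {}
    current = None
    for line in text.splitlines():
        m = HEADER.match(line)
        if m:  # a new interface section starts here
            current = {"status": m.group(2), "protocol": m.group(3)}
            interfaces[m.group(1)] = current
            continue
        m = RATE.match(line)
        if m and current is not None:
            current[m.group(1) + "_bps"] = int(m.group(2))
    return interfaces
```

Something like `parse_interfaces(run_show("router1.example.net", "show interfaces"))` (hostname invented) would then be the once-a-minute poll. The fragile part is obvious: every new OS release that tweaks the wording can break the regular expressions.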

And You?

How do you feel about SNMP; am I wrong? Are you using any alternatives – even for a small part of your assets – that you can share? Are you willing to use a different protocol? And why has SNMP continued to be the standard protocol when it’s evidently so lame? The fact that SNMP is ubiquitous at this point and is the de facto network management standard means that supplanting it with an alternative is a huge step, and any new standard is going to take a long time to establish itself; just look at IPv6.

  • There's no real replacement for SNMP right now, as too many people/enterprises are trying to DevOps their way around things, rather than getting down to the root of the problem and tackling it at its core. Now, there's nothing wrong with thinking outside the box to solve a problem, but a replacement for SNMP is so fundamental that it would have to be done properly.

    An RFC is a standard, and that standard has to suit every single application of its contents, and it has to work in the same way in every application. "Everyone" would need to come together to forge a new RFC for monitoring kit in the 21st century! I can't see it happening any time soon, however, as every one of the tech giants will be looking to build in some form of competitive edge, and would try to corrupt the nascent RFC to their own ends. This is why businesses should not be part of solving the problem.

    These things are best solved by career scientists, mainly because science is global and (other than funding) not bothered about the monetary gain made possible by the output of their craniums... It'll be the boffins who produce SNMPvNEXT.

  • I like working with SNMP to collect data. It's secure and easy.

    When I have a problem, it's not the hardware and it's not the software, it's the peopleware, hehe: someone changes the configuration and doesn't remember to update it in the Orion server.

  • That's for sure true. The last thing we need is a different solution for each vendor, or (maybe even worse) the introduction of proprietary protocols.

  • Here is an example of the load on a Palo Alto firewall under the different polling methods it supports: http://www.indeni.com/blog/using-the-api-with-palo-alto-networks-firewalls-the-cpu-perspective