
Monitoring In An Age Of Abstraction

Level 11

Practitioners in nearly every technology field are facing revolutionary changes in the way systems and networks are built. Change, by itself, really isn't all that interesting. Those among us who have been doing this a while will recognize that technological change is one of the few reliable constants. What is interesting, however, is how things are changing.

Architects, engineers, and the vendors that produce gear for them have fallen in love with the concept of abstraction. The abstraction floodgates opened following the meteoric rise of the virtual machine in enterprise networks. As an industry, we have watched the abstraction of the operating system from the hardware it lives on give us an amazing amount of flexibility in the way we deploy and manage our systems. Now that the industry has fully embraced the concept of abstraction, we aim to implement it everywhere.

Breaking away from monolithic stack architecture

If we take a look at systems specifically, it used to be that the hardware, the operating system, and the application all existed as one logical entity.  If it was a large application, we might have components of the application split out across multiple hardware/OS combos, but generally speaking the stack was a unit. That single unit was something we could easily recognize and monitor as a whole. SNMP, while it has its limitations, has done a decent job of allowing operators to query the state of everything in that single stack.

Virtualization changed the game a bit as we decoupled the OS/application from the hardware. While it may not have been the most efficient way of doing it, we could still monitor the VM like we used to when it was coupled with the hardware. This is because we hadn't really changed the architecture. Abstraction gave us some significant flexibility, but our applications still relied on the same components, arranged in a similar pattern to the bare-metal stacks we started with. The difference is that we now had two distinct units where information collection was required: the hardware remained a monitoring target as it always had, and the OS/application became a second one. It took a little more configuration, but it didn't change the nature of the way we monitored the systems.

Cloud architecture changes everything

Then came the concept of cloud infrastructure. With it, developers began embracing the elastic nature of the cloud and started building their products to take advantage of it. Rather than sizing an application stack based off of guesstimates of the anticipated peak load, it can now be sized minimally and scaled out horizontally when needed by adding additional instances. Previously, just a handful of systems would have handled peak loads. Now those numbers could be dozens, or even hundreds of dynamically built systems scaled out based on demand. As the industry moves in this direction, our traditional means of monitoring simply do not provide enough information to let us know if our application is performing as expected.
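To make the scale-out idea concrete, here is a minimal sketch of the sizing arithmetic an elastic deployment might use. The function name, the requests-per-second units, and the minimum and maximum bounds are illustrative assumptions, not taken from any particular cloud platform.

```python
import math


def instances_needed(current_load_rps: float,
                     per_instance_capacity_rps: float,
                     min_instances: int = 2,
                     max_instances: int = 200) -> int:
    """Return how many instances are needed to serve the current load.

    All names and default bounds here are hypothetical; a real autoscaler
    would also smooth the load signal and apply cool-down periods.
    """
    if per_instance_capacity_rps <= 0:
        raise ValueError("per-instance capacity must be positive")
    # Round up: a fractional instance still means one more whole instance.
    required = math.ceil(current_load_rps / per_instance_capacity_rps)
    # Clamp to the configured floor and ceiling.
    return max(min_instances, min(max_instances, required))
```

The point for monitoring is that the return value changes with demand, so the set of systems to watch is no longer a fixed list.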

The networking story is similar in a lot of ways. While networking has generally been resistant to change over the past couple of decades, the need for dynamic/elastic infrastructure is forcing networks to take several evolutionary steps rather quickly.  In order to support the cloud models that application developers have embraced, the networks of tomorrow will be built with application awareness, self-programmability, and moment-in-time best path selection as core components.

Much like in the systems world, abstraction is one of the primary keys to achieving this flexibility. Whether the new model of networks is built upon new protocols or overlays of existing infrastructure, the traditional way of statically configuring networks is coming to an end. Rather than having statically assigned primary, secondary, and tertiary paths, networks will balance traffic based on business policy, link performance, and application awareness. Fault awareness will be built in, and traffic flows will be dynamically routed around trouble points in the network. Knowing the status of the individual links will become less important, much like the status of the physical hardware that applications run on. Understanding network performance will require understanding the actual performance of the packet flows that are utilizing the infrastructure.
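As a rough illustration of policy-driven path selection, the sketch below picks a path from measured link performance instead of a fixed primary/secondary order. The PathStats fields, the thresholds, and the tie-breaking rule are all assumptions made for this example; real controllers use their own metrics and policy models.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PathStats:
    """Measured state of one candidate path (hypothetical fields)."""
    name: str
    latency_ms: float
    loss_pct: float
    healthy: bool = True


def choose_path(paths: List[PathStats],
                max_latency_ms: float,
                max_loss_pct: float) -> Optional[PathStats]:
    """Pick a path for an application given its policy limits.

    Unhealthy paths are skipped (routing around trouble points). Among
    paths that satisfy the policy, prefer the lowest loss, then the
    lowest latency. If none satisfies it, fall back to the best
    healthy path rather than dropping the traffic.
    """
    healthy = [p for p in paths if p.healthy]
    if not healthy:
        return None
    compliant = [p for p in healthy
                 if p.latency_ms <= max_latency_ms
                 and p.loss_pct <= max_loss_pct]
    candidates = compliant or healthy
    return min(candidates, key=lambda p: (p.loss_pct, p.latency_ms))
```

Because the chosen path can change from one moment to the next, the up/down state of any single link says little about what a given flow actually experienced.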

At the heart of the matter, the end goal appears to be ephemeral state in both network path selection and systems architecture.

So how does this change monitoring?

Abstraction inherently makes application and network performance harder to analyze. In the past, we could monitor hardware state, network link performance, CPU, memory, disk latency, logs, etc. and come up with a fairly accurate picture of what was going on with the applications using those resources. Distributed architectures negate the correlation between a single piece of underlying infrastructure and the applications that use it.  Instead, synthetic application transactions and real-time performance data will need to be used to determine what application performance really looks like. Telemetry is a necessary component for monitoring next generation system and network architectures.
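A synthetic application transaction can be sketched in a few lines. This is only an illustration: the function names, the "ok/degraded/down" labels, the one-second threshold, and the example URL are assumptions, not part of any specific monitoring product.

```python
import statistics
import time
import urllib.request
from typing import Callable, Dict


def run_synthetic_check(transaction: Callable[[], bool],
                        attempts: int = 5,
                        slow_threshold_s: float = 1.0) -> Dict[str, object]:
    """Run a scripted transaction several times and summarize the result.

    `transaction` returns True on success; an exception counts as a failure.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    durations = []
    failures = 0
    for _ in range(attempts):
        start = time.perf_counter()
        try:
            ok = transaction()
        except Exception:
            ok = False
        elapsed = time.perf_counter() - start
        if ok:
            durations.append(elapsed)
        else:
            failures += 1
    median_s = statistics.median(durations) if durations else None
    if failures == attempts:
        status = "down"
    elif failures > 0 or median_s > slow_threshold_s:
        status = "degraded"
    else:
        status = "ok"
    return {"status": status, "failures": failures, "median_s": median_s}


def http_transaction() -> bool:
    """Example transaction: fetch a health endpoint (placeholder URL)."""
    url = "https://app.example.com/health"  # hypothetical endpoint
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.status == 200
```

Calling run_synthetic_check(http_transaction) exercises the application the way a user would, regardless of which instances or network paths happen to serve the request.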

Does this mean that SNMP is going away?

While many practitioners wouldn't exactly shed a tear if they never needed to touch SNMP again, the answer is no. We will still need to monitor the underlying infrastructure even though it no longer gives us the holistic view that it once did. The widespread use of SNMP as the mechanism for monitoring infrastructure means it will remain a component of monitoring strategies for some time to come. Next generation monitoring systems will need to integrate the traditional SNMP methodologies with deeper levels of real-time application testing and awareness to ensure operators can remain aware of the environments they are responsible for managing.

13 Comments
Level 15

I like this article

Level 10

Yes, indeed. As we progress and set the bar higher and higher in terms of innovation and technological development to satisfy our insatiable hunger for endless improvement and drive for business success and organizational profit, change in IT is becoming more radical and revolutionary than before. Given the recent trend, I completely agree with the author: monitoring in the coming days will be more complex and sophisticated than before, so new means of monitoring are essential for folks like us to cope with the change and to thrive in our field of specialization.

MVP

I don't see SNMP going away...it may need to evolve some.

The fun part is to now utilize correlation of metrics, events, and status of processes to provide an overall view of a service that is distributed across multiple servers.

Level 15

I agree. SNMP is as resilient as IPv4. I've heard talk for years that SNMP will be "phased out" and yet it still sticks around. I agree with you, SNMP needs to evolve to stay alive. Given that it is so rudimentary now, I am not sure myself how it evolves.

"The Cloud" (can you say "A.S.P."?  I knew you could!) DOES seem to decrease the concrete and increase the abstract.  And as resources expand and decrease on demand in the VM world, there should be a corresponding automatic increase and decrease in snmp monitoring.

Call it snmp-4 or 5 or snmp-9000, the label matters only for reference.  But as VM or Load Balancer resources ebb and flow, so should hooks into monitoring systems that increase and decrease monitoring automatically, too.

Imagine a monitoring solution and a VM solution that are intelligent enough to work together to automatically add monitoring interfaces and set thresholds, that can even automatically build and deploy and assign additional Solarwinds polling solutions to the newly expanded resources.   At 10 a.m. you have X number of VM hosts and clients being monitored by one Orion NPM poller.  At 3 p.m. the number of VM hosts and clients has automatically grown significantly in response to demand.  But the VM growth process has automatically spawned a new NPM poller, assigned the appropriate new monitoring interfaces on the poller for each new VM host/client, and joined the new poller to NPM's main instance. 

Now THIS is one fun face for intelligent monitoring!  Solarwinds, is this on the roadmap? I hope so!

Level 11

SNMP is far too engrained to be left in the dust any time soon, but I'm not sure about SNMP evolving in and of itself. I believe we will augment SNMP with broader telemetry data, specifically crafted to monitor the end-to-end user experience rather than the state of any particular portion of the application. I imagine this deeper level of information is going to come from the collection, logging, and analysis of flow data from different points in the network. Correlating this data within itself, as well as with the traditional SNMP information we have now, is going to be a really significant challenge moving forward.

Level 11

I used to hate the term "cloud" as all it meant was that an organization was running their resources on other people's equipment. Over time though it has started to take on a more definite and unique meaning, to me at least. I see cloud architecture as something that incorporates on-demand provisioning/scalability without manual intervention from an operator/engineer. It's this style of elastic infrastructure that is going to challenge our traditional mindset on how to monitor applications, not necessarily whether it is hosted on or off site.

I do agree that automating your monitoring through hooks from the LBs or VM platforms could be a way to keep up with an elastic-style infrastructure.  Interesting thought.

Product Manager

Elasticity, different pricing models (e.g. consumption-based), and self-provisioning seem to be the keys. Otherwise it's just the same stuff, different shovel.

I do like the idea of automatically increasing/decreasing monitoring. Some cloud services, like Azure, have hooks to automatically provision things, but it's usually used for security (e.g. if you deploy a web server, automatically deploy a web application firewall). A lot of people put hooks in their deployment automation side to make sure standard stuff gets out there, but it would be cool to be able to use your monitoring system to a) tell you what's not being monitored but is new on the network (this isn't really a new problem, but is made worse by elasticity) and ideally b) automatically monitor it. It sucks to get woken up by some problem you didn't know you had because it was out of scope of monitoring.

Imagine a firewall or IDS/IPS system that intelligently detected new flows and configured monitors for them in Orion, while at the same time building rules to pass those flows, but first sending the rules to Change Management for approval before deploying them.

Automated monitoring will happen, one day, but I'd always prefer to have at least one hand on the wheel, as it were. We here all know how pivotal reliable monitoring is, and having ED-209 doing my monitoring would leave me somewhat uneasy.

Level 14

Once again, it comes down to the tools in the toolbox.  Choose the tools that give you information you need.  SNMP still provides valuable information to network monitors and as long as it does, it will remain a tool in my toolbox.

Level 20

SNMP isn't going anywhere but... the one thing I DO see is SNMPv3 becoming more mandatory... which if you've used it can sometimes be a real pain in the @$$ if you get my drift... I guess the more we use it the better we'll be at it. I'm sure NCM can help get those configs straight across the enterprise, which should help!