Fiefdoms, Silos, and Plumbing

SomeClown over 7 years ago 4 minute read time

In the world of information technology, those of us who are tasked with providing and maintaining the networks, applications, storage, and services for the rest of the organization are increasingly under pressure to provide more accurate, or at least more granular, service level guarantees. The standard quality of service (QoS) mechanisms we have used in the past are becoming more and more inadequate to properly handle the disparate types of traffic we are seeing on the wire today. Continuing to provide services in a guaranteed, deliberate, measurable, and ultimately very accurate manner is going to require different tools and additional focus on increasingly all-encompassing ecosystems. Simply put: our insular fiefdoms are not going to cut it in the future. So, what are we going to do about the problem? What can we do to increase our end-to-end visibility, tracking, and service level guarantees?

One of the first things we ought to do is make certain that we have, at the very least, implemented some baseline quality of service policies. Things like separating video, voice, regular data, high priority data, control plane data, etc., seem like the kind of thing that should be a given, but every day I am surprised by another network that has very poorly deployed what QoS they do have. Often I see video and voice in the same class, and I see no class for control plane traffic; my guess is no policing either, but that is another topic for another day. If we cannot succeed at the basics, we most certainly should not be attempting anything more grandiose until we can fix the problems of today.
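As a rough illustration of what checking those basics might look like, here is a minimal Python sketch that audits a QoS policy for the two problems described above: voice and video sharing a class, and missing classes such as control plane. The class names, the `BASELINE` table, and the `audit_policy` function are all hypothetical; the DSCP values follow common RFC 4594 style guidance and are an assumption, not a description of any particular network.

```python
# Hypothetical baseline: traffic type -> (class name, DSCP value).
# DSCP values follow common RFC 4594 style guidance; class names are made up.
BASELINE = {
    "voice": ("VOICE", 46),                  # EF
    "video": ("VIDEO", 34),                  # AF41
    "control_plane": ("CONTROL", 48),        # CS6
    "priority_data": ("PRIORITY-DATA", 18),  # AF21
    "data": ("BEST-EFFORT", 0),              # default forwarding
}


def audit_policy(policy):
    """Return findings for a policy mapping traffic type -> class name."""
    findings = []
    # Every traffic type in the baseline should have a class of its own.
    for traffic in BASELINE:
        if traffic not in policy:
            findings.append(f"no class for {traffic}")
    # Voice and video should not be lumped into the same class.
    voice = policy.get("voice")
    video = policy.get("video")
    if voice is not None and voice == video:
        findings.append("voice and video share a class")
    return findings


# A deployment like the ones described above: one real-time class, no control plane class.
deployed = {"voice": "REALTIME", "video": "REALTIME", "data": "BEST-EFFORT"}

for finding in audit_policy(deployed):
    print(finding)
```

A policy that gives each baseline traffic type its own class produces an empty list of findings. Policing is left out of the sketch on purpose.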

I have written repeatedly on the need to break down silos in IT, to get away from the artificial construct that says one group of people controls only one area of the network and has only limited interaction with other teams. Many times, as a matter of fact, I see silos so deep and ingrained that the different departments do not actually converge, from a leadership perspective, until the CIO. This unnecessarily obfuscates the full network picture from pretty much everyone. Server teams know what they have and control, storage teams are the same, and on down the line it goes, with nobody really having an overall picture of things until you get far enough into the upper management layer that the fixes become political and die by the proverbial committee.

In order to truly succeed at providing visibility to the network, we need to merge the traditional tools, services, and methodologies we have always used, with the knowledge and tools from other teams. Things like application visibility, hooks into virtualized servers, storage monitoring, wireless and security and everything in between need to be viewed as one cohesive structure on which service guarantees may be given. We need to stop looking at individual pieces, applying policy in a vacuum, and calling it good. When we do this it is most certainly not good or good enough.

We really don’t need QoS; we need full application visibility from start to finish. Do we care about the plumbing systems we use day to day? Not really. We assume they work effectively, and we do not spend a lot of time contemplating the mechanisms and methodologies of said plumbing. In the same way, nobody for whom the network is merely a transport service cares about the inner workings of that system; they just want it to work. The core function of the network is to provide a service to a user. That service needs to work all of the time, and it needs to work as quickly as it is designed to work. It does not matter to users who is to blame when their particular application quits working, slows down, or otherwise exhibits unpleasant and undesired tendencies. They just know that somewhere in IT, someone has fallen down on the job and abdicated one of their core responsibilities: making things work.

I would suggest that one of the things we should certainly be looking at and implementing is a monitoring solution that can tell us not only what the heck the network routers, switches, firewalls, etc., are doing at any given time, but also how applications, their use of storage, their underlying hardware (virtual, bare metal, containers), and their performance are behaving. Yes, I want to know what the core movers and shakers of the underlying transport infrastructure are doing, but I also want visibility into how my applications are moving over that structure, and how that data becomes quantifiable as it relates to the end user experience.
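To make the idea of one combined view a little more concrete, here is a minimal Python sketch that takes latency measurements from several teams' layers and rolls them up per application. Everything in it is an assumption for illustration: the `Sample` record, the layer names, the `end_to_end` function, and the numbers are invented, and a real monitoring platform would collect and correlate far more than this.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    app: str           # application the measurement belongs to
    layer: str         # e.g. "network", "storage", "compute"
    latency_ms: float  # latency this layer contributed to a request


def end_to_end(samples):
    """Group samples by application; report total latency and the slowest layer."""
    per_app = {}
    for s in samples:
        layers = per_app.setdefault(s.app, {})
        layers[s.layer] = layers.get(s.layer, 0.0) + s.latency_ms
    return {
        app: {
            "total_ms": sum(layers.values()),
            "slowest_layer": max(layers, key=layers.get),
        }
        for app, layers in per_app.items()
    }


# Invented measurements, as if each silo had reported its own piece.
samples = [
    Sample("billing", "network", 4.0),
    Sample("billing", "storage", 38.0),
    Sample("billing", "compute", 9.0),
    Sample("intranet", "network", 21.0),
    Sample("intranet", "compute", 6.0),
]

report = end_to_end(samples)
for app, summary in report.items():
    print(app, summary)
```

Looked at per silo, each team in this invented data only sees its own number. Rolled up per application, the same data shows that the billing slowdown sits in storage while the intranet slowdown sits in the network, which is the kind of answer the user experience actually depends on.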

If we can get to a place where this is the normal state of affairs rather than the exception, using an application framework bringing everything together, we’ll be one step closer to knowing what the heck else to fix in order to support our user base. You can’t fix what you don’t know is a problem, and if all groups are in silos, monitoring nothing but their fiefdoms, there really is not an effective way to design a holistic, network-wide solution to the quality of service challenges we face day to day. We will simply do what we have always done and deploy small solutions, in a small way, to larger problems, then spend most of our time tossing crap over the fence to another group with an “it’s not the network” thrown in as well. It’s not my fault, it must be yours. And at the end of the day, the users just want to know why the plumbing isn’t working and the toilets are all backed up.

Top Comments

Jfrazier over 7 years ago +1

Your suggestion is a good one. Solutions do exist and have for a while now. The challenge is the cost, software and hardware, complexity, and talent pool to setup and administer a full end to end solution…
tinmann0715 over 7 years ago +1

I read this blog post and I can't help but think of DevOps and SDN and the momentum that they bring with them. I, for one, have a love/hate relationship with silos. In my 20+ years of IT (almost exclusively…
bsciencefiction.tv over 7 years ago +1

We have long advocated that the single Pane of Glass could unify the kingdom.

byrona over 7 years ago

We broke down the silos years ago and consolidated all of the teams on one toolset under one manager, and it made a huge difference.
One thing to realize is that at the end of the day all anybody really cares about are the applications; everything else is just a means to delivering those applications. Because of this, everybody involved needs to come together to work toward that goal.
ecklerwr1 over 7 years ago

And next VXLAN, NSX, and OpenStack are going to throw a huge new wrench in the gears!!! Who's responsible for what is changing... the old storage team and network team and security team are all having the water muddied by virtualization of everything!
mtgilmore1 over 7 years ago

One screen One view
akiebach over 7 years ago

A lot of that comes down to team-building. Everyone has their own mannerisms and eccentricities when it comes to communication, so the more you have your teams engaging and interacting with one another, the easier larger-scale projects are going to be. And you're right, a lot of that is going to fall on management's shoulders, as it's their responsibility to know their team and build those bridges of trust.
Jfrazier over 7 years ago

Ah...but is it tempered?