Tips for Simplifying Network Management

Level 10

Simplifying network management is a challenging task for any organization, especially those that have chosen a best-of-breed route and run a mix of vendors. I ask my customers to strive for the following when looking to improve their network management and gain efficiency.

  1. Strive for a Single Source of Truth: As an administrator, you should have a single place to manage information about a specific set of users or devices (e.g. Active Directory as the only user database). Everything else on the network should reference that source for its specific information. Running multiple domains or maintaining a mix of LDAP and RADIUS users complicates authentication and arguably makes your organization less secure, because maintaining multiple sources is burdensome. Invest in doing one right and exclusively.
  2. Standardization: A tremendous amount of time can be saved by eliminating one-off configurations, sites, situations, etc. An often overlooked part of these savings is consulting and contractor costs: the easier it is for an internal team to quickly identify a location, IDF, device, etc., the easier it will be for your hired guns as well. A system should be in place for IP address schemes, VLAN numbering, naming conventions, low-voltage cabling, switch port usage, redundancy, etc.
  3. Configuration Management: Creating a plan for standardization is one thing; ensuring it gets executed is tougher. Numerous tools allow for template-based or script-based configuration. If your organization is going to take the time to standardize the network, it is critical to follow through on the configuration side. DevOps environments may turn to products like Chef, Puppet, or Ansible to help with this sort of management.
  4. Auditing and Accountability: Being proactive about policing these efforts is important, and that requires some form of accountability. Change control meetings should ensure that changes are well thought out and meet the design standards. Safeguards should ensure that the right people are making the changes and that each change can be traced back to a specific person (no shared “admin” or “root” accounts!), so that all of the hard work put in to this point is actually maintained. New hires should be trained and indoctrinated in the system to ensure that they follow the process.
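
To make the template-based configuration idea from item 3 concrete, here is a minimal sketch in Python using only the standard library. It is an illustration, not the method of any particular product: the description format, the VLAN-per-role numbering, the function name `render_access_port`, and the config syntax are all hypothetical assumptions chosen for the example.

```python
from string import Template

# Hypothetical standard: every access port is rendered from one shared template.
# The config syntax is illustrative and not tied to any specific vendor.
ACCESS_PORT_TEMPLATE = Template(
    "interface $interface\n"
    " description $site-$idf-$role\n"
    " switchport access vlan $vlan\n"
)

# Hypothetical VLAN numbering scheme: one fixed VLAN ID per role at every site.
VLAN_BY_ROLE = {"user": 10, "voice": 20, "printer": 30}


def render_access_port(site: str, idf: str, role: str, interface: str) -> str:
    """Render a port config, rejecting roles that are not in the standard."""
    if role not in VLAN_BY_ROLE:
        raise ValueError(f"role {role!r} is not defined in the VLAN standard")
    return ACCESS_PORT_TEMPLATE.substitute(
        interface=interface, site=site, idf=idf, role=role, vlan=VLAN_BY_ROLE[role]
    )


if __name__ == "__main__":
    print(render_access_port("NYC01", "IDF2", "voice", "Gi1/0/5"))
```

Because the only way to produce a port configuration is through the template and the role table, a one-off VLAN or an ad hoc description cannot slip in without someone first changing the standard itself, which is exactly the kind of change a change control meeting can review.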

Following these steps will simplify the network, increase visibility, speed up troubleshooting, and even help security. What steps have you taken in your environment to simplify network management? We’d love to hear about them!

12 Comments
MVP

Documentation... everything you mentioned must be documented, and documented well, so that it is 3 AM friendly and serves as your tool for auditing, especially when there are exceptions.

Jfrazier, could one argue that the requirement for documentation is addressed by adhering to #1, partly #2, and #3? (In theory, of course.) Like you, I am a big fan of documentation, but I have abandoned writing things down because written documents quickly become obsolete. I now prefer digital, "on demand" documentation.

MVP

It is the documentation that ties it all together.

It defines the single source of truth and how things are standardized; when something is not standardized, it explains why, how the exception fits in, and how it is mitigated. Configuration management plans and how they work also have to be documented, as do scripts, issues, and workarounds. This all then feeds into auditing and accountability. If it is not documented, do you truly have procedures in place to mitigate X?

I would also add: minimize and reduce wherever possible. I still see many instances where segmentation takes place at the physical layer (which I find baffling). sv_neal, where do Monitoring and Capacity Management fall in these 4 areas?

I like these ideas.  But they might only remain in the theoretical / best-practice realm for some I.T. staff, who accept them as theoretically ideal but unattainable in practice.

Worse, some folks might cast aside good standards or recommendations because they don't have the experience or environmental scope and scale to understand the benefits of the standards.  In these cases some technical people may not even go through the effort of building standards, or of discovering your existing recommendations and following them.

I find it's never time wasted when I show staff the rationale for our standards. Examples I share with my team include:

  • Naming conventions that can be leveraged for searching and scripting and reporting with NPM and NCM, including:
    • Sites
    • Interfaces
    • Port-channels
    • Custom fields
  • Practices that maintain high uptime, including
    • Never plugging all the devices for one department into the same switch blade--if you have two blades available, and all devices are plugged into one blade, and that blade fails, you've participated in a department outage that could have been lessened.
    • Never plugging all critical devices into the same switch if a second switch in the stack is available--same reason as above
    • Never plugging all critical devices into adjacent switch ports when other ports in the same switch are available. This is because when ASICs and OctaPIDs (Nortel world) fail, they take out groups of adjacent switch ports simultaneously. It's awkward, but I've seen ports 1-6 fail to pass traffic while ports 7+ are working just fine. Spreading connections throughout the switch reduces the problems that arise when someone plugs all the devices from one department into adjacent ports.
  • Cable management:
    • Never bundle power cables with Ethernet (Induction from the power cables can generate noise and electrical signal on adjacent Ethernet cables, which is a pain to troubleshoot)
    • Never run cables across a switch's front except at right angles to the horizontal switch (so when a switch or blade fails, you only have to unplug those 48 ports, and not unplug patches going to other blades or switches that are strung across the failed blade that must be removed and replaced).
  • Follow an equipment life cycle
    • Proactively replacing switches & routers based on MTBFs and experience results in a huge decrease in unscheduled downtime during the day
    • Enables us to do a much better job budgeting funds and allocating support hours.
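
As a rough illustration of how a naming convention can be leveraged for searching and scripting, here is a minimal Python sketch. The convention itself (`<SITE>-<IDF>-<ROLE><NN>`, e.g. `NYC01-IDF2-SW03`), the role codes, and the function names are hypothetical assumptions made up for this example; it does not call NPM, NCM, or any other product.

```python
import re
from collections import defaultdict

# Hypothetical convention: <SITE>-<IDF>-<ROLE><NN>, e.g. "NYC01-IDF2-SW03".
NAME_PATTERN = re.compile(
    r"^(?P<site>[A-Z]{3}\d{2})-(?P<idf>(?:MDF|IDF)\d+)-(?P<role>SW|RT|AP)(?P<unit>\d{2})$"
)


def parse_name(hostname: str):
    """Return the name's fields as a dict, or None if it breaks the convention."""
    match = NAME_PATTERN.match(hostname)
    return match.groupdict() if match else None


def group_by_site(hostnames):
    """Split names into ({site: [names]}, [nonconforming names])."""
    by_site = defaultdict(list)
    nonconforming = []
    for name in hostnames:
        fields = parse_name(name)
        if fields is None:
            nonconforming.append(name)
        else:
            by_site[fields["site"]].append(name)
    return dict(by_site), nonconforming


if __name__ == "__main__":
    sites, bad = group_by_site(["NYC01-IDF2-SW03", "NYC01-MDF1-RT01", "bobs-switch"])
    print(sites)
    print(bad)
```

Once names are predictable, a single pattern gives you per-site reports for free, and anything that fails to parse becomes a ready-made audit list of devices that were named outside the standard.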

In every case, I can demonstrate how these practices make our tech support lives easier and simpler and better, resulting in a less stressful future.  And the bigger the scale and the more complex the environment, the greater the benefit from following standards.

Once people understand why a method has made it into "The Standards" they may be less likely to invent their own method.  If their method isn't grounded in training and practical troubleshooting experience garnered from years of hands-on problem solving, they may bring your team several steps backwards by creating headaches that will be with you for years.

Level 16

I like your opening statement.

One of the challenges I have personally seen with the 'best of breed' approach is that companies underestimate, and then understaff, the areas needed to get the most use out of a 'best of breed' tool. That tool ends up being someone's hobby instead of their job.

The tool ends up several versions behind, only partially implemented, and then eventually shelved.

My experience with SolarWinds is that while it's not always the best of breed for everything out there, it does a decent job, and I can get 90% of the functionality I need right out of the box.

Well said

MVP

Best of breed is not always the most appropriate tool for the job....

Level 21

I totally agree with Jfrazier​, I have been the one up late at night trying to find the documentation that turned out to never have existed.  This brings up another good point in that it's not just important to have good documentation but to also do a good job of managing and organizing that documentation so others can find it.  Good documentation is worthless if it can't be found or accessed quickly when needed.

Level 20

"powerbroker" helps a lot with handling root on unix and linux!

Level 10

WHAT?!?  Network Management is not "self-documenting" like my government leadership believes?

Level 13

rschroeder​ typed a great list for everyone to follow - standards go a long way to help troubleshoot and identify issues.

About the Author
Shaun Neal is a Solution Architect with enterprise networking, security, and mobility expertise. Additionally, Shaun is engaged in wireless product development, deployment, integration, and go-to-market strategies. His experience aligns information technology with the organizational mission to create service-oriented architecture designs and see them through implementation.