
The Value of Configuration Consistency

Level 13

As a network engineer, I don't think I've ever had the pleasure of having every device configured consistently in a network. But what does that even mean? What is consistency when we're potentially talking about multiple vendors and models of equipment?

There Can Only Be One (Operating System)

Claim: For any given model of hardware there should be one approved version of code deployed on that hardware everywhere across an organization.

Response: And if that version has a bug, then all your devices have that bug. This is the same basic security paradigm that leads us to have multiple firewall tiers comprising different vendors for extra protection against bugs in one vendor's code. I get it, but it just isn't practical. The reality is that it's hard enough upgrading device software to keep up with critical security patches, let alone doing so while maintaining multiple versions of code.

Why do we care? Because different versions of code can behave differently. Default command options can change between versions; previously unavailable options and features are added in new versions. Basically, having a consistent revision of code running means that you have a consistent platform on which to make changes. In most cases, that is probably worth the relatively rare occasions on which a serious enough bug forces an emergency code upgrade.

Corollary: The approved code version should be changing over time, as necessitated by feature requirements, stability improvements, and critical bugs. To that end, developing a repeatable method by which to upgrade code is kind of important.
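One way to keep track of the "one approved version per model" rule is to compare an inventory against a table of approved versions. The sketch below is only an illustration: the model names, version strings, device names, and the `off_standard` function are all made up, and a real inventory would come from whatever management tool is in use.

```python
# Hypothetical approved software version for each hardware model.
APPROVED = {
    "model-a": "15.2(4)",
    "model-b": "9.3(5)",
}

# Hypothetical inventory: device name -> (model, running version).
INVENTORY = {
    "nyc-core-01": ("model-a", "15.2(4)"),
    "nyc-core-02": ("model-a", "15.1(2)"),
    "sjc-leaf-01": ("model-b", "9.3(5)"),
}


def off_standard(inventory, approved):
    """Return the names of devices not running the approved version for their model."""
    return sorted(
        name
        for name, (model, version) in inventory.items()
        if approved.get(model) != version
    )


print(off_standard(INVENTORY, APPROVED))
```

A model with no entry in the approved table is reported as off-standard too, which seems like the safer default.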

Consistency in Device Management

Claim: Every device type should have a baseline template that implements a consistent management and administration configuration, with specific localized changes as necessary. For example, a template might include:

  • NTP / time zone
  • Syslog
  • SNMP configuration
  • Management interface ACLs
  • Control plane policing
  • AAA (authentication, authorization, and accounting) configuration
  • Local account if AAA authentication server fails*

(*) There are those who would argue, quite successfully, that such a local account should have a password unique to each device. The password would be extracted from a secure location (a break glass type of repository) on demand when needed and changed immediately afterward to prevent reuse of the local account. The argument is that if a single shared password is compromised, every device is exposed. I agree, and I tip my hat to anybody who successfully implements this.

Response: Local accounts are for emergency access only because we all use a centralized authentication service, right? If not, why not? Local accounts for users are a terrible idea, and have a habit of being left in place for years after a user has left the organization.

NTP is a must for all devices so that syslog/SNMP timestamps are synced up. Choose one timezone (I suggest UTC) and implement it on your devices worldwide. Using a local time zone is a guaranteed way to mess up log analysis the first time a problem spans time zones; whatever time zone makes the most sense, use it, and use it everywhere. The same time zone should be configured in all network management and alerting software.

Other elements of the template are there to make sure that the same access is available to every device. Why wouldn't you want to do that?

Corollary: Each device and software version could have its own limitations, so multiple templates will be needed, adapted to the capabilities of each device.

Naming Standards

Claim: Pick a device naming standard and stick with it. If it's necessary to change it, go back and change all the existing devices as well.

Response: I feel my hat tipping again, but in principle this is a really good idea. I did work for one company where all servers were given six-letter dictionary words as their names, a policy driven by the security group who worried that any kind of semantically meaningful naming policy would reveal too much to an attacker. Fair play, but having to remember that the syslog servers are called WINDOW, BELFRY, CUPPED, and ORANGE is not exactly friendly. Particularly in office space, it can really help to be able to identify which floor or closet a device is in. I personally lean toward naming devices by role (e.g. leaf, access, core, etc.) and never by device model. How many places have switches called Chicago-6500-01 or similar? And when you upgrade that switch, what happens? And is that 6500 a core, distribution, access, or maybe a service-module switch?

Corollary: Think the naming standard through carefully, including giving thought to future changes.
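A naming standard is easier to stick with if it can be checked mechanically. Here is a minimal Python sketch of that idea. The `site-role-index` pattern, the list of roles, and the `parse_device_name` function are assumptions chosen for illustration, not a recommended standard.

```python
import re

# Hypothetical convention: <3-letter site>-<role>-<2-digit index>, e.g. "chi-core-01".
# The roles are examples only; substitute whatever your own standard defines.
ROLES = ("core", "dist", "access", "leaf", "spine")
NAME_RE = re.compile(
    r"^(?P<site>[a-z]{3})-(?P<role>" + "|".join(ROLES) + r")-(?P<index>\d{2})$"
)


def parse_device_name(name):
    """Return (site, role, index) if the name follows the convention, else None."""
    match = NAME_RE.match(name.lower())
    if match is None:
        return None
    return match.group("site"), match.group("role"), int(match.group("index"))


print(parse_device_name("chi-core-01"))
print(parse_device_name("Chicago-6500-01"))
```

Because the role is a named field, the same pattern that validates a name can also drive reports or bulk changes grouped by role.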

Why Do This?

There are more areas that could and should be consistent. Maybe consider things like:

  • an interface naming standard
  • standard login banners
  • routing protocol process numbers
  • VLAN assignments
  • CDP/LLDP
  • BFD parameters
  • MTU (oh my goodness, yes, MTU)

But why bother? Consistency brings a number of obvious operational benefits.

  • Configuring a new device using a standard template means a security baseline is built into the deployment process
  • Consistent administrative configuration reduces the number of devices which, at a critical moment in troubleshooting, turn out to be inaccessible
  • Logs and events are consistently and accurately timestamped
  • Things work, in general, the same way everywhere
  • Every device looks familiar when connecting
  • Devices are accessible, so configurations can be backed up into a configuration management tool, and changes can be pushed out, too
  • Configuration audit becomes easier

The only way to know if the configurations are consistent is to define a standard and then audit against it. If things are set up well, such an audit could even be automated. After a software upgrade, run the audit tool again to help ensure that nothing was lost or altered during the process.
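As a rough illustration of what an automated audit could look like, the Python sketch below checks a configuration for a list of required lines. The baseline commands, the hostname, and the `audit_config` function are assumptions for the example. A real audit would also need to handle vendor syntax differences and hierarchical configuration blocks.

```python
def audit_config(config_text, required_lines):
    """Return the required lines that are missing from a device configuration."""
    present = {line.strip() for line in config_text.splitlines()}
    return [line for line in required_lines if line.strip() not in present]


# Hypothetical baseline; commands and addresses are illustrative only.
BASELINE = [
    "ntp server 10.0.0.1",
    "logging 10.1.2.3",
    "clock timezone UTC 0",
]

# Hypothetical configuration as it might be pulled from a backup.
running_config = """
hostname chi-core-01
ntp server 10.0.0.1
logging 10.1.2.3
"""

missing = audit_config(running_config, BASELINE)
print(missing)
```

Running the same check before and after a software upgrade gives a quick list of baseline lines that disappeared during the process.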

What does your network look like? Is it consistent, or is it, shall we say, a product of organic growth? What are the upsides -- or downsides -- to consistency like this?

21 Comments
Level 15

Tips and principles from someone who has been in the trenches for years.  Nice listing.  Having worked at a large number of companies, both as a field engineer and as a corporate IT person, I truly appreciate the benefits of standards and consistency.  Being handed a standards document binder when you walk into a new customer site, or when you sit down at a new desk for the first day after orientation, is a major boon to understanding the environment and to ensuring that what was done continues to work going forward.  Also, having templates and checklists makes this task easier.

MVP

Good posting..

On the first topic..pick your naming standard carefully and then stick to it.

Nothing screws things up more than having more than one way to refer to something.  It will always find a way to bite you.

I suspect many (all?) organizations / individuals recognize there is benefit in consistency, and that nearly all understand making all things consistent is expensive in time and money.

But few guides focus on the benefits of that consistency.  Sure, we know a few of them.  It's nice to have everything set up using the same NTP servers, the same TACACS servers, etc.  It just makes building & deploying a new device (or recovering an old one) simpler--one less thing to think about.

But there are deeper layers to explore--who's willing to list the things they can leverage from consistency that saves time and labor and expense next year or in five years when it's deployed consistently today?

  • If you use the same interface naming convention you can easily make useful reports from it.  Better still, you can deploy corrections and configurations on every device that uses that convention.  For a simple example, if a firewall's interfaces are consistently named Outside and Inside, and are always int Gi0/0 and Gi0/1 respectively, no one has any questions about their cabling, rules, or purpose.  Everyone with management responsibilities can deploy security settings uniformly and quickly--without having to do discovery ("Did he use int Gi0/0 for Outside like I always do?  Or did he use int Gi0/1 for Outside?").  And reports about throughput and direction will be consistent--better still, anything that changes will stand out like a red flag and you'll notice it quickly.
  • Put on your paranoid hat and imagine something bad getting at your Internet or internal network.  Security and Management issue an emergency directive to you:  Shut down all access between X and Y.  Will you have to start searching for those physical ports by CDP neighbor results?  Or can you simply use NCM and report on all Internet-facing links?  Or all Data Center links?  Or all Access Switch links?  Imagine it being bad--really bad!  Maybe Ransomware is loose inside your network and you're told to shut down all links to the data center to save the servers and data?  Could you do it quickly?  With consistent naming and port-use, yes, you could.  Could you minimize the commands necessary to shut down the Internet in all your sites without shutting down your Intranet?  If you built it with a uniform style, consistently deployed--yes, you could shut down only the ports needed, and nothing else.
  • Always using the same physical ports for links between core switches, distribution switches, and access switches makes future troubleshooting faster. You don't have to do discovery.  You ALWAYS know that int Gi0/0/51 and Gi1/0/51 on an Access Switch stack are the uplinks to the Distribution switches.  And you can go to them immediately in NPM for reports on throughput when folks using that stack have complaints.  No more wasting time finding CDP neighbor ports to the Distribution switches.
  • Using the same VLAN naming convention can speed bulk changes for security and QoS.  If you've got a hundred or a thousand remote sites, and you say VLAN XX is always the first Data VLAN at a site, and VLAN XX+1 will always be the associated VoIP VLAN for VLAN XX, now you can more quickly troubleshoot problems, or make bulk changes efficiently.
  • The same goes for router interfaces and SVIs.  Can you imagine how much time your staff would waste if any IP address in a Class C were randomly chosen as the default gateway for a subnet?  Instead, (I HOPE!) we all pick the same position in the scope (I don't care if you have .1 or .254 or .23 as your router's gateway for any subnet--just be consistent across your entire organization).  As a result, anyone can troubleshoot access at L3 with ICMP, instead of having to wonder what their gateway is, and waste time with an "ipconfig /all" command to discover too much information for their own good.
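
The gateway convention in the last bullet is simple enough to express in code, which means scripts and reports can rely on it too. This is a minimal sketch using Python's standard `ipaddress` module. The `gateway_for` function and the example subnet are hypothetical, and the default position of 1 is just one possible choice.

```python
import ipaddress


def gateway_for(subnet, position=1):
    """Return the address at a fixed offset from the network address of a subnet."""
    network = ipaddress.ip_network(subnet)
    # Reject the network address, the broadcast address, and anything beyond them.
    if not 0 < position < network.num_addresses - 1:
        raise ValueError("position is outside the usable host range")
    return network.network_address + position


print(gateway_for("10.20.30.0/24"))
print(gateway_for("10.20.30.0/24", position=254))
```

Whichever position is chosen, the point is that it is computed the same way for every subnet, so nobody has to look it up.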

Consistency across platforms means future security deployments take less time and thought.  You say you've got 500 devices that have a vulnerable protocol enabled on them?  But they were all put into just three VLANs, and nothing else is in those subnets?  Consistency saved you a ton of ACL writing on a per-port basis--you can write three ACLs and be done with that problem protocol.

Some future vulnerability shows up and Security and Management are on you to block it?  It'll be simpler if you built and named everything as consistently as possible, and deployed all your gear with the same version of code (so you're not troubleshooting different versions of code, and not having to use ten different syntax variations).

I've only scratched the surface.  Who'll step up and reveal time savers they've been able to use because someone thought ahead years ago and made consistent deployments?

Maybe better:  Who'll step up and reveal problems that are hard to fix because previous deployments were NOT consistent?  Those stories should be lessons to everyone about why it's important to have a great plan before you start implementing anything.  Our house of cards must have a solid foundation, or it will come tumbling down.  Consistent code and naming and deployments are part of that foundation.  And they can be HUGE labor savers when you can leverage that consistency in the future as new security needs and new devices with new requirements show up on the network.

Naming convention, yes but also username convention

MVP

Nice article

MVP

Nice article, and it covers many of the things that I deal with every time I change jobs (ok, not that often, but it still happens every time).

I've worked for places that had a small data center (in the beginning), so they named the servers something silly, e.g. Disney or Peanuts characters. It was fun in the beginning, but really, what does Linus or Lucy do? I'm a big proponent of a naming convention that is extensible. Assume you are going to be big and multi-location. If it never happens, so what? If it does and you aren't prepared . . .

Level 20

GPOs, STIGs, and CIS Security Benchmarks can all help with consistency in configuration.  There are some new tools like SteelCloud that also help with configuration and maintaining a security baseline:  https://www.steelcloud.com/

Now if I could just have NCM for network configs and SteelCloud for Windows and Linux compliance, I'd be set, I think.  RMF here we come!

Level 11

What are you all using to build templates? I just discovered Jinja2 within the last couple of weeks and I'm trying to get started in my spare time. /cc jgherbert

Level 13

Outside script usage, most commonly I use Excel with vlookups in the configuration.

e.g. I have a cell at the top into which I put the DC (usually with a dropdown selection), and I call the cell "DC_NAME". Then the cell containing the syslog line might look like:

=concatenate("logging ", vlookup(DC_NAME,syslog1,2,false))

...where "syslog1" is a table listing out the primary syslog server for each DC, e.g.

Syslog Server #1
| NYC  | 10.1.2.3  |
| SJC  | 10.74.2.3 |
| etc. | etc.      |

That way when I change the destination DC (or site name), the appropriate syslog IPs are populated. It takes some time to set up, but it's pretty approachable for most people.

Beyond that, I'll store a template in a jinja2-alike format, even if I don't process it with jinja2 - think about it as pseudo-code:

logging {{ SYSLOG1 }}
logging {{ SYSLOG2 }}
user admin password {{ ADMIN_PASS }}
ntp server {{ NTP1 }}

...and so on. Maybe at first I replace that information by hand; but at least I have a baseline template to refer to. Later on I might write a script to do it, or integrate it into a tool like Ansible or the like. Basically I don't believe there's a right or wrong way; it's whatever works for you, IMO.
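
As a sketch of the "later on I might write a script" step, here is one way the placeholder substitution could be done with nothing but the Python standard library. It is not Jinja2 and handles only simple `{{ NAME }}` placeholders. The `render` function, the `SITES` table, and every address other than the NYC syslog server from the earlier table are made up for the example.

```python
import re

# Matches simple placeholders such as {{ SYSLOG1 }}.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template, values):
    """Replace each {{ NAME }} placeholder; raises KeyError if a value is missing."""
    return PLACEHOLDER.sub(lambda m: values[m.group(1)], template)


TEMPLATE = """logging {{ SYSLOG1 }}
logging {{ SYSLOG2 }}
ntp server {{ NTP1 }}"""

# Hypothetical per-site values, playing the role of the spreadsheet lookup tables.
SITES = {
    "NYC": {"SYSLOG1": "10.1.2.3", "SYSLOG2": "10.1.2.4", "NTP1": "10.1.0.1"},
}

print(render(TEMPLATE, SITES["NYC"]))
```

Failing loudly on a missing value is deliberate: a half-filled template pushed to a device is worse than no config at all. Moving to real Jinja2 or Ansible later should not require changing the template text much, since the placeholder syntax is the same.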

Everyone has to be on board--otherwise the naming isn't "by convention."  I've seen places where the names of servers were in a class by themselves, and represented servers with specific functionalities.  States, cities, presidents, countries, species of birds, fish, dogs, cats--they all had their consistent use.  Places that use overlapping nouns as server names for different kinds of functionalities are asking for confusion.  For example, if a city name is a database server, and a president name is a domain controller, will you get it right when you hear "Lincoln is down!"  How about "Would you please reboot Washington for me?  It's slow."  Rebooting a domain controller or a database server can have different impacts depending on how resilient their solutions are, and getting the wrong Lincoln or Washington might be a very big deal.

I've seen names of fictional characters and fictional places represent virtual environments or test platforms--they're "not real."  But they still follow the conventions set earlier--perhaps types of cats are firewalls (Tiger, Lion, Leopard, Puma), and firewalls installed on VM's or test firewalls are named for fictional cats (Felix, Tom, Tony-The-Tiger, Top Cat, etc.).

Mail servers named for clouds or birds or jets (air mail is the frame of reference) could be intuitive.

The trick is to choose naming rules that reduce or eliminate the confusion caused by the same name appearing in multiple server categories.

The world of nouns is available.  Pick Roman Gods, Greek Gods, Norse Mythology, Arabian-named stars, planets or moons, galaxies--hey, they're all over my head.  The side benefits are several:  you can enjoy work a little more; a server or service begins to take on a personality--it can become a recognized old friend or a temperamental prima donna or a troublemaker; and you end up learning a little bit more about the world and history by researching names and choosing their servers' categories.

I worked where the Network team had rights to the names of Gods, and Hades and Persephone were active and standby real firewalls while Phlegethon was the virtual load-sharing interface for both--they locked down security and you couldn't get out to the Internet (or in from it) without their permission.  Clotho and Lachesis and Atropos were load-sharing VPN appliances with timers activated on session inactivity (these Fates would cut the life string of any VPN not in use for X hours or minutes).  Tartarus and Styx and Elysium were another Active, Virtual, and Standby security solution . . .  You get the idea.

Naming CAN BE FUN, if you enjoy cleverness and puns and like to find relationships that make sense between kinds of names and kinds of servers.  If you're not interested in learning these kinds of relations, or in researching history and language, you can always call them Firewall 1, Firewall 2, Firewall X . . .  Database 1, Database 2, . . .  The "actual" type of server name provides no obscurity--if someone's looking to infect your database servers, they'll find them only slightly quicker if you name them DB1 and DB2 than if you named them Sears and Penneys.  But will that extra bit of obscurity help protect your systems just long enough to discover or defeat the outsider's negative actions?  Or will the obscurity cause extra time spent learning and troubleshooting?  Both require proving a negative, and it's not worth the effort.
Have fun with your conventions while making them efficient!

As long as the naming is consistently deployed--and taught--by everyone, the convention can be learned and become a time saver.

Level 16

A lot of good points. Thank you.

I had a challenge to work through in NPM where node names had no standards because they were inherited through acquisitions. I ended up driving all of my alerting using custom properties assigned to the nodes.

Otherwise the alerting had so many OR conditions it was unmanageable.

In the end it worked out perfectly.

Level 21

Nothing screws things up more than having more than one way to refer to something.  It will always find a way to bite you.

You aren't kidding about this, Jfrazier!  As a service provider we manage systems that belong to different clients, and each one has its own naming convention, on top of the fact that we have our own standard as well.  We have been able to successfully use the label of the node in Orion to bridge the two by having a hyphenated name that combines both our name and theirs, so we can look it up by either name.

Level 11

Thanks for your reply!

Level 12

Don't forget closet and patch panel labeling documentation and convention.

Along with that, switchport descriptions should be short enough to be unique on SHOW INTERFACE STATUS, so you can easily paste them into Excel to do cross references with ARP and MAC (assuming you don't have a working UDT setup).

Finally, I think that protected documentation libraries and rules for updating are vital to enterprise management.

Level 13

On a related note, IOS used to only return the first 32 chars of an interface description when requested via SNMP. Useful knowledge when you're figuring out where to put the important information in the description...!
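
For anyone who wants to check existing descriptions against a length limit like that, here is a small Python sketch that groups interfaces whose descriptions become identical once truncated. The `truncated_collisions` function and the sample descriptions are hypothetical, and the 32-character default simply mirrors the figure mentioned above.

```python
from collections import defaultdict


def truncated_collisions(descriptions, limit=32):
    """Group interfaces whose descriptions are identical in the first `limit` characters."""
    groups = defaultdict(list)
    for interface, description in descriptions.items():
        groups[description[:limit]].append(interface)
    return {prefix: names for prefix, names in groups.items() if len(names) > 1}


# Hypothetical descriptions; a short limit is used here to keep the example readable.
sample = {
    "Gi0/1": "uplink-to-dist-a",
    "Gi0/2": "uplink-to-dist-b",
    "Gi0/3": "printer",
}

print(truncated_collisions(sample, limit=10))
```

Any group it reports is a set of ports that a truncating tool would not be able to tell apart, which is a hint to move the distinguishing detail to the front of the description.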

We are in the process of modernizing our Config Mgmt for our network equipment. Some very nice pointers in here. We've had lively discussions over timezones already.

Level 13

Great point on name standards.  Don't use something plain (BLUE, GREEN, RED, ...); it doesn't mean a thing.

MVP

On the topic of credentials, adding a few new local credentials over time on top of the huge list we try to carry in our heads is not the answer, because then either complexity or predictability will suffer.  I know it seems obvious, but this list just seems to get bigger weekly, and each new set added leads to more forgotten credentials.

Imagine having a thousand servers, and Information Security mandates each System Admin uses a separate and unique user name and password for each server--and that they never log into a server with more privileges than absolutely required.  It means having a "regular" name & password and a different administrative name & password for each SysAdmin.

Welcome to my world, where password safes are de rigueur, and where AAA (particularly TACACS) is a life saver.

MVP

Not sure if jamming more passwords into my head will work well now, much less when I become a senior citizen.  I especially hate the ones where you only log in once a year to take some training, and the immensely complex password combined with 365 days passing makes it utterly pointless.

Some of my passwords are required to be changed every 30 days.  I don't know what I'd do without my password safe.  I can keep 60 passwords in my head, but not 70.  And not when I have to change some monthly.