Personal experience tells us that outages on production networks are most often caused by mistakes in device configuration changes. The mistake is either in editing a configuration file or in choosing which device(s) receive the change.
The threat of configuration-related outages is not only constant but grows with your network and IT team. The impact of outages tends to grow as well, depending on how heavily your business relies on your network to connect with customers. If your network supports customer-facing web interfaces, for example, an outage can mean calculable revenue loss for your company.
When configuration management mistakes are made and outages occur, reducing downtime depends on quickly locating the mistake by auditing recent changes to the network. Two tools are basic to establishing and maintaining a reliable audit trail: a configuration change approval system and a configuration change confirmation system.
Configuration Change Approval
Approval systems are often role-based. Any member of an IT team can complete device configuration work and schedule changes, but a manager must review and approve each change before it can be executed as scheduled. A software workflow usually sends the request for approval at the time a change is scheduled. Once the change is approved, the software acts on the schedule, which is usually coordinated with network maintenance windows.
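The approval gate described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular product works: the `ChangeRequest` class, the `MANAGERS` role table, and the `push` callback are all hypothetical names invented for the example.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Callable, List, Optional


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    EXECUTED = "executed"


@dataclass
class ChangeRequest:
    device: str
    config_lines: List[str]
    requested_by: str
    scheduled_for: datetime
    status: Status = Status.PENDING
    approved_by: Optional[str] = None


# Hypothetical role table: only these users may approve changes.
MANAGERS = {"alice"}


def approve(request: ChangeRequest, user: str) -> None:
    """Mark a request as approved, but only if the user holds the manager role."""
    if user not in MANAGERS:
        raise PermissionError(f"{user} is not allowed to approve changes")
    request.status = Status.APPROVED
    request.approved_by = user


def run_due_changes(
    requests: List[ChangeRequest],
    now: datetime,
    push: Callable[[str, List[str]], None],
) -> List[ChangeRequest]:
    """Push every approved change whose scheduled time has arrived."""
    executed = []
    for request in requests:
        if request.status is Status.APPROVED and request.scheduled_for <= now:
            push(request.device, request.config_lines)  # assumed device-push hook
            request.status = Status.EXECUTED
            executed.append(request)
    return executed
```

The point of the design is that scheduling and approval are separate steps: an unapproved request is never pushed, no matter what its schedule says, and every executed change carries the name of the person who approved it.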
Configuration Change Confirmation
By triggering a download of a changed device configuration and comparing the current configuration to a previous one, a change confirmation system gives the team a checkpoint in the form of an email alert. Verifying the accuracy of configuration changes can be as easy as reviewing the side-by-side comparison included in each email alert.
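The comparison step can be illustrated with Python's standard `difflib` module. This is a rough sketch, assuming both configurations are already available as text; the `confirm_change` function name and the device name are made up for the example, and downloading the configuration and sending the email are left out.

```python
import difflib
from typing import Optional


def confirm_change(device: str, previous: str, current: str) -> Optional[str]:
    """Return a unified diff of two configurations, or None if nothing changed."""
    diff = list(
        difflib.unified_diff(
            previous.splitlines(),
            current.splitlines(),
            fromfile=f"{device} (previous)",
            tofile=f"{device} (current)",
            lineterm="",
        )
    )
    if not diff:
        return None  # identical configurations: no alert needed
    return "\n".join(diff)


if __name__ == "__main__":
    before = "hostname core-sw-01\nntp server 10.0.0.1\n"
    after = "hostname core-sw-01\nntp server 10.0.0.2\n"
    print(confirm_change("core-sw-01", before, after))
```

A unified diff is used here because it reads well in a plain-text email body. For a side-by-side view in an HTML email, `difflib.HtmlDiff().make_table()` accepts the same two lists of lines.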
Make Auditing Easy
These two systems provide a history of what was done, when, and to which devices. With that history in hand, reversing a configuration mistake becomes relatively trivial. Without these systems, before you can begin to resolve the current outage, you face an initial crisis: assembling and verifying the timeline of configuration changes.
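To make the auditing step concrete, here is a small sketch of querying such a history during an outage. It assumes the approval and confirmation records have been collected into a single list; the `ChangeRecord` fields and the 24-hour default lookback are illustrative assumptions, not a prescribed schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass(frozen=True)
class ChangeRecord:
    device: str
    changed_at: datetime
    changed_by: str
    approved_by: str
    summary: str


def recent_changes(
    log: List[ChangeRecord],
    outage_start: datetime,
    lookback: timedelta = timedelta(hours=24),
) -> List[ChangeRecord]:
    """Return changes made in the lookback window before the outage, newest first."""
    window_start = outage_start - lookback
    hits = [r for r in log if window_start <= r.changed_at <= outage_start]
    return sorted(hits, key=lambda r: r.changed_at, reverse=True)
```

Sorting newest first reflects the usual troubleshooting order: the change made closest to the start of the outage is the first suspect.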
For a good example of a configuration change approval system, see this guided video tour starting at time marker 6:10:
You can navigate this same software on your own in demo mode: