On modern enterprise networks, 100% uptime has become table stakes. Most organizations can no longer rely on a single circuit for internet connectivity. We look to carrier circuits for the redundancy and guaranteed uptime that our organizations need. When carrier outages occur, network engineers find themselves in a hot seat they can do little about. However, if we do our homework, we can improve our organizations' uptime by taking care as we provision connectivity.
Causes of Carrier Outages
Most network engineers have experienced the rapid-fire text messages and flurry of questions when the internet stops working. It’s important to understand the upstream causes of these outages so we can work with our carriers to mitigate them. The first, and most common, is the dreaded fiber cut. Regardless of what severs the cable, a fiber cut in the wrong location can have widespread impacts. Second, an upstream provider issue can interrupt service. While less frequent than fiber cuts, these outages can be frustrating because, although your circuits and peerings are healthy, traffic does not flow properly. Third, DDoS attacks, whether directly targeting your organization or another customer on your provider’s network, can have a crippling impact on service availability.
Managing Around a Fiber Cut
A few different approaches can help mitigate the impacts of a fiber cut. Your organization can purchase circuit diversity from a single carrier. In this scenario, your carrier will engineer multiple circuits into your facility. As part of this service, they will evaluate the physical path each circuit follows and ensure the circuits do not ride the same cable or poles. For true diversity, you’ll need to be certain that circuits take different paths into your facility. And, if circuits terminate into a powered cabinet, you must verify the reliability of the power source for that gear. Ask lots of questions and hold your carrier accountable. Be certain that they are contractually obligated to provide diversity and that there are penalties if they fail to do so. Work with an engineer for your carrier; don’t take the sales rep’s word for it. A single provider should have a complete view of the physical path for your circuits and be able to guarantee physical diversity. Unfortunately, however, using a single carrier puts you at a higher risk for an upstream configuration or routing failure with that provider.
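To make the diversity review concrete, here is a minimal Python sketch of the comparison described above. It assumes your carrier has given you a list of the physical elements each circuit rides. The Circuit record, its field names, and every identifier in the sample data are hypothetical and only illustrate the idea of looking for anything two circuits have in common: path segments, the building entrance, or the power source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Circuit:
    """One circuit and the physical elements it depends on (hypothetical model)."""
    circuit_id: str
    segments: frozenset      # cable, conduit, or pole-line identifiers from the carrier
    building_entrance: str   # where the circuit enters the facility
    power_source: str        # what powers the terminating equipment


def shared_risks(a: Circuit, b: Circuit) -> list:
    """Describe every physical element that both circuits depend on."""
    risks = [f"shared segment: {s}" for s in sorted(a.segments & b.segments)]
    if a.building_entrance == b.building_entrance:
        risks.append(f"shared building entrance: {a.building_entrance}")
    if a.power_source == b.power_source:
        risks.append(f"shared power source: {a.power_source}")
    return risks


# Made-up sample data: the two circuits share a pole line and a power feed.
primary = Circuit(
    "CKT-A",
    frozenset({"conduit-12", "pole-line-7", "cable-301"}),
    "north-entrance",
    "utility-feed-1",
)
secondary = Circuit(
    "CKT-B",
    frozenset({"conduit-48", "pole-line-7", "cable-355"}),
    "south-entrance",
    "utility-feed-1",
)

for risk in shared_risks(primary, secondary):
    print(risk)
```

An empty result only means the records you were given show no overlap. The check is only as good as the path data behind it, which is one more reason to get that data from a carrier engineer.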
The Multiple Carrier Route
Instead of ordering path diversity from a single carrier, you can order two circuits from different providers. This option reduces your reliance on a single carrier, but makes it more difficult to ensure full path diversity. You will need to talk to your carriers about sharing the physical path information for the circuits with you or with one another. You’ll still want to be certain the circuits enter the building via different conduits and terminate into properly powered equipment. If you use different carriers, you will need to pay special attention to your BGP configuration to verify that the paths in and out of your network are what you expect.
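As a rough illustration of checking the inbound half of that BGP picture, the Python sketch below assumes you have collected AS paths toward your prefix from outside vantage points, such as public looking glasses, and written them as lists of AS numbers ending in your own. The function names are hypothetical and the AS numbers come from the range reserved for documentation. The sketch counts how many observed paths enter your network through each carrier.

```python
from collections import Counter


def inbound_upstream(as_path, origin_asn):
    """Return the ASN adjacent to our origin ASN in an observed path, or None.

    as_path runs from the vantage point toward the origin, so our ASN is
    last and may repeat if we prepend.
    """
    trimmed = list(as_path)
    if not trimmed or trimmed[-1] != origin_asn:
        return None  # the path does not end at our network
    # Drop our own ASN, including any prepended copies.
    while trimmed and trimmed[-1] == origin_asn:
        trimmed.pop()
    return trimmed[-1] if trimmed else None


def upstream_share(paths, origin_asn):
    """Count how many observed paths arrive through each upstream ASN."""
    counts = Counter(inbound_upstream(p, origin_asn) for p in paths)
    counts.pop(None, None)  # ignore paths that never reached us
    return counts


ORIGIN = 64500      # our ASN (documentation range)
CARRIER_A = 64496   # first provider (documentation range)
CARRIER_B = 64497   # second provider (documentation range)

observed = [
    [64510, CARRIER_A, ORIGIN],
    [64511, CARRIER_A, ORIGIN],
    [64509, CARRIER_B, ORIGIN, ORIGIN, ORIGIN],  # prepended toward carrier B
]

for asn, count in sorted(upstream_share(observed, ORIGIN).items()):
    print(f"AS{asn}: {count} path(s)")
```

This covers only traffic coming in. Verifying the outbound direction means inspecting the routes your own edge routers have selected, which this sketch does not attempt.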
An Important Note about Grooming
Even if you do everything right — you validate proper path diversity when you order a circuit, you pay special attention to the entrances into your building, and you verify that all vendor equipment is properly powered — things can change. Carriers will periodically groom circuits to change the path they follow through their network. An industrious provider engineer may see that a circuit follows a less-than-optimal path through their network and then diligently re-engineer it to be more efficient. You will not be notified when the grooming takes place; it will be transparent to you, the customer. The only way to prevent grooming is to communicate clearly with your carrier and ask that they mark circuits that have been carefully engineered for path diversity to prevent them from being groomed.
As with most topics in networking, there are many factors to consider and tradeoffs to be made when ordering connectivity for your organization. You cannot have complete control over carrier-provided connectivity, but you can be diligent throughout the process, communicate the challenges clearly with your leadership, and be clear with your service provider about your expectations and the level of service being provided.