
Failover manual IP routes

I have a scenario where, when we fail over from our major site A to B, we have to remove 3 routes and then add 3 more. This has to happen on 3 different devices to ensure failover occurs. Is this better suited to a config change template or an NCM job? I was thinking of a template with a listbox: one option holds the changes needed for one direction of failover (A to B), and the other option holds the changes for the reverse direction (B to A).

Anyone have any good ideas how to do such a thing? I can't imagine this should be 11 NCM jobs.
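
One way to picture the two-option template is as a lookup from failover direction to the commands each device needs. The sketch below is only an illustration in Python, not an NCM config change template. The device names, prefixes, and next hops are hypothetical placeholders (documentation address ranges), it shows one route per device rather than the real set, and it assumes the devices accept IOS-style `ip route` / `no ip route` syntax.

```python
# Hypothetical inventory: for each device, the static route used while
# Site A is active and the one used while Site B is active.
# Every name and address here is a placeholder, not taken from the thread.
ROUTES = {
    "site-a-core": {"A": "0.0.0.0 0.0.0.0 192.0.2.1",
                    "B": "0.0.0.0 0.0.0.0 198.51.100.1"},
    "site-a-edge": {"A": "0.0.0.0 0.0.0.0 192.0.2.5",
                    "B": "0.0.0.0 0.0.0.0 198.51.100.5"},
    "site-b-edge": {"A": "0.0.0.0 0.0.0.0 192.0.2.9",
                    "B": "0.0.0.0 0.0.0.0 198.51.100.9"},
}


def failover_commands(direction):
    """Return {device: [commands]} for direction 'A_TO_B' or 'B_TO_A'."""
    if direction not in ("A_TO_B", "B_TO_A"):
        raise ValueError(f"unknown direction: {direction!r}")
    # The site we are leaving loses its route; the site we move to gains one.
    old, new = ("A", "B") if direction == "A_TO_B" else ("B", "A")
    plan = {}
    for device, routes in ROUTES.items():
        plan[device] = [
            f"no ip route {routes[old]}",  # remove the route for the old site
            f"ip route {routes[new]}",     # add the route for the new site
        ]
    return plan


if __name__ == "__main__":
    for device, commands in failover_commands("A_TO_B").items():
        print(device)
        for command in commands:
            print("  " + command)
```

Keeping both directions in one table means the A to B and B to A options cannot drift apart, which is the main appeal of a single template over a pile of separate jobs.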

  • I would never want to rely on an NCM job to automatically detect specific changes and automatically implement config changes to routers when there's an issue.


    I would much prefer to use automated dynamic routing protocols to handle that in sub-second fashion. Even something as time-consuming as a BGP failover is preferable: I could take 30 seconds of WAN outage better than waiting the several minutes it might take for SolarWinds polling to discover a problem and implement a remediation via NCM.

    Please provide more specific information about the routes and the failover process. Ideally you would never have to manually remove routes, add different ones, and then reverse the process on three devices when a failover occurs. You should be able to have all of the routes present on all the routers at all times, and simply prioritize the preferred path with routing weights or costs. When the preferred route path goes missing for some reason, the routers will automatically use the standby route. And depending on the automated gateway protocol you use, they may be able to do it in 20 milliseconds or less.

    If/when you wish to manually change the paths for failover testing or other reasons, you could simply change the prioritization or weight on the preferred or alternate route and still enjoy that virtually-hitless path change.

    Your router vendor / support team should be consulted to ensure you're aware of the options, and to help you choose between them. EIGRP is FAST to fail over. VRRP and HSRP are great solutions for alternate paths. OSPF has long been an industry standard that can automatically fail over to an alternate route. BGP is what the Internet relies on for alternate path discovery and use. Even Spanning-tree, while slower than the other options I mentioned, is well understood and rock solid for automatically disabling or enabling alternate paths, provided your design and prioritization numbers are properly configured.
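
The idea in this reply, keeping every route installed and letting cost decide, can be sketched as a toy model. This is not how any particular routing protocol is implemented; the `Route` class, the route names, and the cost values are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Route:
    name: str
    cost: int        # lower cost is preferred
    up: bool = True  # False once the path has gone missing


def best_route(routes):
    """Return the lowest-cost route that is still up, or None if none are."""
    usable = [r for r in routes if r.up]
    return min(usable, key=lambda r: r.cost) if usable else None


# Normal operation: both routes are present, the cheaper one wins.
print(best_route([Route("via-site-a", 10), Route("via-site-b", 20)]).name)
# via-site-a

# The preferred path goes missing: the standby takes over with no config change.
print(best_route([Route("via-site-a", 10, up=False), Route("via-site-b", 20)]).name)
# via-site-b

# Manual failover test: raise the cost of the preferred path instead of
# removing and re-adding routes.
print(best_route([Route("via-site-a", 30), Route("via-site-b", 20)]).name)
# via-site-b
```

In all three cases nothing is removed or re-added; only availability or a single number changes, which is why this approach avoids the remove-then-add procedure on each device.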

  • My recommendation would be to execute an NCM action when a failover occurs using the Alert Manager.


  • It would work, but it wouldn't fail over fast enough for my hospital-grade network. For that we need an IGP (interior gateway protocol).

  • So, the problem we have is the path all Internet traffic takes today: (every site) -> Site A over MPLS, and then out Site A's Internet circuit for non-internal traffic. So if Site A is down, we're out of luck in various ways. It sucks, yes. Meanwhile, when only the Internet circuit at Site A is down, we can reroute to Site B, but we have to remove the route from the 7K/ASR at Site A and then add a second route on the ASR/7K/router at Site B so traffic goes out Site B's Internet circuit. I think they're doing it this way simply because they always have. They're using EIGRP and not OSPF.

    I was pushing for floating statics at a minimum but we don't have that for some reason.

  • designerfx

    Please go back to the drawing board; rschroeder is absolutely right.

    It’s not the right tool for the job.

    SolarWinds Lab #66 - Using the Right Tool for the Job

    or just take adatole's way: “damme wrong”

  • Hah, that's completely fine! I'm always happy to go to Thwack and get some great ideas or find out an idea was bad. Always useful.

  • There are great automated failover solutions for your particular situation.  Some probably simply need to be enabled / configured on your network and you'll be into a better world of reliability for your clients.  Other solutions may require software upgrades or new licenses to enable the required feature sets on your routers, or perhaps would require different or additional hardware between your gear and the ISP's gear.

    In all cases, designing and implementing fast (hitless) automated failover and failback is required for my hospital-grade network.  Hopefully you'll find a great solution just waiting to be implemented without cost in dollars or downtime.

    If you discover/choose a way to make your failovers work automatically, I bet we'd all enjoy learning what solution was right for you, and how fast/reliable it is for you and your clients.

    If you go with an NCM-based solution, I think Thwack readers would also appreciate learning how you achieved automating your failovers, too!

    Best of luck, and swift packets to you!

    Rick Schroeder

  • Look into policy-based routing; it lets you override what the routing table would do for selected traffic.
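
For readers unfamiliar with policy-based routing, here is a rough model of the decision it makes: matched traffic is steered to a chosen next hop, and everything else follows the normal routing table. The policy entry, the addresses, and the `reachable` set are hypothetical, and whether a policy next hop can be tied to a reachability check depends on the platform, so treat that part as an assumption.

```python
import ipaddress

# Hypothetical policy: traffic sourced from these networks is steered to a
# chosen next hop, provided that next hop is currently reachable.
POLICY = [
    (ipaddress.ip_network("10.1.0.0/16"), "198.51.100.1"),
]


def next_hop(src, routed_next_hop, reachable):
    """Pick the next hop for a packet from `src`.

    routed_next_hop: what the ordinary routing table would choose.
    reachable: set of next-hop addresses currently believed to be up.
    """
    addr = ipaddress.ip_address(src)
    for network, policy_hop in POLICY:
        if addr in network and policy_hop in reachable:
            return policy_hop  # policy overrides the routing table
    return routed_next_hop     # no match, or policy hop is down


if __name__ == "__main__":
    print(next_hop("10.1.2.3", "192.0.2.1", {"198.51.100.1"}))
    print(next_hop("10.1.2.3", "192.0.2.1", set()))
```

Note that the routing table itself is never edited in this model; the policy only changes which next hop is used for the traffic it matches.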

  • thsukudu, will look into this, thank you. rschroeder, I do plan updates, but priority currently is unfortunately elsewhere.