Multi-Subnet Failover (WAN/DR) Deployment

High Availability 2.0 provides the first peek into supporting redundancy for Orion across subnets. This was previously referred to as WAN deployment or Disaster Recovery with the Failover Engine, but under High Availability we refer to it simply as a multi-subnet failover configuration. In other words, it provides the same automated, near-instantaneous failover and recovery mechanisms as High Availability did in its first release, but extends that functionality to support pollers spread across different subnets. Those could be different sites, a dedicated disaster recovery location, or possibly even the cloud.

HIGH AVAILABILITY REQUIREMENTS

  • High Availability 2.0 Installer (built in and located under [Settings -> All Settings -> High Availability Deployment Summary -> Setup A New HA Server -> Get Started Setting Up a Server -> Download Installer Now])
    • High Availability 2.0 can be used only with product modules running on Orion Core 2017.3
  • Two servers running Windows Server 2012 or later
    • Both primary and secondary servers must reside on different subnets for multi-subnet failover
      • Primary and secondary servers which reside on the same subnet can be used for same-subnet failover using a traditional VIP
    • Windows or BIND DNS Server credentials for configuring the virtual hostname
    • Windows Server OS version, edition, or bitness need not match between primary and secondary servers.
    • Primary and secondary servers may be optionally joined to a Windows domain
    • High Availability supports the following configurations of primary and secondary servers.
      • Physical to Physical
      • Physical to Virtual
      • Virtual to Virtual
      • Virtual to Physical
  • A separate server running SQL 2012 or later.
    • This server does not need to reside on the same subnet as either the primary or secondary Orion server
    • Any Microsoft SQL edition may be used, including SQL Express
    • Bonus points for utilizing a SQL Cluster
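
Whether a pair of servers calls for multi-subnet failover (virtual hostname) or same-subnet failover (VIP) comes down to whether their addresses fall in the same network. The following is a minimal sketch of that check using Python's standard `ipaddress` module. The addresses and the function names are hypothetical and for illustration only; this is not part of the High Availability installer.

```python
import ipaddress


def same_subnet(primary: str, secondary: str) -> bool:
    """Return True when both interface addresses fall in the same network.

    Each argument is an address with its prefix length, e.g. "10.10.1.20/24".
    """
    net_a = ipaddress.ip_interface(primary).network
    net_b = ipaddress.ip_interface(secondary).network
    return net_a == net_b


def failover_mode(primary: str, secondary: str) -> str:
    """Name the failover style described in the requirements above."""
    if same_subnet(primary, secondary):
        return "same-subnet (VIP)"
    return "multi-subnet (virtual hostname)"


# Hypothetical addresses, for illustration only.
print(failover_mode("10.10.1.20/24", "10.20.1.20/24"))
print(failover_mode("10.10.1.20/24", "10.10.1.21/24"))
```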

pastedImage_4.png

PRIMARY SERVER INSTALL

When installing the primary Orion server, follow the normal 'Advanced' installation process that you would for any other Orion product. Be sure not to select the 'Express' install option during installation, as a separate server running Microsoft SQL 2012 or later is required. When the Configuration Wizard runs, you will be prompted to provide the username, password, and IP address of the SQL server you will be using for the installation.

SECONDARY SERVER INSTALL

Once the primary server has been installed with the NPM 12.2 installer and is up and running, you will need to perform a similar installation on the secondary server using the separate High Availability installer, which can be downloaded from within the Orion web interface under [Settings -> All Settings -> High Availability Deployment Summary -> Setup A New HA Server -> Get Started Setting Up a Server -> Download Installer Now].

Download the High Availability Secondary Server Installer

All Settings.png
High Availability Settings.png
High Availability Deployment Summary.png
pastedImage_7.png
Evaluate High Availability.png
pastedImage_9.png

Next, run the installation by double-clicking the "SolarWinds-Orion-Installer.exe" file downloaded or copied to the secondary server. Enter the IP address or fully qualified domain name (FQDN) of your main Orion server, along with the 'Admin' or equivalent credentials used to log into the Orion web interface, and click 'Next'. On the following step of the wizard, select the additional server role you wish to install. Since this will be a High Availability backup for the main Orion server, select 'Backup Server for Main Server Protection' and click 'Next'.

Enter IP of Main Orion Server & Provide 'Admin' Credentials

Select Server Role to Install

pastedImage_0.png
pastedImage_1.png

Once the installation completes, the Configuration Wizard will start. When prompted to provide information regarding the SQL server database, ensure you use the same SQL instance and SQL database that were chosen for the primary Orion server.

The following video, while arguably boring to watch, demonstrates the secondary server installation process.

CLUSTER POOL CREATION

As soon as both the primary and secondary servers are installed, return to the Orion web interface under [Settings -> All Settings -> High Availability Deployment Summary]. There you will be able to join the two servers into a multi-subnet failover pool.

Click 'Set up High Availability Pool'
Setup High Availability Pool.png
Enter a Virtual Hostname and click 'Next'
Pool Properties.png
Select your DNS Server Type
DNS Settings.png
Microsoft DNS

Enter the IP address of your DNS server, the DNS zone (e.g. solarwinds.com), and administrative credentials for the DNS server to create the shared virtual hostname.

Microsoft DNS.png

BIND DNS

If you are running BIND DNS, enter the IP address of your BIND DNS server, the DNS Zone, your TSIG secret key name, and the TSIG shared secret key value.

BIND.png
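
To make the BIND fields above more concrete, here is a minimal sketch of what a TSIG-signed dynamic update for a virtual hostname could look like when expressed as input for BIND's `nsupdate` tool. This is only an illustration of the kind of update those settings enable, not a description of how Orion performs it internally. The function name, TTL, HMAC algorithm, hostname, addresses, key name, and secret are all assumptions.

```python
def build_nsupdate_script(server, zone, hostname, new_ip,
                          key_name, key_secret,
                          ttl=60, algorithm="hmac-sha256"):
    """Build nsupdate input that repoints an A record at new_ip.

    All values are supplied by the caller; nothing here is read from Orion.
    The TTL and HMAC algorithm defaults are assumptions for illustration.
    """
    zone = zone.rstrip(".")
    fqdn = f"{hostname}.{zone}."
    lines = [
        f"server {server}",                          # BIND server to update
        f"zone {zone}",                              # DNS zone holding the record
        f"key {algorithm}:{key_name} {key_secret}",  # TSIG key name and secret
        f"update delete {fqdn} A",                   # drop the old A record(s)
        f"update add {fqdn} {ttl} A {new_ip}",       # point at the active member
        "send",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical values, for illustration only.
print(build_nsupdate_script(
    server="192.0.2.53",
    zone="example.com",
    hostname="orion-ha",
    new_ip="10.20.1.20",
    key_name="orion-key",
    key_secret="c2VjcmV0",
))
```

The resulting text could be piped to `nsupdate` by hand to confirm that the TSIG key is accepted by the BIND server before entering the same values in the pool wizard.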

Summary

Once complete, review the summary and click 'Create Pool'.

Summary.png

Success

When done, you will have pooled two Orion servers together across multiple subnets into a redundant, high availability pool.

Setup Complete.png
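
Once the pool exists, one simple way to see which member currently holds the virtual hostname is to resolve that name and compare the result with the members' addresses. The sketch below assumes the virtual hostname is an ordinary DNS A record; the member names, hostname, and addresses are hypothetical. The resolver is passed in as a parameter so the demo runs without touching a real DNS server.

```python
import socket


def active_member(virtual_hostname, members, resolve=socket.gethostbyname):
    """Return the name of the pool member the virtual hostname points at.

    members maps a member name to its IP address. Returns None when the
    name does not resolve or resolves to an address outside the pool.
    """
    try:
        ip = resolve(virtual_hostname)
    except OSError:  # socket.gaierror is a subclass of OSError
        return None
    for name, addr in members.items():
        if addr == ip:
            return name
    return None


# Hypothetical pool, for illustration only.
pool = {"orion-primary": "10.10.1.20", "orion-secondary": "10.20.1.20"}

# Stub resolver standing in for DNS so the demo runs offline.
fake_dns = {"orion-ha.example.com": "10.20.1.20"}
print(active_member("orion-ha.example.com", pool, resolve=fake_dns.__getitem__))
```

Dropping the `resolve` argument makes the function use the system resolver. Keep in mind that cached DNS answers can lag behind a failover, depending on the record's TTL.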

The following short video walks through this process in under a minute.

  • FormerMember

    What will be the Database failover requirements? If I failover to a data center in another geo or the cloud, do I need to rely on MSSQL availability groups for replication? Or does the HA solution replicate the database to another instance?

  • High Availability 2.0 and Orion 2017.3 support SQL AlwaysOn Availability Groups in a multi-subnet failover configuration. This would be the recommended method for providing redundancy to your Orion SQL database.

  • FormerMember, in reply to aLTeReGo

    Perfect, thanks!

  • Thanks a lot for sharing this. It makes it easier to understand and implement.

    One query on the virtual hostname part: what communication ports need to be allowed to and from it? Do the prerequisites for monitoring any device have to be opened towards both the primary and secondary servers, or also towards the virtual hostname? If there is a link that covers this, please send it so that I can refer to it directly.

  • For BIND, updating the DNS name is performed over port 53. For Windows DNS, updating the virtual hostname is done over WMI. In either case, this needs to be allowed from both members of the pool.
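
A quick way to confirm that a pool member can reach the DNS server is a plain TCP connection test, run from each member. The sketch below is a generic reachability check and not an Orion tool; the address in the usage comment is hypothetical. It only proves TCP reachability: DNS traffic on port 53 may also use UDP, and WMI generally relies on RPC (TCP 135 plus a dynamic port range), so a successful check on one port is indicative rather than conclusive.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts, unreachable hosts
        return False


# Example usage (hypothetical DNS server address; run on both pool members):
#   tcp_port_open("192.0.2.53", 53)    # BIND dynamic updates
#   tcp_port_open("192.0.2.53", 135)   # RPC endpoint mapper used by WMI
```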

  • Hi aLTeReGo

    Have some queries on the failover setup.

    1. Once we download the installer for secondary server and complete the installation it will redirect to console of Primary server, correct? Will all the services on secondary server be in running state?

    2. Then once we configure the HA pool by using VIP and finish it, will the services still show in running mode in both?

    3. For testing failover, what all scenarios it will work? Service restart is one, how about other scenarios?

    4. In case of using VIP, console be accessible from whichever is active and VIP, right?

    5. Any specific settings to be done so that we can access VIP to access the console?

  • ss

    1. Once we download the installer for secondary server and complete the installation it will redirect to console of Primary server, correct? Will all the services on secondary server be in running state?

    Negative. Only a few critical services will be running on the standby server: the SolarWinds Administration Service, the SolarWinds Agent, SolarWinds HighAvailability, and the SolarWinds Orion Module Engine.

    2. Then once we configure the HA pool by using VIP and finish it, will the services still show in running mode in both?

    The same services I listed above will be running on the standby server. All other services will remain stopped and disabled until a failover occurs and the standby server becomes the active member of the pool.
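
The expectation above can be written down as a small check. This is a minimal sketch that compares a list of running services on the standby server against the four named in this reply. The names are taken from the text and may not match the exact Windows service display names, and how the running list is gathered (for example, from `Get-Service` output) is left to the reader.

```python
# Service names as written in the reply above; exact display names on a
# real server may differ (assumption).
EXPECTED_ON_STANDBY = frozenset({
    "SolarWinds Administration Service",
    "SolarWinds Agent",
    "SolarWinds HighAvailability",
    "SolarWinds Orion Module Engine",
})


def check_standby(running):
    """Compare running service names against the expected standby set."""
    running = set(running)
    return {
        "missing": sorted(EXPECTED_ON_STANDBY - running),
        "unexpected": sorted(running - EXPECTED_ON_STANDBY),
    }


# Hypothetical observation, for illustration only.
observed = [
    "SolarWinds Administration Service",
    "SolarWinds Agent",
    "SolarWinds HighAvailability",
    "Some Other Orion Service",  # placeholder name, not a real service
]
print(check_standby(observed))
```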

    3. For testing failover, what all scenarios it will work? Service restart is one, how about other scenarios?

    I recommend reviewing my post here -> Torture Testing High Availability

    4. In case of using VIP, console be accessible from whichever is active and VIP, right?

    Yes, that is correct.

    5. Any specific settings to be done so that we can access VIP to access the console?

    On the off chance you configured your Orion web console to be accessible only from one specific IP address, you will need to change this so IIS is bound to all adapters, e.g. (All Unassigned).

    pastedImage_25.png

  • Thanks for the response :-) We are trying to set up HA for evaluation purposes and found some issues. Let me go through all your points and let you know in case I am still not able to resolve them.

  • Can you help me understand the "Virtual hostname"? Is that only known internally to SolarWinds, or is that the actual name of the record in DNS?