HA Troubleshooting Guide for website hosting pool
SolarWinds has published several HA KBAs, but they do not go into detail on what to expect in the database table entries. When HA is working correctly, certain tables contain entries only for the active host. If those tables contain entries for both pool members, both hosts are trying to be active, and the solution is to stop services and correct the database entries to force the pool to the correct side. This is quicker than truncating all HA tables and rebuilding the pools entirely, for which SolarWinds has provided documentation (linked in this post below).
- If a failover was just initiated, it can take up to 15 minutes depending on the location of the database and on latency. Do not make any changes for 15 minutes; just observe.
- Ensure the VIP is up
- If the VIP is not up, log in to both pool member servers and check the running services. If all Orion services are running on one of the hosts, that host is the active site. Navigate to the website locally using the system name. If the website is up and accessible, there is most likely an issue with the load balancer; work with the network team to address it.
- If the local website is not up, or services are not running on either host, check the SQL connection. If the connection is failing, work with the SQL team to determine why the listener is down.
- If the SQL connection succeeds but neither website is up or services are not running, there is likely an issue with the HA pool. Investigate HA. Services should be running as described in the Services section.
- Validate there is an active and standby system defined for all areas below:
- If at any point two systems in a pool display the same role in the database, that pool is not working and needs to be corrected. If there are two entries in the Orion Servers table or the Websites table, stop services on all servers (including the High Availability service), delete the inaccurate entries, and update the registry keys so that the active/standby designation is accurate. Restart the services, starting with the desired active host, and wait 10 to 15 minutes for the system to recognize which host is supposed to be active and complete a full failover.
- If the above does not work, delete the HA pool and recreate it by following the steps below:
- Both HA servers are Standby and Orion Services are stopped (site.com)
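The triage order in the checklist above can be sketched as a small decision function. This is only an illustration of the sequence, not a SolarWinds tool: the `Observations` fields and the returned step names are hypothetical, and every value is something you would gather by hand (VIP check, services check, local website check, SQL connection test).

```python
from dataclasses import dataclass
from typing import Optional

FAILOVER_WAIT_MINUTES = 15  # wait window taken from the checklist


@dataclass
class Observations:
    """Hand-gathered facts about the pool (hypothetical field names)."""
    minutes_since_failover: Optional[float]  # None if no failover was just initiated
    vip_up: bool
    all_services_on_one_host: bool
    local_website_up: bool
    sql_connection_ok: bool


def next_step(obs: Observations) -> str:
    """Return the next troubleshooting step, following the checklist order."""
    # A failover that is still in progress should be left alone.
    if (obs.minutes_since_failover is not None
            and obs.minutes_since_failover < FAILOVER_WAIT_MINUTES):
        return "wait_and_observe"
    # VIP is up: move on to validating active/standby roles in each area.
    if obs.vip_up:
        return "validate_active_and_standby_roles"
    # VIP down but one host is fully running and serving the site locally.
    if obs.all_services_on_one_host and obs.local_website_up:
        return "check_load_balancer_with_network_team"
    # Site or services are down: rule out SQL before blaming HA.
    if not obs.sql_connection_ok:
        return "engage_sql_team_about_listener"
    # SQL is fine, so the HA pool itself is the likely problem.
    return "investigate_ha_pool"
```

For example, `next_step(Observations(None, False, True, True, True))` returns `"check_load_balancer_with_network_team"`, matching the case where the site works locally but the VIP does not.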
Registry Keys
The registry keys associated with HA are InstallType and RunType, found under:
HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\SolarWinds\High Availability
RunType = which role that server is currently fulfilling
InstallType = which role that server was originally designated as (primary or main)
When HA is stable, each server has both an InstallType and a RunType set, each with a value of either standby or primary depending on which site is active.
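To compare the two keys across both pool members without opening regedit on each, something like the following could be run locally on each server. It is a minimal sketch using Python's standard `winreg` module (Windows only); the helper names `read_ha_registry` and `format_ha_keys` are made up for this example, and it assumes both values can be read as strings.

```python
HA_KEY_PATH = r"SOFTWARE\Wow6432Node\SolarWinds\High Availability"
HA_VALUE_NAMES = ("RunType", "InstallType")


def read_ha_registry() -> dict:
    """Read RunType and InstallType from the local registry (Windows only)."""
    import winreg  # imported here so the module still loads on non-Windows hosts

    values = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, HA_KEY_PATH) as key:
        for name in HA_VALUE_NAMES:
            data, _value_type = winreg.QueryValueEx(key, name)
            values[name] = str(data)
    return values


def format_ha_keys(host: str, values: dict) -> str:
    """Render one line per host so the two servers can be compared side by side."""
    return "{}: RunType={} InstallType={}".format(
        host, values.get("RunType", "<missing>"), values.get("InstallType", "<missing>")
    )


# On a pool member you might run:
#   import socket
#   print(format_ha_keys(socket.gethostname(), read_ha_registry()))
```

Run it on both servers and compare the two lines against the pool member type recorded in the database.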

Database Tables:
Pool members
The HA pool members table has a record for both the primary and the standby, and the pool member type field should match the registry keys.

Engines
The Engines table shows only the primary host of that pool. If both servers in the pool are displayed, there is an issue with HA. This is addressed in the troubleshooting steps above.

Orion Servers
The Orion Servers table shows both members of the pool and the server type each is built to support.

Websites
The Websites table should have only one entry.
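The table expectations above (two pool members with different roles, only one pool host in Engines, one Websites entry) can be checked mechanically once the rows are exported. The sketch below works on plain Python lists rather than querying the database, so it makes no assumption about the real table or column names; the function name, the input shapes, and the role strings "primary" and "standby" are illustrative only.

```python
from collections import Counter


def check_pool(pool_members, engine_hosts, website_hosts):
    """Return a list of problems found; an empty list means the rows look healthy.

    pool_members  -- list of (host, role) tuples from the pool members table
    engine_hosts  -- list of host names present in the Engines table
    website_hosts -- list of host names present in the Websites table
    """
    problems = []

    # Expect one record each for the primary and the standby.
    if len(pool_members) != 2:
        problems.append("expected 2 pool members, found {}".format(len(pool_members)))

    # Two members showing the same role means the pool is not working.
    role_counts = Counter(role for _host, role in pool_members)
    for role, count in role_counts.items():
        if count > 1:
            problems.append("{} pool members share role '{}'".format(count, role))

    # Only one host of the pool should appear in Engines.
    pool_hosts = {host for host, _role in pool_members}
    pool_engines = [h for h in engine_hosts if h in pool_hosts]
    if len(pool_engines) > 1:
        problems.append("both pool members appear in Engines")

    # Websites should have a single entry.
    if len(website_hosts) != 1:
        problems.append("expected 1 Websites entry, found {}".format(len(website_hosts)))

    return problems
```

A healthy pool such as `check_pool([("hostA", "primary"), ("hostB", "standby")], ["hostA"], ["hostA"])` returns an empty list; any returned message points at the table whose entries need correcting.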

Services
On the active host, you should see all services running for the modules that are licensed. On the standby host, you will see just the High Availability service (not shown in the SolarWinds Platform Service Manager tool, only visible in Windows Services) as well as the below:
