Orion Platform 2017.1 and later includes an entirely new, fully integrated High Availability solution. If you plan to try out High Availability, you will need to meet the following requirements.
When installing the primary Orion server, follow the normal 'Advanced' installation process that you would for any other Orion product. Be sure not to select the 'Express' install option during installation, as a separate server running Microsoft SQL Server 2008 or later is required. When the Configuration Wizard runs, you will be prompted to provide the username, password, and IP address of the SQL server you will be using for Orion.
Once the primary server is up and running, you will need to perform a similar installation on the secondary server using the separate High Availability installer, which can be downloaded from within the Orion web interface under [Settings -> All Settings -> High Availability Deployment Summary -> Setup A New HA Server -> Get Started Setting Up a Server -> Download Installer Now].
|Download the High Availability Secondary Server Installer|
Next, run the installation by double-clicking the "SolarWinds-Orion-Installer.exe" file you downloaded or copied to the secondary server. Enter the IP address or fully qualified domain name (FQDN) of your main Orion server, along with the 'Admin' or equivalent credentials used to log into the Orion web interface, and click 'Next'. On the following step of the wizard, select the additional server role you wish to install. Since this will be a High Availability backup for the main Orion server, select 'Backup Server for Main Server Protection' and click 'Next'.
|Enter IP of Main Orion Server & Provide 'Admin' Credentials|
|Select Server Role to Install|
Once the installation completes, the Configuration Wizard starts. When prompted to provide information regarding the SQL server database, be sure to use the same SQL instance and SQL database that was chosen for the primary Orion server.
The following video, while arguably boring to watch, demonstrates the secondary server installation process.
As soon as both the primary and secondary servers are installed, return to the Orion web interface and navigate to [Settings -> All Settings -> High Availability Deployment Summary]. There you will be able to join the two servers into a single high availability cluster pool. The following short video walks through this process in under a minute.
For more detailed instructions on how to make your Orion install highly available, please see the documentation at the link below.
@aLTeReGo I have a use case for which I need some guidance...
There are 2 DCs, and SolarWinds is required to be in HA. The two will be in different subnets; there is no AD in this environment and no load balancers. Can HA still be used? I went through some posts, and many of them mentioned that the virtual hostname or VIP is just for convenience, but I really don't understand this! How does HA work when no HA pool is created?
Also, can a VIP be used for a different-subnet deployment, or is a virtual hostname a must? I think it automatically shows the virtual hostname option and not the VIP option. Correct me if I'm wrong.
In addition, if I don't use HA and instead opt for, say, an Additional Poller at the DR site and the main polling engine at the DC site: if the DC site is shut down for some time or there is some issue, will shifting the devices to the APE work or not? (As far as I understand, if the main polling engine is down, no alerting will work.) I also have Linux servers monitored via agent; will this work for them too?
Question: in 2017.3, High Availability will support different subnets. Will this work across an AWS multi-tenancy setup using virtual hostnames?
Yes. In fact, there is an out-of-the-box alert configured to update the virtual hostname in Route53.
There's no automatic licence transfer from FoE to HA. You have to speak to your reseller & ask them to make the transfer.
However, FoE is going away & HA is here to stay; support for FoE will tail off.
We should be rolling out the release candidate in the next few weeks. As a beta participant, you will be among the first to be notified once it's made available.
Can you click on both of the items listed in the table and post a screenshot of each? I really need to see the 'Server Type' to determine what's going on. Also, it appears DWSOLARWINDS01 is down, which can't be the case since it's the server you're accessing in the web interface. Can you ensure the date/time are in sync between both Orion servers and the database server?
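As a rough illustration of the date/time check, here is a minimal Python sketch. It assumes you have already measured each server's clock offset (in seconds) against one shared reference clock by whatever means you prefer; the function names, the database server name `DBSERVER`, the sample numbers, and the 5-second tolerance are all assumptions for illustration, not SolarWinds requirements.

```python
def max_clock_skew(offsets_seconds):
    """Largest difference, in seconds, between any two servers' clock offsets."""
    values = list(offsets_seconds.values())
    return max(values) - min(values)


def clocks_in_sync(offsets_seconds, tolerance_seconds=5.0):
    """True when every pair of servers is within the tolerance of each other."""
    return max_clock_skew(offsets_seconds) <= tolerance_seconds


# Illustrative values only: each number is a server's offset from one shared reference clock.
example_offsets = {
    "DWSOLARWINDS01": 0.4,
    "DWSOLARWINDS02": -0.3,
    "DBSERVER": 92.0,  # hypothetical database server name
}

if __name__ == "__main__":
    print("max skew (s):", max_clock_skew(example_offsets))
    print("in sync:", clocks_in_sync(example_offsets))
```

With the sample values, the database server is far outside the assumed tolerance, so the check reports the three machines as out of sync.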
Hello. The situation you describe seems familiar to me. It's probably caused by a known issue (which will finally be fixed): if you have already launched the installer once on the backup machine (even if you immediately canceled it), the machine gets the main role in the registry, and the server role selection wizard that follows doesn't overwrite it.
For Jeremy's reference: HA-1831
To repair your situation, please follow these steps:
0. Stop the SolarWindsHighAvailability service on both machines (and stop the Orion services on the backup server).
1. On the backup server, switch the InstallType and RunType to "MainPollerStandby" in the registry key HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\SolarWinds\HighAvailability
2. Run the following commands against the database:
truncate table HA_FacilitiesInstances
truncate table HA_PoolMemberInterfacesInfo
truncate table HA_PoolMembers
truncate table HA_Pools
truncate table HA_ResourcesInstances
3. Delete the redundant server entries (those that aren't supposed to be active for now) from the Engines and Websites tables in the database.
4. Start the SolarWindsHighAvailability services again.
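For anyone who prefers to review step 2 as a single script before running it, here is a minimal sketch in Python that simply assembles the five TRUNCATE statements from the list above, in the same order. The function name `build_truncate_script` and the constant `HA_TABLES` are made up for illustration; nothing here connects to the database, and the output is meant to be pasted into whatever SQL tool you already use.

```python
# Tables named in step 2 of the post, in the order the post lists them.
HA_TABLES = (
    "HA_FacilitiesInstances",
    "HA_PoolMemberInterfacesInfo",
    "HA_PoolMembers",
    "HA_Pools",
    "HA_ResourcesInstances",
)


def build_truncate_script(tables=HA_TABLES):
    """Return one TRUNCATE TABLE statement per table, one per line."""
    return "\n".join(f"TRUNCATE TABLE {name};" for name in tables)


if __name__ == "__main__":
    # Print the script so it can be reviewed before it is run anywhere.
    print(build_truncate_script())
```

Keeping the table names in one tuple makes it easy to confirm that the script touches only the HA tables mentioned in the steps and nothing else.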
Regarding the down status, please check if the Agent service is running on the backup machine (the agent is automatically deployed on each server). If not, please start it.
I hope this helps with your evaluation.
Just getting back to this. I followed your directions, and nothing has changed...
dwsolarwinds01 is the primary - the agent is running on that box but it shows RED
dwsolarwinds02 is the failover - even though I made the registry changes you specified - it still shows as a mainpoller in this display:
jbrunke, would you be able to collect Diagnostics from both members of the HA pool and upload them following the steps outlined in the KB article below? Be sure to add 'aLTeReGo' to the subject line and post back here once they've been uploaded so we can take a look at what's going on.
I have yet to set up HA, but I have installed it on the first server. This obviously has the UI improvements from NPM 12, so that's great. Our major pain points currently are HA, AppInsight for SQL, and custom monitoring.
FEEDBACK thus far
1. Unable to see 'Alert triggered' even though a number of alerts have triggered.
2. When an app is initially added it goes into Unknown. We also noticed a number of times in the previous version that 'goes into unknown' can mean any number of things, so it might be worth differentiating this a bit more.
3. When adding custom processes/services we would like to set the status of the application differently. E.g. we might have some services that do not break the application if they go down, but we still want the app to go into 'critical' or 'warning'. In this version this is not possible either. "Status When Process is Not Running:" Essentially this can be Warning, Critical, Down, or Not Running.
Will come back with more feedback as I get some time.
Thanks, Vik, for the feedback. It seems, however, that some of your points may be unrelated or not specific to the beta. As for your point #1, I'm not sure I completely understand what you're seeing (or not seeing) based solely on your description, but it sounds unlikely to be related to the beta. If that is the case, please post this feedback in the Server & Application Monitor forum. Your points 2 & 3 sound like feature requests, also not related to new features/functions included in the SAM 6.3 beta. As such, I would recommend posting these to the Server & Application Monitor Feature Requests forum, where other members of the community can vote on these ideas.
Hello sir, yes, they may be feature requests, but in my experience feature requests from cases just go into the ether and there is no way to keep track of them, so yes, I will post them on the forums for better visibility. As for 1, I am unable to see any events in orion/netperfmon/events.aspx
It's erratic at the moment, sometimes there are some events, other times not.
This looks really promising. We've been using FoE for a couple years, but recently excluded it from a new SW server architecture we've set up. It added a lot of complexity, and made OS updates a real pain. We added additional hardware fault tolerance and dropped FoE.