I have stood up a new instance of SolarWinds with a number of modules (listed below) and am trying to set up HA clusters for the main poller and an additional polling engine. I've run the Orion installer on the HA server. It can connect to the main Orion server, but it errors out while downloading the files from the main server with the error: "Can’t receive data from SolarWinds Administration Services" and a link to a related KB article (sorry, I don't have a screenshot handy). However, I don't think the KB article applies: the server definitely has the ports open, and the copy is not taking longer than 30 minutes (it errors within a minute).
I tried to contact support, but they cannot help because HA is not a licensed module here; at the moment, though, the module won't be purchased if it can't be installed. The case number is #1178593 and they're going to pass it over to a sales engineer, but I'm not sure they will be able to assist if it needs escalation.
- This is a new installation of SolarWinds (modules installed 3 weeks ago), with all the latest hotfixes installed as of last week
- Modules installed (full module versions in the attached log file):
  - NPM 12.1
  - SAM 6.4.0
  - NTA 4.2.2
  - NCM 7.6
  - VNQM 4.4.0
  - IPAM 4.3.2
  - SRM 6.4.0
  - VMAN 7.1 / VIM 7.1.0
  - Additional Polling Engine
- I have verified that each server can reach the other servers via hostname and IP address and I can telnet to the ports required from the HA servers
- The same issue occurs on both new servers that will become the HA servers (main and additional servers)
- Not all modules are licensed – some are still in the evaluation period (still active, time remaining - NTA, SRM, VMAN).
- I can confirm I can manually copy all the installer files between the servers quickly (takes less than a minute to copy)
- The servers are running the same OS (Windows Server 2012 R2) and have the same specs (CPU, memory, disk).
- I tried disabling the antivirus – didn’t fix the issue
- No production HA license is installed; however, the 30-day evaluation was enabled today
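For anyone wanting to repeat the port check from the HA servers without telnet, here is a minimal sketch. The hostname and the port list in the commented usage are placeholders I made up for illustration; substitute the ports listed in the KB article and your own server names.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_ports(host, ports, timeout=3.0):
    """Map each port to True/False depending on whether it accepted a connection."""
    return {port: can_connect(host, port, timeout) for port in ports}


# Example usage (hypothetical hostname and ports, not taken from the post):
# results = check_ports("orion-main.example.local", [17777, 1433])
# for port, ok in sorted(results.items()):
#     print(port, "open" if ok else "unreachable")
```

This only confirms that a TCP handshake completes, the same thing telnet shows. It does not prove that the service behind the port is healthy or that larger transfers will succeed.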
So I now have the following questions for HA:
- Any thoughts on the above problem so we can install/enable HA?
- Do we have to reconfigure the NTA FSDB server to use the HA VIP? Does the NTA FSDB initiate communications with the main server, or does it only respond to requests from the pollers/website?
- Do we have to reconfigure VMAN integration to use the HA VIP? How does VMAN handle the active/passive server swap?
- Is it possible to have HA change multiple VIPs on the server (i.e. a second NIC with another VIP)? These servers have two network interfaces and roughly half of the objects are polled via the second NIC. Polling should still be OK from the HA server NICs (and we'll confirm the IPs and HA VIPs are in the firewall rules), but what about receiving data? A workaround for now, I guess, would be to have syslog, traps, and NetFlow sent to both the main poller (or additional poller) IP and the HA IP. The passive server will have the services disabled, so it won't record traffic. It will mean duplicating network traffic, but it would ensure the data is still received when a failover occurs.
I've attached the installer log file as well. It errors out at the same spot each time so I suspect it might have something to do with the NetworkAtlas hot fix because that file doesn't exist on the main Orion server in the C:\ProgramData\SolarWinds\Installers directory.
2017-06-07 14:54:26,476 [3] DEBUG SolarWinds.Orion.CompatibilityPreInstaller.Forms.DownloadProgressPage - Downloading CollectorInstaller.msi
2017-06-07 14:54:26,663 [3] DEBUG SolarWinds.Orion.CompatibilityPreInstaller.Forms.DownloadProgressPage - Downloading InformationService.msi
2017-06-07 14:54:26,851 [3] DEBUG SolarWinds.Orion.CompatibilityPreInstaller.Forms.DownloadProgressPage - Downloading InformationService-HotFix2.msp
2017-06-07 14:54:26,945 [3] DEBUG SolarWinds.Orion.CompatibilityPreInstaller.Forms.DownloadProgressPage - Downloading SolarWinds-Orion-NetworkAtlas.msi
2017-06-07 14:54:27,773 [3] DEBUG SolarWinds.Orion.CompatibilityPreInstaller.Forms.DownloadProgressPage - Downloading SolarWinds-Orion-NetworkAtlas-v1.16-HotFix1.msp
2017-06-07 14:54:27,820 [3] ERROR SolarWinds.Orion.CompatibilityPreInstaller.Forms.DownloadProgressPage - FileTransferProxyClient.GetFile failed, ex: System.ServiceModel.CommunicationObjectFaultedException: The communication object, System.ServiceModel.Channels.ServiceChannel, cannot be used for communication because it is in the Faulted state.
Server stack trace:
at System.ServiceModel.Channels.CommunicationObject.Close(TimeSpan timeout)
Exception rethrown at [0]:
at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)
at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)
at System.ServiceModel.ICommunicationObject.Close(TimeSpan timeout)
at System.ServiceModel.ClientBase`1.System.ServiceModel.ICommunicationObject.Close(TimeSpan timeout)
at System.ServiceModel.ClientBase`1.Close()
at System.ServiceModel.ClientBase`1.System.IDisposable.Dispose()
at OrionInstallerLib.FileTransfer.FileTransferProxyClient.FetchFile(String filename, Stream fileStream, Action`1 reportProgress)
at OrionInstallerLib.FileTransfer.FileTransferProxyClient.GetFile(FileInfo fileInfo, Action`1 reportProgress)
2017-06-07 14:54:27,820 [3] ERROR SolarWinds.Orion.CompatibilityPreInstaller.Forms.DownloadProgressPage - System.ServiceModel.CommunicationObjectFaultedException: The communication object, System.ServiceModel.Channels.ServiceChannel, cannot be used for communication because it is in the Faulted state.
Server stack trace:
at System.ServiceModel.Channels.CommunicationObject.Close(TimeSpan timeout)
Exception rethrown at [0]:
at OrionInstallerLib.FileTransfer.FileTransferProxyClient.GetFile(FileInfo fileInfo, Action`1 reportProgress)
at SolarWinds.Orion.CompatibilityPreInstaller.Forms.DownloadProgressPage.DownloadFile(String fullPath, Action`1 setDownloadedBytes)
at SolarWinds.Orion.CompatibilityPreInstaller.Forms.DownloadProgressPage.backgroundWorker_DoWork(Object sender, DoWorkEventArgs e)
2017-06-07 14:54:27,820 [3] DEBUG SolarWinds.Orion.CompatibilityPreInstaller.WorkFlow.Workflow - set result:CantReceiveDataFromSWA
2017-06-07 14:54:27,835 [1] DEBUG SolarWinds.Orion.CompatibilityPreInstaller.WorkFlow.Workflow - set result:CantReceiveDataFromSWA
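To check the suspicion about the missing NetworkAtlas hotfix file, something like the following could pull the requested file names out of the installer log and compare them with what is actually in the Installers directory on the main server. This is only a sketch: the regular expression assumes the log line layout shown above, and the function names and the log path in the commented usage are hypothetical.

```python
import os
import re

# Matches lines such as:
# 2017-06-07 14:54:26,476 [3] DEBUG Some.Logger.Name - Downloading File.msi
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"\[(?P<thread>\d+)\] (?P<level>[A-Z]+)\s+(?P<logger>\S+) - (?P<msg>.*)$"
)
PREFIX = "Downloading "


def requested_files(lines):
    """Return every file name the installer logged as 'Downloading ...'."""
    names = []
    for line in lines:
        m = LINE_RE.match(line)
        if m and m.group("msg").startswith(PREFIX):
            names.append(m.group("msg")[len(PREFIX):].strip())
    return names


def find_failed_download(lines):
    """Return (file name, error timestamp) for the download that was in
    progress when the first ERROR line was logged, or None if no ERROR."""
    last_file = None
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # stack trace lines carry no timestamp, skip them
        if m.group("msg").startswith(PREFIX):
            last_file = m.group("msg")[len(PREFIX):].strip()
        elif m.group("level") == "ERROR":
            return last_file, m.group("ts")
    return None


def missing_from_directory(names, directory):
    """Return the names that do not exist as files in the given directory."""
    return [n for n in names if not os.path.isfile(os.path.join(directory, n))]


# Example usage (the log file name is a placeholder):
# with open("installer.log", encoding="utf-8") as fh:
#     lines = fh.read().splitlines()
# print(find_failed_download(lines))
# print(missing_from_directory(requested_files(lines),
#                              r"C:\ProgramData\SolarWinds\Installers"))
```

If the file reported by `find_failed_download` also shows up in the `missing_from_directory` result when run on the main Orion server, that would support the theory that the installer is asking for a file the main server cannot serve.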