Level 11

RabbitMQ won't start after an upgrade, 2nd upgrade and rollback to snapshot

Still haven't got through to Support after phoning a second time, having been told to upgrade the failed upgrade.

Is anyone out there able to help?

14 Replies
Level 11

Ok so I think I have found the issue.  I couldn't let this lie and even though we have the ticket logged with support, as a techie I couldn't just sit back and wait, so I've been doing some digging.

I looked in the Configuration Wizard log (C:\ProgramData\SolarWinds\Logs\Orion) and found errors for RabbitMQ, along with an assembly version of 1.3.0.37:

2020-01-15 07:18:17,844 [13] DEBUG MessageBusConnectionProvider - Opened connection to <Servername> using type easynetq-direct and username orion

*** Assembly SolarWinds.MessageBus.RabbitMQ, Version=1.3.0.37, Culture=neutral, PublicKeyToken=null, .NET version v4.0.30319 ***

2020-01-15 07:18:17,844 [13] DEBUG EasyNetQueue - Opening queue TestQueue durable=False
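A rough sketch of how that check could be scripted, assuming the log marks assemblies with lines in the same `Assembly <name>, Version=<x.y.z.w>` shape as the one quoted above. The function name `assembly_versions` is made up for illustration and is not part of any SolarWinds tooling.

```python
import re

# Matches lines such as:
# *** Assembly SolarWinds.MessageBus.RabbitMQ, Version=1.3.0.37, Culture=neutral, ... ***
ASSEMBLY_RE = re.compile(
    r"Assembly\s+(?P<name>[\w.]+),\s*Version=(?P<version>\d+(?:\.\d+)*)"
)


def assembly_versions(log_text):
    """Return {assembly name: set of version strings} found in the log text."""
    found = {}
    for match in ASSEMBLY_RE.finditer(log_text):
        found.setdefault(match.group("name"), set()).add(match.group("version"))
    return found


# Sample text copied from the log excerpt in this post.
sample = (
    "2020-01-15 07:18:17,844 [13] DEBUG MessageBusConnectionProvider - "
    "Opened connection to <Servername> using type easynetq-direct and username orion\n"
    "*** Assembly SolarWinds.MessageBus.RabbitMQ, Version=1.3.0.37, Culture=neutral, "
    "PublicKeyToken=null, .NET version v4.0.30319 ***\n"
)
print(assembly_versions(sample))
```

To run it against a real log, read the file into a string first and pass that in place of `sample`.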

From what I could remember, Orion Platform 2017.3.5 SP5 should be running RabbitMQ v1.1.40; the MSI is called Solarwinds.RabbitMQ.install.msi

So this log showing version 1.3 of rabbitMQ made me wonder if this was the problem.

I looked into the installer logs for the date we did the upgrade (C:\ProgramData\SolarWinds\Logs\Installer\2020-01-14_10-23-44) and in AdministrationService.Client.log I found:

RABBITMQ:1.1.40.0 msiexec.exe - skipped

2020-01-14 10:24:20,672 [21] DEBUG (null) SolarWinds.Administration.UpdatePathResolver.UpdatePathResolver - InstallChain:.............[RABBITMQ, 1.3.1000.420]
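The two installer log entries above can be pulled out with a couple of regular expressions. This is only a sketch: it assumes other entries follow the same `PRODUCT:version ... - skipped` and `InstallChain:...[PRODUCT, version]` shapes as the two lines quoted here, and `installer_versions` is a hypothetical helper name.

```python
import re

# "RABBITMQ:1.1.40.0 msiexec.exe - skipped"
SKIPPED_RE = re.compile(
    r"(?P<product>[A-Z0-9_]+):(?P<version>\d+(?:\.\d+)*)\s+\S+\s+-\s+skipped"
)
# "InstallChain:.............[RABBITMQ, 1.3.1000.420]"
CHAIN_RE = re.compile(
    r"InstallChain:\.*\[(?P<product>[A-Z0-9_]+),\s*(?P<version>\d+(?:\.\d+)*)\]"
)


def installer_versions(log_text, product="RABBITMQ"):
    """Collect the versions the installer skipped and the versions it planned to install."""
    skipped = [m.group("version") for m in SKIPPED_RE.finditer(log_text)
               if m.group("product") == product]
    chain = [m.group("version") for m in CHAIN_RE.finditer(log_text)
             if m.group("product") == product]
    return {"skipped": skipped, "install_chain": chain}


# Sample text copied from the log excerpt in this post.
sample = (
    "RABBITMQ:1.1.40.0 msiexec.exe - skipped\n"
    "2020-01-14 10:24:20,672 [21] DEBUG (null) "
    "SolarWinds.Administration.UpdatePathResolver.UpdatePathResolver - "
    "InstallChain:.............[RABBITMQ, 1.3.1000.420]\n"
)
print(installer_versions(sample))
```

If the skipped version and the install chain version disagree, as they do here, that is the kind of mismatch worth looking into.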

So it looked like, for some reason, the system still thought the newer version of RabbitMQ was on the server. But when I looked in the installer folder, RABBITMQ-1.3.1000.420-Solarwinds.RabbitMQ.Install.msi was not present, while the MSI for version 1.1.40 was.
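Checking the installer folder for a given MSI version could look something like the sketch below. It assumes every RabbitMQ package follows the `RABBITMQ-<version>-<name>.msi` pattern of the one file name quoted above. The 1.1.40 file name in the sample listing is hypothetical, since the thread does not give it in full, and both helper names are made up.

```python
import re

# Assumed naming pattern, based on the single file name quoted in this post:
# RABBITMQ-1.3.1000.420-Solarwinds.RabbitMQ.Install.msi
MSI_RE = re.compile(r"^RABBITMQ-(?P<version>\d+(?:\.\d+)*)-.+\.msi$", re.IGNORECASE)


def rabbitmq_msi_versions(file_names):
    """Return the versions of all RabbitMQ MSI packages among the given file names."""
    versions = []
    for name in file_names:
        match = MSI_RE.match(name)
        if match:
            versions.append(match.group("version"))
    return versions


def msi_is_missing(expected_version, file_names):
    """True when no MSI for the expected version is in the listing."""
    return expected_version not in rabbitmq_msi_versions(file_names)


# Hypothetical folder listing; on a real server use os.listdir(<installer folder>).
listing = ["RABBITMQ-1.1.40.0-Solarwinds.RabbitMQ.Install.msi", "readme.txt"]
print(rabbitmq_msi_versions(listing))
print(msi_is_missing("1.3.1000.420", listing))
```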

So my logic was: if the system thought that 1.3 was installed, but 1.1.40 actually was, then this might be causing the issues.

So I copied the MSI for version 1.3 over and ran the uninstall. I deleted the Ericsson key from the registry and then re-ran the MSI for version 1.1.40. Nothing to lose, right? RabbitMQ wasn't able to start anyway!
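For anyone who wants the order of those steps written down, here is a small sketch that only builds the command lines and does not run anything. It assumes the uninstall and install are done with the standard Windows Installer switches (`msiexec /x` to uninstall, `msiexec /i` to install); the post does not say exactly how the MSIs were run. The registry step stays manual because the key path is not given in this thread. `reinstall_plan` and the example file names are hypothetical.

```python
def reinstall_plan(wrong_msi, correct_msi):
    """Return ordered (description, command) pairs; command is None for manual steps."""
    return [
        ("Uninstall the version the system believes is installed",
         ["msiexec", "/x", wrong_msi]),
        ("Delete the Ericsson registry key (manual step, path not given here)",
         None),
        ("Install the version that belongs to this Orion release",
         ["msiexec", "/i", correct_msi]),
    ]


# Hypothetical file names, for illustration only.
plan = reinstall_plan("rabbitmq-1.3.msi", "rabbitmq-1.1.40.msi")
for number, (description, command) in enumerate(plan, start=1):
    shown = " ".join(command) if command else "(manual)"
    print(f"{number}. {description}: {shown}")
```

Take a snapshot or backup before trying anything like this on a production poller.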

The MSI finished, I checked the registry for the Ericsson key and, lo and behold, the key had been recreated and all of the arguments were populated correctly (they were now the same as on our dev server).

So I crossed my fingers, opened OSM, selected RabbitMQ and hit the start button................................

It's ALIVE!!!!!!!!!!!!!!!!!!!!!!!!

So if you experience anything like this, I'd suggest checking that the version of RabbitMQ that is actually installed is the version the system thinks it is.
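One way to make that comparison less error-prone is to compare only the leading parts of the version strings. The sketch below assumes that major.minor is enough to tell the releases apart, which fits the numbers in this thread (the 1.3.0.37 assembly versus the 1.3.1000.420 and 1.1.40.0 packages) but is not a documented SolarWinds rule. Both function names are made up.

```python
def release(version, parts=2):
    """Return the first `parts` numeric components of a dotted version as a tuple."""
    return tuple(int(piece) for piece in version.split(".")[:parts])


def same_release(a, b, parts=2):
    """True when two version strings agree on their leading components."""
    return release(a, parts) == release(b, parts)


# Versions quoted in this thread.
print(same_release("1.3.0.37", "1.3.1000.420"))   # True: both are 1.3
print(same_release("1.1.40.0", "1.3.1000.420"))   # False: 1.1 versus 1.3
```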

Level 11

So SolarWinds is running, but RabbitMQ still will not start. After 2 days of talking with support, they are asking us to reinstall all of the main poller???

Please, please, does someone know how to fix the issue with RabbitMQ not starting?

Reinstalling the main polling engine is actually not that drastic. Your data lives in the database and will still be there after the reinstall. Typically they have a script that does a very good job of removing Orion, and the reinstall is like a fresh start. If you have customized pages and templates you may lose those, but very few people would be impacted. I say run the reinstall.

What do the logs say?

Community Manager

simonp73​ I just got in touch with support management. Looks like the case was opened with medium severity but it has been updated to system down. Someone should be reaching out shortly. Please let us know if that is not the case.

What seems to be the issue?

To start with, the initial upgrade to 12.5 stopped RabbitMQ from starting, and RabbitMQ doesn't seem to be configured correctly at all, so it won't start.

Support said to upgrade the upgrade, but that now seems to be stuck in a loop in the Config Wizard: it says there are multiple pollers and just loops back and forth between the DB selection screen and the message saying it's a multiple-poller environment.

About to get our Windows guys to restore the snapshot from before we started the install.

Click OK and it just goes back to the DB selection screen

Ouch. You're gonna have to do the fix outlined in that article across all your pollers, and then go into the advanced config for each APE and switch their PubSub to "OverMessageBus" (advanced config Success Center)

Yep.  Currently rolling the main poller back to the snapshot before I started the upgrade

Lucky I have the snapshot

Really need to work out where this all went pear shaped!

For the RabbitMQ issue, take a look at this article: Success Center - RabbitMQ Fatal Error Insufficient ciphers

Do you have any APEs?

9 APEs and 1 AWS

Nothing to do with ciphers

What's the error in the RabbitMQ log?

There wasn't a log

RabbitMQ just wasn't installed correctly, and no matter how many times we ran service.bat install, it still wouldn't fix it.
