Level 13

Solarwinds is now horribly unstable.


Last week I upgraded to the latest versions of all SolarWinds products, and now every two days it stops working to the point where I have to reboot the server.

Alerts still work, but you can't view anything on the main front page. I have re-run the configuration for the web front end several times and it still doesn't work. I have included an any/any exception so that nothing will be blocked, though it never had this problem prior to the upgrade.

Does anyone else have this issue?  Or something similar?

pastedImage_0.png

1 Solution
Product Manager

Alright, we’ve gotten enough feedback to officially declare the fix for this issue is to change the TransferMode property in the Information Service settings from Streamed to Buffered.  To recap, you can do this in the centralized settings section of your website:

1) Go to http://YOURHOSTNAME/orion/Admin/AdvancedConfiguration/Global.aspx
2) Find "SolarWinds.Orion.InformationServiceClient" and change TransferMode to "Buffered"
3) To apply changes please restart Orion services on all SolarWinds Servers.


You are not losing anything by making this change – Streamed was introduced in 12.3/2018.2 to help manage SWIS Memory consumption.  Moving to buffered is basically how SWIS has worked in every other version. 

 
We will soon be introducing an official fix that will make this change. We’ll also be revisiting Streamed mode on our end to see if we can make it work without causing port exhaustion, but that will be at some point down the line. 
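For anyone who wants to check whether a server is heading toward port exhaustion before it falls over, here is a minimal sketch in Python. It is not a SolarWinds tool: it only counts TCP connections per state from Windows `netstat -an` output, and the function names `count_tcp_states` and `current_tcp_states` are made up for this example. A steadily growing count of connections in a single state (for example `TIME_WAIT`) between samples would be consistent with the symptom described above.

```python
from collections import Counter
import subprocess


def count_tcp_states(netstat_output: str) -> Counter:
    """Count TCP connections per state from Windows `netstat -an` text."""
    states = Counter()
    for line in netstat_output.splitlines():
        parts = line.split()
        # Windows netstat TCP rows look like: TCP <local> <remote> <state>
        if len(parts) >= 4 and parts[0] == "TCP":
            states[parts[3]] += 1
    return states


def current_tcp_states() -> Counter:
    """Run netstat on the local machine and summarize its TCP rows."""
    result = subprocess.run(
        ["netstat", "-an"], capture_output=True, text=True, check=True
    )
    return count_tcp_states(result.stdout)
```

Calling `current_tcp_states()` every few minutes and logging the result is one way to see whether the numbers climb until the website stops responding. What counts as "too many" depends on the dynamic port range configured on your server, so treat the output as a trend rather than a threshold.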


197 Replies
Level 11

I'm having issues with the latest version as well. It started with Hardware Health throwing a strop, but then Friday night (just after we had left, of course) everything just died a death.

A reboot brings it back up, but then the Collector service chews up all the CPU after a while and you have to reboot again.

I also have a case open, waiting to hear back.

My server was fully patched before upgrading.

Level 12

Are your MS patches up to date? We, and a few others via the Help Desk, have had issues:

To make it clear, as Microsoft always makes it a little difficult:

You need https://support.microsoft.com/en-ie/help/4095875/description-of-the-security-and-quality-rollup-for-net-framework-3-5-f

Which means your Windows Server 2012 R2 servers require:

KB4103473 - https://support.microsoft.com/en-ie/help/4103473

Download it here; please choose the Windows Server 2012 R2 one!

https://www.catalog.update.microsoft.com/Search.aspx?q=4103473

Please let me know if this helps or if you require further assistance.

Best Regards,

A Help Desk Engineer

After the patch was applied, no issues.


I applied the patches before doing the application upgrade.

pastedImage_0.png

I did a reboot so that all patches were applied.

Level 9

I started having the same issue after the latest set of upgrades. Without fail, every two days the server crashes and I have to shut down the services and restart them, and then everything is fine for another two days. I also have a support ticket open and have sent in diagnostics and log files. Tech support has looked things over twice, but the issue has not been resolved yet. Hopefully this can be resolved soon.

Could you pass on your case number, please?


My ticket number for reference is as follows - 00106029

thank you!


I've had a similar issue, and I've been working closely with an AE on diagnosing it. I haven't heard back in a while; usually that means they've got the data they need and something's cooking.

What's the status of your SolarWinds services? Does the Orion Module Engine keep stopping/restarting every few minutes?

My Orion Module Engine was crashing and restarting so often that it caused multiple instances of the Windows logging process and ran my CPUs up to 99%. This caused page load errors, database timeouts, and even kept me from being able to RDP into the server. Restarting all the services or rebooting the server provided only temporary relief, as the problem always returned.

Last weekend, I went ballistic on the server and performed the following:

     1. Shut down all services.

     2. Copied the c:\ProgramData\Solarwinds\JobEnginev2 folder to a backup location.

     3. Went to Control Panel and uninstalled SolarWinds Job Engine v2.13.1337.

     4. Deleted the Job Engine v2 folder located at c:\ProgramData\Solarwinds\JobEnginev2

     5. Went to Control Panel and ran a repair on the Visual C++ 2013 Redistributable, which forced a reboot (probably not needed).

     6. After the reboot, re-installed Job Engine v2 by running C:\ProgramData\Solarwinds\Installers\JobEnginev2.msi

     7. Ran the SolarWinds Configuration Wizard (selected services).
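
If you want to script the backup in step 2 rather than copy the folder by hand, a rough Python sketch follows. It assumes the services are already stopped (step 1). The source path is the one from the steps above; the backup location and the function name `backup_folder` are made up for illustration.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_folder(source: Path, backup_root: Path) -> Path:
    """Copy `source` into `backup_root` under a timestamped name and return the copy's path."""
    if not source.is_dir():
        raise FileNotFoundError(f"Nothing to back up at {source}")
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = backup_root / f"{source.name}-{stamp}"
    # copytree creates the target (and any missing parents) and leaves the source untouched.
    shutil.copytree(source, target)
    return target


# Example call (hypothetical backup location, adjust to your environment):
# backup_folder(Path(r"C:\ProgramData\Solarwinds\JobEnginev2"), Path(r"D:\Backups"))
```

The timestamp in the folder name means repeated runs never overwrite an earlier backup, which is handy if you end up repeating the procedure.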

After performing the above, my server has been stable, with no Orion Module Engine service crashes or any other crashes for that matter.

However, if the problem returns, I will open a case.

Have you tried the steps outlined here, and also updated to the latest hotfix that was released on Friday?


I have not applied either of those items, as I am just finding out about them today; I will keep them in mind if any problems return and make sure they are applied before opening a case. However, my Information Service seems to be fine, at least for now. It was the Orion Module Engine that was causing my issues; it would crash and cause the NetFlow and Collector services to crash as well.

My server has been fine since I re-installed the job engine, and I will probably wait for the dust to settle before tinkering with it any further, unless of course issues return.

Nope.  The information service was stopped, but that's about it.


Same

Level 9

Hi there,

This looks like the .NET Framework cache on the server that is hosting the website is broken.

Please try the following:

1. Stop IIS

2. Stop all SolarWinds services on that device

3. Delete the subfolder found inside each of the following folders:

     - "C:\Windows\Microsoft.NET\Framework\v4.0.30319\Temporary ASP.NET Files\root" (in my case the subfolder is named "53bc9024")

     - "C:\Windows\Microsoft.NET\Framework64\v4.0.30319\Temporary ASP.NET Files\root" (in my case the subfolder is named "e22c2559")

4. Start IIS and the SolarWinds services

Please report back if that fixes the issue. I created a scheduled task that does what I described every day at 1 AM, because it frequently happens in my environment too.
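
For reference, the folder cleanup in step 3 could be sketched in Python roughly like this. It is not the actual scheduled task, just an illustration: the function name `clear_subfolders` is made up, the two paths are the ones listed above, and the script assumes IIS and the SolarWinds services have already been stopped (for IIS, for example, with `iisreset /stop`).

```python
import shutil
from pathlib import Path

# The two cache roots named in the steps above.
ASPNET_TEMP_ROOTS = [
    Path(r"C:\Windows\Microsoft.NET\Framework\v4.0.30319\Temporary ASP.NET Files\root"),
    Path(r"C:\Windows\Microsoft.NET\Framework64\v4.0.30319\Temporary ASP.NET Files\root"),
]


def clear_subfolders(root: Path) -> list[str]:
    """Delete every subfolder directly under `root`; return the names removed."""
    removed = []
    if not root.is_dir():
        return removed  # Nothing to do if the cache root does not exist.
    for child in sorted(root.iterdir()):
        if child.is_dir():
            shutil.rmtree(child)
            removed.append(child.name)
    return removed


# Example: run only while IIS and the SolarWinds services are stopped.
# for root in ASPNET_TEMP_ROOTS:
#     print(root, clear_subfolders(root))
```

Deleting whatever subfolders exist, instead of hard-coding names like "53bc9024", matters because those names differ from server to server.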

Hope this helps!

That didn't fix my problem.


I'll give it a go next time.  I always get this error:

So I found this - SWIS discover Interface Error

I'll give both a try and see what works.

Do you know why the .NET Framework cache becomes full?

pastedImage_0.png

The .NET Framework cache does not really get "full"; it just gets corrupted or something.

It happens to me very irregularly, which is why I created the scheduled task to run daily at a time when nobody needs the Orion web interface.

Better to be safe than sorry!

I have two questions for you though:

1. Where do you get this error exactly? I have never seen an error message like that coming from solarwinds.

2. Did you try the suggested troubleshooting steps? Did you configure the antivirus exceptions?


When you restart all the services on a schedule, how do you keep sensitive alerts from firing when it comes back up?

My network colleagues are tired of being the first to detect Orion website "unexpected errors" when they try to do their before-the-users-arrive tasks. 

But my Storage and DBA colleagues will not tolerate bogus PagerDuty alerts when their sensitive monitors trigger as the services come back up, and the agents reconnect.


Every time Solarwinds plays up, I stop all services.  Start them, then try the site.  Nothing.

Then after running a config wizard for database/services/website (in either a combination or all together) that window then pops up.

I reboot the device, leave it some time and then it's all ok.

Level 13

I have raised a ticket with SolarWinds (00117521), and I have also collected diagnostics for them to view.