
Has anyone run into a problem with the Information Service crashing on a 10.2.1 deployment? This is a brand new single-poller Orion install connecting to a remote SQL 2005 database. We stood up a new Windows Server 2008 R2 VM image, installed Orion, connected to the SQL database, and almost immediately started seeing the following message in the Windows event log:
The SolarWinds Information Service service terminated unexpectedly. It has done this X time(s). The following corrective action will be taken in 60000 milliseconds: Restart the service.
It happens about once an hour, which causes the web site to either error out or time out.
I have a case (312670) open that has been dragging on for multiple days, so I wanted to check whether any other users have had this issue.
I'm sorry your case has been dragging out. You might consider upgrading to NPM 10.2.2, which was released this week.
I'll ping dev and have them review your case to see if there is anything obvious. If you are still having issues, I will get it escalated.
Mav
I had dev take a quick look at your case. They are still investigating but have some ideas of why you are seeing this. We'll update you with more information soon.
Mav
Hi,
Have you tried disabling third-party applications that could affect Orion's normal functioning, e.g. firewalls, antivirus software, etc.?
Also, you could put map resources on dedicated views to reduce the web server load.
I removed all non-Orion programs early on because we were worried about conflicts, but I am still seeing the crashes. I will look into the map setting and make the change today.
I checked on our views in use and I don't think I will gain much by adding/removing maps. There are only a few maps that are really used by everyone and they will be displayed in our operations and help desk areas regardless of dedicated views.
I saw the service crash a couple of times on Saturday and it is already happening again today, so I uploaded a new diagnostics file. If any of the developers have any ideas, please take a look; this has been happening for a solid week now.
So do the developers have any ideas? My support tech just asked me what event log error I'm seeing again. I included that error in the first email I sent; this is getting a little old.
jbehrmann, we have a fix for this we would like you to test. Support should be reaching out to you soon to provide the details. I apologize for the back and forth on this, but you should have resolution soon.
Mav
Patched file has been installed, I will keep everyone in the loop. Thanks.
Well, I had nearly a week of peace but the error resurfaced yesterday. I haven't caught the web site crashing yet, but there are a number of errors logged in the system log:
Log Name: System
Source: Service Control Manager
Date: 3/2/2012 9:04:41 AM
Event ID: 7031
Task Category: None
Level: Error
Keywords: Classic
User: N/A
Computer: XXXXXXX
Description:
The SolarWinds Information Service service terminated unexpectedly. It has done this 2 time(s). The following corrective action will be taken in 60000 milliseconds: Restart the service.
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="Service Control Manager" Guid=" XXXXXXX }" EventSourceName="Service Control Manager" />
<EventID Qualifiers="49152">7031</EventID>
<Version>0</Version>
<Level>2</Level>
<Task>0</Task>
<Opcode>0</Opcode>
<Keywords>0x8080000000000000</Keywords>
<TimeCreated SystemTime="2012-03-02T15:04:41.782750900Z" />
<EventRecordID>10527</EventRecordID>
<Correlation />
<Execution ProcessID="528" ThreadID="30328" />
<Channel>System</Channel>
<Computer>XXXXXXX</Computer>
<Security />
</System>
<EventData>
<Data Name="param1">SolarWinds Information Service</Data>
<Data Name="param2">2</Data>
<Data Name="param3">60000</Data>
<Data Name="param4">1</Data>
<Data Name="param5">Restart the service</Data>
</EventData>
</Event>
Just to fill everyone in: we had run an APM install after patching our Information Service DLL, and that install replaced one of the file copies in the web folder. Once I re-applied the fixed version to that location, the service stabilized again.
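For anyone who wants to check whether a later install has overwritten a patched file, here is a minimal Python sketch. It assumes you kept a copy of the patched DLL somewhere safe and simply looks for other copies under the install folder whose contents differ from it. The function names, the folder paths, and the DLL file name in the usage note are hypothetical placeholders, not actual Orion names.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_stale_copies(root: Path, file_name: str, patched: Path) -> list[Path]:
    """List every copy of file_name under root whose content differs from the patched file."""
    expected = sha256_of(patched)
    return sorted(
        candidate
        for candidate in root.rglob(file_name)  # searches all subfolders, including any web folder
        if candidate.is_file() and sha256_of(candidate) != expected
    )
```

A call would look something like `find_stale_copies(Path(r"C:\path\to\install"), "Patched.File.dll", Path(r"C:\path\to\saved\Patched.File.dll"))`, with both paths and the file name replaced by your own. Any path it returns is a copy that no longer matches the patched version and may need the fix re-applied.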
Thanks for keeping us updated, jbehrmann.
DH
Hi jbehrmann,
We are experiencing the same issue, but with NPM 10.2.2 and SAM 5.0. I think this error is occurring once an hour. Did your fix solve the problem?
Thanks!
Hi jbehrmann, hi everybody, I have the same error on my NPM 10.2 server.
I have opened many tickets and applied many suggested solutions, but I still get the same issue.
I heard about a 10.2.2 patch... where can I obtain it?
Thanks