Product Manager

NPM 12.3 Orion 2018.2 Upgrade Feedback

What has your upgrade to NPM 12.3 on Orion Platform 2018.2 looked like? We on the product manager team would like to hear about all of it: the good, the bad, and the ugly! As a starting point, here is a quick getting-started blog post on upgrading to Orion Platform 2018.2: Preparing for the Upgrade to 2018.2

310 Replies

hpstech  wrote:

Hola Serena -

We already upgraded Production to 12.3. So I guess the next DAU for us would be for 12.4.

Unless of course you'd like to help design our HA initiative for AWS.

Our main focus now is doing HA between on-prem and AWS with our newly acquired HA license.

From the looks of it, there are many considerations for this idea, but it's something that's been mandated.

Exactly! I'm looking to get you in for the DAU for 12.4 when we're ready. Just checking if that's something you'd want to do.

Definitely!

We're first going to get HA in AWS working and stable, then we'll look forward to the next major NPM upgrade.

Hopefully the release date lines up with where we're at.

Thanks

awesome!!

Upgrade Nightmare,


The upgrade process itself was very easy, with no problems or errors during the install. The issues started as soon as I opened the web browser. At first glance everything looked fine, but I soon got the server error prompt in the upper right corner and noticed there was no menu bar. I did some digging and found a mess of SWIS events like this one:

ERROR SolarWinds.InformationService.ChangeBrokerEvaluator.LegacyEngine.LegacyEvaluator - (null) (null) Subscription match handler failed with an exception. Indication: System.InstanceModified, SubscriptionId: , Handler: System.Action`1[SolarWinds.InformationService.Addons.PubSub.EvaluatorNotification], Exception: System.Net.Http.HttpRequestException: An error occurred while sending the request. ---> System.Net.WebException: Unable to connect to the remote server ---> System.Net.Sockets.SocketException: No connection could be made because the target machine actively refused it [::1]:17799.

After 12 hours on the phone with Support, we were able to determine that it's an issue with post-installation of the Microsoft Cortex Web integration module to IIS, and the ticket was sent to DEV: 00161459.
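
The SocketException in the log above means that nothing accepted a TCP connection on the IPv6 loopback address [::1] at port 17799. Here is a minimal sketch for checking that from the server itself, assuming Python is available there. The function name port_is_open is made up for illustration, and the log does not say which service is expected to listen on that port.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Address and port are taken from the error message in the post.
    for host in ("::1", "127.0.0.1"):
        state = "open" if port_is_open(host, 17799) else "closed or unreachable"
        print(f"{host}:17799 is {state}")
```

If the port reports as closed on both loopback addresses, the listener is probably not running at all; if only [::1] is closed, the listener may be bound to IPv4 only. Either reading is a guess to hand to Support, not a diagnosis.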

Thank you for differentiating between the upgrade experience and the aftereffects. I'll follow the ticket you sent over to see the results.

Upgraded last Thursday night. Third attempt, and finally a success!

First attempt, I ran into database issues: something about ID numbers in the APM tables being too large to squeeze into an integer. I recommended an update to the pre-flight checklist to automatically scan the database for any numbers that are too large to be converted correctly by the Configuration Wizard. Or, of course, change the variable type to be at least as large as the old maximum value. Our stopgap (learned after rolling back this install) was to truncate the APM tables, deleting all of our APM data. If you're getting an Arithmetic error after installing and while running the Configuration Wizard, call Tech Support, have them identify the table(s) at fault, and you can truncate the table (at the cost of all of the data in that table).
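
The pre-flight scan suggested above could look roughly like the following. This is only a sketch under stated assumptions: it uses an in-memory SQLite database as a stand-in for the real Orion database, and the table name APM_Example and column ID are hypothetical, not actual Orion schema names. It looks for values that would not fit in a signed 32-bit integer.

```python
import sqlite3

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold


def find_int32_overflows(conn, table, column):
    """Return values in table.column that exceed INT32_MAX, ascending.

    table and column are interpolated into the SQL text, so they must be
    trusted identifiers, never user input.
    """
    sql = (
        f'SELECT "{column}" FROM "{table}" '
        f'WHERE "{column}" > ? ORDER BY "{column}"'
    )
    return [row[0] for row in conn.execute(sql, (INT32_MAX,))]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # Hypothetical table standing in for one of the affected tables.
    conn.execute('CREATE TABLE "APM_Example" ("ID" INTEGER)')
    conn.executemany(
        'INSERT INTO "APM_Example" ("ID") VALUES (?)',
        [(1,), (INT32_MAX,), (INT32_MAX + 1,), (5_000_000_000,)],
    )
    print(find_int32_overflows(conn, "APM_Example", "ID"))
```

Running a check like this before the upgrade would flag the offending rows while there is still time to decide between widening the column and discarding the data.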

On the second attempt, I filled the datastore of my VM and crashed the system. Not SolarWinds' fault, but I had to roll back anyway. Thank goodness for regular backups!!!

My third attempt was finally successful. The online installer hung without error, and I sat watching a crashed installer for at least 45 minutes, not knowing that it had failed. (Note: keep an eye on the download speed from the beginning; if it drops significantly for an extended period, this may have happened to you as well.) After contacting Tech Support, I downloaded the latest offline installer, and this completed successfully on the first try, and much faster than waiting for all of the components to download individually. I will be using this method for future upgrades/installs, even though the online installer is "recommended."

Then when upgrading our 8 Additional Polling Engines, 2 completed successfully, 2 needed a reboot post-install to function normally, and 4 failed to uninstall the old software and then were missing some package files, so I needed to call in to get that resolved. Fortunately the gentleman from Manila who assisted me was able to isolate the problem, identify it as a never-before-seen issue, and resolve it. He also wrote a KB article to help anyone else who encounters the issue in the future.

I'm still seeing extremely high CPU usage on almost all of my pollers. Still looking into this issue...

Total install time: Approximately 8 hours.

Hi Jon,

When the online installer hung, did that occur on the main poller or on the additional polling engines?

It was the main poller. The additional polling engines did not use an online installer; they used an APE installer generated by the other installer.

Great! Have you used the online installer previously on this same VM? Was the download speed similar or worse?

I previously installed 12.0.1 / 2016.2 on it, which is what I was upgrading from. I don't think that installer was online; I installed our 4 components separately. That version was before the days of the Unified Installer.

No crashes, rock solid.

Very happy so far.

Thank you all from the SolarWinds team and early adopters.

I think our success is due to starting with a previously "fresh" 12.2 deployment.

We had a dev-assisted upgrade to 12.2, which failed miserably. The engineer on the DAU call gave me the "Owl" script and we blew away the entire installation. The box had been the victim of many in-place upgrades since the 11.x days.

I think starting fresh on 12.2 has made the 12.3 upgrade painless for us so far.

All props to the Owl.

Upgraded tonight.

So far all is OK.

I'll be reporting any issues here.

Thank you all for being early adopters.

Orion Platform 2018.2 HF4, IPAM 4.7.0, NCM 7.8, CloudMonitoring 2.0.1, NPM 12.3, DPAIM 11.1.0, NTA 4.2.3, SAM 6.6.1, NetPath 1.1.3 © 1999-2018 SolarWinds Worldwide, LLC. All Rights Reserved.

Going on 10+ hours and no issues.

Woo hoo.

After the upgrade, the geographical maps are missing.

After upgrading to NPM 12.3, we are seeing high CPU utilization on the main poller server, as shown in the screenshot below.

high cpu.jpg

The following error also occurs when we try to access the web console:

webconsole error.JPG

Are you on HF4 of the Orion Platform? If you check the footer of your web console, it should show you what is installed.
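
For comparing installs, the footer text can also be split into product and version pairs. Here is a minimal sketch, assuming the footer has the comma-separated shape quoted earlier in this thread. parse_footer is a hypothetical helper, not part of any SolarWinds tooling.

```python
import re

# One footer item: a product name, a dotted version, and an optional hotfix tag.
ITEM = re.compile(
    r"^(?P<name>.+?)\s+(?P<version>\d+(?:\.\d+)*)(?:\s+(?P<hotfix>HF\d+))?$"
)


def parse_footer(footer):
    """Map each product name to a (version, hotfix) tuple; hotfix may be None."""
    # Drop the copyright notice, if present, before splitting on commas.
    products_part = footer.split("©", 1)[0]
    result = {}
    for item in products_part.split(","):
        match = ITEM.match(item.strip())
        if match:
            result[match["name"]] = (match["version"], match["hotfix"])
    return result


if __name__ == "__main__":
    # Footer string as posted earlier in the thread.
    footer = (
        "Orion Platform 2018.2 HF4, IPAM 4.7.0, NCM 7.8, CloudMonitoring 2.0.1, "
        "NPM 12.3, DPAIM 11.1.0, NTA 4.2.3, SAM 6.6.1, NetPath 1.1.3 "
        "© 1999-2018 SolarWinds Worldwide, LLC. All Rights Reserved."
    )
    version, hotfix = parse_footer(footer)["Orion Platform"]
    print(f"Orion Platform {version}, hotfix: {hotfix}")
```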

Yes, we have installed Orion Platform 2018.2 HF4.

Even with HF4?

Thanks

Yes, even after applying Orion Platform 2018.2 HF4.

I'm on Orion Platform 2018.2 HF3, WPM 2.2.2, SRM 6.6.0, NCM 7.8, CloudMonitoring 2.0.1, NPM 12.3, DPAIM 11.1.0, NTA 4.2.3, VMAN 8.2.1 HF1, SAM 6.6.1, NetPath 1.1.3, and now I'm starting to get issues where, a couple of days after a reboot, the website loses some resources and won't load. The error from the website, when I download it locally, has this in it:

Message: Error: A query to the SolarWinds Information Service failed.

ErrorSite: OrionWeb.InformationServiceProxy.PropagateException

ErrorType: SolarWinds.Orion.Web.InformationService.SwisQueryException

Stack:

at SolarWinds.Orion.Web.InformationService.InformationServiceProxy.PropagateException(Exception e)

at SolarWinds.Orion.Web.InformationService.InformationServiceProxy.Query(String query, IDictionary`2 parameters, Boolean useSeparateExecutionContext, Nullable`1 dataProviderTimeout)

at SolarWinds.Orion.Web.DAL.EventDAL.GetEvents(GetEventsParameter param, Boolean federationEnabled)

at Orion_Controls_EventsReportControl.LoadData(Int32 nodeID, Int32 netObjectID, String netObjectType, String deviceType, Int32 eventType, DateTime periodBegin, DateTime periodEnd, Boolean showClearedEvents, Int32 maxRecords)

at Orion_NetPerfMon_Resources_Events_LastXEvents.Page_Load(Object sender, EventArgs e)

at System.Web.Util.CalliEventHandlerDelegateProxy.Callback(Object sender, EventArgs e)

at System.Web.UI.Control.OnLoad(EventArgs e)

at SolarWinds.Orion.Web.UI.BaseResourceControl.OnLoad(EventArgs e)

-- Inner Exception

  Message: The socket connection was aborted. This could be caused by an error processing your message or a receive timeout being exceeded by the remote host, or an underlying network resource issue. Local socket timeout was '00:01:00'.

A reboot makes the interface look right again, but I'm afraid this is going to keep happening now. Should I try HF4?

I would definitely recommend going to HF4 if you can.