
Orion Platform - 2020.2 - A Gateway to Your Fastest Upgrade Ever!

 

Time is money.


This timeless quote from Benjamin Franklin is clearly relevant to what all of us deal with each and every day. Time is a precious commodity we never seem to have enough of, and as part of the Orion Platform 2020.2 release, we are extremely excited to announce some very important enhancements targeted at giving time back. In this release, we continue to build upon the foundation delivered in previous versions, with the goal of improving the overall install/upgrade process. With 2020.2, these improvements should not only simplify the upgrade process but, more importantly, drastically reduce the downtime it requires.

As a software company, we work diligently to provide a constant stream of added value to our users, and that value is rolled up into each release. We want our users to be on the latest and greatest releases, as this allows you to take advantage of the most recent bug fixes, security updates, and new features. The fact is that, especially in larger or more complex environments, upgrading to the latest release has been a considerable investment of your time. Many customers have spent anywhere from multiple hours to multiple days completing an upgrade, making it a less than appealing undertaking. There is little doubt that we need to continue to make significant changes in this area.

As an example, a customer recently reached out requesting feedback on the most efficient method to complete their upgrade. The customer's environment consisted of multiple Orion modules and 30 Additional Polling Engines. Needless to say, there was concern over how long this process would take, given a tight window to get it done. The customer ended up manually copying the offline installer onto each and every one of their polling engines. All said and done, they stated that they were able to complete their upgrade in a little under 3 hours, which wasn't too bad. However, that did not account for all the time they had spent downloading and copying files to prep the environment. We can clearly improve on this process, and we did!

Updates & Evaluations

As an Orion Administrator, you will undoubtedly be familiar with the My Orion Deployment page within the web console. Upon visiting this page and selecting the Updates & Evaluations tab, you will be greeted by some adjustments to the user interface.

jblankjblank_0-1585795300657.png

As you can see from the example above, the upper portion provides specific information in each of the three sections. The first section indicates whether any major updates are available, the second calls out patches and hotfixes based on what is installed, and the third lists products that may not be a part of your current installation.

Make a Plan

Shifting our focus to the bottom half of the screen, we have three paths laid out before us. Whether or not you realized it, the Orion Update Tool can be used to validate your compatibility level with certain versions, or even with a new product you want to trial. The problem we heard from many of you was that this was not clear. Others realized this was possible but steered clear of the tool for fear of beginning an upgrade by accident.

That problem should now be resolved, so we begin with the Make a Plan option. Clicking this button will take you through a workflow you may recognize, indicating which modules and versions you are currently running and what they will be upgraded to.

jblankjblank_1-1585796871696.png

This screen also provides ample opportunity to learn about other products you may not be familiar with, but for the purpose of this walk-through, we will scroll down and continue on. The next step will focus on establishing a connection from the main poller to any scalability engines distributed within the environment. This includes Additional Polling Engines, Web Servers, or HA servers. If the system is unable to connect to a scalability engine, you will receive a warning with actionable information similar to below.

jblankjblank_2-1585797086159.png

You may choose to rectify that issue immediately or move forward; it is up to you. The following step performs what you will hear us internally call the pre-flight checks. In essence, these are system checks of each server that provide valuable feedback on changes that must be made before you can upgrade, as well as recommendations to be mindful of before proceeding. For example, it would be difficult to install an update on a polling engine if that machine is out of disk space. These are items you may simply not be aware of that could cause hiccups, or even prevent you from making progress, if you were moving forward with an actual update.
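To make the disk-space example concrete, here is a minimal Python sketch of what a single pre-flight style check could look like. It is not how the Orion Update Tool implements its checks; the `CheckResult` type, the `check_free_disk_space` function, and the 1 GiB threshold are all hypothetical names and values chosen for illustration.

```python
import shutil
from dataclasses import dataclass


@dataclass
class CheckResult:
    """Outcome of one hypothetical pre-flight check."""
    name: str
    passed: bool
    message: str


def check_free_disk_space(path: str, required_bytes: int) -> CheckResult:
    """Report whether the volume holding `path` has at least `required_bytes` free."""
    free = shutil.disk_usage(path).free
    passed = free >= required_bytes
    message = f"{free} bytes free on the volume containing {path}; {required_bytes} required"
    return CheckResult("free-disk-space", passed, message)


if __name__ == "__main__":
    # Assumed threshold for illustration only: 1 GiB of free space.
    result = check_free_disk_space(".", 1 * 1024**3)
    print("PASS" if result.passed else "BLOCKED", "-", result.message)
```

Running a check like this on each server before the change window surfaces the same kind of blocker early, while there is still time to fix it.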

jblankjblank_0-1585798579755.png

At the bottom of the screen, you will see four separate options. These are really centered around completing the task of planning your upgrade, in which case you can click done, run your system checks again, or even retrieve a printable version of the page. This may be handy for some of you with change control processes that require documentation. Another new and notable feature is also a perfect segue into our next topic, the option to proceed to download files.

Pre-Stage Files

Selecting the Start Downloading Files button pictured above is, in reality, the same workflow as if we were to have chosen the Pre-Stage Files button on the first page pictured below.

jblankjblank_3-1587419579581.png

The goal was to maintain a similar workflow throughout each of the three options while ensuring you have a clear and concise decision-making tree, with stopping points throughout the process.

Proceeding through the workflow will guide you down the same path regardless of where you clicked the button, at which point you can now pre-stage your environment. What this provides is a mechanism to prep your environment ahead of time, where previously you either had to do this manually yourself, downloading and copying files onto each scalability engine, or include this overhead as part of your change window needed to complete the upgrade. This cuts all that extra work out of the equation with just a few clicks! Once you have begun this process, you are free to go back to what you were doing and come back to check on it later.  

Pictured below, the staging begins with your primary poller and then transitions to upload bits to each of your scalability engines.

jblankjblank_0-1586314071762.png

jblankjblank_0-1587419294786.png

Feel free to go back to what you were doing or, of course, troubleshoot an issue if necessary; doing so has no effect on the upgrade. When you return, you will be greeted with progress indicators, and success indicators once staging has completed. When it is time for the actual change window, come back to this page and simply click the Upgrade Now button to proceed. Whether this is the next day or the next month really doesn't matter; you are set and ready to upgrade!

jblankjblank_2-1587419350152.png

As always, there is built-in intelligence in centralized upgrades: if you return to this page and a newer update is available, you will be notified and can choose to download the new bits prior to upgrading. Fear not, the system will also ensure that only the new bits necessary for the latest update are downloaded, rather than downloading absolutely everything again.
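The idea of fetching only the new bits can be sketched as a comparison between what the latest update requires and what is already staged. The snippet below is an illustration under assumptions, not the product's actual mechanism: the manifest format (file name mapped to a checksum string) and the `files_to_download` function are hypothetical.

```python
def files_to_download(required: dict, staged: dict) -> list:
    """Return the names of required files that are missing or outdated in the staged set.

    Both arguments map a file name to a checksum string (a hypothetical manifest format).
    """
    return sorted(
        name
        for name, checksum in required.items()
        if staged.get(name) != checksum
    )


if __name__ == "__main__":
    # Hypothetical file names and checksums, for illustration only.
    required = {"core.msi": "aaa", "module.msi": "bbb", "hotfix.msp": "ccc"}
    staged = {"core.msi": "aaa", "module.msi": "old"}
    print(files_to_download(required, staged))
```

Only the entries whose checksum differs, or that are absent from the staged set, end up in the download list; files that are already current are skipped.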

SDK and Configuration Wizard Improvements

The team was also able to deliver new verbs into SWQL Studio to ensure that advanced users have the ability to automate pre-staging functionality through the SDK.

jblankjblank_0-1587435848206.png

A new SWIS verb has been added to the Orion.Environment entity, and you can specify the "updatemode" as well.

  • Install & Upgrade = (1) - Arguments are optional
  • Hotfix & Patches Only = (2) - Arguments are optional
  • Install Only Evals = (3) - Arguments are mandatory and specify which eval products will be installed (NCM, SAM, and so on).
jblankjblank_1-1587436481071.png
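When scripting against these modes, it can help to validate the inputs before invoking anything. The Python sketch below encodes only the three "updatemode" values and the optional/mandatory argument rule listed above. The verb name and its real argument layout are not given in this text, so the sketch does not call SWIS at all; `UpdateMode`, `build_prestage_arguments`, and the returned dictionary shape are hypothetical helpers for illustration.

```python
from enum import IntEnum
from typing import Optional, Sequence


class UpdateMode(IntEnum):
    """The three "updatemode" values listed above."""
    INSTALL_AND_UPGRADE = 1      # arguments are optional
    HOTFIX_AND_PATCHES_ONLY = 2  # arguments are optional
    INSTALL_ONLY_EVALS = 3       # arguments are mandatory: the eval products to install


def build_prestage_arguments(mode: int, products: Optional[Sequence[str]] = None) -> dict:
    """Validate a mode/products pair and return it as a plain dictionary.

    The dictionary shape is an assumption for illustration, not the verb's wire format.
    """
    update_mode = UpdateMode(mode)  # raises ValueError for values other than 1, 2, or 3
    product_list = list(products or [])
    if update_mode is UpdateMode.INSTALL_ONLY_EVALS and not product_list:
        raise ValueError("updatemode 3 (Install Only Evals) requires at least one product")
    return {"updatemode": int(update_mode), "products": product_list}


if __name__ == "__main__":
    print(build_prestage_arguments(1))
    print(build_prestage_arguments(3, ["NCM", "SAM"]))
```

Failing fast on a missing product list for mode 3 keeps an automation run from reaching the server with a request that cannot succeed.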


A large number of updates and changes have also been made to the Configuration Wizard itself, designed to reduce the time it takes to complete. We have seen drastic improvements in this area, and combining both of these options makes for a significant reduction in the time dedicated to updating your environment.

IMPORTANT UPDATE

THE BEST PART: since this has been implemented in the 2020.2 release, you may be thinking that you won't get to take advantage of the new features until the next release. Normally that would be true, but that wouldn't really give us the wow factor we are after. You actually get to take advantage of this feature right now. For online environments running 2019.2 or later, the SolarWinds Administration (SWA) Service will automatically be updated, allowing you to take advantage of the functionality described above.

Want more good news!? As of 8/26/2020, offline environments can also take advantage of the Pre-Staging feature! Yes, this was one of the first questions asked once the original release was made available, and with the 2020.2.1 Service Release, we are happy to deliver Pre-Staging for offline environments. Just to reiterate what this means: when pre-staging, all services remain running and Orion continues to function normally throughout the process. This saves you from spending significant time performing these actions yourself and removes the task from your actual upgrade window. Only later, when you ultimately decide to upgrade, will there be any disruption in service. This disruption should also be drastically reduced thanks to the additional improvements made to the Configuration Wizard.

So, for those of you running offline environments you will immediately notice a different behavior from the typical offline installer as pictured below. 

jblankjblank_0-1598537942343.png

This allows you to choose your own adventure, and we certainly recommend using the Pre-Stage option. Once selected, the SolarWinds Administration Service will be updated and the installer will identify the necessary bits needed for your installation. 

jblankjblank_1-1598538312834.png

The appropriate files will then be copied from the .exe to the install directory of your server.

jblankjblank_2-1598538423958.png

Once this is complete, you will receive a success message and directions that allow you to copy the URL or click FINISH to launch the Orion Update Tool.

jblankjblank_0-1598539187809.png

From the Centralized Upgrades UI, you can now follow the same steps outlined above to either walk through the upgrade now, or pre-stage the necessary bits across all your scalability engines. 

Remember, for those of you who are currently running on Windows Server 2012 R2 and believe you can't upgrade, the 2020.2 release fully supports both Windows Server 2012 R2 and SQL Server 2012, as well as SQL Express for smaller environments. Be sure to check out the Release Notes and System Requirements for the platform and each of your corresponding modules.

The Orion Platform is responsible for delivering features that reduce complexity and provide a solid foundation for the modular system. We are very eager to hear your feedback and hope this provides a welcome change to the upgrade process. Please share your experiences below and let us know what you think.

Want More!

Be sure to check out the other Orion Platform posts, as we have even more in store and loads of new enhancements that will have a big impact.

  •  - this will not be a great option for offline environments for the time being. Technically you could upgrade the main polling engine and then pre-stage with the new features, but if your main polling engine is updated, then you would need to upgrade the rest of your scalability engines too, otherwise they won't work. 

    We are hoping to provide a way in which a user could potentially pre-stage the right way, even in an offline environment, but that is not available right now.   

  • Ahh, after spending around 6 hours upgrading to 2019.4 I got excited, until I read the restriction for offline environments. We have a proxy server that has access to SolarWinds for the customer portal etc. Will that allow the pre-staging?

  •  Thought a follow-up to my post was necessary.  After escalating my case, I received a call from SW tech support.  Spent the next few hours on the phone with the tech, and I'm happy to say all issues are resolved.  I have been a SolarWinds customer for nearly 10 years, and have worked with many tech support engineers.  Overall I believe the quality of tech support is good, but the tech I worked with (Matt) was on a whole other level.  Before touching anything, Matt took the time to ask questions in order to properly diagnose the issue and the current state of my environment.  I can't tell you how important this often overlooked step is.  With the background and context, he explained to me what he believed was the problem, and the approach we wanted to take towards fixing it.  He was very knowledgeable, in my opinion a cut above most tech support I've worked with.  In addition, he was patient, creative, and persistent.  He genuinely wanted to fix my issue, not just get off the phone, and it was clear he didn't want to stop until I was happy.  Although this is the way it should be, this experience with SW tech support, and specifically Matt, was outstanding.  As I mentioned in my previous post, I have a main polling engine and 3 additional polling engines.  My environment was fully functioning prior to the upgrade attempt, but all except the main polling engine were down for nearly 24 hours.

    The fixes were as follows:
    1.  Ran permission check on all polling engines and main engine.  Several permissions were repaired on all.

    2.  Checked Orion Service Manager and validated that all services were running.  A few services on each of the APEs were found to be down and were restarted.

    3.  I think the main problem was that the software on my main polling engine and APEs got out of sync.  Specifically, I am using the legacy trap system due to dependencies that we have.  I chose not to install Orion Log & Event Manager on my main polling engine, but for whatever reason it got installed on my APEs.  I don't recall there being an option not to install it though.  In any event, Matt removed the 2020.2 trap component and reinstalled the legacy system on all 3 of my APEs.  We re-ran the configuration wizards and everything is now resolved, and running better than ever, primarily due to Matt.

    Best,

    Tom

  • Some feedback from my upgrade:

    Ran the script to upgrade administration service, ran smooth.

    Back to the upgraded web page for upgrading orion:
    The first page, "Welcome page", after both "Make a plan" and "Pre-stage files" could be clearer that this is not the actual upgrade, just the plan/pre-stage. But technically they work fine, like it!!

    (Side note; when you run config wizard on the server you get nice info on what it is doing but now when installing on the web gui you get nothing more than percentage. I often log on to the server anyway, and open the config wizard log file while it's running just to know what it is doing. Can't we get that nice info on the web as well? Maybe a checkbox saying "show detailed progress" or something.)

    Got one error in config wizard:
    "Database configuration failed: Error while executing script- Cannot insert the value NULL into column 'ThresholdNameId', table 'SolarWindsOrion.dbo.Thresholds'; column does not allow nulls. INSERT fails. The statement has been terminated."

    Support and I found the issue, ran below to fix: (Case #00528523)

    -- Add the Volumes.Stats.PercentDiskUsed threshold name if it is missing; otherwise update the existing row.
    IF NOT EXISTS (SELECT * FROM [dbo].[ThresholdsNames] WHERE [Name] = 'Volumes.Stats.PercentDiskUsed')
        INSERT INTO [dbo].[ThresholdsNames] ([EntityType],[Name],[DisplayName],[DefaultThresholdOperator],[Unit],[ThresholdOrder])
        VALUES ('Orion.Volumes', 'Volumes.Stats.PercentDiskUsed',
            N'@{R=Core.Strings.3;K=VolumeThresholds_PercentUsed;E=sql}',
            1,
            N'@{R=Core.Strings;K=XMLDATA_IB0_6;E=sql}'/*'%'*/,
            4)
    ELSE
        UPDATE [dbo].[ThresholdsNames]
        SET [DisplayName] = N'@{R=Core.Strings.3;K=VolumeThresholds_PercentUsed;E=sql}',
            [DefaultThresholdOperator] = 1,
            [Unit] = N'@{R=Core.Strings;K=XMLDATA_IB0_6;E=sql}'/*'%'*/,
            [ThresholdOrder] = 4
        WHERE [EntityType] = 'Orion.Volumes' AND
            [Name] = 'Volumes.Stats.PercentDiskUsed'

    When upgrade was finally done, I had no way of getting back to "normal orion". Just the "upgrade tool web page". No menus etc. (https://orion.[ourdomain]/Administration/SolarWinds.Administration.CentralizedUpgrade.Web.centralized.html#!/upgradetool). Had to retype URL manually.

    Thanks!

  •  - Thank you very much for the write up and great detail about your experience in working with support.  

  •  - Two comments, as I missed one of your questions.

    You should be able to use this process as far back as 2019.2 versions of the platform. 

    Thank you for sharing your support case and the details about your environment. I will make sure this gets to our Dev team to investigate.  

  • Well, this is slightly depressing. I was looking forward to this new and improved "Centralized Upgrade", so I went ahead and upgraded to 2020.2 RC 2 for Orion to see if we could finally get around the "Can't download file because of the invalid checksum" error that we have had since Centralized Upgrades was released. Looks like it's the same story, just a different title.

    So we have over 50 polling engines spread across different regions of the world. Our main Orion Poller lives in Sacramento.

    The 7 APEs we have in Sacramento have no issue with centralized upgrades. 

    We have 7 APEs in Virginia as well, and this datacenter is the closest to where our main poller lives. The latency between our main Orion Poller and our Virginia APEs is 60 ms. When the Centralized Upgrade process kicks off, all the small files (less than 300 MB) have no problem copying over. The first failure was the CORE-2020.2.5220.27327-CoreInstaller.msi (429 MB) file.

    jakegevans_0-1588686191296.png

    Copying a 429 MB file between our Sacramento and Virginia datacenters shouldn't be a huge issue for the SW Centralized Upgrade process, but it is.

    To get around this, I have to manually copy the specific file that is failing and place it in the ProgramData\SolarWinds\Installers folder. When I manually copy the file, it takes less than 3 minutes to complete. This is what makes me wonder HOW SolarWinds is trying to copy this file, and why it is taking so long that it times out and the checksum does not match because the file was not completely copied over before the checksum was initiated.

    I've asked multiple times if there is a config file somewhere that has a timeout listed for the file transfer and no one has been able to provide an answer. 

    I am sure this process works for small businesses and businesses that have Orion installed at every datacenter but that's not scalable and for large companies that span across different regions, this centralized process is never going to work. 
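For anyone relying on the manual copy workaround described in the comment above, one way to confirm that a copied installer is complete is to hash the source and the copy and compare the results. The Python sketch below does that with the standard library. The hash algorithm Centralized Upgrades uses is not stated in this thread, so SHA-256 here is an assumption, and `file_digest` and `copies_match` are hypothetical helper names.

```python
import hashlib


def file_digest(path: str, algorithm: str = "sha256", chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so that large installers are not loaded into memory at once."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(source_path: str, copied_path: str, algorithm: str = "sha256") -> bool:
    """Return True when the source file and its copy have identical digests."""
    return file_digest(source_path, algorithm) == file_digest(copied_path, algorithm)
```

A partially transferred file produces a different digest from the source, which is consistent with the mismatch described above when a transfer is cut short.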

  •  - I am sorry this didn't work out for you. Did you happen to run the script and attempt the pre-stage at which point you saw this issue and decided to manually push through? 

    Do you have any old support cases I can reference, or a new one where you shared diagnostics?  In order to provide assistance, we would need to investigate more closely. We have tested numerous environments with 50+ pollers where we haven't encountered issues like the ones you describe, but with thousands of nuances in each and every environment, it is difficult to catch all scenarios. We currently use WCF to pass files, which isn't the ideal option, but we hope to make additional improvements in the future.

    We will work to try and simulate more scenarios closer to your environment and see if we can reproduce. If you are able, please create a case and share the number and we will make sure to have a look.   

  • Has anyone tried the firmware update with parallel processes in 2020.2 RC (NCM)?

  • We are currently running 2019.2, and implementing this script went smoothly.  One question: I am not seeing a way to upgrade our Orion to 2019.4, only to 2020.2 RC.  Will that option be available in the full release?

    Harvie