Open for Voting

Let's have the Config Wizard's option to rebuild Web Services disabled if a user's SW deployment would break due to it being "too customized".

Request:

Rerunning the Config Wizard and choosing to optimize the Web service shouldn't be an option if it's going to break Solarwinds deployments that have customized HTML content.

At a minimum, a warning popup should appear, explaining that running the Config Wizard on the Web services will break all Solarwinds web access to the Main Instance of NPM if customized HTML Resources are present.

It just shouldn't break it at all.  When SW web pages look wonky, and reloading the page, stopping/starting services, and rebooting the Main Instance server don't fix the issue, a logical next step seems to be to rerun the Config Wizard and have it optimize the Web services.  In my case, this completely broke my SW deployment.  Yes, there's a KB article about it that explains why to avoid this and how to recover from it.  But I wasn't aware of it, and I feel it shouldn't be necessary to have a KB and a workaround / prevention for it--the option to break one's deployment should not be present.

The reason for this request:

I recently had a situation where a newly-built Custom HTML resource wouldn't display properly.  It looked like this:

pastedImage_6.png

It displays the interface alias of a switch port using the Custom HTML resources.  In this view the Interface Alias information overlapped other text.  That made it hard to read, and made a poor impression on anyone looking at it.  It made it look like Solarwinds NPM isn't a well-designed product.  However, other Custom HTML resources pointing at other ports on this same switch looked fine:

pastedImage_8.png

I swapped the Custom HTML out and found the problem followed the switchport with overlapping text.  It didn't matter if it was in the right hand column of the Custom View or the Left.

I deleted the Custom HTML and rebuilt it.  No improvement.

I compared two switchports' Custom HTML code--the wonky view and one that looked good.  The ONLY difference was the Element ID number of the two elements.

I stopped & started services on the server hosting the Main Instance.  No improvement.

I did some googling about the problem and found no great explanation or suggestions.

So I ran the Config Wizard and told it to optimize the Web services.  That broke all web services for all my Solarwinds web sites and pages--NPM, NCM, NTA, IPAM, UDT.

I ran the diagnostics and opened an emergency Site Down ticket (00382387) with Solarwinds.  I uploaded screen shots to the case, showing the original problem Custom HTML resource as it looked prior to running the Config Wizard, and I provided a detailed description of what I'd done to cause the original problem, and what I'd done to try to correct it, which ultimately caused all my Solarwinds to fail.

I uploaded the Diagnostics to the case and called Support.

Soon I was on the phone with a polite and expert Solarwinds technical resource.  He gathered the information from me, reviewed the logs, and dug through his technical support database for the exact problem.  He found the specific issue I was experiencing in his database, and provided this information about its cause:

"This is caused by too much customization on the website,

Customers may need to make customizations to the Orion website. These customizations require the use of the .cs files within the APP_CODE folder for the website.  Website compilation removes this folder, however, in NPM 12.0 or later, it is not possible to skip the optimization when the only product installed on the server has a pre-compiled website.

For situations such as this, it is possible to disable the use of the pre-compiled website. This allows the normal website optimization to run and can be skipped, allowing the APP_CODE folder to remain.

  You can refer on the link below for the steps taken to resolve the case .
https://support.solarwinds.com/SuccessCenter/s/article/Disable-pre-compiled-website-to-allow-optimization-to-run-and-be-skipped"
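
For anyone who wants a quick sanity check before rerunning the Config Wizard, here is a minimal sketch in Python based only on the support explanation above: it looks for .cs files inside an APP_CODE folder under the website root, since that is the folder support says website compilation removes. The function names are my own, and the website root is something you have to supply for your own server. This is not a SolarWinds tool, and an empty result does not prove the wizard is safe to run; it only checks the one indicator named in the support reply.

```python
from pathlib import Path


def find_app_code_sources(web_root):
    """Return .cs files under the website's APP_CODE folder.

    The folder name is matched case-insensitively. `web_root` is whatever
    directory hosts your Orion website (an assumption you must supply).
    """
    root = Path(web_root)
    if not root.is_dir():
        return []
    sources = []
    for child in root.iterdir():
        if child.is_dir() and child.name.lower() == "app_code":
            sources.extend(
                p for p in child.rglob("*")
                if p.is_file() and p.suffix.lower() == ".cs"
            )
    return sorted(sources)


def safe_to_rebuild_website(web_root):
    """True only when no custom .cs files were found under APP_CODE."""
    return not find_app_code_sources(web_root)


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        hits = find_app_code_sources(sys.argv[1])
        if hits:
            print("Custom .cs files found; back these up before optimizing:")
            for path in hits:
                print(f"  {path}")
        else:
            print("No .cs files found under APP_CODE.")
```

Run it with the website root as the only argument. If it lists any files, copy them somewhere safe and read the KB article linked above before letting the wizard optimize the website.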

In short, first I made customized web pages using the Custom HTML option for Widgets and Resource creation.

Then when the view didn't look good, I broke all HTML on my Solarwinds by running the Config Wizard and attempting to optimize the Web service.

I shot myself in the foot, all with seemingly proper logic and good intent.

But what was the ultimate cause, you may wonder?  Here's the secret sauce that wasn't in Solarwinds' Tech Support KB or offline resources:  This switch had inherited its management IP address from an older switch that used different hardware.  Originally the IP address was associated with a Cisco 6509 VSS pair.  When I retired/replaced them with a newer Cisco 6807 VSS pair, I kept the same management IP address on the new switch.  And my Solarwinds deployment began showing grayed-out ports it had learned from the 6509--which were no longer present on the 6807.

This information shows up as Interface Unknown entries.  One would logically expect NPM to remove these previously-discovered interfaces and update its database either when NPM does a scheduled Discovery, or manual Rediscovery, or scheduled or manual Inventory, plus a new polling of the node.  But one would be wrong to expect this, apparently.

I found that I could clear out the problem if I would completely delete the Node with Unknown Interfaces from NPM and then re-Add it.  The grayed out ports that were Unknown Interfaces disappeared.

Better still, this ALSO cleared up the overlapping HTML line problem that started it all in the first place.

How about an Up-Vote for this Feature Request?   Fixing this on the engineer's end might just save you and others from experiencing a Solarwinds monitoring outage, and the wasted time and embarrassment and frustration recovering from it.

  • Ouch.  And we have been chasing slowness problems and just ran that a week ago to try and speed the website up.  I have not had complaints yet.

  • I didn't think I had any custom HTML in my deployment either.  I was only building new views, right?

    But they leveraged Custom HTML widgets, so my deployment is customized--at least, it is according to SW's definition.

    Even this basic "How to" page I built will put anyone's deployment into a Customized HTML mode--which might become unusable if they run the Config Wizard with the Web radio button checked.

    How to create a simple custom view of multiple interfaces' bandwidth utilization

  • I don't have custom HTML in my deployment.  But I have seen when a management IP is reused, sometimes, not all times, it doesn't properly update all the node information unless it is deleted and re-added. 

    For example, our LAN team swapped out an old HP switch with a Cisco switch.  Even after forcing a rediscover and poll now, the node had conflicting information.  On the node summary page, the Hardware Details resource retained the HP Procurve information, but the Node Details resource noted the Cisco Machine Type and Cisco IOS version installed.  Hardware sensors were all messed up too.  I did have to delete the node and re-add it.  This has happened on a Cisco Switch model replacement as well, but sometimes this data updates correctly (or is harder to spot and is still messed up).  I use the out-of-box Machine Type Change alert to tell me to look at these, but this alert is hit or miss.  Some Cisco switch swaps don't seem to trigger this alert.

    Our network teams don't always remember to notify or update the monitoring team on changes.  They are mostly ticket driven and updating monitoring and occasionally inventory systems is an afterthought as long as they can close the ticket or change request work order.  I catch some of these, but wonder if our database is accurate because of this.