Orion Platform 2018.2 Improvements - Chapter One

The time has come again for another exciting rundown of some of the improvements and enhancements coming your way in the next major installment of the Orion Platform. For those who may not be familiar, Orion is the foundational component upon which product modules such as Network Performance Monitor (NPM), Server & Application Monitor (SAM), and many others are built. Platform capabilities are available to, or can be leveraged by, modules which run atop the Orion Platform. In most cases, those enhancements, such as PerfStack, are available regardless of which Orion module(s) you are running. In others, it may be something which individual modules can extend for their own purposes, such as the Orion Agent, which has been the basis for delivering amazing new capabilities from NetPath and QoE in NPM, to Application Dependency Mapping (ADM) and IaaS monitoring in SAM.

UPS Monitoring

Several years ago I created a Universal Device Poller (UnDP) for monitoring APC SmartUPS devices, and to this day it remains among the most downloaded UnDPs for NPM, if not the most downloaded. Universal Device Pollers are an incredibly powerful feature of NPM, allowing you to monitor virtually anything about a device which is managed via SNMP. However, there comes a time when certain functionality becomes so ubiquitous that it makes sense to promote it to native functionality of the monitoring solution and not require users to create it themselves. So in this 2018.2 release of the Orion Platform included with NPM 12.3, that's precisely what we set out to do, while also making some improvements along the way.

If you haven't already done so, you'll want to start by adding your APC UPS equipment to Orion. You can do so individually using the 'Add Node Wizard' [Settings > All Settings > Add Node], or in bulk using Sonar Discovery [Settings > All Settings > Discover Network]. If you are adding the devices using the 'Add Node Wizard', you will notice a new option listed for your APC UPS equipment entitled 'UPS'. Checking the box next to this option will enable UPS polling for this device.

[Images: List Resources | Power Control Unit Status Resource | UPS Firmware Version]

Once you've successfully completed the 'Add Node Wizard' and navigated to the 'Node Details' view of your newly added UPS device, you will notice a newly added resource entitled 'Power Control Unit Status'. This resource reflects the most important information about your UPS device, including its overall status, time on battery, and the battery's current charge capacity. This information can, as you would expect, be utilized in Alerts to notify you of things such as when the UPS is on battery, if a battery needs replacing, or if the battery is reaching an unsafe operating temperature. You may also notice that the 'Software Version' field in the 'Node Details' resource now accurately reflects the firmware version installed and running on the UPS.

Currently, this new capability is limited exclusively to APC (American Power Conversion) SmartUPS Uninterruptible Power Supplies (UPS) containing Network Management (AKA Web/SNMP) cards. This feature does not support APC's unmanaged BackUPS series, nor does it yet support other UPS vendors, such as Eaton, Tripp Lite, or CyberPower. At least for now, we recommend using the Universal Device Poller to monitor similar metrics for UPS vendors other than APC. We will, however, be keeping a close eye on the NPM feature request forum to gauge interest in native support for other UPS vendors.

Linux/Unix Load Average

In a similar vein to UPS monitoring discussed above, we learned from speaking with our customers over the years, as well as from those participating in the Orion Improvement Program, that monitoring Load Average on Linux and Unix systems ranks among the most popular uses of the Universal Device Poller. In our enduring pursuit to deliver unexpected simplicity to our customers, we realized that collecting these important metrics natively was something which was long overdue.

Beginning in Orion Platform 2018.2, and included with NPM 12.3, Load Average is collected automatically for any node which supports it. This is typically any Linux-based operating system, but can also extend to FreeBSD, AIX, and other Unix-like OSes. The Load Average metrics are collected for nodes monitored via the Orion Agent, as well as those managed agentlessly via SNMP. There are really no additional steps required if you added your nodes using the default selection. Since Load Average has a direct correlation to CPU utilization, it's intuitively tied to the existing 'CPU & Memory' option shown under 'List Resources'. When selected, Load Average statistics are collected automatically if the node being monitored supports them.

[Images: List Resources - CPU & Memory | Load Average Resource]

On the 'Node Details' view of your Linux servers, you will notice a snazzy new resource entitled 'Load Average' which displays the one-minute, five-minute, and fifteen-minute load average of the machine being monitored. Because Load Average metrics are tightly coupled to the number of CPU cores, we extended Orion's alerting to allow you to combine Load Average statistics with CPU count within your Alert Trigger so you can be notified when your system is under strain.

pastedImage_5.png
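To illustrate why Load Average is usually read alongside CPU count, here is a minimal Python sketch that normalizes a load average by the number of cores. It is not how Orion evaluates its Alert Trigger; the function names and the threshold of 1.0 per core are assumptions chosen for illustration only.

```python
import os


def load_per_core(load_average: float, cpu_count: int) -> float:
    """Normalize a load average by the number of CPU cores."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return load_average / cpu_count


def is_under_strain(load_average: float, cpu_count: int,
                    threshold: float = 1.0) -> bool:
    """Return True when the per-core load exceeds the threshold.

    The default threshold of 1.0 is an illustrative assumption,
    not a value taken from Orion.
    """
    return load_per_core(load_average, cpu_count) > threshold


if __name__ == "__main__" and hasattr(os, "getloadavg"):
    # os.getloadavg() is available on Unix-like systems only.
    one_min, five_min, fifteen_min = os.getloadavg()
    cores = os.cpu_count() or 1
    for label, value in (("1 min", one_min),
                         ("5 min", five_min),
                         ("15 min", fifteen_min)):
        print(f"{label}: {value:.2f} total, "
              f"{load_per_core(value, cores):.2f} per core")
```

A load average of 8.0 means something very different on a 4-core machine (2.0 per core) than on a 16-core machine (0.5 per core), which is why a fixed threshold on the raw value alone tends to be misleading.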

Load Average has also been added to the default PerfStack metrics for the node, meaning if you click on the 'Performance Analysis' button on the 'Management' resource of the 'Node Details' view for a Linux server, you'll be taken to PerfStack where these Load Average statistics are automatically prepopulated. Similarly, if you're already working in PerfStack and drag the node itself onto the chart area, the Load Average statistics, as well as other default metrics for the node, will populate the PerfStack dashboard.

PerfStack Load Average.gif

Group Availability

Ever since bshopp introduced us to Orion Groups back in NPM 10.1, we've heard from many of you that the manner in which availability is calculated for these groups just didn't jibe with how you think about availability in your environment, nor did it provide a valuable measurement for use in your SLAs. Sadly, Group Availability in Orion is calculated binarily. Put simply, the group is either 100% 'Up' or it's 100% 'Down' regardless of the number of members contained within the group. What this usually meant was, so long as at least one member in the group was 'Up', the availability of the group was 100%. That remained true even if there were 99 other things 'Down' in that group at that time. I know, it sounds odd when you say it aloud or even when you're writing it down, but that's how it's been for years and somehow we've managed to muddle through. Well, in this release of the Orion Platform, no longer will you be forced to just muddle through. Today we heed your cries!

Rather than turn the world on its end, causing lots of confusion and alert storms in our wake, we left the legacy Group Availability metric in place, untouched. I know that will come as a big relief to those of you who have grown dependent upon this method of calculating availability and have built reports and alerts around this metric. What we chose to do instead is introduce a new Group metric entitled 'Group Members Availability', which, as one would expect, properly and accurately calculates the availability of the group based on its members. This includes nested groups as well.

pastedImage_3.png
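The difference between the two metrics can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the post does not give the exact formula behind 'Group Members Availability', so the simple average over a flat list of members used here is an assumption, nested groups are not modeled, and both function names are hypothetical.

```python
def legacy_group_availability(members_up: list[bool]) -> float:
    """Binary roll-up as described in the post: 100% if at least
    one member is up, otherwise 0%."""
    return 100.0 if any(members_up) else 0.0


def group_members_availability(member_availability: list[float]) -> float:
    """Members-based availability, ASSUMED here to be a simple
    average of each member's availability percentage."""
    if not member_availability:
        raise ValueError("group has no members")
    return sum(member_availability) / len(member_availability)


# The example from the post: one member up, 99 members down.
up_flags = [True] + [False] * 99
percentages = [100.0 if up else 0.0 for up in up_flags]

print(legacy_group_availability(up_flags))      # 100.0
print(group_members_availability(percentages))  # 1.0
```

With one member up and 99 down, the binary roll-up still reports 100%, while a members-based average reports 1%, which is much closer to how most people reason about availability for an SLA.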

This new 'Group Members Availability' metric appears automatically on the 'Group Details' view upon group creation. We will also start calculating this new metric upon upgrade to Orion Platform 2018.2 if you already have existing groups. So there's really nothing you need to do. We even include a new out-of-the-box report we refer to as 'Members Based Group Availability Report - Last Month' which serves as an example for how easily this metric can be added to your own reports compared to some of the complex SQL queries some had attempted to use in the past. You can even leverage this new Group Members Availability metric in your alerting conditions with no fuss!

And More!

There's still plenty more we've managed to jam pack into this release of the Orion Platform that we're particularly excited about and would love to get your feedback on. Stay tuned to learn about some of the mapping improvements jblank has whipped up and the many usability enhancements serena has crammed into this release, such as sexy new hovers, a new PerfStack widget, and additional improvements that we've made to ensure your next upgrade experience is great!

Comments

Wow...I would love to see the entire list of features... Will wait for the prod release...

Will this include upgrades to Report Writer?  I would like to be able to create reports on UPS battery life, last battery replacement, load, etc.

I can't get migrated from 2008R2 fast enough!

Nice

The group member Availability chart is very popular already here.

whopping improvements!!

if you change temp from fahrenheit to celsius it changes to all temps to 0.

also how do i sort the unknowns out on these seems to be the same on all my ups?

pastedImage_0.png

dodo123 , we have tried extensively to reproduce this issue internally to no avail. Would you please open a support ticket so we can investigate further?

I have this same problem.  I still have the UPS UNDP polling in place and the data is there and accurate.  The new resource data looks like dodo123's screenshot with the wrong data and wrong status of battery replacement.

jhandberg, did you also change from Fahrenheit to Celsius or does your resource just look similar to dodo123 's for possibly different reasons?

I didn't change anything from the default.  We just upgraded our SolarWinds this morning.  When I saw our resource displaying this way, I thought I should add to this thread.  Especially since the data is different from the UNDP data.  I spot checked about half a dozen of our UPS devices and see the same issue.  These resources are the same device, new resource vs UNDP resource

pastedImage_0.png pastedImage_1.png

jhandberg, please open a case with support so we can determine what's causing this issue. I have some theories, but they're too complex to troubleshoot here on Thwack and require diagnostics and a MIB walk of the device to confirm.

Yep mine is the same as that we have the undp setup also and they don’t match at all.

Is it that the data doesn't match the UnDP, or that the native UPS polling is in an 'unknown' state like in jhandberg's screenshot posted above? In his particular case, there's an issue polling the data, which is why the status is 'unknown'. The problem is obviously not the device, since the values are current in the Universal Device Poller resource. So something else here is amiss. The Cortex and RabbitMQ logs would be the first place I'd look, but they're both included in the Diagnostics, so I recommend opening a case with support so we can determine what the cause of the issue is.

This is cool

Where is the Database table for this information?  I utilize the serial number from the UnDP for asset management and would like to utilize your fancy new resource instead.

Thanks for your Awesome Work !!!

Chris

This was fixed by installing hotfix 1 and then unselecting ups in list resources and saving.

Then doing a new list resource bypassing the cached one and selecting UPS again.

This information is stored as JSON in the 'Cortex_Documents' table of the database. The best way to query this information is through SWQL as 'Cortex.Orion.PowerControlUnit'.

Hi, Can we add ASA active user sessions, currently we are monitoring via undp like APC as prior?

Network Insight for ASA will monitor Remote Access VPN users and concurrent connections.  Check it out: NPM 12.2 and NCM 7.7 feature: Network Insight for Cisco ASA firewalls - SolarWinds Worldwide, LLC. H...

Just added my 1st APC UPS onto our lab instance with all the latest SW patches. The temp is showing as 70 F although my Orion local user account was already set to C. Is there somewhere else I need to adjust F to C for this new widget

Temperature units of measurement for UPS monitoring are controlled through user settings under 'Power Control Unit Settings'.

pastedImage_0.png

Woh, now I feel silly. I didn't even notice the new section there.  Many thanks aLTeReGo

Robert

Perfect. Thanks.

Hi,

I am not getting power control unit status post upgradation 2018.2

pastedImage_0.png

Orion Platform 2018.2 HF3, UDT 3.3.1, VNQM 4.5.0, SRM 6.6.0, WPM 2.2.2, DPAIM 11.1.0, NPM 12.3, VMAN 8.2.1 HF1, NetPath 1.1.3, CloudMonitoring 2.0.1, SAM 6.6.1, NTA 4.2.3

So is that not an image of your power control status as that looks ok?

If not have u list resources and selected ups status. If list resources had ups selected already remove it then submit it. Then list resources again forcing a new cache and selecting ups.

er.vansh17091  wrote:

Hi,

I am not getting power control unit status post upgradation 2018.2

pastedImage_0.png

Orion Platform 2018.2 HF3, UDT 3.3.1, VNQM 4.5.0, SRM 6.6.0, WPM 2.2.2, DPAIM 11.1.0, NPM 12.3, VMAN 8.2.1 HF1, NetPath 1.1.3, CloudMonitoring 2.0.1, SAM 6.6.1, NTA 4.2.3

Are your UPS's APC SmartUPS? Have you added them as nodes to Orion? When you perform a 'List Resources' on those APC UPS nodes, do you have an option to select 'UPS' and is it selected? Lastly, have you applied the latest hotfix? There was a fix for an issue that was preventing some APC UPSs from being properly detected that is now addressed.

Hi Alterego

Yes, we use UNDP smart ups template to monitor the UPS. I have installed the hotfix 3.

pastedImage_0.png

UPS interface is available

pastedImage_2.png

UPS- Interface is not available - for few nodes, UPS interface is not showing

pastedImage_1.png

Have you verified the Power Control Unit resource has been added to the view? It's possible that you're using a custom Node Details view that did not get this resource added to it upon upgrade.

aLTeReGo

Are you  "in charge" to that power control  poller ?

Please explain how do I bulk remove that poller ?

I don't see any "nice way" to that ...

/SJA

Out of curiosity, why are you wanting to bulk disable UPS Monitoring?

Is it possible to set up any alerts or reports on the UPSs?

This is already possible today.

pastedImage_2.png

Version history: revision 1 of 1, last updated 04-23-2018 03:14 PM