One or more thresholds are exceeded. OK, great. WHICH ONES???

Orion Platform 2018.4 HF3, NPM 12.4, SAM 6.8.0

We've got a bunch of nodes that all have blinking yellow boxen on their green status icons, and "Node status is Up, One or more thresholds are exceeded.." [sic] in the status column of Manage Nodes.

pastedImage_0.png

For the life of me, I have no idea why.

Some are Hyper-V hosts - when I go into the host node, it shows VMs as up-with-warning or up-with-critical in Appstack, but going into the vm nodes, everything is green - no disks about to run out of space, no crazy CPU usage, etc.

Some are vms where, when I go into the node, everything is green - no disks about to run out of space, no crazy CPU usage, etc.

Two are vms in a Hyper-V cluster - hovering over them in Manage Nodes says that one has guest status critical and the other is warning, but none of the cluster nodes are in this list.  Again with everything looking fine when actually looking at the nodes' pages.

One is an ESXi host.  It does have two vms that are turned off (on purpose), but other than that, everything looks normal.

Search is failing me, just leads to three articles that are zero help:

----

https://support.solarwinds.com/SuccessCenter/s/article/Clusters-hosts-VMs-showing-as-Critical-One-or...

Gives an example saying they went over a network throughput threshold, and shows how to adjust that threshold.

Great.  HOW does one know that it was the network one that was triggered???

---

https://support.solarwinds.com/SuccessCenter/s/article/Virtual-Machines-warning-One-or-more-threshol...

It's referencing v 11.5 and talks about getting info from the VIM_Resources table to see what was triggered.

There is no such table in v12.

-----

https://support.solarwinds.com/SuccessCenter/s/article/VMWare-guest-status-showing-as-critical-one-o...

Kind of a combo of the other two - talking about stuff specific to v11.5, and assumes you already know what is triggering it, just shows how to adjust the threshold.

----

Anyone know how the heck I figure out why NPM is getting cranky about these nodes?

0 Kudos
11 Replies

We are receiving bulk alerts saying our ESXi hosts are critical/warning, but when we look into the details, nothing is shown.

At the node level the node is up and healthy, but under Virtual Host it says critical.

Under "hosts with problems" it shows nothing about why the host is critical.

For guests, it says one or more thresholds are exceeded, but it is not clear which threshold is exceeded.

serena aLTeReGo, any input on this issue?

pastedImage_0.png

pastedImage_1.png

0 Kudos

If you go to the Virtual host details view you should see which thresholds are being exceeded.

pastedImage_0.png

No, that's what I'm saying - nothing looks wrong.  CPU, RAM, etc are all normal.  Nothing is in the red.

0 Kudos

So, I just built a new instance for my company, this one on 2019.4, and I ran into the same issue. I would hope there is a better way to do this, but I will share what worked for me anyway.

I went to Settings > All Settings > Virtualization Settings > Virtualization Global Thresholds. Then, one by one, I took note of the current threshold and increased the setting. I had a second page up with my node, and I kept changing the thresholds while repolling my node. (Each time it didn't change the status, I reset it back to the original and changed the next one.) Eventually I found the one where increasing it turned my node status green again. That let me identify the issue. For me it was Storage Capacity Usage. It was a little odd because I didn't see any issue with my storage... but there it is.
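For anyone who wants the same trial-and-error written down as a procedure, here is a minimal sketch in Python. It is not a SolarWinds API: `set_threshold` and `repoll_is_green` are hypothetical callables you would have to supply yourself (for example, by doing the clicks by hand), and every threshold name except Storage Capacity Usage is made up for the demo. Unlike the manual steps above, the sketch also restores the culprit after finding it, so it only diagnoses and leaves the settings as they were.

```python
from typing import Callable, Dict, Optional


def find_triggering_threshold(
    thresholds: Dict[str, float],
    set_threshold: Callable[[str, float], None],  # hypothetical: applies one setting
    repoll_is_green: Callable[[], bool],          # hypothetical: repolls, True if node is green
    bump: float = 1.5,
) -> Optional[str]:
    """Raise one threshold at a time, repoll, and restore it afterwards."""
    if repoll_is_green():
        return None  # nothing is flagged, so there is nothing to hunt for
    for name, original in list(thresholds.items()):
        set_threshold(name, original * bump)  # temporarily loosen this one
        try:
            if repoll_is_green():
                return name  # loosening this threshold cleared the status
        finally:
            set_threshold(name, original)  # always put the original value back
    return None


if __name__ == "__main__":
    # Fake in-memory stand-in for the settings page and the polled node.
    limits = {"CPU Load": 80.0, "Memory Usage": 90.0, "Storage Capacity Usage": 85.0}
    observed = {"CPU Load": 35.0, "Memory Usage": 60.0, "Storage Capacity Usage": 91.0}

    def set_threshold(name: str, value: float) -> None:
        limits[name] = value

    def repoll_is_green() -> bool:
        return all(observed[k] <= limits[k] for k in limits)

    print(find_triggering_threshold(dict(limits), set_threshold, repoll_is_green))
```

The initial check matters: if the node is already green, raising the first threshold would otherwise look like a "fix" and point at the wrong setting.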

I wish there was a better answer and it would just tell you what threshold was exceeded, but....there it is.

 

0 Kudos

I get the same and have noticed this on mine but I haven't found a fix as yet

pastedImage_0.png

0 Kudos

sv9780  wrote:

We are receiving bulk alerts saying our ESXi hosts are critical/warning, but when we look into the details, nothing is shown.

At the node level the node is up and healthy, but under Virtual Host it says critical.

Under "hosts with problems" it shows nothing about why the host is critical.

For guests, it says one or more thresholds are exceeded, but it is not clear which threshold is exceeded.

serena aLTeReGo, any input on this issue?

pastedImage_0.png

pastedImage_1.png

In prior versions, node status and virtual host status were not connected in a way that was satisfactory to me: a node could show green because it was pingable, while the virtual host status was tightly integrated with what we were receiving from the hypervisor. In VMAN 8.4, that was improved (see the Virtualization Manager 8.4 Release Notes) by adding virtual entities as child contributors to node status.

pastedImage_5.png

Then, in Orion Platform 2019.2, the platform released its enhanced node status, which gives greater visibility into child contributors (see Orion Platform 2019.2 - Enhanced Node Status).

My expectation is that if you upgrade to 2019.2 and the latest VMAN 8.5 that came along with that release, the combination of these two changes will make troubleshooting your particular use case much, much better.

When the virtual host is critical, you will know, and node status will be critical. As a result, your alerts can be tuned and more targeted.

0 Kudos

Have you looked into Alerts & Activity > Message Center and filtered for information coming from those nodes?  This might be the area to find the data you seek.

pastedImage_0.png

Once there, enter the IP address or hostname of one of the nodes that reportedly has problems, select the appropriate check boxes and drop downs, and click Apply.  You may find the problems defined nicely here:

pastedImage_1.png

0 Kudos

Yes, thanks.  And there's nothing useful.

E.g. The Windows servers, with everything checked, the only stuff shown are reboots that happened a month ago for Windows updates.

None of the dozens of other Windows servers that rebooted around the same time look any different, except they don't have the warning.

0 Kudos

The Child Status (the little yellow/red indicator at the bottom right) of a Node reflects the status of any child element of that node.

When you go into the Server's Node Details page, look at the AppStack Resource to see if any Application, or related VM Host/Cluster is relaying the status.
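As a rough illustration of what "relaying the status" means, here is a toy model in Python. It assumes a simple worst-status roll-up over direct children, which is my assumption for the sketch and not a statement of how Orion computes it. The `Entity` class, the severity ordering, and the example names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List

# Assumed ordering for this sketch only; not taken from the product.
SEVERITY = {"Up": 0, "Warning": 1, "Critical": 2}


@dataclass
class Entity:
    name: str
    kind: str  # e.g. "Node", "Application", "Virtual Host"
    status: str = "Up"
    children: List["Entity"] = field(default_factory=list)


def child_status(node: Entity) -> str:
    """Worst status among direct children; 'Up' when there are none."""
    return max((c.status for c in node.children),
               key=SEVERITY.__getitem__, default="Up")


def contributors(node: Entity) -> List[Entity]:
    """Children that are not 'Up', i.e. the ones relaying a status."""
    return [c for c in node.children if SEVERITY[c.status] > 0]


if __name__ == "__main__":
    node = Entity("server-01", "Node", "Up", [
        Entity("Example App", "Application", "Up"),
        Entity("server-01 (host)", "Virtual Host", "Critical"),
    ])
    print(child_status(node))
    print([c.name for c in contributors(node)])
```

The point of the toy model is that the node itself can be Up while one child drives the small indicator, so the place to look is the list of children, not the node's own metrics.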

Note that this is the way Node Status Roll-up is designed for Orion Core 2010-2018.4 and SAM 3.5-6.8.

With Orion Core 2019.1 and SAM 6.9, status is changing; see Orion Platform 2019.2 - Enhanced Node Status.

If you are on Orion Core 2018.4 HF3, you may already see the new Status of 2019.2 enabled.

0 Kudos

That's what I'm saying - on some of the hosts that are in this list, in Appstack, they show the vm as up-with-critical or up-with-warning.  But I have no idea why.  Going into those vm nodes flagged in Appstack, everything looks normal, except that in AppStack, the host is in warning.

It's almost like there's a circular dependency - the host is upset because the guest is in warning/critical, but the guest is upset because the host is in warning.

0 Kudos

jaybone  wrote:

That's what I'm saying - on some of the hosts that are in this list, in Appstack, they show the vm as up-with-critical or up-with-warning.  But I have no idea why.  Going into those vm nodes flagged in Appstack, everything looks normal, except that in AppStack, the host is in warning.

It's almost like there's a circular dependency - the host is upset because the guest is in warning/critical, but the guest is upset because the host is in warning.

What Sean was hinting at with enhanced status is that with Orion Platform 2019.2, there is additional information as to why this is the case.

pastedImage_2.png

Also, regarding host status, the way we calculate host status will not allow for a circular dependency. As you're referring to NODE status, this is something that Jeremy goes into in quite a lot of detail in Orion Platform 2019.2 - Enhanced Node Status.

0 Kudos