This discussion has been locked. The information referenced herein may be inaccurate due to age, software updates, or external references.

Hardware Alert Help

ec-umass over 8 years ago

Hi,

I am looking for a way to get alerted when we have a physical hardware issue, whether it be a drive in a server, power supply, battery, memory, etc.

Does anyone have any good examples of how I might be able to do this?

Thanks.

Top Replies

0 AlexSoul over 8 years ago
First you need to make sure that your node's hardware is being polled:
Run "List Resources" on your node and add all hardware components.
Then add the "Current Hardware Health" resource to your node page so that you can see all the hardware sensors that SolarWinds was able to pick up.
To create an alert, use the web-based alerting wizard:
Go to ALERTS > MANAGE ALERTS > ADD NEW ALERT
Under "Trigger Condition" select "I want to alert on: Hardware Sensor"
Create the condition that suits you best
Next > Next > Next > Done

0 ec-umass over 8 years ago in reply to AlexSoul

Do you know if there is a way to include only Windows/Linux/VMware servers? We have a bunch of networking devices on here that I am not concerned about, and I am not sure how to filter down to just the servers.
Thanks.

0 highstone351 over 8 years ago

To answer your second question about filtering: find the MIB for the servers that lists their "Hardware Type". That means SNMP needs to be running on your servers so that they can be polled, which adds a small amount of overhead on the servers, FYI. You can use the SW "Tools" to help find the correct MIB by walking the MIB tree on the devices. Just set up your community name on the server and then in Tools to get that to work. (You will also need to set up NPM and NCM to get the information from polling.)
Set up polling for the servers.
1. Create a NEW alert for hardware problems for the servers. As an example, call it "Server Hardware Problems".
2. Inside the alert, set up the logic to say "If all of the following are true" for the first statement.
3. Then, under that, set up a second global statement, "if ANY of the following are true", to hold the set of logical "or" statements. Example: The field "hardware type" is equal to "Windows".
The field "hardware type" is equal to "Linux"
etc.
4. A THIRD logical global statement should be created under the FIRST one. It will be indented just like the second one that we just did. In that statement it should say "where any of the following are true".
5. Under that third global statement, create all of the hardware components that you want to watch.
The logic example is thus: If this is a Windows machine AND it has a hardware problem with (name your physical item here), the alert will trigger.
6. The message part of the alert should have (variables) for the Node name, the hardware type, the component name and a timestamp if wanted. You will have to use the listing from tools to determine what the devices may be called.
With some manipulation, it should be what you are looking for.
hope that this helps, Mark
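
The nested logic in steps 2 to 5 can be sketched in plain Python. This is only an illustration of the AND/OR grouping, not SolarWinds code: the field names (`hardware_type`, `component`, `status`), the status value "Up", and the sample records are all assumptions made up for the example.

```python
# Hypothetical sensor records; field names and values are illustrative only.
SERVER_TYPES = {"Windows", "Linux", "VMware"}
WATCHED_COMPONENTS = {"Power Supply", "Battery", "Memory", "Hard Drive"}


def should_trigger(record: dict) -> bool:
    """Outer 'all of the following' block wrapping two 'any of' blocks."""
    # Second statement: hardware type equals ANY of the server types.
    is_server = any(record["hardware_type"] == t for t in SERVER_TYPES)
    # Third statement: ANY watched component reports a problem.
    has_problem = any(
        record["component"] == c and record["status"] != "Up"
        for c in WATCHED_COMPONENTS
    )
    # First statement: ALL nested statements must be true.
    return all([is_server, has_problem])


records = [
    {"node": "srv-01", "hardware_type": "Windows", "component": "Battery", "status": "Critical"},
    {"node": "sw-01", "hardware_type": "Switch", "component": "Power Supply", "status": "Critical"},
    {"node": "srv-02", "hardware_type": "Linux", "component": "Memory", "status": "Up"},
]

triggered = [r["node"] for r in records if should_trigger(r)]
print(triggered)
```

Only the Windows server with the failing battery matches; the switch is excluded by the type block and the healthy Linux server by the component block.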

0 ec-umass over 8 years ago in reply to highstone351

OK, I think I have it down pretty well. One other problem I have noticed: we have about 1000 nodes, some of them physical servers and others virtual. On some of the physical servers the "Hardware Sensor" check box was not checked, and I had to check it manually.
Does anyone know of a custom search that lists just the physical servers and shows which ones don't have Hardware Sensor checked?
Most of our servers are polled via SNMP.
Thanks

0 AlexSoul over 8 years ago in reply to ec-umass
Very easy my friend:
Option (1) - Filter by OS type within alert:
Go back to Trigger Condition > Add new condition (green plus icon) > Browse all objects
Select Orion Object as "Node" > Add "Machine Type" filter
You will need to group all machine types that you want with "OR" grouping
Option (2) - Create custom property for filtering
Just create a custom property called something like [TYPE] as a drop-down, with pick values such as {Server; Router; Switch; etc}.
Then assign the value {Server} to all nodes that classify as servers, leaving your networking equipment with values such as Router, Switch, etc.
Modify your trigger condition to trigger only if the node has [TYPE] = {Server}
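
The difference between the two options can be sketched in plain Python. This is not SolarWinds code; the machine type strings, the `TYPE` key, and the sample nodes are assumptions invented for illustration.

```python
# Made-up machine type strings; real values depend on what the poller reports.
WANTED_MACHINE_TYPES = {"Windows Server", "Linux", "VMware ESX Server"}


def option_1(node: dict) -> bool:
    """Option (1): machine types grouped with OR."""
    return any(node["machine_type"] == t for t in WANTED_MACHINE_TYPES)


def option_2(node: dict) -> bool:
    """Option (2): a custom property (here called TYPE) set to Server."""
    return node.get("TYPE") == "Server"


nodes = [
    {"name": "srv-01", "machine_type": "Windows Server", "TYPE": "Server"},
    {"name": "srv-02", "machine_type": "Unknown", "TYPE": "Server"},
    {"name": "rtr-01", "machine_type": "Cisco Router", "TYPE": "Router"},
]

by_machine_type = [n["name"] for n in nodes if option_1(n)]
by_custom_property = [n["name"] for n in nodes if option_2(n)]
print(by_machine_type)
print(by_custom_property)
```

In this sketch the custom property also catches a server whose machine type is not in the OR group, at the cost of having to assign the property to every node by hand.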

0 AlexSoul over 8 years ago in reply to ec-umass

This is a great question, thanks.
It was supposed to be pretty straightforward, but it turns out that it is not clear how to differentiate between physical and virtual machines, even though the Node Details resource shows this information. There are many questions about it on Thwack and not many definitive answers.
Here is a thread asking for exactly what you are looking for: "Reporting nodes not configured for hardware polling?"
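
The search being asked for amounts to a two-part filter, sketched below in plain Python. It does not solve the physical/virtual detection problem described above: it assumes each node already carries a hand-maintained `kind` value and a `hardware_polling` flag, both of which are hypothetical names, as is the sample inventory.

```python
# Hypothetical node inventory; none of these field names come from SolarWinds.
nodes = [
    {"name": "srv-01", "kind": "physical", "hardware_polling": True},
    {"name": "srv-02", "kind": "physical", "hardware_polling": False},
    {"name": "vm-01", "kind": "virtual", "hardware_polling": False},
    {"name": "sw-01", "kind": "network", "hardware_polling": False},
]


def physical_without_hardware_polling(inventory: list) -> list:
    """Names of physical servers whose hardware sensors are not polled."""
    return sorted(
        n["name"]
        for n in inventory
        if n["kind"] == "physical" and not n["hardware_polling"]
    )


missing = physical_without_hardware_polling(nodes)
print(missing)
```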

0 krishnamishra0786 over 7 years ago in reply to AlexSoul

Hi Alex,
I have created an alert in the format below, but it is not working.
Our ESXi hosts have an SD disk that is going down, and we are getting the event below on the node,
so I started to create an alert but have not had success. Please let me know how I can create this alert.
Thanks in Advance.

0 AlexSoul over 7 years ago in reply to krishnamishra0786

I would suggest starting from scratch. First, remove all filters and limitations and leave only "I want to alert on Hardware Sensor". See what triggers. Then start filtering once you know it works.
Another thing I notice: you have limited the scope in your rule above. I use this functionality very rarely. Most things can be done at the filter level. So switch back to "all objects in my environment" and simply create a filter for the node name that you have blanked out above,
something like this:

0 krishnamishra0786 over 7 years ago in reply to AlexSoul

Hi,
I used this alert trigger condition:
But I think the condition is not giving the correct result.
Thanks
Krishna

0 AlexSoul over 7 years ago in reply to krishnamishra0786

OK, getting closer... remove this condition now, leave it blank so that it captures everything, and see if your alert triggers.
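
The troubleshooting approach in this exchange (start with no conditions, confirm something triggers, then add filters back one at a time) can be sketched in plain Python. The sensor records, field names, and filters are hypothetical and only illustrate how to spot the filter that removes every match.

```python
from typing import Callable, List, Tuple

Sensor = dict
Filter = Tuple[str, Callable[[Sensor], bool]]


def matches_after_each_filter(
    sensors: List[Sensor], filters: List[Filter]
) -> List[Tuple[str, int]]:
    """Return (label, remaining match count) as each filter is added in turn."""
    remaining = list(sensors)
    counts = [("no filters", len(remaining))]
    for label, keep in filters:
        remaining = [s for s in remaining if keep(s)]
        counts.append((label, len(remaining)))
    return counts


# Made-up data standing in for polled hardware sensors.
sensors = [
    {"node": "esx-01", "sensor": "SD Card", "status": "Critical"},
    {"node": "esx-02", "sensor": "Fan 1", "status": "Up"},
    {"node": "srv-01", "sensor": "Battery", "status": "Warning"},
]

filters = [
    ("status is not Up", lambda s: s["status"] != "Up"),
    ("node name starts with esx", lambda s: s["node"].startswith("esx")),
]

counts = matches_after_each_filter(sensors, filters)
for label, count in counts:
    print(f"{label}: {count}")
```

If a count drops to zero right after a particular filter is added, that filter is the one to re-examine.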