Hi,
I am looking for a way to get alerted when we have a physical hardware issue, whether it be a drive in a server, power supply, battery, memory, etc.
Does anyone have any good examples of how I might be able to do this?
Thanks.
First you need to make sure that your node's hardware is being polled:
You can then create the alert in the web-based Alerting wizard:
To answer your second "Filter Question", find the MIB for the server that lists the "Hardware Type" for the server(s). That means SNMP needs to be running on your servers so that they can be polled, which adds a small amount of overhead on the servers, FYI. You can use SW "Tools" to help find the correct MIB by walking the MIB tree on the devices. Just set up your community name on the server and then in Tools to get that to work. (You will also need to set up NPM and NCM to get the information from polling.)
Set up polling for the servers.
1. Create a NEW alert for hardware problems for the servers. As an example, call it "Server Hardware Problems".
2. Inside the alert, set up the logic to say "If all of the following are true" for the first statement.
3. Then under that, set up a second global statement for "if ANY of the following are true" as the set of logical "or" statements. Example:
The field "hardware type" is equal to "Windows"
The field "hardware type" is equal to "linux"
etc.
4. A THIRD logical global statement should be created Under the FIRST one. It will be indented just like the second one that we just did. In that statement it should say "where any of the following are true".
5. Under that third global statement, create all of the hardware components that you want to watch.
The logic example is thus: If this is a Windows machine AND it has a hardware problem with (name your physical item here), the alert will trigger.
6. The message part of the alert should have (variables) for the Node name, the hardware type, the component name and a timestamp if wanted. You will have to use the listing from tools to determine what the devices may be called.
With some manipulation, it should be what you are looking for.
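For anyone who finds the nesting easier to read as code, here is a rough Python sketch of the logic in steps 2 through 6. It is not how the Alerting wizard is implemented; the `SensorReading` fields, the status values, and the component names are all assumptions made up for illustration, so substitute whatever your own polling actually reports.

```python
from dataclasses import dataclass

# Hypothetical record for one polled hardware sensor; field names are
# illustrative and do not correspond to real SolarWinds properties.
@dataclass
class SensorReading:
    node_name: str
    hardware_type: str
    component: str
    status: str

# Assumed example values; replace with what your devices report.
WATCHED_TYPES = {"windows", "linux"}
WATCHED_COMPONENTS = {"power supply", "battery", "memory", "hard drive"}
PROBLEM_STATUSES = {"warning", "critical"}


def should_alert(reading: SensorReading) -> bool:
    # Second statement: "if ANY of the following are true" (hardware type).
    type_matches = any(reading.hardware_type.lower() == t for t in WATCHED_TYPES)
    # Third statement: "where any of the following are true" (components).
    component_problem = any(
        reading.component.lower() == c and reading.status.lower() in PROBLEM_STATUSES
        for c in WATCHED_COMPONENTS
    )
    # First statement: "If all of the following are true".
    return all([type_matches, component_problem])


def alert_message(reading: SensorReading, timestamp: str) -> str:
    # Step 6: node name, hardware type, component name and a timestamp.
    return (
        f"{timestamp} {reading.node_name} ({reading.hardware_type}): "
        f"{reading.component} is {reading.status}"
    )
```

A Windows node with a critical power supply triggers the alert, while a node of an unlisted hardware type, or a watched component in a healthy state, does not.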
hope that this helps, Mark
Ok, I think I have it down pretty well. One other problem I have noticed: we have about 1000 nodes, and some are physical servers while others are virtual. On some of the physical servers the "Hardware Sensor" check box was not checked, and I had to check it manually.
Does anyone know of a custom search so that I can just list Physical Servers and see which ones don't have the Hardware Sensor checked?
All of our servers (well, most) are polled via SNMP.
Thanks
Very easy my friend:
Option (1) - Filter by OS type within alert:
Option (2) - Create custom property for filtering
This is a great question, thanks.
It was supposed to be pretty straightforward, but it turned out that it is not that clear how to differentiate between physical and virtual machines, even though the Node Details resource shows this information. There are many questions about it on Thwack and not many definitive answers.
Here is the exact same thing that you are looking for: Reporting nodes not configured for hardware polling?
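If you can export your node list, the check itself is simple once the two facts are known for each node. The sketch below assumes a hypothetical export where every node is a dict with `name`, `is_virtual` and `hardware_polling_enabled` keys; those names are invented for illustration and are not real Orion fields, and getting a reliable `is_virtual` value is the hard part mentioned above.

```python
def physical_nodes_missing_hardware_polling(nodes):
    """Return names of physical nodes that do not have hardware polling enabled.

    `nodes` is a list of dicts with the hypothetical keys
    'name', 'is_virtual' and 'hardware_polling_enabled'.
    """
    return sorted(
        node["name"]
        for node in nodes
        if not node["is_virtual"] and not node["hardware_polling_enabled"]
    )


# Made-up sample inventory, for illustration only.
inventory = [
    {"name": "phys-01", "is_virtual": False, "hardware_polling_enabled": True},
    {"name": "phys-02", "is_virtual": False, "hardware_polling_enabled": False},
    {"name": "vm-01", "is_virtual": True, "hardware_polling_enabled": False},
]

missing = physical_nodes_missing_hardware_polling(inventory)
print(missing)
```

Virtual machines are skipped on purpose, since they have no hardware sensors to poll.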
Hi Alex,
I have created an alert in the format below, but it is not working.
Our ESXi host has an SD disk that is going down, and we are getting the event below on the node,
so I started to create an alert but have not had success. Please let us know how I can create this alert.
Thanks in Advance.
I would suggest to start from scratch. First - remove all filters and all limitations and only leave "I want to alert on Hardware Sensor". See what is going to trigger. Then, start filtering out once you know it works.
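To illustrate that approach outside the product, here is a small Python sketch that applies filters one at a time and records how many events survive each step. The event fields, node names and filters are hypothetical examples, not real alert data; the point is only that the step where the count drops to zero is the filter to look at.

```python
def narrow_down(events, filters):
    """Apply (name, predicate) filters in order and report the remaining count after each."""
    remaining = list(events)
    report = [("no filters", len(remaining))]
    for name, predicate in filters:
        remaining = [event for event in remaining if predicate(event)]
        report.append((name, len(remaining)))
    return report


# Made-up hardware sensor events, for illustration only.
events = [
    {"node": "esxi-01", "sensor": "SD Card 1", "status": "Critical"},
    {"node": "esxi-02", "sensor": "Fan 3", "status": "Warning"},
    {"node": "esxi-01", "sensor": "Power Supply 2", "status": "Critical"},
]

# Hypothetical filters, added one by one once the unfiltered alert works.
filters = [
    ("node is esxi-01", lambda e: e["node"] == "esxi-01"),
    ("sensor mentions SD", lambda e: "sd" in e["sensor"].lower()),
]

report = narrow_down(events, filters)
for step, count in report:
    print(f"{step}: {count} matching event(s)")
```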
Another thing I notice: you have limited the scope in your rule above. I use this functionality very rarely. Most things can be done at the filter level. So switch back to "all objects in my environment" and simply create a filter for the node name that you have blanked out above,
something like this: