Hardware Alert Help

Hi,

I am looking for a way to get alerted when we have a physical hardware issue, whether it be a drive in a server, power supply, battery, memory, etc.

Does anyone have any good examples of how I might be able to do this?

Thanks.

  • First you need to make sure that your node's hardware is being polled:

    • Run "List Resources" on your node and add all hardware components.
    • Then add the "Current Hardware Health" resource to your node page so that you can see all the hardware sensors that SolarWinds was able to pick up.

    You can create the alert through the web-based alerting wizard:

    • Go to ALERTS > MANAGE ALERTS > ADD NEW ALERT
    • Under "Trigger Condition" select "I want to report on: Hardware Sensor"
    • Create the condition that suits you best
    • Next > Next > Next > Done
  • Do you know if there is a way to include only Windows/Linux/VMware servers? We have a bunch of networking devices on here that I am not concerned about, and I am not sure how to filter down to just the servers.

    Thanks.

  • To answer your second "Filter Question", find the MIB for the server that lists the "Hardware Type" for the server(s). That means SNMP needs to be running on your servers so that they can be polled, which adds a small amount of overhead on the servers, FYI. You can use the SW "Tools" to help find the correct MIB by walking the MIB tree on the devices. Just set up your community name on the server and then in Tools to get that to work. (You will also need to set up NPM and NCM to get the information from polling.)

    Set up polling for the servers.

    1. Create a NEW alert for hardware problems for the servers. As an example, call it "Server Hardware Problems".

    2. Inside the alert, set the first (top-level) statement to "If all of the following are true".

    3. Under that, add a second group statement set to "if ANY of the following are true". This holds the set of logical "or" statements. Example:

       The field "hardware type" is equal to "Windows"
       The field "hardware type" is equal to "Linux"
       etc.

    4. Create a THIRD group statement under the FIRST one. It will be indented just like the second one that we just did. Set it to "where any of the following are true".

    5. Under that third group statement, add a condition for each hardware component that you want to watch.

    The resulting logic is: if this is a Windows machine AND it has a hardware problem with (name your physical item here), the alert will trigger.

    6. The message part of the alert should have (variables) for the Node name, the hardware type, the component name and a timestamp if wanted. You will have to use the listing from tools to determine what the devices may be called.

    With some manipulation, it should be what you are looking for.
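
    The nested logic in steps 2 to 6 can be sketched in plain Python to check that the grouping does what you expect. This is only an illustration of the AND/OR structure, not anything SolarWinds runs: the `SensorReading` class, its field names, the server types, the component names, and the status values are all assumptions made up for the example.

```python
from dataclasses import dataclass

# Illustrative values only; the real field names and values depend on
# what your polling actually returns.
SERVER_TYPES = {"Windows", "Linux", "VMware"}
WATCHED_COMPONENTS = {"Power Supply", "Battery", "Memory", "Hard Drive"}


@dataclass
class SensorReading:
    node_name: str
    hardware_type: str
    component: str
    status: str  # assumed values such as "Up", "Warning", "Critical"


def should_alert(reading: SensorReading) -> bool:
    """Mirror the alert: ALL of (ANY server type, ANY component problem)."""
    # Second group ("any"): the node is one of the server types.
    is_server = reading.hardware_type in SERVER_TYPES
    # Third group ("any"): one of the watched components is not healthy.
    has_problem = (
        reading.component in WATCHED_COMPONENTS and reading.status != "Up"
    )
    # First group ("all"): both nested groups must be true.
    return is_server and has_problem


def alert_message(reading: SensorReading, timestamp: str) -> str:
    """Build a message from node name, hardware type, component and timestamp."""
    return (
        f"[{timestamp}] {reading.node_name} ({reading.hardware_type}): "
        f"{reading.component} is {reading.status}"
    )
```

    A failed power supply on a Windows node passes both nested groups and triggers, while the same failure on a switch does not, because the server-type group is false.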

    hope that this helps, Mark

  • OK, I think I have it down pretty well. One other problem I have noticed: we have about 1000 nodes, some of which are physical servers and others virtual. On some of the physical servers the "Hardware Sensor" check box was not checked, and I had to check it manually.

    Does anyone know of a custom search so that I can just list Physical Servers and see which ones don't have the Hardware Sensor checked?

    All of our servers (well most) are polling by SNMP

    Thanks

  • Very easy my friend:

    Option (1) - Filter by OS type within alert:

    • Go back to Trigger Condition > Add new condition (green plus icon) > Browse all objects
    • Select Orion Object as "Node" > Add "Machine Type" filter
      • You will need to group all machine types that you want with "OR" grouping

    Option (2) - Create custom property for filtering

    • Just create a custom property called something like [TYPE] as a drop-down and set pick values such as {Server; Router; Switch; etc}.
    • Then assign the value {Server} to all nodes that classify as "Servers", leaving all your networking equipment with values such as Router, Switch, etc.
    • Modify your trigger condition to trigger only if the node has [TYPE] = {Server}
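
    To compare what the two options select, here is a minimal Python sketch. It works on a hand-made list of node records rather than on Orion itself; the record layout, the `machine_type` strings, and the node names are invented for illustration, and `TYPE` stands in for the custom property described in Option (2).

```python
# Hypothetical node records; "machine_type" and "TYPE" stand in for the
# Machine Type field and the custom property described above.
NODES = [
    {"name": "web01", "machine_type": "Windows 2016 Server", "TYPE": "Server"},
    {"name": "esx01", "machine_type": "VMware ESX Server", "TYPE": "Server"},
    {"name": "core-sw1", "machine_type": "Cisco Catalyst", "TYPE": "Switch"},
    {"name": "edge-rt1", "machine_type": "Cisco ISR", "TYPE": "Router"},
]


def filter_by_machine_type(nodes, wanted_types):
    """Option 1: keep nodes whose machine type equals ANY wanted value (OR group)."""
    return [n for n in nodes if n["machine_type"] in wanted_types]


def filter_by_custom_property(nodes, value="Server"):
    """Option 2: keep nodes whose TYPE custom property equals the given value."""
    return [n for n in nodes if n.get("TYPE") == value]


servers_opt1 = filter_by_machine_type(
    NODES, {"Windows 2016 Server", "VMware ESX Server"}
)
servers_opt2 = filter_by_custom_property(NODES)
```

    Option (1) needs every machine type listed explicitly, so a new server model is missed until it is added to the OR group. Option (2) only depends on the custom property being filled in for each node.
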
  • This is a great question, thanks.

    It was supposed to be pretty straightforward, but it turns out it is not that clear how to differentiate between physical and virtual machines, even though the Node Details resource shows this information. There are many questions about it on Thwack and not many definitive answers.

    Here is a thread asking for exactly what you are looking for: Reporting nodes not configured for hardware polling?
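
    If you can export your node list with a physical/virtual flag (for example from a custom property you maintain yourself), the check itself is simple. The sketch below assumes hypothetical fields named `type`, `is_virtual`, and `hardware_polling`; none of these are actual Orion column names, and getting a reliable physical/virtual flag is the hard part mentioned above.

```python
def missing_hardware_polling(nodes):
    """Return names of physical servers without hardware polling enabled."""
    return sorted(
        n["name"]
        for n in nodes
        if n["type"] == "Server"
        and not n["is_virtual"]
        and not n["hardware_polling"]
    )


# Hypothetical export of node records, for illustration only.
exported_nodes = [
    {"name": "db01", "type": "Server", "is_virtual": False, "hardware_polling": True},
    {"name": "db02", "type": "Server", "is_virtual": False, "hardware_polling": False},
    {"name": "vm-app1", "type": "Server", "is_virtual": True, "hardware_polling": False},
    {"name": "core-sw1", "type": "Switch", "is_virtual": False, "hardware_polling": False},
]

to_fix = missing_hardware_polling(exported_nodes)
```

    Here only `db02` is returned: the virtual machine and the switch are skipped, and `db01` already has hardware polling enabled.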

  • Hi Alex,

    I have created an alert in the format below, but it is not working.

    Our ESXi host has an SD disk that keeps going down, and we are getting the event below on the node:

    pastedImage_1.png

    So I started to create an alert but have not had any success. Please let me know how I can create this alert.

    pastedImage_0.png

    Thanks in Advance.

  • I would suggest starting from scratch. First, remove all filters and all limitations and leave only "I want to alert on Hardware Sensor". See what triggers. Then start filtering once you know it works.

    Another thing I notice: you have limited the scope in your rule above. I use this functionality very rarely; most things can be done at the filter level. So switch back to "all objects in my environment" and simply create a filter for the node name that you have blanked out above.

    Something like this:

    pastedImage_0.png

  • Hi,

    I used this alert trigger condition:

    pastedImage_2.png

    But I think the condition is not giving the correct result.

    Thanks

    Krishna

  • OK, getting closer... Remove this condition now and leave it blank so that it captures everything, then see if your alert triggers.

    pastedImage_0.png