18 Replies Latest reply on Oct 31, 2016 4:26 AM by alexslv

    Hardware Alert Help

    ec-umass

      Hi,

      I am looking for a way to get alerted when we have a physical hardware issue, whether it be a drive in a server, power supply, battery, memory, etc.

       

      Does anyone have any good examples of how I might be able to do this?

       

      Thanks.

        • Re: Hardware Alert Help
          alexslv

          First you need to make sure that your node's hardware is being polled:

          • Run "List Resources" on your node and add all hardware components.
          • Then add the "Current Hardware Health" resource to your node page so that you can see all hardware sensors that SolarWinds was able to pick up.

           

          You can create the alert through the web-based Alerting wizard:

          • Go to ALERTS > MANAGE ALERTS > ADD NEW ALERT
          • Under "Trigger Condition" select "I want to report on: Hardware Sensor"
          • Create the condition that suits you best
          • Next > Next > Next > Done
            • Re: Hardware Alert Help
              ec-umass

              Do you know if there is a way to include just Windows/Linux/VMware servers? We have a bunch of networking devices in here that I am not concerned about, and I am not sure how to filter down to just the servers.

               

              Thanks.

                • Re: Hardware Alert Help
                  alexslv

                  Very easy my friend:

                   

                  Option (1) - Filter by OS type within alert:

                   

                  • Go back to Trigger Condition > Add new condition (green plus icon) > Browse all objects
                  • Select Orion Object as "Node" > Add "Machine Type" filter
                    • You will need to group all machine types that you want with "OR" grouping

                   

                  Option (2) - Create custom property for filtering

                   

                  • Just create custom property called something like [TYPE] as a drop-down and set pick values such as {Server; Router; Switch; etc}.
                  • Then, assign value {Server} for all your nodes that classify as "Servers", leaving all your networking equipment with values such as Router, Switch, etc
                  • Modify your trigger condition to trigger only if node has [TYPE] = {Server}
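
                  To make the difference between the two options concrete, here is a small Python sketch over made-up node records. It is not SolarWinds code: the field names `machine_type` and `type_property`, the node names, and the sample machine type strings are all assumptions for illustration, so check the actual Machine Type values in your own environment. It also uses a "contains" match for Option (1), whereas in the wizard you may prefer "is equal to" with one row per machine type.

```python
# Hypothetical node records; field names and values are illustrative only.
nodes = [
    {"name": "srv-win-01", "machine_type": "Windows 2012 R2 Server", "type_property": "Server"},
    {"name": "srv-lnx-01", "machine_type": "net-snmp - Linux", "type_property": "Server"},
    {"name": "esx-01", "machine_type": "VMware ESX Server", "type_property": "Server"},
    {"name": "core-sw-01", "machine_type": "Cisco Catalyst 3750", "type_property": "Switch"},
]

# Option (1): an OR group over machine types (substring match, by assumption).
SERVER_MACHINE_KEYWORDS = ("Windows", "Linux", "VMware")


def is_server_by_machine_type(node):
    """True if any of the OR-grouped keywords appears in the machine type."""
    return any(keyword in node["machine_type"] for keyword in SERVER_MACHINE_KEYWORDS)


# Option (2): a single comparison against the custom property.
def is_server_by_custom_property(node):
    """True if the hypothetical TYPE custom property is set to Server."""
    return node["type_property"] == "Server"


by_machine_type = [n["name"] for n in nodes if is_server_by_machine_type(n)]
by_custom_property = [n["name"] for n in nodes if is_server_by_custom_property(n)]

print(by_machine_type)
print(by_custom_property)
```

                  Option (1) needs no upkeep on the nodes but grows with every new machine type; Option (2) keeps the trigger condition to one line at the cost of assigning the property to each node.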
              • Re: Hardware Alert Help
                highstone351

                To answer your second "Filter Question", find the MIB for the server that lists the "Hardware Type" for the server(s). That means SNMP needs to run on your servers to take advantage of polling. That will create a small amount of overhead for the servers, FYI. You can use SW "Tools" to help find the correct MIB by walking the MIB tree on the devices. Just set up your community name on the server and then in Tools to get that to work. (You will also need to set up NPM and NCM to get the information from polling.)

                 

                Set up polling for the servers.

                1. Create a NEW alert for hardware problems for the servers. As an example, call it "Server Hardware Problems".

                2. Inside the alert, set up the logic to say "If all of the following are true" for the first statement.

                3. Then under that, set up a second global statement for "if ANY of the following are true" as the set of logical "or" statements. Example:

                   The Field "hardware type" is equal to "Windows"
                   The Field "hardware type" is equal to "Linux"
                   etc.

                4. A THIRD logical global statement should be created Under the FIRST one. It will be indented just like the second one that we just did. In that statement it should say "where any of the following are true".

                5. Under that third global statement, create all of the hardware components that you want to watch.

                The logic example is thus: If this is a Windows machine AND it has a hardware problem with (name your physical item here), the alert will trigger.

                 

                6. The message part of the alert should have (variables) for the Node name, the hardware type, the component name and a timestamp if wanted. You will have to use the listing from tools to determine what the devices may be called.
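
                The nesting in steps 2 to 6 can be sketched in Python as follows. This is only a model of the logic, not how the alert engine is implemented: the record fields, the component prefixes, the set of "problem" statuses, and the message layout are all assumptions made up for the example.

```python
from datetime import datetime

# Hypothetical sensor readings joined with their node; all names are made up.
readings = [
    {"node": "srv-win-01", "hardware_type": "Windows", "component": "Power Supply 1", "status": "Critical"},
    {"node": "srv-lnx-01", "hardware_type": "Linux", "component": "Fan 2", "status": "Up"},
    {"node": "core-sw-01", "hardware_type": "Cisco", "component": "Power Supply 1", "status": "Critical"},
]

SERVER_TYPES = ("Windows", "Linux")                               # second block: ANY
WATCHED_COMPONENTS = ("Power Supply", "Fan", "Battery", "Memory")  # third block: ANY
PROBLEM_STATUSES = ("Warning", "Critical")                         # assumed meaning of "problem"


def should_trigger(reading):
    """First block: ALL of (ANY server type, ANY watched component with a problem)."""
    is_server = any(reading["hardware_type"] == t for t in SERVER_TYPES)
    has_watched_problem = any(
        reading["component"].startswith(c) and reading["status"] in PROBLEM_STATUSES
        for c in WATCHED_COMPONENTS
    )
    return all([is_server, has_watched_problem])


def build_message(reading, when):
    """Fill the message with node name, hardware type, component and timestamp."""
    return (
        f"{when:%Y-%m-%d %H:%M} {reading['node']} ({reading['hardware_type']}): "
        f"{reading['component']} is {reading['status']}"
    )


fixed_time = datetime(2016, 10, 31, 4, 26)  # fixed so the output is repeatable
messages = [build_message(r, fixed_time) for r in readings if should_trigger(r)]
for message in messages:
    print(message)
```

                Only the Windows server with the failed power supply produces a message here; the Linux fan is healthy and the switch is excluded by the hardware type block.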

                 

                With some manipulation, it should be what you are looking for.

                 

                hope that this helps, Mark

                  • Re: Hardware Alert Help
                    ec-umass

                    Ok, I think I have it down pretty well. One other problem I have noticed: we have about 1000 nodes, and some are physical servers while others are virtual. I noticed that on some of the physical servers the "Hardware Sensor" check box was not checked, and I had to check it manually.

                     

                    Does anyone know of a custom search so that I can just list Physical Servers and see which ones don't have the Hardware Sensor checked?

                     

                    All of our servers (well, most) are polled via SNMP.

                     

                    Thanks

                      • Re: Hardware Alert Help
                        alexslv

                        This is a great question, thanks.

                         

                        It was supposed to be pretty straightforward, but it turns out that it is not that clear how to differentiate between physical and virtual machines, even though the Node Details resource shows this information. There are many questions about it on Thwack and not many definitive answers.

                         

                        Here is a thread about the exact same thing you are looking for: Reporting nodes not configured for hardware polling?

                          • Re: Hardware Alert Help
                            krishna mishra

                            Hi Alex,

                             

                            I have created an alert in the format below, but it is not working.

                            Our ESXi host has an SD disk that is going down, and we are getting the event below on the node:

                             

                             

                            So I started to create an alert, but without success. Please let me know how I can create this alert.

                             

                            Thanks in Advance.

                              • Re: Hardware Alert Help
                                alexslv

                                I would suggest starting from scratch. First, remove all filters and all limitations and leave only "I want to alert on Hardware Sensor". See what triggers. Then, start filtering once you know it works.

                                 

                                Another thing I notice - you have limited the scope in your rule above. I use this functionality very rarely. Most things can be done at the filter level. So switch back to "all objects in my environment" and simply create a filter for the node name that you have blanked out above.

                                 

                                something like this:

                                 

                                  • Re: Hardware Alert Help
                                    krishna mishra

                                    Hi,

                                     

                                    I used the alert trigger condition-

                                     

                                    But I think the condition is not giving the correct result.

                                     

                                    Thanks

                                    Krishna

                                      • Re: Hardware Alert Help
                                        alexslv

                                        OK, getting closer... remove this condition now, leave it blank so that it captures everything, and see if your alert will trigger.

                                         

                                          • Re: Hardware Alert Help
                                            krishna mishra

                                            When I keep only the last column, it works and shows lots of events for the h/w sensor.

                                             

                                             

                                            But the point is that I would like to capture only a particular h/w sensor event, like "hardware sensor critical for usb direct access".

                                            I think the filter is not working.

                                              • Re: Hardware Alert Help
                                                alexslv

                                                OK, you see - we are getting there. So, we have now confirmed that your alert is working, and you have also concluded that without the filter it works (albeit with too much noise) and with the filter it doesn't - hence, the problem is the filter.

                                                 

                                                I would suggest you the following:

                                                 

                                                * Go to your Reports and create a new report

                                                * Make it report on "Hardware Sensor"

                                                * Then, add a bunch of columns you like, including Hardware sensor message (as per your above screenshot)

                                                * Do not add any filters

                                                * Run a report and see if you get a result

                                                 

                                                Often you think that logically your message should be called "Message", but it is not (a good example is a node name - in the SolarWinds world the name that you see on top of the node's summary page is actually called "Caption" in the database). By playing with the report you can figure out the exact column names you can filter by. Then, go back to your alert and use your findings to adjust the filter accordingly. This is the process I follow when I need to report/alert on something new that I haven't done before. Running a report helps to see what sort of data is stored in which columns in the database.
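
                                                The same "look before you filter" idea can be sketched in Python. The rows below are invented stand-ins for what an unfiltered Hardware Sensor report might return; the column names and values (including the empty Message column) are assumptions for illustration, not the real schema.

```python
from collections import defaultdict

# Hypothetical rows from an unfiltered "Hardware Sensor" report.
report_rows = [
    {"Caption": "esx-01", "Name": "USB Direct Access", "Status": "Critical", "Message": ""},
    {"Caption": "esx-01", "Name": "System Board 1 Fan", "Status": "Up", "Message": ""},
    {"Caption": "srv-win-01", "Name": "Power Supply 1", "Status": "Up", "Message": ""},
]


def summarize_columns(rows):
    """Map each column name to the sorted distinct values seen in it."""
    values = defaultdict(set)
    for row in rows:
        for column, value in row.items():
            values[column].add(value)
    return {column: sorted(seen) for column, seen in values.items()}


summary = summarize_columns(report_rows)
for column, seen in summary.items():
    print(f"{column}: {seen}")
```

                                                A column that only ever contains empty values is a poor candidate for a filter, while columns such as the sensor name or status show the exact strings a filter would have to match.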

                                                  • Re: Hardware Alert Help
                                                    alexslv

                                                    Another thing I have just noticed - you are saying:

                                                    Events are different from objects (Hardware Sensor in this case).

                                                     

                                                    Try this:

                                                    * Instead of searching for "Message" add these two filters:

                                                     

                                                    1. Node Node Name is equal to [whatever your server name is]

                                                    2. Hardware Sensor status is equal to Critical

                                                     

                                                    P.S. I have noticed that "Caption" doesn't come up in the list of possible fields for the Node to choose from... strange.... but I think "Node Name" will do in this case. Try the above

                                                      • Re: Hardware Alert Help
                                                        alexslv

                                                        And you can also add the name of the sensor to limit it to the exact sensor that you need.

                                                         

                                                         

                                                        Now, back to "Message" -  I think this is what you have (as mentioned above):

                                                        1. You see your event in logs. Right?

                                                        2. You then create alert based on "Hardware Sensor"

                                                        3. Both Hardware Sensor and Event objects (which are different, as previously mentioned) apparently have a "Message" field (I have just confirmed this in my environment)

                                                        4. What you are trying to do is to find a message of the event in hardware sensor table... obviously it is not there

                                                         

                                                        So, as it stands now, you cannot alert on arbitrary events (only on Auditing Events). Therefore, your only option is to use the filter proposed above on Hardware Sensor Status plus Sensor Name, Node Name, etc. to limit the scope further the way you like.
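
                                                        Here is a rough Python sketch of why the Message filter comes back empty. It models Events and Hardware Sensors as two separate collections; the field names, sensor names, and the empty sensor messages are assumptions for illustration and may differ from what your environment stores.

```python
# Two separate, hypothetical object collections; field names are illustrative.
events = [
    {"node": "esx-01", "message": "Hardware sensor critical for USB Direct Access"},
]
hardware_sensors = [
    {"node": "esx-01", "name": "USB Direct Access", "status": "Critical", "message": ""},
    {"node": "esx-01", "name": "System Board 1 Fan", "status": "Up", "message": ""},
]


def sensors_matching_message(sensors, text):
    """Search the sensor objects for text that actually lives on the event."""
    return [s for s in sensors if text.lower() in s["message"].lower()]


def sensors_matching(sensors, node, status, name=None):
    """Filter on fields the sensor object does carry: node, status, optional name."""
    return [
        s for s in sensors
        if s["node"] == node
        and s["status"] == status
        and (name is None or s["name"] == name)
    ]


by_message = sensors_matching_message(hardware_sensors, "usb direct access")
by_status = sensors_matching(hardware_sensors, "esx-01", "Critical", "USB Direct Access")

print(len(by_message))
print([s["name"] for s in by_status])
```

                                                        The text being searched for exists only in the `events` list, so the message search over sensors finds nothing, while the node, status and name filter finds the one sensor of interest.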

                                                          • Re: Hardware Alert Help
                                                            krishna mishra

                                                            Hi Alex,

                                                             

                                                            The condition below works fine, but we don't want multiple h/w sensor alerts.

                                                             

                                                            But when I apply the filter, it does not work. Could you please find a way to make the filter work?

                                                             

                                                              • Re: Hardware Alert Help
                                                                alexslv

                                                                You are trying to find an event Message in the Hardware Sensor object. This is not going to work.

                                                                 

                                                                1. Change "Message" to "Name"

                                                                2. Type the first few letters of the name in the value field and press the down arrow - it should offer you the available options to choose from

                                                                3. Pick your exact name you want

                                                                 

                                                                  • Re: Hardware Alert Help
                                                                    krishna mishra

                                                                    Many thanks, Alex, for the suggestion.

                                                                     

                                                                    But Message should work in the filter, and the SolarWinds support team should mark this as a feature request and service improvement.

                                                                    I am going to raise a feature request so that this works in the next version.

                                                                     

                                                                    Thanks

                                                                    K

                                                                      • Re: Hardware Alert Help
                                                                        alexslv

                                                                        I don't think you got the point... I believe everything is working fine. I suggest you create a report first and see what comes up in the Message field for Hardware Sensor - I bet it will be empty. Read my post above - you are trying to find a message of the Event object in the Hardware Sensor object. This is like looking for your socks under my bed - yes, they are indeed under the bed, but not mine, yours!