I'm currently using SAM 6.0.2.
My problem at the moment is that I can't create an advanced alert for hardware status failure.
Example:
I have a server with 40+ hard drives and want an alert for when a disk breaks down.
Is this possible?
Hey seluton,
What you could do to start is to create an advanced alert for when hardware status is anything other than 'up', and set the advanced alert so that it waits for 6 minutes (three polling cycles using the defaults) before it fires off the alert actions. This will eliminate any 'flappy' alerts from spamming your dashboard/mailbox/alert central instances, but will give you true warnings.
Looking for something specific as a disk would take more fiddling, but the above should give you the visibility you need.
Thanks,
I found out that I need to upgrade my SAM and NPM to be able to create alerts on hardware status.
It is not possible to do in SAM 6.0.2 and NPM 10.6.1 as I understand it.
No worries. The versions I'm running are newer than yours, so I can't really test it for you, I'm afraid.
Hope you get it sorted!
It is possible; I am doing it happily now with an SQL-based trigger. What is not possible in SAM 6.0 is to "mute/disable/exclude" any particular hardware component individually. Check this article and my reply: Alert Prioritising Dashboard (SWQL) for Problematic Nodes (Servers)
Here is the SQL for a Custom SQL Alert for Node. It will report on nodes that are having hardware issues. For details on which hardware component has the problem, you would need to dig into the node itself.
TRIGGER CONDITION:
--SELECT
-- Nodes.NodeID AS NetObjectID
-- ,Nodes.Caption AS Name
--FROM SolarWinds.dbo.Nodes
--WHERE Nodes.NodeID NOT IN
-- (
-- SELECT DISTINCT Nodes.NodeID
-- FROM SolarWinds.dbo.Nodes WITH(NOLOCK)
---------------------------------------------
INNER JOIN Solarwinds.dbo.APM_HardwareInfo hw_info ON hw_info.NodeID = Nodes.NodeID
INNER JOIN Solarwinds.dbo.APM_HardwareItem hw_item ON hw_item.NodeID = Nodes.NodeID
WHERE
Nodes.UnManaged = 0 AND --node is not unmanaged
Nodes.[Status] <> '2' AND --node is not down
hw_info.IsDisabled <> '1' AND --hardware on the node was not disabled
hw_item.IsDeleted <> '1' AND --hardware sensor was not deleted
hw_item.[Status] NOT IN ('1','0') --exclude (1)Up, (0)Unknown states
--)
RESET CONDITION:
--SELECT
-- Nodes.NodeID AS NetObjectID
-- ,Nodes.Caption AS Name
--FROM SolarWinds.dbo.Nodes
WHERE Nodes.NodeID NOT IN
(
SELECT DISTINCT Nodes.NodeID
FROM SolarWinds.dbo.Nodes WITH(NOLOCK)
---------------------------------------------
INNER JOIN Solarwinds.dbo.APM_HardwareInfo hw_info ON hw_info.NodeID = Nodes.NodeID
INNER JOIN Solarwinds.dbo.APM_HardwareItem hw_item ON hw_item.NodeID = Nodes.NodeID
---------------------------------------------
WHERE
Nodes.UnManaged = 0 AND --node is not unmanaged
Nodes.[Status] <> '2' AND --node is not down
hw_info.IsDisabled <> '1' AND --hardware on the node was not disabled
hw_item.IsDeleted <> '1' AND --hardware sensor was not deleted
hw_item.[Status] NOT IN ('1','0') --exclude (1)Up, (0)Unknown states
---------------------------------------------
)
Remember to update the database name to match your own.
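Since the alert itself only names the node, a rough sketch of a follow-up query that lists the individual sensors behind it may help. It reuses the same tables, joins and filters as the trigger condition above and is meant to be run by hand in a SQL client, not pasted into the alert. It assumes the database is called SolarWinds, and it uses hw_item.* because the sensor's name column does not appear in this thread, so check the table in your own installation before narrowing it down.

```sql
-- Sketch only: list the hardware sensors that are not Up/Unknown, per node.
-- Same joins and filters as the trigger condition; the database name is an assumption.
SELECT
    Nodes.NodeID,
    Nodes.Caption,
    hw_item.*   -- all sensor columns; narrow this once you know your schema
FROM SolarWinds.dbo.Nodes WITH(NOLOCK)
INNER JOIN SolarWinds.dbo.APM_HardwareInfo hw_info ON hw_info.NodeID = Nodes.NodeID
INNER JOIN SolarWinds.dbo.APM_HardwareItem hw_item ON hw_item.NodeID = Nodes.NodeID
WHERE
    Nodes.UnManaged = 0 AND               -- node is not unmanaged
    Nodes.[Status] <> '2' AND             -- node is not down
    hw_info.IsDisabled <> '1' AND         -- hardware on the node was not disabled
    hw_item.IsDeleted <> '1' AND          -- hardware sensor was not deleted
    hw_item.[Status] NOT IN ('1','0')     -- exclude (1)Up, (0)Unknown states
ORDER BY Nodes.Caption
```

For a server with many disks, the rows returned for that node should show which sensors are in a bad state, which saves clicking through the node page.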
Regards,
Alex
SAM 6.0.2 includes a default out-of-the-box alert "Alert me when any hardware component goes into a warning or critical state" that should work for this purpose.