This discussion has been locked. The information referenced herein may be inaccurate due to age, software updates, or external references.
You can no longer post new replies to this discussion. If you have a similar question you can start a new discussion in this forum.

Node Downtime with Duration and Minimum Length Filtering

**REQUIRES ORION PLATFORM 2018.2 OR ABOVE**

I had assembled this based on a much older SQL report, and then updated it to SWQL, then added some more intelligence to it so you can filter it based on the duration of the outage, search by the device names, and it has a method of letting you know when nodes have been down so long they aged out of the events table.

Based on popular requests I figured it was time to put it out here to make it easier for the Thwackers to find and use.  This is intended to be used inside the Custom Query Resource


pastedImage_1.png

select n.caption as [Device]
-- shows the current status icon
, '/Orion/images/StatusIcons/Small-' + n.StatusIcon AS [_IconFor_Device]
-- makes a clickable link to the node details
, n.DetailsUrl as [_linkfor_Device]
-- shows the timestamp of the down event, if there is no timestamp then is says the event was greater than the number of days in your event retention settings
, isnull(tostring(t2.[Down Event]),concat('Greater than ',(SELECT CurrentValue FROM Orion.Settings where settingid='SWNetPerfMon-Settings-Retain Events'),' days ago')) as [Down Event]
-- shows the timestamp of the up event, unless the object is still down
, isnull(tostring(t2.[Up Event]),'Still Down') as [Up Event]
-- figures out the minutes between the down and up events, if the object is still down it counts from the down event to now, displays 99999 if we cannot accurately determine the original downtime, and
, isnull(MINUTEDIFF(t2.[Down Event], isnull(t2.[Up Event],GETDATE())),99999) as Minutes


from orion.nodes n
left join (SELECT   
-- Device nodeid used for our join  
StartTime.Nodes.NodeID    

-- Down Event time stamp in local time zone   
,ToLocal(StartTime.EventTime) AS [Down Event]   
 
-- Up Event time stamp in local time zone   
,(SELECT TOP 1   
ToLocal(EventTime) AS [EventTime]   
FROM Orion.Events AS [EndTime]   
-- picks the first up event that is newer than the down event for this node
WHERE EndTime.EventTime >= StartTime.EventTime  
-- EventType 5 is a node up
AND EndTime.EventType = 5   
AND EndTime.NetObjectID = StartTime.NetObjectID   
AND EventTime IS NOT NULL   
ORDER BY EndTime.EventTime   
) AS [Up Event]   
 
-- This is the table we are querying   
FROM Orion.Events StartTime   
 
-- EventType 1 is a node down
WHERE StartTime.EventType = 1   
   
) t2 on n.NodeID = t2.nodeid


-- this is how I catch nodes that are down but have aged out of the events table
where (n.status = 2 or t2.nodeid is not null)


-- If you want to filter the results to only show outages of a minimum duration uncomment the below line
--and MINUTEDIFF(isnull(t2.[Down Event],(GETUTCDATE()-30)), isnull(t2.[Up Event],GETUTCDATE())) >  60


-- if you want to use this query in a search box of the Custom Query resource uncomment the below line
--and n.Caption like '%${SEARCH_STRING}%'


order by t2.[down event] desc

-Marc Netterfield

    Loop1 Systems: SolarWinds Training and Professional Services

Parents
  • I have a query very similar to this one and here is an example of it being used to show total downtime.

    SELECT

        ST.Nodes.Caption AS [Device],

            '/Orion/images/StatusIcons/Small-' + ST.Nodes.StatusIcon AS [_IconFor_Device],

            ST.nodes.detailsurl as [_linkfor_Device],

        CONCAT(--Downtime formatted

            (CASE WHEN (SUM(SECONDDIFF(ST.EventTime,ISNULL(ET.EventTime, GETUTCDATE()))/60/60/24)) >= 1 --Days

                THEN CONCAT(SUM(SECONDDIFF(ST.EventTime,ISNULL(ET.EventTime, GETUTCDATE()))/60/60/24), 'd ') ELSE '' END),

            (CASE WHEN (SUM(SECONDDIFF(ST.EventTime,ISNULL(ET.EventTime, GETUTCDATE()))/60/60)) >= 1 --Hours

                THEN CONCAT(SUM(SECONDDIFF(ST.EventTime,ISNULL(ET.EventTime, GETUTCDATE()))/60/60%24), 'h ') ELSE '' END),

            (CASE WHEN (SUM(SECONDDIFF(ST.EventTime,ISNULL(ET.EventTime, GETUTCDATE()))/60)) >= 1 --Minutes

                THEN CONCAT(SUM(SECONDDIFF(ST.EventTime,ISNULL(ET.EventTime, GETUTCDATE()))/60%60), 'm ') ELSE '' END),

            (CASE WHEN (SUM(SECONDDIFF(ST.EventTime,ISNULL(ET.EventTime, GETUTCDATE())))) >= 1 --Seconds

                THEN CONCAT(SUM(SECONDDIFF(ST.EventTime,ISNULL(ET.EventTime, GETUTCDATE()))%60), 's ') ELSE '' END)) AS [Downtime]

    FROM

        Orion.Events AS ST --Get StartTime on Events

    LEFT JOIN

        Orion.Events AS ET ON ET.NetObjectID = ST.NetObjectID --Match Events with the next UP event after the DOWN event

            AND ET.EventTime = (SELECT TOP 1 x.EventTime FROM Orion.Events AS [x] WHERE

                                    x.EventTime >= ST.EventTime AND x.EventType = 5

                                    AND x.NetObjectType = 'N' AND x.NetObjectID = ST.NetObjectID

                                    ORDER BY x.EventTime) --SubQuery to get the next UP Event

    WHERE

        ST.EventType = 1 --DownEvent

        AND DAYDIFF(ST.eventtime,GETUTCDATE())<30 --Get Last X days

        AND ST.Nodes.CustomProperties._DeviceType = 'Branch Router' --Branch routers only

        AND ST.Nodes.Status <> 9 --Status is not unamanaged

        --AND ST.Nodes.caption like '%${SEARCH_STRING}%' --Used if search enabled

    GROUP BY

        ST.Nodes.Caption,

        ST.Nodes.StatusIcon,

        ST.nodes.detailsurl

    ORDER BY

        SUM(SECONDDIFF(ST.EventTime,ISNULL(ET.EventTime, GETUTCDATE()))) DESC

Reply
  • I have a query very similar to this one and here is an example of it being used to show total downtime.

    SELECT

        ST.Nodes.Caption AS [Device],

            '/Orion/images/StatusIcons/Small-' + ST.Nodes.StatusIcon AS [_IconFor_Device],

            ST.nodes.detailsurl as [_linkfor_Device],

        CONCAT(--Downtime formatted

            (CASE WHEN (SUM(SECONDDIFF(ST.EventTime,ISNULL(ET.EventTime, GETUTCDATE()))/60/60/24)) >= 1 --Days

                THEN CONCAT(SUM(SECONDDIFF(ST.EventTime,ISNULL(ET.EventTime, GETUTCDATE()))/60/60/24), 'd ') ELSE '' END),

            (CASE WHEN (SUM(SECONDDIFF(ST.EventTime,ISNULL(ET.EventTime, GETUTCDATE()))/60/60)) >= 1 --Hours

                THEN CONCAT(SUM(SECONDDIFF(ST.EventTime,ISNULL(ET.EventTime, GETUTCDATE()))/60/60%24), 'h ') ELSE '' END),

            (CASE WHEN (SUM(SECONDDIFF(ST.EventTime,ISNULL(ET.EventTime, GETUTCDATE()))/60)) >= 1 --Minutes

                THEN CONCAT(SUM(SECONDDIFF(ST.EventTime,ISNULL(ET.EventTime, GETUTCDATE()))/60%60), 'm ') ELSE '' END),

            (CASE WHEN (SUM(SECONDDIFF(ST.EventTime,ISNULL(ET.EventTime, GETUTCDATE())))) >= 1 --Seconds

                THEN CONCAT(SUM(SECONDDIFF(ST.EventTime,ISNULL(ET.EventTime, GETUTCDATE()))%60), 's ') ELSE '' END)) AS [Downtime]

    FROM

        Orion.Events AS ST --Get StartTime on Events

    LEFT JOIN

        Orion.Events AS ET ON ET.NetObjectID = ST.NetObjectID --Match Events with the next UP event after the DOWN event

            AND ET.EventTime = (SELECT TOP 1 x.EventTime FROM Orion.Events AS [x] WHERE

                                    x.EventTime >= ST.EventTime AND x.EventType = 5

                                    AND x.NetObjectType = 'N' AND x.NetObjectID = ST.NetObjectID

                                    ORDER BY x.EventTime) --SubQuery to get the next UP Event

    WHERE

        ST.EventType = 1 --DownEvent

        AND DAYDIFF(ST.eventtime,GETUTCDATE())<30 --Get Last X days

        AND ST.Nodes.CustomProperties._DeviceType = 'Branch Router' --Branch routers only

        AND ST.Nodes.Status <> 9 --Status is not unamanaged

        --AND ST.Nodes.caption like '%${SEARCH_STRING}%' --Used if search enabled

    GROUP BY

        ST.Nodes.Caption,

        ST.Nodes.StatusIcon,

        ST.nodes.detailsurl

    ORDER BY

        SUM(SECONDDIFF(ST.EventTime,ISNULL(ET.EventTime, GETUTCDATE()))) DESC

Children
No Data