This discussion has been locked. The information referenced herein may be inaccurate due to age, software updates, or external references.
You can no longer post new replies to this discussion. If you have a similar question you can start a new discussion in this forum.

Node Downtime with Duration and Minimum Length Filtering

**REQUIRES ORION PLATFORM 2018.2 OR ABOVE**

I had assembled this based on a much older SQL report, and then updated it to SWQL, then added some more intelligence to it so you can filter it based on the duration of the outage, search by the device names, and it has a method of letting you know when nodes have been down so long they aged out of the events table.

Based on popular requests I figured it was time to put it out here to make it easier for the Thwackers to find and use.  This is intended to be used inside the Custom Query Resource


pastedImage_1.png

select n.caption as [Device]
-- shows the current status icon
, '/Orion/images/StatusIcons/Small-' + n.StatusIcon AS [_IconFor_Device]
-- makes a clickable link to the node details
, n.DetailsUrl as [_linkfor_Device]
-- shows the timestamp of the down event, if there is no timestamp then is says the event was greater than the number of days in your event retention settings
, isnull(tostring(t2.[Down Event]),concat('Greater than ',(SELECT CurrentValue FROM Orion.Settings where settingid='SWNetPerfMon-Settings-Retain Events'),' days ago')) as [Down Event]
-- shows the timestamp of the up event, unless the object is still down
, isnull(tostring(t2.[Up Event]),'Still Down') as [Up Event]
-- figures out the minutes between the down and up events, if the object is still down it counts from the down event to now, displays 99999 if we cannot accurately determine the original downtime, and
, isnull(MINUTEDIFF(t2.[Down Event], isnull(t2.[Up Event],GETDATE())),99999) as Minutes


from orion.nodes n
left join (SELECT   
-- Device nodeid used for our join  
StartTime.Nodes.NodeID    

-- Down Event time stamp in local time zone   
,ToLocal(StartTime.EventTime) AS [Down Event]   
 
-- Up Event time stamp in local time zone   
,(SELECT TOP 1   
ToLocal(EventTime) AS [EventTime]   
FROM Orion.Events AS [EndTime]   
-- picks the first up event that is newer than the down event for this node
WHERE EndTime.EventTime >= StartTime.EventTime  
-- EventType 5 is a node up
AND EndTime.EventType = 5   
AND EndTime.NetObjectID = StartTime.NetObjectID   
AND EventTime IS NOT NULL   
ORDER BY EndTime.EventTime   
) AS [Up Event]   
 
-- This is the table we are querying   
FROM Orion.Events StartTime   
 
-- EventType 1 is a node down
WHERE StartTime.EventType = 1   
   
) t2 on n.NodeID = t2.nodeid


-- this is how I catch nodes that are down but have aged out of the events table
where (n.status = 2 or t2.nodeid is not null)


-- If you want to filter the results to only show outages of a minimum duration uncomment the below line
--and MINUTEDIFF(isnull(t2.[Down Event],(GETUTCDATE()-30)), isnull(t2.[Up Event],GETUTCDATE())) >  60


-- if you want to use this query in a search box of the Custom Query resource uncomment the below line
--and n.Caption like '%${SEARCH_STRING}%'


order by t2.[down event] desc

-Marc Netterfield

    Loop1 Systems: SolarWinds Training and Professional Services

Parents
  • I am trying to resolve how this query retrieves its data. I ran the query for a single device on my network and it displayed 8 records for the device. The report lists the up/down dates and times with the total minutes for each. When I query the Orion.Event table looking for same one device with the event type of 1 or 5 I get the following results. 4 of the records I could match to the report, one record on the report listed a 11/26 date while the query of the event table listed 11/27, another record on the report listed 12/6 for a date while the event query listed 12/7, the last two records where not found in the event table.

    Management wants to use this report to track the equipment that vendors are responsible for so I need to be able to verify the data is correct. Here are a couple of questions.

    1. Is there some other table that is being used? I have been looking to find what table "StartTime.EventType" or "EndTime.EventType" are defined in.

    2. I have created individual groups that contain just the devices specific vendors are responsible for. How do I modify the query to select a group name for a specific date range and have just those devices appear in the report?

    3. I see a reference to T2.xxx, am I correct in assuming that this is some type of temporary table?

    Thank you for your help.

Reply
  • I am trying to resolve how this query retrieves its data. I ran the query for a single device on my network and it displayed 8 records for the device. The report lists the up/down dates and times with the total minutes for each. When I query the Orion.Event table looking for same one device with the event type of 1 or 5 I get the following results. 4 of the records I could match to the report, one record on the report listed a 11/26 date while the query of the event table listed 11/27, another record on the report listed 12/6 for a date while the event query listed 12/7, the last two records where not found in the event table.

    Management wants to use this report to track the equipment that vendors are responsible for so I need to be able to verify the data is correct. Here are a couple of questions.

    1. Is there some other table that is being used? I have been looking to find what table "StartTime.EventType" or "EndTime.EventType" are defined in.

    2. I have created individual groups that contain just the devices specific vendors are responsible for. How do I modify the query to select a group name for a specific date range and have just those devices appear in the report?

    3. I see a reference to T2.xxx, am I correct in assuming that this is some type of temporary table?

    Thank you for your help.

Children
No Data