cancel
Showing results for 
Search instead for 
Did you mean: 
Create Post

Alert Prioritising Dashboard (SWQL) for Problematic Nodes (Servers)

Here is an example to use SWQL to build a view to display problematic nodes (servers) with issues from one or more flowing areas:

•    Node Status (column name: CONN) - (1 UP, 2 Down, ignore other status)

•    Node Response Time (column name: M_SECS) - in milliseconds, (> 0 OR When Node is Down, it is -1). If M_SECS> 500: Warning, If  M_SECS> 500: Critical

•    Node CPU Load (column name:  C_LOAD) - in percentage, (Between 0 - 100). If  C_LOAD > 95: Warning, If  C_LOAD > 98: Warning, If  C_LOAD =100: Down

•    Node Memory Usage (column name:  R_Load) - percentage, (Between 0 - 100). If  R_LOAD > 95: Warning, If  R_LOAD > 98: Warning, If  R_LOAD =100: Down

•    Node Highest Volume Usage (column name:  V_PERCENT) - (Between 0 - 100). If  V_PERCENT > 95: Warning, If  V_PERCENT > 98: Warning, If  V_PERCENT =100: Down

•    Node Hardware Components worst  Status (column name: HW_Status) -  (UP, Undefined, Unknown, Warning, Critical, n/a)

•    Node  Application worst Status (column name:  APP_Status)  - (UP, Unmanaged, Unknown, Unreachable, Warning, Critical, Down, n/a)

Alert-swql.jpg

In order to the worst (highest priority) condition are shown on the top of the list I gave each status different scores, and each column different weights. Then calculate total score as the priority. Here is the calculation:


•    wConn (Connection), scores: Down - 1000, Up - 0; weight 1.00
•    wTime (Response Time), scores:  > 1000ms - 80, >500ms - 10, other - 0; Weight 0.75
•    wCPU (CPU Load), scores: 100% - 600, >98% - 80, >95% - 10, Other - 0; Weight 1.00
•    wRAM (Memory Load), scores: 100% - 600, >98% - 80, >95% - 10, Other 0; Weight 1.00
•    wVol (MAX(Volume Usage)), the highest volume usage of all volumes on a node, scores: 100% - 600, >98% - 80, >95% - 10, Other 0; Weight 0.75
•    wHW  (Hardware Status (worst Value)), the worst HW component status of a node with HW monitor enabled   scores: Critical - 80, Warning - 10, Up - 0, other 1; Weight 0.50
•    wApp (Application Status (worst value), the worst application statues of a node with application monitors assigned.  scores: Down - 600, Critical - 80, Warning - 10, Up - 0, other 1; Weight 0.50

Maximum Total Weighted Score (Exclude wConn):  80*0.75 + 600*1.00 + 600*1.00 + 600*0.75 + 80*0.50 + 600 *0.50  = 2050

Priority = ROUND((t1.wTime*0.75 + t1.wCPU*1.0 + t1.wRAM*1.0 + t1.wVol*0.75 + t1.wHW*0.5 + t1.wApp*0.5)/2.05 + t1.wConn*1.00, 2)

Final Priority value is between 0 and 1000.

You can change the score and weight to meeting your requirement.

Steps:

  • Create a view; add “Custom Query” resource.

swql.jpg

  • In the view, edit Custom Query:
  • In the Custom SWQL Query box, add the codes in attached file “thwack-swql-alerts.txt”
  • Enable search, and in Search SWQL Query box, add the codes in attached file “thwack-swql-alerts-withSearch.txt”

Done!

Using Search:

•    By Node Name
If you want to just display a node or a group of nodes with similar names, type node name or part of the name in the search box and click search button.
•    By Connection Status
If you want to just display nodes in DOWN status, type “n 1” (white space between n and 1) in the search box and click search button.
•    By CPU or RAM or Volume usage
If you want to just display node with CPU or RAM or Volume usage above certain level, using the following:

     o    “c 80”  (CPU usage above 80%)
     o    “r 80” (Memory usage above 80%)
     o    “v 80” (Volume usage above 80%)
•    By Hardware Status
If you want to just display node with certain hardware status, type “h status” (‘status’ can be one of the following: UP, undefined, Unknown, Warning, Critical, n/a).
•    By Application Status
If you want to just display node with certain application status, type “a status” (‘status‘ can be one of the following: UP, Unmanaged, Unknown, Unreachable, Warning, Critical, Down, n/a).

You can customise the query to meeting your requirements.

Thanks alexslv's post https://thwack.solarwinds.com/docs/DOC-174568, which is very helpful!

===========================

Update:  As Alex suggested, I have updated the query and new files are attached. Thanks Alex!

===========================

Update: 11/March/2015

I have added 2 addition columns for Alert Prioritising Dashboard.

One column is AlertTime, another one is Acknowledge (Ack). The Ack column is click-able. Right click it and open a new windows to View or Acknowledge an alert.

Please see the additional document at https://thwack.solarwinds.com/docs/DOC-176727

alert-000.jpg

============================

Update: 11/11/2015

The original query is for NPM & SAM, but if you only need NPM (network nodes) part, I did create another two queries for network devices only.

The files: "networkNOC-ForThwack.txt"  and "InterfaceNOC-ForThwack.txt" are attached.

"networkNOC-ForThwack.txt" is for network device (NPM) only.

001.jpg

"InterfaceNOC-ForThwack.txt" if is for network Interface only.

002.jpg

Both are limited to Vendor = 'Cisco', you can change it to meet your requirements.

Attachments
Comments

Outstanding! This script works out-of-the-box incredibly well. Thank You. Your prioritising solution is amazing

I have also created HTML Resource to go on the same page for "Using Search":

TITLE: Using Search

<b>By Node Name</b><br/>

If you want to just display a node or a group of nodes with similar names, type node name or part of the name in the search box and click search button.<br/>

<br/>

<b>By Connection Status</b><br/>

If you want to just display nodes in DOWN status, type “n 1” (white space between n and 1) in the search box and click search button.<br/>

<br/>

<b>By CPU or RAM or Volume usage</b><br/>

If you want to just display node with CPU or RAM or Volume usage above certain level, using the following:<br/>

<ul>

<li>“c 80”  (CPU usage above 80%)</li>

<li>“r 80” (Memory usage above 80%)</li>

<li>“v 80” (Volume usage above 80%)</li>

</ul>

<b>By Hardware Status</b><br/>

If you want to just display node with certain hardware status, type “h status” (‘status’ can be one of the following: UP, undefined, Unknown, Warning, Critical, n/a).<br/>

<br/>

<b>By Application Status</b><br/>

If you want to just display node with certain application status, type “a status” (‘status‘ can be one of the following: UP, Unmanaged, Unknown, Unreachable, Warning, Critical, Down, n/a).

I have noticed that Hardware is cutting on false positives. After a bit of digging it seems that you are checking overall hardware category status for the node. I do not know which table you need to check via SWQL, but from SQL perspective I am checking dbo.APM_HardwareInfo table and any rows with IsDelete=1 must be excluded.

Here is SQL for Hardware I am using. You just need to convert these checks into SWQL and update your script:

SELECT

  Nodes.NodeID

,hw_item.ID

,Nodes.Caption

,hw_item.UniqueName

,hw_item.DisplayName

FROM Solarwinds.dbo.Nodes

---------------------------------------------

INNER JOIN Solarwinds.dbo.APM_HardwareInfo hw_info ON hw_info.NodeID = Nodes.NodeID

INNER JOIN Solarwinds.dbo.APM_HardwareItem hw_item ON hw_item.NodeID = Nodes.NodeID

---------------------------------------------

WHERE

  Nodes.UnManaged = 0 AND --node is not unmanaged

  Nodes.[Status] <> '2' AND --node is not down

  hw_info.IsDisabled <> '1' AND --hardware on the node was not disabled

  hw_item.IsDeleted <> '1' AND --hardware sensor was not deleted

  hw_item.[Status] NOT IN ('1','0') --exclude (1)Up, (0)Unknown states

---------------------------------------------

This is great, thank you

Hi Alex

Thank you very much!!! You are right, these is a false positive(s). In my system I have other columns and filters so it does not show the  false positives.

I checked my query, "hardware sensor was deleted" filter was missing, so one line added after line 190

JOIN Orion.HardwareHealth.HardwareItem AS e ON e.HardwareCategoryStatusID = c.ID

e03.jpg

Hope this resolves the issue. I have replaced the attachments with new files.

Thanks Again!

Patrick

Thanks! I was doing the same thing

WOW! This is really good. Is there a way to sort out devices by vendor? For example I want to only monitor my Windows \ Linux \ Unix devices. By the Way, iv'e been lurking for about 2 months, the stuff on thwack is awesome. Thanks everyone I hope to soon be adding some articles.

Alex

If you just want alerts for Windows/Linux servers, please replace "WHERE" clause from Line 211 to 226 with the codes below:

WHERE  -- list all nodes which are:

(

  a.Vendor IN ('windows','net-snmp')  --- for Winodws and Linux servers only

  AND (

  a.Status=2  -- Node is Down

  OR

  (a.Status=1 AND a.AvgResponseTime > 500)  -- Node is up but Response Time is slow

  OR

  (a.Status=1 AND a.CPULoad > 95) -- Node is up but CPU load is high

  OR

  (a.Status=1 AND a.PercentMemoryUsed > 95) -- Node is up but memory load is high

  OR

  (a.Status=1 AND g.HWStatusID > 3) -- Node is up one or more HW components is in Warning/Critical Status

  OR

  (a.Status=1 AND h.vPercentage > 95.0 AND h.vPercentage < 100.0) -- Node is Up, but One or more volumes usage is high

  OR

  (a.Status=1 AND i.aStatusID > 3)  -- Node is up, but one or more applications is in NOT UP Status

  )

)

This is really awesome, great effort. thanks for sharing!

Somebody give this guy a raise!!

Love this - thank you.

Is there an easy way to change the scope of this? i have a few SQL servers that will always report high memory usage - so i don't want to see them - could i change the scope based on a custom property?

thanks!

If you already have a custom property to identify those "a few SQL servers", you could add the condition in the WHERE clause: (a.Status=1 AND a.PercentMemoryUsed > 95 AND "custom property" <> 'This Value'  ) -- Node is up but memory load is high

wluther‌ - TESTIFY!

hcy01uk‌ - BRAVO!!!

Thanks for sharing!

alexslv‌ and hcy01uk

I wanted to convert this to use only Cisco, Riverbed, and Juniper devices.  I edited the follow lines but only seem to get a couple items.

a.Vendor IN ('Cisco','Riverbed Technology','Juniper') 

  AND (

What else am I missing here?

CourtesyIT‌ Have you adjusted each of the alert thresholds to qualify a majority to test with?

ie. (a.Status=1 AND a.CPULoad > 95) -- Node is up but CPU load is high

lower the 95 to 0, and see if it shows everything.

(a.Status=1 AND a.CPULoad > 95) -- Node is up but CPU load is high


Also, you're probably not going to need to use the SAM parts of this, but do you have Hardware Health enabled on your devices?

-Will

Your nodes must be set to poll routing table. Do List Resources on nodes

that don't show up and ensure you have Routing Table (or similar name)

checked.

To troubleshoot, just remove all filters after JOINs, this will give you

full list of everything, without filtering anything out. Add a.Vendor into

Select statement to get an idea what vendors you have so that you can use

this info to filter later (adding to your IN list those that you want)

Let me know how it goes

I was wondering about the SAM piece and hardware health is enabled on all devices.

As I lower the above numbers down to networking levels more devices appear.  I will also need to modify a bit since some of the columns are not required.

Thanks,

Eric

Got a bit lost on the JOINs.  There are several. 

CourtesyIT‌ I think this should remove the SAM part of it for you... However, I am not in a position to test anything at the moment.  And, as always, there is a super high possibility I am just flat out wrong.

Remove the following lines. (Sorry, I am unable to attach the edited file, and when I posted the entire script, it just took up way too much room...)

LINE 17

--t1.APP, t1.[_IconFor_APP], t1.aStatus AS APP_Status,

LINES 136 THROUGH 169

--'' AS APP,  -- application Status

--CASE

--  WHEN a.Status = 2  THEN  '/Orion/images/StatusIcons/Small-Unreachable.gif'  -- if node is down

--  WHEN i.aStatusID IS NULL THEN  '/Orion/images/StatusIcons/Small-Unreachable.gif'

--  WHEN i.aStatusID=0  THEN '/Orion/images/StatusIcons/Small-Up.gif' 

--  WHEN i.aStatusID=1  THEN  '/Orion/images/StatusIcons/Small-Unmanaged.gif'

--  WHEN i.aStatusID=2  THEN  '/Orion/images/StatusIcons/Small-Unknown.gif'

--  WHEN i.aStatusID=3  THEN  '/Orion/images/StatusIcons/Small-Unreachable.gif'

--  WHEN i.aStatusID=4  THEN  '/Orion/images/StatusIcons/Small-Up-Warn.gif'

--  WHEN i.aStatusID=5  THEN  '/Orion/images/StatusIcons/Small-Critical.gif'

--  WHEN i.aStatusID=6  THEN  '/Orion/images/StatusIcons/Small-Down.gif'

--  ELSE  '/Orion/images/StatusIcons/Small-Unreachable.gif' 

--END AS [_IconFor_APP], -- application Status ICcon

--CASE

--  WHEN a.Status = 2  THEN 'Unreachable'

--  WHEN i.aStatusID IS NULL THEN  'n/a'

--  WHEN i.aStatusID=0 THEN 'Up'

--  WHEN i.aStatusID=1 THEN 'Unmanaged'

--  WHEN i.aStatusID=2 THEN 'Unknown'

--  WHEN i.aStatusID=3 THEN 'Unreachable'

--  WHEN i.aStatusID=4 THEN 'Warning'

--  WHEN i.aStatusID=5 THEN 'Critical'

--  WHEN i.aStatusID=6 THEN 'Down'

--  ELSE 'n/a'

--END AS aStatus, -- application Status

--CASE

--  WHEN a.Status = 2  THEN 0

--  WHEN i.aStatusID IS NULL THEN  0

--  WHEN i.aStatusID=2 THEN 1 -- Other

--  WHEN i.aStatusID=4 THEN 10 -- Warning

--  WHEN i.aStatusID=5 THEN 80 -- Critical

--  WHEN i.aStatusID=6 THEN 600 -- Down

--  ELSE 0 -- Up

--END AS wApp -- application Status score

LINES 195 THROUGH 211

--LEFT JOIN  -- find out the worst application status on a node if apply

--(

--SELECT NodeID,

--MAX(CASE

--  WHEN Status=1 THEN 0  -- UP

--  WHEN Status=17 THEN 1  -- Unmanaged

--  WHEN Status=0 THEN 2  -- Unknown

--  WHEN Status=12 THEN 3  -- Unreachable

--  WHEN Status=3 THEN 4  -- Warning

--  WHEN Status=14 THEN 5  -- Critical

--  WHEN Status=2 THEN 6   -- Down

--  ELSE NULL

--END

--) AS aStatusID -- convert application Status raw values to ordered integers, the bigger value the worst status

--FROM Orion.APM.Application

--GROUP BY NodeID

--) AS i ON i.NodeID = a.NodeID

LINES 225 & 226

--  OR

--  (a.Status=1 AND i.aStatusID > 3)  -- Node is up, but one or more applications is in NOT UP Status

AND LINE 237

--       t1.aStatus LIKE (CASE WHEN '${SEARCH_STRING}' LIKE 'A %' THEN SubString('${SEARCH_STRING}%',3,25) END) OR

Sorry wluther‌ it did not work.  I am going back to the beginning to see if I can get it to work.  It would be nice to see what is all available then piece out from there. 

wluther is correct, you need change thresholds to meet your requirement.

The original query is for NPM & SAM, but if you only need NPM (network nodes) part, I did create another two queries for network devices only.

The files are attached in my main post.

I have this customization working for me and have been very helpful thanks.   The one bit that I am having an issue with is the ssh connection portion.  I am using putty for my ssh client.  Any workarounds for this? 

Hello!

Could someone tell me, as I am very new to coding SQL..

How can I sort through the nodes, and have the program exclude a certain set of nodes?

For example, I want to ignore any nodes, that start "F#### " or "C####" ?

I'm sure its something simple, but Im simply just not seeing it.

Thank you.

In WHERE Clause Add: t1.NodeName NOT LIKE 'F%' AND t1.NodeName NOT LIKE 'C%'

Fooey!

Didnt work. But I'll keep trying. 

----------------------------------------------

WOOHOOO!!! 

AHA!!!! I figured it out!!!!

I had to change the line to read.

  a.Caption NOT LIKE 'F%' AND a.Caption NOT LIKE 'C%'

And use the AND instead of OR..

But it works now!!!

Thank you so much!!! 

Hi, CourtesyIT, If you only look for nodes within a certain custom property, please try this:

In the SWQL file "thwack-swql-alerts-v2.txt",

After line 171 add:

========================================================================

FROM Orion.Nodes AS a

LEFT JOIN Orion.NodesCustomProperties AS b ON a.NodeID = b.NodeID --- New line

LEFT JOIN  -- find out the highest volume usage on a node

========================================================================

In WHERE clause (from line 212) add/modify:

=========================================================================

WHERE  -- list all nodes which are:

(

  (b.your-custom-property-name='your-custom-property-value' )   --- new line

  AND (  --- new line

  a.Status=2  -- Node is Down

  OR

  (a.Status=1 AND a.AvgResponseTime > 500)  -- Node is up but Response Time is slow

  OR

  (a.Status=1 AND a.CPULoad > 95) -- Node is up but CPU load is high

  OR

  (a.Status=1 AND a.PercentMemoryUsed > 95) -- Node is up but memory load is high

  OR

  (a.Status=1 AND g.HWStatusID > 3) -- Node is up one or more HW components is in Warning/Critical Status

  OR

  (a.Status=1 AND h.vPercentage > 95.0 AND h.vPercentage < 100.0) -- Node is Up, but One or more volumes usage is high

  OR

  (a.Status=1 AND i.aStatusID > 3)  -- Node is up, but one or more applications is in NOT UP Status

  --- new line

)

===========================================================================

Thanks,

Hi

i would like to know how i can filter out the acknowledged alerts.

i tried

,o.AlertActive.Acknowledged = 'true' then i see all alerts which are acknowledged, but when i try "false" or <> 'true' i see nothing

Please try this:

In another document: https://thwack.solarwinds.com/docs/DOC-176727 .

Modify file: thwack-swql-alerts-withSearch-v3-Ack-NPM11.txt

at the bottom of the file in WHERE clause, Add the codes below at the end of WHERE clause:

       OR t1.Ack = (CASE WHEN '${SEARCH_STRING}' LIKE 'K %' THEN SubString('${SEARCH_STRING}%',3,1) END)

In dashboard search box, type "k y" display acknowledged alerts, or "k n" for un-acknowledged alerts.

Hope this help.

Hi,

I tried to use the filter for another resource, and tried to apply the filter, but this doesn't work

SELECT

  o.AlertConfigurations.Name AS [ALERT NAME]

  ,'/Orion/NetPerfMon/ActiveAlertDetails.aspx?NetObject=AAT:' + ToString(o.AlertObjectID) AS [_LinkFor_ALERT NAME]

  ,o.EntityCaption AS [ALERT OBJECT]

  ,o.EntityDetailsURL AS [_LinkFor_ALERT OBJECT]

  ,o.RelatedNodeDetailsURL AS [_LinkFor_RELATED NODE]

  ,ToLocal(o.AlertActive.TriggeredDateTime) AS [ALERT TRIGGER TIME]

FROM Orion.AlertObjects o

WHERE o.AlertActive.Acknowledged <> true

ORDER by o.AlertActive.TriggeredDateTime DESC

This is an awesome dashboard.

N00b question - how do I change the name of the columns to something anyone can understand?

Sorry, not great with queries,....


Also, any chance someone knows how to zoom in on the table, displaying this on a 50" screen is unreadable as part of NOC.

In the Query from Line "SELECT t1.Conn, t1.[_IconFor_Conn], t1.NodeName, t1.[_LinkFor_NodeName]"

the Bold Text below are column names , you can change to desired ones.

SELECT t1.Conn, t1.[_IconFor_Conn], t1.NodeName, t1.[_LinkFor_NodeName],

t1.RTime, t1.[_IconFor_RTime], t1.Time AS M_Secs,

t1.CPU, t1.[_IconFor_CPU], t1.cLoad AS C_Load,

t1.RAM, t1.[_IconFor_RAM], t1.rLoad AS R_Load,

t1.Vol, t1.[_IconFor_Vol], t1.vPercent AS V_Percent,

t1.HW, t1.[_IconFor_HW], t1.hStatus AS HW_Status,

t1.APP, t1.[_IconFor_APP], t1.aStatus AS APP_Status,

ROUND((t1.wTime*0.75 + t1.wCPU*1.0 + t1.wRAM*1.0 + t1.wVol*0.75 + t1.wHW*0.5 + t1.wApp*0.5)/2.05 + t1.wConn*1.00, 2)AS Priority

​You can use browser zoon function to fit the view in the screen, and/or change screen resolution,

Thank you.

The zoom only works if all the NOC screens need the zoom.  In my case, only this page of the NOC has a small resolution.

Absolutely outstanding piece of Solarwinds Customization! Thanks for this!

I only have a small suggestion as I ran into this issue at one of my clients.

the line :

FROM Volumes

WHERE VolumeType = 'Fixed Disk'

could be replaced with:

FROM Volumes

WHERE VolumeTypeID = '4'

In the environment in question the "Fixed Disk" is labelled in German "Eingebauter Datenträger" so the Volumes did not show up.

With the ID you should be "language independent"

Cheers,

Holger

Best Dashboard for Solarwinds i have seen so far!

We use it since a week and i have following questions:

- is it possible to filter out a specific application? I have some applications i dont want to see in the dashboard.

- is it possible to change the priority for a specific application? I have some applications which a more important than others.

- is is possible to filter out Nodes where "i know and it is okay" that they have high RAM oder CPU usage?

Thanks in advance!

regards,

Alois

Hi, you could create a customer property for application to  filter out and/or set priority for an application. Same way to create a customer property for node to filter out some nodes.

Hi, thanks for reply!

filter out Nodes with high ram and CPU works with custom property! Thanks for that!

But wehen i try to filter out a application with custom property it does not work - i do it like this:

  OR

  (a.Status=1 AND i.aStatusID > 3 AND i.ApplicationTemplateID = '320')  -- Node is up, but one or more applications is in NOT UP Status

  )

when i try this in SWQL Studio it says :

Subquery i does not contain requested column ApplicationTemplateID

But if i understand the query in the right way then i shoud be "Orion.APM.Application" or am i wrong?

i also tried it like this:

FROM Orion.APM.Application as y

LEFT JOIN Orion.APM.ApplicationCustomProperties AS z ON y.ApplicationID = z.ApplicationID

GROUP BY NodeID

) AS i ON i.NodeID = a.NodeID

WHERE  -- list all nodes which are:

(

  a.Vendor IN ('windows','net-snmp','Linux','IBM','Citrix Systems','IBM eServer X','VMware Inc.')  --- for Winodws and Linux servers only

  AND (

  a.Status=2  -- Node is Down

  OR

  (a.Status=1 AND a.AvgResponseTime > 500)  -- Node is up but Response Time is slow

  OR

  (a.Status=1 AND a.CPULoad > 95) -- Node is up but CPU load is high

  OR

  (a.Status=1 AND a.PercentMemoryUsed > 95 AND b.hideraminnoc = 'FALSE' )  -- Node is up but memory load is high and CP HideRAMinNOC are not listed

  OR

  (a.Status=1 AND g.HWStatusID > 3) -- Node is up one or more HW components is in Warning/Critical Status

  OR

  (a.Status=1 AND h.vPercentage > 95.0 AND h.vPercentage < 100.0) -- Node is Up, but One or more volumes usage is high

  OR

  (a.Status=1 AND i.aStatusID > 3 and z.HideAPPinNOC = 'TRUE')  -- Node is up, but one or more applications is in NOT UP Status

  )

)

Table i only have 2 columns NodeID and aStatusID, ApplicationTemplateID is not included.

As more than one application monitor can assign to the same node, the table i is using MAX() to pick up worst status from all assigned application monitors. If you want filter a few applications by their ApplicationTemplateID, you need add the codes in table i:

============================

LEFT JOIN  -- find out the worst application status on a node if apply

(

SELECT NodeID,

MAX(CASE

  WHEN Status=1 THEN 0  -- UP

  WHEN Status=17 THEN 1  -- Unmanaged

  WHEN Status=0 THEN 2  -- Unknown

  WHEN Status=12 THEN 3  -- Unreachable

  WHEN Status=3 THEN 4  -- Warning

  WHEN Status=14 THEN 5  -- Critical

  WHEN Status=2 THEN 6  -- Down

  ELSE NULL

END

) AS aStatusID -- convert application Status raw values to ordered integers, the bigger value the worst status

FROM Orion.APM.Application

WHERE ApplicationTemplateID<> '320'

GROUP BY NodeID

) AS i ON i.NodeID = a.NodeID

===============================================

This should works if you only have a few applications need to be filtered, but you have many applications to be filtered and also application priority need to be set, a customer property for application will be a better choice. For example:

Create a customer property for application is called AppPri (Application Priority, values [0,1,2,3,4,5], 0 means not display on dashboard),

change the i table to:

================================

LEFT JOIN  -- find out the worst application status on a node if apply

(

SELECT i0.NodeID,

MAX(CASE

  WHEN i0.Status=1 THEN 0  -- UP

  WHEN i0.Status=17 THEN 1  -- Unmanaged

  WHEN i0.Status=0 THEN 2  -- Unknown

  WHEN i0.Status=12 THEN 3  -- Unreachable

  WHEN i0.Status=3 THEN 4  -- Warning

  WHEN i0.Status=14 THEN 5  -- Critical

  WHEN i0.Status=2 THEN 6  -- Down

  ELSE NULL

END

) AS aStatusID, -- convert application Status raw values to ordered integers, the bigger value the worst status

SUM(i1.AppPri) AS aPri

FROM Orion.APM.Application AS i0

LEFT JOIN Orion.APM.ApplicationCustomProperties AS i1

ON i0.ApplicationID = i1.ApplicationId

WHERE i1.AppPri is <> 0

GROUP BY i0.NodeID

) AS i ON i.NodeID = a.NodeID

=======================================

you can add AppPri where calculating total priority

Wow this works perfectly! (with AppPrio)

can you guide me to add the calculation of AppPrio?

Thanks in advance!

Calculating total priority is flexible to meet your requirements.  The default one is base on all column values and their weight:

ROUND((t1.wTime*0.75 + t1.wCPU*1.0 + t1.wRAM*1.0 + t1.wVol*0.75 + t1.wHW*0.5 + t1.wApp*0.5)/2.05 + t1.wConn*1.00, 2) AS Priority

With AppPrio , you need re-calculate wApp (from line 161):

===================================

CASE

  WHEN a.Status = 2  THEN 0

  WHEN i.aStatusID IS NULL THEN  0

  WHEN i.aStatusID=2 THEN 1*aPri -- Other

  WHEN i.aStatusID=4 THEN 10*aPri -- Warning

  WHEN i.aStatusID=5 THEN 80*aPri -- Critical

  WHEN i.aStatusID=6 THEN 600*aPri -- Down

  ELSE 0 -- Up

END AS wApp -- application Status score

===================================

many thanks for your help!

it is all working now, the only thing i had to change was the sum operator - it only works with MAX instead of SUM

===================

MAX(i1.AppPri) AS aPri

===================

When i user SUM instead of MAX the following Message appears:

"Operand data type nvarchar is invalid for sum operator"

I think I have modified the Network and interface queries to excluded items that are have had their alerts muted. I just borrowed part of the muted alert report and modified the last part of the query. I have tried to highlight my change. If there is a better way of doing this I would be grateful.

Interface:

FROM Orion.NPM.Interfaces AS  i

LEFT JOIN Orion.AlertSuppression AS AlertSup on AlertSup.EntityUri = i.Uri

LEFT JOIN  Orion.Nodes AS n ON n.NodeID = i.NodeID

LEFT JOIN Orion.NodesCustomProperties AS b ON n.NodeID = b.NodeID

Where  n.Vendor = 'cisco' AND i.Type = 6 AND AlertSup.Uri IS NULL AND (

  i.OutPercentUtil > 75 OR

  i.InPercentUtil > 75 OR

  i.InErrorsThisHour > 1000 OR

  i.OutErrorsThisHour > 1000 OR

  i.CRCAlignErrorsThisHour > 0 OR

  i.LateCollisionsThisHour > 0

)

----

) AS t1

Order BY t1.Weight DESC

Network Node

FROM Orion.Nodes AS a

LEFT JOIN Orion.AlertSuppression AS AlertSup on AlertSup.EntityUri = a.Uri
JOIN Orion.NodesCustomProperties AS b ON a.NodeID = b.NodeID

LEFT JOIN

(SELECT c.NodeID AS NodeID,

MAX(CASE

  WHEN c.Status=1 THEN 0  -- UP

  WHEN c.Status=9 THEN 1  -- Unmanaged

  WHEN c.Status=0 THEN 2  -- Unknown

  WHEN c.Status=12 THEN 3  -- Unreachable

  WHEN c.Status=3 THEN 4  -- Warning

  WHEN c.Status=14 THEN 5  -- Critical

  WHEN c.Status=2 THEN 6   -- Down

  ELSE 1

  END

) AS HWStatusID

FROM Orion.HardwareHealth.HardwareCategoryStatus AS c

JOIN Orion.HardwareHealth.HardwareItem AS e ON e.HardwareCategoryStatusID = c.ID

Group BY c.NodeID

) AS g ON g.NodeID = a.NodeID

WHERE AlertSup.Uri IS NULL AND

(

  a.Vendor ='Cisco'

  AND (  

  a.Status=2

  OR

  (a.Status=1 AND a.AvgResponseTime > 500)

  OR

  (a.Status=1 AND a.CPULoad > 95)

  OR

  (a.Status=1 AND a.PercentMemoryUsed > 95)

  OR

  (a.Status=1 AND g.HWStatusID > 3)

  OR

  (a.Status=1 AND a.PercentLoss > 10)

  )

)

) AS t1

Hello all,

I am a total noob when it comes this this so any help at all would be greatly appropriated.  I am looking for a way to filter out all but servers from the query. 

SELECT
Nodeid, caption, vendor, category
FROM Orion.Nodes
where category = 2 -- servers

and

SELECT
CategoryID, Description, DisplayName, InstanceSiteId, InstanceType, Uri

FROM Orion.NodeCategories

Or

FROM Orion.Nodes

WITH Caty AS (SELECT * FROM Category)

SELECT * FROM Caty WHERE Category = '2'

I really have no idea how this might be incorporated into the SQL query.

I other issue was resolved above @jkrenzien​ thanks you so much. (see below)

FROM Orion.Nodes AS a

LEFT JOIN Orion.AlertSuppression AS AlertSup on AlertSup.EntityUri = a.Uri

LEFT JOIN  -- find out the highest volume usage on a node

.

.

.

WHERE AlertSup.Uri IS NULL AND -- list all nodes which are:

If I wanted to have it list groups ( a specific five groups) what kind of work would this take to do?

Well, trilobite rex​, I was going to say to jump through a few tables, Orion.Nodes + Orion.WorldMap.Point + Orion.Groups (Nodes.WorldMapPoint.Group.ContainerID), so you didn't have to do any joins... But, that seems to not actually work. I'm not sure if it's not working because it's just flat out wrong, or if it's not working but should be working... Otherwise, you could add another join up in there, though it's already a busy looking query.

Actually, I'm not sure that would even work either, unless everything had a set of mapping points. I guess you'll just have to join them the good old fashion way.

Version history
Revision #:
1 of 1
Last update:
‎02-26-2015 11:01 AM
Updated by: