6 Replies Latest reply on Feb 13, 2018 4:58 PM by bradical987

    Interface Discards/Errors Percent Alert

    bradical987

      This query was originally adapted from jbiggley post in Percentage Interface Errors and Discards @ polling interval

       

      For the purposes of this post and to keep it as simple as possible, I'm trying to create an alert that will identify interfaces that exceed a threshold defined as a percentage of receive discards (.1% for this example). This query works just fine in SQL studio and the returned results are all the interfaces that meet or exceed the .1% receive discards threshold.

       

      SELECT interfaces.fullname,
             interfaces.interfaceid
      FROM   interfaces WITH (nolock)
             JOIN (SELECT id.[interfaceid],
                          CASE
                            WHEN id.[in_discards]
                                 + Isnull(it.[in_totalpkts], 0) = 0 THEN 0
                            ELSE Round(( ( [in_discards] ) / (
                                         id.[in_discards]
                                         + Isnull(it.[in_totalpkts], 0)
                                                             ) *
                                         100 ), 2
                                 )
                          END receivepercentdiscards,
                          CASE
                            WHEN id.[out_discards]
                                 + Isnull(it.[out_totalpkts], 0) = 0 THEN 0
                            ELSE Round(( ( [out_discards] ) / (
                                         id.[out_discards]
                                         + Isnull(it.[out_totalpkts],
                                         0) )
                                         * 100 ),
                                 2)
                          END transmitpercentdiscards
                   FROM   [DBO].[interfaceerrors_detail] id WITH (nolock)
                          LEFT JOIN [DBO].[interfacetraffic_detail] it WITH (nolock)
                                 ON id.interfaceid = it.interfaceid
                                    AND id.[datetime] = it.[datetime]
                   WHERE  in_discards > 0
                          -- Match events from the previous 15 minutes (alert evaluation interval)
                          AND id.datetime >= Dateadd(ss, -900, Getdate())) AS t1
               ON t1.interfaceid = interfaces.interfaceid
             JOIN nodesdata nd WITH (nolock)
               ON interfaces.nodeid = nd.nodeid
             JOIN nodescustomproperties ncp WITH (nolock)
               ON nd.nodeid = ncp.nodeid
      WHERE  ( [T1].[receivepercentdiscards] >= 1
                OR [T1].[transmitpercentdiscards] >= 1 )
         AND ( ncp.site_id NOT LIKE 'UTR%' ); -- filter out devices assigned to any UTR% `site_id`
      

      My question pertains to how I take this information and use it in a trigger action, as this is my first SQL alert. With all my other alerts, that identify issues through the drop down menu conditions (for lack of a better way to phrase it), I am able to use the variables such as ${N=SwisEntity;M=Node.DisplayName} or ${N=SwisEntity;M=InterfaceName}. My first question is if it is possible to pass the ReceivePercentDiscards value into my trigger action. If it is, how would I go about such a thing?

       

      I have tried passing the below variables, but instead of passing the actual ReceivePercentDiscards value, it just shows up as "{SQL: [T1].[ReceivePercentDiscards]}":

      C:\windows\system32\windowspowershell\v1.0\powershell.exe -ExecutionPolicy unrestricted -command "C:\sw_tools\scripts\slack\Slack_Alert_Trigger_RX-Discard.ps1 '${N=SwisEntity;M=Node.DisplayName}' '${N=SwisEntity;M=InterfaceName}' '${N=SwisEntity;M=Alias}' '{SQL: [T1].[ReceivePercentDiscards]}' '${N=Alerting;M=AlertDetailsUrl}' '${N=SwisEntity;M=Node.DetailsUrl}' '${N=Alerting;M=AcknowledgeUrl}' '${N=Generic;M=DateTime;F=OriginalValue}'"
        • Re: Interface Discards/Errors Percent Alert
          mesverrum

          None of the information from the trigger conditions is passed to the alert action,  you would just need to build a complete sql query from scratch to get the information you want.

          In this case it should probably be something like (completely winging that as I'm not vpn'ed into the lab right now):

           

          SELECT CASE

          WHEN id.[in_discards] + ISNULL(it.[in_totalpkts], 0) = 0 THEN

          ELSE ROUND((([in_discards]) / (id.[in_discards] + ISNULL(it.[in_totalpkts], 0))) * 100, 2) 

          END ReceivePercentDiscards 

          FROM [dbo].[interfaceerrors_detail] id WITH (nolock) 

          LEFTJOIN [dbo].[interfacetraffic_detail] it WITH (nolock) ON id.interfaceid = it.interfaceid 

          AND id.[datetime] = it.[datetime] 

          WHERE id.datetime >= Dateadd(ss, -900, Getdate())

          and id.interfaceid=${interfaceid} -- this might be different use the insert variable button to find the correct syntax for interfaceid

           

           

           

          Basically you need to just modify the query so it returns that stat for just the interfaceid that triggered the alert.

           

          Custom SQL/SWQL queries only allow you to return a single cell of data, so make sure you don't try to do anything that would give you a table of results

          1 of 1 people found this helpful
            • Re: Interface Discards/Errors Percent Alert
              bradical987

              I know I am in way over my head with this, due to my very basic knowledge of SQL and being my first attempt at SQL alerts. Are you saying that would be the trigger condition or part of the alert action?

                • Re: Interface Discards/Errors Percent Alert
                  mesverrum

                  That would be in part of the action.  Like I said, nothing you do in custom sql from the trigger will carry over to the action, so you have to recreate that statistic on the action side as well.

                  You would want to modify the query so that the answer it gives is just that one single stat for a specific interface instead of it generating a list of all the interfaces that exceeded the limit.

                   

                   

                   

                  Honestly though, if SQL isn't your wheel house you could also just take the easy route and have your alert message say "interface such and such has exceeded the limit of packet errors" and be done with it.

                    • Re: Interface Discards/Errors Percent Alert
                      bradical987

                      I'm definitely not satisfied with having a static "threshold exceeded" message displayed in the alert, but I have to resign to that option. I know this won't be the last time I want an alert that requires SQL, so I want to figure out how to make it work.

                       

                      I thought I had figured it out by adapting your sample query (replacing ${interfaceid}) and inserting it into the alert message box in Trigger actions, wrapped in ${SQL: <query>} but I get errors on just about every line.

                       

                      MACRO SQL ERROR - Incorrect syntax near the keyword 'END'.

                      Incorrect syntax near 'receivepercentdiscards'.

                      Incorrect syntax near the keyword 'with'. If this statement is a common table expression, an xmlnamespaces clause or a change tracking context clause, the previous statement must be terminated with a semicolon.

                      Incorrect syntax near the keyword 'with'. If this statement is a common table expression, an xmlnamespaces clause or a change tracking context clause, the previous statement must be terminated with a semicolon.

                        • Re: Interface Discards/Errors Percent Alert
                          mesverrum

                          Like I mentioned I was just winging that one based on some educated guesses, probably some syntax hangups.  Would want to load it into the database manager or into SSMS to validate the syntax.  Pick a single interface and hard code this into the ${interfaceid} placeholder while you are testing to make sure you can get back a result that seems believable.  If I get some free time after I finish with my client today I can take a stab at it in my lab and figure out what it doesn't like.

                          1 of 1 people found this helpful
                    • Re: Interface Discards/Errors Percent Alert
                      bradical987

                      Alright, I finally figured it out.

                       

                      The query I am using to detect interface discards as a percentage of traffic is:

                      SELECT interfaces.fullname,
                             interfaces.interfaceid
                      FROM   interfaces WITH (nolock)
                             JOIN (SELECT id.[interfaceid],
                                          CASE
                                            WHEN id.[in_discards]
                                                 + Isnull(it.[in_totalpkts], 0) = 0 THEN 0
                                            ELSE Round(( ( [in_discards] ) / (
                                                         id.[in_discards]
                                                         + Isnull(it.[in_totalpkts], 0)
                                                                             ) *
                                                         100 ), 2
                                                 )
                                          END receivepercentdiscards,
                                          CASE
                                            WHEN id.[out_discards]
                                                 + Isnull(it.[out_totalpkts], 0) = 0 THEN 0
                                            ELSE Round(( ( [out_discards] ) / (
                                                         id.[out_discards]
                                                         + Isnull(it.[out_totalpkts],
                                                         0) )
                                                         * 100 ),
                                                 2)
                                          END transmitpercentdiscards
                                   FROM   [DBO].[interfaceerrors_detail] id WITH (nolock)
                                          LEFT JOIN [DBO].[interfacetraffic_detail] it WITH (nolock)
                                                 ON id.interfaceid = it.interfaceid
                                                    AND id.[datetime] = it.[datetime]
                                   WHERE  in_discards > 0
                                          -- Match events from the previous 15 minutes (alert evaluation interval)
                                          AND id.datetime >= Dateadd(ss, -900, Getdate())) AS t1
                               ON t1.interfaceid = interfaces.interfaceid
                             JOIN nodesdata nd WITH (nolock)
                               ON interfaces.nodeid = nd.nodeid
                             JOIN nodescustomproperties ncp WITH (nolock)
                               ON nd.nodeid = ncp.nodeid
                      WHERE  ( [T1].[receivepercentdiscards] >= 1
                                OR [T1].[transmitpercentdiscards] >= 1 )
                         AND ( ncp.site_id NOT LIKE 'UTR%' ); -- Custom field filter

                       

                      Under the trigger action, in the Message displayed when this alert is triggered box, I put:

                       

                      ${SQL: SELECT CASE WHEN id.[in_discards] + Isnull(it.[in_totalpkts], 0) = 0 THEN 0 ELSE Round(( ( [in_discards] ) / ( id.[in_discards] + Isnull(it.[in_totalpkts], 0) ) ) * 100, 2) END receivepercentdiscards, CASE WHEN id.[out_discards] + Isnull(it.[out_totalpkts], 0) = 0 THEN 0 ELSE Round(( ( [out_discards] ) / ( id.[out_discards] + Isnull(it.[out_totalpkts], 0) ) ) * 100, 2) END transmitpercentdiscards FROM [dbo].[interfaceerrors_detail] id WITH (nolock) LEFT JOIN [dbo].[interfacetraffic_detail] it WITH (nolock) ON id.interfaceid = it.interfaceid AND id.[datetime] = it.[datetime] WHERE id.datetime >= Dateadd(ss, -900, Getdate()) AND id.interfaceid = ${N=SwisEntity;M=InterfaceID};}% Packet Loss

                       

                      For the trigger action, I am using a webhook and sending the following JSON:

                      {
                          "attachments":[
                              {
                                  "fallback":"${N=Alerting;M=AlertName}",
                                  "title":"Interface Alert",
                                  "color":"danger",
                                  "fields":[
                                      {
                                          "title":"Device",
                                          "value":"${N=SwisEntity;M=Node.DisplayName}",
                                          "short":false
                                      },
                                      {
                                          "title":"Interface",
                                          "value":"${N=SwisEntity;M=Caption}",
                                          "short":false
                                      },
                                      {
                                          "title":"Status",
                                          "value":"${N=Alerting;M=AlertMessage}",
                                          "short":false
                                      },
                                      {
                                          "title":"Alert Details",
                                          "value":"<${N=Alerting;M=AlertDetailsUrl}|Alert Link>",
                                          "short":true
                                      },
                                      {
                                          "title":"Acknowledge",
                                          "value":"<${N=Alerting;M=AcknowledgeUrl}|Acknowledge Link>",
                                          "short":true
                                      }
                                  ]
                              }
                          ]
                      }

                       

                      Thanks mesverrum for the help!