8 Replies Latest reply on Dec 2, 2016 11:38 AM by cowincarbonite

    Help with CPU Alert

    Campy

      Can someone tell me why this alert has been triggered? I have roughly 6 servers out of about 75 that are triggering this alert but they shouldn't be, maybe some fresh eyes can see something.

      Server is Oracle Linux.

       

      Thanks

      Screen Shot 2016-08-15 at 12.52.08 PM.png

      Screen Shot 2016-08-15 at 12.52.42 PM.png

      Screen Shot 2016-08-15 at 12.53.24 PM.png

        • Re: Help with CPU Alert
          chad.every

          One thing to remember is that by default CPU polling in NPM happens every 9 minutes, so by the time you receive an alert the issue might have subsided. You do have the "condition must exist" option checked, which is what I was going to recommend (and it's what I do too).

           

          Have you watched the CPU directly on that Linux host to see how it behaves in real time? It might just be that the CPU spikes at the same time that NPM polls the node. Those 6 nodes in question probably have an average CPU load.
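
To illustrate the point about polling intervals and the "condition must exist" setting, here is a minimal Python sketch. It is not how NPM evaluates alerts internally; the `Sample` type, the `sustained_breach` function, the 90% threshold, and the sample values are all hypothetical. It only shows why a single spike caught by a 9-minute poll should not fire an alert that requires the condition to hold for 10 minutes, while a host that stays busy across several polls would.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    minute: int      # minutes since the start of observation
    cpu_pct: float   # polled CPU load, in percent


def sustained_breach(samples, threshold_pct, must_exist_minutes):
    """Return True if consecutive polls stay above the threshold
    for at least must_exist_minutes (hypothetical evaluation rule)."""
    breach_start = None
    for s in samples:
        if s.cpu_pct > threshold_pct:
            if breach_start is None:
                breach_start = s.minute  # first poll of this breach
            if s.minute - breach_start >= must_exist_minutes:
                return True
        else:
            breach_start = None  # the breach ended, start over
    return False


# Polls taken every 9 minutes (the default interval mentioned above).
spiky = [Sample(0, 40), Sample(9, 95), Sample(18, 45), Sample(27, 50)]
busy = [Sample(0, 92), Sample(9, 95), Sample(18, 93), Sample(27, 96)]

print(sustained_breach(spiky, 90, 10))  # one spike only
print(sustained_breach(busy, 90, 10))   # above threshold on every poll
```

With these made-up samples, the spiky host never accumulates 10 minutes above the threshold, and the busy host does after its third poll.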

            • Re: Help with CPU Alert
              Campy

              Thanks chad.every - unfortunately I don't have direct access other than the real-time process explorer from SolarWinds. While I watched that, the server never went over 60% CPU.

              Is it possible that it's triggering on the level of one CPU versus the total CPU?

               

              I upped the time threshold from 10 minutes to 20, just to see if some of the alerts would clear. No luck yet; in fact a new server came in on the same alert. It's also at about 50% utilization. Ugh
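
The question above about one CPU versus the total can be made concrete with a small Python sketch. Whether NPM actually alerts on a single core is not confirmed anywhere in this thread; the per-core values, the 90% threshold, and the `summarize_cpu` helper are hypothetical. The sketch only shows how a host can sit at about 50% overall while one core is nearly saturated.

```python
from statistics import mean


def summarize_cpu(per_core_pct):
    """Return the average load across cores and the busiest single core."""
    return {
        "average": mean(per_core_pct),
        "busiest_core": max(per_core_pct),
    }


# Hypothetical four-core host: one hot core, three mostly idle ones.
cores = [98.0, 35.0, 40.0, 27.0]
summary = summarize_cpu(cores)

THRESHOLD = 90.0  # assumed critical threshold, in percent
print(summary)
print("average breaches threshold:", summary["average"] > THRESHOLD)
print("busiest core breaches threshold:", summary["busiest_core"] > THRESHOLD)
```

If an alert were evaluated against the busiest core rather than the average, this host would trip a 90% threshold while reporting 50% total utilization.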

                • Re: Help with CPU Alert
                  chad.every

                  My CPU alert is a little different. Here is what I have.

                  2016-08-16 09_14_45-Edit Alert - _ SMS_Slack alert me when CPU load has an issue (custom)_.png

                   

                  Not having direct access to those servers does make it more difficult. Maybe see if you can work with that Linux team to check for any abnormalities. If everything checks out OK and you're still getting CPU alerts, then that load is probably just normal operation for those 6 servers. I would edit the CPU thresholds on those servers to accommodate that.

                    • Re: Help with CPU Alert
                      Campy

                      Thanks - I restructured my rules to be like yours.

                      Here's what still happens: memory is the issue, but the CPU alert is being triggered. The alert for memory utilization has already triggered.

                      Screen Shot 2016-08-16 at 1.45.13 PM.png

                       

                      Screen Shot 2016-08-16 at 1.53.09 PM.png

                        • Re: Help with CPU Alert
                          chad.every

                          I'd try setting those nodes to use dynamic baseline thresholds and see if that improves.

                           

                          2016-08-17 08_06_08-Edit Properties.png
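
As a rough picture of what a baseline-derived threshold means, here is a minimal Python sketch that derives warning and critical levels from a node's own history. The multipliers (2 and 3 standard deviations above the mean), the `baseline_thresholds` function, and the sample history are assumptions for illustration, not values taken from this thread or from the product.

```python
from statistics import mean, pstdev


def baseline_thresholds(history_pct, warn_k=2.0, crit_k=3.0):
    """Derive thresholds from historical CPU load.

    warn_k and crit_k are assumed multipliers of the standard deviation.
    Results are capped at 100 because the values are percentages.
    """
    avg = mean(history_pct)
    sd = pstdev(history_pct)
    warning = min(100.0, avg + warn_k * sd)
    critical = min(100.0, avg + crit_k * sd)
    return warning, critical


# Hypothetical history for a server that normally runs near 50% CPU.
history = [48, 52, 50, 55, 45, 50, 53, 47]
warning, critical = baseline_thresholds(history)
print(f"warning above {warning:.1f}%, critical above {critical:.1f}%")
```

The point is that a server whose normal load is high gets thresholds relative to its own behavior instead of a single global number.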

                          • Re: Help with CPU Alert
                            cowincarbonite

                            Heya Campy, did you ever figure this out? I've been having a similar issue where I'm getting high memory utilization alerts triggered by an alert that's supposed to be just for CPU (it currently alerts on both CPU and memory using just the "CPU load threshold reached" condition).

                             

                            Trigger Condition:  

                            Alert on all objects where:
                            All child conditions must be satisfied (AND)
                              Node - Status - is equal to - Up
                            The actual trigger condition:
                            All child conditions must be satisfied (AND)
                              At least one child condition must be satisfied (OR)
                                Node - Vendor - is equal to - Windows
                                Node - Vendor - is equal to - net-snmp
                              All child conditions must be satisfied (AND)
                                CPU Load Threshold - Critical Value Reached - is equal to - 1

                             

                            Thanks
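
Read as plain boolean logic, the trigger condition posted above can be sketched in Python as follows. The dictionary keys (`status`, `vendor`, `cpu_critical_reached`, `memory_pct`) and the `matches_trigger` function are hypothetical names chosen for illustration, not Orion property names. The sketch only checks what the condition says on paper; it does not explain why memory alerts appear in practice.

```python
def matches_trigger(node):
    """Evaluate the posted trigger condition against one node (a dict)."""
    # Scope: Node - Status - is equal to - Up
    in_scope = node["status"] == "Up"
    # OR group: vendor is Windows or net-snmp
    vendor_ok = node["vendor"] in ("Windows", "net-snmp")
    # AND group: CPU Load Threshold - Critical Value Reached - is equal to - 1
    cpu_critical = node["cpu_critical_reached"] == 1
    return in_scope and vendor_ok and cpu_critical


# Hypothetical nodes: one with high memory only, one with critical CPU.
high_memory_only = {"status": "Up", "vendor": "net-snmp",
                    "cpu_critical_reached": 0, "memory_pct": 97}
high_cpu = {"status": "Up", "vendor": "Windows",
            "cpu_critical_reached": 1, "memory_pct": 40}

print(matches_trigger(high_memory_only))
print(matches_trigger(high_cpu))
```

As written, the condition never looks at memory, so a node with high memory and a CPU flag of 0 should not match. If such nodes still raise this alert, the cause would have to be something outside the condition shown here.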