4 Replies Latest reply on Mar 24, 2014 5:14 PM by nicole pauls

    Console not responsive and keeps crashing

    harrijs

      I am experiencing horrible performance in the console.

       

      The console session will crash my browser session.

       

      This is interesting because the appliance had been running for 8 weeks during a demo with no issues.  The day we applied our license, everything got borked.

       

      I started to troubleshoot the problem and am coming up empty-handed.

       

      The interesting thing I am noticing is a negative number of alerts waiting in memory.  The console is also almost a day behind on processing events.

       

      cmc::acm# diskusage

      Checking Disk Usage (this could take a moment)... ....oo.oo.oo.oo.oo.oo.oo.

      Partition Disk Usage:

              LEM:             34% (965M/3.0G)

              OS:              40% (1.1G/3.0G)

              Logs/Data:       43% (95G/234G)

              Temp:             4% (179M/5.9G)

      Database Queue(s): 4.0K (No alerts queued, -4241689380 alerts waiting in memory)

      Rules Queue: 2.1M (0 alerts queued, 0 alerts waiting in memory)

      Console Queue: 2.1M (0 alerts queued, 0 alerts waiting in memory)

      DataCenter Queue: 2.1M (0 alerts queued, 0 alerts waiting in memory)

      EPIC Rules Queue: 2.1M (0 alerts queued, 0 alerts waiting in memory)

      Forensic Database Queue: 2.1M (0 data queued, 0 data items waiting in memory)

      Logs: 64G

      Tool Profiles Message Queue: 2.1M (0 alerts queued, 0 alerts waiting in memory)
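A negative "waiting in memory" count like the one on the Database Queue line is easy to miss by eye. As a rough sketch only, the Python below scans pasted `diskusage` text and flags any queue whose counters are negative. The line layout is inferred purely from the output above, and the names `QUEUE_LINE`, `parse_queues` and `suspicious` are made up for illustration; none of this is part of the LEM tooling.

```python
import re

# Sample lines copied from the diskusage output in this post.
SAMPLE = """\
Database Queue(s): 4.0K (No alerts queued, -4241689380 alerts waiting in memory)
Rules Queue: 2.1M (0 alerts queued, 0 alerts waiting in memory)
Forensic Database Queue: 2.1M (0 data queued, 0 data items waiting in memory)
Logs: 64G
"""

# Assumed layout: "<name>: <size> (<queued> ... queued, <waiting> ... waiting in memory)"
QUEUE_LINE = re.compile(
    r"^\s*(?P<name>[^:]+):\s+\S+\s+"
    r"\((?P<queued>No|-?\d+)\s.*?queued,\s+"
    r"(?P<waiting>-?\d+)\s.*?waiting in memory\)\s*$"
)


def parse_queues(text):
    """Return {queue name: (queued, waiting)} for every queue line found."""
    queues = {}
    for line in text.splitlines():
        m = QUEUE_LINE.match(line)
        if m:
            # The appliance prints "No" instead of 0 on some lines.
            queued = 0 if m.group("queued") == "No" else int(m.group("queued"))
            queues[m.group("name").strip()] = (queued, int(m.group("waiting")))
    return queues


def suspicious(queues):
    """Names of queues with a negative counter, which should never occur."""
    return [name for name, (q, w) in queues.items() if q < 0 or w < 0]


print(suspicious(parse_queues(SAMPLE)))
```

Lines that do not follow the queue layout (partition usage, `Logs: 64G`) are simply skipped.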

       

       

      cmc::cmm# viewsysinfo
      Collecting general system information......... done.
                        CMC version: 4737
                        The time is: 2014/03/18 17:07:43
                  Machine uptime is: 58 min
                   Linux version is: 3.2.0-3-amd64
            Machine architecture is: 64-bit
               Physical Memory info:
                                    MemTotal:       16474176 kB
                                    MemFree:         6024856 kB
                     Number of CPUs: 4
                          CPU model: Intel(R) Xeon(R) CPU E5-2420 0 @ 1.90GHz
             Memory Overcommit is default.
            ---------------------------------
          TriGeo manager version is: 5.7.0
            TriGeo manager build is: release
            TriGeo upgrade build is: 520398
                Product Support Key: XXXXXXXXXXXXXXXXXXXXX

                       License Type: Commercial
      Total Number of Node Licenses: 50
                 Available Licenses: 40

         Manager heap configuration: Initial heap size is 5300M and maximum heap size is 5300M
          Max # of alerts in memory: 600000

                    Flow configured: false

          Virtualization Platform: VMware
          ------------------------------------------
          Clock
            Synchronization : Enabled
            Hypervisor Time : 18 Mar 2014 16:59:29
            Guest Time      : Tue Mar 18 17:07:44 2014

          CPU
            Speed           : 1900 MHz
            Reservation     : 2000 MHz
            Limit           : Unlimited

          Memory
            Reservation     : 16384 MB
            Limit           : Unlimited
            Swapped         : 0 MB
            Ballooned       : 0 MB

       

       

      cmc::acm# top
      Press <enter> to view manager CPU/memory statistics with "top" (use q to quit)
      top - 17:09:49 up  1:00,  1 user,  load average: 0.98, 0.85, 1.69
      Tasks:  74 total,   1 running,  72 sleeping,   0 stopped,   1 zombie
      Cpu(s): 58.5%us,  2.2%sy,  0.0%ni, 39.2%id,  0.0%wa,  0.0%hi,  0.1%si,  0.0%st
      Mem:  16474176k total, 10224316k used,  6249860k free,    36092k buffers
      Swap:   995996k total,        0k used,   995996k free,  4520524k cached

        PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                               
      1328 trigeo    20   0 13.5g 5.3g  43m S  241 33.5 116:47.35 java                                                                  
        915 root      20   0 49812 2584 1868 S    2  0.0   2:02.18 syslog-ng   

        • Re: Console not responsive and keeps crashing
          curtisi

             Clock

                Synchronization : Enabled

                Hypervisor Time : 18 Mar 2014 16:59:29

                Guest Time      : Tue Mar 18 17:07:44 2014


          You have a discrepancy of a little over 8 minutes between the host and guest clocks, and this could cause problems.  Can you go to APPLIANCE in the CMC shell and run DATECONFIG?  Press enter 4 times to see the current time, then re-run it to correct the time.
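To put a number on the gap, the two Clock timestamps can be subtracted directly. This is a small Python sketch, assuming both values are reported in the same time zone (the `viewsysinfo` output does not say) and that the two date formats are always laid out as shown; `clock_skew_seconds` is a hypothetical helper name.

```python
from datetime import datetime


def clock_skew_seconds(hypervisor_time, guest_time):
    """Seconds by which the guest clock is ahead of the hypervisor clock."""
    # Formats taken from the viewsysinfo Clock section quoted above.
    hv = datetime.strptime(hypervisor_time, "%d %b %Y %H:%M:%S")
    guest = datetime.strptime(guest_time, "%a %b %d %H:%M:%S %Y")
    return (guest - hv).total_seconds()


skew = clock_skew_seconds("18 Mar 2014 16:59:29", "Tue Mar 18 17:07:44 2014")
print(f"guest is {skew:.0f} s ahead of hypervisor")  # guest is 495 s ahead of hypervisor
```

495 seconds is 8 minutes 15 seconds, with the guest running ahead of the host.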

          • Re: Console not responsive and keeps crashing
            harrijs

            Thanks for pointing this out.  I didn't notice it before.

             

            Before I ran the dateconfig command as you suggested, I ran viewsysinfo again to see what the disparity was between the hypervisor and the guest.  It was a lot closer than yesterday: about 22 seconds apart instead of several minutes.

             

            cmc::cmm# viewsysinfo
            Collecting general system information......... done.
                              CMC version: 4737
                              The time is: 2014/03/19 16:26:48
                        Machine uptime is: 1 day, 17 min
                         Linux version is: 3.2.0-3-amd64
                  Machine architecture is: 64-bit
                     Physical Memory info:
                                          MemTotal:       16474176 kB
                                          MemFree:         2650488 kB
                           Number of CPUs: 4
                                CPU model: Intel(R) Xeon(R) CPU E5-2420 0 @ 1.90GHz
                   Memory Overcommit is default.
                  ---------------------------------
                TriGeo manager version is: 5.7.0
                  TriGeo manager build is: release
                  TriGeo upgrade build is: 520398
                      Product Support Key: XXXXXXXXXXXXXXXXXXXXX

                             License Type: Commercial
            Total Number of Node Licenses: 50
                       Available Licenses: 40

               Manager heap configuration: Initial heap size is 5300M and maximum heap size is 5300M
                Max # of alerts in memory: 600000

                          Flow configured: false

                Virtualization Platform: VMware
                ------------------------------------------
                Clock
                  Synchronization : Enabled
                  Hypervisor Time : 19 Mar 2014 16:26:27
                  Guest Time      : Wed Mar 19 16:26:49 2014

                CPU
                  Speed           : 1900 MHz
                  Reservation     : 2000 MHz
                  Limit           : Unlimited

                Memory
                  Reservation     : 16384 MB
                  Limit           : Unlimited
                  Swapped         : 0 MB
                  Ballooned       : 0 MB

             

            After running the command from the appliance prompt, I received the following output.

             

            cmc::acm# dateconfig
            Press <enter> to update your manager's current date and time

            Enter the current date in month/day/year format. (MM/DD/YYYY)>
            Enter the current time in hour:minute format. (hh:mm)>
            setting date to ...
            Wed Mar 19 16:28:51 CDT 2014

             

             

            The problem is that this is almost 10 minutes behind the time my NTP server is showing.  I am going to check my VMware configuration for NTP settings to make sure they are working correctly.
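For an independent check of a clock against an NTP server, a bare-bones SNTP query is enough. The Python below is only a sketch: it would be run from whatever machine has Python available (the thread does not establish that the appliance itself can run it), it ignores network delay, and the function names and the server name in the usage comment are placeholders.

```python
import socket
import struct
import time

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_EPOCH_OFFSET = 2208988800


def build_request():
    """48-byte SNTP client request: LI=0, version=3, mode=3 (client)."""
    return b"\x1b" + 47 * b"\x00"


def server_unix_time(packet):
    """Extract the server transmit timestamp (bytes 40-47) as Unix seconds."""
    seconds, fraction = struct.unpack("!II", packet[40:48])
    return seconds - NTP_EPOCH_OFFSET + fraction / 2**32


def offset_from(server, timeout=5.0):
    """Rough local-minus-server offset in seconds (network delay ignored)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_request(), (server, 123))
        packet, _ = sock.recvfrom(512)
    return time.time() - server_unix_time(packet)


# Usage (placeholder host name, needs UDP port 123 reachable):
#   print(offset_from("ntp.example.com"))
```

A result near zero means the local clock agrees with the server. A negative result of several hundred seconds would indicate a local clock running behind the server by that amount.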

            • Re: Console not responsive and keeps crashing
              harrijs

              After a little more investigating, we found that the VM setting for synchronizing guest time with the host was checked.  This is not a setting that we modify on VM guests, so I have to assume that it was set in the original VM image we loaded from SolarWinds.  I am attaching an image with the setting location.  I unchecked it and then had to run ntpconfig again from the CLI.  Now my time is at least synchronized.  I will continue to monitor the performance to see if this resolved the issue.  Thanks again for pointing me in the correct direction.

               

              lem_time_sync.jpg
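The same checkbox is normally reflected in the VM's .vmx file as the `tools.syncTime` entry, so it can also be checked without opening the settings dialog. The Python below is a hedged sketch that reads .vmx text and reports whether that entry is set to TRUE. Treating keys as case-insensitive and treating a missing entry as "off" are assumptions here, and `vmx_settings` and `host_time_sync_enabled` are illustrative names.

```python
def vmx_settings(text):
    """Parse 'key = "value"' lines from .vmx file content into a dict."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Assumption: keys are compared case-insensitively.
        settings[key.strip().lower()] = value.strip().strip('"')
    return settings


def host_time_sync_enabled(text):
    """True if tools.syncTime is TRUE; a missing entry is assumed to mean off."""
    return vmx_settings(text).get("tools.synctime", "false").lower() == "true"


# Made-up .vmx fragment for illustration only.
example = 'displayName = "LEM"\ntools.syncTime = "TRUE"\n'
print(host_time_sync_enabled(example))  # True
```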

                • Re: Console not responsive and keeps crashing
                  nicole pauls

                  Just wanted to confirm - the default setting on the virtual appliance is indeed to sync guest time with host (this seems to work GENERALLY well without having to do any additional work on the customer's part, since the hypervisor tends to be NTP synced itself). When you have that setting configured, LEM will ignore the NTP configuration, but as soon as you turn it off you can configure an actual NTP host.

                   

                  Hopefully this helps with your unresponsiveness issue. It sounds awfully coincidental so far, but I'll take it.