
Console not responsive and keeps crashing

I am experiencing horrible performance in the console.

The console session also crashes my browser session.

This is interesting because the appliance has been running for 8 weeks during a demo with no issues.  The day we applied our license everything got borked.

I started to troubleshoot the problem and am coming up empty-handed.

The interesting thing I am noticing is a negative number of alerts waiting in memory.  The console is also almost a day behind on processing events.

cmc::acm# diskusage

Checking Disk Usage (this could take a moment)... ....oo.oo.oo.oo.oo.oo.oo.

Partition Disk Usage:

        LEM:             34% (965M/3.0G)

        OS:              40% (1.1G/3.0G)

        Logs/Data:       43% (95G/234G)

        Temp:             4% (179M/5.9G)

Database Queue(s): 4.0K (No alerts queued, -4241689380 alerts waiting in memory)

Rules Queue: 2.1M (0 alerts queued, 0 alerts waiting in memory)

Console Queue: 2.1M (0 alerts queued, 0 alerts waiting in memory)

DataCenter Queue: 2.1M (0 alerts queued, 0 alerts waiting in memory)

EPIC Rules Queue: 2.1M (0 alerts queued, 0 alerts waiting in memory)

Forensic Database Queue: 2.1M (0 data queued, 0 data items waiting in memory)

Logs: 64G

Tool Profiles Message Queue: 2.1M (0 alerts queued, 0 alerts waiting in memory)

cmc::cmm# viewsysinfo
Collecting general system information......... done.
                  CMC version: 4737
                  The time is: 2014/03/18 17:07:43
            Machine uptime is: 58 min
             Linux version is: 3.2.0-3-amd64
      Machine architecture is: 64-bit
         Physical Memory info:
                              MemTotal:       16474176 kB
                              MemFree:         6024856 kB
               Number of CPUs: 4
                    CPU model: Intel(R) Xeon(R) CPU E5-2420 0 @ 1.90GHz
       Memory Overcommit is default.
      ---------------------------------
    TriGeo manager version is: 5.7.0
      TriGeo manager build is: release
      TriGeo upgrade build is: 520398
          Product Support Key: XXXXXXXXXXXXXXXXXXXXX

                 License Type: Commercial
Total Number of Node Licenses: 50
           Available Licenses: 40

   Manager heap configuration: Initial heap size is 5300M and maximum heap size is 5300M
    Max # of alerts in memory: 600000

              Flow configured: false

    Virtualization Platform: VMware
    ------------------------------------------
    Clock
      Synchronization : Enabled
      Hypervisor Time : 18 Mar 2014 16:59:29
      Guest Time      : Tue Mar 18 17:07:44 2014

    CPU
      Speed           : 1900 MHz
      Reservation     : 2000 MHz
      Limit           : Unlimited

    Memory
      Reservation     : 16384 MB
      Limit           : Unlimited
      Swapped         : 0 MB
      Ballooned       : 0 MB

cmc::acm# top
Press <enter> to view manager CPU/memory statistics with "top" (use q to quit)
top - 17:09:49 up  1:00,  1 user,  load average: 0.98, 0.85, 1.69
Tasks:  74 total,   1 running,  72 sleeping,   0 stopped,   1 zombie
Cpu(s): 58.5%us,  2.2%sy,  0.0%ni, 39.2%id,  0.0%wa,  0.0%hi,  0.1%si,  0.0%st
Mem:  16474176k total, 10224316k used,  6249860k free,    36092k buffers
Swap:   995996k total,        0k used,   995996k free,  4520524k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                               
1328 trigeo    20   0 13.5g 5.3g  43m S  241 33.5 116:47.35 java                                                                  
  915 root      20   0 49812 2584 1868 S    2  0.0   2:02.18 syslog-ng   
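
The negative "alerts waiting in memory" figure in the diskusage output above is easy to miss by eye. As a rough sketch (not a SolarWinds tool), the Python below scans pasted diskusage text and reports any queue whose in-memory count is negative. The function name `negative_queues` and the regular expression are assumptions made for illustration, based only on the line format shown in this post.

```python
import re

# Sample lines copied from the diskusage output in this post.
DISKUSAGE = """\
Database Queue(s): 4.0K (No alerts queued, -4241689380 alerts waiting in memory)
Rules Queue: 2.1M (0 alerts queued, 0 alerts waiting in memory)
Forensic Database Queue: 2.1M (0 data queued, 0 data items waiting in memory)
"""

# Assumed line shape: "<name>: <size> (<...>, <count> alerts|data items waiting in memory)"
LINE_RE = re.compile(
    r"^(?P<name>[^:]+):\s+\S+\s+\(.*?(?P<waiting>-?\d+) "
    r"(?:alerts|data items) waiting in memory\)$"
)

def negative_queues(text):
    """Return {queue name: count} for queues reporting a negative in-memory count."""
    bad = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match and int(match.group("waiting")) < 0:
            bad[match.group("name")] = int(match.group("waiting"))
    return bad

print(negative_queues(DISKUSAGE))  # {'Database Queue(s)': -4241689380}
```

This only flags the symptom; it says nothing about why the counter went negative.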

  •    Clock

          Synchronization : Enabled

          Hypervisor Time : 18 Mar 2014 16:59:29

          Guest Time      : Tue Mar 18 17:07:44 2014


    You have an almost 10-minute discrepancy between the host and guest clocks, and this could cause problems.  Can you go to APPLIANCE in the CMC shell and run DATECONFIG?  Press enter 4 times to see the current time, then re-run it to correct the time.
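
To put a number on that discrepancy, here is a small sketch that parses the two timestamps from the Clock section of viewsysinfo and subtracts them. The function name `clock_skew` is made up for illustration, the two format strings are inferred from the output quoted above, and the month and weekday parsing assumes an English (C) locale.

```python
from datetime import datetime

def clock_skew(hypervisor_time, guest_time):
    """Return guest time minus hypervisor time as a timedelta."""
    # Formats inferred from the viewsysinfo Clock section, e.g.
    #   Hypervisor Time : 18 Mar 2014 16:59:29
    #   Guest Time      : Tue Mar 18 17:07:44 2014
    host = datetime.strptime(hypervisor_time, "%d %b %Y %H:%M:%S")
    guest = datetime.strptime(guest_time, "%a %b %d %H:%M:%S %Y")
    return guest - host

skew = clock_skew("18 Mar 2014 16:59:29", "Tue Mar 18 17:07:44 2014")
print(skew)                  # 0:08:15
print(skew.total_seconds())  # 495.0
```

For the values quoted here, the guest is 8 minutes 15 seconds ahead of the hypervisor.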

  • Thanks for pointing this out.  I didn't notice it before.

    Before I ran the dateconfig command as you suggested, I ran viewsysinfo again to see what the disparity was between the hypervisor and the guest.  It was a lot closer than yesterday.

    cmc::cmm# viewsysinfo
    Collecting general system information......... done.
                      CMC version: 4737
                      The time is: 2014/03/19 16:26:48
                Machine uptime is: 1 day, 17 min
                 Linux version is: 3.2.0-3-amd64
          Machine architecture is: 64-bit
             Physical Memory info:
                                  MemTotal:       16474176 kB
                                  MemFree:         2650488 kB
                   Number of CPUs: 4
                        CPU model: Intel(R) Xeon(R) CPU E5-2420 0 @ 1.90GHz
           Memory Overcommit is default.
          ---------------------------------
        TriGeo manager version is: 5.7.0
          TriGeo manager build is: release
          TriGeo upgrade build is: 520398
              Product Support Key: XXXXXXXXXXXXXXXXXXXXX

                     License Type: Commercial
    Total Number of Node Licenses: 50
               Available Licenses: 40

       Manager heap configuration: Initial heap size is 5300M and maximum heap size is 5300M
        Max # of alerts in memory: 600000

                  Flow configured: false

        Virtualization Platform: VMware
        ------------------------------------------
        Clock
          Synchronization : Enabled
          Hypervisor Time : 19 Mar 2014 16:26:27
          Guest Time      : Wed Mar 19 16:26:49 2014

        CPU
          Speed           : 1900 MHz
          Reservation     : 2000 MHz
          Limit           : Unlimited

        Memory
          Reservation     : 16384 MB
          Limit           : Unlimited
          Swapped         : 0 MB
          Ballooned       : 0 MB

    After running the command from the appliance prompt, I received the following output.

    cmc::acm# dateconfig
    Press <enter> to update your manager's current date and time

    Enter the current date in month/day/year format. (MM/DD/YYYY)>
    Enter the current time in hour:minute format. (hh:mm)>
    setting date to ...
    Wed Mar 19 16:28:51 CDT 2014

    The problem is that this is almost 10 minutes behind the time my NTP server is showing.  I am going to check the NTP settings in my VMware configuration to make sure they are working correctly.

  • After a little more investigating, we found that the VM setting for synchronizing guest time with the host was checked.  This is not a setting that we modify on VM guests, so I have to assume it was set in the original VM image we loaded from SolarWinds.  I am attaching an image with the setting location.  This needs to be unchecked, and then I had to run ntpconfig again from the CLI.  Now my time is at least synchronized.  I will continue to monitor performance to see if this resolved the issue.  Thanks again for pointing me in the correct direction.

    [Attached image: lem_time_sync.jpg]
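
For anyone who prefers to check this setting without the vSphere client: as far as I know, the "synchronize guest time with host" checkbox is stored in the VM's .vmx file as `tools.syncTime`. The sketch below is only an illustration under that assumption. The function name `host_time_sync_enabled` and the sample .vmx text are hypothetical, and it reads a copy of the file's text rather than talking to vCenter.

```python
def host_time_sync_enabled(vmx_text):
    """Return True if tools.syncTime is set to a true value in .vmx text."""
    for line in vmx_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip().lower() == "tools.synctime":
            # .vmx values are quoted strings such as "TRUE", "FALSE", "1" or "0".
            return value.strip().strip('"').upper() in ("TRUE", "1")
    return False  # key absent: treated as disabled in this sketch (an assumption)

# Hypothetical .vmx fragment for illustration only.
SAMPLE_VMX = '''\
displayName = "lem-appliance"
tools.syncTime = "TRUE"
'''

print(host_time_sync_enabled(SAMPLE_VMX))  # True
```

If this reports True, the guest is taking its time from the host, which matches the behaviour described in this thread.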

  • FormerMember in reply to harrijs

    Just wanted to confirm - the default setting on the virtual appliance is indeed to sync guest time with host (this seems to work GENERALLY well without having to do any additional work on the customer's part, since the hypervisor tends to be NTP synced itself). When you have that setting configured, LEM will ignore the NTP configuration, but as soon as you turn it off you can configure an actual NTP host.

    Hopefully this helps with your unresponsiveness issue. It sounds awfully coincidental so far, but I'll take it ;)