1 Reply Latest reply on Apr 24, 2017 8:57 AM by sean.martinez

    MSMQ Folder continues to grow over 1Gig

    MagnAxiom

      This has been an ongoing issue for us, over 3 years now.

       

      Here is the typical scenario.  Some of my users start reporting sluggish results or long waits for SolarWinds results and they give me a call/email.  I pop out to our 4 SolarWinds servers (Primary Poller, 2 x Additional Poller, 1 x Web Server) and check the MSMQ folder size. Usually, one or more of the servers has a substantial MSMQ folder size... 1-2 GB in some cases.  If nothing is done about this, SW will eventually flip out as MSMQ will die.  So I stop the SolarWinds services on the affected box, stop MSMQ, delete the .mq files in the c:\Windows\System32\msmq\storage\ folder, restart the MSMQ service, and restart the SW services. The system then returns to normal operation, albeit with a gap in polling data for some nodes.

       

      This happens often enough that I have batch files created to check the MSMQ folder size and, if needed, another to stop services, remove the files, then start services again.
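For anyone wanting the same kind of check, here is a minimal Python sketch of the folder-size test. It is not the batch file mentioned above. The function names `mq_storage_bytes` and `over_threshold` and the 1 GiB default limit are assumptions made for illustration; the only detail taken from this thread is that the queue data lives in .mq files in the MSMQ storage folder.

```python
import os


def mq_storage_bytes(storage_dir):
    """Return the total size in bytes of the .mq files directly inside storage_dir."""
    total = 0
    with os.scandir(storage_dir) as entries:
        for entry in entries:
            # Only count regular files with an .mq extension (case-insensitive).
            if entry.is_file() and entry.name.lower().endswith(".mq"):
                total += entry.stat().st_size
    return total


def over_threshold(storage_dir, limit_bytes=1024 ** 3):
    """Return True if the .mq files total at least limit_bytes (assumed default: 1 GiB)."""
    return mq_storage_bytes(storage_dir) >= limit_bytes
```

On a poller this might be called as `over_threshold(r"c:\Windows\System32\msmq\storage")`, run from an account that can read that folder. It only reports the size and does not stop services or delete anything.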

       

      I'm in a situation where I don't have any admin access on the SQL Server that hosts the SolarWinds database, and I have a feeling these issues are a result of the SQL server being overloaded and my writes getting blocked many times when SW tries to write the continual stream of data that polling generates.

       

      How do I go about proving to the DB folks that this is the issue, if it is in fact the issue?  I'm more concerned with fixing this than with blaming the SQL DBAs, but when I've been on with SW support in the past, they have always been surprised at how loaded the SQL server is with other databases.

       

      I'm on NPM 12.1, SAM 6.3.0, NCM 7.6

       

      We're on the latest MS SQL, I believe, and are using "Always On".

        • Re: MSMQ Folder continues to grow over 1Gig
          sean.martinez

          Deleting the .mq files means you are deleting polled data that has yet to reach the SQL Server, which then means you cannot report or alert on the data you dumped.

           

          What I sometimes do is stop the Job Engine v2 (the service responsible for polling all data), then wait to see how long it takes for MSMQ to clear out to the SQL Server. MSMQ is great in case the MS SQL Server is unavailable, because the services will hold onto the data until the connection to the SQL Server is re-established.
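One way to put a number on "how long it takes to clear out" is to note the storage folder size a few times after stopping the Job Engine and estimate the drain rate from the first and last readings. The sketch below is only an illustration under that assumption. `drain_estimate` is a hypothetical helper, not a SolarWinds or MSMQ tool, and the on-disk folder size is only a rough proxy for queue depth.

```python
def drain_estimate(samples):
    """Estimate how fast the queue is draining.

    samples: list of (seconds, bytes) readings, oldest first.
    Returns (bytes_per_second, seconds_to_empty). seconds_to_empty is None
    when the folder is not shrinking (rate is zero or negative).
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t0, b0), (t1, b1) = samples[0], samples[-1]
    elapsed = t1 - t0
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    rate = (b0 - b1) / elapsed  # positive when the folder is shrinking
    if rate <= 0:
        return rate, None
    return rate, b1 / rate
```

For example, readings of 1000 bytes at t=0 and 500 bytes at t=10 give a rate of 50 bytes per second and about 10 more seconds to empty. A rate that stays near zero while polling is stopped would suggest the writes to SQL are the bottleneck, which is the kind of evidence the Database team could be shown.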

           

          You have SAM already, so I would recommend using AppInsight for SQL to monitor the SQL instance. It can report back all performance data, which will be useful to the Database team.

           

           

          We cannot give any other recommendations without the sizing (CPU, memory, Disk RAID configuration) of the MS SQL Server.