16 Replies Latest reply on Mar 11, 2016 9:42 AM by DanielleH

    SolarWinds Database Suddenly Spikes In Storage Size, And Rapidly Growing...

    wluther

      I figured I would ask Thwack Nation, while I wait on support. Anyone else out there have this problem before, any advice?

      Support keeps wanting to tell me this is all due to record retention, however, I simply cannot wrap my brain around that answer. I am not saying that answer is wrong. I am simply saying it does not make sense to me. Perhaps there is a better example out there that would better assist my acceptance, however, as it stands now, I am simply not buying that answer.

       

      DB server was configured based on SolarWinds best practice pdf, so the current settings, for the most part, should be configured accordingly.

      We have not recently added a massive amount of nodes/interfaces/etc.

       

      I know there are many other specific details I have not mentioned here, but I figured I would at least get this posted, and will add more details if needed.

       

       

      Orion Platform 2015.1.2, NCM 7.4, NPM 11.5.2, DPA 10.0.0, IVIM 2.1.0, SAM 6.2.2

      Microsoft SQL Server 2012 (SP1) - 11.0.3000.0 (X64)

      Oct 19 2012 13:38:57

      Copyright (c) Microsoft Corporation

      Standard Edition (64-bit) on Windows NT 6.2 <X64> (Build 9200: ) (Hypervisor)

       

      Network Elements18398
      Nodes1246
      Interfaces16366
      Volumes786
      Alerts60
      Events128123
      Pollers74173
      Polling Engines3

       

       

      I have pasted a couple of the descriptions I sent to support, below.

       

      "Case # 950343 Created: Our Orion database has been rapidly increasing in size over the past few weeks."

      Message History
      Our Orion database has been rapidly increasing in size over the past few weeks. Prior to this issue, our database was growing slowly and steadily, being approximately 65GB-70GB. We had projections of the database reaching 100GB by the end of 2016, or early 2017.

      We had a few latency issue, so we rebuilt the SQL server, and swapped them out. A few weeks later, and our database is now over 100GB, which is approximately a full year ahead of schedule.

      I have gone through the SQL server settings, as well as the DB settings, and verified the new server matches the old server. I have even gone through and reduced the number of device configs stored in the db, as well as the length of time, which freed up over 5GB of db storage space.

      We need to determine why our db is growing so rapidly, and see what we can do to prevent this, or at least slow it down.

       

      Shrinking the DB does absolutely nothing, as there is no space to shrink. Everything there is being used, or at least appears to be, and does not shrink more than a few MB.

       

       

       

      Before we rebuilt our SQL server, our DB was approximately 70GB. DB growth was very slow, taking 4-6 months to grow from 60GB to 70GB. We had projected to reach, or be close to reaching, the 100GB size close to the end of this year, or early 2017. Over the past 4 weeks, or so, after rebuilding, and switching over to, our new SQL server, our DB has grown nearly 40 GB. Currently, our DB is approximately 107GB, and growing roughly 100MB every 45-75 minutes.

       

       

       

      I have attached a chart/graph showing the severely sharp and rapid increase in the DB size. I am very far from being convinced this is simply due to our data retention. There must be a better, and more believable, explanation.

       

      Thank you,

      -Will

       

       

       

      DB_growth_Custom_Chart_-_Data_File_Size_201512-201603.png

       

       

       

       

       

      Thank you,

       

      -Will

        • Re: SolarWinds Database Suddenly Spikes In Storage Size, And Rapidly Growing...
          rschroeder

          I've seen something similar after adding a dozen Cisco WLC 5508's to NCM, and pointing their syslog and traps at the NPM pollers.  WLC's send far more information than we anticipated; we had to severely restrict the amount of information those wireless controllers sent to NPM, otherwise we would quickly run out of space.  Ultimately we stopped their syslog and trap reporting to NPM and sent the traffic to Splunk, which we sized appropriately for the amount of data that WLC's can send.

           

          Similarly, we were logging too much syslog and trap information from 60+ ASA's to NPM.  Again, we turned to Splunk and reduced the amount of info sent to NPM, and as a result our dB growth got back on track.

           

          It might be worth reviewing whether these kinds of devices were changed to send large amounts of syslog or traps to your NPM.  Hopefully you'll be able to see which devices may be responsible for the additional growth in space required, and determine if the problem is actual Orion or the devices reporting to it.

            • Re: SolarWinds Database Suddenly Spikes In Storage Size, And Rapidly Growing...
              wluther

              rschroeder We actually had our syslog pointing directly to our main poller, previously. However, as you mentioned, storage can fill up pretty quickly, so we stood up a dedicated Kiwi Syslog server, and repointed everything to send syslog there, instead. (That is the little bump in the graph, towards the middle/end of December.) Since then, we have adjusted our syslog and trap retention settings on our poller to 1 day, just in case, only forwarding the real time change notifications from the Kiwi server to the main poller.

               

              In regards to recently adding devices, we have not had an increase, and not had any abnormal activity, with adding devices. Additionally, while trying to figure this out a couple weeks ago, I adjusted our NCM config storage retention in the db, which reduced that table by nearly 5GB. Unfortunately, that reclaimed space was "refilled" by the following day. I had also increased weekly maintenance jobs to run nightly, which also did not show any changes, neither good nor bad...

               

              Thank you,

               

              -Will

            • Re: SolarWinds Database Suddenly Spikes In Storage Size, And Rapidly Growing...
              zackm

              I would not think this would be an issue with SolarWinds as you have indicated no changes to SolarWinds in this process. The only change has been the SQL server, so I would definitely start there.

               

              Top 25 Tables by Size:

               

              SELECT TOP 25
                  t.NAME AS 'Table Name',
                  p.rows AS 'Row Counts',
                  (SUM(a.total_pages) * 8)/1000 AS 'Total Space (MB)'
              FROM 
                  sys.tables t
              INNER JOIN      
                  sys.indexes i ON t.OBJECT_ID = i.object_id
              INNER JOIN 
                  sys.partitions p ON i.object_id = p.OBJECT_ID AND i.index_id = p.index_id
              INNER JOIN 
                  sys.allocation_units a ON p.partition_id = a.container_id
              WHERE 
                  t.NAME NOT LIKE 'dt%' 
                  AND t.is_ms_shipped = 0
                  AND i.OBJECT_ID > 255 
              GROUP BY 
                  t.Name, p.Rows
              ORDER BY 
                  'Total Space (MB)' DESC
              

               

              Autogrowth Settings:

              SELECT
              S.[name] AS [Logical Name]
              ,S.[file_id] AS [File ID]
              , S.[physical_name] AS [File Name]
              ,CAST(CAST(G.name AS VARBINARY(256)) AS sysname) AS [FileGroup_Name]
              ,CONVERT (varchar(10),(S.[size]*8)) + ' KB' AS [Size]
              ,CASE 
                WHEN S.[max_size]=-1 THEN 'Unlimited' 
                ELSE CONVERT(VARCHAR(10),CONVERT(bigint,S.[max_size])*8) +' KB' 
              END AS [Max Size]
              ,CASE s.is_percent_growth 
                WHEN 1 THEN CONVERT(VARCHAR(10),S.growth) +'%' 
                ELSE Convert(VARCHAR(10),S.growth*8) +' KB' 
              END AS [Growth]
              ,Case WHEN S.[type]=0 THEN 'Data Only'
                WHEN S.[type]=1 THEN 'Log Only'
                WHEN S.[type]=2 THEN 'FILESTREAM Only'
                WHEN S.[type]=3 THEN 'Informational purposes Only'
                WHEN S.[type]=4 THEN 'Full-text '
              END AS [usage]
              ,DB_name(S.database_id) AS [Database Name]
              FROM sys.master_files AS S
              LEFT JOIN sys.filegroups AS G ON ((S.type = 2 OR S.type = 0) AND (S.drop_lsn IS NULL)) AND (S.data_space_id=G.data_space_id)
              --Enter your Database name here
              WHERE S.[name] LIKE '%SolarWinds%'
              

               

              Database Recovery Model:

               

              SELECT 
              Name AS [Database Name],
              Recovery_Model_Desc AS [Recovery Model] 
              FROM sys.Databases
              --Enter database name here
              WHERE Name LIKE '%SolarWinds%'
              

               

              I would also check your maintenance plans and see what settings/tasks you have enabled for this database.

               

              That's a start anyways, there are a LOT of different reasons your database could be growing like this, but the results of these queries might give a little more insight to point in the right direction.

               

              -ZackM

              Loop1 Systems: SolarWinds Training and Professional Services

              1 of 1 people found this helpful
                • Re: SolarWinds Database Suddenly Spikes In Storage Size, And Rapidly Growing...
                  wluther

                  zackm

                   

                  I appreciate your time and suggestions. I have run the queries, and provided the results below.

                  In regards to maintenance plans for this db, I only have the nightly backup set to run.

                   

                  My brain tells me this is weird (top 25 tables by size), as the problem started alongside February, yet only the last day of February shows up in the list. I think I would have expected to see February show up in that list more than it did...

                   

                   

                  Top 25 Tables by Size:

                  Table NameRow CountsTotal Space (MB)
                  NCM_ConfigArchive1205625055
                  InterfaceAvailability_Detail1077853813752
                  CPUMultiLoad_Detail296991002087
                  APM_HardwareItem_Detail154245551809
                  InterfaceTraffic_Detail_201601104670899919
                  InterfaceTraffic_Detail_201601134684693918
                  InterfaceTraffic_Detail_201601094665658917
                  InterfaceTraffic_Detail_201601064671086917
                  InterfaceTraffic_Detail_201601074651588916
                  InterfaceTraffic_Detail_201601114670449916
                  InterfaceTraffic_Detail_201601124654713915
                  InterfaceTraffic_Detail_201603083929046909
                  InterfaceTraffic_Detail_201601084648887897
                  InterfaceTraffic_Detail_201601144513393888
                  InterfaceTraffic_Detail_201601054516877871
                  InterfaceTraffic_Detail_201601044386311863
                  InterfaceTraffic_Detail_201601034385628862
                  InterfaceTraffic_Detail_201601014355024859
                  InterfaceTraffic_Detail_201512304344285857
                  InterfaceTraffic_Detail_201512284346332854
                  InterfaceTraffic_Detail_201512314361517854
                  InterfaceTraffic_Detail_201601024374802845
                  InterfaceTraffic_Detail_201512294347805842
                  InterfaceTraffic_Detail_201603014093320802
                  InterfaceTraffic_Detail_201602294098555802

                   

                   

                   

                  Autogrowth Settings:

                  Logical NameFile IDFile NameFileGroup_NameSizeMax SizeGrowthusageDatabase Name
                  SWNPMDB1E:\MainData\SWNPMDB.mdfPRIMARY112448960 KBUnlimited102400 KBData OnlySWNPMDB
                  SWNPMDB_log2F:\MainLog\SWNPMDB_log.ldfNULL5632832 KB41059328 KB512000 KBLog OnlySWNPMDB
                  SWNPMDB_FG13E:\MainData\SWNPMDB_FG1.mdfFG1102400 KBUnlimited102400 KBData OnlySWNPMDB
                  SWNPMDB_FG24E:\MainData\SWNPMDB_FG2.mdfFG2102400 KBUnlimited51200 KBData OnlySWNPMDB
                  SWNPMDB_FG35E:\MainData\SWNPMDB_FG3.mdfFG3102400 KBUnlimited51200 KBData OnlySWNPMDB
                  SWNPMDB_FG46E:\MainData\SWNPMDB_FG4.mdfFG451200 KBUnlimited51200 KBData OnlySWNPMDB

                   

                   

                   

                  Database Recovery Model:

                  Database NameRecovery Model
                  SWNPMDBSIMPLE

                   

                   

                   

                   

                  Thank you,

                   

                  -Will

                    • Re: SolarWinds Database Suddenly Spikes In Storage Size, And Rapidly Growing...
                      janene16

                      It looks like even though your DB is set to Simple recovery, your LDF file size is huge. I wonder if there's a pending transaction still queued..

                        • Re: SolarWinds Database Suddenly Spikes In Storage Size, And Rapidly Growing...
                          wluther

                          janene16 5GB is huge... lol... We had been fighting that stupid log file for a while too... it had blown up to upwards of 50GB+, and I am not really sure how/why it decided to calm down. I have had multiple tickets open on that issue too, but, like I said, it just kind of calmed down on its own, and support called it a day because it stopped growing.

                           

                          I have had the DPA people look into it, as well as multiple NPM and SAM support techs too. Nothing ever seemed to work, with the exception of the dark magic used by the server gremlins while nobody was looking, then it stopped.

                        • Re: SolarWinds Database Suddenly Spikes In Storage Size, And Rapidly Growing...
                          zackm

                          x2 on the LDF file, that is excessive from what I am used to seeing.

                           

                          Even though it wouldn't technically be supported by SWI, I generally like to put my DBs into FULL recovery and create a more robust backup schema that would allow for a better recovery timeframe if there were problems. Another benefit of this model is that you get to control your log file size more grnularly by scheduling Transaction Log backups.

                           

                          This is not something to take over lightly though and should be discussed with your DBAs.

                           

                          The other thing that immediately strikes me is that your Interface Details tables are about twice the size I would have suspected.

                           

                          Given the default 9 minute polling interval, each interface should see 160 polls/day.

                           

                          Your original screenshot shows 16,366 interfaces, which would mean you should see ~2,618,560 rows on each of those tables.

                           

                          Same thing with the InterfaceAvailability_Detail table. Default polling is 2 minutes, you should have 82,484,640 rows to cover 7 days of detail data (default detail retention).

                           

                          If you have not purposefully increased the amount of polls hitting those interfaces, this should be investigated.

                           

                          It sounds like you have increased retention dates in your environment based on your original post. And that's fine. But it is something to note on why those tables are so large.

                           

                          What I would recommend is using SAM to setup some SQL Server User Experience monitors on your database and try and trend the top 10 tables for a week or two and see where the problems are more granularly.

                           

                          If you don't want to use SAM, just run that top 25 query every day at about the same time and copy the results out to excel and then see where the largest growth is really happening.

                           

                          It's really not directly obvious by this data, I wouldn't even blame this on the log file, that's just a red flag. But if you can trend some data over time at a granular, table-level, that should point you to where the actual growth is instead of just seeing that the whole DB is growing...

                          1 of 1 people found this helpful
                        • Re: SolarWinds Database Suddenly Spikes In Storage Size, And Rapidly Growing...
                          wluther

                          zackm Well, I just looked up the change management records for that server swap. As it turns out, it was initially scheduled for Feb 1st, however, was rescheduled, and the work actually took place on Feb 11th, roughly 12 days after the growth accelerated.

                           

                          I kind of wish I didn't know that, because now I am even more confused, as that server swap was the only "unusual" work/changes done to our SolarWinds environment.

                           

                          -Will

                            • Re: SolarWinds Database Suddenly Spikes In Storage Size, And Rapidly Growing...
                              ecklerwr1

                              That then starts to point to it not being the SQL server swap itsself then if it had already started it's exponential growth over a week before the server was swapped.  Zackm seems to be onto something though run those queries every day for a few days and see if anything big jumps out maybe... I've also recently finally figured out the correct permissions needed to get AppInsightforSQL working and I'm running it now on my main SQL Server holding the solarwinds tables.  That AppInsightforSQL show an amazing amount of data about SQL server.  alterego helped me with what I needed with this:  Configuring and Integrating AppInsight for SQL

                              This has been great to finally see what's going on deep inside my SQL Server day to day and spot trends.  It was also great to setup a service account with only the required permissions needed by AppInsightforSQL and not needing a sa or domain admin account with full permissions to entire SQL Server ie. it made my dba much happier to go along with versus saying oh I need an sa account on the SQL Server you manage.

                              1 of 1 people found this helpful
                            • Re: SolarWinds Database Suddenly Spikes In Storage Size, And Rapidly Growing...
                              patriot

                              Very nice report Zackm. Does the first query have a variable for the target database? Am I not seeing it?

                            • Re: SolarWinds Database Suddenly Spikes In Storage Size, And Rapidly Growing...
                              RichardLetts

                              Are you sure your database is actually growing in size? Yes, I know the files are growing in size, but that is not really an issue since recently accessed data is cached..

                               

                              I note your moved to a new SQLserver database -- are the database server setting the same on both of them? e.g. autoshrink, etc?

                               

                              in my environment the size of the database files is ~500GB, but the space actually consumed by the database tables is ~344GB

                               

                              note.. if you have wireless of any size then you quickly end up with LOTS of wireless data:

                              (my three largest tables)

                              Wireless_Interfaces_DAILY29400474889690
                              Wireless_Clients_DAILY13417615760397
                              Wireless_Clients_SessionHistory5371727728526
                              • Re: SolarWinds Database Suddenly Spikes In Storage Size, And Rapidly Growing...
                                cahunt

                                RichardLetts is correct about that -

                                 

                                Wireless takes up a lot... seeing your APM in the top 5 makes me curious if there have been any increased Hardware Health monitoring?

                                 

                                and for god sakes man, limit your log file size! 

                                j/k - cause your DB is still far smaller than mine....I poll for too many tables.

                                • Re: SolarWinds Database Suddenly Spikes In Storage Size, And Rapidly Growing...
                                  wluther

                                  rschroeder zackm ecklerwr1 janene16 cahunt RichardLetts

                                   

                                  Thank you all for your input on this issue. Here is a rough explanation of my findings.

                                   

                                  I went back through the audit trails again, when I got to the office yesterday morning. I DID find a change that occurred 5 days prior to the increased growth, and it was a change in the retention settings.

                                  From what I can put together, the original settings (set many years ago) had been changed/reduced last year.

                                  The most recent change (5 days prior to the growth) increased those retention settings back to what they should have been.

                                  I suppose it slipped passed me the first time because the settings matched what they SHOULD have been all along, so I did not really notice it had recently been changed.

                                  I changed the settings back to what they had been reduced to last year, and then reran the maintenance job.

                                   

                                  3/10/2016 8:12:25 AMAudit EventUser ***** changed settings Detailed Statistics Retention from 60 to 30.

                                   

                                  The database size dropped roughly 50% after the maintenance had completed.

                                  I am not sure why those changes took 5 days to show any growth, however, it most definitely caused the damage.

                                   

                                  While the error was definitely on my side, I would like to also thank Application Engineer Matthew Lamb (Austin, TX), Support Technician Jamin Walters (Boulder, CO), as well as Jamin's manager, Robert Mandeville (Boulder, CO), for taking the time to look over logs and settings with me. I had already been given an answer, however, they went that little extra bit with me, enough to provide reason, and shine some light in the direction of the root cause. Perhaps DanielleH, maria.bungau, or mrs.alterego can help make sure those guys get their fair share of recognition too, if it's not too much trouble.

                                   

                                   

                                  Thank you all for the help,

                                   

                                  -Will

                                  1 of 1 people found this helpful