Orion 9.5 Database Maintenance

Looks like the DB maintenance on my SLX server is crippling performance. For the last two days, since it started, CPU and disk I/O on the server have been through the roof. This is causing false positives and outage alerts.

I had a similar issue in 9.1 that was fixed by a patch referenced in this thread.

Is anyone else having this issue? I am going to take the same steps to disable the DB maintenance until I have a workaround.

1 Solution

There is a variable "@chunksize = 100" in the stored procedures that the DB maintenance uses; it sets a limit of 100 records per loop. If you have a large number of records to process, this will take forever.

I ended up changing 3 of them to 100,000 (dbm_ResponseTime_DeleteStale, dbm_InterfaceTraffic_DeleteStale, dbm_InterfaceErrors_DeleteStale). The first time the DB maintenance ran after the change it took a few hours to complete, but it now runs and completes in a reasonable amount of time.

You may need to change more of the stored procedures depending on the size of your database, but it seems any table over 400 MB will take a long time to complete with the 100-row limit set.
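
The "loop of 100" behavior described above can be sketched outside of SQL Server. The following is purely illustrative, run against SQLite rather than T-SQL, with a made-up table name (`response_time`) and schema: it deletes stale rows in fixed-size chunks and counts how many passes that takes.

```python
# Illustrative sketch of the chunked-delete pattern described above,
# run against SQLite (the real Orion procedures are T-SQL; the table
# and column names here are hypothetical).
import sqlite3

def delete_stale_chunked(conn, cutoff, chunksize):
    """Delete rows with recorded < cutoff, at most `chunksize` per pass.

    Returns the number of passes: with a tiny chunk size, a big backlog
    of stale rows means a huge number of passes.
    """
    passes = 0
    while True:
        cur = conn.execute(
            "DELETE FROM response_time WHERE rowid IN "
            "(SELECT rowid FROM response_time WHERE recorded < ? LIMIT ?)",
            (cutoff, chunksize),
        )
        conn.commit()
        passes += 1
        if cur.rowcount < chunksize:  # last, partial (or empty) chunk
            break
    return passes

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE response_time (recorded INTEGER, value REAL)")
conn.executemany(
    "INSERT INTO response_time VALUES (?, ?)",
    [(t, 0.0) for t in range(10_000)],
)
conn.commit()

# 9,000 of the 10,000 rows are "stale": at chunksize=100 that is
# 90 full passes plus one final empty pass.
passes = delete_stale_chunked(conn, 9_000, 100)
print(passes)  # 91
```

Raising the chunk size to 100,000, as the post suggests, would clear the same backlog in a single pass; the trade-off is that each pass then holds locks and transaction-log space for a much larger delete.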

32 Replies

We are having the same issue. We originally thought it was the 2 GB limit of SQL 2005. We increased the limit to 6 GB, and now the CPU gets maxed out when the database maintenance starts. The process never dies until we restart the SQL server. I have opened a case and provided the diagnostics and the SQL deadlock trace. I hope there is an answer soon, as this is impacting our ability to monitor our network.

Is there a way to disable the DB maintenance for now?


Hi,

Lasher, could you (or someone else) give me a hint on how to change the chunk size?

I'm using Microsoft SQL Server Management Studio Express: I right-click on the procedure in question and choose "change".

After changing the variable I try to save the script, but I can only save it as a new SQL query.

What is the right way to edit those procedures?

Thanks Andre


If you right-click on the procedure and select 'Modify', then after changing the value you can click the 'Execute' button at the top of SQL Management Studio.


Thanks for the reply.

But I asked the question 6 months ago.

In the meantime I found it. It's quite easy if you know how it works 😉

Regards Goose


Where are you making these changes?


Yes, I made the changes myself. It is not hard: a double hyphen ("--") will comment out the original line, so just copy it and change the number on the copy. If it does not help, delete the line you added and remove the "--".


I was referring to where you are making the changes. What file are you editing?


Sorry, I misread.

Using a tool like Microsoft SQL Server Management Studio Express, you can get access to the stored procedures.

The 3 I had to change were:

dbm_ResponseTime_DeleteStale

dbm_InterfaceTraffic_DeleteStale

dbm_InterfaceErrors_DeleteStale


Thanks Lasher,

Judging by the logs SolarWinds support sent me, my issues are with the Interface Traffic and Errors procedures. I have increased the limit to 100,000 and will try the DB maintenance again.

I will post the results when I have them.


Same issue here after upgrading to 9.5. The boss is not a happy camper because of this, so I do have a support ticket, 101114.

After we make these changes and a patch comes out, will it correct these changes?


Just double-checked, and yes, it should.


I am also having the same issues, and SolarWinds tech support sent me to this thread.

Case 101114

There is a workaround that may work for you. A few procedures used in the database maintenance are limited to 100 rows at a time; if one has a large number of rows to delete, it has to loop over and over again, which is what slows the process down.
Simply increasing that limit speeds the process up greatly.

The workaround at the moment is to increase the chunk size of the procedures to 100,000; the DB maintenance should then run.
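
To put rough numbers on the "loop over and over" behavior, the number of delete passes is roughly the backlog divided by the chunk size. The 5-million-row backlog below is a made-up figure for illustration:

```python
# The number of delete passes is roughly rows_to_delete / chunk_size;
# each pass re-runs the procedure's delete query.
import math

def delete_passes(rows_to_delete: int, chunk_size: int) -> int:
    return math.ceil(rows_to_delete / chunk_size)

# Hypothetical backlog of 5 million stale rows:
print(delete_passes(5_000_000, 100))      # 50000 passes at the default
print(delete_passes(5_000_000, 100_000))  # 50 passes at the suggested value
```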

Can you have a look at the following Thwack post for details on how to fix?

Now what happens when they come out with a fix after I make the changes just to get around this problem?


Any more updates on this? We have disabled the maintenance for now, but the SQL server still bogs down until it is restarted. We have a case open with SW and really need to get this fixed ASAP.


I used Lasher's suggestion to increase the chunk size on the stored procedures to fix our issue. I have passed that information along to SolarWinds support for a potential SP or hotfix.

Have you tried Lasher's suggestion?


No, we have not. I am not a database programmer and do not want to mess up the database.

If SolarWinds support directs me to this, then we will do it.


We are looking to address this in an upcoming SP; I cannot tell you which one yet, as we are still working that out.


Support emailed me telling me to install 9.5 SP1 and that it would fix the issue. I noticed that it wasn't in the release notes for the SP, though.

After installing the service pack, the maintenance job still ran fine; it took about 5 minutes on my 33 GB database. I also noticed that the chunk size I had set to 1,000,000 was back to 100 after the service pack.


There wasn't a change to DB Maintenance in SP1. The change to increase the chunk size is being worked on, and we hope it makes the next SP.

The main reason it was probably taking so long was that we were cleaning up orphan data that had been accumulating. The 100-row chunk size wasn't a good choice for that. For "normal" runs, which really only clean out expired data, the 100-row chunk size isn't that bad. The plan is to increase the chunk size, albeit probably not to 1,000,000 rows.


I really don't think the number of rows per chunk was the issue. It looks to me like it's the way the loop is written: after each pass, the query pulls back the entire table again. Over time this gets faster and faster as the table gets smaller, but there has to be a more efficient way to do it.
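
One way to avoid re-scanning from the top of the table on every pass is keyset pagination: each pass resumes from the last row id the previous pass touched. This is a sketch of the idea only, written in Python against SQLite with a hypothetical `metrics` table; it is not what Orion's T-SQL procedures actually do.

```python
# Keyset-style chunked delete: remember where the previous pass stopped,
# so each pass reads only its own chunk instead of re-scanning the table.
# Illustrative only (SQLite here; Orion's real procedures are T-SQL).
import sqlite3

def delete_stale_keyset(conn, cutoff, chunksize):
    last_rowid = 0
    passes = 0
    while True:
        batch = conn.execute(
            "SELECT rowid FROM metrics "
            "WHERE rowid > ? AND recorded < ? ORDER BY rowid LIMIT ?",
            (last_rowid, cutoff, chunksize),
        ).fetchall()
        if not batch:
            break
        last_rowid = batch[-1][0]  # resume point for the next pass
        conn.executemany("DELETE FROM metrics WHERE rowid = ?", batch)
        conn.commit()
        passes += 1
    return passes

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metrics (recorded INTEGER)")
conn.executemany("INSERT INTO metrics VALUES (?)", [(t,) for t in range(1_000)])
conn.commit()

# 800 stale rows at chunksize=100 -> 8 passes, none re-reading old ground.
passes = delete_stale_keyset(conn, 800, 100)
print(passes)  # 8
```

Because the stale-row filter is applied inside each keyed range, fresh rows interleaved with stale ones are skipped correctly; only the chunk being deleted is read on each pass.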
