We upgraded our database server this past weekend. It wasn't a shabby box prior to the upgrade: an HP DL360 with dual Intel Xeon E5-2660 CPUs @ 2.2GHz (40 logical cores) and 256GB RAM, plus a 4x800GB SSD array. All of that supports an environment with 17 additional polling engines, roughly 13,000 nodes (nearly 98,000 elements), and about 87,000 application component monitors. Our DB is right around 550GB.
Seems like a lot of power, right? We decided to add another 512GB of RAM (for a whopping 768GB total) and a mirrored set of SSDs dedicated to our SQL log files. Our expectation was that if we could run the entire DB in memory (SQL Server is allocated 736GB) and move the SQL logs to a dedicated array, we would absolutely crush any hardware constraints on DB performance. Yesterday was our first full business day running with the new server in place. Here is what we saw in DPA. (You have DPA, don't you??)
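For anyone curious how that memory cap gets applied, here is a minimal sketch of setting it with sp_configure. The 736GB figure is our allocation; the OS headroom you leave is your call, so treat the number as an assumption, not a recommendation.

```sql
-- Hedged sketch: cap SQL Server's memory so the OS keeps some headroom.
-- 736GB expressed in MB (736 * 1024 = 753664); adjust for your own box.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max server memory (MB)', 753664;
RECONFIGURE;
```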

Yep, that's right. We threw all the hardware we could possibly throw at this database and our total waits INCREASED. In fact, they jumped from 114 hours of total wait on Monday, August 22nd to almost 156 hours of total wait on Monday, August 29th. What the heck is going on?
We dug a little deeper. The two large blocks in the last stacked bar graph are the custom poller inserts and rollups; those two queries accounted for 24% of the overall waits. The *vast* majority of that wait comes during our DB maintenance window (12-5AM), as shown below for the insert query (the purple block above). The bottom blue block is all CPU waits and is spread fairly evenly across the day.
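If you don't have DPA, you can get a rough approximation of the top offenders from SQL Server's own DMVs. This is only a sketch: it reports cumulative elapsed time since each plan was cached, not DPA's time-series wait accounting, so the numbers won't line up exactly.

```sql
-- Hedged sketch: top cached statements by total elapsed time since caching.
-- A quick approximation, not a substitute for DPA's wait-time breakdown.
SELECT TOP (10)
    qs.total_elapsed_time / 1000000.0 AS total_elapsed_sec,
    qs.total_worker_time  / 1000000.0 AS total_cpu_sec,
    qs.execution_count,
    SUBSTRING(st.text, (qs.statement_start_offset / 2) + 1,
        ((CASE qs.statement_end_offset
              WHEN -1 THEN DATALENGTH(st.text)
              ELSE qs.statement_end_offset END
          - qs.statement_start_offset) / 2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_elapsed_time DESC;
```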

There is a 3rd offender that contributes about 5% of the total wait time and is also an insert statement related to custom poller statistics.

We do have a lot of custom pollers (113 unique pollers) with nearly 70,000 poller assignments. That makes for a lot of data: 752,944,412 rows in our CustomPollerStatistics_Detail table. We only collect those statistics every 5 minutes, but that is still a lot of data.
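If you want to sanity-check your own numbers, a rough sketch using standard catalog views is below. The table name is how it appears in our Orion database; verify it against your own version.

```sql
-- Hedged sketch: approximate row count and base-table size without scanning.
-- Filtering to index_id 0/1 counts the heap or clustered index only, so
-- reserved_mb excludes nonclustered indexes.
SELECT
    t.name                                   AS table_name,
    SUM(ps.row_count)                        AS approx_rows,
    SUM(ps.reserved_page_count) * 8 / 1024.0 AS reserved_mb
FROM sys.dm_db_partition_stats AS ps
JOIN sys.tables AS t ON t.object_id = ps.object_id
WHERE t.name = 'CustomPollerStatistics_Detail'
  AND ps.index_id IN (0, 1)
GROUP BY t.name;
```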
No, we don't have CPU utilization issues and CPU queue length is good.
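If you want to run the same check on your own box, one rough sketch at the scheduler level is below; consistently non-zero runnable task counts across schedulers would be the red flag, and we aren't seeing that.

```sql
-- Hedged sketch: runnable_tasks_count staying above zero on many schedulers
-- over time suggests CPU pressure; ours sits at or near zero.
SELECT
    scheduler_id,
    current_tasks_count,
    runnable_tasks_count
FROM sys.dm_os_schedulers
WHERE status = 'VISIBLE ONLINE';
```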

Even with SSDs out the wazoo, we still get spikes in both read and write latency during our maintenance window. (Maintenance actually starts at 9PM for us.)
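You can get a comparable per-file latency view from SQL Server itself; keep in mind this sketch reports cumulative averages since the last restart, so short spikes during the maintenance window get smoothed out.

```sql
-- Hedged sketch: average read/write latency per database file since the
-- last SQL Server restart (cumulative, so brief spikes are averaged away).
SELECT
    DB_NAME(vfs.database_id) AS database_name,
    mf.physical_name,
    vfs.io_stall_read_ms  / NULLIF(vfs.num_of_reads, 0)  AS avg_read_ms,
    vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
  ON mf.database_id = vfs.database_id AND mf.file_id = vfs.file_id
ORDER BY avg_write_ms DESC;
```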


So, here is my ask for the Thwack community. Do you see similar wait profiles in your environment? If not, and you have the same use profile for custom pollers, did you do anything to help address the issues?
Here are my ideas:
1) Convert custom pollers (UnDPs) to SAM components. (This is problematic as we use some of those UnDPs in transforms to calculate new data points, but it could work for many of them.)
2) Create new memory and CPU pollers via the Orion UI and assign them to the custom poller nodes. (This still allows us to do transforms, but I'm not sure all of our UnDP transforms could be converted over.)
I'm open to anything that helps reduce the size of that CustomPollerStatistics_Detail table and improve our associated wait times.
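For anyone who wants to run the same check before suggesting fixes, here is a rough sketch of how we'd rank pollers by row volume to see which UnDPs are worth converting first. The join and column names are assumptions based on our Orion schema (CustomPollerAssignment and CustomPollers tables), so verify them against your own version before trusting the output.

```sql
-- Hedged sketch: rank custom pollers by how many detail rows they generate.
-- Table and column names are assumptions from our Orion DB; they may differ
-- between Orion versions.
SELECT TOP (20)
    cp.UniqueName,
    COUNT_BIG(*) AS detail_rows
FROM dbo.CustomPollerStatistics_Detail AS d
JOIN dbo.CustomPollerAssignment AS a
  ON a.CustomPollerAssignmentID = d.CustomPollerAssignmentID
JOIN dbo.CustomPollers AS cp
  ON cp.CustomPollerID = a.CustomPollerID
GROUP BY cp.UniqueName
ORDER BY detail_rows DESC;
```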