cancel
Showing results for 
Search instead for 
Did you mean: 
Create Post

Database 102 - Why Is My Query Slow?

Level 11

As a database administrator (aka DBA, or Default Blame Acceptor) throughout my career, I’ve worked with a myriad of developers, system administrators, and business users who have all had the same question—why is my query (or application) slow? Many organizations lack a full-time DBA, which makes the question even harder to answer. The answer is sometimes simple, sometimes complicated, but they all start with one bit of analysis you need to do: whether the relational database management system (RDBMS) you are using is DB2, MySQL, Microsoft SQL Server, Oracle, or PostgreSQL.

It’s All About the Execution Plan

A database engine balances CPU, memory, and storage resources to try to provide the best overall performance for all queries. As part of this, the engine will try to limit the number of times it executes expensive processes, by caching various objects in RAM—one use is saving blocks with the data needed to return results to a query. Another common use of caching is for execution plans or explain plans (different engines call these different things), which is probably the most important factor in your queries performance.

When you submit a query to a database engine, a couple of things happen—the query is first parsed, to ensure its syntax is valid, the objects (tables, views, functions) you’re querying exist, and you have permission to access them. This process is very fast and happens in a matter of microseconds. Next, the database engine will look to see if that query has been recently executed and if the cache of execution plans has a plan for that query. If there’s not an existing plan, the engine will have to generate a new plan. This process is very expensive from a CPU perspective, which is why the database engine will attempt to cache plans.

Execution or explain plans are simply the map and order of operations required to gather the data to answer your query. The engine uses statistics or metadata about the data in your table to build its best guess at the optimal way to gather your data. Depending on your database engine, other factors such as the number of CPUs, the amount of available memory, various server settings, and even the speed of your storage may impact the operations included in your plan (DBAs frequently refer to this as the shape of the plan).

How Do I Get This Plan and How Do I Read It?

Depending on your RDBMS, there are different approaches to gathering the plan. Typically, you can get the engine to give you a pre-plan, which tells you the operations the engine will perform to retrieve your data. This is helpful when you need to identify large operations like table scans, which would benefit from an index would be helpful. For example—if I had the following table called Employees:

EmployeeID

LastName

State

02

Dantoni

PA

09

Brees

LA

If I wanted to query by LastName, e.g.,

SELECT STATE

FROM Employees

WHERE LastName = ‘Dantoni’

I would want to add an index to the LastName column. Some database engines will even flag a missing index warning, to let you know an index on that column would help the query go faster.

There is also the notion of a post plan, which includes the actual row counts and execution times of the query. This can be useful if your statistics are very out of date, and the engine is making poor assumptions about the number of rows your query will return.

Performance tuning database systems are a combination of dark arts and science and can require a deep level of experience. However, knowing about the existence of and how to capture execution plans, allows you to have a much better understanding of the work your database engine is doing, and can give you a path to fix it.

8 Comments
Level 12

thanks for the post

Level 13

Thanks for the Article

MVP
MVP

Nice write up, thank you.

Level 13

Thanks for the article!

Level 15

Interesting article.  Thanks for posting.

Level 13

Good post, thanks.  A lot of folks don't understand this, and it can cause a lot of unnecessary performance problems.  I had a  developer that was worried about pulling too much data so he was getting each row as he needed it, even when he knew he'd eventually need up to 1,000 (he was worried about putting too much load on the sewrver). I was watching the db (I'm a DBA) and asked him what in the world he was doing.  When he told me I just laughed.  Told him it's massively more efficient to pull it all - that's what databases are for.  When I showed him the parse took more resources than the retrieval he couldn't believe it until he tried it.

Level 12

"Performance tuning database systems are a combination of dark arts and science and can require a deep level of experience."

This is the biggest truth bomb I've ever seen on Thwack!

Level 11

woah I am actually learning something. thanks!

About the Author
Joseph D'Antoni is an Senior Architect and SQL Server MVP with over a decade of experience working in both Fortune 500 and smaller firms. He is currently Solutions Architect for SQL Server and Big Data for Anexinet in Blue Bell, PA. He is frequent speaker at major tech events, and blogger about all topics technology. He believes that no single platform is the answer to all technology problems. He holds a BS in Computer Information Systems from Louisiana Tech University and an MBA from North Carolina State University. .