Tuesday, May 28, 2013

Investigating Plan Cache Bloat

SQL Server includes a DMV, sys.dm_exec_query_stats, that returns performance statistics for each query plan cached in memory.  However, it can also help give you insight into how consistent your developers are with writing code.

For this topic, we'll just concentrate on a few columns returned by the DMV: sql_handle and plan_handle.  Per Books Online, sql_handle is a value that refers to the batch or stored procedure that the query, and plan_handle is a value that refers to the compiled plan of that query.  For each query that is processed, SQL Server can generate one or more compiled plans for that query.  This one-to-many relationship can be caused by a number factors, but one simple reason can be coding inconsistency.

One simple coding difference that I often see is within the SET statements preceding a query.  If a developer executes the exact same query using different SET statements, then SQL Server will compile a separate plan for each one.

First, we need to clear the cache.

DBCC FREEPROCCACHE;
GO

Next run these two queries.

SET QUOTED_IDENTIFIER OFF;
GO
SELECT p.FirstName,p.LastName FROM Person.Person p
JOIN HumanResources.Employee e ON p.BusinessEntityID = e.BusinessEntityID
WHERE e.Gender = 'M';
GO

SET QUOTED_IDENTIFIER ON;
GO
SELECT p.FirstName,p.LastName FROM Person.Person p
JOIN HumanResources.Employee e ON p.BusinessEntityID = e.BusinessEntityID
WHERE e.Gender = 'M';
GO

As you can see, the only difference between the two queries is the value for SET QUOTED_IDENTIFIER. Now let's query the DMV.

SELECT s.text, q.sql_handle, q.plan_handle
FROM sys.dm_exec_query_stats q CROSS APPLY sys.dm_exec_sql_text(sql_handle) s;
GO



We can see that we have 2 rows returned, one for each query.  As you'll notice, the sql_handle is the same for each, but the plan_handle is different.  Next let's look at the graphical query plan of each.

SELECT * FROM sys.dm_exec_query_plan(0x0600050049DA7633D08998220000000001000000000000000000000000000000000000000000000000000000);
GO


SELECT * FROM sys.dm_exec_query_plan(0x0600050049DA7633908298220000000001000000000000000000000000000000000000000000000000000000);
GO


You will see the query plan is the same; however, SQL Server treats each one as if it were a completely distinct query.  If this were just a typo by the developer, then SQL Server just doubled the amount of plan cache needed for this query and wasted valuable resources.

Let's look at the same queries from another angle.  This time we'll remove the SET statements, but change the formatting of the queries.

First clear the plan cache.

DBCC FREEPROCCACHE;
GO

Next run these two queries.

SELECT p.FirstName,p.LastName FROM Person.Person p
JOIN HumanResources.Employee e ON p.BusinessEntityID = e.BusinessEntityID
WHERE e.Gender = 'M';
GO

SELECT
        p.FirstName
       ,p.LastName
FROM Person.Person p
JOIN HumanResources.Employee e ON p.BusinessEntityID = e.BusinessEntityID
WHERE e.Gender = 'M';
GO

Finally, look at the DMV.

SELECT s.text, q.sql_handle, q.plan_handle
FROM sys.dm_exec_query_stats q CROSS APPLY sys.dm_exec_sql_text(sql_handle) s;
GO


What you'll notice is SQL Server will still treat each query as if it were two completely different statements; however, the only difference is the formatting.

In these examples, we've covered how important it is for the developers to be consistent with all the code passed to SQL Server.  Just minor changes in the code will cause SQL Server to generate different query plans and lead to plan cache bloat and wasted resources.  As a DBA, these are some simple examples of feedback you should be providing to your development teams.  Be proactive and don't let them down!

No comments:

Post a Comment