Monday, December 31, 2012

How to Use xp_dirtree to List All Files in a Folder - Part 2

In a previous blog post, I demonstrated how you can use an undocumented stored procedure, master.sys.xp_dirtree, to list all files in a folder.  The previous post mostly covered how the stored procedure works and the output it generates.  In this post, I have expanded the script to parse all subdirectories and list every file in every subfolder beneath the starting point.

Just for a quick review, xp_dirtree has three parameters:
  1. directory - This is the directory you pass when you call the stored procedure; for example 'D:\Backup'.
  2. depth  - This tells the stored procedure how many subfolder levels to display.  The default of 0 will display all subfolders.
  3. isfile - This determines whether files are displayed in addition to folders.  The default of 0 will not display any files.
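For instance, calling the procedure with a depth of 2 and files enabled returns the first two folder levels under the path along with the files in each (the path here is just an example):

```sql
-- List the first two subfolder levels under D:\Backup, including files.
-- Substitute a path that exists on your own server.
EXEC master.sys.xp_dirtree 'D:\Backup',2,1;
```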
Now for the updated script:

DECLARE
       @BasePath varchar(1000)
      ,@Path varchar(1000)
      ,@FullPath varchar(2000)
      ,@Id int;

--This is your starting point.
SET @BasePath = 'D:\Backup';

--Create a temp table to hold the results.
IF OBJECT_ID('tempdb..#DirectoryTree') IS NOT NULL
      DROP TABLE #DirectoryTree;

CREATE TABLE #DirectoryTree (
       id int IDENTITY(1,1)
      ,fullpath varchar(2000)
      ,subdirectory nvarchar(512)
      ,depth int
      ,isfile bit);

--Create a clustered index to keep everything in order.
ALTER TABLE #DirectoryTree
ADD CONSTRAINT PK_DirectoryTree PRIMARY KEY CLUSTERED (id);

--Populate the table using the initial base path.
INSERT #DirectoryTree (subdirectory,depth,isfile)
EXEC master.sys.xp_dirtree @BasePath,1,1;

UPDATE #DirectoryTree SET fullpath = @BasePath;

--Loop through the table as long as there are still folders to process.
WHILE EXISTS (SELECT id FROM #DirectoryTree WHERE isfile = 0)
BEGIN
      --Select the first row that is a folder.
      SELECT TOP (1)
             @Id = id
            ,@FullPath = fullpath
            ,@Path = @BasePath + '\' + subdirectory
      FROM #DirectoryTree WHERE isfile = 0;

      IF @FullPath = @Path
      BEGIN
            --Do this section if we are still in the same folder.
            INSERT #DirectoryTree (subdirectory,depth,isfile)
            EXEC master.sys.xp_dirtree @Path,1,1;

            UPDATE #DirectoryTree
            SET fullpath = @Path
            WHERE fullpath IS NULL;

            --Delete the processed folder.
            DELETE FROM #DirectoryTree WHERE id = @Id;
      END
      ELSE
      BEGIN
            --Do this section if we need to jump down into another subfolder.
            SET @BasePath = @FullPath;

            --Select the first row that is a folder.
            SELECT TOP (1)
                   @Id = id
                  ,@FullPath = fullpath
                  ,@Path = @BasePath + '\' + subdirectory
            FROM #DirectoryTree WHERE isfile = 0;

            INSERT #DirectoryTree (subdirectory,depth,isfile)
            EXEC master.sys.xp_dirtree @Path,1,1;

            UPDATE #DirectoryTree
            SET fullpath = @Path
            WHERE fullpath IS NULL;

            --Delete the processed folder.
            DELETE FROM #DirectoryTree WHERE id = @Id;
      END
END

--Output the results.
SELECT fullpath + '\' + subdirectory AS 'CompleteFileList'
FROM #DirectoryTree
ORDER BY fullpath,subdirectory;

--Cleanup.
IF OBJECT_ID('tempdb..#DirectoryTree') IS NOT NULL
      DROP TABLE #DirectoryTree;
GO

This code will process one folder level at a time since we're specifying 1 for the depth parameter.  For each folder in the #DirectoryTree table, we get a list of its files and subfolders and insert them into the #DirectoryTree table.

INSERT #DirectoryTree (subdirectory,depth,isfile)
EXEC master.sys.xp_dirtree @Path,1,1;

If a row is a folder, its isfile value will be 0, and the WHILE loop will pick it up as it works through the table.  The table needs a clustered index to keep all the rows in order.

ALTER TABLE #DirectoryTree
ADD CONSTRAINT PK_DirectoryTree PRIMARY KEY CLUSTERED (id);

After each folder is processed, the folder row is deleted from the table.  This is necessary so the WHILE loop doesn't turn into an infinite loop.

DELETE FROM #DirectoryTree WHERE id = @Id;

The full path is being saved for output later, so comparing it to the current base path tells us whether we need to update the @BasePath variable.

IF @FullPath = @Path
      --Continue processing the same folder.
ELSE
      SET @BasePath = @FullPath;
      --Start processing the next subfolder.

This is how we can keep looping through each subfolder until each one has been processed.

The final step is to output the results.

SELECT fullpath + '\' + subdirectory AS 'CompleteFileList'
FROM #DirectoryTree
ORDER BY fullpath,subdirectory;


I have gotten pretty good performance out of this script.  Even traversing the C:\Windows folder on my test VM, I can get the report to return all 62000+ files in 14000+ folders within about 6 minutes. Hopefully this example will prove to be helpful for your everyday tasks.  Enjoy!

Tuesday, December 18, 2012

SQL Saturday #143 - Washington, DC

Last week, I attended my very first SQL Saturday event in Washington, DC.  Although I don't live anywhere near Washington, I was able to make the short trek from Charlotte, NC.  It was well worth the time and effort.  I missed the previous one in Charlotte back in October, and I have to say I really regret that now.  I've been to SQL Connections in Las Vegas, and SQL Saturday reminded me of a one-day version of it.  I wish I could have attended every single session, but at least I was able to hear the presentations that I was most interested in.

I was able to meet a few of the SQL experts (Mike Walsh, David Klee, Jason Hall, etc.) that I've been following through their blogs and on Twitter.  Hearing them speak and being able to converse with them 1-on-1 was surreal.

For any aspiring DBA out there, if SQL Saturday is in your area it should be a mandatory event on your calendar.  They are free, except for lunch, and they provide an excellent opportunity to see how others in the industry are using the technology.  Beyond listening to the speakers, there is the networking aspect.  I met with other attendees, presenters, and of course the vendors to ask questions about SQL or where I could find more information on a specific topic.  I probably collected a dozen business cards from people, and I plan to keep in touch with them.

Check out SQL Saturday's site for a list of all the upcoming events.  If there are not any close to you, then check PASS for any local SQL Server user groups in your area.  Getting involved in those can provide the same opportunities as SQL Saturday events.

Tuesday, December 11, 2012

T-SQL Terminator

You may have noticed in my code examples, I always include a semicolon ";" at the end of my TSQL statements.  If you're not doing this, then you need to start now.  Actually, if you're not writing TSQL code as a DBA then you need to start now.

Starting with SQL Server 2005, Microsoft began promoting the TSQL statement terminator.  This syntax is not required for most statements today, but Microsoft has deprecated omitting it, so at some point in the future it may be required for all.  You can read the details on MSDN.  As with all changes to SQL, you should be future-proofing your code.

Some statements already require the use of the terminator.  One of the best examples is the use of a Common Table Expression (CTE).  If you attempt to run this code below, you will get a syntax error.

USE AdventureWorks2008R2

WITH CTE
AS (
      SELECT
             P.FirstName
            ,P.LastName
            ,E.VacationHours
      FROM HumanResources.Employee E
            JOIN Person.Person P
                  ON E.BusinessEntityID = P.BusinessEntityID
)
SELECT * FROM CTE
WHERE VacationHours > 40;
GO

Msg 319, Level 15, State 1, Line 3
Incorrect syntax near the keyword 'with'. If this statement is a common table expression, an xmlnamespaces clause or a change tracking context clause, the previous statement must be terminated with a semicolon.

To start defining a CTE, the WITH statement must be the first statement in a batch, or the preceding statement must be terminated with a semicolon, even if it's something as simple as a USE statement.

Add the semicolon to the end of the USE statement and rerun, and you'll get the results without any errors.

USE AdventureWorks2008R2;

WITH CTE
AS (
      SELECT
             P.FirstName
            ,P.LastName
            ,E.VacationHours
      FROM HumanResources.Employee E
            JOIN Person.Person P
                  ON E.BusinessEntityID = P.BusinessEntityID
)
SELECT * FROM CTE
WHERE VacationHours > 40;
GO

(178 row(s) affected)

This is one of the easiest code changes you can make, so start adding it to your code today and save yourself the hassle of rewriting it tomorrow.

Tuesday, December 4, 2012

Get Performance Counters from within SQL Server

I think most DBAs know that you can use the Windows Performance Monitor to capture performance metrics for your SQL Server.  But did you know that you also have access to all the SQL Server performance counters from within SQL Server?

By using the DMV sys.dm_os_performance_counters, you can view that performance data.  As with all DMVs, there are a few drawbacks.  The data reflects activity only since the last time SQL Server was started, and this DMV does not provide access to any of the Windows OS performance counters.

For example, we can query the DMV to find the Page Life Expectancy.

SELECT * FROM sys.dm_os_performance_counters
WHERE  object_name = 'MSSQL$TEST1:Buffer Manager'
AND counter_name = 'Page life expectancy';



The DMV returns five columns:
  1. object_name - Category of the counter.
  2. counter_name - Name of the counter.
  3. instance_name - Name of a specific instance of the counter. For example, the database name.
  4. cntr_value - The current value of the counter.
  5. cntr_type - The type of the counter, which will vary depending on the Windows architecture.


For this example, the counter value is 3074, which means a data page will stay in the buffer pool for 3074 seconds before being removed.  The cntr_type is very important for each of the counters.  A type of 65792 is PERF_COUNTER_LARGE_RAWCOUNT, which means the cntr_value is an absolute value and no further calculations are needed.

Now let's look at another one that does require extra calculations.

SELECT * FROM sys.dm_os_performance_counters
WHERE  object_name = 'MSSQL$TEST1:Latches'
AND counter_name LIKE 'Average Latch Wait Time%';



The Average Latch Wait Time (ms) counter is type 1073874176 - PERF_AVERAGE_BULK.  This is essentially the total wait time for all latches.  The Average Latch Wait Time Base counter is type 1073939712 - PERF_LARGE_RAW_BASE.  It's the number of processes that had to wait for a latch.  In order to find the true "average" wait time, you need to divide the "Average Latch Wait Time (ms)" value by its base value.  In this case, 16296/378 = 43.1ms.
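As a sketch, that division can be done in a single query by joining the counter row to its base row (the MSSQL$TEST1 object_name prefix matches the instance used above; adjust it for your own instance):

```sql
-- Divide the PERF_AVERAGE_BULK counter by its PERF_LARGE_RAW_BASE counter
-- to get the true average latch wait in milliseconds.
SELECT v.cntr_value * 1.0 / NULLIF(b.cntr_value,0) AS AvgLatchWaitMs
FROM sys.dm_os_performance_counters v
JOIN sys.dm_os_performance_counters b
      ON b.object_name = v.object_name
WHERE v.object_name = 'MSSQL$TEST1:Latches'
AND v.counter_name = 'Average Latch Wait Time (ms)'
AND b.counter_name = 'Average Latch Wait Time Base';
```

NULLIF guards against a divide-by-zero error when no latch waits have occurred yet.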

For more information about all the counter type definitions, check out MSDN.

As you can see, there is a wealth of information within this DMV, but you need to understand the type of each counter rather than just looking at the raw value.

Wednesday, November 21, 2012

How to Use xp_delete_file to Purge Old Backup Files

Continuing my recent trend of using undocumented stored procedures, this week I thought we could cover using xp_delete_file to purge old backup files from a folder.  This is the stored procedure used by maintenance plans to clean up old backup files, and it makes a handy purge tool when creating your own backup scripts.

Let's suppose we have a simple script (taken from a recent post on stackoverflow.com) to backup all databases on a server.

DECLARE @name VARCHAR(50); -- Database name
DECLARE @path VARCHAR(256); -- Path for backup files
DECLARE @fileName VARCHAR(256); -- Filename for backup
DECLARE @fileDate VARCHAR(20); -- Used for file name

-- Path to backups.
SET @path = 'D:\Backup\';

-- Get date to include in file name.
SELECT @fileDate = CONVERT(VARCHAR(20),GETDATE(),112);

-- Dynamically get each database on the server.
DECLARE db_cursor CURSOR FOR
SELECT name
FROM master.sys.databases
WHERE name NOT IN ('master','model','msdb','tempdb');

OPEN db_cursor;
FETCH NEXT FROM db_cursor INTO @name;

-- Loop through the list to backup each database.
WHILE @@FETCH_STATUS = 0
BEGIN 
      -- Build the path and file name.
      SET @fileName = @path + @name + '_' + @fileDate + '.BAK';
      -- Backup the database.
      BACKUP DATABASE @name TO DISK = @fileName WITH INIT;
      -- Loop to the next database.
      FETCH NEXT FROM db_cursor INTO @name;
END 

-- Clean up.
CLOSE db_cursor;
DEALLOCATE db_cursor;
GO

The script will create a new database backup file of every database on the server each time it's executed.  However, the path may not have unlimited disk space.  This is where we need to use xp_delete_file.

Xp_delete_file takes five parameters:

  1. File Type = 0 for backup files or 1 for report files.
  2. Folder Path = The folder to delete files from.  The path must end with a backslash "\".
  3. File Extension = This could be 'BAK' or 'TRN' or whatever you normally use.
  4. Date = The cutoff date for what files need to be deleted.
  5. Subfolder = 0 to ignore subfolders, 1 to delete files in subfolders.

To incorporate this into our backup script, we just need to add another variable to define the cutoff date.

DECLARE @DeleteDate DATETIME = DATEADD(wk,-1,GETDATE());

In this example, we want to purge any backup files that are more than one week old and have a file extension of 'BAK'.

Next we just need to add the xp_delete_file after the backup loop completes.

EXEC master.sys.xp_delete_file 0,@path,'BAK',@DeleteDate,0;

We only need to execute xp_delete_file once with subfolder=0 parameter since all backup files are being stored in the same folder.  If there was a separate subfolder for each database, then we'd change the subfolder parameter to 1 to delete files from each of those database subfolders.
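For illustration, if the layout were instead one subfolder per database under @path (a hypothetical layout, not the one used above), the call would look like this:

```sql
-- With the last parameter set to 1, xp_delete_file also purges
-- matching .BAK files in each subfolder under @path.
EXEC master.sys.xp_delete_file 0,@path,'BAK',@DeleteDate,1;
```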

The updated script would look like this.

DECLARE @name VARCHAR(50); -- Database name
DECLARE @path VARCHAR(256); -- Path for backup files
DECLARE @fileName VARCHAR(256); -- Filename for backup
DECLARE @fileDate VARCHAR(20); -- Used for file name
DECLARE @DeleteDate DATETIME = DATEADD(wk,-1,GETDATE()); -- Cutoff date

-- Path to backups.
SET @path = 'D:\Backup\';

-- Get date to include in file name.
SELECT @fileDate = CONVERT(VARCHAR(20),GETDATE(),112);

-- Dynamically get each database on the server.
DECLARE db_cursor CURSOR FOR
SELECT name
FROM master.sys.databases
WHERE name NOT IN ('master','model','msdb','tempdb');

OPEN db_cursor;
FETCH NEXT FROM db_cursor INTO @name;

-- Loop through the list to backup each database.
WHILE @@FETCH_STATUS = 0
BEGIN 
      -- Build the path and file name.
      SET @fileName = @path + @name + '_' + @fileDate + '.BAK';
      -- Backup the database.
      BACKUP DATABASE @name TO DISK = @fileName WITH INIT;
      -- Loop to the next database.
      FETCH NEXT FROM db_cursor INTO @name;
END 

-- Purge old backup files from disk.
EXEC master.sys.xp_delete_file 0,@path,'BAK',@DeleteDate,0;

-- Clean up.
CLOSE db_cursor;
DEALLOCATE db_cursor;
GO

Don't let the stored procedure's name, xp_delete_file, fool you.  Microsoft has some internal code that only allows it to delete database and transaction log backup files.  The stored procedure cannot be used to delete any other type of file.

Yes, it really is that simple.


Tuesday, November 13, 2012

How to Use xp_dirtree to List All Files in a Folder

UPDATED -- Dec 31, 2012 -- Be sure to read Part 2 of this post discussing xp_dirtree.

Last week I blogged about how to use an undocumented stored procedure to create folders.  This week we need to do the opposite: get a list of all files from a subfolder in order to process them.  To do that, we'll use another undocumented extended stored procedure, master.sys.xp_dirtree.  This stored procedure will display a list of every folder, every subfolder, and every file for the path you give it.

Xp_dirtree has three parameters: 
  1. directory - This is the directory you pass when you call the stored procedure; for example 'D:\Backup'.
  2. depth  - This tells the stored procedure how many subfolder levels to display.  The default of 0 will display all subfolders.
  3. file - This determines whether files are displayed in addition to folders.  The default of 0 will not display any files.
For today's example, we just want to display all of our backup files (*.BAK) in a particular folder.  We need to set the other parameters so the output shows the files as well as any subfolders.

EXEC master.sys.xp_dirtree 'D:\Backup\TRON4\TEST2\MyDb1',0,1;

The output below will show us each subfolder and every file for the given directory.


We're only concerned about the BAK files.  So we'll need to create a temp table to hold the output of xp_dirtree.  Then we can select only the rows that are an actual file (isfile = 1) and that have a file extension of BAK.

IF OBJECT_ID('tempdb..#DirectoryTree') IS NOT NULL
      DROP TABLE #DirectoryTree;

CREATE TABLE #DirectoryTree (
       id int IDENTITY(1,1)
      ,subdirectory nvarchar(512)
      ,depth int
      ,isfile bit);

INSERT #DirectoryTree (subdirectory,depth,isfile)
EXEC master.sys.xp_dirtree 'D:\Backup\TRON4\TEST2\MyDb1',1,1;

SELECT * FROM #DirectoryTree
WHERE isfile = 1 AND RIGHT(subdirectory,4) = '.BAK'
ORDER BY id;
GO

Below is the output.


This example only shows the files for a specific subfolder, which is used to get a list of files that are needed for purging before another database backup runs.  This stored procedure also can accept UNC paths (\\Server\Folder) to get a list of files and subfolders from a network drive.  In a future post, I will add on to this script to show all files in all subfolders for a given directory.  Until then, play around with xp_dirtree and its parameters to see the different outputs.  You may be able to incorporate this into your own code.
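As a quick illustration, a UNC call takes the same shape (the server and share names here are placeholders):

```sql
-- Server and share names are placeholders; point this at a real share.
EXEC master.sys.xp_dirtree '\\Server\Folder',1,1;
```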

Tuesday, November 6, 2012

Dynamically Create a Folder for Each Database

If you're a DBA like me then you're probably pretty detail-oriented and like to keep things very organized.  For example, when doing database backups, I like to have a specific folder structure to hold the backup files for each database.  See below:


D:\Backup\Server\Instance\MyDb1
D:\Backup\Server\Instance\MyDb2
D:\Backup\Server\Instance\MyDb3

In order to make this happen automatically, my TSQL backup jobs need code that will create folders when a new database is added or when the jobs are initially set up.  SQL Server offers an undocumented extended stored procedure that allows us to create them on the fly: master.sys.xp_create_subdir.

Master.sys.xp_create_subdir will create whatever folder path you pass it.  For example, if you pass it 'D:\Backup\Server\Instance\MyDb1', it will create each of those folders.  If any of the folders already exist, it will skip them and move on to the next one.
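A single standalone call looks like this (the path is illustrative):

```sql
-- Creates D:\Backup, then D:\Backup\Server, and so on down the chain,
-- skipping any folder that already exists.
EXEC master.sys.xp_create_subdir 'D:\Backup\Server\Instance\MyDb1';
```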

So if we need to create a folder for each database before running our backup job, then we can execute a block of code such as this.


DECLARE
       @BaseFolder nvarchar(128)
      ,@DatabaseFolder nvarchar(256)
      ,@DbName sysname;
     
-- Only one input parameter needed from the user.
SET @BaseFolder = 'D:\Backup';

DECLARE Db CURSOR FOR
SELECT name FROM sys.databases WHERE name <> 'tempdb';

OPEN Db;
FETCH NEXT FROM Db INTO @DbName;
WHILE @@FETCH_STATUS = 0
BEGIN
      -- Dynamically create a folder for each database.
      SET @DatabaseFolder = @BaseFolder + '\'
      + CONVERT(nvarchar,SERVERPROPERTY('MachineName')) + '\'
      + ISNULL(CONVERT(nvarchar,SERVERPROPERTY('InstanceName')),'DEFAULT')
      + '\' + @DbName;
      EXEC master.sys.xp_create_subdir @DatabaseFolder;
      FETCH NEXT FROM Db INTO @DbName;
END

-- Clean up
CLOSE Db;
DEALLOCATE Db;
GO

On our test server we have three user databases plus the usual system databases, so we end up with a folder structure like this.


This example creates folders for backups; however, the same code can be adapted for any number of uses, and you can even pass UNC paths (\\Server\Folder\) so you can create network folders as well.