We often take the advice given to us on forums or in articles at face value. Even though the authors almost always say things like “your mileage may vary” or “may not apply to your situation,” people still treat it as gospel. Sometimes it is a lack of experience; other times it is simply not knowing how to verify these things on your own. In this article I’m going to give you a tool to look at what SQL Server is doing at the disk level so you can make better decisions about how to configure your underlying disks.
There are several things you need to know about how SQL Server accesses the database files and the implications of that before you can construct a proper testing methodology.
http://technet.microsoft.com/en-us/library/cc966500.aspx covers the basics. There are a few things I will highlight here.
ACID and WAL
ACID (Atomicity, Consistency, Isolation, and Durability) is what makes our database reliable. The ability to recover from a catastrophic failure is key to protecting your data.
WAL (Write-Ahead Logging) is how ACID is achieved. Basically, the log record must be flushed to disk before the data file is modified.
Stable media isn’t just the disk drive. A controller with a battery-backed cache is also considered stable. Since SQL Server can request blocks as big as 64KB, make sure your controller can handle a request of that block size in cache. Some older controllers only handle 16KB blocks or smaller.
FUA (Forced Unit Access)
With the requirement of stable media, SQL Server creates and opens all files with a specific set of flags. FILE_FLAG_WRITETHROUGH tells the underlying OS not to use any write caching that isn’t considered stable media, so the local disk cache is normally bypassed. Not all hard drives honor the flag, though; some SATA/IDE drives ignore it. Usually the drive manufacturer provides a tool to turn off write caching. If you are using desktop drives in a mission-critical situation, be aware of the potential for data loss. FILE_FLAG_NO_BUFFERING tells the OS not to buffer the file either. At that point the only cache available is the battery-backed or otherwise durable cache on the controller.
SQL Server uses asynchronous access for data and log files. This allows IO requests to queue up and use the IO system as efficiently as possible. The main difference between the two is that SQL Server will try to gather writes to the data file into bigger blocks, while the log is always written to sequentially.
All of these rules apply to everything but tempdb. Since tempdb is recreated every time SQL Server restarts, recoverability isn’t an issue.
SQL Server data access patterns
Searching around you will find these generalities about SQL Server’s IO patterns:

Log writes
Sequential, 512 bytes to 64KB

Data file reads/writes
8KB to 128KB

Read ahead – more important to Enterprise Edition
8KB to 128KB

Create database
Sequential write, 512 bytes – full initialize on log file only

Backup
Sequential read/write

Restore
Sequential read/write

DBCC – CHECKDB
Sequential read, 8K – 64K

DBCC – DBREINDEX (read phase)
Sequential read (see read ahead)

DBCC – DBREINDEX (write phase)
Sequential write, any multiple of 8K up to 128K

DBCC – SHOWCONTIG
Sequential read, 8K – 64K
Now that we have an idea of what SQL Server is supposed to be doing, it’s time to verify our assumptions.
Capturing IO activity
There are a few tools that will allow you to capture file activity at the system level. Process Monitor is a free tool from Microsoft that I will use to collect some baseline information. In its standard configuration Process Monitor captures a ton of events and spools them to the page file. So, before we begin, we need to change the default configuration.
Capturing IO data using Process Monitor:

Filters to apply
Process is sqlservr.exe
Operation is Read
Operation is Write

Columns to choose
Date & Time
Time of Day

Change the backing file so events are spooled to a file you choose instead of the page file.
The maximum number of events it will capture is 199 million. On my system that is easily enough for 12 hours of activity. Once you have a good sample you can save it off as an XML or CSV file. Choosing CSV makes it pretty easy to import the data into SQL Server using SSIS or your tool of choice.
I import the CSV into a raw table first.
Raw table to import into.
CREATE TABLE [SQLIO].[dbo].[pm_imp] (
    [Process Name]  VARCHAR(12),
    [PID]           SMALLINT,
    [Path]          VARCHAR(255),
    [Detail]        VARCHAR(255),
    [Date & Time]   DATETIME,
    [Time of Day]   VARCHAR(20),
    [Relative Time] VARCHAR(50),
    [Duration]      REAL,
    [TID]           SMALLINT,
    [Category]      VARCHAR(6)
)
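If you don’t have an SSIS package handy, BULK INSERT can load the CSV as well. This is only a minimal sketch: the file path is a placeholder, and because Process Monitor quotes its fields (the Detail column contains embedded commas), you may need a format file, or the FIELDQUOTE option on newer versions, to parse the export cleanly.

```sql
-- Sketch: load the Process Monitor CSV export into the raw table.
-- 'C:\temp\procmon.csv' is a placeholder path; point it at your export.
BULK INSERT [SQLIO].[dbo].[pm_imp]
FROM 'C:\temp\procmon.csv'
WITH (
    FIRSTROW = 2,             -- skip the CSV header row
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n'
);
```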
Next I create a cleaner structure with some additional information separated from the detail provided.
SELECT [Process Name] AS ProcessName,
       PID AS ProcessID,
       Path AS DatabaseFilePath,
       Detail,
       [Date & Time] AS EventTimeStamp,
       [Time of Day] AS TimeOfDay,
       [Relative Time] AS RelativeTime,
       [Duration],
       TID AS ThreadID,
       Category AS IOType,
       substring(Detail, charindex('Length: ', Detail, 0) + 8,
                 charindex(', I/O', Detail, 0) - charindex('Length: ', Detail, 0) - 8) AS IOLength,
       CASE reverse(left(reverse(Path), 3))
           WHEN 'mdf' THEN 'Data'
           WHEN 'ndf' THEN 'Data'
           WHEN 'ldf' THEN 'Log'
       END AS FileType
INTO SQLIOData
FROM dbo.pm_imp
WHERE reverse(left(reverse(Path), 3)) IN ('mdf', 'ndf', 'ldf')
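One wrinkle: Process Monitor formats lengths with a thousands separator, so the IOLength value extracted from Detail is a string like '4,096'. Assuming that format, a small follow-up makes it numeric so it sorts and sums correctly:

```sql
-- Assumes IOLength holds text like '4,096'; strip the separator and cast.
ALTER TABLE SQLIOData ADD IOLengthBytes INT NULL;
GO

UPDATE SQLIOData
SET IOLengthBytes = CAST(REPLACE(IOLength, ',', '') AS INT);
```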
Once we have the data cleaned up a bit we can now start doing some analysis on it.
Queries for interesting patterns.
This query gives us our read and write counts.
SELECT count(*) IOCount ,IOType FROM SQLIOData GROUP BY IOType ORDER BY count(*) DESC
This one shows us the size of the IO and what type of operation it is.
SELECT count(*) IOCount ,IOLength ,IOType FROM SQLIOData GROUP BY IOLength,IOType ORDER BY count(*) DESC
This is a look at activity by file type data or log.
SELECT count(*) IOCount, FileType FROM SQLIOData GROUP BY FileType ORDER BY count(*) DESC
Since we are capturing the thread id we can see how many IO’s by thread.
SELECT count(*) IOCount, ThreadID FROM SQLIOData GROUP BY ThreadID ORDER BY count(*) DESC
We can also look at IO types, sizes and count by file helping you see which ones are hot.
SELECT count(*) IOCount, databasefilepath, iotype, iolength FROM SQLIOData WHERE databasefilepath LIKE '%filename%' GROUP BY databasefilepath, iotype, iolength HAVING count(*) > 10000 ORDER BY databasefilepath, count(*) DESC
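Since we kept the timestamp, we can also bucket the IO counts by minute to spot bursts of activity. A sketch using the SQLIOData columns created earlier:

```sql
-- IO count per minute per type; helps spot checkpoint or backup bursts.
SELECT dateadd(minute, datediff(minute, 0, EventTimeStamp), 0) AS MinuteBucket,
       IOType,
       count(*) AS IOCount
FROM SQLIOData
GROUP BY dateadd(minute, datediff(minute, 0, EventTimeStamp), 0), IOType
ORDER BY MinuteBucket, IOCount DESC
```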
Now that we see exactly what our IO patterns are we can make adjustments to the disk subsystem to help scale it up or tune it for a particular pattern.
This is just another tool in your tool belt, and a supplement to using fn_virtualfilestats to track file usage. I use it to get a better idea of the size of the IOs being issued. Using these two tools I can see the size of the IOs in a window of time that is reported by my fn_virtualfilestats capture routine.
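For reference, a basic fn_virtualfilestats query looks like this (passing NULL as the second parameter returns every file in the database):

```sql
-- Cumulative reads, writes, and IO stall time per file for the current database.
SELECT DB_NAME(DbId) AS DatabaseName,
       FileId,
       NumberReads,
       NumberWrites,
       IoStallMS
FROM ::fn_virtualfilestats(DB_ID(), NULL)
```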
Always verify your assumptions, or advice from others.