Monthly Archives: June 2010

Fundamentals of Storage Systems – Disk Controllers, Host Bus Adapters, and Interfaces

We have covered the Hard Disk and the System Bus. This time around we will cover disk controllers and host bus adapters.

In The Beginning…

There were three distinct components to your IO subsystem: the disk, the controller, and the host bus adapter. Today there are still three distinct components, but the arrangement has changed. The physical disk we have covered and you know about. What you may not realize is that the disk controller is actually the circuit board on the back of the hard drive. In the past this board may have been an add-in card, a back plane that the drives plugged into, or even an add-in card with the hard disk mounted on it! It took a little time for the configuration we take for granted today to settle out. Once the form factor for the hard drive and the controller was done, there was still the issue of what a host bus adapter was supposed to do. Some of you may remember the old days of MFM, RLL, and proprietary disk layouts: having to do a low level format, setting the interleave, even having to park the drive when you were done with the computer. Those days are long gone. Now low level formatting is done at the factory, there is no need for interleaving, and all drives auto-park. Whoa, what a time warp. All of these chores were eliminated mostly due to advances in disk controllers and host bus adapters.

The Disk Controller

The card that slots into your system and is connected via cable to your hard drive isn’t the disk controller. The disk controller resides on the hard drive and handles all the low level operations. From spinning the disk to moving the heads and transferring the data, the disk controller does most of the heavy lifting. Once the data has been read it finally makes its way down the wire to the host bus adapter.

Interface Protocols

There have been several data encoding and signaling schemes over the years. We have touched on MFM and RLL as the first widespread standards used early on. The two standards that have stood the test of time are IDE/ATA and SCSI. These standards can be implemented on top of other protocols like IP and Fibre Channel, both of which are network protocols. There are ATA implementations on FC and IP, but neither is as popular as SCSI. Fibre Channel is pretty much the domain of Storage Area Networks (SAN), which we will cover in a future article.

A breakdown of speeds.

Bus Type                        Speed (MB/sec)
ATA/133                         133
SATA 150                        150
SATA/SAS 300                    300
SATA/SAS 600                    600
SCSI U160                       160
SCSI U320                       320

Alternate SCSI/ATA implementations

Bus Type                        Speed (MB/sec)
Fibre Channel 1GFC              106
Fibre Channel 2GFC              212
Fibre Channel 4GFC              425
Fibre Channel 8GFC              850
iSCSI Gigabit Ethernet          125
iSCSI 10 Gigabit Ethernet       1250

A modern spinning disk would have a hard time using even ATA/133’s available bandwidth all by itself. The older parallel ATA (PATA) and SCSI standards are giving way to their newer serial counterparts, SATA and SAS. The previous generation had several marked differences between them. ATA could only have two drives per channel while SCSI could have up to 15. ATA was unidirectional, able only to read from or write to the drive at any one time, while SCSI was bidirectional. This has carried over to the new standards as well.

If you have a SAS HBA it will accept both SAS and SATA drives. Another great feature is the reliability of the connectors. Both ATA and SCSI relied on large ribbon cables and, in the case of SCSI, termination of the cable chain. I have been kept up at night troubleshooting faulty SCSI cabling, running down the chain to figure out which drive was causing the problem or whether it was a termination issue. The new cables are much smaller and are all point to point, with no daisy chaining or termination issues to worry about. The last boon added was the idea of using expanders in the case of SAS, or port multipliers for SATA-only arrays. The old SCSI standard with 15 drives in a single chain was limiting. You also had the issue that 15 drives could easily saturate a single U320 channel. The biggest SCSI RAID HBAs usually shipped with 4 channels. In contrast, the new SAS HBAs may have 4 times that amount. With SAS expanders you can aggregate SAS channels and have more drives in a single chain. With the SAS 300 standard, 4 drives can saturate a single channel. With a single 4 drive expander you could have 4 drives on that single channel, making the most of the available bandwidth. You can also have up to 128 drives on an edge expander and up to an astounding 16,384 SAS devices in a single SAS domain. This gives you a lot of flexibility when it comes to configuring your storage and utilizing the bandwidth available.

As you plan your configuration you must be mindful of how many channels you have, what kind of bus the HBA uses, and how much bandwidth is available through the entire stack. For example, if you have a PCIe RAID controller with 28 ports, that is a theoretical 8.4 gigabytes a second of available bandwidth via the SAS 300 protocol (28 ports at 300 MB/sec each). Each drive may be able to deliver 80 megabytes a second, so if you use only one drive per port and no expanders, that is about 2.2 gigabytes a second. If the HBA isn’t plugged into a PCIe 2.0 x8 slot or a PCIe 1.0 x16 slot you aren’t going to get that 2.2 GB/sec of throughput. You should still get the IOs available, but sustained throughput will be limited. Just because an HBA says it can support 108 drives doesn’t mean you will get all the throughput of those drives. You may have an HBA that only supports PCIe 1.0 and only has 4 lanes, for a total of 1 GB/sec of throughput to the system. Again, you get the IO increase, and for SQL Server sometimes that is exactly what you are after.
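The arithmetic in that example can be sketched in a few lines of Python. This is only a back-of-the-envelope model under stated assumptions: the per-lane PCIe figures (250 MB/sec for 1.0, 500 MB/sec for 2.0) are theoretical peaks, the 80 MB/sec per drive is the figure quoted above, and the function names are made up for illustration.

```python
# Rough bandwidth budget for an HBA, using the figures quoted above.
# All numbers are theoretical peaks in MB/sec; real hardware delivers less.

SAS_300_MB_PER_PORT = 300      # SAS 300 link speed per port
DRIVE_MB_PER_SEC = 80          # sustained rate assumed for one spinning disk
PCIE_MB_PER_LANE = {"1.0": 250, "2.0": 500}  # per lane, per direction


def pcie_bandwidth(generation: str, lanes: int) -> int:
    """Theoretical slot bandwidth in MB/sec."""
    return PCIE_MB_PER_LANE[generation] * lanes


def sustained_throughput(ports: int, generation: str, lanes: int) -> int:
    """The slowest layer in the stack sets the sustained throughput."""
    port_limit = ports * SAS_300_MB_PER_PORT
    drive_limit = ports * DRIVE_MB_PER_SEC   # one drive per port, no expanders
    slot_limit = pcie_bandwidth(generation, lanes)
    return min(port_limit, drive_limit, slot_limit)


if __name__ == "__main__":
    print(28 * SAS_300_MB_PER_PORT)             # 8400 MB/sec across 28 ports
    print(28 * DRIVE_MB_PER_SEC)                # 2240 MB/sec from 28 drives
    print(sustained_throughput(28, "2.0", 8))   # 2240: the drives are the limit
    print(sustained_throughput(28, "1.0", 4))   # 1000: the x4 slot is the limit
```

Taking the minimum across the layers is the whole point: adding drives only helps sustained throughput until the next layer up becomes the bottleneck.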

Host Bus Adapter

This is what most people think of as the disk controller or controller card. In its simplest form it transfers data between the system board and the hard disk controller. Of course there are other things that can happen on the HBA. It can have integrated RAID functions, additional caching, or other things that are not appropriate to do at the disk controller level. There are several types of HBAs, from the ones built into your computer’s motherboard to high end SAS RAID controllers.

Cache, Disk Controllers, and HBA’s

Almost all enterprise class HBAs have caching as an option or built into the card. This is of particular interest to us and SQL Server. Your data will be safe; SQL Server guarantees this over all else. In my post on capturing IO patterns I discuss why and how SQL Server does this and the concept of stable media. SQL Server assumes that it is talking to a single physical disk and opens the data files in such a way that write caching isn’t used even if it is available. SAS and SCSI drives normally honor this request. But one of the options more advanced HBAs offer is the ability to use the cache and still have stable media. This is usually accomplished through a battery backup unit mounted on the card that keeps the cache memory active during a system failure. Some controllers will gladly let you shoot yourself in the foot by letting you turn the write cache on without a battery and also enable the local write cache on the disk drive. In this situation, if you have a sudden power failure, data loss is going to happen if there are any writes in flight at that time. Currently, there isn’t a disk drive on the market with a battery backed cache that I know of. There is a new possibility of using fast NAND flash instead of DRAM to act as the cache on drives and HBAs. Since NAND is non-volatile it doesn’t need constant power. To make up for the slower speed of NAND compared with DRAM, these caches are being made two or more times the size.

Just in case you haven’t had a chance to peek into your servers, here is an assortment of HBAs from yesterday and today.

An MFM add-on card with disk drive, disk controller and host bus adapter.
A PCI ATA/133 IDE host bus adapter.
A PCI SCSI host bus adapter.
A PCI-X Fibre Channel host bus adapter.
A PCIe SAS host bus adapter with 28 ports available.
A PCIe SAS host bus adapter with 2 4x external ports. Notice the memory module?

Until Next Time

I hope you know a little more about HBAs now and have a better understanding of what they are and what they do.

Series To Date:
  1. Introduction
  2. The Basics of Spinning Disks
  3. The System Bus
  4. Disk Controllers, Host Bus Adapters and Interfaces – You are here!
  5. RAID, An Introduction
  6. RAID and Hard Disk Reliability, Under The Covers
  7. Stripe Size, Block Size, and IO Patterns
  8. Capturing IO Patterns
  9. Testing IO Systems

Fundamentals of Storage Systems – Testing IO Systems

12/03/2009 – UPDATE! There were a couple of bugs in the SQLIOCommandGenerator. The new SQLIOTools.zip has been updated.

I often tell people one of the greatest things about SQL Server is that anyone can install it. I also tell people that one of the worst things about SQL Server is that anyone can install it. Microsoft fostered a “black-box” approach to SQL Server in 7.0 and 2000. Thankfully, they are reversing this course. As a follow-on to my last article, capturing I/O patterns, we will take a quick look at building some synthetic tests based on those results. There are several tools on the market to test I/O systems, some of them free, some of them not. SQLIO has been around for several years, and there are lots of good articles already on the web describing various uses for this tool. SQLIO was specifically designed to test the limits of your I/O system at different workloads. The problem is that people tend to run this tool, look at the best results, and assume that they will see the same results when the server goes live. Without understanding your current workloads, that is an unreasonable expectation at best. What you end up with is a misconfigured I/O system, lots of headaches, and no idea why the system performs so poorly.

I always advocate testing new systems before they go into production. I also understand that it isn’t always an option. Having found myself in that exact situation recently, I’ve decided to take my own advice and pull the new storage off-line to do the proper testing. I’m also taking this opportunity to refine my testing methodology and gather as many data points as I can before the system goes live.

The Test Scripts

With my IO patterns in hand I set out to build a couple of little tools to help me generate all the test scripts and manage the data. As usual, I built these as command line tools since I have no skill at all with GUIs. It is all in C# and I will be posting them up to Codeplex. You can download the tools here: SQLIOTools.zip. This zip has the two tools; they are beta and don’t have a ton of error checking built into them yet. The first tool, SQLIOCommandGenerator, does just that: it generates the batch file that has all the commands. It does depend on SQLIO.exe being in the same directory, as well as on a parameter file you have already defined for it to use.

params.txt

X:\SQLIO_testfile0.dat 8 0x0 150240

The first parameter is the test file name that SQLIO will create on start up, or use if it already exists. Second is the number of threads that will access that file. Third is the affinity mask. Fourth is the file size in megabytes. Make sure to size the file large enough to be representative of a real database you would be housing on the system. If it is too small it will simply fit in the RAID controller’s cache and give you inflated results. I also tend to use one thread per physical CPU core. Be careful though: if you are using a lot of files, having too many threads can cause SQLIO to run out of memory.
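To make the four fields concrete, here is a small Python sketch that splits one parameter-file line into named values and flags a test file small enough to fit in cache. It is not part of SQLIOTools; the class name, function names, and the 512 MB cache size are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SqlioTarget:
    path: str           # test file SQLIO creates or reuses
    threads: int        # threads that will access the file
    affinity_mask: int  # CPU affinity mask, e.g. 0x0
    size_mb: int        # file size in megabytes


def parse_param_line(line: str) -> SqlioTarget:
    """Split one parameter-file line into its four fields."""
    # Split from the right so a path containing spaces stays in one piece.
    path, threads, mask, size_mb = line.strip().rsplit(None, 3)
    return SqlioTarget(path, int(threads), int(mask, 16), int(size_mb))


def undersized(target: SqlioTarget, controller_cache_mb: int) -> bool:
    """True when the whole test file would fit in the controller cache."""
    return target.size_mb <= controller_cache_mb


if __name__ == "__main__":
    target = parse_param_line(r"X:\SQLIO_testfile0.dat 8 0x0 150240")
    print(target)
    # 512 MB is a made-up cache size, for illustration only.
    print(undersized(target, controller_cache_mb=512))
```

A check like `undersized` is a cheap guard against the inflated results described above.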

Calling SQLIOCommandGenerator:

SQLIOCommandGenerator 0.10
We assume -F<paramfile> -LS -d,-R,-f,-p,-a,-i,-m,-u,-S,-v, -t not implemented

Usage: SQLIOCommandGenerator [OPTIONS]

Generates the command line syntax for the SQLIO.exe program output into a batch file.
Options:
-f, --iopattern[=VALUE] Random, Sequential or Both
-k, --iotype[=VALUE] Read,Write or Both
-s, --seconds[=VALUE] Number of seconds to run each test 1(60) to 10(600) minutes is normal
-c, --cooldown[=VALUE] Number of seconds pause between tests suggested minimum is 5 seconds.
--os, --outstandingiostart[=VALUE] Starting number of outstanding IOs 1
--oi, --outstandingioincrament[=VALUE] Multiply Outstanding IO start by X i.e 2
--oe, --outstandingioend[=VALUE] Ending Number of outstanding IOs i.e. 64
--ol, --outstandingiolist[=VALUE] Specific Outstanding IO List i.e. 1,2,4,8,16,32,64,128,256,512,1024
--oss, --iosizestart[=VALUE] Starting Size of the IO request in kilobytes i.e. 1
--osi, --iosizeincrament[=VALUE] Multiply IO size by X in kilobytes i.e. 2
--ose, --iosizeend[=VALUE] Ending number of outstanding IOs in kilobytes – i.e. 1024
--osl, --iosizeList[=VALUE] Specific IO Sizes in kilobytes i.e. 1,2,4,8,16,32,64,128,256,512,1024
-b, --buffering[=VALUE] Set the type of buffering None, All, Hardware, Software. None is the default for SQL Server
--bat, --sqliobatchfilename[=VALUE] The name of the output batch file that will be created
-?, -h, --help show this message and exit

So I passed it this command:

SQLIOCommandGenerator.exe -k=Both -s=600 -c=5 --os=1 --oi=2 --oe=256 --oss=1 --osi=2 --ose=1024 -b=all --bat=c:\wes_sqlio_bat.txt -f=both

That generates this sample:

:: Generated by SQLIOCommandGenerator
:: This relies on SQLIO.exe being in the same directory.
:: c:\wes_sqlio_bat.txt c:\paramfile.txt c:\outputfile.csv "description of the tests"
:: param1 sqlio parameter file, param2 output of each test to single csv file, param3 test description
SET paramfile=%1
SET outfile=%2
SET runtime=600
SET cooloff=5
SET desc=%3
@ECHO OFF
ECHO ComputerName: %COMPUTERNAME% > %OUTFILE%
ECHO Date: %DATE% %TIME% >> %OUTFILE%
ECHO Runtime: %RUNTIME% >> %OUTFILE%
ECHO Cool Off: %COOLOFF% >> %OUTFILE%
ECHO Parameters File: %PARAMFILE% >> %OUTFILE%
ECHO Description: %DESC% >> %OUTFILE%
ECHO Test Start >> %OUTFILE%
ECHO Command Line: sqlio -kW -s%RUNTIME% -frandom -b1 -o1 -LS -BY -F%PARAMFILE% >> %OUTFILE%
sqlio -kW -s%RUNTIME% -frandom -b1 -o1 -LS -BY -F%PARAMFILE% >> %OUTFILE%
timeout /T %COOLOFF%
ECHO End Date: %DATE% %TIME% >> %OUTFILE%
:: This batch will take approximately 264.0014 Hours to Execute.

The batch file has the instructions for calling it and what parameters you can pass into it. You can omit seconds and cool down if you want to generate a more generic batch file.

This tool is flexible enough for my needs. I can generate specific targeted tests when I have data to back that up, or I can generate more general tests to feel out the performance edges.

You may have noticed the estimated run time; that is pretty accurate. This is a worst case scenario where you have chosen pretty much every possible test to run. I wouldn’t recommend this. With the data we already have, we can narrow down our testing to just a few IO sizes and queue depths to keep the test well within reason.

SQLIOCommandGenerator.exe -k=Both -s=600 -c=5 --ol=2 --osl=8,64 -b=None --bat=c:\wes_sqlio_bat.txt -f=both

This batch will take approximately 80.08334 Minutes to Execute.

Much better! By focusing on our IO targets we now have a test that is meaningful and repeatable.
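The size of a test run is simply the number of combinations multiplied by the seconds per test. The Python sketch below reproduces the roughly 80 minute figure for the targeted run. How the generator itself accounts for cool-down pauses and multiple buffering modes isn’t shown here, so the sketch ignores both; the function names are invented for illustration.

```python
def doubling_range(start: int, factor: int, end: int) -> list:
    """Values from start to end, multiplying by factor each step."""
    values = []
    value = start
    while value <= end:
        values.append(value)
        value *= factor
    return values


def count_tests(io_types: int, io_patterns: int,
                queue_depths: list, io_sizes: list) -> int:
    """Every combination of type, pattern, queue depth and IO size is one test."""
    return io_types * io_patterns * len(queue_depths) * len(io_sizes)


def run_minutes(tests: int, seconds_per_test: int) -> float:
    """Time spent inside SQLIO itself, ignoring cool-down pauses."""
    return tests * seconds_per_test / 60


if __name__ == "__main__":
    # Targeted run: read and write, random and sequential,
    # queue depth 2, IO sizes of 8 KB and 64 KB.
    targeted = count_tests(2, 2, [2], [8, 64])
    print(targeted, run_minutes(targeted, 600))   # prints: 8 80.0

    # Sweeping queue depth 1..256 and IO size 1..1024 KB instead:
    depths = doubling_range(1, 2, 256)    # 9 values
    sizes = doubling_range(1, 2, 1024)    # 11 values
    print(count_tests(2, 2, depths, sizes))       # prints: 396
```

Each extra queue depth or IO size multiplies the total, which is why trimming the lists to the values seen in production shortens the run so dramatically.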

Why would you want to repeat this test over and over? Simple, not all RAID controllers are created equal. You may need to adjust several options before you hit the optimal configuration.

Running The Tests

Now that I have my tests defined I need to start running them and gathering information. There are some constants I always stay with. One, use diskpart.exe to sector align your disks. Two, format NTFS with a 64k block size. Since I’m doing these tests over and over I wrote a little batch file for that too. Diskpart can take a command file to do its work. Once the RAID controller is in, I create an array and look at what disk number is assigned to it. As long as you don’t make multiple arrays you will always get the same disk number. After that I format the volume accordingly. WARNING: I do use /Y so the format happens without prompting for permission!

diskpart.txt

select disk 2

create partition primary align = 64

assign letter = X

testvol.bat

diskpart /S z:\diskpart.txt

format x: /q /FS:NTFS /V:TEMP /A:64K /Y
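The goal of the alignment step is that the partition starts on a boundary that is a whole multiple of the stripe unit. A minimal Python sketch of that check follows, assuming a 64 KB stripe unit to match the `align = 64` setting; the function name is hypothetical, and the 63-sector offset is the old default starting offset on older Windows versions.

```python
KB = 1024
SECTOR_BYTES = 512


def is_aligned(partition_offset_bytes: int, stripe_kb: int = 64) -> bool:
    """True when the partition starts on a stripe boundary."""
    return partition_offset_bytes % (stripe_kb * KB) == 0


if __name__ == "__main__":
    print(is_aligned(64 * KB))             # True: what "align = 64" asks for
    print(is_aligned(63 * SECTOR_BYTES))   # False: the old 63-sector default offset
```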

I also use the RAID controller’s command line interface, if it has one, to make it easier to construct the tests and just let them run using a batch file as a control file. If that isn’t possible don’t worry; the bulk of your time will be spent waiting for the test to complete anyway.

Gathering The Data

As you have guessed, I have a tool to parse the output of the tests and import it into SQL Server or export it as a CSV file for easy access in Excel. SQLIOParser is also pretty simple to use.

SQLIOParser 0.20

Usage: SQLIOParser [OPTIONS]

Process output of the SQLIO.exe program piped to a text file.

Options:

-c, --computername[=VALUE] The comptuer name that the test was executed on.
-s, --sqlserver[=VALUE] The SQL Server you want to import the data into.
-u, --sqluser[=VALUE] If using SQL Server authentication specify a user
-p, --sqlpass[=VALUE] If using SQL Server authentication specify a password
-t, --tablename[=VALUE] The table you want to import the data into.
-d, --databasename[=VALUE] The database you want to import the data into.
-f, --sqliofilename[=VALUE] The file name you want to import the data from.
-a, --sqliofiledirectory[=VALUE] The directory containing the files you want to import the data from.
-o, --csvoutputfilename[=VALUE] The file name you want to export the data to.
-?, -h, --help show this message and exit

It will work with a single file or import a set of files in a single directory. If you are importing to SQL Server you need to have the table already created.

CREATE TABLE [dbo].[SQLIOResults](
[ComputerName] [varchar](255) NULL,
[TestDescription] [varchar](255) NULL,
[SQLIOCommandLine] [varchar](255) NULL,
[SQLIOFileName] [varchar](255) NULL,
[ParameterFile] [varchar](255) NULL,
[TestDate] [datetime] NULL,
[RunTime] [int] NULL,
[CoolOff] [int] NULL,
[NumberOfFiles] [int] NULL,
[FileSize] [int] NULL,
[NumberOfThreads] [int] NULL,
[IOOperation] [varchar](255) NULL,
[IOSize] [varchar](255) NULL,
[IOOutstanding] [int] NULL,
[IOType] [varchar](255) NULL,
[IOSec] [decimal](18, 2) NULL,
[MBSec] [decimal](18, 2) NULL,
[MinLatency] [int] NULL,
[AvgLatency] [int] NULL,
[MaxLatency] [int] NULL
)

This is the same structure the CSV is in as well.
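To give a feel for what the parsing step involves, here is a Python sketch that maps the throughput and latency lines of one test onto the column names in the table above. It is not SQLIOParser’s code. The sample text is an assumption about the shape of SQLIO’s summary output and is not taken from a real run, so check the labels against your own output files.

```python
import re

# Sample text in the shape SQLIO is assumed to print; not from a real run.
SAMPLE_OUTPUT = """\
throughput metrics:
IOs/sec:   666.15
MBs/sec:     5.20
latency metrics:
Min_Latency(ms): 0
Avg_Latency(ms): 23
Max_Latency(ms): 200
"""

# Label in the SQLIO output -> column name in SQLIOResults.
LABEL_TO_COLUMN = {
    "IOs/sec": "IOSec",
    "MBs/sec": "MBSec",
    "Min_Latency(ms)": "MinLatency",
    "Avg_Latency(ms)": "AvgLatency",
    "Max_Latency(ms)": "MaxLatency",
}


def parse_metrics(text: str) -> dict:
    """Pull the throughput and latency figures out of one test's output."""
    row = {}
    for line in text.splitlines():
        match = re.match(r"\s*([^:]+):\s*([\d.]+)\s*$", line)
        if match and match.group(1) in LABEL_TO_COLUMN:
            column = LABEL_TO_COLUMN[match.group(1)]
            value = float(match.group(2))
            # Latency columns are INT in the table; throughput is decimal.
            row[column] = int(value) if column.endswith("Latency") else value
    return row


if __name__ == "__main__":
    print(parse_metrics(SAMPLE_OUTPUT))
```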

Analyzing The Results

I will warn you that the results you get will not match your performance 100% once the server is in production. This shows you the potential of the system. If you have horrible queries hitting your SQL Server, those queries are still just as bad as before. Generally, I ignore max latency and min latency and focus on the average. What I am most worried about is how the system will respond as the IO load changes or the queue depth increases. Remember, raw megabytes a second isn’t always king. The number of IOs at a given IO block size is also very important. I will go into great detail in the next article as I walk you through analyzing the results from my own system, so stay tuned for that.
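The relationship between the two numbers is simple: megabytes a second is the IO rate multiplied by the IO size. A quick Python sketch, using made-up results purely for illustration:

```python
def mb_per_sec(ios_per_sec: float, io_size_kb: int) -> float:
    """Throughput implied by an IO rate at a given IO size."""
    return ios_per_sec * io_size_kb / 1024


if __name__ == "__main__":
    # Hypothetical results, for illustration only.
    print(mb_per_sec(5000, 8))    # 39.0625 MB/sec from many small IOs
    print(mb_per_sec(1000, 64))   # 62.5 MB/sec from fewer, larger IOs
```

The second line wins on megabytes a second while doing a fifth of the IOs, which is why a result only means something alongside the IO size it was measured at.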

Final Thoughts

These tests aren’t the end of your road. I still advocate playing back traces and seeing how the system responds with your exact workload whenever possible. If you can’t do that then using tools like SQLIO is better than nothing at all. We are also working under the assumption that we are upgrading or replacing an existing production server. If that isn’t the case and this is a brand new deployment using SQLIO will help you know what your I/O system is capable of before you have a problem with bad queries or other issues that always crop up on new systems.

You can always do more testing. It is almost a never ending process. My goal isn’t to give you the end solution, just to give you another tool to pull out when you need it. As always, I look forward to your feedback!

Series To Date:
  1. Introduction
  2. The Basics of Spinning Disks
  3. The System Bus
  4. Disk Controllers, Host Bus Adapters and Interfaces
  5. RAID, An Introduction
  6. RAID and Hard Disk Reliability, Under The Covers
  7. Stripe Size, Block Size, and IO Patterns
  8. Capturing IO Patterns
  9. Testing IO Systems – You are here!

OT: Maximizing My Time, Using Technology

Well, I’ve taken the plunge. I have finally broken down and decided to dictate blog posts, articles, and other documents. Speech recognition has come a long way over the years, and it’s time I made use of it. Does that mean I’m giving up typing altogether? Not exactly. One of the biggest problems is learning how to speak in a way that isn’t exactly natural. It also changes the way I write.

One of the things that people who know me know is that I am dyslexic. My handwriting is horrible and my spelling is so poor that it is not unusual for the spellchecker to completely miss a word or two in each article I write. It also means that even when I see a word misspelled it looks perfectly okay to me. So, between the spellchecker missing words and my inability to spot incorrect words, I spend a lot of time editing and proofing all my work. For short e-mails this isn’t a big problem. But on average my blog posts are between 1000 and 2000 words, which leaves a lot of room for error. So, here we are, leaning on technology. It also means I can “write” while away from my computer via my Sony recorder.

I don’t know how well this is going to go, but with a minimum of training I was able to dictate this post via the crappy headset they ship with the software 🙂

Fundamentals of Storage Systems – Capturing IO Patterns

We often take the advice given to us on forums or in articles at face value. Even though the authors almost always say things like “your mileage may vary” or “may not apply to your situation” people still assume it is the gospel. Sometimes it is lack of experience. Other times it is just lack of knowledge on how to verify these things on your own. In this article I’m going to give you a tool to look at what SQL Server is doing at the disk level and allow you to make better decisions on how to configure your underlying disks.

The Basics

There are several things you need to know about how SQL Server accesses the database files and the implications of that before you can construct a proper testing methodology.

http://technet.microsoft.com/en-us/library/cc966500.aspx covers the basics. There are a few things I will highlight here.

ACID and WAL

ACID (Atomicity, Consistency, Isolation, and Durability) is what makes our database reliable. The ability to recover from a catastrophic failure is key to protecting your data.

WAL (Write-Ahead Logging) is how ACID is achieved. Basically, the log record must be flushed to disk before the data file is modified.
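The ordering rule is the important part, and a toy Python sketch shows it well enough. This is not how SQL Server formats or manages its log; the record layout and function name are invented for illustration. The only thing it demonstrates is that the log record is forced to disk before the data file is touched.

```python
import os


def wal_update(log_path: str, data_path: str, offset: int, new_bytes: bytes) -> None:
    """Apply one change using write-ahead ordering: log first, then data."""
    # 1. Describe the change in the log and force it to stable media.
    with open(log_path, "ab") as log:
        log.write(f"{offset}:{new_bytes.hex()}\n".encode("ascii"))
        log.flush()
        os.fsync(log.fileno())

    # 2. Only after the log record is durable, modify the data file.
    with open(data_path, "r+b") as data:
        data.seek(offset)
        data.write(new_bytes)
```

If the process dies between step 1 and step 2, the log still holds enough information to redo the change during recovery, which is what makes the durability promise possible.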

Stable Media

Stable media isn’t just the disk drive. A controller with a battery backed cache is also considered stable. Since SQL Server can request blocks as big as 64KB make sure your controller can handle that block size request in cache. Some older controllers only do a 16KB block or smaller.

FUA (Forced Unit Access)

With the requirement of stable media, SQL Server creates and opens all files with a specific set of flags. FILE_FLAG_WRITE_THROUGH tells the underlying OS not to use write caching that isn’t considered stable media. So, the local disk cache is normally bypassed. Not all hard drives honor the flag though; some SATA/IDE drives ignore it. Usually, the drive manufacturer provides a tool to turn off write caching. If you are using desktop drives in a mission critical situation, be aware of the potential for data loss. FILE_FLAG_NO_BUFFERING tells the OS not to buffer the file either. At this point the only cache available will be the battery backed or other durable cache on the controller.

File Access

SQL Server uses asynchronous access for data and log files. This allows IO requests to queue up and use the IO system as efficiently as possible. The main difference between the two is that SQL Server will try to gather writes to the data file into bigger blocks, while the log is always written to sequentially.

All of these rules apply to everything but tempdb. Since tempdb is recreated at every restart, recoverability isn’t an issue.

SQL Server data access patterns

Searching around you will find these generalities about SQL Server’s IO patterns:

Log Writes: Sequential, 512 bytes to 64KB
Data File Read/Writes: 8KB
Read ahead (more important to Enterprise Edition): 8KB to 125KB
Bulk Insert: 8KB to 128KB
Create Database: 512 byte, full initialize on log file only
Backup: Sequential Read/Write, 1 MB
Restore: Sequential Read/Write, 64K
DBCC CHECKDB: Sequential Read, 8K to 64K
DBCC DBREINDEX (Read Phase): Sequential Read (see Read ahead)
DBCC DBREINDEX (Write Phase): Sequential Write, any multiple of 8K up to 128K
DBCC SHOWCONTIG: Sequential Read, 8K to 64K

Now that we have an idea of what SQL Server is supposed to be doing, it’s time to verify our assumptions.

Capturing IO activity

There are a few tools that will allow you to capture the file activity at the system level. Process Monitor is a free tool from Microsoft that I will use to collect some baseline information. In its standard configuration Process Monitor captures a ton of stuff and uses the page file to spool the info to. So, before we begin we need to change the default configuration.

Capturing IO data using Process Monitor.

Filter to apply:

Process is sqlservr.exe
Operation is Read
Operation is Write

Columns to choose:

Process Name
PID
Path
Detail
Date & Time
Time of Day
Relative Time
Duration
TID
Category

Change Backing File.

The maximum number of events it will capture is 199 million. This is enough on my system to capture 12 hours of activity easily. Once we have a good sample you can save it off as an XML or CSV file. With CSV it is pretty easy to import the data into SQL Server using SSIS or your tool of choice.

I import the CSV into a raw table first.

Raw table to import into.

CREATE TABLE [SQLIO].[dbo].[pm_imp] (
  [Process Name]  VARCHAR(12),
  [PID]           INT,          -- process IDs can exceed the SMALLINT range
  [Path]          VARCHAR(255),
  [Detail]        VARCHAR(255),
  [Date & Time]   DATETIME,
  [Time of Day]   VARCHAR(20),
  [Relative Time] VARCHAR(50),
  [Duration]      REAL,
  [TID]           INT,          -- thread IDs can exceed the SMALLINT range
  [Category]      VARCHAR(6)
)

Next I create a cleaner structure with some additional information separated from the detail provided.

SELECT
[Process Name]       AS ProcessName,
PID                  AS ProcessID,
PATH                 AS DatabaseFilePath,
Detail,
[Date & Time]        AS EventTimeStamp,
[Time of Day]        AS TimeOfDay,
[Relative Time]      AS RelativeTime,
[Duration],
TID                  AS ThreadID,
Category             AS IOType,
substring(detail,charindex('Length: ',detail,0) + 8
,(charindex(', I/O',detail,0) - charindex('Length:',detail,0) - 8)) AS IOLength,
CASE reverse(left(reverse(PATH),3))
    WHEN 'mdf'
    THEN 'Data'
    WHEN 'ndf'
    THEN 'Data'
    WHEN 'ldf'
    THEN 'Log'
  END AS FileType
INTO   SQLIOData
FROM
  dbo.pm_imp
WHERE  reverse(left(reverse(PATH),3)) IN ('mdf','ndf','ldf')
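The string handling in that query is easier to follow outside of T-SQL. Here is a Python sketch of the same two steps: pulling the IO length out of the Detail text, and classifying the file by its extension. The sample Detail value is an assumption about what Process Monitor writes for a file read or write, and the function names are hypothetical.

```python
import re
from typing import Optional

# Assumed shape of a Process Monitor "Detail" value for a file read or write.
SAMPLE_DETAIL = "Offset: 1,048,576, Length: 8,192, I/O Flags: Non-cached, Priority: Normal"


def io_length(detail: str) -> int:
    """Return the IO size in bytes from a Detail string."""
    # Digits with optional thousands separators, e.g. 512 or 8,192.
    match = re.search(r"Length: (\d+(?:,\d{3})*)", detail)
    if match is None:
        raise ValueError(f"no Length field in: {detail!r}")
    return int(match.group(1).replace(",", ""))


def file_type(path: str) -> Optional[str]:
    """Classify a database file by extension, as the CASE expression does."""
    extension = path[-3:].lower()
    return {"mdf": "Data", "ndf": "Data", "ldf": "Log"}.get(extension)


if __name__ == "__main__":
    print(io_length(SAMPLE_DETAIL))
    print(file_type(r"D:\data\sales.mdf"))
```

Unlike the SUBSTRING version, this returns the length as an integer with the thousands separators removed, which makes it easier to sort and bucket.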

Once we have the data cleaned up a bit we can now start doing some analysis on it.

Queries for interesting patterns.

This query gives us our read and write counts.

SELECT
count(*) IOCount
,IOType
FROM
SQLIOData
GROUP BY IOType
ORDER BY count(*) DESC

This one shows us the size of the IO and what type of operation it is.

SELECT
count(*) IOCount
,IOLength
,IOType
FROM
SQLIOData
GROUP BY IOLength,IOType
ORDER BY count(*) DESC

This is a look at activity by file type data or log.

SELECT
count(*) IOCount,
FileType
FROM
SQLIOData
GROUP BY
FileType
ORDER BY
count(*) DESC

Since we are capturing the thread id we can see how many IO’s by thread.

SELECT
count(*) IOCount,
ThreadID
FROM
SQLIOData
GROUP BY
ThreadID
ORDER BY
count(*) DESC

We can also look at IO types, sizes and count by file helping you see which ones are hot.

SELECT
count(*) IOCount,
databasefilepath,
iotype,
iolength
FROM
SQLIOData
WHERE
databasefilepath LIKE '%filename%'
GROUP BY
databasefilepath,
iotype,
iolength
HAVING   count(*) > 10000
ORDER BY databasefilepath,
count(*) DESC

Now that we see exactly what our IO patterns are we can make adjustments to the disk subsystem to help scale it up or tune it for a particular pattern.
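If you would rather get a quick look before loading anything into SQL Server, the same count-by-size grouping can be run straight against the CSV export. The Python sketch below uses a few made-up rows with only the columns it needs; the paths, the Detail text format, and the function name are assumptions for illustration.

```python
import csv
import io
import re
from collections import Counter

# A few made-up rows using column names from the raw import table.
SAMPLE_CSV = r'''"Path","Detail","Category"
"D:\data\sales.mdf","Offset: 0, Length: 8,192, I/O Flags: Non-cached","Read"
"D:\data\sales.mdf","Offset: 8,192, Length: 8,192, I/O Flags: Non-cached","Read"
"D:\data\sales.mdf","Offset: 65,536, Length: 65,536, I/O Flags: Non-cached","Write"
"L:\log\sales.ldf","Offset: 512, Length: 512, I/O Flags: Non-cached, Write Through","Write"
'''


def count_by_size(csv_text: str) -> Counter:
    """Count IOs grouped by (category, size in bytes)."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        match = re.search(r"Length: (\d+(?:,\d{3})*)", row["Detail"])
        if match:
            size = int(match.group(1).replace(",", ""))
            counts[(row["Category"], size)] += 1
    return counts


if __name__ == "__main__":
    for (category, size), count in count_by_size(SAMPLE_CSV).most_common():
        print(category, size, count)
```

For a real capture, replace `io.StringIO(csv_text)` with an open file handle so the rows stream through rather than being held in memory.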

This is just another tool in your tool belt. It is a supplement to using fn_virtualfilestats to track file usage. I use it to get a better idea of the size of the IOs being issued. Using these two tools together, I can see the size of the IOs in a window of time that is reported by my fn_virtualfilestats capture routine.

Always verify your assumptions, or advice from others.

Series To Date:
  1. Introduction
  2. The Basics of Spinning Disks
  3. The System Bus
  4. Disk Controllers, Host Bus Adapters and Interfaces
  5. RAID, An Introduction
  6. RAID and Hard Disk Reliability, Under The Covers
  7. Stripe Size, Block Size, and IO Patterns
  8. Capturing IO Patterns – You are here!
  9. Testing IO Systems

Why does change control for SQL Server have to be so hard?

I’ve been dealing with change control and source code repositories for most of my professional career. While I’ve seen change control and integration advance steadily for writing programs, it feels like the database part of things is just stuck in the stone age. For months now I’ve been researching solutions for source control, change management, and deployment of database objects. The conclusion I’ve come to is there is no solution. Well, no easy solution. I was very happy in the early days of SQL Server 2005 when they announced source control integration into Management Studio. It was a great pain for me personally to have Visual Studio and the solution architecture it offered, and not have that on the database side of things. Alas, it wasn’t meant to be. What they meant by source control was using the previous generation of integration and then crippling it.

Really?

image

This doesn’t look like much of a solution to me.

I know what most of you are thinking. If you have Visual Studio, use it. That works for me but not for the people on my team who only have access to SSMS. It also means I have to jump between two tools to do one thing: work with SQL Server. I have been told that Microsoft is basically pushing you to Visual Studio for all of your development needs, leaving SSMS as a management tool only. If Visual Studio did everything SSMS did it wouldn’t be that big a deal for me personally.

Options Available

SQL Server Management Studio Hacks

I tried several things to work around the limitations SSMS has. I found you could manually edit the solution file to get extra folders. The only problem with that is they all show up as either Queries or Miscellaneous. Other than that one and the old fix for sorting files by name, there aren’t any other hacks I can find.

Toad for SQL Server

Toad generally has a nice look and feel. It has all the development and management features to be a true replacement for Management Studio. I tried all the normal things that I do in SSMS in Toad and several things were better. The debugger was nice and the statement optimizer is also a nice addition. It does fall flat in some basic key areas. I never could get it to display an execution plan. As a T-SQL guy, the plan is a must. I know it is a bug somewhere. Having something this fundamental fail during an evaluation is a big red light though.

The other downside is that it doesn’t support SourceGear Vault/Fortress, which is a real shame. Lots of SMBs use Vault for source control since it is miles better than Visual SourceSafe and much cheaper than Team System.

ApexSQL Edit

That left only one other contender in this fight. ApexSQL Edit has been around quite a while as well. Initially, it has a similar look and feel to Toad. I know there isn’t a lot you can do about that since both look like Office. It is also missing the management features, but I can live with that. The goal is to get the developers a tool they can develop in and use our code repository easily. ApexSQL Edit did include support for Vault and it worked as expected. Again, I started using it daily like I would SSMS. Everything I tried worked, for the most part. 95% of the time it would generate an execution plan. Not as clean as SSMS, but it had more options on how to display the plan, which I liked. I did have a few crashes, but this was a beta build and I will let that go until I test the full release. Since this was a beta I did provide feedback, and initially the folks at ApexSQL were very responsive. Eventually though, everything just went quiet except for the sales guys asking me how things were going. Right now they are a no go until the stability issues are addressed and the RTM is out so I can do a full evaluation again.

Final Thoughts

What I hoped would be a pretty easy exercise turned out to be a real workout. For all of SSMS’s problems it is stable and familiar. I was really hoping that either Toad or ApexSQL Edit would solve my problems. I haven’t given up on ApexSQL Edit yet; we will just have to play the waiting game and keep using an inadequate solution until someone comes up with something better.