
SQL In The City: Austin

So, No SQL Saturday in Austin This Year.

I know a lot of folks were disappointed that we (POSSE) weren’t able to pull together in time for a SQL Saturday in Austin this year. We are shooting for a spring date and I’ll be posting more about that in the next couple of weeks.

SQL In The City To The Rescue!

Red Gate Software, a sponsor of last year's SQL Saturday in Austin, is doing something different. They have done a few of these events in the UK, but now they are taking the show on the road! They have six events planned for us here in the States, and we are one of the first stops on the tour. On October 1st they will be taking over the AT&T Executive Education Conference Center in downtown Austin, TX. This is a first-class facility.

So, What's The Catch?

Pretty much the same catch as any other free training event. The exception to the model is that there is only one vendor footing the bill. You still get some of the best training at any price and get to meet the Red Gate people who make some of the best tools for our platform. This isn't a fluff marketing event. Red Gate has constantly and consistently supported the community over the years, between Simple Talk, SQLServerCentral.com, and employing some of the smartest guys in our industry like Steve Jones and Grant Fritchey. To put it mildly, they have supported the community that supports them.

Great! So Tell Me More…

Besides myself, Steve, and Grant, they have invited a host of other smart speakers. Tim Radney is a chapter leader and PASS Regional Mentor who has been hitting the SQLSaturday circuit and is highly rated. Jim Murphy, someone who has become a good friend and a fellow chapter leader in Austin running the CACTUSS Central group, is a veteran with SQL Server and also an excellent speaker. And my other friend and MVP Aaron Nelson will be on hand. We also have .NET veteran Rob Richardson joining our motley band. That's on top of the Red Gate insiders who build these ingeniously simple tools.

Now It’s Your Turn!

Go and register for SQL In The City Austin, TX today!

Demystifying SQL Server Differential Database Backups

Odd Man Out

SQL Server has three backup types. Two you have heard of and used. One, while useful, isn’t very well understood.

Let’s start with a technical recap of the three backup types for SQL Server.

1. Full Database Backup

When you request a full backup, SQL Server dumps all the data pages from your database, metadata about how your database is stored on disk, and finally enough of the transaction log to bring the database back into a consistent state. There are a few things you need to know about full backup semantics. When you take a full backup, it makes a few changes, which are tracked in two places in the database and one in MSDB. The changes tracked in the database are what allow us to then use transaction log backups and differential database backups. The data logged to MSDB isn't critical for restoring your backups, but it does make it much easier to do so. Full backups are considered our "base" backup type; every other backup type can use a full database backup as its base. Even though a full backup does capture some of the transaction log, it doesn't clear the log. If you are in simple recovery mode, the normal checkpoint process will clear the log. If you are in bulk-logged or full recovery mode, you will need to take a transaction log backup to clear the log.
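
To make that concrete, here is a minimal sketch of a full backup. The database name and path are made up for the example:

-- full backup; becomes the base for differential and log backups
-- (MyDB and the backup path are hypothetical)
BACKUP DATABASE MyDB
TO DISK = N'D:\Backups\MyDB_full.bak'
WITH CHECKSUM, STATS = 10;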

2. Transaction Log Backup

Transaction log backups are a critical part of any recovery plan. Without them you can't restore up to the minute. If your database is in anything other than simple recovery mode, your only supported option to clear the log is a transaction log backup. Transaction log backups are serial by nature: each log restore depends on either a full or differential backup for its base, plus every log backup taken before the one you wish to apply.
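
A minimal log backup looks something like this, again with made-up names and paths:

-- transaction log backup; requires full or bulk-logged recovery
-- and at least one full backup taken first
BACKUP LOG MyDB
TO DISK = N'D:\Backups\MyDB_log_0815.trn'
WITH CHECKSUM, STATS = 10;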

3. Differential Database Backup

Like a full database backup, the differential backs up data pages and enough of the transaction log to bring the database back into a consistent state. Unlike full or transaction log backups, the differential backup captures all changes since the last full backup occurred. The information on changed data pages is stored internally in the database and doesn't require any information from MSDB. The map of changed data pages only gets reset on the next full backup; transaction log backups and other differential database backups will not reset the changed data map. You can think of transaction log backups as incremental backups. People coming from a systems administration background can get tripped up and treat differential backups like incremental backups. That wastes significant time when restoring your database, since you only need to apply the full backup and the most current differential (or whichever differential you are interested in) to get your database back into a recovered state.
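
Taking a differential is just a full backup with one extra option. A quick sketch, with hypothetical names again:

-- differential backup; captures every data page changed since the last full
BACKUP DATABASE MyDB
TO DISK = N'D:\Backups\MyDB_diff_tue.bak'
WITH DIFFERENTIAL, CHECKSUM, STATS = 10;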

Understanding Differential Database Backups

Most people are put off by differential backups, mainly due to the amount of space they can grow to and the extra complexity they can add to your recovery plans. If you don't manage them, you can quickly run into a differential that is larger than the full it is based on. Also, any data page alteration is tracked. For example, if you take a full backup and then perform a full index reorganization on a heavily fragmented index, you can end up with very large differential backups. File shrinks with full reorganizations have the same effect. Even though the actual data hasn't changed, you end up with differential database backups that are unwieldy and difficult to manage. If you miss a full backup in your schedule, your differentials again may grow larger than your full backup.

There are several cases where differential database backups are key to recovering your database quickly and with as little data loss as possible. Let's take a look at a few scenarios.

Shortening Recovery Time

This alone should be reason enough for you to investigate differential backups. Every restore operation has a cost in time associated with it. Remember, even if a transaction log backup is virtually empty, there is a cost in time to spool up and tear down the restore session for each log backup you apply, not to mention replaying the transactions in the logs. In many cases, it can be much faster to apply a differential backup than multiple transaction log backups. By skipping all the data manipulation and just replacing the altered pages, you reduce the amount of IO required to restore.

Database in Simple Recovery Mode

There may be situations where you aren't concerned with up-to-the-minute recovery but still need something better than weekly full backups to meet your recovery goals. Differentials fit in well here. By leveraging differential backups, you can take a single full once a week and daily differentials to cut down on the space needed to store your backups. Also, since differential backups contain all the changes since the last full, to recover you only need the full backup and the differential backup from the point in time you want to restore to. I recommend keeping your differentials just like you would your transaction log backups, so that if you need to recover your database into another environment, or if you suffer corruption in one of your differentials, you still have as much data as possible to restore.
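
As a sketch, the restore in this scenario is just two steps. File names here are made up:

-- restore the weekly full but leave the database ready for more restores
RESTORE DATABASE MyDB
FROM DISK = N'D:\Backups\MyDB_full.bak'
WITH NORECOVERY, REPLACE;

-- apply the differential for the day you want and bring the database online
RESTORE DATABASE MyDB
FROM DISK = N'D:\Backups\MyDB_diff_tue.bak'
WITH RECOVERY;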

Large Database with Minimal Data Change

With today's large disk capacities, it isn't unusual to see multi-terabyte databases with years of data in them. Moving our full backup schedule out to every two weeks or every month and supplementing with differentials is an excellent way to conserve backup space and shorten time to recovery. Again, we only need the last full, the last differential, and any transaction logs taken after the differential to get us back up to the minute.
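
Up-to-the-minute recovery in this setup looks roughly like the sketch below, assuming the damaged database's log is still readable. All file names are hypothetical:

-- grab the tail of the log before restoring anything
BACKUP LOG MyDB
TO DISK = N'D:\Backups\MyDB_tail.trn'
WITH NORECOVERY;

-- last full, last differential, then only the logs taken after the differential
RESTORE DATABASE MyDB FROM DISK = N'D:\Backups\MyDB_full.bak' WITH NORECOVERY;
RESTORE DATABASE MyDB FROM DISK = N'D:\Backups\MyDB_diff.bak' WITH NORECOVERY;
RESTORE LOG MyDB FROM DISK = N'D:\Backups\MyDB_log_1.trn' WITH NORECOVERY;
RESTORE LOG MyDB FROM DISK = N'D:\Backups\MyDB_tail.trn' WITH RECOVERY;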

Increasing Recoverability

If you only take a full database backup once a week and transaction log backups every 15 minutes, you could end up applying over 670 logs to get your database back online if you have a failure at the end of the week. If you have any errors in one of the transaction log backups, everything after that is pretty much useless to you. If it dies at backup 599, it may not be the end of your business; if it is log 38, it could mean a week's worth of data gone. Since differential backups don't break the LSN chain and transaction log backups don't reset the changed data map, you can use either backup type even when the other has had an error. Differentials allow us to bridge gaps in our transaction log chain, since we can apply any transaction logs taken after the differential backup. This is one of the real strengths of differential backups. So, if you are doing weekly full backups, daily differential backups, and transaction log backups every 15 minutes, you are covered in two ways. Normally, you would restore the full, then the latest differential backup, followed by any additional transaction logs. If you had a differential backup corrupted but your transaction logs were fine, you could still restore fully.
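
For example, say log backup 38 is corrupt but a differential was taken Wednesday night, after that log. A sketch of the bridge, with made-up file names:

-- the differential doesn't depend on the log chain, so it jumps past the bad log
RESTORE DATABASE MyDB FROM DISK = N'D:\Backups\MyDB_full.bak' WITH NORECOVERY, REPLACE;
RESTORE DATABASE MyDB FROM DISK = N'D:\Backups\MyDB_diff_wed.bak' WITH NORECOVERY;
-- then apply any log backups taken after the differential, ending WITH RECOVERY
RESTORE LOG MyDB FROM DISK = N'D:\Backups\MyDB_log_thu_0800.trn' WITH RECOVERY;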

Repairing Log Shipping

Another great use of differentials is repairing your log shipped databases. If anything happens to the LSN chain, in most cases the only way to repair your log ship target is to start over from a full backup and then apply all the logs to bring it back up to current. If this is a large database, or if there are a lot of transaction log backups to recover, this could leave you exposed for quite a while. Instead, you can take a differential backup, apply that to the log ship target, and restart your log shipping from that point. I have used this technique successfully over the years when network outages caused our log ship targets to fall way behind, cutting catch-up time from hours to minutes.
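
The repair itself is only a couple of statements, something like this (server and file names are made up):

-- on the primary: take a differential covering everything since the last full
BACKUP DATABASE MyDB
TO DISK = N'\\primary\Backups\MyDB_repair.dif'
WITH DIFFERENTIAL;

-- on the log ship target, which is already in a restoring state:
-- apply the differential and stay in NORECOVERY
RESTORE DATABASE MyDB
FROM DISK = N'\\primary\Backups\MyDB_repair.dif'
WITH NORECOVERY;
-- log shipping then resumes with the next log backup taken after the differential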

Final Thoughts

Incorporating differential backups will add complexity to your backup strategy, but the benefits can be staggering. Between the storage savings and the reduction in recovery time, it's clear that differential backups should be in your tool belt. I would also encourage you to practice restoring using your differential backups. Try out different failure scenarios, like failed transaction log or differential backups. Make sure you understand how to restore up to the minute and how to stop at a specific time now that you have differentials in the mix.

This is a re-blog of an article I wrote for SWWUG on April 19th, 2012.

Free Tools for Testing SQL Server

You Need To Test Your SQL Server!

It's like screaming "TEST THE BRAKES ON YOUR CAR!" If you don't have the tools or the know-how, it is just good advice you can't follow. I've done a TON of benchmarks over the years using every tool imaginable, including ones I've written myself. There are several barriers to generating a valid, repeatable benchmark in your own environment. Let's take a look at the general criteria I use when testing OLTP systems and the benchmark tools available.

I’ve Been Spoiled

I'm going to come clean. I really like Quest Software's Benchmark Factory. They have put years of work into the tool set. It is also one of the few database benchmark tools that runs against most major RDBMS platforms. This alone made it my go-to tool for a very long time. Also, since it implements most of the TPC Council benchmark specifications, when someone asks how I tested I can hand them reams of information on the benchmark in question. TPC also has a well-defined metric for determining not just raw performance but a dollar amount to get that performance. The catch is that Benchmark Factory isn't an apples-to-apples comparison with the published TPC results. To assume so would be a horrible mistake on your part. If you have ever read through a full disclosure (I read them to my four-year-old to put him to bed), you know that it would be very difficult for your organization to implement the exact same test on the exact same software without spending some major bucks. I have worked on a few projects that were willing to spend the money and bring in a team to implement the testing protocols. If a project is going to run, say, ten million, your boss may be willing to spend 100,000 of that to completely test the system before it goes into production. In most cases, though, that isn't an option. Even spending the money on Benchmark Factory may be outside of your budget. So, we turn to free or open source solutions, or sometimes we just write our own.

Other Options

There are several other options for testing database servers. I've used a few over the years. First, I'm limiting this list to tools that run on Windows; they may run on other platforms, but that isn't important to me. Second, the test setup has to be repeatable. Lastly, there has to be a way to measure performance. In some cases that may mean using perfmon (Windows Performance Monitor), or the tool may capture some meaningful metrics itself.

Database Hammer (SQL Server Resource Kit)
http://www.sqlmag.com/content1/topic/put-the-hammer-down/catpath/configuration

Not exactly free or cutting edge, but it was good at putting a load on SQL Server. Even though this is VERY old, due to its simplicity it works on pretty much every version of SQL Server from 2000 on. Again, one of the downsides is capturing metrics. The other downside is finding the dang thing. If you have an MSDN subscription you can still download it, or you can head over to your local used book store and hope the CD-ROM that came with the book isn't toast.

Eye On SQL – Load Test
http://eyeonsqlloadtest.codeplex.com/

A basic tool to put a database under load. There are no predefined tests, and it doesn't gather more than the most basic metrics. It is simple to use and can be a replacement for something like Database Hammer.

TPCCBench
https://github.com/SQLServerIO/TPCCBench

This is a tool I wrote a while back that implements TPC-C. It is in C#, so anyone who is familiar can work on the code and peek inside to see how I implemented my version. The downside is it only implements one test, TPC-C, and TPC-C has several limiting factors. The one I never liked was that all the data generated was random, all of it. If it was a name, it was a random string. This leads to a database that effectively has 100% cardinality and is also a very poor choice for testing database compression. My goal was to implement TPC-E, which overcomes all of these issues but is fairly complex and hard to get right. I fell back to Benchmark Factory and never moved forward with this tool.

Hammerora
http://hammerora.sourceforge.net/

This is a pretty simple tool to use and has an easy-to-understand GUI. It supports two benchmarks: a TPC-C style OLTP benchmark and a TPC-H style OLAP benchmark. That is also one of the downsides; it only supports two benchmarks, and both are a bit long in the tooth. It also supports various flavors of SQL Server, Oracle, MySQL, and Postgres. This is a nice little bonus if you are testing like hardware and OS and want to vary the RDBMS. The last bonus, for me personally, is that it is open source. This means I can look at the code and see how each test is implemented.

DVD Store Database Benchmark
http://en.community.dell.com/techcenter/extras/w/wiki/dvd-store.aspx

This is a tool I'm currently working on to bridge the gap between TPC-C and TPC-E. It was written in house by Dell and is now developed and maintained by Dell and VMware. It is more complex than TPC-C but not as complex as TPC-E. It is open source, which was a huge plus for me and allowed me to fork it over at https://github.com/SQLServerIO/DVD-Store-Database-Benchmark and tweak it to meet my needs. I'll have a full blog post soon explaining the benchmark in detail and what changes I have made to it. It is a staple for testing databases on ESX. It has a standard way of gathering metrics, and the load generator is actually in C#. You can also do a full stack test, since it will work against a website that is also included in the benchmark. The downside, again, is that it only implements one benchmark.

Description of the Replay Markup Language (RML) Utilities for SQL Server
http://support.microsoft.com/kb/944837

This is a tool set that was initially developed in house at Microsoft to test customer scenarios. It is different from the other tools since it is a replay tool, but you can capture metrics and vary your workloads. The big downside is it isn't easy to use. Also, if you are testing new hardware and don't have a trace from production to replay, you are back to square one.

Apache JMeter
http://jmeter.apache.org/

Not a database stress tool in the strictest sense of the word, but you can use it that way. The upside of JMeter is that it will test your whole stack: web servers, middle tiers, and database servers. You can use it to drive pretty much anything. Since it is written in Java, it runs on Windows, and since there is a JDBC driver for SQL Server, you can use it directly to test your database as well. The downsides are also huge. It is written in Java, for starters. It also uses the JDBC stack, and that can be a limiting factor. It is a significant undertaking to set up as well. I've included some links to people who have set it up to test databases and an ASP.NET web site.

http://ilkinbalkanay.blogspot.com/2010/03/load-testing-relational-databases-with.html
http://blog.technicallyworks.com/2009/06/load-testing-aspnet-sites-with-jmeter.html
http://blog.technicallyworks.com/2009/06/load-testing-aspnet-with-jmeter.html

Now What?

Well, that is easy: start reading, building, and testing. I personally use something like Eye On SQL or Hammerora to do an initial smoke test or break-in, then move over to the more complicated benchmark tools. I have pretty much settled on the DVD Store benchmark for now as my OLTP testing tool. Now I need to find something more up to date than the TPC-H test everyone else has implemented so I can round out my kit with a solid OLAP benchmarking tool. Maybe I'll write something again!

Speaking at PASS Summit 2012

It’s Not A Repeat

Speaking at the PASS Summit last year was one of the highlights of my career. I had a single regular session initially and picked up an additional session due to a drop in the schedule. Both talks were fun and I got some solid feedback.

The Boy Did Good

I won't say great; there were some awesome sessions last year. But I did do well enough to get an invite to submit for all the "invite only" sessions. I was stunned. I don't have any material put together for a half-day or full-day session yet, and the window to submit sessions was a lot smaller this year. But I do have three new sessions, and all of them could easily be extended from 75 minutes to 90 minutes. So, I submitted for both regular sessions and spotlight sessions and got one of each! WOO HOO!

The Lineup

I’ll be covering two topics near and dear to my heart.

How I Learned to Stop Worrying and Love My SAN [DBA-213-S]
Session Category: Spotlight Session (90 minutes)
Session Track: Enterprise Database Administration & Deployment

SANs and NASs have their challenges, but they also open up a whole new set of tools for disaster recovery and high availability. In this session, we’ll cover several different technologies that can make up a Storage Area Network. From Fibre Channel to iSCSI, there are similar technologies that every vendor implements. We’ll talk about the basics that apply to most SANs and strategies for setting up your storage. We’ll also cover SAN pitfalls as well as SQL Server-specific configuration optimizations that you can discuss with your storage teams. Don’t miss your chance to ask specific questions about your SAN problems.

I’ve built a career working with SAN and System Administrators. The goal of this session is to get you and your SAN Administrator speaking the same language, and to give you tools that BOTH of you can use to measure the health and performance of your IO system.


Integrating Solid State Storage with SQL Server [DBA-209]
Session Category: Regular Session (75 minutes)
Session Track: Enterprise Database Administration & Deployment

As solid state becomes more mainstream, there is a huge potential for performance gains in your environment. In this session, we will cover the basics of solid state storage, then look at specific designs and implementations of solid state storage from various vendors. Finally, we will look at different strategies for integrating solid state drives (SSDs) in your environment, both in new deployments and upgrades of existing systems. We will even talk about when you might want to skip SSDs and stay with traditional disk drives.

I've spoken quite a bit on solid state storage fundamentals. This time around, I'll be tackling how people like myself, and vendors, are starting to mix SSDs into the storage environment: where it makes sense, and where it can be a huge and costly mistake.

Finally

I hope to see you at the Summit again this year! Always feel free to come say hi and chat a bit. Networking is as important as the sessions and you will build friendships that last a lifetime.

2011 What A Great Year

What A Difference A Year Makes

In January I posted my new year's list: basically, stuff I wanted to learn over the following twelve months. Let's take a quick look at last year's goals.

SQL Server Stuff

1. Partitioning
Win
Done, and pretty early on. I did a lot of reading and did some partitioning setups in my home lab and feel much better about partitioning.

2. Change Data Capture
Draw
This was a business need. At my old company we were using triggers everywhere, and CDC seemed like a good fit for most of that. I did some reading and test setups but didn't get much farther than that. Not a 100% win, but I did learn stuff.

3. Replication
Win
Ongoing. Since I've changed jobs, replication moved up the list as CDC fell down it. I've geared back up and feel like I am back on solid ground.

4. Analysis Services Administration
Fail
Again, casualty of moving jobs. I have ZERO interaction with Analysis Services and as such, didn’t do much on this front.

.NET Programming Stuff

1. LINQ
Draw
I did do some stuff with the more general LINQ to Objects, and now that I have moved jobs I actually have more exposure to LINQ, so I'll get to learn more in the new year.

2. Parallel Programming in .NET
Draw
Didn't do much with PLINQ, but I did work a bit with the new async framework stuff. My biggest compliment is that it isn't horrible. I don't think it is as slick as, say, node.js, but it is light years ahead of the early 2.0 days and threading.

3. Entity Framework
FAIL
Just couldn't do it. Partly due to the job change and partly due to a complete lack of caring. I just can't get jazzed up about it. I've worked with a lot of ORMs, and they all have strengths and weaknesses.

90 Degrees From Center

1. Get Better With Python
Win
I did a few little things in Python on a few different platforms, like the Google App Engine. I mostly use it at home. There is a large user base, and it does make lots of programming tasks much easier, so I'll hang on to Python for a while longer.

2. Pick Up Lisp again
EPIC FAIL
I spent about two days banging on Lisp again and realized why I don't use Lisp today. There just isn't enough support for it in the wild. There are lots of flavors and variations on Lisp, but to gain any real proficiency you really need to focus on one dialect and master it. I just couldn't pick one and settle. Eventually, that time went over to Python.

Changes, Changes and Changes

This seems to be a constant in my life. Admittedly, most of it is self-inflicted, but I wasn't expecting the amount of stuff that happened to me personally and professionally this year.

Let's make the short list:
New House
New Job
Spoke at SQLRally
Pulled off SQLSaturday #97
MVP
Spoke at PASS Summit

Yep, crazy year, but a good one for sure.

Thank You

I don’t say that enough. The community has been great to me this year and I can’t wait to see what next year holds!

See you next year!