Book Review: SQL Server 2012 High-Performance T-SQL Using Windows Functions

I’ve been extremely lazy about reading technical books for the past few months. Primarily I’ve been reading SQL blogs and haunting the SQLServerCentral forums instead. While still informative, I have a ton of SQL Ebooks that demand to be read. Part of my New Year’s Resolution was to start reading both technical books and novels again…so I’m on my fourth novel! However, I just finished my first technical book of the year. I read the first chapter of a few books before settling on something “easy”.

SQL Server 2012 High-Performance T-SQL Using Windows Functions by Itzik Ben-Gan is actually a very short for a SQL book, coming in at only 210 pages. With such a lengthy title, it’s shocking that the book is so short and concise. It is edited well, having a tiny errata containing only one technical mistake on a sample script and two minor typos. This is awesome considering many books I’ve read are plagued with gigantic errata. There is also a companion website where you can download source code and the entirety of Chapter 1.

TSQLFunctionsChapter 1 – SQL Windowing
Pages: 32

The introductory chapter provides a great overview of what the book has in store, making it a perfect sample download. It explains the options available for window functions and which functions are new in SQL 2012 (LAG, LEAD, etc.). The chapter provides some quick examples to whet your appetite with tantalizing hints of more details to come. The chapter is kept light enough that you want to keep reading while still providing useful information, all the while promising great things to come.

Highlights:

  • Programming options explained – Set Based, Window Functions, and Iterative
  • Preview of Chapter 5 in the form of sample scripts and situations
  • Elements of window functions – Partitioning, Ordering, and Framing

Chapter 2 – A Detailed Look at Windows Functions
Pages: 48

This chapter goes into explains partitioning and framing. If you are even slightly fuzzy on the concept, this is a very important section. I was surprised at just how much detail the author went into to explain the technicalities.  On the other hand, it’s a deep dive that could overwhelm someone trying to read the book with very little prior knowledge of window functions. In that case, I suggest you skim the chapter and come back to it after you better understand what window functions can do to save your queries.

The thing that bothers me is how the author melds theoretical SQL window functions in with the TSQL supported functions. I appreciate that he mentions ANSI standards, but if it’s not directly supported in TSQL, I don’t really want to learn about it in detail while reading about TSQL functions. Now I’m afraid that I will try using NTH_VALUE and then get horribly disappointed when I remember it’s not supported. Separating the functions that are not supported by SQL Server into a different section or chapter might have been the better route here.

Highlights:

  • Aggregate Functions – Distinct and Nested
  • Ranking Functions – Row_Number, Ntile, Rank, Dense_Rank
  • Distribution Functions – Percent_Rank, Cume_Dist
  • Offset Functions – Lag, Lead, First_Value, Last_Value

Chapter 3 – Ordered Set Functions
Pages: 20

The explanations in this chapter get very “mathy”. If equations scare you, you might need to read quickly, or just look at the general solutions and move on. Basically he provides script alternatives and the math necessary to get those alternatives to work. Very cool; a bit complex.

Highlights:

  • Hypothetical Set Functions – Ranking and Distributions
  • String Concatenation – Using Coalesce, Stuff, and XML path

Chapter 4 – Optimization of Windows Functions
Pages: 32

If you don’t understand query execution plans, a lot of this chapter will go over your head. It even starts with such a warning. There are numerous examples of execution plans and each example explains why it is slow or fast based on index scans, seeks, parallelism, etc. Sample data is provided for testing.

Columnstore indexes are mentioned, but for window functions to perform best, normally these should not be used due to sorting issues. It’s not like many people are using them yet anyways.

The detailed information on framing was my favorite part of this chapter. I’ve used ROWS UNBOUNDED PRECEDING to calculate running totals in SQL 2012 with great success, but I had never studied the particular differences between ROWS and RANGE (hint, stick with ROWS by default) or considered the many applications of the BETWEEN operator for framing, especially for pulling specific rows quickly using LEAD or LAG.

Highlights:

  • Indexing Guidelines for Window Functions
  • Improving Parallelism using Apply
  • Optimizing Functions using Framing (added in SQL 2012)
  • Distribution Functions – Rank and Inverse optimization

Chapter 5 – T-SQL Solutions Using Windows Functions
Pages: 78

Some people will skip directly to this chapter, and really, once you have read the book, this is most likely the section you will flip to when you need a trick query.  The format of this chapter is: Introduce an issue; provide a set based, iterative, and windows function solution; then compare the time and effort to do each.

It’s a great chapter to glance over once, and then keep mental notes of what it contains for when you need  to do some odd math, like calculate data islands or gaps.

Highlights:

  • Number Table
  • Date Sequences
  • Paging
  • Removing Duplicates
  • Pivoting
  • Top N
  • Mode
  • Running Totals
  • Intervals
  • Gaps
  • Islands
  • Median
  • Conditional Aggregate
  • Sorting Hierarchies

Summary

Overall, this was a great read that I recommend to anyone who writes TSQL. Although I’ve worked with window functions for a few years and have experience using the new SQL 2012 functions, the exhaustive detail that the author went into is important for tuning, especially considering the relatively short reading investment necessary. The last chapter will continue to be a great resource for rare query requests, especially since it  provides non-window function alternatives – in case a query has to run against an older version of SQL Server.

Advertisements