SQL Server Storage: Pages and Extents

It’s time for another SQL Server refresher today! This time we will discuss some storage basics, specifically Pages and Extents and how they relate to each other. There are a lot of resources out there discussing these storage units, but I’ve tried to put my own spin on things and aggregate as much data as I could find about them.

Pages

Naturally we are going to discuss pages first, since they are the most fundamental unit of storage for SQL Server. Pages store everything in the database and are only 8 KB small. Your entire disk I/O is performed at the page level.

Page

An example data page layout is easier to visualize than explain

Pages have three major components, a page header, records, and the offset array. Pages start with a 96 byte header which contains meta-data, like the page number, owner’s object id, and page type. Pages end with the offset array which is 36 bytes and has pointers to each new row stored in the page.  These pointers are stored last to first, but that’s more easily explained in the picture. The offset array is essentially the index for the page. The middle of the page is the records, and consists of the remaining 8060 bytes containing stored data.

There are different types of pages, such as data, index, image, and a number of informational pages. Probably the most interesting type are overflow pages. If a row is greater than 8060 bytes, the data can be stored on overflow pages which are linked together. Overflow pages can store as much as 2GB in a single column, but obviously this is less than ideal. The performance impact increases since each extra page increases read times. The most obvious example of this situation is VARCHAR(MAX) or VARBINARY(MAX) datatypes. Data type limitations normally relate directly to the size of a page, (MAX) datatypes effectively bypass the limit and cause overflow pages. For instance, VARCHAR(8000) and NVARCHAR(4000) are the normal limits and based on the size of a single page. Using (MAX) datatypes that span multiple pages increases reads causing less than stellar performance. Queries like SELECT * can grab poorly performing datatypes accidently and should be avoided as much as possible.

Extents

Extent

Extents can be Uniform and have all the same type of pages or Mixed with a combination of page types

Simply put, extents are groups of pages. Extents consist of exactly eight contiguous pages, with every page being part of an extent. In other words, pages always come in groups of eight, so data grows in a minimum of 64 KB increments. Unlike the many types of pages, there are only two types of extents.

Mixed Extents: In these extents, pages are allocated to multiple objects, or different types of pages. New tables or indexes are put into mixed extents for storage efficiency. When a small table is made that would consist of less than eight pages, it gets stored in a mixed extent with other similarly small objects. If a database grows large enough to fill an entire extent, it can utilize the next type of extent.

Uniform Extents: These extents have pages that are all allocated to the same object. Larger databases often have extents with identical page types, such as data pages or index pages. More data can be read in a single read operation with uniform extents, so performance can see an improvement.

Originally I had planned to provide some example scripts to discover information about your pages, and storage setup, but in an effort to keep the information in byte-sized chunks, I’ll continue with that next week.

Remotely Enable Always On

Always On Availability Groups is the new feature for High Availability in SQL Server 2012. It’s been out for awhile now, but unless you have Enterprise Edition SQL, you might not have been able to use it much.

Of course you need a cluster to utilize Always On, but once that is complete, you also have to enable Always On in Configuration Manager on all your servers that will be participating in the AG as well.

Continuing on with my lazy, automated DBA goals of logging into computers as rarely as possible, I developed the below PowerShell script to connect to SQL Servers, enable Always On, and then restart the SQL Service in order for the changes to take effect.

The only thing you need to change below is the computer names, it should automatically detect your SQL instance names. If that doesn’t work (I haven’t been able to test every possible name parsing possibility), you can supply the instance names yourself.


## List Servers in AG Here ##
$Computers = 'Computer1','Computer2'
## Everything Else is Automated ##

# Finds the servers running the services, and the services' names
Invoke-Command -ComputerName $Computers -Scriptblock {
$Services = (Get-Service -Include MsSql* | Where { $_.Status -eq 'Running' } )
$Nodes = @()

# Parses the names of the SQL Instances
ForEach( $Service in $Services )
{ $Nodes += $Env:ComputerName'\'+$Service.DisplayName.Split('(')[1].Replace(')','') }

# Loops through each instance and enables AlwaysOn, restarting with -Force
ForEach ( $Node in $Nodes )
{ Enable-SQLAlwaysOn -ServerInstance $Node -Force }

# Starts SQL Service on the affected server(s) if it's still stopped
If($Services -ne $NULL)
{ Start-Service -DisplayName ($Services.DisplayName) }
}

The key code here for enabling Always On is this snippet below.

Enable-SQLAlwaysOn -ServerInstance $Node -Force

If the longer script cannot automatically detect your ServerInstance, you can provide it manually and run the command. Restart your SQL Service for the change to take affect.

SQL Server Changing Passwords and an SSPI Context Error

The other day I encountered a login error when connecting to a SQL Server. The circumstance seemed strange compared to similar errors described online with many of those seeming rather complicated to find the real solution. Since this server had been around for awhile, it was unlikely that some major Active Directory change would be necessary to resolve the issue.

This SQL Server was part of an Availability Group, and the connection worked fine when connecting using the Server/Instance name, however, when attempting to connect via the Listener, the following error occurred.

Cannot connect to Server/Instance.
The target principal name is incorrect.
Cannot generate SSPI context (Microsoft SQL Server)

Articles online indicated this was an SPN, Kerberos, and/or Active Directory issue, and something needed to be reset, but the only way to know for sure was to continue down a long troubleshooting list. Luckily, the problem was simpler than that, but still very strange.

I had reset the service account passwords the afternoon before this error became apparent. Each service was restarted afterwards to verify the change worked properly and SQL had been successfully connected to. Everything seemed fine from my perspective.

The next day, some users attempted to connect using the Listener and that’s when the errors started. I don’t normally connect via the Listener, so I hadn’t thought to check that, didn’t think it would be necessary.

Troubleshooting the easy solutions first seemed like a good idea, so I decided to try restarting the SQL service, which failed everything to another server in the cluster immediately. The services came online, and now both the instance and Listener could be connected to. OK, well probably sort of solved.

I failed it to a third node in the cluster, everything still worked great. Cool. This was looking even better.

Next I failed it back to the original node.  This time, the SQL Service came online, but not the Listener. Strange, how did it work in the first place? Everything was running on that server before I restarted the service, even if it wasn’t running correctly. I reset the passwords in SQL Configuration Manager, and then restarted the services. Everything worked perfectly now.

In summary, somehow all the services restarted on the server after the password change, but the Listener had a bad password and was not allowing connections. When I attempted to restart the Listener again, it failed until the password was corrected. I still don’t know how this happened, but it’s a good reminder to be especially careful when changing service passwords.  Changing passwords on a cluster can be even more dangerous since you have extra services to update that may not even be running on the server at the time, so verifying everything went smoothly can take a few extra steps.

Tales of when a Log Fails to Shrink in an Availability Group

I received a report that one of my servers had 7% free space on its log drive. Sounded like something fun to resolve. I checked on what was going on and found a log file that was 99% free and a hundred gb in size. While shrinking a log file is not a good practice and I’m not advocating it by any means because it’s just going to grow again and your storage is there specifically to hold logs, this situation was a out of the ordinary and we needed space.

The problem was, this log would not shrink. It was being extremely uncooperative. I took a backup, log backups, multiple shrink attempts, but it wouldn’t budge. The message returned was a big clue though.

<code>The log for database ‘dbname’ cannot be shrunk until all secondaries have moved past the point where the log was added.</code>

As you might have guessed, this server was a SQL Server 2012 instance and in an Always On Availability Group. The database in question could not shrink because it was participating in the AG.

It wasn’t an ideal fix, but by removing the database from the Availability Group, I was able to perform a log shrink to get the size to a more manageable amount. No, I did not truncate it to minimum size, I adjusted it to a reasonable amount based on its normal work. I didn’t want the log to just have to grow again. The shrink worked flawlessly, and with adequate drive space, I attempted to add the database back to the AG via the wizard.

The AG wizard refused to help. The database was encrypted and the AG wizard will not let you add a database if it is encrypted. No explanation why, it just doesn’t like that. You can add an encrypted database to an AG via script though. You can even script the change from the wizard by using a non-encrypted database then changing the database name in the scripted result. The resulting script is exactly what the AG wizard would do, it just cannot execute it automatically.


ALTER AVAILABILITY GROUP AgName
ADD DATABASE DbName;
GO

With free space and an encrypted database safely back in my AG, I was off to new adventures!

Recovery Model Comparison

This simple blog will cover the bulk of information you’ll need to make a fully informed decision on which Recovery Model to choose for your SQL Server databases. I know, that was horrible, but I couldn’t resist. Recovery Models are normally considered one of the more basic things to know about SQL Server, but refreshers are never a bad idea. Furthermore, if you are like me, you haven’t actually dealt with the Bulk Logged model much.

Simple Recovery Model

The simple model constantly reclaims log space in order to keep storage requirements minimal. Transactions are still written to the log; however, once a transaction completes and the data is written to the data file, the transaction log is truncated. This keeps the log small and allows the space to be reused.

While this model takes the least amount of planning and work to maintain, it also provides the least amount of protection. You cannot perform transaction log backups, so any changes since your last backup will be lost in the event of a disaster. There is no possibility to recover to a specific point in time either, and high availability solutions cannot be implemented. This model is best used for unimportant or test data. Data that rarely changes or would have no impact if data loss occurs between backups are also good choices for Simple recovery.

Supported Backups:

  • Full
  • Differential

Unavailable:

  • Log shipping
  • Database Mirroring
  • Always On Availability Groups
  • Point in time Restore

Bulk Logged Recovery Model

This model performs minimal logging of bulk operations in order to reduce the size of the transaction log. This means that the model does support the use of transaction log backups. In fact, it’s essentially identical to the Full Recovery model except for omitting bulk operations from the log. Even so, you shouldn’t use this model long term, just go full.

Bulk operation examples:

  • Bulk Imports (BCP, Bulk Insert)
  • CREATE/ALTER INDEX
  • SELECT INTO
  • WRITETEXT & UPDATETEXT

Since Point in Time recovery is not supported in this model, data can be lost, but only if the transaction log is damaged or bulk logged operations occur (since those aren’t logged). This recovery model is best used during index maintenance or when inserting massive amounts of data. Your transaction log won’t blow up in size, and you’ll be able to flip back to Full Recovery easily after your maintenance is complete (more on that later).

Supported Backups:

  • Full
  • Differential
  • Transaction Log

Full Recovery Model

With the Full Recovery model, you can prevent data being lost due to damage and you finally have the opportunity to restore to a specific point in time. All transactions are stored in the log until a log backup occurs or the log is truncated. If you fail to schedule regular log backups, your transaction log will grow till storage becomes a major problem. After a transaction completes, the data from the log is then written into the data file.

If the tail of the log is damaged, you can still have some data loss up to the last transaction log backup. This is the most appropriate recovery model to use for production data as its the only way to provide point in time recovery.

Supported Backups:

  • Full
  • Differential
  • Transaction Log

Changing Recovery Models

The command to flip recovery models is very easy, just issue an alter command on the desired database and set it to the desired model.


USE [Master];
ALTER DATABASE TestDB SET RECOVERY FULL;

Flipping between recovery models requires a few considerations, which I’ve summed below.

Type Notes
Full to Bulk Backup log for a last recovery point then switch to Bulk Logged Recovery; backup log again after the switch.

Backup Log -> Switch to Bulk-> Backup Log

Bulk to Full Switch back to Full Recovery after Bulk Logged operations complete; backup the log.

Switch to Full -> Backup Log

Simple to Full/Bulk Switch to Full or Bulk Recovery; perform a Full or Differential Backup to start the log chain. Schedule or reenable log backups. Change takes effect after first backup.

Switch to Full/Bulk -> Data Backup -> Backup Log

Bulk/Full to Simple Disable Log Backups, breaking the log chain. Switch to Simple Recovery; perform a Full or Differential Backup.

Disable Log Backups -> Switch to Simple -> Data Backup