Monitoring Backup and Restore Times

Backups and restores are arguably the most important aspect of a DBA’s job. There are few times when a manager will be breathing down your neck more impatiently than during a restore process. Of course, one of the worst things you can say during such an event is, “I have no idea when it will be complete, maybe another ten minutes?” When the process is still running twenty minutes later, it’s difficult to say if the fury will ever end. Let’s explore some obvious and not-so-obvious methods to monitor progress.

GUI Luckily, the GUI Backup/Restore operation has you partially covered. When you start a process with it, the completion percentage will be automatically tracked in the bottom left hand corner, but you surely already know this. While this does not supply you with a time estimation, at least there is something to track.

ProgressBar

Scripted A scripted backup/restore operation using the GUI Script Action reveals the option that provides the progress bar, STATS. The default value is 10, which increments in, no surprise, values of ten percent. The value can be delayed on tiny databases.  If you are getting such odd numbers, it’s probably processing so fast that no one is concerned with completion time anyways. Exceptionally long backups might warrant a STATS value of 5, but personally, I’ve always used 10.

Sample Backup Script Using STATS

BACKUP DATABASE [ToBeDeleted]
TO DISK = N'C:\SQL\BACKUP\ToBeDeleted.BAK'
WITH COPY_ONLY, RETAINDAYS = 1, NOFORMAT, NOINIT,
NAME = N'ToBeDeleted-Full Database Backup',
SKIP, NOREWIND, NOUNLOAD, STATS = 10
GO

If you write a backup/restore job by hand, always include the STATS option. It can save you from going mad during a long operation. What do you do if you forgot to include it though? Sitting around for an hour or more waiting blindly on a restore is a horrible situation. There will be much wailing and gnashing of teeth, at best.  Did it hang? Is it just verifying now? Who knows!?

With a little bit of math and estimation using the progress, (mathimation perhaps?) you can predict a fairly accurate completion time. That estimation may be wildly wrong though, and not many people estimate well under pressure. The last thing you want to do is under-estimate when the entire company is waiting for your magical fix.

All that being said, if you are creating a backup/restore job that will be automated via a job, STATS should not be used. You aren’t actively monitoring it, so all it would do is fill the logs with relatively useless information. On a related note, you should consider using Trace Flag 3226 to suppress successful backup messages as well. Scheduled backups can fill an error log, stealing precious space while making it hard to find the real problems.

Monitoring The solution to this lies in a short T-SQL script. Open a query window and paste the following script in. It will provide you with any databases currently being backed up or restored along with some useful information.

USE [master]
GO
SELECT
	 [Database] = d.name
	,[RunningMin] = r.total_elapsed_time / 60000
	,[RemainingMin] = r.estimated_completion_time / 60000
	,[EstimatedCompletion] = DATEADD(MILLISECOND,estimated_completion_time,GETDATE())
	,[%Complete] = r.percent_complete
	,[Command] = t.[text]
FROM
	sys.databases d
	INNER JOIN sys.dm_exec_requests r ON r.database_id = d.database_id
	CROSS APPLY sys.dm_exec_sql_text(r.[sql_handle]) t
WHERE
 	r.command LIKE '%backup%'
	OR r.command LIKE '%restore%'
GO

Some of the fields are self-explanatory. The current duration and estimated remaining duration default to milliseconds, so they are converted to minutes in the script. The estimated time of completion is a convenience column, because when I’m under pressure, I don’t normally add times perfectly. The command is especially useful when you were not the one who originated the process but need to verify that the script is not poorly performing due to the script used.

Remember, the estimates on this script will not be 100% perfect, its based on the I/O being performed at query time along with expected I/O. There are many things that can affect the duration, such as I/O throughput, compression, encryption, and contention. Even so, the estimates should be closer than anything you would guess by pulling a random number out of the air. Saying, “Sql estimates another 30.5 minutes” carries a lot more weight than, “I feel that based on the winds and astrological alignment that it should be completed in 23 minutes.”

Script Explanation The sys.databases view is one that everyone should be very familiar with and using rather than the deprecated sysdatabases, but the other two objects might be a bit more mysterious. The first Dynamic Management View, dm_exec_requests, has awesome troubleshooting potential. The view returns all commands that are currently being processed by SQL Server, so you could use it to find long running processes, specific queries, etc. The other object, dm_exec_sql_text, is a Dynamic Management Function and is used to find the full sql script executed using the sql_handle parameter. In this usage, the sql_handle is supplied from dm_exec_requests command field through a CROSS APPLY.

Apply was added in 2005 (along with the DMOs), but is largely still a mystery to many people. I admit, I still don’t understand the optimal times to use the operator, but per TechNet, the primary purpose is to invoke table-valued functions for each row returned by a query. Well, that’s exactly what we need here.  Apply also has some great applications for TOP X queries and aggregates, but that’s outside the scope of this article. Using apply here is cleaner than using an inline subquery. Strictly speaking, the performance shouldn’t matter on such a small script, but it’s best to learn as much as possible. Finally, the script is filtered through the WHERE clause to only return backup and restore commands. If you leave that filter off, you’ll end up returning every currently processing command on the server, again, not the intention of this solution.

Summary A colleague tipped me off to this monitoring script while I was waiting for a particularly large backup to finish so I could set up mirroring. I made a few modifications, including the cross apply and estimated completion time. You should make your own variations to suit your environment and tastes. I’m just happy to have a way to check progress in case I forget to throw STATS into a hastily scripted restore operation in the middle of the night.

 

Advertisements