Getting Jobs by Run Time

New month, so I have to make sure I write something about SQL again, instead of babbling about PowerShell all the time! I manage a lot of servers, so scripting out something that I can run on all the servers rather than pulling up the GUI on one server at a time is always important.

Checking when jobs ran on a single server is a pretty quick task. You just have to pull up the Job Activity Monitor, and you’ll have times and even more information easily accessible. Doing that on a dozen servers is tedious though. Doing it on a hundred would be a nightmare. Beyond that, checking specific times through the Job Activity Monitor can be a hassle, as it only displays a summary of the most recent executions. What if you want to know what happened 5 hours ago, regardless of whether jobs have run since then?

You can query SQL for job history, and with a lot of time conversions, you can get a list of jobs that ran within a specific time period. I use this to narrow down what was happening on a box while troubleshooting. If you need to review long-running jobs over a period of time, compare run times of jobs across servers, or just see what jobs might have been slowing performance at 3:03 am, this script can help you out.

The script takes a start and end time and checks for any jobs that started within that window, or that started earlier but were still running at your supplied start time.

Calculating the start and end times from job history is ugly. You’ll see that the datetime conversions below are disgustingly complicated. The short and sweet explanation is that date, time, and duration are all stored as integer values, because Microsoft thinks everything should be a number. To properly calculate the date and time, then display them in a readable format, you have to stuff in colons and use modulus division to get the hours, minutes, and seconds displaying correctly. If you really like math (I do!), it’s worth reading up on the technical explanation of modulo.
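To see what the math is doing before wading into the full conversions, here is a tiny illustration using a made-up run_duration of 13542 (1 hour, 35 minutes, 42 seconds): integer division peels off the hours and minutes, and modulo keeps the remainders.

-- Illustration only: run_duration is HHMMSS crammed into an INT
DECLARE @run_duration INT = 13542  -- 1h 35m 42s

SELECT
 [Hours]   = @run_duration / 10000        -- 13542 / 10000 = 1
,[Minutes] = @run_duration / 100 % 100    -- 135 % 100 = 35
,[Seconds] = @run_duration % 100          -- remainder = 42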

The run status will tell you if the job succeeded or not. According to the SQL Server 2000 documentation on TechNet, there is a run_status of 4, meaning the job is still in progress. Newer versions don’t mention a run_status of 4, and I’ve never seen one in a result set, even when I had jobs running at the time. Will it work for you? I really don’t know. I don’t trust it, though. For newer versions of SQL, I’d suggest running a system stored procedure instead to get currently executing jobs; the code for that is provided after my script.


DECLARE  @StartTime INT = '195500' --HHMMSS
,@EndTime INT = '195600' --HHMMSS
,@Date  INT = CONVERT(INT,CONVERT(VARCHAR,GETDATE(),112))
,@MinTime INT = 10  --Job Run Time Minimum in Seconds

SELECT
[JobName] = j.name
,[StartDate] = CONVERT(VARCHAR,CONVERT(DATETIME,RTRIM(jh.run_date)),101)
,[StartTime] = CONVERT(TIME(0),STUFF(STUFF(REPLACE(STR(RTRIM(jh.run_time),6,0),' ','0'),3,0,':'),6,0,':'))
,[EndTime] = DATEADD(second,run_duration%100,DATEADD(minute,run_duration / 100%100,DATEADD(hour,run_duration / 10000,CONVERT(TIME(0),STUFF(STUFF(REPLACE(STR(RTRIM(jh.run_time),6,0),' ','0'),3,0,':'),6,0,':')))))
,[Duration] = STUFF(STUFF(REPLACE(STR(jh.run_duration,7,0),' ','0'),4,0,':'),7,0,':')
,[RunStatus] =
CASE jh.run_status
WHEN 0 THEN 'Failed'
WHEN 1 THEN 'Succeeded'
WHEN 2 THEN 'Retry'
WHEN 3 THEN 'Cancelled'
WHEN 4 THEN 'InProgress'
ELSE 'Unknown'
END
FROM msdb.dbo.sysjobs j
INNER JOIN msdb.dbo.sysjobhistory jh ON jh.job_id = j.job_id
WHERE run_date = @Date
AND step_id = 0 -- Entire Job
AND run_duration > @MinTime
AND
(
( run_time >= @StartTime AND run_time <= @EndTime )
OR
( Run_Time < @StartTime
AND
DATEADD(second,run_duration%100,DATEADD(minute,run_duration / 100%100,DATEADD(hour,run_duration / 10000,CONVERT(TIME,STUFF(STUFF(REPLACE(STR(RTRIM(jh.run_time),6,0),' ','0'),3,0,':'),6,0,':'))))) > CONVERT(TIME,STUFF(STUFF(REPLACE(STR(RTRIM(@StartTime),6,0),' ','0'),3,0,':'),6,0,':'))
)
)
ORDER BY Run_Date,Run_Time

Note that jobs that are still executing do not have an entry in job history, so combining this script with the executing jobs system stored procedure is a good idea, especially if you are dealing with a recent time window. You can add the parameter @Category_Name = 'Jobs from MSX' to return only the jobs pushed down from your multiserver administration master (MSX) server.


EXEC msdb.dbo.sp_help_job @Execution_Status = 1
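If you only want the jobs pushed down from your master (MSX) server, the category filter mentioned above can be added to the same call; 'Jobs from MSX' is the default category name SQL Server uses for multiserver jobs, so adjust it if yours differs.

EXEC msdb.dbo.sp_help_job @Execution_Status = 1, @Category_Name = 'Jobs from MSX'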


Get Failed Jobs with T-SQL

It’s been quite some time since I did a post relating to SQL. I’ve just been sharing a lot of PowerShell scripts lately because almost all my views are related to PowerShell.

Monitoring SQL jobs can be done in many ways, and I highly suggest your jobs send an email notification to your DBA group on failure. No, not just to you. What happens when you are on vacation, or you leave permanently and your account gets disabled? No notifications. Don’t do that. You can also set up fancy reports to display information about your job status using SSRS, but I always like to have something that is easily portable and can be run in SSMS quickly.
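For anyone who hasn’t wired up those failure emails yet, here is a minimal sketch of the setup (the operator name, email address, and job name are placeholders, and Database Mail has to be configured first); @notify_level_email = 2 means notify on failure.

-- Create a DBA group operator, then point a job's failure notification at it
EXEC msdb.dbo.sp_add_operator
     @name = N'DBA Team'                          -- placeholder operator name
    ,@email_address = N'dbateam@yourdomain.com'   -- placeholder address

EXEC msdb.dbo.sp_update_job
     @job_name = N'YourJobName'                   -- placeholder job name
    ,@notify_level_email = 2                      -- 2 = notify on failure
    ,@notify_email_operator_name = N'DBA Team'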

Uses

In the middle of the night, when email isn’t loading and a call comes in that something important failed, but the help desk can’t articulate exactly what it is over the phone…this script can help a lot. Suddenly the confusion of trying to piece together a string of letters and numbers over the phone turns into a simple, “Oh, yes, I see Server X has failed Job Y 3 times. It ran successfully last time; I’ll go ahead and connect to that server to look into it, but I suspect someone might have been doing Z.”

I developed this failed job script to run on a single SQL Server or through a multi-server query to see all failed jobs across all servers. I commonly run it in the mornings as a summary when multiple failed job emails came in overnight; it’s a much easier way for me to review the history. If a job is continuing to fail and it’s high priority, I start there.

Explanation

The script provides the name, run date, and run time for each failed job, but it also provides some extra information for sorting priority. I’ve used ROW_NUMBER to sort job history and retrieve the latest run result of each failed job. That’s the only purpose of the CTE here: getting row 1, the most recent run result for each job. Using the latest result helps determine how problematic the reported job failure is. Did the latest trans log backup run successfully even though one of them failed in the past? OK, that’s probably a minor issue then. Are they still failing? Well, then we have something a bit more worrisome to consider.

I always limit the results to a 24 hour period to keep the result set manageable. I’ve also included a clause in the past to exclude a specific job that was prone to failing due to network issues or a colleague testing something; an example of that exclusion follows the script.

WITH JobInfo AS
(
 SELECT
  j.job_id
  ,j.name
  ,jh.run_status
  ,jh.run_date
  ,jh.run_time
  ,jh.[message]
  ,[PriorStatus] = --'NA' --2008
    LAG(jh.run_status,1) OVER (ORDER BY jh.job_id,jh.run_date,jh.run_time)  --2012
  ,[NextStatus] = --'NA' --2008
    LEAD(jh.run_status,1) OVER (ORDER BY jh.job_id,jh.run_date,jh.run_time) --2012
  ,[RN] = ROW_NUMBER() OVER (PARTITION BY jh.job_id
 ORDER BY jh.job_id DESC, jh.run_date DESC, jh.run_time DESC)
 FROM msdb.dbo.sysjobs j
  INNER JOIN msdb.dbo.sysjobhistory jh ON jh.job_id = j.job_id
 WHERE jh.run_date >= CONVERT(INT,CONVERT(VARCHAR,GETDATE()-1,112))
  AND jh.step_id = 0
)
SELECT
  j.Name
  ,[RunDate] = CONVERT(DATE,CONVERT(VARCHAR,j.run_date,112))
  ,[RunTime] = CONVERT(TIME(0),STUFF(STUFF(REPLACE(STR(RTRIM(j.run_time),6,0),' ','0'),3,0,':'),6,0,':'))
  ,[PriorStatus] = CASE j.PriorStatus WHEN '1' THEN 'SUCCESS' WHEN '0' THEN 'FAIL' ELSE 'N/A' END
  ,[NextStatus] = CASE j.NextStatus WHEN '1' THEN 'SUCCESS' WHEN '0' THEN 'FAIL' ELSE 'N/A' END
  ,[LatestStatus] = CASE ji.run_status WHEN '1' THEN 'SUCCESS' ELSE 'FAIL' END
  ,j.[message]
 FROM JobInfo j
   INNER JOIN JobInfo ji ON ji.job_id = j.job_id AND ji.RN = 1
 WHERE j.run_status = 0
 ORDER BY j.name,[RunDate],[RunTime]
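As mentioned above, excluding a known noisy job is just one more predicate in the WHERE clause. Here is a stripped-down standalone sketch of the idea; the excluded job name is a placeholder.

-- Failed jobs from the last day, ignoring one job that fails for benign reasons
SELECT j.name, jh.run_date, jh.run_time, jh.run_status, jh.[message]
FROM msdb.dbo.sysjobs j
 INNER JOIN msdb.dbo.sysjobhistory jh ON jh.job_id = j.job_id
WHERE jh.run_date >= CONVERT(INT,CONVERT(VARCHAR,GETDATE()-1,112))
 AND jh.step_id = 0
 AND jh.run_status = 0
 AND j.name <> 'FlakyNetworkJob'  -- placeholder: the job to exclude
ORDER BY j.name, jh.run_date, jh.run_time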

Exceptions

If you are running the script on SQL 2012 or newer, you can get a bit more information using LEAD/LAG in the form of the completion status of the job before and after the reported failure. I use this to spot patterns in failures and to further prioritize multiple job failures. All other things being equal, I’m going to be more concerned about a job with consecutive failures than a job that fails intermittently.

Sadly, IF statements, LEAD/LAG, and SQL 2008R2 or older do not play well with each other. I’ve tried writing IF statements that run the LEAD/LAG version of the query only on SQL 2012 while running the 'NA' version otherwise, but SQL always wants to parse the 2012 functions, resulting in errors. I’ve resigned myself to just commenting out the LEAD/LAG functions when running this script on SQL 2008R2 and older. The benefit of PriorStatus and NextStatus is fairly minimal, so it’s not a huge loss; I find LatestStatus to be the most important of the three by far.
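For what it’s worth, one workaround I haven’t baked into the script above is to hide the 2012-only syntax inside dynamic SQL, which the older engine won’t parse unless that branch actually runs. A minimal sketch of the idea:

-- Sketch: defer parsing of LEAD/LAG by wrapping it in dynamic SQL
DECLARE @MajorVersion INT =
 CONVERT(INT, PARSENAME(CONVERT(VARCHAR(32), SERVERPROPERTY('ProductVersion')), 4))

IF @MajorVersion >= 11  -- SQL Server 2012 and newer
 EXEC sp_executesql N'SELECT TOP (5) job_id, run_status,
   [PriorStatus] = LAG(run_status,1) OVER (ORDER BY job_id, run_date, run_time)
  FROM msdb.dbo.sysjobhistory'
ELSE
 EXEC sp_executesql N'SELECT TOP (5) job_id, run_status, [PriorStatus] = ''NA''
  FROM msdb.dbo.sysjobhistory'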

PowerShell: Get Adapter Binding Order

In my environment, I have to set binding orders on all new SQL Servers, and then ensure that those binding orders don’t magically change (trust me, it happens, I just don’t know why). If you aren’t sure why adapter binding orders matter, Microsoft provides a short explanation along with a walkthrough for setting them manually: “Windows attempts to communicate with services by using network protocols in the order specified in Network Connections. To make network communications faster, you can reorder the protocols in this list from most used to least used.”

Set Bind Order Manually
Manually, you can get to your adapters via Control Panel -> Network & Sharing Center -> Change Adapter Settings. At the adapter list, hit the super secret Alt key, which displays the otherwise hidden toolbar at the top of the Explorer window. Select Advanced -> Advanced Settings. The top of that dialog box displays the adapter binding order in descending order of priority.

Adapters

Alt displays the toolbar like magic. I was so excited when I learned that.

Script It!
My script is a function that lists the binding order by each adapter’s friendly name. Displaying the name rather than the model information requires a few combined commands that reference each adapter by GUID. If you have ever tried to grab adapter names, you’ll understand why I saved this as a function. It’s a pain.

I save all the adapters in bind order into an array and then return it. The idea is simple; it just takes some variable manipulation to get there.


Function Get-BindOrder
{
  # Read the TCP/IP binding order from the registry; each entry references an adapter by GUID
  $Binding = (Get-ItemProperty "HKLM:\SYSTEM\CurrentControlSet\Services\Tcpip\Linkage").Bind
  $BindingOrder = @()
 ForEach ($Bind in $Binding)
 {
  # Pull the GUID out of the binding string and translate it to the adapter's friendly name
  $DeviceId = $Bind.Split("\")[2]
  $Adapter = (Get-WmiObject Win32_NetworkAdapter | Where {$_.GUID -eq $DeviceId }).NetConnectionId
  $BindingOrder += $Adapter
 }
  # Output the computer name followed by the adapters in binding order
  $Env:ComputerName
  $BindingOrder
}#EndFunction
CLS
Get-BindOrder

Remote & Multi-Server Calls
If you need information for your entire network, you can quickly wrap the function in our familiar friend, Invoke-Command.


$Computers = "YourListHere"
Invoke-Command -ComputerName $Computers -ScriptBlock {
    ## Paste Function Script Here ##
}

Continuation
One method of setting the Binding Order via PowerShell requires hitting a few places in the registry and can be accomplished using Set-ItemProperty. It’s a bit of a pain though, so that’s a story for another day.
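Until then, here is a rough sketch of the core idea, not a finished tool: it promotes one adapter (by friendly name) to the top of the same Tcpip\Linkage Bind value the function above reads. The adapter name is a placeholder, you need to run it elevated, and as noted, a complete solution touches a few more registry locations.

$AdapterName = 'Production'   # placeholder: friendly name of the adapter to move to the top
$Guid = (Get-WmiObject Win32_NetworkAdapter | Where { $_.NetConnectionId -eq $AdapterName }).GUID
$LinkagePath = 'HKLM:\SYSTEM\CurrentControlSet\Services\Tcpip\Linkage'
$Bind = (Get-ItemProperty $LinkagePath).Bind

# Split the existing bindings into the chosen adapter and everything else, then rewrite the value
$Top  = @($Bind | Where { $_.Split("\")[2] -eq $Guid })
$Rest = @($Bind | Where { $_.Split("\")[2] -ne $Guid })
Set-ItemProperty -Path $LinkagePath -Name Bind -Value ([string[]]($Top + $Rest))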

Database Mirror Monitoring with Scripts

The Database Mirroring Monitor is a nice tool, but registering all your databases and clicking through each one is time consuming when you have dozens. I want quick, easy-to-read results when an error comes in. If things look bad, I’ll open the monitor for a specific database to start an in-depth look.

And yes, I do have reasonable alerts set up, but scripts are for proactive searching, dealing with new servers, and, most importantly, providing a jumping-off point when an alert is generated. My normal process is:

  1. Read alert email
  2. Open multi-server query and run all the mirroring scripts below
  3. Investigate specific databases that are still reporting errors from the scripts

Sometimes I get to stop at step 2 because the issue has already resolved itself, whether it was a passing network glitch or someone else responding faster than me. I keep all the scripts saved in a single file which I always have open and run periodically.

The scripts are not as refined as I’d like, so please tweak them. They do the job when errors are coming in though. The only time I think about changing them is when they’re showing a slew of horrific errors, and at that point I just don’t have time to make a script prettier.

Find Unsynchronized Mirrors
Running this as a multi-server query will provide a list of all mirrored databases that are not currently synchronized. This is my most used method for verifying that a mirror is still in trouble.

SELECT *
FROM sys.database_mirroring
WHERE mirroring_state_desc IS NOT NULL
    AND mirroring_state_desc <> 'SYNCHRONIZED'

Find Databases that are not Online or are Restoring
This script is a bit more specialized. I rarely get results (which is good), but it has caught anomalies before. It returns any databases that are offline or in a restoring state but not part of a mirror. Sometimes that indicates a test database or one left partway through being mirrored, but at least in my environment, I never want to leave a database in that state for very long.

The worst scenario I’ve found with this script was a database that had mirroring removed, so the partner had just been left in a restoring state. Somehow that had been overlooked until I found it with this.

SELECT
   d.name
  ,d.user_access_desc
  ,d.state_desc
  ,m.mirroring_state_desc
  ,m.mirroring_role_desc
FROM sys.databases d
   LEFT JOIN sys.database_mirroring m ON m.database_id = d.database_id
WHERE (d.state_desc <> 'ONLINE'
OR d.state_desc IS NULL)
   AND (m.mirroring_role_desc IS NULL
OR m.mirroring_role_desc <> 'MIRROR')

DB Mirror Monitor Script
This mimics the Database Mirroring Monitor. The main benefit is that you do not have to preregister every single database with the monitor, wait for each one to refresh, and click through them one at a time. Just run the script and apply any filters you want. When run as a multi-server query, it automatically increments through each mirrored database.

If you want to see your entire mirroring environment in one view, this is for you.

The reason for this script is that sometimes I have mirrors that are still considered synchronized but are not pushing data fast enough. They throw strange alerts, if any, when this happens. Resolving it normally involves finding out why the network is moving slowly. Ultimately there is little I can do while I wait for a fix, but it’s good to know what is going on in the meantime.

DECLARE @Mirror TABLE
 (database_name SYSNAME,[role] INT,mirroring_state INT
,witness_status INT,log_generation_rate INT,unsent_log INT
,send_rate INT,unrestored_log INT,recovery_rate INT
,transaction_delay INT,transactions_per_sec INT,average_delay INT
,time_recorded DATETIME,time_behind DATETIME,local_time DATETIME)

-- @id will increment and pull each mirrored database
DECLARE  @id        INT
 ,@db        VARCHAR(256)
 ,@command    VARCHAR(2000)
SELECT @id = MIN(database_id) FROM sys.database_mirroring
WHERE mirroring_guid IS NOT NULL

-- Loop through each database to pull mirror monitor information
WHILE @id IS NOT NULL
BEGIN
  SELECT @db = d.name FROM sys.databases d WHERE d.database_id = @id
  SELECT @command = 'EXEC msdb.dbo.sp_dbmmonitorresults @database_name='+QUOTENAME(@db)+''
  PRINT @command
  INSERT INTO @Mirror
    EXEC (@command)
--Increment @id
  SELECT @id = MIN(database_id) FROM sys.database_mirroring
  WHERE mirroring_guid IS NOT NULL AND database_id > @id
END
-- Your WHERE clause/values will vary
SELECT * FROM @Mirror
WHERE unsent_log > 10

Get Drive Sizes using SQL or PowerShell

There are countless methods to obtain drive size information, each with its own set of benefits and limitations. I use a few different methods depending on if I have to type out the solution quickly or if I need a comprehensive view.

Method 1 – xp_fixeddrives
The most basic command to get drive sizes is easy to remember, so I use it when I need a fast result. It leaves a lot to be desired though: it returns all drive letters on the queried SQL Server, but only displays one column, MB free.

EXEC master.dbo.xp_fixeddrives

FixedDrives

You could get extra information by inserting the results into a temp table and formatting them, but that defeats the purpose of this quick command. If I want more information, I move on to a more robust script entirely.
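If you’re curious, that temp table idea looks something like the sketch below; by the time you’ve typed it out, though, the next method is usually the better choice.

-- Capture xp_fixeddrives into a temp table so it can be formatted or filtered
CREATE TABLE #FixedDrives (Drive CHAR(1), MBFree BIGINT)
INSERT INTO #FixedDrives EXEC master.dbo.xp_fixeddrives

SELECT Drive, MBFree, [GBFree] = MBFree/1024
FROM #FixedDrives
ORDER BY MBFree

DROP TABLE #FixedDrives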

Method 2 – dm_os_volume_stats
In newer versions of SQL Server (2008R2 and up), you can use the dynamic management function dm_os_volume_stats. This function returns data for every drive letter holding SQL files (data, log, or tempdb), and you can easily modify the SELECT statement to tailor the results to your needs.

SELECT
   [Drive] = volume_mount_point
  ,[FreeSpaceMB] = available_bytes/1024/1024 --/1024
  ,[SizeMB] = total_bytes/1024/1024 --/1024
  ,[PercentFree] = CONVERT(INT,CONVERT(DECIMAL(15,2),available_bytes) / total_bytes * 100)
FROM sys.master_files mf
  CROSS APPLY sys.dm_os_volume_stats(mf.database_id,mf.file_id)
--Optional where clause filters drives with more than 20% free space
WHERE CONVERT(INT,CONVERT(DECIMAL(15,2),available_bytes) / total_bytes * 100) < 20
GROUP BY
   volume_mount_point
  ,total_bytes/1024/1024 --/1024
  ,available_bytes/1024/1024 --/1024
  ,CONVERT(INT,CONVERT(DECIMAL(15,2),available_bytes) / total_bytes * 100)
ORDER BY [Drive]

I’ve returned the results in MB to match the result set of xp_fixeddrives, but feel free to divide by another 1024 to convert to gigabytes (that’s normally what I do). My query includes free space, total size, and percent free. I also included an optional WHERE clause to filter out any drives that have over 20% free space. Simply open a multi-server query, copy and paste the script, and you have current data for drives with low space.

VolumeStats

This is great for most situations…but this method fails to return all the drives I’m concerned about. I get no feedback on the C or D drives, which is where I have SQL and other applications installed. Not everyone will worry about system drives, but I want to know if someone is throwing data where it doesn’t belong. I bet you care about that too. In that case, I use another resource.

Method 3 – PowerShell (preferred)
When I need data on every drive, I turn to PowerShell, which also lets me run a full drive query against servers on versions prior to 2008R2. The script isn’t overly difficult, but you will need to supply a list of computers to run against. I suggest saving your SQL Server computer names into a text file and having PowerShell read from that.

If you want a quick and dirty way to get your computer names, open a multi-server query using your CMS and run:

SELECT SERVERPROPERTY('ComputerNamePhysicalNetBios')
SELECT NodeName FROM sys.dm_os_cluster_nodes

That will return all computers running SQL and all cluster nodes, active or inactive. If you combine these into a distinct list, you should have all your computers. I didn’t say it was clean, I said it was quick.
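If you want that distinct list as a single result set, a UNION handles the de-duplication; the cluster DMV simply returns no rows on non-clustered instances.

SELECT [ComputerName] = CONVERT(NVARCHAR(128),SERVERPROPERTY('ComputerNamePhysicalNetBios'))
UNION
SELECT NodeName FROM sys.dm_os_cluster_nodes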

The PowerShell script returns all drive letters, but you can filter a specific drive or name if you so desire. You can manipulate the WHERE clause by following the format of the one I provided.

For example, change $_.PercentFree to $_.VolumeName and -lt (less than) to -eq (equals), then replace 20 with the drive name in quotes, for example 'Data'. If you need multiple clauses, the format is WHERE { $_.Value1 -eq '1' -AND $_.Value2 -eq '2' }


$Computers = ( Get-Content C:\Users\$Env:UserName\Desktop\Computers.txt )
Get-WmiObject -class Win32_LogicalDisk -ComputerName $Computers -Filter "DriveType=3" |
 Select-Object SystemName,DeviceID,VolumeName,
  @{n='FreeSpace(MB)';e={$_.FreeSpace / 1MB -as [int]}},
  @{n='Size(MB)';e={$_.Size / 1MB -as [int]}},
  @{n='PercentFree';e={$_.FreeSpace / $_.Size * 100 -as [int]}} |
   Where { $_.PercentFree -lt 20 } |
    Sort SystemName,DeviceId | FT -AutoSize

Results

PowerShell Get Size of Folders

This week another server hit low disk space on the C drive, so I decided it was an appropriate time to work out a new PowerShell script. I wanted to calculate the size of each folder to determine the biggest threats and work down from there; clean out the big stuff first, kind of thing. If you have ever tried to do this via the Windows GUI, you know it can take a while. Right-click -> Properties works great for calculating one folder’s size, but when you need to check 100, it’s a teeny bit tedious.

I decided to write the script from scratch to test and improve my scripting skills, though in the end I did use the internet for some formatting help on the size-in-megabytes conversion.

Note: If you find that a folder size is not correct, the most likely issue is that you need to include the -Force parameter on Get-ChildItem within the ForEach loop. This forces hidden items to be counted and should accurately display the total file size. Example below.

$Items = (Get-ChildItem $Folder -Force -Recurse | 
  Measure-Object -Property Length -Sum)

Further note: If you use the above fix, you may encounter a path length error like the one shown below.

GetChildItem-ForceError

The file name must be less than 260 characters; the directory must be less than 248 characters.

The only viable fix I have found for this requires using CMD syntax rather than PowerShell. It’s a lot uglier to work with, and frankly, I got the information I needed without counting hidden system files (it’s not like I was going to delete those anyway). Knowing how big the files I could delete were was enough for me in this exercise. But if you really want the full answer…

-Force Error Workaround Using CMD
You can replace Get-ChildItem with the command below in an attempt to work around hidden items. You’ll have to mess with the code a bit, but you wanted some homework, right?
My opinion, though: just don’t worry about counting the hidden items. Save yourself the frustration.

CMD /c DIR C:\USERS /-c

Full script (Not using -Force or CMD)

## SET COMPUTER(S) TO CALCULATE FOLDER SIZES
$Computers = "COMPUTERNAME"
Invoke-Command -ComputerName $Computers -ScriptBlock {
## SET THIS VALUE FOR THE FOLDER TO RECURSE THROUGH
$FolderPath = 'C:\Users'

$Result = @() #Initializes the result array
#this populates the folder list
$Folders = (Get-ChildItem $FolderPath).FullName
ForEach ( $Folder in $Folders )
{
$Items = 0 #resets file size count on each folder
$Items = (Get-ChildItem $Folder -Recurse | 
  Measure-Object -Property Length -Sum)
$Result += [PSCustomObject] @{
  FolderPath = $Folder
  Size = $Items.Sum
  SizeInMB = "{0:N2}" -f ($Items.Sum / 1MB)
}#EndCustomObject
}#EndForEach
$Result | Sort Size -Descending |
  FT FolderPath,SizeInMb -AutoSize
}#EndInvoke

Enterprise Policy Management Framework – Skip the Report

Policy-Based Management in SQL Server is a wonderful and underused feature. The Enterprise Policy Management Framework available through CodePlex further enhances it with reports and history tables, but if you have ever used the main report, you know how distressingly slow it can be. Not only that, but drilling through the report to find failed policies on instances is an extremely painful process. Visual reports are pretty, and the aggregation is a really cool feature, but expanding nodes is not practical for a large environment.

Test

It depressed me just pulling up the slow report to screen shot this. Expanding nodes is even worse.

So while I use EPMF, I do not check the reports daily. Instead, I script out the daily failures and errors so I can review them quickly. Below are three scripts that provide policy results; use whichever you prefer.

Script 1: Short and sweet. This provides all failures and errors for the current day, including the unshredded XML evaluation results. This is my go-to script each morning: find failures and investigate.


SELECT
phd.EvaluatedServer
,phd.EvaluationDateTime
,phd.EvaluatedPolicy
,CategoryName
,PolicyResult
,ph.EvaluationResults
FROM EESqlReports.policy.PolicyHistoryDetail phd
INNER JOIN EESqlReports.policy.PolicyHistory ph ON ph.PolicyHistoryID = phd.PolicyHistoryID
WHERE
PolicyResult <> 'PASS'
AND CONVERT(DATE,phd.EvaluationDateTime) = CONVERT(DATE,GETDATE())
ORDER BY PolicyResult DESC,Policy_id,phd.EvaluatedServer;

Script 2: This script is essentially the same as above, but it includes the evaluated object or database along with a cleaner XML result. While still XML, it is easier to read, and it’s the most reliable way I’ve found to read the evaluation results for any policy. Sometimes the XML will not provide an actual value on a failed result; I’ve yet to discover what causes this. Theories include network issues or unexpected values.


WITH XMLNAMESPACES ('http://schemas.microsoft.com/sqlserver/DMF/2007/08' AS DMF)
SELECT
phd.EvaluatedServer
,phd.EvaluationDateTime
,phd.EvaluatedPolicy
,phd.CategoryName
,phd.PolicyResult
,[CheckParam1] = res.Expr.value('Function[1]/Constant[2]/Value[1]','nvarchar(100)')
,[ExpectedValue1] = res.Expr.value('Constant[1]/Value[1]','nvarchar(100)')
,[ActualValue1] = res.Expr.value('Function[1]/ResultValue[1]','nvarchar(100)')
FROM policy.PolicyHistoryDetail phd
INNER JOIN policy.PolicyHistory ph on ph.PolicyHistoryID = phd.PolicyHistoryID
CROSS APPLY ResultDetail.nodes('Operator') AS Res(Expr)
WHERE
PolicyResult <> 'PASS'
AND CONVERT(DATE,phd.EvaluationDateTime) = CONVERT(DATE,GETDATE())
ORDER BY PolicyResult DESC,EvaluatedServer;

Script 3: This script attempts to shred the XML results, showing you the check parameter, expected value, and actual value. This sample is very short because shredding a policy result gets complicated fast; I’m only shredding the first line of a policy here. Since most policies have more than one parameter, and thus more values, this script grows excessive quickly. Further, if the policy doesn’t fit the expected XML format perfectly, you’ll get NULL results. Before you know it, you have to tailor-make a script for almost every policy in your environment. It’s just not effective, which is why I suggest reading the XML from the script above if you really need to view results. As terrible as XML is to read, it’s better than crafting a script for dozens or even hundreds of policies.


WITH XMLNAMESPACES ('http://schemas.microsoft.com/sqlserver/DMF/2007/08' AS DMF)
SELECT
PH.EvaluatedServer
,PH.EvaluationDateTime
,PH.EvaluatedPolicy
,Res.Expr.value('(../DMF:TargetQueryExpression)[1]', 'nvarchar(150)') AS EvaluatedObject
,(CASE  WHEN Res.Expr.value('(../DMF:Result)[1]','nvarchar(150)') = 'FALSE'
AND Expr.value('(../DMF:Exception)[1]','nvarchar(max)') = '' THEN 'FAIL'
WHEN Res.Expr.value('(../DMF:Result)[1]','nvarchar(150)') = 'FALSE'
AND Expr.value('(../DMF:Exception)[1]','nvarchar(max)') <> '' THEN 'ERROR'
ELSE 'PASS'
END) AS PolicyResult
,CAST(Expr.value('(../DMF:ResultDetail)[1]','nvarchar(max)') AS XML) AS ResultDetail
FROM policy.PolicyHistory PH
INNER JOIN policy.PolicyHistoryDetail phd ON ph.PolicyHistoryID = phd.PolicyHistoryID
INNER JOIN msdb.dbo.syspolicy_policies p ON p.name = PH.EvaluatedPolicy
CROSS APPLY EvaluationResults.nodes('
declare default element namespace "http://schemas.microsoft.com/sqlserver/DMF/2007/08";
//TargetQueryExpression') AS Res(Expr)
WHERE phd.PolicyResult = 'FAIL'
AND CONVERT(DATE,phd.EvaluationDateTime) = CONVERT(DATE,GETDATE())
ORDER BY PolicyResult DESC,EvaluatedServer;


Database Owner Verification

There’s often low-hanging fruit once you start looking into servers, or sometimes you just need to verify that settings are correct. Recently I had to verify that all databases had the correct owner for auditing purposes. There are many scripts out there to do this, but I wanted to make a few exceptions, and I didn’t need a lot of extra fluff for this specific task either. I also planned to convert this to a policy afterwards to ensure the settings were not changed later.

I used two different scripts: one for 2012 servers that might be utilizing Availability Groups, and one for 2008 servers that might be using mirroring. I wanted to allow exceptions for databases that were mirrors or secondaries, because I could not update the owner without failing them over. Those will be fixed on the next scheduled failover, but until then, I don’t want the exceptions mucking up my results.

The WHERE clause excludes my allowed owner while also verifying that I don’t have any databases where the ownership has been broken and is listed as NULL. Server roles for AGs and mirrors are included in the result set so that when I want to see everything, I can drop the WHERE clause and quickly see what’s what.

SQL 2012 (Availability Groups)

SELECT
d.name
,[Owner] = SUSER_SNAME(d.owner_sid)
,s.role_desc
FROM sys.databases d
LEFT JOIN sys.dm_hadr_availability_replica_states s ON s.replica_id = d.replica_id -- non-AG databases keep a NULL role
WHERE (SUSER_SNAME(d.owner_sid) <> 'SA' OR SUSER_SNAME(d.owner_sid) IS NULL)
AND (s.role_desc = 'PRIMARY' OR s.role_desc IS NULL)

SQL 2008 (Mirrors)

Same idea as the query above, but while that one will error on versions older than SQL 2012 because of the HADR view, this one works on older servers for determining whether any of your databases have the wrong owner. Just remove the last AND clause if you want to see the mirrors too.


SELECT
d.name
,[Owner] = suser_sname(owner_sid)
,m.mirroring_role_desc
,d.state_desc
FROM sys.databases d
LEFT JOIN sys.database_mirroring m ON m.database_id = d.database_id
WHERE
(SUSER_SNAME(owner_sid) <> 'SA' OR SUSER_SNAME(owner_sid) IS NULL)
AND (m.mirroring_role_desc <> 'mirror' OR m.mirroring_role_desc IS NULL)

Afterwards, I used Policy-Based Management to create a policy checking for DB ownership. I chose the simple route of a condition on the Database facet checking that @Owner = 'SA'. (Rename your sa account, but for demonstration purposes this is the most obvious.) The policy will check every database but automatically skip mirrors/secondaries, because it cannot read those databases. The script remains useful for checking the databases that the policy cannot read.

Querying Memory Settings

SQL settings should be verified on a regular basis, and I’m occasionally asked to produce a list of memory configuration settings on all servers for management. It’s a simple process to check this information using Management Studio, but scripting is a much better solution, especially if you need to verify the settings on a large number of servers.

You can verify memory settings quickly for a single server by right-clicking on it in Object Explorer, selecting Properties, and then selecting Memory on the left-hand side.

ServerProperties-Memory

SQL defaults to having access to all the memory on your server (a max setting of 2,147,483,647MB). Best practices suggest a variety of configurations, from reserving 10-20% for the OS, to a minimum of 4GB reserved, to 1GB reserved for every 4-8GB of RAM available. The most important thing is to leave some memory reserved for the OS. Leaving the default memory setting allows SQL to eat all the available RAM, potentially leaving the server unresponsive as Windows becomes completely memory starved.
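As a hedged example of the fix, on a server with 64GB of RAM you might cap SQL at 56GB and leave roughly 8GB for the OS; the 57344MB value below is just one reasonable choice, not a rule.

EXEC sys.sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sys.sp_configure 'max server memory (MB)', 57344;  -- 56GB, leaving ~8GB for the OS
RECONFIGURE;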

I prefer to verify memory settings against all my servers at once. Not only is it faster, it’s easier to provide a report this way. Open a new query from your Central Management Server targeting all of your servers for the fastest results. Querying sys.configurations and sys.dm_os_sys_memory will provide the information you need, but the raw results leave a lot to be desired. By pivoting the data and performing a bit of formatting, you can create a report that is much cleaner and more appealing.

SET NOCOUNT ON
GO
SELECT
[TotalRamGB] = Ram/1024
,[MinRamGB] = [Minimum size of server memory (MB)]/1024
,[MaxRamGB] = [Maximum size of server memory (MB)]/1024
,[RamLeftForWindowsGB] =
CASE WHEN [Maximum size of server memory (MB)]/1024 = 2097151 THEN 0
ELSE  Ram/1024 - [Maximum size of server memory (MB)]/1024
END
FROM (
SELECT
[Descr] = [Description]
,[Val] = CONVERT(BIGINT,value)
,[Ram] = total_physical_memory_kb/1024
FROM sys.configurations c
,sys.dm_os_sys_memory m
WHERE c.name IN ('min server memory (MB)','max server memory (MB)')
) a
PIVOT
(
MIN(Val) FOR Descr IN ([Minimum size of server memory (MB)],[Maximum size of server memory (MB)])
) b

This script converts all the memory settings to GB, but if you need to work in MB, just remove the GB conversions. The last column indicates how much memory is reserved for the OS. A value of 0 there means SQL’s max memory setting has been left at the default, and thus SQL has free rein to steal all the memory; those servers should be addressed as soon as possible. Again, the amount of memory you leave available will vary, but I suggest at least 4GB. If that would be a substantial portion of your total RAM, it’s probably time to ask for more RAM too; it’s one of the cheapest and best upgrades possible.

T-SQL Tuesday #66: Monitoring with Utility Explorer

This month’s topic for T-SQL Tuesday is hosted by Cathrine Wilhelmsen (blog | twitter) with the subject of monitoring.

The topic of the month is monitoring. We all monitor something while working with SQL Server: Maybe you’re a DBA who monitors database performance, an SSIS developer who monitors job and package execution, an application developer who monitors queries or a BI analyst who monitors report usage? Do you use T-SQL or PowerShell scripts? Have you created your own monitoring reports or applications? Have you invested in a complete monitoring solution from a vendor? What is monitored manually and what is automated with alerts? If a brand new SQL Server professional in your role asks you for advice, what do you say is the number one most important thing to monitor? Have you ever stayed awake for a week solving a problem that could have been avoided with proper monitoring? Did you implement a monitoring solution that saved your company?

I currently use so many different methods of monitoring that it was daunting to decide what to discuss. I could have written a high-level overview of everything, but that would have been nothing more than a mention of each one in order to fit it all into a single post. Policies, jobs, traces, audits, extended events, PowerShell…we use them all. Basically, if it is not a third-party tool (spending money on any solution is tough to justify here), my office uses it or will be giving it a shot soon. Coincidentally, we just started using another form of monitoring in the past few weeks, so I will discuss some of the trials and tribulations of using the SQL Server Utility.

Utility Explorer provides a simplified dashboard for monitoring and baselining SQL Server CPU and storage resources. Like almost anything else new and exciting, this feature is not available in Standard Edition. To use it, you will need to create a Utility Control Point (UCP); I’d suggest hosting it on your Central Management Server. Don’t have a CMS? If you have more than two servers, make one. It’s not hard. Unlike a CMS, though, the UCP can monitor itself. This is one of the more obscure features added in SQL Server 2008R2, and you can access it in Management Studio by selecting View -> Utility Explorer.

GettingStarted

After selecting your instance, you need to specify the account that will run the utility collection set. Using the Agent account is the easiest way to set this up, but you have to use a Windows domain account as the collection account regardless of which path you choose.

SpecifyAccount

Select the instance where data will be collected. Again, I suggest your CMS. A utility database will be created to store the data, and the default retention period is one year, so make sure you have sufficient drive space or be ready to reduce the retention. No worries though, this new dashboard will remind you if your disk space starts getting low! The utility database typically grows about 2GB per monitored instance per year, while the msdb database should grow around 20MB per instance per year. Warning: if you also use the Data Collector, your data growth can be exponentially higher.

SpecifyInstance

Now the selected instance will be validated. The instance cannot be enrolled if it is already a member of another UCP. I’m using collection sets as well, so validation failed on that. You can bypass this issue by simply disabling your collection sets during the enrollment process, which allows validation to succeed; once enrolled, you can turn the collection sets back on. The two pieces are related (they use the same database) and yet separate enough that they don’t always play well together.

Validation

Once you have your UCP created, you can begin enrolling instances, which is a very simple process. Just right-click on Managed Instances, or within the white space of the Utility Explorer Content pane, and select Enroll Instance. Specify the instance name and the agent account, as you did above while creating the UCP, and then validate the enrollment. The instance may throw a warning about WMI. Since it’s just a warning, the enrollment will still succeed, and my instances have not had any issues reporting data after throwing it. Researching the subject has thus been low priority, but it appears to be a permissions issue. Troubleshooting with that article has shown nothing of concern though…

Test

Once you have a few instances enrolled, open the Utility Dashboard, which will look similar to this.

UtilityDashboard

Select Managed Instances and use the tabs to view CPU usage, storage space, policy configuration for utilization pass/fails, and computer properties. Each enrolled instance will show a pass/fail mark based on the configured policies.

ManagedInstances
The Storage Utilization tab provides a quick look at your volume free space. You can drill down to see data file and log file sizes as well.
StorageUtilization
The Property Details tab summarizes each instance’s software and hardware.
PropertyDetails

Finally, there are a few settings under Utility Administration you need to be aware of. This is where you can set global policies for storage warnings, set read privileges for user accounts on the Utility dashboards, and specify data retention. I suggest you start small and monitor how fast your utility database grows. Like I said earlier, combining this with Data Collection gave us massive data growth; more on that in the future.
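A simple way I keep an eye on that growth from the UCP instance is to check the file sizes directly. This assumes the default UMDW database name of sysutility_mdw; adjust it if you named yours differently.

-- File sizes for the utility database and msdb (size is reported in 8KB pages)
SELECT
 [DatabaseName] = DB_NAME(database_id)
,[FileName]     = name
,[SizeMB]       = size/128
FROM sys.master_files
WHERE DB_NAME(database_id) IN ('sysutility_mdw','msdb')
ORDER BY [DatabaseName],[FileName]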

Overall, Utility Explorer provides an easy-to-read dashboard for quick looks at your SQL Servers. If you are a graphical person and can spare the disk space, it’s a great feature. Sure, you can get the current settings via queries, but the really valuable part is the automatic retention, which allows quick baselining of your CPU and storage. Capacity planning made easy.

StorageOverTime