Database Owner Verification

There’s often low hanging fruit once you start looking into servers, or you just need to verify settings are correct.  Recently I had to verify that all databases had the correct owner for auditing purposes. There are many scripts out there to do this, but I wanted to make a few exceptions, and I didn’t need a lot of extra fluff for this specific task either. I also had a plan to convert this to a policy afterwards to ensure settings were not changed later.

I used two different scripts, one for 2012 servers that might be utilizing Availability Groups and one for 2008 servers that might be using Mirroring. I wanted to allow exceptions for databases that were mirrors or secondaries, because without failing them over, I could not update the owner. These will be fixed on the next scheduled failover, but until then, I don’t want the exceptions mucking up my results.

The WHERE clause excludes my allowed owner while also verifying that I don’t have any databases where the ownership has been broken and listed as NULL.  Server roles for AGs and Mirrors are included in the result set so that when I want everything, I can exclude the WHERE clause and quickly see what’s what.

SQL 2012 (Availability Groups)

SELECT
d.name
,[Owner] = suser_sname(owner_sid)
,s.role_desc
FROM sys.databases d
,msdb.sys.dm_hadr_availability_replica_states s
WHERE (suser_name(owner_sid) <> 'SA' OR SUSER_NAME(owner_sid) IS NULL)
AND s.role_desc = 'Primary'

SQL 2008 (Mirrors)

Same as the above query, but while that query above will error when you try to run it on an older version of SQL than 2012 because of the HADR view, this one works while determining if any of your servers have the wrong database owner. Just exclude the AND statement to show the mirrors.


SELECT
d.name
,[Owner] = suser_sname(owner_sid)
,m.mirroring_role_desc
,d.state_desc
FROM sys.databases d
LEFT JOIN sys.database_mirroring m ON m.database_id = d.database_id
WHERE
(suser_name(owner_sid) <> 'SA' OR SUSER_NAME(owner_sid) IS NULL)
AND (m.mirroring_role_desc <> 'mirror' OR m.mirroring_role_desc IS NULL)

Afterwards, I used Policy Based Management to create a policy checking for DB ownership.  I chose the simple way of creating a Database Facet checking that @Owner = 'SA'. (Rename your account, but for demonstration purposes this is the most obvious). The policy will check every database, but automatically skip mirrors/secondaries because it cannot read those databases. The script remains useful for checking the databases that the policy cannot read.

Querying Memory Settings

SQL settings should be verified on a regular basis, and I’m occasionally asked to produce a list of memory configuration settings on all servers for management. It’s a simple process to check this information using Management Studio, but scripting is a much better solution, especially if you need to verify the settings on a large number of servers.

You can verify memory settings quickly for a single system by right-clicking on the server in Object Explorer then selecting Memory on the left hand side.

ServerProperties-Memory

SQL defaults to have access to all the memory on your server (2,147,483,647MB). Best practices suggest a variety of configurations, from 10-20% reserved for the OS, to a minimum of 4GB reserved, to 1GB reserved for every 4-8GB of RAM available. The most important thing to do is leave some space reserved for the OS. Leaving the default memory setting will allow SQL to eat all the available RAM, potentially leaving the server unresponsive as windows may eventually be completely memory starved.

I prefer to verify memory settings against all my servers at once. Not only is it faster, it’s easier to provide a report this way. Open a New Query from your Central Management Server targeting all of your servers for fastest results. Querying sys.configurations and sys.dm_os_sys_memory will provide the information you need, but the results leave a lot to be desired. By pivoting the data and performing a bit of formatting, you can create a report that is much cleaner and appealing.

SET NOCOUNT ON
GO
SELECT
[TotalRamGB] = Ram/1024
,[MinRamGB] = [Minimum size of server memory (MB)]/1024
,[MaxRamGB] = [Maximum size of server memory (MB)]/1024
,[RamLeftForWindowsGB] =
CASE WHEN [Maximum size of server memory (MB)]/1024 = 2097151 THEN 0
ELSE  Ram/1024 - [Maximum size of server memory (MB)]/1024
END
FROM (
SELECT
[Descr] = [Description]
,[Val] = CONVERT(BIGINT,value)
,[Ram] = total_physical_memory_kb/1024
FROM sys.configurations c
,sys.dm_os_sys_memory m
WHERE c.name IN ('min server memory (MB)','max server memory (MB)')
) a
PIVOT
(
MIN(Val) FOR Descr IN ([Minimum size of server memory (MB)],[Maximum size of server memory (MB)])
) b

This script converts all the memory settings to GB, but if you need to work in MB, just remove the GB conversions. The last column indicates how much memory is reserved for the OS. A value of 0 here indicates that SQL’s max memory setting has been left at default, and thus has free reign to steal all the memory. These servers should be addressed as soon as possible. Again, the amount of space you leave available will vary, but I suggest at least 4GB. If your server has such a small amount of RAM that this is a substantial portion of your total RAM, it’s probably time to ask for more RAM too. It’s one of the cheapest and best upgrades possible.

Getting Cluster Nodes through T-SQL

One of my most frustrating moments is when I need to log into the active node of a cluster. I inevitably choose a passive node first, then wind up opening the Failover Cluster Manager to find the active node then finally log in to the correct server. Well, I finally took some time to get it right by using my trusty Central Management Server and T-SQL.

Running all of these scripts on a CMS will allow you to hit every SQL server in your domain. When I first decided to pull Server data using a script, I immediately thought of @@SERVERNAME. However, this returns the cluster name (for clusters), not the active server node. What I really needed took a minor amount of research.

Another possible choice included sys.dm_os_cluster_nodes, but this had surprising limitations. While it does provide node names, if you try to query all servers using a CMS, the script can fail to merge the result sets, displaying the following error message. In my case, it also failed to return results for one entire cluster.

An error occurred while executing batch. Error message is: The result set could not be merged because the result schema did not match the schema from the first responding server.

The next option was to use SERVERPROPERTY to find the actual server name hosting SQL. When using a CMS, the script will also return the results of @@SERVERNAME as a bonus.

SELECT [ActiveServer] = SERVERPROPERTY('ComputerNamePhysicalNetBIOS')

While this is good enough for some scenarios, I wanted a more targeted answer, so I kept digging. First, I wanted a script that would only show the active nodes in each cluster. I’ll mostly use this query to ensure I’m avoiding the primary server during maintenance and to verify which server is the active node when developers or management start asking questions and want immediate answers. (You know, as opposed to when they aren’t in a hurry…)

To get the exact results I want will require a bit of string parsing. I suggest you review SUBSTRING and CHARINDEX if you aren’t positive on how they work. In short, Substring will display the server name up to where Charindex determines the ‘\’ appears. This will provide a column consisting of only the cluster name. The next column will display the active node. I then filter out all servers other than clusters and exclude default instance names as well.

SELECT
[ClusterName] = SUBSTRING(@@SERVERNAME,0,CHARINDEX('\',@@SERVERNAME))
,[ActiveNode] = SERVERPROPERTY('ComputerNamePhysicalNetBIOS')
WHERE SERVERPROPERTY('ComputerNamePhysicalNetBIOS') <> SUBSTRING(@@SERVERNAME,0,CHARINDEX('\',@@SERVERNAME))
AND @@SERVERNAME <> SERVERPROPERTY('ComputerNamePhysicalNetBIOS')

This final query returns every node in each cluster, with the last column denoting the active node. Useful when you need to remember which servers belong to what and your documentation has gone missing – or doesn’t exist yet.

SELECT
[ClusterName] = SUBSTRING(@@SERVERNAME,0,CHARINDEX('\',@@SERVERNAME))
,[Nodes] = NodeName
,[IsActiveNode] = CASE WHEN NodeName = SERVERPROPERTY('ComputerNamePhysicalNetBIOS') THEN '1' ELSE '' END
FROM sys.dm_os_cluster_nodes
WHERE SERVERPROPERTY('ComputerNamePhysicalNetBIOS') <> SUBSTRING(@@SERVERNAME,0,CHARINDEX('\',@@SERVERNAME))
AND @@SERVERNAME <> SERVERPROPERTY('ComputerNamePhysicalNetBIOS')

Summary I shared these scripts with my colleagues, who were very excited about not having to play “Guess the Active Node” anymore. There might be more eloquent ways to determine cluster nodes, but this method is perfect for my environment. I suggest saving the query to a code repository project in Solution Explorer or as a Snippet so that it’s always easily accessible.

I’m participating in the many SQL Blog challenges this month, and I’d like to thank each of them for the push and community support they provide.

Using Not Exists to get Noncompliant Servers

Sometimes you need to perform a check against all servers to verify settings are correct. There are many ways to do this, but in some cases, you just need an idea of how many servers have “fallen through the cracks”.  Probably the fastest way to check all servers is to use a multi-server query through a Central Management Server, and using NOT EXISTS with a subquery will get those wayward servers.

In this scenario, I wanted to verify every nonsystem database was using the Full Recovery Model. I knew there were a few out there that were in Simple, and I needed to provide a list of each of those. The tsql to get this isn’t too hard.

SELECT
     [DatabaseName] = name
    ,recovery_model_desc
FROM sys.databases
WHERE recovery_model_desc = 'SIMPLE' AND database_id > 4

The results include the Server Name since it was a multi-server query. (Told you it was easy)

GetRecoveryModel

Great, I have a list of the offending databases. The next thing I did was create a policy that’s evaluated daily to ensure that all the new and existing databases continued being in Full Recovery. It’s a very easy condition to make. Policy-Based Management can be a daunting feature to get into with SQL Server, as there is not much documentation out there, but my link above will explain some of the basics for setting up a policy to get you started.

FullRecoveryCondition

However, the next issue to tackle was that I needed a list of all the servers that were missing a log backup job. Those needed to be catalogued and subsequently fixed. This is where it got a bit tricky. Using a CMS, I could get all the servers with the job very quickly, but what a pain that would be to filter and sort manually. I thought about it for a few moments, and came up with a great solution, using the NOT EXISTS clause. I’ve probably only used this clause a handful of times in the past, but this seemed like a perfect application.

SELECT [LogBackupJob] = 'Missing'
WHERE NOT EXISTS
(SELECT *
FROM msdb.dbo.sysjobs
WHERE name LIKE 'BackupLog%'

LogBackupsMissing

Again, the CMS displays the Server Name in the results, so I just added a static column to the select clause so that when I copied the results into another file, it would still make sense.

One thing to mention about NOT EXISTS. The select statement inside doesn’t matter. You can do Select *, Select 1, Select 50/0. It doesnt’ matter. In short, the select is not evaluated, the statement is just checking for existing records. There are a few quality articles out there explaining the evaluation process, or lack thereof, but they can be hard to find.

The applications for NOT EXISTS in retrieving a list of servers missing a feature are limitless, and now that I realized how well it works, I’ll likely be using it a lot more in the future. Security compliance, maintenance, and more can be monitored quickly to find offending instances.

Policy Based Management – Tricks & Tips

In continuation of my original post for TSQL Tuesday on the benefits and issues with Policy Based Management, I’d like to discuss options and suggestions for a basic, yet powerful, policy for evaluating the last backup date of each database. While this policy sounds simple enough, successfully evaluating it against every server in your environment may not be so easy. Each server might have special rules, and if you use high availability, you have an even bigger mess to deal with.

The first step is to create a blank policy, which I covered in detail in my original post. To verify a last backup, we need to use the Database Facet. This will cause the policy to evaluate against all databases by default, with some exceptions – see Targets below, but that can be altered per your whims, which I will address at the end.

Once the Create Policy dialogue box is open, we need to create a detailed condition for evaluation. This is going to require the easy-to-use but hard-to-figure-out-which-one-you-really-need built in fields, as well as some specialized ExecuteSql statements to make a robust policy.

Starting off, we only want to check the backups on databases that are in a normal state. We do this by selecting the Field, Operator, and Value of @Status = Normal. Simple enough, but now let’s dive into the trickier parts.

CreateCondition

The meat of the policy are the next two steps. Select AND @LastBackupDate = DATEADD(‘DAY’,-1,GETDATE()). Immediately create another line OR LastDifferentialBackupDate = DATEADD(‘DAY’,-1,GETDATE()). Note that unlike the normal sql script, you need single quotes around DAY. The Condition will error and remind you about this if you forget though. These expressions will check that a full or differential backup has been taken within the last 24 hours. Modify the date to fit your needs and remove the differential test if you never use them. You can then group the full and differential backup clauses together, and insert another clause to verify that a full backup has been taken within the last week. You can try using DATEADD(‘WEEK’) and that will not even error, however, it does not evaluate successfully. Stick with days and make your life easier.  Getting comfortable making grouped clauses is an imperative to successfully implementing complicated policies and this one ensures that some form of backup has been done every day, but still verifies that a full backup was done at least each week.

AddBackUpExpressions

If you need to apply different rules for specific databases or instances, you can do that too, but it won’t be fun. Create an ExecuteSQL statement that will test the Database name using @Name = ‘YourDatabase’ or create a custom field using ExecuteSql(‘String’, ‘SELECT SERVERPROPERTY(”InstanceName”)’) = ‘InstanceNotToBackup’. You can also evaluate based on database size if necessary. Perhaps you only perform full backups nightly on databases that are less than 100g in size, while larger databases only receive a differential backup. You can use the @Size Field (which is in megabytes) to check this value. Perform some grouping and you have a targeted policy.

Troubleshooting ExecuteSql If you have any problems with your ExecuteSql statements, test them by running the query in SSMS without the ExecuteSql wrapping. Get the query working first, then move onto evaluating the ExecuteSql string in the policy. When you write statements this way, don’t forget to add double quotes to your query since it will reside in a string when evaluated from the policy.

Mirroring By default, policies target Every Database. However, this is a lie. Policy Based Management actually targets every non system database that is accessible. This means that Mirroring takes care of itself. It will only evaulate against principal databases. The secondaries will not show up, and good luck trying to convince a policy to run against them. Even if you expicitly tell a database target to look at restoring or databases using mirroring, it will still skip them.

Availability Groups Since Avaibility Groups and their corresponding views did not exist prior to SQL2012, they provide a problem when you query a variety of SQL Server versions. In a multi-server query, you deal with this issue by first using the test: IF(SERVERPROPERTY('IsHadrEnabled')=1)

You’ll need to include that line in the condition to test AG queries successfully, so add it to your ExecuteSql string. If you forget, you will pay the price whenever you evaluate the policy against an older version of SQL.

Hadr

In verions prior to 2012, SQL cannot use sys.fn_hadr_backup_is_preferred_replica because it does not exist. Using an IF statement will let you bypass this error.

If you perform your backups on a preferred replica, you can use the following expression, evaluating to equal two. This passes the policy for non preferred replicas.

ExecuteSql('Numeric','
IF(SERVERPROPERTY(''IsHadrEnabled'')=1)
BEGIN
'SELECT sys.fn_hadr_backup_is_preferred_replica(DB_NAME())
END')

Perhaps you don’t care about preferred replicas though; you have no performance issues ever and live on the wild side – always performing your backups on the primary replica. You can allow all secondaries to automatically pass using the below condition, still checking for value two.

ExecuteSql('Numeric','
IF(SERVERPROPERTY(''IsHadrEnabled'')=1)
BEGIN
SELECT a.role FROM sys.dm_hadr_availability_replica_states a WHERE is_local = 1
END')

This all hinges on you having readable secondaries for your AGs, if you do not, the policy will skip over them regardless since the databases are not accessible.

AddAGExpression

Grouping the backup expressions again ensures they function properly

Arrays All the above conditions assume that you want to see every database, even if you are automatically passing them. Well, obviously TempDB is never going to be backed up, so we need to provide an expression for that, luckily this is simple and done via the @Name Field with the Value ‘Tempdb‘. If you want to exclude other databases, you can do so using the Operator IN, but if you think this works exactly like SQL and write (‘DB1′,’DB2’), it will never work. After much frustration, I realized that you need to set the value to Array(‘DB1′,’DB2’). Without the word Array, the IN clause will not evaluate correctly. Aren’t you glad I saved you all that frustration?

Test

I used an array to demonstrate how they are used, not because it was necessary for one value.

Targets Finally, you have the option of including all of these automatic passes not in the Condition you are checking, but in a custom Condition. Yes, it is very unfortunate that everything is a condition in PBM. So to make things easier, let’s call them Target Conditions and Evaluation Conditions. The same rules apply for creating an Evaluation Condition, but any databases that evaluate to True in the expression will be excluded from the evaluation results entirely. The main thing you need to be concious of is that when evaluating on the Database Facet, the default Every condition is false. It does not evaluate against nonaccessible databases (as I mentioned with Mirroring), and it does not target System databases. If you really want everything to be targeted, you need to create a Every Database Target Condition. This will seem rather redundant, but it works, and without explicitly stating you want both System and NonSystem objects, the target will try to do as little work as possible. It’s lazier than I am.

Don't even try making this a one line array, conditions hate that if you use standard fields.

Don’t even try making this a one line array, conditions hate that if you use standard fields.

Testing and Evaluation I’ve already mentiond that you should test ExecuteSql statements by running the query seperately. You’ll still have to do a lot of grouping testing and evaluating to get a complicated policy to work the way you want. When you evaluate a new policy, always view the results on a few databases and servers to verify that the actual results are in line with what your expected results are. If you get a bad statement, you might not be getting any actual results at all!

Test, test, test new policies!

Test, test, test new policies!

Automatic Passes vs. Targets For most policies I like to see every server and automatically pass the exceptions, hence why I presented conditions with so many automatic passes. Evaluating like this allows me to quickly determine that the server count is correct. It also quells the fear whenever anyone new evaluates a policy. Normally when someone notices a database is not evaluating, alarms go off in their head that the policy is broken and thus that it has never worked properly before. It can be quite a chore to calm them down and explain that was the intended result. That being said, all the Evaluation Conditions can be moved into the Target conditions instead, removing any unwanted results. You know TempDb is never going to be backed up, so moving it to the Target condition is of no consequence.

The main point to be stressed about Policy Management is that it is a powerful feature, but it does require you to learn how to use it through trial and error.

Multi-Server Job Scripting Failure

Central Management Servers are probably the single most important thing in my SQL life right now. I manage a lot of servers, so having an organized way to keep track of them is paramount. Multi-Server Jobs are a great way to centralize jobs that need to be on all your boxes. If you are not familiar with the idea, I suggest you look into it if you have more than a handful of servers.

I was building a new CMS the other day and had to get dozens of Multi-Server Jobs transferred to the new instance. Some of the jobs have 30 steps, so using Object Explorer to script out the jobs was the obvious choice. However, I was immediately hit with an ambiguous error stating that SQL Server was “Unable to cast object of type ‘System.DBNull’ to type ‘System.String’.”

Test

My original error had another, equally unhelpful line.

Knowing that some object is Null doesn’t help very much. After browsing through the job steps, everything looked in order. Internet searches provided surprisingly little on this subject, and nothing I found really explained how it happens. I know I was going to be migrating a CMS again, and soon, so I wanted to know how to fix it. Surely it wasn’t just a bug in Management Studio? On to the test case!

First of all, you need to create a Multi-Server job. I’m assuming you already have a CMS set up, if not, you need one.
Right-click Multi-Server Jobs, create a new job, and then assign at least two targets for testing. I’m going to use the name TestJob for the rest of this scenario.

test

Make sure you have at least 2 target servers

Once TestJob is made, (it does not matter what it does; mine just performed SELECT @@VERSION) you need to run the job. Execute the following code:
msdb.dbo.sp_start_job 'TestJob' or just start the job using the GUI.

Test

Your job can fail, I don’t care. Just run it.

Now that you have some job history for 2 servers, let’s look at that history:

SELECT *
FROM msdb.dbo.sysjobservers js
INNER JOIN msdb.dbo.sysjobs j ON j.job_id = js.job_id
WHERE j.name = 'TestJob'

This will display one line per targeted server. This query is included for your own reference, we are concerned about the next one’s more targeted results. The query is checking your multi-server job history, specifically the server list, and comparing that to the servers that exist on the CMS. Right now, everything looks fine, and matches.

SELECT
[JobName] = j.Name
,[JobServerId] = js.Server_id
,[TargetServerId] = ts.Server_id
FROM msdb.dbo.sysjobservers js
INNER JOIN msdb.dbo.sysjobs j ON j.job_id = js.job_id
LEFT JOIN msdb.dbo.systargetservers ts ON ts.server_id = js.server_id
WHERE j.name = 'TestJob'

Test

Job History Server_Id matches Target Server_Id, awesome!

Again, at this point, all is well, and you can script out the multi-server job to your heart’s content. That would be fine if you saved a copy and never, ever changed the job while it sat on your CMS. Fast forward a few years though, and weird things may have happened. Let’s simulate a weird thing. I’m going to change the Server_Id in the history to an invalid value. That’s right, there are no constraints preventing me from doing this! This is going to simulate removing a server from the CMS after history already exists for it, or someone just going into your history and changing things because they hate you.

You will need to change the numeric values of the server_ids to match invalid and valid server_ids on your CMS, respectively.
UPDATE js
SET Server_id = 7
FROM msdb.dbo.sysjobservers js
INNER JOIN msdb.dbo.sysjobs j ON j.job_id = js.job_id
WHERE j.name = 'TestJob' and server_id = 5

Now we are going to rerun the script from earlier to verify the job history information.

SELECT
[JobName] = j.Name
,[JobServerId] = js.Server_id
,[TargetServerId] = ts.Server_id
FROM msdb.dbo.sysjobservers js
INNER JOIN msdb.dbo.sysjobs j ON j.job_id = js.job_id
LEFT JOIN msdb.dbo.systargetservers ts ON ts.server_id = js.server_id
WHERE j.name = 'TestJob'

Test

The history now has a server_id that doesn’t exist on the CMS

Great…now we have a null server. If you guessed this is a problem, you guessed correctly. If you try to script the job out now, you will receive a failure notification like the one at the top. Now that we have recreated the issue, we know how to fix it though!

Run the following code to find any null servers and just clean out that history. I’m going to assume if your server doesn’t exist, you don’t care about the history. I didn’t at least.

DELETE FROM js
FROM msdb.dbo.sysjobservers js
INNER JOIN msdb.dbo.sysjobs j ON j.job_id = js.job_id
LEFT JOIN msdb.dbo.systargetservers ts ON ts.server_id = js.server_id
WHERE j.name = 'TestJob' AND ts.server_id IS NULL

Finally, let’s look at the job history again, I’ve supplied the code for a third time, just in case you forgot it.

SELECT
[JobName] = j.Name
,[JobServerId] = js.Server_id
,[TargetServerId] = ts.Server_id
FROM msdb.dbo.sysjobservers js
INNER JOIN msdb.dbo.sysjobs j ON j.job_id = js.job_id
LEFT JOIN msdb.dbo.systargetservers ts ON ts.server_id = js.server_id
WHERE j.name = 'TestJob'

Test

Now we have clean history with no null values

If you try to script the job, it will work, and you can copy your multi-server job over to the new server.

Let’s take it a step further, and expand the script to check all the multi-server jobs. Notice that it is only looking at Category_Id=2, or Multi-Server jobs, and only those with invalid Server_Ids.

SELECT
[JobName] = j.Name
,[JobServerId] = js.Server_id
,[TargetServerId] = ts.Server_id
FROM msdb.dbo.sysjobservers js
INNER JOIN msdb.dbo.sysjobs j ON j.job_id = js.job_id
LEFT JOIN msdb.dbo.systargetservers ts ON ts.server_id = js.server_id
WHERE j.category_id = 2 AND ts.server_id IS NULL

Hopefully you got 0 results, and everything on your CMS is happy.
Even though I changed the Server_Id to recreate this scenario, the original problem occurred due to servers defecting from the CMS, but leaving behind their server history. Anytime a server defects, it needs to automatically remove the history from the sysjobservers table as well, otherwise people will continue to have this problem.