PowerShell: Copy-Item to Many Remote Computers

I thought I already blogged about this, but when a friend asked me if I had a script for quickly moving a file to many different computers, I couldn’t find the page. Knowing it would be a lot easier to copy and paste a link rather than explain the script each time someone asked me for this, I had my next blog topic.

Basic Copying
Copying a file from one location to another using PowerShell is a pretty simple process, but you can encounter some hiccups with the file paths depending on your setup. At the most basic level, all you need is the Copy-Item cmdlet.

Copy-Item -Path 'C:\OriginalLocation\FileName.txt' -Destination 'C:\NewLocation'

Alternatively, you can use the Move-Item cmdlet in the same manner. Can you guess what the difference is? That’s right! Move-Item moves the item rather than copying it. I rarely move, but I copy all the time.
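To illustrate, here is the same call with Move-Item; only the cmdlet name changes:

Move-Item -Path 'C:\OriginalLocation\FileName.txt' -Destination 'C:\NewLocation'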

Multi-Computer Copying
The purpose of my script is for when you have a single file that needs to be pushed out to dozens of computers. It uses Copy-Item and loops through the computers. The function has extra Verbose logging, copy verification, and, of course, multi-computer copy capability.

Remember that $Path needs to be the local file path, including the file name and extension. The $Destination needs to be formatted for remote calls using an administrative share (DriveLetter$ rather than DriveLetter:) and should not include a file name. If you plan to rename the file across all the computers you are copying to, I suggest you just rename the original file first.

Finally, I suggest you always use the -Verbose parameter for this function. It’s just a lot nicer seeing what is happening to your computers rather than the script just sitting there running, leaving you in the dark while you fret over what it might be doing…or not doing.


Function Copy-File
{
  [CmdletBinding()]
  Param(
   [Parameter(ValueFromPipeline=$true,Mandatory=$true)]
    [String[]]$ComputerName,
   [Parameter(Mandatory=$true)]
    [String]$Path,
   [Parameter(Mandatory=$true)]
    [String]$Destination
  )
  Process {
   #Extract FileName from Path
   $File = Split-Path -Path $Path -Leaf

   ForEach ($Computer in $ComputerName)
   {
    Write-Verbose "Starting copy to $Computer"
    IF (Test-Path "\\$Computer\$Destination")
    {
     Write-Verbose "Folder $Destination exists on $Computer"
    }
    ELSE
    {
     Write-Verbose "Creating folder $Destination on $Computer"
     New-Item "\\$Computer\$Destination" -ItemType Directory | Out-Null
    }

    Write-Verbose "Copying $File to $Computer"
    TRY
    {
     #-ErrorAction Stop turns copy failures into terminating errors so the CATCH block fires
     Copy-Item -Path $Path -Destination "\\$Computer\$Destination" -Force -ErrorAction Stop
     Write-Host "Copied to \\$Computer\$Destination\$File`n" -ForegroundColor GREEN
    }
    CATCH
    {
     Write-Warning "Copy failed to $Computer`n"
    }
   }#EndForEach
  }#EndProcess
}#EndFunction

Example
And now for an example of execution. Declare the variables, then run the command. The function supports piping, but 99% of the time I just declare the variables first. Again, use a $ rather than a : in the destination’s path if you want this to copy files to a remote location. You should not include a preceding \\ on the drive letter, though; that’s taken care of in the concatenation.

##### EXAMPLE #####
$Comp = "Comp1","Comp2","Comp3" #Get-Content to read a list from txt file
$Path = "C:\Users\$Env:Username\Desktop\Apples.txt" #Include the FileName and Extension
$Dest = "C$\Users\$Env:Username\Desktop" #Be Sure to use a $ instead of a :

#Quick Variables
Copy-File -Verbose $Comp $Path $Dest

#Piping a variable into the function
$Comp | Copy-File -Verbose -Path $Path -Destination $Dest #Include the parameter names for safety
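And if your computer list lives in a text file, as the comment above suggests, Get-Content slots right in:

$Comp = Get-Content "C:\Users\$Env:Username\Desktop\Computers.txt" #One computer name per line
$Comp | Copy-File -Verbose -Path $Path -Destination $Dest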


Collection Set Failure Troubleshooting

My shop uses a lot of monitoring. In fact, we have tried, are using, or plan to use just about every monitoring feature built into SQL Server. Some of the monitoring overlaps in purpose, and some of it steps on each other’s toes. We get useful data, but we’ve also discovered a lot of unexpected errors. One such issue was uncovered this month when we were updating servers to a new cumulative update.

A little background first
We use Collection Sets, a more obscure feature of SQL Server that you may have never even heard of. I admit, I did not remember it existed before one of my colleagues suggested trying it. Collection sets essentially provide data audits and have helped us track down some anomalies in the past. They run nearly constant queries across all of our servers, then report back to the CMS to catalog the data. We’ve altered the jobs a bit, most importantly offsetting the collection sets’ upload schedules to avoid latency and blocking issues.

Well, after installing a Cumulative Update this month, all of the collection sets broke. Failures started flowing in stating that the data uploads were not processing and our inboxes exploded. We had to figure out what the problem was and then fix over a hundred servers.

Error Message
…Component name: DFT – Upload collection snapshot, Code: -1073450901, Subcomponent: SSIS.Pipeline, Description: “RFS – Read Current Upload Data” failed validation and returned validation status…

Solution
The first step was to determine how to get the collection set jobs working correctly again. We had to stop each collection set, then stop any collection upload jobs that were still running.


That alone wasn’t enough, though; we then had to clear the cache files as well. These are stored in a folder on each server running the collection sets. You can find your cache location in Object Explorer under Management -> Data Collection -> Right-click Properties.

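If you would rather not click through the GUI on a hundred servers, the same setting can be queried from msdb; to the best of my knowledge, it lives in the data collector’s configuration store ("Server1" is a placeholder):

#NULL or no CacheDirectory row generally means the default temp directory is in use
Invoke-Sqlcmd -ServerInstance "Server1" -Database msdb -Query "
SELECT parameter_name, parameter_value
FROM dbo.syscollector_config_store
WHERE parameter_name = 'CacheDirectory'"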

Deleting the cache seemed unnecessary to me at first, but when I tried skipping it, the collection still failed after restart.
We had to fix a lot of servers, and it would have taken hours to do this manually. So I scripted all the changes. I needed it done quickly because I was planning to get Korean food with a friend on the day of the break.

Scripts
You can do all of this through PowerShell using Invoke-SqlCmd, but I find that process is very slow to make an initial connection and limits feedback, and I just needed this done fast.
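That said, if you would rather drive the whole fix from PowerShell, a minimal sketch using Invoke-SqlCmd might look like this (assuming default instances and that a module providing Invoke-SqlCmd is loaded; the procedures are the same ones in the SQL block below):

$Computers = Get-Content C:\Users\$Env:UserName\Desktop\Computers.txt
$StopSql = @"
EXEC msdb.dbo.sp_syscollector_stop_collection_set @name = 'Query Statistics';
EXEC msdb.dbo.sp_syscollector_stop_collection_set @name = 'Server Activity';
"@
ForEach ($Computer in $Computers)
{
    #Stop the collection sets on each instance before clearing its cache
    Invoke-Sqlcmd -ServerInstance $Computer -Database msdb -Query $StopSql
}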

-- SQL
-- Stop Collection Sets
EXEC msdb.dbo.sp_syscollector_stop_collection_set @name = 'Query Statistics'
EXEC msdb.dbo.sp_syscollector_stop_collection_set @name = 'Server Activity'

-- Stop Collection Upload jobs - they keep running after collection stops
-- Change job names as necessary
EXEC msdb.dbo.sp_stop_job @job_name = 'collection_set_3_upload'
EXEC msdb.dbo.sp_stop_job @job_name = 'collection_set_2_upload'
## POWERSHELL
# Create file with computer names to connect to, or supply names with commas
$Computers = (Get-Content C:\Users\$Env:UserName\Desktop\Computers.txt)
# Provide path where you save your cache files
$Cache = "D$\Cache\CollectionSet"

$StartTime = Get-Date
ForEach ($Computer in $Computers)
{
    $BeginTime = Get-Date
    (Get-ChildItem "\\$Computer\$Cache" -Recurse) | Remove-Item -Force
    "$Computer complete in $(((Get-Date) - $BeginTime).TotalSeconds) seconds."
}
"Total Runtime: $(((Get-Date) - $StartTime).TotalSeconds) seconds."
-- SQL
-- Start Collection Sets
EXEC msdb.dbo.sp_syscollector_start_collection_set @name = 'Query Statistics'
EXEC msdb.dbo.sp_syscollector_start_collection_set @name = 'Server Activity'

The PowerShell code runs against one server at a time, but clearing the cache folders is quick, so it should finish fast. I included the computer name and run time as feedback so you don’t sit wondering how far along the script is. I had no feedback in my original script, so I sat fretting over how long it would take and how long I’d have to wait to go to lunch.

Database Owner Verification

There’s often low-hanging fruit once you start looking into servers, or you just need to verify settings are correct. Recently I had to verify that all databases had the correct owner for auditing purposes. There are many scripts out there to do this, but I wanted to make a few exceptions, and I didn’t need a lot of extra fluff for this specific task either. I also had a plan to convert this to a policy afterwards to ensure settings were not changed later.

I used two different scripts, one for 2012 servers that might be utilizing Availability Groups and one for 2008 servers that might be using Mirroring. I wanted to allow exceptions for databases that were mirrors or secondaries, because without failing them over, I could not update the owner. These will be fixed on the next scheduled failover, but until then, I don’t want the exceptions mucking up my results.

The WHERE clause excludes my allowed owner while also verifying that I don’t have any databases where the ownership has been broken and listed as NULL.  Server roles for AGs and Mirrors are included in the result set so that when I want everything, I can exclude the WHERE clause and quickly see what’s what.

SQL 2012 (Availability Groups)

SELECT
d.name
,[Owner] = SUSER_SNAME(d.owner_sid)
,s.role_desc
FROM sys.databases d
LEFT JOIN sys.dm_hadr_availability_replica_states s
ON s.replica_id = d.replica_id AND s.is_local = 1
WHERE (SUSER_SNAME(d.owner_sid) <> 'SA' OR SUSER_SNAME(d.owner_sid) IS NULL)
AND (s.role_desc = 'PRIMARY' OR s.role_desc IS NULL)

SQL 2008 (Mirrors)

This works the same as the query above. The 2012 query will error on versions older than SQL 2012 because of the HADR view, but this one runs on 2008 while still determining whether any of your servers have the wrong database owner. Just exclude the final AND clause to show the mirrors.


SELECT
d.name
,[Owner] = suser_sname(owner_sid)
,m.mirroring_role_desc
,d.state_desc
FROM sys.databases d
LEFT JOIN sys.database_mirroring m ON m.database_id = d.database_id
WHERE
(SUSER_SNAME(owner_sid) <> 'SA' OR SUSER_SNAME(owner_sid) IS NULL)
AND (m.mirroring_role_desc <> 'mirror' OR m.mirroring_role_desc IS NULL)

Afterwards, I used Policy Based Management to create a policy checking for DB ownership. I chose the simple way of creating a Database Facet condition checking that @Owner = 'SA'. (You should rename your sa account, but for demonstration purposes it is the most obvious.) The policy will check every database but automatically skip mirrors/secondaries because it cannot read those databases. The script remains useful for checking the databases that the policy cannot read.
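If you need to correct the owners you find, ALTER AUTHORIZATION will change them. Below is a rough sketch of how I would script it, assuming Invoke-SqlCmd is available and that 'sa' (or your renamed account) is the desired owner; as noted above, mirrors and secondaries will fail until their next failover:

$Instances = Get-Content C:\Users\$Env:UserName\Desktop\Computers.txt
ForEach ($Instance in $Instances)
{
    #Find nonsystem databases with a missing or incorrect owner
    $Databases = Invoke-Sqlcmd -ServerInstance $Instance -Query "
        SELECT name FROM sys.databases
        WHERE database_id > 4
        AND (SUSER_SNAME(owner_sid) <> 'SA' OR SUSER_SNAME(owner_sid) IS NULL)"
    ForEach ($Db in $Databases)
    {
        #Fails on mirrors/secondaries; those get corrected at the next failover
        Invoke-Sqlcmd -ServerInstance $Instance -Query "ALTER AUTHORIZATION ON DATABASE::[$($Db.name)] TO [sa];"
    }
}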

PowerShell Serial Function to Parallel Workflow

I’ve been feeling very underwhelmed about blogging for the past week, so I needed something short and simple to discuss this time. Luckily I had an old script that I could adapt and explain.

PowerShell introduced Workflows in version 3, and they are particularly wonderful for the ability to create a ForEach loop that runs in parallel. This can significantly reduce your run times when making numerous remote connections, but as with most things, it depends on your use and environment.

A standard PowerShell function executes against multiple computers in a serial fashion, meaning one at a time. This is perfectly fine for a small number of computers, but as the list grows, so does the time required. Adapting a function to a Workflow is not especially difficult, but you may run into some invalid commands. In my examples below, I built a function and workflow with nearly identical syntax. If you look closely though, you’ll see that the Get-Service cmdlet takes a -PSComputerName parameter instead of the standard -ComputerName parameter, because cmdlets run as activities inside a workflow. Also, Out-Null is not included in Workflows; I had to pipe to InlineScript in order to get that to work. Further details about Workflow scripting can be found here.

The example functions execute serially while the workflow executes in parallel. Get-ServiceTimed uses the standard cmdlet and calculates the total run time like the rest of the examples; I’ve included it solely for comparison to Get-ServiceSerial, which uses a ForEach loop so as to be as similar to the workflow as possible.

Each function calculates and returns the runtime.

Cmdlet
Executed in 33.51 seconds against 32 computers.
Executed in 0.03 seconds against 2 computers.

Function Get-ServiceTimed
{
    Param($Computers)
    $StartTime = Get-Date
    Get-Service -ComputerName $Computers | Out-Null
    "Run Time - Cmdlet - $(((Get-Date) - $StartTime).TotalSeconds) seconds."
}

ForEach Serial
Executed in 29.30 seconds against 32 computers, just as slow as using the cmdlet normally.
Executed in 0.03 seconds against 2 computers.

Function Get-ServiceSerial
{
    Param($Computers)
    $StartTime = Get-Date
    ForEach ($Computer In $Computers)
    { Get-Service -ComputerName $Computer | Out-Null }
    "Run Time - Serial - $(((Get-Date) - $StartTime).TotalSeconds) seconds."
}

ForEach Parallel
Executed in 12.36 seconds against 32 computers. More than twice as fast as the serial execution.
Executed in 5.88 seconds against 2 computers. Significantly slower than any serial test.

WorkFlow Get-ServiceParallel
{
    Param($Computers)
    $StartTime = Get-Date
    ForEach -Parallel ($Computer In $Computers)
    { Get-Service -PSComputerName $Computer | InlineScript { Out-Null } }
    "Run Time - Parallel - $(((Get-Date) - $StartTime).TotalSeconds) seconds."
}
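Once loaded, all three are invoked the same way; workflows are called just like functions:

$Computers = Get-Content C:\Users\$Env:UserName\Desktop\Computers.txt
Get-ServiceTimed -Computers $Computers
Get-ServiceSerial -Computers $Computers
Get-ServiceParallel -Computers $Computers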

Summary I only supplied a few example execution times because they vary wildly depending on network traffic and configuration. Serial execution was actually faster than parallel execution on average until I was connecting to more than ten computers. If you only work with a handful of computers, you may not need to worry about this. I’m often connecting to at least 30 computers at once, in which case parallel execution is always much faster.

Remotely Get & Set Delayed Start for a Service in PowerShell

This week I was asked to check all the SQL Servers to verify that a specific service was running and that it was set to Delayed Start. Requests like this are great fun for me; I get to play with PowerShell and sometimes even get to reuse scripts I’ve written.

The first step was something I’ve done before, setting a service through PowerShell, so I was confident that I had a script for it and the task would be complete in minutes.

Setup

I had around a hundred servers to update. My advice is to use a plain text file with a server list if you have to update more than even half a dozen computers; it’s just a lot easier than formatting them all for a variable. The declaration below pulls the contents of that file using the Get-Content cmdlet. I keep a computer list on my desktop, and the variable $Env:UserName will pull the currently logged-in user, so anytime I share a script, the next person just has to make a similar list. Remember to put one computer name per line.

$Computers = ( Get-Content C:\Users\$Env:UserName\Desktop\Computers.txt )

I’m going to work with the Windows Update service in the following examples since every computer should have this.

$Service = "wuauserv"

What doesn’t work

It turns out that PowerShell does not natively support setting Automatic (Delayed Start) with the Set-Service cmdlet. Why was it overlooked? There are a few Connect threads out there for the issue. Go ahead and vote them up; we might get lucky and it will make a difference.


Like the Connect threads mention, there is no member for Startup Type; you can verify that it’s missing with the following query to Get-Service.

Get-Service | Get-Member -MemberType Property
Notice Startup Type is missing

This means we will have to set the service to running first, then set the startup type via a workaround afterwards.

There are a number of ways to set a service to running in PowerShell, but I’ll share two short ones. They are basically the same method; the first one just pipes the command. The benefit of the first method is that you can run the command before the pipe to verify what you are about to affect. It’s a bit like doing a select statement before you run an update in SQL. It’s all about being safe.

Method 1: Piping

Run the code before the pipe (the | character) first to verify what is going to be set. Then run the whole part once you are satisfied with the results.

Get-Service $Service -ComputerName $Computers |
Set-Service -Status Running

Method 2: Single Cmdlet

This method is more straightforward. Nothing is wrong with Method 2; it’s just for those who are a bit more confident in their PowerShell skills, or for the ones who like to throw caution to the wind.

Set-Service -Name $Service -ComputerName $Computers -Status Running

Verify Service is Running

It’s a good idea to verify that the services are now running afterwards. There’s almost always a computer that fails to connect, and you’ll have the peace of mind that some random code you found online did run successfully. This query returns any computers with a service that is still not running.

Get-Service $Service -ComputerName $Computers |
Where-Object { $_.Status -ne 'Running' } |
Format-Table MachineName,Status,Name -AutoSize

Set Delayed Start

This step gets problematic. We have to step outside PowerShell a bit to use a command line tool that is a little obscure, at least to me: sc config. Throwing a ForEach loop around the command will allow it to run against all your computers. Just remember the \\$Computer parameter, and note that Start= Delayed-Auto has to be written exactly like I have it (including the space after the equal sign) unless you like getting errors. I added an optional Write-Host to insert whitespace and the computer name; you’ll want that if any connections fail so that you know which computer is the issue.

ForEach ( $Computer In $Computers )
{
    Write-Host "`n $Computer"
    SC.EXE \\$Computer Config $Service Start= Delayed-Auto
}

Verifying Delayed Start

Now we have to use yet another method to verify that the service is set to Delayed Start. We already know PowerShell doesn’t work. Using WMI doesn’t help either; it returns misleading results. Everything is Auto! It doesn’t tell you whether a service is Delayed Start or regular Automatic. For example:

ForEach ( $Computer In $Computers )
{
    Get-WmiObject -Query "Select StartMode From Win32_Service Where Name='$Service'" -ComputerName $Computer |
    Select-Object @{l='MachineName';e={$Computer}},StartMode
}

The only viable choice is to search for a registry key that is created for Delayed Start services. The following script will return your desired service if it has a Delayed Auto Start value. Since it’s such a weird query, I wanted to see everything this time. I’m using Invoke-Command to demonstrate a different cmdlet. Notice that I defined $Service inside the ScriptBlock; if you fail to do this, the variable will not exist in the remote session. It’s all compartmentalized inside the Invoke-Command, per computer.

Invoke-Command -ComputerName $Computers -ScriptBlock {
    $Service = "wuauserv"
    Write-Host "`n $Env:ComputerName"
    Get-ChildItem HKLM:\SYSTEM\CurrentControlSet\Services |
    Where-Object { $_.Property -Contains "DelayedAutoStart" -And $_.PsChildName -Like "$Service*" } |
    Select-Object -ExpandProperty PSChildName
}
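One caveat worth noting: the registry value can exist but be set to 0, in which case the service is not actually delayed. If you want to be strict, read the value itself rather than just checking that it exists. A minimal sketch, assuming the DelayedAutostart value name that sc config writes:

Invoke-Command -ComputerName $Computers -ScriptBlock {
    $Service = "wuauserv"
    $Reg = Get-ItemProperty "HKLM:\SYSTEM\CurrentControlSet\Services\$Service" -Name DelayedAutostart -ErrorAction SilentlyContinue
    [PSCustomObject]@{
        ComputerName     = $Env:ComputerName
        DelayedAutostart = [bool]$Reg.DelayedAutostart #1 = delayed; 0 or missing = not delayed
    }
}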

Summary

It was a lot more work than I anticipated to set Delayed Auto Start on a service, but I learned some neat tricks in the meantime. If you just have to set a service to running or to one of the three supported startup types, you’ve gotten off easy, and you won’t need this workaround.

Using Not Exists to get Noncompliant Servers

Sometimes you need to perform a check against all servers to verify settings are correct. There are many ways to do this, but in some cases, you just need an idea of how many servers have “fallen through the cracks”.  Probably the fastest way to check all servers is to use a multi-server query through a Central Management Server, and using NOT EXISTS with a subquery will get those wayward servers.

In this scenario, I wanted to verify that every nonsystem database was using the Full recovery model. I knew there were a few out there that were in Simple, and I needed to provide a list of each of those. The T-SQL to get this isn’t too hard.

SELECT
     [DatabaseName] = name
    ,recovery_model_desc
FROM sys.databases
WHERE recovery_model_desc = 'SIMPLE' AND database_id > 4

The results include the Server Name since it was a multi-server query. (Told you it was easy)


Great, I have a list of the offending databases. The next thing I did was create a policy that’s evaluated daily to ensure that all new and existing databases continue to use Full recovery. It’s a very easy condition to make. Policy-Based Management can be a daunting feature to get into with SQL Server, as there is not much documentation out there, but my link above will explain some of the basics for setting up a policy to get you started.
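As a quick aside, if you want to flip an offender back to Full rather than just report on it, a single ALTER DATABASE per database does the trick. A sketch using Invoke-SqlCmd ("Server1" and YourDatabase are placeholders); just remember the log chain doesn’t start until the next full or differential backup:

#Switch one database back to the Full recovery model
Invoke-Sqlcmd -ServerInstance "Server1" -Query "ALTER DATABASE [YourDatabase] SET RECOVERY FULL;"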


However, the next issue to tackle was that I needed a list of all the servers that were missing a log backup job. Those needed to be catalogued and subsequently fixed. This is where it got a bit tricky. Using a CMS, I could get all the servers with the job very quickly, but what a pain that would be to filter and sort manually. I thought about it for a few moments, and came up with a great solution, using the NOT EXISTS clause. I’ve probably only used this clause a handful of times in the past, but this seemed like a perfect application.

SELECT [LogBackupJob] = 'Missing'
WHERE NOT EXISTS
(SELECT *
FROM msdb.dbo.sysjobs
WHERE name LIKE 'BackupLog%')


Again, the CMS displays the Server Name in the results, so I just added a static column to the select clause so that when I copied the results into another file, it would still make sense.

One thing to mention about NOT EXISTS: the select list inside doesn’t matter. You can write SELECT *, SELECT 1, or even SELECT 50/0, and it makes no difference. The select list is not evaluated; the statement is just checking for the existence of records. There are a few quality articles out there explaining the evaluation process, or lack thereof, but they can be hard to find.

The applications for NOT EXISTS in retrieving a list of servers missing a feature are limitless, and now that I realized how well it works, I’ll likely be using it a lot more in the future. Security compliance, maintenance, and more can be monitored quickly to find offending instances.

Policy Based Management – Tricks & Tips

In continuation of my original post for TSQL Tuesday on the benefits and issues with Policy Based Management, I’d like to discuss options and suggestions for a basic, yet powerful, policy for evaluating the last backup date of each database. While this policy sounds simple enough, successfully evaluating it against every server in your environment may not be so easy. Each server might have special rules, and if you use high availability, you have an even bigger mess to deal with.

The first step is to create a blank policy, which I covered in detail in my original post. To verify a last backup, we need to use the Database Facet. This will cause the policy to evaluate against all databases by default, with some exceptions – see Targets below, but that can be altered per your whims, which I will address at the end.

Once the Create Policy dialogue box is open, we need to create a detailed condition for evaluation. This is going to require the built-in fields, which are easy to use but hard to choose between, as well as some specialized ExecuteSql statements to make a robust policy.

Starting off, we only want to check the backups on databases that are in a normal state. We do this by selecting the Field, Operator, and Value of @Status = Normal. Simple enough, but now let’s dive into the trickier parts.


The meat of the policy is the next two steps. Select AND @LastBackupDate >= DATEADD('DAY',-1,GETDATE()). Immediately create another line with OR @LastDifferentialBackupDate >= DATEADD('DAY',-1,GETDATE()). Note that unlike a normal SQL script, you need single quotes around DAY; the Condition will error and remind you about this if you forget, though. These expressions check that a full or differential backup has been taken within the last 24 hours. Modify the date to fit your needs, and remove the differential test if you never use them. You can then group the full and differential backup clauses together and insert another clause to verify that a full backup has been taken within the last week. You can try using DATEADD('WEEK') and that will not even error; however, it does not evaluate successfully. Stick with days and make your life easier. Getting comfortable making grouped clauses is imperative to successfully implementing complicated policies, and this one ensures that some form of backup has been taken every day while still verifying that a full backup was taken at least once a week.


If you need to apply different rules for specific databases or instances, you can do that too, but it won’t be fun. Create an expression that tests the database name using @Name = 'YourDatabase', or create a custom field using ExecuteSql('String', 'SELECT SERVERPROPERTY(''InstanceName'')') = 'InstanceNotToBackup'. You can also evaluate based on database size if necessary. Perhaps you only perform full backups nightly on databases that are less than 100 GB in size, while larger databases only receive a differential backup. You can use the @Size field (which is in megabytes) to check this value. Perform some grouping and you have a targeted policy.

Troubleshooting ExecuteSql If you have any problems with your ExecuteSql statements, test them by running the query in SSMS without the ExecuteSql wrapping. Get the query working first, then move on to evaluating the ExecuteSql string in the policy. When you write statements this way, don’t forget to double up the single quotes inside your query, since it will reside in a string when evaluated from the policy.

Mirroring By default, policies target Every Database. However, this is a lie. Policy Based Management actually targets every nonsystem database that is accessible. This means that Mirroring takes care of itself: the policy will only evaluate against principal databases. The secondaries will not show up, and good luck trying to convince a policy to run against them. Even if you explicitly tell a database target to look at restoring databases or databases using mirroring, it will still skip them.

Availability Groups Since Availability Groups and their corresponding views did not exist prior to SQL 2012, they pose a problem when you query a variety of SQL Server versions. In a multi-server query, you deal with this issue by first using the test: IF(SERVERPROPERTY('IsHadrEnabled')=1)

You’ll need to include that line in the condition to test AG queries successfully, so add it to your ExecuteSql string. If you forget, you will pay the price whenever you evaluate the policy against an older version of SQL.


In versions prior to 2012, SQL cannot use sys.fn_hadr_backup_is_preferred_replica because it does not exist. Using an IF statement lets you bypass this error.

If you perform your backups on a preferred replica, you can use the following expression, evaluating to equal two. This passes the policy for non-preferred replicas.

ExecuteSql('Numeric','
IF(SERVERPROPERTY(''IsHadrEnabled'')=1)
BEGIN
SELECT sys.fn_hadr_backup_is_preferred_replica(DB_NAME())
END')

Perhaps you don’t care about preferred replicas, though; you have no performance issues ever and live on the wild side, always performing your backups on the primary replica. You can allow all secondaries to automatically pass using the condition below, still checking for the value two.

ExecuteSql('Numeric','
IF(SERVERPROPERTY(''IsHadrEnabled'')=1)
BEGIN
SELECT a.role FROM sys.dm_hadr_availability_replica_states a WHERE is_local = 1
END')

This all hinges on you having readable secondaries for your AGs; if you do not, the policy will skip over them regardless, since the databases are not accessible.


Grouping the backup expressions again ensures they function properly

Arrays All the above conditions assume that you want to see every database, even if you are automatically passing them. Well, obviously TempDB is never going to be backed up, so we need to provide an expression for that. Luckily this is simple, done via the @Name field with the value 'TempDB'. If you want to exclude other databases, you can do so using the Operator IN, but if you think this works exactly like SQL and write ('DB1','DB2'), it will never work. After much frustration, I realized that you need to set the value to Array('DB1','DB2'). Without the word Array, the IN clause will not evaluate correctly. Aren’t you glad I saved you all that frustration?


I used an array to demonstrate how they are used, not because it was necessary for one value.

Targets Finally, you have the option of including all of these automatic passes not in the Condition you are checking, but in a custom Condition. Yes, it is very unfortunate that everything is a condition in PBM, so to make things easier, let’s call them Target Conditions and Evaluation Conditions. The same rules apply for creating an Evaluation Condition, but any databases that evaluate to True in the expression will be excluded from the evaluation results entirely. The main thing you need to be conscious of is that when evaluating on the Database Facet, the default Every target is misleading: it does not evaluate against nonaccessible databases (as I mentioned with Mirroring), and it does not target system databases. If you really want everything to be targeted, you need to create an Every Database Target Condition. This will seem rather redundant, but it works; without explicitly stating you want both System and NonSystem objects, the target will try to do as little work as possible. It’s lazier than I am.


Don’t even try making this a one line array, conditions hate that if you use standard fields.

Testing and Evaluation I’ve already mentioned that you should test ExecuteSql statements by running the query separately. You’ll still have to do a lot of grouping, testing, and evaluating to get a complicated policy to work the way you want. When you evaluate a new policy, always view the results on a few databases and servers to verify that the actual results are in line with your expected results. If you have a bad statement, you might not be getting any actual results at all!


Test, test, test new policies!

Automatic Passes vs. Targets For most policies, I like to see every server and automatically pass the exceptions, which is why I presented conditions with so many automatic passes. Evaluating like this allows me to quickly determine that the server count is correct. It also quells the fear whenever someone new evaluates a policy: normally, when someone notices a database is not evaluating, alarms go off in their head that the policy is broken and thus has never worked properly before. It can be quite a chore to calm them down and explain that this was the intended result. That being said, all the Evaluation Conditions can be moved into the Target Conditions instead, removing any unwanted results. You know TempDB is never going to be backed up, so moving it to the Target Condition is of no consequence.

The main point to be stressed about Policy Management is that it is a powerful feature, but it does require you to learn how to use it through trial and error.

Multi-Server Job Scripting Failure

Central Management Servers are probably the single most important thing in my SQL life right now. I manage a lot of servers, so having an organized way to keep track of them is paramount. Multi-Server Jobs are a great way to centralize jobs that need to be on all your boxes. If you are not familiar with the idea, I suggest you look into it if you have more than a handful of servers.

I was building a new CMS the other day and had to get dozens of Multi-Server Jobs transferred to the new instance. Some of the jobs have 30 steps, so using Object Explorer to script out the jobs was the obvious choice. However, I was immediately hit with an ambiguous error stating that SQL Server was “Unable to cast object of type ‘System.DBNull’ to type ‘System.String’.”


My original error had another, equally unhelpful line.

Knowing that some object is Null doesn’t help very much. After browsing through the job steps, everything looked in order. Internet searches provided surprisingly little on this subject, and nothing I found really explained how it happens. I knew I was going to be migrating a CMS again soon, so I wanted to know how to fix it. Surely it wasn’t just a bug in Management Studio? On to the test case!

First of all, you need to create a Multi-Server job. I’m assuming you already have a CMS set up; if not, you need one.
Right-click Multi-Server Jobs, create a new job, and then assign at least two targets for testing. I’m going to use the name TestJob for the rest of this scenario.


Make sure you have at least 2 target servers

Once TestJob is made (it does not matter what it does; mine just performed SELECT @@VERSION), you need to run the job. Execute the following code:
EXEC msdb.dbo.sp_start_job 'TestJob'
or just start the job using the GUI.


Your job can fail, I don’t care. Just run it.

Now that you have some job history for 2 servers, let’s look at that history:

SELECT *
FROM msdb.dbo.sysjobservers js
INNER JOIN msdb.dbo.sysjobs j ON j.job_id = js.job_id
WHERE j.name = 'TestJob'

This will display one line per targeted server; it’s included for your own reference. We are concerned with the next query’s more targeted results: it checks your multi-server job history, specifically the server list, and compares that to the servers that exist on the CMS. Right now, everything looks fine and matches.

SELECT
[JobName] = j.Name
,[JobServerId] = js.Server_id
,[TargetServerId] = ts.Server_id
FROM msdb.dbo.sysjobservers js
INNER JOIN msdb.dbo.sysjobs j ON j.job_id = js.job_id
LEFT JOIN msdb.dbo.systargetservers ts ON ts.server_id = js.server_id
WHERE j.name = 'TestJob'


Job History Server_Id matches Target Server_Id, awesome!

Again, at this point, all is well, and you can script out the multi-server job to your heart’s content. That would be fine if you saved a copy and never, ever changed the job while it sat on your CMS. Fast forward a few years though, and weird things may have happened. Let’s simulate a weird thing. I’m going to change the Server_Id in the history to an invalid value. That’s right, there are no constraints preventing me from doing this! This is going to simulate removing a server from the CMS after history already exists for it, or someone just going into your history and changing things because they hate you.

You will need to change the numeric values of the server_ids to match invalid and valid server_ids on your CMS, respectively.
UPDATE js
SET Server_id = 7
FROM msdb.dbo.sysjobservers js
INNER JOIN msdb.dbo.sysjobs j ON j.job_id = js.job_id
WHERE j.name = 'TestJob' and server_id = 5

Now we are going to rerun the script from earlier to verify the job history information.

SELECT
[JobName] = j.Name
,[JobServerId] = js.Server_id
,[TargetServerId] = ts.Server_id
FROM msdb.dbo.sysjobservers js
INNER JOIN msdb.dbo.sysjobs j ON j.job_id = js.job_id
LEFT JOIN msdb.dbo.systargetservers ts ON ts.server_id = js.server_id
WHERE j.name = 'TestJob'


The history now has a server_id that doesn’t exist on the CMS

Great…now we have a null server. If you guessed this is a problem, you guessed correctly. If you try to script the job out now, you will receive a failure notification like the one at the top. Now that we have recreated the issue, we know how to fix it though!

Run the following code to find any null servers and just clean out that history. I’m going to assume if your server doesn’t exist, you don’t care about the history. I didn’t at least.

DELETE FROM js
FROM msdb.dbo.sysjobservers js
INNER JOIN msdb.dbo.sysjobs j ON j.job_id = js.job_id
LEFT JOIN msdb.dbo.systargetservers ts ON ts.server_id = js.server_id
WHERE j.name = 'TestJob' AND ts.server_id IS NULL

Finally, let’s look at the job history again; I’ve supplied the code for a third time, just in case you forgot it.

SELECT
[JobName] = j.Name
,[JobServerId] = js.Server_id
,[TargetServerId] = ts.Server_id
FROM msdb.dbo.sysjobservers js
INNER JOIN msdb.dbo.sysjobs j ON j.job_id = js.job_id
LEFT JOIN msdb.dbo.systargetservers ts ON ts.server_id = js.server_id
WHERE j.name = 'TestJob'


Now we have clean history with no null values

If you try to script the job, it will work, and you can copy your multi-server job over to the new server.

Let’s take it a step further, and expand the script to check all the multi-server jobs. Notice that it is only looking at Category_Id=2, or Multi-Server jobs, and only those with invalid Server_Ids.

SELECT
[JobName] = j.Name
,[JobServerId] = js.Server_id
,[TargetServerId] = ts.Server_id
FROM msdb.dbo.sysjobservers js
INNER JOIN msdb.dbo.sysjobs j ON j.job_id = js.job_id
LEFT JOIN msdb.dbo.systargetservers ts ON ts.server_id = js.server_id
WHERE j.category_id = 2 AND ts.server_id IS NULL

Hopefully you got 0 results, and everything on your CMS is happy.
Even though I changed the Server_Id to recreate this scenario, the original problem occurred because servers defected from the CMS but left behind their server history. Anytime a server defects, it needs to remove its history from the sysjobservers table as well; otherwise, people will continue to have this problem.
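By the way, if that check did turn up orphans on your CMS, the earlier DELETE generalizes the same way. Here is how I would run it from PowerShell ("YourCMS" is a placeholder for the CMS instance name):

$CleanupSql = @"
DELETE FROM js
FROM msdb.dbo.sysjobservers js
INNER JOIN msdb.dbo.sysjobs j ON j.job_id = js.job_id
LEFT JOIN msdb.dbo.systargetservers ts ON ts.server_id = js.server_id
WHERE j.category_id = 2 AND ts.server_id IS NULL
"@
#Clean out job history for servers that no longer exist on the CMS
Invoke-Sqlcmd -ServerInstance "YourCMS" -Database msdb -Query $CleanupSql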