Failover Cluster Manager Connection Error Fix

A few days ago I encountered a new error with Failover Cluster Manager.  A couple of servers had been rebuilt to upgrade them from Windows Server 2008 to 2012. They were added back to the cluster successfully. However, one of the servers would not open Failover Cluster Manager properly, and tracking down the solution took a long time.

The problem server successfully joined the cluster, but now it would not connect to the cluster using Failover Cluster Manager. If you opened up the application, it didn’t try to automatically connect, and manually connecting with the fully qualified name failed too. Below is the generated error.

[Image: Failover Cluster Manager WMI error]

I love how this error has absolutely no useful information in it. Luckily I was able to track Error 0x80010002 down online.

Research indicated that there was some sort of WMI error on the computer. Rebooting didn’t help anything, and after numerous attempts to correct/rebuild the WMI repository, not much was accomplished. Eventually, the server could connect to the cluster, but that only worked about 30% of the time, and it nearly timed out even when it did succeed! The cluster still never connected automatically.

After further poking around on the internet, I found a few suggested solutions, with my ultimate fix closely following this post. I still had to combine everything together and run scripts all over the cluster before things returned to normal.

First of all, this is a condensed version of the Cluster Query from the TechNet post linked above.

1) Cluster Query


$Nodes = Get-ClusterNode
ForEach ($Node in $Nodes)
{
 If($Node.State -eq "Down")
  { Write-Host "$Node : Node down skipping" }
 Else
 {
  Try
  {
   $Result = (Get-WmiObject -Class "MSCluster_CLUSTER" -NameSpace "root\MSCluster" -Authentication PacketPrivacy -ComputerName $Node -ErrorAction Stop).__SERVER
   Write-Host -ForegroundColor Green "$Node : WMI query succeeded"
  }
  Catch
  {
   Write-Host -ForegroundColor Red "$Node : WMI Query failed" -NoNewline
   Write-Host  " //"$_.Exception.Message
  }
 }
}
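
If you would rather get a list of failing nodes back than read colored console output, the same check can return objects instead. This is only a sketch: Test-ClusterWmi and its -Probe parameter are names I made up for illustration, and the default probe is the same Get-WmiObject call used in the query above.

```powershell
# Sketch: same WMI check as the query above, but returning objects so the
# failures can be filtered. Test-ClusterWmi and -Probe are hypothetical names.
function Test-ClusterWmi {
    param(
        [Parameter(Mandatory)][string[]]$NodeName,
        # The probe runs once per node and should throw on failure.
        [scriptblock]$Probe = {
            param($Name)
            Get-WmiObject -Class "MSCluster_CLUSTER" -Namespace "root\MSCluster" `
                -Authentication PacketPrivacy -ComputerName $Name -ErrorAction Stop
        }
    )
    foreach ($Name in $NodeName) {
        try {
            & $Probe $Name | Out-Null
            [pscustomobject]@{ Node = $Name; Ok = $true; Message = "" }
        }
        catch {
            [pscustomobject]@{ Node = $Name; Ok = $false; Message = $_.Exception.Message }
        }
    }
}

# Usage on a cluster node (skips nodes that are down, like the original):
# $Up = Get-ClusterNode | Where-Object { $_.State -ne "Down" } | ForEach-Object { $_.Name }
# Test-ClusterWmi -NodeName $Up | Where-Object { -not $_.Ok }
```

Anything that comes back with Ok set to false is a candidate for steps 2 and 3.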

Any server that throws an error with the above query needs to have the following scripts run on it:

2) MOF Parser
This recompiles cluswmi.mof, the MOF file for the cluster WMI provider, and loads its class definitions back into the WMI repository.

cd c:\windows\system32\wbem
mofcomp.exe cluswmi.mof

FCM was still not working correctly, so I reset WMI with the following command.

3) Reset WMI Repository


Winmgmt /resetrepository

That will restart the WMI service, so you’ll probably have to try running it multiple times until all the dependent services are stopped. The command shouldn’t take more than a few seconds to process either way though.

After that, the server that failed the Cluster Query (1) was reporting good connections, but FCM still wouldn’t open properly!

I decided to try the two WMI commands (2 & 3) again on the original server that couldn’t connect to FCM. I had already run those commands there during the initial troubleshooting, so I was starting to think this was a dead end. Still, it couldn’t hurt, so I gave it a shot.

I reopened FCM and voila! Now the cluster was automatically connecting and looking normal.

As a further note, after everything appeared to be working correctly, SQL was having trouble validating connections to each node in the cluster during install, and I had to run commands 2 & 3 on yet another node in the cluster before things worked 100%, even though that node never had a connection error using the Cluster Query (1).

PowerShell: Detect and Uninstall Java FAST

I’ve been building out a remote, silent Java Updater for a while now. Making sure it works perfectly in my environment is not an easy task. I’m 90% there; I just have to get over the last few hurdles.
One of the major problems was uninstalling Java efficiently. You can Google scripted Java uninstalls, and you’ll probably find the same recycled code over and over again. (I’ve provided it below, but don’t touch it.) The problem is, this code is terrible.

Let me explain. The code works as intended. That’s not why it’s terrible. Why am I complaining then? It uses a very, very bad WMI class. What’s WMI you ask? Not the problem. The class is. It’s a huge waste of time. How bad is it?

[Image: Win32Product_Evil]

When even Bing thinks it’s bad, it’s bad.

Get-WmiObject -Class win32_product would have been a very useful command. However, it’s by far the slowest and most inconsistent thing I have ever used. Worse, you can’t fix it. Use a WHERE clause, use a FILTER, use wildcards, use explicit values. It doesn’t matter. It’s still slower than molasses. If you plan to use this command, you might as well just go to lunch after you hit execute.

If you didn’t check out those links above, let me summarize:

Invoking win32_product causes a computer to enumerate through the entire list of installed programs and validate each install, causing potentially extremely long wait times before returning results. Regardless of where clauses or filters, every program is always checked in this way.

In testing, the command takes 20-40 minutes to return results on a production server. Even on my computer it takes at least 5 minutes, normally much longer. I have run this command and gotten results as fast as 11 seconds…after I ran it three times in quick succession. Running it again five minutes later took three minutes. And that was the best case scenario test I had. That’s just unacceptable. Here’s that terrible code with a timer included. Take my word for it and just skip past this.

Win32_Product (Very Slow – Just Skip It!)

#### GET INSTALLED JAVA VERSION - SLOW!
$Start = Get-Date
$Java = Get-WmiObject -Class win32_product | WHERE { $_.Name -LIKE "Java [0-9]*" }
$Current = (Get-Date) - $Start
"Execution Time: $($Current.Hours):$($Current.Minutes):$($Current.Seconds)"
$Java

If you ran that, I’m sorry you wasted away half your life waiting for it to complete, but I did warn you.

My Method (Fast but Wordy)

I came up with a method to uninstall through registry GUIDs. Detecting the Java installation GUID goes from 30 minutes to <1 second in this manner. The entire installation process was cut by at least 98%, from over half an hour down to roughly 25 seconds! I can work with that! The code is a lot longer though.


########## GET INSTALLED JAVA VERSION - FAST!
$Start = Get-Date
Function Get-Uninstaller
{
    # Keys without a DisplayName or UninstallString raise errors; ignore them
    $ErrorActionPreference = "SilentlyContinue"
    # Full registry path of every subkey under Uninstall
    $Guids = Get-ChildItem -Path 'Registry::HKLM\Software\Microsoft\Windows\CurrentVersion\Uninstall' | ForEach {$_.Name}
    $Products = ForEach ($Guid in $Guids)
    {
        $Results = "" | SELECT DisplayName, IdentifyingNumber
        $Results.DisplayName = Get-ItemProperty -Path $($Guid.Replace("HKEY_LOCAL_MACHINE","Registry::HKLM")) | ForEach {$_.DisplayName}
        # "MsiExec.exe /X{GUID}" split on the capital X leaves the product GUID
        $Results.IdentifyingNumber = Get-ItemProperty -Path $($Guid.Replace("HKEY_LOCAL_MACHINE","Registry::HKLM")) | ForEach {$_.UninstallString.Split("X")[1]}
        $Results
    }
    $Products
}#EndFunction Get-Uninstaller

$Java = Get-Uninstaller | WHERE { $_.DisplayName -LIKE "Java [0-9]*" }
$Current = (Get-Date) - $Start
"Execution Time: $($Current.Hours):$($Current.Minutes):$($Current.Seconds)"
$Java
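
Two things about the registry method are worth hedging against. Splitting on "X" only works when the uninstall string uses the /X switch, and the single Uninstall key misses 32-bit software on 64-bit Windows, which is listed under Wow6432Node. Here is a minimal sketch of a more forgiving variant. Get-MsiGuid and Get-UninstallerEx are names I made up for illustration, not part of the original script.

```powershell
# Sketch: pull the product GUID out of an MSI uninstall string with a regex
# instead of splitting on "X". Get-MsiGuid is a hypothetical helper name.
function Get-MsiGuid {
    param([string]$UninstallString)
    # Matches {8-4-4-4-12 hex digits}, whatever switch (/X or /I) precedes it.
    if ($UninstallString -match '\{[0-9A-Fa-f]{8}(-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12}\}') {
        $Matches[0]
    }
}

# Both uninstall hives; the second one holds 32-bit software on 64-bit Windows.
$UninstallKeys = @(
    'HKLM:\Software\Microsoft\Windows\CurrentVersion\Uninstall\*',
    'HKLM:\Software\Wow6432Node\Microsoft\Windows\CurrentVersion\Uninstall\*'
)

# Hypothetical replacement for Get-Uninstaller that reads both hives.
function Get-UninstallerEx {
    Get-ItemProperty -Path $UninstallKeys -ErrorAction SilentlyContinue |
        Where-Object { $_.DisplayName } |
        ForEach-Object {
            [pscustomobject]@{
                DisplayName       = $_.DisplayName
                IdentifyingNumber = Get-MsiGuid $_.UninstallString
            }
        }
}

# Usage: Get-UninstallerEx | Where-Object { $_.DisplayName -like "Java [0-9]*" }
```

Entries that are not MSI installs simply come back with an empty IdentifyingNumber, so filter those out before handing anything to msiexec.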

Win32Reg_AddRemovePrograms (Fast, but last minute discovery)

But wait, there’s more! The documentation on how broken Win32_Product is provided me with an even easier method. I didn’t find it until researching some links for this blog though. I was very skeptical about this code until I ran it, because I never saw anyone suggesting its use online while searching for a PowerShell uninstaller. It’s actually just as instantaneous as my registry read method! One caveat: as far as I know, the Win32Reg_AddRemovePrograms class is registered by the SMS/Configuration Manager client, so it may not exist on machines without that client.


########## GET INSTALLED JAVA VERSION - Alternative
$Start = Get-Date
$Java = Get-WmiObject -Class win32reg_addremoveprograms | WHERE { $_.DisplayName -LIKE "Java [0-9]*" }
$Current = (Get-Date) - $Start
"Execution Time: $($Current.Hours):$($Current.Minutes):$($Current.Seconds)"
$Java.DisplayName
$Java.ProdID

Uninstall Script

Whichever method you choose, you’ll still need to run the uninstaller afterwards. Just change the GUID property to match the detection script you used: $App.IdentifyingNumber for the Win32_Product and registry methods, or $App.ProdID for Win32Reg_AddRemovePrograms. I’m logging to a Temp Java folder, so make sure that folder exists or just remove the /l and everything after it if you don’t care about the logging. The uninstall runs silently with no user interface and suppresses reboots.


########## UNINSTALL JAVA
ForEach ($App in $Java)
{
$ArgumentList = "/x $($App.IdentifyingNumber) /qn /norestart /l*vx C:\Temp\Java\UninstallLog.log"
Start-Process -FilePath "msiexec.exe" -ArgumentList $ArgumentList -Wait -PassThru
}
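
Since -PassThru hands back the process object, you can also check how each uninstall ended instead of trusting it blindly. A minimal sketch follows. Get-MsiExitMeaning is a helper name I made up, and it only covers a handful of the standard Windows Installer exit codes.

```powershell
# Sketch: translate msiexec's exit code into something readable.
# Get-MsiExitMeaning is a hypothetical helper name.
function Get-MsiExitMeaning {
    param([int]$ExitCode)
    switch ($ExitCode) {
        0       { "Success" }
        1605    { "Product is not installed" }
        1618    { "Another installation is already in progress" }
        3010    { "Success, reboot required" }
        default { "Failed with exit code $ExitCode" }
    }
}

# Usage, assuming $Java came from one of the detection scripts above:
# ForEach ($App in $Java)
# {
#     $Proc = Start-Process -FilePath "msiexec.exe" -Wait -PassThru `
#         -ArgumentList "/x $($App.IdentifyingNumber) /qn /norestart"
#     "$($App.DisplayName): $(Get-MsiExitMeaning $Proc.ExitCode)"
# }
```

With /norestart in place, 3010 is the code to watch for, since the uninstall worked but the machine still wants a reboot.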

 

Remotely Get & Set Delayed Start for a Service in PowerShell

This week I was asked to check all the SQL Servers to verify that a specific service was running and that it was set to Delayed Start. Requests like this are great fun for me; I get to play with PowerShell and sometimes even get to reuse scripts I’ve written.

The first step was something I’ve done before, setting a service through PowerShell, so I was confident that I had a script for it and the task would be complete in minutes.

Setup

I had around a hundred servers to update. My advice is to use a notepad file with a server list if you have to update more than even half a dozen computers. It’s just a lot easier than formatting them all for a variable. The below declaration will pull the details of your notepad file using the Get-Content cmdlet. I keep a computer list on my desktop and the variable $Env:UserName will pull the currently logged in user, so anytime I share a script, the next person just has to make a similar list. Remember to set one computer per notepad line.

$Computers = ( Get-Content C:\Users\$Env:UserName\Desktop\Computers.txt )
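
Hand-maintained lists tend to pick up blank lines, stray spaces and repeated names, and each of those turns into a failed connection later. A small sketch that cleans the list on the way in is below. Get-ComputerList is a name I made up for illustration.

```powershell
# Sketch: read a server list, dropping blank lines, stray spaces and
# exact duplicates. Get-ComputerList is a hypothetical helper name.
function Get-ComputerList {
    param([Parameter(Mandatory)][string]$Path)
    Get-Content -Path $Path |
        ForEach-Object { $_.Trim() } |
        Where-Object { $_ -ne "" } |
        Select-Object -Unique
}

# Usage:
# $Computers = Get-ComputerList "C:\Users\$Env:UserName\Desktop\Computers.txt"
```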

I’m going to work with the Windows Update service in the following examples since every computer should have this.

$Service = "wuauserv"

What doesn’t work

It turns out that PowerShell does not natively support setting Auto Delayed Start with the Set-Service cmdlet. Why was it overlooked? There are a few Connect threads out there for the issue. Go ahead and vote them up; we might get lucky and it will make a difference.

[Image: SetStartUpType]

As the Connect threads mention, there is no member for Startup Type. You can verify that it’s missing with the following query to Get-Service.

Get-Service | Get-Member -MemberType Property
[Image: Notice Startup Type is missing]

This means we will have to set the service to running first, then set the startup type via a workaround afterwards.

There are a number of ways to set a service to running in PowerShell, but I’ll share two short ways. They are basically the same method; the first one just pipes the command. The benefit of the first method is that you can run the command before the pipe to verify what you are about to affect. It’s a bit like doing a select statement before you run an update in SQL. It’s all about being safe.

Method 1: Piping

Run the code before the pipe (the | character) first to verify what is going to be set. Then run the whole part once you are satisfied with the results.

Get-Service $Service -ComputerName $Computers |
Set-Service -Status Running

Method 2: Single Cmdlet

This method is more straightforward. Nothing is wrong with Method 2; it’s just for those who are a bit more confident in their PowerShell skills, or for the ones who like to throw caution to the wind.

Set-Service -Name $Service -ComputerName $Computers -Status Running

Verify Service is Running

It’s a good idea to verify that the services are now running afterwards. There’s almost always a computer that fails to connect, and you’ll have the peace of mind that some random code you found online did run successfully. This query returns any computers with a service that is still not running.

Get-Service $Service -ComputerName $Computers |
Where-Object { $_.Status -ne 'Running' } |
Format-Table MachineName,Status,Name -AutoSize

Set Delayed Start

This step gets problematic. We have to step outside PowerShell a bit and use Sc Config, a command line tool that is obscure, at least to me. Throwing a ForEach loop around the command lets it run against all your computers. Just remember the \\$Computer parameter, and note that Start= Delayed-Auto has to be written exactly like I have it (including the space after the equal sign) unless you like getting errors. I added an optional Write-Host to insert whitespace and the computer name. You’ll want that if any connections fail so that you know which computer is the issue.

ForEach ( $Computer In $Computers )
{
Write-Host "`n $Computer"
SC.EXE \\$Computer Config $Service Start= Delayed-Auto
}
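
With a hundred servers, scrolling through sc.exe output to spot the failures gets old quickly. Below is a sketch of the same loop that keeps each exit code instead. Set-DelayedStart and its -Runner parameter are names I made up, and it assumes sc.exe returns 0 on success and a non-zero error code otherwise, which matches what I have seen.

```powershell
# Sketch: same sc.exe call, but keeping each exit code so failures stand out.
# Set-DelayedStart and -Runner are hypothetical names.
function Set-DelayedStart {
    param(
        [Parameter(Mandatory)][string[]]$ComputerName,
        [Parameter(Mandatory)][string]$Service,
        # Runs sc.exe and returns its exit code (assumed 0 on success).
        [scriptblock]$Runner = {
            param($Computer, $Name)
            SC.EXE "\\$Computer" Config $Name Start= Delayed-Auto | Out-Null
            $LASTEXITCODE
        }
    )
    foreach ($Computer in $ComputerName) {
        $Code = & $Runner $Computer $Service
        [pscustomobject]@{ Computer = $Computer; ExitCode = $Code; Ok = ($Code -eq 0) }
    }
}

# Usage:
# Set-DelayedStart -ComputerName $Computers -Service $Service | Where-Object { -not $_.Ok }
```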

Verifying Delayed Start

Now we have to use yet another method to verify that the service is set to Delayed Start. We know PowerShell doesn’t work already. Using WMI doesn’t help either, it returns misleading results. Everything is Auto! It doesn’t tell you if it’s Delayed Start or regular Automatic. For example:

ForEach ( $Computer In $Computers )
{
Get-WmiObject -ComputerName $Computer -Query "Select StartMode From Win32_Service Where Name='$Service'" |
Select-Object @{l='MachineName';e={$Computer}},StartMode
}

The only viable choice is to search for a registry key that is created for Delayed Start services. The following script will return your desired service if it has a Delayed Auto Start value. Since it’s such a weird query, I wanted to see everything this time. I’m using Invoke-Command to demonstrate a different cmdlet. Notice that I defined $Service inside the ScriptBlock. If you fail to do this, the variable will be empty on the remote computer, because local variables are not passed into the ScriptBlock automatically. It’s all compartmentalized inside the Invoke-Command per computer.

Invoke-Command -ComputerName $Computers -ScriptBlock {
$Service = "wuauserv"
Write-Host "`n $Env:ComputerName"
Get-ChildItem HKLM:\SYSTEM\CurrentControlSet\Services |
Where-Object {$_.Property -Contains "DelayedAutoStart" -And $_.PsChildName -Like "$Service*" } |
Select-Object -ExpandProperty PSChildName
}
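
One thing to watch: the query above only checks that a DelayedAutoStart value exists. As far as I can tell, the value can stick around as 0 after a service is switched back to plain Automatic, so reading the number is safer than checking for its presence. Here is a minimal sketch that reads both Start and DelayedAutoStart and labels the result. Get-StartModeName is a name I made up, and the usage also shows -ArgumentList as one way to pass $Service in instead of redefining it.

```powershell
# Sketch: turn the registry values Start and DelayedAutoStart into a label.
# Get-StartModeName is a hypothetical helper name.
function Get-StartModeName {
    param([int]$Start, [int]$DelayedAutoStart = 0)
    switch ($Start) {
        2       { if ($DelayedAutoStart -eq 1) { "Automatic (Delayed Start)" } else { "Automatic" } }
        3       { "Manual" }
        4       { "Disabled" }
        default { "Other ($Start)" }
    }
}

# Usage: pass the service name in with -ArgumentList instead of hard-coding it.
# Invoke-Command -ComputerName $Computers -ArgumentList $Service -ScriptBlock {
#     param($Service)
#     Get-ItemProperty "HKLM:\SYSTEM\CurrentControlSet\Services\$Service" |
#         Select-Object PSChildName, Start, DelayedAutoStart
# } | ForEach-Object {
#     "{0} {1}: {2}" -f $_.PSComputerName, $_.PSChildName,
#         (Get-StartModeName $_.Start $_.DelayedAutoStart)
# }
```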

Summary

It was a lot more work than I anticipated to set Delayed Auto Start on a service, but I learned some neat tricks in the meantime. If you just have to set a service to running or one of the three supported Startup methods, you’ve gotten off easy, and you won’t need this workaround.