vSphere 5.1 Single Sign On Troubleshooting Adventure
What was supposed to be an afternoon of host memory upgrades, cluster rebalancing, and DRS changes, plus an upgrade from vSphere 5.1 to 5.1U1, turned into quite the troubleshooting exercise. A few people asked me on Twitter to document the experience, so hopefully this post saves a few of you some time if this issue comes up. Here we go!
Before I started, the environment looked something like this:
- vCenter Server 5.1 installed on a Windows Server 2008 R2 Standard VM. This same machine also had vSphere SSO, vSphere Web Client, vCenter Inventory Service, and vCenter Update Manager on it, all running 5.1 unpatched.
- vCenter Server database is stored on an external SQL 2008 server. vCenter SSO database is stored locally on the virtual machine within a SQL 2008 Express instance.
- 9 ESXi 5.1 hosts all in a single cluster with HA enabled and DRS set to manual
- Active Directory authentication was enabled on both the ESXi hosts and vCenter Server.
- AD/DNS/DHCP are hosted on separate physical servers
Given that I needed to power off the hosts to upgrade the memory, and they all needed a reboot to patch to ESXi 5.1U1, this meant downtime. Additionally, the existing cluster did not have EVC enabled and there is a mix of different processor families here, so we scheduled downtime for the entire environment to shut down each VM and move the hosts into two EVC-enabled clusters.
Now that we have some background, the first thing on my list was to upgrade vCenter Server to 5.1U1. If you’re not familiar with the Windows installer, when you run the autorun program that comes with the vCenter Server ISO, you install the components in the following order:
- vCenter Single Sign On
- vCenter Inventory Service
- vCenter Server
- vSphere Client
- vSphere Web Client
- vCenter Update Manager
There is a simple install option which automates a lot of this, but it only works for new installs and is not available for upgrades. So I followed the install order and completed the SSO, Inventory Service, and vCenter Server pieces. Everything installed just fine. Next I updated the vSphere Client, and once that was installed I attempted to log in to vCenter Server using Active Directory credentials. This is where things went downhill…
Active Directory authentication did not work. I verified that AD itself was working properly, so that was not the issue. Unable to log in with AD, I tried the default admin@System-Domain account, which let me in. I also updated the vSphere Web Client in hopes that it would let me in, but the installer wouldn’t let me past the part where you connect it to vCenter SSO. Even though I was typing the correct lookup service URL and username/password, it would come back with “password incorrect or blank”. I’m not going to list all my troubleshooting steps here as they were lengthy, but suffice it to say that I had a corrupt SSO installation/database. Database repairs failed, so my only option was to re-install SSO. And this is where the fun begins!
After I uninstalled vCenter SSO and attempted to reinstall it, the installer came back with an error saying it was unable to re-create the database users. This gave me a clue that uninstalling SSO doesn’t actually wipe out the existing database configuration. So what you’ll want to do here is use SQL Server 2008 Management Studio, which should be installed on the VM, to browse to the local database instance. The default name of the instance is VIM_SQLXP, so the full server name looks like: localhost\VIM_SQLXP or .\VIM_SQLXP
The next thing you need to do is delete both the database used for SSO and the database users. In my case I backed up the database before deleting it, in case I needed something inside it later. The default database name is RSA. After deleting that, I deleted the two DB users: RSA_User and RSA_DBA. Once that was completed, vCenter SSO installed properly.
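For reference, the cleanup described above can also be written as T-SQL instead of clicks in Management Studio. The Python sketch below only prints the statements and does not connect to anything. It assumes the default database name RSA from this post, and it assumes RSA_User and RSA_DBA exist as server-level logins (worth confirming under Security, Logins before running anything). The function name and the backup path are made up for illustration.

```python
def sso_cleanup_sql(db_name="RSA", logins=("RSA_User", "RSA_DBA"), backup_path=None):
    """Return T-SQL statements that optionally back up, then drop, the SSO database and logins.

    The names default to the ones mentioned in this post; adjust them if your
    install used different ones.
    """
    statements = []
    if backup_path:
        # Take a backup first, in case something inside the database is needed later.
        statements.append(f"BACKUP DATABASE [{db_name}] TO DISK = N'{backup_path}';")
    statements.append(f"DROP DATABASE [{db_name}];")
    for login in logins:
        # Assumption: these are server-level logins, so DROP LOGIN applies.
        statements.append(f"DROP LOGIN [{login}];")
    return statements


if __name__ == "__main__":
    # Hypothetical backup location; any folder the SQL service can write to will do.
    for stmt in sso_cleanup_sql(backup_path=r"C:\Temp\RSA_before_reinstall.bak"):
        print(stmt)
```

You could paste the printed statements into a query window connected to .\VIM_SQLXP and run them one at a time, checking each result before moving on.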
Now after re-installing SSO, you are left with an environment that is no longer linked to vCenter SSO. In my case, this meant that the vCenter Inventory Service, vCenter Server, and vSphere Web Client all needed to be repointed.
You will find the following VMware KB very helpful if you ever run into this issue: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2033620
The first step in my case was to repoint vCenter Server to the new SSO instance. You do this by performing the following steps (I’m assuming in all these steps that vCenter Server is installed in the default location and ports):
- Use Windows Explorer and navigate to: C:\Program Files\VMware\Infrastructure\VirtualCenter Server\ssoregtool
- Locate the sso_svccfg.zip file and extract it to a folder here
- Open a command prompt and CD to the folder you just extracted the files to
- Run the following command, updating your vCenter SSO URL, and user/pass as appropriate:
repoint.cmd configure-vc --lookup-server https://vc5.corp.com:7444/lookupservice/sdk --user "admin@System-Domain" --password "SSO_pw1!" --openssl-path "C:\Program Files\VMware\Infrastructure\Inventory Service\bin/"
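Because blog formatting tends to turn double hyphens and straight quotes into typographic characters, it can be safer to assemble the command from its parts than to copy and paste it. Here is a small Python sketch that does this. It assumes the long options use the usual two-hyphen form, and the function name build_repoint_command is my own. It only prints the command line and does not run it.

```python
import subprocess


def build_repoint_command(lookup_url, user, password, openssl_path):
    """Return the repoint.cmd argument list using plain ASCII hyphens."""
    return [
        "repoint.cmd", "configure-vc",
        "--lookup-server", lookup_url,
        "--user", user,
        "--password", password,
        "--openssl-path", openssl_path,
    ]


if __name__ == "__main__":
    cmd = build_repoint_command(
        lookup_url="https://vc5.corp.com:7444/lookupservice/sdk",
        user="admin@System-Domain",
        password="SSO_pw1!",  # example password from this post; use your own
        openssl_path=r"C:\Program Files\VMware\Infrastructure\Inventory Service\bin/",
    )
    # list2cmdline adds quotes around arguments that contain spaces.
    print(subprocess.list2cmdline(cmd))
```

Run it from any machine with Python, then paste the printed line into the command prompt that is sitting in the extracted ssoregtool folder.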
If you try to start vCenter Server at this point, it will attempt to start but fail. You need to re-populate the certificate file names within vpxd.conf after re-pointing to the new SSO instance. The following VMware KB describes this, and I’ve also included the steps below: http://kb.vmware.com/kb/2048753
- Locate the vpxd.conf file which is located in:
- Create a copy of this file in case anything goes wrong. Now open this file in Notepad
- Search for “null” and you’ll see two fields that look like this:
- On both of these fields, change the null values to match below
- Save the file and close it
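The backup and search steps above can be scripted if you prefer. This is a minimal Python sketch, assuming you pass in the path to your vpxd.conf yourself. It copies the file to vpxd.conf.bak (a naming choice of mine) and lists the lines containing "null" so you know where to edit. It does not change any values.

```python
import shutil
import sys
from pathlib import Path


def backup_and_find_nulls(conf_path):
    """Copy the config file to <name>.bak and return (backup_path, hits).

    hits is a list of (line_number, stripped_line) for lines containing "null".
    """
    conf = Path(conf_path)
    backup = conf.with_name(conf.name + ".bak")
    shutil.copy2(conf, backup)  # copy2 also preserves timestamps

    hits = []
    with conf.open(encoding="utf-8", errors="replace") as fh:
        for lineno, line in enumerate(fh, start=1):
            if "null" in line:
                hits.append((lineno, line.strip()))
    return backup, hits


# Only runs when a path is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    backup, hits = backup_and_find_nulls(sys.argv[1])
    print(f"Backup written to {backup}")
    for lineno, text in hits:
        print(f"line {lineno}: {text}")
```

If the script reports anything other than the two fields mentioned above, stop and look at the file by hand before editing.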
Now restart both the vCenter Server and vCenter Management Webservices services. Your vCenter Server should now be linked to the new SSO instance and should start up properly.
Next we need to re-link the vSphere Web Client to the new SSO instance. To do that, follow this procedure:
- CD to C:\Program Files\VMware\Infrastructure\vSphereWebClient\scripts
- Run the following command, replacing your vCenter Server name, admin username, and password:
client-repoint.bat https://vc5.corp.com:7444/lookupservice/sdk "admin@System-Domain" "SSO_pw1!"
Restart the vSphere Web Client service and you should be able to log in to vCenter Server with the user admin@System-Domain and the password you specified during installation. The default URL for the vSphere Web Client is: https://vc5.corp.com:9443/vsphere-client/
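Both repoint commands take the same lookup service URL, and a typo in the host or port is easy to make. The Python sketch below builds the two URLs used in this post from a host name and offers a basic TCP check to see whether a port is answering. It assumes the default ports (7444 for SSO, 9443 for the Web Client). The function names are my own, and an open port only tells you something is listening, not that SSO is healthy.

```python
import socket


def sso_endpoints(host, sso_port=7444, web_client_port=9443):
    """Return the lookup service and Web Client URLs for a default install."""
    return {
        "lookup_service": f"https://{host}:{sso_port}/lookupservice/sdk",
        "web_client": f"https://{host}:{web_client_port}/vsphere-client/",
    }


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # vc5.corp.com is the example host name used throughout this post.
    for name, url in sso_endpoints("vc5.corp.com").items():
        print(f"{name}: {url}")
```

For example, port_open("vc5.corp.com", 7444) returning False after the service restart would suggest SSO has not come back up yet.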
At this point, you should be able to log in, but you will see a message about vCenter being unable to connect to the Inventory Service. We’ll tackle that next…
To fix the Inventory service and re-link it to SSO, we need to perform a similar process:
- Open a command prompt and CD to C:\Program Files\VMware\Infrastructure\VirtualCenter Server\isregtool
- Run the following command, replacing your vCenter Server name in:
Now you can restart the Inventory Service and it should be re-linked to SSO. You’ll need to restart vCenter Server as well to pick up this change. We’re almost there!
After vCenter Server restarts, log in to the vSphere Web Client using the admin@System-Domain credentials again. Although we have fixed all the links to SSO, in my case the Active Directory groups and permissions had been blown away. The AD identity source, however, was still there. To add the permissions back, use the following steps:
- On the left-hand panel, click on vCenter Home
- Click on vCenter Servers under Inventory Lists. Then click on your vCenter Server name
- Click on Manage along the top row and then choose the Permissions subsection
- Click on the + to add a new permission
- In my case, I was granting the AD group Domain Administrators the Administrator role in vCenter. If you click the Add button in the left-hand pane, it will let you select your domain, and then you can search for and add the group.
- Choose whichever role you would like to assign and make sure the propagate to children option is selected
Once that is complete, log out of the Web Client (or the vSphere Client) and you should be able to log in using Active Directory credentials! After that, in my case, I updated vCenter Update Manager and was able to proceed with my host updates.
So that was a bit of a long post, but I wanted to outline what happened and all of the steps I had to go through. Hopefully this helps out one of you if you ever hit this upgrade bug! This whole afternoon made me appreciate the SSO implementation in vSphere 5.5, which was completely re-written, as it is much easier to install and administer!