I was going to test an auditing script from a DefCon presenter on my AD server, and while I was adding a USB controller to the VM, the USB stick I was passing through to get the script into the VM started acting weird.
First, the USB 3.0 controller connected just fine and I attached the USB device to the VM, but diskpart was not showing it. So I went to remove it and try a USB 2.0 controller instead. That failed to connect since the USB 3.0 controller was still showing there, so I selected to remove it again, which errored out because of another concurrent task. Makes sense, until refreshing the page gave me an unprivileged account error. I wasn’t sure what this was about, so I opened another window and navigated to my vCenter web app… 503 Service Unavailable:
“503 Service Unavailable (Failed to connect to endpoint: [N7Vmacore4Http20NamedPipeServiceSpecE:0x000055aec30ef1d0] _serverNamespace = / action = Allow _pipeName =/var/run/vmware/vpxd-webserver-pipe)”
What the… Rebooting the VCSA didn’t help. Still the same error, even in an incognito window. Ughh.
I found this thread: https://communities.vmware.com/thread/588755
I went through this and decided to try renewing the certs, even though my internal PKI certs were still valid (AFAIK, and going by the cert presented when accessing the page). Now here’s the thing: while I ran the certificate-manager script and renewed all the certs, I noticed my AD server was somehow down. I booted it back up. With two changes made, I wasn’t exactly sure which one fixed it. So I decided to take another snapshot while it was in this “fixed” state and revert to the broken state. After restoring to the broken state, nothing was responding at all on the HTTPS service from the VCSA, so I gave it a simple reboot (which I had also done initially, before I noticed my AD server was down, for some reason). Sure enough, after the reboot everything was working fine with my internal PKI certs.
I guess if you set vCenter to use MS AD as the primary login domain and that domain is not available, the web management service becomes unavailable… that kind of sucks. I should have noticed my AD was not operational, but I didn’t have monitoring on it 😉 and I don’t use my local workstation as an AD member. Mostly it’s just random VMs I have for testing.
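Since I had no monitoring on the AD server, even a tiny reachability check would have pointed me in the right direction sooner. Here is a minimal sketch in Python, assuming the domain controller answers on the standard LDAP (389) and LDAPS (636) TCP ports. The function names and the hostname in the usage note are made up for illustration, and an open port only shows the service is listening, not that logins actually work.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False


def check_domain_controller(host, ports=(389, 636), probe=port_is_open):
    """Probe each port on the (hypothetical) domain controller host.

    Returns a dict mapping port number to True (reachable) or False.
    The probe argument exists so the check can be swapped out or tested.
    """
    return {port: probe(host, port) for port in ports}


def is_down(results) -> bool:
    """Treat the domain controller as down if no probed port answered."""
    return not any(results.values())
```

Something like `is_down(check_domain_controller("dc01.lab.example"))` run from a cron job or scheduled task would be enough for a lab like mine; the hostname is a placeholder for whatever your AD server is called.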
Like most people, I should have looked at the logs for a better idea of what the root cause was. Instead I threw two darts at a dartboard and had to revert to find the true root cause. Not the best way to troubleshoot, but if logs are not available, it is another method…