ESXi 6.0 on Windows 10 Hyper-V

You might ask why. I recently completed the VMware training module for installing and managing vSphere components, and I wanted to start playing around, but I don’t exactly have a bunch of hardware kicking around. I do, however, have my awesome gaming rig, which is massively overpowered in terms of CPU. Memory, however… err, not so much, and disk I/O is also meh. These will need to be expanded, but I do at least have Windows 10 running on an SSD and a 3 TB spindle disk for more regular storage needs, though everyone knows a 7200 RPM disk provides mediocre performance.
Anyway, I’m choosing Hyper-V since I already have Windows and it comes free with Windows. There are other options, such as Oracle’s VirtualBox and VMware Player (the free version can only run one VM at a time though :S).

With that out of the way, here are the steps so far.

1) Activate your Windows 10 Pro (1607). As mentioned, I installed mine on a 120 GB SSD.

2) Ensure VT-x with EPT (SLAT), and probably VT-d, are enabled in the firmware, and that you have a motherboard and CPU capable of virtualization.

3) Make sure all your hardware drivers are up to date.

4) Install Hyper-V.

5) Configure server settings such as HDD location, CPU allocation, and networks. In my case I want my ESXi hosts to be isolated from the internet, so I picked an internal virtual switch.

6) Grab the ESXi ISO installation media from VMware (login and subscription required).

7) Create the VM. I had to pick Generation 1 (BIOS); Generation 2 (UEFI) didn’t boot the ISO for me.

8) On my first attempt at the ESXi VM, I noticed it sat at loading the kernel for an awfully long time. Sure enough, a simple Google search turned up this gem:

When the ESXi installer boot menu appears, hit Tab and append ignoreHeadless=TRUE to the boot options.

9) Before I could go any further I came across the dreaded “no network adapters available” error. You can Google this, but you will probably get blog posts about people attempting to load the nested VM with only 2 GB of RAM when ESXi requires a minimum of 4 GB, so you have to be very specific in your search. In this case it’s amazing what the open community can do these days:
Turns out (as usual) it’s a driver-related issue (don’t worry, I’ll talk about this a couple of times throughout this guide).
Luckily some lad was genius enough to figure out a solution, and not only that but also provided the VIB to inject into the ISO file.
I followed the instructions, discovering that the latest release of the ESXi-Customizer tool was built for 5.x, including 5.5.
As a double whammy, it didn’t run on my Windows 10 x64 box… but dismay not, we’re playing with VMs here.
I quickly created another VM and installed my old Windows XP ISO with a custom Dark Vista theme embedded.
The great part was that getting the files into the VM was a breeze. Simply shut down the VM, navigate to the VM’s HDD folder, right-click the VHDX file, and select Mount from the context menu.
This mounts the system partition and C:\ as separate disks on my Windows host. I copied the files in, booted the VM, and followed the instructions using the provided VIB and the ESXi 6.0 installer from VMware (login required). Bam, sure enough I got a new custom ESXi 6.0 installer ISO file. I moved it out in the same fashion, mounted it as the ESXi VM’s disc, and booted it up!
Finally the installation moves on! (Make sure you choose a “Legacy Network Adapter”.)
*Note: I do not discuss storage choice for this test host. I simply created a 1 TB VHDX file for ESXi to be installed on and for the nested VMs.

10) Once the installation completes and the VM reboots, make sure to hit SHIFT + O and add “ignoreHeadless=TRUE” again. Let ESXi boot into the DCUI.

11) At the DCUI, navigate to “Troubleshooting Options”, then “Enable ESXi Shell”.

12) Press ALT + F1 (not F2 as the source states; F2 is the DCUI, F1 is the console), then log in as root.

13) Type in this command and you won’t have to type the headless part at every boot:
esxcfg-advcfg --set-kernel "TRUE" ignoreHeadless
(Copy the command, then select the Hyper-V console menu item “Clipboard”, then “Type Clipboard Text”.) (Classic Ctrl + V works too.)

14) I was finally able to set an IP address for managing the host, the virtualized ESXi host, hahaha. Sadly the vSphere client failed to connect from my XP VM.
So I set up a Windows 7 x64 VM instead. I set this up on Hyper-V on my Windows 10 machine, alongside my ESXi hosts, to mimic having a laptop running Windows 7.
The vSphere phat client can be downloaded from VMware (login required). Creating my first test VM on my nested ESXi host ran into an issue; reading further in the communities showed others with the exact same issue.

Turns out one can simply add the line vmx.allowNested = "TRUE" to the VM’s VMX file. This can be done via SSH (if enabled) or at the direct console (ALT+F1) using vi.
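
If you’d rather script that edit than fiddle with vi, here is a minimal Python sketch of the idea. It only works on the text of a VMX-style file; the helper name set_vmx_option is made up for this post (it is not a VMware tool), and matching keys case-insensitively is just a precaution on my part so an existing entry gets replaced instead of duplicated.

```python
def set_vmx_option(text, key, value):
    """Return VMX-style text with `key = "value"` present exactly once.

    Hypothetical helper for illustration only. Existing lines for the
    same key (compared case-insensitively, as a precaution) are replaced.
    """
    new_line = f'{key} = "{value}"'
    out = []
    found = False
    for line in text.splitlines():
        name = line.split("=", 1)[0].strip()
        if "=" in line and name.lower() == key.lower():
            if not found:
                # Replace the first occurrence, silently drop any duplicates.
                out.append(new_line)
                found = True
            continue
        out.append(line)
    if not found:
        out.append(new_line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = 'memsize = "4096"\nnumvcpus = "2"\n'
    print(set_vmx_option(sample, "vmx.allowNested", "TRUE"), end="")
```

Running it twice on the same text gives the same result, so it is safe to re-run. I would only touch the real file with the VM powered off, and keep a copy of the original first.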

15) Another thing I noticed: when I used the Hyper-V Manager console to manage my Windows VM running the vSphere client, and then opened a VM console inside vSphere, the Hyper-V console would hang.
My only option at this point was to change my Windows 7 management VM’s network setup. Instead of it only being on the locked-down management network, I created an external vSwitch in Hyper-V and added a second NIC to the VM.
Since I have a DHCP server on my local LAN, setting the new Windows 7 NIC to DHCP got it an address from my pool. Using ipconfig (in the VM) or checking my DHCP server’s leases, I was able to find the IP to remote into.
This of course required enabling Remote Desktop on the Windows 7 VM. It also allowed me to work in full-screen mode, it didn’t crash when opening vSphere consoles, and of course I got copy and paste. :D

16) The next problem was kind of expected: no x64 VMs in my nested environment. There are topics on this. So I decided to grab the latest 32-bit version of Windows Server that’s available… you guessed it: Server 2008 (not R2).
Grabbing a couple of different versions from MSDN gave me a tad bit of trouble. First off, don’t use the Checked/Debug build. I played with the standard and the SP2 versions, and found the install was hanging at completing installation.
Checking the VM’s performance tab in the vSphere phat client showed maxed-out CPU (not always a sign of being hung, as it could still just be processing, but definitely a sign nonetheless), and then the big giveaways: disk I/O and consumed memory.
Disk I/O was zero, and consumed memory was on a steady decline until it plateaued near nothing, all signs of a stuck or looping process. Since I felt like giving it a little benefit of the doubt, and I had two virtual ESXi hosts to play with,
I decided to bump the CPU count on one of them from 2 to 4. This allowed me to create a VM with 4 virtual CPUs instead of 2. I’m not sure why this would make a difference, or whether it really was the fix. So I mounted the same 2008 with SP2 ISO and installed Standard with the full desktop.
This time it finally got to the desktop… guess I’ll try Standard Core on my other host after upping its CPU count as well… let’s see. Yay, Server Core installed using the standard 2008 32-bit ISO with a 4-core CPU.
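
For what it’s worth, the “is it hung or just busy” reasoning above can be written down as a tiny heuristic. This is only a sketch of my own rule of thumb in Python: the sample format, the field names (cpu_pct, disk_kbps, mem_mb) and the thresholds are all assumptions for illustration, not anything the vSphere client exports in this shape.

```python
def looks_hung(samples, cpu_busy=95.0, window=5):
    """Guess whether a VM is stuck, from recent performance samples.

    samples: list of dicts with 'cpu_pct', 'disk_kbps' and 'mem_mb',
    oldest first (a made-up format for this example).
    Returns True only when, over the last `window` samples, the CPU is
    pegged, there is no disk I/O at all, and consumed memory never grows.
    """
    recent = samples[-window:]
    if len(recent) < window:
        return False  # not enough data to judge
    cpu_pegged = all(s["cpu_pct"] >= cpu_busy for s in recent)
    disk_idle = all(s["disk_kbps"] == 0 for s in recent)
    mem_not_growing = all(
        later["mem_mb"] <= earlier["mem_mb"]
        for earlier, later in zip(recent, recent[1:])
    )
    return cpu_pegged and disk_idle and mem_not_growing


if __name__ == "__main__":
    stuck = [{"cpu_pct": 100, "disk_kbps": 0, "mem_mb": m}
             for m in (900, 700, 400, 200, 200)]
    print(looks_hung(stuck))
```

Maxed CPU on its own never trips it; it takes all three signs together, which matches what I was eyeballing on the performance tab.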

17) The next issue I came across was that VMs inside the nested ESXi servers couldn’t communicate with any other device on the same flat layer 2 network. I was sure I had configured everything correctly.
If you Google this you will find lots and lots of articles stating the importance of promiscuous mode. I was up super late trying to figure out this problem and was starting to go a bit crazy, changing every setting I could possibly find,
including attempting to set up port mirroring on the ESXi VM’s NICs in Hyper-V, hahaha. As I mentioned, you’ll find many references to it, but Google “promiscuous mode Hyper-V” and you’ll discover most people say to add a line to the VM’s XML config file.
Well, it probably won’t take you long to discover that your VM config location doesn’t contain XML files but rather VMCX files. Yeeeeapppp, good luck opening them up… they are now binary…. Wooooo! No admins playing around in here! Take that, you tweakers!
This was a change in Hyper-V starting with Server 2016 / Windows 10. I spent a couple of hours tumbling down this rabbit hole. To help others I’ll make this part as clear as mud!
IN HYPER-V, ON THE ESXi VM’S NETWORK SETTINGS, FIND THE LEGACY NETWORK ADAPTER (the one used as the “physical” adapter in the ESXi vSwitch), EXPAND THE SETTING, AND UNDER ADVANCED FEATURES SELECT “Enable MAC address spoofing”.
That’s it! That is Server 2016 / Windows 10 Hyper-V’s workaround for nested hypervisors. Although, as usual, support people on TechNet would rather dust their hands with the classic “not supported” than give an answer or a technical workaround. Something like “It’s possible, here’s how, but if something doesn’t work with these settings that’s all we can help with” would have been a far better response. Maybe these support people just aren’t aware, who knows. Here’s where I found my answer.

18) So now that I’ve got my nested ESXi servers up, running, and communicating, the next step is vCenter. vCenter will set up its own SSO domain; we can add an MS AD domain later and make our Active Directory domain the default identity source. However, the SSO domain created at vCenter deployment remains the local configuration domain for all vCenter services. Grab the vCenter Server Appliance from VMware. You might wonder what gives when you discover that the download list for vCenter has an ISO and an IMG file, but no OVA/OVF. This is because in vSphere 6.0 the vCenter appliance is deployed from a client system, which talks to the host through some browser-based installer… even I don’t know the exact details of what’s going on. Either way, if you attempt to create a VM and mount the ISO, you’ll find it’s not bootable. So mount it on the management VM, in my case my Windows 7 VM with the vSphere client installed. Since Windows 7 doesn’t have native ISO mounting, I had to install Virtual CloneDrive. Then mount the ISO and navigate inside.

Oddly enough, it almost seems as if you need a Windows system to deploy a Linux appliance. Under the VCSA folder you should find the integration plugin installer exe… run it. There seems to be a step that states it is installing certificates and services; this might be the start of the certs for the built-in SSO domain. Not sure though. Once it’s done it sort of leaves you in the dark… as everything just closes and there’s no “complete” window in the wizard…
Guess I’ll just run vcsa-setup.html now… Since I have a stock install of Windows 7… looks like I’ll need IE 10/11, as the default IE 8 won’t suffice. Lucky for me the Windows 7 machine still had internet access, so I Googled the IE 11 installer and ran it. It seems to download and install the prerequisite updates it needs by itself. You may need to find an offline installer for IE 11 if you are in a test environment where your Windows machine doesn’t have internet access.

Click Allow. Another pop-up will appear, click Allow.

Now we can finally click Install :S

Accept the user agreement, then enter the IP address of one of the hosts. Remember, I installed and ran this on the Windows 7 machine that can already access the hosts via SSH or the vSphere phat client. I enter the IP address here because I haven’t set up DNS in my environment yet, and technically you couldn’t if the plan was to have nested DNS servers (where the DNS servers the hosts point to are VMs they host).

I made a wrong IP entry at first and it alerted me that it couldn’t connect to the host. I then corrected the IP and got a cert warning.

Set the virtual appliance OS’s root password (the VCSA 6.0 appliance is built on SUSE Linux Enterprise Server, so this sets the underlying Linux root password).

Now for the deployment type: if you never plan to connect any other vCenter server to the SSO domain for Enhanced Linked Mode, you can pick embedded. However, for scalability, and because I need to set up a Windows Server vCenter to run Update Manager, I’ll create an external Platform Services Controller (PSC). This will require me to run through the wizard a second time to actually deploy the vCenter server; in this pass I’m just setting up the PSC virtual appliance (VA). Hence the options all make sense.

As I mentioned, this will create the SSO domain for all vCenter services. Do not make this the same as your AD domain, as that will cause confusion between the domains when you add your AD domain and set it as the primary SSO identity source. I stick with the VMware default, vsphere.local, then add a site name (generally this would be some sort of regional reference). Also set the SSO admin password.

It complained that DNS and a system name are required. I’m assuming the system name is the hostname, even though it asked for either an FQDN or an IP as if it needed it for some sort of lookup. I specified simply the hostname, plus a DNS server IP that is not even set up for DNS yet (it will be the PDC in my test AD setup). This allowed me to continue the setup. My guess is that this might be what it enters as the common name or SAN in its cert.
I went back and changed it to an IP address, as I figured my first couple of attempts to access it would be through its IP address and I didn’t want to deal with cert warnings. It did, however, warn me that an FQDN is preferred, which makes sense when a proper DNS system is already in place.
Hahaha, sure enough, I had to go through all that IE 11 setup and plugin installation simply to deploy an OVF file, hahaha.

Cool. I guess the point is that, based on how you want to set vCenter up with the new PSC, instead of making you read through a bunch of documentation (which is technically always best to do anyway), it automates which templates to deploy and how to configure them using a questionnaire-type setup. I believe in 6.5 this may be easier with some sort of HTML5-based deployment system. Not sure though.
So I hit a couple of snags on deployment. First off, I thought I was stuck not being able to run nested x64 VMs on my nested ESXi hosts, until the great lads in Freenode’s #vmware told me to expose the virtualization extensions to the ESXi VM.
“17:18 < genec> Zew: then did you forget to pass VT-x to the ESXi VM?” – Oh neat! Thanks genec.
Since I was running all my stuff on Hyper-V I had to Google this. It didn’t take long till I found my answer.
Set-VMProcessor -VMName <VMName> -ExposeVirtualizationExtensions $true
<VMName> being the name of my ESXi hypervisor VM (the VM has to be powered off when you run this).

However, I only discovered this after I had already added the whole vmx.allowNested = "TRUE" bit to the deployed VM, which I did when I saw it fail with the usual error message. Luckily, a bit more googling and I was able to find my answer.
“You can add vmx.allowNested = “TRUE” to /etc/vmware/config in the ESXi VM to avoid having to put it in every nested VM’s configuration file.” –Thanks Matt
I’ll delete the existing vApp and try my deployment again.
Once I managed to mount the VCSA ISO, install the client plugin, and attempt to deploy the PSC/VCSA, I got hung up. It appears all the x64 VMs within my nested ESXi hosts failed to boot properly. All the different VCSA/PSC versions went into a boot loop. Windows 7 x64 gave a fault screen after loading the installer files and attempting to launch setup.exe. Server 2016 just showed a black screen. Looking into this I discovered this guy’s blog… looks like I may have to resort to VMware Workstation Pro!
I’ll publish this blog post for now, as it has become rather long. I will post my success or failure in the upcoming weeks. Stay tuned!

Jan 2018 Update

I remember this being extremely painful; it was way easier on Workstation Pro.

Deleting Windows.old from New Windows 10 install

So I recently re-installed Windows on my desktop. Glad to say the experience is generally a really good one; most drivers install automagically.
I even let it grab the latest updates, which as you might expect includes the Windows 10 version 1607 update, aka the Anniversary Update.
To my dismay I noticed a Windows.old folder under C:\, the drive I selected to install Windows on. This being a fresh install, I was a bit surprised to see it, as it’s usually left behind by a full Windows upgrade. I guess the 1607 version counts as a complete new version according to MS?

A quick Google search, however, turned up a nice article by, of course, my fav, HowToGeek.

Sadly even after this, I still had a Windows.old folder. 🙁

I couldn’t figure out why, so I decided to navigate into the folder. I discovered the culprit was one particular file, InstAud.sys or something of that nature. I opened its properties and seized ownership of the file; I am the administrator after all.
To my dismay again, I still couldn’t add or remove permissions. I managed to use takeown and cacls to add permissions, but then it stated the file was locked by a process. Sysinternals handle and procexp both came up empty-handed for any locks… odd :S.
Autoruns also didn’t show anything linked to that file… mmm. Even after disabling my Intel integrated video in Device Manager (I was using a GeForce PCIe video card anyway), uninstalling the software/drivers, and making sure the service was gone, the problem persisted: I couldn’t delete the file.
Normally I’d use a Linux live USB to mount the partition and wipe the folder out that way. However, I wanted to keep it all Windows and do this without pointing users to any other site or tools.
And yes…. I did figure it out! Hahah. Follow these steps.

Step 1) Click the Action Center Icon (or swipe from the right side on a tablet) and select All Settings -> Update and Security -> Recovery -> (Under Advanced Startup) Restart Now.
Step 2) Select Troubleshoot -> Advanced Options -> Command Prompt.
Step 3) Change to the Windows disk drive, generally C:

Now if you attempt to navigate directly to the file, you might find that the permissions you granted your account earlier did not persist.
You might also be surprised to find there is no cacls command. Well, fear not, you’ll just have to use icacls instead.

Step 4) Take ownership:
cd C:\Windows.old
takeown /F .\*
Step 5) Grant permissions:
icacls .\ /grant Administrator:F
Step 6) Navigate into the folders until you find the problem file, use takeown and icacls on it to take ownership and grant permissions, then use del to delete it.
Step 7) Once that's done, use rmdir /S /Q to delete C:\Windows.old

That’s it! You’re done. Reboot and see that Windows.old is gone! That was tougher than it shoulda been. Now I can re-enable and re-install my integrated Intel drivers. I’m not sure exactly what happened; it was an update that ran right after installing Windows. I did, however, use a dedicated Windows 10 USB stick that’s 6 months old or so. Just shows how important it is to get the latest builds whenever possible.

Adding static host records on DDWRT’s DNSmasq

I run my own domain. You’d be amazed to find out most of my services are hosted by a single router, an Asus RT-N16 running DD-WRT.
Lately I noticed my website wouldn’t load from inside my own network. I saw that I had no internal records for my base domain or its www counterpart. Although I had hoped that an external record lookup and a hairpin would be good enough, that didn’t seem to be the case. Besides, to avoid unneeded latency, a direct lookup returning the internal IP would be far more beneficial.

To my dismay, I hadn’t managed the box in a while and I couldn’t find any DHCP-based records under /etc/hosts.
I knew where I had set the file for DHCP records, but I wasn’t sure whether DNSMasq uses that for DNS lookups as well.
A quick Google search, however, turned up a nice blog with steps on how to accomplish it via the web interface.

Log into the administration interface and go to the Services tab.
Find the DNSMasq section and make sure the DNSMasq option is enabled.
In the Additional DNSMasq Options box type in your local DNS configurations (one entry per line):

address=/host-or-domain/ip-address

where host-or-domain refers to the machine name (or domain name) you want to customize the address for, and ip-address is the numeric IP address.
Save and Apply and you should be all set.
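
If you have more than a couple of records, it can be less error-prone to generate those lines than to type them. Below is a small Python sketch that builds the text for the Additional DNSMasq Options box. The function names and the example.com / 192.168.1.10 values are placeholders I made up; only the address=/name/ip line format comes from the steps above.

```python
import ipaddress


def dnsmasq_address_line(name, ip):
    """Build one 'address=/host-or-domain/ip-address' line.

    Raises ValueError if the name is empty or contains '/' or whitespace,
    or if the IP address does not parse.
    """
    name = name.strip().rstrip(".")
    ip = ip.strip()
    if not name or any(ch in name for ch in "/ \t"):
        raise ValueError(f"bad host or domain name: {name!r}")
    ipaddress.ip_address(ip)  # raises ValueError on a malformed address
    return f"address=/{name}/{ip}"


def build_options(records):
    """Join several (name, ip) pairs into one entry per line."""
    return "\n".join(dnsmasq_address_line(n, i) for n, i in records)


if __name__ == "__main__":
    # Placeholder values; substitute your own domain and internal IP.
    print(build_options([
        ("example.com", "192.168.1.10"),
        ("www.example.com", "192.168.1.10"),
    ]))
    # prints:
    # address=/example.com/192.168.1.10
    # address=/www.example.com/192.168.1.10
```

Paste the output into the box, one entry per line, then Save and Apply as above.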

Thanks ZEDT

SharePoint 2010 Managed Service Accounts

The reason for this blog post was a domain migration which involved a SharePoint 2010 server. These were the symptoms, and all the steps I took to resolve them.

To start, ensure you have set a new farm admin account.

E.G. stsadm -o updatefarmcredentials -userlogin domain\farmadmin -password PASSWORD
stsadm lives under the SharePoint bin directory and won’t be part of the server’s default path.
All tasks moving forward will be done with this account (in my case the account has local admin rights as well as SharePoint farm admin permissions).

You might have come across an error such as this when working with SharePoint. It might have been caused by an admin removing a service account in AD, or in my case a whole domain change.

[Screenshot: Error Removing Managed Account]

If you’re new to SharePoint you might simply panic before thinking, Google it, and come across this.
Which sadly leads to a dead end. Why it leads to a dead end while others have answers, I’ll get to soon.
Next, the simple thing you’d figure is to do as it says: reconfigure the service to run under a different account. To your dismay, you discover the Central Admin page to make this change gives you this!

You might be thinking you’re in a catch-22 here. But fret not: when in doubt, PowerShell. I love PowerShell and it only keeps getting better; in this case it’s our saviour.
Remember when I stated that SharePoint link was a dead end? It was a dead end because there are different component types, 4 to be exact, and I’m gonna tell you how to fix them all! With sources!
First up, the 4 different types!

1) Service Instance:

	Cmdlet:		Get-SPServiceInstance

	Use the cmdlet to list all SharePoint service instances and note the ID of the one you need. Place that service instance into a variable.

	E.G.		$WindowsTokenService=Get-SPServiceInstance -Identity ServiceGUID

	Once you have the variable, it exposes a bunch of nested properties; the one we want is .service.ProcessIdentity. You can see its attributes by typing:

	E.G.		$WindowsTokenService.service.ProcessIdentity

	It should display the service account used to run the service. Simply change it via:

	E.G.		$WindowsTokenService.service.ProcessIdentity.Username="Domain\NewServiceAccount"

	*NOTE* This account needs to already be registered as a service account in SharePoint, either via the Central Admin page or PowerShell.
	Then call the Update and Deploy methods on the same object.

	E.G.		$WindowsTokenService.service.ProcessIdentity.Update()
	E.G.		$WindowsTokenService.service.ProcessIdentity.Deploy()

	That's it for basic SharePoint Service Instances.

2) Service Applications:
	
	Cmdlet:		Set-SPServiceApplicationPool

	The funny part about SharePoint service instances is there's no Set-type PowerShell cmdlet for them. Thus the cmdlet used was a Get cmdlet.
	The funny part about service applications is that you don't use the ServiceApplication cmdlets, but rather the ServiceApplicationPool cmdlets.

	E.G.		Get-SPServiceApplicationPool -Identity SecurityTokenServiceApplicationPool | Set-SPServiceApplicationPool -Account "Domain\NewServiceAccount"

	Don't forget to do an IIS reset. Afterwards, running the Get-SPServiceApplicationPool cmdlet should show the service application pool with its ProcessAccountName set.

3) Content Applications:

	Cmdlet:		[Microsoft.SharePoint.Administration.SPWebService]::ContentService.ApplicationPools | ft Name

	Yeah, you read that right, there's no direct PowerShell cmdlet for this one. You've got to go deep... real deep. Anyway, run the expression above to list all content application pools.
	Once you have determined the one whose service account you need to change, place it in a variable.

	E.G. 		$SPAppPool=[Microsoft.SharePoint.Administration.SPWebService]::ContentService.ApplicationPools | where {$_.Name -like "My Content App Pool Name"}

	Calling this variable will result in an output very similar to a Service Instance's Service.ProcessIdentity subclass. So you guessed it.

	E.G.		$SPAppPool.Username="Domain\NewServiceAccount"
	E.G.		$SPAppPool.Update()
	E.G.		$SPAppPool.Deploy()

4) Search Service:

	Cmdlet:		Get-SPEnterpriseSearchService

	Yup, believe it: another Get cmdlet to make a change. Where are these Set counterparts, you may ask? Well, that's a good fucking question. We should ask Microsoft.
	Anyway, if this hasn't annoyed you enough already, chances are you haven't been a SharePoint admin for long, 'cause it's a rabbit hole. So, to finish up here:

	E.G.		$SSS=(Get-SPEnterpriseSearchService).get_ProcessIdentity()
	E.G.		$SSS.Username="Domain\NewServiceAccount"
	E.G.		$SSS.Update()
	E.G.		$SSS.Deploy()

Sources:
General SharePoint 2010 Managed Service Accounts
Service Instances Source
Service Application Source
Content Application Source
Search Service Source

Once Get-SPManagedAccount shows all good on password expiry and no bad accounts exist, there should be no issues opening the Configure Managed Service Accounts section of the SharePoint 2010 Central Administration page.
Happy Configuring. 🙂

To Paraphrase:

0) There is no way to paraphrase this.
1) Don’t break SharePoint.
2) Don’t break SharePoint.

vCenter Network Partitioned

Have you ever experienced a Network Partitioned warning in vSphere 5? Hopefully not, but if you find yourself with this warning in vSphere, don’t panic. It’s not as bad as it could have been in 4.x.

This literally just means that the host cannot reach the other hosts through any of the VMKs checked off for management traffic. In my case it happened after making network changes to my infrastructure. I still had bonded links at my switches, but somehow the VMK load-balancing algo had switched to “Route based on originating port ID”. This algo doesn’t work with bonded NICs; it needs to be “Route based on IP hash”. My end goal was to get off bonded links for my hosts and use the default load-balancing algo that VMware uses, as this can be done with non-stacked switches and with minimal switching knowledge (in case others need to manage the system in the future).

It took me a little while to catch the issue, because the symptoms were that each host could ping any device in its respective management subnet but NOT the other host. A flat /24 subnet too, which really had me baffled. I couldn’t vMotion in this state either, but luckily the VMs on each host remained active (as they have separate VM port groups on dedicated physical connections).

Once I caught and corrected the error, I was able to verify vMotion worked again. That’s all there is to it!

To paraphrase the solution:

1) Check which VMKs have management checked off.
2) Check those vSwitches’ physical connections.
3) If there are multiple ports, check the configs on the physical switch and the load-balancing algo.
4) Google any errors along the way.
5) Check host to host communication by consoling into host and using vmkping.
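
To see why the load-balancing algo has to match what the physical switch is doing, here is a rough Python sketch of how “Route based on IP hash” picks an uplink, as I understand VMware’s description of it: XOR the source and destination IPv4 addresses and take the result modulo the number of uplinks. The function name and the example addresses are mine, and this is a simplification for illustration, not VMware’s code.

```python
import ipaddress


def ip_hash_uplink(src, dst, uplinks):
    """Return the uplink index (0-based) an IP-hash style policy would pick.

    Simplified model: (source IP XOR destination IP) mod number of uplinks.
    """
    if uplinks < 1:
        raise ValueError("need at least one uplink")
    s = int(ipaddress.IPv4Address(src))
    d = int(ipaddress.IPv4Address(dst))
    return (s ^ d) % uplinks


if __name__ == "__main__":
    # Same source, two different destinations, two uplinks in the team.
    for dst in ("10.0.1.44", "10.0.1.45"):
        print(dst, "-> uplink", ip_hash_uplink("10.0.1.35", dst, 2))
    # prints:
    # 10.0.1.44 -> uplink 1
    # 10.0.1.45 -> uplink 0
```

The takeaway is that with IP hash one VMK can talk out of different physical NICs depending on the destination, which is why the switch side has to treat those ports as one bonded link. If one side is bonded and the other is not, some source/destination pairs work and others quietly don’t, which looks a lot like what I was seeing.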

Jan 2018 Update

I remember this…

Changing Network Location to Domain

Have you ever restored a VM? Have you done your DR testing by actually doing a full recovery with AD? Did you find you had a couple of odd things occur after the restore, such as not being able to RDP into your recovered server? Chances are your network profile has changed to Public instead of Domain. This in turn causes a different set of firewall rules to apply.

I remember coming across this issue multiple times, especially when people want Private instead of Public or vice versa. So chances are you’ve come across this, telling you to use a PowerShell cmdlet to change the setting, which I’m guessing makes a registry change. The other option they specified was to use the GUI.

Well, I find changing local security policies and all that other stuff rather annoying. Soo, after a bit more googling I found a really nice answer, which worked and was very simple to implement. Very nicely written and easy to follow, by an Evan A Barr. You can view his site here.

To paraphrase the solution using Network Connection Properties:

0) The fix works by adding a DNS suffix so that NLA can properly locate the domain controller.
1) Go to Network Connections.
2) Go to the properties of the network adapter in the wrong location.
3) Go to the properties for IPv4.
4) Click the "Advanced..." button.
5) Select the DNS tab.
6) Enter your domain name into the text box for "DNS suffix for this connection:".
7) Disable and then enable the connection to get NLA to re-identify the location.

Renewing expired certificates on vCenter 5.5

Do you follow best practice? Have you set up a VMware HA cluster with vCenter? Do you have your own PKI and certificates? Did you not have active monitoring on said certs? Then chances are you are in the exact same boat as me! This blog post assumes you are well versed in using the SSL Cert Automation Tool as well as creating certificates for use with the tool.
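
Since the lack of monitoring is what bit me, here is a minimal Python sketch of the kind of check I should have had running. It assumes you already have the certificate’s end date as text in the form that `openssl x509 -noout -enddate` prints (notAfter=Jan  5 09:34:43 2018 GMT); how you collect that from each service is left out. The function names and the 30-day warning threshold are my own choices for illustration.

```python
import ssl
from datetime import datetime, timezone


def parse_enddate(line):
    """Turn 'notAfter=Jan  5 09:34:43 2018 GMT' into an aware UTC datetime."""
    value = line.strip()
    if value.startswith("notAfter="):
        value = value[len("notAfter="):]
    # ssl.cert_time_to_seconds understands the certificate time format.
    seconds = ssl.cert_time_to_seconds(value)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


def days_remaining(line, now=None):
    """Days until expiry (negative once the cert has expired)."""
    now = now or datetime.now(timezone.utc)
    return (parse_enddate(line) - now).total_seconds() / 86400


def needs_renewal(line, warn_days=30, now=None):
    """True when the cert expires within warn_days (or already has)."""
    return days_remaining(line, now) < warn_days


if __name__ == "__main__":
    enddate = "notAfter=Jan  5 09:34:43 2018 GMT"
    print(round(days_remaining(enddate), 1), needs_renewal(enddate))
```

Hook something like this into whatever already sends you alerts and you get a warning weeks ahead instead of a Monday morning surprise.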

This one begins on a Monday after the weekend. I was getting alerts of failed backup jobs. I configured Veeam at my workplace and have been happy with the product and support from day 1. I also configured a cold site for backup retention in the event our primary site, you know…. implodes. Anyway, I was used to getting “failed” alerts when really there was simply a communication hiccup across my IPsec tunnel; usually the job would complete successfully and just report the error. This time, however, it was different: the errors were for normal backup jobs and reported “incorrect username and password.” I knew the password of the service account used by Veeam never expires or changes, which instantly told me something else was wrong. I then attempted to log in to vSphere, connecting to my vCenter server, and sure enough it said the same thing, wrong username and password, followed by another notice saying all communications were untrusted due to expired certs. Doh!

At this point you’ll probably have done exactly what I did… check your installation documentation, right?!?! I mean, if you are running custom certs, I’m assuming you follow other best practices such as documenting. :P But after that you are probably googling once you discover parts of the SSL tool are not working!

Chances are you came across VMware’s KB on renewing certs on a 5.5 instance of vCenter, only to discover at step 5 a) that the tool reports the local machine doesn’t have the SSO service installed. This really comes down to what the “tool” really is, and that’s a batch script. Yeah, you read that right, a BATCH script, so you can imagine how ugly and painful that must have been to code. Like seriously, 5.5 was released in Sept 2013 and they were already coding in PowerShell by then… shame on you VMware. Anyway, the most likely problem is the way this batch script checks for the installed service. (I looked at the source of the “tool” but didn’t actually locate the part that handles this, so I’m strictly making assumptions here.) It probably looks for a fairly specific string, say a reg key or something of that nature, and probably checks against a version number; if the version changes, the script replies “can’t find this”, and thus you get the above error, which you know is wrong. So how do you fix this? Well, you grab the exact version of the tool for the updated instance of vCenter you are on (this requires a valid VMware subscription). I managed to update one forum post in hopes it helps others at this stage of the game.

At this point I kept following the KB. Just an FYI, I was going through all this with VMware tech support, and they had to bring in another tech who specialized in these cases. I came across other issues as well; for example in step 5 d) I got an error similar to this. Sadly I’m writing this up several days after the event, so I can’t remember exactly what we did to recover from that one.
From here you just gotta keep pushing through the KB, which has a total of 24 steps, so you can imagine how painful all this is to do. Meanwhile I wasn’t sure HA was even available, none of my backups could run, and any management of VMs had to be done manually until vCenter was back up and running. I’ve talked to others and many people suggest sticking with self-signed certs, even though we all know it’s not best practice. Thanks, VMware, for making best practice really hard to implement and maintain.
Also, at the very last steps I did not actually have a listed service ID for the Web Client, only one for the web logger. Although you can have separate service IDs for these, in my case I had to use the web logger service ID to complete the final step. Afterwards the Web Client wasn’t working properly, which I fixed by reinstalling the service/feature via Add/Remove Programs. The fact that there is no repair option in this installer bugs me.

To paraphrase the solution:

1) Ensure you are using the latest and correct version of the SSL tool *cough BATCH script*.
2) Create all your new certificates and chains.
3) Follow the KB article very carefully, especially when it says to do some steps manually vs using the "tool".
4) Google any errors along the way.
5) Bash your head in for following best practices.

Jan 2018 Updates

This brings back bad memories. It’ll soon be time to update to 6.5. We’ll see how VMware has handled internal PKI this time.

Reclaim unused space from VMDK

Let’s say you have a bunch of servers *Cough Server 2008 R2* that have been fairly well maintained, all running on VMware’s ESXi hypervisor. As a regular server admin you’ve come to terms with updates and keeping systems, for the most part, on the latest and greatest. Now let’s also say you happen to be the storage admin as well, and you find you are running out of space on your SAN. What do you do? Usually buy more space. But let’s get to the real heart of the matter… Systems, if not properly set up, get messy (don’t get me started on the Windows registry); we’re sticking with storage as the topic of the day. Well, the good news is I’m here to help you reorganize and reclaim all that space. Let’s get started!

*NOTE* if you are running thick provisioned disks you’ll have to svmotion them to another datastore to convert them to thin first.

First and foremost you’re going to want to clean up your WINSXS folder. Don’t believe me? Run WinDirStat to find out just how big it has become from all those updates.

How do you clean up your WINSXS, you may ask? Well, first ensure your server has Windows6.1-KB2852386-v2-x64 installed. (These steps work for Windows 7 as well, if anyone happens to need to save space on a client machine.) You might be able to find cleanmgr.exe online, but you’re safer copying it from another server, or try this. Run cleanmgr.exe as an administrator, choose to clean up system files, and clean up the old update files. Reboot (you HAVE to reboot to complete the update removals before moving to the next step!)

The next part you may or may not want to do, depending on what the app reports. Run Disk Defrag. In this case my servers were about 40% fragmented, meaning that over time, as files were added, used, and then deleted, new data was placed all over the disk depending on which sections the file system (NTFS in this case) reported as free to overwrite. Yup, when you delete a file it’s not actually wiped from those sections, just removed from the file system’s table. So defragging pretty much “shoves” all the still-in-use data, nice and organized, to the “front” of the disk. This is generally only required on spindle disks; if your system is using SSD, or a logical unit based on RAID, this won’t matter.

Now if you’re simply clearing space on a physical, bare-metal device, you’re pretty much good to go. However, the rest of us virtualized guys who want to reclaim space on our SANs still have a ways to go.

This is where I find the “fun” begins. If you attempt to look it up you’ll find some old articles from VMware about using VMware Tools. Well, #1) the GUI options are gone; if you attempt to find VMware Tools under Control Panel, you won’t find it. #2) If you go ahead and try to use the command-line tools, you’ll probably find they simply return that the disk can’t be shrunk. I personally say don’t waste your time attempting to do anything here with VMware Tools. For Linux users, you can accomplish this via dd very easily. For the rest of us Windows users, we can thank the great Mark Russinovich for Sysinternals, in particular this time for sdelete. Grab it and run sdelete -z (important: in v1.56 it was -c, in 1.61 use -z). If you don’t specify a drive it will use the drive you run the cmd from, I’m assuming.
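For anyone curious what the zeroing step actually does, here is a rough Python sketch of the idea behind sdelete -z and the dd approach: write zeros into a scratch file so the freed blocks hold zeros, then delete the file. This is not how sdelete itself is implemented; the function name zero_fill and the explicit byte cap are my own, added so the sketch cannot accidentally fill a volume.

```python
import os
import tempfile


def zero_fill(directory, total_bytes, chunk_size=1024 * 1024):
    """Write total_bytes of zeros to a scratch file in directory, then delete it.

    Returns the number of bytes written. A real zeroing tool keeps writing
    until the volume is full; the explicit cap here is an assumption made
    to keep this sketch safe to run.
    """
    chunk = bytes(chunk_size)  # a buffer of chunk_size zero bytes
    written = 0
    fd, path = tempfile.mkstemp(dir=directory, prefix="zerofill_")
    try:
        with os.fdopen(fd, "wb") as scratch:
            while written < total_bytes:
                n = min(chunk_size, total_bytes - written)
                scratch.write(chunk[:n])
                written += n
            scratch.flush()
            os.fsync(scratch.fileno())  # push the zeros down to the disk
    finally:
        os.remove(path)  # the blocks are now free again, but contain zeros
    return written
```

Calling zero_fill("D:\\", 10 * 1024**3) would zero roughly 10 GB of free space on D:, assuming that much is free. For the real job, stick with sdelete or dd.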

Time for the last and final fun part. Read this and this. Once you’ve done that I’ll provide my findings:

1) You have to svmotion between datastores with different block sizes (I found the 2 MB block size was the one that worked for me).

2) You can’t use the vmkfstools hole punch option against a VMDK stored on an NFS datastore.

To paraphrase the solution:

1) Remove and delete temp files, unused profiles, and old update files.
2) Defrag to organize all the blocks on the guest file system.
3) Use sdelete or dd to zero dirty blocks.
4) Hole punch or svmotion the VMDK to shrink used size.
5) Enjoy a beer and a bunch of recovered space.
6) You might even notice a performance increase from all the organized guest file systems.

Jan 2018 Updates

2016 didn’t have many posts, but they sure are good ones, I forgot all about this stuff. haha.

Zoneminder on Acer Netbook

The story begins when I first got my IP camera: a WansCam PT with IR. I got a couple of them for around 50 bucks a pop, which was an amazing price, still hard to match to this day. I had set it up on the overhang of my parents’ garage. This caused some issues with their garage door opener due to WiFi interference. Amazingly, this cheap camera has survived two winters without issue, though it was protected by the overhang.
The thing was, it could only send a picture to an FTP server, at most once every second, when the motion sensor tripped. It could also send email alerts and had a couple of other basic features, and you could supposedly monitor it remotely if you port forwarded the web hosting port… which I never did for security reasons, since it only has basic forms authentication. Instead I set up an SSH tunnel with key authentication to accomplish the same task, but way more securely.
So whenever I had to review an event, or even just check the images, I had to scroll through hundreds of pictures, and the camera wasn’t very good at capturing them either (WiFi, interference, etc.).
Then I discovered Zoneminder. It blows my mind. I’m a huge open source advocate. This one… takes the cake; for me!

Installing Zoneminder on Acer Netbook x86.

1) Grab Debian from here.

2) Grabbed Rufus to place the installer onto USB, as netbooks don’t have DVD drives.

3) Put the Debian 8.2 x86 USB installer into the netbook’s USB port. Powered on the netbook, pressing F12 for boot options.

4) Installed Debian, no desktop environment. This is a server after all.

5) Remove the cdrom entry from /etc/apt/sources.list using vi.

6) Follow this: Install Zoneminder on Debian.

7) Uncomment HandleLidSwitch=suspend and change suspend to ignore in /etc/systemd/logind.conf

8) Enjoy ZoneMinder on an Acer KA90 netbook, by navigating to http://netbookip/zm

9) I) Adding a Wanscam IP camera: in the Zoneminder Web GUI -> Options, enable ZM_OPT_CONTROL. Restart the Zoneminder service.
II) Zoneminder Web GUI -> Add New Monitor. *NOTE* The Zoneminder account here is an operator account I created on the Wanscam web interface prior to setting up Zoneminder.

Well Poops… I had 3 images here that got lost…

III) Night vision IR is controlled with the Wake/Sleep buttons under the Zoneminder controls.
IV) Set your presets, and have fun monitoring and controlling your PT IP camera from Zoneminder!
The big benefit here now is that Zoneminder will track motion and record events, so no more needing FTP enabled on the IP camera. Plus all event images are kept together, and at more than one a second; I’ve been able to get about 4-5 fps, all saved as separate JPEG images. My next goal will be to get an IP camera, or some other setup, that lets me see if I can get recorded video on motion.
I also want to eventually figure out scripted, timed presets for a patrolling type camera on the cheap. I also need to set up PoE to reduce cables. I plan on getting that set up using an old Cisco 1811 PoE router. I hope this post is helpful for someone out there.
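If you would rather script the lid-switch change from step 7 than edit the file in vi, here is a minimal Python sketch of the edit. The function name set_lid_switch_ignore is made up for illustration, and appending the setting when it is missing assumes [Login] is the last (or only) section in the file, which I believe is true of a stock logind.conf.

```python
import re


def set_lid_switch_ignore(conf_text):
    """Return conf_text with HandleLidSwitch set to ignore.

    Rewrites both the commented-out default ("#HandleLidSwitch=suspend")
    and an already-active line. If no such line exists, the setting is
    appended, which assumes [Login] is the last section in the file.
    """
    # Matches "HandleLidSwitch=" but not "HandleLidSwitchDocked=".
    pattern = re.compile(
        r"^[ \t]*#?[ \t]*HandleLidSwitch[ \t]*=.*$", re.MULTILINE
    )
    if pattern.search(conf_text):
        return pattern.sub("HandleLidSwitch=ignore", conf_text)
    sep = "" if conf_text.endswith("\n") or not conf_text else "\n"
    return conf_text + sep + "HandleLidSwitch=ignore\n"
```

You would read /etc/systemd/logind.conf, pass the text through this function, and write it back as root. The change should take effect after a reboot or after restarting the systemd-logind service.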

Jan 2018 Update

This server has actually been P2Ved (Physical to Virtual migration). I’ll probably blog about how I accomplished this. It’s the only other server on my hypervisor at the moment. Haha

Managing Software locally and remotely using CMD and PowerShell

From CMD, one uses the wmic command…

1) Example to query listed applications on remote system running Windows
	wmic /NODE:RemoteHostName product get name, version
2) Example to uninstall application remotely using wmic
	wmic /NODE:RemoteHostName product where name="ApplicationName" call uninstall /nointeractive

*NOTE* these require WMI management to be allowed through the Windows Firewall.

That’s neat, but this can be better achieved using PowerShell…

1) Example to query listed applications on remote system running Windows via PowerShell v2
	gwmi Win32_Product -ComputerName RemoteHostName | ft Name, Version
2) Example to uninstall application remotely via PowerShell
	(gwmi Win32_Product -ComputerName RemoteHostName | where {$_.Name -like '*ApplicationName*'}).Uninstall()

That’s amazing!! What’s the issue?

Well, first off, it’s not clear if this query runs against both known application registries (on any 64-bit Windows system), those being…
HKLM\SOFTWARE\Wow6432Node\Microsoft\Windows\CurrentVersion\Uninstall (for 32-bit apps)
HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall (for 64-bit apps)

At this point I wasn’t sure if this was querying both or just one of these locations.
I decided to test this with an old Firefox installation that I had replaced a while ago. (I initially used FrontMotion Firefox to allow for configuring via GPOs. Later versions of Firefox made this possible via the mozilla.cfg file, which can be pushed and enforced by GPO. Anyway.)

PS C:\Windows\system32> gwmi Win32_Product | where {$_.Name -like '*Fire*'}

IdentifyingNumber : {3F98D293-8219-4730-B49B-F223030021B8}
Name              : Mozilla Firefox (en-US)
Vendor            : FrontMotion
Version           : 29.0.1.0
Caption           : Mozilla Firefox (en-US)

Once I had ensured the correct object was being returned, I called its uninstall function.

PS C:\Windows\system32> (gwmi Win32_Product | where {$_.Name -like '*Fire*'}).uninstall()

__GENUS          : 2
__CLASS          : __PARAMETERS
__SUPERCLASS     :
__DYNASTY        : __PARAMETERS
__RELPATH        :
__PROPERTY_COUNT : 1
__DERIVATION     : {}
__SERVER         :
__NAMESPACE      :
__PATH           :
ReturnValue      : 0
PSComputerName   :

The key thing here is the ReturnValue, which is 0, so that should be considered a success. Let’s run the query again and check what comes back…
Sure enough, no returned objects. Let’s scan the registry for stale keys for that particular GUID/IdentifyingNumber.

reg query HKLM /f "3F98D293-8219-4730-B49B-F223030021B8" /s
(This can take a long time, if local to the machine, searching via find in regedit can be quicker)
reg query HKCR /f "3F98D293-8219-4730-B49B-F223030021B8" /s

Both queries return no values, so the keys were cleanly removed from the registry.
However, I still have a Firefox version 39 listed in my Programs and Features.
So, what gives? As I mentioned before, the wmic and gwmi commands query the Win32_Product class. From what I’ve seen so far, it appears this queries only a specific set of registry locations and not all the applicable ones:
HKCR\Installer\Products
HKLM\Software\Microsoft\Windows\CurrentVersion\Uninstall
Doing a quick reg query for the word firefox sure enough displayed the listed installation of Firefox 39, and not the old 29 listed above…

HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Windows\CurrentVersion\Uninstall\Mozilla Firefox 39.0 (x86 en-US)
    Comments    REG_SZ    Mozilla Firefox 39.0 (x86 en-US)
    DisplayIcon    REG_SZ    C:\Program Files (x86)\Mozilla Firefox\firefox.exe,0
    DisplayName    REG_SZ    Mozilla Firefox 39.0 (x86 en-US)
    InstallLocation    REG_SZ    C:\Program Files (x86)\Mozilla Firefox
    UninstallString    REG_SZ    "C:\Program Files (x86)\Mozilla Firefox\uninstall\helper.exe"
    URLUpdateInfo    REG_SZ    https://www.mozilla.org/firefox/39.0/releasenotes

According to this Stack Overflow post, there is no way to use wmic/gwmi to query 32-bit applications… I find this hard to believe and will update this blog should new news pop up.

Now here’s the kicker: Firefox was removed from my Program Files, but a Mozilla folder still exists in my Program Files (x86), again seemingly due to a lack of wmic application control for 32-bit applications. However, I have no Firefox in my search, and no firefox.exe available in the existing folder in PF (x86)… Let’s try to uninstall what’s listed under Programs and Features… Would you look at that… it says something happened during uninstall, and asks to remove the listing from the programs list. Doing another reg query HKLM /f "firefox" /s shows it’s been removed from the keys mentioned above. However, lots of plugin keys remain… oh well. Deleted the profile data and program file data, and called it a night.

In order to build a more-or-less reliable list of applications that appear in the "Programs and Features" in the Control Panel, you have to consider that not all applications were installed using MSI. WMI only provides the ones installed with MSI.

Here is a short summary of what I've found out:

MSI applications always have a Product Code (GUID) subkey under HKLM\...\Uninstall and/or under HKLM\...\Installer\UserData\S-1-5-18\Products. In addition, they may have a key that looks like HKLM\...\Uninstall\NotAGuid.

Non-MSI applications do not have a product code, and therefore have keys like HKLM\...\Uninstall\NotAGuid or HKCU\...\Uninstall\NotAGuid.

Info provided by Ilya Kogan
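To make that summary concrete, here is a small Python sketch that sorts Uninstall subkey names into "looks like an MSI product code" and "everything else", using the GUID rule from the summary above. The names UNINSTALL_KEYS, is_msi_product_code and split_by_installer are my own, and the sketch only classifies names you hand it; actually reading the registry (for example with Python’s winreg module on Windows) is left out.

```python
import re

# The two per-machine Uninstall locations discussed earlier in this post.
UNINSTALL_KEYS = (
    r"SOFTWARE\Microsoft\Windows\CurrentVersion\Uninstall",
    r"SOFTWARE\Wow6432Node\Microsoft\Windows\CurrentVersion\Uninstall",
)

# An MSI product code is a GUID wrapped in braces.
PRODUCT_CODE = re.compile(
    r"^\{[0-9A-Fa-f]{8}-(?:[0-9A-Fa-f]{4}-){3}[0-9A-Fa-f]{12}\}$"
)


def is_msi_product_code(subkey_name):
    """True if the subkey name has the shape of an MSI product code."""
    return PRODUCT_CODE.match(subkey_name) is not None


def split_by_installer(subkey_names):
    """Split subkey names into (msi_like, other), preserving order."""
    msi_like, other = [], []
    for name in subkey_names:
        (msi_like if is_msi_product_code(name) else other).append(name)
    return msi_like, other
```

Fed the two Firefox entries from this post, the FrontMotion GUID lands in the first list and "Mozilla Firefox 39.0 (x86 en-US)" in the second, which matches what I saw: Win32_Product only knew about the first one.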

Jan 2018 Update

This brings back bad memories haha. I should find some time to play with this again on Windows 10, see if anything’s changed since.