This blog is going to be an interesting one. The points as to how I got to a windows machine with a bad boot is as interesting as how I managed to resolve the issue.
It started on a Friday, well it technically started before that, but long story short my colleague and I were planning to do a P2V of a physical SQL machine. I prepared the new ESXi host slowly during the week to prepare for this.
As I was destined to go camping that weekend and still needed a bit of work to be done to get ready for the progress, I’ll admit I was a bit rushed in telling my colleague the appropriate steps to take. Now at this point the host was ready, albeit was a bit on the small side when it came to the local datastore, (3 x 6Gb/s sata discs in a RAID 5 (I really wanted RAID 1 with a hot spare, but this was the best suitable option this HBA/RAID controller provided me)). Anyway while I was away camping, go figure in a location with no cell service, my colleague completed the P2V. Turns out it didn’t go 100% perfectly as planned, as he emailed me a bunch of error noted by the host ESXi. Turns out he had provisioned Thick Provision lazy zeroed discs normally I wouldn’t deny its a good choice under certain circumstances, however, in this case with the limited datastore space it wasn’t the greatest choice cause there wasn’t really room to spare on empty “zeroed” data.
So after I was informed of the situation I began to attempt to fix the issue, which happened to be that the host was unable to remove/consolidate snapshot due to the lmited space left on the datatstore. I began by adding the host into our SAN network and connecting it to our SAN storage. I did a svMotion after I had initially shutdown the guest OS. Shutting down the guest OS took over 20 minutes with the slowness it had been brought to by the issue, while it was still specifying shutting down I had got impatient and forced the system down.
After the initial svMotion, I checked the datastore and noticed that there was still a VM folder in there and all the space had not yet cleared, which I found a bit strange considering it should have migrated all the data to the new store.
I figured well lets see how the VM reacts now… and whomp whomp waaaaaaa was presented with this! (Imagine Source was removed, I can’t remember what it was lol)
At this point I wasn’t sure if this was realated to the snapshots and the VM folder set being not all in one place.. So I decided to delete all the snapshots. Even after completion and noting that all Data had indeed migrated to the SAN, I was still presented with the error shown above.
At this point I was starting to worry that I might have ruined 20 hours of P2V work, I was too tired to carry on for the day and decided to boot the physical back up to handle DB requests for the following work day.
The next day I jumped on fixing the issue at hand to recover this VM and save the 20 hours it took to P2V this thing. I initially started by mounting the Windows Server 2008 R2 installation DVD to the guest VM and adjusting the boot time to allow me to load the boot order and boot from the disc. Even though selecting repair my computer did see all the local discs including the installed OS, it would only give me the option of recovering from a system image (which I didn’t have), run diag tools (doesn’t help in this case) and command prompt. So I loaded command prompt. Now everything I tried in bootrec.exe options had failed:
/FixMBR (didn’t work)
/FixBoot (didn’t work)
/ScanOS (Found 0 installed instances)
/RebuildBCD (Found 0 installed instances)
At this point I felt it was pretty shot and unrecoverable, but like usual I felt to give one last google search on the issue of 0 found instances. Which lead me to this MS answers post, with the same question. To paraphrase the solution from Vijay B
To Paraphrase to solution:
1) bcdedit /export c:\bcdbackup (Backup the existing bcd) 2) attrib c:\boot\bcd -h -r -s (Allow write/modify of the BCD file) 3) ren c:\boot\bcd bcd.old (rename the BCD file, can also just be deleted, this is a backup solution) 4) bootrec /rebuildbcd
At this point it will catch the windows install and actually rebuild the BCD (/Rebuild BCD), believe it or not after that I was able to successfully boot the VM, and saved a 20 hour P2V. I can now freely vMotion and move this VM as required in my hypervised system!
Jan 2018 Update
Another good post, but sad didn’t write out the error message as clearly the outsourced image has been lost from the interwebs.