Migrate ESXi VM to Proxmox

I’m going to simulate migrating to Proxmox VE in my home lab.

I saw this YT video comparing the two, and it gave me the urge to try it out in my home lab.

In this test I’ll take one host from my cluster and migrate it to use Proxmox.

Step one, move all VMs off the target host.
Step two, remove the host from the cluster.
Step three, shut down the host.

In this case it’s an old HP Folio laptop. Next, install PVE.

Step one, download the installer.
Step two, burn the image or flash a USB stick with it.
Step three, boot the laptop into the PVE installer.

I didn’t have a network cable plugged in, and in my haste I didn’t pay attention to the bridge’s physical adapter: the installer had selected wlo1, the wireless adapter. I found references to the bridge info being in /etc/network/interfaces, but for some reason changing it there only got pings to work; all other ports and services seemed completely unavailable. Much like this person, I simply did a reinstall (this time minding the physical port in the network config), and then got it working.

The first issue I had was it popping up Error Code 100 on apt-get update.

The built-in shell feature was pretty nice; I used it to follow this and change the sources to the no-subscription repos.
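On a fresh install, apt-get update exiting with code 100 is commonly the enterprise repository rejecting a host without a subscription. A minimal sketch of the repo switch is below. It assumes PVE 8 on Debian "bookworm" and the usual file locations; the helper name `use_no_subscription` is made up for illustration, and the codename should match your release.

```shell
#!/bin/sh
# Sketch: switch a Proxmox VE host from the enterprise repo to the
# no-subscription repo. File paths and the codename are assumptions.

# use_no_subscription ENTERPRISE_LIST NOSUB_LIST CODENAME
use_no_subscription() {
    enterprise_list=$1
    nosub_list=$2
    codename=$3

    # Comment out every active "deb" line in the enterprise list, if present.
    if [ -f "$enterprise_list" ]; then
        sed -i 's/^deb /# deb /' "$enterprise_list"
    fi

    # Write the no-subscription repository definition.
    echo "deb http://download.proxmox.com/debian/pve $codename pve-no-subscription" > "$nosub_list"
}

# Typical call on the PVE host, followed by "apt-get update":
# use_no_subscription /etc/apt/sources.list.d/pve-enterprise.list \
#     /etc/apt/sources.list.d/pve-no-subscription.list bookworm
```

Passing the paths in as arguments makes it easy to try the function against copies of the files before touching the real ones.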

The next question was: how can I set up another IP that’s VLAN tagged?

I thought I had it when I created a “Linux VLAN”, gave it an IP within that subnet and tagged the VLAN ID. I was able to get ping replies, even from my machine in a different subnet. I couldn’t define the gateway since it stated one was already defined on the bridge, which makes sense for a single stack. I figured pings worked because ICMP is connectionless and doesn’t rely on replies taking the same path (no session handshake like TCP), and this was probably why the web interface was not loading. I tested this by connecting a different machine to the same subnet, and it loaded the web interface fine, supporting my assumption.

However, when I removed the gateway from the bridge and provided the correct gateway for the VLAN subnet I defined, the web interface still wasn’t loading from my machine in the other subnet. Checking the shell in the web interface, I saw it had lost connectivity to anything outside its network (I guess the gateway change didn’t apply properly, or it’s some other ignorance on my part about how Proxmox works).

I guess I’ll leave the more advanced networking for later. (I don’t get why all other hypervisors get this part so wrong/hard, when VMware makes it so easy: it’s a checkbox and you simply define the VLAN ID, it’s not hard…) Anyway, I simply reverted the gateway back to the bridge. I can figure that out later.
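For reference, the Proxmox documentation describes a VLAN-aware bridge where a tagged address lives on a sub-interface such as vmbr0.20. A rough sketch of /etc/network/interfaces along those lines follows. The NIC name eno1, VLAN ID 20 and all addresses are made-up placeholders, and this is untested on the laptop from this post. With a single default gateway on vmbr0, replies to clients in other subnets still leave via that gateway, which may be why pings got through while the web interface did not.

```
auto lo
iface lo inet loopback

iface eno1 inet manual

# Untagged management address, holding the only default gateway.
auto vmbr0
iface vmbr0 inet static
        address 192.168.1.10/24
        gateway 192.168.1.1
        bridge-ports eno1
        bridge-stp off
        bridge-fd 0
        bridge-vlan-aware yes
        bridge-vids 2-4094

# Second address, tagged with VLAN 20. No gateway line here.
auto vmbr0.20
iface vmbr0.20 inet static
        address 192.168.20.10/24
```

Reaching the tagged address from other subnets would likely need an extra route or policy routing on top of this, which is beyond this sketch.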

So how do you convert a VM to run on Proxmox?

Option 1) Manually convert from VMDK to QCOW2

or

Option 2) Convert to OVF and deploy that.

In both options it seems you need a midpoint to store the data. In option 1 you need local storage on a Linux VM, and almost twice the disk’s size: enough to hold the VMDK plus enough to hold the converted QCOW2 file. In option 2 the OP used an external drive to hold the converted OVF before deploying it to a Proxmox host.

I decided to try option 1. So I spun up a Linux machine on my gaming rig (since I still have Workstation, lots of RAM and a spindle drive with lots of storage). I picked Fedora Workstation and installed openssh-server, then (after a while, realizing I had to open the outbound SSH firewall rule on the ESXi server) transferred the VMDK to the Fedora VM:

106 MB/s not bad…

Then installed the tools on the fedora VM:

yum install -y qemu-img

Never mind, it was already installed, so I went ahead and converted the disk…
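The conversion itself is a single qemu-img call. The sketch below wraps it in a small function; the file names are hypothetical and the `QEMU_IMG` override exists only so the command can be previewed without running it.

```shell
#!/bin/sh
# Sketch: convert a VMware disk to QCOW2 with qemu-img.
# QEMU_IMG can be overridden (for example with "echo") for a dry run.
QEMU_IMG="${QEMU_IMG:-qemu-img}"

# convert_vmdk SOURCE.vmdk TARGET.qcow2
convert_vmdk() {
    # -p shows progress, -f is the source format, -O the output format.
    "$QEMU_IMG" convert -p -f vmdk -O qcow2 "$1" "$2"
}

# Hypothetical file names:
# convert_vmdk win7.vmdk win7.qcow2
```

For a disk copied off ESXi, the small descriptor .vmdk and its -flat.vmdk data file generally need to sit in the same directory, with qemu-img pointed at the descriptor.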

On Proxmox I couldn’t figure out where the VM files were located; a default install uses “lvm-thin”. I found this thread and did the same steps to get a path available on the PVE host itself. Then I used scp to copy the file to the PVE server.

After copying the file to the PVE server, I ran the commands to create the VM and attach the disk.
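The exact commands aren’t reproduced here, so the following is only a sketch of what that step can look like with the `qm` tool. The VM ID, name, memory size, NIC model and storage name are all assumptions, and `import_vm` is a made-up helper name.

```shell
#!/bin/sh
# Sketch: create an empty VM and import a converted disk with qm.
# QM can be overridden (for example with "echo") for a dry run.
QM="${QM:-qm}"

# import_vm VMID NAME DISK_IMAGE STORAGE
import_vm() {
    vmid=$1
    name=$2
    image=$3
    storage=$4

    # Create an empty VM with an emulated Intel NIC on the default bridge.
    "$QM" create "$vmid" --name "$name" --memory 4096 --cores 2 \
        --net0 e1000,bridge=vmbr0 || return 1

    # Import the disk image; it appears on the VM as an unused disk.
    "$QM" importdisk "$vmid" "$image" "$storage" || return 1
}

# Hypothetical call on the PVE host:
# import_vm 100 win7 /root/win7.qcow2 local-lvm
# The unused disk can then be attached from the web UI, or with something
# like: qm set 100 --ide0 local-lvm:vm-100-disk-0 (volume name may differ).
```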

After that I tried booting the VM, but it wouldn’t see the disk and failed to boot. I then switched the disk type from SCSI to SATA; the VM would start to boot and then blue screen, even after configuring safe mode boot. I found my answer here: Unable to get windows to boot without bluescreen | Proxmox Support Forum

“Thank you, switching the SCSI Controller to LSI 53C895A from VirtIO SCSI and the bus on the disk to IDE got it to boot”.

I also used this moment to uninstall VMware tools.

Then I had no network, and realized I needed the VirtIO drivers.

If you try to run the installer it will say needs Win 8 or higher, but as pvgoran stated “I see. I wasn’t even aware there was an installer to begin with, I just used the device manager.”

That took longer than I wanted and took a lot of disk space too, so it’s not an efficient method, but it works.

No coredump target has been configured. Host core dumps cannot be saved.

ESXi on SD Card

Ohhh, ESXi on SD cards. It got a little controversial, but we managed to keep you. Doing the latest install I was greeted with the nice warning “No coredump target has been configured. Host core dumps cannot be saved.”

What does this mean, you might ask? Well, in short, if there ever was a problem with the host, the dump and log files needed to determine what happened wouldn’t be available. So it’s a pick-your-poison kinda deal.

Store logs and possibly burn out the SD/USB drive storage, which isn’t good at that sort of thing, or point it somewhere else. Here’s a nice post covering the same problem and the comments are interesting.

Dan states “Interesting solution as I too faced this issue. I didn’t know that saving coredump files to an iSCSI disk is not supported. Can you please provide your source for this information. I didn’t want to send that many writes to an SD card as they have a limited number (all be it a very large number) of read/writes before failure. I set the advanced system setting, Syslog.global.logDir to point to an iSCSI mounted volume. This solution has been working for me for going on 6 years now. Thanks for the article.”

with the OP responding “Hi Dan, you can definately point it to an iscsi target however it is not supported. Please check this KB article: https://kb.vmware.com/s/article/2004299 a quarter of the way down you will see ‘Note: Configuring a remote device using the ESXi host software iSCSI initiator is not supported.’”

Options

Option 1 – Allow Core Dumps on USB

Much like the source I mentioned above: VMware ESXi 7 No Coredump Target Has Been Configured. (sysadmintutorials.com)

Edit the boot options to allow Core Dumps to be saved on USB/SD devices.

Option 2 – Set Syslog.global.logDir

You may have some other local storage available; in that case set the variable above to that local or shared storage (shared storage being “unsupported”).
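From the ESXi shell, the same setting can be changed with esxcli. A small sketch follows; the datastore path in the example is a placeholder, and the `ESXCLI` override is only there to allow a dry run.

```shell
#!/bin/sh
# Sketch: point the ESXi syslog directory at persistent storage.
# ESXCLI can be overridden (for example with "echo") for a dry run.
ESXCLI="${ESXCLI:-esxcli}"

# set_syslog_dir /vmfs/volumes/<datastore>/<directory>
set_syslog_dir() {
    # Equivalent to editing Syslog.global.logDir in the advanced settings.
    "$ESXCLI" system syslog config set --logdir="$1" || return 1
    # Make the syslog service pick up the new location.
    "$ESXCLI" system syslog reload
}

# Placeholder path:
# set_syslog_dir /vmfs/volumes/datastore1/logs
```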

Option 3 – Configure Network Coredump

As mentioned by Thor – “Apparently the “supported” method is to configure a network coredump target instead rather than the unsupported iSCSI/NFS method: https://kb.vmware.com/s/article/74537
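A rough sketch of the network coredump setup from the ESXi shell is below. It assumes a dump collector is already listening somewhere; the vmkernel interface, IP and port in the example are placeholders.

```shell
#!/bin/sh
# Sketch: send ESXi core dumps to a network dump collector.
# ESXCLI can be overridden (for example with "echo") for a dry run.
ESXCLI="${ESXCLI:-esxcli}"

# enable_net_coredump VMK_INTERFACE COLLECTOR_IPV4 COLLECTOR_PORT
enable_net_coredump() {
    # Tell the host which vmkernel NIC to use and where the collector is.
    "$ESXCLI" system coredump network set --interface-name "$1" \
        --server-ipv4 "$2" --server-port "$3" || return 1
    # Turn network core dumps on.
    "$ESXCLI" system coredump network set --enable true || return 1
    # Ask the host to verify it can reach the collector.
    "$ESXCLI" system coredump network check
}

# Placeholder values:
# enable_net_coredump vmk0 192.168.1.50 6500
```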

Option 4 – Disable the notification.

As stated by Clay – ”

The environment that does not have Core Dump Configured will receive an Alarm as “Configuration Issues :- No Coredump Target has been Configured Host Core Dumps Cannot be Saved Error”.
In the scenarios where the Core Dump partition is not configured and is not needed in the specific environment, you can suppress the Informational Alarm message, following the below steps,

Select the ESXi Host >

Click Configuration > Advanced Settings

Search for UserVars.SuppressCoredumpWarning

Then locate the string and and enter 1 as the value

The changes takes effect immediately and will suppress the alarm message.

To extract contents from the VMKcore diagnostic partition after a purple screen error, see Collecting diagnostic information from an ESX or ESXi host that experiences a purple diagnostic screen (1004128).”
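The same advanced setting can be flipped from the ESXi shell instead of the UI. A minimal sketch, with a made-up function name and the `ESXCLI` override only there for dry runs:

```shell
#!/bin/sh
# Sketch: suppress the "No coredump target" warning from the ESXi shell.
# ESXCLI can be overridden (for example with "echo") for a dry run.
ESXCLI="${ESXCLI:-esxcli}"

suppress_coredump_warning() {
    # Set UserVars.SuppressCoredumpWarning to 1.
    "$ESXCLI" system settings advanced set \
        -o /UserVars/SuppressCoredumpWarning -i 1 || return 1
    # Show the setting so the new value can be confirmed.
    "$ESXCLI" system settings advanced list \
        -o /UserVars/SuppressCoredumpWarning
}

# suppress_coredump_warning
```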

Summary

In my case it’s a home lab, I wasn’t too concerned so I followed Option 4, then simply disabled file core dumps following the second steps in Permanently disable ESXi coredump file (vmware.com)

Note* Option 2 was still required to get rid of another message: System logs are stored on non-persistent storage (2032823) (vmware.com)

I’m not sure, but disabling coredumps may still help with I/O. I’ll update again if any news arises.

TPM security on an ESXi VM

A great part about vSphere 7 is the ability to add a virtual TPM (vTPM) device to a VM.

Let’s see if we can pull it off in our lab.

What I need is a key provider. Lucky for us, with 7.0.3 VMware provides a “Native Key Provider”.

During my deployment of the NKP, one requirement is to make a backup of the key I guess, which was failing for me. I found this VMware thread with someone having the same issue.

Sure enough, the comment by “acartwright” was pretty helpful, as I too opened the browser console and noticed the CORS errors. The only difference was I wasn’t using CNAMEs, per se, but I had done a pilot of vCenter renaming. The mismatched names listed in the console reminded me of that. When I went to check the hostname and the local hosts file, sure enough, they had the incorrect name in there.

So, after following the steps in my old blog post to fix the hostname and the localhosts file, I tried to backup the NKP and it worked this time. 😀

Shortly after this I went to add the TPM and couldn’t find it. Oh right, it’s a newer feature; I’ll have to update the VM’s compatibility mode.

Made a snapshot, updated to the latest hardware version, boots fine. Let’s add the TPM hardware: error, can’t add a TPM with snapshots. Ugh, fine, delete the snapshot (I tested that the VM boots fine before doing this), add TPM, success.

Before changing the VM boot option to EFI, boot the VM and boot the OS into Windows RE, use mbr2gpt command to convert the boot partitions to the proper type supported by EFI.

Once completed, change VM boot options to EFI, and check off secure boot.

Congrats, you just configured an ESXi VM with a vTPM module. 🙂

 

vCenter Appliance Failed File Based Backup

Story Time

*UPDATE* VMware has pulled this garbage mess of an update version of vSphere. Why?

1) They PSOD ESXi Hosts...

2) Broke more shit than they fixed...

3) Broke and silently removed protocols for File Based Backups (This post)

As much as the backup failed, I failed along with it.

Task. Backup the vCenter Server using VAMI to create a file based backup.

Now for an ESXi host, you can do this super easily (at least the config, so you install fresh and simply load the config).

For a deeper understanding of backing up and restoring ESXi hosts, please read this really amazing blog post by Michael Bose from NAKIVO.

Back up ESXi configuration:

vim-cmd hostsvc/firmware/backup_config

and you will get a simple URL to download the file right to your management machine/computer.
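The usual pattern is to sync the running config to disk before generating the bundle. A small sketch, with a made-up function name; the `VIM_CMD` override is only there so the commands can be previewed off-host.

```shell
#!/bin/sh
# Sketch: back up an ESXi host configuration from the ESXi shell.
# VIM_CMD can be overridden (for example with "echo") for a dry run.
VIM_CMD="${VIM_CMD:-vim-cmd}"

backup_esxi_config() {
    # Flush the running configuration to disk first.
    "$VIM_CMD" hostsvc/firmware/sync_config || return 1
    # Build the config bundle and print its download URL.
    "$VIM_CMD" hostsvc/firmware/backup_config
}

# backup_esxi_config
```

The printed URL typically has an asterisk where the host name or IP should go, so swap that in before pasting it into a browser.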

Does vCenter have something like this? (from my research…) No.

You use the vCenter Server Interface to perform a file-based backup of the vCenter Server core configuration, inventory, and historical data of your choice. The backed-up data is streamed over FTP, FTPS, HTTP, HTTPS, SFTP, NFS, or SMB to a remote system. The backup is not stored on the vCenter Server.

Which hasn’t been updated since 2019. Let’s make a couple things here clear:

  1. The HTTP and HTTPS mentioned above are not like the ESXi style, where it creates a nice backup file locally and presents you with a simple URL to download it from. The VCSA expects the HTTP/HTTPS target to be a file server that accepts uploads (like Dropbox).
  2. Lots of these “supported” protocols have pretty bad bugs, or simply don’t work at all, which we’ll see below.

Doing the Theory

So OK, I log into VAMI, click the Backup tab on the left-hand nav, and try to add an open SMB path I have available, cause, why not, make my life somewhat easy…

Looking this up I get: VAMI Backup with SMB reports error: “Path not exported by the remote filesystem” (86069) (vmware.com) dated Oct 28,2021. Nice, nice.

Alrighty then, I’ll just spin up a dedicated FTP service on my FreeNAS box I guess. I learnt a couple things about chroot and local users via FTP, but the short and sweet of it was: I created a local account on the FreeNAS box, created a Dataset under an existing mounted logical volume, and granted that account access to the path. Then I enabled local user login for the FTP server, specified that path as the user’s home path, and enabled chroot on the FTP service, so when this user logs in all they can see is their home path, which to that user appears as root. This (I felt) was a fair bit of security on it, even though it’s a lab and not needed, just nice…. ANYWAY… Once I had an FTP server ready….

Now I went to Start a File based backup of the vcenter server:

First Error: Service Not Running

In my case I got an error that the PSC Health service was not running. This might just be because my lack of decent hardware caused some services to not start up in a timely manner. Either way, I navigated to Services in VAMI and started the PSC Health service. Lucky for me there were no further errors on this part.

If you have service errors you will have to check them out and get the required services up and running, which is out of scope for this post.

Second Error: Number of Connections

The next error I got complained about the allowed number of connections to the target.

Which in my case there was an option on the FreeNAS FTP service configurations for this, I adjusted it to “0” or unlimited in hopes to resolve this problem:

restart the service, and try again…

Third Error: Unknown

This is starting to get annoying…

What kind of vague error is that?!

Guy in this thread states the path has to be empty? what?

I tried that, cleared some more space, and it seems to have sorta worked?

Clear the FTP users home path, and try again:

Fourth Problem: Stuck @ 95%

The Job appeared to run but I noticed a couple things:

1) Even though the backup config said the overall size would only be roughly 400MB, the job ran to around 1.8 Gigs.

2)  All I/O appeared to stop and all Resources returned to an idle state, while the job remained stuck processing at 95%.

OK… I found this thread, which suggested to restart the autodeploy service, tried that and it didn’t work, the job remained stuck @ 95%.

I also found this VMware KB,  however,

1) I have a tiny deployment so no chance my DB would be 300Gigs.

2) When I went to check the “buggy python script”, the “workaround” seemed to have already been implemented. So the version of vCenter I was on (7.0u3a) already had this “fix” in place.

3) The symptoms still remain to be exactly the same and the python scripts remain in a “sleeping” state.

FFS already….

Try Anyway

Well I saw the files were created, so I decided to try the restore method on the VCSA deployment wizard anyway…

I forgot to take a snippet here, but it basically stated there was a missing metafile.json file. I can only assume that when the backup process was stuck at 95% it never created this required json file…

FUCK….

One Scheduled Run

I noticed that overnight, I suppose, a scheduled job tried to run and provided yet another error message:

Well, that’s still pretty vague. As far as I know there should be no connectivity issues, since files were created all the way up to 1.8 gigs, so I don’t see how it’s network or permissions related. It can’t be available space either, since the 1.8 gigs that were written got deleted to empty the path every time.

Like seriously, wtf gives here. The fact there’s an entirely new KB with an entire table listing shit that is apparently wrong with this file based backup honestly begs the question: where the FUCK is the QA in software these days? This shit is just fucking ridiculous already…

Check the Logs

*This Log file only gets created the first time you click “configure” under the backup section of VAMI.

Here’s how to access the logs:

Using putty or similar, SSH in as root on the appliance.
Type Shell at the prompt.
Type cd /var/log/vmware/applmgmt.
Type more backup.log or tail backup.log.
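Once in the bash shell, filtering the log is quicker than paging through it. A small sketch is below. The default path comes from the steps above; the ERROR/WARNING level names are an assumption based on the INFO-style entries in this log, and the function name is made up.

```shell
#!/bin/sh
# Sketch: pull recent problem lines out of the VAMI backup log.

# show_backup_problems [LOGFILE]
show_backup_problems() {
    logfile=${1:-/var/log/vmware/applmgmt/backup.log}
    # Keep only lines that mention an error or warning, newest 20.
    grep -E 'ERROR|WARNING' "$logfile" | tail -n 20
}

# show_backup_problems
```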

[VCDB-WAL-Backup:PID-42812] [VCDB::_backup_wal_files:VCDB.py:797] INFO: VCDB backup WAL start not received yet.

Checking the entry I find this thread. Along with this Reddit Post. Which leads right back to the first shared thread, which states some bitching about the /etc/issues files… and I have a strange feeling, just like the stuck @ 95% issue, I’ll look at the file and it will probably be correct just like the guy who created the Reddit post.

Try Alternative Protocols

When I tried alternative protocols I came across more issues:

NFS – Had the same path issue SMB did “Path not exported by remote system”

SCP – Was apparently silently dropped, much like what this thread mentioned. The amount of silence on that thread speaks volumes to me.

TFTP was also dropped.

You are so Fucked

Soo I wonder if I try to “upgrade” aka downgrade using the UI installer of a supposed version that works (7.0u2b)…

Alright so let me get this straight… I upgraded, and now I can’t make a backup cause the upgraded version is completely broken in terms of its File Based Backups.

I can’t roll back the upgrade without having kept the old VCSA, which was removed in my case since everything else, vSphere itself, was working.

I can’t “downgrade” an existing one, and I can’t make a backup to restore to an old one. OK fine, well how about a huge FUCK YOU VMWARE, while I try to come up with some sort of workaround for this utter fucking mess.

Infected Mushroom – U R So F**ked [HQ & 1080p] – YouTube

Work around option #1

Build a brand new vCenter, add hosts, and reconfigure.

The main issue here is the fact if you rely on CBT, you will be fucked and all the VM-IDs will have changed, so you will have to:

1) Edit and adjust all backup jobs to point to the new VM, via its new VM-ID.

2) Let the delta files all be recalculated (which can be major I/O on storage units depending on many different factors: # of VMs, size of VMs, rate of change on the VMs, etc.)

Not an option I want to explore just yet.

Work Around option #2

Back up and restore the config database?

Let’s try… first, back up…

Copy the python scripts (hope they’re not all buggy and messed up too…)

Stop required services:

service-control --stop vmware-vpxd
service-control --stop vmware-content-library

change the script permissions

chmod +x backup_lin.py

Run it:

Make a copy of it via WinSCP.

run the restore script… and

well was worth a shot but that failed too….

Lets try PG dump for shits…

I’d really recommend reading this blog post by Florian Grehl on virten.net for great information about using postgres on vCenter.

Connect to server via SSH (SSH enabled required on vCenter).

“To connect to the database, you have to enable SSH for the vCenter Server, login as root, and launch the bash shell. When first connecting to the appliance, you see the “Appliance Shell”. Just enter “shell” to enter the fully-featured bash shell.

The simplest way to connect to the databases is by using the “postgres” user, which has no password. It is convenient to also use the -d option to directly connect to the VCDB instance.”

# /opt/vmware/vpostgres/current/bin/psql -U postgres -d VCDB

Cool, this lets us know the postgres DB service is running. The most important take away from Florian’s post is:

“When connecting, make sure that you use the psql binaries located in /opt/vmware/vpostgres/current/bin/ and not just the psql command. The reason is that VMware uses a more recent version than it is provided by the OS. In vSphere 7.0 for example, the OS binaries are at version 10.5 while the Postgres server is running 11.6”
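Applying that advice to a dump rather than an interactive session, a sketch could look like the following. It assumes pg_dump ships in the same directory as the psql binary mentioned above, which I have not confirmed for every VCSA build; the output path is a placeholder.

```shell
#!/bin/sh
# Sketch: dump only the VCDB database using the bundled PostgreSQL tools.
# PG_BIN is assumed to hold pg_dump next to psql.
PG_BIN="${PG_BIN:-/opt/vmware/vpostgres/current/bin}"

# dump_vcdb OUTPUT_FILE
dump_vcdb() {
    # Call the binary by full path so the client version matches the server.
    "$PG_BIN/pg_dump" -U postgres -d VCDB -f "$1"
}

# Placeholder output path:
# dump_vcdb /tmp/vcdb.sql
```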

Kool, I could use pg_dumpall but I found it didn’t work (maybe that was wrong version of vcenter being mixed, not sure) either way lets try just the VCDB instance…

Interesting, lol. I got an error about a version mismatch. I found this thread about it, and with the info from Florian’s post I had an idea, tried it out, and it actually worked. Mind… BLOWN.

rm /usr/bin/

OK let’s take this file and place it on the newly deployed vcenter.

even though restore appeared to have worked the vCenter instance booted and showed to be like new install. Was worth a shot I guess, but did not work.

Work Around Option #3

I’m not sure this is even a fair option, as it only works if you have existing backups of another type. In my case I use Veeam and it’s saved my bacon I don’t know how many times.

Sure enough Veeam saved my bacon again. I ended up restoring a copy of my vCenter from before the 7.0u3a upgrade, which happened to be on 7.0u2d.

I managed to add an SMB path without it erroring, and unreal, I ran a File Based Backup and it actually succeeded!!

Now I just simply run the deploy wizard, and pick restore to build a new vCenter server from this backup.

Ahhh VMware… dammit you got me again!

alright fine… grabs yet another copy of vCenter…

and this time…

are you fucking kidding me? Mhmmm interesting… VCSA 7.0 restore issue – VMware Technology Network VMTN

ok… good to know…

From this… to this….

then Deploy again…

It stated it failed, due to user auth. However I was able to log in and verify it worked, but sadly it also instantly expired the license. I was hoping I could get another 60 days without creating a new vCenter, reconfiguring, and breaking my VM-IDs and CBT delta points for my backup software.

Even this link states what I’m trying to do is not possible… ugh the struggles are real!

In the end I just started from scratch. Ugh.

Changing vCenter Hostname

Why?!?! Cause I gotta!

Source: Changing your vCenter Server’s FQDN – VMware vSphere Blog

PreReqs, AKA Checklist

  • Backup all vCenter Servers that are in the SSO Domain before changing the FQDN of the vCenter Server(s)
  • Supports Enhanced Linked Mode (ELM)
  • Changing the FQDN is only supported for embedded vCenter Server nodes
  • Products which are registered with vCenter Server will first need to be unregistered prior to an FQDN change. Once the FQDN change is complete they can then be reregistered.
  • vCenter HA (VCHA) should be destroyed prior to an FQDN change and reconfigured after changes
  • All custom certificates will need to be regenerated
  • Hybrid Linked Mode with Cloud vCenter Server must be recreated
  • vCenter Server that has been renamed will need to be rejoined back to Active Directory
  • Make sure that the new FQDN/Hostname is resolvable to the provided IP address (DNS A records)

NOTE: If the vCenter Server was deployed using the IP as PNID/FQDN, then the following should also be considered:

  • The PNID change workflow cannot be used to change the IP address of vCenter Server
  • The PNID change workflow cannot be used to change the FQDN of vCenter Server

In this scenario, use the vCenter Server Appliance Management Interface (VAMI) to update hostnames or IP changes directly. 

The main thing I was expecting was the certificate issue. In my home lab, I removed the SSO domain before this change (just using vsphere.local), no ELM, already using embedded (all-in-one), no VCHA, no Hybrid. Oh yeah… I’m not sure if you have to “leave” an SSO domain before joining back to AD…

My Only Pre-Req

I went into DNS and pre-created A host records for the new server hostname: vCenter.zewwy.ca

Steps

Basically log into VAMI, and change the name.

Then

and and…. well WTF…

No matter what I do it’s greyed out… I thought maybe the untrusted cert, might be an issue so tried from a machine with full trusted chain, and same issue!

Like…. Why… why is Next greyed out? It’s like whatever Button Validation code is written for it is not being triggered, is this a browser version issue? I can’t find anything online with anyone having this issue…. Why? Cause I was right, it was the input validation…

Honestly, this is one of those MASSIVE facepalm moments in my life. I only realized after the fact the username field was NOT auto filled, it was only a label that was greyed and provided as a suggestion… Fill both fields and the next is ungreyed…

Step 4, check the checkbox to acknowledge the warning, and away… she goes!

At which point I clicked redirect now (both web addresses were still available as it didn’t seem to matter which you came from, the cert was untrusted either way, cause the CA not in my trusted ca store)

5 minutes later….

I tell ya nothing more annoying than a spinning circle and the warning “don’t refresh” when the status bar simply does not move… sure got some conflicting messages here….

*Starts to sweat*…

after about 10 minutes time…

More Certificate Fun!

Alright, so after this, some quick takeaways… When I went to check the site it was “untrusted”, but not for the reason I had thought. I thought it would be the same issue as in the source blog, the hostname on the cert, but that was not the case. Instead it was simply that the cert chain seemed to be missing, and the issuer could not be verified:

as well as:

So what to do about this… You can download the CA cert from vcenter/certs/download.zip (for some reason I had to use IE), then install the CA cert. (I noticed that even after I did this I still had a cert warning, but the next day, maybe after a cache clear or update, it reported green in the web browser.)

Now when I logged in, I got the ol Cert Alert in the vCenter UI

first thing to try is removing old CA’s

Which I did, following this VMware KB

I simply followed my other post about this, and just cleared reset to green on the alert. (Still good days later).

Backup Solutions

Don’t forget to change the server in your backup software; I had to do this in Veeam.

These were my results…

Which go figure errored out…

So right click, go to properties of the object… Next, next…

Accept the new certificate.

Now you’d figure all is well, but when I went to create a new backup job and attempted to expand the vCenter server in Veeam, it just hung there…

I ended up rebooting the server, then waiting for all the Veeam services to start. I reopened Veeam, went to Inventory, and clicked the vCenter server; it took a second and then showed all the hosts and the VMs. I rescanned it to be safe and got this result, which was a bit different than the applied settings confirmation above. I think maybe I forgot to rescan the host after applying the new settings, assuming it would have done that as part of the properties change wizard.

which lucky for me now worked, and I was able to select a VM in the Veeam backup wizard, and it successfully backed up the VM.

Final Caveats

Like, what the heck, everywhere else it’s changed except at the shell. Let’s see if we can change this.

Well that was easy enough, no reboot required. 🙂

I also found the local hosts file doesn’t update either. The file states it’s managed by VAMI, so I may have to look there for potential solutions:

I noticed this since I had to do a work around for something else, and sure enough caught it. I’ll change it manually with vi for now and see what changes after a reboot.

Summary

Overall, literally quick n easy.

  1. Verify DNS records exist.
  2. Use VAMI to edit the hostname under the Network MGMT settings, click apply and wait.
  3. Manually clear out the old certs that were created under the old hostname.
  4. Reconfigure your backup solution, which is vendor specific (I provided steps for Veeam as that is the backup vendor I like to use).

Overall the task seemed to go pretty smooth. I’ll follow up with any other issue I might come across in the future. Cheers.

 

 

How to remove a Datastore from a vSphere Cluster

How to Remove a Datastore

Intro

Hey everyone,

I figured I’d write up a quick little help guide on removing a Datastore. Now this isn’t new and likely to be buried on the internet because of it. However in my searches I have found the following sources to be great reads. I highly recommend you check them out.

1)  Official Source VMware KB2004605.

2) A Blog guide by Sam McGeown, here.

3) A post by Mike on cswitchzero.

Now let’s go through the checklist from the official source one by one.

Check List

  • If the LUN is being used as a VMFS datastore, all objects (for example, virtual machines, templates, and Snapshots) stored on the VMFS datastore must be unregistered or moved to another datastore. - This one is pretty easy: navigate to the datastore files and check. You may find some remnants from the following items though.
  • All CD/DVD images located on the VMFS datastore must also be unmounted/unregistered from the virtual machines. - This shouldn’t even be the case if you did check one.
  • The datastore is not used for vSphere HA heartbeat.-This setting will use a folder labeled “.vSphere-HA”
    For a Quick overview of Datastore Heart beating See here
    To “remove” aka change them See here
  • The datastore is not part of a datastore cluster.-You can find useless help on this process from VMware here. I’m assuming it’s an easy task via the WebUI
  • The datastore is not managed by Storage DRS.-If you removed it from the datastore cluster, how could this be an issue?
  • The datastore is not configured as a diagnostic coredump Partition/File and Scratch Partition. For more information, see the following:
  • Storage I/O Control is disabled for the datastore.-See here on how to enable (disabling is the exact reverse)
  • No third-party scripts or utilities running on the ESXi host can access the LUN that has issue.-Honestly I’m not sure how you could check this… even when doing some quick research, you can have scripts I guess that are not on the hosts, but run by alternative machines via PowerCLI. As described in this community post. I guess you’d have to know, either way the scripts would just fail, shouldn’t affect the vSphere cluster.
  • If the LUN is being used as an RDM, remove the RDM from the virtual machine. Click Edit Settings, highlight the RDM hard disk, and click Remove. Select Delete from disk if it is not selected and click OK.Note: This destroys the mapping file but not the LUN content.

    – This is more involving the removing of the backend physical device. Which in my case is the final goal. Though if yours was just to remove a datastore while keeping the physical storage in place this can be ignored.

  • As noted by Sam, but not by the official source or Mike: check whether you see a .dvsData folder. As stated by Sam, “The .vdsData folder is created on any VMFS store that has a Virtual Machine on it that also participates in the VDS – so by migrating your VMs off the datastore you’ll be ensuring the configuration data is elsewhere.”
  • Check that there are no processes locking the VMFS with this command:
esxcli storage core device world list -d NAA_ID
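For the coredump, scratch and syslog items in the checklist, the host’s current configuration can be listed from the ESXi shell. A rough sketch follows; the function name is made up, and the output still has to be read by eye for paths on the datastore being removed.

```shell
#!/bin/sh
# Sketch: show host settings that may still point at a datastore.
# ESXCLI can be overridden (for example with "echo") for a dry run.
ESXCLI="${ESXCLI:-esxcli}"

show_datastore_users() {
    # Configured and active coredump partition, if any.
    "$ESXCLI" system coredump partition get
    # Coredump files and the datastore paths they live on.
    "$ESXCLI" system coredump file list
    # Current scratch location.
    "$ESXCLI" system settings advanced list -o /ScratchConfig/CurrentScratchLocation
    # Syslog directory (Syslog.global.logDir).
    "$ESXCLI" system syslog config get
}

# show_datastore_users
```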

Datastore Removal Steps

Step 1) Follow the Checklist above.

Make sure no files reside on the Datastore.

Step 2) Unmount Datastore from all ESXi hosts.

As noted in Sam’s blog post, even in vSphere 5.x using the C# phat client, this was possible to do via a wizard against all hosts that have the datastore mounted. Even on the newer HTML5 WebUI this is still possible (I think everyone wants to fully forget that VMware chose Flash for a short time).

At this point the Datastore will show up as inaccessible to vSphere. As noted by both Mike and Sam. This will be the same anywhere from 5.x-7.x (As noted by Mike it might be slightly more important to follow procedures with earlier versions of ESXi 3 or 4). If the Check list was followed, there should be no issues unmounting the datastore.

If you need to do this via esxcli (Source):

# esxcli storage filesystem list

Unmount the datastore by running the command:

# esxcli storage filesystem unmount [-u UUID | -l label | -p path ]

For example, use one of these commands to unmount the LUN01 datastore:

# esxcli storage filesystem unmount -l LUN01

# esxcli storage filesystem unmount -u 4e414917-a8d75514-6bae-0019b9f1ecf4

# esxcli storage filesystem unmount -p /vmfs/volumes/4e414917-a8d75514-6bae-0019b9f1ecf4
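
As a rough sketch, the unmount-by-label command above can be wrapped in a small shell function with a dry-run guard, so you can see what would run before touching a host. The function name unmount_datastore and the DRY_RUN variable are my own hypothetical additions, not part of esxcli; only the esxcli call itself comes from the steps above.

```shell
#!/bin/sh
# Sketch only: wraps the unmount-by-label command shown above.
# DRY_RUN and unmount_datastore are hypothetical names, not part of esxcli.
# DRY_RUN=1 (the default here) prints the command; DRY_RUN=0 actually runs it.
DRY_RUN="${DRY_RUN:-1}"

unmount_datastore() {
    label="$1"
    if [ -z "$label" ]; then
        echo "usage: unmount_datastore LABEL" >&2
        return 1
    fi
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: esxcli storage filesystem unmount -l $label"
    else
        esxcli storage filesystem unmount -l "$label"
    fi
}

# Example: preview the unmount of the LUN01 datastore used above.
unmount_datastore LUN01
```

Leaving the default at dry-run is a deliberate choice: unmounting is disruptive, so the safer behaviour is the one you get when you forget to set anything.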

Step 3) Detach the LUN from all hosts.

As noted by Sam, if you are on 5.x you might want to automate this via PowerCLI. As Mike notes, newer 7.x can now do this in bulk via the Management WebUI.

6/7 WebUI -> Hosts and Clusters -> Hosts -> Cluster -> Host -> Configure Tab -> Storage Devices (left side tree) -> Highlight Device -> Detach

For esxcli:

Obtain the NAA ID of the LUN to be removed:

esxcli storage vmfs extent list

To detach the device/LUN, run the command:

# esxcli storage core device set --state=off -d NAA_ID

To verify that the device is offline, run the command:

# esxcli storage core device list -d NAA_ID

The output should show that the Status of the disk is off.
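
If you want to check the status from a script rather than by eye, a minimal sketch could look like the following. It assumes the device list output contains a line of the form "Status: off" (or "Status: on"), and device_is_off is a hypothetical helper name of mine, not an esxcli feature.

```shell
#!/bin/sh
# Sketch only: checks "esxcli storage core device list -d NAA_ID" output
# for an offline device.
# Assumption: the output contains a line of the form "   Status: off".
# device_is_off is a hypothetical helper name.
device_is_off() {
    # Reads the command output on stdin; succeeds only if Status is "off".
    grep -Eq '^[[:space:]]*Status:[[:space:]]*off[[:space:]]*$'
}

# Usage on a host (replace NAA_ID with the real identifier):
#   esxcli storage core device list -d NAA_ID | device_is_off && echo "device is detached"
```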

Step 4) Rescan HBAs

At this point, if you rescan all HBAs on all hosts the inaccessible datastore should be gone from the WebUI.
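
If you would rather rescan from the command line, esxcli has a rescan command for storage adapters. The sketch below only prints the command for each host so it can be reviewed first. It assumes SSH is enabled on the hosts and that you log in as root; the host names are placeholders and rescan_commands is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: print the rescan command for each host so it can be reviewed first.
# Assumptions: SSH is enabled on the hosts and you log in as root.
# The host names are placeholders; rescan_commands is a hypothetical helper name.
rescan_commands() {
    for host in "$@"; do
        echo "ssh root@$host esxcli storage core adapter rescan --all"
    done
}

# Example with two placeholder host names:
rescan_commands esxi01 esxi02
```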

At this point you can remove the LUN so the disk no longer shows up under devices. How you do that depends on the setup. For a shared VMFS datastore it will most likely be an iSCSI-based configuration (remove the static and dynamic target IPs from the iSCSI initiator settings on each host).

It could also be a local disk behind a local storage controller, such as a logical drive created in RAID behind a Pxxx storage controller.

Removing the source device will always be dependent on how it was configured in the first place.

Summary

So today we covered removing a datastore. The important thing to remember is that removing a datastore takes a lot more steps than creating one, because so many different VMs and services can end up depending on a datastore once it is in use.

In many cases the SysLog and Scratch partitions are the big hang-ups and should be looked at closely. That said, if you are actually checking for files on the datastore, this stuff will be pretty evident.

In most cases, if you follow the checklist the process should be pretty smooth. Hope this helps someone.

*Note* I often provide screenshots for context. In this case I decided to leave it more generic to span multiple versions of vSphere.

Creating Custom ESXi Image

Follow these steps

  1. Download Offline Bundle of ESXi Image
  2. Download drivers, e.g. the native ESXi USB NIC drivers
  3. Install PowerCLI (Set-ExecutionPolicy RemoteSigned; Import-Module PowerShellGet; Install-Module -Name VMware.PowerCLI)
  4. In PowerCLI connect the standard SoftwareDepot by typing:

    Add-EsxSoftwareDepot -DepotUrl <Path to zip>

  5. Get the ImageProfile list:

    Get-EsxImageProfile

  6. Clone standard ImageProfile:

    New-EsxImageProfile -CloneProfile ESXi-6.7.0-8169922-standard -Name MyProfile -Vendor <vendor>

  7.  [Only If Required] If your vib file has Acceptance Level – CommunitySupported, we need to set this Acceptance Level for our ImageProfile:

    Set-EsxImageProfile -ImageProfile MyProfile -AcceptanceLevel CommunitySupported

  8. Add our vib to SoftwareDepot:

    Get-EsxSoftwarePackage -PackageUrl <path to vib>

  9. Add our vib to ImageProfile:

    Add-EsxSoftwarePackage -ImageProfile MyProfile -SoftwarePackage <package name>

Error:

Search result.

Answer: the driver is built for a specific version (I had the 7.1 one and needed 6.5).

So I downloaded the proper driver, but I couldn’t figure out how to pick the right software package, since the “get” command had already loaded the other driver and it kept trying to add the 7.1 driver. The only thing I could think of was to close the PowerShell window and start fresh…

10. Export ImageProfile to ISO image:

Export-EsxImageProfile -ImageProfile MyProfile -ExportToIso -FilePath <path to iso>

That was it! Sadly the laptop I wanted to use this on was still boot looping, and the “Insignia” USB NIC didn’t seem to work either: I was getting “NFS4 client failed to load” and no network adapters found on the machine. But it was worth a shot.

VMware vCenter Updates using VAMI

This is a quick post on the latest security release notification from VMware.

VMSA-2021-0002 (vmware.com)

If for whatever reason an update is not possible you can follow these workarounds.

While you can use VUM to distribute updates and patches to ESXi hosts, you’ll have to use VAMI for updating vCenter.

You can download the latest patches here (vmware account required).

I did this on my lab vCenter,  took a lil while but not bad.

  1. Made a backup of the VCSA using Veeam
  2. Shutdown Veeam or any other backup solution that might use vCenter
  3. Notified anyone that might use vCenter that it would be inaccessible during update
  4. Attached ISO to VCSA VM (You can do as 4sysops did and upload to a datastore, or you can simply open the VCSA console via VMRC, and attach the ISO from your Downloads folder)
  5. Log into VAMI (https://vcsa:5480)
  6. Click Update on left nav, then Update -> Check CD-ROM
  7. The update should be available as the option, then click Stage and Install
  8. Accept the EULA, use/don’t use CEIP, Check I have a backup, Click Install.

It could take an hour or so, then everything is back to running state, here’s the summary page after completion:

You can read the alternative methods such as using CLI, or how to handle a vCenter HA cluster upgrade using the link above to 4sysops guide on upgrading vCenter.

Sorry this post is not as extensive as usual, just a heads up about the latest VMware patches. Stay Safe out there.

 

ESXi 6.7 on HPE DL380 G7

I had this long blog post I was going to write about HP screwing me on an ESXi upgrade, but in a nutshell you can read these ones about the whole shebang:

  1. MonsterMuffin (crude)
  2. Claud “Admin” (Less crude)

As both of them mention, you have to do a clean install, and you probably won’t have a config saved from that exact version since you are just updating to it, so your config from the old 5.x or 6.0 won’t work either. If you have dvSwitches and all that fun jazz it’s probably not a huge deal, but if you have standard vSwitches with lots of custom configuration around them, including VLAN tagging, well, this can be crappy.

I did manage through all my trial n errors to get a working copy but it required workarounds I don’t think would have been supported, so meh just follow those…

I tried everything to get an ESXi system upgraded to 6.7 without losing or reconfiguring the host. You’d figure you could just do a new install and reload the config.

However, you can only load a config on the same version the backup was created on. I eventually ran into other things, including PSODs, and at one point even had to edit the boot file to stop an HP dedicated driver from loading. After all that, meh, just install fresh with the custom images mentioned in the above blogs.

I’m really sorry, I would have covered these tasks in far more detail, but I spent a good couple of days smashing my head just trying to get it to work. Some things are just not worth the effort, and blogging every annoying error and step along the way on this one… is just one of those things.

Cheers.