VMware Changes Update URLs

If you run a home lab, or manage systems for companies, you may have noticed updates not working in VAMI… something like…. Ohhh I dunno.. this:

Check the URL and try again.

Unable to patch the vCenter via VAMI as it fails to download the updates from Broadcom public repositories

Cause

Public facing repository URLs and authentication mechanisms are changing. Download URLs are no longer common but unique for each customer, and therefore will need to be re-configured.

Well… wow thank you Broadcom for being so… amazing.

If you want to be overly confused about the whole thing you can read this KB: Authenticated Download Configuration Update Script

As the original link I shared above states, all you have to do is log in to the Broadcom support portal, get a token, and edit the URL…. but….

Notes:

    • The custom URL is not preserved post migration upgrade, FBBR restore and VCHA failover
    • If there is a proxy device configured between vCenter and the internet, ensure it is configured to allow communications to the new URL
    • Further patches automatically update this URL. For example, if 8.0.3.00400 is patched to 8.0.3.00500, the default URL will change to end in 8.0.3.00500.
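If there is a proxy between vCenter and the internet (note two above), a quick sanity check from the vCenter appliance's bash shell is to curl the repository and confirm it resolves and connects. This is only a sketch; the URL and proxy below are placeholders, substitute your own customer-specific values:

# placeholder URL and proxy; substitute your own values
curl -v --proxy http://proxy.example.local:3128 "https://<your-customer-specific-repo-URL>/"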

Looks like this was enforced just a couple days ago … Sooooo, happy patching?   ¯\_(ツ)_/¯

Permission to perform this operation was denied. NoPermission.message.format

For anyone who may use my site as a source of informational references, I do apologize for the following:

  1. My Site Cert expiring. ACME is great, I’m just a bit upset they refuse to announce their HTTP auth source addresses, so I can’t create a security rule for them. Right now the rule would be restricted to App Type, which while not bad.. is not good enough, so I have to manually allow the traffic for the cert to be renewed.

    No… I have no interest in allowing ACME access to my DNS for DNS auth.

  2. Site was down for 24 hours. If anyone noticed at all, yes, my site was down for over 24 hours. This was due to a power outage that lasted over 12 hours after a storm hit. No UPS could have saved me from this, though one is in the works even after project “STFU” has completed.

    No, I have no interest in clouding my site.

I have a couple blog post ideas roaming around, I’m just having a hard time finding the motivation.

Anyway, if you get “Permission to perform this operation was denied. NoPermission.message.format” while attempting to move an ESXi host into a vCenter cluster, chances are you have an orphaned vCLS VM.

If so, log into VAMI and restart the ESX Agent Manager (EAM) service.
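If you'd rather do it from the shell, the same restart can be done over SSH to the vCenter appliance; a sketch, assuming the standard service-control tooling and the vmware-eam service name:

# from the VCSA shell
service-control --stop vmware-eam
service-control --start vmware-eam

(Newer builds also accept service-control --restart vmware-eam.)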

After restarting that service everything should be hunky dory…

Cheers.

Update Veeam 12.3

Grab Update file from Veeam.

Step 1) Sign in to Veeam portal

I didn’t have a paid product license, so my download section was full of free trial links. Since I’m using CE (Community Edition), I grabbed it from here: Free Backup Software For Windows, VMware, & More – Veeam

Step 2) Download the ISO, it’s a doozy at 13 GBs

Step 3) Read the update notes for any expected issues/outcomes.

For all the FAQs go here: Veeam Upgrade FAQs

For basic System Requirements and release notes see here: Veeam Backup & Replication 12.3 Release Notes

The main thing will be the change of the server SQL service, moving from MS SQL Express to PostgreSQL, though it’s not directly mentioned from what I can see, other than step 8 in the upgrade path: Upgrading to Veeam Backup & Replication 12.3 – User Guide for VMware vSphere

Step 4) Attach the ISO to the server being upgraded or installed on

In my case, a 12.1-based server.

In my case it’s a VM, so I just attached the ISO via VMRC.

Step 5) Run the Installer

Make sure you stop any “continuous” jobs, and close the B&R Console.

Double Click Setup.exe on the mounted ISO’s main directory.

If you haven’t guessed it, click Upgrade. Nice to see coding done right: it does a check and knows it’s an existing Veeam server, so the only option is to Upgrade.

In my case I again only have one option to choose from.

How long we wait is based on the Matrix. Looking at the VM resource usage, it looks like it’s reading the installation files from the ISO and writing them somewhere to disk; my setup only yielded about 40 MB/s and took roughly 8 minutes.

Agree to the EULA.

Upgrade the server. Here you have a checkbox to update remote components automatically (such as Veeam proxies). In my lab the setup is very simple, so I have none. I just click Next.

License upgrade: (I’ll try not selecting this since I’m on CE… nope, the wizard wouldn’t let me for CE, shucks hahah)

Service account, Local System (recommended). I left this default, next.

Here’s the OG MS SQL instance:

… yes?

For the Veeam Hunter service… ignore (Shrug)

Free space… it needs more than 40 gigs… holy moly….

43.1 GB required, 41 GB Available. Unreal, guess I’ll extend the drive, great part of running VMs. 🙂

Finally! Let’s Gooooo! And sure enough, first step.. here comes the new SQL instance.. this is probably why it requires over 40 gigs to do the install: to migrate the SQL instance from MS SQL to Postgres…. Wonder if space will be reclaimed by removal of the MS SQL Express instance….

Roughly half hour later…

Mhmmm, checking the services I see the original MS SQL instance is still there running. I see a postgres service.. not running… uhhhh mhmmm…

All Veeam services are running, open the Veeam B&R console, connect, and yup it opens. The upgrade component wizard automatically opened, and it updated the only item.. itself.

*UPDATE* Patch for the latest CVE (CVSS 9.9), if you have a domain-joined Veeam server.

KB4724: CVE-2025-23120

*thumbs up* It’s another 8 gig btw…

Installing Core Linux

Installing TC-Linux (Core Only)

Sources

Source: wiki:install_hd – Tiny Core Linux Wiki

On, ESXi VM: wiki:vmware_installation – Tiny Core Linux Wiki

FAQs: http://www.tinycorelinux.net/faq.html

Setting up VM

VM Type: Other Linux 32bit kernel 4.x
CPU: 1
Mem: 256 MB
HDD: 20 Gig
Network: DHCP + Internet Access

Change boot to BIOS (instead of EFI)

Booting and Installing Core Linux

Attach the ISO and boot. Core Linux boots automatically from the ISO:

For some reason the source doesn’t tell you what to do next. Type tc-install and the console says it doesn’t know what you are talking about:

AI Chat was kind enough to help me out here, and told me I had to run:

tce-load -wi tc-install

Which required an internet connection:

However even after this, attempting to run it gave the same error.. mhmm. Using the find command I found it, but it needs to be run as root, so:

sudo su
/tmp/tcloop/tc-install/usr/local/bin/tc-install.sh

C for install from CDrom:

Let’s keep things frugal around here:

1 for the whole disk:

y we want a bootloader (it’s extlinux btw, located at [/mnt/sda1/boot/extlinux/extlinux.conf]):

Press enter again to bypass “Install Extensions from..”

3 for ext4:

Like the install source guide says, add boot options for the HDD (opt=sda1 home=sda1 tce=sda1)

last chance… (Dooo it!) y:

Congrats… you installed TC-Linux:

Once rebooted, the partition and disk free will look different. Before reboot, running from memory:

after reboot:

Installing OpenSSH?

tce-load -wi openssh

This is where things got a little weird. Installing an app… not as root, TC-Linux says…

This is when things got a bit annoying and weird. Even though the guide says using -wi installs it in the on-boot section, I found it wasn’t loading on boot. Well, at first I noticed it didn’t start at all after install, as I couldn’t SSH in; that was because of a missing config file…

Even once I got it running, it still wouldn’t run at boot, and that apparently was because the file disappeared after reboot. This is because the system mostly runs entirely in RAM. If you didn’t notice, even after install the root filesystem was still only roughly 200 MB in size (enough to fit into the RAM we configured for this VM).

Notice the no password on the tc account? Set it, reboot… doesn’t stick…

Notice the auto login on tty1? Attempt to disable it.. doesn’t stick…

Configuring Core Linux

Long story short, apparently you have to define which paths are to be considered persistent via a file:

/opt/.filetool.lst

These files are saved to mydata.gz via the command:

filetool.sh -b

So here’s what we have to do:

  1. Configure the system to ensure settings we configure stay persistent across reboots.
  2. Change the tc account password.
  3. Disable auto login on TTY1.
  4. Configure Static IP address.
  5. Install and run on boot OpenSSH.

Changing TC Password

Step 1) Edit /opt/.filetool.lst (use vi as root)
– add etc/passwd and etc/shadow
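Or, to the same effect from the shell (as root); note the filetool convention of no leading slash:

echo "etc/passwd" >> /opt/.filetool.lst
echo "etc/shadow" >> /opt/.filetool.lst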

Step 2) run:

filetool.sh -b

Step 3) run

passwd tc

Step 4) run

filetool.sh -b

Now reboot. You may not notice that it applied due to the auto login; however, if you type exit to get back to the actual login banner, type in tc and you will be prompted for the password you just set. Now we can move on to the next step, which is to disable the auto login.

Disable Auto-Login

Step 1) Run

sudo su
echo 'echo "booting" > /etc/sysconfig/noautologin' >> /opt/bootsync.sh

Step 2) Run

filetool.sh -b
reboot

K on to the next fun task… static IP…

Static IP Address

For some reason AI said I had to create a script that runs the manual steps… not sure if this is the proper way… I looked all over the Wiki (wiki:start – Tiny Core Linux Wiki) and I can’t find anything.. I know this works, so we’ll just do it this way:

Step 1)  Run:

echo "ifconfig eth0 192.168.0.69 netmask 255.255.255.0 up" > /opt/eth0.sh
echo "route add default gw 192.168.0.1" >> /opt/eth0.sh
echo 'echo "nameserver 192.168.0.7" > /etc/resolv.conf' >> /opt/eth0.sh
chmod +x /opt/eth0.sh
echo "/opt/eth0.sh" >> /opt/bootlocal.sh
filetool.sh -b

Step 2) reboot to apply and verify.

What about SSH?!

Oh right.. we got it installed but we never got it running did we?!

Step 1) Run:

cp /usr/local/etc/ssh/sshd_config.orig /usr/local/etc/ssh/sshd_config
vi /usr/local/etc/ssh/sshd_config

Edit and uncomment (the actual sshd_config directives):
Port 22
ListenAddress 0.0.0.0
PasswordAuthentication yes

Step 2) Run:

echo "usr/local/etc/ssh/" >> /opt/.filetool.lst
echo "/usr/local/etc/init.d/openssh start" >> /opt/bootlocal.sh
filetool.sh -b
reboot

Congrats… you got OpenSSH working on TC-Linux.

Hostname

On most systems you just run the hostname command… ooooeee, not so easy on TC-Linux.

Option 1 (Clean)

Edit the line of /opt/bootsync.sh that sets the hostname.

Then just run filetool.sh -b, done.
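For reference, the stock /opt/bootsync.sh looks roughly like this (from memory, so verify on your install); the sethostname line is the one to change:

#!/bin/sh
# put other system startup commands here
/usr/bin/sethostname mynewhostname
/opt/bootlocal.sh &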

Option 2 (Dirty)

To ensure the hostname persists across reboots, you need to modify the /etc/sysconfig/hostname file:

  1. Edit the hostname configuration file:
    sudo vi /etc/sysconfig/hostname
    
  2. Add or modify the line to include your desired hostname:
    your_new_hostname
    
  3. Save and close the file.
  4. Add /etc/sysconfig/hostname to the persistence list:
    echo "etc/sysconfig/hostname" >> /opt/.filetool.lst
    echo "hostname $(cat /etc/sysconfig/hostname)" >> /opt/bootlocal.sh
  5. Save the configuration:
    filetool.sh -b
    reboot

That’s it for now, next blog post we’ll get to installing other goodies!

Managing Apps

Installing Apps

As you can see, it’s mostly running:

tce-load -wi

For all the details see their page on this, or run it with -h.

Source of app (x86): repo.tinycorelinux.net/15.x/x86/tcz/

For the most part it’s: install the app, edit files as needed, add the edited files to /opt/.filetool.lst, run the backup command, test the service, edit /opt/bootlocal.sh with the commands needed to get the app/service running, run filetool.sh -b again, and Bob’s your uncle (see the sketch below).
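As a concrete sketch of that loop (the app name and config path here are hypothetical placeholders):

tce-load -wi someapp                                              # fetch the extension + deps, set to load on boot
vi /usr/local/etc/someapp/someapp.conf                            # configure as needed
echo "usr/local/etc/someapp/" >> /opt/.filetool.lst               # mark the config path persistent (no leading slash)
echo "/usr/local/etc/init.d/someapp start" >> /opt/bootlocal.sh   # start the service at boot
filetool.sh -b                                                    # back everything up to mydata.gz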

Deleting Apps

To remove a package on Tiny Core Linux that was installed using tce-load, here’s what you can do:

  1. For Extensions in the onboot.lst File:
    • First, remove the package name from the /etc/sysconfig/tcedir/onboot.lst file to prevent it from being loaded at boot. You can edit the file with:
      sudo vi /etc/sysconfig/tcedir/onboot.lst
    • Delete the entry corresponding to the package you wish to remove, then save and exit.
  2. Delete the Extension File:
    • Navigate to the directory where the extensions are stored:
      cd /etc/sysconfig/tcedir/optional
    • Remove the .tcz file associated with the package:
      sudo rm package-name.tcz
  3. Clean Up Dependency Files (Optional):
    • To clean up leftover dependency files related to the removed package, check for and delete them from the same directory (/etc/sysconfig/tcedir/optional).

 

Veeam VM Restore failed: Cannot apply encryption policy. You must set the default key provider.

So, in my lab, vCenter went completely POOOOOF, so I installed it fresh.

After vCenter was installed, I updated my Veeam configuration to ensure my backup chains wouldn’t break which still works great by the way.

One VM was missing from my vSphere. So I went to restore it when all of a sudden:

I remembered my post about configuring a Native Key Provider, because it was required to have a vTPM. So I thought: is this a “PC Load Letter” problem, and it’s actually just complaining that I didn’t configure an NKP for it to “apply encryption policy”?

Follow the same old steps to configure an NKP.

  • Log in to the vSphere Client:
    • Open the vSphere Client and log in with your credentials.
  • Navigate to Key Providers:
    • Select the vCenter Server instance.
    • Click on the Configure tab.
    • Under Security, click on Key Providers.
  • Add a Native Key Provider:
    • Click on Add.
    • Select Add Native Key Provider.
    • Enter a name for the Native Key Provider.
    • If you want to use hosts with TPM 2.0, select the option Use key provider only with TPM protected ESXi hosts.
  • Complete the Setup:
    • Click Add Key Provider.
    • Wait for the process to complete. It might take a few minutes for the key provider to be available on all hosts.
  • Backup the Native Key Provider:
    • After adding the Native Key Provider, you must back it up.
    • Click on the Native Key Provider you just created.
    • Click Backup.
    • Save the backup file and password in a secure location.

Once I did all that…

No way that actually worked. But will it boot? Well it def “booted” but it asked for the BitLocker key (which makes sense since we created a new TPM and it doesn’t have the old keys). I checked my AD and sadly enough for some reason it didn’t have any BitLocker keys saved for this AD object/VM.

Guess this one is a loss, and a lesson in the importance of saving your encryption keys.

Careful Cloning ESXi Hosts

I’ll keep this post short. I was doing some ESXi host deployments in my home lab, and I noticed that when I would install on a 120GB SSD, the install would go smoothly, but I wasn’t able to use any of the storage as a datastore. However, if I took a fresh install copy of ESXi from installing onto an 8GB USB stick and DD’d it to the 120GB SSD, I got several advantages:

  1. When done via a USB3 pipe off a Linux live disc holding a copy of my base image to deploy, I could get speeds in excess of 100 MB/s, and with only 8GB of data to transfer, the “install” would complete in a mere 90 seconds.
  2. The IP address and root password are preconfigured to what I already know, and I can simply change the IP address from the DCUI and call it a day.

Using this method I could have a host up in less than 5 minutes (2 min to boot Linux live, 90 seconds to install the base ESXi OS image, and 2 more to boot ESXi). This was of course on machines without ECC and all the server hardware firmware jazz… in those cases install times are always longer. Anyway…

This was an amazing option, until I connected a machine I had just deployed and changed its IP address. Since I’m super anal about networking during these types of projects/operations, I noticed my ping to another machine (a completely different IP address) started to drop when the new device came up… after a while the ping responses would come back, but then drop from the new host, and vice versa, flip and flop it goes. I’m used to this when there’s an IP conflict and two devices have the same IP address, but in this case they were different IP addresses… after enough symptom gathering and logical deduction, I had to assume the MAC address must be the same, and that this was the same problem in reverse (different IPs but same MAC) producing the same symptoms.

To validate this I simply deployed my image to a new machine, then went on the hunt to figure out how to see the MAC address. Since I couldn’t plug in the NIC and get to the web-based MGMT interface, I had to figure out how to do it via the console CLI directly… mhmm, after enough googling on my phone I found this Spiceworks thread with my answer:

vim-cmd hostsvc/net/info | grep "mac ="
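If I remember right, esxcfg-vmknic -l also lists the vmkernel NICs along with their MAC addresses, which is a bit easier to read:

esxcfg-vmknic -l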

I then checked this against the ESXi host that I saw the flip-flopping with, and sure enough they matched… After doing a fresh install I noticed that the first 3 sections of the vmk MAC match the physical MAC, but my DD-deployed hosts retain the MAC of the system the image was originally installed on, so when I ran the command above I could tell which ones were deployed via my method. This was further mentioned in this Reddit thread by a commenter who goes by the name of sryan2K1:

“The physical NIC MACs are never used. vmk ports, along with VMs themselves will all use VMWare’s OUI as the first half of the address on the wire.”

OK, now maybe I can still salvage my deployment method by simply deleting and recreating the VMK after deployment, but I’d guess it best be done via the DCUI or direct console… I found one KB by VMware/Broadcom but it gave a 404; luckily there was a Wayback Machine link for it here.

Which states the following:

“During Initial Installation and DCUI, ESXi management interface (default vmk0) is created during installation.

The MAC address assigned will be the primary active physical NIC (pnic) associated.

If the associated vmnic is modified with the management interface vmkernel will once again assign MAC address of the associated physical NIC.

To create a VMkernel port and attach it to a portgroup on a Standard vSwitch, run these commands:

esxcli network ip interface add --interface-name=vmkX --portgroup-name=portgroup
esxcli network ip interface ipv4 set --interface-name=vmkX --ipv4=ipaddress --netmask=netmask --type=static"

Alternatively, you can also use esxcli to create the management interface vmkernel on the VDS.

Creation of the management interface with the ‘esxcli network’ will generate a VMware Universally Unique address instead of the pnic MAC address.

It is recommended to use the esxcli network IP interface method to create the management interface and not use DCUI.

Workarounds:               None

Additional Information:
Using DCUI to remove vmnic binding from management vmkernel or any modification will apply change at vSwitch level. Management interface is associated with propagating the change to any port groups within the vSwtich level.

Impact/Risks:                None.”

I’m assuming it means that if you use the DCUI to reconfigure the MGMT interface settings, the MAC will automatically be reconfigured to match what I found during the initial clean install, as mentioned in the Reddit thread, using the first 3 sections of the physical NIC to derive the MAC of the VMK.

But what if you don’t have any additional interfaces to use to make the selection change in the DCUI to have that actually happen? Because from what I’ve noticed, changing the IP address, disabling IPv6, and rebooting did not change the VMK’s MAC address. Oh, there’s an option in the DCUI, “Reset Network Settings”, and within it there are several options; I simply picked reset to factory defaults. It said success, I checked the MAC via the first command stated above, and bam, the VMK NIC changed to what it should be! Sweet, my deployment method is still viable.
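For reference, the delete-and-recreate route using the esxcli commands from the KB above would look roughly like this end to end (a sketch; the port group name and IP are placeholders for your values, and it’s best run from a direct console session since it drops management connectivity):

# remove the existing management vmkernel port (drops management connectivity!)
esxcli network ip interface remove --interface-name=vmk0
# recreate it on the same port group; esxcli assigns a VMware OUI MAC
esxcli network ip interface add --interface-name=vmk0 --portgroup-name="Management Network"
# reassign the static address
esxcli network ip interface ipv4 set --interface-name=vmk0 --ipv4=192.168.0.50 --netmask=255.255.255.0 --type=static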

Hope this helps someone.

The virtual machine must be encrypted

Sooo I lost a VM in my fray of re-organizing my server farm. Like a lost pup, I figured I’d just rely on my good old Veeam backup sets. Recover VM, alright here we goo….

What.. what does that mean…. Oh wait is this cause of when I blogged about adding vTPMs to VMs?

Re-checked the linked video from VMware… 2 min in… “Failure to save your key backup will result in unrecoverable data loss”…. mhmmm. OK, I thought all I did was add a TPM device to my VM and enable secure boot; that’s the deal here?

Somewhere I read that the VM config files get encrypted, but I don’t think that’s the case here either. Even checking the prerequisites from VMware, I can’t see anything noting this:

Prerequisites

  • Ensure that your vSphere environment is configured with a key provider. See the following for more information:
    • Configuring vSphere Trust Authority
    • Configuring and Managing a Standard Key Provider
    • Configuring and Managing vSphere Native Key Provider
  • Ensure that host encryption mode is enabled. See Enable Host Encryption Mode Explicitly.
  • The guest OS you use can be Windows Server 2008 and later, Windows 7 and later, or Linux.
  • The ESXi hosts running in your environment must be ESXi 6.7 or later (Windows guest OS), or 7.0 Update 2 (Linux guest OS).
  • The virtual machine must use EFI firmware.
  • Verify that you have the required privileges:
    • Cryptographic operations.Clone
    • Cryptographic operations.Encrypt
    • Cryptographic operations.Encrypt new
    • Cryptographic operations.Migrate
    • Cryptographic operations.Register VM

What I think is happening here is that my NKP, which IS a prerequisite, went poof (the vCenter server that was used to create it is shut down and not being used), and another temp vCenter is being used.

My first thought was maybe I could just add a new NKP and go, as I figured the TPM module that’s installed simply needs this, and I think it’s this hardware that’s faulting the boot.

I didn’t want to muck with the original I just recovered, so I tried to clone it, but the clone failed too, complaining about encryption before I even added a TPM, further validating my assumption. What I don’t understand is how the VM was allowed to be created from backup in the first place if I can’t even clone it…?

Anyway, since I know recovery is possible (since I just did it), I guess maybe I can just remove the TPM? Or I could also create a new VM and use vmkfstools to clone the HDD… let’s try that first…

Go to boot the VM; well, got past that error, but the machine was BitLockered, which I was hoping it wasn’t going to be.. go to the AD server, open ADUC… no BitLocker tab… ughhhh…

ADUC Missing BitLocker Recovery Tab in 1809 – Microsoft Community

Right, but where is that on a server? Oh, it moved into Server Manager…

Yay, there’s the BitLocker tab and… it’s empty.. man, give me a fucking break… so now I have a bunch of backups that are useless cause I lost the BitLocker key… shiiiiiiit

Well, I don’t have anything to follow up on here but a lesson learned: back up your BitLocker keys (I don’t know why it wasn’t saved to the AD computer object).

Clearing up Space on an Exchange Server

I wanted to migrate an old Exchange server, but the size was way more than I ever expected… so I dusted off my old script…

GitHub – Zewwy/Manage-ExchangeLogs: Script to help ease in managing Exchange logs

The inetpub logs ended up being over 50 gigs, ETL nearly 5 gigs, another 5 on the next. Altogether that cut out over 60 gigs of space, but lots remained.

I then found an OWA prem folder with lots of data, and found a usual blogger covering its removal here: Remove old Exchange OWA files to free up disk space – ALI TAJRAN

“To know which Exchange build number folders we can remove, we have to find which Exchange versions are running in the organization.

$ExchangeServers = Get-ExchangeServer | Sort-Object Name
ForEach ($Server in $ExchangeServers) {
Invoke-Command -ComputerName $Server.Name -ScriptBlock { Get-Command Exsetup.exe | ForEach-Object { $_.FileversionInfo } }
}

After running the above script in Exchange Management Shell, look at the Exchange build number.”

Pretty much run that command and delete all the folders except the one the system is currently on.

Just like this blogger, I saved another 10 gigs off that folder alone, but the server was still hefty. I checked the Exchange DB folder path and found endless log files (guess I have to extend my script). I ended up writing a one-liner to grab all the log files and clear them; that was almost another 50 gigs in space saved. We are well over 100 gigs in space saved/deleted, but there’s still some heft. Checking the Exchange client mailbox DB, it’s only 6 gigs (and I’ll see if I can save space there), but overall it is peanuts next to all the space being used.

Next I found the WinSxS folder taking up space. I followed the steps from this blog: What to Do If the Windows Folder Is Too Big on Windows 10? (diskpart.com)

I had already run Disk Cleanup including system files. I ran the DISM commands specified as well, but that only brought the 15 gigs of space it was using down to roughly 7 gigs. Half is not bad, but I was hoping it could do better.

Man wonder if I can just delete it? No! Says Microsoft:

“Don’t delete the WinSxS folder, you can instead reduce the size of the WinSxS folder using tools built into Windows. For more information about the WinSxS folder, see Manage the Component Store.”

Following that MS KB pretty much runs the same commands as the blog.

I’ll live with that for now. I logged into the mailbox and deleted everything I could; I should have no other mailboxes… wonder what’s got the Exchange DB up to 6 gigs of used space?

I followed another one of Ali’s Blogs: Get mailbox database size and white space – ALI TAJRAN

But that just told me what I had already found out looking at the base file system. He links to another of his posts with a script he had written. I checked the script… clean use of a case statement… well done, bud 🙂

Anyway, I pipe the mailbox selection into a stats cmdlet and then format the whole list:

(Get-Mailbox -ResultSize Unlimited) | Get-MailboxStatistics | Select *Name,*Count,*Size | FL

This shows all mail item sizes, and all message and attachment sizes. For me all mailboxes were tiny, and I can get rid of several of them as well, but I don’t think that will change the size on disk…

Got rid of everything down to 2 mailboxes, and with 2 gig quotas each there’s no way it could be over 4 gigs, yet it’s still the 6-7 gigs noted earlier.. mhmmm…

So in his blog he basically creates a new mailbox DB via the EAC (or PowerShell), moves the mailboxes, and deletes the old DB. OK, I can do that…

So old DB as seen here:

Create new folder path for new DB file (or use a new disk whatever ya want):


Create new DB:

New-MailboxDatabase -Name "DB02" -Server "EX01" -EdbFilePath "C:\DBPath\DB02.edb" -LogFolderPath "C:\LogPath\DB02"

Restart the services.. wait, where are my new DB and files? Oh, I forgot to mount it (a quick Mount-Database "DB02" from the Exchange Management Shell does the trick):

Alright… time to move the mailboxes…

Huh, I was hoping for a progress display, but I guess it makes sense to throw the job into the background so it’s not interrupted by signing out or closing the console. Checking Resource Monitor… starts chanting… Oh ya, Go Beanus, Go Beanus!!

Looks like I/O settled… and…

Get-MoveRequest

Completed, nice… size? Less than 300 megs, baby.

K, just need to unmount and delete the old DB. What a dink, he knew it would fail, but I followed his other blog post here:

Cannot delete mailbox database in Exchange Server – ALI TAJRAN

and after running the Remove-MailboxDatabase cmdlet it still told me I had to manually delete the file, so I did… I finally got the server down to roughly 30 gigs… not bad, but I really don’t like that 7 gig SxS folder…

I even cleaned out the SoftwareDistribution folder (the Windows Update cache).

Hope this helps someone, time to hole punch this vmdk and migrate this server.

*Update* The hole punch didn’t work. Why? Cause I forgot to run sdelete.
*WARNING* I tried to run sdelete on the VM while it was thin provisioned on a datastore that didn’t have enough storage to fill the whole drive; as such, the VM errored out with “there is no more free space on disk”.

It’s like the old adage goes: things gotta get dirty before they get clean. In this case the drive has to be completely filled (with zeros) before it can be hole punched. Make sure the VM resides on a datastore with enough actual space for the drive to be completely filled.

*Update #2* Seems I went down a further rabbit hole than I wanted this to go. Unlike my post about hole punching a Linux VM, which was pretty easy and straightforward, this one had a couple extra problems up its sleeve.

Problem #1 – Clearing Up Storage Space When Extending Is Not Possible.

This is literally what this whole blog post was about, read it and clear whatever space you can. If you have a physical Exchange server you’re probably done, and all your backups would probably be agent based at this point.

However, if you’re virtualized (specifically with VMware/ESXi), then you have some more steps to do: the “hole punching”.

Problem #2 – Hole Punched Size Doesn’t Match OS Disk Usage

This is where I want to extend on this blog post, cause while the solution seems simple and straightforward, each step has its own caveats and issues to overcome. Let’s discuss these now…

  1. You have to “zero the empty space” before VMware’s tools can properly complete the hole punch. This is only an issue if you happen to be over-provisioning the datastore. If so:
  2. At this point it’s assumed that (1) you’ve cleared as much space as possible, (2) you have defragged the HDD using the Windows defrag tool, and you still have the VM over-provisioned. Simply shrink the partition down to a size that IS available on the datastore, or migrate to a datastore with enough storage. In my case I opted for the first choice, to shrink the partition, when I hit YET ANOTHER PROBLEM:
  3. Even though I knew I had cleared the used space down to roughly 30 GBs, running the shrink wizard in the diskmgmt tool stated it could only shrink the disk to 200GB since “There was a system file preventing further shrinkage”. WTF man, we ran Disk Cleanup, we cleared the SxS folder, we cleared old logs, we cleared the actual Exchange database files, we disabled and shrunk the pagefile then re-enabled it… What could possibly be preventing the shrinkage?

I found this post (windows 7 – Can’t shrink partition due to mystery file – Super User) after I looked in the event log for event 259, which showed the file preventing the shrinkage is “$LogFile::$DATA”… Da fuck does that mean…

In short.. It’s an NTFS journaling file using “Alternate Data Streams“, or as quoted by Andrew Lambert “The $LogFile file is a part of the NTFS filesystem metadata, it’s used as part of journaling. The ::$DATA part of the name indicates the default $DATA stream of the file. I don’t know why it’s causing a problem, though.”

There were a bunch of comments about System Restore points, but I checked and there were none. Many other comments mentioned the use of 3rd party tools (no thanks). I can’t seem to locate it, but I’m pretty sure I remember reading a comment somewhere that other NTFS-aware applications have the ability to move and correct such things. So here’s my plan of action:

  1. Create a snapshot so I don’t have to recover the whole VM if something goes wrong. (On a slower I/O datastore, but one with enough space for the whole disk, just to be safe.)
  2. Boot the Exchange VM but with a GParted Live disc connected.
  3. Use GParted to shrink the partition.
  4. Clone the VM. (This is what I don’t get: the cloned VM still shows a disk size usage of 70 GBs…. AHhhhhhhhh!!)

Here’s another interesting note: as I stated in point 1, I had this VM on a datastore backed by a spindle disk, shown on the ESXi host as “Non-SSD”, and cloned it to an SSD datastore, where it now states it uses 70 gigs while the booted OS only has a partitioned disk of 46 gigs, with 12 gigs free. Opening the defrag application states defrag is not possible cause it’s an SSD. Guess let’s run sdelete and see what happens?!

sdelete -z c:
sdelete -c c:

The backend VMDK grew from 70 gigs to 80 gigs… man, wtf is going on… Hole punch it:

vmkfstools -K /vmfs/volumes/SSD/VM/VM.vmdk

You’re tellin me.. the SSD can handle ripping the drive at over 250 MB/s, but hole punching causes I/O errors?

Good ol’ technology, never ceasing to piss me off… fine, I’ll destroy this VM and move the main spindle drive into a new ESXi host, which will have an SSD datastore with more storage and hopefully one not on the way out (if it actually is an I/O error from storage/drive failure on the SSD). One sec…

So yeah… even with a larger SSD the copy worked, the hole punch “succeeded”, but the drive was still 80 gigs. I made a clone and the vmdk came down to 60 gigs; I still can’t make sense of the roughly 30 gigs of discrepancy. Since the whole idea is to move this to my wireless ESXi host, I’ll see what exporting it as-is yields for the final OVA file size, and then update this blog post.

 

Hole Punching a Linux VM on ESXi

I covered this in the past here:

Reclaim unused space from VMDK – Zewwy’s Info Tech Talks

But this time I wanna cover it a bit differently. Things I noticed:

  1. A proper VM with VMtools installed, and thin provisioned will automatically shrink the overall size being shown and used on disk on the ESXi browser.

Yet for some reason, after I used the SCP method to move a VM from one host to another (frowned upon, as it secretly converts the disk from thin to thick), even after migrating to a new datastore specifying thin, it still shows as full disk usage on the host.

I know it’s actually using less, from checking the VM itself via its console/terminal, whatever:

In my old blog post I mentioned “using DD” but didn’t show or state how at all. Googling this, I found this thread with an interesting answer:

“The /zero-Tag is actually a file name. The command just copies zeros from the virtual File /dev/zero (infinite number of zeros) into /mnt/hdb/zero until the disk is full, or some other error occurs.

This is why you have to remove the file /mnt/hdb/zero after that in order to regain the unused space.

However, a better way to fill free space with zeros (on ext2,3,4 file systems) is to use a tool called zerofree.”

Oooo zerofree?

Huh, something created (a tool) for exactly the task (job) at hand. Great!

Error, how classic; it complained the path is mounted RW, like yeah, and?

Ughhh, google? virtualbox – zerofree on ubuntu 18.04 – Ask Ubuntu

Step 1) Reboot into the Ubuntu recovery console [requires the perfect ESC keystroke]

K, how do I get into the advanced Grub boot menu? Holding Shift did nothing; if I mash ESC I get grub> (notices tiny flicker of the Grub menu).. great, I have to press ESC exactly once, perfectly, in a less than 1 second boot window… man, give me a break… Once in, I see the advanced options as stated by the answer.

Step 2) advanced options -> recovery mode -> root console

Step 3) find the root directory

mount | grep "sda"

Step 4) run zerofree

echo "u" > /proc/sysrq-trigger      # sysrq 'u' remounts all filesystems read-only
mount /dev/mapper / -o remount,ro   # as per my notes; more generally: mount -o remount,ro /
zerofree -v /dev/sda1               # write zeros over the free ext2/3/4 blocks

Step 5) reboot

Checking the ESXi host… what, it went up 2 gigs, da fuq man…

Step 6) Compress Disk.

In my previous post this was done via a storage vMotion between datastores, or the hole punch option, which from what I can tell is the -K option. I don’t think it can be used on a live VM, and a storage vMotion can’t be done without vCenter. Since I’m temporarily working without a vCenter server, let’s try the hole punch option; this will require shutting down the VM, but since zerofree required that anyway, downtime was already in play.

On the ESXi host:

[root@Art-ESXi:~] vmkfstools -K /vmfs/volumes/Art-Team/VM1/VM1.vmdk
vmfsDisk: 1, rdmDisk: 0, blockSize: 1048576
Hole Punching: 100% done.

Oh noooo, I’ve tainted my results… checking the web UI, the space has gone back down to stating 20 gigs like the first snippet… but doing a du -h on the flat file shows it is only the 10 gigs I expected it to be:

Well I don’t know what to make of this discrepancy…

Huh, I found this post of someone doing the exact same thing around the time I wrote my original post, but they simply used the command:

dd bs=1M count=8192 if=/dev/zero of=zero

I had no clue how an output file of “zero” means the disk… but per the quote earlier, it just writes zeros into a file in the current directory until the free space is consumed, and you remove the file afterward. Guess you can try that too, then hole punch the VMDK just like I did.
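Putting the two halves together, the dd route is roughly this (run inside the guest; the datastore path is a placeholder):

# inside the guest: write zeros into a file until free space is exhausted (dd errors out when full)
dd bs=1M if=/dev/zero of=zero
rm zero     # delete the zero file to release the now-zeroed space
sync
# then shut the guest down and, on the ESXi host, punch out the zeroed blocks:
vmkfstools -K /vmfs/volumes/<datastore>/<vm>/<vm>.vmdk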

There.. I shut down the VM again and this time did my ol’ trick to vMotion a VM without vCenter, and after the vMotion (between two datastores) and re-registering the VM, it finally showed up with the correct space in the UI. 🙂

Hope this post helps someone, it’s been covered many times in the past.

How to vMotion a VM without vCenter WITHOUT Shared Storage

While I have covered this in the past here:
How to vMotion a VM without vCenter – Zewwy’s Info Tech Talks

That was using shared network storage between hosts…. what if you have no vCenter AND no shared storage? In my previous post I suggested going to check out VMware Arena’s post, but that just covers how to copy files from one host to another, and what I’ve noticed is that while it does work if you let it complete, the vmdk is no longer thin and takes up the full space as specified by its defined size. This is also mentioned in this Server Fault thread: “Solutions like rsync or scp will be rate-limited and have no knowledge of the content (e.g. sparse VMDK files, thin-provisioned volumes, etc.)”

So options provided there are:

  1. Export the VM as an OVF file, move it to a local system, then reimport the OVF to your ESXi destination. I attempted this, but on the host I could only export the vmdk. While attempting to do so I got network issues (the browser asking to download multiple files, but I must not have noticed in time and it timed out? Not sure). This also requires an intermediary device and a double down/up over the network; I’m hoping for a way between hosts directly.
  2. Use vSphere and perform a host/storage migration. This post is about how to do it without. Also note I attempted this, but in my case I’m using my abomination ESXi host I created in my previous blog post, and vCenter fails the task with errors. (Again, SCP succeeds but doesn’t retain thin provisioning.) Not sure why SCP succeeds where vCenter fails; SCP seems to be more resilient to a poor connection and keeps going, which matters when the WiFi NICs are under load in those situations.
  3. Leverage one of Veeam’s free products to handle the ad hoc move.

I love Veeam, but in this case I’m limited in resources; let’s see if we can do it via native ESXi here.

So that exhausts all those options. What else we got…

Move VMware ESXi VM to new datastore – preserve thin-provisioning – Server Fault

Oh, someone figured out what I did in my initial post, all the way back in 2013… wonder how I missed that one.. oh well, same answer as my initial post, though it required shared storage… moving on…

LOL, no way… William Lam, all the way back from over 14 years ago, answering the question I had about compression of the files, and saying the OVF export is still the best option.. mhmmm…

I don’t want to stick to just SCP. Man, did it suck getting to 97% done on a 60 gig provisioned VMDK (that’s only taking up roughly 20 gigs), only to have it not work cause I put my machine to sleep, thinking it was a remote connection (SSH) to the machine and the machine was doing the actual transfer… just to wake my machine the next morning to a “corrupt” vmdk that fails to boot or svMotion to thin. I have machines with fast local storage but poor network; it’s a problem from back in the day of poor, slow internet speeds. So what do we have? We got gzip and tar. What’s the diff?

In conclusion, GZIP is used to compress individual files, whereas TAR is used to combine numerous files and directories into a single archive. They are frequently used together to create compressed archive files, often with the “.tar.gz” extension.

Also answered here.

“If you come from a Windows background, you may be familiar with the zip and rar formats. These are archives of multiple files compressed together.

In Unix and Unix-like systems (like Ubuntu), archiving and compression are separate.

tar puts multiple files into a single (tar) file.
gzip compresses one file (only).
So, to get a compressed archive, you combine the two, first use tar or pax to get all files into a single file (archive.tar), then gzip it (archive.tar.gz).

If you have only one file, you need to compress (notes.txt): there’s no need for tar, so you just do gzip notes.txt which will result in notes.txt.gz. There are other types of compression, such as compress, bzip2 and xz which work in the same manner as gzip (apart from using different types of compression of course).”

OK, so from this it would seem like a lot of wasted I/O to create a tar file of the main VMDK flat file, but we could gain from compressing it. Let’s just do a test of simple compression and monitor the host performance while doing so.

Another thing I noticed that I didn’t seem to cover in my previous post on doing this trick was the -ctk.vmdk files, which are changed block tracking files, as noted here:

“Version 3 added support for persistent changed block tracking (CBT), and is set when CBT is enabled for a virtual disk. This version first appeared in ESX/ESXi 4.0 and continues unchanged in recent ESXi releases. When CBT is enabled, the version number is incremented, and decremented when CBT is disabled. If you look at the .vmdk descriptor file for a version 3 virtual disk, you can see a pointer to its *-ctk.vmdk ancillary file. For example: version=3

# Change Tracking File
changeTrackPath=”Windows-2008R2x64-2-ctk.vmdk”
The changeTrackPath setting references a file that describes changed areas on the virtual disk.
If you want to back up the changed area information, then your software should copy the *-ctk.vmdk file and preserve the “Change Tracking File” line in the .vmdk descriptor file. If you do not want to back up the changed area information, then you can discard the ancillary file, remove the “Change Tracking File” line, read the VMDK file data as if it were version 1, and roll back the version number on restore.”

I’ll have to consider this when running some of the commands coming up. Now, we still don’t know how much space, if any, we’ll save from compression alone, or how long it’ll take to create the compressed file… from my research I found this resource pretty helpful:

Which Linux/UNIX compression algorithm is best? (privex.io)

Since we want to keep it native, quick tests via the command line show ESXi has both gzip and xz, but not lz4 or lbzip2, which kind of sucks as those showed the best performance in terms of compression speed… as quoted by the article: “As mentioned at the start of the article, every compression algorithm/tool has it’s tradeoffs, and xz’s high compression is paid for by very slow decompression, while lz4 decompresses even faster than it compressed.” Which is exactly what I want to see in the end result; if we save no space, then the process just burns I/O and expected life of the drive being used, for pretty much zero gains.

Highest overall compression ratio: xz. If we’re gonna do this, that’s what we want, but how long it takes and how many resources (CPU cycles, and thus overall watts) it trades off will come into question (though I’m not actually taking measurements and doing calculations; I’m looking at it at points in time and making assumed guesses at overall returns).

Time to find out what we can get from this. (I’m so glad I looked up xz examples, cause it def is not intuitive: no input-then-output parameters; read this to know what I mean.)

xz -c /vmfs/volumes/SourceDatastore/VM/vm-flat.vmdk > /vmfs/volumes/TargetDatastore/whereever/vmvmdk.xz

Mhmmm, no progress… crap, I didn’t read far enough along; I should have specified the -v flag (not sure why that wouldn’t be the default, having no response on the console kind of sucks)… but checking the host resources via the web GUI shows CPU being used, and the write speed….. sad….

CPU usage:

and Disk I/O:

Yeah… maybe 4 MB/s, and this is against SSD storage on a SATA bus; there’s no way the storage drive or the controller is at fault here… this is not going to be worth it…

Kill the command, check the compressed file: less than 300 MB in size. OI, that’s def not going to pay off here…
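For the record, if you repeat this, the same command with the verbose flag added gives progress output:

xz -cv /vmfs/volumes/SourceDatastore/VM/vm-flat.vmdk > /vmfs/volumes/TargetDatastore/whereever/vmvmdk.xz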

I decided to try tarring everything into one file without compression, hoping to simply get it into one file roughly 20 gigs in size with max I/O. As mentioned here:

“When I try the same without compression, then I seem to get the full speed of my drive. ”

However, to my dismay (maybe it ripped the SSD’s cache too hard? I unno), I’d get an I/O error even though the charts showed insane throughput. I decided to switch to another datastore, a spindle drive on the ESXi host, and you can see the performance just sucks compared to the SSD itself.

So now I’m again stuck waiting, cause instead of amazing throughput it’s going only 20 MB/s apparently… uggghhhh.

To add to this frustration, I figured I’d try the OVF export option again, but I guess cause the tar operation has a read on the file (I assume a file lock), attempting the OVF export just spits out a web response: “File Not Found”. So I can’t even have a race, knowing full well the SSD could read much faster than what it’s currently operating at. I don’t really know what the bottleneck is at this point…

Even at this rate it’s feeling almost pointless. But man, just to keep a vmdk thin… why, oh WHY, SCP, can’t you just copy the file at the size it is… mhmmm, there has to be a way other than all this crap….

I don’t think this guy had any idea he went from thin to thick on the VM….

I thought about SSHFS, but it’s not available on ESXi server….

Forgot about William’s project ghettoVCB. Great if I actually wanted more of a backup solution… considered for a future blog, but overkill to just move a VM.

The deeper I go here, the more the simple export-to-OVF-template-and-import is seeming reaaaaaalll appealing.

Awww man, this tar operation looks like it’s taking more space than the source. Doing a du -h on the source shows 19.7 gigs… the tar file has now surpassed 19.8 gigs in size… with no sign of slowing down or stopping, lol. Fuck man, I think tar is also completely unaware of thin disks, and I think it’ll make the whole tar file whatever the provisioned size was (aka thick). Shiiiiiiiiiiiiit!

Trying the Export VM option looked so promising,

until the usual like always… ERROR!!

FFS man!!! Can’t you just copy the files via SSH between hosts? Yeah but only if you’re willing to copy the whole disk and if you’re lucky holepunch it back to thin at the destination… can’t you do it with the actual size on disk… NO!

Try the basic answer on almost all posts about this, just export as template and import… Browser download ERROR… like Fuck!!!

Firefox… nope, same problem… Fuck…. Google, what ya got for me? Well, seems like almost the same as my initial move of using SCP, but using WinSCP via my client machine, putting a middle man in the process. But I guess using the web interface to download/upload was already a man-in-the-middle process anyway… fine, let’s see if I can do that… my gawd is this ever getting ridiculous… what a joke… Export VM from ESXi embedded host client Failed – Network Error or network interruption – Server Fault

And of course when I connect via WinSCP it sees the hard drive as being 60 gigs, so even though transfer speeds are good, it’s taking way more space than needed and thus wasting data over the bus… FUCK MAN!!!!!

If only there was a way to change that, oh wait there is, I blogged about it before here: How to Shrink a VMDK – Zewwy’s Info Tech Talks

OK, make a clone just to be safe (you should always have real backups, but this will do), and amazingly this operation on the SSD was fast and didn’t fail.

Woo almost 300 MB/s and finished in under 4 minutes. Now let’s edit the size.

Well, I tried the edit-size trick, but only after doing a vmkfstools conversion of the vmdk would it show the new size in WinSCP; even then, I transferred the files and the result was still corrupted in the end..

ESXi 6.5 standalone host help export big VM ? | MangoLassi

Mhmmm, another link to William’s site, covering the exact same thing, but this time using a tool, ovftool….

And wait a second… he also said there’s a way to use ovftool on the ESXi server itself, in this post here….. mhmmmm. If I install the Linux ovftool on the ESXi host, I should be able to transfer the VM while keeping the thin disk, all “native” on ESXi… close enough anyway…

Step 1) Download the OVF tool, Linux zip

Step 2) Upload Zip file via Web GUI to Datastore. (Source ESXi)

Step 3) Unzip the tool (unzip ovftool.zip), then delete the zip.

Step 4) Open outbound 443 on the source ESXi server; otherwise the tool errors out.

Step 5) Run the command to clone the VM; get an error that the ESXi host is managed.

Step 6) Remove the host from vCenter management and run the command again… fails cause of a network error (much like the OVF export error; seems that happens over port 443/HTTPS).
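For reference, the clone command I mean in steps 5 and 6 looks roughly like this, based on William’s posts (hostnames, credentials, datastore, and VM name are all placeholders):

# host-to-host clone, asking ovftool to keep the disk thin
/vmfs/volumes/datastore1/ovftool/ovftool --diskMode=thin -ds=TargetDatastore "vi://root@source-esxi/SourceVM" "vi://root@target-esxi"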

Man Fuck I can’t fucking win here!!!

I think I’m gonna have to do it the old-fashioned way… via “seeding”: plug a drive into the source ESXi host, and physically move it to the target.

Tooo beeeee continued……

I grabbed the ovftool for Windows (on the machine I was doing all the mgmt work from anyway), yet it too failed with network issues.

I decided to reboot the mgmt services on the host:
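For the shell-inclined, the equivalent from SSH on the host is roughly this (it briefly drops the host’s management plane but leaves VMs running):

/etc/init.d/hostd restart   # hostd serves the host web UI / API
/etc/init.d/vpxa restart    # vpxa only matters if the host is vCenter-managed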

Then gave it one last shot…

Holy efff man, the first ever success yet… I don’t know if this would have fixed all my other issues (the export failing over HTTPS, and all the others?). And the resulting OVA was only about 8 gigs. Time to see if I can deploy it now on the target host.

I deployed the OVA to the target via the WebGUI without issue.

I also tested the ESXi webGUI export VM option and this time it also succeeded without failure, checking the host resources CPU is fairly high on both ovftool export or the webGUI export option. Using esxtop showed hostd process taking up most of the CPU usage during the processes. Further making me believe restarting that service is what fixed my issues…