Hyper-V Networking

Install Hyper-V, and let’s learn its Networking

On VMware, when you install ESXi on the hardware, you simply set the VLAN tag right on the management network interface in the DCUI and bind it to a physical NIC in the menu. Simple.

On Hyper-V, we leave the host connected to an untagged, internet-allowed connection. Now we can configure our networks.

1. Create one external switch

  • Bind it to your physical NIC (configured as a trunk on the upstream switch).
  • Check “Allow management OS to share this network adapter” once — this creates the first host vNIC.

2. Add more host vNICs via PowerShell:

# Add another vNIC for the host bound to the same external switch
Add-VMNetworkAdapter -ManagementOS -Name "HostVLAN20" -SwitchName "ExternalSwitch"
Set-VMNetworkAdapterVlan -ManagementOS -VMNetworkAdapterName "HostVLAN20" -Access -VlanId 20

3. Verify in Windows

  • You’ll now see multiple vEthernet adapters in the host OS (e.g., “vEthernet (HostVLAN20)”, “vEthernet (HostVLAN30)”).
  • Assign IP addresses or enable DHCP as needed.
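If you’d rather script step 3 as well, the new vEthernet adapter can be addressed from PowerShell too. A minimal sketch, assuming the adapter name from the example above and made-up addressing for the VLAN 20 network (adjust to yours):

```powershell
# Assign a static IP and DNS server to the host vNIC created above (example values)
New-NetIPAddress -InterfaceAlias "vEthernet (HostVLAN20)" -IPAddress 192.168.20.10 -PrefixLength 24
Set-DnsClientServerAddress -InterfaceAlias "vEthernet (HostVLAN20)" -ServerAddresses 192.168.20.1
```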

*Note*: this is for doing it on a single-NIC server (for my lab use); usually you’d have dedicated NICs designated for the other networks on those servers… anyway.. where was I… right, now that we have additional networks on the server we can work on shared storage. You know how I like my iSCSI.

So, when I set up Hyper-V and connect a VM to the “Default” virtual switch, the Hyper-V server acts as a DHCP provider and NATs the connections from that vSwitch to whatever other network the host is connected to?

Great question — this trips up a lot of people when they first use Hyper‑V! Let’s break it down clearly:

🖧 Hyper‑V Default Switch Behavior

  • Introduced in Windows 10 (1709): The “Default Switch” is automatically created by Hyper‑V and cannot be deleted or modified.
  • NAT (Network Address Translation): Yes — the host machine provides NAT services. This means your VM’s traffic is translated and routed through the host’s network connection, regardless of whether that’s Wi‑Fi, Ethernet, or even VPN.
  • DHCP Services: The host also runs a lightweight DHCP service for the Default Switch. This is what hands out IP addresses to your VMs automatically (usually in the 172.16.x.x range).
  • Connectivity: VMs connected to the Default Switch can reach the internet and other devices on the host’s network, but inbound connections from the LAN to the VM are limited. It’s designed for convenience, not for complex networking setups.

⚖️ Comparison with Other Switch Types

Switch Type Provides DHCP? Provides NAT? Use Case
Default Switch ✅ Yes (host DHCP) ✅ Yes (host NAT) Quick setup, internet access without config
External Switch ❌ No (relies on external network DHCP) ❌ No VM directly on physical network, full LAN visibility
Internal Switch ❌ No ❌ No Communication between host and VMs only
Private Switch ❌ No ❌ No Communication between VMs only

✅ So to answer directly:

Yes — when you connect a VM to the Default Switch, Hyper‑V’s host acts as a DHCP provider and NATs the VM’s traffic to whatever network the host is connected to. It’s essentially a convenience bridge so you don’t have to manually configure networking.

“That’s what I did: configured an external virtual switch so the host’s NIC would be VLAN tagged in a certain network. I was figuring the Default Switch would act more like a… well, switch, at Layer 2 of the host’s NIC, so VMs on it would get DHCP from the main network, as if it was connected to a switch in the base network. Not Layer 3 and then Layer 4 NAT, like c’mon…”

Exactly — you’ve nailed the distinction. The Default Switch in Hyper‑V is deliberately not a true Layer‑2 bridge to your host NIC. It’s a convenience construct that Microsoft built to “just work” for internet access without requiring you to mess with VLANs, external switches, or DHCP servers on your LAN.

Here’s the breakdown:

🔌 Why the Default Switch behaves like NAT

  • It’s implemented using Windows Internet Connection Sharing (ICS) under the hood.
  • ICS is a Layer‑3 service: it provides a private subnet (usually 172.16.0.0/12), runs a DHCP service, and NATs traffic out through whatever adapter the host is using.
  • That’s why VMs on the Default Switch don’t see your LAN’s DHCP server — they only see the host’s mini DHCP/NAT service.
  • It’s intentionally isolated so you don’t accidentally expose VMs directly to your corporate or home LAN without realizing it.

⚖️ Why it’s not Layer‑2 bridging

  • A true Layer‑2 bridge would forward broadcasts (like DHCP Discover) from the VM to your LAN. That’s what an External Virtual Switch does.
  • The Default Switch instead terminates those broadcasts at the host and answers them itself. That’s why you only ever get the host‑provided DHCP.
  • VLAN tagging, trunking, or other advanced networking features aren’t supported on the Default Switch — it’s locked down.

✅ When to use each

  • Default Switch (NAT + DHCP): Quick internet access, testing, dev boxes, laptops where you don’t want to reconfigure NICs.
  • External Switch (Layer‑2 bridge): When you want VMs to behave like physical machines on your LAN, get DHCP from the real network, or use VLAN tagging.
  • Internal/Private Switches: For host‑VM communication or VM‑VM only networks.

So your expectation — that the Default Switch would act like a transparent Layer‑2 bridge — makes sense, but Microsoft intentionally designed it as a Layer‑3 NAT gateway instead. If you want “real switch” behavior, the External Switch is the only way to go.
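To round that out, here’s what the “real switch” path looks like in PowerShell. This is a hedged sketch: the switch name, physical NIC name, VM name, and VLAN ID are all example values:

```powershell
# Create an external switch bound to the physical NIC (trunked upstream),
# then tag an individual VM's adapter with an access VLAN (example values)
New-VMSwitch -Name "ExternalSwitch" -NetAdapterName "Ethernet" -AllowManagementOS $true
Connect-VMNetworkAdapter -VMName "TestVM" -SwitchName "ExternalSwitch"
Set-VMNetworkAdapterVlan -VMName "TestVM" -Access -VlanId 30
```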

“So, whatever my host machine is allowed to do, the guest VMs in the default switch are also allowed to do? like access the internet, and the firewall that gave the host the ability to access the internet would never see the private NATed IP address the Hyper-V host is dishing out to those VMs?”

You’ve got it — that’s exactly how the Default Switch works in Hyper‑V. Let me spell it out clearly:

🌐 How traffic flows with the Default Switch

  • VMs get private IPs: Each VM connected to the Default Switch is assigned an IP from the host’s internal DHCP service (usually 172.16.x.x).
  • Host does NAT: The Hyper‑V host translates those private IPs into its own IP before sending traffic out to the physical network.
  • Firewall visibility: From the perspective of your upstream firewall or router, all traffic from those VMs looks like it’s coming from the host’s IP. The firewall never sees the VM’s private NATed addresses.
  • Permissions: Whatever the host is allowed to do (e.g., access the internet, reach certain networks), the VMs can also do — because their traffic is piggybacking on the host’s connection.

⚖️ Important implications

  • Outbound access: VMs can reach the internet or LAN resources as long as the host can. They inherit the host’s connectivity.
  • Inbound access: External devices cannot directly initiate connections to those VMs, because NAT blocks unsolicited inbound traffic unless you set up port forwarding (which the Default Switch doesn’t expose for configuration).
  • Firewall rules: The upstream firewall only enforces rules against the host’s IP. If you need per‑VM visibility or control, you’ll need an External Switch so each VM gets its own LAN IP.

✅ So yes: the Default Switch makes your VMs “ride along” with the host’s network permissions. They’re hidden behind NAT, which is why your firewall only ever sees the host’s traffic.

Additional Information

If you want to apply a VLAN tag to a physical NIC on a Windows machine that’s not a Hyper-V host:

Since I didn’t see anything in the usual IPv4 network configuration window in Windows, I decided to see how useful this AI stuff is.. so I asked it…

“how do I VLAN tag the IP on windows 11 interface?”

Quick Answer: On Windows 11, VLAN tagging isn’t enabled by default. You need a NIC that supports IEEE 802.1Q VLAN tagging, then configure the VLAN ID either through the NIC’s Device Manager properties (Advanced tab → VLAN ID) or by using Intel PROSet/ANS tools or PowerShell (Set-NetAdapter -VlanID) if your driver supports it.

🖥️ Steps to Configure VLAN Tagging on Windows 11

1. Check NIC Compatibility

  • VLAN tagging requires a network card (NIC) that supports IEEE 802.1Q.
  • Not all NICs expose VLAN options in Windows. Intel and Broadcom NICs often support it, but you may need vendor-specific drivers.

2. Configure via Device Manager

  • Press Win + R, type devmgmt.msc, and hit Enter.
  • Expand Network adapters, right-click your NIC → Properties.
  • Go to the Advanced tab.
  • Look for VLAN ID or Priority & VLAN.
  • Enter the VLAN ID (e.g., 100) and click OK.

3. Configure via Intel PROSet/ANS (if supported)

  • Install Intel PROSet/Advanced Network Services (ANS).
  • In the NIC properties, Intel’s tab allows creating multiple VLAN interfaces with different IDs.
  • Each VLAN interface can then be assigned its own IP address.

4. Configure via PowerShell

  • Open PowerShell as Administrator.
  • Run:
    Set-NetAdapter -Name "Ethernet" -VlanID 100
    
  • Replace "Ethernet" with your adapter name and 100 with your VLAN ID.
  • Note: This only works if your NIC driver supports VLAN tagging. Otherwise, you’ll get an error like “Requested operation not supported on adapter”.

5. Assign IP Address

  • Once VLAN tagging is enabled, go to Network & Internet settings → Change adapter options.
  • Right-click the VLAN-enabled adapter → Properties → Internet Protocol Version 4 (TCP/IPv4).
  • Assign the IP address, subnet mask, and gateway for that VLAN.

⚠️ Important Notes

  • Driver Support is Key: If your NIC driver doesn’t expose VLAN options, you cannot tag VLANs directly in Windows without Hyper-V or third-party tools.
  • Hyper-V Alternative: You can create VLAN-tagged virtual NICs using Hyper-V’s Virtual Switch Manager if your hardware doesn’t support native tagging.
  • Testing: Use ping or tracert to verify connectivity once VLAN tagging is applied.

Bottom Line: To VLAN tag an IP on Windows 11, you must enable VLAN tagging on the NIC (via Device Manager, Intel PROSet/ANS, or PowerShell if supported), then assign the IP to that VLAN interface. If your NIC doesn’t support VLAN tagging, you’ll need Hyper-V or vendor-specific utilities.

My response: “I’ve got an Intel 82579LM Gig NIC. I see it says VLAN enabled in the Advanced tab, but I don’t see an area to configure it. I’m sure the NIC supports it since I was using ESXi with VLAN tagging just fine.”

That’s a great observation — the Intel 82579LM definitely supports VLAN tagging at the hardware level, but here’s the catch: Windows drivers for that NIC don’t expose VLAN configuration in the same way ESXi or Linux does.

🔎 Why You Don’t See VLAN Config in Windows

  • The Advanced tab in Device Manager shows “Priority & VLAN” or “VLAN Enabled,” but that option only toggles whether the NIC can process VLAN-tagged frames.
  • Unlike ESXi, Windows doesn’t let you directly assign VLAN IDs to a physical NIC unless the driver provides a VLAN interface creation utility (like Intel PROSet/ANS).
  • For the 82579LM, Intel stopped including VLAN configuration in their Windows drivers after Windows 8/10. On Windows 11, the driver exposes VLAN capability but no UI to configure VLAN IDs.

✅ Workarounds

Here are the practical options:

1. Intel PROSet/ANS (if available)

  • Older Intel drivers came with Advanced Network Services (ANS), which let you create VLAN sub-interfaces.
  • On Windows 11, Intel no longer bundles ANS with drivers for the 82579LM, so you won’t see the VLAN tab.

2. PowerShell Command

  • Windows has a Set-NetAdapter -VlanID cmdlet, but it only works if the NIC driver supports VLAN tagging through the OS.
  • On the 82579LM with current Windows 11 drivers, this usually fails with “Requested operation not supported”.

3. Hyper-V Virtual Switch

  • If you enable Hyper-V, you can create a virtual NIC and assign VLAN IDs there.
  • This is the most reliable way to do VLAN tagging on Windows when the physical NIC driver doesn’t expose VLAN config.

4. Third-Party NIC Utilities

  • Some OEMs (Dell, HP) ship custom drivers/utilities that re-enable VLAN tagging. If your system vendor has one, that’s worth checking.

⚠️ Bottom Line

Your Intel 82579LM can tag VLANs, but on Windows 11 the Intel driver doesn’t expose VLAN configuration. That’s why you don’t see a place to set the VLAN ID.

  • On ESXi/Linux → VLAN tagging works natively.
  • On Windows → You need Intel PROSet/ANS (no longer supported for this NIC), Hyper-V, or vendor-specific drivers.

Well, OK. I guess once we install Hyper-V, hopefully it’ll be more like ESXi in a sense and we can tag.

*UPDATE* ExternalSwitch, WiFi NIC

I was trying to use a Hyper-V server with an External Switch bound to a WiFi NIC, and my results were NOT what I expected based on the above information: I was assuming Default Switch behavior. You may be asking, “Why would you expect that behavior if you’re creating an External Switch?” Well, if you read my Wireless ESXi host blog, you are probably well aware of the VLAN limitations of WiFi: it can never truly be used as a trunk port (a limitation of the 802.11 standard, not of the OS or software).

So how could an External Switch work via WiFi if Layer 2 broadcasts don’t work and it can’t “speak” with the rest of the Layer 2 stack? Yet I create a VM and it DOES get a DHCP lease from my local subnet?! What the heck is going on here…

So I had to ask AI what was going on here. It says, yeah… that’s expected… here’s the deets… get ready.. it’s a long one….

BAH-LETED. There was way tooooo much BS in the AI response to justify keeping it in here… long story short: VMs on the local subnet work fine (the host does ARP masquerading on their behalf), and VLANs will never work, per the usual BS I’ve talked about in my Wireless ESXi host blog.

Upgrade vCenter from 7 to 8

Upgrading vCenter

Let me start by having a firm base: a working vCenter 7 with a properly connected Veeam server. Since my server is dead, I’m going to start from scratch.

Pre-req Step (Base vCenter 7 with Veeam)

Other Pre-reqs:
DNS server (or local host records)

Step 1) Install vCenter 7

20 minutes later, complete. Now I made one mistake along the way (to replicate what I think happened the first go-around): I gave the VM the name vCenter.zewwy.ca, while when configuring the network FQDN I gave it vcenter.zewwy.ca. After install, I was able to replicate the findings from my first go-around, which I now know will cause a problem for the upgrade to vCenter 8. When this is done, the install wizard has no issue with it and the install completes successfully. However, the hostname will now have an uppercase letter, while the PNID will be lowercase:

It’s this case mismatch which will cause the upgrade to have hiccups, which I’ll cover below. However, I need to add the hosts which were connected to my previous deployment; let’s see if it’ll take my hosts…

Step 2) I followed my blog post to fix up my Veeam jobs again. 

Step 3) Upgrade to vCenter 8.

Stage 1 Error

Remember the mistake I showed above: entering the FQDN in lowercase when configuring the network in the vCenter 7 installer. Putting lowercase here produced the above error; to get past it, I had to use the case-sensitive hostname with the capital.

Stage 2 Error

Now pick your host, and you have another error to bypass at Stage 2.

I had another weird issue where, even though the VM deployed, it was not reachable over the network and the installer timed out. To resolve this I simply changed the VM network VMPG (port group), saved, then changed it back to the proper one and saved, and it was pingable. To get back to Stage 2, simply navigate to the VAMI web page on port 5480. When you get to the stage to connect to the source, you enter the details and get this error at the pre-upgrade checks:

Changing the vCenter source entry to a capital, as we did before, will not work; the same error pops up, pointing to that blog post on how to change the FQDN. Since my FQDN already looked correct (with a capital) but the command showed the PNID was lowercase, instead of changing the FQDN to lowercase and going through all the steps in that blog (there are a lot), I simply set the PNID to have a capital in it:

Get:

/usr/lib/vmware-vmafd/bin/vmafd-cli get-pnid --server-name localhost

Set:

/usr/lib/vmware-vmafd/bin/vmafd-cli set-pnid --server-name localhost --pnid vCenter.mydomain.com

Yay no more error:

Now pick your content to migrate, and at this point you should stop using vCenter for the duration of the migration.

At this point the pings dropped (roughly 12 with no response, about 1-2 minutes of downtime)… then they came back up as the new vCenter took over the IP. At which point Stage 2 began.

I went to play a bit of Party Animals. The installer had timed out on the old VAMI IP (which it may have auto-switched over), but logging into VAMI on the new server showed everything was green.

Logging in, everything looks good, checking Veeam yup, rescan of vcenter worked without issue, check re-calculating VM on a backup job, yup it works.

Success. This post doesn’t cover additional steps (applying new license keys, checking user migrations (vsphere.local), remote logging services or connections (rsyslog)); all those you’d have to verify after completing these steps. Now to upgrade the ESXi hosts; both ESXi hosts remained on 7.0.3x versions for this post.

VMware Changes Update URLs

If you run a home lab, or manage systems for companies you may have noticed updates not working in VAMI… something like…. Ohhh I dunno.. this:

Check the URL and try again.

Unable to patch the vCenter via VAMI as it fails to download the updates from Broadcom public repositories

Cause

Public-facing repository URLs and authentication mechanisms are changing. Download URLs are no longer common but unique for each customer, and therefore will require reconfiguration.

Well… wow thank you Broadcom for being so… amazing.

If you want to be overly confused about the whole thing, you can check this KB: Authenticated Download Configuration Update Script

As per the original link I shared above, all you have to do is log in to the Broadcom support portal, get a token, and edit the URL…. but….

Notes:

    • The custom URL is not preserved post migration upgrade, FBBR restore and VCHA failover
    • If there is a proxy device configured between vCenter and the internet, ensure it is configured to allow communications to the new URL
    • Further patches automatically update this URL. For example, if 8.0.3.00400 is patched to 8.0.3.00500, the default URL will change to end in 8.0.3.00500.

Looks like this was enforced just a couple days ago … Sooooo, happy patching?   ¯\_(ツ)_/¯

Permission to perform this operation was denied. NoPermission.message.format

For anyone who may use my site as a source of informational references, I do apologize for the following:

  1. My site cert expiring. ACME is great; I’m just a bit upset they refuse to announce their HTTP auth sources, so I can’t create a security rule for them. Right now it would be restricted to App Type, which, while not bad, isn’t good enough, so I manually have to allow the traffic for the cert to be renewed.

    No… I have no interest in allowing ACME access to my DNS for DNS auth.

  2. Site was down for 24 hours. If anyone noticed at all, yes my site was down for over 24 hours. This was due to a power outage that lasted over 12 hours after a storm hit. No UPS could have saved me from this. Though one is in the works even after project “STFU” has completed.

    No, I have no interest in clouding my site.

I have a couple blog post ideas roaming around, I’m just having a hard time finding the motivation.

Anyway, if you get “Permission to perform this operation was denied. NoPermission.message.format” while attempting to move an ESXi host into a vCenter cluster, chances are you have an orphaned vCLS VM.

If so, log into VAMI and restart the ESX Agent Manager (EAM) service.
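If you prefer an SSH session on the appliance over the VAMI UI, the same restart can be done with the service-control utility; vmware-eam is the service name I’d expect on recent VCSA builds, so verify with --status first:

```
# List services to confirm the EAM service name, then restart it (VCSA shell)
service-control --status
service-control --restart vmware-eam
```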

After restarting that service, everything should be hunky-dory…

Cheers.

Careful Cloning ESXi Hosts

I’ll keep this post short. I was doing some ESXi host deployments in my home lab, and I noticed that when I would install onto a 120GB SSD, the install would go smoothly, but I wasn’t able to use any of the storage as a datastore. However, if I took a fresh copy of ESXi installed onto an 8GB USB stick and DD’d it to the 120GB SSD, I got several advantages from this:

  1. When done via a USB3 pipe from a Linux live session holding a copy of my base image, I could get speeds in excess of 100 MB/s, and with only 8GB of data to transfer, the “install” would complete in a mere 90 seconds.
  2. The IP address and root password are preconfigured to what I already know, and I can simply change the IP address from the DCUI and call it a day.

Using this method I could have a host up in less than 5 minutes (2 minutes to boot the Linux live image, 90 seconds to install the base ESXi OS image, and 2 more to boot ESXi). This was of course on machines without ECC and all the server hardware firmware jazz; in those cases install times are always longer. Anyway…

This was an amazing option, until I noticed that when I connected a machine I had just deployed and changed its IP address (and since I’m super anal about networking during this type of project/operation), my ping to another machine (a completely different IP address) started to drop when the new device came up… and after a while the ping responses would come back, but drop from the new host, and vice versa, flip and flop it goes. I’m used to this when there’s an IP conflict and two devices have the same IP address. In this case the IP addresses were different… so after enough symptom gathering and logical deduction, I had to assume the MAC addresses must be the same, the same problem in reverse (different IPs but the same MAC), producing the same symptoms.

To validate this I simply deployed my image to a new machine, then went on the hunt to figure out how to see the MAC address. Since I couldn’t plug in the NIC and get to the web-based MGMT interface, I had to figure out how to do it via the console CLI directly… mhmm, after enough googling on my phone I found this Spiceworks thread with my answer:

vim-cmd hostsvc/net/info | grep "mac ="

I then checked this against the ESXi host that I saw the flip-flopping with, and sure enough they matched… After doing a fresh install I noticed that the first 3 sections of the vmk MAC match the physical MAC, but my DD-deployed hosts retain the MAC of the system the image was originally installed on, so when I ran the command above I could tell which ones were deployed via my method. This was further mentioned in this Reddit thread by a commenter who goes by the name of sryan2K1:

“The physical NIC MACs are never used. vmk ports, along with VMs themselves will all use VMWare’s OUI as the first half of the address on the wire.”

OK, now maybe I can still salvage my deployment method by simply deleting and recreating the VMK after deployment, though I’d guess it’s best done via the DCUI or direct console… I found one KB by VMware/Broadcom, but it gave a 404; luckily there was a Wayback Machine link for it here.

Which states the following:

“During Initial Installation and DCUI, ESXi management interface (default vmk0) is created during installation.

The MAC address assigned will be the primary active physical NIC (pnic) associated.

If the associated vmnic is modified with the management interface vmkernel will once again assign MAC address of the associated physical NIC.

To create a VMkernel port and attach it to a portgroup on a Standard vSwitch, run these commands:

esxcli network ip interface add --interface-name=vmkX --portgroup-name=portgroup
esxcli network ip interface ipv4 set --interface-name=vmkX --ipv4=ipaddress --netmask=netmask --type=static"

Alternatively, you can also use esxcli to create the management interface vmkernel on the VDS.

Creation of the management interface with the ‘esxcli network’ will generate a VMware Universally Unique address instead of the pnic MAC address.

It is recommended to use the esxcli network IP interface method to create the management interface and not use DCUI.

Workarounds:               None

Additional Information:
Using DCUI to remove vmnic binding from management vmkernel or any modification will apply change at vSwitch level. Management interface is associated with propagating the change to any port groups within the vSwitch level.

Impact/Risks:                None.”

I’m assuming this means that if you use the DCUI to reconfigure the MGMT interface settings, the MAC will automatically be reconfigured to match the physical NIC, as I found during the initial clean install and as mentioned in the Reddit thread (using the first 3 sections to derive the MAC of the VMK).

But what if you don’t have any additional interfaces to switch to in the DCUI to make that actually happen? Because what I’ve noticed is that changing the IP address, disabling IPv6, and rebooting did not change the VMK’s MAC address. Oh, there’s an option in the DCUI, “Reset Network Settings”, with several sub-options; I simply picked reset to factory defaults. It said success; I checked the MAC via the command stated above and bam, the VMK NIC changed to what it should be! Sweet, my deployment method is still viable.
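For reference, the delete-and-recreate route I mentioned earlier would look roughly like this from the ESXi shell, per the KB’s commands. Interface name, port group, and addresses below are example values, and since this tears down the management interface, do it from the console, not over SSH:

```
# Remove the cloned management vmkernel interface, then recreate it so it
# gets a freshly generated VMware-OUI MAC (example names/addresses)
esxcli network ip interface remove --interface-name=vmk0
esxcli network ip interface add --interface-name=vmk0 --portgroup-name="Management Network"
esxcli network ip interface ipv4 set --interface-name=vmk0 --ipv4=192.168.1.50 --netmask=255.255.255.0 --type=static
```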

Hope this helps someone.

Hole Punching a Linux VM on ESXi

I covered this in the past here:

Reclaim unused space from VMDK – Zewwy’s Info Tech Talks

But this time I wanna cover it a bit differently. Things I noticed:

  1. A proper VM with VMtools installed, and thin provisioned will automatically shrink the overall size being shown and used on disk on the ESXi browser.

Yet for some reason, after I used the SCP method to move a VM from one host to another (frowned upon, as it secretly converts the disk from thin to thick), even after migrating to a new datastore and specifying thin, it still shows full disk usage on the host.

I know the actual usage is less from checking inside the VM itself via its console/terminal, whatever:

In my old blog post I mentioned “using DD” without showing or stating how at all. Googling this, I found this thread with an interesting answer:

“The /zero-Tag is actually a file name. The command just copies zeros from the virtual File /dev/zero (infinite number of zeros) into /mnt/hdb/zero until the disk is full, or some other error occurs.

This is why you have to remove the file /mnt/hdb/zero after that in order to regain the unused space.

However, a better way to fill free space with zeros (on ext2,3,4 file systems) is to use a tool called zerofree.”

Oooo zerofree?

Huh, something created (a tool) for exactly the task (job) at hand. Great!

Error, how classic, complained path is mounted RW, like yeah and?

Ughhh, google? virtualbox – zerofree on ubuntu 18.04 – Ask Ubuntu

Step 1) Reboot into the Ubuntu recovery console [perfect ESC keystroke required]

OK, how do I get into the advanced GRUB boot menu? Holding Shift did nothing; if I mash ESC I get a grub> prompt (and notice a tiny flicker of the GRUB menu)… great, I have to hit ESC exactly once, perfectly, in a less-than-1-second boot window… man, give me a break… Once in, I see the advanced option as stated by the answer.

Step 2) advanced options -> recovery mode -> root console

Step 3) find the root directory

mount | grep "sda"

Step 4) run zerofree

echo "u" > /proc/sysrq-trigger   # sync and remount all filesystems read-only
mount -o remount,ro /            # or remount just the root filesystem
zerofree -v /dev/sda1

Step 5) reboot

Checking the ESXi host… what, it went up 2 gigs, da fuq man…

Step 6) Compress Disk.

In my previous post the fix was to Storage vMotion to another datastore or use the hole punch option, which from what I can tell is the -K option of vmkfstools. I don’t think -K can be used on a live VM, and a Storage vMotion can’t be done without vCenter. Since I’m temporarily working without a vCenter server, let’s try the hole punch option. This requires shutting down the VM, but since zerofree required that anyway, downtime was already in play.

On the ESXi host:

[root@Art-ESXi:~] vmkfstools -K /vmfs/volumes/Art-Team/VM1/VM1.vmdk
vmfsDisk: 1, rdmDisk: 0, blockSize: 1048576
Hole Punching: 100% done.

Oh noooo, I’ve tainted my results… checking the web UI, the space has gone back down to stating 20 gigs like the first snippet… but doing a du -h on the flat file shows it is only the 10 gigs I expected it to be:

Well I don’t know what to make of this discrepancy…

Huh, I found this post of someone doing the exact same thing around the time I wrote my original post, but they simply used the command:

dd bs=1M count=8192 if=/dev/zero of=zero

The output file named zero is just a regular file created in the current working directory; dd fills it with zeros (thereby zeroing free blocks of whatever filesystem you’re sitting on), and you delete it afterward. Guess you can try that too, then hole punch the VMDK just like I did.
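To make the mechanics of that dd command concrete, here’s a tiny self-contained demo of the same trick against a temp directory; on a real VM you’d run it in the guest filesystem and omit count so it runs until the disk is full:

```shell
# dd zero-fill demo: the zeros land in an ordinary file, which you then delete
workdir=$(mktemp -d)
dd bs=1M count=8 if=/dev/zero of="$workdir/zero" 2>/dev/null
size=$(wc -c < "$workdir/zero")
echo "wrote $size bytes of zeros"
rm "$workdir/zero"          # deleting it frees the now-zeroed blocks
rmdir "$workdir"
```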

There.. I shut down the VM again and this time did my ol’ trick to vMotion a VM without vCenter, and after re-registering the VM (moved between two datastores) it finally showed up with the correct space in the UI. 🙂

Hope this post helps someone, it’s been covered many times in the past.

How to vMotion a VM without vCenter WITHOUT Shared Storage

While I have covered this in the past here:
How to vMotion a VM without vCenter – Zewwy’s Info Tech Talks

This was using shared network storage between hosts… but what if you have no vCenter AND no shared storage? In my previous post I suggested checking out VMware Arena’s post, but that just covers how to copy files from one host to another, and what I’ve noticed is that while it does work if you let it complete, the VMDK is no longer thin and takes up the full space specified by its defined size. This is also mentioned in this Server Fault thread: “Solutions like rsync or scp will be rate-limited and have no knowledge of the content (e.g. sparse VMDK files, thin-provisioned volumes, etc.)”

So options provided there are:

  1. Export the VM as an OVF file, move it to a local system, then reimport the OVF on your destination ESXi host. I attempted this, but on the host I could only export the VMDKs, and while attempting to do so I got network issues (the browser asked to download multiple files, but I must not have noticed in time and it timed out? Not sure). This also requires an intermediary device and a double download/upload over the network; I’m hoping for a way between hosts directly.
  2. Use vSphere and perform a host/storage migration. This post is how to do it without. Also note I attempted this, but in my case I’m using the abomination ESXi host I created in my previous blog post, and vCenter fails the task with errors (again, SCP succeeds but doesn’t retain thin provisioning). Not sure why SCP succeeds where vCenter fails; SCP seems more resilient to a poor connection and keeps going, which matters when the WiFi NICs are under load in those situations.
  3. Leverage one of Veeam’s free products to handle the ad hoc move.

I love Veeam, but in this case I’m limited in resources; let’s see if we can do it via native ESXi here.

So that exhausts all those options. What else we got…

Move VMware ESXi VM to new datastore – preserve thin-provisioning – Server Fault

Oh, someone figured out what I did in my initial post all the way back in 2013… wonder how I missed that one.. oh well, same answer as my initial post, though it required shared storage… moving on…

LOL, no way… William Lam, all the way back from over 14 years ago, answering the question I had about compression of the files, and saying the OVF export is still the best option.. mhmmm…

I don’t want to stick with just SCP. Man, did it suck getting to 97% done on a 60 gig provisioned VMDK (only taking up roughly 20 gigs), only to have it fail because I put my machine to sleep, thinking it was just a remote connection (SSH) and the remote machine was doing the actual transfer… just to wake my machine the next morning to a “corrupt” VMDK that fails to boot or svMotion to thin. I have machines with fast local storage but a poor network; it’s a problem from back in the day of poor, slow internet speeds. So what do we have? We’ve got gzip and tar. What’s the diff?

In conclusion, GZIP is used to compress individual files, whereas TAR is used to combine numerous files and directories into a single archive. They are frequently used together to create compressed archive files, often with the “.tar.gz” extension.

Also answered here.

“If you come from a Windows background, you may be familiar with the zip and rar formats. These are archives of multiple files compressed together.

In Unix and Unix-like systems (like Ubuntu), archiving and compression are separate.

tar puts multiple files into a single (tar) file.
gzip compresses one file (only).
So, to get a compressed archive, you combine the two, first use tar or pax to get all files into a single file (archive.tar), then gzip it (archive.tar.gz).

If you have only one file, you need to compress (notes.txt): there’s no need for tar, so you just do gzip notes.txt which will result in notes.txt.gz. There are other types of compression, such as compress, bzip2 and xz which work in the same manner as gzip (apart from using different types of compression of course).”
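The quoted distinction is easy to demo in any shell (throwaway file names, obviously):

```shell
# Two sample files to play with
echo "first" > a.txt
echo "second" > b.txt

# gzip compresses exactly one file; -c writes to stdout so the original is kept
gzip -c a.txt > a.txt.gz

# tar only bundles files; it does no compression by itself
tar -cf bundle.tar a.txt b.txt

# Combine the two for the familiar .tar.gz
tar -czf bundle.tar.gz a.txt b.txt
```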

OK, so from this it would seem like a lot of wasted I/O to create a tar file of the main VMDK flat file, but we could gain from compressing it. Let's just do a test of simple compression and monitor the host performance while doing so.

Another thing I noticed that I didn't cover in my previous post on this trick was the -ctk.vmdk files, which are changed block tracking files, as noted here:

“Version 3 added support for persistent changed block tracking (CBT), and is set when CBT is enabled for a virtual disk. This version first appeared in ESX/ESXi 4.0 and continues unchanged in recent ESXi releases. When CBT is enabled, the version number is incremented, and decremented when CBT is disabled. If you look at the .vmdk descriptor file for a version 3 virtual disk, you can see a pointer to its *-ctk.vmdk ancillary file. For example: version=3

# Change Tracking File
changeTrackPath=”Windows-2008R2x64-2-ctk.vmdk”
The changeTrackPath setting references a file that describes changed areas on the virtual disk.
If you want to back up the changed area information, then your software should copy the *-ctk.vmdk file and preserve the “Change Tracking File” line in the .vmdk descriptor file. If you do not want to back up the changed area information, then you can discard the ancillary file, remove the “Change Tracking File” line, read the VMDK file data as if it were version 1, and roll back the version number on restore.
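So if I decide I don't care about the CBT data, the descriptor could be cleaned up before copying, as the quote describes. A rough sketch (the descriptor contents here are a made-up minimal example, not a full VMDK descriptor):

```shell
# Fake minimal descriptor for illustration only
cat > vm.vmdk <<'EOF'
version=3
# Change Tracking File
changeTrackPath="vm-ctk.vmdk"
createType="vmfs"
EOF

# Drop the CBT pointer plus its comment line, and roll the version back to 1
sed -i -e '/^changeTrackPath/d' \
       -e '/^# Change Tracking File/d' \
       -e 's/^version=3/version=1/' vm.vmdk
```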

I’ll have to consider this when running some of the commands coming up. Now we still don’t know how much, if any, space we’ll save from compression alone and the time it’ll take to create the compressed file… from my research I found this resource pretty helpful:

Which Linux/UNIX compression algorithm is best? (privex.io)

Since we want to keep it native, doing quick tests via the command line shows ESXi has both gzip and xz but not lz4 or lbzip2, which kind of sucks, as those showed the best performance in terms of compression speed… as quoted by the article: "As mentioned at the start of the article, every compression algorithm/tool has it's tradeoffs, and xz's high compression is paid for by very slow decompression, while lz4 decompresses even faster than it compressed." Which is exactly what I want to see in the end result; if we save no space, then the process just burns I/O and the expected life of the drive being used, for pretty much zero gain.

Highest overall compression ratio: XZ. If we're gonna do this, that's what we want, but how long it takes and how many resources (CPU cycles, and thus overall watts) it trades off will come into question (though I'm not actually taking measurements and doing calculations; I'm looking at it at points in time and making educated guesses at overall returns).

Time to find out what we can get from this (I'm so glad I looked up xz examples, cause it def is not intuitive; there are no input-then-output parameters, read this to know what I mean):

xz -c /vmfs/volumes/SourceDatastore/VM/vm-flat.vmdk > /vmfs/volumes/TargetDatastore/whereever/vmvmdk.xz

Mhmmm, no progress… crap, didn't read far enough along; I should have specified the -v flag. Not sure why that isn't the default, as having no response on the console kind of sucks… but checking the host resources via the web GUI shows CPU being used, and the write speed… sad…
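For anyone following along, this is roughly what I should have run; a small demo file stands in here for the flat VMDK. The -T0 (multi-threaded) flag exists in mainline xz 5.2+, no promises the ESXi build supports it:

```shell
# Small stand-in for the multi-gig -flat.vmdk
dd if=/dev/zero of=sample.bin bs=1M count=4 2>/dev/null

# -v shows progress, -k keeps the source file, -T0 uses all CPU cores
xz -v -k -T0 sample.bin
```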

CPU usage:

and Disk I/O:

Yeah… maybe 4 MB/s, and this is against SSD storage on a SATA bus; there's no way the storage drive or the controller is at fault here… this is not going to be worth it…

Kill the command, check the compressed file: less than 300 MB in size. Oi, that's def not going to pay off here…

I decided to try tarring everything into one file without compression, hoping to simply get it down to one file roughly 20 gigs in size with max I/O. As mentioned here:

“When I try the same without compression, then I seem to get the full speed of my drive. ”

However, to my dismay (maybe it ripped the SSD's cache too hard? I dunno), I'd get an I/O error, even though the charts showed insane throughput. I decided to switch to another datastore, a spindle drive on the ESXi host, and you can see the performance just sucks compared to the SSD itself.

Which means I'm now again stuck waiting, cause instead of amazing throughput it's stuck going only 20 MB/s apparently… uggghhhh.

To add to this frustration, I figured I'd try the OVF export option again, but I guess because the tar operation has a read on the file (I assume a file lock), attempting the OVF export just spits out a web response of "File Not Found". So I can't even have a race, knowing full well the SSD could read much faster than what it's currently operating at. I don't really know what the bottleneck is at this point…

Even at this rate it's feeling almost pointless, but man, just to keep a vmdk thin… why, oh WHY, SCP, can't you just copy the file at the size it actually is… mhmmm, there has to be a way other than all this crap…

I don't think this guy had any idea he went from thin to thick on the VM…

I thought about SSHFS, but it’s not available on ESXi server….

Forgot about William's project GhettoVCB. Great if I actually wanted more of a backup solution… considered for a future blog, but overkill to just move a VM.

The deeper I go here, the more simply exporting to an OVF template and importing is seeming reaaaaaalll appealing.

Awww man, this tar operation looks like it's taking more space than the source. Doing a du -h on the source shows 19.7 gigs… the tar file has now surpassed 19.8 gigs in size… with no sign of slowing down or stopping, lol. Fuck man, I think tar is also completely unaware of thin disks and will make the whole tar file whatever the provisioned size was (aka thick). Shiiiiiiiiiiiiit!
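Turns out GNU tar actually has a --sparse (-S) flag that stores the holes instead of reading them out as zeros; whether the stripped-down tar shipped on ESXi supports it is another question (mine clearly wasn't doing it by default). On a regular Linux box the difference looks like this:

```shell
# A 100 MB file that's all hole, allocating (almost) nothing on disk
truncate -s 100M sparse.img

# Plain tar writes out the full 100 MB of zeros
tar -cf thick.tar sparse.img

# -S detects the holes and stores only a sparse map
tar -Scf thin.tar sparse.img
```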

Trying the Export VM option looked so promising,

until the usual like always… ERROR!!

FFS man!!! Can't you just copy the files via SSH between hosts? Yeah, but only if you're willing to copy the whole disk, and if you're lucky, hole-punch it back to thin at the destination… can't you do it with the actual size on disk? NO!

Try the basic answer on almost all posts about this: just export as a template and import… Browser download ERROR… like, Fuck!!!

Firefox… nope, same problem… Fuck… Google, what ya got for me? Well, seems like almost the same as my initial move of using SCP, but using WinSCP via my client machine, putting a middleman in the process. But I guess using the web interface to download/upload was already a man-in-the-middle process anyway… fine, let's see if I can do that… my gawd, is this ever getting ridiculous… what a joke… Export VM from ESXi embedded host client Failed – Network Error or network interruption – Server Fault

And of course when I connect via WinSCP, it sees the hard drive as being 60 gigs, so even though transfer speeds are good, it's taking way more space than it needs and thus wasting data over the bus… FUCK MAN!!!!!
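This is the heart of the whole problem: scp and WinSCP go by a file's apparent size, not by the blocks actually allocated. A scaled-down illustration (60 MB standing in for the 60 gig disk):

```shell
# Seeking past the end with nothing written creates a sparse file
dd if=/dev/zero of=thin-flat.vmdk bs=1M seek=60 count=0 2>/dev/null

ls -lh thin-flat.vmdk   # apparent size: what scp/WinSCP will transfer
du -h thin-flat.vmdk    # allocated size: what's actually on disk
```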

If only there was a way to change that, oh wait there is, I blogged about it before here: How to Shrink a VMDK – Zewwy’s Info Tech Talks

OK. Make a clone just to be safe (you should always have real backups, but this will do). And amazingly, this operation on the SSD was fast and didn't fail.

Woo almost 300 MB/s and finished in under 4 minutes. Now let’s edit the size.

Well, I tried the size edit, but only after doing a vmkfstools conversion of the vmdk would it show the new size in WinSCP. Even then, I transferred the files and it was still corrupted in the end…

ESXi 6.5 standalone host help export big VM ? | MangoLassi

Mhmmm, another link to William's site, covering the exact same thing, but this time using a tool, ovftool…

And wait a second… he also said there's a way to use ovftool on the ESXi server itself, in this post here… mhmmm. If I install the Linux ovftool on the ESXi host, I should be able to transfer the VM while keeping the thin disk, all "native" on ESXi… close enough, anyway…

Step 1) Download the OVF tool, Linux zip.

Step 2) Upload Zip file via Web GUI to Datastore. (Source ESXi)

Step 3) Unzip the tool (unzip ovftool.zip), then delete the zip.

Step 4) Open outbound 443 on the source ESXi server. Otherwise you get an error from the tool.

Step 5) Run the command to clone the VM; get an error that the VM is managed by an ESXi host.

Step 6) Remove the host from vCenter and run the command again… it fails with a network error (much like the OVF export error; seems that happens over port 443/HTTPS).
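For reference, the clone command I was attempting looks roughly like this (host names, VM name, and datastore are placeholders; the tool prompts for credentials on both ends):

```shell
# Host-to-host clone over HTTPS (443), keeping disks thin via --diskMode
ovftool --datastore=TargetDatastore --diskMode=thin \
  vi://root@source-esxi/SourceVM \
  vi://root@target-esxi/
```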

Man Fuck I can’t fucking win here!!!

I think I'm gonna have to do it the old-fashioned way… via "seeding": plug a drive into the source ESXi, and physically move it to the target.

Tooo beeeee continued……

I grabbed the ovftool for Windows (on the machine I was doing all the mgmt work from anyway), yet it too failed with network issues.

I decided to restart the mgmt services on the host:

Then gave it one last shot…

Holy efff man, the first success yet… don't know if this would have fixed all my other issues (the export failing over HTTPS, and all the others?). And the resulting OVA was only about 8 gigs. Time to see if I can deploy it now on the target host.

I deployed the OVA to the target via the WebGUI without issue.

I also tested the ESXi web GUI Export VM option, and this time it also succeeded without failure. Checking the host resources, CPU is fairly high during both the ovftool export and the web GUI export. Using esxtop showed the hostd process taking up most of the CPU during the process, further making me believe restarting that service is what fixed my issues…

*Dec 2025 update*

Everything I mentioned above is for moving a VM between two different ESXi hosts. If you simply want to move a VM between datastores on the same ESXi host, then:

      • ESXi Web GUI (Datastore Browser): Power off VM → Unregister VM → Copy/move VM folder to target datastore → Re‑register VM.
      • CLI/Shell scripts: Use vmkfstools or scp to copy VMDK files, then update the VMX file and re‑register.

A typical shell script might:

  1. Power off the VM.
  2. Use vmkfstools -i to clone the VMDK to the new datastore.
    vmkfstools -i /vmfs/volumes/source_datastore/vm/vm.vmdk \
               /vmfs/volumes/target_datastore/vm/vm.vmdk -d thin
    
  3. Copy the VMX and other config files with cp or scp.
  4. Register the VM on the new datastore using the ESXi host client.

This ensures the VM is properly cloned/moved while preserving thin/thick provisioning.
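The same steps as an ESXi shell sketch, using vim-cmd (the VM ID, paths, and datastore names are placeholders from my example; don't run this blind):

```shell
# 1. Find the VM ID, then power it off
vim-cmd vmsvc/getallvms
vim-cmd vmsvc/power.off 42        # 42 = Vmid from the list above

# 2. Clone the disk to the target datastore, keeping it thin
mkdir -p /vmfs/volumes/target_datastore/vm
vmkfstools -i /vmfs/volumes/source_datastore/vm/vm.vmdk \
              /vmfs/volumes/target_datastore/vm/vm.vmdk -d thin

# 3. Copy the remaining config files
cp /vmfs/volumes/source_datastore/vm/*.vmx* /vmfs/volumes/target_datastore/vm/

# 4. Unregister the old VM and register the copy
vim-cmd vmsvc/unregister 42
vim-cmd solo/registervm /vmfs/volumes/target_datastore/vm/vm.vmx
```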

⚠️ Risks & Considerations

  • Downtime: Without Storage vMotion, the VM must be powered off.
  • Provisioning changes: Some copy methods may convert thin disks to thick unless you specify -d thin.
  • Networking/Devices: Detach virtual CDs/USBs before unregistering, or the VM may fail to boot.
  • Unsupported scripts: Community scripts work but aren’t officially supported by VMware. Always test in a lab before production.
  • Risks specifically with doing a move using the WebUI
    When you use the “Move” option in the datastore browser, ESXi is essentially doing a cold file copy at the hypervisor level. A few things explain what you’re seeing:

    • Performance:
      • The copy runs through the host’s management plane, not optimized like Storage vMotion.
      • It’s single‑threaded and doesn’t leverage advanced I/O scheduling, so speeds can be much slower than expected.
      • If your datastore is local spinning disks or consumer‑grade SSDs, throughput can be especially poor compared to enterprise storage.
    • Resource impact:
      • You’ll see spikes in CPU and memory usage because the host is handling the copy itself.
      • Other VMs can feel sluggish if the datastore I/O queue is saturated.
    • No cancel option:
      • Unfortunately, once the task is launched in the Host Client, there’s no supported way to stop it mid‑stream.
      • Killing the process or rebooting the host risks corrupting the VM files.
      • VMware only provides cancel/rollback with Storage vMotion (via vCenter), not with manual datastore moves.

    So in short, use vmkfstools -i from the ESXi shell to clone/move disks; it's more efficient. I learnt this the hard way by trying it out and watching the system resources to see it flying at a blistering 20-30 MB/s, so now my ESXi is in a degraded state while I wait for the process to complete, which will take hours at this rate. Oiii…

Wireless ESXi Host

The Story

So, the other day I pondered an idea. I wanted to start making some special art pieces made from old motherboards, and then I also started to wonder could I actually make such an art piece… and have it functional?

I took apart my old build, a 1U server I made from an old PA-500 and a motherboard repurposed from a colleague who gifted me their old broken system. Since it was a 1U system, I had purchased 2 special pieces to make it work: a special CPU heatsink (complete solid copper, with a side blower fan) and a 300 watt 1U PSU, both of which made lots of noise.

I also have another project going called "Operation Shut the fuck up", in which all the noisy servers I run will either be shut down or modified to make zero noise. I hope with the project to also reduce my overall power consumption.

So I started by simply benching the Mobo and working off that, which spurred a whole interest into open case computer designs. I managed to find some projects on Thingiverse for 2020 extrusions and corner braces, cable ties… the works. The build was coming along swimmingly. There was just one thing that kept bugging me about the build… The wires…

Now I know the power cable will be required regardless, but my hope was to install an outlet at the level the art piece was going to be placed at, nicely nested behind the art piece to hide it. Now, there were a couple of ways to resolve this.

  1. Use an Ethernet over Power (Powerline) adapter to use the existing copper power lines already installed in the house. (Not to be confused with PoE).
    There was just one problem with this: my existing Powerline kit died right when I wanted to use it for this purpose. (Looking inside, it looks like the fuse soldered to the board blew; it might be as simple as replacing that, but a component behind the fuse could have failed, and replacing it would simply blow the new fuse.)
    *This is still a very solid option as the default physical port can be used and no other software/configuration/hackery needs to be done, (Plug n Play).
  2.  The next best option would be to use one of these RJ45 to Wireless adapters:
    Wireless Portable WiFi Repeater/Bridge/AP Modes, VONETS VAP11G-300.
    VONETS VAP11G-500S Industrial 2.4GHz Mini WiFi Bridge Wireless Repeater/Router Ethernet to WiFi Adapter
    This option is not as good, as the signal quality over wireless is not as good as physical, even when compared to Powerline adapters. However, this option, much like the Powerline option, allows the use of the default NIC; only the device itself would need to be preconfigured using another system, but otherwise, again, no software/configuration/hackery needs to be done.
  3.  Straight up use a WiFi Adapter on the ESXi host.

Now if you look up this option you’ll see many different responses from:

  1. It can’t be done at all. But USB NICs have community drivers.
    This is true, and I've used it for ESXi hosts that didn't have enough NICs for the different networks that were available (and VLANs were not a viable option for the network design). But I digress; that's not what we are after. WiFi ESXi, yes?
  2.  It can’t be done. But option 1, powerline is mentioned, as well as option 2 to use a WiFi bridge to connect to the physical port.
  3.  Can’t be done, use a bridge. Option 2 specified above. and finally…
  4.  Yeah, ESXi doesn’t support Wifi (as mentioned many times) but….. If you pass the WiFi hardware to a VM, then use the vSwitching on the host.. Maybe…

As directly quoted by.. “deleted” – “I mean….if you can find a wifi card that capable, or you make a VM such as pfsense that has a wifi card passed through and that has drivers and then you router all traffic through some internal NIC thats connected to pfsense….”

It was this guy's comment that I ran with on this crazy idea, to see if it could be done… Spoiler alert: yes, that's why I'm writing this blog post.

The Tasks

The Caveats

While going through this project I was hit with one pretty big hiccup, which really sucks, but I was able to work past it. That is: it won't be possible to bridge the WAN/LAN network segments in OPNsense/pfSense with this setup. It really sucked that I had to find this out the hard way… as mentioned by pfSense's parent company here:

“BSS and IBSS wireless and Bridging

Due to the way wireless works in BSS mode (Basic Service Set, client mode) and IBSS mode (Independent Basic Service Set, Ad-Hoc mode), and the way bridging works, a wireless interface cannot be bridged in BSS or IBSS mode. Every device connected to a wireless card in BSS or IBSS mode must present the same MAC address. With bridging, the MAC address passed is the actual MAC of the connected device. This is normally a desirable facet of how bridging works. With wireless, the only way this can function is if all the devices behind that wireless card present the same MAC address on the wireless network. This is explained in depth by noted wireless expert Jim Thompson in a mailing list post.

As one example, when VMware Player, Workstation, or Server is configured to bridge to a wireless interface, it automatically translates the MAC address to that of the wireless card. Because there is no way to translate a MAC address in FreeBSD, and because of the way bridging in FreeBSD works, it is difficult to provide any workarounds similar to what VMware offers. At some point pfSense® software may support this, but it is not currently on the roadmap.”

Cool, what does that mean? It means that if you are running a flat /24 network (as most home networks run a private subnet of 192.168.0.0/24), this device will not be able to participate in the layer 2 broadcast domain. The good news is ESXi doesn't need, or utilize features of, broadcast domains. It does, however, mean that we will need to manage routes, as communications to the host using this method will have to be on its own dedicated subnet and be routed accordingly based on your network infrastructure. If you have no idea what I'm talking about here, then it's probably best not to continue with this blog post.

Let's get started. Oh, another thing: at the time of this writing, a physical port is still required for this setup, as lots of initial configuration still needs to take place on the ESXi host via the Web GUI, which can initially only be accessed via the physical port. Maybe when I'm done I can make a micro image of the ESXi hdd with the required VM, but even then the passthrough would have to be configured… ignore this rambling, I'm just thinking stupid things…

Step 1) Have an ESXi host with a PCI-e based WiFi card.

I've tested this with both a desktop mobo with a PCI-e WiFi card and a laptop with a built-in WiFi card; in both cases this process worked.

As you can see here, I have a very basic ESXi server with some old hardware, but otherwise still perfectly usable. For this setup it will be ESXi on a USB stick, and for fun I made a datastore on the remaining space of the USB stick, since it was a 64 gig stick. This is generally a bad idea, again for the same reasons mentioned above (USB sticks are not good at high random I/O, let alone persistent I/O on top of that), but since this whole blog post is about getting an ESXi host managed via WiFi, which is also frowned upon, why not just go the extra mile and really piss everyone off.

Again, I could have done everything on the existing SATA-based SSD and avoided so much potential future trouble… but here I am… anyway…

You may also note that at this point in the post I am connecting to a physical adapter on the ESXi host, as noted by the IP addresses… once complete, these IP addresses will not be used but will remain bound to the physical NIC.

Step 2) Create VM to manage the WiFi.

Again, I'm choosing to use OPNsense cause they are awesome, in my opinion.

I found I was able to get away with 1 GB of memory (even though the stated minimum is 2) and a 16 GB HDD; if I tried 8 GB, the OPNsense installer would fail, even though it claims to be able to install on 4 GB SD cards.

Also note I manually changed boot from BIOS to EFI, which has long been supported. At this stage, also check off boot into the EFI menu; this allows the VMRC tool to connect ISO images from the desktop machine that I'm using to manage the ESXi host at this time.

Installing OPNsense

Now this would be much faster had I simply used the SSD, but since I’m doing everything the dumbest way possible, the max speed here will be roughly 8 MB/s… I know this from the extensive testing I’ve done on these USB drives from the ESXi install. (The install caused me so much grief hahah).

Wow, 22 MB/s, amazing. Just remember, though, that this will be the HDD for just the OPNsense server, which won't need much storage I/O; it'll simply boot and manage the traffic over the WiFi card.

And much like ESXi installed on the exact same USB drive, we are going to configure OPNsense to not burn out the drive, by following the suggestions in this thread.

Configuring OPNsense

Much like the ESXi host itself, at this point I have this VM connected to the same VMPG that connects to my flat 192.168 network. This will allow us to access the web interface to configure the OPNsense server in exactly the same manner we are currently configuring the ESXi host. However, for some reason, while the main interface will default-assign to LAN, it won't be configured for DHCP and assumes a 192.168.1.1/24 IP… cool. So log into the console and configure the LAN IP address to be reachable per your config; in my case I'm going to give it an IP address in my 192.168.0.0/24 network.

Again, this IP will be temporary, just to configure the VM via the Web GUI. Technically the next couple of steps can be done via the CLI, but this is just a preference for me at this time; if you know what you are doing, feel free to configure these steps as you see fit.

I'm in! At this point I configure SSH access and allow root and password login. Since this is a WiFi-bridged VM and not one acting as a firewall between my private network and the public-facing internet, this is fine for me and allows more management access. Change these how you see fit.

At this point I skip the GUI wizard, then configure the settings per the link above.

Even with only 1 GB of memory defined for the VM, I wonder if this will cause any issues. Reboot; the system seems to have come up fine… moving on.

Holy crap we finally have the pre-reqs in place. All we have to do now is configure the WiFi card for PCI passthrough, give it to the VM, and reconfigure the network stacks. Let’s go!

Locate WiFi card and Configure Passthrough

So back in the ESXi web interface, go to Host -> Manage -> Hardware and configure the device for passthrough, until… you find all devices are greyed out? What the… I've done this 3 times, what happened…

All PCI Passthrough devices grayed out on ESXi 6.7U3 : r/vmware (reddit.com)

FFS. OK, I should have mentioned this in the pre-reqs, but I guess in all my previous test builds this setting must have been enabled and available on the boards I was using… I hope I'm not hooped here yet again in this dang project…

Great. Went into the BIOS and could find nothing specific for VT-d or VT-x (kind of amazed VMs were working on this thing the whole time). I found one option called XD bit or something; it was enabled, I changed it to disabled, and it caused the system to go into a boot loop. It would start the ESXi boot-up and then, halfway in, randomly reboot; I changed the setting back and it works just fine again.

I'm trying super hard right now not to get angry, cause everything I have tried to get this server up and running without having to use the physical NIC has failed… even though I know it's possible, cause I did this 2 other times successfully, and now I'm hung up because of another STUPID ****ING technicality.

K, I have one other dumb idea up my ass… I have a USB-based WiFi NIC; maybe, just maybe, I can pass that to OPNsense…

VMware seems to possibly allow it: Add USB Devices from an ESXi Host to a Virtual Machine (vmware.com)

OPNsense… Maybe? compatible USB Wifi (opnsense.org)

Here goes my last and final attempt at this hardware….

Attempting USB WiFi Passthrough

Add device, USB Controller 2.0.

Add Device, Find USB device on host from drop down menu.

Boot the VM… (my heart's racing right now, cause I'm at a HAB (Heightened Anger Baseline) and I have no idea if this final workaround is going to work or not).

Damn it doesn’t seem to be showing under interfaces… checking dmesg on the shell…

I mean, there it is; it has the same name as the PCI-e based WiFi card I was trying to use, but that one is 1) pulled from the machine, and 2) we couldn't pass it through. And dmesg shows it's on usbus1… that has to be it… but why can't I see it in the OPNsense GUI?

OMG… I think this worked… I went to Interfaces -> Wireless, then added the run0 I saw in dmesg…

I then added it as an available interface…

For some weird reason it gave it a weird assignment as WifIBridge… I went back into the console and selected option 2 to assign interfaces:

Yay, now I can see an assignable interface for WAN. I pick run0.

Now back into OPNsense GUI… OMG… there we go I think we can move forward!

Once you see this we can FINALLY start to configure the wireless connection that will drive this whole design! Time for a quick break.

Configuring WiFi on OPNsense

No matter if you did PCI-e passthrough or USB passthrough you should now have an accessible OPNsense via LAN, and assigned the WiFi device interface to WAN. Now we need to get WAN connected to the actual WiFi.

So… Step 1) Remove all blocking options to prevent any network issues; again, this is an internal bridge/router, and not an edge firewall/NAT.

Uncheck Block Private Networks (Since we will be assigning the WAN interface a Private IP), and uncheck Block bogon networks.

Step 2) Define your IP info. In my case I'm going to provide it a static IP. I want to give it the one currently being used to access it, which is bound to the vNIC, but since that one's already bound and in use, we'll give it another IP in the same subnet and move the IP once it's released from the other interface. For now we will also leave it as a /32, to prevent a network overlap with the interface bound on LAN that's configured for a /24.

No IPv6.

Step 3) Define SSID to connect to and Password.

I did this and clicked apply, and to my dismay… I couldn't get a ping response… I ssh'd into the device via the current VMX NIC IP, and even the device itself couldn't ping it (the interface is down; something is wrong).

Checking the OPNsense GUI under Interface Assignments, I noticed 2 WiFi interfaces (somehow, I guess, from me creating it above and then running the wizard on the console?).

Dang, I wanted to grab a snip, but after picking the main one (the other one was called a clone), it has now been removed from the dropdown, and after picking that one the pings started working!

Not sure what to say here, but at this point you should have an OPNsense server accessible by LAN (192.168.0.x) and WAN (192.168.0.x). The next thing is we need to make the web interface accessible via the WAN (wireless) interface.

Basically, something as horrendous as this drawing here:

Anyway… the first goal is to see if the WiFi holds up. To test this, I simply unplug the physical cable from the beautiful diagram above and make sure the pings to the WAN interface stay up… and they both went down…

This happened to me on my first go-around testing this setup… I know I fixed it… I just can't remember how… maybe a reboot of the VM, re-plug the physical cable. Before I reboot this device, I'll configure a gateway as well.

Interesting, so yup, that fixed the WiFi issue. OPNsense now came up clean, and the WiFi still gives a ping response even when the physical NIC is removed from the ESXi host… we are gonna make it!

Interesting: the LAN IP did not come up and disappeared. But that's OK, cause I can access the Web GUI via the WAN IP (wirelessly).

Finally. OK, we finally have our wireless connection; now we just need to create a new vSwitch and MGMT network on the ESXi host that we will connect to the OPNsense on the VMX0 side (LAN), which you can see is free to reconfigure. This also freed the IP address I wanted to use for the WAN, but since I've had so many issues… I'm just going to keep the one I got working and move on.

Configure the Special Management Network

I'm going to go on record and say I'm doing it this way simply cause I got this way to work; if you can make it work by using the existing vSwitch and MGMT interfaces, by all means, giver! I'm keeping my existing IPs and MGMT interfaces on the default vSwitch0 and creating a new one for the wireless connection, simply so that if I ever want to physically connect to the existing connection… I just plug in the cable.

Having said that on the ESXi host it’s time to create a new vSwitch:

Now create the new VMK. The IP given here is in the new subnet that will be routed behind the OPNsense WAN. In my example, I created a new subnet, 192.168.68.0/24; this will be routed to the WAN IP address given to OPNsense, which in my example is 192.168.0.33. (Outside the scope of this blog post, I have created routes for this on my gateway devices. Also, since my machine is in the same subnet as the OPNsense WAN IP, but the OPNsense WAN IP address is not my subnet's gateway IP, this can cause what is known as asymmetric routing; to resolve this, you simply have to add the same route I just mentioned to the machine managing the devices. You have been warned; design your stuff better than I'm doing here… this is all simply for educational purposes… don't ever do this in production.)

Now we need to create a VMPG for the VM, to connect the VMX0 IP into the new vSwitch and provide it the gateway IP for that new subnet (192.168.68.1/24).

Now we can finally configure the vNIC on the OPNsense VM to this new VMPG:

Before we configure the OPNsense box to have this new IP address let’s configure the ESXi gateway to be that:

OK finally back on the OPNsense side let’s configure the IP address…

Now, to validate this, it should simply be a matter of making sure the ESXi host can ping this IP…

All I should have to do now is configure the route on my machine doing all this work and I should also be able to ping it…
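That static route, for reference (my example subnet 192.168.68.0/24 behind the OPNsense WAN IP 192.168.0.33; needs admin/root on the client):

```shell
# Linux management box
ip route add 192.168.68.0/24 via 192.168.0.33

# Windows management box (elevated prompt); -p persists across reboots
route -p add 192.168.68.0 mask 255.255.255.0 192.168.0.33
```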

More success… final step… unplug the physical NIC; do the pings stay up?? OMG, they do!!! hahaha:

As you can see the physical NIC IP drops but the new secret MGMT IPs behind the WiFi stay up! There’s one final thing we need to do though.

Configure Auto Start of OPNsense

This is a critical step in the design setup as the OPNsense needs to come up automatically in order to be able to manage the ESXi host if there is ever a reboot of the host.

Then simply configure the auto start setting for this VM:

I also go in and change the auto start delay to 30 seconds.
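The same autostart settings can be applied from an SSH session with vim-cmd if you prefer; the VM ID (1 below) is a placeholder you’d look up first with getallvms:

```shell
# Find the OPNsense VM's ID
vim-cmd vmsvc/getallvms

# Turn autostart on for the host
vim-cmd hostsvc/autostartmanager/enable_autostart true

# Args: VM ID, start action, start delay (30s), start order,
#       stop action, stop delay, heartbeat wait
vim-cmd hostsvc/autostartmanager/update_autostartentry 1 powerOn 30 1 guestShutdown 30 systemDefault
```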

Summary

And there you have it… an ESXi host completely managed via WiFi….

There are a ton of limitations:

  1. No bridging, so you can’t keep a flat Layer 2 broadcast domain. Thus:
  2. Requires dedicated routes and complex networking.
  3. All VM traffic is best kept directly on the internal vSwitch; otherwise all other VM traffic will share the same WiFi gateway, providing a terrible experience.
  4. The web interface will become sluggish when the network interface is under load.
  5. However, it is overall actually possible.
  6. Using PCIe passthrough disallows snapshots/vMotions of the OPNsense VM, but USB does allow them. When doing a storage vMotion the VM crashed on me, and for some reason auto start got disabled too, so I had to manually start the VM back up. (I did this by re-IPing the ESXi server via the console and plugging in a physical cable.)
  7. With a USB WiFi NIC, connections can be connected/disconnected from the host, but with PCIe passthrough these options are disabled.
  8. With a USB NIC you can add more vNICs to OPNsense and configure them; it just brings down the network overall for about 4-5 minutes, but be patient, it does work.

Here’s a Speedtest from a Windows virtual machine on the ESXi host.

Hope you all enjoyed this blog post. See ya all next time!

*UPDATE* Remember when I stated I wanted to keep those VMKs in place in case I ever wanted to plug the physical cable back in? Yeah, that burnt me pretty hard. If you want a backup physical IP, make it something different from your existing network subnets and write it down on the NIC…

For some really strange reason HTTPS would work, but all other connections such as SSH would time out, very similar to an asymmetric routing issue; actually, that’s because it kind of was one. I’m kind of shocked that HTTPS even managed to work… huh…

Here’s a conversation I had with others on the VMware IRC channel trying to troubleshoot the issue. Man, I felt so dumb when I finally figured out what was going on.

*Update 2* I noticed that the CPU usage on the OPNsense VM would be very high when traffic through it was taking place (and not even high-bandwidth traffic, either), AND that was with the pffilter service disabled, meaning it was working in pure routing mode.

High CPU load with 600Mbit (opnsense.org)

Poor speeds and high CPU usage when going through OPNsense?

“Furthermore, set the CPU to 1 core and 4 sockets. Make sure you use VirtIO nics and set Multiqueue to 4 or 8. There is some debate going on if it should be 4 or 8. By my understanding, setting it to 4 will force the amount of queues to 4, which in this case matches your amount of CPU cores. Setting it to 8 will make OPNsense/FreeBSD select the correct amount,” says Mars.

“In this case this is also comparing a linux-based router to a BSD based one. Linux will be able to scale throughput much easily with less CPU power required when compared to the available BSD-based routers. Hopefully with FreeBSD 13 we’ll see more optimization in this regard and maybe close the gap a bit compared to what Linux can do,” says opnfwb.

Mhmmm, OK, I guess the first thing I can try is upping the CPU core count. But this VM also hosts the connection I need to manage it… Seems others have hit this problem too…

Can you add CPU cores to VM at next restart? : r/vmware (reddit.com)

While the script is decent, the comment by cowherd is exactly what I was thinking I was going to do here: “Could you clone the firewall, add cores to the clone, then start it powering up and immediately hard power off the original?”

I’ll test this out when time permits and hopefully provide some charts and stats.

Using Fake PMem to Scratch That Itch

Sooooo… let’s say you have an ESXi server setup with lots of memory; you have local storage and you could simply install ESXi on that and be done with it… but let’s just say you wanna be a hard-ass and use the old USB install method. VMware’s ESXi will allow such an install, but it will also warn you that no scratch location is configured (USB drives are not meant for heavy sustained I/O and fail because of it, thus scratch is disabled here) and no core dump location is configured (no way the drive could write the data fast enough, or be reliable when there’s a system crash).

So you wanna stick to your guns and have ESXi installed on USB, and you already disabled the annoying alert for core dumps cause this is a lab host and you simply don’t care…. alright….

Even then, the pestering “System logs on host localhost.localdomain are stored on non-persistent storage. Consult product documentation to configure a syslog server or a scratch partition.” message shows up; in my previous post I simply added a note…

Note* Option 2 was still required to get rid of another message: System logs are stored on non-persistent storage (2032823) (vmware.com)

That being “Option 2 – Set Syslog.global.logDir”. For this special beast of a lab setup, though, while we DO have an SSD planned to be a datastore (and again, we could simply have installed ESXi there and moved on with my life), I want this to be the most complex ESXi host ever, and instead have full separation of the SSD and the ESXi host OS install, in such a way that the SSD could be fully unplugged and the system would still boot fine (given the USB drive is still alive).

So now I discovered you can create a datastore in memory: what a perfect place for a scratch partition. (I know it won’t actually persist, as noted by William himself, and it won’t mean anything because any core dumps won’t survive if the system experiences a power failure, but in my use case I simply don’t care; I just need scratch, and what better place than memory… Yes, I’m aware scratch is supposed to be for when memory is low, but if it’s reserved it’ll be there even when the host thinks there’s no memory to overflow into.)… Anyway…

Where the fuck was I… oh yeah, PMem… Let’s see if this works in ESXi 7…

How to simulate Persistent Memory (PMem) in vSphere 6.7 for educational purposes?  (williamlam.com)

“Disclaimer: This is not officially supported by VMware. Unlike a real physical PMem device where your data will be persisted upon a reboot, the simulated method will NOT persist your data. Please use this at your own risk and do not place important or critical VMs using this method.

In ESXi 6.7, there is an advanced boot option which enables you to simulate or “fake” PMem by consuming a percentage of your physical ESXi hosts memory and allocating that to form a PMem datastore. You can append this boot option during the ESXi boot up process (e.g. Control+O) or you can easily manage it using ESXCLI which is my preferred method of choice.

Run the following command and replace the value with the desired percentage for PMem allocation:”

esxcli system settings kernel set -s fakePmemPct -v 5

“Note: To disable fake PMem, simply set the value to 0

You can also verify whether fake PMem is enabled or its current configured value is by running the following command:

esxcli system settings kernel list -o fakePmemPct

For the changes to go into affect, you obviously will need to reboot your ESXi host.”

Man, I fought with getting my install to work for hours because of a faulty USB drive… I really should just install on the SSD… nah! I want the difficult way!

… Reboot and…. finally! it shows 🙂

I went to change the scratch location, but providing the path with the escape character didn’t work… specifying it directly in the GUI, however, took it.
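For reference, the usual CLI way to set this (per VMware’s KB on configuring a scratch location) looks like the following; the PMem datastore path here is a made-up example, and as it turned out, that mount path doesn’t survive a reboot anyway:

```shell
# Create a directory on the datastore to hold scratch data
mkdir /vmfs/volumes/PMemDS-example/.locker

# Point the ScratchConfig advanced option at it (takes effect after a reboot)
vim-cmd hostsvc/advopt/update ScratchConfig.ConfiguredScratchLocation string /vmfs/volumes/PMemDS-example/.locker
```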

As you can see, the old path took, but after a reboot I got the error again; looks like it didn’t like the path…

That’s great…. the datastore name changes on reboot, and so does its mounted path, which explains why the error popped up again… OI

So much for that idea

 

PVE Hosts Won’t Boot, Missing Drive

This is pretty dumb… every other hypervisor I’ve ever played with, if the boot drive is fine… the OS boots… period….

Yet the other day I tried to boot my PVE host and it just wouldn’t boot; it would get stuck, stating that a datastore (nothing that’s a dependency for the OS to actually boot) was preventing the OS from booting….

I found this PVE thread that was more recent with a comment that worked for the OP.

“if you created the partition via the gui, we create a systemd-mount unit under /etc/systemd/system

(e.g. mnt-datastore-foo.mount)

you can disable that unit with ‘systemctl disable <unit>’

or delete the file

we’ll improve the docs in the near future and have planned to make the gui disk management a bit easier in regards to removing/reusing”

This was posted in 2021, yet when I checked that path (booting into recovery mode) it didn’t contain any file ending in .mount… so not sure what this is about. I did, however, find this thread, which was exactly my problem… and funny enough the OP literally posted their own answer (which is the answer here as well) with no other comments made on the post, which was created in 2018…
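For anyone hitting the same wall, checking for those units from recovery mode is quick; the unit name in the last command is the example from the quoted post, not a real one from my host:

```shell
# List all systemd mount units, active or not
systemctl list-units --type=mount --all

# Check for unit files the GUI may have created
ls /etc/systemd/system/*.mount

# If one exists for the dead datastore, disable it
systemctl disable mnt-datastore-foo.mount
```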

[SOLVED] – Reboot Proxmox Host now will not boot | Proxmox Support Forum

“So I had upgraded some packages and the proxmox host recommended rebooting the system. After rebooting the system hangs at the screen showing [DEPEND] in yellow for 3 lines:
Dependency failed for /mnt/Media
Dependency failed for local file system
Dependency failed for File system check on /dev/Data-Storage/Media

I tried running control-D to continue but it does not continue.

I’m guessing I need to clean up the entries how can i do that? I’m assuming I just need to boot into emergency and edit /etc/fstab and remove the entries?

OK yes removing those from /etc/fstab fixed it and now it boots.”

This is exactly what I did as well… I saw the offending entry, which was a BTRFS storage I had configured in the past; that storage unit had since been shut down. (I thought I blogged this, but I only blogged about using LVM over iSCSI… Configuring shared LVM over iSCSI on Proxmox – Zewwy’s Info Tech Talks)

Anyway, after removing the entry from fstab and rebooting… bam, the PVE host came right up.
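If you’d rather keep the entry than delete it, fstab’s `nofail` option tells systemd the mount is optional, so boot continues when the device is absent; this is a generic sketch based on the mount from the quoted thread, not my original entry:

```shell
# /etc/fstab: 'nofail' lets boot continue if the device is missing;
# x-systemd.device-timeout caps how long systemd waits for it to appear
/dev/Data-Storage/Media  /mnt/Media  btrfs  defaults,nofail,x-systemd.device-timeout=10s  0  0
```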

Constructive criticism to PVE: while, yes, any knowledgeable Linux sysadmin will figure out how to fix this, as I just did here, how about NOT having the boot process fail simply because a configured storage is unavailable… like all other hypervisors… BOOT the host and show the storage as failed in the management UI so it can be cleaned up that way…. Just… food for thought….