Spammed via BCC

Whenever I checked my local email, I noticed a large amount of spam and junk landing in my mailbox. The problem was the spammers were using the BCC (Blind Carbon Copy) trick, which means the actual recipients of the bulk send were hidden from everyone who received the email.

Normally people only have one address associated with their mailbox, so it is obvious which address a message was sent to, but getting the spam to stop without other technical security measures can be very difficult. It’s very similar to a real-life person who knows where you live and harasses you by secretly egging your house at night. You can’t ask them to stop and can’t really use legal tactics, because you don’t know who they are. So you have to rely on other means: identify the person first, or simply move. Both are tough.

In my case I use multiple email addresses when signing up for stuff, so if one of those service providers gets hacked or compromised, I can usually just remove the leaked address from my list of email addresses.

However, because the spammer was using BCC, the To address shown in the message was a random address rather than one of mine.

Take a look at this example. As you can see, I got the email, but it was addressed to jeff.work@yorktech.ca. I do not own this domain, so to me it was clearly forged. However, that doesn’t help me determine which of my multiple email addresses had been compromised.

I figured I’d simply use the EAC and check the mail flow section, but for some reason it would always return nothing (broken?).

Sigh. Lucky for me there’s the internet, and a site called practical365 with an amazing Exchange admin named Paul Cunningham, who writes amazing posts. This was the post that helped me out: Searching Message Tracking Logs by Sender or Recipient Email Address (practical365.com)

In the first image you can see the sender address. Using this as the source, I ran the following command in the Exchange PowerShell window:

Get-MessageTrackingLog -Sender uklaqfb@avasters.nov.su

Oh, there we go, the email address I created for making a donation to the Heart n Stroke foundation. So, I guess at some point the Heart n Stroke foundation had a security breach. Doing a quick Google search, wow, sure enough, it happened 3 years ago….

Be wary of suspicious messages, Heart and Stroke Foundation warns following data breach | CTV News

Data security incident and impact on Heart and Stroke constituents | Heart and Stroke Foundation

This is what I get for being a nice guy. Lucky for me I created this email alias, so it’s as simple as deleting it from my account, since I do not care for any emails from them at this point. Fuck ’em! Can’t even keep our data safe, so that’s the last donation they get from me.

Sadly, I know many people can’t use this same technique to help keep their data safe. I wish it was a feature available with other email providers, but I can understand why they don’t allow it, as email sprawl would be nearly unmanageable for a service provider.

Hope this post helps someone in the same boat.

Upgrading Windows 10 2016 LTSB to 2019 LTSC

*Note 1* – This retains the Channel type.
*Note 2* – Requires a new Key.
*Note 3* – You can go from LTSB to SA, keeping files if you specify new key.
*Note 4* – LTSC versions.
*Note 5* – Access to ISOs. This is hard and most places state to use the MS download tool. I, however, managed to get the image and key thanks to having an MSDN (aka Visual Studio) subscription.

I attempted to grab the 2021 Eval copy and ran the setup exe. When it got to the point of asking whether to keep existing files (aka upgrading), it would grey the options all out… 🙁

So I said no to that, and grabbed the 2019 copy which when running the setup exe directly asks for the key before moving on in the install wizard… which seems to let me keep existing files (upgrade) 🙂

My enjoyment was short-lived when I was presented with a nice “Windows update failed” window.

Classic. So the usual, “sfc /scannow”

Classic. So fix it, “dism /online /cleanup-image /restorehealth”

Stop, Disable Update service, then clear cache:

Scan system files again, “sfc /scannow”

Reboot, make sure the system still boots fine, check. Do another sfc /scannow, returns 100% clean. Run Windows update (after enabling the service), comes back saying 100% up to date. Run installer….

For… Fuck… Sakes… what logs are there for this dumb shit? Log files created when you upgrade Windows 11/10 to a newer version (thewindowsclub.com)

setuperr.log (same location as setupact.log): Data about setup errors during the installation. Review all errors encountered during the installation phase.

Coool… where is this dumb shit?

Log files created when an upgrade fails during installation before the computer restarts for the second time.

  • C:\$Windows.~BT\Sources\panther\setupact.log
  • C:\$Windows.~BT\Sources\panther\miglog.xml
  • C:\Windows\setupapi.log
  • [Windows 10:] C:\Windows\Logs\MoSetup\BlueBox.log

OK checking the log…..

Lucky me, something exists as documented, count my graces. What has this file got for me?

PC Load letter? WTF does that mean?! It’s not listed in this image, since it must have been resolved by then, but I had a line that stated “required profile hive does not exist”. I managed to find this MS thread about the same problem, and thankfully someone in the community came back with an answer: create a new local temp account, and remove all old profiles and accounts on the system (this might be hard for some, it was not an issue for me). Sadly I still got “Windows 10 install failed”.

For some reason the next one that sticks out like a sore thumb for me is “PidGenX function failed on this product key”, which led me to this thread all the way back from 2015.

While there’s a useless comment by “SaktinathMukherjee”, don’t be this dink saying they downloaded some third party software to fix their problem, gross negligent bullshit. The real hero is a comment by a guy named “Nathan Earnest” – “I had this same problem for a couple weeks. Background: I had a brand new Dell Optiplex 9020M running Windows 8.1 Pro. We unboxed it and connected it to the domain. I received the same errors above when attempting to do the Windows 10 upgrade. I spent about two weeks parsing through the setup error logs seeing the same errors as you. I started searching for each error (0x800xxxxxx) + Windows 8.1. Eventually I found one suggesting that there is a problem that occurs during the update from Windows 8 to Windows 8.1 in domain-connected machines. It doesn’t appear to cause any issues in Windows 8.1, but when you try to upgrade to Windows 10… “something happened.”

In my case, the solution: Remove the Windows 8.1 machine from the domain, retry the Windows 10 upgrade, and it just worked. Afterwards, re-join the machine to the domain and go about your business.

Totally **** dumb… but it worked. I hope it helps someone else.”

Again, I’m free to try stuff, so since I was testing I cloned the machine and left it disconnected from the network, then under computer properties changed from domain to workgroup (which means it doesn’t remove the computer object from AD, it just removes itself from being part of a domain). After this I ran another sfc /scannow just to make sure no issue happened from the VM cloning, with 100% green I ran the installer yet again, and guess what… Nathan was right. The update finally succeeded, I can now choose to rename the PC and rejoin the domain, or whatever, but the software on the machine shouldn’t need to be re-installed.

Another fun dumb day in paradise, I hope this blog post ends up helping someone.

 

Move Linux Swap and Extend OS File System

Story

So, you go to run updates, in this case some Linux servers. So, you dust off your old dusty fingers and type the blissful phrase, “apt update” followed by the holier than thou “apt upgrade”….

You watch as the text scrolls past the screen in beautiful green console style, as you whisper, “all I see is blonde, and brunette”. Having seen the same text so many times, you glaze over it, following up with “ignorance is bliss”.

Your sweet dreams of living in the Matrix come to a halt as instead of success you see the dreaded red text on the screen and realize the Matrix has no red text. Shucks, this is reality, and the update has just failed.

Reality can be a cruel place, and it can also be unforgiving. In this case the application that failed to update is not the problem (I mean, you could associate blame here if the devs and maintainers didn’t do any due diligence on efficiencies, but I digress). The problem was simply the one as old as computers themselves: “Not enough storage space”.

Now, you might be wondering at this point… what does this have to do with Linux Swap?!?!? Like any good ol’ storyteller, I’m gettin’ to that part. Now where was I… oh yes, that pesky no space issue. Now normally this would be a very simple endeavor, either:

A) Go clear up needless crap.
Trust me I tried, ran apt autoclean, and apt clean. Looked through the File System, nothing was left to remove.

B) Add more storage.
This is the easiest route. If virtualized, simply expand the VM’s HDD on the host that’s serving it; if physical, dd the contents to a larger drive on a similar bus.

Lucky for me the server was virtual. Now comes the kicker: even after expanding the hard drive, the OS file system couldn’t simply grow into the new space, because the Linux machine was configured with a partition-based swap sitting right after the OS partition. In both situations, virtual and physical, this has to be dealt with in order to expand the file system the Linux OS is utilizing.

Swap: What is it?

Swap is space on a hard drive reserved for temporarily holding memory contents when a new request for memory is made and there is no more actual RAM (Random Access Memory) available for it. The system simply takes less frequently accessed memory and just kind of “shoves it under the rug” to be cleaned up or used later.

If you were running a system with massive amounts of memory, you could, in theory, run without swap, just remove it and life’s good. However, in lots of cases memory is a scarce commodity vs something like hard drive storage, the difference is merely speed.

Anyway, in this case I attempted to remove swap entirely (steps will be provided shortly). However, this system was provisioned with just enough memory that several MB of RAM was actually being placed into swap. As such, when I removed the swap and all the services began to spin up, the VM became unusable, as running commands would return that they were unable to allocate memory. So instead, the swap was simply changed from a partition to a file-based swap.

Step 1) Stop Services

This step may or may not be required; it depends on your system’s current resource allocations. If you’re in the same boat I was, where commands won’t run because the system is at max memory usage, then this is needed to ensure the system doesn’t become unusable during the transition, as swap has to be disabled for a short time.

The commands to stop services will depend on both the Linux distro used and the service being managed. This is beyond the scope of this post.

Step 2) Verify Swap

Run the command:

swapon -s

This is an old Linux machine I plan on decommissioning, but as we can see here, a shining example of a partition-based swap, and the partition it’s assigned to. /dev/sda3. We can also see some of the swap is actually used. During my testing I found Linux wouldn’t disable swap if it is unable to allocate physical memory for its content, which makes sense.
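If you’d rather not eyeball the output, here’s a minimal sketch that pulls out only the partition-based swap entries. It assumes the usual `swapon -s` column layout (Filename, Type, Size, Used, Priority), which is the same layout as /proc/swaps; the function name and the sample values are mine, not from the server in the screenshot.

```shell
# List partition-based swap areas as "device size_kib used_kib".
# $1: file in `swapon -s` / /proc/swaps layout (defaults to /proc/swaps)
list_partition_swaps() {
    # Skip the header line, keep rows whose Type column says "partition"
    awk 'NR > 1 && $2 == "partition" { print $1, $3, $4 }' "${1:-/proc/swaps}"
}

# Demo on made-up sample text in the same layout
sample=$(mktemp)
printf '%s\n' \
    'Filename Type Size Used Priority' \
    '/dev/sda3 partition 1046524 3840 -2' \
    '/swapfile file 1048572 0 -3' > "$sample"
list_partition_swaps "$sample"   # prints: /dev/sda3 1046524 3840
rm -f "$sample"
```

A non-zero third number means something is still living in that swap, which is exactly the case where swapoff needs free RAM to succeed.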

Step 3) Create Swap File

Create the Swap file before disabling the current partition swap or apparently the dd command will fail due to memory buffers.

dd if=/dev/zero of=/swapfile count=1 bs=1GiB

This also depends on the size of your old swap, change the command accordingly based on the size of the partition you plan to remove. In my case roughly a Gig.

chmod -v 0600 /swapfile
chown -v root:root /swapfile
mkswap /swapfile
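To size the file from the old swap instead of guessing, here’s a small sketch that turns the Size column from `swapon -s` (reported in KiB) into a dd block count. It prints the dd command rather than running it. Note it swaps in bs=1M with a matching count, which is my variation on the command above and not what I ran on this server: dd allocates a buffer of bs bytes, so a 1 MiB block size needs far less free memory than a single 1 GiB block. The size value is an example.

```shell
# Convert a swap size in KiB into whole MiB, rounding up so the file is never smaller.
swap_kib_to_mib() {
    echo $(( ($1 + 1023) / 1024 ))
}

size_kib=1046524                        # example Size value from `swapon -s`
count=$(swap_kib_to_mib "$size_kib")    # 1022 for this example
# Print the command for review instead of executing it
echo "dd if=/dev/zero of=/swapfile bs=1M count=$count"
```

The chmod, chown and mkswap steps stay exactly the same afterwards.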

Step 4) Disable Swap

Now it’s time for us to disable swap so we can convert it to a file-based version. If it states it can’t move the data to memory because memory is full, go back to step 1 and stop services to make room in memory. If this can’t be done due to service requirements, then you’ll have to schedule a maintenance window, since without enough memory on the host, service interruption is inevitable… Mr. Anderson.

swapoff /dev/sda3

easy peasy.

Step 5) Enable Swap File

swapon /swapfile

Step 6) Edit fstab

Now it looks like we’re done, but don’t forget swap is set up by fstab after a reboot. Just ask me how I know…. yeah, I found out the hard way… let’s check the existing fstab file…

cat /etc/fstab
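The actual fstab change isn’t spelled out in the text here, so this is a minimal sketch of what it could look like, not necessarily the exact edit I made. It assumes the old swap shows up in fstab as an entry whose type (third field) is swap, comments that entry out, and appends a line for the swap file. The function name is made up, and you should point it at a copy of fstab first.

```shell
# Comment out active swap entries in an fstab and add a swap file entry.
# $1: path to the fstab to edit (try it on a copy first), $2: swap file path
# Not idempotent: running it twice appends the swap file line twice.
convert_fstab_swap() {
    fstab=$1
    swapfile=$2
    tmp=$(mktemp) || return 1
    # Any non-comment line whose 3rd field is "swap" gets a leading "#"
    awk '$1 !~ /^#/ && $3 == "swap" { print "#" $0; next } { print }' "$fstab" > "$tmp" || return 1
    printf '%s none swap sw 0 0\n' "$swapfile" >> "$tmp"
    # Write back with cat so the original file keeps its owner and mode
    cat "$tmp" > "$fstab" && rm -f "$tmp"
}

# Example on a scratch copy:
#   cp /etc/fstab /tmp/fstab.new
#   convert_fstab_swap /tmp/fstab.new /swapfile
#   diff /etc/fstab /tmp/fstab.new
```

Reviewing the diff before copying the result over /etc/fstab is the whole point of working on a copy.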

Step 7) Reboot and Verify Services

Wait both mounted as swap… what??!?

To fix this, I removed the partition, had the kernel re-read the partition table, updated the initramfs, then rebooted:

fdisk /dev/sda
d
3
w
partprobe
update-initramfs -u

Rebooted and swapon showed just the file swap being used. Which means the deleted partition is no longer in the way of the sectors to allow for a full proper expansion of the OS file system. Not sure what was with the error… didn’t seem to affect anything in terms of the services being offered by the server.

Step 8) Extending the OS File System

If you’ve ever done this on Windows, you’ll know how easy it is with Disk Management. On Linux it’s a bit… interesting… you delete the partition and create it again, but that doesn’t delete the data, unlike what we’ve all come to expect in the Windows realm.

fdisk /dev/sda
d
[enter]
n
[enter]
[enter]
[enter]
w
partprobe
resize2fs /dev/sda2

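The above simply deletes the second partition, then recreates it using all available sectors on the disk. This keeps the data intact only because the recreated partition starts at the same sector as the old one, so the file system is found exactly where it was. The final commands then have the kernel re-read the partition table and let the file system grow into all the sectors added by fdisk.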
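Since everything hinges on the partition starting at the same sector, here’s a hedged sketch of a sanity check you could wrap around the fdisk session. It assumes you save an `sfdisk -d` dump before and after, and that the dump lines look like `/dev/sda2 : start= 1050624, size= ..., type=83`. The function names, file paths, and sample numbers are mine.

```shell
# Print the start sector of one partition from an `sfdisk -d` dump file.
# $1: dump file, $2: partition device (e.g. /dev/sda2)
part_start() {
    awk -v dev="$2" '$1 == dev { sub(/.*start= */, ""); sub(/,.*/, ""); print }' "$1"
}

# Succeed only when the partition starts at the same sector in both dumps.
# $1: dump taken before fdisk, $2: dump taken after, $3: partition device
same_start() {
    before=$(part_start "$1" "$3")
    after=$(part_start "$2" "$3")
    [ -n "$before" ] && [ "$before" = "$after" ]
}

# Intended use (as root, paths are examples):
#   sfdisk -d /dev/sda > /root/before.dump     # before the fdisk session
#   ...delete and recreate the partition in fdisk...
#   sfdisk -d /dev/sda > /root/after.dump
#   same_start /root/before.dump /root/after.dump /dev/sda2 && resize2fs /dev/sda2
```

If the check fails, don’t run resize2fs; fix the partition so it starts at the original sector first.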

Summary

Have fun doing whatever you need to do with all the new extra space you have. Is there any performance impact from doing this? Again, if you have a system with adequate memory, the swap should never be used. If you want to go down that rabbit hole, here: Swap File vs Swap Partition : r/linux4noobs (reddit.com), have fun. Could I have removed the swap partition and created it at the end of the new extended hard drive…. yes… I could have, but that would have required calculating the sectors, and extending the new file system to the sector that would be the start of the new swap partition, and I’d much rather press enter a bunch of times and have the computer do it all for me. I can also extend a file easier than a partition, so read the reddit thread… and pick your own poisons…

Upgrading Windows Server 2016 Core AD to 2022

Goal

Upgrade a Windows Server 2016 Core that’s running AD to Server 2022.

What actually happened

Normally, if the goal is to stay core to core, this should be as easy as an in-place upgrade. When I attempted this myself, the first issue was that it would get all the way to the end of the wizard, then error out telling me to look at some bizarre path I wasn’t familiar with (C:\$windows.~bt\sources\panther\ScanResults.xml). Why? Why can’t the error just be displayed on the screen? Why can’t it be coded for in the dependency checks? Ugh. Anyway, since it was core I had to attach a USB stick to my machine, pass it through to the VM, save the file, and open it up. Nested deep in there, it basically stated “Active Directory on this domain controller does not contain Windows Server 2022 ADPREP /FORESTPREP updates.” Seriously? OK, apparently it requires schema updates before upgrading, since it’s an AD server.

Get-ADObject (Get-ADRootDSE).schemaNamingContext -Property objectVersion
d:\support\adprep\adprep.exe /forestprep
d:\support\adprep\adprep.exe /domainprep

Even after all that, the install wizard got past the error, but then after rebooting, and getting to around 30% of the install, it would reboot again and say reverting the install, and it would boot back into Server 2016 core.

Note, you can’t change editions during an upgrade (Standard vs Datacenter) or (Core vs Desktop). For all limitations see this MS page. The “Keep existing files and apps” option was greyed out and not selectable if I picked Desktop Experience. I had this same issue when I was attempting to upgrade a desktop server and was entering a license key for Standard, not realizing the server had a Datacenter based key installed.

New Plan

I didn’t look at any logs since I wasn’t willing to track them down at this point to figure out what went wrong. Since I also wanted to go Desktop Experience, I had to come up with an alternative route.

It seems my only option is going to be:

  1. Install a clean copy of Server 2016 Desktop (update completely). (Run sysprep, clone for later.)
  2. Add it as a domain controller in my domain.
  3. Migrate the FSMO roles. (If I wanted a clustered AD, I could be done but that wouldn’t allow me to upgrade the original AD server that’s failing to upgrade)
  4. Decommission the old Server 2016 Core AD server.
  5. Install a clean copy of Server 2016 Desktop (update completely). (The cloned copy, should be at OOBE stage.)
  6. Add to Domain.
  7. Upgrade to 2022.
  8. Migrate FSMO roles again. (Done if cluster of two AD servers is wanted).
  9. Decommission other AD servers to go back to single AD system.

Clean Install

Using a Windows Server 2016 ISO image and a newly spun up VM, the install went rather quick, taking only 15 minutes to complete.

Check for updates. KB5023788 and KB4103720. This is my biggest pet peeve, Windows updates.

RANT – The Server 2016 Update Race

As someone who’s a resource hall monitor, I like to see what a machine is doing and I use a variety of tools and methods to do so, including Resource Monitor, Task Manager (for Windows), Htop (linux) and all the graphs available under the Monitor tab of vSphere. What I find is always the same, one would suspect high Disk, and high network (receive) when downloading updates (I see this when installing the bare OS, and the disk usage and throughput is amazing, with low latency, which is why the install only took 15 minutes).

Yet when I click check for updates, it’s always the same: a tiny bit of bandwidth usage, low disk usage, and just endless high CPU usage. I see this ALL THE TIME. Another thing I see is that once it’s done and rebooted, you think the install is done, but no, the Windows Update service will kick off and continue to process “whatever” in the background for at least another half hour.

Why is Windows updates such Dog Shit?!?! Like yay we got monthly Cumulative updates, so at least one doesn’t need to install a rolling ton of updates like we did with the Windows 7 era. But still the lack of proper reporting, insight on proper resource utilization and reliance on “BITS”… Just Fuck off wuauclt….

Ughhh, as I was getting snippets ready to show this, and I wanted to get the final snip of it still showing to be stuck at 4%, it stated something went wrong with the update, so I rebooted the machine and will try again. *Starting to get annoyed here*.

*Breathe* Ok, go grab the latest ISO available for Windows Server 2016 (Updated Feb 2018), so I’m guessing it has KB4103720 already baked in, but then I check the system resources and it’s different.

But as I’m writing this it seems the same thing is happening, updates stalling at 5%, and CPU usage stays at 50%, Disk I/O drops to next to nothing.

*Breaks* Man Fuck this! An announcer is born! Fuck it, we’ll do it live!

I’ll let this run, and install another VM with the latest ISO I just downloaded, and let’s have a race, see if I can install it and update it faster than this VM…. When the new VM finished installing, I set a couple config settings. Check for updates:

Check for updates. KB5023788 and KB4103723. Seriously?

Install, wow, the Downloading updates is going much quicker.  Well, the download did, click install sticking @ 0% and the other VM is finishing installing KB4103720. I wonder if it needs to install KB4103723 as well, if so then the new VM is technically already ahead… man this race is intense.

I can’t believe it, the second server I gave more memory to, was the latest available image from Microsoft, and it does the exact same thing as the first one.. get stuck 5%.. CPU usage 50% for almost an hour.. and error.

lol No fucking way… reboot check for updates, and:

at the same time on the first VM that has been checking for updates forever which said it completed the first round of updates…

This is unreal…

Shit pea one, and shit pea 2, both burning up the storage backend in 2 different ways…. for the same update:

Turd one really rips the disk:

Turd two does a bit too, but more just reads:

I was going to say both turds are still at 0%, but Turd one, like it did before, spontaneously burst back into “Checking for update” while the second one seems to have moved up to 5%… mhmm, feels like I’ve been down this road before.

Damn this sucks, just update already FFS, stupid Windows. *Announcer* “Get your bets here! Put in your bets here!” Mhmmm, I know turd one did the same thing as turd 2, but it did complete one round of updates, and shows a higher version than turd 2, even though turd 2 was the latest downloadable ISO from Microsoft.

I’m gonna put my bets on Turd 1….

Current state:

Turd 1: “Checking for Updates”… Changed to Downloading updates 5%.. shows signs of some Disk I/O. Heavy CPU usage.

Turd 2: “Preparing Updates 5%” … 50% CPU usage… lil to no Disc I/O.

We are starting to see a lot more action from Turd 1, this race is getting real intense now folks. Indeed, just noticing that Turd one is actually preparing a new set of updates, now past the peasant KB4103720. While Turd 2 shows no signs of changing as it sits holding on to that 5%.

Ohhhh!!! Turd one hits 24% while Turd 2 hit the same error it hit the first time. Is it stuck in a failed loop? Let’s just retry, this time without a reboot.. and go..! Back on to KB4103720 preparing @ 0%. Not looking good for Turd 2. Turd 1 has hit 90% on the new update download.

and coming back from the break, Turd one is expecting a reboot while Turd 2 hits the same error, again! Stop the Windows Update service, clear the SoftwareDistribution folder. Start the update service, check for updates, try, fails, reboot, retry:

racing past the download stage… Download complete… preparing to install updates… oh boy… While Turd one is stuck at a blue screen “Getting Windows Ready”. The race between these two can’t get any hotter.

Turd one is now at 5989 from 2273. While Turd 2 stays stuck on 1884. Turd 2 managed to get up to 2273, but I wasn’t willing to watch the hours it takes to get to the next jump. Turd 1 wins.

Checking these build numbers, it looks like Turd 1 won the update race. I’m not interested in what it takes to get Turd 2 going. Over 4 hours just to get a system fully patched. What a pain in the ass. I’m going to make a backup, then clear the current snapshot, then create a new snapshot, then sysprep the machine so I can have a clean OOBE based image for cloning, which can be done in minutes instead of hours.

END RANT

Step 2) Add as Domain Controller.

Wow amazing no issues.

Step 3) Move FSMO Roles

Transfer PDCEmulator

Move-ADDirectoryServerOperationMasterRole -Identity "ADD" PDCEmulator

Transfer RIDMaster

Move-ADDirectoryServerOperationMasterRole -Identity "ADD" RIDMaster

Transfer InfrastructureMaster

Move-ADDirectoryServerOperationMasterRole -Identity "ADD" InfrastructureMaster

Transfer DomainNamingMaster

Move-ADDirectoryServerOperationMasterRole -Identity "ADD" DomainNamingMaster

Transfer SchemaMaster

Move-ADDirectoryServerOperationMasterRole -Identity "ADD" SchemaMaster

Step 4) Demote Old DC

Since it was a Core server, I had to use Server Manager remotely from a client machine (Windows 10). Again no problem.

As the final part said, it became a member server. So not only did I delete it under Sites n Services, I deleted it under ADUC as well.

Step 5) Create new server.

I recovered the system above, changed hostname, sysprepped.

This took literally 5 minutes, vs the 4 hours to create from scratch.

Step 6) Add as Domain Controller.

Wow amazing no issues.

Step 7) Upgrade to 2022.

Since we got 2 AD servers now, and all my servers are pointing to the other one, let’s see if we can update the Original AD server that is now on Server 2016 from the old Core.

Ensure Schema is upgraded first:

d:\support\adprep\adprep.exe /forestprep

d:\support\adprep\adprep.exe /domainprep

run setup!

It took over an hour, but it succeeded…

Summary

If I had an already updated system that was already on Desktop Experience, this might have been faster. I’m still not sure why the in-place upgrade didn’t work for Server Core, but here’s how you can move it to Desktop Experience and then up to 2022. It does unfortunately require a brand new install, with service migrations.

Veeam Backup Encryption

Story

So, a couple posts back I blogged about getting an NTFS USB drive shared to a Windows VM via SMB to store backups on, so that the drive could easily be plugged into a Windows machine with Veeam on it to recover the VMs if needed. However, you don’t want to make it this easy if it were to be stolen. What’s the solution? Encryption… and remembering passwords. Woooooo.

Veeam’s Solution; Encryption

Source: Backup Job Encryption – User Guide for VMware vSphere (veeam.com)

I find it strange in their picture they are still using Windows Server 2012, weird.

Anyway, so I find my Backup Copy job and sure enough find the option:

Mhmmm, so the current data won’t be converted I take it then…

Here’s the backup files before:

and after:

As you can see the old files are completely untouched and a new full backup file is created when an Active full is run. You know what that means…

Not Retroactive

“If you enable encryption for an existing job, except the backup copy job, during the next job session Veeam Backup & Replication will automatically create a full backup file. The created full backup file and subsequent incremental backup files in the backup chain will be encrypted with the specified password.

Encryption is not retroactive. If you enable encryption for an existing job, Veeam Backup & Replication does not encrypt the previous backup chain created with this job. If you want to start a new chain so that the unencrypted previous chain can be separated from the encrypted new chain, follow this Veeam KB article.”

What the **** does that even mean…. to start I prefer not to have a new chain but since an Active full was required there’s a start of a new chain, so… so much for that. Second… Why would I want to separate the unencrypted chain from the new encrypted chain? wouldn’t it be nice to have those same points still exist and be selectable but just be encrypted? Whatever… let’s read the KB to see if maybe we can get some context to that odd sentence. It’s literally talking about disassociating the old backup files with that particular backup job. Now with such misdirected answers it would seem it straight up is not possible to encrypt old backup chains.

Well, that’s a bummer….

Even changing the password is not possible, while they state it is, it too is not retroactive as you can see by this snippet of the KB shared. Which is also mentioned in this Veeam thread where it’s being asked.

So, if your password is compromised, but the backup files have not been, you can’t change the password and keep your old backup restore points without going through a nightmare procedure, or restoring all points and backing them up again somehow?

Also, be cautious checking off this option as it encrypts the metadata file and can prevent import of not encrypted backups. “You can enter password and read data from it, but you cannot “remove the lock” retroactively”

“Reason why Veeam asks for passwords even on non-encrypted chains, is because backupdata metadata(holding information about all restore points in the chain, including encrypted and non encrypted ones) is encrypted too!”

“Metadata will be un-encrypted when last encrypted restore point it describes will be gone by retention.”

Huh, that’s good to know… this lack of retroactive ability is starting to really suck ass here. Like, I get the limitation that there’d be high I/O switching between them, but if BitLocker for Windows can do it for a whole O/S drive LIVE, no less, why can’t Veeam do it for backup sets?

Summary

  • Veeam Supports Encryption
    • Easy, Checkbox on Backup Job
    • Uses Passwords
    • Non Retroactive

I’ll start off by saying it’s nice that it’s supported, to some extent. What would be nice is:

  1. Openness of what Encryption algos are being used.
  2. Retroactive encryption/decryption on backup sets.
  3. Support for Certificates instead of passwords.

I hope this review helps someone. Cheers.

No coredump target has been configured. Host core dumps cannot be saved.

ESXi on SD Card

Ohhh ESXi on SD cards, it got a little controversial but we managed to keep you. Doing the latest install, I was greeted with the nice warning “No coredump target has been configured. Host core dumps cannot be saved.”

What does this mean you might ask. Well in short, if there ever was a problem with the host, log files to determine what happened wouldn’t be available. So it’s a pick your poison kinda deal.

Store logs and possibly burn out the SD/USB drive storage, which isn’t good at that sort of thing, or point it somewhere else. Here’s a nice post covering the same problem and the comments are interesting.

Dan states “Interesting solution as I too faced this issue. I didn’t know that saving coredump files to an iSCSI disk is not supported. Can you please provide your source for this information. I didn’t want to send that many writes to an SD card as they have a limited number (all be it a very large number) of read/writes before failure. I set the advanced system setting, Syslog.global.logDir to point to an iSCSI mounted volume. This solution has been working for me for going on 6 years now. Thanks for the article.”

with the OP responding “Hi Dan, you can definately point it to an iscsi target however it is not supported. Please check this KB article: https://kb.vmware.com/s/article/2004299 a quarter of the way down you will see ‘Note: Configuring a remote device using the ESXi host software iSCSI initiator is not supported.’”

Options

Option 1 – Allow Core Dumps on USB

Much like the source I mentioned above: VMware ESXi 7 No Coredump Target Has Been Configured. (sysadmintutorials.com)

Edit the boot options to allow Core Dumps to be saved on USB/SD devices.

Option 2 – Set Syslog.global.logDir

You may have some other local storage available; in that case set the variable above to that local or shared storage (shared storage being “unsupported”).

Option 3 – Configure Network Coredump

As mentioned by Thor – “Apparently the “supported” method is to configure a network coredump target instead rather than the unsupported iSCSI/NFS method: https://kb.vmware.com/s/article/74537”

Option 4 – Disable the notification.

As stated by Clay – “

The environment that does not have Core Dump Configured will receive an Alarm as “Configuration Issues :- No Coredump Target has been Configured Host Core Dumps Cannot be Saved Error”.
In the scenarios where the Core Dump partition is not configured and is not needed in the specific environment, you can suppress the Informational Alarm message, following the below steps,

Select the ESXi Host >

Click Configuration > Advanced Settings

Search for UserVars.SuppressCoredumpWarning

Then locate the string and and enter 1 as the value

The changes takes effect immediately and will suppress the alarm message.

To extract contents from the VMKcore diagnostic partition after a purple screen error, see Collecting diagnostic information from an ESX or ESXi host that experiences a purple diagnostic screen (1004128).”
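If you prefer the ESXi shell over clicking through the UI, here’s a hedged sketch of what I understand to be the esxcli equivalents for Option 4 and Option 2. It only prints the commands by default (a dry run) so you can review them first. The `run` wrapper, the function names, and the datastore path are all made up for illustration, so swap in your own.

```shell
# Dry-run wrapper: prints each command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Option 4: set UserVars.SuppressCoredumpWarning to 1
suppress_coredump_warning() {
    run esxcli system settings advanced set -o /UserVars/SuppressCoredumpWarning -i 1
}

# Option 2: point the syslog directory at persistent storage, then reload syslog
# $1: log directory (hypothetical example: /vmfs/volumes/datastore1/logs)
set_syslog_dir() {
    run esxcli system syslog config set --logdir="$1"
    run esxcli system syslog reload
}

suppress_coredump_warning
set_syslog_dir /vmfs/volumes/datastore1/logs
```

Run it with DRY_RUN=0 on the host itself once the printed commands look right for your environment.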

Summary

In my case it’s a home lab, so I wasn’t too concerned. I followed Option 4, then simply disabled file core dumps following the second steps in Permanently disable ESXi coredump file (vmware.com)

Note* Option 2 was still required to get rid of another message: System logs are stored on non-persistent storage (2032823) (vmware.com)

Not sure, but disabling coredumps may still help with I/O. I’ll update again if anything new comes up.

Manually Fix Veeam Backup Job after VM-ID change

The Story

There have been a couple of times where my VM-IDs changed:

  • A vSphere server has crashed beyond a recoverable state.
  • A server has been removed and added back into the inventory in vSphere.
  • A VM was manually moved to a new ESXi host.
    • VM removed from inventory, and re-added.
  • Loss of the vCenter Server.
  • Full VM Recovery via Veeam.

What sucks is when you go to run the Job in Veeam after any of the above, the job simply fails to find the object. You can edit the job by removing the VM and re-adding it, but this will build a whole new chain, which you can see in the repo of Veeam after such events occur:

As you can see, there are two chains. This has been an annoyance for me for a long time, as there’s no way to manually set the VM-ID in vCenter; it’s all auto-managed.

I found this Veeam thread discussing the same issue, and someone mentioned “an old trick” which may apply, and linked to a blog post by someone named “Ideen Jahanshahi”.

I had no idea about this, let’s try…

Determine VM-ID on vCenter

The source uses PowerCLI, which I’ve covered installing, but it’s easier to just use the Web UI and grab the VM-ID from the address bar, after the vms parameter.
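If you'd rather not eyeball the address bar, here is a minimal Python sketch that pulls the VM-ID out of a copied URL. It assumes the ID always looks like "vm-" followed by digits (the form seen later in this post); the example URL and the function name extract_vm_id are made up for illustration, and the real URL layout may differ between vSphere versions.

```python
import re
from typing import Optional

# Assumption: VM-IDs look like "vm-55633" (the form used in this post).
VM_ID_PATTERN = re.compile(r"\bvm-\d+\b")


def extract_vm_id(url: str) -> Optional[str]:
    """Return the first 'vm-<digits>' token found in the URL, or None."""
    match = VM_ID_PATTERN.search(url)
    return match.group(0) if match else None


# Hypothetical URL, only here to demonstrate the pattern.
example_url = (
    "https://vcenter.example.com/ui/app/vm;nav=v/"
    "urn:vmomi:VirtualMachine:vm-55633:00000000-0000-0000-0000-000000000000/summary"
)
print(extract_vm_id(example_url))  # vm-55633
```

Paste in whatever your own address bar shows; if nothing matches you get None back instead of a wrong ID.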

Determine VM-ID in Veeam

The source installs SSMS, and much like my fixing WSUS post, I don’t like installing heavy stuff on my servers to do managerial tasks. Lucky for me, SQLCMD is already installed on the Veeam server so no extra software needed.

Pre-reqs for SQLCMD

You’ll need the hostname. (run command hostname).

You’ll need the Instance name. (Use services.msc to list SQL services)

Connect to Veeam DB

Open CMD as admin

sqlcmd -E -S Veeam\VEEAMSQL2012

use VeeamBackup
:setvar SQLCMDMAXVARTYPEWIDTH 30
:setvar SQLCMDMAXFIXEDTYPEWIDTH 30
SELECT bj.name, bo.object_id FROM bjob bj INNER JOIN ObjectsInJobs oij ON bj.id = oij.job_id INNER JOIN Bobjects bo ON bo.id = oij.object_id WHERE bj.type=0
go

For some reason the above query wouldn’t work on my latest build/install of Veeam, but this one did:

SELECT name, job_id, bo.object_id FROM bjobs bj INNER JOIN ObjectsInJobs oij ON bj.id = oij.job_id INNER JOIN BObjects bo ON bo.id = oij.object_id WHERE bj.type=0

In my case, after removing the VM from inventory and re-adding it:

As you can see they do not match, and when I check the VM size in the job properties the size can’t be calculated because the link is gone.

Fix the Broken Job

UPDATE bobjects SET object_id = 'vm-55633' WHERE object_id='vm-53657'

After this I checked the VM size in the job properties and it was calculated. To my amazement it fully worked; it even retained the CBT points, and the backup job ran perfectly. Woo-hoo!
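Since a typo in that UPDATE statement goes straight into the Veeam database, here is a small Python sketch that sanity-checks both IDs before producing the statement to paste into sqlcmd. The helper name build_update_sql is hypothetical, and the "vm-" plus digits format is an assumption based on the IDs in this post; it only builds the text, it does not touch the database.

```python
import re

# Assumption: valid VM-IDs are exactly "vm-" followed by digits.
VM_ID = re.compile(r"vm-\d+")


def build_update_sql(old_id: str, new_id: str) -> str:
    """Validate both VM-IDs and return the UPDATE statement used in this post."""
    for label, value in (("old", old_id), ("new", new_id)):
        if not VM_ID.fullmatch(value):
            raise ValueError(f"{label} VM-ID {value!r} does not look like 'vm-<digits>'")
    if old_id == new_id:
        raise ValueError("old and new VM-IDs are identical; nothing to update")
    return f"UPDATE bobjects SET object_id = '{new_id}' WHERE object_id = '{old_id}'"


print(build_update_sql("vm-53657", "vm-55633"))
```

Remember sqlcmd still needs a go on its own line after the statement before it runs anything.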

This info is for educational purposes only, what you do in your own environment is on you. Cheers, hope this helps someone.

vCLS High CPU usage

The Story

So I went to vMotion a VM to do some maintenance work on a host. Target machine well over 50% CPU usage.. what?! That can’t be right, it’s not running anything…

I tried hard powering the VM off, but it just came right back up suckin CPU cycles with it….

The Hunt

Alright Google, what ya got for me… I found this blog post by “Tripp W Black”. He mentions stopping a vCenter Service called “VMware ESX Agent Manager”, which he stops and then deletes the offending VMs. Sounds like a plan. Let’s try it, so log in to VAMI. (vcenter.consonto.com:5480)

K, let’s stop it… let me hard power off the VM now… ehh the VM is staying dead and host CPU:

K let’s go kill the other droid I have causing an issue…

OK, I got them all down now, but the odd part is I can’t delete them from disk like Sir Black mentioned in their blog post. The option is greyed out for me. Let’s start the service and see what happens…

The Pain

Well, that was extremely annoying. It seemed to have worked only for a moment and then the CPU usage came right back, so I stopped the service again, but I can’t delete the VMs…

Similar issues exist in vSphere 8, with suggestions to even stay running in Retreat Mode, which I’ll get to in a moment. So, if you are unfamiliar, vCLS are small VMs that are distributed to ESXi hosts to keep HA and DRS features operational, even if vCenter itself goes down. The thing is, I’m not even using HA or DRS. I created a cluster merely for EVC purposes, so I can move VMs between hosts live at my own leisure and without downtime. What’s annoying is I shouldn’t have to spend half my weekend day trying to solve a bug in my HomeLab due to poor design choices.

The Constructive Criticism

VMware…. do not assume a cluster alone requires vCLS. Instead, enable vCLS only when HA or DRS features are enabled.

Now that we have that very simple thing out of the way.

The Fix

So, as we mentioned we are able to stop the vCLS VMs when we stop the EAM service on vCenter, but that won’t be a solution if the server gets rebooted. I decided to Google to see how other people delete vCLS when it doesn’t seem possible.

I found this reddit thread, in which they discuss the same thing mentioned above, “Retreat Mode”. However, after setting the required settings (which are apparently tattooed once set), I still couldn’t delete the VMs, even after restarting the vpxd service. Much like ‘bananna_roboto’ I ended up deleting the vCLS VMs from the ESXi host UI directly; however, when checking the vCenter UI they still showed on all the hosts.

After rebooting the vCenter server, all the vCLS VMs were gone, at first, I thought they’d come back, but since the retreat mode setting was applied it seems they do not get recreated. Hence, I will leave Retreat mode enabled as suggested in the reddit thread for now, since I am not using HA or DRS.

So if you want to use EVC in a cluster, but not HA and DRS and would like to skim even more memory from your hosts, while saving on buggy CPU cycles, apparently “Retreat mode” is what you need.

If you do need those features, and you are unable to delete the old vCLS VMs, and restarting the EAM service doesn’t resolve your issue (which it didn’t for me), you may have to open a support case with VMware.

Anyway, I hope this helped someone. Cheers.

USB NICs on ESXi hosts

Quick post here, I wanted to use a USB based NIC to allow one of my hosts to be able to host the firewall used for internet access, this would allow for host upgrades without downtime.

My first concern was the USB bus on the host. Being a bit older, I double checked and, sad days, it was only USB 2.0. Checking my internet speed, it turns out it’s 300 Mbps, and USB 2.0 is 480 Mbps, so while I may only be able to use less than half of the full speed of the gig NIC, it was still within spec of the backend, and thus won’t be a bottleneck.
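The arithmetic behind that reasoning, as a quick Python sketch. The numbers are the nominal rates from this post (480 Mbps for USB 2.0, 1000 Mbps for the gig NIC, 300 Mbps internet); real-world USB 2.0 throughput is lower than the nominal rate because of protocol overhead, so treat this as a best case. The function name bottleneck_mbps is just for illustration.

```python
def bottleneck_mbps(*link_speeds_mbps: float) -> float:
    """A chain of links can go no faster than its slowest link."""
    return min(link_speeds_mbps)


usb2_mbps = 480      # nominal USB 2.0 rate
nic_mbps = 1000      # gigabit NIC
internet_mbps = 300  # internet connection from this post

adapter_path = bottleneck_mbps(usb2_mbps, nic_mbps)
print(adapter_path)                   # 480
print(adapter_path / nic_mbps)        # 0.48, i.e. less than half of gigabit
print(internet_mbps <= adapter_path)  # True, the internet link stays the limit
```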

Now when I plugged in the USB NIC, I sadly was not presented with a new NIC option on the host.

When I googled this I found an awesome post by none other than one of my online heroes, William Lam, in which he states the following:

“With the release of ESXi 7.0, a USB CDCE (Communication Device Class Ethernet) driver was added to enable support for hardware platforms that now leverages a Virtual EEM (Ethernet Emulation Module) for their out-of-band (OOB) management interface, which was the primary motivation for this enhancement.

One interesting and beneficial side effect of this enhancement is that for any USB network adapters that conforms to the CDCE specification, they would automatically get claimed by ESXi and show up as an available network interface demonstrated in my homelab with the screenshot below.”

Then shows a snippet of running a command:

esxcfg-nics -l

Which for me listed the same results as the UI:

Considering I’m running the latest build of 7.x, I guess the device does not “conform to the CDCE specification”.

A bit further in the post he shows running:

lsusb

When run, it shows the device is seen by the host:

Let’s try to install the Flings USB Driver, see if it works.

“This Fling supports the most popular USB network adapter chipsets found in the market. The ASIX USB 2.0 gigabit network ASIX88178a, ASIX USB 3.0 gigabit network ASIX88179, Realtek USB 3.0 gigabit network RTL8152/RTL8153 and Aquantia AQC111U.”

Step 1 – Download the ZIP file for the specific version of your ESXi host and upload to ESXi host using SCP or Datastore Browser. Done

Luckily the error message was clickable, and it provided a helpful hint to navigate to the host, as it may be due to the certificate not being trusted, and sure enough that was the case.

Step 2 – Place the ESXi host into Maintenance Mode using the vSphere UI or CLI (e.g. esxcli system maintenanceMode set -e true)

For some reason the command above never returned at the command line, and I had to enable Maintenance Mode via the UI. Done.

Step 3 – Install the ESXi Offline Bundle (6.5/6.7) or Component (7.0)

For (7.0+) – Run the following command on ESXi Shell to install ESXi Component:

esxcli software component apply -d /path/to/the component zip

For (6.5/6.7) – Run the following command on ESXi Shell to install ESXi Offline Bundle:

esxcli software vib install -d /path/to/the offline bundle zip

and my results:

Ohhh FFS… Google!!!!!! HELP!!!! Only one hit…

The only 2 responses close to an answer are… “Ok I can confirm that if you create a 7u1 ISO and upgrade to that first, you can then add the latest fling module to it. Key bit of info that is not in the installation instructions” and “Workaround: Update the ESXi host to 7.0 Update 1. Retry the driver installation.”

Uhhhh I thought I just updated my hosts to the latest patches… what am I running?

“7.0.3, 21686933″… checking the source Flings page, oh… it’s a dropdown menu… *facepalm*

I downloaded the ESXi 8 version, let me try the 703 one…

ESXi703-VMKUSB-NIC-FLING-55634242-component-19849370.zip

Reboot! and?

Ehhh it worked, I can now bind it to a vSwitch. I hope this helps someone :). I’m also wondering if this will burn me on future ESXi updates/upgrade. I’ll post any updates if it does.
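To avoid repeating my dropdown facepalm, here is a tiny Python sketch that checks whether a downloaded bundle looks like it was built for the host's version. It assumes the filename starts with "ESXi" plus the version digits, which I'm inferring purely from the one filename above; bundles for other releases may well be named differently, and both function names are made up for illustration.

```python
def bundle_prefix(esxi_version: str) -> str:
    """Turn a dotted version such as '7.0.3' into the prefix 'ESXi703'."""
    parts = esxi_version.strip().split(".")
    if len(parts) != 3 or not all(part.isdigit() for part in parts):
        raise ValueError(f"expected a version like '7.0.3', got {esxi_version!r}")
    return "ESXi" + "".join(parts)


def bundle_matches_host(bundle_filename: str, esxi_version: str) -> bool:
    """True when the bundle filename starts with the prefix for this host version."""
    return bundle_filename.startswith(bundle_prefix(esxi_version) + "-")


fling = "ESXi703-VMKUSB-NIC-FLING-55634242-component-19849370.zip"
print(bundle_matches_host(fling, "7.0.3"))  # True
print(bundle_matches_host(fling, "8.0.0"))  # False
```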

Share NTFS USB HDD via SMB on FreeNAS

I’m boiling down an entire night of knowledge as short as possible:

Is it possible? Yes, reference (this post)

Does the internet say it’s possible? No and More, No

Jeff “In the FreeNAS documentation it says using USB attached devices as shares is not allowed.”

Let’s do it anyway. A couple of notes:
*I created an account on FreeNAS named “veeam”, with account ID 1001.

  1. Mounting The USB HDD to FreeNAS:
    Using the “Import Disk” option doesn’t work well:

    1. requires existing zpool aka volume, configured.
    2. when completed doesn’t show files properly.
    3. Mounts Disk in Read Only.
    4. Much like the link shared above we just mount it manually via the backend.
      1. ntfs-3g /dev/da6s1 /mnt/USBHDD/ -o rw,user_allow_other,uid=1001,gid=1000
      2. To make this stick after reboots you have to edit the fstab file. *I haven’t done this yet, when I have and tested it, I’ll update this area.
      3. The command mounts the NTFS volume using FUSE, and you can’t change ownership of files and folders after mounting, only during.
  2. Sharing the Drive via SMB:
    1. Attempting to create a share via the Front End UI will show the path available in the path selector, but it will simply state “This field is required” when trying to create the SMB Share, or you might get “The path None does not exist”.
    2. Symlinking or mounting directly to an existing zpool path that’s already shared via SMB results in failure accessing the drive, and the FreeNAS logs show “smbd: dnssd_clientstub write_all(36) failed -1/53 57 Socket is not connected”.
    3. The above line alone, I went through hell trying to solve. It’s what led me to learning about FUSE and the chown issues and all that jazz. I went down so many rabbit holes I thought I was defeated, till I had one final idea: just like I manually mucked with the backend to get NTFS mounted in RW, maybe I can edit the backend Samba config to share the path, since the front end python scripts were coded to prevent it.
      1. Find the Samba config file:
        /usr/local/etc/smb4.conf
      2. Add a shared path entry:
        [usbhddd] 
            path = "/mnt/USBHDD"
            printable = no
            veto files = /.snapshot/.windows/.mac/.zfs/
            writeable = yes
            browseable = yes
            access based share enum = no
            hide dot files = yes
            guest ok = no
      3. Save the file and restart the Samba Service:
        service samba_server restart
        

When I saw that share path available, and when I double clicked it and I saw the files saved there show up, my jaw dropped!!! I couldn’t believe it worked.

Much like having to manually edit the fstab to get the drive to mount automatically at boot, I have a feeling the smb4.conf file may be overwritten at boot, which may require a cron job script to resolve. Again, I haven’t got to that point yet; I just finished this proof of concept that was, from my research, deemed to be impossible. Yet here I am blogging my success. See below for some info regarding Samba.
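If smb4.conf does get overwritten, the cron job idea could look something like this Python sketch: it appends the share block only when the section is missing, so it is safe to run over and over. This is untested on FreeNAS itself; the function name ensure_share is hypothetical, and whether Python and cron behave this way on your FreeNAS build is an assumption. The demo at the bottom works on a throwaway temp file, not the real config.

```python
import tempfile
from pathlib import Path

# The share block from this post, kept as one string.
SHARE_BLOCK = """[usbhddd]
    path = "/mnt/USBHDD"
    printable = no
    veto files = /.snapshot/.windows/.mac/.zfs/
    writeable = yes
    browseable = yes
    access based share enum = no
    hide dot files = yes
    guest ok = no
"""


def ensure_share(conf_path: Path, share_name: str = "usbhddd",
                 block: str = SHARE_BLOCK) -> bool:
    """Append the share block if its [section] header is missing.

    Returns True when the file was changed, False when the share was already there.
    """
    text = conf_path.read_text() if conf_path.exists() else ""
    header = f"[{share_name}]".lower()
    if any(line.strip().lower() == header for line in text.splitlines()):
        return False
    # Make sure the new block starts on its own line.
    separator = "" if text == "" or text.endswith("\n") else "\n"
    conf_path.write_text(text + separator + block)
    return True


if __name__ == "__main__":
    # Demo against a temporary file so nothing real gets modified.
    with tempfile.TemporaryDirectory() as tmp:
        demo_conf = Path(tmp) / "smb4.conf"
        demo_conf.write_text("[global]\n    workgroup = WORKGROUP\n")
        print(ensure_share(demo_conf))  # True, block appended
        print(ensure_share(demo_conf))  # False, already present
```

On the real box you would point it at /usr/local/etc/smb4.conf and, when it returns True, follow up with service samba_server restart as in the steps above.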

Samba options

Samba for FreeBSD

Key takeaway is that there’s a “link” between the Unix user and the “SMB” user. “FreeBSD user accounts must be mapped to the SambaSAMAccount database for Windows® clients to access the share. Map existing FreeBSD user accounts using pdbedit(8):”

pdbedit -a -u username

Final Note. I did this so I could have Backup Copy Jobs run. The Veeam server is a VM, and this allows the VM to be migrated to other hosts while still being able to do both regular backup jobs and Backup Copy Jobs. And now that the USB drive on FreeNAS is NTFS based, I can just take the drive, plug it into a Windows machine and start restore operations. Having said that, I’m doing this for my HomeLab and it is for educational purposes only.

Here’s a snip of the repo in use via Veeam.