First time Postfix

I set up a new container on Proxmox VE. I did derp out and didn’t realize you had to pre-download templates. It also failed to start, apparently due to no storage space. (You can only see this if you pay close attention when creating the container; it won’t say so when you try to start it. You’d figure creation would simply fail.)

Debian 12, and off to the races…

As usual.. first things first, updates. Classic.

Went to follow this basic guide.
I created a user, and set password, started and enabled postfix service.

I figured I’d just do the old send email via telnet trick.

Which kept saying connection refused. I found a similar post, and found nothing was listening on port 25. I checked the existing config file:

/etc/postfix/main.cf

It seemed there was nothing for smtp like that post mentioned, and adding it manually didn’t seem to help. I did notice that I never got the chance to run the config wizard for Postfix. This guide tells you how to initiate it manually:

sudo dpkg-reconfigure postfix

After running this I was able to see the system listening on port 25:
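For reference, the "is anything listening on port 25?" check can also be sketched in a few lines of Python. This is only a minimal sketch; the function name port_is_listening is made up for illustration, and it assumes you run it on the mail server itself (or point it at the server's address).

```python
import socket


def port_is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds, False otherwise."""
    try:
        # create_connection performs the TCP handshake; a refused or
        # timed-out connection raises OSError.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 25 is the standard SMTP port that the telnet test talks to.
    print("SMTP listening:", port_is_listening("127.0.0.1", 25))
```

A "connection refused" from telnet corresponds to this returning False.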

After which sending email over SMTP via telnet worked.. but where was the email, or the user’s mailbox? mbox style sounds kinda lame, one file for all mail.. yeech…

maildir option sounds much better…

added “home_mailbox = /var/mail/” to my postfix config file, and restarted postfix… now:

well that’s a bit better, but how can I get my mail in a better fashion, like a mailbox app, or web app? Well Web app seems out of the question…
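Until there's a proper mail client in the picture, one low-effort way to peek at a Maildir is Python's standard mailbox module. This is a rough sketch, not a recommendation; the path in the usage line is a hypothetical placeholder, so point it at wherever your Maildir actually landed.

```python
import mailbox


def list_subjects(maildir_path):
    """Return the Subject header of every message in a Maildir."""
    # create=False so a wrong path raises instead of silently
    # creating an empty mailbox.
    box = mailbox.Maildir(maildir_path, create=False)
    try:
        return [msg.get("Subject", "(no subject)") for msg in box]
    finally:
        box.close()


if __name__ == "__main__":
    import sys
    # Hypothetical usage: python3 list_mail.py /path/to/Maildir
    for subject in list_subjects(sys.argv[1]):
        print(subject)
```

It only reads headers, so it's safe to run against a live mailbox.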

If I find a good solution to the mail checking problem I’ll update this blog post. Postfix is alright for an MTA I guess, simple enough to configure. There’s apparently a setup you can do that pairs Postfix, the Mail Transfer Agent (MTA, SMTP), with Dovecot, a secure IMAP and POP3 server and Mail Delivery Agent (MDA). These two open-source applications work well with Roundcube, the web app to check mail. Which seems like a lot to go through…

March 11, 2024

Migrate ESXi VM to Proxmox

I’m going to simulate migrating to Proxmox VE in my home lab.

I saw this YT video comparing the two and it gave me the urge to try it out in my home lab.

In this test I’ll take one host from my cluster and migrate it to use Proxmox.

Step one, move all VMs off target host.
Step two, remove host from cluster.
Step three, shutdown host.

In this case it’s an old HP Folio laptop. Next Install PVE.

Step one Download Installer.
Step two, Burn image or flash USB stick with image.
Step 3 boot laptop into PVE installer.

I didn’t have a network cable plugged in, and in my haste I didn’t pay attention to the bridge’s main physical adapter; it was selected as wlo1, the wireless adapter. I found references to the bridge info being in /etc/network/interfaces, but for some reason changing it there only got pings to work. All other ports and services seemed completely unavailable. Much like this person, I simply did a reinstall (this time minding the physical port in the network config). Then I got it working.
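If you want to catch this before it bites, a small script can flag any bridge in /etc/network/interfaces that points at a wireless adapter. This is a minimal sketch under a couple of assumptions: that the file uses the usual "iface <name> ..." stanzas with a bridge-ports (or bridge_ports) line, and that wireless interfaces are named wl-something like the wlo1 above. The function names are made up for illustration.

```python
def bridge_ports(interfaces_text):
    """Map each bridge name to the list of ports on its bridge-ports line."""
    bridges = {}
    current = None
    for raw in interfaces_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if parts[0] == "iface" and len(parts) >= 2:
            current = parts[1]          # start of a new stanza
        elif parts[0] in ("bridge-ports", "bridge_ports") and current:
            bridges[current] = parts[1:]
    return bridges


def wireless_bridges(interfaces_text):
    """Return {bridge: [wireless ports]} for bridges that use a wl* port."""
    flagged = {}
    for bridge, ports in bridge_ports(interfaces_text).items():
        wl = [p for p in ports if p.startswith("wl")]
        if wl:
            flagged[bridge] = wl
    return flagged


if __name__ == "__main__":
    with open("/etc/network/interfaces") as fh:
        print(wireless_bridges(fh.read()))
```

An empty result means no bridge is tied to a wl* interface.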

First issue I had was it popping up saying Error Code 100 on apt-get update.

Using the built in shell feature was pretty nice, use it to follow this to change the sources to use no-subscription repos.

The next question was, how can I set up another IP that’s VLAN tagged?

I thought I had it when I created a “Linux VLAN”, gave it an IP within that subnet and tagged the VLAN ID. I was able to get ping replies, even from my machine in a different subnet. I couldn’t define the gateway since it stated one was already defined on the bridge, which makes sense for a single stack. I figured it was because ICMP is connectionless (it’s its own IP protocol, not TCP or UDP) and doesn’t rely on the same return path the way a TCP session handshake does, and this was probably why the web interface was not loading. I verified this by connecting a different machine into the same subnet and it loaded the web interface fine, further validating my assumptions.

However when I removed the gateway from the bridge and provided the correct gateway for the VLAN subnet I defined, the web interface still wasn’t loading from my machine in the other subnet. Checking the shell in the web interface I see it lost connectivity to anything outside its network (I guess the gateway change didn’t apply properly, or it’s some other ignorance on my part on how Proxmox works).

I guess I’ll leave the more advanced networking for later. (I don’t get why all other hypervisors get this part so wrong/hard, when VMware makes it so easy, it’s a checkbox and you simply define the VLAN ID in, it’s not hard…) Anyway I simply reverted the gateway back to the bridge. Can figure that out later.

So how to convert a VM to run on ProxMox?

Option 1) Manually convert from VMDK to QCOW2

Option 2) Convert to OVF and deploy that.

In both options it seems you need a mid point to store the data. In option 1 you need to use local storage on a Linux VM, almost twice it seems once to hold the VMDK, and then enough space to also hold the QCOW2 converted file. In option 2 the OP used an external drive source to hold the converted OVF file on before using that to deploy the OVF to a ProxMox host.

I decided to try option 1. So I spun up a Linux machine on my gaming rig (since I still have Workstation and lots of RAM and a spindle drive with lots of storage). I picked Fedora Workstation, and installed openssh-server, then (after a while, realizing I had to open the firewall on the ESXi server for outbound SSH) transferred the vmdk to the Fedora VM:

106 MB/s not bad…

Then installed the tools on the fedora VM:

yum install -y qemu-img

NM it was already installed and converted it…
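The conversion itself is a single qemu-img call. Here's a small sketch that just builds and prints that command; the file names are hypothetical placeholders, and the helper name qemu_img_convert_cmd is made up for illustration. The flags used are -p (progress), -f (source format) and -O (output format).

```python
import shlex


def qemu_img_convert_cmd(src_vmdk, dst_qcow2):
    """Build the qemu-img argv that converts a VMDK into a QCOW2 image."""
    return [
        "qemu-img", "convert",
        "-p",            # show progress
        "-f", "vmdk",    # source format
        "-O", "qcow2",   # output format
        src_vmdk, dst_qcow2,
    ]


if __name__ == "__main__":
    # Hypothetical file names; print the command rather than running it.
    print(shlex.join(qemu_img_convert_cmd("win.vmdk", "win.qcow2")))
```

Remember the point made earlier: you need room for both the source VMDK and the converted QCOW2 at the same time.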

On Proxmox I couldn’t figure out where the VM files were located (“lvm-thin” by default install). I found this thread and did the same steps to get a path available on the PVE host itself. Then used scp to copy the file to the PVE server.

After copying the file to the PVE server, ran the commands to create the VM and attach the hdd.
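The exact commands aren't shown here, so the following is only a sketch of what that sequence typically looks like with Proxmox's qm tool: create an empty VM, import the disk into a storage, then attach it. Everything specific is an assumption for illustration: the VM ID, name, memory size, the local-lvm storage, the vmbr0 bridge, the e1000 NIC model, and the vm-<id>-disk-0 volume name (qm importdisk prints the real name when it finishes). The IDE bus is used because of the boot trouble described just below.

```python
def proxmox_import_cmds(vmid, name, disk_path,
                        storage="local-lvm", bridge="vmbr0", memory_mb=2048):
    """Return the qm command lines (as argv lists) for a manual disk import."""
    vmid = str(vmid)
    return [
        # 1) Create an empty VM shell with one NIC on the bridge.
        ["qm", "create", vmid, "--name", name,
         "--memory", str(memory_mb), "--net0", "e1000,bridge=" + bridge],
        # 2) Import the converted image into the target storage.
        ["qm", "importdisk", vmid, disk_path, storage],
        # 3) Attach the imported volume (name assumed, check importdisk output).
        ["qm", "set", vmid, "--ide0", storage + ":vm-" + vmid + "-disk-0"],
    ]


if __name__ == "__main__":
    # Hypothetical values; prints the commands instead of running them.
    for argv in proxmox_import_cmds(100, "migrated-vm", "/root/win.qcow2"):
        print(" ".join(argv))
```

Printing first and running by hand makes it easy to fix the volume name if your storage names it differently.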

After which I tried booting the VM and it wouldn’t catch the disk and failed to boot, then I switched the disk type from SCSI to SATA, but then the VM would boot and then blue screen, even after configuring safe mode boot. I found my answer here: Unable to get windows to boot without bluescreen | Proxmox Support Forum

“Thank you, switching the SCSI Controller to LSI 53C895A from VirtIO SCSI and the bus on the disk to IDE got it to boot”.

I also used this moment to uninstall VMware tools.

Then I had no network, and realized I needed the VirtIO drivers.

If you try to run the installer it will say needs Win 8 or higher, but as pvgoran stated “I see. I wasn’t even aware there was an installer to begin with, I just used the device manager.”

That took longer than I wanted and took a lot of data space too, so not an efficient method, but it works.

February 13, 2024

Repurposing a Blackberry Playbook

I’ll keep this post short for now, but basically, I soft bricked a Playbook by factory resetting it without knowing that the OOBE called home to BB servers to grab the EULA you have to agree to in order to get through the OOBE.

I kept it in hopes that one day someone would figure out a way past it. Then I found this YT video by someone who runs the channel Gold Screw.

I used an old Windows 7 laptop from wayback, good enough to do the needful:

Download and install Blackberry Desktop Software. (just need to check off Device Drivers)
Download Darcy’s Blackberry Tools
BlackBerry 10.0.4.197 Alpha for BlackBerry Playbook

Then:

Open DBBT
pick Build Autoloader
fill field one with 10.0.4.197 File (the Desktop one)
enter something into the New Autoloader Name field.
Click build it.
run the new exe from a cmd prompt.
Plug in BB Playbook to PC via USB cable. (it should automatically detect and write the 10.0.4.197 OS build onto the Playbook)
reboot the Playbook.
in the new OOBE go back n forth pressing skip as fast as possible, eventually it gets past both the EULA and the User ID part.

To my amazement it worked! Then he has follow up video on how to install the apps.

I wasn’t keen on having to use an old copy of Chrome, but I can def understand sticking to a version you know works. You can watch his video on how to accomplish that after you’ve saved your BB Playbook from the trash.

Here’s the whole list of apps.

October 21, 2023

Spammed via BCC

Well, whenever I’d check my local email, I noticed a large amount of spam and junk getting sent to my mailbox. The problem was the spammers were utilizing a trick of using BCC, aka Blind Carbon Copy. This means that the actual users it was all sent to (in a bulk massive send, no less) were all hidden from all people that received the email.

Normally people only have one address associated with their mailbox, and thus it would be obvious which address it was sent to, and getting these to stop outside of other technical security measures can be very difficult. It’s very similar to a real-life person who knows where you live and is harassing you, secretly at night, by constantly egging your house. You can’t ask them to stop since you don’t know who they are, and you can’t really use legal tactics for the same reason. So you have to rely on other means: either identify the person, or simply move. Both are tough.

In my case I use multiple email addresses when signing up for stuff so if one of those service providers get hacked or compromised, I usually can simply remove the leaked address from my list of email addresses.

However, because the spammer was using BCC, the actual To address was changed to a random address.

Take a look at this example, as you can see, I got the email, but it was addressed to jeff.work@yorktech.ca. I do not own this domain so to me it was clearly forged. However, that doesn’t help me in determining which of the multiple email addresses had been compromised.
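The reason the To line is useless here is that SMTP delivery goes by the envelope recipients (RCPT TO), not by the headers inside the message. Here's a minimal sketch of that split. The To address is the forged one from the example above, while the alias address, the sender and the helper names are hypothetical placeholders.

```python
from email.message import EmailMessage
from email.utils import getaddresses


def header_recipients(msg):
    """Addresses a reader can see in the To and Cc headers."""
    fields = [str(v) for v in msg.get_all("To", []) + msg.get_all("Cc", [])]
    return {addr.lower() for _, addr in getaddresses(fields) if addr}


def hidden_recipients(msg, envelope_rcpts):
    """Envelope recipients that appear in no visible header (i.e. BCC'd)."""
    visible = header_recipients(msg)
    return [r for r in envelope_rcpts if r.lower() not in visible]


if __name__ == "__main__":
    msg = EmailMessage()
    msg["From"] = "spammer@example.net"        # hypothetical sender
    msg["To"] = "jeff.work@yorktech.ca"        # forged To from the example
    msg.set_content("junk")
    # What the mail server was actually told to deliver to (hypothetical alias).
    envelope = ["donation-alias@example.com"]
    print(hidden_recipients(msg, envelope))
```

The message body never names the alias, which is why the server-side logs are the place to look.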

I figured I’d simply use EAC and check the mail flow section, but for some reason it would always return nothing (broken?).

Sigh, lucky for me there’s the internet, and a site called practical365 with an amazing Exchange admin who writes amazing posts and goes by the name Paul Cunningham. This was the post that helped me out: Searching Message Tracking Logs by Sender or Recipient Email Address (practical365.com)

In the first image you can see the sender address, using this as a source I provided the following PowerShell command in the exchange PowerShell window:

Get-MessageTrackingLog -Sender uklaqfb@avasters.nov.su

Oh, there we go, the email address I created for providing a donation to the Heart n Stroke Foundation. So, I guess at some point the Heart n Stroke Foundation had a security breach. Doing a quick Google search, wow, huh, sure enough, it happened 3 years ago….

Be wary of suspicious messages, Heart and Stroke Foundation warns following data breach | CTV News

Data security incident and impact on Heart and Stroke constituents | Heart and Stroke Foundation

This is what I get for being a nice guy. Lucky for me I created this email alias, so for me it’s as simple as deleting it from my account, since I do not care for any emails from them at this point. Fuck ’em! Can’t even keep our data safe; that’s the last donation they get from me.

Sadly, I know many people can’t do this same technique to help keep their data safe. I wish it was a feature available with other email providers, but I can understand why they don’t allow this as well as email sprawl would be near unmanageable for a service provider.

Hope this post helps someone in the same boat.

October 6, 2023 (updated April 17, 2024)

Configure Certificate-Based Administrator Authentication on a Palo Alto Networks Firewall

Source

As a “more secure” alternative to password-based authentication to the firewall web interface, you can configure certificate-based authentication for administrator accounts that are local to the firewall. Certificate-based authentication involves the exchange and verification of a digital signature instead of a password.

Configuring certificate-based authentication for any administrator disables the username/password logins for all administrators on the firewall; administrators thereafter require the certificate to log in.

To avoid any issues I created a snapshot of the PA VM. This took out my internet for roughly 30 seconds or so.

Step 1) Generate a certificate authority (CA) certificate on the firewall.
You will use this CA certificate to sign the client certificate of each administrator.
Create a Self-Signed Root CA Certificate.
Alternatively, Import a Certificate and Private Key from your enterprise CA or a third-party CA.

I do have a PKI I can use but no specific key-pair that’s nice for this purpose, so for the ease of testing I’ll create a local CA cert on the PAN FW.

Step 2) Configure a certificate profile for securing access to the web interface.
Configure a Certificate Profile.
Set the Username Field to Subject.
In the CA Certificates section, Add the CA Certificate you just created or imported.

Now for ease of use and testing I’m not defining CRL or OCSP.

Step 3) Configure the firewall to use the certificate profile for authenticating administrators.
Select Device -> Setup -> Management and edit the Authentication Settings.
Select the Certificate Profile you created for authenticating administrators and click OK.

Step 4) Configure the administrator accounts to use client certificate authentication.
For each administrator who will access the firewall web interface, Configure a Firewall Administrator Account and select Use only client certificate authentication.
If you have already deployed client certificates that your enterprise CA generated, skip to Step 8. Otherwise, go to Step 5.

Step 5) Generate a client certificate for each administrator.
Generate a Certificate. In the Signed By drop-down, select a self-signed root CA certificate.

Step 6) Export the client certificate.
Export a Certificate and Private Key. (I saved as PKCS12, with a password)
Commit your changes. The firewall restarts and terminates your login session. Thereafter, administrators can access the web interface only from client systems that have the client certificate you generated.

File was in my downloads folder.

Step 7)Import the client certificate into the client system of each administrator who will access the web interface.

Refer to your web browser documentation. I am using windows, so I’m assuming the browser (Edge) will use the windows store, so I installed it to my user cert store by simply double clicking the file and providing the password in the import wizard prompt. Then checked my local user cert store.

Time to commit and see what happens…

as soon as I committed I got a prompt for the cert:

If I open a new InPrivate window and don’t offer the certificate I get blocked:

If I provide the certificate the usual FBA login page loads.

So now any access to the firewall requires the use of this key, plus known login creds. Though the notice stated it “disables the username/password logins for all administrators on the firewall”, my testing showed that not to be true; it simply locks down access to the FBA page, requiring the use of the created certificate.

Using Internal PKI

Let’s try to set this up, but instead of self signed, let’s try using an internal PKI, in this case Windows PKI using Windows based CAs.

Pre-reqs: it is assumed you already have a Windows domain, PKI and CA all already configured. If you require assistance please see my blog post on how to set such an environment up from scratch here: Setup Offline Root CA (Part 1) – Zewwy’s Info Tech Talks

This post also assumes you have a Palo Alto Networks firewall in which you want to secure the mgmt web interface with increased authentication mechanisms.

Step 1) Import all certificates into the PA firewall so it shows a valid stack:

Step 3) Generate a client certificate for each administrator.
Generate a Certificate. In the Signed By drop-down, select a CSR.

Now I’m not 100% certain how this all works, so I made the common name and SAN the same as the local admin account I wanted to secure.

Then export the CSR, sign it with your internal CA server, and import it back into the PA firewall. In my case I decided (for testing purposes and simply due to pure ignorance) to create the certificate using the Web Server Template, even though I know this is going to be a certificate used for user authentication. *shrug* The final result should look like this:

Step 4) Configure the firewall to use the certificate profile for authenticating administrators. Pick the Cert profile created in Step 2.
Select Device -> Setup -> Management and edit the Authentication Settings.
Select the Certificate Profile you created for authenticating administrators and click OK. (At this point I recommend to not commit until at least the certificate created earlier is exported.)

Step 5) Configure the administrator accounts to use client certificate authentication.
For each administrator who will access the firewall web interface, Configure a Firewall Administrator Account and select Use only client certificate authentication.

This is where things start to feel weird in the whole process of this stuff…. It seems as soon as you check this checkbox off, the password fields disappear:

Before:

After:

Which makes it seem like it just changes the account from password based to just certificate based, and not 2FA as expected. On top of that, why can’t I specify which certificate to use? Does this mean any certificate that exists within the PA store is good enough? I guess I’ll have to test to see if that’s the case anyway…

Step 6) Export the client certificate.
Export a Certificate and Private Key. (I saved as PKCS12, with a password)

Step 7) Commit, and watch it be like before, where the web login won’t even show an FBA page until you present a certificate first. Which again seems like the firewall doesn’t associate certain certificates with certain users, instead it seems to lock down the FBA page to require ANY certificate (with key?) that is configured or signed by the CA’s specified in the Certificate profile.

Which seems like such a dumb design. It would be way better if, when you check off the certificate based option for a user, you had to pick which cert. Then instead of blocking the FBA page as a whole, when that user’s credentials are entered into the FBA page, it would check/ask for the certificate selected in the user creation process.

I seemed to be getting stuck at 400 bad request even with the certificate in my personal store. My only guess is it’s due to the point I mentioned, that I picked the web server template when I signed the certificate, so as you can see client auth is missing from the usage field:

Say I didn’t make a snapshot (or you may be running a physical firewall), how do I fix this? Well… access the console directly (for a VM use the hypervisor console; for a physical unit use the console port), or if you configured SSH access, SSH in, and revert the config. I figured “load config last-saved” would have worked, but it didn’t. I guess last-saved is the running config, so the command feels useless to me. I could be missing something on that. Instead I had to pick a config from a couple months ago. The first time around it didn’t state anything about restarting the web mgmt service, but when picking the older config it does:

This must be cause of the Cert Profile binding option in the auth section of the mgmt settings. Further validating my assumptions on the design choice.

Now I was able to log back in to the MGMT web interface, load the config with all my work on it (so I didn’t have to redo all the steps above). Let’s simply recreate the “user” cert but using a client template, and see how that goes…

1) Delete the old cert (check)

2) Create new cert (check)
This time, no additional fields (not even a SAN):

Signed using User template:

Import it into the firewall… (Check) (No clue where that TLSv1.3 cert came from…)

Export it from the firewall… (check)

Import into client machine, user’s personal store.. (check) (Interesting: it shows as assigned to the admin account that requested the certificate)

Double check the Mgmt auth settings (check), so only main difference is the client cert now and… error 400… ****

I reverted again, after which I loaded the config above again, but this time changing the cert profile selected on the mgmt auth section to be the self signed one that worked in the original posting I made about this stuff. Oddly enough, after commit I couldn’t get the web interface to load in my regular web browser (400 error), but with an incognito/in-private window I got the prompt for the admin cert and I got the FBA page.

So for testing one last time to get the internal PKI cert to work, I decided to make one last change. When I made the certificate I specified the subject name to be that of the account (in this case I had an account on the PA firewall of akamin). I also decided to use the template I created for making user certs for Global Protect, which were templated for client auth. The final results on the PA looked like this:

and exporting, and importing into client machine cert store looked like this:

As you can see this looks much cleaner than all my previous attempts, and shows everything assigned to the user we want to log in as. The only other change was I created another Certificate Profile, but did not check off any of the Blocked options. Once I committed this change I got a 400 on my regular web browser, but opening an in-private window I got:

Finally! Picking it we can see it auto populated the username:

However don’t be fooled by this: I was easily able to change the name in the field and log in as another user. In this case I changed the name to another local admin, entered the password of that user and logged in just fine. Further validating that all it’s doing is restricting access to the FBA page to anyone who has any cert signed by the CAs listed in the Certificate Profile.

Now I want to figure out the regular browser 400 error problem so I don’t have to open an in-private window each time. Usually this means just cleaning the cache, but when picking what to clear I picked last hour and everything but browsing history, that didn’t work. Reboot did work.

The next task is to see if I could load this certificate onto a YubiKey, and be able to use its ability to act as a certificate key store.

Yubikey

Source

First annoying thing this source is missing is that you need the YubiKey MiniDriver installed in order to complete this task.

The next thing that burnt me was when I went to import the certificate, it kept saying my PIN was wrong. Which first led to my PIN becoming blocked, which led me to reading all this stuff. Since my PIN was locked after 3 attempts, I nearly locked the PUK too, as the first two entries were wrong. Lucky me, it was set to the default, and I managed to unblock the PIN. I then managed to set a new PUK and PIN. I did this by using ykman, which was available in Fedora’s native repo, so I did this using a Fedora live image.

What I don’t get is, is there a different pin for WebAuthN vs this one for certificates? It seems like it, cause even when the certificate pin was blocked my WebAuthN was still working.

Back to my Windows machine

Plug in Yubikey.. then:

certutil -csp "Microsoft Base Smart Card Crypto Provider" -importpfx C:\Path\to\your.pfx

When prompted, enter the PIN. If you have not set a PIN, the default value is 123456.

This time it worked, yay…

OK, now how to test this…. I’ll try to access the mgmt web interface from a random computer, one that’s not the one I tested above which has the key already installed in the user’s Windows cert store. Mhmmm what do I have… how about my old-ass Acer Netbook running Windows 7 32-bit, there’s no way that’ll work… would it…

I try to access the web console and sure enough 400 bad request… plug in the YubiKey…

There’s no way….

No freaking way…. and try to access the web mgmt…

No way! it actually worked, that’s unbelievable!

enter the pin I configured above (not the WebAuthN Pin)…

crazy… I can’t believe that works… so yeah this is a feasible solution, but it’s still not as good as WebAuthN, which I hope will be supported soon.

Weird… I went to access the web interface from my machine that has the cert in my cert store, but now it seems to want the yubikey even though the cert is in my user store, I tried an in-private window but same problem… do I have to reboot my machine again? Fuck no, that didn’t work either… like WTF.

Tried another browser, Chrome, SAME THING! It’s like when running the command to import a certificate into a YubiKey it overrides the one on the local store and always asks for the YubiKey when picking that certificate. Which doesn’t make any sense…

I grabbed the cert, imported into the user store on another machine, and bam it works as intended… it just seems on the machine in which you import it into a yubi then it always wants the yubikey on that machine, regardless of the certificate being in the users cert store… which still doesn’t make sense…

OK, so I deleted the certificate from my user cert store, re-imported it, open an in-private window and now it accepted the cert without asking for the YubiKey. I still don’t understand what’s going on here…. but that fixed it….

Things I still don’t understand though… if I set this user option to require certificate and the password fields disappear in the PAN Web mgmt interface, then why is it still asking for a password for the user? Why is a certificate required before login if there’s a toggle for certificate based login on the user’s setting? Wouldn’t it make more sense that the Web UI stays available and once you enter your creds then based on the creds entered the PAN OS looks up if that check box is enabled, and then ask for the certificate? And you’d have to configure which certificate in the User settings so that it actually ties a specific user to a specific cert, so you don’t have any cert is good for any admin? So many questions…. so little answers…

Wait a second, I can’t remember this users password, and I can’t login, ah nuts I made a typo in the cert.. FFS man…

What makes it even dumber is it states No auth profile found, but what it really means is that the user doesn’t exist. Now instead of mucking around creating a new cert, import/export/import and all that jazz, let’s create a user akamin, check off Cert based (which means no password set) and let’s see what happens…

Oh interesting….

Now that the user was properly defined as the common name when the cert was created, you can no longer specify a user account, and it forces the one specified. But if this is the case, how does an admin log in who isn’t defined to use certificate based login? This does make sense of which user is supposed to use which cert without having to define it in the user’s setting. However, it doesn’t explain the forced certificate requirement before the FBA page, or how admins not configured for certificate based login can even log in now.
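To keep my own head straight, here is a toy model of the behaviour observed in this post. To be loud about it: this is NOT how PAN-OS is implemented, just a summary of what the testing above showed, written as code. The function and argument names are made up, and it only covers the three cases that were actually tried.

```python
def observed_web_ui_behaviour(cert_cn, cert_signed_by_profile_ca, admin_names):
    """Summarise what the mgmt web UI did in each case tested in this post.

    cert_cn: common name on the presented client cert, or None if no cert.
    cert_signed_by_profile_ca: True if a CA in the Certificate Profile signed it.
    admin_names: set of admin account names that exist on the firewall.
    """
    if cert_cn is None or not cert_signed_by_profile_ca:
        # No acceptable cert offered: the login page never loads.
        return "400 error, no login page"
    if cert_cn in admin_names:
        # CN matches an existing admin: the username field is forced.
        return "login page, username forced to " + cert_cn
    # CN matches nobody (e.g. a typo): username can be changed freely.
    return "login page, username editable"


if __name__ == "__main__":
    admins = {"admin", "akamin"}
    print(observed_web_ui_behaviour(None, False, admins))
    print(observed_web_ui_behaviour("akamin", True, admins))
    print(observed_web_ui_behaviour("typo-name", True, admins))
```

The third case is the odd one: a cert whose name matches no account still opens the login page for password-based admins.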

¯\_(ツ)_/¯

I lost my keys… what do I do?

If you have the default admin account and left it as normal (no cert requirement), you can sign in via SSH or direct console and remove the config from the auth settings:

configure
delete deviceconfig system certificate-profile
commit

That should be all to get back to normal web login, but you’d still need to have an account configured without the certificate checkbox in that user’s settings.

It seems like this can work as long as you leave the default admin account configured for regular auth (username and password). Maybe you can still make it work as long as there’s a lot of certs and redundancy. I haven’t exactly tested that out.

~~OK so above I simply reverted cause it was the only change I had. This only works in two conditions:~~

~~You know exactly when the change was implemented.~~
~~You have the latest running config saved.~~

Thinking about this I think the latest running will always be there so you just have to know when the change was implemented. revert to that, then load the last running, and turn off the cert profile on the mgmt auth settings area.

~~but what if you don’t know when that was made, well let’s see if we can make the change via the CLI…~~

~~So I found the location on where to set it….~~

~~set deviceconfig system certificate-profile~~

~~I can’t seem to set it to null… I found a similar question here, which only further validated my concern above about other admins who aren’t configured for cert login…~~

~~“However at the very beginning of the Web Page I can read:~~

“Configuring certificate-based authentication for any administrator disables the username/password logins for all administrators on the firewall; administrators thereafter require the certificate to log in.””

Unfortunately, the linked source is dead, but I’m sure it’s still in play. With the thread having no real answer to the question, it seems my only option is the steps I did before… revert to a config before it was implemented, load the old running config, and within the web UI remove the Cert profile, which totally fucking sucks ass…. However, as we discovered, if we configure a cert with a common name that isn’t a user on the PAN, then we can use that to access the FBA page with accounts that are not checked off with the setting in the users’ setting. I wonder if this is something that wasn’t intended and I discovered it simply by chance which kinda shows the poor implementation design here.

~~I think I covered everything I can about this topic here… Now since this account I created was a superuser (read only) and now that the user exists… I’ll revert…. or wait….~~

~~maybe I can delete that user, and then go back to just needing the cert and I can sign in with another account to fix this… let’s try that LOL.~~

~~Finally direct guidance! Woo~~

~~and….~~

~~No matter what browser or machine I try to connect to it, it just error 400.~~

~~This… shit… sucks.~~

~~Can you make this idea work… yes.~~

Can you fix this if you lose all your keys? Not easily. You’d have to know the exact commit the change was made in, and if there were other changes made after that, they would temporarily not be applied during the recovery period.

Facepalm…. I don’t know why I didn’t think of this sooner… you don’t use set, you use delete in the cli to set it to none.

August 22, 2023

Upgrading Windows 10 2016 LTSB to 2019 LTSC

*Note 1* – This retains the Channel type.
*Note 2* – Requires a new Key.
*Note 3* – You can go from LTSB to SA, keeping files if you specify new key.
*Note 4* – LTSC versions.
*Note 5* – Access to ISO’s. This is hard and most places state to use the MS download tool. I however, managed to get the image and key thanks to having a MSDN aka Visual Studio subscription.

I attempted to grab the 2021 Eval copy and ran the setup exe. When it got to the point of wanting to keep existing files (aka upgrading) it would grey them all out… 🙁

So I said no to that, and grabbed the 2019 copy which when running the setup exe directly asks for the key before moving on in the install wizard… which seems to let me keep existing files (upgrade) 🙂

My enjoyment was short lived when I was presented with a nice Windows update failed window.

Classic. So the usual, “sfc /scannow”

Classic. So fix it, “dism /online /cleanup-image /restorehealth”

Stop, Disable Update service, then clear cache:

Scan system files again, “sfc /scannow”

Reboot, make sure the system still boots fine (check), do another sfc /scannow, which returns 100% clean. Run Windows Update (after enabling the service), which comes back saying 100% up to date. Run installer….

For… Fuck… Sakes… what logs are there for this dumb shit? Log files created when you upgrade Windows 11/10 to a newer version (thewindowsclub.com)

setuperr.log

Same as setupact.log

Data about setup errors during the installation.

Review all errors encountered during the installation phase.

Coool… where is this dumb shit?

Log files created when an upgrade fails during installation before the computer restarts for the second time.

C:\$Windows.~BT\Sources\panther\setupact.log
C:\$Windows.~BT\Sources\panther\miglog.xml
C:\Windows\setupapi.log
[Windows 10:] C:\Windows\Logs\MoSetup\BlueBox.log

OK checking the log…..

Lucky me, something exists as documented, count my graces, what this file got for me?

PC Load letter? WTF does that mean?! While it’s not listed in this image it must have been resolved but I had a line that stated “required profile hive does not exist” in which I managed to find this MS thread of the same problem, and thankfully someone in the community came back with an answer, which was to create a new local temp account, and remove all old profiles and accounts on the system (this might be hard for some, it was not an issue for me), sadly I still got, Windows 10 install failed.

For some reason the next one that seems to stick out like a sore thumb for me is “PidGenX function failed on this product key”. Which led me to this thread all the way back from 2015.

While there’s a useless comment by “SaktinathMukherjee”, don’t be this dink saying they downloaded some third party software to fix their problem, gross negligent bullshit. The real hero is a comment by a guy named “Nathan Earnest” – “I had this same problem for a couple weeks. Background: I had a brand new Dell Optiplex 9020M running Windows 8.1 Pro. We unboxed it and connected it to the domain. I received the same errors above when attempting to do the Windows 10 upgrade. I spent about two weeks parsing through the setup error logs seeing the same errors as you. I started searching for each error (0x800xxxxxx) + Windows 8.1. Eventually I found one suggesting that there is a problem that occurs during the update from Windows 8 to Windows 8.1 in domain-connected machines. It doesn’t appear to cause any issues in Windows 8.1, but when you try to upgrade to Windows 10… “something happened.”

In my case, the solution: Remove the Windows 8.1 machine from the domain, retry the Windows 10 upgrade, and it just worked. Afterwards, re-join the machine to the domain and go about your business.

Totally **** dumb… but it worked. I hope it helps someone else.”

Again, I’m free to try stuff, so since I was testing I cloned the machine and left it disconnected from the network, then under computer properties changed from domain to workgroup (which means it doesn’t remove the computer object from AD, it just removes itself from being part of a domain). After this I ran another sfc /scannow just to make sure no issue happened from the VM cloning, with 100% green I ran the installer yet again, and guess what… Nathan was right. The update finally succeeded, I can now choose to rename the PC and rejoin the domain, or whatever, but the software on the machine shouldn’t need to be re-installed.

Another fun dumb day in paradise, I hope this blog post ends up helping someone.

July 24, 2023

Move Linux Swap and Extend OS File System

Story

So, you go to run updates, in this case some Linux servers. So, you dust off your old dusty fingers and type the blissful phrase, “apt update” followed by the holier than thou “apt upgrade”….

You watch as the text scrolls past the screen in beautiful green text console style, as you whisper, “all I see is blonde, and brunette”, having seen the same text so many times you glaze over them following up with “ignorance is bliss”.

Your sweet dreams of living in the Matrix come to a halt as instead of success you see the dreaded red text on the screen and realize the Matrix has no red text. Shucks this is reality, and the update has just failed.

Reality can be a cruel place, and it can also be unforgiving. In this case the application that failed to update is not the problem (I mean you could associate blame here if the devs and maintainers didn’t do any due diligence on efficiencies, but I digress), the problem was simply the problem as old as computers themselves: “Not enough storage space”.

Now, you might be wondering at this point… what does this have to do with Linux Swap?!?!? Like any good ol’ storyteller, I’m gettin’ to that part. Now where was I… oh yes, that pesky no space issue. Now normally this would be a very simple endeavor, either:

A) Go clear up needless crap.
Trust me I tried, ran apt autoclean, and apt clean. Looked through the File System, nothing was left to remove.

B) Add more storage.
This is the easiest route: if virtualized, simply expand the VM’s HDD on the host that’s serving it, or if physical, dd the contents to a drive on a similar bus but with more capacity.

Lucky for me the server was virtual, now comes the kicker, even after expanding the hard drive, the Linux machine was configured to have a partition-based Swap. In both situations, virtual and physical, this will have to be dealt with in order to expand the file system the Linux OS is utilizing.

Swap: What is it?

Swap is space on a hard drive reserved for temporarily holding memory when a new request for memory comes in and there is no more actual RAM (Random Access Memory) available for it. The system simply takes less recently accessed memory and just kind of “shoves it under the rug” to be cleaned up or used later.

If you were running a system with massive amounts of memory, you could, in theory, run without swap, just remove it and life’s good. However, in lots of cases memory is a scarce commodity vs something like hard drive storage, the difference is merely speed.

Anyway, in this case I attempted to remove swap entirely (steps will be provided shortly). However, this system was provisioned with just enough RAM that several MB of memory were actually sitting in swap, so when I removed the swap and all the services began to spin up, the VM became unusable: running commands would return that they were unable to allocate memory. So instead, the swap was simply changed from a partition to a file-based swap.

Step 1) Stop Services

This step may or may not be required; it depends on your system’s current resource allocations. If you’re in the same boat I was in, where commands won’t run because the system is at max memory usage, then this is needed to ensure the system doesn’t become unusable during the transition, as swap has to be disabled for a short time.

The commands to stop services will depend on both the Linux distro used and the service being managed. This is beyond the scope of this post.

Step 2) Verify Swap

Run the command:

swapon -s

This is an old Linux machine I plan on decommissioning, but as we can see here, it’s a shining example of a partition-based swap, and the partition it’s assigned to: /dev/sda3. We can also see some of the swap is actually used. During my testing I found Linux wouldn’t disable swap if it is unable to allocate physical memory for its content, which makes sense.
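If you would rather not eyeball that listing (say, when checking several servers), here is a minimal sketch that pulls out just the partition-backed entries. It assumes the usual five-column layout that swapon -s prints (Filename, Type, Size, Used, Priority); the function name is made up for illustration.

```shell
# Print "device used_KiB" for every partition-type swap entry.
# Reads a "swapon -s" (or /proc/swaps) listing on stdin and skips the header line.
swap_partitions() {
    awk 'NR > 1 && $2 == "partition" { print $1, $4 }'
}

# Usage on a live system:
#   swapon -s | swap_partitions
```

An empty result means nothing partition-based is active, which is the state we want to end up in.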

Step 3) Create Swap File

Create the Swap file before disabling the current partition swap or apparently the dd command will fail due to memory buffers.

dd if=/dev/zero of=/swapfile count=1 bs=1GiB

This also depends on the size of your old swap, change the command accordingly based on the size of the partition you plan to remove. In my case roughly a Gig.

chmod -v 0600 /swapfile
chown -v root:root /swapfile
mkswap /swapfile
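A likely reason dd trips over memory: it allocates one buffer of bs bytes, so bs=1GiB wants a full GiB of RAM in one go. Writing in 1 MiB blocks sidesteps that. The sketch below is one way to size the file from the old partition instead of guessing; the function names are hypothetical, and the size argument is the Size column from swapon -s (in KiB).

```shell
# Round a size in KiB (the unit "swapon -s" reports) up to whole MiB.
kib_to_mib_ceil() {
    echo $(( ($1 + 1023) / 1024 ))
}

# Create, lock down and format a swap file in 1 MiB blocks,
# so dd only ever needs a 1 MiB buffer.
# Not called here; run as root, e.g.: make_swapfile /swapfile 1046524
make_swapfile() {
    path=$1
    count=$(kib_to_mib_ceil "$2")
    dd if=/dev/zero of="$path" bs=1M count="$count" &&
        chmod 0600 "$path" &&
        mkswap "$path"
}
```

The chain stops at the first failing command, so a dd that runs out of disk space never reaches mkswap.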

Step 4) Disable Swap

Now it’s time for us to disable swap so we can convert it to a file-based version. If it states it can’t move the data to memory because memory is full, go back to step 1, which was to stop services to make room in memory. If this can’t be done due to service requirements, then you’d have to schedule a maintenance window, since without enough memory on the host service interruption is inevitable… Mr.Anderson.

swapoff /dev/sda3

easy peasy.
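Since swapoff refuses to run when there is nowhere in RAM to put the swapped-out data, a quick pre-check before the command above can save a failed attempt. This is a rough sketch only: it compares used swap against the MemAvailable estimate in /proc/meminfo, which is a kernel estimate and not a guarantee, and the function name is made up.

```shell
# Succeeds (exit 0) when the swap currently in use is smaller than MemAvailable.
# Reads /proc/meminfo-style text on stdin; values there are in KiB.
can_swapoff() {
    awk '
        $1 == "MemAvailable:" { avail = $2 }
        $1 == "SwapTotal:"    { total = $2 }
        $1 == "SwapFree:"     { unused = $2 }
        END {
            used = total - unused
            printf "swap_used_kib=%d mem_available_kib=%d\n", used, avail
            if (used < avail) exit 0; else exit 1
        }
    '
}

# Usage (as root):
#   can_swapoff < /proc/meminfo && swapoff /dev/sda3
```

If the check fails, that is the cue to go back and stop services first.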

Step 5) Enable Swap File

swapon /swapfile

Step 6) Edit fstab

Now it looks like we’re done, but don’t forget swap is handled by fstab after a reboot, just ask me how I know…. yeah, I found out the hard way… let’s check the existing fstab file…

cat /etc/fstab
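The change to make in that file is to retire the old swap entry and add one for the swap file. A minimal sketch, assuming a standard six-field fstab and the /swapfile path used above; the function name is hypothetical. It writes the result to stdout so you can review it before touching the real file.

```shell
# Comment out every active fstab entry of type swap and append the swap file.
# Reads fstab text on stdin, writes the rewritten text to stdout.
rewrite_fstab_swap() {
    awk '
        # Field 3 is the filesystem type; leave lines that are already comments alone.
        $1 !~ /^#/ && $3 == "swap" { print "#" $0; next }
        { print }
        END { print "/swapfile none swap sw 0 0" }
    '
}

# Usage (as root):
#   rewrite_fstab_swap < /etc/fstab > /tmp/fstab.new
#   diff /etc/fstab /tmp/fstab.new    # review, then copy into place
```

Commenting the old line out instead of deleting it keeps an easy way back if the box won’t boot cleanly.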

Step 7) Reboot and Verify Services

Wait both mounted as swap… what??!?

To fix this, I removed the partition, had the kernel re-read the partition table, rebuilt the initramfs, then rebooted:

fdisk /dev/sda
d
3
w
partprobe
update-initramfs -u

Rebooted and swapon showed just the file swap being used. Which means the deleted partition is no longer in the way of the sectors to allow for a full proper expansion of the OS file system. Not sure what was with the error… didn’t seem to affect anything in terms of the services being offered by the server.

Step 8) Extending the OS File System

If you’ve ever done this on Windows, you’ll know how easy it is with Disk Management. On Linux it’s a bit… interesting… you delete the partition and create it again, but that doesn’t delete the data, which is what we’ve all come to expect in the Windows realm.

fdisk /dev/sda
d
[enter]
n
[enter]
[enter]
[enter]
w
partprobe
resize2fs /dev/sda2

The above simply deletes the second partition, then recreates it using all available sectors on the disk. The final command then lets the file system use all the available sectors, as extended by fdisk.
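The delete-and-recreate trick only keeps the data if the new partition starts on exactly the same sector as the old one (and if fdisk offers to remove the existing ext4 signature, the answer is no). Here is a small sketch for checking the start sector before and after, assuming the one-line-per-partition format that sfdisk -d prints; the function name is made up for illustration.

```shell
# Print the start sector of one partition from an "sfdisk -d" dump on stdin.
# $1 is the partition's device node, e.g. /dev/sda2.
partition_start() {
    sed -n "s|^$1 *: *start= *\([0-9][0-9]*\),.*|\1|p"
}

# Usage (as root), around the fdisk session:
#   before=$(sfdisk -d /dev/sda | partition_start /dev/sda2)
#   ... fdisk: d, n, w ...
#   after=$(sfdisk -d /dev/sda | partition_start /dev/sda2)
#   [ "$before" = "$after" ] && resize2fs /dev/sda2
```

If the two numbers differ, stop before resize2fs and fix the partition table first.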

Summary

Have fun doing whatever you need to do with all the new extra space you have. Is there any performance impact from doing this? Again, if you have a system with adequate memory, the swap should never be used. If you want to go down that rabbit hole.. here.. Swap File vs Swap Partition : r/linux4noobs (reddit.com) have fun. Could I have removed the swap partition, created it at the end of the new extended hard drive…. yes… I could have but that would have required calculating the sectors, and extending the new file system to the sector that would be the start of the new swap partition, and I much rather press enter a bunch of times and have the computer do it all for me, I can also extend a file easier than a partition, so read the reddit thread… and pick your own poisons…

June 26, 2023

Upgrading Windows Server 2016 Core AD to 2022

Goal

Upgrade a Windows Server 2016 Core that’s running AD to Server 2022.

What actually happened

Normally if the goal is to stay core to core, this should be as easy as an in-place upgrade. When I attempted this myself, the first issue was it would get all the way to the end of the wizard then error out telling me to look at some bizarre path I wasn’t familiar with (C:\$windows.~bt\sources\panther\ScanResults.xml). Why? Why can’t the error just be displayed on the screen? Why can’t it be coded for in the dependency checks? Ugh, anyway, since it was core I had to attach a USB stick to my machine, pass it through to the VM, save the file, open it up, and nested deep in there, it basically stated “Active Directory on this domain controller does not contain Windows Server 2022 ADPREP /FORESTPREP updates.” Seriously, ok, apparently it requires schema updates before upgrading, since it’s an AD server.

Get-ADObject (Get-ADRootDSE).schemaNamingContext -Property objectVersion

d:\support\adprep\adprep.exe /forestprep
d:\support\adprep\adprep.exe /domainprep

Even after all that, the install wizard got past the error, but then after rebooting, and getting to around 30% of the install, it would reboot again and say reverting the install, and it would boot back into Server 2016 core.

Note, you can’t change versions during upgrade (Standard vs Datacenter) or (Core vs Desktop). For all limitations see this MS page. The “Keep existing files and apps” was greyed out and not selectable if I picked Desktop Experience. I had this same issue when I was attempting to upgrade a desktop server and I was entering a License Key for Standard not realizing the server had a Datacenter based key installed.

New Plan

I didn’t look at any logs since I wasn’t willing to track them down at this point to figure out what went wrong. Since I also wanted to go Desktop Experience I had to come up with an alternative route.

Seems my only option is going to be:

1) Install a clean copy of Server 2016 Desktop and update it completely. (Run sysprep, clone for later.)
2) Add it as a domain controller in my domain.
3) Migrate the FSMO roles. (If I wanted a clustered AD, I could be done, but that wouldn’t allow me to upgrade the original AD server that’s failing to upgrade.)
4) Decommission the old Server 2016 Core AD server.
5) Install a clean copy of Server 2016 Desktop, updated completely. (The cloned copy, should be at the OOBE stage.)
6) Add to domain.
7) Upgrade to 2022.
8) Migrate FSMO roles again. (Done if a cluster of two AD servers is wanted.)
9) Decommission other AD servers to go back to a single AD system.

Clean Install

Using a Windows Server 2016 ISO image, and a newly spun up VM, The install went rather quick taking only 15 minutes to complete.

Check for updates. KB5023788 and KB4103720. This is my biggest pet peeve, Windows updates.

RANT – The Server 2016 Update Race

As someone who’s a resource hall monitor, I like to see what a machine is doing and I use a variety of tools and methods to do so, including Resource Monitor, Task Manager (for Windows), Htop (linux) and all the graphs available under the Monitor tab of vSphere. What I find is always the same, one would suspect high Disk, and high network (receive) when downloading updates (I see this when installing the bare OS, and the disk usage and throughput is amazing, with low latency, which is why the install only took 15 minutes).

Yet when I click check for updates, it’s always the same, a tiny bit of bandwidth usage, low disk usage, and just endless high CPU usage. I see this ALL THE TIME. Another thing I see is once it’s done and rebooted, you think the install is done, but no, the Windows Update service will kick off and continue to process “whatever” in the background for at least another half hour.

Why is Windows updates such Dog Shit?!?! Like yay we got monthly Cumulative updates, so at least one doesn’t need to install a rolling ton of updates like we did with the Windows 7 era. But still the lack of proper reporting, insight on proper resource utilization and reliance on “BITS”… Just Fuck off wuauclt….

Ughhh, as I was getting snippets ready to show this, and I wanted to get the final snip of it still showing to be stuck at 4%, it stated something went wrong with the update, so I rebooted the machine and will try again. *Starting to get annoyed here*.

*Breathe* Ok, go grab the latest ISO available for Windows Server 2016 (Updated Feb 2018). So I’m guessing it has KB4103720 already baked in, but then I check the system resources and it’s different.

But as I’m writing this it seems the same thing is happening, updates stalling at 5%, and CPU usage stays at 50%, Disk I/O drops to next to nothing.

*Breaks* Man Fuck this! An announcer is born! Fuck it, we’ll do it live!

I’ll let this run, and install another VM with the latest ISO I just downloaded, and let’s have a race, see if I can install it and update it faster than this VM…. When the new VM finished installing, I set a couple config settings. Check for updates:

Check for updates. KB5023788 and KB4103723. Seriously?

Install, wow, the Downloading updates is going much quicker. Well, the download did, click install sticking @ 0% and the other VM is finishing installing KB4103720. I wonder if it needs to install KB4103723 as well, if so then the new VM is technically already ahead… man this race is intense.

I can’t believe it, the second server, which I gave more memory to and which was the latest available image from Microsoft, does the exact same thing as the first one.. gets stuck at 5%.. CPU usage 50% for almost an hour.. and error.

lol No fucking way… reboot check for updates, and:

at the same time on the first VM that has been checking for updates forever which said it completed the first round of updates…

This is unreal…

Shit pea one, and shit pea 2, both burning up the storage backend in 2 different ways…. for the same update:

Turd one really rips the disk:

Turd two does a bit too, but more just reads:

I was going to say both turds are still at 0%, but Turd one, like it did before, spontaneously burst back into “Checking for updates” while the second one seems to have moved up to 5%… mhmm feel like I’ve been down this road before.

Damn this sucks, just update already FFS, stupid Windows. *Announcer* “Get your bets here!, Put in your bets here!” Mhmmm I know turd one did the same thing as turd 2, but it did complete one round of updates, and shows a higher version than turd 2, even though turd 2 was the latest downloadable ISO from Microsoft.

I’m gonna put my bets on Turd 1….

Current state:

Turd 1: “Checking for Updates”… Changed to Downloading updates 5%.. shows signs of some Disk I/O. Heavy CPU usage.

Turd 2: “Preparing Updates 5%” … 50% CPU usage… lil to no Disc I/O.

We are starting to see a lot more action from Turd 1, this race is getting real intense now folks. Indeed, just noticing that Turd one is actually preparing a new set of updates, now past the peasant KB4103720. While Turd 2 shows no signs of changing as it sits holding on to that 5%.

Ohhhh!!! Turd one hits 24% while Turd 2 hit the same error it hit the first time, is it stuck in a failed loop? Let’s just retry this time without a reboot.. and go..! Back on to KB4103720 preparing @ 0%. Not looking good for Turd 2. Turd 1 has hit 90% on the new update download.

and coming back from the break, Turd one is expecting a reboot while Turd 2 hits the same error, again! Stop the Windows Update service, clear the SoftwareDistribution folder. Start the update service, check for updates, tried, fails, reboot, retry:

racing past the download stage… Download complete… preparing to install updates… oh boy… While Turd one is stuck at a blue screen “Getting Windows Ready” The race between these two can’t get any hotter.

Turd one is now at 5989 from 2273. While Turd 2 stays stuck on 1884. Turd 2 managed to get up to 2273, but I wasn’t willing to watch the hours it takes to get to the next jump. Turd 1 wins.

Checking these build numbers looks like Turd 1 won the update race. I’m not interested in what it takes to get Turd 2 going. Over 4 hours just to get a system fully patched. What a Pain in the ass. I’m going to make a backup, then clear the current snap shot, then create a new snapshot, then sysprep the machine so I can have a clean OOBE based image for cloning, which can be done in minutes instead of hours.

END RANT

Step 2) Add as Domain Controller.

Wow amazing no issues.

Step 3) Move FSMO Roles

Transfer PDCEmulator

Move-ADDirectoryServerOperationMasterRole -Identity "ADD" PDCEmulator

Transfer RIDMaster

Move-ADDirectoryServerOperationMasterRole -Identity "ADD" RIDMaster

Transfer InfrastructureMaster

Move-ADDirectoryServerOperationMasterRole -Identity "ADD" InfrastructureMaster

Transfer DomainNamingMaster

Move-ADDirectoryServerOperationMasterRole -Identity "ADD" DomainNamingMaster

Transfer SchemaMaster

Move-ADDirectoryServerOperationMasterRole -Identity "ADD" SchemaMaster

Step 4) Demote Old DC

Since it was a Core server, I had to use Server Manager from a remote client machine (Windows 10). Again, no problem.

As the final part said, it became a member server. So not only did I delete it under Sites and Services, I deleted it under ADUC as well.

Step 5) Create new server.

I recovered the system above, changed hostname, sysprepped.

This took literally 5 minutes, vs the 4 hours to create from scratch.

Step 6) Add as Domain Controller.

Wow amazing no issues.

Step 7) Upgrade to 2022.

Since we got 2 AD servers now, and all my servers are pointing to the other one, let’s see if we can update the Original AD server that is now on Server 2016 from the old Core.

Ensure Schema is upgraded first:

d:\support\adprep\adprep.exe /forestprep

d:\support\adprep\adprep.exe /domainprep

run setup!

It took over an hour, but it succeeded…

Summary

If I had an already updated system that was already on Desktop Experience, this might have been faster. I’m still not sure why the in-place upgrade didn’t work for Server Core, but here’s how you can get it to Desktop Experience and then up to 2022. It does unfortunately require a brand new install, with service migrations.

June 23, 2023

Veeam Backup Encryption

Story

So, a couple posts back I blogged about getting an NTFS USB drive shared to a Windows VM via SMB to store backups onto, so that the drive could easily be plugged into a Windows machine with Veeam on it to recover the VMs if needed. However, you don’t want to make it this easy if it were to be stolen. What’s the solution? Encryption… and remembering passwords. Woooooo.

Veeam’s Solution; Encryption

Source: Backup Job Encryption – User Guide for VMware vSphere (veeam.com)

I find it strange in their picture they are still using Windows Server 2012, weird.

Anyway, so I find my Backup Copy job and sure enough find the option:

Mhmmm, so the current data won’t be converted I take it then…

Here’s the backup files before:

and after:

As you can see the old files are completely untouched and a new full backup file is created when an Active full is run. You know what that means…

Not Retroactive

“If you enable encryption for an existing job, except the backup copy job, during the next job session Veeam Backup & Replication will automatically create a full backup file. The created full backup file and subsequent incremental backup files in the backup chain will be encrypted with the specified password.

Encryption is not retroactive. If you enable encryption for an existing job, Veeam Backup & Replication does not encrypt the previous backup chain created with this job. If you want to start a new chain so that the unencrypted previous chain can be separated from the encrypted new chain, follow this Veeam KB article.”

What the **** does that even mean…. to start I prefer not to have a new chain but since an Active full was required there’s a start of a new chain, so… so much for that. Second… Why would I want to separate the unencrypted chain from the new encrypted chain? wouldn’t it be nice to have those same points still exist and be selectable but just be encrypted? Whatever… let’s read the KB to see if maybe we can get some context to that odd sentence. It’s literally talking about disassociating the old backup files with that particular backup job. Now with such misdirected answers it would seem it straight up is not possible to encrypt old backup chains.

Well, that’s a bummer….

Even changing the password is not possible, while they state it is, it too is not retroactive as you can see by this snippet of the KB shared. Which is also mentioned in this Veeam thread where it’s being asked.

So, if your password is compromised, but the backup files have not been, you can’t change the password and keep your old backup restore points without going through a nightmare procedure or restoring all points and backing them up again somehow?

Also, be cautious checking off this option as it encrypts the metadata file and can prevent import of not encrypted backups. “You can enter password and read data from it, but you cannot “remove the lock” retroactively”

“Reason why Veeam asks for passwords even on non-encrypted chains, is because backupdata metadata(holding information about all restore points in the chain, including encrypted and non encrypted ones) is encrypted too!”

“Metadata will be un-encrypted when last encrypted restore point it describes will be gone by retention.”

Huh, that’s good to know… this lack of retroactive ability is starting to really suck ass here. Like I get the limitations that there’d be high I/O switching between them, but if BitLocker for Windows can do it for a whole O/S drive LIVE, nonetheless, why can’t Veeam do it for backup sets?

Summary

Veeam Supports Encryption
- Easy, Checkbox on Backup Job
- Uses Passwords
- Non Retroactive

I’ll start off by saying it’s nice that it’s supported, to some extent. What would be nice is:

- Openness of what Encryption algos are being used.
- Retroactive encryption/decryption on backup sets.
- Support for Certificates instead of passwords.

I hope this review helps someone. Cheers.

June 10, 2023 (updated June 11, 2023)

No coredump target has been configured. Host core dumps cannot be saved.

ESXi on SD Card

Ohhh ESXi on SD cards, it got a little controversial but we managed to keep you. Doing the latest install, I was greeted with the nice warning “No coredump target has been configured. Host core dumps cannot be saved.”

What does this mean you might ask. Well in short, if there ever was a problem with the host, log files to determine what happened wouldn’t be available. So it’s a pick your poison kinda deal.

Store logs and possibly burn out the SD/USB drive storage, which isn’t good at that sort of thing, or point it somewhere else. Here’s a nice post covering the same problem and the comments are interesting.

Dan states “Interesting solution as I too faced this issue. I didn’t know that saving coredump files to an iSCSI disk is not supported. Can you please provide your source for this information. I didn’t want to send that many writes to an SD card as they have a limited number (all be it a very large number) of read/writes before failure. I set the advanced system setting, Syslog.global.logDir to point to an iSCSI mounted volume. This solution has been working for me for going on 6 years now. Thanks for the article.”

with the OP responding “Hi Dan, you can definately point it to an iscsi target however it is not supported. Please check this KB article: https://kb.vmware.com/s/article/2004299 a quarter of the way down you will see ‘Note: Configuring a remote device using the ESXi host software iSCSI initiator is not supported.’”

Options

Option 1 – Allow Core Dumps on USB

Much like the source I mentioned above: VMware ESXi 7 No Coredump Target Has Been Configured. (sysadmintutorials.com)

Edit the boot options to allow Core Dumps to be saved on USB/SD devices.

Option 2 – Set Syslog.global.logDir

You may have some other local storage available, in that case set the variable above to that local or shared storage (shared storage being “unsupported”).

Option 3 – Configure Network Coredump

As mentioned by Thor – “Apparently the “supported” method is to configure a network coredump target instead rather than the unsupported iSCSI/NFS method: https://kb.vmware.com/s/article/74537”
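For reference, a sketch of what the network coredump setup might look like from the ESXi shell. The vmkernel interface (vmk0) and collector address (192.0.2.10) are placeholders, and 6500 is assumed as the collector port; check the KB above before running any of it. By default the script only prints the commands; set DRY_RUN=0 on the host to actually run them.

```shell
# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Point the host at a network dump collector, enable it, then verify.
configure_netdump() {
    vmk=$1      # vmkernel interface to send dumps from (placeholder)
    server=$2   # IPv4 address of the dump collector (placeholder)
    run esxcli system coredump network set --interface-name "$vmk" --server-ipv4 "$server" --server-port 6500
    run esxcli system coredump network set --enable true
    run esxcli system coredump network check
}

configure_netdump vmk0 192.0.2.10
```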

Option 4 – Disable the notification.

As stated by Clay – “

The environment that does not have Core Dump Configured will receive an Alarm as “Configuration Issues :- No Coredump Target has been Configured Host Core Dumps Cannot be Saved Error”.
In the scenarios where the Core Dump partition is not configured and is not needed in the specific environment, you can suppress the Informational Alarm message, following the below steps,

Select the ESXi Host >

Click Configuration > Advanced Settings

Search for UserVars.SuppressCoredumpWarning

Then locate the string and enter 1 as the value

The changes takes effect immediately and will suppress the alarm message.

To extract contents from the VMKcore diagnostic partition after a purple screen error, see Collecting diagnostic information from an ESX or ESXi host that experiences a purple diagnostic screen (1004128).”

Summary

In my case it’s a home lab, I wasn’t too concerned so I followed Option 4, then simply disabled file core dumps following the second steps in Permanently disable ESXi coredump file (vmware.com)

Note* Option 2 was still required to get rid of another message: System logs are stored on non-persistent storage (2032823) (vmware.com)
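Options 4 and 2 can also be set from the ESXi shell instead of clicking through the UI. A rough sketch, assuming the esxcli equivalents of those two settings; the datastore path is hypothetical and the function name is made up. It only prints the commands unless DRY_RUN=0 is set.

```shell
# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Suppress the coredump warning (Option 4) and move logs to persistent storage (Option 2).
quiet_sd_host() {
    logdir=$1   # hypothetical directory on a persistent datastore
    run esxcli system settings advanced set -o /UserVars/SuppressCoredumpWarning -i 1
    run esxcli system syslog config set --logdir="$logdir"
    run esxcli system syslog reload
}

quiet_sd_host /vmfs/volumes/datastore1/logs
```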

Not sure, but maybe still helps with I/O to disable coredumps. Will update again if new news arises.