So, the boot drive died in my FreeNAS server that was running 11.1u1. Good build. I figured since it died now would be a time to try TrueNAS Core. My SAN didn’t seem to be booting the installer, I tried different USB sticks and eventually just ended up using my IODD device, but all of which wouldn’t show the installer (albet on xterm/serial connection since the SAN is headless).
I decided to run the installer on a laptop and installing on …a….USB stick… wait a second….. when FreeNAS was first failing I tried moving the system log files off there since USB drives aren’t really meant for high R/W operations. These are set by the System -> System Dataset. What I think happed was the drive probably could’ve lasted a while longer but since the log files was still configured for the default “boot-pool” aka on the USB boot stick, and I had configured iSCSI with proxmox and at the end I discovered the log flooding problem… I bet it was this log flooding along with writing and log rotating on the USB boot device must have killed it…. Just wow man….
Anyway, I installed on a USB stick on a laptop picking the USB drive as a souce and picking BIOS boot option. I plugged the device into my SAN and boot up, now watching the boot operation (after verifying the boot order was correct) via the serial console window, after the system info the console gets stuck at “Press any key to continue”. I looked at the USB drive and the light indicated that something was happening, also looking at the NIC lights seemed something was going on, but also seem a bit pattern like and I wasn’t sure if something or anything was working. I decided to quickly check my DHCP server for any new entries and sure enough there it was….
I navigated to the address on my browser and sure enough, it booted!
There it is the dataset we need to change, now I don’t have any other pools defined so I can’t change it just yet, but lets make this our first priority so that if another bug from proxmox creeps in it won’t break my SAN. To do that I’m going to need to import the ZFS pools I had on this SAN when FreeNAS died.
Storage -> Pools -> Add
What a clean look. Import, No encryption. Looking for pools, this took a while.
There’s my pools! Sweet:
Next… exciting stuff…
Click Import.
Waiting….. waiting…. Let’s go!
What’s this notification…
I’ll leave that for now… can I change my system dataset? Oh no way it did it automatically….
well… even though I managed to import the pools successfully, since I can’t remember how the extents were made, I think they were file based, but no matter how I make it file based or device based the ESXi hosts see the drive but they aren’t automatically importing the datastore, probably cause they can’t see the FS that’s suppose to be on the iSCSI disk that’s being presented (the extents). I should have simply kept them device based and used the root pool. To add icing to the cake the one server I wanted to recover was the only one I didn’t run the backup copy job on after my USB drive that was hosting those files went belly up and I had to rerun all the jobs, AND the Veeam servers main backup repo was on one of the extents I can’t recover… so even though I had 3-2-1 rule in place I still ended up losing this server….
this is too much for me today, I’m goin to bed…
A new day, but end of it, so won’t finish this yet today either but there is hope.
I found this, which lead to this, which lead to this, and there’s sign of hope seeing the same thing in my own vmkernel log file.
So there it is, and ran the commands as specified in the threads and KB:
tail /var/log/vmkernel.log
Which showed me the volume not mounting due to snapshot.
esxcli storage vmfs snapshot list
Which showed me my old Datastore information.
esxcli storage vmfs snapshot resignature -l "iOSlow"
Which mounted my datastore under a new name:
Sure enough connected the backup drive to my Veeam server. And restored my missing VM. Now I just need to move the rest of the data and create a clean datastore. Or I guess I could simply rename it, but it didn’t auto mount on the other esxi server.