[TL;DR: -14 removed pam_opie.so, so leaving those lines in pam.d/system etc. 
prevented su/sudo; running zpool upgrade on the root pool without updating the 
gptzfsboot boot loader prevented rebooting.]

Thought that I'd write this down, in case it helps anyone.  The upgrade from 
stable-13 to stable-14 that I made to my file server over the weekend was one 
of the bumpier upgrades in my thirty year history with FreeBSD (yes, I was 
there at the transition from the patchkit).  I'm happy to relate that all of 
the pain related here was self-inflicted, and FreeBSD itself shone through with 
its delightful robustness and straightforward nature.  All ends well.

My file server, for now, is a mini-ITX AMD Zen 1700 system with four 8T 
spinning-rust drives in a single RAIDZ zpool, about 250G of NVMe M.2 flash 
holding root, usr, var and swap (boot to ZFS) and a USB-connected backup system 
also running ZFS.  I track -stable and update ports (with portmaster), 
user-space and kernel weekly.

The start was very simple, now that FreeBSD is on git: I just git switch'ed to 
the stable/14 branch, which went without a hitch.

Then I ran my usual weekly rebuild script, which got to the end without fuss, 
but with the usual complaint from etcupdate that there were some unresolved 
issues.  I should have been paying more attention to that: I was not using 
etcupdate correctly, and had not been since switching over from mergemaster a 
year or so ago.  Needless to say, misconfiguration was the start of the 
troubles, and they kicked in immediately after the upgrade: I couldn't reboot, 
sudo or su, because my /etc/pam.d configuration still referred to pam_opie.so; 
I had not noticed that module being removed.  I _could_ still ssh into the 
system because my ssh config had disabled PAM.  That didn't help though, 
because I was still stuck as me, and couldn't edit the config files without a 
working sudo.  (I've since rebuilt sudo to not use PAM either!)
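
A quick check along these lines, run before rebooting, would have flagged both 
problems.  This is only a sketch: the function names are made up for 
illustration, and the default directories are the usual FreeBSD locations for 
base and ports PAM configuration.

```shell
# List PAM config lines that still load an OPIE module.
# Arguments: PAM config directories (defaults to the usual FreeBSD locations).
find_opie_refs() {
    if [ "$#" -eq 0 ]; then
        set -- /etc/pam.d /usr/local/etc/pam.d
    fi
    # Matches both pam_opie.so and pam_opieaccess.so; exits non-zero if none found.
    grep -rnE 'pam_opie(access)?\.so' "$@" 2>/dev/null
}

# Show what etcupdate still considers unresolved (skipped where etcupdate is absent).
show_etc_conflicts() {
    if command -v etcupdate >/dev/null 2>&1; then
        etcupdate status
    else
        echo "etcupdate not found; skipping" >&2
    fi
}
```

Running `show_etc_conflicts` and then `find_opie_refs` as root after installworld 
lists anything still needing attention; `etcupdate resolve` walks through the 
conflicts one at a time.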

Easy enough to fix, right?  Power down and reboot into single-user mode and go 
from there.  Unfortunately I had ripped the graphics card out of the system 
some long time ago as an attempt to keep a bit of heat out of the box and had 
apparently lost it in a couple of intervening house moves.  Perhaps I'd donated 
it to the electronics recycling mob along with a box of old cables and power 
supplies.  Too late to go and get one Saturday afternoon, I found a store some 
distance away that would sell me one on Sunday morning.  With new graphics card 
in hand, I powered down, took the lid off the server, carefully lifted out the 
hard-drive cage and installed the GPU.  Plugged in monitor and keyboard and 
powered up.

Single user mode did the job: edited the pam.d files and was just about to 
reboot when I checked zpool status to see why the boot messages had said 
something about my main array operating in "degraded" mode.  One drive was 
apparently not found/attached.  On closer inspection I discovered that I'd 
dislocated the power supply plug when I took the cage out.  I'd fix that when I 
did the next power-down.  But in the meantime, zpool status had also taunted 
me with new features that I could enable.  So I did, on the root pool.  And 
power-cycled.  And stared dumbly at the boot screen telling me that it couldn't 
find any bootable drives, because the one that was there had an incompatible 
zpool version.  Aargh!  The boot loader had not been updated!  Couldn't even 
get to single user mode to fix it.
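
For anyone wanting to avoid the same trap, a small guard can tell whether a 
pool is the one the system boots from before any features are enabled on it.  
A minimal sketch, assuming `zfs list -H -o name /` reports the dataset mounted 
at / (the helper names are hypothetical, and `zroot` below is only a 
placeholder pool name):

```shell
# Print the pool name for a dataset name, i.e. everything before the first "/".
pool_of_dataset() {
    printf '%s\n' "${1%%/*}"
}

# Succeed (exit 0) if the named pool holds the root filesystem.
is_root_pool() {
    root_ds=$(zfs list -H -o name / 2>/dev/null) || return 1
    [ "$(pool_of_dataset "$root_ds")" = "$1" ]
}
```

Something like `is_root_pool zroot && echo "update the boot loader first"` 
before `zpool upgrade zroot` is then enough of a reminder.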

I downloaded the 14-release bootonly image from the FreeBSD web site and found 
a suitable thumb drive to put it on.  Power cycled the box again and told the 
boot menu to boot from the thumb drive.  There followed a great deal of gpart 
footling while I tried to remember just how I had the drives arranged, but in 
the end I found the magic incantation (gpart bootcode -p /boot/gptzfsboot -i 1 
nda0) to install the new version of the boot loader.  Rebooted again, this time 
to the main system, rather than the thumb drive, and that worked.  ZFS 
resilvered the previously missing drive quicker than I could notice, and 
subsequent scrubs found nothing in need of fixing.
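
The boot loader step can be sketched as a pair of helpers, assuming a BIOS/GPT 
layout like the one in this post, where the boot code lives in a freebsd-boot 
partition.  The helper names are hypothetical, and the second function only 
prints the command for review rather than running it.  Machines that boot via 
UEFI load loader.efi from the EFI system partition instead, so this does not 
apply to them.

```shell
# Read `gpart show <disk>` output on stdin and print the index of the
# first freebsd-boot partition (columns: start, size, index, type).
boot_part_index() {
    awk '$4 == "freebsd-boot" { print $3; exit }'
}

# Print (do not run) the gpart command that rewrites the ZFS boot code.
# Arguments: disk name, partition index.
bootcode_cmd() {
    if [ -z "$1" ] || [ -z "$2" ]; then
        echo "usage: bootcode_cmd <disk> <index>" >&2
        return 1
    fi
    echo "gpart bootcode -p /boot/gptzfsboot -i $2 $1"
}
```

For example, `bootcode_cmd nda0 "$(gpart show nda0 | boot_part_index)"` prints 
the command to run as root, ideally before any zpool upgrade of the root pool.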

Everything is now hunky-dory.

Thanks to the always-wonderful FreeBSD team for continuing to produce a system 
that can be understood at sufficient detail to fairly easily dig oneself out of 
what might otherwise be catastrophic misadventures!
