Hi all,
I have solved the boot issue. I got physical access to the machine, and
connected a monitor and keyboard to it.
As usual, it booted the initramfs, and started dropbear, but this time,
I entered the disk key on the keyboard.
The machine attempted to boot, but failed, dropping back to the
initramfs prompt. Crucially, however, the video console logged a useful
error: the rootfs could not be mounted (despite mounting just fine,
without errors, in read-write mode, over the ssh connection previously).
I ran a fsck in the initramfs environment, and rebooted it.
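For anyone hitting the same thing, a minimal sketch of such a check from the initramfs shell follows. It is not necessarily the exact command used here: the device path is taken from the root= entry quoted further down, e2fsck is assumed to be present in the initramfs, and is_mounted is a hypothetical helper that just guards against checking a mounted filesystem.

```shell
# Hypothetical helper: succeed if device $1 is listed in the mounts table $2
# (defaults to /proc/mounts).
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

dev=/dev/mapper/vg0-root   # assumption: matches root= on the kernel command line

if is_mounted "$dev"; then
    echo "$dev is mounted; unmount it before checking" >&2
elif [ -b "$dev" ]; then
    # -f forces a full check even if the filesystem is marked clean.
    e2fsck -f "$dev"
fi
```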
The machine booted and I again unlocked it via the (keyboard) console.
This resulted in a clean boot with no untoward errors.
I shut it down, and rebooted, and now it boots successfully even if I
unlock it via the ssh connection.
I feel like this is a bug in cryptsetup-initramfs - surely it would be
critically important for the aforementioned filesystem error to have
been logged to the ssh console, rather than it silently failing to boot?
After all, it DOES log the errors to the video console.
Or have I been daft and somehow misconfigured it? I don't think I
deviated from the standard configuration.
Is there a simple way to have errors logged to the console AND to the
ssh session? Have I been thick?
Thanks,
-Ian
On 24/07/2024 00:14, Ian Molton wrote:
Hi all,
I'm having problems with one of my machines. It's a Pine RockPro64.
Debian bookworm has been running very stably on it for some time. I
rebooted it a couple of weeks ago for maintenance, having applied
updates, after 108 days up. I have an encrypted LVM volume, containing
root, swap, and data LVs, with /boot on a MMC card, and I use
cryptsetup-initramfs to allow me to log in and unlock the volume at
boot time, via ssh.
I rebooted it this morning, due to a crash, and it didn't come back up.
I can still connect to dropbear (running in the initramfs context),
and the MoTD (as always) prompts me to run cryptroot-unlock, which
appears to do its job (ie. the LVM volume is unlocked / decrypted),
however it does not proceed to switch root (and drop the ssh
connection), as it used to, with complete reliability.
At first, I suspected a problem with the rootfs, however this appears
not to be the issue - the volume is present at the expected location,
and can be manually mounted.
Executing the following allows me to enter a chroot on the rootfs:
mount -t ext4 /dev/vg0/root /root
mount --bind /dev /root/dev
mount --bind /proc /root/proc
mount --bind /sys /root/sys
chroot /root /bin/bash --login
and following this I can run:
mount /boot
which correctly mounts the MMC card containing the /boot partition in
the chrooted environment.
Inspecting /boot, everything appears to be in order. I have issued
update-initramfs a few times, even completely removing the existing
initramfs and recreating it. I have also inspected the initramfs
built by update-initramfs, and can see nothing out of the ordinary.
crypttab is copied from the host, and the UUID matches that displayed
by lsblk -f - which is not surprising given that executing
cryptroot-unlock does, in fact, decrypt the volume.
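The UUID comparison described above can be sketched as a small script, assuming it is run inside the chroot where /etc/crypttab is the host copy and udev has populated /dev/disk/by-uuid. crypttab_uuids is a hypothetical helper name, not part of cryptsetup.

```shell
# Hypothetical helper: print the UUIDs named in the source field (field 2)
# of the crypttab file $1, skipping comment lines.
crypttab_uuids() {
    awk '$1 !~ /^#/ && $2 ~ /^UUID=/ { print substr($2, 6) }' "$1"
}

# Report any crypttab UUID for which no block device is currently present.
for uuid in $(crypttab_uuids /etc/crypttab 2>/dev/null); do
    [ -e "/dev/disk/by-uuid/$uuid" ] || echo "no device with UUID $uuid" >&2
done
```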
Once chrooted, I can see that /sbin/init is a symlink to
/lib/systemd/systemd, which exists and is executable, but obviously
cannot be executed as anything other than PID 1. Attempting to execute
it results in it complaining of a missing argument, or, if one is
provided, an error that it is ignoring the request due to running in a
chroot.
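A rough sketch of the init check above, for use inside the chroot so that the absolute symlink resolves against the rootfs. check_init is a hypothetical helper; it only confirms that the link target exists and is executable, which says nothing about whether it would run correctly as PID 1.

```shell
# Hypothetical helper: resolve the symlink chain at $1 and print the final
# target if it is an executable regular file; fail otherwise.
check_init() {
    target=$(readlink -f "$1") || return 1
    [ -f "$target" ] && [ -x "$target" ] && printf '%s\n' "$target"
}

check_init /sbin/init || echo "init is missing or not executable" >&2
```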
The kernel command line (cat /proc/cmdline) contains a correct root=
entry, which points at /dev/mapper/vg0-root.
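To pull the root= value out of the command line and confirm the device node exists, something like the following sketch should do. root_arg is a hypothetical helper, and it assumes the root= value contains no spaces.

```shell
# Hypothetical helper: print the value of the root= parameter found in the
# kernel command line string passed as $1.
root_arg() {
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^root=//p'
}

if [ -r /proc/cmdline ]; then
    root=$(root_arg "$(cat /proc/cmdline)")
    # Only a path-style root= can be tested directly as a block device.
    case "$root" in
        /dev/*) [ -b "$root" ] || echo "$root is not a block device (yet)" >&2 ;;
    esac
fi
```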
I'm stumped - I cannot see why the initramfs environment fails to
mount the rootfs and execute init.
I have run an additional apt update / upgrade / dist-upgrade, whilst
under chroot, in the hope that it will magically fix everything, but
to no avail.
I was using a custom dtb that enabled PCIe x4 on the board, but have
removed that and reverted to the debian-supplied .dtb file just in case.
Any ideas? I have several machines using this configuration, both
arm64 and amd64, and I'm now a little uneasy about rebooting any of
them, in case there has been a breaking change somewhere which they,
too, are likely to fall afoul of.
Any thoughts? I've never really had to debug the init process (ie.
PID1) and am not sure how to proceed.
Thanks,
-Ian