Hi all,

I have solved the boot issue. I got physical access to the machine, and connected a monitor and keyboard to it.

As usual, it booted the initramfs, and started dropbear, but this time, I entered the disk key on the keyboard.

The machine attempted to boot but failed, dropping back to the initramfs prompt. Crucially, however, the video console logged a useful error: the rootfs could not be mounted (despite it having mounted just fine, without errors, in read-write mode, over the ssh connection previously).

I ran a fsck in the initramfs environment, and rebooted the machine.
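
For reference, a minimal sketch of the kind of check described above, not a record of the exact commands used. It assumes the rootfs is ext4 on /dev/mapper/vg0-root (the device named in the original message below), that the volume has already been unlocked, and that fsck.ext4 is present in the initramfs. check_rootfs and ROOT_DEV are names made up for illustration.

```shell
#!/bin/sh
# ROOT_DEV is an assumption: the root= device quoted in the original message.
ROOT_DEV="${ROOT_DEV:-/dev/mapper/vg0-root}"

# check_rootfs is a hypothetical helper, not part of cryptsetup-initramfs.
check_rootfs() {
    dev="$1"
    # Never check a filesystem that is currently mounted.
    if grep -q "^$dev " /proc/mounts 2>/dev/null; then
        echo "refusing: $dev is mounted" >&2
        return 1
    fi
    # The device only appears once the encrypted volume has been unlocked.
    if [ ! -b "$dev" ]; then
        echo "no such block device: $dev (unlock the volume first)" >&2
        return 1
    fi
    # -f forces a full check even if the filesystem is marked clean.
    fsck.ext4 -f "$dev"
}
```

From the initramfs prompt this would be run as check_rootfs "$ROOT_DEV", followed by a reboot.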

The machine booted and I again unlocked it via the (keyboard) console.

This resulted in a clean boot with no untoward errors.

I shut it down, and rebooted, and now it boots successfully even if I unlock it via the ssh connection.

I feel like this is a bug in cryptsetup-initramfs - surely it is critically important for the aforementioned filesystem error to be logged to the ssh session, rather than the boot failing silently? After all, it DOES log the errors to the video console.

Or have I been daft and somehow misconfigured it? I don't think I deviated from the standard configuration.

Is there a simple way to have errors logged to the console AND to the ssh session? Have I been thick?

Thanks,

-Ian


On 24/07/2024 00:14, Ian Molton wrote:
Hi all,

I'm having problems with one of my machines. It's a Pine RockPro64.

Debian bookworm has been running very stably on it for some time. I rebooted it a couple of weeks ago for maintenance, having applied updates, after 108 days up. I have an encrypted LVM volume, containing root, swap, and data LVs, with /boot on an MMC card, and I use cryptsetup-initramfs to allow me to log in and unlock the volume at boot time, via ssh.

I rebooted it this morning, due to a crash, and it didn't come back up.

I can still connect to dropbear (running in the initramfs context), and the MOTD (as always) prompts me to run cryptroot-unlock, which appears to do its job (i.e. the LVM volume is unlocked / decrypted). However, it does not proceed to switch root (and drop the ssh connection), as it used to do with complete reliability.

At first, I suspected a problem with the rootfs, however this appears not to be the issue - the volume is present at the expected location, and can be manually mounted.

Executing the following allows me to enter a chroot on the rootfs:

mount -t ext4 /dev/vg0/root /root
mount --bind /dev /root/dev
mount --bind /proc /root/proc
mount --bind /sys /root/sys
chroot /root /bin/bash --login

and following this I can run:

mount /boot

which correctly mounts the MMC card containing the /boot partition in the chrooted environment.

Inspecting /boot, everything appears to be in order. I have run update-initramfs a few times, even completely removing the existing initramfs and recreating it. I have also inspected the initramfs built by update-initramfs, and can see nothing out of the ordinary. crypttab is copied from the host, and the UUID matches that displayed by lsblk -f - which is not surprising given that executing cryptroot-unlock does, in fact, decrypt the volume.
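
As a rough sketch of that inspection (one way of doing it, under assumptions): lsinitramfs from initramfs-tools prints one path per line, and the small filter below reports whether a crypttab is among them. listing_has_crypttab is a hypothetical helper name, and the initrd path in the usage comment assumes the image is named after the running kernel.

```shell
#!/bin/sh
# listing_has_crypttab is a hypothetical helper: it reads a file listing
# on stdin (one path per line) and succeeds if any entry is named crypttab.
listing_has_crypttab() {
    grep -Eq '(^|/)crypttab$'
}

# Usage inside the chroot (the initrd file name is an assumption):
#   lsinitramfs /boot/initrd.img-"$(uname -r)" | listing_has_crypttab \
#       && echo "crypttab present in initramfs"
```

The filter only confirms that the file is present. Comparing its UUID against lsblk -f still has to be done by unpacking the image and reading the file.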

Once chrooted, I can see that /sbin/init is a symlink to /lib/systemd/systemd, which exists and is executable, but obviously cannot be executed as anything other than PID 1. Attempting to execute it results in it complaining of a missing argument, or, if one is provided, an error that it is ignoring the request due to running in a chroot.

The kernel command line (cat /proc/cmdline) contains a correct root= entry, which points at /dev/mapper/vg0-root
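
A small sketch of that check, assuming a POSIX shell such as the busybox sh in the initramfs. root_from_cmdline is a hypothetical helper that prints the value of the root= parameter from a command line string.

```shell
#!/bin/sh
# root_from_cmdline is a hypothetical helper: it prints the value of each
# root= parameter found in the kernel command line passed as $1.
root_from_cmdline() {
    # $1 is deliberately unquoted so that it splits on whitespace.
    for arg in $1; do
        case "$arg" in
            root=*) printf '%s\n' "${arg#root=}" ;;
        esac
    done
}

# Usage in the initramfs shell:
#   dev=$(root_from_cmdline "$(cat /proc/cmdline)")
#   [ -b "$dev" ] && echo "root device $dev exists"
```

The -b test in the usage comment only confirms that the block device node exists after unlocking. It says nothing about whether the filesystem on it is mountable.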

I'm stumped - I cannot see why the initramfs environment fails to mount the rootfs and execute init.

I have run an additional apt update / upgrade / dist-upgrade, whilst under chroot, in the hope that it will magically fix everything, but to no avail.

I was using a custom dtb that enabled PCIe x4 on the board, but have removed that and reverted to the debian-supplied .dtb file just in case.

Any ideas? I have several machines using this configuration, both arm64 and amd64, and I'm now a little uneasy about rebooting any of them, in case there has been a breaking change somewhere which they, too, are likely to fall afoul of.

Any thoughts? I've never really had to debug the init process (ie. PID1) and am not sure how to proceed.

Thanks,

-Ian

