Ryan,

Part 1)
------

First, please try to reproduce the problem later rather than early in
boot: disable the bcache module via the kernel boot parameters, then
load it manually once the system has booted successfully.
(This should be possible, as you mentioned the boot disk isn't involved.)

1) Edit '/etc/fstab' and either comment out the mounts that depend on
bcache or add the 'noauto' option to them, so that systemd doesn't wait
for them at boot.

For example,

$ sudo vim /etc/fstab
From: /dev/mapper/*whatadisk* /mountpoint ext4 defaults 0 0
To: /dev/mapper/*whatadisk* /mountpoint ext4 defaults,noauto 0 0
Esc, :x, Enter
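
If you prefer not to edit the file by hand, the same change can be
sketched with awk. This is only a minimal sketch, assuming the entry has
the usual six whitespace-separated fields; the function name 'add_noauto'
and the '/tmp/fstab.new' file used below are hypothetical, and the result
should be reviewed before it replaces /etc/fstab.

```shell
# add_noauto FILE MOUNTPOINT
# Print FILE with ',noauto' appended to the options (4th field) of the
# entry whose mount point (2nd field) is MOUNTPOINT. Comment lines and
# entries that already carry 'noauto' are printed unchanged.
# Note: awk rebuilds a modified line with single spaces between fields.
add_noauto() {
    awk -v mp="$2" '
        $1 !~ /^#/ && $2 == mp && $4 !~ /(^|,)noauto(,|$)/ {
            $4 = $4 ",noauto"
        }
        { print }
    ' "$1"
}
```

$ add_noauto /etc/fstab /mountpoint > /tmp/fstab.new
$ diff /etc/fstab /tmp/fstab.new
$ sudo cp /tmp/fstab.new /etc/fstab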

2) Edit '/etc/default/grub' and add the 'modprobe.blacklist=bcache' option
to GRUB_CMDLINE_LINUX_DEFAULT.

For example,

$ sudo vim /etc/default/grub
From: GRUB_CMDLINE_LINUX_DEFAULT="console=ttyS0"
To: GRUB_CMDLINE_LINUX_DEFAULT="console=ttyS0 modprobe.blacklist=bcache"
Esc, :x, Enter
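
As a non-interactive alternative, the edit can be sketched with grep and
sed. This is a sketch under assumptions, not a definitive method: it
assumes GRUB_CMDLINE_LINUX_DEFAULT is set on a single double-quoted line
and that the parameter contains no '|' or '&' characters. The function
name 'add_kernel_param' and the '/tmp/grub.new' file are hypothetical.

```shell
# add_kernel_param FILE PARAM
# Print FILE with PARAM appended inside the quotes of the
# GRUB_CMDLINE_LINUX_DEFAULT line. If that line already contains PARAM,
# FILE is printed unchanged, so running it twice is harmless.
add_kernel_param() {
    if grep '^GRUB_CMDLINE_LINUX_DEFAULT=' "$1" | grep -qF -- "$2"; then
        cat "$1"
    else
        sed "s|^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"|\1 $2\"|" "$1"
    fi
}
```

$ add_kernel_param /etc/default/grub modprobe.blacklist=bcache > /tmp/grub.new
$ diff /etc/default/grub /tmp/grub.new
$ sudo cp /tmp/grub.new /etc/default/grub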

Update and check grub config:

$ sudo update-grub

$ grep modprobe.blacklist=bcache /boot/grub/grub.cfg
                linux   /boot/vmlinuz-4.15.0-91-generic ... modprobe.blacklist=bcache
                linux   /boot/vmlinuz-4.15.0-88-generic ... modprobe.blacklist=bcache

3) Reboot the system into 4.15.0-91. It should not fail, as bcache is
not loaded.

4) Now load bcache, retrigger device events, and check if the problem
reproduces.

$ sudo modprobe bcache
$ sudo udevadm trigger

This should register the bcache devices, e.g., /dev/bcache0.
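
To check this without eyeballing /dev, a small helper can list the
registered nodes. This is a minimal sketch, assuming the nodes are named
bcache0, bcache1, and so on, as in the example above; the function name
'list_bcache_devices' is hypothetical.

```shell
# list_bcache_devices [DIR]
# Print the bcacheN nodes found in DIR (default: /dev), one per line.
# Returns non-zero if none are found.
list_bcache_devices() {
    dir=${1:-/dev}
    found=1
    for node in "$dir"/bcache[0-9]*; do
        # An unmatched glob stays literal, so skip anything that
        # does not actually exist.
        [ -e "$node" ] || continue
        echo "$node"
        found=0
    done
    return "$found"
}
```

$ list_bcache_devices || echo "no bcache devices registered"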

If you can see /dev/bcache0 and the problem did NOT happen,
please stop here and let me know.

If the problem reproduced, please proceed to Part 2 once your system
has rebooted (it should boot normally, as bcache is disabled).

...

Part 2)
------

1) Install linux-crashdump:

$ sudo apt install linux-crashdump

Answer these questions:

- Should kexec-tools handle reboots (sysvinit only)? No
- Should kdump-tools be enabled by default? Yes

2) Increase the reserved memory size for the crashdump kernel:

Edit '/etc/default/grub.d/kdump-tools.cfg' and change the crashkernel
size from 192M to 512M or 768M if possible:

For example,

$ sudo vim /etc/default/grub.d/kdump-tools.cfg
From: GRUB_CMDLINE_LINUX_DEFAULT="$GRUB_CMDLINE_LINUX_DEFAULT crashkernel=512M-:192M"
To: GRUB_CMDLINE_LINUX_DEFAULT="$GRUB_CMDLINE_LINUX_DEFAULT crashkernel=512M-:768M"
Esc, :x, Enter

4) Update grub and reboot

$ sudo update-grub
$ sudo reboot

5) Check kdump status is 'ready' and that panic_on_oops is enabled (1)
by default:

$ sudo kdump-config status
current state:    ready to kdump

$ cat /proc/sys/kernel/panic_on_oops 
1
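
It may also be worth confirming that the running kernel was booted with
the new crashkernel value. The helper below is a minimal sketch that
extracts it from a kernel command line string; the function name
'crashkernel_param' is hypothetical.

```shell
# crashkernel_param "CMDLINE"
# Print the value of the crashkernel= parameter found in the given
# kernel command line string. If it appears more than once, the last
# occurrence is printed. Returns non-zero if it is absent.
crashkernel_param() {
    value=$(printf '%s\n' "$1" | tr ' ' '\n' \
            | sed -n 's/^crashkernel=//p' | tail -n 1)
    [ -n "$value" ] && echo "$value"
}
```

$ crashkernel_param "$(cat /proc/cmdline)"

With the change from step 2 applied, this should print 512M-:768M.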

6) Trigger a test crashdump

$ echo 1 | sudo tee /proc/sys/kernel/sysrq
$ echo c | sudo tee /proc/sysrq-trigger

This apparently 'reboots' the system, and collects a memory dump:

[    8.510809] kdump-tools[781]: Starting kdump-tools:  * running makedumpfile -c -d 31 /proc/vmcore /var/crash/202004081540/dump-incomplet$
...
Copying data                                      : [100.0 %] -           eta: 0s
...
[   15.964149] kdump-tools[781]:  * kdump-tools: saved vmcore in /var/crash/202004081540
...
[   16.176388] kdump-tools[781]:  * kdump-tools: saved dmesg content in /var/crash/202004081540
...
[   17.187848] kdump-tools[781]: Rebooting.
...

7) After the system boots again, check the crashdump is stored in
/var/crash/<timestamp>

$ ls -1 /var/crash/202004081540
dmesg.202004081540
dump.202004081540
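
The same check can be scripted. This is a minimal sketch, assuming the
files follow the dmesg.<timestamp> and dump.<timestamp> naming shown
above and that the directory is readable by your user; the function name
'check_crashdump' is hypothetical.

```shell
# check_crashdump DIR
# Succeed only if DIR (e.g. /var/crash/<timestamp>) contains non-empty
# dump.<timestamp> and dmesg.<timestamp> files, where <timestamp> is
# the last component of DIR.
check_crashdump() {
    stamp=$(basename "$1")
    [ -s "$1/dump.$stamp" ] && [ -s "$1/dmesg.$stamp" ]
}
```

$ check_crashdump /var/crash/202004081540 && echo "crashdump looks complete"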

If this didn't happen, please stop and let me know, so we can fix the
crashdump mechanism.

If you have /var/crash/<timestamp>, the crashdump mechanism is working,
so let's move forward. Feel free to remove that directory:

$ sudo rm -rf /var/crash/<timestamp>

...

8) Boot again and reproduce the problem.

Again, boot in 4.15.0-91, and reproduce the problem manually as in step
4 in Part 1.

This should generate a crashdump in /var/crash, as the test crash did.
Please create a tarball and attach it to Launchpad.

$ sudo tar cvf lp1867916-crashdump.tar /var/crash/<timestamp>
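
If the size turns out to be a concern, a gzip-compressed tarball with a
checksum is one option. This is only a sketch: the dump is already
compressed by makedumpfile (-c in the log above), so the savings may be
small. The function name 'pack_crashdump' is hypothetical, and it needs
to run from a root shell (e.g. after 'sudo -s') if the dump files are
only readable by root.

```shell
# pack_crashdump DIR TARBALL
# Archive DIR into the gzip-compressed TARBALL, then print the
# tarball's size and SHA-256 checksum so the upload can be verified.
pack_crashdump() {
    tar -czf "$2" -C "$(dirname "$1")" "$(basename "$1")" || return 1
    du -h "$2"
    sha256sum "$2"
}
```

# pack_crashdump /var/crash/<timestamp> lp1867916-crashdump.tar.gz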

If there are attachment size limit issues, please let me know, or use
another hosting website, if at all possible.

Thank you very much,
Mauricio

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1867916

Title:
  Regression in kernel 4.15.0-91 causes kernel panic with Bcache
