On 16/07/2025 23:48, Askar Safin wrote:
---- On Wed, 16 Jul 2025 02:00:56 +0400 Jocelyn Falempe
<jfale...@redhat.com> wrote ---
> Yes, that's the default if you use a drm driver like bochs with fbdev
Thank you for the answer! I just tried a kernel from drm-tip with this config, with
drm_panic, in qemu. And the panic screen works.
But I don't like the result.
When a drm panic happens, messages printed to /dev/console disappear. Only kernel
messages remain.
Yes, that's the expected behavior. DRM panic only prints the kernel
messages, and doesn't mix them with console output.
Here are the steps to reproduce. Then I will describe how this breaks my
workflow.
Compile a kernel from drm-tip ( https://gitlab.freedesktop.org/drm/tip ). I used
commit b012f04b5be909a307ff629b297387e0ed55195a .
It seems to include this bochs patch (i.e. "drm/bochs: Add support for
drm_panic").
Use this miniconfig:
$ cat mini
CONFIG_64BIT=y
CONFIG_EXPERT=y
CONFIG_PRINTK=y
CONFIG_PRINTK_TIME=y
CONFIG_PCI=y
CONFIG_TTY=y
CONFIG_VT=y
CONFIG_VT_CONSOLE=y
CONFIG_DRM=y
CONFIG_DRM_FBDEV_EMULATION=y
CONFIG_DRM_BOCHS=y
CONFIG_FRAMEBUFFER_CONSOLE=y
CONFIG_PROC_FS=y
CONFIG_DRM_PANIC=y
CONFIG_DRM_PANIC_SCREEN="kmsg"
CONFIG_BLK_DEV_INITRD=y
CONFIG_RD_GZIP=y
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_SCRIPT=y
$ make KCONFIG_ALLCONFIG=mini allnoconfig
Create an initramfs that contains exactly these files:
$ find /tmp/i -ls
2861 0 drwxrwxr-x 3 user user 80 Jul 16 23:56 /tmp/i
2891 0 drwxrwxr-x 2 user user 80 Jul 16 23:56 /tmp/i/bin
2893 0 lrwxrwxrwx 1 user user 7 Jul 16 23:56 /tmp/i/bin/sh -> busybox
2892 1980 -rwxr-xr-x 1 user user 2024544 Jul 16 23:56 /tmp/i/bin/busybox
2864 4 -rwxrwxr-x 1 user user 43 Jul 16 23:18 /tmp/i/init
This is "init":
===
#!/bin/sh
set -e
echo hello
sleep 3
exit 0
===
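The packing step is not shown in this message. A minimal sketch of one common way to do it, assuming cpio and gzip are installed on the build host; pack_initramfs is a hypothetical helper name used only for illustration:

```shell
# Sketch: pack a directory into a gzip-compressed cpio archive.
# pack_initramfs is a hypothetical helper, not part of any tool.
pack_initramfs() {
    src=$1   # directory holding init, bin/, ...
    out=$2   # archive to create
    # Initramfs images use the "newc" cpio format. The file list is
    # taken relative to $src so that "init" ends up at the archive root.
    ( cd "$src" && find . | cpio -o -H newc ) | gzip > "$out"
}
```

With the layout listed above, this would presumably be called as "pack_initramfs /tmp/i /tmp/ini.cpio.gz". Files keep the ownership they have on the host, which should not matter for a test image like this one.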
Now boot this in Qemu. I used this command:
$ qemu-system-x86_64 -enable-kvm -m 1024 -kernel arch/x86/boot/bzImage -initrd /tmp/ini.cpio.gz
You will see the word "hello"; then, after 3 seconds, the system will fall into a
drm panic.
What I saw: the word "hello" disappeared when the system fell into the panic.
What I expected to see: the word "hello" should remain.
Even with fbcon, there is no guarantee that "hello" will remain visible;
that depends on the screen size and on the amount of logs that the kernel
panic prints.
Now let me describe how this breaks my workflow.
I often use hand-crafted shell scripts as PID 1, both in Qemu and on real
hardware.
I use them to reproduce and bisect various kernel bugs.
I always put "set -e" at the beginning of the shell script. This means that the
script fails after the first error.
And thus the system falls into a kernel panic.
I also sometimes put "set -x" to debug these scripts.
Thus, when the script fails and the panic happens, the faulty shell command will
be the last thing printed on screen before the panic stack trace.
But with drm_panic everything printed to /dev/console disappears.
This breaks my workflow.
In Qemu I can easily work around this by using a serial console.
But I cannot do this on real hardware.
And yes, I experience fbcon panic problems on real hardware, too, this is why
I'm interested in drm panic:
https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/14658
(I have not yet tested whether drm_panic fixes that fbcon i915 panic problem,
but I assume it does.)
I can work around this by using efi fb with fb panic as opposed to i915. But
this will not work if I am attempting to catch a bug in i915 itself.
(And yes, I recently found another i915-related bug, and I'm trying to debug it
using shell scripts running as PID 1.
Here it is: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/14598 .)
I can work around this by logging everything to disk.
But this will not work when everything is mounted read-only.
And this is exactly what happens, when I try to catch that kexec-related bug:
immediately before issuing "kexec -e" command I mount everything read-only.
The only remaining workaround is to redirect everything to /dev/kmsg,
i.e. put "exec > /dev/kmsg 2>&1" in the script.
This will work.
But I still don't like this.
This is the workaround I would suggest, as DRM panic can only access the
kmsg data, and has no knowledge of what fbcon was doing.
If the panic occurs because the PID 1 script exits, then the panic stack
trace is not that relevant?
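As an illustration of that workaround, here is a sketch that generates a variant of the "init" script from the reproduction steps with the redirect added. OUT and LOG_TARGET are hypothetical names introduced only for this example; LOG_TARGET exists so the script can be tried outside a VM, and on the real system it would stay at /dev/kmsg. The "false" line is a stand-in for whatever command fails.

```shell
# Sketch: write an init variant that sends all script output to the
# kernel log. OUT and LOG_TARGET are illustrative names (assumptions).
OUT=${OUT:-./init.kmsg}

cat > "$OUT" <<'EOF'
#!/bin/sh
# Redirect stdout and stderr; defaults to the kernel log buffer.
exec > "${LOG_TARGET:-/dev/kmsg}" 2>&1
set -e   # the first failing command ends the script (as PID 1: panic)
set -x   # trace each command, so the failing one is logged last
echo hello
false    # stand-in for the command that fails
echo "never printed"
EOF
chmod +x "$OUT"
```

Because the "set -x" trace goes to stderr, it follows the same redirect, so the failing command should end up in kmsg right before the panic messages. Heavy output written to /dev/kmsg may be rate-limited by the kernel; the printk.devkmsg= boot parameter controls that.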
Another thing you can try is to use DRM log instead of fbcon:
DRM_CLIENT_LOG=y
DRM_CLIENT_DEFAULT_LOG=y
DRM_FBDEV_EMULATION=n
DRM_CLIENT_DEFAULT="log"
DRM_PANIC=n
(and boot with console=drm_log)
drm-log doesn't scroll the whole screen, and uses the non-blocking
console API, so it is less likely to leave artifacts on the screen.
But in this case, you won't get the panic trace.
Best regards,
--
Jocelyn
--
Askar Safin
https://types.pl/@safinaskar