tag 1086028 + patch
tag 1087809 + patch
tag 1093200 + patch
thanks

Hi!

I've finally managed to reproduce this EFAULT in QEMU (using an
Erlang-based script which is shipped in the wings3d source package):

1) I've installed Debian bookworm for mips64el in a qemu-system-mips64el
virtual machine (version from unstable) and upgraded it to
current unstable (the machine is loongson3-virt, the CPU is Loongson-3A4000).
2) I had to enable SMP in QEMU and use -rtc clock=rt (otherwise the
virtual machine won't boot; with clock=rt it sometimes boots and
sometimes hangs). The full QEMU command line is:

qemu-system-mips64el -machine loongson3-virt -m 4g -cpu Loongson-3A4000 \
            -smp 2,sockets=2,cores=1,threads=1,maxcpus=2 \
            -kernel vmlinuz-loongson-3 \
            -rtc clock=rt \
            -initrd initrd.img-loongson-3 \
            -drive if=none,file=hda1.bin,id=hd,format=raw \
            -net nic -net tap,ifname=tap0,script=/bin/true \
            -device virtio-blk-pci,drive=hd \
            -append "root=/dev/vda1 console=ttyS0" \
            -nographic

Here the kernel and initrd can be either the stock 6.1.123-1 version or
6.1.123-1 with the attached patch. Unfortunately, I can't boot the
newest 6.12.12-1 kernel in QEMU (it complains that it can't
uncompress the initrd; I don't know why).

4) I've installed the build dependencies of wings3d (basically, only
erlang-base is necessary).
5) I've extracted the wings3d source package (from stable:
https://packages.debian.org/source/stable/wings3d).
6) I've added the following line as the second line of
wings3d-2.2.9/intl_tools/gen_char_hrl:

%%! +S 4:4 +SDcpu 4:4 +c false

(The first two options enable multiple threads; the last one works
around the case where the monotonic clock jumps backwards,
which appears to happen in QEMU with SMP enabled.)
7) I've run this gen_char_hrl in a loop until it failed.
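
Steps 6 and 7 can be sketched roughly as below. This is only a minimal
sketch, not necessarily the exact commands used: the helper names
add_emu_args and run_until_fail are made up for illustration, and it
assumes the script can be started without arguments from the unpacked
source tree.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative, not from the wings3d package.

# Insert "$2" as the second line of file "$1", keeping the file's permissions.
add_emu_args() {
    tmp="$1.tmp.$$"
    awk -v line="$2" 'NR == 1 { print; print line; next } { print }' "$1" > "$tmp" &&
        cat "$tmp" > "$1" &&
        rm -f "$tmp"
}

# Run the given command repeatedly until it exits non-zero,
# then report how many runs succeeded before the failure.
run_until_fail() {
    runs=0
    while "$@"; do
        runs=$((runs + 1))
    done
    echo "failed after $runs successful runs"
}

# Example (assumes the current directory is wings3d-2.2.9 and that the
# script needs no arguments):
#   add_emu_args intl_tools/gen_char_hrl '%%! +S 4:4 +SDcpu 4:4 +c false'
#   run_until_fail ./intl_tools/gen_char_hrl
```

The loop stops at the first non-zero exit status, so the abort message
from the failing run is the last thing printed before the run count.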

The result is that with the stock 6.1.123-1 kernel, in approximately 1%
of cases the script aborts with the message:

signal-dispatcher thread got unexpected error: efault (14)

which is exactly the error that prevents Erlang (and many Erlang-based
packages) from building on mips64el.

On the other hand, with the patched kernel the script loop has been
running for more than 24 hours (a few thousand runs) without
aborting. So I'm now fairly confident that the patch fixes the bug.
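
As a rough sanity check of that confidence: if the runs were independent
and each one failed with a probability of about 1%, as on the stock
kernel, the chance of seeing no failure at all in N runs would be 0.99^N.
The sketch below computes this. The function name p_no_fail is made up,
and 2000 runs is only an assumed stand-in for "a few thousand".

```shell
#!/bin/sh
# Probability that "runs" independent runs all succeed when each run
# fails with probability "rate". Both inputs are illustrative assumptions.
p_no_fail() {
    awk -v rate="$1" -v runs="$2" \
        'BEGIN { printf "%.2e\n", exp(runs * log(1 - rate)) }'
}

# Example: 1% failure rate, 2000 runs (assumed figure).
p_no_fail 0.01 2000
```

Under these assumptions the result is vanishingly small, so a clean
run of this length would be very unlikely if the bug were still present.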

I'm not sure whether the patch has any adverse effects, so
it'd be better to try it on real hardware as well.

The patch is derived from the thread [1]. It reverts commit [2] with
an additional change, which is necessary because of the changes to
expand_stack() introduced in commit [3].

[1] https://lore.kernel.org/all/mvmplxraqmd....@suse.de/T/
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4bce37a68ff884e821a02a731897a8119e0c37b7
[3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8d7071af890768438c14db6172cc8f9f4d04e184

Cheers!
-- 
Sergei Golovan

Attachment: efault0.patch
Description: Binary data
