Hi,

On Thu, Feb 13, 2025 at 01:35:13PM +0300, Sergei Golovan wrote:
> tag 1086028 + patch
> tag 1087809 + patch
> tag 1093200 + patch
> thanks
>
> Hi!
>
> I've finally managed to reproduce this EFAULT in QEMU (using an
> Erlang-based script which is shipped in the wings3d source package):
>
> 1) I've installed Debian bookworm for mips64el in qemu-system-mips64el
> virtual machine (version from unstable), and upgraded it to the
> current unstable (machine is loongson3-virt, cpu is Loongson-3A4000).
> 2) I have to enable SMP in qemu and use -rtc clock=rt (otherwise the
> virtual machine won't boot, with clock=rt sometimes it boots,
> sometimes it hangs). The full QEMU command line is:
>
> qemu-system-mips64el -machine loongson3-virt -m 4g -cpu Loongson-3A4000 \
>     -smp 2,sockets=2,cores=1,threads=1,maxcpus=2 \
>     -kernel vmlinuz-loongson-3 \
>     -rtc clock=rt \
>     -initrd initrd.img-loongson-3 -drive if=none,file=hda1.bin,id=hd,format=raw \
>     -net nic -net tap,ifname=tap0,script=/bin/true \
>     -device virtio-blk-pci,drive=hd -append "root=/dev/vda1 console=ttyS0" \
>     -nographic
>
> Here kernel and initrd can be either stock 6.1.123-1 version or
> 6.1.123-1 with the attached patch. Unfortunately, QEMU can't boot for
> me using the newest 6.12.12-1 kernel (it complains that it can't
> uncompress initrd, I don't know why).
>
> 4) I've installed the build dependencies of wings3d (basically, only
> erlang-base is necessary)
> 5) I've extracted the wings3d source package (from stable:
> https://packages.debian.org/source/stable/wings3d)
> 6) I've added the following line as the second line to
> wings3d-2.2.9/intl_tools/gen_char_hrl
>
> %%! +S 4:4 +SDcpu 4:4 +c false
>
> (The first two options enable multiple threads, the last one allows
> some workaround for the case when monotonic clock jumps backwards,
> which appears to be the case for QEMU with SMP enabled).
> 7) I've run this gen_char_hrl in a loop until it fails.
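
For anyone who wants to repeat step 7, the loop can be sketched roughly
as below. This is only an illustration, not the script that was
actually used: the helper name run_until_failure, the max_runs limit
and passing the command on the command line are all assumptions.

```python
import subprocess
import sys


def run_until_failure(cmd, max_runs=10000):
    """Run cmd repeatedly until it exits with a non-zero status.

    Returns (run_number, exit_status, output) for the first failing run,
    or None if all max_runs runs succeeded. Both the function name and
    the max_runs default are hypothetical.
    """
    for run in range(1, max_runs + 1):
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # keep the error message with the output
            text=True,
        )
        # A process killed by a signal reports a negative status,
        # so it is caught by this check as well.
        if result.returncode != 0:
            return run, result.returncode, result.stdout
    return None


if __name__ == "__main__" and len(sys.argv) > 1:
    # The command to repeat is taken from the command line as-is.
    failure = run_until_failure(sys.argv[1:])
    if failure is None:
        print("no failure observed")
    else:
        run, status, output = failure
        print(f"run {run} failed with exit status {status}:")
        print(output)
```

Pass whatever command line you use to start gen_char_hrl as the
arguments. With a failure rate of about 1%, a few hundred runs should
normally be enough to see the efault message on an affected kernel.
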
>
> The result is that with the stock 6.1.123-1 kernel, in approximately 1%
> of cases the script aborts with the message:
>
> signal-dispatcher thread got unexpected error: efault (14)
>
> which is exactly the error that prevents Erlang (and many Erlang-based
> packages) from building on mips64el.
>
> On the other hand, with the patched kernel the script loop has been
> running for more than 24 hours (a few thousand runs) without
> aborting. So I'm now fairly confident that the patch fixes the bug.
>
> I'm not sure whether the patch has any adverse effects, so
> it'd be better to try it on real hardware as well.
>
> The patch is derived from the thread [1]. It reverses commit [2] with
> an additional change, which is necessary because of changes in
> expand_stack() introduced in commit [3].
>
> [1] https://lore.kernel.org/all/mvmplxraqmd....@suse.de/T/
> [2]
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4bce37a68ff884e821a02a731897a8119e0c37b7
> [3]
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8d7071af890768438c14db6172cc8f9f4d04e184

Just one observation, as we talked about this issue in our weekly
Kernel team meeting: Reverting
4bce37a68ff884e821a02a731897a8119e0c37b7 might not be an option, as
it is part of the upstream fixes addressing CVE-2023-3269.

Some information about the CVE:

https://www.openwall.com/lists/oss-security/2023/07/05/1
https://github.com/lrh2000/StackRot
https://www.openwall.com/lists/oss-security/2023/07/28/1

This means that this needs to be addressed (upstream, for 6.1.y) in
a way that does not break the CVE fix but unbreaks the mips64el
situation. Ben aims to look into it.

Regards,
Salvatore