Hi,

[Adding Thomas Bogendoerfer and Jiaxun Yang to recipients; this is about
the Debian issues on MIPS: https://bugs.debian.org/1086028,
https://bugs.debian.org/1087809, https://bugs.debian.org/1093200]

On Wed, Feb 19, 2025 at 11:52:03PM +0100, Salvatore Bonaccorso wrote:
> Hi,
>
> On Thu, Feb 13, 2025 at 01:35:13PM +0300, Sergei Golovan wrote:
> > tag 1086028 + patch
> > tag 1087809 + patch
> > tag 1093200 + patch
> > thanks
> >
> > Hi!
> >
> > I've finally managed to reproduce this EFAULT in QEMU (using an
> > Erlang-based script which is shipped in the wings3d source package):
> >
> > 1) I've installed Debian bookworm for mips64el in qemu-system-mips64el
> > virtual machine (version from unstable), and upgraded it to the
> > current unstable (machine is loongson3-virt, cpu is Loongson-3A4000).
> > 2) I have to enable SMP in qemu and use -rtc clock=rt (otherwise the
> > virtual machine won't boot, with clock=rt sometimes it boots,
> > sometimes it hangs). The full QEMU command line is:
> >
> > qemu-system-mips64el -machine loongson3-virt -m 4g -cpu Loongson-3A4000 \
> >   -smp 2,sockets=2,cores=1,threads=1,maxcpus=2 \
> >   -kernel vmlinuz-loongson-3 \
> >   -rtc clock=rt \
> >   -initrd initrd.img-loongson-3 \
> >   -drive if=none,file=hda1.bin,id=hd,format=raw \
> >   -net nic -net tap,ifname=tap0,script=/bin/true \
> >   -device virtio-blk-pci,drive=hd \
> >   -append "root=/dev/vda1 console=ttyS0" \
> >   -nographic
> >
> > Here kernel and initrd can be either stock 6.1.123-1 version or
> > 6.1.123-1 with the attached patch. Unfortunately, QEMU can't boot for
> > me using the newest 6.12.12-1 kernel (it complains that it can't
> > uncompress initrd, I don't know why).
> >
> > 4) I've installed the build dependencies of wings3d (basically, only
> > erlang-base is necessary)
> > 5) I've extracted the wings3d source package (from stable:
> > https://packages.debian.org/source/stable/wings3d)
> > 6) I've added the following line as the second line to
> > wings3d-2.2.9/intl_tools/gen_char_hrl
> >
> > %%! +S 4:4 +SDcpu 4:4 +c false
> >
> > (The first two options enable multiple threads, the last one allows
> > some workaround for the case when monotonic clock jumps backwards,
> > which appears to be the case for QEMU with SMP enabled).
> > 7) I've run this gen_char_hrl in a loop until it fails.
> >
> > The result is that with the stock 6.1.123-1 kernel approximately in 1%
> > cases the script aborts with message:
> >
> > signal-dispatcher thread got unexpected error: efault (14)
> >
> > which is exactly the error that prevents Erlang (and many Erlang-based
> > packages) from building on mips64el.
> >
> > On the other hand, with the patched kernel the script loop is still
> > running for more than 24 hours (a few thousands runs) without
> > aborting. So I'm now fairly confident that the patch fixes the bug.
> >
> > I'm not sure if there's no adverse effects caused by the patch, so
> > it'd be better to try it on real hardware as well.
> >
> > The patch is derived from the thread [1]. It reverses commit [2] with
> > an additional change, which is necessary because of changes in
> > expand_stack() introduced in commit [3].
> >
> > [1] https://lore.kernel.org/all/mvmplxraqmd....@suse.de/T/
> > [2]
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4bce37a68ff884e821a02a731897a8119e0c37b7
> > [3]
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8d7071af890768438c14db6172cc8f9f4d04e184
>
> Just one observation, as we talked about this issue in our weekly
> Kernel team meeting: Reverting
> 4bce37a68ff884e821a02a731897a8119e0c37b7 might not be an option as
> this is part of the upstream fixes for addressing CVE-2023-3269.
>
> Some information about the CVE:
> https://www.openwall.com/lists/oss-security/2023/07/05/1
> https://github.com/lrh2000/StackRot
> https://www.openwall.com/lists/oss-security/2023/07/28/1
>
> This means that this needs to be addressed (upstream, for 6.1.y) in a
> way that it does not break the CVE fix but unbreaks the mips64el
> situation.
>
> Ben aims to look into it.

There is 8fa507083388 ("mm/memory: Use exception ip to search
exception tables") upstream, which fixes 4bce37a68ff8 ("mips/mm:
Convert to using lock_mm_and_find_vma()") and relates to the thread
https://lore.kernel.org/r/75e9fd7b08562ad9b456a5bdaacb7cc220311cc9.ca...@xry111.site/ .

In fact the commit was backported to:

v6.6.18: 94d34a6861a2807356b653fc12f958196ebbc043 mm/memory: Use exception ip to search exception tables
v6.7.6: c3a7dbff8d0d4d7174d2162e4db7bdcfd3cb8886 mm/memory: Use exception ip to search exception tables
v6.8-rc5: 8fa5070833886268e4fb646daaca99f725b378e9 mm/memory: Use exception ip to search exception tables

but it cannot be applied as-is to 6.1.y. At a minimum it depends on
11ba1728be3e ("ptrace: Introduce exception_ip arch hook").

As Ben Hutchings said he will look at this issue, I will not further
"hijack" the thread, but I thought it was worth mentioning the relation
to CVE-2023-3269/StackRot and the potentially missing bits from the
newer stable series.

Regards,
Salvatore