Bug#1049873: closing 1049873

2025-02-13 Thread Christoph Anton Mitterer
On Wed, 2025-02-12 at 08:50 +0100, Salvatore Bonaccorso wrote:
> Yes my understanding from your comments was that 6.12.13-1 does not
> expose the problem.

Okay... let me summarise :-)

- 6.12 doesn't show the original problem (hanging mv) described in
  this bug.
  I briefly (and wrongly) thought that the NFS4.1 mountpoint would
  instead not update the file size after the mv succeeded, but that was
  probably just a mistake on my side.
- The bookworm kernel *does* show the original problem (hanging mv).


> I have reopened the bug, but I believe the only one who actually can
> do something here is either you, bisecting the changes down to what
> broke the behaviour, or someone else using dCache and having the
> possibility to do experiments on a dedicated node.
> 
> I would start bisecting first by Debian kernel-image packages,
> narrowing down more closely where the behaviour got introduced, then
> from there the respective upstream stable series changes.
> 
> I hope this gives you enough guide already on how to proceed.

Hmm, I guess that would rather be quite a "waste" of time.
I cannot really test this on our production system, so I'd need to set
up a test system for bisecting.
And I have already adapted my use cases anyway, with a TODO to
revert after upgrading to trixie.

My only idea was that we might just leave it open in case someone else
stumbles over the symptom.

But perhaps it's indeed best to just close it as wontfix.

Sorry for the back and forth :-)


Cheers,
Chris.



Bug#1086028: I've reproduced the bug in QEMU

2025-02-13 Thread Sergei Golovan
tag 1086028 + patch
tag 1087809 + patch
tag 1093200 + patch
thanks

Hi!

I've finally managed to reproduce this EFAULT in QEMU (using an
Erlang-based script which is shipped in the wings3d source package):

1) I've installed Debian bookworm for mips64el in a qemu-system-mips64el
virtual machine (QEMU version from unstable), and upgraded it to the
current unstable (machine is loongson3-virt, cpu is Loongson-3A4000).
2) I have to enable SMP in qemu and use -rtc clock=rt (otherwise the
virtual machine won't boot; with clock=rt it sometimes boots and
sometimes hangs). The full QEMU command line is:

qemu-system-mips64el -machine loongson3-virt -m 4g -cpu Loongson-3A4000 \
-smp 2,sockets=2,cores=1,threads=1,maxcpus=2 \
-kernel vmlinuz-loongson-3 \
-rtc clock=rt \
-initrd initrd.img-loongson-3 \
-drive if=none,file=hda1.bin,id=hd,format=raw \
-net nic -net tap,ifname=tap0,script=/bin/true \
-device virtio-blk-pci,drive=hd \
-append "root=/dev/vda1 console=ttyS0" \
-nographic

Here the kernel and initrd can be either the stock 6.1.123-1 version or
6.1.123-1 with the attached patch. Unfortunately, QEMU can't boot for
me using the newest 6.12.12-1 kernel (it complains that it can't
uncompress the initrd, I don't know why).

4) I've installed the build dependencies of wings3d (basically, only
erlang-base is necessary)
5) I've extracted the wings3d source package (from stable:
https://packages.debian.org/source/stable/wings3d)
6) I've added the following line as the second line to
wings3d-2.2.9/intl_tools/gen_char_hrl

%%! +S 4:4 +SDcpu 4:4 +c false

(The first two options enable multiple threads, the last one allows
some workaround for the case when monotonic clock jumps backwards,
which appears to be the case for QEMU with SMP enabled).
7) I've run this gen_char_hrl in a loop until it fails.
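
Steps 6 and 7 could be scripted roughly as follows. This is only a sketch: the helper names add_emu_args and run_until_fail and the run limit are made up for illustration, and the commented-out invocation assumes the script can be executed directly from the extracted source tree.

```shell
#!/bin/sh
# Sketch only: helper names and the run limit are illustrative assumptions.

# Insert the emulator-arguments line as the second line of a script file.
add_emu_args() {
    file=$1
    tmp=$(mktemp) || return 1
    awk -v line='%%! +S 4:4 +SDcpu 4:4 +c false' \
        'NR == 1 { print; print line; next } { print }' "$file" > "$tmp" &&
        cat "$tmp" > "$file"   # cat keeps the original file's permissions
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Run a command repeatedly until it fails or MAX_RUNS is reached.
# Usage: run_until_fail MAX_RUNS command [args...]
run_until_fail() {
    max_runs=$1
    shift
    runs=0
    while [ "$runs" -lt "$max_runs" ]; do
        runs=$((runs + 1))
        if ! "$@"; then
            echo "failed on run $runs"
            return 1
        fi
    done
    echo "no failure in $runs runs"
    return 0
}

# Hypothetical invocation (paths relative to the extracted source package):
# add_emu_args wings3d-2.2.9/intl_tools/gen_char_hrl
# run_until_fail 10000 wings3d-2.2.9/intl_tools/gen_char_hrl
```

The loop stops at the first non-zero exit status, so the efault message printed by the failing run stays at the end of the terminal output.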

The result is that with the stock 6.1.123-1 kernel, in approximately 1%
of cases the script aborts with the message:

signal-dispatcher thread got unexpected error: efault (14)

which is exactly the error that prevents Erlang (and many Erlang-based
packages) from building on mips64el.

On the other hand, with the patched kernel the script loop has been
running for more than 24 hours (a few thousand runs) without
aborting. So I'm now fairly confident that the patch fixes the bug.

I'm not sure whether the patch has any adverse effects, so
it'd be better to try it on real hardware as well.

The patch is derived from the thread [1]. It reverses commit [2] with
an additional change, which is necessary because of changes in
expand_stack() introduced in commit [3].

[1] https://lore.kernel.org/all/mvmplxraqmd@suse.de/T/
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=4bce37a68ff884e821a02a731897a8119e0c37b7
[3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8d7071af890768438c14db6172cc8f9f4d04e184

Cheers!
-- 
Sergei Golovan


efault0.patch
Description: Binary data


Processed: I've reproduced the bug in QEMU

2025-02-13 Thread Debian Bug Tracking System
Processing commands for cont...@bugs.debian.org:

> tag 1086028 + patch
Bug #1086028 [src:linux] loupe: FTBFS on mips64el: failed to acquire jobserver 
token: Bad address (os error 14)
Bug #1087809 [src:linux] cargo: [mipsel64] failed to acquire jobserver 
token/Bad address (os error 14)
Bug #1093200 [src:linux] Some packages consistently FTBFS with EFAULT (Bad 
address) on most mips64el buildds
Added tag(s) patch.
Added tag(s) patch.
Added tag(s) patch.
> tag 1087809 + patch
Bug #1087809 [src:linux] cargo: [mipsel64] failed to acquire jobserver 
token/Bad address (os error 14)
Bug #1086028 [src:linux] loupe: FTBFS on mips64el: failed to acquire jobserver 
token: Bad address (os error 14)
Bug #1093200 [src:linux] Some packages consistently FTBFS with EFAULT (Bad 
address) on most mips64el buildds
Ignoring request to alter tags of bug #1087809 to the same tags previously set
Ignoring request to alter tags of bug #1086028 to the same tags previously set
Ignoring request to alter tags of bug #1093200 to the same tags previously set
> tag 1093200 + patch
Bug #1093200 [src:linux] Some packages consistently FTBFS with EFAULT (Bad 
address) on most mips64el buildds
Bug #1086028 [src:linux] loupe: FTBFS on mips64el: failed to acquire jobserver 
token: Bad address (os error 14)
Bug #1087809 [src:linux] cargo: [mipsel64] failed to acquire jobserver 
token/Bad address (os error 14)
Ignoring request to alter tags of bug #1093200 to the same tags previously set
Ignoring request to alter tags of bug #1086028 to the same tags previously set
Ignoring request to alter tags of bug #1087809 to the same tags previously set
> thanks
Stopping processing here.

Please contact me if you need assistance.
-- 
1086028: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1086028
1087809: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1087809
1093200: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1093200
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems



Processed: tagging 1091517

2025-02-13 Thread Debian Bug Tracking System
Processing commands for cont...@bugs.debian.org:

> tags 1091517 + upstream
Bug #1091517 [src:linux] linux: xhci regression breaks fastboot usb 
communication with android bootloader
Added tag(s) upstream.
> thanks
Stopping processing here.

Please contact me if you need assistance.
-- 
1091517: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1091517
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems



Processed: closing 1092591

2025-02-13 Thread Debian Bug Tracking System
Processing commands for cont...@bugs.debian.org:

> close 1092591
Bug #1092591 [src:linux] linux-image-6.12.6-amd64: SO_PEERSEC fails with 
ENOPROTOOPT with AppArmor enabled
Marked Bug as done
> thanks
Stopping processing here.

Please contact me if you need assistance.
-- 
1092591: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1092591
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems



Processed: closing 1049873

2025-02-13 Thread Debian Bug Tracking System
Processing commands for cont...@bugs.debian.org:

> close 1049873 6.12.13-1
Bug #1049873 [src:linux] regression: linux-image-6.1.0-10-amd64: NFS4.1/pNFS mv 
hangs, but finishes after Ctrl-C
Marked as fixed in versions linux/6.12.13-1.
Bug #1049873 [src:linux] regression: linux-image-6.1.0-10-amd64: NFS4.1/pNFS mv 
hangs, but finishes after Ctrl-C
Marked Bug as done
> thanks
Stopping processing here.

Please contact me if you need assistance.
-- 
1049873: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1049873
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems



Bug#1049873: closing 1049873

2025-02-13 Thread Salvatore Bonaccorso


On Fri, Feb 14, 2025 at 01:13:52AM +0100, Christoph Anton Mitterer wrote:
> On Wed, 2025-02-12 at 08:50 +0100, Salvatore Bonaccorso wrote:
> > Yes my understanding from your comments was that 6.12.13-1 does not
> > expose the problem.
> 
> Okay... let me summarise :-)
> 
> - 6.12 doesn't show the original problem (hanging mv) described in
>   this bug.
>   I briefly (and wrongly) thought that the NFS4.1 mountpoint would
>   instead not update the file size after the mv succeeded, but that was
>   probably just a mistake on my side.
> - The bookworm kernel *does* show the original problem (hanging mv).

Then, after all, my marking it as fixed in 6.12.13-1 was actually okay, and
the BTS knows that the 6.1.y version is still unfixed.


> > I have reopened the bug, but I believe the only one who actually can
> > do something here is either you, bisecting the changes down to what
> > broke the behaviour, or someone else using dCache and having the
> > possibility to do experiments on a dedicated node.
> > 
> > I would start bisecting first by Debian kernel-image packages,
> > narrowing down more closely where the behaviour got introduced, then
> > from there the respective upstream stable series changes.
> > 
> > I hope this gives you enough guide already on how to proceed.
> 
> Hmm, I guess that would rather be quite a "waste" of time.
> I cannot really test this on our production system, so I'd need to set
> up a test system for bisecting.
> And I have already adapted my use cases anyway, with a TODO to
> revert after upgrading to trixie.
> 
> My only idea was that we might just leave it open in case someone else
> stumbles over the symptom.
> 
> But perhaps it's indeed best to just close it as wontfix.
> 
> Sorry for the back and forth :-)

No problem. Given that, yes, I will close it with the known version
fixing the problem and then let the bug go :)

Regards,
Salvatore