[Bug ld/30930] New: ld-2.41 links mame in a way which gets stuck on aarch64
https://sourceware.org/bugzilla/show_bug.cgi?id=30930

            Bug ID: 30930
           Summary: ld-2.41 links mame in a way which gets stuck on aarch64
           Product: binutils
           Version: 2.41
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: ld
          Assignee: unassigned at sourceware dot org
          Reporter: belegdol at gmail dot com
  Target Milestone: ---

Hello,

it appears that there is an issue with binutils-2.41 on aarch64: mame is linked in such a way that its validation run gets stuck forever. I discussed this with mame upstream here:
https://github.com/mamedev/mame/issues/11587
As suggested, I tried building with -fuse-ld=lld, which made the problem go away. I only have reproduction instructions for Fedora and am not sure whether other distributions are affected:

1. Install Fedora on an aarch64 system
2. fedpkg clone --anonymous mame
3. cd mame
4. fedpkg srpm
5. mock -r fedora-rawhide-aarch64 mame-0.259-1.fc40.src.rpm
6. wait

-- 
You are receiving this mail because:
You are on the CC list for the bug.
[Bug ld/30930] ld-2.41 links mame in a way which gets stuck on aarch64
https://sourceware.org/bugzilla/show_bug.cgi?id=30930

--- Comment #1 from Julian Sikorski ---
Validation gets stuck on the following function:

#0  0xb5bddb08 in ___ZN4bgfx12VertexLayoutC1Ev_bti_veneer ()
#1  0xf5870b2c in call_init (env=, argv=0xf388, argc=1) at ../csu/libc-start.c:145
#2  __libc_start_main_impl (main=0xaeedadc0 , argc=1, argv=0xf388, init=, fini=, rtld_fini=, stack_end=) at ../csu/libc-start.c:347
#3  0xaef01570 in _start ()
[Bug ld/30930] ld-2.41 links mame in a way which gets stuck on aarch64
https://sourceware.org/bugzilla/show_bug.cgi?id=30930

--- Comment #3 from Julian Sikorski ---
I will try, however my problem is that the issue only appears to happen with a full (as opposed to single-driver) build. It takes close to 3 hours on the only aarch64 machine I have access to so far, which makes experimentation somewhat challenging.
[Bug ld/30930] ld-2.41 links mame in a way which gets stuck on aarch64
https://sourceware.org/bugzilla/show_bug.cgi?id=30930

--- Comment #5 from Julian Sikorski ---
(In reply to Nick Clifton from comment #4)
> You may find it useful to compare a broken-linked-with-ld.bfd binary
> with a working-linked-with-lld binary. In particular the contents
> of whatever init sections they have, and the ordering of function
> pointers therein.

I am downloading the broken binary from the test system now. How can I do the above?
[Bug ld/30930] ld-2.41 links mame in a way which gets stuck on aarch64
https://sourceware.org/bugzilla/show_bug.cgi?id=30930

--- Comment #6 from Julian Sikorski ---
(In reply to Sam James from comment #2)
> Could you try give some instructions to reproduce manually from source,
> without using Fedora and Fedora specific tooling?
>
> Bisecting binutils using 'git bisect run' + timeout would be helpful too if
> you can.

This might work, but unfortunately I cannot test it:

1. git clone https://github.com/mamedev/mame.git
2. Install the dependencies using your distribution's mechanism of choice
3. cd mame
4. make -O -j2 V=1 VERBOSE=1 NOWERROR=1 OPTIMIZE=2 PYTHON_EXECUTABLE=python3 QT_HOME=/usr/lib64/qt6 VERBOSE=1 USE_SYSTEM_LIB_ASIO=1 USE_SYSTEM_LIB_EXPAT=1 USE_SYSTEM_LIB_FLAC=1 USE_SYSTEM_LIB_GLM=1 USE_SYSTEM_LIB_JPEG=1 USE_SYSTEM_LIB_PORTAUDIO=1 USE_SYSTEM_LIB_PORTMIDI=1 USE_SYSTEM_LIB_PUGIXML=1 USE_SYSTEM_LIB_RAPIDJSON=1 USE_SYSTEM_LIB_SQLITE3=1 USE_SYSTEM_LIB_UTF8PROC=1 USE_SYSTEM_LIB_ZLIB=1 'SDL_INI_PATH=/etc/mame;' TOOLS=1 'OPT_FLAGS=-O2 -fexceptions -g1 -grecord-gcc-switches -pipe -Wall -Wno-complain-wrong-lang -Werror=format-security -Werror=implicit-function-declaration -Werror=implicit-int -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -mbranch-protection=standard -fasynchronous-unwind-tables -fstack-clash-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer' 'LDOPTS=-Wl,-z,relro -Wl,--as-needed -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -Wl,--build-id=sha1 -specs=/usr/lib/rpm/redhat/redhat-package-notes'
5. ./mame -validate
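For the 'git bisect run' + timeout approach suggested in comment #2, a test script could be built around helpers like the ones below. This is only a sketch: the function names are made up for illustration, the steps that rebuild ld and relink mame at each commit are left out, and it assumes timeout(1) from GNU coreutils, which exits with status 124 when it has to kill the command.

```shell
#!/bin/sh
# Sketch only: helpers for a `git bisect run` test script.
# `git bisect run` interprets the script's exit status as:
#   0 = good, 125 = skip this commit, 1..127 (except 125) = bad.

# Translate the exit status of the timed validation run into a bisect verdict.
bisect_verdict() {
    case "$1" in
        0)   echo 0 ;;     # validation finished: commit is good
        124) echo 1 ;;     # timeout(1) fired, i.e. the hang: commit is bad
        *)   echo 125 ;;   # anything else (crash, missing binary): skip
    esac
}

# Run a command under a time limit and print the bisect verdict for it.
# $1 = limit in seconds, remaining arguments = command to run.
run_validation() {
    limit=$1
    shift
    timeout "$limit" "$@" >/dev/null 2>&1 && status=0 || status=$?
    bisect_verdict "$status"
}
```

A hypothetical bisect-test.sh would rebuild ld, relink mame, and then end with something like exit "$(run_validation 600 ./mame -validate)", where the 600 second limit is a guess that has to be longer than a healthy validation run. It would be started with git bisect run ./bisect-test.sh after marking one known good and one known bad revision.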
[Bug ld/30930] ld-2.41 links mame in a way which gets stuck on aarch64
https://sourceware.org/bugzilla/show_bug.cgi?id=30930

--- Comment #8 from Julian Sikorski ---
(In reply to Nick Clifton from comment #7)
> (In reply to Julian Sikorski from comment #5)
> > (In reply to Nick Clifton from comment #4)
> > > You may find it useful to compare a broken-linked-with-ld.bfd binary
> > > with a working-linked-with-lld binary. In particular the contents
> > > of whatever init sections they have, and the ordering of function
> > > pointers therein.
> >
> > I am downloading the broken binary from the test system now. How can I do
> > the above?
>
> Well first you can compare the disassembly of the .init section to make sure
> that it is the same in both binaries:
>
>   objdump -D -j .init mame
>
> Next I was going to suggest that you check the contents of the .init_array
> section but it appears to be all zeros, which is a bit strange.
>
> You could be paranoid and check that the hardware property notes are the
> same on both binaries:
>
>   readelf -n -W mame | grep -e .note.gnu.property -A 4
>
> But I doubt if that show any discrepancies.
>
> But I suspect that the only real way you are going to get some traction on
> this problem is if you bring in the glibc folks. Maybe file a bug report
> telling them that mame is hanging during initialization and that you need
> their help finding out where things have gone wrong ? Let them know about
> the new version of binutils of course, but do ask them if they can track
> down exactly what the linker has done wrong in order to cause the init code
> to hang.

How would I bring in help from glibc folks? Should I just reassign the bug to glibc?
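The objdump comparison quoted above can be scripted so that both binaries are dumped and diffed in one go. The following is a rough sketch, not something posted in the bug: the function names and the OBJDUMP override are assumptions made for illustration, and only the objdump -D -j invocation comes from comment #7.

```shell
#!/bin/sh
# Sketch only: diff one section of two differently linked binaries.
# OBJDUMP may be set to pick a different objdump (assumption, e.g. a
# cross toolchain); it defaults to plain `objdump`.

# Print the disassembly of section $2 from binary $1.
dump_section() {
    # Drop the "file format" banner line, which embeds the file name and
    # would otherwise always differ between the two binaries.
    ${OBJDUMP:-objdump} -D -j "$2" "$1" | grep -v 'file format'
}

# Print a unified diff of one section; returns 0 if the dumps are identical.
# $1 = binary linked with ld.bfd, $2 = binary linked with lld, $3 = section
compare_section() {
    dump_a=$(mktemp) || return 2
    dump_b=$(mktemp) || return 2
    dump_section "$1" "$3" > "$dump_a"
    dump_section "$2" "$3" > "$dump_b"
    diff -u "$dump_a" "$dump_b" && rc=0 || rc=$?
    rm -f "$dump_a" "$dump_b"
    return "$rc"
}
```

With the two binaries saved under hypothetical names, compare_section mame-bfd mame-lld .init prints nothing when the .init disassembly matches. Since the two linkers lay out the image differently, addresses in the dump may differ even when the code is equivalent, so some noise in the diff is to be expected.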
[Bug ld/30930] ld-2.41 links mame in a way which gets stuck on aarch64
https://sourceware.org/bugzilla/show_bug.cgi?id=30930

--- Comment #10 from Julian Sikorski ---
Done:
https://bugzilla.redhat.com/show_bug.cgi?id=2241902

I managed to set up an aarch64 rawhide instance on Oracle Cloud but I cannot connect to it yet :( If I manage to get it working, I can see if I can set up a bisect mentioned in comment #2.
[Bug ld/30930] ld-2.41 links mame in a way which gets stuck on aarch64
https://sourceware.org/bugzilla/show_bug.cgi?id=30930

--- Comment #11 from Julian Sikorski ---
With a non-mock, fedpkg compile build on Fedora rawhide aarch64 running on OCI, the backtrace is slightly different:

#0  0xb5bd4fb0 in ___ZN3emu6detail16device_registrar15register_deviceERNS0_21device_type_impl_baseE_bti_veneer ()
#1  0xaec52368 in device_type_impl_base () at ../../../../../src/emu/device.h:240
#2  device_type_impl () at ../../../../../src/emu/device.h:283
#3  __static_initialization_and_destruction_0 () at ../../../../../src/mame/acorn/z88_impexp.cpp:34
#4  _GLOBAL__sub_I_Z88_IMPEXP () at ../../../../../src/mame/acorn/z88_impexp.cpp:278
#5  0xf5870b2c in call_init (env=, argv=0xf258, argc=2) at ../csu/libc-start.c:145
#6  __libc_start_main_impl (main=0xaeedadc0 , argc=2, argv=0xf258, init=, fini=, rtld_fini=, stack_end=) at ../csu/libc-start.c:347
#7  0xaef01570 in _start
[Bug ld/30930] ld-2.41 links mame in a way which gets stuck on aarch64
https://sourceware.org/bugzilla/show_bug.cgi?id=30930

--- Comment #13 from Julian Sikorski ---
Thanks! The patch does not revert cleanly unfortunately and the changes are complicated enough that I do not feel comfortable running git mergetool. Would someone please be so kind and provide a patch I can apply against 2.41?
[Bug ld/30930] ld-2.41 links mame in a way which gets stuck on aarch64
https://sourceware.org/bugzilla/show_bug.cgi?id=30930

--- Comment #14 from Julian Sikorski ---
(In reply to Julian Sikorski from comment #10)
> Done:
> https://bugzilla.redhat.com/show_bug.cgi?id=2241902
>
> I managed to set up an aarch64 rawhide instance on Oracle Cloud but I cannot
> connect to it yet :( If I manage to get it working, I can see if I can set
> up a bisect mentioned in comment #2.

Is there a straightforward way of disabling -Werror for non-releases? I got my cloud instance running but I cannot build a mid-release snapshot due to this.
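On the -Werror question, the binutils configure script accepts --disable-werror, which is one way to build a development snapshot whose warnings would otherwise stop the build. The sketch below is not taken from the bug: the function name, both paths and the MAKE override are assumptions for illustration, and it builds only ld and what ld depends on.

```shell
#!/bin/sh
# Sketch only: out-of-tree build of ld from a binutils-gdb checkout with
# warnings not treated as errors. Both paths are hypothetical arguments.
# $1 = absolute path to the checkout, $2 = scratch build directory
# The body runs in a subshell so the caller's working directory is kept.
build_ld_snapshot() (
    mkdir -p "$2" && cd "$2" || exit 1
    # --disable-werror keeps the build from adding -Werror;
    # --disable-gdb skips the gdb directory, which is not needed to test ld.
    "$1/configure" --disable-werror --disable-gdb || exit 1
    # all-ld is the top-level target that builds ld and its dependencies.
    ${MAKE:-make} all-ld
)
```

Called as build_ld_snapshot "$HOME/binutils-gdb" "$HOME/build-ld" (both paths hypothetical), this should leave the new linker in the ld subdirectory of the build directory, assuming the usual binutils build layout.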
[Bug ld/30930] ld-2.41 links mame in a way which gets stuck on aarch64
https://sourceware.org/bugzilla/show_bug.cgi?id=30930

--- Comment #17 from Julian Sikorski ---
(In reply to Nick Clifton from comment #16)
> Created attachment 15152 [details]
> Proposed patch
>
> (In reply to Julian Sikorski from comment #13)
> > Thanks! The patch does not revert cleanly unfortunately and the changes are
> > complicated enough that I do not feel comfortable running git mergetool.
> > Would someone please be so kind and provide a patch I can apply against
> > 2.41?
>
> Please try this patch.

Thanks! With this patch applied the linked mame binary no longer gets stuck.
[Bug ld/30930] ld-2.41 links mame in a way which gets stuck on aarch64
https://sourceware.org/bugzilla/show_bug.cgi?id=30930

--- Comment #21 from Julian Sikorski ---
(In reply to Szabolcs Nagy from comment #20)
> seems they made the build use lld, so now i have to undo that.
> will look at it tomorrow

Sorry about that, I should have mentioned it here. You do not need to do git revert. You can do

$ fedpkg switch-branch f39
$ fedpkg srpm
$ mock -r fedora-rawhide-aarch64 mame-0.259-1.fc39.src.rpm
[Bug ld/30930] ld-2.41 links mame in a way which gets stuck on aarch64
https://sourceware.org/bugzilla/show_bug.cgi?id=30930

--- Comment #23 from Julian Sikorski ---
I was able to complete git bisect in the meantime. It also points to 15b4f66b0a9a3be6caf1898d22a13c39e662006f being the first bad commit.
Interestingly enough, I was not able to reproduce the issue with a simple make from mame's git snapshot, which indicates that the erroneous behaviour is being triggered by Fedora RPM packaging options and/or compiler and/or linker flags.
[Bug ld/30930] ld-2.41 links mame in a way which gets stuck on aarch64
https://sourceware.org/bugzilla/show_bug.cgi?id=30930

--- Comment #24 from Julian Sikorski ---
I was able to reproduce the problem with the following make call, without the need to use the RPM tooling:

make -j16 VERBOSE=1 NOWERROR=1 SYMBOLS=1 SYMLEVEL=1 OPTIMIZE=2 OPT_FLAGS="-O2 -fexceptions -grecord-gcc-switches -pipe -Wall -Wno-complain-wrong-lang -Werror=format-security -Werror=implicit-function-declaration -Werror=implicit-int -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -fstack-protector-strong -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -mbranch-protection=standard -fasynchronous-unwind-tables -fstack-clash-protection -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer" LDOPTS="-Wl,-z,relro -Wl,--as-needed -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -Wl,--build-id=sha1"
[Bug ld/30930] Broken BTI veneers: ld-2.41 links mame in a way which gets stuck on aarch64
https://sourceware.org/bugzilla/show_bug.cgi?id=30930

--- Comment #28 from Julian Sikorski ---
Thank you. I can confirm that these 5 patches allow mame to link successfully on aarch64 when applied on top of Fedora binutils-2.41-10.fc40.
[Bug ld/23304] ARM linker fails to combine identical typeinfo sections
https://sourceware.org/bugzilla/show_bug.cgi?id=23304

Julian Sikorski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |belegdol at gmail dot com

_______________________________________________
bug-binutils mailing list
bug-binutils@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-binutils