On 24.07.2024 17:47, Konstantin Belousov wrote:
On Wed, Jul 24, 2024 at 01:07:39PM +0000, John F Carr wrote:
On Jul 24, 2024, at 06:50, Konstantin Belousov <k...@freebsd.org> wrote:
On Wed, Jul 24, 2024 at 12:34:57PM +0200, m...@freebsd.org wrote:
On 24.07.2024 12:24, Konstantin Belousov wrote:
On Tue, Jul 23, 2024 at 08:11:13PM +0000, John F Carr wrote:
On Jul 23, 2024, at 13:46, Michal Meloun <meloun.mic...@gmail.com> wrote:
On 23.07.2024 11:36, Konstantin Belousov wrote:
On Tue, Jul 23, 2024 at 09:53:41AM +0200, Michal Meloun wrote:
The good news is that I'm finally able to generate a working/locking
test case. The culprit (at least for me) is if "-mcpu" is used when
compiling libthr (e.g. indirectly injected via CPUTYPE in /etc/make.conf).
If it is not used, libthr is broken (regardless of -O level or debug/normal
build), but -mcpu=cortex-a15 will always produce a working libthr.
I think this is very significant progress.
Do you plan to drill down more to see what is going on?
So the problem is now clear, and I fear it may apply to other architectures as
well.
dlopen_object() (from rtld_elf),
https://cgit.freebsd.org/src/tree/libexec/rtld-elf/rtld.c#n3766,
holds the rtld_bind_lock write lock for almost the entire time a new library is
loaded.
If the code uses a yet unresolved symbol to load the library, the rtl_bind()
function attempts to get read lock of rtld_bind_lock and a deadlock occurs.
In this case, it round_up() in _thr_stack_fix_protection,
https://cgit.freebsd.org/src/tree/lib/libthr/thread/thr_stack.c#n136.
Issued by __aeabi_uidiv (since not all armv7 processors support HW divide).
Unfortunately, I'm not sure how to fix it. The compiler can emit __aeabi_<> in
any place, and I'm not sure if it can resolve all the symbols used by rtld_eld and
libthr beforehand.
Michal
In this case (but not for all _aeabi_ functions) we can avoid division
as long as page size is a power of 2.
The function is
static inline size_t
round_up(size_t size)
{
if (size % _thr_page_size != 0)
size = ((size / _thr_page_size) + 1) *
_thr_page_size;
return size;
}
The body can be condensed to
return (size + _thr_page_size - 1) & ~(_thr_page_size - 1);
This is shorter in both lines of code and instruction bytes.
Lets not allow this to be lost. Could anybody confirm that the patch
below fixes the issue?
commit d560f4f6690a48476565278fd07ca131bf4eeb3c
Author: Konstantin Belousov <k...@freebsd.org>
Date: Wed Jul 24 13:17:55 2024 +0300
rtld: avoid division in __thr_map_stacks_exec()
The function is called by rtld with the rtld bind lock write-locked,
when fixing the stack permission during dso load. Not every ARMv7 CPU
supports the div, which causes the recursive entry into rtld to resolve
the __aeabi_uidiv symbol, causing self-lock.
Workaround the problem by using roundup2() instead of open-coding less
efficient formula.
Diagnosed by: mmel
Based on submission by: John F Carr <j...@mit.edu>
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Just realized that it is wrong. Stack size is user-controlled and it does
not need to be power of two.
Your change is correct. _thr_page_size is set to getpagesize(),
which is a power of 2. The call to roundup2 takes a user-provided
size and rounds it up to a multiple of the system page size.
I tested the change and it works. My change also works and
should compile to identical code. I forgot there was a standard
function to do the rounding.
Right, my bad, thank you for correcting my thinko.
For final resolving of deadlocks, after a full day of digging, I'm very much
incline of adding -znow to the linker flags for libthr.so (and maybe also
for ld-elf.so). The runtime cost of resolving all symbols at startup is very
low. Direct pre-solving in _thr_rtld_init() is problematic for the _aeabi_*
symbols, since they don't have an official C prototypes, and some are not
compatible with C calling conventions.
I do not like it. `-z now' changes (breaks) the ABI and makes some symbols
not preemtible.
In the worst case, we would need a call to the asm routine which causes the
resolution of the _eabi_* symbols on arm.
It would also be possible to link libthr with libgcc.a and use a linker map
to hide the _eabi_ symbols.
In principle yes, but if the ARM ABI states that _eabi symbols must be used,
and exported from libc, then this is also some form of ABI breakage.
I hope that https://reviews.freebsd.org/D46104 is acceptable :)