>>> Joerg Sonnenberger <jo...@bec.de> wrote > > On Fri, Nov 22, 2019 at 08:53:25AM +0900, Takeshi Nakayama wrote: > > >>> Takeshi Nakayama <nakay...@netbsd.org> wrote > > > > > >>> Joerg Sonnenberger <jo...@bec.de> wrote > > > > > > > On Thu, Nov 21, 2019 at 11:06:16PM +0000, Takeshi Nakayama wrote: > > > > > Module Name: src > > > > > Committed By: nakayama > > > > > Date: Thu Nov 21 23:06:16 UTC 2019 > > > > > > > > > > Modified Files: > > > > > src/lib/libc/tls: tls.c > > > > > > > > > > Log Message: > > > > > Fix PR/54074 and PR/54093 completely. > > > > > > > > > > More similar to the ld.elf_so logic, it is necessary to align with > > > > > p_align first. Also, invert the #ifdef condition for consistency. > > > > > > > > This commit just wastes space without reason. > > > > > > No, without this commit the TLS offset calculation is mismatch if > > > alignof(max_align_t)) != p_align. > > > > Ah, this is wrong. > > Correctly, the TLS offset calculation is mismatch with what ld(1) > > expected if p_memsz is not aligned with p_align. > > For TLS variant I, it literally just adds padding at the end of the > allocation. For TLS variant II, it is redundant, because the rounding is > already done in with max_align_t. We do not support p_align > malloc > alignment and the patch is certainly nowhere near enough to change that. > It is actively harmful in that it can make people believe that it could > ever work.
I don't want to do anything about the case of p_align > malloc alignment. I just want to fix an usual 32-bit sparc statically linked binaries not working properly. The new jemalloc now uses TLS, and the 32-bit sparc statically linked binary ELF header looks like this: % readelf -l rescue/sh | egrep 'MemSiz|TLS' Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align TLS 0x6dc000 0x0075c000 0x0075c000 0x00b10 0x00b14 R 0x8 The point is when MemSiz (p_memsz) is not aligned with Align (p_align) as above. In tls.c rev 1.13: tls_size = p_memsz --> 0x00b14 tls_allocation = roundup2(tls_size, alignof(max_align_t)) --> 0x00b18 tcb = calloc() + tls_allocation --> calloc() + 0x00b18 p = tcb - tls_size --> calloc() + 4 memcpy(p, tls_initaddr, tls_initsize) So, the TLS initialization image is copied to calloc() + 4. However, ld(1) assumes that the TLS area starts with: tcb - roundup2(p_memsz, p_align) = tcb - 0x00b18 --> calloc() Since this is 4 bytes different, the initial value of the thread local variable is not set correctly. To fix this problem, I added the same thing as this line in ld.elf_so: https://nxr.netbsd.org/xref/src/libexec/ld.elf_so/tls.c#236 * obj->tlssize = p_memsz, obj->tlsalign = p_align -- Takeshi Nakayama