https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102772
--- Comment #18 from ro at CeBiTec dot Uni-Bielefeld.DE <ro at CeBiTec dot Uni-Bielefeld.DE> ---
> --- Comment #17 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
> (In reply to r...@cebitec.uni-bielefeld.de from comment #16)
>> I've wrapped that in a small test programming, calling foo both from the
>> initial thread and a new one: in all cases, the return value from foo
>> was 32-byte aligned as expected.
>
> Perhaps the problem only shows when the PT_TLS size is not a multiple of the
> alignment?  Or depends on other PT_TLS segments in shared libraries in the
> same process.  The pointer2.f90 testcase will link against libgomp which
> uses TLS as well.
> So perhaps try:
> struct __attribute__((aligned (16))) S { char buf[0x24]; };
> __thread struct S s;
> __attribute__((noipa)) S *foo (void) { return &s; }
> int
> main ()
> {
> #pragma omp parallel
>   __builtin_printf ("%p\n", foo ());
>   return 0;
> }
> ?

I've compiled that with g++ -m32 -O2 -fopenmp.  Initially, when foo was
just

        movl    %gs:0, %eax
        addl    $s@ntpoff, %eax
        ret

this worked reliably, emitting a 16-byte aligned address 48 times
(matching the number of cores).

However, when I changed the assembler output to

        pushl   %ebx
        movl    %gs:0, %ebx
        addl    $s@ntpoff, %ebx
        popl    %ebx
        ret

and relinked, the resulting addresses were just 4-byte aligned, exactly
one of them even being 0.

That might again suggest the Solaris ld/ld.so.1 took the TLS spec
literally (provided I've not created that mess myself: IIUC, %ebx is
callee-saved).
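
For anyone repeating this experiment, judging alignment by eye from 48 printed pointers is error-prone, so below is a minimal sketch of an explicit check. It reuses the struct and foo from the quoted testcase. The helper name misalignment is made up for illustration, and noinline stands in for noipa so that the snippet also builds with compilers that do not know noipa; neither choice comes from the bug report.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Same object as in the quoted testcase: 0x24 bytes of payload with a
// requested alignment of 16, so sizeof (S) is padded up to 0x30.
struct __attribute__((aligned (16))) S { char buf[0x24]; };
__thread struct S s;

// Assumption: noinline instead of the original noipa, for portability.
__attribute__((noinline)) S *foo (void) { return &s; }

// Hypothetical helper: number of bytes by which P misses ALIGN.
// Returns 0 when P is suitably aligned.  ALIGN must be a power of two.
std::size_t misalignment (const void *p, std::size_t align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return reinterpret_cast<std::uintptr_t> (p) & (align - 1);
}
```

Inside the `#pragma omp parallel` region one could then print `misalignment (foo (), alignof (S))` next to the pointer itself; any non-zero value marks a thread whose TLS block was not placed at a 16-byte boundary.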