https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82803

nsz at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |nsz at gcc dot gnu.org

--- Comment #4 from nsz at gcc dot gnu.org ---
i ran into the same issue:

static __thread int x;
static int *volatile p;
void f(int c)
{
    while (c--)
      p = &x;
}

with -xc -O2 -fPIC compiles to

  pushq %rbx
  leal -1(%rdi), %ebx
.L10:
  leaq x@tlsld(%rip), %rdi
  call __tls_get_addr@PLT
  subl $1, %ebx
  addq $x@dtpoff, %rax
  movq %rax, p(%rip)
  cmpl $-1, %ebx
  jne .L10
  popq %rbx
  ret

note that with -funroll-loops the loop is

.L46:
  leaq x@tlsld(%rip), %rdi
  call __tls_get_addr@PLT
  subl $8, %ebx
  addq $x@dtpoff, %rax
  movq %rax, p(%rip)
  movq %rax, p(%rip)
  movq %rax, p(%rip)
  movq %rax, p(%rip)
  movq %rax, p(%rip)
  movq %rax, p(%rip)
  movq %rax, p(%rip)
  movq %rax, p(%rip)
  cmpl $-1, %ebx
  jne .L46

so the loop unroller knows it only needs to compute the address once, but gcc
fails to hoist it out of the loop.
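
for reference, a minimal sketch of what the hoisted form would look like if written by hand at the source level. the name f_hoisted is hypothetical (it is not part of the report), and this only illustrates the intended semantics: gcc may treat &x as an invariant and move the computation back into the loop, so this is not claimed to be a workaround.

```c
/* same declarations as in the reported test case */
static __thread int x;
static int *volatile p;

/* hypothetical name, for illustration only */
void f_hoisted(int c)
{
    int *addr = &x;   /* tls address taken once, before the loop */
    while (c--)
        p = addr;     /* loop body is just the volatile store */
}
```

the expected code for the loop in this form is a single __tls_get_addr call before the loop label, with only the store and the counter update inside it.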

if i use a simple global, then the GOT access is hoisted; if i use an
__attribute__((const)) function call, then that call is hoisted too. only the
tls address computation is broken.
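
the two comparison cases were not included in the comment; a sketch of what they might look like, under assumptions, is below. the names g, get_addr, f_global and f_const are hypothetical, and noinline is added here only so that the const call is not simply inlined away in a single translation unit.

```c
int g;                       /* plain global: with -fPIC its address comes from the GOT */
static int *volatile p;

/* const function: result depends on nothing but its (empty) argument list */
__attribute__((const, noinline))
int *get_addr(void)
{
    return &g;
}

/* comparison case 1: address of a simple global inside the loop */
void f_global(int c)
{
    while (c--)
        p = &g;
}

/* comparison case 2: const function call inside the loop */
void f_const(int c)
{
    while (c--)
        p = get_addr();
}
```

compiled with the same -xc -O2 -fPIC flags, these are the cases described above as being hoisted, in contrast to the __thread version.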

the issue is not present with -m32 (i386 code gen), but it is present on e.g.
aarch64 and powerpc64, and with tlsdesc (-mtls-dialect=gnu2), where it's the
tlsdesc call that's in the loop instead of the __tls_get_addr call.
