On Tue, Jun 09, 2015 at 02:32:07PM +0200, Uros Bizjak wrote:
> > -          emit_insn (gen_rtx_SET (base, XEXP (part[1][0], 0)));
> > +          addr = XEXP (part[1][0], 0);
> > +          if (TARGET_TLS_DIRECT_SEG_REFS)
> > +            {
> > +              struct ix86_address parts;
> > +              int ok = ix86_decompose_address (addr, &parts);
> > +              gcc_assert (ok);
> > +              if (parts.seg == DEFAULT_TLS_SEG_REG)
> > +                {
> > +                  /* It is not valid to use %gs: or %fs: in
> > +                     lea though, so we need to remove it from the
> > +                     address used for lea and add it to each individual
> > +                     memory loads instead. */
> > +                  addr = copy_rtx (addr);
> > +                  rtx *x = &addr;
> > +                  while (GET_CODE (*x) == PLUS)
>
> Why not use RTX iterators here? IMO, it would be much more readable.

Do you mean something like this? It is larger and I don't see any
readability advantages there at all (plus the 4.8/4.9 backports can't use
that anyway). But if you prefer it, I can retest it.

I still need to look for a PLUS and scan its individual operands, because
the desired change is that the PLUS is replaced with its operand (the one
that isn't UNSPEC_TP). Getting rid of the ix86_decompose_address call would
mean either performing a (wasteful) copy_rtx unconditionally, or two RTX
iterator passes. The PLUS it is looking for has to be a toplevel PLUS or a
PLUS in XEXP (x, 0) of that (recursively), otherwise ix86_decompose_address
wouldn't recognize it.

2015-06-09  Jakub Jelinek  <ja...@redhat.com>

        PR target/66470
        * config/i386/i386.c (ix86_split_long_move): For collisions
        involving direct tls segment refs, move the UNSPEC_TP out of
        the address for lea, to each of the memory loads.

        * gcc.dg/tls/pr66470.c: New test.

--- gcc/config/i386/i386.c.jj	2015-06-08 15:41:19.000000000 +0200
+++ gcc/config/i386/i386.c	2015-06-09 14:42:18.357849227 +0200
@@ -22866,7 +22866,7 @@ ix86_split_long_move (rtx operands[])
          Do an lea to the last part and use only one colliding move. */
       else if (collisions > 1)
         {
-          rtx base;
+          rtx base, addr, tls_base = NULL_RTX;
 
           collisions = 1;
 
@@ -22877,10 +22877,48 @@ ix86_split_long_move (rtx operands[])
           if (GET_MODE (base) != Pmode)
             base = gen_rtx_REG (Pmode, REGNO (base));
 
-          emit_insn (gen_rtx_SET (base, XEXP (part[1][0], 0)));
+          addr = XEXP (part[1][0], 0);
+          if (TARGET_TLS_DIRECT_SEG_REFS)
+            {
+              struct ix86_address parts;
+              int ok = ix86_decompose_address (addr, &parts);
+              gcc_assert (ok);
+              if (parts.seg == DEFAULT_TLS_SEG_REG)
+                {
+                  /* It is not valid to use %gs: or %fs: in
+                     lea though, so we need to remove it from the
+                     address used for lea and add it to each individual
+                     memory loads instead. */
+                  addr = copy_rtx (addr);
+                  subrtx_ptr_iterator::array_type array;
+                  FOR_EACH_SUBRTX_PTR (iter, array, &addr, NONCONST)
+                    {
+                      rtx *x = *iter;
+                      if (GET_CODE (*x) == PLUS)
+                        {
+                          for (i = 0; i < 2; i++)
+                            if (GET_CODE (XEXP (*x, i)) == UNSPEC
+                                && XINT (XEXP (*x, i), 1) == UNSPEC_TP)
+                              {
+                                tls_base = XEXP (*x, i);
+                                *x = XEXP (*x, 1 - i);
+                                break;
+                              }
+                          if (tls_base)
+                            break;
+                        }
+                    }
+                  gcc_assert (tls_base);
+                }
+            }
+          emit_insn (gen_rtx_SET (base, addr));
+          if (tls_base)
+            base = gen_rtx_PLUS (GET_MODE (base), base, tls_base);
           part[1][0] = replace_equiv_address (part[1][0], base);
           for (i = 1; i < nparts; i++)
             {
+              if (tls_base)
+                base = copy_rtx (base);
               tmp = plus_constant (Pmode, base, UNITS_PER_WORD * i);
               part[1][i] = replace_equiv_address (part[1][i], tmp);
             }
--- gcc/testsuite/gcc.dg/tls/pr66470.c.jj	2015-06-09 11:59:05.543954781 +0200
+++ gcc/testsuite/gcc.dg/tls/pr66470.c	2015-06-09 11:58:43.000000000 +0200
@@ -0,0 +1,29 @@
+/* PR target/66470 */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-require-effective-target tls } */
+
+extern __thread unsigned long long a[10];
+extern __thread struct S { int a, b; } b[10];
+
+unsigned long long
+foo (long x)
+{
+  return a[x];
+}
+
+struct S
+bar (long x)
+{
+  return b[x];
+}
+
+#ifdef __SIZEOF_INT128__
+extern __thread unsigned __int128 c[10];
+
+unsigned __int128
+baz (long x)
+{
+  return c[x];
+}
+#endif

	Jakub
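
For readers who want to see the address rewrite in isolation, here is a
minimal standalone sketch of the "replace the PLUS with its non-TP operand"
step, following only the toplevel PLUS and the PLUS chain through operand 0,
as described in the mail. Everything in it (struct node, the enum values,
strip_tp) is a hypothetical toy model and not GCC's rtx API; it also edits
the tree in place, whereas the patch calls copy_rtx first.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for RTL.  These names are hypothetical, not GCC's.  */
enum code { REG, CONST, TP, PLUS };

struct node
{
  enum code code;
  struct node *op[2];   /* Operands, meaningful only for PLUS.  */
};

/* Walk the chain of PLUS nodes reachable through operand 0, starting
   at the top level.  If one of them has a TP operand, splice that PLUS
   out by replacing it with its other operand and return the TP node.
   Return NULL if no TP operand is found on that chain.  */
static struct node *
strip_tp (struct node **x)
{
  assert (x != NULL && *x != NULL);
  while ((*x)->code == PLUS)
    {
      for (int i = 0; i < 2; i++)
        if ((*x)->op[i]->code == TP)
          {
            struct node *tp = (*x)->op[i];
            *x = (*x)->op[1 - i];   /* PLUS becomes its other operand.  */
            return tp;
          }
      x = &(*x)->op[0];             /* Descend into operand 0 only.  */
    }
  return NULL;
}
```

In this toy model, an address of the form (plus (plus tp reg) const) becomes
(plus reg const), and the returned tp node plays the role that tls_base has
in the patch: it is what gets added back to each individual memory load.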