On Tue, Jun 09, 2015 at 08:09:28PM +0200, Uros Bizjak wrote:
> Please find attached a patch that takes your idea slightly further. We
> find the (perhaps zero-extended) UNSPEC_TP, and copy it for further use. At
> its place, we simply slap const0_rtx. We know that address to

Is that safe?  I mean, the address, even if offsettable, can already contain
an immediate (e.g. the offsettable_memref_p predicate seems to just check that
you can plus_constant some small integer and still be recognized), and if you
turn the %gs: into a const0_rtx, the next address decomposition would fail.
And when you already have the PLUS which has UNSPEC_TP as one of its
arguments, replacing that PLUS with the other argument is IMHO very easy.
Perhaps you are right that there is no need to copy_rtx, supposedly
the rtx shouldn't be shared with anything and thus can be modified in place.
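
To make the alternative above concrete, here is a minimal sketch in plain C.
It uses a toy tree type rather than GCC's rtx, so toy_rtx, toy_code and
toy_strip_tp are all hypothetical names and none of this is the actual i386.c
code.  It only shows the shape of the idea: when a PLUS has the thread-pointer
term as one of its operands, that PLUS is replaced by its other operand, so no
zero constant is left behind in the address.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for RTL expressions; this is NOT GCC's real rtx.  */
enum toy_code { TOY_REG, TOY_CONST, TOY_TP, TOY_PLUS };

struct toy_rtx
{
  enum toy_code code;
  long value;              /* Register number or constant value.  */
  struct toy_rtx *op[2];   /* Operands; only used by TOY_PLUS.  */
};

/* Walk down the first operand of nested PLUS nodes starting at *LOC.
   If a PLUS has a TOY_TP operand, replace that PLUS (in place) with
   its other operand and return the removed TOY_TP node.  Return NULL
   if no TOY_TP term was found.  */
struct toy_rtx *
toy_strip_tp (struct toy_rtx **loc)
{
  assert (loc != NULL && *loc != NULL);

  while ((*loc)->code == TOY_PLUS)
    {
      for (int i = 0; i < 2; i++)
        if ((*loc)->op[i]->code == TOY_TP)
          {
            struct toy_rtx *tp = (*loc)->op[i];
            /* Splice out the PLUS: keep only the other operand.  */
            *loc = (*loc)->op[1 - i];
            return tp;
          }

      /* Like the quoted patch, only descend into operand 0.  */
      loc = &(*loc)->op[0];
    }

  return NULL;
}
```

For (plus (plus tp reg) 8) this leaves (plus reg 8) and returns the tp node,
which the caller could then add back to each individual memory reference.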

If -mx32 is a non-issue here, then perhaps my initial patch is good enough?

> Index: config/i386/i386.c
> ===================================================================
> --- config/i386/i386.c        (revision 224292)
> +++ config/i386/i386.c        (working copy)
> @@ -22858,7 +22858,7 @@ ix86_split_long_move (rtx operands[])
>        Do an lea to the last part and use only one colliding move.  */
>        else if (collisions > 1)
>       {
> -       rtx base;
> +       rtx base, addr, tls_base = NULL_RTX;
>  
>         collisions = 1;
>  
> @@ -22869,10 +22869,52 @@ ix86_split_long_move (rtx operands[])
>         if (GET_MODE (base) != Pmode)
>           base = gen_rtx_REG (Pmode, REGNO (base));
>  
> -       emit_insn (gen_rtx_SET (base, XEXP (part[1][0], 0)));
> +       addr = XEXP (part[1][0], 0);
> +       if (TARGET_TLS_DIRECT_SEG_REFS)
> +         {
> +           struct ix86_address parts;
> +           int ok = ix86_decompose_address (addr, &parts);
> +           gcc_assert (ok);
> +           if (parts.seg != SEG_DEFAULT)
> +             {
> +               /* It is not valid to use %gs: or %fs: in
> +                  lea though, so we need to remove it from the
> +                  address used for lea and add it to each individual
> +                  memory loads instead.  */
> +               rtx *x = &addr;
> +                  while (GET_CODE (*x) == PLUS)
> +                    {
> +                      for (i = 0; i < 2; i++)
> +                     {
> +                       rtx op = XEXP (*x, i);
> +                       if ((GET_CODE (op) == UNSPEC
> +                          && XINT (op, 1) == UNSPEC_TP)
> +                         || (GET_CODE (op) == ZERO_EXTEND
> +                             && GET_CODE (XEXP (op, 0)) == UNSPEC
> +                             && (XINT (XEXP (op, 0), 1)
> +                                 == UNSPEC_TP)))
> +                       {
> +                         tls_base = XEXP (*x, i);
> +                         XEXP (*x, i) = const0_rtx;
> +                         break;
> +                       }
> +                     }
> +
> +                   if (tls_base)
> +                     break;
> +                   x = &XEXP (*x, 0);
> +                 }
> +               gcc_assert (tls_base);
> +             }
> +         }
> +       emit_insn (gen_rtx_SET (base, addr));
> +       if (tls_base)
> +         base = gen_rtx_PLUS (GET_MODE (base), base, tls_base);
>         part[1][0] = replace_equiv_address (part[1][0], base);
>         for (i = 1; i < nparts; i++)
>           {
> +           if (tls_base)
> +             base = copy_rtx (base);
>             tmp = plus_constant (Pmode, base, UNITS_PER_WORD * i);
>             part[1][i] = replace_equiv_address (part[1][i], tmp);
>           }


        Jakub
