On Tue, Jun 28, 2016 at 12:26 PM, Jakub Jelinek <ja...@redhat.com> wrote:
> On Tue, Jun 28, 2016 at 12:09:51PM +0200, Uros Bizjak wrote:
>> > So, this patch instead changes ix86_expand_vector_move, so that
>> > for SUBREGs it forces the SUBREG_REG into memory (or register if
>> > that fails, though I don't have a testcase for when that would happen),
>> > and just re-creates a SUBREG on the forced MEM (for whole size
>> > SUBREGs that is in the end just using different mode for the MEM).
>>
>> There can be issue with paradoxical subregs, since subreg mode can be
>> wider than original mode.
>
> For paradoxical subregs, the extra bits are undefined, whether it is SUBREG
> of a constant, REG or MEM, isn't that the case?  Though I guess with MEM
> there is a risk that reading the undefined bits from mem will be beyond end
> of the data segment.  It would really surprise me if something created a
> paradoxical SUBREG of a CONSTANT_P.

Yes, this would be really ... interesting.

> Anyway, I can just always force_reg in that case, like:

No, no need for being too paranoid. An assert would be OK, at your discretion.

The original patch w/ or w/o the assert is OK then.

Uros.

>
> 2016-06-28  Jakub Jelinek  <ja...@redhat.com>
>
>         PR middle-end/71626
>         * config/i386/i386.c (ix86_expand_vector_move): For SUBREG of
>         a constant, force its SUBREG_REG into memory or register instead
>         of whole op1.
>
>         * gcc.c-torture/execute/pr71626-1.c: New test.
>         * gcc.c-torture/execute/pr71626-2.c: New test.
>
> --- gcc/config/i386/i386.c.jj   2016-06-27 14:50:51.000000000 +0200
> +++ gcc/config/i386/i386.c      2016-06-28 10:51:18.444624190 +0200
> @@ -19610,12 +19610,30 @@ ix86_expand_vector_move (machine_mode mo
>       of the register, once we have that information we may be able
>       to handle some of them more efficiently.  */
>    if (can_create_pseudo_p ()
> -      && register_operand (op0, mode)
>        && (CONSTANT_P (op1)
>           || (SUBREG_P (op1)
>               && CONSTANT_P (SUBREG_REG (op1))))
> -      && !standard_sse_constant_p (op1, mode))
> -    op1 = validize_mem (force_const_mem (mode, op1));
> +      && ((register_operand (op0, mode)
> +          && !standard_sse_constant_p (op1, mode))
> +         /* ix86_expand_vector_move_misalign() does not like constants.  */
> +         || (SSE_REG_MODE_P (mode)
> +             && MEM_P (op0)
> +             && MEM_ALIGN (op0) < align)))
> +    {
> +      if (SUBREG_P (op1))
> +       {
> +         machine_mode imode = GET_MODE (SUBREG_REG (op1));
> +         rtx r = (paradoxical_subreg_p (op1)
> +                  ? NULL_RTX : force_const_mem (imode, SUBREG_REG (op1)));
> +         if (r)
> +           r = validize_mem (r);
> +         else
> +           r = force_reg (imode, SUBREG_REG (op1));
> +         op1 = simplify_gen_subreg (mode, r, imode, SUBREG_BYTE (op1));
> +       }
> +      else
> +       op1 = validize_mem (force_const_mem (mode, op1));
> +    }
>
>    /* We need to check memory alignment for SSE mode since attribute
>       can make operands unaligned.  */
> @@ -19626,13 +19643,8 @@ ix86_expand_vector_move (machine_mode mo
>      {
>        rtx tmp[2];
>
> -      /* ix86_expand_vector_move_misalign() does not like constants ... */
> -      if (CONSTANT_P (op1)
> -         || (SUBREG_P (op1)
> -             && CONSTANT_P (SUBREG_REG (op1))))
> -       op1 = validize_mem (force_const_mem (mode, op1));
> -
> -      /* ... nor both arguments in memory.  */
> +      /* ix86_expand_vector_move_misalign() does not like both
> +        arguments in memory.  */
>        if (!register_operand (op0, mode)
>           && !register_operand (op1, mode))
>         op1 = force_reg (mode, op1);
> --- gcc/testsuite/gcc.c-torture/execute/pr71626-1.c.jj  2016-06-28 
> 10:40:12.928186649 +0200
> +++ gcc/testsuite/gcc.c-torture/execute/pr71626-1.c     2016-06-28 
> 10:40:12.928186649 +0200
> @@ -0,0 +1,19 @@
> +/* PR middle-end/71626 */
> +
> +typedef __INTPTR_TYPE__ V __attribute__((__vector_size__(sizeof 
> (__INTPTR_TYPE__))));
> +
> +__attribute__((noinline, noclone)) V
> +foo ()
> +{
> +  V v = { (__INTPTR_TYPE__) foo };
> +  return v;
> +}
> +
> +int
> +main ()
> +{
> +  V v = foo ();
> +  if (v[0] != (__INTPTR_TYPE__) foo)
> +    __builtin_abort ();
> +  return 0;
> +}
> --- gcc/testsuite/gcc.c-torture/execute/pr71626-2.c.jj  2016-06-28 
> 10:40:12.928186649 +0200
> +++ gcc/testsuite/gcc.c-torture/execute/pr71626-2.c     2016-06-28 
> 10:40:12.928186649 +0200
> @@ -0,0 +1,4 @@
> +/* PR middle-end/71626 */
> +/* { dg-additional-options "-fpic" { target fpic } } */
> +
> +#include "pr71626-1.c"
>
>
>         Jakub

Reply via email to