Prathamesh Kulkarni <prathamesh.kulka...@linaro.org> writes:
> On Tue, 2 Jul 2019 at 18:22, Richard Sandiford
> <richard.sandif...@arm.com> wrote:
>>
>> Prathamesh Kulkarni <prathamesh.kulka...@linaro.org> writes:
>> > On Tue, 2 Jul 2019 at 16:59, Richard Sandiford
>> > <richard.sandif...@arm.com> wrote:
>> >>
>> >> Thanks for fixing this.
>> >>
>> >> Prathamesh Kulkarni <prathamesh.kulka...@linaro.org> writes:
>> >> > diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
>> >> > index 89a46a933fa..79bd0cfbd28 100644
>> >> > --- a/gcc/simplify-rtx.c
>> >> > +++ b/gcc/simplify-rtx.c
>> >> > @@ -6697,6 +6697,19 @@ simplify_subreg (machine_mode outermode, rtx op,
>> >> >       }
>> >> >      }
>> >> >
>> >> > +  /* If op is a vector comparison operator, rewrite it in a new mode.
>> >> > +     This probably won't match, but may allow further simplifications.
>> >> > +     Also check if number of elements and size of each element
>> >> > +     match for outermode and innermode.  */
>> >> > +
>> >>
>> >> Excess blank line after the comment.  IMO the second part of the comment
>> >> reads too much like an afterthought.  How about:
>> >>
>> >>   /* If OP is a vector comparison and the subreg is not changing the
>> >>      number of elements or the size of the elements, change the result
>> >>      of the comparison to the new mode.  */
>> >>
>> >> > +  if (COMPARISON_P (op)
>> >> > +      && VECTOR_MODE_P (outermode)
>> >> > +      && VECTOR_MODE_P (innermode)
>> >> > +      && (known_eq (GET_MODE_NUNITS (outermode), GET_MODE_NUNITS 
>> >> > (innermode)))
>> >> > +      && (known_eq (GET_MODE_UNIT_SIZE (outermode),
>> >> > +                 GET_MODE_UNIT_SIZE (innermode))))
>> >>
>> >> Redundant brackets around the known_eq calls.
>> >>
>> >> > +    return gen_rtx_fmt_ee (GET_CODE (op), outermode, XEXP (op, 0), 
>> >> > XEXP (op, 1));
>> >>
>> >> This should use simplify_gen_relational, so that we try to simplify
>> >> the new expression.
>> > Does the attached version look OK ?
>>
>> OK with a suitable changelog (remember to post those!) if it passes testing.
> The change to simplify_subreg regressed avx2-pr54700-1.C -;)
>
> For following test-case:
> __attribute__((noipa)) __v8sf
> f7 (__v8si a, __v8sf b, __v8sf c)
> {
>   return a < 0 ? b : c;
> }
>
> Before patch, combine shows:
> Trying 8, 9 -> 10:
>     8: r87:V8SI=const_vector
>     9: r89:V8SI=r87:V8SI>r90:V8SI
>       REG_DEAD r90:V8SI
>       REG_DEAD r87:V8SI
>    10: r86:V8SF=unspec[r92:V8SF,r91:V8SF,r89:V8SI#0] 104
>       REG_DEAD r92:V8SF
>       REG_DEAD r89:V8SI
>       REG_DEAD r91:V8SF
> Successfully matched this instruction:
> (set (reg:V8SF 86)
>     (unspec:V8SF [
>             (reg:V8SF 92)
>             (reg:V8SF 91)
>             (subreg:V8SF (lt:V8SI (reg:V8SI 90)
>                     (const_vector:V8SI [
>                             (const_int 0 [0]) repeated x8
>                         ])) 0)
>         ] UNSPEC_BLENDV))
> allowing combination of insns 8, 9 and 10
>
> After applying patch, combine shows:
>
> Trying 8, 9 -> 10:
>     8: r87:V8SI=const_vector
>     9: r89:V8SI=r87:V8SI>r90:V8SI
>       REG_DEAD r90:V8SI
>       REG_DEAD r87:V8SI
>    10: r86:V8SF=unspec[r92:V8SF,r91:V8SF,r89:V8SI#0] 104
>       REG_DEAD r92:V8SF
>       REG_DEAD r89:V8SI
>       REG_DEAD r91:V8SF
> Failed to match this instruction:
> (set (reg:V8SF 86)
>     (unspec:V8SF [
>             (reg:V8SF 92)
>             (reg:V8SF 91)
>             (lt:V8SF (reg:V8SI 90)
>                 (const_vector:V8SI [
>                         (const_int 0 [0]) repeated x8
>                     ]))
>         ] UNSPEC_BLENDV))

Bah.  If the port isn't self-consistent about whether a subreg should
be used, it's tempting to say that a subreg should be used and fix the
places that don't.  At least that way we'd avoid the abomination -
ABOMINATION! - of using NaNs to represent truth.

But I agree it looks like this is the only pattern affected.

> Adjusting UNSPEC_BLENDV pattern to match the simplified subreg seems to work.
> Does it look OK ?
>
> Testing the attached patch in progress.
> (A quick comparison for i386.exp now shows no difference before/after patch).
>
> Thanks,
> Prathamesh
>>
>> Thanks,
>> Richard
>
> 2019-07-03  Prathamesh Kulkarni  <prathamesh.kulka...@linaro.org>
>
>       * fwprop.c (reg_single_def_p): New function.
>       (propagate_rtx_1): Add unconditional else inside RTX_EXTRA case.
>       (forward_propagate_into): New parameter reg_prop_only
>       with default value false.
>       Propagate def's src into loop only if SET_SRC and SET_DEST
>       of def_set have single definitions.
>       Likewise if reg_prop_only is set to true.
>       (fwprop): New param fwprop_addr_p.
>       Integrate fwprop_addr into fwprop.
>       (fwprop_addr): Remove.
>       (pass_rtl_fwprop_addr::execute): Call fwprop with arg set
>       to true.
>       (pass_rtl_fwprop::execute): Call fwprop with arg set to false.
>       * simplify-rtx.c (simplify_subreg): Add case for vector comparison.
>       * i386/config/sse.md (UNSPEC_BLENDV): Adjust pattern.

typo: config/i386/sse.md

>
> testsuite/
>       * gfortran.dg/pr88833.f90: New test. 
>
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index d7d542524fb..5384fe218f9 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -16623,10 +16623,9 @@
>       (unspec:VF_128_256
>         [(match_operand:VF_128_256 1 "register_operand" "0,0,x")
>          (match_operand:VF_128_256 2 "vector_operand" "YrBm,*xBm,xm")
> -        (subreg:VF_128_256
> -          (lt:<sseintvecmode>
> +          (lt:VF_128_256
>              (match_operand:<sseintvecmode> 3 "register_operand" "Yz,Yz,x")
> -            (match_operand:<sseintvecmode> 4 "const0_operand" "C,C,C")) 0)]
> +            (match_operand:<sseintvecmode> 4 "const0_operand" "C,C,C"))]
>         UNSPEC_BLENDV))]
>    "TARGET_SSE4_1"
>    "#"

The (lt:...) and its operands should now be indented by two columns fewer
than previously.

OK with that change, thanks.

Richard

Reply via email to