Prathamesh Kulkarni <prathamesh.kulka...@linaro.org> writes: > On Tue, 2 Jul 2019 at 18:22, Richard Sandiford > <richard.sandif...@arm.com> wrote: >> >> Prathamesh Kulkarni <prathamesh.kulka...@linaro.org> writes: >> > On Tue, 2 Jul 2019 at 16:59, Richard Sandiford >> > <richard.sandif...@arm.com> wrote: >> >> >> >> Thanks for fixing this. >> >> >> >> Prathamesh Kulkarni <prathamesh.kulka...@linaro.org> writes: >> >> > diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c >> >> > index 89a46a933fa..79bd0cfbd28 100644 >> >> > --- a/gcc/simplify-rtx.c >> >> > +++ b/gcc/simplify-rtx.c >> >> > @@ -6697,6 +6697,19 @@ simplify_subreg (machine_mode outermode, rtx op, >> >> > } >> >> > } >> >> > >> >> > + /* If op is a vector comparison operator, rewrite it in a new mode. >> >> > + This probably won't match, but may allow further simplifications. >> >> > + Also check if number of elements and size of each element >> >> > + match for outermode and innermode. */ >> >> > + >> >> >> >> Excess blank line after the comment. IMO the second part of the comment >> >> reads too much like an afterthought. How about: >> >> >> >> /* If OP is a vector comparison and the subreg is not changing the >> >> number of elements or the size of the elements, change the result >> >> of the comparison to the new mode. */ >> >> >> >> > + if (COMPARISON_P (op) >> >> > + && VECTOR_MODE_P (outermode) >> >> > + && VECTOR_MODE_P (innermode) >> >> > + && (known_eq (GET_MODE_NUNITS (outermode), GET_MODE_NUNITS >> >> > (innermode))) >> >> > + && (known_eq (GET_MODE_UNIT_SIZE (outermode), >> >> > + GET_MODE_UNIT_SIZE (innermode)))) >> >> >> >> Redundant brackets around the known_eq calls. >> >> >> >> > + return gen_rtx_fmt_ee (GET_CODE (op), outermode, XEXP (op, 0), >> >> > XEXP (op, 1)); >> >> >> >> This should use simplify_gen_relational, so that we try to simplify >> >> the new expression. >> > Does the attached version look OK ? >> >> OK with a suitable changelog (remember to post those!) if it passes testing. > The change to simplify_subreg regressed avx2-pr54700-1.C -;) > > For following test-case: > __attribute__((noipa)) __v8sf > f7 (__v8si a, __v8sf b, __v8sf c) > { > return a < 0 ? b : c; > } > > Before patch, combine shows: > Trying 8, 9 -> 10: > 8: r87:V8SI=const_vector > 9: r89:V8SI=r87:V8SI>r90:V8SI > REG_DEAD r90:V8SI > REG_DEAD r87:V8SI > 10: r86:V8SF=unspec[r92:V8SF,r91:V8SF,r89:V8SI#0] 104 > REG_DEAD r92:V8SF > REG_DEAD r89:V8SI > REG_DEAD r91:V8SF > Successfully matched this instruction: > (set (reg:V8SF 86) > (unspec:V8SF [ > (reg:V8SF 92) > (reg:V8SF 91) > (subreg:V8SF (lt:V8SI (reg:V8SI 90) > (const_vector:V8SI [ > (const_int 0 [0]) repeated x8 > ])) 0) > ] UNSPEC_BLENDV)) > allowing combination of insns 8, 9 and 10 > > After applying patch, combine shows: > > Trying 8, 9 -> 10: > 8: r87:V8SI=const_vector > 9: r89:V8SI=r87:V8SI>r90:V8SI > REG_DEAD r90:V8SI > REG_DEAD r87:V8SI > 10: r86:V8SF=unspec[r92:V8SF,r91:V8SF,r89:V8SI#0] 104 > REG_DEAD r92:V8SF > REG_DEAD r89:V8SI > REG_DEAD r91:V8SF > Failed to match this instruction: > (set (reg:V8SF 86) > (unspec:V8SF [ > (reg:V8SF 92) > (reg:V8SF 91) > (lt:V8SF (reg:V8SI 90) > (const_vector:V8SI [ > (const_int 0 [0]) repeated x8 > ])) > ] UNSPEC_BLENDV))
Bah. If the port isn't self-consistent about whether a subreg should be used, it's tempting to say that a subreg should be used and fix the places that don't. At least that way we'd avoid the abomination - ABOMINATION! - of using NaNs to represent truth. But I agree it looks like this is the only pattern affected. > Adjusting UNSPEC_BLENDV pattern to match the simplified subreg seems to work. > Does it look OK ? > > Testing the attached patch in progress. > (A quick comparison for i386.exp now shows no difference before/after patch). > > Thanks, > Prathamesh >> >> Thanks, >> Richard > > 2019-07-03 Prathamesh Kulkarni <prathamesh.kulka...@linaro.org> > > * fwprop.c (reg_single_def_p): New function. > (propagate_rtx_1): Add unconditional else inside RTX_EXTRA case. > (forward_propagate_into): New parameter reg_prop_only > with default value false. > Propagate def's src into loop only if SET_SRC and SET_DEST > of def_set have single definitions. > Likewise if reg_prop_only is set to true. > (fwprop): New param fwprop_addr_p. > Integrate fwprop_addr into fwprop. > (fwprop_addr): Remove. > (pass_rtl_fwprop_addr::execute): Call fwprop with arg set > to true. > (pass_rtl_fwprop::execute): Call fwprop with arg set to false. > * simplify-rtx.c (simplify_subreg): Add case for vector comparison. > * i386/config/sse.md (UNSPEC_BLENDV): Adjust pattern. typo: config/i386/sse.md > > testsuite/ > * gfortran.dg/pr88833.f90: New test. > > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md > index d7d542524fb..5384fe218f9 100644 > --- a/gcc/config/i386/sse.md > +++ b/gcc/config/i386/sse.md > @@ -16623,10 +16623,9 @@ > (unspec:VF_128_256 > [(match_operand:VF_128_256 1 "register_operand" "0,0,x") > (match_operand:VF_128_256 2 "vector_operand" "YrBm,*xBm,xm") > - (subreg:VF_128_256 > - (lt:<sseintvecmode> > + (lt:VF_128_256 > (match_operand:<sseintvecmode> 3 "register_operand" "Yz,Yz,x") > - (match_operand:<sseintvecmode> 4 "const0_operand" "C,C,C")) 0)] > + (match_operand:<sseintvecmode> 4 "const0_operand" "C,C,C"))] > UNSPEC_BLENDV))] > "TARGET_SSE4_1" > "#" The (lt:...) and its operands should now be indented by two columns fewer than previously. OK with that change, thanks. Richard