"H.J. Lu" <hjl.to...@gmail.com> writes: > On 10/18/18, Richard Sandiford <richard.sandif...@arm.com> wrote: >> "H.J. Lu" <hjl.to...@gmail.com> writes: >>> On 10/18/18, Richard Sandiford <richard.sandif...@arm.com> wrote: >>>> "H.J. Lu" <hjl.to...@gmail.com> writes: >>>>> On 10/17/18, Marc Glisse <marc.gli...@inria.fr> wrote: >>>>>> On Wed, 17 Oct 2018, H.J. Lu wrote: >>>>>> >>>>>>> We may simplify >>>>>>> >>>>>>> (subreg (vec_merge (vec_duplicate X) (vector) (const_int 1)) 0) >>>>>>> >>>>>>> to X when mode of X is the same as of mode of subreg. >>>>>> >>>>>> Hello, >>>>>> >>>>>> we already have code to simplify vec_select(vec_merge): >>>>>> >>>>>> /* If we select elements in a vec_merge that all come from the >>>>>> same >>>>>> operand, select from that operand directly. */ >>>>>> >>>>>> It would make sense to me to make the subreg transform as similar to >>>>>> it >>>>>> as >>>>>> possible, in particular you don't need to special case vec_duplicate, >>>>>> the >>>>>> transformation would see that everything comes from the first vector, >>>>>> produce (subreg (vec_duplicate X) 0), and let another transformation >>>>>> optimize that. >>>> >>>> Sorry, didn't see this before the OK. >>>> >>>>> What do you mean by another transformation? If simplify_subreg doesn't >>>>> return X for >>>>> >>>>> (subreg (vec_merge (vec_duplicate X) >>>>> (vector) >>>>> (const_int ((1 << N) | M))) >>>>> (N * sizeof (X))) >>>>> >>>>> >>>>> no further transformation will be done. >>>> >>>> I think the point was that we should transform: >>>> >>>> (subreg (vec_merge X >>>> (vector) >>>> (const_int ((1 << N) | M))) >>>> (N * sizeof (X))) >>>> >>>> into: >>>> >>>> simplify_gen_subreg (outermode, X, innermode, byte) >>>> >>>> which should further simplify when X is a vec_duplicate. >>> >>> But sizeof (X) is the size of scalar of vec_dup. How do we >>> check the mask of vec_merge? >> >> Yeah, should be sizeof (outermode) (which was the same thing >> in the original pattern, but not here). 
>>
>> Richard
>>
>
> Like this
>
> diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
> index b0cf3bbb2a9..e12b5c0e165 100644
> --- a/gcc/simplify-rtx.c
> +++ b/gcc/simplify-rtx.c
> @@ -6601,20 +6601,21 @@ simplify_subreg (machine_mode outermode, rtx op,
>        return NULL_RTX;
>      }
>
> -  /* Return X for
> -       (subreg (vec_merge (vec_duplicate X)
> +  /* Simplify
> +       (subreg (vec_merge (X)
>                            (vector)
>                            (const_int ((1 << N) | M)))
> -               (N * sizeof (X)))
> +               (N * sizeof (outermode)))
> +     to
> +       (subreg ((X) (N * sizeof (outermode)))

Stray "(":

  (subreg (X) (N * sizeof (outermode)))

OK with that change if it passes testing.

Thanks,
Richard

>    */
>    unsigned int idx;
>    if (constant_multiple_p (byte, GET_MODE_SIZE (outermode), &idx)
>        && GET_CODE (op) == VEC_MERGE
> -      && GET_CODE (XEXP (op, 0)) == VEC_DUPLICATE
> -      && GET_MODE (XEXP (XEXP (op, 0), 0)) == outermode
> +      && GET_MODE_INNER (innermode) == outermode
>        && CONST_INT_P (XEXP (op, 2))
>        && (UINTVAL (XEXP (op, 2)) & (HOST_WIDE_INT_1U << idx)) != 0)
> -    return XEXP (XEXP (op, 0), 0);
> +    return simplify_gen_subreg (outermode, XEXP (op, 0), innermode, byte);
>
>    /* A SUBREG resulting from a zero extension may fold to zero if
>       it extracts higher bits that the ZERO_EXTEND's source bits.  */
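
For anyone following the condition in the patch, the two arithmetic checks (the byte offset is a whole multiple of the outer mode size, and the corresponding vec_merge mask bit is set) can be modelled in standalone C. This is only a rough sketch and not GCC code: the function name `subreg_reads_first_operand` and its plain-integer parameters are hypothetical stand-ins for `constant_multiple_p`, `GET_MODE_SIZE (outermode)` and `UINTVAL (XEXP (op, 2))`, and it ignores the mode checks and any endianness concerns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model (hypothetical helper, not part of GCC).

   BYTE is the subreg byte offset, OUTER_SIZE the size in bytes of the
   subreg's mode, and MASK the vec_merge selector, where a set bit I means
   that element I is taken from the first vec_merge operand.

   Return true if the subreg reads exactly one element and that element
   comes from the first operand.  On success, store the element index in
   *IDX_OUT when IDX_OUT is non-null.  */
static bool
subreg_reads_first_operand (unsigned int byte, unsigned int outer_size,
                            uint64_t mask, unsigned int *idx_out)
{
  assert (outer_size > 0);

  /* Counterpart of constant_multiple_p (byte, size, &idx).  */
  if (byte % outer_size != 0)
    return false;
  unsigned int idx = byte / outer_size;

  /* The toy mask only has 64 bits.  */
  if (idx >= 64)
    return false;

  /* Counterpart of (mask & (HOST_WIDE_INT_1U << idx)) != 0.  */
  if ((mask & ((uint64_t) 1 << idx)) == 0)
    return false;

  if (idx_out != NULL)
    *idx_out = idx;
  return true;
}
```

With 4-byte elements and a mask of 0x5, offsets 0 and 8 pass the check (elements 0 and 2), while offset 4 does not because bit 1 of the mask is clear. In the real patch a passing check is what allows the subreg to be re-expressed on the first vec_merge operand via simplify_gen_subreg.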