On 10/18/18, Richard Sandiford <richard.sandif...@arm.com> wrote:
> "H.J. Lu" <hjl.to...@gmail.com> writes:
>> On 10/18/18, Richard Sandiford <richard.sandif...@arm.com> wrote:
>>> "H.J. Lu" <hjl.to...@gmail.com> writes:
>>>> On 10/18/18, Richard Sandiford <richard.sandif...@arm.com> wrote:
>>>>> "H.J. Lu" <hjl.to...@gmail.com> writes:
>>>>>> On 10/17/18, Marc Glisse <marc.gli...@inria.fr> wrote:
>>>>>>> On Wed, 17 Oct 2018, H.J. Lu wrote:
>>>>>>>
>>>>>>>> We may simplify
>>>>>>>>
>>>>>>>>   (subreg (vec_merge (vec_duplicate X) (vector) (const_int 1)) 0)
>>>>>>>>
>>>>>>>> to X when mode of X is the same as of mode of subreg.
>>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> we already have code to simplify vec_select(vec_merge):
>>>>>>>
>>>>>>>   /* If we select elements in a vec_merge that all come from the
>>>>>>>      same operand, select from that operand directly.  */
>>>>>>>
>>>>>>> It would make sense to me to make the subreg transform as similar
>>>>>>> to it as possible, in particular you don't need to special case
>>>>>>> vec_duplicate, the transformation would see that everything comes
>>>>>>> from the first vector, produce (subreg (vec_duplicate X) 0), and
>>>>>>> let another transformation optimize that.
>>>>>
>>>>> Sorry, didn't see this before the OK.
>>>>>
>>>>>> What do you mean by another transformation?  If simplify_subreg
>>>>>> doesn't return X for
>>>>>>
>>>>>>   (subreg (vec_merge (vec_duplicate X)
>>>>>>                      (vector)
>>>>>>                      (const_int ((1 << N) | M)))
>>>>>>           (N * sizeof (X)))
>>>>>>
>>>>>> no further transformation will be done.
>>>>>
>>>>> I think the point was that we should transform:
>>>>>
>>>>>   (subreg (vec_merge X
>>>>>                      (vector)
>>>>>                      (const_int ((1 << N) | M)))
>>>>>           (N * sizeof (X)))
>>>>>
>>>>> into:
>>>>>
>>>>>   simplify_gen_subreg (outermode, X, innermode, byte)
>>>>>
>>>>> which should further simplify when X is a vec_duplicate.
>>>>
>>>> But sizeof (X) is the size of scalar of vec_dup.  How do we
>>>> check the mask of vec_merge?
>>>
>>> Yeah, should be sizeof (outermode) (which was the same thing
>>> in the original pattern, but not here).
>>>
>>> Richard
>>>
>>
>> Like this
>>
>> diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
>> index b0cf3bbb2a9..e12b5c0e165 100644
>> --- a/gcc/simplify-rtx.c
>> +++ b/gcc/simplify-rtx.c
>> @@ -6601,20 +6601,21 @@ simplify_subreg (machine_mode outermode, rtx op,
>>        return NULL_RTX;
>>      }
>>
>> -  /* Return X for
>> -	(subreg (vec_merge (vec_duplicate X)
>> +  /* Simplify
>> +	(subreg (vec_merge (X)
>>  			   (vector)
>>  			   (const_int ((1 << N) | M)))
>> -		(N * sizeof (X)))
>> +		(N * sizeof (outermode)))
>> +     to
>> +	(subreg ((X) (N * sizeof (outermode)))
>
> Stray "(": (subreg (X) (N * sizeof (outermode)))
>
> OK with that change if it passes testing.

The self-test fails with a 32-bit compiler:

expected: (reg:QI 342)
actual: (subreg:QI (vec_merge:V128QI (vec_duplicate:V128QI (reg:QI 342))
        (reg:V128QI 343)
        (const_int 65 [0x41])) 64)

since

      && (UINTVAL (XEXP (op, 2)) & (HOST_WIDE_INT_1U << idx)) != 0)

works only up to vectors with 64 elements for 32-bit compilers.  Should we
limit the self-test to vectors with 64 elements?

--
H.J.
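
For illustration, the 64-element limit can be reproduced outside GCC. This is a minimal standalone sketch, not the GCC code: mask_selects_first_operand, subreg_element_index and MASK_BITS are hypothetical names, and uint64_t stands in for HOST_WIDE_INT under the assumption that it is 64 bits wide. With the values from the failing self-test (byte offset 64, one-byte QImode elements, mask 0x41) the element index is 64, which has no corresponding bit in a 64-bit mask, so the index has to be range-checked before the shift.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the vec_merge mask is held in a 64-bit integer, so it can
   describe at most 64 vector elements.  */
#define MASK_BITS 64

/* Return true if bit IDX of MASK is set, i.e. element IDX of the
   vec_merge would come from its first operand.  An index that does not
   fit in the mask is rejected instead of shifted, because shifting a
   64-bit value by 64 or more is undefined behavior in C.  */
static bool
mask_selects_first_operand (uint64_t mask, unsigned int idx)
{
  if (idx >= MASK_BITS)
    return false;
  return (mask & (UINT64_C (1) << idx)) != 0;
}

/* Element index addressed by a subreg at byte offset BYTE when each
   element is ELT_SIZE bytes wide.  */
static unsigned int
subreg_element_index (unsigned int byte, unsigned int elt_size)
{
  assert (elt_size != 0);
  return byte / elt_size;
}
```

Here subreg_element_index (64, 1) is 64 and mask_selects_first_operand (0x41, 64) is false, while indices 0 and 6 (the two bits set in 0x41) are reported as coming from the first operand.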