Hongtao Liu <crazy...@gmail.com> writes: > On Thu, Oct 15, 2020 at 8:38 PM Richard Sandiford > <richard.sandif...@arm.com> wrote: >> >> Hongtao Liu via Gcc-patches <gcc-patches@gcc.gnu.org> writes: >> > + /* Simplify vec_select of a subreg of X to just a vec_select of X >> > + when X has same component mode as vec_select. */ >> > + int l2; >> > + if (GET_CODE (trueop0) == SUBREG >> > + && GET_MODE_INNER (mode) >> > + == GET_MODE_INNER (GET_MODE (XEXP (trueop0, 0))) >> >> Better to use SUBREG_REG here and below. >> > > Yes and changed. > >> > + && (GET_MODE_NUNITS (GET_MODE (trueop0))).is_constant (&l0) >> > + && (GET_MODE_NUNITS (mode)).is_constant (&l1) >> > + && (GET_MODE_NUNITS (GET_MODE (XEXP (trueop0, 0)))) >> > + .is_constant (&l2) >> > + && known_le (l1, l2)) >> > + { >> > + unsigned HOST_WIDE_INT subreg_offset = 0; >> > + gcc_assert (known_eq (XVECLEN (trueop1, 0), l1)); >> > + gcc_assert (can_div_trunc_p (exact_div (subreg_lsb (trueop0), >> > BITS_PER_UNIT), >> > + GET_MODE_SIZE (GET_MODE_INNER >> > (mode)), >> > + &subreg_offset)); >> >> can_div_trunc_p discards the remainder, whereas it looks like here >> you want an exact multiple. >> >> I don't think it's absolutely guaranteed that the “if” condition makes >> the division by GET_MODE_SIZE exact. E.g. in principle you could have >> a subreg of a vector of TIs in which the subreg offset is misaligned by >> a DI offset. >> >> I'm not sure the subreg_lsb conversion is correct though. On big-endian >> targets, lane numbering follows memory layout, just like subreg byte >> offsets do. So ISTM that using SUBREG_BYTE (as per the earlier patch) >> was correct. >> >> In summary, I think the "if” condition should include something like: >> >> constant_mulitple_p (SUBREG_BYTE (trueop0), >> GET_MODE_UNIT_BITSIZE (mode), >> &subreg_offset) >> > > Changed. > >> Thanks, >> Richard > > > Update patch. > > -- > BR, > Hongtao > > From 8d154067963e453c337e6dc2c4f3f19bf0d6e11b Mon Sep 17 00:00:00 2001 > From: liuhongt <hongtao....@intel.com> > Date: Tue, 13 Oct 2020 15:35:29 +0800 > Subject: [PATCH] Simplify vec_select of a subreg of X to just a vec_select of > X. > > gcc/ChangeLog > PR rtl-optimization/97249 > * simplify-rtx.c (simplify_binary_operation_1): Simplify > vec_select of a subreg of X to a vec_select of X. > > gcc/testsuite/ChangeLog > > * gcc.target/i386/pr97249-1.c: New test. > --- > gcc/simplify-rtx.c | 44 +++++++++++++++++++++++ > gcc/testsuite/gcc.target/i386/pr97249-1.c | 30 ++++++++++++++++ > 2 files changed, 74 insertions(+) > create mode 100644 gcc/testsuite/gcc.target/i386/pr97249-1.c > > diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c > index 869f0d11b2e..b1009837b2b 100644 > --- a/gcc/simplify-rtx.c > +++ b/gcc/simplify-rtx.c > @@ -4170,6 +4170,50 @@ simplify_binary_operation_1 (enum rtx_code code, > machine_mode mode, > return subop1; > } > } > + > + /* Simplify vec_select of a subreg of X to just a vec_select of X > + when X has same component mode as vec_select. */ > + int l2; > + unsigned HOST_WIDE_INT subreg_offset = 0; > + if (GET_CODE (trueop0) == SUBREG > + && GET_MODE_INNER (mode) > + == GET_MODE_INNER (GET_MODE (SUBREG_REG (trueop0))) > + && (GET_MODE_NUNITS (GET_MODE (trueop0))).is_constant (&l0)
Nothing really relies on this last line, and nothing uses l0, so better to drop it. > + && (GET_MODE_NUNITS (mode)).is_constant (&l1) > + && (GET_MODE_NUNITS (GET_MODE (SUBREG_REG (trueop0)))) > + .is_constant (&l2) > + && known_le (l1, l2) I'm not sure the last two &&s are really the important condition. I think we should drop them for the suggestion below. > + && constant_multiple_p (SUBREG_BYTE (trueop0), > + GET_MODE_UNIT_BITSIZE (mode), > + &subreg_offset)) > + { > + Excess blank line. > + gcc_assert (known_eq (XVECLEN (trueop1, 0), l1)); This can just use ==. > + bool success = true; > + for (int i = 0; i != l1; i++) > + { > + rtx idx = XVECEXP (trueop1, 0, i); Excess space. > + if (!CONST_INT_P (idx)) Here I think we should check: || maybe_ge (UINTVAL (idx) + subreg_offset, nunits)) where: poly_uint64 nunits = GET_MODE_NUNITS (GET_MODE (SUBREG_REG (trueop0)))). This makes sure that all indices are in range. In particular, it's valid for the SUBREG_REG to be narrower than mode, for appropriate vec_select indices Thanks, Richard