On Fri, 27 Nov 2020 at 15:13, Andre Vieira (lists) <andre.simoesdiasvie...@arm.com> wrote: > > Hi Christophe, > > On 26/11/2020 15:31, Christophe Lyon wrote: > > Hi Andre, > > > > Thanks for the quick feedback. > > > > On Wed, 25 Nov 2020 at 18:17, Andre Simoes Dias Vieira > > <andre.simoesdiasvie...@arm.com> wrote: > >> Hi Christophe, > >> > >> Thanks for these! See some inline comments. > >> > >> On 25/11/2020 13:54, Christophe Lyon via Gcc-patches wrote: > >>> This patch enables MVE vandq instructions for auto-vectorization. MVE > >>> vandq insns in mve.md are modified to use 'and' instead of unspec > >>> expression to support and<mode>3. The and<mode>3 expander is added to > >>> vec-common.md > >>> > >>> 2020-11-12 Christophe Lyon <christophe.l...@linaro.org> > >>> > >>> gcc/ > >>> * gcc/config/arm/iterators.md (supf): Remove VANDQ_S and VANDQ_U. > >>> (VANQ): Remove. > >>> * config/arm/mve.md (mve_vandq_u<mode>): New entry for vand > >>> instruction using expression and. > >>> (mve_vandq_s<mode>): New expander. > >>> * config/arm/neon.md (and<mode>3): Renamed into and<mode>3_neon. > >>> * config/arm/unspecs.md (VANDQ_S, VANDQ_U): Remove. > >>> * config/arm/vec-common.md (and<mode>3): New expander. > >>> > >>> gcc/testsuite/ > >>> * gcc.target/arm/simd/mve-vand.c: New test. > >>> --- > >>> gcc/config/arm/iterators.md | 4 +--- > >>> gcc/config/arm/mve.md | 20 ++++++++++++---- > >>> gcc/config/arm/neon.md | 2 +- > >>> gcc/config/arm/unspecs.md | 2 -- > >>> gcc/config/arm/vec-common.md | 15 ++++++++++++ > >>> gcc/testsuite/gcc.target/arm/simd/mve-vand.c | 34 > >>> ++++++++++++++++++++++++++++ > >>> 6 files changed, 66 insertions(+), 11 deletions(-) > >>> create mode 100644 gcc/testsuite/gcc.target/arm/simd/mve-vand.c > >>> > >>> diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md > >>> index 592af35..72039e4 100644 > >>> --- a/gcc/config/arm/iterators.md > >>> +++ b/gcc/config/arm/iterators.md > >>> @@ -1232,8 +1232,7 @@ (define_int_attr supf [(VCVTQ_TO_F_S "s") > >>> (VCVTQ_TO_F_U "u") (VREV16Q_S "s") > >>> (VADDLVQ_P_U "u") (VCMPNEQ_U "u") (VCMPNEQ_S "s") > >>> (VABDQ_M_S "s") (VABDQ_M_U "u") (VABDQ_S "s") > >>> (VABDQ_U "u") (VADDQ_N_S "s") (VADDQ_N_U "u") > >>> - (VADDVQ_P_S "s") (VADDVQ_P_U "u") (VANDQ_S "s") > >>> - (VANDQ_U "u") (VBICQ_S "s") (VBICQ_U "u") > >>> + (VADDVQ_P_S "s") (VADDVQ_P_U "u") (VBICQ_S "s") > >>> (VBICQ_U "u") > >>> (VBRSRQ_N_S "s") (VBRSRQ_N_U "u") (VCADDQ_ROT270_S > >>> "s") > >>> (VCADDQ_ROT270_U "u") (VCADDQ_ROT90_S "s") > >>> (VCMPEQQ_S "s") (VCMPEQQ_U "u") (VCADDQ_ROT90_U "u") > >>> @@ -1501,7 +1500,6 @@ (define_int_iterator VABDQ [VABDQ_S VABDQ_U]) > >>> (define_int_iterator VADDQ_N [VADDQ_N_S VADDQ_N_U]) > >>> (define_int_iterator VADDVAQ [VADDVAQ_S VADDVAQ_U]) > >>> (define_int_iterator VADDVQ_P [VADDVQ_P_U VADDVQ_P_S]) > >>> -(define_int_iterator VANDQ [VANDQ_U VANDQ_S]) > >>> (define_int_iterator VBICQ [VBICQ_S VBICQ_U]) > >>> (define_int_iterator VBRSRQ_N [VBRSRQ_N_U VBRSRQ_N_S]) > >>> (define_int_iterator VCADDQ_ROT270 [VCADDQ_ROT270_S VCADDQ_ROT270_U]) > >>> diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md > >>> index ecbaaa9..975eb7d 100644 > >>> --- a/gcc/config/arm/mve.md > >>> +++ b/gcc/config/arm/mve.md > >>> @@ -894,17 +894,27 @@ (define_insn "mve_vaddvq_p_<supf><mode>" > >>> ;; > >>> ;; [vandq_u, vandq_s]) > >>> ;; > >>> -(define_insn "mve_vandq_<supf><mode>" > >>> +;; signed and unsigned versions are the same: define the unsigned > >>> +;; insn, and use an expander for the signed one as we still reference > >>> +;; both names from arm_mve.h. > >>> +(define_insn "mve_vandq_u<mode>" > >>> [ > >>> (set (match_operand:MVE_2 0 "s_register_operand" "=w") > >>> - (unspec:MVE_2 [(match_operand:MVE_2 1 "s_register_operand" "w") > >>> - (match_operand:MVE_2 2 "s_register_operand" "w")] > >>> - VANDQ)) > >>> + (and:MVE_2 (match_operand:MVE_2 1 "s_register_operand" "w") > >>> + (match_operand:MVE_2 2 "s_register_operand" "w"))) > >> The predicate on the second operand is more restrictive than the one in > >> expand 'neon_inv_logic_op2'. This should still work with immediates, or > >> well I checked for integers, it generates a loop as such: > >> > > Right, thanks for catching it. > > > >> vldrw.32 q3, [r0] > >> vldr.64 d4, .L8 > >> vldr.64 d5, .L8+8 > >> vand q3, q3, q2 > >> vstrw.32 q3, [r2] > >> > >> MVE does support vand's with immediates, just like NEON, I suspect you > >> could just copy the way Neon handles these, possibly worth the little > >> extra effort. You can use dest[i] = a[i] & ~1 as a testcase. > >> If you don't it might still be worth expanding the test to make sure > >> other immediates-types combinations don't trigger an ICE? > >> > >> I'm not sure I understand why it loads it in two 64-bit chunks and not > >> do a single load or not just do something like a vmov or vbic immediate. > >> Anyhow that's a worry for another day I guess.. > > Do you mean something like the attached (on top of this patch)? > > I dislike the code duplication in mve_vandq_u<mode> which would > > become a copy of and<mode>3_neon. > Hi Christophe, > > Yeah that's what I meant. I agree with the code duplication. The reason > we still use separate ones is because of the difference in supported > modes. Maybe the right way around that would be to redefine VDQ as: > > (define_mode_iterator VDQ [(V8QI "TARGET_HAVE_NEON") V16QI > (V4HI > "TARGET_HAVE_NEON") V8HI > (V2SI > "TARGET_HAVE_NEON") V4SI > (V4HF > "TARGET_HAVE_NEON") V8HF > (V2SF > "TARGET_HAVE_NEON") V4SF V2DI]) > > And have a single define_expand and insn for both vector extensions.
Indeed, I can try that. I have also noticed the VNIM1 / VNINOTM1 pair. > Though we would also need to do something about the type attribut in > case we want to have different scheduling types for both. Though right > now we don't do any scheduling for MVE, I don't know whether these can > be conditionalized on target features though, something to look at. > > > > > The other concern is that it's not exercised by testcase: as you noted > > the compiler uses a pair of loads to prepare the second operand. > > > > But indeed that's probably a separate problem. > > > > I guess your comments apply to patch 2 (vorr)? > > Yeah, forgot to reply to that one, but yes! I still need to have a look > at 4-7. Ok thanks, I have a WIP fix for #7 (vmvn) anyway. > Kind regards, > Andre >