On Mon, Aug 17, 2020 at 7:42 PM Dennis Zhang <dennis.zh...@arm.com> wrote:
>
>
> Hi all,
>
> This patch enables MVE vsub instructions for auto-vectorization.
> It adds RTL templates for MVE vsub instructions using 'minus' instead of
> unspec expression to make the instructions recognizable for vectorization.
> MVE target is added in sub<mode>3 optab. The sub<mode>3 optab is
> modified to use a mode iterator that selects available modes for various
> targets correspondingly.
> MVE vector modes are enabled in arm_preferred_simd_mode in arm.c to
> support vectorization.
>
> This patch also fixes 'vreinterpretq_*.c' MVE intrinsic tests. The tests
> generate wrong instruction numbers because of unexpected icf optimization.
> This bug is exposed by the MVE vector modes enabled in this patch,
> therefore it is corrected in this patch to avoid test failures.
>
> MVE instructions are documented here:
> https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/helium-intrinsics
>

Hi Dennis,

Thanks for this patch . However a quick read suggests  at first glance
that it could do with some refactoring or indeed further breaking
down.

1. The refactor for TARGET_NEON_IWWMMXT and friends which I don't get
the motivation for obviously on a quick read. I'll try and read that
again. Please document why these complex TARGET_ macros exist and how
they are expected to be used in the machine description and what they
are indicated to do.
2. It seems odd that we would have
 "&& ((<MODE>mode != V2SFmode && <MODE>mode != V4SFmode)
+    || flag_unsafe_math_optimizations))" apply to TARGET_NEON but not
apply this to TARGET_MVE_FLOAT in the sub<mode>3 expander. The point
is that if it isn't safe to vectorize a subtract for Neon, why is it
safe to do the same for MVE ? This was done in 2010 by Julian to fix
PR target/43703 - isn't this applicable on MVE as well ?
3. I'm also going to quibble a bit about the use of VSEL as the name
of an iterator as that conflates it with the instruction vsel and it's
not obvious what's going on here.


> This patch also fixes 'vreinterpretq_*.c' MVE intrinsic tests. The tests
> generate wrong instruction numbers because of unexpected icf optimization.
> This bug is exposed by the MVE vector modes enabled in this patch,
> therefore it is corrected in this patch to avoid test failures.
>

I'm a bit confused as to why this got exposed because of the new MVE
vector modes exposed by this patch.

> The patch is regtested for arm-none-eabi and bootstrapped for
> arm-none-linux-gnueabihf.
>
Bootstrapped and regression tested for arm-none-linux-gnueabihf with a
--with-fpu=neon in the configuration ?


> Is it OK for trunk please?



Ramana

>
> Thanks
> Dennis
>
> gcc/ChangeLog:
>
> 2020-08-10  Dennis Zhang  <dennis.zh...@arm.com>
>
>         * config/arm/arm.c (arm_preferred_simd_mode): Enable MVE vector modes.
>         * config/arm/arm.h (TARGET_NEON_IWMMXT): New macro.
>         (TARGET_NEON_IWMMXT_MVE, TARGET_NEON_IWMMXT_MVE_FP): Likewise.
>         (TARGET_NEON_MVE_HFP): Likewise.
>         * config/arm/iterators.md (VSEL): New mode iterator to select modes
>         for corresponding targets.
>         * config/arm/mve.md (mve_vsubq<mode>): New entry for vsub instruction
>         using expression 'minus'.
>         (mve_vsubq_f<mode>): Use minus instead of VSUBQ_F unspec.
>         * config/arm/neon.md (sub<mode>3): Removed here. Integrated in the
>         sub<mode>3 in vec-common.md
>         * config/arm/vec-common.md (sub<mode>3): Enable MVE target. Use VSEL
>         to select available modes. Exclude TARGET_NEON_FP16INST from
>         TARGET_NEON statement. Intergrate TARGET_NEON_FP16INST which is
>         originally in neon.md.
>
> gcc/testsuite/ChangeLog:
>
> 2020-08-10  Dennis Zhang  <dennis.zh...@arm.com>
>
>         * gcc.target/arm/mve/intrinsics/vreinterpretq_f16.c: Use additional
>         option -fno-ipa-icf and change the instruction count from 8 to 16.
>         * gcc.target/arm/mve/intrinsics/vreinterpretq_f32.c: Likewise.
>         * gcc.target/arm/mve/intrinsics/vreinterpretq_s16.c: Likewise.
>         * gcc.target/arm/mve/intrinsics/vreinterpretq_s32.c: Likewise.
>         * gcc.target/arm/mve/intrinsics/vreinterpretq_s64.c: Likewise.
>         * gcc.target/arm/mve/intrinsics/vreinterpretq_s8.c: Likewise.
>         * gcc.target/arm/mve/intrinsics/vreinterpretq_u16.c: Likewise.
>         * gcc.target/arm/mve/intrinsics/vreinterpretq_u32.c: Likewise.
>         * gcc.target/arm/mve/intrinsics/vreinterpretq_u64.c: Likewise.
>         * gcc.target/arm/mve/intrinsics/vreinterpretq_u8.c: Likewise.
>         * gcc.target/arm/mve/mve.exp: Include tests in subdir 'vect'.
>         * gcc.target/arm/mve/vect/vect_sub_0.c: New test.
>         * gcc.target/arm/mve/vect/vect_sub_1.c: New test.

Reply via email to