On Fri, Jan 14, 2022 at 2:18 PM Christophe Lyon via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Hi,
>
> I hadn't realized we are moving to stage 4 this week-end :-(
>
> The PRs I'm fixing are P3, but without these fixes MVE support is badly
> broken, so I think it would be really good to fix that before the buggy
> version becomes part of an actual release.
> Anyway I posted v1 of the patches during stage1, so it should still be OK
> if they are accepted as-is?

In the end it's up to the target maintainers to weigh the risk of breakage
vs. the risk of the feature not being useful ;)  But stage3 is where the
"was posted during stage1" rule can easily apply - at some point we have to
stop applying such a general rule.

Richard.

> Thanks,
>
> Christophe
>
> On Thu, Jan 13, 2022 at 3:58 PM Christophe Lyon via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
>
> >
> > This is v3 of this patch series, fixing issues I discovered before
> > committing v2 (which had been approved).
> >
> > Thanks a lot to Richard Sandiford for his help.
> >
> > The changes v2 -> v3 are:
> >
> > Patch 4: Fix arm_hard_regno_nregs and CLASS_MAX_NREGS to support VPR.
> >
> > Patch 7: Changes to the underlying representation of vectors of
> > booleans to account for the different expectations between AArch64/SVE
> > and Arm/MVE.
> >
> > Patch 8: Re-use and extend existing thumb2_movhi* patterns instead of
> > duplicating them in mve_mov<mode>. This requires the introduction of a
> > new constraint to match a constant vector of booleans. Add a new RTL
> > test.
> >
> > Patch 9: Introduce check_effective_target_arm_mve and skip
> > gcc.dg/signbit-2.c, because with MVE there is no fallback architecture
> > unlike SVE or AVX512.
> >
> > Patch 12: Update fewer load/store MVE builtins
> > (mve_vldrdq_gather_base_z_<supf>v2di,
> > mve_vldrdq_gather_offset_z_<supf>v2di,
> > mve_vldrdq_gather_shifted_offset_z_<supf>v2di,
> > mve_vstrdq_scatter_base_p_<supf>v2di,
> > mve_vstrdq_scatter_offset_p_<supf>v2di,
> > mve_vstrdq_scatter_offset_p_<supf>v2di_insn,
> > mve_vstrdq_scatter_shifted_offset_p_<supf>v2di,
> > mve_vstrdq_scatter_shifted_offset_p_<supf>v2di_insn,
> > mve_vstrdq_scatter_base_wb_p_<supf>v2di,
> > mve_vldrdq_gather_base_wb_z_<supf>v2di,
> > mve_vldrdq_gather_base_nowb_z_<supf>v2di,
> > mve_vldrdq_gather_base_wb_z_<supf>v2di_insn) for which we keep HI mode
> > for vpr_register_operand.
> >
> > Patch 13: No need to update
> > gcc.target/arm/acle/cde-mve-full-assembly.c anymore since we re-use
> > the mov pattern that emits '@ movhi' in the assembly.
> >
> > Patch 15: This is a new patch to fix a problem I noticed during this
> > v2->v3 update.
> >
> >
> >
> > I'll squash patch 2 with patch 9 and patch 3 with patch 8.
> >
> > Original text:
> >
> > This patch series addresses PR 100757 and 101325 by representing
> > vectors of predicates (MVE VPR.P0 register) as vectors of booleans
> > rather than using HImode.
> >
> > As this implies a lot of mostly mechanical changes, I have tried to
> > split the patches in a way that should help reviewers, but the split
> > is a bit artificial.
> >
> > Patches 1-3 add new tests.
> >
> > Patches 4-6 are small independent improvements.
> >
> > Patch 7 implements the predicate qualifier, but does not change any
> > builtin yet.
> >
> > Patch 8 is the first of the two main patches, and uses the new
> > qualifier to describe the vcmp and vpsel builtins that are useful for
> > auto-vectorization of comparisons.
> >
> > Patch 9 is the second main patch, which fixes the vcond_mask expander.
> >
> > Patches 10-13 convert almost all the remaining builtins with HI
> > operands to use the predicate qualifier.  After these, there are still
> > a few builtins with HI operands left, about which I am not sure: vctp,
> > vpnot, load-gather and store-scatter with v2di operands.  In fact,
> > patches 11/12 update some STR/LDR qualifiers in a way that breaks
> > these v2di builtins although existing tests still pass.
> >
> > Christophe Lyon (15):
> >   arm: Add new tests for comparison vectorization with Neon and MVE
> >   arm: Add tests for PR target/100757
> >   arm: Add tests for PR target/101325
> >   arm: Add GENERAL_AND_VPR_REGS regclass
> >   arm: Add support for VPR_REG in arm_class_likely_spilled_p
> >   arm: Fix mve_vmvnq_n_<supf><mode> argument mode
> >   arm: Implement MVE predicates as vectors of booleans
> >   arm: Implement auto-vectorized MVE comparisons with vectors of boolean
> >     predicates
> >   arm: Fix vcond_mask expander for MVE (PR target/100757)
> >   arm: Convert remaining MVE vcmp builtins to predicate qualifiers
> >   arm: Convert more MVE builtins to predicate qualifiers
> >   arm: Convert more load/store MVE builtins to predicate qualifiers
> >   arm: Convert more MVE/CDE builtins to predicate qualifiers
> >   arm: Add VPR_REG to ALL_REGS
> >   arm: Fix constraint check for V8HI in mve_vector_mem_operand
> >
> >  gcc/config/aarch64/aarch64-modes.def          |   8 +-
> >  gcc/config/arm/arm-builtins.c                 | 224 +++--
> >  gcc/config/arm/arm-builtins.h                 |   4 +-
> >  gcc/config/arm/arm-modes.def                  |   8 +
> >  gcc/config/arm/arm-protos.h                   |   4 +-
> >  gcc/config/arm/arm-simd-builtin-types.def     |   4 +
> >  gcc/config/arm/arm.c                          | 169 ++--
> >  gcc/config/arm/arm.h                          |   9 +-
> >  gcc/config/arm/arm_mve_builtins.def           | 746 ++++++++--------
> >  gcc/config/arm/constraints.md                 |   6 +
> >  gcc/config/arm/iterators.md                   |   6 +
> >  gcc/config/arm/mve.md                         | 795 ++++++++++--------
> >  gcc/config/arm/neon.md                        |  39 +
> >  gcc/config/arm/vec-common.md                  |  52 --
> >  gcc/config/arm/vfp.md                         |  34 +-
> >  gcc/doc/sourcebuild.texi                      |   4 +
> >  gcc/emit-rtl.c                                |  20 +-
> >  gcc/genmodes.c                                |  81 +-
> >  gcc/machmode.def                              |   2 +-
> >  gcc/rtx-vector-builder.c                      |   4 +-
> >  gcc/simplify-rtx.c                            |  34 +-
> >  gcc/testsuite/gcc.dg/signbit-2.c              |   1 +
> >  .../gcc.target/arm/simd/mve-vcmp-f32-2.c      |  32 +
> >  .../gcc.target/arm/simd/neon-compare-1.c      |  78 ++
> >  .../gcc.target/arm/simd/neon-compare-2.c      |  13 +
> >  .../gcc.target/arm/simd/neon-compare-3.c      |  14 +
> >  .../arm/simd/neon-compare-scalar-1.c          |  57 ++
> >  .../gcc.target/arm/simd/neon-vcmp-f16.c       |  12 +
> >  .../gcc.target/arm/simd/neon-vcmp-f32-2.c     |  15 +
> >  .../gcc.target/arm/simd/neon-vcmp-f32-3.c     |  12 +
> >  .../gcc.target/arm/simd/neon-vcmp-f32.c       |  12 +
> >  gcc/testsuite/gcc.target/arm/simd/neon-vcmp.c |  22 +
> >  .../gcc.target/arm/simd/pr100757-2.c          |  20 +
> >  .../gcc.target/arm/simd/pr100757-3.c          |  20 +
> >  .../gcc.target/arm/simd/pr100757-4.c          |  19 +
> >  gcc/testsuite/gcc.target/arm/simd/pr100757.c  |  19 +
> >  .../gcc.target/arm/simd/pr101325-2.c          |  19 +
> >  gcc/testsuite/gcc.target/arm/simd/pr101325.c  |  14 +
> >  gcc/testsuite/lib/target-supports.exp         |  15 +-
> >  gcc/varasm.c                                  |   7 +-
> >  40 files changed, 1635 insertions(+), 1019 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/arm/simd/mve-vcmp-f32-2.c
> >  create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-compare-1.c
> >  create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-compare-2.c
> >  create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-compare-3.c
> >  create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-compare-scalar-1.c
> >  create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f16.c
> >  create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32-2.c
> >  create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32-3.c
> >  create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp-f32.c
> >  create mode 100644 gcc/testsuite/gcc.target/arm/simd/neon-vcmp.c
> >  create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr100757-2.c
> >  create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr100757-3.c
> >  create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr100757-4.c
> >  create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr100757.c
> >  create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr101325-2.c
> >  create mode 100644 gcc/testsuite/gcc.target/arm/simd/pr101325.c
> >
> > --
> > 2.25.1
> >
> >
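
As a rough illustration of why the quoted cover letter moves from HImode
to vectors of booleans, here is a minimal C model. It is not taken from
the patches and is only a sketch. It assumes the architectural layout of
MVE's VPR.P0: 16 predicate bits, one per byte of a 128-bit vector, so a
32-bit lane owns 4 consecutive bits and a 16-bit lane owns 2. The names
model_vcmpgt_s32 and model_lane_active are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative model only, not GCC code.  Assumption: P0 has one bit per
   byte of the 128-bit vector, so a lane of N bytes owns N adjacent bits.  */

/* Build the 16-bit predicate for "a[i] > b[i]" over four 32-bit lanes.
   Each true lane sets all 4 of its predicate bits.  */
static uint16_t model_vcmpgt_s32(const int32_t a[4], const int32_t b[4])
{
    uint16_t p0 = 0;
    for (int lane = 0; lane < 4; lane++)
        if (a[lane] > b[lane])
            p0 |= (uint16_t)(0xFu << (lane * 4));
    return p0;
}

/* Interpret a raw 16-bit predicate for a given lane size in bytes.
   Returns 1 if the lowest predicate bit of that lane is set.  */
static int model_lane_active(uint16_t p0, int lane, int lane_bytes)
{
    assert(lane_bytes == 1 || lane_bytes == 2 || lane_bytes == 4);
    assert(lane >= 0 && lane * lane_bytes < 16);
    return (p0 >> (lane * lane_bytes)) & 1;
}
```

The same raw 16-bit value describes different sets of lanes depending on
the element size: 0x00F0 is "lane 1 of 4" for 32-bit elements but "lanes
2 and 3 of 8" for 16-bit elements. A plain 16-bit integer does not carry
that lane count, whereas a vector-of-booleans mode does, which is
presumably what the series relies on.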
