Re: [PATCH 2/3] arm: Remove unnecessary zero-extending of MVE predicates before use [PR 107674]

Andre Vieira (lists) via Gcc-patches Fri, 27 Jan 2023 01:58:26 -0800



On 26/01/2023 15:06, Kyrylo Tkachov wrote:

Hi Andre,

-----Original Message-----
From: Andre Vieira (lists) <andre.simoesdiasvie...@arm.com>
Sent: Tuesday, January 24, 2023 1:54 PM
To: gcc-patches@gcc.gnu.org
Cc: Richard Sandiford <richard.sandif...@arm.com>; Richard Earnshaw
<richard.earns...@arm.com>; Richard Biener <rguent...@suse.de>;
Kyrylo Tkachov <kyrylo.tkac...@arm.com>
Subject: [PATCH 2/3] arm: Remove unnecessary zero-extending of MVE
predicates before use [PR 107674]

Hi,

This patch teaches GCC that zero-extending a MVE predicate from 16-bits
to 32-bits and then only using 16-bits is a no-op.
It does so in two steps:
- it lets gcc know that it can access any MVE predicate mode using any
other MVE predicate mode without needing to copy it, using the
TARGET_MODES_TIEABLE_P hook,
- it teaches simplify_subreg to optimize a subreg with a vector
outermode, by replacing this outermode with a same-sized integer mode
and trying the avalailable optimizations, then if successful it
surrounds the result with a subreg casting it back to the original
vector outermode.

This removes the unnecessary zero-extending shown on PR 107674 (though
it's a sign-extend there), that was introduced in gcc 11.

Bootstrapped on aarch64-none-linux-gnu and regression tested on
arm-none-eabi and armeb-none-eabi for armv8.1-m.main+mve.fp.

OK for trunk?

gcc/ChangeLog:

        PR target/107674
          * conig/arm/arm.cc (arm_hard_regno_mode_ok): Use new MACRO.
          (arm_modes_tieable_p): Make MVE predicate modes tieable.
        * config/arm/arm.h (VALID_MVE_PRED_MODE):  New define.
        * simplify-rtx.cc (simplify_context::simplify_subreg): Teach
        simplify_subreg to simplify subregs where the outermode is not
scalar.


The arm changes look ok to me. We'll want a midend maintainer to have a look at 
simplify-rtx.cc


gcc/testsuite/ChangeLog:

        * gcc.target/arm/mve/mve_vpt.c: Change to remove unecessary
        zero-extend.


diff --git a/gcc/testsuite/gcc.target/arm/mve/mve_vpt.c 
b/gcc/testsuite/gcc.target/arm/mve/mve_vpt.c
index 
26a565b79dd1348e361b3aa23a1d6e6d13bffce8..8e562a9f065eff157f63ebd5acf9af0a2155b5c5
 100644
--- a/gcc/testsuite/gcc.target/arm/mve/mve_vpt.c
+++ b/gcc/testsuite/gcc.target/arm/mve/mve_vpt.c
@@ -16,9 +16,6 @@ void test0 (uint8_t *a, uint8_t *b, uint8_t *c)
  **    vldrb.8 q2, \[r0\]
  **    vldrb.8 q1, \[r1\]
  **    vcmp.i8 eq, q2, q1
-**     vmrs    r3, p0  @ movhi
-**     uxth    r3, r3
-**     vmsr    p0, r3  @ movhi
  **    vpst
  **    vaddt.i8        q3, q2, q1
  **    vpst

Ah I see, that's the testcase from patch 1/3 that I criticized :)
Maybe if we just scan for absence of an uxth, vmrs and vmsr it will be more 
robust?
Thanks,
Kyrill

I could, but I would rather not. I have a patch series waiting for GCC14 that does further improvements to this (and other VPST codegen)sequences and if I do scan for 'absence' of an instruction I have tobreak them up into single tests each. Also it wouldn't then fail if westart spilling the predicate directly to memory for instance. Like Imentioned in the previous patch, the sequence is unlikely to be able tochange through scheduling (other than maybe the reordering of the loadsthrough some bad luck, but I could make it robust to that).

Re: [PATCH 2/3] arm: Remove unnecessary zero-extending of MVE predicates before use [PR 107674]

Reply via email to