Hi Andre,

> -----Original Message-----
> From: Andre Vieira (lists) <andre.simoesdiasvie...@arm.com>
> Sent: Tuesday, January 24, 2023 1:54 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Richard Sandiford <richard.sandif...@arm.com>; Richard Earnshaw
> <richard.earns...@arm.com>; Richard Biener <rguent...@suse.de>;
> Kyrylo Tkachov <kyrylo.tkac...@arm.com>
> Subject: [PATCH 2/3] arm: Remove unnecessary zero-extending of MVE
> predicates before use [PR 107674]
> 
> Hi,
> 
> This patch teaches GCC that zero-extending a MVE predicate from 16-bits
> to 32-bits and then only using 16-bits is a no-op.
> It does so in two steps:
> - it lets gcc know that it can access any MVE predicate mode using any
> other MVE predicate mode without needing to copy it, using the
> TARGET_MODES_TIEABLE_P hook,
> - it teaches simplify_subreg to optimize a subreg with a vector
> outermode, by replacing this outermode with a same-sized integer mode
> and trying the avalailable optimizations, then if successful it
> surrounds the result with a subreg casting it back to the original
> vector outermode.
> 
> This removes the unnecessary zero-extending shown on PR 107674 (though
> it's a sign-extend there), that was introduced in gcc 11.
> 
> Bootstrapped on aarch64-none-linux-gnu and regression tested on
> arm-none-eabi and armeb-none-eabi for armv8.1-m.main+mve.fp.
> 
> OK for trunk?
> 
> gcc/ChangeLog:
> 
>       PR target/107674
>          * conig/arm/arm.cc (arm_hard_regno_mode_ok): Use new MACRO.
>          (arm_modes_tieable_p): Make MVE predicate modes tieable.
>       * config/arm/arm.h (VALID_MVE_PRED_MODE):  New define.
>       * simplify-rtx.cc (simplify_context::simplify_subreg): Teach
>       simplify_subreg to simplify subregs where the outermode is not
> scalar.

The arm changes look ok to me. We'll want a midend maintainer to have a look at 
simplify-rtx.cc

> 
> gcc/testsuite/ChangeLog:
> 
>       * gcc.target/arm/mve/mve_vpt.c: Change to remove unecessary
>       zero-extend.

diff --git a/gcc/testsuite/gcc.target/arm/mve/mve_vpt.c 
b/gcc/testsuite/gcc.target/arm/mve/mve_vpt.c
index 
26a565b79dd1348e361b3aa23a1d6e6d13bffce8..8e562a9f065eff157f63ebd5acf9af0a2155b5c5
 100644
--- a/gcc/testsuite/gcc.target/arm/mve/mve_vpt.c
+++ b/gcc/testsuite/gcc.target/arm/mve/mve_vpt.c
@@ -16,9 +16,6 @@ void test0 (uint8_t *a, uint8_t *b, uint8_t *c)
 **     vldrb.8 q2, \[r0\]
 **     vldrb.8 q1, \[r1\]
 **     vcmp.i8 eq, q2, q1
-**     vmrs    r3, p0  @ movhi
-**     uxth    r3, r3
-**     vmsr    p0, r3  @ movhi
 **     vpst
 **     vaddt.i8        q3, q2, q1
 **     vpst

Ah I see, that's the testcase from patch 1/3 that I criticized :)
Maybe if we just scan for absence of an uxth, vmrs and vmsr it will be more 
robust?
Thanks,
Kyrill

Reply via email to