From: Richard Sandiford <richard.sandif...@arm.com>
Date: Monday, June 12, 2023 at 2:15 PM
To: Tejas Belagod <tejas.bela...@arm.com>
Cc: gcc-patches@gcc.gnu.org <gcc-patches@gcc.gnu.org>, Tejas Belagod <tejas.bela...@arm.com>
Subject: Re: [PATCH v2] [PR96339] Optimise svlast[ab]
Tejas Belagod <tejas.bela...@arm.com> writes:
> From: Tejas Belagod <tbela...@arm.com>
>
>   This PR optimizes an SVE intrinsics sequence such as
>     svlasta (svptrue_pat_b8 (SV_VL1), x)
>   in which a scalar is selected based on a constant predicate and a variable
>   vector.
>   This sequence is optimized to return the corresponding element of a NEON
>   vector. For example,
>     svlasta (svptrue_pat_b8 (SV_VL1), x)
>   returns
>     umov    w0, v0.b[1]
>   Likewise,
>     svlastb (svptrue_pat_b8 (SV_VL1), x)
>   returns
>     umov    w0, v0.b[0]
>   This optimization only works provided the constant predicate maps to a range
>   that is within the bounds of a 128-bit NEON register.
>
> gcc/ChangeLog:
>
>        PR target/96339
>        * config/aarch64/aarch64-sve-builtins-base.cc (svlast_impl::fold): Fold sve
>        calls that have a constant input predicate vector.
>        (svlast_impl::is_lasta): Query to check if intrinsic is svlasta.
>        (svlast_impl::is_lastb): Query to check if intrinsic is svlastb.
>        (svlast_impl::vect_all_same): Check if all vector elements are equal.
>
> gcc/testsuite/ChangeLog:
>
>        PR target/96339
>        * gcc.target/aarch64/sve/acle/general-c/svlast.c: New.
>        * gcc.target/aarch64/sve/acle/general-c/svlast128_run.c: New.
>        * gcc.target/aarch64/sve/acle/general-c/svlast256_run.c: New.
>        * gcc.target/aarch64/sve/pcs/return_4.c (caller_bf16): Fix asm
>        to expect optimized code for function body.
>        * gcc.target/aarch64/sve/pcs/return_4_128.c (caller_bf16): Likewise.
>        * gcc.target/aarch64/sve/pcs/return_4_256.c (caller_bf16): Likewise.
>        * gcc.target/aarch64/sve/pcs/return_4_512.c (caller_bf16): Likewise.
>        * gcc.target/aarch64/sve/pcs/return_4_1024.c (caller_bf16): Likewise.
>        * gcc.target/aarch64/sve/pcs/return_4_2048.c (caller_bf16): Likewise.
>        * gcc.target/aarch64/sve/pcs/return_5.c (caller_bf16): Likewise.
>        * gcc.target/aarch64/sve/pcs/return_5_128.c (caller_bf16): Likewise.
>        * gcc.target/aarch64/sve/pcs/return_5_256.c (caller_bf16): Likewise.
>        * gcc.target/aarch64/sve/pcs/return_5_512.c (caller_bf16): Likewise.
>        * gcc.target/aarch64/sve/pcs/return_5_1024.c (caller_bf16): Likewise.
>        * gcc.target/aarch64/sve/pcs/return_5_2048.c (caller_bf16): Likewise.
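
For readers following the thread, the element selection described in the
patch summary above can be modelled in scalar C. This is only a sketch of
the architectural LASTA/LASTB behaviour as I understand it, not the code in
svlast_impl::fold; the names last_active, model_lasta and model_lastb are
made up for illustration, and the predicate is represented as a plain bool
array rather than an svbool_t.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: index of the last active predicate element,
   or -1 if no element is active.  */
static int
last_active (const bool *pred, size_t n)
{
  for (size_t i = n; i-- > 0;)
    if (pred[i])
      return (int) i;
  return -1;
}

/* Scalar model of svlastb: the last active element, or the final
   element of the vector if no element is active.  */
static uint8_t
model_lastb (const bool *pred, const uint8_t *x, size_t n)
{
  int i = last_active (pred, n);
  return x[i < 0 ? n - 1 : (size_t) i];
}

/* Scalar model of svlasta: the element after the last active one,
   wrapping round to element 0 (also used when nothing is active).  */
static uint8_t
model_lasta (const bool *pred, const uint8_t *x, size_t n)
{
  int i = last_active (pred, n);
  return x[(size_t) (i + 1) % n];
}
```

With a predicate in which only element 0 is active (the SV_VL1 pattern),
model_lasta selects x[1] and model_lastb selects x[0], which matches the
v0.b[1] and v0.b[0] lanes in the umov examples quoted above. Because both
indices are known at compile time and fall inside the low 128 bits, the
lookup can be expressed as a fixed NEON lane extract.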

OK, thanks.

Applied on master, thanks.

Tejas.


Richard
