Re: [PATCH] aarch64: Expand CTZ to RBIT + CLZ for SVE [PR109498]

Soumya AR Sun, 29 Sep 2024 21:54:08 -0700

I reworked the patch so that the test case matches registers with regular
expressions instead of hard-coded register numbers. Apologies for the oversight.


Thanks,
Soumya



> On 24 Sep 2024, at 8:53 AM, Soumya AR <soum...@nvidia.com> wrote:
>
> Currently, we vectorize CTZ for SVE by using the following operation:
> .CTZ (X) = (PREC - 1) - .CLZ (X & -X)
>
> Instead, this patch expands CTZ to RBIT + CLZ for SVE, as suggested in 
> PR109498.
>
> The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
> OK for mainline?
>
> Signed-off-by: Soumya AR <soum...@nvidia.com>
>
> gcc/ChangeLog:
> PR target/109498
> * config/aarch64/aarch64-sve.md (ctz<mode>2): Added pattern to expand
>        CTZ to RBIT + CLZ for SVE.
>
> gcc/testsuite/ChangeLog:
> PR target/109498
> * gcc.target/aarch64/sve/ctz.c: New test.
>
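
For readers following the thread, the two expansions quoted above can be
compared on scalars. This is only a minimal sketch, not code from the patch:
it assumes 32-bit elements (PREC = 32), and the helpers rbit32, clz32,
ctz_via_clz and ctz_via_rbit are hypothetical names that model RBIT and CLZ
in portable C rather than using the SVE instructions themselves.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: reverse the bit order of a 32-bit value,
   modelling what RBIT does to each element. */
static uint32_t rbit32(uint32_t x)
{
    uint32_t r = 0;
    for (int i = 0; i < 32; i++) {
        r = (r << 1) | (x & 1u);   /* lowest bit of x becomes highest of r */
        x >>= 1;
    }
    return r;
}

/* Hypothetical helper: count leading zeros. Assumed to return 32 for a
   zero input, so it is defined for every argument. */
static int clz32(uint32_t x)
{
    int n = 0;
    for (uint32_t mask = 0x80000000u; mask != 0 && (x & mask) == 0; mask >>= 1)
        n++;
    return n;
}

/* Current form: .CTZ (X) = (PREC - 1) - .CLZ (X & -X), with PREC = 32. */
static int ctz_via_clz(uint32_t x)
{
    return 31 - clz32(x & (0u - x));   /* x & -x isolates the lowest set bit */
}

/* Proposed form: CTZ (X) = CLZ (RBIT (X)). */
static int ctz_via_rbit(uint32_t x)
{
    return clz32(rbit32(x));
}

/* Check that both forms agree on a few nonzero sample inputs. */
static void check_ctz_forms(void)
{
    static const uint32_t samples[] = { 1u, 2u, 8u, 0x50u, 0x80000000u, 0xffffffffu };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(ctz_via_clz(samples[i]) == ctz_via_rbit(samples[i]));
}
```

Both functions return 3 for an input of 8. With these helper definitions the
two forms differ only at zero: ctz_via_clz(0) gives -1, while ctz_via_rbit(0)
gives 32.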

0001-aarch64-Expand-CTZ-to-RBIT-CLZ-for-SVE-PR109498.patch
