Reworked the patch to substitute immediate register values in the test case with regular expressions. Apologies for the oversight.
Thanks,
Soumya

> On 24 Sep 2024, at 8:53 AM, Soumya AR <soum...@nvidia.com> wrote:
>
> Currently, we vectorize CTZ for SVE by using the following operation:
> .CTZ (X) = (PREC - 1) - .CLZ (X & -X)
>
> Instead, this patch expands CTZ to RBIT + CLZ for SVE, as suggested in
> PR109498.
>
> The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
> OK for mainline?
>
> Signed-off-by: Soumya AR <soum...@nvidia.com>
>
> gcc/ChangeLog:
> PR target/109498
> * config/aarch64/aarch64-sve.md (ctz<mode>2): Added pattern to expand
> CTZ to RBIT + CLZ for SVE.
>
> gcc/testsuite/ChangeLog:
> PR target/109498
> * gcc.target/aarch64/sve/ctz.c: New test.
>
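For readers who want to see why the two expansions agree, here is a minimal scalar sketch in C. It is not the SVE pattern from the patch and not code taken from it: the helper names (clz32, rbit32, ctz_via_isolate, ctz_via_rbit, check_ctz_forms) and the fixed PREC of 32 are assumptions made for illustration only. It models CLZ and RBIT with plain loops and compares the current form, (PREC - 1) - CLZ (X & -X), with the proposed form, CLZ (RBIT (X)). Only nonzero inputs are compared, since the two forms differ at zero in this model.

```c
#include <assert.h>
#include <stdint.h>

#define PREC 32  /* element precision in bits; assumed for this sketch */

/* Scalar model of CLZ: number of leading zero bits, PREC for x == 0. */
static unsigned clz32(uint32_t x)
{
    unsigned n = 0;
    if (x == 0)
        return PREC;
    while (!(x & 0x80000000u)) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Scalar model of RBIT: reverse the order of all PREC bits. */
static uint32_t rbit32(uint32_t x)
{
    uint32_t r = 0;
    for (int i = 0; i < PREC; i++) {
        r = (r << 1) | (x & 1u);
        x >>= 1;
    }
    return r;
}

/* Current form: isolate the lowest set bit, then count leading zeros. */
static unsigned ctz_via_isolate(uint32_t x)
{
    return (PREC - 1) - clz32(x & (0u - x));
}

/* Proposed form: reverse the bits, then count leading zeros. */
static unsigned ctz_via_rbit(uint32_t x)
{
    return clz32(rbit32(x));
}

/* Compare both forms on every single-bit input and on inputs that
   also have all higher bits set. Zero is deliberately not tested. */
static void check_ctz_forms(void)
{
    for (unsigned i = 0; i < PREC; i++) {
        uint32_t bit = (uint32_t)1 << i;
        uint32_t with_high_bits = bit | (0u - bit);
        assert(ctz_via_isolate(bit) == i);
        assert(ctz_via_rbit(bit) == i);
        assert(ctz_via_isolate(with_high_bits) == ctz_via_rbit(with_high_bits));
    }
}
```

Calling check_ctz_forms() from a test driver exercises all 32 bit positions. The RBIT form needs two operations per element, whereas the isolate form needs a negate, an AND, a CLZ and a subtract.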
0001-aarch64-Expand-CTZ-to-RBIT-CLZ-for-SVE-PR109498.patch
Description: 0001-aarch64-Expand-CTZ-to-RBIT-CLZ-for-SVE-PR109498.patch