Hi Soumya,

Thank you for the patch. Two clarifications:
In the instruction pattern's output string, why did you add the 'Z' prefix before the operands (%0 -> %Z0)?
Also, maybe you can make your test cases more precise by specifying which functions generate which instructions. I don't have an SVE test off the top of my head, but have a look at /gcc/testsuite/gcc.target/aarch64/simd/faminmax-codegen.c for an example; a rough sketch of what I mean is below.
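Something like this, for instance (untested sketch only -- the directive choice, register numbers, and regexes would need adjusting to whatever your patch actually emits):

/* { dg-do compile } */
/* { dg-options "-Ofast -mcpu=neoverse-v2" } */
/* { dg-final { check-function-bodies "**" "" } } */

/*
** test_ldexpf:
**	fmov	s31, w0
**	ptrue	p7\.b, all
**	fscale	z0\.s, p7/m, z0\.s, z31\.s
**	ret
*/
float
test_ldexpf (float x, int i)
{
  return __builtin_ldexpf (x, i);
}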
Regards,
Saurabh

On 9/30/2024 5:26 PM, Soumya AR wrote:
This patch uses the FSCALE instruction provided by SVE to implement the standard ldexp family of functions.

Currently, with '-Ofast -mcpu=neoverse-v2', GCC generates libcalls for the following code:

float
test_ldexpf (float x, int i)
{
  return __builtin_ldexpf (x, i);
}

double
test_ldexp (double x, int i)
{
  return __builtin_ldexp (x, i);
}

GCC Output:

test_ldexpf:
        b       ldexpf

test_ldexp:
        b       ldexp

Since SVE provides an FSCALE instruction, we can use it to process scalar floats by moving them into a vector register and performing an fscale call, similar to how LLVM handles the ldexp builtin.

New Output:

test_ldexpf:
        fmov    s31, w0
        ptrue   p7.b, all
        fscale  z0.s, p7/m, z0.s, z31.s
        ret

test_ldexp:
        sxtw    x0, w0
        ptrue   p7.b, all
        fmov    d31, x0
        fscale  z0.d, p7/m, z0.d, z31.d
        ret

The patch was bootstrapped and regtested on aarch64-linux-gnu with no regressions.

OK for mainline?

Signed-off-by: Soumya AR <soum...@nvidia.com>

gcc/ChangeLog:

	* config/aarch64/aarch64-sve.md (ldexp<mode>3): Added a new pattern
	to match ldexp calls with scalar floating modes and expand to the
	existing pattern for FSCALE.
	(@aarch64_pred_<optab><mode>): Extended the pattern to accept SVE
	operands as well as scalar floating modes.
	* config/aarch64/iterators.md (SVE_FULL_F_SCALAR): Added an iterator
	to match all FP SVE modes as well as SF and DF.
	(VPRED): Extended the attribute to handle GPF modes.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/sve/fscale.c: New test.
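(Illustrative aside, not part of the patch: the expansion described above is roughly equivalent to the ACLE SVE intrinsic sequence below. The helper function name is made up, and the real expansion works on RTL patterns rather than intrinsics.)

#include <arm_sve.h>

/* Sketch of the float case: move the scalar inputs into SVE registers,
   apply a predicated FSCALE, and read back lane 0.  */
float
ldexpf_via_fscale (float x, int i)
{
  svbool_t pg = svptrue_b8 ();                  /* ptrue p.b, all */
  svfloat32_t vx = svdup_n_f32 (x);             /* scalar into a Z register */
  svfloat32_t r = svscale_n_f32_m (pg, vx, i);  /* fscale z.s, p/m, z.s, z.s */
  /* With an all-false predicate, svlasta returns element 0.  The actual
     expansion needs no extraction, since s0 is the low lane of z0.  */
  return svlasta_f32 (svpfalse_b (), r);
}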