Hi Soumya,

Thank you for the patch. Two clarifications:
In the instruction pattern's output string, why did you add the 'Z' prefix before the operands (%0 -> %Z0)?
Also, maybe you can make your test cases more precise by specifying which functions generate which instructions. I don't have an SVE test off the top of my head, but have a look at /gcc/testsuite/gcc.target/aarch64/simd/faminmax-codegen.c for an example; a rough sketch of what I mean is below.
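Something like this, for instance (untested sketch only -- the directive choice, register numbers, and regexes would need adjusting to whatever your patch actually emits):

/* { dg-do compile } */
/* { dg-options "-Ofast -mcpu=neoverse-v2" } */
/* { dg-final { check-function-bodies "**" "" } } */

/*
** test_ldexpf:
**	fmov	s31, w0
**	ptrue	p7\.b, all
**	fscale	z0\.s, p7/m, z0\.s, z31\.s
**	ret
*/
float
test_ldexpf (float x, int i)
{
  return __builtin_ldexpf (x, i);
}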
Regards,
Saurabh

On 9/30/2024 5:26 PM, Soumya AR wrote:
This patch uses the FSCALE instruction provided by SVE to implement the standard ldexp family of functions.

Currently, with '-Ofast -mcpu=neoverse-v2', GCC generates libcalls for the following code:

float
test_ldexpf (float x, int i)
{
  return __builtin_ldexpf (x, i);
}

double
test_ldexp (double x, int i)
{
  return __builtin_ldexp (x, i);
}

GCC Output:

test_ldexpf:
        b       ldexpf

test_ldexp:
        b       ldexp

Since SVE provides an FSCALE instruction, we can use it to process scalar floats by moving them into a vector register and performing an fscale call, similar to how LLVM handles the ldexp builtin.

New Output:

test_ldexpf:
        fmov    s31, w0
        ptrue   p7.b, all
        fscale  z0.s, p7/m, z0.s, z31.s
        ret

test_ldexp:
        sxtw    x0, w0
        ptrue   p7.b, all
        fmov    d31, x0
        fscale  z0.d, p7/m, z0.d, z31.d
        ret

The patch was bootstrapped and regtested on aarch64-linux-gnu with no regressions.

OK for mainline?

Signed-off-by: Soumya AR <soum...@nvidia.com>

gcc/ChangeLog:

	* config/aarch64/aarch64-sve.md (ldexp<mode>3): Added a new pattern
	to match ldexp calls with scalar floating modes and expand to the
	existing pattern for FSCALE.
	(@aarch64_pred_<optab><mode>): Extended the pattern to accept SVE
	operands as well as scalar floating modes.
	* config/aarch64/iterators.md (SVE_FULL_F_SCALAR): Added an iterator
	to match all FP SVE modes as well as SF and DF.
	(VPRED): Extended the attribute to handle GPF modes.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/sve/fscale.c: New test.
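(Illustrative aside, not part of the patch: the expansion described above is roughly equivalent to the ACLE SVE intrinsic sequence below. The helper function name is made up, and the real expansion works on RTL patterns rather than intrinsics.)

#include <arm_sve.h>

/* Sketch of the float case: move the scalar inputs into SVE registers,
   apply a predicated FSCALE, and read back lane 0.  */
float
ldexpf_via_fscale (float x, int i)
{
  svbool_t pg = svptrue_b8 ();                  /* ptrue p.b, all */
  svfloat32_t vx = svdup_n_f32 (x);             /* scalar into a Z register */
  svfloat32_t r = svscale_n_f32_m (pg, vx, i);  /* fscale z.s, p/m, z.s, z.s */
  /* With an all-false predicate, svlasta returns element 0.  The actual
     expansion needs no extraction, since s0 is the low lane of z0.  */
  return svlasta_f32 (svpfalse_b (), r);
}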