from:"Soumya AR"

[PATCH] aarch64: Emit ADD X, Y, Y instead of SHL X, Y, #1 for SVE instructions

2024-09-10 Thread Soumya AR

ch was bootstrapped and regtested on aarch64-linux-gnu, no regression. OK for mainline? Signed-off-by: Soumya AR gcc/ChangeLog: * config/aarch64/aarch64-sve.md (*post_ra_v3): Split pattern to accommodate left and right shifts separately. (*post_ra_v_ashl3): Matches left
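
The substitution named in the subject line relies on a left shift by one being the same as adding a value to itself in a fixed-width lane. A minimal sketch of that equivalence, assuming an 8-bit lane; `MASK8`, `shl1` and `add_self` are illustrative names, not taken from the patch:

```python
MASK8 = 0xFF  # assumed lane width of 8 bits, chosen only for illustration


def shl1(x: int) -> int:
    """Model of SHL X, Y, #1 on one 8-bit lane (result wraps to 8 bits)."""
    return (x << 1) & MASK8


def add_self(x: int) -> int:
    """Model of ADD X, Y, Y on one 8-bit lane (result wraps to 8 bits)."""
    return (x + x) & MASK8
```

The two functions agree for every 8-bit input, which is why the ADD form can stand in for the shift.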

Re: [PATCH] aarch64: Emit ADD X, Y, Y instead of SHL X, Y, #1 for SVE instructions

2024-09-16 Thread Soumya AR

> On 12 Sep 2024, at 7:22 PM, Richard Sandiford > wrote: > > External email: Use caution opening links or attachments > > > Richard Biener writes: >> On Thu, Sep 12, 2024 at 2:35 PM Richard Sandiford >> wrote: >>> >>> Soumya AR write

Re: [PATCH] aarch64: Expand CTZ to RBIT + CLZ for SVE [PR109498]

2024-10-04 Thread Soumya AR

> On 1 Oct 2024, at 6:17 PM, Richard Sandiford > wrote: > > External email: Use caution opening links or attachments > > > Soumya AR writes: >> Currently, we vectorize CTZ for SVE by using the following operation: >> .CTZ (X) = (PREC - 1) - .CLZ (X & -X)
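
The snippet above quotes the existing expansion, .CTZ (X) = (PREC - 1) - .CLZ (X & -X), and the subject line names the replacement, RBIT followed by CLZ. A rough scalar model of both, assuming 32-bit elements; the helper names are hypothetical and zero inputs are deliberately left out of the comparison, since the two forms differ there:

```python
PREC = 32                      # assumed element precision in bits
MASK = (1 << PREC) - 1


def clz(x: int) -> int:
    """Count leading zeros of a PREC-bit value (returns PREC for 0)."""
    x &= MASK
    return PREC - x.bit_length()


def rbit(x: int) -> int:
    """Reverse the bit order of a PREC-bit value."""
    x &= MASK
    return int(format(x, f"0{PREC}b")[::-1], 2)


def ctz_via_isolate(x: int) -> int:
    """(PREC - 1) - CLZ(X & -X): isolate the lowest set bit, then count."""
    lowest = x & -x & MASK
    return (PREC - 1) - clz(lowest)


def ctz_via_rbit(x: int) -> int:
    """CLZ(RBIT(X)): trailing zeros become leading zeros after reversal."""
    return clz(rbit(x))
```

For nonzero inputs both functions return the same count; the second needs only two operations per element.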

Re: [PATCH] aarch64: Expand CTZ to RBIT + CLZ for SVE [PR109498]

2024-09-29 Thread Soumya AR

Reworked the patch to substitute immediate register values in the test case with regular expressions. Apologies for the oversight. Thanks, Soumya > On 24 Sep 2024, at 8:53 AM, Soumya AR wrote: > > Currently, we vectorize CTZ for SVE by using the following operation: > .CTZ (X)

Re: SVE intrinsics: Fold constant operands for svlsl.

2024-10-21 Thread Soumya AR

Hi, > On 17 Oct 2024, at 12:38 PM, Kyrylo Tkachov wrote: > > Hi Soumya > >> On 17 Oct 2024, at 06:10, Soumya AR wrote: >> >> Hi Richard, >> >> Thanks for the feedback. I’ve updated the patch with the suggested change. >> Ok for mainline? >>

Re: SVE intrinsics: Fold constant operands for svlsl.

2024-10-16 Thread Soumya AR

Hi Richard, Thanks for the feedback. I’ve updated the patch with the suggested change. Ok for mainline? Best, Soumya > On 14 Oct 2024, at 6:40 PM, Richard Sandiford > wrote: > > External email: Use caution opening links or attachments > > > Soumya AR writes: >> Thi

Re: [PATCH] SVE intrinsics: Fold constant operands for svlsl.

2024-10-13 Thread Soumya AR

Pinging with updated subject, had missed the [PATCH] header before. Regards, Soumya > On 24 Sep 2024, at 2:00 PM, Soumya AR wrote: > > This patch implements constant folding for svlsl. Test cases have been added > to > check for the following cases: > > Zero,

Re: SVE intrinsics: Fold constant operands for svlsl.

2024-10-29 Thread Soumya AR

> On 24 Oct 2024, at 2:55 PM, Richard Sandiford > wrote: > > External email: Use caution opening links or attachments > > > Kyrylo Tkachov writes: >>> On 24 Oct 2024, at 10:39, Soumya AR wrote: >>> >>> Hi Richard, >>> >&g

Re: SVE intrinsics: Fold constant operands for svlsl.

2024-10-24 Thread Soumya AR

Hi Richard, > On 23 Oct 2024, at 5:58 PM, Richard Sandiford > wrote: > > External email: Use caution opening links or attachments > > > Soumya AR writes: >> diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc >> b/gcc/config/aarch64/aarch64-sve

[PATCH v2] aarch64: Optimise calls to ldexp with SVE FSCALE instruction [PR111733]

2024-11-03 Thread Soumya AR

ux-gnu, no regression. OK for mainline? Signed-off-by: Soumya AR gcc/ChangeLog: PR target/111733 * config/aarch64/aarch64-sve.md (ldexp3): Added a new pattern to match ldexp calls with scalar floating modes and expand to the existing pattern for FSCALE. (@aar

Re: [PATCH] Match: Fold pow calls to ldexp when possible [PR57492]

2024-11-05 Thread Soumya AR

> On 29 Oct 2024, at 7:16 PM, Richard Biener wrote: > > External email: Use caution opening links or attachments > > > On Mon, 28 Oct 2024, Soumya AR wrote: > >> This patch transforms the following POW calls to equivalent LDEXP calls, as >> discussed in PR57

[PATCH] Match: Fold pow calls to ldexp when possible [PR57492]

2024-10-27 Thread Soumya AR

e no noticeable improvements, there are no non-noise regressions either. The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression. OK for mainline? Signed-off-by: Soumya AR gcc/ChangeLog: PR target/57492 * match.pd: Added patterns to fold certain calls to pow
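
The snippet is truncated before it lists the patterns, so the exact folds are not shown here. One identity of this general kind is that a power of two with an integer exponent equals ldexp of 1.0 with that exponent. A small sketch of that single case, with `pow2_via_ldexp` as a made-up name and no claim that it matches the conditions used in match.pd:

```python
import math


def pow2_via_ldexp(n: int) -> float:
    """Compute 2.0 ** n for an integer n by scaling 1.0 with ldexp."""
    return math.ldexp(1.0, n)
```

ldexp only adjusts the exponent field, so it avoids the general pow routine for this case.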

[PATCH] Match: Optimize log (x) CMP CST and exp (x) CMP CST operations

2024-10-27 Thread Soumya AR

he patch was bootstrapped and regtested on aarch64-linux-gnu, no regression. OK for mainline? Signed-off-by: Soumya AR gcc/ChangeLog: * match.pd: Fold logN(x) CMP CST -> x CMP expN(CST) and expN(x) CMP CST -> x CMP logN(CST) gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/l
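
The ChangeLog entry above describes folding logN(x) CMP CST into x CMP expN(CST). The rewrite works because log and exp are monotonically increasing, so the comparison direction is preserved. A minimal illustration for the natural log and the `<` comparison, assuming x is positive; the function names are hypothetical, and results can differ for values within rounding error of the boundary:

```python
import math


def log_less_than(x: float, c: float) -> bool:
    """Original form: evaluate log(x) at run time, then compare."""
    return math.log(x) < c


def folded_less_than(x: float, c: float) -> bool:
    """Folded form: exp(c) is a constant, so only a comparison remains."""
    return x < math.exp(c)
```

When CST is a compile-time constant, the folded form replaces a run-time log call with a comparison against a precomputed value.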

Re: [PATCH] Match: Fold pow calls to ldexp when possible [PR57492]

2024-11-11 Thread Soumya AR

Hi Richard, > On 7 Nov 2024, at 6:10 PM, Richard Biener wrote: > > External email: Use caution opening links or attachments > > > On Tue, 5 Nov 2024, Soumya AR wrote: > >> >> >>> On 29 Oct 2024, at 7:16 PM, Richard Biener wrote: >>>

Re: [PATCH v2] aarch64: Optimise calls to ldexp with SVE FSCALE instruction [PR111733]

2024-11-11 Thread Soumya AR

Hi Richard, > On 7 Nov 2024, at 3:19 PM, Richard Sandiford > wrote: > > External email: Use caution opening links or attachments > > > Soumya AR writes: >> Changes since v1: >> >> This revision makes use of the extended definition of aarch64_ptrue

Re: [PATCH] Match: Optimize log (x) CMP CST and exp (x) CMP CST operations

2024-11-11 Thread Soumya AR

Thanks, committed: e232dc3bb5c3e8f8a3749239135b7b859a204fc7 Best, Soumya > On 7 Nov 2024, at 3:32 AM, Jeff Law wrote: > > External email: Use caution opening links or attachments > > > On 11/6/24 1:12 AM, Soumya AR wrote: >> >> >>> On 29 Oct

Re: [PATCH v2] aarch64: Optimise calls to ldexp with SVE FSCALE instruction [PR111733]

2024-11-12 Thread Soumya AR

> On 12 Nov 2024, at 4:27 PM, Richard Sandiford > wrote: > > External email: Use caution opening links or attachments > > > Soumya AR writes: >> diff --git a/gcc/config/aarch64/aarch64-sve.md >> b/gcc/config/aarch64/aarch64-sve.md >> index 06bd3

Re: [PATCH] Match: Fold pow calls to ldexp when possible [PR57492]

2024-11-13 Thread Soumya AR

> On 12 Nov 2024, at 6:19 PM, Richard Biener wrote: > > External email: Use caution opening links or attachments > > > On Mon, 11 Nov 2024, Soumya AR wrote: > >> Hi Richard, >> >>> On 7 Nov 2024, at 6:10 PM, Richard Biener wrote: >>>

Re: [PATCH] Match: Fold pow calls to ldexp when possible [PR57492]

2024-11-13 Thread Soumya AR

> On 13 Nov 2024, at 2:49 PM, Richard Biener wrote: > > External email: Use caution opening links or attachments > > > On Wed, 13 Nov 2024, Soumya AR wrote: > >> >> >>> On 12 Nov 2024, at 6:19 PM, Richard Biener wrote: >>> >>&

[committed] MAINTAINERS: Added myself to write after approval and DCO.

2024-10-30 Thread Soumya AR

Pushed to trunk: 91577f0c8d955bc670ee76d1a8851df336bf240c Signed-off-by: Soumya AR ChangeLog: * MAINTAINERS: Add myself to write after approval and DCO. 0001-MAINTAINERS-Add-myself-to-write-after-approval-and-D.patch Description: 0001-MAINTAINERS-Add-myself-to-write-after-approval

SVE intrinsics: Fold constant operands for svlsl.

2024-09-24 Thread Soumya AR

t, e.g. shift by 7 on an 8-bit integer. The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression. OK for mainline? Signed-off-by: Soumya AR gcc/ChangeLog: * config/aarch64/aarch64-sve-builtins-base.cc (svlsl_impl::fold): Try constant folding. gcc/test

[PATCH] aarch64: Optimise calls to ldexp with SVE FSCALE instruction

2024-09-30 Thread Soumya AR

ret The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression. OK for mainline? Signed-off-by: Soumya AR gcc/ChangeLog: * config/aarch64/aarch64-sve.md (ldexp3): Added a new pattern to match ldexp calls with scalar floating modes and expand to the existing pattern

[PATCH] aarch64: Expand CTZ to RBIT + CLZ for SVE [PR109498]

2024-09-23 Thread Soumya AR

-off-by: Soumya AR gcc/ChangeLog: PR target/109498 * config/aarch64/aarch64-sve.md (ctz2): Added pattern to expand CTZ to RBIT + CLZ for SVE. gcc/testsuite/ChangeLog: PR target/109498 * gcc.target/aarch64/sve/ctz.c: New test. 0001-aarch64-Expand-CTZ-to-

Re: SVE intrinsics: Fold constant operands for svlsl.

2024-10-03 Thread Soumya AR

Ping. > On 24 Sep 2024, at 2:00 PM, Soumya AR wrote: > > This patch implements constant folding for svlsl. Test cases have been added > to > check for the following cases: > > Zero, merge, and don't care predication. > Shift by 0. > Shift by register width

Re: [PATCH] aarch64: Optimise calls to ldexp with SVE FSCALE instruction

2024-10-02 Thread Soumya AR

arch64/simd/faminmax-codegen.c > for example. Thanks for the suggestion! I'll update the test case accordingly. Regards, Soumya > Regards, > Saurabh > > > > On 9/30/2024 5:26 PM, Soumya AR wrote: >> This patch uses the FSCALE instruction provided by SVE to impleme

Re: [PATCH] aarch64: Optimise calls to ldexp with SVE FSCALE instruction

2024-10-03 Thread Soumya AR

> On 1 Oct 2024, at 1:18 PM, Tamar Christina wrote: > > External email: Use caution opening links or attachments > > > Hi Soumya, > > Nice patch! > >> -Original Message- >> From: Kyrylo Tkachov >> Sent: Tuesday, October 1, 2024 7:55 AM >

Re: [PATCH] Match: Optimize log (x) CMP CST and exp (x) CMP CST operations

2024-11-06 Thread Soumya AR

> On 29 Oct 2024, at 6:59 PM, Richard Biener wrote: > > External email: Use caution opening links or attachments > > > On Mon, 28 Oct 2024, Soumya AR wrote: > >> This patch implements transformations for the following optimizations. >> >> logN(x) CMP CS

[PATCH] aarch64: Use SVE ASRD instruction with Neon modes.

2024-11-24 Thread Soumya AR

srd z31.b, p7/m, z31.b, #2 asrd z30.b, p7/m, z30.b, #2 stp q30, q31, [x1], 32 cmp x0, x2 The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression. OK for mainline? Signed-off-by: Soumya AR gcc/ChangeLog: * config/aarch64/aarch64-sve

[PATCH] testsuite: Require C99 for pow-to-ldexp.c

2024-11-20 Thread Soumya AR

obvious: 90645dba41bac29cab4c5996ba320c97a0325eb2 Signed-off-by: Soumya AR gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/pow-to-ldexp.c: Require c99_runtime. --- gcc/testsuite/gcc.dg/tree-ssa/pow-to-ldexp.c | 1 + 1 file changed, 1 insertion(+) diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pow-to

[PATCH] aarch64: Use SVE SUBR instruction with Neon modes

2024-11-14 Thread Soumya AR

mainline? Signed-off-by: Soumya AR gcc/ChangeLog: * config/aarch64/aarch64-simd.md: (sub3): Extended the pattern to emit SUBR for SVE targets if operand 1 is an immediate. * config/aarch64/predicates.md (aarch64_sve_arith_imm_or_reg_operand): New predicate

[PATCH] aarch64: Extend SVE2 bit-select instructions for Neon modes.

2024-11-25 Thread Soumya AR

was bootstrapped and regtested on aarch64-linux-gnu, no regression. OK for mainline? Signed-off-by: Soumya AR gcc/ChangeLog: * config/aarch64/aarch64-sve2.md (*aarch64_sve2_nbsl_unpred): New pattern to match unpredicated form. (*aarch64_sve2_bsl1n_unpred): Likewise

Re: [PATCH] aarch64: Extend SVE2 bit-select instructions for Neon modes.

2024-12-10 Thread Soumya AR

> On 10 Dec 2024, at 7:09 PM, Richard Sandiford > wrote: > > External email: Use caution opening links or attachments > > > Soumya AR writes: >> @@ -1815,6 +1849,42 @@ >> } >> ) >> >> +(define_insn "*aarch64_sve2_bsl2n_unpr

Re: [PATCH] aarch64: Use SVE ASRD instruction with Neon modes.

2024-12-10 Thread Soumya AR

> On 10 Dec 2024, at 7:03 PM, Richard Sandiford > wrote: > > External email: Use caution opening links or attachments > > > Soumya AR writes: >> Hi Richard, >> >> Thanks for reviewing this! >> >> I’ve made the suggested changes and add

Re: [PATCH] aarch64: Use SVE ASRD instruction with Neon modes.

2024-12-10 Thread Soumya AR

ord > wrote: > > External email: Use caution opening links or attachments > > > Soumya AR writes: >> The ASRD instruction on SVE performs an arithmetic shift right by an >> immediate >> for divide. >> >> This patch enables the use of ASRD w
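
The quoted description says ASRD performs an arithmetic shift right by an immediate for divide. A plain arithmetic shift rounds toward negative infinity, whereas signed division rounds toward zero, so negative inputs need a bias before shifting. A scalar sketch of that behaviour, where `asrd` is an illustrative helper rather than code from the patch:

```python
def asrd(x: int, shift: int) -> int:
    """Divide a signed integer by 2**shift, rounding toward zero.

    Python's >> on ints is an arithmetic (floor) shift, so adding
    2**shift - 1 to negative inputs first turns floor into truncation.
    """
    bias = (1 << shift) - 1 if x < 0 else 0
    return (x + bias) >> shift
```

For example, `asrd(-7, 2)` gives -1, matching C-style `-7 / 4`, whereas `-7 >> 2` alone would give -2.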

Re: [PATCH] aarch64: Extend SVE2 bit-select instructions for Neon modes.

2024-12-10 Thread Soumya AR

> On 5 Dec 2024, at 10:25 PM, Richard Sandiford > wrote: > > External email: Use caution opening links or attachments > > > Soumya AR writes: >> NBSL, BSL1N, and BSL2N are bit-select instructions on SVE2 with certain >> operands >> inverted. These
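
The quoted text describes NBSL, BSL1N and BSL2N as bit-select variants with certain operands inverted. A bitwise model of how the variants relate to plain BSL, assuming 8-bit values and the usual reading of the Arm instruction descriptions (NBSL inverts the result, BSL1N the first input, BSL2N the second); the function names and argument order are illustrative only:

```python
MASK = 0xFF  # assumed 8-bit lane, for illustration


def bsl(a: int, b: int, sel: int) -> int:
    """Take bits of a where sel is 1 and bits of b where sel is 0."""
    return ((a & sel) | (b & ~sel)) & MASK


def nbsl(a: int, b: int, sel: int) -> int:
    """Bit-select with the whole result inverted."""
    return ~bsl(a, b, sel) & MASK


def bsl1n(a: int, b: int, sel: int) -> int:
    """Bit-select with the first input inverted."""
    return bsl(~a & MASK, b, sel)


def bsl2n(a: int, b: int, sel: int) -> int:
    """Bit-select with the second input inverted."""
    return bsl(a, ~b & MASK, sel)
```

Each variant folds one NOT into the select, which is what lets a single instruction replace a short sequence of logical operations.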

[PATCH] match.pd: Fix indefinite recursion during exp-log transformations [PR118490]

2025-01-19 Thread Soumya AR

This patch fixes the ICE caused when comparing log or exp of a constant with another constant. The transform is now restricted to cases where the resultant log/exp (CST) can be constant folded. Signed-off-by: Soumya AR gcc/ChangeLog: PR target/118490 * match.pd: Added ! to verify that log/exp

Re: [PATCH] aarch64: Use generic_armv8_a_prefetch_tune in generic_armv8_a.h

2025-02-18 Thread Soumya AR

> On 18 Feb 2025, at 2:27 PM, Kyrylo Tkachov wrote: > > > >> On 18 Feb 2025, at 09:48, Kyrylo Tkachov wrote: >> >> >> >>> On 18 Feb 2025, at 09:41, Richard Sandiford >>> wrote: >>> >>> Kyrylo Tkachov write

[PATCH] aarch64: Use generic_armv8_a_prefetch_tune in generic_armv8_a.h

2025-02-18 Thread Soumya AR

-by: Soumya AR gcc/ChangeLog: * config/aarch64/tuning_models/generic_armv8_a.h: Updated prefetch struct pointer. --- gcc/config/aarch64/tuning_models/generic_armv8_a.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/config/aarch64/tuning_models/generic_armv8_a.h b/gcc

Re: [PATCH] aarch64: Enable selective LDAPUR generation for cores with RCPC2

2025-07-10 Thread Soumya AR

> On 10 Jul 2025, at 3:15 PM, Richard Sandiford > wrote: > > External email: Use caution opening links or attachments > > > Soumya AR writes: >>> On 1 Jul 2025, at 9:22 PM, Kyrylo Tkachov wrote: >>> >>> >>> >>>> On 1

Re: [PATCH] aarch64: Enable selective LDAPUR generation for cores with RCPC2

2025-07-10 Thread Soumya AR

> On 1 Jul 2025, at 9:22 PM, Kyrylo Tkachov wrote: > > > >> On 1 Jul 2025, at 17:36, Richard Sandiford wrote: >> >> Soumya AR writes: >>> From 2a2c3e3683aaf3041524df166fc6f8cf20895a0b Mon Sep 17 00:00:00 2001 >>> From: Soumya AR >&

Re: [PATCH] aarch64: Enable selective LDAPUR generation for cores with RCPC2

2025-07-14 Thread Soumya AR

> On 10 Jul 2025, at 5:20 PM, Richard Sandiford > wrote: > > External email: Use caution opening links or attachments > > > Soumya AR writes: >>> On 10 Jul 2025, at 3:15 PM, Richard Sandiford >>> wrote: >>> >>>

Re: [PATCH] aarch64: Enable selective LDAPUR generation for cores with RCPC2

2025-07-15 Thread Soumya AR

> On 15 Jul 2025, at 3:24 PM, Richard Sandiford > wrote: > > External email: Use caution opening links or attachments > > > Soumya AR writes: >> One additional change with this patch is that I had to update ldapr-sext.c >> too. >> >> During

[PATCH] aarch64: Enable selective LDAPUR generation for cores with RCPC2

2025-07-01 Thread Soumya AR

, no regression. OK for mainline? Signed-off-by: Soumya AR gcc/ChangeLog: * config/aarch64/aarch64-tuning-flags.def (AARCH64_EXTRA_TUNING_OPTION): Add the enable_ldapur flag to control LDAPUR emission. * config/aarch64/aarch64.h (TARGET_ENABLE_LDAPUR): Use new flag

Re: [RFC PATCH 0/5] aarch64: Support for user-defined aarch64 tuning parameters in JSON

2025-07-17 Thread Soumya AR

> On 7 May 2025, at 10:15 AM, Andrew Pinski wrote: > > External email: Use caution opening links or attachments > > > On Tue, May 6, 2025 at 1:35 AM wrote: >> >> From: Soumya AR >> >> Hi, >> >> This RFC and subsequent patch series

Re: [RFC PATCH 0/5] aarch64: Support for user-defined aarch64 tuning parameters in JSON

2025-07-17 Thread Soumya AR

> On 7 May 2025, at 6:18 PM, David Malcolm wrote: > > External email: Use caution opening links or attachments > > > On Tue, 2025-05-06 at 14:00 +0530, soum...@nvidia.com wrote: >> From: Soumya AR >> >> Hi, >> >> This RFC and subsequent pat

Re: [RFC PATCH 0/5] aarch64: Support for user-defined aarch64 tuning parameters in JSON

2025-07-17 Thread Soumya AR

l email: Use caution opening links or attachments > > > writes: >> From: Soumya AR >> >> Hi, >> >> This RFC and subsequent patch series introduces support for printing and >> parsing >> of aarch64 tuning parameters in the form of JSON. > >

46 matches
