RE: [PATCH]AArch64: Drop ILP32 from default elf multilibs after deprecation

2025-01-17 Thread Tamar Christina
> -Original Message- > From: Kyrylo Tkachov > Sent: Friday, January 17, 2025 3:10 PM > To: Richard Sandiford > Cc: Tamar Christina ; GCC Patches patc...@gcc.gnu.org>; nd ; Richard Earnshaw > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH]AArch64: Drop ILP32 fr

RE: [PATCH]AArch64: Drop ILP32 from default elf multilibs after deprecation

2025-01-17 Thread Tamar Christina
> -Original Message- > From: Kyrylo Tkachov > Sent: Friday, January 17, 2025 1:22 PM > To: Tamar Christina > Cc: GCC Patches ; nd ; Richard > Earnshaw ; ktkac...@gcc.gnu.org; Richard > Sandiford > Subject: Re: [PATCH]AArch64: Drop ILP32 from default elf multi

RE: [PATCH]AArch64: Drop ILP32 from default elf multilibs after deprecation

2025-01-17 Thread Tamar Christina
> -Original Message- > From: Kyrylo Tkachov > Sent: Friday, January 17, 2025 1:04 PM > To: Tamar Christina > Cc: GCC Patches ; nd ; Richard > Earnshaw ; ktkac...@gcc.gnu.org; Richard > Sandiford > Subject: Re: [PATCH]AArch64: Drop ILP32 from default elf multi

RE: [PATCH v3 1/2] aarch64: Use standard names for saturating arithmetic

2025-01-17 Thread Tamar Christina
16-bit tests. * gcc.target/aarch64/saturating_arithmetic_3.c: 32-bit tests. * gcc.target/aarch64/saturating_arithmetic_4.c: 64-bit tests. Co-authored-by: Tamar Christina -- inline copy -- diff --git a/gcc/config/aarch64/aarch64-builtins.cc b/gcc/

[PATCH]AArch64: Drop ILP32 from default elf multilibs after deprecation

2025-01-17 Thread Tamar Christina
Hi All, Following the deprecation of ILP32, *-elf builds now fail due to -Werror on the deprecation warning. This is because on embedded builds ILP32 is part of the default multilib. This patch removes it from the default target as the build would fail anyway. Cross compiled on aarch64-none-elf

RE: [PATCH] AArch64: Deprecate -mabi=ilp32

2025-01-17 Thread Tamar Christina
> -Original Message- > From: Wilco Dijkstra > Sent: Tuesday, January 14, 2025 5:30 PM > To: Richard Sandiford > Cc: Richard Earnshaw ; ktkac...@nvidia.com; GCC > Patches ; sch...@linux-m68k.org > Subject: Re: [PATCH] AArch64: Deprecate -mabi=ilp32 > > Hi Richard, > > >> +  if (TARGET_IL

RE: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]

2025-01-16 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Thursday, January 16, 2025 7:11 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH]AArch64: have -mcpu=native detect architecture extensi

[PATCH]middle-end: Add early break conditions to vect-switch-search-line-fast.c [PR118451]

2025-01-16 Thread Tamar Christina
Hi All, When this test was added initially it didn't add the early break effective target tests. This means that the test was "passing" (as in, it was failing to vectorize) because many targets don't support early break. But the test should not have been run for these targets. When the vectoriz

Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]

2025-01-15 Thread Tamar Christina
Re-reading, I realize I confused the cache size in your question with the cache line size. Yes, the cache size can be anything; the cache line size must match. But that doesn't change the fact that this patch is correct. Thanks, Tamar From: Tamar Christina

RE: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]

2025-01-15 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Wednesday, January 15, 2025 3:23 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH]AArch64: have -mcpu=native detect architecture extensi

RE: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]

2025-01-15 Thread Tamar Christina
> -Original Message- > From: Xi Ruoyao > Sent: Wednesday, January 15, 2025 1:40 PM > To: Tamar Christina ; gcc-patches@gcc.gnu.org > Cc: nd ; Richard Earnshaw ; > ktkac...@gcc.gnu.org; Richard Sandiford > Subject: Re: [PATCH]AArch64: have -mcpu=native detect architec

RE: [PATCH 3/4] vect: Ensure profile consistency when adding epilog guard [PR117790]

2025-01-15 Thread Tamar Christina
Ping > -Original Message- > From: Alex Coplan > Sent: Monday, January 6, 2025 11:35 AM > To: gcc-patches@gcc.gnu.org > Cc: Richard Biener ; Jan Hubicka ; Tamar > Christina > Subject: [PATCH 3/4] vect: Ensure profile consistency when adding epilog guard > [PR11779

RE: [PATCH 2/4] cfgloopmanip: Add infrastructure for scaling of multi-exit loops [PR117790]

2025-01-15 Thread Tamar Christina
Ping > -Original Message- > From: Alex Coplan > Sent: Monday, January 6, 2025 11:35 AM > To: gcc-patches@gcc.gnu.org > Cc: Richard Biener ; Jan Hubicka ; Tamar > Christina > Subject: [PATCH 2/4] cfgloopmanip: Add infrastructure for scaling of > multi-exit >

RE: [PATCH 4/4] vect: Fix scale_profile_for_vect_loop for multiple exits [PR117790]

2025-01-15 Thread Tamar Christina
Ping > -Original Message- > From: Alex Coplan > Sent: Monday, January 6, 2025 11:36 AM > To: gcc-patches@gcc.gnu.org > Cc: Richard Biener ; Jan Hubicka ; Tamar > Christina > Subject: [PATCH 4/4] vect: Fix scale_profile_for_vect_loop for multiple exits > [PR1

RE: [PATCH 1/4] vect: Set counts of early break exit blocks correctly [PR117790]

2025-01-15 Thread Tamar Christina
Ping > -Original Message- > From: Alex Coplan > Sent: Monday, January 6, 2025 11:34 AM > To: gcc-patches@gcc.gnu.org > Cc: Richard Biener ; Jan Hubicka ; Tamar > Christina > Subject: [PATCH 1/4] vect: Set counts of early break exit blocks correctly > [PR117790

RE: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]

2025-01-15 Thread Tamar Christina
> -Original Message- > From: Xi Ruoyao > Sent: Wednesday, January 15, 2025 1:29 PM > To: Tamar Christina ; gcc-patches@gcc.gnu.org > Cc: nd ; Richard Earnshaw ; > ktkac...@gcc.gnu.org; Richard Sandiford > Subject: Re: [PATCH]AArch64: have -mcpu=native detect ar

RE: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]

2025-01-15 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Monday, January 13, 2025 8:55 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH]AArch64: have -mcpu=native detect architecture extensi

[PATCH]middle-end: Fix incorrect type replacement in operands_equals [PR118472]

2025-01-15 Thread Tamar Christina
Hi All, In g:3c32575e5b6370270d38a80a7fa8eaa144e083d0 I made a mistake and incorrectly replaced the type of the arguments of an expression with the type of the expression. This is of course wrong. This reverts that change and I have also double checked the other replacements and they are fine.

RE: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]

2025-01-13 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Monday, January 13, 2025 6:35 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH]AArch64: have -mcpu=native detect architecture extensi

[PATCH]AArch64: don't override march to assembler with mcpu if march is specified [PR110901]

2025-01-11 Thread Tamar Christina
Hi All, When both -mcpu and -march are specified, the value of -march wins out. This is done correctly for the calls to cc1 and for the assembler directives we put out in assembly files. However in the call to as we don't do this and instead use the arch from the cpu. This leads to a situation
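
The precedence rule described in this message can be sketched in C as follows. This is only an illustration of "-march wins over -mcpu", not the GCC driver's actual spec handling; the function name `pick_assembler_arch` and both parameters are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: returns the architecture string that should be
   handed to the assembler.  An explicit -march value takes precedence;
   the architecture implied by -mcpu is only used as a fallback.  */
const char *
pick_assembler_arch (const char *march, const char *arch_from_mcpu)
{
  if (march != NULL && march[0] != '\0')
    return march;
  return arch_from_mcpu;
}
```

Under this sketch, a call with both values set returns the -march string, which is the behaviour the message says cc1 and the emitted assembler directives already follow.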

[PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]

2025-01-11 Thread Tamar Christina
Hi All, in g:e91a17fe39c39e98cebe6e1cbc8064ee6846a3a7 we added the ability for -mcpu=native on unknown CPUs to still enable architecture extensions. This has worked great but was only added for homogeneous systems. However the same thing works for big.LITTLE as in such systems the cores must have

RE: [PATCH][libstdc++]: backport inline keyword on std::find

2025-01-10 Thread Tamar Christina
> -Original Message- > From: Jonathan Wakely > Sent: Friday, January 10, 2025 2:36 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; libstd...@gcc.gnu.org > Subject: Re: [PATCH][libstdc++]: backport inline keyword on std::find > > On Fri, 10

[PATCH][libstdc++]: backport inline keyword on std::find

2025-01-10 Thread Tamar Christina
Hi All, This is a backport version of the same patch as https://gcc.gnu.org/pipermail/gcc-patches/2024-December/671618.html for the release branches. I'd like to backport this to GCC 14, 13 and 12 where the first regression showed up. I am however aware that GCC 12 is going to get its last rele

[PATCH]AArch64: correct Cortex-X4 MIDR

2025-01-09 Thread Tamar Christina
Hi All, The PartNum field of the MIDR for Cortex-X4 is wrong. It's currently the part number for a Cortex-A720 (which does have the right number). The correct number can be found in the Cortex-X4 Technical Reference Manual [1] on page 382 in Issue Number 5. [1] https://developer.arm.com/doc
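
For readers unfamiliar with the register, a minimal sketch of how the part number is extracted from a MIDR value may help. The field positions follow the architectural layout of MIDR_EL1; the function names are hypothetical and this is not the code GCC uses for -mcpu=native detection.

```c
#include <stdint.h>

/* Field layout of MIDR_EL1:
   Implementer [31:24], Variant [23:20], Architecture [19:16],
   PartNum [15:4], Revision [3:0].  */

/* Hypothetical helper: extract the 8-bit implementer code.  */
uint32_t
midr_implementer (uint32_t midr)
{
  return (midr >> 24) & 0xffu;
}

/* Hypothetical helper: extract the 12-bit part number.  */
uint32_t
midr_part_num (uint32_t midr)
{
  return (midr >> 4) & 0xfffu;
}
```

A wrong PartNum entry in a core table means a lookup keyed on this 12-bit field matches a different core, which is the kind of mismatch the patch corrects.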

RE: [PATCH 2/2][libstdc++]: Adjust probabilities of hashmap loop conditions

2025-01-09 Thread Tamar Christina
or the element is placed in what I assume to be a crowded bucket. It does seem to be beneficial for some user defined datatypes, I assume due to some IPA shenanigans. But overall there were more and larger wins using probability of 0 rather than 1. Kind regards, Tamar From: Tamar Christina

RE: [PATCH]AArch64: Fix costing of emulated gathers/scatters [PR118188]

2025-01-09 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Thursday, January 9, 2025 3:09 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH]AArch64: Fix costing of emulated gathers/scatters

RE: [PATCH]AArch64: Fix costing of emulated gathers/scatters [PR118188]

2025-01-08 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Wednesday, January 8, 2025 10:30 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH]AArch64: Fix costing of emulated gathers/scatters

RE: [PATCH 3/4] arm, testsuite: fix arm_v8_3a_fp16_complex_neon_ok

2025-01-08 Thread Tamar Christina
> -Original Message- > From: Richard Earnshaw (lists) > Sent: Wednesday, January 8, 2025 1:18 PM > To: Christophe Lyon ; gcc-patches@gcc.gnu.org; > Richard Sandiford ; Tamar Christina > ; Andre Simoes Dias Vieira > ; ktkac...@nvidia.com; > raman...@nvidia.com

RE: [PATCH]AArch64: Fix costing of emulated gathers/scatters [PR118188]

2025-01-07 Thread Tamar Christina
> >> i.e. we use separate address arithmetic and avoid UMOVs. Counting > >> two loads and one store for each element of the scatter store seems > >> like overkill for that. > > > > Hmm agreed.. > > > > How about for stores we increase the load counts by count / 2? > > > > This would account for th

RE: [RFC][PATCH] AArch64: Remove AARCH64_EXTRA_TUNE_USE_NEW_VECTOR_COSTS

2025-01-06 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Monday, January 6, 2025 5:54 PM > To: Jennifer Schmitz > Cc: Richard Biener ; Richard Biener > ; Tamar Christina ; > gcc-patches@gcc.gnu.org; Kyrylo Tkachov > Subject: Re: [RFC

RE: [PATCH v2 2/3] cfgexpand: Rewrite add_scope_conflicts_2 to use cache and look back further [PR111422]

2025-01-06 Thread Tamar Christina
> -Original Message- > From: Tamar Christina > Sent: Tuesday, December 31, 2024 1:04 PM > To: Richard Biener ; Andrew Pinski > > Cc: gcc-patches@gcc.gnu.org > Subject: RE: [PATCH v2 2/3] cfgexpand: Rewrite add_scope_conflicts_2 to use > cache and loo

RE: [PATCH]AArch64: Implement four and eight chunk VLA concats [PR118272]

2025-01-03 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Friday, January 3, 2025 10:59 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH]AArch64: Implement four and eight chunk VLA concats

RE: [PATCH]AArch64: Implement four and eight chunk VLA concats [PR118272]

2025-01-03 Thread Tamar Christina
> > > > How about instead doing something like: > > > > worklist.reserve (nelts); > > for (int i = 0; i < nelts; ++i) > > worklist.quick_push (force_reg (elem_mode, XVECEXP (vals, 0, i))); > > > > while (nelts > 2) > > { > > for (int i = 0; i < nelts; i += 2) > > { > >

RE: [PATCH 2/2][libstdc++]: Adjust probabilities of hashmap loop conditions

2025-01-02 Thread Tamar Christina
I’ll run the numbers with this change. Thanks, Tamar From: François Dumont Sent: Monday, December 30, 2024 5:08 PM To: Jonathan Wakely Cc: Tamar Christina ; gcc-patches@gcc.gnu.org; nd ; libstd...@gcc.gnu.org Subject: Re: [PATCH 2/2][libstdc++]: Adjust probabilities of hashmap loop condition

RE: [PATCH]AArch64: Fix costing of emulated gathers/scatters [PR118188]

2025-01-02 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Thursday, January 2, 2025 5:54 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH]AArch64: Fix costing of emulated gathers/scatters

RE: [PATCH]AArch64: Implement four and eight chunk VLA concats [PR118272]

2025-01-02 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Thursday, January 2, 2025 5:19 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH]AArch64: Implement four and eight chunk VLA concats

RE: [PATCH]AArch64: Fix costing of emulated gathers/scatters [PR118188]

2025-01-02 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Thursday, January 2, 2025 4:52 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH]AArch64: Fix costing of emulated gathers/scatters

[PATCH]AArch64: Implement four and eight chunk VLA concats [PR118272]

2025-01-02 Thread Tamar Christina
Hi All, The following testcase #pragma GCC target ("+sve") extern char __attribute__ ((simd, const)) fn3 (int, short); void test_fn3 (float *a, float *b, double *c, int n) { for (int i = 0; i < n; ++i) a[i] = fn3 (b[i], c[i]); } at -Ofast ICEs because my previous patch only a

[PATCH]AArch64: Fix costing of emulated gathers/scatters [PR118188]

2025-01-02 Thread Tamar Christina
Hi All, When a target does not support gathers and scatters the vectorizer tries to emulate these using scalar loads/stores and a reconstruction of vectors from scalar. The loads are still marked with VMAT_GATHER_SCATTER to indicate that they are gather/scatters, however the vectorizer also asks
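
What "emulating" a gather or scatter means can be shown with a small C sketch. It assumes a fixed lane count of four and uses hypothetical function names; it illustrates the scalar access pattern only, not the vectorizer's code generation or its cost model.

```c
#include <stddef.h>
#include <stdint.h>

#define LANES 4

/* Emulated gather: each lane is fetched with an ordinary scalar load and
   written into its slot, which rebuilds the vector from scalars.  */
void
emulated_gather (int32_t out[LANES], const int32_t *base,
                 const uint32_t idx[LANES])
{
  for (size_t lane = 0; lane < LANES; ++lane)
    out[lane] = base[idx[lane]];
}

/* Emulated scatter: the mirror image, one scalar store per lane.  */
void
emulated_scatter (int32_t *base, const uint32_t idx[LANES],
                  const int32_t in[LANES])
{
  for (size_t lane = 0; lane < LANES; ++lane)
    base[idx[lane]] = in[lane];
}
```

The per-lane scalar accesses are what a cost model has to account for when the target has no native gather or scatter instruction.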

RE: [PATCH v3] LoongArch: Implement vector cbranch optab for LSX and LASX

2024-12-31 Thread Tamar Christina
Hi, > -Original Message- > From: Jiahao Xu > Sent: Wednesday, December 25, 2024 10:00 AM > To: gcc-patches@gcc.gnu.org > Cc: xry...@xry111.site; i...@xen0n.name; chengl...@loongson.cn; > xucheng...@loongson.cn; dengjia...@loongson.cn; Jiahao Xu > > Subject: [PATCH v3] LoongArch: Implemen

RE: [PATCH v2 2/3] cfgexpand: Rewrite add_scope_conflicts_2 to use cache and look back further [PR111422]

2024-12-31 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Wednesday, November 20, 2024 11:28 AM > To: Andrew Pinski > Cc: gcc-patches@gcc.gnu.org > Subject: Re: [PATCH v2 2/3] cfgexpand: Rewrite add_scope_conflicts_2 to use > cache and look back further [PR111422] > > On Sat, Nov 16, 2024 at 5

RE: [PATCH 7/7]AArch64: Implement vector concat of partial SVE vectors

2024-12-19 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Thursday, December 19, 2024 11:03 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH 7/7]AArch64: Implement vector concat of partial SVE

RE: [PATCH 2/2][libstdc++]: Adjust probabilities of hashmap loop conditions

2024-12-18 Thread Tamar Christina
> e791e52ec329277474f3218d8a44cd37ded14ac3..8101d868d0c5f7ac4f97931a > > ffcf71d826c88094 100644 > > > --- a/libstdc++-v3/include/bits/hashtable.h > > > +++ b/libstdc++-v3/include/bits/hashtable.h > > > @@ -2171,7 +2171,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > > > if (this->_M_equals(__k,

RE: [PATCH 2/2][libstdc++]: Adjust probabilities of hashmap loop conditions

2024-12-17 Thread Tamar Christina
> On Fri, 13 Dec 2024 at 17:13, Tamar Christina wrote: > > > > Hi All, > > > > We are currently generating a loop which has more comparisons than you'd > > typically need as the probabilities on the small size loop are such that it > > assumes the

RE: [PATCH 2/7]AArch64: Add SVE support for simd clones [PR96342]

2024-12-17 Thread Tamar Christina
imd_clone_adjust): Adapt safelen check to be compatible with VLA simdlen. gcc/testsuite/ChangeLog: PR target/96342 * gcc.target/aarch64/declare-simd-2.c: Add SVE clone scan. * gcc.target/aarch64/vect-simd-clone-1.c: New test. * g++.target/aarch64/vect-simd

[PATCH]Arm: [committed] fix bootstrap after MVE changes

2024-12-15 Thread Tamar Christina
Hi All, The recent commits for MVE on Saturday have broken armhf bootstrap due to a -Werror false positive: inlined from 'virtual rtx_def* {anonymous}::vstrq_scatter_base_impl::expand(arm_mve::function_expander&) const' at /gcc/config/arm/arm-mve-builtins-base.cc:352:17: ./genrtl.h:38:16: e

RE: [PATCH 7/7]AArch64: Implement vector concat of partial SVE vectors

2024-12-13 Thread Tamar Christina
> > ;; 2 element quad vector modes. > > (define_mode_iterator VQ_2E [V2DI V2DF]) > > > > @@ -1678,7 +1686,15 @@ (define_mode_attr VHALF [(V8QI "V4QI") (V16QI > "V8QI") > > (V2DI "DI")(V2SF "SF") > > (V4SF "V2SF") (V4HF "V2HF") > >

[PATCH 2/2][libstdc++]: Adjust probabilities of hashmap loop conditions

2024-12-13 Thread Tamar Christina
Hi All, We are currently generating a loop which has more comparisons than you'd typically need as the probabilities on the small size loop are such that it assumes the likely case is that an element is not found. This again generates a pattern that's harder for branch predictors to follow, but al

[PATCH 1/2][libstdc++]: Add inline keyword to _M_locate

2024-12-13 Thread Tamar Christina
Hi All, In GCC 12 there was a ~40% regression in the performance of hashmap->find. This regression came about accidentally: Before GCC 12 the find function was small enough that IPA would inline it even though it wasn't marked inline. In GCC-12 an optimization was added to perform a linear sear

RE: [PATCH 2/7]AArch64: Add SVE support for simd clones [PR96342]

2024-12-11 Thread Tamar Christina
ping > -Original Message- > From: Tamar Christina > Sent: Wednesday, December 4, 2024 12:17 PM > To: gcc-patches@gcc.gnu.org > Cc: nd ; Richard Earnshaw ; > ktkac...@gcc.gnu.org; Richard Sandiford > Subject: [PATCH 2/7]AArch64: Add SVE support for simd clones

RE: [PATCH 3/7]AArch64: Disable `omp declare variant' tests for aarch64 [PR96342]

2024-12-11 Thread Tamar Christina
ping > -Original Message- > From: Tamar Christina > Sent: Wednesday, December 4, 2024 12:17 PM > To: gcc-patches@gcc.gnu.org > Cc: nd ; Richard Earnshaw ; > ktkac...@gcc.gnu.org; Richard Sandiford > Subject: [PATCH 3/7]AArch64: Disable `omp declare variant' te

RE: [PATCH 7/7]AArch64: Implement vector concat of partial SVE vectors

2024-12-11 Thread Tamar Christina
ping > -Original Message- > From: Tamar Christina > Sent: Wednesday, December 4, 2024 12:18 PM > To: gcc-patches@gcc.gnu.org > Cc: nd ; Richard Earnshaw ; > ktkac...@gcc.gnu.org; Richard Sandiford > Subject: [PATCH 7/7]AArch64: Implement vector concat of partial SV

RE: [PATCH 1/2]AArch64: Add CMP+CSEL and CMP+CSET for cores that support it

2024-12-11 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Wednesday, December 11, 2024 9:50 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH 1/2]AArch64: Add CMP+CSEL and CMP+CSET for co

RE: [PATCH 1/2]AArch64: Add CMP+CSEL and CMP+CSET for cores that support it

2024-12-11 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Wednesday, December 11, 2024 9:32 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH 1/2]AArch64: Add CMP+CSEL and CMP+CSET for co

[PATCH 2/2]AArch64: Set L1 data cache size according to size on CPUs

2024-12-11 Thread Tamar Christina
Hi All, This sets the L1 data cache size for some cores based on their size in their Technical Reference Manuals. Today the port minimum is 256 bytes as explained in commit g:9a99559a478111f7fbeec29bd78344df7651c707, however like Neoverse V2 most cores actually define the L1 cache size as 64-byte

[PATCH 1/2]AArch64: Add CMP+CSEL and CMP+CSET for cores that support it

2024-12-11 Thread Tamar Christina
Hi All, GCC 15 added two new fusions CMP+CSEL and CMP+CSET. This patch enables them for cores that support them, based on their Software Optimization Guides, and generically on Armv9-A. Even if a core does not support it there's no negative performance impact. Bootstrapped Regtested on aarch64-none-

RE: [PATCH 6/7]middle-end: add vec_init support for variable length subvector concatenation.

2024-12-09 Thread Tamar Christina
> >> So I think we can simply set const_n_elts to CONSTRUCTOR_NELTS > >> for vector_typed_elts_p? > >> > > Done, gcc/ChangeLog: PR target/96342 * expr.cc (store_constructor): add support for variable-length vectors. Co-authored-b

RE: [PATCH 5/7]middle-end: Add initial support for poly_int64 BIT_FIELD_REF in expand pass [PR96342]

2024-12-09 Thread Tamar Christina
to subparts of the CTOR vector type. > Done, gcc/ChangeLog: PR target/96342 * expr.cc (store_constructor): Enable poly_{u}int64 type usage. (get_inner_reference): Ditto. Co-authored-by: Tamar Christina Bootstrapped Regtested on aarch64-none-linux-gnu, arm-none-

RE: [PATCH 4/7]middle-end: Fix mask length arg in call to vect_get_loop_mask [PR96342]

2024-12-04 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Wednesday, December 4, 2024 2:43 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd > Subject: Re: [PATCH 4/7]middle-end: Fix mask length arg in call to > vect_get_loop_mask [PR96342] > > On Wed,

RE: [PATCH 6/7]middle-end: add vec_init support for variable length subvector concatenation.

2024-12-04 Thread Tamar Christina
-Original Message- > From: Tamar Christina > Sent: Wednesday, December 4, 2024 3:02 PM > To: Richard Biener > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Sandiford > > Subject: RE: [PATCH 6/7]middle-end: add vec_init support for variable length > su

RE: [PATCH 6/7]middle-end: add vec_init support for variable length subvector concatenation.

2024-12-04 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Wednesday, December 4, 2024 2:53 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Sandiford > > Subject: Re: [PATCH 6/7]middle-end: add vec_init support for variable length > subvector conca

[PATCH 6/7]middle-end: add vec_init support for variable length subvector concatenation.

2024-12-04 Thread Tamar Christina
vectors. Co-authored-by: Tamar Christina Bootstrapped Regtested on aarch64-none-linux-gnu, arm-none-linux-gnueabihf, x86_64-pc-linux-gnu -m32, -m64 and no issues. Ok for master? Thanks, Tamar --- diff --git a/gcc/expr.cc b/gcc/expr.cc index 2d90d7aac296077cc0bda8a1b4732b

[PATCH 7/7]AArch64: Implement vector concat of partial SVE vectors

2024-12-04 Thread Tamar Christina
Hi All, This patch adds support for vector constructor from two partial SVE vectors into a full SVE vector. It also implements support for the standard vec_init optab to do this. gcc/ChangeLog: PR target/96342 * config/aarch64/aarch64-sve.md (vec_init): New. (@aarch64_pac

[PATCH 4/7]middle-end: Fix mask length arg in call to vect_get_loop_mask [PR96342]

2024-12-04 Thread Tamar Christina
Hi All, When issuing multiple calls to a simdclone in a vectorized loop, TYPE_VECTOR_SUBPARTS(vectype) gives the incorrect number when compared to the TYPE_VECTOR_SUBPARTS result we get from the mask type derived from the relevant `rgroup_controls' entry within `vect_get_loop_mask'. By passing `m

[PATCH 5/7]middle-end: Add initial support for poly_int64 BIT_FIELD_REF in expand pass [PR96342]

2024-12-04 Thread Tamar Christina
le poly_{u}int64 type usage. (get_inner_reference): Ditto. * expmed.cc (store_bit_field_1): Add is_constant checks to bitsize and bitnum. Co-authored-by: Tamar Christina Bootstrapped Regtested on aarch64-none-linux-gnu, arm-none-linux-gnueabihf, x86_64-pc-linux-gnu -m32, -

[PATCH 3/7]AArch64: Disable `omp declare variant' tests for aarch64 [PR96342]

2024-12-04 Thread Tamar Christina
Hi All, These tests are x86 specific and shouldn't be run for aarch64. gcc/testsuite/ChangeLog: PR target/96342 * c-c++-common/gomp/declare-variant-14.c: Make i?86 and x86_64 target only test. * gfortran.dg/gomp/declare-variant-14.f90: Likewise. Bootstrapped Regt

[PATCH 2/7]AArch64: Add SVE support for simd clones [PR96342]

2024-12-04 Thread Tamar Christina
-simd-clone-1.c: New test. * g++.target/aarch64/vect-simd-clone-1.c: New test. Co-authored-by: Victor Do Nascimento Co-authored-by: Tamar Christina Bootstrapped Regtested on aarch64-none-linux-gnu, arm-none-linux-gnueabihf, x86_64-pc-linux-gnu -m32, -m64 and no issues. Ok for ma

RE: [PATCH 2/2]middle-end: use two's complement equality when comparing IVs during candidate selection [PR114932]

2024-12-03 Thread Tamar Christina
> > This patch implements a new OEP flag called OEP_STRUCTURAL_EQ. This flag > > will > > check if the operands would produce the same bit values after the > > computations > > even if the final sign is different. > > I think the name is badly chosen - we already have OEP_LEXICOGRAPHIC and > OE
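
The idea of comparing values by their bits rather than by their signedness can be sketched in plain C. This is an assumption-laden illustration of two's-complement equality on 32-bit values, not GCC's operand_equal_p logic; the function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when a signed and an unsigned 32-bit value have the same
   two's-complement bit pattern, ignoring how the sign is interpreted.
   Converting int32_t to uint32_t is well defined in C (reduction modulo
   2^32), so the cast preserves the bit pattern.  */
bool
same_bits_i32_u32 (int32_t a, uint32_t b)
{
  return (uint32_t) a == b;
}
```

For example, -1 as a signed value and 0xffffffff as an unsigned value compare equal under this check even though their numeric interpretations differ.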

RE: [PATCH 2/8]AArch64: Add Neoverse V3 core definition and cost model

2024-12-03 Thread Tamar Christina
> -Original Message- > From: Kyrylo Tkachov > Sent: Tuesday, December 3, 2024 10:19 AM > To: Tamar Christina > Cc: GCC Patches ; nd ; Richard > Earnshaw ; Marcus Shawcroft > ; ktkac...@gcc.gnu.org; Richard Sandiford > > Subject: Re: [PATCH 2/8]AArch64: Add

RE: [PATCH 1/2]middle-end: refactor type to be explicit in operand_equal_p [PR114932]

2024-12-02 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Friday, November 29, 2024 8:57 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; rguent...@suse.de; > j...@ventanamicro.com > Subject: Re: [PATCH 1/2]middle-end: refactor type to be explicit in > o

RE: [PATCH v2 4/4] vect: Disable `omp declare variant' tests for aarch64

2024-11-29 Thread Tamar Christina
Ping, I'm filling in for Victor on the patch series. Regards, Tamar > -Original Message- > From: Victor Do Nascimento > Sent: Tuesday, November 5, 2024 12:38 AM > To: gcc-patches@gcc.gnu.org > Cc: ja...@redhat.com > Subject: Re: [PATCH v2 4/4] vect: Disable `omp declare variant' tests f

RE: [PATCH]AArch64 Suppress default options when march or mcpu used is not affected by it.

2024-11-29 Thread Tamar Christina
- > From: Kyrylo Tkachov > Sent: Friday, November 29, 2024 11:21 AM > To: Tamar Christina > Cc: GCC Patches ; nd ; Richard > Earnshaw ; ktkac...@gcc.gnu.org; Richard > Sandiford > Subject: Re: [PATCH]AArch64 Suppress default options when march or mcpu used > is n

[PATCH]middle-end: rework vectorizable_store to iterate over single index [PR117557]

2024-11-27 Thread Tamar Christina
Hi All, The testcase #include #include #define N 8 #define L 8 void f(const uint8_t * restrict seq1, const uint8_t *idx, uint8_t *seq_out) { for (int i = 0; i < L; ++i) { uint8_t h = idx[i]; memcpy((void *)&seq_out[i * N], (const void *)&seq1[h * N / 2], N / 2); } } compil

RE: [PATCH][ivopts]: perform affine fold to unsigned on non address expressions. [PR114932]

2024-11-25 Thread Tamar Christina
Ping > -Original Message- > From: Tamar Christina > Sent: Thursday, November 7, 2024 11:50 AM > To: gcc-patches@gcc.gnu.org > Cc: nd ; rguent...@suse.de > Subject: [PATCH][ivopts]: perform affine fold to unsigned on non address > expressions. [PR114932] > >

RE: [PATCH 1/2]middle-end: refactor type to be explicit in operand_equal_p [PR114932]

2024-11-25 Thread Tamar Christina
ping > -Original Message- > From: Tamar Christina > Sent: Tuesday, November 5, 2024 9:04 AM > To: gcc-patches@gcc.gnu.org > Cc: nd ; rguent...@suse.de; j...@ventanamicro.com > Subject: RE: [PATCH 1/2]middle-end: refactor type to be explicit in > operand_equal_p

RE: [PATCH 2/2]middle-end: use two's complement equality when comparing IVs during candidate selection [PR114932]

2024-11-25 Thread Tamar Christina
ping > -Original Message- > From: Tamar Christina > Sent: Tuesday, November 5, 2024 9:04 AM > To: gcc-patches@gcc.gnu.org > Cc: nd ; rguent...@suse.de; j...@ventanamicro.com > Subject: RE: [PATCH 2/2]middle-end: use two's complement equality when > comp

RE: [PATCH]middle-end: Pass along SLP node when costing vector loads/stores

2024-11-22 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Thursday, November 21, 2024 8:03 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd > Subject: Re: [PATCH]middle-end: Pass along SLP node when costing vector > loads/stores > > On Wed, 20 Nov

[PATCH][middle-end] For multiplication try swapping operands when matching complex multiply [PR116463]

2024-11-21 Thread Tamar Christina
Hi All, This commit fixes the failures of complex.exp=fast-math-complex-mls-*.c on the GCC 14 branch and some of the ones on the master. The current matching just looks for one order for multiplication and was relying on canonicalization to always give the right order because of the TWO_OPERANDS.

RE: [PATCH]AArch64 Suppress default options when march or mcpu used is not affected by it.

2024-11-21 Thread Tamar Christina
> > I tried writing automated testcases for these, however the testsuite doesn't > > want to scan the output of -### and it makes the excess error tests always > > fail > > unless you use dg-error, which also looks for "error:". So tested manually: > > You might be able to use dg-message instead.

[PATCH]middle-end: Pass along SLP node when costing vector loads/stores

2024-11-20 Thread Tamar Christina
Hi All, With the support to SLP only we now pass the VMAT through the SLP node, however the majority of the costing calls inside vectorizable_load and vectorizable_store do not pass the SLP node along. Due to this the backend costing never sees the VMAT for these cases anymore. Additionally the

RE: [PATCH]AArch64 Suppress default options when march or mcpu used is not affected by it.

2024-11-19 Thread Tamar Christina
> -Original Message- > From: Andrew Pinski > Sent: Friday, November 15, 2024 7:16 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; ktkac...@gcc.gnu.org; Richard Sandiford > > Subject: Re: [PATCH]AArch64 Suppress default options

[PATCH]AArch64 Suppress default options when march or mcpu used is not affected by it.

2024-11-15 Thread Tamar Christina
Hi All, With this patch, when you use any of the Cortex-A53 errata workarounds but have specified an -march or -mcpu that we know is not affected by the erratum, the workaround is suppressed. This is a driver only patch as the linker invocation needs to be changed as well. The linker and c

RE: [PATCH 1/2] Add suggested_epilogue_mode to vector costs

2024-11-13 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Monday, November 11, 2024 12:17 PM > To: gcc-patches@gcc.gnu.org > Cc: Richard Sandiford ; Tamar Christina > > Subject: [PATCH 1/2] Add suggested_epilogue_mode to vector costs > > The following enables ta

RE: [PATCH][v2] tree-optimization/117502 - VMAT_STRIDED_SLP vs VMAT_ELEMENTWISE when considering gather

2024-11-12 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Tuesday, November 12, 2024 8:31 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; RISC-V CI > Subject: RE: [PATCH][v2] tree-optimization/117502 - VMAT_STRIDED_SLP vs > VMAT_ELEMENTWISE when considering gathe

RE: [PATCH][v2] tree-optimization/117502 - VMAT_STRIDED_SLP vs VMAT_ELEMENTWISE when considering gather

2024-11-11 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Monday, November 11, 2024 10:13 AM > To: gcc-patches@gcc.gnu.org > Cc: RISC-V CI ; Tamar Christina > > Subject: [PATCH][v2] tree-optimization/117502 - VMAT_STRIDED_SLP vs > VMAT_ELEMENTWISE when considering g

[PATCH]AArch64 backport Neoverse and Cortex CPU definitions

2024-11-08 Thread Tamar Christina
Hi All, This is a conservative backport of a few core definitions, backporting only the core definitions and mapping them to the closest cost model that exists on the branches. Bootstrapped Regtested on aarch64-none-linux-gnu on branches and no issues. Ok for GCC 13 and 14? Thanks, Tamar gcc/C

RE: [PATCH v2 01/10] Match: Simplify branch form 4 of unsigned SAT_ADD into branchless

2024-11-07 Thread Tamar Christina
> -Original Message- > From: Jeff Law > Sent: Thursday, November 7, 2024 8:08 PM > To: Tamar Christina ; Li, Pan2 ; > Richard Biener > Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; > rdapp@gmail.com > Subject: Re: [PATCH v2 01/10

RE: [PATCH 3/5] Add LOOP_VINFO_MAIN_LOOP_INFO

2024-11-07 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Wednesday, November 6, 2024 2:30 PM > To: gcc-patches@gcc.gnu.org > Cc: Richard Sandiford ; Tamar Christina > > Subject: [PATCH 3/5] Add LOOP_VINFO_MAIN_LOOP_INFO > > The following introduces LOOP_VIN

[PATCH][ivopts]: perform affine fold to unsigned on non address expressions. [PR114932]

2024-11-07 Thread Tamar Christina
Hi All, When the patch for PR114074 was applied we saw a good boost in exchange2. This boost was partially caused by a simplification of the addressing modes. With the patch applied IV opts saw the following form for the base addressing: Base: (integer(kind=4) *) &block + ((sizetype) ((unsigne

RE: [PATCH v2 01/10] Match: Simplify branch form 4 of unsigned SAT_ADD into branchless

2024-11-07 Thread Tamar Christina
> -Original Message- > From: Li, Pan2 > Sent: Thursday, November 7, 2024 1:45 AM > To: Tamar Christina ; Richard Biener > > Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; > jeffreya...@gmail.com; rdapp@gmail.com > Subject: RE

RE: [PATCH v2 01/10] Match: Simplify branch form 4 of unsigned SAT_ADD into branchless

2024-11-07 Thread Tamar Christina
> -Original Message- > From: Li, Pan2 > Sent: Thursday, November 7, 2024 12:57 PM > To: Tamar Christina ; Richard Biener > > Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; > jeffreya...@gmail.com; rdapp@gmail.com > Subject: RE

RE: [PATCH 5/5] Allow multiple vectorized epilogs via --param vect-epilogues-nomask=N

2024-11-07 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Wednesday, November 6, 2024 2:32 PM > To: gcc-patches@gcc.gnu.org > Cc: RISC-V CI ; Tamar Christina > ; Richard Sandiford > Subject: [PATCH 5/5] Allow multiple vectorized epilogs via --param > vect-epilog

RE: [PATCH v2 01/10] Match: Simplify branch form 4 of unsigned SAT_ADD into branchless

2024-11-06 Thread Tamar Christina
> -Original Message- > From: Li, Pan2 > Sent: Wednesday, November 6, 2024 1:31 PM > To: Richard Biener > Cc: gcc-patches@gcc.gnu.org; Tamar Christina ; > juzhe.zh...@rivai.ai; kito.ch...@gmail.com; jeffreya...@gmail.com; > rdapp@gmail.com > Subject: RE

RE: [PATCH 2/2]middle-end: use two's complement equality when comparing IVs during candidate selection [PR114932]

2024-11-05 Thread Tamar Christina
ping > -Original Message- > From: Tamar Christina > Sent: Monday, October 14, 2024 4:08 PM > To: Tamar Christina ; gcc-patches@gcc.gnu.org > Cc: nd ; rguent...@suse.de; j...@ventanamicro.com > Subject: RE: [PATCH 2/2]middle-end: use two's complement equality when

RE: [PATCH 1/2]middle-end: refactor type to be explicit in operand_equal_p [PR114932]

2024-11-05 Thread Tamar Christina
ping > -Original Message- > From: Tamar Christina > Sent: Monday, October 14, 2024 4:08 PM > To: Tamar Christina ; gcc-patches@gcc.gnu.org > Cc: nd ; rguent...@suse.de; j...@ventanamicro.com > Subject: RE: [PATCH 1/2]middle-end: refactor type to be explicit in > oper

RE: [PATCH]middle-end: Handle more gcond lowering [PR117176]

2024-10-21 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Monday, October 21, 2024 9:55 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd > Subject: Re: [PATCH]middle-end: Handle more gcond lowering [PR117176] > > On Mon, 21 Oct 2024, Tamar Christ

[PATCH]middle-end: Handle more gcond lowering [PR117176]

2024-10-20 Thread Tamar Christina
Hi All, For boolean mask handling we have to lower BIT_NOT_EXPR for correctness into BIT_XOR_EXPR. Normally this is done through vect_recog_bool_pattern by following the defs of the gimple_assign. In the PR we ICE because early exits have the comparison inside the gcond itself and so vect_recog_
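
A scalar analogy for the lowering mentioned above, as a minimal sketch in plain C rather than GCC internals (the helper names are hypothetical and do not come from the patch): for a mask element restricted to 0 or 1, bitwise NOT leaves the boolean domain, while XOR with 1 stays inside it.

```c
/* Hypothetical helpers, for illustration only; not taken from the patch. */

/* Bitwise NOT flips every bit, so on two's complement targets a 0/1
   input yields -1 or -2, neither of which is a valid 0/1 boolean.  */
static int negate_with_bit_not(int b)
{
    return ~b;
}

/* XOR with 1 flips only the low bit, so a 0/1 input stays 0/1.  */
static int negate_with_bit_xor(int b)
{
    return b ^ 1;
}
```

Calling negate_with_bit_xor on 0 and 1 gives 1 and 0, whereas negate_with_bit_not gives -1 and -2, which is the scalar shape of the correctness concern the message describes.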

RE: [PATCH 2/2] Add a new permute optimization step in SLP

2024-10-18 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Friday, October 18, 2024 11:03 AM > To: Tamar Christina > Cc: Christoph Müllner ; gcc-patches@gcc.gnu.org; > Philipp Tomsich ; Jeff Law ; > Robin Dapp > Subject: RE: [PATCH 2/2] Add a new permute optimization

RE: [PATCH] SVE intrinsics: Add fold_active_lanes_to method to refactor svmul and svdiv.

2024-10-18 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Thursday, October 17, 2024 6:05 PM > To: Jennifer Schmitz > Cc: gcc-patches@gcc.gnu.org; Kyrylo Tkachov ; Tamar > Christina > Subject: Re: [PATCH] SVE intrinsics: Add fold_active_lanes_to method to > re

[PATCH]middle-end: Fix GSI for gcond root [PR117140]

2024-10-18 Thread Tamar Christina
Hi All, When finding the gsi to use for code of the root statements we should use the one of the original statement rather than the gcond which may be inside a pattern. Without this the emitted instructions may be discarded later. Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
