>
> This is an example of how I'd like to see cleanup for SLP happening
> in the vectorizable_* and related functions. While this example,
> vectorizable_conversion, is quite straightforward, it helps to
> isolate errors. I've done this in 3 steps:
Happy to help with this if you let me know whi
> -Original Message-
> From: Richard Biener
> Sent: Tuesday, May 6, 2025 9:51 AM
> To: gcc-patches@gcc.gnu.org
> Cc: Tamar Christina ; RISC-V CI c...@rivosinc.com>
> Subject: [PATCH] tree-optimization/120089 - force all PHIs live for
> early-break vect
>
>
> -Original Message-
> From: Jennifer Schmitz
> Sent: Monday, April 28, 2025 11:40 AM
> To: gcc-patches@gcc.gnu.org
> Cc: Richard Sandiford ; Tamar Christina
>
> Subject: Re: [PATCH] aarch64: Optimize SVE extract last to Neon lane extract
> for
> 128-bit
Hi All,
When the input is already a subreg and we try to make a paradoxical
subreg out of it for copysign this can fail if it violates the subreg
relationship.
Use force_lowpart_subreg instead of lowpart_subreg to then force the
results to a register instead of ICEing.
Bootstrapped Regtested on
Hi All,
The following testcase shows an incorrect masked codegen:
#define N 512
#define START 1
#define END 505
int x[N] __attribute__((aligned(32)));
int __attribute__((noipa))
foo (void)
{
int z = 0;
for (unsigned int i = START; i < END; ++i)
{
z++;
if (x[i] > 0)
> -Original Message-
> From: Richard Sandiford
> Sent: Friday, April 25, 2025 6:55 PM
> To: Jennifer Schmitz
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] AArch64: Fold LD1/ST1 with ptrue to LDR/STR for 128-bit
> VLS
>
> Jennifer Schmitz writes:
> > If -msve-vector-bits=128, SVE
> -Original Message-
> From: Richard Sandiford
> Sent: Friday, April 25, 2025 4:45 PM
> To: Jennifer Schmitz
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] aarch64: Optimize SVE extract last to Neon lane extract
> for
> 128-bit VLS.
>
> Jennifer Schmitz writes:
> > For the test c
> -Original Message-
> From: Jakub Jelinek
> Sent: Wednesday, April 23, 2025 10:39 AM
> To: Tamar Christina
> Cc: Richard Biener ; Andi Kleen
> ; GCC Patches
> Subject: Re: [PATCH] Add a bootstrap-native build config
>
> On Wed, Apr 23, 2025 at 09:36:11AM +
> -Original Message-
> From: Richard Biener
> Sent: Wednesday, April 23, 2025 9:37 AM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Sandiford
>
> Subject: Re: [PATCH]middle-end: Add new "max" vector cost model
>
> On Wed, 23 Apr
> -Original Message-
> From: Richard Biener
> Sent: Wednesday, April 23, 2025 9:19 AM
> To: Andi Kleen ; GCC Patches
> Subject: Re: [PATCH] Add a bootstrap-native build config
>
> On Tue, Apr 22, 2025 at 5:43 PM Andi Kleen wrote:
> >
> > On 2025-04-22 13:22, Richard Biener wrote:
> > >
> -Original Message-
> From: Richard Biener
> Sent: Wednesday, April 23, 2025 10:14 AM
> To: Tamar Christina
> Cc: Richard Sandiford ; gcc-patches@gcc.gnu.org;
> nd
> Subject: RE: [PATCH]middle-end: Add new "max" vector cost model
>
> On We
> -Original Message-
> From: Richard Sandiford
> Sent: Wednesday, April 23, 2025 9:45 AM
> To: Tamar Christina
> Cc: Richard Biener ; gcc-patches@gcc.gnu.org; nd
>
> Subject: Re: [PATCH]middle-end: Add new "max" vector cost model
>
> Tamar Christin
> -Original Message-
> From: Richard Biener
> Sent: Wednesday, April 23, 2025 9:46 AM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Sandiford
>
> Subject: RE: [PATCH]middle-end: Add new "max" vector cost model
>
> On We
> -Original Message-
> From: Richard Biener
> Sent: Wednesday, April 23, 2025 9:31 AM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Sandiford
>
> Subject: Re: [PATCH]middle-end: Add new "max" vector cost model
>
> On We
Hi All,
This patch proposes a new vector cost model called "max". The cost model is an
intersection between two of our existing cost models. Like `unlimited` it
disables the costing vs scalar and assumes all vectorization to be profitable.
But unlike unlimited it does not fully disable the vect
Hi All,
This documents the PFA support in GCC-15.
Ok for master?
Thanks,
Tamar
---
diff --git a/htdocs/gcc-15/changes.html b/htdocs/gcc-15/changes.html
index
f03e29c8581f2749a968e592eae2e40ce3ca8521..7fb70b993c56ff43c09aeb7bfaa4479385679dec
100644
--- a/htdocs/gcc-15/changes.html
+++ b/htdocs
Hi All,
I had missed this one during the AMDGCN test failures.
Like vect-early-break_18.c this test is also scalarizing the
loads and thus leading to unexpected vectorization for this
testcase.
Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
Cross checked the failing case on amdgc
> -Original Message-
> From: Richard Sandiford
> Sent: Tuesday, April 22, 2025 2:28 PM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; Richard Earnshaw ;
> ktkac...@nvidia.com
> Subject: Re: [PATCH] Document AArch64 changes for GCC 15
>
> Tamar Christina
> -Original Message-
> From: Richard Sandiford
> Sent: Tuesday, April 22, 2025 1:31 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Richard Earnshaw ; ktkac...@nvidia.com;
> Tamar Christina
> Subject: [PATCH] Document AArch64 changes for GCC 15
>
> The list i
> -Original Message-
> From: Richard Biener
> Sent: Wednesday, April 16, 2025 9:57 AM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd
> Subject: Re: [PATCH]middle-end: fix masking for partial vectors and early
> break
> [PR119351]
>
> On Wed, 16 A
Hi All,
The given test is intended to test vectorization of a strided access done by
having a step of > 1.
GCN target doesn't support load lanes, so the testcase is expected to fail,
other targets create a permuted load here which we then reject.
However some GCN arch don't seem to support
Hi All,
The following testcase shows an incorrect masked codegen:
#define N 512
#define START 1
#define END 505
int x[N] __attribute__((aligned(32)));
int __attribute__((noipa))
foo (void)
{
int z = 0;
for (unsigned int i = START; i < END; ++i)
{
z++;
if (x[i] > 0)
> -Original Message-
> From: Richard Biener
> Sent: Tuesday, April 15, 2025 12:50 PM
> To: Tamar Christina
> Cc: Richard Sandiford ; gcc-patches@gcc.gnu.org;
> nd
> Subject: RE: [PATCH]middle-end: Fix incorrect codegen with PFA and VLS
> [PR119351]
>
>
> -Original Message-
> From: Richard Biener
> Sent: Tuesday, April 15, 2025 12:49 PM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd
> Subject: Re: [PATCH]middle-end: Fix incorrect codegen with PFA and VLS
> [PR119351]
>
> On Tue, 15 Apr 2025, Tamar
> -Original Message-
> From: Richard Sandiford
> Sent: Tuesday, April 15, 2025 10:52 AM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd ; rguent...@suse.de
> Subject: Re: [PATCH]middle-end: Fix incorrect codegen with PFA and VLS
> [PR119351]
>
> Tamar
Hi All,
The following example:
#define N 512
#define START 2
#define END 505
int x[N] __attribute__((aligned(32)));
int __attribute__((noipa))
foo (void)
{
for (signed int i = START; i < END; ++i)
{
if (x[i] == 0)
return i;
}
return -1;
}
generates incorrect code with
Ping
> -Original Message-
> From: Tamar Christina
> Sent: Tuesday, July 23, 2024 3:30 PM
> To: Jonathan Wakely ; Filip Kastl
> Cc: gcc-patches@gcc.gnu.org; nd
> Subject: RE: [PATCH][contrib]: support json output from check_GNU_style_lib.py
>
> Hi Both,
>
> -Original Message-
> From: Kyrylo Tkachov
> Sent: Monday, March 31, 2025 1:43 PM
> To: i...@sandoe.co.uk
> Cc: Tamar Christina ; GCC Patches patc...@gcc.gnu.org>; Alice Carlotti ; Richard
> Sandiford
> ; s...@gentoo.org
> Subject: Re: [PATCH v2] aarch64, Dar
> -Original Message-
> From: Richard Biener
> Sent: Tuesday, March 18, 2025 10:48 AM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd
> Subject: Re: [PATCH] testsuite: update early-break tests for non-load-lanes
> targets
> [PR119286]
>
> On Mon
Hi All,
Broadly speaking, these tests were failing because the BB limitation for SLP'ing
loads in an || in an early break makes the loads end up in different BBs and so
today we can't SLP them. This results in load_lanes being required to vectorize
them because the alternative is loads with permu
> -Original Message-
> From: Richard Sandiford
> Sent: Wednesday, March 5, 2025 11:27 AM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
> ; ktkac...@gcc.gnu.org
> Subject: Re: [1/3 PATCH]AArch64: add support for partial modes to last
> >
> > diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-
> sve.md
> > index
> a93bc463a909ea28460cc7877275fce16e05f7e6..205eeec2e35544de848e0dbb
> 48e3f5ae59391a88 100644
> > --- a/gcc/config/aarch64/aarch64-sve.md
> > +++ b/gcc/config/aarch64/aarch64-sve.md
> > @@ -3107,12
> -Original Message-
> From: Richard Sandiford
> Sent: Thursday, March 6, 2025 10:40 AM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
> ; ktkac...@gcc.gnu.org
> Subject: Re: [1/3 PATCH]AArch64: add support for partial modes to last
> >
> -Original Message-
> From: Richard Sandiford
> Sent: Monday, March 3, 2025 11:53 AM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
> ; ktkac...@gcc.gnu.org
> Subject: Re: [1/3 PATCH]AArch64: add support for partial modes to last
> >
> >/* For now assume all conditional loads/stores support unaligned
> > diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> > index
> 6bbb16beff2c627fca11a7403ba5ee3a5faa21c1..b661dd400e5826fc1c4f70
> 957b335d1741fa 100644
> > --- a/gcc/tree-vect-stmts.cc
> > +++ b/gcc/tree-vect-
> -Original Message-
> From: Richard Sandiford
> Sent: Monday, March 3, 2025 10:12 AM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
> ; ktkac...@gcc.gnu.org
> Subject: Re: [PATCH]AArch64: force operand to fresh register to avoid subr
Hi All,
When the input is already a subreg and we try to make a paradoxical
subreg out of it for copysign this can fail if it violates the subreg
relationship.
Use force_lowpart_subreg instead of lowpart_subreg to then force the
results to a register instead of ICEing.
Bootstrapped Regtested on
> > >
> > > No, I don't think so. The code that eventually performs a
> > > contiguous sub-group access directly should never extend
> > > the load beyond GROUP_SIZE - or should be gated on the DR
> > > not executed speculatively. That is, we should "fix" this
> > > elsewhere.
> > >
> >
> > It do
> -Original Message-
> From: Richard Biener
> Sent: Wednesday, February 26, 2025 1:52 PM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd
> Subject: RE: [3/3 PATCH v3]middle-end: delay checking for alignment to load
> [PR118464]
>
> On Wed, 26 Feb
> -Original Message-
> From: Richard Biener
> Sent: Wednesday, February 26, 2025 12:30 PM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd
> Subject: Re: [3/3 PATCH v3]middle-end: delay checking for alignment to load
> [PR118464]
>
> On Tue, 25 Feb
Hi All,
This fixes two PRs on Early break vectorization by delaying the safety checks to
vectorizable_load when the VF, VMAT and vectype are all known.
This patch does add two new restrictions:
1. On LOAD_LANES targets, where the buffer size is known, we reject uneven
group sizes, as they are
Hi All,
These loops will now vectorize the entry finding
loops. As such we get more failures because they
were not expecting to be vectorized.
Fixed by adding #pragma GCC novector.
Bootstrapped Regtested on aarch64-none-linux-gnu,
arm-none-linux-gnueabihf, x86_64-pc-linux-gnu
-m32, -m64 and no
Hi All,
The last extraction instructions work for both full and partial SVE vectors,
however we currently only define them for FULL vectors.
Early break code for VLA now however requires partial vector support, which
relies on extract_last support.
I have not added any new testcases as they ov
> -Original Message-
> From: Richard Sandiford
> Sent: Thursday, February 13, 2025 4:55 PM
> To: Tamar Christina
> Cc: Richard Biener ; gcc-patches@gcc.gnu.org; nd
>
> Subject: Re: [PATCH v2]middle-end: delay checking for alignment to load
> [PR118464]
>
> -Original Message-
> From: Tamar Christina
> Sent: Wednesday, February 12, 2025 3:20 PM
> To: Richard Biener
> Cc: gcc-patches@gcc.gnu.org; nd
> Subject: RE: [PATCH v2]middle-end: delay checking for alignment to load
> [PR118464]
>
> > -Original
> -Original Message-
> From: Richard Biener
> Sent: Wednesday, February 12, 2025 2:58 PM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd
> Subject: Re: [PATCH v2]middle-end: delay checking for alignment to load
> [PR118464]
>
> On Tue, 11 Feb
Hi All,
It seems I ran regressions but forgot to check them last time `(*>﹏<*)′
On the GCC-13 branch the backport caused a failure due to the branch not having
generic-armv8-a and also it still treating the generic cpu special. This made
it return NULL when trying to find the default CPU.
In GC
Hi All,
These two tests now vectorize the result finding
loop with PFA and so the number of loops checked
fails.
This fixes them by adding #pragma GCC novector to
the testcases.
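
As a rough illustration of this kind of fix (not taken from the patch itself),
a testcase's checking loop can be kept scalar as sketched below. The function
name first_mismatch and its parameters are hypothetical. To my knowledge
#pragma GCC novector is available from GCC 14 onwards and has to sit directly
in front of the loop it applies to.

```c
#include <stddef.h>

/* Hypothetical result-checking helper of the kind found in vect tests.
   Returns the index of the first element that differs, or -1 if the
   two arrays match over the first N elements.  */
int
first_mismatch (const int *got, const int *want, size_t n)
{
  /* Ask the vectorizer to leave this loop alone so that it does not
     add to the number of vectorized loops the test scans for.  */
#pragma GCC novector
  for (size_t i = 0; i < n; ++i)
    if (got[i] != want[i])
      return (int) i;
  return -1;
}
```

The pragma only affects the loop that follows it, so the loop under test is
still free to vectorize while the checking loop stays scalar.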
Regtested on x86_64-pc-linux-gnu on an AVX512 machine
with -m32, -m64 and the tests pass again.
Ok for master?
Thanks,
Ta
> -Original Message-
> From: Richard Biener
> Sent: Wednesday, February 5, 2025 1:15 PM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd
> Subject: RE: [PATCH]middle-end: delay checking for alignment to load
> [PR118464]
>
> On Wed, 5 Feb
Hi All,
It seems that after my IVopts patches the function contain_complex_addr_expr
became unused and clang is rightfully complaining about it.
This removes the unused internal function.
Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
Ok for master?
Thanks,
Tamar
gcc/ChangeLo
> -Original Message-
> From: Richard Biener
> Sent: Wednesday, February 5, 2025 10:16 AM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd
> Subject: RE: [PATCH]middle-end: delay checking for alignment to load
> [PR118464]
>
> On Wed, 5 Feb
> -Original Message-
> From: Richard Biener
> Sent: Tuesday, February 4, 2025 12:49 PM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd
> Subject: RE: [PATCH]middle-end: delay checking for alignment to load
> [PR118464]
>
> On Tue, 4 Feb
> -Original Message-
> From: Jan Hubicka
> Sent: Tuesday, February 4, 2025 4:25 PM
> To: Alex Coplan
> Cc: gcc-patches@gcc.gnu.org; Richard Biener ; Tamar
> Christina
> Subject: Re: [PATCH 1/4] vect: Set counts of early break exit blocks correctly
> [PR
Looks like a last minute change I made accidentally blocked SVE. Fixed and
re-sending:
Hi All,
This fixes two PRs on Early break vectorization by delaying the safety checks to
vectorizable_load when the VF, VMAT and vectype are all known.
This patch does add two new restrictions:
1. On LOAD_LA
Ping
> -Original Message-
> From: Tamar Christina
> Sent: Friday, January 24, 2025 9:18 AM
> To: Alex Coplan ; gcc-patches@gcc.gnu.org
> Cc: Richard Biener ; Jan Hubicka
> Subject: RE: [PATCH 4/4] vect: Fix scale_profile_for_vect_loop for multiple
> exits
>
Ping
> -Original Message-
> From: Tamar Christina
> Sent: Friday, January 24, 2025 9:18 AM
> To: Alex Coplan ; gcc-patches@gcc.gnu.org
> Cc: Richard Biener ; Jan Hubicka
> Subject: RE: [PATCH 3/4] vect: Ensure profile consistency when adding epilog
> guard
Hi All,
This fixes two PRs on Early break vectorization by delaying the safety checks to
vectorizable_load when the VF, VMAT and vectype are all known.
This patch does add two new restrictions:
1. On LOAD_LANES targets, where the buffer size is known, we reject uneven
group sizes, as they are
Ping
> -Original Message-
> From: Tamar Christina
> Sent: Friday, January 24, 2025 9:18 AM
> To: Alex Coplan ; gcc-patches@gcc.gnu.org
> Cc: Richard Biener ; Jan Hubicka
> Subject: RE: [PATCH 2/4] cfgloopmanip: Add infrastructure for scaling of
> multi-exit
> lo
Ping
> -Original Message-
> From: Tamar Christina
> Sent: Friday, January 24, 2025 9:17 AM
> To: Alex Coplan ; 'gcc-patches@gcc.gnu.org' patc...@gcc.gnu.org>
> Cc: 'Richard Biener' ; 'Jan Hubicka'
> Subject: RE: [PATCH 1/4] ve
ping
> -Original Message-
> From: Tamar Christina
> Sent: Wednesday, January 15, 2025 2:08 PM
> To: Alex Coplan ; gcc-patches@gcc.gnu.org
> Cc: Richard Biener ; Jan Hubicka
> Subject: RE: [PATCH 4/4] vect: Fix scale_profile_for_vect_loop for multiple
> exits
>
ping
> -Original Message-
> From: Tamar Christina
> Sent: Wednesday, January 15, 2025 2:08 PM
> To: Alex Coplan ; gcc-patches@gcc.gnu.org
> Cc: Richard Biener ; Jan Hubicka
> Subject: RE: [PATCH 2/4] cfgloopmanip: Add infrastructure for scaling of
> multi-e
ping
> -Original Message-
> From: Tamar Christina
> Sent: Wednesday, January 15, 2025 2:07 PM
> To: Alex Coplan ; gcc-patches@gcc.gnu.org
> Cc: Richard Biener ; Jan Hubicka
> Subject: RE: [PATCH 1/4] vect: Set counts of early break exit blocks correctly
>
ping
> -Original Message-
> From: Tamar Christina
> Sent: Wednesday, January 15, 2025 2:08 PM
> To: Alex Coplan ; gcc-patches@gcc.gnu.org
> Cc: Richard Biener ; Jan Hubicka
> Subject: RE: [PATCH 3/4] vect: Ensure profile consistency when adding epilog
> guard
> -Original Message-
> From: Tamar Christina
> Sent: Friday, January 17, 2025 5:07 PM
> To: Kyrylo Tkachov ; Richard Sandiford
>
> Cc: GCC Patches ; nd ; Richard
> Earnshaw ; ktkac...@gcc.gnu.org
> Subject: RE: [PATCH]AArch64: Drop ILP32 from default elf multi
> -Original Message-
> From: Iain Sandoe
> Sent: Monday, January 20, 2025 6:15 PM
> To: Andrew Carlotti
> Cc: Kyrylo Tkachov ; GCC Patches patc...@gcc.gnu.org>; Tamar Christina ; Richard
> Sandiford ; Sam James
> Subject: Re: [PATCH] aarch64: Provide initial
Hi All,
When registering masks for SIMD clone we end up using nmasks instead of
nvectors where nmasks seems to compute the number of input masks required for
the call given the current simdlen.
This is however wrong as vect_record_loop_mask wants to know how many masks you
want to create from the
> -Original Message-
> From: Thomas Schwinge
> Sent: Monday, January 13, 2025 9:54 AM
> To: Tamar Christina ; Alex Coplan
> ; gcc-patches@gcc.gnu.org
> Cc: Andrew Stubbs
> Subject: Re: [gcc r15-6807] vect: Force alignment peeling to vectorize more
> early
> -Original Message-
> From: Kyrylo Tkachov
> Sent: Friday, January 17, 2025 3:10 PM
> To: Richard Sandiford
> Cc: Tamar Christina ; GCC Patches patc...@gcc.gnu.org>; nd ; Richard Earnshaw
> ; ktkac...@gcc.gnu.org
> Subject: Re: [PATCH]AArch64: Drop ILP32 fr
> -Original Message-
> From: Kyrylo Tkachov
> Sent: Friday, January 17, 2025 1:22 PM
> To: Tamar Christina
> Cc: GCC Patches ; nd ; Richard
> Earnshaw ; ktkac...@gcc.gnu.org; Richard
> Sandiford
> Subject: Re: [PATCH]AArch64: Drop ILP32 from default elf multi
> -Original Message-
> From: Kyrylo Tkachov
> Sent: Friday, January 17, 2025 1:04 PM
> To: Tamar Christina
> Cc: GCC Patches ; nd ; Richard
> Earnshaw ; ktkac...@gcc.gnu.org; Richard
> Sandiford
> Subject: Re: [PATCH]AArch64: Drop ILP32 from default elf multi
16-bit tests.
* gcc.target/aarch64/saturating_arithmetic_3.c: 32-bit tests.
* gcc.target/aarch64/saturating_arithmetic_4.c: 64-bit tests.
Co-authored-by: Tamar Christina
-- inline copy --
diff --git a/gcc/config/aarch64/aarch64-builtins.cc
b/gcc/
Hi All,
Following the deprecation of ILP32, *-elf builds now fail due to -Werror on the
deprecation warning. This is because on embedded builds ILP32 is part of the
default multilib.
This patch removes it from the default target as the build would fail anyway.
Cross compiled on aarch64-none-elf
> -Original Message-
> From: Wilco Dijkstra
> Sent: Tuesday, January 14, 2025 5:30 PM
> To: Richard Sandiford
> Cc: Richard Earnshaw ; ktkac...@nvidia.com; GCC
> Patches ; sch...@linux-m68k.org
> Subject: Re: [PATCH] AArch64: Deprecate -mabi=ilp32
>
> Hi Richard,
>
> >> + if (TARGET_IL
> -Original Message-
> From: Richard Sandiford
> Sent: Thursday, January 16, 2025 7:11 AM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
> ; ktkac...@gcc.gnu.org
> Subject: Re: [PATCH]AArch64: have -mcpu=native detect architecture extensi
Hi All,
When this test was added initially it didn't add the early break effective
target tests.
This means that the test was "passing" (as in, it was failing to vectorize)
because many targets don't support early break.
But the test should not have been run for these targets. When the vectoriz
Re-reading again I realize I confused cache size in your question with cache
line size.
Cache size can be whatever, yes. Cache line size must match.
But that doesn't change the fact that this patch is correct.
Thanks,
Tamar
From: Tamar Christina
> -Original Message-
> From: Richard Sandiford
> Sent: Wednesday, January 15, 2025 3:23 PM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
> ; ktkac...@gcc.gnu.org
> Subject: Re: [PATCH]AArch64: have -mcpu=native detect architecture extensi
> -Original Message-
> From: Xi Ruoyao
> Sent: Wednesday, January 15, 2025 1:40 PM
> To: Tamar Christina ; gcc-patches@gcc.gnu.org
> Cc: nd ; Richard Earnshaw ;
> ktkac...@gcc.gnu.org; Richard Sandiford
> Subject: Re: [PATCH]AArch64: have -mcpu=native detect architec
Ping
> -Original Message-
> From: Alex Coplan
> Sent: Monday, January 6, 2025 11:35 AM
> To: gcc-patches@gcc.gnu.org
> Cc: Richard Biener ; Jan Hubicka ; Tamar
> Christina
> Subject: [PATCH 3/4] vect: Ensure profile consistency when adding epilog guard
> [PR11779
Ping
> -Original Message-
> From: Alex Coplan
> Sent: Monday, January 6, 2025 11:35 AM
> To: gcc-patches@gcc.gnu.org
> Cc: Richard Biener ; Jan Hubicka ; Tamar
> Christina
> Subject: [PATCH 2/4] cfgloopmanip: Add infrastructure for scaling of
> multi-exit
>
Ping
> -Original Message-
> From: Alex Coplan
> Sent: Monday, January 6, 2025 11:36 AM
> To: gcc-patches@gcc.gnu.org
> Cc: Richard Biener ; Jan Hubicka ; Tamar
> Christina
> Subject: [PATCH 4/4] vect: Fix scale_profile_for_vect_loop for multiple exits
> [PR1
Ping
> -Original Message-
> From: Alex Coplan
> Sent: Monday, January 6, 2025 11:34 AM
> To: gcc-patches@gcc.gnu.org
> Cc: Richard Biener ; Jan Hubicka ; Tamar
> Christina
> Subject: [PATCH 1/4] vect: Set counts of early break exit blocks correctly
> [PR117790
> -Original Message-
> From: Xi Ruoyao
> Sent: Wednesday, January 15, 2025 1:29 PM
> To: Tamar Christina ; gcc-patches@gcc.gnu.org
> Cc: nd ; Richard Earnshaw ;
> ktkac...@gcc.gnu.org; Richard Sandiford
> Subject: Re: [PATCH]AArch64: have -mcpu=native detect ar
> -Original Message-
> From: Richard Sandiford
> Sent: Monday, January 13, 2025 8:55 PM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
> ; ktkac...@gcc.gnu.org
> Subject: Re: [PATCH]AArch64: have -mcpu=native detect architecture extensi
Hi All,
In g:3c32575e5b6370270d38a80a7fa8eaa144e083d0 I made a mistake and incorrectly
replaced the type of the arguments of an expression with the type of the
expression. This is of course wrong.
This reverts that change and I have also double checked the other replacements
and they are fine.
> -Original Message-
> From: Richard Sandiford
> Sent: Monday, January 13, 2025 6:35 PM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
> ; ktkac...@gcc.gnu.org
> Subject: Re: [PATCH]AArch64: have -mcpu=native detect architecture extensi
Hi All,
When both -mcpu and -march are specified, the value of -march wins out.
This is done correctly for the calls to cc1 and for the assembler directives we
put out in assembly files.
However in the call to as we don't do this and instead use the arch from the
cpu. This leads to a situation
Hi All,
In g:e91a17fe39c39e98cebe6e1cbc8064ee6846a3a7 we added the ability for
-mcpu=native on unknown CPUs to still enable architecture extensions.
This has worked great but was only added for homogeneous systems.
However the same thing works for big.LITTLE as in such a system the cores must
have
> -Original Message-
> From: Jonathan Wakely
> Sent: Friday, January 10, 2025 2:36 PM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd ; libstd...@gcc.gnu.org
> Subject: Re: [PATCH][libstdc++]: backport inline keyword on std::find
>
> On Fri, 10
Hi All,
This is a backport version of the same patch as
https://gcc.gnu.org/pipermail/gcc-patches/2024-December/671618.html
for the release branches. I'd like to backport this to GCC 14, 13 and 12 where
the first regression showed up. I am however aware that GCC 12 is going to
get its last rele
Hi All,
The Part Num field for the MIDR for Cortex-X4 is wrong. It's currently the
part number for a Cortex-A720 (which does have the right number).
The correct number can be found in the Cortex-X4 Technical Reference Manual [1]
on page 382 in Issue Number 5.
[1] https://developer.arm.com/doc
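
For context, the part number occupies bits [15:4] of MIDR_EL1 and the
implementer code bits [31:24]. A minimal sketch of pulling those fields out of
a raw MIDR value follows. The helper names and the sample value in the comment
are made up for illustration and are not the Cortex-X4 or Cortex-A720 values.

```c
#include <stdint.h>

/* Implementer code lives in MIDR_EL1 bits [31:24].  */
static inline unsigned
midr_implementer (uint32_t midr)
{
  return (midr >> 24) & 0xffu;
}

/* Primary part number lives in MIDR_EL1 bits [15:4].  */
static inline unsigned
midr_part_num (uint32_t midr)
{
  return (midr >> 4) & 0xfffu;
}

/* Illustrative value only: 0x410fabc1 decodes to implementer 0x41 and
   a placeholder part number 0xabc.  */
```

Two cores sharing one part number in the table would decode identically here,
which is why a wrong entry makes one CPU be detected as the other.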
or
the element is placed in what I
assume to be a crowded bucket.
It does seem to be beneficial for some user-defined datatypes, I assume due to
some IPA shenanigans. But overall
there were more and larger wins using probability of 0 rather than 1.
Kind regards,
Tamar
From: Tamar Christina
> -Original Message-
> From: Richard Sandiford
> Sent: Thursday, January 9, 2025 3:09 PM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
> ; ktkac...@gcc.gnu.org
> Subject: Re: [PATCH]AArch64: Fix costing of emulated gathers/scatters
> >
> -Original Message-
> From: Richard Sandiford
> Sent: Wednesday, January 8, 2025 10:30 AM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
> ; ktkac...@gcc.gnu.org
> Subject: Re: [PATCH]AArch64: Fix costing of emulated gathers/scatters
> >
> -Original Message-
> From: Richard Earnshaw (lists)
> Sent: Wednesday, January 8, 2025 1:18 PM
> To: Christophe Lyon ; gcc-patches@gcc.gnu.org;
> Richard Sandiford ; Tamar Christina
> ; Andre Simoes Dias Vieira
> ; ktkac...@nvidia.com;
> raman...@nvidia.com
&
> >> i.e. we use separate address arithmetic and avoid UMOVs. Counting
> >> two loads and one store for each element of the scatter store seems
> >> like overkill for that.
> >
> > Hmm agreed..
> >
> > How about for stores we increase the load counts by count / 2?
> >
> > This would account for th
> -Original Message-
> From: Richard Sandiford
> Sent: Monday, January 6, 2025 5:54 PM
> To: Jennifer Schmitz
> Cc: Richard Biener ; Richard Biener
> ; Tamar Christina ;
> gcc-patches@gcc.gnu.org; Kyrylo Tkachov
> Subject: Re: [RFC
> -Original Message-
> From: Tamar Christina
> Sent: Tuesday, December 31, 2024 1:04 PM
> To: Richard Biener ; Andrew Pinski
>
> Cc: gcc-patches@gcc.gnu.org
> Subject: RE: [PATCH v2 2/3] cfgexpand: Rewrite add_scope_conflicts_2 to use
> cache and loo
> -Original Message-
> From: Richard Sandiford
> Sent: Friday, January 3, 2025 10:59 AM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
> ; ktkac...@gcc.gnu.org
> Subject: Re: [PATCH]AArch64: Implement four and eight chunk VLA concats
> >
> >
> > How about instead doing something like:
> >
> > worklist.reserve (nelts);
> > for (int i = 0; i < nelts; ++i)
> > worklist.quick_push (force_reg (elem_mode, XVECEXP (vals, 0, i)));
> >
> > while (nelts > 2)
> > {
> > for (int i = 0; i < nelts; i += 2)
> > {
> >