On Wed, 2025-01-22 at 10:53 +0800, Xi Ruoyao wrote:
> On Wed, 2025-01-22 at 10:37 +0800, Lulu Cheng wrote:
> >
> > On 2025/1/22 at 8:49 AM, Xi Ruoyao wrote:
> > > The second source register of this insn cannot be the same as the
> > > destination register.
> > >
> > > gcc/ChangeLog:
> > >
> > > * conf
On 21/01/2025 21:58, Jason Merrill wrote:
On 1/15/25 7:36 PM, yxj-github-437 wrote:
On Fri, Jan 03, 2025 at 05:18:55PM +, xxx wrote:
From: yxj-github-437 <2457369...@qq.com>
This patch attempts to fix an error when building module std. The
reason for the
error is __builtin_va_list (aka struc
Hello,
On Tue, 21 Jan 2025, Martin Uecker wrote:
> > > Couldn't you use the rule that .len refers to the closest enclosing
> > > structure
> > > even without __self__ ? This would then also disambiguate between
> > > designators
> > > and other uses.
> >
> > Right now, an expression cannot sta
Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?
-- >8 --
When we started streaming the bit to handle merging of imported temploid
friends in r15-2807, I unthinkingly only streamed it in the
'!state->is_header ()' case.
This patch reworks the streaming logic to ensure that this d
The test case added in r15-7073 now triggers an ICE, indicating we need
the same fix as AArch64.
gcc/ChangeLog:
PR target/118501
* config/loongarch/loongarch.md (@xorsign3): Use
force_lowpart_subreg.
---
Bootstrapped and regtested on loongarch64-linux-gnu, ok for trunk?
> On 22 Jan 2025, at 13:53, Richard Sandiford wrote:
>
> Kyrylo Tkachov writes:
>> Hi Richard,
>>
>>> On 22 Jan 2025, at 13:21, Richard Sandiford
>>> wrote:
>>>
>>> GCC 15 is the first release to support FP8 intrinsics.
>>> The underlying instructions depend on the value of a new register,
libgccjit fails on startup on aarch64 (and probably other archs).
The issues are that
(a) within jit_langhook_init the call to
targetm.init_builtins can use types that aren't representable
via jit::recording::type, and
(b) targetm.init_builtins can call lang_hooks.decls.pushdecl, which
although
Apologies for the noise.
On 1/21/25 10:16 PM, Jakub Jelinek wrote:
On Fri, Oct 18, 2024 at 11:52:22AM +0530, Tejas Belagod wrote:
Currently poly-int type structures are passed by value to OpenMP runtime
functions for shared clauses etc. This patch improves on this by passing
around poly-int structures by address to avo
Hi David.
I had a patch for this here: https://github.com/antoyo/libgccjit/pull/20
The fact that you removed the debug_tree (and abort) will make it harder
to figure out what the missing types to handle are.
This will also probably make it hard for people to understand why they
get a type error
Hello Michael,
On Wednesday, 22.01.2025 at 16:54 +0100, Michael Matz wrote:
> On Wed, 22 Jan 2025, Martin Uecker wrote:
>
> > > > So you do not need to look further. But maybe I am missing something
> > > > else.
> > >
> > > Like ...
> > >
> > > > > Note further that you may have '{ .y[
On 9/6/24 8:02 AM, Jakub Jelinek wrote:
Hi!
On Wed, Aug 14, 2024 at 06:11:35PM +0200, Jakub Jelinek wrote:
Here is the I believe ABI compatible version, which uses the separate
guard variables, so different structured binding variables can be
initialized in different threads, but the thread tha
Hello,
On Wed, 22 Jan 2025, Martin Uecker wrote:
> > > In .y[1][3].z after .y you can decide whether y is a member of the
> > > struct being initialized. If it is, it is a designator and if not
> > > it must be an expression.
> >
> > If y is not a member it must be an expression, true. But if
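For readers following the designator discussion, here is a minimal sketch of how such a designator behaves in standard C today (C99 and later). The struct names and values are made up for illustration, and nothing here models the proposed `.len` expression syntax:

```c
/* Hypothetical types, not taken from the thread. */
struct inner { int z; };
struct outer {
    int len;
    struct inner y[2][4];
};

/* Here y is a member of the struct being initialized, so .y[1][3].z is a
   designator; members without an explicit initializer are zero-initialized. */
static struct outer make_outer(void)
{
    struct outer o = { .len = 8, .y[1][3].z = 42 };
    return o;
}
```

Only `len` and the single element `y[1][3].z` receive explicit values; everything else in the object is zero.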
Hi!
The fold_builtin_frexp folding for NaN/Inf just returned the first argument
while evaluating the second argument's side-effects, rather than storing something
to what the second argument points to.
The PR argues that the C standard requires the function to store something
there but what exactly is
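As a caller-side illustration of why the stored value matters: Annex F of the C standard leaves the value stored through the second argument unspecified for NaN and infinity, so portable code should not read it in those cases. A minimal sketch, where `split_finite` is a hypothetical helper and not part of the patch:

```c
#include <math.h>

/* Hypothetical helper: returns 1 and fills *mant and *exp when x is finite.
   Otherwise returns 0, and *exp is set to 0 by this helper itself rather
   than trusting whatever frexp stored. */
static int split_finite(double x, double *mant, int *exp)
{
    int e = 0;
    double m = frexp(x, &e);

    *mant = m;
    *exp = isfinite(x) ? e : 0;
    return isfinite(x) != 0;
}
```

For a finite input such as 8.0 the helper yields mantissa 0.5 and exponent 4; for an infinite or NaN input it reports failure and does not expose the stored exponent.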
On Wed, Jan 22, 2025 at 04:19:37PM +0530, Tejas Belagod wrote:
> On 1/21/25 10:16 PM, Jakub Jelinek wrote:
> > On Fri, Oct 18, 2024 at 11:52:22AM +0530, Tejas Belagod wrote:
> > > Currently poly-int type structures are passed by value to OpenMP runtime
> > > functions for shared clauses etc. This
Added 2 tests for PR118591.
Johann
--
AVR: Add test cases for PR118591.
gcc/testsuite/
PR rtl-optimization/118591
* gcc.target/avr/torture/pr118591-1.c: New test.
* gcc.target/avr/torture/pr118591-2.c: New test.
diff --git a/gcc/testsuite/gcc.target/avr/torture/pr11859
Hi Richard,
> On 22 Jan 2025, at 13:21, Richard Sandiford wrote:
>
> GCC 15 is the first release to support FP8 intrinsics.
> The underlying instructions depend on the value of a new register,
> FPMR. Unlike FPCR, FPMR is a normal call-clobbered/caller-save
> register rather than a global regis
Kyrylo Tkachov writes:
> Hi Richard,
>
>> On 22 Jan 2025, at 13:21, Richard Sandiford
>> wrote:
>>
>> GCC 15 is the first release to support FP8 intrinsics.
>> The underlying instructions depend on the value of a new register,
>> FPMR. Unlike FPCR, FPMR is a normal call-clobbered/caller-save
>
gcc/ChangeLog:
* config/s390/s390.cc: Fix arch15 machine string which must not
be empty.
---
gcc/config/s390/s390.cc | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc
index 313f968c87e..86a5f059b85 100644
--- a/gc
Hello,
On Wed, 22 Jan 2025, Martin Uecker wrote:
> > > So you do not need to look further. But maybe I am missing something
> > > else.
> >
> > Like ...
> >
> > > > Note further that you may have '{ .y[1][3].z }', which is still not a
> > > > designation, but an expression under your proposa
GCC 15 is going to be the first release to support FPMR.
The alternatives for moving values into FPMR were missing
a zero alternative, meaning that moves of zero would use an
unnecessary temporary register.
Tested on aarch64-linux-gnu. I'll push in about 24 hours
if there are no comments before t
GCC 15 is going to be the first release to support FPMR.
While working on a follow-up patch, I noticed that for:
(set (reg:DI R) ...)
...
(set (reg:DI fpmr) (reg:DI R))
IRA would prefer to spill R to memory rather than allocate a GPR.
This is because the register move cost for GENERAL
GCC 15 is the first release to support FP8 intrinsics.
The underlying instructions depend on the value of a new register,
FPMR. Unlike FPCR, FPMR is a normal call-clobbered/caller-save
register rather than a global register. So:
- The FP8 intrinsics take a final uint64_t argument that
specifie
While working on another MSR-related patch, I noticed that
aarch64_write_sysregdi's constraints allowed zero, but its
predicate didn't. This could in principle lead to an ICE
during or after RA, since "Z" allows the RA to rematerialise
a known zero directly into the instruction.
The usual techniq
Hello,
On Wed, 22 Jan 2025, Martin Uecker wrote:
> > You need to decide which is which after seeing the ".". I'm guessing what
> > you mean is that on seeing ".ident" as the first two tokens inside an
> > initializer-list you go the designator route, and not the
> > initializer/assignment-express
On Wednesday, 22.01.2025 at 16:25 +0100, Michael Matz wrote:
> Hello,
>
> On Wed, 22 Jan 2025, Martin Uecker wrote:
>
> > > You need to decide which is which after seeing the ".". I'm guessing
> > > what
> > > you mean is that on seeing ".ident" as the first two tokens inside an
> > > initial
On Wednesday, 22.01.2025 at 15:53 +0100, Michael Matz wrote:
> Hello,
>
> On Tue, 21 Jan 2025, Martin Uecker wrote:
>
> > > > Couldn't you use the rule that .len refers to the closest enclosing
> > > > structure
> > > > even without __self__ ? This would then also disambiguate between
> > >
If the target does not support floating-point, we register FP vector
types as 'void' (see register_vector_type).
This leads to warnings about 'pure attribute on function returning
void' when we declare the various load intrinsics because their
call_properties say CP_READ_MEMORY (thus giving them th
On 2025/1/22 at 5:21 PM, Xi Ruoyao wrote:
On Wed, 2025-01-22 at 10:53 +0800, Xi Ruoyao wrote:
On Wed, 2025-01-22 at 10:37 +0800, Lulu Cheng wrote:
On 2025/1/22 at 8:49 AM, Xi Ruoyao wrote:
The second source register of this insn cannot be the same as the
destination register.
gcc/ChangeLog:
* conf
Atomic load does not modify the memory. Atomic store does not read the
memory, thus we can use "=" instead.
gcc/ChangeLog:
* config/loongarch/sync.md (atomic_load): Remove "+" for
the memory operand.
(atomic_store): Use "=" instead of "+" for the memory
operand.
-
This instruction is used to skip a redundant barrier if -mno-ld-seq-sa
or the memory model requires a barrier on failure. But with -mld-seq-sa
and other memory models the barrier may not exist at all, and we
should remove the "b 3f" instruction as well.
The implementation uses a new operand
They are the same.
gcc/ChangeLog:
* config/loongarch/sync.md (atomic_optab): Remove.
(atomic_): Change atomic_optab to amop.
(atomic_fetch_): Likewise.
---
gcc/config/loongarch/sync.md | 6 ++
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/gcc/config/lo
Bootstrapped and regtested on loongarch64-linux-gnu. Ok for trunk?
Xi Ruoyao (5):
LoongArch: (NFC) Remove atomic_optab and use amop instead
LoongArch: Don't use "+" for atomic_{load,store} "m" constraint
LoongArch: Allow using bstrins for masking the address in
atomic_test_and_set
Loo
Hi Paul,
the patch looks reasonable to me. Ok for mainline.
Just a side-thought: Could it be possible that the for-loop in trans-decl does
not find the result? Would an assert after the loop at least give a hint where
something went wrong? That's just from reading the code, so if you think that
We can use bstrins for masking the address here. As people are already
working on LA32R (which lacks bstrins instructions), for future-proofing
we check whether (const_int -4) is an and_operand and force it into a
register if not.
gcc/ChangeLog:
* config/loongarch/sync.md (atomic_test_a
For LL-SC loops, if the atomic operation has succeeded, the SC
instruction always implies a full barrier, so the barrier we manually
inserted only needs to account for the failure memorder, not
the success memorder (the barrier is skipped with "b 3f" on success
anyway).
Note that if we use
On 1/21/25 7:04 PM, Marek Polacek wrote:
On Tue, Jan 21, 2025 at 11:00:13AM -0500, Jason Merrill wrote:
On 1/21/25 9:54 AM, Jason Merrill wrote:
On 1/20/25 5:58 PM, Marek Polacek wrote:
@@ -9087,7 +9092,9 @@ cxx_eval_outermost_constant_expr (tree t, bool
allow_non_constant,
return r;
Hi All,
This patch fixes a double ICE arising from confusion between the dummy
symbol arising from a module function/subroutine interface and the module
procedure itself. In both cases, use of the name is unambiguous, as
explained in the ChangeLog. The testcase contains both the original and the
v
On 1/22/25 6:30 AM, Nathaniel Shead wrote:
Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?
OK.
-- >8 --
When we started streaming the bit to handle merging of imported temploid
friends in r15-2807, I unthinkingly only streamed it in the
'!state->is_header ()' case.
This pat
> On Jan 22, 2025, at 11:22, Martin Uecker wrote:
>
>
> Hello Michael,
>
> On Wednesday, 22.01.2025 at 16:54 +0100, Michael Matz wrote:
>> On Wed, 22 Jan 2025, Martin Uecker wrote:
>>
> So you do not need to look further. But maybe I am missing something
> else.
Lik
Hi,
one of the testcases from PR 118097 and the one from PR 118535 show
that the fix to PR 118138 was incomplete. We must not only make sure
that (intermediate) results of operations performed by IPA-CP are
fold_converted to the type of the destination formal parameter but we
also must decouple t
On 1/22/25 2:48 AM, Jakub Jelinek wrote:
Hi!
The fold_builtin_frexp folding for NaN/Inf just returned the first argument
while evaluating the second argument's side-effects, rather than storing something
to what the second argument points to.
The PR argues that the C standard requires the function
As it turns out, logical 32-bit shifts with an offset of 25..30 can
be performed in 7 instructions or less. This beats the 7 instructions
required for the default code of a shift loop.
Plus, with zero overhead, these cases can be 3-operand.
This is only relevant for -Oz because with -Os, 3op s
Hi Quin,
sorry, another idea I noted down some time ago which I would like
to mention.
> >
> > - use it only in limited contexts where you do not need to know
> > the type (e.g. this works for goto labels) or for a basic
> > counted_by attribute that only takes an identifier as we have i
On Wednesday, 22.01.2025 at 16:37 +, Qing Zhao wrote:
>
> > On Jan 22, 2025, at 11:22, Martin Uecker wrote:
> >
> >
> > Hello Michael,
> >
> > On Wednesday, 22.01.2025 at 16:54 +0100, Michael Matz wrote:
> > > On Wed, 22 Jan 2025, Martin Uecker wrote:
> > >
> > > > > > So you do n
On Jan 21, 2025, Richard Biener wrote:
> So - your fix looks almost good, I'd adjust it to
>> +case BIT_FIELD_REF:
>> + if (DECL_P (TREE_OPERAND (expr, 0))
>> + && !bit_field_ref_in_bounds_p (expr))
>> + return true;
>> + /* Fall through. */
> OK if that works.
It
On Jan 19, 2025, at 12:47 PM, Torbjorn SVENSSON
wrote:
>
> On 2025-01-19 21:20, Andrew Pinski wrote:
>> On Sun, Jan 19, 2025 at 12:17 PM Torbjörn SVENSSON
>> wrote:
>>>
>>> Ok for trunk?
>>>
>>> --
>>>
>>> Most baremetal toolchains will not have an implementation for alarm and
>>> sigaction
On 9/10/24 2:29 PM, Jakub Jelinek wrote:
Hi!
The following patch on top of the
https://gcc.gnu.org/pipermail/gcc-patches/2024-September/662507.html
patch adds CWG 2867 support for namespace locals.
Those vars are just pushed into {static,tls}_aggregates chain, then
pruned from those lists, sepa
On 1/3/25 6:39 AM, Jakub Jelinek wrote:
On Thu, Dec 19, 2024 at 11:56:10AM -0500, Jason Merrill wrote:
Looks right.
So, I've tried to construct a testcase, but turns out modules.cc just
doesn't handle structured bindings at namespace scope at all, so it is
premature to try to get the ordering
On 1/16/25 7:24 PM, Nathaniel Shead wrote:
On Thu, Jan 16, 2025 at 07:09:33PM -0500, Jason Merrill wrote:
On 1/6/25 7:22 AM, Nathaniel Shead wrote:
I'm not 100% sure I've handled this properly, any feedback welcome.
In particular, maybe I should use `DECL_IMPLICIT_TYPEDEF_P` in the
mangling log
On Wednesday, 22.01.2025 at 17:30 +0100, Michael Matz wrote:
> Hello,
>
> On Wed, 22 Jan 2025, Martin Uecker wrote:
>
> > > > In .y[1][3].z after .y you can decide whether y is a member of the
> > > > struct being initialized. If it is, it is a designator and if not
> > > > it must be an ex
On Wednesday, 22.01.2025 at 18:11 +0100, Martin Uecker wrote:
> On Wednesday, 22.01.2025 at 16:37 +, Qing Zhao wrote:
> >
> > > On Jan 22, 2025, at 11:22, Martin Uecker wrote:
> > >
> > >
> > > Hello Michael,
> > >
> > > On Wednesday, 22.01.2025 at 16:54 +0100, Michael Ma
On 1/22/25 12:29 AM, Robin Dapp wrote:
Hi,
after testing on the BPI (4.2% improvement for x264 input 1, 4.4% for input 2)
and the discussion in PR117173 I figured it's best to disable the two-source
permutes by default for now. We quickly talked about this on the patchwork
call last week. C
Dear all,
while looking at details of a related but slightly different PR, I found
that we did evaluate the arguments to MINLOC/MAXLOC too often in the
inlined version.
The attached patch creates temporaries for array elements where needed,
and ensures that each array element is only touched onc
On Wed, Jan 22, 2025 at 11:13 AM Haochen Jiang wrote:
>
> Hi all,
>
> These two testcases were missed in the previous addition of
> -march=x86-64-v3 to silence warnings for -march=native tests.
>
> Ok for trunk?
Ok.
>
> Thx,
> Haochen
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/vnniint16
On 1/6/25 7:21 AM, Nathaniel Shead wrote:
Something like this should probably be backported to GCC 14 too, since
my change in r14-9232-g3685fae23bb008 inadvertently caused ICEs that
this fixes. But without the previous patch this patch will cause ABI
changes, and I'm not sure how easily it would
On 1/6/25 7:23 AM, Nathaniel Shead wrote:
https://github.com/itanium-cxx-abi/cxx-abi/pull/85 clarifies that
mangling a lambda expression should use 'L' rather than "tl". This only
affects C++20 (and later) so no ABI flag is given.
OK.
gcc/cp/ChangeLog:
* mangle.cc (write_expression)
The first error message in my previous email led me to the following
constraint:
“*C1130* A *variable-name* that appears in a LOCAL or LOCAL_INIT
*locality-spec* shall not have the ALLOCATABLE, INTENT (IN), or OPTIONAL
attribute, shall not be of finalizable type, shall not have an allocatable
ult
PR target/118561
gcc/ChangeLog:
* config/loongarch/loongarch-builtins.cc
(loongarch_expand_builtin_lsx_test_branch):
NULL_RTX will not be returned when an error is detected.
(loongarch_expand_builtin): Likewise.
gcc/testsuite/ChangeLog:
* gcc.targ
On 2025/1/23 at 11:36 AM, Xi Ruoyao wrote:
On Thu, 2025-01-23 at 11:21 +0800, Lulu Cheng wrote:
On 2025/1/22 at 9:26 PM, Xi Ruoyao wrote:
The test case added in r15-7073 now triggers an ICE, indicating we need
the same fix as AArch64.
gcc/ChangeLog:
PR target/118501
* config/loongarch/loong
From: Yash Shinde
This patch addresses an issue in the C preprocessor where incorrect line number
information is generated when processing
files with a large number of lines. The problem arises from improper handling
of location intervals in the line map,
particularly when locations exceed LINE
On 2025/1/22 at 9:26 PM, Xi Ruoyao wrote:
The test case added in r15-7073 now triggers an ICE, indicating we need
the same fix as AArch64.
gcc/ChangeLog:
PR target/118501
* config/loongarch/loongarch.md (@xorsign3): Use
force_lowpart_subreg.
---
Bootstrapped and regtested on
On Thu, 2025-01-23 at 11:21 +0800, Lulu Cheng wrote:
>
> On 2025/1/22 at 9:26 PM, Xi Ruoyao wrote:
> > The test case added in r15-7073 now triggers an ICE, indicating we need
> > the same fix as AArch64.
> >
> > gcc/ChangeLog:
> >
> > PR target/118501
> > * config/loongarch/loongarch.md (@xorsig
I recently built the master branch of Iain's fork of gcc in order to test
the support for locality specifiers. I have verified that the branch I
built contains the commit mentioned in this email thread
20b8500cfa522ebe0fcf756d5b32816da7f904dd. The code below isn't designed to
do anything useful
On Jan 22, 2025, Alexandre Oliva wrote:
> I have another patch coming up that doesn't raise concerns for me, so
> I'll hold off from installing the above, in case you also prefer the
> other one.
Unlike other access patterns, BIT_FIELD_REFs aren't regarded as
possibly-trapping out of referencing
On Wed, 22 Jan 2025 at 23:53, Georg-Johann Lay wrote:
>
> As it turns out, logical 32-bit shifts with an offset of 25..30 can
> be performed in 7 instructions or less. This beats the 7 instructions
> required for the default code of a shift loop.
> Plus, with zero overhead, these cases can be 3-operan
On Jan 22, 2025, Alexandre Oliva wrote:
> I have another patch coming up that doesn't raise concerns for me, so
> I'll hold off from installing the above, in case you also prefer the
> other one.
And here's an unrelated bit that came to mind while working on this, but
that I split out before pos
From: Pan Li
This patch would like to fix the wrong code generation for the scalar
signed SAT_SUB. The input can be QI/HI/SI/DI while the alu like sub
can only work on Xmode. Unfortunately we don't have sub/add for
non-Xmode like QImode in scalar, thus we need to sign extend to Xmode
to ensure
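For reference, the semantics a scalar signed SAT_SUB has to preserve can be sketched in plain C by doing the arithmetic in a wider type and clamping. This is only an illustration of the expected results for an 8-bit (QImode-sized) input, not the RISC-V expander from the patch; `sat_sub_i8` is a hypothetical name:

```c
#include <stdint.h>

/* Subtract in a wider signed type so the narrow operands are sign-extended
   first, then clamp the result back into the int8_t range. */
static int8_t sat_sub_i8(int8_t a, int8_t b)
{
    int32_t diff = (int32_t)a - (int32_t)b;

    if (diff > INT8_MAX)
        return INT8_MAX;
    if (diff < INT8_MIN)
        return INT8_MIN;
    return (int8_t)diff;
}
```

If the narrow inputs were zero-extended instead of sign-extended before the subtraction, negative operands would be treated as large positive values and the clamping would give wrong results.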
From: Pan Li
This patch would like to fix the wrong code generation for the scalar
signed SAT_ADD. The input can be QI/HI/SI/DI while the alu like sub
can only work on Xmode. Unfortunately we don't have sub/add for
non-Xmode like QImode in scalar, thus we need to sign extend to Xmode
to ensure
From: Pan Li
This patch would like to refactor the helper function of the SAT_*
scalar. The helper function will convert the define_pattern ops
to the xmode reg for the underlying code-gen. This patch adds a
new parameter for ZERO_EXTEND or SIGN_EXTEND if the input is const_int
or the mode is non-
From: Pan Li
This patch would like to fix the wrong code generation for the scalar
signed SAT_TRUNC. The input can be QI/HI/SI/DI while the alu like sub
can only work on Xmode. Unfortunately we don't have sub/add for
non-Xmode like QImode in scalar, thus we need to sign extend to Xmode
to ensu
When comparing a signed narrow variable with a wider constant that has
the bit corresponding to the variable's sign bit set, we would check
that the constant is a sign-extension from that sign bit, and conclude
that the compare fails if it isn't.
When the signed variable is masked without gettin
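A source-level illustration of the first paragraph, assuming an 8-bit signed variable compared against an `int` constant (the function names are hypothetical and this is not the compiler's internal code): the comparison can only succeed when the constant is what sign-extending some 8-bit value would produce.

```c
/* v is promoted (sign-extended) to int before the comparison. */
static int narrow_equals(signed char v, int k)
{
    return v == k;
}

/* True when k is the sign-extension of some 8-bit signed value,
   i.e. when k fits in the range of an 8-bit signed char. */
static int is_sign_extension_of_8_bits(int k)
{
    return k >= -128 && k <= 127;
}
```

The constant 0x80 has the bit corresponding to the variable's sign bit set but is not a sign-extension from it, so no `signed char` value compares equal to it; -128 is the sign-extended form and can match.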
Hi Damian,
looking into the code I find neither the keyword LOCAL nor REDUCE. The match
routine also does not give any hint that those keywords are supported in a do
concurrent loop.
And here is the existing bug
ticket: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101602
But I don't see a ticket
> Could you show me a piece of the codegen difference in X264 that
> improves performance?
I have one ready from SATD (see PR117173), there are more.
"Before":
_838 = VEC_PERM_EXPR ;
_846 = VEC_PERM_EXPR ;
"After":
_42 = VEC_PERM_EXPR ;
...
_44 = VEC_PERM_EXPR ;
_45 = VEC_PERM_EXPR ;
"A
lgtm.
--Reply to Message--
On Wed, Jan 22, 2025 16:04 PM Robin Dapp
On Tue, Jan 21, 2025 at 5:45 PM Andrew Pinski wrote:
>
> On Thu, Aug 8, 2024 at 2:07 PM Andrew Pinski wrote:
> >
> > On Fri, Aug 2, 2024 at 7:30 AM Jeff Law wrote:
> > >
> > >
> > >
> > > On 8/1/24 4:12 AM, Surya Kumari Jangala wrote:
> > > > lra: emit caller-save register spills before call ins