On Fri, Nov 10, 2023 at 5:12 PM Richard Biener
wrote:
>
> On Wed, Nov 8, 2023 at 9:22 AM Hongtao Liu wrote:
> >
> > On Wed, Nov 8, 2023 at 3:53 PM Richard Biener
> > wrote:
> > >
> > > On Wed, Nov 8, 2023 at 2:18 AM Hongtao Liu wrote:
> > > >
> > > > On Tue, Nov 7, 2023 at 10:34 PM Richard Bien
On Sun, Nov 12, 2023, 23:10 Tamar Christina wrote:
> > -----Original Message-----
> > From: Richard Biener
> > Sent: Monday, November 13, 2023 6:55 AM
> > To: Xi Ruoyao
> > Cc: gcc-patches@gcc.gnu.org; chenglulu ;
> > i...@xen0n.name; xucheng...@loongson.cn; Tamar Christina
> > ; tschwi...@gcc.
On Mon, 2023-11-13 at 07:09 +, Tamar Christina wrote:
> In the case of e.g. loongarch64, it looks like the target actually has an
> fcopysign
> instruction, so wouldn't this rewriting by simplify-rtx be a de-optimization?
Yes, it seems to be a de-optimization on LoongArch. For this micro-benchmark:
On Fri, Nov 10, 2023 at 2:14 PM liuhongt wrote:
>
> While working on PR112443, I noticed some misoptimizations:
> after we fold _mm{,256}_blendv_epi8/pd/ps into gimple, the backend
> fails to combine it back to v{,p}blendv{b,ps,pd} since the pattern is
> too complicated, so I think maybe
On Sun, Nov 12, 2023 at 10:03 PM Roger Sayle wrote:
>
>
> This patch improves register pressure during reload, inspired by PR 97756.
> Normally, a double-word right-shift by a constant produces a double-word
> result, the highpart of which is dead when followed by a truncation.
> The dead code cal
> -----Original Message-----
> From: Richard Biener
> Sent: Monday, November 13, 2023 7:09 AM
> To: Andrew Pinski
> Cc: Tamar Christina ; Prathamesh Kulkarni
> ; gcc-patches@gcc.gnu.org; nd
> ; j...@ventanamicro.com
> Subject: Re: [PATCH v3 2/2]middle-end match.pd: optimize fneg (fabs (x)) to
> c
> -----Original Message-----
> From: Richard Biener
> Sent: Monday, November 13, 2023 6:55 AM
> To: Xi Ruoyao
> Cc: gcc-patches@gcc.gnu.org; chenglulu ;
> i...@xen0n.name; xucheng...@loongson.cn; Tamar Christina
> ; tschwi...@gcc.gnu.org; Roger Sayle
>
> Subject: Re: [PATCH] Fix (fcopysign x, NE
On Fri, 10 Nov 2023, Andrew Pinski wrote:
> On Fri, Nov 10, 2023 at 5:12 AM Richard Biener wrote:
> >
> > On Fri, 10 Nov 2023, Tamar Christina wrote:
> >
> > >
> > > Hi Prathamesh,
> > >
> > > Yes Arm requires SIMD for copysign. The testcases fail because they don't
> > > turn on Neon.
> > >
> >
On Sat, Nov 11, 2023 at 9:36 AM Jakub Jelinek wrote:
>
> Hi!
>
> The following testcase ICEs when dumping details.
> When m_ssa_ranges vector is created, it is safe_grow_cleared (num_ssa_names),
> but when some new SSA_NAME is added, we strangely grow it to
> num_ssa_names + 1 instead and lat
On Fri, Nov 10, 2023 at 6:15 PM Richard Biener
wrote:
>
> On Fri, Nov 10, 2023 at 2:42 AM Haochen Jiang wrote:
> >
> > Hi all,
> >
> > This RFC patch aims to add AVX10.1 options. After we added -m[no-]evex512
> > support, it is a lot easier to add them compared to the August version.
> > Deta
On Sun, Nov 12, 2023 at 9:27 PM Xi Ruoyao wrote:
>
> (fcopysign x, NEGATIVE_CONST) can be simplified to (fneg (fabs x)), but
> a logic error in the code caused it to be mistakenly simplified to (fneg x)
> instead.
OK.
> gcc/ChangeLog:
>
> PR rtl-optimization/112483
> * simplify-rtx.cc
On Sun, Nov 12, 2023 at 12:12 AM Brendan Shanks wrote:
>
> bad-mapper-1.C has been failing since the posix_spawn codepath was added
> to libiberty, adjust the check to accept the changed error message.
>
> Patch has been verified on x86_64 Linux.
OK
> gcc/testsuite:
>
> * g++.dg/modules/
On Sat, Nov 11, 2023 at 4:11 AM Jakub Jelinek wrote:
>
> On Thu, Nov 09, 2023 at 03:27:11PM +0800, Hongtao Liu wrote:
> > On Thu, Nov 9, 2023 at 3:15 PM Hu, Lin1 wrote:
> > >
> > > This patch aims to avoid generating vblendps with ymm16+. It has been
> > > bootstrapped and tested on x86_64-pc-linux-gn
I happened to be browsing the standard a bit later and noticed that we
incorrectly reject the example given below.
Bootstrapped on x86_64-pc-linux-gnu; regtesting ongoing but modules.exp
completed with no errors.
-- >8 --
A typedef doesn't create a new entity, and thus should be allowed to be
ex
This adds the ability to defer the validation of numeric attribute
arguments until the sequence is parsed if the attribute being
handled is one known to be 'clang form'.
We do this by considering the arguments to be strings regardless
of content and deferring the interpretation of those strings until
This implements the handling of the clang-form "availability"
attribute, which is the most important case used in the macOS
SDKs.
PR c++/109877
gcc/ChangeLog:
* config/darwin-protos.h
(darwin_handle_weak_import_attribute): New.
(darwin_handle_availability_att
This adds the ability to defer the validation of numeric attribute
arguments until the sequence is parsed if the attribute being
handled is one known to be 'clang form'.
We do this by considering the arguments to be strings regardless
of content and deferring the interpretation of those strings until
This patch set is not actually particularly new; I have been maintaining
it locally on Darwin branches and it has been tested on several versions
of Darwin both with and without Alex's __has_{feature, extension} patch.
This is one of the three most significant blockers to importing the macOS
SDKs
This patch set is not actually particularly new; I have been maintaining
it locally on Darwin branches and it has been tested on several versions
of Darwin both with and without Alex's __has_{feature, extension} patch.
This is one of the three most significant blockers to importing the macOS
SDKs
On 2023/11/12 at 9:00 AM, Xi Ruoyao wrote:
The GCC internals manual says:
'subreg's of 'subreg's are not supported. Using
'simplify_gen_subreg' is the recommended way to avoid this problem.
Unfortunately, loongarch_expand_vec_cond_mask_expr might create a nested
subreg under certain circumstances, causing
On 2023/11/13 9:11, juzhe.zh...@rivai.ai wrote:
Ah, nice! How configurable are the bit ranges?
I think Lehua's patch is configurable for bit ranges,
since his patch allows targets to flexibly track subreg liveness
according to REGMODE_NATURAL_SIZE
+/* Return true if REGNO is a pseudo and M
On 2023/11/11 at 6:58 PM, Xi Ruoyao wrote:
fld and fst have the same address mode as ld.w and st.w, so the same
optimization as r14-4851 should be applied for them too.
gcc/ChangeLog:
* config/loongarch/loongarch.md (LD_AT_LEAST_32_BIT): New mode
iterator.
(ST_ANY): New mode itera
Updated to v4 at the link below; please ignore v3.
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636216.html
Sorry for the inconvenience.
Pan
-----Original Message-----
From: Li, Pan2
Sent: Sunday, November 12, 2023 10:31 AM
To: Richard Sandiford ; Jeff Law
Cc: gcc-patches@gcc.gnu.org;
From: Pan Li
Update in v4:
* Merge upstream and removed some independent changes.
Update in v3:
* Take known_le instead of known_lt for vector size.
* Return NULL_RTX when gap is not equal to 0 and not constant.
Update in v2:
* Move vector type support to get_stored_val.
Original log:
This patch
Committed, thanks Juzhe.
Pan
From: juzhe.zh...@rivai.ai
Sent: Monday, November 13, 2023 11:11 AM
To: Li, Pan2 ; gcc-patches
Cc: Li, Pan2 ; Wang, Yanzhang ;
kito.cheng
Subject: Re: [PATCH v1] RISC-V: Fix RVV dynamic frm tests failure
OK
juzhe.zh...@rivai.ai
OK
juzhe.zh...@rivai.ai
From: pan2.li
Date: 2023-11-13 11:10
To: gcc-patches
CC: juzhe.zhong; pan2.li; yanzhang.wang; kito.cheng
Subject: [PATCH v1] RISC-V: Fix RVV dynamic frm tests failure
From: Pan Li
The enhancement of mode-switching performs some optimization when
emitting the frm backup ins
From: Pan Li
The enhancement of mode-switching performs some optimization when
emitting the frm backup insn; some redundant fsrm insns are removed
for the following test cases.
This patch adjusts the asm checks for the above optimization.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rv
On 11/12/23 19:16, Jin Ma wrote:
Unfortunately this patch has triggered a bootstrap comparison failure on
loongarch64-linux-gnu: https://gcc.gnu.org/PR112497.
It's also causing simple build failures on other targets. For example
c6x-elf aborts when compiling gcc.c-torture/execute/pr82210 (a
> >
> > Unfortunately this patch has triggered a bootstrap comparison failure on
> > loongarch64-linux-gnu: https://gcc.gnu.org/PR112497.
> It's also causing simple build failures on other targets. For example
> c6x-elf aborts when compiling gcc.c-torture/execute/pr82210 (and others)
> with -O2
On Saturday, November 11, 2023 4:11 AM, Jakub Jelinek wrote:
> On Thu, Nov 09, 2023 at 03:27:11PM +0800, Hongtao Liu wrote:
> > On Thu, Nov 9, 2023 at 3:15 PM Hu, Lin1 wrote:
> > >
> > > This patch aims to avoid generating vblendps with ymm16+. It has been
> > > bootstrapped and tested on x86_64-pc-l
LGTM.
juzhe.zh...@rivai.ai
From: pan2.li
Date: 2023-11-12 21:47
To: gcc-patches
CC: juzhe.zhong; pan2.li; yanzhang.wang; kito.cheng
Subject: [PATCH v1] RISC-V: Support FP l/ll round and rint HF mode autovec
From: Pan Li
This patch would like to support auto-vectorization of the FP APIs below
wi
>> Ah, nice! How configurable are the bit ranges?
I think Lehua's patch is configurable for bit ranges,
since his patch allows targets to flexibly track subreg liveness according to
REGMODE_NATURAL_SIZE
+/* Return true if REGNO is a pseudo and MODE is a multil regs size. */
+bool
+need_track_sub
Hi. Ping for this patch, which is the last optab pattern for RVV support.
The mask_len_strided_load/mask_len_strided_store documentation has been approved:
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635103.html
Bootstrapped and regtested on x86 with no regressions.
Tested on aarch64 with no regressions.
Tested o
> From: Szabolcs Nagy
> Date: Fri, 3 Nov 2023 15:36:08 +
I don't see others commenting on this patch, and you're not
mentioning this aspect, so I wonder:
> * config/aarch64/aarch64.h (EH_RETURN_TAKEN_RTX): Define.
> (EH_RETURN_STACKADJ_RTX): Change to R5.
> (EH_RETURN_HANDL
Sam James writes:
> Alexander Monakov writes:
> [...]
>>
>> I'm very curious what you mean by "this has come up with LLVM [] too":
>> ttbomk,
>> LLVM doesn't do such lifetime-based optimization yet, which is why compiling
>> LLVM with LLVM doesn't break it. Can you share some examples? Or do
This patch addresses PR rtl-optimization/112380, an ICE-on-valid regression
where a (clobber (const_int 0)) encounters a sanity checking gcc_assert
(at line 7554) in simplify-rtx.cc. These CLOBBERs are used internally
by GCC's combine pass much like error_mark_node is used by various
language fro
On Fri, 2023-11-10 at 18:14 -0500, David Malcolm wrote:
> On Fri, 2023-11-10 at 11:02 -0500, Antoni Boucher wrote:
> > Hi.
> > This patch fixes the segfault when using -flto with libgccjit (bug
> > 111396).
> >
> > You mentioned in bugzilla that this didn't fix the reproducer for
> > you,
>
> Rer
This patch improves register pressure during reload, inspired by PR 97756.
Normally, a double-word right-shift by a constant produces a double-word
result, the highpart of which is dead when followed by a truncation.
The dead code calculating the high part gets cleaned up post-reload, so
the issue
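As a rough illustration of the pattern this message describes (not code from the patch), the sketch below shows a double-word right shift by a constant followed by a truncation, next to an equivalent form that never materializes the high part of the shifted value. It assumes a target where 64-bit values occupy two 32-bit words (for example a 32-bit x86 build) and a shift count strictly between 0 and 32; both function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper names, for illustration only.
   Both functions assume 0 < n < 32.  */

/* Double-word shift followed by truncation: the upper word of
   (x >> n) is computed conceptually but never used.  */
uint32_t shift_then_truncate(uint64_t x, unsigned n)
{
    return (uint32_t)(x >> n);
}

/* Equivalent single-word form: only the low word of the shifted
   value is produced, from the low and high input words.  */
uint32_t low_word_only(uint32_t lo, uint32_t hi, unsigned n)
{
    return (lo >> n) | (hi << (32 - n));
}
```

Both functions return the same value when `lo` and `hi` are the two halves of `x`, which is why the high part of the shifted result is dead once the truncation follows.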
(fcopysign x, NEGATIVE_CONST) can be simplified to (fneg (fabs x)), but
a logic error in the code caused it to be mistakenly simplified to (fneg x)
instead.
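
For illustration only (this is not the patch itself), the difference between the two simplifications can be seen at the C level. The sketch assumes IEEE doubles and uses the standard `copysign` and `fabs` from `<math.h>`; the function names are hypothetical and simply mirror the three RTL forms mentioned above.

```c
#include <math.h>

/* Hypothetical helper names, for illustration only.  */

/* Source form: (fcopysign x, NEGATIVE_CONST).  */
double sign_from_negative_const(double x)
{
    return copysign(x, -1.0);
}

/* Intended simplification: (fneg (fabs x)).  */
double neg_of_abs(double x)
{
    return -fabs(x);
}

/* Mistaken simplification: (fneg x).  */
double neg_only(double x)
{
    return -x;
}
```

For a negative input such as -2.0, the first two functions return -2.0 while `neg_only` returns 2.0, so the (fneg x) form is only correct when x is non-negative.
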
gcc/ChangeLog:
PR rtl-optimization/112483
* simplify-rtx.cc (simplify_binary_operation_1) :
Fix the simplification of (fco
On Sun, 2023-11-12 at 11:02 -0700, Jeff Law wrote:
>
>
> On 11/12/23 10:41, Xi Ruoyao wrote:
> > On Sat, 2023-11-11 at 13:12 -0700, Jeff Law wrote:
> > >
> > >
> > > On 8/14/23 05:22, Jin Ma wrote:
> > > > CLOBBER and USE do not represent real instructions, but in the
> > > > process of pipel
On 11/12/23 10:41, Xi Ruoyao wrote:
On Sat, 2023-11-11 at 13:12 -0700, Jeff Law wrote:
On 8/14/23 05:22, Jin Ma wrote:
CLOBBER and USE do not represent real instructions, but in the
process of pipeline optimization, they will wait for transmission
in the ready list like other insns, without
On Sat, 2023-11-11 at 13:12 -0700, Jeff Law wrote:
>
>
> On 8/14/23 05:22, Jin Ma wrote:
> > CLOBBER and USE do not represent real instructions, but in the
> > process of pipeline optimization, they will wait for transmission
> > in the ready list like other insns, without considering resource
> >
The relevant peephole2 will never generate alternative (=m,=&a,0,m) because
operand 1 is not dead before the peephole2 pattern.
gcc/ChangeLog:
* config/i386/i386.md (*stack_protect_set_4s__di):
Remove alternative 0.
Bootstrapped and regression tested on x86_64-pc-linux-gnu {,-m32}.
Uros.
diff -
This patch adds a target-independent aligned_register_operand
predicate, for use with register constraints that use filters
to impose an alignment. The definition deliberately jettisons
some of the historical baggage in general_operand.
gcc/
* common.md (aligned_register_operand): New pred
This patch makes IRA apply register filters when picking hard registers.
All the new code should be optimised away on targets that don't use
register filters. On targets that do use them, the new register_filters
bitfield is expected to be only a handful of bits.
Information about register filter
This patch makes LRA apply register filters. This plus the recog
change is enough for correct code generation, but a follow-on IRA
patch improves the allocation.
All the new code should be optimised away on targets that don't
use register filters. That's because get_register_filter just
wraps "r
The main (but simplest) part of this patch makes constrain_operands
take register filters into account.
The rest of the patch adds register filter information to
operand_alternative. Generally, if two register constraints
have different register filters, it's better if they're in separate
alterna
The main way of enforcing registers to be aligned is through
HARD_REGNO_MODE_OK. But this is a global property that applies
to all operands. A given (regno, mode) pair is either globally
valid or globally invalid.
This patch instead adds a way of specifying that individual operands
must be align
SME has various instructions that require aligned register tuples.
However, the associated tuple modes are already widely used and do
not need to be aligned in other contexts. It therefore isn't
appropriate to force alignment in TARGET_HARD_REGNO_MODE_OK.
There are also strided loads and stores t
From: Pan Li
This patch would like to support auto-vectorization of the FP APIs below
with different type sizes
+-----------+----------+----------+
| API       | RV64     | RV32     |
+-----------+----------+----------+
| lrintf16  | HF => DI | HF => SI |
| llrintf16 | HF => DI | HF => DI |
From: Pan Li
Update in v3:
* Take known_le instead of known_lt for vector size.
* Return NULL_RTX when gap is not equal to 0 and not constant.
Update in v2:
* Move vector type support to get_stored_val.
Original log:
This patch would like to allow the vector mode in the
get_stored_val in the DSE.
Hi Vladimir,
While you're starting your review, please review the v3 version, which fixes
some ICE issues. Thanks.
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636178.html
On 2023/11/12 20:01, Lehua Ding wrote:
Hi Vladimir,
On 2023/11/10 4:24, Vladimir Makarov wrote:
On 11/7/23 22:47
This patch supports tracking the liveness of a subreg in the lra pass, with the
goal of getting it to agree with ira's register allocation scheme. There is some
duplication; maybe in the future this part of the code logic can be harmonized.
gcc/ChangeLog:
* ira-build.cc (setup_pseudos_has_s
This patch relaxes the subreg tracking capability to cover all subreg registers.
gcc/ChangeLog:
* ira-build.cc (get_reg_unit_size): New.
(has_same_nregs): New.
(ira_set_allocno_class): Adjust.
---
gcc/ira-build.cc | 41 -
1 file changed, 36 i
This patch supports tracking subreg liveness. It first extends
ira_object_t objects[2] to std::vector objects,
which can hold more than one object and is used to collect all
accesses via subreg in the program and the partial_in and partial_out
of the basic block live in/out.
Then there is a modificatio
This patch switches the live_reg data in lra to live_subreg data,
and the situation will be more complicated than in ira because
this part of the data is modified in lra also and the live_subreg
data will be recalculated.
gcc/ChangeLog:
* lra-coalesce.cc (update_live_info):
Adjust
A new bug was found in these patches, so I have resent them as a v3 version.
I'm sorry about this.
V3: https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636178.html
On 2023/11/12 17:58, Lehua Ding wrote:
Hi,
These patches try to support the subreg coalesce feature in
register allocation passes (ira and lra).
L
This patch changes the previous way of creating a copy between allocnos to
objects.
gcc/ChangeLog:
* ira-build.cc (find_allocno_copy): Removed.
(find_object): New.
(ira_create_copy): Adjust.
(add_allocno_copy_to_list): Adjust.
(swap_allocno_copy_ends_if_ne
This patch adds a live_subreg problem to extend the original live_reg to
track the liveness of subregs. We will only try to track pseudo registers
whose mode size is a multiple of the natural size and in which eventually a
small portion of the inside appears to use subreg. With live_reg problem, live_subreg
p
This patch switches the use of live_reg data to live_subreg data.
gcc/ChangeLog:
* ira-build.cc (create_bb_allocnos): Switch.
(create_loop_allocnos): Ditto.
* ira-color.cc (ira_loop_edge_freq): Ditto.
* ira-emit.cc (generate_edge_moves): Ditto.
(add_ranges_an
V3 Changes:
1. Fix three ICEs.
2. Rebase.
Hi,
These patches try to support the subreg coalesce feature in
register allocation passes (ira and lra).
Let's consider a RISC-V program (https://godbolt.org/z/ec51d91aT):
```
#include
void
foo (int32_t *in, int32_t *out, size_t m)
{
vint32m2_t result
Hi Vladimir,
On 2023/11/10 4:24, Vladimir Makarov wrote:
On 11/7/23 22:47, Lehua Ding wrote:
Lehua Ding (7):
ira: Refactor the handling of register conflicts to make it more
general
ira: Add live_subreg problem and apply to ira pass
ira: Support subreg live range track
ira: S
I think the error message is still a little bit unclear but I couldn't
come up with something clearer that was similarly concise and matching
the existing style.
(Also I noticed that the linked PR was assigned to Nathan but there
hadn't been activity for a while, and I've been looking into these k
钟居哲 writes:
> Hi, Richard.
>
>>> Maybe dead lanes are better tracked at the gimple level though, not sure.
>>> (But AArch64 might need to lower lane operations more than it does now if
>>> we want gimple to handle it.)
>
> We were trying to address this issue at the GIMPLE level at the beginning.
> Tra
Excerpts from David Malcolm's message of November 10, 2023 10:42 pm:
> gcc/d/ChangeLog:
> * lang.opt.urls: New file, autogenerated by
> regenerate-opt-urls.py.
> ---
> gcc/d/lang.opt.urls | 95 +
> create mode 100644 gcc/d/lang.opt.urls
>
[abridged view of
Hi Dimitar,
I solved the problem you reported in the V2 patch
(https://gcc.gnu.org/pipermail/gcc-patches/2023-November/636166.html),
is it possible for you to help confirm this? Thank you very much.
On 2023/11/9 0:56, Dimitar Dimitrov wrote:
On Wed, Nov 08, 2023 at 11:47:33AM +0800, Lehua Ding w
This patch switches the live_reg data in lra to live_subreg data,
and the situation will be more complicated than in ira because
this part of the data is modified in lra also and the live_subreg
data will be recalculated.
gcc/ChangeLog:
* lra-coalesce.cc (update_live_info):
Adjust
This patch adds a live_subreg problem to extend the original live_reg to
track the liveness of subregs. We will only try to track pseudo registers
whose mode size is a multiple of the natural size and in which eventually a
small portion of the inside appears to use subreg. With live_reg problem, live_subreg
p
This patch relaxes the subreg tracking capability to cover all subreg registers.
gcc/ChangeLog:
* ira-build.cc (get_reg_unit_size): New.
(has_same_nregs): New.
(ira_set_allocno_class): Adjust.
---
gcc/ira-build.cc | 41 -
1 file changed, 36 i
This patch supports tracking the liveness of a subreg in the lra pass, with the
goal of getting it to agree with ira's register allocation scheme. There is some
duplication; maybe in the future this part of the code logic can be harmonized.
gcc/ChangeLog:
* ira-build.cc (setup_pseudos_has_s
Hi,
These patches try to support the subreg coalesce feature in
register allocation passes (ira and lra).
Let's consider a RISC-V program (https://godbolt.org/z/ec51d91aT):
```
#include
void
foo (int32_t *in, int32_t *out, size_t m)
{
vint32m2_t result = __riscv_vle32_v_i32m2 (in, 32);
vint32m1
This patch changes the previous way of creating a copy between allocnos to
objects.
gcc/ChangeLog:
* ira-build.cc (find_allocno_copy): Removed.
(find_object): New.
(ira_create_copy): Adjust.
(add_allocno_copy_to_list): Adjust.
(swap_allocno_copy_ends_if_ne
This patch supports tracking subreg liveness. It first extends
ira_object_t objects[2] to std::vector objects,
which can hold more than one object and is used to collect all
accesses via subreg in the program and the partial_in and partial_out
of the basic block live in/out.
Then there is a modificatio
This patch switches the use of live_reg data to live_subreg data.
gcc/ChangeLog:
* ira-build.cc (create_bb_allocnos): Switch.
(create_loop_allocnos): Ditto.
* ira-color.cc (ira_loop_edge_freq): Ditto.
* ira-emit.cc (generate_edge_moves): Ditto.
(add_ranges_an
Alexander Monakov writes:
> On Sat, 11 Nov 2023, Sam James wrote:
>
>> > Valgrind client requests are offered as macros that emit inline asm. For
>> > use
>> > in code generation, we need to wrap it in a built-in. We know that
>> > implementing
>> > such a built-in in libgcc is undesirable,
On Sat, 11 Nov 2023, Sam James wrote:
> > Valgrind client requests are offered as macros that emit inline asm. For
> > use
> > in code generation, we need to wrap it in a built-in. We know that
> > implementing
> > such a built-in in libgcc is undesirable, [...].
>
> Perhaps less objectiona
On Sat, 11 Nov 2023, Arsen Arsenović wrote:
> > +#else
> > +# define VALGRIND_MAKE_MEM_UNDEFINED(ptr, sz) __builtin_trap ()
> > +#endif
> > +
> > +void __valgrind_make_mem_undefined (void *ptr, unsigned long sz)
> > +{
> > + VALGRIND_MAKE_MEM_UNDEFINED (ptr, sz);
> > +}
>
> Would it be preferab
77 matches