RE: [PATCH v1] Widening-Mul: Take gsi after_labels instead of start_bb for gcall insertion

2024-06-11 Thread Li, Pan2
Committed, thanks Richard. Pan -Original Message- From: Richard Biener Sent: Wednesday, June 12, 2024 2:41 PM To: Li, Pan2 Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; jeffreya...@gmail.com Subject: Re: [PATCH v1] Widening-Mul: Take gsi after_labels instead

Re: [PATCH v2] Target-independent store forwarding avoidance.

2024-06-11 Thread Richard Biener
On Tue, 11 Jun 2024, Jeff Law wrote: > > > On 6/11/24 7:52 AM, Philipp Tomsich wrote: > > On Tue, 11 Jun 2024 at 15:37, Jeff Law wrote: > >> > >> > >> > >> On 6/11/24 1:22 AM, Richard Biener wrote: > >> > Absolutely. But forwarding from a smaller store to a wider load is > painful >

Re: [PATCH v1] Widening-Mul: Take gsi after_labels instead of start_bb for gcall insertion

2024-06-11 Thread Richard Biener
On Tue, Jun 11, 2024 at 3:53 PM wrote: > > From: Pan Li > > We inserted the gcall of .SAT_ADD before the gsi_start_bb for avoiding > the ssa def after use ICE issue. Unfortunately, there will be the > potential ICE when the first stmt is label. We cannot insert the gcall > before the label. T

Re: [PATCH 3/3] Add power11 tests

2024-06-11 Thread Kewen.Lin
Hi Mike, on 2024/6/4 09:46, Michael Meissner wrote: > This patch adds some simple tests for -mcpu=power11 support. In order to run > these tests, you need an assembler that supports the appropriate option for > supporting the Power11 processor (-mpower11 under Linux or -mpwr11 under AIX). > > I

Re: [PATCH] LoongArch: Fix mode size comparision in loongarch_expand_conditional_move

2024-06-11 Thread Lulu Cheng
On 2024/6/12 at 11:06 AM, Xi Ruoyao wrote: We were comparing a mode size with word_mode, but word_mode is an enum value thus this does not really make any sense. (Un)luckily E_DImode happens to be 8 so this seemed to work, but let's make it correct so it won't blow up when we add LA32 support or add a

Re: [Patch-2v2, rs6000] Eliminate unnecessary byte swaps for duplicated constant vector store [PR113325]

2024-06-11 Thread Kewen.Lin
Hi Haochen, on 2024/6/12 10:47, HAO CHEN GUI wrote: > Hi, > This patch creates an insn_and_split pattern which helps the duplicated > constant vector replace the source pseudo of store insn in fwprop pass. > Thus the store can be implemented by a single stxvd2x and it eliminates the > unnecessar

[PATCH v2] [testsuite] add linkonly to dg-additional-sources [PR115295]

2024-06-11 Thread Alexandre Oliva
On Jun 11, 2024, Andrew Pinski wrote: > I think we should just fully revert the changes to > dg-additional-sources and add an explicit `dg-do run` to pr95401.cc I don't suppose an explicit "dg-do run" would make things work reliably, after we've detected that hardware or runtime support for vect

Re: [PATCH v3 2/2] C++: Support constexpr strings for asm statements

2024-06-11 Thread Xi Ruoyao
On Tue, 2024-06-11 at 20:53 -0700, Andi Kleen wrote: > > > -Some assemblers allow semicolons as a line separator. However, > > > -note that some assembler dialects use semicolons to start a comment. > > > +Some assemblers allow semicolons as a line separator. However, > > > +note that some assemble

[PATCH] match: Improve gimple_bitwise_equal_p and gimple_bitwise_inverted_equal_p for truncating casts [PR115449]

2024-06-11 Thread Andrew Pinski
As mentioned by Jeff in r15-831-g05daf617ea22e1d818295ed2d037456937e23530, we don't handle `(X | Y) & ~Y` -> `X & ~Y` on the gimple level when there are some different signed (but same precision) types dealing with matching `~Y` with the `Y` part. This improves both gimple_bitwise_equal_p and gim
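The identity behind this simplification can be checked with a small standalone C sketch. This is not code from the patch and does not touch GCC internals; the function names are hypothetical, and the signed detour for `~Y` only mimics the "different signed, same precision" situation the message describes. The conversion from `uint32_t` to `int32_t` is implementation-defined for large values; GCC wraps modulo 2^32.

```c
#include <assert.h>
#include <stdint.h>

/* Unsimplified form: (X | Y) & ~Y, with ~Y computed in a signed type of
   the same precision and converted back (hypothetical illustration). */
uint32_t before_simplify(uint32_t x, uint32_t y)
{
    int32_t not_y = ~(int32_t)y;   /* implementation-defined for y > INT32_MAX */
    return (x | y) & (uint32_t)not_y;
}

/* Simplified form: X & ~Y. */
uint32_t after_simplify(uint32_t x, uint32_t y)
{
    return x & ~y;
}

/* Assert that both forms agree on a handful of sample operands. */
void check_identity(void)
{
    static const uint32_t samples[] = {
        0u, 1u, 0x30u, 0xF0u, 0x7FFFFFFFu, 0x80000000u, 0xFFFFFFFFu
    };
    const unsigned n = sizeof samples / sizeof samples[0];
    for (unsigned i = 0; i < n; i++)
        for (unsigned j = 0; j < n; j++)
            assert(before_simplify(samples[i], samples[j])
                   == after_simplify(samples[i], samples[j]));
}
```

The bits that `Y` contributes to `X | Y` are exactly the bits that `~Y` masks away, so dropping the `| Y` cannot change the result.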

Re: [PATCH v3 2/2] C++: Support constexpr strings for asm statements

2024-06-11 Thread Andi Kleen
Hi Jason, Sorry I must have misunderstood you. I thought the patch was already approved earlier and I did commit. I can revert or do additional changes. On Tue, Jun 11, 2024 at 04:31:30PM -0400, Jason Merrill wrote: > > + if (tok->type == CPP_OPEN_PAREN) > > +{ > > + matching_parens p

Re: [PATCH V4 1/2] split complicate 64bit constant to memory

2024-06-11 Thread Jiufu Guo
Hi Segher, Thanks a lot for your review! Segher Boessenkool writes: > Hi! > > On Tue, Jun 11, 2024 at 04:37:25PM +0800, Jiufu Guo wrote: >> Sometimes, a complicated constant is built via 3(or more) >> instructions. Generally speaking, it would not be as fast >> as loading it from the constan

[PATCH] [APX CCMP] Use ctestcc when comparing to const 0

2024-06-11 Thread Hongyu Wang
Hi, For CTEST, we don't have conditional AND so there's no optimization opportunity to write a new ctest pattern. Emit ctest when ccmp did comparison to const 0 to save bytes. Bootstrapped & regtested under x86-64-pc-linux-gnu. Ok for trunk? gcc/ChangeLog: * config/i386/i386.md (@ccmp)

[PATCH] LoongArch: Fix mode size comparision in loongarch_expand_conditional_move

2024-06-11 Thread Xi Ruoyao
We were comparing a mode size with word_mode, but word_mode is an enum value thus this does not really make any sense. (Un)luckily E_DImode happens to be 8 so this seemed to work, but let's make it correct so it won't blow up when we add LA32 support or add another machine mode... gcc/ChangeLog:

Re: [PATCH-1v3] fwprop: Replace rtx_cost with insn_cost in try_fwprop_subst_pattern [PR113325]

2024-06-11 Thread HAO CHEN GUI
Missing CC to Jeff Law. Sorry. On 2024/6/12 10:41, HAO CHEN GUI wrote: > Hi, > This patch replaces rtx_cost with insn_cost in forward propagation. > In the PR, one constant vector should be propagated and replace a > pseudo in a store insn if we know it's a duplicated constant vector. > It reduces t

[Patch-2v2, rs6000] Eliminate unnecessary byte swaps for duplicated constant vector store [PR113325]

2024-06-11 Thread HAO CHEN GUI
Hi, This patch creates an insn_and_split pattern which helps the duplicated constant vector replace the source pseudo of store insn in fwprop pass. Thus the store can be implemented by a single stxvd2x and it eliminates the unnecessary byte swap insn on P8 LE. The test case shows the optimization

Re: [PATCH] [testsuite] add linkonly to dg-additional-sources [PR115295]

2024-06-11 Thread Andrew Pinski
On Tue, Jun 11, 2024 at 7:03 PM Alexandre Oliva wrote: > > > The D testsuite shows it was a mistake to assume that > dg-additional-sources are never to be used for compilation tests. > Even if an output file is specified for compilation, extra module > files can be named and used in the compilatio

[PATCH-1v3] fwprop: Replace rtx_cost with insn_cost in try_fwprop_subst_pattern [PR113325]

2024-06-11 Thread HAO CHEN GUI
Hi, This patch replaces rtx_cost with insn_cost in forward propagation. In the PR, one constant vector should be propagated and replace a pseudo in a store insn if we know it's a duplicated constant vector. It reduces the insn cost but not rtx cost. In this case, the cost is determined by destina

[FYI] map packed field type to unpacked for debug info

2024-06-11 Thread Alexandre Oliva
We create a distinct type for each field in a packed record with a gnu_size, but there is no distinct debug information for them. Use the same unpacked type for debug information. Regstrapped on x86_64-linux-gnu. Pre-approved by Eric. I'm checking it in. for gcc/ada/ChangeLog * gc

[PATCH] [testsuite] add linkonly to dg-additional-sources [PR115295]

2024-06-11 Thread Alexandre Oliva
The D testsuite shows it was a mistake to assume that dg-additional-sources are never to be used for compilation tests. Even if an output file is specified for compilation, extra module files can be named and used in the compilation without being flagged as errors. Introduce a 'linkonly' flag fo

[PATCH] [libstdc++] [testsuite] require cmath for c++23 cmath tests

2024-06-11 Thread Alexandre Oliva
Some c++23 tests fail on targets that don't satisfy dg-require-cmath, because referenced math functions don't get declared in std. Add the missing requirement. Regstrapping on x86_64-linux-gnu. Already successfully tested with gcc-13 on aarch64-rtems, where it avoids the errors that come up be

[PATCH] [libstdc++] [testsuite] xfail double-prec from_chars for float128_t

2024-06-11 Thread Alexandre Oliva
Tests involving float128_t were xfailed or otherwise worked around for vxworks on aarch64. The same issue came up on rtems. This patch adjusts them similarly. Regstrapping on x86_64-linux-gnu. Also tested with gcc-13 on aarch64-rtems6. Ok to install? (I'd have expected the fast_float limita

[PATCH] aarch64: Use bitreverse rtl code instead of unspec [PR115176]

2024-06-11 Thread Andrew Pinski
Bitreverse rtl code was added with r14-1586-g6160572f8d243c. So let's use it instead of an unspec. This is just a small cleanup but it does have one small fix with respect to rtx costs which didn't handle vector modes correctly for the UNSPEC and now it does. This is part of the first step in addin

[committed] c: Add -std=c2y, -std=gnu2y, -Wc23-c2y-compat, C2Y _Generic with type operand

2024-06-11 Thread Joseph Myers
The first new C2Y feature, _Generic where the controlling operand is a type name rather than an expression (as defined in N3260), was voted into C2Y today. (In particular, this form of _Generic allows distinguishing qualified and unqualified versions of a type.) This feature also includes allowin

Re: [committed] [v2] More logical op simplifications in simplify-rtx.cc

2024-06-11 Thread Andrew Pinski
On Sat, May 25, 2024 at 11:42 AM Jeff Law wrote: > > This is a revamp of what started as a target specific patch. > > Basically xalan (corrected, I originally thought it was perlbench) has a > bitset implementation with a bit of an oddity. Specifically setBit will > clear the bit before it is set

Re: [PATCH v2] fix PowerPC < 7 w/ Altivec not to default to power7

2024-06-11 Thread René Rebe
Hi! > On Jun 12, 2024, at 00:15, Segher Boessenkool > wrote: > > Hi! > > What does "powerpc < 7" mean? Something before POWER ISA 2.06? PowerPC ISA level 7 or whatever you like to call it. > On Tue, Jun 11, 2024 at 04:22:54PM +0200, Rene Rebe wrote: >> Glibc uses .machine to determine assem

[PATCH 2/2] RISC-V: Move mode assertion out of conditional branch in emit_insn

2024-06-11 Thread Edwin Lu
When emitting insns, we have an early assertion to ensure the input operand's mode and the expanded operand's mode are the same; however, it does not perform this check if the pattern does not have an explicit machine mode specifying the operand. In this scenario, it will always assume that mode =

[PATCH 1/2] RISC-V: Fix vwsll combine on rv32 targets

2024-06-11 Thread Edwin Lu
On rv32 targets, vwsll_zext1_scalar_ would trigger an ICE in maybe_legitimize_operand when zero extending a uint32 to uint64 due to a mismatch between the input operand's mode (DI) and the expanded insn operand's mode (Pmode == SI). Ensure that the modes of the operands match. gcc/ChangeLog:

[PATCH 0/2] Fix ICE with vwsll combine on 32bit targets

2024-06-11 Thread Edwin Lu
The following testcases have been failing on rv32 targets since r15-953-gaf4bf422a69: FAIL: gcc.target/riscv/rvv/autovec/binop/vwsll-1.c (internal compiler error: in maybe_legitimize_operand, at optabs.cc:8056) FAIL: gcc.target/riscv/rvv/autovec/binop/vwsll-1.c (test for excess errors) Fix the b

Re: [PATCH v2] fix PowerPC < 7 w/ Altivec not to default to power7

2024-06-11 Thread Segher Boessenkool
Hi! What does "powerpc < 7" mean? Something before POWER ISA 2.06? On Tue, Jun 11, 2024 at 04:22:54PM +0200, Rene Rebe wrote: > Glibc uses .machine to determine assembler optimizations to use. What does this mean? .machine is an *output* for glibc; nothing in glibc reads source code. Nothing

[pushed] doc: Remove redundant introduction of x86-64

2024-06-11 Thread Gerald Pfeifer
The same sentence as in the x86_64-*-solaris2* section is in the x86_64-*-* section directly above. gcc: PR target/69374 * doc/install.texi (Specific) : Remove redundant introduction of x86-64. --- gcc/doc/install.texi | 2 -- 1 file changed, 2 deletions(-) diff --git a/g

Re: [PATCH] Improve code generation of strided SLP loads

2024-06-11 Thread Richard Sandiford
Richard Biener writes: > This avoids falling back to elementwise accesses for strided SLP > loads when the group size is not a multiple of the vector element > size. Instead we can use a smaller vector or integer type for the load. > > For stores we can do the same though restrictions on stores w

Re: [PATCH] tree-optimization/115385 - handle more gaps with peeling of a single iteration

2024-06-11 Thread Richard Sandiford
Don't think it makes any difference, but: Richard Biener writes: > @@ -2151,7 +2151,16 @@ get_group_load_store_type (vec_info *vinfo, > stmt_vec_info stmt_info, >access excess elements. >??? Enhancements include peeling multiple iterations >

Re: [patch, rs6000, middle-end 0/1] v1: Add implementation for different targets for pair mem fusion

2024-06-11 Thread Richard Sandiford
Ajit Agarwal writes: > Hello Richard: > > On 11/06/24 9:41 pm, Richard Sandiford wrote: >> Ajit Agarwal writes: > Thanks a lot. Can I know what should we be doing with neg (fma) > correctness failures with load fusion. I think it would involve: - describing lxvp and st

Re: [PATCH] ifcvt: Clarify if_info.original_cost.

2024-06-11 Thread Richard Sandiford
Robin Dapp writes: >> I was looking at the code in more detail and just wanted to check. >> We have: >> >> int last_needs_comparison = -1; >> >> bool ok = noce_convert_multiple_sets_1 >> (if_info, &need_no_cmov, &rewired_src, &targets, &temporaries, >> &unmodified_insns, &last_needs

Re: [PATCH v3 2/2] C++: Support constexpr strings for asm statements

2024-06-11 Thread Jason Merrill
On 6/5/24 00:45, Andi Kleen wrote: Some programming styles use a lot of inline assembler, and it is common to use very complex preprocessor macros to generate the assembler strings for the asm statements. In C++ there would be a typesafe alternative using templates and constexpr to generate the as

Re: [PUSHED] Fix building JIT with musl libc [PR115442]

2024-06-11 Thread Andrew Pinski
On Tue, Jun 11, 2024 at 12:42 PM Andrew Pinski wrote: > > Just like r13-6662-g0e6f87835ccabf but this time for jit/jit-recording.cc. > > Pushed as obvious after a quick build to make sure jit still builds. Backported also to GCC 14 and GCC 13. Thanks, Andrew > > gcc/jit/ChangeLog: > > *

[PUSHED] Fix building JIT with musl libc [PR115442]

2024-06-11 Thread Andrew Pinski
Just like r13-6662-g0e6f87835ccabf but this time for jit/jit-recording.cc. Pushed as obvious after a quick build to make sure jit still builds. gcc/jit/ChangeLog: * jit-recording.cc: Define INCLUDE_SSTREAM before including system.h and don't directly include sstream. Signed-off-

Re: [PATCH] ifcvt: Clarify if_info.original_cost.

2024-06-11 Thread Robin Dapp
> I was looking at the code in more detail and just wanted to check. > We have: > > int last_needs_comparison = -1; > > bool ok = noce_convert_multiple_sets_1 > (if_info, &need_no_cmov, &rewired_src, &targets, &temporaries, > &unmodified_insns, &last_needs_comparison); > if (!ok) >

Re: [patch, rs6000, middle-end 0/1] v1: Add implementation for different targets for pair mem fusion

2024-06-11 Thread Ajit Agarwal
Hello Richard: On 11/06/24 9:41 pm, Richard Sandiford wrote: > Ajit Agarwal writes: Thanks a lot. Can I know what should we be doing with neg (fma) correctness failures with load fusion. >>> >>> I think it would involve: >>> >>> - describing lxvp and stxvp as unspec patterns, as I menti

[PATCH v4 6/6] opts: allow any combination of DWARF, CTF, BTF

2024-06-11 Thread David Faust
Previously it was not supported to generate both CTF and BTF debug info in the same compiler run, as both formats made incompatible changes to the same internal data structures. With the structural change in the prior patches, in particular the guarantee that CTF will always be fully emitted befor

[PATCH v4 2/6] ctf: use pointers instead of IDs internally

2024-06-11 Thread David Faust
This patch replaces all inter-type references in the ctfc internal data structures with pointers, rather than the references-by-ID which were used previously. A couple of small updates in the BPF backend are included to make it compatible with the change. This change is only to the in-memory repr

[PATCH v4 4/6] btf: add -gprune-btf option

2024-06-11 Thread David Faust
This patch adds a new option, -gprune-btf, to control BTF debug info generation. As the name implies, this option enables a kind of "pruning" of the BTF information before it is emitted. When enabled, rather than emitting all type information translated from DWARF, only information for types dire

[PATCH v4 3/6] btf: refactor and simplify implementation

2024-06-11 Thread David Faust
This patch heavily refactors btfout.cc to take advantage of the structural changes in the prior commits. Now that inter-type references are internally stored as simply pointers, all the painful, brittle, confusing infrastructure that was used in the process of converting CTF type IDs to BTF type I

[PATCH v4 5/6] bpf,btf: enable BTF pruning by default for BPF

2024-06-11 Thread David Faust
This patch enables -gprune-btf by default in the BPF backend when generating BTF information, and fixes BPF CO-RE generation when using -gprune-btf. When generating BPF CO-RE information, we must ensure that types used in CO-RE relocations always have sufficient BTF information emitted so that the

[PATCH v4 1/6] ctf, btf: restructure CTF/BTF emission

2024-06-11 Thread David Faust
This commit makes some structural changes to the CTF/BTF debug info emission. In particular: a) CTF is now always fully generated and emitted before any BTF-related procedures are run. This means that BTF-related functions can change, even irreversibly, the shared in-memory represen

[PATCH v4 0/6] btf: refactor and add pruning option

2024-06-11 Thread David Faust
[v3: https://gcc.gnu.org/pipermail/gcc-patches/2024-May/653165.html Changes from v3: - Address typos, comment fixes and other minor nits pointed out by Indu in patches 1-3 and 5. - Rename option added in patch 4 from -fprune-btf to -gprune-btf. - Reword commit message in patch 4 to better de

Re: [PATCH v2 2/3] RISC-V: Add Zalrsc and Zaamo testsuite support

2024-06-11 Thread Patrick O'Neill
On 6/10/24 09:39, Patrick O'Neill wrote: On 6/7/24 16:04, Jeff Law wrote: On 6/3/24 3:53 PM, Patrick O'Neill wrote: Convert testsuite infrastructure to use Zalrsc and Zaamo rather than A. gcc/testsuite/ChangeLog: * gcc.target/riscv/amo-table-a-6-amo-add-1.c: Use Zaamo rather than A.

[PATCH 3/3] RISC-V: Allow any temp register to be used in amo tests

2024-06-11 Thread Patrick O'Neill
We artificially restrict the temp registers to be a[0-9]+ when other registers like t[0-9]+ are valid too. Update to make the regex accept any register for the temp value. gcc/testsuite/ChangeLog: * gcc.target/riscv/amo/amo-table-a-6-load-1.c: Update temp register regex. * gcc.tar

[PATCH 1/3] RISC-V: Move amo tests into subfolder

2024-06-11 Thread Patrick O'Neill
There's a large number of atomic related testcases in the riscv folder. Move them into a subfolder similar to what was done for rvv testcases. gcc/testsuite/ChangeLog: * gcc.target/riscv/amo-table-a-6-amo-add-1.c: Move to... * gcc.target/riscv/amo/amo-table-a-6-amo-add-1.c: ...her

[PATCH 2/3] RISC-V: Fix amoadd call arguments

2024-06-11 Thread Patrick O'Neill
Update __atomic_add_fetch arguments to be a pointer and value rather than two pointers. gcc/testsuite/ChangeLog: * gcc.target/riscv/amo/amo-table-a-6-amo-add-1.c: Update __atomic_add_fetch args. * gcc.target/riscv/amo/amo-table-a-6-amo-add-2.c: Ditto. * gcc.target/
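For readers unfamiliar with the builtin, a minimal sketch of the argument shape the fix refers to: GCC's `__atomic_add_fetch` takes a pointer to the object, the value to add, and a memory order. The wrapper name `add_and_fetch` and the choice of `__ATOMIC_SEQ_CST` are assumptions for illustration; this is not taken from the test files named in the ChangeLog.

```c
#include <assert.h>

/* Atomically add `delta` to `*counter` and return the updated value.
   The first argument is a pointer, the second a plain value
   (hypothetical wrapper; memory order chosen for illustration). */
int add_and_fetch(int *counter, int delta)
{
    return __atomic_add_fetch(counter, delta, __ATOMIC_SEQ_CST);
}

/* Small self-check of the wrapper. */
void check_add_and_fetch(void)
{
    int counter = 40;
    assert(add_and_fetch(&counter, 2) == 42);
    assert(counter == 42);
}
```

Passing a second pointer where the value belongs would add the pointer's converted value rather than the pointed-to data, which is presumably the kind of mistake the patch cleans up.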

[PATCH 0/3] RISC-V: Amo testsuite cleanup

2024-06-11 Thread Patrick O'Neill
This series moves the atomic-related riscv testcases into their own folder and fixes some minor bugs/rigidity of existing testcases. Patrick O'Neill (3): RISC-V: Move amo tests into subfolder RISC-V: Fix amoadd call arguments RISC-V: Allow any temp register to be used in amo tests .../risc

Re: [PATCH v3 0/3] RISC-V: Add basic Zaamo and Zalrsc support

2024-06-11 Thread Patrick O'Neill
On 6/10/24 21:32, Jeff Law wrote: On 6/10/24 6:15 PM, Andrea Parri wrote: On Mon, Jun 10, 2024 at 02:46:54PM -0700, Patrick O'Neill wrote: The A extension has been split into two parts: Zaamo and Zalrsc. This patch adds basic support by making the A extension imply Zaamo and Zalrsc. Zaamo/

Re: [PATCH] PHIOPT: Don't transform minmax if middle bb contains a phi [PR115143]

2024-06-11 Thread Andrew Pinski
On Mon, May 20, 2024 at 11:08 PM Richard Biener wrote: > > On Mon, May 20, 2024 at 11:37 PM Andrew Pinski (QUIC) > wrote: > > > > > -Original Message- > > > From: Richard Biener > > > Sent: Sunday, May 19, 2024 11:55 AM > > > To: Andrew Pinski (QUIC) > > > Cc: gcc-patches@gcc.gnu.org >

[Committed] RISC-V: Add basic Zaamo and Zalrsc support

2024-06-11 Thread Patrick O'Neill
On 6/10/24 21:33, Jeff Law wrote: On 6/10/24 3:46 PM, Patrick O'Neill wrote: The A extension has been split into two parts: Zaamo and Zalrsc. This patch adds basic support by making the A extension imply Zaamo and Zalrsc. Zaamo/Zalrsc spec: https://github.com/riscv/riscv-zaamo-zalrsc/tags R

[Committed 2/3] RISC-V: Add Zalrsc and Zaamo testsuite support

2024-06-11 Thread Patrick O'Neill
Convert testsuite infrastructure to use Zalrsc and Zaamo rather than A. gcc/ChangeLog: * doc/sourcebuild.texi: Add docs for atomic extension testsuite infra. gcc/testsuite/ChangeLog: * gcc.target/riscv/amo-table-a-6-amo-add-1.c: Use Zaamo rather than A. * gcc.target/risc

[Committed 3/3] RISC-V: Add Zalrsc amo-op patterns

2024-06-11 Thread Patrick O'Neill
All amo patterns can be represented with lrsc sequences. Add these patterns as a fallback when Zaamo is not enabled. gcc/ChangeLog: * config/riscv/sync.md (atomic_): New expand pattern. (amo_atomic_): Rename amo pattern. (atomic_fetch_): New lrsc sequence pattern.

[Committed 1/3] RISC-V: Add basic Zaamo and Zalrsc support

2024-06-11 Thread Patrick O'Neill
From: Edwin Lu There is a proposal to split the A extension into two parts: Zaamo and Zalrsc. This patch adds basic support by making the A extension imply Zaamo and Zalrsc. Proposal: https://github.com/riscv/riscv-zaamo-zalrsc/tags gcc/ChangeLog: * common/config/riscv/riscv-common.cc:

Ping [PATCH] aarch64: Add vector popcount besides QImode [PR113859]

2024-06-11 Thread Pengxuan Zheng (QUIC)
Ping https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650311.html > -Original Message- > From: Pengxuan Zheng (QUIC) > Sent: Tuesday, April 30, 2024 5:32 PM > To: gcc-patches@gcc.gnu.org > Cc: Andrew Pinski (QUIC) ; Pengxuan Zheng > (QUIC) > Subject: [PATCH] aarch64: Add vector popcoun

RE: [PATCH] aarch64: Add vector floating point trunc pattern

2024-06-11 Thread Pengxuan Zheng (QUIC)
> Pengxuan Zheng writes: > > This patch is a follow-up of r15-1079-g230d62a2cdd16c to add vector > > floating point trunc pattern for V2DF->V2SF and V4SF->V4HF conversions > > by renaming the existing > > aarch64_float_truncate_lo_ pattern to the standard > > optab one, i.e., trunc2. This allows t

[committed] i386: Use CMOV in .SAT_{ADD|SUB} expansion for TARGET_CMOV [PR112600]

2024-06-11 Thread Uros Bizjak
For TARGET_CMOV targets emit insn sequence involving conditional move. .SAT_ADD: addl %esi, %edi movl $-1, %eax cmovnc %edi, %eax ret .SAT_SUB: subl %esi, %edi movl $0, %eax cmovnc %edi, %eax ret PR target/112600 gc
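As a rough guide to what the two sequences above compute, here is a C sketch of unsigned saturating add and subtract on 32-bit values. The function names are hypothetical, and the sketch makes no claim about which source forms GCC recognizes as .SAT_ADD or .SAT_SUB; it only mirrors the add/sub, carry test, and select structure of the assembly.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned saturating add: on carry, the result clamps to all ones
   (the movl $-1 / cmovnc pair in the listing). */
uint32_t sat_add_u32(uint32_t x, uint32_t y)
{
    uint32_t sum = x + y;              /* wraps modulo 2^32 on overflow */
    return sum < x ? UINT32_MAX : sum;
}

/* Unsigned saturating subtract: on borrow, the result clamps to zero
   (the movl $0 / cmovnc pair in the listing). */
uint32_t sat_sub_u32(uint32_t x, uint32_t y)
{
    uint32_t diff = x - y;             /* wraps modulo 2^32 on borrow */
    return x < y ? 0u : diff;
}

/* Small self-check covering the saturating and non-saturating cases. */
void check_sat_ops(void)
{
    assert(sat_add_u32(1u, 2u) == 3u);
    assert(sat_add_u32(UINT32_MAX, 1u) == UINT32_MAX);
    assert(sat_sub_u32(5u, 2u) == 3u);
    assert(sat_sub_u32(1u, 2u) == 0u);
}
```

In both functions the comparison plays the role of the carry flag: `cmovnc` keeps the arithmetic result when no carry or borrow occurred and otherwise leaves the preloaded clamp value in place.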

Re: [PATCH V4 1/2] split complicate 64bit constant to memory

2024-06-11 Thread Segher Boessenkool
Hi! On Tue, Jun 11, 2024 at 04:37:25PM +0800, Jiufu Guo wrote: > Sometimes, a complicated constant is built via 3(or more) > instructions. Generally speaking, it would not be as fast > as loading it from the constant pool (as the discussions in > PR63281): > "ld" is one instruction. If consider

Re: [RFC 1/2] libbacktrace: add FDPIC support

2024-06-11 Thread Max Filippov
On Sun, May 26, 2024 at 11:50 PM Max Filippov wrote: > > Instead of a single base address FDPIC ELF files use load map: a > structure with an array of mappings for individual segments. Change > libbacktrace functions and structures to support that. Ping? > libbacktrace/ > > PR libbacktr

[PATCH v2] Arm: Fix ldrd offset range [PR115153]

2024-06-11 Thread Wilco Dijkstra
v2: use a new arm_arch_v7ve_neon, fix use of DImode in output_move_neon The valid offset range of LDRD in arm_legitimate_index_p is increased to -1024..1020 if NEON is enabled since VALID_NEON_DREG_MODE includes DImode. Fix this by moving the LDRD check earlier. Passes bootstrap & regress, OK for

[PATCH v2] Arm: Fix disassembly error in Thumb-1 relaxed load/store [PR115188]

2024-06-11 Thread Wilco Dijkstra
Hi Christophe, > PR target/115153 I guess this is typo (should be 115188) ? Correct. > +/* { dg-options "-O2 -mthumb" } */ > -mthumb is included in arm_arch_v6m, so I > think you don't need to add it here? Indeed, it's not strictly necessary. Fixed in v2: A Thumb-1 memory operand allows

Re: [patch, rs6000, middle-end 0/1] v1: Add implementation for different targets for pair mem fusion

2024-06-11 Thread Richard Sandiford
Ajit Agarwal writes: >>> Thanks a lot. Can I know what should we be doing with neg (fma) >>> correctness failures with load fusion. >> >> I think it would involve: >> >> - describing lxvp and stxvp as unspec patterns, as I mentioned >> in the previous reply >> >> - making plain movoo split lo

Re: [patch, rs6000, middle-end 0/1] v1: Add implementation for different targets for pair mem fusion

2024-06-11 Thread Ajit Agarwal
Hello Richard: On 11/06/24 8:59 pm, Richard Sandiford wrote: > Ajit Agarwal writes: >> On 11/06/24 7:07 pm, Richard Sandiford wrote: >>> Ajit Agarwal writes: Hello Richard: On 11/06/24 6:12 pm, Richard Sandiford wrote: > Ajit Agarwal writes: >> Hello Richard: >> >> On

Re: [PATCH] ifcvt: Clarify if_info.original_cost.

2024-06-11 Thread Richard Sandiford
Robin Dapp writes: > The attached v3 tracks the use of cond_earliest as you suggested > and adds its cost in default_noce_conversion_profitable_p. > > Bootstrapped and regtested on x86 and p10, aarch64 still > running. Regtested on riscv64. > > Regards > Robin > > Before noce_find_if_block proce

Re: [patch, rs6000, middle-end 0/1] v1: Add implementation for different targets for pair mem fusion

2024-06-11 Thread Richard Sandiford
Ajit Agarwal writes: > On 11/06/24 7:07 pm, Richard Sandiford wrote: >> Ajit Agarwal writes: >>> Hello Richard: >>> On 11/06/24 6:12 pm, Richard Sandiford wrote: Ajit Agarwal writes: > Hello Richard: > > On 11/06/24 5:15 pm, Richard Sandiford wrote: >> Ajit Agarwal writes:

Re: [PATCH] ifcvt: Clarify if_info.original_cost.

2024-06-11 Thread Robin Dapp
The attached v3 tracks the use of cond_earliest as you suggested and adds its cost in default_noce_conversion_profitable_p. Bootstrapped and regtested on x86 and p10, aarch64 still running. Regtested on riscv64. Regards Robin Before noce_find_if_block processes a block it sets up an if_info st

[PATCH v2] Test: Move target independent test cases to gcc.dg/torture

2024-06-11 Thread pan2 . li
From: Pan Li The test cases of pr115387 are target independent, at least x86 and riscv are able to reproduce. Thus, move these cases to the gcc.dg/torture. The below test suites are passed. 1. The rv64gcv fully regression test. 2. The x86 fully regression test. gcc/testsuite/ChangeLog:

Re: [PATCH V2] Fix ICE in rtl check due to CONST_WIDE_INT in CONST_VECTOR_DUPLICATE_P

2024-06-11 Thread Jakub Jelinek
On Tue, Jun 11, 2024 at 10:40:01PM +0800, liuhongt wrote: > gcc/ChangeLog: > > PR target/115384 > * simplify-rtx.cc (simplify_context::simplify_binary_operation_1): > Only do the simplification of (AND (ASHIFTRT A imm) mask) > to (LSHIFTRT A imm) when the component of const

Re: [PATCH] [testsuite] [arm] test board cflags in multilib.exp

2024-06-11 Thread Richard Earnshaw (lists)
On 07/06/2024 05:47, Alexandre Oliva wrote: > > multilib.exp tests for multilib-altering flags in a board's > multilib_flags and skips the test, but if such flags appear in the > board's cflags, with the same distorting effects on tested multilibs, > we fail to skip the test. > > Extend the skipp

Re: [patch, rs6000, middle-end 0/1] v1: Add implementation for different targets for pair mem fusion

2024-06-11 Thread Ajit Agarwal
Hello Richard: On 11/06/24 7:07 pm, Richard Sandiford wrote: > Ajit Agarwal writes: >> Hello Richard: >> On 11/06/24 6:12 pm, Richard Sandiford wrote: >>> Ajit Agarwal writes: Hello Richard: On 11/06/24 5:15 pm, Richard Sandiford wrote: > Ajit Agarwal writes: >> Hello Ric

[PATCH V2] Fix ICE in rtl check due to CONST_WIDE_INT in CONST_VECTOR_DUPLICATE_P

2024-06-11 Thread liuhongt
> > I think if you only handle CONST_INT_P, you should check just for that, and > in both places where you check for CONST_VECTOR_DUPLICATE_P (there is one > spot 2 lines above this). > So add > && CONST_INT_P (XVECEXP (XEXP (op0, 1), 0, 0)) > and > && CONST_INT_P (XVECEXP (op1, 0, 0)) > tests righ

Re: [PATCH v3 1/2] arm: Zero/Sign extends for CMSE security on Armv8-M.baseline [PR115253]

2024-06-11 Thread Andre Vieira (lists)
On 11/06/2024 14:59, Richard Earnshaw (lists) wrote: You effectively have an 'else if' split across a comment here, and the indentation looks weird. Either write 'else if' on one line (and re-indent accordingly) or put this entire block inside braces. Apologies here, Torbjorn had this as

[PATCH v2] fix PowerPC < 7 w/ Altivec not to default to power7

2024-06-11 Thread Rene Rebe
Hi Kewen, v2 with test case - I hope I worked all your nits in: Glibc uses .machine to determine assembler optimizations to use. However, since reworking the rs6000 .machine output selection in commit e154242724b084380e3221df7c08fcdbd8460674 22 May 2019, G5 as well as Cell, and even power4 w/ -ma

Re: [PATCH v2] Target-independent store forwarding avoidance.

2024-06-11 Thread Jeff Law
On 6/11/24 7:52 AM, Philipp Tomsich wrote: On Tue, 11 Jun 2024 at 15:37, Jeff Law wrote: On 6/11/24 1:22 AM, Richard Biener wrote: Absolutely. But forwarding from a smaller store to a wider load is painful from a hardware standpoint and if we can avoid it from a codegen standpoint, we

[Patch, Fortran, 96418] Fix Test coarray_alloc_comp_4.f08 ICEs

2024-06-11 Thread Andre Vehreschild
Hi all, attached patch has already been present in 2020, but lost my attention. It fixes an ICE in the testsuite. The old mails description is: attached patch fixes PR96418 where the code in the testsuite when compiled with -fcoarray=single lead to an ICE. The reason was that the coarray object

Re: [PATCH v3 2/2] testsuite: Fix expand-return CMSE test for Armv8.1-M [PR115253]

2024-06-11 Thread Richard Earnshaw (lists)
On 10/06/2024 15:04, Torbjörn SVENSSON wrote: > For Armv8.1-M, the clearing of the registers is handled differently than > for Armv8-M, so update the test case accordingly. > > gcc/testsuite/ChangeLog: > > PR target/115253 > * gcc.target/arm/cmse/extend-return.c: Update test case >

Re: [PATCH v3 1/2] arm: Zero/Sign extends for CMSE security on Armv8-M.baseline [PR115253]

2024-06-11 Thread Richard Earnshaw (lists)
On 10/06/2024 15:04, Torbjörn SVENSSON wrote: > Properly handle zero and sign extension for Armv8-M.baseline as > Cortex-M23 can have the security extension active. > Currently, there is an internal compiler error on Cortex-M23 for the > epilog processing of sign extension. > > This patch addresse

RE: [PATCH v1] Test: Move target independent test cases to gcc.dg/torture

2024-06-11 Thread Li, Pan2
> Since you are moving it to torture, please remove -O3 as it is already > supplied there as one of the torture options. Oh, I see. Thanks for comments, and will update it in v2. Pan From: Andrew Pinski Sent: Tuesday, June 11, 2024 9:45 PM To: Li, Pan2 Cc: GCC Patches ; 钟居哲 ; Kito Cheng ; Ri

[PATCH v1] Widening-Mul: Take gsi after_labels instead of start_bb for gcall insertion

2024-06-11 Thread pan2 . li
From: Pan Li We inserted the gcall of .SAT_ADD before the gsi_start_bb for avoiding the ssa def after use ICE issue. Unfortunately, there will be the potential ICE when the first stmt is label. We cannot insert the gcall before the label. Thus, we take gsi_after_labels to locate the 'really'

Re: [PATCH v2] Target-independent store forwarding avoidance.

2024-06-11 Thread Philipp Tomsich
On Tue, 11 Jun 2024 at 15:37, Jeff Law wrote: > > > > On 6/11/24 1:22 AM, Richard Biener wrote: > > >> Absolutely. But forwarding from a smaller store to a wider load is > >> painful > >> from a hardware standpoint and if we can avoid it from a codegen > >> standpoint, > >> we should. > > > >

Re: [PATCH v1] Test: Move target independent test cases to gcc.dg/torture

2024-06-11 Thread Andrew Pinski
On Mon, Jun 10, 2024, 11:20 PM wrote: > From: Pan Li > > The test cases of pr115387 are target independent, at least x86 > and riscv are able to reproduce. Thus, move these cases to > the gcc.dg/torture. > > The below test suites are passed. > 1. The rv64gcv fully regression test. > 2. The x8

Re: [PATCH v1] Test: Move target independent test cases to gcc.dg/torture

2024-06-11 Thread Jeff Law
On 6/11/24 12:19 AM, pan2...@intel.com wrote: From: Pan Li The test cases of pr115387 are target independent; at least x86 and riscv are able to reproduce the issue. Thus, move these cases to gcc.dg/torture. The test suites below passed. 1. The rv64gcv full regression test. 2. The x86 ful

RE: [PATCH v1] Widening-Mul: Fix one ICE of gcall insertion for PHI match

2024-06-11 Thread Li, Pan2
Got it. Thanks Richard. Pan -Original Message- From: Richard Biener Sent: Tuesday, June 11, 2024 5:31 PM To: Li, Pan2 Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com Subject: Re: [PATCH v1] Widening-Mul: Fix one ICE of gcall insertion for PHI match On Tue, Jun

Re: [PATCH v3 1/2] Factor out static_assert constexpr string extraction for reuse

2024-06-11 Thread Jason Merrill
On 6/5/24 00:45, Andi Kleen wrote: The only semantics changes are slightly more vague error messages to generalize. Just a few nits: +/* Extracting strings from constexpr. */ + +class cexpr_str +{ +public: + cexpr_str (tree message) : message(message) {} Space before paren. ... +/* Get

Re: [PATCH v2] Target-independent store forwarding avoidance.

2024-06-11 Thread Jeff Law
On 6/11/24 1:22 AM, Richard Biener wrote: Absolutely. But forwarding from a smaller store to a wider load is painful from a hardware standpoint and if we can avoid it from a codegen standpoint, we should. Note there's also the possibility to increase the distance between the store and the

Re: [patch, rs6000, middle-end 0/1] v1: Add implementation for different targets for pair mem fusion

2024-06-11 Thread Richard Sandiford
Ajit Agarwal writes: > Hello Richard: > On 11/06/24 6:12 pm, Richard Sandiford wrote: >> Ajit Agarwal writes: >>> Hello Richard: >>> >>> On 11/06/24 5:15 pm, Richard Sandiford wrote: Ajit Agarwal writes: > Hello Richard: > On 11/06/24 4:56 pm, Ajit Agarwal wrote: >> Hello Richar

Re: [patch, rs6000, middle-end 0/1] v1: Add implementation for different targets for pair mem fusion

2024-06-11 Thread Ajit Agarwal
Hello Richard: On 11/06/24 6:12 pm, Richard Sandiford wrote: > Ajit Agarwal writes: >> Hello Richard: >> >> On 11/06/24 5:15 pm, Richard Sandiford wrote: >>> Ajit Agarwal writes: Hello Richard: On 11/06/24 4:56 pm, Ajit Agarwal wrote: > Hello Richard: > > On 11/06/24 4:36 p

[Patch, Fortran] 3/3 RFC: Introduce gfc_class_set_vptr.

2024-06-11 Thread Andre Vehreschild
Hi all, although this mail has a patch attached, it is rather a request for comment. The attached patch introduces `gfc_class_set_vptr()` for consistently assigning the _vptr of a class data type. I figured that gfortran does these assignments in various locations and does them differently everywh

[Patch, Fortran, 90076] 1/3 Fix Polymorphic Allocate on Assignment Memory Leak

2024-06-11 Thread Andre Vehreschild
Hi all, the attached patch fixes the last case in the bug report. The initial example code is already fixed by the combination of PR90068 and PR90072. The issue was that the _vptr was not (re)set correctly, in that the __vtab_...-structure was not created. This made the compiler ICE. Regtests fine on x8

[Patch, Fortran] 2/3 Refactor locations where _vptr is (re)set.

2024-06-11 Thread Andre Vehreschild
Hi all, this patch refactors most of the locations where the _vptr of a class data type is reset. The code was inconsistent in most of the locations. The goal of using only one routine for setting the _vptr is to be able to later modify it more easily. The ultimate goal being that every time one

[Patch, Fortran] 0/3 (PR90076) Setting _vptr correctly.

2024-06-11 Thread Andre Vehreschild
Hi GFortraneers, I would like to present a small series of patches. While working on PR90076 and figuring out how best to set the _vptr of class types, I discovered that this is done in several slightly different ways, which are more or less complete (mostly rather less). I therefore decided to fix not only

Re: [PATCH] gimple ssa: Teach switch conversion to optimize powers of 2 switches

2024-06-11 Thread Richard Biener
On Thu, 30 May 2024, Filip Kastl wrote: > Hi, > > This patch adds a transformation into the switch conversion pass -- > the "exponential index transform". This transformation can help switch > conversion convert switches it otherwise could not. The transformation is > intended for switches whos

Re: [COMMITTED] tree-optimization/115221 - Do not invoke SCEV if it will use a different range query.

2024-06-11 Thread Andrew MacLeod
On 5/29/24 03:19, Richard Biener wrote: On Tue, May 28, 2024 at 8:57 PM Andrew MacLeod wrote: The original patch causing the PR made ranger's cache re-entrant to enable SCEV to use the current range_query when called from within ranger. SCEV uses the currently active range query (via get_ran

[PATCH v2 2/4] Libatomic: Define per-file identifier macros

2024-06-11 Thread Victor Do Nascimento
In order to facilitate the fine-tuning of how `libatomic_i.h' and `host-config.h' headers are used by different atomic functions, we define distinct identifier macros for each file which, in implementing atomic operations, imports these headers. The idea is that different parts of these headers co

[PATCH v2 4/4] Libatomic: Clean up AArch64 `atomic_16.S' implementation file

2024-06-11 Thread Victor Do Nascimento
At present, `atomic_16.S' groups different implementations of the same functions together in the file. Therefore, as an example, the LSE2 implementation of `load_16' follows on immediately from its core implementation, as does the `store_16' LSE2 implementation. Such architectural extension-depen

[PATCH v2 3/4] Libatomic: Make ifunc selector behavior contingent on importing file

2024-06-11 Thread Victor Do Nascimento
By querying previously-defined file-identifier macros, `host-config.h' is able to get information about its environment and, based on this information, select more appropriate function-specific ifunc selectors. This reduces the number of unnecessary feature tests that need to be carried out in ord

[PATCH v2 1/4] Libatomic: AArch64: Convert all lse128 assembly to .insn directives

2024-06-11 Thread Victor Do Nascimento
Given the lack of support for the LSE128 instructions in all but the most up-to-date version of Binutils (2.42), having the build-time test for assembler support for these instructions often leads to the building of Libatomic without support for LSE128-dependent atomic function implementations.
