RE: [PATCH v1] Match: Support .SAT_SUB with IMM op for form 1-4

2024-07-26 Thread Li, Pan2
> OK. Committed, thanks Richard. Pan -Original Message- From: Richard Biener Sent: Friday, July 26, 2024 9:32 PM To: Li, Pan2 Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; tamar.christ...@arm.com; jeffreya...@gmail.com; rdapp@gmail.com Subject: Re: [PAT

[PATCH] middle-end: Add and use few helper methods for current_properties

2024-07-26 Thread Andrew Pinski
While working on isel, I found that the current way of handling current_properties in function makes it easy to make a mistake, and with stuff like `(a & b) == 0`, `a |= b;` and `a &= ~b;` it is not obvious what is going on. So let's add a few helper methods to function: * set_property * unset_
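As a rough illustration of the idea in this excerpt (not the patch itself), the three raw bit operations can be wrapped in named methods. Only `set_property` appears in the excerpt; the struct name, the other method names, and the `PROP_*` constants below are hypothetical.

```cpp
#include <cstdint>

// Hypothetical property bits, for illustration only.
constexpr std::uint32_t PROP_A = 1u << 0;
constexpr std::uint32_t PROP_B = 1u << 1;

// Hypothetical stand-in for a structure that carries a property bitmask.
struct props_holder {
  std::uint32_t current_properties = 0;

  // Replaces the raw `a |= b;`.
  void set_property(std::uint32_t p) { current_properties |= p; }

  // Replaces the raw `a &= ~b;` (name is an assumption).
  void clear_property(std::uint32_t p) { current_properties &= ~p; }

  // Replaces the raw `(a & b) != 0` / `(a & b) == 0` tests (name is an assumption).
  bool has_any_property(std::uint32_t p) const {
    return (current_properties & p) != 0;
  }

  // True only if every bit in p is set (name is an assumption).
  bool has_all_properties(std::uint32_t p) const {
    return (current_properties & p) == p;
  }
};
```

The point of such wrappers is readability at the call site: the intent (set, clear, test) is spelled out instead of being encoded in operator choice.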

Re: [PATCH] c++/modules: Ensure deduction guides are always reachable [PR115231]

2024-07-26 Thread Nathaniel Shead
On Fri, Jul 26, 2024 at 01:17:57PM -0400, Jason Merrill wrote: > On 7/26/24 12:52 AM, Nathaniel Shead wrote: > > On Tue, Jul 23, 2024 at 04:17:22PM -0400, Jason Merrill wrote: > > > On 6/15/24 10:29 PM, Nathaniel Shead wrote: > > > > Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?

[pushed] diagnostics: SARIF output: capture #include information (PR 107941; §3.34)

2024-07-26 Thread David Malcolm
This patch extends our SARIF output to capture relationships between locations within a result (§3.34). In particular, this captures chains of #includes relating to diagnostics and to events within diagnostic paths. For example, consider: include-chain-1.c: #include "include-chain-1.h" inclu

[PATCH] rs6000, add comment to VEC_IC definition

2024-07-26 Thread Carl Love
GCC maintainers: This patch adds a comment to the VEC_IC definitions to clarify the V1TI "TARGET_POWER10" mode per the request by Segher in the feedback to patch "https://gcc.gnu.org/pipermail/gcc-patches/2024-July/658156.html".

[PATCH] rs6000, document built-ins vec_test_lsbb_all_ones and, vec_test_lsbb_all_zeros

2024-07-26 Thread Carl Love
GCC maintainers: Per a report from a user, the existing vec_test_lsbb_all_ones and vec_test_lsbb_all_zeros built-ins are not documented in the GCC documentation file. The following patch adds missing documentation for the vec_test_lsbb_all_ones and vec_test_lsbb_all_zeros built-ins. Plea

Re: [PATCH] RISC-V: Expand subreg move via slide if necessary [PR116086].

2024-07-26 Thread Jeff Law
On 7/26/24 2:42 PM, Robin Dapp wrote: Hi, when the source mode is potentially larger than one vector (e.g. an LMUL2 mode for VLEN=128) we don't know which vector the subreg actually refers to. For zvl128b and LMUL=2 the subreg in (subreg:V2DI (reg:V4DI)) could actually be a full (high) v

[PATCH ver 2] rs6000, Add new overloaded vector shift builtin int128, varients

2024-07-26 Thread Carl Love
GCC developers: Version 2, updated rs6000-overload.def to remove adding additional internal names and to change XXSLDWI_Q to XXSLDWI_1TI per comments from Kewen. Move new documentation statement for the PIVPR built-ins per comments from Kewen. Updated dg-do-run directive and added comment ab

Re: [to-be-committed] [RISC-V][target/116085] Fix rv64 minmax extension avoidance splitter

2024-07-26 Thread Jeff Law
On 7/26/24 1:18 PM, Philipp Tomsich wrote: Nitpick: a typo slipped into the comment — "regsiter" -> "register". Thanks. The pre-commit tester also pointed out a couple formatting nits. I'll fix both. Jeff

Re: [PATCH v2] RISC-V: Add basic support for the Zacas extension

2024-07-26 Thread Jeff Law
On 7/23/24 6:15 PM, Patrick O'Neill wrote: From: Gianluca Guida This patch adds support for amocas.{b|h|w|d}. Support for amocas.q (64/128 bit cas for rv32/64) will be added in a future patch. Extension: https://github.com/riscv/riscv-zacas Ratification: https://jira.riscv.org/browse/RVS-68

Re: [RFH PATCH] c++: Implement C++26 P2963R3 - Ordering of constraints involving fold expressions [PR115746]

2024-07-26 Thread Patrick Palka
On Fri, 26 Jul 2024, Jakub Jelinek wrote: > On Fri, Jul 26, 2024 at 04:42:36PM -0400, Patrick Palka wrote: > > > // P2963R3 - Ordering of constraints involving fold expressions > > > // { dg-do compile { target c++20 } } > > > > > > template concept C = (__is_same (T, int) && ...); > > > templat

Re: [PATCH v2] RISC-V: Add basic support for the Zacas extension

2024-07-26 Thread Jeff Law
On 7/23/24 6:39 PM, Patrick O'Neill wrote: (define_expand "atomic_compare_and_swap" [(match_operand:SI 0 "register_operand" "") ;; bool output (match_operand:GPR 1 "register_operand" "") ;; val output (match_operand:GPR 2 "memory_operand" "");; memory - (match_operand:

Re: [PATCH 5/5] RISC-V: Enable stack clash in alloca

2024-07-26 Thread Jeff Law
On 7/24/24 12:00 PM, Raphael Moreira Zinsly wrote: Add the TARGET_STACK_CLASH_PROTECTION_ALLOCA_PROBE_RANGE to riscv in order to enable stack clash protection when using alloca. The code and tests are the same used by aarch64. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_compute_fram

Re: [PATCH 4/5] RISC-V: Add support to vector stack-clash protection

2024-07-26 Thread Jeff Law
On 7/24/24 12:00 PM, Raphael Moreira Zinsly wrote: Adds basic support to vector stack-clash protection using a loop to do the probing and stack adjustments. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_allocate_and_probe_stack_loop): New function. (riscv_v_adjust_scal

Re: [PATCH 3/5] RISC-V: Stack-clash protection implemention

2024-07-26 Thread Jeff Law
On 7/24/24 12:00 PM, Raphael Moreira Zinsly wrote: This implements stack-clash protection for riscv, with riscv_allocate_and_probe_stack_space being based on aarch64_allocate_and_probe_stack_space from aarch64's implementation. We enforce the probing interval and the guard size to always be eq

Re: [RFH PATCH] c++: Implement C++26 P2963R3 - Ordering of constraints involving fold expressions [PR115746]

2024-07-26 Thread Jakub Jelinek
On Fri, Jul 26, 2024 at 04:42:36PM -0400, Patrick Palka wrote: > > // P2963R3 - Ordering of constraints involving fold expressions > > // { dg-do compile { target c++20 } } > > > > template concept C = (__is_same (T, int) && ...); > > template > > struct S { > > template requires (C) > > st

[PATCH] RISC-V: Expand subreg move via slide if necessary [PR116086].

2024-07-26 Thread Robin Dapp
Hi, when the source mode is potentially larger than one vector (e.g. an LMUL2 mode for VLEN=128) we don't know which vector the subreg actually refers to. For zvl128b and LMUL=2 the subreg in (subreg:V2DI (reg:V4DI)) could actually be a full (high) vector register of a two-register group (at

Re: [RFH PATCH] c++: Implement C++26 P2963R3 - Ordering of constraints involving fold expressions [PR115746]

2024-07-26 Thread Patrick Palka
On Fri, 26 Jul 2024, Jakub Jelinek wrote: > On Fri, Jul 26, 2024 at 09:49:27PM +0200, Jakub Jelinek wrote: > > On Fri, Jul 26, 2024 at 08:43:44PM +0200, Jakub Jelinek wrote: > > > Yeah, I saw ARGUMENT_PACK_SELECT being used, but didn't notice that > > > important > > > if (arg_pack && TREE_C

Re: [PATCH 2/5] RISC-V: Move riscv_v_adjust_scalable_frame

2024-07-26 Thread Jeff Law
On 7/24/24 12:00 PM, Raphael Moreira Zinsly wrote: Move riscv_v_adjust_scalable_frame () in preparation for the stack clash protection support. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_v_adjust_scalable_frame): Move closer to riscv_expand_prologue. Guessing the point is

Re: [PATCH 1/5] RISC-V: Small stack tie changes

2024-07-26 Thread Jeff Law
On 7/26/24 12:43 PM, Raphael Zinsly wrote: On Fri, Jul 26, 2024 at 2:00 PM Jeff Law wrote: On 7/24/24 12:00 PM, Raphael Moreira Zinsly wrote: ... diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md index 46c46039c33..5780c5abacf 100644 --- a/gcc/config/riscv/riscv.md +++ b/gc

Re: [RFH PATCH] c++: Implement C++26 P2963R3 - Ordering of constraints involving fold expressions [PR115746]

2024-07-26 Thread Jakub Jelinek
On Fri, Jul 26, 2024 at 09:49:27PM +0200, Jakub Jelinek wrote: > On Fri, Jul 26, 2024 at 08:43:44PM +0200, Jakub Jelinek wrote: > > Yeah, I saw ARGUMENT_PACK_SELECT being used, but didn't notice that > > important > > if (arg_pack && TREE_CODE (arg_pack) == ARGUMENT_PACK_SELECT) > > a

Re: [RFH PATCH] c++: Implement C++26 P2963R3 - Ordering of constraints involving fold expressions [PR115746]

2024-07-26 Thread Jakub Jelinek
On Fri, Jul 26, 2024 at 08:43:44PM +0200, Jakub Jelinek wrote: > Yeah, I saw ARGUMENT_PACK_SELECT being used, but didn't notice that > important > if (arg_pack && TREE_CODE (arg_pack) == ARGUMENT_PACK_SELECT) > arg_pack = ARGUMENT_PACK_SELECT_FROM_PACK (arg_pack); > part of tsubst_pac

[pushed] c++: trait as typename scope [PR116052]

2024-07-26 Thread Jason Merrill
Tested x86_64-pc-linux-gnu, applying to trunk. -- 8< -- The stdexec library currently wrongly ends up using __decay as the scope of a typename, which leads to a crash. Let's give an error instead. PR c++/116052 gcc/cp/ChangeLog: * mangle.cc (write_prefix): Handle TRAIT_EXPR.

Re: [to-be-committed] [RISC-V][target/116085] Fix rv64 minmax extension avoidance splitter

2024-07-26 Thread Philipp Tomsich
Nitpick: a typo slipped into the comment — "regsiter" -> "register". On Fri, 26 Jul 2024 at 16:18, Jeff Law wrote: > > pr116085 is a long standing (since late 2022) regression on the riscv > port. > > A patch introduced a pattern to avoid unnecessary extensions when doing > a min/max operation w

Re: [PATCH 1/5] RISC-V: Small stack tie changes

2024-07-26 Thread Raphael Zinsly
On Fri, Jul 26, 2024 at 2:00 PM Jeff Law wrote: > On 7/24/24 12:00 PM, Raphael Moreira Zinsly wrote: > ... > > diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md > > index 46c46039c33..5780c5abacf 100644 > > --- a/gcc/config/riscv/riscv.md > > +++ b/gcc/config/riscv/riscv.md > > @@

Re: [RFH PATCH] c++: Implement C++26 P2963R3 - Ordering of constraints involving fold expressions [PR115746]

2024-07-26 Thread Jakub Jelinek
On Fri, Jul 26, 2024 at 02:35:01PM -0400, Patrick Palka wrote: > > IIUC the way gen_elem_of_pack_expansion_instantiation handles this for > > ordinary pack expansions is by replacing each ARGUMENT_PACK with an > > ARGUMENT_PACK_SELECT. This ARGUMENT_PACK_SELECT contains the entire > > pack as well

Re: [RFH PATCH] c++: Implement C++26 P2963R3 - Ordering of constraints involving fold expressions [PR115746]

2024-07-26 Thread Patrick Palka
On Fri, 26 Jul 2024, Patrick Palka wrote: > On Fri, 26 Jul 2024, Jakub Jelinek wrote: > > > Hi! > > > > I've tried to implement the C++26 fold expanded constraints paper but ran > > into issues (see below). Would appreciate some guidance/help, I'm afraid > > I'm stuck. > > > > The patch introd

[Patch, v2] OpenMP/Fortran: Fix handling of 'declare target' with 'link' clause [PR11555]

2024-07-26 Thread Tobias Burnus
Updated patch - only change is to the testcase: * With the just posted patch for PR116107, array sections with offset work for 'link', hence, I updated the testcase. * For 'arr2', I added ref to the associated PR. I intend to commit it once PR116107 has been committed. Tobias Tobias Burnus

Re: [PATCH] gimple-ssa-sprintf: Fix typo in range check

2024-07-26 Thread Jakub Jelinek
On Fri, Jul 26, 2024 at 01:39:04PM -0400, Siddhesh Poyarekar wrote: > > What exactly the code really wants to do is unclear to me, what does > > the INT_MAX on the target have to do with the minimum/maximum/expected > > sizes of %S or %ls printed strings is unclear, target PTRDIFF_MAX > > I think

Re: [PATCH] rs6000, Add new overloaded vector shift builtin int128, varients

2024-07-26 Thread Peter Bergner
On 7/26/24 12:07 PM, Carl Love wrote: > On 7/24/24 11:47 AM, Segher Boessenkool wrote: > +/* { dg-do run { target { int128 } && { power10_hw } } } */ Everything power10 is int128 always. >>> OK, so don't need the power10_hw. Changed to just int128 for the target: >> No, the other way aro

Re: [RFH PATCH] c++: Implement C++26 P2963R3 - Ordering of constraints involving fold expressions [PR115746]

2024-07-26 Thread Patrick Palka
On Fri, 26 Jul 2024, Jakub Jelinek wrote: > Hi! > > I've tried to implement the C++26 fold expanded constraints paper but ran > into issues (see below). Would appreciate some guidance/help, I'm afraid > I'm stuck. > > The patch introduces a FOLD_CONSTR tree to represent fold expanded > constrai

[RFC/RFA][PATCH v3 06/12] aarch64: Implement new expander for efficient CRC computation

2024-07-26 Thread Mariam Arutunian
This patch introduces two new expanders for the aarch64 backend, dedicated to generating optimized code for CRC computations. The new expanders are designed to leverage specific hardware capabilities to achieve faster CRC calculations, particularly using the crc32, crc32c and pmull instructi

[RFC/RFA][PATCH v2 05/12] i386: Implement new expander for efficient CRC computation

2024-07-26 Thread Mariam Arutunian
This patch introduces two new expanders for the i386 backend, dedicated to generating optimized code for CRC computations. The new expanders are designed to leverage specific hardware capabilities to achieve faster CRC calculations, particularly using the pclmulqdq or crc32 instructions wh

[RFC/RFA][PATCH v2 03/12] RISC-V: Add CRC expander to generate faster CRC.

2024-07-26 Thread Mariam Arutunian
If the target is ZBC or ZBKC, the expander uses the clmul instruction for the CRC calculation. Otherwise, if the target is ZBKB, it generates table-based CRC but uses the bswap and brev8 instructions to reverse the inputs and the output. Add new tests to check CRC generation for ZBC, ZBKC and ZBKB targets.

[RFC/RFA][PATCH v2 02/12] Add built-ins and tests for bit-forward and bit-reversed CRCs

2024-07-26 Thread Mariam Arutunian
This patch introduces new built-in functions to GCC for computing bit-forward and bit-reversed CRCs. These builtins aim to provide efficient CRC calculation capabilities. When the target architecture supports CRC operations (as indicated by the presence of a CRC optab), the builtins wil

[Patch] libgomp: Fix declare target link with offset array-section mapping [PR116107]

2024-07-26 Thread Tobias Burnus
The main idea of 'link' is to permit putting only a subset of a huge array on the device. Well, in order to make this work properly, it requires that one can map an array section that does not start with the first element. This patch adjusts the pointers such that this actually works. (Tested

[RFC/RFA][PATCH v2 01/12] Implement internal functions for efficient CRC computation

2024-07-26 Thread Mariam Arutunian
Add two new internal functions (IFN_CRC, IFN_CRC_REV), to provide faster CRC generation. One performs bit-forward and the other bit-reversed CRC computation. If CRC optabs are supported, they are used for the CRC computation. Otherwise, table-based CRC is generated. The supported dat
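To make the bit-forward versus bit-reversed distinction concrete, here is a minimal bitwise sketch for a 32-bit CRC. It is not the code the internal functions expand to, and it is neither table-based nor optab-based; the function names are hypothetical, and the CRC-32 polynomial 0x04C11DB7 (reflected form 0xEDB88320) is chosen only as a familiar example.

```cpp
#include <cstddef>
#include <cstdint>

// Bit-forward (MSB-first) update: each data byte enters at the top of the
// register and the register shifts left. Polynomial 0x04C11DB7.
std::uint32_t crc32_forward(std::uint32_t crc, const unsigned char *data,
                            std::size_t len) {
  for (std::size_t i = 0; i < len; ++i) {
    crc ^= static_cast<std::uint32_t>(data[i]) << 24;
    for (int bit = 0; bit < 8; ++bit)
      crc = (crc & 0x80000000u) ? (crc << 1) ^ 0x04C11DB7u : (crc << 1);
  }
  return crc;
}

// Bit-reversed (LSB-first) update: each data byte enters at the bottom of the
// register and the register shifts right. The polynomial is the bit-reversal
// of the one above, 0xEDB88320.
std::uint32_t crc32_reversed(std::uint32_t crc, const unsigned char *data,
                             std::size_t len) {
  for (std::size_t i = 0; i < len; ++i) {
    crc ^= data[i];
    for (int bit = 0; bit < 8; ++bit)
      crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : (crc >> 1);
  }
  return crc;
}
```

Both functions only perform the register update; the initial value and any final XOR are left to the caller, since different CRC variants choose them differently.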

Re: [pushed] c++: #pragma target and deferred instantiation [PR115403]

2024-07-26 Thread Patrick Palka
On Thu, 25 Jul 2024, Jason Merrill wrote: > Tested x86_64-pc-linux-gnu, applying to trunk. > > Also built highway to check. > > -- 8< -- > > My patch for 109753 applies the current #pragma target/optimize to a > function when we compile it, which was a problem for a template > instantiation def

Re: [PATCH] testsuite: Add dg-do run to even more tests, fix typo

2024-07-26 Thread Sam James
Sam James writes: > All of these are for wrong-code bugs. Confirmed to be used before but > with no execution. > > Tested on x86_64-pc-linux-gnu and checked test logs before/after. > Pushed as obvious after discussion on IRC. Thanks.

Re: [PATCH] gimple-ssa-sprintf: Fix typo in range check

2024-07-26 Thread Siddhesh Poyarekar
On 2024-07-26 13:11, Jakub Jelinek wrote: On Thu, Jul 25, 2024 at 07:48:38PM -0400, Siddhesh Poyarekar wrote: The code to scale ranges for wide chars in format_string incorrectly checks range.likely to scale range.unlikely, which is a copy-paste typo from the immediate previous condition. gcc/C

Re: [PATCH] c++: Implement C++26 P2558R2 - Add @, $, and ` to the basic character set [PR110343]

2024-07-26 Thread Jason Merrill
On 7/26/24 11:55 AM, Jakub Jelinek wrote: On Fri, Jul 26, 2024 at 11:43:13AM -0400, Jason Merrill wrote: I'm now seeing a -std=c++26 failure on g++.dg/cpp/ucn-1.C. I don't remember seeing it when I wrote the patch, but today I see it as well. The following patch seems to fix that, tested on i

Re: [PATCH] c++/modules: Ensure deduction guides are always reachable [PR115231]

2024-07-26 Thread Jason Merrill
On 7/26/24 12:52 AM, Nathaniel Shead wrote: On Tue, Jul 23, 2024 at 04:17:22PM -0400, Jason Merrill wrote: On 6/15/24 10:29 PM, Nathaniel Shead wrote: Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk? This probably isn't the most efficient approach, since we need to do name look

Re: [PATCH] gimple-ssa-sprintf: Fix typo in range check

2024-07-26 Thread Jakub Jelinek
On Thu, Jul 25, 2024 at 07:48:38PM -0400, Siddhesh Poyarekar wrote: > The code to scale ranges for wide chars in format_string incorrectly > checks range.likely to scale range.unlikely, which is a copy-paste typo > from the immediate previous condition. > > gcc/ChangeLog: > > gimple-ssa-spr

Re: [PATCH v3 2/2] Prevent divide-by-zero

2024-07-26 Thread Patrick O'Neill
On 7/26/24 06:30, Richard Biener wrote: On Thu, May 30, 2024 at 2:11 AM Patrick O'Neill wrote: From: Greg McGary gcc/ChangeLog: * gcc/tree-vect-stmts.cc (gcc/tree-vect-stmts.cc): Prevent divide-by-zero. * testsuite/gcc.target/riscv/rvv/autovec/no-segment.c: Remove dg-ice. -

Re: [PATCH] rs6000, Add new overloaded vector shift builtin int128, varients

2024-07-26 Thread Carl Love
Segher: On 7/24/24 11:47 AM, Segher Boessenkool wrote: Hi! On Wed, Jul 24, 2024 at 11:38:11AM -0700, Carl Love wrote: On 7/24/24 10:03 AM, Segher Boessenkool wrote: So much manual stuff needed, sigh. On Fri, Jul 19, 2024 at 01:04:12PM -0700, Carl Love wrote: gcc/ChangeLog:     * config/rs6

Re: [PATCH 1/5] RISC-V: Small stack tie changes

2024-07-26 Thread Jeff Law
On 7/24/24 12:00 PM, Raphael Moreira Zinsly wrote: Enable the register used by riscv_emit_stack_tie () to be passed as an argument so we can tie the stack with other registers besides hard_frame_pointer_rtx. Also don't allow operand 1 of stack_tie to be optimized to sp in preparation for the s

[RFH PATCH] c++: Implement C++26 P2963R3 - Ordering of constraints involving fold expressions [PR115746]

2024-07-26 Thread Jakub Jelinek
Hi! I've tried to implement the C++26 fold expanded constraints paper but ran into issues (see below). Would appreciate some guidance/help, I'm afraid I'm stuck. The patch introduces a FOLD_CONSTR tree to represent fold expanded constraints, normalizes for C++26 some {U,BI}NARY_{LEFT,RIGHT}_FOLD

[PATCH v3 3/3] aarch64: Add fpm register helper functions.

2024-07-26 Thread Claudio Bantaloukas
The ACLE declares several helper types and functions to facilitate construction of `fpm` arguments. These are available when one of the arm_neon.h, arm_sve.h, or arm_sme.h headers is included. These helpers don't map to specific FP8 instructions and there's no expectation that they will produce a

[PATCH v3 0/3] aarch64: Add initial support for +fp8 arch extensions

2024-07-26 Thread Claudio Bantaloukas
This series introduces initial flags and functionality for the fp8 feature. Specifically, the following are added: - functions that enable constructing valid fpm register values. - support for the '+fp8' -march modifier. - support for reading and writing the new system register FPMR (Floating Po

[PATCH v3 1/3] aarch64: Add march flags for +fp8 arch extensions

2024-07-26 Thread Claudio Bantaloukas
This introduces the relevant flags to enable access to the fpmr register and fp8 intrinsics, which will be added subsequently. gcc/ChangeLog: * config/aarch64/aarch64-option-extensions.def (fp8): New. * config/aarch64/aarch64.h (TARGET_FP8): Likewise. * doc/invoke.texi (

[PATCH v3 2/3] aarch64: Add support for moving fpm system register

2024-07-26 Thread Claudio Bantaloukas
Unlike most system registers, fpmr can be heavily written to in code that exercises the fp8 functionality. That is because every fp8 intrinsic call can potentially change the value of fpmr. Rather than just use an unspec, we treat the fpmr system register like all other registers and use a move

[PATCH] MAINTAINERS: Add myself to write after approval

2024-07-26 Thread Sam James
ChangeLog: * MAINTAINERS: Add myself. --- Pushed. MAINTAINERS | 1 + 1 file changed, 1 insertion(+) diff --git a/MAINTAINERS b/MAINTAINERS index 542d058d727c..595140b6f64f 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -550,6 +550,7 @@ Andreas Jaeger aj H

Re: [PATCH] c++: Implement C++26 P2558R2 - Add @, $, and ` to the basic character set [PR110343]

2024-07-26 Thread Jakub Jelinek
On Fri, Jul 26, 2024 at 11:43:13AM -0400, Jason Merrill wrote: > I'm now seeing a -std=c++26 failure on g++.dg/cpp/ucn-1.C. I don't remember seeing it when I wrote the patch, but today I see it as well. The following patch seems to fix that, tested on i686-linux, ok for trunk? 2024-07-26 Jakub

Re: [PATCH] c++: Implement C++26 P2558R2 - Add @, $, and ` to the basic character set [PR110343]

2024-07-26 Thread Jason Merrill
On 7/17/24 6:04 PM, Jakub Jelinek wrote: Hi! The following patch implements the easy parts of the paper. When @$` are added to the basic character set, it means that R"@$`()@$`" should now be valid (here I've noticed most of the raw string tests were tested solely with -std=c++11 or -std=gnu++11

[to-be-committed] [RISC-V][target/116085] Fix rv64 minmax extension avoidance splitter

2024-07-26 Thread Jeff Law
pr116085 is a long standing (since late 2022) regression on the riscv port. A patch introduced a pattern to avoid unnecessary extensions when doing a min/max operation where one of the values is a 32 bit positive constant. (define_insn_and_split "*minmax" [(set (match_operand:DI 0 "registe

arm: Prevent ICE when doloop dec_set is not PLUS_EXPR

2024-07-26 Thread Andre Vieira (lists)
This patch refactors and fixes an issue where arm_mve_dlstp_check_dec_counter was making an assumption about the form of a candidate for a dec_insn. It also makes sure that if it does not initially encounter a 'set' in such a form it tries to find another set that could be the right one.

Re: [PATCH v1] Match: Support .SAT_SUB with IMM op for form 1-4

2024-07-26 Thread Richard Biener
On Fri, Jul 26, 2024 at 11:20 AM wrote: > > From: Pan Li > > This patch would like to support .SAT_SUB when one of the op > is IMM. Aka below 1-4 forms. > > Form 1: > #define DEF_SAT_U_SUB_IMM_FMT_1(T, IMM) \ > T __attribute__((noinline)) \ > sat_u_sub_imm##IMM##_##T##_fmt_1 (T y)
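For readers unfamiliar with the operation, the following is a generic sketch of unsigned saturating subtraction where one operand is a compile-time constant. The patch's own form 1-4 macros are cut off in this excerpt, so this is not a reproduction of them; the template and function names are hypothetical.

```cpp
#include <cstdint>

// Immediate minus variable, clamped at zero instead of wrapping around.
template <typename T, T Imm>
constexpr T sat_sub_imm_minus(T y) {
  return Imm >= y ? static_cast<T>(Imm - y) : T{0};
}

// Variable minus immediate, clamped at zero instead of wrapping around.
template <typename T, T Imm>
constexpr T sat_sub_minus_imm(T x) {
  return x >= Imm ? static_cast<T>(x - Imm) : T{0};
}
```

With `T = std::uint8_t` and `Imm = 10`, an input of 200 to the first function yields 0 rather than the wrapped value a plain `10 - 200` would produce in 8-bit arithmetic.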

Re: [PATCH v3 2/2] Prevent divide-by-zero

2024-07-26 Thread Richard Biener
On Thu, May 30, 2024 at 2:11 AM Patrick O'Neill wrote: > > From: Greg McGary > > gcc/ChangeLog: > * gcc/tree-vect-stmts.cc (gcc/tree-vect-stmts.cc): Prevent > divide-by-zero. > * testsuite/gcc.target/riscv/rvv/autovec/no-segment.c: Remove dg-ice. > --- > No changes in v3. Depends

RE: [PATCH 1/8]AArch64: Update Neoverse V2 cost model to release costs

2024-07-26 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Friday, July 26, 2024 2:12 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; Marcus Shawcroft > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH 1/8]AArch64: Update Neoverse V2 cost model to release > costs

Re: [PATCH 1/8]AArch64: Update Neoverse V2 cost model to release costs

2024-07-26 Thread Richard Sandiford
Tamar Christina writes: > Hi All, > > This updates the cost for Neoverse V2 to reflect the updated > Software Optimization Guide. > > It also makes Cortex-X3 use the Neoverse V2 cost model. > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. > > Ok for master? > > Thanks, > Tamar >

Re: [PATCH 8/8]AArch64: take gather/scatter decode overhead into account

2024-07-26 Thread Kyrylo Tkachov
Hi Tamar, > On 26 Jul 2024, at 11:21, Tamar Christina wrote: > > External email: Use caution opening links or attachments > > > Hi All, > > Gather and scatters are not usually beneficial when the loop count is small. > This is because there's not only a cost to their execution within the loop

Re: [PATCH] fold: Allow SSA names in inverse_conditions_p and fold VCOND_MASK.

2024-07-26 Thread Richard Biener
On Thu, Jul 25, 2024 at 3:34 PM Robin Dapp wrote: > > Hi, > > In preparation for the maskload else operand I split off this patch. The > patch > looks through SSA names for the conditions passed to inverse_conditions_p > which > helps match.pd recognize more redundant vec_cond expressions. It

Re: [RFC] Generalize formation of lane-reducing ops in loop reduction

2024-07-26 Thread Richard Biener
On Sun, Jul 21, 2024 at 11:12 AM Feng Xue OS wrote: > > Hi, > > I composed some patches to generalize lane-reducing (dot-product is a > typical representative) pattern recognition, and prepared a RFC document so > as to help > review. The original intention was to make a complete solution for

[PATCH] LoongArch: Expand some SImode operations through "si3_extend" instructions if TARGET_64BIT

2024-07-26 Thread Xi Ruoyao
We already had "si3_extend" insns and we hoped the fwprop or combine passes can use them to remove unnecessary sign extensions. But this does not always work: for cases like x << 1 | y, the compiler tends to do (sign_extend:DI (ior:SI (ashift:SI (reg:SI $r4) (co

RE: [PATCH 5/8]AArch64: Update Generic Armv9-a cost model to release costs

2024-07-26 Thread Tamar Christina
> -Original Message- > From: Kyrylo Tkachov > Sent: Friday, July 26, 2024 1:35 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; Marcus Shawcroft > ; ktkac...@gcc.gnu.org; Richard Sandiford > > Subject: Re: [PATCH 5/8]AArch64: Update Generic Armv9-a cost m

Re: [RFC][PATCH 1/5] vect: Fix single_imm_use in tree_vect_patterns

2024-07-26 Thread Richard Biener
On Sun, Jul 21, 2024 at 11:15 AM Feng Xue OS wrote: > > The work for RFC > (https://gcc.gnu.org/pipermail/gcc-patches/2024-July/657860.html) > involves not a little code change, so I have to separate it into several > batches > of patchset. This and the following patches constitute the first bat

Re: [PATCH 5/8]AArch64: Update Generic Armv9-a cost model to release costs

2024-07-26 Thread Kyrylo Tkachov
Hi Tamar, > On 26 Jul 2024, at 11:21, Tamar Christina wrote: > > External email: Use caution opening links or attachments > > > Hi All, > > this updates the costs for generic-armv9-a based on the updated costs for > Neoverse V2 and Neoverse N2. > > Bootstrapped Regtested on aarch64-none-linux-

Re: [PATCH v2] gimple ssa: Teach switch conversion to optimize powers of 2 switches

2024-07-26 Thread Richard Biener
On Thu, 18 Jul 2024, Filip Kastl wrote: > On Thu 2024-07-18 12:07:42, Richard Biener wrote: > > On Wed, 17 Jul 2024, Filip Kastl wrote: > > > > > + } > > > > > + > > > > > + vec v; > > > > > + v.create (1); > > > > > + v.quick_push (m_final_bb); > > > > > + iterate_fix_domi

Re: [PATCH 7/8]AArch64: Add Cortex-X925 core definition and cost model

2024-07-26 Thread Kyrylo Tkachov
> On 26 Jul 2024, at 11:21, Tamar Christina wrote: > > External email: Use caution opening links or attachments > > > Hi All, > > This adds a cost model and core definition for Cortex-X925. > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. > > Ok for master? Ok. Thanks,

Re: [PATCH 6/8]AArch64: Update Neoverse N2 cost model to release costs

2024-07-26 Thread Kyrylo Tkachov
> On 26 Jul 2024, at 11:21, Tamar Christina wrote: > > External email: Use caution opening links or attachments > > > Hi All, > > This updates the cost for Neoverse N2 to reflect the updated > Software Optimization Guide. > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.

Re: [PATCH 4/8]AArch64: Add Neoverse N3 and Cortex-A725 core definition and cost model

2024-07-26 Thread Kyrylo Tkachov
> On 26 Jul 2024, at 11:20, Tamar Christina wrote: > > External email: Use caution opening links or attachments > > > Hi All, > > This adds a cost model and core definition for Neoverse N3 and Cortex-A725. > > It also makes Cortex-A725 use the Neoverse N3 cost model. > > Bootstrapped Regt

Re: [PATCH 3/8]AArch64: Add Neoverse V3AE core definition and cost model

2024-07-26 Thread Kyrylo Tkachov
> On 26 Jul 2024, at 12:26, Tamar Christina wrote: > > External email: Use caution opening links or attachments > > > Hi All, > > This is a new version with the confirmed correct part number. > > An updated TRM is being published. > > Bootstrapped Regtested on aarch64-none-linux-gnu and no

RE: [PATCH 1/8]AArch64: Update Neoverse V2 cost model to release costs

2024-07-26 Thread Tamar Christina
> -Original Message- > From: Kyrylo Tkachov > Sent: Friday, July 26, 2024 1:10 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; Marcus Shawcroft > ; ktkac...@gcc.gnu.org; Richard Sandiford > > Subject: Re: [PATCH 1/8]AArch64: Update Neoverse V2 cost model

Re: [PATCH 2/8]AArch64: Add Neoverse V3 core definition and cost model

2024-07-26 Thread Kyrylo Tkachov
Hi Tamar, > On 26 Jul 2024, at 11:20, Tamar Christina wrote: > > External email: Use caution opening links or attachments > > > Hi All, > > This adds a cost model and core definition for Neoverse V3. > > It also makes Cortex-X4 use the Neoverse V3 cost model. > > Bootstrapped Regtested on a

Re: [PATCH 1/8]AArch64: Update Neoverse V2 cost model to release costs

2024-07-26 Thread Kyrylo Tkachov
Hi Tamar, > On 26 Jul 2024, at 11:19, Tamar Christina wrote: > > External email: Use caution opening links or attachments > > > Hi All, > > This updates the cost for Neoverse V2 to reflect the updated > Software Optimization Guide. > > It also makes Cortex-X3 use the Neoverse V2 cost model.

Re: [PATCH v2] i386: Fix AVX512 intrin macro typo

2024-07-26 Thread Jakub Jelinek
On Fri, Jul 26, 2024 at 04:10:48PM +0800, Haochen Jiang wrote: > * config/i386/avx512dqintrin.h > (_mm_mask_fpclass_ss_mask): Correct operand order. > (_mm_mask_fpclass_sd_mask): Ditto. > (_mm_reduce_round_sd): Use -1 as mask since it is non-mask. > (_mm_reduce_round_s

[PATCH 2/2] ipa: Move pass_ipa_cdtor_merge before pass_ipa_cp and pass_ipa_sra

2024-07-26 Thread Martin Jambor
Hi, when looking at PR 115815 we realized that it would make sense to make calls to functions originally declared static constructors and destructors created by pass_ipa_cdtor_merge visible to IPA-SRA. This patch does that. Bootstrapped and tested on x86_64-linux. OK for master? Thanks, Marti

[PATCH 1/2] ipa: Treat static constructors and destructors as non-local (PR 115815)

2024-07-26 Thread Martin Jambor
Hi, in PR 115815, IPA-SRA thought it had control over all invocations of a (recursive) static destructor but it did not see the implied invocation which led to the original being left behind and the clean-up code encountering uses of SSAs that definitely should have been dead. Fixed by teaching c

Re: [PATCH]AArch64: check for vector mode in get_mask_mode [PR116074]

2024-07-26 Thread Richard Biener
On Fri, Jul 26, 2024 at 1:15 PM Richard Sandiford wrote: > > Tamar Christina writes: > >> -Original Message- > >> From: Richard Sandiford > >> Sent: Friday, July 26, 2024 10:43 AM > >> To: Tamar Christina > >> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > >> ; Marcus Shawcroft >

Re: [PATCH] i386: Mark target option with optimization when enabled with opt level [PR116065]

2024-07-26 Thread Richard Biener
On Fri, Jul 26, 2024 at 10:50 AM Hongyu Wang wrote: > > Hi, > > When introducing munroll-only-small-loops, the option was marked as > Target Save and added to -O2 default which makes attribute(optimize) > reset the target option and causes an error when the cmdline has O1 and > the function attribute has O2 an

Re: [PATCH v2] i386: Fix AVX512 intrin macro typo

2024-07-26 Thread Richard Biener
On Fri, Jul 26, 2024 at 10:14 AM Haochen Jiang wrote: > > Hi all, > > I have added related testcases into the patch. > > Ok for trunk and backport to GCC 14, GCC 13 and GCC 12? Hmm, it might be OK for 14.2 still, even without a new RC. But please wait until after 14.2 is released unless Jakub al

Re: [PATCH v1 1/2] PR116080: Fix tail call dejagnu checks

2024-07-26 Thread Richard Biener
On Fri, Jul 26, 2024 at 12:55 AM Andi Kleen wrote: > > From: Andi Kleen > > - Run the target_effective tail_call checks without optimization to > match the actual test cases. > - Add an extra check for external tail calls to handle targets like > powerpc that cannot tail call between different ob

Re: [PATCH v1 2/2] PR116019: Improve tail call error message

2024-07-26 Thread Richard Biener
On Fri, Jul 26, 2024 at 12:55 AM Andi Kleen wrote: > > From: Andi Kleen > > The "tail call must be the same type" message is common on some > targets with C++, or without optimization. It is generated > when gcc believes there is an access of the return value > after the call. However usually it

[PATCH] RISC-V: Work around bare apostrophe in error string.

2024-07-26 Thread Robin Dapp
Hi, an unquoted apostrophe slipped through when testing the recent V/M extension patch. This, again, re-words the message to "Currently the 'V' implementation requires the 'M' extension". Going to commit as obvious after testing. Regards Robin gcc/ChangeLog: * config/riscv/riscv.cc (

Re: [PATCH]AArch64: check for vector mode in get_mask_mode [PR116074]

2024-07-26 Thread Richard Sandiford
Tamar Christina writes: >> -Original Message- >> From: Richard Sandiford >> Sent: Friday, July 26, 2024 10:43 AM >> To: Tamar Christina >> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw >> ; Marcus Shawcroft >> ; ktkac...@gcc.gnu.org >> Subject: Re: [PATCH]AArch64: check for vector m

Re: [RESEND PATCH v5 1/3] ifcvt: handle sequences that clobber flags in noce_convert_multiple_sets

2024-07-26 Thread Sam James
Manolis Tsamis writes: > This is an extension of what was done in PR106590. FWIW, I think that if a bug is worth mentioning in the commit message, it's worth tagging so the hooks pick it up (as you get a nice reverse-mapping then if anyone is looking at it and wondering if a follow-up occurred).

[RESEND PATCH v5 3/3] ifcvt: Handle multiple rewired regs and refactor noce_convert_multiple_sets

2024-07-26 Thread Manolis Tsamis
The existing implementation of need_cmov_or_rewire and noce_convert_multiple_sets_1 assumes that sets are either REG or SUBREG. This commit enhances them so they can handle/rewire arbitrary set statements. To do that a new helper struct noce_multiple_sets_info is introduced which is used by noce_

[RESEND PATCH v5 2/3] ifcvt: Allow more operations in multiple set if conversion

2024-07-26 Thread Manolis Tsamis
Currently the operations allowed for if conversion of a basic block with multiple sets are few, namely REG, SUBREG and CONST_INT (as controlled by bb_ok_for_noce_convert_multiple_sets). This commit allows more operations (arithmetic, compare, etc) to participate in if conversion. The target's prof

[RESEND PATCH v5 1/3] ifcvt: handle sequences that clobber flags in noce_convert_multiple_sets

2024-07-26 Thread Manolis Tsamis
This is an extension of what was done in PR106590. Currently if a sequence generated in noce_convert_multiple_sets clobbers the condition rtx (cc_cmp or rev_cc_cmp) then only seq1 is used afterwards (sequences that emit the comparison itself). Since this applies only from the next iteration it ass

[RESEND PATCH v5 0/3] ifcvt: Allow if conversion of arithmetic in basic blocks with multiple sets

2024-07-26 Thread Manolis Tsamis
noce_convert_multiple_sets has been introduced and extended over time to handle if conversion for blocks with multiple sets. Currently this is focused on register moves and rejects any sort of arithmetic operations. This series is an extension to allow more sequences to take part in if conversio

Re: [PATCH v2 3/3] aarch64: Add fpm register helper functions.

2024-07-26 Thread Claudio Bantaloukas
On 26/07/2024 09:13, Kyrylo Tkachov wrote: > Hi Claudio, > >> On 25 Jul 2024, at 16:25, Claudio Bantaloukas >> wrote: >> >> External email: Use caution opening links or attachments >> >> >> The ACLE declares several helper types and functions to >> facilitate construction of `fpm` arguments. >

Re: [PATCH v2 1/3] aarch64: Add march flags for +fp8 arch extensions

2024-07-26 Thread Claudio Bantaloukas
On 26/07/2024 08:15, Kyrylo Tkachov wrote: > Hi Claudio, > >> On 25 Jul 2024, at 16:25, Claudio Bantaloukas >> wrote: >> >> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi >> index e0a641213ae..f293d49c61a 100644 >> --- a/gcc/doc/invoke.texi >> +++ b/gcc/doc/invoke.texi >> @@ -21843,6 +

RE: [PATCH 3/8]AArch64: Add Neoverse V3AE core definition and cost model

2024-07-26 Thread Tamar Christina
Hi All, This is a new version with the confirmed correct part number. An updated TRM is being published. Bootstrapped and regtested on aarch64-none-linux-gnu with no issues. Ok for master? Thanks, Tamar gcc/ChangeLog: * config/aarch64/aarch64-cores.def (neoverse-v3ae): New. * confi

RE: [PATCH]AArch64: check for vector mode in get_mask_mode [PR116074]

2024-07-26 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Friday, July 26, 2024 10:43 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; Marcus Shawcroft > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH]AArch64: check for vector mode in get_mask_mode > [PR116074]

Re: [PATCH]middle-end: check for vector mode before in get_mask_mode [PR116074]

2024-07-26 Thread Richard Biener
> On 26.07.2024 at 11:40, Tamar Christina wrote: > > Hi All, > > For historical reasons AArch64 has TI mode vector types but does not consider > TImode a vector mode. > > What's happening in the PR is that get_vectype_for_scalar_type is returning > vector(1) TImode for a TImode scalar. Th

Re: [PATCH]AArch64: check for vector mode in get_mask_mode [PR116074]

2024-07-26 Thread Richard Sandiford
Tamar Christina writes: >> -Original Message- >> From: Richard Sandiford >> Sent: Friday, July 26, 2024 10:24 AM >> To: Tamar Christina >> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw >> ; Marcus Shawcroft >> ; ktkac...@gcc.gnu.org >> Subject: Re: [PATCH]AArch64: check for vector m

Re: [PATCH]AArch64: check for vector mode in get_mask_mode [PR116074]

2024-07-26 Thread Richard Biener
> On 26.07.2024 at 11:29, Tamar Christina wrote: > > >> >> -Original Message- >> From: Richard Sandiford >> Sent: Friday, July 26, 2024 10:24 AM >> To: Tamar Christina >> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw >> ; Marcus Shawcroft >> ; ktkac...@gcc.gnu.org >> Subject

[PATCH]middle-end: check for vector mode before in get_mask_mode [PR116074]

2024-07-26 Thread Tamar Christina
Hi All, For historical reasons AArch64 has TI mode vector types but does not consider TImode a vector mode. What's happening in the PR is that get_vectype_for_scalar_type is returning vector(1) TImode for a TImode scalar. This then fails when we call targetm.vectorize.get_mask_mode (vecmode).exi

RE: [PATCH]AArch64: check for vector mode in get_mask_mode [PR116074]

2024-07-26 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Friday, July 26, 2024 10:24 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; Marcus Shawcroft > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH]AArch64: check for vector mode in get_mask_mode > [PR116074]

Re: [PATCH 2/5] aarch64: sve: Rename aarch64_bic to standard pattern, andn

2024-07-26 Thread Richard Sandiford
Kyrylo Tkachov writes: >> On 25 Jul 2024, at 04:14, Andrew Pinski wrote: >> >> External email: Use caution opening links or attachments >> >> >> Now there is an optab for bic, andn since r15-1890-gf379596e0ba99d. >> This moves aarch64_bic for sve over to use it instead. >> >> Note unlike the
