[PATCH] RISC-V: Add norelax function attribute

2024-11-07 Thread shiyulong
From: yulong This patch adds the norelax function attribute that was discussed in riscv-c-api-doc PR#94. URL: https://github.com/riscv-non-isa/riscv-c-api-doc/pull/94 gcc/ChangeLog: * config/riscv/riscv.cc (riscv_declare_function_name): Add new attribute. --- gcc/config/riscv/riscv.cc

[pushed] Darwin: Fix a narrowing warning.

2024-11-07 Thread Iain Sandoe
Tested on x86_64-darwin, pushed to trunk, thanks Iain --- 8< --- cdtor_record needs to have an unsigned entry for the position in order to match with vec_safe_length. gcc/ChangeLog: * config/darwin.cc (cdtor_record): Make position unsigned. Signed-off-by: Iain Sandoe --- gcc/config/d

[PATCH v2] arm: Don't ICE on arm_mve.h pragma without MVE types [PR117408]

2024-11-07 Thread Torbjörn SVENSSON
Changes since v1: - Updated the error message to mention that arm_mve_types.h needs to be included. - Corrected some spelling errors in commit message. As the warning for pure functions returning void is not related to this patch, I'll leave it for you Christophe to look into. :) Ok for trunk

Re: [PATCH 04/10] gimple: Disallow sizeless types in BIT_FIELD_REFs.

2024-11-07 Thread Richard Biener
On Thu, Nov 7, 2024 at 11:13 AM Tejas Belagod wrote: > > On 11/7/24 2:36 PM, Richard Biener wrote: > > On Thu, Nov 7, 2024 at 8:25 AM Tejas Belagod wrote: > >> > >> On 11/6/24 6:02 PM, Richard Biener wrote: > >>> On Wed, Nov 6, 2024 at 12:49 PM Tejas Belagod > >>> wrote: > > Ensure si

Re: [PATCH] RISC-V: Add norelax function attribute

2024-11-07 Thread Yangyu Chen
Thanks for doing this! > On Nov 8, 2024, at 00:19, shiyul...@iscas.ac.cn wrote: > > From: yulong > > This patch adds norelax function attribute that be discussed in > riscv-c-api-doc PR#94. > URL:https://github.com/riscv-non-isa/riscv-c-api-doc/pull/94 > > gcc/ChangeLog: > >* config/

Re: [PATCH][RFC][PR117093] match.pd: Fold vec_perm with view_convert

2024-11-07 Thread Richard Biener
On Tue, 5 Nov 2024, Jennifer Schmitz wrote: > We are working on a patch to improve the codegen for the following test case: > uint64x2_t foo (uint64x2_t r) { > uint32x4_t a = vreinterpretq_u32_u64 (r); > uint32_t t; > t = a[0]; a[0] = a[1]; a[1] = t; > t = a[2]; a[2] = a[3]; a[3] =

Re: [PATCH] testsuite: arm: Use effective-target arm_fp for pr68620.c test

2024-11-07 Thread Richard Earnshaw (lists)
On 06/11/2024 19:50, Torbjorn SVENSSON wrote: > > > On 2024-11-06 19:06, Richard Earnshaw (lists) wrote: >> On 06/11/2024 13:50, Torbjorn SVENSSON wrote: >>> >>> >>> On 2024-11-06 14:04, Richard Earnshaw (lists) wrote: On 06/11/2024 12:23, Torbjorn SVENSSON wrote: > > > On 2024-1

Re: [PATCH] rtl-optimization/117467 - 33% compile-time in rest of compilation

2024-11-07 Thread Jeff Law
On 11/7/24 2:15 AM, Richard Biener wrote: ext-dce uses TV_NONE, that's not OK for a pass taking 33% compile-time. The following adds a timevar to it for proper blaming. Bootstrap running on x86_64-unknown-linux-gnu. PR rtl-optimization/117467 * timevar.def (TV_EXT_DCE): New.

[PATCH] testsuite: arm: Allow vst1.32 instruction in pr40457-2.c

2024-11-07 Thread Torbjörn SVENSSON
Ok for trunk and releases/gcc-14? -- When building the test case with neon, the 'vst1.32' instruction is used instead of 'strd'. Allow both variants to make the test pass. gcc/testsuite/ChangeLog: * gcc.target/arm/pr40457-2.c: Add vst1.32 as an allowed instruction. Signed-off-b

Re: [PATCH] ifcvt: Clarify if_info.original_cost.

2024-11-07 Thread Richard Sandiford
"Robin Dapp" writes: >>> If the problem is tracking liveness, wouldn't it be better to >>> iterate over the "then" block in reverse order? We would start >>> with the liveness set for the join block and update as we move >>> backwards through the "then" block. This liveness set would >>> tell us

[PATCH v4 5/8] aarch64: Add masked-load else operands.

2024-11-07 Thread Robin Dapp
From: Robin Dapp This adds zero else operands to masked loads and their intrinsics. I needed to adjust more than initially thought because we rely on combine for several instructions and a change in a "base" pattern needs to propagate to all those. gcc/ChangeLog: * config/aarch64/aarch6

[PATCH v4 0/8] Add maskload else operand.

2024-11-07 Thread Robin Dapp
From: Robin Dapp Hi, changes from v3: - Check if we support vec_cond_expr for the selected mode in case we need to set the inactive elements to zero. - Add another undef operand to gcn. - Remove unnecessary changes in i386 patch. Robin Dapp (8): docs: Document maskload else operand and beh

[PATCH v4 1/8] docs: Document maskload else operand and behavior.

2024-11-07 Thread Robin Dapp
From: Robin Dapp This patch amends the documentation for masked loads (maskload, vec_mask_load_lanes, and mask_gather_load as well as their len counterparts) with an else operand. gcc/ChangeLog: * doc/md.texi: Document masked load else operand. --- gcc/doc/md.texi | 63

RE: [PATCH 5/5] Allow multiple vectorized epilogs via --param vect-epilogues-nomask=N

2024-11-07 Thread Richard Biener
On Thu, 7 Nov 2024, Tamar Christina wrote: > > -Original Message- > > From: Richard Biener > > Sent: Wednesday, November 6, 2024 2:32 PM > > To: gcc-patches@gcc.gnu.org > > Cc: RISC-V CI ; Tamar Christina > > ; Richard Sandiford > > Subject: [PATCH 5/5] Allow multiple vectorized epilogs

[PATCH v4 4/8] vect: Add maskload else value support.

2024-11-07 Thread Robin Dapp
From: Robin Dapp This patch adds an else operand to vectorized masked load calls. The current implementation adds else-value arguments to the respective target-querying functions that are used to supply the vectorizer with the proper else value. We query the target for its supported else operand

[PATCH v4 7/8] i386: Add zero maskload else operand.

2024-11-07 Thread Robin Dapp
From: Robin Dapp gcc/ChangeLog: * config/i386/sse.md (maskload): Call maskload..._1. (maskload_1): Rename. --- gcc/config/i386/sse.md | 21 ++--- 1 file changed, 18 insertions(+), 3 deletions(-) diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.m

[PATCH v4 8/8] RISC-V: Add else operand to masked loads [PR115336].

2024-11-07 Thread Robin Dapp
From: Robin Dapp This patch adds else operands to masked loads. Currently the default else operand predicate just accepts "undefined" (i.e. SCRATCH) values. PR middle-end/115336 PR middle-end/116059 gcc/ChangeLog: * config/riscv/autovec.md: Add else operand. *

[PATCH v4 2/8] ifn: Add else-operand handling.

2024-11-07 Thread Robin Dapp
From: Robin Dapp This patch adds else-operand handling to the internal functions. gcc/ChangeLog: * internal-fn.cc (add_mask_and_len_args): Rename... (add_mask_else_and_len_args): ...to this and add else handling. (expand_partial_load_optab_fn): Use adjusted function.

[PATCH v4 6/8] gcn: Add else operand to masked loads.

2024-11-07 Thread Robin Dapp
From: Robin Dapp This patch adds an undefined else operand to the masked loads. gcc/ChangeLog: * config/gcn/predicates.md (maskload_else_operand): New predicate. * config/gcn/gcn-valu.md: Use new predicate. --- gcc/config/gcn/gcn-valu.md | 23 +++

[PATCH v4 3/8] tree-ifcvt: Add zero maskload else value.

2024-11-07 Thread Robin Dapp
From: Robin Dapp When predicating a load we implicitly assume that the else value is zero. This matters in case the loaded value is padded (like e.g. a Bool) and we must ensure that the padding bytes are zero on targets that don't implicitly zero inactive elements. A former version of this patc

Re: [PATCH v4 6/8] gcn: Add else operand to masked loads.

2024-11-07 Thread Andrew Stubbs
On 07/11/2024 17:57, Robin Dapp wrote: From: Robin Dapp This patch adds an undefined else operand to the masked loads. gcc/ChangeLog: * config/gcn/predicates.md (maskload_else_operand): New predicate. * config/gcn/gcn-valu.md: Use new predicate. --- gcc/config/gcn/gc

[committed] btf: check hash maps are non-null before emptying

2024-11-07 Thread David Faust
These maps will always be non-null in btf_finalize under normal circumstances, but be safe and verify that before trying to empty them. Tested on x86_64-linux-gnu and x86_64-linux-gnu host for bpf-unknown-none target. Pushed as obvious. gcc/ * btfout.cc (btf_finalize): Check that hash map

Re: [PATCH] testsuite: arm: Use effective-target arm_fp for pr68620.c test

2024-11-07 Thread Torbjorn SVENSSON
On 2024-11-07 16:33, Richard Earnshaw (lists) wrote: On 06/11/2024 19:50, Torbjorn SVENSSON wrote: On 2024-11-06 19:06, Richard Earnshaw (lists) wrote: On 06/11/2024 13:50, Torbjorn SVENSSON wrote: On 2024-11-06 14:04, Richard Earnshaw (lists) wrote: On 06/11/2024 12:23, Torbjorn SVENS

[PATCH] bpf: avoid possible null deref in btf_ext_output [PR target/117447]

2024-11-07 Thread David Faust
The BPF-specific .BTF.ext section is always generated for BPF programs if -gbtf is specified, and generating it requires BTF information and assumes that the BTF info has already been generated. Compiling non-C languages to BPF is not supported, nor is generating CTF/BTF for non-C. But, compiling

Re: [PATCH] ifcombine: For short circuit case, allow 2 defining statements [PR85605]

2024-11-07 Thread Andrew Pinski
On Fri, Nov 1, 2024 at 4:06 PM Andrew Pinski wrote: > > On Tue, Oct 29, 2024 at 10:10 AM Andrew Pinski wrote: > > > > On Tue, Oct 29, 2024 at 5:59 AM Richard Biener > > wrote: > > > > > > On Tue, Oct 29, 2024 at 4:29 AM Andrew Pinski > > > wrote: > > > > > > > > r0-126134-g5d2a9da9a7f7c1 added

[RFC/PATCH] c++: Unwrap type traits defined in terms of builtins within concept diagnostics [PR117294]

2024-11-07 Thread Nathaniel Shead
Does this approach seem reasonable? I'm pretty sure that the way I've handled the templating here is unideal but I'm not sure what a neat way to do what I'm trying to do here would be; any comments are welcome. -- >8 -- Currently, concept failures of standard type traits just report 'expression

Re: [PATCH v2 2/2] VN: Handle `(A CMP B) !=/== 0` for predicates [PR117414]

2024-11-07 Thread Andrew Pinski
On Thu, Nov 7, 2024 at 12:50 AM Richard Biener wrote: > > On Thu, Nov 7, 2024 at 12:43 AM Andrew Pinski > wrote: > > > > After the last patch, we also want to record `(A CMP B) != 0` > > as `(A CMP B)` and `(A CMP B) == 0` as `(A CMP B)` with the > > true/false edges swapped. > > > > This shows

Re: [PATCH] bpf: avoid possible null deref in btf_ext_output [PR target/117447]

2024-11-07 Thread Jose E. Marchesi
Hi Faust. Thanks for the patch. OK for master. > The BPF-specific .BTF.ext section is always generated for BPF programs > if -gbtf is specified, and generating it requires BTF information and > assumes that the BTF info has already been generated. > > Compiling non-C languages to BPF is not sup

[PATCH][ivopts]: perform affine fold to unsigned on non address expressions. [PR114932]

2024-11-07 Thread Tamar Christina
Hi All, When the patch for PR114074 was applied we saw a good boost in exchange2. This boost was partially caused by a simplification of the addressing modes. With the patch applied IV opts saw the following form for the base addressing; Base: (integer(kind=4) *) &block + ((sizetype) ((unsigne

Re: [PATCH v2 01/10] Match: Simplify branch form 4 of unsigned SAT_ADD into branchless

2024-11-07 Thread Jeff Law
On 11/7/24 8:07 AM, Tamar Christina wrote: -Original Message- From: Li, Pan2 Sent: Thursday, November 7, 2024 12:57 PM To: Tamar Christina ; Richard Biener Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; jeffreya...@gmail.com; rdapp@gmail.com Subject: R

[PATCH] VN: Canonicalize compares before calling vn_nary_op_lookup_pieces

2024-11-07 Thread Andrew Pinski
This is the followup as mentioned in https://gcc.gnu.org/pipermail/gcc-patches/2024-November/667987.html . We need to canonicalize the compares using tree_swap_operands_p instead of checking CONSTANT_CLASS_P. Bootstrapped and tested on x86_64-linux-gnu. gcc/ChangeLog: * tree-ssa-sccvn.cc

Re: [PATCH] testsuite: arm: Use check-function-bodies in epilog-1.c test

2024-11-07 Thread Christophe Lyon
On Thu, 7 Nov 2024 at 20:35, Torbjörn SVENSSON wrote: > > The generated assembler is: > > armv7-m: > push {r4, lr} > ldr r4, .L6 > ldr r4, [r4] > lsls r4, r4, #29 > it mi > addmi r2, r2, #1 > bl bar > movs

[PATCH v2] c: Implement C2y N3356, if declarations [PR117019]

2024-11-07 Thread Marek Polacek
On Wed, Nov 06, 2024 at 06:06:46PM +, Joseph Myers wrote: > On Wed, 6 Nov 2024, Marek Polacek wrote: > > > On Wed, Nov 06, 2024 at 09:42:02AM -0500, Marek Polacek wrote: > > > On reflection, I'm not so sure about these anymore: > > > > > > On Mon, Nov 04, 2024 at 06:26:47PM -0500, Marek Polac

[RFC 2/9] aarch64: add new define_insn for subg

2024-11-07 Thread Indu Bhagat
subg (Subtract with Tag) is an Armv8.5-A memory tagging (MTE) instruction. It can be used to subtract an immediate value scaled by the tag granule from the address in the source register. gcc/ChangeLog: * config/aarch64/aarch64.md (subg): New definition. --- gcc/config/aarch64/aarch64.m

[RFC 3/9] aarch64: add new insn definition for st2g

2024-11-07 Thread Indu Bhagat
Store Allocation Tags (st2g) is an Armv8.5-A memory tagging (MTE) instruction. It stores an allocation tag to two tag granules of memory. TBD: - Not too sure what is the best way to generate the st2g yet; A subsequent patch will emit them in one of the target hooks. - the current define_in

[RFC 4/9] opts: doc: aarch64: add new memtag sanitizer

2024-11-07 Thread Indu Bhagat
Add new command line option -fsanitize=memtag with the following new params: --param memtag-instrument-stack [0,1] (default 1) to use MTE insns for enabling dynamic checking of stack variables. --param memtag-instrument-alloca [0,1] (default 1) to use MTE insns for enabling dynamic checking of st

[committed 1/2] libstdc++: Define __is_pair variable template for C++11

2024-11-07 Thread Jonathan Wakely
libstdc++-v3/ChangeLog: * include/bits/stl_pair.h (__is_pair): Define for C++11 and C++14 as well. --- Tested powerpc64le-linux. Pushed to trunk. libstdc++-v3/include/bits/stl_pair.h | 6 ++ 1 file changed, 6 insertions(+) diff --git a/libstdc++-v3/include/bits/stl_pair.h b

[RFC 7/9] hwasan: add support for generating MTE instructions for memory tagging

2024-11-07 Thread Indu Bhagat
Memory tagging is used for detecting memory safety bugs. On AArch64, the memory tagging extension (MTE) helps in reducing the overheads of memory tagging: - CPU: MTE instructions for efficiently tagging and untagging memory. - Memory: New memory type, Normal Tagged Memory, added to the Arm Ar

[RFC 8/9] asan: memtag: enable pass_asan for memtag sanitizer

2024-11-07 Thread Indu Bhagat
Check for SANITIZER_MEMTAG in the gate function for pass_asan gimple pass; enable it. TBD: - This commit was initially carved out in order to ensure each patch works in isolation. Need to revisit and double check this. gcc/ChangeLog: * asan.cc (memtag_sanitize_p): Fix definition.

[RFC 1/9] opts: use unsigned HOST_WIDE_INT for sanitizer flags

2024-11-07 Thread Indu Bhagat
Currently, the data type of sanitizer flags is unsigned int, with SANITIZE_SHADOW_CALL_STACK (1UL << 31) being the highest individual enumerator for enum sanitize_code. Use 'unsigned HOST_WIDE_INT' data type to allow for more distinct instrumentation modes to be added when needed. FIXME: 1. Is using d_u
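
As a rough illustration of why the wider type matters (a sketch only, not code from the patch): bit 31 is the last bit available in a 32-bit unsigned value, so any further flag needs a wider type. The names `flags64_t`, `DEMO_FLAG_BIT31`, `DEMO_FLAG_BIT32` and `fits_in_32_bits` are hypothetical, and `uint64_t` is only assumed here as a stand-in for a 64-bit `unsigned HOST_WIDE_INT`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit 'unsigned HOST_WIDE_INT'. */
typedef uint64_t flags64_t;

/* Bit 31: the highest single-bit flag a 32-bit unsigned int can hold. */
#define DEMO_FLAG_BIT31 ((flags64_t)1 << 31)

/* Bit 32: one more flag, which only fits in the wider type. */
#define DEMO_FLAG_BIT32 ((flags64_t)1 << 32)

/* Returns nonzero if every set bit of 'flags' fits in 32 bits. */
static int fits_in_32_bits(flags64_t flags)
{
    return flags <= UINT32_MAX;
}
```

With these definitions, `fits_in_32_bits(DEMO_FLAG_BIT31)` is nonzero while `fits_in_32_bits(DEMO_FLAG_BIT32)` is zero, and truncating `DEMO_FLAG_BIT32` to `uint32_t` silently loses the flag.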

[RFC 5/9] targhooks: add new target hook TARGET_MEMTAG_TAG_MEMORY

2024-11-07 Thread Indu Bhagat
Add a new target hook TARGET_MEMTAG_TAG_MEMORY to tag (and untag) memory. The default implementation is empty. Hardware-assisted sanitizers on architectures providing instructions to tag/untag memory can then make use of this target hook. On AArch64, e.g., the MEMTAG sanitizer will use this hook

[RFC 6/9] aarch64: memtag: implement target hooks

2024-11-07 Thread Indu Bhagat
MEMTAG sanitizer, which is based on the HWASAN sanitizer, will invoke the target-specific hooks to create a random tag, add tag to memory address, and finally tag and untag memory. Implement the target hooks to emit MTE instructions if MEMTAG sanitizer is in effect. Continue to use the default ta

Re: [PATCH] c++: Fix ICE on constexpr virtual function [PR117317]

2024-11-07 Thread Jason Merrill
On 10/30/24 3:17 AM, Jakub Jelinek wrote: Hi! Since C++20 virtual methods can be constexpr, and if they are constexpr evaluated, we choose tentative_decl_linkage for those defer their output and decide at_eof again. On the following testcases we ICE though, because if expand_or_defer_fn_1 decide

[RFC 9/9] memtag: testsuite: add new tests

2024-11-07 Thread Indu Bhagat
Add basic tests for MEMTAG sanitizer. MEMTAG sanitizer uses target hooks to emit AArch64 specific MTE instructions. Add new target-specific tests. The currently generated code has quite a few limitations: 1. For basic-1.c testcase, currently we generate: subg x0, x0, #16, #0

[RFC 0/9] Add -fsanitize=memtag

2024-11-07 Thread Indu Bhagat
Hi, Sending the current state of the work. I would like to get feedback on whether this is generally the right direction of adding the MEMTAG sanitizer in GCC. I have added some TBD/FIXME notes to each commit log. These are some of the things I am aware of and need to be resolved. Please let m

[committed 2/2] libstdc++: Fix conversions to key/value types for hash table insertion [PR115285]

2024-11-07 Thread Jonathan Wakely
The conversions to key_type and value_type that are performed when inserting into _Hashtable need to be fixed to do any required conversions explicitly. The current code assumes that conversions from the parameter to the key_type or value_type can be done implicitly, which isn't necessarily true.

[committed] libstdc++: Improve comment for _Hashtable::_M_insert_unique_node

2024-11-07 Thread Jonathan Wakely
Clarify the effects if rehashing is needed. Document the __n_elt parameter. libstdc++-v3/ChangeLog: * include/bits/hashtable.h (_M_insert_unique_node): Improve comment. --- Pushed as obvious. libstdc++-v3/include/bits/hashtable.h | 7 +-- 1 file changed, 5 insertions(+), 2 d

Re: [PATCH v2][GCC14] aarch64: Add support for FUJITSU-MONAKA (-mcpu=fujitsu-monaka) CPU

2024-11-07 Thread Richard Sandiford
"Yuta Mukai (Fujitsu)" writes: > Thank you for pushing to trunk. > Can I also ask for a backport to GCC14? > > I have attached the patch for GCC14. > FP8 has been excluded from the list as it is not supported in GCC14. > > Bootstrapped/regtested on aarch64-unknown-linux-gnu. LGTM, thanks. Pushed

Re: [PATCH] testsuite: arm: Use effective-target arm_fp for pr68620.c test

2024-11-07 Thread Christophe Lyon
On Thu, 7 Nov 2024 at 19:09, Torbjorn SVENSSON wrote: > > > > On 2024-11-07 16:33, Richard Earnshaw (lists) wrote: > > On 06/11/2024 19:50, Torbjorn SVENSSON wrote: > >> > >> > >> On 2024-11-06 19:06, Richard Earnshaw (lists) wrote: > >>> On 06/11/2024 13:50, Torbjorn SVENSSON wrote: > >

[PATCH] libstdc++: Simplify _Hashtable merge functions

2024-11-07 Thread Jonathan Wakely
I realised that _M_merge_unique and _M_merge_multi call extract(iter) which then has to call _M_get_previous_node to iterate through the bucket to find the node before the one iter points to. Since the merge function is already iterating over the entire container, we had the previous node a moment

Re: [PATCH] testsuite: arm: Use effective-target for nomve_fp_1 test

2024-11-07 Thread Christophe Lyon
On Thu, 7 Nov 2024 at 18:33, Torbjorn SVENSSON wrote: > > > > On 2024-11-07 11:40, Christophe Lyon wrote: > > Hi Torbjörn, > > > > On Thu, 31 Oct 2024 at 19:34, Torbjörn SVENSSON > > wrote: > >> > >> Ok for trunk and releases/gcc-14? > >> > >> -- > >> > >> Test uses MVE, so add effective-target a

[PATCH] testsuite: arm: Use check-function-bodies in epilog-1.c test

2024-11-07 Thread Torbjörn SVENSSON
The generated assembler is: armv7-m: push {r4, lr} ldr r4, .L6 ldr r4, [r4] lsls r4, r4, #29 it mi addmi r2, r2, #1 bl bar movs r0, #0 pop {r4, pc} armv8.1-m.main: push {r3, r4, r5

RE: [PATCH 3/5] Add LOOP_VINFO_MAIN_LOOP_INFO

2024-11-07 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Wednesday, November 6, 2024 2:30 PM > To: gcc-patches@gcc.gnu.org > Cc: Richard Sandiford ; Tamar Christina > > Subject: [PATCH 3/5] Add LOOP_VINFO_MAIN_LOOP_INFO > > The following introduces LOOP_VINFO_MAIN_LOOP_INFO alongside > LOOP_V

Re: [PATCH] testsuite: arm: Use effective-target for nomve_fp_1 test

2024-11-07 Thread Torbjorn SVENSSON
On 2024-11-07 11:40, Christophe Lyon wrote: Hi Torbjörn, On Thu, 31 Oct 2024 at 19:34, Torbjörn SVENSSON wrote: Ok for trunk and releases/gcc-14? -- Test uses MVE, so add effective-target arm_fp requirement. gcc/testsuite/ChangeLog: * g++.target/arm/mve/general-c++/nomve_fp_1.

Re: [PATCH] ifcvt: Clarify if_info.original_cost.

2024-11-07 Thread Robin Dapp
> I think it'd be better if I abstain from this. I probably disagree too > much with the current structure and the way that the code is developing. > I won't object if anyone else approves it though. It's not that I'm happy with the current state either and I thought about how to rewrite it more

RE: [PATCH v2 01/10] Match: Simplify branch form 4 of unsigned SAT_ADD into branchless

2024-11-07 Thread Tamar Christina
> -Original Message- > From: Li, Pan2 > Sent: Thursday, November 7, 2024 1:45 AM > To: Tamar Christina ; Richard Biener > > Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; > jeffreya...@gmail.com; rdapp@gmail.com > Subject: RE: [PATCH v2 01/10] Match: Simplify

Re: [PATCH v2 1/2] VN: Handle `(a | b) !=/== 0` for predicates [PR117414]

2024-11-07 Thread Andrew Pinski
On Thu, Nov 7, 2024 at 12:48 AM Richard Biener wrote: > > On Thu, Nov 7, 2024 at 12:43 AM Andrew Pinski > wrote: > > > > For `(a | b) == 0`, we can "assert" on the true edge that > > both `a == 0` and `b == 0` but nothing on the false edge. > > For `(a | b) != 0`, we can "assert" on the false ed
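
The edge reasoning quoted above can be sketched in plain C (an illustration of the underlying fact only, not the value-numbering code from the patch; the function name `or_zero_edge_facts_hold` is hypothetical). When `(a | b) == 0` holds, both operands must be zero; when it does not hold, at least one operand is nonzero, but nothing is known about each operand individually.

```c
#include <assert.h>
#include <stdbool.h>

/* Checks the facts that can be assumed on each edge of '(a | b) == 0'.
   Returns true for every input, because the facts always hold.  */
static bool or_zero_edge_facts_hold(unsigned a, unsigned b)
{
    if ((a | b) == 0) {
        /* True edge: no bit is set in either operand,
           so both 'a == 0' and 'b == 0' can be recorded.  */
        return a == 0 && b == 0;
    }
    /* False edge: some bit is set in 'a' or in 'b', but we cannot
       tell which, so no per-operand fact can be recorded.  */
    return a != 0 || b != 0;
}
```

For `(a | b) != 0` the same facts apply with the two edges swapped.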

Re: [PATCH] RISC-V: Add norelax function attribute

2024-11-07 Thread Kito Cheng
LGTM, thanks!, and I will defer this for a little bit to make the c-api side stable :) On Fri, Nov 8, 2024 at 12:19 AM wrote: > > From: yulong > > This patch adds norelax function attribute that be discussed in > riscv-c-api-doc PR#94. > URL:https://github.com/riscv-non-isa/riscv-c-api-doc/pull

[PATCH v2] testsuite: arm: Use effective-target arm_libc_fp_abi for pr68620.c test

2024-11-07 Thread Torbjörn SVENSSON
Changes since v1: - Switch to arm_libc_fp_abi from arm_fp @Christophe, can you test this patch in the linaro farm to ensure that it does not fail again? Ok for trunk and releases/gcc-14? -- This fixes reported regression at https://linaro.atlassian.net/browse/GNU-1407. gcc/testsuite/ChangeLog

RE: [PATCH v2 01/10] Match: Simplify branch form 4 of unsigned SAT_ADD into branchless

2024-11-07 Thread Li, Pan2
Thanks Tamar and Jeff for comments. > I'm not sure it's that simple. It'll depend on the micro-architecture. > So things like strength of the branch predictors, how fetch blocks are > handled (can you have embedded not-taken branches, short-forward-branch > optimizations, etc). > After: > >

Re: [PATCH v4 7/8] i386: Add zero maskload else operand.

2024-11-07 Thread Hongtao Liu
On Fri, Nov 8, 2024 at 1:58 AM Robin Dapp wrote: > > From: Robin Dapp > > gcc/ChangeLog: > > * config/i386/sse.md (maskload): > Call maskload..._1. > (maskload_1): Rename. Ok for x86 part. > --- > gcc/config/i386/sse.md | 21 ++--- > 1 file changed, 18 ins

Re: [PATCH] vect: Do not try to duplicate_and_interleave one-element mode.

2024-11-07 Thread Robin Dapp
> Could you walk me through the failure in more detail? It sounds > like can_duplicate_and_interleave_p eventually gets to the point of > subdividing the original elements, instead of either combining consecutive > elements (the best case), or leaving them as-is (the expected fallback > for SVE).

Re: [PATCHv2 2/3] ada: Fix GNU/Hurd priority range

2024-11-07 Thread Marc Poulhiès
Samuel Thibault writes: > GNU/Mach currently uses a 0..63 range. > > gcc/ada/ChangeLog: > > * libgnat/system-gnu.ads: New file. > * Makefile.rtl (x86-gnuhurd): Use libgnat/system-gnu.ads instead of > libgnat/system-freebsd.ads. > > Signed-off-by: Samuel Thibault > --- OK witho

Re: [PATCHv2 3/3] ada: Add GNU/Hurd x86_64 support

2024-11-07 Thread Marc Poulhiès
Samuel Thibault writes: > This is essentially the same as the i386-pc-gnu section, the differences > are the same as between freebsd i386 and freebsd x86_64. > > gcc/ada/ChangeLog: > > * Makefile.rtl: Add x86_64-pc-gnu section. > > Signed-off-by: Samuel Thibault OK without the ChangeLog

[PATCH 09/15] arm: [MVE intrinsics] add load_ext_gather_offset shape

2024-11-07 Thread Christophe Lyon
This patch adds the load_ext_gather_offset shape description. gcc/ChangeLog: * config/arm/arm-mve-builtins-shapes.cc (struct load_ext_gather): New. (struct load_ext_gather_offset_def): New. * config/arm/arm-mve-builtins-shapes.h (load_ext_gather_offset): Ne

[PATCH 15/15] arm: [MVE intrinsics] remove useless call_properties implementations.

2024-11-07 Thread Christophe Lyon
vstrq_impl derives from store_truncating and vldrq_impl derives from load_extending which both implement call_properties. No need to re-implement them in the derived classes. gcc/ChangeLog: * config/arm/arm-mve-builtins-base.cc (vstrq_impl): Remove call_properties. (vldrq

Re: [PATCH v2] aarch64: Optimise calls to ldexp with SVE FSCALE instruction [PR111733]

2024-11-07 Thread Richard Sandiford
Soumya AR writes: > Changes since v1: > > This revision makes use of the extended definition of aarch64_ptrue_reg to > generate predicate registers with the appropriate set bits. > > Earlier, there was a suggestion to add support for half floats as well. I > extended the patch to include HFs but G

[PATCH] inline-asm, i386, v2: Add "redzone" clobber support

2024-11-07 Thread Jakub Jelinek
On Thu, Nov 07, 2024 at 09:12:34AM +0100, Uros Bizjak wrote: > On Thu, Nov 7, 2024 at 9:00 AM Jakub Jelinek wrote: > > > > On Thu, Nov 07, 2024 at 08:47:34AM +0100, Uros Bizjak wrote: > > > Maybe we should always recognize "redzone", even for targets without > > > it. This is the way we recognize

RE: [PATCH v2 01/10] Match: Simplify branch form 4 of unsigned SAT_ADD into branchless

2024-11-07 Thread Li, Pan2
I see your point that the backend can leverage condition move to emit the branch code. > For instance see https://godbolt.org/z/fvrq3aq6K > On ISAs with conditional operations the branch version gets ifconverted. > On AArch64 we get: > sat_add_u_1(unsigned int, unsigned int): > addsw0

Re: [PATCH] AArch64: Block combine_and_move from creating FP literal loads

2024-11-07 Thread Richard Sandiford
Wilco Dijkstra writes: > The IRA combine_and_move pass runs if the scheduler is disabled and > aggressively > combines moves. The movsf/df patterns allow all FP immediates since they rely > on a split pattern. However splits do not happen during IRA, so the result is > extra literal loads. To

[committed] libstdc++: Tweak comments on includes in hashtable headers

2024-11-07 Thread Jonathan Wakely
std::is_permutation is only used in <bits/hashtable.h>, not in <bits/hashtable_policy.h>, so move the comment referring to it. libstdc++-v3/ChangeLog: * include/bits/hashtable.h: Add is_permutation to comment. * include/bits/hashtable_policy.h: Remove it from comment. --- Pushed as obvious. libstdc++-v3/include/bits/hasht

Re: [patch][v2] libgomp.texi: Document OpenMP's Interoperability Routines

2024-11-07 Thread Tobias Burnus
As there were no further remarks, I have now committed it as r15-5017-ge52cfd4bc23de1 with minor changes: * Referring to v6.0 not TR13 (same section numbers), * fixed one item in the 5.2 to-do list: 'declare mapper with iterator and present modifiers' comes from Appendix B and we had before a

Re: [r15-4988 Regression] FAIL: gcc.dg/gomp/max_vf-1.c scan-tree-dump-times ompexp "__builtin_GOMP_parallel_loop_nonmonotonic_dynamic \\(.*, 16, 0\\);" 1 on Linux/x86_64

2024-11-07 Thread Jakub Jelinek
On Thu, Nov 07, 2024 at 10:54:40AM +, Andrew Stubbs wrote: > On 07/11/2024 00:37, haochen.jiang wrote: > > d334f729e53867b838e867375b3f475ba793d96e is the first bad commit > > commit d334f729e53867b838e867375b3f475ba793d96e > > Author: Andrew Stubbs > > Date: Wed Nov 6 12:26:08 2024 + >

Re: [PATCH 07/10] aarch64: Add testcase for C/C++ ops on SVE ACLE types.

2024-11-07 Thread Richard Sandiford
Tejas Belagod writes: > This patch adds a test case to cover C/C++ operators on SVE ACLE types. This > does not cover all types, but covers most representative types. > > gcc/testsuite: > > * gcc.target/aarch64/sve/acle/general/cops.c: New test. > --- > .../aarch64/sve/acle/general/cops.c

Re: [PATCH 06/10] rtl: Validate subreg info when optimizing vec_select.

2024-11-07 Thread Richard Sandiford
Tejas Belagod writes: > When optimizing for NOPs in case of overlapping regs in VEC_SELECT > expressions, > validate subreg data before using simplify_subreg_regno. There is no real > SUBREG rtx here, but a pseudo subreg call to check if subregs are possible. > > gcc/ChangeLog: > > * rtlan

Re: [PATCH 00/10] aarch64: Enable C/C++ operations on SVE ACLE types.

2024-11-07 Thread Richard Sandiford
Tejas Belagod writes: > Hi, > > This patchset enables C/C++ operations on SVE ACLE types. I've replied to some of the individual patches, but otherwise the AArch64 parts look good to me. Thanks, Richard

Re: [PATCH 04/10] gimple: Disallow sizeless types in BIT_FIELD_REFs.

2024-11-07 Thread Richard Biener
On Thu, Nov 7, 2024 at 8:25 AM Tejas Belagod wrote: > > On 11/6/24 6:02 PM, Richard Biener wrote: > > On Wed, Nov 6, 2024 at 12:49 PM Tejas Belagod wrote: > >> > >> Ensure sizeless types don't end up trying to be canonicalised to > >> BIT_FIELD_REFs. > > > > You mean variable-sized? But don't w

[PATCH 07/15] arm: [MVE intrinsics] rework vstr scatter_base

2024-11-07 Thread Christophe Lyon
Implement vstr?q_scatter_base using the new MVE builtins framework. We need to introduce a new iterator (MVE_4) to support the set needed by vstr?q_scatter_base (V4SI V4SF V2DI). gcc/ChangeLog: * config/arm/arm-builtins.cc (arm_strsbs_qualifiers) (arm_strsbu_qualifiers, arm_strsb

[PATCH 10/15] arm: [MVE intrinsics] rework vldr gather_offset

2024-11-07 Thread Christophe Lyon
Implement vldr?q_gather_offset using the new MVE builtins framework. The patch introduces a new attribute iterator (MVE_u_elem) to accommodate the fact that ACLE's expected output description uses "uNN" for all modes, except V8HF where it expects ".f16". Using "V_sz_elem" would work, but would req

[PATCH 13/15] arm: [MVE intrinsics] rework vldr gather_base

2024-11-07 Thread Christophe Lyon
Implement vldr?q_gather_base using the new MVE builtins framework. The patch updates two testcases rather than using different iterators for predicated and non-predicated versions. According to ACLE: vldrdq_gather_base_s64 is expected to generate VLDRD.64 vldrdq_gather_base_z_s64 is expected to ge

Re: [PATCH v2] Doc: Add doc for standard name mask_len_strided_load{store}m

2024-11-07 Thread Richard Biener
On Thu, Nov 7, 2024 at 2:49 AM Li, Pan2 wrote: > > Hi Richard, > > I would like to double confirm about the doc as I am not the native speaker. > It may be referenced by all other developers and I am not sure if there is > something misleading or fuzzy. > Thanks a lot. The docs look good to me -

[PATCH 04/15] arm: [MVE intrinsics] rework vstr_scatter_shifted_offset

2024-11-07 Thread Christophe Lyon
Implement vstr?q_scatter_shifted_offset intrinsics using the MVE builtins framework. We use the same approach as the previous patch, and we now have four sets of patterns: - vector scatter stores with shifted offset (non-truncating) - predicated vector scatter stores with shifted offset (non-trunc

[PATCH 08/15] arm: [MVE intrinsics] rework vstr scatter_base_wb

2024-11-07 Thread Christophe Lyon
Implement vstr?q_scatter_base_wb using the new MVE builtins framework. The patch introduces a new 'b' type for signatures, which represents the type of the 'base' argument of vstr?q_scatter_base_wb. gcc/ChangeLog: * config/arm/arm-builtins.cc (arm_strsbwbs_qualifiers) (arm_strsbw

[PATCH 14/15] arm: [MVE intrinsics] rework vldr gather_base_wb

2024-11-07 Thread Christophe Lyon
Implement vldr?q_gather_base_wb using the new MVE builtins framework. gcc/ChangeLog: * config/arm/arm-builtins.cc (arm_ldrgbwbxu_qualifiers) (arm_ldrgbwbxu_z_qualifiers, arm_ldrgbwbs_qualifiers) (arm_ldrgbwbu_qualifiers, arm_ldrgbwbs_z_qualifiers) (arm_ldrgbwbu_z_q

[PATCH 06/15] arm: [MVE intrinsics] Add store_scatter_base shape

2024-11-07 Thread Christophe Lyon
This patch adds the store_scatter_base shape description. gcc/ChangeLog: * config/arm/arm-mve-builtins-shapes.cc (store_scatter_base): New. * config/arm/arm-mve-builtins-shapes.h (store_scatter_base): New. --- gcc/config/arm/arm-mve-builtins-shapes.cc | 49 +++

Re: [r15-4988 Regression] FAIL: gcc.dg/gomp/max_vf-1.c scan-tree-dump-times ompexp "__builtin_GOMP_parallel_loop_nonmonotonic_dynamic \\(.*, 16, 0\\);" 1 on Linux/x86_64

2024-11-07 Thread Jakub Jelinek
On Thu, Nov 07, 2024 at 11:31:17AM +, Andrew Stubbs wrote: > Anyway, I think the attached patch should fix it. It passes on my > configuration, but I don't have a Cascade Lake. You could have tested with whatever you have (if it has AVX) as -march= > OK? Yes, thanks. Jakub

Re: [patch][v2] libgomp.texi: Document OpenMP's Interoperability Routines

2024-11-07 Thread Tobias Burnus
I intended – but forgot – to actually attach the committed patch. Here it is … Tobias Burnus wrote: As there were no further remarks, I have now committed it as r15-5017-ge52cfd4bc23de1 with minor changes: * Referring to v6.0 not TR13 (same section numbers), * fixed one item in the 5.2 to-do l

[PATCH] rs6000: Add PowerPC inline asm redzone clobber support

2024-11-07 Thread Jakub Jelinek
Hi! The following patch on top of the https://gcc.gnu.org/pipermail/gcc-patches/2024-November/667949.html patch adds rs6000 part of the support (the only other target I'm aware of which clearly has red zone as well). 2024-11-07 Jakub Jelinek * config/rs6000/rs6000.h (struct machine_fu

Re: [PATCH] Optimize incoming integer argument promotion

2024-11-07 Thread Richard Biener
On Thu, Nov 7, 2024 at 5:50 AM H.J. Lu wrote: > > On Wed, Nov 6, 2024 at 6:01 PM Richard Biener > wrote: > > > > On Wed, Nov 6, 2024 at 10:52 AM H.J. Lu wrote: > > > > > > On Wed, Nov 6, 2024 at 4:29 PM Richard Biener > > > wrote: > > > > > > > > On Tue, Nov 5, 2024 at 10:50 PM H.J. Lu wrote:

[committed] libstdc++: Fix typo in comment in hashtable.h

2024-11-07 Thread Jonathan Wakely
And tweak grammar in a couple of comments. libstdc++-v3/ChangeLog: * include/bits/hashtable.h: Fix spelling in comment. --- Pushed as obvious. libstdc++-v3/include/bits/hashtable.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/libstdc++-v3/include/bits/hashtabl

[PATCH 11/15] arm: [MVE intrinsics] rework vldr gather_shifted_offset

2024-11-07 Thread Christophe Lyon
Implement vldr?q_gather_shifted_offset using the new MVE builtins framework. gcc/ChangeLog: * config/arm/arm-builtins.cc (arm_ldrgu_qualifiers) (arm_ldrgs_qualifiers, arm_ldrgs_z_qualifiers) (arm_ldrgu_z_qualifiers): Delete. * config/arm/arm-mve-builtins-base.cc (v

[PATCH 01/15] arm: [MVE intrinsics] add mode_after_pred helper in function_shape

2024-11-07 Thread Christophe Lyon
This new helper returns true if the mode suffix goes after the predicate suffix. This is true in most cases, so the base implementations in nonoverloaded_base and overloaded_base return true. For instance: vaddq_m_n_s32. This will be useful in later patches to implement vstr?q_scatter_offset_p (_
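To illustrate the suffix ordering this helper describes, here is a minimal sketch. It is not GCC's code: `build_intrinsic_name` and its parameters are hypothetical names chosen for this example, and the `vstrbq_scatter_offset_p_s8` case is an assumed instance of the vstr?q_scatter_offset_p family mentioned in the message.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of GCC): assemble an intrinsic name from
// its pieces.  When mode_after_pred is true the mode suffix follows the
// predicate suffix; otherwise the mode suffix comes first.
std::string
build_intrinsic_name (const std::string &base, const std::string &mode,
                      const std::string &pred, const std::string &type,
                      bool mode_after_pred)
{
  assert (!base.empty ());  // a name needs at least a base
  std::string name = base;
  if (mode_after_pred)
    name += pred + mode;    // e.g. vaddq + _m + _n
  else
    name += mode + pred;    // e.g. vstrbq + _scatter_offset + _p
  return name + type;
}
```

With these assumptions, the common case yields `vaddq_m_n_s32` (predicate `_m`, then mode `_n`), while the scatter-store case puts the mode first, as in `vstrbq_scatter_offset_p_s8`.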

[PATCH] rtl-optimization/117467 - 33% compile-time in rest of compilation

2024-11-07 Thread Richard Biener
ext-dce uses TV_NONE, that's not OK for a pass taking 33% compile-time. The following adds a timevar to it for proper blaming. Bootstrap running on x86_64-unknown-linux-gnu. PR rtl-optimization/117467 * timevar.def (TV_EXT_DCE): New. * ext-dce.cc (pass_data_ext_dce): Use T

[PATCH 05/15] arm: [MVE intrinsics] Check immediate is a multiple in a range

2024-11-07 Thread Christophe Lyon
This patch adds support to check that an immediate is a multiple of a given value in a given range. This will be used for instance by scatter_base to check that offset is in +/-4*[0..127]. Unlike require_immediate_range, require_immediate_range_multiple accepts signed range bounds to handle the a
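The check described above can be sketched as follows. This is only an illustration under stated assumptions: `immediate_in_range_multiple` and `valid_scatter_base_offset` are hypothetical names, and the actual signature of require_immediate_range_multiple is not shown in this message.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the check (not GCC's implementation):
// true if VALUE lies in the signed range [MIN, MAX] and is a multiple
// of MULTIPLE.
inline bool
immediate_in_range_multiple (int64_t value, int64_t min, int64_t max,
                             int64_t multiple)
{
  assert (multiple > 0 && min <= max);  // preconditions of this sketch
  // In C++ the remainder of a negative multiple is still 0, so the same
  // test covers both signs.
  return value >= min && value <= max && value % multiple == 0;
}

// The scatter_base example from the message: offset in +/-4*[0..127],
// i.e. a multiple of 4 between -508 and 508.
inline bool
valid_scatter_base_offset (int64_t offset)
{
  return immediate_in_range_multiple (offset, -4 * 127, 4 * 127, 4);
}
```

Signed bounds are what make the negative half of the range expressible; an unsigned-only range check could not describe -508..508 directly.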

[PATCH 02/15] arm: [MVE intrinsics] add store_scatter_offset shape

2024-11-07 Thread Christophe Lyon
This patch adds the store_scatter_offset shape and uses a new helper class (store_scatter), which will also be used by later patches. gcc/ChangeLog: * config/arm/arm-mve-builtins-shapes.cc (struct store_scatter): New. (struct store_scatter_offset_def): New. * config/arm/ar

Re: [PATCHv2 1/3] ada: Factorize bsd signal definitions

2024-11-07 Thread Marc Poulhiès
Samuel Thibault writes: > They are all the same on all BSD-like systems (including GNU/Hurd). > > gcc/ada/ChangeLog: > > * libgnarl/a-intnam__freebsd.ads: Rename to... > * libgnarl/a-intnam__bsd.ads: ... new file. > * libgnarl/a-intnam__dragonfly.ads: Remove file. > * Make

Re: [PATCH 04/10] gimple: Disallow sizeless types in BIT_FIELD_REFs.

2024-11-07 Thread Tejas Belagod
On 11/7/24 2:36 PM, Richard Biener wrote: On Thu, Nov 7, 2024 at 8:25 AM Tejas Belagod wrote: On 11/6/24 6:02 PM, Richard Biener wrote: On Wed, Nov 6, 2024 at 12:49 PM Tejas Belagod wrote: Ensure sizeless types don't end up trying to be canonicalised to BIT_FIELD_REFs. You mean variable-

Re: [PATCH] arm: Don't ICE on arm_mve.h pragma without MVE types [PR117408]

2024-11-07 Thread Christophe Lyon
Hi, On Fri, 1 Nov 2024 at 22:10, Torbjörn SVENSSON wrote: > > There is one more problem, that this patch does not address, and that is > that there are warnings like below, but I do not know what's causing them. > > .../gcc/testsuite/gcc.target/arm/pr117408-1.c:8:9: warning: 'pure' attribute >

[PATCH] c++: Disallow decomposition of lambda bases [PR90321]

2024-11-07 Thread Nathaniel Shead
Bootstrapped and lightly regtested on x86_64-pc-linux-gnu (so far just dg.exp), OK for trunk if full regtest succeeds? -- >8 -- Decomposition of lambda closure types is not allowed by [dcl.struct.bind] p6, since members of a closure have no name. r244909 made this an error, but missed the case w

RE: [PATCH 5/5] Allow multiple vectorized epilogs via --param vect-epilogues-nomask=N

2024-11-07 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Wednesday, November 6, 2024 2:32 PM > To: gcc-patches@gcc.gnu.org > Cc: RISC-V CI ; Tamar Christina > ; Richard Sandiford > Subject: [PATCH 5/5] Allow multiple vectorized epilogs via --param > vect-epilogues- > nomask=N > > The followi
