Re: [PATCH 04/10] gimple: Disallow sizeless types in BIT_FIELD_REFs.

2024-11-07 Thread Richard Biener
On Fri, Nov 8, 2024 at 7:30 AM Tejas Belagod wrote: > > On 11/7/24 5:52 PM, Richard Biener wrote: > > On Thu, Nov 7, 2024 at 11:13 AM Tejas Belagod wrote: > >> > >> On 11/7/24 2:36 PM, Richard Biener wrote: > >>> On Thu, Nov 7, 2024 at 8:25 AM Tejas Belagod > >>> wrote: > > On 11/6/24

Re: [PATCH v2] Optimize incoming integer argument promotion

2024-11-07 Thread Richard Biener
On Fri, Nov 8, 2024 at 5:32 AM H.J. Lu wrote: > > On Thu, Nov 7, 2024 at 8:16 PM Richard Biener > wrote: > > > > On Thu, Nov 7, 2024 at 5:50 AM H.J. Lu wrote: > > > > > > TARGET_PROMOTE_PROTOTYPES isn't defined for psABI purpose. > > > > > x86 psABI doesn't require it. GCC uses only the lower

Re: [PATCH] testsuite: arm: Use effective-target arm_fp for pr68620.c test

2024-11-07 Thread Torbjorn SVENSSON
On 2024-11-07 23:14, Christophe Lyon wrote: On Thu, 7 Nov 2024 at 19:09, Torbjorn SVENSSON wrote: On 2024-11-07 16:33, Richard Earnshaw (lists) wrote: On 06/11/2024 19:50, Torbjorn SVENSSON wrote: On 2024-11-06 19:06, Richard Earnshaw (lists) wrote: On 06/11/2024 13:50, Torbjorn SVEN

Re: [PATCH] VN: Canonicalize compares before calling vn_nary_op_lookup_pieces

2024-11-07 Thread Richard Biener
On Thu, Nov 7, 2024 at 10:13 PM Andrew Pinski wrote: > > This is the followup as mentioned in > https://gcc.gnu.org/pipermail/gcc-patches/2024-November/667987.html . > We need to canonicalize the compares using tree_swap_operands_p instead > of checking CONSTANT_CLASS_P. > > Bootstrapped and teste

RE: [PATCH v2 01/10] Match: Simplify branch form 4 of unsigned SAT_ADD into branchless

2024-11-07 Thread Tamar Christina
> -Original Message- > From: Jeff Law > Sent: Thursday, November 7, 2024 8:08 PM > To: Tamar Christina ; Li, Pan2 ; > Richard Biener > Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; > rdapp@gmail.com > Subject: Re: [PATCH v2 01/10] Match: Simplify branch form

Re: [PATCH] i386: Disallow long address mode in the x32 mode. [PR 117418]

2024-11-07 Thread Uros Bizjak
On Fri, Nov 8, 2024 at 6:52 AM Hongtao Liu wrote: > > > > PR target/117418 > > > > * config/i386/i386-options.cc (ix86_option_override_internal): > > > > raise an > > > > error with option -mx32 -maddress-mode=long. > > > > > > > > gcc/testsuite/ChangeLog: > > > > > > > >

Re: [PATCH] RISC-V: Add testcases for unsigned imm vec SAT_SUB form1

2024-11-07 Thread 钟居哲
LGTM juzhe.zh...@rivai.ai From: Li Xu Date: 2024-11-08 14:57 To: gcc-patches CC: kito.cheng; palmer; juzhe.zhong; xuli Subject: [PATCH] RISC-V: Add testcases for unsigned imm vec SAT_SUB form1 From: xuli form1: void __attribute__((noinline)) \ vec_sat_u_sub_imm##IMM##_##T##_fmt_

Re: [PATCH] RISC-V: Add norelax function attribute

2024-11-07 Thread yulong
Thanks, Kito and yangyu! On 2024/11/8 0:35, Yangyu Chen wrote: Thanks for doing this! On Nov 8, 2024, at 00:19, shiyul...@iscas.ac.cn wrote: From: yulong This patch adds norelax function attribute that be discussed in riscv-c-api-doc PR#94. URL:https://github.com/riscv-non-isa/riscv-c-api-doc/

[PATCH] RISC-V: Add testcases for unsigned imm vec SAT_SUB form1

2024-11-07 Thread Li Xu
From: xuli form1: void __attribute__((noinline)) \ vec_sat_u_sub_imm##IMM##_##T##_fmt_1 (T *out, T *in, unsigned limit) \ { \ unsigned i; \ for (i = 0; i < limit; i++) \

[PATCH v2] i386: Zero extend 32-bit address to 64-bit with option -mx32 -maddress-mode=long. [PR 117418]

2024-11-07 Thread Hu, Lin1
Thanks for your suggestions and answer. This is the current version. There is no problem in my test environment, but also in the further testing, sent for review. BRs, Lin -maddress-mode=long let Pmode = DI_mode, so zero extend 32-bit address to 64-bit and uses a 64-bit register as a pointer for

Re: [PATCH 04/10] gimple: Disallow sizeless types in BIT_FIELD_REFs.

2024-11-07 Thread Tejas Belagod
On 11/7/24 5:52 PM, Richard Biener wrote: On Thu, Nov 7, 2024 at 11:13 AM Tejas Belagod wrote: On 11/7/24 2:36 PM, Richard Biener wrote: On Thu, Nov 7, 2024 at 8:25 AM Tejas Belagod wrote: On 11/6/24 6:02 PM, Richard Biener wrote: On Wed, Nov 6, 2024 at 12:49 PM Tejas Belagod wrote: En

Re: [PATCH] [x86_64] Add microarchtecture tunable for pass_align_tight_loops

2024-11-07 Thread Mayshao-oc
> On Fri, Nov 8, 2024 at 10:21 AM Mayshao-oc wrote: > > > > -Original Message- > > > > From: Xi Ruoyao > > > > Sent: Thursday, November 7, 2024 1:12 PM > > > > To: Liu, Hongtao ; Mayshao-oc > > > o...@zhaoxin.com>; Hongtao Liu > > > > Cc: gcc-patches@gcc.gnu.org; hubi...@ucw.cz; ubiz...

Re: [PATCH] i386: Disallow long address mode in the x32 mode. [PR 117418]

2024-11-07 Thread Hongtao Liu
On Fri, Nov 8, 2024 at 1:21 PM Hongtao Liu wrote: > > On Fri, Nov 8, 2024 at 12:18 PM H.J. Lu wrote: > > > > On Fri, Nov 8, 2024 at 10:41 AM Hu, Lin1 wrote: > > > > > > Hi, all > > > > > > -maddress-mode=long will let Pmode = DI_mode, but -mx32 request x32 ABI. > > > So raise an error to avoid I

RE: [PATCH v2][GCC14] aarch64: Add support for FUJITSU-MONAKA (-mcpu=fujitsu-monaka) CPU

2024-11-07 Thread Yuta Mukai (Fujitsu)
>"Yuta Mukai (Fujitsu)" writes: >> Thank you for pushing to trunk. >> Can I also ask for a backport to GCC14? >> >> I have attached the patch for GCC14. >> FP8 has been excluded from the list as it is not supported in GCC14. >> >> Bootstrapped/regtested on aarch64-unknown-linux-gnu. > >LGTM, thank

Re: [PATCH] i386: Disallow long address mode in the x32 mode. [PR 117418]

2024-11-07 Thread Hongtao Liu
On Fri, Nov 8, 2024 at 12:18 PM H.J. Lu wrote: > > On Fri, Nov 8, 2024 at 10:41 AM Hu, Lin1 wrote: > > > > Hi, all > > > > -maddress-mode=long will let Pmode = DI_mode, but -mx32 request x32 ABI. > > So raise an error to avoid ICE. > > > > Bootstrapped and regtested, OK for trunk? > > > > BRs, >

[PATCH v2] Optimize incoming integer argument promotion

2024-11-07 Thread H.J. Lu
On Thu, Nov 7, 2024 at 8:16 PM Richard Biener wrote: > > On Thu, Nov 7, 2024 at 5:50 AM H.J. Lu wrote: > > > > TARGET_PROMOTE_PROTOTYPES isn't defined for psABI purpose. > > > > x86 psABI doesn't require it. GCC uses only the lower bits of incoming > > > > arguments. But it isn't the GCC's j

Re: [PATCH] i386: Disallow long address mode in the x32 mode. [PR 117418]

2024-11-07 Thread H.J. Lu
On Fri, Nov 8, 2024 at 10:41 AM Hu, Lin1 wrote: > > Hi, all > > -maddress-mode=long will let Pmode = DI_mode, but -mx32 request x32 ABI. > So raise an error to avoid ICE. > > Bootstrapped and regtested, OK for trunk? > > BRs, > Lin > > gcc/ChangeLog: > > PR target/117418 > * config

[PATCH] i386: Disallow long address mode in the x32 mode. [PR 117418]

2024-11-07 Thread Hu, Lin1
Hi, all -maddress-mode=long will let Pmode = DI_mode, but -mx32 request x32 ABI. So raise an error to avoid ICE. Bootstrapped and regtested, OK for trunk? BRs, Lin gcc/ChangeLog: PR target/117418 * config/i386/i386-options.cc (ix86_option_override_internal): raise an er

[PATCH] Guard truncate from vector float to vector __bf16 with !flag_rounding_math && HONOR_NANS (BFmode).

2024-11-07 Thread liuhongt
hw instruction doesn't raise exceptions, turns sNAN into qNAN quietly, and always round to nearest (even). Output denormals are always flushed to zero and input denormals are always treated as zero. MXCSR is not consulted nor updated. W/o native instructions, flag_unsafe_math_optimizations is neede
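
A rough sketch of the rounding behaviour described above, not the patch's code: truncating a float to a bf16 bit pattern with round-to-nearest-even and quieting of NaNs. The function name is hypothetical, and the flush-to-zero handling of denormals mentioned in the message is deliberately not modelled here.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: convert a float to the 16-bit bf16 encoding
   (the upper half of the IEEE binary32 bit pattern).  */
uint16_t float_to_bf16_rne(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);   /* reinterpret without aliasing issues */

    /* NaN: keep the top bits and force the quiet bit, so a signalling
       NaN becomes a quiet NaN.  */
    if ((bits & 0x7fffffffu) > 0x7f800000u)
        return (uint16_t)((bits >> 16) | 0x0040u);

    /* Round to nearest, ties to even: add 0x7fff plus the lowest bit
       that survives the truncation, then drop the low 16 bits.  */
    uint32_t bias = 0x7fffu + ((bits >> 16) & 1u);
    return (uint16_t)((bits + bias) >> 16);
}
```

Values exactly halfway between two bf16 neighbours go to the one whose last mantissa bit is zero, which is the "nearest (even)" rule the message refers to.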

Re: [PATCH] [x86_64] Add microarchtecture tunable for pass_align_tight_loops

2024-11-07 Thread Hongtao Liu
On Fri, Nov 8, 2024 at 10:21 AM Mayshao-oc wrote: > > > > -Original Message- > > > From: Xi Ruoyao > > > Sent: Thursday, November 7, 2024 1:12 PM > > > To: Liu, Hongtao ; Mayshao-oc > > o...@zhaoxin.com>; Hongtao Liu > > > Cc: gcc-patches@gcc.gnu.org; hubi...@ucw.cz; ubiz...@gmail.com;

Re: [PATCH] [x86_64] Add microarchtecture tunable for pass_align_tight_loops

2024-11-07 Thread Mayshao-oc
> > -Original Message- > > From: Xi Ruoyao > > Sent: Thursday, November 7, 2024 1:12 PM > > To: Liu, Hongtao ; Mayshao-oc > o...@zhaoxin.com>; Hongtao Liu > > Cc: gcc-patches@gcc.gnu.org; hubi...@ucw.cz; ubiz...@gmail.com; > > richard.guent...@gmail.com; Tim Hu(WH-RD) ; Silvia > > Zhao(B

Re: [PATCH v4 7/8] i386: Add zero maskload else operand.

2024-11-07 Thread Hongtao Liu
On Fri, Nov 8, 2024 at 1:58 AM Robin Dapp wrote: > > From: Robin Dapp > > gcc/ChangeLog: > > * config/i386/sse.md (maskload): > Call maskload..._1. > (maskload_1): Rename. Ok for x86 part. > --- > gcc/config/i386/sse.md | 21 ++--- > 1 file changed, 18 ins

RE: [PATCH v2 01/10] Match: Simplify branch form 4 of unsigned SAT_ADD into branchless

2024-11-07 Thread Li, Pan2
Thanks Tamar and Jeff for comments. > I'm not sure it's that simple. It'll depend on the micro-architecture. > So things like strength of the branch predictors, how fetch blocks are > handled (can you have embedded not-taken branches, short-forward-branch > optimizations, etc). > After: > >
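
For readers following the branch versus branchless discussion, here is a minimal sketch of the two shapes of an unsigned saturating add. It is an illustration only: the function names are hypothetical, and the branchy variant is not claimed to be the exact "form 4" the patch matches.

```c
#include <stdint.h>

/* Branchy shape: test for wraparound and return the maximum on overflow.  */
uint32_t sat_add_branchy(uint32_t x, uint32_t y)
{
    uint32_t sum = x + y;          /* wraps modulo 2^32 on overflow */
    if (sum < x)
        return UINT32_MAX;
    return sum;
}

/* Branchless shape: turn the overflow test into an all-ones or all-zeros
   mask and OR it into the wrapped sum.  */
uint32_t sat_add_branchless(uint32_t x, uint32_t y)
{
    uint32_t sum = x + y;
    uint32_t mask = -(uint32_t)(sum < x);   /* 0xffffffff if overflow, else 0 */
    return sum | mask;
}
```

Both functions return the same result for every input; which one is faster depends on the micro-architecture, as the thread points out.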

Re: [PATCH] testsuite: arm: Use effective-target for nomve_fp_1 test

2024-11-07 Thread Christophe Lyon
On Thu, 7 Nov 2024 at 18:33, Torbjorn SVENSSON wrote: > > > > On 2024-11-07 11:40, Christophe Lyon wrote: > > Hi Torbjörn, > > > > On Thu, 31 Oct 2024 at 19:34, Torbjörn SVENSSON > > wrote: > >> > >> Ok for trunk and releases/gcc-14? > >> > >> -- > >> > >> Test uses MVE, so add effective-target a

[PATCH] libstdc++: Simplify _Hashtable merge functions

2024-11-07 Thread Jonathan Wakely
I realised that _M_merge_unique and _M_merge_multi call extract(iter) which then has to call _M_get_previous_node to iterate through the bucket to find the node before the one iter points to. Since the merge function is already iterating over the entire container, we had the previous node a moment

Re: [PATCH] testsuite: arm: Use effective-target arm_fp for pr68620.c test

2024-11-07 Thread Christophe Lyon
On Thu, 7 Nov 2024 at 19:09, Torbjorn SVENSSON wrote: > > > > On 2024-11-07 16:33, Richard Earnshaw (lists) wrote: > > On 06/11/2024 19:50, Torbjorn SVENSSON wrote: > >> > >> > >> On 2024-11-06 19:06, Richard Earnshaw (lists) wrote: > >>> On 06/11/2024 13:50, Torbjorn SVENSSON wrote: > >

Re: [PATCH v2][GCC14] aarch64: Add support for FUJITSU-MONAKA (-mcpu=fujitsu-monaka) CPU

2024-11-07 Thread Richard Sandiford
"Yuta Mukai (Fujitsu)" writes: > Thank you for pushing to trunk. > Can I also ask for a backport to GCC14? > > I have attached the patch for GCC14. > FP8 has been excluded from the list as it is not supported in GCC14. > > Bootstrapped/regtested on aarch64-unknown-linux-gnu. LGTM, thanks. Pushed

[committed] libstdc++: Improve comment for _Hashtable::_M_insert_unique_node

2024-11-07 Thread Jonathan Wakely
Clarify the effects if rehashing is needed. Document the __n_elt parameter. libstdc++-v3/ChangeLog: * include/bits/hashtable.h (_M_insert_unique_node): Improve comment. --- Pushed as obvious. libstdc++-v3/include/bits/hashtable.h | 7 +-- 1 file changed, 5 insertions(+), 2 d

[committed 2/2] libstdc++: Fix conversions to key/value types for hash table insertion [PR115285]

2024-11-07 Thread Jonathan Wakely
The conversions to key_type and value_type that are performed when inserting into _Hashtable need to be fixed to do any required conversions explicitly. The current code assumes that conversions from the parameter to the key_type or value_type can be done implicitly, which isn't necessarily true.

[committed 1/2] libstdc++: Define __is_pair variable template for C++11

2024-11-07 Thread Jonathan Wakely
libstdc++-v3/ChangeLog: * include/bits/stl_pair.h (__is_pair): Define for C++11 and C++14 as well. --- Tested powerpc64le-linux. Pushed to trunk. libstdc++-v3/include/bits/stl_pair.h | 6 ++ 1 file changed, 6 insertions(+) diff --git a/libstdc++-v3/include/bits/stl_pair.h b

[PATCH v2] c: Implement C2y N3356, if declarations [PR117019]

2024-11-07 Thread Marek Polacek
On Wed, Nov 06, 2024 at 06:06:46PM +, Joseph Myers wrote: > On Wed, 6 Nov 2024, Marek Polacek wrote: > > > On Wed, Nov 06, 2024 at 09:42:02AM -0500, Marek Polacek wrote: > > > On reflection, I'm not so sure about these anymore: > > > > > > On Mon, Nov 04, 2024 at 06:26:47PM -0500, Marek Polac

[RFC 4/9] opts: doc: aarch64: add new memtag sanitizer

2024-11-07 Thread Indu Bhagat
Add new command line option -fsanitize=memtag with the following new params: --param memtag-instrument-stack [0,1] (default 1) to use MTE insns for enabling dynamic checking of stack variables. --param memtag-instrument-alloca [0,1] (default 1) to use MTE insns for enabling dynamic checking of st

Re: [PATCH] testsuite: arm: Use check-function-bodies in epilog-1.c test

2024-11-07 Thread Christophe Lyon
On Thu, 7 Nov 2024 at 20:35, Torbjörn SVENSSON wrote: > > The generated assembler is: > > armv7-m: > push {r4, lr} > ldr r4, .L6 > ldr r4, [r4] > lsls r4, r4, #29 > it mi > addmi r2, r2, #1 > bl bar > movs

[RFC 0/9] Add -fsanitize=memtag

2024-11-07 Thread Indu Bhagat
Hi, Sending the current state of the work. I would like to get feedback on whether this is generally the right direction of adding the MEMTAG sanitizer in GCC. I have added some TBD/FIXME notes to each commit log. These are some of the things I am aware of and need to be resolved. Please let m

[RFC 5/9] targhooks: add new target hook TARGET_MEMTAG_TAG_MEMORY

2024-11-07 Thread Indu Bhagat
Add a new target hook TARGET_MEMTAG_TAG_MEMORY to tag (and untag) memory. The default implementation is empty. Hardware-assisted sanitizers on architectures providing instructions to tag/untag memory can then make use of this target hook. On AArch64, e.g., the MEMTAG sanitizer will use this hook

[RFC 1/9] opts: use unsigned HOST_WIDE_INT for sanitizer flags

2024-11-07 Thread Indu Bhagat
Currently, the data type of sanitizer flags is unsigned int, with SANITIZE_SHADOW_CALL_STACK (1UL << 31) being highest individual enumerator for enum sanitize_code. Use 'unsigned HOST_WIDE_INT' data type to allow for more distinct instrumentation modes be added when needed. FIXME: 1. Is using d_u
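
A small sketch of why the wider type is needed, assuming a 64-bit integer as a stand-in for 'unsigned HOST_WIDE_INT'. The macro and function names below are hypothetical and are not GCC's: once bit 31 is taken, any further flag no longer fits in a 32-bit flag word.

```c
#include <stdint.h>

/* Hypothetical flag values: bit 31 is the last one a 32-bit word can hold.  */
#define HYP_FLAG_BIT31 (UINT32_C(1) << 31)
#define HYP_FLAG_BIT32 (UINT64_C(1) << 32)   /* needs a 64-bit flag word */

/* Nonzero if the flag set survives a round trip through 32 bits.  */
int fits_in_32_bits(uint64_t flags)
{
    return (uint32_t)flags == flags;
}

/* Nonzero if FLAG is set in FLAGS.  */
int has_flag(uint64_t flags, uint64_t flag)
{
    return (flags & flag) != 0;
}
```

Storing a set that contains HYP_FLAG_BIT32 in a 32-bit variable silently drops that flag, which is the situation the wider type avoids.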

[RFC 8/9] asan: memtag: enable pass_asan for memtag sanitizer

2024-11-07 Thread Indu Bhagat
Check for SANITIZER_MEMTAG in the gate function for pass_asan gimple pass; enable it. TBD: - This commit was initially carved out in order to ensure each patch works in isolation. Need to revisit and double check this. gcc/ChangeLog: * asan.cc (memtag_sanitize_p): Fix definition.

Re: [PATCH] c++: Fix ICE on constexpr virtual function [PR117317]

2024-11-07 Thread Jason Merrill
On 10/30/24 3:17 AM, Jakub Jelinek wrote: Hi! Since C++20 virtual methods can be constexpr, and if they are constexpr evaluated, we choose tentative_decl_linkage for those defer their output and decide at_eof again. On the following testcases we ICE though, because if expand_or_defer_fn_1 decide

[RFC 9/9] memtag: testsuite: add new tests

2024-11-07 Thread Indu Bhagat
Add basic tests for MEMTAG sanitizer. MEMTAG sanitizer uses target hooks to emit AArch64 specific MTE instructions. Add new target-specific tests. The currently generated code has quite a few limitations: 1. For basic-1.c testcase, currently we generate: subg x0, x0, #16, #0

[RFC 6/9] aarch64: memtag: implement target hooks

2024-11-07 Thread Indu Bhagat
MEMTAG sanitizer, which is based on the HWASAN sanitizer, will invoke the target-specific hooks to create a random tag, add tag to memory address, and finally tag and untag memory. Implement the target hooks to emit MTE instructions if MEMTAG sanitizer is in effect. Continue to use the default ta

[RFC 3/9] aarch64: add new insn definition for st2g

2024-11-07 Thread Indu Bhagat
Store Allocation Tags (st2g) is an Armv8.5-A memory tagging (MTE) instruction. It stores an allocation tag to two tag granules of memory. TBD: - Not too sure what is the best way to generate the st2g yet; A subsequent patch will emit them in one of the target hooks. - the current define_in

[RFC 2/9] aarch64: add new define_insn for subg

2024-11-07 Thread Indu Bhagat
subg (Subtract with Tag) is an Armv8.5-A memory tagging (MTE) instruction. It can be used to subtract an immediate value scaled by the tag granule from the address in the source register. gcc/ChangeLog: * config/aarch64/aarch64.md (subg): New definition. --- gcc/config/aarch64/aarch64.m

[RFC 7/9] hwasan: add support for generating MTE instructions for memory tagging

2024-11-07 Thread Indu Bhagat
Memory tagging is used for detecting memory safety bugs. On AArch64, the memory tagging extension (MTE) helps in reducing the overheads of memory tagging: - CPU: MTE instructions for efficiently tagging and untagging memory. - Memory: New memory type, Normal Tagged Memory, added to the Arm Ar

[PATCH] VN: Canonicalize compares before calling vn_nary_op_lookup_pieces

2024-11-07 Thread Andrew Pinski
This is the followup as mentioned in https://gcc.gnu.org/pipermail/gcc-patches/2024-November/667987.html . We need to canonicalize the compares using tree_swap_operands_p instead of checking CONSTANT_CLASS_P. Bootstrapped and tested on x86_64-linux-gnu. gcc/ChangeLog: * tree-ssa-sccvn.cc

Re: [PATCH v2 01/10] Match: Simplify branch form 4 of unsigned SAT_ADD into branchless

2024-11-07 Thread Jeff Law
On 11/7/24 8:07 AM, Tamar Christina wrote: -Original Message- From: Li, Pan2 Sent: Thursday, November 7, 2024 12:57 PM To: Tamar Christina ; Richard Biener Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; jeffreya...@gmail.com; rdapp@gmail.com Subject: R

RE: [PATCH 3/5] Add LOOP_VINFO_MAIN_LOOP_INFO

2024-11-07 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Wednesday, November 6, 2024 2:30 PM > To: gcc-patches@gcc.gnu.org > Cc: Richard Sandiford ; Tamar Christina > > Subject: [PATCH 3/5] Add LOOP_VINFO_MAIN_LOOP_INFO > > The following introduces LOOP_VINFO_MAIN_LOOP_INFO alongside > LOOP_V

[PATCH] testsuite: arm: Use check-function-bodies in epilog-1.c test

2024-11-07 Thread Torbjörn SVENSSON
The generated assembler is: armv7-m: push {r4, lr} ldr r4, .L6 ldr r4, [r4] lsls r4, r4, #29 it mi addmi r2, r2, #1 bl bar movs r0, #0 pop {r4, pc} armv8.1-m.main: push {r3, r4, r5

[PATCH][ivopts]: perform affine fold to unsigned on non address expressions. [PR114932]

2024-11-07 Thread Tamar Christina
Hi All, When the patch for PR114074 was applied we saw a good boost in exchange2. This boost was partially caused by a simplification of the addressing modes. With the patch applied IV opts saw the following form for the base addressing; Base: (integer(kind=4) *) &block + ((sizetype) ((unsigne

Re: [PATCH] bpf: avoid possible null deref in btf_ext_output [PR target/117447]

2024-11-07 Thread Jose E. Marchesi
Hi Faust. Thanks for the patch. OK for master. > The BPF-specific .BTF.ext section is always generated for BPF programs > if -gbtf is specified, and generating it requires BTF information and > assumes that the BTF info has already been generated. > > Compiling non-C languages to BPF is not sup

Re: [PATCH] testsuite: arm: Use effective-target arm_fp for pr68620.c test

2024-11-07 Thread Torbjorn SVENSSON
On 2024-11-07 16:33, Richard Earnshaw (lists) wrote: On 06/11/2024 19:50, Torbjorn SVENSSON wrote: On 2024-11-06 19:06, Richard Earnshaw (lists) wrote: On 06/11/2024 13:50, Torbjorn SVENSSON wrote: On 2024-11-06 14:04, Richard Earnshaw (lists) wrote: On 06/11/2024 12:23, Torbjorn SVENS

[PATCH] bpf: avoid possible null deref in btf_ext_output [PR target/117447]

2024-11-07 Thread David Faust
The BPF-specific .BTF.ext section is always generated for BPF programs if -gbtf is specified, and generating it requires BTF information and assumes that the BTF info has already been generated. Compiling non-C languages to BPF is not supported, nor is generating CTF/BTF for non-C. But, compiling

[committed] btf: check hash maps are non-null before emptying

2024-11-07 Thread David Faust
These maps will always be non-null in btf_finalize under normal circumstances, but be safe and verify that before trying to empty them. Tested on x86_64-linux-gnu and x86_64-linux-gnu host for bpf-unknown-none target. Pushed as obvious. gcc/ * btfout.cc (btf_finalize): Check that hash map

RE: [PATCH 5/5] Allow multiple vectorized epilogs via --param vect-epilogues-nomask=N

2024-11-07 Thread Richard Biener
On Thu, 7 Nov 2024, Tamar Christina wrote: > > -Original Message- > > From: Richard Biener > > Sent: Wednesday, November 6, 2024 2:32 PM > > To: gcc-patches@gcc.gnu.org > > Cc: RISC-V CI ; Tamar Christina > > ; Richard Sandiford > > Subject: [PATCH 5/5] Allow multiple vectorized epilogs

Re: [PATCH v4 6/8] gcn: Add else operand to masked loads.

2024-11-07 Thread Andrew Stubbs
On 07/11/2024 17:57, Robin Dapp wrote: From: Robin Dapp This patch adds an undefined else operand to the masked loads. gcc/ChangeLog: * config/gcn/predicates.md (maskload_else_operand): New predicate. * config/gcn/gcn-valu.md: Use new predicate. --- gcc/config/gcn/gc

[PATCH v4 3/8] tree-ifcvt: Add zero maskload else value.

2024-11-07 Thread Robin Dapp
From: Robin Dapp When predicating a load we implicitly assume that the else value is zero. This matters in case the loaded value is padded (like e.g. a Bool) and we must ensure that the padding bytes are zero on targets that don't implicitly zero inactive elements. A former version of this patc
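
To make the notion of an else value concrete, here is a scalar sketch of a masked load, assuming a simple per-element mask. The function name and signature are hypothetical and do not correspond to any GCC internal function.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of a masked load: active elements are read
   from SRC, inactive elements receive ELSE_VAL.  */
void masked_load_else(uint8_t *dst, const uint8_t *src,
                      const uint8_t *mask, size_t n, uint8_t else_val)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = mask[i] ? src[i] : else_val;
}
```

With an else value of zero, inactive elements are well defined, which is what the if-conversion code relies on according to the message. A target that leaves inactive elements undefined would need an explicit zeroing step to provide the same guarantee.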

[PATCH v4 6/8] gcn: Add else operand to masked loads.

2024-11-07 Thread Robin Dapp
From: Robin Dapp This patch adds an undefined else operand to the masked loads. gcc/ChangeLog: * config/gcn/predicates.md (maskload_else_operand): New predicate. * config/gcn/gcn-valu.md: Use new predicate. --- gcc/config/gcn/gcn-valu.md | 23 +++

[PATCH v4 8/8] RISC-V: Add else operand to masked loads [PR115336].

2024-11-07 Thread Robin Dapp
From: Robin Dapp This patch adds else operands to masked loads. Currently the default else operand predicate just accepts "undefined" (i.e. SCRATCH) values. PR middle-end/115336 PR middle-end/116059 gcc/ChangeLog: * config/riscv/autovec.md: Add else operand. *

[PATCH v4 7/8] i386: Add zero maskload else operand.

2024-11-07 Thread Robin Dapp
From: Robin Dapp gcc/ChangeLog: * config/i386/sse.md (maskload): Call maskload..._1. (maskload_1): Rename. --- gcc/config/i386/sse.md | 21 ++--- 1 file changed, 18 insertions(+), 3 deletions(-) diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.m

[PATCH v4 4/8] vect: Add maskload else value support.

2024-11-07 Thread Robin Dapp
From: Robin Dapp This patch adds an else operand to vectorized masked load calls. The current implementation adds else-value arguments to the respective target-querying functions that is used to supply the vectorizer with the proper else value. We query the target for its supported else operand

[PATCH v4 5/8] aarch64: Add masked-load else operands.

2024-11-07 Thread Robin Dapp
From: Robin Dapp This adds zero else operands to masked loads and their intrinsics. I needed to adjust more than initially thought because we rely on combine for several instructions and a change in a "base" pattern needs to propagate to all those. gcc/ChangeLog: * config/aarch64/aarch6

[PATCH v4 0/8] Add maskload else operand.

2024-11-07 Thread Robin Dapp
From: Robin Dapp Hi, changes from v3: - Check if we support vec_cond_expr for the selected mode in case we need to set the inactive elements to zero. - Add another undef operand to gcn. - Remove unnecessary changes in i386 patch. Robin Dapp (8): docs: Document maskload else operand and beh

[PATCH v4 1/8] docs: Document maskload else operand and behavior.

2024-11-07 Thread Robin Dapp
From: Robin Dapp This patch amends the documentation for masked loads (maskload, vec_mask_load_lanes, and mask_gather_load as well as their len counterparts) with an else operand. gcc/ChangeLog: * doc/md.texi: Document masked load else operand. --- gcc/doc/md.texi | 63

[PATCH v4 2/8] ifn: Add else-operand handling.

2024-11-07 Thread Robin Dapp
From: Robin Dapp This patch adds else-operand handling to the internal functions. gcc/ChangeLog: * internal-fn.cc (add_mask_and_len_args): Rename... (add_mask_else_and_len_args): ...to this and add else handling. (expand_partial_load_optab_fn): Use adjusted function.

[PATCH v2] testsuite: arm: Use effective-target arm_libc_fp_abi for pr68620.c test

2024-11-07 Thread Torbjörn SVENSSON
Changes since v1: - Switch to arm_libc_fp_abi from arm_fp @Christophe, can you test this patch in the linaro farm to ensure that it does not fail again? Ok for trunk and releases/gcc-14? -- This fixes reported regression at https://linaro.atlassian.net/browse/GNU-1407. gcc/testsuite/ChangeLog

Re: [PATCH v2 1/2] VN: Handle `(a | b) !=/== 0` for predicates [PR117414]

2024-11-07 Thread Andrew Pinski
On Thu, Nov 7, 2024 at 12:48 AM Richard Biener wrote: > > On Thu, Nov 7, 2024 at 12:43 AM Andrew Pinski > wrote: > > > > For `(a | b) == 0`, we can "assert" on the true edge that > > both `a == 0` and `b == 0` but nothing on the false edge. > > For `(a | b) != 0`, we can "assert" on the false ed
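
The identity behind the quoted reasoning can be sketched in plain C. The function names are hypothetical; this only illustrates the logical fact, not how value numbering records predicates.

```c
/* (a | b) == 0 holds exactly when both operands are zero, so on the
   true edge of this test both a == 0 and b == 0 are known.  */
int or_is_zero(unsigned a, unsigned b)
{
    return (a | b) == 0;
}

/* The same condition written as two separate comparisons.  */
int both_zero(unsigned a, unsigned b)
{
    return a == 0 && b == 0;
}
```

On the false edge only "at least one operand is nonzero" is known, which says nothing about either operand individually. That is why the message notes that nothing can be asserted there.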

RE: [PATCH v2 01/10] Match: Simplify branch form 4 of unsigned SAT_ADD into branchless

2024-11-07 Thread Tamar Christina
> -Original Message- > From: Li, Pan2 > Sent: Thursday, November 7, 2024 1:45 AM > To: Tamar Christina ; Richard Biener > > Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; > jeffreya...@gmail.com; rdapp@gmail.com > Subject: RE: [PATCH v2 01/10] Match: Simplify

Re: [PATCH] ifcvt: Clarify if_info.original_cost.

2024-11-07 Thread Robin Dapp
> I think it'd be better if I abstain from this. I probably disagree too > much with the current structure and the way that the code is developing. > I won't object if anyone else approves it though. It's not that I'm happy with the current state either and I thought about how to rewrite it more

Re: [PATCH] testsuite: arm: Use effective-target for nomve_fp_1 test

2024-11-07 Thread Torbjorn SVENSSON
On 2024-11-07 11:40, Christophe Lyon wrote: Hi Torbjörn, On Thu, 31 Oct 2024 at 19:34, Torbjörn SVENSSON wrote: Ok for trunk and releases/gcc-14? -- Test uses MVE, so add effective-target arm_fp requirement. gcc/testsuite/ChangeLog: * g++.target/arm/mve/general-c++/nomve_fp_1.

Re: [PATCH] rtl-optimization/117467 - 33% compile-time in rest of compilation

2024-11-07 Thread Jeff Law
On 11/7/24 2:15 AM, Richard Biener wrote: ext-dce uses TV_NONE, that's not OK for a pass taking 33% compile-time. The following adds a timevar to it for proper blaming. Bootstrap running on x86_64-unknown-linux-gnu. PR rtl-optimization/117467 * timevar.def (TV_EXT_DCE): New.

Re: [PATCH] ifcvt: Clarify if_info.original_cost.

2024-11-07 Thread Richard Sandiford
"Robin Dapp" writes: >>> If the problem is tracking liveness, wouldn't it be better to >>> iterate over the "then" block in reverse order? We would start >>> with the liveness set for the join block and update as we move >>> backwards through the "then" block. This liveness set would >>> tell us

[PATCH] testsuite: arm: Allow vst1.32 instruction in pr40457-2.c

2024-11-07 Thread Torbjörn SVENSSON
Ok for trunk and releases/gcc-14? -- When building the test case with neon, the 'vst1.32' instruction is used instead of 'strd'. Allow both variants to make the test pass. gcc/testsuite/ChangeLog: * gcc.target/arm/pr40457-2.c: Add vst1.32 as an allowed instruction. Signed-off-b

Re: [PATCH 04/10] gimple: Disallow sizeless types in BIT_FIELD_REFs.

2024-11-07 Thread Richard Biener
On Thu, Nov 7, 2024 at 11:13 AM Tejas Belagod wrote: > > On 11/7/24 2:36 PM, Richard Biener wrote: > > On Thu, Nov 7, 2024 at 8:25 AM Tejas Belagod wrote: > >> > >> On 11/6/24 6:02 PM, Richard Biener wrote: > >>> On Wed, Nov 6, 2024 at 12:49 PM Tejas Belagod > >>> wrote: > > Ensure si

[PATCH v2] arm: Don't ICE on arm_mve.h pragma without MVE types [PR117408]

2024-11-07 Thread Torbjörn SVENSSON
Changes since v1: - Updated the error message to mention that arm_mve_types.h needs to be included. - Corrected some spelling errors in commit message. As the warning for pure functions returning void is not related to this patch, I'll leave it for you Christophe to look into. :) Ok for trunk

Re: [PATCH] ifcombine: For short circuit case, allow 2 defining statements [PR85605]

2024-11-07 Thread Andrew Pinski
On Fri, Nov 1, 2024 at 4:06 PM Andrew Pinski wrote: > > On Tue, Oct 29, 2024 at 10:10 AM Andrew Pinski wrote: > > > > On Tue, Oct 29, 2024 at 5:59 AM Richard Biener > > wrote: > > > > > > On Tue, Oct 29, 2024 at 4:29 AM Andrew Pinski > > > wrote: > > > > > > > > r0-126134-g5d2a9da9a7f7c1 added

Re: [PATCH] testsuite: arm: Use effective-target arm_fp for pr68620.c test

2024-11-07 Thread Richard Earnshaw (lists)
On 06/11/2024 19:50, Torbjorn SVENSSON wrote: > > > On 2024-11-06 19:06, Richard Earnshaw (lists) wrote: >> On 06/11/2024 13:50, Torbjorn SVENSSON wrote: >>> >>> >>> On 2024-11-06 14:04, Richard Earnshaw (lists) wrote: On 06/11/2024 12:23, Torbjorn SVENSSON wrote: > > > On 2024-1

Re: [PATCH][RFC][PR117093] match.pd: Fold vec_perm with view_convert

2024-11-07 Thread Richard Biener
On Tue, 5 Nov 2024, Jennifer Schmitz wrote: > We are working on a patch to improve the codegen for the following test case: > uint64x2_t foo (uint64x2_t r) { > uint32x4_t a = vreinterpretq_u32_u64 (r); > uint32_t t; > t = a[0]; a[0] = a[1]; a[1] = t; > t = a[2]; a[2] = a[3]; a[3] =

Re: [PATCH] RISC-V: Add norelax function attribute

2024-11-07 Thread Kito Cheng
LGTM, thanks!, and I will defer this for a little bit to make the c-api side stable :) On Fri, Nov 8, 2024 at 12:19 AM wrote: > > From: yulong > > This patch adds norelax function attribute that be discussed in > riscv-c-api-doc PR#94. > URL:https://github.com/riscv-non-isa/riscv-c-api-doc/pull

[pushed] Darwin: Fix a narrowing warning.

2024-11-07 Thread Iain Sandoe
Tested on x86_64-darwin, pushed to trunk, thanks Iain --- 8< --- cdtor_record needs to have an unsigned entry for the position in order to match with vec_safe_length. gcc/ChangeLog: * config/darwin.cc (cdtor_record): Make position unsigned. Signed-off-by: Iain Sandoe --- gcc/config/d

Re: [PATCH] RISC-V: Add norelax function attribute

2024-11-07 Thread Yangyu Chen
Thanks for doing this! > On Nov 8, 2024, at 00:19, shiyul...@iscas.ac.cn wrote: > > From: yulong > > This patch adds norelax function attribute that be discussed in > riscv-c-api-doc PR#94. > URL:https://github.com/riscv-non-isa/riscv-c-api-doc/pull/94 > > gcc/ChangeLog: > >* config/

Re: [PATCH v2 2/2] VN: Handle `(A CMP B) !=/== 0` for predicates [PR117414]

2024-11-07 Thread Andrew Pinski
On Thu, Nov 7, 2024 at 12:50 AM Richard Biener wrote: > > On Thu, Nov 7, 2024 at 12:43 AM Andrew Pinski > wrote: > > > > After the last patch, we also want to record `(A CMP B) != 0` > > as `(A CMP B)` and `(A CMP B) == 0` as `(A CMP B)` with the > > true/false edges swapped. > > > > This shows

[RFC/PATCH] c++: Unwrap type traits defined in terms of builtins within concept diagnostics [PR117294]

2024-11-07 Thread Nathaniel Shead
Does this approach seem reasonable? I'm pretty sure that the way I've handled the templating here is unideal but I'm not sure what a neat way to do what I'm trying to do here would be; any comments are welcome. -- >8 -- Currently, concept failures of standard type traits just report 'expression

[PATCH] RISC-V: Add norelax function attribute

2024-11-07 Thread shiyulong
From: yulong This patch adds norelax function attribute that be discussed in riscv-c-api-doc PR#94. URL:https://github.com/riscv-non-isa/riscv-c-api-doc/pull/94 gcc/ChangeLog: * config/riscv/riscv.cc (riscv_declare_function_name): Add new attribute. --- gcc/config/riscv/riscv.cc

[committed] libstdc++: Tweak comments on includes in hashtable headers

2024-11-07 Thread Jonathan Wakely
std::is_permutation is only used in <bits/hashtable.h> not in <bits/hashtable_policy.h>, so move the comment referring to it. libstdc++-v3/ChangeLog: * include/bits/hashtable.h: Add is_permutation to comment. * include/bits/hashtable_policy.h: Remove it from comment. --- Pushed as obvious. libstdc++-v3/include/bits/hasht

Re: [patch][v2] libgomp.texi: Document OpenMP's Interoperability Routines

2024-11-07 Thread Tobias Burnus
As there were no further remarks, I have now committed it as r15-5017-ge52cfd4bc23de1 with minor changes: * Referring to v6.0 not TR13 (same section numbers), * fixed one item in the 5.2 to-do list: 'declare mapper with iterator and present modifiers' comes from Appendix B and we had before a

Re: [PATCH] Optimize incoming integer argument promotion

2024-11-07 Thread Richard Biener
On Thu, Nov 7, 2024 at 5:50 AM H.J. Lu wrote: > > On Wed, Nov 6, 2024 at 6:01 PM Richard Biener > wrote: > > > > On Wed, Nov 6, 2024 at 10:52 AM H.J. Lu wrote: > > > > > > On Wed, Nov 6, 2024 at 4:29 PM Richard Biener > > > wrote: > > > > > > > > On Tue, Nov 5, 2024 at 10:50 PM H.J. Lu wrote:

[PATCH] rs6000: Add PowerPC inline asm redzone clobber support

2024-11-07 Thread Jakub Jelinek
Hi! The following patch on top of the https://gcc.gnu.org/pipermail/gcc-patches/2024-November/667949.html patch adds rs6000 part of the support (the only other target I'm aware of which clearly has red zone as well). 2024-11-07 Jakub Jelinek * config/rs6000/rs6000.h (struct machine_fu

Re: [patch][v2] libgomp.texi: Document OpenMP's Interoperability Routines

2024-11-07 Thread Tobias Burnus
I intended – but forgot – to actually attach the committed patch. Here it is … Tobias Burnus wrote: As there were no further remarks, I have now committed it as r15-5017-ge52cfd4bc23de1 with minor changes: * Referring to v6.0 not TR13 (same section numbers), * fixed one item in the 5.2 to-do l

[committed] libstdc++: Fix typo in comment in hashtable.h

2024-11-07 Thread Jonathan Wakely
And tweak grammar in a couple of comments. libstdc++-v3/ChangeLog: * include/bits/hashtable.h: Fix spelling in comment. --- Pushed as obvious. libstdc++-v3/include/bits/hashtable.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/libstdc++-v3/include/bits/hashtabl

RE: [PATCH v2 01/10] Match: Simplify branch form 4 of unsigned SAT_ADD into branchless

2024-11-07 Thread Tamar Christina
> -Original Message- > From: Li, Pan2 > Sent: Thursday, November 7, 2024 12:57 PM > To: Tamar Christina ; Richard Biener > > Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; > jeffreya...@gmail.com; rdapp@gmail.com > Subject: RE: [PATCH v2 01/10] Match: Simpli

Re: [PATCH] ifcvt: Clarify if_info.original_cost.

2024-11-07 Thread Robin Dapp
>> If the problem is tracking liveness, wouldn't it be better to >> iterate over the "then" block in reverse order? We would start >> with the liveness set for the join block and update as we move >> backwards through the "then" block. This liveness set would >> tell us whether the current instru

Re: [PATCH 1/3] aarch64: Add support for fp8 convert and scale

2024-11-07 Thread Saurabh Jha
On 11/7/2024 9:03 AM, Kyrylo Tkachov wrote: Hi Saurabh, On 6 Nov 2024, at 11:03, saurabh@arm.com wrote: The AArch64 FEAT_FP8 extension introduces instructions for conversion and scaling. This patch introduces the following intrinsics: 1. vcvt{1|2}_{bf16|high_bf16|low_bf16}_mf8_fpm. 2.

Re: [r15-4988 Regression] FAIL: gcc.dg/gomp/max_vf-1.c scan-tree-dump-times ompexp "__builtin_GOMP_parallel_loop_nonmonotonic_dynamic \\(.*, 16, 0\\);" 1 on Linux/x86_64

2024-11-07 Thread Andrew Stubbs
On 07/11/2024 11:07, Jakub Jelinek wrote: On Thu, Nov 07, 2024 at 10:54:40AM +, Andrew Stubbs wrote: On 07/11/2024 00:37, haochen.jiang wrote: d334f729e53867b838e867375b3f475ba793d96e is the first bad commit commit d334f729e53867b838e867375b3f475ba793d96e Author: Andrew Stubbs Date: Wed

Re: [PATCH] AArch64: Block combine_and_move from creating FP literal loads

2024-11-07 Thread Richard Sandiford
Wilco Dijkstra writes: > The IRA combine_and_move pass runs if the scheduler is disabled and > aggressively > combines moves. The movsf/df patterns allow all FP immediates since they rely > on a split pattern. However splits do not happen during IRA, so the result is > extra literal loads. To

RE: [PATCH v2 01/10] Match: Simplify branch form 4 of unsigned SAT_ADD into branchless

2024-11-07 Thread Li, Pan2
I see your point that the backend can leverage condition move to emit the branch code. > For instance see https://godbolt.org/z/fvrq3aq6K > On ISAs with conditional operations the branch version gets ifconverted. > On AArch64 we get: > sat_add_u_1(unsigned int, unsigned int): > addsw0
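For readers unfamiliar with the two shapes being discussed, here is a minimal C sketch of an unsigned saturating add written with a branch and without one. It is an assumption-laden illustration: the preview does not show which source pattern the series calls "form 4", and the names `sat_add_branch` and `sat_add_branchless` are hypothetical, not taken from the patch.

```c
#include <stdint.h>

/* Branchy form: compute the wrapped sum, then test for overflow.
   Unsigned overflow occurred exactly when the sum is smaller
   than one of the operands. */
uint32_t sat_add_branch(uint32_t x, uint32_t y)
{
    uint32_t sum = x + y;
    if (sum < x)
        return UINT32_MAX;
    return sum;
}

/* Branchless form: build a mask that is all ones on overflow and
   zero otherwise, then OR it into the wrapped sum. */
uint32_t sat_add_branchless(uint32_t x, uint32_t y)
{
    uint32_t sum = x + y;
    uint32_t mask = -(uint32_t)(sum < x);
    return sum | mask;
}
```

Both functions compute the same result. Whether a target prefers the compare-and-select shape or the mask shape is the kind of trade-off the thread is weighing.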

Re: [PATCH] Match: Fold pow calls to ldexp when possible [PR57492]

2024-11-07 Thread Richard Biener
On Tue, 5 Nov 2024, Soumya AR wrote: > > > > On 29 Oct 2024, at 7:16 PM, Richard Biener wrote: > > > > External email: Use caution opening links or attachments > > > > > > On Mon, 28 Oct 2024, Soumya AR wrote: > > > >> This patch transforms the following POW calls to equivalent LDEXP calls, as

RE: [PATCH 5/5] Allow multiple vectorized epilogs via --param vect-epilogues-nomask=N

2024-11-07 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Wednesday, November 6, 2024 2:32 PM > To: gcc-patches@gcc.gnu.org > Cc: RISC-V CI ; Tamar Christina > ; Richard Sandiford > Subject: [PATCH 5/5] Allow multiple vectorized epilogs via --param > vect-epilogues- > nomask=N > > The followi

Re: [r15-4988 Regression] FAIL: gcc.dg/gomp/max_vf-1.c scan-tree-dump-times ompexp "__builtin_GOMP_parallel_loop_nonmonotonic_dynamic \\(.*, 16, 0\\);" 1 on Linux/x86_64

2024-11-07 Thread Jakub Jelinek
On Thu, Nov 07, 2024 at 11:31:17AM +, Andrew Stubbs wrote: > Anyway, I think the attached patch should fix it. It passes on my > configuration, but I don't have a Cascade Lake. You could have tested with whatever you have (if it has AVX) as -march= > OK? Yes, thanks. Jakub

Re: [PATCH 00/10] aarch64: Enable C/C++ operations on SVE ACLE types.

2024-11-07 Thread Richard Sandiford
Tejas Belagod writes: > Hi, > > This patchset enables C/C++ operations on SVE ACLE types. I've replied to some of the individual patches, but otherwise the AArch64 parts look good to me. Thanks, Richard

Re: [PATCH 06/10] rtl: Validate subreg info when optimizing vec_select.

2024-11-07 Thread Richard Sandiford
Tejas Belagod writes: > When optimizing for NOPs in case of overlapping regs in VEC_SELECT > expressions, > validate subreg data before using simplify_subreg_regno. There is no real > SUBREG rtx here, but a pseudo subreg call to check if subregs are possible. > > gcc/ChangeLog: > > * rtlan

Re: [PATCH 07/10] aarch64: Add testcase for C/C++ ops on SVE ACLE types.

2024-11-07 Thread Richard Sandiford
Tejas Belagod writes: > This patch adds a test case to cover C/C++ operators on SVE ACLE types. This > does not cover all types, but covers most representative types. > > gcc/testsuite: > > * gcc.target/aarch64/sve/acle/general/cops.c: New test. > --- > .../aarch64/sve/acle/general/cops.c
