Re: [PATCH] c++, v2: Attempt to implement C++26 P3034R1 - Module Declarations Shouldn't be Macros [PR114461]

2024-10-29 Thread Jakub Jelinek
On Tue, Oct 29, 2024 at 04:31:00PM +0100, Jakub Jelinek wrote: > Here is a so far lightly tested patch, ok for trunk if it passes full > bootstrap/regtest or do you want some further changes? Bootstrapped/regtested successfully on x86_64-linux and i686-linux. Jakub

[PATCH] Fix ICE due to subreg:us_truncate.

2024-10-29 Thread liuhongt
force_operand issues an ICE when the input is (subreg:DI (us_truncate:V8QI)), probably because it is an invalid rtx, so refine the backend patterns for that. Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. Ready to push to trunk. gcc/ChangeLog: PR target/117318 * config/i386/s

RE: [PATCH 1/5] Match: Simplify branch form 4 of unsigned SAT_ADD into branchless

2024-10-29 Thread Li, Pan2
Thanks Richard for comments. > You are testing GENERIC folding, so gcc.dg/ is a better location, not > tree-ssa/ Sure, will move all test files to there. > I wonder if the simplification is already applied by the frontend and thus > .original shows the simplified form or only .gimple? Yes, you

Re: testsuite: Use noinline in gcc.dg/simulate-thread/simulate-thread.h

2024-10-29 Thread Jeff Law
On 10/29/24 11:29 AM, Joseph Myers wrote: On Thu, 24 Oct 2024, Joseph Myers wrote: Among the changes of test results with a -std=gnu23 default were two tests changing from PASS to UNSUPPORTED: UNSUPPORTED: gcc.dg/simulate-thread/speculative-store.c -O2 -g thread simulation test UNSUPPOR

[PATCH 2/2] Support vector float_extend from __bf16 to float.

2024-10-29 Thread liuhongt
It's supported by vector permutation with zero vector. gcc/ChangeLog: * config/i386/i386-expand.cc (ix86_expand_vector_bf2sf_with_vec_perm): New function. * config/i386/i386-protos.h (ix86_expand_vector_bf2sf_with_vec_perm): New Declare. * config/i386/mmx.m

[PATCH 1/2] [x86] Support vector float_truncate for SF to BF.

2024-10-29 Thread liuhongt
Generate a native instruction whenever possible, otherwise use vector permutation with odd indices. Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. Ready to push to trunk. gcc/ChangeLog: * config/i386/i386-expand.cc (ix86_expand_vector_sf2bf_with_vec_perm): New function.

Re: [PATCH v2 7/8] i386: Add else operand to masked loads.

2024-10-29 Thread Hongtao Liu
On Fri, Oct 18, 2024 at 10:23 PM Robin Dapp wrote: > > This patch adds a zero else operand to masked loads, in particular the > masked gather load builtins that are used for gather vectorization. > > gcc/ChangeLog: > > * config/i386/i386-expand.cc (ix86_expand_special_args_builtin): >

[PATCH v1] Doc: Add doc for standard name mask_len_strided_load{store}m

2024-10-29 Thread pan2 . li
From: Pan Li This patch adds documentation for the two standard names below. 1. strided load: v = mask_len_strided_load (ptr, stride, mask, len, bias) 2. strided store: mask_len_strided_store (ptr, stride, v, mask, len, bias) gcc/ChangeLog: * doc/md.texi: Add doc for mask_len_strided_lo

[PATCH] gimple: Remove special handling of COND_EXPR for COMPARISON_CLASS_P [PR116949, PR114785]

2024-10-29 Thread Andrew Pinski
After r13-707-g68e0063397ba82, COND_EXPR for gimple assign no longer could contain a comparison. The vectorizer was building gimple assigns with comparison until r15-4695-gd17e672ce82e69 (which added an assert to make sure it no longer builds it). So let's remove the special handling COND_EXPR i

Re: [PATCH] RISC-V: fix const interleaved stepped vector with a scalar pattern

2024-10-29 Thread 钟居哲
lgtm juzhe.zh...@rivai.ai From: Vineet Gupta Date: 2024-10-30 08:11 To: gcc-patches; Jeff Law; Robin Dapp; juzhe . zhong @ rivai . ai CC: gnu-toolchain; Vineet Gupta Subject: [PATCH] RISC-V: fix const interleaved stepped vector with a scalar pattern When bisecting for ICE in PR/117353, commi

[PATCH] RISC-V: fix const interleaved stepped vector with a scalar pattern

2024-10-29 Thread Vineet Gupta
When bisecting for ICE in PR/117353, commit 771256bcb9dd ("RISC-V: Emit costs for bool and stepped const vectors") uncovered yet another latent issue (first noted [1]) [1] https://github.com/patrick-rivos/gcc-postcommit-ci/issues/1625 This patch fixes some of the fortran regressions from that

Re: Frontend access to target features (was Re: [PATCH] libgccjit: Add ability to get CPU features)

2024-10-29 Thread Antoni Boucher
Thanks, David! Did you review the updated patch that depends on this gccrs patch? Is it also OK to merge when the PR in gccrs is merged? On 2024-10-29 at 17:04, David Malcolm wrote: On Tue, 2024-10-29 at 07:59 -0400, Antoni Boucher wrote: David: Arthur reviewed the gccrs patch and would be

Re: [PATCH v2 3/3] Simplify switch bit test clustering algorithm

2024-10-29 Thread Andi Kleen
> > However this exposes PR117352 which is a negative interaction of the > > more aggressive bit test conversion. I don't think it's a show stopper, > > this can be sorted out later. > > I think it is a show stopper for GCC 15 because it is a pretty big > performance regression with targets that

Re: [PATCH v2 9/9] aarch64: Handle alignment when it is bigger than BIGGEST_ALIGNMENT

2024-10-29 Thread Richard Sandiford
Evgeny Karpov writes: >> Wednesday, October 23, 2024 >> Richard Sandiford wrote: >> >>> Or, even if that does work, it isn't clear to me why patching >>> ASM_OUTPUT_ALIGNED_LOCAL is a complete solution to the problem. >> >> This patch reproduces the same code as it was done without declaring >>

Re: [PATCH v2 3/3] Simplify switch bit test clustering algorithm

2024-10-29 Thread Andi Kleen
On Tue, Oct 29, 2024 at 01:50:57PM +0100, Richard Biener wrote: > On Mon, Oct 28, 2024 at 9:58 PM Andi Kleen wrote: > > > > From: Andi Kleen > > > > The current switch bit test clustering enumerates all possible case > > cluster combinations to find ones that fit the bit test constraints > > best

[pushed: r15-4760] diagnostics: support multiple output formats simultaneously [PR116613]

2024-10-29 Thread David Malcolm
This patch generalizes diagnostic_context so that rather than having a single output format, it has a vector of zero or more. It adds two new options: -fdiagnostics-add-output=DIAGNOSTICS-OUTPUT-SPEC -fdiagnostics-set-output=DIAGNOSTICS-OUTPUT-SPEC which both take a new configuration syntax of t

[PATCH] [testsuite] disable PIE on ia32 on more tests

2024-10-29 Thread Alexandre Oliva
Multiple tests fail on ia32 with -fPIE enabled by default because of different call sequences required by the call-saved PIC register (no-callee-saved-*.c), uses of the constant pool instead of computing constants (pr100865-*.c), and unexpected matches of esp in get_pc_thunk (sse2-stv-1.c). Disa

Re: [PATCH] aarch64: Use canonicalize_comparison in ccmp expansion [PR117346]

2024-10-29 Thread Richard Sandiford
Andrew Pinski writes: > While testing the patch for PR 85605 on aarch64, it was noticed that > imm_choice_comparison.c test failed. This was because canonicalize_comparison > was not being called in the ccmp case. This can be noticed without the patch > for PR 85605 as evidence of the new testcase

[PATCH] [testsuite] fix pr70321.c PIC expectations

2024-10-29 Thread Alexandre Oliva
When we select a non-bx get_pc_thunk, we get an extra mov to set up the PIC register before the abort call. Expect that mov or a get_pc_thunk.bx call. Regstrapped on x86_64-linux-gnu; also tested on i686-linux-gnu with -fPIE. Ok to install? for gcc/testsuite/ChangeLog * gcc.target/

Re: [PATCH v2] [PR83782] ifunc: back-propagate ifunc_resolver to aliases

2024-10-29 Thread Alexandre Oliva
On Nov 8, 2023, Alexandre Oliva wrote: > Ping? Ping? https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635731.html The test still fails with gcc-14 and trunk on ia32 with -fPIE. I've just retested it on trunk on i686-linux-gnu with -fPIE, and on x86_64-linux-gnu. > gcc.target/i386/mvc1

Re: [PATCH 1/4] sched1: hookize pressure scheduling spilling agressiveness

2024-10-29 Thread Jeff Law
On 10/29/24 1:14 PM, Vineet Gupta wrote: On 10/29/24 11:51, Wilco Dijkstra wrote: Hi Vineet, I agree the NARROW/WIDE stuff is obfuscating things in technicalities. Is there evidence this change would make things significantly worse for some targets? Honestly I don't think this needs to be

Re: [PATCH 1/4] sched1: hookize pressure scheduling spilling agressiveness

2024-10-29 Thread Jeff Law
On 10/29/24 10:57 AM, Vineet Gupta wrote: Certainly open to more ideas on the naming, which I think will impact the documentation & comments as well. And to be 100% clear, no concerns with the behavior of the patch, it's really just the naming convention, documentation/comments. Thoughts?

Re: [PATCH v2 3/3] Simplify switch bit test clustering algorithm

2024-10-29 Thread Andrew Pinski
On Tue, Oct 29, 2024 at 3:04 PM Andi Kleen wrote: > > On Tue, Oct 29, 2024 at 01:50:57PM +0100, Richard Biener wrote: > > On Mon, Oct 28, 2024 at 9:58 PM Andi Kleen wrote: > > > > > > From: Andi Kleen > > > > > > The current switch bit test clustering enumerates all possible case > > > clusters

Re: [PATCH] Fortran: fix several front-end memleaks

2024-10-29 Thread Jerry D
On 10/29/24 2:00 PM, Harald Anlauf wrote: Dear all, while looking at the recent testcase gfortran.dg/pr115070.f90 with f951 running under valgrind, I noticed minor front-end memleaks of gfc_expr's that are probably fallout from a code refactoring, which are fixed by the attached. Regtested on x

[PATCH] libgo: Use stub syscall on GNU/Hurd

2024-10-29 Thread Samuel Thibault
GNU/Hurd does not actually have syscall(); it just has a stub that always returns ENOSYS and defines __stub_syscall. It does however expose a declaration for it: extern long int syscall (long int __sysno, ...) __THROW; that conflicts with the stub that libgo produces int syscall(int numbe

[PATCH] typos

2024-10-29 Thread Samuel Thibault
Changelog: * gcc/config/i386/t-freebsd64: Fix typo. * gcc/config/i386/t-gnu64: Fix typo. * gcc/config/i386/t-linux64: Fix typo. diff --git a/gcc/config/i386/t-freebsd64 b/gcc/config/i386/t-freebsd64 index 5e2cd3d2b6c..bd3a41c9516 100644 --- a/gcc/config/i386/t-freebsd64 ++

Re: Frontend access to target features (was Re: [PATCH] libgccjit: Add ability to get CPU features)

2024-10-29 Thread David Malcolm
On Tue, 2024-10-29 at 07:59 -0400, Antoni Boucher wrote: > David: Arthur reviewed the gccrs patch and would be OK with it. > > Could you please take a look and review it? https://github.com/Rust-GCC/gccrs/pull/3195 looks good to me; thanks! Dave > > On 2024-10-17 at 11:38, Antoni Boucher wrote

[PATCH] Fortran: fix several front-end memleaks

2024-10-29 Thread Harald Anlauf
Dear all, while looking at the recent testcase gfortran.dg/pr115070.f90 with f951 running under valgrind, I noticed minor front-end memleaks of gfc_expr's that are probably fallout from a code refactoring, which are fixed by the attached. Regtested on x86_64-pc-linux-gnu. OK for mainline? Thank

Fix PR rtl-optimization/117327

2024-10-29 Thread Eric Botcazou
PR rtl-optimization/117327 * reorg.cc (find_end_label): Do not return a dangling label at the end of the function and adjust commentary. 2024-10-29 Eric Botcazou * gcc.c-torture/execute/20241029-1.c: New test. -- Eric Botcazou /* PR rtl-optimization/117327 */ /* Testcase by Brad Moody */ __a

[PUSHED] aarch64: Remove unnecessary casts to rtx_code [PR117349]

2024-10-29 Thread Andrew Pinski
In aarch64_gen_ccmp_first/aarch64_gen_ccmp_next, the casts were no longer needed after r14-3412-gbf64392d66f291 which changed the type of the arguments to rtx_code. In aarch64_rtx_costs, they were no longer needed since r12-4828-g1d5c43db79b7ea which changed the type of code to rtx_code. Pushed a

[PATCH] aarch64: Use canonicalize_comparison in ccmp expansion [PR117346]

2024-10-29 Thread Andrew Pinski
While testing the patch for PR 85605 on aarch64, it was noticed that imm_choice_comparison.c test failed. This was because canonicalize_comparison was not being called in the ccmp case. This can be noticed without the patch for PR 85605 as evidence of the new testcase. Bootstrapped and tested on a

Re: [PATCH 1/4] sched1: hookize pressure scheduling spilling agressiveness

2024-10-29 Thread Vineet Gupta
On 10/29/24 11:51, Wilco Dijkstra wrote: > Hi Vineet, >> I agree the NARROW/WIDE stuff is obfuscating things in technicalities. > Is there evidence this change would make things significantly worse for > some targets? Honestly I don't think this needs to be behind any toggle or made optional at

Re: [PATCH] c: Add C2Y N3370 - Case range expressions support [PR117021]

2024-10-29 Thread Joseph Myers
On Tue, 29 Oct 2024, Jakub Jelinek wrote: > Hi! > > The following patch adds the C2Y N3370 paper support. > We had the case ranges as a GNU extension for decades, so this patch > simply: > 1) adds different diagnostics when it is used in C (depending on flag_isoc2y >and pedantic and warn_c23_
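For readers unfamiliar with the feature under discussion, here is a minimal sketch of a case range as GCC has long accepted it as a GNU extension. The function names are made up for illustration, the letter ranges assume an ASCII-compatible execution character set, and nothing here is taken from the patch itself.

```c
#include <assert.h>

/* Classify a character with case ranges ("low ... high").
   The spaces around "..." keep the ellipsis from being lexed
   as part of a neighbouring token. */
static int classify(int c)
{
    switch (c) {
    case '0' ... '9':
        return 1;               /* decimal digit */
    case 'a' ... 'z':
    case 'A' ... 'Z':
        return 2;               /* letter, assuming ASCII ordering */
    default:
        return 0;               /* anything else */
    }
}

/* Small self-check of the three outcomes. */
static void check_classify(void)
{
    assert(classify('5') == 1);
    assert(classify('q') == 2);
    assert(classify('Q') == 2);
    assert(classify('#') == 0);
}
```

Each range covers both endpoints, so '0' and '9' are both classified as digits.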

Re: [PATCH] c-family: Handle RAW_DATA_CST in complete_array_type [PR117313]

2024-10-29 Thread Joseph Myers
On Tue, 29 Oct 2024, Jakub Jelinek wrote: > Hi! > > The following testcase ICEs, because > add_flexible_array_elts_to_size -> complete_array_type > is done only after braced_lists_to_strings which optimizes > RAW_DATA_CST surrounded by INTEGER_CST into a larger RAW_DATA_CST > covering even the bo

Re: [PATCH 1/4] sched1: hookize pressure scheduling spilling agressiveness

2024-10-29 Thread Wilco Dijkstra
Hi Vineet, > I agree the NARROW/WIDE stuff is obfuscating things in technicalities. Is there evidence this change would make things significantly worse for some targets? I did a few runs on Neoverse V2 with various options and it looks beneficial both for integer and FP. On the example and option

Re: [PATCH] c: detect variably-modified types [PR117145,PR117245,PR100420]

2024-10-29 Thread Joseph Myers
On Sat, 26 Oct 2024, Martin Uecker wrote: > +tree > +c_build_pointer_type (tree to_type) > +{ > + addr_space_t as = to_type == error_mark_node? ADDR_SPACE_GENERIC > + : TYPE_ADDR_SPACE (to_type); This is badly formatted, missing space before '?'. > +/*

[PATCH v2] c++: Implement P2662R3, Pack Indexing [PR113798]

2024-10-29 Thread Marek Polacek
On Tue, Oct 22, 2024 at 07:42:57PM -0400, Jason Merrill wrote: > On 10/22/24 3:22 PM, Marek Polacek wrote: > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? > > > > -- >8 -- > > This patch implements C++26 Pack Indexing, as described in > > . > > Great! >

Re: [PATCH] c++: Implement P2662R3, Pack Indexing [PR113798]

2024-10-29 Thread Marek Polacek
On Thu, Oct 24, 2024 at 04:29:02PM -0400, Patrick Palka wrote: > On Wed, 23 Oct 2024, Jason Merrill wrote: > > > On 10/23/24 10:20 AM, Patrick Palka wrote: > > > On Tue, 22 Oct 2024, Marek Polacek wrote: > > > > > > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? > > > > > > > > -

Re: [PATCH] c++: Implement P2662R3, Pack Indexing [PR113798]

2024-10-29 Thread Marek Polacek
On Wed, Oct 23, 2024 at 10:20:39AM -0400, Patrick Palka wrote: > On Tue, 22 Oct 2024, Marek Polacek wrote: > > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? > > > > -- >8 -- > > This patch implements C++26 Pack Indexing, as described in > > . > > > > T

Re: [PATCH] Add 'cobol' to Makefile.def, take 2

2024-10-29 Thread Jakub Jelinek
On Mon, Oct 28, 2024 at 03:12:03PM -0400, James K. Lowden wrote: > Jakub Jelinek said: > > > > We'll mainly have to remember when pushing any of the series. > > > > The m2 dances were git_commit.py tweaks: > > https://gcc.gnu.org/r13-4588 > > followed by asking one of us gccadmins on IRC to inst

Re: [PATCH] Add 'cobol' to Makefile.def, take 2

2024-10-29 Thread James K. Lowden
On Tue, 29 Oct 2024 11:56:18 +0100 Richard Biener wrote: > gcc/ should be stripped from * gcc/common.opt, so just * common.opt ... > Likewise for gcc/cobol. I see. The names in these cases are relative to gcc, not to the whole project. The runtime library, libgcobol, like the other libraries, i

Re: testsuite: Use noinline in gcc.dg/simulate-thread/simulate-thread.h

2024-10-29 Thread Joseph Myers
On Thu, 24 Oct 2024, Joseph Myers wrote: > Among the changes of test results with a -std=gnu23 default were two > tests changing from PASS to UNSUPPORTED: > > UNSUPPORTED: gcc.dg/simulate-thread/speculative-store.c -O2 -g thread > simulation test > UNSUPPORTED: gcc.dg/simulate-thread/speculat

Re: [PATCH 1/2] RISC-V:Add intrinsic support for the CMOs extensions

2024-10-29 Thread Jeff Law
On 10/29/24 10:20 AM, Craig Topper wrote: Jeff, should this question in the spec be resolved before merging this? https://github.com/riscv-non-isa/riscv-c-api-doc/pull/93/files#r1817437534 It looks like a wrapper

Re: [PATCH 1/2] RISC-V:Add intrinsic support for the CMOs extensions

2024-10-29 Thread Jeff Law
On 10/29/24 10:20 AM, Craig Topper wrote: Jeff, should this question in the spec be resolved before merging this? https://github.com/riscv-non-isa/riscv-c-api-doc/pull/93/files#r1817437534 Actually, I think I mis-

Re: [PATCH] ifcombine: For short circuit case, allow 2 defining statements [PR85605]

2024-10-29 Thread Andrew Pinski
On Tue, Oct 29, 2024 at 5:59 AM Richard Biener wrote: > > On Tue, Oct 29, 2024 at 4:29 AM Andrew Pinski > wrote: > > > > r0-126134-g5d2a9da9a7f7c1 added support for short-circuiting and combining the ifs > > into using either AND or OR. But it only allowed the inner condition > > basic block having the

Re: [PATCH 1/4] sched1: hookize pressure scheduling spilling agressiveness

2024-10-29 Thread Vineet Gupta
On 10/29/24 08:05, Jeff Law wrote: > On 10/20/24 1:40 PM, Vineet Gupta wrote: >> Pressure sensitive scheduling seems to prefer "wide" schedules with more >> parallelism tending to more spills. This works better for in-order >> cores [1][2]. > I'm not really sure I'd characterize it that way, but I c

[pushed: r15-4739] jit: fix leak of pending_assemble_externals_set [PR117275]

2024-10-29 Thread David Malcolm
My recent r15-4580-g779c0390e3b57d fix for resetting state in varasm.cc introduced some noise to "make selftest-valgrind" and, presumably, a memory leak in libgccjit: ==2462086== 160 (56 direct, 104 indirect) bytes in 1 blocks are definitely lost in loss record 248 of 352 ==2462086==at 0x5270

[PATCH] Remove dead part of bool pattern recognition

2024-10-29 Thread Richard Biener
Given we no longer want vcond[u]{,_eq} and VEC_COND_EXPR or COND_EXPR with embedded GENERIC comparisons the whole check_bool_pattern and adjust_bool_stmts machinery is dead. It is effectively dead after r15-4713-g0942bb85fc5573 and the following patch removes it. Bootstrapped and tested on x86_64

Re: [PATCH] Add 'cobol' to Makefile.def, take 2

2024-10-29 Thread Jakub Jelinek
On Tue, Oct 29, 2024 at 11:56:18AM +0100, Richard Biener wrote: > It's probably best to have a first commit just generate the directories with > the > empty ChangeLog and amend the contrib/gcc-changelog/git_commit.py > script's default_changelog_locations. > > I'm not sure about the exact order of

Re: [PATCH v2 2/2] Match: make SAT_ADD case 7 commutative

2024-10-29 Thread Richard Biener
On Mon, Oct 28, 2024 at 4:45 PM Akram Ahmad wrote: > > Case 7 of unsigned scalar saturating addition defines > SAT_ADD = X <= (X + Y) ? (X + Y) : -1. This is the same as > SAT_ADD = Y <= (X + Y) ? (X + Y) : -1 due to usadd_left_part_1 > being commutative. > > The pattern for case 7 currently does
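The equivalence quoted above can be checked by brute force on a small type. This is only a sketch on 8-bit unsigned values with hypothetical function names; it illustrates the arithmetic identity and says nothing about how the match.pd pattern is written.

```c
#include <assert.h>
#include <stdint.h>

/* Case 7 as quoted: X <= (X + Y) ? (X + Y) : -1, on 8-bit unsigned. */
static uint8_t sat_add_cmp_x(uint8_t x, uint8_t y)
{
    uint8_t sum = (uint8_t)(x + y);        /* wraps modulo 256 */
    return x <= sum ? sum : (uint8_t)-1;   /* (uint8_t)-1 is 255 */
}

/* The same form with Y as the compared operand. */
static uint8_t sat_add_cmp_y(uint8_t x, uint8_t y)
{
    uint8_t sum = (uint8_t)(x + y);
    return y <= sum ? sum : (uint8_t)-1;
}

/* Compare both forms for every pair of 8-bit inputs. */
static void check_sat_add_forms(void)
{
    for (unsigned x = 0; x < 256; x++)
        for (unsigned y = 0; y < 256; y++)
            assert(sat_add_cmp_x((uint8_t)x, (uint8_t)y)
                   == sat_add_cmp_y((uint8_t)x, (uint8_t)y));
}
```

The two forms agree because a wrapped sum is smaller than both operands, while a sum that did not wrap is at least as large as either one.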

Re: [PATCH 1/2] RISC-V:Add intrinsic support for the CMOs extensions

2024-10-29 Thread Craig Topper
Jeff, should this question in the spec be resolved before merging this? https://github.com/riscv-non-isa/riscv-c-api-doc/pull/93/files#r1817437534 On Tue, Oct 29, 2024 at 8:55 AM Jeff Law wrote: > > > On 10/29/24 9:50 AM, Craig Topper wrote: > > The '# define rnum 1' may break user code that con

[pushed] c++: printing AGGR_INIT_EXPR args

2024-10-29 Thread Jason Merrill
Tested x86_64-pc-linux-gnu, applying to trunk. -- 8< -- PR30854 was about wrongly dumping the dummy object argument to a constructor; r126582 in 4.3 fixed that by skipping the first argument. But not all functions called by AGGR_INIT_EXPR are constructors, as observed in PR116634; we shouldn't s

[PATCH 1/2] Remove dead code in vectorizer pattern recog

2024-10-29 Thread Richard Biener
The following removes the code path in vect_recog_mask_conversion_pattern dealing with comparisons in COND_EXPRs. That can no longer happen. * tree-vect-patterns.cc (vect_recog_mask_conversion_pattern): Remove COMPARISON_CLASS_P rhs1 of COND_EXPR case and assert it doesn't

Re: [PATCH 1/4] sched1: hookize pressure scheduling spilling agressiveness

2024-10-29 Thread Jeff Law
On 10/20/24 1:40 PM, Vineet Gupta wrote: Pressure sensitive scheduling seems to prefer "wide" schedules with more parallelism tending to more spills. This works better for in-order cores [1][2]. I'm not really sure I'd characterize it that way, but I can also see how you got to the wide vs nar

Re: [RFC PATCH 5/5] vect: Also cost gconds for scalar

2024-10-29 Thread Richard Biener
On Mon, 28 Oct 2024, Alex Coplan wrote: > Currently we only cost gconds for the vector loop while we omit costing > them when analyzing the scalar loop; this unfairly penalizes the vector > loop in the case of loops with early exits. > > This (together with the previous patches) enables us to vec

Re: [RFC PATCH 1/5] vect: Force alignment peeling to vectorize more early break loops

2024-10-29 Thread Alex Coplan
On 29/10/2024 15:53, Alex Coplan wrote: > On 29/10/2024 13:39, Richard Biener wrote: > > On Mon, 28 Oct 2024, Alex Coplan wrote: > > > > > This allows us to vectorize more loops with early exits by forcing > > > peeling for alignment to make sure that we're guaranteed to be able to > > > safely re

Re: [PATCH 1/2] RISC-V:Add intrinsic support for the CMOs extensions

2024-10-29 Thread Jeff Law
On 10/29/24 9:50 AM, Craig Topper wrote: The '# define rnum 1' may break user code that contains a variable called rnum. Yikes! Thanks for noting. I'll take care of it. jeff
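The hazard Craig describes can be reproduced in a few lines. The sketch below stands in for a header that defines a short, unprefixed object-like macro; only the macro name comes from the thread, and the #undef is just one user-side way to make the example compile, not the fix that was applied to riscv_cmo.h.

```c
#include <assert.h>

/* Stand-in for a header that leaks a short, unprefixed macro. */
#define rnum 1

/* While the macro is visible, user code such as
       int rnum = 5;
   is preprocessed into
       int 1 = 5;
   and no longer compiles. */

#undef rnum   /* user-side workaround so the code below builds */

/* Ordinary user code that happens to use the same identifier. */
static int user_function(void)
{
    int rnum = 5;
    return rnum;
}

static void check_user_function(void)
{
    assert(user_function() == 5);
}
```

Headers usually avoid the problem by not defining such macros at all, or by using names from the implementation-reserved namespace.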

Re: [RISC-V] RISC-V: Add implication for M extension.

2024-10-29 Thread Jeff Law
On 10/11/24 2:55 AM, Tsung Chun Lin wrote: From 114731cd9cf28ad313de05a507b7253fb9bef3cb Mon Sep 17 00:00:00 2001 From: Tsung Chun Lin Date: Tue, 8 Oct 2024 17:40:59 -0600 Subject: [RISC-V] RISC-V: Add implication for M extension. That M implies Zmmul. gcc/ChangeLog: * common/confi

Re: [RFC PATCH 1/5] vect: Force alignment peeling to vectorize more early break loops

2024-10-29 Thread Alex Coplan
On 29/10/2024 13:39, Richard Biener wrote: > On Mon, 28 Oct 2024, Alex Coplan wrote: > > > This allows us to vectorize more loops with early exits by forcing > > peeling for alignment to make sure that we're guaranteed to be able to > > safely read an entire vector iteration without crossing a pag

Re: [PATCH 1/2] RISC-V:Add intrinsic support for the CMOs extensions

2024-10-29 Thread Craig Topper
The '# define rnum 1' may break user code that contains a variable called rnum. On Tue, Oct 29, 2024 at 8:46 AM Jeff Law wrote: > > > On 10/29/24 4:12 AM, shiyul...@iscas.ac.cn wrote: > > From: yulong > > > > gcc/ChangeLog: > > > > * config.gcc: Add riscv_cmo.h. > > * config/r

Re: [PATCH 1/2] RISC-V:Add intrinsic support for the CMOs extensions

2024-10-29 Thread Jeff Law
On 10/29/24 4:12 AM, shiyul...@iscas.ac.cn wrote: From: yulong gcc/ChangeLog: * config.gcc: Add riscv_cmo.h. * config/riscv/riscv_cmo.h: New file. I think Kito pointed out a minor problem and the linter's also pointed out a whitespace problem. I've fixed both locally and

[PATCH] c++, v2: Attempt to implement C++26 P3034R1 - Module Declarations Shouldn't be Macros [PR114461]

2024-10-29 Thread Jakub Jelinek
On Fri, Oct 25, 2024 at 12:52:41PM -0400, Jason Merrill wrote: > This does seem like a hole in the wording. I think the clear intent is that > the name/partition must neither be macro-expanded nor come from macro > expansion. I'll defer filing the DR and figuring out the right wording for the sta

Re: [PATCH v2 6/8] gcn: Add else operand to masked loads.

2024-10-29 Thread Andrew Stubbs
On 29/10/2024 09:39, Andrew Stubbs wrote: On 28/10/2024 20:03, Robin Dapp wrote: I'm not sure how this is different to just deleting the zero-initializer, which is what I already tested and found some random behaviour? The difference is in the else-operand predicate.  So unless there are more

Re: [PATCH 6/7] RISC-V: Make vectorized memset handle more cases

2024-10-29 Thread Jeff Law
On 10/29/24 7:59 AM, Craig Blackmore wrote: On 19/10/2024 14:05, Jeff Law wrote: On 10/18/24 7:12 AM, Craig Blackmore wrote: `expand_vec_setmem` only generated vectorized memset if it fitted into a single vector store.  Extend it to generate a loop for longer and unknown lengths. The tes

Re: [PATCH] Remove code in vectorizer pattern recog relying on vec_cond{u,eq,}

2024-10-29 Thread Richard Biener
On Sat, 26 Oct 2024, Richard Biener wrote: > With the intent to rely on vec_cond_mask and vec_cmp patterns > comparisons do not need rewriting into COND_EXPRs that eventually > combine to vec_cond{u,eq,}. > > Bootstrap and regtest running on x86_64-unknown-linux-gnu. So with this I effectively r

[PATCH v3] [aarch64] Fix function multiversioning dispatcher link error with LTO

2024-10-29 Thread Yangyu Chen
We forgot to apply DECL_EXTERNAL to __init_cpu_features_resolver decl. When building with LTO, the linker cannot find the __init_cpu_features_resolver.lto_priv* symbol, causing the link error. This patch gets this fixed by adding DECL_EXTERNAL to the decl. To avoid used but never defined warning f

Re: [PATCH 6/6] simplify-rtx: Simplify ROTATE:HI (X:HI, 8) into BSWAP:HI (X)

2024-10-29 Thread Jeff Law
On 10/29/24 4:15 AM, Kyrylo Tkachov wrote: Thanks, I’ll extend the comment when I commit the series. Would you be able to help with the review of the first one in the series by any chance? https://gcc.gnu.org/pipermail/gcc-patches/2024-October/05.html It's in my queue :-) jeff

Re: [PATCH 6/7] RISC-V: Make vectorized memset handle more cases

2024-10-29 Thread Craig Blackmore
On 19/10/2024 14:05, Jeff Law wrote: On 10/18/24 7:12 AM, Craig Blackmore wrote: `expand_vec_setmem` only generated vectorized memset if it fitted into a single vector store.  Extend it to generate a loop for longer and unknown lengths. The test cases now use -O1 so that they are not sensit

Re: [PATCH 7/7] RISC-V: Disable by pieces for vector setmem length > UNITS_PER_WORD

2024-10-29 Thread Craig Blackmore
On 20/10/2024 17:36, Jeff Law wrote: On 10/19/24 7:09 AM, Jeff Law wrote: On 10/18/24 7:13 AM, Craig Blackmore wrote: For fast unaligned access targets, by pieces uses up to UNITS_PER_WORD size pieces resulting in more store instructions than needed. For example gcc.target/riscv/rvv/base/

Re: [PATCH] testcase: Add testcase for tree-optimization/117341

2024-10-29 Thread Jeff Law
On 10/28/24 11:08 PM, Andrew Pinski wrote: Even though PR 117341 was a duplicate of PR 116768, another testcase, this time in C++, does not hurt to have. The testcase is self-contained and does not use libstdc++ directly except for operator new (it does not even call delete). Tested on x86_64-li

Re: [PATCH] Match: Fold pow calls to ldexp when possible [PR57492]

2024-10-29 Thread Richard Biener
On Mon, 28 Oct 2024, Soumya AR wrote: > This patch transforms the following POW calls to equivalent LDEXP calls, as > discussed in PR57492: > > powi (2.0, i) -> ldexp (1.0, i) > > a * powi (2.0, i) -> ldexp (a, i) > > 2.0 * powi (2.0, i) -> ldexp (1.0, i + 1) > > pow (powof2, i) -> ldexp (1.0,
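The three complete rewrites quoted above all follow from the definition of ldexp as scaling by a power of two. A short derivation, ignoring overflow, underflow and other floating-point corner cases that the real patch has to consider:

```latex
\[
\operatorname{ldexp}(a, i) = a \cdot 2^{i}
\]
\begin{align*}
\operatorname{powi}(2, i) &= 2^{i} = 1 \cdot 2^{i} = \operatorname{ldexp}(1, i) \\
a \cdot \operatorname{powi}(2, i) &= a \cdot 2^{i} = \operatorname{ldexp}(a, i) \\
2 \cdot \operatorname{powi}(2, i) &= 2^{i+1} = \operatorname{ldexp}(1, i + 1)
\end{align*}
```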

Re: Ping: [PATCH] Always set SECTION_RELRO for or .data.rel.ro{,.local} [PR116887]

2024-10-29 Thread Jeff Law
On 10/29/24 7:10 AM, Xi Ruoyao wrote: On Fri, 2024-10-11 at 02:54 +0800, Xi Ruoyao wrote: At least two ports (hppa and loongarch) need to set SECTION_RELRO for .data.rel.ro{,.local} in section_type_flags (PR52999 and PR116887), and I cannot see a reason not to just set it in the generic code.

Re: [PATCH 16/22] aarch64: libgcc: add GCS marking to asm

2024-10-29 Thread Yury Khrustalev
On Thu, Oct 24, 2024 at 05:31:58PM +0100, Richard Sandiford wrote: > Yury Khrustalev writes: > > From: Szabolcs Nagy > > > > libgcc/ChangeLog: > > > > * config/aarch64/aarch64-asm.h (FEATURE_1_GCS): Define. > > (GCS_FLAG): Define if GCS is enabled. > > (GNU_PROPERTY): Add GCS_FLAG. >

[to-be-committed][RISC-V] Aggressively hoist VXRM assignments

2024-10-29 Thread Jeff Law
So a while back I was looking at pixel_avg for RISC-V where we try to use vaaddu for the halfword-ceiling-average step. The problem with vaaddu is that you must set VXRM to a suitable rounding mode as it has an undetermined state at function entry or after a function call. It turns out some d
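For context, the averaging step can be sketched in plain C. This assumes that "ceiling average" means (a + b + 1) >> 1 on 16-bit halfwords, which is the result vaaddu gives when VXRM selects round-to-nearest-up; the function names are illustrative and nothing here reflects the patch itself.

```c
#include <assert.h>
#include <stdint.h>

/* Rounding-up average computed in a wider type so a + b + 1 cannot wrap. */
static uint16_t avg_ceil_u16(uint16_t a, uint16_t b)
{
    return (uint16_t)(((uint32_t)a + (uint32_t)b + 1u) >> 1);
}

/* The same value without widening: (a | b) - ((a ^ b) >> 1). */
static uint16_t avg_ceil_u16_narrow(uint16_t a, uint16_t b)
{
    return (uint16_t)((a | b) - ((a ^ b) >> 1));
}

/* Compare both formulations on a strided sample of the input space. */
static void check_avg_ceil(void)
{
    for (uint32_t a = 0; a <= 0xFFFFu; a += 257u)
        for (uint32_t b = 0; b <= 0xFFFFu; b += 255u)
            assert(avg_ceil_u16((uint16_t)a, (uint16_t)b)
                   == avg_ceil_u16_narrow((uint16_t)a, (uint16_t)b));
}
```

A single averaging instruction with the right rounding mode replaces the widen, add, add-one and shift sequence, which is why the cost of setting VXRM matters.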

Re: [PATCH] Match: Optimize log (x) CMP CST and exp (x) CMP CST operations

2024-10-29 Thread Richard Biener
On Mon, 28 Oct 2024, Soumya AR wrote: > This patch implements transformations for the following optimizations. > > logN(x) CMP CST -> x CMP expN(CST) > expN(x) CMP CST -> x CMP logN(CST) > > For example: > > int > foo (float x) > { > return __builtin_logf (x) < 0.0f; > } > > can just be: >
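The rewrite rests on the logarithm and exponential being strictly increasing, so applying the inverse function to both sides preserves the comparison. A sketch for the natural logarithm and the less-than case, restricted to real-valued inputs; NaNs, non-positive arguments to the logarithm and rounding of the folded constant need separate treatment that the snippet does not show:

```latex
\[
\ln x < c \iff x < e^{c} \quad (x > 0),
\qquad
e^{x} < c \iff x < \ln c \quad (c > 0).
\]
For the quoted example, $c = 0$ gives
\[
\ln x < 0 \iff x < e^{0} = 1 \quad (x > 0).
\]
```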

[PATCH] libstdc++: Avoid unnecessary copies in ranges::min/max [PR112349]

2024-10-29 Thread Patrick Palka
Tested on x86_64-pc-linux-gnu, does this look OK for trunk and perhaps 14? -- >8 -- Use a local reference for the (now possibly lifetime extended) result of *__first to avoid making unnecessary copies of it. PR libstdc++/112349 libstdc++-v3/ChangeLog: * include/bits/ranges_algo

Re: [PATCH 1/2] RISC-V:Add intrinsic support for the CMOs extensions

2024-10-29 Thread Kito Cheng
On Tue, Oct 29, 2024, 18:13 wrote: > From: yulong > > gcc/ChangeLog: > > * config.gcc: Add riscv_cmo.h. > * config/riscv/riscv_cmo.h: New file. > > --- > gcc/config.gcc | 2 +- > gcc/config/riscv/riscv_cmo.h | 93 > 2 files changed, 94 i

Ping: [PATCH] Always set SECTION_RELRO for or .data.rel.ro{,.local} [PR116887]

2024-10-29 Thread Xi Ruoyao
On Fri, 2024-10-11 at 02:54 +0800, Xi Ruoyao wrote: > At least two ports (hppa and loongarch) need to set SECTION_RELRO for > .data.rel.ro{,.local} in section_type_flags (PR52999 and PR116887), and > I cannot see a reason not to just set it in the generic code. > > With this applied we can also re

Re: [PATCH v2 2/8] ifn: Add else-operand handling.

2024-10-29 Thread Robin Dapp
>> +/* Integer constants representing which else value is supported for masked >> load >> + functions. */ >> +#define MASK_LOAD_ELSE_ZERO -1 >> +#define MASK_LOAD_ELSE_M1 -2 >> +#define MASK_LOAD_ELSE_UNDEFINED -3 >> + >> +#define MASK_LOAD_GATHER_ELSE_IDX 6 > > Why this define? I initially wa

Re: [PATCH 1/5] Match: Simplify branch form 4 of unsigned SAT_ADD into branchless

2024-10-29 Thread Richard Biener
On Tue, Oct 29, 2024 at 9:27 AM wrote: > > From: Pan Li > > There are sorts of forms for the unsigned SAT_ADD. Some of them are > complicated while others are cheap. This patch would like to simplify > the complicated form into the cheap ones. For example as below: > > From the form 4 (branch)

Re: [PATCH v2 2/3] Only do switch bit test clustering when multiple labels point to same bb

2024-10-29 Thread Richard Biener
On Mon, Oct 28, 2024 at 9:58 PM Andi Kleen wrote: > > From: Andi Kleen > > The bit cluster code generation strategy is only beneficial when > multiple case labels point to the same code. Do a quick check if > that is the case before trying to cluster. > > This fixes the switch part of PR117091 wh

Re: [RFC PATCH 5/5] vect: Also cost gconds for scalar

2024-10-29 Thread Richard Biener
On Tue, 29 Oct 2024, Richard Biener wrote: > On Mon, 28 Oct 2024, Alex Coplan wrote: > > > Currently we only cost gconds for the vector loop while we omit costing > > them when analyzing the scalar loop; this unfairly penalizes the vector > > loop in the case of loops with early exits. > > > > T

Re: [PATCH v2 3/3] Simplify switch bit test clustering algorithm

2024-10-29 Thread Richard Biener
On Mon, Oct 28, 2024 at 9:58 PM Andi Kleen wrote: > > From: Andi Kleen > > The current switch bit test clustering enumerates all possible case > cluster combinations to find ones that fit the bit test constraints > best. This causes performance problems with very large switches. > > For bit test

Re: [PATCH] ifcombine: For short circuit case, allow 2 defining statements [PR85605]

2024-10-29 Thread Richard Biener
On Tue, Oct 29, 2024 at 4:29 AM Andrew Pinski wrote: > > r0-126134-g5d2a9da9a7f7c1 added support for short-circuiting and combining the ifs > into using either AND or OR. But it only allowed the inner condition > basic block to contain only the conditional. This changes it to allow up to 2 > defining > statemen

Re: [PATCH v2 1/2] Match: support new case of unsigned scalar SAT_SUB

2024-10-29 Thread Richard Biener
On Mon, Oct 28, 2024 at 4:44 PM Akram Ahmad wrote: > > This patch adds a new case for unsigned scalar saturating subtraction > using a branch with a greater-than-or-equal condition. For example, > > X >= (X - Y) ? (X - Y) : 0 > > is transformed into SAT_SUB (X, Y) when X and Y are unsigned scalars
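
The expression quoted above can be checked against a plain reference implementation. The sketch below is an illustration only: it writes the quoted branch form as a C function and compares it with a straightforward saturating subtract. The function names and the use of uint32_t are assumptions, and SAT_SUB itself is GCC's internal function, not something callable from C.

```c
#include <assert.h>
#include <stdint.h>

/* The form quoted in the message: X >= (X - Y) ? (X - Y) : 0. */
uint32_t sat_sub_branch(uint32_t x, uint32_t y)
{
  uint32_t diff = x - y;          /* wraps modulo 2^32 when y > x */
  return x >= diff ? diff : 0;    /* a wrapped diff is always > x */
}

/* Reference saturating subtract: clamp at zero instead of wrapping. */
uint32_t sat_sub_ref(uint32_t x, uint32_t y)
{
  return x > y ? x - y : 0;
}

/* Spot-check that both forms agree on a few boundary values. */
void check_sat_sub(void)
{
  const uint32_t v[] = { 0u, 1u, 2u, UINT32_MAX - 1u, UINT32_MAX };
  for (unsigned i = 0; i < 5; ++i)
    for (unsigned j = 0; j < 5; ++j)
      assert(sat_sub_branch(v[i], v[j]) == sat_sub_ref(v[i], v[j]));
}
```

The comparison works because an unsigned X - Y that wraps is always larger than X, so the condition is false exactly when Y > X.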

Re: [RFC PATCH 2/5] vect: Don't guard scalar epilogue for inverted loops

2024-10-29 Thread Richard Biener
On Mon, 28 Oct 2024, Alex Coplan wrote: > For loops with LOOP_VINFO_EARLY_BREAKS_VECT_PEELED we should always > enter the scalar epilogue, so avoid emitting a guard on entry to the > epilogue. OK. I guess this can go in independently? Richard. > gcc/ChangeLog: > > * tree-vect-loop-manip

Re: [RFC PATCH 4/5] vect: Ensure we add vector skip guard even when versioning for aliasing

2024-10-29 Thread Richard Biener
On Mon, 28 Oct 2024, Alex Coplan wrote: > This fixes a latent wrong code issue whereby vect_do_peeling determined > the wrong condition for inserting the vector skip guard. Specifically > in the case where the loop niters are unknown at compile time we used to > check: > > !LOOP_REQUIRES_VERSI

RE: [PATCH 1/5] Internal-fn: Introduce new IFN MASK_LEN_STRIDED_LOAD{STORE}

2024-10-29 Thread Li, Pan2
Thanks Richard for comments. > Please mention the full optab names. Sure, let me adjust this before commit manually. > There is documentation missing for doc/md.texi for the new optabs. Ack, will take another patch for doc. > Otherwise looks OK. I'll note that non-masked or non-len-only-maske

Re: [RFC PATCH 3/5] vect: Fix dominators when adding a guard to skip the vector loop

2024-10-29 Thread Richard Biener
On Mon, 28 Oct 2024, Alex Coplan wrote: > From: Tamar Christina > > The alignment peeling changes exposed a latent missing dominator update > with early break vectorization, specifically when inserting the vector > skip edge, since the new edge bypasses the prolog skip block and thus > has the p

Re: [RFC PATCH 1/5] vect: Force alignment peeling to vectorize more early break loops

2024-10-29 Thread Richard Biener
On Mon, 28 Oct 2024, Alex Coplan wrote: > This allows us to vectorize more loops with early exits by forcing > peeling for alignment to make sure that we're guaranteed to be able to > safely read an entire vector iteration without crossing a page boundary. > > To make this work for VLA architectu

Re: [PATCH v2 5/8] amdgcn, openmp: Auto-detect USM mode and set HSA_XNACK

2024-10-29 Thread Tobias Burnus
Hi Andrew, Am 28.06.24 um 12:24 schrieb Andrew Stubbs: --- a/gcc/config/gcn/gcn.cc +++ b/gcc/config/gcn/gcn.cc @@ -70,6 +70,11 @@ static bool ext_gcn_constants_init = 0; enum gcn_isa gcn_isa = ISA_GCN3; /* Default to GCN3. */ +/* Record whether the host compiler added "omp unifed memory

[Patch] AMD GCN: mkoffload.cc - set HSA_XNACK for USM and 'xnack+' / 'xnack-' (was [Patch] AMD GCN: Set HSA_XNACK for USM and 'xnack+' / 'xnack-')

2024-10-29 Thread Tobias Burnus
Reposted for two reasons: First, I realized that the message should contain the word 'mkoffload.cc' to be clearer. But the main reason is that I kept changing whether I wanted to set HSA_XNACK=0 and warn with USM for gfx90{0,6,8} or only one or not. (In GCC, those default to xnack=no as t

Re: [PATCH] config: add -Werror=lto-type-mismatch, odr to bootstrap-lto*

2024-10-29 Thread Richard Biener
On Mon, Oct 28, 2024 at 12:22 PM Sam James wrote: > > Sam James writes: > > > Sam James writes: > > > >> Add -Werror=lto-type-mismatch,odr to bootstrap-lto* configurations to > >> help stop LTO breakage/correctness issues sneaking in. > >> > >> We discussed -Werror=strict-aliasing but it runs ea

Re: [Patch] AMD GCN: Set HSA_XNACK for USM and 'xnack+' / 'xnack-'

2024-10-29 Thread Andrew Stubbs
On 29/10/2024 12:10, Tobias Burnus wrote: Hi Andrew, Am 29.10.24 um 13:07 schrieb Andrew Stubbs: On 29/10/2024 11:44, Tobias Burnus wrote: This somewhat matches what is done in OG13 and in Andrew's patch at https://gcc.gnu.org/pipermail/gcc-patches/2024-June/655951.html albeit the code is some

Re: Frontend access to target features (was Re: [PATCH] libgccjit: Add ability to get CPU features)

2024-10-29 Thread Antoni Boucher
David: Arthur reviewed the gccrs patch and would be OK with it. Could you please take a look and review it? Le 2024-10-17 à 11 h 38, Antoni Boucher a écrit : Hi. Thanks for the review, David! I talked to Arthur and he's OK with having a file to include in both gccrs and libgccjit. I sent th

Re: [Patch] AMD GCN: Set HSA_XNACK for USM and 'xnack+' / 'xnack-'

2024-10-29 Thread Tobias Burnus
Hi Andrew, Am 29.10.24 um 13:07 schrieb Andrew Stubbs: On 29/10/2024 11:44, Tobias Burnus wrote: This somewhat matches what is done in OG13 and in Andrew's patch at https://gcc.gnu.org/pipermail/gcc-patches/2024-June/655951.html albeit the code is somewhat different. [For some reasons, this cod

Re: [Patch] AMD GCN: Set HSA_XNACK for USM and 'xnack+' / 'xnack-'

2024-10-29 Thread Andrew Stubbs
On 29/10/2024 11:44, Tobias Burnus wrote: While users can set HSA_XNACK themselves, it is much more convenient if the compiler sets it for them (at least if it is overridable). Some systems don't have XNACK, but for those that have it, the somewhat newish object code versions support three mo

Re: SVE intrinsics: Fold constant operands for svlsl.

2024-10-29 Thread Soumya AR
> On 24 Oct 2024, at 2:55 PM, Richard Sandiford > wrote: > > External email: Use caution opening links or attachments > > > Kyrylo Tkachov writes: >>> On 24 Oct 2024, at 10:39, Soumya AR wrote: >>> >>> Hi Richard, >>> On 23 Oct 2024, at 5:58 PM, Richard Sandiford wrote:

[Patch] AMD GCN: Set HSA_XNACK for USM and 'xnack+' / 'xnack-'

2024-10-29 Thread Tobias Burnus
While users can set HSA_XNACK themselves, it is much more convenient if the compiler sets it for them (at least if it is overridable). Some systems don't have XNACK, but for those that have it, the somewhat newish object code versions support three modes: unset (GCC: '-mxnack=any'; supporting

Re: [PATCH] Add 'cobol' to Makefile.def, take 2

2024-10-29 Thread Richard Biener
On Sat, Oct 26, 2024 at 10:37 PM James K. Lowden wrote: > > On Sat, 26 Oct 2024 11:22:20 +0800 > Xi Ruoyao wrote: > > > The changelog is not formatted correctly. gcc/ has its own > > changelog. And gcc/cobol should have its own changelog too, like all > > other frontends. > > Thank you for point
