Re: [PATCH] SVE intrinsics: Fold svaba with op1 all zeros to svabd.

2024-10-24 Thread Jennifer Schmitz
> On 24 Oct 2024, at 21:55, Richard Sandiford wrote: > > External email: Use caution opening links or attachments > > > Jennifer Schmitz writes: >> Similar to >> https://gcc.gnu.org/pipermail/gcc-patches/2024-October/665780.html, >> this patch implements folding of svaba to svabd if op1 is a

[PATCH v4 2/2] RISC-V: Add testcases for unsigned .SAT_SUB form 2 with IMM = 1.

2024-10-24 Thread Li Xu
From: xuli form2: T __attribute__((noinline)) \ sat_u_sub_imm##IMM##_##T##_fmt_2 (T x) \ { \ return x >= (T)IMM ? x - (T)IMM : 0; \ } Passed the rv64gcv regression test. Signed-off-by: Li Xu gcc/testsuite/ChangeLog: * gcc.target/ri

Re: [PATCH] cgraph: remove dead if stmt in build_cgraph_edges pass

2024-10-24 Thread Jakub Jelinek
On Thu, Oct 24, 2024 at 04:37:24PM +0200, Josef Melcr wrote: > This patch removes a dead if statement checking for gomp-parallel gimple > statements. This if is in the execute method of build_cgraph_edges pass, > which is executed right after the omp_expand pass, which removes these > gimple statem

Re: [PATCH 2/2] Match: make SAT_ADD case 7 commutative

2024-10-24 Thread Richard Biener
On Mon, Oct 21, 2024 at 4:23 PM Akram Ahmad wrote: > > Case 7 of unsigned scalar saturating addition defines > SAT_ADD = X <= (X + Y) ? (X + Y) : -1. This is the same as > SAT_ADD = Y <= (X + Y) ? (X + Y) : -1 due to usadd_left_part_1 > being commutative. > > The pattern for case 7 currently does
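
To illustrate the commutativity claim quoted above, here is a minimal C sketch, assuming 32-bit unsigned operands. The function names are hypothetical and do not come from the patch; the two functions differ only in which operand is compared against the wrapped sum.

```c
#include <stdint.h>

/* Case 7 as quoted: SAT_ADD = X <= (X + Y) ? (X + Y) : -1.
   Hypothetical helper name, for illustration only.  */
uint32_t sat_add_cmp_x(uint32_t x, uint32_t y)
{
    uint32_t sum = x + y;                 /* wraps modulo 2^32 on overflow */
    return x <= sum ? sum : (uint32_t)-1; /* -1 is the all-ones maximum */
}

/* The same expression with Y in the comparison instead of X.  */
uint32_t sat_add_cmp_y(uint32_t x, uint32_t y)
{
    uint32_t sum = x + y;
    return y <= sum ? sum : (uint32_t)-1;
}
```

When the addition wraps, the sum is smaller than both operands, so either comparison detects the overflow; when it does not wrap, the sum is at least as large as both.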

Re: [PATCH 08/22] aarch64: Add __builtin_aarch64_gcs* tests

2024-10-24 Thread Richard Sandiford
Yury Khrustalev writes: > From: Szabolcs Nagy > > gcc/testsuite/ChangeLog: > > * gcc.target/aarch64/gcspopm-1.c: New test. > * gcc.target/aarch64/gcspr-1.c: New test. > * gcc.target/aarch64/gcsss-1.c: New test. > --- > gcc/testsuite/gcc.target/aarch64/gcspopm-1.c | 69 +

Re: [PATCH 3/3] AArch64: Add support for SIMD xor immediate

2024-10-24 Thread Andrew Pinski
On Tue, Oct 15, 2024 at 4:34 AM Wilco Dijkstra wrote: > > > Add support for SVE xor immediate when generating AdvSIMD code and SVE is > available. > > Passes bootstrap & regress, OK for commit? > > gcc/ChangeLog: > > * config/aarch64/aarch64.cc (enum simd_immediate_check): Add > AARCH64_

[PATCH v2 0/1] Support for FMV in C front end.

2024-10-24 Thread alfie.richards
From: Alfie Richards This update serves to provide a minor cleanup and to CC in relevant maintainers. Additionally, I looked into the behavior of FMV on x86 with this patch and found the assembly looks reasonable, however the assembler produces an error for duplicate definitions so have left this

[COMMITTED 4/4] - Implement pointer_or_operator.

2024-10-24 Thread Andrew MacLeod
Well, perhaps the subject isn't precise. The existing pointer_or_operator is, like a few others, using irange operands, so is non-functional and this patch removes it. The functionality was never moved to the new dispatch system when Prange was implemented, and IIRC it was because Aldy never f

[PATCH] asan: Fix up build_check_stmt gsi handling [PR117209]

2024-10-24 Thread Jakub Jelinek
Hi! gsi_safe_insert_before properly updates gsi_bb in gimple_stmt_iterator in case it splits objects, but unfortunately build_check_stmt was in some places (but not others) using a copy of the iterator rather than the iterator passed from callers and so didn't propagate that to callers. I guess it

Re: [PATCH] cgraph: remove dead if stmt in build_cgraph_edges pass

2024-10-24 Thread Josef Melcr
So I experimented a little and ran the testsuite a few times. While both if statements seem to be dead, the assertion gcc_checking_assert (!is_gimple_omp (stmt)) doesn't actually hold, as adding this assert breaks around 40 omp/oacc tests, so some other statements are definitely slipping throug

Re: [PATCH 2/2] c++/modules: Retrofit imported partial specs over existing implicit instantiations [PR113814]

2024-10-24 Thread Nathaniel Shead
On Thu, Oct 24, 2024 at 12:05:18PM -0400, Jason Merrill wrote: > On 10/24/24 3:25 AM, Nathaniel Shead wrote: > > I wasn't sure whether I should include the ambiguity checking logic from > > process_partial_specialization; we don't do this anywhere else in the > > modules handling code that I could

Re: [PATCH 1/3] c++: Handle ABI for non-polymorphic dynamic classes

2024-10-24 Thread Jason Merrill
On 8/20/24 7:38 PM, Nathaniel Shead wrote: Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk? OK. -- >8 -- The Itanium ABI has specific rules for when virtual tables for dynamic classes should be emitted. However we didn't consider structures with virtual inheritance but no vi

[RFC][PATCH] Adjust param_vect_max_version_for_alias_checks

2024-10-24 Thread Kugan Vivekanandarajah
Hi, This patch sets param_vect_max_version_for_alias_checks to 15. This was causing GCC to miss vectorization opportunities in one internal application making it slower than LLVM by about ~14%. I've tested different param_vect_max_version_for_alias_checks such as 15 and 100 and the SPEC2017 resu

[PATCH v4 1/2] Match: Simplify (x != 0 ? x + ~0 : 0) to (x - x != 0).

2024-10-24 Thread Li Xu
From: xuli When the imm operand op1=1 in the unsigned scalar sat_sub form2 below, we can simplify (x != 0 ? x + ~0 : 0) to (x - x != 0), thereby eliminating a branch instruction. This simplification also applies to signed integers. Form2: T __attribute__((noinline)) \ sat_u_sub_imm##IM
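
As a rough sketch of the simplification described above, the two forms can be written side by side in C. This assumes 32-bit unsigned operands and reads the target form "(x - x != 0)" as x - (x != 0); the function names are hypothetical and are not taken from the patch.

```c
#include <stdint.h>

/* Branchy form from the subject line: x != 0 ? x + ~0 : 0.
   Adding the all-ones value is the same as subtracting 1.  */
uint32_t dec_sat_branch(uint32_t x)
{
    return x != 0 ? x + ~(uint32_t)0 : 0;
}

/* Branch-free form: (x != 0) evaluates to 0 or 1, so this
   subtracts 1 only when x is nonzero.  */
uint32_t dec_sat_branchless(uint32_t x)
{
    return x - (x != 0);
}
```

Both functions return 0 for an input of 0 and x - 1 otherwise, which is why the comparison result can replace the conditional.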

Re: [PATCH] target: Fix asm codegen for vfpclasss* and vcvtph2* instructions

2024-10-24 Thread Hongtao Liu
On Fri, Oct 25, 2024 at 12:19 AM Antoni Boucher wrote: > > Thanks. > Did you review the new patch? > Can I push it to master? Ok. > > On 2024-10-20 at 22:01, Hongtao Liu wrote: > > On Sat, Oct 19, 2024 at 2:06 AM Antoni Boucher wrote: > >> > >> Thanks for the review. > >> Here's the updated p

[COMMITTED 2/4] - Remove pointer_min_max_operator.

2024-10-24 Thread Andrew MacLeod
Similarly, the class pointer_min_max_operator was used back when it was shared with irange.  With prange, these operations are performed via bool operator_min::fold_range (prange &r, tree type, const prange &op1, const prange &op2, relation_trio) const   and bool operator_max::fold_range (pran

testsuite: Use noinline in gcc.dg/simulate-thread/simulate-thread.h

2024-10-24 Thread Joseph Myers
Among the changes of test results with a -std=gnu23 default were two tests changing from PASS to UNSUPPORTED: UNSUPPORTED: gcc.dg/simulate-thread/speculative-store.c -O2 -g thread simulation test UNSUPPORTED: gcc.dg/simulate-thread/speculative-store.c -O3 -g thread simulation test It appe

Re: [PATCH 06/22] aarch64: Add GCS instructions

2024-10-24 Thread Richard Sandiford
Yury Khrustalev writes: > From: Szabolcs Nagy > > Add instructions for the Guarded Control Stack extension. > > GCSSS1 and GCSSS2 are modelled as a single GCSSS unspec, because they > are always used together in the compiler. > > Before GCSPOPM and GCSSS2 an extra "mov xn, 0" is added to clear th

Re: [PATCH] cgraph: remove dead if stmt in build_cgraph_edges pass

2024-10-24 Thread Josef Melcr
Capital Remove The second line should be just tab indented, not tab + 2 spaces, and finished with dot. gomp_parallel rather than gomp-parallel. Sorry about the formatting issues, I didn't notice them. The if (gimple_code (stmt) == GIMPLE_OMP_TASK) case should go as well. Wonder if gcc_checking

Re: [PATCH] regenerate-opt-urls.py: fix transposed values for "vax" and "v850"

2024-10-24 Thread Maciej W. Rozycki
Hi Mark, > > I double-checked the GCC internals manual and all it says is: > > > > There files are generated from the '.opt' files and the generated > > HTML documentation by 'regenerate-opt-urls.py', and should be > > regenerated when adding new options, via manually invoking 'mak

Re: [PATCH] gcc: Remove trailing whitespace

2024-10-24 Thread Eric Gallager
On Thu, Oct 24, 2024 at 4:17 AM Jakub Jelinek wrote: > > Hi! > > I've tried to build stage3 with > -Wleading-whitespace=blanks -Wtrailing-whitespace=blank > -Wno-error=leading-whitespace=blanks -Wno-error=trailing-whitespace=blank So wait, it's "blanks" (plural) when it's leading, but "blank" (s

Re: [PATCH] aarch64: Support multiple variants including up to 3

2024-10-24 Thread Andrew Pinski
On Mon, Jun 3, 2024 at 2:23 AM Andrew Pinski (QUIC) wrote: > > > -Original Message- > > From: Andrew Pinski (QUIC) > > Sent: Saturday, May 4, 2024 2:03 AM > > To: gcc-patches@gcc.gnu.org > > Cc: Andrew Pinski (QUIC) > > Subject: [PATCH] aarch64: Support multiple variants including > > up

Re: [PATCH] toplevel: Error out if using --disable-libstdcxx with bootstrap [PR105474]

2024-10-24 Thread Andrew Pinski
On Thu, Sep 19, 2024 at 3:55 PM Andrew Pinski wrote: > > On Thu, Aug 22, 2024 at 2:45 PM Andrew Pinski > wrote: > > > > Bootstrapping and using --disable-libstdcxx will cause a build failure deep > > in compiling > > stage2 so instead error out early in the toplevel configure so it is more > >

Ping: [PATCH 1/1] PowerPC vector pair support

2024-10-24 Thread Michael Meissner
Ping this patch: https://gcc.gnu.org/pipermail/gcc-patches/2024-October/664701.html Here is the longer explanation for the patch: https://gcc.gnu.org/pipermail/gcc-patches/2024-October/664694.html -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

Ping #3: [PATCH] PR 99293: Optimize splat of a V2DF/V2DI extract with constant element

2024-10-24 Thread Michael Meissner
This patch seems to have been overlooked. https://gcc.gnu.org/pipermail/gcc-patches/2024-September/663101.html I ran a set of spec 2017 benchmarks with this patch applied and compared it to a run without the patch applied. There were no regressions, but 3 benchmarks had slight improvement in ru

[PATCH] PR target/117251: Add PowerPC XXEVAL support to speed up SHA3 calculations

2024-10-24 Thread Michael Meissner
Can I apply this patch to the trunk and to the open branches after an appropriate burn-in period? I have tested this on both little endian PowerPC server systems, and there were no regressions. The multibuff.c benchmark attached to the PR target/117251 compiled for Power10 PowerPC that implement

[PATCH][AARCH64][PR115258]Fix excess moves

2024-10-24 Thread Kugan Vivekanandarajah
Hi, Fix for PR115258 causes a performance regression in some of the TSVC kernels by adding additional mov instructions. This patch fixes this. i.e., When operands are equal, it is likely that all of them get the same register similar to: (insn 19 15 20 3 (set (reg:V2x16QI 62 v30 [117])

[PATCH 3/3] libstdc++: Define config macros for additional IEEE formats

2024-10-24 Thread Jonathan Wakely
Some targets use IEEE binary64 for both double and long double, which means we could use memmove to optimize a std::copy from a range of double to a range of long double. We currently have no config macro to detect when long double is binary64, so add that to . This also adds config macros for the

testsuite: Use -fno-ipa-icf in gcc.dg/stack-check-2.c

2024-10-24 Thread Joseph Myers
One test failing with a -std=gnu23 default that I wanted to investigate further is gcc.dg/stack-check-2.c. The failures are FAIL: gcc.dg/stack-check-2.c scan-tree-dump-not optimized "tail call" FAIL: gcc.dg/stack-check-2.c scan-tree-dump-not tailc "tail call" but it turns out the tail calls in q

Re: [PATCH] SVE intrinsics: Fold svaba with op1 all zeros to svabd.

2024-10-24 Thread Richard Sandiford
Jennifer Schmitz writes: > Similar to > https://gcc.gnu.org/pipermail/gcc-patches/2024-October/665780.html, > this patch implements folding of svaba to svabd if op1 is all zeros, > resulting in the use of UABD/SABD instructions instead of UABA/SABA. > Tests were added to check the produced assembl

Re: Use unique_ptr in more places in pretty_printer/diagnostics: 'gcc/config/gcn/mkoffload.cc' [PR116613] (was: [RFC/PATCH] Use unique_ptr in more places in pretty_printer/diagnostics [PR116613])

2024-10-24 Thread David Malcolm
On Thu, 2024-10-24 at 21:18 +0200, Thomas Schwinge wrote: > Hi! > > On 2024-10-14T19:18:46-0400, David Malcolm > wrote: > > [...] [...] > ..., and without offloading configured -- which would bring a little > bit > of extra code.  (Indeed offloading configurations aren't covered in > 'contrib/c

[PATCH] testsuite: arm: Update expected asm in armv8_2-fp16-neon-2.c

2024-10-24 Thread Torbjörn SVENSSON
Ok for trunk? -- With the changes in r15-1579-g792f97b44ff, the test_vmul_n_16x8 function does not contain any vdup.16 q* r* instruction with -mfloat-abi=softfp. The difference between r15-1578-g5185274c76c and r15-1579-g792f97b44ff with -mfloat-abi=softfp for the function is: .global tes

Re: testsuite: Use -fno-ipa-icf in gcc.dg/stack-check-2.c

2024-10-24 Thread Jakub Jelinek
On Thu, Oct 24, 2024 at 06:42:01PM +, Joseph Myers wrote: > One test failing with a -std=gnu23 default that I wanted to > investigate further is gcc.dg/stack-check-2.c. The failures are > > FAIL: gcc.dg/stack-check-2.c scan-tree-dump-not optimized "tail call" > FAIL: gcc.dg/stack-check-2.c sc

Re: [PATCH v4 3/7] OpenMP: C front-end support for dispatch + adjust_args

2024-10-24 Thread Tobias Burnus
Hi, some more comments: Paul-Antoine Arras wrote: Here is an updated patch following these comments. gcc/testsuite/ChangeLog: * gcc.dg/gomp/adjust-args-1.c: New test. * gcc.dg/gomp/dispatch-1.c: New test. The ChangeLog misses to include libgomp/testsuite/

Re: [PATCH 22/22] aarch64: Fix nonlocal goto tests incompatible with GCS

2024-10-24 Thread Richard Sandiford
Yury Khrustalev writes: > gcc/testsuite/ChangeLog: > * gcc.target/aarch64/gcs-nonlocal-3.c: New test. > * gcc.target/aarch64/sme/nonlocal_goto_4.c: Update. > * gcc.target/aarch64/sme/nonlocal_goto_5.c: Update. > * gcc.target/aarch64/sme/nonlocal_goto_6.c: Update. > --- > .

Re: [PATCH 00/22] aarch64: Add support for Guarded Control Stack extension

2024-10-24 Thread Richard Sandiford
Yury Khrustalev writes: > This patch series adds support for the Guarded Control Stack extension [1]. > > GCS marking for binaries is specified in [2]. > > Regression tested on AArch64 and no regressions have been found. > > Is this OK for trunk? > > Sources and branches: > - binutils-gdb: source

Re: [PATCH 21/22] aarch64: Fix tests incompatible with GCS

2024-10-24 Thread Richard Sandiford
Yury Khrustalev writes: > From: Matthieu Longo > > gcc/testsuite/ChangeLog: > > * g++.target/aarch64/return_address_sign_ab_exception.C: Update. > * gcc.target/aarch64/eh_return.c: Update. OK, thanks. Richard > --- > .../return_address_sign_ab_exception.C| 19 +

Re: [PATCH 20/22] aarch64: Add tests and docs for indirect_return attribute

2024-10-24 Thread Richard Sandiford
Yury Khrustalev writes: > From: Richard Ball > > This patch adds a new testcase and docs > for the indirect_return attribute. > > gcc/ChangeLog: > > * doc/extend.texi: Add AArch64 docs for indirect_return > attribute. > > gcc/testsuite/ChangeLog: > > * gcc.target/aarch64/indirec

Re: [PATCH] libstdc++: Implement P0849R8 auto(x) library changes

2024-10-24 Thread Jonathan Wakely
On Wed, 9 Oct 2024 at 14:02, Patrick Palka wrote: > > On Mon, 7 Oct 2024, Patrick Palka wrote: > > > Tested on x86_64-pc-linux-gnu, does this look OK for trunk only? > > This doesn't seem worth backporting since there should be no > > behavior change. > > > > -- >8 -- > > > > This implements the l

Re: [PATCH 7/9] Handle POLY_INT_CSTs in get_nonzero_bits

2024-10-24 Thread Richard Sandiford
Richard Biener writes: > On Fri, 18 Oct 2024, Richard Sandiford wrote: > >> This patch extends get_nonzero_bits to handle POLY_INT_CSTs, >> The easiest (but also most useful) case is that the number >> of trailing zeros in the runtime value is at least the number >> of trailing zeros in each indiv

Re: [PATCH 1/3] libstdc++: Fix typos in tests using macros for std::float128_t support

2024-10-24 Thread Jonathan Wakely
On Thu, 24 Oct 2024 at 15:45, Jonathan Wakely wrote: > > These tests check `_GLIBCXX_DOUBLE_IS_IEEE_BINARY128` but that's never > defined, it should be "LDOUBLE" not "DOUBLE". > > libstdc++-v3/ChangeLog: > > * testsuite/26_numerics/complex/ext_c++23.cc: Fix typo in macro. > * tests

Re: [PATCH 19/22] aarch64: Introduce indirect_return attribute

2024-10-24 Thread Richard Sandiford
Yury Khrustalev writes: > From: Szabolcs Nagy > > Tail calls of indirect_return functions from non-indirect_return > functions are disallowed even if BTI is disabled, since the call > site may have BTI enabled. > > Following x86, mismatching attribute on function pointers is not > a type error ev

[committed] testsuite: Require effective target pie for pr113197

2024-10-24 Thread Dimitar Dimitrov
The test for PR113197 explicitly enables PIE. But targets without PIE emit warnings when -fpie is passed (e.g. pru and avr), which causes the test to fail. Fix by adding an effective target requirement for PIE. With this patch, the test now is marked as unsupported for pru-unknown-elf. Testing

Re: [PATCH 5/9] Generalise ((X /[ex] A) +- B) * A -> X +- A * B rule

2024-10-24 Thread Richard Sandiford
Richard Biener writes: > On Fri, 18 Oct 2024, Richard Sandiford wrote: > >> match.pd had a rule to simplify ((X /[ex] A) +- B) * A -> X +- A * B >> when A and B are INTEGER_CSTs. This patch extends it to handle the >> case where the outer multiplication is by a factor of A, not just >> A itself.
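
The existing rule quoted above, ((X /[ex] A) +- B) * A -> X +- A * B, can be checked with a small C sketch. It assumes 64-bit signed values small enough not to overflow, and it relies on the caller guaranteeing that the division is exact (x % a == 0), which is what /[ex] denotes. The function names are hypothetical, and only the "+" variant is shown; the "-" variant follows the same way.

```c
#include <stdint.h>

/* Left-hand side of the rule: ((X /[ex] A) + B) * A.
   Precondition (not checked here): a != 0 and x % a == 0.  */
int64_t before_fold(int64_t x, int64_t a, int64_t b)
{
    return ((x / a) + b) * a;
}

/* Right-hand side of the rule: X + A * B.
   The division has disappeared entirely.  */
int64_t after_fold(int64_t x, int64_t a, int64_t b)
{
    return x + a * b;
}
```

Because the division is exact, (x / a) * a gives back x, and distributing the multiplication over the sum yields the right-hand side.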

[pushed: r15-4609] Add comment about pp_format to diagnostic_context::report_diagnostic

2024-10-24 Thread David Malcolm
No functional change intended. Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. Pushed to trunk as r15-4609-gfc1a001921c9c3. gcc/ChangeLog: * diagnostic.cc (diagnostic_context::report_diagnostic): Add comment about interaction of this code with pretty-print f

[PATCH] cgraph: remove dead if stmt in build_cgraph_edges pass

2024-10-24 Thread Josef Melcr
This patch removes a dead if statement checking for gomp-parallel gimple statements. This if is in the execute method of build_cgraph_edges pass, which is executed right after the omp_expand pass, which removes these gimple statements and replaces them with simple gcalls, making this if practically

Re: [PATCH 16/22] aarch64: libgcc: add GCS marking to asm

2024-10-24 Thread Richard Sandiford
Yury Khrustalev writes: > From: Szabolcs Nagy > > libgcc/ChangeLog: > > * config/aarch64/aarch64-asm.h (FEATURE_1_GCS): Define. > (GCS_FLAG): Define if GCS is enabled. > (GNU_PROPERTY): Add GCS_FLAG. This might be a daft question, but don't we also want to use the new build att

[committed v2] libstdc++: Simplify std::__throw_bad_variant_access

2024-10-24 Thread Jonathan Wakely
This removes the overload of __throw_bad_variant_access that must be called with a string literal. This avoids a potential source of undefined behaviour if that function got misused. The other overload that takes a bool parameter can be adjusted to take an integer index selecting one of the four po

Re: [PATCH] target: Fix asm codegen for vfpclasss* and vcvtph2* instructions

2024-10-24 Thread Antoni Boucher
Thanks. Did you review the new patch? Can I push it to master? On 2024-10-20 at 22:01, Hongtao Liu wrote: On Sat, Oct 19, 2024 at 2:06 AM Antoni Boucher wrote: Thanks for the review. Here's the updated patch. On 2024-10-17 at 21:50, Hongtao Liu wrote: On Fri, Oct 18, 2024 at 9:08 AM

[PATCH] libstdc++: Add P1206R7 from_range members to std::vector [PR111055]

2024-10-24 Thread Jonathan Wakely
This is another piece of P1206R7, adding new members to std::vector and std::vector. The __uninitialized_copy_a extension needs to be enhanced to support passing non-common ranges (i.e. a sentinel that is a different type from the iterator) and move-only input iterators. libstdc++-v3/ChangeLog:

Re: [PATCH 3/3] c++/modules: Support decloned cdtors

2024-10-24 Thread Jason Merrill
On 8/20/24 7:41 PM, Nathaniel Shead wrote: Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk? OK. -- >8 -- When compiling with '-fdeclone-ctor-dtor' (enabled by default with -Os), we run into issues where we don't correctly emit the underlying functions. We also need to ensure

Re: [PATCH 2/3] c++/modules: Prevent maybe_clone_decl being called multiple times [PR115007]

2024-10-24 Thread Jason Merrill
On 8/20/24 7:40 PM, Nathaniel Shead wrote: Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk? OK. -- >8 -- The ICE in the linked PR is caused because maybe_clone_decl is not prepared to be called on a declaration that has already had clones created; what happens otherwise is th

Re: [PATCH 2/2] c++/modules: Retrofit imported partial specs over existing implicit instantiations [PR113814]

2024-10-24 Thread Jason Merrill
On 10/24/24 3:25 AM, Nathaniel Shead wrote: I wasn't sure whether I should include the ambiguity checking logic from process_partial_specialization; we don't do this anywhere else in the modules handling code that I could see so I left it out for now. The relevant bit in the standard seems to b

[PATCH] Restrict :c to commutative ops as intended

2024-10-24 Thread Richard Biener
genmatch was supposed to restrict :c to verifiable commutative operations while leaving :C to the "I know what I'm doing" case. The following enforces this, cleaning up parsing and amending the commutative_op helper. There's one pattern that needs adjustment, the pattern optimizing fmax (x, NaN) o

Re: [PATCH 09/22] aarch64: Add GCS support for nonlocal stack save

2024-10-24 Thread Richard Sandiford
Yury Khrustalev writes: > From: Szabolcs Nagy > > Nonlocal stack save and restore has to also save and restore the GCS > pointer. This is used in __builtin_setjmp/longjmp and nonlocal goto. > > The GCS specific code is only emitted if GCS branch-protection is > enabled and the code always checks

Re: [PATCH 1/2] Match: support new case of unsigned scalar SAT_SUB

2024-10-24 Thread Richard Biener
On Mon, Oct 21, 2024 at 4:22 PM Akram Ahmad wrote: > > This patch adds a new case for unsigned scalar saturating subtraction > using a branch with a greater-than-or-equal condition. For example, > > X >= (X - Y) ? (X - Y) : 0 > > is transformed into SAT_SUB (X, Y) when X and Y are unsigned
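
The form quoted above, X >= (X - Y) ? (X - Y) : 0, may look odd at first, so here is a minimal C sketch comparing it with a more conventional saturating subtraction. It assumes 32-bit unsigned operands; both function names are hypothetical and do not come from the patch.

```c
#include <stdint.h>

/* The branch form from the message: X >= (X - Y) ? (X - Y) : 0.
   If Y > X the subtraction wraps and the difference exceeds X,
   so the comparison fails and the result saturates to 0.  */
uint32_t sat_sub_ge(uint32_t x, uint32_t y)
{
    uint32_t diff = x - y;   /* wraps modulo 2^32 when y > x */
    return x >= diff ? diff : 0;
}

/* Reference saturating subtraction, written the direct way.  */
uint32_t sat_sub_ref(uint32_t x, uint32_t y)
{
    return x > y ? x - y : 0;
}
```

The two functions agree for all inputs, which is the property that lets the pattern be recognised as SAT_SUB (X, Y).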

[PATCH 2/2][v2] tree-optimization/116575 - SLP masked load-lanes discovery

2024-10-24 Thread Richard Biener
The following implements masked load-lane discovery for SLP. The challenge here is that a masked load has a full-width mask with group-size number of elements when this becomes a masked load-lanes instruction one mask element gates all group members. We already have some discovery hints in place,

[PATCH 1/2][v2] Relax vect_check_scalar_mask check

2024-10-24 Thread Richard Biener
When the mask is not a constant or external def there's no need to check the scalar type, in particular with SLP and the mask being a VEC_PERM_EXPR there isn't a scalar operand ready to check (not one vect_is_simple_use will get you). We later check the vector type and reject non-mask types there.

[PATCH] tree-optimization/117277 - remove CLOBBERs before SLP code generation

2024-10-24 Thread Richard Biener
We have to remove CLOBBERs before SLP is code generated since for store-lanes we are inserting our own CLOBBERs that we want to survive. So the following refactors vect_transform_loop to remove unwanted stmts first. This resolves the gcc.target/aarch64/sve/store_lane_spill_1.c FAIL. Bootstrap and

Re: [PATCH] [lto] ipcp don't propagate where not needed

2024-10-24 Thread Jan Hubicka
> This patch disables propagation of ipcp information into lto partitions > where all instances of the node are marked to be inlined. > > Motivation: > Incremental LTO needs stable values between compilations to be > effective. This requirement fails with following example: > > void heavily_used_

Re: [PATCH v4 4/7] OpenMP: C++ front-end support for dispatch + adjust_args

2024-10-24 Thread Tobias Burnus
Hi PA; only playing around quickly and glancing at the patch; I need to have a real look at this later. Paul-Antoine Arras: This patch adds C++ support for the `dispatch` construct and the `adjust_args` clause. It relies on the c-family bits comprised in the corresponding C front end patch for

Re: [PATCH 6/6] simplify-rtx: Simplify ROTATE:HI (X:HI, 8) into BSWAP:HI (X)

2024-10-24 Thread Jeff Law
On 10/22/24 2:26 PM, Kyrylo Tkachov wrote: Hi all, With recent patch to improve detection of vector rotates at RTL level combine now tries matching a V8HImode rotate by 8 in the example in the testcase. We can teach AArch64 to emit a REV16 instruction for such a rotate but really this operat
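
The identity in the subject line, that rotating a 16-bit value by 8 is a byte swap, can be seen in a short scalar C sketch. This is only an illustration on a single uint16_t, not the RTL simplification itself (which also covers vector modes such as V8HI); the function names are hypothetical.

```c
#include <stdint.h>

/* Rotate a 16-bit value left by n bits (n is reduced modulo 16).  */
uint16_t rotate_left_16(uint16_t x, unsigned n)
{
    n &= 15u;
    /* The second shift count is masked so that n == 0 stays well defined.  */
    return (uint16_t)((x << n) | (x >> ((16u - n) & 15u)));
}

/* Swap the two bytes of a 16-bit value.  */
uint16_t byte_swap_16(uint16_t x)
{
    return (uint16_t)((x << 8) | (x >> 8));
}
```

With n equal to 8 both shift counts are 8, so the rotate computes exactly the same expression as the byte swap; for instance 0x1234 becomes 0x3412 either way.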

Re: [PATCH] gcc: Remove trailing whitespace

2024-10-24 Thread Richard Biener
On Thu, Oct 24, 2024 at 10:17 AM Jakub Jelinek wrote: > > Hi! > > I've tried to build stage3 with > -Wleading-whitespace=blanks -Wtrailing-whitespace=blank > -Wno-error=leading-whitespace=blanks -Wno-error=trailing-whitespace=blank > added to STRICT_WARN and that expectably resulted in about > 27

[COMMITTED 1/4] - Cleanup pointer_plus_operator.

2024-10-24 Thread Andrew MacLeod
When looking at 117222, I discovered the prange operators need a bit of auditing. pointer_plus should be functioning properly, but there were some pre-Prange remnants hanging around. There were wide_int and irange based routines which can no longer be called, so they are dead code and this r

[PATCH v2 1/1] C: Support Function multiversionsing in the C front end

2024-10-24 Thread alfie.richards
This patch adds support for `target_version` function multiversioning to the C frontend, specifically intended for enabling this for Aarch64 targets. The functionality and behavior matches the CPP frontend. This is likely to need to be changed later down the line for Aarch64 targets to match the

[COMMITTED 3/4] Remove pointer_and_operator.

2024-10-24 Thread Andrew MacLeod
Similarly, operator_bitwise_and::fold_range with prange arguments supersedes the pointer_and_operator class, so this patch removes the class to avoid any confusion. Bootstraps on x86_64-pc-linux-gnu with no regressions. Pushed. Andrew From afd6732de031e42fb54904d478d7c5a1663fc711 Mon Sep 17 0

Re: SVE intrinsics: Fold constant operands for svlsl.

2024-10-24 Thread Soumya AR
Hi Richard, > On 23 Oct 2024, at 5:58 PM, Richard Sandiford > wrote: > > External email: Use caution opening links or attachments > > > Soumya AR writes: >> diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc >> b/gcc/config/aarch64/aarch64-sve-builtins.cc >> index 41673745cfe..aa556859d2e

Re: [PATCH] SVE intrinsics: Fold svsra with op1 all zeros to svlsr/svasr.

2024-10-24 Thread Richard Sandiford
Jennifer Schmitz writes: >> On 22 Oct 2024, at 18:21, Richard Sandiford >> wrote: >> >> External email: Use caution opening links or attachments >> >> >> Jennifer Schmitz writes: >>> A common idiom in intrinsics loops is to have accumulator intrinsics >>> in an unrolled loop with an accumula

Re: [PATCH 1/2] c++/modules: Propagate some missing flags on type definitions

2024-10-24 Thread Jason Merrill
On 10/24/24 3:18 AM, Nathaniel Shead wrote: Bootstrapped and regtested on x86_64-pc-linux-gnu. I did a quick skim to see if I could find any more likely missing flags but I think this should be all of them now. OK. -- >8 -- Noticed while testing my fix for PR c++/113814. Not all of these a

Re: [PATCH 04/22] aarch64: Add __builtin_aarch64_chkfeat

2024-10-24 Thread Richard Sandiford
Yury Khrustalev writes: > From: Szabolcs Nagy > > Builtin for chkfeat: the input argument is used to initialize x16 then > execute chkfeat and return the updated x16. > > Note: ACLE __chkfeat(x) plans to flip the bits to be more intuitive > (xor the input to output), but for the builtin that seem

Re: [PATCH 05/22] aarch64: Add __builtin_aarch64_chkfeat tests

2024-10-24 Thread Richard Sandiford
Yury Khrustalev writes: > From: Szabolcs Nagy > > gcc/testsuite/ChangeLog: > > * gcc.target/aarch64/chkfeat-1.c: New test. > * gcc.target/aarch64/chkfeat-2.c: New test. > --- > gcc/testsuite/gcc.target/aarch64/chkfeat-1.c | 75 > gcc/testsuite/gcc.target/aarch64/

Re: [PATCH] asan: Fix up build_check_stmt gsi handling [PR117209]

2024-10-24 Thread Richard Biener
> On 24.10.2024 at 09:29, Jakub Jelinek wrote: > > Hi! > > gsi_safe_insert_before properly updates gsi_bb in gimple_stmt_iterator > in case it splits objects, but unfortunately build_check_stmt was in > some places (but not others) using a copy of the iterator rather than > the iterator pas

[PATCH] testsuite: arm: Use effective-target for memset-inline* tests

2024-10-24 Thread Torbjörn SVENSSON
Ok for trunk and releases/gcc-14? -- As these tests are set to execute and require neon hardware to do so, add the missing dg-require-effective-target arm_neon_hw. gcc/testsuite/ChangeLog: * gcc.target/arm/memset-inline-4.c: Use effective-target arm_neon_hw. * gcc.target

Re: [PATCH] SVE intrinsics: Fold svsra with op1 all zeros to svlsr/svasr.

2024-10-24 Thread Jennifer Schmitz
> On 24 Oct 2024, at 11:28, Richard Sandiford wrote: > > External email: Use caution opening links or attachments > > > Jennifer Schmitz writes: >>> On 22 Oct 2024, at 18:21, Richard Sandiford >>> wrote: >>> >>> External email: Use caution opening links or attachments >>> >>> >>> Jennif

Re: [PATCH] non-gcc: Remove trailing whitespace

2024-10-24 Thread Jonathan Wakely
On Thu, 24 Oct 2024 at 10:32, Jonathan Wakely wrote: > > > On Thu, 24 Oct 2024, 09:19 Jakub Jelinek, wrote: > >> Hi! >> >> Here is the non-gcc part of the previous patch, include/, libiberty/, >> libcpp/, libgcc/, libstdc++-v3/. >> >> Is there something that should be left out? >> > > > libstdc+

Re: [PATCH] non-gcc: Remove trailing whitespace

2024-10-24 Thread Jonathan Wakely
On Thu, 24 Oct 2024, 09:19 Jakub Jelinek, wrote: > Hi! > > Here is the non-gcc part of the previous patch, include/, libiberty/, > libcpp/, libgcc/, libstdc++-v3/. > > Is there something that should be left out? > libstdc++-v3/src/c++17/fast_float/fast_float.h comes from an external project. Pl

Re: SVE intrinsics: Fold constant operands for svlsl.

2024-10-24 Thread Richard Sandiford
Kyrylo Tkachov writes: >> On 24 Oct 2024, at 10:39, Soumya AR wrote: >> >> Hi Richard, >> >> > On 23 Oct 2024, at 5:58 PM, Richard Sandiford >> > wrote: >> > >> > External email: Use caution opening links or attachments >> > >> > >> > Soumya AR writes: >> >> diff --git a/gcc/config/aarch6

[PATCH v2 9/9] aarch64: Handle alignment when it is bigger than BIGGEST_ALIGNMENT

2024-10-24 Thread Evgeny Karpov
Wednesday, October 23, 2024 Richard Sandiford wrote: > Or, even if that does work, it isn't clear to me why patching > ASM_OUTPUT_ALIGNED_LOCAL is a complete solution to the problem. This patch reproduces the same code as it was done without declaring ASM_OUTPUT_ALIGNED_LOCAL. ASM_OUTPUT_ALIGNE

Re: SVE intrinsics: Fold constant operands for svlsl.

2024-10-24 Thread Kyrylo Tkachov
> On 24 Oct 2024, at 10:39, Soumya AR wrote: > > Hi Richard, > > > On 23 Oct 2024, at 5:58 PM, Richard Sandiford > > wrote: > > > > External email: Use caution opening links or attachments > > > > > > Soumya AR writes: > >> diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc > >> b/

Re: [Bug libstdc++/115285] [12/13/14/15 Regression] std::unordered_set can have duplicate value

2024-10-24 Thread Jonathan Wakely
I'm seeing new FAILs with -D_GLIBCXX_USE_CXX11_ABI=0 /home/test/src/gcc/libstdc++-v3/testsuite/23_containers/unordered_set/96088.cc:247: void test03(): Assertion '__gnu_test::counter::get()._M_increments == increments + 1' failed. FAIL: 23_containers/unordered_set/96088.cc -std=gnu++17 execution

[PATCH 2/2] c++/modules: Retrofit imported partial specs over existing implicit instantiations [PR113814]

2024-10-24 Thread Nathaniel Shead
I wasn't sure whether I should include the ambiguity checking logic from process_partial_specialization; we don't do this anywhere else in the modules handling code that I could see so I left it out for now. I could also rework process_partial_specialization to call the new create_mergeable_partia

Re: [PATCH] SVE intrinsics: Fold svsra with op1 all zeros to svlsr/svasr.

2024-10-24 Thread Jennifer Schmitz
> On 22 Oct 2024, at 18:21, Richard Sandiford wrote: > > External email: Use caution opening links or attachments > > > Jennifer Schmitz writes: >> A common idiom in intrinsics loops is to have accumulator intrinsics >> in an unrolled loop with an accumulator initialized to zero at the begin

Re: testsuite: Fix up pr116488.c and pr117226.c tests [PR116488]

2024-10-24 Thread Jeff Law
On 10/22/24 7:09 AM, Jakub Jelinek wrote: Hi! On Mon, Oct 21, 2024 at 01:39:52PM -0600, Jeff Law wrote: * gcc.dg/torture/pr116488.c: New test. * gcc.dg/torture/pr117226.c: New test. These two tests FAIL on powerpc64le-linux (and I assume on all other -funsigned-char default

[PATCH] c: Add __builtin_stdc_rotate_{left, right} builtins [PR117030]

2024-10-24 Thread Jakub Jelinek
Hi! I believe the new C2Y type-generic functions stdc_rotate_{left,right} have the same problems the other stdc_* type-generic functions had. If we want to support arbitrary unsigned _BitInt(N), don't want to use statement expressions (so that one can actually use them in static variable initial
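For readers unfamiliar with the operation, a rotate on a fixed-width unsigned value can be sketched in portable C as below. This is only an illustration under stated assumptions: `rotl32` and `rotr32` are hypothetical helper names covering the 32-bit case, not the builtins proposed in this patch and not the type-generic stdc_rotate_{left,right} functions the message discusses.

```c
#include <stdint.h>

/* Hypothetical helper: rotate a 32-bit value left by n bits.
   The count is reduced modulo 32, and the complementary shift is
   masked as well so that no shift by 32 (undefined in C) occurs. */
static inline uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31u;
    return (x << n) | (x >> ((32u - n) & 31u));
}

/* Hypothetical helper: rotate a 32-bit value right by n bits. */
static inline uint32_t rotr32(uint32_t x, unsigned n)
{
    n &= 31u;
    return (x >> n) | (x << ((32u - n) & 31u));
}
```

Masking both shift counts keeps the zero-count case well defined, which is one reason a plain macro built from two shifts is easy to get wrong.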

Re: [PATCH] RISC-V: Add function multiversioning support

2024-10-24 Thread Yangyu Chen
> On Oct 24, 2024, at 14:53, Kito Cheng wrote: > > ack, let you know I still remember this, but I just attending LLVM dev > and RISC-V summit this week, will review soon once I get back, and do > you mind letting me approve and commit few refactor/NFC patches first? > Sure. I’ve also been te

[PATCH 1/2] c++/modules: Propagate some missing flags on type definitions

2024-10-24 Thread Nathaniel Shead
Bootstrapped and regtested on x86_64-pc-linux-gnu. I did a quick skim to see if I could find any more likely missing flags but I think this should be all of them now. -- >8 -- Noticed while testing my fix for PR c++/113814. Not all of these are easily testable but I've tested a couple that were

[PATCH v3 10/11] RISC-V: Implement TARGET_GENERATE_VERSION_DISPATCHER_BODY and TARGET_GET_FUNCTION_VERSIONS_DISPATCHER

2024-10-24 Thread Yangyu Chen
This patch implements TARGET_GENERATE_VERSION_DISPATCHER_BODY and TARGET_GET_FUNCTION_VERSIONS_DISPATCHER for RISC-V. These hooks are used to generate and to get the dispatcher function for function multiversioning. This patch copies much of its code from commit 0cfde688e213 ("[aarch64]

Re: [PATCH] SVE intrinsics: Fold division and multiplication by -1 to neg.

2024-10-24 Thread Jennifer Schmitz
> On 23 Oct 2024, at 16:40, Richard Sandiford wrote: > > External email: Use caution opening links or attachments > > > Jennifer Schmitz writes: >> Because a neg instruction has lower latency and higher throughput than >> sdiv and mul, svdiv and svmul by -1 can be folded to svneg. For svdiv,
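The scalar identity behind this fold can be sketched in plain C as follows. This is a minimal illustration, not the patch's implementation: the helper names are hypothetical, no SVE intrinsics are used, and INT32_MIN is excluded because negating it overflows in C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: multiply every element by -1. */
static void mul_by_minus_one(const int32_t *in, int32_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        assert(in[i] != INT32_MIN);  /* negating INT32_MIN overflows in C */
        out[i] = in[i] * -1;
    }
}

/* Hypothetical helper: divide every element by -1. */
static void div_by_minus_one(const int32_t *in, int32_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        assert(in[i] != INT32_MIN);
        out[i] = in[i] / -1;
    }
}

/* Hypothetical helper: the cheaper form, a plain negation. */
static void negate(const int32_t *in, int32_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        assert(in[i] != INT32_MIN);
        out[i] = -in[i];
    }
}
```

All three loops produce the same output for every permitted input, which is the property a compiler relies on when it replaces the multiply or divide with a negate.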

[PATCH v3 08/11] RISC-V: Do not inline when callee is versioned but caller is not

2024-10-24 Thread Yangyu Chen
When the callee is versioned but the caller is not, we should not inline the callee into the caller, to prevent the default version of the callee from being inlined into a not versioned caller. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_can_inline_p): Refuse to inline when call

[PATCH v3 11/11] RISC-V: Add Multi-Versioning Test Cases

2024-10-24 Thread Yangyu Chen
This patch adds test cases for the Function Multi-Versioning (FMV) feature for RISC-V. They reuse the existing aarch64 test cases, ported to RISC-V. gcc/testsuite/ChangeLog: * g++.target/riscv/mv-symbols1.C: New test. * g++.target/riscv/mv-symbols2.C: New test.

[PATCH v3 09/11] RISC-V: Reapply target_version attribute after target attribute

2024-10-24 Thread Yangyu Chen
This patch ensures that the target_version attribute is applied after target attributes. gcc/ChangeLog: * config/riscv/riscv-target-attr.cc (riscv_option_valid_attribute_p): Reapply target_version attribute after target attribute --- gcc/config/riscv/riscv-target-attr.cc | 13

[PATCH v3 05/11] RISC-V: Implement TARGET_COMPARE_VERSION_PRIORITY and TARGET_OPTION_FUNCTION_VERSIONS

2024-10-24 Thread Yangyu Chen
This patch implements TARGET_COMPARE_VERSION_PRIORITY and TARGET_OPTION_FUNCTION_VERSIONS for RISC-V. The TARGET_COMPARE_VERSION_PRIORITY is implemented to compare the priority of two function versions based on the rules defined in the RISC-V C-API Doc PR #85: https://github.com/riscv-non-isa/ris

[PATCH v3 07/11] RISC-V: Implement TARGET_MANGLE_DECL_ASSEMBLER_NAME

2024-10-24 Thread Yangyu Chen
This patch implements the TARGET_MANGLE_DECL_ASSEMBLER_NAME for RISC-V. This is used to add function multiversioning suffixes to the assembler name. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_mangle_decl_assembler_name): New function. (TARGET_MANGLE_DECL_ASSEMBLER_NAME)

[PATCH v3 06/11] RISC-V: Implement TARGET_OPTION_VALID_VERSION_ATTRIBUTE_P

2024-10-24 Thread Yangyu Chen
This patch implements the TARGET_OPTION_VALID_VERSION_ATTRIBUTE_P for RISC-V. This hook is used to process attribute ((target_version ("..."))). Co-Developed-by: Hank Chang gcc/ChangeLog: * config/riscv/riscv-protos.h (riscv_option_valid_version_attribute_p): Declare. (r

[PATCH v3 04/11] RISC-V: Implement riscv_minimal_hwprobe_feature_bits

2024-10-24 Thread Yangyu Chen
This patch implements the riscv_minimal_hwprobe_feature_bits feature for the RISC-V target. The feature bits are defined in the previous patch [1] to provide bitmasks of the ISA extensions that are defined in the RISC-V C-API. Thus, we need a function to generate the feature bits for IFUNC resolver to dispatch

[PATCH v3 03/11] RISC-V: Implement Priority syntax parser for Function Multi-Versioning

2024-10-24 Thread Yangyu Chen
This patch adds the priority syntax parser to support the Function Multi-Versioning (FMV) feature in RISC-V. This feature allows users to specify the priority of the function version in the attribute syntax. Changes based on RISC-V C-API PR: https://github.com/riscv-non-isa/riscv-c-api-doc/pull/85

[PATCH v3 02/11] RISC-V: Split riscv_process_target_attr with const char *args argument

2024-10-24 Thread Yangyu Chen
This patch splits static bool riscv_process_target_attr (tree args, location_t loc) into two functions: - bool riscv_process_target_attr (const char *args, location_t loc) - static bool riscv_process_target_attr (tree args, location_t loc) Thus, we can call `riscv_process_target_attr` with a `con

[PATCH v3 01/11] Introduce TARGET_CLONES_ATTR_SEPARATOR for RISC-V

2024-10-24 Thread Yangyu Chen
Some architectures may use ',' in the attribute string, but it is not used as the separator for different targets. To avoid conflict, we introduce a new macro TARGET_CLONES_ATTR_SEPARATOR to separate different clones. As an example, according to RISC-V C-API Specification [1], RISC-V allows ',' in

[PATCH v3 00/11] RISC-V: Add Function Multi-Versioning support

2024-10-24 Thread Yangyu Chen
This patch series adds support for Function Multi-Versioning (FMV) to RISC-V. The FMV feature allows users to specify multiple versions of a function and select the version at runtime based on the target_clones and target_version attributes, which follow the RISC-V C-API Docs [1] and the existing p

Re: [PATCH] SVE intrinsics: Add constant folding for svindex.

2024-10-24 Thread Jennifer Schmitz
> On 23 Oct 2024, at 16:49, Richard Sandiford wrote: > > External email: Use caution opening links or attachments > > > Jennifer Schmitz writes: >> This patch folds svindex with constant arguments into a vector series. >> We implemented this in svindex_impl::fold using the function >> build
