Re: Re: [PATCH 1/2 v3] Match: Simplify unsigned scalar sat_sub(x, 1) to (x - x != 0)

2024-10-23 Thread Andrew Pinski
On Wed, Oct 23, 2024 at 11:43 PM Li Xu wrote: > > > > > > -Original Message- From: "Andrew Pinski" Sent: 2024-10-24 > > 10:23:01 (Thursday) To: "Li Xu" > > Cc: gcc-patches@gcc.gnu.org, kito.ch...@gmail.com, > > richard.guent...@gmail.com, tamar.christ...@arm.com, juzhe.zh...@rivai.ai, > > pan2...@intel.com,

Re: [PATCH] RISC-V: Add function multiversioning support

2024-10-23 Thread Kito Cheng
Ack, just to let you know I still remember this, but I am attending LLVM dev and RISC-V summit this week; I will review soon once I get back. Do you mind letting me approve and commit a few refactor/NFC patches first? On Mon, Oct 21, 2024 at 11:57 AM Yangyu Chen wrote: > > > > On Oct 21, 2024, at 10:41

Re: Re: [PATCH 1/2 v3] Match: Simplify unsigned scalar sat_sub(x, 1) to (x - x != 0)

2024-10-23 Thread Li Xu
> -Original Message- From: "Andrew Pinski" Sent: 2024-10-24 10:23:01 > (Thursday) To: "Li Xu" Cc: gcc-patches@gcc.gnu.org, > kito.ch...@gmail.com, richard.guent...@gmail.com, tamar.christ...@arm.com, > juzhe.zh...@rivai.ai, pan2...@intel.com, jeffreya...@gmail.com, > rdapp@gmail.com Subject: Re: [PATCH 1/2 v3

Re: [PATCH 6/6] simplify-rtx: Simplify ROTATE:HI (X:HI, 8) into BSWAP:HI (X)

2024-10-23 Thread Kyrylo Tkachov
> On 24 Oct 2024, at 07:36, Jeff Law wrote: > > > > On 10/22/24 2:26 PM, Kyrylo Tkachov wrote: >> Hi all, >> With recent patch to improve detection of vector rotates at RTL level >> combine now tries matching a V8HImode rotate by 8 in the example in the >> testcase. We can teach AArch64 to
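The identity behind the subject line can be checked in plain C. The sketch below is only an illustration and is not taken from the patch: the function names `rotl16`, `bswap16`, and `rotate8_equals_bswap` are hypothetical, and the code works on scalar 16-bit values rather than on RTL or vector modes.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 16-bit value left by n bits, for 0 < n < 16. */
uint16_t rotl16(uint16_t x, unsigned n)
{
    return (uint16_t)((x << n) | (x >> (16 - n)));
}

/* Swap the two bytes of a 16-bit value. */
uint16_t bswap16(uint16_t x)
{
    return (uint16_t)(((x & 0x00ffu) << 8) | ((x & 0xff00u) >> 8));
}

/* Returns 1 if rotating by 8 matches a byte swap for every 16-bit input. */
int rotate8_equals_bswap(void)
{
    for (uint32_t v = 0; v <= 0xffffu; v++)
        if (rotl16((uint16_t)v, 8) != bswap16((uint16_t)v))
            return 0;
    return 1;
}
```

Calling `assert(rotate8_equals_bswap())` from a test exercises all 65536 inputs, which is why a rotate by 8 in a 16-bit mode can be treated as a byte swap.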

Re: [PATCH] phi-opt: Add missed optimization for "(cond | (a != b)) ? b : a"

2024-10-23 Thread Jeff Law
On 10/22/24 2:32 AM, Jovan Vukic wrote: Currently, within the phiopt pass, under value_replacement, we have the option to replace the expression "(cond & (a == b)) ? a : b" with "b". It checks whether there is a BIT_AND_EXPR and verifies if one of the operands contains the expression "a == b".
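Both expressions named in this thread reduce to "b", which a brute-force check in C makes easy to see. This is a sketch for illustration only and has nothing to do with how phiopt implements the replacement; the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Existing pattern: "(cond & (a == b)) ? a : b". */
int select_and_eq(bool cond, int a, int b)
{
    return (cond & (a == b)) ? a : b;
}

/* Pattern from the subject line: "(cond | (a != b)) ? b : a". */
int select_or_ne(bool cond, int a, int b)
{
    return (cond | (a != b)) ? b : a;
}

/* Returns 1 if both patterns yield b for every tested combination. */
int both_reduce_to_b(void)
{
    for (int cond = 0; cond <= 1; cond++)
        for (int a = -3; a <= 3; a++)
            for (int b = -3; b <= 3; b++)
                if (select_and_eq(cond != 0, a, b) != b
                    || select_or_ne(cond != 0, a, b) != b)
                    return 0;
    return 1;
}
```

In the first form, "a" is selected only when a == b; in the second, "a" is selected only when the condition is false, which again requires a == b. Either way the result equals "b".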

Re: [PATCH] RISC-V: override alignment of function/jump/loop

2024-10-23 Thread Jeff Law
On 10/22/24 4:54 AM, Wang Pengcheng wrote: Just like what AArch64 has done. Signed-off-by: Wang Pengcheng gcc/ChangeLog: * config/riscv/riscv.cc (struct riscv_tune_param): Add new tune options. (riscv_override_options_internal): Override the default alignment when n

Re: [PATCH] testsuite: add testcase for fixed PR115933

2024-10-23 Thread Jeff Law
On 10/20/24 1:33 AM, Sam James wrote: gcc/testsuite/ChangeLog: PR rtl-optimization/115933 * gcc.dg/pr115933.c: New test. OK jeff

Re:[pushed] [PATCH] LoongArch: Fix soft-float builds of libffi

2024-10-23 Thread Lulu Cheng
Pushed to r15-4588. On 2024/1/27 at 3:09 PM, Yang Yujie wrote: This patch corresponds to the upstream PR: https://github.com/libffi/libffi/pull/817 libffi/ChangeLog: * src/loongarch64/ffi.c: Avoid defining floats in struct call_context if the ABI is soft-float. --- libffi/src/loongarch

Re: [PATCH] simplify-rtx: Handle `a != 0 ? -a : 0` [PR58195]

2024-10-23 Thread Jeff Law
On 10/20/24 3:18 PM, Andrew Pinski wrote: The gimple (and generic) levels have this optimization since r12-2041-g7d6979197274a662da7bdc5. It seems like a good idea to add a similar one to rtl just in case it is not caught at the gimple level. Note the loop case in csel-neg-1.c is not handled
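The snippet above is cut off before it states the simplified form. Assuming the expression in the subject is folded to a plain negation (an assumption, not confirmed by the visible text), the guard is redundant because negating zero gives zero. The function names below are hypothetical and the check covers only a small integer range.

```c
#include <assert.h>

/* The guarded form from the subject line: a != 0 ? -a : 0. */
int neg_with_guard(int a)
{
    return a != 0 ? -a : 0;
}

/* The unguarded form: -a. */
int neg_plain(int a)
{
    return -a;
}

/* Returns 1 if both forms agree for every a in [-1000, 1000]. */
int guard_is_redundant(void)
{
    for (int a = -1000; a <= 1000; a++)
        if (neg_with_guard(a) != neg_plain(a))
            return 0;
    return 1;
}
```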

Re: [PATCH 1/2 v3] Match: Simplify unsigned scalar sat_sub(x, 1) to (x - x != 0)

2024-10-23 Thread Andrew Pinski
On Wed, Oct 23, 2024 at 2:08 AM Li Xu wrote: > > From: xuli > > When the imm operand op1=1 in the unsigned scalar sat_sub form2 below, > we can simplify (x != 0 ? x + max : 0) to (x - x != 0), thereby eliminating > a branch instruction. > > Form2: > T __attribute__((noinline)) \ > sat
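The subject's "(x - x != 0)" is read here as x - (x != 0); in C the parentheses are required, since "x - x != 0" would otherwise parse as "(x - x) != 0". The sketch below compares the branching form quoted in the message with the branchless one for an 8-bit unsigned type. The function names are hypothetical and the code is not taken from the patch or its testcases.

```c
#include <assert.h>
#include <stdint.h>

/* Branching form from the message: x != 0 ? x + max : 0.
   Adding the maximum value wraps around, so it equals x - 1. */
uint8_t sat_sub1_branch(uint8_t x)
{
    return x != 0 ? (uint8_t)(x + UINT8_MAX) : 0;
}

/* Branchless form: subtract 1 only when x is nonzero. */
uint8_t sat_sub1_branchless(uint8_t x)
{
    return (uint8_t)(x - (x != 0));
}

/* Returns 1 if both forms agree for every 8-bit input. */
int sat_sub1_forms_agree(void)
{
    for (unsigned v = 0; v <= UINT8_MAX; v++)
        if (sat_sub1_branch((uint8_t)v) != sat_sub1_branchless((uint8_t)v))
            return 0;
    return 1;
}
```

The comparison `x != 0` evaluates to 0 or 1, so the subtraction saturates at zero without a conditional branch.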

RE: [PATCH 6/7] Support Intel MOVRS

2024-10-23 Thread Jiang, Haochen
> From: Uros Bizjak > Sent: Tuesday, October 22, 2024 7:32 PM > > On Tue, Oct 22, 2024 at 8:31 AM Haochen Jiang > wrote: > > > > diff --git a/gcc/builtins.cc b/gcc/builtins.cc index > > 37c7c98e5c7..52520d54b84 100644 > > --- a/gcc/builtins.cc > > +++ b/gcc/builtins.cc > > @@ -1296,8 +1296,8 @@

Re: [PATCH] c++: Further fix for get_member_function_from_ptrfunc [PR117259]

2024-10-23 Thread Jakub Jelinek
On Wed, Oct 23, 2024 at 12:27:32PM -0400, Jason Merrill wrote: > On 10/22/24 2:17 PM, Jakub Jelinek wrote: > > The following testcase shows that the previous > > get_member_function_from_ptrfunc > > changes weren't sufficient and we still have cases where > > -fsanitize=undefined with pointers to

RE: [Pushed] aarch64: Fix warning in aarch64_ptrue_reg

2024-10-23 Thread Pengxuan Zheng (QUIC)
My bad. Thanks for fixing this quickly, Andrew! Thanks, Pengxuan > > After r15-4579-g9ffcf1f193b477, we get the following warning/error while > bootstrapping on aarch64: > ``` > ../../gcc/gcc/config/aarch64/aarch64.cc: In function ‘rtx_def* > aarch64_ptrue_reg(machine_mode, unsigned int)’: > ../.

Re: [PATCH v2 3/4] aarch64: improve assembly debug comments for AEABI build attributes

2024-10-23 Thread Richard Sandiford
Matthieu Longo writes: > The previous implementation to emit AEABI build attributes did not > support string values (asciz) in aeabi_subsection, and was not > emitting values associated to tags in the assembly comments. > > This new approach provides a more user-friendly interface relying on > typ

Re: [PATCH 2/4] RISC-V: Implement TARGET_SCHED_PRESSURE_PREFER_NARROW [PR/114729]

2024-10-23 Thread Vineet Gupta
On 10/22/24 12:02, rep.dot@gmail.com wrote: >> +/* { dg-final { scan-assembler-times "%sfp" 0 } } */ > scan-assembler-not, please Fixed and also in the other patch. Thx, -Vineet

[Pushed] aarch64: Fix warning in aarch64_ptrue_reg

2024-10-23 Thread Andrew Pinski
After r15-4579-g9ffcf1f193b477, we get the following warning/error while bootstrapping on aarch64: ``` ../../gcc/gcc/config/aarch64/aarch64.cc: In function ‘rtx_def* aarch64_ptrue_reg(machine_mode, unsigned int)’: ../../gcc/gcc/config/aarch64/aarch64.cc:3643:21: error: comparison of integer expr

[PATCH] libstdc++: Simplify std::__throw_bad_variant_access

2024-10-23 Thread Jonathan Wakely
This removes the overload of __throw_bad_variant_access that must be called with a string literal. This avoids a potential source of undefined behaviour if that function got misused. The other overload that takes a bool parameter can be adjusted to take an int describing which of the four possible

[PATCH 3/2] c++: remove WILDCARD_DECL

2024-10-23 Thread Patrick Palka
Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk? -- >8 -- This tree code was added as part of the initial Concepts TS implementation to support type-constraints introducing any kind of template-parameter, not just type template-parameters, e.g. template concept C

Re: [PATCH] c++: Further fix for get_member_function_from_ptrfunc [PR117259]

2024-10-23 Thread Jason Merrill
On 10/23/24 3:07 PM, Jakub Jelinek wrote: On Wed, Oct 23, 2024 at 08:53:36PM +0200, Jakub Jelinek wrote: save_expr has been doing that at least since 1992, likely before that. Though, that 4073 /* Array ref is const/volatile if the array elements are 4074 or if the array is.

Re: [PATCH] ginclude: stdalign.h should define __xxx_is_defined macros for C++

2024-10-23 Thread Jason Merrill
On 10/23/24 2:41 PM, Jonathan Wakely wrote: On Wed, 23 Oct 2024 at 16:02, Jason Merrill wrote: On 10/23/24 10:39 AM, Jonathan Wakely wrote: The __alignas_is_defined macro has been required by C++ since C++11, and C++ Library DR 4036 clarified that __alignof_is_defined should be defined too.

Re: [PATCH] top-level: Add pull request template for Forgejo

2024-10-23 Thread Joseph Myers
On Wed, 23 Oct 2024, Jonathan Wakely wrote: > This complements the existing .github/PULL_REQUEST_TEMPLATE.md file, > which is used when somebody opens a pull request for an unofficial > mirror/fork of GCC on Github. The text in the existing file is very > specific to GitHub and doesn't make much s

Re: [PATCH] ginclude: stdalign.h should define __xxx_is_defined macros for C++

2024-10-23 Thread Jonathan Wakely
On Wed, 23 Oct 2024 at 19:44, Jonathan Wakely wrote: > > On Wed, 23 Oct 2024 at 19:41, Jonathan Wakely wrote: > > > > On Wed, 23 Oct 2024 at 16:02, Jason Merrill wrote: > > > > > > Should there also/instead be a test with ? > > > > We don't usually (or ever?) bother to test the .h versions of he

Re: [PATCH] c++: Further fix for get_member_function_from_ptrfunc [PR117259]

2024-10-23 Thread Jakub Jelinek
On Wed, Oct 23, 2024 at 08:53:36PM +0200, Jakub Jelinek wrote: > save_expr has been doing that at least since 1992, likely before that. > Though, that > 4073/* Array ref is const/volatile if the array elements are > 4074 or if the array is.. */ > 4075TREE_READONLY (rval)

Re: [PATCH] c++: Further fix for get_member_function_from_ptrfunc [PR117259]

2024-10-23 Thread Jakub Jelinek
On Wed, Oct 23, 2024 at 12:45:53PM -0400, Jason Merrill wrote: > On 10/23/24 12:33 PM, Jakub Jelinek wrote: > > On Wed, Oct 23, 2024 at 12:27:32PM -0400, Jason Merrill wrote: > > > On 10/22/24 2:17 PM, Jakub Jelinek wrote: > > > > The following testcase shows that the previous > > > > get_member_f

Re: [PATCH] ginclude: stdalign.h should define __xxx_is_defined macros for C++

2024-10-23 Thread Jonathan Wakely
On Wed, 23 Oct 2024 at 19:41, Jonathan Wakely wrote: > > On Wed, 23 Oct 2024 at 16:02, Jason Merrill wrote: > > > > Should there also/instead be a test with ? > > We don't usually (or ever?) bother to test the .h versions of headers. > For these ones that are deprecated it probably makes sense to

Re: [PATCH] ginclude: stdalign.h should define __xxx_is_defined macros for C++

2024-10-23 Thread Jonathan Wakely
On Wed, 23 Oct 2024 at 16:02, Jason Merrill wrote: > > On 10/23/24 10:39 AM, Jonathan Wakely wrote: > > The __alignas_is_defined macro has been required by C++ since C++11, and > > C++ Library DR 4036 clarified that __alignof_is_defined should be > > defined too. > > > > The macros alignas and ali

[pushed: r15-4580] jit: reset state in varasm.cc [PR117275]

2024-10-23 Thread David Malcolm
PR jit/117275 reports various jit test failures seen on powerpc64le-unknown-linux-gnu due to hitting this assertion in varasm.cc on the 2nd compilation in a process: #2 0x763e67d0 in assemble_external_libcall (fun=0x72a4b1d8) at ../../src/gcc/varasm.cc:2650 2650 gcc_asser

Re: [PATCH v2 2/2] aarch64: Add mfloat vreinterpret intrinsics

2024-10-23 Thread Richard Sandiford
Andrew Carlotti writes: > This patch splits out some of the qualifier handling from the v1 patch, and > adjusts the VREINTERPRET* macros to include support for mf8 intrinsics. > > Bootstrapped and regression tested on aarch64; ok for master? > > gcc/ChangeLog: > > * config/aarch64/aarch64-bu

testsuite: Use -std=gnu17 in gcc.dg/pr114115.c

2024-10-23 Thread Joseph Myers
One test failing with a -std=gnu23 default that I wanted to investigate further is gcc.dg/pr114115.c. Building with -std=gnu23 produces a warning: pr114115.c:18:8: warning: 'ifunc' resolver for 'foo_ifunc2' should return 'void * (*)(void)' [-Wattribute-alias=] It turns out that this warning (fr

Re: [Bug libstdc++/115285] [12/13/14/15 Regression] std::unordered_set can have duplicate value

2024-10-23 Thread Jonathan Wakely
On Wed, 23 Oct 2024 at 18:37, François Dumont wrote: > > Sorry but I'm not sure, is it also ok for the 3 backports ? Yeah, I should have said - OK for the branches too, thanks. > > On 22/10/2024 22:43, Jonathan Wakely wrote: > > On Tue, 22 Oct 2024 at 18:28, François Dumont wrote: > >> Hi > >>

RE: [PATCH v3] aarch64: Improve scalar mode popcount expansion by using SVE [PR113860]

2024-10-23 Thread Pengxuan Zheng (QUIC)
> Pengxuan Zheng writes: > > This is similar to the recent improvements to the Advanced SIMD > > popcount expansion by using SVE. We can utilize SVE to generate more > > efficient code for scalar mode popcount too. > > > > Changes since v1: > > * v2: Add a new VNx1BI mode and a new test case for V

Re: [Bug libstdc++/115285] [12/13/14/15 Regression] std::unordered_set can have duplicate value

2024-10-23 Thread François Dumont
Sorry but I'm not sure, is it also ok for the 3 backports ? On 22/10/2024 22:43, Jonathan Wakely wrote: On Tue, 22 Oct 2024 at 18:28, François Dumont wrote: Hi libstdc++: Always instantiate key_type to compute hash code [PR115285] Even if it is possible to compute a hash code fro

Re: [PATCH 3/3] aarch64: Add SVE support for simd clones [PR 96342]

2024-10-23 Thread Victor Do Nascimento
On 2/1/24 21:59, Richard Sandiford wrote: Andre Vieira writes: This patch finalizes adding support for the generation of SVE simd clones when no simdlen is provided, following the ABI rules where the widest data type determines the minimum amount of elements in a length agnostic vector. gcc/Ch

Re: [PATCH 3/9] Simplify X /[ex] Y cmp Z -> X cmp (Y * Z)

2024-10-23 Thread Andrew MacLeod
On 10/18/24 12:48, Richard Sandiford wrote: [+ranger folks, who I forgot to CC originally, sorry!] This patch applies X /[ex] Y cmp Z -> X cmp (Y * Z) when Y * Z is representable. The closest check for "is representable" on range operations seemed to be overflow_free_p. However, that is desi

Re: [PATCH] c++: Further fix for get_member_function_from_ptrfunc [PR117259]

2024-10-23 Thread Jason Merrill
On 10/23/24 12:33 PM, Jakub Jelinek wrote: On Wed, Oct 23, 2024 at 12:27:32PM -0400, Jason Merrill wrote: On 10/22/24 2:17 PM, Jakub Jelinek wrote: The following testcase shows that the previous get_member_function_from_ptrfunc changes weren't sufficient and we still have cases where -fsanitize

Re: [PATCH v2 1/2] aarch64: Add support for mfloat8x{8|16}_t types

2024-10-23 Thread Richard Sandiford
Andrew Carlotti writes: > Compared to v1, I've split changes that aren't used for the type definitions > into a separate patch. I've also added some tests, mostly along the lines > suggested by Richard S. > > Bootstrapped and regression tested on aarch64; ok for master? > > gcc/ChangeLog: > >

Re: [PATCH v2 0/4] aarch64: add minimal support of AEABI build attributes for GCS

2024-10-23 Thread Richard Sandiford
Matthieu Longo writes: > The primary focus of this patch series is to add support for build attributes > in the context of GCS (Guarded Control Stack, an Armv9.4-a extension) to the > AArch64 backend. > It addresses comments from revision 1 [2] and 2 [3], and proposes a different > approach com

Re: [PATCH v2 2/4] aarch64: add minimal support of AEABI build attributes for GCS.

2024-10-23 Thread Richard Sandiford
Matthieu Longo writes: > @@ -24803,6 +24834,16 @@ aarch64_start_file (void) > asm_fprintf (asm_out_file, "\t.arch %s\n", > aarch64_last_printed_arch_string.c_str ()); > > + /* Check whether the current assembly supports gcs build attributes, if not > + fallback to .note.gn

[COMMITTED] PR tree-optimization/117222 - Implement operator_pointer_diff::fold_range

2024-10-23 Thread Andrew MacLeod
pointer_diff depends on range_operator::fold_range to do the generic fold, which invokes wi_fold on subranges. It also in turn invokes op1_op2_relation_effect for relation effects. This worked fine when pointers were implemented with irange, but when the transition to prange was made, a new

Re: [PATCH] c++: Further fix for get_member_function_from_ptrfunc [PR117259]

2024-10-23 Thread Jason Merrill
On 10/22/24 2:17 PM, Jakub Jelinek wrote: Hi! The following testcase shows that the previous get_member_function_from_ptrfunc changes weren't sufficient and we still have cases where -fsanitize=undefined with pointers to member functions can cause wrong code being generated and related false pos

Re: [PATCH v4 2/7] OpenMP: middle-end support for dispatch + adjust_args

2024-10-23 Thread Paul-Antoine Arras
Here is the updated patch. On 23/10/2024 11:41, Tobias Burnus wrote: * The update to builtins.cc's builtin_fnspec is lacking in the changelog list. Added missing items to the ChangeLog. * And the new testcase, new gcc/testsuite/c-c++-common/gomp/dispatch-10.c, has to be put into 3/7 or lat

Re: [PATCH] SVE intrinsics: Fold division and multiplication by -1 to neg.

2024-10-23 Thread Richard Sandiford
Jennifer Schmitz writes: > Because a neg instruction has lower latency and higher throughput than > sdiv and mul, svdiv and svmul by -1 can be folded to svneg. For svdiv, > this is already implemented on the RTL level; for svmul, the > optimization was still missing. > This patch implements foldin

Re: [PATCH] Implement Fortran diagnostic buffering for non-textual formats [PR105916]

2024-10-23 Thread David Malcolm
On Wed, 2024-10-23 at 11:03 +0200, Tobias Burnus wrote: > David Malcolm wrote: > > In order to handle various awkward parsing issues, the Fortran > > frontend > > implements buffering of diagnostics, so that diagnostics reported > > to > > global_dc can be either: > > (a) immediately issued, or > >

[committed] libstdc++: Add -D_GLIBCXX_ASSERTIONS default for -O0 to API history

2024-10-23 Thread Jonathan Wakely
Excuse the huge diff, it's because it adds a new section heading so all the TOC pages and section listings change. Pushed to trunk. -- >8 -- libstdc++-v3/ChangeLog: * doc/xml/manual/evolution.xml: Document that assertions are enabled for unoptimized builds. * doc/html/*:

Re: [PATCH] c++: Implement P2662R3, Pack Indexing [PR113798]

2024-10-23 Thread Jason Merrill
On 10/23/24 10:20 AM, Patrick Palka wrote: On Tue, 22 Oct 2024, Marek Polacek wrote: Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? -- >8 -- This patch implements C++26 Pack Indexing, as described in . The issue discussing how to mangle pack indexes ha

Re: [PATCH] ginclude: stdalign.h should define __xxx_is_defined macros for C++

2024-10-23 Thread Jason Merrill
On 10/23/24 10:39 AM, Jonathan Wakely wrote: The __alignas_is_defined macro has been required by C++ since C++11, and C++ Library DR 4036 clarified that __alignof_is_defined should be defined too. The macros alignas and alignof should not be defined, as they're keywords in C++. Technically it's

Re: [PATCH 1/2] aarch64: Use standard names for saturating arithmetic

2024-10-23 Thread Richard Sandiford
Akram Ahmad writes: > On 23/10/2024 12:20, Richard Sandiford wrote: >> Thanks for doing this. The approach looks good. My main question is: >> are we sure that we want to use the Advanced SIMD instructions for >> signed saturating SI and DI arithmetic on GPRs? E.g. for addition, >> we only satu

Re: [PATCH] SVE intrinsics: Add constant folding for svindex.

2024-10-23 Thread Richard Sandiford
Jennifer Schmitz writes: > This patch folds svindex with constant arguments into a vector series. > We implemented this in svindex_impl::fold using the function build_vec_series. > For example, > svuint64_t f1 () > { > return svindex_u642 (10, 3); > } > compiled with -O2 -march=armv8.2-a+sve, is

[PATCH] ginclude: stdalign.h should define __xxx_is_defined macros for C++

2024-10-23 Thread Jonathan Wakely
The __alignas_is_defined macro has been required by C++ since C++11, and C++ Library DR 4036 clarified that __alignof_is_defined should be defined too. The macros alignas and alignof should not be defined, as they're keywords in C++. Technically it's implementation-defined whether __STDC_VERSION_

Re: [PATCH v7] Target-independent store forwarding avoidance.

2024-10-23 Thread Jakub Jelinek
On Wed, Oct 23, 2024 at 04:27:29PM +0200, Konstantinos Eleftheriou wrote: Just random ChangeLog formatting nits, not actual patch review: > gcc/ChangeLog: > > * Makefile.in: Add avoid-store-forwarding.o Missing . at the end. Though, you should really also mention what you're changing, so

Re: [PATCH v6] Target-independent store forwarding avoidance.

2024-10-23 Thread Konstantinos Eleftheriou
Hi Jeff, thanks for the feedback. Indeed, there was an issue with copying back the load register when the load is eliminated. I just sent a new version (https://gcc.gnu.org/pipermail/gcc-patches/2024-October/666230.html). On Fri, Oct 18, 2024 at 9:55 PM Jeff Law wrote: > > > > On 10/18/24 3:57 AM

Re: counted_by attribute and type compatibility

2024-10-23 Thread Qing Zhao
> On Oct 22, 2024, at 15:16, Martin Uecker wrote: > >>> >>> It doesn't really make sense when they are inconsistent. >>> Still, we could just warn and pick one of the attributes >>> when forming the composite type. >> >> If both are defined locally, such inconsistencies should be very ea

Re: [PATCH 1/2] aarch64: Use standard names for saturating arithmetic

2024-10-23 Thread Akram Ahmad
On 23/10/2024 12:20, Richard Sandiford wrote: Thanks for doing this. The approach looks good. My main question is: are we sure that we want to use the Advanced SIMD instructions for signed saturating SI and DI arithmetic on GPRs? E.g. for addition, we only saturate at the negative limit if bot

[PATCH] top-level: Add pull request template for Forgejo

2024-10-23 Thread Jonathan Wakely
This complements the existing .github/PULL_REQUEST_TEMPLATE.md file, which is used when somebody opens a pull request for an unofficial mirror/fork of GCC on Github. The text in the existing file is very specific to GitHub and doesn't make much sense to include on every PR created on forge.sourcewa

[PATCH v7] Target-independent store forwarding avoidance.

2024-10-23 Thread Konstantinos Eleftheriou
From: kelefth This pass detects cases of expensive store forwarding and tries to avoid them by reordering the stores and using suitable bit insertion sequences. For example it can transform this: strb w2, [x1, 1] ldr x0, [x1] # Expensive store forwarding to larger load. To

Re: [PATCH 2/2] tree-optimization/116575 - SLP masked load-lanes discovery

2024-10-23 Thread Richard Sandiford
Richard Biener writes: > The following implements masked load-lane discovery for SLP. The > challenge here is that a masked load has a full-width mask with > group-size number of elements when this becomes a masked load-lanes > instruction one mask element gates all group members. We already > h

Re: [PATCH] libstdc++: Replace std::__to_address in C++20 branch in

2024-10-23 Thread Jonathan Wakely
On Wed, 23 Oct 2024 at 13:18, Jonathan Wakely wrote: > > As noted by Patrick, r15-4546-g85e5b80ee2de80 should have changed the > usage of std::__to_address to std::to_address in the C++20-specific > branch that works on types satisfying std::contiguous_iterator. > > libstdc++-v3/ChangeLog: > >

Re: [PATCH] c++: Implement P2662R3, Pack Indexing [PR113798]

2024-10-23 Thread Patrick Palka
On Tue, 22 Oct 2024, Marek Polacek wrote: > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? > > -- >8 -- > This patch implements C++26 Pack Indexing, as described in > . > > The issue discussing how to mangle pack indexes has not been resolved > yet

Re: [PATCH v2 9/9] aarch64: Handle alignment when it is bigger than BIGGEST_ALIGNMENT

2024-10-23 Thread Richard Sandiford
Evgeny Karpov writes: > Tuesday, October 22, 2024 > Richard Sandiford wrote: > >>> If ASM_OUTPUT_ALIGNED_LOCAL uses an alignment less than BIGGEST_ALIGNMENT, >>> it might trigger a relocation issue. >>> >>> relocation truncated to fit: IMAGE_REL_ARM64_PAGEOFFSET_12L >> >> Sorry to press the issue

[pushed] doc: remove obsolete deprecated info

2024-10-23 Thread Jason Merrill
Tested x86_64-pc-linux-gnu, applying to trunk. -- 8< -- These formerly deprecated features eventually made it into the C++ standard. gcc/ChangeLog: * doc/extend.texi (Deprecated Features): Remove text about some no-longer-deprecated features. --- gcc/doc/extend.texi | 10 --

[PATCH v2 9/9] aarch64: Handle alignment when it is bigger than BIGGEST_ALIGNMENT

2024-10-23 Thread Evgeny Karpov
Tuesday, October 22, 2024 Richard Sandiford wrote: >> If ASM_OUTPUT_ALIGNED_LOCAL uses an alignment less than BIGGEST_ALIGNMENT, >> it might trigger a relocation issue. >> >> relocation truncated to fit: IMAGE_REL_ARM64_PAGEOFFSET_12L > > Sorry to press the issue, but: why does that happen? #def

Re: [PATCH] Add 'cobol' to Makefile.def, take 2

2024-10-23 Thread Richard Biener
On Tue, Oct 15, 2024 at 1:10 AM James K. Lowden wrote: > > Consequent to advice, I'm preparing the Cobol front-end patches as a > small number of hopefully meaningful patches covering many files. > > 1. meta files used by autotools etc. > 2. gcc/cobol/*.h > 3. gcc/cobol/*.{y,l,cc} > 4. libgcob

Re: [PATCH v3] Remove sys/user time in -ftime-report

2024-10-23 Thread Richard Biener
On Wed, Oct 9, 2024 at 6:18 PM Andi Kleen wrote: > > From: Andi Kleen > > Retrieving sys/user time in timevars is quite expensive because it > always needs a system call. Only getting the wall time is much > cheaper because operating systems have optimized paths for this. > > The sys time isn't t

[PATCH 2/2] tree-optimization/116575 - SLP masked load-lanes discovery

2024-10-23 Thread Richard Biener
The following implements masked load-lane discovery for SLP. The challenge here is that a masked load has a full-width mask with group-size number of elements when this becomes a masked load-lanes instruction one mask element gates all group members. We already have some discovery hints in place,

Re: [PATCH v3] aarch64: Improve scalar mode popcount expansion by using SVE [PR113860]

2024-10-23 Thread Richard Sandiford
Pengxuan Zheng writes: > This is similar to the recent improvements to the Advanced SIMD popcount > expansion by using SVE. We can utilize SVE to generate more efficient code for > scalar mode popcount too. > > Changes since v1: > * v2: Add a new VNx1BI mode and a new test case for V1DI. > * v3: A

Re: [PATCH 1/2] aarch64: Use standard names for saturating arithmetic

2024-10-23 Thread Richard Sandiford
Richard Sandiford writes: > Akram Ahmad writes: >> This renames the existing {s,u}q{add,sub} instructions to use the >> standard names {s,u}s{add,sub}3 which are used by IFN_SAT_ADD and >> IFN_SAT_SUB. >> >> The NEON intrinsics for saturating arithmetic and their corresponding >> builtins are cha

[PATCH 1/2] Relax vect_check_scalar_mask check

2024-10-23 Thread Richard Biener
When the mask is not a constant or external def there's no need to check the scalar type, in particular with SLP and the mask being a VEC_PERM_EXPR there isn't a scalar operand ready to check (not one vect_is_simple_use will get you). We later check the vector type and reject non-mask types there.

Re: SVE intrinsics: Fold constant operands for svlsl.

2024-10-23 Thread Richard Sandiford
Soumya AR writes: > diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc > b/gcc/config/aarch64/aarch64-sve-builtins.cc > index 41673745cfe..aa556859d2e 100644 > --- a/gcc/config/aarch64/aarch64-sve-builtins.cc > +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc > @@ -1143,11 +1143,14 @@ aarch6

[PATCH] libstdc++: Add GLIBCXX_TESTSUITE_STDS example to docs

2024-10-23 Thread Jonathan Wakely
libstdc++-v3/ChangeLog: * doc/xml/manual/test.xml: Add GLIBCXX_TESTSUITE_STDS example. * doc/html/manual/test.html: Regenerate. --- This patch is also available as a pull request in the forge: https://forge.sourceware.org/gcc/gcc-TEST/pulls/1 libstdc++-v3/doc/html/manual/test.ht

[PATCH] libstdc++: Replace std::__to_address in C++20 branch in

2024-10-23 Thread Jonathan Wakely
As noted by Patrick, r15-4546-g85e5b80ee2de80 should have changed the usage of std::__to_address to std::to_address in the C++20-specific branch that works on types satisfying std::contiguous_iterator. libstdc++-v3/ChangeLog: * include/bits/basic_string.h (assign(Iter, Iter)): Call

Re: [PATCH v3] AArch64: Fix copysign patterns

2024-10-23 Thread Richard Sandiford
Wilco Dijkstra writes: > The current copysign pattern has a mismatch in the predicates and constraints > - > operand[2] is a register_operand but also has an alternative X which allows > any > operand. Since it is a floating point operation, having an integer > alternative > makes no sense. C

Re: [PATCH 1/2] aarch64: Use standard names for saturating arithmetic

2024-10-23 Thread Richard Sandiford
Akram Ahmad writes: > This renames the existing {s,u}q{add,sub} instructions to use the > standard names {s,u}s{add,sub}3 which are used by IFN_SAT_ADD and > IFN_SAT_SUB. > > The NEON intrinsics for saturating arithmetic and their corresponding > builtins are changed to use these standard names to

[PATCH 22/22] aarch64: Fix nonlocal goto tests incompatible with GCS

2024-10-23 Thread Yury Khrustalev
gcc/testsuite/ChangeLog: * gcc.target/aarch64/gcs-nonlocal-3.c: New test. * gcc.target/aarch64/sme/nonlocal_goto_4.c: Update. * gcc.target/aarch64/sme/nonlocal_goto_5.c: Update. * gcc.target/aarch64/sme/nonlocal_goto_6.c: Update. --- .../gcc.target/aarch64/gcs-nonlo

[PATCH 19/22] aarch64: Introduce indirect_return attribute

2024-10-23 Thread Yury Khrustalev
From: Szabolcs Nagy Tail calls of indirect_return functions from non-indirect_return functions are disallowed even if BTI is disabled, since the call site may have BTI enabled. Following x86, mismatching attribute on function pointers is not a type error even though this can lead to bugs. Neede

[PATCH 14/22] aarch64: Add GCS support to the unwinder

2024-10-23 Thread Yury Khrustalev
From: Szabolcs Nagy Follows the current linux ABI that uses single signal entry token and shared shadow stack between thread and alt stack. Could be behind __ARM_FEATURE_GCS_DEFAULT ifdef (only do anything special with gcs compat codegen) but there is a runtime check anyway. Change affected test

[PATCH 18/22] aarch64: libitm: Add GCS support

2024-10-23 Thread Yury Khrustalev
From: Szabolcs Nagy Transaction begin and abort use setjmp/longjmp like operations that need to be updated for GCS compatibility. We use similar logic to libc setjmp/longjmp that support switching stack and thus switching GCS (e.g. due to longjmp out of a makecontext stack), this is kept even tho

[PATCH 21/22] aarch64: Fix tests incompatible with GCS

2024-10-23 Thread Yury Khrustalev
From: Matthieu Longo gcc/testsuite/ChangeLog: * g++.target/aarch64/return_address_sign_ab_exception.C: Update. * gcc.target/aarch64/eh_return.c: Update. --- .../return_address_sign_ab_exception.C| 19 +-- gcc/testsuite/gcc.target/aarch64/eh_return.c | 13

[PATCH 12/22] aarch64: Add test for GCS ACLE defs

2024-10-23 Thread Yury Khrustalev
From: Szabolcs Nagy gcc/testsuite/ChangeLog: * gcc.target/aarch64/pragma_cpp_predefs_1.c: GCS test. --- .../gcc.target/aarch64/pragma_cpp_predefs_1.c | 30 +++ 1 file changed, 30 insertions(+) diff --git a/gcc/testsuite/gcc.target/aarch64/pragma_cpp_predefs_1.c b/gcc/t

[PATCH 16/22] aarch64: libgcc: add GCS marking to asm

2024-10-23 Thread Yury Khrustalev
From: Szabolcs Nagy libgcc/ChangeLog: * config/aarch64/aarch64-asm.h (FEATURE_1_GCS): Define. (GCS_FLAG): Define if GCS is enabled. (GNU_PROPERTY): Add GCS_FLAG. --- libgcc/config/aarch64/aarch64-asm.h | 16 ++-- 1 file changed, 14 insertions(+), 2 deletions(

[PATCH 07/22] aarch64: Add GCS builtins

2024-10-23 Thread Yury Khrustalev
From: Szabolcs Nagy Add new builtins for GCS: void *__builtin_aarch64_gcspr (void) uint64_t __builtin_aarch64_gcspopm (void) void *__builtin_aarch64_gcsss (void *) The builtins are always enabled, but should be used behind runtime checks in case the target does not support GCS. They are t

[PATCH 17/22] aarch64: libatomic: add GCS marking to asm

2024-10-23 Thread Yury Khrustalev
From: Szabolcs Nagy libatomic/ChangeLog: * config/linux/aarch64/atomic_16.S (FEATURE_1_GCS): Define. (GCS_FLAG): Define if GCS is enabled. (GNU_PROPERTY): Add GCS_FLAG. --- libatomic/config/linux/aarch64/atomic_16.S | 11 +-- 1 file changed, 9 insertions(+), 2 de

[PATCH 10/22] aarch64: Add non-local goto and jump tests for GCS

2024-10-23 Thread Yury Khrustalev
From: Szabolcs Nagy These are scan asm tests only, relying on existing execution tests for runtime coverage. gcc/testsuite/ChangeLog: * gcc.target/aarch64/gcs-nonlocal-1.c: New test. * gcc.target/aarch64/gcs-nonlocal-2.c: New test. --- .../gcc.target/aarch64/gcs-nonlocal-1.c

[PATCH 20/22] aarch64: Add tests and docs for indirect_return attribute

2024-10-23 Thread Yury Khrustalev
From: Richard Ball This patch adds a new testcase and docs for the indirect_return attribute. gcc/ChangeLog: * doc/extend.texi: Add AArch64 docs for indirect_return attribute. gcc/testsuite/ChangeLog: * gcc.target/aarch64/indirect_return.c: New test. Co-authore

[PATCH 13/22] aarch64: Add target pragma tests for gcs

2024-10-23 Thread Yury Khrustalev
From: Szabolcs Nagy gcc/testsuite/ChangeLog: * gcc.target/aarch64/pragma_cpp_predefs_4.c: Add gcs specific tests. --- .../gcc.target/aarch64/pragma_cpp_predefs_4.c | 35 +++ 1 file changed, 35 insertions(+) diff --git a/gcc/testsuite/gcc.target/aarch64/pragma_cp

[PATCH 09/22] aarch64: Add GCS support for nonlocal stack save

2024-10-23 Thread Yury Khrustalev
From: Szabolcs Nagy Nonlocal stack save and restore has to also save and restore the GCS pointer. This is used in __builtin_setjmp/longjmp and nonlocal goto. The GCS specific code is only emitted if GCS branch-protection is enabled and the code always checks at runtime if GCS is enabled. The ne

[PATCH 15/22] aarch64: Emit GNU property NOTE for GCS

2024-10-23 Thread Yury Khrustalev
From: Szabolcs Nagy gcc/ChangeLog: * config/aarch64/aarch64.cc (GNU_PROPERTY_AARCH64_FEATURE_1_GCS): Define. (aarch64_file_end_indicate_exec_stack): Set GCS property bit. --- gcc/config/aarch64/aarch64.cc | 5 + 1 file changed, 5 insertions(+) diff --git a/gcc/confi

[PATCH 04/22] aarch64: Add __builtin_aarch64_chkfeat

2024-10-23 Thread Yury Khrustalev
From: Szabolcs Nagy Builtin for chkfeat: the input argument is used to initialize x16 then execute chkfeat and return the updated x16. Note: ACLE __chkfeat(x) plans to flip the bits to be more intuitive (xor the input to output), but for the builtin that seems unnecessary complication. gcc/Chan
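The note about the builtin versus the planned ACLE `__chkfeat(x)` can be sketched as a portable model. This is not the patch's code. It assumes that chkfeat clears each requested bit in x16 whose feature is enabled, and every name below (`model_enabled_features`, `model_builtin_chkfeat`, `model_acle_chkfeat`) is hypothetical and exists only for illustration.

```c
#include <stdint.h>

/* Hypothetical stand-in for hardware state: bit i set means feature i
   is enabled.  Here only feature bit 0 is treated as enabled.  */
static uint64_t model_enabled_features = 0x1;

/* Model of the builtin as described in the message: the argument
   initializes x16, chkfeat runs, and the updated x16 is returned.
   Assumption: chkfeat clears the bits of features that are enabled.  */
uint64_t model_builtin_chkfeat (uint64_t x16)
{
  return x16 & ~model_enabled_features;
}

/* Model of the planned ACLE form: xor the input with the output so that
   a set bit in the result means "enabled".  */
uint64_t model_acle_chkfeat (uint64_t features)
{
  return features ^ model_builtin_chkfeat (features);
}
```

Under this model a caller of the builtin tests for a zero bit, while a caller of the ACLE form tests for a set bit.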

[PATCH 11/22] aarch64: Add ACLE feature macros for GCS

2024-10-23 Thread Yury Khrustalev
From: Szabolcs Nagy gcc/ChangeLog: * config/aarch64/aarch64-c.cc (aarch64_update_cpp_builtins): Define macros for GCS. --- gcc/config/aarch64/aarch64-c.cc | 3 +++ 1 file changed, 3 insertions(+) diff --git a/gcc/config/aarch64/aarch64-c.cc b/gcc/config/aarch64/aarch64-c.cc ind

[PATCH 06/22] aarch64: Add GCS instructions

2024-10-23 Thread Yury Khrustalev
From: Szabolcs Nagy Add instructions for the Guarded Control Stack extension. GCSSS1 and GCSSS2 are modelled as a single GCSSS unspec, because they are always used together in the compiler. Before GCSPOPM and GCSSS2 an extra "mov xn, 0" is added to clear the output register, this is needed to g

[PATCH 08/22] aarch64: Add __builtin_aarch64_gcs* tests

2024-10-23 Thread Yury Khrustalev
From: Szabolcs Nagy gcc/testsuite/ChangeLog: * gcc.target/aarch64/gcspopm-1.c: New test. * gcc.target/aarch64/gcspr-1.c: New test. * gcc.target/aarch64/gcsss-1.c: New test. --- gcc/testsuite/gcc.target/aarch64/gcspopm-1.c | 69 gcc/testsuite/gcc.targ

[PATCH 03/22] aarch64: Add support for chkfeat insn

2024-10-23 Thread Yury Khrustalev
From: Szabolcs Nagy This is a hint space instruction to check for enabled HW features and update the x16 register accordingly. Use unspec_volatile to prevent reordering it around calls since calls can enable or disable HW features. gcc/ChangeLog: * config/aarch64/aarch64.md (aarch64_ch

[PATCH 05/22] aarch64: Add __builtin_aarch64_chkfeat tests

2024-10-23 Thread Yury Khrustalev
From: Szabolcs Nagy gcc/testsuite/ChangeLog: * gcc.target/aarch64/chkfeat-1.c: New test. * gcc.target/aarch64/chkfeat-2.c: New test. --- gcc/testsuite/gcc.target/aarch64/chkfeat-1.c | 75 gcc/testsuite/gcc.target/aarch64/chkfeat-2.c | 15 2 files change

[PATCH 02/22] aarch64: Add branch-protection target pragma tests

2024-10-23 Thread Yury Khrustalev
From: Szabolcs Nagy gcc/testsuite/ChangeLog: * gcc.target/aarch64/pragma_cpp_predefs_4.c: Add branch-protection tests. --- .../gcc.target/aarch64/pragma_cpp_predefs_4.c | 50 +++ 1 file changed, 50 insertions(+) diff --git a/gcc/testsuite/gcc.target/aarch64/prag

[PATCH 01/22] aarch64: Add -mbranch-protection=gcs option

2024-10-23 Thread Yury Khrustalev
From: Szabolcs Nagy This enables Guarded Control Stack (GCS) compatible code generation. The "standard" branch-protection type enables it, and the default depends on the compiler default. gcc/ChangeLog: * config/aarch64/aarch64-protos.h (aarch_gcs_enabled): Declare. * config/aa

[PATCH 00/22] aarch64: Add support for Guarded Control Stack extension

2024-10-23 Thread Yury Khrustalev
This patch series adds support for the Guarded Control Stack extension [1]. GCS marking for binaries is specified in [2]. Regression tested on AArch64 and no regressions have been found. Is this OK for trunk? Sources and branches: - binutils-gdb: sourceware.org/git/binutils-gdb.git users/ARM/g

[PATCH v2 2/2] aarch64: Add mfloat vreinterpret intrinsics

2024-10-23 Thread Andrew Carlotti
This patch splits out some of the qualifier handling from the v1 patch, and adjusts the VREINTERPRET* macros to include support for mf8 intrinsics. Bootstrapped and regression tested on aarch64; ok for master? gcc/ChangeLog: * config/aarch64/aarch64-builtins.cc (MODE_d_mf8): New.

[PATCH 4/5] RISC-V: Implement the MASK_LEN_STRIDED_LOAD{STORE}

2024-10-23 Thread pan2 . li
From: Pan Li This patch would like to implement the MASK_LEN_STRIDED_LOAD{STORE} in the RISC-V backend by leveraging the vector strided load/store insn. For example: void foo (int * __restrict a, int * __restrict b, int stride, int n) { for (int i = 0; i < n; i++) a[i*stride] = b[i*stri

[PATCH 5/5] RISC-V: Add testcases for form 1 of MASK_LEN_STRIDED_LOAD{STORE}

2024-10-23 Thread pan2 . li
From: Pan Li Form 1: void __attribute__((noinline))\ vec_strided_load_store_##T##_form_1 (T *restrict out, T *restrict in, \ long stride, size_t size)\ {

[PATCH 2/5] Vect: Introduce MASK_LEN_STRIDED_LOAD{STORE} to loop vectorizer

2024-10-23 Thread pan2 . li
From: Pan Li This patch would like to allow generation of MASK_LEN_STRIDED_LOAD{STORE} IR for invariant stride memory access. For example as below void foo (int * __restrict a, int * __restrict b, int stride, int n) { for (int i = 0; i < n; i++) a[i*stride] = b[i*stride] + 100; } Bef
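The invariant-stride access pattern in the example can be tried as a standalone file. The function `foo` is taken from the message. The helper `strided_gather_model` is hypothetical: it is only a scalar picture of a masked, length-limited strided load, with the stride counted in elements as an assumption, and it is not the definition of the internal function.

```c
#include <stddef.h>

/* Loop from the patch description: consecutive iterations touch
   elements that are a fixed, loop-invariant stride apart.  */
void foo (int * __restrict a, int * __restrict b, int stride, int n)
{
  for (int i = 0; i < n; i++)
    a[i*stride] = b[i*stride] + 100;
}

/* Hypothetical scalar model of a masked, length-limited strided load:
   lane i reads base[i * stride] when i < len and mask[i] is nonzero;
   other lanes of dst are left untouched.  */
void strided_gather_model (int *dst, const int *base, ptrdiff_t stride,
                           const unsigned char *mask, size_t len)
{
  for (size_t i = 0; i < len; i++)
    if (mask[i])
      dst[i] = base[(ptrdiff_t) i * stride];
}
```

Calling `foo (a, b, 2, 3)` on six-element arrays updates `a[0]`, `a[2]` and `a[4]` only, which is the pattern the vectorizer has to recognize.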

[PATCH 3/5] RISC-V: Adjust the gather-scatter testcases due to middle-end change

2024-10-23 Thread pan2 . li
From: Pan Li After we have MASK_LEN_STRIDED_LOAD{STORE} in the middle-end, the strided cases need to be adjusted for the IR check. The following test suites passed for this patch: * The full riscv regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/gather-scatter/strided
