Re: [PATCH] fold: Remove (rrotate (rrotate A CST) CST) folding [PR117492]

2024-11-08 Thread Richard Biener
> On 09.11.2024 at 05:00, Andrew Pinski wrote: > > This removes a (broken) simplification from fold which is already handled > in match. > The reason why it was broken is because of the use of wi::to_wide on the RHS > of the > rotate which could be 2 different types even though the LHS wa

Re: [PATCH] VN: Don't recurse on for the same value of `a | b` [PR117496]

2024-11-08 Thread Richard Biener
> On 09.11.2024 at 02:36, Andrew Pinski wrote: > > After adding vn_valueize to handle the `a | b ==/!= 0` case > of insert_predicates_for_cond, it would go into an infinite loop > as the Value number for either a or b could be the same as what it > is for the whole expression. This avoid

[PATCH] fold: Remove (rrotate (rrotate A CST) CST) folding [PR117492]

2024-11-08 Thread Andrew Pinski
This removes a (broken) simplification from fold which is already handled in match. The reason why it was broken is because of the use of wi::to_wide on the RHS of the rotate which could be 2 different types even though the LHS was the same type. Since it is already handled in match (by the patt
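
As a rough illustration of the identity behind this kind of folding (not GCC's fold or match.pd code), two right rotations by constants can be combined into one rotation by the sum of the counts modulo the bit width. The names rotr32 and fold_rotate_counts are hypothetical, and the sketch assumes a fixed 32-bit operand; it brings both counts to one common unsigned type before adding them, which is the step the message above describes as unsafe when the two counts have different types.

```cpp
#include <cassert>
#include <cstdint>

// Rotate a 32-bit value right by n bits; n is reduced modulo the width.
constexpr std::uint32_t rotr32(std::uint32_t x, unsigned n) {
    n %= 32u;
    return n == 0 ? x : (x >> n) | (x << (32u - n));
}

// Hypothetical helper: combine two constant rotate counts into one.
// Both counts are first reduced in the same unsigned type, so the
// addition never mixes operands of different precision.
constexpr unsigned fold_rotate_counts(unsigned c1, unsigned c2) {
    return (c1 % 32u + c2 % 32u) % 32u;
}
```

With these definitions, rotr32(rotr32(a, c1), c2) and rotr32(a, fold_rotate_counts(c1, c2)) give the same value for any a, c1 and c2.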

RE: [EXTERNAL] [PATCH] Update gcc-auto-profile / gen_autofdo_event.py

2024-11-08 Thread Eugene Rozenfeld
The patch looks good to me. Thank you for fixing this, Andi. -Original Message- From: Andi Kleen Sent: Thursday, October 31, 2024 4:37 PM To: gcc-patches@gcc.gnu.org Cc: Eugene Rozenfeld ; Andi Kleen Subject: [EXTERNAL] [PATCH] Update gcc-auto-profile / gen_autofdo_event.py From: Andi

RE: [EXTERNAL] [PATCH] Enable autofdo bootstrap for lto/fortran

2024-11-08 Thread Eugene Rozenfeld
This line in gcc/fortran/Make-lang.in looks wrong (copy/paste?): +f95.fda: create_fdas_for_lto1 There are no invocations of $(CREATE_GCOV in gcc/fortran/Make-lang.in so this is incomplete. -Original Message- From: Andi Kleen Sent: Thursday, October 31, 2024 4:19 PM To: gcc-patches@gcc

RE: [EXTERNAL] Re: [PATCH] PR117350: Keep assembler name for abstract decls for autofdo

2024-11-08 Thread Eugene Rozenfeld
The patch looks good to me. -Original Message- From: Richard Biener Sent: Wednesday, November 6, 2024 12:01 AM To: Andi Kleen Cc: Jason Merrill ; Andi Kleen ; gcc-patches@gcc.gnu.org; Eugene Rozenfeld ; pins...@gmail.com; Andi Kleen Subject: [EXTERNAL] Re: [PATCH] PR117350: Keep asse

[PATCH] VN: Don't recurse on for the same value of `a | b` [PR117496]

2024-11-08 Thread Andrew Pinski
After adding vn_valueize to handle the `a | b ==/!= 0` case of insert_predicates_for_cond, it would go into an infinite loop as the Value number for either a or b could be the same as what it is for the whole expression. This avoids that recursion so there is no infinite loop here. Bootstrappe

Re: [PATCH V2 1/11] Add rs6000 architecture masks.

2024-11-08 Thread Segher Boessenkool
On Fri, Nov 08, 2024 at 02:28:11PM -0600, Peter Bergner wrote: > On 11/8/24 1:44 PM, Michael Meissner wrote: > > diff --git a/gcc/config/rs6000/rs6000-arch.def > > b/gcc/config/rs6000/rs6000-arch.def > > new file mode 100644 > > index 000..e5b6e958133 > > --- /dev/null > > +++ b/gcc/config

[committed] hppa: Don't allow mode size 32 in hard registers

2024-11-08 Thread John David Anglin
Tested on hppa64-hp-hpux11.11. Committed to trunk. Dave --- hppa: Don't allow mode size 32 in hard registers 2024-11-08 John David Anglin gcc/ChangeLog: PR target/117238 * config/pa/pa64-regs.h (PA_HARD_REGNO_MODE_OK): Don't allow mode size 32. diff --git a/gcc/con

[committed] hppa: Don't use '%' operator in base14_operand

2024-11-08 Thread John David Anglin
Tested on hppa-unknown-linux-gnu and hppa64-hp-hpux11.11. Committed to trunk and gcc-14. Dave --- hppa: Don't use '%' operator in base14_operand Division is slow on hppa and mode sizes are powers of 2. So, we can use '&' operator to check displacement alignment. 2024-11-08 John David Anglin
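
The alignment check described above can be sketched as follows. This is only an illustration of the arithmetic, not the code in base14_operand; the function names are hypothetical, and the '&' form is valid only under the stated assumption that size is a power of 2.

```cpp
#include <cassert>

// Alignment test using the remainder operator (needs a division).
constexpr bool aligned_mod(long disp, long size) {
    return disp % size == 0;
}

// Equivalent test for power-of-2 sizes: the low bits must all be zero.
constexpr bool aligned_and(long disp, long size) {
    return (disp & (size - 1)) == 0;
}

// Compare both forms over a small range of displacements,
// including negative ones (two's complement assumed).
constexpr bool forms_agree(long size) {
    for (long disp = -64; disp <= 64; ++disp)
        if (aligned_mod(disp, size) != aligned_and(disp, size))
            return false;
    return true;
}
```

For sizes that are not powers of 2 the two forms differ, so the '&' version depends on the property of mode sizes stated in the commit message.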

Re: [PATCH v3] C: Support Function multiversionsing in the C front end

2024-11-08 Thread Joseph Myers
I should also add: the ACLE specification for the details of how function multiversioning is supposed to work in terms of interactions of declarations for different versions in the same or different scopes and what happens regarding forming composite types seems rather vague. So maybe it would

[committed] hppa: Don't allow large modes in hard registers

2024-11-08 Thread John David Anglin
Tested on hppa-unknown-linux-gnu. Committed to trunk. Dave --- hppa: Don't allow large modes in hard registers LRA has problems handling spills for OI and TI modes. There are issues with SUBREG support as well. This change fixes gcc.c-torture/compile/pr92618.c with LRA. 2024-11-08 John Davi

Re: [PATCH v3] c: Implement C2y N3356, if declarations [PR117019]

2024-11-08 Thread Joseph Myers
On Fri, 8 Nov 2024, Marek Polacek wrote: > OK, I've reworded the comment to > > /* The call above already performed convert_lvalue_to_rvalue, but > if it parsed an expression, read_p was false. Make sure we mark > the expression as read. */ > > though it's questionable

Re: [PATCH v3] C: Support Function multiversionsing in the C front end

2024-11-08 Thread Joseph Myers
On Mon, 4 Nov 2024, alfie.richa...@arm.com wrote: > /* Subroutine of duplicate_decls. Compare NEWDECL to OLDDECL. > Returns true if the caller should proceed to merge the two, false > if OLDDECL should simply be discarded. As a side effect, issues > @@ -3365,11 +3382,53 @@ pushdecl (tre

[committed] hppa: Fix handling of secondary reloads involving a SUBREG

2024-11-08 Thread John David Anglin
Tested on hppa-unknown-linux-gnu and hppa64-hp-hpux11.11. Committed to trunk and gcc-14. Dave --- hppa: Fix handling of secondary reloads involving a SUBREG This is fairly subtle. When handling spills for SUBREG arguments in pa_emit_move_sequence, alter_subreg may be called. It in turn calls

[PATCH V2 7/11] Change TARGET_POPCNTD to TARGET_POWER7

2024-11-08 Thread Michael Meissner
As part of the architecture flags patches, this patch changes the use of TARGET_POPCNTD to TARGET_POWER7. The POPCNTD instruction was added in power7 (ISA 2.06). I have built both big endian and little endian bootstrap compilers and there were no regressions. In addition, I constructed a test ca

[PATCH v3] c: Implement C2y N3356, if declarations [PR117019]

2024-11-08 Thread Marek Polacek
On Fri, Nov 08, 2024 at 08:43:39PM +, Joseph Myers wrote: > On Thu, 7 Nov 2024, Marek Polacek wrote: > > > @@ -8355,7 +8492,9 @@ c_parser_switch_statement (c_parser *parser, bool > > *if_p, tree before_labels) > >if (c_parser_next_token_is (parser, CPP_OPEN_PAREN) > > && c_token

Re: [PATCH] Add COBOL to gcc

2024-11-08 Thread James K. Lowden
On Fri, 8 Nov 2024 13:52:55 +0100 Jakub Jelinek wrote: > Rather than a diff from /dev/null, > > it's a blob with the exact file contents. I hope it is correct in > > this form. > > That is just how the web git viewer presents new file commits. > On gcc-patches those should be posted as normal p

Re: [PATCH] AArch64: Cleanup fusion defines

2024-11-08 Thread Andrew Pinski
On Fri, Nov 8, 2024 at 8:56 AM Wilco Dijkstra wrote: > > > Cleanup the fusion defines by introducing AARCH64_FUSE_BASE as a common base > level of fusion supported by almost all cores. Add AARCH64_FUSE_MOVK as a > shortcut for all MOVK fusion. In most cases there is no change. It enables > AARC

Re: [PATCH v2] c: Implement C2y N3356, if declarations [PR117019]

2024-11-08 Thread Joseph Myers
On Thu, 7 Nov 2024, Marek Polacek wrote: > @@ -8355,7 +8492,9 @@ c_parser_switch_statement (c_parser *parser, bool > *if_p, tree before_labels) >if (c_parser_next_token_is (parser, CPP_OPEN_PAREN) > && c_token_starts_typename (c_parser_peek_2nd_token (parser))) > explicit_ca

Re: [PATCH] Add COBOL to gcc

2024-11-08 Thread James K. Lowden
On Fri, 8 Nov 2024 13:50:45 +0100 Jakub Jelinek wrote: > > * gcc-changelog/git_commit.py (default_changelog_locations): > > New entry for gcc/cobol. New entry for libgcobol. > > Dunno if your mailer ate the tabs at the start of the above 2 lines. > That is required so that it can be committed.

Re: [PATCH V2 1/11] Add rs6000 architecture masks.

2024-11-08 Thread Peter Bergner
On 11/8/24 1:44 PM, Michael Meissner wrote: > diff --git a/gcc/config/rs6000/rs6000-arch.def > b/gcc/config/rs6000/rs6000-arch.def > new file mode 100644 > index 000..e5b6e958133 > --- /dev/null > +++ b/gcc/config/rs6000/rs6000-arch.def > @@ -0,0 +1,48 @@ > +/* IBM RS/6000 CPU architecture

Re: [PATCH 0/11] Separate PowerPC architecture bits from ISA flags that use command line options

2024-11-08 Thread Michael Meissner
I have posted a new version of the patches at: https://gcc.gnu.org/pipermail/gcc-patches/2024-November/668177.html -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

[PATCH V2 11/11] Add -mcpu=future tuning support.

2024-11-08 Thread Michael Meissner
This patch makes -mtune=future use the same tuning decision as -mtune=power11. 2024-11-06 Michael Meissner gcc/ * config/rs6000/power10.md (all reservations): Add future as an alternative to power10 and power11. --- gcc/config/rs6000/power10.md | 144 +-

[PATCH V2 10/11] Add support for -mcpu=future

2024-11-08 Thread Michael Meissner
This patch adds the support that can be used in developing GCC support for future PowerPC processors. 2024-11-06 Michael Meissner * config.gcc (powerpc*-*-*): Add support for --with-cpu=future. * config/rs6000/aix71.h (ASM_CPU_SPEC): Add support for -mcpu=future. * conf

[PATCH V2 9/11] Update tests to work with architecture flags changes.

2024-11-08 Thread Michael Meissner
Two tests used -mvsx to raise the processor level to at least power7. These tests were rewritten to add cpu=power7 support. I have built both big endian and little endian bootstrap compilers and there were no regressions. In addition, I constructed a test case that used every architecture define

[PATCH V2 8/11] Change TARGET_MODULO to TARGET_POWER9

2024-11-08 Thread Michael Meissner
As part of the architecture flags patches, this patch changes the use of TARGET_MODULO to TARGET_POWER9. The modulo instructions were added in power9 (ISA 3.0). Note, I did not change the uses of TARGET_MODULO where it was explicitly generating different code if the machine had a modulo instruct

[PATCH V2 5/11] Change TARGET_FPRND to TARGET_POWER5X

2024-11-08 Thread Michael Meissner
As part of the architecture flags patches, this patch changes the use of TARGET_FPRND to TARGET_POWER5X. The FPRND instruction was added in power5+. I have built both big endian and little endian bootstrap compilers and there were no regressions. In addition, I constructed a test case that used

[PATCH V2 4/11] Change TARGET_POPCNTB to TARGET_POWER5

2024-11-08 Thread Michael Meissner
As part of the architecture flags patches, this patch changes the use of TARGET_POPCNTB to TARGET_POWER5. The POPCNTB instruction was added in ISA 2.02 (power5). I have built both big endian and little endian bootstrap compilers and there were no regressions. In addition, I constructed a test ca

[PATCH V2 3/11] Do not allow -mvsx to boost processor to power7.

2024-11-08 Thread Michael Meissner
This patch restructures the code so that -mvsx for example will not silently convert the processor to power7. The user must now use -mcpu=power7 or higher. This means if the user does -mvsx and the default processor does not have VSX support, it will be an error. I have built both big endian and

[PATCH V2 2/11] Use architecture flags for defining _ARCH_PWR macros.

2024-11-08 Thread Michael Meissner
For the newer architectures, this patch changes GCC to define the _ARCH_PWR macros using the new architecture flags instead of relying on isa options like -mpower10. The -mpower8-internal, -mpower10, and -mpower11 options were removed. The -mpower11 option was removed completely, since it was jus

[PATCH V2 1/11] Add rs6000 architecture masks.

2024-11-08 Thread Michael Meissner
This patch begins the journey to move architecture bits that are not user ISA options from rs6000_isa_flags to a new target variable rs6000_arch_flags. The intention is to remove switches that are currently isa options, but the user should not be using this particular option. For example, we want u

Re: [PATCH] testsuite: arm: Require 16-bit float support

2024-11-08 Thread Torbjorn SVENSSON
On 2024-11-08 12:24, Richard Earnshaw (lists) wrote: On 05/11/2024 20:06, Torbjörn SVENSSON wrote: Based on how these functions are used in test cases, I think it's correct to require 16-bit float support in both functions. Without this change, the checks pass for armv8-m and armv8.1-m, bu

[PATCH V2 0/11] Separate PowerPC archiecture bits from ISA flags that use command line options

2024-11-08 Thread Michael Meissner
These patches are a clean up in the PowerPC port to move architecture bits that are not user ISA options from rs6000_isa_flags to a new target variable rs6000_arch_flags. The intention is to remove switches that are currently isa options, but the user should not be using this particular option. For

[PATCH] testsuite: arm: Use effective-target for unsigned-extend-1.c

2024-11-08 Thread Torbjörn SVENSSON
Ok for trunk and releases/gcc-14? -- A long time ago, this test forced -march=armv6. With -marm, the generated assembler is: foo: sub r0, r0, #48 cmp r0, #9 movhi r0, #0 movls r0, #1 bx lr With -mthumb, the generated assembler is: foo:

Re: [PATCH] Add push/pop_function_decl

2024-11-08 Thread Richard Sandiford
Jakub Jelinek writes: > On Fri, Nov 08, 2024 at 05:44:48PM +, Richard Sandiford wrote: >> It's for https://gcc.gnu.org/pipermail/gcc-patches/2024-November/667499.html >> , >> which needs to switch to the simd clone's chosen target (SVE) in order >> to construct the correct types. Currently t

Re: [PATCH] testsuite: arm: Check that a far jump is used in thumb1-far-jump-2.c

2024-11-08 Thread Christophe Lyon
On Fri, 8 Nov 2024 at 19:20, Torbjörn SVENSSON wrote: > > Ok for trunk? > > -- > > With the changes in r15-1579-g792f97b44ff, the code used as "padding" in > the test case is optimized away. Prevent this optimization by forcing a > read of the volatile memory. > Also, validate that there is a far j

[PATCH] testsuite: arm: Check that a far jump is used in thumb1-far-jump-2.c

2024-11-08 Thread Torbjörn SVENSSON
Ok for trunk? -- With the changes in r15-1579-g792f97b44ff, the code used as "padding" in the test case is optimized away. Prevent this optimization by forcing a read of the volatile memory. Also, validate that there is a far jump in the generated assembler. Without this patch, the generated asse

Re: [PATCH] Add push/pop_function_decl

2024-11-08 Thread Jakub Jelinek
On Fri, Nov 08, 2024 at 05:44:48PM +, Richard Sandiford wrote: > It's for https://gcc.gnu.org/pipermail/gcc-patches/2024-November/667499.html , > which needs to switch to the simd clone's chosen target (SVE) in order > to construct the correct types. Currently the patch uses: > > + cl_ta

[PATCH] testsuite: arm: Use effective-target for pr68674.c test

2024-11-08 Thread Torbjörn SVENSSON
Ok for trunk and releases/gcc-14? -- gcc/testsuite/ChangeLog: * gcc.target/arm/pr68674.c: Use effective-target arm_arch_v7a and arm_libc_fp_abi. Signed-off-by: Torbjörn SVENSSON --- gcc/testsuite/gcc.target/arm/pr68674.c | 7 --- 1 file changed, 4 insertions(+), 3 deletion

Re: [PATCH] testsuite: arm: Update expected asm in no-literal-pool-m0.c

2024-11-08 Thread Christophe Lyon
On Fri, 8 Nov 2024 at 15:30, Richard Earnshaw (lists) wrote: > > On 14/10/2024 16:28, Christophe Lyon wrote: > > > > > > On 10/14/24 16:40, Torbjorn SVENSSON wrote: > >> Hi Christophe, > >> > >> On 2024-10-14 14:16, Christophe Lyon wrote: > >>> Hi Torbjörn, > >>> > >>> > >>> On 10/13/24 19:37, Tor

Re: [PATCH] Add push/pop_function_decl

2024-11-08 Thread Richard Sandiford
Andrew Stubbs writes: > On 08/11/2024 12:25, Richard Sandiford wrote: >> For the aarch64 simd clones patches, it would be useful to be able to >> push a function declaration onto the cfun stack, even though it has no >> function body associated with it. That is, we want cfun to be null, >> curren

Re: [PATCH v2] arm: Don't ICE on arm_mve.h pragma without MVE types [PR117408]

2024-11-08 Thread Christophe Lyon
On Thu, 7 Nov 2024 at 18:05, Torbjörn SVENSSON wrote: > > Changes since v1: > > - Updated the error message to mention that arm_mve_types.h needs to be > included. > - Corrected some spelling errors in commit message. > > As the warning for pure functions returning void is not related to this >

[PATCH] c++: Add __builtin_operator_{new,delete} support

2024-11-08 Thread Jakub Jelinek
Hi! clang++ adds __builtin_operator_{new,delete} builtins which as documented work similarly to ::operator {new,delete}, except that it is an error if the called ::operator {new,delete} is not a replaceable global operator and allow optimizations which C++ normally allows just when those are used

Re: [PATCH] Add push/pop_function_decl

2024-11-08 Thread Andrew Stubbs
On 08/11/2024 12:25, Richard Sandiford wrote: For the aarch64 simd clones patches, it would be useful to be able to push a function declaration onto the cfun stack, even though it has no function body associated with it. That is, we want cfun to be null, current_function_decl to be the decl itse

Re: [PATCH] testsuite: arm: Update expected asm in no-literal-pool-m0.c

2024-11-08 Thread Torbjorn SVENSSON
On 2024-11-08 15:30, Richard Earnshaw (lists) wrote: On 14/10/2024 16:28, Christophe Lyon wrote: On 10/14/24 16:40, Torbjorn SVENSSON wrote: Hi Christophe, On 2024-10-14 14:16, Christophe Lyon wrote: Hi Torbjörn, On 10/13/24 19:37, Torbjörn SVENSSON wrote: Ok for trunk? -- With the

[PATCH v2] c: minor fixes related to arrays of unspecified size [PR116284,PR117391]

2024-11-08 Thread Martin Uecker
This version of the already approved patch only adds the missing word "size" to the commit message and a missing "-std=gnu23" to the first test. If there are no new comments, I will commit this once the pre-commit CI tests are complete. Bootstrapped and regression tested on x86_64. Martin

[PATCH 01/12] libstdc++: Refactor _Hashtable::operator=(initializer_list)

2024-11-08 Thread Jonathan Wakely
This replaces a call to _M_insert_range with open coding the loop. This will allow removing the node generator parameter from _M_insert_range in a later commit. libstdc++-v3/ChangeLog: * include/bits/hashtable.h (operator=(initializer_list)): Refactor to not use _M_insert_range.

[PATCH] AArch64: Cleanup fusion defines

2024-11-08 Thread Wilco Dijkstra
Cleanup the fusion defines by introducing AARCH64_FUSE_BASE as a common base level of fusion supported by almost all cores. Add AARCH64_FUSE_MOVK as a shortcut for all MOVK fusion. In most cases there is no change. It enables AARCH64_FUSE_CMP_BRANCH for a few older cores since it has no measura

[PATCH] AArch64: Remove duplicated addr_cost tables

2024-11-08 Thread Wilco Dijkstra
Remove duplicated addr_cost tables - use generic_armv9_a_addrcost_table for Armv9-a cores and generic_armv8_a_addrcost_table for recent Armv8-a cores. No changes in generated code. OK for commit? gcc/ChangeLog: * config/aarch64/tuning_models/cortexx925.h (cortexx925_addrcost_table): Re

[PATCH v2 0/4] aarch64: Add fp8 sve foundation

2024-11-08 Thread Claudio Bantaloukas
The ACLE defines a new set of fp8 vector types and intrinsics that operate on these, some of them operating on the vectors as if they were bags of bits and some requiring an additional argument of type fpm_t. The following patches introduce: - the types - intrinsics that operate without the fpm_

[PATCH v2 4/4] aarch64: add svcvt* FP8 intrinsics

2024-11-08 Thread Claudio Bantaloukas
This patch adds the following intrinsics: - svcvt1_bf16[_mf8]_fpm - svcvt1_f16[_mf8]_fpm - svcvt2_bf16[_mf8]_fpm - svcvt2_f16[_mf8]_fpm - svcvtlt1_bf16[_mf8]_fpm - svcvtlt1_f16[_mf8]_fpm - svcvtlt2_bf16[_mf8]_fpm - svcvtlt2_f16[_mf8]_fpm - svcvtn_mf8[_f16_x2]_fpm (unpredicated) - svcvtnb_mf8[_f32_

[PATCH v2 3/4] aarch64: specify fpm mode in function instances and groups

2024-11-08 Thread Claudio Bantaloukas
Some intrinsics require setting the fpm register before calling the specific asm opcode required. In order to simplify review, this patch: - adds the fpm_mode_index attribute to function_group_info and function_instance objects - updates existing initialisations and call sites. - updates equalit

[PATCH v2 1/4] aarch64: return scalar fp8 values in fp registers

2024-11-08 Thread Claudio Bantaloukas
According to the aapcs64: If the argument is an 8-bit (...) precision Floating-point or short vector type and the NSRN is less than 8, then the argument is allocated to the least significant bits of register v[NSRN]. gcc/ * config/aarch64/aarch64.cc (aarch64_vfp_is_call_or_return_

[PATCH 11/12] libstdc++: Simplify _Hashtable merge functions

2024-11-08 Thread Jonathan Wakely
I realised that _M_merge_unique and _M_merge_multi call extract(iter) which then has to call _M_get_previous_node to iterate through the bucket to find the node before the one iter points to. Since the merge function is already iterating over the entire container, we had the previous node a moment

[PATCH 09/12] libstdc++: Remove _Equality base class from _Hashtable

2024-11-08 Thread Jonathan Wakely
libstdc++-v3/ChangeLog: * include/bits/hashtable.h (_Hashtable): Remove _Equality base class. (_Hashtable::_M_equal): Define equality comparison here instead of in _Equality::_M_equal. * include/bits/hashtable_policy.h (_Equality): Remove. --- libstdc++-v3/

Re: [PATCH v17 2/2] c: Add __countof__ operator

2024-11-08 Thread Joseph Myers
On Fri, 8 Nov 2024, Alejandro Colomar wrote: > Hi Joseph, > > This is a gentle ping about this patch set, 10 days before the start of > stage 3. It's obviously not ready to include in its current form (using a name different from that actually accepted into C2Y). Since it requires significant

gcc-patches@gcc.gnu.org

2024-11-08 Thread Jonathan Wakely
We have two overloads of _M_find_before_node but they have quite different performance characteristics, which isn't necessarily obvious. The original version, _M_find_before_node(bucket, key, hash_code), looks only in the specified bucket, doing a linear search within that bucket for an element th

[PATCH 03/12] libstdc++: Refactor Hashtable insertion [PR115285]

2024-11-08 Thread Jonathan Wakely
This completely reworks the internal member functions for insertion into unordered containers. Currently we use a mixture of tag dispatching (for unique vs non-unique keys) and template specialization (for maps vs sets) to correctly implement insert and emplace members. This removes a lot of compl

[PATCH 10/12] libstdc++: Remove _Hashtable_base::_S_equals

2024-11-08 Thread Jonathan Wakely
This removes the overloaded _S_equals and _S_node_equals functions, replacing them with 'if constexpr' in the handful of places they're used. libstdc++-v3/ChangeLog: * include/bits/hashtable_policy.h (_Hashtable_base::_S_equals): Remove. (_Hashtable_base::_S_node_equals):

[PATCH 05/12] libstdc++: Add _Hashtable::_M_assign for the common case

2024-11-08 Thread Jonathan Wakely
This adds a convenient _M_assign overload for the common case where the node generator is the _AllocNode type. Only two places need to call _M_assign with a _ReuseOrAllocNode node generator, so all the other calls to _M_assign can use the new overload instead of manually constructing a node generat

[PATCH 06/12] libstdc++: Replace _Hashtable::__fwd_value_for with cast

2024-11-08 Thread Jonathan Wakely
We can just use a cast to the appropriate type instead of calling a function to do it. This gives the compiler less work to compile and optimize, and at -O0 avoids a function call per element. libstdc++-v3/ChangeLog: * include/bits/hashtable.h (_Hashtable::__fwd_value_for): Remove

[PATCH 02/12] libstdc++: Allow unordered_set assignment to assign to existing nodes

2024-11-08 Thread Jonathan Wakely
Currently the _ReuseOrAllocNode::operator(Args&&...) function always destroys the value stored in recycled nodes and constructs a new value. The _ReuseOrAllocNode type is only ever used for implementing assignment, either from another unordered container of the same type, or from std::initializer_

[PATCH 07/12] libstdc++: Use RAII in _Hashtable

2024-11-08 Thread Jonathan Wakely
Use scoped guard types to clean up if an exception is thrown. This allows some try-catch blocks to be removed. libstdc++-v3/ChangeLog: * include/bits/hashtable.h (operator=(const _Hashtable&)): Use RAII instead of try-catch. (_M_assign(_Ht&&, _NodeGenerator&)): Likewise.
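
A minimal sketch of the scoped-guard idea, assuming a plain std::vector in place of _Hashtable. ClearGuard and assign_with_guard are hypothetical names and are not taken from hashtable.h. The guard's destructor performs the cleanup that a catch block would otherwise do, and release() disarms it once the operation has succeeded.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical guard: clears the container on scope exit unless released.
struct ClearGuard {
    std::vector<int>* target;
    explicit ClearGuard(std::vector<int>& v) : target(&v) {}
    ~ClearGuard() { if (target) target->clear(); }
    void release() { target = nullptr; }
    ClearGuard(const ClearGuard&) = delete;
    ClearGuard& operator=(const ClearGuard&) = delete;
};

// Copy src into dst. Throws at index fail_at to simulate a failing
// element copy; the guard then leaves dst empty without any try-catch.
void assign_with_guard(std::vector<int>& dst, const std::vector<int>& src,
                       int fail_at = -1) {
    dst.clear();
    ClearGuard guard(dst);
    for (std::size_t i = 0; i < src.size(); ++i) {
        if (static_cast<int>(i) == fail_at)
            throw std::runtime_error("simulated copy failure");
        dst.push_back(src[i]);
    }
    guard.release();  // success: keep the copied elements
}

// Helper for checking the failure path.
bool failure_leaves_empty() {
    std::vector<int> dst{9, 9};
    try {
        assign_with_guard(dst, {1, 2, 3}, 2);
    } catch (const std::runtime_error&) {
        return dst.empty();
    }
    return false;
}

// Helper for checking the success path.
std::size_t success_size() {
    std::vector<int> dst;
    assign_with_guard(dst, {1, 2, 3});
    return dst.size();
}
```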

[PATCH 04/12] libstdc++: Refactor Hashtable erasure

2024-11-08 Thread Jonathan Wakely
This reworks the internal member functions for erasure from unordered containers, similarly to the earlier commit doing it for insertion. Instead of multiple overloads of _M_erase which are selected via tag dispatching, the erase(const key_type&) member can use 'if constexpr' to choose an appropri
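
To illustrate the general technique (not the actual _M_erase code), the sketch below picks between a unique-key and a multi-key erase strategy with 'if constexpr' inside a single function template, where tag dispatching would need two overloads. It assumes C++17 and a std::vector standing in for the container; erase_key and the demo helpers are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// One function template instead of two tag-dispatched overloads.
// Only the selected branch is instantiated.
template <bool UniqueKeys>
std::size_t erase_key(std::vector<int>& v, int key) {
    if constexpr (UniqueKeys) {
        // At most one match: stop at the first element found.
        auto it = std::find(v.begin(), v.end(), key);
        if (it == v.end())
            return 0;
        v.erase(it);
        return 1;
    } else {
        // Possibly several matches: remove all of them and count.
        auto new_end = std::remove(v.begin(), v.end(), key);
        auto n = static_cast<std::size_t>(v.end() - new_end);
        v.erase(new_end, v.end());
        return n;
    }
}

std::size_t demo_unique() {
    std::vector<int> v{1, 2, 3};
    return erase_key<true>(v, 2);
}

std::size_t demo_multi() {
    std::vector<int> v{1, 2, 2, 3, 2};
    return erase_key<false>(v, 2);
}
```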

[PATCH 08/12] libstdc++: Remove _Insert base class from _Hashtable

2024-11-08 Thread Jonathan Wakely
There's no reason to have a separate base class defining the insert member functions now. They can all be moved into the _Hashtable class, which simplifies them slightly. libstdc++-v3/ChangeLog: * include/bits/hashtable.h (_Hashtable): Remove inheritance from __detail::_Insert and

[PATCH 0/12] libstdc++: Refactor _Hashtable class

2024-11-08 Thread Jonathan Wakely
This patch series attempts to remove some unnecessary complexity in the internals of std::unordered_xxx containers. There is a lot of overloading, tag dispatching, and inheritance that can be removed by using modern C++ features (with appropriate pragmas to disable warnings for older -std modes).

Re: [PATCH] AArch64: Block combine_and_move from creating FP literal loads

2024-11-08 Thread Richard Sandiford
Wilco Dijkstra writes: > Hi Richard, > >> That's because, once an instruction matches, the instruction should >> continue to match. It should always be possible to set the INSN_CODE of >> an existing instruction to -1, rerun recog, and get the same instruction >> code back. >> >> Because of that,

[committed] libstdc++: Make some _Hashtable members inline

2024-11-08 Thread Jonathan Wakely
libstdc++-v3/ChangeLog: * include/bits/hashtable.h (_Hashtable): Add 'inline' to some one-line constructors. Reviewed-by: François Dumont --- Tested x86_64-linux. Pushed to trunk. libstdc++-v3/include/bits/hashtable.h | 3 +++ 1 file changed, 3 insertions(+) diff --git a/libst

RE: [PATCH v2 01/10] Match: Simplify branch form 4 of unsigned SAT_ADD into branchless

2024-11-08 Thread Li, Pan2
Thanks Richard for comments. > That said - I'd avoid canonicalizing this via match.pd given that > inevitably will if-convert. I see, if no more concern I will revert the simplify merged into match.pd. > Instead I'd see it as a way to provide a generic .SAT_* expansion > though one could say we

Re: [PATCH] c++: Small initial fixes for zeroing of padding bits [PR117256]

2024-11-08 Thread Jason Merrill
On 11/8/24 4:29 AM, Jakub Jelinek wrote: Hi! https://eel.is/c++draft/dcl.init#general-6 says that even padding bits are supposed to be zeroed during zero-initialization. The following patch on top of the https://gcc.gnu.org/pipermail/gcc-patches/2024-October/665565.html patch attempts to impleme

[PATCH v3 23/23] aarch64: Fix nonlocal goto tests incompatible with GCS

2024-11-08 Thread Yury Khrustalev
gcc/testsuite/ChangeLog: * gcc.target/aarch64/gcs-nonlocal-3.c: New test. * gcc.target/aarch64/sme/nonlocal_goto_4.c: Update. * gcc.target/aarch64/sme/nonlocal_goto_5.c: Update. * gcc.target/aarch64/sme/nonlocal_goto_6.c: Update. --- .../gcc.target/aarch64/gcs-nonlo

Re: [PATCH] libstdc++: Add some further attributes to ::operator new in

2024-11-08 Thread Jonathan Wakely
On Fri, 1 Nov 2024 at 10:42, Jakub Jelinek wrote: > > Hi! > > I've noticed alloc_align attribute is missing on the non-vector > ::operator new with std::align_val_t and const std::nothrow_t& > arguments, this patch adds it. The last hunk is just > an attempt to make the line shorter. > The first

[PATCH v3 22/23] aarch64: Fix tests incompatible with GCS

2024-11-08 Thread Yury Khrustalev
From: Matthieu Longo gcc/testsuite/ChangeLog: * g++.target/aarch64/return_address_sign_ab_exception.C: Update. * gcc.target/aarch64/eh_return.c: Update. --- .../return_address_sign_ab_exception.C| 19 +-- gcc/testsuite/gcc.target/aarch64/eh_return.c | 13

[PATCH v3 20/23] aarch64: Introduce indirect_return attribute

2024-11-08 Thread Yury Khrustalev
From: Szabolcs Nagy Tail calls of indirect_return functions from non-indirect_return functions are disallowed even if BTI is disabled, since the call site may have BTI enabled. Following x86, mismatching attribute on function pointers is not a type error even though this can lead to bugs. Neede

[PATCH v3 17/23] aarch64: Emit GNU property NOTE for GCS

2024-11-08 Thread Yury Khrustalev
From: Szabolcs Nagy gcc/ChangeLog: * config/aarch64/aarch64.cc (GNU_PROPERTY_AARCH64_FEATURE_1_GCS): Define. (aarch64_file_end_indicate_exec_stack): Set GCS property bit. --- gcc/config/aarch64/aarch64.cc | 5 + 1 file changed, 5 insertions(+) diff --git a/gcc/confi

[PATCH v3 19/23] aarch64: libatomic: add GCS marking to asm

2024-11-08 Thread Yury Khrustalev
From: Szabolcs Nagy libatomic/ChangeLog: * config/linux/aarch64/atomic_16.S (FEATURE_1_GCS): Define. (GCS_FLAG): Define if GCS is enabled. (GNU_PROPERTY): Add GCS_FLAG. --- libatomic/config/linux/aarch64/atomic_16.S | 11 +-- 1 file changed, 9 insertions(+), 2 de

[PATCH v3 18/23] aarch64: libgcc: add GCS marking to asm

2024-11-08 Thread Yury Khrustalev
From: Szabolcs Nagy libgcc/ChangeLog: * config/aarch64/aarch64-asm.h (FEATURE_1_GCS): Define. (GCS_FLAG): Define if GCS is enabled. (GNU_PROPERTY): Add GCS_FLAG. --- libgcc/config/aarch64/aarch64-asm.h | 16 ++-- 1 file changed, 14 insertions(+), 2 deletions(

[PATCH v3 15/23] aarch64: Add target pragma tests for gcs

2024-11-08 Thread Yury Khrustalev
From: Szabolcs Nagy gcc/testsuite/ChangeLog: * gcc.target/aarch64/pragma_cpp_predefs_4.c: Add gcs specific tests. --- .../gcc.target/aarch64/pragma_cpp_predefs_4.c | 35 +++ 1 file changed, 35 insertions(+) diff --git a/gcc/testsuite/gcc.target/aarch64/pragma_cp

[PATCH v3 21/23] aarch64: Add tests and docs for indirect_return attribute

2024-11-08 Thread Yury Khrustalev
From: Richard Ball This patch adds a new testcase and docs for indirect_return attribute. gcc/ChangeLog: * doc/extend.texi: Add AArch64 docs for indirect_return attribute. gcc/testsuite/ChangeLog: * gcc.target/aarch64/indirect_return-1.c: New test. * gcc.target

[PATCH v3 16/23] aarch64: Add GCS support to the unwinder

2024-11-08 Thread Yury Khrustalev
From: Szabolcs Nagy Follows the current linux ABI that uses single signal entry token and shared shadow stack between thread and alt stack. Could be behind __ARM_FEATURE_GCS_DEFAULT ifdef (only do anything special with gcs compat codegen) but there is a runtime check anyway. Change affected test

[PATCH v3 10/23] aarch64: Add __builtin_aarch64_gcs* and __gcs* tests

2024-11-08 Thread Yury Khrustalev
From: Szabolcs Nagy gcc/testsuite/ChangeLog: * gcc.target/aarch64/acle/gcs-1.c: New test. * gcc.target/aarch64/gcspopm-1.c: New test. * gcc.target/aarch64/gcspr-1.c: New test. * gcc.target/aarch64/gcsss-1.c: New test. Co-authored-by: Yury Khrustalev --- gcc/tes

Re: [PATCH v2] testsuite: arm: Use check-function-bodies in epilog-1.c test

2024-11-08 Thread Richard Earnshaw (lists)
On 08/11/2024 12:12, Torbjorn SVENSSON wrote: On 2024-11-08 11:33, Richard Earnshaw (lists) wrote: On 08/11/2024 08:54, Torbjörn SVENSSON wrote: Changes since v1: - Added generated assembler in commit message. - Added comments in test case when each block is relevant. Ok for trunk and relea

[PATCH v3 12/23] aarch64: Add non-local goto and jump tests for GCS

2024-11-08 Thread Yury Khrustalev
From: Szabolcs Nagy These are scan asm tests only, relying on existing execution tests for runtime coverage. gcc/testsuite/ChangeLog: * gcc.target/aarch64/gcs-nonlocal-1.c: New test. * gcc.target/aarch64/gcs-nonlocal-1-track-speculation.c: New test. * gcc.target/aarch64/

[PATCH v3 14/23] aarch64: Add test for GCS ACLE defs

2024-11-08 Thread Yury Khrustalev
From: Szabolcs Nagy gcc/testsuite/ChangeLog: * gcc.target/aarch64/pragma_cpp_predefs_1.c: GCS test. --- .../gcc.target/aarch64/pragma_cpp_predefs_1.c | 30 +++ 1 file changed, 30 insertions(+) diff --git a/gcc/testsuite/gcc.target/aarch64/pragma_cpp_predefs_1.c b/gcc/t

[PATCH v3 07/23] aarch64: Add GCS instructions

2024-11-08 Thread Yury Khrustalev
From: Szabolcs Nagy Add instructions for the Guarded Control Stack extension. GCSSS1 and GCSSS2 are always used together in the compiler, and an extra "mov xn, 0" should always be added before GCSSS2 to clear the output register. This is needed to get a reasonable result when GCS is disabled, when

[PATCH v3 13/23] aarch64: Add ACLE feature macros for GCS

2024-11-08 Thread Yury Khrustalev
From: Szabolcs Nagy gcc/ChangeLog: * config/aarch64/aarch64-c.cc (aarch64_update_cpp_builtins): Define macros for GCS. --- gcc/config/aarch64/aarch64-c.cc | 3 +++ 1 file changed, 3 insertions(+) diff --git a/gcc/config/aarch64/aarch64-c.cc b/gcc/config/aarch64/aarch64-c.cc ind

[PATCH v3 06/23] aarch64: Add __builtin_aarch64_chkfeat and __chkfeat tests

2024-11-08 Thread Yury Khrustalev
From: Szabolcs Nagy gcc/testsuite/ChangeLog: * gcc.target/aarch64/acle/chkfeat-1.c: New test. * gcc.target/aarch64/chkfeat-1.c: New test. * gcc.target/aarch64/chkfeat-2.c: New test. Co-authored-by: Yury Khrustalev Co-authored-by: Richard Sandiford --- .../gcc.target/a

[PATCH v3 05/23] aarch64: Add ACLE __chkfeat intrinsic

2024-11-08 Thread Yury Khrustalev
Note that compared to __builtin_aarch64_chkfeat (x) the ACLE __chkfeat(x) flips the bits to be more intuitive (xor the input to output). gcc/ChangeLog: * config/aarch64/arm_acle.h (__chkfeat): New. --- gcc/config/aarch64/arm_acle.h | 13 + 1 file changed, 13 insertions(+) dif
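The bit flip described in this note can be sketched in portable C. This is only a model, not the arm_acle.h code: the function names and the `hw_enabled` parameter are hypothetical stand-ins for the hardware state, and it assumes the chkfeat instruction clears each requested bit whose feature is enabled, which is what makes the raw builtin result read "inverted".

```c
#include <stdint.h>

/* Hypothetical model of the raw builtin: the instruction is assumed to
   clear every requested bit whose feature is enabled, so a 0 bit in the
   result means "enabled".  hw_enabled stands in for the hardware state. */
static uint64_t model_builtin_chkfeat(uint64_t feat, uint64_t hw_enabled)
{
    return feat & ~hw_enabled;
}

/* Hypothetical model of the ACLE wrapper: xor the input with the builtin
   result so that a 1 bit in the result means "enabled". */
static uint64_t model_acle_chkfeat(uint64_t feat, uint64_t hw_enabled)
{
    return model_builtin_chkfeat(feat, hw_enabled) ^ feat;
}
```

With this model, querying a feature bit that is enabled returns that bit set from the wrapper and cleared from the builtin; bits that were not requested stay 0 in both results.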

[PATCH v3 11/23] aarch64: Add GCS support for nonlocal stack save

2024-11-08 Thread Yury Khrustalev
From: Szabolcs Nagy Nonlocal stack save and restore has to also save and restore the GCS pointer. This is used in __builtin_setjmp/longjmp and nonlocal goto. The GCS specific code is only emitted if GCS branch-protection is enabled and the code always checks at runtime if GCS is enabled. The ne

[PATCH v3 09/23] aarch64: Add ACLE __gcs* intrinsics

2024-11-08 Thread Yury Khrustalev
Add the following ACLE intrinsics: - void *__gcspr(void); - uint64_t __gcspopm(void); - void *__gcsss(void *); gcc/ChangeLog: * config/aarch64/arm_acle.h (__gcspr): New. (__gcspopm): New. (__gcsss): New. --- gcc/config/aarch64/arm_acle.h | 9 + 1 file changed,

[PATCH v3 08/23] aarch64: Add GCS builtins

2024-11-08 Thread Yury Khrustalev
From: Szabolcs Nagy Add new builtins for GCS: void *__builtin_aarch64_gcspr (void) uint64_t __builtin_aarch64_gcspopm (void) void *__builtin_aarch64_gcsss (void *) The builtins are always enabled, but should be used behind runtime checks in case the target does not support GCS. They are t

[PATCH v3 01/23] aarch64: Add -mbranch-protection=gcs option

2024-11-08 Thread Yury Khrustalev
From: Szabolcs Nagy This enables Guarded Control Stack (GCS) compatible code generation. The "standard" branch-protection type enables it, and the default depends on the compiler default. gcc/ChangeLog: * config/aarch64/aarch64-protos.h (aarch_gcs_enabled): Declare. * config/aa

[PATCH v3 02/23] aarch64: Add branch-protection target pragma tests

2024-11-08 Thread Yury Khrustalev
From: Szabolcs Nagy gcc/testsuite/ChangeLog: * gcc.target/aarch64/pragma_cpp_predefs_4.c: Add branch-protection tests. --- .../gcc.target/aarch64/pragma_cpp_predefs_4.c | 50 +++ 1 file changed, 50 insertions(+) diff --git a/gcc/testsuite/gcc.target/aarch64/prag

[PATCH v3 04/23] aarch64: Add __builtin_aarch64_chkfeat

2024-11-08 Thread Yury Khrustalev
From: Szabolcs Nagy Builtin for chkfeat: the input argument is used to initialize x16, then chkfeat is executed and the updated x16 is returned. Note: the ACLE __chkfeat(x) will flip the bits to be more intuitive (xor the input to output), but for the builtin that seems an unnecessary complication. gcc/Chan

[PATCH v3 03/23] aarch64: Add support for chkfeat insn

2024-11-08 Thread Yury Khrustalev
From: Szabolcs Nagy This is a hint space instruction to check for enabled HW features and update the x16 register accordingly. Use unspec_volatile to prevent reordering it around calls since calls can enable or disable HW features. gcc/ChangeLog: * config/aarch64/aarch64.md (aarch64_ch

[PATCH v3 00/23] aarch64: Add support for Guarded Control Stack extension

2024-11-08 Thread Yury Khrustalev
This patch series adds support for the Guarded Control Stack extension [1]. GCS marking for binaries is specified in [2]. ACLE intrinsics are discussed in [3]. Regression tested on AArch64 and no regressions have been found. Applies to 137b26412f6 in trunk. Is this OK for trunk? Sources and bra

[committed] libstdc++: Do not define _Insert_base::try_emplace before C++17

2024-11-08 Thread Jonathan Wakely
This is not a reserved name in C++11 and C++14, so must not be defined. Also use the appropriate feature test macros for the try_emplace members of the Debug Mode maps. libstdc++-v3/ChangeLog: * include/bits/hashtable_policy.h (_Insert_base::try_emplace): Do not define for C++11

Re: [PATCH v2 07/21] aarch64: Add GCS builtins

2024-11-08 Thread Yury Khrustalev
Hi Kyrill, On Thu, Oct 31, 2024 at 02:05:00PM +, Kyrylo Tkachov wrote: > Hi Yury, > > > On 31 Oct 2024, at 14:23, Yury Khrustalev wrote: > > > > From: Szabolcs Nagy > > > > Add new builtins for GCS: > > > > void *__builtin_aarch64_gcspr (void) > > uint64_t __builtin_aarch64_gcspopm (vo

Re: [PATCH] testsuite: arm: Update expected asm in no-literal-pool-m0.c

2024-11-08 Thread Richard Earnshaw (lists)
On 14/10/2024 16:28, Christophe Lyon wrote: On 10/14/24 16:40, Torbjorn SVENSSON wrote: Hi Christophe, On 2024-10-14 14:16, Christophe Lyon wrote: Hi Torbjörn, On 10/13/24 19:37, Torbjörn SVENSSON wrote: Ok for trunk? -- With the changes in r15-1579-g792f97b44ff, the constants have been
