RE: [PATCH v2 01/10] Match: Simplify branch form 4 of unsigned SAT_ADD into branchless

2024-11-08 Thread Li, Pan2
Thanks Richard for comments. > That said - I'd avoid canonicalizing this via match.pd given that > inevitably will if-convert. I see, if no more concern I will revert the simplify merged into match.pd. > Instead I'd see it as a way to provide a generic .SAT_* expansion > though one could say we

[PATCH 08/12] libstdc++: Remove _Insert base class from _Hashtable

2024-11-08 Thread Jonathan Wakely
There's no reason to have a separate base class defining the insert member functions now. They can all be moved into the _Hashtable class, which simplifies them slightly. libstdc++-v3/ChangeLog: * include/bits/hashtable.h (_Hashtable): Remove inheritance from __detail::_Insert and

[PATCH 04/12] libstdc++: Refactor Hashtable erasure

2024-11-08 Thread Jonathan Wakely
This reworks the internal member functions for erasure from unordered containers, similarly to the earlier commit doing it for insertion. Instead of multiple overloads of _M_erase which are selected via tag dispatching, the erase(const key_type&) member can use 'if constexpr' to choose an appropri

[PATCH 09/12] libstdc++: Remove _Equality base class from _Hashtable

2024-11-08 Thread Jonathan Wakely
libstdc++-v3/ChangeLog: * include/bits/hashtable.h (_Hashtable): Remove _Equality base class. (_Hashtable::_M_equal): Define equality comparison here instead of in _Equality::_M_equal. * include/bits/hashtable_policy.h (_Equality): Remove. --- libstdc++-v3/

Re: [PATCH] Add push/pop_function_decl

2024-11-08 Thread Jakub Jelinek
On Fri, Nov 08, 2024 at 05:44:48PM +, Richard Sandiford wrote: > It's for https://gcc.gnu.org/pipermail/gcc-patches/2024-November/667499.html , > which needs to switch to the simd clone's chosen target (SVE) in order > to construct the correct types. Currently the patch uses: > > + cl_ta

[committed] hppa: Don't allow mode size 32 in hard registers

2024-11-08 Thread John David Anglin
Tested on hppa64-hp-hpux11.11. Committed to trunk. Dave --- hppa: Don't allow mode size 32 in hard registers 2024-11-08 John David Anglin gcc/ChangeLog: PR target/117238 * config/pa/pa64-regs.h (PA_HARD_REGNO_MODE_OK): Don't allow mode size 32. diff --git a/gcc/con

[committed] hppa: Don't use '%' operator in base14_operand

2024-11-08 Thread John David Anglin
Tested on hppa-unknown-linux-gnu and hppa64-hp-hpux11.11. Committed to trunk and gcc-14. Dave --- hppa: Don't use '%' operator in base14_operand Division is slow on hppa and mode sizes are powers of 2. So, we can use '&' operator to check displacement alignment. 2024-11-08 John David Anglin
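The point made in the message above (a power-of-2 size allows '&' in place of '%' for an alignment check) can be sketched in plain C. This is only an illustration, not the code from the patch: the helper names aligned_mod and aligned_mask are hypothetical, and the sketch assumes size is a positive power of 2.

```c
#include <stdbool.h>

/* Hypothetical helper (not from the patch): alignment check using '%',
   which typically costs a division. */
static bool aligned_mod(long disp, long size)
{
    return disp % size == 0;
}

/* Hypothetical helper (not from the patch): the same check using '&'.
   Assumes size is a positive power of 2, so size - 1 is a mask of the
   low bits that must all be zero for disp to be aligned. */
static bool aligned_mask(long disp, long size)
{
    return (disp & (size - 1)) == 0;
}
```

For any power-of-2 size the two helpers agree, so the mask form can stand in for the modulo form wherever that precondition holds.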

Re: [PATCH v3] C: Support Function multiversionsing in the C front end

2024-11-08 Thread Joseph Myers
I should also add: the ACLE specification for the details of how function multiversioning is supposed to work in terms of interactions of declarations for different versions in the same or different scopes and what happens regarding forming composite types seems rather vague. So maybe it would

[PATCH] testsuite: arm: Check that a far jump is used in thumb1-far-jump-2.c

2024-11-08 Thread Torbjörn SVENSSON
Ok for trunk? -- With the changes in r15-1579-g792f97b44ff, the code used as "padding" in the test case is optimized away. Prevent this optimization by forcing a read of the volatile memory. Also, validate that there is a far jump in the generated assembler. Without this patch, the generated asse

Re: [PATCH] testsuite: arm: Check that a far jump is used in thumb1-far-jump-2.c

2024-11-08 Thread Christophe Lyon
On Fri, 8 Nov 2024 at 19:20, Torbjörn SVENSSON wrote: > > Ok for trunk? > > -- > > With the changes in r15-1579-g792f97b44ff, the code used as "padding" in > the test case is optimized away. Prevent this optimization by forcing a > read of the volatile memory. > Also, validate that there is a far j

Re: [PATCH] Add push/pop_function_decl

2024-11-08 Thread Richard Sandiford
Jakub Jelinek writes: > On Fri, Nov 08, 2024 at 05:44:48PM +, Richard Sandiford wrote: >> It's for https://gcc.gnu.org/pipermail/gcc-patches/2024-November/667499.html >> , >> which needs to switch to the simd clone's chosen target (SVE) in order >> to construct the correct types. Currently t

[PATCH V2 3/11] Do not allow -mvsx to boost processor to power7.

2024-11-08 Thread Michael Meissner
This patch restructures the code so that -mvsx for example will not silently convert the processor to power7. The user must now use -mcpu=power7 or higher. This means if the user does -mvsx and the default processor does not have VSX support, it will be an error. I have built both big endian and

[PATCH V2 4/11] Change TARGET_POPCNTB to TARGET_POWER5

2024-11-08 Thread Michael Meissner
As part of the architecture flags patches, this patch changes the use of TARGET_POPCNTB to TARGET_POWER5. The POPCNTB instruction was added in ISA 2.02 (power5). I have built both big endian and little endian bootstrap compilers and there were no regressions. In addition, I constructed a test ca

[PATCH V2 5/11] Change TARGET_FPRND to TARGET_POWER5X

2024-11-08 Thread Michael Meissner
As part of the architecture flags patches, this patch changes the use of TARGET_FPRND to TARGET_POWER5X. The FPRND instruction was added in power5+. I have built both big endian and little endian bootstrap compilers and there were no regressions. In addition, I constructed a test case that used

[PATCH V2 8/11] Change TARGET_MODULO to TARGET_POWER9

2024-11-08 Thread Michael Meissner
As part of the architecture flags patches, this patch changes the use of TARGET_MODULO to TARGET_POWER9. The modulo instructions were added in power9 (ISA 3.0). Note, I did not change the uses of TARGET_MODULO where it was explicitly generating different code if the machine had a modulo instruct

[PATCH V2 9/11] Update tests to work with architecture flags changes.

2024-11-08 Thread Michael Meissner
Two tests used -mvsx to raise the processor level to at least power7. These tests were rewritten to add cpu=power7 support. I have built both big endian and little endian bootstrap compilers and there were no regressions. In addition, I constructed a test case that used every architecture define

[PATCH V2 10/11] Add support for -mcpu=future

2024-11-08 Thread Michael Meissner
This patch adds the support that can be used in developing GCC support for future PowerPC processors. 2024-11-06 Michael Meissner * config.gcc (powerpc*-*-*): Add support for --with-cpu=future. * config/rs6000/aix71.h (ASM_CPU_SPEC): Add support for -mcpu=future. * conf

[PATCH V2 11/11] Add -mcpu=future tuning support.

2024-11-08 Thread Michael Meissner
This patch makes -mtune=future use the same tuning decision as -mtune=power11. 2024-11-06 Michael Meissner gcc/ * config/rs6000/power10.md (all reservations): Add future as an alternative to power10 and power11. --- gcc/config/rs6000/power10.md | 144 +-

[PATCH 0/12] libstdc++: Refactor _Hashtable class

2024-11-08 Thread Jonathan Wakely
This patch series attempts to remove some unnecessary complexity in the internals of std::unordered_xxx containers. There is a lot of overloading, tag dispatching, and inheritance that can be removed by using modern C++ features (with appropriate pragmas to disable warnings for older -std modes).

[PATCH 11/12] libstdc++: Simplify _Hashtable merge functions

2024-11-08 Thread Jonathan Wakely
I realised that _M_merge_unique and _M_merge_multi call extract(iter) which then has to call _M_get_previous_node to iterate through the bucket to find the node before the one iter points to. Since the merge function is already iterating over the entire container, we had the previous node a moment

[PATCH] AArch64: Remove duplicated addr_cost tables

2024-11-08 Thread Wilco Dijkstra
Remove duplicated addr_cost tables - use generic_armv9_a_addrcost_table for Armv9-a cores and generic_armv8_a_addrcost_table for recent Armv8-a cores. No changes in generated code. OK for commit? gcc/ChangeLog: * config/aarch64/tuning_models/cortexx925.h (cortexx925_addrcost_table): Re

Re: [PATCH v2] arm: Don't ICE on arm_mve.h pragma without MVE types [PR117408]

2024-11-08 Thread Christophe Lyon
On Thu, 7 Nov 2024 at 18:05, Torbjörn SVENSSON wrote: > > Changes since v1: > > - Updated the error message to mention that arm_mve_types.h needs to be > included. > - Corrected some spelling errors in commit message. > > As the warning for pure functions returning void is not related to this >

Re: [PATCH] Add push/pop_function_decl

2024-11-08 Thread Richard Sandiford
Andrew Stubbs writes: > On 08/11/2024 12:25, Richard Sandiford wrote: >> For the aarch64 simd clones patches, it would be useful to be able to >> push a function declaration onto the cfun stack, even though it has no >> function body associated with it. That is, we want cfun to be null, >> curren

[PATCH] testsuite: arm: Use effective-target for pr68674.c test

2024-11-08 Thread Torbjörn SVENSSON
Ok for trunk and releases/gcc-14? -- gcc/testsuite/ChangeLog: * gcc.target/arm/pr68674.c: Use effective-target arm_arch_v7a and arm_libc_fp_abi. Signed-off-by: Torbjörn SVENSSON --- gcc/testsuite/gcc.target/arm/pr68674.c | 7 --- 1 file changed, 4 insertions(+), 3 deletion

Re: [PATCH] testsuite: arm: Update expected asm in no-literal-pool-m0.c

2024-11-08 Thread Christophe Lyon
On Fri, 8 Nov 2024 at 15:30, Richard Earnshaw (lists) wrote: > > On 14/10/2024 16:28, Christophe Lyon wrote: > > > > > > On 10/14/24 16:40, Torbjorn SVENSSON wrote: > >> Hi Christophe, > >> > >> On 2024-10-14 14:16, Christophe Lyon wrote: > >>> Hi Torbjörn, > >>> > >>> > >>> On 10/13/24 19:37, Tor

[PATCH V2 7/11] Change TARGET_POPCNTD to TARGET_POWER7

2024-11-08 Thread Michael Meissner
As part of the architecture flags patches, this patch changes the use of TARGET_POPCNTD to TARGET_POWER7. The POPCNTD instruction was added in power7 (ISA 2.06). I have built both big endian and little endian bootstrap compilers and there were no regressions. In addition, I constructed a test ca

[committed] hppa: Fix handling of secondary reloads involving a SUBREG

2024-11-08 Thread John David Anglin
Tested on hppa-unknown-linux-gnu and hppa64-hp-hpux11.11. Committed to trunk and gcc-14. Dave --- hppa: Fix handling of secondary reloads involving a SUBREG This is fairly subtle. When handling spills for SUBREG arguments in pa_emit_move_sequence, alter_subreg may be called. It in turn calls

Re: [PATCH v3] C: Support Function multiversionsing in the C front end

2024-11-08 Thread Joseph Myers
On Mon, 4 Nov 2024, alfie.richa...@arm.com wrote: > /* Subroutine of duplicate_decls. Compare NEWDECL to OLDDECL. > Returns true if the caller should proceed to merge the two, false > if OLDDECL should simply be discarded. As a side effect, issues > @@ -3365,11 +3382,53 @@ pushdecl (tre

Re: [PATCH v3] c: Implement C2y N3356, if declarations [PR117019]

2024-11-08 Thread Joseph Myers
On Fri, 8 Nov 2024, Marek Polacek wrote: > OK, I've reworded the comment to > > /* The call above already performed convert_lvalue_to_rvalue, but > if it parsed an expression, read_p was false. Make sure we mark > the expression as read. */ > > though it's questionable

[committed] hppa: Don't allow large modes in hard registers

2024-11-08 Thread John David Anglin
Tested on hppa-unknown-linux-gnu. Committed to trunk. Dave --- hppa: Don't allow large modes in hard registers LRA has problems handling spills for OI and TI modes. There are issues with SUBREG support as well. This change fixes gcc.c-torture/compile/pr92618.c with LRA. 2024-11-08 John Davi

Re: [PATCH v17 2/2] c: Add __countof__ operator

2024-11-08 Thread Joseph Myers
On Fri, 8 Nov 2024, Alejandro Colomar wrote: > Hi Joseph, > > This is a gentle ping about this patch set, 10 days before the start of > stage 3. It's obviously not ready to include in its current form (using a name different from that actually accepted into C2Y). Since it requires significant

[PATCH v2 4/4] aarch64: add svcvt* FP8 intrinsics

2024-11-08 Thread Claudio Bantaloukas
This patch adds the following intrinsics: - svcvt1_bf16[_mf8]_fpm - svcvt1_f16[_mf8]_fpm - svcvt2_bf16[_mf8]_fpm - svcvt2_f16[_mf8]_fpm - svcvtlt1_bf16[_mf8]_fpm - svcvtlt1_f16[_mf8]_fpm - svcvtlt2_bf16[_mf8]_fpm - svcvtlt2_f16[_mf8]_fpm - svcvtn_mf8[_f16_x2]_fpm (unpredicated) - svcvtnb_mf8[_f32_

[PATCH v2 0/4] aarch64: Add fp8 sve foundation

2024-11-08 Thread Claudio Bantaloukas
The ACLE defines a new set of fp8 vector types and intrinsics that operate on these, some of them operating on the vectors as if they were bags of bits and some requiring an additional argument of type fpm_t. The following patches introduce: - the types - intrinsics that operate without the fpm_

[PATCH] AArch64: Cleanup fusion defines

2024-11-08 Thread Wilco Dijkstra
Cleanup the fusion defines by introducing AARCH64_FUSE_BASE as a common base level of fusion supported by almost all cores. Add AARCH64_FUSE_MOVK as a shortcut for all MOVK fusion. In most cases there is no change. It enables AARCH64_FUSE_CMP_BRANCH for a few older cores since it has no measura

[PATCH] c++: Add __builtin_operator_{new,delete} support

2024-11-08 Thread Jakub Jelinek
Hi! clang++ adds __builtin_operator_{new,delete} builtins which as documented work similarly to ::operator {new,delete}, except that it is an error if the called ::operator {new,delete} is not a replaceable global operator and allow optimizations which C++ normally allows just when those are used

[PATCH v3] c: Implement C2y N3356, if declarations [PR117019]

2024-11-08 Thread Marek Polacek
On Fri, Nov 08, 2024 at 08:43:39PM +, Joseph Myers wrote: > On Thu, 7 Nov 2024, Marek Polacek wrote: > > > @@ -8355,7 +8492,9 @@ c_parser_switch_statement (c_parser *parser, bool > > *if_p, tree before_labels) > >if (c_parser_next_token_is (parser, CPP_OPEN_PAREN) > > && c_token

[PATCH v2] c: minor fixes related to arrays of unspecified size [PR116284,PR117391]

2024-11-08 Thread Martin Uecker
This version of the already approved patch only adds the missing word "size" to the commit message and a missing "-std=gnu23" to the first test. If there are no new comments, I will commit this once the pre-commit CI tests are complete. Bootstrapped and regression tested on x86_64. Martin

Re: [PATCH] testsuite: arm: Update expected asm in no-literal-pool-m0.c

2024-11-08 Thread Torbjorn SVENSSON
On 2024-11-08 15:30, Richard Earnshaw (lists) wrote: On 14/10/2024 16:28, Christophe Lyon wrote: On 10/14/24 16:40, Torbjorn SVENSSON wrote: Hi Christophe, On 2024-10-14 14:16, Christophe Lyon wrote: Hi Torbjörn, On 10/13/24 19:37, Torbjörn SVENSSON wrote: Ok for trunk? -- With the

Re: [PATCH] Add push/pop_function_decl

2024-11-08 Thread Andrew Stubbs
On 08/11/2024 12:25, Richard Sandiford wrote: For the aarch64 simd clones patches, it would be useful to be able to push a function declaration onto the cfun stack, even though it has no function body associated with it. That is, we want cfun to be null, current_function_decl to be the decl itse

[PATCH V2 0/11] Separate PowerPC archiecture bits from ISA flags that use command line options

2024-11-08 Thread Michael Meissner
These patches are a clean up in the PowerPC port to move architecture bits that are not user ISA options from rs6000_isa_flags to a new target variable rs6000_arch_flags. The intention is to remove switches that are currently isa options, but the user should not be using this particular option. For

Re: [PATCH] testsuite: arm: Require 16-bit float support

2024-11-08 Thread Torbjorn SVENSSON
On 2024-11-08 12:24, Richard Earnshaw (lists) wrote: On 05/11/2024 20:06, Torbjörn SVENSSON wrote: Based on how these functions are used in test cases, I think it's correct to require 16-bit float support in both functions. Without this change, the checks pass for armv8-m and armv8.1-m, bu

[PATCH V2 1/11] Add rs6000 architecture masks.

2024-11-08 Thread Michael Meissner
This patch begins the journey to move architecture bits that are not user ISA options from rs6000_isa_flags to a new target variable rs6000_arch_flags. The intention is to remove switches that are currently isa options, but the user should not be using this particular option. For example, we want u

[PATCH V2 2/11] Use architecture flags for defining _ARCH_PWR macros.

2024-11-08 Thread Michael Meissner
For the newer architectures, this patch changes GCC to define the _ARCH_PWR macros using the new architecture flags instead of relying on isa options like -mpower10. The -mpower8-internal, -mpower10, and -mpower11 options were removed. The -mpower11 option was removed completely, since it was jus

[PATCH] testsuite: arm: Use effective-target for unsigned-extend-1.c

2024-11-08 Thread Torbjörn SVENSSON
Ok for trunk and releases/gcc-14? -- A long time ago, this test forced -march=armv6. With -marm, the generated assembler is: foo: sub r0, r0, #48 cmp r0, #9 movhi r0, #0 movls r0, #1 bx lr With -mthumb, the generated assembler is: foo:

Re: [PATCH 0/11] Separate PowerPC architecture bits from ISA flags that use command line options

2024-11-08 Thread Michael Meissner
I have posted a new version of the patches at: https://gcc.gnu.org/pipermail/gcc-patches/2024-November/668177.html -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meiss...@linux.ibm.com

Re: [PATCH V2 1/11] Add rs6000 architecture masks.

2024-11-08 Thread Peter Bergner
On 11/8/24 1:44 PM, Michael Meissner wrote: > diff --git a/gcc/config/rs6000/rs6000-arch.def > b/gcc/config/rs6000/rs6000-arch.def > new file mode 100644 > index 000..e5b6e958133 > --- /dev/null > +++ b/gcc/config/rs6000/rs6000-arch.def > @@ -0,0 +1,48 @@ > +/* IBM RS/6000 CPU architecture

[PATCH] VN: Don't recurse on for the same value of `a | b` [PR117496]

2024-11-08 Thread Andrew Pinski
After adding vn_valueize to handle the `a | b ==/!= 0` case of insert_predicates_for_cond, it would go into an infinite loop as the value number for either a or b could be the same as what it is for the whole expression. This avoids that recursion so there is no infinite loop here. Bootstrappe

RE: [EXTERNAL] [PATCH] Enable autofdo bootstrap for lto/fortran

2024-11-08 Thread Eugene Rozenfeld
This line in gcc/fortran/Make-lang.in looks wrong (copy/paste?): +f95.fda: create_fdas_for_lto1 There are no invocations of $(CREATE_GCOV in gcc/fortran/Make-lang.in so this is incomplete. -Original Message- From: Andi Kleen Sent: Thursday, October 31, 2024 4:19 PM To: gcc-patches@gcc

RE: [EXTERNAL] Re: [PATCH] PR117350: Keep assembler name for abstract decls for autofdo

2024-11-08 Thread Eugene Rozenfeld
The patch looks good to me. -Original Message- From: Richard Biener Sent: Wednesday, November 6, 2024 12:01 AM To: Andi Kleen Cc: Jason Merrill ; Andi Kleen ; gcc-patches@gcc.gnu.org; Eugene Rozenfeld ; pins...@gmail.com; Andi Kleen Subject: [EXTERNAL] Re: [PATCH] PR117350: Keep asse

[PATCH] fold: Remove (rrotate (rrotate A CST) CST) folding [PR117492]

2024-11-08 Thread Andrew Pinski
This removes a (broken) simplification from fold which is already handled in match. It was broken because of the use of wi::to_wide on the RHS of the rotate, which could be 2 different types even though the LHS was the same type. Since it is already handled in match (by the patt

RE: [EXTERNAL] [PATCH] Update gcc-auto-profile / gen_autofdo_event.py

2024-11-08 Thread Eugene Rozenfeld
The patch looks good to me. Thank you for fixing this, Andi. -Original Message- From: Andi Kleen Sent: Thursday, October 31, 2024 4:37 PM To: gcc-patches@gcc.gnu.org Cc: Eugene Rozenfeld ; Andi Kleen Subject: [EXTERNAL] [PATCH] Update gcc-auto-profile / gen_autofdo_event.py From: Andi

Re: [PATCH] VN: Don't recurse on for the same value of `a | b` [PR117496]

2024-11-08 Thread Richard Biener
> On 09.11.2024 at 02:36, Andrew Pinski wrote: > > After adding vn_valueize to handle the `a | b ==/!= 0` case > of insert_predicates_for_cond, it would go into an infinite loop > as the value number for either a or b could be the same as what it > is for the whole expression. This avoid

Re: [PATCH] fold: Remove (rrotate (rrotate A CST) CST) folding [PR117492]

2024-11-08 Thread Richard Biener
> On 09.11.2024 at 05:00, Andrew Pinski wrote: > > This removes a (broken) simplification from fold which is already handled > in match. > It was broken because of the use of wi::to_wide on the RHS > of the > rotate, which could be 2 different types even though the LHS wa

Re: [PATCH] AArch64: Block combine_and_move from creating FP literal loads

2024-11-08 Thread Richard Sandiford
Wilco Dijkstra writes: > Hi Richard, > >> That's because, once an instruction matches, the instruction should >> continue to match. It should always be possible to set the INSN_CODE of >> an existing instruction to -1, rerun recog, and get the same instruction >> code back. >> >> Because of that,

[PATCH 07/12] libstdc++: Use RAII in _Hashtable

2024-11-08 Thread Jonathan Wakely
Use scoped guard types to clean up if an exception is thrown. This allows some try-catch blocks to be removed. libstdc++-v3/ChangeLog: * include/bits/hashtable.h (operator=(const _Hashtable&)): Use RAII instead of try-catch. (_M_assign(_Ht&&, _NodeGenerator&)): Likewise.

[PATCH 02/12] libstdc++: Allow unordered_set assignment to assign to existing nodes

2024-11-08 Thread Jonathan Wakely
Currently the _ReuseOrAllocNode::operator(Args&&...) function always destroys the value stored in recycled nodes and constructs a new value. The _ReuseOrAllocNode type is only ever used for implementing assignment, either from another unordered container of the same type, or from std::initializer_

[PATCH 06/12] libstdc++: Replace _Hashtable::__fwd_value_for with cast

2024-11-08 Thread Jonathan Wakely
We can just use a cast to the appropriate type instead of calling a function to do it. This gives the compiler less work to compile and optimize, and at -O0 avoids a function call per element. libstdc++-v3/ChangeLog: * include/bits/hashtable.h (_Hashtable::__fwd_value_for): Remove

[committed] libstdc++: Make some _Hashtable members inline

2024-11-08 Thread Jonathan Wakely
libstdc++-v3/ChangeLog: * include/bits/hashtable.h (_Hashtable): Add 'inline' to some one-line constructors. Reviewed-by: François Dumont --- Tested x86_64-linux. Pushed to trunk. libstdc++-v3/include/bits/hashtable.h | 3 +++ 1 file changed, 3 insertions(+) diff --git a/libst

[PATCH 05/12] libstdc++: Add _Hashtable::_M_assign for the common case

2024-11-08 Thread Jonathan Wakely
This adds a convenient _M_assign overload for the common case where the node generator is the _AllocNode type. Only two places need to call _M_assign with a _ReuseOrAllocNode node generator, so all the other calls to _M_assign can use the new overload instead of manually constructing a node generat

[PATCH 10/12] libstdc++: Remove _Hashtable_base::_S_equals

2024-11-08 Thread Jonathan Wakely
This removes the overloaded _S_equals and _S_node_equals functions, replacing them with 'if constexpr' in the handful of places they're used. libstdc++-v3/ChangeLog: * include/bits/hashtable_policy.h (_Hashtable_base::_S_equals): Remove. (_Hashtable_base::_S_node_equals):

[PATCH 01/12] libstdc++: Refactor _Hashtable::operator=(initializer_list)

2024-11-08 Thread Jonathan Wakely
This replaces a call to _M_insert_range with open coding the loop. This will allow removing the node generator parameter from _M_insert_range in a later commit. libstdc++-v3/ChangeLog: * include/bits/hashtable.h (operator=(initializer_list)): Refactor to not use _M_insert_range.

[PATCH v2 1/4] aarch64: return scalar fp8 values in fp registers

2024-11-08 Thread Claudio Bantaloukas
According to the aapcs64: If the argument is an 8-bit (...) precision Floating-point or short vector type and the NSRN is less than 8, then the argument is allocated to the least significant bits of register v[NSRN]. gcc/ * config/aarch64/aarch64.cc (aarch64_vfp_is_call_or_return_

[PATCH v2 3/4] aarch64: specify fpm mode in function instances and groups

2024-11-08 Thread Claudio Bantaloukas
Some intrinsics require setting the fpm register before calling the specific asm opcode required. In order to simplify review, this patch: - adds the fpm_mode_index attribute to function_group_info and function_instance objects - updates existing initialisations and call sites. - updates equalit

[PATCH 03/12] libstdc++: Refactor Hashtable insertion [PR115285]

2024-11-08 Thread Jonathan Wakely
This completely reworks the internal member functions for insertion into unordered containers. Currently we use a mixture of tag dispatching (for unique vs non-unique keys) and template specialization (for maps vs sets) to correctly implement insert and emplace members. This removes a lot of compl

gcc-patches@gcc.gnu.org

2024-11-08 Thread Jonathan Wakely
We have two overloads of _M_find_before_node but they have quite different performance characteristics, which isn't necessarily obvious. The original version, _M_find_before_node(bucket, key, hash_code), looks only in the specified bucket, doing a linear search within that bucket for an element th

Re: [PATCH] Add COBOL to gcc

2024-11-08 Thread James K. Lowden
On Fri, 8 Nov 2024 13:50:45 +0100 Jakub Jelinek wrote: > > * gcc-changelog/git_commit.py (default_changelog_locations): > > New entry for gcc/cobol. New entry for libgcobol. > > Dunno if your mailer ate the tabs at the start of the above 2 lines. > That is required so that it can be committed.

Re: [PATCH v2] c: Implement C2y N3356, if declarations [PR117019]

2024-11-08 Thread Joseph Myers
On Thu, 7 Nov 2024, Marek Polacek wrote: > @@ -8355,7 +8492,9 @@ c_parser_switch_statement (c_parser *parser, bool > *if_p, tree before_labels) >if (c_parser_next_token_is (parser, CPP_OPEN_PAREN) > && c_token_starts_typename (c_parser_peek_2nd_token (parser))) > explicit_ca

Re: [PATCH] AArch64: Cleanup fusion defines

2024-11-08 Thread Andrew Pinski
On Fri, Nov 8, 2024 at 8:56 AM Wilco Dijkstra wrote: > > > Cleanup the fusion defines by introducing AARCH64_FUSE_BASE as a common base > level of fusion supported by almost all cores. Add AARCH64_FUSE_MOVK as a > shortcut for all MOVK fusion. In most cases there is no change. It enables > AARC

Re: [PATCH] Add COBOL to gcc

2024-11-08 Thread James K. Lowden
On Fri, 8 Nov 2024 13:52:55 +0100 Jakub Jelinek wrote: > Rather than a diff from /dev/null, > > it's a blob with the exact file contents. I hope it is correct in > > this form. > > That is just how the web git viewer presents new file commits. > On gcc-patches those should be posted as normal p

Re: [PATCH V2 1/11] Add rs6000 architecture masks.

2024-11-08 Thread Segher Boessenkool
On Fri, Nov 08, 2024 at 02:28:11PM -0600, Peter Bergner wrote: > On 11/8/24 1:44 PM, Michael Meissner wrote: > > diff --git a/gcc/config/rs6000/rs6000-arch.def > > b/gcc/config/rs6000/rs6000-arch.def > > new file mode 100644 > > index 000..e5b6e958133 > > --- /dev/null > > +++ b/gcc/config

Re: [PATCH] c++: Small initial fixes for zeroing of padding bits [PR117256]

2024-11-08 Thread Jakub Jelinek
On Fri, Nov 08, 2024 at 10:29:09AM +0100, Jakub Jelinek wrote: > I think we need far more than that, but am not sure where exactly > to implement that. > In particular, I think __builtin_bitcast should take it into account > during constant evaluation, if the padding bits in something are guarantee

Re: [PATCH v2 01/10] Match: Simplify branch form 4 of unsigned SAT_ADD into branchless

2024-11-08 Thread Richard Biener
On Fri, Nov 8, 2024 at 12:34 AM Li, Pan2 wrote: > > Thanks Tamar and Jeff for comments. > > > I'm not sure it's that simple. It'll depend on the micro-architecture. > > So things like strength of the branch predictors, how fetch blocks are > > handled (can you have embedded not-taken branches, sh

[PATCH] tree-optimization/117484 - issue with SLP discovery of permuted .MASK_LOAD

2024-11-08 Thread Richard Biener
When we do SLP discovery of a .MASK_LOAD for a dataref group with gaps, the discovery for the mask will have gaps as well, and this was unexpected in a few places. The following re-organizes things slightly to accommodate this. Bootstrapped and tested on x86_64-unknown-linux-gnu. PR tre

Re: [PATCH v2] testsuite: arm: Use effective-target arm_libc_fp_abi for pr68620.c test

2024-11-08 Thread Richard Earnshaw (lists)
On 07/11/2024 17:48, Torbjörn SVENSSON wrote: Changes since v1: - Switch to arm_libc_fp_abi from arm_fp @Christophe, can you test this patch in the linaro farm to ensure that it does not fail again? Ok for trunk and releases/gcc-14? -- This fixes reported regression at https://linaro.atlassi

Re: [PATCH] testsuite: arm: Allow vst1.32 instruction in pr40457-2.c

2024-11-08 Thread Richard Earnshaw (lists)
On 07/11/2024 17:15, Torbjörn SVENSSON wrote: Ok for trunk and releases/gcc-14? -- When building the test case with neon, the 'vst1.32' instruction is used instead of 'strd'. Allow both variants to make the test pass. gcc/testsuite/ChangeLog: * gcc.target/arm/pr40457-2.c: Add vst1.32

Re: [PATCH 04/10] gimple: Disallow sizeless types in BIT_FIELD_REFs.

2024-11-08 Thread Tejas Belagod
On 11/8/24 1:19 PM, Richard Biener wrote: On Fri, Nov 8, 2024 at 7:30 AM Tejas Belagod wrote: On 11/7/24 5:52 PM, Richard Biener wrote: On Thu, Nov 7, 2024 at 11:13 AM Tejas Belagod wrote: On 11/7/24 2:36 PM, Richard Biener wrote: On Thu, Nov 7, 2024 at 8:25 AM Tejas Belagod wrote: On

Re: [PATCH] i386: Disallow long address mode in the x32 mode. [PR 117418]

2024-11-08 Thread Hongtao Liu
On Fri, Nov 8, 2024 at 3:18 PM Uros Bizjak wrote: > > On Fri, Nov 8, 2024 at 6:52 AM Hongtao Liu wrote: > > > > > > PR target/117418 > > > > > * config/i386/i386-options.cc > > > > > (ix86_option_override_internal): raise an > > > > > error with option -mx32 -maddress-mod

[PATCH v2] testsuite: arm: Use check-function-bodies in epilog-1.c test

2024-11-08 Thread Torbjörn SVENSSON
Changes since v1: - Added generated assembler in commit message. - Added comments in test case when each block is relevant. Ok for trunk and releases/gcc-14? -- Update test case for armv8.1-m.main that supports conditional arithmetic. armv7-m: push{r4, lr} ldr r4, .L6

Re: [PATCH v4 4/8] vect: Add maskload else value support.

2024-11-08 Thread Richard Biener
On Thu, 7 Nov 2024, Robin Dapp wrote: > From: Robin Dapp > > This patch adds an else operand to vectorized masked load calls. > The current implementation adds else-value arguments to the respective > target-querying functions that is used to supply the vectorizer with the > proper else value. >

[PATCH] c++: Small initial fixes for zeroing of padding bits [PR117256]

2024-11-08 Thread Jakub Jelinek
Hi! https://eel.is/c++draft/dcl.init#general-6 says that even padding bits are supposed to be zeroed during zero-initialization. The following patch on top of the https://gcc.gnu.org/pipermail/gcc-patches/2024-October/665565.html patch attempts to implement that, though only for the easy cases so

Re: [PATCH] testsuite: arm: Use effective-target for nomve_fp_1 test

2024-11-08 Thread Torbjorn SVENSSON
On 2024-11-07 23:19, Christophe Lyon wrote: On Thu, 7 Nov 2024 at 18:33, Torbjorn SVENSSON wrote: On 2024-11-07 11:40, Christophe Lyon wrote: Hi Torbjörn, On Thu, 31 Oct 2024 at 19:34, Torbjörn SVENSSON wrote: Ok for trunk and releases/gcc-14? -- Test uses MVE, so add effective-tar

[PATCH] trans-mem: Fix ICE caused by expand_assign_tm

2024-11-08 Thread Jakub Jelinek
Hi! My https://gcc.gnu.org/pipermail/gcc-patches/2024-November/668065.html patch regressed +FAIL: g++.dg/tm/pr45940-3.C -std=gnu++11 (internal compiler error: in create_tmp_var, at gimple-expr.cc:484) +FAIL: g++.dg/tm/pr45940-3.C -std=gnu++11 (test for excess errors) +FAIL: g++.dg/tm/pr45940-3.

Re: [PATCH] testsuite: arm: Use effective-target for pr84556.cc test

2024-11-08 Thread Richard Earnshaw (lists)
On 06/11/2024 09:39, Torbjörn SVENSSON wrote: Ok for trunk and releases/gcc-14? -- Using "dg-do run" with a selector breaks testing arm-none-eabi for any architecture when check_effective_target_arm_neon_hw returns 0. gcc/testsuite/ChangeLog: * g++.dg/vect/pr84556.cc: Change from "dg-

Re: [PATCH] testsuite: arm: Require 16-bit float support

2024-11-08 Thread Richard Earnshaw (lists)
On 05/11/2024 20:06, Torbjörn SVENSSON wrote: Based on how these functions are used in test cases, I think it's correct to require 16-bit float support in both functions. Without this change, the checks pass for armv8-m and armv8.1-m, but the test cases that use them fail due to the incorrec

Re: [PATCH 0/4] libsanitizer: merge from upstream

2024-11-08 Thread Jakub Jelinek
On Thu, Nov 07, 2024 at 02:35:34PM +0800, Kito Cheng wrote: > The patch set aims to update libsanitizer from upstream. The motivation is > that > RISC-V is changing the shadow offset for AddressSanitizer, and I also plan to > submit another patch set to add dynamic shadow offset support for GCC. >

Re: [PATCH 0/4] libsanitizer: merge from upstream

2024-11-08 Thread Xi Ruoyao
On Fri, 2024-11-08 at 12:35 +0100, Jakub Jelinek wrote: > On Thu, Nov 07, 2024 at 02:35:34PM +0800, Kito Cheng wrote: > > The patch set aims to update libsanitizer from upstream. The motivation is > > that > > RISC-V is changing the shadow offset for AddressSanitizer, and I also plan > > to > > s

[PATCH v2] testsuite: arm: Use effective-target for pr84556.cc test

2024-11-08 Thread Torbjörn SVENSSON
Changes since v1: - Clarified the commit message to include where the decision is taken and why it's a bad idea to use "dg-do run" in a test case. Note: This does not only fix it for arm-none-eabi. I see the same kind of construct used by, for example, sparc. Sorry for the confusion Richard,

Re: [PATCH v2] testsuite: arm: Use effective-target for pr84556.cc test

2024-11-08 Thread Richard Earnshaw (lists)
On 08/11/2024 11:48, Torbjörn SVENSSON wrote: Changes since v1: - Clarified the commit message to include where the decision is taken and why it's a bad idea to use "dg-do run" in a test case. Note: This does not only fix it for arm-none-eabi. I see the same kind of construct used by f

Re: [PATCH] AArch64: Block combine_and_move from creating FP literal loads

2024-11-08 Thread Wilco Dijkstra
Hi Richard, > It's ok for instructions to require properties that are false during > early RTL passes and then transition to true.  But they can't require > properties that go from true to false, since that would mean that > existing instructions become unrecognisable at certain points during > th

[PATCH] Enable gcc.dg/vect/vect-early-break_21.c on x86_64

2024-11-08 Thread Richard Biener
The following also enables the testcase on x86 as it now has the required cbranch. tested on x86_64, pushed. * gcc.dg/vect/vect-early-break_21.c: Remove disabling of x86_64 and i?86. --- gcc/testsuite/gcc.dg/vect/vect-early-break_21.c | 2 +- 1 file changed, 1 insertion(+), 1 del

Re: [PATCH 0/4] libsanitizer: merge from upstream

2024-11-08 Thread Jakub Jelinek
On Fri, Nov 08, 2024 at 07:38:13PM +0800, Xi Ruoyao wrote: > IIUC 4/4 shouldn't be in LOCAL_PATCHES? It modifies our own test case, > not from the upstream. Sure, sorry. Jakub

[PATCH]AArch64 backport Neoverse and Cortex CPU definitions

2024-11-08 Thread Tamar Christina
Hi All, This is a conservative backport of a few core definitions, backporting only the core definitions and mapping them to the closest cost model that exists on the branches. Bootstrapped and regtested on aarch64-none-linux-gnu on branches with no issues. Ok for GCC 13 and 14? Thanks, Tamar gcc/C

[PATCH] Add push/pop_function_decl

2024-11-08 Thread Richard Sandiford
For the aarch64 simd clones patches, it would be useful to be able to push a function declaration onto the cfun stack, even though it has no function body associated with it. That is, we want cfun to be null, current_function_decl to be the decl itself, and the target and optimisation flags to ref

[PATCH] Add missing SLP discovery for CFN[_MASK][_LEN]_SCATTER_STORE

2024-11-08 Thread Richard Biener
This was responsible for a bunch of SVE FAILs with --param vect-force-slp=1 Bootstrap and regtest running on x86_64-unknown-linux-gnu. * tree-vect-slp.cc (arg1_arg3_map): New. (arg1_arg3_arg4_map): Likewise. (vect_get_operand_map): Handle IFN_SCATTER_STORE, IFN_MAS

Re: [PATCH] Add push/pop_function_decl

2024-11-08 Thread Richard Biener
On Fri, 8 Nov 2024, Richard Sandiford wrote: > For the aarch64 simd clones patches, it would be useful to be able to > push a function declaration onto the cfun stack, even though it has no > function body associated with it. That is, we want cfun to be null, > current_function_decl to be the dec

Re: [PATCH 14/22] aarch64: Add GCS support to the unwinder

2024-11-08 Thread Yury Khrustalev
Hi Richard, On Thu, Oct 24, 2024 at 05:27:24PM +0100, Richard Sandiford wrote: > Yury Khrustalev writes: > > From: Szabolcs Nagy > > Could you explain these testsuite changes in more detail? It seems > on the face of it that they're changing the tests to test something > other than the origina

[committed] libstdc++: Simplify __detail::__distance_fw using 'if constexpr'

2024-11-08 Thread Jonathan Wakely
This uses 'if constexpr' instead of tag dispatching, removing the need for a second call using that tag, and simplifying the overload set that needs to be resolved for calls to __distance_fw. libstdc++-v3/ChangeLog: * include/bits/hashtable_policy.h (__distance_fw): Replace tag di

Re: [PATCH v2] testsuite: arm: Use effective-target for pr84556.cc test

2024-11-08 Thread Torbjorn SVENSSON
On 2024-11-08 12:57, Richard Earnshaw (lists) wrote: On 08/11/2024 11:48, Torbjörn SVENSSON wrote: Changes since v1: - Clarified the commit message to include where the decision is taken and why it's a bad idea to use "dg-do run" in a test case. Note: This does not only fix it for arm

[PATCH] tree-optimization/117502 - VMAT_STRIDED_SLP vs VMAT_ELEMENTWISE when considering gather

2024-11-08 Thread Richard Biener
The following treats both the same when considering whether to use gather or scatter for single-element interleaving accesses. Bootstrap and regtest running on x86_64-unknown-linux-gnu. PR tree-optimization/117502 * tree-vect-stmts.cc (get_group_load_store_type): Also consider VMA

Re: [PATCH] testsuite: arm: Allow vst1.32 instruction in pr40457-2.c

2024-11-08 Thread Torbjorn SVENSSON
On 2024-11-08 12:02, Richard Earnshaw (lists) wrote: On 07/11/2024 17:15, Torbjörn SVENSSON wrote: Ok for trunk and releases/gcc-14? -- When building the test case with neon, the 'vst1.32' instruction is used instead of 'strd'. Allow both variants to make the test pass. gcc/testsuite/Chang
