[r15-3464 Regression] FAIL: gcc.target/i386/avx10_2-partial-bf-vector-fma-1.c scan-assembler-times vfnmsub132nepbf16[ \\t]+[^{\n]*%xmm[0-9]+[^\n\r]*%xmm[0-9]+[^\n\r]*%xmm[0-9]+(?:\n|[ \\t]+#) 2 on Lin

2024-09-04 Thread haochen.jiang
On Linux/x86_64, f9ca3fd1fe30f3ee6725bfe4a612e9a1234c11ac is the first bad commit commit f9ca3fd1fe30f3ee6725bfe4a612e9a1234c11ac Author: Levy Hsu Date: Mon Sep 2 13:52:38 2024 +0800 i386: Support partial vectorized FMA for V2BF/V4BF caused FAIL: gcc.target/i386/avx10_2-partial-bf-vector

[PATCH] fab: Cleanup eh after optimize_memcpy [PR116601]

2024-09-04 Thread Andrew Pinski
When optimize_memcpy was added in r7-5443-g7b45d0dfeb5f85, a path was added such that a statement was turned into a non-throwing statement and maybe_clean_or_replace_eh_stmt/gimple_purge_dead_eh_edges would not be called for that statement. This adds these calls to that path. Bootstrapped and test

Re: [PATCH] aarch64: Handle attributes in the global namespace for aarch64_lookup_shared_state_flags [PR116598]

2024-09-04 Thread Andrew Pinski
On Wed, Sep 4, 2024 at 2:44 PM Andrew Pinski wrote: > > On Wed, Sep 4, 2024 at 2:36 PM Marek Polacek wrote: > > > > On Wed, Sep 04, 2024 at 02:05:21PM -0700, Andrew Pinski wrote: > > > The code in aarch64_lookup_shared_state_flags assumed all C++11 attributes on the > > > function type > > > had a names

[PATCH] aarch64: Use is_attribute_namespace_p and get_attribute_name inside aarch64_lookup_shared_state_flags [PR116598]

2024-09-04 Thread Andrew Pinski
The code in aarch64_lookup_shared_state_flags assumed all C++11 attributes on the function type had a namespace associated with them. But with the addition of reproducible/unsequenced, this is no longer true. This fixes the issue by using is_attribute_namespace_p instead of manually figuring out the namespac

Re: [PATCH] RISC-V: Make the setCC/REE tests robust to instruction selection

2024-09-04 Thread Jeff Law
On 9/4/24 4:07 PM, Palmer Dabbelt wrote: These tests were checking that the output of the setCC instruction was bit flipped, but it looks like they're really designed to test that redundant sign extension elimination fires on conditionals from function inputs. Jeff just posted a patch to clean

Re: FIXED_REGISTERS / ira_no_alloc_regs: aarch64 vs. risc-v (was Re: sched1 pathology on RISC-V : PR/114729)

2024-09-04 Thread Jeff Law
On 8/20/24 2:42 AM, Richard Sandiford wrote: Vineet Gupta writes: On 8/19/24 14:52, Richard Sandiford wrote: 2. On RISC-V sched1 is counter intuitively assuming HARD_FP is live due to the weird interaction of DF infra (which always marks HARD_FP with artificial def) and ira_no_alloc_regs.

Re: [RFC PATCH] RISC-V: Add support for LP64DV

2024-09-04 Thread Palmer Dabbelt
On Wed, 04 Sep 2024 19:24:41 PDT (-0700), Kito Cheng wrote: Just remember that adding a system-wide vector calling convention raises broad compatibility issues we need to worry about: for example, the jump buf (for setjmp/longjmp) will need to keep vector state, which it didn't need to keep before since all vectors are call

Re: [RFC PATCH] RISC-V: Add support for LP64DV

2024-09-04 Thread Kito Cheng
Just remember that adding a system-wide vector calling convention raises broad compatibility issues we need to worry about: for example, the jump buf (for setjmp/longjmp) will need to keep vector state, which it didn't need to keep before since all vectors are call-clobbered by default. That may also cause performance issues fo

RE: [PATCH v1] RISC-V: Fix SAT_* dump check failure due to middle-end change.

2024-09-04 Thread Li, Pan2
> This won't apply as I've already updated those tests. I think verifying > the number of SAT_ADDs is useful to ensure we don't regress as some of > these tests detect > 1 SAT_ADD idiom. I see, thanks Jeff. Then drop this patch. Pan -Original Message- From: Jeff Law Sent: Thursday,

Re: [PATCH v1] RISC-V: Fix SAT_* dump check failure due to middle-end change.

2024-09-04 Thread Jeff Law
On 9/4/24 8:01 PM, pan2...@intel.com wrote: From: Pan Li Some middle-end changes may affect the number of .SAT_* occurrences. Thus, refine the dump check for SAT_* from scan-times to scan, as we only care whether the .SAT_* exists or not. There will be another PATCH to perform similar refinement an

[PATCH] i386: Fix incorrect avx512f-mask-type.h include

2024-09-04 Thread Haochen Jiang
Hi all, In avx512f-mask-type.h, we need SIZE to be defined to get MASK_TYPE defined correctly. Fix those testcases where SIZE is not defined before the include of avx512f-mask-type.h. Note that for convert intrins in AVX10.2, they will need more modifications due to the current tests did not in

[PATCH v1] RISC-V: Fix SAT_* dump check failure due to middle-end change.

2024-09-04 Thread pan2 . li
From: Pan Li Some middle-end changes may affect the number of .SAT_* occurrences. Thus, refine the dump check for SAT_* from scan-times to scan, as we only care whether the .SAT_* exists or not. There will be another PATCH to perform similar refinement, and this PATCH only fixes the failed test cases. gcc

Re: [PATCH] i386: Integrate BFmode for Enhanced Vectorization in ix86_preferred_simd_mode

2024-09-04 Thread Hongtao Liu
On Wed, Sep 4, 2024 at 9:32 AM Levy Hsu wrote: > > Hi > > This change adds BFmode support to the ix86_preferred_simd_mode function > enhancing SIMD vectorization for BF16 operations. The update ensures > optimized usage of SIMD capabilities improving performance and aligning > vector sizes with pr

Re: [PATCH] i386: Support partial signbit/xorsign/copysign/abs/neg/and/xor/ior/andn for V2BF/V4BF

2024-09-04 Thread Hongtao Liu
On Wed, Sep 4, 2024 at 10:53 AM Levy Hsu wrote: > > Hi > > This patch adds support for bf16 operations in V2BF and V4BF modes on i386, > handling signbit, xorsign, copysign, abs, neg, and various logical operations. > > Bootstrapped and tested on x86-64-pc-linux-gnu. > Ok for trunk? Ok. > > gcc/Ch

Re: [PATCH] i386: Support partial vectorized FMA for V2BF/V4BF

2024-09-04 Thread Hongtao Liu
On Wed, Sep 4, 2024 at 11:31 AM Levy Hsu wrote: > > Hi > > Bootstrapped and tested on x86-64-pc-linux-gnu. > Ok for trunk? Ok. > > This patch introduces support for vectorized FMA operations for bf16 types in > V2BF and V4BF modes on the i386 architecture. New mode iterators and > define_expand en

[PATCH] Handle const0_operand for *avx2_pcmp3_1.

2024-09-04 Thread liuhongt
*_eq3_1 supports nonimm_or_0_operand for op1 and op2, pass_combine would fail to lower avx512 comparison back to avx2 one when op1/op2 is const0_rtx. It's because the splitter only supports nonimmediate_operand. Failed to match this instruction: (set (reg/i:V16QI 20 xmm0) (vec_merge:V16QI (con

Re: [PATCH] MATCH: add abs support for half float

2024-09-04 Thread Kugan Vivekanandarajah
Thanks for the explanation. > On 2 Sep 2024, at 9:47 am, Andrew Pinski wrote: > > External email: Use caution opening links or attachments > > > On Sun, Sep 1, 2024 at 4:27 PM Kugan Vivekanandarajah > wrote: >> >> Hi Andrew. >> >>> On 28 Aug 2024, at 2:23 pm, Andrew Pinski wrote: >>> >>> Exter

Re: [PATCH 3/3] RISC-V: Constant synthesis of inverted halves

2024-09-04 Thread Jeff Law
On 9/2/24 2:01 PM, Raphael Moreira Zinsly wrote: Improve handling of constants where the high half can be constructed by inverting the lower half. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_build_integer): Detect constants where the higher half is the lower half inverted. g

Re: [PATCH 2/3] RISC-V: Additional large constant synthesis improvements

2024-09-04 Thread Jeff Law
On 9/2/24 2:01 PM, Raphael Moreira Zinsly wrote: Improve handling of large constants in riscv_build_integer, generate better code for constants where the high half can be constructed by shifting/shiftNadding the low half or if the halves differ by less than 2k. gcc/ChangeLog: * config

Re: [PATCH 1/3] RISC-V: Improve codegen for negative repeating large constants

2024-09-04 Thread Jeff Law
On 9/2/24 2:01 PM, Raphael Moreira Zinsly wrote: Improve handling of constants whose upper and lower 32-bit halves are the same and have negative values. e.g. for: unsigned long f (void) { return 0xf0f0f0f0f0f0f0f0UL; } Without the patch: li a0,-252645376 addi a0,a0,240 li
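
The condition this patch description refers to (a 64-bit constant whose two 32-bit halves are identical and negative) can be sketched in C as follows. The helper names `has_repeated_negative_halves` and `repeat_half` are hypothetical and are not taken from riscv.cc; this only illustrates the property, not how riscv_build_integer detects or synthesizes such constants.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not from riscv.cc): true when the two 32-bit halves
   of VALUE are identical and negative when read as signed 32-bit integers. */
bool has_repeated_negative_halves(uint64_t value)
{
    uint32_t lo = (uint32_t)value;
    uint32_t hi = (uint32_t)(value >> 32);
    return lo == hi && (lo & UINT32_C(0x80000000)) != 0;
}

/* Rebuild the full 64-bit constant from one 32-bit half. */
uint64_t repeat_half(uint32_t half)
{
    return ((uint64_t)half << 32) | half;
}

/* Check the helpers against the constant quoted in the message. */
void check_repeated_halves_example(void)
{
    assert(has_repeated_negative_halves(UINT64_C(0xf0f0f0f0f0f0f0f0)));
    assert(repeat_half(UINT32_C(0xf0f0f0f0)) == UINT64_C(0xf0f0f0f0f0f0f0f0));
}
```

The quoted `li a0,-252645376` followed by `addi a0,a0,240` yields -252645136, whose low 32 bits are 0xf0f0f0f0, i.e. one such negative half.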

RE: [PATCH v1 1/2] Genmatch: Support new flow for phi on condition

2024-09-04 Thread Li, Pan2
Thanks Richard for comments. > I also think we may want to split out this CFG matching code out into > a helper function > in gimple-match-head.cc instead of repeating it fully for each pattern? That makes sense to me, let me have a try in v2. Pan -Original Message- From: Richard Biener

Re: [PATCH][testsuite]: remove -fwrapv from signbit-5.c

2024-09-04 Thread Jeff Law
On 9/4/24 1:13 AM, Torbjorn SVENSSON wrote: On 2024-09-03 20:23, Richard Biener wrote: Am 03.09.2024 um 19:00 schrieb Tamar Christina : Hi All, The meaning of the testcase was changed by passing it -fwrapv.  The reason for the test failures on some platform was because the test was

Re: [RFC PATCH] RISC-V: Add support for LP64DV

2024-09-04 Thread Jeff Law
On 9/4/24 2:26 PM, Palmer Dabbelt wrote: Now that we've got the riscv_vector_cc attribute it's pretty much free to add a system-wide ABI -- at least in terms of implementation. So this just adds a new ABI command-line value that defaults to enabling the vector calling convention, essentially

Re: [to-be-committed][RISC-V] Avoid unnecessary extensions after sCC insns

2024-09-04 Thread Jeff Law
On 9/4/24 4:11 PM, Palmer Dabbelt wrote: On Wed, 04 Sep 2024 13:47:58 PDT (-0700), jeffreya...@gmail.com wrote: So I was looking at a performance regression in spec with Ventana's internal tree. Ultimately the problem was a bad interaction with an internal patch (REP_MODE_EXTENDED), fwprop

Re: [PATCH] RISC-V: Make the setCC/REE tests robust to instruction selection

2024-09-04 Thread Jeff Law
On 9/4/24 4:07 PM, Palmer Dabbelt wrote: These tests were checking that the output of the setCC instruction was bit flipped, but it looks like they're really designed to test that redundant sign extension elimination fires on conditionals from function inputs. Jeff just posted a patch to clean

Re: [to-be-committed][RISC-V] Avoid unnecessary extensions after sCC insns

2024-09-04 Thread Palmer Dabbelt
On Wed, 04 Sep 2024 13:47:58 PDT (-0700), jeffreya...@gmail.com wrote: > > So I was looking at a performance regression in spec with Ventana's > internal tree. Ultimately the problem was a bad interaction with an > internal patch (REP_MODE_EXTENDED), fwprop and ext-dce. The details of > that prob

[PATCH] RISC-V: Make the setCC/REE tests robust to instruction selection

2024-09-04 Thread Palmer Dabbelt
These tests were checking that the output of the setCC instruction was bit flipped, but it looks like they're really designed to test that redundant sign extension elimination fires on conditionals from function inputs. Jeff just posted a patch to clean this code up which trips up on the arbitrary x

Re: [PATCH] aarch64: Handle attributes in the global namespace for aarch64_lookup_shared_state_flags [PR116598]

2024-09-04 Thread Andrew Pinski
On Wed, Sep 4, 2024 at 2:36 PM Marek Polacek wrote: > > On Wed, Sep 04, 2024 at 02:05:21PM -0700, Andrew Pinski wrote: > > The code in aarch64_lookup_shared_state_flags assumed all C++11 attributes on the > > function type > > had a namespace associated with them. But with the addition of > > reproducib

Re: [PATCH] aarch64: Handle attributes in the global namespace for aarch64_lookup_shared_state_flags [PR116598]

2024-09-04 Thread Marek Polacek
On Wed, Sep 04, 2024 at 02:05:21PM -0700, Andrew Pinski wrote: > The code in aarch64_lookup_shared_state_flags assumed all C++11 attributes on the > function type > had a namespace associated with them. But with the addition of > reproducible/unsequenced, > this was no longer true. > This is the simple f

Re: [RFC PATCH] RISC-V: Add support for LP64DV

2024-09-04 Thread Palmer Dabbelt
On Wed, 04 Sep 2024 13:26:11 PDT (-0700), Palmer Dabbelt wrote: Now that we've got the riscv_vector_cc attribute it's pretty much free to add a system-wide ABI -- at least in terms of implementation. So this just adds a new ABI command-line value that defaults to enabling the vector calling conv

PING [PATCH v2] c++: Fix constrained auto deduction templ parms resolution [PR114915, PR115030]

2024-09-04 Thread Seyed Sajad Kahani
I'm gently pinging about the patch I submitted: https://gcc.gnu.org/pipermail/gcc-patches/2024-August/660177.html This patch was created in response to Jason's comments here: https://gcc.gnu.org/pipermail/gcc-patches/2024-July/657739.html I appreciate your time and consideration. Thank you.

[PATCH] aarch64: Handle attributes in the global namespace for aarch64_lookup_shared_state_flags [PR116598]

2024-09-04 Thread Andrew Pinski
The code in aarch64_lookup_shared_state_flags assumed all C++11 attributes on the function type had a namespace associated with them. But with the addition of reproducible/unsequenced, this was no longer true. This is the simple fix to ignore attributes in the global namespace since we are looking for o

[to-be-committed][RISC-V] Avoid unnecessary extensions after sCC insns

2024-09-04 Thread Jeff Law
So I was looking at a performance regression in spec with Ventana's internal tree. Ultimately the problem was a bad interaction with an internal patch (REP_MODE_EXTENDED), fwprop and ext-dce. The details of that problem aren't particularly important. Removal of the local patch went reason

Re: [PATCH v2] testsuite: introduce hostedlib effective target

2024-09-04 Thread Mike Stump
On Sep 3, 2024, at 11:44 PM, Alexandre Oliva wrote: > > On Nov 9, 2023, Mike Stump wrote: > >> On Nov 8, 2023, at 8:29 AM, Alexandre Oliva wrote: >>> >>> On Nov 5, 2023, Mike Stump wrote: >>> that, otherwise, I'll approve this version. >>> >>> FWIW, this version is not usable as is.

[committed, gcc-14] libstdc++: Fix std::variant to reject array types [PR116381]

2024-09-04 Thread Jonathan Wakely
Tested x86_64-linux. Pushed to gcc-14. -- >8 -- For the backport, rejecting array types is only done in strict modes. libstdc++-v3/ChangeLog: PR libstdc++/116381 * include/std/variant (variant): Fix conditions for static_assert to match the spec. * testsuite/20_u

Re: [PATCH] c++, v2: Fix get_member_function_from_ptrfunc with -fsanitize=bounds [PR116449]

2024-09-04 Thread Franz Sirl
Am 2024-09-04 um 19:12 schrieb Jakub Jelinek: On Wed, Sep 04, 2024 at 12:34:04PM -0400, Jason Merrill wrote: So, one possibility would be to call save_expr unconditionally in get_member_function_from_ptrfunc as well. Or build a TARGET_EXPR (force_target_expr or similar). Yes. I don't have a

[RFC PATCH] RISC-V: Add support for LP64DV

2024-09-04 Thread Palmer Dabbelt
Now that we've got the riscv_vector_cc attribute it's pretty much free to add a system-wide ABI -- at least in terms of implementation. So this just adds a new ABI command-line value that defaults to enabling the vector calling convention, essentially the same as scattering the attribute on every

Ping^2: [PATCH] warn-access: ignore template parameters when matching operator new/delete [PR109224]

2024-09-04 Thread Arsen Arsenović
Evening, Arsen Arsenović writes: > [[PGP Signed Part:Good signature from 52C294301EA2C493 Arsen Arsenović > (trust ultimate) created at 2024-08-28T23:00:44+0200 > using EDDSA]] > Hi, > > Arsen Arsenović writes: > >>> The && should not be left of the =; if the initializer needs to span >>> m

RE: [PATCH] i386: Add _MM_FROUND_TO_NEAREST_TIES_EVEN to smmintrin.h

2024-09-04 Thread Paul Caprioli
Hi, I'm writing to ask that someone with write access to the git repo apply this patch, which provides the macro definition `_MM_FROUND_TO_NEAREST_TIES_EVEN`. Intrinsics such as `_mm512_add_round_ps` take a rounding mode argument to specify the floating point rounding mode. This and si

Re: [PATCH] c++, coroutines: Instrument missing return_void UB.

2024-09-04 Thread Iain Sandoe
> On 4 Sep 2024, at 17:21, Jason Merrill wrote: > > On 9/1/24 12:17 PM, Iain Sandoe wrote: >> This came up in discussion of an earlier patch. >> I'm in two minds as to whether it's a good idea or not - the underlying >> issue being that libubsan does not yet (AFAICT) have the concept of a >> c

Re: [PATCH] c++: ICE with TTP [PR96097]

2024-09-04 Thread Marek Polacek
On Wed, Sep 04, 2024 at 10:58:25AM -0400, Jason Merrill wrote: > On 9/3/24 6:12 PM, Marek Polacek wrote: > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/14? > > The change to return bool seems like unrelated cleanup; please push that > separately on trunk only. Done. > > + /

Re: [PATCH v3 0/5] aarch64: Fix intrinsic availability [PR112108]

2024-09-04 Thread Andrew Carlotti
On Mon, Aug 19, 2024 at 03:52:58PM +0100, Andrew Carlotti wrote: > On Fri, Aug 16, 2024 at 07:17:24AM +, Kyrylo Tkachov wrote: > > > > > > > On 15 Aug 2024, at 18:48, Andrew Carlotti wrote: > > > > > > External email: Use caution opening links or attachments > > > > > > > > > On Thu, Aug

[committed][RISC-V] Fix scan test output after recent path-splitting changes

2024-09-04 Thread Jeff Law
The recent path splitting changes from Andrew result in identifying more saturation idioms instead of just identifying an overflow check. As a result many of the tests in the RISC-V port started failing a scan check on the .expand output. As expected, identifying a saturation idiom is more

[PATCH] c++, v3: Partially implement CWG 2867 - Order of initialization for structured bindings [PR115769]

2024-09-04 Thread Jakub Jelinek
On Wed, Sep 04, 2024 at 01:22:47PM -0400, Jason Merrill wrote: > > @@ -8985,6 +9003,13 @@ cp_finish_decl (tree decl, tree init, bo > > if (var_definition_p) > > abstract_virtuals_error (decl, type); > > + if (decomp && !processing_template_decl) > > + { > > + need_decomp_init

[pushed] c++: cleanup coerce_template_template_parm

2024-09-04 Thread Marek Polacek
Split out from https://gcc.gnu.org/pipermail/gcc-patches/2024-September/662261.html which was tested on x86_64-pc-linux-gnu. I'm checking this in. -- >8 -- This function could use some sprucing up. gcc/cp/ChangeLog: * pt.cc (coerce_template_template_parm): Return bool instead of int. --

Re: [PATCH] c++, v2: Partially implement CWG 2867 - Order of initialization for structured bindings [PR115769]

2024-09-04 Thread Jason Merrill
On 8/30/24 1:37 PM, Jakub Jelinek wrote: On Wed, Aug 21, 2024 at 02:08:16PM -0400, Jason Merrill wrote: I was concerned about the use of a single boolean to guard the destruction of multiple objects, suspecting that it would break in obscure EH cases. When I finally managed to construct a testca

[PATCH v2] c++: fn redecl in fn scope wrongly accepted [PR116239]

2024-09-04 Thread Marek Polacek
On Wed, Sep 04, 2024 at 12:28:49PM -0400, Jason Merrill wrote: > On 8/30/24 3:40 PM, Marek Polacek wrote: > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? > > > > -- >8 -- > > Redeclaration such as > > > >void f(void); > >consteval void f(void); > > > > is invalid. In a

[PATCH] c++, v2: Fix get_member_function_from_ptrfunc with -fsanitize=bounds [PR116449]

2024-09-04 Thread Jakub Jelinek
On Wed, Sep 04, 2024 at 12:34:04PM -0400, Jason Merrill wrote: > > So, one possibility would be to call save_expr unconditionally in > > get_member_function_from_ptrfunc as well. > > > > Or build a TARGET_EXPR (force_target_expr or similar). > > Yes. I don't have a strong preference between the

Re: [to-be-committed] [RISC-V][PR target/115921] Improve reassociation for rv64

2024-09-04 Thread Jeff Law
On 9/4/24 8:08 AM, Xi Ruoyao wrote: Hi Jeff, On Mon, 2024-09-02 at 12:53 -0600, Jeff Law wrote:  (define_insn_and_split "_shift_reverse"    [(set (match_operand:X 0 "register_operand" "=r") (any_bitwise:X (ashift:X (match_operand:X 1 "register_operand" "r") @@ -2934,9 +2936,9 @@ (def

Re: [PATCH] c++: Fix get_member_function_from_ptrfunc with -fsanitize=bounds [PR116449]

2024-09-04 Thread Jason Merrill
On 9/4/24 11:15 AM, Jakub Jelinek wrote: On Wed, Sep 04, 2024 at 11:06:22AM -0400, Jason Merrill wrote: On 9/2/24 1:49 PM, Jakub Jelinek wrote: Hi! The following testcase is miscompiled, because get_member_function_from_ptrfunc emits something like (((FUNCTION.__pfn & 1) != 0) ? ptr + FUNCT

[PATCH] libstdc++: hashing support for chrono value classes (P2592R2)

2024-09-04 Thread Giuseppe D'Angelo
Hello, The attached patch implements P2592, adding std::hash specializations for std::chrono classes. One aspect I'm quite unhappy with is the hash combiner I've used. I'm not sure if there's some longer-term goal for libstdc++ here -- would you prefer to roll something à la Boost.HashCombin

Re: [PATCH v1 6/9] aarch64: Use symbols without offset to prevent relocation issues

2024-09-04 Thread Martin Storsjö
On Wed, 4 Sep 2024, Evgeny Karpov wrote: Monday, September 4, 2024 Martin Storsjö wrote: compilation time adrp x0, symbol + 256 9000 adrp x0, 0 As the symbol offset is 256, you will need to encode the offset "256" in the instruction immediate field. Not "256 >> 12". This is the somewhat

Re: [PATCH] c++: fn redecl in fn scope wrongly accepted [PR116239]

2024-09-04 Thread Jason Merrill
On 8/30/24 3:40 PM, Marek Polacek wrote: Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? -- >8 -- Redeclaration such as void f(void); consteval void f(void); is invalid. In a namespace scope, we detect the collision in validate_constexpr_redeclaration, but not when one decl

Re: [PATCH] c++, coroutines: Revise promise construction/destruction.

2024-09-04 Thread Jason Merrill
On 8/31/24 12:37 PM, Iain Sandoe wrote: tested on x86_64-darwin/linux powerpc64le-linux, OK for trunk? alternate suggestions? thanks, Iain --- 8< --- In examining the coroutine testcases for unexpected diagnostic output for 'Wall', I found a 'statement has no effect' warning for the promise con

Re: [PATCH] c++, coroutines: Instrument missing return_void UB.

2024-09-04 Thread Jason Merrill
On 9/1/24 12:17 PM, Iain Sandoe wrote: This came up in discussion of an earlier patch. I'm in two minds as to whether it's a good idea or not - the underlying issue being that libubsan does not yet (AFAICT) have the concept of a coroutine, so that the diagnostics are not very specific and might

Re: [PATCH] c++: Fix overeager Woverloaded-virtual with conversion operators [PR109918]

2024-09-04 Thread Jason Merrill
On 9/1/24 2:51 PM, Simon Martin wrote: Hi Jason, On 26 Aug 2024, at 19:23, Jason Merrill wrote: On 8/25/24 12:37 PM, Simon Martin wrote: On 24 Aug 2024, at 23:59, Simon Martin wrote: On 24 Aug 2024, at 15:13, Jason Merrill wrote: On 8/23/24 12:44 PM, Simon Martin wrote: We currently emit

Re: [PATCH RFC] c-family: add attribute flag_enum [PR46457]

2024-09-04 Thread Eric Gallager
On Wed, Sep 4, 2024 at 8:18 AM Jason Merrill wrote: > > Tested x86_64-pc-linux-gnu. Any objections? > > -- 8< -- > > Several PRs complain about -Wswitch warning about a case for a bitwise > combination of enumerators. Clang has an attribute flag_enum to prevent > this; let's adopt that approach
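
The -Wswitch complaint mentioned in this message can be illustrated with a small C sketch. The enum `perm`, the macro `FLAG_ENUM`, and the function `perm_class` are hypothetical names made up for this example. The attribute is applied only when the compiler reports support for it through `__has_attribute`, since GCC support is what the RFC proposes rather than something assumed here.

```c
#include <assert.h>

/* Use flag_enum only where the compiler reports support for it. */
#if defined(__has_attribute)
#  if __has_attribute(flag_enum)
#    define FLAG_ENUM __attribute__((flag_enum))
#  endif
#endif
#ifndef FLAG_ENUM
#  define FLAG_ENUM
#endif

/* Hypothetical flag-style enumeration: each enumerator is a single bit. */
enum FLAG_ENUM perm {
    PERM_READ = 1,
    PERM_WRITE = 2,
    PERM_EXEC = 4
};

/* Returns 2 for read+write, 1 for a single flag, 0 otherwise. */
int perm_class(enum perm p)
{
    switch (p) {
    case PERM_READ | PERM_WRITE: /* value 3 names no single enumerator */
        return 2;
    case PERM_READ:
    case PERM_WRITE:
    case PERM_EXEC:
        return 1;
    default:
        return 0;
    }
}

/* Small usage check for the combined and single-flag cases. */
void check_perm_class_example(void)
{
    assert(perm_class(PERM_READ | PERM_WRITE) == 2);
    assert(perm_class(PERM_EXEC) == 1);
}
```

Without the attribute, the combined case label is the kind of value that -Wswitch may report as not being in the enumerated type.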

Re: [PATCH v1 6/9] aarch64: Use symbols without offset to prevent relocation issues

2024-09-04 Thread Martin Storsjö
On Wed, 4 Sep 2024, Martin Storsjö wrote: On Wed, 4 Sep 2024, Evgeny Karpov wrote: Monday, September 4, 2024 Martin Storsjö wrote: Let's consider the following example, when symbol is located at 3072. 1. Example without the fix compilation time adrp        x0, (3072 + 256) & ~0xFFF // x0 =

Re: [PATCH] c++: Fix get_member_function_from_ptrfunc with -fsanitize=bounds [PR116449]

2024-09-04 Thread Jakub Jelinek
On Wed, Sep 04, 2024 at 11:06:22AM -0400, Jason Merrill wrote: > On 9/2/24 1:49 PM, Jakub Jelinek wrote: > > Hi! > > > > The following testcase is miscompiled, because > > get_member_function_from_ptrfunc > > emits something like > > (((FUNCTION.__pfn & 1) != 0) > > ? ptr + FUNCTION.__delta + FU

Re: Handle 'NUM' in 'PUSH_INSERT_PASSES_WITHIN' (was: [PATCH 03/11] Handwritten part of conversion of passes to C++ classes)

2024-09-04 Thread David Malcolm
On Fri, 2024-06-28 at 15:06 +0200, Thomas Schwinge wrote: > Hi! > > As part of this: > > On 2013-07-26T11:04:33-0400, David Malcolm > wrote: > > This patch is the hand-written part of the conversion of passes > > from > > C structs to C++ classes. > > > --- a/gcc/passes.c > > +++ b/gcc/passes.c

Re: [PATCH] c++: Add missing auto_diagnostic_groups

2024-09-04 Thread Jason Merrill
On 9/2/24 7:43 AM, Nathaniel Shead wrote: Ping for https://gcc.gnu.org/pipermail/gcc-patches/2024-August/659796.html OK. For clarity's sake, here's the full patch with the adjustment I mentioned earlier: -- >8 -- This patch goes through all .cc files in gcc/cp and adds in any auto_diagnosti

Re: [PATCH v1 6/9] aarch64: Use symbols without offset to prevent relocation issues

2024-09-04 Thread Martin Storsjö
On Wed, 4 Sep 2024, Evgeny Karpov wrote: Monday, September 4, 2024 Martin Storsjö wrote: Let's consider the following example, when symbol is located at 3072. 1. Example without the fix compilation time adrp        x0, (3072 + 256) & ~0xFFF // x0 = 0 add         x0, x0, (3072 + 256) & 0xFFF

Re: [PATCH] c++: Fix get_member_function_from_ptrfunc with -fsanitize=bounds [PR116449]

2024-09-04 Thread Jason Merrill
On 9/2/24 1:49 PM, Jakub Jelinek wrote: Hi! The following testcase is miscompiled, because get_member_function_from_ptrfunc emits something like (((FUNCTION.__pfn & 1) != 0) ? ptr + FUNCTION.__delta + FUNCTION.__pfn - 1 : FUNCTION.__pfn) (ptr + FUNCTION.__delta, ...) or so, so FUNCTION tree

Re: [PATCH RFC] c-family: add attribute flag_enum [PR46457]

2024-09-04 Thread Marek Polacek
On Wed, Sep 04, 2024 at 08:15:25AM -0400, Jason Merrill wrote: > Tested x86_64-pc-linux-gnu. Any objections? Looks good except... > +/* Attributes also recognized in the clang:: namespace. */ > +const struct attribute_spec c_common_clang_attributes[] = { > + { "flag_enum", 0, 0, fal

[pushed] c++: add a testcase for [PR 108620]

2024-09-04 Thread Arsen Arsenović
Pushed as obvious. -- >8 -- Fixed by r15-2540-g32e678b2ed7521. Add a testcase, as the original ones do not cover this particular failure mode. gcc/testsuite/ChangeLog: PR c++/108620 * g++.dg/coroutines/pr108620.C: New test. --- gcc/testsuite/g++.dg/coroutines/pr1

Re: [PATCH] c++: noexcept and pointer to member function type [PR113108]

2024-09-04 Thread Jason Merrill
On 9/3/24 2:47 PM, Marek Polacek wrote: Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/14? OK. -- >8 -- We ICE in nothrow_spec_p because it got a DEFERRED_NOEXCEPT. This DEFERRED_NOEXCEPT was created in implicitly_declare_fn when declaring Foo& operator=(Foo&&) = default; in

Re: [PATCH] c++: ICE with TTP [PR96097]

2024-09-04 Thread Jason Merrill
On 9/3/24 6:12 PM, Marek Polacek wrote: Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/14? The change to return bool seems like unrelated cleanup; please push that separately on trunk only. + /* We can also have: + + template typename X> + void

Re: Zen5 tuning part 2: disable gather and scatter

2024-09-04 Thread Toon Moene
On 9/4/24 12:55, Jan Hubicka wrote: On 9/3/24 15:07, Jan Hubicka wrote: Hi, We disable gathers for zen4. It seems that gather has improved a bit compared to zen4 and Zen5 optimization manual suggests "Avoid GATHER instructions when the indices are known ahead of time. Vector loads followed by

Re: [PATCH] RISC-V: Handle unused-only-live stmts in SLP discovery

2024-09-04 Thread Palmer Dabbelt
On Wed, 04 Sep 2024 04:10:52 PDT (-0700), rguent...@suse.de wrote: The following adds SLP discovery for roots that are only live but otherwise unused. These are usually inductions. This allows a few more testcases to be handled fully with SLP, for example gcc.dg/vect/no-scevccp-pr86725-1.c Boo

[PATCH v1 6/9] aarch64: Use symbols without offset to prevent relocation issues

2024-09-04 Thread Evgeny Karpov
Monday, September 4, 2024 Martin Storsjö wrote: >> Let's consider the following example, when symbol is located at 3072. >> >> 1. Example without the fix >> compilation time >> adrp        x0, (3072 + 256) & ~0xFFF // x0 = 0 >> add         x0, x0, (3072 + 256) & 0xFFF // x0 = 3328 >> >> linking t
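
The arithmetic in the quoted example (symbol at 3072, offset 256) can be reproduced with a minimal C sketch. The function names `adrp_page` and `add_lo12` are hypothetical, and the sketch deliberately ignores that a real adrp is PC-relative; it only shows how an address splits into a 4 KiB page part and a low 12-bit part.

```c
#include <assert.h>
#include <stdint.h>

/* Page part, as in "(addr) & ~0xFFF": the address with its low 12 bits cleared. */
uint64_t adrp_page(uint64_t addr)
{
    return addr & ~UINT64_C(0xFFF);
}

/* Low 12 bits, as in "(addr) & 0xFFF": the part supplied by the following add. */
uint64_t add_lo12(uint64_t addr)
{
    return addr & UINT64_C(0xFFF);
}

/* Reproduce the numbers from the quoted example: symbol 3072, offset 256. */
void check_adrp_example(void)
{
    uint64_t target = 3072 + 256;
    assert(adrp_page(target) == 0);
    assert(adrp_page(target) + add_lo12(target) == 3328);
}
```

The two parts always sum back to the original address, which is why both relocations need to see the same symbol and offset.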

[PATCH] RISC-V Handle non-grouped stores as single-lane SLP

2024-09-04 Thread Richard Biener
The following enables single-lane loop SLP discovery for non-grouped stores and adjusts vectorizable_store to properly handle those. For gfortran.dg/vect/vect-8.f90 we vectorize one additional loop, not running into the "not falling back to strided accesses" bail-out. I have not investigated in

Re: [to-be-committed] [RISC-V][PR target/115921] Improve reassociation for rv64

2024-09-04 Thread Xi Ruoyao
Hi Jeff, On Mon, 2024-09-02 at 12:53 -0600, Jeff Law wrote: >  (define_insn_and_split "_shift_reverse" >    [(set (match_operand:X 0 "register_operand" "=r") > (any_bitwise:X (ashift:X (match_operand:X 1 "register_operand" "r") > @@ -2934,9 +2936,9 @@ (define_insn_and_split "_shift_reverse" >

[PATCH] Use dg-additional-options for gfortran.dg/vect/vect-8.f90 and RISC-V

2024-09-04 Thread Richard Biener
r14-9122-g67a29f99cc8138 disabled scheduling on a lot of testcases for RISC-V for PR113249 but using dg-options. This makes gfortran.dg/vect/vect-8.f90 UNRESOLVED as it relies on default flags to enable vectorization. The following uses dg-additional-options instead. Tested on riscv64-linux with

[Bug tree-optimization/109429] [PATCH] ivopts: fixed complexities

2024-09-04 Thread Aleksandar Rakic
From 0130d3cb01fd9d5c1c997003245ed57bbdeb00a2 Mon Sep 17 00:00:00 2001 From: Aleksandar Date: Fri, 23 Aug 2024 11:36:50 +0200 Subject: [PATCH] [Bug tree-optimization/109429] ivopts: fixed complexities This patch addresses a bug introduced in commit f9f69dd by correcting the complexity calculatio

[PATCH v2 35/36] arm: [MVE intrinsics] rework vsbcq vsbciq

2024-09-04 Thread Christophe Lyon
Implement vsbcq vsbciq using the new MVE builtins framework. We re-use most of the code introduced by the previous patches. 2024-08-28 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-base.cc (class vadc_vsbc_impl): Add support for vsbciq and vsbcq. (vadciq,

[PATCH v2 32/36] arm: [MVE intrinsics] factorize vadc vadci vsbc vsbci

2024-09-04 Thread Christophe Lyon
Factorize vadc/vsbc and vadci/vsbci so that they use the same parameterized names. 2024-08-28 Christophe Lyon gcc/ * config/arm/iterators.md (mve_insn): Add VADCIQ_M_S, VADCIQ_M_U, VADCIQ_U, VADCIQ_S, VADCQ_M_S, VADCQ_M_U, VADCQ_S, VADCQ_U, VSBCIQ_M_S, VSBCIQ_M_

[PATCH v2 34/36] arm: [MVE intrinsics] rework vadcq

2024-09-04 Thread Christophe Lyon
Implement vadcq using the new MVE builtins framework. We re-use most of the code introduced by the previous patch to support vadciq: we just need to initialize carry from the input parameter. 2024-08-28 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-base.cc (vadcq_vsbc):

[PATCH v2 30/36] arm: [MVE intrinsics] remove vshlcq useless expanders

2024-09-04 Thread Christophe Lyon
Since we rewrote the implementation of vshlcq intrinsics, we no longer need these expanders. 2024-08-28 Christophe Lyon gcc/ * config/arm/arm-builtins.cc (arm_ternop_unone_none_unone_imm_qualifiers) (-arm_ternop_none_none_unone_imm_qualifiers): Delete. *

[PATCH v2 29/36] arm: [MVE intrinsics] rework vshlcq

2024-09-04 Thread Christophe Lyon
Implement vshlc using the new MVE builtins framework. 2024-08-28 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-base.cc (class vshlc_impl): New. (vshlc): New. * config/arm/arm-mve-builtins-base.def (vshlcq): New. * config/arm/arm-mve-builtins-base.h

[PATCH v2 25/36] arm: [MVE intrinsics] rework vdwdup viwdup

2024-09-04 Thread Christophe Lyon
Implement vdwdup and viwdup using the new MVE builtins framework. In order to share more code with viddup_impl, the patch swaps operands 1 and 2 in @mve_v[id]wdupq_m_wb_u_insn, so that the parameter order is similar to what @mve_v[id]dupq_m_wb_u_insn uses. 2024-08-28 Christophe Lyon g

[PATCH v2 28/36] arm: [MVE intrinsics] add vshlc shape

2024-09-04 Thread Christophe Lyon
This patch adds the vshlc shape description. 2024-08-28 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-shapes.cc (vshlc): New. * config/arm/arm-mve-builtins-shapes.h (vshlc): New. --- gcc/config/arm/arm-mve-builtins-shapes.cc | 44 +++ gcc/confi

[PATCH v2 26/36] arm: [MVE intrinsics] update v[id]wdup tests

2024-09-04 Thread Christophe Lyon
Testing v[id]wdup overloads with '1' as argument for uint32_t* does not make sense: this patch adds a new 'uint32_t *a' parameter to foo2 in such tests. The difference with v[id]dup tests (where we removed 'foo2') is that in 'foo1' we test the overload with a variable 'wrap' parameter (b) and we n

[PATCH v2 23/36] arm: [MVE intrinsics] factorize vdwdup viwdup

2024-09-04 Thread Christophe Lyon
Factorize vdwdup and viwdup so that they use the same parameterized names. Like with vddup and vidup, we do not bother with the corresponding expanders, as we stop using them in a subsequent patch. The patch also adds the missing attributes to vdwdupq_wb_u_insn and viwdupq_wb_u_insn patterns. 20

[PATCH v2 36/36] arm: [MVE intrinsics] use long_type_suffix / half_type_suffix helpers

2024-09-04 Thread Christophe Lyon
In several places we are looking for a type twice or half as large as the type suffix: this patch introduces helper functions to avoid code duplication. long_type_suffix is similar to the SVE counterpart, but adds an 'expected_tclass' parameter. half_type_suffix is similar to it, but does not exis

[PATCH v2 24/36] arm: [MVE intrinsics] add vidwdup shape

2024-09-04 Thread Christophe Lyon
This patch adds the vidwdup shape description for vdwdup and viwdup. It is very similar to viddup, but accounts for the additional 'wrap' scalar parameter. 2024-08-21 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-shapes.cc (vidwdup): New. * config/arm/arm-mve-buil

[PATCH v2 31/36] arm: [MVE intrinsics] add vadc_vsbc shape

2024-09-04 Thread Christophe Lyon
This patch adds the vadc_vsbc shape description. 2024-08-28 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-shapes.cc (vadc_vsbc): New. * config/arm/arm-mve-builtins-shapes.h (vadc_vsbc): New. --- gcc/config/arm/arm-mve-builtins-shapes.cc | 36 ++

[PATCH v2 33/36] arm: [MVE intrinsics] rework vadciq

2024-09-04 Thread Christophe Lyon
Implement vadciq using the new MVE builtins framework. 2024-08-28 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-base.cc (class vadc_vsbc_impl): New. (vadciq): New. * config/arm/arm-mve-builtins-base.def (vadciq): New. * config/arm/arm-mve-builtins-b

[PATCH v2 18/36] arm: [MVE intrinsics] add viddup shape

2024-09-04 Thread Christophe Lyon
This patch adds the viddup shape description for vidup and vddup. This requires the addition of report_not_one_of and function_checker::require_immediate_one_of to gcc/config/arm/arm-mve-builtins.cc (they are copies of the aarch64 SVE counterparts). This patch also introduces MODE_wb. 2024-08-21

[PATCH v2 27/36] arm: [MVE intrinsics] remove useless v[id]wdup expanders

2024-09-04 Thread Christophe Lyon
Like with vddup/vidup, we use code_for_mve_q_wb_u_insn, so we can drop the expanders and their declarations as builtins, now useless. 2024-08-28 Christophe Lyon gcc/ * config/arm/arm-builtins.cc (arm_quinop_unone_unone_unone_unone_imm_pred_qualifiers): Delete. *

[PATCH v2 17/36] arm: [MVE intrinsics] factorize vddup vidup

2024-09-04 Thread Christophe Lyon
Factorize vddup and vidup so that they use the same parameterized names. This patch updates only the (define_insn "@mve_q_u_insn") patterns and does not bother with the (define_expand "mve_vidupq_n_u") ones, because a subsequent patch avoids using them. 2024-08-21 Christophe Lyon gcc/

[PATCH v2 12/36] arm: [MVE intrinsics] rework vcvtaq vcvtmq vcvtnq vcvtpq

2024-09-04 Thread Christophe Lyon
Implement vcvtaq vcvtmq vcvtnq vcvtpq using the new MVE builtins framework. 2024-07-11 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-base.cc (vcvtaq): New. (vcvtmq): New. (vcvtnq): New. (vcvtpq): New. * config/arm/arm-mve-builtins-base.def (

[PATCH v2 19/36] arm: [MVE intrinsics] rework vddup vidup

2024-09-04 Thread Christophe Lyon
Implement vddup and vidup using the new MVE builtins framework. We generate better code because we take advantage of the two outputs produced by the v[id]dup instructions. For instance, before: ldr r3, [r0] sub r2, r3, #8 str r2, [r0] mov r2, r3

Re: [PATCH v1 6/9] aarch64: Use symbols without offset to prevent relocation issues

2024-09-04 Thread Martin Storsjö
On Wed, 4 Sep 2024, Evgeny Karpov wrote: Monday, September 2, 2024 Martin Storsjö wrote: The only non-obvious thing, is that for IMAGE_REL_ARM64_PAGEBASE_REL21, i.e. "adrp" instructions, the immediate that gets stored in the instruction, is the byte offset to the symbol. After linking, when

[PATCH v2 21/36] arm: [MVE intrinsics] remove v[id]dup expanders

2024-09-04 Thread Christophe Lyon
We use code_for_mve_q_u_insn, rather than the expanders used by the previous implementation, so we can remove the expanders and their declaration as builtins. 2024-08-21 Christophe Lyon gcc/ * config/arm/arm_mve_builtins.def (vddupq_n_u, vidupq_n_u) (vddupq_m_n_u, vidup

[PATCH v2 16/36] arm: [MVE intrinsics] rework vctp

2024-09-04 Thread Christophe Lyon
Implement vctp using the new MVE builtins framework. 2024-08-21 Christophe Lyon gcc/ChangeLog: * config/arm/arm-mve-builtins-base.cc (class vctpq_impl): New. (vctp16q): New. (vctp32q): New. (vctp64q): New. (vctp8q): New. * config/arm/arm-mve-bui

[PATCH v2 22/36] arm: [MVE intrinsics] fix checks of immediate arguments

2024-09-04 Thread Christophe Lyon
As discussed in [1], it is better to use "su64" for immediates in intrinsics signatures in order to provide better diagnostics (erroneous constants are not truncated for instance). This patch thus uses su64 instead of ss32 in binary_lshift_unsigned, binary_rshift_narrow, binary_rshift_narrow_unsig

[PATCH v2 03/36] arm: [MVE intrinsics] Cleanup arm-mve-builtins-functions.h

2024-09-04 Thread Christophe Lyon
This patch brings no functional change but removes some code duplication in arm-mve-builtins-functions.h and makes it easier to read and maintain. It introduces a new expand_unspec () member of unspec_based_mve_function_base and makes a few classes inherit from it instead of function_base. This a

[PATCH v2 15/36] arm: [MVE intrinsics] rework vorn

2024-09-04 Thread Christophe Lyon
Implement vorn using the new MVE builtins framework. 2024-07-11 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-base.cc (vornq): New. * config/arm/arm-mve-builtins-base.def (vornq): New. * config/arm/arm-mve-builtins-base.h (vornq): New. * config/arm/

[PATCH v2 20/36] arm: [MVE intrinsics] update v[id]dup tests

2024-09-04 Thread Christophe Lyon
Testing v[id]dup overloads with '1' as argument for uint32_t* does not make sense: instead of choosing the '_wb' overload, we choose the '_n', but we already do that in the '_n' tests. This patch removes all such bogus foo2 functions. 2024-08-28 Christophe Lyon gcc/testsuite/

[PATCH v2 13/36] arm: [MVE intrinsics] rework vbicq

2024-09-04 Thread Christophe Lyon
Implement vbicq using the new MVE builtins framework. 2024-07-11 Christophe Lyon gcc/ * config/arm/arm-mve-builtins-base.cc (vbicq): New. * config/arm/arm-mve-builtins-base.def (vbicq): New. * config/arm/arm-mve-builtins-base.h (vbicq): New. * config/arm
