Re: [PATCH] x86: Skip if the mode size is smaller than its natural size

2025-05-06 Thread H.J. Lu
On Tue, May 6, 2025 at 2:30 PM Liu, Hongtao wrote: > > > > > -Original Message- > > From: H.J. Lu > > Sent: Tuesday, May 6, 2025 2:16 PM > > To: Liu, Hongtao > > Cc: GCC Patches ; Uros Bizjak > > > > Subject: Re: [PATCH] x86: Skip if the mode size is smaller than its natural > > size >

Re: [PATCH v5 05/10] libstdc++: Implement layout_left from mdspan.

2025-05-06 Thread Tomasz Kaminski
On Mon, May 5, 2025 at 9:20 PM Luc Grosheintz wrote: > > > On 5/5/25 9:44 AM, Tomasz Kaminski wrote: > > On Sat, May 3, 2025 at 2:39 PM Luc Grosheintz > > wrote: > > > >> > >> > >> On 4/30/25 7:13 AM, Tomasz Kaminski wrote: > >>> Hi, > >>> > >>> As we will be landing patches for extends, this wi

Re: [PATCH 2/3] x86: Add a pass to fold tail call

2025-05-06 Thread H.J. Lu
On Mon, May 5, 2025 at 9:56 PM Andi Kleen wrote: > > On Mon, May 05, 2025 at 06:20:40AM -0700, Andi Kleen wrote: > > > If the branch edge destination is a basic block with only a direct > > > sibcall, change the jcc target to the sibcall target, decrement the > > > destination basic block entry la

Re: [patch, fortram] Bug 120049 - ICE when using IS_C_ASSOCIATED ()

2025-05-06 Thread Paul Richard Thomas
Hi Jerry, The patch looks good to me. OK for mainline and for backporting. I never quite know what to suggest for delaying backporting and so I will leave it to your judgement. Thanks for the patch. Paul On Tue, 6 May 2025 at 04:30, Jerry D wrote: > Attached patch fixes this by checking for

Re: [PATCH v3 2/6] dwarf: create annotation DIEs for btf tags

2025-05-06 Thread Richard Biener
On Mon, May 5, 2025 at 10:40 PM David Faust wrote: > > > > On 5/2/25 01:26, Richard Biener wrote: > > On Wed, Apr 30, 2025 at 7:26 PM David Faust wrote: > >> > >> The btf_decl_tag and btf_type_tag attributes provide a means to annotate > >> declarations and types respectively with arbitrary user

Re: [PATCH 3/4] Rewrite VCEs of integral types [PR116939]

2025-05-06 Thread Richard Biener
On Mon, May 5, 2025 at 9:54 PM Andrew Pinski wrote: > > On Mon, May 5, 2025 at 12:00 AM Richard Biener > wrote: > > > > On Mon, May 5, 2025 at 3:45 AM Andrew Pinski > > wrote: > > > > > > Like the patch to phiopt (r15-4033-g1f619fe25925a5f7), this adds rewriting > > > of VCE to > > > gimple_wi

[RFC PATCH 0/5] aarch64: Support for user-defined aarch64 tuning parameters in JSON

2025-05-06 Thread soumyaa
From: Soumya AR Hi, This RFC and subsequent patch series introduces support for printing and parsing of aarch64 tuning parameters in the form of JSON. It is important to note that this mechanism is specifically intended for power users to experiment with tuning parameters. This proposal does no

[RFC PATCH 3/5] json: Add get_map() method to JSON object class

2025-05-06 Thread soumyaa
From: Soumya AR This patch adds a get_map () method to the JSON object class to provide access to the underlying hash map that stores the JSON key-value pairs. It also reorganizes the private and public sections of the class to expose the map_t typedef, which is the return type of get_map(). Th

[RFC PATCH 2/5] aarch64: Enable dumping of AArch64 CPU tuning parameters to JSON

2025-05-06 Thread soumyaa
From: Soumya AR This patch adds functionality to dump AArch64 CPU tuning parameters to a JSON file. The new '-fdump-tuning-model=' flag allows users to export the current tuning model configuration to a JSON file. This patch was bootstrapped and regtested on aarch64-linux-gnu, no regression. Si

[RFC PATCH 4/5] aarch64: Enable parsing of user-provided AArch64 CPU tuning parameters

2025-05-06 Thread soumyaa
From: Soumya AR This patch adds support for loading custom CPU tuning parameters from a JSON file for AArch64 targets. The '-muser-provided-CPU=' flag accepts a user provided JSON file and overrides the internal tuning parameters at GCC runtime. This patch was bootstrapped and regtested on aarch

[RFC PATCH 5/5] aarch64: Regression tests for parsing of user-provided AArch64 CPU tuning parameters

2025-05-06 Thread soumyaa
From: Soumya AR This patch was bootstrapped and regtested on aarch64-linux-gnu, no regression. Signed-off-by: Soumya AR gcc/testsuite/ChangeLog: * gcc.target/aarch64/aarch64-json-tunings/aarch64-json-tunings.exp: New test. * gcc.target/aarch64/aarch64-json-tunings/boolean-1.c

Re: [PATCH][GCC15] PR tree-optimization/120048 - Allow IPA_CP to handle UNDEFINED as VARYING.

2025-05-06 Thread Richard Biener
On Mon, May 5, 2025 at 9:55 PM Andrew MacLeod wrote: > > > On 5/3/25 07:41, Richard Biener wrote: > > On Sat, May 3, 2025 at 12:39 AM Andrew MacLeod wrote: > >> On trunk I'll eventually do something different.. but it will be more > >> invasive than I think is reasonable for a backport. > >> > >>

Re: [PATCH] [GCC14] PR tree-optimization/120048 - Allow IPA_CP to handle UNDEFINED as VARYING.

2025-05-06 Thread Richard Biener
On Mon, May 5, 2025 at 9:56 PM Andrew MacLeod wrote: > > > On 5/3/25 07:41, Richard Biener wrote: > > On Sat, May 3, 2025 at 12:39 AM Andrew MacLeod wrote: > >> On trunk I'll eventually do something different.. but it will be more > >> invasive than I think is reasonable for a backport. > >> > >>

Re: [PATCH 1/5] Document option -fdump-ipa-clones

2025-05-06 Thread Richard Biener
On Mon, Apr 28, 2025 at 4:10 PM Martin Jambor wrote: > > Hi, > > I have noticed that the option -fdump-ipa-clones is not documented > although there are users who depend on it. This patch adds the > missing documentation along with the description of the information it > dumps and the format it u

Re: [PATCH 2/5] ipa: Do not emit info about temporary clones to ipa-clones dump (PR119852)

2025-05-06 Thread Richard Biener
On Mon, Apr 28, 2025 at 4:10 PM Martin Jambor wrote: > > Hi, > > as described in PR 119852, the output of -fdump-ipa-clones can contain > "(null)" as the suffix/reason for cloning when we need to create a > clone to hold the original function during recursive inlining. Such > clone is never outpu

Re: [PATCH 5/5] ipa: Drop the default value of suffix parameter of create_clone (PR119852)

2025-05-06 Thread Richard Biener
On Mon, Apr 28, 2025 at 4:16 PM Martin Jambor wrote: > > Hi > > in PR 119852 we agreed that since the NULL-ness of the suffix > parameter should prevent creation of a record in the ipa-clones > dump (which is implemented by a previous patch), it should not default > to NULL. > > Bootstrapped and t

Re: [PATCH] i386: Implement Thread Local Storage on Windows

2025-05-06 Thread Sam James
Julian Waters writes: > gcc bootstrap works on my end pretty well, but you know what they say, > no one likes an "It works on my device" developer :) The reason he asked is https://gcc.gnu.org/contribute.html#patches (it's convention to state how you tested it & on what platforms) and whether th

Re: [PATCH 4/5] ipa: Fix create_version_clone_with_body declaration and comment

2025-05-06 Thread Richard Biener
On Mon, Apr 28, 2025 at 4:14 PM Martin Jambor wrote: > > Hi, > > I noticed that the name of the fifth parameter of > cgraph_node::create_version_clone_with_body is different in the class > definition in cgraph.h and in the actual member function definition in > cgraphclones.cc. The former (clone_

[PATCH] tree-optimization/120031 - CTZ pattern matching fails a case

2025-05-06 Thread Richard Biener
This PR is about the pattern matching in tree-ssa-forwprop.cc not working for the fallback implementation in ZSTD which uses a cast around the negation of the value to be tested. There's a pattern eliding casts in (T')-(T)x already but that only covered an inner widening conversion. The following
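For readers unfamiliar with the idiom, here is a minimal sketch of a table-driven count-trailing-zeros fallback of the general kind the message refers to. It is not ZSTD's source and not the GCC pattern itself: the names `debruijn_pos` and `ctz32_fallback` are hypothetical, the constant and table are the widely published 32-bit De Bruijn pair, and the negation is done in unsigned arithmetic rather than through the signed cast the PR discusses.

```c
#include <stdint.h>

/* Lookup table for the De Bruijn constant 0x077CB531 (widely published). */
static const unsigned char debruijn_pos[32] = {
   0,  1, 28,  2, 29, 14, 24,  3, 30, 22, 20, 15, 25, 17,  4,  8,
  31, 27, 13, 23, 21, 19, 16,  7, 26, 12, 18,  6, 11,  5, 10,  9
};

/* Count trailing zero bits of a nonzero 32-bit value (hypothetical helper).
   The result for v == 0 is not meaningful.  */
static unsigned ctz32_fallback(uint32_t v)
{
  /* v & -v isolates the lowest set bit; unsigned negation avoids overflow.  */
  uint32_t lowest = v & (0u - v);
  /* The multiply places a unique 5-bit pattern in the top bits.  */
  return debruijn_pos[(uint32_t)(lowest * 0x077CB531u) >> 27];
}
```

A compiler that recognizes this shape can replace the multiply and table lookup with a single CTZ instruction, which is presumably why the form of the negation matters to the pattern matcher.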

RE: [PATCH] tree-optimization/120089 - force all PHIs live for early-break vect

2025-05-06 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Tuesday, May 6, 2025 9:51 AM > To: gcc-patches@gcc.gnu.org > Cc: Tamar Christina ; RISC-V CI c...@rivosinc.com> > Subject: [PATCH] tree-optimization/120089 - force all PHIs live for > early-break vect > > The following makes sure to ev

Re: [PATCH] libstdc++: Hide TLS variables in `std::call_once`

2025-05-06 Thread Martin Storsjö
On Wed, 2 Apr 2025, Martin Storsjö wrote: On Fri, 29 Nov 2024, LIU Hao wrote: On 2024-11-29 23:50, Jonathan Wakely wrote: It looks like your patch is against gcc-14 not trunk, the GLIBCXX_15.1.0 version is already there. Sorry, I mean GLIBCXX_3.4.34 for 15.1.0 Oops that's what I used to test

RE: [PATCH] tree-optimization/120089 - force all PHIs live for early-break vect

2025-05-06 Thread Richard Biener
On Tue, 6 May 2025, Tamar Christina wrote: > > -Original Message- > > From: Richard Biener > > Sent: Tuesday, May 6, 2025 9:51 AM > > To: gcc-patches@gcc.gnu.org > > Cc: Tamar Christina ; RISC-V CI > c...@rivosinc.com> > > Subject: [PATCH] tree-optimization/120089 - force all PHIs live f

RE: [PATCH 0/3] Remove non-SLP path from vectorizable_conversion

2025-05-06 Thread Tamar Christina
> > This is an example on how I'd like to see cleanup for SLP happening > in the vectorizable_* and related functions. While this example, > vectorizable_conversion, is quite straight-forward it helps to > isolate errors. I've done this in 3 steps: Happy to help with this if you let me know whi

Re: [PATCH] libstdc++: Fix constraint recursion in std::expected's operator== [PR119714]

2025-05-06 Thread Tomasz Kaminski
On Mon, May 5, 2025 at 8:50 PM Patrick Palka wrote: > Tested on x86_64-pc-linux-gnu, does this look OK for trunk/15? > This LGTM. Out of curiosity, would declaring them as members also fix the issue? > > -- >8 -- > > This std::expected friend operator== is prone to constraint recursion > after C

[PING PATCH v4 00/20] FMV refactor and ACLE compliance.

2025-05-06 Thread Alfie Richards
Hi all, Ping for this patch series. There are a handful of other patches that are dependent on this series so I am keen to start getting this reviewed. Kind regards, Alfie On 15/04/2025 11:31, Alfie Richards wrote: Hi all, Another update to this series. This patch changes the version info

Re: [PATCH] x86: Skip if the mode size is smaller than its natural size

2025-05-06 Thread Hongtao Liu
On Tue, May 6, 2025 at 3:06 PM H.J. Lu wrote: > > On Tue, May 6, 2025 at 2:30 PM Liu, Hongtao wrote: > > > > > > > > > -Original Message- > > > From: H.J. Lu > > > Sent: Tuesday, May 6, 2025 2:16 PM > > > To: Liu, Hongtao > > > Cc: GCC Patches ; Uros Bizjak > > > > > > Subject: Re: [PA

Re: [PATCH 2/3] x86: Add a pass to fold tail call

2025-05-06 Thread H.J. Lu
On Mon, May 5, 2025 at 9:20 PM Andi Kleen wrote: > > > If the branch edge destination is a basic block with only a direct > > sibcall, change the jcc target to the sibcall target, decrement the > > destination basic block entry label use count and redirect the edge > > to the exit basic block. Ca

Re: [RFC PATCH 0/2] Add target_clones profile option support

2025-05-06 Thread Alfie Richards
Hello, I like this idea. I have a couple thoughts to add. On 05/05/2025 09:46, Yangyu Chen wrote: On 5 May 2025, at 16:34, Kyrylo Tkachov wrote: On 4 May 2025, at 19:19, Yangyu Chen wrote: Hi everyone, This patch series introduces support for the target_clones profile option in GCC. Thi

Re: [PATCH] Allow a PCH to be mapped to a different address

2025-05-06 Thread Jonathan Yong
On 5/5/25 12:08 PM, Jonathan Yong wrote: On 5/5/25 11:25 AM, LIU Hao wrote: On 2025-4-28 15:05, LIU Hao wrote: This is a response to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=14940#c57 The patch was submitted to MSYS2 for testing in 2022-5. No issue reports have been received so far: * htt

Re: [PATCH] i386: Implement Thread Local Storage on Windows

2025-05-06 Thread Jonathan Yong
On 5/5/25 8:14 AM, Jonathan Yong wrote: On 5/5/25 6:14 AM, Julian Waters wrote: gcc bootstrap works on my end pretty well, but you know what they say, no one likes an "It works on my device" developer :) In all seriousness, no observable problems were seen on my end, apart from all the existing

[RFC PATCH 1/5] aarch64 + arm: Remove const keyword from tune_params members and nested members

2025-05-06 Thread soumyaa
From: Soumya AR To allow runtime updates to tuning parameters, the const keyword is removed from aarch64 tune_params and all its nested structures and structure members. Since this patch also touches tuning structures in the arm backend, it was bootstrapped on aarch64-linux-gnu as well as arm-li

Re: [PATCH v5 05/10] libstdc++: Implement layout_left from mdspan.

2025-05-06 Thread Tomasz Kaminski
The constructors that are inside mapping_left, that I think represents constructors with other extends: template mapping_left(const mapping_left_base& other) : mapping_left_base(other) {} Can be placed in mapping_left_base, and they will be inherited, as only copy/move constructors are shadowed.

[PATCH] tree-optimization/120089 - force all PHIs live for early-break vect

2025-05-06 Thread Richard Biener
The following makes sure to even mark unsupported PHIs live when doing early-break vectorization since otherwise we fail to validate we can vectorize those and generate wrong code based on the scalar PHIs which would only work with a vectorization factor of one. Bootstrapped and tested on x86_64-u

Re: [RFC PATCH 0/2] Add target_clones profile option support

2025-05-06 Thread Yangyu Chen
> On 6 May 2025, at 16:01, Alfie Richards wrote: > > Hello, > > I like this idea. I have a couple thoughts to add. > > On 05/05/2025 09:46, Yangyu Chen wrote: >>> On 5 May 2025, at 16:34, Kyrylo Tkachov wrote: >>> On 4 May 2025, at 19:19, Yangyu Chen wrote: Hi everyone, >>

Re: [PATCH 6/7] OpenMP: C front end support for "begin declare variant"

2025-05-06 Thread Tobias Burnus
On February 10, 2025, Sandra Loosemore wrote: […] When not using the variant function, having it limited to the TU means that there are now warnings like: warning: ‘f.ompvariant1’ defined but not used [-Wunused-function] 4 | int f(int i) { | ^ I think that's okay - both

Patch Submission: Optimize Size of true and false Macros in C

2025-05-06 Thread SAKSHAM JOSHI
Dear GCC Patches Team, I hope this message finds you well. I am submitting a patch for the GCC compiler's "stdbool.h" to optimize the size of the true and false macros in the C programming language. Currently, the size of the true and false macros is 4 bytes, whereas the _Bool datatype is 1 byte
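As background for the size claim, a small sketch that reports what `sizeof` yields for `true` and for `bool` under the compiler and language mode in use. The helper names are hypothetical. Before C23, `<stdbool.h>` defines `true` as the integer constant 1, which has type `int`, so its size equals `sizeof(int)` (4 bytes on common targets); in C23, `true` is a keyword of type `bool`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Size of the expression `true` in the current language mode.  */
static size_t size_of_true(void) { return sizeof(true); }

/* Size of the bool (_Bool) type itself.  */
static size_t size_of_bool(void) { return sizeof(bool); }
```

Note that `sizeof` of a constant says nothing about storage: a `bool` object initialized with `true` occupies `sizeof(bool)` bytes in either mode.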

Re: [PATCH v5 05/10] libstdc++: Implement layout_left from mdspan.

2025-05-06 Thread Tomasz Kaminski
For better reference, here is illustration of the design I was thinking about: https://godbolt.org/z/7aTcM8fz4 I would also consider having left_mapping_base to accept padding, where layout_left uses left_mapping_base. On Tue, May 6, 2025 at 10:48 AM Tomasz Kaminski wrote: > The constructors tha

RE: [PATCH 0/3] Remove non-SLP path from vectorizable_conversion

2025-05-06 Thread Richard Biener
On Tue, 6 May 2025, Tamar Christina wrote: > > > > This is an example on how I'd like to see cleanup for SLP happening > > in the vectorizable_* and related functions. While this example, > > vectorizable_conversion, is quite straight-forward it helps to > > isolate errors. I've done this in 3

Re: [PATCH] asf: Enable pass at O2 or higher

2025-05-06 Thread Konstantinos Eleftheriou
Hi Andi, thanks for your response. The pass prevents store forwarding only in cases where smaller stores are followed by a large load. To the best of our knowledge, on most CPUs, the load will stall in that case. Have you taken that into account? Thanks, Konstantinos On Wed, Apr 23, 2025 at 6:55

Re: [RFC PATCH 0/2] Add target_clones profile option support

2025-05-06 Thread Alfie Richards
On 06/05/2025 09:36, Yangyu Chen wrote: On 6 May 2025, at 16:01, Alfie Richards wrote: Hello, I like this idea. I have a couple thoughts to add. On 05/05/2025 09:46, Yangyu Chen wrote: On 5 May 2025, at 16:34, Kyrylo Tkachov wrote: On 4 May 2025, at 19:19, Yangyu Chen wrote: Hi every

Fix i386 bootstrap on non-Windows targets

2025-05-06 Thread Jan Hubicka
Hi, this patch adds ifdef so we don't get a warning on ix86_tls_index being unused. Bootstrapped x86_64-linux, committed. * config/i386/i386.cc (ix86_tls_index): Add ifdef. diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc index f28c92a9d3a..89f518c86b5 100644 --- a/gcc/config/

Re: [PATCH] MIPS: Fixed the problem that the nop instruction is inserted at the wrong position after enabling '-fpatchable-function-entry='

2025-05-06 Thread WANG Xuerui
On 4/30/25 14:26, Lulu Cheng wrote: Because MIPS function symbol is generated in the prologue function, this nop generation should be done in prologue. OK for trunk? PR target/99217 gcc/ChangeLog: * config/mips/mips.cc (mips_start_function_definition): Implements the fu

Re: [PATCH] aarch64: Use LDR for first-element loads for Advanced SIMD

2025-05-06 Thread Dhruv Chawla
On 06/01/25 11:44, Andrew Pinski wrote: External email: Use caution opening links or attachments On Sun, Jan 5, 2025 at 10:06 PM Dhruv Chawla wrote: This patch modifies Advanced SIMD assembly generation to emit an LDR instruction when a vector is created using a load to the first element wit

Re: [RFC PATCH 0/5] aarch64: Support for user-defined aarch64 tuning parameters in JSON

2025-05-06 Thread Richard Sandiford
writes: > From: Soumya AR > > Hi, > > This RFC and subsequent patch series introduces support for printing and > parsing > of aarch64 tuning parameters in the form of JSON. Thanks for doing this. It looks really useful. My main question is: rather than write the parsing and printing routines

[PATCH] libcpp: Further fixes for incorrect line numbers in large files [PR120061]

2025-05-06 Thread Jakub Jelinek
Hi! The backport of the PR108900 fix to 14 branch broke building chromium because static_assert (__LINE__ == expected_line_number, ""); now triggers as the __LINE__ values are off by one. This isn't the case on the trunk and 15 branch because we've switched to 64-bit location_t and so one actually
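To illustrate the kind of check that was tripping, here is a minimal sketch, not the chromium code: it uses `#line` to give the following source line a known number and then asserts on `__LINE__` at compile time. The function name `line_probe` and the number 1000 are arbitrary choices for the example.

```c
/* #line renumbers the source line that follows the directive.  */
#line 1000
static int line_probe(void) { return __LINE__; }      /* this is line 1000 */
_Static_assert(__LINE__ == 1001, "__LINE__ is off");  /* this is line 1001 */
```

If the preprocessor's line tracking were off by one, the `_Static_assert` would fail at compile time in the same way the message describes.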

Re: [GCC16,RFC,V2 03/14] aarch64: add new insn definition for st2g

2025-05-06 Thread Richard Sandiford
Indu Bhagat writes: > On 4/15/25 9:21 AM, Richard Sandiford wrote: >> Indu Bhagat writes: >>> Store Allocation Tags (st2g) is an Armv8.5-A memory tagging (MTE) >>> instruction. It stores an allocation tag to two tag granules of memory. >>> >>> TBD: >>>- Not too sure what is the best way to ge

[14 PATCH] libcpp: Further fixes for incorrect line numbers in large files [PR120061]

2025-05-06 Thread Jakub Jelinek
Hi! Here is the 14 branch version of the PR120061 fix I've just posted for 16/15. The differences from the earlier patch are all caused by the 32-bit location_t on the branch instead of 64-bit location_t that 16/15 has. So, it needs 1 << whatever instead of loc_one << whatever in the sources, and

Re: [PATCH] i386: Implement Thread Local Storage on Windows

2025-05-06 Thread Julian Waters
Argh, I see what's going on here. This was bootstrapped and tested on Windows, so the failure currently breaking Linux builds slipped right under me. I'll spin up a patch to fix the build failure pronto. Sorry about that. best regards, Julian On Tue, 6 May 2025, 17:11 Sam James, wrote: > Julian

Re: [PATCH] i386: Implement Thread Local Storage on Windows

2025-05-06 Thread Julian Waters
Never mind, it seems someone already beat me to it. Sorry for all the mess! best regards, Julian On Tue, May 6, 2025 at 5:11 PM Sam James wrote: > > Julian Waters writes: > > > gcc bootstrap works on my end pretty well, but you know what they say, > > no one likes an "It works on my device" dev

[PATCH] Printf properly on systems without %zu [PR120086]

2025-05-06 Thread Jørgen Kvalsvik
Some systems don't support the %zu format modifier for size_t, such as hppa64-hp-hpux. We don't really need the full width of size_t for printing the number of prime paths as path counts of those sizes would've already blown up the machine. For printing the vector size we can use the formatting dir
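A generic sketch of the portable approach, outside of GCC's own formatting macros: cast the `size_t` to `unsigned long` and print it with `%lu`, which every hosted C library supports. The helper name `format_count` is hypothetical, and the cast can truncate on hosts where `size_t` is wider than `unsigned long`, which is acceptable only when the values are known to be small, as the message argues for path counts.

```c
#include <stddef.h>
#include <stdio.h>

/* Format a size_t without relying on %zu (hypothetical helper).
   Returns the number of characters that snprintf would have written.  */
static int format_count(char *buf, size_t buflen, size_t n)
{
  return snprintf(buf, buflen, "%lu", (unsigned long) n);
}
```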

Re: [PATCH] Printf properly on systems without %zu [PR120086]

2025-05-06 Thread Richard Biener
On Tue, 6 May 2025, Jørgen Kvalsvik wrote: > Some systems don't support the %zu format modifier for size_t, such as > hppa64-hp-hpux. We don't really need the full width of size_t for > printing the number of prime paths as path counts of those sizes > would've already blown up the machine. For pr

[14 PATCH] c++: Backport r15-521 and r15-2154 to 14 branch [PR119305]

2025-05-06 Thread Jakub Jelinek
Hi! While the r15-521 commit was meant for trunk only: On Thu, Apr 25, 2024 at 11:30:48AM -0400, Jason Merrill wrote: > Hmm, maybe maybe_clone_body shouldn't clear DECL_SAVED_TREE for aliases, but > rather set it to some stub like void_node? > > Though with all these changes, it's probably better

Re: [PATCH] Printf properly on systems without %zu [PR120086]

2025-05-06 Thread Jakub Jelinek
On Tue, May 06, 2025 at 01:17:54PM +0200, Richard Biener wrote: > On Tue, 6 May 2025, Jørgen Kvalsvik wrote: > > > Some systems don't support the %zu format modifier for size_t, such as > > hppa64-hp-hpux. We don't really need the full width of size_t for > > printing the number of prime paths as

Re: [PATCH] tree-optimization/1157777 - STLF fails with BB vectorization of loop

2025-05-06 Thread Richard Biener
On Wed, 16 Apr 2025, Richard Biener wrote: > The following tries to address us BB vectorizing a loop body that > swaps consecutive elements of an array like for bubble-sort. This > causes the vector store in the previous iteration to fail to forward > to the vector load in the current iteration s

Re: [PATCH] Printf properly on systems without %zu [PR120086]

2025-05-06 Thread Jørgen Kvalsvik
Mostly because it would make the print more noisy, and because by the time we have 4 billion prime paths, all systems would probably already have been crushed under the load of computing them. I'm happy to change to fmt_size_t everywhere, of course, but the use of size_t for pathno was my own

Re: [PATCH] libstdc++: Provide ability to query _Sink_iter if writes are discarded.

2025-05-06 Thread Jonathan Wakely
On 05/05/25 14:44 +0200, Tomasz Kamiński wrote: This patch provides an equality operator between _Sink_iter and default_sentinel, that returns true, if any further writes to the _Sink_iter and underlying _Sink, will be discarded, and thus can be omitted. This operator is implemented in terms new

Re: [PATCH] Printf properly on systems without %zu [PR120086]

2025-05-06 Thread Jakub Jelinek
On Tue, May 06, 2025 at 01:28:16PM +0200, Jørgen Kvalsvik wrote: > Mostly because it would make the print more noisy, and because by the time > we have 4 billion prime paths, all systems would probably already have been > crushed under the load of computing them. > > I'm happy to change to fmt_siz

Re: [PATCH v5 05/10] libstdc++: Implement layout_left from mdspan.

2025-05-06 Thread Luc Grosheintz
On 5/6/25 11:28 AM, Tomasz Kaminski wrote: For better reference, here is illustration of the design I was thinking about: https://godbolt.org/z/7aTcM8fz4 I would also consider having left_mapping_base to accept padding, where layout_left uses left_mapping_base. Thank you for all the help! I

Re: [PATCH] libstdc++: Provide ability to query _Sink_iter if writes are discarded.

2025-05-06 Thread Tomasz Kaminski
On Tue, May 6, 2025 at 1:34 PM Jonathan Wakely wrote: > On 05/05/25 14:44 +0200, Tomasz Kamiński wrote: > >This patch provides an equality operator between _Sink_iter and > default_sentinel, > >that returns true, if any further writes to the _Sink_iter and underlying > _Sink, > >will be discarded,

[PATCH v4 0/6] RISC-V: Combine vec_duplicate + vadd.vv to vadd.vx on GR2VR cost

2025-05-06 Thread pan2 . li
From: Pan Li This patch would like to introduce the combine of vec_dup + vadd.vv into vadd.vx on the cost value of GR2VR. The late-combine will take place if the cost of GR2VR is zero, or reject the combine if non-zero like 1, 15 in test. A helper function get_gr2vr_cost is introduced to make s

[PATCH v4 5/6] RISC-V: Add testcases for vec_duplicate + vadd.vv combine when GR2VR cost 1

2025-05-06 Thread pan2 . li
From: Pan Li Add asm dump check and for vec_duplicate + vadd.vv combine to vadd.vx. The late-combine will not take action when GR2VR cost is 1. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/

[PATCH v2] libstdc++: Fix width computation for the chrono formatting [PR120114]

2025-05-06 Thread Tomasz Kamiński
Use `__unicode::_field_width` to compute the field width of the output when writing the formatted output for std::chrono::types. This applies both to characters copied from format string, and one produced by localized formatting. We also use _Str_sink::view() instead of get(), which avoids copy

Re: [PATCH] RISC-V: Add pattern for vector-scalar multiply-add/sub [PR119100]

2025-05-06 Thread Jeff Law
On 4/16/25 8:32 AM, Paul-Antoine Arras wrote: Please find attached an updated patch with an additional cost model. By default, an instruction is 4 and the penalty for moving data from floating-point to vector register is 2; thus, vfmadd.vf costs 6, which still makes it cheaper than vec_du

Re: [PATCH] emit-rtl: Add extra checks for paradoxical hardware subregs [PR119966]

2025-05-06 Thread Richard Sandiford
Dimitar Dimitrov writes: > After r16-160-ge6f89d78c1a752, late_combine2 started transforming the > following RTL for pru-unknown-elf: > > (insn 3949 3948 3951 255 (set (reg:QI 56 r14.b0 [orig:1856 _619 ] [1856]) > (and:QI (reg:QI 1 r0.b1 [orig:1855 _201 ] [1855]) > (const

[PATCH v4 4/6] RISC-V: Add testcases for vec_duplicate + vadd.vv combine when GR2VR cost 0

2025-05-06 Thread pan2 . li
From: Pan Li Add asm dump check and run test for vec_duplicate + vadd.vv combine to vadd.vx. Introduce new folder to hold all related testcases. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/rvv.ex

Re: [PATCH] libstdc++: Provide ability to query _Sink_iter if writes are discarded.

2025-05-06 Thread Tomasz Kaminski
On Tue, May 6, 2025 at 2:07 PM Jonathan Wakely wrote: > On Tue, 6 May 2025 at 12:42, Tomasz Kaminski wrote: > > > > > > > > On Tue, May 6, 2025 at 1:34 PM Jonathan Wakely > wrote: > >> > >> On 05/05/25 14:44 +0200, Tomasz Kamiński wrote: > >> >This patch provides an equality operator between _S

[PATCH v4 6/6] RISC-V: Add testcases for vec_duplicate + vadd.vv combine when GR2VR cost 15

2025-05-06 Thread pan2 . li
From: Pan Li Add asm dump check and for vec_duplicate + vadd.vv combine to vadd.vx. The late-combine will not take action when GR2VR cost is 15. The below test suites are passed for this patch. * The rv64gcv fully regression test. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec

Re: [PATCH v4 2/6] RISC-V: Add gr2vr cost helper function

2025-05-06 Thread Robin Dapp
+/* + * Return the cost of operation that move from gpr to vr. + * + * It will take the value of --param=gpr2vr_cost if it is provided. + * Or the default regmove->GR2VR will be returned. + */ Please still remove the leading '*' of the comment. The series is OK with that fixed. Thanks for you

RE: [PATCH v4 2/6] RISC-V: Add gr2vr cost helper function

2025-05-06 Thread Li, Pan2
> Please still remove the leading '*' of the comment. The series is OK with > that > fixed. Thanks for your patience. Thanks Robin for help and explanation. I will commit with these changes, it looks like auto inserted when enter. > We can take care of a more comprehensive rtx_cost once all

[PATCH v2] libstdc++: Provide ability to query _Sink_iter if writes are discarded.

2025-05-06 Thread Tomasz Kamiński
This patch provides _M_discarding functions for _Sink_iter and _Sink function that returns true, if any further writes to the _Sink_iter and underlying _Sink, will be discarded, and thus can be omitted. Currently only the _Padding_sink reports discarding mode of if width of sequence characters is g

[PATCH 6/6] vect: Split vectorizable_lc_phi

2025-05-06 Thread andre.simoesdiasvieira
Remove the gimple ** argument that is no longer needed from vectorizable_lc_phi and vect_transform_lc_phi. --- gcc/tree-vect-loop.cc | 4 ++-- gcc/tree-vect-stmts.cc | 4 ++-- gcc/tree-vectorizer.h | 6 ++ 3 files changed, 6 insertions(+), 8 deletions(-) diff --git a/gcc/tree-vect-loop.cc

[pushed: r16-413] diagnostics: add logical_location_manager; reimplement logical_location

2025-05-06 Thread David Malcolm
Previously we used an abstract base class logical_location with concrete subclasses to separate the diagnostics subsystem from implementation details of "tree" and of libgdiagnostics. This approach required allocating implementation objects on the heap whenever working with logical locations, and

[pushed: r16-412] libgdiagnostics: add accessors for diagnostic_logical_location [LIBGDIAGNOSTICS_ABI_1]

2025-05-06 Thread David Malcolm
For followup work I need to be able to get at data from a diagnostic_logical_location after creating it, hence the need to extend libgdiagnostics with accessor entrypoints. This is the first extension to libgdiagnostics since the initial release. The patch uses symbol versioning to add the new en

[pushed: r16-416] json: implement JSON pointer; use it in sarif-replay [PR117988]

2025-05-06 Thread David Malcolm
This patch extends our json class to track JSON pointers (RFC 6901), and then uses this within sarif-replay to provide logical locations within the JSON when reporting on issues in the SARIF. Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. Successful run of analyzer integration test

[pushed: r16-415] diagnostics: support XML and JSON kinds of logical locations

2025-05-06 Thread David Malcolm
Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. Successful run of analyzer integration tests on x86_64-pc-linux-gnu. Pushed to trunk as r16-415-g9fb44cc4823106. gcc/ChangeLog: * diagnostic-format-sarif.cc (maybe_get_sarif_kind): Add cases for new kinds of logical loc

[pushed: r16-414] sarif output: capture nesting of logical locations [PR116176]

2025-05-06 Thread David Malcolm
Previously our SARIF output did not capture nesting of logical locations: any time a result or event referred to a logical location it would simply put a copy of the logical location into the pertinent location object without a "parentIndex" property. With this patch we instead populate such locat

Re: [14 PATCH] libcpp: Further fixes for incorrect line numbers in large files [PR120061]

2025-05-06 Thread Richard Biener
On Tue, 6 May 2025, Jakub Jelinek wrote: > Hi! > > Here is the 14 branch version of the PR120061 fix I've just posted > for 16/15. > The differences from the earlier patch are all caused by the > 32-bit location_t on the branch instead of 64-bit location_t that > 16/15 has. > So, it needs 1 << wh

[pushed] libgcobol: Fix bootstrap for targets without program_invocation_short_name

2025-05-06 Thread Iain Sandoe
This should work for Solaris and BSD variants; further extensions might be needed to produce a good QoI on other targets (but at least the library should still build there). Tested on x86_64, aarch64 and powerpc64le Linux and on x86_64-darwin21/23. Checked manually that the configurations were as

[pushed: r16-417] diagnostics: use diagnostic_option_id in one more place

2025-05-06 Thread David Malcolm
No functional change intended. Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. Pushed to trunk as r16-417-gf4fa41cd5ccbcc. gcc/ChangeLog: * selftest-diagnostic.cc (test_diagnostic_context::report): Use diagnostic_option_id rather than plain int. * selftest-d

Re: [patch, Fortran] Fix PR 119928, rejects-valid 15/16 regression

2025-05-06 Thread Thomas Koenig
Hi Harald, It appears that something is not right and generates wrong code with the check enabled.  Can you have another look? The problem was indeed that generating a formal from an actual arglist is a bad idea when classes are involved.  Fixed in the attached patch.  I think it still makes s

Re: [PATCH 2/2] aarch64: Fold lsl+lsr+orr to rev for half-width shifts

2025-05-06 Thread Richard Sandiford
Dhruv Chawla writes: > This patch modifies the intrinsic expanders to expand svlsl and svlsr to > unpredicated forms when the predicate is a ptrue. It also folds the > following pattern: > > lsl , , > lsr , , > orr , , > > to: > > revb/h/w , > > when the shift amount is equal to half t
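The identity behind this fold can be checked with a small sketch: shifting a fixed-width value left and right by half its width and OR-ing the results exchanges its two halves. The function names below are illustrative only, and fixed-width elements are modelled with explicit masks; this is not the patch's code.

```python
def rotate_half(x: int, width: int) -> int:
    """Compute (x << width/2) | (x >> width/2), truncated to `width` bits."""
    half = width // 2
    mask = (1 << width) - 1
    return ((x << half) | (x >> half)) & mask


def swap_halves(x: int, width: int) -> int:
    """Reference: exchange the low and high halves of a width-bit value."""
    half = width // 2
    low_mask = (1 << half) - 1
    return ((x & low_mask) << half) | ((x >> half) & low_mask)


# The shift/shift/or sequence matches the half swap for each element width.
for width in (16, 32, 64):
    for value in (0, 1, 0x1234, 0xDEADBEEF, (1 << width) - 1):
        value &= (1 << width) - 1
        assert rotate_half(value, width) == swap_halves(value, width)

# For 16-bit elements the half swap is a byte reversal.
print(hex(rotate_half(0x1234, 16)))
```

For 16-bit elements the swapped halves are bytes, for 32-bit elements they are halfwords, and for 64-bit elements they are words, which is why a single reverse instruction per element size can stand in for the three-instruction sequence.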

Re: [PATCH v2] libstdc++: Fix width computation for the chrono formatting [PR120114]

2025-05-06 Thread Jonathan Wakely
On Tue, 6 May 2025 at 13:35, Tomasz Kamiński wrote: > > Use `__unicode::_field_width` to compute the field width of the output when > writing > the formatted output for std::chrono::types. This applies both to characters > copied > from format string, and one produced by localized formatting. >

Re: [PATCH 2/3] x86: Add a pass to fold tail call

2025-05-06 Thread Andi Kleen
On 2025-05-06 09:48, H.J. Lu wrote: On Mon, May 5, 2025 at 9:56 PM Andi Kleen wrote: On Mon, May 05, 2025 at 06:20:40AM -0700, Andi Kleen wrote: > > If the branch edge destination is a basic block with only a direct > > sibcall, change the jcc target to the sibcall target, decrement the > > de

Re: [PATCH v5 05/10] libstdc++: Implement layout_left from mdspan.

2025-05-06 Thread Tomasz Kaminski
On Tue, May 6, 2025 at 1:39 PM Luc Grosheintz wrote: > > On 5/6/25 11:28 AM, Tomasz Kaminski wrote: > > For better reference, here is illustration of the design I was thinking > > about: > > https://godbolt.org/z/7aTcM8fz4 > > I would also consider having left_mapping_base to accept padding, wher

Re: [PATCH] libstdc++: Fix width computation for the chrono formatting [PR120114]

2025-05-06 Thread Tomasz Kaminski
On Tue, May 6, 2025 at 1:59 PM Jonathan Wakely wrote: > On 05/05/25 16:45 +0200, Tomasz Kamiński wrote: > >Use `__unicode::_field_width` to compute the field width of the output > when writing > >the formatted output for std::chrono::types. This applies both to > characters copied > >from format

Re: [PATCH] libstdc++: Provide ability to query _Sink_iter if writes are discarded.

2025-05-06 Thread Jonathan Wakely
On Tue, 6 May 2025 at 12:42, Tomasz Kaminski wrote: > > > > On Tue, May 6, 2025 at 1:34 PM Jonathan Wakely wrote: >> >> On 05/05/25 14:44 +0200, Tomasz Kamiński wrote: >> >This patch provides an equality operator between _Sink_iter and >> >default_sentinel, >> >that returns true, if any further

Re: [PATCH v5 05/10] libstdc++: Implement layout_left from mdspan.

2025-05-06 Thread Luc Grosheintz
On 5/6/25 1:56 PM, Tomasz Kaminski wrote: On Tue, May 6, 2025 at 1:39 PM Luc Grosheintz wrote: On 5/6/25 11:28 AM, Tomasz Kaminski wrote: For better reference, here is illustration of the design I was thinking about: https://godbolt.org/z/7aTcM8fz4 I would also consider having left_mappin

Re: [PATCH v5 05/10] libstdc++: Implement layout_left from mdspan.

2025-05-06 Thread Tomasz Kaminski
On Tue, May 6, 2025 at 1:39 PM Luc Grosheintz wrote: > > On 5/6/25 11:28 AM, Tomasz Kaminski wrote: > > For better reference, here is illustration of the design I was thinking > > about: > > https://godbolt.org/z/7aTcM8fz4 > > I would also consider having left_mapping_base to accept padding, wher

Re: [PATCH] libstdc++: Fix width computation for the chrono formatting [PR120114]

2025-05-06 Thread Jonathan Wakely
On Tue, 6 May 2025 at 13:06, Tomasz Kaminski wrote: > > > > On Tue, May 6, 2025 at 1:59 PM Jonathan Wakely wrote: >> >> On 05/05/25 16:45 +0200, Tomasz Kamiński wrote: >> >Use `__unicode::_field_width` to compute the field width of the output when >> >writing >> >the formatted output for std::c

Re: [PATCH] MIPS: Fixed the problem that the nop instruction is inserted at the wrong position after enabling '-fpatchable-function-entry='

2025-05-06 Thread Lulu Cheng
On 2025/5/6 at 6:14 PM, WANG Xuerui wrote: On 4/30/25 14:26, Lulu Cheng wrote: Because MIPS function symbol is generated in the prologue function, this nop generation should be done in prologue. OK for trunk? PR target/99217 gcc/ChangeLog: * config/mips/mips.cc (mips_start_function_definiti

[PATCH] tree-optimization/119589 - alignment analysis for VF > 1 and VMAT_STRIDED_SLP

2025-05-06 Thread Richard Biener
The following fixes the alignment analysis done by the VMAT_STRIDED_SLP code which for the case of VF > 1 currently relies on dataref analysis which assumes consecutive accesses. But the code generation advances by DR_STEP between each iteration which requires us to assess that individual DR_STEP

Re: [RFC PATCH 0/2] Add target_clones profile option support

2025-05-06 Thread Yangyu Chen
> On 6 May 2025, at 17:49, Alfie Richards wrote: > > On 06/05/2025 09:36, Yangyu Chen wrote: >>> On 6 May 2025, at 16:01, Alfie Richards wrote: >>> >>> Hello, >>> >>> I like this idea. I have a couple thoughts to add. >>> >>> On 05/05/2025 09:46, Yangyu Chen wrote: > On 5 May 2025, at

Re: [14 PATCH] c++: Backport r15-521 and r15-2154 to 14 branch [PR119305]

2025-05-06 Thread Jason Merrill
On 5/6/25 7:18 AM, Jakub Jelinek wrote: Hi! While the r15-521 commit was meant for trunk only: On Thu, Apr 25, 2024 at 11:30:48AM -0400, Jason Merrill wrote: Hmm, maybe maybe_clone_body shouldn't clear DECL_SAVED_TREE for aliases, but rather set it to some stub like void_node? Though with all

[PATCH v4 1/6] RISC-V: Add new option --param=gpr2vr-cost= for rvv insn

2025-05-06 Thread pan2 . li
From: Pan Li While investigating the combine of vec_dup and vop.vv into vop.vx, we need to depend on the cost of insns that operate from GPR to VR, for example vadd.vx. Thus, for better control and testing, we introduce a new option, as below: --param=gpr2vr-cost= To specify the cost value

Re: [PATCH] libstdc++: Fix width computation for the chrono formatting [PR120114]

2025-05-06 Thread Jonathan Wakely
On 05/05/25 16:45 +0200, Tomasz Kamiński wrote: Use `__unicode::_field_width` to compute the field width of the output when writing the formatted output for std::chrono::types. This applies both to characters copied from format string, and one produced by localized formatting. We also use _St

Re: [PATCH] emit-rtl: Add extra checks for paradoxical hardware subregs [PR119966]

2025-05-06 Thread Dimitar Dimitrov
On Mon, May 05, 2025 at 06:51:31PM +0300, Dimitar Dimitrov wrote: > After r16-160-ge6f89d78c1a752, late_combine2 started transforming the > following RTL for pru-unknown-elf: > > (insn 3949 3948 3951 255 (set (reg:QI 56 r14.b0 [orig:1856 _619 ] [1856]) > (and:QI (reg:QI 1 r0.b1 [orig:1

Re: [PATCH] libstdc++: Fix constraint recursion in std::expected's operator== [PR119714]

2025-05-06 Thread Jonathan Wakely
On Mon, 5 May 2025 at 19:50, Patrick Palka wrote: > > Tested on x86_64-pc-linux-gnu, does this look OK for trunk/15? OK for trunk and 15. > > -- >8 -- > > This std::expected friend operator== is prone to constraint recursion > after CWG 2369 for the same reason as basic_const_iterator's compari

[PATCH v4 2/6] RISC-V: Add gr2vr cost helper function

2025-05-06 Thread pan2 . li
From: Pan Li After we introduced the --param=gpr2vr-cost option to set the cost of operations that act from gpr to vr, we would like to introduce a new helper function to get the cost of gr2vr, and then make sure all references to gr2vr go through this helper function. The helper function wi

[PATCH v4 3/6] RISC-V: Combine vec_duplicate + vadd.vv to vadd.vx on GR2VR cost

2025-05-06 Thread pan2 . li
From: Pan Li This patch combines vec_duplicate + vadd.vv into vadd.vx, as in the example code below. The related pattern will depend on the cost of vec_duplicate from GR2VR, it will: * The pattern matching will be active by default. * The cost of GR2VR will be added to the tot

[PATCH 2/6] vect: Remove non-SLP path from vectorizable_reduction

2025-05-06 Thread andre.simoesdiasvieira
Prunes code from the trivial true/false conditions. --- gcc/tree-vect-loop.cc | 540 -- 1 file changed, 155 insertions(+), 385 deletions(-) diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc index 69b692f1673..4ab7d227e42 100644 --- a/gcc/tree-vect
