Re: [PATCH v2] RISC-V: Add generic tune as default.

2025-06-19 Thread Jeff Law
On 6/19/25 9:17 AM, Kito Cheng wrote: I guess we should implement an auto generated document for mcpu and mtune document like what we do for -march. Yea, probably. The more of that stuff that's auto generated the better. It's easily forgotten. jeff

[PATCH 2/2] Use auto_vec in prime paths selftests [PR120634]

2025-06-19 Thread Jørgen Kvalsvik
The selftests had a bunch of memory leaks that showed up in make selftest-valgrind as a result of not using auto_vec or otherwise explicitly calling release. Replacing vec with auto_vec makes the problem go away. The auto_vec_vec helper is made constructible from a vec so that objects returned from fu

[PATCH 1/2] Free buffer on function exit [PR120634]

2025-06-19 Thread Jørgen Kvalsvik
Using auto_vec ensures that the buffer is always free'd when the function returns. PR gcc/gcov-profile 120634 gcc/ChangeLog: * prime-paths.cc (trie::paths): Use auto_vec. --- gcc/prime-paths.cc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/prime-paths.c
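The ownership idea behind this fix can be shown in miniature. The following is only a sketch: raw_buf and auto_buf are hypothetical stand-ins invented for illustration, not GCC's vec and auto_vec, and the live_buffers counter exists only to make the leak observable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-ins for illustration only; these are NOT GCC's vec and
// auto_vec, just the same ownership idea in miniature.
static int live_buffers = 0;

// A buffer that must be released by hand.
struct raw_buf
{
  int *data = nullptr;

  void create (std::size_t n)
  {
    assert (data == nullptr);
    data = static_cast<int *> (std::malloc (n * sizeof (int)));
    if (data)
      ++live_buffers;
  }

  void release ()
  {
    if (data)
      {
        std::free (data);
        data = nullptr;
        --live_buffers;
      }
  }
};

// The same buffer, but released automatically by the destructor.
struct auto_buf : raw_buf
{
  auto_buf () = default;
  auto_buf (const auto_buf &) = delete;
  auto_buf &operator= (const auto_buf &) = delete;
  ~auto_buf () { release (); }
};

int leaky_lookup (bool bail_out)
{
  raw_buf buf;
  buf.create (16);
  if (bail_out)
    return -1;   // buf.release () is skipped: the buffer leaks.
  buf.release ();
  return 0;
}

int safe_lookup (bool bail_out)
{
  auto_buf buf;
  buf.create (16);
  if (bail_out)
    return -1;   // ~auto_buf runs here and frees the buffer.
  return 0;
}
```

With the owning wrapper, every return path frees the buffer, so no exit from the function can forget the release call.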

Re: [PATCH] RISC-V: Use builtin clz/ctz when count_leading_zeros and count_trailing_zeros is used

2025-06-19 Thread Jeff Law
On 6/18/25 3:07 AM, Sosutha Sethuramapandian wrote: longlong.h for RISCV should define count_leading_zeros and count_trailing_zeros and COUNT_LEADING_ZEROS_0 when ZBB is enabled. The following patch fixes the bug reported in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110181 gcc.gnu
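As a rough illustration of mapping these operations onto compiler builtins, here is a minimal sketch. The helper names clz_word and ctz_word are hypothetical, and the real longlong.h uses macros rather than functions; only __builtin_clzl and __builtin_ctzl are actual GCC builtins. The zero case is guarded explicitly because both builtins have undefined results for a zero argument.

```cpp
#include <cassert>
#include <climits>

// Width of the word type used below, in bits.
constexpr int word_bits = sizeof (unsigned long) * CHAR_BIT;

// Hypothetical helper: number of leading zero bits, word_bits for x == 0.
inline int clz_word (unsigned long x)
{
  return x ? __builtin_clzl (x) : word_bits;
}

// Hypothetical helper: number of trailing zero bits, word_bits for x == 0.
inline int ctz_word (unsigned long x)
{
  return x ? __builtin_ctzl (x) : word_bits;
}
```

On a target with a bit-manipulation extension enabled, the compiler is free to expand these builtins to single instructions; otherwise it falls back to a generic sequence.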

[to-be-committed][RISC-V] Force several tests to use rocket tuning

2025-06-19 Thread Jeff Law
My tester has been flagging these regressions since the default cost model was committed, along with several others unix/-march=rv64gc_zba_zbb_zbs_zicond: gcc: gcc.target/riscv/rvv/vsetvl/avl_single-37.c -O2 scan-assembler-times \\.L[0-9]+\\:\\s+addi\\s+\\s*[a-x0-9]+,\\s*[a-x0-9]+,\\s*[0-

[PATCH] i386: Remove CLDEMOTE for clients

2025-06-19 Thread Haochen Jiang
Hi all, CLDEMOTE is not enabled on clients according to SDM. SDM only mentioned it will be enabled on Xeon and Atom servers, not clients. Remove them since Alder Lake (where it is introduced). Also will backport this patch to GCC12/13/14/15 with some tweaks in texi change. Ok for trunk? Thx, Ha

Re: [PATCH v2] RISC-V: Add generic tune as default.

2025-06-19 Thread Dongyan Chen
Yes, I would like to do it. Dongyan Chen On 2025/6/19 23:17, Kito Cheng wrote: I guess we should implement an auto generated document for mcpu and mtune document like what we do for -march. Dongyan, do you have interest to implement that? :) On Thu, Jun 19, 2025 at 10:02 PM Jeff Law wrote: We

Re: [PATCH] libsanitizer: Fix 'unknown-crash' reported for partial buffer overflows

2025-06-19 Thread Wern Lim
Note: This patch is currently in discussion on llvm-project's side and may have minor tweaks. Once that's done, the patch will be redone by applying upstream changes. Wern On 13/6/25 12:40 pm, Wern Lim wrote: Given a partially misaligned memory read for a large number of bytes (e.g., we allocat

Re: [PATCH] haifa-sched: Elide leftover instruction after breaking dependence [PR120459]

2025-06-19 Thread Jeff Law
On 6/19/25 9:15 AM, Paul-Antoine Arras wrote: *** Context *** The Haifa scheduler is able to recognise some cases where a dependence D between two instructions -- a producer P and a consumer C -- can be broken. In particular, if C contains a memory reference M and P is an increment, then M

[PATCH 0/2] Memory leak fixes in prime paths [PR120634]

2025-06-19 Thread Jørgen Kvalsvik
Hi, These patches fix a memory leak in the prime paths, and some in the selftests that show up in make selftest-valgrind. After applying these patches on my x86-64-linux-gnu system and make selftest-valgrind: -fself-test: 7665942 pass(es) in 8.943705 seconds ==802130== ==802130== HEAP SUMMARY:

Re: [wwwdocs v2] Add C23 status table

2025-06-19 Thread Joseph Myers
On Fri, 13 Jun 2025, Marek Polacek wrote: > doesn't need any changes, I think. Another is "modified existing functions > to preserve the const-ness of the type placed into the function", I don't > know what this is talking about. It's a duplicate of the entry "added qualifier preserving macros for b

Re: [PATCH] [lra] recompute ranges upon disabling fp2sp elimination [PR120424]

2025-06-19 Thread Vladimir Makarov
On 6/19/25 7:43 AM, Alexandre Oliva wrote: If the frame size grows to nonzero, arm_frame_pointer_required may flip to true under -fstack-clash-protection -fnon-call-exceptions, and that may disable the fp2sp elimination part-way through lra. If pseudos had got assigned to the frame pointer reg

Ping: [PATCH V4 0/5] Add more user friendly TARGET_ names for PowerPC

2025-06-19 Thread Michael Meissner
Ping patch. This is the explanation of the changes in the set of 5 patches to change the internal names of the power5, power6, etc. switches from the instruction that adds the new feature to the power processor level: I.e. change: TARGET_POPCNTB to TARGET_POWER5 The external switch rema

Ping: [PATCH V4 1/5] Change TARGET_POPCNTB to TARGET_POWER5.

2025-06-19 Thread Michael Meissner
Ping patch. This is patch 1 of 5 that changes the internal names of the power5, power6, etc. switches from the instruction that adds the new feature to the power processor level: I.e. change: TARGET_POPCNTB to TARGET_POWER5 The external switch remains the same, just the name used intern

Ping: [PATCH V4 3/5] Change TARGET_CMPB to TARGET_POWER6.

2025-06-19 Thread Michael Meissner
Ping patch. This is patch #3 in the set of 5 patches to change the internal names of the power5, power6, etc. switches from the instruction that adds the new feature to the power processor level: I.e. change: TARGET_POPCNTB to TARGET_POWER5 The external switch remains the same, just the

Ping: [PATCH V4 4/5] Change TARGET_POPCNTD to TARGET_POWER7.

2025-06-19 Thread Michael Meissner
Ping patch. This is patch #4 in the set of 5 patches to change the internal names of the power5, power6, etc. switches from the instruction that adds the new feature to the power processor level: I.e. change: TARGET_POPCNTB to TARGET_POWER5 The external switch remains the same, just the

Ping: [PATCH V4 2/5] Change TARGET_FPRND to TARGET_POWER5X.

2025-06-19 Thread Michael Meissner
Ping patch. This is patch #2 in the set of 5 patches to change the internal names of the power5, power6, etc. switches from the instruction that adds the new feature to the power processor level: I.e. change: TARGET_POPCNTB to TARGET_POWER5 The external switch remains the same, just the

Ping: [PATCH V4 5/5] Change TARGET_MODULO to TARGET_POWER9.

2025-06-19 Thread Michael Meissner
Ping patch. This is patch #5 in the set of 5 patches to change the internal names of the power5, power6, etc. switches from the instruction that adds the new feature to the power processor level: I.e. change: TARGET_POPCNTB to TARGET_POWER5 The external switch remains the same, just the

Re: [PATCH v5 2/3][__bdos]Use the counted_by attribute of pointers in builtinin-object-size.

2025-06-19 Thread Siddhesh Poyarekar
On 2025-06-16 18:08, Qing Zhao wrote: gcc/ChangeLog: * tree-object-size.cc (access_with_size_object_size): Handle pointers with counted_by. This should probably just say "Update comment for .ACCESS_WITH_SIZE.". (collect_object_sizes_for): Likewise. gcc/testsuite/Chan

Re: [PATCH v5 2/3][__bdos]Use the counted_by attribute of pointers in builtinin-object-size.

2025-06-19 Thread Siddhesh Poyarekar
On 2025-06-19 12:07, Siddhesh Poyarekar wrote: On 2025-06-16 18:08, Qing Zhao wrote: gcc/ChangeLog: * tree-object-size.cc (access_with_size_object_size): Handle pointers with counted_by. This should probably just say "Update comment for .ACCESS_WITH_SIZE.". (collect_object_sizes

Re: [PATCH] x86: Fix shrink wrap separate ICE under -fstack-clash-protection [PR120697]

2025-06-19 Thread Uros Bizjak
On Thu, Jun 19, 2025 at 9:37 AM Uros Bizjak wrote: > > On Wed, Jun 18, 2025 at 4:12 PM Cui, Lili wrote: > > > > > > > > > -Original Message- > > > From: Uros Bizjak > > > Sent: Wednesday, June 18, 2025 9:22 PM > > > To: Cui, Lili > > > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao ; > > > h

[PATCH 0/1] Adapt unwinder to Linux's SME signal behaviour

2025-06-19 Thread Yury Khrustalev
This patch aligns the unwinder with the recent changes in Linux SME signal behaviour. Regression checked on AArch64 and no regressions have been found. OK for trunk? base commit: 20f59301851 --- Richard Sandiford (1): aarch64: Adapt unwinder to linux's SME signal behaviour gcc/doc/sourcebuild.t

Re: [PATCH v1 0/3] RISC-V: Combine vec_duplicate + vminu.vv to vminu.vx on GR2VR cost

2025-06-19 Thread Kito Cheng
LGTM On Thu, Jun 19, 2025 at 6:27 PM wrote: > > From: Pan Li > > This patch would like to introduce the combine of vec_dup + vminu.vv > into vminu.vx on the cost value of GR2VR. The late-combine will take > place if the cost of GR2VR is zero, or reject the combine if non-zero > like 1, 2, 15 in

Re: [PATCH 2/2] or1k: Improve If-Conversion by delaying cbranch splits

2025-06-19 Thread Stafford Horne
On Thu, Jun 19, 2025 at 02:14:26PM +0100, Stafford Horne wrote: > When working on PR120587 I found that the ce1 pass was not able to > properly optimize branches on OpenRISC. This is because of the early > splitting of "compare" and "branch" instructions during the expand pass. > > Convert the cb

Re: [PATCH v2] RISC-V: Add generic tune as default.

2025-06-19 Thread Dongyan Chen
Thanks, I found that it can also be solved by changing the default mtune in file "configure" of riscv-gnu-toolchain and I will prepare a PR to riscv-gnu-toolchain repo. Dongyan Chen On 2025/6/19 15:55, Kito Cheng wrote: Thanks, pushed with one minor change. Robin has mentioned that maybe we

[PATCH v2 0/7] LoongArch: Add support for _BitInt [PR117599]

2025-06-19 Thread Yang Yujie
Changes from v1: 1. Split into smaller patches with more comments. 2. Do not change gimple-lower-bitint.cc. Optimizations that rely on large/huge _BitInts being extended in memory will be posted in other series. Also, the bitint_extend state is not cached outside of gimple-lower-biti

[PATCH v2 2/7] bitint: Allow different limb_mode and abi_limb_mode in more cases

2025-06-19 Thread Yang Yujie
abi_limb_mode and limb_mode were asserted to be the same when the target has different endianness for limbs in _BitInts and words in objects. Otherwise, this assertion also held when the TYPE_PRECISION of _BitInt type being laid out is less than or equal to the precision of abi_limb_mode. But in

[PATCH v2 3/7] bitint: Extend the new value before atomic exchange

2025-06-19 Thread Yang Yujie
For targets that have extended bitints, we need to ensure these are extended before writing to the memory, including via atomic exchange. gcc/c-family/ChangeLog: * c-common.cc (resolve_overloaded_atomic_exchange): Extend _BitInts before atomic exchange if needed. (resolve_

[PATCH v2 7/7] expand: Reduce unneeded _BitInt extensions

2025-06-19 Thread Yang Yujie
For targets that set the "extended" flag in TARGET_C_BITINT_TYPE_INFO, we assume small _BitInts to be internally extended after arithmetic operations. In this case, an extra extension during RTL expansion can be avoided. gcc/ChangeLog: * expr.cc (expand_expr_real_1): Do not call r

[PATCH v2 6/7] bitint: Allow unused bits when testing extended _BitInt ABIs

2025-06-19 Thread Yang Yujie
In LoongArch psABI, large _BitInt(N) (N > 64) objects are only extended to fill the highest 8-byte chunk that contains any used bit, but the size of such a large _BitInt type is a multiple of their 16-byte alignment. So there may be an entire unused 8-byte chunk that is not filled by extension, an
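The size arithmetic described here can be worked through with a small sketch. It assumes only what the quoted text states (N > 64, extension up to the highest used 8-byte chunk, total size a multiple of the 16-byte alignment); the struct and function names are hypothetical and are not taken from the patch.

```cpp
#include <cassert>

// Hypothetical summary of how a large _BitInt(N) occupies memory.
struct bitint_layout
{
  unsigned size_bytes;      // total object size, multiple of 16
  unsigned extended_bytes;  // bytes covered by value bits plus extension
  unsigned unused_bytes;    // trailing bytes not filled by extension
};

// Assumes n > 64, 8-byte chunks and 16-byte alignment, as in the text above.
constexpr bitint_layout large_bitint_layout (unsigned n)
{
  unsigned extended = (n + 63) / 64 * 8;       // round bits up to 8-byte chunks
  unsigned size = (extended + 15) / 16 * 16;   // round bytes up to 16
  return { size, extended, size - extended };
}
```

Under these assumptions _BitInt(129) needs three 8-byte chunks (24 bytes) but occupies 32 bytes, leaving one whole 8-byte chunk unused, whereas _BitInt(128) fills its 16 bytes exactly.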

Re: [PATCH v2 5/7] LoongArch: Add support for _BitInt [PR117599]

2025-06-19 Thread Jakub Jelinek
On Thu, Jun 19, 2025 at 07:13:07PM +0800, Yang Yujie wrote: > On Thu, Jun 19, 2025 at 12:32:57PM GMT, Jakub Jelinek wrote: > > As mentioned in another mail, please follow what aarch64 is doing here (at > > least unless you explain how it violates your psABI): > > if (n <= 8) > > info->limb_mo

[PATCH] x86: Get the widest vector mode from MOVE_MAX

2025-06-19 Thread H.J. Lu
Since MOVE_MAX defines the maximum number of bytes that an instruction can move quickly between memory and registers, use it to get the widest vector mode in vector loop when inlining memcpy and memset. gcc/ PR target/120708 * config/i386/i386-expand.cc (ix86_expand_set_or_cpymem): Use MOVE_MAX t

[PATCH] cobol: Fix build on 32-bit Solaris [PR120621]

2025-06-19 Thread Rainer Orth
Bootstrap with COBOL included is currently broken for 32-bit-default Solaris configurations. There are three issues: gcc/cobol/lexio.cc: In static member function ‘static std::FILE* cdftext::lex_open(const char*)’: gcc/cobol/lexio.cc:1527:55: error: format ‘%d’ expects argument of type ‘int’, b
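Format mismatches of this kind typically come from printing a size with a specifier that only happens to match on some targets. The following is a generic sketch, not the change made by this patch: describe_size is a hypothetical helper, and it assumes the value being printed is a std::size_t.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: format a size portably.
std::string describe_size (std::size_t n)
{
  char buf[64];
  // %zu matches std::size_t on every target; %d or %ld only match where
  // size_t happens to have the same width and type as int or long.
  std::snprintf (buf, sizeof (buf), "Size is %zu", n);
  return buf;
}
```

An explicit cast to a fixed type paired with its matching specifier is another common way to keep such calls warning-free across 32-bit and 64-bit hosts.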

Re: [PATCH v2 5/7] LoongArch: Add support for _BitInt [PR117599]

2025-06-19 Thread Yang Yujie
On Thu, Jun 19, 2025 at 01:18:10PM GMT, Jakub Jelinek wrote: > On Thu, Jun 19, 2025 at 07:13:07PM +0800, Yang Yujie wrote: > > On Thu, Jun 19, 2025 at 12:32:57PM GMT, Jakub Jelinek wrote: > > > As mentioned in another mail, please follow what aarch64 is doing here (at > > > least unless you explain

[PATCH] [lra] recompute ranges upon disabling fp2sp elimination [PR120424]

2025-06-19 Thread Alexandre Oliva
If the frame size grows to nonzero, arm_frame_pointer_required may flip to true under -fstack-clash-protection -fnon-call-exceptions, and that may disable the fp2sp elimination part-way through lra. If pseudos had got assigned to the frame pointer register before that, they have to be spilled, a

[PATCH] c++: Implement part of C++26 P2686R4 - constexpr structured bindings [PR117784]

2025-06-19 Thread Jakub Jelinek
Hi! The following patch implements the constexpr structured bindings part of the P2686R4 paper, so the [dcl.pre], [dcl.struct.bind], [dcl.constinit] and first hunk in [dcl.constexpr] changes. The paper doesn't have a feature test macro and the constexpr structured binding part of it seems more-les

[PATCH 1/1] aarch64: Adapt unwinder to linux's SME signal behaviour

2025-06-19 Thread Yury Khrustalev
From: Richard Sandiford SME uses a lazy save system to manage ZA. The idea is that, if a function with ZA state wants to call a "normal" function, it can leave its state in ZA and instead set up a lazy save buffer. If, unexpectedly, that normal function contains a nested use of ZA, that nested u

Re: [PATCH v2] RISC-V: Add generic tune as default.

2025-06-19 Thread Kito Cheng
I guess we should implement an auto generated document for mcpu and mtune document like what we do for -march. Dongyan, do you have interest to implement that? :) On Thu, Jun 19, 2025 at 10:02 PM Jeff Law wrote: > > > > On 6/19/25 1:55 AM, Kito Cheng wrote: > > Thanks, pushed with one minor chan

Re: [PATCH] cobol: Fix build on 32-bit Solaris [PR120621]

2025-06-19 Thread James K. Lowden
On Thu, 19 Jun 2025 13:53:06 +0200 Jakub Jelinek wrote: > On Thu, Jun 19, 2025 at 01:38:06PM +0200, Rainer Orth wrote: > > --- a/gcc/cobol/genapi.cc > > +++ b/gcc/cobol/genapi.cc > > @@ -957,7 +957,7 @@ parser_compile_ecs( const std::vector > { > > SHOW_PARSE_HEADER > > char ach[64

[PATCH] haifa-sched: Elide leftover instruction after breaking dependence [PR120459]

2025-06-19 Thread Paul-Antoine Arras
*** Context *** The Haifa scheduler is able to recognise some cases where a dependence D between two instructions -- a producer P and a consumer C -- can be broken. In particular, if C contains a memory reference M and P is an increment, then M can be replaced with its incremented version M' w

[PATCH] [arm] require armv7 support for [PR120424] (was: Re: [PATCH, FYI?] [arm] [vxworks] require thumb2 for pr120424.C)

2025-06-19 Thread Alexandre Oliva
On Jun 19, 2025, Alexandre Oliva wrote: > Or maybe the requirements for this testcase should be stated as > arm_arch_v7? I'd have to add arm_arch_v7 to > check_effective_target_arm_arch_FUNC_ok et al, if there aren't reasons > why it's not there, but I'd be happy to do that, and use dg-add-optio

[PATCH 2/2] or1k: Improve If-Conversion by delaying cbranch splits

2025-06-19 Thread Stafford Horne
When working on PR120587 I found that the ce1 pass was not able to properly optimize branches on OpenRISC. This is because of the early splitting of "compare" and "branch" instructions during the expand pass. Convert the cbranch* instructions from define_expand to define_insn_and_split. This dal

[PATCH 1/2] or1k: Implement *extendbisi* to fix ICE in convert_mode_scalar [PR120587]

2025-06-19 Thread Stafford Horne
After commit 2dcc6dbd8a0 ("emit-rtl: Use simplify_subreg_regno to validate hardware subregs [PR119966]") the OpenRISC port is broken again. Add extend* instruction patterns for the SR_F pseudo registers to avoid having to use the subreg conversions which no longer work. gcc/ChangeLog: P

[PATCH 0/2] OpenRISC fixes for PR120587

2025-06-19 Thread Stafford Horne
This is a small series to fix If-Conversion on OpenRISC after the build was broken by recent subreg changes. Stafford Horne (2): or1k: Implement *extendbisi* to fix ICE in convert_mode_scalar [PR120587] or1k: Improve If-Conversion by delaying cbranch splits gcc/config/or1k/or1k.cc |

Re: [PATCH v2] RISC-V: Add generic tune as default.

2025-06-19 Thread Jeff Law
On 6/19/25 1:55 AM, Kito Cheng wrote: Thanks, pushed with one minor change. Robin has mentioned that maybe we could name it generic-in-order, but I think this could be a follow up patch if we want, I would like to have -mtune=generic even though we added that since clang/LLVM already provided

[PATCH v5 05/10] AArch64: make `far_branch` attribute a boolean

2025-06-19 Thread Karl Meakin
The `far_branch` attribute only ever takes the values 0 or 1, so make it a `no/yes` valued string attribute instead. gcc/ChangeLog: * config/aarch64/aarch64.md (far_branch): Replace 0/1 with no/yes. (aarch64_bcond): Handle rename. (aarch64_cbz1): Likewise.

[PATCH v5 06/10] AArch64: recognize `+cmpbr` option

2025-06-19 Thread Karl Meakin
Add the `+cmpbr` option to enable the FEAT_CMPBR architectural extension. gcc/ChangeLog: * config/aarch64/aarch64-option-extensions.def (cmpbr): New option. * config/aarch64/aarch64.h (TARGET_CMPBR): New macro. * doc/invoke.texi (cmpbr): New option. --- gcc/config

[PATCH v5 00/10] AArch64: CMPBR support

2025-06-19 Thread Karl Meakin
AArch64: CMPBR support New changes in this series: * Moved 55d981eb91a (adding `%j`/`%J` format specifiers) before 6cc06968320 (adding rules for generating CB instructions). Every commit in the series should now produce a correct compiler. * Reduce excessive diff context by not passing `--func

[PATCH v5 04/10] AArch64: add constants for branch displacements

2025-06-19 Thread Karl Meakin
Extract the hardcoded values for the minimum PC-relative displacements into named constants and document them. gcc/ChangeLog: * config/aarch64/aarch64.md (BRANCH_LEN_P_128MiB): New constant. (BRANCH_LEN_N_128MiB): Likewise. (BRANCH_LEN_P_1MiB): Likewise. (BRANCH_LE

[PATCH v5 09/10] AArch64: rules for CMPBR instructions

2025-06-19 Thread Karl Meakin
Add rules for lowering `cbranch4` to CBB/CBH/CB when CMPBR extension is enabled. gcc/ChangeLog: * config/aarch64/aarch64.md (BRANCH_LEN_P_1Kib): New constant. (BRANCH_LEN_N_1Kib): Likewise. (cbranch4): Emit CMPBR instructions if possible. (cbranch4): New expand rul

[PATCH v5 10/10] AArch64: make rules for CBZ/TBZ higher priority

2025-06-19 Thread Karl Meakin
Move the rules for CBZ/TBZ to be above the rules for CBB/CBH/CB. We want them to have higher priority because they can express larger displacements. gcc/ChangeLog: * config/aarch64/aarch64.md (aarch64_cbz1): Move above rules for CBB/CBH/CB. (*aarch64_tbz1): Likewise. gcc/

[PATCH v5 02/10] AArch64: reformat branch instruction rules

2025-06-19 Thread Karl Meakin
Make the formatting of the RTL templates in the rules for branch instructions more consistent with each other. gcc/ChangeLog: * config/aarch64/aarch64.md (cbranch4): Reformat. (cbranchcc4): Likewise. (condjump): Likewise. (*compare_condjump): Likewise. (aar

[PATCH v5 08/10] AArch64: precommit test for CMPBR instructions

2025-06-19 Thread Karl Meakin
Commit the test file `cmpbr.c` before rules for generating the new instructions are added, so that the changes in codegen are more obvious in the next commit. gcc/testsuite/ChangeLog: * lib/target-supports.exp: Add `cmpbr` to the list of extensions. * gcc.target/aarch64/cmpbr.c: N

[PATCH v5 01/10] AArch64: place branch instruction rules together

2025-06-19 Thread Karl Meakin
The rules for conditional branches were spread throughout `aarch64.md`. Group them together so it is easier to understand how `cbranch4` is lowered to RTL. gcc/ChangeLog: * config/aarch64/aarch64.md (condjump): Move. (*compare_condjump): Likewise. (aarch64_cb1): Likewise.

[PATCH v5 07/10] AArch64: add `%j` and `%J` format specifiers

2025-06-19 Thread Karl Meakin
The CB family of instructions does not support using the CS or CC condition codes; instead the synonyms HS and LO must be used. GCC has traditionally used the CS and CC names. To work around this while avoiding test churn, add new `j` and `J` format specifiers; they will be used in the next commit
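The renaming described here amounts to a small lookup. The sketch below is illustrative only: cb_condition_name is a hypothetical helper and is not how the `%j`/`%J` specifiers are implemented in the patch. It relies only on the architectural fact stated above, that HS and LO are synonyms for CS and CC.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: map a condition name to the spelling that the
// CB family of instructions accepts.
std::string cb_condition_name (const std::string &cc)
{
  if (cc == "cs")
    return "hs";   // carry set is the same condition as unsigned higher-or-same
  if (cc == "cc")
    return "lo";   // carry clear is the same condition as unsigned lower
  return cc;       // every other condition keeps its name
}
```

Keeping the translation in one place means existing output for other instructions can continue to use the traditional CS and CC spellings.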

[PATCH v5 03/10] AArch64: rename branch instruction rules

2025-06-19 Thread Karl Meakin
Give the `define_insn` rules used in lowering `cbranch4` to RTL more descriptive and consistent names: from now on, each rule is named after the AArch64 instruction that it generates. Also add comments to document each rule. gcc/ChangeLog: * config/aarch64/aarch64.md (condjump): Rename to

Re: [PATCH] x86: Fix shrink wrap separate ICE under -fstack-clash-protection [PR120697]

2025-06-19 Thread Uros Bizjak
On Wed, Jun 18, 2025 at 4:12 PM Cui, Lili wrote: > > > > > -Original Message- > > From: Uros Bizjak > > Sent: Wednesday, June 18, 2025 9:22 PM > > To: Cui, Lili > > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao ; > > hongjiu...@intel.com > > Subject: Re: [PATCH] x86: Fix shrink wrap separate

Re: [PATCH v2] RISC-V: Add generic tune as default.

2025-06-19 Thread Kito Cheng
Thanks, pushed with one minor change. Robin has mentioned that maybe we could name it generic-in-order, but I think this could be a follow up patch if we want, I would like to have -mtune=generic even though we added that since clang/LLVM already provided -mtune=generic :) > diff --git > a/gcc/t

Re: [PATCH] fortran: Statically initialize length of SAVEd character arrays

2025-06-19 Thread Mikael Morin
On 18/06/2025 at 23:22, Jerry D wrote: On 6/18/25 2:02 PM, Mikael Morin wrote: From: Mikael Morin Regression-tested on x86_64-pc-linux-gnu. OK for master? Was there a PR for this? or something you just ran into? I'm not aware of any PR. I was trying to create a testcase exercising t

Re: [PATCH v4] x86: Enable *mov_(and|or) only for -Oz

2025-06-19 Thread Uros Bizjak
On Thu, Jun 19, 2025 at 9:01 AM Hongtao Liu wrote: > > On Wed, Jun 18, 2025 at 6:38 PM H.J. Lu wrote: > > > > commit ef26c151c14a87177d46fd3d725e7f82e040e89f > > Author: Roger Sayle > > Date: Thu Dec 23 12:33:07 2021 + > > > > x86: PR target/103773: Fix wrong-code with -Oz from pop to

Re: Improve static and AFDO profile combination

2025-06-19 Thread Jan Hubicka
> In an internal application I noticed that the ipa-inliner is quite > sensitive to AFDO counts and that seems to make the performance worse. > Did you notice this? This was before some of your changes. I will try > again. The cases I looked into were mixture of late inlining and ipa-cp cloning be

Re: [PATCH v2 6/7] bitint: Allow unused bits when testing extended _BitInt ABIs

2025-06-19 Thread Jakub Jelinek
On Thu, Jun 19, 2025 at 05:59:09PM +0800, Yang Yujie wrote: > In LoongArch psABI, large _BitInt(N) (N > 64) objects are only > extended to fill the highest 8-byte chunk that contains any used bit, > but the size of such a large _BitInt type is a multiple of their > 16-byte alignment. So there may

Re: [PATCH v2 7/7] expand: Reduce unneeded _BitInt extensions

2025-06-19 Thread Jakub Jelinek
On Thu, Jun 19, 2025 at 05:59:10PM +0800, Yang Yujie wrote: > --- a/gcc/expr.cc > +++ b/gcc/expr.cc > @@ -11268,6 +11268,10 @@ expand_expr_real_1 (tree exp, rtx target, > machine_mode tmode, >tree ssa_name = NULL_TREE; >gimple *g; > > + type = TREE_TYPE (exp); > + mode = TYPE_MODE (typ

[PATCH v1] RISC-V: Fix ICE for expand_select_vldi [PR120652]

2025-06-19 Thread pan2 . li
From: Pan Li There will be one ICE during the expand pass, with a backtrace similar to the one below. during RTL pass: expand red.c: In function 'main': red.c:20:5: internal compiler error: in require, at machmode.h:323 20 | int main() { | ^~~~ 0x2e0b1d6 internal_error(char const*, ...) ../../../gcc/

Re: [PATCH v4 08/10] AArch64: rules for CMPBR instructions

2025-06-19 Thread Richard Sandiford
Karl Meakin writes: > diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc > index be5a97294dd..1d4ae73a963 100644 > --- a/gcc/config/aarch64/aarch64.cc > +++ b/gcc/config/aarch64/aarch64.cc > @@ -944,16 +944,50 @@ static const char * > svpattern_token (enum aarch64_svpatter

RE: [PATCH v4] x86: Enable *mov_(and|or) only for -Oz

2025-06-19 Thread Roger Sayle
Looks good to me. Sorry for any inconvenience. Cheers, Roger > -Original Message- > From: Hongtao Liu > Sent: 19 June 2025 08:01 > To: H.J. Lu > Cc: GCC Patches ; Uros Bizjak ; > Hongtao Liu ; Roger Sayle > > Subject: Re: [PATCH v4] x86: Enable *mov_(and|or) only for -Oz > > On Wed,

[PATCH v2 5/7] LoongArch: Add support for _BitInt [PR117599]

2025-06-19 Thread Yang Yujie
This patch adds support for C23's _BitInt for LoongArch. From the LoongArch psABI[1]: > _BitInt(N) objects are stored in little-endian order in memory > and are signed by default. > > For N ≤ 64, a _BitInt(N) object have the same size and alignment > of the smallest fundamental integral type tha

[PATCH v2 1/7] bitint: Allow mode promotion of _BitInt types

2025-06-19 Thread Yang Yujie
For targets that treat small _BitInts like the fundamental integral types, we should allow their machine modes to be promoted in the same way. gcc/ChangeLog: * explow.cc (promote_function_mode): Add a case for small/medium _BitInts. (promote_mode): Same. --- gcc/explow.cc

[PATCH v2 4/7] LoongArch: Prioritize target-specific makefile fragments

2025-06-19 Thread Yang Yujie
libgcc/ChangeLog: * config.host: Remove unused code. Include LoongArch-specific tmake_files after the OS-specific ones. --- libgcc/config.host | 31 --- 1 file changed, 12 insertions(+), 19 deletions(-) diff --git a/libgcc/config.host b/libgcc/config.h

Re: [PATCH v2 1/2] emit-rtl: Allow extra checks for paradoxical subregs [PR119966]

2025-06-19 Thread Stafford Horne
On Wed, Jun 18, 2025 at 09:41:25PM +0300, Dimitar Dimitrov wrote: > On Wed, Jun 18, 2025 at 04:06:14PM +0100, Stafford Horne wrote: > > On Sat, Jun 07, 2025 at 06:53:28PM +0300, Dimitar Dimitrov wrote: > > > On Sat, Jun 07, 2025 at 11:38:46AM +0100, Stafford Horne wrote: > > > > On Fri, Jun 06, 202

Re: [PATCH v2 3/7] bitint: Extend the new value before atomic exchange

2025-06-19 Thread Jakub Jelinek
On Thu, Jun 19, 2025 at 05:59:06PM +0800, Yang Yujie wrote: > For targets that have extended bitints, we need to ensure these > are extended before writing to the memory, including via atomic > exchange. > > gcc/c-family/ChangeLog: > > * c-common.cc (resolve_overloaded_atomic_exchange): Ext

[PATCH v1 2/3] RISC-V: Add test for vec_duplicate + vminu.vv combine case 0 with GR2VR cost 0, 2 and 15

2025-06-19 Thread pan2 . li
From: Pan Li Add asm dump check and run test for vec_duplicate + vminu.vv combine to vminu.vx, with GR2VR costs of 0, 2 and 15. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u16.c: Add asm check. * gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u32.c: Ditto.

[PATCH v1 1/3] RISC-V: Combine vec_duplicate + vminu.vv to vminu.vx on GR2VR cost

2025-06-19 Thread pan2 . li
From: Pan Li This patch would like to combine the vec_duplicate + vminu.vv to the vminu.vx. For example, as in the code below. The related pattern will depend on the cost of vec_duplicate from GR2VR. Then the late-combine will take action if the cost of GR2VR is zero, and reject the combination if th

[PATCH v1 0/3] RISC-V: Combine vec_duplicate + vminu.vv to vminu.vx on GR2VR cost

2025-06-19 Thread pan2 . li
From: Pan Li This patch would like to introduce the combine of vec_dup + vminu.vv into vminu.vx on the cost value of GR2VR. The late-combine will take place if the cost of GR2VR is zero, or reject the combine if non-zero like 1, 2, 15 in test. There will be two cases for the combine: Case 0:
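For readers unfamiliar with the pattern, the kind of source loop that pairs a broadcast scalar with an element-wise unsigned minimum looks roughly like the sketch below. This is not the test case from the series: both function names are hypothetical, whether a compiler actually emits vminu.vx for the loop depends on the target options and cost model, and late_combine_accepts merely restates the zero-cost rule described above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Element-wise unsigned minimum of a vector and a scalar. When vectorized,
// the scalar x is conceptually broadcast (vec_duplicate) and then combined
// with each element by an unsigned min.
void min_with_scalar (std::uint32_t *out, const std::uint32_t *in,
                      std::uint32_t x, std::size_t n)
{
  for (std::size_t i = 0; i < n; i++)
    out[i] = in[i] < x ? in[i] : x;
}

// Hypothetical restatement of the rule in the cover letter: the combine
// happens only when moving a scalar into a vector register costs nothing.
bool late_combine_accepts (int gr2vr_cost)
{
  return gr2vr_cost == 0;
}
```

The non-zero costs used in the tests (1, 2 and 15) all fall on the rejecting side of this rule.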

Re: [PATCH v2 1/7] bitint: Allow mode promotion of _BitInt types

2025-06-19 Thread Jakub Jelinek
On Thu, Jun 19, 2025 at 05:59:04PM +0800, Yang Yujie wrote: > For targets that treat small _BitInts like the fundamental > integral types, we should allow their machine modes to be promoted > in the same way. > > gcc/ChangeLog: > > * explow.cc (promote_function_mode): Add a case for >

Re: [PATCH v2 2/7] bitint: Allow different limb_mode and abi_limb_mode in more cases

2025-06-19 Thread Jakub Jelinek
On Thu, Jun 19, 2025 at 05:59:05PM +0800, Yang Yujie wrote: > abi_limb_mode and limb_mode were asserted to be the same when > the target has different endianness for limbs in _BitInts > and words in objects. Otherwise, this assertion also held when the > TYPE_PRECISION of _BitInt type being laid o

Re: [RFC PATCH] gimple-simulate: Add a gimple IR interpreter/simulator

2025-06-19 Thread Mikael Morin
On 18/06/2025 at 16:51, Richard Biener wrote: On Wed, Jun 18, 2025 at 11:23 AM Mikael Morin wrote: From: Mikael Morin Hello, I'm proposing here an interpreter/simulator of the gimple IR. It proved useful for me to debug complicated testcases, where the misbehaviour is not obvious if you j

[PATCH v1 3/3] RISC-V: Add test for vec_duplicate + vminu.vv combine case 1 with GR2VR cost 0, 1 and 2

2025-06-19 Thread pan2 . li
From: Pan Li Add asm dump check test for vec_duplicate + vminu.vv combine to vminu.vx, with GR2VR costs of 0, 1 and 2. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u16.c: Add asm check for vminu.vx combine. * gcc.target/riscv/rvv/autovec/vx_vf/vx

Re: [PATCH] [lra] force reg update after spilling to memory [PR120424]

2025-06-19 Thread Georg-Johann Lay
On 14.06.25 at 12:45, Georg-Johann Lay wrote: This patch introduces an ICE in lra-eliminations.cc:1200 for an existing test case. Please ignore this mail. I missed that https://gcc.gnu.org/pipermail/gcc-patches/2025-June/685870.html fixes the problem. Johann In $builddir/gcc: $ make -k

Re: [PATCH v2 5/7] LoongArch: Add support for _BitInt [PR117599]

2025-06-19 Thread Jakub Jelinek
On Thu, Jun 19, 2025 at 05:59:08PM +0800, Yang Yujie wrote: > +/* Implement TARGET_C_BITINT_TYPE_INFO. > + Return true if _BitInt(N) is supported and fill its details into *INFO. > */ > +bool > +loongarch_bitint_type_info (int n, struct bitint_info *info) > +{ > + if (n <= 8) > +info->limb

Re: [PATCH] fortran: Statically initialize length of SAVEd character arrays

2025-06-19 Thread Mikael Morin
On 18/06/2025 at 23:50, Thomas Koenig wrote: Hi Mikael, Regression-tested on x86_64-pc-linux-gnu. OK for master? Just wondering... how does this relate to the recent fix of PR120483 by Andre? Is this also a regression? If so, maybe a backport would be in order. Best regards Thoma

Re: [PATCH v2 5/7] LoongArch: Add support for _BitInt [PR117599]

2025-06-19 Thread Yang Yujie
On Thu, Jun 19, 2025 at 12:32:57PM GMT, Jakub Jelinek wrote: > As mentioned in another mail, please follow what aarch64 is doing here (at > least unless you explain how it violates your psABI): > if (n <= 8) > info->limb_mode = QImode; > else if (n <= 16) > info->limb_mode = HImode; >

Re: [PATCH] cobol: Fix build on 32-bit Solaris [PR120621]

2025-06-19 Thread Jakub Jelinek
On Thu, Jun 19, 2025 at 01:38:06PM +0200, Rainer Orth wrote: > --- a/gcc/cobol/genapi.cc > +++ b/gcc/cobol/genapi.cc > @@ -957,7 +957,7 @@ parser_compile_ecs( const std::vector { > SHOW_PARSE_HEADER > char ach[64]; > -snprintf(ach, sizeof(ach), " Size is %ld; retval is %p", > +

Re: [PATCH] expand: Align PARM_DECLs again to at least BITS_PER_WORD if possible [PR120689]

2025-06-19 Thread Richard Biener
> On 19.06.2025 at 08:53, Jakub Jelinek wrote: > > On Thu, Jun 19, 2025 at 08:33:10AM +0200, Richard Biener wrote: >> How does this interact with -mincoming-stack-boundary? Is this, and thus >> when we need stack realignment, visible here? Do we know whether we need >> to realign the stack

Re: [PATCH v4] x86: Enable *mov_(and|or) only for -Oz

2025-06-19 Thread Hongtao Liu
On Wed, Jun 18, 2025 at 6:38 PM H.J. Lu wrote: > > commit ef26c151c14a87177d46fd3d725e7f82e040e89f > Author: Roger Sayle > Date: Thu Dec 23 12:33:07 2021 + > > x86: PR target/103773: Fix wrong-code with -Oz from pop to memory. > > added "*mov_and" and extended "*mov_or" to transform > "

Re: [PATCH v2 1/4] RISC-V: Add support for xtheadvector unit-stride segment load/store intrinsics

2025-06-19 Thread Kito Cheng
Hi YunZe: Generally I am open minded to accept vendor extensions, however this patch set really introduces too many patterns... - NUM_INSN_CODES (defined in insn-codes.h) becomes 83625 from 48573. (+72%) - Total lines of insn-emit-*.cc becomes 1749362 from 1055750. (+65%) - Total lines of insn-recog

[Patch] libgomp.texi: Document omp(x)::allocator::*, restructure memory allocator doc (was: [PATCH] Docs: Document omp::allocator::* and ompx::allocator::* allocators.)

2025-06-19 Thread Tobias Burnus
The attached patch does some cleanup to the memory allocation description, which I mainly started as I wondered myself about some details - especially about the pool_size feature. It also includes the documentation about omp::allocator::* by Alex. And, as I proposed by then (cf. below), it mov