RE: [PATCH] vect: Support early break with gswitch statements

2025-09-22 Thread Richard Biener
On Mon, 22 Sep 2025, Tamar Christina wrote: > > -Original Message- > > From: Richard Biener > > Sent: 22 September 2025 12:34 > > To: Pengfei Li > > Cc: gcc-patches@gcc.gnu.org; rguent...@suse.de; Tamar Christina > > > > Subject: Re: [PAT

Re: [PATCH] vect: Support early break with gswitch statements

2025-09-22 Thread Richard Biener
On Thu, Sep 18, 2025 at 2:34 PM Pengfei Li wrote: > > This patch adds vectorization support to early-break loops with gswitch > statements. Such gswitches may come from original switch-case constructs > in the source or the iftoswitch pass which rewrites if conditions with a > chain of comparisons

Re: [PATCH v1 1/2] Match: Add form 5 of unsigned SAT_MUL for mul

2025-09-22 Thread Richard Biener
On Fri, Sep 19, 2025 at 8:58 AM wrote: > > From: Pan Li > > This patch would like to try to match the unsigned > SAT_MUL form 5, aka below: > > #define DEF_SAT_U_MUL_FMT_5(NT, WT) \ > NT __attribute__((noinline))\ > sat_u_mul_##NT##_from_##WT##_fmt_5 (NT
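
For orientation only: the macro body above is cut off, so the exact shape of "form 5" is not visible here. The following is a generic sketch of an unsigned saturating multiply computed in a wider type (the NT/WT idea), not the form the patch matches. The name `sat_u_mul_u8` and the choice of uint8_t/uint16_t are assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical helper, NOT the truncated form-5 macro from the patch:
   multiply in a wider type (WT = uint16_t), then clamp the result to
   the maximum of the narrow type (NT = uint8_t).  */
static inline uint8_t
sat_u_mul_u8 (uint8_t a, uint8_t b)
{
  /* 255 * 255 = 65025 still fits in 16 bits, so the product is exact.  */
  uint16_t wide = (uint16_t) ((uint16_t) a * (uint16_t) b);
  return wide > UINT8_MAX ? UINT8_MAX : (uint8_t) wide;
}
```

With this sketch, `sat_u_mul_u8 (16, 16)` saturates to 255 while `sat_u_mul_u8 (3, 4)` returns the plain product 12.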

Re: [PATCH v2] vect: Add vectorization logic for FLOOR_{MOD, DIV}[PR104116]

2025-09-22 Thread Richard Biener
On Thu, Sep 18, 2025 at 10:31 AM Avinash Jayakar wrote: > > Hi, > > Following is version 2 of the patch proposed for master aiming to fix > PR104116. This has been bootstrapped and regtested on powerpc64le with > regression failures. > Kindly review. > > Just had one question. > If I have to imple

[PATCH] tree-optimization/122016 - PRE insertion breaks abnormal coalescing

2025-09-22 Thread Richard Biener
When PRE asks VN to simplify a NARY but not insert, that bypasses the abnormal guard in maybe_push_res_to_seq and we blindly accept new uses of abnormals. The following fixes this. Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed. PR tree-optimization/122016 * tree-ssa

[PATCH] tree-optimization/122023 - rotate pattern with reductions

2025-09-22 Thread Richard Biener
The following disables the use of rotate patterns with reductions since it breaks then single rotate SSA use-def chain constraints. Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed. PR tree-optimization/122023 * tree-vect-patterns.cc (vect_recog_rotate_pattern): Disable

Re: [PATCH] [x86] Disable vect unroll for znver2.

2025-09-22 Thread Richard Biener
On Mon, Sep 22, 2025 at 4:22 AM liuhongt wrote: > > Since it regressed SPEC performance (refer to PR121994), I guess > it's related to register pressure and can be tuned by adjusting > reduc_lat_mult_thr. I don't have a Zen2 machine, so for simplicity, I'll > just disable unroll in vectorizer for Zen2.

Re: [PATCH] maintainer-scripts: add gen_gcc_docs.sh

2025-09-22 Thread Richard Biener
On Sun, Sep 21, 2025 at 5:20 PM Mark Wielaard wrote: > > Hi Arsen, > > Added Jonathan to CC to get his opinion on the libstdc++ part of the > documentation (re)generation. > > On Mon, Sep 08, 2025 at 06:07:48PM +0200, Arsen Arsenović wrote: > > Mark Wielaard writes: > > > > > I think it is a goo

Re: [PATCH] fab/gimple-fold: Move __builtin_constant_p folding to gimple-fold [PR121762]

2025-09-22 Thread Richard Biener
On Sat, Sep 20, 2025 at 5:15 AM Andrew Pinski wrote: > > This is the first patch in removing fold_all_builtins pass. > We want to fold __builtin_constant_p into 0 if we know the argument can't be > a constant. So currently that is done in fab pass (though ranger handles it > now too). > Instead o
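
As background for the folding discussed above, a small sketch of how `__builtin_constant_p` is typically used in source code. The function `scale` is a hypothetical example, not taken from the patch. Whichever way the compiler folds the builtin (to 1 when it sees a constant, to 0 when it knows the argument cannot be one), both branches must compute the same value.

```c
/* Hypothetical example: pick a cheaper form when the compiler can
   prove K is a compile-time constant equal to 2.  The result must not
   depend on how __builtin_constant_p is folded, which is why folding
   it to 0 for a non-constant argument is always safe.  */
static inline unsigned
scale (unsigned x, unsigned k)
{
  if (__builtin_constant_p (k) && k == 2)
    return x << 1;      /* Constant-specialised path.  */
  return x * k;         /* Generic path.  */
}

/* A literal argument is a constant at every optimization level.  */
static inline int
literal_is_constant (void)
{
  return __builtin_constant_p (42);
}
```

Here `scale (5, 2)` and `scale (5, 3)` give 10 and 15 regardless of whether the builtin folded to 1 or 0.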

Re: [PATCH v1] tree-optimization: Fold aggregate assignments to scalar operations [PR99504]

2025-09-22 Thread Richard Biener
On Sat, Sep 20, 2025 at 4:24 AM Andrew Pinski wrote: > > > > On Fri, Sep 19, 2025, 7:08 PM Peter0x44 wrote: >> >> On 2025-09-20 02:33, Andrew Pinski wrote: >> > On Fri, Sep 19, 2025, 6:22 PM Peter Damianov >> > wrote: >> > >> >> This patch implements folding of aggregate assignments (*dest = >>

Re: [PATCH v2] vect: Handle grouped accesses via gather/scatter.

2025-09-20 Thread Richard Biener
On Wed, Sep 17, 2025 at 9:22 AM Robin Dapp wrote: > > > We are supposed to not get into > > > > if (mask_element != index) > > noop_p = false; > > I guess the problem is the vectype mismatch. We're checking the permutation > for e.g. V16QI = {0, 1, 2, 3, 8, 9, 10, 11, ...} which, in

Re: [PATCH v4 3/3]middle-end: Use addhn for compression instead of inclusive OR when reducing comparison values

2025-09-20 Thread Richard Biener
g) > { > if (dump_enabled_p ()) > dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, > @@ -12461,10 +12488,22 @@ vectorizable_early_exit (loop_vec_info loop_vinfo, > stmt_vec_info stmt_info, > >while (workset.length () > 1) > { > - new_temp = make_temp_ssa_name (vectype, NULL, "vexit_reduc"); > tree arg0 = workset.pop (); > tree arg1 = workset.pop (); > - new_stmt = gimple_build_assign (new_temp, BIT_IOR_EXPR, arg0, arg1); > + if (addhn_supported_p && workset.length () == 0) > + { > + new_stmt = gimple_build_call_internal (ifn, 2, arg0, arg1); > + vectype_out = narrow_type; > + new_temp = make_temp_ssa_name (vectype_out, NULL, "vexit_reduc"); > + gimple_call_set_lhs (as_a (new_stmt), new_temp); > + gimple_call_set_nothrow (as_a (new_stmt), true); > + } > + else > + { > + new_temp = make_temp_ssa_name (vectype_out, NULL, "vexit_reduc"); > + new_stmt > + = gimple_build_assign (new_temp, BIT_IOR_EXPR, arg0, arg1); > + } > vect_finish_stmt_generation (loop_vinfo, stmt_info, new_stmt, > &cond_gsi); > workset.quick_insert (0, new_temp); > @@ -12487,6 +12526,7 @@ vectorizable_early_exit (loop_vec_info loop_vinfo, > stmt_vec_info stmt_info, > >gcc_assert (new_temp); > > + tree cst = build_zero_cst (vectype_out); >gimple_cond_set_condition (cond_stmt, NE_EXPR, new_temp, cst); >update_stmt (orig_stmt); > > > > -- Richard Biener SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, Germany; GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

[PATCH] tree-optimization/121720 - missed PRE hoisting

2025-09-20 Thread Richard Biener
The following re-implements the fix for PR84830 where the original fix causes missed optimizations. The issue with PR84830 is that we end up growing ANTIC_IN value set during iteration which happens because we conditionally prune values based on ANTIC_OUT - TMP_GEN expressions. But when ANTIC_OUT

Re: [PATCH v2] vect: Handle grouped accesses via gather/scatter.

2025-09-20 Thread Richard Biener
On Wed, Sep 17, 2025 at 10:30 PM Robin Dapp wrote: > > When trying to unify the vector_vector_composition variants I noticed that > there are even more alignment checks than when I last looked ;) Yeah :/ I don't like the way it's done very much. Having a unified idea of a "punning" type and do

Re: [PATCH v2] vect: Handle grouped accesses via gather/scatter.

2025-09-20 Thread Richard Biener
On Wed, Sep 17, 2025 at 3:38 PM Robin Dapp wrote: > > > For a non-STMT_VINFO_STRIDED_P access the DR_GROUP_SIZE is > > basically the DR_STRIDE, because the DR group models contiguous memory. > > You meant DR_STEP? So if step/stride = 100 and we access the first two > elements at 0, 1, the third i

[PATCH] lto/121935 - visit all DECL_ARGUMENTS in free-lang-data

2025-09-20 Thread Richard Biener
With no longer visiting TREE_CHAIN for decls we have to visit the DECL_ARGUMENT chain manually. LTO bootstrap and regtest running on x86_64-unknown-linux-gnu. I'm not sure whether LTO bootstrap worked before, but hopefully this would have fixed it. Testing non-LTO to be able to push it anyway as

[PATCH] Remove accidentially left if (0) block

2025-09-20 Thread Richard Biener
The following removes a block I added (and disabled again) when developing the PR121720 fix. Bootstrapped on x86_64-unknown-linux-gnu, pushed. * tree-ssa-pre.cc (compute_antic_aux): Remove dead code. --- gcc/tree-ssa-pre.cc | 14 -- 1 file changed, 14 deletions(-) diff --git

Re: [PATCH] forwprop: Add a simple DSE after a clobber

2025-09-20 Thread Richard Biener
On Tue, Sep 16, 2025 at 6:34 AM Andrew Pinski wrote: > > After copy propagation for aggregates patches we might end up with > now: > ``` > tmp = a; > b = a; // was b = tmp; > tmp = {CLOBBER}; > ``` > To help out ESRA, it would be a good idea to remove the `tmp = a` statement as > there is no DSE b

Re: [PATCH] tree-optimization/121720 - missed PRE hoisting

2025-09-19 Thread Richard Biener
On Fri, 19 Sep 2025, Mikael Morin wrote: > Le 18/09/2025 à 09:43, Richard Biener a écrit : > > diff --git a/gcc/tree-ssa-pre.cc b/gcc/tree-ssa-pre.cc > > index 99331730bc2..18b36259cb4 100644 > > --- a/gcc/tree-ssa-pre.cc > > +++ b/gcc/tree-ssa-pre.cc > > @@ -211

[PATCH] Remove DR_GROUP_STORE_COUNT

2025-09-19 Thread Richard Biener
This was only used for non-SLP. Bootstrap and regtest running on x86_64-unknown-linux-gnu. * tree-vectorizer.h (_stmt_vec_info::store_count): Remove. (DR_GROUP_STORE_COUNT): Likewise. * tree-vect-stmts.cc (vect_transform_stmt): Remove non-SLP path. --- gcc/tree-ve

[PATCH] Cleanup vect_get_num_copies API

2025-09-19 Thread Richard Biener
The following removes the dual non-SLP/SLP API in favor of only handling SLP. This also removes the possibility to override vectype of a SLP node with an inconsistent one while still using the SLP nodes number of lanes. This requires adjustment of a few places where such inconsistencies happened.

[PATCH] Remove SLP_TREE_NUMBER_OF_VEC_STMTS

2025-09-19 Thread Richard Biener
The following removes the redundant SLP_TREE_NUMBER_OF_VEC_STMTS, replacing it with vect_get_num_copies. Previously it was already made sure that all setters adhere to that. Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed. * tree-vectorizer.h (_slp_tree::vec_stmts_size): Remo

Re: [RFC PATCH] c++, gimplify: Implement C++26 P2795R5 - Erroneous behavior for uninitialized reads [PR114457]

2025-09-18 Thread Richard Biener
On Thu, 18 Sep 2025, Jakub Jelinek wrote: > On Thu, Sep 18, 2025 at 06:54:06PM +0200, Jason Merrill wrote: > > > There are some regressions caused by the removal of {CLOBBER(bob)} > > > clobbers from the start of certain constructors, e.g. one testcase has > > > struct A > > > { > > >int f,g;

Re: [PATCH 1/2] Fix errno handling for sin and cos [PR80042]

2025-09-18 Thread Richard Biener
On Fri, Sep 19, 2025 at 12:08 AM Peter Damianov wrote: > > POSIX says that sin and cos should set errno to EDOM when infinity is passed > to > them. Make sure this is accounted for in builtins.def. > > When sin/cos are called with values that set errno (like INFINITY), GCC was > incorrectly optim

Re: [PATCH v2] vect: Handle grouped accesses via gather/scatter.

2025-09-18 Thread Richard Biener
On Thu, Sep 18, 2025 at 10:19 PM Robin Dapp wrote: > > > But the vector type we perform the permutation on should be unchanged (it's > > not the punned type but the original type we pun the loaded vector back to)? > > Yeah, I was trying to re-use what we have but I see now that just passing a > di

Re: [PATCH v1 1/2] Widening-Mul: Refine build_and_insert_cast when rhs is cast

2025-09-18 Thread Richard Biener
On Tue, Sep 9, 2025 at 3:31 AM wrote: > > From: Pan Li > > The widening-mul pass will insert a cast for the widen-mul; the > function build_and_insert_cast is designed to take care of it. > > In some cases the optimized gimple has an unnecessary cast, > for example in the code below. > > #define SAT_U_MU

RE: New optabs and IFN required for early break [bikeshed]

2025-09-18 Thread Richard Biener
On Thu, 11 Sep 2025, Tamar Christina wrote: > > -Original Message- > > From: Richard Biener > > Sent: Thursday, September 11, 2025 12:56 PM > > To: Tamar Christina > > Cc: Robin Dapp ; GCC Patches > patc...@gcc.gnu.org>; rdsandif...@googlemail.com

Re: [PATCH] vect: Add vect_analyze_slp_perm_load and add vectype override.

2025-09-18 Thread Richard Biener
On Thu, Sep 18, 2025 at 1:21 PM Robin Dapp wrote: > > Hi, > > This patch adds an explicit variant of vect_transform_slp_perm_load that > just does the analysis part of vect_transform_slp_perm_load. > > I find it slightly clearer to indicate "analysis" in the > function name already rather than hav

[PATCH] tree-optimization/87615 - VN predication is expensive

2025-09-18 Thread Richard Biener
The following restricts the number of locations we register a predicate as valid which avoids the expensive linear search for cases like if (a) A; if (a) B; if (a) C; ... where we register a != 0 as true for locations A, B, C ... in an unlimited way. The patch simply c

Re: [PATCH] expr, tree: Ensure get_range_pos_neg is called only on scalar integral types [PR121904]

2025-09-17 Thread Richard Biener
& SCALAR_INT_MODE_P (mode) > +&& INTEGRAL_TYPE_P (TREE_TYPE (treeop0)) > && (GET_MODE_SIZE (as_a (mode)) > > GET_MODE_SIZE (as_a (GET_MODE (op0 > && get_range_pos_neg (treeop0, > > Jakub > >

Re: [PATCH] forwprop: Add a simple DSE after a clobber

2025-09-17 Thread Richard Biener
On Wed, Sep 17, 2025 at 2:40 AM Andrew Pinski wrote: > > On Tue, Sep 16, 2025 at 5:54 AM Richard Biener > wrote: > > > > On Tue, Sep 16, 2025 at 6:34 AM Andrew Pinski > > wrote: > > > > > > After copy propagation for aggregates patches we might e

Re: [PATCH 1/2 v2] c++: Implement P1494 and P3641 Partial program correctness [PR119060].

2025-09-17 Thread Richard Biener
On Mon, Sep 8, 2025 at 2:22 PM Iain Sandoe wrote: > > > > > On 8 Sep 2025, at 12:46, Richard Biener wrote: > > > > On Mon, Sep 8, 2025 at 1:04 PM Ville Voutilainen > > wrote: > >> > >> On Mon, 8 Sept 2025 at 13:54, Richard Biener

RE: New optabs and IFN required for early break [bikeshed]

2025-09-17 Thread Richard Biener
On Fri, 12 Sep 2025, Tamar Christina wrote: > > -Original Message- > > From: Richard Biener > > Sent: Friday, September 12, 2025 1:40 PM > > To: Tamar Christina > > Cc: Robin Dapp ; GCC Patches > patc...@gcc.gnu.org>; rdsandif...@googlemail.com

Re: [PATCH v2] vect: Handle grouped accesses via gather/scatter.

2025-09-17 Thread Richard Biener
On Mon, Sep 15, 2025 at 8:53 PM Robin Dapp wrote: > > Hi, > > This patch adds gather/scatter handling for grouped access. The idea is > to e.g. replace an access (for uint8_t elements) like > arr[0] > arr[1] > arr[2] > arr[3] > arr[0 + step] > arr[1 + step] > ... > by a gather load
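
The idea in the preview above can be illustrated with scalar code. The message is truncated, so the element type of the gather is an assumption here: this sketch assumes a group of four uint8_t elements is read as one 32-bit element per gather lane. The function names are hypothetical, and `memcpy` stands in for the type punning the vectorizer would do.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Element-wise form: four separate uint8_t loads per group, with the
   groups STEP bytes apart.  */
static void
load_groups_elementwise (const uint8_t *arr, size_t step, size_t ngroups,
                         uint8_t *out)
{
  for (size_t g = 0; g < ngroups; g++)
    for (size_t k = 0; k < 4; k++)
      out[g * 4 + k] = arr[g * step + k];
}

/* Punned form: each group is read as one 32-bit element at byte offset
   g * step.  Each iteration is a scalar stand-in for one gather lane.  */
static void
load_groups_punned (const uint8_t *arr, size_t step, size_t ngroups,
                    uint8_t *out)
{
  for (size_t g = 0; g < ngroups; g++)
    {
      uint32_t lane;
      memcpy (&lane, arr + g * step, sizeof lane);
      memcpy (out + g * 4, &lane, sizeof lane);
    }
}
```

Both functions fill `out` with the same bytes; the second one needs a quarter of the loads, which is what makes the wider gather attractive.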

Re: [PATCH v3] [x86] Exclude fake cross-lane permutation from avx256_avoid_vec_perm.

2025-09-17 Thread Richard Biener
On Mon, Sep 8, 2025 at 10:15 AM liuhongt wrote: > > SLP may treat a broadcast as a kind of vec_perm; the patch checks the > permutation index to exclude those false positives. > > > > > so the vectorizer costs sth with count == 0? I'll see to fix that, > > > > but this also > > > > means the code sh

Re: [PATCH 1/2 v2] c++: Implement P1494 and P3641 Partial program correctness [PR119060].

2025-09-17 Thread Richard Biener
On Mon, Sep 8, 2025 at 3:54 PM Iain Sandoe wrote: > > > > > On 8 Sep 2025, at 14:40, Richard Biener wrote: > > > > On Mon, Sep 8, 2025 at 3:16 PM Jakub Jelinek wrote: > >> > >> On Mon, Sep 08, 2025 at 03:05:58PM +0200, Richard Biener wrote:

Re: [PATCH v2] vect: Handle grouped accesses via gather/scatter.

2025-09-17 Thread Richard Biener
On Wed, Sep 17, 2025 at 1:15 PM Robin Dapp wrote: > > > On Wed, Sep 17, 2025 at 9:22 AM Robin Dapp wrote: > >> > >> > We are supposed to not get into > >> > > >> > if (mask_element != index) > >> > noop_p = false; > >> > >> I guess the problem is the vectype mismatch. We're checkin

[PATCH] [gimplefe] fix SSA operand creation

2025-09-17 Thread Richard Biener
When transitioning gcc.dg/torture/pr84830.c to a GIMPLE testcase to feed the IL into PRE that caused the original issue (and verify it's still there with the fix reverted), I noticed we put up SSA operands before having fully parsed the function and thus with not all variables having the final TREE

Re: [PATCH] Remove SPR/GNR/DMR from avx512_{move, store}_by pieces tune.

2025-09-16 Thread Richard Biener
> > > > > > > > > > > > > > > -Original Message- > > > > > From: Richard Biener > > > > > Sent: Tuesday, September 16, 2025 3:03 PM > > > > > To: Liu, Hongtao > > > > > Cc: gcc-patches@

Re: [PATCH] forwprop: Don't loop on the stmt when optimize_aggr_zeroprop or optimize_agr_copyprop return true

2025-09-16 Thread Richard Biener
On Wed, Sep 17, 2025 at 4:55 AM Andrew Pinski wrote: > > Since now optimize_aggr_zeroprop and optimize_agr_copyprop work by forward > walk to prop > the zero/aggregate and does not change the statement at hand, there is no > reason to > repeat the loop if they do anything. This will prevent pro

Re: [PATCH] uninclude: Add lib/gcc//include as an possible include dir

2025-09-16 Thread Richard Biener
On Wed, Sep 17, 2025 at 3:18 AM Andrew Pinski wrote: > > While running uninclude on PR99912's preprocessed source uninclude > didn't uninclude some of the x86_64 target headers. This was because > `lib/gcc//include` was not noticed as a possible system > include dir. It supported `gcc-lib//includ

Re: [PATCH 2/2] forwprop: Fix up "nop" copies after recent changes [PR121962]

2025-09-16 Thread Richard Biener
On Wed, Sep 17, 2025 at 12:33 AM Andrew Pinski wrote: > > After r16-3887-g597b50abb0d2fc, the check to see if the copy is > a nop copy becomes inefficient. The code goes into an infinite > loop as the copy keeps on being propagated over and over again. > > That is if we have: > ``` > struct s1

Re: [PATCH 1/2] forwprop: Add a quick out for new_src_based_on_copy when both are decls

2025-09-16 Thread Richard Biener
On Wed, Sep 17, 2025 at 12:33 AM Andrew Pinski wrote: > > If both operands that are being compared are decls, operand_equal_p will > already > handle that case so an early out can be done here. > > Bootstrapped and tested on x86_64-linux-gnu. OK. > gcc/ChangeLog: > > * tree-ssa-forwprop

Re: [PATCH v2] vect: Handle grouped accesses via gather/scatter.

2025-09-16 Thread Richard Biener
On Tue, Sep 16, 2025 at 4:15 PM Robin Dapp wrote: > > > Well, what you want to catch now isn't single-lane anymore. But I guess > > since > > we now check the permute before this we can rely on check for n_perms == 0 > > to catch the "no actual permutation required" case? > > I'm seeing n_perms =

Re: [PATCH] Preserve TREE_THIS_NOTRAP during inlining in more cases

2025-09-16 Thread Richard Biener
On Tue, Sep 16, 2025 at 10:30 AM Eric Botcazou wrote: > > > I mean TREE_READONLY on ..._REF nodes. We can't rely on the absence of > > TREE_READONLY on ..._REF meaning the object is writable, so the flag does > > not add any information (but maybe some costing hint that the object is > > definite

Re: [PATCH v2] vect: Handle grouped accesses via gather/scatter.

2025-09-16 Thread Richard Biener
On Tue, Sep 16, 2025 at 3:07 PM Robin Dapp wrote: > > > I think this now conflicts a bit with what I just pushed (sorry). > > > >>&& loop_vinfo) > >> { > >> + unsigned i, j; > >> + bool simple_perm_series = true; > >> + FOR_EACH_VEC_ELT (SLP_TREE_LOAD_PERMUTATION (slp_n

Re: [PATCH][PR104116] Add vectorization logic for floor_{mod,div}

2025-09-16 Thread Richard Biener
On Mon, 15 Sep 2025, Avinash Jayakar wrote: > Hello Richard, > > Thank you for reviewing the patch! I have made changes based on your > comments, but I have some doubts for a few comments as mentioned below. > > On Thu, 2025-09-11 at 13:08 +0200, Richard Biener wrote: >

Re: [PATCH v2 1/2] Match: Add form 5 of unsigned SAT_MUL for widen-mul

2025-09-16 Thread Richard Biener
On Tue, Sep 16, 2025 at 5:22 AM wrote: > > From: Pan Li > > This patch would like to try to match the unsigned > SAT_MUL form 5, aka below: > > #define DEF_SAT_U_MUL_FMT_5(NT, WT) \ > NT __attribute__((noinline))\ > sat_u_mul_##NT##_from_##WT##_fmt_5 (NT

Re: [PATCH] i386/testsuite: Fix scan tree dump in vect-epilogue-4.c

2025-09-16 Thread Richard Biener
scan-tree-dump-times "epilogue loop vectorized using masked > 64 byte vectors" 1 "vect" } } */ > /* { dg-final { scan-tree-dump-not "loop vectorized using 32 byte vectors" > "vect" } } */ >

Re: [PATCH] Remove SPR/GNR/DMR from avx512_{move, store}_by pieces tune.

2025-09-16 Thread Richard Biener
On Tue, Sep 16, 2025 at 9:53 AM Liu, Hongtao wrote: > > > > > -Original Message- > > From: Richard Biener > > Sent: Tuesday, September 16, 2025 3:03 PM > > To: Liu, Hongtao > > Cc: gcc-patches@gcc.gnu.org; hjl.to...@gmail.com > > Subject: Re:

Re: [PATCH] forwprop: Handle memcpy for arguments with respect to copies

2025-09-16 Thread Richard Biener
On Mon, Sep 15, 2025 at 7:20 PM Andrew Pinski wrote: > > This moves the code used in optimize_agr_copyprop_1 (r16-3887-g597b50abb0d) > to handle this same case into its new function and use it inside > optimize_agr_copyprop_arg. This allows to remove more copies that show up only > in arguments. >

Re: [PATCH] Remove SPR/GNR/DMR from avx512_{move, store}_by pieces tune.

2025-09-16 Thread Richard Biener
On Tue, Sep 16, 2025 at 7:53 AM liuhongt wrote: > > From: "hongtao.liu" > > Align move_max with prefer_vector_width for SPR/GNR/DMR to avoid STLF issue. > It's similar to the previous commit. > > commit 6ea25c041964bf63014fcf7bb68fb1f5a0a4e123 > Author: liuhongt > Date: Thu Aug 15 12:54:07 2024 +0

Re: [PATCH] Preserve TREE_THIS_NOTRAP during inlining in more cases

2025-09-16 Thread Richard Biener
On Mon, Sep 15, 2025 at 9:29 PM Eric Botcazou wrote: > > > Yes. So I read the comment in a way to say that TREE_THIS_NOTRAP does not > > mean the reference is writable. In some context we check > > > > || tree_could_trap_p (lhs) > > > > /* tree_could_trap_p is a predicate for

Re: [PATCH v2] vect: Remove type from misalignment hook.

2025-09-15 Thread Richard Biener
On Mon, Sep 15, 2025 at 8:51 PM Robin Dapp wrote: > > Hi, > > This patch removes the type argument from the vector_misalignment hook. > Ever since we switched from element to byte misalignment its > semantics haven't been particularly clear and nowadays it should be > redundant. > > Also, in case

Re: [PATCH] vect: Try signed and unsigned gather offsets.

2025-09-15 Thread Richard Biener
On Thu, Sep 11, 2025 at 12:03 PM Robin Dapp wrote: > > Hi, > > This patch adjusts vect_gather_scatter_fn_p to always check an offset > type with swapped signedness (vs. the original offset argument). > If the target supports the gather/scatter with the new offset type the > offset is converted to

Re: [PATCH v1 2/2] Match: Adjust the unsigned SAT_MUL pattern

2025-09-15 Thread Richard Biener
On Tue, Sep 9, 2025 at 3:31 AM wrote: > > From: Pan Li > > The widen-mul removed the unnecessary cast, thus adjust the > SAT_MUL of widen-mul to a simpler form. OK. > gcc/ChangeLog: > > * match.pd: Remove unnecessary cast of unsigned > SAT_MUL for widen-mul. > > Signed-off-by: Pa

Re: [PATCH] Preserve TREE_THIS_NOTRAP during inlining in more cases

2025-09-15 Thread Richard Biener
On Mon, Sep 15, 2025 at 12:05 PM Eric Botcazou wrote: > > > Yes please. Can we assert the MEM_REF offset is zero and the MEM_REF > > isn't type-punning, aka TREE_TYPE of the MEM_REF is compatible with > > the decls type, or is this not easily possible (again because of the > > placeholders)? > >

Re: [PATCH] Preserve TREE_THIS_NOTRAP during inlining in more cases

2025-09-15 Thread Richard Biener
On Mon, Sep 15, 2025 at 1:44 PM Eric Botcazou wrote: > > > Ah, I wasn't aware of that. This makes TREE_THIS_NOTRAP possibly not > > usable for tree_could_trap_p :/ One could read the docs so that it means > > when you have a read with TREE_THIS_NOTRAP then you can't infer > > from that that writ

Re: [PATCH] tree-optimization/117760 - `a != b` implies that a or b is also non-zero

2025-09-15 Thread Richard Biener
On Thu, Sep 11, 2025 at 3:16 PM Matteo Nicoli wrote: > > I am writing this follow-up email to specify that I executed the tests > contained in this patch on aarch64-arm64-linux-gnu The changelog part of the commit message is formatted wrongly. * gcc/match.pd: added the following optimizations
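
The implication named in the subject can be checked with a few lines of C. The list of patterns added to match.pd is cut off above, so this only illustrates the underlying fact, not the exact simplifications in the patch. The function name is hypothetical.

```c
/* If a != b, the two values cannot both be zero, so (a | b) != 0 is
   already implied.  The combined test below therefore always equals
   the plain comparison a != b.  */
static inline int
ne_and_any_nonzero (unsigned a, unsigned b)
{
  return (a != b) & ((a | b) != 0);
}
```

For every pair of inputs, `ne_and_any_nonzero (a, b)` returns the same value as `a != b`, which is why the second comparison is redundant.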

[PATCH] Unify last two vect_transform_slp_perm_load calls

2025-09-15 Thread Richard Biener
The following unifies the vect_transform_slp_perm_load call done in vectorizable_load with that eventually done in get_load_store_type. On the way it fixes the conditions on which we can allow VMAT_ELEMENTWISE or VMAT_GATHER_SCATTER when there's a SLP permutation (and we arrange to not code generat

Re: [PATCH v2] forwprop: Handle memcpy for copy prop [PR121418, PR121417]

2025-09-15 Thread Richard Biener
On Tue, Sep 9, 2025 at 6:17 AM Andrew Pinski wrote: > > It turns out to be easy to add support for memcpy copy prop when the memcpy > has changed into `MEM` copy. > Instead of rejecting outright we need to figure out that > `a` and `MEM[&a]` are equivalent in terms of address and size. > And then creat

Re: [PATCH] vect: Handle grouped accesses via gather/scatter.

2025-09-15 Thread Richard Biener
On Mon, Sep 15, 2025 at 11:52 AM Robin Dapp wrote: > > > The rest of the GCN hook is quite inconsistent, it says misalignment > > == -1 is OK, is_packed > > is not and then the above ... and gather-scatter is also always OK > > (even if the scalar accesses > > are 'packed' aka not naturally aligne

Re: [PATCH] vect: Handle grouped accesses via gather/scatter.

2025-09-15 Thread Richard Biener
On Mon, Sep 15, 2025 at 9:11 AM Robin Dapp wrote: > > >> In that case I relied on !is_packed for riscv. > > > > I guess it's easiest to keep is_packed then, but is it having the data > > accesses aligned to element _size_ or to element mode alignment? > > For gather a target couldn't distinguish t

Re: [PATCH] Preserve TREE_THIS_NOTRAP during inlining in more cases

2025-09-15 Thread Richard Biener
On Mon, Sep 15, 2025 at 11:23 AM Eric Botcazou wrote: > > > Do we need to ensure that, for the MEM_REF case at least, the DECL is of > > appropriate size with respect to the TREE_TYPE of the MEM_REF and > > the offset (TREE_OPERAND (*tp, 1))? That is, consider > > > > ptr = &too_small_object; >

Re: [PATCH] Preserve TREE_THIS_NOTRAP during inlining in more cases

2025-09-15 Thread Richard Biener
On Mon, Sep 15, 2025 at 8:03 AM Eric Botcazou wrote: > > Hi, > > For parameters passed by reference, the Ada compiler sets TREE_THIS_NOTRAP on > their dereference to prevent tree_could_trap_p from returning true and then > causing a new basic block to be created for every access to them (given tha

Re: [PATCH] match: Simplify `ptr0 - (ptr0 - ptr1)` into ptr1 [PR121921]

2025-09-15 Thread Richard Biener
On Sun, Sep 14, 2025 at 8:17 PM Andrew Pinski wrote: > > This pattern shows up with some C++ code (std::vector) where we get: > ``` > _9 = _201 - _36; > _10 = (long unsigned int) _9; > _11 = -_10; > _12 = _201 + _11; > ``` > > In the original code it was `end - (end - begin)` but with inli
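
The source-level shape behind this pattern, `end - (end - begin)`, can be written out directly. This is a sketch with a hypothetical function name; it shows the identity the simplification relies on, not the match.pd pattern itself.

```c
#include <stddef.h>

/* Subtracting the distance (end - begin) from END gives BEGIN back,
   so the whole expression can be replaced by BEGIN.  Both pointers
   must point into (or one past) the same array.  */
static const int *
rebuild_begin (const int *begin, const int *end)
{
  ptrdiff_t n = end - begin;    /* _9 = _201 - _36 in the dump above.  */
  return end - n;               /* _12 = _201 + -_10.  */
}
```

Calling `rebuild_begin (buf, buf + 8)` on an eight-element array returns `buf`.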

Re: [PATCH] lto/121935 - visit all DECL_ARGUMENTS in free-lang-data

2025-09-14 Thread Richard Biener
On Sun, 14 Sep 2025, Sam James wrote: > Richard Biener writes: > > > With no longer visiting TREE_CHAIN for decls we have to visit > > the DECL_ARGUMENT chain manually. > > > > LTO bootstrap and regtest running on x86_64-unknown-linux-gnu. > > > >

Re: [PATCH][RFC] ipa-free-lang-data: Don't walk into DECL_CHAIN when finding decls/types [PR121865]

2025-09-13 Thread Richard Biener
On Fri, 12 Sep 2025, Nathaniel Shead wrote: > On Fri, Sep 12, 2025 at 09:13:18AM +0200, Richard Biener wrote: > > On Fri, 12 Sep 2025, Nathaniel Shead wrote: > > > > > On Thu, Sep 11, 2025 at 11:08:54AM +0200, Richard Biener wrote: > > > > On Th

Re: [RFA] Fix latent LRA bug

2025-09-13 Thread Richard Biener
> Am 12.09.2025 um 19:03 schrieb Jeff Law : > > Shreya's work to add the addptr pattern on the RISC-V port exposed a latent > bug in LRA. > > We lazily allocate/reallocate the ira_reg_equiv structure and when we do > (re)allocation we'll over-allocate and zero-fill so that we don't have to

[PATCH] Do less redundant vect_transform_slp_perm_load calls

2025-09-12 Thread Richard Biener
The following tries to do vect_transform_slp_perm_load exactly once during analysis and once during transform. There's a 2nd case left during analysis in get_load_store_type. Temporarily this records n_perms in the load-store info and verifies that against the value computed at transform stage.

Re: [PATCH] vect: Handle grouped accesses via gather/scatter.

2025-09-12 Thread Richard Biener
On Fri, Sep 12, 2025 at 12:09 PM Robin Dapp wrote: > > > I wonder in which cases we have misalignment == -1 but !is_packed? > > Isn't that just the new case of gather_scatter (without punning and when we > couldn't analyze the dataref)? The dataref might be naturally aligned but we > explicitly s

[PATCH] Integrate SLP permute transform into vect_transform_stmt

2025-09-12 Thread Richard Biener
This adds permute_info_type and removes the duplication from vect_schedule_slp_node. Bootstrap and regtest running on x86_64-unknown-linux-gnu. * tree-vectorizer.h (stmt_vec_info_type::permute_info_type): Add. (vectorizable_slp_permutation): Declare. * tree-vect-slp.cc (ve

[PATCH] Avoid VMAT_ELEMENTWISE for negative stride SLP

2025-09-12 Thread Richard Biener
The following makes us always use VMAT_STRIDED_SLP for negative stride multi-element accesses. That handles falling back to single element accesses transparently. Bootstrap and regtest running on x86_64-unknown-linux-gnu. * tree-vect-stmts.cc (get_load_store_type): Use VMAT_STRIDED_SLP

RE: [PATCH v4 3/3]middle-end: Use addhn for compression instead of inclusive OR when reducing comparison values

2025-09-12 Thread Richard Biener
"target.\n"); > + addhn_supported_p = false; > + } > +} > + >/* Analyze only. */ >if (cost_vec) > { > - if (direct_optab_handler (cbranch_optab, mode) == CODE_FOR_nothing) > + if (!addhn_supported_p >

Re: [PATCH][RFC] ipa-free-lang-data: Don't walk into DECL_CHAIN when finding decls/types [PR121865]

2025-09-12 Thread Richard Biener
On Fri, 12 Sep 2025, Nathaniel Shead wrote: > On Thu, Sep 11, 2025 at 11:08:54AM +0200, Richard Biener wrote: > > On Thu, 11 Sep 2025, Richard Biener wrote: > > > > > On Wed, 10 Sep 2025, Nathaniel Shead wrote: > > > > > > > Does this fix seem

Re: [PATCH] vect: Handle grouped accesses via gather/scatter.

2025-09-11 Thread Richard Biener
On Thu, Sep 11, 2025 at 6:06 PM Robin Dapp wrote: > > > Hmm, so the existing "punning" code for VMAT_STRIDED_SLP does > > > > tree vtype > > = vector_vector_composition_type (vectype, const_nunits / n, > > &ptype); > >

Re: [PATCH] tree-optimization/121703 - UBSAN error with moving from uninit data

2025-09-11 Thread Richard Biener
On Fri, 5 Sep 2025, Richard Biener wrote: > The PR reports > > vectorizer.h:276:3: runtime error: load of value 32695, which is not a valid > value for type 'internal_fn' > > which I believe is from > > slp_node->data = new vect_load_store_data (st

Re: [PATCH][RFC] ipa-free-lang-data: Don't walk into DECL_CHAIN when finding decls/types [PR121865]

2025-09-11 Thread Richard Biener
effective-target lto } > +// { dg-additional-options "-fmodules -flto" } > + > +export module M; > +export template struct S; > +export template void foo(S) {} > +template struct S { > + friend void foo<>(S); > +}; > diff --git a/gcc/testsuite/g++.dg/

Re: New optabs and IFN required for early break [bikeshed]

2025-09-11 Thread Richard Biener
does only calls the IFN when at least one lane is > active. > > I do not believe I need a LEN version here either? But if I'm wrong it > would > be useful to have a small example. I think you need a len variant unless the mask producer had len applied with an else value of 0 (IIRC RVV always prefers 'undefined' as else value). OTOH the "first" element - if one is set and we never require 'else' - should work with or without loop masking (with len or mask). That said, I do wonder why we have both extract_last and fold_extract_last. Richard. > > Thanks, > Tamar >

Re: [PATCH] tree-optimization: fabs(a+0.0) -> fabs(a) for non trapping case

2025-09-11 Thread Richard Biener
uld simply say: New testcase. I have pushed the change with these adjustments as r16-3802-gaa4aafbad5235f Richard. > Best regards, > Matteo > > > > On 5 Sep 2025, at 9:27 am, Richard Biener wrote: > > On Thu, Sep 4, 2025 at 4:15 PM Matteo Nicoli > wrote: > > &g

Re: [PATCH][PR104116] Add vectorization logic for floor_{mod,div}

2025-09-11 Thread Richard Biener
ULL);
> +	  def_stmt = gimple_build_assign(extr_cond, COND_EXPR, cond_reg4,
> +	    build_int_cst(itype, 1), build_int_cst(itype, 0));
> +	  append_pattern_def_seq (vinfo, stmt_vinfo, def_stmt);
> +
> +	  // q -= (x ^ y < 0 && r) ? 1 : 0
> +	  tree floor_mod_r = vect_recog_temp_ssa_var(itype, NULL);
> +	  pattern_stmt = gimple_build_assign(floor_mod_r, MINUS_EXPR, q,
> extr_cond);
> +	}

You are emitting code that might not be vectorizable and which needs post-processing with bool vector patterns. So you should

 1) use the appropriate precision scalar bools, anticipating the vector mask type used
 2) check at least whether the compares are supported, I think we can rely on bit operations support

Richard.

> +    }
>      }
>
>    /* Pattern detected.  */
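For readers following the quoted comment "q -= (x ^ y < 0 && r) ? 1 : 0", here is a minimal scalar sketch of the adjustment it describes: deriving floor division and floor modulo from C's truncating `/` and `%`. The function names `floor_div` and `floor_mod` are illustrative only and do not come from the patch; the explicit parentheses around `x ^ y` are added because C precedence would otherwise parse the comment's shorthand differently.

```c
/* Floor division built on C's truncating division.  When the operands
   have opposite signs and the remainder is nonzero, truncation rounded
   towards zero, so the quotient is one too large.  Assumes y != 0 and
   that x / y does not overflow.  */
static int floor_div(int x, int y)
{
    int q = x / y;
    int r = x % y;
    q -= (((x ^ y) < 0) && r) ? 1 : 0;
    return q;
}

/* Floor modulo: under the same condition the truncating remainder has
   the dividend's sign, so add the divisor to give it the divisor's sign.  */
static int floor_mod(int x, int y)
{
    int r = x % y;
    r += (((x ^ y) < 0) && r) ? y : 0;
    return r;
}
```

For example, `floor_div(-7, 2)` gives -4 and `floor_mod(-7, 2)` gives 1, whereas the truncating operators give -3 and -1.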

Re: [PATCH V3 1/2] Match: Support SAT_TRUNC variant NARROW_CLIP

2025-09-11 Thread Richard Biener
On Mon, Sep 8, 2025 at 7:58 PM Edwin Lu wrote:
>
> This patch tries to add support for a variant of SAT_TRUNC where
> negative numbers are clipped to 0 instead of NARROW_TYPE_MAX_VALUE.
> This form is seen in x264, aka
>
> UT clip (T a)
> {
>   return a & (UT)(-1) ? (-a) >> 31 : a;
> }
>
> Where s
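As a rough illustration of the clipping behaviour described above (negative inputs go to 0, too-large inputs go to the narrow type's maximum), here is a sketch for the specific case of int32_t narrowed to uint8_t. The names `clip_u8_ref` and `clip_u8_bits` are made up for this example, and the `~255` mask is this sketch's own choice for the 8-bit case, not necessarily the exact form the patch matches. The bit-trick variant assumes an arithmetic right shift of negative values, which ISO C leaves implementation-defined (GCC shifts arithmetically), and it must not be called with INT32_MIN because negating that value overflows.

```c
#include <stdint.h>

/* Straightforward reference: clamp a signed value into [0, 255].  */
static uint8_t clip_u8_ref(int32_t a)
{
    if (a < 0)
        return 0;
    if (a > 255)
        return 255;
    return (uint8_t)a;
}

/* Branch-light variant.  If any bit outside the low 8 is set, the value
   is out of range: for negative a, -a is positive and (-a) >> 31 is 0;
   for a > 255, -a is negative and (-a) >> 31 is -1, which truncates to
   255.  Assumes arithmetic right shift and a != INT32_MIN.  */
static uint8_t clip_u8_bits(int32_t a)
{
    return (a & ~255) ? (uint8_t)((-a) >> 31) : (uint8_t)a;
}
```

Both functions agree on every input except INT32_MIN, which the second one excludes.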

Re: [PATCH] vect: Handle grouped accesses via gather/scatter.

2025-09-11 Thread Richard Biener
On Mon, Sep 8, 2025 at 3:19 PM Robin Dapp wrote:
>
> Hi,
>
> This patch adds gather/scatter handling for grouped access.  The idea is
> to e.g. replace an access (for uint8_t elements) like
>   arr[0]
>   arr[1]
>   arr[2]
>   arr[3]
>   arr[0 + step]
>   arr[1 + step]
>   ...
> by a gather load o
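The quoted description is cut off, so the following is only a scalar sketch under an assumption: that each group of four adjacent uint8_t elements is treated as a single wider (32-bit) element, fetched with one strided access per group, which is what one lane of a gather load would do. The helper names and the fixed group size of 4 are hypothetical and are not taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Element-wise form: four adjacent byte loads per group, with groups
   `step` bytes apart.  */
static void load_groups_bytewise(const uint8_t *arr, size_t step,
                                 size_t ngroups, uint8_t *out)
{
    for (size_t g = 0; g < ngroups; g++)
        for (size_t k = 0; k < 4; k++)
            out[4 * g + k] = arr[g * step + k];
}

/* Punned form: each group is read as one 32-bit lane.  memcpy keeps the
   type pun well-defined and makes no alignment assumption.  */
static void load_groups_punned(const uint8_t *arr, size_t step,
                               size_t ngroups, uint8_t *out)
{
    for (size_t g = 0; g < ngroups; g++) {
        uint32_t lane;
        memcpy(&lane, arr + g * step, sizeof lane);
        memcpy(out + 4 * g, &lane, sizeof lane);
    }
}
```

Both functions fill `out` with the same bytes; the second needs a quarter as many loads, which is the appeal of punning the group to a wider element type.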

[PATCH] Deal with prior EH/abnormal cleanup when fixing up noreturn calls

2025-09-11 Thread Richard Biener
When a dead EH or abnormal edge makes a call queued for noreturn fixup unreachable, just skip processing it. Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed. PR tree-optimization/121870 * tree-ssa-propagate.cc (substitute_and_fold_engine::substitute_and_fold):

Re: [PATCH][RFC] ipa-free-lang-data: Don't walk into DECL_CHAIN when finding decls/types [PR121865]

2025-09-11 Thread Richard Biener
On Thu, 11 Sep 2025, Richard Biener wrote: > On Wed, 10 Sep 2025, Nathaniel Shead wrote: > > > Does this fix seem reasonable, or is there something I've missed? > > > > My change to g++.dg/lto/pr101396_0.C also causes it to fail link with > > some flags on

Re: [PATCH] match.pd: Add missing type check to reduc(ctor) pattern [PR121772]

2025-09-11 Thread Richard Biener
re or backport to 12 without a testcase (assuming a suitable one > can't be crafted). > > Thanks, > Alex > > gcc/ChangeLog: > > PR tree-optimization/121772 > * match.pd: Add type check to reduc(ctor) pattern. > > gcc/testsuite/ChangeLog: > >

[PATCH] tree-optimization/121829 - bogus CFG with asm goto

2025-09-10 Thread Richard Biener
When the vectorizer removes a forwarder created earlier by split_edge it uses redirect_edge_pred for convenience and efficiency. That breaks down when the edge split originates from an asm goto as that is a jump that needs adjustments from redirect_edge_and_branch. The following factors a si

Re: [PATCH] expr: Handle RAW_DATA_CST in store_constructor [PR121831]

2025-09-10 Thread Richard Biener
> Am 10.09.2025 um 09:27 schrieb Jakub Jelinek : > > Hi! > > I thought this wouldn't be necessary because RAW_DATA_CST can only appear > inside of (array) CONSTRUCTORs within DECL_INITIAL of TREE_STATIC vars, > so there shouldn't be a need to expand it. Except that we have an > optimization

Re: [PATCH] bitint: Fix up lowering optimization of .*_OVERFLOW ifns [PR121828]

2025-09-10 Thread Richard Biener
> Am 10.09.2025 um 09:50 schrieb Jakub Jelinek : > > Hi! > > The lowering of .{ADD,SUB,MUL}_OVERFLOW ifns is optimized, so that we don't > in the common cases uselessly create a large _Complex _BitInt > temporary with the first (real) part being the result and second (imag) part > just

Re: [PATCH] testsuite: Only scan for known file extensions in lto.exp

2025-09-10 Thread Richard Biener
> Am 10.09.2025 um 10:01 schrieb Jakub Jelinek : > > Hi! > > This is something that has bothered me for a few years but I've only found > time for it now. > The glob used for finding *_1.* etc. counterparts to the *_0.* tests is too > broad, so if one has say next to *_1.c file also *_1.c~ or

[PATCH] tree-optimization/121844 - IVOPTs and asm goto in latch

2025-09-09 Thread Richard Biener
When there's an asm goto in the latch of a loop we may not use IP_END IVs since instantiating those would (need to) split the latch edge which in turn invalidates IP_NORMAL position handling. This is a revision of the PR107997 fix. Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Re: [PATCH] Fix load/store bias handling for extractlast.

2025-09-09 Thread Richard Biener
> Am 09.09.2025 um 12:54 schrieb Juergen Christ : > > The length returned by vect_get_loop_len is REALLEN + BIAS, but was > assumed to be REALLEN - BIAS. If BIAS is -1, this leads to wrong > code. > > Bootstrapped and regtested on s390. Ok for trunk? Ok. Can you also test on ppc64le which
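To make the sign error described above concrete, here is a small arithmetic sketch. The helper names are hypothetical and only model the relationship stated in the message (the returned length is REALLEN + BIAS); they are not functions from the vectorizer.

```c
/* The length handed back already includes the bias.  */
static int biased_len(int real_len, int bias)
{
    return real_len + bias;
}

/* Correct recovery of the real length: undo the addition.  */
static int real_len_from_biased(int len, int bias)
{
    return len - bias;
}

/* What assuming len == REALLEN - BIAS amounts to: the bias is applied a
   second time instead of being removed.  */
static int real_len_wrong(int len, int bias)
{
    return len + bias;
}
```

With a real length of 8 and a bias of -1, the biased length is 7. The correct recovery gives 8 again, while the mistaken one gives 6. With a bias of 0 both agree, which fits the message's note that only a bias of -1 produces wrong code.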

Re: [PATCH 1/2 v2] c++: Implement P1494 and P3641 Partial program correctness [PR119060].

2025-09-08 Thread Richard Biener
On Mon, Sep 8, 2025 at 11:44 AM Iain Sandoe wrote: > > > > > On 8 Sep 2025, at 08:57, Richard Biener wrote: > > > > On Sun, Sep 7, 2025 at 9:43 PM Iain Sandoe wrote: > >> > >> Thanks for the helpful input from reviewers; > >> > &g

Re: [PATCH 1/2 v2] c++: Implement P1494 and P3641 Partial program correctness [PR119060].

2025-09-08 Thread Richard Biener
On Sun, Sep 7, 2025 at 9:43 PM Iain Sandoe wrote: > > Thanks for the helpful input from reviewers; > > This version has 4 changes from v1: > 1. removes some unrelated changes. > 2. As per Jakub's observations, we now special-case >std::observable_checkpoint so that it is guaranteed to be lower

Re: [PATCH 1/2 v2] c++: Implement P1494 and P3641 Partial program correctness [PR119060].

2025-09-08 Thread Richard Biener
> Am 08.09.2025 um 17:53 schrieb Iain Sandoe : > >  > >>> On 8 Sep 2025, at 15:53, Richard Biener wrote: >>> >>> >>> >>>> Am 08.09.2025 um 16:28 schrieb Iain Sandoe : >>> >>>  >>> >>>>

Re: [PATCH 1/2 v2] c++: Implement P1494 and P3641 Partial program correctness [PR119060].

2025-09-08 Thread Richard Biener
> Am 08.09.2025 um 16:28 schrieb Iain Sandoe : > >  > >>> On 8 Sep 2025, at 15:20, Iain Sandoe wrote: >>> >>> >>> On 8 Sep 2025, at 15:05, Jakub Jelinek wrote: >>> >>> On Mon, Sep 08, 2025 at 02:54:18PM +0100, Iain Sandoe wrote: (for pre-conditions) they lower to a series of

Re: [PATCH 1/2 v2] c++: Implement P1494 and P3641 Partial program correctness [PR119060].

2025-09-08 Thread Richard Biener
On Mon, Sep 8, 2025 at 3:16 PM Jakub Jelinek wrote: > > On Mon, Sep 08, 2025 at 03:05:58PM +0200, Richard Biener wrote: > > is reduced to __builtin_abort () (for C++). That's because it's > > __builtin_unreachable () at the end. I am not aware of any > > oth

Re: [PATCH] tree-optimization/121844 - IVOPTs and asm goto in latch

2025-09-08 Thread Richard Biener
On Mon, 8 Sep 2025, Jakub Jelinek wrote: > On Mon, Sep 08, 2025 at 02:38:48PM +0200, Richard Biener wrote: > > When there's an asm goto in the latch of a loop we may not use > > IP_END IVs since instantiating those would (need to) split the > > latch edge which in

[PATCH] tree-optimization/121830 - SLP cycle detection confused by nested cycle

2025-09-08 Thread Richard Biener
The SLP reduc-index computation is confused by having an outer reduction inner loop nested cycle fed by another non-reduction nested cycle. Instead of undoing the unfortunate mixing of outer reduction inner cycles with general nested cycles the following instead distinguishes them by not setting ST

Re: [PATCH 1/2 v2] c++: Implement P1494 and P3641 Partial program correctness [PR119060].

2025-09-08 Thread Richard Biener
On Mon, Sep 8, 2025 at 1:04 PM Ville Voutilainen wrote: > > On Mon, 8 Sept 2025 at 13:54, Richard Biener > wrote: > > That said, I see no point in std::observable_checkpoint to be represented > > in the IL at all if all it is is to cater to FUD around what compilers might &
