[PATCH v2] combine: Discard REG_UNUSED note in i2 when register is also referenced in i3 [PR118739]

2025-02-26 Thread Uros Bizjak
The combine pass is trying to combine:

Trying 16, 22, 21 -> 23:
   16: r104:QI=flags:CCNO>0
   22: {r120:QI=r104:QI^0x1;clobber flags:CC;}
  REG_UNUSED flags:CC
   21: r119:QI=flags:CCNO<=0
  REG_DEAD flags:CCNO
   23: {r110:QI=r119:QI|r120:QI;clobber flags:CC;}
  REG_DEAD r120:QI
  REG_DEAD r119:QI
  REG_UNUSED flags:CC

and creates the following two-insn sequence:

modifying insn i2    22: r104:QI=flags:CCNO>0
  REG_DEAD flags:CC
deferring rescan insn with uid = 22.
modifying insn i3    23: r110:QI=flags:CCNO<=0
  REG_DEAD flags:CC
deferring rescan insn with uid = 23.

where the REG_DEAD note in i2 is not correct, because the flags
register is still referenced in i3.  In the try_combine() megafunction,
we have this part:

--cut here--
/* Distribute all the LOG_LINKS and REG_NOTES from I1, I2, and I3.  */
if (i3notes)
  distribute_notes (i3notes, i3, i3, newi2pat ? i2 : NULL,
elim_i2, elim_i1, elim_i0);
if (i2notes)
  distribute_notes (i2notes, i2, i3, newi2pat ? i2 : NULL,
elim_i2, elim_i1, elim_i0);
if (i1notes)
  distribute_notes (i1notes, i1, i3, newi2pat ? i2 : NULL,
elim_i2, local_elim_i1, local_elim_i0);
if (i0notes)
  distribute_notes (i0notes, i0, i3, newi2pat ? i2 : NULL,
elim_i2, elim_i1, local_elim_i0);
if (midnotes)
  distribute_notes (midnotes, NULL, i3, newi2pat ? i2 : NULL,
elim_i2, elim_i1, elim_i0);
--cut here--

where the compiler distributes the REG_UNUSED note from i2:

   22: {r120:QI=r104:QI^0x1;clobber flags:CC;}
  REG_UNUSED flags:CC

via distribute_notes() using the following:

--cut here--
  /* Otherwise, if this register is used by I3, then this register
 now dies here, so we must put a REG_DEAD note here unless there
 is one already.  */
  else if (reg_referenced_p (XEXP (note, 0), PATTERN (i3))
   && ! (REG_P (XEXP (note, 0))
 ? find_regno_note (i3, REG_DEAD,
REGNO (XEXP (note, 0)))
 : find_reg_note (i3, REG_DEAD, XEXP (note, 0))))
{
  PUT_REG_NOTE_KIND (note, REG_DEAD);
  place = i3;
}
--cut here--

The flags register is used in I3, but I3 already has a REG_DEAD note for it.
The above condition therefore doesn't trigger, and control continues into the
"else" part, where the REG_DEAD note is put on I2.  The proposed solution
corrects the above logic so that this branch is taken every time the register
is referenced in I3, avoiding the "else" part.
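
The change in control flow can be illustrated with a small stand-alone model.
This is only a sketch, not GCC code: Place, I3State, old_logic and new_logic
are hypothetical names, and the two booleans stand in for the results of
reg_referenced_p and find_regno_note on i3.

```cpp
// Toy model of the REG_UNUSED branch discussed above (not GCC code).
// Every name in this block is hypothetical and exists only for illustration.
enum class Place { I3AsRegDead, Discard, FallThrough };

struct I3State {
  bool references_reg;  // models reg_referenced_p (reg, PATTERN (i3))
  bool has_reg_dead;    // models "i3 already has a REG_DEAD note for reg"
};

// Old logic: both tests sit in one condition, so an existing REG_DEAD
// note on i3 makes control fall through to the later "else" handling.
constexpr Place old_logic (I3State s)
{
  if (s.references_reg && !s.has_reg_dead)
    return Place::I3AsRegDead;
  return Place::FallThrough;
}

// Patched logic: the branch is entered whenever i3 references the register.
// The note is converted to REG_DEAD only if i3 has none yet; otherwise no
// place is chosen and the note is discarded.
constexpr Place new_logic (I3State s)
{
  if (s.references_reg)
    return s.has_reg_dead ? Place::Discard : Place::I3AsRegDead;
  return Place::FallThrough;
}

// The case from this PR: i3 references flags and already has REG_DEAD.
static_assert (old_logic ({true, true}) == Place::FallThrough, "old");
static_assert (new_logic ({true, true}) == Place::Discard, "new");
```

In the model, the two versions differ only when i3 both references the
register and already carries a REG_DEAD note, which is the situation in the
dump above.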

PR rtl-optimization/118739

gcc/ChangeLog:

* combine.cc (distribute_notes) : Correct the
  logic when the register is used by I3.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr118739.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

OK for master and eventual backports?

Uros.
diff --git a/gcc/combine.cc b/gcc/combine.cc
index 3beeb514b81..1b2bd34748e 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -14523,14 +14523,15 @@ distribute_notes (rtx notes, rtx_insn *from_insn, rtx_insn *i3, rtx_insn *i2,
  /* Otherwise, if this register is used by I3, then this register
 now dies here, so we must put a REG_DEAD note here unless there
 is one already.  */
- else if (reg_referenced_p (XEXP (note, 0), PATTERN (i3))
-  && ! (REG_P (XEXP (note, 0))
-? find_regno_note (i3, REG_DEAD,
-   REGNO (XEXP (note, 0)))
-: find_reg_note (i3, REG_DEAD, XEXP (note, 0))))
+ else if (reg_referenced_p (XEXP (note, 0), PATTERN (i3)))
{
- PUT_REG_NOTE_KIND (note, REG_DEAD);
- place = i3;
+ if (! (REG_P (XEXP (note, 0))
+? find_regno_note (i3, REG_DEAD, REGNO (XEXP (note, 0)))
+: find_reg_note (i3, REG_DEAD, XEXP (note, 0))))
+   {
+ PUT_REG_NOTE_KIND (note, REG_DEAD);
+ place = i3;
+   }
}
 
  /* A SET or CLOBBER of the REG_UNUSED reg has been removed,
diff --git a/gcc/testsuite/gcc.target/i386/pr118739.c b/gcc/testsuite/gcc.target/i386/pr118739.c
new file mode 100644
index 0000000..89bed546363
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr118739.c
@@ -0,0 +1,50 @@
+/* PR rtl-optimization/118739 */
+/* { dg-do run } */
+/* { dg-options "-O3 -fno-tree-forwprop -fno-tree-vrp" } */
+
+volatile int a;
+int b, c, d = 1, e, f, g;
+
+int h (void)
+{
+  int i = 1;
+
+ j:
+  for (b = 1; b; b--)
+{
+  asm ("#");
+
+  g = 0;
+
+  for (; g <= 1; g++)
+   {
+ int k = f = 0;
+
+ for (; f <= 1; f++)
+   k = (1 == i) >= k || ((d = 0) >= a) + k;
+   }
+}
+
+  for (; i < 3; i++)
+{
+  if (!c)
+   return g;
+
+  if (e)
+   goto j;
+
+  asm ("#");
+}
+
+  return 0;
+}
+
+int main()
+{
+  h();
+
+  if (d != 1)
+__builtin_abort();
+
+  re

RE: [3/3 PATCH v3]middle-end: delay checking for alignment to load [PR118464]

2025-02-26 Thread Tamar Christina
> -Original Message-
> From: Richard Biener 
> Sent: Wednesday, February 26, 2025 1:52 PM
> To: Tamar Christina 
> Cc: gcc-patches@gcc.gnu.org; nd 
> Subject: RE: [3/3 PATCH v3]middle-end: delay checking for alignment to load
> [PR118464]
> 
> On Wed, 26 Feb 2025, Tamar Christina wrote:
> 
> > > -Original Message-
> > > From: Richard Biener 
> > > Sent: Wednesday, February 26, 2025 12:30 PM
> > > To: Tamar Christina 
> > > Cc: gcc-patches@gcc.gnu.org; nd 
> > > Subject: Re: [3/3 PATCH v3]middle-end: delay checking for alignment to load
> > > [PR118464]
> > >
> > > On Tue, 25 Feb 2025, Tamar Christina wrote:
> > >
> > > > Hi All,
> > > >
> > > > This fixes two PRs on Early break vectorization by delaying the safety
> > > > checks to vectorizable_load when the VF, VMAT and vectype are all known.
> > > >
> > > > This patch does add two new restrictions:
> > > >
> > > > 1. On LOAD_LANES targets, where the buffer size is known, we reject
> > > >    uneven group sizes, as they are unaligned every n % 2 iterations and
> > > >    so may cross a page unwittingly.
> > > >
> > > > 2. On LOAD_LANES targets when the buffer is unknown, we reject
> > > >    vectorization if we cannot peel for alignment, as the alignment
> > > >    requirement is quite large at GROUP_SIZE * vectype_size.  This is
> > > >    unlikely to ever be beneficial so we don't support it for now.
> > > >
> > > > There are other steps documented inside the code itself so that the
> > > > reasoning is next to the code.
> > > >
> > > > Note that for VLA I have still left this fully disabled when not
> > > > working on a fixed buffer.
> > > >
> > > > For VLA targets like SVE return element alignment as the desired vector
> > > > alignment.  This means that the loads are never misaligned and so,
> > > > annoyingly, it won't ever need to peel.
> > > >
> > > > So what I think needs to happen in GCC 16 is this:
> > > >
> > > > 1. during vect_compute_data_ref_alignment we need to take the max of
> > > >    POLY_VALUE_MIN and vector_alignment.
> > > >
> > > > 2. vect_do_peeling define skip_vector when PFA for VLA, and in the guard
> > > >    add a check that ncopies * vectype does not exceed POLY_VALUE_MAX
> > > >    which we use as a proxy for pagesize.
> > > >
> > > > 3. Force LOOP_VINFO_USING_PARTIAL_VECTORS_P to be true in
> > > >    vect_determine_partial_vectors_and_peeling since the first iteration
> > > >    has to be partial. If LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P we have to
> > > >    fail to vectorize.
> > > >
> > > > 4. Create a default mask to be used, so that
> > > >    vect_use_loop_mask_for_alignment_p becomes true and we generate the
> > > >    peeled check through loop control for partial loops.  From what I can
> > > >    tell this won't work for LOOP_VINFO_FULLY_WITH_LENGTH_P since they
> > > >    don't have any peeling support at all in the compiler.  That would
> > > >    need to be done independently from the above.
> > > >
> > > > In any case, not GCC 15 material so I've kept the WIP patches I have
> > > > downstream.
> > > >
> > > > Bootstrapped Regtested on aarch64-none-linux-gnu,
> > > > arm-none-linux-gnueabihf, x86_64-pc-linux-gnu
> > > > -m32, -m64 and no issues.
> > > >
> > > > Ok for master?
> > > >
> > > > Thanks,
> > > > Tamar
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > > PR tree-optimization/118464
> > > > PR tree-optimization/116855
> > > > * doc/invoke.texi (min-pagesize): Update docs with vectorizer use.
> > > > * tree-vect-data-refs.cc (vect_analyze_early_break_dependences): Delay
> > > > checks.
> > > > (vect_compute_data_ref_alignment): Remove alignment checks and move to
> > > > get_load_store_type, increase group access alignment.
> > > > (vect_enhance_data_refs_alignment): Add note to comment needing
> > > > investigating.
> > > > (vect_analyze_data_refs_alignment): Likewise.
> > > > (vect_supportable_dr_alignment): For group loads look at first DR.
> > > > * tree-vect-stmts.cc (get_load_store_type):
> > > > Perform safety checks for early break pfa.
> > > > * tree-vectorizer.h (dr_set_safe_speculative_read_required,
> > > > dr_safe_speculative_read_required, DR_SCALAR_KNOWN_BOUNDS): New.
> > > > (need_peeling_for_alignment): Renamed to...
> > > > (safe_speculative_read_required): .. This
> > > > (class dr_vec_info): Add scalar_access_known_in_bounds.
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > > PR tree-optimization/118464
> > > > PR tree-optimization/116855
> > > > * gcc.dg/vect/bb-slp-pr65935.c: Update, it now vectorizes because the
> > > > load type is relaxed later.
> > > > * gcc.dg/vect/vect-early-break_121-pr1140

Re: [PATCH] alias: Perform offset arithmetics in poly_offset_int rather than poly_int64 [PR118819]

2025-02-26 Thread Jakub Jelinek
On Wed, Feb 26, 2025 at 01:34:04PM +0100, Richard Biener wrote:
> > It wants to have a MEM which overlaps anything below the stack.
> > So, uses for stack grows down and non-biased stack sp - PTRDIFF_MAX with
> > PTRDIFF_MAX MEM_SIZE as an approximation to that.
> 
> I see.  Wouldn't setting MEM_OFFSET_KNOWN_P and MEM_SIZE_KNOWN_P
> to false work as well?

Well, that would then overlap even with all the sp based memories at sp or
above it.  Which is what doesn't need to be clobbered.

Jakub
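
The overlap argument above can be checked with a small interval model.  This
is a sketch under stated assumptions, not the alias.cc code: Range, overlaps
and below_stack are hypothetical names, ranges are half-open byte intervals,
and max_size is a small stand-in for PTRDIFF_MAX so that plain 64-bit
arithmetic cannot overflow here (the real values can overflow a 64-bit offset,
which is why the patch moves the arithmetic to poly_offset_int).

```cpp
#include <cstdint>

// Half-open byte range [begin, begin + size).  Hypothetical model type.
struct Range {
  std::int64_t begin;
  std::int64_t size;
};

// Two half-open ranges overlap iff each one starts before the other ends.
constexpr bool overlaps (Range a, Range b)
{
  return a.begin < b.begin + b.size && b.begin < a.begin + a.size;
}

// Range standing for "anything below the stack pointer" on a grows-down,
// non-biased stack: it starts at sp - max_size and has size max_size.
constexpr Range below_stack (std::int64_t sp, std::int64_t max_size)
{
  return Range{sp - max_size, max_size};
}

// Toy values: sp = 1000, and 1000000 plays the role of PTRDIFF_MAX.
static_assert (overlaps (below_stack (1000, 1000000), Range{992, 8}),
               "an access just below sp is covered");
static_assert (!overlaps (below_stack (1000, 1000000), Range{1000, 8}),
               "an access at sp is not covered");
static_assert (!overlaps (below_stack (1000, 1000000), Range{1016, 4}),
               "an access above sp is not covered");
```

A MEM with unknown offset and size would instead have to be treated as
overlapping all three accesses, including the ones at or above sp.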



Re: [PATCH] libstdc++: implement constexpr memory algorithms

2025-02-26 Thread Giuseppe D'Angelo

On 26/02/2025 16:22, Jonathan Wakely wrote:

Clang 17/18 rejects 'constexpr' on non-template functions that use
(non-constexpr) placement new but accepts it on templates (silently
dropping constexpr at instantiation time): https://godbolt.org/z/Tqnvc1f1W
So it seems Clang 17/18 behavior is consistent enough with P2448R2
that there should be no compatibility issues.

Oh great, thanks for checking.

EDG is also OK with it, I checked back to v6.0 from 2020.

So making it unconditionally constexpr seems fine then.


Whoops, sorry, missed this sub-thread (while replying to the other one).
Change of plans then, I'll amend and remove the ad-hoc constexpr macro.

Thanks,
--
Giuseppe D'Angelo





Re: [PATCH v2] RISC-V: Fix a typo in zce to zcf implication

2025-02-26 Thread Kito Cheng
Hi Yuriy:

V2 is LGTM, thanks :)

On Wed, Feb 26, 2025 at 3:06 AM Yuriy Kolerov wrote:
>
> Hi Jeff,
>
> That check is performed in a lambda function:
>
>   {"zce", "zcf",
>[] (const riscv_subset_list *subset_list) -> bool
>{
>  return subset_list->xlen () == 32 && subset_list->lookup ("f");
>}},
>
> The typo was in a rule itself:
>
> {"zcf", "f", ...
>
> So, with this fix zcf is implied by zce if this condition holds:
>
> subset_list->xlen () == 32 && subset_list->lookup ("f")
>
> Before 9e12010b5e724277ea this rule was implemented in this code:
>
>   if (subset_list->lookup ("zce") != NULL
> && subset_list->m_xlen == 32
> && subset_list->lookup ("f") != NULL
> && subset_list->lookup ("zcf") == NULL)
> subset_list->add ("zcf", false);
>
> But it was accidentally refactored in a wrong way.
>
> Regards,
> Yuriy Kolerov
>
> -Original Message-
> From: Jeff Law 
> Sent: Tuesday, February 25, 2025 4:46 PM
> To: Yuriy Kolerov ; gcc-patches@gcc.gnu.org
> Cc: Artemiy Volkov 
> Subject: Re: [PATCH v2] RISC-V: Fix a typo in zce to zcf implication
>
>
>
> On 2/24/25 3:22 AM, Yuriy Kolerov wrote:
> > zce must imply zcf but this rule was corrupted after refactoring in
> > 9e12010b5e724277ea. This may be observed after generating an .s file
> > from any source code file with -mriscv-attribute -march=rv32if_zce
> > -mabi=ilp32 -S options. The full march is shown in the arch
> > attribute:
> >
> >  rv32i2p1_f2p2_zicsr2p0_zca1p0_zcb1p0_zce1p0_zcmp1p0_zcmt1p0
> >
> > As you can see, zcf is missing here even though the f_zce pair is passed
> > in -march. According to The RISC-V Instruction Set Manual:
> >
> >  Specifying Zce on RV32 with F includes Zca, Zcb, Zcmp,
> >  Zcmt and Zcf.
> >
> >   PR target/118906
> >
> > gcc/ChangeLog:
> >
> >   * common/config/riscv/riscv-common.cc: fix zce to zcf
> >   implication.
> >
> > gcc/testsuite/ChangeLog:
> >
> >   * gcc.target/riscv/attribute-zce-1.c: New test.
> >   * gcc.target/riscv/attribute-zce-2.c: New test.
> >   * gcc.target/riscv/attribute-zce-3.c: New test.
> >   * gcc.target/riscv/attribute-zce-4.c: New test.
> I'm not 100% sure this implementation is correct.  My understanding is
> that zce implies zcf iff rv32 and f are also enabled.  So don't you need to
> verify that rv32 and f are enabled?  Or is that done elsewhere?
>
> Though it looks like the other cases zce -> {zca, zcb, zcmp, zcmt} don't have
> that check.
>
> It feels like I'm missing something in how all this works.
>
> Kito, you know this stuff better than I, thoughts?
>
>
> Jeff
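
The conditional implication discussed in this thread (zce implies zcf only
for xlen 32 with f present) can be sketched as a rule table with an optional
predicate per rule.  This is a simplified model, not riscv-common.cc:
SubsetList, ImpliedRule, rules and apply_implications are hypothetical names,
and only two of the zce rules are modelled.

```cpp
#include <functional>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for riscv_subset_list.
struct SubsetList {
  int xlen;
  std::set<std::string> exts;
  bool has (const std::string &e) const { return exts.count (e) != 0; }
};

// One rule: EXT implies IMPLIED whenever PRED holds for the subset list.
struct ImpliedRule {
  std::string ext;
  std::string implied;
  std::function<bool (const SubsetList &)> pred;
};

inline const std::vector<ImpliedRule> &rules ()
{
  static const std::vector<ImpliedRule> table = {
    // Unconditional implication.
    {"zce", "zca", [] (const SubsetList &) { return true; }},
    // Conditional implication: the key is "zce", not "zcf".
    {"zce", "zcf",
     [] (const SubsetList &s) { return s.xlen == 32 && s.has ("f"); }},
  };
  return table;
}

// Apply the rules until no rule adds a new extension.
inline void apply_implications (SubsetList &s)
{
  bool changed = true;
  while (changed)
    {
      changed = false;
      for (const ImpliedRule &r : rules ())
        if (s.has (r.ext) && !s.has (r.implied) && r.pred (s))
          {
            s.exts.insert (r.implied);
            changed = true;
          }
    }
}
```

With this model, a list with xlen 32 containing f and zce gains both zca and
zcf, while the same list with xlen 64 gains only zca.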


[PATCH] c++: ICE in replace_decl [PR118986]

2025-02-26 Thread Marek Polacek
Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
Yet another problem that started with r15-6052, compile time evaluation of
prvalues.

cp_fold_r/TARGET_EXPR sees:

  TARGET_EXPR >>> 

so when we call maybe_constant_init, the object we're initializing is D.2701,
and the init is the expr_stmt.  We unwrap the EXPR_STMT/INIT_EXPR/TARGET_EXPR
in maybe_constant_init_1 and so end up evaluating the f1 call.  But f1 returns
c2 whereas the type of D.2701 is ._anon_0 -- the closure.

So then we crash in replace_decl on:

  gcc_checking_assert (same_type_ignoring_top_level_qualifiers_p
   (TREE_TYPE (decl), TREE_TYPE (replacement)));

due to the mismatched types.

cxx_eval_outermost_constant_expr is already ready for the types to be
different, in which case the result isn't constant.  But replace_decl
is called before that check.

I'm leaving the assert in replace_decl on purpose, maybe we'll find
another use for it.

PR c++/118986

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_call_expression): Check that the types match
before calling replace_decl.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/constexpr-prvalue1.C: New test.
---
 gcc/cp/constexpr.cc   |  4 +++-
 .../g++.dg/cpp2a/constexpr-prvalue1.C | 23 +++
 2 files changed, 26 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/constexpr-prvalue1.C

diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index 59dd0668af3..204cda2a222 100644
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -3390,7 +3390,9 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree t,
   current object under construction.  */
if (!*non_constant_p && ctx->object
&& CLASS_TYPE_P (TREE_TYPE (res))
-   && !is_empty_class (TREE_TYPE (res)))
+   && !is_empty_class (TREE_TYPE (res))
+   && same_type_ignoring_top_level_qualifiers_p
+   (TREE_TYPE (res), TREE_TYPE (ctx->object)))
  if (replace_decl (&result, res, ctx->object))
{
  cacheable = false;
diff --git a/gcc/testsuite/g++.dg/cpp2a/constexpr-prvalue1.C b/gcc/testsuite/g++.dg/cpp2a/constexpr-prvalue1.C
new file mode 100644
index 0000000..f4e704d9487
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/constexpr-prvalue1.C
@@ -0,0 +1,23 @@
+// PR c++/118986
+// { dg-do compile { target c++20 } }
+// { dg-options "-O" }
+
+struct c1 {
+constexpr c1(int *ptr) {}
+};
+struct c2 {
+c1 _M_t;
+constexpr ~c2() {}
+};
+constexpr inline
+c2 f1 ()
+{
+  return c2(new int);
+}
+
+void
+f ()
+{
+  auto l = [p = f1()](){};
+  [p = f1()](){};
+}

base-commit: ad2908ed4ec5eff3cad3fd142cde5c1fac4788e9
-- 
2.48.1



[committed] libstdc++: Add code comment documenting LWG 4027 change [PR118083]

2025-02-26 Thread Patrick Palka
Tested on x86_64-pc-linux-gnu, pushed to trunk as obvious.

-- >8 --

PR libstdc++/118083

libstdc++-v3/ChangeLog:

* include/bits/ranges_base.h
(ranges::__access::__possibly_const_range): Mention LWG 4027.
---
 libstdc++-v3/include/bits/ranges_base.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/libstdc++-v3/include/bits/ranges_base.h b/libstdc++-v3/include/bits/ranges_base.h
index 28fe64a9e9d..516d04afdab 100644
--- a/libstdc++-v3/include/bits/ranges_base.h
+++ b/libstdc++-v3/include/bits/ranges_base.h
@@ -646,6 +646,8 @@ namespace ranges
   constexpr auto&
   __possibly_const_range(_Range& __r) noexcept
   {
+   // _GLIBCXX_RESOLVE_LIB_DEFECTS
+   // 4027. possibly-const-range should prefer returning const R&
   if constexpr (input_range<const _Range>)
     return const_cast<const _Range&>(__r);
else
-- 
2.48.1.431.g5a526e5e18



Re: [PATCH] libstdc++: implement constexpr memory algorithms

2025-02-26 Thread Jonathan Wakely
On Wed, 26 Feb 2025 at 15:06, Patrick Palka  wrote:
>
> On Tue, 25 Feb 2025, Jonathan Wakely wrote:
>
> > On Thu, 20 Feb 2025 at 16:23, Patrick Palka  wrote:
> > >
> > > On Sun, 16 Feb 2025, Giuseppe D'Angelo wrote:
> > >
> > > > Hello,
> > > >
> > > > the attached patch implements the C++26 papers that add `constexpr` to
> > > > the specialized memory algorithms (the uninitialized_* family). Tested
> > > > on x86-64 Linux.
> > > >
> > > > Thank you,
> > > > --
> > > > Giuseppe D'Angelo
> > > >
> > >
> > > > Subject: [PATCH] libstdc++: implement constexpr memory algorithms
> > > >
> > > > This commit adds support for C++26's constexpr specialized memory
> > > > algorithms, introduced by P2283R2, P3508R0, P3369R0.
> > > >
> > > > The uninitialized_default, value, copy, move and fill algorithms are
> > > > affected, in all of their variants (iterator-based, range-based and _n
> > > > versions.)
> > > >
> > > > The changes are mostly mechanical -- add `constexpr` to a number of
> > > > signatures. I've introduced a helper macro to conditionally expand to
> > > > `constexpr` only in C++26 and above modes. The internal helper guard
> > > > class for range algorithms instead can be marked unconditionally.
> > > >
> > > > uninitialized_fill is the only algorithm where I had to add a branch to
> > > > a constexpr-friendly version (already existing).
> > >
> > > Seems the patch also adds code to uninitialized_copy and
> > > uninitialized_fill_n?
> > >
> > > >
> > > > For each algorithm family I've added only one test to cover it and its
> > > > variants; the idea is to avoid too much repetition and simplify future
> > > > maintenance.
> > > >
> > > > libstdc++-v3/ChangeLog:
> > > >
> > > >   * include/bits/ranges_uninitialized.h: Mark the specialized
> > > >   memory algorithms as constexpr in C++26. Also mark the members
> > > >   of the _DestroyGuard helper class.
> > > >   * include/bits/stl_uninitialized.h: Ditto.
> > > >   * include/bits/stl_construct.h: Mark _Construct_novalue (which
> > > >   uses placement new to do default initialization) as constexpr
> > > >   in C++26. This is possible due to P2747R2, which GCC already
> > > >   implements; check P2747's feature-testing macro to avoid
> > > >   issues with other compilers.
> > > >   * include/bits/version.def: Bump the feature-testing macro.
> > > >   * include/bits/version.h: Regenerate.
> > > >   * testsuite/20_util/specialized_algorithms/feature_test_macro.cc: New test.
> > > >   * testsuite/20_util/specialized_algorithms/uninitialized_copy/constexpr.cc: New test.
> > > >   * testsuite/20_util/specialized_algorithms/uninitialized_default_construct/constexpr.cc: New test.
> > > >   * testsuite/20_util/specialized_algorithms/uninitialized_fill/constexpr.cc: New test.
> > > >   * testsuite/20_util/specialized_algorithms/uninitialized_move/constexpr.cc: New test.
> > > >   * testsuite/20_util/specialized_algorithms/uninitialized_value_construct/constexpr.cc: New test.
> > > >
> > > > Signed-off-by: Giuseppe D'Angelo 
> > > > ---
> > > >  .../include/bits/ranges_uninitialized.h   | 29 
> > > >  libstdc++-v3/include/bits/stl_construct.h |  3 +
> > > >  libstdc++-v3/include/bits/stl_uninitialized.h | 42 
> > > >  libstdc++-v3/include/bits/version.def |  5 ++
> > > >  libstdc++-v3/include/bits/version.h   |  7 +-
> > > >  .../feature_test_macro.cc | 14 
> > > >  .../uninitialized_copy/constexpr.cc   | 58 
> > > >  .../constexpr.cc  | 67 ++
> > > >  .../uninitialized_fill/constexpr.cc   | 68 +++
> > > >  .../uninitialized_move/constexpr.cc   | 51 ++
> > > >  .../constexpr.cc  | 64 +
> > > >  11 files changed, 407 insertions(+), 1 deletion(-)
> > > >  create mode 100644 libstdc++-v3/testsuite/20_util/specialized_algorithms/feature_test_macro.cc
> > > >  create mode 100644 libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy/constexpr.cc
> > > >  create mode 100644 libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_default_construct/constexpr.cc
> > > >  create mode 100644 libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_fill/constexpr.cc
> > > >  create mode 100644 libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_move/constexpr.cc
> > > >  create mode 100644 libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_value_construct/constexpr.cc
> > > >
> > > > diff --git a/libstdc++-v3/include/bits/ranges_uninitialized.h b/libstdc++-v3/include/bits/ranges_uninitialized.h
> > > > index ced7bda5e37..337d3217

RE: [3/3 PATCH v3]middle-end: delay checking for alignment to load [PR118464]

2025-02-26 Thread Tamar Christina
> -Original Message-
> From: Richard Biener 
> Sent: Wednesday, February 26, 2025 12:30 PM
> To: Tamar Christina 
> Cc: gcc-patches@gcc.gnu.org; nd 
> Subject: Re: [3/3 PATCH v3]middle-end: delay checking for alignment to load
> [PR118464]
> 
> On Tue, 25 Feb 2025, Tamar Christina wrote:
> 
> > Hi All,
> >
> > This fixes two PRs on Early break vectorization by delaying the safety
> > checks to vectorizable_load when the VF, VMAT and vectype are all known.
> >
> > This patch does add two new restrictions:
> >
> > 1. On LOAD_LANES targets, where the buffer size is known, we reject uneven
> >    group sizes, as they are unaligned every n % 2 iterations and so may
> >    cross a page unwittingly.
> >
> > 2. On LOAD_LANES targets when the buffer is unknown, we reject vectorization
> >    if we cannot peel for alignment, as the alignment requirement is quite
> >    large at GROUP_SIZE * vectype_size.  This is unlikely to ever be
> >    beneficial so we don't support it for now.
> >
> > There are other steps documented inside the code itself so that the
> > reasoning is next to the code.
> >
> > Note that for VLA I have still left this fully disabled when not working
> > on a fixed buffer.
> >
> > For VLA targets like SVE return element alignment as the desired vector
> > alignment.  This means that the loads are never misaligned and so,
> > annoyingly, it won't ever need to peel.
> >
> > So what I think needs to happen in GCC 16 is this:
> >
> > 1. during vect_compute_data_ref_alignment we need to take the max of
> >    POLY_VALUE_MIN and vector_alignment.
> >
> > 2. vect_do_peeling define skip_vector when PFA for VLA, and in the guard
> >    add a check that ncopies * vectype does not exceed POLY_VALUE_MAX which
> >    we use as a proxy for pagesize.
> >
> > 3. Force LOOP_VINFO_USING_PARTIAL_VECTORS_P to be true in
> >    vect_determine_partial_vectors_and_peeling since the first iteration has
> >    to be partial. If LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P we have to fail to
> >    vectorize.
> >
> > 4. Create a default mask to be used, so that
> >    vect_use_loop_mask_for_alignment_p becomes true and we generate the
> >    peeled check through loop control for partial loops.  From what I can
> >    tell this won't work for LOOP_VINFO_FULLY_WITH_LENGTH_P since they don't
> >    have any peeling support at all in the compiler.  That would need to be
> >    done independently from the above.
> >
> > In any case, not GCC 15 material so I've kept the WIP patches I have
> > downstream.
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu,
> > arm-none-linux-gnueabihf, x86_64-pc-linux-gnu
> > -m32, -m64 and no issues.
> >
> > Ok for master?
> >
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> > PR tree-optimization/118464
> > PR tree-optimization/116855
> > * doc/invoke.texi (min-pagesize): Update docs with vectorizer use.
> > * tree-vect-data-refs.cc (vect_analyze_early_break_dependences): Delay
> > checks.
> > (vect_compute_data_ref_alignment): Remove alignment checks and move to
> > get_load_store_type, increase group access alignment.
> > (vect_enhance_data_refs_alignment): Add note to comment needing
> > investigating.
> > (vect_analyze_data_refs_alignment): Likewise.
> > (vect_supportable_dr_alignment): For group loads look at first DR.
> > * tree-vect-stmts.cc (get_load_store_type):
> > Perform safety checks for early break pfa.
> > * tree-vectorizer.h (dr_set_safe_speculative_read_required,
> > dr_safe_speculative_read_required, DR_SCALAR_KNOWN_BOUNDS): New.
> > (need_peeling_for_alignment): Renamed to...
> > (safe_speculative_read_required): .. This
> > (class dr_vec_info): Add scalar_access_known_in_bounds.
> >
> > gcc/testsuite/ChangeLog:
> >
> > PR tree-optimization/118464
> > PR tree-optimization/116855
> > * gcc.dg/vect/bb-slp-pr65935.c: Update, it now vectorizes because the
> > load type is relaxed later.
> > * gcc.dg/vect/vect-early-break_121-pr114081.c: Update.
> > * gcc.dg/vect/vect-early-break_22.c: Reject for load_lanes targets
> > * g++.dg/vect/vect-early-break_7-pr118464.cc: New test.
> > * gcc.dg/vect/vect-early-break_132-pr118464.c: New test.
> > * gcc.dg/vect/vect-early-break_133_pfa1.c: New test.
> > * gcc.dg/vect/vect-early-break_133_pfa10.c: New test.
> > * gcc.dg/vect/vect-early-break_133_pfa2.c: New test.
> > * gcc.dg/vect/vect-early-break_133_pfa3.c: New test.
> > * gcc.dg/vect/vect-early-break_133_pfa4.c: New test.
> > * gcc.dg/vect/vect-early-break_133_pfa5.c: New test.
> > * gcc.dg/vect/vect-early-break_133_pfa6.c: New test.
> > * gcc.dg/vect/vect-early-break_133_pfa7.c: New test.
> > * gcc.dg/vect/vect-early-break_133_pfa8.c: New test.
> > * gcc.dg/vect/vect-early-break_133_pfa9.c: New test.
> > * gcc.dg/vect/vect-early-break_39.c: Update testcase for misalignment.
> > * gcc.

Re: [PATCH v2] combine: Discard REG_UNUSED note in i2 when register is also referenced in i3 [PR118739]

2025-02-26 Thread Richard Biener
On Wed, 26 Feb 2025, Uros Bizjak wrote:

> The combine pass is trying to combine:
> 
> Trying 16, 22, 21 -> 23:
>16: r104:QI=flags:CCNO>0
>22: {r120:QI=r104:QI^0x1;clobber flags:CC;}
>   REG_UNUSED flags:CC
>21: r119:QI=flags:CCNO<=0
>   REG_DEAD flags:CCNO
>23: {r110:QI=r119:QI|r120:QI;clobber flags:CC;}
>   REG_DEAD r120:QI
>   REG_DEAD r119:QI
>   REG_UNUSED flags:CC
> 
> and creates the following two insn sequence:
> 
> modifying insn i2    22: r104:QI=flags:CCNO>0
>   REG_DEAD flags:CC
> deferring rescan insn with uid = 22.
> modifying insn i3    23: r110:QI=flags:CCNO<=0
>   REG_DEAD flags:CC
> deferring rescan insn with uid = 23.
> 
> where the REG_DEAD note in i2 is not correct, because the flags
> register is still referenced in i3.  In try_combine() megafunction,
> we have this part:
> 
> --cut here--
> /* Distribute all the LOG_LINKS and REG_NOTES from I1, I2, and I3.  */
> if (i3notes)
>   distribute_notes (i3notes, i3, i3, newi2pat ? i2 : NULL,
> elim_i2, elim_i1, elim_i0);
> if (i2notes)
>   distribute_notes (i2notes, i2, i3, newi2pat ? i2 : NULL,
> elim_i2, elim_i1, elim_i0);
> if (i1notes)
>   distribute_notes (i1notes, i1, i3, newi2pat ? i2 : NULL,
> elim_i2, local_elim_i1, local_elim_i0);
> if (i0notes)
>   distribute_notes (i0notes, i0, i3, newi2pat ? i2 : NULL,
> elim_i2, elim_i1, local_elim_i0);
> if (midnotes)
>   distribute_notes (midnotes, NULL, i3, newi2pat ? i2 : NULL,
> elim_i2, elim_i1, elim_i0);
> --cut here--
> 
> where the compiler distributes REG_UNUSED note from i2:
> 
>22: {r120:QI=r104:QI^0x1;clobber flags:CC;}
>   REG_UNUSED flags:CC
> 
> via distribute_notes() using the following:
> 
> --cut here--
>   /* Otherwise, if this register is used by I3, then this register
>  now dies here, so we must put a REG_DEAD note here unless there
>  is one already.  */
>   else if (reg_referenced_p (XEXP (note, 0), PATTERN (i3))
>&& ! (REG_P (XEXP (note, 0))
>  ? find_regno_note (i3, REG_DEAD,
> REGNO (XEXP (note, 0)))
>  : find_reg_note (i3, REG_DEAD, XEXP (note, 0))))
> {
>   PUT_REG_NOTE_KIND (note, REG_DEAD);
>   place = i3;
> }
> --cut here--
> 
> Flags register is used in I3, but there already is a REG_DEAD note in I3.
> The above condition doesn't trigger and continues in the "else" part where
> REG_DEAD note is put to I2.  The proposed solution corrects the above
> logic to trigger every time the register is referenced in I3, avoiding the
> "else" part.
>
> PR rtl-optimization/118739
> 
> gcc/ChangeLog:
> 
> * combine.cc (distribute_notes) : Correct the
>   logic when the register is used by I3.
> 
> gcc/testsuite/ChangeLog:
> 
> * gcc.target/i386/pr118739.c: New test.
> 
> Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
> 
> OK for master and eventual backports?

OK if there's no other comments on this until Monday.

Thanks,
Richard.


Re: [PATCH 2/4] c++/modules: Track module purview for deferred instantiations [PR114630]

2025-02-26 Thread Jason Merrill

On 2/21/25 6:05 AM, Nathaniel Shead wrote:

After seeing PR c++/118964 I'm coming back around to this [1] patch
series, since it appears that this can cause errors on otherwise valid
code by instantiations coming into module purview that reference
TU-local entities.

[1]: https://gcc.gnu.org/pipermail/gcc-patches/2024-May/650324.html

We'd previously discussed that the best way to solve this in general
would be to perform all deferred instantiations at the end of the GMF to
ensure that they would not leak into the module purview.  I still
tentatively agree that this would be the "nicer" way to go (though I've
since come across https://eel.is/c++draft/temp.point#7 and wonder if
this may not be strictly conforming according to the current wording?).


Hmm, interesting point.


I've not yet managed to implement this however, and even if I had, at
this stage it would definitely not be appropriate for GCC15; would the
approach described below be appropriate for GCC15 as a stop-gap to
reduce these issues?


As I suggested earlier in this discussion, it seems to me that the
purviewness of an implicit instantiation shouldn't matter; they should
all be treated as discardable.


Could we instead set DECL_MODULE_PURVIEW_P as appropriate when we see an 
explicit instantiation and use that + DECL_EXPLICIT_INSTANTIATION to 
recompute module_kind?


Or go with this patch but only look at the saved module_kind if 
DECL_EXPLICIT_INSTANTIATION, but that seems a bit of a waste of space. 
Might be safer at this point, though.



Another approach would be to maybe lower the error to a permerror, so
that users that 'know better' could compile such modules anyway (at risk
of various link errors and ODR issues), though I would also need to
adjust the streaming logic to handle this better.  Thoughts?


This also sounds desirable.  With a warning flag so it can be disabled 
for certain deliberate cases like the gthr stuff.


Jason


On Wed, May 01, 2024 at 08:00:22PM +1000, Nathaniel Shead wrote:

Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?

-- >8 --

When calling instantiate_pending_templates at end of parsing, any new
functions that are instantiated from this point have their module
purview set based on the current value of module_kind.

This is not ideal, however, as the modules code will then treat these
instantiations as reachable and cause large swathes of the GMF to be
emitted into the module CMI, despite no code in the actual module
purview referencing it.

This patch fixes this by also remembering the value of module_kind when
the instantiation was deferred, and restoring it when doing this
deferred instantiation.  That way newly instantiated declarations
get a DECL_MODULE_PURVIEW_P appropriate for where the
instantiation was required, meaning that GMF entities won't be counted
as reachable unless referenced by an actually reachable entity.
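
For readers who want the save-and-restore idea in isolation, here is a
minimal sketch.  It is not the GCC code: `current_kind`, `KIND_PURVIEW`,
`pending`, `defer` and `run_pending` are hypothetical stand-ins for
module_kind, its purview bit, tinst_level and the pending-template
machinery.  It only shows that each deferred item is processed under the
context cached when it was queued, and that the end-of-parse value is put
back afterwards.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the global module_kind flags.
static unsigned current_kind = 0;
// Assumed bit layout, for illustration only.
constexpr unsigned KIND_PURVIEW = 1u << 0;

// Work item: remembers the context that was active when it was deferred.
struct pending { int id; unsigned saved_kind; };
// Result: whether the item was processed as "in the module purview".
struct instantiated { int id; bool in_purview; };

static std::vector<pending> pending_list;

// Queue an item, caching the current context alongside it.
void defer (int id)
{
  pending_list.push_back ({id, current_kind});
}

// Process everything that was queued, each under its own cached context.
std::vector<instantiated> run_pending ()
{
  std::vector<instantiated> out;
  const unsigned saved = current_kind;    // keep the end-of-parse value
  for (const pending &p : pending_list)
    {
      current_kind = p.saved_kind;        // restore deferral-time context
      out.push_back ({p.id, (current_kind & KIND_PURVIEW) != 0});
    }
  current_kind = saved;                   // callers see no change
  pending_list.clear ();
  return out;
}
```

An item deferred while the flag was clear is processed as non-purview even
if the flag has been set by the time the queue is drained.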

Note that purviewness and attachment etc. is generally only determined
by the base template: this is purely for determining whether a
specialisation was declared in the module purview and hence whether it
should be streamed out.  See the comment on 'set_instantiating_module'.

PR c++/114630
PR c++/114795

gcc/cp/ChangeLog:

* cp-tree.h (struct tinst_level): Add field for tracking
module_kind.
* pt.cc (push_tinst_level_loc): Cache module_kind in new_level.
(reopen_tinst_level): Restore module_kind from level.
(instantiate_pending_templates): Save and restore module_kind so
it isn't affected by reopen_tinst_level.

gcc/testsuite/ChangeLog:

* g++.dg/modules/gmf-3.C: New test.

Signed-off-by: Nathaniel Shead 
---
  gcc/cp/cp-tree.h                     |  3 +++
  gcc/cp/pt.cc                         |  4 ++++
  gcc/testsuite/g++.dg/modules/gmf-3.C | 13 +++++++++++++
  3 files changed, 20 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/modules/gmf-3.C

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 1938ada0268..0e619120ccc 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -6626,6 +6626,9 @@ struct GTY((chain_next ("%h.next"))) tinst_level {
/* The location where the template is instantiated.  */
location_t locus;
  
+  /* The module kind where the template is instantiated. */
+  unsigned module_kind;
+
/* errorcount + sorrycount when we pushed this level.  */
unsigned short errors;
  
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 1c3eef60c06..401aa92bc3e 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -11277,6 +11277,7 @@ push_tinst_level_loc (tree tldcl, tree targs, location_t loc)
new_level->tldcl = tldcl;
new_level->targs = targs;
new_level->locus = loc;
+  new_level->module_kind = module_kind;
new_level->errors = errorcount + sorrycount;
new_level->next = NULL;
new_level->refcount = 0;
@@ -11345,6 +11346,7 @@ reopen_tinst_level (struct tinst_level *level)
for (t = level; t; t = t->next)
  ++tinst_depth;
  
+

RE: [3/3 PATCH v3]middle-end: delay checking for alignment to load [PR118464]

2025-02-26 Thread Tamar Christina
> > >
> > > No, I don't think so.  The code that eventually performs a
> > > contiguous sub-group access directly should never extend
> > > the load beyond GROUP_SIZE - or should be gated on the DR
> > > not executed speculatively.  That is, we should "fix" this
> > > elsewhere.
> > >
> >
> > It doesn't, it's just not aligned within the range of GROUP_SIZE
> > from what I remember.
> >
> > > If you have an updated patch I can look at what's wrong here if you
> > > tell me how to reproduce (after applying the patch I suppose).
> >
> > Yes, applying the patch and running:
> >
> > /work/build/gcc/xgcc -B/work/build/gcc/ /work/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c -m64 -fdiagnostics-plain-output -flto -ffat-lto-objects -msse2 -ftree-vectorize -fno-tree-loop-distribute-patterns -fno-vect-cost-model -fno-common -O2 -fdump-tree-vect-details -msse4.1 -lm -o ./vect-early-break_26.exe
> 
> So it works as in executing fine.  We have a VF of 4 and
> 
> note:   recording new base alignment for &b
>   alignment:32
>   misalignment: 0
>   based on: _1 = b[i_32];
> /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21:
> note:   recording new base alignment for &a
>   alignment:32
>   misalignment: 0
>   based on: _2 = a[i_32];
> /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21:
> note:   vect_compute_data_ref_alignment:
> /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21:
> note:   alignment increased due to early break to 32 bytes.
> /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21:
> missed:   misalign = 8 bytes of ref b[i_32]
> /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21:
> note:   vect_compute_data_ref_alignment:
> /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21:
> note:   alignment increased due to early break to 32 bytes.
> 
> so no peeling necessary.  But we also have like
> 
> /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21:
> missed:   misalign = 12 bytes of ref b[_6]
> /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21:
> note:   vect_compute_data_ref_alignment:
> /space/rguenther/src/gcc/gcc/testsuite/gcc.dg/vect/vect-early-break_26.c:35:21:
> note:   alignment increased due to early break to 32 bytes.
> 
> and we are correctly saying we vectorize an unaligned access.
> 
> The "issue" is we're having SLP nodes with a load permutation, their
> expansion might not happen with the whole DR group in mind.  I'd say
> we simply refuse to do early break speculative load vectorization
> for SLP nodes with a load permutation.

This is what I was trying to say on IRC when I mentioned that the permutes
can end up creating an unaligned access with respect to the original address.

But the reason I was still trying to allow this case is because conceptually
my assumption was that the permutes still maintain the access within
the group.  After all, they're just shifting elements around.

In other words, I was assuming that the group a[i] - a[i-2] still stays within
the group alignment of 32-bytes, even if the permute can make the second
load in the group start at say, byte 28.  My assumption was though that it can't
make it start at byte 36.

Are you saying that it can?  If so, then I agree that load permutations
on group loads are unsafe to speculate for unmasked loops...
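
To make the arithmetic behind that question concrete, here is a small
sketch.  It is not vectorizer code: `access_stays_in_block` is a
hypothetical helper, offsets are taken relative to the aligned start of the
group, and the 4-byte and 16-byte access sizes are assumptions (only the
32-byte alignment comes from the dump above).

```cpp
#include <cassert>

// Hypothetical helper: does an access of SIZE bytes starting OFFSET bytes
// past the aligned group base stay inside the BLOCK-byte aligned block?
constexpr bool
access_stays_in_block (unsigned offset, unsigned size, unsigned block)
{
  return offset + size <= block;
}

// A 16-byte access at offset 12 covers bytes 12..27: inside the block.
static_assert (access_stays_in_block (12, 16, 32), "12 + 16 <= 32");
// A 4-byte access at offset 28 covers bytes 28..31: still inside.
static_assert (access_stays_in_block (28, 4, 32), "28 + 4 <= 32");
// A 16-byte access at offset 28 covers bytes 28..43: leaves the block.
static_assert (!access_stays_in_block (28, 16, 32), "28 + 16 > 32");
// Anything starting at offset 36 is already outside the block.
static_assert (!access_stays_in_block (36, 4, 32), "36 + 4 > 32");
```

Whether a permuted load can actually end up in the third or fourth
situation is exactly the open question here; the sketch only shows what
each case would mean for the block.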

Thanks,
Tamar
> 
> It looks like a latent issue to me which could also interfere with
> gap peeling, I have to dig a bit further what code is responsible
> for the current behavior ...
> 
> 
> 
> > Thanks,
> > Tamar
> >
> > >
> > > > Enforcing the alignment on every group member would be wrong I think since
> > > > that ends up with higher overall alignment than they need.
> > > >
> > > > > So besides these issues in get_load_store_type the change looks good now.
> > > > >
> > > >
> > > > Thanks for the reviews.
> > > >
> > > > Tamar
> > > > > Richard.
> > > > >
> > > > > > +  else
> > > > > > +   *alignment_support_scheme = dr_aligned;
> > > > > > +}
> > > > > > +
> > > > > >if (*alignment_support_scheme == dr_unaligned_unsupported)
> > > > > >  {
> > > > > >if (dump_enabled_p ())
> > > > > > diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
> > > > > > index b0cb081cba0ae8b11fbfcfcb8c6d440ec451ccb5..97caf61b345735d297ec49fd6ca64797435b46fc 100644
> > > > > > --- a/gcc/tree-vectorizer.h
> > > > > > +++ b/gcc/tree-vectorizer.h
> > > > > > @@ -1281,7 +1281,11 @@ public:
> > > > > >
> > > > > > >/* Set by early break vectorization when this DR needs peeling for alignment
> > > > > > >   for correctness.  */
> > > > > > -  bool need_peeling_for_alignment;
> > > > > > +  bool safe_speculative_read_required;
> > > > > > +
> > > > > > +  /* Set by early break vectorization whe

Re: [PATCH v2] libstdc++: implement constexpr memory algorithms

2025-02-26 Thread Giuseppe D'Angelo

Hello,

On 25/02/2025 23:46, Jonathan Wakely wrote:

Maybe we can get away with unconditionally declaring this
_GLIBCXX26_CONSTEXPR?  If the compiler doesn't support constexpr
placement new then the 'constexpr' would be silently dropped at
instantiation time.  This would be in line with C++23 P2448R2 which
made it no longer IFNDR to declare a constexpr function template
for which no specialization is actually constexpr.


Yeah, for internal functions that aren't ever compiled as C++98, we
can often just make them constexpr. It will never be called during
constant evaluation in C++20 or older, but that's usually fine.

In this case though, would the placement new make it ill-formed in
Clang 18, which didn't support P2448R2?


Yes, I had the same question. Cppreference says that Clang 17/18 have a 
partial implementation of P2448, not sure what that means:


https://en.cppreference.com/w/cpp/23

A quick test shows that it seems happy with a placement new in a 
constexpr function in C++17 mode, but I'm not positive it won't complain 
in some other case.


So I'm going to leave these macros alone if you don't mind.



We typically call __is_constant_evaluated fully qualified (though I
don't remember why since it's not eligible for ADL?)


I don't think we're consistent, and it's not necessary.

But we could simplify things a little by doing:

#if __glibcxx_raw_memory_algorithms >= 202411L // >= C++26
   if consteval {
 return std::__do_uninit_copy(__first, __last, __result);
   }
#endif

We don't need to use the __is_constant_evaluated() wrapper, or even
the std::is_constant_evaluated() function, because this is C++26 code
so we know 'if consteval' works. Clang supports it since version 14,
which is too old to support any C++26 mode, so every Clang that
supports -std=c++2c also supports 'if consteval'.


I'm not sure if the extra FTM check makes it more readable, but I guess 
`if consteval` is strictly better than the wrappers, so I'll amend this way.
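
For illustration only, here is a standalone sketch of the same dispatch
pattern outside libstdc++.  The names `evaluation_path` and
`evaluation_path_at_run_time` are hypothetical.  The guard uses the
standard `__cpp_if_consteval` feature-test macro rather than the library's
own macro, and the fallback assumes a compiler that provides
`__builtin_is_constant_evaluated` (GCC and Clang do).

```cpp
#include <cassert>

// Hypothetical function: reports which branch was taken.
// Returns 1 during constant evaluation and 2 otherwise.
constexpr int evaluation_path ()
{
#if __cpp_if_consteval >= 202106L
  // The feature-test macro guarantees 'if consteval' is available here.
  if consteval { return 1; }
#else
  // Fallback for older language modes (assumed compiler builtin).
  if (__builtin_is_constant_evaluated ())
    return 1;
#endif
  return 2;
}

// Hypothetical wrapper: the call below is not manifestly constant-evaluated,
// so the run-time branch is selected.
int evaluation_path_at_run_time ()
{
  return evaluation_path ();
}

// An initializer of a constexpr variable is manifestly constant-evaluated.
constexpr int at_compile_time = evaluation_path ();
static_assert (at_compile_time == 1, "constant-evaluation branch taken");
```

The point of the guard is the one made above: once the feature-test macro
says the C++26 code is active, no wrapper function is needed to ask whether
constant evaluation is in progress.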


New patch is attached.

Thank you,

--
Giuseppe D'Angelo
From f4257d81d4314bc415c283df48e2b5cbc95abc8c Mon Sep 17 00:00:00 2001
From: Giuseppe D'Angelo 
Date: Sun, 16 Feb 2025 19:37:07 +0100
Subject: [PATCH] libstdc++: implement constexpr memory algorithms

This commit adds support for C++26's constexpr specialized memory
algorithms, introduced by P2283R2, P3508R0, P3369R0.

The uninitialized_default, value, copy, move and fill algorithms are
affected, in all of their variants (iterator-based, range-based and _n
versions.)

The changes are mostly mechanical -- add `constexpr` to a number of
signatures. I've introduced a helper macro to conditionally expand to
`constexpr` only in C++26 and above modes. The internal helper guard
class for range algorithms instead can be marked unconditionally.

The only "real" change to the implementation of the algorithms is that
during constant evaluation I dispatch to a constexpr-friendly version of
them.

For each algorithm family I've added only one test to cover it and its
variants; the idea is to avoid too much repetition and simplify future
maintenance.

libstdc++-v3/ChangeLog:

	* include/bits/ranges_uninitialized.h: Mark the specialized
	memory algorithms as constexpr in C++26. Also mark the members
	of the _DestroyGuard helper class.
	* include/bits/stl_uninitialized.h: Ditto.
	* include/bits/stl_construct.h: Mark _Construct_novalue (which
	uses placement new to do default initialization) as constexpr
	in C++26. This is possible due to P2747R2, which GCC already
	implements; check P2747's feature-testing macro to avoid
	issues with other compilers.
	* include/bits/version.def: Bump the feature-testing macro.
	* include/bits/version.h: Regenerate.
	* testsuite/20_util/specialized_algorithms/feature_test_macro.cc: New test.
	* testsuite/20_util/specialized_algorithms/uninitialized_copy/constexpr.cc: New test.
	* testsuite/20_util/specialized_algorithms/uninitialized_default_construct/constexpr.cc:
	New test.
	* testsuite/20_util/specialized_algorithms/uninitialized_fill/constexpr.cc: New test.
	* testsuite/20_util/specialized_algorithms/uninitialized_move/constexpr.cc: New test.
	* testsuite/20_util/specialized_algorithms/uninitialized_value_construct/constexpr.cc:
	New test.

Signed-off-by: Giuseppe D'Angelo 
---
 .../include/bits/ranges_uninitialized.h   | 29 
 libstdc++-v3/include/bits/stl_construct.h |  3 +
 libstdc++-v3/include/bits/stl_uninitialized.h | 47 +
 libstdc++-v3/include/bits/version.def |  5 ++
 libstdc++-v3/include/bits/version.h   |  7 +-
 .../feature_test_macro.cc | 14 
 .../uninitialized_copy/constexpr.cc   | 58 
 .../constexpr.cc  | 67 ++
 .../uninitialized_fill/constexpr.cc   | 68 +++
 .../uninitialized_move/constexpr.cc   | 51 ++
 .../constexpr.cc  | 64 +
 11 files changed,

Re: [PATCH] combine: Discard REG_UNUSED note in i2 when register is also referenced in i3 [PR118739]

2025-02-26 Thread Uros Bizjak
On Mon, Feb 24, 2025 at 10:46 AM Richard Biener
 wrote:

>   /* Otherwise, if this register is used by I3, then this register
>      now dies here, so we must put a REG_DEAD note here unless there
>      is one already.  */
>   else if (reg_referenced_p (XEXP (note, 0), PATTERN (i3)))
>     {
>       if (! (REG_P (XEXP (note, 0))
>              ? find_regno_note (i3, REG_DEAD,
>                                 REGNO (XEXP (note, 0)))
>              : find_reg_note (i3, REG_DEAD, XEXP (note, 0))))
>         {
>           PUT_REG_NOTE_KIND (note, REG_DEAD);
>           place = i3;
>         }
>     }
>
> ?  At least the else { case seems to assume the reg isn't referenced in i3.
> The comment wording might also need an update of course.

Reading the comment a couple of times, and considering that "here" and
"there" mean I3 and that "one" means a REG_DEAD note for "this
register" in I3, the comment actually makes sense.

Uros.


Re: [PATCH] simple-diagnostic-path: Inline two trivial methods [PR116143]

2025-02-26 Thread Jakub Jelinek
On Wed, Feb 26, 2025 at 07:55:28AM -0500, David Malcolm wrote:
> BTW, Qing Zhao's patch kit
>   "[PATCH v4 0/3][RFC]Provide more contexts for -Warray-bounds and
>   -Wstringop-* warning messages"
> https://gcc.gnu.org/pipermail/gcc-patches/2025-January/673474.html

I'm not sure we want that, that is just one of the many possibilities
of showing up some reasoning why a misdesigned late warning has been
emitted.

> adds a usage of simple_diagnostic_path to OBJS via a new
> gcc/move-history-rich-location.o in this patch:
>   https://gcc.gnu.org/pipermail/gcc-patches/2024-November/667615.html

So what about at least for now something like the patch below, or
perhaps export some extra substitutions for ENABLE_PLUGIN and CHECKING_P
from configure and conditionalize this on plugins enabled and CHECKING_P
disabled.

This certainly fixes all the plugin.exp tests for --enable-checking=release.

2025-02-26  Jakub Jelinek  

PR testsuite/116143
* Makefile.in (EXTRA_BACKEND_OBJS): New variable.
(BACKEND): Use it before libbackend.a.

--- gcc/Makefile.in.jj  2025-02-12 22:23:07.136772228 +0100
+++ gcc/Makefile.in 2025-02-26 14:51:50.372704489 +0100
@@ -1904,8 +1904,12 @@ ifeq (@enable_libgdiagnostics@,yes)
 ALL_HOST_OBJS += $(libgdiagnostics_OBJS) $(SARIF_REPLAY_OBJS)
 endif
 
-BACKEND = libbackend.a main.o libcommon-target.a libcommon.a \
-   $(CPPLIB) $(LIBDECNUMBER)
+# libbackend.a objs that might not be in some cases linked into the compiler,
+# yet they are supposed to be part of the plugin ABI.  See PR116143.
+EXTRA_BACKEND_OBJS = simple-diagnostic-path.o lazy-diagnostic-path.o
+
+BACKEND = $(EXTRA_BACKEND_OBJS) libbackend.a main.o libcommon-target.a \
+   libcommon.a $(CPPLIB) $(LIBDECNUMBER)
 
 # This is defined to "yes" if Tree checking is enabled, which roughly means
 # front-end checking.


Jakub



RE: [3/3 PATCH v3]middle-end: delay checking for alignment to load [PR118464]

2025-02-26 Thread Richard Biener
On Wed, 26 Feb 2025, Tamar Christina wrote:

> > -----Original Message-----
> > From: Richard Biener 
> > Sent: Wednesday, February 26, 2025 12:30 PM
> > To: Tamar Christina 
> > Cc: gcc-patches@gcc.gnu.org; nd 
> > Subject: Re: [3/3 PATCH v3]middle-end: delay checking for alignment to load
> > [PR118464]
> > 
> > On Tue, 25 Feb 2025, Tamar Christina wrote:
> > 
> > > Hi All,
> > >
> > > This fixes two PRs on Early break vectorization by delaying the safety checks to
> > > vectorizable_load when the VF, VMAT and vectype are all known.
> > >
> > > This patch does add two new restrictions:
> > >
> > > 1. On LOAD_LANES targets, where the buffer size is known, we reject uneven
> > >    group sizes, as they are unaligned every n % 2 iterations and so may cross
> > >    a page unwittingly.
> > >
> > > 2. On LOAD_LANES targets when the buffer is unknown, we reject vectorization if
> > >    we cannot peel for alignment, as the alignment requirement is quite large at
> > >    GROUP_SIZE * vectype_size.  This is unlikely to ever be beneficial so we
> > >    don't support it for now.
> > >
> > > There are other steps documented inside the code itself so that the reasoning
> > > is next to the code.
> > >
> > > Note that for VLA I have still left this fully disabled when not working on a
> > > fixed buffer.
> > >
> > > For VLA targets like SVE return element alignment as the desired vector
> > > alignment.  This means that the loads are never misaligned and so annoyingly it
> > > won't ever need to peel.
> > >
> > > So what I think needs to happen in GCC 16 is the following:
> > >
> > > 1. during vect_compute_data_ref_alignment we need to take the max of
> > >    POLY_VALUE_MIN and vector_alignment.
> > >
> > > 2. vect_do_peeling define skip_vector when PFA for VLA, and in the guard add a
> > >    check that ncopies * vectype does not exceed POLY_VALUE_MAX which we use as a
> > >    proxy for pagesize.
> > >
> > > 3. Force LOOP_VINFO_USING_PARTIAL_VECTORS_P to be true in
> > >    vect_determine_partial_vectors_and_peeling since the first iteration has to
> > >    be partial. If LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P we have to fail to
> > >    vectorize.
> > >
> > > 4. Create a default mask to be used, so that vect_use_loop_mask_for_alignment_p
> > >    becomes true and we generate the peeled check through loop control for
> > >    partial loops.  From what I can tell this won't work for
> > >    LOOP_VINFO_FULLY_WITH_LENGTH_P since they don't have any peeling support at
> > >    all in the compiler.  That would need to be done independently from the
> > >    above.
> > >
> > > In any case, not GCC 15 material so I've kept the WIP patches I have downstream.
> > >
> > > Bootstrapped Regtested on aarch64-none-linux-gnu,
> > > arm-none-linux-gnueabihf, x86_64-pc-linux-gnu
> > > -m32, -m64 and no issues.
> > >
> > > Ok for master?
> > >
> > > Thanks,
> > > Tamar
> > >
> > > gcc/ChangeLog:
> > >
> > >   PR tree-optimization/118464
> > >   PR tree-optimization/116855
> > >   * doc/invoke.texi (min-pagesize): Update docs with vectorizer use.
> > >   * tree-vect-data-refs.cc (vect_analyze_early_break_dependences): Delay
> > >   checks.
> > >   (vect_compute_data_ref_alignment): Remove alignment checks and move to
> > >   get_load_store_type, increase group access alignment.
> > >   (vect_enhance_data_refs_alignment): Add note to comment needing
> > >   investigating.
> > >   (vect_analyze_data_refs_alignment): Likewise.
> > >   (vect_supportable_dr_alignment): For group loads look at first DR.
> > >   * tree-vect-stmts.cc (get_load_store_type):
> > >   Perform safety checks for early break pfa.
> > >   * tree-vectorizer.h (dr_set_safe_speculative_read_required,
> > >   dr_safe_speculative_read_required, DR_SCALAR_KNOWN_BOUNDS): New.
> > >   (need_peeling_for_alignment): Renamed to...
> > >   (safe_speculative_read_required): .. This
> > >   (class dr_vec_info): Add scalar_access_known_in_bounds.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > >   PR tree-optimization/118464
> > >   PR tree-optimization/116855
> > >   * gcc.dg/vect/bb-slp-pr65935.c: Update, it now vectorizes because the
> > >   load type is relaxed later.
> > >   * gcc.dg/vect/vect-early-break_121-pr114081.c: Update.
> > >   * gcc.dg/vect/vect-early-break_22.c: Reject for load_lanes targets
> > >   * g++.dg/vect/vect-early-break_7-pr118464.cc: New test.
> > >   * gcc.dg/vect/vect-early-break_132-pr118464.c: New test.
> > >   * gcc.dg/vect/vect-early-break_133_pfa1.c: New test.
> > >   * gcc.dg/vect/vect-early-break_133_pfa10.c: New test.
> > >   * gcc.dg/vect/vect-early-break_133_pfa2.c: New test.
> > >   * gcc.dg/vect/vect-early-break_133_pfa3.c: New test.
> > >   * gcc.dg/vect/vect-early-break_133_pfa4.c: New test.
> > >   * gcc.dg/vect/vect-early-break_133_pfa5.c: New test.
> > >   * gcc.dg/vect/vect-early-break_133_pfa6.c: New test.
> > >   * gcc.dg/vect/vect-early-b

Re: [PATCH] simple-diagnostic-path: Inline two trivial methods [PR116143]

2025-02-26 Thread Richard Biener
On Wed, Feb 26, 2025 at 3:01 PM Jakub Jelinek  wrote:
>
> On Wed, Feb 26, 2025 at 07:55:28AM -0500, David Malcolm wrote:
> > BTW, Qing Zhao's patch kit
> >   "[PATCH v4 0/3][RFC]Provide more contexts for -Warray-bounds and
> >   -Wstringop-* warning messages"
> > https://gcc.gnu.org/pipermail/gcc-patches/2025-January/673474.html
>
> I'm not sure we want that, that is just one of the many possibilities
> of showing up some reasoning why a misdesigned late warning has been
> emitted.
>
> > adds a usage of simple_diagnostic_path to OBJS via a new
> > gcc/move-history-rich-location.o in this patch:
> >   https://gcc.gnu.org/pipermail/gcc-patches/2024-November/667615.html
>
> So what about at least for now something like the patch below, or
> perhaps export some extra substitutions for ENABLE_PLUGIN and CHECKING_P
> from configure and conditionalize this on plugins enabled and CHECKING_P
> disabled.
>
> This certainly fixes all the plugin.exp tests for --enable-checking=release.

That works for me (even unconditionally).

Richard.

> 2025-02-26  Jakub Jelinek  
>
> PR testsuite/116143
> * Makefile.in (EXTRA_BACKEND_OBJS): New variable.
> (BACKEND): Use it before libbackend.a.
>
> --- gcc/Makefile.in.jj  2025-02-12 22:23:07.136772228 +0100
> +++ gcc/Makefile.in 2025-02-26 14:51:50.372704489 +0100
> @@ -1904,8 +1904,12 @@ ifeq (@enable_libgdiagnostics@,yes)
>  ALL_HOST_OBJS += $(libgdiagnostics_OBJS) $(SARIF_REPLAY_OBJS)
>  endif
>
> -BACKEND = libbackend.a main.o libcommon-target.a libcommon.a \
> -   $(CPPLIB) $(LIBDECNUMBER)
> +# libbackend.a objs that might not be in some cases linked into the compiler,
> +# yet they are supposed to be part of the plugin ABI.  See PR116143.
> +EXTRA_BACKEND_OBJS = simple-diagnostic-path.o lazy-diagnostic-path.o
> +
> +BACKEND = $(EXTRA_BACKEND_OBJS) libbackend.a main.o libcommon-target.a \
> +   libcommon.a $(CPPLIB) $(LIBDECNUMBER)
>
>  # This is defined to "yes" if Tree checking is enabled, which roughly means
>  # front-end checking.
>
>
> Jakub
>


RE: [3/3 PATCH v3]middle-end: delay checking for alignment to load [PR118464]

2025-02-26 Thread Richard Biener
On Wed, 26 Feb 2025, Tamar Christina wrote:

> > -----Original Message-----
> > From: Richard Biener 
> > Sent: Wednesday, February 26, 2025 1:52 PM
> > To: Tamar Christina 
> > Cc: gcc-patches@gcc.gnu.org; nd 
> > Subject: RE: [3/3 PATCH v3]middle-end: delay checking for alignment to load
> > [PR118464]
> > 
> > On Wed, 26 Feb 2025, Tamar Christina wrote:
> > 
> > > > -----Original Message-----
> > > > From: Richard Biener 
> > > > Sent: Wednesday, February 26, 2025 12:30 PM
> > > > To: Tamar Christina 
> > > > Cc: gcc-patches@gcc.gnu.org; nd 
> > > > Subject: Re: [3/3 PATCH v3]middle-end: delay checking for alignment to load
> > > > [PR118464]
> > > >
> > > > On Tue, 25 Feb 2025, Tamar Christina wrote:
> > > >
> > > > > Hi All,
> > > > >
> > > > > This fixes two PRs on Early break vectorization by delaying the safety checks to
> > > > > vectorizable_load when the VF, VMAT and vectype are all known.
> > > > >
> > > > > This patch does add two new restrictions:
> > > > >
> > > > > 1. On LOAD_LANES targets, where the buffer size is known, we reject uneven
> > > > >    group sizes, as they are unaligned every n % 2 iterations and so may cross
> > > > >    a page unwittingly.
> > > > >
> > > > > 2. On LOAD_LANES targets when the buffer is unknown, we reject vectorization if
> > > > >    we cannot peel for alignment, as the alignment requirement is quite large at
> > > > >    GROUP_SIZE * vectype_size.  This is unlikely to ever be beneficial so we
> > > > >    don't support it for now.
> > > > >
> > > > > There are other steps documented inside the code itself so that the reasoning
> > > > > is next to the code.
> > > > >
> > > > > Note that for VLA I have still left this fully disabled when not working on a
> > > > > fixed buffer.
> > > > >
> > > > > For VLA targets like SVE return element alignment as the desired vector
> > > > > alignment.  This means that the loads are never misaligned and so annoyingly it
> > > > > won't ever need to peel.
> > > > >
> > > > > So what I think needs to happen in GCC 16 is the following:
> > > > >
> > > > > 1. during vect_compute_data_ref_alignment we need to take the max of
> > > > >    POLY_VALUE_MIN and vector_alignment.
> > > > >
> > > > > 2. vect_do_peeling define skip_vector when PFA for VLA, and in the guard add a
> > > > >    check that ncopies * vectype does not exceed POLY_VALUE_MAX which we use as a
> > > > >    proxy for pagesize.
> > > > >
> > > > > 3. Force LOOP_VINFO_USING_PARTIAL_VECTORS_P to be true in
> > > > >    vect_determine_partial_vectors_and_peeling since the first iteration has to
> > > > >    be partial. If LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P we have to fail to
> > > > >    vectorize.
> > > > >
> > > > > 4. Create a default mask to be used, so that vect_use_loop_mask_for_alignment_p
> > > > >    becomes true and we generate the peeled check through loop control for
> > > > >    partial loops.  From what I can tell this won't work for
> > > > >    LOOP_VINFO_FULLY_WITH_LENGTH_P since they don't have any peeling support at
> > > > >    all in the compiler.  That would need to be done independently from the
> > > > >    above.
> > > > >
> > > > > In any case, not GCC 15 material so I've kept the WIP patches I have downstream.
> > > > > Bootstrapped Regtested on aarch64-none-linux-gnu,
> > > > > arm-none-linux-gnueabihf, x86_64-pc-linux-gnu
> > > > > -m32, -m64 and no issues.
> > > > >
> > > > > Ok for master?
> > > > >
> > > > > Thanks,
> > > > > Tamar
> > > > >
> > > > > gcc/ChangeLog:
> > > > >
> > > > >   PR tree-optimization/118464
> > > > >   PR tree-optimization/116855
> > > > >   * doc/invoke.texi (min-pagesize): Update docs with vectorizer use.
> > > > >   * tree-vect-data-refs.cc (vect_analyze_early_break_dependences): Delay
> > > > >   checks.
> > > > >   (vect_compute_data_ref_alignment): Remove alignment checks and move to
> > > > >   get_load_store_type, increase group access alignment.
> > > > >   (vect_enhance_data_refs_alignment): Add note to comment needing
> > > > >   investigating.
> > > > >   (vect_analyze_data_refs_alignment): Likewise.
> > > > >   (vect_supportable_dr_alignment): For group loads look at first DR.
> > > > >   * tree-vect-stmts.cc (get_load_store_type):
> > > > >   Perform safety checks for early break pfa.
> > > > >   * tree-vectorizer.h (dr_set_safe_speculative_read_required,
> > > > >   dr_safe_speculative_read_required, DR_SCALAR_KNOWN_BOUNDS): New.
> > > > >   (need_peeling_for_alignment): Renamed to...
> > > > >   (safe_speculative_read_required): .. This
> > > > >   (class dr_vec_info): Add scalar_access_known_in_bounds.
> > > > >
> > > > > gcc/testsuite/Chang

Re: [PATCH] libstdc++: implement constexpr memory algorithms

2025-02-26 Thread Patrick Palka
On Tue, 25 Feb 2025, Jonathan Wakely wrote:

> On Thu, 20 Feb 2025 at 16:23, Patrick Palka  wrote:
> >
> > On Sun, 16 Feb 2025, Giuseppe D'Angelo wrote:
> >
> > > Hello,
> > >
> > > the attached patch implements the C++26 papers that add `constexpr` to the
> > > specialized memory algorithms (the uninitialized_* family). Tested on x86-64
> > > Linux.
> > >
> > > Thank you,
> > > --
> > > Giuseppe D'Angelo
> > >
> >
> > > Subject: [PATCH] libstdc++: implement constexpr memory algorithms
> > >
> > > This commit adds support for C++26's constexpr specialized memory
> > > algorithms, introduced by P2283R2, P3508R0, P3369R0.
> > >
> > > The uninitialized_default, value, copy, move and fill algorithms are
> > > affected, in all of their variants (iterator-based, range-based and _n
> > > versions.)
> > >
> > > The changes are mostly mechanical -- add `constexpr` to a number of
> > > signatures. I've introduced a helper macro to conditionally expand to
> > > `constexpr` only in C++26 and above modes. The internal helper guard
> > > class for range algorithms instead can be marked unconditionally.
> > >
> > > uninitialized_fill is the only algorithm where I had to add a branch to
> > > a constexpr-friendly version (already existing).
> >
> > Seems the patch also adds code to uninitialized_copy and
> > uninitialized_fill_n?
> >
> > >
> > > For each algorithm family I've added only one test to cover it and its
> > > variants; the idea is to avoid too much repetition and simplify future
> > > maintenance.
> > >
> > > libstdc++-v3/ChangeLog:
> > >
> > >   * include/bits/ranges_uninitialized.h: Mark the specialized
> > >   memory algorithms as constexpr in C++26. Also mark the members
> > >   of the _DestroyGuard helper class.
> > >   * include/bits/stl_uninitialized.h: Ditto.
> > >   * include/bits/stl_construct.h: Mark _Construct_novalue (which
> > >   uses placement new to do default initialization) as constexpr
> > >   in C++26. This is possible due to P2747R2, which GCC already
> > >   implements; check P2747's feature-testing macro to avoid
> > >   issues with other compilers.
> > >   * include/bits/version.def: Bump the feature-testing macro.
> > >   * include/bits/version.h: Regenerate.
> > >   * testsuite/20_util/specialized_algorithms/feature_test_macro.cc: New test.
> > >   * testsuite/20_util/specialized_algorithms/uninitialized_copy/constexpr.cc: New test.
> > >   * testsuite/20_util/specialized_algorithms/uninitialized_default_construct/constexpr.cc:
> > >   New test.
> > >   * testsuite/20_util/specialized_algorithms/uninitialized_fill/constexpr.cc: New test.
> > >   * testsuite/20_util/specialized_algorithms/uninitialized_move/constexpr.cc: New test.
> > >   * testsuite/20_util/specialized_algorithms/uninitialized_value_construct/constexpr.cc:
> > >   New test.
> > >
> > > Signed-off-by: Giuseppe D'Angelo 
> > > ---
> > >  .../include/bits/ranges_uninitialized.h   | 29 
> > >  libstdc++-v3/include/bits/stl_construct.h |  3 +
> > >  libstdc++-v3/include/bits/stl_uninitialized.h | 42 
> > >  libstdc++-v3/include/bits/version.def |  5 ++
> > >  libstdc++-v3/include/bits/version.h   |  7 +-
> > >  .../feature_test_macro.cc | 14 
> > >  .../uninitialized_copy/constexpr.cc   | 58 
> > >  .../constexpr.cc  | 67 ++
> > >  .../uninitialized_fill/constexpr.cc   | 68 +++
> > >  .../uninitialized_move/constexpr.cc   | 51 ++
> > >  .../constexpr.cc  | 64 +
> > >  11 files changed, 407 insertions(+), 1 deletion(-)
> > >  create mode 100644 libstdc++-v3/testsuite/20_util/specialized_algorithms/feature_test_macro.cc
> > >  create mode 100644 libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy/constexpr.cc
> > >  create mode 100644 libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_default_construct/constexpr.cc
> > >  create mode 100644 libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_fill/constexpr.cc
> > >  create mode 100644 libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_move/constexpr.cc
> > >  create mode 100644 libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_value_construct/constexpr.cc
> > >
> > > diff --git a/libstdc++-v3/include/bits/ranges_uninitialized.h b/libstdc++-v3/include/bits/ranges_uninitialized.h
> > > index ced7bda5e37..337d321702d 100644
> > > --- a/libstdc++-v3/include/bits/ranges_uninitialized.h
> > > +++ b/libstdc++-v3/include/bits/ranges_uninitialized.h
> > > @@ -35,6 +35,12 @@
> > >
> > >  #include 
> > >
> > > +#if __glibcxx_raw_memory_algorithms >= 202411L // >= C++26
> > > +# define _

Re: [PATCH] arm: Fix up REVERSE_CONDITION macro [PR119002]

2025-02-26 Thread Richard Earnshaw (lists)
On 26/02/2025 15:59, Jakub Jelinek wrote:
> Hi!
> 
> The linaro CI found my PR119002 patch broke bootstrap on arm.
> Seems the problem is that it has incorrect REVERSE_CONDITION macro
> definition.
> All other targets' REVERSE_CONDITION definitions and the default one
> just use the macro's arguments, while arm.h definition uses the MODE
> argument but uses code instead of CODE (the first argument).
> This happens to work because before my patch the only use of the
> macro was in jump.cc with
>   /* First see if machine description supplies us way to reverse the
>  comparison.  Give it priority over everything else to allow
>  machine description to do tricks.  */
>   if (GET_MODE_CLASS (mode) == MODE_CC
>   && REVERSIBLE_CC_MODE (mode))
> return REVERSE_CONDITION (code, mode);
> but in my patch it is used with GT rather than code.
> 
> The following patch fixes it, completely untested (but without my
> other patch it doesn't change anything on the preprocessed source),
> ok for trunk?
> 
> 2025-02-26  Jakub Jelinek  
> 
>   PR rtl-optimization/119002
>   * config/arm/arm.h (REVERSE_CONDITION): Use CODE - the macro
>   argument - in the macro rather than code.
> 
> --- gcc/config/arm/arm.h.jj   2025-01-09 22:04:32.140141200 +0100
> +++ gcc/config/arm/arm.h  2025-02-26 16:46:13.127544209 +0100
> @@ -2261,8 +2261,8 @@ extern int making_const_table;
>  
>  #define REVERSE_CONDITION(CODE,MODE) \
>(((MODE) == CCFPmode || (MODE) == CCFPEmode) \
> -   ? reverse_condition_maybe_unordered (code) \
> -   : reverse_condition (code))
> +   ? reverse_condition_maybe_unordered (CODE) \
> +   : reverse_condition (CODE))
>  
>  #define CLZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE) \
>((VALUE) = GET_MODE_UNIT_BITSIZE (MODE), 2)
> 
>   Jakub
> 

OK

R.


Re: [PATCH v2] RISC-V: Fix bug for expand_const_vector interleave [PR118931]

2025-02-26 Thread Robin Dapp

> Thanks Robin.
>
>> - IMHO we need to check both series for overflow, if step2 overflows in the
>> smaller type isn't the result equally wrong?
>
> The series2 will shift right before IOR, thus the overflow bits never effect
> on the final result.
> For example, the series2 will be similar as below after shift.
>
> v2.b = {0, 17, 0, 33, 0, 49, 0, 65, 0, 81, 0, 97, 0, 113, 0, 129}

Hmm, I think it's the other way around and it is being shifted left.  In that
case we're only keeping the lower bits and the "overflow bits" don't matter.
That means we should indeed be OK here.

> For i32 interleave, we will extend to i64 for series1 and series2, I think we
> can design
> a series like base + i * step with last element overflow to bit 65 but have
> bits 32-64 all zero, and then
> the element before the last one will have overflow bits pollute the result.
>
> It seems checking the last 2 elements is good enough here.

I guess the worst that can happen theoretically is i = 2 ^ 32 - 1,
step = 2 ^ 32 - 1, and base = 2 ^ 32 - 1?  But i is bound by our "largest"
mode so IMHO a 64-bit value should be sufficient.

We always have at least two elements BTW, no need to check that.

Please also adjust the comment above the code regarding overflow.

>> We need to either forbid variable-length vectors for this scheme altogether
>> or determine the maximum runtime number of elements possible for the current
>> mode.
>> We don't support more than 65536 bits in a vector which would "naturally"
>> limit the number of elements.  I suppose variable-length vectors are not too
>> common for this scheme and falling back to a less optimized one shouldn't be
>> too costly.  That's just a gut feeling, though.
>
> Make sense, will fall back to less optimized for VLA.

Have you checked how the more generic version (last else branch) compares?  I
could imagine it's close or even better for our case.


[PATCH] arm: Fix up REVERSE_CONDITION macro [PR119002]

2025-02-26 Thread Jakub Jelinek
Hi!

The linaro CI found my PR119002 patch broke bootstrap on arm.
Seems the problem is that it has incorrect REVERSE_CONDITION macro
definition.
All other targets' REVERSE_CONDITION definitions and the default one
just use the macro's arguments, while arm.h definition uses the MODE
argument but uses code instead of CODE (the first argument).
This happens to work because before my patch the only use of the
macro was in jump.cc with
  /* First see if machine description supplies us way to reverse the
 comparison.  Give it priority over everything else to allow
 machine description to do tricks.  */
  if (GET_MODE_CLASS (mode) == MODE_CC
  && REVERSIBLE_CC_MODE (mode))
return REVERSE_CONDITION (code, mode);
but in my patch it is used with GT rather than code.

The following patch fixes it, completely untested (but without my
other patch it doesn't change anything on the preprocessed source),
ok for trunk?

2025-02-26  Jakub Jelinek  

PR rtl-optimization/119002
* config/arm/arm.h (REVERSE_CONDITION): Use CODE - the macro
argument - in the macro rather than code.

--- gcc/config/arm/arm.h.jj 2025-01-09 22:04:32.140141200 +0100
+++ gcc/config/arm/arm.h  2025-02-26 16:46:13.127544209 +0100
@@ -2261,8 +2261,8 @@ extern int making_const_table;
 
 #define REVERSE_CONDITION(CODE,MODE) \
   (((MODE) == CCFPmode || (MODE) == CCFPEmode) \
-   ? reverse_condition_maybe_unordered (code) \
-   : reverse_condition (code))
+   ? reverse_condition_maybe_unordered (CODE) \
+   : reverse_condition (CODE))
 
 #define CLZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE) \
   ((VALUE) = GET_MODE_UNIT_BITSIZE (MODE), 2)

Jakub
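
The failure mode described above can be reproduced in a few lines. The sketch
below is only an illustration: cmp_code, reverse_cmp and both macro names are
hypothetical stand-ins, not GCC's real rtx codes or functions. It shows why a
macro body that names a caller-scope variable instead of its own parameter goes
unnoticed until some caller passes a different argument (GT in this thread).

```cpp
// Hypothetical stand-ins for GCC's rtx comparison codes; not the real enum.
enum cmp_code { CMP_EQ, CMP_NE, CMP_GT, CMP_LE };

// Simplified model of reverse_condition: returns the logically opposite test.
inline cmp_code reverse_cmp(cmp_code c) {
  switch (c) {
    case CMP_EQ: return CMP_NE;
    case CMP_NE: return CMP_EQ;
    case CMP_GT: return CMP_LE;
    case CMP_LE: return CMP_GT;
  }
  return c;
}

// Buggy shape: the body names `code`, a variable that merely happens to exist
// at the expansion site, and never uses the CODE parameter.
#define REVERSE_BUGGY(CODE) (reverse_cmp(code))

// Fixed shape: the body uses only the macro's own parameter.
#define REVERSE_FIXED(CODE) (reverse_cmp(CODE))

// Both callers have a local named `code` but ask to reverse CMP_GT.
inline cmp_code via_buggy(cmp_code code) { return REVERSE_BUGGY(CMP_GT); }
inline cmp_code via_fixed(cmp_code code) {
  (void)code;  // deliberately unused: the fixed macro ignores the local
  return REVERSE_FIXED(CMP_GT);
}
```

With these definitions via_buggy silently reverses whatever the local `code`
holds, while via_fixed reverses the CMP_GT that was actually passed. In a scope
with no variable named `code`, the buggy macro would not compile at all.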



Re: [PATCH] libstdc++: Allow 'configure.host' to modify 'EXTRA_CFLAGS', 'EXTRA_CXX_FLAGS'

2025-02-26 Thread Jonathan Wakely
On Wed, 26 Feb 2025 at 20:53, Thomas Schwinge  wrote:
>
> Hi Jonathan!
>
> Sorry for not providing more context initially.
>
> On 2025-02-26T10:50:11+, Jonathan Wakely  wrote:
> > On Wed, 26 Feb 2025 at 10:47, Jonathan Wakely  wrote:
> >> On Wed, 26 Feb 2025 at 10:19, Thomas Schwinge wrote:
> >> > In particular, 'GLIBCXX_ENABLE_CXX_FLAGS' shouldn't overwrite 
> >> > 'EXTRA_CXX_FLAGS'
> >> > (and prepend any additional '--enable-cxx-flags=[...]').
> >>
> >> This seems good, but why prepend instead of append here?
> >> If there are important flags passed down from top-level configure that
> >> shouldn't be replaced by the libstdc++ --enable-cxx-flags option, can
> >> we mention that in the new comment in acinclude.m4?
> >
> > Oh sorry, they're more likely to be from configure.host not from the
> > top-level, right?
>
> That's right: for GCN, nvptx configurations inject '-fno-exceptions' etc.
> via 'libstdc++-v3/configure.host'.
>
> > But if we prepend, then when users put bad options in
> > --enable-cxx-flags we silently override that with good target-specific
> > ones from configure.host
>
> That was my intention indeed -- but not a strong one.  ;-) (I've myself
> never used '--enable-cxx-flags=[...]'.)

I have used it occasionally, but long ago.

> > Maybe if users explicitly give bad options, they should get an error,
> > or a bad result?
>
> That's fine for me.  So if you think that makes more sense, then I'll be
> happy to swap it around.  I agree that it would be more standard to allow
> user-specified flags to override the default ones.

Yeah, I think I'd prefer that. OK with that change, and maybe change
the comment in acinclude.m4 from "Prepend the additional flags." to
something like:

# Append the additional flags to any that came from configure.host
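
To make the ordering question concrete, here is a toy model, assuming the usual
rule that the later of two contradictory -f flags takes effect. The helper
names are made up for illustration and nothing here is part of the libstdc++
build; with the append ordering the shell assignment would presumably become
EXTRA_CXX_FLAGS="$EXTRA_CXX_FLAGS $enable_cxx_flags".

```cpp
#include <sstream>
#include <string>

// Toy model of "the last of two contradictory flags wins".
// Returns 1 if exceptions end up enabled, 0 if disabled,
// and -1 if neither flag is present.
inline int exceptions_state(const std::string& flags) {
  std::istringstream in(flags);
  std::string tok;
  int state = -1;
  while (in >> tok) {
    if (tok == "-fexceptions")
      state = 1;
    else if (tok == "-fno-exceptions")
      state = 0;
  }
  return state;
}

// The two candidate orderings from the thread: `host` models the flags set by
// configure.host, `user` models the value of --enable-cxx-flags.
inline std::string prepend_user(const std::string& host, const std::string& user) {
  return user + " " + host;
}
inline std::string append_user(const std::string& host, const std::string& user) {
  return host + " " + user;
}
```

Under this model, prepending lets the configure.host flag win, and appending
lets the user's flag win, which is the behaviour preferred above.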


> > Or maybe I just don't understand the context properly :-)
> >
> >
> >>
> >> >
> >> > libstdc++-v3/
> >> > * acinclude.m4 (GLIBCXX_ENABLE_CXX_FLAGS): Prepend any additional
> >> > flags to 'EXTRA_CXX_FLAGS'.
> >> > * configure: Regenerate.
> >> > * configure.host: Document 'EXTRA_CFLAGS', 'EXTRA_CXX_FLAGS'.
> >> > ---
> >> >  libstdc++-v3/acinclude.m4   | 3 ++-
> >> >  libstdc++-v3/configure  | 3 ++-
> >> >  libstdc++-v3/configure.host | 4 
> >> >  3 files changed, 8 insertions(+), 2 deletions(-)
> >> >
> >> > diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
> >> > index b3423d7957a..3287dab3b89 100644
> >> > --- a/libstdc++-v3/acinclude.m4
> >> > +++ b/libstdc++-v3/acinclude.m4
> >> > @@ -3269,7 +3269,8 @@ AC_DEFUN([GLIBCXX_ENABLE_CXX_FLAGS], [dnl
> >> >  done
> >> >fi
> >> >
> >> > -  EXTRA_CXX_FLAGS="$enable_cxx_flags"
> >> > +  # Prepend the additional flags.
> >> > +  EXTRA_CXX_FLAGS="$enable_cxx_flags $EXTRA_CXX_FLAGS"
> >> >AC_MSG_RESULT($EXTRA_CXX_FLAGS)
> >> >AC_SUBST(EXTRA_CXX_FLAGS)
> >> >  ])
> >> > diff --git a/libstdc++-v3/configure b/libstdc++-v3/configure
> >> > index e115ee55739..ba908577a66 100755
> >> > --- a/libstdc++-v3/configure
> >> > +++ b/libstdc++-v3/configure
> >> > @@ -19452,7 +19452,8 @@ fi
> >> >  done
> >> >fi
> >> >
> >> > -  EXTRA_CXX_FLAGS="$enable_cxx_flags"
> >> > +  # Prepend the additional flags.
> >> > +  EXTRA_CXX_FLAGS="$enable_cxx_flags $EXTRA_CXX_FLAGS"
> >> >{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $EXTRA_CXX_FLAGS" >&5
> >> >  $as_echo "$EXTRA_CXX_FLAGS" >&6; }
> >> >
> >> > diff --git a/libstdc++-v3/configure.host b/libstdc++-v3/configure.host
> >> > index 45f55b250ce..1e84c78af30 100644
> >> > --- a/libstdc++-v3/configure.host
> >> > +++ b/libstdc++-v3/configure.host
> >> > @@ -61,6 +61,10 @@
> >> >  #
> >> >  # It possibly modifies the following variables:
> >> >  #
> >> > +#   EXTRA_CFLAGS   extra flags to pass when compiling C code
> >> > +#
> >> > +#   EXTRA_CXX_FLAGSextra flags to pass when compiling C++ code
> >> > +#
> >> >  #   OPT_LDFLAGSextra flags to pass when linking the 
> >> > library, of
> >> >  #  the form '-Wl,blah'
> >> >  #  (defaults to empty in acinclude.m4)
> >> > --
> >> > 2.34.1
> >> >
>



RE: [PATCH v2] RISC-V: Fix bug for expand_const_vector interleave [PR118931]

2025-02-26 Thread Li, Pan2
> Hmm, I think it's the other way around and it is being shifted left.  In that 
> case we're only keeping the lower bits and the "overflow bits" don't matter.  
> That means we should indeed be OK here.

Yes, will add some comments here for series2.

> I guess the worst that can happen theoretically is i = 2 ^ 32 - 1,
> step = 2 ^ 32 - 1, and base = 2 ^ 32 - 1?  But i is bound by our "largest"
> mode so IMHO a 64-bit value should be sufficient.

> We always have at least two elements BTW, no need to check that.

> Please also adjust the comment above the code regarding overflow.

Sure thing, you are right, the last element is good enough here.

> Have you checked how the more generic version (last else branch) compares?  I 
> could imagine it's close or even better for our case.

If you mean the last branch of interleave, I think it is safe because it
leverages the merge to generate the result, instead of IOR. Only the IOR for
the final result has this issue.

Pan
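
A small model of the point settled above, assuming 8-bit elements packed into
16-bit lanes purely for illustration (the function names are hypothetical and
this is not the RISC-V backend code): overflow bits in series2 are shifted out
of the lane, while overflow bits in series1 leak into series2's half through
the IOR.

```cpp
#include <cstdint>

// Pack one 8-bit element from each series into a 16-bit lane the way an
// IOR-based interleave would: series1 in the low byte, series2 shifted left.
// The inputs are already widened to 16 bits and may carry overflow bits.
inline std::uint16_t interleave_ior(std::uint16_t s1, std::uint16_t s2) {
  return static_cast<std::uint16_t>((s2 << 8) | s1);
}

// Reference result: each series truncated to 8 bits before packing.
inline std::uint16_t interleave_ref(std::uint16_t s1, std::uint16_t s2) {
  return static_cast<std::uint16_t>(((s2 & 0xff) << 8) | (s1 & 0xff));
}
```

For s2 = 0x101 the extra bit falls off the top of the lane and both functions
agree. For s1 = 0x100 the extra bit lands in the high byte and the two results
differ, so only series1 needs the overflow check.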

-----Original Message-----
From: Robin Dapp  
Sent: Thursday, February 27, 2025 12:23 AM
To: Li, Pan2 ; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; jeffreya...@gmail.com; Robin 
Dapp 
Subject: Re: [PATCH v2] RISC-V: Fix bug for expand_const_vector interleave 
[PR118931]

> Thanks Robin.
>
>> - IMHO we need to check both series for overflow, if step2 overflows in the
>> smaller type isn't the result equally wrong?
>
> The series2 will shift right before IOR, thus the overflow bits never effect
> on the final result.
> For example, the series2 will be similar as below after shift.
>
> v2.b = {0, 17, 0, 33, 0, 49, 0, 65, 0, 81, 0, 97, 0, 113, 0, 129}

Hmm, I think it's the other way around and it is being shifted left.  In that 
case we're only keeping the lower bits and the "overflow bits" don't matter.  
That means we should indeed be OK here.

> For i32 interleave, we will extend to i64 for series1 and series2, I think we
> can design
> a series like base + i * step with last element overflow to bit 65 but have
> bits 32-64 all zero, and then
> the element before the last one will have overflow bits pollute the result.
>
> It seems checking the last 2 elements is good enough here.

I guess the worst that can happen theoretically is i = 2 ^ 32 - 1,
step = 2 ^ 32 - 1, and base = 2 ^ 32 - 1?  But i is bound by our "largest"
mode so IMHO a 64-bit value should be sufficient.

We always have at least two elements BTW, no need to check that.

Please also adjust the comment above the code regarding overflow.

>> We need to either forbid variable-length vectors for this scheme altogether 
>> or
>> determine the maximum runtime number of elements possible for the current 
>> mode.
>> We don't support more than 65536 bits in a vector which would "naturally" 
>> limit
>> the number of elements.  I suppose variable-length vectors are not too common
>> for this scheme and falling back to a less optimized one shouldn't be too
>> costly.  That's just a gut feeling, though.
>
> Make sense, will fall back to less optimized for VLA.

Have you checked how the more generic version (last else branch) compares?  I 
could imagine it's close or even better for our case.


[patch, doc] PR108369 GCC: Documentation of -x option

2025-02-26 Thread Jerry D
The attached patch is intended to clarify the '-x' option, using '-x f77' as
an example. I was not sure who should review it.


Tested by inspecting the generated info file from make info.

OK for trunk and backport to 14?

Regards,

Jerry

commit a933852fa177abd79f6a3de3a01fafa07ed6ddc6
Author: Jerry DeLisle 
Date:   Wed Feb 26 17:26:26 2025 -0800

GCC: Documentation of -x option

This change updates information about the -x option to clarify
that it does not ensure standards compliance. Sparked by
discussions in the following PR.

PR fortran/108369

gcc/ChangeLog:

* doc/invoke.texi: Add a note to clarify. Adjust some wording.

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index bad49a017cc..6f8bf392386 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -1700,9 +1700,13 @@ f77  f77-cpp-input f95  f95-cpp-input
 go
 @end smallexample
 
+Note that @option{-x} does not imply a particular language standard.
+For example @option{-x f77} may also require @option{-std=legacy} for some older
+source codes.
+
 @item -x none
 Turn off any specification of a language, so that subsequent files are
-handled according to their file name suffixes (as they are if @option{-x}
+handled according to their file name suffixes (as if @option{-x}
 has not been used at all).
 @end table
 


RE: [PATCH] i386: Treat Granite Rapids/Granite Rapids-D/Diamond Rapids similar as Sapphire Rapids in x86-tune.def

2025-02-26 Thread Liu, Hongtao



> -----Original Message-----
> From: Jiang, Haochen 
> Sent: Wednesday, February 26, 2025 4:18 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Liu, Hongtao ; ubiz...@gmail.com
> Subject: [PATCH] i386: Treat Granite Rapids/Granite Rapids-D/Diamond Rapids
> similar as Sapphire Rapids in x86-tune.def
> 
> Hi all,
> 
> Since GNR, GNR-D, DMR are all P-core based, we should treat them just like
> SPR in tuning for now.
> 
> Ok for trunk and backport to GCC13/14 for GNR/GNR-D part?
Ok.
> 
> Thx,
> Haochen
> 
> gcc/ChangeLog:
> 
>   * config/i386/x86-tune.def
>   (X86_TUNE_DEST_FALSE_DEP_FOR_GLC): Add GNR, GNR-D, DMR.
>   (X86_TUNE_AVOID_256FMA_CHAINS): Ditto.
>   (X86_TUNE_AVX512_MOVE_BY_PIECES): Ditto.
>   (X86_TUNE_AVX512_STORE_BY_PIECES): Ditto.
> ---
>  gcc/config/i386/x86-tune.def | 12 
>  1 file changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/gcc/config/i386/x86-tune.def b/gcc/config/i386/x86-tune.def
> index df7b4ed22bc..0bdad7234a6 100644
> --- a/gcc/config/i386/x86-tune.def
> +++ b/gcc/config/i386/x86-tune.def
> @@ -87,7 +87,8 @@ DEF_TUNE
> (X86_TUNE_SSE_PARTIAL_REG_CONVERTS_DEPENDENCY,
> several insns to break false dependency on the dest register for GLC
> micro-architecture.  */
>  DEF_TUNE (X86_TUNE_DEST_FALSE_DEP_FOR_GLC,
> -   "dest_false_dep_for_glc", m_SAPPHIRERAPIDS | m_CORE_HYBRID
> +   "dest_false_dep_for_glc", m_SAPPHIRERAPIDS | m_GRANITERAPIDS
> +   | m_GRANITERAPIDS_D | m_DIAMONDRAPIDS | m_CORE_HYBRID
> | m_CORE_ATOM)
> 
>  /* X86_TUNE_SSE_SPLIT_REGS: Set for machines where the type and
> dependencies @@ -527,7 +528,8 @@ DEF_TUNE
> (X86_TUNE_AVOID_128FMA_CHAINS, "avoid_fma_chains", m_ZNVER
> smaller FMA chain.  */
>  DEF_TUNE (X86_TUNE_AVOID_256FMA_CHAINS, "avoid_fma256_chains",
> m_ZNVER2 | m_ZNVER3 | m_ZNVER4 | m_ZNVER5 |
> m_CORE_HYBRID
> -   | m_SAPPHIRERAPIDS | m_CORE_ATOM | m_GENERIC)
> +   | m_SAPPHIRERAPIDS | m_GRANITERAPIDS | m_GRANITERAPIDS_D
> +   | m_DIAMONDRAPIDS | m_CORE_ATOM | m_GENERIC)
> 
>  /* X86_TUNE_AVOID_512FMA_CHAINS: Avoid creating loops with tight
> 512bit or
> smaller FMA chain.  */
> @@ -594,12 +596,14 @@ DEF_TUNE
> (X86_TUNE_AVX256_STORE_BY_PIECES, "avx256_store_by_pieces",
>  /* X86_TUNE_AVX512_MOVE_BY_PIECES: Optimize move_by_pieces with
> 512-bit
> AVX instructions.  */
>  DEF_TUNE (X86_TUNE_AVX512_MOVE_BY_PIECES,
> "avx512_move_by_pieces",
> -   m_SAPPHIRERAPIDS | m_ZNVER4 | m_ZNVER5)
> +   m_SAPPHIRERAPIDS | m_GRANITERAPIDS | m_GRANITERAPIDS_D
> +   | m_DIAMONDRAPIDS | m_ZNVER4 | m_ZNVER5)
> 
>  /* X86_TUNE_AVX512_STORE_BY_PIECES: Optimize store_by_pieces with
> 512-bit
> AVX instructions.  */
>  DEF_TUNE (X86_TUNE_AVX512_STORE_BY_PIECES,
> "avx512_store_by_pieces",
> -   m_SAPPHIRERAPIDS | m_ZNVER4 | m_ZNVER5)
> +   m_SAPPHIRERAPIDS | m_GRANITERAPIDS | m_GRANITERAPIDS_D
> +   | m_DIAMONDRAPIDS | m_ZNVER4 | m_ZNVER5)
> 
>  /* X86_TUNE_AVX512_TWO_EPILOGUES: Use two vector epilogues for 512-
> bit
> vectorized loops.  */
> --
> 2.31.1



Re: [PATCH] c: stddef.h C23 fixes [PR114870]

2025-02-26 Thread Joseph Myers
On Wed, 26 Feb 2025, Jakub Jelinek wrote:

> In any case, the following patch has been bootstrapped/regtested on
> x86_64-linux and i686-linux, ok for trunk?  Or something else?
> 
> 2025-02-26  Jakub Jelinek  
> 
>   PR c/114870
>   * ginclude/stddef.h (__STDC_VERSION_STDDEF_H__, unreachable): Don't
>   redefine multiple times if stddef.h is first included without __need_*
>   defines and later with them.  Move nullptr_t and unreachable and
>   __STDC_VERSION_STDDEF_H__ definitions into the same
>   defined (__STDC_VERSION__) && __STDC_VERSION__ > 201710L #if block.

OK.

-- 
Joseph S. Myers
josmy...@redhat.com



[pushed] [PR119021][LRA]: Fix rtl correctness check failure in LRA.

2025-02-26 Thread Vladimir Makarov

The following patch solves

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119021

The patch was successfully tested and bootstrapped on x86-64.

commit 7ce3a8e872d945d537a7e7ba1bd3f45b1cf9a6d2
Author: Vladimir N. Makarov 
Date:   Wed Feb 26 11:28:08 2025 -0500

[PR119021][LRA]: Fix rtl correctness check failure in LRA.

  The patch for PR115458 changed how asm errors are handled, in order to
avoid cycling when reporting the error for asm gotos.  That change was
wrong and caused an RTL correctness check failure.  This patch reverts
the change and solves the cycling in asm error reporting in a
different way.

gcc/ChangeLog:

PR middle-end/119021
* lra.cc (lra_asm_insn_error): Use lra_invalidate_insn_data
instead of lra_update_insn_regno_info.
* lra-assigns.cc (lra_split_hard_reg_for): Restore old code.

diff --git a/gcc/lra-assigns.cc b/gcc/lra-assigns.cc
index 480925ad894..46f9c9d20e2 100644
--- a/gcc/lra-assigns.cc
+++ b/gcc/lra-assigns.cc
@@ -1856,11 +1856,6 @@ lra_split_hard_reg_for (bool fail_p)
 	  {
 	asm_p = true;
 	lra_asm_insn_error (insn);
-	if (JUMP_P (insn))
-	  ira_nullify_asm_goto (insn);
-	else
-	  PATTERN (insn) = gen_rtx_USE (VOIDmode, const0_rtx);
-	lra_invalidate_insn_data (insn);
 	  }
 	else if (!asm_p)
 	  {
diff --git a/gcc/lra.cc b/gcc/lra.cc
index b753729d43d..8f30284e9da 100644
--- a/gcc/lra.cc
+++ b/gcc/lra.cc
@@ -549,7 +549,7 @@ lra_asm_insn_error (rtx_insn *insn)
   if (JUMP_P (insn))
 {
   ira_nullify_asm_goto (insn);
-  lra_update_insn_regno_info (insn);
+  lra_invalidate_insn_data (insn);
 }
   else
 {


Re: [PATCH v3] libstdc++: implement constexpr memory algorithms

2025-02-26 Thread Giuseppe D'Angelo

On 26/02/2025 16:33, Giuseppe D'Angelo wrote:

Whoops, sorry, missed this sub-thread (while replying to the other one).
Change of plans then, I'll amend and remove the ad-hoc constexpr macro.


Done, v3 attached.

Thanks,
--
Giuseppe D'Angelo

From de3751a38330f508be9f08b77136a31481018828 Mon Sep 17 00:00:00 2001
From: Giuseppe D'Angelo 
Date: Sun, 16 Feb 2025 19:37:07 +0100
Subject: [PATCH] libstdc++: implement constexpr memory algorithms

This commit adds support for C++26's constexpr specialized memory
algorithms, introduced by P2283R2, P3508R0, P3369R0.

The uninitialized_default, value, copy, move and fill algorithms are
affected, in all of their variants (iterator-based, range-based and _n
versions.)

The changes are mostly mechanical -- add `constexpr` to a number of
signatures when compiling in C++26 and above modes. The internal helper
guard class for range algorithms instead can be marked unconditionally.

The only "real" change to the implementation of the algorithms is that
during constant evaluation I need to dispatch to a constexpr-friendly
version of them.
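
As a rough illustration of that dispatch (a sketch under assumptions, not the
code in this patch): copy_ints and copied_sum are hypothetical names, and
__builtin_is_constant_evaluated is the GCC/Clang builtin, which should be
usable from C++14 onwards.

```cpp
#include <cstddef>
#include <cstring>

// Copies n ints. During constant evaluation it uses a plain element-wise loop;
// at run time it takes the memcpy fast path, which cannot be used in a
// constant expression.
constexpr void copy_ints(const int* src, std::size_t n, int* dst) {
  if (__builtin_is_constant_evaluated()) {
    for (std::size_t i = 0; i < n; ++i)
      dst[i] = src[i];
  } else {
    std::memcpy(dst, src, n * sizeof(int));
  }
}

// Helper that exercises copy_ints on local arrays and sums the copy.
constexpr int copied_sum() {
  int src[3] = {1, 2, 3};
  int dst[3] = {0, 0, 0};
  copy_ints(src, 3, dst);
  return dst[0] + dst[1] + dst[2];
}

// Forces the constant-evaluation branch.
static_assert(copied_sum() == 6, "compile-time path must work");
```

Calling copied_sum() in an ordinary run-time context takes the memcpy branch
and yields the same value, so callers see one function with two
implementations.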

For each algorithm family I've added only one test to cover it and its
variants; the idea is to avoid too much repetition and simplify future
maintenance.

libstdc++-v3/ChangeLog:

	* include/bits/ranges_uninitialized.h: Mark the specialized
	memory algorithms as constexpr in C++26. Also mark the members
	of the _DestroyGuard helper class.
	* include/bits/stl_uninitialized.h: Ditto.
	* include/bits/stl_construct.h: Mark _Construct_novalue (which
	uses placement new to do default initialization) as constexpr
	in C++26. This is possible due to P2747R2, which GCC already
	implements; other compilers in C++26 modes already implement
	P2448R2, so there should be no issues there.
	* include/bits/version.def: Bump the feature-testing macro.
	* include/bits/version.h: Regenerate.
	* testsuite/20_util/specialized_algorithms/feature_test_macro.cc: New test.
	* testsuite/20_util/specialized_algorithms/uninitialized_copy/constexpr.cc: New test.
	* testsuite/20_util/specialized_algorithms/uninitialized_default_construct/constexpr.cc:
	New test.
	* testsuite/20_util/specialized_algorithms/uninitialized_fill/constexpr.cc: New test.
	* testsuite/20_util/specialized_algorithms/uninitialized_move/constexpr.cc: New test.
	* testsuite/20_util/specialized_algorithms/uninitialized_value_construct/constexpr.cc:
	New test.

Signed-off-by: Giuseppe D'Angelo 
---
 .../include/bits/ranges_uninitialized.h   | 21 ++
 libstdc++-v3/include/bits/stl_construct.h |  1 +
 libstdc++-v3/include/bits/stl_uninitialized.h | 39 +++
 libstdc++-v3/include/bits/version.def |  5 ++
 libstdc++-v3/include/bits/version.h   |  7 +-
 .../feature_test_macro.cc | 14 
 .../uninitialized_copy/constexpr.cc   | 58 
 .../constexpr.cc  | 67 ++
 .../uninitialized_fill/constexpr.cc   | 68 +++
 .../uninitialized_move/constexpr.cc   | 51 ++
 .../constexpr.cc  | 64 +
 11 files changed, 394 insertions(+), 1 deletion(-)
 create mode 100644 libstdc++-v3/testsuite/20_util/specialized_algorithms/feature_test_macro.cc
 create mode 100644 libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy/constexpr.cc
 create mode 100644 libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_default_construct/constexpr.cc
 create mode 100644 libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_fill/constexpr.cc
 create mode 100644 libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_move/constexpr.cc
 create mode 100644 libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_value_construct/constexpr.cc

diff --git a/libstdc++-v3/include/bits/ranges_uninitialized.h b/libstdc++-v3/include/bits/ranges_uninitialized.h
index ced7bda5e37..990929efaa9 100644
--- a/libstdc++-v3/include/bits/ranges_uninitialized.h
+++ b/libstdc++-v3/include/bits/ranges_uninitialized.h
@@ -105,15 +105,18 @@ namespace ranges
 	const _Iter* _M_cur;
 
   public:
+	constexpr
 	explicit
 	_DestroyGuard(const _Iter& __iter)
 	  : _M_first(__iter), _M_cur(std::__addressof(__iter))
 	{ }
 
+	constexpr
 	void
 	release() noexcept
 	{ _M_cur = nullptr; }
 
+	constexpr
 	~_DestroyGuard()
 	{
 	  if (_M_cur != nullptr)
@@ -126,10 +129,12 @@ namespace ranges
 	&& is_trivially_destructible_v>
   struct _DestroyGuard<_Iter>
   {
+	constexpr
 	explicit
 	_DestroyGuard(const _Iter&)
 	{ }
 
+	constexpr
 	void
 	release() noexcept
 	{ }
@@ -141,6 +146,7 @@ namespace ranges
 template<__detail::__nothrow_forward_iterator _Iter,
 	 __detail::__nothrow_sentinel<_Iter> _Sent>
   requires default_initializable>
+  _GLIBCXX26_CONSTEXPR
   _Iter
   operator()(_Iter __first, _Sent __last) const
   {
@@ -159,6 +165,7 @@ namespace ranges
 
 templa

Re: [PATCH] c: Assorted fixes for flexible array members in unions [PR119001]

2025-02-26 Thread Joseph Myers
On Wed, 26 Feb 2025, Jakub Jelinek wrote:

> Hi!
> 
> r15-209 allowed flexible array members inside of unions, but as the
> following testcase shows, not everything has been adjusted for that.
> Unlike structures, in unions flexible array member (as an extension)
> can be any of the members, not just the last one, as in union all
> members are effectively last.
> The first hunk is about an ICE on the initialization of the FAM
> in union which is not the last FIELD_DECL with a string literal,
> the second hunk just formatting fix, third hunk fixes a bug in which
> we were just throwing away the initializers (except for with string literal)
> of FAMs in unions which aren't the last FIELD_DECL, and the last hunk
> is to diagnose FAM errors in unions the same as for structures, in
> particular trying to initialize a FAM with non-constant or initialization
> in nested context.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

-- 
Joseph S. Myers
josmy...@redhat.com



Re: [Fortran, Patch, PR118789, v1] Fix associate to void*

2025-02-26 Thread Thomas Koenig

Hi Andre,


Regtests ok on x86_64-pc-linux-gnu / F41. Ok for mainline?


Looks good to me.

Thanks for the patch!

Best regards

Thomas



Re: [PATCH] libstdc++: Allow 'configure.host' to modify 'EXTRA_CFLAGS', 'EXTRA_CXX_FLAGS'

2025-02-26 Thread Thomas Schwinge
Hi Jonathan!

Sorry for not providing more context initially.

On 2025-02-26T10:50:11+, Jonathan Wakely  wrote:
> On Wed, 26 Feb 2025 at 10:47, Jonathan Wakely  wrote:
>> On Wed, 26 Feb 2025 at 10:19, Thomas Schwinge wrote:
>> > In particular, 'GLIBCXX_ENABLE_CXX_FLAGS' shouldn't overwrite 
>> > 'EXTRA_CXX_FLAGS'
>> > (and prepend any additional '--enable-cxx-flags=[...]').
>>
>> This seems good, but why prepend instead of append here?
>> If there are important flags passed down from top-level configure that
>> shouldn't be replaced by the libstdc++ --enable-cxx-flags option, can
>> we mention that in the new comment in acinclude.m4?
>
> Oh sorry, they're more likely to be from configure.host not from the
> top-level, right?

That's right: for GCN, nvptx configurations inject '-fno-exceptions' etc.
via 'libstdc++-v3/configure.host'.

> But if we prepend, then when users put bad options in
> --enable-cxx-flags we silently override that with good target-specific
> ones from configure.host

That was my intention indeed -- but not a strong one.  ;-) (I've myself
never used '--enable-cxx-flags=[...]'.)

> Maybe if users explicitly give bad options, they should get an error,
> or a bad result?

That's fine for me.  So if you think that makes more sense, then I'll be
happy to swap it around.  I agree that it would be more standard to allow
user-specified flags to override the default ones.


Grüße
 Thomas


> Or maybe I just don't understand the context properly :-)
>
>
>>
>> >
>> > libstdc++-v3/
>> > * acinclude.m4 (GLIBCXX_ENABLE_CXX_FLAGS): Prepend any additional
>> > flags to 'EXTRA_CXX_FLAGS'.
>> > * configure: Regenerate.
>> > * configure.host: Document 'EXTRA_CFLAGS', 'EXTRA_CXX_FLAGS'.
>> > ---
>> >  libstdc++-v3/acinclude.m4   | 3 ++-
>> >  libstdc++-v3/configure  | 3 ++-
>> >  libstdc++-v3/configure.host | 4 
>> >  3 files changed, 8 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
>> > index b3423d7957a..3287dab3b89 100644
>> > --- a/libstdc++-v3/acinclude.m4
>> > +++ b/libstdc++-v3/acinclude.m4
>> > @@ -3269,7 +3269,8 @@ AC_DEFUN([GLIBCXX_ENABLE_CXX_FLAGS], [dnl
>> >  done
>> >fi
>> >
>> > -  EXTRA_CXX_FLAGS="$enable_cxx_flags"
>> > +  # Prepend the additional flags.
>> > +  EXTRA_CXX_FLAGS="$enable_cxx_flags $EXTRA_CXX_FLAGS"
>> >AC_MSG_RESULT($EXTRA_CXX_FLAGS)
>> >AC_SUBST(EXTRA_CXX_FLAGS)
>> >  ])
>> > diff --git a/libstdc++-v3/configure b/libstdc++-v3/configure
>> > index e115ee55739..ba908577a66 100755
>> > --- a/libstdc++-v3/configure
>> > +++ b/libstdc++-v3/configure
>> > @@ -19452,7 +19452,8 @@ fi
>> >  done
>> >fi
>> >
>> > -  EXTRA_CXX_FLAGS="$enable_cxx_flags"
>> > +  # Prepend the additional flags.
>> > +  EXTRA_CXX_FLAGS="$enable_cxx_flags $EXTRA_CXX_FLAGS"
>> >{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $EXTRA_CXX_FLAGS" >&5
>> >  $as_echo "$EXTRA_CXX_FLAGS" >&6; }
>> >
>> > diff --git a/libstdc++-v3/configure.host b/libstdc++-v3/configure.host
>> > index 45f55b250ce..1e84c78af30 100644
>> > --- a/libstdc++-v3/configure.host
>> > +++ b/libstdc++-v3/configure.host
>> > @@ -61,6 +61,10 @@
>> >  #
>> >  # It possibly modifies the following variables:
>> >  #
>> > +#   EXTRA_CFLAGS   extra flags to pass when compiling C code
>> > +#
>> > +#   EXTRA_CXX_FLAGSextra flags to pass when compiling C++ code
>> > +#
>> >  #   OPT_LDFLAGSextra flags to pass when linking the library, 
>> > of
>> >  #  the form '-Wl,blah'
>> >  #  (defaults to empty in acinclude.m4)
>> > --
>> > 2.34.1
>> >


[PATCH] i386: Treat Granite Rapids/Granite Rapids-D/Diamond Rapids similar as Sapphire Rapids in x86-tune.def

2025-02-26 Thread Haochen Jiang
Hi all,

Since GNR, GNR-D, DMR are all P-core based, we should treat them
just like SPR in tuning for now.

Ok for trunk and backport to GCC13/14 for GNR/GNR-D part?

Thx,
Haochen

gcc/ChangeLog:

* config/i386/x86-tune.def
(X86_TUNE_DEST_FALSE_DEP_FOR_GLC): Add GNR, GNR-D, DMR.
(X86_TUNE_AVOID_256FMA_CHAINS): Ditto.
(X86_TUNE_AVX512_MOVE_BY_PIECES): Ditto.
(X86_TUNE_AVX512_STORE_BY_PIECES): Ditto.
---
 gcc/config/i386/x86-tune.def | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/gcc/config/i386/x86-tune.def b/gcc/config/i386/x86-tune.def
index df7b4ed22bc..0bdad7234a6 100644
--- a/gcc/config/i386/x86-tune.def
+++ b/gcc/config/i386/x86-tune.def
@@ -87,7 +87,8 @@ DEF_TUNE (X86_TUNE_SSE_PARTIAL_REG_CONVERTS_DEPENDENCY,
several insns to break false dependency on the dest register for GLC
micro-architecture.  */
 DEF_TUNE (X86_TUNE_DEST_FALSE_DEP_FOR_GLC,
- "dest_false_dep_for_glc", m_SAPPHIRERAPIDS | m_CORE_HYBRID
+ "dest_false_dep_for_glc", m_SAPPHIRERAPIDS | m_GRANITERAPIDS
+ | m_GRANITERAPIDS_D | m_DIAMONDRAPIDS | m_CORE_HYBRID
  | m_CORE_ATOM)
 
 /* X86_TUNE_SSE_SPLIT_REGS: Set for machines where the type and dependencies
@@ -527,7 +528,8 @@ DEF_TUNE (X86_TUNE_AVOID_128FMA_CHAINS, "avoid_fma_chains", 
m_ZNVER
smaller FMA chain.  */
 DEF_TUNE (X86_TUNE_AVOID_256FMA_CHAINS, "avoid_fma256_chains",
  m_ZNVER2 | m_ZNVER3 | m_ZNVER4 | m_ZNVER5 | m_CORE_HYBRID
- | m_SAPPHIRERAPIDS | m_CORE_ATOM | m_GENERIC)
+ | m_SAPPHIRERAPIDS | m_GRANITERAPIDS | m_GRANITERAPIDS_D
+ | m_DIAMONDRAPIDS | m_CORE_ATOM | m_GENERIC)
 
 /* X86_TUNE_AVOID_512FMA_CHAINS: Avoid creating loops with tight 512bit or
smaller FMA chain.  */
@@ -594,12 +596,14 @@ DEF_TUNE (X86_TUNE_AVX256_STORE_BY_PIECES, "avx256_store_by_pieces",
 /* X86_TUNE_AVX512_MOVE_BY_PIECES: Optimize move_by_pieces with 512-bit
AVX instructions.  */
 DEF_TUNE (X86_TUNE_AVX512_MOVE_BY_PIECES, "avx512_move_by_pieces",
- m_SAPPHIRERAPIDS | m_ZNVER4 | m_ZNVER5)
+ m_SAPPHIRERAPIDS | m_GRANITERAPIDS | m_GRANITERAPIDS_D
+ | m_DIAMONDRAPIDS | m_ZNVER4 | m_ZNVER5)
 
 /* X86_TUNE_AVX512_STORE_BY_PIECES: Optimize store_by_pieces with 512-bit
AVX instructions.  */
 DEF_TUNE (X86_TUNE_AVX512_STORE_BY_PIECES, "avx512_store_by_pieces",
- m_SAPPHIRERAPIDS | m_ZNVER4 | m_ZNVER5)
+ m_SAPPHIRERAPIDS | m_GRANITERAPIDS | m_GRANITERAPIDS_D
+ | m_DIAMONDRAPIDS | m_ZNVER4 | m_ZNVER5)
 
 /* X86_TUNE_AVX512_TWO_EPILOGUES: Use two vector epilogues for 512-bit
vectorized loops.  */
-- 
2.31.1



Re: [pushed] doc: update C++98 bootstrap note

2025-02-26 Thread Richard Biener
On Tue, Feb 25, 2025 at 9:19 PM Jason Merrill  wrote:
>
> r10-11132 uses C++11 default member initializers, which breaks bootstrapping
> with a C++98 compiler.

It's probably more interesting to add a note to gcc-10/changes.html as a
caveat for the 10.5 minor release.
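
For context, this is the kind of construct at issue. A minimal sketch of a
C++11 default member initializer; the struct and member names are
hypothetical and not taken from r10-11132:

```cpp
#include <cassert>

// 'counters' and its members are hypothetical names for illustration.
struct counters
{
  int hits = 0;    // C++11 default member initializer
  int limit = 16;  // not valid ISO C++98
};

// Value that a freshly constructed object reports for 'limit'.
inline int default_limit ()
{
  counters c;
  return c.limit;
}
```

A compiler that only implements C++98 has no syntax for the initializers in
the class body, so sources using them cannot be built with it.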

> gcc/ChangeLog:
>
> * doc/install.texi: 10.5 won't bootstrap with C++98.
> ---
>  gcc/doc/install.texi | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
> index 29de3200ae8..bc5263e5348 100644
> --- a/gcc/doc/install.texi
> +++ b/gcc/doc/install.texi
> @@ -227,7 +227,7 @@ Necessary to bootstrap GCC.  GCC 5.4 or newer has sufficient support
>  for used C++14 features.
>
>  Versions of GCC prior to 15 allow bootstrapping with an ISO C++11
> -compiler, versions prior to 11 allow bootstrapping with an ISO C++98
> +compiler, versions prior to 10.5 allow bootstrapping with an ISO C++98
>  compiler, and versions prior to 4.8 allow bootstrapping with an ISO
>  C89 compiler.
>
>
> base-commit: 6be1b9e94d9a2ead15e3625e833f1e34503ab803
> --
> 2.48.1
>


Re: [Fortran, Patch, PR108233, v1] Prevent SAVE_EXPR on lhs in assign.

2025-02-26 Thread Andre Vehreschild
Hi Jerry,

thanks for the review. Committed as gcc-15-7712-g751b37047b2.

Thanks again,
Andre

On Tue, 25 Feb 2025 09:49:29 -0800
Jerry D  wrote:

> On 2/25/25 9:18 AM, Andre Vehreschild wrote:
> > Hi all,
> >
> > for some recreation after all the coarray stuff, I found this pr cc'ed to
> > me. Taking a look at it, I figured that using a SAVE_EXPR on the lhs of the
> > assignment was doing the harm. The data seems to be not written back into
> > the vector shaped data type (like a complex number in this case). The
> > current fix just removes the save_expr from the lhs in an assignment and
> > everything is fine.
> >
> > Regtest ok on x86_64-pc-linux-gnu / F41. Ok for mainline?
>
> Looks OK Andre, thanks.
>
> Jerry
>
> >
> > Regards,
> > Andre
> > --
> > Andre Vehreschild * Email: vehre ad gmx dot de
>


--
Andre Vehreschild * Email: vehre ad gmx dot de


[PATCH] simple-diagnostic-path, v2: Inline two trivial methods [PR116143]

2025-02-26 Thread Jakub Jelinek
On Wed, Feb 26, 2025 at 10:45:37AM +0100, Richard Biener wrote:
> OK

Unfortunately I've only bootstrapped/regtested it with normal checking.
Testing it with --enable-checking=release now shows that this patch just
moved the FAILs to a different symbol.  And note that isn't even a LTO
build.

The following patch which IMHO still makes sense, those methods are also
trivial, moves it even further.  But the next problem is
_ZN22simple_diagnostic_path9add_eventEmP9tree_nodeiPKcz
and that method is IMHO too large for the header file to be defined inline,
and doesn't even have final override like the others, isn't virtual in the
abstract class.
So, I have really no idea why it isn't compiled in.

2025-02-26  Jakub Jelinek  

PR testsuite/116143
* simple-diagnostic-path.h (simple_diagnostic_path::get_event): Define
inline.
(simple_diagnostic_path::get_thread): Likewise.
(simple_diagnostic_path::same_function_p): Likewise.
* simple-diagnostic-path.cc (simple_diagnostic_path::get_event):
Remove out of line definition.
(simple_diagnostic_path::get_thread): Likewise.
(simple_diagnostic_path::same_function_p): Likewise.

--- gcc/simple-diagnostic-path.h.jj 2025-02-26 10:59:04.475354169 +0100
+++ gcc/simple-diagnostic-path.h 2025-02-26 11:09:48.191388638 +0100
@@ -101,14 +101,19 @@ class simple_diagnostic_path : public di
   simple_diagnostic_path (pretty_printer *event_pp);
 
   unsigned num_events () const final override { return m_events.length (); }
-  const diagnostic_event & get_event (int idx) const final override;
+  const diagnostic_event & get_event (int idx) const final override
+  { return *m_events[idx]; }
   unsigned num_threads () const final override { return m_threads.length (); }
   const diagnostic_thread &
-  get_thread (diagnostic_thread_id_t) const final override;
+  get_thread (diagnostic_thread_id_t idx) const final override
+  { return *m_threads[idx]; }
   bool
   same_function_p (int event_idx_a,
-  int event_idx_b) const final override;
-
+  int event_idx_b) const final override
+  { 
+return (m_events[event_idx_a]->get_fndecl ()
+   == m_events[event_idx_b]->get_fndecl ());
+  }
   diagnostic_thread_id_t add_thread (const char *name);
 
   diagnostic_event_id_t add_event (location_t loc, tree fndecl, int depth,
--- gcc/simple-diagnostic-path.cc.jj 2025-02-26 10:59:04.476354155 +0100
+++ gcc/simple-diagnostic-path.cc   2025-02-26 11:10:01.842198544 +0100
@@ -41,29 +41,6 @@ simple_diagnostic_path::simple_diagnosti
   add_thread ("main");
 }
 
-/* Implementation of diagnostic_path::get_event vfunc for
-   simple_diagnostic_path: simply return the event in the vec.  */
-
-const diagnostic_event &
-simple_diagnostic_path::get_event (int idx) const
-{
-  return *m_events[idx];
-}
-
-const diagnostic_thread &
-simple_diagnostic_path::get_thread (diagnostic_thread_id_t idx) const
-{
-  return *m_threads[idx];
-}
-
-bool
-simple_diagnostic_path::same_function_p (int event_idx_a,
-int event_idx_b) const
-{
-  return (m_events[event_idx_a]->get_fndecl ()
- == m_events[event_idx_b]->get_fndecl ());
-}
-
 diagnostic_thread_id_t
 simple_diagnostic_path::add_thread (const char *name)
 {


Jakub



[FYI, PATCH v3] [testsuite] add x86 effective target

2025-02-26 Thread Alexandre Oliva
On Feb 18, 2025, Richard Sandiford  wrote:

> Thanks for doing this.  How about also replacing all uses of:

>([check_effective_target_x86])

> with:

>[check_effective_target_x86]

> OK with that change if there are no objections within 24 hours.

Sure, thanks for the review and for the suggestion.  Here's the patch
I'm about to install.  Tested on x86_64-linux-gnu.


I got tired of repeating the conditional that recognizes ia32 or
x86_64, and introduced 'x86' as a shorthand for that, adjusting all
occurrences in target-supports.exp, to set an example.  I found some
patterns that recognized i?86* and x86_64*, but I took those as likely
cut&pastos instead of trying to preserve those weirdnesses.


for  gcc/ChangeLog

* doc/sourcebuild.texi: Add x86 effective target.

for  gcc/testsuite/ChangeLog

* lib/target-supports.exp (check_effective_target_x86): New.
Replace all uses of i?86-*-* and x86_64-*-* in this file.
---
 gcc/doc/sourcebuild.texi  |3 
 gcc/testsuite/lib/target-supports.exp |  200 +
 2 files changed, 105 insertions(+), 98 deletions(-)

diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index 28338324f0724..d44c2e8cbe6a1 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -2798,6 +2798,9 @@ Target supports the execution of @code{user_msr} instructions.
 @item vect_cmdline_needed
 Target requires a command line argument to enable a SIMD instruction set.
 
+@item x86
+Target is ia32 or x86_64.
+
 @item xorsign
 Target supports the xorsign optab expansion.
 
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index c6f3acfadb3d8..d02d1fa9becbe 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -735,7 +735,7 @@ proc check_profiling_available { test_what } {
 }
 
 if { $test_what == "-fauto-profile" } {
-   if { !([istarget i?86-*-linux*] || [istarget x86_64-*-linux*]) } {
+   if { !([check_effective_target_x86] && [istarget *-*-linux*]) } {
verbose "autofdo only supported on linux"
return 0
}
@@ -2609,17 +2609,23 @@ proc remove_options_for_riscv_zvbb { flags } {
 return [add_options_for_riscv_z_ext zvbb $flags]
 }
 
+# Return 1 if the target is ia32 or x86_64.
+
+proc check_effective_target_x86 { } {
+if { ([istarget x86_64-*-*] || [istarget i?86-*-*]) } {
+   return 1
+} else {
+return 0
+}
+}
+
 # Return 1 if the target OS supports running SSE executables, 0
 # otherwise.  Cache the result.
 
 proc check_sse_os_support_available { } {
 return [check_cached_effective_target sse_os_support_available {
# If this is not the right target then we can skip the test.
-   if { !([istarget i?86-*-*] || [istarget x86_64-*-*]) } {
-   expr 0
-   } else {
-   expr 1
-   }
+   expr [check_effective_target_x86]
 }]
 }
 
@@ -2629,7 +2635,7 @@ proc check_sse_os_support_available { } {
 proc check_avx_os_support_available { } {
 return [check_cached_effective_target avx_os_support_available {
# If this is not the right target then we can skip the test.
-   if { !([istarget i?86-*-*] || [istarget x86_64-*-*]) } {
+   if { ![check_effective_target_x86] } {
expr 0
} else {
# Check that OS has AVX and SSE saving enabled.
@@ -2652,7 +2658,7 @@ proc check_avx_os_support_available { } {
 proc check_avx512_os_support_available { } {
 return [check_cached_effective_target avx512_os_support_available {
# If this is not the right target then we can skip the test.
-   if { !([istarget i?86-*-*] || [istarget x86_64-*-*]) } {
+   if { ![check_effective_target_x86] } {
expr 0
} else {
# Check that OS has AVX512, AVX and SSE saving enabled.
@@ -2675,7 +2681,7 @@ proc check_avx512_os_support_available { } {
 proc check_sse_hw_available { } {
 return [check_cached_effective_target sse_hw_available {
# If this is not the right target then we can skip the test.
-   if { !([istarget i?86-*-*] || [istarget x86_64-*-*]) } {
+   if { ![check_effective_target_x86] } {
expr 0
} else {
check_runtime_nocache sse_hw_available {
@@ -2699,7 +2705,7 @@ proc check_sse_hw_available { } {
 proc check_sse2_hw_available { } {
 return [check_cached_effective_target sse2_hw_available {
# If this is not the right target then we can skip the test.
-   if { !([istarget i?86-*-*] || [istarget x86_64-*-*]) } {
+   if { ![check_effective_target_x86] } {
expr 0
} else {
check_runtime_nocache sse2_hw_available {
@@ -2723,7 +2729,7 @@ proc check_sse2_hw_available { } {
 proc check_sse4_hw_available { } {
 return [check_cached_effective_target sse4_hw_available {
# If this is not the right target then we can skip the test.
-   

Re: [PATCH] libstdc++: Allow 'configure.host' to modify 'EXTRA_CFLAGS', 'EXTRA_CXX_FLAGS'

2025-02-26 Thread Sam James
Thomas Schwinge  writes:

> In particular, 'GLIBCXX_ENABLE_CXX_FLAGS' shouldn't overwrite
> 'EXTRA_CXX_FLAGS' (and prepend any additional '--enable-cxx-flags=[...]').

Why 'CXX_FLAGS' spelling (which is unusual) rather than 'CXXFLAG-- ah, I
see we have a load of EXTRA_CXXFLAGS, and then a lot of EXTRA_CXX_FLAGS
in libstdc++. I feel like this is a typo waiting to happen but maybe
it's like this for a reason?

(If the intention is to avoid whatever is normally in EXTRA_CXXFLAGS, we
should consider a namespaced var name instead - but I think not given
EXTRA_CFLAGS is used elsewhere & within libstdc++.)

>
>   libstdc++-v3/
>   * acinclude.m4 (GLIBCXX_ENABLE_CXX_FLAGS): Prepend any additional
>   flags to 'EXTRA_CXX_FLAGS'.
>   * configure: Regenerate.
>   * configure.host: Document 'EXTRA_CFLAGS', 'EXTRA_CXX_FLAGS'.
> ---
>  libstdc++-v3/acinclude.m4   | 3 ++-
>  libstdc++-v3/configure  | 3 ++-
>  libstdc++-v3/configure.host | 4 ++++
>  3 files changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
> index b3423d7957a..3287dab3b89 100644
> --- a/libstdc++-v3/acinclude.m4
> +++ b/libstdc++-v3/acinclude.m4
> @@ -3269,7 +3269,8 @@ AC_DEFUN([GLIBCXX_ENABLE_CXX_FLAGS], [dnl
>  done
>fi
>  
> -  EXTRA_CXX_FLAGS="$enable_cxx_flags"
> +  # Prepend the additional flags.
> +  EXTRA_CXX_FLAGS="$enable_cxx_flags $EXTRA_CXX_FLAGS"
>AC_MSG_RESULT($EXTRA_CXX_FLAGS)
>AC_SUBST(EXTRA_CXX_FLAGS)
>  ])
> diff --git a/libstdc++-v3/configure b/libstdc++-v3/configure
> index e115ee55739..ba908577a66 100755
> --- a/libstdc++-v3/configure
> +++ b/libstdc++-v3/configure
> @@ -19452,7 +19452,8 @@ fi
>  done
>fi
>  
> -  EXTRA_CXX_FLAGS="$enable_cxx_flags"
> +  # Prepend the additional flags.
> +  EXTRA_CXX_FLAGS="$enable_cxx_flags $EXTRA_CXX_FLAGS"
>{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $EXTRA_CXX_FLAGS" >&5
>  $as_echo "$EXTRA_CXX_FLAGS" >&6; }
>  
> diff --git a/libstdc++-v3/configure.host b/libstdc++-v3/configure.host
> index 45f55b250ce..1e84c78af30 100644
> --- a/libstdc++-v3/configure.host
> +++ b/libstdc++-v3/configure.host
> @@ -61,6 +61,10 @@
>  #
>  # It possibly modifies the following variables:
>  #
> +#   EXTRA_CFLAGS       extra flags to pass when compiling C code
> +#
> +#   EXTRA_CXX_FLAGS    extra flags to pass when compiling C++ code
> +#
>  #   OPT_LDFLAGS        extra flags to pass when linking the library, of
>  #                      the form '-Wl,blah'
>  #                      (defaults to empty in acinclude.m4)


Re: [PATCH] libstdc++: Allow 'configure.host' to modify 'EXTRA_CFLAGS', 'EXTRA_CXX_FLAGS'

2025-02-26 Thread Jonathan Wakely
On Wed, 26 Feb 2025 at 10:19, Thomas Schwinge wrote:
>
> In particular, 'GLIBCXX_ENABLE_CXX_FLAGS' shouldn't overwrite
> 'EXTRA_CXX_FLAGS' (and prepend any additional '--enable-cxx-flags=[...]').

This seems good, but why prepend instead of append here?
If there are important flags passed down from top-level configure that
shouldn't be replaced by the libstdc++ --enable-cxx-flags option, can
we mention that in the new comment in acinclude.m4?

>
> libstdc++-v3/
> * acinclude.m4 (GLIBCXX_ENABLE_CXX_FLAGS): Prepend any additional
> flags to 'EXTRA_CXX_FLAGS'.
> * configure: Regenerate.
> * configure.host: Document 'EXTRA_CFLAGS', 'EXTRA_CXX_FLAGS'.
> ---
>  libstdc++-v3/acinclude.m4   | 3 ++-
>  libstdc++-v3/configure  | 3 ++-
>  libstdc++-v3/configure.host | 4 ++++
>  3 files changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
> index b3423d7957a..3287dab3b89 100644
> --- a/libstdc++-v3/acinclude.m4
> +++ b/libstdc++-v3/acinclude.m4
> @@ -3269,7 +3269,8 @@ AC_DEFUN([GLIBCXX_ENABLE_CXX_FLAGS], [dnl
>  done
>fi
>
> -  EXTRA_CXX_FLAGS="$enable_cxx_flags"
> +  # Prepend the additional flags.
> +  EXTRA_CXX_FLAGS="$enable_cxx_flags $EXTRA_CXX_FLAGS"
>AC_MSG_RESULT($EXTRA_CXX_FLAGS)
>AC_SUBST(EXTRA_CXX_FLAGS)
>  ])
> diff --git a/libstdc++-v3/configure b/libstdc++-v3/configure
> index e115ee55739..ba908577a66 100755
> --- a/libstdc++-v3/configure
> +++ b/libstdc++-v3/configure
> @@ -19452,7 +19452,8 @@ fi
>  done
>fi
>
> -  EXTRA_CXX_FLAGS="$enable_cxx_flags"
> +  # Prepend the additional flags.
> +  EXTRA_CXX_FLAGS="$enable_cxx_flags $EXTRA_CXX_FLAGS"
>{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $EXTRA_CXX_FLAGS" >&5
>  $as_echo "$EXTRA_CXX_FLAGS" >&6; }
>
> diff --git a/libstdc++-v3/configure.host b/libstdc++-v3/configure.host
> index 45f55b250ce..1e84c78af30 100644
> --- a/libstdc++-v3/configure.host
> +++ b/libstdc++-v3/configure.host
> @@ -61,6 +61,10 @@
>  #
>  # It possibly modifies the following variables:
>  #
> +#   EXTRA_CFLAGS       extra flags to pass when compiling C code
> +#
> +#   EXTRA_CXX_FLAGS    extra flags to pass when compiling C++ code
> +#
>  #   OPT_LDFLAGS        extra flags to pass when linking the library, of
>  #                      the form '-Wl,blah'
>  #                      (defaults to empty in acinclude.m4)
> --
> 2.34.1
>



Re: [PATCH] libstdc++: Allow 'configure.host' to modify 'EXTRA_CFLAGS', 'EXTRA_CXX_FLAGS'

2025-02-26 Thread Jonathan Wakely
On Wed, 26 Feb 2025 at 10:47, Jonathan Wakely  wrote:
>
> On Wed, 26 Feb 2025 at 10:19, Thomas Schwinge wrote:
> >
> > In particular, 'GLIBCXX_ENABLE_CXX_FLAGS' shouldn't overwrite
> > 'EXTRA_CXX_FLAGS' (and prepend any additional '--enable-cxx-flags=[...]').
>
> This seems good, but why prepend instead of append here?
> If there are important flags passed down from top-level configure that
> shouldn't be replaced by the libstdc++ --enable-cxx-flags option, can
> we mention that in the new comment in acinclude.m4?

Oh sorry, they're more likely to be from configure.host not from the
top-level, right?

But if we prepend, then when users put bad options in
--enable-cxx-flags we silently override that with good target-specific
ones from configure.host.

Maybe if users explicitly give bad options, they should get an error,
or a bad result?

Or maybe I just don't understand the context properly :-)
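
The ordering concern can be illustrated with a small sketch. It assumes
"last option wins" semantics, which GCC documents for repeated -O options;
the helper names and the flag values are hypothetical and only model the
"$enable_cxx_flags $EXTRA_CXX_FLAGS" concatenation from the patch:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Return the effective -O option from a flag list, assuming the last
// occurrence wins.  Returns an empty string if no -O option is present.
inline std::string effective_opt_level (const std::vector<std::string> &flags)
{
  std::string result;
  for (const std::string &f : flags)
    if (f.compare (0, 2, "-O") == 0)
      result = f;
  return result;
}

// Model of "$enable_cxx_flags $EXTRA_CXX_FLAGS": the user's flags come
// first and the flags already in EXTRA_CXX_FLAGS come last.
inline std::vector<std::string>
prepend_user_flags (std::vector<std::string> user,
                    const std::vector<std::string> &host)
{
  user.insert (user.end (), host.begin (), host.end ());
  return user;
}
```

Under that assumption, a user-supplied -O0 followed by a host-supplied -O2
ends up as -O2, which is the silent override described above.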


>
> >
> > libstdc++-v3/
> > * acinclude.m4 (GLIBCXX_ENABLE_CXX_FLAGS): Prepend any additional
> > flags to 'EXTRA_CXX_FLAGS'.
> > * configure: Regenerate.
> > * configure.host: Document 'EXTRA_CFLAGS', 'EXTRA_CXX_FLAGS'.
> > ---
> >  libstdc++-v3/acinclude.m4   | 3 ++-
> >  libstdc++-v3/configure  | 3 ++-
> >  libstdc++-v3/configure.host | 4 ++++
> >  3 files changed, 8 insertions(+), 2 deletions(-)
> >
> > diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
> > index b3423d7957a..3287dab3b89 100644
> > --- a/libstdc++-v3/acinclude.m4
> > +++ b/libstdc++-v3/acinclude.m4
> > @@ -3269,7 +3269,8 @@ AC_DEFUN([GLIBCXX_ENABLE_CXX_FLAGS], [dnl
> >  done
> >fi
> >
> > -  EXTRA_CXX_FLAGS="$enable_cxx_flags"
> > +  # Prepend the additional flags.
> > +  EXTRA_CXX_FLAGS="$enable_cxx_flags $EXTRA_CXX_FLAGS"
> >AC_MSG_RESULT($EXTRA_CXX_FLAGS)
> >AC_SUBST(EXTRA_CXX_FLAGS)
> >  ])
> > diff --git a/libstdc++-v3/configure b/libstdc++-v3/configure
> > index e115ee55739..ba908577a66 100755
> > --- a/libstdc++-v3/configure
> > +++ b/libstdc++-v3/configure
> > @@ -19452,7 +19452,8 @@ fi
> >  done
> >fi
> >
> > -  EXTRA_CXX_FLAGS="$enable_cxx_flags"
> > +  # Prepend the additional flags.
> > +  EXTRA_CXX_FLAGS="$enable_cxx_flags $EXTRA_CXX_FLAGS"
> >{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $EXTRA_CXX_FLAGS" >&5
> >  $as_echo "$EXTRA_CXX_FLAGS" >&6; }
> >
> > diff --git a/libstdc++-v3/configure.host b/libstdc++-v3/configure.host
> > index 45f55b250ce..1e84c78af30 100644
> > --- a/libstdc++-v3/configure.host
> > +++ b/libstdc++-v3/configure.host
> > @@ -61,6 +61,10 @@
> >  #
> >  # It possibly modifies the following variables:
> >  #
> > +#   EXTRA_CFLAGS       extra flags to pass when compiling C code
> > +#
> > +#   EXTRA_CXX_FLAGS    extra flags to pass when compiling C++ code
> > +#
> >  #   OPT_LDFLAGS        extra flags to pass when linking the library, of
> >  #                      the form '-Wl,blah'
> >  #                      (defaults to empty in acinclude.m4)
> > --
> > 2.34.1
> >



Re: [PATCH] simple-diagnostic-path, v2: Inline two trivial methods [PR116143]

2025-02-26 Thread Richard Biener
On Wed, Feb 26, 2025 at 11:38 AM Jakub Jelinek  wrote:
>
> On Wed, Feb 26, 2025 at 10:45:37AM +0100, Richard Biener wrote:
> > OK
>
> Unfortunately I've only bootstrapped/regtested it with normal checking.
> Testing it with --enable-checking=release now shows that this patch just
> moved the FAILs to a different symbol.  And note that isn't even a LTO
> build.
>
> The following patch which IMHO still makes sense, those methods are also
> trivial, moves it even further.  But the next problem is
> _ZN22simple_diagnostic_path9add_eventEmP9tree_nodeiPKcz
> and that method is IMHO too large for the header file to be defined inline,
> and doesn't even have final override like the others, isn't virtual in the
> abstract class.
> So, I have really no idea why it isn't compiled in.

Hmm, so why isn't it part of libgccjit?  I suppose C++ does not really support
exposing a class but not exporting the class's ABI?

> 2025-02-26  Jakub Jelinek  
>
> PR testsuite/116143
> * simple-diagnostic-path.h (simple_diagnostic_path::get_event): Define
> inline.
> (simple_diagnostic_path::get_thread): Likewise.
> (simple_diagnostic_path::same_function_p): Likewise.
> * simple-diagnostic-path.cc (simple_diagnostic_path::get_event):
> Remove out of line definition.
> (simple_diagnostic_path::get_thread): Likewise.
> (simple_diagnostic_path::same_function_p): Likewise.
>
> --- gcc/simple-diagnostic-path.h.jj 2025-02-26 10:59:04.475354169 +0100
> +++ gcc/simple-diagnostic-path.h 2025-02-26 11:09:48.191388638 +0100
> @@ -101,14 +101,19 @@ class simple_diagnostic_path : public di
>simple_diagnostic_path (pretty_printer *event_pp);
>
>unsigned num_events () const final override { return m_events.length (); }
> -  const diagnostic_event & get_event (int idx) const final override;
> +  const diagnostic_event & get_event (int idx) const final override
> +  { return *m_events[idx]; }
>unsigned num_threads () const final override { return m_threads.length (); }
>const diagnostic_thread &
> -  get_thread (diagnostic_thread_id_t) const final override;
> +  get_thread (diagnostic_thread_id_t idx) const final override
> +  { return *m_threads[idx]; }
>bool
>same_function_p (int event_idx_a,
> -  int event_idx_b) const final override;
> -
> +  int event_idx_b) const final override
> +  {
> +return (m_events[event_idx_a]->get_fndecl ()
> +   == m_events[event_idx_b]->get_fndecl ());
> +  }
>diagnostic_thread_id_t add_thread (const char *name);
>
>diagnostic_event_id_t add_event (location_t loc, tree fndecl, int depth,
> --- gcc/simple-diagnostic-path.cc.jj 2025-02-26 10:59:04.476354155 +0100
> +++ gcc/simple-diagnostic-path.cc   2025-02-26 11:10:01.842198544 +0100
> @@ -41,29 +41,6 @@ simple_diagnostic_path::simple_diagnosti
>add_thread ("main");
>  }
>
> -/* Implementation of diagnostic_path::get_event vfunc for
> -   simple_diagnostic_path: simply return the event in the vec.  */
> -
> -const diagnostic_event &
> -simple_diagnostic_path::get_event (int idx) const
> -{
> -  return *m_events[idx];
> -}
> -
> -const diagnostic_thread &
> -simple_diagnostic_path::get_thread (diagnostic_thread_id_t idx) const
> -{
> -  return *m_threads[idx];
> -}
> -
> -bool
> -simple_diagnostic_path::same_function_p (int event_idx_a,
> -int event_idx_b) const
> -{
> -  return (m_events[event_idx_a]->get_fndecl ()
> - == m_events[event_idx_b]->get_fndecl ());
> -}
> -
>  diagnostic_thread_id_t
>  simple_diagnostic_path::add_thread (const char *name)
>  {
>
>
> Jakub
>


Re: [PATCH] middle-end/118801 - excessive redundant DEBUG BEGIN_STMT

2025-02-26 Thread Richard Biener
On Tue, 11 Feb 2025, Richard Biener wrote:

> On Mon, 10 Feb 2025, Richard Biener wrote:
> 
> > On Mon, 10 Feb 2025, Richard Biener wrote:
> > 
> > > The following addresses the fact that we keep an excessive amount of
> > > redundant DEBUG BEGIN_STMTs - in the testcase it sums up to 99.999%
> > > of all stmts, sucking up compile-time in IL walks.  The patch amends
> > > the GIMPLE DCE code that elides redundant DEBUG BIND stmts, also
> > > pruning uninterrupted sequences of DEBUG BEGIN_STMTs, keeping only
> > > the last one.
> > > 
> > > Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
> > > 
> > > For the testcase this brings down compile-time to 150% of -g0 levels
> > > (and only 215 out of originally 1981380 DEBUG BEGIN_STMTs kept).
> > > 
> > > OK for trunk and possibly backports?
> > 
> > It regresses a few guality checks (and progresses one), I've looked
> > only into one, g++.dg/guality/pr67192.C, where we now see
> > FAIL: g++.dg/guality/pr67192.C   -O[123sg]  line 54 cnt == 15
> > because the breakpoint happens in the wrong place.  But this shows
> > it "works" only by accident.  The testcase is
> > 
> > __attribute__((noinline, noclone)) static void
> > f4 (void)
> > {
> >   while (1) /* { dg-final { gdb-test 54 "cnt" "15" } } */
> > if (last ())
> >   break;
> > else
> >   do_it ();
> >   do_it (); /* { dg-final { gdb-test 59 "cnt" "20" } } */
> > }
> > 
> > and we have two BEGIN_STMTs for line 54(!) originally:
> > 
> > >   [/space/rguenther/src/gcc/gcc/testsuite/g++.dg/guality/pr67192.C:54:3] # DEBUG BEGIN_STMT
> > >   :
> > >   [/space/rguenther/src/gcc/gcc/testsuite/g++.dg/guality/pr67192.C:55:5] # DEBUG BEGIN_STMT
> > > ...
> > >   [/space/rguenther/src/gcc/gcc/testsuite/g++.dg/guality/pr67192.C:54:3] # DEBUG BEGIN_STMT
> > >   [/space/rguenther/src/gcc/gcc/testsuite/g++.dg/guality/pr67192.C:55:5] goto ;
> > 
> > and special code in make_blocks() moves the first BEGIN_STMT after
> > the label, altering when we reach a breakpoint on the line.
> > 
> > You can see that with the first BEGIN_STMT moved the patch will elide it,
> > and gdb will find the second location.
> > 
> > With removing only repeating BEGIN_STMT with exactly
> > the same location (unfortunately with uint64_t a bitmap no longer
> > works), we're "only" down to 996 BEGIN_STMTs for the testcase.
> > 
> > So I'm retesting the following.
> 
> Bootstrapped and tested on x86_64-unknown-linux-gnu without
> regressions this time.
> 
> Alex, is this OK for trunk?

Ping.

Richard.

> Thanks,
> Richard.
> 
> 
> > Richard.
> > 
> > From 38d49d3e2c0bf98e9e2a46e251ae0454b084cc8d Mon Sep 17 00:00:00 2001
> > From: Richard Biener 
> > Date: Mon, 10 Feb 2025 10:23:45 +0100
> > Subject: [PATCH] middle-end/118801 - excessive redundant DEBUG BEGIN_STMT
> > To: gcc-patches@gcc.gnu.org
> > 
> > The following addresses the fact that we keep an excessive amount of
> > redundant DEBUG BEGIN_STMTs - in the testcase it sums up to 99.999%
> > of all stmts, sucking up compile-time in IL walks.  The patch amends
> > the GIMPLE DCE code that elides redundant DEBUG BIND stmts, also
> > pruning uninterrupted sequences of DEBUG BEGIN_STMTs, keeping only
> > the last of each set of DEBUG BEGIN_STMT with unique location.
> > 
> > PR middle-end/118801
> > * tree-ssa-dce.cc (eliminate_unnecessary_stmts): Prune
> > sequences of uninterrupted DEBUG BEGIN_STMTs, keeping only
> > the last of a set with unique location.
> > ---
> > >  gcc/tree-ssa-dce.cc | 10 ++++++++++
> >  1 file changed, 10 insertions(+)
> > 
> > diff --git a/gcc/tree-ssa-dce.cc b/gcc/tree-ssa-dce.cc
> > index be21a2d0b50..461283ba858 100644
> > --- a/gcc/tree-ssa-dce.cc
> > +++ b/gcc/tree-ssa-dce.cc
> > @@ -1508,6 +1508,7 @@ eliminate_unnecessary_stmts (bool aggressive)
> >  
> >/* Remove dead statements.  */
> >auto_bitmap debug_seen;
> > +  hash_set> locs_seen;
> >for (gsi = gsi_last_bb (bb); !gsi_end_p (gsi); gsi = psi)
> > {
> >   stmt = gsi_stmt (gsi);
> > @@ -1670,6 +1671,15 @@ eliminate_unnecessary_stmts (bool aggressive)
> > remove_dead_stmt (&gsi, bb, to_remove_edges);
> >   continue;
> > }
> > + else if (gimple_debug_begin_stmt_p (stmt))
> > +   {
> > + /* We are only keeping the last debug-begin in a series of
> > +debug-begin stmts.  */
> > + if (locs_seen.add (gimple_location (stmt)))
> > +   remove_dead_stmt (&gsi, bb, to_remove_edges);
> > + continue;
> > +   }
> > + locs_seen.empty ();
> >   bitmap_clear (debug_seen);
> > }
> >  
> > 
> 
> 
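
The pruning rule in the quoted patch can be sketched outside of GCC as
follows. This is a simplified model, not the tree-ssa-dce.cc code: the
'stmt' struct and 'prune_debug_begin' are hypothetical names, locations are
plain integers, and every non-marker statement ends a run (the real loop
also handles debug binds separately).

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Simplified stand-in for a GIMPLE statement: either a debug begin-stmt
// marker with a location, or any other statement.
struct stmt
{
  bool is_debug_begin;
  unsigned location;
};

// Walk the block backwards, as the DCE loop does, and within each
// uninterrupted run of markers keep only the last one per location.
inline std::vector<stmt> prune_debug_begin (const std::vector<stmt> &bb)
{
  std::vector<stmt> kept_reversed;
  std::unordered_set<unsigned> locs_seen;
  for (auto it = bb.rbegin (); it != bb.rend (); ++it)
    {
      if (it->is_debug_begin)
        {
          // insert().second is false if the location was already seen.
          if (locs_seen.insert (it->location).second)
            kept_reversed.push_back (*it);
          continue;
        }
      locs_seen.clear ();  // an ordinary statement ends the run
      kept_reversed.push_back (*it);
    }
  return std::vector<stmt> (kept_reversed.rbegin (), kept_reversed.rend ());
}
```

Note that std::unordered_set::insert reports "newly inserted", whereas
hash_set::add in the patch is used as "already present", so the condition
is inverted relative to the diff.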

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)


[PATCH] alias: Perform offset arithmetics in poly_offset_int rather than poly_int64 [PR118819]

2025-02-26 Thread Jakub Jelinek
On Wed, Feb 26, 2025 at 10:58:26AM +0100, Richard Biener wrote:
> > This PR is about ubsan error on the c - cx1 + cy1 evaluation in the first
> > hunk.
> > 
> > The following patch hopefully fixes that by doing the additions/subtractions
> > in poly_uint64 rather than poly_int64.  Or shall we instead perform it in
> > offset_int and watch for overflows and punt somehow for those?
> 
> I think when we have the offset computation overflow ignoring such
> overflow will make the memrefs_conflict_p give possibly wrong
> answers.  In the PR you say cselib now has those
> -9223372036854775807ish offsets from sp, why does it do alias queries
> with those clearly invalid offsets?

It wants to have a MEM which overlaps anything below the stack.
So, for a stack that grows down and a non-biased stack pointer, it uses
sp - PTRDIFF_MAX with a MEM_SIZE of PTRDIFF_MAX as an approximation of that.

> So yes, I think we need to punt on overflow.  Using poly_offset_int
> should work but it comes at a cost.  Does poly_wide_int have the same
> wide-int like overflow overloads?

poly_offset_int will definitely be cheaper than any other poly_wide_int: the
precision would need to be at least 66 bits, so 128 bits is certainly cheaper.

So like this if it passes full bootstrap/regtest?
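
The approach (do the arithmetic in a wider type, then punt if the result
does not fit) can be sketched in isolation. This is only an illustration:
it uses the GCC/Clang __int128 extension as a stand-in for poly_offset_int,
plain int64_t instead of poly_int64, and a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdint>

// Compute c - cx1 + cy1 in a wider type and report whether the result
// fits in int64_t.  __int128 is a GCC/Clang extension used here in place
// of poly_offset_int; 'adjust_offset' is a hypothetical name.
inline bool adjust_offset (std::int64_t c, std::int64_t cx1,
                           std::int64_t cy1, std::int64_t *result)
{
  __int128 wide = c;
  wide -= cx1;
  wide += cy1;
  if (wide < INT64_MIN || wide > INT64_MAX)
    return false;  // the caller would return -1 ("don't know")
  *result = static_cast<std::int64_t> (wide);
  return true;
}
```

Because the intermediate value is never narrowed before the range check,
the signed overflow that ubsan reported cannot occur in the wide type.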

2025-02-26  Jakub Jelinek  

PR middle-end/118819
* alias.cc (memrefs_conflict_p): Perform arithmetics on c, xsize and
ysize in poly_offset_int and return -1 if it is not representable in
poly_int64.

--- gcc/alias.cc.jj 2025-01-02 11:23:24.0 +0100
+++ gcc/alias.cc 2025-02-26 12:31:52.860341105 +0100
@@ -2535,19 +2535,39 @@ memrefs_conflict_p (poly_int64 xsize, rt
return memrefs_conflict_p (xsize, x1, ysize, y1, c);
  if (poly_int_rtx_p (x1, &cx1))
{
+ poly_offset_int co = c;
+ co -= cx1;
  if (poly_int_rtx_p (y1, &cy1))
-   return memrefs_conflict_p (xsize, x0, ysize, y0,
-  c - cx1 + cy1);
+   {
+ co += cy1;
+ if (!co.to_shwi (&c))
+   return -1;
+ return memrefs_conflict_p (xsize, x0, ysize, y0, c);
+   }
+ else if (!co.to_shwi (&c))
+   return -1;
  else
-   return memrefs_conflict_p (xsize, x0, ysize, y, c - cx1);
+   return memrefs_conflict_p (xsize, x0, ysize, y, c);
}
  else if (poly_int_rtx_p (y1, &cy1))
-   return memrefs_conflict_p (xsize, x, ysize, y0, c + cy1);
+   {
+ poly_offset_int co = c;
+ co += cy1;
+ if (!co.to_shwi (&c))
+   return -1;
+ return memrefs_conflict_p (xsize, x, ysize, y0, c);
+   }
 
  return -1;
}
   else if (poly_int_rtx_p (x1, &cx1))
-   return memrefs_conflict_p (xsize, x0, ysize, y, c - cx1);
+   {
+ poly_offset_int co = c;
+ co -= cx1;
+ if (!co.to_shwi (&c))
+   return -1;
+ return memrefs_conflict_p (xsize, x0, ysize, y, c);
+   }
 }
   else if (GET_CODE (y) == PLUS)
 {
@@ -2563,7 +2583,13 @@ memrefs_conflict_p (poly_int64 xsize, rt
 
   poly_int64 cy1;
   if (poly_int_rtx_p (y1, &cy1))
-   return memrefs_conflict_p (xsize, x, ysize, y0, c + cy1);
+   {
+ poly_offset_int co = c;
+ co += cy1;
+ if (!co.to_shwi (&c))
+   return -1;
+ return memrefs_conflict_p (xsize, x, ysize, y0, c);
+   }
   else
return -1;
 }
@@ -2616,8 +2642,16 @@ memrefs_conflict_p (poly_int64 xsize, rt
  if (maybe_gt (xsize, 0))
xsize = -xsize;
  if (maybe_ne (xsize, 0))
-   xsize += sc + 1;
- c -= sc + 1;
+   {
+ poly_offset_int xsizeo = xsize;
+ xsizeo += sc + 1;
+ if (!xsizeo.to_shwi (&xsize))
+   return -1;
+   }
+ poly_offset_int co = c;
+ co -= sc + 1;
+ if (!co.to_shwi (&c))
+   return -1;
  return memrefs_conflict_p (xsize, canon_rtx (XEXP (x, 0)),
 ysize, y, c);
}
@@ -2631,8 +2665,16 @@ memrefs_conflict_p (poly_int64 xsize, rt
  if (maybe_gt (ysize, 0))
ysize = -ysize;
  if (maybe_ne (ysize, 0))
-   ysize += sc + 1;
- c += sc + 1;
+   {
+ poly_offset_int ysizeo = ysize;
+ ysizeo += sc + 1;
+ if (!ysizeo.to_shwi (&ysize))
+   return -1;
+   }
+ poly_offset_int co = c;
+ co += sc + 1;
+ if (!co.to_shwi (&c))
+   return -1;
  return memrefs_conflict_p (xsize, x,
 ysize, canon_rtx (XEXP (y, 0)), c);
}
@@ -2643,7 +2685,11 @@ memrefs_conflict_p (poly_int64 xsize, rt
   poly_int64 cx, cy;
   if (poly_int_rtx_p (x, &c

Re: [PATCH] middle-end/118801 - excessive redundant DEBUG BEGIN_STMT

2025-02-26 Thread Jakub Jelinek
On Wed, Feb 26, 2025 at 12:36:15PM +0100, Richard Biener wrote:
> > Bootstrapped and tested on x86_64-unknown-linux-gnu without
> > regressions this time.
> > 
> > Alex, is this OK for trunk?
> 
> Ping.

I'd really like to hear from Alex on this one.

Jakub



[committed][wwwdocs] Correct the MIPS O64 floating-point argument passing convention

2025-02-26 Thread Maciej W. Rozycki
Update according to the amendment made for GCC 4.0.0 with commit
b11a9d5f3f90 back in 2004:

(mips_arg_info): Don't allow fpr_p to affect the register or
stack alignment.  Remove o64 silliness.

Retain the description of the former semantics for reference.
---
 htdocs/projects/mipso64-abi.html | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/htdocs/projects/mipso64-abi.html b/htdocs/projects/mipso64-abi.html
index 6cd60c22..8a2a6618 100644
--- a/htdocs/projects/mipso64-abi.html
+++ b/htdocs/projects/mipso64-abi.html
@@ -39,11 +39,14 @@ packed towards the upper-address side.

 Floating-Point Arguments

-If the first and second arguments floating-point arguments to a
-function are 32-bit values, they are passed in $f12 and
-$f14.  If the first is a 32-bit value and the second is a
-64-bit value, they are passed in $f12 and
-$f13.  If they are both 64-bit values, they are passed in
+As from GCC 4.0.0 the first and second floating-point arguments to a
+function are passed in $f12 and $f14.
+
+Previously if the first and second floating-point arguments to a
+function were 32-bit values, they were passed in $f12 and
+$f14.  If the first was a 32-bit value and the second was a
+64-bit value, they were passed in $f12 and
+$f13.  If they were both 64-bit values, they were passed in
 $f12 and $f13.

 ELF Header
-- 
2.20.1


[PATCH] simple-diagnostic-path: Inline two trivial methods [PR116143]

2025-02-26 Thread Jakub Jelinek
Hi!

Various plugin tests fail with --enable-checking=release, because the
num_events and num_threads methods of simple_diagnostic_path are only used
inside of #if CHECKING_P code inside of GCC proper and then tested inside of
some plugin tests.  So, with --enable-checking=yes they are compiled into
cc1/cc1plus etc. binaries and plugins can call those, but with
--enable-checking=release they are optimized away (at least for LTO builds).

As they are trivial, the following patch just defines them inline, so that
the plugin tests get their definitions directly and don't have to rely
on cc1/cc1plus etc. exporting those.
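
As a minimal sketch of the idea (the class names path_base and simple_path
below are hypothetical stand-ins, and std::vector replaces GCC's own vec),
defining the trivial overriders in the class body makes them implicitly
inline, so a translation unit that includes the header can use the
definitions directly:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical abstract interface, standing in for diagnostic_path.
class path_base
{
public:
  virtual ~path_base () = default;
  virtual unsigned num_events () const = 0;
  virtual unsigned num_threads () const = 0;
};

// Hypothetical concrete class, standing in for simple_diagnostic_path.
class simple_path : public path_base
{
public:
  simple_path () { add_thread ("main"); }

  // Defined in the class body, hence implicitly inline.
  unsigned num_events () const final override
  { return static_cast<unsigned> (m_events.size ()); }
  unsigned num_threads () const final override
  { return static_cast<unsigned> (m_threads.size ()); }

  void add_event (const std::string &desc) { m_events.push_back (desc); }
  void add_thread (const std::string &name) { m_threads.push_back (name); }

private:
  std::vector<std::string> m_events;
  std::vector<std::string> m_threads;
};

// Usage: the counts come straight from the inline definitions.
inline void check_simple_path ()
{
  simple_path p;
  p.add_event ("first");
  assert (p.num_threads () == 1);
  assert (p.num_events () == 1);
}
```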

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2025-02-26  Jakub Jelinek  

PR testsuite/116143
* simple-diagnostic-path.h (simple_diagnostic_path::num_events): Define
inline.
(simple_diagnostic_path::num_threads): Likewise.
* simple-diagnostic-path.cc (simple_diagnostic_path::num_events):
Remove out of line definition.
(simple_diagnostic_path::num_threads): Likewise.

--- gcc/simple-diagnostic-path.h.jj 2025-01-02 11:23:37.876218670 +0100
+++ gcc/simple-diagnostic-path.h 2025-02-05 15:29:32.882855368 +0100
@@ -100,9 +100,9 @@ class simple_diagnostic_path : public di
  public:
   simple_diagnostic_path (pretty_printer *event_pp);
 
-  unsigned num_events () const final override;
+  unsigned num_events () const final override { return m_events.length (); }
   const diagnostic_event & get_event (int idx) const final override;
-  unsigned num_threads () const final override;
+  unsigned num_threads () const final override { return m_threads.length (); }
   const diagnostic_thread &
   get_thread (diagnostic_thread_id_t) const final override;
   bool
--- gcc/simple-diagnostic-path.cc.jj 2025-01-02 11:23:19.409476476 +0100
+++ gcc/simple-diagnostic-path.cc   2025-02-05 15:29:59.185492553 +0100
@@ -41,15 +41,6 @@ simple_diagnostic_path::simple_diagnosti
   add_thread ("main");
 }
 
-/* Implementation of diagnostic_path::num_events vfunc for
-   simple_diagnostic_path: simply get the number of events in the vec.  */
-
-unsigned
-simple_diagnostic_path::num_events () const
-{
-  return m_events.length ();
-}
-
 /* Implementation of diagnostic_path::get_event vfunc for
simple_diagnostic_path: simply return the event in the vec.  */
 
@@ -59,12 +50,6 @@ simple_diagnostic_path::get_event (int i
   return *m_events[idx];
 }
 
-unsigned
-simple_diagnostic_path::num_threads () const
-{
-  return m_threads.length ();
-}
-
 const diagnostic_thread &
 simple_diagnostic_path::get_thread (diagnostic_thread_id_t idx) const
 {

Jakub



[PATCH] c: stddef.h C23 fixes [PR114870]

2025-02-26 Thread Jakub Jelinek
Hi!

The stddef.h header for C23 defines __STDC_VERSION_STDDEF_H__ and
unreachable macros multiple times in some cases.
The header doesn't have a normal multiple inclusion guard, because it supports
inclusion from glibc with __need_{size_t,wchar_t,ptrdiff_t,wint_t,NULL} defined.
The definition of __STDC_VERSION_STDDEF_H__ and unreachable is done
solely in the #ifdef _STDDEF_H part, so they should be defined only if stddef.h
is included without those __need_* macros defined.  But once
stddef.h is included without the __need_* macros, _STDDEF_H is defined,
and while further stddef.h includes without __need_* macros don't do
anything:
#if (!defined(_STDDEF_H) && !defined(_STDDEF_H_) && !defined(_ANSI_STDDEF_H) \
 && !defined(__STDDEF_H__)) \
|| defined(__need_wchar_t) || defined(__need_size_t) \
|| defined(__need_ptrdiff_t) || defined(__need_NULL) \
|| defined(__need_wint_t)
if one includes whole stddef.h first and then stddef.h with some of the
__need_* macros defined, the #ifdef _STDDEF_H part is used again.
It isn't that big a deal for most cases, as it uses extra guarding macros
like:
#ifndef _GCC_MAX_ALIGN_T
#define _GCC_MAX_ALIGN_T
...
#endif
etc., but for __STDC_VERSION_STDDEF_H__/unreachable nothing like that is
used.

So, either we do what the following patch does and just don't define
__STDC_VERSION_STDDEF_H__/unreachable a second time, or use #ifndef
unreachable separately for the #define unreachable() case, or use a
new _GCC_STDC_VERSION_STDDEF_H macro to guard this (or two, one for
__STDC_VERSION_STDDEF_H__ and one for unreachable), or rework the initial
condition to be just
#if !defined(_STDDEF_H) && !defined(_STDDEF_H_) && !defined(_ANSI_STDDEF_H) \
&& !defined(__STDDEF_H__)
- I really don't understand why the header should do anything at all after
it has been included once without __need_* macros.  But changing how this
behaves after 35 years might be risky for various OS/libc combinations.
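
A minimal sketch of the "don't define it a second time" option, assuming
hypothetical DEMO_/demo_ names rather than the real stddef.h macros.  The two
preprocessor blocks stand in for two passes over the #ifdef _STDDEF_H part;
the second pass deliberately carries different values to show that it is
skipped:

```cpp
#include <cassert>

// First "inclusion": the version macro is not defined yet, so the guarded
// block is processed.
#ifndef DEMO_STDC_VERSION_STDDEF_H
#define demo_unreachable() (__builtin_unreachable ())
#define DEMO_STDC_VERSION_STDDEF_H 202311L
#endif

// Second "inclusion" (for example one made with __need_size_t defined):
// the version macro already exists, so nothing is redefined.
#ifndef DEMO_STDC_VERSION_STDDEF_H
#define demo_unreachable() (__builtin_trap ())
#define DEMO_STDC_VERSION_STDDEF_H 0L
#endif

// The value seen by the program is the one from the first pass.
inline long demo_version () { return DEMO_STDC_VERSION_STDDEF_H; }

inline void check_guard ()
{
  assert (demo_version () == 202311L);
}
```

Using the version macro itself as the guard avoids introducing a separate
guard macro, at the cost of tying the unreachable definition to it.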

In any case, the following patch has been bootstrapped/regtested on
x86_64-linux and i686-linux, ok for trunk?  Or something else?

2025-02-26  Jakub Jelinek  

PR c/114870
* ginclude/stddef.h (__STDC_VERSION_STDDEF_H__, unreachable): Don't
redefine multiple times if stddef.h is first included without __need_*
defines and later with them.  Move nullptr_t and unreachable and
__STDC_VERSION_STDDEF_H__ definitions into the same
defined (__STDC_VERSION__) && __STDC_VERSION__ > 201710L #if block.

* gcc.dg/c23-stddef-2.c: New test.

--- gcc/ginclude/stddef.h.jj 2025-01-02 11:47:29.191236622 +0100
+++ gcc/ginclude/stddef.h   2025-02-25 18:41:19.755634248 +0100
@@ -444,18 +444,16 @@ typedef struct {
 #endif
 #endif /* C++11.  */
 
-#if (defined (__STDC_VERSION__) && __STDC_VERSION__ > 201710L)
+#if defined (__STDC_VERSION__) && __STDC_VERSION__ > 201710L
 #ifndef _GCC_NULLPTR_T
 #define _GCC_NULLPTR_T
   typedef __typeof__(nullptr) nullptr_t;
-/* ??? This doesn't define __STDC_VERSION_STDDEF_H__ yet.  */
 #endif
-#endif /* C23.  */
-
-#if defined __STDC_VERSION__ && __STDC_VERSION__ > 201710L
+#ifndef __STDC_VERSION_STDDEF_H__
 #define unreachable() (__builtin_unreachable ())
 #define __STDC_VERSION_STDDEF_H__  202311L
 #endif
+#endif /* C23.  */
 
 #endif /* _STDDEF_H was defined this time */
 
--- gcc/testsuite/gcc.dg/c23-stddef-2.c.jj  2025-02-25 19:04:36.931290359 +0100
+++ gcc/testsuite/gcc.dg/c23-stddef-2.c 2025-02-25 19:05:46.575324915 +0100
@@ -0,0 +1,17 @@
+/* Test __STDC_VERSION_STDDEF_H__ in C23.  */
+/* { dg-do preprocess } */
+/* { dg-options "-std=c23 -pedantic-errors -Wsystem-headers" } */
+
+#include <stddef.h>
+/* Simulate what glibc  does without actually including it.  */
+#define __need_size_t
+#define __need_NULL
+#include <stddef.h>
+
+#ifndef __STDC_VERSION_STDDEF_H__
+#error "__STDC_VERSION_STDDEF_H__ not defined"
+#endif
+
+#if __STDC_VERSION_STDDEF_H__ != 202311L
+#error "bad value of __STDC_VERSION_STDDEF_H__"
+#endif

Jakub



[PATCH] c: Assorted fixes for flexible array members in unions [PR119001]

2025-02-26 Thread Jakub Jelinek
Hi!

r15-209 allowed flexible array members inside of unions, but as the
following testcase shows, not everything has been adjusted for that.
Unlike in structures, in unions a flexible array member (as an extension)
can be any of the members, not just the last one, as in a union all
members are effectively last.
The first hunk fixes an ICE when a FAM in a union which is not the last
FIELD_DECL is initialized with a string literal, the second hunk is just a
formatting fix, the third hunk fixes a bug in which we were just throwing
away the initializers (other than string literals) of FAMs in unions which
aren't the last FIELD_DECL, and the last hunk diagnoses FAM errors in unions
the same way as for structures, in particular initializing a FAM with a
non-constant and initialization in a nested context.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2025-02-26  Jakub Jelinek  

PR c/119001
gcc/
* varasm.cc (output_constructor_regular_field): Don't fail
assertion if next is non-NULL and FIELD_DECL if
TREE_CODE (local->type) is UNION_TYPE.
gcc/c/
* c-typeck.cc (pop_init_level): Don't clear constructor_type
if DECL_CHAIN of constructor_fields is NULL but p->type is UNION_TYPE.
Formatting fix.
(process_init_element): Diagnose non-static initialization of flexible
array member in union or FAM in union initialization in nested context.
gcc/testsuite/
* gcc.dg/pr119001-1.c: New test.
* gcc.dg/pr119001-2.c: New test.

--- gcc/varasm.cc.jj 2025-01-03 17:59:48.816160159 +0100
+++ gcc/varasm.cc   2025-02-25 10:52:08.043968775 +0100
@@ -5827,10 +5827,13 @@ output_constructor_regular_field (oc_loc
 and the FE splits them into dynamic initialization.  */
  gcc_checking_assert (fieldsize >= fldsize);
  /* Given a non-empty initialization, this field had better
-be last.  Given a flexible array member, the next field
-on the chain is a TYPE_DECL of the enclosing struct.  */
+be last except in unions.  Given a flexible array member, the next
+field on the chain is a TYPE_DECL of the enclosing struct.  */
  const_tree next = DECL_CHAIN (local->field);
- gcc_assert (!fieldsize || !next || TREE_CODE (next) != FIELD_DECL);
+ gcc_assert (!fieldsize
+ || !next
+ || TREE_CODE (next) != FIELD_DECL
+ || TREE_CODE (local->type) == UNION_TYPE);
}
   else
fieldsize = tree_to_uhwi (DECL_SIZE_UNIT (local->field));
--- gcc/c/c-typeck.cc.jj 2025-02-13 14:10:52.934623189 +0100
+++ gcc/c/c-typeck.cc   2025-02-25 11:54:39.857363690 +0100
@@ -10270,7 +10270,8 @@ pop_init_level (location_t loc, int impl
  gcc_assert (!TYPE_SIZE (constructor_type));
 
  if (constructor_depth > 2)
-   error_init (loc, "initialization of flexible array member in a nested context");
+   error_init (loc, "initialization of flexible array member "
+"in a nested context");
  else
pedwarn_init (loc, OPT_Wpedantic,
  "initialization of a flexible array member");
@@ -10278,7 +10279,8 @@ pop_init_level (location_t loc, int impl
  /* We have already issued an error message for the existence
 of a flexible array member not at the end of the structure.
 Discard the initializer so that we do not die later.  */
- if (DECL_CHAIN (constructor_fields) != NULL_TREE)
+ if (DECL_CHAIN (constructor_fields) != NULL_TREE
+ && (!p->type || TREE_CODE (p->type) != UNION_TYPE))
constructor_type = NULL_TREE;
}
 }
@@ -12124,6 +12126,42 @@ retry:
warning (OPT_Wtraditional, "traditional C rejects initialization "
 "of unions");
 
+ /* Error for non-static initialization of a flexible array member.  */
+ if (fieldcode == ARRAY_TYPE
+ && !require_constant_value
+ && TYPE_SIZE (fieldtype) == NULL_TREE)
+   {
+ error_init (loc, "non-static initialization of a flexible "
+ "array member");
+ break;
+   }
+
+ /* Error for initialization of a flexible array member with
+a string constant if the structure is in an array.  E.g.:
+union U { int x; char y[]; };
+union U s[] = { { 1, "foo" } };
+is invalid.  */
+ if (string_flag
+ && fieldcode == ARRAY_TYPE
+ && constructor_depth > 1
+ && TYPE_SIZE (fieldtype) == NULL_TREE)
+   {
+ bool in_array_p = false;
+ for (struct constructor_stack *p = constructor_stack;
+  p && p->type; p = p->next)
+   if (TREE_CODE (p->type) == ARRAY_TYPE)
+ {
+   in_array_

[PATCH] RISC-V: Do not free a riscv_arch_string when handling target-arch attribute

2025-02-26 Thread 翁愷邑
The build_target_option_node() function may return a cached node when
fndecls have the same effective global_options. Therefore, freeing
memory used in target nodes can lead to a use-after-free issue, as a
target node may be shared by multiple fndecls.
This issue occurs in gcc.target/riscv/target-attr-16.c, where all
functions have the same march, but the last function tries to free its
old x_riscv_arch_string (which is shared) when processing the second
target attribute.  However, the behavior of this issue depends on how the
OS handles malloc. It's very likely that xstrdup returns the old address
just freed, coincidentally hiding the issue. We can verify the issue by
forcing xstrdup to return a new address, e.g.,

-  if (opts->x_riscv_arch_string != default_opts->x_riscv_arch_string)
-free (CONST_CAST (void *, (const void *) opts->x_riscv_arch_string));
+  // Force it to use a new address, NFCI
+  const char *tmp = opts->x_riscv_arch_string;
   opts->x_riscv_arch_string = xstrdup (local_arch_str);

+  if (tmp != default_opts->x_riscv_arch_string)
+free (CONST_CAST (void *, (const void *) tmp));

This patch replaces xstrdup with ggc_strdup and lets the GC take care of
unused strings.
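
A loose sketch of the ownership idea, not of the actual RISC-V code: the
types target_node and fn_decl and the function update_arch are hypothetical,
and std::shared_ptr reference counting is swapped in for GCC's garbage
collector.  The point it illustrates is that a string reachable from a shared
node must not be freed by one user while another user can still reach it:

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for a cached target option node.
struct target_node
{
  std::shared_ptr<const std::string> arch_string;
};

// Hypothetical stand-in for a function declaration referring to a node.
struct fn_decl
{
  std::shared_ptr<target_node> node;
};

// Give FN its own node with a new arch string.  Nothing is freed by hand:
// the old node and its string stay alive while any other user holds them.
inline void update_arch (fn_decl &fn, const std::string &new_arch)
{
  auto fresh = std::make_shared<target_node> ();
  fresh->arch_string = std::make_shared<const std::string> (new_arch);
  fn.node = fresh;
}

inline void check_shared_node ()
{
  auto shared = std::make_shared<target_node> ();
  shared->arch_string = std::make_shared<const std::string> ("rv64gc");
  fn_decl f1{shared}, f2{shared};

  update_arch (f2, "rv64gcv");

  // f1 still sees the original, still-valid string.
  assert (*f1.node->arch_string == "rv64gc");
  assert (*f2.node->arch_string == "rv64gcv");
}
```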

gcc/ChangeLog:

  * config/riscv/riscv-target-attr.cc
(riscv_target_attr_parser::update_settings):
  Do not manually free any arch string.


0001-RISC-V-Do-not-free-a-riscv_arch_string-when-handling.patch
Description: Binary data


[PATCH] alias: Perform offset arithmetics in poly_uint64 rather than poly_int64 [PR118819]

2025-02-26 Thread Jakub Jelinek
Hi!

This PR is about a ubsan error on the c - cx1 + cy1 evaluation in the first
hunk.

The following patch hopefully fixes that by doing the additions/subtractions
in poly_uint64 rather than poly_int64.  Or shall we instead perform it in
offset_int and watch for overflows and punt somehow for those?

Or shall we just treat this way only the first case where it is
adding/subtracting 3 numbers and not just 2, so there is at least a chance
the overflow is just temporary?

Bootstrapped/regtested on x86_64-linux and i686-linux (but just normal
bootstrap, not ubsan one), ok for trunk?

2025-02-26  Jakub Jelinek  

PR middle-end/118819
* alias.cc (memrefs_conflict_p): Perform arithmetics on c in
poly_uint64 type rather than poly_int64 to avoid compile time
UB.

--- gcc/alias.cc.jj 2025-01-02 11:23:24.0 +0100
+++ gcc/alias.cc 2025-02-25 12:43:16.655507666 +0100
@@ -2537,12 +2537,14 @@ memrefs_conflict_p (poly_int64 xsize, rt
{
  if (poly_int_rtx_p (y1, &cy1))
return memrefs_conflict_p (xsize, x0, ysize, y0,
-  c - cx1 + cy1);
+  (poly_uint64) c - cx1 + cy1);
  else
-   return memrefs_conflict_p (xsize, x0, ysize, y, c - cx1);
+   return memrefs_conflict_p (xsize, x0, ysize, y,
+  (poly_uint64) c - cx1);
}
  else if (poly_int_rtx_p (y1, &cy1))
-   return memrefs_conflict_p (xsize, x, ysize, y0, c + cy1);
+   return memrefs_conflict_p (xsize, x, ysize, y0,
+  (poly_uint64) c + cy1);
 
  return -1;
}
@@ -2563,7 +2565,8 @@ memrefs_conflict_p (poly_int64 xsize, rt
 
   poly_int64 cy1;
   if (poly_int_rtx_p (y1, &cy1))
-   return memrefs_conflict_p (xsize, x, ysize, y0, c + cy1);
+   return memrefs_conflict_p (xsize, x, ysize, y0,
+  (poly_uint64) c + cy1);
   else
return -1;
 }
@@ -2643,7 +2646,7 @@ memrefs_conflict_p (poly_int64 xsize, rt
   poly_int64 cx, cy;
   if (poly_int_rtx_p (x, &cx) && poly_int_rtx_p (y, &cy))
{
- c += cy - cx;
+ c += (poly_uint64) cy - cx;
  return offset_overlap_p (c, xsize, ysize);
}
 

Jakub



[Fortran, Patch, PR118789, v1] Fix associate to void*

2025-02-26 Thread Andre Vehreschild
Hi all,

here is my shot at fixing this PR. The issue is that, when checking whether the
tree to associate to is a pointer, gfortran does not handle void* aka c_ptr
correctly. On the tree level this can be done by checking the compatibility of
the pointed-to types. If they are not compatible, an address op is added.

I also checked the F2018 standard and could not find any mention that a c_ptr is
disallowed or needs to be treated specially in an associate.

Regtests ok on x86_64-pc-linux-gnu / F41. Ok for mainline?

Regards,
Andre
--
Andre Vehreschild * Email: vehre ad gmx dot de
From 292a1d9e67f44124474d8e4198723baa5dea5b4d Mon Sep 17 00:00:00 2001
From: Andre Vehreschild 
Date: Tue, 25 Feb 2025 17:15:47 +0100
Subject: [PATCH] Fortran: Fix ICE on associate of pointer [PR118789]

Fix ICE when associating a pointer to void (c_ptr) by looking at the
compatibility of the type hierarchy.

	PR fortran/118789

gcc/fortran/ChangeLog:

	* trans-stmt.cc (trans_associate_var): Compare pointed to types when
	expr to associate is already a pointer.

gcc/testsuite/ChangeLog:

	* gfortran.dg/associate_73.f90: New test.
---
 gcc/fortran/trans-stmt.cc  |  7 ++-
 gcc/testsuite/gfortran.dg/associate_73.f90 | 21 +
 2 files changed, 27 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gfortran.dg/associate_73.f90

diff --git a/gcc/fortran/trans-stmt.cc b/gcc/fortran/trans-stmt.cc
index e7da8fea3b2..f16e1e3b46e 100644
--- a/gcc/fortran/trans-stmt.cc
+++ b/gcc/fortran/trans-stmt.cc
@@ -2287,7 +2287,12 @@ trans_associate_var (gfc_symbol *sym, gfc_wrapped_block *block)
 		  tmp = se.expr;
 		}
 	}
-	  if (!POINTER_TYPE_P (TREE_TYPE (se.expr)))
+	  /* For non-pointer types in se.expr, the first condition holds.
+	 For pointer or reference types in se.expr, a double TREE_TYPE ()
+	 is possible and an associate variable always is a pointer.  */
+	  if (!POINTER_TYPE_P (TREE_TYPE (se.expr))
+	  || TREE_TYPE (TREE_TYPE (se.expr))
+		   != TREE_TYPE (TREE_TYPE (sym->backend_decl)))
 	tmp = gfc_build_addr_expr (tmp, se.expr);
 	}

diff --git a/gcc/testsuite/gfortran.dg/associate_73.f90 b/gcc/testsuite/gfortran.dg/associate_73.f90
new file mode 100644
index 000..a5c3ca79b9c
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/associate_73.f90
@@ -0,0 +1,21 @@
+!{ dg-do compile }
+
+! Check associate to a "void *" does not ICE.
+! Contributed by Matthias Klose  
+! and Steve Kargl  
+
+module pr118789
+
+   implicit none
+
+   CONTAINS
+
+   subroutine fckit_c_nodelete(cptr) bind(c)
+  use, intrinsic :: iso_c_binding
+  type(c_ptr), value :: cptr
+  associate( unused_ => cptr )
+  end associate
+   end subroutine
+
+end module
+
--
2.48.1



Re: [PATCH] alias: Perform offset arithmetics in poly_uint64 rather than poly_int64 [PR118819]

2025-02-26 Thread Richard Biener
On Wed, 26 Feb 2025, Jakub Jelinek wrote:

> Hi!
> 
> This PR is about ubsan error on the c - cx1 + cy1 evaluation in the first
> hunk.
> 
> The following patch hopefully fixes that by doing the additions/subtractions
> in poly_uint64 rather than poly_int64.  Or shall we instead perform it in
> offset_int and watch for overflows and punt somehow for those?

I think when we have the offset computation overflow ignoring such
overflow will make the memrefs_conflict_p give possibly wrong
answers.  In the PR you say cselib now has those
-9223372036854775807ish offsets from sp, why does it do alias queries
with those clearly invalid offsets?

So yes, I think we need to punt on overflow.  Using poly_offset_int
should work but it comes at a cost.  Does poly_wide_int have the same
wide-int like overflow overloads?
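
A minimal sketch of "punt on overflow" for the three-term case, assuming
plain int64_t offsets instead of poly_int64 and a hypothetical helper name
checked_offset.  It uses the GCC/Clang __builtin_*_overflow built-ins rather
than poly_offset_int, so unlike a wider intermediate type it also punts when
only the intermediate value overflows:

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Compute c - cx1 + cy1, returning std::nullopt if any step overflows.
// A caller would treat nullopt like the "return -1" (don't know) answer.
inline std::optional<int64_t>
checked_offset (int64_t c, int64_t cx1, int64_t cy1)
{
  int64_t tmp, res;
  if (__builtin_sub_overflow (c, cx1, &tmp))
    return std::nullopt;
  if (__builtin_add_overflow (tmp, cy1, &res))
    return std::nullopt;
  return res;
}

inline void check_checked_offset ()
{
  // Ordinary case: 10 - 3 + 5.
  assert (checked_offset (10, 3, 5).value () == 12);
  // INT64_MIN - 1 does not fit, so the helper punts.
  assert (!checked_offset (INT64_MIN, 1, 0).has_value ());
  // Only the intermediate INT64_MAX - (-1) overflows here; this sketch
  // still punts, which a wider intermediate type would avoid.
  assert (!checked_offset (INT64_MAX, -1, -1).has_value ());
}
```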

> Or shall we just treat this way only the first case where it is
> adding/subtracting 3 numbers and not just 2, so there is at least a chance
> the overflow is just temporary?
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux (but just normal
> bootstrap, not ubsan one), ok for trunk?
> 
> 2025-02-26  Jakub Jelinek  
> 
>   PR middle-end/118819
>   * alias.cc (memrefs_conflict_p): Perform arithmetics on c in
>   poly_uint64 type rather than poly_int64 to avoid compile time
>   UB.
> 
> --- gcc/alias.cc.jj   2025-01-02 11:23:24.0 +0100
> +++ gcc/alias.cc  2025-02-25 12:43:16.655507666 +0100
> @@ -2537,12 +2537,14 @@ memrefs_conflict_p (poly_int64 xsize, rt
>   {
> if (poly_int_rtx_p (y1, &cy1))
>   return memrefs_conflict_p (xsize, x0, ysize, y0,
> -c - cx1 + cy1);
> +(poly_uint64) c - cx1 + cy1);
> else
> - return memrefs_conflict_p (xsize, x0, ysize, y, c - cx1);
> + return memrefs_conflict_p (xsize, x0, ysize, y,
> +(poly_uint64) c - cx1);
>   }
> else if (poly_int_rtx_p (y1, &cy1))
> - return memrefs_conflict_p (xsize, x, ysize, y0, c + cy1);
> + return memrefs_conflict_p (xsize, x, ysize, y0,
> +(poly_uint64) c + cy1);
>  
> return -1;
>   }
> @@ -2563,7 +2565,8 @@ memrefs_conflict_p (poly_int64 xsize, rt
>  
>poly_int64 cy1;
>if (poly_int_rtx_p (y1, &cy1))
> - return memrefs_conflict_p (xsize, x, ysize, y0, c + cy1);
> + return memrefs_conflict_p (xsize, x, ysize, y0,
> +(poly_uint64) c + cy1);
>else
>   return -1;
>  }
> @@ -2643,7 +2646,7 @@ memrefs_conflict_p (poly_int64 xsize, rt
>poly_int64 cx, cy;
>if (poly_int_rtx_p (x, &cx) && poly_int_rtx_p (y, &cy))
>   {
> -   c += cy - cx;
> +   c += (poly_uint64) cy - cx;
> return offset_overlap_p (c, xsize, ysize);
>   }
>  
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)


Re: [PATCH] simple-diagnostic-path: Inline two trivial methods [PR116143]

2025-02-26 Thread Richard Biener
On Wed, Feb 26, 2025 at 10:27 AM Jakub Jelinek  wrote:
>
> Hi!
>
> Various plugin tests fail with --enable-checking=release, because the
> num_events and num_threads methods of simple_diagnostic_path are only used
> inside of #if CHECKING_P code inside of GCC proper and then tested inside of
> some plugin tests.  So, with --enable-checking=yes they are compiled into
> cc1/cc1plus etc. binaries and plugins can call those, but with
> --enable-checking=release they are optimized away (at least for LTO builds).
>
> As they are trivial, the following patch just defines them inline, so that
> the plugin tests get their definitions directly and don't have to rely
> on cc1/cc1plus etc. exporting those.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK

> 2025-02-26  Jakub Jelinek  
>
> PR testsuite/116143
> * simple-diagnostic-path.h (simple_diagnostic_path::num_events): 
> Define
> inline.
> (simple_diagnostic_path::num_threads): Likewise.
> * simple-diagnostic-path.cc (simple_diagnostic_path::num_events):
> Remove out of line definition.
> (simple_diagnostic_path::num_threads): Likewise.
>
> --- gcc/simple-diagnostic-path.h.jj 2025-01-02 11:23:37.876218670 +0100
> +++ gcc/simple-diagnostic-path.h2025-02-05 15:29:32.882855368 +0100
> @@ -100,9 +100,9 @@ class simple_diagnostic_path : public di
>   public:
>simple_diagnostic_path (pretty_printer *event_pp);
>
> -  unsigned num_events () const final override;
> +  unsigned num_events () const final override { return m_events.length (); }
>const diagnostic_event & get_event (int idx) const final override;
> -  unsigned num_threads () const final override;
> +  unsigned num_threads () const final override { return m_threads.length (); 
> }
>const diagnostic_thread &
>get_thread (diagnostic_thread_id_t) const final override;
>bool
> --- gcc/simple-diagnostic-path.cc.jj2025-01-02 11:23:19.409476476 +0100
> +++ gcc/simple-diagnostic-path.cc   2025-02-05 15:29:59.185492553 +0100
> @@ -41,15 +41,6 @@ simple_diagnostic_path::simple_diagnosti
>add_thread ("main");
>  }
>
> -/* Implementation of diagnostic_path::num_events vfunc for
> -   simple_diagnostic_path: simply get the number of events in the vec.  */
> -
> -unsigned
> -simple_diagnostic_path::num_events () const
> -{
> -  return m_events.length ();
> -}
> -
>  /* Implementation of diagnostic_path::get_event vfunc for
> simple_diagnostic_path: simply return the event in the vec.  */
>
> @@ -59,12 +50,6 @@ simple_diagnostic_path::get_event (int i
>return *m_events[idx];
>  }
>
> -unsigned
> -simple_diagnostic_path::num_threads () const
> -{
> -  return m_threads.length ();
> -}
> -
>  const diagnostic_thread &
>  simple_diagnostic_path::get_thread (diagnostic_thread_id_t idx) const
>  {
>
> Jakub
>


[PATCH] libstdc++: Allow 'configure.host' to modify 'EXTRA_CFLAGS', 'EXTRA_CXX_FLAGS'

2025-02-26 Thread Thomas Schwinge
In particular, 'GLIBCXX_ENABLE_CXX_FLAGS' shouldn't overwrite 'EXTRA_CXX_FLAGS'
but instead prepend any additional '--enable-cxx-flags=[...]'.

libstdc++-v3/
* acinclude.m4 (GLIBCXX_ENABLE_CXX_FLAGS): Prepend any additional
flags to 'EXTRA_CXX_FLAGS'.
* configure: Regenerate.
* configure.host: Document 'EXTRA_CFLAGS', 'EXTRA_CXX_FLAGS'.
---
 libstdc++-v3/acinclude.m4   | 3 ++-
 libstdc++-v3/configure  | 3 ++-
 libstdc++-v3/configure.host | 4 
 3 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index b3423d7957a..3287dab3b89 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -3269,7 +3269,8 @@ AC_DEFUN([GLIBCXX_ENABLE_CXX_FLAGS], [dnl
 done
   fi
 
-  EXTRA_CXX_FLAGS="$enable_cxx_flags"
+  # Prepend the additional flags.
+  EXTRA_CXX_FLAGS="$enable_cxx_flags $EXTRA_CXX_FLAGS"
   AC_MSG_RESULT($EXTRA_CXX_FLAGS)
   AC_SUBST(EXTRA_CXX_FLAGS)
 ])
diff --git a/libstdc++-v3/configure b/libstdc++-v3/configure
index e115ee55739..ba908577a66 100755
--- a/libstdc++-v3/configure
+++ b/libstdc++-v3/configure
@@ -19452,7 +19452,8 @@ fi
 done
   fi
 
-  EXTRA_CXX_FLAGS="$enable_cxx_flags"
+  # Prepend the additional flags.
+  EXTRA_CXX_FLAGS="$enable_cxx_flags $EXTRA_CXX_FLAGS"
   { $as_echo "$as_me:${as_lineno-$LINENO}: result: $EXTRA_CXX_FLAGS" >&5
 $as_echo "$EXTRA_CXX_FLAGS" >&6; }
 
diff --git a/libstdc++-v3/configure.host b/libstdc++-v3/configure.host
index 45f55b250ce..1e84c78af30 100644
--- a/libstdc++-v3/configure.host
+++ b/libstdc++-v3/configure.host
@@ -61,6 +61,10 @@
 #
 # It possibly modifies the following variables:
 #
+#   EXTRA_CFLAGS   extra flags to pass when compiling C code
+#
+#   EXTRA_CXX_FLAGSextra flags to pass when compiling C++ code
+#
 #   OPT_LDFLAGSextra flags to pass when linking the library, of
 #  the form '-Wl,blah'
 #  (defaults to empty in acinclude.m4)
-- 
2.34.1



Re: [PATCH] [testsuite] adjust expectations of x86 vect-simd-clone tests

2025-02-26 Thread Alexandre Oliva
On Feb 24, 2025, Mike Stump  wrote:

> I thought I saw one more needing review.

Thanks, Richard Sandiford reviewed it.
https://gcc.gnu.org/pipermail/gcc-patches/2025-February/676031.html

-- 
Alexandre Oliva, happy hacker            https://blog.lx.oliva.nom.br/
Free Software Activist FSFLA co-founder GNU Toolchain Engineer
More tolerance and less prejudice are key for inclusion and diversity.
Excluding neuro-others for not behaving "normal" is *not* inclusive!


Re: [3/3 PATCH v3]middle-end: delay checking for alignment to load [PR118464]

2025-02-26 Thread Richard Biener
On Tue, 25 Feb 2025, Tamar Christina wrote:

> Hi All,
> 
> This fixes two PRs on Early break vectorization by delaying the safety checks 
> to
> vectorizable_load when the VF, VMAT and vectype are all known.
> 
> This patch does add two new restrictions:
> 
> 1. On LOAD_LANES targets, where the buffer size is known, we reject uneven
>group sizes, as they are unaligned every n % 2 iterations and so may cross
>a page unwittingly.
> 
> 2. On LOAD_LANES targets when the buffer is unknown, we reject vectorization 
> if
>we cannot peel for alignment, as the alignment requirement is quite large 
> at
>GROUP_SIZE * vectype_size.  This is unlikely to ever be beneficial so we
>don't support it for now.
> 
> There are other steps documented inside the code itself so that the reasoning
> is next to the code.
> 
> Note that for VLA I have still left this fully disabled when not working on a
> fixed buffer.
> 
> For VLA targets like SVE return element alignment as the desired vector
> alignment.  This means that the loads are never misaligned and so annoying it
> won't ever need to peel.
> 
> So what I think needs to happen in GCC 16 is that.
> 
> 1. during vect_compute_data_ref_alignment we need to take the max of
>POLY_VALUE_MIN and vector_alignment.
> 
> 2. vect_do_peeling define skip_vector when PFA for VLA, and in the guard add a
>check that ncopies * vectype does not exceed POLY_VALUE_MAX which we use 
> as a
>proxy for pagesize.
> 
> 3. Force LOOP_VINFO_USING_PARTIAL_VECTORS_P to be true in
>vect_determine_partial_vectors_and_peeling since the first iteration has to
>be partial. If LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P we have to fail to
>vectorize.
> 
> 4. Create a default mask to be used, so that 
> vect_use_loop_mask_for_alignment_p
>becomes true and we generate the peeled check through loop control for
>partial loops.  From what I can tell this won't work for
>LOOP_VINFO_FULLY_WITH_LENGTH_P since they don't have any peeling support at
>all in the compiler.  That would need to be done independently from the
>above.
> 
> In any case, not GCC 15 material so I've kept the WIP patches I have 
> downstream.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu,
> arm-none-linux-gnueabihf, x86_64-pc-linux-gnu
> -m32, -m64 and no issues.
> 
> Ok for master?
> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   PR tree-optimization/118464
>   PR tree-optimization/116855
>   * doc/invoke.texi (min-pagesize): Update docs with vectorizer use.
>   * tree-vect-data-refs.cc (vect_analyze_early_break_dependences): Delay
>   checks.
>   (vect_compute_data_ref_alignment): Remove alignment checks and move to
>   get_load_store_type, increase group access alignment.
>   (vect_enhance_data_refs_alignment): Add note to comment needing
>   investigating.
>   (vect_analyze_data_refs_alignment): Likewise.
>   (vect_supportable_dr_alignment): For group loads look at first DR.
>   * tree-vect-stmts.cc (get_load_store_type):
>   Perform safety checks for early break pfa.
>   * tree-vectorizer.h (dr_set_safe_speculative_read_required,
>   dr_safe_speculative_read_required, DR_SCALAR_KNOWN_BOUNDS): New.
>   (need_peeling_for_alignment): Renamed to...
>   (safe_speculative_read_required): .. This
>   (class dr_vec_info): Add scalar_access_known_in_bounds.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR tree-optimization/118464
>   PR tree-optimization/116855
>   * gcc.dg/vect/bb-slp-pr65935.c: Update, it now vectorizes because the
>   load type is relaxed later.
>   * gcc.dg/vect/vect-early-break_121-pr114081.c: Update.
>   * gcc.dg/vect/vect-early-break_22.c: Reject for load_lanes targets
>   * g++.dg/vect/vect-early-break_7-pr118464.cc: New test.
>   * gcc.dg/vect/vect-early-break_132-pr118464.c: New test.
>   * gcc.dg/vect/vect-early-break_133_pfa1.c: New test.
>   * gcc.dg/vect/vect-early-break_133_pfa10.c: New test.
>   * gcc.dg/vect/vect-early-break_133_pfa2.c: New test.
>   * gcc.dg/vect/vect-early-break_133_pfa3.c: New test.
>   * gcc.dg/vect/vect-early-break_133_pfa4.c: New test.
>   * gcc.dg/vect/vect-early-break_133_pfa5.c: New test.
>   * gcc.dg/vect/vect-early-break_133_pfa6.c: New test.
>   * gcc.dg/vect/vect-early-break_133_pfa7.c: New test.
>   * gcc.dg/vect/vect-early-break_133_pfa8.c: New test.
>   * gcc.dg/vect/vect-early-break_133_pfa9.c: New test.
>   * gcc.dg/vect/vect-early-break_39.c: Update testcase for misalignment.
>   * gcc.dg/vect/vect-early-break_53.c: Likewise.
>   * gcc.dg/vect/vect-early-break_56.c: Likewise.
>   * gcc.dg/vect/vect-early-break_57.c: Likewise.
>   * gcc.dg/vect/vect-early-break_81.c: Likewise.
> 
> ---
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 
> ca8e468f3f2dbf68c959f74d8fec48c79463504d..fff2874b326d605fc2656adf4ab6eb5bd5d42d71
>  100644
> --- a/gcc/doc/invok

Re: [PATCH] simple-diagnostic-path, v2: Inline two trivial methods [PR116143]

2025-02-26 Thread Jakub Jelinek
On Wed, Feb 26, 2025 at 12:22:10PM +0100, Richard Biener wrote:
> On Wed, Feb 26, 2025 at 11:38 AM Jakub Jelinek  wrote:
> >
> > On Wed, Feb 26, 2025 at 10:45:37AM +0100, Richard Biener wrote:
> > > OK
> >
> > Unfortunately I've only bootstrapped/regtested it with normal checking.
> > Testing it with --enable-checking=release now shows that this patch just
> > moved the FAILs to a different symbol.  And note that isn't even a LTO
> > build.
> >
> > The following patch which IMHO still makes sense, those methods are also
> > trivial, moves it even further.  But the next problem is
> > _ZN22simple_diagnostic_path9add_eventEmP9tree_nodeiPKcz
> > and that method is IMHO too large for the header file to be defined inline,
> > and doesn't even have final override like the others, isn't virtual in the
> > abstract class.
> > So, I have really no idea why it isn't compiled in.
> 
> Hmm, so why isn't it part of libgccjit?  I suppose C++ does not really support
> exposing a class but not exporting the classes ABI?

Actually, I had a closer look at what is going on.
The _ZN22simple_diagnostic_path9add_eventEmP9tree_nodeiPKcz symbol (and the
others too) is normally emitted in simple-diagnostic-path.o; this isn't some
fancy C++ optimization of classes with final methods or an LTO optimization.

The problem is that simple-diagnostic-path.o is, like most objects, added to
libbackend.a, and we then link libbackend.a without -Wl,--whole-archive ...
-Wl,--no-whole-archive around it (and can't easily do so, as not all system
compilers and linkers support that).
With --enable-checking=yes simple-diagnostic-path.o is pulled in, because
selftest-run-tests.o calls simple_diagnostic_path_cc_tests and so
simple-diagnostic-path.o is linked in.
With --enable-checking=release self-tests aren't done and nothing links in
simple-diagnostic-path.o, because nothing in the compiler proper needs
anything from it, only the plugin tests.
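
The archive behaviour described above can be modelled with a small sketch. It is a toy model of how a linker picks members out of a static archive, not the algorithm of any particular linker: a member is pulled in only if it defines a symbol that is still undefined. The object and symbol names `main.o`, `main` and `selftest::run_tests` are hypothetical placeholders; `simple_diagnostic_path_cc_tests` and the two `.o` names are taken from this thread.

```cpp
#include <set>
#include <string>
#include <vector>

// One object file: the symbols it defines and the symbols it references.
struct ObjectFile {
  std::string name;
  std::set<std::string> defines;
  std::set<std::string> references;
};

// Toy model of static-archive resolution: start from the objects named on
// the link line and pull in an archive member only if it defines a symbol
// that is still undefined.  Returns the names of the members pulled in.
std::set<std::string>
link_archive_members(const std::vector<ObjectFile> &roots,
                     const std::vector<ObjectFile> &archive)
{
  std::set<std::string> defined, undefined, pulled;
  auto add = [&](const ObjectFile &o) {
    for (const auto &s : o.defines) { defined.insert(s); undefined.erase(s); }
    for (const auto &s : o.references)
      if (!defined.count(s)) undefined.insert(s);
  };
  for (const auto &o : roots) add(o);

  bool changed = true;
  while (changed) {            // iterate until no further member is needed
    changed = false;
    for (const auto &m : archive) {
      if (pulled.count(m.name)) continue;
      bool needed = false;
      for (const auto &s : m.defines)
        if (undefined.count(s)) { needed = true; break; }
      if (needed) { add(m); pulled.insert(m.name); changed = true; }
    }
  }
  return pulled;
}

// Hypothetical link of a compiler binary; with checking enabled the root
// object references the selftest entry point, otherwise it does not.
std::set<std::string> pulled_members(bool checking_enabled)
{
  std::vector<ObjectFile> archive = {
    {"selftest-run-tests.o", {"selftest::run_tests"},
     {"simple_diagnostic_path_cc_tests"}},
    {"simple-diagnostic-path.o",
     {"simple_diagnostic_path_cc_tests", "simple_diagnostic_path::add_event"},
     {}},
  };
  ObjectFile main_obj{"main.o", {"main"}, {}};
  if (checking_enabled)
    main_obj.references.insert("selftest::run_tests");
  return link_archive_members({main_obj}, archive);
}
```

In this model `pulled_members(true)` brings in both archive members through the selftest reference chain, while `pulled_members(false)` brings in neither, so a plugin that later looks for `simple_diagnostic_path::add_event` in the host binary would not find it.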

Using -Wl,-M when linking cc1, I see that in a --enable-checking=release
build
analyzer/analyzer-selftests.o
digraph.o
dwarf2codeview.o
fibonacci_heap.o
function-tests.o
hash-map-tests.o
hash-set-tests.o
hw-doloop.o
insn-peep.o
lazy-diagnostic-path.o
options-urls.o
ordered-hash-map-tests.o
pair-fusion.o
print-rtl-function.o
resource.o
rtl-tests.o
selftest-rtl.o
selftest-run-tests.o
simple-diagnostic-path.o
splay-tree-utils.o
typed-splay-tree.o
vmsdbgout.o
aren't linked into cc1 (the *test* ones for the obvious reason that selftests
aren't run, pair-fusion.o because it is aarch64 specific, hw-doloop.o because
x86 doesn't have doloop opts, vmsdbgout.o because this isn't VMS).

So, the question is which of digraph.o, fibonacci_heap.o,
hw-doloop.o, insn-peep.o, lazy-diagnostic-path.o, options-urls.o,
pair-fusion.o, print-rtl-function.o, resource.o, simple-diagnostic-path.o,
splay-tree-utils.o, typed-splay-tree.o, if any, are supposed to be part of
the plugin API, and how we arrange for those to be linked in when
plugins are enabled.

Jakub



Re: [PATCH] ipa-vr: Handle non-conversion unary ops separately from conversions (PR 118756)

2025-02-26 Thread Martin Jambor
Hello,

On Tue, Feb 25 2025, Jakub Jelinek wrote:
> On Tue, Feb 25, 2025 at 04:48:37PM +0100, Martin Jambor wrote:
>> --- /dev/null
>> +++ b/gcc/testsuite/g++.dg/lto/pr118785_0.C
>> @@ -0,0 +1,14 @@
>> +// { dg-lto-do link }
>> +// { dg-require-effective-target fpic }x
>
> Not a review, just a nit, what is the x doing above?

that is of course a typo, thanks for spotting it, I'll remove it.

Martin

>
>> +// { dg-lto-options { "-O3 -flto -fPIC" } }
>> +
>> +void WriteLiteral( unsigned long data, unsigned long bits) {}
>> +void WriteQIndexDelta( short qDelta)
>> +{
>> +  WriteLiteral(__builtin_abs(qDelta), 4);
>> +}
>> +__attribute((used))
>> +void ff(signed char *qIndexDeltaLumaDC) {
>> +  WriteQIndexDelta(*qIndexDeltaLumaDC);
>> +}
>> +int main(){}
>
>   Jakub


Re: [PATCH] alias: Perform offset arithmetics in poly_offset_int rather than poly_int64 [PR118819]

2025-02-26 Thread Richard Biener
On Wed, 26 Feb 2025, Jakub Jelinek wrote:

> On Wed, Feb 26, 2025 at 10:58:26AM +0100, Richard Biener wrote:
> > > This PR is about ubsan error on the c - cx1 + cy1 evaluation in the first
> > > hunk.
> > > 
> > > The following patch hopefully fixes that by doing the 
> > > additions/subtractions
> > > in poly_uint64 rather than poly_int64.  Or shall we instead perform it in
> > > offset_int and watch for overflows and punt somehow for those?
> > 
> > I think when we have the offset computation overflow ignoring such
> > overflow will make the memrefs_conflict_p give possibly wrong
> > answers.  In the PR you say cselib now has those
> > -9223372036854775807ish offsets from sp, why does it do alias queries
> > with those clearly invalid offsets?
> 
> It wants to have a MEM which overlaps anything below the stack.
> So, uses for stack grows down and non-biased stack sp - PTRDIFF_MAX with
> PTRDIFF_MAX MEM_SIZE as an approximation to that.

I see.  Wouldn't setting MEM_OFFSET_KNOWN_P and MEM_SIZE_KNOWN_P
to false work as well?

> > So yes, I think we need to punt on overflow.  Using poly_offset_int
> > should work but it comes at a cost.  Does poly_wide_int have the same
> > wide-int like overflow overloads?
> 
> poly_offset_int will be definitely cheaper than other poly_wide_int, it
> would need to be at least 66 bits and so 128 bit is certainly cheaper then.
> 
> So like this if it passes full bootstrap/regtest?

Yes, that looks good to me.

Thanks,
Richard.
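
The approach agreed on here (do the offset arithmetic in a wider type, then punt if the result is not representable) can be illustrated outside GCC. This is a minimal sketch, assuming `__int128` (a GCC/Clang extension) as a stand-in for the wider type; `adjust_offset` is a hypothetical name, and the code does not use GCC's actual `poly_offset_int` API. The `bool` return with an out-parameter loosely mirrors the `co.to_shwi (&c)` checks in the patch quoted below.

```cpp
#include <cstdint>

// Compute c - cx1 + cy1 in a 128-bit type and store the result in *out
// only if it fits in int64_t.  Returns false when the final value is not
// representable, in which case the caller should give a conservative
// answer (the patch returns -1 from memrefs_conflict_p in that case).
bool adjust_offset(int64_t c, int64_t cx1, int64_t cy1, int64_t *out)
{
  __int128 wide = static_cast<__int128>(c);
  wide -= cx1;   // cannot overflow: operands are 64-bit, type is 128-bit
  wide += cy1;
  if (wide < INT64_MIN || wide > INT64_MAX)
    return false;
  *out = static_cast<int64_t>(wide);
  return true;
}
```

Only the final value is range-checked, so an intermediate result that leaves the 64-bit range but comes back into it is still accepted; with plain 64-bit arithmetic that intermediate step would already be undefined behaviour.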

> 2025-02-26  Jakub Jelinek  
> 
>   PR middle-end/118819
>   * alias.cc (memrefs_conflict_p): Perform arithmetics on c, xsize and
>   ysize in poly_offset_int and return -1 if it is not representable in
>   poly_int64.
> 
> --- gcc/alias.cc.jj   2025-01-02 11:23:24.0 +0100
> +++ gcc/alias.cc  2025-02-26 12:31:52.860341105 +0100
> @@ -2535,19 +2535,39 @@ memrefs_conflict_p (poly_int64 xsize, rt
>   return memrefs_conflict_p (xsize, x1, ysize, y1, c);
> if (poly_int_rtx_p (x1, &cx1))
>   {
> +   poly_offset_int co = c;
> +   co -= cx1;
> if (poly_int_rtx_p (y1, &cy1))
> - return memrefs_conflict_p (xsize, x0, ysize, y0,
> -c - cx1 + cy1);
> + {
> +   co += cy1;
> +   if (!co.to_shwi (&c))
> + return -1;
> +   return memrefs_conflict_p (xsize, x0, ysize, y0, c);
> + }
> +   else if (!co.to_shwi (&c))
> + return -1;
> else
> - return memrefs_conflict_p (xsize, x0, ysize, y, c - cx1);
> + return memrefs_conflict_p (xsize, x0, ysize, y, c);
>   }
> else if (poly_int_rtx_p (y1, &cy1))
> - return memrefs_conflict_p (xsize, x, ysize, y0, c + cy1);
> + {
> +   poly_offset_int co = c;
> +   co += cy1;
> +   if (!co.to_shwi (&c))
> + return -1;
> +   return memrefs_conflict_p (xsize, x, ysize, y0, c);
> + }
>  
> return -1;
>   }
>else if (poly_int_rtx_p (x1, &cx1))
> - return memrefs_conflict_p (xsize, x0, ysize, y, c - cx1);
> + {
> +   poly_offset_int co = c;
> +   co -= cx1;
> +   if (!co.to_shwi (&c))
> + return -1;
> +   return memrefs_conflict_p (xsize, x0, ysize, y, c);
> + }
>  }
>else if (GET_CODE (y) == PLUS)
>  {
> @@ -2563,7 +2583,13 @@ memrefs_conflict_p (poly_int64 xsize, rt
>  
>poly_int64 cy1;
>if (poly_int_rtx_p (y1, &cy1))
> - return memrefs_conflict_p (xsize, x, ysize, y0, c + cy1);
> + {
> +   poly_offset_int co = c;
> +   co += cy1;
> +   if (!co.to_shwi (&c))
> + return -1;
> +   return memrefs_conflict_p (xsize, x, ysize, y0, c);
> + }
>else
>   return -1;
>  }
> @@ -2616,8 +2642,16 @@ memrefs_conflict_p (poly_int64 xsize, rt
> if (maybe_gt (xsize, 0))
>   xsize = -xsize;
> if (maybe_ne (xsize, 0))
> - xsize += sc + 1;
> -   c -= sc + 1;
> + {
> +   poly_offset_int xsizeo = xsize;
> +   xsizeo += sc + 1;
> +   if (!xsizeo.to_shwi (&xsize))
> + return -1;
> + }
> +   poly_offset_int co = c;
> +   co -= sc + 1;
> +   if (!co.to_shwi (&c))
> + return -1;
> return memrefs_conflict_p (xsize, canon_rtx (XEXP (x, 0)),
>ysize, y, c);
>   }
> @@ -2631,8 +2665,16 @@ memrefs_conflict_p (poly_int64 xsize, rt
> if (maybe_gt (ysize, 0))
>   ysize = -ysize;
> if (maybe_ne (ysize, 0))
> - ysize += sc + 1;
> -   c += sc + 1;
> + {
> +   poly_offset_int ysizeo = ysize;
> +   ysizeo += sc + 1;
> +   if (!ysizeo.to_shwi (&ysize))
> + return -1;
> + }
> +   poly_offset_int co = c;
> +   co += sc + 1;
> +   if (!co.to_shwi (&c))
> +  

Re: [PATCH] simple-diagnostic-path: Inline two trivial methods [PR116143]

2025-02-26 Thread David Malcolm
On Wed, 2025-02-26 at 09:44 +0100, Jakub Jelinek wrote:
> Hi!
> 
> Various plugin tests fail with --enable-checking=release, because the
> num_events and num_threads methods of simple_diagnostic_path are only
> used
> inside of #if CHECKING_P code inside of GCC proper and then tested
> inside of
> some plugin tests.  So, with --enable-checking=yes they are compiled
> into
> cc1/cc1plus etc. binaries and plugins can call those, but with
> --enable-checking=release they are optimized away (at least for LTO
> builds).
> 
> As they are trivial, the following patch just defines them inline, so
> that
> the plugin tests get their definitions directly and don't have to
> rely
> on cc1/cc1plus etc. exporting those.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

BTW, Qing Zhao's patch kit
  "[PATCH v4 0/3][RFC]Provide more contexts for -Warray-bounds and
  -Wstringop-* warning messages"
  https://gcc.gnu.org/pipermail/gcc-patches/2025-January/673474.html

adds a usage of simple_diagnostic_path to OBJS via a new
gcc/move-history-rich-location.o in this patch:
  https://gcc.gnu.org/pipermail/gcc-patches/2024-November/667615.html

Dave

> 
> 2025-02-26  Jakub Jelinek  
> 
>   PR testsuite/116143
>   * simple-diagnostic-path.h
> (simple_diagnostic_path::num_events): Define
>   inline.
>   (simple_diagnostic_path::num_threads): Likewise.
>   * simple-diagnostic-path.cc
> (simple_diagnostic_path::num_events):
>   Remove out of line definition.
>   (simple_diagnostic_path::num_threads): Likewise.
> 
> --- gcc/simple-diagnostic-path.h.jj   2025-01-02
> 11:23:37.876218670 +0100
> +++ gcc/simple-diagnostic-path.h  2025-02-05
> 15:29:32.882855368 +0100
> @@ -100,9 +100,9 @@ class simple_diagnostic_path : public di
>   public:
>    simple_diagnostic_path (pretty_printer *event_pp);
>  
> -  unsigned num_events () const final override;
> +  unsigned num_events () const final override { return
> m_events.length (); }
>    const diagnostic_event & get_event (int idx) const final override;
> -  unsigned num_threads () const final override;
> +  unsigned num_threads () const final override { return
> m_threads.length (); }
>    const diagnostic_thread &
>    get_thread (diagnostic_thread_id_t) const final override;
>    bool
> --- gcc/simple-diagnostic-path.cc.jj  2025-01-02
> 11:23:19.409476476 +0100
> +++ gcc/simple-diagnostic-path.cc 2025-02-05
> 15:29:59.185492553 +0100
> @@ -41,15 +41,6 @@ simple_diagnostic_path::simple_diagnosti
>    add_thread ("main");
>  }
>  
> -/* Implementation of diagnostic_path::num_events vfunc for
> -   simple_diagnostic_path: simply get the number of events in the
> vec.  */
> -
> -unsigned
> -simple_diagnostic_path::num_events () const
> -{
> -  return m_events.length ();
> -}
> -
>  /* Implementation of diagnostic_path::get_event vfunc for
>     simple_diagnostic_path: simply return the event in the vec.  */
>  
> @@ -59,12 +50,6 @@ simple_diagnostic_path::get_event (int i
>    return *m_events[idx];
>  }
>  
> -unsigned
> -simple_diagnostic_path::num_threads () const
> -{
> -  return m_threads.length ();
> -}
> -
>  const diagnostic_thread &
>  simple_diagnostic_path::get_thread (diagnostic_thread_id_t idx)
> const
>  {
> 
>   Jakub
> 
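
The idea behind the quoted patch, defining trivial accessors in the class body so that every includer gets its own definition, can be sketched independently of GCC. `event_path` and its members are hypothetical stand-ins, not the real `simple_diagnostic_path` class, and the sketch leaves out the virtual/final aspects of the real methods.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a class declared in an installed header.
class event_path
{
public:
  // Kept inline here only so the sketch is self-contained; a larger
  // method would normally live in a .cc file and require the host
  // binary to provide the symbol.
  void append (const std::string &desc) { m_events.push_back (desc); }

  // Defined in the class body, hence implicitly inline: each translation
  // unit that includes the header (for example a plugin) can emit its own
  // copy and does not depend on the host executable exporting it.
  unsigned num_events () const
  {
    return static_cast<unsigned> (m_events.size ());
  }

private:
  std::vector<std::string> m_events;
};
```

A plugin-like caller would simply include the header and call `num_events ()`; no symbol lookup in the host is needed for that call.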



Re: [FYI, PATCH v3] [testsuite] add x86 effective target

2025-02-26 Thread Florian Weimer
* Alexandre Oliva:

> diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
> index 28338324f0724..d44c2e8cbe6a1 100644
> --- a/gcc/doc/sourcebuild.texi
> +++ b/gcc/doc/sourcebuild.texi
> @@ -2798,6 +2798,9 @@ Target supports the execution of @code{user_msr} 
> instructions.
>  @item vect_cmdline_needed
>  Target requires a command line argument to enable a SIMD instruction set.
>  
> +@item x86
> +Target is ia32 or x86_64.

Does it match x86-64 x32?  Maybe this could be clarified?


Re: [PATCH] simple-diagnostic-path, v2: Inline two trivial methods [PR116143]

2025-02-26 Thread Richard Biener
On Wed, Feb 26, 2025 at 1:13 PM Jakub Jelinek  wrote:
>
> On Wed, Feb 26, 2025 at 12:22:10PM +0100, Richard Biener wrote:
> > On Wed, Feb 26, 2025 at 11:38 AM Jakub Jelinek  wrote:
> > >
> > > On Wed, Feb 26, 2025 at 10:45:37AM +0100, Richard Biener wrote:
> > > > OK
> > >
> > > Unfortunately I've only bootstrapped/regtested it with normal checking.
> > > Testing it with --enable-checking=release now shows that this patch just
> > > moved the FAILs to a different symbol.  And note that isn't even a LTO
> > > build.
> > >
> > > The following patch which IMHO still makes sense, those methods are also
> > > trivial, moves it even further.  But the next problem is
> > > _ZN22simple_diagnostic_path9add_eventEmP9tree_nodeiPKcz
> > > and that method is IMHO too large for the header file to be defined 
> > > inline,
> > > and doesn't even have final override like the others, isn't virtual in the
> > > abstract class.
> > > So, I have really no idea why it isn't compiled in.
> >
> > Hmm, so why isn't it part of libgccjit?  I suppose C++ does not really 
> > support
> > exposing a class but not exporting the classes ABI?
>
> Actually, I had a closer look on what is going on.
> The _ZN22simple_diagnostic_path9add_eventEmP9tree_nodeiPKcz symbol (and the
> others too) are normally emitted in simple-diagnostic-path.o, it isn't some
> fancy C++ optimization of classes with final method or LTO optimization.
>
> The problem is that simple-diagnostic-path.o is like most objects added into
> libbackend.a and we then link libbackend.a without -Wl,--whole-archive ...
> -Wl,--no-whole-archive around it (and can't easily, not all system compilers
> and linkers will support that).
> With --enable-checking=yes simple-diagnostic-path.o is pulled in, because
> selftest-run-tests.o calls simple_diagnostic_path_cc_tests and so
> simple-diagnostic-path.o is linked in.
> With --enable-checking=release self-tests aren't done and nothing links in
> simple-diagnostic-path.o, because nothing in the compiler proper needs
> anything from it, only the plugin tests.
>
> Using -Wl,-M on cc1 linking, I see that in --enable-checking=release
> build
> analyzer/analyzer-selftests.o
> digraph.o
> dwarf2codeview.o
> fibonacci_heap.o
> function-tests.o
> hash-map-tests.o
> hash-set-tests.o
> hw-doloop.o
> insn-peep.o
> lazy-diagnostic-path.o
> options-urls.o
> ordered-hash-map-tests.o
> pair-fusion.o
> print-rtl-function.o
> resource.o
> rtl-tests.o
> selftest-rtl.o
> selftest-run-tests.o
> simple-diagnostic-path.o
> splay-tree-utils.o
> typed-splay-tree.o
> vmsdbgout.o
> aren't linked into cc1 (the *test* for obvious reasons of not doing
> selftests, pair-fusion.o because it is aarch64 specific, hw-doloop.o because
> x86 doesn't have doloop opts, vmsdbgout.o because not on VMS).
>
> So, the question is if and what from digraph.o, fibonacci_heap.o,
> hw-doloop.o, insn-peep.o, lazy-diagnostic-path.o, options-urls.o,
> pair-fusion.o, print-rtl-function.o, resource.o, simple-diagnostic-path.o,
> splay-tree-utils.o, typed-splay-tree.o are supposed to be part of the
> plugin API if anything and how we arrange for those to be linked in when
> plugins are enabled.

I think "everything" is part of the plugin API (at least of what is declared in
headers we install).  I suppose we'd need to split up libbackend into two
parts so we can link it with whole-archive but avoid, say, linking in
hash-map-tests.o or other TUs we obviously do not need?

Alternatively libgccjit could export the symbols in its map file, that should
pull in the required archive members?

Richard.

>
> Jakub
>


Re: [PATCH v2] RISC-V: Fix bug for expand_const_vector interleave [PR118931]

2025-02-26 Thread Robin Dapp

> If you mean the last branch of interleave, I think it is safe because it
> leverage the
> merge to generate the result, instead of IOR. Only the IOR for final result
> have
> this issue.


Yep, I meant checking overflow before the initial if

  if (known_ge (step1, 0) && known_ge (step2, 0)
  && int_mode_for_size (new_smode_bitsize, 0).exists (&new_smode)
  && get_vector_mode (new_smode, new_nunits).exists (&new_mode))

and basically adding a new clause here.  That would make us fall back to the 
merge scheme in case of overflow instead of making the other scheme more 
complicated.


I haven't checked this but could imagine the merge sequence is not worse or 
even preferable.
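
To make the overflow concern concrete, here is a small scalar sketch. The exact failure mode in PR118931 is not spelled out in this thread, so the sketch rests on an assumption: the IOR-based scheme builds each pair of narrow elements as one wider lane, and the low series is computed in the wider type without wrapping at the narrow width. All function names are hypothetical and none of this is the RISC-V backend code.

```cpp
#include <cstdint>
#include <vector>

// Reference result: interleave two 8-bit series element by element,
// each series wrapping at 8 bits.
std::vector<uint8_t> interleave_ref(uint8_t base1, uint8_t step1,
                                    uint8_t base2, uint8_t step2, int n)
{
  std::vector<uint8_t> out;
  for (int i = 0; i < n; ++i) {
    out.push_back(static_cast<uint8_t>(base1 + i * step1));
    out.push_back(static_cast<uint8_t>(base2 + i * step2));
  }
  return out;
}

// Packed scheme (assumed): compute both series in 16 bits and combine
// them as low | (high << 8).  If the low series exceeds 8 bits, its
// carry bits land in the high byte and corrupt the second series.
std::vector<uint8_t> interleave_packed(uint8_t base1, uint8_t step1,
                                       uint8_t base2, uint8_t step2, int n)
{
  std::vector<uint8_t> out;
  for (int i = 0; i < n; ++i) {
    uint16_t lo = static_cast<uint16_t>(base1 + i * step1);
    uint16_t hi = static_cast<uint16_t>(base2 + i * step2);
    uint16_t lane = static_cast<uint16_t>(lo | (hi << 8));
    out.push_back(static_cast<uint8_t>(lane & 0xff));
    out.push_back(static_cast<uint8_t>(lane >> 8));
  }
  return out;
}

// Guard in the spirit of the suggestion: only use the packed scheme when
// the low series stays within 8 bits for all n elements (steps >= 0).
bool low_series_fits(uint8_t base1, uint8_t step1, int n)
{
  return n <= 0 || base1 + (n - 1) * step1 <= 0xff;
}
```

With `base1 = 250`, `step1 = 4` the two functions agree for two elements but diverge once the low series passes 255, which is the situation where falling back to a different scheme (the merge scheme, in the discussion above) avoids the problem.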


--
Regards
Robin



RE: [PATCH v2] RISC-V: Fix bug for expand_const_vector interleave [PR118931]

2025-02-26 Thread Li, Pan2
> and basically adding a new clause here.  That would make us fall back to the 
> merge scheme in case of overflow instead of making the other scheme more 
> complicated.

Oh, I see, that makes sense to me. Given that the current implementation of
expand_const_vector is already complicated, let me update this in v4.

Pan

-----Original Message-----
From: Robin Dapp  
Sent: Thursday, February 27, 2025 1:37 PM
To: Li, Pan2 ; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; jeffreya...@gmail.com; Robin 
Dapp 
Subject: Re: [PATCH v2] RISC-V: Fix bug for expand_const_vector interleave 
[PR118931]

> If you mean the last branch of interleave, I think it is safe because it 
> leverage the
> merge to generate the result, instead of IOR. Only the IOR for final result 
> have
> this issue.

Yep, I meant checking overflow before the initial if

  if (known_ge (step1, 0) && known_ge (step2, 0)
  && int_mode_for_size (new_smode_bitsize, 0).exists (&new_smode)
  && get_vector_mode (new_smode, new_nunits).exists (&new_mode))

and basically adding a new clause here.  That would make us fall back to the 
merge scheme in case of overflow instead of making the other scheme more 
complicated.

I haven't checked this but could imagine the merge sequence is not worse or 
even preferable.

-- 
Regards
 Robin



[committed] gimple-range-phi: Fix comment typo

2025-02-26 Thread Jakub Jelinek
Hi!

While reading this file I noticed a typo in a comment, which
this patch fixes.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk
as obvious.

2025-02-27  Jakub Jelinek  

* gimple-range-phi.cc (phi_analyzer::process_phi): Fix comment typo,
dpoesn;t -> doesn't.

--- gcc/gimple-range-phi.cc.jj  2025-01-02 11:23:32.784289758 +0100
+++ gcc/gimple-range-phi.cc 2025-02-26 17:33:10.539432633 +0100
@@ -483,7 +483,7 @@ phi_analyzer::process_phi (gphi *phi)
}
}
 }
-  // If this dpoesn;t form a group, all members are instead simple phis.
+  // If this doesn't form a group, all members are instead simple phis.
   if (!g)
 {
   bitmap_ior_into (m_simple, m_current);

Jakub