Re: [PATCH 3/3] OpenMP: Handle more cases in user/condition selector

2025-05-29 Thread Tobias Burnus

Sandra Loosemore wrote:

Like the attached V2 patch?


LGTM.

However, I think in metadirective-condition-template.C the 
"scan-tree-dump" should now be changed to "scan-tree-dump-times", given 
that there are several tests.


Thanks,

Tobias



[PATCH] C: Flex array in union followed by a structure field is not reported [PR120354]

2025-05-29 Thread Qing Zhao
There is only one last_field for a structure type, but there might
be multiple last_fields for a union type; therefore we should OR
the results of TYPE_INCLUDES_FLEXARRAY for the multiple last_fields of
a union type.
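
A minimal sketch of why the operator matters, not the GCC code itself: the hypothetical `field_model` struct and the two helper functions below model the per-field loop with plain booleans, assuming (as the message above describes) that every member of a union is treated as a last field. With plain assignment, a later member without a flexible array overwrites the flag set by an earlier member; with `|=` the flag stays set.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one field: whether it counts as a last field,
   and whether its type is (or ends in) a flexible array member.  */
struct field_model {
  bool is_last_field;
  bool has_flexarray;
};

/* Behaviour before the patch: each field overwrites the flag.  */
static bool includes_flexarray_assign(const struct field_model *f, size_t n)
{
  bool flag = false;
  for (size_t i = 0; i < n; i++)
    flag = f[i].is_last_field && f[i].has_flexarray;
  return flag;
}

/* Behaviour after the patch: the per-field results are ORed together.  */
static bool includes_flexarray_or(const struct field_model *f, size_t n)
{
  bool flag = false;
  for (size_t i = 0; i < n; i++)
    flag |= f[i].is_last_field && f[i].has_flexarray;
  return flag;
}

/* Models `union X { int x[]; struct P y; }` from the new test.  */
static const struct field_model union_x[] = {
  { true, true },   /* int x[]    */
  { true, false },  /* struct P y */
};
```

In this model, `includes_flexarray_assign (union_x, 2)` loses the information from `int x[]`, while `includes_flexarray_or (union_x, 2)` keeps it, which is what lets the warning fire on `struct T` in the test.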

The patch has been bootstrapped and regression tested on both x86 and aarch64.
Okay for trunk and also GCC14?

thanks.

Qing

PR c/120354

gcc/c/ChangeLog:

* c-decl.cc (finish_struct): Or the results for TYPE_INCLUDES_FLEXARRAY.

gcc/testsuite/ChangeLog:

* gcc.dg/pr120354.c: New test.
---
 gcc/c/c-decl.cc |  9 ++---
 gcc/testsuite/gcc.dg/pr120354.c | 33 +
 2 files changed, 39 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr120354.c

diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index 4733287eaf8..2b72b782fc5 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -9647,15 +9647,18 @@ finish_struct (location_t loc, tree t, tree fieldlist, 
tree attributes,
   DECL_NOT_FLEXARRAY (x) = !is_flexible_array_member_p (is_last_field, x);
 
   /* Set TYPE_INCLUDES_FLEXARRAY for the context of x, t.
-when x is an array and is the last field.  */
+when x is an array and is the last field.
+There is only one last_field for a structure type, but there might
+be multiple last_fields for a union type, therefore we should ORed
+the result for multiple last_fields.  */
   if (TREE_CODE (TREE_TYPE (x)) == ARRAY_TYPE)
TYPE_INCLUDES_FLEXARRAY (t)
- = is_last_field && c_flexible_array_member_type_p (TREE_TYPE (x));
+ |= is_last_field && c_flexible_array_member_type_p (TREE_TYPE (x));
   /* Recursively set TYPE_INCLUDES_FLEXARRAY for the context of x, t
 when x is an union or record and is the last field.  */
   else if (RECORD_OR_UNION_TYPE_P (TREE_TYPE (x)))
TYPE_INCLUDES_FLEXARRAY (t)
- = is_last_field && TYPE_INCLUDES_FLEXARRAY (TREE_TYPE (x));
+ |= is_last_field && TYPE_INCLUDES_FLEXARRAY (TREE_TYPE (x));
 
   if (warn_flex_array_member_not_at_end
  && !is_last_field
diff --git a/gcc/testsuite/gcc.dg/pr120354.c b/gcc/testsuite/gcc.dg/pr120354.c
new file mode 100644
index 000..6749737a173
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr120354.c
@@ -0,0 +1,33 @@
+/* PR120354: Test for -Wflex-array-member-not-at-end on union with 
+   flexible array members.  */ 
+/* { dg-do compile } */
+/* { dg-options "-Wflex-array-member-not-at-end" } */
+
+struct P {};
+union L {};
+
+union X {
+int x[];
+struct P y;
+};
+
+struct T {
+union X x; /* { dg-warning "structure containing a flexible array member 
is not at the end of another structure" } */
+int plug;
+};
+
+struct Q {
+int len;
+int data[];
+};
+
+union Y {
+struct Q q;
+union L y;
+};
+
+struct S {
+union Y y;  /* { dg-warning "structure containing a flexible array member 
is not at the end of another structure" } */
+int plug;
+};
+
-- 
2.43.5



[PATCH v3] c++: Unwrap type traits defined in terms of builtins within diagnostics [PR117294]

2025-05-29 Thread Nathaniel Shead
On Wed, May 28, 2025 at 02:14:06PM -0400, Patrick Palka wrote:
> On Tue, 27 May 2025, Nathaniel Shead wrote:
> 
> > On Wed, Nov 27, 2024 at 11:45:40AM -0500, Patrick Palka wrote:
> > > On Fri, 8 Nov 2024, Nathaniel Shead wrote:
> > > 
> > > > Does this approach seem reasonable?  I'm pretty sure that the way I've
> > > > handled the templating here is unideal but I'm not sure what a neat way
> > > > to do what I'm trying to do here would be; any comments are welcome.
> > > 
> > > Clever approach, I like it!
> > > 
> > > > 
> > > > -- >8 --
> > > > 
> > > > Currently, concept failures of standard type traits just report
> > > > 'expression X evaluates to false'.  However, many type traits are
> > > > actually defined in terms of compiler builtins; we can do better here.
> > > > For instance, 'is_constructible_v' could go on to explain why the type
> > > > is not constructible, or 'is_invocable_v' could list potential
> > > > candidates.
> > > 
> > > That'd be a great improvement.
> > > 
> > > > 
> > > > As a first step to supporting that we need to be able to map the
> > > > standard type traits to the builtins that they use.  Rather than adding
> > > > another list that would need to be kept up-to-date whenever a builtin is
> > > > added, this patch instead tries to detect any variable template defined
> > > > directly in terms of a TRAIT_EXPR.
> > > > 
> > > > To avoid false positives, we ignore any variable templates that have any
> > > > specialisations (partial or explicit), even if we wouldn't have chosen
> > > > that specialisation anyway.  This shouldn't affect any of the standard
> > > > library type traits that I could see.
> > > 
> > > You should be able to tsubst the TEMPLATE_ID_EXPR directly and look at
> > > its TI_PARTIAL_INFO in order to determine which (if any) partial
> > > specialization was selected.  And if an explicit specialization was
> > > selected the resulting VAR_DECL will have DECL_TEMPLATE_SPECIALIZATION
> > > set.
> > > 
> > > > ...[snip]...
> > > 
> > > If we substituted the TEMPLATE_ID_EXPR as a whole we could use the
> > > DECL_TI_ARGS of that IIUC?
> > > 
> > 
> > Thanks for your comments, they were very helpful.  Here's a totally new
> > approach which I'm much happier with.  I've also removed the "disable in
> > case any specialisation exists" logic, as on further reflection I don't
> > imagine this to be the kind of issue I thought it might have been.
> > 
> > With this patch,
> > 
> >   template <typename T>
> >   constexpr bool is_default_constructible_v = __is_constructible(T);
> > 
> >   template <typename T>
> >   concept default_constructible = is_default_constructible_v<T>;
> > 
> >   static_assert(default_constructible<void>);
> > 
> > now emits the following error:
> > 
> >   test.cpp:6:15: error: static assertion failed
> >   6 | static_assert(default_constructible<void>);
> > |   ^~~
> >   test.cpp:6:15: note: constraints not satisfied
> >   test.cpp:4:9:   required by the constraints of ‘template<class T> concept 
> > default_constructible’
> >   test.cpp:4:33: note:   ‘void’ is not default constructible
> >   4 | concept default_constructible = is_default_constructible_v<T>;
> > | ^
> > 
> > There's still a lot of improvements to be made in this area, I think:
> > 
> > - I haven't yet looked into updating the specific diagnostics emitted by
> >   the traits; I'd like to try to avoid too much code duplication with
> >   the implementation in cp/semantics.cc.  (I also don't think the manual
> >   indentation at the start of the message is particularly helpful?)
> 
> For is_xible / is_convertible etc, perhaps they could use a 'complain'
> parameter that they propagate through instead of always passing tf_none,
> similar to build_invoke?  Then we can call those predicates directly
> from diagnose_trait_expr with complain=tf_error so that they elaborate
> why they failed.
> 

Done; for is_xible I ended up slightly preferring a 'bool explain'
(since it doesn't really make sense to talk about "complaining" for a
predicate?) but happy to swap over if that's more consistent.

> Agreed about the extra indentation
> 
> > 
> > - The message doesn't print the mapping '[with T = void]'; I tried a
> >   couple of things but this doesn't currently look especially
> >   straight-forward, as we don't currently associate the args with the
> >   normalised atomic constraint of the declaration.
> 
> Maybe we can still print the
> 
>  note: the expression ‘normal [with T = void]’ evaluated to ‘false’
> 
> note alongside the extended diagnostics?  Which would mean moving the
> maybe_diagnose_standard_trait call a bit lower in
> diagnose_atomic_constraint.
> 
> This would arguably make the diagnostic even noisier, but IMHO the
> parameter mapping is too important a piece of information to omit.
> 

Agreed, can always look at condensing things later but I think this is a
good improvement.

> > 
> > - Just generally I thi

Re: [PATCH 1/2] forwprop: Change test in loop of optimize_memcpy_to_memset

2025-05-29 Thread Andrew Pinski
On Tue, May 27, 2025 at 5:14 AM Richard Biener
 wrote:
>
> On Tue, May 27, 2025 at 5:02 AM Andrew Pinski  
> wrote:
> >
> > This was noticed in the review of the copy propagation for aggregates
> > patch: instead of checking for a NULL or a non-ssa name vuse,
> > we should check whether the vuse is a default name and stop
> > then.
> >
> > Bootstrapped and tested on x86_64-linux-gnu.
> >
> > gcc/ChangeLog:
> >
> > * tree-ssa-forwprop.cc (optimize_memcpy_to_memset): Change check
> > from NULL/non-ssa name to default name.
> >
> > Signed-off-by: Andrew Pinski 
> > ---
> >  gcc/tree-ssa-forwprop.cc | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/gcc/tree-ssa-forwprop.cc b/gcc/tree-ssa-forwprop.cc
> > index 4c048a9a298..e457a69ed48 100644
> > --- a/gcc/tree-ssa-forwprop.cc
> > +++ b/gcc/tree-ssa-forwprop.cc
> > @@ -1226,7 +1226,8 @@ optimize_memcpy_to_memset (gimple_stmt_iterator 
> > *gsip, tree dest, tree src, tree
> >gimple *defstmt;
> >unsigned limit = param_sccvn_max_alias_queries_per_access;
> >do {
> > -if (vuse == NULL || TREE_CODE (vuse) != SSA_NAME)
> > +/* If the vuse is the default definition, then there is no stores 
> > beforhand. */
> > +if (SSA_NAME_IS_DEFAULT_DEF (vuse))
>
> Since forwprop does update_ssa in the end I was wondering whether any
> bare non-SSA VUSE/VDEFs sneak in - for this the != SSA_NAME check
> would be useful.  On a GIMPLE stmt gimple_vuse () will return NULL
> when it's not a load or store (or with a novops call), as you are using
> gimple_store_p/gimple_assign_load_p there might be a disconnect
> between those predicates and the presence of a vuse (I hope not, but ...)
>
> The patch looks OK to me, the comments above apply to the copy propagation 
> case.

The copy prop case should be ok too since the vuse/vdef on the
statement does not change when doing the prop; only the rhs of the
statement. There is no inserting of a statement.  This is unless we
remove the statement and then unlink_stmt_vdef will prop the vuse into
the vdef of the statement which we are removing.

I did test the copy prop using just SSA_NAME_IS_DEFAULT_DEF and there
were no regressions there either.

When optimize_memcpy_to_memset was part of fold_stmt, a NULL vuse
and/or a non-SSA vuse was common due to running before ssa. This was
why there was a check for non-SSA.
I am not sure why there was a check for NULLness when it was
part of fold-all-builtins, though.

On a side note I think many passes have TODO_update_ssa on them when
they already keep the ssa up to date now. I wonder if most of that
dates from the days of VMUST_DEF/VMAY_DEF and multiple names on them
rather than one virtual name.

Thanks,
Andrew


>
> Thanks,
> Richard.
>
> >return false;
> >  defstmt = SSA_NAME_DEF_STMT (vuse);
> >  if (is_a (defstmt))
> > --
> > 2.43.0
> >


Re: [PATCH] c++tools: Don't check --enable-default-pie.

2025-05-29 Thread Richard Biener
On Thu, May 29, 2025 at 8:06 AM Kito Cheng  wrote:
>
> `--enable-default-pie` is an option to specify whether to enable
> position-independent executables by default for `target`.
>
> However, c++tools is built for `host`, so it should just follow
> `--enable-host-pie` option to determine whether to build with
> position-independent executables or not.
>
> NOTE:
>
> I checked PR 98324 and built with the same configure options
> (`--enable-default-pie` and LTO bootstrap) on x86-64 Linux to make sure
> it won't cause the same problem.

Makes sense to me, thus OK if nobody objects over the weekend.

Richard.

> c++tools/ChangeLog:
>
> * configure.ac: Don't check `--enable-default-pie`.
> * configure: Regen.
> ---
>  c++tools/configure| 11 ---
>  c++tools/configure.ac |  6 --
>  2 files changed, 17 deletions(-)
>
> diff --git a/c++tools/configure b/c++tools/configure
> index 1353479beca..6df4a2f0dfa 100755
> --- a/c++tools/configure
> +++ b/c++tools/configure
> @@ -700,7 +700,6 @@ enable_option_checking
>  enable_c___tools
>  enable_maintainer_mode
>  enable_checking
> -enable_default_pie
>  enable_host_pie
>  enable_host_bind_now
>  with_gcc_major_version_only
> @@ -1335,7 +1334,6 @@ Optional Features:
>enable expensive run-time checks. With LIST, enable
>only specific categories of checks. Categories are:
>yes,no,all,none,release.
> -  --enable-default-pieenable Position Independent Executable as default
>--enable-host-pie   build host code as PIE
>--enable-host-bind-now  link host code as BIND_NOW
>
> @@ -2946,15 +2944,6 @@ $as_echo "#define ENABLE_ASSERT_CHECKING 1" 
> >>confdefs.h
>
>  fi
>
> -# Check whether --enable-default-pie was given.
> -# Check whether --enable-default-pie was given.
> -if test "${enable_default_pie+set}" = set; then :
> -  enableval=$enable_default_pie; PICFLAG=-fPIE
> -else
> -  PICFLAG=
> -fi
> -
> -
>  # Enable --enable-host-pie
>  # Check whether --enable-host-pie was given.
>  if test "${enable_host_pie+set}" = set; then :
> diff --git a/c++tools/configure.ac b/c++tools/configure.ac
> index db34ee678e0..8c4b72a8023 100644
> --- a/c++tools/configure.ac
> +++ b/c++tools/configure.ac
> @@ -97,12 +97,6 @@ if test x$ac_assert_checking != x ; then
>  [Define if you want assertions enabled.  This is a cheap check.])
>  fi
>
> -# Check whether --enable-default-pie was given.
> -AC_ARG_ENABLE(default-pie,
> -[AS_HELP_STRING([--enable-default-pie],
> - [enable Position Independent Executable as default])],
> -[PICFLAG=-fPIE], [PICFLAG=])
> -
>  # Enable --enable-host-pie
>  AC_ARG_ENABLE(host-pie,
>  [AS_HELP_STRING([--enable-host-pie],
> --
> 2.34.1
>


[PATCH v1 3/3] RISC-V: Add test cases for avg_ceil vaadd implementation

2025-05-29 Thread pan2 . li
From: Pan Li 

Add asm and run test cases for the avg_ceil vaadd implementation.

The following test suite passed for this patch series.
* The rv64gcv full regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/avg.h: Add test helper macros.
* gcc.target/riscv/rvv/autovec/avg_data.h: Add test data for
avg_ceil.
* gcc.target/riscv/rvv/autovec/avg_ceil-1-i16-from-i32.c: New test.
* gcc.target/riscv/rvv/autovec/avg_ceil-1-i16-from-i64.c: New test.
* gcc.target/riscv/rvv/autovec/avg_ceil-1-i32-from-i64.c: New test.
* gcc.target/riscv/rvv/autovec/avg_ceil-1-i8-from-i16.c: New test.
* gcc.target/riscv/rvv/autovec/avg_ceil-1-i8-from-i32.c: New test.
* gcc.target/riscv/rvv/autovec/avg_ceil-1-i8-from-i64.c: New test.
* gcc.target/riscv/rvv/autovec/avg_ceil-run-1-i16-from-i32.c: New test.
* gcc.target/riscv/rvv/autovec/avg_ceil-run-1-i16-from-i64.c: New test.
* gcc.target/riscv/rvv/autovec/avg_ceil-run-1-i32-from-i64.c: New test.
* gcc.target/riscv/rvv/autovec/avg_ceil-run-1-i8-from-i16.c: New test.
* gcc.target/riscv/rvv/autovec/avg_ceil-run-1-i8-from-i32.c: New test.
* gcc.target/riscv/rvv/autovec/avg_ceil-run-1-i8-from-i64.c: New test.

Signed-off-by: Pan Li 
---
 .../gcc.target/riscv/rvv/autovec/avg.h|  17 ++
 .../rvv/autovec/avg_ceil-1-i16-from-i32.c |  12 ++
 .../rvv/autovec/avg_ceil-1-i16-from-i64.c |  12 ++
 .../rvv/autovec/avg_ceil-1-i32-from-i64.c |  12 ++
 .../rvv/autovec/avg_ceil-1-i8-from-i16.c  |  12 ++
 .../rvv/autovec/avg_ceil-1-i8-from-i32.c  |  12 ++
 .../rvv/autovec/avg_ceil-1-i8-from-i64.c  |  12 ++
 .../rvv/autovec/avg_ceil-run-1-i16-from-i32.c |  16 ++
 .../rvv/autovec/avg_ceil-run-1-i16-from-i64.c |  16 ++
 .../rvv/autovec/avg_ceil-run-1-i32-from-i64.c |  16 ++
 .../rvv/autovec/avg_ceil-run-1-i8-from-i16.c  |  16 ++
 .../rvv/autovec/avg_ceil-run-1-i8-from-i32.c  |  16 ++
 .../rvv/autovec/avg_ceil-run-1-i8-from-i64.c  |  16 ++
 .../gcc.target/riscv/rvv/autovec/avg_data.h   | 176 ++
 14 files changed, 361 insertions(+)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/avg_ceil-1-i16-from-i32.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/avg_ceil-1-i16-from-i64.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/avg_ceil-1-i32-from-i64.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/avg_ceil-1-i8-from-i16.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/avg_ceil-1-i8-from-i32.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/avg_ceil-1-i8-from-i64.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/avg_ceil-run-1-i16-from-i32.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/avg_ceil-run-1-i16-from-i64.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/avg_ceil-run-1-i32-from-i64.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/avg_ceil-run-1-i8-from-i16.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/avg_ceil-run-1-i8-from-i32.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/avg_ceil-run-1-i8-from-i64.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/avg.h 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/avg.h
index 746c635ae57..4aeb637bba7 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/avg.h
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/avg.h
@@ -20,4 +20,21 @@ test_##NAME##_##WT##_##NT##_0(NT * restrict a, NT * restrict 
b, \
 #define RUN_AVG_0_WRAP(NT, WT, NAME, a, b, out, n) \
   RUN_AVG_0(NT, WT, NAME, a, b, out, n)
 
+#define DEF_AVG_1(NT, WT, NAME) \
+__attribute__((noinline))   \
+void\
+test_##NAME##_##WT##_##NT##_1(NT * restrict a, NT * restrict b, \
+ NT * restrict out, int n) \
+{   \
+  for (int i = 0; i < n; i++) { \
+out[i] = (NT)(((WT)a[i] + (WT)b[i] + 1) >> 1);  \
+  } \
+}
+#define DEF_AVG_1_WRAP(NT, WT, NAME) DEF_AVG_1(NT, WT, NAME)
+
+#define RUN_AVG_1(NT, WT, NAME, a, b, out, n) \
+  test_##NAME##_##WT##_##NT##_1(a, b, out, n)
+#define RUN_AVG_1_WRAP(NT, WT, NAME, a, b, out, n) \
+  RUN_AVG_1(NT, WT, NAME, a, b, out, n)
+
 #endif
diff --git 
a/gcc/testsuite/gcc.target/riscv/rvv/autovec/avg_ceil-1-i16-from-i32.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/avg_ceil-1-i16-from-i32.c
new file mode 100644
index 000..138124c8c4a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/avg_ceil-1-i16-from-i32.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d" } */
+
+#include "avg.h"
+
+#define NT int16_t
+#defi

[PATCH v1 2/3] RISC-V: Reconcile the existing test for avg_ceil

2025-05-29 Thread pan2 . li
From: Pan Li 

Some existing avg_floor tests need to be updated due to the change to
leverage vaadd.vv directly.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls/avg-4.c: Update asm check
to vaadd.
* gcc.target/riscv/rvv/autovec/vls/avg-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/avg-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/widen/vec-avg-rv32gcv.c: Ditto.
* gcc.target/riscv/rvv/autovec/widen/vec-avg-rv64gcv.c: Ditto.

Signed-off-by: Pan Li 
---
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/avg-4.c  | 6 ++
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/avg-5.c  | 6 ++
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/avg-6.c  | 6 ++
 .../gcc.target/riscv/rvv/autovec/widen/vec-avg-rv32gcv.c| 2 +-
 .../gcc.target/riscv/rvv/autovec/widen/vec-avg-rv64gcv.c| 2 +-
 5 files changed, 8 insertions(+), 14 deletions(-)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/avg-4.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/avg-4.c
index 8d106aaeed0..986a0ff21cf 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/avg-4.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/avg-4.c
@@ -25,11 +25,9 @@ DEF_AVG_CEIL (uint8_t, uint16_t, 512)
 DEF_AVG_CEIL (uint8_t, uint16_t, 1024)
 DEF_AVG_CEIL (uint8_t, uint16_t, 2048)
 
-/* { dg-final { scan-assembler-times {vwadd\.vv} 10 } } */
-/* { dg-final { scan-assembler-times {csrwi\s*vxrm,\s*0} 10 } } */
-/* { dg-final { scan-assembler-times {vnsra\.wi} 10 } } */
+/* { dg-final { scan-assembler-times {csrwi\s*vxrm,\s*0} 20 } } */
+/* { dg-final { scan-assembler-times {vaadd\.vv} 10 } } */
 /* { dg-final { scan-assembler-times {vaaddu\.vv} 10 } } */
-/* { dg-final { scan-assembler-times {vadd\.vi} 10 } } */
 /* { dg-final { scan-assembler-not {csrr} } } */
 /* { dg-final { scan-tree-dump-not "1,1" "optimized" } } */
 /* { dg-final { scan-tree-dump-not "2,2" "optimized" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/avg-5.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/avg-5.c
index 981abd51588..c450f80291a 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/avg-5.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/avg-5.c
@@ -23,11 +23,9 @@ DEF_AVG_CEIL (uint16_t, uint32_t, 256)
 DEF_AVG_CEIL (uint16_t, uint32_t, 512)
 DEF_AVG_CEIL (uint16_t, uint32_t, 1024)
 
-/* { dg-final { scan-assembler-times {vwadd\.vv} 9 } } */
-/* { dg-final { scan-assembler-times {csrwi\s*vxrm,\s*0} 9 } } */
-/* { dg-final { scan-assembler-times {vnsra\.wi} 9 } } */
+/* { dg-final { scan-assembler-times {csrwi\s*vxrm,\s*0} 18 } } */
 /* { dg-final { scan-assembler-times {vaaddu\.vv} 9 } } */
-/* { dg-final { scan-assembler-times {vadd\.vi} 9 } } */
+/* { dg-final { scan-assembler-times {vaadd\.vv} 9 } } */
 /* { dg-final { scan-assembler-not {csrr} } } */
 /* { dg-final { scan-tree-dump-not "1,1" "optimized" } } */
 /* { dg-final { scan-tree-dump-not "2,2" "optimized" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/avg-6.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/avg-6.c
index bfe4ba3c4bd..3473e193a5c 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/avg-6.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/avg-6.c
@@ -21,11 +21,9 @@ DEF_AVG_CEIL (uint16_t, uint32_t, 128)
 DEF_AVG_CEIL (uint16_t, uint32_t, 256)
 DEF_AVG_CEIL (uint16_t, uint32_t, 512)
 
-/* { dg-final { scan-assembler-times {vwadd\.vv} 8 } } */
-/* { dg-final { scan-assembler-times {csrwi\s*vxrm,\s*0} 8 } } */
-/* { dg-final { scan-assembler-times {vnsra\.wi} 8 } } */
+/* { dg-final { scan-assembler-times {csrwi\s*vxrm,\s*0} 16 } } */
 /* { dg-final { scan-assembler-times {vaaddu\.vv} 8 } } */
-/* { dg-final { scan-assembler-times {vadd\.vi} 8 } } */
+/* { dg-final { scan-assembler-times {vaadd\.vv} 8 } } */
 /* { dg-final { scan-assembler-not {csrr} } } */
 /* { dg-final { scan-tree-dump-not "1,1" "optimized" } } */
 /* { dg-final { scan-tree-dump-not "2,2" "optimized" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/vec-avg-rv32gcv.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/vec-avg-rv32gcv.c
index b7246a38dba..a5224e78d94 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/vec-avg-rv32gcv.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/vec-avg-rv32gcv.c
@@ -5,4 +5,4 @@
 
 /* { dg-final { scan-assembler-times {csrwi\s*vxrm,\s*2} 6 } } */
 /* { dg-final { scan-assembler-times {vaaddu\.vv} 6 } } */
-/* { dg-final { scan-assembler-times {vaadd\.vv} 3 } } */
+/* { dg-final { scan-assembler-times {vaadd\.vv} 6 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/vec-avg-rv64gcv.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/vec-avg-rv64gcv.c
index 3ffe0ef39ee..32446ae3c23 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/vec-avg-rv64gcv.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/vec-avg-rv64gcv.c
@@ -5,4 +5,4 @@
 
 /* { dg-final { scan-assembler-times {csrwi\s*vxrm,

[PATCH v1 1/3] RISC-V: Leverage vaadd.vv for signed standard name avg_ceil

2025-05-29 Thread pan2 . li
From: Pan Li 

The avg_ceil has the rounding mode towards +inf, while
vaadd.vv has the rnu rounding mode, which exactly matches the semantics.
From the RVV spec, for the fixed-point vaadd.vv with rnu,

roundoff_signed(v, d) = (signed(v) >> d) + r
r = v[d - 1]

For vaadd, d = 1, then we have

roundoff_signed(v, 1) = (signed(v) >> 1) + v[0]

If v[0] is 0, nothing needs to be done as there is no rounding.
If v[0] is 1, there will be rounding, with 2 cases.

Case 1: v is positive.
  roundoff_signed(v, 1) = (signed(v) >> 1) + 1, aka round towards +inf
  roundoff_signed(2 + 3, 1) = (5 >> 1) + 1 = 3

Case 2: v is negative.
  roundoff_signed(v, 1) = (signed(v) >> 1) + 1, aka round towards +inf
  roundoff_signed(-9 + 2, 1) = (-7 >> 1) + 1 = -4 + 1 = -3

Thus, we can leverage the vaadd with rnu directly for avg_ceil.
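
The derivation above can be checked with a small sketch. It is not part of the patch; the helper names are hypothetical, and it assumes that `>>` on a negative signed value is an arithmetic shift (GCC guarantees this, ISO C leaves it implementation-defined). It compares the rnu form `(v >> 1) + v[0]` against the widened `(a + b + 1) >> 1` form used by the `DEF_AVG_1` test macro in this series.

```c
#include <stdint.h>

/* Models roundoff_signed(v, 1) with rnu: (signed(v) >> 1) + v[0].
   Assumes an arithmetic right shift for negative v.  */
static int32_t roundoff_signed_rnu_1(int32_t v)
{
  return (v >> 1) + (v & 1);
}

/* Reference avg_ceil computed in a wider type: (a + b + 1) >> 1.  */
static int8_t avg_ceil_ref(int8_t a, int8_t b)
{
  return (int8_t)(((int16_t)a + (int16_t)b + 1) >> 1);
}

/* Hypothetical helper: returns 1 if both forms agree for every pair of
   int8_t inputs, 0 otherwise.  */
static int avg_ceil_forms_agree(void)
{
  for (int a = INT8_MIN; a <= INT8_MAX; a++)
    for (int b = INT8_MIN; b <= INT8_MAX; b++)
      if (roundoff_signed_rnu_1(a + b) != avg_ceil_ref((int8_t)a, (int8_t)b))
        return 0;
  return 1;
}
```

The two worked cases from the message correspond to `roundoff_signed_rnu_1 (2 + 3)` and `roundoff_signed_rnu_1 (-9 + 2)`.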

The following test suite passed for this patch series.
* The rv64gcv full regression test.

gcc/ChangeLog:

* config/riscv/autovec.md (avg3_ceil): Add insn
expand to leverage vaadd with rnu directly.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/autovec.md | 25 ++---
 1 file changed, 6 insertions(+), 19 deletions(-)

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index a54f552a80c..5ac7b62c2cf 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -2510,25 +2510,12 @@ (define_expand "avg3_ceil"
(match_operand: 2 "register_operand")))
   (const_int 1)]
   "TARGET_VECTOR"
-{
-  /* First emit a widening addition.  */
-  rtx tmp1 = gen_reg_rtx (mode);
-  rtx ops1[] = {tmp1, operands[1], operands[2]};
-  insn_code icode = code_for_pred_dual_widen (PLUS, SIGN_EXTEND, mode);
-  riscv_vector::emit_vlmax_insn (icode, riscv_vector::BINARY_OP, ops1);
-
-  /* Then add 1.  */
-  rtx tmp2 = gen_reg_rtx (mode);
-  rtx ops2[] = {tmp2, tmp1, const1_rtx};
-  icode = code_for_pred_scalar (PLUS, mode);
-  riscv_vector::emit_vlmax_insn (icode, riscv_vector::BINARY_OP, ops2);
-
-  /* Finally, a narrowing shift.  */
-  rtx ops3[] = {operands[0], tmp2, const1_rtx};
-  icode = code_for_pred_narrow_scalar (ASHIFTRT, mode);
-  riscv_vector::emit_vlmax_insn (icode, riscv_vector::BINARY_OP, ops3);
-  DONE;
-})
+  {
+insn_code icode = code_for_pred (UNSPEC_VAADD, mode);
+riscv_vector::emit_vlmax_insn (icode, riscv_vector::BINARY_OP_VXRM_RNU, 
operands);
+DONE;
+  }
+)
 
 ;; csrwi vxrm, 2
 ;; vaaddu.vv vd, vs2, vs1
-- 
2.43.0



[PATCH v1 0/3] Refine the avg_ceil with fixed point vaadd

2025-05-29 Thread pan2 . li
From: Pan Li 

Similar to the avg_floor, the avg_ceil has the rounding mode
towards +inf, while vaadd.vv has the rnu rounding mode, which exactly
matches the semantics.  From the RVV spec, for the fixed-point vaadd.vv with rnu,

roundoff_signed(v, d) = (signed(v) >> d) + r
r = v[d - 1]

For vaadd, d = 1, then we have

roundoff_signed(v, 1) = (signed(v) >> 1) + v[0]

If v[0] is 0, nothing needs to be done as there is no rounding.
If v[0] is 1, there will be rounding, with 2 cases.

Case 1: v is positive.
  roundoff_signed(v, 1) = (signed(v) >> 1) + 1, aka round towards +inf
  roundoff_signed(2 + 3, 1) = (5 >> 1) + 1 = 3

Case 2: v is negative.
  roundoff_signed(v, 1) = (signed(v) >> 1) + 1, aka round towards +inf
  roundoff_signed(-9 + 2, 1) = (-7 >> 1) + 1 = -4 + 1 = -3

Thus, we can leverage the vaadd with rnu directly for avg_ceil.

The following test suite passed for this patch series.
* The rv64gcv full regression test.

Pan Li (3):
  RISC-V: Leverage vaadd.vv for signed standard name avg_ceil
  RISC-V: Reconcile the existing test for avg_ceil
  RISC-V: Add test cases for avg_ceil vaadd implementation

 gcc/config/riscv/autovec.md   |  25 +--
 .../gcc.target/riscv/rvv/autovec/avg.h|  17 ++
 .../rvv/autovec/avg_ceil-1-i16-from-i32.c |  12 ++
 .../rvv/autovec/avg_ceil-1-i16-from-i64.c |  12 ++
 .../rvv/autovec/avg_ceil-1-i32-from-i64.c |  12 ++
 .../rvv/autovec/avg_ceil-1-i8-from-i16.c  |  12 ++
 .../rvv/autovec/avg_ceil-1-i8-from-i32.c  |  12 ++
 .../rvv/autovec/avg_ceil-1-i8-from-i64.c  |  12 ++
 .../rvv/autovec/avg_ceil-run-1-i16-from-i32.c |  16 ++
 .../rvv/autovec/avg_ceil-run-1-i16-from-i64.c |  16 ++
 .../rvv/autovec/avg_ceil-run-1-i32-from-i64.c |  16 ++
 .../rvv/autovec/avg_ceil-run-1-i8-from-i16.c  |  16 ++
 .../rvv/autovec/avg_ceil-run-1-i8-from-i32.c  |  16 ++
 .../rvv/autovec/avg_ceil-run-1-i8-from-i64.c  |  16 ++
 .../gcc.target/riscv/rvv/autovec/avg_data.h   | 176 ++
 .../gcc.target/riscv/rvv/autovec/vls/avg-4.c  |   6 +-
 .../gcc.target/riscv/rvv/autovec/vls/avg-5.c  |   6 +-
 .../gcc.target/riscv/rvv/autovec/vls/avg-6.c  |   6 +-
 .../riscv/rvv/autovec/widen/vec-avg-rv32gcv.c |   2 +-
 .../riscv/rvv/autovec/widen/vec-avg-rv64gcv.c |   2 +-
 20 files changed, 375 insertions(+), 33 deletions(-)
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/avg_ceil-1-i16-from-i32.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/avg_ceil-1-i16-from-i64.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/avg_ceil-1-i32-from-i64.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/avg_ceil-1-i8-from-i16.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/avg_ceil-1-i8-from-i32.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/avg_ceil-1-i8-from-i64.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/avg_ceil-run-1-i16-from-i32.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/avg_ceil-run-1-i16-from-i64.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/avg_ceil-run-1-i32-from-i64.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/avg_ceil-run-1-i8-from-i16.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/avg_ceil-run-1-i8-from-i32.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/avg_ceil-run-1-i8-from-i64.c

-- 
2.43.0



Re: [PATCH] rtl-ssa: Reject non-address uses of autoinc regs [PR120347]

2025-05-29 Thread Richard Biener
On Wed, May 28, 2025 at 6:55 PM Richard Sandiford
 wrote:
>
> Richard Biener  writes:
> > On Thu, May 22, 2025 at 12:19 PM Richard Sandiford
> >  wrote:
> >>
> >> As the rtl.texi documentation of RTX_AUTOINC expressions says:
> >>
> >>   If a register used as the operand of these expressions is used in
> >>   another address in an insn, the original value of the register is
> >>   used.  Uses of the register outside of an address are not permitted
> >>   within the same insn as a use in an embedded side effect expression
> >>   because such insns behave differently on different machines and hence
> >>   must be treated as ambiguous and disallowed.
> >>
> >> late-combine was failing to follow this rule.  One option would have
> >> been to enforce it during the substitution phase, like combine does.
> >> This could either be a dedicated condition in the substitution code
> >> or, more generally, an extra condition in can_merge_accesses.
> >> (The latter would include extending is_pre_post_modify to uses.)
> >>
> >> However, since the restriction applies to patterns rather than to
> >> actions on patterns, the more robust fix seemed to be test and reject
> >> this case in (a subroutine of) rtl_ssa::recog.  We already do something
> >> similar for hard-coded register clobbers.
> >>
> >> Using vec_rtx_properties isn't the lightest-weight operation
> >> out there.  I did wonder about relying on the is_pre_post_modify
> >> flag of the definitions in the new_defs array, but that would
> >> require callers that create new autoincs to set the flag before
> >> calling recog.  Normally these flags are instead updated
> >> automatically based on the final pattern.
> >>
> >> Besides, recog itself has had to traverse the whole pattern,
> >> and it is even less light-weight than vec_rtx_properties.
> >> At least the pattern should be in cache.
> >>
> >> Tested on arm-linux-gnueabihf, aarch64-linux-gnu and
> >> x86_64-linux-gnu.  OK for trunk and backports?
> >
> > LGTM, note the 14 branch is currently frozen.
>
> Thanks.  It turns out that I looked at the wrong results for the
> arm-linux-gnueabihf testing :-(, and the Linaro CI flagged up a
> regression.  Although I think the rtl-ssa fix is still the right
> one, it showed up a mistake (of mine) in the rtl_properties walker:
> try_to_add_src would drop all flags except IN_NOTE before recursing
> into RTX_AUTOINC addresses.
>
> RTX_AUTOINCs only occur in addresses, and so for them, the flags coming
> into try_to_add_src are set by:
>
>   unsigned int base_flags = flags & rtx_obj_flags::STICKY_FLAGS;
>   ...
>   if (MEM_P (x))
> {
>   ...
>
>   unsigned int addr_flags = base_flags | rtx_obj_flags::IN_MEM_STORE;
>   if (flags & rtx_obj_flags::IS_READ)
> addr_flags |= rtx_obj_flags::IN_MEM_LOAD;
>   try_to_add_src (XEXP (x, 0), addr_flags);
>   return;
> }
>
> This means that the only flags that can be set are:
>
> - IN_NOTE (the sole member of STICKY_FLAGS)
> - IN_MEM_STORE
> - IN_MEM_LOAD
>
> Thus dropping all flags except IN_NOTE had the effect of dropping
> IN_MEM_STORE and IN_MEM_LOAD, and nothing else.  But those flags
> are the ones that mark something as being part of a mem address.
> The exclusion was therefore exactly wrong.
>
> So is the patch OK with the extra rtlanal.cc hunk below?  I was wondering
> whether it would count as obvious, but the length of the explanation above
> suggests not :)
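
To make the flag argument above concrete, here is a minimal sketch of the masking. It is not the rtlanal.cc code: the bit values are made up, the helper names (address_flags, autoinc_flags_old, autoinc_flags_new) are hypothetical, and the "fixed" variant simply assumes that the flags are passed through unchanged.

```cpp
// Hypothetical stand-ins for the rtx_obj_flags bits named in the mail;
// the numeric values are invented for this sketch.
constexpr unsigned IS_READ      = 1u << 0;
constexpr unsigned IN_NOTE      = 1u << 1;
constexpr unsigned IN_MEM_LOAD  = 1u << 2;
constexpr unsigned IN_MEM_STORE = 1u << 3;
constexpr unsigned STICKY_FLAGS = IN_NOTE;  // sole sticky flag, per the mail

// Flags handed to the address walker for a MEM, following the quoted snippet.
constexpr unsigned address_flags (unsigned flags)
{
  unsigned addr_flags = (flags & STICKY_FLAGS) | IN_MEM_STORE;
  if (flags & IS_READ)
    addr_flags |= IN_MEM_LOAD;
  return addr_flags;
}

// Old behaviour described in the mail: keep only the sticky flags before
// recursing into an autoinc address.
constexpr unsigned autoinc_flags_old (unsigned addr_flags)
{
  return addr_flags & STICKY_FLAGS;
}

// Assumed shape of the fix: keep the IN_MEM_* flags as well.
constexpr unsigned autoinc_flags_new (unsigned addr_flags)
{
  return addr_flags;
}

// True if the flags mark a reference as being part of a memory address.
constexpr bool in_address (unsigned flags)
{
  return (flags & (IN_MEM_LOAD | IN_MEM_STORE)) != 0;
}
```

With these definitions, in_address (autoinc_flags_old (address_flags (IS_READ))) is false, which is the information loss described above, while the autoinc_flags_new variant keeps it true.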

Yes, the patch is OK.  The 14 branch is unfrozen, the 13 branch is frozen now.

Richard.

> Richard
>
>
> gcc/
> PR rtl-optimization/120347
> * rtlanal.cc (rtx_properties::try_to_add_src): Don't drop the
> IN_MEM_LOAD and IN_MEM_STORE flags for autoinc registers.
> * rtl-ssa/changes.cc (recog_level2): Check whether an
> RTX_AUTOINCed register also appears outside of an address.
>
> gcc/testsuite/
> PR rtl-optimization/120347
> * gcc.dg/torture/pr120347.c: New test.
> ---
>  gcc/rtl-ssa/changes.cc  | 18 ++
>  gcc/rtlanal.cc  |  2 +-
>  gcc/testsuite/gcc.dg/torture/pr120347.c | 13 +
>  3 files changed, 32 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.dg/torture/pr120347.c
>
> diff --git a/gcc/rtl-ssa/changes.cc b/gcc/rtl-ssa/changes.cc
> index eb579ad3ad7..f7aa6a66cdf 100644
> --- a/gcc/rtl-ssa/changes.cc
> +++ b/gcc/rtl-ssa/changes.cc
> @@ -1106,6 +1106,24 @@ recog_level2 (insn_change &change, 
> add_regno_clobber_fn add_regno_clobber)
> }
> }
>
> +  // Per rtl.texi, registers that are modified using RTX_AUTOINC operations
> +  // cannot also appear outside an address.
> +  vec_rtx_properties properties;
> +  properties.add_pattern (pat);
> +  for (rtx_obj_reference def : properties.refs ())
> +if (def.is_pre_post_modify ())
> +  for (rtx_obj_reference use : properties.refs ())
> +   if (def.regno == use.regno && !use.in_address ())
> + {
> +   if 

Re: [PATCH] expmed: Prevent non-canonical subreg generation in store_bit_field [PR118873]

2025-05-29 Thread Richard Biener
On Thu, May 29, 2025 at 12:27 PM Konstantinos Eleftheriou
 wrote:
>
> Hi Richard, thanks for the response.
>
> On Mon, May 26, 2025 at 11:55 AM Richard Biener  wrote:
> >
> > On Mon, 26 May 2025, Konstantinos Eleftheriou wrote:
> >
> > > In `store_bit_field_1`, when the value to be written in the bitfield
> > > and/or the bitfield itself have vector modes, non-canonical subregs
> > > > are generated, like `(subreg:V4SI (reg:V8SI x) 0)`. If one of them is
> > > a scalar, this happens only when the scalar mode is different than the
> > > vector's inner mode.
> > >
> > > This patch tries to prevent this, using vec_set patterns when
> > > possible.
> >
> > I know almost nothing about this code, but why does the patch
> > fixup things after the fact rather than avoid generating the
> > SUBREG in the first place?
>
> That's what we are doing, we are trying to prevent the non-canonical
> subreg generation (it's not always possible). But, there are cases
> where these types of subregs are passed into `store_bit_field` by its
> caller, in which case we choose not to touch them.
>
> > ISTR it also (unfortunately) depends on the target which forms
> > are considered canonical.
>
> But, the way that we interpret the documentation, the
> canonicalizations are machine-independent. Is that not true? Or,
> specifically for the subregs that operate on vectors, is there any
> target that considers them canonical?
>
> > > I'm also not sure you got endianness right for all possible
> > values of SUBREG_BYTE.  One more reason to not generate such
> > subreg in the first place but stick to vec_select/concat.
>
> The only way that we would generate subregs are from the calls to
> `extract_bit_field` or `store_bit_field_1` and these should handle the
> endianness. Also, these subregs wouldn't operate on vectors. Do you
> mean that something could go wrong with these calls?

I wanted to remark that endianness WRT memory order (which is
what store/extract_bit_field deal with) isn't always the same as
endianness in register order (which is what vec_concat and friends
operate on).  If we can avoid transitioning from one to the other
this will help avoid mistakes.

In general it would be more obvious (to me) if you fixed the callers
that create those subregs.

Now, I didn't want to pretend I'm reviewing the patch - so others please
do that (as said, I'm not familiar enough with the code to tell whether
it's actually correct).

Richard.

>
> Konstantinos
>
>
> > Richard.
> >
> > > Bootstrapped/regtested on AArch64 and x86_64.
> > >
> > >   PR rtl-optimization/118873
> > >
> > > gcc/ChangeLog:
> > >
> > >   * expmed.cc (generate_vec_concat): New function.
> > >   (store_bit_field_1): Check for cases where the value
> > >   to be written and/or the bitfield have vector modes
> > >   and try to generate the corresponding vec_set patterns
> > >   instead of subregs.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > >   * gcc.target/i386/pr118873.c: New test.
> > > ---
> > >  gcc/expmed.cc| 174 ++-
> > >  gcc/testsuite/gcc.target/i386/pr118873.c |  33 +
> > >  2 files changed, 200 insertions(+), 7 deletions(-)
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr118873.c
> > >
> > > diff --git a/gcc/expmed.cc b/gcc/expmed.cc
> > > index 8cf10d9c73bf..8c641f55b9c6 100644
> > > --- a/gcc/expmed.cc
> > > +++ b/gcc/expmed.cc
> > > @@ -740,6 +740,42 @@ store_bit_field_using_insv (const extraction_insn 
> > > *insv, rtx op0,
> > >return false;
> > >  }
> > >
> > > +/* Helper function for store_bit_field_1, used in the case that the 
> > > bitfield
> > > +   and the destination are both vectors.  It extracts the elements of OP 
> > > from
> > > +   LOWER_BOUND to UPPER_BOUND using a vec_select and uses a vec_concat to
> > > +   concatenate the extracted elements with the VALUE.  */
> > > +
> > > +rtx
> > > +generate_vec_concat (machine_mode fieldmode, rtx op, rtx value,
> > > +  HOST_WIDE_INT lower_bound,
> > > +  HOST_WIDE_INT upper_bound)
> > > +{
> > > +  if (!VECTOR_MODE_P (fieldmode))
> > > +return NULL_RTX;
> > > +
> > > +  rtvec vec = rtvec_alloc (GET_MODE_NUNITS (fieldmode).to_constant ());
> > > +  machine_mode outermode = GET_MODE (op);
> > > +
> > > +  for (HOST_WIDE_INT i = lower_bound; i < upper_bound; ++i)
> > > +RTVEC_ELT (vec, i) = GEN_INT (i);
> > > +  rtx par = gen_rtx_PARALLEL (VOIDmode, vec);
> > > +  rtx select = gen_rtx_VEC_SELECT (fieldmode, op, par);
> > > +  if (BYTES_BIG_ENDIAN)
> > > +{
> > > +  if (lower_bound > 0)
> > > + return gen_rtx_VEC_CONCAT (outermode, select, value);
> > > +  else
> > > + return gen_rtx_VEC_CONCAT (outermode, value, select);
> > > +}
> > > +  else
> > > +{
> > > +  if (lower_bound > 0)
> > > + return gen_rtx_VEC_CONCAT (outermode, value, select);
> > > +  else
> > > + return gen_rtx_VEC_CONCAT (outermode, select, value);
> > > +}
> >

Re: [PATCH] Fix crash with constant initializer caused by IPA

2025-05-29 Thread Richard Biener
On Thu, May 29, 2025 at 11:38 AM Eric Botcazou  wrote:
>
> Hi,
>
> the attached Ada testcase compiled with -O2 -gnatn makes the compiler crash in
> vect_can_force_dr_alignment_p during SLP vectorization:
>
>   if (decl_in_symtab_p (decl)
>   && !symtab_node::get (decl)->can_increase_alignment_p ())
> return false;
>
> because symtab_node::get (decl) returns a null node.  The phenomenon occurs
> for a pair of twin symbols listed like so in .cgraph:
>
> Opt7_Pkg.T12b/17 (Opt7_Pkg.T12b)
>   Type: variable definition analyzed
>   Visibility: semantic_interposition external public artificial
>   Aux: @0x44d45e0
>   References:
>   Referring: opt7_pkg__enum_name_table/13 (addr) opt7_pkg__enum_name_table/13
> (addr)
>   Availability: not-ready
>   Varpool flags: initialized read-only const-value-known
>
> Opt7_Pkg.T8b/16 (Opt7_Pkg.T8b)
>   Type: variable definition analyzed
>   Visibility: semantic_interposition external public artificial
>   Aux: @0x7f9fda3fff00
>   References:
>   Referring: opt7_pkg__enum_name_table/13 (addr) opt7_pkg__enum_name_table/13
> (addr)
>   Availability: not-ready
>   Varpool flags: initialized read-only const-value-known
>
> with:
>
> opt7_pkg__enum_name_table/13 (Opt7_Pkg.Enum_Name_Table)
>   Type: variable definition analyzed
>   Visibility: semantic_interposition external public
>   Aux: @0x44d45e0
>   References: Opt7_Pkg.T8b/16 (addr) Opt7_Pkg.T8b/16 (addr) Opt7_Pkg.T12b/17
> (addr) Opt7_Pkg.T12b/17 (addr)
>   Referring: opt7_pkg__image/2 (read) opt7_pkg__image/2 (read)
> opt7_pkg__image/2 (read) opt7_pkg__image/2 (read) opt7_pkg__image/2 (read)
> opt7_pkg__image/2 (read) opt7_pkg__image/2 (read) opt7_pkg__image/2 (read)
>   Availability: not-ready
>   Varpool flags: initialized read-only const-value-known
>
> being the crux of the matter.
>
> What happens is that symtab_remove_unreachable_nodes leaves the last symbol in
> kind of a limbo state: in .remove_symbols, we have:
>
> opt7_pkg__enum_name_table/13 (Opt7_Pkg.Enum_Name_Table)
>   Type: variable
>   Body removed by symtab_remove_unreachable_nodes
>   Visibility: externally_visible semantic_interposition external public
>   References:
>   Referring: opt7_pkg__image/2 (read) opt7_pkg__image/2 (read)
>   Availability: not_available
>   Varpool flags: initialized read-only const-value-known
>
> This means that the "body" (DECL_INITIAL) of the symbol has been disregarded
> during reachability analysis, causing the first two symbols to be discarded:
>
> Reclaiming variables: Opt7_Pkg.T12b/17 Opt7_Pkg.T8b/16
>
> but the DECL_INITIAL is explicitly preserved for later constant folding, which
> makes it possible to retrofit the DECLs corresponding to the first two symbols
> in the GIMPLE IR and ultimately leads vect_can_force_dr_alignment_p to crash.
>
>
> The decision to disregard the "body" (DECL_INITIAL) of the symbol is made in
> the first process_references present in ipa.cc:
>
>   if (node->definition && !node->in_other_partition
>   && ((!DECL_EXTERNAL (node->decl) || node->alias)
>   || (possible_inline_candidate_p (node)
>   /* We use variable constructors during late compilation for
>  constant folding.  Keep references alive so partitioning
>  knows about potential references.  */
>   || (VAR_P (node->decl)
>   && (flag_wpa
>   || flag_incremental_link
>  == INCREMENTAL_LINK_LTO)
>   && dyn_cast <varpool_node *> (node)
>->ctor_useable_for_folding_p ()
>
> because neither flag_wpa nor flag_incremental_link == INCREMENTAL_LINK_LTO is
> true, while the decision to ultimately preserve the DECL_INITIAL is made later
> in remove_unreachable_nodes:
>
>   /* Keep body if it may be useful for constant folding. */
>   if ((flag_wpa || flag_incremental_link == INCREMENTAL_LINK_LTO)
>   || ((init = ctor_for_folding (vnode->decl)) == error_mark_node))
> vnode->remove_initializer ();
>   else
> DECL_INITIAL (vnode->decl) = init;
>
>
> I think that the testcase shows that the "body" of ctor_useable_for_folding_p
> symbols must always be considered for reachability analysis (which could make
> the above test on ctor_for_folding useless).  But implementing that introduces
> a regression for g++.dg/ipa/devirt-39.C, because the vtable is preserved and
> in turn forces the method to be preserved, hence the special case for vtables.
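
The mismatch between the two decisions can be illustrated with a toy model. This is only a sketch under loud assumptions: SymbolSketch, reachable and dangling_refs are hypothetical names, the symbol table is a plain map rather than GCC's symtab, and the single flag stands in for "references that come from an initializer kept only for folding".

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy symbol: the symbols it refers to, whether those references come from
// an initializer that is only kept for constant folding, and whether the
// symbol is a root of the reachability walk.
struct SymbolSketch
{
  std::vector<std::string> refs;
  bool refs_from_foldable_ctor;
  bool is_root;
};

using SymtabSketch = std::map<std::string, SymbolSketch>;

// Worklist reachability.  When FOLLOW_FOLDABLE_CTORS is false, references
// from foldable initializers are ignored, as in the non-WPA case above.
std::set<std::string>
reachable (const SymtabSketch &symtab, bool follow_foldable_ctors)
{
  std::set<std::string> live;
  std::vector<std::string> worklist;
  for (const auto &entry : symtab)
    if (entry.second.is_root)
      {
        live.insert (entry.first);
        worklist.push_back (entry.first);
      }
  while (!worklist.empty ())
    {
      std::string name = worklist.back ();
      worklist.pop_back ();
      const SymbolSketch &sym = symtab.at (name);
      if (sym.refs_from_foldable_ctor && !follow_foldable_ctors)
        continue;
      for (const std::string &ref : sym.refs)
        if (live.insert (ref).second)
          worklist.push_back (ref);
    }
  return live;
}

// Count references from kept symbols to symbols that were discarded.  The
// initializer is assumed to be preserved, so its references still exist.
std::size_t
dangling_refs (const SymtabSketch &symtab, const std::set<std::string> &live)
{
  std::size_t count = 0;
  for (const std::string &name : live)
    for (const std::string &ref : symtab.at (name).refs)
      if (!live.count (ref))
        ++count;
  return count;
}

// The situation from the testcase, reduced to four symbols.
SymtabSketch
example_symtab ()
{
  SymtabSketch symtab;
  symtab["image"] = SymbolSketch{{"enum_name_table"}, false, true};
  symtab["enum_name_table"] = SymbolSketch{{"T8b", "T12b"}, true, false};
  symtab["T8b"] = SymbolSketch{{}, false, false};
  symtab["T12b"] = SymbolSketch{{}, false, false};
  return symtab;
}
```

In this model, ignoring the initializer during the walk while still keeping it afterwards leaves two references to discarded symbols; following it keeps all four symbols alive and leaves none.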
>
> The patch also renames the first process_references function in ipa.cc to clear
> the confusion with the second function in the same file.
>
> Bootstrapped/regtested on x86-64/Linux, OK for the mainline?

Ah, I've run into the same issue with IPA PTA recently, unfortunately Honza
seems unresponsive in bugzilla.  IMO the patch looks OK, but let's give Honza
the chance to chime in here - esp. the DECL_VIRTUAL special-casing is
sth I'm not familiar with (wouldn't this apply to all COMDATs?  but only
considering w

Re: [PATCH 1/2] forwprop: Change test in loop of optimize_memcpy_to_memset

2025-05-29 Thread Richard Biener
On Thu, May 29, 2025 at 11:48 PM Andrew Pinski  wrote:
>
> On Tue, May 27, 2025 at 5:14 AM Richard Biener
>  wrote:
> >
> > On Tue, May 27, 2025 at 5:02 AM Andrew Pinski  
> > wrote:
> > >
> > > This was noticed in the review of copy propagation for aggregates
> > > patch, instead of checking for a NULL or a non-ssa name of vuse,
> > > > we should instead check if the vuse is a default name and stop
> > > then.
> > >
> > > Bootstrapped and tested on x86_64-linux-gnu.
> > >
> > > gcc/ChangeLog:
> > >
> > > * tree-ssa-forwprop.cc (optimize_memcpy_to_memset): Change check
> > > from NULL/non-ssa name to default name.
> > >
> > > Signed-off-by: Andrew Pinski 
> > > ---
> > >  gcc/tree-ssa-forwprop.cc | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/gcc/tree-ssa-forwprop.cc b/gcc/tree-ssa-forwprop.cc
> > > index 4c048a9a298..e457a69ed48 100644
> > > --- a/gcc/tree-ssa-forwprop.cc
> > > +++ b/gcc/tree-ssa-forwprop.cc
> > > @@ -1226,7 +1226,8 @@ optimize_memcpy_to_memset (gimple_stmt_iterator 
> > > *gsip, tree dest, tree src, tree
> > >gimple *defstmt;
> > >unsigned limit = param_sccvn_max_alias_queries_per_access;
> > >do {
> > > -if (vuse == NULL || TREE_CODE (vuse) != SSA_NAME)
> > > +/* If the vuse is the default definition, then there is no stores 
> > > beforhand. */
> > > +if (SSA_NAME_IS_DEFAULT_DEF (vuse))
> >
> > Since forwprop does update_ssa in the end I was wondering whether any
> > bare non-SSA VUSE/VDEFs sneak in - for this the != SSA_NAME check
> > would be useful.  On a GIMPLE stmt gimple_vuse () will return NULL
> > when it's not a load or store (or with a novops call), as you are using
> > gimple_store_p/gimple_assign_load_p there might be a disconnect
> > between those predicates and the presence of a vuse (I hope not, but ...)
> >
> > The patch looks OK to me, the comments above apply to the copy propagation 
> > case.
>
> The copy prop case should be ok too since the vuse/vdef on the
> statement does not change when doing the prop; only the rhs of the
> statement. There is no inserting of a statement.  This is unless we
> remove the statement and then unlink_stmt_vdef will prop the vuse into
> the vdef of the statement which we are removing.
>
> I did test the copy prop using just SSA_NAME_IS_DEFAULT_DEF and there
> were no regressions there either.
>
> When optimize_memcpy_to_memset was part of fold_stmt, a NULL vuse
> and/or a non-SSA vuse was common due to running before ssa. This was
> why there was a check for non-SSA.
> I am not sure why the check for NULLness was there when it was
> part of fold-all-builtins though.
>
> On a side note I think many passes have TODO_update_ssa on them when
> they already keep the ssa up to date now. I wonder if most of that
> dates from the days of VMUST_DEF/VMAY_DEF and multiple names on them
> rather than one virtual name.

Could be.  TODO_update_ssa is cheap when nothing is to be done, but of
course it hides missed SSA updates.  Getting rid of unnecessary ones would
be nice.

Richard.

>
> Thanks,
> Andrew
>
>
> >
> > Thanks,
> > Richard.
> >
> > >return false;
> > >  defstmt = SSA_NAME_DEF_STMT (vuse);
> > >  if (is_a (defstmt))
> > > --
> > > 2.43.0
> > >


Re: [PATCH] scc_copy: conditional return TODO_cleanup_cfg.

2025-05-29 Thread Richard Biener
On Fri, May 30, 2025 at 3:53 AM Andrew Pinski  wrote:
>
> Only have cleanup cfg happen if scc copy did some propagation.
> This should be a small compile time improvement by not doing cleanup
> cfg if scc copy does nothing.
>
> Also removes TODO_update_ssa since it should not be needed.

OK.

Richard.

> gcc/ChangeLog:
>
> * gimple-ssa-sccopy.cc (scc_copy_prop::replace_scc_by_value): Return 
> true
> if something was done.
> (scc_copy_prop::propagate): Return true if something was changed.
> (pass_sccopy::execute): Return TODO_cleanup_cfg if a prop happened.
>
> Signed-off-by: Andrew Pinski 
> ---
>  gcc/gimple-ssa-sccopy.cc | 20 
>  1 file changed, 12 insertions(+), 8 deletions(-)
>
> diff --git a/gcc/gimple-ssa-sccopy.cc b/gcc/gimple-ssa-sccopy.cc
> index ee2a7fa8a72..c93374572a9 100644
> --- a/gcc/gimple-ssa-sccopy.cc
> +++ b/gcc/gimple-ssa-sccopy.cc
> @@ -464,7 +464,7 @@ class scc_copy_prop
>  public:
>scc_copy_prop ();
>~scc_copy_prop ();
> -  void propagate ();
> +  bool propagate ();
>
>  private:
>/* Bitmap tracking statements which were propagated so that they can be
> @@ -474,15 +474,16 @@ private:
>void visit_op (tree op, hash_set &outer_ops,
> hash_set &scc_set, bool &is_inner,
> tree &last_outer_op);
> -  void replace_scc_by_value (vec scc, tree val);
> +  bool replace_scc_by_value (vec scc, tree val);
>  };
>
>  /* For each statement from given SCC, replace its usages by value
> VAL.  */
>
> -void
> +bool
>  scc_copy_prop::replace_scc_by_value (vec scc, tree val)
>  {
> +  bool didsomething = false;
>for (gimple *stmt : scc)
>  {
>tree name = gimple_get_lhs (stmt);
> @@ -497,10 +498,12 @@ scc_copy_prop::replace_scc_by_value (vec scc, 
> tree val)
> }
>replace_uses_by (name, val);
>bitmap_set_bit (dead_stmts, SSA_NAME_VERSION (name));
> +  didsomething = true;
>  }
>
>if (dump_file)
>  fprintf (dump_file, "Replacing SCC of size %d\n", scc.length ());
> +  return didsomething;
>  }
>
>  /* Part of 'scc_copy_prop::propagate ()'.  */
> @@ -566,9 +569,10 @@ scc_copy_prop::visit_op (tree op, hash_set 
> &outer_ops,
>   Braun, Buchwald, Hack, Leissa, Mallon, Zwinkau, 2013, LNCS vol. 7791,
>   Section 3.2.  */
>
> -void
> +bool
>  scc_copy_prop::propagate ()
>  {
> +  bool didsomething = false;
>auto_vec useful_stmts = get_all_stmt_may_generate_copy ();
>scc_discovery discovery;
>
> @@ -636,7 +640,7 @@ scc_copy_prop::propagate ()
> {
>   /* The only operand in outer_ops.  */
>   tree outer_op = last_outer_op;
> - replace_scc_by_value (scc, outer_op);
> + didsomething |= replace_scc_by_value (scc, outer_op);
> }
>else if (outer_ops.elements () > 1)
> {
> @@ -651,6 +655,7 @@ scc_copy_prop::propagate ()
>
>scc.release ();
>  }
> +  return didsomething;
>  }
>
>  scc_copy_prop::scc_copy_prop ()
> @@ -683,7 +688,7 @@ const pass_data pass_data_sccopy =
>0, /* properties_provided */
>0, /* properties_destroyed */
>0, /* todo_flags_start */
> -  TODO_update_ssa | TODO_cleanup_cfg, /* todo_flags_finish */
> +  0, /* todo_flags_finish */
>  };
>
>  class pass_sccopy : public gimple_opt_pass
> @@ -703,8 +708,7 @@ unsigned
>  pass_sccopy::execute (function *)
>  {
>scc_copy_prop sccopy;
> -  sccopy.propagate ();
> -  return 0;
> +  return sccopy.propagate () ?  TODO_cleanup_cfg : 0;
>  }
>
>  } // anon namespace
> --
> 2.43.0
>


RE: [PATCH][GCC16][GCC15] aarch64: Add support for FUJITSU-MONAKA (-mcpu=fujitsu-monaka) CPU

2025-05-29 Thread Yuta Mukai (Fujitsu)
Hi Kyrill-san
Thank you for the review and for pushing.
Yuta

> -Original Message-
> From: Kyrylo Tkachov 
> Sent: Thursday, May 29, 2025 6:45 PM
> To: Mukai, Yuta/向井 優太 
> Cc: gcc-patches@gcc.gnu.org; Richard Sandiford ; 
> andre.simoesdiasvie...@arm.com
> Subject: Re: [PATCH][GCC16][GCC15] aarch64: Add support for FUJITSU-MONAKA 
> (-mcpu=fujitsu-monaka) CPU
> 
> 
> 
> > On 28 May 2025, at 13:36, Kyrylo Tkachov  wrote:
> >
> > Hi Yuta-san
> >
> >> On 23 May 2025, at 07:49, Yuta Mukai (Fujitsu)  
> >> wrote:
> >>
> >> Hello,
> >>
> >> We would like to enable features for FUJITSU-MONAKA that were implemented 
> >> in GCC after we added support for
> FUJITSU-MONAKA.
> >> As the features were implemented in GCC15, we also want to backport it to 
> >> GCC15.
> >>
> >> Thanks to Andre Vieira for notifying us.
> >>
> >> Bootstrapped/regtested on aarch64-unknown-linux-gnu.
> >>
> >> We would be grateful if someone could push this on our behalf, as we do 
> >> not have write access.
> >
> > Thanks, this is ok and I’ve pushed it to trunk with an adjusted ChangeLog 
> > entry.
> > I’ll push a backport to the GCC 15 branch next week after some simple smoke 
> > testing.
> 
> I found a bit of time and bootstrapped a backport.
> So pushed to the GCC 15 branch as well.
> Thanks again,
> Kyrill
> 
> 
> >
> > Kyrill
> >
> >   2025-05-23  Yuta Mukai  
> >
> >   gcc/ChangeLog:
> >
> >   * config/aarch64/aarch64-cores.def (fujitsu-monaka): Update ISA
> >   features.
> >
> >>
> >> Thanks,
> >> Yuta
> >> --
> >> Yuta Mukai
> >> Fujitsu Limited
> >>
> >> <0001-aarch64-Enable-newly-implemented-features-for-FUJITS.patch>
> 



Re: [PATCH 3/3] OpenMP: Handle more cases in user/condition selector

2025-05-29 Thread Sandra Loosemore

On 5/29/25 02:51, Tobias Burnus wrote:

@Jason – The idea is to make semantics.cc's maybe_convert_cond callable
from parser.cc + pt.cc, i.e. to make it a non-static function.
Any reason not to do so?

Sandra Loosemore wrote:


[…] By using the existing front-end
hooks for the implicit conversion to bool in conditional expressions,
we also get free support for using a C++ class object that has a bool
conversion operator in the user/condition selector.


Can you also add type-dependent testcases? They seem to work fine,
but are missing. Like

template<typename T, typename T2>
void f(T x, T2 y) {
  #pragma omp metadirective when(user={condition(x)}, target_device={device_num(y)} : flush)

}

plus calls to them.

* * *

In parser.cc and pt.cc, you don't call maybe_convert_cond (because it is
currently not accessible) - but by calling some of its ingredients, you
bypass some code.
I discussed with Jakub and the idea is (see top of the page) to make
maybe_convert_cond non-static and use it instead.

Additionally, it seems as if we should add

   if (!processing_template_decl)
     t = fold_build_cleanup_point_expr (TREE_TYPE (t), t);

as we do for the other clauses.



Like the attached V2 patch?

-Sandra

From 802bbefdf57548cee0e5aaab518b95a99aa26593 Mon Sep 17 00:00:00 2001
From: Sandra Loosemore 
Date: Fri, 30 May 2025 03:14:35 +
Subject: [PATCH V2] OpenMP: Handle more cases in user/condition selector

Tobias had noted that the C front end was not treating C23 constexprs
as constant in the user/condition selector property, which led to
missed opportunities to resolve metadirectives at parse time.
Additionally neither C nor C++ was permitting the expression to have
pointer or floating-point type -- the former being a common idiom in
other C/C++ conditional expressions.  By using the existing front-end
hooks for the implicit conversion to bool in conditional expressions,
we also get free support for using a C++ class object that has a bool
conversion operator in the user/condition selector.
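
For readers less familiar with the "implicit conversion to bool in conditional expressions" mentioned here, the following is a small sketch in plain C++ with no OpenMP involved. FeatureSwitch, as_condition and sketch_target are hypothetical names for illustration; the point is only that pointers, floating-point values and class objects with a bool conversion operator are all accepted where a condition is expected.

```cpp
// Hypothetical class with a bool conversion operator, standing in for the
// kind of object the commit message mentions.
struct FeatureSwitch
{
  bool enabled;
  explicit operator bool () const { return enabled; }
};

// Some object to take the address of in the usage below.
int sketch_target = 0;

// The first operand of ?: is contextually converted to bool, the same
// conversion an if or while condition applies.
template<typename T>
bool as_condition (const T &value)
{
  return value ? true : false;
}
```

Under these definitions, as_condition (&sketch_target), as_condition (0.5) and as_condition (FeatureSwitch{true}) all yield true, while a null pointer, 0.0 and FeatureSwitch{false} yield false.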

gcc/c/ChangeLog
	* c-parser.cc (c_parser_omp_context_selector): Call
	convert_lvalue_to_rvalue and c_objc_common_truthvalue_conversion
	on the expression for OMP_TRAIT_PROPERTY_BOOL_EXPR.

gcc/cp/ChangeLog
	* cp-tree.h (maybe_convert_cond): Declare.
	* parser.cc (cp_parser_omp_context_selector): Call
	maybe_convert_cond and fold_build_cleanup_point_expr on the
	expression for OMP_TRAIT_PROPERTY_BOOL_EXPR.
	* pt.cc (tsubst_omp_context_selector): Likewise.
	* semantics.cc (maybe_convert_cond): Remove static declaration.

gcc/testsuite/ChangeLog
	* c-c++-common/gomp/declare-variant-2.c: Update expected output.
	* c-c++-common/gomp/metadirective-condition-constexpr.c: New.
	* c-c++-common/gomp/metadirective-condition.c: New.
	* c-c++-common/gomp/metadirective-error-recovery.c: Update expected
	output.
	* g++.dg/gomp/metadirective-condition-class.C: New.
	* g++.dg/gomp/metadirective-condition-template.C: New.
---
 gcc/c/c-parser.cc | 19 ++--
 gcc/cp/cp-tree.h  |  1 +
 gcc/cp/parser.cc  | 21 +++--
 gcc/cp/pt.cc  | 30 ++---
 gcc/cp/semantics.cc   |  3 +-
 .../c-c++-common/gomp/declare-variant-2.c |  2 +-
 .../gomp/metadirective-condition-constexpr.c  | 13 ++
 .../gomp/metadirective-condition.c| 25 +++
 .../gomp/metadirective-error-recovery.c   |  9 +++-
 .../gomp/metadirective-condition-class.C  | 43 +++
 .../gomp/metadirective-condition-template.C   | 41 ++
 11 files changed, 188 insertions(+), 19 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/gomp/metadirective-condition-constexpr.c
 create mode 100644 gcc/testsuite/c-c++-common/gomp/metadirective-condition.c
 create mode 100644 gcc/testsuite/g++.dg/gomp/metadirective-condition-class.C
 create mode 100644 gcc/testsuite/g++.dg/gomp/metadirective-condition-template.C

diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index 4144aa17fde..e11e6034461 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -26865,17 +26865,30 @@ c_parser_omp_context_selector (c_parser *parser, enum omp_tss_code set,
 	  break;
 	case OMP_TRAIT_PROPERTY_DEV_NUM_EXPR:
 	case OMP_TRAIT_PROPERTY_BOOL_EXPR:
-	  t = c_parser_expr_no_commas (parser, NULL).value;
+	  {
+		c_expr texpr = c_parser_expr_no_commas (parser, NULL);
+		texpr = convert_lvalue_to_rvalue (token->location, texpr,
+		  true, true);
+		t = texpr.value;
+	  }
 	  if (t == error_mark_node)
 		return error_mark_node;
 	  mark_exp_read (t);
-	  t = c_fully_fold (t, false, NULL);
-	  if (!INTEGRAL_TYPE_P (TREE_TYPE (t)))
+	  if (property_kind == OMP_TRAIT_PROPERTY_BOOL_EXPR)
+		{
+		  t = c_objc_common_truthvalue_conversion (token->location,
+			   t,
+			   boolean_type_node);
+		  if (t == error_mark_node)
+		return error_mark_node;
+		}
+	  else if (!INTEGRAL_TYPE_P (TREE

[PATCH] scc_copy: conditional return TODO_cleanup_cfg.

2025-05-29 Thread Andrew Pinski
Only have cleanup cfg happen if scc copy did some propagation.
This should be a small compile time improvement by not doing cleanup
cfg if scc copy does nothing.

Also removes TODO_update_ssa since it should not be needed.

gcc/ChangeLog:

* gimple-ssa-sccopy.cc (scc_copy_prop::replace_scc_by_value): Return 
true
if something was done.
(scc_copy_prop::propagate): Return true if something was changed.
(pass_sccopy::execute): Return TODO_cleanup_cfg if a prop happened.

Signed-off-by: Andrew Pinski 
---
 gcc/gimple-ssa-sccopy.cc | 20 
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/gcc/gimple-ssa-sccopy.cc b/gcc/gimple-ssa-sccopy.cc
index ee2a7fa8a72..c93374572a9 100644
--- a/gcc/gimple-ssa-sccopy.cc
+++ b/gcc/gimple-ssa-sccopy.cc
@@ -464,7 +464,7 @@ class scc_copy_prop
 public:
   scc_copy_prop ();
   ~scc_copy_prop ();
-  void propagate ();
+  bool propagate ();
 
 private:
   /* Bitmap tracking statements which were propagated so that they can be
@@ -474,15 +474,16 @@ private:
   void visit_op (tree op, hash_set &outer_ops,
hash_set &scc_set, bool &is_inner,
tree &last_outer_op);
-  void replace_scc_by_value (vec scc, tree val);
+  bool replace_scc_by_value (vec scc, tree val);
 };
 
 /* For each statement from given SCC, replace its usages by value
VAL.  */
 
-void
+bool
 scc_copy_prop::replace_scc_by_value (vec scc, tree val)
 {
+  bool didsomething = false;
   for (gimple *stmt : scc)
 {
   tree name = gimple_get_lhs (stmt);
@@ -497,10 +498,12 @@ scc_copy_prop::replace_scc_by_value (vec scc, 
tree val)
}
   replace_uses_by (name, val);
   bitmap_set_bit (dead_stmts, SSA_NAME_VERSION (name));
+  didsomething = true;
 }
 
   if (dump_file)
 fprintf (dump_file, "Replacing SCC of size %d\n", scc.length ());
+  return didsomething;
 }
 
 /* Part of 'scc_copy_prop::propagate ()'.  */
@@ -566,9 +569,10 @@ scc_copy_prop::visit_op (tree op, hash_set 
&outer_ops,
  Braun, Buchwald, Hack, Leissa, Mallon, Zwinkau, 2013, LNCS vol. 7791,
  Section 3.2.  */
 
-void
+bool
 scc_copy_prop::propagate ()
 {
+  bool didsomething = false;
   auto_vec useful_stmts = get_all_stmt_may_generate_copy ();
   scc_discovery discovery;
 
@@ -636,7 +640,7 @@ scc_copy_prop::propagate ()
{
  /* The only operand in outer_ops.  */
  tree outer_op = last_outer_op;
- replace_scc_by_value (scc, outer_op);
+ didsomething |= replace_scc_by_value (scc, outer_op);
}
   else if (outer_ops.elements () > 1)
{
@@ -651,6 +655,7 @@ scc_copy_prop::propagate ()
 
   scc.release ();
 }
+  return didsomething;
 }
 
 scc_copy_prop::scc_copy_prop ()
@@ -683,7 +688,7 @@ const pass_data pass_data_sccopy =
   0, /* properties_provided */
   0, /* properties_destroyed */
   0, /* todo_flags_start */
-  TODO_update_ssa | TODO_cleanup_cfg, /* todo_flags_finish */
+  0, /* todo_flags_finish */
 };
 
 class pass_sccopy : public gimple_opt_pass
@@ -703,8 +708,7 @@ unsigned
 pass_sccopy::execute (function *)
 {
   scc_copy_prop sccopy;
-  sccopy.propagate ();
-  return 0;
+  return sccopy.propagate () ?  TODO_cleanup_cfg : 0;
 }
 
 } // anon namespace
-- 
2.43.0



Re: [PATCH v3] libstdc++: Implement stringstream from string_view [PR119741]

2025-05-29 Thread Jonathan Wakely

On 29/05/25 09:50 -0400, Nathan Myers wrote:

Change in V3:
* Comment that p2495 specifies a drive-by constraint omitted as redundant
* Adjust whitespace to fit in 80 columns

Change in V2:
* apply all review comments
* remove redundant drive-by "requires" on ctor from string allocator arg
* check allocators are plumbed through

-- >8 --

Implement PR libstdc++/119741 (P2495R3)
Add constructors to stringbuf, stringstream, istringstream,
and ostringstream, and a matching overload of str(sv) in each,
that take anything convertible to a string_view in places
where the existing functions take a string.

libstdc++-v3/ChangeLog:

PR libstdc++/119741
* include/std/sstream: full implementation, really just
decls, requires clause and plumbing.
* include/bits/version.def, include/bits/version.h:
new preprocessor symbol
__cpp_lib_sstream_from_string_view.
* testsuite/27_io/basic_stringbuf/cons/char/3.cc: New tests.
* testsuite/27_io/basic_istringstream/cons/char/2.cc: New tests.
* testsuite/27_io/basic_ostringstream/cons/char/4.cc: New tests.
* testsuite/27_io/basic_stringstream/cons/char/2.cc: New tests.


Historically we just named most tests as 1.cc, 2.cc, 3.cc but it's not
very helpful when you see "FAIL: .../1.cc" in the test logs, so since
these new tests are for specific new constructors, I think it would
make more sense for them to all be named string_view.cc

That makes it very clear that they're testing construction from
string_view.

Please add some wchar_t tests too, i.e. cons/wchar_t/string_view.cc.
That ensures that the impl doesn't accidentally use char_traits<char>
where it should be char_traits<_CharT> or anything like that. If the
tests only use the char specializations then we wouldn't notice. The
wchar_t tests can be copies of the char ones, with char -> wchar_t
substituted.

Or if you want to avoid copy&pasting the whole test, you could use C
instead of char and do this in the char/*.cc versions:

#ifndef C
#define C char
#endif

and then in the wchar_t/*.cc versions do:

#define C wchar_t
#include "../char/string_view.cc"

So that you reuse the char ones.
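
One wrinkle with the C macro approach is that string literals in the shared test body also have to follow C. A rough sketch of one way to handle that is below; make_text and round_trips are hypothetical helpers, not part of the testsuite, and the check uses the long-standing string constructor rather than the new string_view one so that it builds with any standard library.

```cpp
#include <sstream>
#include <string>

#ifndef C
#define C char
#endif

// Hypothetical helper: widen a narrow ASCII literal to basic_string<C>, so
// the same test body works for C == char and C == wchar_t.
std::basic_string<C> make_text (const char *narrow)
{
  std::basic_string<C> out;
  for (; *narrow; ++narrow)
    out.push_back (static_cast<C> (*narrow));
  return out;
}

// Example test body written only in terms of C.  NARROW must not contain
// whitespace, because operator>> stops at the first space.
bool round_trips (const char *narrow)
{
  std::basic_string<C> text = make_text (narrow);
  std::basic_istringstream<C> in (text);
  std::basic_string<C> word;
  in >> word;
  return word == text;
}
```

A wchar_t variant of the file would then only need to define C before including the char version, as described above.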


---
libstdc++-v3/include/bits/version.def |  11 +-
libstdc++-v3/include/bits/version.h   |  10 +
libstdc++-v3/include/std/sstream  | 200 --
.../27_io/basic_istringstream/cons/char/2.cc  | 187 
.../27_io/basic_ostringstream/cons/char/4.cc  | 186 
.../27_io/basic_stringbuf/cons/char/3.cc  | 196 +
.../27_io/basic_stringstream/cons/char/2.cc   | 196 +
7 files changed, 967 insertions(+), 19 deletions(-)
create mode 100644 
libstdc++-v3/testsuite/27_io/basic_istringstream/cons/char/2.cc
create mode 100644 
libstdc++-v3/testsuite/27_io/basic_ostringstream/cons/char/4.cc
create mode 100644 libstdc++-v3/testsuite/27_io/basic_stringbuf/cons/char/3.cc
create mode 100644 
libstdc++-v3/testsuite/27_io/basic_stringstream/cons/char/2.cc

diff --git a/libstdc++-v3/include/bits/version.def 
b/libstdc++-v3/include/bits/version.def
index 282667eabda..8172bcd4e26 100644
--- a/libstdc++-v3/include/bits/version.def
+++ b/libstdc++-v3/include/bits/version.def
@@ -649,7 +649,7 @@ ftms = {
  };
  values = {
v = 1;
-/* For when there's no gthread.  */
+// For when there is no gthread.
cxxmin = 17;
hosted = yes;
gthread = no;
@@ -1945,6 +1945,15 @@ ftms = {
  };
};

+ftms = {
+  name = sstream_from_string_view;
+  values = {
+v = 202302;


The correct value for this macro is 202306, see [version.syn] in the
working draft, or SD-6:
https://isocpp.org/std/standing-documents/sd-6-sg10-feature-test-recommendations#__cpp_lib_sstream_from_string_view
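
User code would typically compare the macro against the value it needs,
which is why the exact number matters. The sketch below is an assumption
about how a caller might do that, not part of the patch; first_word is a
hypothetical helper. When the macro is at least 202306 it selects the new
string_view constructor, and otherwise it falls back to copying into a
std::string.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <string_view>

// Hypothetical helper: extract the first whitespace-delimited word.
inline std::string
first_word(std::string_view sv)
{
#if defined(__cpp_lib_sstream_from_string_view) \
    && __cpp_lib_sstream_from_string_view >= 202306L
  // P2495R3 constructor: build the stream directly from the view.
  std::istringstream in(sv);
#else
  // Fallback for older libraries: copy into a std::string first.
  std::istringstream in{std::string(sv)};
#endif
  std::string word;
  in >> word;
  return word;
}
```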


+cxxmin = 26;
+hosted = yes;
+  };
+};
+
// Standard test specifications.
stds[97] = ">= 199711L";
stds[03] = ">= 199711L";
diff --git a/libstdc++-v3/include/bits/version.h 
b/libstdc++-v3/include/bits/version.h
index bb7c0479c72..b4b487fba92 100644
--- a/libstdc++-v3/include/bits/version.h
+++ b/libstdc++-v3/include/bits/version.h
@@ -2174,4 +2174,14 @@
#endif /* !defined(__cpp_lib_modules) && defined(__glibcxx_want_modules) */
#undef __glibcxx_want_modules

+#if !defined(__cpp_lib_sstream_from_string_view)
+# if (__cplusplus >  202302L) && _GLIBCXX_HOSTED
+#  define __glibcxx_sstream_from_string_view 202302L
+#  if defined(__glibcxx_want_all) || 
defined(__glibcxx_want_sstream_from_string_view)
+#   define __cpp_lib_sstream_from_string_view 202302L
+#  endif
+# endif
+#endif /* !defined(__cpp_lib_sstream_from_string_view) && 
defined(__glibcxx_want_sstream_from_string_view) */
+#undef __glibcxx_want_sstream_from_string_view
+
#undef __glibcxx_want_all
diff --git a/libstdc++-v3/include/std/sstream b/libstdc++-v3/include/std/sstream
index ad0c16a91e8..528756ed631 100644
--- a/libstdc++-v3/include/std/sstream
+++ b/libstdc++-v3/include/std/sstream
@@ -38,9 +38,14 @@
#endif

#include  // iostream
+#include 


As

Re: [PATCH] libstdc++: Compare keys and values separately in flat_map::operator==

2025-05-29 Thread Tomasz Kaminski
On Thu, May 29, 2025 at 3:56 PM Patrick Palka  wrote:

> Tested on x86_64-pc-linux-gnu, does this look OK for trunk/15?
>
> -- >8 --
>
> Instead of effectively doing a zipped comparison of the keys and values,
> compare them separately to leverage the underlying containers' optimized
> equality implementations.
>
> libstdc++-v3/ChangeLog:
>
> * include/std/flat_map (_Flat_map_impl::operator==): Compare
> keys and values separately.
> ---
>  libstdc++-v3/include/std/flat_map | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/libstdc++-v3/include/std/flat_map
> b/libstdc++-v3/include/std/flat_map
> index c0716d12412a..134307324190 100644
> --- a/libstdc++-v3/include/std/flat_map
> +++ b/libstdc++-v3/include/std/flat_map
> @@ -873,7 +873,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>[[nodiscard]]
>friend bool
>operator==(const _Derived& __x, const _Derived& __y)
> -  { return std::equal(__x.begin(), __x.end(), __y.begin(),
> __y.end()); }
> +  {
> +   return __x._M_cont.keys == __y._M_cont.keys
> + && __x._M_cont.values == __y._M_cont.values;
>
Previously we supported containers that do not have operator==, by calling
equal.
For the flat_set we also do not compare the containers. I would suggest
using in both:
  ranges::equal(x._M_cont)
Or using == on containers in both flat_map and flat_set.

> +  }
>
>template
> [[nodiscard]]
> --
> 2.50.0.rc0
>
>


Re: [AUTOFDO] Fix annotated profile for de-duplicated call

2025-05-29 Thread Jan Hubicka
> diff --git a/gcc/auto-profile.cc b/gcc/auto-profile.cc
> index 7e0e8c66124..8a317d85277 100644
> --- a/gcc/auto-profile.cc
> +++ b/gcc/auto-profile.cc
> @@ -1129,6 +1129,26 @@ afdo_set_bb_count (basic_block bb, const stmt_set 
> &promoted)
>gimple *stmt = gsi_stmt (gsi);
>if (gimple_clobber_p (stmt) || is_gimple_debug (stmt))
>  continue;
> +  /* If statements are de-duplicated, we will have same stmt executing 
> from
> +  more than one path (by jumping to same statment).  In this case, the
> +  profile we get will be for multiple paths and would make the annotated
> +  profile wrong.  An example of this is:
> +
> +  if (foo () == 4)
> +{
> +  bar ();
> +}
> +  else if (foo () == 5)
> +{
> +  bar ();
> +}
> + In this case, we want to skip the profile count of bar () and calculate
> + the profile from the edge counts.  In case of LBR/BRBE we are
> + profiling branches and GIMPLE_CALL is the important statement
> + here.  */
> +
> +  if (gimple_code (stmt) == GIMPLE_CALL)
> + continue;

I am not quite sure about this.  With this change you will basically
ignore all samples annotated to calls.  We can deduplicate other
statements, not only calls.  Ignoring all samples annotated with calls
seems to throw away a good part of the useful information, since
pre-inline there are very many of them.

We can mitigate this particular problem in some cases by deduplicating
early.  This would also help the inliner.

There are many later optimizations that will unavoidably lead to AFDO
disruption.  We may have something like -fautofdo-collection which will
disable passes that disturb AFDO a lot (like ICF or deduplication), but
I am not sure that makes a lot of sense either...

> location_t phi_loc
>   = gimple_phi_arg_location_from_edge (phi, tmp_e);
> count_info info;
> -   if (afdo_source_profile->get_count_info (phi_loc, &info)
> -   && info.count != 0)
> +   if (afdo_source_profile->get_count_info (phi_loc, &info))
>   {
> if (info.count > max_count)
>   max_count = info.count;

So the idea is to not mark a BB as annotated if it only has zero-executed
statements, since deduplication effectively makes the other BB contain
such a statement even though it is executed?

> @@ -1217,7 +1236,9 @@ afdo_find_equiv_class (bb_set *annotated_bb)
> && bb1->loop_father == bb->loop_father)
>   {
> bb1->aux = bb;
> -   if (bb1->count > bb->count && is_bb_annotated (bb1, *annotated_bb))
> +   if (bb1->count > bb->count
> +   && !is_bb_annotated (bb, *annotated_bb)
> +   && is_bb_annotated (bb1, *annotated_bb))
>   {
> bb->count = bb1->count;
> set_bb_annotated (bb, annotated_bb);
> @@ -1229,7 +1250,9 @@ afdo_find_equiv_class (bb_set *annotated_bb)
> && bb1->loop_father == bb->loop_father)
>   {
> bb1->aux = bb;
> -   if (bb1->count > bb->count && is_bb_annotated (bb1, *annotated_bb))
> +   if (bb1->count > bb->count
> +   && !is_bb_annotated (bb, *annotated_bb)
> +   && is_bb_annotated (bb1, *annotated_bb))

Why are these two necessary?  The code identifies pairs of BBs that
should execute the same number of times (which is visible in the CFG) and
attempts to fix up the counts.  Perhaps the merging should be smarter,
but if we do not make them execute the same number of times, we will only
have more inconsistent profiles...

> @@ -1269,10 +1293,14 @@ afdo_propagate_edge (bool is_succ, bb_set *annotated_bb)
>   else
> total_known_count += AFDO_EINFO (e)->get_count ();
>   num_edge++;
> + if (is_bb_annotated (is_succ ? e->dest : e->src, *annotated_bb))
> +   num_annotated++;
> + else
> +   bb_edge_to_annotate = e;
>}
>  
>  /* Be careful not to annotate block with no successor in special cases.  
> */
> -if (num_unknown_edge == 0 && total_known_count > bb->count)
> +if (num_unknown_edge == 0 && total_known_count >= bb->count)
>{
>   bb->count = total_known_count;
>   if (!is_bb_annotated (bb, *annotated_bb))
> @@ -1281,26 +1309,52 @@ afdo_propagate_edge (bool is_succ, bb_set 
> *annotated_bb)
>}
>  else if (num_unknown_edge == 1 && is_bb_annotated (bb, *annotated_bb))
>{
> - if (bb->count > total_known_count)
> -   {
> -   profile_count new_count = bb->count - total_known_count;
> -   AFDO_EINFO(unknown_edge)->set_count(new_count);
> -   if (num_edge == 1)
> - {
> -   basic_block succ_or_pred_bb = is_succ ? unknown_edge->dest : 
> unknown_edge->src;
> -   if (new_count > succ_or_pred_bb->count)
> - {
> -   succ_or_pred_bb->count = new_count;
> -   if (!is_bb_annotated (succ_or_pred_bb, *annotated_bb))
> -   

[PATCH v3] libstdc++: Implement stringstream from string_view [PR119741]

2025-05-29 Thread Nathan Myers
Change in V3:
 * Comment that p2495 specifies a drive-by constraint omitted as redundant
 * Adjust whitespace to fit in 80 columns

Change in V2:
 * apply all review comments
 * remove redundant drive-by "requires" on ctor from string allocator arg
 * check allocators are plumbed through

-- >8 --

Implement PR libstdc++/119741 (P2495R3)
Add constructors to stringbuf, stringstream, istringstream,
and ostringstream, and a matching overload of str(sv) in each,
that take anything convertible to a string_view in places
where the existing functions take a string.

libstdc++-v3/ChangeLog:

PR libstdc++/119741
* include/std/sstream: full implementation, really just
decls, requires clause and plumbing.
* include/bits/version.def, include/bits/version.h:
new preprocessor symbol
__cpp_lib_sstream_from_string_view.
* testsuite/27_io/basic_stringbuf/cons/char/3.cc: New tests.
* testsuite/27_io/basic_istringstream/cons/char/2.cc: New tests.
* testsuite/27_io/basic_ostringstream/cons/char/4.cc: New tests.
* testsuite/27_io/basic_stringstream/cons/char/2.cc: New tests.
---
 libstdc++-v3/include/bits/version.def |  11 +-
 libstdc++-v3/include/bits/version.h   |  10 +
 libstdc++-v3/include/std/sstream  | 200 --
 .../27_io/basic_istringstream/cons/char/2.cc  | 187 
 .../27_io/basic_ostringstream/cons/char/4.cc  | 186 
 .../27_io/basic_stringbuf/cons/char/3.cc  | 196 +
 .../27_io/basic_stringstream/cons/char/2.cc   | 196 +
 7 files changed, 967 insertions(+), 19 deletions(-)
 create mode 100644 
libstdc++-v3/testsuite/27_io/basic_istringstream/cons/char/2.cc
 create mode 100644 
libstdc++-v3/testsuite/27_io/basic_ostringstream/cons/char/4.cc
 create mode 100644 libstdc++-v3/testsuite/27_io/basic_stringbuf/cons/char/3.cc
 create mode 100644 
libstdc++-v3/testsuite/27_io/basic_stringstream/cons/char/2.cc

diff --git a/libstdc++-v3/include/bits/version.def 
b/libstdc++-v3/include/bits/version.def
index 282667eabda..8172bcd4e26 100644
--- a/libstdc++-v3/include/bits/version.def
+++ b/libstdc++-v3/include/bits/version.def
@@ -649,7 +649,7 @@ ftms = {
   };
   values = {
 v = 1;
-/* For when there's no gthread.  */
+// For when there is no gthread.
 cxxmin = 17;
 hosted = yes;
 gthread = no;
@@ -1945,6 +1945,15 @@ ftms = {
   };
 };
 
+ftms = {
+  name = sstream_from_string_view;
+  values = {
+v = 202302;
+cxxmin = 26;
+hosted = yes;
+  };
+};
+
 // Standard test specifications.
 stds[97] = ">= 199711L";
 stds[03] = ">= 199711L";
diff --git a/libstdc++-v3/include/bits/version.h 
b/libstdc++-v3/include/bits/version.h
index bb7c0479c72..b4b487fba92 100644
--- a/libstdc++-v3/include/bits/version.h
+++ b/libstdc++-v3/include/bits/version.h
@@ -2174,4 +2174,14 @@
 #endif /* !defined(__cpp_lib_modules) && defined(__glibcxx_want_modules) */
 #undef __glibcxx_want_modules
 
+#if !defined(__cpp_lib_sstream_from_string_view)
+# if (__cplusplus >  202302L) && _GLIBCXX_HOSTED
+#  define __glibcxx_sstream_from_string_view 202302L
+#  if defined(__glibcxx_want_all) || 
defined(__glibcxx_want_sstream_from_string_view)
+#   define __cpp_lib_sstream_from_string_view 202302L
+#  endif
+# endif
+#endif /* !defined(__cpp_lib_sstream_from_string_view) && 
defined(__glibcxx_want_sstream_from_string_view) */
+#undef __glibcxx_want_sstream_from_string_view
+
 #undef __glibcxx_want_all
diff --git a/libstdc++-v3/include/std/sstream b/libstdc++-v3/include/std/sstream
index ad0c16a91e8..528756ed631 100644
--- a/libstdc++-v3/include/std/sstream
+++ b/libstdc++-v3/include/std/sstream
@@ -38,9 +38,14 @@
 #endif
 
 #include  // iostream
+#include 
 
 #include 
 #include 
+#ifdef __cpp_lib_sstream_from_string_view
+# include   // is_convertible_v
+#endif
+
 #include  // allocator_traits, __allocator_like
 
 #if __cplusplus > 201703L && _GLIBCXX_USE_CXX11_ABI
@@ -52,8 +57,6 @@
 # define _GLIBCXX_SSTREAM_ALWAYS_INLINE [[__gnu__::__always_inline__]]
 #endif
 
-
-
 namespace std _GLIBCXX_VISIBILITY(default)
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
@@ -159,6 +162,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
   { __rhs._M_sync(const_cast(__rhs._M_string.data()), 0, 0); }
 
 #if __cplusplus > 201703L && _GLIBCXX_USE_CXX11_ABI
+   // P0408 Efficient access to basic_stringbuf buffer
   explicit
   basic_stringbuf(const allocator_type& __a)
   : basic_stringbuf(ios_base::in | std::ios_base::out, __a)
@@ -197,7 +201,36 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
| ios_base::out)
: basic_stringbuf(__s, __mode, allocator_type{})
{ }
+#endif
+
+#ifdef __cpp_lib_sstream_from_string_view
+  template
+   explicit
+   basic_stringbuf(const _Tp& __t,
+   ios_base::openmode __mode = ios_base::in | ios_base::out)
+ requires (is_convertible_v>)
+   : basic_stringbu

[PATCH] libstdc++: Compare keys and values separately in flat_map::operator==

2025-05-29 Thread Patrick Palka
Tested on x86_64-pc-linux-gnu, does this look OK for trunk/15?

-- >8 --

Instead of effectively doing a zipped comparison of the keys and values,
compare them separately to leverage the underlying containers' optimized
equality implementations.

libstdc++-v3/ChangeLog:

* include/std/flat_map (_Flat_map_impl::operator==): Compare
keys and values separately.
---
 libstdc++-v3/include/std/flat_map | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/libstdc++-v3/include/std/flat_map 
b/libstdc++-v3/include/std/flat_map
index c0716d12412a..134307324190 100644
--- a/libstdc++-v3/include/std/flat_map
+++ b/libstdc++-v3/include/std/flat_map
@@ -873,7 +873,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   [[nodiscard]]
   friend bool
   operator==(const _Derived& __x, const _Derived& __y)
-  { return std::equal(__x.begin(), __x.end(), __y.begin(), __y.end()); }
+  {
+   return __x._M_cont.keys == __y._M_cont.keys
+ && __x._M_cont.values == __y._M_cont.values;
+  }
 
   template
[[nodiscard]]
-- 
2.50.0.rc0



Re: [PATCH] libstdc++: Compare keys and values separately in flat_map::operator==

2025-05-29 Thread Tomasz Kaminski
On Thu, May 29, 2025 at 4:37 PM Tomasz Kaminski  wrote:

>
>
> On Thu, May 29, 2025 at 3:56 PM Patrick Palka  wrote:
>
>> Tested on x86_64-pc-linux-gnu, does this look OK for trunk/15?
>>
>> -- >8 --
>>
>> Instead of effectively doing a zipped comparison of the keys and values,
>> compare them separately to leverage the underlying containers' optimized
>> equality implementations.
>>
>> libstdc++-v3/ChangeLog:
>>
>> * include/std/flat_map (_Flat_map_impl::operator==): Compare
>> keys and values separately.
>> ---
>>  libstdc++-v3/include/std/flat_map | 5 -
>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/libstdc++-v3/include/std/flat_map
>> b/libstdc++-v3/include/std/flat_map
>> index c0716d12412a..134307324190 100644
>> --- a/libstdc++-v3/include/std/flat_map
>> +++ b/libstdc++-v3/include/std/flat_map
>> @@ -873,7 +873,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>>[[nodiscard]]
>>friend bool
>>operator==(const _Derived& __x, const _Derived& __y)
>> -  { return std::equal(__x.begin(), __x.end(), __y.begin(),
>> __y.end()); }
>> +  {
>> +   return __x._M_cont.keys == __y._M_cont.keys
>> + && __x._M_cont.values == __y._M_cont.values;
>>
> Previously we supported containers that do not have operator==, by calling
> equal.
> For the flat_set we also do not compare the containers. I would suggest
> using in both:
>   ranges::equal(x._M_cont)
> Or using == on containers in both flat_map and flat_set.
>
queue and stack use operator== for the containers, so I think we should
use == on containers in both.

> +  }
>>
>>template
>> [[nodiscard]]
>> --
>> 2.50.0.rc0
>>
>>


Re: [PATCH] libstdc++: Compare keys and values separately in flat_map::operator==

2025-05-29 Thread Jonathan Wakely
On Thu, 29 May 2025 at 15:42, Tomasz Kaminski  wrote:
>
>
>
> On Thu, May 29, 2025 at 3:56 PM Patrick Palka  wrote:
>>
>> Tested on x86_64-pc-linux-gnu, does this look OK for trunk/15?
>>
>> -- >8 --
>>
>> Instead of effectively doing a zipped comparison of the keys and values,
>> compare them separately to leverage the underlying containers' optimized
>> equality implementations.
>>
>> libstdc++-v3/ChangeLog:
>>
>> * include/std/flat_map (_Flat_map_impl::operator==): Compare
>> keys and values separately.
>> ---
>>  libstdc++-v3/include/std/flat_map | 5 -
>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/libstdc++-v3/include/std/flat_map 
>> b/libstdc++-v3/include/std/flat_map
>> index c0716d12412a..134307324190 100644
>> --- a/libstdc++-v3/include/std/flat_map
>> +++ b/libstdc++-v3/include/std/flat_map
>> @@ -873,7 +873,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>>[[nodiscard]]
>>friend bool
>>operator==(const _Derived& __x, const _Derived& __y)
>> -  { return std::equal(__x.begin(), __x.end(), __y.begin(), __y.end()); }
>> +  {
>> +   return __x._M_cont.keys == __y._M_cont.keys
>> + && __x._M_cont.values == __y._M_cont.values;
>
> Previously we supported containers that do not have operator==, by calling 
> equal.

Oh, good point.
Using == means the element types of the underlying containers must be
equality comparable, but the original approach of using std::equal on
the zipped values only means those tuples must be equality comparable,
and an evil user could have overloaded:

bool operator==(const tuple&, const tuple&);

so that those comparisons work, but MyVal might not be equality comparable.
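
The difference between the two strategies can be sketched outside
libstdc++. In this illustration FlatPairs, equal_zipped and equal_separate
are hypothetical names, and the struct only stands in for the pair of
containers held in _M_cont; it is not the real flat_map implementation.
For element types that are equality comparable, both functions give the
same answer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for the key and value containers of a flat_map.
// Invariant assumed here: keys.size() == values.size().
struct FlatPairs
{
  std::vector<int> keys;
  std::vector<std::string> values;
};

// Element-by-element comparison of (key, value) pairs, similar in
// spirit to std::equal over the zipped ranges.
inline bool
equal_zipped(const FlatPairs& a, const FlatPairs& b)
{
  if (a.keys.size() != b.keys.size())
    return false;
  for (std::size_t i = 0; i < a.keys.size(); ++i)
    if (!(a.keys[i] == b.keys[i] && a.values[i] == b.values[i]))
      return false;
  return true;
}

// Compare the two containers separately, as the patch proposes.
inline bool
equal_separate(const FlatPairs& a, const FlatPairs& b)
{ return a.keys == b.keys && a.values == b.values; }
```

The separate form relies on operator== of each container, which is the
requirement the thread is debating.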

> For the flat_set we also do not compare the containers. I would suggest using 
> in both:
>   ranges::equal(x._M_cont)
> Or using == on containers in both flat_map and flat_set.
>>
>> +  }
>>
>>template
>> [[nodiscard]]
>> --
>> 2.50.0.rc0
>>



[pushed] c++: C++17 constexpr lambda and goto/static

2025-05-29 Thread Jason Merrill
Tested x86_64-pc-linux-gnu, applying to trunk.

-- 8< --

We only want the error for these cases for functions explicitly declared
constexpr, but we still want to set invalid_constexpr on C++17 lambdas so
maybe_save_constexpr_fundef doesn't make them implicitly constexpr.

The potential_constant_expression_1 change isn't necessary for this test,
but still seems correct.
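
As an illustration of the kind of code involved (a sketch only, not the
test added by this patch): a lambda containing goto is well-formed in
C++17 and can be called at run time; it just must not be treated as
implicitly constexpr. The function name pick is hypothetical.

```cpp
#include <cassert>

// A C++17 lambda that contains a goto.  It is well-formed and callable
// at run time, but it is not usable in a constant expression in C++17.
inline int
pick()
{
  auto f = [] {
    int r = 42;
    goto done;
    r = 1;      // never executed: the goto jumps over it
  done:
    return r;
  };
  return f();
}
```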

gcc/cp/ChangeLog:

* decl.cc (start_decl): Also set invalid_constexpr
for maybe_constexpr_fn.
* parser.cc (cp_parser_jump_statement): Likewise.
* constexpr.cc (potential_constant_expression_1): Ignore
goto to an artificial label.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/constexpr-lambda29.C: New test.
---
 gcc/cp/constexpr.cc   |  3 ++
 gcc/cp/decl.cc| 28 +++
 gcc/cp/parser.cc  |  7 +++--
 .../g++.dg/cpp1z/constexpr-lambda29.C | 19 +
 4 files changed, 43 insertions(+), 14 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp1z/constexpr-lambda29.C

diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index fa754b9a176..272fab32896 100644
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -10979,6 +10979,9 @@ potential_constant_expression_1 (tree t, bool 
want_rval, bool strict, bool now,
*jump_target = *target;
return true;
  }
+   if (DECL_ARTIFICIAL (*target))
+ /* The user didn't write this goto, this isn't the problem.  */
+ return true;
if (flags & tf_error)
  constexpr_error (loc, fundef_p, "% is not a constant "
   "expression");
diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index a9ef28bfd80..ec4b6298b11 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -6198,22 +6198,28 @@ start_decl (const cp_declarator *declarator,
 }
 
   if (current_function_decl && VAR_P (decl)
-  && DECL_DECLARED_CONSTEXPR_P (current_function_decl)
+  && maybe_constexpr_fn (current_function_decl)
   && cxx_dialect < cxx23)
 {
   bool ok = false;
   if (CP_DECL_THREAD_LOCAL_P (decl) && !DECL_REALLY_EXTERN (decl))
-   error_at (DECL_SOURCE_LOCATION (decl),
- "%qD defined % in %qs function only "
- "available with %<-std=c++23%> or %<-std=gnu++23%>", decl,
- DECL_IMMEDIATE_FUNCTION_P (current_function_decl)
- ? "consteval" : "constexpr");
+   {
+ if (DECL_DECLARED_CONSTEXPR_P (current_function_decl))
+   error_at (DECL_SOURCE_LOCATION (decl),
+ "%qD defined % in %qs function only "
+ "available with %<-std=c++23%> or %<-std=gnu++23%>", decl,
+ DECL_IMMEDIATE_FUNCTION_P (current_function_decl)
+ ? "consteval" : "constexpr");
+   }
   else if (TREE_STATIC (decl))
-   error_at (DECL_SOURCE_LOCATION (decl),
- "%qD defined % in %qs function only available "
- "with %<-std=c++23%> or %<-std=gnu++23%>", decl,
- DECL_IMMEDIATE_FUNCTION_P (current_function_decl)
- ? "consteval" : "constexpr");
+   {
+ if (DECL_DECLARED_CONSTEXPR_P (current_function_decl))
+   error_at (DECL_SOURCE_LOCATION (decl),
+ "%qD defined % in %qs function only available "
+ "with %<-std=c++23%> or %<-std=gnu++23%>", decl,
+ DECL_IMMEDIATE_FUNCTION_P (current_function_decl)
+ ? "consteval" : "constexpr");
+   }
   else
ok = true;
   if (!ok)
diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 3e39bf33fab..091873cbe3a 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -15431,11 +15431,12 @@ cp_parser_jump_statement (cp_parser* parser, tree 
&std_attrs)
 
 case RID_GOTO:
   if (parser->in_function_body
- && DECL_DECLARED_CONSTEXPR_P (current_function_decl)
+ && maybe_constexpr_fn (current_function_decl)
  && cxx_dialect < cxx23)
{
- error ("% in % function only available with "
-"%<-std=c++23%> or %<-std=gnu++23%>");
+ if (DECL_DECLARED_CONSTEXPR_P (current_function_decl))
+   error ("% in % function only available with "
+  "%<-std=c++23%> or %<-std=gnu++23%>");
  cp_function_chain->invalid_constexpr = true;
}
 
diff --git a/gcc/testsuite/g++.dg/cpp1z/constexpr-lambda29.C 
b/gcc/testsuite/g++.dg/cpp1z/constexpr-lambda29.C
new file mode 100644
index 000..9e661b6a55d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/constexpr-lambda29.C
@@ -0,0 +1,19 @@
+// Test that we don't make lambdas with goto/static implicitly constexpr
+// when an explicitly constexpr function would be ill-formed.
+
+// { dg-do compile { target c++17 } }
+
+int main()
+{
+  constexpr int a = [] {
+return 42;
+goto label;
+  label:
+return 1

Re: [PATCH] RISC-V: Add 'bclr+binv' peephole2 optimization.

2025-05-29 Thread Jeff Law




On 5/28/25 9:05 PM, Jiawei wrote:


This seems like it would be much better as a combine pattern.   In 
fact, I'm a bit surprised that combine didn't simplify this series of 
operations into a IOR.  So I'd really like to see the .combine dump 
with and without this hunk for the relevant testcase.


Here is the dump log, using 
trunk(7fca794e0199baff8f07140a950ba3374c6aa634), more details please see 
https://godbolt.org/z/3hfzdz3Ks


===

~/rv/bin/riscv64-unknown-linux-gnu-g++ -march=rv64gc_zba_zbb_zbs -O2  -S 
-fdump-rtl-all redundant-bitmap-2.C


before combine in .ext_dce

Thanks!   That was helpful.

I was looking for the full .combine dump -- the full dump includes 
information about patterns that were tried and failed.  That often will 
point the way to a better solution.


In particular in the .combine dump we have this nugget:

Trying 15 -> 18:
   15: r151:DI=0xfffe<-

I think combine really should have simplified that before querying the
target.  That really should have been simplified to a bit insertion idiom
or perhaps a simpler ior.


More generally, the question we should first ask is whether or not the 
source should have simplified independent of the target.  I think the 
answer is yes in this case, which means we should try to fix that 
problem first since it'll improve every target rather than just RISC-V.
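
Assuming the sequence under discussion clears a bit and then inverts the
same bit (an assumption drawn from the patch title and the mention of IOR,
since the full RTL is not shown here), the target-independent identity is
that the result equals setting that bit. A small sketch with hypothetical
helper names:

```cpp
#include <cassert>
#include <cstdint>

// Clear bit n, then invert bit n (what a bclr followed by a binv computes).
constexpr std::uint64_t
clear_then_invert(std::uint64_t x, unsigned n)
{
  const std::uint64_t mask = std::uint64_t{1} << (n & 63);
  return (x & ~mask) ^ mask;
}

// Set bit n with a single inclusive OR.
constexpr std::uint64_t
set_bit(std::uint64_t x, unsigned n)
{
  const std::uint64_t mask = std::uint64_t{1} << (n & 63);
  return x | mask;
}

// The two agree whether or not the bit was set to begin with.
static_assert(clear_then_invert(0x0, 3) == set_bit(0x0, 3), "bit clear");
static_assert(clear_then_invert(0x8, 3) == set_bit(0x8, 3), "bit set");
```

Since the identity does not depend on the target, a generic simplification
would benefit every port.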


When we do find ourselves needing to write new target patterns, a 
define_insn will generally be preferable to a define_peephole.


The define_insn will match when there's a data dependency within a basic
block.  A define_peephole requires the insns to be consecutive in the
IL.  Thus the define_insn will tend to match more often and is therefore
preferable to a define_peephole.


Anyway, to recap, I think the better solution is to improve
simplify_binary_operation or one of its children, or perhaps
simplify_compound_operation or its related functions.


jeff




Re: [EXT] Re: [PATCH v3] rs6000: Adding missed ISA 3.0 atomic memory operation instructions.

2025-05-29 Thread Peter Bergner
On 5/29/25 5:35 AM, Segher Boessenkool wrote:
>
> Add yourself to authors as well?

Agreed.  Just add your name/email address directly under mine, like so:

2025-05-29  Peter Bergner  
Jeevitha Palanisamy  




>> +{   \
>> +  register TYPE _ret asm ("r8");\
>> +  register TYPE _cond asm ("r9") = _COND;   \
>> +  register TYPE _value asm ("r10") = _VALUE;\
>> +  __asm__ __volatile__ (OPCODE " %[ret],%P[addr],%[code]"   \
>> +: [addr] "+Q" (_PTR[0]), [ret] "=r" (_ret)  \
>> +: "r" (_cond), "r" (_value), [code] "n" (FC));  \
>> +  return _ret;  \
>> +}
> 
> Naming the operands is an extra indirection, and makes things way less
> readable (which means *understandable*) as well.  Just use %0, %1, %2
> please?  It's a single line, people will not lose track of what is what
> anyway (and if they would, the code is then way too big for extended
> asm, so named asm operands is always a code stench).

I agree that's a little too much syntactic sugar, but we were just
being consistent with the other existing code that uses this syntax.
I suppose you could use %1,%0,%4 here (%2 & %3 are not used directly)
and then clean up the other code similarly as a follow-on cleanup?




>> +#define _AMO_LD_INCREMENT(NAME, TYPE, OPCODE, FC)   \
>> +static __inline__ TYPE  \
>> +NAME (TYPE *_PTR)   \
>> +{   \
>> +  TYPE _RET;\
>> +  __asm__ volatile (OPCODE " %[ret],%P[addr],%[code]\n" \
>> +: [addr] "+Q" (_PTR[0]), [ret] "=r" (_RET)  \
>> +: "Q" (*(TYPE (*)[2]) _PTR), [code] "n" (FC));  \
>> +  return _RET;  \
>> +}
> 
> I don't understand the [2].  Should it be [1]?  These instructions
> can use the value at mem+s (as the ISA names things) as input, but not
> mem+2*s.

I think 2 is correct here.  This 2 isn't an index like the 0 in _PTR[0],
but it's a size.  This specific use is trying to say we're reading from
memory and we're reading 2 locations, mem(EA,s) and mem(EA+s,s).
Maybe we could use separate mentions of _PTR[0] and _PTR[1] instead???
We don't actually use that "operand" in the instruction, it's just there
to tell the compiler that those memory locations are read.
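
To illustrate the point that the [2] is a size and not an index, the
following sketch (with hypothetical helper names, and deliberately without
inline asm so that it is target-independent) checks how many bytes each
operand expression describes. sizeof does not evaluate its operand, so no
memory is accessed.

```cpp
#include <cassert>
#include <cstddef>

// Bytes covered by an operand written as _PTR[0]: one element.
template<typename T>
constexpr std::size_t
bytes_of_first_element(T* p)
{ return sizeof(p[0]); }

// Bytes covered by an operand written as *(TYPE (*)[2]) _PTR:
// the pointer is reinterpreted as pointing to an array of two
// elements, so the operand spans both of them.
template<typename T>
constexpr std::size_t
bytes_of_two_element_view(T* p)
{ return sizeof(*(T (*)[2]) p); }
```

With TYPE being long, the first expression covers sizeof(long) bytes and
the second covers twice that, matching the two locations described above.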

Ditto for _AMO_LD_DECREMENT usage, which reads mem(EA-s,s) and mem(EA,s).

Peter



Re: [PATCH] libstdc++: Compare keys and values separately in flat_map::operator==

2025-05-29 Thread Tomasz Kaminski
On Thu, May 29, 2025 at 4:49 PM Jonathan Wakely  wrote:

> On Thu, 29 May 2025 at 15:48, Jonathan Wakely  wrote:
> >
> > On Thu, 29 May 2025 at 15:42, Tomasz Kaminski 
> wrote:
> > >
> > >
> > >
> > > On Thu, May 29, 2025 at 3:56 PM Patrick Palka 
> wrote:
> > >>
> > >> Tested on x86_64-pc-linux-gnu, does this look OK for trunk/15?
> > >>
> > >> -- >8 --
> > >>
> > >> Instead of effectively doing a zipped comparison of the keys and
> values,
> > >> compare them separately to leverage the underlying containers'
> optimized
> > >> equality implementations.
> > >>
> > >> libstdc++-v3/ChangeLog:
> > >>
> > >> * include/std/flat_map (_Flat_map_impl::operator==): Compare
> > >> keys and values separately.
> > >> ---
> > >>  libstdc++-v3/include/std/flat_map | 5 -
> > >>  1 file changed, 4 insertions(+), 1 deletion(-)
> > >>
> > >> diff --git a/libstdc++-v3/include/std/flat_map
> b/libstdc++-v3/include/std/flat_map
> > >> index c0716d12412a..134307324190 100644
> > >> --- a/libstdc++-v3/include/std/flat_map
> > >> +++ b/libstdc++-v3/include/std/flat_map
> > >> @@ -873,7 +873,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> > >>[[nodiscard]]
> > >>friend bool
> > >>operator==(const _Derived& __x, const _Derived& __y)
> > >> -  { return std::equal(__x.begin(), __x.end(), __y.begin(),
> __y.end()); }
> > >> +  {
> > >> +   return __x._M_cont.keys == __y._M_cont.keys
> > >> + && __x._M_cont.values == __y._M_cont.values;
> > >
> > > Previously we supported containers that do not have operator==, by
> calling equal.
> >
> > Oh, good point.
> > Using == means the element types of the underlying containers must be
> > equality comparable, but the original approach of using std::equal on
> > the zipped values only means those tuples must be equality comparable,
> > and an evil user could have overloaded:
> >
> > bool operator==(const tuple&, const tuple&);
>
> Or const tuple& or whatever the zipped type is.
>
Actually in [container.reqmts] p42
 we require that:
T  is equality comparable
Which in our case is std::tuple, but then we are comparing
std::tuple.
So I think just comparing containers is fine.

>
>
> >
> > so that those comparisons work, but MyVal might not be equality
> comparable.
> >
> > > For the flat_set we also do not compare the containers. I would
> suggest using in both:
> > >   ranges::equal(x._M_cont)
> > > Or using == on containers in both flat_map and flat_set.
> > >>
> > >> +  }
> > >>
> > >>template
> > >> [[nodiscard]]
> > >> --
> > >> 2.50.0.rc0
> > >>
>
>


Re: [PATCH v4 0/8] Implement layouts from mdspan.

2025-05-29 Thread Tomasz Kaminski
Sending this a bit after the fact, but:
I have finished the review, and most of the commits need only minimal
cosmetic changes.
The only major functional changes I have requested are for the
layout_stride implementation.

On Wed, May 28, 2025 at 4:36 PM Tomasz Kaminski  wrote:

> I have reviewed and posted feedback up to, but not including layout_stride
> today.
> Will try to finish tomorrow.
> Thank you again for continuous work on the patches.
>
> On Tue, May 27, 2025 at 4:40 PM Tomasz Kaminski 
> wrote:
>
>>
>>
>> On Tue, May 27, 2025 at 4:32 PM Luc Grosheintz 
>> wrote:
>>
>>> Since, I believe now we're through the larger questions about
>>> how to implement layouts. If reviewing all three over and over
>>> is too painful, it might now make sense to split the patch into
>>> separate patches, one per layout.
>>>
>> I think we are OK. As you mentioned we are past general discussion,
>> so I need to do a more thorough review, checking against the
>> standard.
>> I will try to book some time for this this week.
>>
>>
>>> On 5/26/25 16:04, Luc Grosheintz wrote:
>>> > This follows up on:
>>> > https://gcc.gnu.org/pipermail/libstdc++/2025-May/061572.html
>>> >
>>> > Note that this patch series can only be applied after merging:
>>> > https://gcc.gnu.org/pipermail/libstdc++/2025-May/061653.html
>>> >
>>> > The important changes since v3 are:
>>> >* Fixed and tested several related overflow issues that occurred in
>>> >  extents of size 0 by using `size_t` to compute products.
>>> >* Fixed and tested default ctors.
>>> >* Add missing code for module support.
>>> >* Documented deviation from standard.
>>> >
>>> > The smaller changes include:
>>> >* Squashed the three small commits that make cosmetic changes to
>>> >  std::extents.
>>> >* Remove layout_left related changes from the layout_stride commit.
>>> >* Remove superfluous `mapping(extents_type(__exts))`.
>>> >* Fix indenting and improve comment in layout_stride.
>>> >* Add an easy check for representable required_span_size to
>>> >  layout_stride.
>>> >* Inline __dynamic_extents_prod
>>> >
>>> > Thank you Tomasz for all the great reviews!
>>> >
>>> > Luc Grosheintz (8):
>>> >libstdc++: Improve naming and whitespace for extents.
>>> >libstdc++: Implement layout_left from mdspan.
>>> >libstdc++: Add tests for layout_left.
>>> >libstdc++: Implement layout_right from mdspan.
>>> >libstdc++: Add tests for layout_right.
>>> >libstdc++: Implement layout_stride from mdspan.
>>> >libstdc++: Add tests for layout_stride.
>>> >libstdc++: Make layout_left(layout_stride) noexcept.
>>> >
>>> >   libstdc++-v3/include/std/mdspan   | 711
>>> +-
>>> >   libstdc++-v3/src/c++23/std.cc.in  |   5 +-
>>> >   .../mdspan/layouts/class_mandate_neg.cc   |  42 ++
>>> >   .../23_containers/mdspan/layouts/ctors.cc | 459 +++
>>> >   .../23_containers/mdspan/layouts/empty.cc |  78 ++
>>> >   .../23_containers/mdspan/layouts/mapping.cc   | 568 ++
>>> >   .../23_containers/mdspan/layouts/stride.cc| 500 
>>> >   7 files changed, 2349 insertions(+), 14 deletions(-)
>>> >   create mode 100644
>>> libstdc++-v3/testsuite/23_containers/mdspan/layouts/class_mandate_neg.cc
>>> >   create mode 100644
>>> libstdc++-v3/testsuite/23_containers/mdspan/layouts/ctors.cc
>>> >   create mode 100644
>>> libstdc++-v3/testsuite/23_containers/mdspan/layouts/empty.cc
>>> >   create mode 100644
>>> libstdc++-v3/testsuite/23_containers/mdspan/layouts/mapping.cc
>>> >   create mode 100644
>>> libstdc++-v3/testsuite/23_containers/mdspan/layouts/stride.cc
>>> >
>>>
>>>


Re: [PATCH] libstdc++: Define flat_set::operator== in terms of ==

2025-05-29 Thread Tomasz Kaminski
On Thu, May 29, 2025 at 7:14 PM Patrick Palka  wrote:

> Tested on x86_64-pc-linux-gnu, does this look OK for trunk/15?
>
> -- >8 --
>
> ... for consistency with the other standard container adaptors
> (stack, queue, etc).
>
> libstdc++-v3/ChangeLog:
>
> * include/std/flat_set (_Flat_set_impl::operator==):
> Define in terms of ==, not std::equal.
> ---
>
LGTM. I think this is the best choice in the end, as containers may have
more information than iterators and can therefore perform the comparison
more quickly.

>  libstdc++-v3/include/std/flat_set | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/libstdc++-v3/include/std/flat_set
> b/libstdc++-v3/include/std/flat_set
> index c48340d79809..3da8882d154e 100644
> --- a/libstdc++-v3/include/std/flat_set
> +++ b/libstdc++-v3/include/std/flat_set
> @@ -728,7 +728,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>[[nodiscard]]
>friend bool
>operator==(const _Derived& __x, const _Derived& __y)
> -  { return std::equal(__x.begin(), __x.end(), __y.begin(),
> __y.end()); }
> +  { return __x._M_cont == __y._M_cont; }
>
>template
> [[nodiscard]]
> --
> 2.50.0.rc0
>
>


Re: [AUTOFDO] Fix annotated profile for de-duplicated call

2025-05-29 Thread Jan Hubicka
> 
> However I do not quite follow the old or new logic here.
> So if I have only one unknown edge out (or in) from BB and I know
> its count, I can determine count of that edge by Kirchhoff's law.
> 
> But then the old code computes number of edges out of the BB
> and if it is only one it updates the count of the destination BB.
> I think it should be testing number of in-edges of the
> destination bb, which seems to be the bug you are fixing.
> (and if it is indeed a bug, I think we should fix it first before dealing
> with deduplication).

Also I wonder, the code uses POST_DOMINATORS so somewhere it needs to
include fake edges for infinite loops and noreturns.

Are fake edges still present when annotating edges? If so, we probably
want to ignore them.  Also it would probably help to check which
edges have profile_probability::zero () and profile_probability::always ().
Those can come from static profile when we are pretty sure about the
outcome (for example, for EH edges or for user annotated cold paths) and
unless we have good data from AFDO showing they are wrong, we could
trust them similarly to AFDO annotations.

Honza
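
The single-unknown-edge inference discussed above can be sketched in
isolation as follows.  This is only an illustration under simplified
assumptions, not the auto-profile.cc code: edge_sketch and
infer_single_unknown_edge are hypothetical names, counts are plain
integers, and fake edges are assumed to have been filtered out already.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical, simplified edge: the count is empty while still unknown.
struct edge_sketch
{
  std::optional<std::uint64_t> count;
};

// If exactly one edge in EDGES (all out-edges, or all in-edges, of a block
// whose count is BB_COUNT) is unknown, infer it as the block count minus
// the sum of the known edges.  Return true if a count was inferred.
bool
infer_single_unknown_edge (std::uint64_t bb_count,
			   std::vector<edge_sketch> &edges)
{
  edge_sketch *unknown = nullptr;
  std::uint64_t known_sum = 0;
  for (edge_sketch &e : edges)
    {
      if (e.count)
	known_sum += *e.count;
      else if (unknown)
	return false;	// More than one unknown edge: nothing to infer.
      else
	unknown = &e;
    }
  if (!unknown)
    return false;	// Every edge is already known.
  // Sampled profiles can be inconsistent, so clamp at zero (an assumption
  // of this sketch, not something stated in the thread).
  unknown->count = bb_count > known_sum ? bb_count - known_sum : 0;
  return true;
}
```

Propagating the inferred count to the destination block is a separate
step; as argued above, that step would need to look at the number of
in-edges of the destination rather than the out-edges of the source.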


[pushed] c++, coroutines: Delete now unused code for parm guards.

2025-05-29 Thread Iain Sandoe
tested on x86_64-darwin, powerpc64le-linux; NFC pushed as obvious,
thanks,
Iain

--- 8< ---

Since r16-775-g18df4a10bc9694 we use nested cleanups to
handle parameter copy destructors in the ramp (and pass
a list of cleanups required to the actor, which will only
be invoked if the parameter copies were all correctly
built and therefore does not need to guard destructors
either).

This deletes the provisions for frame parameter copy
destructor guards.

gcc/cp/ChangeLog:

* coroutines.cc (analyze_fn_parms): No longer
create a parameter copy guard var.
* coroutines.h (struct param_info): Remove the
entry for the parameter copy destructor guard.

Signed-off-by: Iain Sandoe 
---
 gcc/cp/coroutines.cc | 12 +---
 gcc/cp/coroutines.h  |  1 -
 2 files changed, 1 insertion(+), 12 deletions(-)

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index b1e555cb336..64a0a344349 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -4089,17 +4089,7 @@ analyze_fn_parms (tree orig, hash_map 
*param_uses)
}
   parm.field_id = name;
   if (TYPE_HAS_NONTRIVIAL_DESTRUCTOR (parm.frame_type))
-   {
- char *buf = xasprintf ("_Coro_q%u_%s_live", parm_num,
-DECL_NAME (arg) ? IDENTIFIER_POINTER (name)
-: "__unnamed");
- parm.guard_var
-   = coro_build_artificial_var (UNKNOWN_LOCATION, get_identifier (buf),
-boolean_type_node, orig,
-boolean_false_node);
- free (buf);
- parm.trivial_dtor = false;
-   }
+   parm.trivial_dtor = false;
   else
parm.trivial_dtor = true;
 }
diff --git a/gcc/cp/coroutines.h b/gcc/cp/coroutines.h
index d13bea0f302..10698cf2e12 100644
--- a/gcc/cp/coroutines.h
+++ b/gcc/cp/coroutines.h
@@ -9,7 +9,6 @@ struct param_info
   vec *body_uses; /* Worklist of uses, void if there are none.  */
   tree frame_type;   /* The type used to represent this parm in the frame.  */
   tree orig_type;/* The original type of the parm (not as passed).  */
-  tree guard_var;/* If we need a DTOR on exception, this bool guards it.  
*/
   tree fr_copy_dtor; /* If we need a DTOR on exception, this is it.  */
   bool by_ref;   /* Was passed by reference.  */
   bool pt_ref;   /* Was a pointer to object.  */
-- 
2.39.2 (Apple Git-143)



Re: [PATCH v5 01/24] ppc: Add PowerPC FMV symbol tests.

2025-05-29 Thread Jeff Law




On 5/29/25 6:40 AM, Alfie Richards wrote:

From: Alice Carlotti 

This tests the mangling of function assembly names when annotated with
target_clones attributes.

gcc/testsuite/ChangeLog:

* g++.target/powerpc/mvc-symbols1.C: New test.
* g++.target/powerpc/mvc-symbols2.C: New test.
* g++.target/powerpc/mvc-symbols3.C: New test.
* g++.target/powerpc/mvc-symbols4.C: New test.
So is this patch independent of the rest of the patchkit?  ie, does it 
pass on the trunk as-is right now?  If so, this is fine to commit now.


Thanks,
Jeff



Re: [PATCH v5 02/24] i386: Add x86 FMV symbol tests

2025-05-29 Thread Jeff Law




On 5/29/25 6:40 AM, Alfie Richards wrote:

From: Alice Carlotti 

This is for testing the x86 mangling of FMV versioned function
assembly names.

gcc/testsuite/ChangeLog:

* g++.target/i386/mv-symbols1.C: New test.
* g++.target/i386/mv-symbols2.C: New test.
* g++.target/i386/mv-symbols3.C: New test.
* g++.target/i386/mv-symbols4.C: New test.
* g++.target/i386/mv-symbols5.C: New test.
* g++.target/i386/mvc-symbols1.C: New test.
* g++.target/i386/mvc-symbols2.C: New test.
* g++.target/i386/mvc-symbols3.C: New test.
* g++.target/i386/mvc-symbols4.C: New test.
Similarly, if this isn't dependent on subsequent patches to pass, then 
it can be committed now.


Jeff



[PATCH v4] libstdc++: stringstream ctors from string_view [PR119741]

2025-05-29 Thread Nathan Myers
Change in V4:
 * Rename tests to string_view.cc
 * Adapt tests to cons/wchar_t directories
 * Define symbol __cpp_lib_sstream_from_string_view as 202406
 * Define symbol __glibcxx_want_sstream_from_string_view before version.h
 * Include version.h after other includes
 * No include type_traits
 * Drive-by comment moved to commit message
 * Each `explicit` on its own line
 * Run tests even when using old COW string

Change in V3:
 * Comment that p2495 specifies a drive-by constraint omitted as redundant
 * Adjust whitespace to fit in 80 columns

Change in V2:
 * Apply all review comments
 * Remove redundant drive-by "requires" on ctor from string allocator arg
 * Check allocators are plumbed through

-- >8 --

Implement PR libstdc++/119741 (P2495R3).
Add constructors to stringbuf, stringstream, istringstream, and ostringstream,
and a matching overload of str(sv) in each, that take anything convertible to
a string_view in places where the existing ctors and function take a string.
Note this change omits the constraint applied to the istringstream constructor
from string cited as a "drive-by" in P2495R3, as we have determined it is
redundant.
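
As a rough illustration of what the new constructors permit (a sketch,
not taken from the patch or its tests): sum_ints below is a hypothetical
helper that builds an istringstream directly from a string_view when the
feature-test macro is defined, and falls back to an explicit std::string
copy otherwise.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <string_view>

// Hypothetical helper: sum the whitespace-separated integers in TEXT.
int
sum_ints (std::string_view text)
{
#if defined(__cpp_lib_sstream_from_string_view)
  // P2495R3: construct directly from anything convertible to string_view.
  std::istringstream in (text);
#else
  // Without P2495R3 an explicit std::string has to be built first.
  std::istringstream in{std::string (text)};
#endif
  int total = 0;
  for (int value; in >> value; )
    total += value;
  return total;
}
```

Either branch should give the same result; the difference is only
whether the caller has to spell out the intermediate std::string.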

libstdc++-v3/ChangeLog:

PR libstdc++/119741
* include/std/sstream: full implementation, really just
decls, requires clause and plumbing.
* include/bits/version.def, include/bits/version.h:
new preprocessor symbol
__cpp_lib_sstream_from_string_view.
* testsuite/27_io/basic_stringbuf/cons/char/string_view.cc:
New tests.
* testsuite/27_io/basic_istringstream/cons/char/string_view.cc:
New tests.
* testsuite/27_io/basic_ostringstream/cons/char/string_view.cc:
New tests.
* testsuite/27_io/basic_stringstream/cons/char/string_view.cc:
New tests.
* testsuite/27_io/basic_stringbuf/cons/wchar_t/string_view.cc:
New tests.
* testsuite/27_io/basic_istringstream/cons/wchar_t/string_view.cc:
New tests.
* testsuite/27_io/basic_ostringstream/cons/wchar_t/string_view.cc:
New tests.
* testsuite/27_io/basic_stringstream/cons/wchar_t/string_view.cc:
New tests.
---
 libstdc++-v3/include/bits/version.def |  11 +-
 libstdc++-v3/include/bits/version.h   |  10 +
 libstdc++-v3/include/std/sstream  | 198 +++--
 .../cons/char/string_view.cc  | 195 +
 .../cons/wchar_t/string_view.cc   |   3 +
 .../cons/char/string_view.cc  | 194 +
 .../cons/wchar_t/string_view.cc   |   3 +
 .../basic_stringbuf/cons/char/string_view.cc  | 205 ++
 .../cons/wchar_t/string_view.cc   |   3 +
 .../cons/char/string_view.cc  | 204 +
 .../cons/wchar_t/string_view.cc   |   3 +
 11 files changed, 1010 insertions(+), 19 deletions(-)
 create mode 100644 
libstdc++-v3/testsuite/27_io/basic_istringstream/cons/char/string_view.cc
 create mode 100644 
libstdc++-v3/testsuite/27_io/basic_istringstream/cons/wchar_t/string_view.cc
 create mode 100644 
libstdc++-v3/testsuite/27_io/basic_ostringstream/cons/char/string_view.cc
 create mode 100644 
libstdc++-v3/testsuite/27_io/basic_ostringstream/cons/wchar_t/string_view.cc
 create mode 100644 
libstdc++-v3/testsuite/27_io/basic_stringbuf/cons/char/string_view.cc
 create mode 100644 
libstdc++-v3/testsuite/27_io/basic_stringbuf/cons/wchar_t/string_view.cc
 create mode 100644 
libstdc++-v3/testsuite/27_io/basic_stringstream/cons/char/string_view.cc
 create mode 100644 
libstdc++-v3/testsuite/27_io/basic_stringstream/cons/wchar_t/string_view.cc

diff --git a/libstdc++-v3/include/bits/version.def 
b/libstdc++-v3/include/bits/version.def
index 282667eabda..53bf72d95c2 100644
--- a/libstdc++-v3/include/bits/version.def
+++ b/libstdc++-v3/include/bits/version.def
@@ -649,7 +649,7 @@ ftms = {
   };
   values = {
 v = 1;
-/* For when there's no gthread.  */
+// For when there is no gthread.
 cxxmin = 17;
 hosted = yes;
 gthread = no;
@@ -1945,6 +1945,15 @@ ftms = {
   };
 };
 
+ftms = {
+  name = sstream_from_string_view;
+  values = {
+v = 202306;
+cxxmin = 26;
+hosted = yes;
+  };
+};
+
 // Standard test specifications.
 stds[97] = ">= 199711L";
 stds[03] = ">= 199711L";
diff --git a/libstdc++-v3/include/bits/version.h 
b/libstdc++-v3/include/bits/version.h
index bb7c0479c72..0b932183e5b 100644
--- a/libstdc++-v3/include/bits/version.h
+++ b/libstdc++-v3/include/bits/version.h
@@ -2174,4 +2174,14 @@
 #endif /* !defined(__cpp_lib_modules) && defined(__glibcxx_want_modules) */
 #undef __glibcxx_want_modules
 
+#if !defined(__cpp_lib_sstream_from_string_view)
+# if (__cplusplus >=  202306L) && _GLIBCXX_HOSTED
+#  define __glibcxx_sstream_from_string_view 202306L
+#  if defined(__glibcxx_want_all) || 
defined(__glibcxx_want_sstream_from_string_view)
+#   define __cpp_lib_sstream_from_string_view 

[PATCH v5 11/24] Add clone_identifier function.

2025-05-29 Thread Alfie Richards
This is similar to clone_function_name and its siblings but takes an
identifier tree node rather than a function declaration.

This is to be used in conjunction with the identifier node stored in
cgraph_function_version_info::assembler_name to mangle FMV functions in
later patches.

gcc/ChangeLog:

* cgraph.h (clone_identifier): New function.
* cgraphclones.cc (clone_identifier): New function.
(clone_function_name): Refactor to use clone_identifier.
---
 gcc/cgraph.h|  1 +
 gcc/cgraphclones.cc | 16 ++--
 2 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index cea1dcaad77..9e991f0 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -2640,6 +2640,7 @@ tree clone_function_name (const char *name, const char 
*suffix,
 tree clone_function_name (tree decl, const char *suffix,
  unsigned long number);
 tree clone_function_name (tree decl, const char *suffix);
+tree clone_identifier (tree decl, const char *suffix);
 
 void tree_function_versioning (tree, tree, vec *,
   ipa_param_adjustments *,
diff --git a/gcc/cgraphclones.cc b/gcc/cgraphclones.cc
index c160e8b6985..0932b352317 100644
--- a/gcc/cgraphclones.cc
+++ b/gcc/cgraphclones.cc
@@ -570,6 +570,14 @@ clone_function_name (tree decl, const char *suffix)
   /* For consistency this needs to behave the same way as
  ASM_FORMAT_PRIVATE_NAME does, but without the final number
  suffix.  */
+  return clone_identifier (identifier, suffix);
+}
+
+/* Return a new clone of ID ending with the string SUFFIX.  */
+
+tree
+clone_identifier (tree id, const char *suffix)
+{
   char *separator = XALLOCAVEC (char, 2);
   separator[0] = symbol_table::symbol_suffix_separator ();
   separator[1] = 0;
@@ -578,15 +586,11 @@ clone_function_name (tree decl, const char *suffix)
 #else
   const char *prefix = "";
 #endif
-  char *result = ACONCAT ((prefix,
-  IDENTIFIER_POINTER (identifier),
-  separator,
-  suffix,
-  (char*)0));
+  char *result = ACONCAT (
+(prefix, IDENTIFIER_POINTER (id), separator, suffix, (char *) 0));
   return get_identifier (result);
 }
 
-
 /* Create callgraph node clone with new declaration.  The actual body will be
copied later at compilation stage.  The name of the new clone will be
constructed from the name of the original node, SUFFIX and NUM_SUFFIX.
-- 
2.34.1
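
For readers without the tree at hand, the string handling in
clone_identifier amounts to a concatenation of prefix, identifier,
separator and suffix.  The following is a standalone sketch using
std::string rather than GCC trees; clone_identifier_sketch is a
hypothetical name, and the '.' separator and empty prefix are assumed
defaults (in GCC both are target dependent).

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for clone_identifier: build PREFIX + ID +
// SEPARATOR + SUFFIX.  The real function takes an identifier tree node,
// obtains the separator from symbol_table::symbol_suffix_separator (),
// and returns the result of get_identifier.
std::string
clone_identifier_sketch (const std::string &id, const std::string &suffix,
			 char separator = '.',
			 const std::string &prefix = "")
{
  return prefix + id + separator + suffix;
}
```

With these assumed defaults, clone_identifier_sketch ("foo", "default")
yields "foo.default".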



[PATCH v5 21/24] aarch64: Remove FMV beta warning.

2025-05-29 Thread Alfie Richards
This patch removes the warning for target_version and target_clones in aarch64
as it is now spec compliant.

gcc/ChangeLog:

* config/aarch64/aarch64.cc (aarch64_process_target_version_attr):
Remove warning.
* config/aarch64/aarch64.opt: Mark -Wno-experimental-fmv-target
deprecated.
* doc/invoke.texi: Ditto.

gcc/testsuite/ChangeLog:

* g++.target/aarch64/mv-1.C: Remove option.
* g++.target/aarch64/mv-and-mvc-error1.C: Ditto.
* g++.target/aarch64/mv-and-mvc-error2.C: Ditto.
* g++.target/aarch64/mv-and-mvc-error3.C: Ditto.
* g++.target/aarch64/mv-and-mvc1.C: Ditto.
* g++.target/aarch64/mv-and-mvc2.C: Ditto.
* g++.target/aarch64/mv-and-mvc3.C: Ditto.
* g++.target/aarch64/mv-and-mvc4.C: Ditto.
* g++.target/aarch64/mv-error1.C: Ditto.
* g++.target/aarch64/mv-error2.C: Ditto.
* g++.target/aarch64/mv-error3.C: Ditto.
* g++.target/aarch64/mv-error4.C: Ditto.
* g++.target/aarch64/mv-error5.C: Ditto.
* g++.target/aarch64/mv-error6.C: Ditto.
* g++.target/aarch64/mv-error7.C: Ditto.
* g++.target/aarch64/mv-error8.C: Ditto.
* g++.target/aarch64/mv-pragma.C: Ditto.
* g++.target/aarch64/mv-symbols1.C: Ditto.
* g++.target/aarch64/mv-symbols10.C: Ditto.
* g++.target/aarch64/mv-symbols11.C: Ditto.
* g++.target/aarch64/mv-symbols12.C: Ditto.
* g++.target/aarch64/mv-symbols13.C: Ditto.
* g++.target/aarch64/mv-symbols2.C: Ditto.
* g++.target/aarch64/mv-symbols3.C: Ditto.
* g++.target/aarch64/mv-symbols4.C: Ditto.
* g++.target/aarch64/mv-symbols5.C: Ditto.
* g++.target/aarch64/mv-symbols6.C: Ditto.
* g++.target/aarch64/mv-symbols7.C: Ditto.
* g++.target/aarch64/mv-symbols8.C: Ditto.
* g++.target/aarch64/mv-symbols9.C: Ditto.
* g++.target/aarch64/mvc-error1.C: Ditto.
* g++.target/aarch64/mvc-error2.C: Ditto.
* g++.target/aarch64/mvc-symbols1.C: Ditto.
* g++.target/aarch64/mvc-symbols2.C: Ditto.
* g++.target/aarch64/mvc-symbols3.C: Ditto.
* g++.target/aarch64/mvc-symbols4.C: Ditto.
* g++.target/aarch64/mv-warning1.C: Removed.
* g++.target/aarch64/mvc-warning1.C: Removed.
---
 gcc/config/aarch64/aarch64.cc| 9 -
 gcc/config/aarch64/aarch64.opt   | 2 +-
 gcc/doc/invoke.texi  | 5 +
 gcc/testsuite/g++.target/aarch64/mv-1.C  | 1 -
 gcc/testsuite/g++.target/aarch64/mv-and-mvc-error1.C | 1 -
 gcc/testsuite/g++.target/aarch64/mv-and-mvc-error2.C | 1 -
 gcc/testsuite/g++.target/aarch64/mv-and-mvc-error3.C | 1 -
 gcc/testsuite/g++.target/aarch64/mv-and-mvc1.C   | 1 -
 gcc/testsuite/g++.target/aarch64/mv-and-mvc2.C   | 1 -
 gcc/testsuite/g++.target/aarch64/mv-and-mvc3.C   | 1 -
 gcc/testsuite/g++.target/aarch64/mv-and-mvc4.C   | 1 -
 gcc/testsuite/g++.target/aarch64/mv-error1.C | 1 -
 gcc/testsuite/g++.target/aarch64/mv-error2.C | 1 -
 gcc/testsuite/g++.target/aarch64/mv-error3.C | 1 -
 gcc/testsuite/g++.target/aarch64/mv-error4.C | 1 -
 gcc/testsuite/g++.target/aarch64/mv-error5.C | 1 -
 gcc/testsuite/g++.target/aarch64/mv-error6.C | 1 -
 gcc/testsuite/g++.target/aarch64/mv-error7.C | 1 -
 gcc/testsuite/g++.target/aarch64/mv-error8.C | 1 -
 gcc/testsuite/g++.target/aarch64/mv-pragma.C | 1 -
 gcc/testsuite/g++.target/aarch64/mv-symbols1.C   | 1 -
 gcc/testsuite/g++.target/aarch64/mv-symbols10.C  | 1 -
 gcc/testsuite/g++.target/aarch64/mv-symbols11.C  | 1 -
 gcc/testsuite/g++.target/aarch64/mv-symbols12.C  | 1 -
 gcc/testsuite/g++.target/aarch64/mv-symbols13.C  | 1 -
 gcc/testsuite/g++.target/aarch64/mv-symbols2.C   | 1 -
 gcc/testsuite/g++.target/aarch64/mv-symbols3.C   | 1 -
 gcc/testsuite/g++.target/aarch64/mv-symbols4.C   | 1 -
 gcc/testsuite/g++.target/aarch64/mv-symbols5.C   | 1 -
 gcc/testsuite/g++.target/aarch64/mv-symbols6.C   | 1 -
 gcc/testsuite/g++.target/aarch64/mv-symbols7.C   | 1 -
 gcc/testsuite/g++.target/aarch64/mv-symbols8.C   | 1 -
 gcc/testsuite/g++.target/aarch64/mv-symbols9.C   | 1 -
 gcc/testsuite/g++.target/aarch64/mv-warning1.C   | 9 -
 gcc/testsuite/g++.target/aarch64/mvc-error1.C| 1 -
 gcc/testsuite/g++.target/aarch64/mvc-error2.C| 1 -
 gcc/testsuite/g++.target/aarch64/mvc-symbols1.C  | 1 -
 gcc/testsuite/g++.target/aarch64/mvc-symbols2.C  | 1 -
 gcc/testsuite/g++.target/aarch64/mvc-symbols3.C  | 1 -
 gcc/testsuite/g++.target/aarch64/mvc-symbols4.C  | 1 -
 gcc/testsuite/g++.target/aarch64/mvc-warning1.C  | 1 -
 41 files changed, 2 insertions(+), 60 deletions(-)
 delete mode 100644 gcc/testsuite/g++.target/aarch64/mv-warning1.C

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aar

[PATCH v5 22/24] c: Add target_version attribute support.

2025-05-29 Thread Alfie Richards
This commit introduces support for the target_version attribute in the C
front end, following the behavior defined in the Arm C Language Extensions.

Key changes include:

- During pushdecl, the compiler now checks whether the current symbol is
  part of a multiversioned set.
  - New versions are added to the function multiversioning (FMV) set, and the
symbol binding is updated to include the default version (if present).
This means the binding for a multiversioned symbol will always reference
the default version (if present), as it defines the scope and signature
for the entire set.
  - Pre-existing versions are merged with their previous version (or diagnosed).
- Lookup logic is adjusted to prevent resolving non-default versions.
- start_decl and start_function are updated to handle marking and mangling of
  versioned functions.
- c_parse_final_cleanups now includes a call to process_same_body_aliases.
  This has no functional impact other than setting cpp_implicit_aliases_done
  on all nodes, which is necessary for certain shared FMV logic.

gcc/c/ChangeLog:

* c-decl.cc (maybe_mark_function_versioned): New function.
(merge_decls): Preserve DECL_FUNCTION_VERSIONED in merging.
(duplicate_decls): Add check and diagnostic for unmergeable version 
decls.
(pushdecl): Add FMV target_version logic.
(lookup_name): Don't resolve non-default versions.
(start_decl): Mark and mangle versioned functions.
(start_function): Mark and mangle versioned functions.
(c_parse_final_cleanups): Add call to process_same_body_aliases.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/mv-1.c: New test.
* gcc.target/aarch64/mv-and-mvc1.c: New test.
* gcc.target/aarch64/mv-and-mvc2.c: New test.
* gcc.target/aarch64/mv-and-mvc3.c: New test.
* gcc.target/aarch64/mv-and-mvc4.c: New test.
* gcc.target/aarch64/mv-symbols1.c: New test.
* gcc.target/aarch64/mv-symbols10.c: New test.
* gcc.target/aarch64/mv-symbols11.c: New test.
* gcc.target/aarch64/mv-symbols12.c: New test.
* gcc.target/aarch64/mv-symbols13.c: New test.
* gcc.target/aarch64/mv-symbols2.c: New test.
* gcc.target/aarch64/mv-symbols3.c: New test.
* gcc.target/aarch64/mv-symbols4.c: New test.
* gcc.target/aarch64/mv-symbols5.c: New test.
* gcc.target/aarch64/mv-symbols6.c: New test.
* gcc.target/aarch64/mv-symbols7.c: New test.
* gcc.target/aarch64/mv-symbols8.c: New test.
* gcc.target/aarch64/mv-symbols9.c: New test.
* gcc.target/aarch64/mvc-symbols1.c: New test.
* gcc.target/aarch64/mvc-symbols2.c: New test.
* gcc.target/aarch64/mvc-symbols3.c: New test.
* gcc.target/aarch64/mvc-symbols4.c: New test.
---
 gcc/c/c-decl.cc   | 113 ++
 gcc/testsuite/gcc.target/aarch64/mv-1.c   |  43 +++
 .../gcc.target/aarch64/mv-and-mvc1.c  |  37 ++
 .../gcc.target/aarch64/mv-and-mvc2.c  |  28 +
 .../gcc.target/aarch64/mv-and-mvc3.c  |  40 +++
 .../gcc.target/aarch64/mv-and-mvc4.c  |  37 ++
 .../gcc.target/aarch64/mv-symbols1.c  |  38 ++
 .../gcc.target/aarch64/mv-symbols10.c |  42 +++
 .../gcc.target/aarch64/mv-symbols11.c |  16 +++
 .../gcc.target/aarch64/mv-symbols12.c |  27 +
 .../gcc.target/aarch64/mv-symbols13.c |  28 +
 .../gcc.target/aarch64/mv-symbols2.c  |  28 +
 .../gcc.target/aarch64/mv-symbols3.c  |  27 +
 .../gcc.target/aarch64/mv-symbols4.c  |  31 +
 .../gcc.target/aarch64/mv-symbols5.c  |  36 ++
 .../gcc.target/aarch64/mv-symbols6.c  |  20 
 .../gcc.target/aarch64/mv-symbols7.c  |  47 
 .../gcc.target/aarch64/mv-symbols8.c  |  47 
 .../gcc.target/aarch64/mv-symbols9.c  |  44 +++
 .../gcc.target/aarch64/mvc-symbols1.c |  25 
 .../gcc.target/aarch64/mvc-symbols2.c |  15 +++
 .../gcc.target/aarch64/mvc-symbols3.c |  19 +++
 .../gcc.target/aarch64/mvc-symbols4.c |  12 ++
 23 files changed, 800 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/mv-1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/mv-and-mvc1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/mv-and-mvc2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/mv-and-mvc3.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/mv-and-mvc4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/mv-symbols1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/mv-symbols10.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/mv-symbols11.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/mv-symbols12.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/mv-symbols13.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/mv-symbols2.c
 create mode 100

[PATCH] RISC-V: Add svbare extension.

2025-05-29 Thread Dongyan Chen
This patch adds support for the svbare extension, which is part of the RVA23
profile, so that GCC can recognize and process the extension correctly at
compile time.
---
 gcc/config/riscv/riscv-ext.def   | 13 +
 gcc/config/riscv/riscv-ext.opt   |  2 ++
 gcc/doc/riscv-ext.texi   |  4 
 gcc/testsuite/gcc.target/riscv/arch-59.c |  5 +
 4 files changed, 24 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/arch-59.c

diff --git a/gcc/config/riscv/riscv-ext.def b/gcc/config/riscv/riscv-ext.def
index 2e3c21660184..5dca1cde4f57 100644
--- a/gcc/config/riscv/riscv-ext.def
+++ b/gcc/config/riscv/riscv-ext.def
@@ -1935,6 +1935,19 @@ DEFINE_RISCV_EXT(
   /* BITMASK_BIT_POSITION*/ BITMASK_NOT_YET_ALLOCATED,
   /* EXTRA_EXTENSION_FLAGS */ 0)

+DEFINE_RISCV_EXT(
+  /* NAME */ svbare,
+  /* UPPERCAE_NAME */ SVBARE,
+  /* FULL_NAME */ "Satp mode bare is supported",
+  /* DESC */ "",
+  /* URL */ ,
+  /* DEP_EXTS */ ({"zicsr"}),
+  /* SUPPORTED_VERSIONS */ ({{1, 0}}),
+  /* FLAG_GROUP */ sv,
+  /* BITMASK_GROUP_ID */ BITMASK_NOT_YET_ALLOCATED,
+  /* BITMASK_BIT_POSITION*/ BITMASK_NOT_YET_ALLOCATED,
+  /* EXTRA_EXTENSION_FLAGS */ 0)
+
 #include "riscv-ext-corev.def"
 #include "riscv-ext-sifive.def"
 #include "riscv-ext-thead.def"
diff --git a/gcc/config/riscv/riscv-ext.opt b/gcc/config/riscv/riscv-ext.opt
index 5e9c5f56ad67..ceffb61e27fd 100644
--- a/gcc/config/riscv/riscv-ext.opt
+++ b/gcc/config/riscv/riscv-ext.opt
@@ -375,6 +375,8 @@ Mask(SVADU) Var(riscv_sv_subext)

 Mask(SVADE) Var(riscv_sv_subext)

+Mask(SVBARE) Var(riscv_sv_subext)
+
 Mask(XCVALU) Var(riscv_xcv_subext)

 Mask(XCVBI) Var(riscv_xcv_subext)
diff --git a/gcc/doc/riscv-ext.texi b/gcc/doc/riscv-ext.texi
index 7a22d841d1b6..f86190f5c242 100644
--- a/gcc/doc/riscv-ext.texi
+++ b/gcc/doc/riscv-ext.texi
@@ -574,6 +574,10 @@
 @tab 1.0
 @tab Cause exception when hardware updating of A/D bits is disabled

+@item svbare
+@tab 1.0
+@tab Satp mode bare is supported
+
 @item xcvalu
 @tab 1.0
 @tab Core-V miscellaneous ALU extension
diff --git a/gcc/testsuite/gcc.target/riscv/arch-59.c 
b/gcc/testsuite/gcc.target/riscv/arch-59.c
new file mode 100644
index ..ea599f20522d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/arch-59.c
@@ -0,0 +1,5 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64i_svbare -mabi=lp64" } */
+int foo()
+{
+}
--
2.43.0



Minor patch committed.

2025-05-29 Thread Jerry D

This was a followup to make the error message a little better.

Committed.

commit c69afa2f1bd7455457ab4e028a6bc51211b2dd20 (HEAD -> master, 
origin/master, origin/HEAD)

Author: Jerry DeLisle 
Date:   Thu May 29 10:02:00 2025 -0700

Fortran: Make minor adjustment to error message.

PR fortran/120049

gcc/fortran/ChangeLog:

* check.cc (check_c_ptr_2): Rephrase error message
for clarity.

gcc/testsuite/ChangeLog:

* gfortran.dg/c_f_pointer_tests_6.f90: Adjust dg-error
directive.




[PATCH] libstdc++: Define flat_set::operator== in terms of ==

2025-05-29 Thread Patrick Palka
Tested on x86_64-pc-linux-gnu, does this look OK for trunk/15?

-- >8 --

... for consistency with the other standard container adaptors
(stack, queue, etc).

libstdc++-v3/ChangeLog:

* include/std/flat_set (_Flat_set_impl::operator==):
Define in terms of ==, not std::equal.
---
 libstdc++-v3/include/std/flat_set | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libstdc++-v3/include/std/flat_set 
b/libstdc++-v3/include/std/flat_set
index c48340d79809..3da8882d154e 100644
--- a/libstdc++-v3/include/std/flat_set
+++ b/libstdc++-v3/include/std/flat_set
@@ -728,7 +728,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   [[nodiscard]]
   friend bool
   operator==(const _Derived& __x, const _Derived& __y)
-  { return std::equal(__x.begin(), __x.end(), __y.begin(), __y.end()); }
+  { return __x._M_cont == __y._M_cont; }
 
   template
[[nodiscard]]
-- 
2.50.0.rc0
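
To illustrate the equivalence the change relies on, here is a small
standalone sketch.  It is not the libstdc++ code: sorted_adaptor_sketch
and equal_via_iterators are hypothetical names, and the adaptor is
reduced to a bare wrapper around its underlying container.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a flat_set-like adaptor that stores its
// elements in an underlying sequence container.
template<typename Container>
struct sorted_adaptor_sketch
{
  Container cont;

  auto begin () const { return cont.begin (); }
  auto end () const { return cont.end (); }

  // New approach: defer to the underlying container's operator==, which
  // may be able to use information iterators do not expose.
  friend bool
  operator== (const sorted_adaptor_sketch &x, const sorted_adaptor_sketch &y)
  { return x.cont == y.cont; }
};

// Old approach: compare element by element through the iterator ranges.
template<typename Container>
bool
equal_via_iterators (const sorted_adaptor_sketch<Container> &x,
		     const sorted_adaptor_sketch<Container> &y)
{ return std::equal (x.begin (), x.end (), y.begin (), y.end ()); }
```

For a standard sequence container such as std::vector, both forms are
expected to agree on every input, so the change should not be
observable apart from performance.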



Re: [PATCH] aarch64: Use LDR for first-element loads for Advanced SIMD

2025-05-29 Thread Richard Sandiford
Dhruv Chawla  writes:
> On 08/05/25 18:43, Richard Sandiford wrote:
>> Otherwise it looks good.  But I think we should think about how we
>> plan to integrate the related optimisation for register inputs.  E.g.:
>> 
>> int32x4_t foo(int32_t x) {
>>  return vsetq_lane_s32(x, vdupq_n_s32(0), 0);
>> }
>> 
>> generates:
>> 
>> foo:
>>  movi v0.4s, 0
>>  ins v0.s[0], w0
>>  ret
>> 
>> rather than a single UMOV.  Same idea when the input is in an FPR rather
>> than a GPR, but using FMOV rather than UMOV.
>> 
>> Conventionally, the register and memory forms should be listed as
>> alternatives in a single pattern, but that's somewhat complex because of
>> the different instruction availability for 64-bit+32-bit, 16-bit, and
>> 8-bit register operations.
>> 
>> My worry is that if we handle the register case as an entirely separate
>> patch, it would have to rewrite this one.
>
> I have been experimenting with this, and yeah, it gets quite messy when
> trying to handle both memory and register cases together. Would it be okay
> to enable the register case only for 64-/32-bit sizes? It would complicate
> the code only a little and could still be done with a single pattern. I've
> attached a patch that does the same.

Unfortunately, I don't think it's that easy -- more below.

> diff --git a/gcc/config/aarch64/aarch64-simd.md 
> b/gcc/config/aarch64/aarch64-simd.md
> index 6e30dc48934..5368b7f21fe 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -1164,6 +1164,24 @@
> [(set_attr "type" "neon_logic")]
>   )
>   
> +(define_insn "*aarch64_simd_vec_set_low"
> +  [(set (match_operand:VALL_F16 0 "register_operand")
> + (vec_merge:VALL_F16
> +   (vec_duplicate:VALL_F16
> + (match_operand: 1 "aarch64_simd_nonimmediate_operand"))
> +   (match_operand:VALL_F16 3 "aarch64_simd_imm_zero")
> +   (match_operand:SI 2 "const_int_operand")))]
> +  "TARGET_SIMD
> +   && ENDIAN_LANE_N (, exact_log2 (INTVAL (operands[2]))) == 0
> +   && (aarch64_simd_mem_operand_p (operands[1]) ||
> +   GET_MODE_UNIT_BITSIZE (mode) >= 32)"

Adding a condition like this isn't safe, because the constraints:

> +  {@ [ cons: =0 , 1   ; attrs: type  ]
> + [ w, w   ; neon_move ] fmov\t%0, %1
> + [ w, r   ; neon_from_gp ] fmov\t%0, %1
> + [ w, Utv ; f_loads  ] ldr\t%0, %1
> +  }
> +)

promise that the w and r sources are supported for all modes.

When reloading an existing instruction, LRA only needs to look at
the instruction's constraints.  It doesn't need to look at the C++
condition or the predicates.

This means that if we provide a w,w alternative for V8HF (say), LRA
might use it in certain cases.  We could then either get wrong code or
an ICE, depending on whether something notices that the instruction no
longer matches its C++ condition.

So even if we just added the 32-bit and 64-bit register cases,
I think we'd need to separate them from the 8-bit and 16-bit cases.
And like I say, the 8/16-bit pattern would then need to be split
and rewritten if the 16-bit register case was added later.

Thanks,
Richard


[pushed] c++: xobj lambda 'this' capture [PR113563]

2025-05-29 Thread Jason Merrill
Tested x86_64-pc-linux-gnu, applying to trunk.

-- 8< --

Various places were still making assumptions that we could get to the 'this'
capture through current_class_ref in a lambda op(), which is incorrect for
an explicit object op().

PR c++/113563

gcc/cp/ChangeLog:

* lambda.cc (build_capture_proxy): Check pointerness of the
member, not the proxy type.
(lambda_expr_this_capture): Don't assume current_class_ref.
(nonlambda_method_basetype): Likewise.
* semantics.cc (finish_non_static_data_member): Don't assume
TREE_TYPE (object) is set.
(finish_this_expr): Check current_class_type for lambda,
not current_class_ref.

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/explicit-obj-lambda16.C: New test.
---
 gcc/cp/lambda.cc  | 12 +++---
 gcc/cp/semantics.cc   | 17 +++-
 .../g++.dg/cpp23/explicit-obj-lambda16.C  | 39 +++
 3 files changed, 50 insertions(+), 18 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp23/explicit-obj-lambda16.C

diff --git a/gcc/cp/lambda.cc b/gcc/cp/lambda.cc
index a2bed9fb36a..34c7defb604 100644
--- a/gcc/cp/lambda.cc
+++ b/gcc/cp/lambda.cc
@@ -442,7 +442,7 @@ build_capture_proxy (tree member, tree init)
 
   type = lambda_proxy_type (object);
 
-  if (name == this_identifier && !INDIRECT_TYPE_P (type))
+  if (name == this_identifier && !INDIRECT_TYPE_P (TREE_TYPE (member)))
 {
   type = build_pointer_type (type);
   type = cp_build_qualified_type (type, TYPE_QUAL_CONST);
@@ -921,8 +921,9 @@ lambda_expr_this_capture (tree lambda, int add_capture_p)
   else
 {
   /* To make sure that current_class_ref is for the lambda.  */
-  gcc_assert (TYPE_MAIN_VARIANT (TREE_TYPE (current_class_ref))
- == LAMBDA_EXPR_CLOSURE (lambda));
+  gcc_assert (!current_class_ref
+ || (TYPE_MAIN_VARIANT (TREE_TYPE (current_class_ref))
+ == LAMBDA_EXPR_CLOSURE (lambda)));
 
   result = this_capture;
 
@@ -1037,12 +1038,9 @@ current_nonlambda_function (void)
 tree
 nonlambda_method_basetype (void)
 {
-  if (!current_class_ref)
-return NULL_TREE;
-
   tree type = current_class_type;
   if (!type || !LAMBDA_TYPE_P (type))
-return type;
+return current_class_ref ? type : NULL_TREE;
 
   while (true)
 {
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 241f2730878..1279d78b186 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -2770,7 +2770,7 @@ finish_non_static_data_member (tree decl, tree object, 
tree qualifying_scope,
   else if (PACK_EXPANSION_P (type))
/* Don't bother trying to represent this.  */
type = NULL_TREE;
-  else if (WILDCARD_TYPE_P (TREE_TYPE (object)))
+  else if (!TREE_TYPE (object) || WILDCARD_TYPE_P (TREE_TYPE (object)))
/* We don't know what the eventual quals will be, so punt until
   instantiation time.
 
@@ -3605,16 +3605,11 @@ finish_this_expr (void)
 {
   tree result = NULL_TREE;
 
-  if (current_class_ptr)
-{
-  tree type = TREE_TYPE (current_class_ref);
-
-  /* In a lambda expression, 'this' refers to the captured 'this'.  */
-  if (LAMBDA_TYPE_P (type))
-result = lambda_expr_this_capture (CLASSTYPE_LAMBDA_EXPR (type), true);
-  else
-result = current_class_ptr;
-}
+  if (current_class_type && LAMBDA_TYPE_P (current_class_type))
+result = (lambda_expr_this_capture
+ (CLASSTYPE_LAMBDA_EXPR (current_class_type), /*add*/true));
+  else if (current_class_ptr)
+result = current_class_ptr;
 
   if (result)
 /* The keyword 'this' is a prvalue expression.  */
diff --git a/gcc/testsuite/g++.dg/cpp23/explicit-obj-lambda16.C 
b/gcc/testsuite/g++.dg/cpp23/explicit-obj-lambda16.C
new file mode 100644
index 000..69936388969
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp23/explicit-obj-lambda16.C
@@ -0,0 +1,39 @@
+// PR c++/113563
+// { dg-do compile { target c++23 } }
+
+struct S {
+  int x_;
+  void f() {
+[this](this auto) {
+  this->x_ = 42;
+  return this;
+}();
+  }
+};
+
+struct R {
+  int x;
+
+  auto foo() {
+return [*this](this auto &self) {
+  this->x = 4;
+};
+  }
+};
+
+
+struct A
+{
+int n;
+void fun()
+{
+auto _ = [&](this auto self) { return n; };
+}
+};
+
+struct B {
+  int i = 42;
+  int foo() {
+return [this](this auto &&self) { auto p = &i; return *p; }();
+  }
+};

base-commit: 5c6364b09a67de8d2237f65016ea1e3365a76e8d
-- 
2.49.0



[PATCH v5 14/24] fmv: Add reject_target_clone hook for filtering target_clone versions.

2025-05-29 Thread Alfie Richards
This patch introduces the TARGET_REJECT_FUNCTION_CLONE_VERSION hook,
which is used to determine whether a target_clones version string parses.

If the hook returns true, a warning is emitted and the version is
ignored from then on.

This is as specified in the Arm C Language Extensions. The purpose is to
allow some portability of code using target_clones attributes.

Currently this is only fully implemented for the AArch64 backend.

For riscv, the only other backend that uses target_version semantics, a
partial implementation is present: the hook is used to check parsing,
but a failed parse emits errors rather than warnings. A refactor of the
riscv parsing logic would be required to enable this functionality fully.

This fixes PR 118339, where parse failures could cause an ICE on AArch64.
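
As a rough illustration of the filtering described above, here is a
minimal standalone sketch. It is not GCC code: reject_hook,
filter_clone_versions and toy_reject are hypothetical names, plain
std::string stands in for string_slice, and the list of "known" features
in the toy hook is made up.

```cpp
#include <cassert>
#include <functional>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical stand-in for the target hook: returns true when VERSION
// should be rejected, i.e. it does not parse for this target.
using reject_hook = std::function<bool (const std::string &)>;

// Keep the versions the hook accepts and warn about each rejected one.
std::vector<std::string>
filter_clone_versions (const std::vector<std::string> &versions,
                       const reject_hook &reject)
{
  assert (static_cast<bool> (reject));
  std::vector<std::string> kept;
  for (const std::string &v : versions)
    {
      if (reject (v))
        std::cerr << "warning: invalid target_clones version '" << v
                  << "' ignored\n";
      else
        kept.push_back (v);
    }
  return kept;
}

// Toy hook (assumption): accepts "default" and a fixed list of names,
// and rejects everything else.
bool
toy_reject (const std::string &version)
{
  static const std::vector<std::string> known
    = { "default", "sve", "dotprod" };
  for (const std::string &k : known)
    if (version == k)
      return false;
  return true;
}
```

Calling filter_clone_versions with { "default", "sve", "bogus" } and
toy_reject would warn about "bogus" and keep the other two, which is the
portability behaviour the hook is meant to allow.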

gcc/ChangeLog:

PR target/118339
* target.def: Add reject_target_clone_version hook.
* tree.cc (get_clone_attr_versions): Add filter and location argument.
(get_clone_versions): Add filter argument.
* tree.h (get_clone_attr_versions): Add filter and location argument.
(get_clone_versions): Add filter argument.
* config/aarch64/aarch64.cc (aarch64_reject_target_clone_version):
New function
(TARGET_REJECT_FUNCTION_CLONE_VERSION): New define.
* config/riscv/riscv.cc (riscv_reject_target_clone_version):
New function.
(TARGET_REJECT_FUNCTION_CLONE_VERSION): New define.
* doc/tm.texi: Regenerated.
* doc/tm.texi.in: Add documentation for new hook.
* hooks.h (hook_stringslice_locationt_false): New function.
* hooks.cc (hook_stringslice_locationt_false): New function.

gcc/c-family/ChangeLog:

* c-attribs.cc (handle_target_clones_attribute): Update to emit warnings
for rejected versions.
---
 gcc/c-family/c-attribs.cc | 26 +-
 gcc/config/aarch64/aarch64.cc | 20 
 gcc/config/riscv/riscv.cc | 18 ++
 gcc/doc/tm.texi   |  5 +
 gcc/doc/tm.texi.in|  2 ++
 gcc/hooks.cc  |  6 ++
 gcc/hooks.h   |  3 +++
 gcc/target.def|  8 
 gcc/tree.cc   | 15 ---
 gcc/tree.h| 15 +++
 10 files changed, 106 insertions(+), 12 deletions(-)

diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index 5dff489fcca..b5287f0da06 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -6132,12 +6132,28 @@ handle_target_clones_attribute (tree *node, tree name, 
tree ARG_UNUSED (args),
}
}
 
-  auto_vec versions= get_clone_attr_versions (args, NULL);
-
-  if (versions.length () == 1)
-   {
+  int num_defaults = 0;
+  auto_vec versions= get_clone_attr_versions (args,
+ &num_defaults,
+ DECL_SOURCE_LOCATION (*node),
+ false);
+
+  for (auto v : versions)
+   if (targetm.reject_function_clone_version
+ (v, DECL_SOURCE_LOCATION (*node)))
  warning (OPT_Wattributes,
-  "single % attribute is ignored");
+  "invalid % version %qB ignored",
+  &v);
+
+  /* Lone target_clones version is always ignored for target attr 
semantics.
+Only ignore under target_version semantics if it is a default
+version.  */
+  if (versions.length () == 1 && (TARGET_HAS_FMV_TARGET_ATTRIBUTE
+ || num_defaults == 1))
+   {
+ if (TARGET_HAS_FMV_TARGET_ATTRIBUTE)
+   warning (OPT_Wattributes,
+"single % attribute is ignored");
  *no_add_attrs = true;
}
   else
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 01fc5538f18..a9571a850f1 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -31516,6 +31516,23 @@ aarch64_expand_fp_spaceship (rtx dest, rtx op0, rtx 
op1, rtx hint)
 }
 }
 
+bool
+aarch64_reject_target_clone_version (string_slice str,
+location_t loc ATTRIBUTE_UNUSED)
+{
+  str = str.strip ();
+
+  if (str == "default")
+return false;
+
+  enum aarch_parse_opt_result parse_res;
+  auto isa_flags = aarch64_asm_isa_flags;
+  parse_res = aarch64_parse_fmv_features (str, &isa_flags, NULL, NULL);
+
+  /* Reject any version which does not parse.  */
+  return parse_res != AARCH_PARSE_OK;
+}
+
 /* Target-specific selftests.  */
 
 #if CHECKING_P
@@ -32339,6 +32356,9 @@ aarch64_libgcc_floating_mode_supported_p
 #undef TARGET_OPTION_FUNCTION_VERSIONS
 #define TARGET_OPTION_FUNCTION_VERSIONS aarch64_common_function_versions
 
+#undef TARGET_REJECT_FUNCTION_CLONE_VERSION
+#define TARGET_REJECT_FUNCTION_CLONE_VERSION 
aarch64_reject_target_clone_version

[PATCH v5 16/24] c/c++: Add target_[version/clones] to decl diagnostics formatting.

2025-05-29 Thread Alfie Richards
Adds the target_version and target_clones attributes to diagnostic
messages under target_version semantics.

The target_version/target_clones attributes affect the identity of a
decl, so they need to be shown in diagnostics that refer to it.

After this change diagnostics look like:

```
test.c:5:7: error: redefinition of ‘[[target_version("sve")]] foo’
5 | float foo () {return 2;}
  |   ^~~
test.c:2:7: note: previous definition of ‘[[target_version("sve")]] foo’ with 
type ‘float(void)’
2 | float foo () {return 1;}
```

This only affects targets which use target_version (aarch64 and riscv).
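
For reference, the prefix format shown above can be sketched with plain
strings. This is only an illustration under assumptions: the function
names below are hypothetical, and the real patch writes to a
c_pretty_printer rather than building a std::string.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Builds the prefix printed before a target_version decl, e.g.
// [[target_version("sve")]] followed by a space.
std::string
format_target_version (const std::string &version)
{
  // This sketch does no escaping, so it assumes no embedded quotes.
  assert (version.find ('"') == std::string::npos);
  if (version.empty ())
    return std::string ();
  return "[[target_version(\"" + version + "\")]] ";
}

// Builds the prefix printed before a target_clones decl, with the
// versions separated by ", ".  Returns an empty string when there are
// no versions, mirroring the early return in the patch.
std::string
format_target_clones (const std::vector<std::string> &versions)
{
  if (versions.empty ())
    return std::string ();

  std::string out = "[[target_clones(";
  for (std::size_t i = 0; i < versions.size (); ++i)
    {
      assert (versions[i].find ('"') == std::string::npos);
      if (i != 0)
        out += ", ";
      out += '"' + versions[i] + '"';
    }
  out += ")]] ";
  return out;
}
```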

gcc/c-family/ChangeLog:

* c-pretty-print.cc (pp_c_function_target_version): New function.
(pp_c_function_target_clones): New function.
* c-pretty-print.h (pp_c_function_target_version): New function.
(pp_c_function_target_clones): New function.

gcc/c/ChangeLog:

* c-objc-common.cc (c_tree_printer): Add printing of target_clone and
target_version in decl diagnostics.

gcc/cp/ChangeLog:

* cxx-pretty-print.h (pp_cxx_function_target_version): New macro.
(pp_cxx_function_target_clones): Ditto.
* error.cc (dump_function_decl): Add printing of target_clone and
target_version in decl diagnostics.
---
 gcc/c-family/c-pretty-print.cc | 65 ++
 gcc/c-family/c-pretty-print.h  |  2 ++
 gcc/c/c-objc-common.cc |  6 
 gcc/cp/cxx-pretty-print.h  |  4 +++
 gcc/cp/error.cc|  3 ++
 5 files changed, 80 insertions(+)

diff --git a/gcc/c-family/c-pretty-print.cc b/gcc/c-family/c-pretty-print.cc
index fad6b5eb9b0..1cb71d99b16 100644
--- a/gcc/c-family/c-pretty-print.cc
+++ b/gcc/c-family/c-pretty-print.cc
@@ -36,6 +36,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "function.h"
 #include "basic-block.h"
 #include "gimple.h"
+#include "tm.h"
 
 /* The pretty-printer code is primarily designed to closely follow
(GNU) C and C++ grammars.  That is to be contrasted with spaghetti
@@ -3054,6 +3055,70 @@ pp_c_tree_decl_identifier (c_pretty_printer *pp, tree t)
   pp_c_identifier (pp, name);
 }
 
+/* Prints "[version: VERSION]" for a versioned function decl.
+   This only works for target_version.  */
+void
+pp_c_function_target_version (c_pretty_printer *pp, tree t)
+{
+  if (TARGET_HAS_FMV_TARGET_ATTRIBUTE)
+return;
+
+  string_slice version = get_target_version (t);
+  if (!version.is_valid ())
+return;
+
+  pp_c_left_bracket (pp);
+  pp_c_left_bracket (pp);
+  pp_string (pp, "target_version");
+  pp_c_left_paren (pp);
+  pp_doublequote (pp);
+  pp_string_n (pp, version.begin (), version.size ());
+  pp_doublequote (pp);
+  pp_c_right_paren (pp);
+  pp_c_right_bracket (pp);
+  pp_c_right_bracket (pp);
+  pp_c_whitespace (pp);
+}
+
+/* Prints "[clones: VERSION, +]" for a versioned function decl.
+   This only works for target_version.  */
+void
+pp_c_function_target_clones (c_pretty_printer *pp, tree t)
+{
+  /* Only print for target_version semantics.
+ This is because for target FMV semantics a target_clone always defines
+ the entire FMV set.  target_version semantics can mix target_clone and
+ target_version decls in the definition of a FMV set and so the
+ target_clone becomes a part of the identity of the declaration.  */
+  if (TARGET_HAS_FMV_TARGET_ATTRIBUTE)
+return;
+
+  auto_vec versions = get_clone_versions (t, NULL, false);
+  if (versions.is_empty ())
+return;
+
+  string_slice final_version = versions.pop ();
+  pp_c_left_bracket (pp);
+  pp_c_left_bracket (pp);
+  pp_string (pp, "target_clones");
+  pp_c_left_paren (pp);
+  for (string_slice version : versions)
+{
+  pp_doublequote (pp);
+  pp_string_n (pp, version.begin (), version.size ());
+  pp_doublequote (pp);
+  pp_string (pp, ",");
+  pp_c_whitespace (pp);
+}
+  pp_doublequote (pp);
+  pp_string_n (pp, final_version.begin (), final_version.size ());
+  pp_doublequote (pp);
+  pp_c_right_paren (pp);
+  pp_c_right_bracket (pp);
+  pp_c_right_bracket (pp);
+  pp_c_whitespace (pp);
+}
+
 #if CHECKING_P
 
 namespace selftest {
diff --git a/gcc/c-family/c-pretty-print.h b/gcc/c-family/c-pretty-print.h
index c8fb6789991..5dc1cdff513 100644
--- a/gcc/c-family/c-pretty-print.h
+++ b/gcc/c-family/c-pretty-print.h
@@ -138,6 +138,8 @@ void pp_c_ws_string (c_pretty_printer *, const char *);
 void pp_c_identifier (c_pretty_printer *, const char *);
 void pp_c_string_literal (c_pretty_printer *, tree);
 void pp_c_integer_constant (c_pretty_printer *, tree);
+void pp_c_function_target_version (c_pretty_printer *, tree);
+void pp_c_function_target_clones (c_pretty_printer *, tree);
 
 void print_c_tree (FILE *file, tree t, dump_flags_t);
 
diff --git a/gcc/c/c-objc-common.cc b/gcc/c/c-objc-common.cc
index 2016eaebf17..84a4ee5fc17 100644
--- a/gcc/c/c-objc-common.cc
+++ b/gcc/c/c-objc-common.cc
@@ -23,6 +23,7 @@ along with GCC; see the file C

[committed] libstdc++: Fix another 17_intro/names.cc failure on AIX

2025-05-29 Thread Jonathan Wakely
FAIL: 17_intro/names.cc  -std=gnu++98 (test for excess errors)

Also fix typo in experimental/names.cc where I did #undef for the wrong
name in r16-901-gd1ced2a5ea6b09.

libstdc++-v3/ChangeLog:

* testsuite/17_intro/names.cc [_AIX] (a): Undefine.
* testsuite/experimental/names.cc [_AIX] (ptr): Undefine.
---

Tested x86_64-linux and powerpc-aix.

Pushed to trunk.

 libstdc++-v3/testsuite/17_intro/names.cc | 2 ++
 libstdc++-v3/testsuite/experimental/names.cc | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/libstdc++-v3/testsuite/17_intro/names.cc 
b/libstdc++-v3/testsuite/17_intro/names.cc
index a61e49dc8191..f32205d9c7f5 100644
--- a/libstdc++-v3/testsuite/17_intro/names.cc
+++ b/libstdc++-v3/testsuite/17_intro/names.cc
@@ -248,6 +248,8 @@
 #undef r
 #undef x
 #undef y
+//  defines drand48_data::a
+#undef a
 //  defines _LC_weight_t::n
 #undef n
 //  defines pollfd_ext::u on AIX 7.3
diff --git a/libstdc++-v3/testsuite/experimental/names.cc 
b/libstdc++-v3/testsuite/experimental/names.cc
index 4bedd530ecc5..94ae76fc610b 100644
--- a/libstdc++-v3/testsuite/experimental/names.cc
+++ b/libstdc++-v3/testsuite/experimental/names.cc
@@ -25,7 +25,7 @@
 
 #ifdef _AIX
 //  declares endnetgrent_r with ptr parameter.
-# undef n
+# undef ptr
 #endif
 
 // Filesystem
-- 
2.49.0



[committed] libstdc++: Re-enable some XPASS tests for AIX

2025-05-29 Thread Jonathan Wakely
The deque shrink_to_fit.cc test always passes on AIX, I think it should
not have been disabled.

The 96088.cc tests pass for C++20 and later (I don't know why) so make
them require C++20, as they fail otherwise.

libstdc++-v3/ChangeLog:

* testsuite/23_containers/deque/capacity/shrink_to_fit.cc:
Remove dg-xfail-run-if for AIX.
* testsuite/23_containers/unordered_map/96088.cc: Replace
dg-xfail-run-if with dg-require-effective-target c++20.
* testsuite/23_containers/unordered_multimap/96088.cc: Likewise.
* testsuite/23_containers/unordered_multiset/96088.cc: Likewise.
* testsuite/23_containers/unordered_set/96088.cc: Likewise.
---

Tested x86_64-linux and powerpc-aix.

Pushed to trunk.

 .../testsuite/23_containers/deque/capacity/shrink_to_fit.cc | 1 -
 libstdc++-v3/testsuite/23_containers/unordered_map/96088.cc | 2 +-
 .../testsuite/23_containers/unordered_multimap/96088.cc | 2 +-
 .../testsuite/23_containers/unordered_multiset/96088.cc | 2 +-
 libstdc++-v3/testsuite/23_containers/unordered_set/96088.cc | 2 +-
 5 files changed, 4 insertions(+), 5 deletions(-)

diff --git 
a/libstdc++-v3/testsuite/23_containers/deque/capacity/shrink_to_fit.cc 
b/libstdc++-v3/testsuite/23_containers/deque/capacity/shrink_to_fit.cc
index 4dbf405d57b8..63717554280c 100644
--- a/libstdc++-v3/testsuite/23_containers/deque/capacity/shrink_to_fit.cc
+++ b/libstdc++-v3/testsuite/23_containers/deque/capacity/shrink_to_fit.cc
@@ -1,6 +1,5 @@
 // { dg-do run { target c++11 } }
 // { dg-require-effective-target std_allocator_new }
-// { dg-xfail-run-if "AIX operator new" { powerpc-ibm-aix* } }
 
 // 2010-01-08  Paolo Carlini  
 
diff --git a/libstdc++-v3/testsuite/23_containers/unordered_map/96088.cc 
b/libstdc++-v3/testsuite/23_containers/unordered_map/96088.cc
index c7dfd4fe1c60..0ec0bba2bba6 100644
--- a/libstdc++-v3/testsuite/23_containers/unordered_map/96088.cc
+++ b/libstdc++-v3/testsuite/23_containers/unordered_map/96088.cc
@@ -1,6 +1,6 @@
 // { dg-do run { target c++17 } }
 // { dg-require-effective-target std_allocator_new }
-// { dg-xfail-run-if "AIX operator new" { powerpc-ibm-aix* } }
+// { dg-require-effective-target c++20 { target powerpc-ibm-aix* } }
 
 // Copyright (C) 2021-2025 Free Software Foundation, Inc.
 //
diff --git a/libstdc++-v3/testsuite/23_containers/unordered_multimap/96088.cc 
b/libstdc++-v3/testsuite/23_containers/unordered_multimap/96088.cc
index 6f9429610775..3c1de3761e65 100644
--- a/libstdc++-v3/testsuite/23_containers/unordered_multimap/96088.cc
+++ b/libstdc++-v3/testsuite/23_containers/unordered_multimap/96088.cc
@@ -1,6 +1,6 @@
 // { dg-do run { target c++17 } }
 // { dg-require-effective-target std_allocator_new }
-// { dg-xfail-run-if "AIX operator new" { powerpc-ibm-aix* } }
+// { dg-require-effective-target c++20 { target powerpc-ibm-aix* } }
 
 // Copyright (C) 2021-2025 Free Software Foundation, Inc.
 //
diff --git a/libstdc++-v3/testsuite/23_containers/unordered_multiset/96088.cc 
b/libstdc++-v3/testsuite/23_containers/unordered_multiset/96088.cc
index 6f79ddf4c152..c016c88c1955 100644
--- a/libstdc++-v3/testsuite/23_containers/unordered_multiset/96088.cc
+++ b/libstdc++-v3/testsuite/23_containers/unordered_multiset/96088.cc
@@ -1,6 +1,6 @@
 // { dg-do run { target c++17 } }
 // { dg-require-effective-target std_allocator_new }
-// { dg-xfail-run-if "AIX operator new" { powerpc-ibm-aix* } }
+// { dg-require-effective-target c++20 { target powerpc-ibm-aix* } }
 
 // Copyright (C) 2021-2025 Free Software Foundation, Inc.
 //
diff --git a/libstdc++-v3/testsuite/23_containers/unordered_set/96088.cc 
b/libstdc++-v3/testsuite/23_containers/unordered_set/96088.cc
index c09e6f747ef0..10838c4d24f8 100644
--- a/libstdc++-v3/testsuite/23_containers/unordered_set/96088.cc
+++ b/libstdc++-v3/testsuite/23_containers/unordered_set/96088.cc
@@ -1,6 +1,6 @@
 // { dg-do run { target c++17 } }
 // { dg-require-effective-target std_allocator_new }
-// { dg-xfail-run-if "AIX operator new" { powerpc-ibm-aix* } }
+// { dg-require-effective-target c++20 { target powerpc-ibm-aix* } }
 
 // Copyright (C) 2021-2025 Free Software Foundation, Inc.
 //
-- 
2.49.0



[PATCH v5 07/24] Change make_attribute to take string_slice.

2025-05-29 Thread Alfie Richards
gcc/ChangeLog:

* attribs.cc (make_attribute): Change arguments.
* attribs.h (make_attribute): Change arguments.

Approved by Richard Sandiford.
---
 gcc/attribs.cc | 16 +---
 gcc/attribs.h  |  2 +-
 2 files changed, 6 insertions(+), 12 deletions(-)

diff --git a/gcc/attribs.cc b/gcc/attribs.cc
index f6667839c01..3fce9d62525 100644
--- a/gcc/attribs.cc
+++ b/gcc/attribs.cc
@@ -1076,21 +1076,15 @@ apply_tm_attr (tree fndecl, tree attr)
it to CHAIN.  */
 
 tree
-make_attribute (const char *name, const char *arg_name, tree chain)
+make_attribute (string_slice name, string_slice arg_name, tree chain)
 {
-  tree attr_name;
-  tree attr_arg_name;
-  tree attr_args;
-  tree attr;
-
-  attr_name = get_identifier (name);
-  attr_arg_name = build_string (strlen (arg_name), arg_name);
-  attr_args = tree_cons (NULL_TREE, attr_arg_name, NULL_TREE);
-  attr = tree_cons (attr_name, attr_args, chain);
+  tree attr_name = get_identifier_with_length (name.begin (), name.size ());
+  tree attr_arg_name = build_string (arg_name.size (), arg_name.begin ());
+  tree attr_args = tree_cons (NULL_TREE, attr_arg_name, NULL_TREE);
+  tree attr = tree_cons (attr_name, attr_args, chain);
   return attr;
 }
 
-
 /* Common functions used for target clone support.  */
 
 /* Comparator function to be used in qsort routine to sort attribute
diff --git a/gcc/attribs.h b/gcc/attribs.h
index 4b946390f76..b8b6838599c 100644
--- a/gcc/attribs.h
+++ b/gcc/attribs.h
@@ -45,7 +45,7 @@ extern bool cxx11_attribute_p (const_tree);
 extern tree get_attribute_name (const_tree);
 extern tree get_attribute_namespace (const_tree);
 extern void apply_tm_attr (tree, tree);
-extern tree make_attribute (const char *, const char *, tree);
+extern tree make_attribute (string_slice, string_slice, tree);
 extern bool attribute_ignored_p (tree);
 extern bool attribute_ignored_p (const attribute_spec *const);
 extern bool any_nonignored_attribute_p (tree);
-- 
2.34.1



[PATCH v5 23/24] c/aarch64: Add FMV diagnostic tests.

2025-05-29 Thread Alfie Richards
Adds some aarch64 C FMV diagnostic tests.

This mostly tests C front end code, but has to be target specific as FMV
requires specifying target extensions.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/mv-and-mvc-error1.c: New test.
* gcc.target/aarch64/mv-and-mvc-error2.c: New test.
* gcc.target/aarch64/mv-and-mvc-error3.c: New test.
* gcc.target/aarch64/mv-error1.c: New test.
* gcc.target/aarch64/mv-error2.c: New test.
* gcc.target/aarch64/mv-error3.c: New test.
* gcc.target/aarch64/mv-error4.c: New test.
* gcc.target/aarch64/mv-error5.c: New test.
* gcc.target/aarch64/mv-error6.c: New test.
* gcc.target/aarch64/mv-error7.c: New test.
* gcc.target/aarch64/mv-error8.c: New test.
* gcc.target/aarch64/mv-error9.c: New test.
* gcc.target/aarch64/mvc-error1.c: New test.
* gcc.target/aarch64/mvc-error2.c: New test.
* gcc.target/aarch64/mvc-warning1.c: New test.
---
 .../gcc.target/aarch64/mv-and-mvc-error1.c|  9 +
 .../gcc.target/aarch64/mv-and-mvc-error2.c|  9 +
 .../gcc.target/aarch64/mv-and-mvc-error3.c|  8 
 gcc/testsuite/gcc.target/aarch64/mv-error1.c  | 18 +
 gcc/testsuite/gcc.target/aarch64/mv-error2.c  |  9 +
 gcc/testsuite/gcc.target/aarch64/mv-error3.c  | 12 +++
 gcc/testsuite/gcc.target/aarch64/mv-error4.c  |  9 +
 gcc/testsuite/gcc.target/aarch64/mv-error5.c  |  8 
 gcc/testsuite/gcc.target/aarch64/mv-error6.c  | 20 +++
 gcc/testsuite/gcc.target/aarch64/mv-error7.c  | 11 ++
 gcc/testsuite/gcc.target/aarch64/mv-error8.c  | 12 +++
 gcc/testsuite/gcc.target/aarch64/mv-error9.c  | 12 +++
 gcc/testsuite/gcc.target/aarch64/mvc-error1.c |  9 +
 gcc/testsuite/gcc.target/aarch64/mvc-error2.c |  9 +
 .../gcc.target/aarch64/mvc-warning1.c | 13 
 15 files changed, 168 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/mv-and-mvc-error1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/mv-and-mvc-error2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/mv-and-mvc-error3.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/mv-error1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/mv-error2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/mv-error3.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/mv-error4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/mv-error5.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/mv-error6.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/mv-error7.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/mv-error8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/mv-error9.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/mvc-error1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/mvc-error2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/mvc-warning1.c

diff --git a/gcc/testsuite/gcc.target/aarch64/mv-and-mvc-error1.c 
b/gcc/testsuite/gcc.target/aarch64/mv-and-mvc-error1.c
new file mode 100644
index 000..f14f7ed8f02
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/mv-and-mvc-error1.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-require-ifunc "" } */
+/* { dg-options "-O0" } */
+
+__attribute__ ((target_version ("dotprod"))) int
+foo () { return 3; } /* { dg-message "previous definition of 
.\\\[\\\[target_version\\(.dotprod.\\)\\\]\\\] foo. with type .int\\(void\\)." 
} */
+
+__attribute__ ((target_clones ("dotprod", "sve"))) int
+foo () { return 1; } /* { dg-error "redefinition of 
.\\\[\\\[target_clones\\(.dotprod., .sve.\\)\\\]\\\] foo." } */
diff --git a/gcc/testsuite/gcc.target/aarch64/mv-and-mvc-error2.c 
b/gcc/testsuite/gcc.target/aarch64/mv-and-mvc-error2.c
new file mode 100644
index 000..b25dcac6b7c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/mv-and-mvc-error2.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-require-ifunc "" } */
+/* { dg-options "-O0" } */
+
+__attribute__ ((target_version ("default"))) int
+foo () { return 1; } /* { dg-message "previous definition of 
.\\\[\\\[target_version\\(.default.\\)\\\]\\\] foo. with type .int\\(void\\)." 
} */
+
+__attribute__ ((target_clones ("dotprod", "sve"))) float
+foo () { return 3; } /* { dg-error "conflicting types for 
.\\\[\\\[target_clones\\(.dotprod., .sve.\\)\\\]\\\] foo.; have 
.float\\(void\\)." } */
diff --git a/gcc/testsuite/gcc.target/aarch64/mv-and-mvc-error3.c 
b/gcc/testsuite/gcc.target/aarch64/mv-and-mvc-error3.c
new file mode 100644
index 000..a6dd7a2dfff
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/mv-and-mvc-error3.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-require-ifunc "" } */
+/* { dg-options "-O0" } */
+
+float foo () { return 1; } /* { dg-message "previous definition of .foo." } */
+
+__attribute__ ((target_clones ("default", "dotprod", "sve"))) float
+foo () { return 3; } /*

[PATCH v5 05/24] Update is_function_default_version to work with target_version.

2025-05-29 Thread Alfie Richards
Notably this respects target_version semantics where an unannotated
function can be the default version.
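
The two sets of semantics can be modelled in isolation roughly as
follows. This is a sketch only: toy_decl, fmv_semantics and
toy_is_default_version are hypothetical names, and the struct fields
stand in for the tree accessors and attribute lookups used in the patch.

```cpp
#include <cassert>
#include <string>

// Which attribute drives function multiversioning on the target.
enum class fmv_semantics { target_attr, target_version_attr };

// Toy model of a decl: whether it is a function, whether it is marked
// as versioned, and the version string of its attribute, if any.
struct toy_decl
{
  bool is_function;
  bool versioned;
  bool has_version_attr;
  std::string version;
};

bool
toy_is_default_version (const toy_decl &decl, fmv_semantics sem)
{
  if (!decl.is_function)
    return false;

  if (sem == fmv_semantics::target_attr)
    {
      // target semantics: only versioned functions qualify, and those
      // are expected to carry the attribute.
      if (!decl.versioned)
        return false;
      assert (decl.has_version_attr);
    }
  else if (!decl.has_version_attr)
    // target_version semantics: an unannotated function is the default.
    return true;

  return decl.version == "default";
}
```

So an unannotated function is a default version under target_version
semantics but not under target semantics, while a function annotated
with "default" is a default version under both.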

gcc/ChangeLog:

* attribs.cc (is_function_default_version): Add target_version logic.

Approved by Richard Sandiford.
---
 gcc/attribs.cc | 27 ---
 1 file changed, 20 insertions(+), 7 deletions(-)

diff --git a/gcc/attribs.cc b/gcc/attribs.cc
index 56dd18c2fa8..f6667839c01 100644
--- a/gcc/attribs.cc
+++ b/gcc/attribs.cc
@@ -1279,18 +1279,31 @@ make_dispatcher_decl (const tree decl)
   return func_decl;
 }
 
-/* Returns true if DECL is multi-versioned using the target attribute, and this
-   is the default version.  This function can only be used for targets that do
-   not support the "target_version" attribute.  */
+/* Returns true if DECL a multiversioned default.
+   With the target attribute semantics, returns true if the function is marked
+   as default with the target version.
+   With the target_version attribute semantics, returns true if the function
+   is either not annotated, or annotated as default.  */
 
 bool
 is_function_default_version (const tree decl)
 {
-  if (TREE_CODE (decl) != FUNCTION_DECL
-  || !DECL_FUNCTION_VERSIONED (decl))
+  tree attr;
+  if (TREE_CODE (decl) != FUNCTION_DECL)
 return false;
-  tree attr = lookup_attribute ("target", DECL_ATTRIBUTES (decl));
-  gcc_assert (attr);
+  if (TARGET_HAS_FMV_TARGET_ATTRIBUTE)
+{
+  if (!DECL_FUNCTION_VERSIONED (decl))
+   return false;
+  attr = lookup_attribute ("target", DECL_ATTRIBUTES (decl));
+  gcc_assert (attr);
+}
+  else
+{
+  attr = lookup_attribute ("target_version", DECL_ATTRIBUTES (decl));
+  if (!attr)
+   return true;
+}
   attr = TREE_VALUE (TREE_VALUE (attr));
   return (TREE_CODE (attr) == STRING_CST
  && strcmp (TREE_STRING_POINTER (attr), "default") == 0);
-- 
2.34.1



[PATCH v5 20/24] aarch64: testsuite: Add diagnostic tests for Aarch64 FMV.

2025-05-29 Thread Alfie Richards
Add tests covering many FMV errors for AArch64, including
redeclaration and mixing target_clones with target_version.

gcc/testsuite/ChangeLog:

* g++.target/aarch64/mv-and-mvc-error1.C: New test.
* g++.target/aarch64/mv-and-mvc-error2.C: New test.
* g++.target/aarch64/mv-and-mvc-error3.C: New test.
* g++.target/aarch64/mv-error1.C: New test.
* g++.target/aarch64/mv-error2.C: New test.
* g++.target/aarch64/mv-error3.C: New test.
* g++.target/aarch64/mv-error4.C: New test.
* g++.target/aarch64/mv-error5.C: New test.
* g++.target/aarch64/mv-error6.C: New test.
* g++.target/aarch64/mv-error7.C: New test.
* g++.target/aarch64/mv-error8.C: New test.
* g++.target/aarch64/mvc-error1.C: New test.
* g++.target/aarch64/mvc-error2.C: New test.
* g++.target/aarch64/mvc-warning1.C: Modified test.
---
 .../g++.target/aarch64/mv-and-mvc-error1.C| 10 +
 .../g++.target/aarch64/mv-and-mvc-error2.C| 10 +
 .../g++.target/aarch64/mv-and-mvc-error3.C|  9 
 gcc/testsuite/g++.target/aarch64/mv-error1.C  | 19 +
 gcc/testsuite/g++.target/aarch64/mv-error2.C  | 10 +
 gcc/testsuite/g++.target/aarch64/mv-error3.C  | 13 
 gcc/testsuite/g++.target/aarch64/mv-error4.C  | 10 +
 gcc/testsuite/g++.target/aarch64/mv-error5.C  |  9 
 gcc/testsuite/g++.target/aarch64/mv-error6.C  | 21 +++
 gcc/testsuite/g++.target/aarch64/mv-error7.C  | 12 +++
 gcc/testsuite/g++.target/aarch64/mv-error8.C  | 13 
 gcc/testsuite/g++.target/aarch64/mvc-error1.C | 10 +
 gcc/testsuite/g++.target/aarch64/mvc-error2.C | 10 +
 .../g++.target/aarch64/mvc-warning1.C | 12 +--
 14 files changed, 166 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/aarch64/mv-and-mvc-error1.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/mv-and-mvc-error2.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/mv-and-mvc-error3.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/mv-error1.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/mv-error2.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/mv-error3.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/mv-error4.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/mv-error5.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/mv-error6.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/mv-error7.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/mv-error8.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/mvc-error1.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/mvc-error2.C

diff --git a/gcc/testsuite/g++.target/aarch64/mv-and-mvc-error1.C 
b/gcc/testsuite/g++.target/aarch64/mv-and-mvc-error1.C
new file mode 100644
index 000..c54e464e402
--- /dev/null
+++ b/gcc/testsuite/g++.target/aarch64/mv-and-mvc-error1.C
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-require-ifunc "" } */
+/* { dg-options "-O0" } */
+/* { dg-additional-options "-Wno-experimental-fmv-target" } */
+
+__attribute__ ((target_version ("dotprod"))) int
+foo () { return 3; } /* { dg-message "previous declaration of 
.\\\[\\\[target_version\\(.dotprod.\\)\\\]\\\] int foo\\(\\)." } */
+
+__attribute__ ((target_clones ("dotprod", "sve"))) int
+foo () { return 1; } /* { dg-error ".\\\[\\\[target_clones\\(.dotprod., 
.sve.\\)\\\]\\\] int foo\\(\\). conflicts for version .dotprod." } */
diff --git a/gcc/testsuite/g++.target/aarch64/mv-and-mvc-error2.C 
b/gcc/testsuite/g++.target/aarch64/mv-and-mvc-error2.C
new file mode 100644
index 000..5cba47e1e48
--- /dev/null
+++ b/gcc/testsuite/g++.target/aarch64/mv-and-mvc-error2.C
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-require-ifunc "" } */
+/* { dg-options "-O0" } */
+/* { dg-additional-options "-Wno-experimental-fmv-target" } */
+
+__attribute__ ((target_version ("default"))) int
+foo () { return 1; } /* { dg-message "old declaration 
.\\\[\\\[target_version\\(.default.\\)\\\]\\\] int foo\\(\\)." } */
+
+__attribute__ ((target_clones ("dotprod", "sve"))) float
+foo () { return 3; } /* { dg-error "ambiguating new declaration of 
.\\\[\\\[target_clones\\(.dotprod., .sve.\\)\\\]\\\] float foo\\(\\)." } */
diff --git a/gcc/testsuite/g++.target/aarch64/mv-and-mvc-error3.C 
b/gcc/testsuite/g++.target/aarch64/mv-and-mvc-error3.C
new file mode 100644
index 000..3738bac7829
--- /dev/null
+++ b/gcc/testsuite/g++.target/aarch64/mv-and-mvc-error3.C
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-require-ifunc "" } */
+/* { dg-options "-O0" } */
+/* { dg-additional-options "-Wno-experimental-fmv-target" } */
+
+float foo () { return 1; } /* { dg-message ".float foo\\(\\). previously 
defined here" } */
+
+__attribute__ ((target_clones ("default", "dotprod", "sve"))) float
+foo () { return 3; } /* { dg-error "redefinition of 
.\\\[\\\[target_clones\\(.default., .dotprod.

[committed] libstdc++: Fix lwg4084.cc test FAIL on AIX

2025-05-29 Thread Jonathan Wakely
On AIX printf formats a quiet NaN as "NaNQ" and it doesn't matter
whether %f or %F is used. Similarly, it always prints "INF" for
infinity, even when %f is used. Adjust a test that currently fails due
to this AIX-specific (and non-conforming) behaviour.
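
A quick way to see what the C library does on a given system is a small
helper like the one below. It is only a sketch (format_fixed is a
hypothetical name), and no particular output is claimed here since the
result depends on the platform's printf.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdio>
#include <string>

// Formats VALUE with "%f" (or "%F" when UPPERCASE is true) using the
// C library, so the result reflects the platform's own printf.
std::string
format_fixed (double value, bool uppercase)
{
  char buf[512];
  int n = uppercase ? std::snprintf (buf, sizeof buf, "%F", value)
                    : std::snprintf (buf, sizeof buf, "%f", value);
  assert (n >= 0);
  if (static_cast<std::size_t> (n) >= sizeof buf)
    return std::string ();  // Truncated; not expected for a double.
  return std::string (buf, static_cast<std::size_t> (n));
}
```

Printing format_fixed (std::nan (""), false) and
format_fixed (HUGE_VAL, false) shows the spelling the local printf uses
for %f, which is what the num_put output in the test is compared with.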

libstdc++-v3/ChangeLog:

* testsuite/22_locale/num_put/put/char/lwg4084.cc [_AIX]: Adjust
expected output for NaN and infinity.
---

Tested x86_64-linux and powerpc-aix.

Pushed to trunk.

 .../testsuite/22_locale/num_put/put/char/lwg4084.cc   | 8 
 1 file changed, 8 insertions(+)

diff --git a/libstdc++-v3/testsuite/22_locale/num_put/put/char/lwg4084.cc 
b/libstdc++-v3/testsuite/22_locale/num_put/put/char/lwg4084.cc
index b7c7da11f863..6ce4e8fe9f71 100644
--- a/libstdc++-v3/testsuite/22_locale/num_put/put/char/lwg4084.cc
+++ b/libstdc++-v3/testsuite/22_locale/num_put/put/char/lwg4084.cc
@@ -20,7 +20,11 @@ test_nan()
   out << ' ' << nan << ' ' << -nan;
   out << std::showpos;
   out << ' ' << nan << ' ' << -nan;
+#ifdef _AIX // non-conforming
+  VERIFY( out.str() == " NaNQ -NaNQ NaNQ -NaNQ NaNQ -NaNQ +NaNQ -NaNQ" );
+#else
   VERIFY( out.str() == " nan -nan NAN -NAN NAN -NAN +NAN -NAN" );
+#endif
 }
 
 void
@@ -36,7 +40,11 @@ test_inf()
   out << ' ' << inf << ' ' << -inf;
   out << std::showpos;
   out << ' ' << inf << ' ' << -inf;
+#ifdef _AIX // non-conforming
+  VERIFY( out.str() == " INF -INF INF -INF INF -INF +INF -INF" );
+#else
   VERIFY( out.str() == " inf -inf INF -INF INF -INF +INF -INF" );
+#endif
 }
 
 int main()
-- 
2.49.0



[PATCH] RISC-V: Imply zicsr for svade and svadu extensions.

2025-05-29 Thread Dongyan Chen
This patch makes the svade and svadu extensions imply zicsr.
According to the riscv-privileged spec, svade and svadu are privileged
extensions, so they should imply zicsr.
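
To show the effect of such a dependency entry, here is a standalone
sketch of expanding implied extensions. It is not how riscv-ext.def is
processed in GCC: dep_table and expand_implied are hypothetical names,
and the table is just a map from an extension to the extensions it
implies.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Maps an extension name to the extensions it implies.
using dep_table = std::map<std::string, std::vector<std::string>>;

// Returns REQUESTED plus everything it implies, directly or indirectly.
std::set<std::string>
expand_implied (const std::set<std::string> &requested,
                const dep_table &deps)
{
  std::set<std::string> enabled;
  std::vector<std::string> worklist (requested.begin (), requested.end ());

  while (!worklist.empty ())
    {
      std::string ext = worklist.back ();
      worklist.pop_back ();

      // Skip extensions that have already been processed.
      if (!enabled.insert (ext).second)
        continue;

      dep_table::const_iterator it = deps.find (ext);
      if (it != deps.end ())
        for (const std::string &implied : it->second)
          worklist.push_back (implied);
    }

  // Every requested extension is part of the result.
  assert (enabled.size () >= requested.size ());
  return enabled;
}
```

With a table in which svadu and svade each map to { "zicsr" },
requesting only svadu yields both svadu and zicsr.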

gcc/ChangeLog:

* config/riscv/riscv-ext.def: Imply zicsr.

---
 gcc/config/riscv/riscv-ext.def | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/config/riscv/riscv-ext.def b/gcc/config/riscv/riscv-ext.def
index dbda8ded3974..2e3c21660184 100644
--- a/gcc/config/riscv/riscv-ext.def
+++ b/gcc/config/riscv/riscv-ext.def
@@ -1915,7 +1915,7 @@ DEFINE_RISCV_EXT(
   /* FULL_NAME */ "Hardware Updating of A/D Bits extension",
   /* DESC */ "",
   /* URL */ ,
-  /* DEP_EXTS */ ({}),
+  /* DEP_EXTS */ ({"zicsr"}),
   /* SUPPORTED_VERSIONS */ ({{1, 0}}),
   /* FLAG_GROUP */ sv,
   /* BITMASK_GROUP_ID */ BITMASK_NOT_YET_ALLOCATED,
@@ -1928,7 +1928,7 @@ DEFINE_RISCV_EXT(
   /* FULL_NAME */ "Cause exception when hardware updating of A/D bits is 
disabled",
   /* DESC */ "",
   /* URL */ ,
-  /* DEP_EXTS */ ({}),
+  /* DEP_EXTS */ ({"zicsr"}),
   /* SUPPORTED_VERSIONS */ ({{1, 0}}),
   /* FLAG_GROUP */ sv,
   /* BITMASK_GROUP_ID */ BITMASK_NOT_YET_ALLOCATED,
--
2.43.0



Re: [PATCH RFA (diagnostic)] c++: modules and #pragma diagnostic

2025-05-29 Thread Jason Merrill

On 5/27/25 5:12 PM, Jason Merrill wrote:

On 5/27/25 4:47 PM, Jason Merrill wrote:

On 5/27/25 1:33 PM, David Malcolm wrote:

On Fri, 2025-05-23 at 16:58 -0400, Jason Merrill wrote:

On 4/14/25 9:57 AM, Jason Merrill wrote:

On 1/9/25 10:00 PM, Jason Merrill wrote:

Tested x86_64-pc-linux-gnu.  Is the diagnostic.h change OK for
trunk?



To respect the #pragma diagnostic lines in libstdc++ headers when
compiling
with module std, we need to represent them in the module.

I think it's reasonable to make module_state a friend of
diagnostic_option_classifier to allow direct access to the data.
This
is a
different approach from how Jakub made PCH streaming members of
diagnostic_option_classifier, but it seems to me that modules
handling
belongs in module.cc.


Putting it in module.cc looks good to me, though perhaps it should be
just a friend of diagnostic_option_classifier but not of
diagnostic_context?  Could the functions take a
diagnostic_option_classifier rather than a diagnostic_context?
diagnostic_context is something of a "big blob" of a class.


The friend in diagnostic_context is to be able to name 
m_option_classifier.  We could instead make that member public?


Thoughts?  The functions could take the _classifier, or even just the 
m_classification_history, but that just moves the access problem into 
their callers, who would still need some way to get there from the 
diagnostic_context.  Do you have another idea for that?


Jason


[...snip...]

+  bytes_out sec (to);
+  if (sec.streaming_p ())
+    sec.begin ();


I confess I don't fully understand the module code yet - in particular
the streaming vs non-streaming distinction.  What are the "if
(sec.streaming_p ())" guards doing here?  It looks like it can be false if
the param "elf_out *to" is null (can that happen?), and if it's false,
then this function essentially becomes a no-op.  Is that what we want?


Hmm, perhaps an early if (!sec.streaming_p ()) return would be 
simpler, I'll try that.


That breaks, apparently because we need the early calls to 
write_location to record that we need to represent these locations.


Jason




Re: [PATCH] c++, coroutines: Fix identification of coroutine ramps [PR120453].

2025-05-29 Thread Jason Merrill

On 5/29/25 6:56 AM, Iain Sandoe wrote:

Tested on x86_64-darwin, confirming that the original reported code
fails without the change here.  Unfortunately, if we add a move
constructor to the reduced case, it no longer fails on unpatched
trunk - so not proposing to add that as a testcase (since it tests
something unrelated to coroutines).


Actually, I think I was wrong and the reduced testcase is fine after all 
-- the problem was indeed of trying to check the conversion for a class 
without a move constructor, when we want to avoid that check with NRVO 
in a ramp.


In the original testcase TaskBase has a move constructor, but LazyTask 
still doesn't.  Adding a move constructor to TaskBase and a destructor 
to LazyTask in the reduced testcase would more precisely preserve the 
situation, but I think the reduced testcase without changes is good enough.


So please do add the testcase.  OK with that change.


OK for trunk?
thanks
Iain

--- 8< ---

The existing implementation, incorrectly, tried to use DECL_RAMP_FN
in check_return_expr to determine if we are handling a ramp func.
However, that query is only set for the resume/destroy functions.

Replace the use of DECL_RAMP_FN with a new query.
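
A toy model of the distinction, for readers who do not have cp-tree.h to hand. It is not GCC's tree representation: fn_decl, old_check and is_ramp are hypothetical names, and the only assumption carried over from the description above is that the ramp link is set on the helper functions and not on the ramp itself.

```cpp
// Toy stand-in for a FUNCTION_DECL.
struct fn_decl
{
  bool is_coroutine;    // plays the role of DECL_COROUTINE_P
  const fn_decl *ramp;  // plays the role of DECL_RAMP_FN: set on helpers only
};

// The old test: non-null only for the resume/destroy helpers, so it never
// fires for the ramp itself.
inline bool old_check(const fn_decl& f)
{
  return f.ramp != nullptr;
}

// The new query: a coroutine that has no link back to a ramp is the ramp.
inline bool is_ramp(const fn_decl& f)
{
  return f.is_coroutine && !f.ramp;
}
```

In this model old_check is false for exactly the function that check_return_expr wanted to recognise, which is the bug being described.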

PR c++/120453

gcc/cp/ChangeLog:

* cp-tree.h (DECL_RAMP_P): New.
* typeck.cc (check_return_expr): Use DECL_RAMP_P instead
of DECL_RAMP_FN.

Signed-off-by: Iain Sandoe 
---
  gcc/cp/cp-tree.h | 4 
  gcc/cp/typeck.cc | 2 +-
  2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 19c0b452d86..d9fc80b92e5 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -5522,6 +5522,10 @@ decl_template_parm_check (const_tree t, const char *f, 
int l, const char *fn)
  #define DECL_RAMP_FN(NODE) \
(coro_get_ramp_function (NODE))
  
+/* For a FUNCTION_DECL this is true if it is a coroutine ramp.  */

+#define DECL_RAMP_P(NODE) \
+  DECL_COROUTINE_P (NODE) && !DECL_RAMP_FN (NODE)
+
  /* True for an OMP_ATOMIC that has dependent parameters.  These are stored
 as an expr in operand 1, and integer_zero_node or clauses in operand 0.  */
  #define OMP_ATOMIC_DEPENDENT_P(NODE) \
diff --git a/gcc/cp/typeck.cc b/gcc/cp/typeck.cc
index af2cbaff8fd..ac1eb397f01 100644
--- a/gcc/cp/typeck.cc
+++ b/gcc/cp/typeck.cc
@@ -11466,7 +11466,7 @@ check_return_expr (tree retval, bool *no_warning, bool 
*dangling)
/* Don't check copy-initialization for NRV in a coroutine ramp; we
 implement this case as NRV, but it's specified as directly
 initializing the return value from get_return_object().  */
-  if (DECL_RAMP_FN (current_function_decl) && named_return_value_okay_p)
+  if (DECL_RAMP_P (current_function_decl) && named_return_value_okay_p)
converted = true;
  
/* First convert the value to the function's return type, then




Re: [PATCH] ipa: When inlining, don't combine PT JFs changing signedness (PR120295)

2025-05-29 Thread Jan Hubicka
> Hi,
> 
> in GCC 15 we allowed jump-function generation code to skip over a
> type-cast converting one integer to another as long as the latter can
> hold all the values of the former or has at least the same precision.
> This works well for IPA-CP where we do then evaluate each jump
> function as we propagate values and value-ranges.  However, the
> test-case in PR 120295 shows a problem with inlining, where we combine
> pass-through jump-functions so that they are always relative to the
> function which is the root of the inline tree.  Unfortunately, we are
> happy to combine also those with type-casts to a different signedness
> which makes us use zero extension for the expected value ranges
> where we should have used sign extension.  When the value-range which
> then leads to wrong insertion of a call to builtin_unreachable is
> being computed, the information about the existence of an intermediary
> signed type has already been lost during previous inlining.
> 
> This patch simply blocks combining such jump-functions so that it is
> back-portable to GCC 15.  Once we switch pass-through jump functions
> to use a vector of operations rather than having room for just one, we
> will be able to address this situation with adding an extra conversion
> instead.
> 
> Bootstrapped and LTO-bootstrapped on x86_64-linux.  OK for master and
> gcc-15 branch?
> 
> Thanks,
> 
> Martin
> 
> 
> gcc/ChangeLog:
> 
> 2025-05-19  Martin Jambor  
> 
>   PR ipa/120295
>   * ipa-prop.cc (update_jump_functions_after_inlining): Do not
>   combine pass-through jump functions with type-casts changing
>   signedness.
> 
> gcc/testsuite/ChangeLog:
> 
> 2025-05-19  Martin Jambor  
> 
>   PR ipa/120295
>   * gcc.dg/ipa/pr120295.c: New test.
OK,
thanks!
Honza
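
For archive readers, the sign-extension versus zero-extension difference described in the quoted message can be shown with plain integer conversions. This is only a sketch with hypothetical function names; it does not reproduce the test-case from the PR or anything in ipa-prop.cc.

```cpp
#include <cstdint>

// The value passes through a signed 8-bit intermediary before widening,
// so the widening is a sign extension.
inline std::int32_t widen_via_signed(std::uint8_t v)
{
  // Modular conversion: guaranteed since C++20, and what GCC documents
  // for earlier language modes.  0xFF becomes -1.
  std::int8_t mid = static_cast<std::int8_t>(v);
  return mid; // sign-extended to 32 bits
}

// What results if the signed intermediary is forgotten and the original
// unsigned value is widened directly: a zero extension.
inline std::int32_t widen_direct(std::uint8_t v)
{
  return v; // 0xFF becomes 255
}
```

The two functions agree for inputs up to 0x7F and disagree above that, so a value range computed with the wrong kind of extension can exclude values that really occur.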


[PATCH] c++, coroutines: Simplify initial_await_resume_called.

2025-05-29 Thread Iain Sandoe
Tested on x86_64-darwin, powerpc64le-linux, OK for trunk?
thanks
Iain

--- 8< ---

We do not need to generate this code early, since it does not affect
any of the analysis.  Lowering it later takes less code, and avoids
modifying the initial await expression which will simplify changes
to analysis to deal with open PRs.

gcc/cp/ChangeLog:

* coroutines.cc (expand_one_await_expression): Set the
initial_await_resume_called flag here.
(build_actor_fn): Populate the frame accessor for the
initial_await_resume_called flag.
(cp_coroutine_transform::wrap_original_function_body): Do
not modify the initial_await expression to include the
initial_await_resume_called flag here.

Signed-off-by: Iain Sandoe 
---
 gcc/cp/coroutines.cc | 43 ---
 1 file changed, 16 insertions(+), 27 deletions(-)

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index 64a0a344349..c1c10782906 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -2027,8 +2027,10 @@ expand_one_await_expression (tree *expr, tree 
*await_expr, void *d)
   tree awaiter_calls = TREE_OPERAND (saved_co_await, 3);
 
   tree source = TREE_OPERAND (saved_co_await, 4);
-  bool is_final = (source
-  && TREE_INT_CST_LOW (source) == (int) FINAL_SUSPEND_POINT);
+  bool is_final
+= (source && TREE_INT_CST_LOW (source) == (int) FINAL_SUSPEND_POINT);
+  bool is_initial
+= (source && TREE_INT_CST_LOW (source) == (int) INITIAL_SUSPEND_POINT);
 
   /* Build labels for the destinations of the control flow when we are resuming
  or destroying.  */
@@ -2156,6 +2158,13 @@ expand_one_await_expression (tree *expr, tree 
*await_expr, void *d)
   /* Resume point.  */
   add_stmt (build_stmt (loc, LABEL_EXPR, resume_label));
 
+  if (is_initial && data->i_a_r_c)
+{
+  r = cp_build_modify_expr (loc, data->i_a_r_c, NOP_EXPR, 
boolean_true_node,
+   tf_warning_or_error);
+  finish_expr_stmt (r);
+}
+
   /* This will produce the value (if one is provided) from the co_await
  expression.  */
   tree resume_call = TREE_VEC_ELT (awaiter_calls, 2); /* await_resume().  */
@@ -2670,8 +2679,12 @@ build_actor_fn (location_t loc, tree coro_frame_type, 
tree actor, tree fnbody,
 
   /* We've now rewritten the tree and added the initial and final
  co_awaits.  Now pass over the tree and expand the co_awaits.  */
+  tree i_a_r_c = NULL_TREE;
+  if (flag_exceptions)
+i_a_r_c = coro_build_frame_access_expr (actor_frame, coro_frame_i_a_r_c_id,
+  false, tf_warning_or_error);
 
-  coro_aw_data data = {actor, actor_fp, resume_idx_var, NULL_TREE,
+  coro_aw_data data = {actor, actor_fp, resume_idx_var, i_a_r_c,
   ash, del_promise_label, ret_label,
   continue_label, restart_dispatch_label, continuation, 2};
   cp_walk_tree (&actor_body, await_statement_expander, &data, NULL);
@@ -4449,30 +4462,6 @@ cp_coroutine_transform::wrap_original_function_body ()
   tree tcb = build_stmt (loc, TRY_BLOCK, NULL_TREE, NULL_TREE);
   add_stmt (tcb);
   TRY_STMTS (tcb) = push_stmt_list ();
-  if (initial_await != error_mark_node)
-   {
- /* Build a compound expression that sets the
-initial-await-resume-called variable true and then calls the
-initial suspend expression await resume.
-In the case that the user decides to make the initial await
-await_resume() return a value, we need to discard it and, it is
-a reference type, look past the indirection.  */
- if (INDIRECT_REF_P (initial_await))
-   initial_await = TREE_OPERAND (initial_await, 0);
- /* In the case that the initial_await returns a target expression
-we might need to look through that to update the await expr.  */
- tree iaw = initial_await;
- if (TREE_CODE (iaw) == TARGET_EXPR)
-   iaw = TARGET_EXPR_INITIAL (iaw);
- gcc_checking_assert (TREE_CODE (iaw) == CO_AWAIT_EXPR);
- tree vec = TREE_OPERAND (iaw, 3);
- tree aw_r = TREE_VEC_ELT (vec, 2);
- aw_r = convert_to_void (aw_r, ICV_STATEMENT, tf_warning_or_error);
- tree update = build2 (MODIFY_EXPR, boolean_type_node, i_a_r_c,
-   boolean_true_node);
- aw_r = cp_build_compound_expr (update, aw_r, tf_warning_or_error);
- TREE_VEC_ELT (vec, 2) = aw_r;
-   }
   /* Add the initial await to the start of the user-authored function.  */
   finish_expr_stmt (initial_await);
   /* Append the original function body.  */
-- 
2.39.2 (Apple Git-143)



Re: [PATCH] libstdc++: Compare keys and values separately in flat_map::operator==

2025-05-29 Thread Patrick Palka
On Thu, 29 May 2025, Tomasz Kaminski wrote:

> 
> 
> On Thu, May 29, 2025 at 4:37 PM Tomasz Kaminski  wrote:
> 
> 
> On Thu, May 29, 2025 at 3:56 PM Patrick Palka  wrote:
>   Tested on x86_64-pc-linux-gnu, does this look OK for trunk/15?
> 
>   -- >8 --
> 
>   Instead of effectively doing a zipped comparison of the keys and values,
>   compare them separately to leverage the underlying containers' optimized
>   equality implementations.
> 
>   libstdc++-v3/ChangeLog:
> 
>           * include/std/flat_map (_Flat_map_impl::operator==): Compare
>           keys and values separately.
>   ---
>    libstdc++-v3/include/std/flat_map | 5 -
>    1 file changed, 4 insertions(+), 1 deletion(-)
> 
>   diff --git a/libstdc++-v3/include/std/flat_map 
> b/libstdc++-v3/include/std/flat_map
>   index c0716d12412a..134307324190 100644
>   --- a/libstdc++-v3/include/std/flat_map
>   +++ b/libstdc++-v3/include/std/flat_map
>   @@ -873,7 +873,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>          [[nodiscard]]
>          friend bool
>          operator==(const _Derived& __x, const _Derived& __y)
>   -      { return std::equal(__x.begin(), __x.end(), __y.begin(), 
> __y.end()); }
>   +      {
>   +       return __x._M_cont.keys == __y._M_cont.keys
>   +         && __x._M_cont.values == __y._M_cont.values;
> 
> Previously we supported containers that do not have operator==, by calling 
> equal.
> For the flat_set we also do not compare the containers. I would suggest using 
> in both:
>   ranges::equal(x._M_cont)
> Or using == on containers in both flat_map and flat_set.
> 
> queue and stack use operator== for the containers, so I think we should use
> == on containers in both.

Using operator== in flat_set for consistency with the other container
adaptors makes sense to me.
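
A minimal sketch of the two comparison strategies under discussion, assuming a stand-in storage type with two std::vector members. The names flat_storage, equal_zipped and equal_separate are hypothetical and this is not the libstdc++ implementation.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the adaptor's storage: two parallel containers.
template<typename K, typename V>
struct flat_storage
{
  std::vector<K> keys;
  std::vector<V> values;
};

// "Zipped" comparison: walks both maps one key/value pair at a time.
template<typename K, typename V>
bool equal_zipped(const flat_storage<K, V>& x, const flat_storage<K, V>& y)
{
  if (x.keys.size() != y.keys.size())
    return false;
  for (std::size_t i = 0; i < x.keys.size(); ++i)
    if (!(x.keys[i] == y.keys[i]) || !(x.values[i] == y.values[i]))
      return false;
  return true;
}

// Separate comparison: delegates to each container's own operator==,
// which requires the containers to provide one.
template<typename K, typename V>
bool equal_separate(const flat_storage<K, V>& x, const flat_storage<K, V>& y)
{
  return x.keys == y.keys && x.values == y.values;
}
```

Both give the same answer when the key and value containers have equal lengths; the second form is the one that depends on the underlying containers having operator==, which is the compatibility point raised above.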

> +      }
> 
>        template
>         [[nodiscard]]
> --
> 2.50.0.rc0
> 
> 
> 

[PATCH v2] libstdc++: Implement LWG 2439 for std::unique_copy [PR120386]

2025-05-29 Thread Jonathan Wakely
The current overload set for __unique_copy handles three cases:

- The input range uses forward iterators, the output range does not.
  This is the simplest case, and can just compare adjacent elements of
  the input range.

- Neither the input range nor output range use forward iterators.
  This requires a local variable copied from the input range and updated
  by assigning each element to the local variable.

- The output range uses forward iterators.
  For this case we compare the current element from the input range with
  the element just written to the output range.

There are two problems with this implementation. Firstly, the third case
assumes that the value type of the output range can be compared to the
value type of the input range, which might not be possible at all, or
might be possible but give different results to comparing elements of
the input range. This is the problem identified in LWG 2439.

Secondly, the third case is used when both ranges use forward iterators,
even though the first case could (and should) be used. This means that
we compare elements from the output range instead of the input range,
with the problems described above (either not well-formed, or might give
the wrong results).

The cause of the second problem is that the overload for the first case
looks like:

OutputIterator
__unique_copy(ForwardIter, ForwardIter, OutputIterator, BinaryPred,
  forward_iterator_tag, output_iterator_tag);

When the output range uses forward iterators this overload cannot be
used, because forward_iterator_tag does not inherit from
output_iterator_tag, so is not convertible to it.
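
The convertibility point can be checked in isolation. The three pick overloads below are hypothetical and only mirror the shape of the three cases listed at the top of this message; the return values identify which case overload resolution selects.

```cpp
#include <iterator>
#include <type_traits>

// Case 1: forward input, non-forward output.
inline int pick(std::forward_iterator_tag, std::output_iterator_tag) { return 1; }
// Case 2: neither range uses forward iterators.
inline int pick(std::input_iterator_tag, std::output_iterator_tag) { return 2; }
// Case 3: forward output.
inline int pick(std::input_iterator_tag, std::forward_iterator_tag) { return 3; }

// forward_iterator_tag derives from input_iterator_tag only, so there is
// no conversion from it to output_iterator_tag.
static_assert(!std::is_convertible<std::forward_iterator_tag,
                                   std::output_iterator_tag>::value,
              "forward tag must not convert to output tag");
```

Calling pick with two forward_iterator_tag arguments can only select the third overload, which is how the forward/forward combination ends up in the case that compares against the output range.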

To fix these problems we need to implement the resolution of LWG 2439 so
that the third case is only used when the value types of the two ranges
are the same. This ensures that the comparisons are well behaved. We
also need to ensure that the first case is used when both ranges use
forward iterators.

This change replaces a single step of tag dispatching to choose between
three overloads with two steps of tag dispatching, choosing between two
overloads at each step. The first step dispatches based on the iterator
category of the input range, ignoring the category of the output range.
The second step only happens when the input range uses non-forward
iterators, and dispatches based on the category of the output range and
whether the value type of the two ranges is the same. So now the cases
that are handled are:

- The input range uses forward iterators.
- The output range uses non-forward iterators or a different value type.
- The output range uses forward iterators and has the same value type.
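
The two-step dispatch just described can be sketched as follows. The names step1, step2 and which_case are hypothetical (the patch uses __unique_copy and __unique_copy_1), and the functions return the case number instead of doing any copying.

```cpp
#include <iterator>
#include <type_traits>

// Step 2, reached only for non-forward input.  true_type means the output
// range uses forward iterators and has the same value type as the input.
inline int step2(std::true_type)  { return 3; }
inline int step2(std::false_type) { return 2; }

// Step 1: look at the input category only.
template<typename In, typename Out>
int step1(std::forward_iterator_tag)
{
  return 1;
}

template<typename In, typename Out>
int step1(std::input_iterator_tag)
{
  using out_cat = typename std::iterator_traits<Out>::iterator_category;
  using in_val = typename std::iterator_traits<In>::value_type;
  using out_val = typename std::iterator_traits<Out>::value_type;
  using use_output
    = std::integral_constant<bool,
        std::is_base_of<std::forward_iterator_tag, out_cat>::value
        && std::is_same<in_val, out_val>::value>;
  return step2(use_output{});
}

// Entry point: returns which of the three cases would handle In/Out.
template<typename In, typename Out>
int which_case()
{
  using in_cat = typename std::iterator_traits<In>::iterator_category;
  return step1<In, Out>(in_cat{});
}
```

For example, which_case<int*, int*>() selects case 1 because the input category alone decides the first step, while an istream_iterator<int> input paired with a long* output falls back to case 2 because the value types differ.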

For the second case, the old code used __gnu_cxx::__ops::__iter_comp_val
to wrap the predicate in another level of indirection. That seems
unnecessary, as we can just use a pointer to the local variable instead
of an iterator referring to it.

During review of this patch, it was discovered that all known
implementations of std::unique_copy and ranges::unique_copy (except
cmcstl2) disagree with the specification. The standard (and the SGI STL
documentation) say that it uses pred(*i, *(i-1)) but everybody uses
pred(*(i-1), *i) instead, and apparently always has done. This patch
adjusts ranges::unique_copy to be consistent.

In the first __unique_copy overload, the local copy of the iterator is
changed to be the previous position not the next one, so that we use
++first as the "next" iterator, consistent with the logic used in the
other overloads. This makes it easier to compare them, because we aren't
using pred(*first, *next) in one and pred(something, *first) in the
others. Instead it's always pred(something, *first).
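
To make the argument-order question observable, here is a small hand-written sketch (not libstdc++ code; unique_copy_sketch and its PrevFirst parameter are hypothetical). It deliberately uses a predicate that is not an equivalence relation, purely so that the two orders give different results.

```cpp
#include <vector>

// Copies IN to the result, skipping an element when the predicate says it
// matches the last element kept.  PrevFirst selects the argument order:
// pred(kept, cur) when true, pred(cur, kept) when false.
template<bool PrevFirst, typename T, typename Pred>
std::vector<T> unique_copy_sketch(const std::vector<T>& in, Pred pred)
{
  std::vector<T> out;
  for (const T& cur : in)
    {
      bool dup = !out.empty()
                 && (PrevFirst ? pred(out.back(), cur)
                               : pred(cur, out.back()));
      if (!dup)
        out.push_back(cur);
    }
  return out;
}
```

With the input {1, 2, 1} and a "less than" predicate, the previous-first order keeps {1, 1} and the current-first order keeps {1, 2}. With a symmetric predicate the two orders are indistinguishable, which may be why the discrepancy went unnoticed for so long.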

libstdc++-v3/ChangeLog:

PR libstdc++/120386
* include/bits/ranges_algo.h (__unique_copy_fn): Reorder
arguments for third case to match the first two cases.
* include/bits/stl_algo.h (__unique_copy): Replace three
overloads with two, depending only on the iterator category of
the input range.  Dispatch to __unique_copy_1 for the
non-forward case.
(__unique_copy_1): New overloads for the case where the input
range uses non-forward iterators.
(unique_copy): Only pass the input range category to
__unique_copy.
* testsuite/25_algorithms/unique_copy/lwg2439.cc: New test.
---

Patch v2 adds a new test, and doesn't reorder the predicate arguments to
match the standard, instead reordering the one inconsistent case in
ranges::unique_copy. And I've submitted a library issue to change the
standard: https://cplusplus.github.io/LWG/issue4269

Tested x86_64-linux.

 libstdc++-v3/include/bits/ranges_algo.h   |   4 +-
 libstdc++-v3/include/bits/stl_algo.h  |  86 ++--
 .../25_algorithms/unique_copy/lwg2439.cc  | 127 ++
 3 files changed, 176 insertions(+), 41 deletions(-)
 create mode 100644 libstdc++-v3/testsuite/25_algorithms/unique_copy/lwg2439.cc

diff --git a/libstdc++-v3/incl

Re: [PATCH] c: fix ICE for mutually recursive structures [PR120381]

2025-05-29 Thread Joseph Myers
On Thu, 29 May 2025, Martin Uecker wrote:

> 
> This is a fun one. 
> 
> Bootstrapped and regression tested for x86_64.
> 
> Martin
> 
> 
> c: fix ICE for mutually recursive structures [PR120381]
> 
> For invalid nesting of a structure definition in a definition
> of itself or when using a rather obscure construction using statement
> expressions, we can create mutually recursive pairs of non-identical
> but compatible structure types.  This can lead to invalid composite
> types and an ICE.  If we detect recursion even for swapped pairs
> when forming composite types, this is avoided.

OK.

-- 
Joseph S. Myers
josmy...@redhat.com

Re: [PATCH] c: fix ICE related to tagged types with attributes in diagnostics [PR120380]

2025-05-29 Thread Joseph Myers
On Thu, 29 May 2025, Martin Uecker wrote:

> get_aka_type will create a new type for diagnostics, but for tagged types
> attributes will then be ignored with a warning.  This can lead to 
> reentering
> warning code which leads to an ICE.  Fix this by ignoring the attributes
> for tagged types.

> +  /* For tagged types ignore qualifiers here because the will
> + otherwise be ignored later causing a warning inside diagnostics
> + which leads to an ICE.  */

Do you mean ignore attributes (as in the proposed commit message) or 
qualifiers (as in the comment)?  Also, "the will" -> "they will" (I 
think).

-- 
Joseph S. Myers
josmy...@redhat.com



Re: [PATCH v3] rs6000: Adding missed ISA 3.0 atomic memory operation instructions.

2025-05-29 Thread Segher Boessenkool
Hi!

On Thu, May 29, 2025 at 10:36:12AM +0530, jeevitha wrote:
> Changes to amo.h include the addition of the following load atomic operations:
> Compare and Swap Not Equal, Fetch and Increment Bounded, Fetch and Increment
> Equal, and Fetch and Decrement Bounded. Additionally, Store Twin is added for
> store atomic operations.
> 
> 2025-05-29  Peter Bergner  
> 
> gcc/
>   * config/rs6000/amo.h: Add missing atomic memory operations.
>   * doc/extend.texi (PowerPC Atomic Memory Operation Functions):
>   Document new functions.

Add yourself to authors as well?

> +/* Implementation of the LWAT/LDAT operations that take two input registers
> +   and modify one word or double-word of memory and return the value that was
> +   previously in the memory location.  The destination and two source
> +   registers are encoded with only one register number, so we need three
> +   consecutive GPR registers and there is no C/C++ type that will give
> +   us that, so we have to use register asm variables to achieve that.

That is the easiest (and probably best) way to do it, yeah.  Always
using the same hardcoded register numbers (8..10) isn't a big problem
the way these functions are used (if they were used more often you
firstly don't want hardcoded #s so you can use multiple of those at the
same spot in the program, and secondly, hardcoding can restrict
scheduling, which in the end probably will mean the compiler will do a
whole bunch of register moves).

> +   The LWAT/LDAT opcode requires the address to be a single register,
> +   and that points to a suitably aligned memory location.  Asm volatile
> +   is used to prevent the optimizer from moving the operation.  */

That is not what asm volatile does.  asm volatile means the asm has an
unspecified side effect, so the asm instruction(s) should appear in
the output file exactly as in the input file, the asm cannot be
optimised away (if the output reg is unused), nor can two identical asms
(with the same inputs and outputs) be executed as just one insn, etc.

Typically you put "asm volatile" on instructions that touch memory with
some side effect (such as here), or on a "darn" or timestamp or similar
insn.

asm volatile does *not* mean an instruction cannot be moved, whatever
that may mean even.

"Load atomic has a side effect, so mark the asm as volatile"?  Something
like that.
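
A target-independent sketch of the distinction, using an empty asm template so that it builds on any GCC-compatible compiler. The function names are hypothetical and nothing here is taken from amo.h.

```cpp
// Plain asm: the compiler may treat it as a pure function of its operands.
// If the result were unused it could be deleted, and two identical copies
// could be merged into one.
inline int passthrough(int x)
{
  __asm__ ("" : "+r" (x));
  return x;
}

// asm volatile: declares an unspecified side effect, so the statement is
// neither deleted when its result is unused nor merged with an identical
// one.  It can still be moved relative to other code.
inline int passthrough_volatile(int x)
{
  __asm__ __volatile__ ("" : "+r" (x));
  return x;
}
```

Both functions return their argument unchanged; the difference lies only in what the optimizers are allowed to assume about the statement.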

> +#define _AMO_LD_CMPSWP(NAME, TYPE, OPCODE, FC)   
> \
> +static __inline__ TYPE   
> \
> +NAME (TYPE *_PTR, TYPE _COND, TYPE _VALUE)   \

Please call it _ADDR here as well?  Or name things "ptr" elsewhere as
well, but addr is a better name.

> +{\
> +  register TYPE _ret asm ("r8"); \
> +  register TYPE _cond asm ("r9") = _COND;\
> +  register TYPE _value asm ("r10") = _VALUE; \
> +  __asm__ __volatile__ (OPCODE " %[ret],%P[addr],%[code]"\
> + : [addr] "+Q" (_PTR[0]), [ret] "=r" (_ret)  \
> + : "r" (_cond), "r" (_value), [code] "n" (FC));  \
> +  return _ret;   
> \
> +}

Naming the operands is an extra indirection, and makes things way less
readable (which means *understandable*) as well.  Just use %0, %1, %2
please?  It's a single line, people will not lose track of what is what
anyway (and if they would, the code is then way too big for extended
asm, so named asm operands are always a code stench).

Please use *_PTR instead of _PTR[0] btw.  Yes, it means the same thing,
but there isn't (necessarily) an array here, let's not suggest there is
one.  It is shorter and more obvious anyway :-)

> +/* Implementation of the LWAT/LDAT fetch and increment operations.
> +
> +   The LWAT/LDAT opcode requires the address to be a single register that
> +   points to a suitably aligned memory location.  Asm volatile is used to
> +   prevent the optimizer from moving the operation.  */

Same things.  Why repeat yourself anyway?

> +#define _AMO_LD_INCREMENT(NAME, TYPE, OPCODE, FC)\
> +static __inline__ TYPE   
> \
> +NAME (TYPE *_PTR)\
> +{\
> +  TYPE _RET; \
> +  __asm__ volatile (OPCODE " %[ret],%P[addr],%[code]\n"  
> \
> + : [addr] "+Q" (_PTR[0]), [ret] "=r" (_RET)  \
> + : "Q" (*(TYPE (*)[2]) _PTR), [code] "n" (FC));  \
> +  return _RET;   
> \
> +}

I don't understand the [2].  Should it be [1]?  These instructions
can use the value at mem+s (as the

[PATCH] c++, coroutines: Make a check more specific [PR109283].

2025-05-29 Thread Iain Sandoe
Tested on x86_64-darwin, powerpc64le-linux; I'd like to minimize
effort on this code, since I expect that we will need some changes
to deal with open BZs.  This fixes an ICE tho,
OK for trunk?
thanks
Iain

--- 8< ---

The check was intended to assert that we had visited contained
ternary expressions with embedded co_awaits, but had been made
too general - and therefore was ICEing on code that was actually
OK.  Fixed by checking specifically that no co_awaits are embedded.

PR c++/109283

gcc/cp/ChangeLog:

* coroutines.cc (find_any_await): Only save the statement
pointer if the caller passes a place for it.
(flatten_await_stmt): When checking that ternary expressions
have been handled, also check that they contain a co_await.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/pr109283.C: New test.

Signed-off-by: Iain Sandoe 
---
 gcc/cp/coroutines.cc   |  8 +---
 gcc/testsuite/g++.dg/coroutines/pr109283.C | 23 ++
 2 files changed, 28 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/coroutines/pr109283.C

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index c1c10782906..dbb21a2ff77 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -2878,8 +2878,8 @@ find_any_await (tree *stmt, int *dosub, void *d)
   if (TREE_CODE (*stmt) == CO_AWAIT_EXPR)
 {
   *dosub = 0; /* We don't need to consider this any further.  */
-  tree **p = (tree **) d;
-  *p = stmt;
+  if (d)
+   *(tree **)d = stmt;
   return *stmt;
 }
   return NULL_TREE;
@@ -3129,7 +3129,9 @@ flatten_await_stmt (var_nest_node *n, hash_set 
*promoted,
  bool already_present = promoted->add (var);
  gcc_checking_assert (!already_present);
  tree inner = TARGET_EXPR_INITIAL (init);
- gcc_checking_assert (TREE_CODE (inner) != COND_EXPR);
+ gcc_checking_assert
+   (TREE_CODE (inner) != COND_EXPR
+|| !cp_walk_tree (&inner, find_any_await, nullptr, nullptr));
  init = cp_build_modify_expr (input_location, var, INIT_EXPR, init,
   tf_warning_or_error);
  /* Simplify for the case that we have an init containing the temp
diff --git a/gcc/testsuite/g++.dg/coroutines/pr109283.C 
b/gcc/testsuite/g++.dg/coroutines/pr109283.C
new file mode 100644
index 000..d73092b595e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/coroutines/pr109283.C
@@ -0,0 +1,23 @@
+// PR 109283.
+// This used to ICE from a check set too widely.
+#include 
+
+struct foo
+{ ~foo(); };
+
+struct task
+{
+   struct promise_type
+   {
+   std::suspend_never initial_suspend();
+   std::suspend_never final_suspend() noexcept;
+   std::suspend_never yield_value(foo);
+   void return_void();
+   void unhandled_exception(); 
+   task get_return_object();
+   };
+};
+
+task source(int b) {
+   co_yield b ? foo{} : foo{};
+}
-- 
2.39.2 (Apple Git-143)



[PATCH] c-lex: Handle NULL filenames from UNKNOWN_LOCATION [PR120273].

2025-05-29 Thread Iain Sandoe
To trigger this involves somewhat tortuous pathways through the
c++ requires code.  I did consider the alternative of putting in
an assert and then checking every call-site, but that seemed to
be a much larger change.
tested on x86_64-darwin and powerpc64le-linux, OK for trunk?
thanks
Iain

--- 8< ---

This was reported against the coroutines implementation, but could
affect other code.

We intentionally synthesize code with UNKNOWN_LOCATIONs in the
implementation to avoid the effect of the code position jumping
around in debug sessions.

The location expansion of fileinfo is not expecting to deal with
NULL name pointers.  The patch here checks this case and adds a
"" as the name, avoiding the ICE and providing at
least a basic indication to the end-user.
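
The shape of the guard, sketched with standard containers instead of the splay tree used in c-lex.cc. The names file_info, placeholder_name and get_info are hypothetical, and the placeholder text "(no file)" is an assumption made for this sketch, not the string the patch uses.

```cpp
#include <map>
#include <string>

// Hypothetical per-file record.
struct file_info
{
  int uses = 0;
};

// Assumed placeholder for locations that carry no filename.
static const char *const placeholder_name = "(no file)";

// Look up (or create) the record for NAME.  A null NAME is replaced by the
// placeholder first; building a std::string from a null pointer would be
// undefined behaviour, much as the unguarded lookup was an ICE.
inline file_info&
get_info(std::map<std::string, file_info>& table, const char *name)
{
  if (!name)
    name = placeholder_name;
  return table[name];
}
```

All null-named lookups share a single record under the placeholder key, so callers need no changes, which is the advantage of guarding here over asserting and auditing every call site.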

PR c++/120273

gcc/c-family/ChangeLog:

* c-lex.cc (get_fileinfo): When presented with a NULL
name pointer, report "".

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/pr120273.C: New test.

Signed-off-by: Iain Sandoe 
---
 gcc/c-family/c-lex.cc  |  4 ++
 gcc/testsuite/g++.dg/coroutines/pr120273.C | 58 ++
 2 files changed, 62 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/coroutines/pr120273.C

diff --git a/gcc/c-family/c-lex.cc b/gcc/c-family/c-lex.cc
index fef6ae6f457..43054f105ea 100644
--- a/gcc/c-family/c-lex.cc
+++ b/gcc/c-family/c-lex.cc
@@ -109,6 +109,10 @@ get_fileinfo (const char *name)
 0,
 splay_tree_delete_pointers);
 
+  /* If we have an UNKOWN_LOCATION, it has no filename.  */
+  if (!name)
+name = "";
+
   n = splay_tree_lookup (file_info_tree, (splay_tree_key) name);
   if (n)
 return (struct c_fileinfo *) n->value;
diff --git a/gcc/testsuite/g++.dg/coroutines/pr120273.C 
b/gcc/testsuite/g++.dg/coroutines/pr120273.C
new file mode 100644
index 000..19b9e51b9fa
--- /dev/null
+++ b/gcc/testsuite/g++.dg/coroutines/pr120273.C
@@ -0,0 +1,58 @@
+// PR120273
+// { dg-additional-options "-Wno-literal-suffix" }
+namespace std {
+void declval();
+template < typename > struct invoke_result;
+template < typename _Fn > using invoke_result_t = invoke_result< _Fn >;
+template < typename _Derived, typename _Base >
+concept derived_from = __is_base_of(_Base, _Derived);
+template < typename, typename >
+concept convertible_to = requires { declval; };
+template < char... > int operator""ms();
+template < typename _Result, typename > struct coroutine_traits : _Result {};
+template < typename = void > struct coroutine_handle {
+  static coroutine_handle from_address(void *);
+  operator coroutine_handle<>();
+  void *address();
+};
+}
+
+using namespace std;
+
+template < class > using CoroutineHandle = coroutine_handle<>;
+
+template < class Callable >
+  requires(derived_from< invoke_result_t< Callable >, int >)
+Callable operator co_await(Callable);
+
+struct FinalSuspendProxy {
+  bool await_ready() noexcept;
+  void await_suspend(CoroutineHandle< void >) noexcept ;
+  void await_resume() noexcept;
+};
+
+struct Task {
+  struct Promise;
+  using promise_type = Promise;
+
+  struct Promise {
+auto initial_suspend() { return FinalSuspendProxy(); }
+auto final_suspend () noexcept  { return FinalSuspendProxy(); }
+void unhandled_exception () {}
+Task get_return_object () { return {}; }
+  };
+} ;
+
+struct TestEventLoop {
+  struct Sleep {
+Sleep(TestEventLoop, int);
+bool await_ready();
+void await_suspend(CoroutineHandle< void >);
+void await_resume();
+  };
+  auto sleep(int tm) { return Sleep(*this, tm); }
+};
+
+Task test_body_11(TestEventLoop t) {
+  co_await t.sleep(5ms);
+}
-- 
2.39.2 (Apple Git-143)



[PATCH] c++, coroutines: Improve diagnostics for awaiter/promise.

2025-05-29 Thread Iain Sandoe
Tested on x86_64-darwin, powerpc64le-linux, OK for trunk?
thanks
Iain

--- 8< ---

At present, we can issue diagnostics about missing or malformed
awaiter or promise methods when we encounter their uses in the
body of a user's function.  We might then re-issue the same
diagnostics when processing the initial or final await expressions.

This change avoids such duplication, and also attempts to
identify issues with the initial or final expressions specifically
since diagnostics for those do not have any useful line number.

gcc/cp/ChangeLog:

* coroutines.cc (build_co_await): Identify diagnostics
for initial and final await expressions.
(cp_coroutine_transform::wrap_original_function_body): Do
not handle initial and final await expressions here ...
(cp_coroutine_transform::apply_transforms): ... handle them
here and avoid duplicate diagnostics.
* coroutines.h: Declare initial and final await expressions
in the transform class.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/coro-missing-final-suspend.C: Adjust for
improved diagnostics.
* g++.dg/coroutines/coro1-missing-await-method.C: Likewise.
* g++.dg/coroutines/pr104051.C: Likewise.

Signed-off-by: Iain Sandoe 
---
 gcc/cp/coroutines.cc  | 21 +++
 gcc/cp/coroutines.h   |  3 +++
 .../coroutines/coro-missing-final-suspend.C   |  4 ++--
 .../coroutines/coro1-missing-await-method.C   |  2 +-
 gcc/testsuite/g++.dg/coroutines/pr104051.C|  4 ++--
 5 files changed, 25 insertions(+), 9 deletions(-)

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index bf3ab2d7250..aa00a8a4e68 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -1277,8 +1277,13 @@ build_co_await (location_t loc, tree a, 
suspend_point_kind suspend_kind,
 
   if (TREE_CODE (o_type) != RECORD_TYPE)
 {
-  error_at (loc, "awaitable type %qT is not a structure",
-   o_type);
+  const char *extra = "";
+  if (suspend_kind == FINAL_SUSPEND_POINT)
+   extra = "final_suspend ";
+  if (suspend_kind == INITIAL_SUSPEND_POINT)
+   extra = "initial_suspend ";
+  error_at (loc, "%sawaitable type %qT is not a structure",
+   extra, o_type);
   return error_mark_node;
 }
 
@@ -4346,7 +4351,6 @@ cp_coroutine_transform::wrap_original_function_body ()
   /* Wrap the function body in a try {} catch (...) {} block, if exceptions
  are enabled.  */
   tree var_list = NULL_TREE;
-  tree initial_await = build_init_or_final_await (fn_start, false);
 
   /* [stmt.return.coroutine] / 3
  If p.return_void() is a valid expression, flowing off the end of a
@@ -4540,7 +4544,8 @@ cp_coroutine_transform::wrap_original_function_body ()
   zero_resume = build2_loc (loc, MODIFY_EXPR, act_des_fn_ptr_type,
resume_fn_ptr, zero_resume);
   finish_expr_stmt (zero_resume);
-  finish_expr_stmt (build_init_or_final_await (fn_start, true));
+  finish_expr_stmt (final_await);
+
   BIND_EXPR_BODY (update_body) = pop_stmt_list (BIND_EXPR_BODY (update_body));
   BIND_EXPR_VARS (update_body) = nreverse (var_list);
   BLOCK_VARS (top_block) = BIND_EXPR_VARS (update_body);
@@ -5278,6 +5283,14 @@ cp_coroutine_transform::apply_transforms ()
 = coro_build_actor_or_destroy_function (orig_fn_decl, act_des_fn_type,
frame_ptr_type, false);
 
+  /* Avoid repeating diagnostics about promise or awaiter fails.  */
+  if (!seen_error ())
+{
+  initial_await = build_init_or_final_await (fn_start, false);
+  if (initial_await && initial_await != error_mark_node)
+   final_await = build_init_or_final_await (input_location, true);
+}
+
   /* Transform the function body as per [dcl.fct.def.coroutine] / 5.  */
   wrap_original_function_body ();
 
diff --git a/gcc/cp/coroutines.h b/gcc/cp/coroutines.h
index 55caa6e61e3..77f7bd33f76 100644
--- a/gcc/cp/coroutines.h
+++ b/gcc/cp/coroutines.h
@@ -126,6 +126,9 @@ private:
   bool inline_p = false;
   bool valid_coroutine = false;
 
+  tree initial_await = error_mark_node;
+  tree final_await = error_mark_node;
+  
   void analyze_fn_parms ();
   void wrap_original_function_body ();
   bool build_ramp_function ();
diff --git a/gcc/testsuite/g++.dg/coroutines/coro-missing-final-suspend.C 
b/gcc/testsuite/g++.dg/coroutines/coro-missing-final-suspend.C
index 6a0878c1269..b2522311a49 100644
--- a/gcc/testsuite/g++.dg/coroutines/coro-missing-final-suspend.C
+++ b/gcc/testsuite/g++.dg/coroutines/coro-missing-final-suspend.C
@@ -7,10 +7,10 @@
 #include "coro1-ret-int-yield-int.h"
 
 coro1
-my_coro () // { dg-error {no member named 'final_suspend' in} }
+my_coro ()
 {
   co_return 0;
-}
+} // { dg-error {no member named 'final_suspend' in} }
 
 // check we have not messed up continuation of the compilation.
 template 
diff --git a/gcc/testsuite/g++.dg/coroutines/coro1-missing-await-meth

Re: [PATCH RFA (diagnostic)] c++: modules and #pragma diagnostic

2025-05-29 Thread David Malcolm
On Thu, 2025-05-29 at 09:11 -0400, Jason Merrill wrote:
> On 5/27/25 5:12 PM, Jason Merrill wrote:
> > On 5/27/25 4:47 PM, Jason Merrill wrote:
> > > On 5/27/25 1:33 PM, David Malcolm wrote:
> > > > On Fri, 2025-05-23 at 16:58 -0400, Jason Merrill wrote:
> > > > > On 4/14/25 9:57 AM, Jason Merrill wrote:
> > > > > > On 1/9/25 10:00 PM, Jason Merrill wrote:
> > > > > > > Tested x86_64-pc-linux-gnu.  Is the diagnostic.h change
> > > > > > > OK for
> > > > > > > trunk?
> > > > > > 
> > > > > > > To respect the #pragma diagnostic lines in libstdc++
> > > > > > > headers when
> > > > > > > compiling
> > > > > > > with module std, we need to represent them in the module.
> > > > > > > 
> > > > > > > I think it's reasonable to make module_state a friend of
> > > > > > > diagnostic_option_classifier to allow direct access to
> > > > > > > the data.
> > > > > > > This
> > > > > > > is a
> > > > > > > different approach from how Jakub made PCH streaming
> > > > > > > members of
> > > > > > > diagnostic_option_classifier, but it seems to me that
> > > > > > > modules
> > > > > > > handling
> > > > > > > belongs in module.cc.
> > > > 
> > > > Putting it in module.cc looks good to me, though perhaps it
> > > > should be
> > > > just a friend of diagnostic_option_classifier but not of
> > > > diagnostic_context?  Could the functions take a
> > > > diagnostic_option_classifier rather than a diagnostic_context?
> > > > diagnostic_context is something of a "big blob" of a class.
> > > 
> > > The friend in diagnostic_context is to be able to name 
> > > m_option_classifier.  We could instead make that member public?
> 
> Thoughts?  The functions could take the _classifier, or even just the
> m_classification_history, but that just moves the access problem into
> their callers, who would still need some way to get there from the 
> diagnostic_context.  Do you have another idea for that?

I'm trying to eventually make all of the member data of
diagnostic_context private, so how about keeping it private, but adding
a public accessor, something like:

diff --git a/gcc/diagnostic.h b/gcc/diagnostic.h
index cdd6f26ba2a..cda090d55ff 100644
--- a/gcc/diagnostic.h
+++ b/gcc/diagnostic.h
@@ -830,6 +830,12 @@ public:
 m_abort_on_error = val;
   }
 
+  diagnostic_option_classifier &
+  get_option_classifier ()
+  {
+return m_option_classifier;
+  }
+
 private:
   void error_recursion () ATTRIBUTE_NORETURN;
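
As an illustration of the accessor idea above, here is a minimal,
self-contained sketch.  The class and function names (context,
option_classifier, record_pragma) are hypothetical stand-ins and not GCC's
real diagnostic classes; the sketch only shows a client reaching a private
member through a public accessor instead of through friendship.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a classifier that records a history of changes.
class option_classifier
{
public:
  void push (int kind) { m_history.push_back (kind); }
  std::size_t history_size () const { return m_history.size (); }

private:
  std::vector<int> m_history;
};

// Hypothetical stand-in for the "big blob" context class.
class context
{
public:
  // Public accessor: the data member itself stays private.
  option_classifier &get_option_classifier () { return m_option_classifier; }

private:
  option_classifier m_option_classifier;
};

// A client (think of module streaming code) that needs no friend
// declaration in context: it only uses the public accessor.
std::size_t
record_pragma (context &ctx, int kind)
{
  option_classifier &classifier = ctx.get_option_classifier ();
  classifier.push (kind);
  return classifier.history_size ();
}
```

In this sketch the client still depends on what the classifier itself
exposes, so whether the classifier grants friendship for its own private
data remains a separate decision.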



[PATCH v5 15/24] fmv: Change target_version semantics to follow ACLE specification.

2025-05-29 Thread Alfie Richards
This patch changes the semantics of target_version and target_clones attributes
to match the behavior described in the Arm C Language Extensions (ACLE).

The changes to behavior are:

- The scope and signature of an FMV function set is now that of the default
  version.
- The FMV resolver is now created at the locations of the default version
  implementation. Previously this was at the first call to an FMV function.
- When a TU has a single annotated function version, it gets mangled.
  - This includes a lone annotated default version.

This only affects targets with TARGET_HAS_FMV_TARGET_ATTRIBUTE set to false.
Currently that is aarch64 and riscv.

This is achieved by:

- Skipping the existing FMV dispatching code at C++ gimplification and instead
  making use of the target_clones dispatching code in multiple_targets.cc.
  (This fixes PR target/118313 for aarch64 and riscv).
- Splitting the target_clones pass in two, an early and a late pass, where the
  early pass handles cases where multiple declarations are used to define a
  version, and the late pass handles targets with target semantics and cases
  where an FMV set is defined by a single target_clones decl.
- Changing the logic in add_candidates and resolve_address of overloaded
  function to prevent resolution of any version except a default version.
  (thus making the default version determine scope and signature of the
  versioned function set).
- Adding logic for dispatching a lone annotated default version in
  multiple_targets.cc
  - As an annotated default version gets mangled, an alias is created from the
    dispatched symbol to the default version, as no ifunc resolution is
    required in this case (i.e. an alias from `_Z3foov` to `_Z3foov.default`).
- Adding logic to `symbol_table::remove_unreachable_nodes` and analyze_functions
  so that a reference to the default function version also implies a possible
  reference to the other versions (so they shouldn't be deleted and do need to
  be analyzed).
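
The two cases described in the list above (a lone annotated default version
versus a set with further versions) can be pictured with a toy model.  This
is only a sketch under assumed names: fn_version, needs_ifunc_resolver and
default_version are hypothetical and do not correspond to GCC internals.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only: the names and fields are illustrative, not GCC internals.
struct fn_version
{
  std::string symbol;   // mangled symbol of this version
  bool is_default;      // true for the default version
};

// A set consisting only of an annotated default version needs no ifunc
// resolver: the dispatched symbol can simply alias the default version.
bool
needs_ifunc_resolver (const std::vector<fn_version> &set)
{
  for (const fn_version &v : set)
    if (!v.is_default)
      return true;
  return false;
}

// Only the default version takes part in name lookup in this model, so it
// alone determines the scope and signature of the set.  Returns nullptr if
// the set has no default version.
const fn_version *
default_version (const std::vector<fn_version> &set)
{
  for (const fn_version &v : set)
    if (v.is_default)
      return &v;
  return nullptr;
}
```

With a set holding only `_Z3foov.default`, needs_ifunc_resolver returns
false, which corresponds to the alias case; adding any non-default version
makes it return true.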

gcc/ChangeLog:

PR target/118313
* cgraphunit.cc (analyze_functions): Add logic for target version
dependencies.
* ipa.cc (symbol_table::remove_unreachable_nodes): Ditto.
* multiple_target.cc (create_dispatcher_calls): Change to support
target version semantics.
(ipa_target_clone): Change to dispatch all function sets in
target_version semantics, and to have early and late pass.
(is_simple_target_clones_case): New function.
* config/aarch64/aarch64.cc (aarch64_get_function_versions_dispatcher):
Refactor with the assumption that the DECL node will be default.
* config/riscv/riscv.cc (riscv_get_function_versions_dispatcher):
Refactor with the assumption that the DECL node will be default.
* passes.def: Split target_clones pass into early and late version.

gcc/cp/ChangeLog:

PR target/118313
* call.cc (add_candidates): Change to not resolve non-default versions
in target_version semantics.
* class.cc (resolve_address_of_overloaded_function): Ditto.
* cp-gimplify.cc (cp_genericize_r): Change logic to not apply for
target_version semantics.
* decl.cc (start_decl): Change to mark and therefore mangle all
target_version decls.
(start_preparsed_function): Ditto.
* typeck.cc (cp_build_function_call_vec): Add error for calling
unresolvable non-default node in target_version semantics.

gcc/testsuite/ChangeLog:

* g++.target/aarch64/mv-1.C: Change for target_version semantics.
* g++.target/aarch64/mv-symbols2.C: Ditto.
* g++.target/aarch64/mv-symbols3.C: Ditto.
* g++.target/aarch64/mv-symbols4.C: Ditto.
* g++.target/aarch64/mv-symbols5.C: Ditto.
* g++.target/aarch64/mvc-symbols3.C: Ditto.
* g++.target/riscv/mv-symbols2.C: Ditto.
* g++.target/riscv/mv-symbols3.C: Ditto.
* g++.target/riscv/mv-symbols4.C: Ditto.
* g++.target/riscv/mv-symbols5.C: Ditto.
* g++.target/riscv/mvc-symbols3.C: Ditto.
* g++.target/aarch64/mv-symbols10.C: New test.
* g++.target/aarch64/mv-symbols11.C: New test.
* g++.target/aarch64/mv-symbols12.C: New test.
* g++.target/aarch64/mv-symbols13.C: New test.
* g++.target/aarch64/mv-symbols6.C: New test.
* g++.target/aarch64/mv-symbols7.C: New test.
* g++.target/aarch64/mv-symbols8.C: New test.
* g++.target/aarch64/mv-symbols9.C: New test.
---
 gcc/cgraphunit.cc |   9 ++
 gcc/config/aarch64/aarch64.cc |  43 ++
 gcc/config/riscv/riscv.cc |  43 ++
 gcc/cp/call.cc|  10 ++
 gcc/cp/class.cc   |  13 +-
 gcc/cp/cp-gimplify.cc |  11 +-
 gcc/cp/decl.cc|  14 ++
 gcc/cp/typeck.cc  |  10 ++
 gcc/ipa.cc

[PATCH v5 18/24] fmv: Support mixing of target_clones and target_version.

2025-05-29 Thread Alfie Richards
Add support for a FMV set defined by a combination of target_clones and
target_version definitions.

Additionally, change is_function_default_version to consider a function
declaration annotated with target_clones containing default to be a
default version.

Lastly, add support for the case that a target_clone has all versions filtered
out and therefore the declaration should be removed. This is relevant as now
the default could be defined in a target_version, so a target_clones no longer
necessarily contains the default.

This takes advantage of refactoring done in previous patches changing how
target_clones are expanded and how conflicting decls are handled.

gcc/ChangeLog:

* attribs.cc (is_function_default_version): Update to handle
target_clones.
* cgraph.h (FOR_EACH_FUNCTION_REMOVABLE): New macro.
* multiple_target.cc (expand_target_clones): Update logic to delete
empty target_clones and modify diagnostic.
(ipa_target_clone): Update to use FOR_EACH_FUNCTION_REMOVABLE.

gcc/c-family/ChangeLog:

* c-attribs.cc: Add support for target_version and target_clone mixing.

gcc/testsuite/ChangeLog:

* g++.target/aarch64/mv-and-mvc1.C: New test.
* g++.target/aarch64/mv-and-mvc2.C: New test.
* g++.target/aarch64/mv-and-mvc3.C: New test.
* g++.target/aarch64/mv-and-mvc4.C: New test.
---
 gcc/attribs.cc| 10 -
 gcc/c-family/c-attribs.cc |  9 +---
 gcc/cgraph.h  |  7 
 gcc/multiple_target.cc| 24 +--
 .../g++.target/aarch64/mv-and-mvc1.C  | 38 +
 .../g++.target/aarch64/mv-and-mvc2.C  | 29 +
 .../g++.target/aarch64/mv-and-mvc3.C  | 41 +++
 .../g++.target/aarch64/mv-and-mvc4.C  | 38 +
 8 files changed, 183 insertions(+), 13 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/aarch64/mv-and-mvc1.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/mv-and-mvc2.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/mv-and-mvc3.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/mv-and-mvc4.C

diff --git a/gcc/attribs.cc b/gcc/attribs.cc
index 06785eaa136..2ca82674f7c 100644
--- a/gcc/attribs.cc
+++ b/gcc/attribs.cc
@@ -1242,7 +1242,8 @@ make_dispatcher_decl (const tree decl)
With the target attribute semantics, returns true if the function is marked
as default with the target version.
With the target_version attribute semantics, returns true if the function
-   is either not annotated, or annotated as default.  */
+   is either not annotated, annotated as default, or is a target_clone
+   containing the default declaration.  */
 
 bool
 is_function_default_version (const tree decl)
@@ -1259,6 +1260,13 @@ is_function_default_version (const tree decl)
 }
   else
 {
+  if (lookup_attribute ("target_clones", DECL_ATTRIBUTES (decl)))
+   {
+ int num_defaults = 0;
+ get_clone_versions (decl, &num_defaults);
+ return num_defaults > 0;
+   }
+
   attr = lookup_attribute ("target_version", DECL_ATTRIBUTES (decl));
   if (!attr)
return true;
diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index b5287f0da06..a4e657d9ffd 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -249,13 +249,6 @@ static const struct attribute_spec::exclusions 
attr_target_clones_exclusions[] =
   ATTR_EXCL ("always_inline", true, true, true),
   ATTR_EXCL ("target", TARGET_HAS_FMV_TARGET_ATTRIBUTE,
 TARGET_HAS_FMV_TARGET_ATTRIBUTE, TARGET_HAS_FMV_TARGET_ATTRIBUTE),
-  ATTR_EXCL ("target_version", true, true, true),
-  ATTR_EXCL (NULL, false, false, false),
-};
-
-static const struct attribute_spec::exclusions 
attr_target_version_exclusions[] =
-{
-  ATTR_EXCL ("target_clones", true, true, true),
   ATTR_EXCL (NULL, false, false, false),
 };
 
@@ -543,7 +536,7 @@ const struct attribute_spec c_common_gnu_attributes[] =
  attr_target_exclusions },
   { "target_version", 1, 1, true, false, false, false,
  handle_target_version_attribute,
- attr_target_version_exclusions },
+ NULL },
   { "target_clones",  1, -1, true, false, false, false,
  handle_target_clones_attribute,
  attr_target_clones_exclusions },
diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index 334441d1003..a2b0551ce6b 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -3104,6 +3104,13 @@ symbol_table::next_function_with_gimple_body 
(cgraph_node *node)
for ((node) = symtab->first_function (); (node); \
(node) = symtab->next_function ((node)))
 
+/* Walk all functions but precompute so a node can be deleted if needed.  */
+#define FOR_EACH_FUNCTION_REMOVABLE(node) \
+   cgraph_nod

Re: [PATCH v4 0/1] Add warnings of potentially-uninitialized padding bits

2025-05-29 Thread Christopher Bazley

Dear GCC Developers,

Please could somebody review this patch? I previously received comments 
from Joseph and Jakub, which I believe I have addressed.


Thanks,

Chris

On 21/05/2025 16:13, Christopher Bazley wrote:

Commit 0547dbb725b reduced the number of cases in which
union padding bits are zeroed when the relevant language
standard does not strictly require it, unless gcc was
invoked with -fzero-init-padding-bits=unions or
-fzero-init-padding-bits=all in order to explicitly
request zeroing of padding bits.

This commit adds a closely related warning,
-Wzero-init-padding-bits=, which is intended to help
programmers to find code that might now need to be
rewritten or recompiled with
-fzero-init-padding-bits=unions or
-fzero-init-padding-bits=all in order to replicate
the behaviour that it had when compiled by older
versions of GCC. It can also be used to find struct
padding that was never previously guaranteed to be
zero initialized and still isn't unless GCC is
invoked with the -fzero-init-padding-bits=all option.

The new warning can be set to the same three states
as -fzero-init-padding-bits ('standard', 'unions'
or 'all') and has the same default value ('standard').

The two options interact as follows:

             f: standard  f: unions  f: all
w: standard  X            X          X
w: unions    U            X          X
w: all       A            S          X

X = No warnings about padding
U = Warnings about padding of unions.
S = Warnings about padding of structs.
A = Warnings about padding of structs and unions.
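
The interaction table above can also be read as two independent conditions.
The following sketch encodes it; the names level, padding_warnings and
warnings_for are hypothetical and chosen for illustration only, they are not
taken from the patch.

```cpp
#include <cassert>

// The three states shared by -Wzero-init-padding-bits= (w) and
// -fzero-init-padding-bits= (f).
enum class level { standard, unions, all };

// Which kinds of padding would be warned about.
struct padding_warnings
{
  bool unions;
  bool structs;
};

// Encodes the table: union padding is warned about only when the warning
// level covers unions and the -f option does not already zero union
// padding; struct padding only when the warning level is 'all' and the -f
// option is not 'all'.
padding_warnings
warnings_for (level w, level f)
{
  padding_warnings result;
  result.unions = (w != level::standard) && (f == level::standard);
  result.structs = (w == level::all) && (f != level::all);
  return result;
}
```

For example, w = all with f = unions yields structs only (S in the table),
and any combination with w = standard or f = all yields no warnings (X).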

The level of optimisation and whether or not the
entire initializer is dropped to memory can both
affect whether warnings are produced when compiling
a given program. This is intentional, since tying
the warnings more closely to the relevant language
standard would require a very different approach
that would still be target-dependent, might impose
an unacceptable burden on programmers, and would
risk not satisfying the intended use-case (which
is closely tied to a specific optimisation).

Bootstrapped the compiler and tested on AArch64
and x86-64 using some new tests for
-Wzero-init-padding-bits and the existing tests
for -fzero-init-padding-bits
(check-gcc RUNTESTFLAGS="dg.exp=*-empty-init-*.c").

Base commit is a470433732e77ae29a717cf79049ceeea3cbe979

Changes in v2:
  - Added missing changelog entry.

Changes in v3:
  - Modified two tests in which I had neglected to
    ensure that initializers were not compile time
    constants. This policy prevents the entire
    initializer being dropped to memory, which
    would otherwise prevent the expected diagnostic
    message from being produced.
  - Amended the diagnostic message from "Padding bits
    might not.." to "padding might not..."

Changes in v4:
  - Removed redundant braces.
  - Added "if code relies on it being zero," to the
    diagnostic message.

Link to v1:
https://inbox.sourceware.org/gcc-patches/20250520104940.3546-1-chris.baz...@arm.com/

Link to v2:
https://inbox.sourceware.org/gcc-patches/20250520144524.5968-1-chris.baz...@arm.com/

Link to v3:
https://inbox.sourceware.org/gcc-patches/20250521124745.24592-1-chris.baz...@arm.com/

Christopher Bazley (1):
   Add warnings of potentially-uninitialized padding bits

  gcc/common.opt|  4 +
  gcc/doc/invoke.texi   | 85 ++-
  gcc/expr.cc   | 41 -
  gcc/expr.h|  7 +-
  gcc/gimplify.cc   | 27 +-
  gcc/testsuite/gcc.dg/c23-empty-init-warn-1.c  | 68 +++
  gcc/testsuite/gcc.dg/c23-empty-init-warn-10.c |  8 ++
  gcc/testsuite/gcc.dg/c23-empty-init-warn-11.c |  8 ++
  gcc/testsuite/gcc.dg/c23-empty-init-warn-12.c |  8 ++
  gcc/testsuite/gcc.dg/c23-empty-init-warn-13.c |  8 ++
  gcc/testsuite/gcc.dg/c23-empty-init-warn-14.c |  8 ++
  gcc/testsuite/gcc.dg/c23-empty-init-warn-15.c |  8 ++
  gcc/testsuite/gcc.dg/c23-empty-init-warn-16.c |  8 ++
  gcc/testsuite/gcc.dg/c23-empty-init-warn-17.c | 51 +++
  gcc/testsuite/gcc.dg/c23-empty-init-warn-2.c  | 69 +++
  gcc/testsuite/gcc.dg/c23-empty-init-warn-3.c  |  7 ++
  gcc/testsuite/gcc.dg/c23-empty-init-warn-4.c  | 69 +++
  gcc/testsuite/gcc.dg/c23-empty-init-warn-5.c  |  8 ++
  gcc/testsuite/gcc.dg/c23-empty-init-warn-6.c  |  8 ++
  gcc/testsuite/gcc.dg/c23-empty-init-warn-7.c  |  8 ++
  gcc/testsuite/gcc.dg/c23-empty-init-warn-8.c  |  8 ++
  gcc/testsuite/gcc.dg/c23-empty-init-warn-9.c  | 69 +++
  .../gcc.dg/gnu11-empty-init-warn-1.c  | 52 
  .../gcc.dg/gnu11-empty-init-warn-10.c |  8 ++
  .../gcc.dg/gnu11-empty-init-warn-11.c |  8 ++
  .../gcc.dg/gnu11-empty-init-warn-12.c |  8 ++
  .../gcc.dg/gnu11-empty-init-warn-13.c |  8 ++
  .../gcc.dg/gnu11-empty-init-warn-14.c |  8 ++
  .../gcc.dg/gnu11-empty-init-warn-15.c |  8 ++
  .../gcc.dg/gnu11-empty-i

Re: [PATCH] libstdc++: Compare keys and values separately in flat_map::operator==

2025-05-29 Thread Jonathan Wakely
On Thu, 29 May 2025 at 15:48, Jonathan Wakely  wrote:
>
> On Thu, 29 May 2025 at 15:42, Tomasz Kaminski  wrote:
> >
> >
> >
> > On Thu, May 29, 2025 at 3:56 PM Patrick Palka  wrote:
> >>
> >> Tested on x86_64-pc-linux-gnu, does this look OK for trunk/15?
> >>
> >> -- >8 --
> >>
> >> Instead of effectively doing a zipped comparison of the keys and values,
> >> compare them separately to leverage the underlying containers' optimized
> >> equality implementations.
> >>
> >> libstdc++-v3/ChangeLog:
> >>
> >> * include/std/flat_map (_Flat_map_impl::operator==): Compare
> >> keys and values separately.
> >> ---
> >>  libstdc++-v3/include/std/flat_map | 5 -
> >>  1 file changed, 4 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/libstdc++-v3/include/std/flat_map 
> >> b/libstdc++-v3/include/std/flat_map
> >> index c0716d12412a..134307324190 100644
> >> --- a/libstdc++-v3/include/std/flat_map
> >> +++ b/libstdc++-v3/include/std/flat_map
> >> @@ -873,7 +873,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >>[[nodiscard]]
> >>friend bool
> >>operator==(const _Derived& __x, const _Derived& __y)
> >> -  { return std::equal(__x.begin(), __x.end(), __y.begin(), 
> >> __y.end()); }
> >> +  {
> >> +   return __x._M_cont.keys == __y._M_cont.keys
> >> + && __x._M_cont.values == __y._M_cont.values;
> >
> > Previously we supported containers that do not have operator==, by calling 
> > equal.
>
> Oh, good point.
> Using == means the element types of the underlying containers must be
> equality comparable, but the original approach of using std::equal on
> the zipped values only means those tuples must be equality comparable,
> and an evil user could have overloaded:
>
> bool operator==(const tuple&, const tuple&);

Or const tuple& or whatever the zipped type is.


>
> so that those comparisons work, but MyVal might not be equality comparable.
>
> > For the flat_set we also do not compare the containers. I would suggest 
> > using in both:
> >   ranges::equal(x._M_cont)
> > Or using == on containers in both flat_map and flat_set.
> >>
> >> +  }
> >>
> >>template
> >> [[nodiscard]]
> >> --
> >> 2.50.0.rc0
> >>
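
To make the two comparison strategies discussed in this thread concrete,
here is a small sketch.  It assumes a hypothetical struct named containers
holding parallel key and value vectors of equal length; it is not the
libstdc++ implementation and uses plain std::vector rather than flat_map.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for flat_map's storage: parallel key and value
// containers, assumed to have the same length.
struct containers
{
  std::vector<int> keys;
  std::vector<double> values;
};

// Pair-by-pair comparison, in the spirit of comparing the zipped view:
// it only needs element-wise == and never asks the containers themselves
// for operator==.
bool
equal_zipped (const containers &x, const containers &y)
{
  if (x.keys.size () != y.keys.size ())
    return false;
  for (std::size_t i = 0; i < x.keys.size (); ++i)
    if (!(x.keys[i] == y.keys[i] && x.values[i] == y.values[i]))
      return false;
  return true;
}

// Container-wise comparison: delegates to each container's own operator==,
// which therefore has to exist for both container types.
bool
equal_separate (const containers &x, const containers &y)
{
  return x.keys == y.keys && x.values == y.values;
}
```

For element types with an ordinary operator== both functions agree; the
difference raised above is only in which operations the container and
element types are required to provide.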



[PATCH v1] doc: Replace "fixed-point" with "integer"

2025-05-29 Thread Karl Meakin
In some places the documentation refers to "fixed-point" types or values
when talking about plain integer types. Although this is meant to mean
"the opposite of floating-point", it is misleading and can be confused
with the fractional types that are also known as "fixed-point". For the
avoidance of doubt, refer to plain integer types as "integer"
throughout.

Testing done:
`make info` and `make dvi`

gcc/ChangeLog:
* doc/rtl.texi: Replace "fixed-point" with "integer" where
appropriate.
---
 gcc/doc/rtl.texi | 44 ++--
 1 file changed, 22 insertions(+), 22 deletions(-)

diff --git a/gcc/doc/rtl.texi b/gcc/doc/rtl.texi
index 089bb1c4ede..ddba52a4014 100644
--- a/gcc/doc/rtl.texi
+++ b/gcc/doc/rtl.texi
@@ -2619,40 +2619,40 @@ integers.
 @cindex bitwise complement
 @item (not:@var{m} @var{x})
 Represents the bitwise complement of the value represented by @var{x},
-carried out in mode @var{m}, which must be a fixed-point machine mode.
+carried out in mode @var{m}, which must be an integer machine mode.
 
 @findex and
 @cindex logical-and, bitwise
 @cindex bitwise logical-and
 @item (and:@var{m} @var{x} @var{y})
 Represents the bitwise logical-and of the values represented by
 @var{x} and @var{y}, carried out in machine mode @var{m}, which must be
-a fixed-point machine mode.
+an integer machine mode.
 
 @findex ior
 @cindex inclusive-or, bitwise
 @cindex bitwise inclusive-or
 @item (ior:@var{m} @var{x} @var{y})
 Represents the bitwise inclusive-or of the values represented by @var{x}
-and @var{y}, carried out in machine mode @var{m}, which must be a
-fixed-point mode.
+and @var{y}, carried out in machine mode @var{m}, which must be an
+integer mode.
 
 @findex xor
 @cindex exclusive-or, bitwise
 @cindex bitwise exclusive-or
 @item (xor:@var{m} @var{x} @var{y})
 Represents the bitwise exclusive-or of the values represented by @var{x}
-and @var{y}, carried out in machine mode @var{m}, which must be a
-fixed-point mode.
+and @var{y}, carried out in machine mode @var{m}, which must be an
+integer mode.
 
 @findex ashift
 @findex ss_ashift
 @findex us_ashift
 @cindex left shift
 @cindex shift
 @cindex arithmetic shift
 @cindex arithmetic shift with signed saturation
 @cindex arithmetic shift with unsigned saturation
 @item (ashift:@var{m} @var{x} @var{c})
 @itemx (ss_ashift:@var{m} @var{x} @var{c})
 @itemx (us_ashift:@var{m} @var{x} @var{c})
@@ -2663,8 +2663,8 @@ in case of a change in the sign bit; @code{ss_ashift} and 
@code{us_ashift}
 saturates to the minimum or maximum representable value if any of the bits
 shifted out differs from the final sign bit.
 
-@var{x} have mode @var{m}, a fixed-point machine mode.  @var{c}
-be a fixed-point mode or be a constant with mode @code{VOIDmode}; which
+@var{x} have mode @var{m}, an integer machine mode.  @var{c}
+be an integer mode or be a constant with mode @code{VOIDmode}; which
 mode is determined by the mode called for in the machine description
 entry for the left-shift instruction.  For example, on the VAX, the mode
 of @var{c} is @code{QImode} regardless of @var{m}.
@@ -2750,13 +2750,13 @@ integer of mode @var{m}.  The mode of @var{x} must be 
@var{m} or
 @findex bswap
 @item (bswap:@var{m} @var{x})
 Represents the value @var{x} with the order of bytes reversed, carried out
-in mode @var{m}, which must be a fixed-point machine mode.
+in mode @var{m}, which must be an integer machine mode.
 The mode of @var{x} must be @var{m} or @code{VOIDmode}.
 
 @findex bitreverse
 @item (bitreverse:@var{m} @var{x})
 Represents the value @var{x} with the order of bits reversed, carried out
-in mode @var{m}, which must be a fixed-point machine mode.
+in mode @var{m}, which must be an integer machine mode.
 The mode of @var{x} must be @var{m} or @code{VOIDmode}.
 
 @findex copysign
@@ -2824,18 +2824,18 @@ are not equal, otherwise 0.
 @findex gt
 @cindex greater than
 @item (gt:@var{m} @var{x} @var{y})
 @code{STORE_FLAG_VALUE} if the @var{x} is greater than @var{y}.  If they
-are fixed-point, the comparison is done in a signed sense.
+are integers, the comparison is done in a signed sense.
 
 @findex gtu
 @cindex greater than
 @cindex unsigned greater than
 @item (gtu:@var{m} @var{x} @var{y})
-Like @code{gt} but does unsigned comparison, on fixed-point numbers only.
+Like @code{gt} but does unsigned comparison, on integers only.
 
 @findex lt
 @cindex less than
 @findex ltu
 @cindex unsigned less than
 @item (lt:@var{m} @var{x} @var{y})
 @itemx (ltu:@var{m} @var{x} @var{y})
@@ -3002,14 +3002,14 @@ must be placed into a register.
 @findex sign_extend
 @item (sign_extend:@var{m} @var{x})
 Represents the result of sign-extending the value @var{x}
-to machine mode @var{m}.  @var{m} must be a fixed-point mode
-and @var{x} a fixed-point value of a mode narrower than @var{m}.
+to machine mode @var{m}.  @var{m} must be an integer mode
+and @var{x} an integer value of a mode narrower than @var{m}.
 
 @findex zero_extend
 @item (zero_exten

[PATCH] rtl-optimization: Invalid CSE of inline asm with memory clobber [PR111901]

2025-05-29 Thread Uros Bizjak
The following test:

--cut here--
int test (void)
{
  unsigned int sum = 0;

  for (int i = 0; i < 4; i++)
{
  unsigned int val;

  asm ("magic %0" : "=r" (val) : : "memory");
  sum += val;
}

  return sum;
}
--cut here--

compiles on x86_64 with -O2 -funroll-all-loops to nonsensical code
where over-eager CSE combines the non-volatile asm despite the fact
that it has a memory clobber, which gcc documentation states means:

 The "memory" clobber tells the compiler that the assembly code
 performs memory reads or writes to items other than those listed
 in the input and output operands (for example, accessing the
 memory pointed to by one of the input parameters).

so combining the four identical asm statements into one seems to be
actively buggy. The inline asm may not be marked volatile, but it
does clearly tell the compiler that it does memory reads OR WRITES
to operands other than those listed. Which would on the face of it
make the CSE invalid.
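
For readers unfamiliar with the syntax, here is a portable sketch of an asm
statement carrying a "memory" clobber.  It assumes a compiler that supports
GNU extended asm (GCC or Clang); the function name sum_with_barrier is made
up for illustration.  It uses an empty template rather than the "magic"
placeholder instruction, so it assembles on any target, and it is not meant
to reproduce the bug.

```cpp
#include <cassert>

// Assumes GNU extended asm support (GCC or Clang).
unsigned int
sum_with_barrier ()
{
  unsigned int sum = 0;

  for (unsigned int i = 0; i < 4; i++)
    {
      unsigned int val = i;

      // Empty template: no instruction is emitted.  "+r" declares val as
      // both read and written by the asm; the "memory" clobber declares
      // that the asm may read or write memory not listed in its operands.
      __asm__ ("" : "+r" (val) : : "memory");
      sum += val;
    }

  return sum;
}
```

Because the template is empty, val is left unchanged and the function
returns 0 + 1 + 2 + 3 = 6.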

The patch avoids simplification of asms with a memory clobber
in the reload_cse_simplify function in postreload.cc.

The testcase now compiles to the expected code:

test:
magic %eax
magic %edx
addl%edx, %eax
magic %ecx
addl%ecx, %eax
magic %esi
addl%esi, %eax
ret

PR rtl-optimization/111901

gcc/ChangeLog:

* postreload.cc (reload_cse_simplify): Avoid simplification of
asms with memory clobber.

gcc/testsuite/ChangeLog:

* gcc.dg/pr111901.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

OK for mainline?

Uros.
diff --git a/gcc/postreload.cc b/gcc/postreload.cc
index 487aa8aad05..493f2d62cca 100644
--- a/gcc/postreload.cc
+++ b/gcc/postreload.cc
@@ -132,12 +132,28 @@ reload_cse_simplify (rtx_insn *insn, rtx testreg)
 we don't try to substitute values for them.  */
   if (asm_noperands (body) >= 0)
{
+ bool mem_clobber_seen = false;
+
  for (i = XVECLEN (body, 0) - 1; i >= 0; --i)
{
  rtx part = XVECEXP (body, 0, i);
- if (GET_CODE (part) == CLOBBER && REG_P (XEXP (part, 0)))
-   cselib_invalidate_rtx (XEXP (part, 0));
+ if (GET_CODE (part) == CLOBBER)
+   switch (GET_CODE (XEXP (part, 0)))
+ {
+ case REG:
+   cselib_invalidate_rtx (XEXP (part, 0));
+   break;
+ case MEM:
+   if (GET_CODE (XEXP (XEXP (part, 0), 0)) == SCRATCH)
+ mem_clobber_seen = true;
+   break;
+ default:
+   gcc_unreachable ();
+ }
}
+
+ if (mem_clobber_seen)
+   return false;
}
 
   /* If every action in a PARALLEL is a noop, we can delete
diff --git a/gcc/testsuite/gcc.dg/pr111901.c b/gcc/testsuite/gcc.dg/pr111901.c
new file mode 100644
index 000..30a4a30056e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr111901.c
@@ -0,0 +1,20 @@
+/* PR rtl-optimization/111901 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -funroll-all-loops" } */
+
+int test (void)
+{
+  unsigned int sum = 0;
+
+  for (int i = 0; i < 4; i++)
+{
+  unsigned int val;
+
+  asm ("magic %0" : "=r" (val) : : "memory");
+  sum += val;
+}
+
+  return sum;
+}
+
+/* { dg-final { scan-assembler-times "magic" 4 } } */


[PATCH] c++, coroutines: Make analyze_fn_params into a class method.

2025-05-29 Thread Iain Sandoe
Tested on x86_64-darwin, powerpc64le-linux, OK for trunk?
thanks
Iain

--- 8< ---

This continues code cleanups and migration to encapsulation of the
whole coroutine transform.

gcc/cp/ChangeLog:

* coroutines.cc (analyze_fn_parms): Move from free function..
(cp_coroutine_transform::analyze_fn_parms):... to method.
(cp_coroutine_transform::apply_transforms): Adjust call to
analyze_fn_parms.
* coroutines.h: Declare analyze_fn_parms.

Signed-off-by: Iain Sandoe 
---
 gcc/cp/coroutines.cc | 20 +++-
 gcc/cp/coroutines.h  |  1 +
 2 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index dbb21a2ff77..bf3ab2d7250 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -4043,12 +4043,14 @@ rewrite_param_uses (tree *stmt, int *do_subtree 
ATTRIBUTE_UNUSED, void *d)
 }
 
 /* Build up a set of info that determines how each param copy will be
-   handled.  */
+   handled.  We store this in a hash map so that we can access it from
+   a tree walk callback that re-writes the original parameters to their
+   copies.  */
 
-static void
-analyze_fn_parms (tree orig, hash_map *param_uses)
+void
+cp_coroutine_transform::analyze_fn_parms ()
 {
-  if (!DECL_ARGUMENTS (orig))
+  if (!DECL_ARGUMENTS (orig_fn_decl))
 return;
 
   /* Build a hash map with an entry for each param.
@@ -4058,19 +4060,19 @@ analyze_fn_parms (tree orig, hash_map 
*param_uses)
  Then a tree list of the uses.
  The second two entries start out empty - and only get populated
  when we see uses.  */
-  bool lambda_p = LAMBDA_FUNCTION_P (orig);
+  bool lambda_p = LAMBDA_FUNCTION_P (orig_fn_decl);
 
   /* Count the param copies from 1 as per the std.  */
   unsigned parm_num = 1;
-  for (tree arg = DECL_ARGUMENTS (orig); arg != NULL;
+  for (tree arg = DECL_ARGUMENTS (orig_fn_decl); arg != NULL;
++parm_num, arg = DECL_CHAIN (arg))
 {
   bool existed;
-  param_info &parm = param_uses->get_or_insert (arg, &existed);
+  param_info &parm = param_uses.get_or_insert (arg, &existed);
   gcc_checking_assert (!existed);
   parm.body_uses = NULL;
   tree actual_type = TREE_TYPE (arg);
-  actual_type = complete_type_or_else (actual_type, orig);
+  actual_type = complete_type_or_else (actual_type, orig_fn_decl);
   if (actual_type == NULL_TREE)
actual_type = error_mark_node;
   parm.orig_type = actual_type;
@@ -5265,7 +5267,7 @@ cp_coroutine_transform::apply_transforms ()
 
   /* Collect information on the original function params and their use in the
  function body.  */
-  analyze_fn_parms (orig_fn_decl, &param_uses);
+  analyze_fn_parms ();
 
   /* Declare the actor and destroyer functions, the following code needs to
  see these.  */
diff --git a/gcc/cp/coroutines.h b/gcc/cp/coroutines.h
index 10698cf2e12..55caa6e61e3 100644
--- a/gcc/cp/coroutines.h
+++ b/gcc/cp/coroutines.h
@@ -126,6 +126,7 @@ private:
   bool inline_p = false;
   bool valid_coroutine = false;
 
+  void analyze_fn_parms ();
   void wrap_original_function_body ();
   bool build_ramp_function ();
 };
-- 
2.39.2 (Apple Git-143)



[PATCH] c: fix ICE for mutually recursive structures [PR120381]

2025-05-29 Thread Martin Uecker


This is a fun one. 

Bootstrapped and regression tested for x86_64.

Martin


c: fix ICE for mutually recursive structures [PR120381]

For invalid nesting of a structure definition in a definition
of itself or when using a rather obscure construction using statement
expressions, we can create mutually recursive pairs of non-identical
but compatible structure types.  This can lead to invalid composite
types and an ICE.  If we detect recursion even for swapped pairs
when forming composite types, this is avoided.
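
As a rough illustration of the lookup change (this is not the GCC code: the
pair_cache type and cache_lookup helper are hypothetical, and plain pointers
stand in for tree nodes), a cache walk that treats (t1, t2) and (t2, t1) as
the same entry could look like this:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the composite cache: each entry remembers the
   pair of types being merged and the composite under construction.  */
struct pair_cache
{
  const void *t1;
  const void *t2;
  void *composite;
  struct pair_cache *next;
};

/* Return the composite recorded for the pair (T1, T2) in either order,
   or NULL if that pair is not currently being processed.  */
static void *
cache_lookup (const struct pair_cache *cache, const void *t1, const void *t2)
{
  assert (t1 != NULL && t2 != NULL);
  for (const struct pair_cache *c = cache; c != NULL; c = c->next)
    if ((c->t1 == t1 && c->t2 == t2) || (c->t1 == t2 && c->t2 == t1))
      return c->composite;
  return NULL;
}
```

With only the first half of the condition, a recursion that arrives at the
same two types in the opposite order would miss the entry and start building
a second composite.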

PR c/120381

gcc/c/ChangeLog:
* c-typeck.cc (composite_type_internal): Stop recursion for
swapped pairs.

gcc/testsuite/ChangeLog:
* gcc.dg/pr120381.c: New test.
* gcc.dg/gnu23-tag-composite-6.c: New test.

diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index 0e1f842e22d..c8c1b86aa21 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -773,7 +773,7 @@ composite_type_internal (tree t1, tree t2, struct 
composite_cache* cache)
 construction, return it.  */
 
  for (struct composite_cache *c = cache; c != NULL; c = c->next)
-   if (c->t1 == t1 && c->t2 == t2)
+   if ((c->t1 == t1 && c->t2 == t2) || (c->t1 == t2 && c->t2 == t1))
   return c->composite;
 
  /* Otherwise, create a new type node and link it into the cache.  */
diff --git a/gcc/testsuite/gcc.dg/gnu23-tag-composite-6.c 
b/gcc/testsuite/gcc.dg/gnu23-tag-composite-6.c
new file mode 100644
index 000..2411b04d388
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/gnu23-tag-composite-6.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-std=gnu23" } */
+
+int f()
+{
+typedef struct foo bar;
+struct foo { typeof(({ (struct foo { bar * x; }){ }; })) * x; } *q;
+typeof(q->x) p;
+1 ? p : q;
+}
+
diff --git a/gcc/testsuite/gcc.dg/pr120381.c b/gcc/testsuite/gcc.dg/pr120381.c
new file mode 100644
index 000..5c017e60c6b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr120381.c
@@ -0,0 +1,10 @@
+/* PR120381 */
+/* { dg-do compile } */
+
+struct A {
+  struct A {   /* { dg-error "nested redefinition" } */
+struct A *p;
+  } *p;
+};
+int foo(const struct A *q) { return q->p == q; }
+



[PATCH] C: Flex array in the middle via type alias is not reported [PR120353]

2025-05-29 Thread Qing Zhao
The root cause of the bug is: the TYPE_INCLUDES_FLEXARRAY marking of the
structure type is not copied to its aliased type.
The fix is to copy this marking to all the variant types of the current
structure type.
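
The idea can be sketched outside the compiler as follows. This is a
simplified model, not the c-decl.cc code: the type_node structure, its field
names, and propagate_flexarray_flag are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified type node: the main variant heads a singly
   linked chain of variants (for example, aliased or qualified copies).  */
struct type_node
{
  bool includes_flexarray;
  struct type_node *next_variant;
};

/* Copy the flag from MAIN_VARIANT to every variant on its chain, so a
   later check sees the same answer whichever variant it is handed.  */
static void
propagate_flexarray_flag (struct type_node *main_variant)
{
  assert (main_variant != NULL);
  for (struct type_node *v = main_variant->next_variant; v != NULL;
       v = v->next_variant)
    v->includes_flexarray = main_variant->includes_flexarray;
}
```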

The patch has been bootstrapped and regression tested on both x86 and aarch64.
Okay for trunk and also GCC14?

thanks.

Qing

PR c/120353

gcc/c/ChangeLog:

* c-decl.cc (finish_struct): Copy TYPE_INCLUDES_FLEXARRAY marking
to all the variant types of the current structure type.

gcc/testsuite/ChangeLog:

* gcc.dg/pr120353.c: New test.
---
 gcc/c/c-decl.cc |  1 +
 gcc/testsuite/gcc.dg/pr120353.c | 11 +++
 2 files changed, 12 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/pr120353.c

diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index ad66d7d258b..4733287eaf8 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -9891,6 +9891,7 @@ finish_struct (location_t loc, tree t, tree fieldlist, 
tree attributes,
   C_TYPE_VARIABLE_SIZE (x) = C_TYPE_VARIABLE_SIZE (t);
   C_TYPE_VARIABLY_MODIFIED (x) = C_TYPE_VARIABLY_MODIFIED (t);
   C_TYPE_INCOMPLETE_VARS (x) = NULL_TREE;
+  TYPE_INCLUDES_FLEXARRAY (x) = TYPE_INCLUDES_FLEXARRAY (t);
 }
 
   /* Update type location to the one of the definition, instead of e.g.
diff --git a/gcc/testsuite/gcc.dg/pr120353.c b/gcc/testsuite/gcc.dg/pr120353.c
new file mode 100644
index 000..6f8e4acf7f2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr120353.c
@@ -0,0 +1,11 @@
+/* PR120353: Test for -Wflex-array-member-not-at-end on structure with 
+   typedef.  */ 
+/* { dg-do compile } */
+/* { dg-options "-Wflex-array-member-not-at-end" } */
+
+typedef struct flex flex_t;
+struct flex { int n; int data[]; };
+struct out_flex_mid {flex_t flex_data;  int m; }; /* { dg-warning "structure 
containing a flexible array member is not at the end of another structure" } */
+
+typedef struct flex flex_t1;
+struct out_flex_mid1 {flex_t1 flex_data1; int n; }; /* { dg-warning "structure 
containing a flexible array member is not at the end of another structure" } */ 
-- 
2.43.5



[PATCH] c: fix ICE related to tagged types with attributes in diagnostics [PR120380]

2025-05-29 Thread Martin Uecker


This fixes an error recovery issue.

Bootstrapped and regression tested for x86_64.

Martin


c: fix ICE related to tagged types with attributes in diagnostics [PR120380]

get_aka_type will create a new type for diagnostics, but for tagged types
attributes will then be ignored with a warning.  This can lead to reentering
warning code which leads to an ICE.  Fix this by ignoring the attributes
for tagged types.

PR c/120380

gcc/c/ChangeLog:
c-objc-common.cc (get_aka_type): Ignore attributes for tagged types.

gcc/testsuite/ChangeLog:
gcc.dg/pr120380.c: New test.

diff --git a/gcc/c/c-objc-common.cc b/gcc/c/c-objc-common.cc
index 2016eaebf17..b7b9c74bdf7 100644
--- a/gcc/c/c-objc-common.cc
+++ b/gcc/c/c-objc-common.cc
@@ -216,6 +216,11 @@ get_aka_type (tree type)
  return canonical ? canonical : type;
}
 }
+  /* For tagged types ignore attributes here because they will
+ otherwise be ignored later causing a warning inside diagnostics
+ which leads to an ICE.  */
+  if (RECORD_OR_UNION_TYPE_P (type) || TREE_CODE (type) == ENUMERAL_TYPE)
+return build_qualified_type (result, TYPE_QUALS (type));
   return build_type_attribute_qual_variant (result, TYPE_ATTRIBUTES (type),
TYPE_QUALS (type));
 }
diff --git a/gcc/testsuite/gcc.dg/pr120380.c b/gcc/testsuite/gcc.dg/pr120380.c
new file mode 100644
index 000..7f50936d80f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr120380.c
@@ -0,0 +1,24 @@
+/* PR c/120380 */
+/* { dg-do compile } */
+
+struct pair_t {
+  char c;
+  __int128_t i;
+} __attribute__((packed));
+typedef struct unaligned_int128_t_ {   /* { dg-error 
"no members" } */
+  struct unaligned_int128_t_ { /* { dg-error 
"nested redefinition" } */
+   /* { dg-error 
"no members" "" { target *-*-* } .-1 } */
+struct unaligned_int128_t_ {   /* { dg-error 
"nested redefinition" } */
+  __int128_t value;
+}
+  }/* { dg-error 
"does not declare anything" } */
+   /* { dg-error 
"no semicolon" "" { target *-*-* } .-1 } */
+} __attribute__((packed, may_alias)) unaligned_int128_t;   /* { dg-error 
"does not declare anything" } */
+   /* { dg-error 
"no semicolon" "" { target *-*-* } .-1 } */
+struct pair_t p = {0, 1};
+unaligned_int128_t *addr = (unaligned_int128_t *)&p.i;
+int main() {
+  addr->value = 0; /* { dg-error 
"has no member" } */
+  return 0;
+}
+




[PATCH v5 17/24] c++: Refactor FMV frontend conflict and merging logic and hooks.

2025-05-29 Thread Alfie Richards
This change refactors FMV handling in the frontend to allow greater
reasoning about versions in shared code.

This is needed for allowing target_clones and target_versions to be used
together in a function set, as there are then two distinct concerns when
encountering two declarations that previously were conflated:

1. Are these two declarations completely distinct FMV declarations
(ie. the sets of versions they define have no overlap). If so, they don't
conflict so there is no need to merge and both can be pushed.
2. For two declarations that aren't completely distinct, are they matching
and therefore mergeable. (ie. two target_clone decls that define the same set
of versions, or an un-annotated declaration, and a target_clones definition
containing the default version). If so, continue to the existing merging logic
to try to merge these and diagnose if it's not possible.
If not, then diagnose the conflicting declarations.

To do this the common_function_versions function has been renamed
distinct_function_versions (meaning, are the version sets defined by these
two decls completely distinct from each other).

The common function version hook was modified to instead take two
string_slice's (each representing a single version) and determine if they
define the same version.

A new function, called diagnose_versioned_decls, is added, which checks
if two decls (with overlapping version sets) can be merged and diagnoses when
they cannot be (only in terms of the attributes; the existing logic is used to
detect other mergeability conflicts like redefinition).
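
The two questions above can be modelled on plain string sets. The sketch
below is only an illustration: version_set, sets_distinct and sets_equal are
hypothetical names, strcmp stands in for the target hook that compares two
versions, and equal sets cover just one of the mergeable cases described.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model: the versions a declaration defines, as an array of
   version-name strings with no duplicates.  */
struct version_set
{
  const char *const *names;
  size_t count;
};

static bool
set_contains (const struct version_set *s, const char *name)
{
  for (size_t i = 0; i < s->count; i++)
    if (strcmp (s->names[i], name) == 0)
      return true;
  return false;
}

/* Question 1: true if A and B share no version, so both can be pushed.  */
static bool
sets_distinct (const struct version_set *a, const struct version_set *b)
{
  assert (a != NULL && b != NULL);
  for (size_t i = 0; i < a->count; i++)
    if (set_contains (b, a->names[i]))
      return false;
  return true;
}

/* Question 2 (one case): true if A and B define exactly the same versions,
   in any order, so the declarations are candidates for merging.  */
static bool
sets_equal (const struct version_set *a, const struct version_set *b)
{
  assert (a != NULL && b != NULL);
  if (a->count != b->count)
    return false;
  for (size_t i = 0; i < a->count; i++)
    if (!set_contains (b, a->names[i]))
      return false;
  return true;
}
```

A pair of sets that is neither distinct nor mergeable is the case that gets
diagnosed as conflicting.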

This only affects targets with TARGET_HAS_FMV_TARGET_ATTRIBUTE set to false
(i.e. aarch64 and riscv); the existing logic for i386 and ppc is unchanged.
This also means the common version hook is only used for aarch64 and riscv.

gcc/ChangeLog:

* attribs.cc (common_function_versions): Change to an error, existing
logic moved to distinct_version_decls.
* attribs.h (common_function_versions): Change arguments.
* config/aarch64/aarch64.cc (aarch64_common_function_versions):
New function.
* config/riscv/riscv.cc (riscv_common_function_versions): New function.
* doc/tm.texi: Regenerated.
* target.def: Change common_function_versions hook.
* tree.cc (distinct_version_decls): New function.
(mergeable_version_decls): Ditto.
* tree.h (distinct_version_decls): New function.
(mergeable_version_decls): Ditto.
* hooks.h (hook_stringslice_stringslice_unreachable): New function.
* hooks.cc (hook_stringslice_stringslice_unreachable): New function.

gcc/cp/ChangeLog:

* class.cc (resolve_address_of_overloaded_function): Updated to use
distinct_version_decls instead of common_function_version hook.
* decl.cc (decls_match): Refactor to use distinct_version_decls and
to pass through conflicting_version argument.
(maybe_version_functions): Updated to use
distinct_version_decls instead of common_function_version hook.
(duplicate_decls): Add logic to handle conflicting unmergable decls
and improve diagnostics for conflicting versions.
* decl2.cc (check_classfn): Updated to use
distinct_version_decls instead of common_function_version hook.
---
 gcc/attribs.cc|  74 ++-
 gcc/attribs.h |   3 +-
 gcc/config/aarch64/aarch64.cc |  16 ++-
 gcc/config/riscv/riscv.cc |  30 +++--
 gcc/cp/class.cc   |   3 +-
 gcc/cp/decl.cc|   8 +-
 gcc/cp/decl2.cc   |   2 +-
 gcc/doc/tm.texi   |  11 +-
 gcc/hooks.cc  |   7 +
 gcc/hooks.h   |   1 +
 gcc/target.def|  13 +-
 gcc/tree.cc   | 235 ++
 gcc/tree.h|   6 +
 13 files changed, 305 insertions(+), 104 deletions(-)

diff --git a/gcc/attribs.cc b/gcc/attribs.cc
index c75fd1371fd..06785eaa136 100644
--- a/gcc/attribs.cc
+++ b/gcc/attribs.cc
@@ -1086,7 +1086,14 @@ make_attribute (string_slice name, string_slice 
arg_name, tree chain)
   return attr;
 }
 
-/* Common functions used for target clone support.  */
+/* Used for targets with target_version semantics.  */
+
+bool
+common_function_versions (string_slice fn1 ATTRIBUTE_UNUSED,
+ string_slice fn2 ATTRIBUTE_UNUSED)
+{
+  gcc_unreachable ();
+}
 
 /* Comparator function to be used in qsort routine to sort attribute
specification strings to "target".  */
@@ -1177,71 +1184,6 @@ sorted_attr_string (tree arglist)
   return ret_str;
 }
 
-
-/* This function returns true if FN1 and FN2 are versions of the same function,
-   that is, the target strings of the function decls are different.  This 
assumes
-   that FN1 and FN2 have the same signature.  */
-
-bool
-common_function_versions (tree fn1, tree fn2)
-{
-  tree attr1, attr2;
-  char *target1, *target2;
-  bool result;
-
-  if (TREE

[PATCH v5 06/24] Refactor record_function_versions.

2025-05-29 Thread Alfie Richards
Rename record_function_versions to add_function_version, and make it
explicit that it is adding a single version to the function structure.

Additionally, change the insertion point to always maintain priority ordering
of the versions.

This allows for removing logic for moving the default to the first
position which was duplicated across target specific code and enables
easier reasoning about function sets.
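
A minimal sketch of this kind of ordered insertion into a doubly linked list
is shown below. It is not the cgraph code: the version structure and
add_version are hypothetical, an integer replaces the target's priority
comparison, and ascending order after the default is an arbitrary choice
made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry in a function-version list.  */
struct version
{
  int priority;
  bool is_default;
  struct version *prev;
  struct version *next;
};

/* Insert NEW_V into the list that contains ANY and return the list head.
   The default version is kept first; the others follow in ascending
   priority.  */
static struct version *
add_version (struct version *any, struct version *new_v)
{
  assert (any != NULL && new_v != NULL);

  /* Rewind to the head of the list.  */
  struct version *head = any;
  while (head->prev != NULL)
    head = head->prev;

  struct version *before = NULL;
  struct version *after = head;

  /* A default version goes straight to the front.  Any other version
     skips the default and every entry with a lower priority.  */
  if (!new_v->is_default)
    while (after != NULL
           && (after->is_default || after->priority < new_v->priority))
      {
        before = after;
        after = after->next;
      }

  new_v->prev = before;
  new_v->next = after;
  if (before != NULL)
    before->next = new_v;
  if (after != NULL)
    after->prev = new_v;

  return before == NULL ? new_v : head;
}
```

Because the list is ordered at insertion time, code that later walks it has
no need to move the default entry to the front.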

gcc/ChangeLog:

* cgraph.cc (cgraph_node::record_function_versions): Refactor and
rename to...
(cgraph_node::add_function_version): new function.
* cgraph.h (cgraph_node::record_function_versions): Refactor and
rename to...
(cgraph_node::add_function_version): new function.
* config/aarch64/aarch64.cc (aarch64_get_function_versions_dispatcher):
Remove reordering.
* config/i386/i386-features.cc (ix86_get_function_versions_dispatcher):
Remove reordering.
* config/riscv/riscv.cc (riscv_get_function_versions_dispatcher):
Remove reordering.
* config/rs6000/rs6000.cc (rs6000_get_function_versions_dispatcher):
Remove reordering.

gcc/cp/ChangeLog:

* decl.cc (maybe_version_functions): Change record_function_versions
call to add_function_version.
---
 gcc/cgraph.cc| 75 +++-
 gcc/cgraph.h |  6 +--
 gcc/config/aarch64/aarch64.cc| 34 +++
 gcc/config/i386/i386-features.cc | 33 +++---
 gcc/config/riscv/riscv.cc| 38 +++-
 gcc/config/rs6000/rs6000.cc  | 35 +++
 gcc/cp/decl.cc   |  8 +++-
 7 files changed, 78 insertions(+), 151 deletions(-)

diff --git a/gcc/cgraph.cc b/gcc/cgraph.cc
index 3f95ca1fa85..e7c296851a8 100644
--- a/gcc/cgraph.cc
+++ b/gcc/cgraph.cc
@@ -231,45 +231,60 @@ cgraph_node::delete_function_version_by_decl (tree decl)
   decl_node->remove ();
 }
 
-/* Record that DECL1 and DECL2 are semantically identical function
+/* Add decl to the structure of semantically identical function versions.
+   The node is inserted at the point maintaining the priority ordering on the
versions.  */
 void
-cgraph_node::record_function_versions (tree decl1, tree decl2)
+cgraph_node::add_function_version (cgraph_function_version_info *fn_v,
+  tree decl)
 {
-  cgraph_node *decl1_node = cgraph_node::get_create (decl1);
-  cgraph_node *decl2_node = cgraph_node::get_create (decl2);
-  cgraph_function_version_info *decl1_v = NULL;
-  cgraph_function_version_info *decl2_v = NULL;
-  cgraph_function_version_info *before;
-  cgraph_function_version_info *after;
-
-  gcc_assert (decl1_node != NULL && decl2_node != NULL);
-  decl1_v = decl1_node->function_version ();
-  decl2_v = decl2_node->function_version ();
-
-  if (decl1_v != NULL && decl2_v != NULL)
-return;
-
-  if (decl1_v == NULL)
-decl1_v = decl1_node->insert_new_function_version ();
+  cgraph_node *decl_node = cgraph_node::get_create (decl);
+  cgraph_function_version_info *decl_v = NULL;
 
-  if (decl2_v == NULL)
-decl2_v = decl2_node->insert_new_function_version ();
+  gcc_assert (decl_node != NULL);
 
-  /* Chain decl2_v and decl1_v.  All semantically identical versions
- will be chained together.  */
+  decl_v = decl_node->function_version ();
 
-  before = decl1_v;
-  after = decl2_v;
+  /* If the nodes are already linked, skip.  */
+  if (decl_v != NULL && (decl_v->next || decl_v->prev))
+return;
 
-  while (before->next != NULL)
-before = before->next;
+  if (decl_v == NULL)
+decl_v = decl_node->insert_new_function_version ();
+
+  gcc_assert (decl_v);
+  gcc_assert (fn_v);
+
+  /* Go to start of the FMV structure.  */
+  while (fn_v->prev)
+fn_v = fn_v->prev;
+
+  cgraph_function_version_info *insert_point_before = NULL;
+  cgraph_function_version_info *insert_point_after = fn_v;
+
+  /* Find the insertion point for the new version to maintain ordering.
+ The default node must always go at the beginning.  */
+  if (!is_function_default_version (decl))
+while (insert_point_after
+  && (targetm.compare_version_priority
+(decl, insert_point_after->this_node->decl) > 0
+  || is_function_default_version
+   (insert_point_after->this_node->decl)
+  || lookup_attribute
+   ("target_clones",
+DECL_ATTRIBUTES (insert_point_after->this_node->decl
+  {
+   insert_point_before = insert_point_after;
+   insert_point_after = insert_point_after->next;
+  }
 
-  while (after->prev != NULL)
-after= after->prev;
+  decl_v->prev = insert_point_before;
+  decl_v->next= insert_point_after;
 
-  before->next = after;
-  after->prev = before;
+  if (insert_point_before)
+insert_point_before->next = decl_v;
+  if (insert_point_after)
+insert_point_after->prev = decl_v;
 }
 
 /* Initialize callgraph dump file.  

[PATCH v5 12/24] fmv: i386: Refactor FMV name mangling.

2025-05-29 Thread Alfie Richards
This patch is an overhaul of how FMV name mangling works. Previously
mangling logic was duplicated in several places across both target
specific and independent code. This patch changes this such that all
mangling is done in targetm.mangle_decl_assembler_name (including for the
dispatched symbol and dispatcher resolver).

This allows for the removing of previous hacks, such as where the default
mangled decl's assembler name was unmangled to then remangle all versions
and the resolver and dispatched symbol.

This introduces a change (shown in test changes) for the assembler name of the
dispatched symbol for an x86 versioned function set. Previously it used the
function name mangled twice. This was hard to reproduce without hacks I
wasn't comfortable with. Therefore, the mangling is changed to instead append
".ifunc" which matches clang's behavior.

This change also refactors expand_target_clone using
targetm.mangle_decl_assembler_name for mangling and get_clone_versions.
It is modified such that if the target_clone is in an FMV structure
the ordering is preserved once expanded. This is used later for ACLE semantics
and target_clone/target_version mixing.
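
To make the suffix scheme concrete, here is a small standalone sketch that
builds names of the shape seen in the x86 symbol tests (for example
_Z3foov.arch_slm and _Z3foov.resolver). The suffixed_name helper is
hypothetical, and replacing only '=' with '_' is an assumption read off
those test expectations; the real hook handles more than this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Write BASE, a '.', and SUFFIX into OUT, replacing each '=' in the suffix
   with '_'.  Return 0 on success, or -1 if OUT is too small.  */
static int
suffixed_name (char *out, size_t out_size, const char *base,
               const char *suffix)
{
  assert (out != NULL && base != NULL && suffix != NULL);

  int n = snprintf (out, out_size, "%s.%s", base, suffix);
  if (n < 0 || (size_t) n >= out_size)
    return -1;

  /* Only sanitize the suffix part; the base name is left untouched.  */
  for (char *p = out + strlen (base) + 1; *p != '\0'; p++)
    if (*p == '=')
      *p = '_';
  return 0;
}
```

Under this model the dispatched symbol for a base name is simply the base
name with the "ifunc" suffix, and the resolver uses the "resolver" suffix.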

gcc/ChangeLog:

* attribs.cc (make_dispatcher_decl): Move duplicated cgraph logic into
this function and change to use targetm.mangle_decl_assembler_name for
mangling.
* cgraph.cc (delete_function_version): Made public static member of
cgraph_node.
* cgraph.h (delete_function_version): Ditto.
* config/aarch64/aarch64.cc (aarch64_parse_fmv_features): Change to
support string_slice.
(aarch64_process_target_version_attr): Ditto.
(get_feature_mask_for_version): Ditto.
(aarch64_mangle_decl_assembler_name): Add logic for mangling dispatched
symbol and resolver.
(get_suffixed_assembler_name): Removed.
(make_resolver_func): Refactor to use
aarch64_mangle_decl_assembler_name for mangling.
(aarch64_generate_version_dispatcher_body): Remove remangling.
(aarch64_get_function_versions_dispatcher): Refactor to remove
duplicated cgraph logic.
* config/i386/i386-features.cc (is_valid_asm_symbol): Moved from
multiple_target.cc.
(create_new_asm_name): Ditto.
(ix86_mangle_function_version_assembler_name): Refactor to use
clone_identifier and to mangle default.
(ix86_mangle_decl_assembler_name): Add logic for mangling dispatched
symbol and resolver.
(ix86_get_function_versions_dispatcher): Remove duplicated cgraph
logic.
(make_resolver_func): Refactor to use ix86_mangle_decl_assembler_name
for mangling.
* config/riscv/riscv.cc (riscv_mangle_decl_assembler_name): Add logic
for FMV mangling.
(get_suffixed_assembler_name): Removed.
(make_resolver_func): Refactor to use riscv_mangle_decl_assembler_name
for mangling.
(riscv_generate_version_dispatcher_body): Remove unnecessary remangling.
(riscv_get_function_versions_dispatcher): Remove duplicated cgraph
logic.
* config/rs6000/rs6000.cc (rs6000_mangle_decl_assembler_name): New
function.
(rs6000_get_function_versions_dispatcher): Remove duplicated cgraph
logic.
(make_resolver_func): Refactor to use rs6000_mangle_decl_assembler_name
for mangling.
(is_valid_asm_symbol): Move from multiple_target.cc.
(create_new_asm_name): Ditto.
(rs6000_mangle_function_version_assembler_name): New function.
* multiple_target.cc (create_dispatcher_calls): Remove mangling code.
(get_attr_str): Removed.
(separate_attrs): Ditto.
(is_valid_asm_symbol): Moved to target specific.
(create_new_asm_name): Ditto.
(expand_target_clones): Refactor to use
targetm.mangle_decl_assembler_name for mangling and be more general.
* tree.cc (get_target_clone_attr_len): Removed.
* tree.h (get_target_clone_attr_len): Removed.

gcc/cp/ChangeLog:

* decl.cc (maybe_mark_function_versioned): Change to insert function 
version
and therefore record assembler name.

gcc/testsuite/ChangeLog:

* g++.target/i386/mv-symbols1.C: Update x86 FMV mangling.
* g++.target/i386/mv-symbols3.C: Ditto.
* g++.target/i386/mv-symbols4.C: Ditto.
* g++.target/i386/mv-symbols5.C: Ditto.
---
 gcc/attribs.cc  |  45 +++-
 gcc/cgraph.cc   |   4 +-
 gcc/cgraph.h|   2 +
 gcc/config/aarch64/aarch64.cc   | 163 +---
 gcc/config/i386/i386-features.cc| 108 +---
 gcc/config/riscv/riscv.cc   | 110 +++-
 gcc/config/rs6000/rs6000.cc | 115 +++--
 gcc/cp/decl.cc  |   7 +
 gcc/multiple_target.cc  | 262 +++-
 gcc/testsu

[PATCH v5 02/24] i386: Add x86 FMV symbol tests

2025-05-29 Thread Alfie Richards
From: Alice Carlotti 

This is for testing the x86 mangling of FMV versioned function
assembly names.

gcc/testsuite/ChangeLog:

* g++.target/i386/mv-symbols1.C: New test.
* g++.target/i386/mv-symbols2.C: New test.
* g++.target/i386/mv-symbols3.C: New test.
* g++.target/i386/mv-symbols4.C: New test.
* g++.target/i386/mv-symbols5.C: New test.
* g++.target/i386/mvc-symbols1.C: New test.
* g++.target/i386/mvc-symbols2.C: New test.
* g++.target/i386/mvc-symbols3.C: New test.
* g++.target/i386/mvc-symbols4.C: New test.

Co-authored-by: Alfie Richards 
---
 gcc/testsuite/g++.target/i386/mv-symbols1.C  | 68 
 gcc/testsuite/g++.target/i386/mv-symbols2.C  | 56 
 gcc/testsuite/g++.target/i386/mv-symbols3.C  | 44 +
 gcc/testsuite/g++.target/i386/mv-symbols4.C  | 50 ++
 gcc/testsuite/g++.target/i386/mv-symbols5.C  | 56 
 gcc/testsuite/g++.target/i386/mvc-symbols1.C | 44 +
 gcc/testsuite/g++.target/i386/mvc-symbols2.C | 29 +
 gcc/testsuite/g++.target/i386/mvc-symbols3.C | 35 ++
 gcc/testsuite/g++.target/i386/mvc-symbols4.C | 23 +++
 9 files changed, 405 insertions(+)
 create mode 100644 gcc/testsuite/g++.target/i386/mv-symbols1.C
 create mode 100644 gcc/testsuite/g++.target/i386/mv-symbols2.C
 create mode 100644 gcc/testsuite/g++.target/i386/mv-symbols3.C
 create mode 100644 gcc/testsuite/g++.target/i386/mv-symbols4.C
 create mode 100644 gcc/testsuite/g++.target/i386/mv-symbols5.C
 create mode 100644 gcc/testsuite/g++.target/i386/mvc-symbols1.C
 create mode 100644 gcc/testsuite/g++.target/i386/mvc-symbols2.C
 create mode 100644 gcc/testsuite/g++.target/i386/mvc-symbols3.C
 create mode 100644 gcc/testsuite/g++.target/i386/mvc-symbols4.C

diff --git a/gcc/testsuite/g++.target/i386/mv-symbols1.C 
b/gcc/testsuite/g++.target/i386/mv-symbols1.C
new file mode 100644
index 000..1290299aea5
--- /dev/null
+++ b/gcc/testsuite/g++.target/i386/mv-symbols1.C
@@ -0,0 +1,68 @@
+/* { dg-do compile } */
+/* { dg-require-ifunc "" } */
+/* { dg-options "-O0" } */
+
+__attribute__((target("default")))
+int foo ()
+{
+  return 1;
+}
+
+__attribute__((target("arch=slm")))
+int foo ()
+{
+  return 3;
+}
+
+__attribute__((target("sse4.2")))
+int foo ()
+{
+  return 5;
+}
+
+__attribute__((target("sse4.2")))
+int foo (int)
+{
+  return 6;
+}
+
+__attribute__((target("arch=slm")))
+int foo (int)
+{
+  return 4;
+}
+
+__attribute__((target("default")))
+int foo (int)
+{
+  return 2;
+}
+
+int bar()
+{
+  return foo ();
+}
+
+int bar(int x)
+{
+  return foo (x);
+}
+
+/* When updating any of the symbol names in these tests, make sure to also
+   update any tests for their absence in mvc-symbolsN.C */
+
+/* { dg-final { scan-assembler-times "\n_Z3foov:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n_Z3foov\.arch_slm:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n_Z3foov\.sse4.2:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n_Z3foov\.resolver:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n\tcall\t_Z7_Z3foovv\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n\t\.type\t_Z7_Z3foovv, 
@gnu_indirect_function\n" 1 } } */
+/* { dg-final { scan-assembler-times 
"\n\t\.set\t_Z7_Z3foovv,_Z3foov\.resolver\n" 1 } } */
+
+/* { dg-final { scan-assembler-times "\n_Z3fooi:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n_Z3fooi\.arch_slm:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n_Z3fooi\.sse4.2:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n_Z3fooi\.resolver:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n\tcall\t_Z7_Z3fooii\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n\t\.type\t_Z7_Z3fooii, 
@gnu_indirect_function\n" 1 } } */
+/* { dg-final { scan-assembler-times 
"\n\t\.set\t_Z7_Z3fooii,_Z3fooi\.resolver\n" 1 } } */
diff --git a/gcc/testsuite/g++.target/i386/mv-symbols2.C 
b/gcc/testsuite/g++.target/i386/mv-symbols2.C
new file mode 100644
index 000..8b75565d78d
--- /dev/null
+++ b/gcc/testsuite/g++.target/i386/mv-symbols2.C
@@ -0,0 +1,56 @@
+/* { dg-do compile } */
+/* { dg-require-ifunc "" } */
+/* { dg-options "-O0" } */
+
+__attribute__((target("default")))
+int foo ()
+{
+  return 1;
+}
+
+__attribute__((target("arch=slm")))
+int foo ()
+{
+  return 3;
+}
+
+__attribute__((target("sse4.2")))
+int foo ()
+{
+  return 5;
+}
+
+__attribute__((target("sse4.2")))
+int foo (int)
+{
+  return 6;
+}
+
+__attribute__((target("arch=slm")))
+int foo (int)
+{
+  return 4;
+}
+
+__attribute__((target("default")))
+int foo (int)
+{
+  return 2;
+}
+
+/* When updating any of the symbol names in these tests, make sure to also
+   update any tests for their absence in mvc-symbolsN.C */
+
+/* { dg-final { scan-assembler-times "\n_Z3foov:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n_Z3foov\.arch_slm:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n_Z3foov\.sse4.2:\n" 1 } } */
+/* { dg-final { scan-assembl

[PATCH v5 00/24] FMV refactor, C FMV support and ACLE compliance.

2025-05-29 Thread Alfie Richards
Hi all,

This is a minor update to V4.

Firstly, it updates the diagnostics for versioned function decls to include
the target_version/target_clones attribute, and simplifies the diagnostic logic
changes in the C and C++ frontends.

Secondly, I merged this series with my C FMV support series and my FMV inlining
optimization patch. It proved keeping 3 patch series in sync was more bother
than it was worth.

Kind regards,
Alfie Richards

Change log
==

V5:
- Merged patch series with C support series, and FMV call inlining optimization
- Addressed Jason's feedback and simplified the diagnostics for FMV.

V4: https://gcc.gnu.org/pipermail/gcc-patches/2025-April/681047.html
- Changed version_info structure to be sorted by priority
- Split the target_clones pass into early/late stages
- Split out fix for PR c++/119498

V3: https://gcc.gnu.org/pipermail/gcc-patches/2025-March/679488.html
- Added reject target_clones version logic and hook
- Added pretty print for string_slice
- Refactored merging and conflict logic in front end
- Improved diagnostics

V2: https://gcc.gnu.org/pipermail/gcc-patches/2025-February/675960.html
- Changed recording of assembly name to be done in version info initialisation
- Changed behaviour for a lone default decl

V1: https://gcc.gnu.org/pipermail/gcc-patches/2025-February/674973.html
- Initial

Alfie Richards (22):
  Add string_slice class.
  Remove unnecessary `record` argument from maybe_version_functions.
  Update is_function_default_version to work with target_version.
  Refactor record_function_versions.
  Change make_attribute to take string_slice.
  Add get_clone_versions and get_target_version functions.
  Add assembler_name to cgraph_function_version_info.
  Add dispatcher_resolver_function and is_target_clone flags to
cgraph_node.
  Add clone_identifier function.
  fmv: i386: Refactor FMV name mangling.
  riscv: Refactor riscv target parsing to take string_slice.
  fmv: Add reject_target_clone hook for filtering target_clone versions.
  fmv: Change target_version semantics to follow ACLE specification.
  c/c++: Add target_[version/clones] to decl diagnostics formatting.
  c++: Refactor FMV frontend conflict and merging logic and hooks.
  fmv: Support mixing of target_clones and target_version.
  c++: Fix FMV return type ambiguation
  aarch64: testsuite: Add diagnostic tests for Aarch64 FMV.
  aarch64: Remove FMV beta warning.
  c: Add target_version attribute support.
  c/aarch64: Add FMV diagnostic tests.
  FMV: Redirect to specific target

Alice Carlotti (2):
  ppc: Add PowerPC FMV symbol tests.
  i386: Add x86 FMV symbol tests

 gcc/attribs.cc| 186 +++
 gcc/attribs.h |   6 +-
 gcc/c-family/c-attribs.cc |  33 +-
 gcc/c-family/c-format.cc  |   7 +
 gcc/c-family/c-format.h   |   1 +
 gcc/c-family/c-pretty-print.cc|  65 +++
 gcc/c-family/c-pretty-print.h |   2 +
 gcc/c/c-decl.cc   | 113 
 gcc/c/c-objc-common.cc|   6 +
 gcc/cgraph.cc |  80 +--
 gcc/cgraph.h  |  29 +-
 gcc/cgraphclones.cc   |  16 +-
 gcc/cgraphunit.cc |   9 +
 gcc/config/aarch64/aarch64.cc | 299 +-
 gcc/config/aarch64/aarch64.opt|   2 +-
 gcc/config/i386/i386-features.cc  | 141 ++---
 gcc/config/riscv/riscv-protos.h   |   2 +
 gcc/config/riscv/riscv-target-attr.cc |  14 +-
 gcc/config/riscv/riscv.cc | 267 -
 gcc/config/rs6000/rs6000.cc   | 150 +++--
 gcc/cp/call.cc|  10 +
 gcc/cp/class.cc   |  18 +-
 gcc/cp/cp-gimplify.cc |  11 +-
 gcc/cp/cp-tree.h  |   2 +-
 gcc/cp/cxx-pretty-print.h |   4 +
 gcc/cp/decl.cc|  55 +-
 gcc/cp/decl2.cc   |   2 +-
 gcc/cp/error.cc   |   3 +
 gcc/cp/typeck.cc  |  10 +
 gcc/doc/invoke.texi   |   5 +-
 gcc/doc/tm.texi   |  20 +-
 gcc/doc/tm.texi.in|   4 +
 gcc/hooks.cc  |  13 +
 gcc/hooks.h   |   4 +
 gcc/ipa.cc|  11 +
 gcc/multiple_target.cc| 517 ++
 gcc/passes.def|   3 +-
 gcc/pretty-print.cc   |  10 +
 gcc/target.def|  30 +-
 .../g++.target/aarch64/fmv-selection1.C   |  40 ++
 .../g++.target/aarch64/fmv-selection2.C   |  40 ++
 .../g++.target/aarch64/fmv-selection3.C   |  25 +
 ..

[PATCH v5 01/24] ppc: Add PowerPC FMV symbol tests.

2025-05-29 Thread Alfie Richards
From: Alice Carlotti 

This tests the mangling of function assembly names when annotated with
target_clones attributes.

gcc/testsuite/ChangeLog:

* g++.target/powerpc/mvc-symbols1.C: New test.
* g++.target/powerpc/mvc-symbols2.C: New test.
* g++.target/powerpc/mvc-symbols3.C: New test.
* g++.target/powerpc/mvc-symbols4.C: New test.

Co-authored-by: Alfie Richards 
---
 .../g++.target/powerpc/mvc-symbols1.C | 47 +++
 .../g++.target/powerpc/mvc-symbols2.C | 35 ++
 .../g++.target/powerpc/mvc-symbols3.C | 41 
 .../g++.target/powerpc/mvc-symbols4.C | 29 
 4 files changed, 152 insertions(+)
 create mode 100644 gcc/testsuite/g++.target/powerpc/mvc-symbols1.C
 create mode 100644 gcc/testsuite/g++.target/powerpc/mvc-symbols2.C
 create mode 100644 gcc/testsuite/g++.target/powerpc/mvc-symbols3.C
 create mode 100644 gcc/testsuite/g++.target/powerpc/mvc-symbols4.C

diff --git a/gcc/testsuite/g++.target/powerpc/mvc-symbols1.C 
b/gcc/testsuite/g++.target/powerpc/mvc-symbols1.C
new file mode 100644
index 000..9424382bf14
--- /dev/null
+++ b/gcc/testsuite/g++.target/powerpc/mvc-symbols1.C
@@ -0,0 +1,47 @@
+/* { dg-do compile } */
+/* { dg-require-ifunc "" } */
+/* { dg-options "-O0" } */
+
+__attribute__((target_clones("default", "cpu=power6", "cpu=power6x")))
+int foo ()
+{
+  return 1;
+}
+
+__attribute__((target_clones("cpu=power6x", "cpu=power6", "default")))
+int foo (int)
+{
+  return 2;
+}
+
+int bar()
+{
+  return foo ();
+}
+
+int bar(int x)
+{
+  return foo (x);
+}
+
+/* { dg-final { scan-assembler-times "\n_Z3foov\.default:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n_Z3foov\.cpu_power6:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n_Z3foov\.cpu_power6x:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n_Z3foov\.resolver:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n\tbl _Z3foov\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n\t\.type\t_Z3foov, 
@gnu_indirect_function\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n\t\.set\t_Z3foov,_Z3foov\.resolver\n" 
1 } } */
+/* { dg-final { scan-assembler-times "\n\t\.quad\t_Z3foov\.default\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n\t\.quad\t_Z3foov\.cpu_power6\n" 1 } } 
*/
+/* { dg-final { scan-assembler-times "\n\t\.quad\t_Z3foov\.cpu_power6x\n" 0 } 
} */
+
+/* { dg-final { scan-assembler-times "\n_Z3fooi\.default:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n_Z3fooi\.cpu_power6:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n_Z3fooi\.cpu_power6x:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n_Z3fooi\.resolver:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n\tbl _Z3fooi\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n\t\.type\t_Z3fooi, 
@gnu_indirect_function\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n\t\.set\t_Z3fooi,_Z3fooi\.resolver\n" 
1 } } */
+/* { dg-final { scan-assembler-times "\n\t\.quad\t_Z3fooi\.default\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n\t\.quad\t_Z3fooi\.cpu_power6\n" 0 } } 
*/
+/* { dg-final { scan-assembler-times "\n\t\.quad\t_Z3fooi\.cpu_power6x\n" 1 } 
} */
diff --git a/gcc/testsuite/g++.target/powerpc/mvc-symbols2.C 
b/gcc/testsuite/g++.target/powerpc/mvc-symbols2.C
new file mode 100644
index 000..edf54480efd
--- /dev/null
+++ b/gcc/testsuite/g++.target/powerpc/mvc-symbols2.C
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-require-ifunc "" } */
+/* { dg-options "-O0" } */
+
+__attribute__((target_clones("default", "cpu=power6", "cpu=power6x")))
+int foo ()
+{
+  return 1;
+}
+
+__attribute__((target_clones("cpu=power6x", "cpu=power6", "default")))
+int foo (int)
+{
+  return 2;
+}
+
+/* { dg-final { scan-assembler-times "\n_Z3foov\.default:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n_Z3foov\.cpu_power6:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n_Z3foov\.cpu_power6x:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n_Z3foov\.resolver:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n\t\.type\t_Z3foov, 
@gnu_indirect_function\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n\t\.set\t_Z3foov,_Z3foov\.resolver\n" 
1 } } */
+/* { dg-final { scan-assembler-times "\n\t\.quad\t_Z3foov\.default\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n\t\.quad\t_Z3foov\.cpu_power6\n" 1 } } 
*/
+/* { dg-final { scan-assembler-times "\n\t\.quad\t_Z3foov\.cpu_power6x\n" 0 } 
} */
+
+/* { dg-final { scan-assembler-times "\n_Z3fooi\.default:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n_Z3fooi\.cpu_power6:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n_Z3fooi\.cpu_power6x:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n_Z3fooi\.resolver:\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n\t\.type\t_Z3fooi, 
@gnu_indirect_function\n" 1 } } */
+/* { dg-final { scan-assembler-times "\n\t\.set\t_Z3fooi,_Z3fooi\.resolver\n" 
1 } } */
+/* { dg-final { scan-assembler-times "\n\t\.q

[PATCH v5 04/24] Remove unnecessary `record` argument from maybe_version_functions.

2025-05-29 Thread Alfie Richards
Previously, the `record` argument in maybe_version_functions allowed the
call to cgraph_node::record_function_versions to be skipped.  However,
this was only skipped when both decls were already marked as versioned,
in which case we trigger the early exit in record_function_versions
instead. Therefore, the argument is unnecessary.

gcc/cp/ChangeLog:

* class.cc (add_method): Remove argument.
* cp-tree.h (maybe_version_functions): Ditto.
* decl.cc (decls_match): Ditto.
(maybe_version_functions): Ditto.
---
 gcc/cp/class.cc  |  2 +-
 gcc/cp/cp-tree.h |  2 +-
 gcc/cp/decl.cc   | 13 +
 3 files changed, 7 insertions(+), 10 deletions(-)

diff --git a/gcc/cp/class.cc b/gcc/cp/class.cc
index db39e579870..ac59bbb0058 100644
--- a/gcc/cp/class.cc
+++ b/gcc/cp/class.cc
@@ -1402,7 +1402,7 @@ add_method (tree type, tree method, bool via_using)
   /* If these are versions of the same function, process and
 move on.  */
   if (TREE_CODE (fn) == FUNCTION_DECL
- && maybe_version_functions (method, fn, true))
+ && maybe_version_functions (method, fn))
continue;
 
   if (DECL_INHERITED_CTOR (method))
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 19c0b452d86..44cda5b312e 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -7138,7 +7138,7 @@ extern void determine_local_discriminator (tree, tree = 
NULL_TREE);
 extern bool member_like_constrained_friend_p   (tree);
 extern bool fns_correspond (tree, tree);
 extern int decls_match (tree, tree, bool = true);
-extern bool maybe_version_functions(tree, tree, bool);
+extern bool maybe_version_functions(tree, tree);
 extern bool validate_constexpr_redeclaration   (tree, tree);
 extern bool merge_default_template_args(tree, tree, bool);
 extern tree duplicate_decls(tree, tree,
diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index a9ef28bfd80..3a1b59434ed 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -1214,9 +1214,7 @@ decls_match (tree newdecl, tree olddecl, bool 
record_versions /* = true */)
  && targetm.target_option.function_versions (newdecl, olddecl))
{
  if (record_versions)
-   maybe_version_functions (newdecl, olddecl,
-(!DECL_FUNCTION_VERSIONED (newdecl)
- || !DECL_FUNCTION_VERSIONED (olddecl)));
+   maybe_version_functions (newdecl, olddecl);
  return 0;
}
 }
@@ -1283,11 +1281,11 @@ maybe_mark_function_versioned (tree decl)
 }
 
 /* NEWDECL and OLDDECL have identical signatures.  If they are
-   different versions adjust them and return true.
-   If RECORD is set to true, record function versions.  */
+   different versions adjust them, record function versions, and return
+   true.  */
 
 bool
-maybe_version_functions (tree newdecl, tree olddecl, bool record)
+maybe_version_functions (tree newdecl, tree olddecl)
 {
   if (!targetm.target_option.function_versions (newdecl, olddecl))
 return false;
@@ -1310,8 +1308,7 @@ maybe_version_functions (tree newdecl, tree olddecl, bool 
record)
   maybe_mark_function_versioned (newdecl);
 }
 
-  if (record)
-cgraph_node::record_function_versions (olddecl, newdecl);
+  cgraph_node::record_function_versions (olddecl, newdecl);
 
   return true;
 }
-- 
2.34.1



[PATCH v5 19/24] c++: Fix FMV return type ambiguation

2025-05-29 Thread Alfie Richards
Add logic for the case of two FMV annotated functions with identical
signature other than the return type.

Previously this case was ignored; this patch changes the behavior to emit
a diagnostic.
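
As a rough illustration of the changed condition, the following sketch
models only the first clause of the `else if` in duplicate_decls as a plain
boolean function.  The function name and its parameters are hypothetical;
the remaining clauses of the real condition (such as the constrained friend
check) are deliberately left out.

```cpp
#include <cassert>

// Hypothetical model of the first clause of the condition in the diff:
// the pair is still treated as ambiguating when neither declaration is
// versioned, or (new in this patch) when the return types differ.
inline bool
diagnosed_as_ambiguating (bool new_versioned, bool old_versioned,
                          bool same_return_type)
{
  return (!new_versioned && !old_versioned) || !same_return_type;
}
```

Under this model, two versioned declarations that differ only in return
type are no longer exempt from the diagnostic, while versioned declarations
with matching return types still are.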

gcc/cp/ChangeLog:
PR c++/119498
* decl.cc (duplicate_decls): Change logic to not always exclude FMV
annotated functions in cases of return type non-ambiguation.
---
 gcc/cp/decl.cc | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index f439f456122..efd0c34fb67 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -2016,8 +2016,11 @@ duplicate_decls (tree newdecl, tree olddecl, bool 
hiding, bool was_hidden)
}
  /* For function versions, params and types match, but they
 are not ambiguous.  */
- else if ((!DECL_FUNCTION_VERSIONED (newdecl)
-   && !DECL_FUNCTION_VERSIONED (olddecl))
+ else if (((!DECL_FUNCTION_VERSIONED (newdecl)
+&& !DECL_FUNCTION_VERSIONED (olddecl))
+   || !comptypes (TREE_TYPE (TREE_TYPE (newdecl)),
+  TREE_TYPE (TREE_TYPE (olddecl)),
+  COMPARE_STRICT))
   /* Let constrained hidden friends coexist for now, we'll
  check satisfaction later.  */
   && !member_like_constrained_friend_p (newdecl)
-- 
2.34.1



[PATCH v5 24/24] FMV: Redirect to specific target

2025-05-29 Thread Alfie Richards
Adds an optimisation in FMV to redirect to a specific target if possible.

A call is redirected to a specific target if both:
- the caller can always call the callee version
- and, it is possible to rule out all higher priority versions of the callee
  fmv set. That is established either by the callee being the highest priority
  version, or each higher priority version of the callee implying that, were it
  resolved, a higher priority version of the caller would have been selected.

For this logic, introduces the new TARGET_OPTION_VERSION_A_IMPLIES_VERSION_B
hook. Adds a full implementation for AArch64, and a weaker default version
for other targets.

This allows the target to replace the previous optimisation as the new one is
able to cover the same case where two function sets implement the same versions.
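
The two conditions above can be sketched as a standalone model.  This is
not the code in multiple_target.cc: versions are represented as feature
bitmasks, "A implies B" is assumed to mean that A's features are a superset
of B's, and pick_redirect_target is a hypothetical name.  Both version
lists are assumed to be sorted by priority, highest first.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using features = std::uint64_t;

// Assumed meaning of "version A implies version B": every feature that
// B requires is also required by A.
inline bool
implies (features a, features b)
{
  return (a & b) == b;
}

// Returns the index of the callee version that a call from
// caller_versions[caller_idx] can be redirected to, or -1 if the call
// must still go through the dispatcher.
inline int
pick_redirect_target (const std::vector<features> &caller_versions,
                      std::size_t caller_idx,
                      const std::vector<features> &callee_versions)
{
  assert (caller_idx < caller_versions.size ());
  const features caller = caller_versions[caller_idx];

  for (std::size_t i = 0; i < callee_versions.size (); ++i)
    {
      // Condition 1: the caller can always call this callee version.
      // Every higher priority callee version was ruled out by an earlier
      // iteration, so this is the version the dispatcher would pick.
      if (implies (caller, callee_versions[i]))
        return static_cast<int> (i);

      // Condition 2: rule this callee version out if resolving it would
      // mean a higher priority caller version had been selected instead.
      bool ruled_out = false;
      for (std::size_t j = 0; j < caller_idx; ++j)
        if (implies (callee_versions[i], caller_versions[j]))
          ruled_out = true;

      if (!ruled_out)
        return -1;
    }
  return -1;
}
```

For example, with caller and callee sets that both implement {feature,
default}, the default caller is redirected to the default callee, because
the higher priority callee version implies the higher priority caller
version.  That is the case where two function sets implement the same
versions.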

gcc/ChangeLog:

* config/aarch64/aarch64.cc (aarch64_version_a_implies_version_b): New
function.
(TARGET_OPTION_VERSION_A_IMPLIES_VERSION_B): New define.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in: Add documentation for version_a_implies_version_b.
* multiple_target.cc (redirect_to_specific_clone): Add new optimisation
logic.
(ipa_target_clone): Add
* target.def: Remove TARGET_HAS_FMV_TARGET_ATTRIBUTE check.
* attribs.cc (version_a_implies_version_b): New function.
* attribs.h (version_a_implies_version_b): New function.

gcc/testsuite/ChangeLog:

* g++.target/aarch64/fmv-selection1.C: New test.
* g++.target/aarch64/fmv-selection2.C: New test.
* g++.target/aarch64/fmv-selection3.C: New test.
* g++.target/aarch64/fmv-selection4.C: New test.
* g++.target/aarch64/fmv-selection5.C: New test.
* g++.target/aarch64/fmv-selection6.C: New test.
---
 gcc/attribs.cc| 16 
 gcc/attribs.h |  1 +
 gcc/config/aarch64/aarch64.cc | 26 +
 gcc/doc/tm.texi   |  4 +
 gcc/doc/tm.texi.in|  2 +
 gcc/multiple_target.cc| 96 ---
 gcc/target.def|  9 ++
 .../g++.target/aarch64/fmv-selection1.C   | 40 
 .../g++.target/aarch64/fmv-selection2.C   | 40 
 .../g++.target/aarch64/fmv-selection3.C   | 25 +
 .../g++.target/aarch64/fmv-selection4.C   | 30 ++
 .../g++.target/aarch64/fmv-selection5.C   | 28 ++
 .../g++.target/aarch64/fmv-selection6.C   | 27 ++
 13 files changed, 311 insertions(+), 33 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/aarch64/fmv-selection1.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/fmv-selection2.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/fmv-selection3.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/fmv-selection4.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/fmv-selection5.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/fmv-selection6.C

diff --git a/gcc/attribs.cc b/gcc/attribs.cc
index 2ca82674f7c..66c77904404 100644
--- a/gcc/attribs.cc
+++ b/gcc/attribs.cc
@@ -1095,6 +1095,22 @@ common_function_versions (string_slice fn1 
ATTRIBUTE_UNUSED,
   gcc_unreachable ();
 }
 
+bool
+version_a_implies_version_b (tree fn1, tree fn2)
+{
+  const char *attr_name = TARGET_HAS_FMV_TARGET_ATTRIBUTE
+ ? "target"
+ : "target_version";
+
+  tree attr1 = lookup_attribute (attr_name, DECL_ATTRIBUTES (fn1));
+  tree attr2 = lookup_attribute (attr_name, DECL_ATTRIBUTES (fn2));
+
+  if (!attr1 || !attr2)
+return false;
+
+  return attribute_value_equal (attr1, attr2);
+}
+
 /* Comparator function to be used in qsort routine to sort attribute
specification strings to "target".  */
 
diff --git a/gcc/attribs.h b/gcc/attribs.h
index fc343c0eab5..b846ce0d3a2 100644
--- a/gcc/attribs.h
+++ b/gcc/attribs.h
@@ -58,6 +58,7 @@ extern bool common_function_versions (string_slice, 
string_slice);
 extern bool reject_target_clone_version (string_slice, location_t);
 extern tree make_dispatcher_decl (const tree);
 extern bool is_function_default_version (const tree);
+extern bool version_a_implies_version_b (tree, tree);
 extern void handle_ignored_attributes_option (vec *);
 
 /* Return a type like TTYPE except that its TYPE_ATTRIBUTES
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index abd300c8f39..902c8544c96 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -20345,6 +20345,29 @@ aarch64_compare_version_priority (tree decl1, tree 
decl2)
   return compare_feature_masks (mask1, mask2);
 }
 
+/* Check if version a implies version b.  */
+bool
+aarch64_version_a_implies_version_b (tree decl_a, tree decl_b)
+{
+  auto a_isa = aarch64_get_isa_flags
+(TREE_TARGET_OPTION (aarch64_fndecl_options (decl_a)));
+  auto b_isa = aarch64_get_isa_flags
+  

[PATCH v5 03/24] Add string_slice class.

2025-05-29 Thread Alfie Richards
The string_slice inherits from array_slice and is used to refer to a
substring of an array that is memory managed elsewhere without modifying
the underlying array.

For example, this is useful in cases such as when needing to refer to a
substring of an attribute in the syntax tree.

Adds some minimal helper functions for string_slice,
such as a strtok alternative, equality operators, strcmp, and a function
to strip whitespace from the beginning and end of a string_slice.
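
To make the intended usage concrete, here is a minimal standalone sketch of
a strtok alternative and a strip helper.  It does not use the string_slice
class from this patch: std::string_view stands in for the slice,
std::optional models an invalid slice, and the free functions tokenize and
strip are hypothetical names chosen to mirror the description above.

```cpp
#include <cassert>
#include <optional>
#include <string_view>

// Stand-in for a non-owning substring of a buffer managed elsewhere.
using slice = std::string_view;

// Hypothetical strtok-like helper.  Returns the text before the first
// separator and advances *rest past it.  When no separator remains, the
// whole remainder is returned and *rest is reset, which plays the role
// of an invalid slice.  The underlying buffer is never modified.
inline slice
tokenize (std::optional<slice> *rest, slice separators)
{
  assert (rest && rest->has_value ());
  slice s = **rest;
  slice::size_type pos = s.find_first_of (separators);
  if (pos == slice::npos)
    {
      rest->reset ();
      return s;
    }
  *rest = s.substr (pos + 1);
  return s.substr (0, pos);
}

// Hypothetical helper: drop leading and trailing spaces and tabs.
inline slice
strip (slice s)
{
  const char *ws = " \t";
  slice::size_type first = s.find_first_not_of (ws);
  if (first == slice::npos)
    return slice ();
  slice::size_type last = s.find_last_not_of (ws);
  return s.substr (first, last - first + 1);
}
```

Unlike strtok, nothing is written into the buffer, so the same attribute
string can be tokenized repeatedly and the returned pieces remain views
into the original storage.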

gcc/c-family/ChangeLog:

* c-format.cc (local_string_slice_node): New node type.
(asm_fprintf_char_table): New entry.
(init_dynamic_diag_info): Add support for string_slice.
* c-format.h (T_STRING_SLICE): New node type.

gcc/ChangeLog:

* pretty-print.cc (format_phase_2): Add support for string_slice.
* vec.cc (string_slice::tokenize): New method.
(strcmp): New implementation for string_slice.
(string_slice::strip): New method.
(test_string_slice_initializers): New test.
(test_string_slice_tokenize): Ditto.
(test_string_slice_strcmp): Ditto.
(test_string_slice_equality): Ditto.
(test_string_slice_inequality): Ditto.
(test_string_slice_invalid): Ditto.
(test_string_slice_strip): Ditto.
(vec_cc_tests): Add new tests.
* vec.h (class string_slice): New class.
(strcmp): New implementation for string_slice.
---
 gcc/c-family/c-format.cc |   7 ++
 gcc/c-family/c-format.h  |   1 +
 gcc/pretty-print.cc  |  10 ++
 gcc/vec.cc   | 207 +++
 gcc/vec.h|  45 +
 5 files changed, 270 insertions(+)

diff --git a/gcc/c-family/c-format.cc b/gcc/c-family/c-format.cc
index a44249a0222..80430e9a8f7 100644
--- a/gcc/c-family/c-format.cc
+++ b/gcc/c-family/c-format.cc
@@ -70,6 +70,7 @@ static GTY(()) tree local_event_ptr_node;
 static GTY(()) tree local_pp_element_ptr_node;
 static GTY(()) tree local_gimple_ptr_node;
 static GTY(()) tree local_cgraph_node_ptr_node;
+static GTY(()) tree local_string_slice_node;
 static GTY(()) tree locus;
 
 static bool decode_format_attr (const_tree, tree, tree, function_format_info *,
@@ -770,6 +771,7 @@ static const format_char_info asm_fprintf_char_table[] =
   { "p",   1, STD_C89, { T89_V,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  
BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN  }, "q",  "c",  NULL }, \
   { "r",   1, STD_C89, { T89_C,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  
BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN  }, "","//cR",   NULL 
}, \
   { "@",   1, STD_C89, { T_EVENT_PTR,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  
BADLEN,  BADLEN,  BADLEN,  BADLEN  }, "", "\"",   NULL }, \
+  { "B",   1, STD_C89, { T_STRING_SLICE,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  
BADLEN,  BADLEN,  BADLEN,  BADLEN  }, "q", "",   NULL }, \
   { "e",   1, STD_C89, { T_PP_ELEMENT_PTR,   BADLEN,  BADLEN,  BADLEN,  
BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN  }, "", "\"", NULL }, \
   { "<",   0, STD_C89, NOARGUMENTS, "",  "<",   NULL }, \
   { ">",   0, STD_C89, NOARGUMENTS, "",  ">",   NULL }, \
@@ -5211,6 +5213,11 @@ init_dynamic_diag_info (void)
   || local_cgraph_node_ptr_node == void_type_node)
 local_cgraph_node_ptr_node = get_named_type ("cgraph_node");
 
+  /* Similar to the above but for string_slice*.  */
+  if (!local_string_slice_node
+  || local_string_slice_node == void_type_node)
+local_string_slice_node = get_named_type ("string_slice");
+
   /* Similar to the above but for diagnostic_event_id_t*.  */
   if (!local_event_ptr_node
   || local_event_ptr_node == void_type_node)
diff --git a/gcc/c-family/c-format.h b/gcc/c-family/c-format.h
index 323338cb8e7..d44d3862d83 100644
--- a/gcc/c-family/c-format.h
+++ b/gcc/c-family/c-format.h
@@ -317,6 +317,7 @@ struct format_kind_info
 #define T89_G   { STD_C89, NULL, &local_gimple_ptr_node }
 #define T_CGRAPH_NODE   { STD_C89, NULL, &local_cgraph_node_ptr_node }
 #define T_EVENT_PTR{ STD_C89, NULL, &local_event_ptr_node }
+#define T_STRING_SLICE{ STD_C89, NULL, &local_string_slice_node }
 #define T_PP_ELEMENT_PTR{ STD_C89, NULL, &local_pp_element_ptr_node }
 #define T89_T   { STD_C89, NULL, &local_tree_type_node }
 #define T89_V  { STD_C89, NULL, T_V }
diff --git a/gcc/pretty-print.cc b/gcc/pretty-print.cc
index 1f38702b611..4db0ce9120c 100644
--- a/gcc/pretty-print.cc
+++ b/gcc/pretty-print.cc
@@ -2034,6 +2034,16 @@ format_phase_2 (pretty_printer *pp,
pp_string (pp, va_arg (*text.m_args_ptr, const char *));
  break;
 
+   case 'B':
+ {
+   string_slice s = *va_arg (*text.m_args_ptr, string_slice *);
+   if (quote)
+ pp_quoted_string (pp, s.begin (), s.size ());
+   else
+ pp_string_n (pp, s.begin (), s.size ());
+   break;
+ }
+
case 'p':
  pp_pointer (pp, va_arg (*text.m_args_ptr, void *));
  

[PATCH v5 13/24] riscv: Refactor riscv target parsing to take string_slice.

2025-05-29 Thread Alfie Richards
This is a quick refactor of the riscv target processing code
to take a string_slice rather than a decl.

The reason for this is to enable it to work with target_clones
where merging logic requires reasoning about each version string
individually in the front end.

This refactor primarily serves just to get this working. Ideally the
logic here would be further refactored as currently there is no way to
check if a parse fails or not without emitting an error.
This makes things difficult for later patches, which intend to emit a
warning and ignore unrecognised or unparsed target_clones values rather
than erroring; that cannot be achieved with the current riscv code.

gcc/ChangeLog:

* config/riscv/riscv-protos.h (riscv_process_target_version_str): New
function.
* config/riscv/riscv-target-attr.cc (riscv_process_target_attr):
Refactor to take string_slice.
(riscv_process_target_version_str): Ditto.
* config/riscv/riscv.cc (parse_features_for_version): Refactor to take
string_slice.
(riscv_compare_version_priority): Ditto.
(dispatch_function_versions): Change to pass location.
---
 gcc/config/riscv/riscv-protos.h   |  2 ++
 gcc/config/riscv/riscv-target-attr.cc | 14 +---
 gcc/config/riscv/riscv.cc | 50 ++-
 3 files changed, 37 insertions(+), 29 deletions(-)

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index d8c8f6b5079..c4c0e401cfc 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -819,6 +819,8 @@ riscv_option_valid_attribute_p (tree, tree, tree, int);
 extern bool
 riscv_option_valid_version_attribute_p (tree, tree, tree, int);
 extern bool
+riscv_process_target_version_str (string_slice, location_t);
+extern bool
 riscv_process_target_version_attr (tree, location_t);
 extern void
 riscv_override_options_internal (struct gcc_options *);
diff --git a/gcc/config/riscv/riscv-target-attr.cc 
b/gcc/config/riscv/riscv-target-attr.cc
index 8ad3025579b..c255bd3906f 100644
--- a/gcc/config/riscv/riscv-target-attr.cc
+++ b/gcc/config/riscv/riscv-target-attr.cc
@@ -350,11 +350,11 @@ num_occurrences_in_str (char c, char *str)
and update the global target options space.  */
 
 bool
-riscv_process_target_attr (const char *args,
+riscv_process_target_attr (string_slice args,
   location_t loc,
   const struct riscv_attribute_info *attrs)
 {
-  size_t len = strlen (args);
+  size_t len = args.size ();
 
   /* No need to emit warning or error on empty string here, generic code 
already
  handle this case.  */
@@ -365,7 +365,7 @@ riscv_process_target_attr (const char *args,
 
   std::unique_ptr buf (new char[len+1]);
   char *str_to_check = buf.get ();
-  strcpy (str_to_check, args);
+  strncpy (str_to_check, args.begin (), args.size ());
 
   /* Used to catch empty spaces between semi-colons i.e.
  attribute ((target ("attr1;;attr2"))).  */
@@ -387,8 +387,7 @@ riscv_process_target_attr (const char *args,
 
   if (num_attrs != num_semicolons + 1)
 {
-  error_at (loc, "malformed % attribute",
-   args);
+  error_at (loc, "malformed % attribute", &args);
   return false;
 }
 
@@ -509,6 +508,11 @@ riscv_process_target_version_attr (tree args, location_t 
loc)
   return riscv_process_target_attr (str, loc, riscv_target_version_attrs);
 }
 
+bool
+riscv_process_target_version_str (string_slice str, location_t loc)
+{
+  return riscv_process_target_attr (str, loc, riscv_target_version_attrs);
+}
 
 /* Implement TARGET_OPTION_VALID_VERSION_ATTRIBUTE_P.  This is used to
process attribute ((target_version ("..."))).  */
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 72218fc6516..ba61fcc2c74 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -13259,31 +13259,22 @@ riscv_c_mode_for_floating_type (enum tree_index ti)
   return default_mode_for_floating_type (ti);
 }
 
-/* This parses the attribute arguments to target_version in DECL and modifies
-   the feature mask and priority required to select those targets.  */
-static void
-parse_features_for_version (tree decl,
+/* This parses STR and modifies the feature mask and priority required to
+   select those targets.  */
+static bool
+parse_features_for_version (string_slice version_str,
+   location_t loc,
struct riscv_feature_bits &res,
int &priority)
 {
-  tree version_attr = lookup_attribute ("target_version",
-   DECL_ATTRIBUTES (decl));
-  if (version_attr == NULL_TREE)
+  gcc_assert (version_str.is_valid ());
+  if (version_str == "default")
 {
   res.length = 0;
   priority = 0;
-  return;
+  return true;
 }
 
-  const char *version_string = TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE
- 

[PATCH v5 10/24] Add dispatcher_resolver_function and is_target_clone flags to cgraph_node.

2025-05-29 Thread Alfie Richards
These are needed to correctly mangle FMV declarations.

gcc/ChangeLog:

* cgraph.h (struct cgraph_node): Add dispatcher_resolver_function and
is_target_clone.
---
 gcc/cgraph.h | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index 6ea0179c7d1..cea1dcaad77 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -907,7 +907,9 @@ struct GTY((tag ("SYMTAB_FUNCTION"))) cgraph_node : public 
symtab_node
   used_as_abstract_origin (false),
   lowered (false), process (false), frequency (NODE_FREQUENCY_NORMAL),
   only_called_at_startup (false), only_called_at_exit (false),
-  tm_clone (false), dispatcher_function (false), calls_comdat_local 
(false),
+  tm_clone (false), dispatcher_function (false),
+  dispatcher_resolver_function (false), is_target_clone (false),
+  calls_comdat_local (false),
   icf_merged (false), nonfreeing_fn (false), merged_comdat (false),
   merged_extern_inline (false), parallelized_function (false),
   split_part (false), indirect_call_target (false), local (false),
@@ -1470,6 +1472,12 @@ struct GTY((tag ("SYMTAB_FUNCTION"))) cgraph_node : 
public symtab_node
   unsigned tm_clone : 1;
   /* True if this decl is a dispatcher for function versions.  */
   unsigned dispatcher_function : 1;
+  /* True if this decl is a resolver for function versions.  */
+  unsigned dispatcher_resolver_function : 1;
+  /* True if this is part of a multiversioned set and this version comes from
+ a target_clone attribute.  Or if this is a dispatched symbol or resolver
+ and the default version comes from a target_clones.  */
+  unsigned is_target_clone : 1;
   /* True if this decl calls a COMDAT-local function.  This is set up in
  compute_fn_summary and inline_call.  */
   unsigned calls_comdat_local : 1;
-- 
2.34.1



[PATCH v5 08/24] Add get_clone_versions and get_target_version functions.

2025-05-29 Thread Alfie Richards
This is a reimplementation of get_target_clone_attr_len,
get_attr_str, and separate_attrs using string_slice and auto_vec to make
memory management and use simpler.

Adds get_target_version helper function to get the target_version string
from a decl.
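
A minimal sketch of the splitting step, under stated assumptions: it uses
std::vector and std::string_view in place of auto_vec and string_slice,
assumes a comma as the separator, and split_versions is a hypothetical
name rather than a function from this patch.

```cpp
#include <string_view>
#include <vector>

// Hypothetical model of splitting one target_clones argument string into
// version strings.  The returned views point into ARG, so the storage
// behind ARG must outlive the result.  If DEFAULT_COUNT is non-null it is
// incremented for each "default" version found.
inline std::vector<std::string_view>
split_versions (std::string_view arg, int *default_count = nullptr)
{
  std::vector<std::string_view> versions;
  for (;;)
    {
      // substr (0, npos) yields the whole remainder for the last token.
      std::string_view::size_type comma = arg.find (',');
      std::string_view v = arg.substr (0, comma);

      // Trim surrounding spaces.
      while (!v.empty () && v.front () == ' ')
        v.remove_prefix (1);
      while (!v.empty () && v.back () == ' ')
        v.remove_suffix (1);

      if (v == "default" && default_count)
        ++*default_count;
      versions.push_back (v);

      if (comma == std::string_view::npos)
        break;
      arg.remove_prefix (comma + 1);
    }
  return versions;
}
```

Because the result is a container with its own length, a caller can test
for the single-version case by checking the size rather than relying on a
sentinel return value.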

gcc/c-family/ChangeLog:

* c-attribs.cc (handle_target_clones_attribute): Change to use
get_clone_versions.

gcc/ChangeLog:

* tree.cc (get_clone_versions): New function.
(get_clone_attr_versions): New function.
(get_target_version): New function.
* tree.h (get_clone_versions): New function.
(get_clone_attr_versions): New function.
(get_target_version): New function.
---
 gcc/c-family/c-attribs.cc |  4 ++-
 gcc/tree.cc   | 59 +++
 gcc/tree.h| 11 
 3 files changed, 73 insertions(+), 1 deletion(-)

diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index 5a0e3d328ba..5dff489fcca 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -6132,7 +6132,9 @@ handle_target_clones_attribute (tree *node, tree name, 
tree ARG_UNUSED (args),
}
}
 
-  if (get_target_clone_attr_len (args) == -1)
+  auto_vec<string_slice> versions= get_clone_attr_versions (args, NULL);
+
+  if (versions.length () == 1)
{
  warning (OPT_Wattributes,
   "single % attribute is ignored");
diff --git a/gcc/tree.cc b/gcc/tree.cc
index 98575a51f58..36fee9eeed5 100644
--- a/gcc/tree.cc
+++ b/gcc/tree.cc
@@ -15358,6 +15358,65 @@ get_target_clone_attr_len (tree arglist)
   return str_len_sum;
 }
 
+/* Returns an auto_vec of string_slices containing the version strings from
+   ARGLIST.  DEFAULT_COUNT is incremented for each default version found.  */
+
+auto_vec<string_slice>
+get_clone_attr_versions (const tree arglist, int *default_count)
+{
+  gcc_assert (TREE_CODE (arglist) == TREE_LIST);
+  auto_vec<string_slice> versions;
+
+  static const char separator_str[] = {TARGET_CLONES_ATTR_SEPARATOR, 0};
+  string_slice separators = string_slice (separator_str);
+
+  for (tree arg = arglist; arg; arg = TREE_CHAIN (arg))
+{
+  string_slice str = string_slice (TREE_STRING_POINTER (TREE_VALUE (arg)));
+  while (str.is_valid ())
+   {
+ string_slice attr = string_slice::tokenize (&str, separators);
+ attr = attr.strip ();
+
+ if (attr == "default" && default_count)
+   (*default_count)++;
+ versions.safe_push (attr);
+   }
+}
+  return versions;
+}
+
+/* Returns an auto_vec of string_slices containing the version strings from
+   the target_clone attribute from DECL.  DEFAULT_COUNT is incremented for each
+   default version found.  */
+auto_vec<string_slice>
+get_clone_versions (const tree decl, int *default_count)
+{
+  tree attr = lookup_attribute ("target_clones", DECL_ATTRIBUTES (decl));
+  if (!attr)
+return auto_vec<string_slice> ();
+  tree arglist = TREE_VALUE (attr);
+  return get_clone_attr_versions (arglist, default_count);
+}
+
+/* If DECL has a target_version attribute, returns a string_slice containing 
the
+   attribute value.  Otherwise, returns string_slice::invalid.
+   Only works for target_version due to target attributes allowing multiple
+   string arguments to specify one target.  */
+string_slice
+get_target_version (const tree decl)
+{
+  gcc_assert (!TARGET_HAS_FMV_TARGET_ATTRIBUTE);
+
+  tree attr = lookup_attribute ("target_version", DECL_ATTRIBUTES (decl));
+
+  if (!attr)
+return string_slice::invalid ();
+
+  return string_slice (TREE_STRING_POINTER (TREE_VALUE (TREE_VALUE (attr))))
+  .strip ();
+}
+
 void
 tree_cc_finalize (void)
 {
diff --git a/gcc/tree.h b/gcc/tree.h
index 1e41316b4c9..9adc5bae7fc 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -22,6 +22,7 @@ along with GCC; see the file COPYING3.  If not see
 
 #include "tree-core.h"
 #include "options.h"
+#include "vec.h"
 
 /* Convert a target-independent built-in function code to a combined_fn.  */
 
@@ -7052,4 +7053,14 @@ extern tree get_attr_nonstring_decl (tree, tree * = 
NULL);
 
 extern int get_target_clone_attr_len (tree);
 
+/* Returns the version string for a decl with target_version attribute.
+   Returns an invalid string_slice if no attribute is present.  */
+extern string_slice get_target_version (const tree);
+/* Returns a vector of the version strings from a target_clones attribute on
+   a decl.  Can also record the number of default versions found.  */
+extern auto_vec<string_slice> get_clone_versions (const tree, int * = NULL);
+/* Returns a vector of the version strings from a target_clones attribute
+   directly.  */
+extern auto_vec<string_slice> get_clone_attr_versions (const tree, int *);
+
 #endif  /* GCC_TREE_H  */
-- 
2.34.1



[PATCH v5 09/24] Add assembler_name to cgraph_function_version_info.

2025-05-29 Thread Alfie Richards
Add the assembler_name member to cgraph_function_version_info to store
the base assembler name of the function set, before FMV mangling. This is
used in later patches for refactoring FMV mangling.
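
A small sketch of why keeping the unmangled base name is convenient: each
versioned symbol can then be derived from it directly.  The suffix rule
below (a '.' separator, with '=' replaced by '_') is an assumption inferred
from symbol names such as _Z3foov.cpu_power6 in the PowerPC tests of this
series, and mangle_version is a hypothetical helper, not the mangling code
in GCC.

```cpp
#include <string>

// Hypothetical helper: derive a versioned symbol name from the base
// assembler name of the function set and a version string.
inline std::string
mangle_version (const std::string &base, const std::string &version)
{
  std::string out = base + ".";
  for (char c : version)
    out += (c == '=') ? '_' : c;   // Assumed sanitising rule.
  return out;
}
```

With the base name stored once, the default, resolver and per-target names
can all be produced from the same string without having to undo an earlier
mangling.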

gcc/ChangeLog:

* cgraph.cc (cgraph_node::insert_new_function_version): Record
assembler_name.
* cgraph.h (struct cgraph_function_version_info): Add assembler_name.
---
 gcc/cgraph.cc | 1 +
 gcc/cgraph.h  | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/gcc/cgraph.cc b/gcc/cgraph.cc
index e7c296851a8..e21cba43a7e 100644
--- a/gcc/cgraph.cc
+++ b/gcc/cgraph.cc
@@ -187,6 +187,7 @@ cgraph_node::insert_new_function_version (void)
   version_info_node = NULL;
   version_info_node = ggc_cleared_alloc ();
   version_info_node->this_node = this;
+  version_info_node->assembler_name = DECL_ASSEMBLER_NAME (this->decl);
 
   if (cgraph_fnver_htab == NULL)
 cgraph_fnver_htab = hash_table::create_ggc (2);
diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index 2c39a4a93d1..6ea0179c7d1 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -856,6 +856,9 @@ struct GTY((for_user)) cgraph_function_version_info {
  dispatcher. The dispatcher decl is an alias to the resolver
  function decl.  */
   tree dispatcher_resolver;
+
+  /* The assembly name of the function set before version mangling.  */
+  tree assembler_name;
 };
 
 #define DEFCIFCODE(code, type, string) CIF_ ## code,
-- 
2.34.1



Re: [PATCH] libstdc++: Compare keys and values separately in flat_map::operator==

2025-05-29 Thread Jonathan Wakely
On Thu, 29 May 2025 at 14:55, Patrick Palka wrote:
>
> Tested on x86_64-pc-linux-gnu, does this look OK for trunk/15?

OK for both, thanks.

>
> -- >8 --
>
> Instead of effectively doing a zipped comparison of the keys and values,
> compare them separately to leverage the underlying containers' optimized
> equality implementations.
>
> libstdc++-v3/ChangeLog:
>
> * include/std/flat_map (_Flat_map_impl::operator==): Compare
> keys and values separately.
> ---
>  libstdc++-v3/include/std/flat_map | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/libstdc++-v3/include/std/flat_map 
> b/libstdc++-v3/include/std/flat_map
> index c0716d12412a..134307324190 100644
> --- a/libstdc++-v3/include/std/flat_map
> +++ b/libstdc++-v3/include/std/flat_map
> @@ -873,7 +873,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>[[nodiscard]]
>friend bool
>operator==(const _Derived& __x, const _Derived& __y)
> -  { return std::equal(__x.begin(), __x.end(), __y.begin(), __y.end()); }
> +  {
> +   return __x._M_cont.keys == __y._M_cont.keys
> + && __x._M_cont.values == __y._M_cont.values;
> +  }
>
>template
> [[nodiscard]]
> --
> 2.50.0.rc0
>
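
For readers following along, the difference between the two comparison
strategies can be sketched outside libstdc++.  The flat_storage struct and
both function names below are hypothetical stand-ins, with std::vector
assumed for both underlying containers and keys and values assumed to have
equal length.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the pair of containers behind a flat_map.
template<typename K, typename V>
struct flat_storage
{
  std::vector<K> keys;
  std::vector<V> values;
};

// Zipped comparison: walk both containers in lock step, element by element.
template<typename K, typename V>
bool
equal_zipped (const flat_storage<K, V> &a, const flat_storage<K, V> &b)
{
  if (a.keys.size () != b.keys.size ())
    return false;
  for (std::size_t i = 0; i < a.keys.size (); ++i)
    if (a.keys[i] != b.keys[i] || a.values[i] != b.values[i])
      return false;
  return true;
}

// Separate comparison: delegate to each container's own operator==.
template<typename K, typename V>
bool
equal_separately (const flat_storage<K, V> &a, const flat_storage<K, V> &b)
{
  return a.keys == b.keys && a.values == b.values;
}
```

Both functions give the same answer; the second form simply hands each
comparison to the container, which may have a faster implementation for
its element type.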



Re: [PATCH v2 2/2] emit-rtl: Validate mode for paradoxical hardware subregs [PR119966]

2025-05-29 Thread Richard Sandiford
Sorry for the slow reply.

Dimitar Dimitrov  writes:
> On Fri, May 16, 2025 at 06:14:30PM +0100, Richard Sandiford wrote:
>> Dimitar Dimitrov  writes:
>> > After r16-160-ge6f89d78c1a752, late_combine2 started transforming the
>> > following RTL for pru-unknown-elf:
>> >
>> >   (insn 3949 3948 3951 255 (set (reg:QI 56 r14.b0 [orig:1856 _619 ] [1856])
>> >   (and:QI (reg:QI 1 r0.b1 [orig:1855 _201 ] [1855])
>> >   (const_int 3 [0x3])))
>> >(nil))
>> >   ...
>> >   (insn 3961 7067 3962 255 (set (reg:SI 56 r14.b0)
>> >   (zero_extend:SI (reg:QI 56 r14.b0 [orig:1856 _619 ] [1856])))
>> >(nil))
>> >
>> > into:
>> >
>> >   (insn 3961 7067 3962 255 (set (reg:SI 56 r14.b0)
>> >   (and:SI (subreg:SI (reg:QI 1 r0.b1 [orig:1855 _201 ] [1855]) 0)
>> >   (const_int 3 [0x3])))
>> >(nil))
>> >
>> > That caused libbacktrace build to break for pru-unknown-elf.  Register
>> > r0.b1 (regno 1) is not valid for SImode, which validate_subreg failed to
>> > reject.
>> >
>> > Fix by calling HARD_REGNO_MODE_OK to ensure that both inner and outer
>> > modes are valid for the hardware subreg.
>> >
>> > This patch fixes the broken PRU toolchain build.  It leaves only two
>> > test case regressions for PRU, caused by rnreg pass renaming a valid
>> > paradoxical subreg into an invalid one.
>> >   gcc.c-torture/execute/20040709-1.c
>> >   gcc.c-torture/execute/20040709-2.c
>> >
>> >PR target/119966
>> >
>> > gcc/ChangeLog:
>> >
>> >* emit-rtl.cc (validate_subreg): Validate inner
>> >and outer mode for paradoxical hardware subregs.
>> >
>> > Co-authored-by: Andrew Pinski 
>> > Signed-off-by: Dimitar Dimitrov 
>> > ---
>> >  gcc/emit-rtl.cc | 3 +++
>> >  1 file changed, 3 insertions(+)
>> >
>> > diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc
>> > index e46b0f9eac4..6c5d9b55508 100644
>> > --- a/gcc/emit-rtl.cc
>> > +++ b/gcc/emit-rtl.cc
>> > @@ -983,6 +983,9 @@ validate_subreg (machine_mode omode, machine_mode 
>> > imode,
>> >if ((COMPLEX_MODE_P (imode) || VECTOR_MODE_P (imode))
>> >  && GET_MODE_INNER (imode) == omode)
>> >;
>> > +  else if (!targetm.hard_regno_mode_ok (regno, imode)
>> > + || !targetm.hard_regno_mode_ok (regno, omode))
>> > +  return false;
>> 
>> It isn't meaningful to test regno against omode, since that isn't
>> necessarily the register that would be produced by the subreg.
>
> Do you refer to the register renaming pass?  I can't think of another reason
> for the regno of a hardware register in a subreg to be changed.

It's a general property of subregs.  For example, suppose we have
(reg:SI 0) on a 16-bit target.  Then:

  (subreg:HI (reg:SI 0) 2)

folds to (reg:HI 1) rather than (reg:HI 0).  It's therefore register 1
rather than register 0 that should be tested against HImode.
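
To make the arithmetic concrete, here is a minimal sketch under simplifying assumptions: all hard registers have the same width, multi-register values occupy consecutive register numbers in little-endian order, and there are no register-class restrictions. folded_regno is a hypothetical helper for illustration only; the real logic is in simplify_subreg_regno and handles much more.

```cpp
// Simplified model of which hard register a subreg of a hard register
// would fold to.  reg_bytes is the width of one hard register in bytes,
// subreg_byte is the SUBREG_BYTE offset into the inner value.
constexpr unsigned
folded_regno(unsigned inner_regno, unsigned subreg_byte, unsigned reg_bytes)
{
  return inner_regno + subreg_byte / reg_bytes;
}

// 16-bit target (2-byte registers): (subreg:HI (reg:SI 0) 2) -> (reg:HI 1).
static_assert(folded_regno(0, 2, 2) == 1, "high half lives in register 1");
// Offset 0 stays in the first register.
static_assert(folded_regno(0, 0, 2) == 0, "low half lives in register 0");
```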

>> ISTR that this is a sensitive part of the codebase.  I think there
>> are/were targets that create unfoldable subregs for argument passing
>> and return.  And I think e500 had unfoldable subregs of FP registers,
>> although that port is gone now.
>
> Could you share what is "unfoldable subreg"?  I could not find this phrase
> anywhere in the source, except in one comment in the i386 port.

Yeah, it was an ad-hoc term, sorry, but...

> Perhaps a subreg of a hardware register is "unfoldable" when the hardware
> register is not valid in the outer mode?  In which case the subreg cannot be
> replaced directly with a hardware register?

...yes, that's what I meant.

>> So I suppose the question is: when given a hard register, should
>> validate_subreg test whether the subreg can be folded to a hard
>> register?  Or is it more relaxed than that?  Do we need different
>> rules before LRA (which could fix up subregs through reloading)
>> and after LRA (where unfoldable subregs stay unfoldable).
>
> My naive answer _was_ that validate_subreg should always perform checks for
> hardware registers.  Now I see it was too naive, because I was not aware of
> the different ways targets use subregs.  Hence this patch should be dropped.
>
> Meanwhile PR119966, which this patch hoped to address, got fixed instead
> with r16-809-gf725d6765373f7.

Ah, nice!

>> If validate_subreg should test whether a subreg of a hard register
>> can be folded to a hard register, the fix would be to use
>> simplify_subreg_regno instead of the current tests.  But it looks
>> like that was deliberately not done.
>
> When validate_subreg was introduced with r0-63800-gbeb72684810c1a,
> simplify_subreg_regno simply did not exist.  The simplify_subreg_regno
> itself was added later with r0-89444-geef302d277ea42.

Ah, yeah, my bad.  I'd forgotten the history of these routines.

>> It might still be worth trying to use simplify_subreg_regno and
>> seeing what breaks.  Any fallout would at least let us expand
>> the comments to explain the constraints.
>
> I tried simplify_subreg_regno, and some tests regressed for 
> x86_64-pc-linux-gnu:
>
>   check-gcc-c

Re: [PATCH] c++, coroutines: Make a check more specific [PR109283].

2025-05-29 Thread Jason Merrill

On 5/29/25 11:34 AM, Iain Sandoe wrote:

Tested on x86_64-darwin, powerpc64le-linux; I'd like to minimize
effort on this code, since I expect that we will need some changes
to deal with open BZs.  This fixes an ICE tho,
OK for trunk?


OK.


thanks
Iain

--- 8< ---

The check was intended to assert that we had visited contained
ternary expressions with embedded co_awaits, but had been made
too general - and therefore was ICEing on code that was actually
OK.  Fixed by checking specifically that no co_awaits are embedded.

PR c++/109283

gcc/cp/ChangeLog:

* coroutines.cc (find_any_await): Only save the statement
pointer if the caller passes a place for it.
(flatten_await_stmt): When checking that ternary expressions
have been handled, also check that they contain a co_await.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/pr109283.C: New test.

Signed-off-by: Iain Sandoe 
---
  gcc/cp/coroutines.cc   |  8 +---
  gcc/testsuite/g++.dg/coroutines/pr109283.C | 23 ++
  2 files changed, 28 insertions(+), 3 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/coroutines/pr109283.C

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index c1c10782906..dbb21a2ff77 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -2878,8 +2878,8 @@ find_any_await (tree *stmt, int *dosub, void *d)
if (TREE_CODE (*stmt) == CO_AWAIT_EXPR)
  {
*dosub = 0; /* We don't need to consider this any further.  */
-  tree **p = (tree **) d;
-  *p = stmt;
+  if (d)
+   *(tree **)d = stmt;
return *stmt;
  }
return NULL_TREE;
@@ -3129,7 +3129,9 @@ flatten_await_stmt (var_nest_node *n, hash_set 
*promoted,
  bool already_present = promoted->add (var);
  gcc_checking_assert (!already_present);
  tree inner = TARGET_EXPR_INITIAL (init);
- gcc_checking_assert (TREE_CODE (inner) != COND_EXPR);
+ gcc_checking_assert
+   (TREE_CODE (inner) != COND_EXPR
+|| !cp_walk_tree (&inner, find_any_await, nullptr, nullptr));
  init = cp_build_modify_expr (input_location, var, INIT_EXPR, init,
   tf_warning_or_error);
  /* Simplify for the case that we have an init containing the temp
diff --git a/gcc/testsuite/g++.dg/coroutines/pr109283.C 
b/gcc/testsuite/g++.dg/coroutines/pr109283.C
new file mode 100644
index 000..d73092b595e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/coroutines/pr109283.C
@@ -0,0 +1,23 @@
+// PR 109283.
+// This used to ICE from a check set too widely.
+#include <coroutine>
+
+struct foo
+{ ~foo(); };
+
+struct task
+{
+   struct promise_type
+   {
+   std::suspend_never initial_suspend();
+   std::suspend_never final_suspend() noexcept;
+   std::suspend_never yield_value(foo);
+   void return_void();
+   void unhandled_exception();
+   task get_return_object();
+   };
+};
+
+task source(int b) {
+   co_yield b ? foo{} : foo{};
+}
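
The shape of the change can be sketched on a toy tree. Everything below (node, kind, cond_is_acceptable) is hypothetical and much simpler than GCC's tree walker; it only illustrates the two ideas in the ChangeLog: the finder records the match only when the caller supplies a place for it, and a conditional is only rejected when it still contains an await.

```cpp
#include <vector>

enum class kind { await, cond, other };

struct node
{
  kind k;
  std::vector<node*> kids;
};

// Return the first await node in pre-order, or nullptr if there is none.
// The match is stored through found_at only when found_at is non-null.
inline node*
find_any_await(node* n, node** found_at = nullptr)
{
  if (!n)
    return nullptr;
  if (n->k == kind::await)
    {
      if (found_at)
        *found_at = n;
      return n;
    }
  for (node* kid : n->kids)
    if (node* hit = find_any_await(kid, found_at))
      return hit;
  return nullptr;
}

// The relaxed check: a conditional is fine as long as no await is
// embedded in it; anything that is not a conditional is always fine.
inline bool
cond_is_acceptable(node* inner)
{
  return inner->k != kind::cond || !find_any_await(inner);
}
```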




Re: [PATCH] C: Flex array in the middle via type alias is not reported [PR120353]

2025-05-29 Thread Joseph Myers
On Thu, 29 May 2025, Qing Zhao wrote:

> The root cause of the bug is: the TYPE_INCLUDES_FLEXARRAY marking of the
> structure type is not copied to its aliased type.
> The fix is to copy this marking to all the variant types of the current
> structure type.
> 
> The patch has been bootstrapped and regression tested on both x86 and aarch64.
> Okay for trunk and also GCC14?
> 
> thanks.

OK.

-- 
Joseph S. Myers
josmy...@redhat.com



[pushed: r16-972] diagnostics: use unique_ptr for m_format_postprocessor

2025-05-29 Thread David Malcolm
No functional change intended.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r16-972-gafee0b19dfdc39.

gcc/cp/ChangeLog:
* error.cc (cxx_format_postprocessor::clone): Update to use
unique_ptr.
(cxx_dump_pretty_printer::cxx_dump_pretty_printer): Likewise.
(cxx_initialize_diagnostics): Likewise.

gcc/ChangeLog:
* pretty-print.cc (pretty_printer::pretty_printer): Use "nullptr"
rather than "NULL".  Remove explicit delete of
m_format_postprocessor.
* pretty-print.h (format_postprocessor::clone): Use unique_ptr.
(pretty_printer::set_format_postprocessor): New.
(pretty_printer::m_format_postprocessor): Use unique_ptr.
(pp_format_postprocessor): Update for use of unique_ptr, removing
reference from return type.

Signed-off-by: David Malcolm 
---
 gcc/cp/error.cc | 10 +-
 gcc/pretty-print.cc |  6 ++
 gcc/pretty-print.h  | 17 +++--
 3 files changed, 18 insertions(+), 15 deletions(-)

diff --git a/gcc/cp/error.cc b/gcc/cp/error.cc
index d52dad3db293..a6a4a8c6212e 100644
--- a/gcc/cp/error.cc
+++ b/gcc/cp/error.cc
@@ -182,9 +182,10 @@ class cxx_format_postprocessor : public 
format_postprocessor
   : m_type_a (), m_type_b ()
   {}
 
-  format_postprocessor *clone() const final override
+  std::unique_ptr<format_postprocessor>
+  clone() const final override
   {
-return new cxx_format_postprocessor ();
+return std::make_unique<cxx_format_postprocessor> ();
   }
 
   void handle (pretty_printer *pp) final override;
@@ -204,8 +205,7 @@ cxx_dump_pretty_printer (int phase)
   if (outf)
 {
   pp_format_decoder (this) = cp_printer;
-  /* This gets deleted in ~pretty_printer.  */
-  pp_format_postprocessor (this) = new cxx_format_postprocessor ();
+  set_format_postprocessor (std::make_unique<cxx_format_postprocessor> ());
   set_output_stream (outf);
 }
 }
@@ -301,7 +301,7 @@ void
 cxx_initialize_diagnostics (diagnostic_context *context)
 {
   cxx_pretty_printer *pp = new cxx_pretty_printer ();
-  pp_format_postprocessor (pp) = new cxx_format_postprocessor ();
+  pp->set_format_postprocessor (std::make_unique<cxx_format_postprocessor> ());
   context->set_pretty_printer (std::unique_ptr (pp));
 
   c_common_diagnostics_set_defaults (context);
diff --git a/gcc/pretty-print.cc b/gcc/pretty-print.cc
index 1f38702b6117..6ecfcb26c43c 100644
--- a/gcc/pretty-print.cc
+++ b/gcc/pretty-print.cc
@@ -2461,7 +2461,7 @@ pretty_printer::pretty_printer (int maximum_length)
 m_indent_skip (0),
 m_wrapping (),
 m_format_decoder (nullptr),
-m_format_postprocessor (NULL),
+m_format_postprocessor (nullptr),
 m_token_printer (nullptr),
 m_emitted_prefix (false),
 m_need_newline (false),
@@ -2487,7 +2487,7 @@ pretty_printer::pretty_printer (const pretty_printer 
&other)
   m_indent_skip (other.m_indent_skip),
   m_wrapping (other.m_wrapping),
   m_format_decoder (other.m_format_decoder),
-  m_format_postprocessor (NULL),
+  m_format_postprocessor (nullptr),
   m_token_printer (other.m_token_printer),
   m_emitted_prefix (other.m_emitted_prefix),
   m_need_newline (other.m_need_newline),
@@ -2508,8 +2508,6 @@ pretty_printer::pretty_printer (const pretty_printer 
&other)
 
 pretty_printer::~pretty_printer ()
 {
-  if (m_format_postprocessor)
-delete m_format_postprocessor;
   m_buffer->~output_buffer ();
   XDELETE (m_buffer);
   free (m_prefix);
diff --git a/gcc/pretty-print.h b/gcc/pretty-print.h
index 066d19d4cdac..6cd9150a9d08 100644
--- a/gcc/pretty-print.h
+++ b/gcc/pretty-print.h
@@ -196,7 +196,7 @@ class format_postprocessor
 {
  public:
   virtual ~format_postprocessor () {}
-  virtual format_postprocessor *clone() const = 0;
+  virtual std::unique_ptr<format_postprocessor> clone() const = 0;
   virtual void handle (pretty_printer *) = 0;
 };
 
@@ -229,7 +229,7 @@ inline int & pp_indentation (pretty_printer *pp);
 inline bool & pp_translate_identifiers (pretty_printer *pp);
 inline bool & pp_show_color (pretty_printer *pp);
 inline printer_fn &pp_format_decoder (pretty_printer *pp);
-inline format_postprocessor *& pp_format_postprocessor (pretty_printer *pp);
+inline format_postprocessor *pp_format_postprocessor (pretty_printer *pp);
 inline bool & pp_show_highlight_colors (pretty_printer *pp);
 
 class urlifier;
@@ -256,7 +256,7 @@ public:
   friend bool & pp_translate_identifiers (pretty_printer *pp);
   friend bool & pp_show_color (pretty_printer *pp);
   friend printer_fn &pp_format_decoder (pretty_printer *pp);
-  friend format_postprocessor *& pp_format_postprocessor (pretty_printer *pp);
+  friend format_postprocessor * pp_format_postprocessor (pretty_printer *pp);
   friend bool & pp_show_highlight_colors (pretty_printer *pp);
 
   friend void pp_output_formatted_text (pretty_printer *,
@@ -316,6 +316,11 @@ public:
   void set_real_maximum_length ();
   int remaining_character_count_for_line ();
 
+  void set_format_postprocessor (std::unique_ptr<format_postprocessor> p)
+  {
+m_format_postprocessor = std::move (p);
+  }
+
   void dump (F
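
The ownership pattern described in the ChangeLog can be shown in a few lines. This is a minimal sketch with hypothetical names (postprocessor, cxx_like_postprocessor, printer), not the actual pretty_printer classes: clone() returns a unique_ptr, the owner takes a unique_ptr by value in a setter, and the explicit delete in the destructor goes away.

```cpp
#include <memory>
#include <utility>

struct postprocessor
{
  virtual ~postprocessor() = default;
  // Returning unique_ptr makes the transfer of ownership explicit.
  virtual std::unique_ptr<postprocessor> clone() const = 0;
  virtual int id() const = 0;
};

struct cxx_like_postprocessor final : postprocessor
{
  std::unique_ptr<postprocessor> clone() const override
  { return std::make_unique<cxx_like_postprocessor>(); }

  int id() const override { return 42; }
};

class printer
{
public:
  // The setter takes ownership; callers pass a temporary or std::move.
  void set_postprocessor(std::unique_ptr<postprocessor> p)
  { m_postprocessor = std::move(p); }

  // Non-owning access, returned by value rather than by reference.
  postprocessor* get_postprocessor() const
  { return m_postprocessor.get(); }

  // No user-written destructor is needed: the unique_ptr member
  // releases the postprocessor automatically.

private:
  std::unique_ptr<postprocessor> m_postprocessor;
};
```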

[pushed: r16-974] diagnostics: reimplement html_token_printer in terms of xml::printer

2025-05-29 Thread David Malcolm
No functional change intended.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r16-974-g3d00efc156367e.

gcc/ChangeLog:
* diagnostic-format-html.cc
(html_builder::make_element_for_diagnostic::html_token_printer):
Reimplement in terms of xml::printer.
(html_builder::make_element_for_diagnostic): Create an
xml::printer and use it with the html_token_printer.

Signed-off-by: David Malcolm 
---
 gcc/diagnostic-format-html.cc | 46 ++-
 1 file changed, 13 insertions(+), 33 deletions(-)

diff --git a/gcc/diagnostic-format-html.cc b/gcc/diagnostic-format-html.cc
index 3fa5e662d2be..f418ccc3ed97 100644
--- a/gcc/diagnostic-format-html.cc
+++ b/gcc/diagnostic-format-html.cc
@@ -700,11 +700,9 @@ html_builder::make_element_for_diagnostic (const 
diagnostic_info &diagnostic,
   class html_token_printer : public token_printer
   {
   public:
-html_token_printer (html_builder &builder,
-xml::element &parent_element)
-: m_builder (builder)
+html_token_printer (xml::printer &xp)
+: m_xp (xp)
 {
-  m_open_elements.push_back (&parent_element);
 }
 void print_tokens (pretty_printer */*pp*/,
   const pp_token_list &tokens) final override
@@ -722,7 +720,7 @@ html_builder::make_element_for_diagnostic (const 
diagnostic_info &diagnostic,
  pp_token_text *sub = as_a <pp_token_text *> (iter);
  /* The value might be in the obstack, so we may need to
 copy it.  */
- insertion_element ().add_text (sub->m_value.get ());
+ m_xp.add_text (sub->m_value.get ());
}
break;
 
@@ -733,51 +731,32 @@ html_builder::make_element_for_diagnostic (const 
diagnostic_info &diagnostic,
 
  case pp_token::kind::begin_quote:
{
- insertion_element ().add_text (open_quote);
- push_element (make_span ("gcc-quoted-text"));
+ m_xp.add_text (open_quote);
+ m_xp.push_tag_with_class ("span", "gcc-quoted-text");
}
break;
  case pp_token::kind::end_quote:
{
- pop_element ();
- insertion_element ().add_text (close_quote);
+ m_xp.pop_tag ();
+ m_xp.add_text (close_quote);
}
break;
 
  case pp_token::kind::begin_url:
{
  pp_token_begin_url *sub = as_a <pp_token_begin_url *> (iter);
- auto anchor = std::make_unique<xml::element> ("a", true);
- anchor->set_attr ("href", sub->m_value.get ());
- push_element (std::move (anchor));
+ m_xp.push_tag ("a", true);
+ m_xp.set_attr ("href", sub->m_value.get ());
}
break;
  case pp_token::kind::end_url:
-   pop_element ();
+   m_xp.pop_tag ();
break;
  }
 }
 
   private:
-xml::element &insertion_element () const
-{
-  return *m_open_elements.back ();
-}
-void push_element (std::unique_ptr<xml::element> new_element)
-{
-  xml::element &current_top = insertion_element ();
-  m_open_elements.push_back (new_element.get ());
-  current_top.add_child (std::move (new_element));
-}
-void pop_element ()
-{
-  m_open_elements.pop_back ();
-}
-
-html_builder &m_builder;
-/* We maintain a stack of currently "open" elements.
-   Children are added to the topmost open element.  */
-std::vector<xml::element *> m_open_elements;
+xml::printer &m_xp;
   };
 
   auto diag_element = make_div ("gcc-diagnostic");
@@ -798,7 +777,8 @@ html_builder::make_element_for_diagnostic (const 
diagnostic_info &diagnostic,
   message_span->set_attr ("id", message_span_id);
   add_focus_id (message_span_id);
 
-  html_token_printer tok_printer (*this, *message_span.get ());
+  xml::printer xp (*message_span.get ());
+  html_token_printer tok_printer (xp);
   m_printer->set_token_printer (&tok_printer);
   pp_output_formatted_text (m_printer, m_context.get_urlifier ());
   m_printer->set_token_printer (nullptr);
-- 
2.26.3
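
As a rough illustration of why the hand-rolled element stack could be dropped, here is a minimal sketch of a printer that tracks open tags itself. toy_printer is hypothetical: it emits a string instead of building an element tree, does no escaping, and is not the interface of GCC's xml::printer.

```cpp
#include <string>
#include <vector>

class toy_printer
{
public:
  // Open a tag with a class attribute and remember it as the innermost
  // open element.
  void push_tag_with_class(const std::string& name, const std::string& cls)
  {
    m_out += "<" + name + " class=\"" + cls + "\">";
    m_open.push_back(name);
  }

  // Text always goes into the innermost open element.
  void add_text(const std::string& text)
  { m_out += text; }

  // Close the innermost open element; do nothing if none is open.
  void pop_tag()
  {
    if (m_open.empty())
      return;
    m_out += "</" + m_open.back() + ">";
    m_open.pop_back();
  }

  const std::string& str() const { return m_out; }

private:
  std::string m_out;
  std::vector<std::string> m_open;  // stack of currently open tag names
};
```

With the stack owned by the printer, a client such as a token printer only needs push, add and pop calls and no bookkeeping of its own.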



[pushed: r16-973] diagnostics: bulletproof html_builder::make_metadata_element

2025-05-29 Thread David Malcolm
Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r16-973-g554d2a2f0e2006.

gcc/ChangeLog:
* diagnostic-format-html.cc (html_builder::make_metadata_element):
Gracefully handle the case where "url" is null.

Signed-off-by: David Malcolm 
---
 gcc/diagnostic-format-html.cc | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/gcc/diagnostic-format-html.cc b/gcc/diagnostic-format-html.cc
index f2b255bf9cd2..3fa5e662d2be 100644
--- a/gcc/diagnostic-format-html.cc
+++ b/gcc/diagnostic-format-html.cc
@@ -897,10 +897,14 @@ html_builder::make_metadata_element (label_text label,
   xml::printer xp (*item.get ());
   xp.add_text ("[");
   {
-xp.push_tag ("a", true);
-xp.set_attr ("href", url.get ());
+if (url.get ())
+  {
+   xp.push_tag ("a", true);
+   xp.set_attr ("href", url.get ());
+  }
 xp.add_text (label.get ());
-xp.pop_tag ();
+if (url.get ())
+  xp.pop_tag ();
   }
   xp.add_text ("]");
   return item;
-- 
2.26.3



[pushed: r16-975] diagnostics: fix PatternFly URL

2025-05-29 Thread David Malcolm
Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r16-975-g8b3300fe2c2794.

gcc/ChangeLog:
* diagnostic-format-html.cc (HTML_STYLE): Fix PatternFly URL in
comment.

Signed-off-by: David Malcolm 
---
 gcc/diagnostic-format-html.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/diagnostic-format-html.cc b/gcc/diagnostic-format-html.cc
index c93962cd9c4c..05d4273c2c61 100644
--- a/gcc/diagnostic-format-html.cc
+++ b/gcc/diagnostic-format-html.cc
@@ -472,7 +472,7 @@ diagnostic_html_format_buffer::flush ()
 /* class html_builder.  */
 
 /* Style information for writing out HTML paths.
-   Colors taken from https://www.patternfly.org/v3/styles/color-palette/ */
+   Colors taken from https://pf3.patternfly.org/v3/styles/color-palette/ */
 
 static const char * const HTML_STYLE
   = ("  

Re: [PATCH v4 4/8] libstdc++: Implement layout_right from mdspan.

2025-05-29 Thread Luc Grosheintz




On 5/28/25 16:22, Tomasz Kaminski wrote:

On Mon, May 26, 2025 at 4:15 PM Luc Grosheintz 
wrote:


Implement the parts of layout_left that depend on layout_right; and the
parts of layout_right that don't depend on layout_stride.

libstdc++-v3/ChangeLog:

 * include/std/mdspan (layout_right): New class.
 * src/c++23/std.cc.in: Add layout_right.

Signed-off-by: Luc Grosheintz 


LGTM. Only some very subjective comments regarding parentheses.
Also added some comments on possible future improvements to the extents
converting constructor.


---
  libstdc++-v3/include/std/mdspan  | 153 ++-
  libstdc++-v3/src/c++23/std.cc.in |   1 +
  2 files changed, 153 insertions(+), 1 deletion(-)

diff --git a/libstdc++-v3/include/std/mdspan
b/libstdc++-v3/include/std/mdspan
index d81072596b4..7daa0713716 100644
--- a/libstdc++-v3/include/std/mdspan
+++ b/libstdc++-v3/include/std/mdspan
@@ -397,6 +397,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
class mapping;
};

+  struct layout_right
+  {
+template
+  class mapping;
+  };
+
namespace __mdspan
{
  template
@@ -489,7 +495,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _Mapping>;

  template
-  concept __standardized_mapping = __mapping_of;
+  concept __standardized_mapping = __mapping_of
+  || __mapping_of;

  template
concept __mapping_like = requires
@@ -539,6 +546,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 : mapping(__other.extents(), __mdspan::__internal_ctor{})
 { }

+  template
+   requires (_Extents::rank() <= 1
+ && is_constructible_v<_Extents, _OExtents>)


I got confused for a moment by the parenthesization here. My preference would be
to use (_Extents::rank() <= 1) && is_constructible_v<_Extents, _OExtents>.


+   constexpr explicit(!is_convertible_v<_OExtents, _Extents>)
+   mapping(const layout_right::mapping<_OExtents>& __other) noexcept
+   : mapping(__other.extents(), __mdspan::__internal_ctor{})
+   { }
+
constexpr mapping&
operator=(const mapping&) noexcept = default;

@@ -606,6 +621,142 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 [[no_unique_address]] _Extents _M_extents{};
  };

+  namespace __mdspan
+  {
+template
+  constexpr typename _Extents::index_type
+  __linear_index_right(const _Extents& __exts, _Indices... __indices)
+  {
+   using _IndexType = typename _Extents::index_type;
+   array<_IndexType, sizeof...(__indices)> __ind_arr{__indices...};
+   _IndexType __res = 0;
+   if constexpr (sizeof...(__indices) > 0)
+ {
+   _IndexType __mult = 1;
+   auto __update = [&, __pos = __exts.rank()](_IndexType) mutable
+ {
+   --__pos;
+   __res += __ind_arr[__pos] * __mult;
+   __mult *= __exts.extent(__pos);
+ };
+   (__update(__indices), ...);
+ }
+   return __res;
+  }
+  }
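
As a side note for readers, the right-to-left accumulation above can be written as a plain loop. This is a minimal sketch with a hypothetical free function over std::array, assuming nothing about the real extents class: the rightmost index has stride 1, and each step to the left multiplies the stride by the extent just processed.

```cpp
#include <array>
#include <cstddef>

// Row-major (layout_right style) linear index for a fixed rank.
template<std::size_t Rank>
constexpr std::size_t
linear_index_right(const std::array<std::size_t, Rank>& extents,
                   const std::array<std::size_t, Rank>& indices)
{
  std::size_t res = 0;
  std::size_t mult = 1;  // stride of the dimension being processed
  for (std::size_t pos = Rank; pos-- > 0; )
    {
      res += indices[pos] * mult;
      mult *= extents[pos];
    }
  return res;
}
```

For extents (2, 3, 4) and indices (1, 2, 3) this gives 3*1 + 2*4 + 1*12 = 23, the same as (1*3 + 2)*4 + 3.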
+
+  template
+class layout_right::mapping
+{
+public:
+  using extents_type = _Extents;
+  using index_type = typename extents_type::index_type;
+  using size_type = typename extents_type::size_type;
+  using rank_type = typename extents_type::rank_type;
+  using layout_type = layout_right;
+
+  static_assert(__mdspan::__representable_size<_Extents, index_type>,
+   "The size of extents_type must be representable as index_type");
+
+  constexpr
+  mapping() noexcept = default;
+
+  constexpr
+  mapping(const mapping&) noexcept = default;
+
+  constexpr
+  mapping(const _Extents& __extents) noexcept
+  : _M_extents(__extents)
+  {
__glibcxx_assert(__mdspan::__is_representable_extents(_M_extents)); }
+
+  template
+   requires (is_constructible_v)


I do not think these parentheses are necessary. Are they?

No, it seems not. I think I got confused by the need to put parentheses
around function calls, e.g.

  requires (is_foo())

requires the extra set of parens. I think that's why I started
putting them unconditionally.

I stripped all the unneeded parens I could find.




+   constexpr explicit(!is_convertible_v<_OExtents, extents_type>)
+   mapping(const mapping<_OExtents>& __other) noexcept
+   : mapping(__other.extents(), __mdspan::__internal_ctor{})
+   { }
+
+  template
+   requires (extents_type::rank() <= 1
+   && is_constructible_v)


Same comment regarding parenthesization.


+   constexpr explicit(!is_convertible_v<_OExtents, extents_type>)
+   mapping(const layout_left::mapping<_OExtents>& __other) noexcept
+   : mapping(__other.extents(), __mdspan::__internal_ctor{})
+   { }
+
+  constexpr mapping&
+  operator=(const mapping&) noexcept = default;
+
+  constexpr const _Extents&
+  extents() const noexcept { return _M_extents; }
+
+  constexpr index_type
+  required_span_size() const noexcept

Re: [PATCH] expmed: Prevent non-canonical subreg generation in store_bit_field [PR118873]

2025-05-29 Thread Konstantinos Eleftheriou
Hi Richard, thanks for the response.

On Mon, May 26, 2025 at 11:55 AM Richard Biener  wrote:
>
> On Mon, 26 May 2025, Konstantinos Eleftheriou wrote:
>
> > In `store_bit_field_1`, when the value to be written in the bitfield
> > and/or the bitfield itself have vector modes, non-canonical subregs
> > are generated, like `(subreg:V4SI (reg:V8SI x) 0)`. If one of them is
> > a scalar, this happens only when the scalar mode is different than the
> > vector's inner mode.
> >
> > This patch tries to prevent this, using vec_set patterns when
> > possible.
>
> I know almost nothing about this code, but why does the patch
> fixup things after the fact rather than avoid generating the
> SUBREG in the first place?

That's what we are doing, we are trying to prevent the non-canonical
subreg generation (it's not always possible). But, there are cases
where these types of subregs are passed into `store_bit_field` by its
caller, in which case we choose not to touch them.

> ISTR it also (unfortunately) depends on the target which forms
> are considered canonical.

But, the way that we interpret the documentation, the
canonicalizations are machine-independent. Is that not true? Or,
specifically for the subregs that operate on vectors, is there any
target that considers them canonical?

> I'm also not sure you got endianness right for all possible
> values of SUBREG_BYTE.  One more reason to not generate such
> subreg in the first place but stick to vec_select/concat.

The only way that we would generate subregs is through the calls to
`extract_bit_field` or `store_bit_field_1` and these should handle the
endianness. Also, these subregs wouldn't operate on vectors. Do you
mean that something could go wrong with these calls?

Konstantinos


> Richard.
>
> > Bootstrapped/regtested on AArch64 and x86_64.
> >
> >   PR rtl-optimization/118873
> >
> > gcc/ChangeLog:
> >
> >   * expmed.cc (generate_vec_concat): New function.
> >   (store_bit_field_1): Check for cases where the value
> >   to be written and/or the bitfield have vector modes
> >   and try to generate the corresponding vec_set patterns
> >   instead of subregs.
> >
> > gcc/testsuite/ChangeLog:
> >
> >   * gcc.target/i386/pr118873.c: New test.
> > ---
> >  gcc/expmed.cc| 174 ++-
> >  gcc/testsuite/gcc.target/i386/pr118873.c |  33 +
> >  2 files changed, 200 insertions(+), 7 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr118873.c
> >
> > diff --git a/gcc/expmed.cc b/gcc/expmed.cc
> > index 8cf10d9c73bf..8c641f55b9c6 100644
> > --- a/gcc/expmed.cc
> > +++ b/gcc/expmed.cc
> > @@ -740,6 +740,42 @@ store_bit_field_using_insv (const extraction_insn 
> > *insv, rtx op0,
> >return false;
> >  }
> >
> > +/* Helper function for store_bit_field_1, used in the case that the 
> > bitfield
> > +   and the destination are both vectors.  It extracts the elements of OP 
> > from
> > +   LOWER_BOUND to UPPER_BOUND using a vec_select and uses a vec_concat to
> > +   concatenate the extracted elements with the VALUE.  */
> > +
> > +rtx
> > +generate_vec_concat (machine_mode fieldmode, rtx op, rtx value,
> > +  HOST_WIDE_INT lower_bound,
> > +  HOST_WIDE_INT upper_bound)
> > +{
> > +  if (!VECTOR_MODE_P (fieldmode))
> > +return NULL_RTX;
> > +
> > +  rtvec vec = rtvec_alloc (GET_MODE_NUNITS (fieldmode).to_constant ());
> > +  machine_mode outermode = GET_MODE (op);
> > +
> > +  for (HOST_WIDE_INT i = lower_bound; i < upper_bound; ++i)
> > +RTVEC_ELT (vec, i) = GEN_INT (i);
> > +  rtx par = gen_rtx_PARALLEL (VOIDmode, vec);
> > +  rtx select = gen_rtx_VEC_SELECT (fieldmode, op, par);
> > +  if (BYTES_BIG_ENDIAN)
> > +{
> > +  if (lower_bound > 0)
> > + return gen_rtx_VEC_CONCAT (outermode, select, value);
> > +  else
> > + return gen_rtx_VEC_CONCAT (outermode, value, select);
> > +}
> > +  else
> > +{
> > +  if (lower_bound > 0)
> > + return gen_rtx_VEC_CONCAT (outermode, value, select);
> > +  else
> > + return gen_rtx_VEC_CONCAT (outermode, select, value);
> > +}
> > +}
> > +
> >  /* A subroutine of store_bit_field, with the same arguments.  Return true
> > if the operation could be implemented.
> >
> > @@ -778,18 +814,142 @@ store_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, 
> > poly_uint64 bitnum,
> >if (VECTOR_MODE_P (outermode)
> >&& !MEM_P (op0)
> >&& optab_handler (vec_set_optab, outermode) != CODE_FOR_nothing
> > -  && fieldmode == innermode
> > -  && known_eq (bitsize, GET_MODE_PRECISION (innermode))
> >&& multiple_p (bitnum, GET_MODE_PRECISION (innermode), &pos))
> >  {
> > +  /* Cases where the destination's inner mode is not equal to the
> > +  value's mode need special treatment.  */
> > +
> >class expand_operand ops[3];
> >enum insn_code icode = optab_handler (vec_set_optab, outermode);
> >
> > -  create_fixed_oper
