date:20250113

Re: [PATCH v4 6/7] OpenMP: Fortran front-end support for dispatch + adjust_args

2025-01-13 Thread Tobias Burnus


Hi PA,

Paul-Antoine Arras wrote:

Hi Thomas,

Added libgomp/testsuite/libgomp.fortran/dispatch-1.f90.

I see this new test case FAIL (execution test SIGSEGV) for most (but not
all) offloading configurations, both GCN and nvptx:

 +PASS: libgomp.fortran/dispatch-1.f90   -O  (test for excess errors)

 +FAIL: libgomp.fortran/dispatch-1.f90   -O  execution test


Thanks for pointing that out! The testcase missed an OpenMP target 
directive. The attached patch should fix it.


…


--- libgomp/testsuite/libgomp.fortran/dispatch-1.f90
+++ libgomp/testsuite/libgomp.fortran/dispatch-1.f90
@@ -55,6 +55,7 @@ module procedures
  call c_f_pointer(d_av, fp_av, [n])
  
  ! Perform operations on target

+!$omp target is_device_ptr(fp_bv, fp_av)
  do i = 1, n
fp_bv(i) = fp_av(i) * i
  end do


I think the patch is okay in the sense that it works;
still, I think you should consider the following.

Using 'is_device_ptr' for an argument that
is not a type(c_ptr) is deprecated since OpenMP 5.1 and
removed from the specification since OpenMP 6.0.

Thus, it would be a bit cleaner (and might avoid future
-Wdeprecated warnings) using
   has_device_addr(fp_bv, fp_av)
instead. (5.1 semantic states that this replacement
happens automatically when the is_device_ptr argument is not a C_PTR.)

Albeit it feels a bit cleaner to move the device pointer handling to the
device side, i.e.

  implicit none
  integer :: res, n, i
  type(c_ptr) :: d_bv
  type(c_ptr) :: d_av

  !$omp target is_device_ptr(d_bv, d_av)
  block
    real(8), pointer :: fp_bv(:), fp_av(:)  ! Fortran pointers for array access

    ! Associate C pointers with Fortran pointers
    call c_f_pointer(d_bv, fp_bv, [n])
    call c_f_pointer(d_av, fp_av, [n])

    ! Perform operations on target
    do i = 1, n
      fp_bv(i) = fp_av(i) * i
    end do
  end block

However, as all variants work in practice, I don't feel strongly
about it, with a small preference for a variant that does not use
deprecated features. (Both dispatch and the has_device_addr clause/the
deprecation are new with OpenMP 5.1)

Tobias

Re: [PATCH] testsuite: libstdc++: Use effective-target libatomic

2025-01-13 Thread Jonathan Wakely

On Mon, 13 Jan 2025 at 11:12, Thomas Schwinge  wrote:
>
> Hi!
>
> On 2025-01-13T11:04:50+, Jonathan Wakely  wrote:
> > > On Mon, 13 Jan 2025 at 11:03, Thomas Schwinge wrote:
> >> On 2025-01-12T08:38:05+0100, Torbjorn SVENSSON wrote:
> >> > On 2025-01-12 01:05, Jonathan Wakely wrote:
> >> >> On Mon, 23 Dec 2024, 19:05 Torbjörn SVENSSON wrote:
> >> >>
> >> >> Ok for trunk and releases/gcc-14?
> >> >>
> >> >> OK
> >> >
> >> > Pushed as r15-6828-g4b0ef49d02f and r14.2.0-680-gd82fc939f91.
> >>
> >> On a configuration where libatomic does get built, I see (with standard
> >
> > Does *not* get built?
>
> No, *does* get built, and thus the PASS -> UNSUPPORTED is a regression.


Oh right! I misunderstood the problem, sorry.

So we need the init so that libatomic_available actually works.

[PATCH] tree-optimization/118405 - ICE with vector(1) T vs T load

2025-01-13 Thread Richard Biener

When vectorizing a load we are now checking alignment before emitting
a vector(1) T load instead of blindly assuming it's OK when we had
a scalar T load.  For reasons we're not handling alignment computation
optimally here but we shouldn't ICE when we fall back to loads of T.

The following ensures the IL remains correct by emitting VIEW_CONVERT
from T to vector(1) T when needed.  It also removes an earlier fix
done in r9-382-gbb4e47476537f6 for the same issue with VMAT_ELEMENTWISE.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

PR tree-optimization/118405
* tree-vect-stmts.cc (vectorizable_load): When we fall back
to scalar loads make sure we properly convert to vector(1) T
when there was only a single vector element.
---
 gcc/tree-vect-stmts.cc | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index f5b3608f6b1..0c0f999d3e3 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -10731,9 +10731,6 @@ vectorizable_load (vec_info *vinfo,
  /* Else fall back to the default element-wise access.  */
  ltype = build_aligned_type (ltype, TYPE_ALIGN (TREE_TYPE (vectype)));
}
-  /* Load vector(1) scalar_type if it's 1 element-wise vectype.  */
-  else if (nloads == 1)
-   ltype = vectype;
 
   if (slp)
{
@@ -10782,11 +10779,11 @@ vectorizable_load (vec_info *vinfo,
 group_el * elsz + cst_offset);
  tree data_ref = build2 (MEM_REF, ltype, running_off, this_off);
  vect_copy_ref_info (data_ref, DR_REF (first_dr_info->dr));
- new_stmt = gimple_build_assign (make_ssa_name (ltype), data_ref);
+ new_temp = make_ssa_name (ltype);
+ new_stmt = gimple_build_assign (new_temp, data_ref);
  vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
  if (nloads > 1)
-   CONSTRUCTOR_APPEND_ELT (v, NULL_TREE,
-   gimple_assign_lhs (new_stmt));
+   CONSTRUCTOR_APPEND_ELT (v, NULL_TREE, new_temp);
 
  group_el += lnel;
  if (! slp
@@ -10833,6 +10830,15 @@ vectorizable_load (vec_info *vinfo,
}
}
}
+ else if (!costing_p && ltype != vectype)
+   {
+ new_stmt = gimple_build_assign (make_ssa_name (vectype),
+ VIEW_CONVERT_EXPR,
+ build1 (VIEW_CONVERT_EXPR,
+ vectype, new_temp));
+ vect_finish_stmt_generation (vinfo, stmt_info, new_stmt,
+  gsi);
+   }
 
  if (!costing_p)
{
-- 
2.43.0

Re: [PATCH] gcc/d: give dependency files better filenames

2025-01-13 Thread Rainer Orth

Arsen Arsenović  writes:

> Regstrapped on x86_64-pc-linux-gnu.  I've also checked the generated
> dependency files are correct by hand and "instrumented" the build to
> fail if two dependency files are the same, by doing the following:
>
>   DPOSTCOMPILE = ! test -f $(DEPFILE).Po && mv ...
>
> ... and confirmed no further conflicts of this sort happen.
>
> OK for trunk?
> -- >8 --
> Currently, the dependency files for root-file.o and common-file.o were
> both d/.deps/file.Po, which would cause parallel builds to fail
> sometimes with:
>
>   make[3]: Leaving directory 
> '/var/tmp/portage/sys-devel/gcc-14.1.1_p20240511/work/build/gcc'
>   make[3]: Entering directory 
> '/var/tmp/portage/sys-devel/gcc-14.1.1_p20240511/work/build/gcc'
>   mv: cannot stat 'd/.deps/file.TPo': No such file or directory
>   make[3]: *** 
> [/var/tmp/portage/sys-devel/gcc-14.1.1_p20240511/work/gcc-14-20240511/gcc/d/Make-lang.in:421:
>  d/root-file.o] Error 1 shuffle=131581365
>
> Also, this means that dependencies of one of root-file or common-file
> are missing when developing.  After this patch, those two files get
> assigned dependency files d/.deps/d-root-file.o.Po and
> d/.deps/d-common-file.o.Po respectively.
>
> There are other files with similar conflicts (mangle-package.o,
> visitor-package.o for instance).

I'm also seeing this for some degrees of parallelism
(x86_64-pc-linux-gnu -j64, sparc-sun-solaris2.11 -j64 and -j96, not
i386-pc-solaris2.11 -j28), always for visitor-package.o.

I can confirm that your patch fixes the problem.
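
The name clash discussed in this thread can be modelled in a few lines. This is only a sketch, not the rule from Make-lang.in: `old_depfile` and `new_depfile` are hypothetical helpers that reproduce the file names quoted above, and the source base name "file" is inferred from the quoted d/.deps/file.Po.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Old scheme as described in the message: the dependency file is named after
// the source base name only, so two sources both called "file" share one name.
std::string old_depfile(const std::string& source_stem) {
  return "d/.deps/" + source_stem + ".Po";
}

// New scheme, modelled on the names quoted in the message: derive the name
// from the whole object path, turning '/' into '-'.
std::string new_depfile(std::string object_path) {
  assert(!object_path.empty());
  std::replace(object_path.begin(), object_path.end(), '/', '-');
  return "d/.deps/" + object_path + ".Po";
}
```

With this model, d/root-file.o and d/common-file.o both map to d/.deps/file.Po under the old scheme, but to d/.deps/d-root-file.o.Po and d/.deps/d-common-file.o.Po under the new one, so two parallel jobs no longer race on the same temporary file.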

Thanks.
Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

[PATCH] expr: Fix up the divmod cost debugging note [PR115910]

2025-01-13 Thread Jakub Jelinek

Hi!

Something I've noticed while working on the crc wrong-code fix.
My first version of the patch failed because of no longer matching some
expected strings in the assembly, so I had to add TDF_DETAILS debugging
into the -fdump-rtl-expand-details dump which the crc tests can use.

For PR115910 Andrew has added a similar note for the division/modulo case
if it is positive and we can choose either unsigned or signed
division.  The problem is that unlike most other TDF_DETAILS diagnostics,
this is not done before emitting the IL for the function, but during it.

Other messages there are prefixed with ;;, both details on what it is doing
and the GIMPLE IL for which it expands RTL, so the
;; Generating RTL for gimple basic block 4

;;

(code_label 13 12 14 2 (nil) [0 uses])

(note 14 13 0 NOTE_INSN_BASIC_BLOCK)
positive division: unsigned cost: 30; signed cost: 28

;; return _4;

message in between just looks weird and IMHO should be ;; prefixed.

The following patch does that, ok for trunk?

2025-01-13  Jakub Jelinek  

PR target/115910
* expr.cc (expand_expr_divmod): Prefix the TDF_DETAILS note with
";; " and add a space before (needed tie breaker).  Formatting fixes.

--- gcc/expr.cc.jj  2025-01-13 09:12:08.589966845 +0100
+++ gcc/expr.cc 2025-01-13 11:21:11.501285143 +0100
@@ -9710,9 +9710,9 @@ expand_expr_divmod (tree_code code, mach
}
 
   if (dump_file && (dump_flags & TDF_DETAILS))
- fprintf(dump_file, "positive division:%s unsigned cost: %u; "
- "signed cost: %u\n", was_tie ? "(needed tie breaker)" : "",
- uns_cost, sgn_cost);
+   fprintf (dump_file, ";; positive division:%s unsigned cost: %u; "
+   "signed cost: %u\n",
+was_tie ? " (needed tie breaker)" : "", uns_cost, sgn_cost);
 
   if (uns_cost < sgn_cost || (uns_cost == sgn_cost && unsignedp))
{

Jakub
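
For readers who want to see the resulting text without building GCC, here is a standalone sketch of the patched format string. `divmod_cost_note` and `prefer_unsigned` are hypothetical helper names; only the format string and the comparison are taken from the hunk above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Build the note with the same format string as the patched fprintf call.
std::string divmod_cost_note(bool was_tie, unsigned uns_cost, unsigned sgn_cost) {
  char buf[128];
  int n = std::snprintf(buf, sizeof buf,
                        ";; positive division:%s unsigned cost: %u; "
                        "signed cost: %u\n",
                        was_tie ? " (needed tie breaker)" : "",
                        uns_cost, sgn_cost);
  assert(n > 0 && static_cast<std::size_t>(n) < sizeof buf);
  (void) n;  // silence unused-variable warnings when NDEBUG is defined
  return buf;
}

// Mirrors the condition that follows the dump in the quoted hunk.
bool prefer_unsigned(unsigned uns_cost, unsigned sgn_cost, bool unsignedp) {
  return uns_cost < sgn_cost || (uns_cost == sgn_cost && unsignedp);
}
```

The leading space moved into the " (needed tie breaker)" literal is what keeps the text from running into the colon when the tie breaker fires.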

[PATCH] MAINTAINERS: Make contrib/check-MAINTAINERS.py happy

2025-01-13 Thread Martin Jambor

This commit makes the contrib/check-MAINTAINERS.py script happy about
our MAINTAINERS file.  I hope that it knows best how things ought to
be and so am committing this as obvious.

ChangeLog:

2025-01-13  Martin Jambor  

* MAINTAINERS: Fix the name order of the Write After Approval section.
---
 MAINTAINERS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 0c571bde8bc..256a03957d5 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -327,12 +327,12 @@ from other maintainers or reviewers.
 
 Name                    BZ account      Email
 
+Soumya AR               soumyaa
 Mark G. Adams           mgadams
 Ajit Kumar Agarwal      aagarwa
 Pedro Alves             palves
 John David Anglin       danglin
 Harald Anlauf           anlauf
-Soumya AR               soumyaa
 Paul-Antoine Arras      parras
 Arsen Arsenović         arsen
 Raksit Ashok            raksit
-- 
2.47.1

Re: [PATCH] Accept commas between clauses in OpenMP declare variant

2025-01-13 Thread Paul-Antoine Arras


Hi Tobias,

Here are an updated patch and a few questions.

On 07/01/2025 13:18, Tobias Burnus wrote:

Paul-Antoine Arras:

Add support to the Fortran parser for the new OpenMP syntax that allows a
comma after the directive name and between clauses of declare variant.
The C and C++ parsers already support this syntax so only a new test 
is added.


Note: only the optional comma between directive name and (first) clause
is new (since 5.2). The one between clauses is old (since OpenMP 2.5).

While the patch supports both, 'new OpenMP syntax' is a bit misleading.

For 'declare variant', the comma-between-clauses support is now required
as only with 5.1 and this patch set, more clauses ('adjust_args' and
'append_args') are supported.

This second type of comma (between directive and first clause) is
supported in C/C++ since OpenMP 5.1; I think mainly because of the
added [[omp::directive(...)]] C++11 attribute feature.

In Fortran, this comma is only permitted since 5.2; I think mostly
for consistency with C/C++.


I rephrased the commit message by just removing "new". Should it be more 
detailed?



* * *

BTW: Adding support for this comma for all directives is tracked as
to-be-done Fortran item for 5.2 (under other features as Appendix B
does not list it.)
→ https://gcc.gnu.org/onlinedocs/libgomp/OpenMP-5_002e2.html

* * *



--- a/gcc/fortran/openmp.cc
+++ b/gcc/fortran/openmp.cc
@@ -6595,6 +6595,10 @@ gfc_match_omp_declare_variant (void)
    for (;;)
  {
+  gfc_gobble_whitespace ();
+  gfc_match_char (',');
+  gfc_gobble_whitespace ();
+


I think that will do, but having a better error would be IMHO better.

For the first fail, we get:

    13 | !$omp declare variant(f) ,,match(construct={dispatch}) adjust_args(need_device_ptr : c)

   |   1
Error: expected ‘match’ or ‘adjust_args’ at (1)

which is quite helpful. But moving the ,, later gives:


    13 | !$omp declare variant(f) match(construct={dispatch}),,adjust_args(need_device_ptr : c)

   |   1
Error: Invalid character in name at (1)

That's kind of helpful but not really useful.

And adding a valid but not yet supported clause (it's supported for C/C++):

    13 | !$omp declare variant(f) match(construct={dispatch}),adjust_args(need_device_ptr : c),append_args(a)

   |   1
Error: Unclassifiable statement at (1)


I think it would be better to replace the first_p by, e.g.,
'error_p = true; break;' – and do this diagnostic
after the for(;;) loop. This seems to yield a better diagnostic.
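
A toy model of the loop may help when reasoning about the doubled-comma cases above. It is a sketch under loud assumptions: `parse_clauses` and `ClauseParse` are hypothetical names, the parser only recognises `name(...)` with balanced parentheses, and it makes no attempt to reproduce gfortran's actual matching or its error messages. Like the three added lines in the patch, it accepts at most one optional comma before each clause.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

struct ClauseParse {
  bool ok = true;
  std::size_t error_pos = 0;       // index of the offending character
  std::vector<std::string> names;  // clause names seen so far
};

static void skip_ws(const std::string& s, std::size_t& i) {
  while (i < s.size() && std::isspace(static_cast<unsigned char>(s[i]))) ++i;
}

ClauseParse parse_clauses(const std::string& s) {
  ClauseParse r;
  std::size_t i = 0;
  for (;;) {
    skip_ws(s, i);
    if (i < s.size() && s[i] == ',') ++i;  // at most one optional comma
    skip_ws(s, i);
    if (i == s.size()) break;

    // A clause name must follow; a second comma is rejected here.
    const std::size_t start = i;
    while (i < s.size()
           && (std::isalnum(static_cast<unsigned char>(s[i])) || s[i] == '_'))
      ++i;
    if (i == start || i == s.size() || s[i] != '(') {
      r.ok = false;
      r.error_pos = start;
      return r;
    }
    r.names.push_back(s.substr(start, i - start));

    // Skip the parenthesised argument list.
    assert(s[i] == '(');
    int depth = 0;
    do {
      if (s[i] == '(') ++depth;
      else if (s[i] == ')') --depth;
      ++i;
    } while (i < s.size() && depth > 0);
    if (depth != 0) {
      r.ok = false;
      r.error_pos = i;
      return r;
    }
  }
  return r;
}
```

In this model the second comma of ",," is reported at its own position regardless of where in the clause list it occurs, which is the uniform behaviour the review is asking for. The toy also accepts a trailing comma, which the real parser may well not.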


I am not sure I am getting that part. Is this what you are suggesting?

diff --git gcc/fortran/openmp.cc gcc/fortran/openmp.cc
index 9d28dc9..e3abbeeef98 100644
--- gcc/fortran/openmp.cc
+++ gcc/fortran/openmp.cc
@@ -6532,7 +6532,6 @@ gfc_match_omp_context_selector_specification 
(gfc_omp_declare_variant *odv)

 match
 gfc_match_omp_declare_variant (void)
 {
-  bool first_p = true;
   char buf[GFC_MAX_SYMBOL_LEN + 1];

   if (gfc_match (" (") != MATCH_YES)
@@ -6590,7 +6589,7 @@ gfc_match_omp_declare_variant (void)
   return MATCH_ERROR;
 }

-  bool has_match = false, has_adjust_args = false;
+  bool has_match = false, has_adjust_args = false, error_p = false;
   locus adjust_args_loc;

   for (;;)
@@ -6614,13 +6613,9 @@ gfc_match_omp_declare_variant (void)
}
   else
{
- if (first_p)
-   {
- gfc_error ("expected %<match%> or %<adjust_args%> at %C");
- return MATCH_ERROR;
-   }
- else
-   break;
+ if (!has_match)
+   error_p = true;
+ break;
}

   if (gfc_match (" (") != MATCH_YES)
@@ -,8 +6661,12 @@ gfc_match_omp_declare_variant (void)
for (gfc_omp_namelist *n = *head; n != NULL; n = n->next)
  n->u.need_device_ptr = true;
}
+}

-  first_p = false;
+  if (error_p)
+{
+  gfc_error ("expected %<match%> or %<adjust_args%> at %C");
+  return MATCH_ERROR;
 }

   if (has_adjust_args && !has_match)

==

If so, it does yield a better diagnostic ("expected 'match' or 
'adjust_args' at (1)") for this testcase; but it completely breaks other 
cases where the rest of the line is non empty. For instance, with an 
end-of-line comment:


   29 |  !$omp declare variant (f0) adjust_args (nothing: a) ! { dg-error "an 'adjust_args' clause at .1. can only be specified if the 'dispatch' selector of the construct selector set appears in the 'match' clause" }

  |  1


* * *

diff --git a/gcc/testsuite/c-c++-common/gomp/adjust-args-5.c b/gcc/testsuite/c-c++-common/gomp/adjust-args-5.c

new file mode 100644
index 000..863b77458e4
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/gomp/adjust-args-5.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+
+/* Check that the OpenMP 6 syntax with commas after the directive

Re: [PATCH v4 6/7] OpenMP: Fortran front-end support for dispatch + adjust_args

2025-01-13 Thread Paul-Antoine Arras


On 13/01/2025 14:57, Tobias Burnus wrote:

Hi PA,

Paul-Antoine Arras wrote:

Here is an updated patch following your suggestion.


Thanks. It is not clear whether you are just waiting
for test result or not before committing it as obvious.

Thus, just in case: LGTM.

Thanks,

Tobias



Thanks for the final approval. It is now in mainline.

Best,
--
PA

Re: [PATCH] c++: Inhibit subsequent warnings/notes in diagnostic_groups with an inhibited warning [PR118163,PR118392]

2025-01-13 Thread Jason Merrill


On 1/12/25 2:39 PM, Simon Martin wrote:

[ Fixing David’s email address :-/ ]

Hi,

On 9 Jan 2025, at 20:08, Simon Martin wrote:


On 9 Jan 2025, at 20:00, Marek Polacek wrote:


On Thu, Jan 09, 2025 at 12:05:43PM -0500, Patrick Palka wrote:

On Wed, 8 Jan 2025, Jason Merrill wrote:


On 12/21/24 11:35 AM, Simon Martin wrote:

When erroring out due to an incomplete type, we add a contextual note
about the type. However, when the error is suppressed by
-Wno-template-body, the note remains, making the compiler output quite
puzzling.

This patch makes sure the note is suppressed if we're processing a
template declaration body with -Wno-template-body.

Successfully tested on x86_64-pc-linux-gnu.

PR c++/118163

gcc/cp/ChangeLog:

* cp-tree.h (get_current_template): Declare.
* error.cc (get_current_template): Make non static.
* typeck2.cc (cxx_incomplete_type_inform): Suppress note when
parsing a template declaration with -Wno-template-body.


I think rather than adding this sort of thing in lots of places where an
error is followed by an inform, we should change error to return bool
like other diagnostic functions, and check its return value before
calling cxx_incomplete_type_inform or plain inform.  This likely involves
the same number of changes, but they should be smaller.

Patrick, what do you think?


That makes sense to me, it's consistent with the 'warning' API and how
we handle issuing a warning followed by a note.  But since the
-Wtemplate-body mechanism is really only useful for compiling legacy
code where you don't really care about any diagnostics anyway, and
the intended way to use it is -fpermissive / -Wno-error=template-body
rather than -Wno-template-body, I'd prefer a less invasive solution
that doesn't change the API of 'error' if possible.

I wonder if we can work around this by taking advantage of the fact
that notes that follow an error are expected to be linked via an active
auto_diagnostic_group?  Roughly, if we issued a -Wtemplate-body
diagnostic from an active auto_diagnostic_group then all other
diagnostics from that auto_diagnostic_group should also be associated
with -Wtemplate-body, including notes.  That way -Wno-template-body
will effectively suppress subsequent notes followed by an eligible
error, and no 'error' callers need to be changed (unless to use
auto_diagnostic_group).


FWIW, I love this auto_diagnostic_group idea.

Thanks folks, I’ll explore the auto_diagnostic_group idea (and maybe
*also* the error returning bool one because I am not a fan of functions
that “lie” to their callers :-))

I’ll send a follow-up patch in the coming days.

Please find attached an updated version of the patch, that implements
Patrick’s idea and fixes both PR118163 and PR118392. It tracks the
depth at which a warning is inhibited, and suppresses all the notes from
that depth on until an error/warning is emitted or that depth is left.

Successfully tested on x86_64-pc-linux-gnu. OK for GCC 16?
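
The depth-tracking rule described above can be illustrated with a small stand-alone model. This is a sketch, not the patch: `ToyDiagnostics`, `Kind` and the `enabled` parameter are hypothetical, a single counter stands in for both the group nesting depth and the diagnostic nesting level, and the toy assumes that an inhibited diagnostic outside any group suppresses nothing.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class Kind { Error, Warning, Note };

class ToyDiagnostics {
 public:
  void begin_group() { ++depth_; }

  void end_group() {
    assert(depth_ > 0);
    --depth_;
    // Leaving the depth at which a diagnostic was inhibited lifts the
    // suppression of notes.
    if (depth_ < inhibited_depth_)
      inhibited_depth_ = kNotInhibited;
  }

  // 'enabled' == false models an error or warning that an option such as
  // -Wno-template-body has turned off.  Returns true if text was emitted.
  bool report(Kind kind, const std::string& text, bool enabled = true) {
    if (kind == Kind::Note) {
      if (inhibited_depth_ != kNotInhibited && depth_ >= inhibited_depth_)
        return false;  // note belongs to a group whose diagnostic was inhibited
    } else if (!enabled) {
      if (depth_ > 0)
        inhibited_depth_ = depth_;  // remember where the inhibition happened
      return false;
    } else {
      inhibited_depth_ = kNotInhibited;  // a real error/warning resets it
    }
    emitted_.push_back(text);
    return true;
  }

  const std::vector<std::string>& emitted() const { return emitted_; }

 private:
  static constexpr int kNotInhibited = -1;
  int depth_ = 0;
  int inhibited_depth_ = kNotInhibited;
  std::vector<std::string> emitted_;
};
```

Because the toy keeps a single counter, it cannot answer the question raised in the review about whether the nesting level should be part of the depth; it only shows the suppress-until-reset behaviour.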



+  int curr_depth = (m_diagnostic_groups.m_group_nesting_depth
+   + m_diagnostic_groups.m_diagnostic_nesting_level);


Do we care about the nesting level?  I'd lean toward ignoring it and 
only considering the group.


Jason

Re: [PATCH v2] c: improve UX for -Wincompatible-pointer-types [PR116871]

2025-01-13 Thread David Malcolm

On Tue, 2025-01-14 at 00:08 +, Joseph Myers wrote:
> On Sun, 12 Jan 2025, David Malcolm wrote:
> 
> > So I've dropped the takes_int_p, takes_void_p, and
> > maybe_inform_empty_args_c23_transition from the patch.  Here's an
> > updated version that keeps the rest of the changes.  I'd like to
> > get
> > this into GCC 15 to make build failures due to C23-
> > incompatibilities
> > more readable.
> 
> Some comments in testcases still repeat the misconception about
> implicit 
> (int) for unprototyped functions.
> 
> OK with that fixed.
> 
> > diff --git a/gcc/testsuite/gcc.dg/c23-mismatching-fn-ptr-
> > alsatools.c b/gcc/testsuite/gcc.dg/c23-mismatching-fn-ptr-
> > alsatools.c
> > new file mode 100644
> > index ..e3460e546a9a
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/c23-mismatching-fn-ptr-alsatools.c
> > @@ -0,0 +1,21 @@
> > +/* Examples of a mismatching function pointer types in
> > +   legacy code compiled with C23 that assumed () meant (int).
> 
> This comment is incorrect, unprototyped is not (int).
> 
> > diff --git a/gcc/testsuite/gcc.dg/c23-mismatching-fn-ptr.c
> > b/gcc/testsuite/gcc.dg/c23-mismatching-fn-ptr.c
> > new file mode 100644
> > index ..4db44f48a3f2
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/c23-mismatching-fn-ptr.c
> > @@ -0,0 +1,76 @@
> > +/* Verify that when we complain about incompatible pointer types
> > +   involving function pointers, we show the declaration of the
> > +   function.  
> > +
> > +   In particular, check the case of
> > +  extern void fn ();
> > +   changing meaning in C23 (from taking int to taking void).  */
> 
> Likewise.
> 
> > +/* Test of storing a sighandler_t where the declaration of the
> > +   destination might be relying on implicit int arg, which
> > +   becomes void in C23.
> 
> Likewise.

Thanks.  I removed the bad wording in the comments, and have pushed it
as r15-6886-gbbc7900ce7e2c3.

Dave

Re: [PATCH] c++: make finish_pseudo_destructor_expr SFINAE-aware [PR116417]

2025-01-13 Thread Jason Merrill


On 1/13/25 2:57 PM, Marek Polacek wrote:

On Mon, Jan 13, 2025 at 11:25:25AM -0500, Patrick Palka wrote:

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look
OK for trunk?


Looks okay to me.


OK.


-- >8 --

PR c++/116417

gcc/cp/ChangeLog:

* cp-tree.h (finish_pseudo_destructor_expr): Add complain
parameter.
* parser.cc (cp_parser_postfix_dot_deref_expression): Pass
complain=tf_warning_or_error to finish_pseudo_destructor_expr.
* pt.cc (tsubst_expr): Pass complain to
finish_pseudo_destructor_expr.
* semantics.cc (finish_pseudo_destructor_expr): Check complain
before issuing a diagnostic.

gcc/testsuite/ChangeLog:

* g++.dg/template/pseudodtor7.C: New test.
---
  gcc/cp/cp-tree.h                            |  2 +-
  gcc/cp/parser.cc                            |  3 ++-
  gcc/cp/pt.cc                                |  4 ++--
  gcc/cp/semantics.cc                         | 15 +++++++++------
  gcc/testsuite/g++.dg/template/pseudodtor7.C | 15 +++++++++++++++
  5 files changed, 29 insertions(+), 10 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/template/pseudodtor7.C

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index aed58523b16..1b42e8ba7d8 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -7966,7 +7966,7 @@ extern tree lookup_and_finish_template_variable (tree, 
tree, tsubst_flags_t = tf
  extern tree finish_template_variable  (tree, tsubst_flags_t = 
tf_warning_or_error);
  extern cp_expr finish_increment_expr  (cp_expr, enum tree_code);
  extern tree finish_this_expr  (void);
-extern tree finish_pseudo_destructor_expr   (tree, tree, tree, location_t);
+extern tree finish_pseudo_destructor_expr   (tree, tree, tree, location_t, 
tsubst_flags_t);
  extern cp_expr finish_unary_op_expr   (location_t, enum tree_code, 
cp_expr,
 tsubst_flags_t);
  /* Whether this call to finish_compound_literal represents a C++11 functional
diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index f548dc31c2b..7f4340537c9 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -8847,7 +8847,8 @@ cp_parser_postfix_dot_deref_expression (cp_parser *parser,
  pseudo_destructor_p = true;
  postfix_expression
= finish_pseudo_destructor_expr (postfix_expression,
-s, type, location);
+s, type, location,
+tf_warning_or_error);
}
  }
  
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc

index a141de56446..537e4c4a494 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -21537,7 +21537,7 @@ tsubst_expr (tree t, tree args, tsubst_flags_t 
complain, tree in_decl)
tree op1 = RECUR (TREE_OPERAND (t, 1));
tree op2 = tsubst (TREE_OPERAND (t, 2), args, complain, in_decl);
RETURN (finish_pseudo_destructor_expr (op0, op1, op2,
-  input_location));
+  input_location, complain));
}
  
  case TREE_LIST:

@@ -21601,7 +21601,7 @@ tsubst_expr (tree t, tree args, tsubst_flags_t 
complain, tree in_decl)
dtor = TREE_OPERAND (dtor, 0);
if (TYPE_P (dtor))
  RETURN (finish_pseudo_destructor_expr
- (object, s, dtor, input_location));
+ (object, s, dtor, input_location, complain));
  }
  }
  }
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 15840e10620..76c79c6a8cc 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -3527,7 +3527,7 @@ finish_this_expr (void)
  
  tree

  finish_pseudo_destructor_expr (tree object, tree scope, tree destructor,
-  location_t loc)
+  location_t loc, tsubst_flags_t complain)
  {
if (object == error_mark_node || destructor == error_mark_node)
  return error_mark_node;
@@ -3538,16 +3538,18 @@ finish_pseudo_destructor_expr (tree object, tree scope, 
tree destructor,
  {
if (scope == error_mark_node)
{
- error_at (loc, "invalid qualifying scope in pseudo-destructor name");
+ if (complain & tf_error)
+   error_at (loc, "invalid qualifying scope in pseudo-destructor 
name");
  return error_mark_node;
}
if (is_auto (destructor))
destructor = TREE_TYPE (object);
if (scope && TYPE_P (scope) && !check_dtor_name (scope, destructor))
{
- error_at (loc,
-   "qualified type %qT does not match destructor name ~%qT",
-   scope, destructor);
+ if (complain & tf_error)
+   error_at (loc,
+ "qualified type %qT does not match destructor name ~%qT",
+ sco
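
The pattern applied throughout this patch, returning the failure value unconditionally but issuing the diagnostic only when the caller asked for errors, can be shown in isolation. The sketch below uses hypothetical stand-ins (`toy_complain_flags`, `toy_check_dtor_name`, plain strings instead of trees) and is not GCC code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for the 'complain' bit mask; values are illustrative only.
enum toy_complain_flags : unsigned {
  toy_tf_none = 0,
  toy_tf_error = 1u << 0
};

// Returns false (the analogue of error_mark_node) whenever the names do not
// match, but records a diagnostic only if the caller passed toy_tf_error.
bool toy_check_dtor_name(const std::string& scope, const std::string& dtor,
                         unsigned complain, std::vector<std::string>& errors) {
  if (scope == dtor)
    return true;
  if (complain & toy_tf_error)
    errors.push_back("qualified type '" + scope
                     + "' does not match destructor name ~'" + dtor + "'");
  return false;
}
```

A substitution-failure context would pass `toy_tf_none` and simply see the failure result, while ordinary parsing would pass `toy_tf_error` and get the message as well.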

[PATCH] [ifcombine] check and extend constants to compare with bitfields [PR118456]

2025-01-13 Thread Alexandre Oliva



Add logic to check and extend constants compared with bitfields, so
that fields are only compared with constants they could actually
equal.  This involves making sure the signedness doesn't change
between loads and conversions before shifts: we'd need to carry a lot
more data to deal with all the possibilities.

Regstrapped on x86_64-linux-gnu.  Ok to install?
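
As a rough illustration of the extension check, the sketch below redoes it with 64-bit integers instead of wide_int. The helper names (`zero_extend`, `sign_extend`, `constant_fits_field`) are hypothetical, and the model ignores masks, bit positions and the precision comparison that guards the check in the actual code.

```cpp
#include <cassert>
#include <cstdint>

// Keep only the low 'bits' bits of v.
std::uint64_t zero_extend(std::uint64_t v, unsigned bits) {
  assert(bits >= 1 && bits <= 64);
  return bits == 64 ? v : v & ((std::uint64_t{1} << bits) - 1);
}

// Replicate bit 'bits - 1' of v into all higher bits.
std::uint64_t sign_extend(std::uint64_t v, unsigned bits) {
  const std::uint64_t low = zero_extend(v, bits);
  const std::uint64_t sign = std::uint64_t{1} << (bits - 1);
  return (low ^ sign) - sign;
}

// A field of 'bits' bits can only equal constant c if c is exactly what the
// field would look like after extension.  On success the clipped constant is
// stored in 'clipped'; on failure the comparison has a known result.
bool constant_fits_field(std::uint64_t c, unsigned bits, bool field_unsigned,
                         std::uint64_t& clipped) {
  const std::uint64_t z = zero_extend(c, bits);
  const std::uint64_t expected = field_unsigned ? z : sign_extend(c, bits);
  if (expected != c)
    return false;
  clipped = z;
  return true;
}
```

For a 3-bit field, the constant 5 can match an unsigned field but not a signed one, while an all-ones constant (-1) can match a signed field but not an unsigned one.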


for  gcc/ChangeLog

PR tree-optimization/118456
* gimple-fold.cc (decode_field_reference): Punt if shifting
after changing signedness.
(fold_truth_andor_for_ifcombine): Check extension bits in
constants before clipping.

for  gcc/testsuite/ChangeLog

PR tree-optimization/118456
* gcc.dg/field-merge-21.c: New.
* gcc.dg/field-merge-22.c: New.
---
 gcc/gimple-fold.cc|   40 -
 gcc/testsuite/gcc.dg/field-merge-21.c |   53 +
 gcc/testsuite/gcc.dg/field-merge-22.c |   31 +++
 3 files changed, 122 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/field-merge-21.c
 create mode 100644 gcc/testsuite/gcc.dg/field-merge-22.c

diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc
index 93ed8b3abb056..5b1fbe6db1df3 100644
--- a/gcc/gimple-fold.cc
+++ b/gcc/gimple-fold.cc
@@ -7712,6 +7712,18 @@ decode_field_reference (tree *pexp, HOST_WIDE_INT 
*pbitsize,
 
   if (shiftrt)
 {
+  /* Punt if we're shifting by more than the loaded bitfield (after
+adjustment), or if there's a shift after a change of signedness, punt.
+When comparing this field with a constant, we'll check that the
+constant is a proper sign- or zero-extension (depending on signedness)
+of a value that would fit in the selected portion of the bitfield.  A
+shift after a change of signedness would make the extension
+non-uniform, and we can't deal with that (yet ???).  See
+gcc.dg/field-merge-22.c for a test that would go wrong.  */
+  if (*pbitsize <= shiftrt
+ || (convert_before_shift
+ && outer_type && unsignedp != TYPE_UNSIGNED (outer_type)))
+   return NULL_TREE;
   if (!*preversep ? !BYTES_BIG_ENDIAN : BYTES_BIG_ENDIAN)
*pbitpos += shiftrt;
   *pbitsize -= shiftrt;
@@ -8512,13 +8524,25 @@ fold_truth_andor_for_ifcombine (enum tree_code code, 
tree truth_type,
  and bit position.  */
   if (l_const.get_precision ())
 {
+  /* Before clipping upper bits of the right-hand operand of the compare,
+check that they're sign or zero extensions, depending on how the
+left-hand operand would be extended.  */
+  bool l_non_ext_bits = false;
+  if (ll_bitsize < lr_bitsize)
+   {
+ wide_int zext = wi::zext (l_const, ll_bitsize);
+ if ((ll_unsignedp ? zext : wi::sext (l_const, ll_bitsize)) == l_const)
+   l_const = zext;
+ else
+   l_non_ext_bits = true;
+   }
   /* We're doing bitwise equality tests, so don't bother with sign
 extensions.  */
   l_const = wide_int::from (l_const, lnprec, UNSIGNED);
   if (ll_and_mask.get_precision ())
l_const &= wide_int::from (ll_and_mask, lnprec, UNSIGNED);
   l_const <<= xll_bitpos;
-  if ((l_const & ~ll_mask) != 0)
+  if (l_non_ext_bits || (l_const & ~ll_mask) != 0)
{
  warning_at (lloc, OPT_Wtautological_compare,
  "comparison is always %d", wanted_code == NE_EXPR);
@@ -8530,11 +8554,23 @@ fold_truth_andor_for_ifcombine (enum tree_code code, 
tree truth_type,
 again.  */
   gcc_checking_assert (r_const.get_precision ());
 
+  /* Before clipping upper bits of the right-hand operand of the compare,
+check that they're sign or zero extensions, depending on how the
+left-hand operand would be extended.  */
+  bool r_non_ext_bits = false;
+  if (rl_bitsize < rr_bitsize)
+   {
+ wide_int zext = wi::zext (r_const, rl_bitsize);
+ if ((rl_unsignedp ? zext : wi::sext (r_const, rl_bitsize)) == r_const)
+   r_const = zext;
+ else
+   r_non_ext_bits = true;
+   }
   r_const = wide_int::from (r_const, lnprec, UNSIGNED);
   if (rl_and_mask.get_precision ())
r_const &= wide_int::from (rl_and_mask, lnprec, UNSIGNED);
   r_const <<= xrl_bitpos;
-  if ((r_const & ~rl_mask) != 0)
+  if (r_non_ext_bits || (r_const & ~rl_mask) != 0)
{
  warning_at (rloc, OPT_Wtautological_compare,
  "comparison is always %d", wanted_code == NE_EXPR);
diff --git a/gcc/testsuite/gcc.dg/field-merge-21.c 
b/gcc/testsuite/gcc.dg/field-merge-21.c
new file mode 100644
index 0..042b2123eb63e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/field-merge-21.c
@@ -0,0 +1,53 @@
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+/* PR tree-optimization/118456 */
+/* Check that shifted fields compared with a constants compare correctl

Re: [PATCH] RISC-V: Fix the result error caused by not updating ratio when using "use_max_sew" to merge vsetvl.

2025-01-13 Thread Jin Ma

> So as written this test will be totally skipped (and I've verified that 
> locally).  It looks like you just wanted -O2 and we're not cycling 
> through options, so we don't need/want the dg-skip-if.   I'll fix that too.

Sorry, I made a mistake again :(

> 
> I'll make the obvious changes and push the result to the trunk.

Thanks,
Jin

Re: [PATCH] Fortran: Added support for locality specs in DO CONCURRENT (Fortran 2018/23)

2025-01-13 Thread Jerry D


On 1/13/25 12:18 AM, Tobias Burnus wrote:

Hi,

On 9/25/24 3:18 AM, Andre Vehreschild wrote:


@@ -3089,7 +3099,15 @@ typedef struct gfc_code
  gfc_inquire *inquire;
  gfc_wait *wait;
  gfc_dt *dt;
-    gfc_forall_iterator *forall_iterator;
+
+    struct
+    {
+  gfc_forall_iterator *forall_iterator;
+  gfc_expr_list *locality[LOCALITY_NUM];
+  bool default_none;
+    }
+    concur;


I am more than unhappy about that construct. Because every concurrent
loop has a forall_iterator, but not every forall_iterator is a concurrent
loop. I therefore propose to move the forall_iterator out of the struct
and only have the concurrent specific elements in the struct. This would
also reduce the changes significantly.


First, regarding the naming, Fortran 2018 had:

FORALL forall-header

[do-stmt …]
CONCURRENT forall-header

Thus, do-concurrent and forall both used a forall-header.

Since Fortran 2023, there is now:
    FORALL concurrent-header
etc.

Thus, in Fortran 2023 both use a concurrent-header.

* * *

On the technical side:

   DO CONCURRENT (i = 1:5) mask(.true.) local(x) default(none)

This has two parts:

* A forall_iterator: 'i = 1:5' and one 'mask' expression
* A locality spec: default(none) local(x)

The forall-header is saved on mainline as:

   new_st.expr1 = mask;
   new_st.ext.forall_iterator = head;

Since Fortran 2018 (and for this patch) we additionally have to save
for 'do concurrent' somehow a boolean ('default(none)') and a list of
symbols (with knowledge about in which locality they appeared).

Storing the forall-header iterator in .ext.forall_iterator
sounds fine, but where to put the other locality data?

The most sensible place is to put it also into .ext, but as
the latter is a union, we need to ensure that both the iterator *and*
the locality data is available. - Thus, we create a struct for
'do concurrent'. But as the forall-header / concurrent-header is identical,
it makes sense to also use the same struct for FORALL and not to duplicate
code here.

IMHO the current code is fine.

* * *

On 1/7/25 12:06 PM, Jerry D wrote:

cannot understand why moving the forall_iterator from the sub-structure
'concur' back to where it was at the 'ext' sub-structure of
typedef struct gfc_code. 'ext' is a union. I suspected there is an
overlap going on there such that something is getting overwritten or
optimized away.


Well, as mentioned, a DO CONCURRENT can have both. Assume a simple
do-concurrent loop:

   DO, concurrent (I = 0:4)

This will fill code.ext.forall_iterator, which is fine. When this is being
resolved in resolve.cc, for checking the iterator, the access goes to
   code.ext.forall_iterator
which is fine - this will work for the example above.

As the next step, the locality is checked, accessing
   code.ext.concur.*

but that variable shares the memory (union!) with
code.ext.forall_iterator. Thus, accessing locality[0] will be identical to
   static_cast<gfc_expr_list *>(code.ext.forall_iterator)
and this has a very high chance to crash.

When using 'do concurrent(i=0:5) local(x)', the code.ext.forall_iterator
would already be overwritten during parsing.
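
The overlap can be reproduced outside of gfortran with a minimal C++ sketch.
The types Iter and ExprList, the two unions, and the helper functions below
are hypothetical stand-ins, not the real gfc_code definitions; they only
model the two layouts that were discussed.

```cpp
#include <cassert>

// Hypothetical stand-ins for gfc_forall_iterator and gfc_expr_list.
struct Iter { int var; };
struct ExprList { int dummy; };

constexpr int LOCALITY_NUM = 4;

// Problematic layout: the iterator pointer and the locality struct are
// separate union members, so they share the same storage.
union ExtOverlapping
{
  Iter *forall_iterator;
  struct
  {
    ExprList *locality[LOCALITY_NUM];
    bool default_none;
  } concur;
};

// Layout with the iterator inside the struct: both pieces of data can be
// stored at the same time.
union ExtNested
{
  struct
  {
    Iter *forall_iterator;
    ExprList *locality[LOCALITY_NUM];
    bool default_none;
  } concur;
};

// True if the iterator and locality[0] start at the same address.
inline bool overlapping_layout_aliases ()
{
  ExtOverlapping ext{};
  return static_cast<void *> (&ext.forall_iterator)
	 == static_cast<void *> (&ext.concur.locality[0]);
}

// Same check for the nested layout; here the two fields are distinct.
inline bool nested_layout_aliases ()
{
  ExtNested ext{};
  return static_cast<void *> (&ext.concur.forall_iterator)
	 == static_cast<void *> (&ext.concur.locality[0]);
}
```

In the first layout, writing locality[0] overwrites the iterator pointer; in
the second, both survive, which is why the iterator is kept inside the struct.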

* * *

I hope that helps!

Tobias



Thank you for this explanation. I did do one other thing: I rearranged
the typedef as follows, which is more readable to me. If this is OK I
will push the whole thing. I still need to do the gcc-commit-mklog
part of the whole thing.


Regression tested on x86_64.

OK?

Jerry

diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index 70913e3312b..7367db8853c 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -3111,6 +3111,16 @@ enum gfc_exec_op
   EXEC_OMP_ERROR, EXEC_OMP_ALLOCATE, EXEC_OMP_ALLOCATORS, EXEC_OMP_DISPATCH
 };

+/* Enum Definition for locality types.  */
+enum locality_type
+{
+  LOCALITY_LOCAL = 0,
+  LOCALITY_LOCAL_INIT,
+  LOCALITY_SHARED,
+  LOCALITY_REDUCE,
+  LOCALITY_NUM
+};
+
 typedef struct gfc_code
 {
   gfc_exec_op op;
@@ -3131,6 +3141,20 @@ typedef struct gfc_code
   {
 gfc_actual_arglist *actual;
 gfc_iterator *iterator;
+gfc_open *open;
+gfc_close *close;
+gfc_filepos *filepos;
+gfc_inquire *inquire;
+gfc_wait *wait;
+gfc_dt *dt;
+struct gfc_code *which_construct;
+gfc_entry_list *entry;
+gfc_oacc_declare *oacc_declare;
+gfc_omp_clauses *omp_clauses;
+const char *omp_name;
+gfc_omp_namelist *omp_namelist;
+bool omp_bool;
+int stop_code;

 struct
 {
@@ -3152,21 +3176,13 @@ typedef struct gfc_code
 }
 block;

-gfc_open *open;
-gfc_close *close;
-gfc_filepos *filepos;
-gfc_inquire *inquire;
-gfc_wait *wait;
-gfc_dt *dt;
-gfc_forall_iterator *forall_iterator;
-struct gfc_code *which_construct;
-int stop_code;
-gfc_entry_list *entry;
-gfc_oacc_declare *oacc_declare;
-gfc_omp_clauses *omp_clauses;
-const char *omp_name;
-gfc_omp_namelist *omp_namelist;
-bool omp_bool;
+struct
+{
+

[PATCH] c++: Add support for vec_dup to constexpr [PR118445]

2025-01-13 Thread Andrew Pinski

With the addition of support for operations on the SVE scalable vector types,
the vec_duplicate tree code can show up in expressions, but the constexpr
machinery did not handle it.
This is a simple fix to treat VEC_DUPLICATE like any other unary operator, which
allows the constexpr-add-1.C testcase to work.
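
As a rough illustration of what "vector + scalar" means once the scalar has
been duplicated across all lanes, here is a fixed-width sketch in portable
C++. Vec4, vec_duplicate, add and add_scalar are hypothetical names for this
example only; real SVE vectors are scalable and do not have this layout.

```cpp
#include <cassert>

// Hypothetical fixed-width model of a vector of floats.
struct Vec4 { float lane[4]; };

// Broadcast a scalar into every lane, the role VEC_DUPLICATE_EXPR plays.
constexpr Vec4 vec_duplicate (float x)
{
  return Vec4{{x, x, x, x}};
}

// Lane-wise addition of two vectors.
constexpr Vec4 add (Vec4 a, Vec4 b)
{
  Vec4 r{};
  for (int i = 0; i < 4; ++i)
    r.lane[i] = a.lane[i] + b.lane[i];
  return r;
}

// "b + a" with a scalar a: duplicate a, then add lane by lane.
constexpr Vec4 add_scalar (Vec4 b, float a)
{
  return add (b, vec_duplicate (a));
}

// The whole computation is usable in a constant expression.
static_assert (add_scalar (Vec4{{1.0f, 1.0f, 1.0f, 1.0f}}, 2.0f).lane[3] == 3.0f,
	       "broadcast add must be evaluable at compile time");
```

The point of the sketch is only that the broadcast is a unary operation on
its scalar operand, so evaluating it like other unary operators is natural.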

Built and tested for aarch64-linux-gnu.

PR c++/118445

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_constant_expression): Handle VEC_DUPLICATE like
a "normal" unary operator.
(potential_constant_expression_1): Likewise.

gcc/testsuite/ChangeLog:

* g++.target/aarch64/sve/constexpr-add-1.C: New test.

Signed-off-by: Andrew Pinski 
---
 gcc/cp/constexpr.cc  |  2 ++
 .../g++.target/aarch64/sve/constexpr-add-1.C | 16 
 2 files changed, 18 insertions(+)
 create mode 100644 gcc/testsuite/g++.target/aarch64/sve/constexpr-add-1.C

diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index 1345bc124ef..0896576fd28 100644
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -8005,6 +8005,7 @@ cxx_eval_constant_expression (const constexpr_ctx *ctx, tree t,
 case BIT_NOT_EXPR:
 case TRUTH_NOT_EXPR:
 case FIXED_CONVERT_EXPR:
+case VEC_DUPLICATE_EXPR:
   r = cxx_eval_unary_expression (ctx, t, lval,
 non_constant_p, overflow_p);
   break;
@@ -10344,6 +10345,7 @@ potential_constant_expression_1 (tree t, bool want_rval, bool strict, bool now,
 case UNARY_PLUS_EXPR:
 case UNARY_LEFT_FOLD_EXPR:
 case UNARY_RIGHT_FOLD_EXPR:
+case VEC_DUPLICATE_EXPR:
 unary:
   return RECUR (TREE_OPERAND (t, 0), rval);
 
diff --git a/gcc/testsuite/g++.target/aarch64/sve/constexpr-add-1.C b/gcc/testsuite/g++.target/aarch64/sve/constexpr-add-1.C
new file mode 100644
index 000..43489560c8a
--- /dev/null
+++ b/gcc/testsuite/g++.target/aarch64/sve/constexpr-add-1.C
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+
+/* PR C++/118445 */
+
+#include <arm_sve.h>
+
+/* See if constexpr handles VEC_DUPLICATE and SVE. */
+constexpr svfloat32_t f(svfloat32_t b, float a)
+{
+  return b + a;
+}
+
+svfloat32_t g(void)
+{
+  return f((svfloat32_t){1.0}, 2.0);
+}
-- 
2.43.0

Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]

2025-01-13 Thread Richard Sandiford

Richard Sandiford  writes:
> Tamar Christina  writes:
>>> -Original Message-
>>> From: Richard Sandiford 
>>> Sent: Monday, January 13, 2025 6:35 PM
>>> To: Tamar Christina 
>>> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
>>> ; ktkac...@gcc.gnu.org
>>> Subject: Re: [PATCH]AArch64: have -mcpu=native detect architecture 
>>> extensions
>>> for unknown non-homogenous systems [PR113257]
>>> 
>>> Tamar Christina  writes:
>>> > Hi All,
>>> >
>>> > in g:e91a17fe39c39e98cebe6e1cbc8064ee6846a3a7 we added the ability for
>>> > -mcpu=native on unknown CPUs to still enable architecture extensions.
>>> >
>>> > This has worked great but was only added for homogenous systems.
>>> >
>>> > However the same thing works for big.LITTLE as in such system the cores 
>>> > must
>>> > have the same extensions otherwise it doesn't fundamentally work.
>>> >
>>> > i.e. task migration from one core to the other wouldn't work.
>>> >
>>> > This extends the same handling to non-homogenous systems.
>>> >
>>> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>>> >
>>> > Ok for master?
>>> >
>>> > Thanks,
>>> > Tamar
>>> >
>>> > gcc/ChangeLog:
>>> >
>>> >   PR target/113257
>>> >   * config/aarch64/driver-aarch64.cc (host_detect_local_cpu):
>>> >
>>> > gcc/testsuite/ChangeLog:
>>> >
>>> >   PR target/113257
>>> >   * gcc.target/aarch64/cpunative/info_34: New test.
>>> >   * gcc.target/aarch64/cpunative/native_cpu_34.c: New test.
>>> >
>>> > ---
>>> >
>>> > diff --git a/gcc/config/aarch64/driver-aarch64.cc 
>>> > b/gcc/config/aarch64/driver-
>>> aarch64.cc
>>> > index
>>> 45fce67a646351b848b7cd7d0fd35d343731c0d1..2a454daf031aa3ac81a9a2c0
>>> 3b15c09731e4f56e 100644
>>> > --- a/gcc/config/aarch64/driver-aarch64.cc
>>> > +++ b/gcc/config/aarch64/driver-aarch64.cc
>>> > @@ -449,6 +449,20 @@ host_detect_local_cpu (int argc, const char **argv)
>>> > break;
>>> >   }
>>> >   }
>>> > +
>>> > +  /* On big.LITTLE if we find any unknown CPUs we can still pick arch
>>> > +  features as the cores should have the same features.  So just pick
>>> > +  the feature flags from any of the cpus.  */
>>> > +  if (aarch64_cpu_data[i].name == NULL)
>>> > + {
>>> > +   auto arch_info = get_arch_from_id (DEFAULT_ARCH);
>>> > +
>>> > +   gcc_assert (arch_info);
>>> > +
>>> > +   res = concat ("-march=", arch_info->name, NULL);
>>> > +   default_flags = arch_info->flags;
>>> > + }
>>> > +
>>> 
>>> Currently, if gcc recognises the host cpu, and if one-thing is more
>>> restrictive than that cpu, gcc will warn on:
>>> 
>>>   gcc -march=one-thing -mcpu=native
>>> 
>>> and choose one-thing.  It looks like one consequence of this patch
>>> is that, for unrecognised big.LITTLE, the command line would get
>>> converted to:
>>> 
>>>   gcc -march=one-thing -march=above-replacement
>>> 
>>> and so -mcpu=native would silently "win" over one-thing.  Is that right?
>>> 
>>> Perhaps we should adjust:
>>> 
>>>    " %{mcpu=native:%<mcpu=native %:local_cpu_detect(cpu)}"
>>> 
>>> to pass something like "cpu/arch" rather than "cpu" when -march
>>> is not specified, so that the routine knows that it has the choice
>>> of using either -mcpu or -march.  We wouldn't get the warning, but we
>>> would get predictable preemption of -march over -mcpu.
>>> 
>>> Admittedly, it looks like we already have this problem with:
>>> 
>>>   if (aarch64_cpu_data[i].name == NULL)
>>> {
>>>   auto arch_info = get_arch_from_id (DEFAULT_ARCH);
>>> 
>>>   gcc_assert (arch_info);
>>> 
>>>   res = concat ("-march=", arch_info->name, NULL);
>>>   default_flags = arch_info->flags;
>>> }
>>> 
>>> so I guess this is pre-existing.
>>
>> Yes, it looks like this was a deliberate choice that mcpu=native
>> overrides any march, this is the case for all previous GCCs as well.
>>
>> i.e. GCC 12-15 (ones I had at hand) convert
>>
>> -march=armv8.8-a+sve -mcpu=native on a Neoverse-V1 system to
>>
>> '-march=armv8.8-a+sve'  '-c' '-mlittle-endian' '-mabi=lp64' 
>> '-mcpu=neoverse-v1+sm4+crc+aes+sha3+nossbs'
>>
>> In both the calls to CC1 and to AS
>>
>> "-march=armv8.8-a+sve" 
>> "-march=armv8.4-a+rng+sm4+crc+aes+sha3+i8mm+bf16+sve+profile"
>>
>> Your request would change this long-standing behavior but I don't know if
>> that's correct or not.
>>
>> I'll admit it's inconsistent with the warning given by CC1 for the flags.
>>
>> The warnings are coming from CC1, and the question is if it's up to GCC to 
>> enforce the CC1 constraint onto binutils or not.
>
> I was talking about what DRIVER_SELF_SPECS does, and therefore what cc1
> sees.  My point was that, like you say, if you use:
>
>   -march=armv8.8-a+sve -mcpu=native
>
> on a host that GCC recognises, cc1 will see:
>
>   -march=armv8.8-a+sve -mcpu=...
>
> cc1 will then warn and compile for the -march.
>
> But with this patch, if you run gcc on a big.LITTLE host that GCC
> doesn't recognise, cc1 will see:
>
>   -march=armv8.8-a+sve -march=...
>
> The second -march will then silently override the first -march,
> so cc1 won't warn, a

[PATCH] Fix setting of call graph node AutoFDO count [PR116743]

2025-01-13 Thread Eugene Rozenfeld

We are initializing both the call graph node count and
the entry block count of the function with the head_count value
from the profile.

The count propagation algorithm may refine the entry block count
and we may end up with a case where the call graph node count
is set to 0 but the entry block count is non-zero. That becomes
a problem because we have this code in execute_fixup_cfg:

profile_count num = node->count;
profile_count den = ENTRY_BLOCK_PTR_FOR_FN (cfun)->count;
bool scale = num.initialized_p () && !(num == den);

Here if num is 0 but den is not 0, scale becomes true and we
lose the counts in

if (scale)
  bb->count = bb->count.apply_scale (num, den);

This is what happened in the issue reported in PR116743
(a 10% regression in MySQL HAMMERDB tests).
3d9e6767939e9658260e2506e81ec32b37cba041 made an improvement in
AutoFDO count propagation, which caused the mismatch between
the call graph node count (zero) and the entry block count (non-zero)
and subsequent loss of counts as described above.

The fix is to update the call graph node count once we've done count 
propagation.
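
The loss of counts can be modelled with plain integers. This is a simplified
sketch, not GCC's profile_count: apply_scale and fixup_counts below are
hypothetical helpers that assume scaling is "count * num / den" and that omit
the initialized_p () check.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simplified stand-in for profile_count::apply_scale: count * num / den.
// A zero denominator leaves the count unchanged (an assumption of this model).
inline std::uint64_t
apply_scale (std::uint64_t count, std::uint64_t num, std::uint64_t den)
{
  return den == 0 ? count : count * num / den;
}

// Mimics the scaling step in execute_fixup_cfg for a list of block counts.
// node_count plays the role of num, entry_count the role of den.
inline std::vector<std::uint64_t>
fixup_counts (std::vector<std::uint64_t> bb_counts,
	      std::uint64_t node_count, std::uint64_t entry_count)
{
  bool scale = node_count != entry_count;
  if (scale)
    for (auto &c : bb_counts)
      c = apply_scale (c, node_count, entry_count);
  return bb_counts;
}
```

With node_count == 0 and a non-zero entry_count every block count is scaled to
zero; once the node count is copied from the entry block after propagation,
the two values match and no scaling happens.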

Tested on x86_64-pc-linux-gnu.

gcc/ChangeLog:
PR gcov-profile/116743
* auto-profile.cc (afdo_annotate_cfg): Fix mismatch between the
call graph node count and the entry block count.
---
gcc/auto-profile.cc | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/auto-profile.cc b/gcc/auto-profile.cc
index 5d0e8afb9a1..aa4d1634f01 100644
--- a/gcc/auto-profile.cc
+++ b/gcc/auto-profile.cc
@@ -1538,8 +1538,6 @@ afdo_annotate_cfg (const stmt_set &promoted_stmts)
   if (s == NULL)
 return;
-  cgraph_node::get (current_function_decl)->count
- = profile_count::from_gcov_type (s->head_count ()).afdo ();
   ENTRY_BLOCK_PTR_FOR_FN (cfun)->count
  = profile_count::from_gcov_type (s->head_count ()).afdo ();
   EXIT_BLOCK_PTR_FOR_FN (cfun)->count = profile_count::zero ().afdo ();
@@ -1578,6 +1576,8 @@ afdo_annotate_cfg (const stmt_set &promoted_stmts)
   /* Calculate, propagate count and probability information on CFG.  */
   afdo_calculate_branch_prob (&annotated_bb);
 }
+  cgraph_node::get(current_function_decl)->count
+  = ENTRY_BLOCK_PTR_FOR_FN(cfun)->count;
   update_max_bb_count ();
   profile_status_for_fn (cfun) = PROFILE_READ;
   if (flag_value_profile_transformations)
--
2.34.1

[PATCH] rs6000: Fix ICE for invalid constants in built-in functions

2025-01-13 Thread Peter Bergner

After my other patch, I decided to write a test case with an illegal
constant operand value to a built-in to see what the results would be.
Without my other patch, we fail to catch the illegal use and emit an
invalid rtl insn and hit an unrecognizable insn ICE.  With my previous
patch, we correctly flag the use as invalid, but end up ICEing anyway,
because we don't return the correct return value (const0_rtx) to
signify we had an error.  


rs6000: Fix ICE for invalid constants in built-in functions

For invalid constant operand values used in built-in functions, return
const0_rtx to signify an error occurred during expansion.

Bootstrapped and retested on powerpc64le-linux with no regressions.
Ok for trunk and backports after some trunk burn-in time?

Peter


gcc/
* config/rs6000/rs6000-builtin.cc (rs6000_expand_builtin): Return
const0_rtx when there is an error.

gcc/testsuite/
* gcc.target/powerpc/mma-builtin-error.c: New test.


diff --git a/gcc/config/rs6000/rs6000-builtin.cc b/gcc/config/rs6000/rs6000-builtin.cc
index bdf2fa0b680..111802381ac 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -3459,7 +3459,7 @@ rs6000_expand_builtin (tree exp, rtx target, rtx /* subtarget */,
error ("argument %d must be a literal between 0 and %d,"
   " inclusive",
   bifaddr->restr_opnd[i], p);
-   return CONST0_RTX (mode[0]);
+   return const0_rtx;
  }
break;
  }
@@ -3476,7 +3476,7 @@ rs6000_expand_builtin (tree exp, rtx target, rtx /* subtarget */,
   " inclusive",
   bifaddr->restr_opnd[i], bifaddr->restr_val1[i],
   bifaddr->restr_val2[i]);
-   return CONST0_RTX (mode[0]);
+   return const0_rtx;
  }
break;
  }
@@ -3493,7 +3493,7 @@ rs6000_expand_builtin (tree exp, rtx target, rtx /* subtarget */,
   "between %d and %d, inclusive",
   bifaddr->restr_opnd[i], bifaddr->restr_val1[i],
   bifaddr->restr_val2[i]);
-   return CONST0_RTX (mode[0]);
+   return const0_rtx;
  }
break;
  }
@@ -3509,7 +3509,7 @@ rs6000_expand_builtin (tree exp, rtx target, rtx /* subtarget */,
   "literal %d",
   bifaddr->restr_opnd[i], bifaddr->restr_val1[i],
   bifaddr->restr_val2[i]);
-   return CONST0_RTX (mode[0]);
+   return const0_rtx;
  }
break;
  }
diff --git a/gcc/testsuite/gcc.target/powerpc/mma-builtin-error.c b/gcc/testsuite/gcc.target/powerpc/mma-builtin-error.c
new file mode 100644
index 000..a87a1570925
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/mma-builtin-error.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10" } */
+
+typedef unsigned char vec_t __attribute__((vector_size(16)));
+
+void
+foo (__vector_quad *dst, vec_t vec0, vec_t vec1) /* { dg-error "argument 5 must be a literal between 0 and 15, inclusive" } */
+{
+  __builtin_mma_pmxvi8ger4 (dst, vec0, vec1, 15, 15, -1);
+}

[PATCH] c++: dump-lang-raw with obj_type_ref fields

2025-01-13 Thread anetczuk


The raw dump of the language tree was missing information about virtual
method calls.
That information is now provided in the "tok" field of obj_type_ref.

gcc/ChangeLog:

* tree-dump.cc (dequeue_and_dump): Handle OBJ_TYPE_REF.

gcc/testsuite/ChangeLog:

* g++.dg/lang-dump-1.C: New test.

---
gcc/testsuite/g++.dg/lang-dump-1.C | 22 ++
gcc/tree-dump.cc | 7 +++
2 files changed, 29 insertions(+)
create mode 100644 gcc/testsuite/g++.dg/lang-dump-1.C

diff --git a/gcc/testsuite/g++.dg/lang-dump-1.C b/gcc/testsuite/g++.dg/lang-dump-1.C
new file mode 100644
index 000..cc467aee0f0
--- /dev/null
+++ b/gcc/testsuite/g++.dg/lang-dump-1.C
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+// { dg-additional-options "-fdump-lang-raw" }
+// Check if dump file contains OBJ_TYPE_REF with additional fields (information about called virtual method).
+
+class VExample {
+public:
+ virtual void methodV1() {}
+ virtual void methodV2() {}
+};
+
+void funcA() {
+ VExample objA;
+ VExample *ptrA = &objA;
+
+ ptrA->methodV2();
+ ptrA->methodV1();
+}
+
+// { dg-final { scan-lang-dump-times {obj_type_ref[^\n]*type:} 2 raw } }
+// { dg-final { scan-lang-dump-times {obj_type_ref[^\n]*expr:} 2 raw } }
+// { dg-final { scan-lang-dump-times {obj_type_ref[^\n]*obj :} 2 raw } }
+// { dg-final { scan-lang-dump-times {obj_type_ref[^\n]*\n[^\n]*tok :} 2 raw } }
diff --git a/gcc/tree-dump.cc b/gcc/tree-dump.cc
index c234d1ccaf3..bec36b41ea5 100644
--- a/gcc/tree-dump.cc
+++ b/gcc/tree-dump.cc
@@ -697,6 +697,13 @@ dequeue_and_dump (dump_info_p di)
dump_child ("op: ", OMP_CLAUSE_OPERAND (t, i));
}
break;
+
+ case OBJ_TYPE_REF:
+ dump_child ("expr", OBJ_TYPE_REF_EXPR (t));
+ dump_child ("obj", OBJ_TYPE_REF_OBJECT (t));
+ dump_child ("tok", OBJ_TYPE_REF_TOKEN (t));
+ break;
+
default:
/* There are no additional fields to print. */
break;
--
2.43.0

Re: [PATCH] RISC-V: fix thinko in riscv_register_move_cost ()

2025-01-13 Thread Kito Cheng

Thanks, that's apparently my stupid mistake...:P

On Tue, Jan 14, 2025 at 12:26 AM Jeff Law  wrote:
>
>
>
> On 1/11/25 4:45 PM, Vineet Gupta wrote:
> > This seemingly benign mistake caused a massive SPEC2017 Cactu regression
> > (2.1 trillion insn to 2.5 trillion) wiping out all the gains from my
> > recent sched1 improvement. Thankfully the issue was trivial to fix even
> > if hard to isolate.
> >
> > On BPI3:
> >
> > Before bug
> > --
> > |  Performance counter stats for './cactusBSSN_r_base-1':
> > |
> > |     4,557,471.02 msec  task-clock:u        #    1.000 CPUs utilized
> > |                 1,245  context-switches:u  #    0.273 /sec
> > |                     1  cpu-migrations:u    #    0.000 /sec
> > |               205,376  page-faults:u       #   45.064 /sec
> > |     7,291,944,801,307  cycles:u            #    1.600 GHz
> > |     2,134,835,735,951  instructions:u      #    0.29  insn per cycle
> > |        10,799,296,738  branches:u          #    2.370 M/sec
> > |            15,308,966  branch-misses:u     #    0.14% of all branches
> > |
> > | 4557.710508078 seconds time elapsed
> >
> > Bug
> > ---
> > |  Performance counter stats for './cactusBSSN_r_base-2':
> > |
> > |     4,801,813.79 msec  task-clock:u        #    1.000 CPUs utilized
> > |                 8,066  context-switches:u  #    1.680 /sec
> > |                     1  cpu-migrations:u    #    0.000 /sec
> > |               203,836  page-faults:u       #   42.450 /sec
> > |     7,682,826,638,790  cycles:u            #    1.600 GHz
> > |     2,503,133,291,344  instructions:u      #    0.33  insn per cycle
> > |     ^
> > |        10,799,287,796  branches:u          #    2.249 M/sec
> > |            16,641,200  branch-misses:u     #    0.15% of all branches
> > |
> > | 4802.616638386 seconds time elapsed
> > |
> >
> > Fix
> > ---
> > |  Performance counter stats for './cactusBSSN_r_base-3':
> > |
> > |     4,556,170.75 msec  task-clock:u        #    1.000 CPUs utilized
> > |                 1,739  context-switches:u  #    0.382 /sec
> > |                     0  cpu-migrations:u    #    0.000 /sec
> > |               203,458  page-faults:u       #   44.655 /sec
> > |     7,289,854,613,923  cycles:u            #    1.600 GHz
> > |     2,134,854,070,916  instructions:u      #    0.29  insn per cycle
> > |        10,799,296,807  branches:u          #    2.370 M/sec
> > |            15,403,357  branch-misses:u     #    0.14% of all branches
> > |
> > | 4556.445490123 seconds time elapsed
> >
> > Fixes: 46888571d242 "RISC-V: Add cr and cf constraint"
> > Signed-off-by: Vineet Gupta 
> >
> > gcc/ChangeLog:
> >   * config/riscv/riscv.cc (riscv_register_move_cost): Remove buggy
> >   check.
> OK
> jeff
>

Re: [PATCH] c++: 'this' capture clobbered during recursive inst [PR116756]

2025-01-13 Thread Jason Merrill


On 1/10/25 2:20 PM, Patrick Palka wrote:

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look
OK for trunk?

The documentation for LAMBDA_EXPR_THIS_CAPTURE seems outdated because
it says the field is only used at parse time, but apparently it's also
used at instantiation time.

Non-'this' captures don't seem to be affected, because there is no
corresponding LAMBDA_EXPR field that gets clobbered, and instead their
uses get resolved via the local specialization mechanism which is
recursion aware.

The bug also disappears if we explicitly use this in the openSeries call,
i.e. this->openSeries(...), because that sidesteps the use of
maybe_resolve_dummy / LAMBDA_EXPR_THIS_CAPTURE for resolving the
implicit object, and instead gets resolved via the local specialization
mechanism.

Maybe this suggests that there's a better way to fix this, but I'm not
sure...


That does sound like an interesting direction.  Maybe for a generic 
lambda, LAMBDA_EXPR_THIS_CAPTURE could just refer to the captured 
parameter, and we use retrieve_local_specialization to find the proxy?



-- >8 --

Here during instantiation of lambda::op() [with I = 0] we substitute
into the call self(self, cst<1>{}) which requires recursive instantiation
of the same lambda::op() [with I = 1] (which isn't deferred due to the
lambda's deduced return type).  During this recursive instantiation, the
DECL_EXPR case of tsubst_stmt clobbers LAMBDA_EXPR_THIS_CAPTURE to point
to the inner lambda::op()'s capture proxy instead of the outer
lambda::op(), and the original value is never restored.

So later during substitution into the openSeries call in the outer
lambda::op() maybe_resolve_dummy uses the 'this' proxy belonging to the
inner lambda::op(), which leads to a context mismatch ICE during
gimplification of the proxy.

This patch naively fixes this by making us restore LAMBDA_EXPR_THIS_CAPTURE
after instantiating a lambda's op().
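
The clobbering and the save/restore fix can be modelled in a few lines of
ordinary C++. This is only a sketch: LambdaExpr, instantiate_body_model and
proxy_seen_by_outer are hypothetical names, and an int stands in for the
capture proxy tree that LAMBDA_EXPR_THIS_CAPTURE holds.

```cpp
#include <cassert>

// Hypothetical model of a LAMBDA_EXPR: one shared slot for the 'this'
// capture proxy, identified here by the instantiation depth that set it.
struct LambdaExpr { int this_capture = -1; };

// Models instantiating op() at DEPTH.  Returns the proxy that this
// instantiation observes when it resolves the implicit 'this' *after*
// the recursive instantiation has run.
inline int
instantiate_body_model (LambdaExpr &lam, int depth, int max_depth,
			bool save_restore)
{
  int saved = lam.this_capture;	// value on entry
  lam.this_capture = depth;	// DECL_EXPR handling installs our proxy
  if (depth < max_depth)
    instantiate_body_model (lam, depth + 1, max_depth, save_restore);
  int seen = lam.this_capture;	// lookup in the spirit of maybe_resolve_dummy
  if (save_restore)
    lam.this_capture = saved;	// the fix: restore on exit
  return seen;
}

// Proxy seen by the outermost instantiation with one level of recursion.
inline int proxy_seen_by_outer (bool save_restore)
{
  LambdaExpr lam;
  return instantiate_body_model (lam, 0, 1, save_restore);
}
```

Without the restore the outer instantiation sees the inner proxy (1); with it,
the outer instantiation sees its own proxy (0) again.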

PR c++/116756

gcc/cp/ChangeLog:

* pt.cc (instantiate_body): Restore LAMBDA_EXPR_THIS_CAPTURE
after instantiating a lambda's op().

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/constexpr-if-lambda7.C: New test.
---
  gcc/cp/pt.cc  | 11 +
  .../g++.dg/cpp1z/constexpr-if-lambda7.C   | 24 +++
  2 files changed, 35 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/cpp1z/constexpr-if-lambda7.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index b22129d8a46..a141de56446 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -27512,6 +27512,13 @@ instantiate_body (tree pattern, tree args, tree d, bool nested_p)
local_specialization_stack lss (push_to_top ? lss_blank : lss_copy);
tree block = NULL_TREE;
  
+  tree saved_this_capture = NULL_TREE;
+  if (LAMBDA_FUNCTION_P (d))
+   /* Save/restore the 'this' capture, which gets clobbered by tsubst_stmt,
+  which causes problems in case of recursive op() instantiation.  */
+   saved_this_capture
+ = LAMBDA_EXPR_THIS_CAPTURE (CLASSTYPE_LAMBDA_EXPR (DECL_CONTEXT (d)));
+
/* Set up context.  */
if (nested_p)
block = push_stmt_list ();
@@ -27555,6 +27562,10 @@ instantiate_body (tree pattern, tree args, tree d, bool nested_p)
  
if (DECL_OMP_DECLARE_REDUCTION_P (code_pattern))
cp_check_omp_declare_reduction (d);
+
+  if (LAMBDA_FUNCTION_P (d))
+   LAMBDA_EXPR_THIS_CAPTURE (CLASSTYPE_LAMBDA_EXPR (DECL_CONTEXT (d)))
+ = saved_this_capture;
  }
  
/* We're not deferring instantiation any more.  */

diff --git a/gcc/testsuite/g++.dg/cpp1z/constexpr-if-lambda7.C b/gcc/testsuite/g++.dg/cpp1z/constexpr-if-lambda7.C
new file mode 100644
index 000..8304c8f22e3
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/constexpr-if-lambda7.C
@@ -0,0 +1,24 @@
+// PR c++/116756
+// { dg-do compile { target c++17 } }
+
+template<int N> struct cst { static constexpr int value = N; };
+
+struct Store {
+  void openDF() {
+auto lambda = [this](auto& self, auto I) {
+  if constexpr (I.value == 0) {
+auto next = [&self] { self(self, cst<1>{}); };
+openSeries(next);
+  } else {
+openSeries(0);
+  }
+};
+lambda(lambda, cst<0>{});
+  }
+  template<class T> void openSeries(T) { }
+};
+
+int main() {
+  Store store;
+  store.openDF();
+}

Re: [PATCH] c++: Delete defaulted operator <=> if std::strong_ordering::equal doesn't convert to its rettype [PR118387]

2025-01-13 Thread Jason Merrill


On 1/13/25 10:57 AM, Jakub Jelinek wrote:

On Fri, Jan 10, 2025 at 12:04:53PM -0500, Jason Merrill wrote:

Note, the PR raises another problem.
If on the same testcase the B b; line is removed, we silently synthesize
operator<=> which will crash at runtime due to returning without a return
statement.  That is because the standard says that in that case
it should return static_cast<R>(std::strong_ordering::equal);
but I can't find any wording which would say that if that isn't
valid, the function is deleted.
https://eel.is/c++draft/class.compare#class.spaceship-2.2
seems to cover just the case where there are some members and their
comparison is invalid, in which case it is deleted; but here there are
none and it follows
https://eel.is/c++draft/class.compare#class.spaceship-3.sentence-2
So, we synthesize with tf_none, see the static_cast is invalid, don't
add error_mark_node statement silently, but as the function isn't deleted,
we just silently emit it.
Should the standard be amended to say that the operator should be deleted
even if it has no elements and the static cast from
https://eel.is/c++draft/class.compare#class.spaceship-3.sentence-2
is invalid?


That seems pretty obviously what we want, and is what the other compilers
implement.


So like this?


I'd rather hoist the build_static_cast from later; the rules for static 
cast aren't quite the same as can_convert.  See:



  if (defining)
{
  tree val;
  if (code == EQ_EXPR)
val = boolean_true_node;
  else
{
  tree seql = lookup_comparison_result (cc_strong_ordering,
"equal", complain);
  val = build_static_cast (input_location, rettype, seql,
   complain);


Apparently this also needs to happen when !defining.


}
  finish_return_stmt (val);
}



Will you handle the defect report (unless you think nothing needs to be
clarified), or should I file something?


I will.


2025-01-13  Jakub Jelinek  

PR c++/118387
* method.cc (build_comparison_op): Set bad if
std::strong_ordering::equal doesn't convert to rettype.

* g++.dg/cpp2a/spaceship-err6.C: Expect another error.
* g++.dg/cpp2a/spaceship-synth17.C: Likewise.
* g++.dg/cpp2a/spaceship-synth-neg6.C: Likewise.
* g++.dg/cpp2a/spaceship-synth-neg7.C: New test.

--- gcc/cp/method.cc.jj 2025-01-11 21:58:05.387588681 +0100
+++ gcc/cp/method.cc    2025-01-13 16:19:09.896650756 +0100
@@ -1635,6 +1635,26 @@ build_comparison_op (tree fndecl, bool d
  rettype = common_comparison_type (comps);
  apply_deduced_return_type (fndecl, rettype);
}
+  else if (code == SPACESHIP_EXPR && cat_tag_for (rettype) == cc_last)
+   {
+ /* The return value is ... and
+static_cast<R>(std::strong_ordering::equal) otherwise.
+Make sure to delete or diagnose if such a static cast is not
+valid.  */
+ tree seql = lookup_comparison_result (cc_strong_ordering,
+   "equal", complain);
+ if (seql == error_mark_node)
+   bad = true;
+ else if (!can_convert (rettype, TREE_TYPE (seql), complain))
+   {
+ if (complain & tf_error)
+   error_at (info.loc,
+ "%<std::strong_ordering::equal%> does not convert "
+ "to %qD return type %qT",
+ fndecl, rettype);
+ bad = true;
+   }
+   }
if (bad)
{
  DECL_DELETED_FN (fndecl) = true;
--- gcc/testsuite/g++.dg/cpp2a/spaceship-err6.C.jj  2021-04-14 19:19:14.050804249 +0200
+++ gcc/testsuite/g++.dg/cpp2a/spaceship-err6.C 2025-01-13 16:30:13.613331069 +0100
@@ -10,7 +10,7 @@ class MyClass
  public:
MyClass(int value): mValue(value) {}
  
-  bool operator<=>(const MyClass&) const = default;
+  bool operator<=>(const MyClass&) const = default;  // { dg-error "'std::strong_ordering::equal' does not convert to 'constexpr bool MyClass::operator<=>\\\(const MyClass&\\\) const' return type 'bool'" }
  };
  
  int main()

--- gcc/testsuite/g++.dg/cpp2a/spaceship-synth17.C.jj   2025-01-11 21:58:05.460587663 +0100
+++ gcc/testsuite/g++.dg/cpp2a/spaceship-synth17.C  2025-01-13 16:32:11.383677413 +0100
@@ -8,7 +8,7 @@ struct B {};
  struct A
  {
B b;// { dg-error "no match for 'operator<=>' in '\[^\n\r]*' \\\(operand types are 'B' and 'B'\\\)" }
-  int operator<=> (const A &) const = default;
+  int operator<=> (const A &) const = default;   // { dg-error "'std::strong_ordering::equal' does not convert to 'constexpr int A::operator<=>\\\(const A&\\\) const' return type 'int'" }
  };
  
  int

--- gcc/testsuite/g++.dg/cpp2a/spaceship-synth-neg6.C.jj2021-08-12 
20:37:12.696473756 +0200
+++ gcc/testsuite/g++.dg/cpp2a/spaceship-synth-neg6.C   2025-01-13 
16:48:22

Re: [PATCH] Fortran: Added support for locality specs in DO CONCURRENT (Fortran 2018/23)

2025-01-13 Thread Jerry D


Committed as:

commit 20b8500cfa522ebe0fcf756d5b32816da7f904dd (HEAD -> master, origin/master, origin/HEAD)

Author: Anuj Mohite 
Date:   Mon Jan 13 16:28:57 2025 -0800

Fortran: Add LOCALITY support for DO_CONCURRENT

This patch provided by Anuj Mohite as part of the GSoC project.
It is modified slightly by Jerry DeLisle for minor formatting.
The patch provides front-end parsing of the LOCALITY specs in
DO_CONCURRENT and adds numerous test cases.

gcc/fortran/ChangeLog:

* dump-parse-tree.cc (show_code_node):  Updated to use
c->ext.concur.forall_iterator instead of c->ext.forall_iterator.
* frontend-passes.cc (index_interchange): Updated to
use c->ext.concur.forall_iterator instead of c->ext.forall_iterator.
(gfc_code_walker): Likewise.
* gfortran.h (enum locality_type): Added new enum for locality types
in DO CONCURRENT constructs.
* match.cc (match_simple_forall): Updated to use
new_st.ext.concur.forall_iterator instead of new_st.ext.forall_iterator.
(gfc_match_forall): Likewise.
(gfc_match_do):  Implemented support for matching DO CONCURRENT locality
specifiers (LOCAL, LOCAL_INIT, SHARED, DEFAULT(NONE), and REDUCE).
* parse.cc (parse_do_block): Updated to use
new_st.ext.concur.forall_iterator instead of new_st.ext.forall_iterator.
* resolve.cc (struct check_default_none_data): Added struct
check_default_none_data.
(do_concur_locality_specs_f2023): New function to check compliance
with F2023's C1133 constraint for DO CONCURRENT.
(check_default_none_expr): New function to check DEFAULT(NONE)
compliance.
(resolve_locality_spec): New function to resolve locality specs.
(gfc_count_forall_iterators): Updated to use
code->ext.concur.forall_iterator.
(gfc_resolve_forall): Updated to use code->ext.concur.forall_iterator.
* st.cc (gfc_free_statement): Updated to free locality specifications
and use p->ext.concur.forall_iterator.
* trans-stmt.cc (gfc_trans_forall_1): Updated to use
code->ext.concur.forall_iterator.

gcc/testsuite/ChangeLog:

* gfortran.dg/do_concurrent_10.f90: New test.
* gfortran.dg/do_concurrent_8_f2018.f90: New test.
* gfortran.dg/do_concurrent_8_f2023.f90: New test.
* gfortran.dg/do_concurrent_9.f90: New test.
* gfortran.dg/do_concurrent_all_clauses.f90: New test.
* gfortran.dg/do_concurrent_basic.f90: New test.
* gfortran.dg/do_concurrent_constraints.f90: New test.
* gfortran.dg/do_concurrent_local_init.f90: New test.
* gfortran.dg/do_concurrent_locality_specs.f90: New test.
* gfortran.dg/do_concurrent_multiple_reduce.f90: New test.
* gfortran.dg/do_concurrent_nested.f90: New test.
* gfortran.dg/do_concurrent_parser.f90: New test.
* gfortran.dg/do_concurrent_reduce_max.f90: New test.
* gfortran.dg/do_concurrent_reduce_sum.f90: New test.
* gfortran.dg/do_concurrent_shared.f90: New test.

Signed-off-by: Anuj

Re: [PATCH] c++: explicit spec of constrained member tmpl [PR107522]

2025-01-13 Thread Jason Merrill


On 9/12/24 1:07 PM, Patrick Palka wrote:

(Sorry to resurrect this thread so late, I lost track of this patch...)

On Fri, 2 Dec 2022, Jason Merrill wrote:


On 12/2/22 09:30, Patrick Palka wrote:

On Thu, 1 Dec 2022, Jason Merrill wrote:


On 12/1/22 14:51, Patrick Palka wrote:

On Thu, 1 Dec 2022, Jason Merrill wrote:


On 12/1/22 11:37, Patrick Palka wrote:

When defining an explicit specialization of a constrained member template
(of a class template) such as f and g in the below testcase, the
DECL_TEMPLATE_PARMS of the corresponding TEMPLATE_DECL are partially
instantiated, whereas its associated constraints are carried over
from the original template and thus are in terms of the original
DECL_TEMPLATE_PARMS.


But why are they carried over?  We wrote a specification of the
constraints in
terms of the template parameters of the specialization, why are we
throwing
that away?


Using the partially instantiated constraints would require adding a
special case to satisfaction since during satisfaction we currently
always use the full set of template arguments (relative to the most
general template).


But not for partial specializations, right?  It seems natural to handle
this
explicit instantiation the way we handle partial specializations, as both
have
their constraints written in terms of their template parameters.


True, but what about the general rule that we don't partially instantiate
constraints outside of declaration matching?  Checking satisfaction of
partially instantiated constraints here can introduce hard errors during
normalization, e.g.

template<class T>
concept C1 = __same_as(T, void);

template<class T>
concept C2 = C1<typename T::type>;

template<int N>
concept D = (N == 42);

template<class T>
struct A {
  template<int N>
  static void f() requires C2<T> || D<N>;
};

template<>
template<int N>
void A<int>::f() requires C2<int> || D<N> { }

int main() {
  A<int>::f<42>();
}

Normalization of the partially instantiated constraints will give a
hard error due to 'int::type' being ill-formed, whereas the uninstantiated
constraints are fine.


Hmm, interesting point, but in this example that happens because the
specialization is nonsensical: we wouldn't be normalizing the
partially-instantiated constraints so much as the ones that the user
explicitly wrote, so a hard error seems justified.


While the written partially-instantiated constraints are nonsensical,
aren't they only needed for the sake of declaration matching?  It doesn't
seem to necessarily imply that that form of constraints is what should
prevail.  This is where the analogy with partial specializations breaks
down IMHO: partial specializations own their constraints.


Hmm, I suppose you're right, we aren't overwriting the partially 
instantiated decl, just matching against it.  Your original patch is OK.


It would be nice to have predicate functions to distinguish between 
partial specializations and member specializations so we don't have to 
get into the lower-level details here.


Jason

[PATCH] lto: Remove link() to fix build with MinGW [PR118238]

2025-01-13 Thread Michal Jires

I used link() to create cheap copies of Incremental LTO cache contents
to prevent their deletion once linking is finished.
This is unnecessary, since output_files are deleted in our lto-plugin
and not in the linker itself.

Bootstrapped/regtested on x86_64-linux.
lto-wrapper now builds on MinGW again, though so far I have not set up
MinGW to be able to do a full bootstrap.
Ok for trunk?

PR lto/118238

gcc/ChangeLog:

* lto-wrapper.cc (run_gcc): Remove link() copying.

lto-plugin/ChangeLog:

* lto-plugin.c (cleanup_handler):
Keep output_files when using Incremental LTO.
(onload): Detect Incremental LTO.
---
 gcc/lto-wrapper.cc  | 34 +-
 lto-plugin/lto-plugin.c |  9 +++--
 2 files changed, 12 insertions(+), 31 deletions(-)

diff --git a/gcc/lto-wrapper.cc b/gcc/lto-wrapper.cc
index f9b2511c38e..a980b208783 100644
--- a/gcc/lto-wrapper.cc
+++ b/gcc/lto-wrapper.cc
@@ -1571,6 +1571,8 @@ run_gcc (unsigned argc, char *argv[])
  /* Exists.  */
  if (access (option->arg, W_OK) == 0)
ltrans_cache_dir = option->arg;
+ else
+   fatal_error (input_location, "missing directory: %s", option->arg);
  break;
 
case OPT_flto_incremental_cache_size_:
@@ -2218,39 +2220,13 @@ cont:
{
  for (i = 0; i < nr; ++i)
{
- char *input_name = input_names[i];
- char const *output_name = output_names[i];
-
  ltrans_file_cache::item* item;
- item = ltrans_cache.get_item (input_name);
+ item = ltrans_cache.get_item (input_names[i]);
 
- if (item && !save_temps)
+ if (item)
{
+ /* Ensure LTRANS for this item finished.  */
  item->lock.lock_read ();
- /* Ensure that cached compiled file is not deleted.
-Create copy.  */
-
- obstack_grow (&env_obstack, output_name,
-   strlen (output_name) - 2);
- obstack_grow (&env_obstack, ".cache_copy.XXX.o",
-   sizeof (".cache_copy.XXX.o"));
-
- char* output_name_link = XOBFINISH (&env_obstack, char *);
- char* name_idx = output_name_link + strlen (output_name_link)
-  - strlen ("XXX.o");
-
- /* lto-wrapper can run in parallel and access
-the same partition.  */
- for (int j = 0; ; j++)
-   {
- gcc_assert (j < 1000);
- sprintf (name_idx, "%03d.o", j);
-
- if (link (output_name, output_name_link) != EEXIST)
-   break;
-   }
-
- output_names[i] = output_name_link;
  item->lock.unlock ();
}
}
diff --git a/lto-plugin/lto-plugin.c b/lto-plugin/lto-plugin.c
index 6bccb56291c..6c78d019cf1 100644
--- a/lto-plugin/lto-plugin.c
+++ b/lto-plugin/lto-plugin.c
@@ -214,6 +214,7 @@ static char *ltrans_objects = NULL;
 
 static bool debug;
 static bool save_temps;
+static bool flto_incremental;
 static bool verbose;
 static char nop;
 static char *resolution_file = NULL;
@@ -941,8 +942,9 @@ cleanup_handler (void)
   if (arguments_file_name)
 maybe_unlink (arguments_file_name);
 
-  for (i = 0; i < num_output_files; i++)
-maybe_unlink (output_files[i]);
+  if (!flto_incremental)
+for (i = 0; i < num_output_files; i++)
+  maybe_unlink (output_files[i]);
 
   free_2 ();
   return LDPS_OK;
@@ -1615,6 +1617,9 @@ onload (struct ld_plugin_tv *tv)
   if (strstr (collect_gcc_options, "'-save-temps'"))
save_temps = true;
 
+  if (strstr (collect_gcc_options, "'-flto-incremental="))
+   flto_incremental = true;
+
   if (strstr (collect_gcc_options, "'-v'")
   || strstr (collect_gcc_options, "'--verbose'"))
verbose = true;
-- 
2.47.1

Re: [PATCH v2] RISC-V: Fix ICE for unrecognizable insn `UNSPEC_VSETVL` for XTheadVector

2025-01-13 Thread Jin Ma

> > Thank you very much for your professional reply. I am trying to solve the
> > problem using the "spec_restriction" way. But unfortunately, I have a new
> > problem. In the pattern below, how can I enable "r" and disable "K" for
> > XTheadVector? "rK" already seems to be the smallest unit and cannot be
> > controlled separately using spec_restriction?
> >
> > (define_insn "@pred_madc"
> >   [(set (match_operand: 0 "register_operand" "=vr, &vr, &vr")
> >  (unspec:
> > [(plus:VI
> >   (match_operand:VI 1 "register_operand" "  %0,  vr,  vr")
> >   (match_operand:VI 2 "vector_arith_operand" "vrvi,  vr,  vi"))
> >  (match_operand: 3 "register_operand""  vm,  vm,  vm")
> >  (unspec:
> >[(match_operand 4 "vector_length_operand" "  rK,  rK,  rK")
> > (match_operand 5 "const_int_operand" "   i,   i,   i")
> > (reg:SI VL_REGNUM)
> > (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)] UNSPEC_VMADC))]
> >   "TARGET_VECTOR"
> >   "vmadc.v%o2m\t%0,%1,%v2,%3"
> >   [(set_attr "type" "vicalu")
> >(set_attr "mode" "")
> >(set_attr "vl_op_idx" "4")
> >(set (attr "avl_type_idx") (const_int 5))
> >(set_attr "spec_restriction" "thv,none,none")])
> 
> "rK" can be split up further into "r" and "K" so I'd say you
> need to adjust (and split) the alternatives accordingly.  The new "r"
> alternative would have spec_restriction "none" and the "K" alternative "rvv".

Yes. This will solve the problem, but it will lead to very large-scale changes
(splitting each rK, adding one more constraint column), and make the pattern
more complex and more difficult to maintain. In contrast, how about replacing
"rK" with a new constraint in the way Jeff mentioned? For example, "Vvl".

Jeff: We could also create a new constraint that mostly behaves like rK, but 
rejects (const_int 1) when thead-vector is enabled and use that in the 
vsetvl pattern instead of rK. ( 
https://gcc.gnu.org/pipermail/gcc-patches/2024-December/671836.html )

BR
Jin

> -- 
> Regards
>  Robin

Re: [PATCH] Fortran: Added support for locality specs in DO CONCURRENT (Fortran 2018/23)

2025-01-13 Thread Tobias Burnus


Hi,

On 9/25/24 3:18 AM, Andre Vehreschild wrote:


@@ -3089,7 +3099,15 @@ typedef struct gfc_code
  gfc_inquire *inquire;
  gfc_wait *wait;
  gfc_dt *dt;
-    gfc_forall_iterator *forall_iterator;
+
+    struct
+    {
+  gfc_forall_iterator *forall_iterator;
+  gfc_expr_list *locality[LOCALITY_NUM];
+  bool default_none;
+    }
+    concur;


I am more than unhappy about that construct. Because every concurrent loop
has a forall_iterator, but not every forall_iterator is a concurrent loop. I
therefore propose to move the forall_iterator out of the struct and only
have the concurrent specific elements in the struct. This would also reduce
the changes significantly.


First, regarding the naming, Fortran 2018 had:

FORALL forall-header

[do-stmt …]
CONCURRENT forall-header

Thus, both DO CONCURRENT and FORALL used a forall-header.

Since Fortran 2023, there is now:
   FORALL concurrent-header
etc.

Thus, in Fortran 2023 both use a concurrent-header.

* * *

On the technical side:

  DO CONCURRENT (i = 1:5) mask(.true.) local(x) default(none)

This has two parts:

* A forall_iterator: 'i = 1:5' and one 'mask' expression
* A locality spec: default(none) local(x)

The forall-header is saved on mainline as:

  new_st.expr1 = mask;
  new_st.ext.forall_iterator = head;

Since Fortran 2018 (and for this patch), 'do concurrent' additionally needs
a boolean ('default(none)') and a list of symbols (together with the
locality in which each of them appeared) to be saved somewhere.

Storing the forall-header iterator in .ext.forall_iterator
sounds fine, but where should the other locality data go?

The most sensible place is to put it into .ext as well, but as
the latter is a union, we need to ensure that both the iterator *and*
the locality data are available. Thus, we create a struct for
'do concurrent'. But as the forall-header / concurrent-header is identical,
it makes sense to also use the same struct for FORALL and not to duplicate
code here.

IMHO the current code is fine.

* * *

On 1/7/25 12:06 PM, Jerry D wrote:

cannot understand why moving the forall_iterator from the 
sub-structure 'concur' back to where it was at the 'ext' sub-structure 
of typedef struct gfc_code. 'ext' is a union. I suspected there is an 
overlap going on there such that something is getting overwritten or 
optimized away.


Well, as mentioned, a DO CONCURRENT can have both. Assume a simple 
do-concurrent loop:
  DO, concurrent (I = 0:4)

This will fill code.ext.forall_iterator, which is fine. When this is resolved
in resolve.cc, the iterator check accesses
  code.ext.forall_iterator
which is fine - this will work for the example above.

As the next step, the locality is checked, accessing
  code.ext.concur.*

but that variable shares the memory (union!) with code.ext.forall_iterator.
Thus, accessing locality[0] will be identical to
  static_cast<gfc_expr_list *>(code.ext.forall_iterator)
and this has a very high chance of crashing.

When using 'do concurrent(i=0:5) local(x)', the code.ext.forall_iterator
would already be overwritten during parsing.
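
The overlap described above can be illustrated with a small C sketch. The
types 'iter' and 'expr_list', the value of LOCALITY_NUM and the two union
names are simplified stand-ins chosen for this illustration; they are
assumptions, not the real gfortran declarations. The sketch only compares
member addresses to show which layout makes the iterator and locality[0]
share storage.

```c
#include <stdbool.h>

/* Simplified stand-ins for gfc_forall_iterator and gfc_expr_list
   (assumed for illustration only).  */
struct iter { int dummy; };
struct expr_list { int dummy; };

#define LOCALITY_NUM 4  /* Hypothetical value; the real constant may differ.  */

/* Layout A: the iterator is a direct union member next to 'concur'.
   Every union member starts at offset 0, so the two share storage.  */
union ext_overlapping
{
  struct iter *forall_iterator;
  struct
  {
    struct expr_list *locality[LOCALITY_NUM];
    bool default_none;
  } concur;
};

/* Layout B (the shape used in the patch): the iterator lives inside
   'concur', so it has its own slot in front of the locality array.  */
union ext_nested
{
  struct
  {
    struct iter *forall_iterator;
    struct expr_list *locality[LOCALITY_NUM];
    bool default_none;
  } concur;
};

/* True if the iterator and locality[0] occupy the same bytes in layout A.  */
bool
overlapping_shares_storage (void)
{
  union ext_overlapping u;
  return (void *) &u.forall_iterator == (void *) &u.concur.locality[0];
}

/* True if the iterator and locality[0] occupy the same bytes in layout B.  */
bool
nested_shares_storage (void)
{
  union ext_nested u;
  return (void *) &u.concur.forall_iterator == (void *) &u.concur.locality[0];
}
```

With layout A the first function returns true and any write to the locality
list clobbers the iterator pointer; with layout B the second function returns
false, so both pieces of data can be live at the same time.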

* * *

I hope that helps!

Tobias

Re: [PATCH v2] RISC-V: Fix ICE for unrecognizable insn `UNSPEC_VSETVL` for XTheadVector

2025-01-13 Thread Robin Dapp

> Thank you very much for your professional reply. I am trying to solve the
> problem using the "spec_restriction" way. But unfortunately, I have a new
> problem. In the pattern below, how can I enable "r" and disable "K" for
> XTheadVector? "rK" already seems to be the smallest unit and cannot be
> controlled separately using spec_restriction?
>
> (define_insn "@pred_madc"
>   [(set (match_operand: 0 "register_operand" "=vr, &vr, &vr")
>   (unspec:
>  [(plus:VI
>(match_operand:VI 1 "register_operand" "  %0,  vr,  vr")
>(match_operand:VI 2 "vector_arith_operand" "vrvi,  vr,  vi"))
>   (match_operand: 3 "register_operand""  vm,  vm,  vm")
>   (unspec:
> [(match_operand 4 "vector_length_operand" "  rK,  rK,  rK")
>  (match_operand 5 "const_int_operand" "   i,   i,   i")
>  (reg:SI VL_REGNUM)
>  (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)] UNSPEC_VMADC))]
>   "TARGET_VECTOR"
>   "vmadc.v%o2m\t%0,%1,%v2,%3"
>   [(set_attr "type" "vicalu")
>(set_attr "mode" "")
>(set_attr "vl_op_idx" "4")
>(set (attr "avl_type_idx") (const_int 5))
>(set_attr "spec_restriction" "thv,none,none")])

"rK" can be split up further into "r" and "K" so I'd say you
need to adjust (and split) the alternatives accordingly.  The new "r"
alternative would have spec_restriction "none" and the "K" alternative "rvv".

-- 
Regards
 Robin

[PATCH v2 3/4] RISC-V: Add .note.gnu.property for ZICFILP and ZICFISS ISA extension

2025-01-13 Thread Monk Chiang

gcc/ChangeLog:
* gcc/config/riscv/riscv.cc
(riscv_file_end_indicate_exec_stack): Add .note.gnu.property.
* gcc/config/riscv/linux.h (TARGET_ASM_FILE_END): Define.

libgcc/ChangeLog:
* libgcc/config/riscv/crti.S: Add lpad instructions.
* libgcc/config/riscv/crtn.S: Likewise.
* libgcc/config/riscv/save-restore.S: Likewise.
* libgcc/config/riscv/riscv-asm.h: Add GNU_PROPERTY for ZICFILP,
  ZICFISS.
---
 gcc/config/riscv/linux.h   |  2 +
 gcc/config/riscv/riscv.cc  | 50 ++
 libgcc/config/riscv/crti.S |  2 +
 libgcc/config/riscv/crtn.S |  2 +
 libgcc/config/riscv/riscv-asm.h| 69 +-
 libgcc/config/riscv/save-restore.S |  5 +++
 6 files changed, 129 insertions(+), 1 deletion(-)

diff --git a/gcc/config/riscv/linux.h b/gcc/config/riscv/linux.h
index 9060c940a44..c1cfa9bf28a 100644
--- a/gcc/config/riscv/linux.h
+++ b/gcc/config/riscv/linux.h
@@ -61,6 +61,8 @@ along with GCC; see the file COPYING3.  If not see
-dynamic-linker " GNU_USER_DYNAMIC_LINKER "}} \
 %{static:-static} %{static-pie:-static -pie --no-dynamic-linker -z text}}"
 
+#define TARGET_ASM_FILE_END riscv_file_end_indicate_exec_stack
+
 #define STARTFILE_PREFIX_SPEC  \
"/lib" XLEN_SPEC "/" ABI_SPEC "/ "  \
"/usr/lib" XLEN_SPEC "/" ABI_SPEC "/ "  \
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 4afb0b95839..fa4a706bf56 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -10334,6 +10334,56 @@ riscv_file_start (void)
 riscv_emit_attribute ();
 }
 
+void
+riscv_file_end_indicate_exec_stack ()
+{
+  file_end_indicate_exec_stack ();
+  long GNU_PROPERTY_RISCV_FEATURE_1_AND  = 0;
+  unsigned long feature_1_and = 0;
+
+  if (TARGET_ZICFISS)
+feature_1_and |= 0x1 << 0;
+
+  if (TARGET_ZICFILP)
+feature_1_and |= 0x1 << 1;
+
+  if (feature_1_and)
+{
+  /* Generate .note.gnu.property section.  */
+  switch_to_section (get_section (".note.gnu.property",
+ SECTION_NOTYPE, NULL));
+
+  /* The program property descriptor is aligned to 4 bytes in 32-bit
+objects and 8 bytes in 64-bit objects.  */
+  unsigned p2align = TARGET_64BIT ? 3 : 2;
+
+  fprintf (asm_out_file, "\t.p2align\t%u\n", p2align);
+  /* name length.  */
+  fprintf (asm_out_file, "\t.long\t1f - 0f\n");
+  /* data length.  */
+  fprintf (asm_out_file, "\t.long\t5f - 2f\n");
+  /* note type.  */
+  fprintf (asm_out_file, "\t.long\t5\n");
+  fprintf (asm_out_file, "0:\n");
+  /* vendor name: "GNU".  */
+  fprintf (asm_out_file, "\t.asciz\t\"GNU\"\n");
+  fprintf (asm_out_file, "1:\n");
+
+  /* pr_type.  */
+  fprintf (asm_out_file, "\t.p2align\t3\n");
+  fprintf (asm_out_file, "2:\n");
+  fprintf (asm_out_file, "\t.long\t0xc000\n");
+  /* pr_datasz.  */
+  fprintf (asm_out_file, "\t.long\t4f - 3f\n");
+  fprintf (asm_out_file, "3:\n");
+  /* zicfiss, zicfilp.  */
+  fprintf (asm_out_file, "\t.long\t%x\n", feature_1_and);
+  fprintf (asm_out_file, "4:\n");
+  fprintf (asm_out_file, "\t.p2align\t%u\n", p2align);
+  fprintf (asm_out_file, "5:\n");
+}
+}
+
 /* Implement TARGET_ASM_OUTPUT_MI_THUNK.  Generate rtl rather than asm text
in order to avoid duplicating too much logic from elsewhere.  */
 
diff --git a/libgcc/config/riscv/crti.S b/libgcc/config/riscv/crti.S
index 89bac706c63..3a67fd77156 100644
--- a/libgcc/config/riscv/crti.S
+++ b/libgcc/config/riscv/crti.S
@@ -1 +1,3 @@
 /* crti.S is empty because .init_array/.fini_array are used exclusively. */
+
+#include "riscv-asm.h"
diff --git a/libgcc/config/riscv/crtn.S b/libgcc/config/riscv/crtn.S
index ca6ee7b6fba..cb80782bb55 100644
--- a/libgcc/config/riscv/crtn.S
+++ b/libgcc/config/riscv/crtn.S
@@ -1 +1,3 @@
 /* crtn.S is empty because .init_array/.fini_array are used exclusively. */
+
+#include "riscv-asm.h"
diff --git a/libgcc/config/riscv/riscv-asm.h b/libgcc/config/riscv/riscv-asm.h
index b6dbeaedc20..73bddb3f9e7 100644
--- a/libgcc/config/riscv/riscv-asm.h
+++ b/libgcc/config/riscv/riscv-asm.h
@@ -23,9 +23,11 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If 
not, see
 #define FUNC_SIZE(X)   .size X,.-X
 
 #define FUNC_BEGIN(X)  \
+   .align 2;   \
.globl X;   \
FUNC_TYPE (X);  \
-X:
+X: \
+   LPAD
 
 #define FUNC_END(X)\
FUNC_SIZE(X)
@@ -39,3 +41,68 @@ X:
 #define HIDDEN_JUMPTARGET(X)   CONCAT1(__hidden_, X)
 #define HIDDEN_DEF(X)  FUNC_ALIAS(HIDDEN_JUMPTARGET(X), X); \
.hidden HIDDEN_JUMPTARGET(X)
+
+/* GNU_PROPERTY_RISCV64_* macros from elf.h for use in asm code.  */
+#define FEATURE_1_AND 0xc000
+#define FEATURE_1_FCFI 1
+#define FEATURE_1_BCFI 2
+
+/* Add a

GCC 15.0.0 Status Report (2025-01-13), Stage 4 in effect NOW

2025-01-13 Thread Richard Biener

Status
======

The GCC development branch which will become GCC 15 is now in
stage4, open for regression and documentation fixes only.


Quality Data
============

Priority          #   Change from last report
--------        ---   -----------------------
P1               32   +   6
P2              611   -  25
P3              267   +  47
P4              208   +   6
P5               24   -   1
--------        ---   -----------------------
Total P1-P3     910   +  28
Total          1142   +  33


Previous Report
===============

https://gcc.gnu.org/pipermail/gcc/2024-November/245163.html

[PATCH] RISC-V: Fix program logic errors caused by data truncation on 32-bit host for zbs, such as i386.

2025-01-13 Thread Jin Ma

Correct logic on 64-bit host:
...
bseti   a5,zero,38
bseti   a5,a5,63
addi    a5,a5,-1
and     a4,a4,a5
...

Wrong logic on 32-bit host:
...
li      a5,64
bseti   a5,a5,31
addi    a5,a5,-1
and     a4,a4,a5
...
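
The difference between the two sequences can be sketched in plain C. This is
only an illustration under stated assumptions: uint64_t stands in for
HOST_WIDE_INT_1U (assuming HOST_WIDE_INT is 64 bits wide), and the 32-bit
variant emulates a host where unsigned long is 32 bits and the hardware masks
the shift count to 5 bits, as x86 does. In C itself, shifting a 32-bit value
by 32 or more is undefined behavior, so the masking is written out explicitly.
The function names are made up for this sketch.

```c
#include <stdint.h>

/* Emulates '1UL << bit' on a host with a 32-bit unsigned long whose shift
   instruction only looks at the low 5 bits of the count (assumption).  */
uint32_t
shift_in_32_bits (unsigned bit)
{
  return (uint32_t) 1 << (bit & 31);
}

/* Stand-in for 'HOST_WIDE_INT_1U << bit': the operand is 64 bits wide on
   every host, so bit positions 32..63 are preserved.  Requires bit < 64.  */
uint64_t
shift_in_64_bits (unsigned bit)
{
  return (uint64_t) 1 << bit;
}
```

Under these assumptions, bit 38 becomes the value 64 and bit 63 becomes bit 31
in the 32-bit variant, which would be consistent with the 'li a5,64' and
'bseti a5,a5,31' shown in the wrong sequence.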

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_build_integer_1): Change
1UL/1ULL to HOST_WIDE_INT_1U.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zbs-bug.c: New test.
---
 gcc/config/riscv/riscv.cc|  4 ++--
 gcc/testsuite/gcc.target/riscv/zbs-bug.c | 15 +++
 2 files changed, 17 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbs-bug.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 65e09842fde8..3b5712429e46 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -1112,10 +1112,10 @@ riscv_build_integer_1 (struct riscv_integer_op 
codes[RISCV_MAX_INTEGER_OPS],
{
  HOST_WIDE_INT bit = ctz_hwi (value);
  alt_codes[i].code = (i == 0 ? UNKNOWN : IOR);
- alt_codes[i].value = 1UL << bit;
+ alt_codes[i].value = HOST_WIDE_INT_1U << bit;
  alt_codes[i].use_uw = false;
  alt_codes[i].save_temporary = false;
- value &= ~(1ULL << bit);
+ value &= ~(HOST_WIDE_INT_1U << bit);
  i++;
}
 
diff --git a/gcc/testsuite/gcc.target/riscv/zbs-bug.c 
b/gcc/testsuite/gcc.target/riscv/zbs-bug.c
new file mode 100644
index ..10dde5801334
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zbs-bug.c
@@ -0,0 +1,15 @@
+/* { dg-do compile { target { rv64 } } } */
+/* { dg-skip-if "" { *-*-* } { "-O1" "-O2" "-O3" "-Og" "-Os" "-Oz" } } */
+/* { dg-options "-march=rv64gc_zbb_zbs -mabi=lp64d -O0" } */
+
+struct a {
+  unsigned : 29;
+  signed : 6;
+  signed b : 25;
+};
+
+void c() {
+  struct a d = {808};
+}
+
+/* { dg-final { scan-assembler-not "bseti.*31" } }*/
-- 
2.25.1

Re: [PATCH v4 6/7] OpenMP: Fortran front-end support for dispatch + adjust_args

2025-01-13 Thread Paul-Antoine Arras


Hi Thomas,

On 08/01/2025 10:04, Thomas Schwinge wrote:

Hi Paul-Antoine!

On 2024-12-16T19:35:01+0100, Paul-Antoine Arras  wrote:

On 15/11/2024 14:59, Tobias Burnus wrote:

Paul-Antoine Arras wrote:

This patch adds support for the `dispatch` construct and the
`adjust_args` clause to the Fortran front-end.

Handling of `adjust_args` across translation units is missing due
to PR115271.



First, can you add a run-time test?

[I think it helps to have at least one run-time test for every
major feature - as we had in the past e.g. C run-time tests and Fortran
compile-time tests - but it turned out that some flag was not set,
causing the middle end to ignore the feature completely ...]


Added libgomp/testsuite/libgomp.fortran/dispatch-1.f90.


I see this new test case FAIL (execution test SIGSEGV) for most (but not
all) offloading configurations, both GCN and nvptx:

 +PASS: libgomp.fortran/dispatch-1.f90   -O  (test for excess errors)
 +FAIL: libgomp.fortran/dispatch-1.f90   -O  execution test


Thanks for pointing that out! The testcase missed an OpenMP target 
directive. The attached patch should fix it.


Best,

PA


For example:

 [...]
 Thread 1 "a.out" received signal SIGSEGV, Segmentation fault.
 0x004022fc in procedures::bar (d_bv=0x7fffc7002040, 
d_av=0x7fffc700, n=1024) at 
source-gcc/libgomp/testsuite/libgomp.fortran/dispatch-1.f90:59
 59fp_bv(i) = fp_av(i) * i
 (gdb) bt
 #0  0x004022fc in procedures::bar (d_bv=0x7fffc7002040, 
d_av=0x7fffc700, n=1024) at 
source-gcc/libgomp/testsuite/libgomp.fortran/dispatch-1.f90:59
 #1  0x00401c41 in procedures::test (n=1024) at 
source-gcc/libgomp/testsuite/libgomp.fortran/dispatch-1.f90:86
 #2  0x00402b1e in MAIN__ () at 
source-gcc/libgomp/testsuite/libgomp.fortran/dispatch-1.f90:115
 (gdb) print i
 $1 = 1
 (gdb) print fp_bv
 $2 = (0, , ...)
 (gdb) print fp_av
 $3 = (0, , ...)
 (gdb) print fp_bv(1)
 $4 = 0
 (gdb) print fp_av(1)
 $5 = 0
 (gdb) ptype fp_bv
 type = real(kind=8) (1024)
 (gdb) ptype fp_av
 type = real(kind=8) (1024)
 (gdb) up
 #1  0x00401c41 in procedures::test (n=1024) at 
source-gcc/libgomp/testsuite/libgomp.fortran/dispatch-1.f90:86
 86!$omp dispatch nocontext(n > 1024) novariants(n < 1024) 
device(last_dev)
 (gdb) print last_dev
 $6 = 0


Grüße
  Thomas



--- /dev/null
+++ libgomp/testsuite/libgomp.fortran/dispatch-1.f90
@@ -0,0 +1,120 @@
+module procedures
+  use iso_c_binding, only: c_ptr, c_f_pointer
+  use omp_lib
+  implicit none
+
+  contains
+
+  function foo(bv, av, n) result(res)
+implicit none
+integer :: res, n, i
+type(c_ptr) :: bv
+type(c_ptr) :: av
+real(8), pointer :: fp_bv(:), fp_av(:)  ! Fortran pointers for array access
+!$omp declare variant(bar) match(construct={dispatch}) 
adjust_args(need_device_ptr: bv, av)
+!$omp declare variant(baz) match(implementation={vendor(gnu)})
+
+! Associate C pointers with Fortran pointers
+call c_f_pointer(bv, fp_bv, [n])
+call c_f_pointer(av, fp_av, [n])
+
+! Perform operations using Fortran pointers
+do i = 1, n
+  fp_bv(i) = fp_av(i) * i
+end do
+res = -1
+  end function foo
+
+  function baz(d_bv, d_av, n) result(res)
+implicit none
+integer :: res, n, i
+type(c_ptr) :: d_bv
+type(c_ptr) :: d_av
+real(8), pointer :: fp_bv(:), fp_av(:)  ! Fortran pointers for array access
+
+! Associate C pointers with Fortran pointers
+call c_f_pointer(d_bv, fp_bv, [n])
+call c_f_pointer(d_av, fp_av, [n])
+
+!$omp distribute parallel do
+do i = 1, n
+  fp_bv(i) = fp_av(i) * i
+end do
+res = -3
+  end function baz
+
+  function bar(d_bv, d_av, n) result(res)
+implicit none
+integer :: res, n, i
+type(c_ptr) :: d_bv
+type(c_ptr) :: d_av
+real(8), pointer :: fp_bv(:), fp_av(:)  ! Fortran pointers for array access
+
+! Associate C pointers with Fortran pointers
+call c_f_pointer(d_bv, fp_bv, [n])
+call c_f_pointer(d_av, fp_av, [n])
+
+! Perform operations on target
+do i = 1, n
+  fp_bv(i) = fp_av(i) * i
+end do
+res = -2
+  end function bar
+
+  function test(n) result(res)
+use iso_c_binding, only: c_ptr, c_loc
+implicit none
+integer :: n, res, i, f, ff, last_dev
+real(8), allocatable, target :: av(:), bv(:), d_bv(:)
+real(8), parameter :: e = 2.71828d0
+type(c_ptr) :: c_av, c_bv, c_d_bv
+
+allocate(av(n), bv(n), d_bv(n))
+
+! Initialize arrays
+do i = 1, n
+  av(i) = e * i
+  bv(i) = 0.0d0
+  d_bv(i) = 0.0d0
+end do
+
+last_dev = omp_get_num_devices() - 1
+
+c_av = c_loc(av)
+c_d_bv = c_loc(d_bv)
+!$omp target data map(to: av(:n)) map(from: d_bv(:n)) device(last_dev) 
if(n == 1024)
+  !$omp dispatch nocontext(n > 1024) novariants(n < 1024) device(last_dev)
+  f = foo(c_d_bv, c

Re: [PATCH] expr: Fix up the divmod cost debugging note [PR115910]

2025-01-13 Thread Richard Biener

On Mon, 13 Jan 2025, Jakub Jelinek wrote:

> Hi!
> 
> Something I've noticed while working on the crc wrong-code fix.
> My first version of the patch failed because of no longer matching some
> expected strings in the assembly, so I had to add TDF_DETAILS debugging
> into the -fdump-rtl-expand-details dump which the crc tests can use.
> 
> For PR115910 Andrew has added a similar note for the division/modulo case
> if it is positive and we can choose either unsigned or signed
> division.  The problem is that unlike most other TDF_DETAILS diagnostics,
> this is not done before emitting the IL for the function, but during it.
> 
> Other messages there are prefixed with ;;, both details on what it is doing
> and the GIMPLE IL for which it expands RTL, so the
> ;; Generating RTL for gimple basic block 4
> 
> ;;
> 
> (code_label 13 12 14 2 (nil) [0 uses])
> 
> (note 14 13 0 NOTE_INSN_BASIC_BLOCK)
> positive division: unsigned cost: 30; signed cost: 28
> 
> ;; return _4;
> 
> message in between just looks weird and IMHO should be ;; prefixed.
> 
> The following patch does that, ok for trunk?

OK.

> 2025-01-13  Jakub Jelinek  
> 
>   PR target/115910
>   * expr.cc (expand_expr_divmod): Prefix the TDF_DETAILS note with
>   ";; " and add a space before (needed tie breaker).  Formatting fixes.
> 
> --- gcc/expr.cc.jj2025-01-13 09:12:08.589966845 +0100
> +++ gcc/expr.cc   2025-01-13 11:21:11.501285143 +0100
> @@ -9710,9 +9710,9 @@ expand_expr_divmod (tree_code code, mach
>   }
>  
>if (dump_file && (dump_flags & TDF_DETAILS))
> -   fprintf(dump_file, "positive division:%s unsigned cost: %u; "
> -   "signed cost: %u\n", was_tie ? "(needed tie breaker)" : "",
> -   uns_cost, sgn_cost);
> + fprintf (dump_file, ";; positive division:%s unsigned cost: %u; "
> + "signed cost: %u\n",
> +  was_tie ? " (needed tie breaker)" : "", uns_cost, sgn_cost);
>  
>if (uns_cost < sgn_cost || (uns_cost == sgn_cost && unsignedp))
>   {
> 
>   Jakub
> 
> 
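
The effect of the patch on the dump line can be reproduced with a small
standalone C sketch. The format string and the comparison are copied from the
diff above; the helper names 'format_divmod_note' and 'takes_unsigned_branch'
and the use of snprintf into a caller-supplied buffer are assumptions made for
this illustration, not part of expr.cc.

```c
#include <stdbool.h>
#include <stdio.h>

/* Format the note the way the patched fprintf call does, but into a buffer.
   Returns what snprintf returns.  */
int
format_divmod_note (char *buf, size_t size, bool was_tie,
                    unsigned uns_cost, unsigned sgn_cost)
{
  return snprintf (buf, size,
                   ";; positive division:%s unsigned cost: %u; "
                   "signed cost: %u\n",
                   was_tie ? " (needed tie breaker)" : "",
                   uns_cost, sgn_cost);
}

/* Mirrors the condition in the context line following the fprintf.  */
bool
takes_unsigned_branch (unsigned uns_cost, unsigned sgn_cost, bool unsignedp)
{
  return uns_cost < sgn_cost || (uns_cost == sgn_cost && unsignedp);
}
```

For the costs quoted in the mail (30 and 28, no tie) this produces
";; positive division: unsigned cost: 30; signed cost: 28", which now lines up
with the other ";;"-prefixed lines in the expand dump.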

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Re: [PATCH] d, v2: give dependency files better filenames

2025-01-13 Thread Jakub Jelinek

On Mon, Jan 13, 2025 at 11:00:09AM +0100, Jakub Jelinek wrote:
> Why do you need there the directory or suffix?
> $(*F) is clearly bad, because there are rules like
> d/%.o: d/dmd/%.d
> $(DCOMPILE) $(D_INCLUDES) $<
> $(DPOSTCOMPILE)
> 
> d/common-%.o: d/dmd/common/%.d
> $(DCOMPILE) $(D_INCLUDES) $<
> $(DPOSTCOMPILE)
> etc. and while the stem in the first case is the basename of the filename
> part, in the second case it is the basename of the filename part in the
> common directory.
> I think
> DEPFILE = $(basename $(@F))
> would be sufficient.
> So the former d/.deps/file.Po which handled both d/dmd/common/file.d and
> d/dmd/root/file.d in your case would be d/.deps/d-common-file.o.d and
> d/.deps/d-root-file.o.d while with the above DEPFILE it would be
> d/.deps/common-file.d and d/.deps/root-file.d
> There are no d/dmd/*-*.d files, among d/*-*.cc there are only d-
> prefixed ones, and there are no clashes between the *.cc and *.d filenames:
> for i in gcc/d/*.cc; do j=`basename $i .cc`; find gcc/d -name $j.d; done

And note, the filenames had better be unique: if they weren't, multiple
sources would use the same object file d/$(basename $(@F)).o

Jakub

Re: [RFC/RFA] [PR tree-optimization/92539] Improve code and avoid Warray-bounds false positive

2025-01-13 Thread Richard Biener

On Fri, Jan 10, 2025 at 9:43 PM Jeff Law  wrote:
>
>
>
> On 1/10/25 1:00 AM, Richard Biener wrote:
> >
> > It's a problem we're never going to fully solve.  Some of the
> > testcases show missed optimizations which we can work on.  Some show
> > we diagnose IL we later are able to optimize away, some simply show
> > that users are not always happy with how we decide on suppressing a
> > diagnostic.
> >
> > For the case at hand we should be able to optimize it fully.
> Do you think it's reasonable to approach in unrolling?  I'm just not
> familiar enough with that code to even hazard a guess.

Yes, but as said it's niter analysis that needs the improvement.  Both are
possible: detecting the strlen() and handling UB in address-takens
to bound the number of iterations.

> And any thoughts on putting this into path isolation?  It wouldn't help
> with the false positive from -Warray-bounds, but it does fit into what
> that pass is generally trying to do and would at least make the final
> code tighter.  That would also give us a chance to clean up if user code
> made similar goofs while still giving a suitable diagnostic.

Sure - I do think catching UB in path isolation is a good thing.  But then
path isolation is run quite early for some of our late UB diagnostic code,
thus we might miss diagnostics then unless we can force-keep the UB
triggering stmt (as we do for NULL pointer derefs).

> >
> > But optimizing based on UB is always going to interact with
> > diagnosing UB, so we have to be careful.  Our "late" diagnostics are
> > most problematic here and I'd argue moving those earlier is the
> > first thing we should try.
> There's philosophical disagreements on where these warnings belong.  As
> you go earlier you get more false positives, but fewer false negatives
> and the opposite as you push the analysis later in the pipeline.

Yeah.

> That's what drove my high level support of the __builtin_warning idea as
> well as earlier ideas around two-stage detection with a flag indicating
> if the user wanted the early diagnostic or the later one.

Yes, having a diagnostic deferred and keyed on survival of __builtin_warning
is a good thing.  The diagnostic machinery might be good enough to stash
a fully formatted multi-line diagnostic aside and replay it later -
David probably knows best.

> Coming back to the bug.  Unless there's a really simple way to catch in
> the unroller, this one probably won't make gcc-15.

I don't think we want to change it too much at this point - esp. using
more UB for niter analysis.  What might be OK is to improve
loop_niter_by_eval, I'll have a short look to see what this would mean
(it's very restrictive at the moment).

Richard.

> jeff
>

Re: [PATCH] rtl: Remove invalid compare simplification [PR117186]

2025-01-13 Thread Tobias Burnus


Andreas Schwab wrote:

This breaks m68k:


Same issue on GCN, hence I filed https://gcc.gnu.org/PR118418

If I look at the debugging output (see the PR), it seems as if
the self-test function test_comparisons contains the assumption:
   FALSE < TRUE
but if TRUE is -1, that assumption does not hold
(for signed variables).

And both GCN and m68k '#define STORE_FLAG_VALUE -1',
as Andreas noted.

Tobias

Re: [PATCH] testsuite: libstdc++: Use effective-target libatomic

2025-01-13 Thread Thomas Schwinge

Hi!

On 2025-01-12T08:38:05+0100, Torbjorn SVENSSON wrote:
> On 2025-01-12 01:05, Jonathan Wakely wrote:
>> On Mon, 23 Dec 2024, 19:05 Torbjörn SVENSSON wrote:
>> 
>> Ok for trunk and releases/gcc-14?
>> 
>> OK
>
> Pushed as r15-6828-g4b0ef49d02f and r14.2.0-680-gd82fc939f91.

On a configuration where libatomic does get built, I see (with standard
build-tree testing: 'make check'):

[-PASS:-]{+UNSUPPORTED:+} 
29_atomics/atomic_float/compare_exchange_padding.cc  -std=gnu++20[-(test for 
excess errors)-]
[-PASS: 29_atomics/atomic_float/compare_exchange_padding.cc  -std=gnu++20 
execution test-]
[Etc.]

[...]
spawn -ignore SIGHUP [...]/gcc/xg++ [...] libatomic_available1221570.c 
-latomic [...] -o libatomic_available1221570.exe
/usr/bin/ld: cannot find -latomic: No such file or directory
[...]

I presume that the new 'dg-require-effective-target libatomic_available'
is evaluated when the 'atomic_link_flags' via 'dg-additional-options'
have not yet been set?

Would it work to call 'atomic_init' (plus 'atomic_finish', I suppose?)
(see 'gcc/testsuite/lib/atomic-dg.exp') in libstdc++ test suite setup,
and then to '29_atomics/atomic_float/compare_exchange_padding.cc' apply
the usual pattern:

-// { dg-require-effective-target libatomic_available }
-// { dg-additional-options "[atomic_link_flags [get_multilibs]] -latomic" }
+// { dg-additional-options -latomic { target libatomic_available } }


Grüße
 Thomas


>> Test assumes libatomic.a is always available, but for some embedded
>> targets, there is no libatomic.a and the test thus fails.
>> 
>> libstdc++/ChangeLog:
>> 
>>          * 29_atomics/atomic_float/compare_exchange_padding.cc: Use
>>          effective-target libatomic_available.
>> 
>> Signed-off-by: Torbjörn SVENSSON 
>> ---
>>   .../29_atomics/atomic_float/compare_exchange_padding.cc          | 1 +
>>   1 file changed, 1 insertion(+)
>> 
>> diff --git a/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
>> index 49626ac6651..9395e3026a7 100644
>> --- a/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
>> +++ b/libstdc++-v3/testsuite/29_atomics/atomic_float/compare_exchange_padding.cc
>> @@ -1,5 +1,6 @@
>>   // { dg-do run { target c++20 } }
>>   // { dg-options "-O0" }
>> +// { dg-require-effective-target libatomic_available }
>>   // { dg-additional-options "[atomic_link_flags [get_multilibs]] -latomic" }
>> 
>>   #include 
>> -- 
>> 2.25.1
>>

Re: [PATCH] aarch64: Provide initial specifications for Apple CPU cores.

2025-01-13 Thread Kyrylo Tkachov

Hi Iain,

> On 11 Jan 2025, at 14:21, Iain Sandoe wrote:
> 
> Hi,
> 
> I originally made this patch for the Darwin Arm64 development branch,
> however in discussions on IRC, it seems that it is also relevant to
> Linux - since there are implementations running on Apple hardware with
> the M1..3 CPUs.  It might also be helpful to the resolution of
> PR113257 - although it is not a solution on its own.
> 
> Bootstrapped and tested manually (that it gives the expected .arch lines)
> on aarch64-linux.
> 
> OK for trunk?
> thanks
> Iain
> 
> --- 8< ---
> 
> This covers the M1-M3 cores used in Apple desktop hardware that is also
> sometimes used with Linux as the OS.
> 
> It does not cover the wider range that might be used in iOS and other
> embedded platform versions.
> 
> Some of the content is estimates/best guesses - based on the following
> public sources of information:
> * XNU (only for the Apple Implementer ID)
> * sysctl -a | grep hw on various M1, M2 and machines
> * AArch64.td from the Apple Open Source repo for LLVM.
> * What XCode-14 clang passes to cc1.
> 

How about the llvm/lib/TargetParser/Host.cpp in upstream LLVM for the part
numbers?
I see it has different values for M1, M2 and M3 than the ones you have in
your patch.

> Unfortunately, these sources are in conflict; in particular the clang-claimed
> feature set disagrees with the output of sysctl -a, and the base Arm revs.
> claimed in some cases miss features that ARM DDI 0487J.a lists as mandatory
> for the rev.
> 
> This latter point might not be actually significant - but for the sake of
> caution I've made the spec use the lower arch rev + the additional features
> that are consistently claimed by both sysctl and clang.
> 

I think going for the lowest common denominator of features you can deduce is 
fine.

> GCC does not seem to have a scheduler that is similar to the "Cyclone" one
> in LLVM - so I've guessed to use cortexa57 (but, maybe we miss 8-issue, it's
> not clear - and my experience with the scheduler is ≈ 0).
> 

Yes, that’s probably good enough. We haven’t had a new “big core” scheduling 
model in a while and cortexa57 tends to be good enough as a fallback.
I’d like us to have something new for SVE-enabled cores but that’s not relevant 
in this case.


> Likewise we do not (yet) have specific cost models, so choose the generic
> Armv8 one.
> 
> Thus, the choices here are intended to be conservative.
> 
> * Currently, we do not seem to have any way to specify that M2/M3 has support
>  for FEAT_BTI, but because of missing features is not compliant with the Arm
>  base rev that implies this.
> * Proper version numbers are not readily available.
> * Since we have FIRESTORM/ICESTORM and similar pairs for the performance and
>   efficiency cores on various machines, perhaps we should be using a
>   big.LITTLE configuration; OTOH currently, I have no idea if that is usable
>   in any way with the hardware as configured.

Modulo Tamar’s comments in the PR about unknown CPUs it should work fine under 
Linux as long as /proc/cpuinfo contains entries for both types of cores.
You can use a mock cpuinfo for testing through the GCC_CPUINFO environment 
variable, like the tests in gcc.target/aarch64/cpunative.
Once that is detected you can specify its other tuning and arch parameters like 
any other -mcpu option.

> 
> gcc/ChangeLog:
> 
> * config/aarch64/aarch64-cores.def (AARCH64_CORE): Add apple-a12,
> apple-m1, apple-m2, apple-m3.
> * config/aarch64/aarch64-tune.md: Regenerate.

These need entries in the documentation too.

Thanks,
Kyrill


> 
> Signed-off-by: Iain Sandoe 
> ---
> gcc/config/aarch64/aarch64-cores.def | 12 
> gcc/config/aarch64/aarch64-tune.md   |  2 +-
> 2 files changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/gcc/config/aarch64/aarch64-cores.def 
> b/gcc/config/aarch64/aarch64-cores.def
> index caf61437d18..0bd3e80cf7f 100644
> --- a/gcc/config/aarch64/aarch64-cores.def
> +++ b/gcc/config/aarch64/aarch64-cores.def
> @@ -173,6 +173,18 @@ AARCH64_CORE("cortex-a76.cortex-a55",  
> cortexa76cortexa55, cortexa53, V8_2A,  (F
> AARCH64_CORE("cortex-r82", cortexr82, cortexa53, V8R, (), cortexa53, 0x41, 
> 0xd15, -1)
> AARCH64_CORE("cortex-r82ae", cortexr82ae, cortexa53, V8R, (), cortexa53, 
> 0x41, 0xd14, -1)
> 
> +/* Apple (A12 and M) cores based on Armv8.
> +   Apple implementer ID from xnu,
> +   Guesses for part # and suitable scheduler ident, generic_armv8_a for 
> costs.
> +   A12 seems mostly 8.3,
> +   M1 seems to be 8.4 + extras (see comments in option-extensions about 
> f16fml),
> +   M2 mostly 8.5 but with missing mandatory features.
> +   M3 is essentially the same as M2 for the features declared here.  */
> +AARCH64_CORE("apple-a12", applea12, cortexa53, V8_3A,  (), generic_armv8_a, 
> 0x61, 0x12, -1)
> +AARCH64_CORE("apple-m1", applem1, cortexa57, V8_4A,  (F16, SB, SSBS), 
> generic_armv8_a, 0x61, 0x23, -1)
> +AARCH64_CORE("apple-m2", applem2, cortexa57, V8_4A,  (I8MM, BF16, F16, SB, 
> SSBS

[committed][PR rtl-optimization/107455] Eliminate unnecessary constant load - v3

2025-01-13 Thread Jeff Law

This resurrects a patch from a bit over 2 years ago that I never wrapped
up.  IIRC, I ended up catching covid, then in the hospital for an
unrelated issue and it just got dropped on the floor in the insanity.


The basic idea here is to help postreload-cse eliminate more 
const/copies by recording a small set of conditional equivalences (as 
Richi said in 2022, "Ick").


It was originally to help eliminate an unnecessary constant load I saw 
in coremark, but as seen in BZ107455 the same issues show up in real 
code as well.
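
For readers unfamiliar with the term, a conditional equivalence is a fact that only holds on one outgoing edge of a branch. The sketch below is a source-level analogy in C, not the RTL the pass actually sees; the function names are hypothetical and the example is not taken from the PR.

```c
#include <assert.h>

/* Before: on the path where x == 0 the constant 0 is materialized
   again, although the branch condition already guarantees that x
   holds exactly that value.  */
long before (long x, long *out)
{
  if (x == 0)
    {
      *out = 0;   /* fresh constant load */
      return 1;
    }
  return 0;
}

/* After: on the taken edge x and 0 are equivalent, so the register
   that already holds x can be reused.  */
long after (long x, long *out)
{
  if (x == 0)
    {
      *out = x;   /* reuse x instead of loading 0 */
      return 1;
    }
  return 0;
}

/* Both versions must agree for every input.  */
void check_same (long x)
{
  long a = -1, b = -1;
  assert (before (x, &a) == after (x, &b));
  assert (a == b);
}
```

Recording the equivalence for the taken edge is what lets the later pass see that the second form is available.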


Changes since v2:

  - Simplified logic for blocks to examine
  - Remove redundant tests when filtering blocks to examine
  - Remove bogus check which only allowed reg->reg copies


Changes since v1:

Richard B and Richard S both had good comments last time around and 
their requests are reflected in this update:


  - Use rtx_equal_p rather than pointer equality
  - Restrict to register "destinations"
  - Restrict to integer modes
  - Adjust entry block handling

My own wider scale testing resulted in a few more changes.

  - Robustify extracting the (set (pc) ... ), which then required ...
  - Handle if src/dst are clobbered by the conditional branch
  - Fix logic error causing too many equivalences to be recorded

Pushing to the trunk.
Jeff

commit d23d338da4d2bd581b2d3fd97785dd2c26053a92
Author: Jeff Law 
Date:   Mon Jan 13 07:29:39 2025 -0700

[PR rtl-optimization/107455] Eliminate unnecessary constant load

This resurrects a patch from a bit over 2 years ago that I never wrapped up.
IIRC, I ended up catching covid, then in the hospital for an unrelated issue
and it just got dropped on the floor in the insanity.

The basic idea here is to help postreload-cse eliminate more const/copies by
recording a small set of conditional equivalences (as Richi said in 2022,
"Ick").

It was originally to help eliminate an unnecessary constant load I saw in
coremark, but as seen in BZ107455 the same issues show up in real code as
well.

Bootstrapped and regression tested on x86-64, also been through multiple spins
in my tester.

Changes since v2:

  - Simplified logic for blocks to examine
  - Remove redundant tests when filtering blocks to examine
  - Remove bogus check which only allowed reg->reg copies

Changes since v1:

Richard B and Richard S both had good comments last time around and their
requests are reflected in this update:

  - Use rtx_equal_p rather than pointer equality
  - Restrict to register "destinations"
  - Restrict to integer modes
  - Adjust entry block handling

My own wider scale testing resulted in a few more changes.

  - Robustify extracting the (set (pc) ... ), which then required ...
  - Handle if src/dst are clobbered by the conditional branch
  - Fix logic error causing too many equivalences to be recorded

PR rtl-optimization/107455
gcc/
* postreload.cc (reload_cse_regs_1): Take advantage of conditional
equivalences.

gcc/testsuite
* gcc.target/riscv/pr107455-1.c: New test.
* gcc.target/riscv/pr107455-2.c: New test.

diff --git a/gcc/postreload.cc b/gcc/postreload.cc
index 05c2e0e2644..487aa8aad05 100644
--- a/gcc/postreload.cc
+++ b/gcc/postreload.cc
@@ -33,6 +33,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "emit-rtl.h"
 #include "recog.h"
 
+#include "cfghooks.h"
 #include "cfgrtl.h"
 #include "cfgbuild.h"
 #include "cfgcleanup.h"
@@ -221,13 +222,107 @@ reload_cse_regs_1 (void)
   init_alias_analysis ();
 
   FOR_EACH_BB_FN (bb, cfun)
-FOR_BB_INSNS (bb, insn)
-  {
-   if (INSN_P (insn))
- cfg_changed |= reload_cse_simplify (insn, testreg);
+{
+  /* If BB has a small number of predecessors, see if each of the
+has the same implicit set.  If so, record that implicit set so
+that we can add it to the cselib tables.  */
+  rtx_insn *implicit_set;
 
-   cselib_process_insn (insn);
-  }
+  implicit_set = NULL;
+  if (EDGE_COUNT (bb->preds) <= 3)
+   {
+ edge e;
+ edge_iterator ei;
+ rtx src = NULL_RTX;
+ rtx dest = NULL_RTX;
+
+ /* Iterate over each incoming edge and see if they
+all have the same implicit set.  */
+ FOR_EACH_EDGE (e, ei, bb->preds)
+   {
+ /* Skip the entry/exit block.  */
+ if (e->src == ENTRY_BLOCK_PTR_FOR_FN (cfun))
+   break;
+
+ /* Verify this block ends with a suitable condjump  */
+ rtx_insn *condjump = BB_END (e->src);
+ if (!condjump || ! any_condjump_p (condjump))
+   break;
+
+ /* This predecessor ends with a possible equivalence
+producing conditional branch.  Extract the condition
+and try to use it to create an equivalence.

Re: [PATCH] tree-optimization/92539 - missed optimization leads to bogus -Warray-bounds

2025-01-13 Thread Richard Biener

On Mon, 13 Jan 2025, Richard Biener wrote:

> The following makes niter analysis recognize a loop with an exit
> condition scanning over a STRING_CST.  This is done via enhancing
> the force evaluation code rather than recognizing for example
> strlen (s) as number of iterations because it allows to handle
> some more cases.
> 
> STRING_CSTs are easy to handle since nothing can write to them, also
> processing those should be cheap.  I'd appreciate another eye on
> the constraints I put in.
> 
> Note to avoid the -Warray-bounds diagnostic we have to early unroll
> the loop (there's no final value replacement done, there's a PR
> for doing this as part of CD-DCE when possibly eliding a loop).
> This works for strings up to 8 chars (including the '\0') only
> (rather than 16, the unroll niter limit) because unroll estimation
> will not see that the load from the string constant goes away.
> 
> Final value replacement doesn't work since ivcanon is now after it,
> it's not the time to move the pass though.  The pass is in theory
> supposed to add a canonical IV for the _by_eval cases, but we
> didn't "fix" this when we added cunrolli (we probably should have
> moved ivcanon very early, or made cunroll add such IV if we
> used _by_eval but did not unroll).
> 
> Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Some testsuite adjustments are necessary, but the following followup
handles enabling final value replacement by forcing a canonical IV
from cunrolli when we didn't unroll but used force-evaluation to
compute niter.  It also makes us handle the new testcase which ends
up with POINTER_PLUS_EXPR for the initial value.
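
As an illustration of the idea: force evaluation amounts to simulating the loop on known constant inputs until the exit condition fires. The following is a standalone model in C under that assumption, not the code in tree-ssa-loop-niter.cc; `niter_by_eval`, its parameters and the iteration cap are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Simulate "while (s[len]) ++len;" over a string constant of SIZE
   bytes (including the terminating NUL), starting at constant offset
   START.  Returns the iteration count, or -1 if it cannot be
   determined within MAX_ITER steps or the access would leave the
   constant.  */
long niter_by_eval (const char *s, size_t size, size_t start, long max_iter)
{
  assert (s != NULL);
  long niter = 0;
  for (size_t i = start; niter <= max_iter; ++i, ++niter)
    {
      if (i >= size)
        return -1;      /* would read past the constant: give up */
      if (s[i] == '\0')
        return niter;   /* exit condition fires */
    }
  return -1;            /* too many iterations to track */
}
```

The `start` parameter models the constant offset that a POINTER_PLUS_EXPR on the string address contributes; for "Hello World!" with offset 0 the result agrees with the "with expr: 12" the new testcase scans for.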

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Any thoughts?

Thanks,
Richard.

>From 705a287694404aafa72bbdc9da21dd1bf448cd85 Mon Sep 17 00:00:00 2001
From: Richard Biener 
Date: Mon, 13 Jan 2025 15:39:07 +0100
Subject: [PATCH] tree-optimization/92539 - handle more cases
To: gcc-patches@gcc.gnu.org

The following on top of the previous fix also handles POINTER_PLUS_EXPR
of &"Foo" and a constant offset as happens for the added testcase
as well as making sure to add a canonical IV when we figured niter
by force evaluation during cunrolli so that work isn't wasted,
DCE can eliminate the load and SCCP perform final value replacement.

PR tree-optimization/92539
* tree-ssa-loop-ivcanon.cc (canonicalize_loop_induction_variables):
When niter was computed constant by force evaluation add a
canonical IV if we didn't unroll.
* tree-ssa-loop-niter.cc (loop_niter_by_eval): Use
split_constant_offset to get at a STRING_CST and an initial
constant offset.

* gcc.dg/tree-ssa/sccp-16.c: New testcase.
---
 gcc/testsuite/gcc.dg/tree-ssa/sccp-16.c | 16 +++
 gcc/tree-ssa-loop-ivcanon.cc|  9 --
 gcc/tree-ssa-loop-niter.cc  | 38 +++--
 3 files changed, 46 insertions(+), 17 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/sccp-16.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/sccp-16.c b/gcc/testsuite/gcc.dg/tree-ssa/sccp-16.c
new file mode 100644
index 00000000000..c35426fc8c4
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/sccp-16.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-cunrolli -fdump-tree-sccp-details" } */
+
+int foo ()
+{
+  const char *s = "Hello World!";
+  int len = 0;
+  while (s[len])
+    ++len;
+  return len;
+}
+
+/* For cunrolli the growth is too large, but it should add a canonical IV
+   and SCCP perform final value replacement.  */
+/* { dg-final { scan-tree-dump "ivtmp\[^\r\n\]*PHI\[^\r\n\]*13" "cunrolli" } } */
+/* { dg-final { scan-tree-dump "with expr: 12" "sccp" } } */
diff --git a/gcc/tree-ssa-loop-ivcanon.cc b/gcc/tree-ssa-loop-ivcanon.cc
index 7cc340b23c5..d07b3d593f5 100644
--- a/gcc/tree-ssa-loop-ivcanon.cc
+++ b/gcc/tree-ssa-loop-ivcanon.cc
@@ -1257,6 +1257,7 @@ canonicalize_loop_induction_variables (class loop *loop,
   bool modified = false;
   class tree_niter_desc niter_desc;
   bool may_be_zero = false;
+  bool by_eval = false;
 
   /* For unrolling allow conditional constant or zero iterations, thus
  perform loop-header copying on-the-fly.  */
@@ -1291,7 +1292,11 @@ canonicalize_loop_induction_variables (class loop *loop,
   if (try_eval
  && (chrec_contains_undetermined (niter)
  || TREE_CODE (niter) != INTEGER_CST))
-   niter = find_loop_niter_by_eval (loop, &exit);
+   {
+ niter = find_loop_niter_by_eval (loop, &exit);
+ if (TREE_CODE (niter) == INTEGER_CST)
+   by_eval = true;
+   }
 
   if (TREE_CODE (niter) != INTEGER_CST)
exit = NULL;
@@ -1346,7 +1351,7 @@ canonicalize_loop_induction_variables (class loop *loop,
  innermost_cunrolli_p))
 return true;
 
-  if (create_iv
+  if ((create_iv || by_eval)
   && niter && !chrec_contains_undetermined (niter)
   && exit &&

Re: [Regression] [PATCH] internal-fn: Do not force vcond operand to reg.

2025-01-13 Thread Torbjorn SVENSSON





On 2025-01-13 15:21, Christophe Lyon wrote:



On 1/13/25 15:05, Torbjorn SVENSSON wrote:

Hi Richard and Robin,

It looks like this patch introduced a regression with MVE (Cortex-M55 
and Cortex-M85).


If I try to build testsuite/c-c++-common/vector-compare-3.c (there are 
other test cases that fail with a similar ICE):


arm-none-eabi-gcc /src/gcc/testsuite/c-c++-common/vector-compare-3.c \
  -march=armv8.1-m.main+mve+fp.dp -mfloat-abi=hard -mfpu=auto -Wc++-compat \
  -O2 -S -o /dev/null

/src/gcc/testsuite/c-c++-common/vector-compare-3.c: In function 'g':
/src/gcc/testsuite/c-c++-common/vector-compare-3.c:24:1: error: unrecognizable insn:
(insn 26 25 27 2 (set (reg:V4SI 137)
 (unspec:V4SI [
 (reg:V4SI 144)
 (reg:V4SI 145)
 (subreg:V4BI (reg:HI 143) 0)
 ] VPSELQ_S)) "/src/gcc/testsuite/c-c++-common/vector-compare-3.c":23:6 -1
  (nil))
during RTL pass: vregs
/src/gcc/testsuite/c-c++-common/vector-compare-3.c:24:1: internal compiler error: in extract_insn, at recog.cc:2882
0x20d7ca8 diagnostic_context::diagnostic_impl(rich_location*, diagnostic_metadata const*, diagnostic_option_id, char const*, __va_list_tag (*) [1], diagnostic_t)
 ???:0
0x20eddd2 internal_error(char const*, ...)
 ???:0
0x726ee5 fancy_abort(char const*, int, char const*)
 ???:0
0x7166bf _fatal_insn(char const*, rtx_def const*, char const*, int, char const*)
 ???:0
0x7166de _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
 ???:0
0xe952f7 extract_insn(rtx_insn*)
 ???:0
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.


Above is with r15-6752-g08b6e875c6b.

Please take a look and see if this can easily be fixed.


I think this is https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115439


Indeed it is. Not sure why I didn't find that one when searching the 
bugzilla. Anyway, sorry for the noise! :S




I'll have a look.


Thanks Christophe!

Kind regards,
Torbjörn



Thanks,

Christophe




Kind regards,
Torbjörn


On 2024-05-13 16:14, Robin Dapp wrote:

What happens if we simply remove all of the force_reg here?


On x86 I bootstrapped and tested the attached without fallout
(gcc188, so it's no avx512-native machine and therefore limited
coverage).  riscv regtest is unchanged.
For aarch64 I would to rely on the pre-commit CI to pick it
up (does that work on sub-threads?).

Regards
  Robin


gcc/ChangeLog:

PR middle-end/113474

* internal-fn.cc (expand_vec_cond_mask_optab_fn):  Remove
force_regs.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/pr113474.c: New test.
---
  gcc/internal-fn.cc  |  3 ---
  .../gcc.target/riscv/rvv/autovec/pr113474.c | 13 +
  2 files changed, 13 insertions(+), 3 deletions(-)
  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c


diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index 2c764441cde..4d226c478b4 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -3163,9 +3163,6 @@ expand_vec_cond_mask_optab_fn (internal_fn, gcall *stmt, convert_optab optab)

    rtx_op1 = expand_normal (op1);
    rtx_op2 = expand_normal (op2);
-  mask = force_reg (mask_mode, mask);
-  rtx_op1 = force_reg (mode, rtx_op1);
-
    rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
    create_output_operand (&ops[0], target, mode);
    create_input_operand (&ops[1], rtx_op1, mode);
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c
new file mode 100644
index 00000000000..0364bf9f5e3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c
@@ -0,0 +1,13 @@
+/* { dg-do compile { target riscv_v } }  */
+/* { dg-additional-options "-std=c99" }  */
+
+void
+foo (int n, int **a)
+{
+  int b;
+  for (b = 0; b < n; b++)
+    for (long e = 8; e > 0; e--)
+  a[b][e] = a[b][e] == 15;
+}
+
+/* { dg-final { scan-assembler "vmerge.vim" } }  */

Re: [PATCH] RISC-V: fix thinko in riscv_register_move_cost ()

2025-01-13 Thread Jeff Law





On 1/11/25 4:45 PM, Vineet Gupta wrote:

This seemingly benign mistake caused a massive SPEC2017 Cactu regression
(2.1 trillion insn to 2.5 trillion) wiping out all the gains from my
recent sched1 improvement. Thankfully the issue was trivial to fix even
if hard to isolate.

On BPI3:

Before bug
----------
|  Performance counter stats for './cactusBSSN_r_base-1':
|
|       4,557,471.02 msec task-clock:u         #    1.000 CPUs utilized
|              1,245      context-switches:u   #    0.273 /sec
|                  1      cpu-migrations:u     #    0.000 /sec
|            205,376      page-faults:u        #   45.064 /sec
|  7,291,944,801,307      cycles:u             #    1.600 GHz
|  2,134,835,735,951      instructions:u       #    0.29  insn per cycle
|     10,799,296,738      branches:u           #    2.370 M/sec
|         15,308,966      branch-misses:u      #    0.14% of all branches
|
|     4557.710508078 seconds time elapsed

Bug
---
|  Performance counter stats for './cactusBSSN_r_base-2':
|
|       4,801,813.79 msec task-clock:u         #    1.000 CPUs utilized
|              8,066      context-switches:u   #    1.680 /sec
|                  1      cpu-migrations:u     #    0.000 /sec
|            203,836      page-faults:u        #   42.450 /sec
|  7,682,826,638,790      cycles:u             #    1.600 GHz
|  2,503,133,291,344      instructions:u       #    0.33  insn per cycle
   ^
|     10,799,287,796      branches:u           #    2.249 M/sec
|         16,641,200      branch-misses:u      #    0.15% of all branches
|
|     4802.616638386 seconds time elapsed
|

Fix
---
|  Performance counter stats for './cactusBSSN_r_base-3':
|
|       4,556,170.75 msec task-clock:u         #    1.000 CPUs utilized
|              1,739      context-switches:u   #    0.382 /sec
|                  0      cpu-migrations:u     #    0.000 /sec
|            203,458      page-faults:u        #   44.655 /sec
|  7,289,854,613,923      cycles:u             #    1.600 GHz
|  2,134,854,070,916      instructions:u       #    0.29  insn per cycle
|     10,799,296,807      branches:u           #    2.370 M/sec
|         15,403,357      branch-misses:u      #    0.14% of all branches
|
|     4556.445490123 seconds time elapsed

Fixes: 46888571d242 "RISC-V: Add cr and cf constraint"
Signed-off-by: Vineet Gupta 

gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_register_move_cost): Remove buggy
check.

OK
jeff

Re: [PATCH v2] RISC-V: Fix ICE for unrecognizable insn `UNSPEC_VSETVL` for XTheadVector

2025-01-13 Thread Robin Dapp

> Yes. This will solve the problem, but it will lead to very large-scale changes
> (splitting each rK, adding 1 column constraint), and make the pattern more
> complex and more difficult to maintain. In contrast, how about replacing "rK"
> with a new constraint in the way jeff mentioned? For example, "Vvl".

> Jeff: We could also create a new constraint that mostly behaves like rK, but
> rejects (const_int 1) when thead-vector is enabled and use that in the
> vsetvl pattern instead of rK. (
> https://gcc.gnu.org/pipermail/gcc-patches/2024-December/671836.html )

If you need to change every rK, then yeah, spec_restriction seem reasonable and
a new constraint seems better.

Re: [PATCH] RISC-V: Fix program logic errors caused by data truncation on 32-bit host for zbs, such as i386.

2025-01-13 Thread Jeff Law





On 1/13/25 2:07 AM, Jin Ma wrote:

Correct logic on 64-bit host:
 ...
 bseti   a5,zero,38
 bseti   a5,a5,63
 addi    a5,a5,-1
 and a4,a4,a5
...

Wrong logic on 32-bit host:
...
 li  a5,64
 bseti   a5,a5,31
 addi    a5,a5,-1
 and a4,a4,a5
...

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_build_integer_1): Change
1UL/1ULL to HOST_WIDE_INT_1U.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zbs-bug.c: New test.

Egad.  Definitely my bad.  Thanks for debugging & fixing this.

I've pushed it to the trunk.
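
For the archive, the failure mode can be modelled outside GCC. The sketch below assumes a host where `unsigned long` is 32 bits and models it with `uint32_t`; `bit_narrow` and `bit_wide` are hypothetical helpers, and `uint64_t` stands in for the 64-bit type behind HOST_WIDE_INT_1U. Shifting a 32-bit value by 32 or more is undefined in C, so the helper spells out the count masking that i386 hardware typically performs.

```c
#include <assert.h>
#include <stdint.h>

/* Model of "1UL << n" on a host with 32-bit unsigned long.  The
   explicit "& 31" mimics the count masking of 32-bit x86 shifts; in
   real C code the oversized shift is simply undefined.  */
uint32_t bit_narrow (unsigned n)
{
  return (uint32_t) 1 << (n & 31);
}

/* Model of "HOST_WIDE_INT_1U << n": the constant is 64 bits wide
   regardless of the host, so bits 32..63 are representable.  */
uint64_t bit_wide (unsigned n)
{
  assert (n < 64);
  return (uint64_t) 1 << n;
}
```

Under this model bit 38 collapses to the value 64 and bit 63 to bit 31, which lines up with the `li a5,64` and `bseti a5,a5,31` in the wrong sequence quoted above.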

Jeff

Re: [PATCH] d, v2: give dependency files better filenames

2025-01-13 Thread Iain Buclaw

Excerpts from Jakub Jelinek's message of January 13, 2025 2:58 pm:
> On Mon, Jan 13, 2025 at 02:45:28PM +0100, Arsen Arsenović wrote:
>> > So the former d/.deps/file.Po which handled both d/dmd/common/file.d and
>> > d/dmd/root/file.d in your case would be d/.deps/d-common-file.o.d and
>> > d/.deps/d-root-file.o.d while with the above DEPFILE it would be
>> > d/.deps/common-file.d and d/.deps/root-file.d
>> > There are no d/dmd/*-*.d files and among d/*-*.cc there are only
>> > d-prefixed ones, and there are no clashes between the *.cc and *.d filenames:
>> > for i in gcc/d/*.cc; do j=`basename $i .cc`; find gcc/d -name $j.d; done
>> 
>> Relying on that is more error-prone, I think.  While that is true today, it
>> might not stay true forever, and such a change won't be caught until it
>> fails again in the same way.
>> 
>> $@ is necessarily unique (however, still, with the proposed approach
>> d/foo.o and d-foo.o will collide).
> 
> We are talking about d/*.o object files and d/.deps/*.Po files corresponding
> to that.  As there are no subdirectories, the * must be necessarily unique.
> And it already uses $(@D) in the directory name ($(@D)/$(DEPDIR) in
> particular), so even if in the future some subdirectory for object files is
> added, it would still be unique, say if there is
> d/*.o and d/whatever/*.o, the deps files would be d/.deps/*.Po and
> d/whatever/.deps/*.Po.  There is no need to avoid clashes with files in
> the gcc main build directory, those have their own gcc/.deps/ rather than
> gcc/d/.deps/
> 

What about just building these all in subdirectories then, as per the 
source directory layout?

For example, changing the common package sources we end up with the 
following, though can't say I'm a strong fan of having test and 
optionally mkdir run on every recipe execution.

--- a/gcc/d/Make-lang.in
+++ b/gcc/d/Make-lang.in
@@ -93,12 +93,12 @@ D_FRONTEND_OBJS = \
d/canthrow.o \
d/chkformat.o \
d/clone.o \
-   d/common-bitfields.o \
-   d/common-charactertables.o \
-   d/common-file.o \
-   d/common-identifiertables.o \
-   d/common-outbuffer.o \
-   d/common-smallbuffer.o \
+   d/common/bitfields.o \
+   d/common/charactertables.o \
+   d/common/file.o \
+   d/common/identifiertables.o \
+   d/common/outbuffer.o \
+   d/common/smallbuffer.o \
d/compiler.o \
d/cond.o \
d/constfold.o \
@@ -416,7 +416,8 @@ d/%.o: d/dmd/%.d
$(DCOMPILE) $(D_INCLUDES) $<
$(DPOSTCOMPILE)
 
-d/common-%.o: d/dmd/common/%.d
+d/common/%.o: d/dmd/common/%.d
+   -test -d $(@D)/$(DEPDIR) || $(mkinstalldirs) $(@D)/$(DEPDIR)
$(DCOMPILE) $(D_INCLUDES) $<
$(DPOSTCOMPILE)

Re: [PATCH 1/2] RISC-V: Improve bitwise and ashift reassociation for single-bit immediate without zbs [PR 115921]

2025-01-13 Thread Jeff Law





On 1/12/25 7:49 AM, Xi Ruoyao wrote:

When zbs is not available, there's nothing special with single-bit
immediates and we should perform reassociation as normal immediates.

gcc/ChangeLog:

PR target/115921
* config/riscv/riscv.md (_shift_reverse): Only check
popcount_hwi if !TARGET_ZBS.

Thanks.  I've pushed this to the trunk.
jeff

Re: [PATCH 2/2] RISC-V: Remove zba check in bitwise and ashift reassociation [PR 115921]

2025-01-13 Thread Jeff Law





On 1/12/25 7:49 AM, Xi Ruoyao wrote:

The test case

 long
 test (long x, long y)
 {
   return ((x | 0x1ff) << 3) + y;
 }

is now compiled (-O2 -march=rv64g_zba) to

 li      a4,4096
 slliw   a5,a0,3
 addi    a4,a4,-8
 or      a5,a5,a4
 addw    a0,a5,a1
 ret

Although this check was originally intended to use zba better, now
removing it actually enables the use of zba for this test case (thanks
to late combine):

 ori     a5,a0,511
 sh3add  a0,a5,a1
 ret

Obviously, bitmanip.md does not cover
(any_or (ashift (reg) (imm123)) imm) at all, and even for and it just
seems more natural splitting to (ashift (and (reg) (imm')) (imm123))
first, then let late combine to combine the outer ashift and the plus.
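
For reference, the two code shapes in question are related by the fact that a left shift distributes over bitwise OR: `(x | C) << N` equals `(x << N) | (C << N)`. A small C check of that identity for the test case, using unsigned arithmetic to sidestep signed-shift pitfalls (the function names are made up for illustration and do not appear in the patch):

```c
#include <assert.h>
#include <stdint.h>

/* OR first, then shift and add: the shape that maps onto ori + sh3add.  */
uint64_t or_then_shift (uint64_t x, uint64_t y)
{
  return ((x | 0x1ff) << 3) + y;
}

/* Shift first, then OR with the shifted constant.  0x1ff << 3 is 0xff8,
   i.e. 4096 - 8, the value the li/addi pair builds.  */
uint64_t shift_then_or (uint64_t x, uint64_t y)
{
  return ((x << 3) | 0xff8) + y;
}

/* Both shapes must compute the same value.  */
void check_equal (uint64_t x, uint64_t y)
{
  assert (or_then_shift (x, y) == shift_then_or (x, y));
}
```

The first form needs only a 12-bit immediate (511) and leaves the shift adjacent to the add, which is what makes the sh3add combination possible.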

I've not found any test case regressed by the removal.
And "make check-gcc RUNTESTFLAGS=riscv.exp='zba-*.c'" also reports no
failure.

gcc/ChangeLog:

PR target/115921
* config/riscv/riscv.md (_shift_reverse): Remove
check for TARGET_ZBA.

gcc/testsuite/ChangeLog:

PR target/115921
* gcc.target/riscv/zba-shNadd-08.c: New test.

So I've gone back and tried to jog my memory about what we were trying to
avoid with the zba condition, including looking through tests in our
internal tree that hadn't been upstreamed yet (there aren't many and
those that exist are likely related to internal patches that aren't
ready to upstream yet or things that are upstream, but under a different
name).


Unfortunately I couldn't find either a test or context to jog my memory.
The "andi.add.uw" pattern in bitmanip.md or this splitter look to be
the most relevant:



;; During combine, we may encounter an attempt to combine
;;   slli rtmp, rs, #imm
;;   zext.w rtmp, rtmp
;;   sh[123]add rd, rtmp, rs2
;; which will lead to the immediate not satisfying the above constraints.
;; By splitting the compound expression, we can simplify to a slli and a
;; sh[123]add.uw.
(define_split
  [(set (match_operand:DI 0 "register_operand")
(plus:DI (and:DI (ashift:DI (match_operand:DI 1 "register_operand")
(match_operand:QI 2 "immediate_operand"))
 (match_operand:DI 3 "consecutive_bits_operand"))
 (match_operand:DI 4 "register_operand")))
   (clobber (match_operand:DI 5 "register_operand"))]
  "TARGET_64BIT && TARGET_ZBA"
  [(set (match_dup 5) (ashift:DI (match_dup 1) (match_dup 6))) 
   (set (match_dup 0) (plus:DI (and:DI (ashift:DI (match_dup 5)

  (match_dup 7))
   (match_dup 8))
   (match_dup 4)))]


The other reasonable possibility would be:




; Make sure that an andi followed by a sh[123]add remains a two instruction
; sequence--and is not torn apart into slli, slri, add.
(define_insn_and_split "*andi_add.uw"
  [(set (match_operand:DI 0 "register_operand" "=r")
(plus:DI (and:DI (ashift:DI (match_operand:DI 1 "register_operand" "r")
(match_operand:QI 2 "imm123_operand" "Ds3"))
 (match_operand:DI 3 "consecutive_bits_operand" ""))
 (match_operand:DI 4 "register_operand" "r")))
   (clobber (match_scratch:DI 5 "=&r"))]
  "TARGET_64BIT && TARGET_ZBA
   && riscv_shamt_matches_mask_p (INTVAL (operands[2]), INTVAL (operands[3]))
   && SMALL_OPERAND (INTVAL (operands[3]) >> INTVAL (operands[2]))"




But without a testcase it's obviously hard to be sure either way.  But I 
do know that condition was added for a reason.


Regardless, I don't think we should hold up this patch for lack of a
testcase.  It's entirely possible that late-combine cleans things up now
and we don't need the special casing anymore.  Or it may turn out that we
have to revisit (and get a testcase installed if so!).


The net is I'll commit this momentarily.

Thanks,
Jeff

Re: Subject: [PATCH] RISC-V: testsuite: Skip test with -flto.

2025-01-13 Thread Jeff Law





On 1/10/25 1:38 AM, Robin Dapp wrote:

Hi,

the zbb-rol-ror and stack_save_restore tests use the -fno-lto option and
scan the final assembly.  For an invocation like -flto ... -fno-lto the
output file we scan is still something like
   zbb-rol-ror-09.ltrans0.ltrans.s.

Therefore skip the tests when "-flto" is present.  This gets rid
of a few UNRESOLVED tests.

Regtested on rv64gcv_zvl512b.  Going to push if the CI agrees.

Regards
  Robin

gcc/testsuite/ChangeLog:

* gcc.target/riscv/stack_save_restore_1.c: Skip for -flto.
* gcc.target/riscv/stack_save_restore_2.c: Ditto.
* gcc.target/riscv/zbb-rol-ror-04.c: Ditto.
* gcc.target/riscv/zbb-rol-ror-05.c: Ditto.
* gcc.target/riscv/zbb-rol-ror-06.c: Ditto.
* gcc.target/riscv/zbb-rol-ror-07.c: Ditto.
* gcc.target/riscv/zbb-rol-ror-08.c: Ditto.
* gcc.target/riscv/zbb-rol-ror-09.c: Ditto.

I went ahead and pushed this as I was cleaning up the queue before
tomorrow's meeting.


jeff

Re: [PATCH v5 02/10] OpenMP: Re-work and extend context selector resolution

2025-01-13 Thread Tobias Burnus


Hi all,

Tobias Burnus wrote:

Tobias Burnus wrote:

Sandra Loosemore wrote:

This patch reimplements the middle-end support for "declare variant"
and extends the resolution mechanism to also handle metadirectives
(PR112779).  It also adds partial support for dynamic selectors
(PR113904) and fixes a selector scoring bug reported as PR114596.  I hope
this rewrite also improves the engineering aspect of the code, e.g. more
comments to explain what it is doing.


Due to a Clang bug and GCC issuing a warning - and me not finding
the right spot in the spec, I claimed in the C FE thread that
  construct={target}
is not fulfilled inside a declare-target function as it is not
inside an 'omp target' construct.

Well, I missed the following line:

OpenMP 6.0, "9.1 OpenMP Contexts" [326:28-30]:


3. For procedures that are determined to be target variants by a declare target
   directive, the target trait is added to the beginning of the construct trait
   set as c1 so the total size of the trait set is increased by one.


Seemingly the original patch author found that line - as GCC does the right
thing for
https://github.com/OpenMP/Examples/blob/main/program_control/sources/metadirective.3.c#L15

#pragma omp begin declare target
void exp_pi_diff(double *d, double my_pi){
   #pragma omp metadirective \
   when(   construct={target}: distribute parallel for ) \
   otherwise(  parallel for simd )

namely, as 'exp_pi_diff' is a declare-target function, the 'when' is correctly matched
and "distribute parallel for" is used.
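
The rule quoted from 9.1 can be pictured with a small standalone model. This is only a sketch: the string trait names and the helpers construct_traits and selector_matches are hypothetical (they are not GCC functions), and matching is reduced to an in-order subsequence test, which simplifies what the spec requires.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using trait_set = std::vector<std::string>;

// Build the construct trait set seen inside a procedure body.  ENCLOSING
// lists the constructs lexically around the directive, outermost first.
trait_set construct_traits (bool declare_target, const trait_set &enclosing)
{
  trait_set traits;
  if (declare_target)
    traits.push_back ("target");   // added at the beginning, as c1
  traits.insert (traits.end (), enclosing.begin (), enclosing.end ());
  return traits;
}

// Simplified matching (an assumption of this sketch): every selector trait
// must appear in CONTEXT in the same relative order.
bool selector_matches (const trait_set &selector, const trait_set &context)
{
  std::size_t next = 0;
  for (const std::string &t : context)
    if (next < selector.size () && t == selector[next])
      ++next;
  return next == selector.size ();
}
```

In this model, selector_matches ({"target"}, construct_traits (true, {})) holds even though no 'omp target' construct encloses the metadirective, which corresponds to the exp_pi_diff case.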

So far so good, but when compiling it, one gets the confusing warning:

warning: direct calls to an offloadable function containing metadirectives
with a ‘construct={target}’ selector may produce unexpected results


(To the defense of the original patch author, back then the spec (5.0) talked
about "device routines" which is ambiguous whether it applies to functions
marked as declare target - or only those actually running on non-host
devices.)

* * *

As GCC does the right thing - but the warning is highly confusing in light
of the much clearer 6.0 wording, IMHO, we can just remove the following
bits of the patch.

Namely, removing the said warning:


--- a/gcc/omp-low.cc
+++ b/gcc/omp-low.cc
@@ -14676,6 +14676,24 @@ lower_omp_1 (gimple_stmt_iterator *gsi_p, omp_context *ctx)
tree fndecl;
call_stmt = as_a <gcall *> (stmt);
fndecl = gimple_call_fndecl (call_stmt);
+  if (fndecl
+ && lookup_attribute ("omp metadirective construct target",
+  DECL_ATTRIBUTES (fndecl)))
+   {
+ bool in_target_ctx = false;
+
+ for (omp_context *up = ctx; up; up = up->outer)
+   if (gimple_code (up->stmt) == GIMPLE_OMP_TARGET)
+ {
+   in_target_ctx = true;
+   break;
+ }
+ if (!ctx || !in_target_ctx)
+   warning_at (gimple_location (stmt), 0,
+   "direct calls to an offloadable function containing "
+   "metadirectives with a %<construct={target}%> "
+   "selector may produce unexpected results");
+   }
if (fndecl
  && fndecl_built_in_p (fndecl, BUILT_IN_NORMAL))
switch (DECL_FUNCTION_CODE (fndecl))



And as this is the only use for "omp metadirective construct target", 
the following can then be removed in



--- a/gcc/gimplify.cc
+++ b/gcc/gimplify.cc
...

+static enum gimplify_status
+gimplify_omp_metadirective (tree *expr_p, gimple_seq *pre_p, gimple_seq *,
+   bool (*) (tree), fallback_t)
+{

Namely:

+  /* Mark offloadable functions containing metadirectives that specify
+ a 'construct' selector with a 'target' constructor.  */
+  if (offloading_function_p (current_function_decl))
+{
+  for (tree variant = OMP_METADIRECTIVE_VARIANTS (*expr_p);
+  variant != NULL_TREE; variant = TREE_CHAIN (variant))
+   {
+ tree selector = OMP_METADIRECTIVE_VARIANT_SELECTOR (variant);
+
+ if (omp_get_context_selector (selector, OMP_TRAIT_SET_CONSTRUCT,
+   OMP_TRAIT_CONSTRUCT_TARGET))
+   {
+ tree id = get_identifier ("omp metadirective construct target");
+
+ DECL_ATTRIBUTES (current_function_decl)
+   = tree_cons (id, NULL_TREE,
+DECL_ATTRIBUTES (current_function_decl));
+ break;
+   }
+   }
+}


Just as followup to my claim that the 2/10 patch was LGTM. With those
changes, it is then LGTM v2.0.

BTW: A later patch in the patch series added a testcase similar to the
OpenMP example document; it needs to be updated for the removed warning
but it will test that this works.

Tobias

PS: Sorry for causing confusion, but I hope it is now sorted out
correctly.

Re: [PATCH] c++/modules: Fix linkage checks for exported using-decls

2025-01-13 Thread Jason Merrill


On 1/12/25 7:20 AM, Nathaniel Shead wrote:

On Fri, Jan 03, 2025 at 05:18:55PM +, xxx wrote:

From: yxj-github-437 <2457369...@qq.com>

This patch attempts to fix an error when building module std. The reason for
the error is that __builtin_va_list (aka struct __va_list) has internal
linkage, so this attempts to handle the builtin type by checking whether
DECL_SOURCE_LOCATION (entity) is BUILTINS_LOCATION.



Hi, thanks for the patch!  I suspect this may not be sufficient to
completely avoid issues with the __gnuc_va_list type; in particular, if
it's internal linkage that may prevent it from being referred to in
other ways by inline functions in named modules (due to P1815).

Maybe a better approach would be to instead mark this builtin type as
TREE_PUBLIC (presumably in aarch64_build_builtin_va_list)?


Agreed.  Why does it currently have internal linkage?

Jason

Re: [RFC PATCH v2] RISC-V:Fix th.vsetvli generates from vext_x_v with wrong operand

2025-01-13 Thread Jeff Law





On 12/22/24 11:51 PM, yunze...@linux.alibaba.com wrote:

From: Yunze Zhu 

Fix a bug where th.vsetvli is generated from vext_x_v with an imm operand,
which is reported as an illegal operand. This patch fixes this by replacing
the imm operand with a reg operand in th.vsetvli.

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc:

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/xtheadvector/vext_x_v.c: New test.

I think this was fixed a while back.

The "problem" insns look like this:


(gdb) p debug_rtx (insn)
(insn 39 7 13 (parallel [
(set (reg:SI 66 vl)
(unspec:SI [
(reg:DI 0 zero)
(const_int 32 [0x20])
(const_int 0 [0])
] UNSPEC_VSETVL))
(set (reg:SI 67 vtype)
(unspec:SI [
(const_int 32 [0x20])
(const_int 0 [0])
(const_int 1 [0x1]) repeated x2
] UNSPEC_VSETVL))
]) "j.c":8:18 3722 {vsetvl_discard_resultdi}
 (expr_list:REG_UNUSED (reg:SI 66 vl)
(nil)))
Which looks sensible to me.  That's going to output "zero" rather than 
"0", which is what we want.


As I mentioned, I'm iterating with Jinma on a possible case where 
IRA/LRA might be substituting the (const_int 0) form back in.  But as it 
stands I don't see a need for this patch right now.


If you can reproduce this problem on trunk, certainly let me know and 
pass along an updated testcase/args.


Thanks,
Jeff

[COMMITTED] RISC-V: fix thinko in riscv_register_move_cost ()

2025-01-13 Thread Vineet Gupta

This seemingly benign mistake caused a massive SPEC2017 Cactu regression
(2.1 trillion insn to 2.5 trillion) wiping out all the gains from my
recent sched1 improvement. Thankfully the issue was trivial to fix even
if hard to isolate.

On BPI3:

Before bug
----------
|  Performance counter stats for './cactusBSSN_r_base-1':
|
|       4,557,471.02 msec task-clock:u       #    1.000 CPUs utilized
|              1,245      context-switches:u #    0.273 /sec
|                  1      cpu-migrations:u   #    0.000 /sec
|            205,376      page-faults:u      #   45.064 /sec
|  7,291,944,801,307      cycles:u           #    1.600 GHz
|  2,134,835,735,951      instructions:u     #    0.29  insn per cycle
|     10,799,296,738      branches:u         #    2.370 M/sec
|         15,308,966      branch-misses:u    #    0.14% of all branches
|
| 4557.710508078 seconds time elapsed

Bug
---
|  Performance counter stats for './cactusBSSN_r_base-2':
|
|       4,801,813.79 msec task-clock:u       #    1.000 CPUs utilized
|              8,066      context-switches:u #    1.680 /sec
|                  1      cpu-migrations:u   #    0.000 /sec
|            203,836      page-faults:u      #   42.450 /sec
|  7,682,826,638,790      cycles:u           #    1.600 GHz
|  2,503,133,291,344      instructions:u     #    0.33  insn per cycle
   ^
|     10,799,287,796      branches:u         #    2.249 M/sec
|         16,641,200      branch-misses:u    #    0.15% of all branches
|
| 4802.616638386 seconds time elapsed
|

Fix
---
|  Performance counter stats for './cactusBSSN_r_base-3':
|
|       4,556,170.75 msec task-clock:u       #    1.000 CPUs utilized
|              1,739      context-switches:u #    0.382 /sec
|                  0      cpu-migrations:u   #    0.000 /sec
|            203,458      page-faults:u      #   44.655 /sec
|  7,289,854,613,923      cycles:u           #    1.600 GHz
|  2,134,854,070,916      instructions:u     #    0.29  insn per cycle
|     10,799,296,807      branches:u         #    2.370 M/sec
|         15,403,357      branch-misses:u    #    0.14% of all branches
|
| 4556.445490123 seconds time elapsed

Fixes: 46888571d242 ("RISC-V: Add cr and cf constraint")
Signed-off-by: Vineet Gupta 

gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_register_move_cost): Remove buggy
check.

Signed-off-by: Vineet Gupta 
---
 gcc/config/riscv/riscv.cc | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 3b5712429e46..a03b35bb9cfd 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -9591,8 +9591,7 @@ riscv_register_move_cost (machine_mode mode,
   bool from_is_gpr = from == GR_REGS || from == RVC_GR_REGS;
   bool to_is_fpr = to == FP_REGS || to == RVC_FP_REGS;
   bool to_is_gpr = to == GR_REGS || to == RVC_GR_REGS;
-  if ((from_is_fpr && to == to_is_gpr) ||
-  (from_is_gpr && to_is_fpr))
+  if ((from_is_fpr && to_is_gpr) || (from_is_gpr && to_is_fpr))
 return tune_param->fmv_cost;
 
   if (from == V_REGS)
-- 
2.43.0
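
The removed check compared a register-class enumerator against a bool ("to == to_is_gpr"). A minimal standalone sketch of why that goes wrong; the enum below is a hypothetical stand-in rather than GCC's actual reg_class definition, and the two helper names are made up for illustration:

```cpp
#include <cassert>

// Hypothetical stand-in for a target's reg_class enum.  All that matters
// here is that the GPR classes do not have the values 0 or 1.
enum reg_class { NO_REGS, SIBCALL_REGS, RVC_GR_REGS, JALR_REGS, GR_REGS,
                 RVC_FP_REGS, FP_REGS, V_REGS };

// Buggy form: "to == to_is_gpr" compares an enumerator with a bool, so it
// only holds when the numeric value of TO happens to equal 0 or 1.
bool is_fpr_gpr_move_buggy (reg_class from, reg_class to)
{
  bool from_is_fpr = from == FP_REGS || from == RVC_FP_REGS;
  bool from_is_gpr = from == GR_REGS || from == RVC_GR_REGS;
  bool to_is_fpr = to == FP_REGS || to == RVC_FP_REGS;
  bool to_is_gpr = to == GR_REGS || to == RVC_GR_REGS;
  return (from_is_fpr && to == to_is_gpr) || (from_is_gpr && to_is_fpr);
}

// Fixed form, as in the patch: test the predicate itself.
bool is_fpr_gpr_move_fixed (reg_class from, reg_class to)
{
  bool from_is_fpr = from == FP_REGS || from == RVC_FP_REGS;
  bool from_is_gpr = from == GR_REGS || from == RVC_GR_REGS;
  bool to_is_fpr = to == FP_REGS || to == RVC_FP_REGS;
  bool to_is_gpr = to == GR_REGS || to == RVC_GR_REGS;
  return (from_is_fpr && to_is_gpr) || (from_is_gpr && to_is_fpr);
}
```

With these stand-in values the buggy form misses FP_REGS -> GR_REGS moves entirely and instead fires for FP_REGS -> NO_REGS, so in this model the FPR-to-GPR direction would not be costed as an fmv.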

Re: [PATCH] RISC-V: Fix the result error caused by not updating ratio when using "use_max_sew" to merge vsetvl.

2025-01-13 Thread Jeff Law





On 1/13/25 3:15 AM, Jin Ma wrote:

When the vsetvl instructions of the two RVV instructions are merged
using "use_max_sew", it is possible to update the sew of prev if
prev.sew < next.sew, but keep the original ratio, which is obviously
wrong. When the subsequent instructions are equal to the wrong ratio,
it is possible to generate the wrong "vsetvli zero,zero" instruction,
which will lead to unknown avl.

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc:

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/bug-10.c: New test.

ChangeLog missing for the riscv-vsetvl.cc change.  I'll fix that.



diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/bug-10.c b/gcc/testsuite/gcc.target/riscv/rvv/base/bug-10.c
new file mode 100644
index ..c1a8ac95c17f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/bug-10.c
@@ -0,0 +1,15 @@
+/* { dg-do compile { target { rv64 } } } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-O3" "-Og" "-Os" "-Oz" } } */
+/* { dg-options " -march=rv64gcv_zvfh -mabi=lp64d -O2 --param=vsetvl-strategy=optim -fno-schedule-insns  -fno-schedule-insns2 -fno-schedule-fusion " } */
So as written this test will be totally skipped (and I've verified that 
locally).  It looks like you just wanted -O2 and we're not cycling 
through options, so we don't need/want the dg-skip-if.   I'll fix that too.



I'll make the obvious changes and push the result to the trunk.

Thanks,
jeff

Re: [PATCH]AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]

2025-01-13 Thread Richard Sandiford

Tamar Christina  writes:
> Hi All,
>
> in g:e91a17fe39c39e98cebe6e1cbc8064ee6846a3a7 we added the ability for
> -mcpu=native on unknown CPUs to still enable architecture extensions.
>
> This has worked great but was only added for homogenous systems.
>
> However the same thing works for big.LITTLE as in such systems the cores must
> have the same extensions otherwise it doesn't fundamentally work.
>
> i.e. task migration from one core to the other wouldn't work.
>
> This extends the same handling to non-homogenous systems.
>
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>
> Ok for master?
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
>   PR target/113257
>   * config/aarch64/driver-aarch64.cc (host_detect_local_cpu):
>
> gcc/testsuite/ChangeLog:
>
>   PR target/113257
>   * gcc.target/aarch64/cpunative/info_34: New test.
>   * gcc.target/aarch64/cpunative/native_cpu_34.c: New test.
>
> ---
>
> diff --git a/gcc/config/aarch64/driver-aarch64.cc b/gcc/config/aarch64/driver-aarch64.cc
> index 45fce67a646351b848b7cd7d0fd35d343731c0d1..2a454daf031aa3ac81a9a2c03b15c09731e4f56e 100644
> --- a/gcc/config/aarch64/driver-aarch64.cc
> +++ b/gcc/config/aarch64/driver-aarch64.cc
> @@ -449,6 +449,20 @@ host_detect_local_cpu (int argc, const char **argv)
> break;
>   }
>   }
> +
> +  /* On big.LITTLE if we find any unknown CPUs we can still pick arch
> +  features as the cores should have the same features.  So just pick
> +  the feature flags from any of the cpus.  */
> +  if (aarch64_cpu_data[i].name == NULL)
> + {
> +   auto arch_info = get_arch_from_id (DEFAULT_ARCH);
> +
> +   gcc_assert (arch_info);
> +
> +   res = concat ("-march=", arch_info->name, NULL);
> +   default_flags = arch_info->flags;
> + }
> +

Currently, if gcc recognises the host cpu, and if one-thing is more
restrictive than that cpu, gcc will warn on:

  gcc -march=one-thing -mcpu=native

and choose one-thing.  It looks like one consequence of this patch
is that, for unrecognised big.LITTLE, the command line would get
converted to:

  gcc -march=one-thing -march=above-replacement

and so -mcpu=native would silently "win" over one-thing.  Is that right?

Perhaps we should adjust:

   " %{mcpu=native:% +/* { dg-additional-options "-mcpu=native" } */
> +
> +int main()
> +{
> +  return 0;
> +}
> +
> +/* { dg-final { scan-assembler {\.arch armv8-a\+dotprod\+crc\+crypto\+sve2\n} } } */
> +
> +/* Test a normal looking procinfo.  */

[PATCH] Fix build for STORE_FLAG_VALUE<0 targets [PR118418]

2025-01-13 Thread Richard Sandiford

In g:06c4cf398947b53b4bfc65752f9f879bb2d07924 I mishandled signed
comparisons of comparison results on STORE_FLAG_VALUE < 0 targets
(despite specifically referencing STORE_FLAG_VALUE in the commit
message).  There, (lt TRUE FALSE) is true, although (ltu FALSE TRUE)
still holds.

Things get messy with vector modes, and since those weren't the focus
of the original commit, it seemed better to punt on them for now.
However, punting means that this optimisation no longer feels like
a natural tail-call operation.  The patch therefore converts
"return simplify..." to the usual call-and-conditional-return pattern.

Bootstrapped & regression-tested on aarch64-linux-gnu.  Also tested
by building m68k-elf.  OK to install?

Richard


gcc/
* simplify-rtx.cc (simplify_context::simplify_relational_operation_1):
Take STORE_FLAG_VALUE into account when handling signed comparisons
of comparison results.
---
 gcc/simplify-rtx.cc | 39 ---
 1 file changed, 24 insertions(+), 15 deletions(-)

diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
index 71c5d3c1b1b..dda8fc689e7 100644
--- a/gcc/simplify-rtx.cc
+++ b/gcc/simplify-rtx.cc
@@ -6434,7 +6434,7 @@ simplify_context::simplify_relational_operation_1 (rtx_code code,
return simplify_gen_binary (AND, mode, XEXP (tmp, 0), const1_rtx);
 }
 
-  /* For two booleans A and B:
+  /* For two unsigned booleans A and B:
 
  A >  B == ~B & A
  A >= B == ~B | A
@@ -6443,20 +6443,29 @@ simplify_context::simplify_relational_operation_1 (rtx_code code,
  A == B == ~A ^ B (== ~B ^ A)
  A != B ==  A ^ B
 
- simplify_logical_relational_operation checks whether A and B
- are booleans.  */
-  if (code == GTU || code == GT)
-return simplify_logical_relational_operation (AND, mode, op1, op0, true);
-  if (code == GEU || code == GE)
-return simplify_logical_relational_operation (IOR, mode, op1, op0, true);
-  if (code == LTU || code == LT)
-return simplify_logical_relational_operation (AND, mode, op0, op1, true);
-  if (code == LEU || code == LE)
-return simplify_logical_relational_operation (IOR, mode, op0, op1, true);
-  if (code == EQ)
-return simplify_logical_relational_operation (XOR, mode, op0, op1, true);
-  if (code == NE)
-return simplify_logical_relational_operation (XOR, mode, op0, op1);
+ For signed comparisons, we have to take STORE_FLAG_VALUE into account,
+ with the rules above applying for positive STORE_FLAG_VALUE and with
+ the relations reversed for negative STORE_FLAG_VALUE.  */
+  if (is_a<scalar_int_mode> (cmp_mode)
+  && COMPARISON_P (op0)
+  && COMPARISON_P (op1))
+{
+  rtx t = NULL_RTX;
+  if (code == GTU || code == (STORE_FLAG_VALUE > 0 ? GT : LT))
+   t = simplify_logical_relational_operation (AND, mode, op1, op0, true);
+  else if (code == GEU || code == (STORE_FLAG_VALUE > 0 ? GE : LE))
+   t = simplify_logical_relational_operation (IOR, mode, op1, op0, true);
+  else if (code == LTU || code == (STORE_FLAG_VALUE > 0 ? LT : GT))
+   t = simplify_logical_relational_operation (AND, mode, op0, op1, true);
+  else if (code == LEU || code == (STORE_FLAG_VALUE > 0 ? LE : GE))
+   t = simplify_logical_relational_operation (IOR, mode, op0, op1, true);
+  else if (code == EQ)
+   t = simplify_logical_relational_operation (XOR, mode, op0, op1, true);
+  else if (code == NE)
+   t = simplify_logical_relational_operation (XOR, mode, op0, op1);
+  if (t)
+   return t;
+}
 
   return NULL_RTX;
 }
-- 
2.25.1
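
The GT/GE rewrites in the comment above, and the need to reverse the signed codes when STORE_FLAG_VALUE is negative, can be checked exhaustively over the four (A, B) combinations. The following is a sketch only: it models comparison results as 32-bit integers that are either 0 or sfv (a stand-in for STORE_FLAG_VALUE), treats "~" as logical inversion of such a value, and uses hypothetical helper names that do not exist in simplify-rtx.cc.

```cpp
#include <cassert>
#include <cstdint>

// Encode a truth value the way a target with the given flag value would.
static int32_t encode (bool b, int32_t sfv) { return b ? sfv : 0; }

// Logical inversion of an encoded truth value (the "~" in the comment).
static int32_t invert (int32_t x, int32_t sfv) { return x ^ sfv; }

// True if the GT and GE rewrites hold for every pair of encoded booleans.
// When REVERSE_SIGNED is set, signed GT/GE are replaced by LT/LE.
static bool identities_hold (int32_t sfv, bool reverse_signed)
{
  for (int i = 0; i < 4; ++i)
    {
      int32_t a = encode ((i & 1) != 0, sfv);
      int32_t b = encode ((i & 2) != 0, sfv);
      uint32_t ua = a, ub = b;
      int32_t gt_form = invert (b, sfv) & a;   // ~B & A
      int32_t ge_form = invert (b, sfv) | a;   // ~B | A

      // Unsigned comparisons never need reversing.
      if (encode (ua > ub, sfv) != gt_form
          || encode (ua >= ub, sfv) != ge_form)
        return false;

      // Signed comparisons depend on the sign of the flag value.
      bool sgt = reverse_signed ? a < b : a > b;
      bool sge = reverse_signed ? a <= b : a >= b;
      if (encode (sgt, sfv) != gt_form || encode (sge, sfv) != ge_form)
        return false;
    }
  return true;
}
```

In this model identities_hold (1, false) and identities_hold (-1, true) both succeed, while identities_hold (-1, false) fails, which is the mishandled case the patch addresses.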

Re: [PATCH] d, v2: give dependency files better filenames

2025-01-13 Thread Jakub Jelinek

On Mon, Jan 13, 2025 at 07:29:52PM +0100, Iain Buclaw wrote:
> For example, changing the common package sources we end up with the 
> following, though can't say I'm a strong fan of having test and 
> optionally mkdir ran on every recipe execution.

You also then need to handle it for cleaning etc.
Most FEs don't use subdirectories of the language subdirectories in GCC,
the exception is Ada (but that one builds there the runtime library too)
and Modula 2 (but its dependencies have been a nightmare).

Jakub

Re: [PATCH] c++: make finish_pseudo_destructor_expr SFINAE-aware [PR116417]

2025-01-13 Thread Marek Polacek

On Mon, Jan 13, 2025 at 11:25:25AM -0500, Patrick Palka wrote:
> Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look
> OK for trunk?

Looks okay to me.
 
> -- >8 --
> 
>   PR c++/116417
> 
> gcc/cp/ChangeLog:
> 
>   * cp-tree.h (finish_pseudo_destructor_expr): Add complain
>   parameter.
>   * parser.cc (cp_parser_postfix_dot_deref_expression): Pass
>   complain=tf_warning_or_error to finish_pseudo_destructor_expr.
>   * pt.cc (tsubst_expr): Pass complain to
>   finish_pseudo_destructor_expr.
>   * semantics.cc (finish_pseudo_destructor_expr): Check complain
>   before issuing a diagnostic.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/template/pseudodtor7.C: New test.
> ---
>  gcc/cp/cp-tree.h|  2 +-
>  gcc/cp/parser.cc|  3 ++-
>  gcc/cp/pt.cc|  4 ++--
>  gcc/cp/semantics.cc | 15 +--
>  gcc/testsuite/g++.dg/template/pseudodtor7.C | 15 +++
>  5 files changed, 29 insertions(+), 10 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/template/pseudodtor7.C
> 
> diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
> index aed58523b16..1b42e8ba7d8 100644
> --- a/gcc/cp/cp-tree.h
> +++ b/gcc/cp/cp-tree.h
> @@ -7966,7 +7966,7 @@ extern tree lookup_and_finish_template_variable (tree, tree, tsubst_flags_t = tf
>  extern tree finish_template_variable (tree, tsubst_flags_t = tf_warning_or_error);
>  extern cp_expr finish_increment_expr (cp_expr, enum tree_code);
>  extern tree finish_this_expr (void);
> -extern tree finish_pseudo_destructor_expr   (tree, tree, tree, location_t);
> +extern tree finish_pseudo_destructor_expr   (tree, tree, tree, location_t, tsubst_flags_t);
>  extern cp_expr finish_unary_op_expr  (location_t, enum tree_code, cp_expr,
>tsubst_flags_t);
>  /* Whether this call to finish_compound_literal represents a C++11 functional
> diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
> index f548dc31c2b..7f4340537c9 100644
> --- a/gcc/cp/parser.cc
> +++ b/gcc/cp/parser.cc
> @@ -8847,7 +8847,8 @@ cp_parser_postfix_dot_deref_expression (cp_parser *parser,
> pseudo_destructor_p = true;
> postfix_expression
>   = finish_pseudo_destructor_expr (postfix_expression,
> -  s, type, location);
> +  s, type, location,
> +  tf_warning_or_error);
>   }
>  }
>  
> diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
> index a141de56446..537e4c4a494 100644
> --- a/gcc/cp/pt.cc
> +++ b/gcc/cp/pt.cc
> @@ -21537,7 +21537,7 @@ tsubst_expr (tree t, tree args, tsubst_flags_t complain, tree in_decl)
>   tree op1 = RECUR (TREE_OPERAND (t, 1));
>   tree op2 = tsubst (TREE_OPERAND (t, 2), args, complain, in_decl);
>   RETURN (finish_pseudo_destructor_expr (op0, op1, op2,
> -input_location));
> +input_location, complain));
>}
>  
>  case TREE_LIST:
> @@ -21601,7 +21601,7 @@ tsubst_expr (tree t, tree args, tsubst_flags_t complain, tree in_decl)
>   dtor = TREE_OPERAND (dtor, 0);
>   if (TYPE_P (dtor))
> RETURN (finish_pseudo_destructor_expr
> -   (object, s, dtor, input_location));
> +   (object, s, dtor, input_location, complain));
> }
> }
> }
> diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
> index 15840e10620..76c79c6a8cc 100644
> --- a/gcc/cp/semantics.cc
> +++ b/gcc/cp/semantics.cc
> @@ -3527,7 +3527,7 @@ finish_this_expr (void)
>  
>  tree
>  finish_pseudo_destructor_expr (tree object, tree scope, tree destructor,
> -location_t loc)
> +location_t loc, tsubst_flags_t complain)
>  {
>if (object == error_mark_node || destructor == error_mark_node)
>  return error_mark_node;
> @@ -3538,16 +3538,18 @@ finish_pseudo_destructor_expr (tree object, tree scope, tree destructor,
>  {
>if (scope == error_mark_node)
>   {
> -   error_at (loc, "invalid qualifying scope in pseudo-destructor name");
> +   if (complain & tf_error)
> + error_at (loc, "invalid qualifying scope in pseudo-destructor name");
> return error_mark_node;
>   }
>if (is_auto (destructor))
>   destructor = TREE_TYPE (object);
>if (scope && TYPE_P (scope) && !check_dtor_name (scope, destructor))
>   {
> -   error_at (loc,
> - "qualified type %qT does not match destructor name ~%qT",
> - scope, destructor);
> +   if (complain & tf_error)
> + error_at (loc,
> +   "qualified type %qT
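
The pattern the quoted patch applies (return the failure value unconditionally, but emit the diagnostic only when the caller's complain flags include tf_error) can be sketched in isolation. Everything below is hypothetical illustration: the flag names, the diagnostics vector, and check_dtor_name_matches are stand-ins, not GCC's tsubst_flags_t or its diagnostic machinery.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for the tsubst_flags_t bits.
enum complain_flags : unsigned
{
  cf_none = 0,
  cf_error = 1u << 0,
  cf_warning = 1u << 1
};

// Diagnostics are collected here instead of being printed.
std::vector<std::string> diagnostics;

// Returns false on mismatch.  The failure is always reported through the
// return value, but a diagnostic is recorded only if COMPLAIN has cf_error,
// so a SFINAE-style caller can probe without producing a hard error.
bool check_dtor_name_matches (const std::string &scope,
                              const std::string &dtor, unsigned complain)
{
  if (scope == dtor)
    return true;
  if (complain & cf_error)
    diagnostics.push_back ("qualified type '" + scope
                           + "' does not match destructor name ~'"
                           + dtor + "'");
  return false;
}
```

A caller in a substitution context would pass cf_none and simply treat a false result as deduction failure; the parser-like caller passes cf_error | cf_warning and gets the message.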

Re: [PATCH] AArch64: Deprecate -mabi=ilp32

2025-01-13 Thread Wilco Dijkstra

Hi all,

> In that case, I'm coming round to the idea of deprecating ILP32.
> I think it was already common ground that the GNU/Linux support is dead.
> watchOS would use Mach objects rather than ELF.  As you say, it isn't
> clear how much of the current ILP32 support would be relevant for it.
> And there is no specific evidence that anyone is using the baremetal
> support.  Deprecating ILP32 is probably the only way of finding out
> whether there are baremetal users.

Yes, even just knowing there are actual users would be useful!

> (Note that I don't think the argument about libatomic & long really
> shifts the balance, since baremetal doesn't use libatomic, and I assume
> watchOS wouldn't either.  PR118142 has been closed as WONTFIX, which I
> agree is the right resolution whatever happens with the deprecation
> decision.)

Libatomic is just one example of new code that hasn't been designed for or
tested on ILP32. The underlying point is that software quickly rots if you don't
actively maintain it. When I tried building SPECINT, I got several assembler
errors like: "whilewr p15.b,w25,w28". That's likely easy to fix (like PR117711),
but someone has to take on this maintenance. And that just hasn't happened
for years...

Cheers,
Wilco

Re: [PATCH v2] c++/modules: Don't emit imported deduction guides [PR117397]

2025-01-13 Thread Jason Merrill


On 1/12/25 7:03 AM, Nathaniel Shead wrote:

On Sun, Jan 12, 2025 at 04:14:41AM +1100, Nathaniel Shead wrote:

Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?

-- >8 --

The ICE in the linked PR is caused because name lookup finds duplicate
copies of the deduction guides, causing a checking assert to fail.

This is ultimately because we're exporting an imported guide; when name
lookup processes 'dguide-5_b.H' it goes via the 'tt_entity' path and
just returns the entity from 'dguide-5_a.H'.  Because this doesn't ever
go through 'key_mergeable' we never set 'BINDING_VECTOR_GLOBAL_DUPS_P'
and so deduping is not engaged, allowing duplicate results.

Currently I believe this to be a peculiarity of the ANY_REACHABLE
handling for deduction guides; in no other case that I can find do we
emit bindings purely to imported entities.  As such, this patch fixes
this problem from that end, by ensuring that we simply do not emit any
imported deduction guides.  This avoids the ICE because no duplicates
need deduping to start with, and should otherwise have no functional
change because lookup of deduction guides will look at all reachable
modules (exported or not) regardless.

Since we're now deliberately not emitting imported deduction guides we
can use LOOK_want::NORMAL instead of LOOK_want::ANY_REACHABLE, since the
extra work to find as-yet undiscovered deduction guides in transitive
importers is not necessary here.

PR c++/117397

gcc/cp/ChangeLog:

* module.cc (depset::hash::add_deduction_guides): Don't emit
imported deduction guides.

gcc/testsuite/ChangeLog:

* g++.dg/modules/dguide-5_a.H: New test.
* g++.dg/modules/dguide-5_b.H: New test.
* g++.dg/modules/dguide-5_c.H: New test.

Signed-off-by: Nathaniel Shead 
---
  gcc/cp/module.cc  | 30 +++
  gcc/testsuite/g++.dg/modules/dguide-5_a.H |  6 +
  gcc/testsuite/g++.dg/modules/dguide-5_b.H |  6 +
  gcc/testsuite/g++.dg/modules/dguide-5_c.H |  7 ++
  4 files changed, 39 insertions(+), 10 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/modules/dguide-5_a.H
  create mode 100644 gcc/testsuite/g++.dg/modules/dguide-5_b.H
  create mode 100644 gcc/testsuite/g++.dg/modules/dguide-5_c.H

diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
index 51990f36284..c4df18b9ca9 100644
--- a/gcc/cp/module.cc
+++ b/gcc/cp/module.cc
@@ -14341,22 +14341,32 @@ depset::hash::add_deduction_guides (tree decl)
if (find_binding (ns, name))
  return;
  
-  tree guides = lookup_qualified_name (ns, name, LOOK_want::ANY_REACHABLE,

+  tree guides = lookup_qualified_name (ns, name, LOOK_want::NORMAL,
   /*complain=*/false);
if (guides == error_mark_node)
  return;
  
-  /* We have bindings to add.  */

-  depset *binding = make_binding (ns, name);
-  add_namespace_context (binding, ns);
+  depset *binding = nullptr;
+  for (tree t : lkp_range (guides))
+{
+  /* We don't want to export imported deduction guides, since searches will
+look there anyway.  */
+  if (DECL_LANG_SPECIFIC (STRIP_TEMPLATE (t))
+ && DECL_MODULE_IMPORT_P (STRIP_TEMPLATE (t)))
+   continue;


Actually on further thought, this is not correct; this will prevent us
from exporting declarations inherited from a partition.  Here's an
updated version of the patch that will handle this properly (using
depset::is_import which handles this case correctly) and with a bonus
assertion re: bindings of imported decls to help ensure we don't run
into similar errors in the future.

Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?

-- >8 --

The ICE in the linked PR is caused because name lookup finds duplicate
copies of the deduction guides, causing a checking assert to fail.

This is ultimately because we're exporting an imported guide; when name
lookup processes 'dguide-5_b.H' it goes via the 'tt_entity' path and
just returns the entity from 'dguide-5_a.H'.  Because this doesn't ever
go through 'key_mergeable' we never set 'BINDING_VECTOR_GLOBAL_DUPS_P'
and so deduping is not engaged, allowing duplicate results.

Currently I believe this to be a peculiarity of the ANY_REACHABLE
handling for deduction guides; in no other case that I can find do we
emit bindings purely to imported entities.  As such, this patch fixes
this problem from that end, by ensuring that we simply do not emit any
imported deduction guides.  This avoids the ICE because no duplicates
need deduping to start with, and should otherwise have no functional
change because lookup of deduction guides will look at all reachable
modules (exported or not) regardless.

Since we're now deliberately not emitting imported deduction guides we
can use LOOK_want::NORMAL instead of LOOK_want::ANY_REACHABLE, since the
extra work to find as-yet undiscovered deduction guides in transitive
importers is not necessary here.


OK.

Do we still need any_reachable in check_module_

Re: [PATCH] c++: Add support for vec_dup to constexpr [PR118445]

2025-01-13 Thread Jason Merrill


On 1/13/25 4:55 PM, Andrew Pinski wrote:

With the addition of supporting operations on the SVE scalable vector types,
the vec_duplicate tree will show up in expressions and the constexpr handling
was not done for this tree code.
This is a simple fix to treat VEC_DUPLICATE like any other unary operator and
allows the constexpr-add-1.C testcase to work.

Built and tested for aarch64-linux-gnu.


OK.


PR c++/118445

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_constant_expression): Handle VEC_DUPLICATE like
a "normal" unary operator.
(potential_constant_expression_1): Likewise.

gcc/testsuite/ChangeLog:

* g++.target/aarch64/sve/constexpr-add-1.C: New test.

Signed-off-by: Andrew Pinski 
---
  gcc/cp/constexpr.cc  |  2 ++
  .../g++.target/aarch64/sve/constexpr-add-1.C | 16 
  2 files changed, 18 insertions(+)
  create mode 100644 gcc/testsuite/g++.target/aarch64/sve/constexpr-add-1.C

diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index 1345bc124ef..0896576fd28 100644
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -8005,6 +8005,7 @@ cxx_eval_constant_expression (const constexpr_ctx *ctx, tree t,
  case BIT_NOT_EXPR:
  case TRUTH_NOT_EXPR:
  case FIXED_CONVERT_EXPR:
+case VEC_DUPLICATE_EXPR:
r = cxx_eval_unary_expression (ctx, t, lval,
 non_constant_p, overflow_p);
break;
@@ -10344,6 +10345,7 @@ potential_constant_expression_1 (tree t, bool want_rval, bool strict, bool now,
  case UNARY_PLUS_EXPR:
  case UNARY_LEFT_FOLD_EXPR:
  case UNARY_RIGHT_FOLD_EXPR:
+case VEC_DUPLICATE_EXPR:
  unary:
return RECUR (TREE_OPERAND (t, 0), rval);
  
diff --git a/gcc/testsuite/g++.target/aarch64/sve/constexpr-add-1.C b/gcc/testsuite/g++.target/aarch64/sve/constexpr-add-1.C

new file mode 100644
index 000..43489560c8a
--- /dev/null
+++ b/gcc/testsuite/g++.target/aarch64/sve/constexpr-add-1.C
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+
+/* PR C++/118445 */
+
+#include <arm_sve.h>
+
+/* See if constexpr handles VEC_DUPLICATE and SVE. */
+constexpr svfloat32_t f(svfloat32_t b, float a)
+{
+  return b + a;
+}
+
+svfloat32_t g(void)
+{
+  return f((svfloat32_t){1.0}, 2.0);
+}

Re: [PATCH] c++: pack expansion arg vs non-pack parm checking ICE [PR118454]

2025-01-13 Thread Jason Merrill


On 1/13/25 3:27 PM, Patrick Palka wrote:

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK
for trunk?


OK, but do we also need this in the "still packed" case earlier in the 
function?



-- >8 --

During ahead of time template argument coercion, we handle the case of
passing a pack expansion to a non-pack parameter by breaking out early
and using the original unconverted arguments, deferring coercion until
instantiation time where we have concrete arguments.  We still however
need to strip typedefs from the original arguments as in the ordinary
case, for sake of our template argument hashing/equivalence routines
which assume template arguments went through strip_typedefs.
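
Why hashing wants typedef-stripped arguments can be shown with a toy model. This is a sketch under loud assumptions: types are plain strings, the alias table and both helpers are hypothetical, and the alias chain is assumed to be acyclic; none of this mirrors how pt.cc represents template arguments.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>

// Hypothetical alias table: alias spelling -> underlying type spelling.
using alias_map = std::map<std::string, std::string>;

// Follow aliases until reaching a name that is not itself an alias.
// Assumes the table contains no cycles.
std::string strip_aliases (std::string type, const alias_map &aliases)
{
  for (auto it = aliases.find (type); it != aliases.end ();
       it = aliases.find (type))
    type = it->second;
  return type;
}

// Hash a template argument by its spelling; only meaningful for
// equivalence checks if the spelling has been canonicalized first.
std::size_t hash_arg (const std::string &type)
{
  return std::hash<std::string>{} (type);
}
```

Two spellings of the same type, such as "identity<int>" and "int" in this toy table, only hash identically once strip_aliases has been applied to both, which is the invariant the hashing/equivalence routines rely on.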

PR c++/118454

gcc/cp/ChangeLog:

* pt.cc (coerce_template_parms): Strip typedefs in the pack
expansion arg vs non-pack parm early break special case.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/variadic187.C: New test.
---
  gcc/cp/pt.cc |  2 +-
  gcc/testsuite/g++.dg/cpp0x/variadic187.C | 13 +
  2 files changed, 14 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/variadic187.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 537e4c4a494..8cdbf7f65ac 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -9362,7 +9362,7 @@ coerce_template_parms (tree parms,
/* We don't know how many args we have yet, just
 use the unconverted (but unpacked) ones for now.  */
  ggc_free (new_inner_args);
-  new_inner_args = inner_args;
+ new_inner_args = strip_typedefs (inner_args);
  arg_idx = nargs;
break;
  }
diff --git a/gcc/testsuite/g++.dg/cpp0x/variadic187.C b/gcc/testsuite/g++.dg/cpp0x/variadic187.C
new file mode 100644
index 000..af1770e4d89
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/variadic187.C
@@ -0,0 +1,13 @@
+// PR c++/118454
+// { dg-do compile { target c++11 } }
+// { dg-additional-options --param=hash-table-verification-limit=1000 }
+
+template using identity = T;
+
+template struct dual;
+
+template
+using ty1 = dual, Ts...>;
+
+template
+using ty2 = dual;
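
For readers following the thread, the situation described in the commit message can be sketched in a few lines. This is an illustration written for this archive, not a copy of variadic187.C; the parameter lists of `dual`, `ty1` and `ty2` below are assumptions chosen so that the pack expansion `Ts...` lands on a non-pack parameter.

```cpp
#include <type_traits>

// An alias that is merely a typedef for its argument.
template<class T> using identity = T;

// The second parameter is NOT a pack, so an argument list ending in
// "Ts..." may have to fill it; the real arity is unknown until
// instantiation.  (Assumed shape, for illustration only.)
template<class T, class U, class... Us> struct dual {};

// Both aliases name the same specialization of dual once identity<T>
// is seen through, so the compiler must treat their template
// arguments as equivalent.
template<class T, class... Ts> using ty1 = dual<identity<T>, Ts...>;
template<class T, class... Ts> using ty2 = dual<T, Ts...>;

static_assert(std::is_same<ty1<int, char, long>,
                           ty2<int, char, long>>::value,
              "ty1 and ty2 must name the same type");
```

The static_assert only checks the language-level guarantee; the hashing inconsistency fixed by the patch is internal to the compiler and would presumably only show up with hash-table verification enabled, as in the test's options.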

Re: [PATCH v2] c: improve UX for -Wincompatible-pointer-types [PR116871]

2025-01-13 Thread Joseph Myers

On Sun, 12 Jan 2025, David Malcolm wrote:

> So I've dropped the takes_int_p, takes_void_p, and
> maybe_inform_empty_args_c23_transition from the patch.  Here's an
> updated version that keeps the rest of the changes.  I'd like to get
> this into GCC 15 to make build failures due to C23-incompatibilities
> more readable.

Some comments in testcases still repeat the misconception about implicit 
(int) for unprototyped functions.

OK with that fixed.

> diff --git a/gcc/testsuite/gcc.dg/c23-mismatching-fn-ptr-alsatools.c 
> b/gcc/testsuite/gcc.dg/c23-mismatching-fn-ptr-alsatools.c
> new file mode 100644
> index ..e3460e546a9a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/c23-mismatching-fn-ptr-alsatools.c
> @@ -0,0 +1,21 @@
> +/* Examples of a mismatching function pointer types in
> +   legacy code compiled with C23 that assumed () meant (int).

This comment is incorrect: an unprototyped function is not (int).

> diff --git a/gcc/testsuite/gcc.dg/c23-mismatching-fn-ptr.c 
> b/gcc/testsuite/gcc.dg/c23-mismatching-fn-ptr.c
> new file mode 100644
> index ..4db44f48a3f2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/c23-mismatching-fn-ptr.c
> @@ -0,0 +1,76 @@
> +/* Verify that when we complain about incompatible pointer types
> +   involving function pointers, we show the declaration of the
> +   function.  
> +
> +   In particular, check the case of
> +  extern void fn ();
> +   changing meaning in C23 (from taking int to taking void).  */

Likewise.

> +/* Test of storing a sighandler_t where the declaration of the
> +   destination might be relying on implicit int arg, which
> +   becomes void in C23.

Likewise.

-- 
Joseph S. Myers
josmy...@redhat.com

Re: [PATCH] RISC-V: Disallow negative step for interleaving [PR117682].

2025-01-13 Thread Jeff Law





On 12/17/24 8:27 AM, Robin Dapp wrote:

Hi,

in PR117682 we build an interleaving pattern

   { 1, 201, 209, 25, 161, 105, 113, 185, 65, 9,
 17, 89, 225, 169, 177, 249, 129, 73, 81, 153,
 33, 233, 241, 57, 193, 137, 145, 217, 97, 41,
 49, 121 };

with negative step expecting wraparound semantics due to -fwrapv.

For building interleaved patterns we have an optimization that
does e.g.
   {1, 209, ...} = { 1, 0, 209, 0, ...}
and
   {201, 25, ...} >> 8 = { 0, 201, 0, 25, ...}
and IORs those.

The optimization only works if the lowpart bits are zero.  When
overflowing e.g. with a negative step we cannot guarantee this.

This patch makes us fall back to the generic merge handling for negative
steps.

I'm not 100% certain we're good even for positive steps.  If the
step or the vector length is large enough we'd still overflow and
have non-zero lower bits.  I haven't seen this happen during my
testing, though and the patch doesn't make things worse, so...

Regtested on rv64gcv_zvl512b.  Let's see what the CI says.

Regards
  Robin

PR target/117682

gcc/ChangeLog:

* config/riscv/riscv-v.cc (expand_const_vector): Fall back to
merging if either step is negative.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/pr117682.c: New test.

I've pushed this to the trunk.

jeff
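
The failure mode in the quoted description can be reproduced with a small scalar model. The sketch below is not the code in riscv-v.cc; it assumes 8-bit elements packed pairwise into 16-bit lanes, and the steps -48 and +80 are read off the pattern quoted above (modulo 256). The function names are made up for illustration.

```cpp
#include <cstdint>
#include <vector>

// Reference: compute every element with 8-bit wraparound, then interleave.
std::vector<std::uint8_t>
interleave_reference (int base0, int step0, int base1, int step1, int n)
{
  std::vector<std::uint8_t> out;
  for (int i = 0; i < n; ++i)
    {
      out.push_back (static_cast<std::uint8_t> (base0 + i * step0));
      out.push_back (static_cast<std::uint8_t> (base1 + i * step1));
    }
  return out;
}

// Shortcut: compute both series in 16-bit lanes, shift the second one
// by 8 and IOR the two.  This is only correct while the first series
// leaves the upper byte of its lane zero.
std::vector<std::uint8_t>
interleave_via_ior (int base0, int step0, int base1, int step1, int n)
{
  std::vector<std::uint8_t> out;
  for (int i = 0; i < n; ++i)
    {
      std::uint16_t even = static_cast<std::uint16_t> (base0 + i * step0);
      std::uint16_t odd = static_cast<std::uint16_t>
	(static_cast<unsigned> (base1 + i * step1) << 8);
      std::uint16_t lane = static_cast<std::uint16_t> (even | odd);
      out.push_back (static_cast<std::uint8_t> (lane & 0xff));
      out.push_back (static_cast<std::uint8_t> (lane >> 8));
    }
  return out;
}
```

With small positive steps both functions agree. With a step of -48 the 16-bit value of the first series becomes 0xFFD1 at the second element, so the IOR clobbers the byte that should hold 25. A large enough positive step in the first series spills into the upper byte in the same way, which matches the doubt expressed about positive steps.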

Re: [PATCH] RISC-V: Expand shift count in Xmode in interleave pattern.

2025-01-13 Thread Jeff Law





On 12/18/24 3:37 AM, Robin Dapp wrote:

Hi,

currently ssa-dse-1.C ICEs because expand_simple_binop returns NULL
when building the scalar that is used to IOR two interleaving
sequences.

That's because we try to emit a shift in HImode.  This patch shifts in
Xmode and then lowpart-subregs the result to HImode.

Regtested on rv64gcv_zvl512b.

Regards
  Robin

gcc/ChangeLog:

* config/riscv/riscv-v.cc (expand_const_vector): Shift in Xmode.

I've pushed this to the trunk as well.
jeff
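
As a rough scalar analogy for "shift in Xmode, then take the lowpart in HImode": do the shift in a full-width word and truncate afterwards. The sketch assumes a 64-bit word, as on rv64; the function name is hypothetical and the real patch operates on RTL, not on C++ integers.

```cpp
#include <cstdint>

// Shift a 16-bit value by doing the arithmetic in a 64-bit word and
// keeping only the low 16 bits of the result.  COUNT must be below 64.
constexpr std::uint16_t
shift_left_via_word (std::uint16_t value, unsigned count)
{
  return static_cast<std::uint16_t> (static_cast<std::uint64_t> (value)
				     << count);
}
```

Bits shifted past position 15 are simply dropped by the final truncation, which is the behaviour one would expect from a lowpart subreg.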

[COMMITTED 03/14] ada: Warn about redundant parentheses in upper range bounds

2025-01-13 Thread Marc Poulhiès

From: Piotr Trojanek 

Fix a glitch in condition that effectively caused detection of redundant
parentheses in upper range bounds to be dead code.

gcc/ada/ChangeLog:

* par-ch3.adb (P_Discrete_Range): Replace N_Subexpr, which was catching
all subexpressions, with kinds that catch nodes that require
parentheses to become "simple expressions".

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/par-ch3.adb | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/gcc/ada/par-ch3.adb b/gcc/ada/par-ch3.adb
index edea6785512..e58e2a2342b 100644
--- a/gcc/ada/par-ch3.adb
+++ b/gcc/ada/par-ch3.adb
@@ -3070,11 +3070,15 @@ package body Ch3 is
  Check_Simple_Expression (Expr_Node);
  Set_High_Bound (Range_Node, Expr_Node);
 
- --  If Expr_Node (ignoring parentheses) is not a simple expression
- --  then emit a style check.
+ --  If the upper bound doesn't require parentheses, then emit a style
+ --  check. Parentheses that make "expression" syntax nodes a "simple
+ --  expression" are required; we filter those nodes both here and
+ --  inside Check_Xtra_Parens itself.
 
  if Style_Check
-   and then Nkind (Expr_Node) not in N_Op_Boolean | N_Subexpr
+   and then Nkind (Expr_Node) not in N_Membership_Test
+   | N_Op_Boolean
+   | N_Short_Circuit
  then
 Style.Check_Xtra_Parens (Expr_Node);
  end if;
-- 
2.43.0

[COMMITTED 01/14] ada: Fix parsing of raise expressions with no parens

2025-01-13 Thread Marc Poulhiès

From: Piotr Trojanek 

According to Ada grammar, raise expression is an expression, but requires
parens to be a simple_expression. We wrongly classified raise expressions
as expressions, because we mishandled a global state variable in the parser.

This patch causes some illegal code to be rejected.

gcc/ada/ChangeLog:

* par-ch4.adb (P_Relation): Prevent Expr_Form to be overwritten when
parsing the raise expression itself.
(P_Simple_Expression): Fix manipulation of Expr_Form.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/par-ch4.adb | 10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/gcc/ada/par-ch4.adb b/gcc/ada/par-ch4.adb
index 97f9b7ddeb2..3f8d1f1d2e3 100644
--- a/gcc/ada/par-ch4.adb
+++ b/gcc/ada/par-ch4.adb
@@ -2181,8 +2181,9 @@ package body Ch4 is
   --  First check for raise expression
 
   if Token = Tok_Raise then
+ Node1 := P_Raise_Expression;
  Expr_Form := EF_Non_Simple;
- return P_Raise_Expression;
+ return Node1;
   end if;
 
   --  All other cases
@@ -2415,6 +2416,8 @@ package body Ch4 is
 Node1 := P_Term;
  end if;
 
+ Expr_Form := EF_Simple;
+
  --  In the following, we special-case a sequence of concatenations of
  --  string literals, such as "aaa" & "bbb" & ... & "ccc", with nothing
  --  else mixed in. For such a sequence, we return a tree representing
@@ -2530,11 +2533,6 @@ package body Ch4 is
end;
 end if;
  end;
-
- --  All done, we clearly do not have name or numeric literal so this
- --  is a case of a simple expression which is some other possibility.
-
- Expr_Form := EF_Simple;
   end if;
 
   --  If all extensions are enabled and we have a deep delta aggregate
-- 
2.43.0

[COMMITTED 02/14] ada: Add more commentary to System.Val_Real.Large_Powfive

2025-01-13 Thread Marc Poulhiès

From: Eric Botcazou 

gcc/ada/ChangeLog:

* libgnat/s-valrea.adb (Large_Powfive) [2 parameters]: Add a couple
of additional comments.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/libgnat/s-valrea.adb | 5 +
 1 file changed, 5 insertions(+)

diff --git a/gcc/ada/libgnat/s-valrea.adb b/gcc/ada/libgnat/s-valrea.adb
index ed22366840f..aff694dd721 100644
--- a/gcc/ada/libgnat/s-valrea.adb
+++ b/gcc/ada/libgnat/s-valrea.adb
@@ -396,6 +396,9 @@ package body System.Val_Real is
begin
   pragma Assert (Exp > Maxexp);
 
+  --  This routine supports any type but it is not necessary to invoke it
+  --  for large types because the above one is sufficient for them.
+
   pragma Warnings (Off, "-gnatw.a");
   pragma Assert (not Is_Large_Type);
   pragma Warnings (On, "-gnatw.a");
@@ -407,6 +410,8 @@ package body System.Val_Real is
   --  its final value does not overflow but, if it's too large, then do not
   --  bother doing it since overflow is just fine. The scaling factor is -3
   --  for every power of 5 above the maximum, in other words division by 8.
+  --  Note that Maxpow is an upper bound of the span of exponents for which
+  --  scaling is needed, but it's OK to apply it even if it is not needed.
 
   if Exp - Maxexp <= Maxpow then
  S := 3 * (Exp - Maxexp);
-- 
2.43.0

[COMMITTED 06/14] ada: Fix spurious warning about redundant parentheses in range bound

2025-01-13 Thread Marc Poulhiès

From: Piotr Trojanek 

Use the same logic for warning about redundant parentheses in lower and upper
bounds of a discrete range. This fixes a spurious warning that, if followed,
would render the code illegal.

gcc/ada/ChangeLog:

* par-ch3.adb (P_Discrete_Range): Detect redundant parentheses in the
lower bound like in the upper bound.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/par-ch3.adb | 16 +++-
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/gcc/ada/par-ch3.adb b/gcc/ada/par-ch3.adb
index e58e2a2342b..fe727d7c094 100644
--- a/gcc/ada/par-ch3.adb
+++ b/gcc/ada/par-ch3.adb
@@ -3061,7 +3061,16 @@ package body Ch3 is
  Range_Node := New_Node (N_Range, Token_Ptr);
  Set_Low_Bound (Range_Node, Expr_Node);
 
- if Style_Check then
+ --  If the bound doesn't require parentheses, then emit a style
+ --  check. Parentheses that change an "expression" syntax node into a
+ --  "simple expression" are required; we filter those nodes both here
+ --  and inside Check_Xtra_Parens itself.
+
+ if Style_Check
+   and then Nkind (Expr_Node) not in N_Membership_Test
+   | N_Op_Boolean
+   | N_Short_Circuit
+ then
 Style.Check_Xtra_Parens (Expr_Node);
  end if;
 
@@ -3070,10 +3079,7 @@ package body Ch3 is
  Check_Simple_Expression (Expr_Node);
  Set_High_Bound (Range_Node, Expr_Node);
 
- --  If the upper bound doesn't require parentheses, then emit a style
- --  check. Parentheses that make "expression" syntax nodes a "simple
- --  expression" are required; we filter those nodes both here and
- --  inside Check_Xtra_Parens itself.
+ --  Check for extra parentheses like for the lower bound
 
  if Style_Check
and then Nkind (Expr_Node) not in N_Membership_Test
-- 
2.43.0

Un-XFAIL 'dg-note's in 'gfortran.dg/goacc/routine-external-level-of-parallelism-2.f' (was: [Patch] Fortran: Fix location_t in gfc_get_extern_function_decl; support 'omp dispatch interop')

2025-01-13 Thread Thomas Schwinge

Hi!

On 2025-01-10T13:33:25+0100, Tobias Burnus  wrote:
> The first change is a simple, generic Fortran change.
>
> Without it, external declarations have odd locations
> (namely their input_location):
>
> gcc/testsuite/gfortran.dg/gomp/dispatch-11.f90:67:46:
>
> 67 |   !$omp dispatch interop(obj2, obj1) device(3)
>|  ^
> note: ‘declare variant’ candidate ‘repl2’ declared here
>
>
> While with the change, i.e. gfc_get_location (&sym->declared_at),
> we get:
>
> gcc/testsuite/gfortran.dg/gomp/dispatch-11.f90:25:5:
>
> 25 | subroutine base2 (x, y)
>| ^~~~
> note: ‘base2’ declared here

> Additionally, this patch adds the 'interop' clause to OpenMP's
> 'dispatch' directive.

Separate commit?  ;-|

> Fortran: Fix location_t in gfc_get_extern_function_decl; [...]
>
> The declaration created by gfc_get_extern_function_decl used input_location
> as DECL_SOURCE_LOCATION, which gave rather odd results with 'declared here'
> diagnostics.  It is much more useful to use the gfc_symbol's declared_at,
> which this commit now does.

> --- a/gcc/fortran/trans-decl.cc
> +++ b/gcc/fortran/trans-decl.cc
> @@ -2412,7 +2412,7 @@ module_sym:
>  
>type = gfc_get_function_type (sym, actual_args, fnspec);
>  
> -  fndecl = build_decl (input_location,
> +  fndecl = build_decl (gfc_get_location (&sym->declared_at),
>  FUNCTION_DECL, name, type);
>  
>/* Initialize DECL_EXTERNAL and TREE_PUBLIC before calling decl_attributes;

Nice, thanks!

> --- 
> a/gcc/testsuite/gfortran.dg/goacc/routine-external-level-of-parallelism-2.f
> +++ 
> b/gcc/testsuite/gfortran.dg/goacc/routine-external-level-of-parallelism-2.f
> @@ -7,6 +7,13 @@
>integer, parameter :: n = 100
>integer :: a(n), i, j
>external :: gangr, workerr, vectorr, seqr
> +! { dg-bogus "note: routine 'gangr' declared here" "TODO1" { xfail { ! 
> offloading_enabled } } .-1 }
> +! { dg-bogus "note: routine 'gangr_' declared here" "TODO2" { xfail 
> offloading_enabled } .-2 }
> +! { dg-bogus "note: routine 'workerr' declared here" "TODO1" { xfail { ! 
> offloading_enabled } } .-3 }
> +! { dg-bogus "note: routine 'workerr_' declared here" "TODO2" { xfail 
> offloading_enabled } .-4 }
> +! { dg-bogus "note: routine 'vectorr' declared here" "TODO1" { xfail { ! 
> offloading_enabled } } .-5 }
> +! { dg-bogus "note: routine 'vectorr_' declared here" "TODO2" { xfail 
> offloading_enabled } .-6 }
> +
>  !$acc routine (gangr) gang
>  !$acc routine (workerr) worker
>  !$acc routine (vectorr) vector
> @@ -22,8 +29,6 @@
>  ! { dg-warning "insufficient partitioning available to parallelize loop" "" 
> { target *-*-* } .-1 }
>   do j = 1, n
>  call workerr (a, n) ! { dg-message "optimized: assigned OpenACC 
> worker vector loop parallelism" }
> -! { dg-bogus "note: routine 'workerr' declared here" "TODO1" { xfail { ! 
> offloading_enabled } } .-1 }
> -! { dg-bogus "note: routine 'workerr_' declared here" "TODO2" { xfail 
> offloading_enabled } .-2 }
>   end do
>end do
>  !$acc end parallel loop
> @@ -36,8 +41,6 @@
>   do j = 1, n
>  call gangr (a, n) ! { dg-message "optimized: assigned OpenACC 
> worker vector loop parallelism" }
>  ! { dg-error "routine call uses same OpenACC parallelism as containing loop" 
> "" { target *-*-* } .-1 }
> -! { dg-bogus "note: routine 'gangr' declared here" "TODO1" { xfail { ! 
> offloading_enabled } } .-2 }
> -! { dg-bogus "note: routine 'gangr_' declared here" "TODO2" { xfail 
> offloading_enabled } .-3 }
>   end do
>end do
>  !$acc end parallel loop
> @@ -162,8 +165,6 @@
>  !$acc parallel loop ! { dg-message "optimized: assigned OpenACC gang worker 
> loop parallelism" }
>do i = 1, n
>   call vectorr (a, n) ! { dg-message "optimized: assigned OpenACC 
> vector loop parallelism" }
> -! { dg-bogus "note: routine 'vectorr' declared here" "TODO1" { xfail { ! 
> offloading_enabled } } .-1 }
> -! { dg-bogus "note: routine 'vectorr_' declared here" "TODO2" { xfail 
> offloading_enabled } .-2 }
>end do
>  !$acc end parallel loop
>  
> @@ -199,6 +200,13 @@
>integer, parameter :: n = 100
>integer :: a(n), i, j
>integer, external :: gangf, workerf, vectorf, seqf
> +! { dg-bogus "note: routine 'gangf' declared here" "TODO1" { xfail { ! 
> offloading_enabled } } .-1 }
> +! { dg-bogus "note: routine 'gangf_' declared here" "TODO2" { xfail 
> offloading_enabled } .-2 }
> +! { dg-bogus "note: routine 'workerf' declared here" "TODO1" { xfail { ! 
> offloading_enabled } } .-3 }
> +! { dg-bogus "note: routine 'workerf_' declared here" "TODO2" { xfail 
> offloading_enabled } .-4 }
> +! { dg-bogus "note: routine 'vectorf' declared here" "TODO1" { xfail { ! 
> offloading_enabled } } .-5 }
> +! { dg-bogus "note: routine 'vectorf_' declared here" "TODO2" { xfail 
> offloading_enabled } .-6 }
> +
>  !$acc routine (gangf) gang
>  !$acc

Re: [PATCH] lto: Remove link() to fix build with MinGW [PR118238]

2025-01-13 Thread Richard Biener

On Mon, 13 Jan 2025, Michal Jires wrote:

> I used link() to create cheap copies of Incremental LTO cache contents
> to prevent their deletion once linking is finished.
> This is unnecessary, since output_files are deleted in our lto-plugin
> and not in the linker itself.
> 
> Bootstrapped/regtested on x86_64-linux.
> lto-wrapper now again builds on MinGW. Though so far I have not setup
> MinGW to be able to do full bootstrap.
> Ok for trunk?

OK.

Richard.

>   PR lto/118238
> 
> gcc/ChangeLog:
> 
>   * lto-wrapper.cc (run_gcc): Remove link() copying.
> 
> lto-plugin/ChangeLog:
> 
>   * lto-plugin.c (cleanup_handler):
>   Keep output_files when using Incremental LTO.
>   (onload): Detect Incremental LTO.
> ---
>  gcc/lto-wrapper.cc  | 34 +-
>  lto-plugin/lto-plugin.c |  9 +++--
>  2 files changed, 12 insertions(+), 31 deletions(-)
> 
> diff --git a/gcc/lto-wrapper.cc b/gcc/lto-wrapper.cc
> index f9b2511c38e..a980b208783 100644
> --- a/gcc/lto-wrapper.cc
> +++ b/gcc/lto-wrapper.cc
> @@ -1571,6 +1571,8 @@ run_gcc (unsigned argc, char *argv[])
> /* Exists.  */
> if (access (option->arg, W_OK) == 0)
>   ltrans_cache_dir = option->arg;
> +   else
> + fatal_error (input_location, "missing directory: %s", option->arg);
> break;
>  
>   case OPT_flto_incremental_cache_size_:
> @@ -2218,39 +2220,13 @@ cont:
>   {
> for (i = 0; i < nr; ++i)
>   {
> -   char *input_name = input_names[i];
> -   char const *output_name = output_names[i];
> -
> ltrans_file_cache::item* item;
> -   item = ltrans_cache.get_item (input_name);
> +   item = ltrans_cache.get_item (input_names[i]);
>  
> -   if (item && !save_temps)
> +   if (item)
>   {
> +   /* Ensure LTRANS for this item finished.  */
> item->lock.lock_read ();
> -   /* Ensure that cached compiled file is not deleted.
> -  Create copy.  */
> -
> -   obstack_grow (&env_obstack, output_name,
> - strlen (output_name) - 2);
> -   obstack_grow (&env_obstack, ".cache_copy.XXX.o",
> - sizeof (".cache_copy.XXX.o"));
> -
> -   char* output_name_link = XOBFINISH (&env_obstack, char *);
> -   char* name_idx = output_name_link + strlen (output_name_link)
> -- strlen ("XXX.o");
> -
> -   /* lto-wrapper can run in parallel and access
> -  the same partition.  */
> -   for (int j = 0; ; j++)
> - {
> -   gcc_assert (j < 1000);
> -   sprintf (name_idx, "%03d.o", j);
> -
> -   if (link (output_name, output_name_link) != EEXIST)
> - break;
> - }
> -
> -   output_names[i] = output_name_link;
> item->lock.unlock ();
>   }
>   }
> diff --git a/lto-plugin/lto-plugin.c b/lto-plugin/lto-plugin.c
> index 6bccb56291c..6c78d019cf1 100644
> --- a/lto-plugin/lto-plugin.c
> +++ b/lto-plugin/lto-plugin.c
> @@ -214,6 +214,7 @@ static char *ltrans_objects = NULL;
>  
>  static bool debug;
>  static bool save_temps;
> +static bool flto_incremental;
>  static bool verbose;
>  static char nop;
>  static char *resolution_file = NULL;
> @@ -941,8 +942,9 @@ cleanup_handler (void)
>if (arguments_file_name)
>  maybe_unlink (arguments_file_name);
>  
> -  for (i = 0; i < num_output_files; i++)
> -maybe_unlink (output_files[i]);
> +  if (!flto_incremental)
> +for (i = 0; i < num_output_files; i++)
> +  maybe_unlink (output_files[i]);
>  
>free_2 ();
>return LDPS_OK;
> @@ -1615,6 +1617,9 @@ onload (struct ld_plugin_tv *tv)
>if (strstr (collect_gcc_options, "'-save-temps'"))
>   save_temps = true;
>  
> +  if (strstr (collect_gcc_options, "'-flto-incremental="))
> + flto_incremental = true;
> +
>if (strstr (collect_gcc_options, "'-v'")
>|| strstr (collect_gcc_options, "'--verbose'"))
>   verbose = true;
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
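
The plugin-side logic of the quoted patch boils down to two decisions, sketched here as free functions. The names `uses_incremental_lto` and `files_to_unlink` are hypothetical; only the option-string test mirrors the strstr call in the diff, and the real cleanup handler unlinks files directly rather than returning a list.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Detect incremental LTO the way the patch does: look for the quoted
// option prefix in the collected GCC options string.
bool
uses_incremental_lto (const char *collect_gcc_options)
{
  return std::strstr (collect_gcc_options, "'-flto-incremental=") != nullptr;
}

// Decide which output files the cleanup step may delete.  With
// incremental LTO the outputs live in the cache and must survive, so
// nothing is returned.
std::vector<std::string>
files_to_unlink (const std::vector<std::string> &output_files,
		 bool incremental)
{
  if (incremental)
    return {};
  return output_files;
}
```

Because the outputs are never deleted in the incremental case, no hard-link copies are needed, and that removes the dependency on link().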

[Regression] [PATCH] internal-fn: Do not force vcond operand to reg.

2025-01-13 Thread Torbjorn SVENSSON


Hi Richard and Robin,

It looks like this patch introduced a regression with MVE (Cortex-M55 and 
Cortex-M85).

If I try to build testsuite/c-c++-common/vector-compare-3.c (there are other 
test cases that fail with a similar ICE):

arm-none-eabi-gcc /src/gcc/testsuite/c-c++-common/vector-compare-3.c 
-march=armv8.1-m.main+mve+fp.dp -mfloat-abi=hard -mfpu=auto -Wc++-compat -O2 -S 
-o /dev/null
/src/gcc/testsuite/c-c++-common/vector-compare-3.c: In function 'g':
/src/gcc/testsuite/c-c++-common/vector-compare-3.c:24:1: error: unrecognizable 
insn:
(insn 26 25 27 2 (set (reg:V4SI 137)
(unspec:V4SI [
(reg:V4SI 144)
(reg:V4SI 145)
(subreg:V4BI (reg:HI 143) 0)
] VPSELQ_S)) 
"/src/gcc/testsuite/c-c++-common/vector-compare-3.c":23:6 -1
 (nil))
during RTL pass: vregs
/src/gcc/testsuite/c-c++-common/vector-compare-3.c:24:1: internal compiler 
error: in extract_insn, at recog.cc:2882
0x20d7ca8 diagnostic_context::diagnostic_impl(rich_location*, 
diagnostic_metadata const*, diagnostic_option_id, char const*, __va_list_tag 
(*) [1], diagnostic_t)
???:0
0x20eddd2 internal_error(char const*, ...)
???:0
0x726ee5 fancy_abort(char const*, int, char const*)
???:0
0x7166bf _fatal_insn(char const*, rtx_def const*, char const*, int, char const*)
???:0
0x7166de _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
???:0
0xe952f7 extract_insn(rtx_insn*)
???:0
Please submit a full bug report, with preprocessed source (by using 
-freport-bug).
Please include the complete backtrace with any bug report.
See  for instructions.


Above is with r15-6752-g08b6e875c6b.

Please take a look and see if this can easily be fixed.

Kind regards,
Torbjörn


On 2024-05-13 16:14, Robin Dapp wrote:

What happens if we simply remove all of the force_reg here?


On x86 I bootstrapped and tested the attached without fallout
(gcc188, so it's no avx512-native machine and therefore limited
coverage).  riscv regtest is unchanged.
For aarch64 I would like to rely on the pre-commit CI to pick it
up (does that work on sub-threads?).

Regards
  Robin


gcc/ChangeLog:

PR middle-end/113474

* internal-fn.cc (expand_vec_cond_mask_optab_fn):  Remove
force_regs.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/pr113474.c: New test.
---
  gcc/internal-fn.cc  |  3 ---
  .../gcc.target/riscv/rvv/autovec/pr113474.c | 13 +
  2 files changed, 13 insertions(+), 3 deletions(-)
  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c

diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index 2c764441cde..4d226c478b4 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -3163,9 +3163,6 @@ expand_vec_cond_mask_optab_fn (internal_fn, gcall *stmt, 
convert_optab optab)
rtx_op1 = expand_normal (op1);
rtx_op2 = expand_normal (op2);
  
-  mask = force_reg (mask_mode, mask);
-  rtx_op1 = force_reg (mode, rtx_op1);
-
rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
create_output_operand (&ops[0], target, mode);
create_input_operand (&ops[1], rtx_op1, mode);
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c
new file mode 100644
index 000..0364bf9f5e3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c
@@ -0,0 +1,13 @@
+/* { dg-do compile { target riscv_v } }  */
+/* { dg-additional-options "-std=c99" }  */
+
+void
+foo (int n, int **a)
+{
+  int b;
+  for (b = 0; b < n; b++)
+for (long e = 8; e > 0; e--)
+  a[b][e] = a[b][e] == 15;
+}
+
+/* { dg-final { scan-assembler "vmerge.vim" } }  */

Re: [Regression] [PATCH] internal-fn: Do not force vcond operand to reg.

2025-01-13 Thread Christophe Lyon





On 1/13/25 15:05, Torbjorn SVENSSON wrote:

Hi Richard and Robin,

It looks like this patch introduced a regression with MVE (Cortex-M55 
and Cortex-M85).


If I try to build testsuite/c-c++-common/vector-compare-3.c (there are 
other test cases that fail with a similar ICE):


arm-none-eabi-gcc /src/gcc/testsuite/c-c++-common/vector-compare-3.c 
-march=armv8.1-m.main+mve+fp.dp -mfloat-abi=hard -mfpu=auto -Wc++-compat 
-O2 -S -o /dev/null

/src/gcc/testsuite/c-c++-common/vector-compare-3.c: In function 'g':
/src/gcc/testsuite/c-c++-common/vector-compare-3.c:24:1: error: 
unrecognizable insn:

(insn 26 25 27 2 (set (reg:V4SI 137)
     (unspec:V4SI [
     (reg:V4SI 144)
     (reg:V4SI 145)
     (subreg:V4BI (reg:HI 143) 0)
     ] VPSELQ_S)) 
"/src/gcc/testsuite/c-c++-common/vector-compare-3.c":23:6 -1

  (nil))
during RTL pass: vregs
/src/gcc/testsuite/c-c++-common/vector-compare-3.c:24:1: internal 
compiler error: in extract_insn, at recog.cc:2882
0x20d7ca8 diagnostic_context::diagnostic_impl(rich_location*, 
diagnostic_metadata const*, diagnostic_option_id, char const*, 
__va_list_tag (*) [1], diagnostic_t)

     ???:0
0x20eddd2 internal_error(char const*, ...)
     ???:0
0x726ee5 fancy_abort(char const*, int, char const*)
     ???:0
0x7166bf _fatal_insn(char const*, rtx_def const*, char const*, int, char 
const*)

     ???:0
0x7166de _fatal_insn_not_found(rtx_def const*, char const*, int, char 
const*)

     ???:0
0xe952f7 extract_insn(rtx_insn*)
     ???:0
Please submit a full bug report, with preprocessed source (by using 
-freport-bug).

Please include the complete backtrace with any bug report.
See  for instructions.


Above is with r15-6752-g08b6e875c6b.

Please take a look and see if this can easily be fixed.


I think this is https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115439

I'll have a look.

Thanks,

Christophe




Kind regards,
Torbjörn


On 2024-05-13 16:14, Robin Dapp wrote:

What happens if we simply remove all of the force_reg here?


On x86 I bootstrapped and tested the attached without fallout
(gcc188, so it's no avx512-native machine and therefore limited
coverage).  riscv regtest is unchanged.
For aarch64 I would like to rely on the pre-commit CI to pick it
up (does that work on sub-threads?).

Regards
  Robin


gcc/ChangeLog:

PR middle-end/113474

* internal-fn.cc (expand_vec_cond_mask_optab_fn):  Remove
force_regs.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/pr113474.c: New test.
---
  gcc/internal-fn.cc  |  3 ---
  .../gcc.target/riscv/rvv/autovec/pr113474.c | 13 +
  2 files changed, 13 insertions(+), 3 deletions(-)
  create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c


diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index 2c764441cde..4d226c478b4 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -3163,9 +3163,6 @@ expand_vec_cond_mask_optab_fn (internal_fn, 
gcall *stmt, convert_optab optab)

    rtx_op1 = expand_normal (op1);
    rtx_op2 = expand_normal (op2);
-  mask = force_reg (mask_mode, mask);
-  rtx_op1 = force_reg (mode, rtx_op1);
-
    rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
    create_output_operand (&ops[0], target, mode);
    create_input_operand (&ops[1], rtx_op1, mode);
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c

new file mode 100644
index 000..0364bf9f5e3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113474.c
@@ -0,0 +1,13 @@
+/* { dg-do compile { target riscv_v } }  */
+/* { dg-additional-options "-std=c99" }  */
+
+void
+foo (int n, int **a)
+{
+  int b;
+  for (b = 0; b < n; b++)
+    for (long e = 8; e > 0; e--)
+  a[b][e] = a[b][e] == 15;
+}
+
+/* { dg-final { scan-assembler "vmerge.vim" } }  */

Re: [PATCH] Accept commas between clauses in OpenMP declare variant

2025-01-13 Thread Tobias Burnus


Hi PA,

Paul-Antoine Arras wrote:

I am not sure I am getting that part. Is this what you are suggesting?


Yes, something like that, but not quite, as you found out.

I think we need something like the following (untested):


diff --git gcc/fortran/openmp.cc gcc/fortran/openmp.cc
index 9d28dc9..e3abbeeef98 100644
--- gcc/fortran/openmp.cc
+++ gcc/fortran/openmp.cc
@@ -6532,7 +6532,6 @@ gfc_match_omp_context_selector_specification 
(gfc_omp_declare_variant *odv)

 match
 gfc_match_omp_declare_variant (void)
 {
-  bool first_p = true;
   char buf[GFC_MAX_SYMBOL_LEN + 1];

   if (gfc_match (" (") != MATCH_YES)
@@ -6590,7 +6589,7 @@ gfc_match_omp_declare_variant (void)
   return MATCH_ERROR;
 }

-  bool has_match = false, has_adjust_args = false;
+  bool has_match = false, has_adjust_args = false, error_p = false;
   locus adjust_args_loc;

   for (;;)
@@ -6614,13 +6613,9 @@ gfc_match_omp_declare_variant (void)
 }
   else
 {
-  if (first_p)
-    {
-  gfc_error ("expected % or % at %C");
-  return MATCH_ERROR;
-    }
-  else
-    break;
+  if (!has_match)



if (gfc_match_omp_eos () != MATCH_YES)



+ error_p = true;
+  break;
 }

   if (gfc_match (" (") != MATCH_YES)
@@ -,8 +6661,12 @@ gfc_match_omp_declare_variant (void)
 for (gfc_omp_namelist *n = *head; n != NULL; n = n->next)
   n->u.need_device_ptr = true;
 }
+    }

-  first_p = false;
+  if (error_p)



if (error || (!has_match && !has_adjust_args))

as the missing 'match' is handled more explicitly by the next error.


+ {
+  gfc_error ("expected % or % at %C");
+  return MATCH_ERROR;
 }


* * *

The rest looks good to me.

Thanks,

Tobias
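
The control flow suggested above (optional commas between clauses, one error flag, and a single check after the loop) can be tried out on a toy model. The sketch works on pre-split tokens instead of gfortran's matcher functions; `parse_clauses`, `ParseResult` and the treatment of a trailing comma are assumptions made for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

enum class ParseResult { Ok, ExpectedClause, MissingMatch };

// TOKENS holds clause names and separators, for example
// {"match", ",", "adjust_args"}.  Clause arguments are left out.
ParseResult
parse_clauses (const std::vector<std::string> &tokens)
{
  bool has_match = false, has_adjust_args = false, error_p = false;

  std::size_t i = 0;
  while (i < tokens.size ())
    {
      // A comma is accepted between clauses, not before the first one.
      if (i > 0 && tokens[i] == ",")
	{
	  ++i;
	  if (i == tokens.size ())
	    {
	      error_p = true;	// Trailing comma: a clause must follow.
	      break;
	    }
	}

      if (tokens[i] == "match")
	has_match = true;
      else if (tokens[i] == "adjust_args")
	has_adjust_args = true;
      else
	{
	  error_p = true;	// Neither a known clause nor end of input.
	  break;
	}
      ++i;
    }

  // One generic error for junk or for no clause at all ...
  if (error_p || (!has_match && !has_adjust_args))
    return ParseResult::ExpectedClause;
  // ... and a more specific one when only 'match' is missing.
  if (!has_match)
    return ParseResult::MissingMatch;
  return ParseResult::Ok;
}
```

Keeping the generic error out of the loop is what makes the `first_p` flag unnecessary: the loop only records what it saw, and the two checks afterwards pick the diagnostic.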

[PATCH] c++: Delete defaulted operator <=> if std::strong_ordering::equal doesn't convert to its rettype [PR118387]

2025-01-13 Thread Jakub Jelinek

On Fri, Jan 10, 2025 at 12:04:53PM -0500, Jason Merrill wrote:
> > Note, the PR raises another problem.
> > If on the same testcase the B b; line is removed, we silently synthetize
> > operator<=> which will crash at runtime due to returning without a return
> > statement.  That is because the standard says that in that case
> > it should return static_cast<R>(std::strong_ordering::equal);
> > but I can't find anywhere wording which would say that if that isn't
> > valid, the function is deleted.
> > https://eel.is/c++draft/class.compare#class.spaceship-2.2
> > seems to talk just about cases where there are some members and their
> > comparison is invalid it is deleted, but here there are none and it
> > follows
> > https://eel.is/c++draft/class.compare#class.spaceship-3.sentence-2
> > So, we synthetize with tf_none, see the static_cast is invalid, don't
> > add error_mark_node statement silently, but as the function isn't deleted,
> > we just silently emit it.
> > Should the standard be amended to say that the operator should be deleted
> > even if it has no elements and the static cast from
> > https://eel.is/c++draft/class.compare#class.spaceship-3.sentence-2
> > is invalid?
> 
> That seems pretty obviously what we want, and is what the other compilers
> implement.

So like this?

Will you handle the defect report (unless you think nothing needs to be
clarified), or should I file something?

2025-01-13  Jakub Jelinek  

PR c++/118387
* method.cc (build_comparison_op): Set bad if
std::strong_ordering::equal doesn't convert to rettype.

* g++.dg/cpp2a/spaceship-err6.C: Expect another error.
* g++.dg/cpp2a/spaceship-synth17.C: Likewise.
* g++.dg/cpp2a/spaceship-synth-neg6.C: Likewise.
* g++.dg/cpp2a/spaceship-synth-neg7.C: New test.

--- gcc/cp/method.cc.jj 2025-01-11 21:58:05.387588681 +0100
+++ gcc/cp/method.cc2025-01-13 16:19:09.896650756 +0100
@@ -1635,6 +1635,26 @@ build_comparison_op (tree fndecl, bool d
  rettype = common_comparison_type (comps);
  apply_deduced_return_type (fndecl, rettype);
}
+  else if (code == SPACESHIP_EXPR && cat_tag_for (rettype) == cc_last)
+   {
+ /* The return value is ... and
+static_cast<R>(std::strong_ordering::equal) otherwise.
+Make sure to delete or diagnose if such a static cast is not
+valid.  */
+ tree seql = lookup_comparison_result (cc_strong_ordering,
+   "equal", complain);
+ if (seql == error_mark_node)
+   bad = true;
+ else if (!can_convert (rettype, TREE_TYPE (seql), complain))
+   {
+ if (complain & tf_error)
+   error_at (info.loc,
+ "%<std::strong_ordering::equal%> does not convert "
+ "to %qD return type %qT",
+ fndecl, rettype);
+ bad = true;
+   }
+   }
   if (bad)
{
  DECL_DELETED_FN (fndecl) = true;
--- gcc/testsuite/g++.dg/cpp2a/spaceship-err6.C.jj  2021-04-14 
19:19:14.050804249 +0200
+++ gcc/testsuite/g++.dg/cpp2a/spaceship-err6.C 2025-01-13 16:30:13.613331069 
+0100
@@ -10,7 +10,7 @@ class MyClass
 public:
   MyClass(int value): mValue(value) {}
 
-  bool operator<=>(const MyClass&) const = default;
+  bool operator<=>(const MyClass&) const = default;// { dg-error 
"'std::strong_ordering::equal' does not convert to 'constexpr bool 
MyClass::operator<=>\\\(const MyClass&\\\) const' return type 'bool'" }
 };
 
 int main()
--- gcc/testsuite/g++.dg/cpp2a/spaceship-synth17.C.jj   2025-01-11 
21:58:05.460587663 +0100
+++ gcc/testsuite/g++.dg/cpp2a/spaceship-synth17.C  2025-01-13 
16:32:11.383677413 +0100
@@ -8,7 +8,7 @@ struct B {};
 struct A
 {
   B b; // { dg-error "no match for 'operator<=>' in 
'\[^\n\r]*' \\\(operand types are 'B' and 'B'\\\)" }
-  int operator<=> (const A &) const = default;
+  int operator<=> (const A &) const = default; // { dg-error 
"'std::strong_ordering::equal' does not convert to 'constexpr int 
A::operator<=>\\\(const A&\\\) const' return type 'int'" }
 };
 
 int
--- gcc/testsuite/g++.dg/cpp2a/spaceship-synth-neg6.C.jj2021-08-12 
20:37:12.696473756 +0200
+++ gcc/testsuite/g++.dg/cpp2a/spaceship-synth-neg6.C   2025-01-13 
16:48:22.482043534 +0100
@@ -5,7 +5,7 @@
 
 struct S {
   int a;   // { dg-error "three-way comparison of 'S::a' 
has type 'std::strong_ordering', which does not convert to 'int\\*'" }
-  int *operator<=>(const S&) const = default;
+  int *operator<=>(const S&) const = default;  // { dg-error 
"'std::strong_ordering::equal' does not convert to 'constexpr int\\* 
S::operator<=>\\\(const S&\\\) const' return type 'int\\*'" }
 };
 
 bool b = S{} < S{};// { dg-error "use of deleted function 
'constexpr int\\* S::operator<=>\\\(const S&\\\) const'" }
--- gcc/testsuite/g++.dg/cpp2a/spaceship-synth-neg7.C.jj2025-01-13 
16:19:09

Re: [PATCH] tree-optimization/92539 - missed optimization leads to bogus -Warray-bounds

2025-01-13 Thread Jeff Law





On 1/13/25 7:47 AM, Richard Biener wrote:

On Mon, 13 Jan 2025, Richard Biener wrote:


The following makes niter analysis recognize a loop with an exit
condition scanning over a STRING_CST.  This is done via enhancing
the force evaluation code rather than recognizing for example
strlen (s) as number of iterations because it allows to handle
some more cases.

STRING_CSTs are easy to handle since nothing can write to them, also
processing those should be cheap.  I'd appreciate another eye on
the constraints I put in.

Note to avoid the -Warray-bounds diagnostic we have to early unroll
the loop (there's no final value replacement done, there's a PR
for doing this as part of CD-DCE when possibly eliding a loop).
This works for strings up to 8 chars (including the '\0') only
(rather than 16, the unroll niter limit) because unroll estimation
will not see that the load from the string constant goes away.

Final value replacement doesn't work since ivcanon is now after it,
it's not the time to move the pass though.  The pass is in theory
supposed to add a canonical IV for the _by_eval cases, but we
didn't "fix" this when we added cunrolli (we probably should have
moved ivcanon very early, or made cunroll add such IV if we
used _by_eval but did not unroll).

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.


Some testsuite adjustments are necessary, but the following followup
handles enabling final value replacement by forcing a canonical IV
from cunrolli when we didn't unroll but used force-evaluation to
compute niter.  It also makes us handle the new testcase which ends
up with POINTER_PLUS_EXPR for the initial value.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Any thoughts?

Thanks,
Richard.

 From 705a287694404aafa72bbdc9da21dd1bf448cd85 Mon Sep 17 00:00:00 2001
From: Richard Biener 
Date: Mon, 13 Jan 2025 15:39:07 +0100
Subject: [PATCH] tree-optimization/92539 - handle more cases
To: gcc-patches@gcc.gnu.org

The following on top of the previous fix also handles POINTER_PLUS_EXPR
of &"Foo" and a constant offset as happens for the added testcase
as well as making sure to add a canonical IV when we figured niter
by force evaluation during cunrolli so that work isn't wasted,
DCE can eliminate the load and SCCP perform final value replacement.

PR tree-optimization/92539
* tree-ssa-loop-ivcanon.cc (canonicalize_loop_induction_variables):
When niter was computed constant by force evaluation add a
canonical IV if we didn't unroll.
* tree-ssa-loop-niter.cc (loop_niter_by_eval): Use
split_constant_offset to get at a STRING_CST and an initial
constant offset.

* gcc.dg/tree-ssa/sccp-16.c: New testcase.

The ivcanon code looks reasonable in isolation, so no worries there if 
you're happy with the extra recorded IV.   The niter code just 
generalizes some existing bits slightly.


So as long as you're happy with the ivcanon bits, I think it's reasonable.

jeff
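
To make the thread above concrete, here is a minimal C++ sketch of the kind of loop under discussion and of counting its iterations by brute-force evaluation. The function names and the iteration cap are illustrative assumptions, not GCC internals; the real analysis works on GIMPLE rather than on source-level strings.

```cpp
#include <cassert>
#include <cstddef>

// A loop whose exit condition scans a string constant.  Nothing can write
// to the literal, so the trip count is fixed at compile time.
std::size_t scan_literal()
{
  const char *s = "Foo";
  std::size_t n = 0;
  while (s[n] != '\0')
    ++n;
  return n;
}

// Hypothetical model of "force evaluation": step the loop up to MAX_ITER
// times and return the iteration at which the exit test first succeeds,
// or -1 if no exit was seen within the cap.  START models a constant
// initial offset into the string, as in &"Foo" + 1.
long niter_by_eval_sketch(const char *str, std::size_t start, long max_iter)
{
  for (long i = 0; i < max_iter; ++i)
    if (str[start + i] == '\0')
      return i;
  return -1;
}

void run_niter_examples()
{
  assert(scan_literal() == 3);
  assert(niter_by_eval_sketch("Foo", 0, 16) == 3);
  assert(niter_by_eval_sketch("Foo", 1, 16) == 2);          // offset start
  assert(niter_by_eval_sketch("0123456789", 0, 4) == -1);   // cap reached
}
```

The offset variant corresponds to the POINTER_PLUS_EXPR case mentioned in the follow-up: the scan starts part-way into the constant, so the count shrinks by the offset.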

Re: [PATCH] aarch64: Provide initial specifications for Apple CPU cores.

2025-01-13 Thread Andrew Carlotti

On Sat, Jan 11, 2025 at 01:21:13PM +, Iain Sandoe wrote:
> Hi,
> 
> I originally made this patch for the Darwin Arm64 development branch,
> however in discussions on IRC, it seems that it is also relevant to
> Linux - since there are implementations running on Apple hardware with
> the M1..3 CPUs.  It might also be helpful to the resolution of
> PR113257 - although it is not a solution on its own.
> 
> Bootstrapped and tested manually (that it gives the expected .arch lines)
> on aarch64-linux.
> 
> OK for trunk?
> thanks
> Iain
> 
> --- 8< ---
> 
> This covers the M1-M3 cores used in Apple desktop hardware that is also
> sometimes used with Linux as the OS.
> 
> It does not cover the wider range that might be used in iOS and other
> embedded platform versions.
> 
> Some of the content is estimates/best guesses - based on the following
> public sources of information:
>  * XNU (only for the Apple Implementer ID)
>  * sysctl -a | grep hw on various M1, M2 and machines
>  * AArch64.td from the Apple Open Source repo for LLVM.
>  * What XCode-14 clang passes to cc1.
> 
> Unfortunately, these sources are in conflict; in particular the clang-claimed
> feature set disagrees with the output of sysctl -a, and the base Arm revs.
> claimed in some cases miss features that ARM DDI 0487J.a lists as mandatory
> for the rev.
> 
> This latter point might not be actually significant - but for the sake of
> caution I've made the spec use the lower arch rev + the additional features
> that are consistently claimed by both sysctl and clang.
> 
> GCC does not seem to have a scheduler that is similar to the "Cyclone" one
> in LLVM - so I've guessed to use cortexa57 (but, maybe we miss 8-issue, it's
> not clear - and my experience with the scheduler is ≈ 0).
> 
> Likewise we do not (yet) have specific cost models, so choose the generic
> Armv8 one.
> 
> Thus, the choices here are intended to be conservative.
> 
>  * Currently, we do not seem to have any way to specify that M2/M3 has support
>   for FEAT_BTI, but because of missing features is not compliant with the Arm
>   base rev that implies this.

Since FEAT_BTI only adds hint instructions, I don't think any part of the
compiler actually checks for whether the feature is supported.  Whether or not
to emit FEAT_BTI instructions is controlled by a different compiler option.

>  * Proper version numbers are not readily available.
>  * Since we have FIRESTORM/ICESTORM and similar pairs for the performance and
>efficiency cores on various machines, perhaps we should be using a big.LITTLE
>configuration; OTOH currently, I have no idea if that is usable in any way
>with the hardware as configured.
> 
> gcc/ChangeLog:
> 
>   * config/aarch64/aarch64-cores.def (AARCH64_CORE): Add apple-a12,
>   apple-m1, apple-m2, apple-m3.
>   * config/aarch64/aarch64-tune.md: Regenerate.
> 
> Signed-off-by: Iain Sandoe 
> ---
>  gcc/config/aarch64/aarch64-cores.def | 12 
>  gcc/config/aarch64/aarch64-tune.md   |  2 +-
>  2 files changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/gcc/config/aarch64/aarch64-cores.def 
> b/gcc/config/aarch64/aarch64-cores.def
> index caf61437d18..0bd3e80cf7f 100644
> --- a/gcc/config/aarch64/aarch64-cores.def
> +++ b/gcc/config/aarch64/aarch64-cores.def
> @@ -173,6 +173,18 @@ AARCH64_CORE("cortex-a76.cortex-a55",  
> cortexa76cortexa55, cortexa53, V8_2A,  (F
>  AARCH64_CORE("cortex-r82", cortexr82, cortexa53, V8R, (), cortexa53, 0x41, 
> 0xd15, -1)
>  AARCH64_CORE("cortex-r82ae", cortexr82ae, cortexa53, V8R, (), cortexa53, 
> 0x41, 0xd14, -1)
>  
> +/* Apple (A12 and M) cores based on Armv8.
> +   Apple implementer ID from xnu,
> +   Guesses for part # and suitable scheduler ident, generic_armv8_a for 
> costs.
> +   A12 seems mostly 8.3,
> +   M1 seems to be 8.4 + extras (see comments in option-extensions about 
> f16fml),
> +   M2 mostly 8.5 but with missing mandatory features.
> +   M3 is essentially the same as M2 for the features declared here.  */
> +AARCH64_CORE("apple-a12", applea12, cortexa53, V8_3A,  (), generic_armv8_a, 
> 0x61, 0x12, -1)
> +AARCH64_CORE("apple-m1", applem1, cortexa57, V8_4A,  (F16, SB, SSBS), 
> generic_armv8_a, 0x61, 0x23, -1)
> +AARCH64_CORE("apple-m2", applem2, cortexa57, V8_4A,  (I8MM, BF16, F16, SB, 
> SSBS), generic_armv8_a, 0x61, 0x23, -1)
> +AARCH64_CORE("apple-m3", applem3, cortexa57, V8_4A,  (I8MM, BF16, F16, SB, 
> SSBS), generic_armv8_a, 0x61, 0x23, -1)
> +

Comparing to LLVM's AArch64Processors.td, this seems to be missing a few things:
- Crypto extensions (SHA2 and AES, and SHA3 from apple-m1);
- New flags I just added (FRINTTS and FLAGM2 from apple-m1);
- PREDRES (from apple-m1)

If that's accurate, then I think you could list apple-m1 as V8_5A (although
LLVM only specifies V8_4A), and apple-m2 and apple-m3 as V8_6A (same as LLVM).
The only other difference from the increased architecture version would be to
enable a few more sysreg names (and our system register gating is an

[PATCH] d, v2: give dependency files better filenames

2025-01-13 Thread Jakub Jelinek

On Sun, Jan 12, 2025 at 04:16:58PM +0100, Arsen Arsenović wrote:
> Regstrapped on x86_64-pc-linux-gnu.  I've also checked the generated
> dependency files are correct by hand and "instrumented" the build to
> fail if two dependency files are the same, by doing the following:
> 
>   DPOSTCOMPILE = ! test -f $(DEPFILE).Po && mv ...
> 
> ... and confirmed no further conflicts of this sort happen.
> 
> OK for trunk?
> -- >8 --
> Currently, the dependency files for root-file.o and common-file.o were
> both d/.deps/file.Po, which would cause parallel builds to fail
> sometimes with:
> 
>   make[3]: Leaving directory 
> '/var/tmp/portage/sys-devel/gcc-14.1.1_p20240511/work/build/gcc'
>   make[3]: Entering directory 
> '/var/tmp/portage/sys-devel/gcc-14.1.1_p20240511/work/build/gcc'
>   mv: cannot stat 'd/.deps/file.TPo': No such file or directory
>   make[3]: *** 
> [/var/tmp/portage/sys-devel/gcc-14.1.1_p20240511/work/gcc-14-20240511/gcc/d/Make-lang.in:421:
>  d/root-file.o] Error 1 shuffle=131581365
> 
> Also, this means that dependencies of one of root-file or common-file
> are missing when developing.  After this patch, those two files get
> assigned dependency files d/.deps/d-root-file.o.Po and
> d/.deps/d-common-file.o.Po respectively.
> 
> There are other files with similar conflicts (mangle-package.o,
> visitor-package.o for instance).

Note, I ran into the same problem in
https://kojipkgs.fedoraproject.org//work/tasks/6545/127766545/build.log
> 
> gcc/d/ChangeLog:
> 
>   * Make-lang.in: Assign dependency-tracking files better
>   filenames.
> ---
>  gcc/d/Make-lang.in | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/d/Make-lang.in b/gcc/d/Make-lang.in
> index f28761e4b370..25e2b0bbfe94 100644
> --- a/gcc/d/Make-lang.in
> +++ b/gcc/d/Make-lang.in
> @@ -65,8 +65,9 @@ ALL_DFLAGS = $(DFLAGS-$@) $(GDCFLAGS) -fversion=IN_GCC 
> $(CHECKING_DFLAGS) \
>   $(WARN_DFLAGS)
>  
>  DCOMPILE.base = $(GDC) -c $(ALL_DFLAGS) -o $@
> -DCOMPILE = $(DCOMPILE.base) -MT $@ -MMD -MP -MF $(@D)/$(DEPDIR)/$(*F).TPo
> -DPOSTCOMPILE = @mv $(@D)/$(DEPDIR)/$(*F).TPo $(@D)/$(DEPDIR)/$(*F).Po
> +DEPFILE = $(subst /,-,$@)

Why do you need there the directory or suffix?
$(*F) is clearly bad, because there are rules like
d/%.o: d/dmd/%.d
$(DCOMPILE) $(D_INCLUDES) $<
$(DPOSTCOMPILE)

d/common-%.o: d/dmd/common/%.d
$(DCOMPILE) $(D_INCLUDES) $<
$(DPOSTCOMPILE)
etc. and while the stem in the first case is the basename of the filename
part, in the second case it is the basename of the filename part in the
common directory.
I think
DEPFILE = $(basename $(@F))
would be sufficient.
So the former d/.deps/file.Po which handled both d/dmd/common/file.d and
d/dmd/root/file.d in your case would be d/.deps/d-common-file.o.d and
d/.deps/d-root-file.o.d while with the above DEPFILE it would be
d/.deps/common-file.d and d/.deps/root-file.d
There are no d/dmd/*-*.d files and among d/*-*.cc the only are just d-
prefixed ones, and there are no clashes between the *.cc and *.d filenames:
for i in gcc/d/*.cc; do j=`basename $i .cc`; find gcc/d -name $j.d; done

So just (or if you want a helper variable, perhaps it should stand for the
whole $(@D)/$(DEPDIR)/$(basename $(@F)) part)?

2025-01-13  Arsen Arsenović  
Jakub Jelinek  

* Make-lang.in (DCOMPILE, DPOSTCOMPILE): Use $(basename $(@F))
instead of $(*F).

--- gcc/d/Make-lang.in.jj   2025-01-13 09:12:07.408983471 +0100
+++ gcc/d/Make-lang.in  2025-01-13 10:57:16.398315375 +0100
@@ -65,8 +65,8 @@ ALL_DFLAGS = $(DFLAGS-$@) $(GDCFLAGS) -f
$(WARN_DFLAGS)
 
 DCOMPILE.base = $(GDC) -c $(ALL_DFLAGS) -o $@
-DCOMPILE = $(DCOMPILE.base) -MT $@ -MMD -MP -MF $(@D)/$(DEPDIR)/$(*F).TPo
-DPOSTCOMPILE = @mv $(@D)/$(DEPDIR)/$(*F).TPo $(@D)/$(DEPDIR)/$(*F).Po
+DCOMPILE = $(DCOMPILE.base) -MT $@ -MMD -MP -MF $(@D)/$(DEPDIR)/$(basename 
$(@F)).TPo
+DPOSTCOMPILE = @mv $(@D)/$(DEPDIR)/$(basename $(@F)).TPo 
$(@D)/$(DEPDIR)/$(basename $(@F)).Po
 DLINKER = $(GDC) $(NO_PIE_FLAG) -lstdc++
 
 # Like LINKER, but use a mutex for serializing front end links.

Jakub
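
The naming schemes compared in this thread can be modelled with a small C++ sketch. It is only an illustration: the helpers below imitate GNU make's $(@D), $(@F), $(basename ...) and $(subst /,-,...) for single names, the pattern stem is passed in by hand, and DEPDIR is assumed to be ".deps" as the quoted paths suggest.

```cpp
#include <cassert>
#include <string>

// $(@D): directory part of the target, "." when there is no slash.
std::string dir_part(const std::string &target)
{
  auto slash = target.rfind('/');
  return slash == std::string::npos ? "." : target.substr(0, slash);
}

// $(@F): file-within-directory part of the target.
std::string file_part(const std::string &target)
{
  auto slash = target.rfind('/');
  return slash == std::string::npos ? target : target.substr(slash + 1);
}

// $(basename NAME): drop the last suffix, i.e. everything from the final
// '.' provided that '.' comes after the last '/'.
std::string strip_suffix(const std::string &name)
{
  auto slash = name.rfind('/');
  auto dot = name.rfind('.');
  if (dot == std::string::npos
      || (slash != std::string::npos && dot < slash))
    return name;
  return name.substr(0, dot);
}

// Old scheme: $(@D)/$(DEPDIR)/$(*F).Po, with the stem supplied by hand.
std::string depfile_from_stem(const std::string &target,
                              const std::string &stem)
{
  return dir_part(target) + "/.deps/" + file_part(stem) + ".Po";
}

// v2 scheme: $(@D)/$(DEPDIR)/$(subst /,-,$@).Po
std::string depfile_from_subst(const std::string &target)
{
  std::string flat = target;
  for (char &c : flat)
    if (c == '/')
      c = '-';
  return dir_part(target) + "/.deps/" + flat + ".Po";
}

// Follow-up scheme: $(@D)/$(DEPDIR)/$(basename $(@F)).Po
std::string depfile_from_basename(const std::string &target)
{
  return dir_part(target) + "/.deps/" + strip_suffix(file_part(target))
         + ".Po";
}

void run_depfile_examples()
{
  // Both pattern rules yield the stem "file", so the old names collide.
  assert(depfile_from_stem("d/root-file.o", "file")
         == depfile_from_stem("d/common-file.o", "file"));
  assert(depfile_from_subst("d/root-file.o") == "d/.deps/d-root-file.o.Po");
  assert(depfile_from_basename("d/root-file.o") == "d/.deps/root-file.Po");
  assert(depfile_from_basename("d/common-file.o")
         == "d/.deps/common-file.Po");
}
```

Both replacement schemes derive the name from the target rather than from the pattern stem, which is what removes the collision between the two rules.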

[PATCH] RISC-V: Fix the result error caused by not updating ratio when using "use_max_sew" to merge vsetvl.

2025-01-13 Thread Jin Ma

When the vsetvl instructions of the two RVV instructions are merged
using "use_max_sew", it is possible to update the sew of prev if
prev.sew < next.sew, but keep the original ratio, which is obviously
wrong. When the subsequent instructions are equal to the wrong ratio,
it is possible to generate the wrong "vsetvli zero,zero" instruction,
which will lead to unknown avl.

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc:

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/bug-10.c: New test.
---
 gcc/config/riscv/riscv-vsetvl.cc |  1 +
 gcc/testsuite/gcc.target/riscv/rvv/base/bug-10.c | 15 +++
 2 files changed, 16 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/bug-10.c

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 851e52a20ba6..e9de21787dda 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -1722,6 +1722,7 @@ private:
   {
 int max_sew = MAX (prev.get_sew (), next.get_sew ());
 prev.set_sew (max_sew);
+prev.set_ratio (calculate_ratio (prev.get_sew (), prev.get_vlmul ()));
 use_min_of_max_sew (prev, next);
   }
   inline void use_next_sew_lmul (vsetvl_info &prev, const vsetvl_info &next)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/bug-10.c 
b/gcc/testsuite/gcc.target/riscv/rvv/base/bug-10.c
new file mode 100644
index ..c1a8ac95c17f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/bug-10.c
@@ -0,0 +1,15 @@
+/* { dg-do compile { target { rv64 } } } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-O3" "-Og" "-Os" "-Oz" } } */
+/* { dg-options " -march=rv64gcv_zvfh -mabi=lp64d -O2 
--param=vsetvl-strategy=optim -fno-schedule-insns  -fno-schedule-insns2 
-fno-schedule-fusion " } */
+
+#include <riscv_vector.h>
+
+void
+foo (uint8_t *ptr, vfloat16m4_t *v1, vuint32m8_t *v2, vuint8m2_t *v3, size_t 
vl)
+{
+  *v1 = __riscv_vfmv_s_f_f16m4 (1, vl);
+  *v2 = __riscv_vmv_s_x_u32m8 (2963090659u, vl);
+  *v3 = __riscv_vsll_vx_u8m2 (__riscv_vid_v_u8m2 (vl), 2, vl);
+}
+
+/* { dg-final { scan-assembler-not {vsetvli.*zero,zero} } }*/
-- 
2.25.1
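
The stale-ratio problem described in the patch above can be illustrated with a small C++ model. This is a sketch under assumptions: the struct and function names are invented for illustration and are not GCC's vsetvl_info interface, only integer LMUL is modelled, and the SEW/LMUL pairs are simply read off the type names in the test (f16m4, u32m8, u8m2) without claiming that the pass sees exactly these values.

```cpp
#include <cassert>

// Simplified model of a vsetvl demand; not GCC's vsetvl_info class.
struct vtype_sketch
{
  int sew;    // element width in bits
  int lmul;   // register group multiplier (integer values only here)
  int ratio;  // cached SEW/LMUL, which determines VLMAX
};

int ratio_of(int sew, int lmul) { return sew / lmul; }

vtype_sketch make_vtype(int sew, int lmul)
{
  return vtype_sketch{sew, lmul, ratio_of(sew, lmul)};
}

// Raise prev's SEW to the larger of the two but leave the cached ratio
// untouched: the behaviour the patch description calls wrong.
vtype_sketch merge_max_sew_stale(vtype_sketch prev, const vtype_sketch &next)
{
  prev.sew = prev.sew > next.sew ? prev.sew : next.sew;
  return prev;
}

// Same merge, but recompute the ratio from the new SEW and the kept LMUL.
vtype_sketch merge_max_sew_fixed(vtype_sketch prev, const vtype_sketch &next)
{
  prev.sew = prev.sew > next.sew ? prev.sew : next.sew;
  prev.ratio = ratio_of(prev.sew, prev.lmul);
  return prev;
}

// "vsetvli zero,zero" keeps the current VL, so it is only safe when the
// ratio (and hence VLMAX) does not change.
bool can_keep_vl(const vtype_sketch &cur, const vtype_sketch &next)
{
  return cur.ratio == next.ratio;
}

void run_ratio_examples()
{
  vtype_sketch f16m4 = make_vtype(16, 4);  // ratio 4
  vtype_sketch u32m8 = make_vtype(32, 8);  // ratio 4
  vtype_sketch u8m2 = make_vtype(8, 2);    // ratio 4

  // Stale merge: SEW becomes 32 with LMUL 4, yet the ratio still reads 4.
  vtype_sketch stale = merge_max_sew_stale(f16m4, u32m8);
  assert(stale.sew == 32 && stale.lmul == 4 && stale.ratio == 4);
  assert(can_keep_vl(stale, u8m2));    // wrongly looks compatible

  // Fixed merge: the ratio is 32/4 = 8, so the VL-preserving form is out.
  vtype_sketch fixed = merge_max_sew_fixed(f16m4, u32m8);
  assert(fixed.ratio == 8);
  assert(!can_keep_vl(fixed, u8m2));
}
```

In this model the one-line fix corresponds to the recomputation in merge_max_sew_fixed, which keeps the cached ratio consistent with the SEW that was just raised.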

[COMMITTED 07/14] ada: Remove redundant parentheses inside unary operators

2025-01-13 Thread Marc Poulhiès

From: Piotr Trojanek 

GNAT already emits a style warning when redundant parentheses appear inside
logical and short-circuit operators. A similar warning will be soon emitted for
unary operators as well. This patch removes the redundant parentheses to avoid
future build errors.

gcc/ada/ChangeLog:

* checks.adb, exp_dist.adb, exp_imgv.adb, exp_util.adb,
libgnarl/a-reatim.adb, libgnat/a-coinve.adb, libgnat/a-nbnbre.adb,
libgnat/a-ngcoty.adb, libgnat/a-ngelfu.adb, libgnat/a-ngrear.adb,
libgnat/a-strbou.ads, libgnat/a-strfix.ads, libgnat/a-strsea.adb,
libgnat/a-strsea.ads, libgnat/a-strsup.ads,
libgnat/a-strunb__shared.ads, libgnat/g-alleve.adb,
libgnat/g-spitbo.adb, libgnat/s-aridou.adb, libgnat/s-arit32.adb,
libgnat/s-dourea.ads, libgnat/s-genbig.adb, libgnat/s-imager.adb,
libgnat/s-statxd.adb, libgnat/s-widthi.adb, sem_attr.adb, sem_ch10.adb,
sem_ch3.adb, sem_ch6.adb, sem_ch7.adb, sem_dim.adb, sem_prag.adb,
sem_res.adb, uintp.adb: Remove redundant parentheses inside NOT and ABS
operators.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/checks.adb   |  6 +++---
 gcc/ada/exp_dist.adb |  2 +-
 gcc/ada/exp_imgv.adb |  4 ++--
 gcc/ada/exp_util.adb |  2 +-
 gcc/ada/libgnarl/a-reatim.adb|  2 +-
 gcc/ada/libgnat/a-coinve.adb |  6 +++---
 gcc/ada/libgnat/a-nbnbre.adb |  4 ++--
 gcc/ada/libgnat/a-ngcoty.adb | 22 +++---
 gcc/ada/libgnat/a-ngelfu.adb |  2 +-
 gcc/ada/libgnat/a-ngrear.adb |  2 +-
 gcc/ada/libgnat/a-strbou.ads | 16 
 gcc/ada/libgnat/a-strfix.ads | 16 
 gcc/ada/libgnat/a-strsea.adb | 26 +-
 gcc/ada/libgnat/a-strsea.ads |  8 
 gcc/ada/libgnat/a-strsup.ads | 16 
 gcc/ada/libgnat/a-strunb__shared.ads | 16 
 gcc/ada/libgnat/g-alleve.adb |  8 
 gcc/ada/libgnat/g-spitbo.adb |  4 ++--
 gcc/ada/libgnat/s-aridou.adb | 10 +-
 gcc/ada/libgnat/s-arit32.adb |  2 +-
 gcc/ada/libgnat/s-dourea.ads |  2 +-
 gcc/ada/libgnat/s-genbig.adb |  4 ++--
 gcc/ada/libgnat/s-imager.adb |  2 +-
 gcc/ada/libgnat/s-statxd.adb |  6 +++---
 gcc/ada/libgnat/s-widthi.adb |  8 
 gcc/ada/sem_attr.adb |  4 ++--
 gcc/ada/sem_ch10.adb |  2 +-
 gcc/ada/sem_ch3.adb  |  2 +-
 gcc/ada/sem_ch6.adb  |  4 ++--
 gcc/ada/sem_ch7.adb  |  2 +-
 gcc/ada/sem_dim.adb  |  4 ++--
 gcc/ada/sem_prag.adb |  2 +-
 gcc/ada/sem_res.adb  |  2 +-
 gcc/ada/uintp.adb| 12 ++--
 34 files changed, 115 insertions(+), 115 deletions(-)

diff --git a/gcc/ada/checks.adb b/gcc/ada/checks.adb
index 7a5bc71f36b..dcfcaa33bcc 100644
--- a/gcc/ada/checks.adb
+++ b/gcc/ada/checks.adb
@@ -2076,7 +2076,7 @@ package body Checks is
  Lo := Succ (Expr_Type, UR_From_Uint (Ifirst - 1));
  Lo_OK := True;
 
-  elsif abs (Ifirst) < Max_Bound then
+  elsif abs Ifirst < Max_Bound then
  Lo := UR_From_Uint (Ifirst) - Ureal_Half;
  Lo_OK := (Ifirst > 0);
 
@@ -2120,7 +2120,7 @@ package body Checks is
  Hi := Pred (Expr_Type, UR_From_Uint (Ilast + 1));
  Hi_OK := True;
 
-  elsif abs (Ilast) < Max_Bound then
+  elsif abs Ilast < Max_Bound then
  Hi := UR_From_Uint (Ilast) + Ureal_Half;
  Hi_OK := (Ilast < 0);
   else
@@ -6243,7 +6243,7 @@ package body Checks is
   --  do the corresponding optimizations later on when applying the checks.
 
   if Mode in Minimized_Or_Eliminated then
- if not (Overflow_Checks_Suppressed (Etype (N)))
+ if not Overflow_Checks_Suppressed (Etype (N))
and then not (Is_Entity_Name (N)
   and then Overflow_Checks_Suppressed (Entity (N)))
  then
diff --git a/gcc/ada/exp_dist.adb b/gcc/ada/exp_dist.adb
index f3cc4b4f9af..694fbe47dab 100644
--- a/gcc/ada/exp_dist.adb
+++ b/gcc/ada/exp_dist.adb
@@ -8626,7 +8626,7 @@ package body Exp_Dist is
 --  The RACW case is taken care of by Exp_Dist.Add_RACW_From_Any
 
 pragma Assert
-  (not (Is_Remote_Access_To_Class_Wide_Type (Typ)));
+  (not Is_Remote_Access_To_Class_Wide_Type (Typ));
 
 Use_Opaque_Representation := False;
 
diff --git a/gcc/ada/exp_imgv.adb b/gcc/ada/exp_imgv.adb
index a8c0fa0c1e6..c7cf06ba444 100644
--- a/gcc/ada/exp_imgv.adb
+++ b/gcc/ada/exp_imgv.adb
@@ -1615,9 +1615,9 @@ package body Exp_Imgv is
  end if;
 
   elsif Is_Decimal_Fixed_Point_Type (Rtyp) then
- if Esize (Rtyp) <= 32 and then abs (Scale_Value (Rtyp)) <= 9 then
+ if Esize (Rtyp) <= 32 and then abs Sca

[COMMITTED 05/14] ada: Unbounded recursion on character aggregates with predicated component subtype

2025-01-13 Thread Marc Poulhiès

From: Gary Dismukes 

The compiler was recursing endlessly when analyzing an aggregate of
an array type whose component subtype has a static predicate and the
component expressions are static, repeatedly transforming the aggregate
first into a string literal and then back into an aggregate. This is fixed
by suppressing the transformation to a string literal in the case where
the component subtype has predicates.

gcc/ada/ChangeLog:

* sem_aggr.adb (Resolve_Aggregate): Add another condition to prevent 
rewriting
an aggregate whose type is an array of characters, testing for the 
presence of
predicates on the component type.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_aggr.adb | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/gcc/ada/sem_aggr.adb b/gcc/ada/sem_aggr.adb
index 095093cc76b..f6db5cb97a4 100644
--- a/gcc/ada/sem_aggr.adb
+++ b/gcc/ada/sem_aggr.adb
@@ -1382,11 +1382,11 @@ package body Sem_Aggr is
 
  --  Do not perform this transformation if this was a string literal
  --  to start with, whose components needed constraint checks, or if
- --  the component type is non-static, because it will require those
- --  checks and be transformed back into an aggregate. If the index
- --  type is not Integer the aggregate may represent a user-defined
- --  string type but the context might need the original type so we
- --  do not perform the transformation at this point.
+ --  the component type is nonstatic or has predicates, because it will
+ --  require those checks and be transformed back into an aggregate.
+ --  If the index type is not Integer, then the aggregate may represent
+ --  a user-defined string type but the context might need the original
+ --  type, so we do not perform the transformation at this point.
 
  if Number_Dimensions (Typ) = 1
and then Is_Standard_Character_Type (Component_Type (Typ))
@@ -1396,6 +1396,7 @@ package body Sem_Aggr is
and then not Is_Bit_Packed_Array (Typ)
and then Nkind (Original_Node (Parent (N))) /= N_String_Literal
and then Is_OK_Static_Subtype (Component_Type (Typ))
+   and then not Has_Predicates (Component_Type (Typ))
and then Base_Type (Etype (First_Index (Typ))) =
   Base_Type (Standard_Integer)
and then not Has_Static_Empty_Array_Bounds (Typ)
-- 
2.43.0

[COMMITTED 13/14] ada: Cleanup preanalysis of static expressions (part 5)

2025-01-13 Thread Marc Poulhiès

From: Javier Miranda 

Partially revert the fix for sem_ch13.adb as it does not comply
with RM 13.14(7.2/5).

gcc/ada/ChangeLog:

* sem_ch13.adb (Check_Aspect_At_End_Of_Declarations): Restore calls
to Preanalyze_Spec_Expression that were replaced by calls to
Preanalyze_And_Resolve. Add documentation.
(Check_Aspect_At_Freeze_Point): Ditto.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_ch13.adb | 28 
 1 file changed, 20 insertions(+), 8 deletions(-)

diff --git a/gcc/ada/sem_ch13.adb b/gcc/ada/sem_ch13.adb
index 9bbec28ddb3..072ec66a8f3 100644
--- a/gcc/ada/sem_ch13.adb
+++ b/gcc/ada/sem_ch13.adb
@@ -10039,7 +10039,7 @@ package body Sem_Ch13 is
 
   --  If the predicate pragma comes from an aspect, replace the
   --  saved expression because we need the subtype references
-  --  replaced for the calls to Preanalyze_And_Resolve in
+  --  replaced for the calls to Preanalyze_Spec_Expression in
   --  Check_Aspect_At_xxx routines.
 
   if Present (Asp) then
@@ -10853,12 +10853,12 @@ package body Sem_Ch13 is
  | Aspect_Static_Predicate
 then
Push_Type (Ent);
-   Preanalyze_And_Resolve (Freeze_Expr, Standard_Boolean);
+   Preanalyze_Spec_Expression (Freeze_Expr, Standard_Boolean);
Pop_Type (Ent);
 
 elsif A_Id = Aspect_Priority then
Push_Type (Ent);
-   Preanalyze_And_Resolve (Freeze_Expr, Any_Integer);
+   Preanalyze_Spec_Expression (Freeze_Expr, Any_Integer);
Pop_Type (Ent);
 
 else
@@ -10894,6 +10894,12 @@ package body Sem_Ch13 is
 end if;
 return;
 
+ --  The expression must be analyzed in the special manner described in
+ --  "Handling of Default and Per-Object Expressions" in sem.ads, since
+ --  any static expressions within an aspect_specification also cause
+ --  freezing at the end of the immediately enclosing declaration list
+ --  (RM 13.14(7.2/5)).
+
  --  The default values attributes may be defined in the private part,
  --  and the analysis of the expression may take place when only the
  --  partial view is visible. The expression must be scalar, so use
@@ -10902,7 +10908,7 @@ package body Sem_Ch13 is
  elsif A_Id in Aspect_Default_Component_Value | Aspect_Default_Value
 and then Is_Private_Type (T)
  then
-Preanalyze_And_Resolve (End_Decl_Expr, Full_View (T));
+Preanalyze_Spec_Expression (End_Decl_Expr, Full_View (T));
 
  --  The following aspect expressions may contain references to
  --  components and discriminants of the type.
@@ -10916,14 +10922,14 @@ package body Sem_Ch13 is
  | Aspect_Static_Predicate
  then
 Push_Type (Ent);
-Preanalyze_And_Resolve (End_Decl_Expr, T);
+Preanalyze_Spec_Expression (End_Decl_Expr, T);
 Pop_Type (Ent);
 
  elsif A_Id = Aspect_Predicate_Failure then
-Preanalyze_And_Resolve (End_Decl_Expr, Standard_String);
+Preanalyze_Spec_Expression (End_Decl_Expr, Standard_String);
 
  elsif Present (End_Decl_Expr) then
-Preanalyze_And_Resolve (End_Decl_Expr, T);
+Preanalyze_Spec_Expression (End_Decl_Expr, T);
  end if;
 
  Err :=
@@ -11346,8 +11352,14 @@ package body Sem_Ch13 is
 
   --  Do the preanalyze call
 
+  --  The expression must be analyzed in the special manner described in
+  --  "Handling of Default and Per-Object Expressions" in sem.ads, since
+  --  at the freezing point of the entity associated with an aspect
+  --  specification, any static expressions expressions or names within
+  --  the aspect_specification cause freezing (RM 13.14(7.2/5)).
+
   if Present (Expression (ASN)) then
- Preanalyze_And_Resolve (Expression (ASN), T);
+ Preanalyze_Spec_Expression (Expression (ASN), T);
   end if;
end Check_Aspect_At_Freeze_Point;
 
-- 
2.43.0

Re: [PATCH v4 6/7] OpenMP: Fortran front-end support for dispatch + adjust_args

2025-01-13 Thread Paul-Antoine Arras


Hi Tobias,

On 13/01/2025 13:24, Tobias Burnus wrote:

Hi PA,

Paul-Antoine Arras wrote:

Hi Thomas,

Added libgomp/testsuite/libgomp.fortran/dispatch-1.f90.

I see this new test case FAIL (execution test SIGSEGV) for most (but not
all) offloading configurations, both GCN and nvptx:

 +PASS: libgomp.fortran/dispatch-1.f90   -O  (test for excess errors)

 +FAIL: libgomp.fortran/dispatch-1.f90   -O  execution test


Thanks for pointing that out! The testcase missed an OpenMP target 
directive. The attached patch should fix it.


…


--- libgomp/testsuite/libgomp.fortran/dispatch-1.f90
+++ libgomp/testsuite/libgomp.fortran/dispatch-1.f90
@@ -55,6 +55,7 @@ module procedures
  call c_f_pointer(d_av, fp_av, [n])
  ! Perform operations on target
+    !$omp target is_device_ptr(fp_bv, fp_av)
  do i = 1, n
    fp_bv(i) = fp_av(i) * i
  end do


I think the patch is okay in the sense that it works;
still, I think you should consider the following.

Using 'is_device_ptr' for an argument that
is not a type(c_ptr) is deprecated since OpenMP 5.1 and
removed from the specification since OpenMP 6.0.

Thus, it would be a bit cleaner (and might avoid future
-Wdeprecated warnings) using
    has_device_addr(fp_bv, fp_av)
instead. (5.1 semantic states that this replacement happens automatically
when the is_device_ptr argument is not a C_PTR.)

Albeit it feels a bit cleaner to move the device pointer handling to the
device side, i.e.

    implicit none
    integer :: res, n, i
    type(c_ptr) :: d_bv
    type(c_ptr) :: d_av

    !$omp target is_device_ptr(d_bv, d_av)
    block
      real(8), pointer :: fp_bv(:), fp_av(:)  ! Fortran pointers for array access

      ! Associate C pointers with Fortran pointers
      call c_f_pointer(d_bv, fp_bv, [n])
      call c_f_pointer(d_av, fp_av, [n])

      ! Perform operations on target
      do i = 1, n
        fp_bv(i) = fp_av(i) * i
      end do
    end block

However, as all variants work in practice, I don't feel strongly about
it, with a small preference for a variant that does not use deprecated
features. (Both dispatch and the has_device_addr clause/the deprecation
are new with OpenMP 5.1)

Tobias




Here is an updated patch following your suggestion.

Thanks,
--
PA

commit c76b5ebf73201074cdf07842097a297e7ae4e9c5
Author: Paul-Antoine Arras 
Date:   Mon Jan 13 12:57:15 2025 +0100

Add missing target directive in OpenMP dispatch Fortran runtime test

Without the target directive, the test would run on the host but still try to
use device pointers, which causes a segfault.

libgomp/ChangeLog:

* testsuite/libgomp.fortran/dispatch-1.f90: Add missing target
directive.

diff --git libgomp/testsuite/libgomp.fortran/dispatch-1.f90 libgomp/testsuite/libgomp.fortran/dispatch-1.f90
index 7b2f03f9d68..f56477e4972 100644
--- libgomp/testsuite/libgomp.fortran/dispatch-1.f90
+++ libgomp/testsuite/libgomp.fortran/dispatch-1.f90
@@ -48,16 +48,21 @@ module procedures
 integer :: res, n, i
 type(c_ptr) :: d_bv
 type(c_ptr) :: d_av
-real(8), pointer :: fp_bv(:), fp_av(:)  ! Fortran pointers for array access
 
-! Associate C pointers with Fortran pointers
-call c_f_pointer(d_bv, fp_bv, [n])
-call c_f_pointer(d_av, fp_av, [n])
+!$omp target is_device_ptr(d_bv, d_av)
+block
+  real(8), pointer :: fp_bv(:), fp_av(:)  ! Fortran pointers for array access
+
+  ! Associate C pointers with Fortran pointers
+  call c_f_pointer(d_bv, fp_bv, [n])
+  call c_f_pointer(d_av, fp_av, [n])
+
+  ! Perform operations on target
+  do i = 1, n
+fp_bv(i) = fp_av(i) * i
+  end do
+end block
 
-! Perform operations on target
-do i = 1, n
-  fp_bv(i) = fp_av(i) * i
-end do
 res = -2
   end function bar

Re: [PATCH] d, v2: give dependency files better filenames

2025-01-13 Thread Arsen Arsenović

Jakub Jelinek  writes:

> On Sun, Jan 12, 2025 at 04:16:58PM +0100, Arsen Arsenović wrote:
>> Regstrapped on x86_64-pc-linux-gnu.  I've also checked the generated
>> dependency files are correct by hand and "instrumented" the build to
>> fail if two dependency files are the same, by doing the following:
>> 
>>   DPOSTCOMPILE = ! test -f $(DEPFILE).Po && mv ...
>> 
>> ... and confirmed no further conflicts of this sort happen.
>> 
>> OK for trunk?
>> -- >8 --
>> Currently, the dependency files for root-file.o and common-file.o were
>> both d/.deps/file.Po, which would cause parallel builds to fail
>> sometimes with:
>> 
>>   make[3]: Leaving directory 
>> '/var/tmp/portage/sys-devel/gcc-14.1.1_p20240511/work/build/gcc'
>>   make[3]: Entering directory 
>> '/var/tmp/portage/sys-devel/gcc-14.1.1_p20240511/work/build/gcc'
>>   mv: cannot stat 'd/.deps/file.TPo': No such file or directory
>>   make[3]: *** 
>> [/var/tmp/portage/sys-devel/gcc-14.1.1_p20240511/work/gcc-14-20240511/gcc/d/Make-lang.in:421:
>>  d/root-file.o] Error 1 shuffle=131581365
>> 
>> Also, this means that dependencies of one of root-file or common-file
>> are missing when developing.  After this patch, those two files get
>> assigned dependency files d/.deps/d-root-file.o.Po and
>> d/.deps/d-common-file.o.Po respectively.
>> 
>> There are other files with similar conflicts (mangle-package.o,
>> visitor-package.o for instance).
>
> Note, I ran into the same problem in
> https://kojipkgs.fedoraproject.org//work/tasks/6545/127766545/build.log
>> 
>> gcc/d/ChangeLog:
>> 
>>  * Make-lang.in: Assign dependency-tracking files better
>>  filenames.
>> ---
>>  gcc/d/Make-lang.in | 5 +++--
>>  1 file changed, 3 insertions(+), 2 deletions(-)
>> 
>> diff --git a/gcc/d/Make-lang.in b/gcc/d/Make-lang.in
>> index f28761e4b370..25e2b0bbfe94 100644
>> --- a/gcc/d/Make-lang.in
>> +++ b/gcc/d/Make-lang.in
>> @@ -65,8 +65,9 @@ ALL_DFLAGS = $(DFLAGS-$@) $(GDCFLAGS) -fversion=IN_GCC 
>> $(CHECKING_DFLAGS) \
>>  $(WARN_DFLAGS)
>>  
>>  DCOMPILE.base = $(GDC) -c $(ALL_DFLAGS) -o $@
>> -DCOMPILE = $(DCOMPILE.base) -MT $@ -MMD -MP -MF $(@D)/$(DEPDIR)/$(*F).TPo
>> -DPOSTCOMPILE = @mv $(@D)/$(DEPDIR)/$(*F).TPo $(@D)/$(DEPDIR)/$(*F).Po
>> +DEPFILE = $(subst /,-,$@)
>
> Why do you need there the directory or suffix?
> $(*F) is clearly bad, because there are rules like
> d/%.o: d/dmd/%.d
> $(DCOMPILE) $(D_INCLUDES) $<
> $(DPOSTCOMPILE)
> 
> d/common-%.o: d/dmd/common/%.d
> $(DCOMPILE) $(D_INCLUDES) $<
> $(DPOSTCOMPILE)
> etc. and while the stem in the first case is the basename of the filename
> part, in the second case it is the basename of the filename part in the
> common directory.
> I think
> DEPFILE = $(basename $(@F))
> would be sufficient.
>
> So the former d/.deps/file.Po which handled both d/dmd/common/file.d and
> d/dmd/root/file.d in your case would be d/.deps/d-common-file.o.d and
> d/.deps/d-root-file.o.d while with the above DEPFILE it would be
> d/.deps/common-file.d and d/.deps/root-file.d
> There are no d/dmd/*-*.d files, and among d/*-*.cc the only ones are the
> d-prefixed ones, and there are no clashes between the *.cc and *.d filenames:
> for i in gcc/d/*.cc; do j=`basename $i .cc`; find gcc/d -name $j.d; done

Relying on that is more error-prone, I think.  While that is true today, it
might not stay true forever, and such a change won't be caught until it
fails again in the same way.

$@ is necessarily unique (however, still, with the proposed approach
d/foo.o and d-foo.o will collide).

I might be overthinking it - I trust your judgment, so I'm okay with
either.
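
For illustration only (this is not part of either patch), here is a small Python sketch of how the three naming schemes discussed in this thread map object targets to dependency-file stems. The helper names are hypothetical, the stems for $(*F) are written out by hand rather than computed by make, and the d/root-%.o rule is assumed to be analogous to the d/common-%.o rule quoted below.

```python
import posixpath
from collections import defaultdict


def stem_notdir(stem):
    """Mimic $(*F): the file-within-directory part of the pattern stem."""
    return posixpath.basename(stem)


def basename_of_target(target):
    """Mimic $(basename $(@F)): file part of the target, suffix removed."""
    return posixpath.splitext(posixpath.basename(target))[0]


def subst_slashes(target):
    """Mimic $(subst /,-,$@): the whole target with '/' replaced by '-'."""
    return target.replace("/", "-")


def collisions(names):
    """Return {dep_name: [targets]} for names shared by several targets."""
    groups = defaultdict(list)
    for target, name in names.items():
        groups[name].append(target)
    return {n: ts for n, ts in groups.items() if len(ts) > 1}


# target -> pattern stem; for d/common-%.o the stem of d/common-file.o is
# "file", and the same is assumed for an analogous d/root-%.o rule.
targets = {"d/common-file.o": "file", "d/root-file.o": "file"}

old = {t: stem_notdir(s) for t, s in targets.items()}
by_basename = {t: basename_of_target(t) for t in targets}
by_subst = {t: subst_slashes(t) for t in targets}

print(collisions(old))          # both targets share the stem "file"
print(collisions(by_basename))  # no shared names for these two targets
print(by_subst)
```

The same helpers also show the caveat mentioned above: subst_slashes("d/foo.o") and subst_slashes("d-foo.o") both yield "d-foo.o", so the $(subst /,-,$@) scheme is unique only as long as no such pair of targets exists.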

> So just (or if you want a helper variable, perhaps it should stand for the
> whole $(@D)/$(DEPDIR)/$(basename $(@F)) part?)
>
> 2025-01-13  Arsen Arsenović  
>   Jakub Jelinek  
>
>   * Make-lang.in (DCOMPILE, DPOSTCOMPILE): Use $(basename $(@F))
>   instead of $(*F).
>
> --- gcc/d/Make-lang.in.jj 2025-01-13 09:12:07.408983471 +0100
> +++ gcc/d/Make-lang.in 2025-01-13 10:57:16.398315375 +0100
> @@ -65,8 +65,8 @@ ALL_DFLAGS = $(DFLAGS-$@) $(GDCFLAGS) -f
>   $(WARN_DFLAGS)
>  
>  DCOMPILE.base = $(GDC) -c $(ALL_DFLAGS) -o $@
> -DCOMPILE = $(DCOMPILE.base) -MT $@ -MMD -MP -MF $(@D)/$(DEPDIR)/$(*F).TPo
> -DPOSTCOMPILE = @mv $(@D)/$(DEPDIR)/$(*F).TPo $(@D)/$(DEPDIR)/$(*F).Po
> +DCOMPILE = $(DCOMPILE.base) -MT $@ -MMD -MP -MF $(@D)/$(DEPDIR)/$(basename $(@F)).TPo
> +DPOSTCOMPILE = @mv $(@D)/$(DEPDIR)/$(basename $(@F)).TPo $(@D)/$(DEPDIR)/$(basename $(@F)).Po
>  DLINKER = $(GDC) $(NO_PIE_FLAG) -lstdc++
>  
>  # Like LINKER, but use a mutex for serializing front end links.
>
>   Jakub
-- 
Arsen Arsenović


[ping,patch 1/2] Add new target hook to assemble a variable

2025-01-13 Thread Georg-Johann Lay


Ping for trunk

https://gcc.gnu.org/pipermail/gcc-patches/2024-December/672050.html

Notice that the patch is bootstrapped and reg-tested and I may
commit-after-approval, so no further work from admins is needed.

The avr part has already been approved 2024-12-20.

The default action is an obvious no-op: Do nothing and return false.

This patch adds a new target hook that allows the backend to asm output
a full variable definition in its own way.  This hook is needed because
varasm.cc imposes a very restrictive layout for all variable definitions
which will be basically ELF style (on ELF targets at least).  To date,
there is no way for a backend to output a variable definition in a
different way.

For example,

int var;

would print:

.global var
.type var, @object
.size var, ...
var:
.byte ...

Though the avr backend would use it to define a symbol (which is the
semantics of the "io", "io_low" and "address" attributes) like:

.global var
var = 1234

no matter if -f[no-]common -f[no-]data-sections etc.

The current implementation in the avr backend is a broken hack, and this
patch puts the feature on solid grounds.
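
As a rough illustration of the intended control flow (a sketch, not the actual varasm.cc code), the following Python model shows a hook that either handles the output itself and returns true, or returns false so that the generic ELF-style path runs. The dict standing in for the decl, the function names, the "address" key and the .zero directive are all assumptions made for this sketch.

```python
import io


def default_asm_variable(stream, decl, name):
    """Default hook: emit nothing and report that nothing was done."""
    return False


def symbol_asm_variable(stream, decl, name):
    """Hypothetical backend hook: define NAME as an absolute symbol."""
    address = decl.get("address")
    if address is None:
        return False  # not a symbol-style variable; use the generic path
    stream.write(f"\t.global {name}\n")
    stream.write(f"{name} = {address}\n")
    return True


def assemble_variable(stream, decl, name, hook=default_asm_variable):
    """Ask the hook first; fall back to a generic object definition."""
    if hook(stream, decl, name):
        return
    stream.write(f"\t.global {name}\n")
    stream.write(f"\t.type {name}, @object\n")
    stream.write(f"\t.size {name}, {decl['size']}\n")
    stream.write(f"{name}:\n")
    stream.write(f"\t.zero {decl['size']}\n")  # placeholder for the content


out = io.StringIO()
assemble_variable(out, {"size": 2, "address": 1234}, "var",
                  hook=symbol_asm_variable)
print(out.getvalue(), end="")
```

With the default hook the generic path is taken unchanged, which is why the default action is a no-op for every other target.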

The hook name is TARGET_ASM_VARIABLE but since avr BE is just defining
a symbol, Hans-Peter Nilsson proposed TARGET_ASM_SYMBOL.

I am fine with any naming as approved by the reviewers, like

TARGET_ASM_VARIABLE
TARGET_ASM_SYMBOL
TARGET_ASM_OBJECT

Johann


--

Add new target hook TARGET_ASM_VARIABLE.

This patch adds a new target hook that allows the backend to asm output
a variable definition in its own way.  This hook is needed because
varasm.cc imposes a very restrictive layout for all variable definitions
which will be basically ELF style (on ELF targets at least).  To date,
there is no way for a backend to output a variable definition in a
different way.
   This hook is required by the avr backend when it outputs definitions
for variables defined with the "io", "io_low" or "address" attribute that
don't follow ELF style.  These attributes are basically symbol definitions
of the form

   .global var_io
   var_io = 32

with some additional assertions.

gcc/
* target.def (TARGET_ASM_OUT) : Add new DEFHOOK.
* targhooks.cc (default_asm_out_variable): New function.
* targhooks.h (default_asm_out_variable): New prototype.
* doc/tm.texi.in (TARGET_ASM_VARIABLE): Place hook documentation.
* doc/tm.texi: Rebuild.
* varasm.cc (assemble_variable): Call targetm.asm_out.variable
in order to allow the backend to output a variable definition
in its own style.

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index d7170f45206..3fc698367b9 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -8522,6 +8522,25 @@ The default implementation of this hook will use the
 when the relevant string is @code{NULL}.
 @end deftypefn
 
+@deftypefn {Target Hook} bool TARGET_ASM_VARIABLE (FILE *@var{stream}, tree @var{decl}, const char *@var{name})
+This hook outputs the assembly code for a @var{decl} that satisfies
+@code{VAR_P} for an object with assembler name @var{name}.
+@code{DECL_RTL (@var{decl})} is of the form @code{(mem (symbol_ref))}.
+Returns @code{true} when the output has been performed and
+@code{false}, otherwise.
+
+It is unlikely that you'll ever need to implement this hook.
+The middle-end knows how to output object definitions in terms of hook
+macros and functions like @code{GLOBAL_ASM_OP},
+@code{TARGET_ASM_INTEGER}, etc.
+
+When the output is performed by means of this hook, then the complete object
+definition has to be emitted, including directives like @code{.global},
+@code{.size}, @code{.type}, @code{.section}, the object label and the
+object content. The default implementation of the hook does nothing
+and returns @code{false}.
+@end deftypefn
+
 @deftypefn {Target Hook} void TARGE

Re: [PATCH] testsuite: libstdc++: Use effective-target libatomic

2025-01-13 Thread Jonathan Wakely

On Mon, 13 Jan 2025 at 11:03, Thomas Schwinge  wrote:
>
> Hi!
>
> On 2025-01-12T08:38:05+0100, Torbjorn SVENSSON wrote:
> > On 2025-01-12 01:05, Jonathan Wakely wrote:
> >> On Mon, 23 Dec 2024, 19:05 Torbjörn SVENSSON wrote:
> >>
> >> Ok for trunk and releases/gcc-14?
> >>
> >> OK
> >
> > Pushed as r15-6828-g4b0ef49d02f and r14.2.0-680-gd82fc939f91.
>
> On a configuration where libatomic does get built, I see (with standard

Does *not* get built?

> build-tree testing: 'make check'):
>
> [-PASS:-]{+UNSUPPORTED:+} 
> 29_atomics/atomic_float/compare_exchange_padding.cc  -std=gnu++20[-(test for 
> excess errors)-]
> [-PASS: 29_atomics/atomic_float/compare_exchange_padding.cc  -std=gnu++20 
> execution test-]
> [Etc.]
>
> [...]
> spawn -ignore SIGHUP [...]/gcc/xg++ [...] libatomic_available1221570.c 
> -latomic [...] -o libatomic_available1221570.exe
> /usr/bin/ld: cannot find -latomic: No such file or directory
> [...]
>
> I presume that the new 'dg-require-effective-target libatomic_available'
> is evaluated when the 'atomic_link_flags' via 'dg-additional-options'
> have not yet been set?
>
> Would it work to call 'atomic_init' (plus 'atomic_finish', I suppose?)
> (see 'gcc/testsuite/lib/atomic-dg.exp') in libstdc++ test suite setup,
> and then to '29_atomics/atomic_float/compare_exchange_padding.cc' apply
> the usual pattern:
>
> -// { dg-require-effective-target libatomic_available }
> -// { dg-additional-options "[atomic_link_flags [get_multilibs]] 
> -latomic" }
> +// { dg-additional-options -latomic { target libatomic_available } }

Yes that seems OK

[COMMITTED 04/14] ada: Simplify expansion of negative membership operator

2025-01-13 Thread Marc Poulhiès

From: Piotr Trojanek 

Code cleanup; semantics is unaffected.

gcc/ada/ChangeLog:

* exp_ch4.adb: (Expand_N_Not_In): Preserve Alternatives in expanded
membership operator just like preserving Right_Opnd (though only
one of these fields is present at a time).
* par-ch4.adb (P_Membership_Test): Remove redundant setting of fields
to their default values.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/exp_ch4.adb | 9 +++--
 gcc/ada/par-ch4.adb | 4 +---
 2 files changed, 4 insertions(+), 9 deletions(-)

diff --git a/gcc/ada/exp_ch4.adb b/gcc/ada/exp_ch4.adb
index a7f759fc8a5..82978c775cf 100644
--- a/gcc/ada/exp_ch4.adb
+++ b/gcc/ada/exp_ch4.adb
@@ -7727,12 +7727,9 @@ package body Exp_Ch4 is
 Make_Op_Not (Loc,
   Right_Opnd =>
 Make_In (Loc,
-  Left_Opnd  => Left_Opnd (N),
-  Right_Opnd => Right_Opnd (N;
-
-  --  If this is a set membership, preserve list of alternatives
-
-  Set_Alternatives (Right_Opnd (N), Alternatives (Original_Node (N)));
+  Left_Opnd=> Left_Opnd (N),
+  Right_Opnd   => Right_Opnd (N),
+  Alternatives => Alternatives (N;
 
   --  We want this to appear as coming from source if original does (see
   --  transformations in Expand_N_In).
diff --git a/gcc/ada/par-ch4.adb b/gcc/ada/par-ch4.adb
index 3f8d1f1d2e3..648a4cf6464 100644
--- a/gcc/ada/par-ch4.adb
+++ b/gcc/ada/par-ch4.adb
@@ -3926,7 +3926,6 @@ package body Ch4 is
   if Token = Tok_Vertical_Bar then
  Error_Msg_Ada_2012_Feature ("set notation", Token_Ptr);
  Set_Alternatives (N, New_List (Alt));
- Set_Right_Opnd   (N, Empty);
 
  --  Loop to accumulate alternatives
 
@@ -3940,8 +3939,7 @@ package body Ch4 is
   --  Not set case
 
   else
- Set_Right_Opnd   (N, Alt);
- Set_Alternatives (N, No_List);
+ Set_Right_Opnd (N, Alt);
   end if;
end P_Membership_Test;
 
-- 
2.43.0

[COMMITTED 11/14] ada: Remove redundant parentheses inside unary operators (cont.)

2025-01-13 Thread Marc Poulhiès

From: Piotr Trojanek 

GNAT already emits a style warning when redundant parentheses appear inside
logical and short-circuit operators. A similar warning will be soon emitted for
unary operators as well. This patch removes the redundant parentheses to avoid
build errors.

gcc/ada/ChangeLog:

* libgnat/a-strunb.ads: Remove redundant parentheses inside NOT
operators.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/libgnat/a-strunb.ads | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/gcc/ada/libgnat/a-strunb.ads b/gcc/ada/libgnat/a-strunb.ads
index 5a1427c31a2..60d57954e5c 100644
--- a/gcc/ada/libgnat/a-strunb.ads
+++ b/gcc/ada/libgnat/a-strunb.ads
@@ -432,8 +432,8 @@ is
   then J <= Index'Result - 1
   else J - 1 in Index'Result
 .. Length (Source) - Pattern'Length)
-  then not (Search.Match
-(To_String (Source), Pattern, Mapping, J,
+  then not Search.Match
+(To_String (Source), Pattern, Mapping, J))),
 
 --  Otherwise, 0 is returned
 
@@ -485,8 +485,8 @@ is
   then J <= Index'Result - 1
   else J - 1 in Index'Result
 .. Length (Source) - Pattern'Length)
-  then not (Search.Match
-(To_String (Source), Pattern, Mapping, J,
+  then not Search.Match
+(To_String (Source), Pattern, Mapping, J))),
 
 --  Otherwise, 0 is returned
 
@@ -591,8 +591,8 @@ is
   then J in From .. Index'Result - 1
   else J - 1 in Index'Result
 .. From - Pattern'Length)
-  then not (Search.Match
-(To_String (Source), Pattern, Mapping, J,
+  then not Search.Match
+(To_String (Source), Pattern, Mapping, J))),
 
 --  Otherwise, 0 is returned
 
@@ -655,8 +655,8 @@ is
   then J in From .. Index'Result - 1
   else J - 1 in Index'Result
 .. From - Pattern'Length)
-  then not (Search.Match
-(To_String (Source), Pattern, Mapping, J,
+  then not Search.Match
+(To_String (Source), Pattern, Mapping, J))),
 
 --  Otherwise, 0 is returned
 
-- 
2.43.0

[COMMITTED 10/14] ada: Cleanup preanalysis of static expressions (part 4)

2025-01-13 Thread Marc Poulhiès

From: Javier Miranda 

Fix regression in the SPARK 2014 testsuite.

gcc/ada/ChangeLog:

* sem_util.adb (Build_Actual_Subtype_Of_Component): No action
under preanalysis.
* sem_ch5.adb (Set_Assignment_Type): If the right-hand side contains
target names, expansion has been disabled to prevent expansion that
might move target names out of the context of the assignment statement.
Restore temporarily the current compilation mode so that the actual
subtype can be built.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/sem_ch5.adb  | 25 +
 gcc/ada/sem_util.adb |  5 ++---
 2 files changed, 23 insertions(+), 7 deletions(-)

diff --git a/gcc/ada/sem_ch5.adb b/gcc/ada/sem_ch5.adb
index 432debf3e25..12d6426671e 100644
--- a/gcc/ada/sem_ch5.adb
+++ b/gcc/ada/sem_ch5.adb
@@ -121,6 +121,9 @@ package body Sem_Ch5 is
   Lhs : constant Node_Id := Name (N);
   Rhs : constant Node_Id := Expression (N);
 
+  Save_Full_Analysis : Boolean := False;
+  --  Force initialization to facilitate static analysis
+
   procedure Diagnose_Non_Variable_Lhs (N : Node_Id);
   --  N is the node for the left hand side of an assignment, and it is not
   --  a variable. This routine issues an appropriate diagnostic.
@@ -318,7 +321,24 @@ package body Sem_Ch5 is
and then No (Actual_Designated_Subtype (Opnd
and then not Is_Unchecked_Union (Opnd_Type)
  then
-Decl := Build_Actual_Subtype_Of_Component (Opnd_Type, Opnd);
+--  If the right-hand side contains target names, expansion has
+--  been disabled to prevent expansion that might move target
+--  names out of the context of the assignment statement. Restore
+--  temporarily the current compilation mode so that the actual
+--  subtype can be built.
+
+if Nkind (N) = N_Assignment_Statement
+  and then Has_Target_Names (N)
+  and then Present (Current_Assignment)
+then
+   Expander_Mode_Restore;
+   Full_Analysis := Save_Full_Analysis;
+   Decl := Build_Actual_Subtype_Of_Component (Opnd_Type, Opnd);
+   Expander_Mode_Save_And_Set (False);
+   Full_Analysis := False;
+else
+   Decl := Build_Actual_Subtype_Of_Component (Opnd_Type, Opnd);
+end if;
 
 if Present (Decl) then
Insert_Action (N, Decl);
@@ -366,9 +386,6 @@ package body Sem_Ch5 is
   T1 : Entity_Id;
   T2 : Entity_Id;
 
-  Save_Full_Analysis : Boolean := False;
-  --  Force initialization to facilitate static analysis
-
--  Start of processing for Analyze_Assignment
 
begin
diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
index 058c868aa07..0e1505bbdbe 100644
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -1467,10 +1467,9 @@ package body Sem_Util is
--  Start of processing for Build_Actual_Subtype_Of_Component
 
begin
-  --  The subtype does not need to be created for a selected component
-  --  in a Spec_Expression.
+  --  The subtype does not need to be created during preanalysis
 
-  if In_Spec_Expression then
+  if Preanalysis_Active then
  return Empty;
 
   --  More comments for the rest of this body would be good ???
-- 
2.43.0

[COMMITTED 08/14] ada: Remove redundant parentheses inside unary operators in comments

2025-01-13 Thread Marc Poulhiès

From: Piotr Trojanek 

GNAT already emits a style warning when redundant parentheses appear inside
logical and short-circuit operators. A similar warning will be soon emitted for
unary operators as well. This patch removes the redundant parentheses to avoid
future build errors.

gcc/ada/ChangeLog:

* libgnat/s-genbig.adb: Remove redundant parentheses in comments.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/libgnat/s-genbig.adb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/ada/libgnat/s-genbig.adb b/gcc/ada/libgnat/s-genbig.adb
index 82bf3f76fc2..2780305e042 100644
--- a/gcc/ada/libgnat/s-genbig.adb
+++ b/gcc/ada/libgnat/s-genbig.adb
@@ -91,7 +91,7 @@ package body System.Generic_Bignums is
   Remainder : out Big_Integer;
   Discard_Quotient  : Boolean := False;
   Discard_Remainder : Boolean := False);
-   --  Returns the Quotient and Remainder from dividing abs (X) by abs (Y). The
+   --  Returns the Quotient and Remainder from dividing abs X by abs Y. The
--  values of X and Y are not modified. If Discard_Quotient is True, then
--  Quotient is undefined on return, and if Discard_Remainder is True, then
--  Remainder is undefined on return. Service routine for Big_Div/Rem/Mod.
-- 
2.43.0

[COMMITTED 12/14] ada: Fix relocatable DLL creation with gnatdll

2025-01-13 Thread Marc Poulhiès

From: Pascal Obry 

gcc/ada/ChangeLog:

* mdll.adb: For the created DLL to be relocatable we do not want to use
the base file name when calling gnatdll.
* gnatdll.adb: Remove option -b, which no longer works; with a
truly relocatable DLL the base address has no real meaning.
Also reword the usage string for -d, since gnatdll can be used
to create both relocatable and non-relocatable DLLs.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/gnatdll.adb |  8 +---
 gcc/ada/mdll.adb| 13 +
 2 files changed, 6 insertions(+), 15 deletions(-)

diff --git a/gcc/ada/gnatdll.adb b/gcc/ada/gnatdll.adb
index 8881fc91ab4..0faf79f361b 100644
--- a/gcc/ada/gnatdll.adb
+++ b/gcc/ada/gnatdll.adb
@@ -134,10 +134,8 @@ procedure Gnatdll is
   P ("   -l file   File contains a list-of-files to be added to "
  & "the library");
   P ("   -e file   Definition file containing exports");
-  P ("   -d file   Put objects in the relocatable dynamic "
+  P ("   -d file   Put objects in the dynamic "
  & "library ");
-  P ("   -b addr   Set base address for the relocatable DLL");
-  P (" default address is " & Default_DLL_Address);
   P ("   -a[addr]  Build non-relocatable DLL at address ");
   P (" if  is not specified use "
  & Default_DLL_Address);
@@ -315,10 +313,6 @@ procedure Gnatdll is
 
Must_Build_Relocatable := False;
 
-when 'b' =>
-   DLL_Address := To_Unbounded_String (Parameter);
-   Must_Build_Relocatable := True;
-
 when 'e' =>
Def_Filename := To_Unbounded_String (Parameter);
 
diff --git a/gcc/ada/mdll.adb b/gcc/ada/mdll.adb
index 281f6a97e5f..64350ff2ec3 100644
--- a/gcc/ada/mdll.adb
+++ b/gcc/ada/mdll.adb
@@ -77,10 +77,7 @@ package body MDLL is
   Bas_Opt  : aliased String := "-Wl,--base-file," & Bas_File;
   Lib_Opt  : aliased String := "-mdll";
   Out_Opt  : aliased String := "-o";
-  Adr_Opt  : aliased String :=
-   (if Relocatable
-then ""
-else "-Wl,--image-base=" & Lib_Address);
+  Adr_Opt  : aliased String := "-Wl,--image-base=" & Lib_Address;
   Map_Opt  : aliased String := "-Wl,-Map," & Lib_Filename & ".map";
 
   L_Afiles : Argument_List := Afiles;
@@ -133,7 +130,7 @@ package body MDLL is
  --  2) Build exp from base file
 
  Utl.Dlltool (Def_File, Dll_File, Lib_File,
-  Base_File=> Bas_File,
+  Base_File=> (if Relocatable then "" else Bas_File),
   Exp_Table=> Exp_File,
   Build_Import => False);
 
@@ -148,7 +145,7 @@ package body MDLL is
  --  4) Build new exp from base file and the lib file (.a)
 
  Utl.Dlltool (Def_File, Dll_File, Lib_File,
-  Base_File=> Bas_File,
+  Base_File=> (if Relocatable then "" else Bas_File),
   Exp_Table=> Exp_File,
   Build_Import => Build_Import);
 
@@ -223,7 +220,7 @@ package body MDLL is
  --  2) Build exp from base file
 
  Utl.Dlltool (Def_File, Dll_File, Lib_File,
-  Base_File=> Bas_File,
+  Base_File=> (if Relocatable then "" else Bas_File),
   Exp_Table=> Exp_File,
   Build_Import => False);
 
@@ -247,7 +244,7 @@ package body MDLL is
  --  4) Build new exp from base file and the lib file (.a)
 
  Utl.Dlltool (Def_File, Dll_File, Lib_File,
-  Base_File=> Bas_File,
+  Base_File=> (if Relocatable then "" else Bas_File),
   Exp_Table=> Exp_File,
   Build_Import => Build_Import);
 
-- 
2.43.0

[COMMITTED 09/14] ada: Warn about redundant parentheses inside unary operators

2025-01-13 Thread Marc Poulhiès

From: Piotr Trojanek 

GNAT already emits a style warning when redundant parentheses appear inside
logical and short-circuit operators. A similar warning is now emitted for
unary operators as well.

gcc/ada/ChangeLog:

* par-ch4.adb (P_Factor): Warn when the operand of a unary operator
doesn't require parentheses.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 gcc/ada/par-ch4.adb | 38 ++
 1 file changed, 38 insertions(+)

diff --git a/gcc/ada/par-ch4.adb b/gcc/ada/par-ch4.adb
index 648a4cf6464..ca02f1baac1 100644
--- a/gcc/ada/par-ch4.adb
+++ b/gcc/ada/par-ch4.adb
@@ -2839,6 +2839,30 @@ package body Ch4 is
   Node1 : Node_Id;
   Node2 : Node_Id;
 
+  subtype N_Primary is Node_Kind with Static_Predicate =>
+N_Primary in N_Aggregate
+   | N_Allocator
+   | N_Attribute_Reference
+   | N_Case_Expression--  requires single parens
+   | N_Delta_Aggregate
+   | N_Direct_Name
+   | N_Explicit_Dereference
+   | N_Expression_With_Actions--  requires single parens
+   | N_Extension_Aggregate
+   | N_If_Expression  --  requires single parens
+   | N_Indexed_Component
+   | N_Null
+   | N_Numeric_Or_String_Literal
+   | N_Qualified_Expression
+   | N_Quantified_Expression  --  requires single parens
+   | N_Selected_Component
+   | N_Slice
+   | N_Subprogram_Call
+   | N_Target_Name
+   | N_Type_Conversion;
+  --  Node kinds that represents a "primary" subexpression, which does not
+  --  require parentheses when used as an operand of a unary operator.
+
begin
   if Token = Tok_Abs then
  Node1 := New_Op_Node (N_Op_Abs, Token_Ptr);
@@ -2849,6 +2873,13 @@ package body Ch4 is
 
  Scan; -- past ABS
  Set_Right_Opnd (Node1, P_Primary);
+
+ if Style_Check then
+if Nkind (Right_Opnd (Node1)) in N_Primary then
+   Style.Check_Xtra_Parens_Precedence (Right_Opnd (Node1));
+end if;
+ end if;
+
  return Node1;
 
   elsif Token = Tok_Not then
@@ -2860,6 +2891,13 @@ package body Ch4 is
 
  Scan; -- past NOT
  Set_Right_Opnd (Node1, P_Primary);
+
+ if Style_Check then
+if Nkind (Right_Opnd (Node1)) in N_Primary then
+   Style.Check_Xtra_Parens_Precedence (Right_Opnd (Node1));
+end if;
+ end if;
+
  return Node1;
 
   else
-- 
2.43.0

[COMMITTED 14/14] ada: Update gnatdll documentation (-b option removed)

2025-01-13 Thread Marc Poulhiès

From: Pascal Obry 

gcc/ada/ChangeLog:
* doc/gnat_ugn/platform_specific_information.rst: Update.
* gnat_ugn.texi: Regenerate.

Tested on x86_64-pc-linux-gnu, committed on master.

---
 .../platform_specific_information.rst | 19 ++-
 gcc/ada/gnat_ugn.texi | 11 ++-
 2 files changed, 8 insertions(+), 22 deletions(-)

diff --git a/gcc/ada/doc/gnat_ugn/platform_specific_information.rst 
b/gcc/ada/doc/gnat_ugn/platform_specific_information.rst
index aa63bb97e84..f2fc737f90d 100644
--- a/gcc/ada/doc/gnat_ugn/platform_specific_information.rst
+++ b/gcc/ada/doc/gnat_ugn/platform_specific_information.rst
@@ -167,7 +167,7 @@ Alternatively, you can specify :file:`rts-sjlj/adainclude` 
in the file
 
 .. index:: --RTS switch
 
-You can select another run-time library temporarily 
+You can select another run-time library temporarily
 by using the :switch:`--RTS` switch, e.g., :switch:`--RTS=sjlj`
 
 
@@ -538,17 +538,17 @@ and::
 Choosing the Scheduling Policy with Windows
 ---
 
-Under Windows, the standard 31 priorities of the Ada model are mapped onto 
+Under Windows, the standard 31 priorities of the Ada model are mapped onto
 Window's seven standard priority levels by default: Idle, Lowest, Below Normal,
 Normal, Above Normal,
 
 When using the ``FIFO_Within_Priorities`` task dispatching policy, GNAT
-assigns the ``REALTIME_PRIORITY_CLASS`` priority class to the application 
-and maps the Ada priority range to the sixteen priorities made available under 
-``REALTIME_PRIORITY_CLASS``. 
+assigns the ``REALTIME_PRIORITY_CLASS`` priority class to the application
+and maps the Ada priority range to the sixteen priorities made available under
+``REALTIME_PRIORITY_CLASS``.
 
 For details on the values of the different priority mappings, see declarations
-in :file:`system.ads`. For more information about Windows priorities, please 
+in :file:`system.ads`. For more information about Windows priorities, please
 refer to Microsoft documentation.
 
 Windows Socket Timeouts
@@ -1548,13 +1548,6 @@ You may specify any of the following switches to 
``gnatdll``:
   relocatable DLL. We advise you to build relocatable DLL.
 
 
-  .. index:: -b (gnatdll)
-
-:switch:`-b {address}`
-  Set the relocatable DLL base address. By default the address is
-  ``0x1100``.
-
-
   .. index:: -bargs (gnatdll)
 
 :switch:`-bargs {opts}`
diff --git a/gcc/ada/gnat_ugn.texi b/gcc/ada/gnat_ugn.texi
index 0b62540a2fd..2579b31a7fc 100644
--- a/gcc/ada/gnat_ugn.texi
+++ b/gcc/ada/gnat_ugn.texi
@@ -19,7 +19,7 @@
 
 @copying
 @quotation
-GNAT User's Guide for Native Platforms , Jan 03, 2025
+GNAT User's Guide for Native Platforms , Jan 13, 2025
 
 AdaCore
 
@@ -24749,13 +24749,6 @@ Build a non-relocatable DLL at @code{address}. If you 
don’t specify
 default, when this switch is missing, @code{gnatdll} builds a
 relocatable DLL. We advise you to build relocatable DLL.
 
-@geindex -b (gnatdll)
-
-@item @code{-b `address'}
-
-Set the relocatable DLL base address. By default the address is
-@code{0x1100}.
-
 @geindex -bargs (gnatdll)
 
 @item @code{-bargs `opts'}
@@ -29839,8 +29832,8 @@ to permit their use in free software.
 
 @printindex ge
 
-@anchor{gnat_ugn/gnat_utility_programs switches-related-to-project-files}@w{   
   }
 @anchor{d2}@w{  }
+@anchor{gnat_ugn/gnat_utility_programs switches-related-to-project-files}@w{   
   }
 
 @c %**end of body
 @bye
-- 
2.43.0

Re: [PATCH] testsuite: libstdc++: Use effective-target libatomic

2025-01-13 Thread Thomas Schwinge

Hi!

On 2025-01-13T11:04:50+, Jonathan Wakely  wrote:
> On Mon, 13 Jan 2025 at 11:03, Thomas Schwinge  wrote:
>> On 2025-01-12T08:38:05+0100, Torbjorn SVENSSON wrote:
>> > On 2025-01-12 01:05, Jonathan Wakely wrote:
>> >> On Mon, 23 Dec 2024, 19:05 Torbjörn SVENSSON wrote:
>> >>
>> >> Ok for trunk and releases/gcc-14?
>> >>
>> >> OK
>> >
>> > Pushed as r15-6828-g4b0ef49d02f and r14.2.0-680-gd82fc939f91.
>>
>> On a configuration where libatomic does get built, I see (with standard
>
> Does *not* get built?

No, *does* get built, and thus the PASS -> UNSUPPORTED is a regression.


Grüße
 Thomas


>> build-tree testing: 'make check'):
>>
>> [-PASS:-]{+UNSUPPORTED:+} 
>> 29_atomics/atomic_float/compare_exchange_padding.cc  -std=gnu++20[-(test for 
>> excess errors)-]
>> [-PASS: 29_atomics/atomic_float/compare_exchange_padding.cc  
>> -std=gnu++20 execution test-]
>> [Etc.]
>>
>> [...]
>> spawn -ignore SIGHUP [...]/gcc/xg++ [...] libatomic_available1221570.c 
>> -latomic [...] -o libatomic_available1221570.exe
>> /usr/bin/ld: cannot find -latomic: No such file or directory
>> [...]
>>
>> I presume that the new 'dg-require-effective-target libatomic_available'
>> is evaluated when the 'atomic_link_flags' via 'dg-additional-options'
>> have not yet been set?
>>
>> Would it work to call 'atomic_init' (plus 'atomic_finish', I suppose?)
>> (see 'gcc/testsuite/lib/atomic-dg.exp') in libstdc++ test suite setup,
>> and then to '29_atomics/atomic_float/compare_exchange_padding.cc' apply
>> the usual pattern:
>>
>> -// { dg-require-effective-target libatomic_available }
>> -// { dg-additional-options "[atomic_link_flags [get_multilibs]] 
>> -latomic" }
>> +// { dg-additional-options -latomic { target libatomic_available } }
>
> Yes that seems OK

[PATCH] tree-optimization/92539 - missed optimization leads to bogus -Warray-bounds

2025-01-13 Thread Richard Biener

The following makes niter analysis recognize a loop with an exit
condition scanning over a STRING_CST.  This is done via enhancing
the force evaluation code rather than recognizing for example
strlen (s) as the number of iterations, because that allows handling
some more cases.

STRING_CSTs are easy to handle since nothing can write to them, also
processing those should be cheap.  I'd appreciate another eye on
the constraints I put in.

Note to avoid the -Warray-bounds diagnostic we have to early unroll
the loop (there's no final value replacement done, there's a PR
for doing this as part of CD-DCE when possibly eliding a loop).
This works for strings up to 8 chars (including the '\0') only
(rather than 16, the unroll niter limit) because unroll estimation
will not see that the load from the string constant goes away.

Final value replacement doesn't work since ivcanon is now after it;
it's not the time to move the pass though.  The pass is in theory
supposed to add a canonical IV for the _by_eval cases, but we
didn't "fix" this when we added cunrolli (we probably should have
moved ivcanon very early, or made cunroll add such IV if we
used _by_eval but did not unroll).

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

I'm not sure it's worth handling this special case in the imperfect
way at this point.  I have a followup to also handle POINTER_PLUS
of "Hello World" and 1 which appears when doing a C test and that
shows IVCANON then eventually adds a simple counting IV, we DCE
the load and late unroll it for larger strings.
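
To make the idea concrete, here is a minimal Python sketch (not the GCC implementation) of brute-force evaluating a loop such as "while (*last) ++last;" over a string constant. The function name, the start parameter modelling a constant offset into the string, and the max_evals bound are all hypothetical choices for this illustration.

```python
def niter_by_string_scan(string_cst, start=0, exit_value=0, max_evals=1000):
    """Count iterations until the byte loaded from the constant equals
    exit_value.

    Returns None if no exit is found within max_evals evaluations or if
    the scan would read past the end of the constant.
    """
    # Model the constant as its characters plus the terminating NUL.
    data = string_cst.encode("ascii") + b"\0"
    for n in range(max_evals):
        idx = start + n
        if idx >= len(data):
            return None  # would read out of bounds; give up
        if data[idx] == exit_value:
            return n     # the exit test succeeds after n iterations
    return None


print(niter_by_string_scan("aa"))
print(niter_by_string_scan("Hello World", start=1))
```

Because nothing can write to the constant, the simulated loads are known at compile time, which is what makes this kind of evaluation safe; the bound merely keeps it cheap.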

PR tree-optimization/92539
* tree-ssa-loop-ivcanon.cc (tree_unroll_loops_completely_1):
Also try force-evaluation if ivcanon did not yet run.
* tree-ssa-loop-niter.cc (loop_niter_by_eval): When we
don't find a proper PHI try if the exit condition scans
over a STRING_CST and simulate that.

* g++.dg/warn/Warray-bounds-pr92539.C: New testcase.
---
 .../g++.dg/warn/Warray-bounds-pr92539.C   | 51 +++
 gcc/tree-ssa-loop-ivcanon.cc  |  2 +-
 gcc/tree-ssa-loop-niter.cc| 39 +-
 3 files changed, 90 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/warn/Warray-bounds-pr92539.C

diff --git a/gcc/testsuite/g++.dg/warn/Warray-bounds-pr92539.C 
b/gcc/testsuite/g++.dg/warn/Warray-bounds-pr92539.C
new file mode 100644
index 000..ea506ed1450
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Warray-bounds-pr92539.C
@@ -0,0 +1,51 @@
+// { dg-do compile { target c++11 } }
+// { dg-options "-O2 -Warray-bounds" }
+
+static bool
+ischar(int ch)
+{
+return (0 == (ch & ~0xff) || ~0 == (ch | 0xff)) != 0;
+}
+
+static bool eat(char const*& first, char const* last)
+{
+if (first != last && ischar(*first)) { // { dg-bogus "bounds" }
+++first;
+return true;
+}
+return false;
+}
+
+static bool eat_two(char const*& first, char const* last)
+{
+auto save = first;
+if (eat(first, last) && eat(first, last))
+return true;
+first = save;
+return false;
+}
+
+static bool foo(char const*& first, char const* last)
+{
+auto local_iterator = first;
+int i = 0;
+for (; i < 3; ++i)
+if (!eat_two(local_iterator, last))
+return false;
+first = local_iterator;
+return true;
+}
+
+static bool test(char const* in, bool full_match = true)
+{
+auto last = in;
+while (*last)
+++last;
+return foo(in, last) && (!full_match || (in == last)); // { dg-bogus "bounds" }
+}
+
+int main()
+{
+return test("aa");
+}
+
diff --git a/gcc/tree-ssa-loop-ivcanon.cc b/gcc/tree-ssa-loop-ivcanon.cc
index 6e6569e5713..7cc340b23c5 100644
--- a/gcc/tree-ssa-loop-ivcanon.cc
+++ b/gcc/tree-ssa-loop-ivcanon.cc
@@ -1492,7 +1492,7 @@ tree_unroll_loops_completely_1 (bool may_increase_size, bool unroll_outer,
 ul = UL_NO_GROWTH;
 
   if (canonicalize_loop_induction_variables
-  (loop, false, ul, !flag_tree_loop_ivcanon, unroll_outer,
+  (loop, false, ul, !flag_tree_loop_ivcanon || cunrolli, unroll_outer,
innermost, cunrolli))
 {
   /* If we'll continue unrolling, we need to propagate constants
diff --git a/gcc/tree-ssa-loop-niter.cc b/gcc/tree-ssa-loop-niter.cc
index 9e966266ca3..7d9b7eb7594 100644
--- a/gcc/tree-ssa-loop-niter.cc
+++ b/gcc/tree-ssa-loop-niter.cc
@@ -3595,7 +3595,44 @@ loop_niter_by_eval (class loop *loop, edge exit)
{
  phi = get_base_for (loop, op[j]);
  if (!phi)
-   return chrec_dont_know;
+   {
+ gassign *def;
+ if (j == 0
+ && (cmp == NE_EXPR || cmp == EQ_EXPR)
+ && TREE_CODE (op[0]) == SSA_NAME
+ && TREE_CODE (op[1]) == INTEGER_CST
+ && (def = dyn_cast <gassign *> (SSA_NAME_DEF_STMT (op[0])))
+ && gimple_assign_rhs_code (def) == MEM_REF)
+   {
+ tree mem = gimple_assign_rhs1 (def);
+

Re: [gcc r15-6807] vect: Force alignment peeling to vectorize more early break loops [PR118211]

2025-01-13 Thread Thomas Schwinge

Hi!

On 2025-01-10T21:22:03+, Tamar Christina via Gcc-cvs wrote:
> https://gcc.gnu.org/g:68326d5d1a593dc0bf098c03aac25916168bc5a9
>
> commit r15-6807-g68326d5d1a593dc0bf098c03aac25916168bc5a9
> Author: Alex Coplan 
> Date:   Mon Mar 11 13:09:10 2024 +
>
> vect: Force alignment peeling to vectorize more early break loops 
> [PR118211]

In addition to the regression already noted elsewhere:

PASS: gcc.dg/tree-ssa/predcom-8.c (test for excess errors)
PASS: gcc.dg/tree-ssa/predcom-8.c scan-tree-dump pcom "Executing predictive commoning without unrolling"
[-PASS:-]{+FAIL:+} gcc.dg/tree-ssa/predcom-8.c scan-tree-dump-not pcom "Invalid sum"

..., this commit for '--target=amdgcn-amdhsa' (tested
'-march=gfx908', '-march=gfx1100') also regresses:

PASS: gcc.dg/vect/vect-switch-search-line-fast.c (test for excess errors)
[-XFAIL:-]{+FAIL:+} gcc.dg/vect/vect-switch-search-line-fast.c scan-tree-dump-times vect "vectorized 1 loops" [-1-]{+0+}

gcc.dg/vect/vect-switch-search-line-fast.c: pattern found 1 times

> --- a/gcc/testsuite/gcc.dg/vect/vect-switch-search-line-fast.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-switch-search-line-fast.c
> [...]
> -/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail 
> *-*-* } } } */
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target 
> { ilp32 } } } } */
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 0 "vect" { target 
> { ! ilp32 } } } } */

Presuming that it's correct that GCN continues to be able to vectorize this,
what is the appropriate conditional to use?


Grüße
 Thomas


> This allows us to vectorize more loops with early exits by forcing
> peeling for alignment to make sure that we're guaranteed to be able to
> safely read an entire vector iteration without crossing a page boundary.
> 
> To make this work for VLA architectures we have to allow compile-time
> non-constant target alignments.  We also have to override the result of
> the target's preferred_vector_alignment hook if it isn't a power-of-two
> multiple of the TYPE_SIZE of the chosen vector type.
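
To make the reasoning in the quoted paragraphs concrete, here is a small
standalone C++ sketch of the arithmetic behind peeling for alignment. It is
only an illustration under simplifying assumptions (a compile-time-known,
power-of-two target alignment and an address that is a multiple of the element
size); the function names are hypothetical and this is not the vectorizer's
code, which per the text above also has to cope with non-constant alignments.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of scalar prolog iterations needed before
// `addr` becomes aligned to `target_align` bytes.
std::uint64_t
prolog_peel_iters (std::uint64_t addr, std::uint64_t elem_bytes,
                   std::uint64_t target_align)
{
  // Assumptions of this sketch: power-of-two alignment, element-aligned address.
  assert (target_align != 0 && (target_align & (target_align - 1)) == 0);
  assert (elem_bytes != 0 && addr % elem_bytes == 0);
  std::uint64_t misalign = addr & (target_align - 1);
  return misalign == 0 ? 0 : (target_align - misalign) / elem_bytes;
}

// Hypothetical helper: does an access of `bytes` bytes starting at `addr`
// touch more than one page of `page_bytes` bytes?
bool
access_crosses_page (std::uint64_t addr, std::uint64_t bytes,
                     std::uint64_t page_bytes)
{
  return addr / page_bytes != (addr + bytes - 1) / page_bytes;
}
```

For example, with 4-byte elements and 16-byte vectors, an access starting at
0x1004 needs three peeled scalar iterations; from 0x1010 onwards every 16-byte
vector load is aligned to its own size and so stays within one 4096-byte page,
whereas an unaligned 16-byte load at 0x0FF8 would straddle two pages.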
> 
> gcc/ChangeLog:
> 
> PR tree-optimization/118211
> PR tree-optimization/116126
> * tree-vect-data-refs.cc (vect_analyze_early_break_dependences):
> Set need_peeling_for_alignment flag on read DRs instead of
> failing vectorization.  Punt on gathers.
> (dr_misalignment): Handle non-constant target alignments.
> (vect_compute_data_ref_alignment): If need_peeling_for_alignment
> flag is set on the DR, then override the target alignment chosen
> by the preferred_vector_alignment hook to choose a safe
> alignment.
> (vect_supportable_dr_alignment): Override
> support_vector_misalignment hook if need_peeling_for_alignment
> is set on the DR: in this case we must return
> dr_unaligned_unsupported in order to force peeling.
> * tree-vect-loop-manip.cc (vect_do_peeling): Allow prolog
> peeling by a compile-time non-constant amount.
> * tree-vectorizer.h (dr_vec_info): Add new flag
> need_peeling_for_alignment.
> 
> gcc/testsuite/ChangeLog:
> 
> PR tree-optimization/118211
> PR tree-optimization/116126
> * gcc.dg/tree-ssa/cunroll-13.c: Don't vectorize.
> * gcc.dg/tree-ssa/cunroll-14.c: Likewise.
> * gcc.dg/unroll-6.c: Likewise.
> * gcc.dg/tree-ssa/gen-vect-28.c: Likewise.
> * gcc.dg/vect/vect-104.c: Expect to vectorize.
> * gcc.dg/vect/vect-early-break_108-pr113588.c: Likewise.
> * gcc.dg/vect/vect-early-break_109-pr113588.c: Likewise.
> * gcc.dg/vect/vect-early-break_110-pr113467.c: Likewise.
> * gcc.dg/vect/vect-early-break_3.c: Likewise.
> * gcc.dg/vect/vect-early-break_65.c: Likewise.
> * gcc.dg/vect/vect-early-break_8.c: Likewise.
> * gfortran.dg/vect/vect-5.f90: Likewise.
> * gfortran.dg/vect/vect-8.f90: Likewise.
> * gcc.dg/vect/vect-switch-search-line-fast.c:
> 
> Co-Authored-By: Tamar Christina 
>
> Diff:
> ---
>  gcc/testsuite/gcc.dg/tree-ssa/cunroll-13.c |   2 +-
>  gcc/testsuite/gcc.dg/tree-ssa/cunroll-14.c |   2 +-
>  gcc/testsuite/gcc.dg/tree-ssa/gen-vect-28.c|   1 +
>  gcc/testsuite/gcc.dg/unroll-6.c|   2 +-
>  gcc/testsuite/gcc.dg/vect/vect-104.c   |   1 +
>  .../gcc.dg/vect/vect-early-break_108-pr113588.c|   2 +-
>  .../gcc.dg/vect/vect-early-break_109-pr113588.c|   2 +-
>  .../gcc.dg/vect/vect-early-break_110-pr113467.c|   2 +-
>  gcc/testsuite/gcc.dg/vect/vect-early-break_3.c |   2 +-
>  gcc/testsuite/gcc.dg/vect/vect-early-break_65.c|   2 +-
>  gcc/testsuite/gcc.dg/vect

[PATCH] tree-optimization/117119 - ICE with int128 IV in dataref analysis

2025-01-13 Thread Richard Biener

Here's another fix for a missing check that an IV value fits in a
HWI.  It's originally from Stefan.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
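
For readers unfamiliar with the kind of check involved, the following
standalone C++ sketch shows the general idea of refusing to narrow a 128-bit
constant that does not fit a 64-bit host integer. It is an assumption-laden
toy, not GCC's cst_and_fits_in_hwi: Wide128 and both function names are made
up for this example, only the signed case is modelled, and std::nullopt stands
in for returning chrec_dont_know.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for a 128-bit constant in two's complement,
// stored as two 64-bit halves.
struct Wide128
{
  std::uint64_t lo;
  std::uint64_t hi;
};

// True if the value is representable as int64_t, i.e. the high half is
// exactly the sign extension of the low half.
bool
fits_in_int64 (const Wide128 &v)
{
  std::uint64_t sign_ext = (v.lo >> 63) ? ~std::uint64_t{0} : std::uint64_t{0};
  return v.hi == sign_ext;
}

// Narrow only when it is safe; otherwise report "don't know".
std::optional<std::int64_t>
to_int64_or_unknown (const Wide128 &v)
{
  if (!fits_in_int64 (v))
    return std::nullopt;
  // Two's-complement conversion (guaranteed since C++20, and what GCC does).
  return static_cast<std::int64_t> (v.lo);
}

void
hwi_usage_example ()
{
  assert (to_int64_or_unknown (Wide128{5, 0}).value () == 5);
  assert (!to_int64_or_unknown (Wide128{0, 1}).has_value ());  // 2^64
}
```

The point of returning an explicit "unknown" is that callers can give up on
the analysis instead of working with a silently truncated value.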

PR tree-optimization/117119
* tree-data-ref.cc (initialize_matrix_A): Check whether
an INTEGER_CST fits in HWI, otherwise return chrec_dont_know.

* gcc.dg/torture/pr117119.c: New testcase.

Co-Authored-By: Stefan Schulze Frielinghaus 
---
 gcc/testsuite/gcc.dg/torture/pr117119.c | 10 ++
 gcc/tree-data-ref.cc|  2 +-
 2 files changed, 11 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr117119.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr117119.c b/gcc/testsuite/gcc.dg/torture/pr117119.c
new file mode 100644
index 000..0ec4ac1b180
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr117119.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target int128 } */
+
+unsigned __int128 g_728;
+int func_1_l_5011[8];
+void func_1() {
+  for (;; g_728 += 1)
+func_1_l_5011[g_728] ^= func_1_l_5011[g_728 + 5];
+}
+void main() {}
diff --git a/gcc/tree-data-ref.cc b/gcc/tree-data-ref.cc
index 26ad0536c36..08c14fe29f2 100644
--- a/gcc/tree-data-ref.cc
+++ b/gcc/tree-data-ref.cc
@@ -4088,7 +4088,7 @@ initialize_matrix_A (lambda_matrix A, tree chrec, unsigned index, int mult)
   }
 
 case INTEGER_CST:
-  return chrec;
+  return cst_and_fits_in_hwi (chrec) ? chrec : chrec_dont_know;
 
 default:
   gcc_unreachable ();
-- 
2.43.0

Re: [PATCH v4 6/7] OpenMP: Fortran front-end support for dispatch + adjust_args

2025-01-13 Thread Tobias Burnus


Hi PA,

Paul-Antoine Arras wrote:

Here is an updated patch following your suggestion.


Thanks. It is not clear whether you are just waiting
for test results or not before committing it as obvious.

Thus, just in case: LGTM.

Thanks,

Tobias

Re: [PATCH] d, v2: give dependency files better filenames

2025-01-13 Thread Jakub Jelinek

On Mon, Jan 13, 2025 at 02:45:28PM +0100, Arsen Arsenović wrote:
> > So the former d/.deps/file.Po which handled both d/dmd/common/file.d and
> > d/dmd/root/file.d in your case would be d/.deps/d-common-file.o.d and
> > d/.deps/d-root-file.o.d while with the above DEPFILE it would be
> > d/.deps/common-file.d and d/.deps/root-file.d
> > There are no d/dmd/*-*.d files and among d/*-*.cc the only ones are the
> > d-prefixed ones, and there are no clashes between the *.cc and *.d filenames:
> > for i in gcc/d/*.cc; do j=`basename $i .cc`; find gcc/d -name $j.d; done
> 
> Relying on that is more error-prone, I think.  While that is true today, it
> might not stay true forever, and such a change won't be caught until it
> fails again in the same way.
> 
> $@ is necessarily unique (however, still, with the proposed approach
> d/foo.o and d-foo.o will collide).

We are talking about d/*.o object files and d/.deps/*.Po files corresponding
to that.  As there are no subdirectories, the * must be necessarily unique.
And it already uses $(@D) in the directory name ($(@D)/$(DEPDIR) in
particular), so even if in the future some subdirectory for object files is
added, it would still be unique, say if there is
d/*.o and d/whatever/*.o, the deps files would be d/.deps/*.Po and
d/whatever/.deps/*.Po.  There is no need to avoid clashes with files in
the gcc main build directory, those have their own gcc/.deps/ rather than
gcc/d/.deps/

> I might be overthinking it - I trust your judgment, so I'm okay with
> either.

Jakub

5th Ping: [Middle-end][PATCH v4 0/3][RFC]Provide more contexts for -Warray-bounds and -Wstringop-* warning messages

2025-01-13 Thread Qing Zhao

Hi,

This is the 5th ping of the middle end review of the patch set.

Could you please take a look at the patch set for -fdiagnostics-details and
provide more advice on:

A. Whether the middle-end design and implementation are reasonable and
extensible to other optimizations (including loop unrolling)?
B. If not, how to improve the current design to make it more extensible?
C. If yes, whether it's okay to add these patches to GCC 15 and then improve
them in GCC 16?

Thanks.

Qing


> On Dec 23, 2024, at 12:29, Qing Zhao  wrote:
> 
> Hi,  
> 
> This is the 4th ping of the middle end review for the patch set. 
> 
> Really appreciate any comments and suggestions from Middle-end reviewer 
> on this patch (the diagnostic part of the patch has been reviewed and 
> approved already). 
> 
> As far as I know, Kees and Sam have been using this option for a while and
> both found it very helpful.
> 
> Could you please take a look and let me know any issue in the patch? 
> 
> Thanks a lot!
> 
> The latest version (4th version) is:
> https://gcc.gnu.org/pipermail/gcc-patches/2024-November/667613.html
> https://gcc.gnu.org/pipermail/gcc-patches/2024-November/667614.html
> https://gcc.gnu.org/pipermail/gcc-patches/2024-November/667615.html
> https://gcc.gnu.org/pipermail/gcc-patches/2024-November/667616.html
> 
> Qing
> 
>> Begin forwarded message:
>> 
>> From: Sam James 
>> Subject: Re: 3rd Ping: [Middle-end][PATCH v4 0/3][RFC]Provide more contexts 
>> for -Warray-bounds and -Wstringop-* warning messages
>> Date: December 6, 2024 at 13:32:55 EST
>> To: Qing Zhao , Jeff Law 
>> Cc: richard Biener , GCC Patches 
>> , kees Cook , Andrew Pinski 
>> , David Malcolm 
>> 
>> Qing Zhao  writes:
>> 
>>> This is the 3rd ping of the Middle-end review for this patch.
>>> 
>> 
>> Jeff, would you be able to take a look? (In part because I know
>> you've had a lot of comments and feedback on the middle-end warnings
>> before). The diagnostics bits are OK'd already.
>> 
>> I've been running this on distro builds for a few months now and had
>> great results with it so far (including finding some real bugs in
>> packages that I'd previously dismissed as probable-FPs).
>> 
>> I can also chuck it in to our general testing builds if it'd help any.
>> 
>>> Thanks a lot!
>>> 
>>> Qing
>>> 
>>>> On Nov 26, 2024, at 10:30, Qing Zhao  wrote:
>>>> 
>>>> Another ping on the Middle-end review of this patch.
>>>> 
>>>> This patch has been waiting for the middle-end review for a long time.
>>>> 
>>>> Please review it and provide any feedback; I believe that this should be a
>>>> nice improvement to GCC diagnostics in general.
>>>> 
>>>> Thanks.
>>>> 
>>>> Qing
>>>> 
> On Nov 15, 2024, at 10:34, Qing Zhao  wrote:
> 
> Gentle ping on the middle-end review for this patch. 
> 
> There are two parts of this patch:
> 
> 1. Diagnostic part (Part 2), which has been reviewed by David;
> 2. Middle end part (Part 1 and 3), mainly on the copy_history information 
> collection during transformation. 
> 
> Thanks,
> 
> Qing
> 
> 
>> On Nov 5, 2024, at 11:31, Qing Zhao  wrote:
>> 
>> Hi,
>> 
>> This is the 4th version of the patch for fixing PR109071.
>> 
>> Compared to the 3rd version:
>> https://gcc.gnu.org/pipermail/gcc-patches/2024-October/666870.html
>> https://gcc.gnu.org/pipermail/gcc-patches/2024-October/666872.html
>> https://gcc.gnu.org/pipermail/gcc-patches/2024-October/666871.html
>> 
>> The major improvements to this patch are:
>> 
>> 1. Divide the patch into 3 parts:
>> Part 1: Add new data structure move_history, record move_history during
>> transformation;
>> Part 2: In warning analysis, Use the new move_history to form a rich
>> location with a sequence of events, to report more context info
>> of the warnings.
>> Part 3: Add debugging mechanism for move_history.
>> 
>> 2. Major change to the above Part 2, completely rewritten based on 
>> David's
>> new class lazy_diagnostic_path. 
>> 
>> 3. Fix all issues identified by Sam;
>> A. fix PR117375 (Bug in tree-ssa-sink.cc);
>> B. documentation clarification;
>> C. Add all the duplicated PRs in the commit comments;
>> 
>> 4. Bootstrapping GCC with the new -fdiagnostics-details on by default (Init
>> (1)) exposed some ICEs similar to PR117375 in tree-ssa-sink.cc; fixed.
>> 
>> 
>> Bootstrapped and regression tested on both x86 and aarch64.
>> 
>> Please let me know of any comments and suggestions.
>> 
>> Thanks.
>> 
>> Qing
>> Qing Zhao (3):
>> Provide more contexts for -Warray-bounds, -Wstringop-* warning
>> messages due to code movements from compiler transformation (Part 1)
>> [PR109071,PR85788,PR88771,PR106762,PR108770,PR115274,PR117179]
>> Provide more contexts for -Warray-bounds, -Wstringop-* warning
>> messages due to co

Re: [PATCH v5 02/10] OpenMP: Re-work and extend context selector resolution

2025-01-13 Thread Sandra Loosemore


On 1/13/25 16:43, Tobias Burnus wrote:

Hi all,

Tobias Burnus wrote:

Tobias Burnus wrote:

Sandra Loosemore wrote:

This patch reimplements the middle-end support for "declare variant"
and extends the resolution mechanism to also handle metadirectives
(PR112779).  It also adds partial support for dynamic selectors
(PR113904) and fixes a selector scoring bug reported as PR114596.  I hope
this rewrite also improves the engineering aspect of the code, e.g. more
comments to explain what it is doing.


Due to a Clang bug and GCC issuing a warning - and me not finding
the right spot in the spec, I claimed in the C FE thread that
   context={target}
is not fulfilled inside a declare-target function as it is not
inside an 'omp target' construct.

Well, I missed the following line:

OpenMP 6.0, "9.1 OpenMP Contexts" [326:28-30]:


3. For procedures that are determined to be target variants by a declare target
   directive, the target trait is added to the beginning of the construct trait
   set as c1 so the total size of the trait set is increased by one.


Seemingly the original patch author found that line - as GCC does the right
thing for
https://github.com/OpenMP/Examples/blob/main/program_control/sources/metadirective.3.c#L15

#pragma omp begin declare target
void exp_pi_diff(double *d, double my_pi){
#pragma omp metadirective \
when(   construct={target}: distribute parallel for ) \
otherwise(  parallel for simd )

namely, as the function is 'exp_pi_diff' the 'when' is correctly matched
and "distribute parallel for" is used.
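
To spell out the rule quoted from the spec, here is a deliberately simplified
C++ model of how the construct trait set and a construct selector interact.
It is a toy under stated assumptions, not GCC's resolution code: the names
Traits, construct_trait_set and selector_matches are invented for this sketch,
and matching is modelled as "the selector's traits appear, in order, within
the context's construct trait set", which is my reading of the matching rule.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Traits = std::vector<std::string>;

// Hypothetical model: build the construct trait set seen inside a procedure.
// For a procedure that is a target variant via declare target, "target" is
// added at the beginning, as in the quoted OpenMP 6.0 text.
Traits
construct_trait_set (bool declare_target_variant, const Traits &enclosing)
{
  Traits set;
  if (declare_target_variant)
    set.push_back ("target");
  set.insert (set.end (), enclosing.begin (), enclosing.end ());
  return set;
}

// Hypothetical model: a construct selector matches if its traits occur in
// the same relative order (as a subsequence) in the context's trait set.
bool
selector_matches (const Traits &selector, const Traits &context)
{
  std::size_t next = 0;
  for (const std::string &trait : context)
    if (next < selector.size () && trait == selector[next])
      ++next;
  return next == selector.size ();
}

void
metadirective_usage_example ()
{
  // Inside exp_pi_diff (declare target): construct={target} matches.
  assert (selector_matches (Traits{"target"},
                            construct_trait_set (true, Traits{})));
}
```

In this model the same selector does not match in an ordinary host function
with no enclosing target construct, which corresponds to the otherwise clause
being chosen there.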

So far so good, but when compiling it, one gets the confusing warning:

warning: direct calls to an offloadable function containing metadirectives
with a ‘construct={target}’ selector may produce unexpected results


(To the defense of the original patch author, back then the spec (5.0) talked
about "device routines" which is ambiguous whether it applies to functions
marked as declare target - or only those actually running on non-host
devices.)

* * *

As GCC does the right thing - but the warning is highly confusing in light
of the much clearer 6.0 wording, IMHO, we can just remove the following
bits of the patch.

[snip]

Just as a followup to my claim that the 2/10 patch was LGTM: with those
changes, it is then LGTM v2.0. BTW: A later patch in the patch series
added a testcase similar to the OpenMP example document; it needs to be
updated for the removed warning but it will test that this works.

Tobias

PS: Sorry for causing confusion, but I hope it is now sorted out
correctly.


Thanks so much for your help with this patch series!

I'd also noted this change in wording and a later patch in the series 
(OpenMP: Update "declare target"/OpenMP context interaction) adjusts the 
behavior to account for the current 5.0-based implementation only adding 
the "target" construct for "begin declare target" vs the current spec 
which is clear it applies to both "begin declare target" and "declare 
target".  But I'd missed the connection to this bogus warning about 
behavior that was unclear in the original spec but is well-specified now.


My regression-testing of the revised patches isn't going to finish while 
I'm still awake tonight; I'm testing both the parts 1-3 that have been 
approved to commit now, and the whole series that actually exercises the 
new code.  Assuming all goes well I'll push the 3 approved patches and 
post the current version of the whole set in the morning.  (I haven't 
yet had time to address any other comments on the C front-end piece, but 
I have tweaked some test cases for all 3 languages so the last posted 
version is bit-rotten again.)


-Sandra
