date:20240131

Re: [PATCH v2] c++: avoid -Wdangling-reference for std::span-like classes [PR110358]

2024-01-31 Thread Jason Merrill


On 1/31/24 14:44, Alex Coplan wrote:

Hi Marek,

On 30/01/2024 13:15, Marek Polacek wrote:

On Thu, Jan 25, 2024 at 10:13:10PM -0500, Jason Merrill wrote:

On 1/25/24 20:36, Marek Polacek wrote:

Better version:

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
Real-world experience shows that -Wdangling-reference triggers for
user-defined std::span-like classes a lot.  We can easily avoid that
by considering classes like

  template<typename T>
  struct Span {
    T* data_;
    std::size len_;
  };

to be std::span-like, and not warning for them.  Unlike the previous
patch, this one considers a non-union class template that has a pointer
data member and a trivial destructor as std::span-like.
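
To illustrate why exempting such classes is reasonable, here is a minimal sketch that is not part of the patch. The names `storage`, `get` and `sum_ends` are hypothetical. The Span temporary dies at the end of each full expression, but the reference it hands out points into storage that the Span never owned, so the reference does not dangle.

```cpp
#include <cstddef>

// A std::span-like class in the sense described above: a non-union class
// template with a pointer data member and a trivial destructor.
template<typename T>
struct Span {
  T* data_;
  std::size_t len_;

  T& front() const { return data_[0]; }
  T& back() const { return data_[len_ - 1]; }
};

// Hypothetical backing storage that outlives every Span referring to it.
int storage[3] = {1, 2, 3};

// Returns a Span by value.  The temporary is destroyed at the end of the
// full expression, but the ints it points to are not.
Span<int> get() { return Span<int>{storage, 3}; }

int sum_ends() {
  const int& a = get().front();  // refers to storage[0], not to the temporary
  const int& b = get().back();   // refers to storage[2]
  return a + b;
}
```

The heuristic is deliberately loose: it cannot prove that the pointer refers to storage with a longer lifetime, it only assumes that a trivially destructible class template holding a pointer behaves like a view.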

PR c++/110358
PR c++/109640

gcc/cp/ChangeLog:

* call.cc (reference_like_class_p): Don't warn for std::span-like
classes.

gcc/ChangeLog:

* doc/invoke.texi: Update -Wdangling-reference description.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wdangling-reference18.C: New test.
* g++.dg/warn/Wdangling-reference19.C: New test.
* g++.dg/warn/Wdangling-reference20.C: New test.
---
   gcc/cp/call.cc| 18 
   gcc/doc/invoke.texi   | 14 +++
   .../g++.dg/warn/Wdangling-reference18.C   | 24 +++
   .../g++.dg/warn/Wdangling-reference19.C   | 25 +++
   .../g++.dg/warn/Wdangling-reference20.C   | 42 +++
   5 files changed, 123 insertions(+)
   create mode 100644 gcc/testsuite/g++.dg/warn/Wdangling-reference18.C
   create mode 100644 gcc/testsuite/g++.dg/warn/Wdangling-reference19.C
   create mode 100644 gcc/testsuite/g++.dg/warn/Wdangling-reference20.C

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index 9de0d77c423..afd3e1ff024 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -14082,6 +14082,24 @@ reference_like_class_p (tree ctype)
return true;
   }
+  /* Avoid warning if CTYPE looks like std::span: it's a class template,
+ has a T* member, and a trivial destructor.  For example,
+
+  template<typename T>
+  struct Span {
+   T* data_;
+   std::size len_;
+  };
+
+ is considered std::span-like.  */
+  if (NON_UNION_CLASS_TYPE_P (ctype)
+  && CLASSTYPE_TEMPLATE_INSTANTIATION (ctype)
+  && TYPE_HAS_TRIVIAL_DESTRUCTOR (ctype))
+for (tree field = next_aggregate_field (TYPE_FIELDS (ctype));
+field; field = next_aggregate_field (DECL_CHAIN (field)))
+  if (TYPE_PTR_P (TREE_TYPE (field)))
+   return true;
+
 /* Some classes, such as std::tuple, have the reference member in its
(non-direct) base class.  */
 if (dfs_walk_once (TYPE_BINFO (ctype), class_has_reference_member_p_r,
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 6ec56493e59..e0ff18a86f5 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -3916,6 +3916,20 @@ where @code{std::minmax} returns @code{std::pair}, and
   both references dangle after the end of the full expression that contains
   the call to @code{std::minmax}.
+The warning does not warn for @code{std::span}-like classes.  We consider
+classes of the form:
+
+@smallexample
+template<typename T>
+struct Span @{
+  T* data_;
+  std::size len_;
+@};
+@end smallexample
+
+as @code{std::span}-like; that is, the class is a non-union class template
+that has a pointer data member and a trivial destructor.
+
   This warning is enabled by @option{-Wall}.
   @opindex Wdelete-non-virtual-dtor
diff --git a/gcc/testsuite/g++.dg/warn/Wdangling-reference18.C b/gcc/testsuite/g++.dg/warn/Wdangling-reference18.C
new file mode 100644
index 000..e088c177769
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wdangling-reference18.C
@@ -0,0 +1,24 @@
+// PR c++/110358
+// { dg-do compile { target c++11 } }
+// { dg-options "-Wdangling-reference" }
+// Don't warn for std::span-like classes.
+
+template <typename T>
+struct Span {
+T* data_;
+int len_;
+
+[[nodiscard]] constexpr auto operator[](int n) const noexcept -> T& { return data_[n]; }
+[[nodiscard]] constexpr auto front() const noexcept -> T& { return data_[0]; }
+[[nodiscard]] constexpr auto back() const noexcept -> T& { return data_[len_ - 1]; }
+};
+
+auto get() -> Span<int>;
+
+auto f() -> int {
+int const& a = get().front(); // { dg-bogus "dangling reference" }
+int const& b = get().back();  // { dg-bogus "dangling reference" }
+int const& c = get()[0];  // { dg-bogus "dangling reference" }
+
+return a + b + c;
+}
diff --git a/gcc/testsuite/g++.dg/warn/Wdangling-reference19.C b/gcc/testsuite/g++.dg/warn/Wdangling-reference19.C
new file mode 100644
index 000..053467d822f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wdangling-reference19.C
@@ -0,0 +1,25 @@
+// PR c++/110358
+// { dg-do compile { target c++11 } }
+// { dg-options "-Wdangling-reference" }
+// Like Wdangling-reference18.C but not actually a span-like class.
+
+template <typename T>
+struct Span {
+

Re: [PATCH v3 4/5] Add tests for C/C++ musttail attributes

2024-01-31 Thread Marek Polacek

On Tue, Jan 30, 2024 at 06:17:17PM -0800, Andi Kleen wrote:
> Mostly adopted from the existing C musttail plugin tests.

Please add a ChangeLog entry.

> ---
>  gcc/testsuite/c-c++-common/musttail1.c  | 17 
>  gcc/testsuite/c-c++-common/musttail2.c  | 36 +
>  gcc/testsuite/c-c++-common/musttail3.c  | 31 +
>  gcc/testsuite/c-c++-common/musttail4.c  | 19 +
>  gcc/testsuite/gcc.dg/musttail-invalid.c | 17 
>  5 files changed, 120 insertions(+)
>  create mode 100644 gcc/testsuite/c-c++-common/musttail1.c
>  create mode 100644 gcc/testsuite/c-c++-common/musttail2.c
>  create mode 100644 gcc/testsuite/c-c++-common/musttail3.c
>  create mode 100644 gcc/testsuite/c-c++-common/musttail4.c
>  create mode 100644 gcc/testsuite/gcc.dg/musttail-invalid.c
> 
> diff --git a/gcc/testsuite/c-c++-common/musttail1.c 
> b/gcc/testsuite/c-c++-common/musttail1.c
> new file mode 100644
> index ..476185e3ed4b
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/musttail1.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile { target tail_call } } */
> +/* { dg-options "-O2" } */
> +/* { dg-additional-options "-std=c++11" { target c++ } } */

This will run the test only once with -std=c++11.  We'll get better coverage
by dropping the line above and using

/* { dg-do compile { target { tail_call && { c || c++11 } } } } */

but here it may not matter.

> +/* { dg-additional-options "-std=c23" { target c } } */
> +/* { dg-additional-options "-fdelayed-branch" { target sparc*-*-* } } */
> +
> +int __attribute__((noinline,noclone))
> +callee (int i)
> +{
> +  return i * i;
> +}
> +
> +int __attribute__((noinline,noclone))
> +caller (int i)
> +{
> +  [[gnu::musttail]] return callee (i + 1);
> +}
> diff --git a/gcc/testsuite/c-c++-common/musttail2.c 
> b/gcc/testsuite/c-c++-common/musttail2.c
> new file mode 100644
> index ..28f2f68ef13d
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/musttail2.c
> @@ -0,0 +1,36 @@
> +/* { dg-do compile { target tail_call } } */
> +/* { dg-additional-options "-std=c++11" { target c++ } } */
> +/* { dg-additional-options "-std=c23" { target c } } */
> +
> +struct box { char field[256]; int i; };
> +
> +int __attribute__((noinline,noclone))
> +test_2_callee (int i, struct box b)
> +{
> +  if (b.field[0])
> +return 5;
> +  return i * i;
> +}
> +
> +int __attribute__((noinline,noclone))
> +test_2_caller (int i)
> +{
> +  struct box b;
> +  [[gnu::musttail]] return test_2_callee (i + 1, b); /* { dg-error "cannot tail-call: " } */
> +}
> +
> +extern void setjmp (void);
> +void
> +test_3 (void)
> +{
> +  [[gnu::musttail]] return setjmp (); /* { dg-error "cannot tail-call: " } */
> +}
> +
> +typedef void (fn_ptr_t) (void);
> +volatile fn_ptr_t fn_ptr;
> +
> +void
> +test_5 (void)
> +{
> +  [[gnu::musttail]] return fn_ptr (); /* { dg-error "cannot tail-call: " } */
> +}
> diff --git a/gcc/testsuite/c-c++-common/musttail3.c 
> b/gcc/testsuite/c-c++-common/musttail3.c
> new file mode 100644
> index ..fdbb292944ad
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/musttail3.c
> @@ -0,0 +1,31 @@
> +/* { dg-do compile { target tail_call } } */
> +/* { dg-additional-options "-std=c++11" { target c++ } } */
> +/* { dg-additional-options "-std=c23" { target c } } */
> +
> +extern int foo2 (int x, ...);
> +
> +struct str
> +{
> +  int a, b;
> +};
> +
> +struct str
> +cstruct (int x)
> +{
> +  if (x < 10)
> +[[clang::musttail]] return cstruct (x + 1);
> +  return ((struct str){ x, 0 });
> +}
> +
> +int
> +foo (int x)
> +{
> +  if (x < 10)
> +[[clang::musttail]] return foo2 (x, 29);
> +  if (x < 100)
> +{
> +  int k = foo (x + 1);
> +  [[clang::musttail]] return k;  /* { dg-error "cannot tail-call: " } */
> +}
> +  return x;
> +}
> diff --git a/gcc/testsuite/c-c++-common/musttail4.c 
> b/gcc/testsuite/c-c++-common/musttail4.c
> new file mode 100644
> index ..7bf44816f14a
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/musttail4.c
> @@ -0,0 +1,19 @@
> +/* { dg-do compile { target tail_call } } */
> +/* { dg-additional-options "-std=c++11" { target c++ } } */
> +/* { dg-additional-options "-std=c23" { target c } } */
> +
> +struct box { char field[64]; int i; };
> +
> +struct box __attribute__((noinline,noclone))
> +returns_struct (int i)
> +{
> +  struct box b;
> +  b.i = i * i;
> +  return b;
> +}
> +
> +int __attribute__((noinline,noclone))
> +test_1 (int i)
> +{
> +  [[gnu::musttail]] return returns_struct (i * 5).i; /* { dg-error "cannot tail-call: " } */
> +}
> diff --git a/gcc/testsuite/gcc.dg/musttail-invalid.c 
> b/gcc/testsuite/gcc.dg/musttail-invalid.c
> new file mode 100644
> index ..c4725b4b8226
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/musttail-invalid.c

Is there a C++ test for the invalid cases?

> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-options "-std=c23" } */
> +
> +[[musttail]] int j; /* { dg-warning "attribute ignored" } */

Re: [PATCH v3 1/5] Improve must tail in RTL backend

2024-01-31 Thread Andi Kleen

> This results in "error: cannot tail-call: cannot tail-call: other reasons".
> So the second argument should be "other reasons" only.

Yes will fix those. Thanks.

> 
> I notice that if I don't use -O2 I also get "other reasons".  But it should be
> easy-ish to say "cannot tail-call: optimizations not enabled" or so.

It's unfortunately not easy to distinguish. It's not just -O2, but
various missing transformations make tree-tailcall not do its job,
and they could depend on other flags. But there might also be other
reasons not related to optimization that make the tail call fail.
I would be uncomfortable reporting that the problem is -O2 when it might
be something else.

The right fix would be to make tree-tailcall not fail without optimization,
and for the remaining cases add errors there.  But that would
make the patch a lot bigger and it's not clear it would improve usability
that much. So I opted to just mention the problem in the documentation.

-Andi

Re: [PATCH] c-family: Fix ICE with large column number after restoring a PCH [PR105608]

2024-01-31 Thread Jason Merrill


On 1/30/24 21:49, Lewis Hyatt wrote:

On Fri, Jan 26, 2024 at 04:16:54PM -0500, Jason Merrill wrote:

On 12/5/23 20:52, Lewis Hyatt wrote:

Hello-

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105608

There are two related issues here really, a regression since GCC 11 where we
can ICE after restoring a PCH, and a deeper issue with bogus locations
assigned to macros that were defined prior to restoring a PCH.  This patch
fixes the ICE regression with a simple change, and I think it's appropriate
for GCC 14 as well as backport to 11, 12, 13. The bad locations (wrong, but
not generally causing an ICE, and mostly affecting only the output of
-Wunused-macros) are not as problematic, and will be harder to fix. I could
take a stab at that for GCC 15. In the meantime the patch adds XFAILed
tests for the wrong locations (as well as passing tests for the regression
fix). Does it look OK please? Bootstrap + regtest all languages on x86-64
Linux. Thanks!


OK for trunk and branches, thanks!



Thanks for the review! That is all taken care of. I have one more request if
you don't mind please... There have been some further comments on the PR
indicating that the new xfailed testcase I added is failing in an unexpected
way on at least one architecture. To recap, the idea here was that

1) libcpp needs new logic to be able to output correct locations for this
case. That will be some new code that is suitable for stage 1, not now.

2) In the meantime, we fixed things up enough to avoid an ICE that showed up
in GCC 11, and added an xfailed testcase to remind about #1.

The problem is that libcpp outputs the wrong locations because it has always
used a location from the old line_map instance to index into the new line_map
instance, so the exact details of the wrong locations depend on the state of
those two line maps, which may differ depending on system includes and things
like that. So I was hoping to make one further one-line change to libcpp, not
yet to output correct locations, but at least to output one that is always the
same and doesn't depend on random things. This would assign all restored
macros to a consistent location, one line following the #include that
triggered the PCH process. I think this probably shouldn't be backported, but
it would be nice to get into GCC 14; while it is nothing critical, it would at
least avoid the new test failure that's being reported. More generally, I
think using a location from a totally different line map is dangerous and
could have worse consequences that haven't been seen yet. Does it look OK
please? Thanks!
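
The effect described above can be sketched with a toy model. This is only an illustration under simplifying assumptions: `MapEntry`, `LineMap` and `resolve` are hypothetical names, and real libcpp line maps carry far more state. The point is that the same numeric location resolves differently depending on which map it is looked up in.

```cpp
#include <string>
#include <vector>

// Toy model: each entry says "locations >= start belong to this file".
struct MapEntry {
  unsigned start;
  std::string file;
};

// Entries are assumed to be sorted by ascending start.
using LineMap = std::vector<MapEntry>;

// Resolve a location by finding the last entry whose start is <= loc.
std::string resolve(const LineMap& map, unsigned loc) {
  std::string file = "<unknown>";
  for (const MapEntry& e : map) {
    if (e.start > loc)
      break;
    file = e.file;
  }
  return file;
}
```

With an old map `{{0, "macros.h"}, {100, "main.c"}}` and a new map `{{0, "pch.h"}, {40, "sys/types.h"}, {100, "main.c"}}`, location 50 resolves to `macros.h` in the old map but to `sys/types.h` in the new one, and the second answer changes whenever the contents of the new map change.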


Can we use the line of the #include, as the test expects, rather than 
the following line?


Jason

Re: [PATCH v3 2/5] C++: Support clang compatible [[musttail]] (PR83324)

2024-01-31 Thread Andi Kleen

> > For compatibility it also detects clang::musttail
> 
> FWIW, it's not clear to me we should do this.  I don't see a precedent.

It would make existing code just work (as long as they don't use ifdef)

> 
> > One problem is that tree-tailcall usually fails when optimization
> > is disabled, which implies the attribute only really works with
> > optimization on. But that seems to be a reasonable limitation.
> > 
> > Passes bootstrap and full test
> 
> I don't see a ChangeLog entry.

I have them, but will add them to the next post.
> >  static void cp_parser_declaration_statement
> >(cp_parser *);
> >  
> > @@ -12719,9 +12719,27 @@ cp_parser_statement (cp_parser* parser, tree 
> > in_statement_expr,
> >  NULL_TREE, false);
> >   break;
> >  
> > +   case RID_RETURN:
> > + {
> > +   bool musttail_p = false;
> > +   std_attrs = process_stmt_hotness_attribute (std_attrs, attrs_loc);
> > +   if (lookup_attribute ("", "musttail", std_attrs))
> > + {
> > +   musttail_p = true;
> > +   std_attrs = remove_attribute ("", "musttail", std_attrs);
> > + }
> > +   // support this for compatibility
> > +   if (lookup_attribute ("clang", "musttail", std_attrs))
> > + {
> > +   musttail_p = true;
> > +   std_attrs = remove_attribute ("clang", "musttail", std_attrs);
> > + }
> 
> Doing lookup_attribute unconditionally twice seems like a lot.
> You could do just lookup_attribute ("musttail", std_attrs) and then
> check get_attribute_namespace() == nullptr/gnu_identifier?

Actually the common case is 0 and very rarely 1 attribute, and in that case
both lookups are very cheap. If people ever write code with lots of attributes
per line we can worry about optimizations, but at this point it would
seem premature.


> 
> It's not pretty that you have to remove_attribute but I guess we emit
> warnings otherwise?

Yes. 


-Andi

Re: [PATCH v3 1/5] Improve must tail in RTL backend

2024-01-31 Thread Marek Polacek

On Wed, Jan 31, 2024 at 12:16:59PM -0800, Andi Kleen wrote:
> > This results in "error: cannot tail-call: cannot tail-call: other reasons".
> > So the second argument should be "other reasons" only.
> 
> Yes will fix those. Thanks.
> 
> > 
> > I notice that if I don't use -O2 I also get "other reasons".  But it should 
> > be
> > easy-ish to say "cannot tail-call: optimizations not enabled" or so.
> 
> It's unfortunately not easy to distinguish. It's not just -O2, but
> various missing transformations make tree-tailcall not do its job,
> and they could depend on other flags. But there might also be other
> reasons not related to optimization that make the tail call fail.
> I would be uncomfortable reporting that the problem is -O2 when it might
> be something else.
> 
> The right fix would be to make tree-tailcall not fail without optimization,
> and for the remaining cases add errors there.  But that would
> make the patch a lot bigger and it's not clear it would improve usability
> that much. So I opted to just mention the problem in the documentation.

Ah, we don't want that.  I meant to check the simplest case:
  if (optimize < 2)
 ...
but if it's more complicated than that then let's let it be.

Thanks,
Marek

Re: [PATCH V4 2/4] RISC-V: Add vector related pipelines

2024-01-31 Thread Robin Dapp

LGTM, thanks.

Regards
 Robin

Re: [PATCH v3 2/5] C++: Support clang compatible [[musttail]] (PR83324)

2024-01-31 Thread Jakub Jelinek

On Wed, Jan 31, 2024 at 12:21:38PM -0800, Andi Kleen wrote:
> > > + case RID_RETURN:
> > > +   {
> > > + bool musttail_p = false;
> > > + std_attrs = process_stmt_hotness_attribute (std_attrs, attrs_loc);
> > > + if (lookup_attribute ("", "musttail", std_attrs))
> > > +   {
> > > + musttail_p = true;
> > > + std_attrs = remove_attribute ("", "musttail", std_attrs);
> > > +   }

Using "" looks wrong to me, that is for standard attributes which
are also gnu attributes, say [[noreturn]]/[[gnu::noreturn]].
That is not the case here.  Even the __attribute__((musttail)) form will have
gnu namespace.

> > > + // support this for compatibility
> > > + if (lookup_attribute ("clang", "musttail", std_attrs))
> > > +   {
> > > + musttail_p = true;
> > > + std_attrs = remove_attribute ("clang", "musttail", std_attrs);
> > > +   }
> > 
> > Doing lookup_attribute unconditionally twice seems like a lot.
> > You could do just lookup_attribute ("musttail", std_attrs) and then
> > check get_attribute_namespace() == nullptr/gnu_identifier?

I agree with Marek here.  The common case of NULL std_attrs is indeed
already optimized by lookup_attribute, but people write all kinds of
code.  The remove_attribute can be done separately of course.
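
The single-lookup shape being suggested can be sketched on a toy attribute list. This is not GCC code: `Attr` and `has_musttail` are hypothetical stand-ins, and the real attribute list is a tree structure that is not modelled here. The sketch walks the list once by name and only then inspects the namespace of each match.

```cpp
#include <string>
#include <vector>

// Toy stand-in for one attribute: a namespace (empty if none was written)
// and a name.
struct Attr {
  std::string ns;
  std::string name;
};

// Walk the list once, matching on the attribute name, and then decide
// based on the namespace of each match.  Accepting the "clang" namespace
// is optional, mirroring the compatibility question raised in the thread.
bool has_musttail(const std::vector<Attr>& attrs, bool accept_clang) {
  for (const Attr& a : attrs) {
    if (a.name != "musttail")
      continue;
    if (a.ns == "gnu" || (accept_clang && a.ns == "clang"))
      return true;
  }
  return false;
}
```

An empty list returns immediately, which corresponds to the common case of a statement with no attributes at all.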

Though, I'd also prefer not to add clang attributes, just add gnu ones.

Jakub

Re: [PATCH] aarch64: libgcc: Cleanup ELF marking in asm

2024-01-31 Thread Richard Sandiford

Szabolcs Nagy  writes:
> Use aarch64-asm.h in asm code consistently, this was started in
>
>   commit c608ada288ced0268c1fd4136f56c34b24d4
>   Author: Zac Walker 
>   CommitDate: 2024-01-23 15:32:30 +
>
>   Ifdef `.hidden`, `.type`, and `.size` pseudo-ops for `aarch64-w64-mingw32` 
> target
>
> But that commit failed to remove some existing markings from asm files,
> which means some objects got double marked with gnu property notes.
>
> libgcc/ChangeLog:
>
>   * config/aarch64/crti.S: Remove stack marking.
>   * config/aarch64/crtn.S: Remove stack marking, include aarch64-asm.h
>   * config/aarch64/lse.S: Remove stack and GNU property markings.

OK, thanks.

Richard

> ---
>  libgcc/config/aarch64/crti.S |  6 --
>  libgcc/config/aarch64/crtn.S |  6 +-
>  libgcc/config/aarch64/lse.S  | 40 
>  3 files changed, 1 insertion(+), 51 deletions(-)
>
> diff --git a/libgcc/config/aarch64/crti.S b/libgcc/config/aarch64/crti.S
> index b6805b86421..52ca1bb56d6 100644
> --- a/libgcc/config/aarch64/crti.S
> +++ b/libgcc/config/aarch64/crti.S
> @@ -23,12 +23,6 @@
>  
>  #include "aarch64-asm.h"
>  
> -/* An executable stack is *not* required for these functions.  */
> -#if defined(__ELF__) && defined(__linux__)
> -.section .note.GNU-stack,"",%progbits
> -.previous
> -#endif
> -
>  # This file creates a stack frame for the contents of the .fini and
>  # .init sections.  Users may put any desired instructions in those
>  # sections.
> diff --git a/libgcc/config/aarch64/crtn.S b/libgcc/config/aarch64/crtn.S
> index 59f2441032a..67bcfab8564 100644
> --- a/libgcc/config/aarch64/crtn.S
> +++ b/libgcc/config/aarch64/crtn.S
> @@ -21,11 +21,7 @@
>  # see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
>  # .
>  
> -/* An executable stack is *not* required for these functions.  */
> -#if defined(__ELF__) && defined(__linux__)
> -.section .note.GNU-stack,"",%progbits
> -.previous
> -#endif
> +#include "aarch64-asm.h"
>  
>  # This file just makes sure that the .fini and .init sections do in
>  # fact return.  Users may put any desired instructions in those sections.
> diff --git a/libgcc/config/aarch64/lse.S b/libgcc/config/aarch64/lse.S
> index cee1e88c6a4..ecef47086c6 100644
> --- a/libgcc/config/aarch64/lse.S
> +++ b/libgcc/config/aarch64/lse.S
> @@ -315,43 +315,3 @@ STARTFN  NAME(LDNM)
>  
>  ENDFNNAME(LDNM)
>  #endif
> -
> -/* GNU_PROPERTY_AARCH64_* macros from elf.h for use in asm code.  */
> -#define FEATURE_1_AND 0xc000
> -#define FEATURE_1_BTI 1
> -#define FEATURE_1_PAC 2
> -
> -/* Supported features based on the code generation options.  */
> -#if defined(__ARM_FEATURE_BTI_DEFAULT)
> -# define BTI_FLAG FEATURE_1_BTI
> -#else
> -# define BTI_FLAG 0
> -#endif
> -
> -#if __ARM_FEATURE_PAC_DEFAULT & 3
> -# define PAC_FLAG FEATURE_1_PAC
> -#else
> -# define PAC_FLAG 0
> -#endif
> -
> -/* Add a NT_GNU_PROPERTY_TYPE_0 note.  */
> -#define GNU_PROPERTY(type, value)\
> -  .section .note.gnu.property, "a";  \
> -  .p2align 3;\
> -  .word 4;   \
> -  .word 16;  \
> -  .word 5;   \
> -  .asciz "GNU";  \
> -  .word type;\
> -  .word 4;   \
> -  .word value;   \
> -  .word 0;
> -
> -#if defined(__linux__) || defined(__FreeBSD__)
> -.section .note.GNU-stack, "", %progbits
> -
> -/* Add GNU property note if built with branch protection.  */
> -# if (BTI_FLAG|PAC_FLAG) != 0
> -GNU_PROPERTY (FEATURE_1_AND, BTI_FLAG|PAC_FLAG)
> -# endif
> -#endif

Re: [PATCH] aarch64: -mstrict-align vs __arm_data512_t [PR113657]

2024-01-31 Thread Richard Sandiford

Andrew Pinski  writes:
> After r14-1187-gd6b756447cd58b, simplify_gen_subreg can return
> NULL for "unaligned" memory subreg. Since V8DI has an alignment of 8 bytes,
> using TImode causes simplify_gen_subreg to return NULL.
> This fixes the issue by using DImode instead for the loop.  The later
> LDP/STP pass will then combine the accesses back into STP/LDP if needed.
> Since strict alignment is less important (usually used for firmware and
> early boot only), not doing LDP/STP here is OK.
>
> Built and tested for aarch64-linux-gnu with no regressions.
>
>   PR target/113657
>
> gcc/ChangeLog:
>
>   * config/aarch64/aarch64-simd.md (split for movv8di):
>   For strict aligned mode, use DImode instead of TImode.
>
> gcc/testsuite/ChangeLog:
>
>   * gcc.target/aarch64/acle/ls64_strict_align.c: New test.
>
> Signed-off-by: Andrew Pinski 
> ---
>  gcc/config/aarch64/aarch64-simd.md   | 16 
>  .../gcc.target/aarch64/acle/ls64_strict_align.c  |  7 +++
>  2 files changed, 19 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/acle/ls64_strict_align.c
>
> diff --git a/gcc/config/aarch64/aarch64-simd.md 
> b/gcc/config/aarch64/aarch64-simd.md
> index 7a6b4430112..0b4b37115d6 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -8221,14 +8221,22 @@ (define_split
>  || (memory_operand (operands[0], V8DImode)
>  && register_operand (operands[1], V8DImode)))
>  {
> +  int increment = 16;
> +  machine_mode mode = TImode;
> +  /* For strict alignment, use DImode to avoid "unalign" subreg. */
> +  if (STRICT_ALIGNMENT)
> +{
> +   mode = DImode;
> +   increment = 8;
> + }

Sorry for the trivial change, but I think this would be slightly neater as:

  /* V8DI only guarantees 8-byte alignment, whereas TImode requires 16.  */
  auto mode = STRICT_ALIGNMENT ? DImode : TImode;
  int increment = GET_MODE_SIZE (mode);

OK with that change, thanks.
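
As a rough illustration of the suggested form, here is a standalone sketch with toy stand-ins for the machine modes. `Mode`, `mode_size` and `split_offsets` are hypothetical names and not GCC interfaces. Deriving the increment from the mode keeps the two values from drifting apart.

```cpp
#include <vector>

// Toy stand-ins for the two modes discussed; the value is the size in bytes.
enum class Mode { DI = 8, TI = 16 };

constexpr int mode_size(Mode m) { return static_cast<int>(m); }

// Offsets at which a 64-byte value would be split into pieces of the
// chosen mode.  With strict alignment the 8-byte mode is used, since the
// 64-byte value only guarantees 8-byte alignment.
std::vector<int> split_offsets(bool strict_alignment) {
  const Mode mode = strict_alignment ? Mode::DI : Mode::TI;
  const int increment = mode_size(mode);
  std::vector<int> offsets;
  for (int offset = 0; offset < 64; offset += increment)
    offsets.push_back(offset);
  return offsets;
}
```

In this sketch the strict-alignment case yields eight 8-byte pieces and the default case yields four 16-byte pieces.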

Richard

>std::pair last_pair = {};
> -  for (int offset = 0; offset < 64; offset += 16)
> +  for (int offset = 0; offset < 64; offset += increment)
>  {
> std::pair pair = {
> - simplify_gen_subreg (TImode, operands[0], V8DImode, offset),
> - simplify_gen_subreg (TImode, operands[1], V8DImode, offset)
> + simplify_gen_subreg (mode, operands[0], V8DImode, offset),
> + simplify_gen_subreg (mode, operands[1], V8DImode, offset)
> };
> -   if (register_operand (pair.first, TImode)
> +   if (register_operand (pair.first, mode)
> && reg_overlap_mentioned_p (pair.first, pair.second))
>   last_pair = pair;
> else
> diff --git a/gcc/testsuite/gcc.target/aarch64/acle/ls64_strict_align.c 
> b/gcc/testsuite/gcc.target/aarch64/acle/ls64_strict_align.c
> new file mode 100644
> index 000..bf49ac76f78
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/acle/ls64_strict_align.c
> @@ -0,0 +1,7 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mstrict-align" } */
> +/* PR target/113657 */
> +
> +#pragma GCC target "+ls64"
> +#pragma GCC aarch64 "arm_acle.h"
> +__arm_data512_t foo(__arm_data512_t* ptr) { return *ptr; }

Re: [PATCH] aarch64: Fix ICE in poly-int.h due to SLP.

2024-01-31 Thread Richard Sandiford

Richard Ball  writes:
> Hi Prathamesh,
>
> Thanks for the review, I missed that code up above.
> I've looked into this and it seems, to me at least, that what you have
> suggested is equivalent.
> I'll make the change and repost.

The original patch is OK.  Checking in the other loop is too early,
because we want to accept variable-length vectors when repeating_p is true.
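
The ordering concern can be sketched with a toy model. Everything here is an assumption made for illustration: `PolyCount` loosely imitates a two-coefficient poly_int, and `permutation_supported` is a hypothetical function, not the vectorizer's code. The constant-length checks are applied only on the path where repeating_p is false, so variable-length vectors are still accepted when it is true.

```cpp
#include <vector>

// Toy element count: value = coeffs[0] + coeffs[1] * x, where x is only
// known at run time.  The count is a compile-time constant iff coeffs[1]
// is zero.
struct PolyCount {
  unsigned long coeffs[2];
  bool is_constant() const { return coeffs[1] == 0; }
};

// Returns true if this toy permutation can be handled.
bool permutation_supported(bool repeating_p, const PolyCount& out,
                           const std::vector<PolyCount>& children) {
  // A repeating pattern is fine for variable-length vectors, so no
  // constant-length requirement is imposed on this path.
  if (repeating_p)
    return true;
  // Otherwise both the output and every input must have constant length.
  if (!out.is_constant())
    return false;
  for (const PolyCount& child : children)
    if (!child.is_constant())
      return false;
  return true;
}
```

Moving the child check ahead of the `repeating_p` test would reject the first case below, which is the behaviour the reply wants to avoid.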

Like Prathamesh says, please add the testcase to the testsuite.

Thanks,
Richard

>
> Thanks,
> Richard
> ---
> From: Prathamesh Kulkarni 
> Sent: 30 January 2024 17:36
> To: Richard Ball 
> Cc: gcc-patches@gcc.gnu.org ; Richard Sandiford
> ; Kyrylo Tkachov ; Richard
> Earnshaw ; Marcus Shawcroft
> 
> Subject: Re: [PATCH] aarch64: Fix ICE in poly-int.h due to SLP.
>  
> On Tue, 30 Jan 2024 at 20:13, Richard Ball  wrote:
>>
>> Adds a check to ensure that the input vector arguments
>> to a function are not variable length. Previously, only the
>> output vector of a function was checked.
> Hi,
> Quoting from patch:
> @@ -8989,6 +8989,14 @@ vectorizable_slp_permutation_1 (vec_info
> *vinfo, gimple_stmt_iterator *gsi,
>instead of relying on the pattern described above.  */
>if (!nunits.is_constant (&npatterns))
>   return -1;
> +  FOR_EACH_VEC_ELT (children, i, child)
> + if (SLP_TREE_VECTYPE (child))
> +   {
> + tree child_vectype = SLP_TREE_VECTYPE (child);
> + poly_uint64 child_nunits = TYPE_VECTOR_SUBPARTS (child_vectype);
> + if (!child_nunits.is_constant ())
> +   return -1;
> +   }
>
> Just wondering if that'd be equivalent to checking:
> if (!TYPE_VECTOR_SUBPARTS (op_vectype).is_constant ())
>   return -1;
> Instead of (again) iterating over children since we bail out in the
> function above,
> if SLP_TREE_VECTYPE (child) and op_vectype are not compatible types ?
>
> Also, could you please include the offending test-case in the patch ?
>
> Thanks,
> Prathamesh
>
>>
>> gcc/ChangeLog:
>>
>> * tree-vect-slp.cc (vectorizable_slp_permutation_1):
>> Add variable-length check for vector input arguments
>> to a function.

Re: [PATCH v2] c++: avoid -Wdangling-reference for std::span-like classes [PR110358]

2024-01-31 Thread Marek Polacek

On Wed, Jan 31, 2024 at 07:44:41PM +, Alex Coplan wrote:
> Hi Marek,
> 
> On 30/01/2024 13:15, Marek Polacek wrote:
> > On Thu, Jan 25, 2024 at 10:13:10PM -0500, Jason Merrill wrote:
> > > On 1/25/24 20:36, Marek Polacek wrote:
> > > > Better version:
> > > > 
> > > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> > > > 
> > > > -- >8 --
> > > > Real-world experience shows that -Wdangling-reference triggers for
> > > > user-defined std::span-like classes a lot.  We can easily avoid that
> > > > by considering classes like
> > > > 
> > > > >  template<typename T>
> > > >  struct Span {
> > > >T* data_;
> > > >std::size len_;
> > > >  };
> > > > 
> > > > to be std::span-like, and not warning for them.  Unlike the previous
> > > > patch, this one considers a non-union class template that has a pointer
> > > > data member and a trivial destructor as std::span-like.
> > > > 
> > > > PR c++/110358
> > > > PR c++/109640
> > > > 
> > > > gcc/cp/ChangeLog:
> > > > 
> > > > * call.cc (reference_like_class_p): Don't warn for 
> > > > std::span-like
> > > > classes.
> > > > 
> > > > gcc/ChangeLog:
> > > > 
> > > > * doc/invoke.texi: Update -Wdangling-reference description.
> > > > 
> > > > gcc/testsuite/ChangeLog:
> > > > 
> > > > * g++.dg/warn/Wdangling-reference18.C: New test.
> > > > * g++.dg/warn/Wdangling-reference19.C: New test.
> > > > * g++.dg/warn/Wdangling-reference20.C: New test.
> > > > ---
> > > >   gcc/cp/call.cc| 18 
> > > >   gcc/doc/invoke.texi   | 14 +++
> > > >   .../g++.dg/warn/Wdangling-reference18.C   | 24 +++
> > > >   .../g++.dg/warn/Wdangling-reference19.C   | 25 +++
> > > >   .../g++.dg/warn/Wdangling-reference20.C   | 42 +++
> > > >   5 files changed, 123 insertions(+)
> > > >   create mode 100644 gcc/testsuite/g++.dg/warn/Wdangling-reference18.C
> > > >   create mode 100644 gcc/testsuite/g++.dg/warn/Wdangling-reference19.C
> > > >   create mode 100644 gcc/testsuite/g++.dg/warn/Wdangling-reference20.C
> > > > 
> > > > diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
> > > > index 9de0d77c423..afd3e1ff024 100644
> > > > --- a/gcc/cp/call.cc
> > > > +++ b/gcc/cp/call.cc
> > > > @@ -14082,6 +14082,24 @@ reference_like_class_p (tree ctype)
> > > > return true;
> > > >   }
> > > > +  /* Avoid warning if CTYPE looks like std::span: it's a class 
> > > > template,
> > > > + has a T* member, and a trivial destructor.  For example,
> > > > +
> > > > > +  template<typename T>
> > > > +  struct Span {
> > > > +   T* data_;
> > > > +   std::size len_;
> > > > +  };
> > > > +
> > > > + is considered std::span-like.  */
> > > > +  if (NON_UNION_CLASS_TYPE_P (ctype)
> > > > +  && CLASSTYPE_TEMPLATE_INSTANTIATION (ctype)
> > > > +  && TYPE_HAS_TRIVIAL_DESTRUCTOR (ctype))
> > > > +for (tree field = next_aggregate_field (TYPE_FIELDS (ctype));
> > > > +field; field = next_aggregate_field (DECL_CHAIN (field)))
> > > > +  if (TYPE_PTR_P (TREE_TYPE (field)))
> > > > +   return true;
> > > > +
> > > > /* Some classes, such as std::tuple, have the reference member in 
> > > > its
> > > >(non-direct) base class.  */
> > > > if (dfs_walk_once (TYPE_BINFO (ctype), 
> > > > class_has_reference_member_p_r,
> > > > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> > > > index 6ec56493e59..e0ff18a86f5 100644
> > > > --- a/gcc/doc/invoke.texi
> > > > +++ b/gcc/doc/invoke.texi
> > > > @@ -3916,6 +3916,20 @@ where @code{std::minmax} returns 
> > > > @code{std::pair}, and
> > > >   both references dangle after the end of the full expression that 
> > > > contains
> > > >   the call to @code{std::minmax}.
> > > > +The warning does not warn for @code{std::span}-like classes.  We 
> > > > consider
> > > > +classes of the form:
> > > > +
> > > > +@smallexample
> > > > > +template<typename T>
> > > > +struct Span @{
> > > > +  T* data_;
> > > > +  std::size len_;
> > > > +@};
> > > > +@end smallexample
> > > > +
> > > > +as @code{std::span}-like; that is, the class is a non-union class 
> > > > template
> > > > +that has a pointer data member and a trivial destructor.
> > > > +
> > > >   This warning is enabled by @option{-Wall}.
> > > >   @opindex Wdelete-non-virtual-dtor
> > > > diff --git a/gcc/testsuite/g++.dg/warn/Wdangling-reference18.C 
> > > > b/gcc/testsuite/g++.dg/warn/Wdangling-reference18.C
> > > > new file mode 100644
> > > > index 000..e088c177769
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/g++.dg/warn/Wdangling-reference18.C
> > > > @@ -0,0 +1,24 @@
> > > > +// PR c++/110358
> > > > +// { dg-do compile { target c++11 } }
> > > > +// { dg-options "-Wdangling-reference" }
> > > > +// Don't warn for std::span-like classes.
> > > > +
> > > > > +template <typename T>
> > > > +struct Span {
> > > > +T* data_;
> > > > +int len_;
> > > > +
> > > > +

Re: [PATCH v2] c++: avoid -Wdangling-reference for std::span-like classes [PR110358]

2024-01-31 Thread Marek Polacek

On Wed, Jan 31, 2024 at 02:57:09PM -0500, Jason Merrill wrote:
> On 1/31/24 14:44, Alex Coplan wrote:
> > Hi Marek,
> > 
> > On 30/01/2024 13:15, Marek Polacek wrote:
> > > On Thu, Jan 25, 2024 at 10:13:10PM -0500, Jason Merrill wrote:
> > > > On 1/25/24 20:36, Marek Polacek wrote:
> > > > > Better version:
> > > > > 
> > > > > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> > > > > 
> > > > > -- >8 --
> > > > > Real-world experience shows that -Wdangling-reference triggers for
> > > > > user-defined std::span-like classes a lot.  We can easily avoid that
> > > > > by considering classes like
> > > > > 
> > > > > >   template<typename T>
> > > > >   struct Span {
> > > > > T* data_;
> > > > > std::size len_;
> > > > >   };
> > > > > 
> > > > > to be std::span-like, and not warning for them.  Unlike the previous
> > > > > patch, this one considers a non-union class template that has a 
> > > > > pointer
> > > > > data member and a trivial destructor as std::span-like.
> > > > > 
> > > > >   PR c++/110358
> > > > >   PR c++/109640
> > > > > 
> > > > > gcc/cp/ChangeLog:
> > > > > 
> > > > >   * call.cc (reference_like_class_p): Don't warn for 
> > > > > std::span-like
> > > > >   classes.
> > > > > 
> > > > > gcc/ChangeLog:
> > > > > 
> > > > >   * doc/invoke.texi: Update -Wdangling-reference description.
> > > > > 
> > > > > gcc/testsuite/ChangeLog:
> > > > > 
> > > > >   * g++.dg/warn/Wdangling-reference18.C: New test.
> > > > >   * g++.dg/warn/Wdangling-reference19.C: New test.
> > > > >   * g++.dg/warn/Wdangling-reference20.C: New test.
> > > > > ---
> > > > >gcc/cp/call.cc| 18 
> > > > >gcc/doc/invoke.texi   | 14 +++
> > > > >.../g++.dg/warn/Wdangling-reference18.C   | 24 +++
> > > > >.../g++.dg/warn/Wdangling-reference19.C   | 25 +++
> > > > >.../g++.dg/warn/Wdangling-reference20.C   | 42 
> > > > > +++
> > > > >5 files changed, 123 insertions(+)
> > > > >create mode 100644 
> > > > > gcc/testsuite/g++.dg/warn/Wdangling-reference18.C
> > > > >create mode 100644 
> > > > > gcc/testsuite/g++.dg/warn/Wdangling-reference19.C
> > > > >create mode 100644 
> > > > > gcc/testsuite/g++.dg/warn/Wdangling-reference20.C
> > > > > 
> > > > > diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
> > > > > index 9de0d77c423..afd3e1ff024 100644
> > > > > --- a/gcc/cp/call.cc
> > > > > +++ b/gcc/cp/call.cc
> > > > > @@ -14082,6 +14082,24 @@ reference_like_class_p (tree ctype)
> > > > >   return true;
> > > > >}
> > > > > +  /* Avoid warning if CTYPE looks like std::span: it's a class 
> > > > > template,
> > > > > + has a T* member, and a trivial destructor.  For example,
> > > > > +
> > > > > > +  template<typename T>
> > > > > +  struct Span {
> > > > > + T* data_;
> > > > > + std::size len_;
> > > > > +  };
> > > > > +
> > > > > + is considered std::span-like.  */
> > > > > +  if (NON_UNION_CLASS_TYPE_P (ctype)
> > > > > +  && CLASSTYPE_TEMPLATE_INSTANTIATION (ctype)
> > > > > +  && TYPE_HAS_TRIVIAL_DESTRUCTOR (ctype))
> > > > > +for (tree field = next_aggregate_field (TYPE_FIELDS (ctype));
> > > > > +  field; field = next_aggregate_field (DECL_CHAIN (field)))
> > > > > +  if (TYPE_PTR_P (TREE_TYPE (field)))
> > > > > + return true;
> > > > > +
> > > > >  /* Some classes, such as std::tuple, have the reference member 
> > > > > in its
> > > > > (non-direct) base class.  */
> > > > >  if (dfs_walk_once (TYPE_BINFO (ctype), 
> > > > > class_has_reference_member_p_r,
> > > > > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> > > > > index 6ec56493e59..e0ff18a86f5 100644
> > > > > --- a/gcc/doc/invoke.texi
> > > > > +++ b/gcc/doc/invoke.texi
> > > > > @@ -3916,6 +3916,20 @@ where @code{std::minmax} returns 
> > > > > @code{std::pair}, and
> > > > >both references dangle after the end of the full expression that 
> > > > > contains
> > > > >the call to @code{std::minmax}.
> > > > > +The warning does not warn for @code{std::span}-like classes.  We 
> > > > > consider
> > > > > +classes of the form:
> > > > > +
> > > > > +@smallexample
> > > > > > +template<typename T>
> > > > > +struct Span @{
> > > > > +  T* data_;
> > > > > +  std::size len_;
> > > > > +@};
> > > > > +@end smallexample
> > > > > +
> > > > > +as @code{std::span}-like; that is, the class is a non-union class 
> > > > > template
> > > > > +that has a pointer data member and a trivial destructor.
> > > > > +
> > > > >This warning is enabled by @option{-Wall}.
> > > > >@opindex Wdelete-non-virtual-dtor
> > > > > diff --git a/gcc/testsuite/g++.dg/warn/Wdangling-reference18.C 
> > > > > b/gcc/testsuite/g++.dg/warn/Wdangling-reference18.C
> > > > > new file mode 100644
> > > > > index 000..e088c177769
> > > > > --- /dev/null
> > > > > +++ b/gcc/testsuite/g++.dg/warn/Wdangling-referen

Re: [PATCH] c++: ttp CTAD equivalence [PR112737]

2024-01-31 Thread Patrick Palka

On Wed, 31 Jan 2024, Jason Merrill wrote:

> On 1/31/24 12:12, Patrick Palka wrote:
> > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> > trunk?
> > 
> > -- >8 --
> > 
> > Here during declaration matching we undesirably consider the two TT{42}
> > CTAD expressions to be non-equivalent ultimately because for CTAD
> > placeholder equivalence we compare the TEMPLATE_DECLs (which uses
> > pointer identity) and here the corresponding TEMPLATE_DECLs for TT are
> > different since they're from different scopes.  On the other hand, the
> > corresponding TEMPLATE_TEMPLATE_PARMs are deemed equivalent (since they
> > have the same position and template parameters).  This turns out to be
> > the root cause of some of the xtreme-header modules regressions.
> > 
> > We don't have this ttp equivalence issue in other contexts because either
> > the TEMPLATE_TEMPLATE_PARM is used instead of the TEMPLATE_DECL already
> > (e.g. when a ttp is used as a targ), or the equivalence logic is relaxed
> > (e.g. for bound ttps), it seems.
> > 
> > So this patch relaxes ttp CTAD placeholder equivalence accordingly, by
> > comparing the TEMPLATE_TEMPLATE_PARM instead of the TEMPLATE_DECL.  The
> > ctp_hasher doesn't need to be adjusted since it currently doesn't include
> > CLASS_PLACEHOLDER_TEMPLATE in the hash anyway.
> 
> Maybe put this handling in cp_tree_equal and call it from here?  Does
> iterative_hash_template_arg need something similar?

I was hoping cp_tree_equal would never be called for a ttp TEMPLATE_DECL
after this patch, and so it wouldn't matter either way, but it turns out
to matter for a function template-id:

  template<template<class> class, class T>
  void g(T);

  template<template<class> class TT, class T>
  decltype(g<TT>(T{})) f(T); // #1

  template<template<class> class TT, class T>
  decltype(g<TT>(T{})) f(T); // redeclaration of #1

  template<class T> struct A { A(T); };

  int main() {
    f<A>(0);
  }

Here we represent TT in g<TT> as a TEMPLATE_DECL because it's not until
coercion that convert_template_argument turns it into a
TEMPLATE_TEMPLATE_PARM, but of course we can't coerce until the call is
non-dependent and we know which function we're calling.  (So TT within
a class, variable or alias template-id would be represented as a
TEMPLATE_TEMPLATE_PARM since we can do coercion ahead of time in that
case.)

So indeed it seems desirable to handle this in cp_tree_equal... like so?
Bootstrap and regtest nearly finished.

-- >8 --

PR c++/112737

gcc/cp/ChangeLog:

* pt.cc (iterative_hash_template_arg) <case TEMPLATE_DECL>:
Adjust hashing to match cp_tree_equal.
(ctp_hasher::hash): Also hash CLASS_PLACEHOLDER_TEMPLATE.
* tree.cc (cp_tree_equal) <case TEMPLATE_DECL>: Return true
for ttp TEMPLATE_DECLs if their TEMPLATE_TEMPLATE_PARMs are
equivalent.
* typeck.cc (structural_comptypes) :
Use cp_tree_equal to compare CLASS_PLACEHOLDER_TEMPLATE.

gcc/testsuite/ChangeLog:

* g++.dg/template/ttp42.C: New test.
* g++.dg/template/ttp42a.C: New test.
---
 gcc/cp/pt.cc   |  9 +
 gcc/cp/tree.cc |  6 +-
 gcc/cp/typeck.cc   |  4 ++--
 gcc/testsuite/g++.dg/template/ttp42.C  | 14 ++
 gcc/testsuite/g++.dg/template/ttp42a.C | 18 ++
 5 files changed, 48 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/template/ttp42.C
 create mode 100644 gcc/testsuite/g++.dg/template/ttp42a.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 5871cb668d0..ca454758ca7 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -1816,6 +1816,13 @@ iterative_hash_template_arg (tree arg, hashval_t val)
}
   return iterative_hash_template_arg (TREE_TYPE (arg), val);
 
+case TEMPLATE_DECL:
+  if (DECL_TEMPLATE_TEMPLATE_PARM_P (arg))
+   return iterative_hash_template_arg (TREE_TYPE (arg), val);
+  else
+   /* Hash it like any other declaration.  */
+   break;
+
 case TARGET_EXPR:
   return iterative_hash_template_arg (TARGET_EXPR_INITIAL (arg), val);
 
@@ -4499,6 +4506,8 @@ struct ctp_hasher : ggc_ptr_hash<tree_node>
 hashval_t val = iterative_hash_object (code, 0);
 val = iterative_hash_object (TEMPLATE_TYPE_LEVEL (t), val);
 val = iterative_hash_object (TEMPLATE_TYPE_IDX (t), val);
+if (TREE_CODE (t) == TEMPLATE_TYPE_PARM)
+  val = iterative_hash_template_arg (CLASS_PLACEHOLDER_TEMPLATE (t), val);
 if (TREE_CODE (t) == BOUND_TEMPLATE_TEMPLATE_PARM)
   val = iterative_hash_template_arg (TYPE_TI_ARGS (t), val);
 --comparing_specializations;
diff --git a/gcc/cp/tree.cc b/gcc/cp/tree.cc
index 77f57e0f9ac..5c8c05dc168 100644
--- a/gcc/cp/tree.cc
+++ b/gcc/cp/tree.cc
@@ -4084,11 +4084,15 @@ cp_tree_equal (tree t1, tree t2)
}
   return false;
 
+case TEMPLATE_DECL:
+  if (DECL_TEMPLATE_TEMPLATE_PARM_P (t1)
+ && DECL_TEMPLATE_TEMPLATE_PARM_P (t2))
+   return cp_tree_equal (TREE_TYPE (t1), TREE_TYPE (t2));
+  /* Fall through.  */
 case

Re: [PATCH] c++: ttp CTAD equivalence [PR112737]

2024-01-31 Thread Patrick Palka

On Wed, 31 Jan 2024, Patrick Palka wrote:

> On Wed, 31 Jan 2024, Jason Merrill wrote:
> 
> > On 1/31/24 12:12, Patrick Palka wrote:
> > > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> > > trunk?
> > > 
> > > -- >8 --
> > > 
> > > Here during declaration matching we undesirably consider the two TT{42}
> > > CTAD expressions to be non-equivalent ultimately because for CTAD
> > > placeholder equivalence we compare the TEMPLATE_DECLs (which uses
> > > pointer identity) and here the corresponding TEMPLATE_DECLs for TT are
> > > different since they're from different scopes.  On the other hand, the
> > > corresponding TEMPLATE_TEMPLATE_PARMs are deemed equivalent (since they
> > > have the same position and template parameters).  This turns out to be
> > > the root cause of some of the xtreme-header modules regressions.
> > > 
> > > We don't have this ttp equivalence issue in other contexts because either
> > > the TEMPLATE_TEMPLATE_PARM is used instead of the TEMPLATE_DECL already
> > > (e.g. when a ttp is used as a targ), or the equivalence logic is relaxed
> > > (e.g. for bound ttps), it seems.
> > > 
> > > So this patch relaxes ttp CTAD placeholder equivalence accordingly, by
> > > comparing the TEMPLATE_TEMPLATE_PARM instead of the TEMPLATE_DECL.  The
> > > ctp_hasher doesn't need to be adjusted since it currently doesn't include
> > > CLASS_PLACEHOLDER_TEMPLATE in the hash anyway.
> > 
> > Maybe put this handling in cp_tree_equal and call it from here?  Does
> > iterative_hash_template_arg need something similar?
> 
> I was hoping cp_tree_equal would never be called for a ttp TEMPLATE_DECL
> after this patch, and so it wouldn't matter either way, but it turns out
> to matter for a function template-id:
> 
>   template<template<class> class, class T>
>   void g(T);
> 
>   template<template<class> class TT, class T>
>   decltype(g<TT>(T{})) f(T); // #1
> 
>   template<template<class> class TT, class T>
>   decltype(g<TT>(T{})) f(T); // redeclaration of #1
> 
>   template<class T> struct A { A(T); };
> 
>   int main() {
>     f<A>(0);
>   }
> 
> Here we represent TT in g<TT> as a TEMPLATE_DECL because it's not until
> coercion that convert_template_argument turns it into a
> TEMPLATE_TEMPLATE_PARM, but of course we can't coerce until the call is
> non-dependent and we know which function we're calling.  (So TT within
> a class, variable or alias template-id would be represented as a
> TEMPLATE_TEMPLATE_PARM since we can do coercion ahead of time in that
> case.)
> 
> So indeed it seems desirable to handle this in cp_tree_equal... like so?
> Bootstrap and regtest nearly finished.
> 
> -- >8 --
> 
>   PR c++/112737
> 
> gcc/cp/ChangeLog:
> 
>   * pt.cc (iterative_hash_template_arg) <case TEMPLATE_DECL>:
>   Adjust hashing to match cp_tree_equal.
>   (ctp_hasher::hash): Also hash CLASS_PLACEHOLDER_TEMPLATE.

I forgot to mention this change to ctp_hasher::hash is just a drive-by
optimization, so that CTAD placeholders don't all get the same hash.
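
As a rough illustration of why the hashing has to follow the equality
predicate, here is a self-contained sketch. It is not GCC code: Ttp,
TtpEqual, TtpHash and equivalent_ttp_is_found are hypothetical names
standing in for the tree nodes, cp_tree_equal and the hasher. Two
distinct objects that compare equal by position must hash alike, which
a pointer-identity hash would not guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_set>

// Hypothetical stand-in for a template template parameter: declarations
// from different scopes are distinct objects with the same level/index.
struct Ttp {
  int level;
  int index;
};

// Relaxed equivalence: compare position, not object identity.
struct TtpEqual {
  bool operator()(const Ttp* a, const Ttp* b) const {
    return a->level == b->level && a->index == b->index;
  }
};

// The hash uses exactly the fields that equality compares.  Hashing the
// pointer value instead could place equivalent objects in different buckets.
struct TtpHash {
  std::size_t operator()(const Ttp* t) const {
    return std::hash<int>{}(t->level) * 31u + std::hash<int>{}(t->index);
  }
};

// Returns true if a distinct but equivalent object is found in a set
// that only contains the first one.
bool equivalent_ttp_is_found() {
  Ttp first{1, 0};
  Ttp second{1, 0};
  std::unordered_set<const Ttp*, TtpHash, TtpEqual> seen;
  seen.insert(&first);
  return seen.count(&second) == 1;
}
```

Adding more distinguishing fields to the hash (as long as equality also
compares them) only reduces collisions; it does not change which
objects are considered equal.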

>   * tree.cc (cp_tree_equal) <case TEMPLATE_DECL>: Return true
>   for ttp TEMPLATE_DECLs if their TEMPLATE_TEMPLATE_PARMs are
>   equivalent.
>   * typeck.cc (structural_comptypes) :
>   Use cp_tree_equal to compare CLASS_PLACEHOLDER_TEMPLATE.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/template/ttp42.C: New test.
>   * g++.dg/template/ttp42a.C: New test.
> ---
>  gcc/cp/pt.cc   |  9 +
>  gcc/cp/tree.cc |  6 +-
>  gcc/cp/typeck.cc   |  4 ++--
>  gcc/testsuite/g++.dg/template/ttp42.C  | 14 ++
>  gcc/testsuite/g++.dg/template/ttp42a.C | 18 ++
>  5 files changed, 48 insertions(+), 3 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/template/ttp42.C
>  create mode 100644 gcc/testsuite/g++.dg/template/ttp42a.C
> 
> diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
> index 5871cb668d0..ca454758ca7 100644
> --- a/gcc/cp/pt.cc
> +++ b/gcc/cp/pt.cc
> @@ -1816,6 +1816,13 @@ iterative_hash_template_arg (tree arg, hashval_t val)
>   }
>return iterative_hash_template_arg (TREE_TYPE (arg), val);
>  
> +case TEMPLATE_DECL:
> +  if (DECL_TEMPLATE_TEMPLATE_PARM_P (arg))
> + return iterative_hash_template_arg (TREE_TYPE (arg), val);
> +  else
> + /* Hash it like any other declaration.  */
> + break;
> +
>  case TARGET_EXPR:
>return iterative_hash_template_arg (TARGET_EXPR_INITIAL (arg), val);
>  
> @@ -4499,6 +4506,8 @@ struct ctp_hasher : ggc_ptr_hash<tree_node>
>  hashval_t val = iterative_hash_object (code, 0);
>  val = iterative_hash_object (TEMPLATE_TYPE_LEVEL (t), val);
>  val = iterative_hash_object (TEMPLATE_TYPE_IDX (t), val);
> +if (TREE_CODE (t) == TEMPLATE_TYPE_PARM)
> +  val = iterative_hash_template_arg (CLASS_PLACEHOLDER_TEMPLATE (t), 
> val);
>  if (TREE_CODE (t) == BOUND_TEMPLATE_TEMPLATE_PARM)
>val = iterative_hash_template_arg (TYPE_TI_ARGS (t), val);
>  --comparing_specializations;
> diff --git a/gcc/cp/tree.cc b/gcc/cp/tree.c

[RFC PATCH 0/1] Nix Environment Support for GCC Development

2024-01-31 Thread Vincenzo Palazzo

I am submitting a revised version of the patch discussed in [1] for
inclusion in the mainline GCC repository.

The only change is that this revision incorporates the suggestion made in [2].

[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639235.html
[2] https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639308.html

Cheers,

Vincent.

Vincenzo Palazzo (1):
  nix: add a simple flake nix shell

 .gitignore|  1 +
 contrib/nix/flake.nix | 35 +++
 2 files changed, 36 insertions(+)
 create mode 100644 contrib/nix/flake.nix

-- 
2.43.0

[committed] c: Fix ICE for nested enum redefinitions with/without fixed underlying type [PR112571]

2024-01-31 Thread Joseph Myers

Bug 112571 reports an ICE-on-invalid for cases where an enum is
defined, without a fixed underlying type, inside the enum type
specifier for a definition of that same enum with a fixed underlying
type.

The ultimate cause is attempting to access ENUM_UNDERLYING_TYPE in a
case where it is NULL.  Avoid this by clearing
ENUM_FIXED_UNDERLYING_TYPE_P in the case of inconsistent definitions.
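
The recovery strategy can be sketched outside the compiler as follows.
This is only an illustration under assumed names (EnumInfo,
start_definition, underlying_name_length and
recovery_avoids_null_access are hypothetical, and the real front end
works on tree nodes): a flag promises that a pointer is usable, so
clearing the flag on the error path keeps later consumers from
dereferencing a null pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical model of an enum under construction.
struct EnumInfo {
  bool has_fixed_underlying_type = false;
  const std::string* underlying_type = nullptr;  // null until known
};

// Models starting a definition.  If the enum was declared with a fixed
// underlying type but is now defined without one, an error would be
// reported and the flag is cleared so it no longer promises a type.
void start_definition(EnumInfo& e, const std::string* fixed_type) {
  if (e.has_fixed_underlying_type && fixed_type == nullptr)
    e.has_fixed_underlying_type = false;
}

// A later consumer that trusts the flag before touching the pointer.
std::size_t underlying_name_length(const EnumInfo& e) {
  return e.has_fixed_underlying_type ? e.underlying_type->size() : 0;
}

// Flag set but type not yet known, then a definition without a fixed
// type: after recovery the consumer never dereferences the null pointer.
bool recovery_avoids_null_access() {
  EnumInfo e;
  e.has_fixed_underlying_type = true;
  start_definition(e, nullptr);
  return underlying_name_length(e) == 0;
}
```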

Bootstrapped with no regressions for x86_64-pc-linux-gnu.

PR c/112571

gcc/c/
* c-decl.cc (start_enum): Clear ENUM_FIXED_UNDERLYING_TYPE_P when
defining without a fixed underlying type an enumeration previously
declared with a fixed underlying type.

gcc/testsuite/
* gcc.dg/c23-enum-9.c, gcc.dg/c23-enum-10.c: New tests.

---

Applied to mainline.  Should also be backported to GCC 13 branch (the
oldest version with support for enums with fixed underlying types),
after waiting to see if any problems arise with the patch on mainline,
subject to changing -std=c23 to -std=c2x for the older version and
making sure the patch does indeed work on the older version (there
have been significant changes relating to redefinitions of tagged
types in GCC 14 as part of Martin's tag compatibility work).

diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index 8d18a3e11f4..934e557dc3b 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -9905,8 +9905,11 @@ start_enum (location_t loc, struct c_enum_contents 
*the_enum, tree name,
 
   if (ENUM_FIXED_UNDERLYING_TYPE_P (enumtype)
   && fixed_underlying_type == NULL_TREE)
-error_at (loc, "%<enum%> declared with but defined without "
- "fixed underlying type");
+{
+  error_at (loc, "%<enum%> declared with but defined without "
+   "fixed underlying type");
+  ENUM_FIXED_UNDERLYING_TYPE_P (enumtype) = false;
+}
 
   the_enum->enum_next_value = integer_zero_node;
   the_enum->enum_type = enumtype;
diff --git a/gcc/testsuite/gcc.dg/c23-enum-10.c 
b/gcc/testsuite/gcc.dg/c23-enum-10.c
new file mode 100644
index 000..dd5f3453b1f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c23-enum-10.c
@@ -0,0 +1,6 @@
+/* PR c/112571.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c23" } */
+
+enum X : typeof (enum X { A }); /* { dg-error "declared with but defined 
without fixed underlying type" } */
+/* { dg-error "invalid 'enum' underlying type" "invalid" { target *-*-* } .-1 
} */
diff --git a/gcc/testsuite/gcc.dg/c23-enum-9.c 
b/gcc/testsuite/gcc.dg/c23-enum-9.c
new file mode 100644
index 000..10bb493ca3c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c23-enum-9.c
@@ -0,0 +1,8 @@
+/* PR c/112571.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c23" } */
+
+enum h : typeof (enum h { D }) { D }; /* { dg-error "declared with but defined 
without fixed underlying type" } */
+/* { dg-error "invalid 'enum' underlying type" "invalid" { target *-*-* } .-1 
} */
+/* { dg-error "nested redefinition" "nested" { target *-*-* } .-2 } */
+/* { dg-error "conflicting redefinition" "conflicting" { target *-*-* } .-3 } 
*/

-- 
Joseph S. Myers
josmy...@redhat.com

[RFC PATCH 1/1] nix: add a simple flake nix shell

2024-01-31 Thread Vincenzo Palazzo

This commit is specifically targeting enhancements in
Nix support for GCC development. This initiative stems
from the recognized need within our community for a more
streamlined and efficient development process when using Nix.

Please note that in this case the Nix tool is used to define
what should be in the dev environment, and not as a NixOS distro
package manager.

Signed-off-by: Vincenzo Palazzo 
---
 .gitignore|  1 +
 contrib/nix/flake.nix | 35 +++
 2 files changed, 36 insertions(+)
 create mode 100644 contrib/nix/flake.nix

diff --git a/.gitignore b/.gitignore
index 93a16b0b950..801b1d1709e 100644
--- a/.gitignore
+++ b/.gitignore
@@ -2,6 +2,7 @@
 *.patch
 *.orig
 *.rej
+*.lock
 
 *~
 .#*
diff --git a/contrib/nix/flake.nix b/contrib/nix/flake.nix
new file mode 100644
index 000..b0ff1915adc
--- /dev/null
+++ b/contrib/nix/flake.nix
@@ -0,0 +1,35 @@
+{
+  description = "gcc compiler";
+
+  inputs = {
+nixpkgs.url = "github:nixos/nixpkgs";
+flake-utils.url = "github:numtide/flake-utils";
+  };
+
+  outputs = { self, nixpkgs, flake-utils }:
+flake-utils.lib.eachDefaultSystem (system:
+  let pkgs = nixpkgs.legacyPackages.${system};
+  in {
+packages = {
+  default = pkgs.gnumake;
+};
+formatter = pkgs.nixpkgs-fmt;
+
+devShell = pkgs.mkShell {
+  buildInputs = [
+pkgs.gnumake
+pkgs.gcc13
+
+pkgs.gmp
+pkgs.libmpc
+pkgs.mpfr
+pkgs.isl
+pkgs.pkg-config
+pkgs.autoconf-archive
+pkgs.autoconf
+pkgs.automake
+  ];
+};
+  }
+);
+}
-- 
2.43.0

Re: [PATCH v3 2/5] C++: Support clang compatible [[musttail]] (PR83324)

2024-01-31 Thread Joseph Myers

On Wed, 31 Jan 2024, Jakub Jelinek wrote:

> On Wed, Jan 31, 2024 at 12:21:38PM -0800, Andi Kleen wrote:
> > > > +   case RID_RETURN:
> > > > + {
> > > > +   bool musttail_p = false;
> > > > +   std_attrs = process_stmt_hotness_attribute (std_attrs, 
> > > > attrs_loc);
> > > > +   if (lookup_attribute ("", "musttail", std_attrs))
> > > > + {
> > > > +   musttail_p = true;
> > > > +   std_attrs = remove_attribute ("", "musttail", 
> > > > std_attrs);
> > > > + }
> 
> Using "" looks wrong to me, that is for standard attributes which
> are also gnu attributes, say [[noreturn]]/[[gnu::noreturn]].
> That is not the case here.  Even the __attribute__((musttail)) form will have
> gnu namespace.

And it's incorrect to use [[musttail]] (C23 syntax, no namespace) in any 
circumstances, at least for C, as it's not a standard attribute - so tests 
should verify that [[musttail]] is diagnosed as ignored even in contexts 
where [[gnu::musttail]] is valid.  (It can't be standardized as 
[[musttail]] because of the rule that standard attributes must be 
ignorable; the proposed syntax for a TS and possible future 
standardization after that is "return goto".)

-- 
Joseph S. Myers
josmy...@redhat.com

Re: [PATCH 2/2] libstdc++: Implement P2165R4 changes to std::pair/tuple/etc

2024-01-31 Thread Jonathan Wakely

On Wed, 31 Jan 2024 at 19:41, Patrick Palka  wrote:
>
> On Wed, 31 Jan 2024, Patrick Palka wrote:
>
> > On Wed, 24 Jan 2024, Patrick Palka wrote:
> > >
> > > In v2:
> > >
> > > * Named the template parameters of the forward declaration of pair.
> > > * Added dangling checks for the new tuple and pair constructors
> > >   and corresponding tests.
> > > * Replaced make_index_sequence with index_sequence_for where applicable.
> >
> > Ping.
>
> ... now also as an attachment since it seems Gmail doesn't like my
> inline patch.

Please add this above __dangles_from_tuple_like

// _GLIBCXX_RESOLVE_LIB_DEFECTS
// 4045. tuple can create dangling references from tuple-like

OK for trunk with that change, thanks.

Re: [RFC PATCH 1/1] nix: add a simple flake nix shell

2024-01-31 Thread Eli Schwartz

On 1/31/24 4:43 PM, Vincenzo Palazzo wrote:
> This commit is specifically targeting enhancements in
> Nix support for GCC development. This initiative stems
> from the recognized need within our community for a more
> streamlined and efficient development process when using Nix.
> 
> Please note that in this case the Nix tool is used to define
> what should be in the dev environment, and not as a NixOS distro
> package manager.
> 
> Signed-off-by: Vincenzo Palazzo 
> ---


I was originally trying to figure out what the idea behind this patch
was, as I recalled discussing the patch before. Then I double checked
the mailing list and saw:

https://inbox.sourceware.org/gcc-patches/20240131214259.142253-1-vincenzopalazzo...@gmail.com/T/#u

One thing that can potentially reduce confusion here is:

- use git send-email -v2 to mark the patch as an update to an existing
  patch.

- Use the --annotate option, and edit the patch before sending it. Right
  here, after the "---" and in the same semantic patch section as the
  diffstat, you can put arbitrary non-patch commentary. It is
  essentially comments for patches -- it won't be included in the commit
  message when the patch is applied with `git am`. It is common to
  insert something that looks like this:


v2: moved the flake to contrib/ instead of installing it at the root of
the repository



>  .gitignore|  1 +
>  contrib/nix/flake.nix | 35 +++
>  2 files changed, 36 insertions(+)
>  create mode 100644 contrib/nix/flake.nix
> 
> diff --git a/.gitignore b/.gitignore
> index 93a16b0b950..801b1d1709e 100644
> --- a/.gitignore
> +++ b/.gitignore



-- 
Eli Schwartz



[pushed] analyzer: fix skipping of debug stmts [PR113253]

2024-01-31 Thread David Malcolm

PR analyzer/113253 reports a case where the analyzer output varied
with and without -g enabled.

The root cause was that debug stmts were in the
FOR_EACH_IMM_USE_FAST list for SSA names, leading to the analyzer's
state purging logic differing between the -g and non-debugging cases,
and thus leading to differences in the exploration of the user's code.

Fix by skipping such stmts in the state-purging logic, and removing
debug stmts when constructing the supergraph.
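
The shape of the fix can be sketched in isolation like this. The types
and function names below (Stmt, StmtKind, strip_debug_stmts,
count_real_uses, same_with_and_without_debug) are hypothetical
stand-ins, not the analyzer's API; the point is only that both the
graph construction and the use-list walk ignore debug statements, so
the result no longer depends on whether they are present.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical statement model; Debug stands for a debug stmt.
enum class StmtKind { Assign, Call, Debug };

struct Stmt {
  StmtKind kind;
  int id;
};

// Graph construction: keep only statements that should affect analysis.
std::vector<Stmt> strip_debug_stmts(const std::vector<Stmt>& stmts) {
  std::vector<Stmt> kept;
  for (const Stmt& s : stmts)
    if (s.kind != StmtKind::Debug)
      kept.push_back(s);
  return kept;
}

// Use-list walk: skip debug uses, since they were never put in the graph.
std::size_t count_real_uses(const std::vector<Stmt>& uses) {
  std::size_t n = 0;
  for (const Stmt& s : uses) {
    if (s.kind == StmtKind::Debug)
      continue;
    ++n;
  }
  return n;
}

// The same code with and without debug stmts yields the same view.
bool same_with_and_without_debug() {
  std::vector<Stmt> plain = {{StmtKind::Assign, 1}, {StmtKind::Call, 2}};
  std::vector<Stmt> with_debug = {{StmtKind::Assign, 1},
                                  {StmtKind::Debug, 3},
                                  {StmtKind::Call, 2}};
  return strip_debug_stmts(with_debug).size() == plain.size()
         && count_real_uses(with_debug) == count_real_uses(plain);
}
```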

Successfully bootstrapped & regtested on x86_64-pc-linux-gnu.
Successful run of analyzer integration tests on x86_64-pc-linux-gnu.
Pushed to trunk as r14-8670-gcc7aebff74d896.

gcc/analyzer/ChangeLog:
PR analyzer/113253
* region-model.cc (region_model::on_stmt_pre): Add gcc_unreachable
for debug statements.
* state-purge.cc
(state_purge_per_ssa_name::state_purge_per_ssa_name): Skip any
debug stmts in the FOR_EACH_IMM_USE_FAST list.
* supergraph.cc (supergraph::supergraph): Don't add debug stmts
to the supernodes.

gcc/testsuite/ChangeLog:
PR analyzer/113253
* gcc.dg/analyzer/deref-before-check-pr113253.c: New test.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/region-model.cc  |   5 +
 gcc/analyzer/state-purge.cc   |   9 +
 gcc/analyzer/supergraph.cc|   4 +
 .../analyzer/deref-before-check-pr113253.c| 154 ++
 4 files changed, 172 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/deref-before-check-pr113253.c

diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index 082972f9d294..a26be7075997 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -1307,6 +1307,11 @@ region_model::on_stmt_pre (const gimple *stmt,
   /* No-op for now.  */
   break;
 
+case GIMPLE_DEBUG:
+  /* We should have stripped these out when building the supergraph.  */
+  gcc_unreachable ();
+  break;
+
 case GIMPLE_ASSIGN:
   {
const gassign *assign = as_a <const gassign *> (stmt);
diff --git a/gcc/analyzer/state-purge.cc b/gcc/analyzer/state-purge.cc
index 284a03f712c3..93959fb08ea3 100644
--- a/gcc/analyzer/state-purge.cc
+++ b/gcc/analyzer/state-purge.cc
@@ -329,6 +329,15 @@ state_purge_per_ssa_name::state_purge_per_ssa_name (const 
state_purge_map &map,
  map.log ("used by stmt: %s", pp_formatted_text (&pp));
}
 
+ if (is_gimple_debug (use_stmt))
+   {
+ /* We skipped debug stmts when building the supergraph,
+so ignore them now.  */
+ if (map.get_logger ())
+   map.log ("skipping debug stmt");
+ continue;
+   }
+
  const supernode *snode
= map.get_sg ().get_supernode_for_stmt (use_stmt);
 
diff --git a/gcc/analyzer/supergraph.cc b/gcc/analyzer/supergraph.cc
index d41a7e607f86..b82275256b72 100644
--- a/gcc/analyzer/supergraph.cc
+++ b/gcc/analyzer/supergraph.cc
@@ -182,6 +182,10 @@ supergraph::supergraph (logger *logger)
  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
{
  gimple *stmt = gsi_stmt (gsi);
+ /* Discard debug stmts here, so we don't have to check for
+them anywhere within the analyzer.  */
+ if (is_gimple_debug (stmt))
+   continue;
  node_for_stmts->m_stmts.safe_push (stmt);
  m_stmt_to_node_t.put (stmt, node_for_stmts);
  m_stmt_uids.make_uid_unique (stmt);
diff --git a/gcc/testsuite/gcc.dg/analyzer/deref-before-check-pr113253.c 
b/gcc/testsuite/gcc.dg/analyzer/deref-before-check-pr113253.c
new file mode 100644
index ..d9015accd6ab
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/deref-before-check-pr113253.c
@@ -0,0 +1,154 @@
+/* Regression test for PR analyzer/113253 which was showing analyzer
+   differences with and without -g.
+
+   C only: reduced reproducer doesn't easily work with C++.  */
+
+/* { dg-additional-options "-O2 -g" } */
+
+typedef long int ptrdiff_t;
+typedef unsigned long int uintptr_t;
+typedef long int EMACS_INT;
+enum
+{
+  EMACS_INT_WIDTH = 64,
+  VALBITS = EMACS_INT_WIDTH - 3,
+};
+typedef struct Lisp_X* Lisp_Word;
+enum Lisp_Type
+{
+  Lisp_Symbol = 0,
+  Lisp_Vectorlike = 5,
+};
+typedef Lisp_Word Lisp_Object;
+static inline EMACS_INT(XLI)(Lisp_Object o)
+{
+  return ((EMACS_INT)(o));
+}
+static inline void*(XLP)(Lisp_Object o)
+{
+  return ((void*)(o));
+}
+struct Lisp_Symbol
+{};
+typedef uintptr_t Lisp_Word_tag;
+extern struct Lisp_Symbol lispsym[1608];
+union vectorlike_header
+{
+  ptrdiff_t size;
+};
+enum pvec_type
+{
+  PVEC_MARKER,
+};
+enum More_Lisp_Bits
+{
+  PSEUDOVECTOR_SIZE_BITS = 12,
+  PSEUDOVECTOR_REST_BITS = 12,
+  PSEUDOVECTOR_AREA_BITS = PSEUDOVECTOR_SIZE_BITS + PSEUDOVECTOR_REST_BITS,
+  PVEC_TYPE_MASK = 0x3f << PSEUDOVECTOR_AREA_BITS
+};
+static inline _Bool
+PSEUDOVECTORP(Lisp_Object a, in

Re: [Bug libstdc++/90276] PSTL tests fail in Debug Mode

2024-01-31 Thread Jonathan Wakely

On Wed, 31 Jan 2024 at 18:18, François Dumont  wrote:

> I replied to bugzilla rather than sending to proper mailing list !
>
> At the same time it looks like you also found the root cause of the
> problem Jonathan. Just let me know if you want to deal with it eventually.
>

I'll take care of it, thanks.


> François
>
>  Forwarded Message 
> Subject: Re: [Bug libstdc++/90276] PSTL tests fail in Debug Mode
> Date: Wed, 31 Jan 2024 19:09:02 +0100
> From: François Dumont  
> To: redi at gcc dot gnu.org 
> , fdum...@gcc.gnu.org
>
> Here is the reason of the
> 20_util/specialized_algorithms/pstl/uninitialized_copy_move.cc FAIL.
>
> Maybe it fixes some other tests too, I need to run all of them.
>
> libstdc++: Do not forward arguments several times [PR90276]
>
> Forwarding several times the same arguments results in UB. It is
> detected
> by the _GLIBCXX_DEBUG mode as an attempt to use a singular iterator
> which has
> been moved.
>
> libstdc++-v3/ChangeLog
>
> PR libstdc++/90276
> * testsuite/util/pstl/test_utils.h: Remove std::forward<>
> calls when
> done several times on the same arguments.
>
> Ok to commit ?
>
> François
>
>
> On 31/01/2024 14:11, redi at gcc dot gnu.org wrote:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90276
>
> Jonathan Wakely  changed:
>
> What |Removed |Added
>
> 
> See Also| |https://github.com/llvm/llv
> | |m-project/issues/80136
>
>
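
For readers following the thread, the hazard of forwarding the same
argument more than once can be sketched as below. This is a standalone
illustration, not the PSTL test utility code: sink_has_value, inspect,
forward_twice and forward_once are hypothetical names, and
std::unique_ptr is used only because its moved-from state is
guaranteed to be null.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical sink taking its argument by value: an rvalue is moved in.
bool sink_has_value(std::unique_ptr<int> p) { return p != nullptr; }

// Hypothetical read-only helper: looks at the argument without moving.
bool inspect(const std::unique_ptr<int>& p) { return p != nullptr; }

// Problematic pattern: the same argument is forwarded twice.  When T is
// an rvalue, the first call moves from it, so the second call receives a
// moved-from object.  Returns what the second sink observed.
template <typename T>
bool forward_twice(T&& arg) {
  sink_has_value(std::forward<T>(arg));
  return sink_has_value(std::forward<T>(arg));
}

// Safer pattern: use the argument as an lvalue for every use except the
// last one, and forward exactly once.
template <typename T>
bool forward_once(T&& arg) {
  bool seen = inspect(arg);
  return seen && sink_has_value(std::forward<T>(arg));
}
```

With an rvalue argument, forward_twice reports that the second sink saw
an empty pointer, while forward_once sees the value in both uses.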

[PATCH] c++: ICE with throw inside concept [PR112437]

2024-01-31 Thread Marek Polacek

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/13/12?

-- >8 --
We crash in the loop at the end of treat_lvalue_as_rvalue_p for code
like

  template <typename T>
  concept Throwable = requires(T x) { throw x; };

because the code assumes that we eventually reach sk_function_parms or
sk_try and bail, but in a concept we're in a sk_namespace.

We're already checking sk_try so we don't crash in a function-try-block,
but I've added a test anyway.
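
The loop in question can be pictured with the following simplified
model. It is a sketch under assumed names (Scope, ScopeKind,
may_move_on_throw and stops_at_namespace_scope are hypothetical, and
the real code walks binding levels rather than this struct): the walk
outward has to stop at every scope kind that can end the search,
including namespace scope.

```cpp
#include <cassert>

// Hypothetical scope kinds, loosely modelled on the sk_* values.
enum class ScopeKind { Block, FunctionParms, Try, Namespace };

struct Scope {
  ScopeKind kind;
  const Scope* outer;     // enclosing scope, or null at the outermost one
  bool declares_operand;  // does this scope declare the thrown variable?
};

// Walk outward from the innermost scope.  Returns true if the operand is
// declared in a scope that was reached before a stopping scope.
bool may_move_on_throw(const Scope* innermost) {
  for (const Scope* s = innermost; s != nullptr; s = s->outer) {
    if (s->declares_operand)
      return true;
    // Stop conditions.  Without the Namespace case, a walk that starts
    // outside any function would run off the end of the chain.
    if (s->kind == ScopeKind::FunctionParms
        || s->kind == ScopeKind::Try
        || s->kind == ScopeKind::Namespace)
      return false;
  }
  return false;
}

// A scope nested directly in a namespace, with no function around it.
bool stops_at_namespace_scope() {
  Scope ns{ScopeKind::Namespace, nullptr, false};
  Scope inner{ScopeKind::Block, &ns, false};
  return !may_move_on_throw(&inner);
}
```

Unlike this model, the loop being patched has no null check on the
chain, which is why the missing stop condition showed up as a crash.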

PR c++/112437

gcc/cp/ChangeLog:

* typeck.cc (treat_lvalue_as_rvalue_p): Bail out on sk_namespace in
the move on throw of parms loop.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-throw1.C: New test.
* g++.dg/eh/throw4.C: New test.
---
 gcc/cp/typeck.cc |  4 +++-
 gcc/testsuite/g++.dg/cpp2a/concepts-throw1.C |  8 
 gcc/testsuite/g++.dg/eh/throw4.C | 13 +
 3 files changed, 24 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-throw1.C
 create mode 100644 gcc/testsuite/g++.dg/eh/throw4.C

diff --git a/gcc/cp/typeck.cc b/gcc/cp/typeck.cc
index a15eda3f5f8..4937022ff20 100644
--- a/gcc/cp/typeck.cc
+++ b/gcc/cp/typeck.cc
@@ -10863,7 +10863,9 @@ treat_lvalue_as_rvalue_p (tree expr, bool return_p)
   for (tree decl = b->names; decl; decl = TREE_CHAIN (decl))
if (decl == retval)
  return set_implicit_rvalue_p (move (expr));
-  if (b->kind == sk_function_parms || b->kind == sk_try)
+  if (b->kind == sk_function_parms
+ || b->kind == sk_try
+ || b->kind == sk_namespace)
return NULL_TREE;
 }
 }
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-throw1.C 
b/gcc/testsuite/g++.dg/cpp2a/concepts-throw1.C
new file mode 100644
index 000..bc3e3b6891a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-throw1.C
@@ -0,0 +1,8 @@
+// PR c++/112437
+// { dg-do compile { target c++20 } }
+
+struct S {};
+template <typename T>
+concept Throwable = requires(T x) { throw x; };
+
+bool a = Throwable<S>;
diff --git a/gcc/testsuite/g++.dg/eh/throw4.C b/gcc/testsuite/g++.dg/eh/throw4.C
new file mode 100644
index 000..b474472de48
--- /dev/null
+++ b/gcc/testsuite/g++.dg/eh/throw4.C
@@ -0,0 +1,13 @@
+// PR c++/112437
+// { dg-do compile }
+
+struct S {};
+
+S
+foo (S s)
+try {
+  throw s;
+}
+catch (...) {
+  throw s;
+}

base-commit: d22d1a9346f27db41459738c6eb404f8f0956e6f
-- 
2.43.0

[COMMITTEDv2] aarch64: -mstrict-align vs __arm_data512_t [PR113657]

2024-01-31 Thread Andrew Pinski

After r14-1187-gd6b756447cd58b, simplify_gen_subreg can return
NULL for "unaligned" memory subreg. Since V8DI has an alignment of 8 bytes,
using TImode causes simplify_gen_subreg to return NULL.
This fixes the issue by using DImode instead of TImode in the loop when
strict alignment is in effect. The later LDP/STP pass can then combine the
accesses back into LDP/STP if needed.
Since strict align is less important (usually used only for firmware and
early boot), not generating LDP/STP here is OK.
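
As a rough illustration of the loop change, the sketch below computes the byte offsets at which a 64-byte V8DI value would be split. It is a standalone toy, assuming 8 bytes for DImode and 16 for TImode; `split_offsets` and `kV8DIBytes` are hypothetical names, and nothing here calls into GCC.

```cpp
#include <cassert>
#include <vector>

constexpr int kV8DIBytes = 64;  // size of the V8DI value being split

// Byte offsets of the chunks: 8-byte (DImode-sized) chunks under strict
// alignment, 16-byte (TImode-sized) chunks otherwise.
std::vector<int> split_offsets(bool strict_alignment) {
  const int increment = strict_alignment ? 8 : 16;
  std::vector<int> offsets;
  for (int offset = 0; offset < kV8DIBytes; offset += increment)
    offsets.push_back(offset);
  return offsets;
}

void demo_split_offsets() {
  assert(split_offsets(false).size() == 4);  // four 16-byte moves
  assert(split_offsets(true).size() == 8);   // eight 8-byte moves
}
```

Every offset produced in strict mode is a multiple of 8, which matches the 8-byte alignment that V8DI guarantees.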

Built and tested for aarch64-linux-gnu with no regressions.

PR target/113657

gcc/ChangeLog:

* config/aarch64/aarch64-simd.md (split for movv8di):
For strict aligned mode, use DImode instead of TImode.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/acle/ls64_strict_align.c: New test.

Signed-off-by: Andrew Pinski 
---
 gcc/config/aarch64/aarch64-simd.md| 11 +++
 .../gcc.target/aarch64/acle/ls64_strict_align.c   |  7 +++
 2 files changed, 14 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/acle/ls64_strict_align.c

diff --git a/gcc/config/aarch64/aarch64-simd.md 
b/gcc/config/aarch64/aarch64-simd.md
index 7a6b4430112..4023b918882 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -8221,14 +8221,17 @@ (define_split
   || (memory_operand (operands[0], V8DImode)
   && register_operand (operands[1], V8DImode)))
 {
+  /* V8DI only guarantees 8-byte alignment, whereas TImode requires 16.  */
+  auto mode = STRICT_ALIGNMENT ? DImode : TImode;
+  int increment = GET_MODE_SIZE (mode);
   std::pair<rtx, rtx> last_pair = {};
-  for (int offset = 0; offset < 64; offset += 16)
+  for (int offset = 0; offset < 64; offset += increment)
 {
  std::pair<rtx, rtx> pair = {
-   simplify_gen_subreg (TImode, operands[0], V8DImode, offset),
-   simplify_gen_subreg (TImode, operands[1], V8DImode, offset)
+   simplify_gen_subreg (mode, operands[0], V8DImode, offset),
+   simplify_gen_subreg (mode, operands[1], V8DImode, offset)
  };
- if (register_operand (pair.first, TImode)
+ if (register_operand (pair.first, mode)
  && reg_overlap_mentioned_p (pair.first, pair.second))
last_pair = pair;
  else
diff --git a/gcc/testsuite/gcc.target/aarch64/acle/ls64_strict_align.c 
b/gcc/testsuite/gcc.target/aarch64/acle/ls64_strict_align.c
new file mode 100644
index 000..bf49ac76f78
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/acle/ls64_strict_align.c
@@ -0,0 +1,7 @@
+/* { dg-do compile } */
+/* { dg-options "-mstrict-align" } */
+/* PR target/113657 */
+
+#pragma GCC target "+ls64"
+#pragma GCC aarch64 "arm_acle.h"
+__arm_data512_t foo(__arm_data512_t* ptr) { return *ptr; }
-- 
2.39.3

Re: [PATCH] RISC-V: Allow constraint "S" even if the symbol does not bind locally

2024-01-31 Thread Fangrui Song

On Tue, Jan 30, 2024 at 11:26 PM Kito Cheng  wrote:
>
> I realized there is an 's' constraint defined in GCC's generic
> infra[1], and it has roughly the same semantics as the new 'S' here:
>
> (define_constraint "s"
>  "Matches a symbolic integer constant."
>  (and (match_test "CONSTANT_P (op)")
>   (match_test "!CONST_SCALAR_INT_P (op)")
>   (match_test "!flag_pic || LEGITIMATE_PIC_OPERAND_P (op)")))
>
> Here const, symbol_ref and label_ref match CONSTANT_P &&
> !CONST_SCALAR_INT_P,
> and LEGITIMATE_PIC_OPERAND_P is always 1 for RISC-V.

Thanks for catching this! I read "symbolic integer constant" and
skipped it, but did not realize that
this actually means a symbol or label reference with a constant offset.
I agree that "s" should be preferred.
I have jotted down some notes at
https://maskray.me/blog/2024-01-30-raw-symbol-names-in-inline-assembly

The condition !flag_pic || LEGITIMATE_PIC_OPERAND_P (op) highlights a
key distinction in GCC's handling of symbol references:

Non-PIC code (-fno-pic): The "i" and "s" constraints are freely permitted.
PIC code (-fpie and -fpic): The architecture-specific
LEGITIMATE_PIC_OPERAND_P(X) macro dictates whether these constraints
are allowed.

While the default implementation (gcc/defaults.h) is permissive (used
by MIPS, PowerPC, and RISC-V), many ports impose stricter
restrictions, often disallowing preemptible symbols under PIC.
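
The condition quoted above can be modelled with a few lines of plain C++. This is only a toy: `Operand`, `permissive_pic_operand`, `strict_pic_operand` and `matches_s` are hypothetical names, and the "strict" predicate is an assumed example of a port that rejects preemptible symbols, not any particular port's macro.

```cpp
#include <cassert>

// Toy description of an operand; hypothetical, not GCC's rtx.
struct Operand {
  bool is_constant;    // models CONSTANT_P (op)
  bool is_scalar_int;  // models CONST_SCALAR_INT_P (op)
  bool preemptible;    // symbol may be preempted at run time
};

// Models a permissive default: every operand is fine under PIC.
bool permissive_pic_operand(const Operand&) { return true; }

// Assumed example of a stricter port: reject preemptible symbols under PIC.
bool strict_pic_operand(const Operand& op) { return !op.preemptible; }

// Models the three match_test clauses of the "s" constraint.
template <typename PicPredicate>
bool matches_s(const Operand& op, bool flag_pic, PicPredicate legit) {
  return op.is_constant && !op.is_scalar_int && (!flag_pic || legit(op));
}

void demo_s_constraint() {
  Operand sym{true, false, true};  // preemptible symbol reference
  assert(matches_s(sym, false, strict_pic_operand));     // non-PIC: allowed
  assert(matches_s(sym, true, permissive_pic_operand));  // PIC, permissive
  assert(!matches_s(sym, true, strict_pic_operand));     // PIC, strict
}
```

The same preemptible symbol is accepted or rejected purely depending on `flag_pic` and the port's predicate, which is the distinction described above.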

This differentiation probably stems from historical and architectural
considerations:

Non-PIC code: Absolute addresses could be directly embedded in
instructions like an immediate integer operand.
PIC code with dynamic linking: The need for GOT indirection often
requires an addressing mode different from absolute addressing and
more than one instruction.

Nevertheless, I think this symbol preemptibility limitation for "s" is
unfortunate. Ideally, we could retain the current "i" for immediate
integer operand (after linking), and design "s" for a raw symbol name
with a constant offset, ignoring symbol preemptibility. This
architecture-agnostic "s" would simplify metadata section utilization
and boost code portability.

> The only difference is it also allows high, which is something like
> %hi(sym), but I think it's harmless in the use case.

I do not follow this. Do you have an example?

> However, I found that LLVM also does not work on " asm(".reloc ., BFD_RELOC_NONE,
> %0" :: "S"(&ns::a[3]));",
> so maybe we could consider implementing 's' in LLVM, and also add some
> documentation to riscv-c-api.md?

Clang does not implement the offset yet.
I created https://github.com/llvm/llvm-project/pull/80201 to support "s"

> And just to clarify, I don't have a strong preference for 's'; I am OK
> with relaxing 'S' too.
> I proposed 's' because it has worked fine in RISC-V GCC for a long
> time with no backward compatibility issues.
> But I guess this proposal may have come from ClangBuiltLinux, so
> 's' may not work well for Clang due to backward compatibility.

It seems that ClangBuiltLinux can live with "i" for now:)

I raised the topic due to a micro-optimization opportunity in
https://github.com/protocolbuffers/protobuf/blob/1fe463ce71b6acc60b3aef65d51185e3704cac8b/src/google/protobuf/stubs/common.h#L86
and I believe metadata sections will see more use, so compilers
should be prepared for future uses.

I'll abandon this "S" change. I can create a test-only change if you
think the test coverage is useful, as we hardly have any non-rvv
inline asm tests at present...

> [1] 
> https://gcc.gnu.org/onlinedocs/gcc/Simple-Constraints.html#index-s-in-constraint
> [2] 
> https://github.com/riscv-non-isa/riscv-c-api-doc/blob/master/riscv-c-api.md#constraints-on-operands-of-inline-assembly-statements
>
> On Wed, Jan 31, 2024 at 1:02 PM Fangrui Song  wrote:
> >
> > The constraint "S" can only be used with a symbol that binds locally, so
> > the following does not work for -fpie/-fpic (GOT access is used).
> > ```
> > namespace ns { extern int var, a[4]; }
> > void foo() {
> >   asm(".pushsection .xxx,\"aw\"; .dc.a %0; .popsection" :: "S"(&ns::var));
> >   asm(".reloc ., BFD_RELOC_NONE, %0" :: "S"(&ns::a[3]));
> > }
> > ```
> >
> > This is overly restrictive, as many references like an absolute
> > relocation in a writable section or a non-SHF_ALLOC section should be
> > totally fine.  Allow symbols that do not bind locally, similar to
> > aarch64 "S" and x86-64 "Ws" (commit 
> > d7250100381b817114447d91fff4748526d4fb21).
> >
> > gcc/ChangeLog:
> >
> > * config/riscv/constraints.md: Relax the condition for "S".
> > * doc/md.texi: Update.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/riscv/asm-raw-symbol.c: New test.
> > ---
> >  gcc/config/riscv/constraints.md |  4 ++--
> >  gcc/doc/md.texi |  2 +-
> >  gcc/testsuite/gcc.target/riscv/asm-raw-symbol.c | 17 +
> >  3 files changed, 20 insertions(+), 3 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/as

Re: [COMMITTED V3 1/4] RISC-V: Add non-vector types to dfa pipelines

2024-01-31 Thread Edwin Lu


On 1/25/2024 9:06 AM, Robin Dapp wrote:

LGTM, thanks.

Regards
  Robin



Committed!

Edwin

Re: [COMMITTED V3 3/4] RISC-V: Use default cost model for insn scheduling

2024-01-31 Thread Edwin Lu


On 1/25/2024 9:06 AM, Robin Dapp wrote:

Use default cost model scheduling on these test cases. All these tests
introduce scan dump failures with -mtune generic-ooo. Since the vector
cost models are the same across all three tunes, some of the tests
in PR113249 will be fixed with this patch series.


This is OK, thanks.


39 additional unique testsuite failures (scan dumps) will still be present.
I don't know how optimal the new output is compared to the old. Should I update
the testcase expected output to match the new scan dumps?


Currently, without vector op latency, the output should come close
to what's normally considered "good" (i.e. minimal number of vsetvls
and so on).  Therefore I'd suggest not changing the scan dumps too
much except when there is a real problem.  If you have a specific
example that you're unsure about we can discuss this on or off list.

Regards
  Robin




Committed!

Edwin

Re: [COMMITTED V3 4/4] RISC-V: Enable assert for insn_has_dfa_reservation

2024-01-31 Thread Edwin Lu


On 1/25/2024 9:06 AM, Robin Dapp wrote:

/* If we ever encounter an insn without an insn reservation, trip
   an assert so we can find and fix this problem.  */
-#if 0
+  if (! insn_has_dfa_reservation_p (insn)) {
+print_rtl(stderr, insn);
+fprintf(stderr, "%d", get_attr_type (insn));
+  }
gcc_assert (insn_has_dfa_reservation_p (insn));
-#endif
  
return more - 1;

  }


I was thinking about making the gcc_assert a gcc_checking_assert so that,
in case we accidentally forget something at any point, it would
only gracefully degrade in a release build.  As we already have
a hard assert for the type in the patch (and not many people test with
checking enabled anyway), this is OK IMHO.

I suppose you tested with all available -mtune options?

Regards
  Robin




Committed without the debugging stuff!

Edwin

Re: [COMMITTED V4 2/4] RISC-V: Add vector related pipelines

2024-01-31 Thread Edwin Lu


On 1/31/2024 12:28 PM, Robin Dapp wrote:

LGTM, thanks.

Regards
  Robin



Committed!

Edwin

Re: [PATCH] RISC-V: Support scheduling for sifive p600 series

2024-01-31 Thread Edwin Lu

I recently committed changes modifying the scheduling reservations. Some 
things may need to be retested with the newly enabled asserts.


Edwin

On 1/31/2024 1:40 AM, Monk Chiang wrote:

Add sifive p600 series scheduler module. For more information
see https://www.sifive.com/cores/performance-p650-670.
Adding sifive-p650 and sifive-p670 for the mcpu option will come in separate patches.

gcc/ChangeLog:
* config/riscv/riscv.md: Add "fcvt_i2f", "fcvt_f2i" type
attribute, and include sifive-p600.md.
* config/riscv/generic-ooo.md: Update type attribute.
* config/riscv/sifive-7.md: Update type attribute.
* config/riscv/sifive-p600.md: New file.
* config/riscv/riscv-cores.def (RISCV_TUNE): Add parameter.
* config/riscv/riscv-opts.h (enum riscv_microarchitecture_type):
Add sifive_p600.
* config/riscv/riscv.cc (sifive_p600_tune_info): New.
* config/riscv/riscv.h (TARGET_SFB_ALU): Update.
* doc/invoke.texi (RISC-V Options): Add sifive-p600-series
---
  gcc/config/riscv/generic-ooo.md  |   2 +-
  gcc/config/riscv/generic.md  |   2 +-
  gcc/config/riscv/riscv-cores.def |   1 +
  gcc/config/riscv/riscv-opts.h|   1 +
  gcc/config/riscv/riscv.cc|  17 +++
  gcc/config/riscv/riscv.h |   4 +-
  gcc/config/riscv/riscv.md|  19 ++--
  gcc/config/riscv/sifive-7.md |   2 +-
  gcc/config/riscv/sifive-p600.md  | 174 +++
  gcc/doc/invoke.texi  |   3 +-
  10 files changed, 212 insertions(+), 13 deletions(-)
  create mode 100644 gcc/config/riscv/sifive-p600.md

diff --git a/gcc/config/riscv/generic-ooo.md b/gcc/config/riscv/generic-ooo.md
index 421a7bb929d..a22f8a3e079 100644
--- a/gcc/config/riscv/generic-ooo.md
+++ b/gcc/config/riscv/generic-ooo.md
@@ -127,7 +127,7 @@
  
  (define_insn_reservation "generic_ooo_fcvt" 3

(and (eq_attr "tune" "generic_ooo")
-   (eq_attr "type" "fcvt"))
+   (eq_attr "type" "fcvt,fcvt_i2f,fcvt_f2i"))
"generic_ooo_issue,generic_ooo_fxu")
  
  (define_insn_reservation "generic_ooo_fcmp" 2

diff --git a/gcc/config/riscv/generic.md b/gcc/config/riscv/generic.md
index b99ae345bb3..3f0eaa2ea08 100644
--- a/gcc/config/riscv/generic.md
+++ b/gcc/config/riscv/generic.md
@@ -42,7 +42,7 @@
  
  (define_insn_reservation "generic_xfer" 3

(and (eq_attr "tune" "generic")
-   (eq_attr "type" "mfc,mtc,fcvt,fmove,fcmp"))
+   (eq_attr "type" "mfc,mtc,fcvt,fcvt_i2f,fcvt_f2i,fmove,fcmp"))
"alu")
  
  (define_insn_reservation "generic_branch" 1

diff --git a/gcc/config/riscv/riscv-cores.def b/gcc/config/riscv/riscv-cores.def
index b30f4dfb08e..a07a79e2cb7 100644
--- a/gcc/config/riscv/riscv-cores.def
+++ b/gcc/config/riscv/riscv-cores.def
@@ -37,6 +37,7 @@ RISCV_TUNE("rocket", generic, rocket_tune_info)
  RISCV_TUNE("sifive-3-series", generic, rocket_tune_info)
  RISCV_TUNE("sifive-5-series", generic, rocket_tune_info)
  RISCV_TUNE("sifive-7-series", sifive_7, sifive_7_tune_info)
+RISCV_TUNE("sifive-p600-series", sifive_p600, sifive_p600_tune_info)
  RISCV_TUNE("thead-c906", generic, thead_c906_tune_info)
  RISCV_TUNE("generic-ooo", generic_ooo, generic_ooo_tune_info)
  RISCV_TUNE("size", generic, optimize_size_tune_info)
diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index 1500f8811ef..25951665b13 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h
@@ -55,6 +55,7 @@ extern enum riscv_isa_spec_class riscv_isa_spec;
  enum riscv_microarchitecture_type {
generic,
sifive_7,
+  sifive_p600,
generic_ooo
  };
  extern enum riscv_microarchitecture_type riscv_microarchitecture;
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 7b6111aa545..92d6fd5cf47 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -447,6 +447,23 @@ static const struct riscv_tune_param sifive_7_tune_info = {
NULL,   /* vector cost */
  };
  
+/* Costs to use when optimizing for Sifive p600 Series.  */

+static const struct riscv_tune_param sifive_p600_tune_info = {
+  {COSTS_N_INSNS (4), COSTS_N_INSNS (4)},  /* fp_add */
+  {COSTS_N_INSNS (4), COSTS_N_INSNS (4)},  /* fp_mul */
+  {COSTS_N_INSNS (20), COSTS_N_INSNS (20)},/* fp_div */
+  {COSTS_N_INSNS (4), COSTS_N_INSNS (4)},  /* int_mul */
+  {COSTS_N_INSNS (6), COSTS_N_INSNS (6)},  /* int_div */
+  4,   /* issue_rate */
+  4,   /* branch_cost */
+  3,   /* memory_cost */
+  4,   /* fmv_cost */
+  true,   /* slow_unaligned_access */
+  false,   /* use_divmod_expansion */
+  RISCV_FUSE_LUI_ADDI | RISCV_FUSE_AUIPC_ADDI,  /* fusible_ops */
+  NULL,/* vector cost */
+};

Re: [PATCH] c-family: Fix ICE with large column number after restoring a PCH [PR105608]

2024-01-31 Thread Lewis Hyatt

On Wed, Jan 31, 2024 at 03:18:01PM -0500, Jason Merrill wrote:
> On 1/30/24 21:49, Lewis Hyatt wrote:
> > On Fri, Jan 26, 2024 at 04:16:54PM -0500, Jason Merrill wrote:
> > > On 12/5/23 20:52, Lewis Hyatt wrote:
> > > > Hello-
> > > > 
> > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105608
> > > > 
> > > > There are two related issues here really, a regression since GCC 11 where we
> > > > can ICE after restoring a PCH, and a deeper issue with bogus locations
> > > > assigned to macros that were defined prior to restoring a PCH.  This patch
> > > > fixes the ICE regression with a simple change, and I think it's appropriate
> > > > for GCC 14 as well as backport to 11, 12, 13. The bad locations (wrong, but
> > > > not generally causing an ICE, and mostly affecting only the output of
> > > > -Wunused-macros) are not as problematic, and will be harder to fix. I could
> > > > take a stab at that for GCC 15. In the meantime the patch adds XFAILed
> > > > tests for the wrong locations (as well as passing tests for the regression
> > > > fix). Does it look OK please? Bootstrap + regtest all languages on x86-64
> > > > Linux. Thanks!
> > > 
> > > OK for trunk and branches, thanks!
> > > 
> > 
> > Thanks for the review! That is all taken care of. I have one more request if
> > you don't mind please... There have been some further comments on the PR
> > indicating that the new xfailed testcase I added is failing in an unexpected
> > way on at least one architecture. To recap, the idea here was that
> > 
> > 1) libcpp needs new logic to be able to output correct locations for this
> > case. That will be some new code that is suitable for stage 1, not now.
> > 
> > 2) In the meantime, we fixed things up enough to avoid an ICE that showed up
> > in GCC 11, and added an xfailed testcase to remind about #1.
> > 
> > The problem is that, the reason that libcpp outputs the wrong locations, is
> > that it has always used a location from the old line_map instance to index
> > into the new line_map instance, and so the exact details of the wrong
> > locations it outputs depend on the state of those two line maps, which may
> > differ depending on system includes and things like that. So I was hoping to
> > make one further one-line change to libcpp, not yet to output correct
> > locations, but at least to output one which is the same always and doesn't
> > depend on random things. This would assign all restored macros to a
> > consistent location, one line following the #include that triggered the PCH
> > process. I think this probably shouldn't be backported but it would be nice
> > to get into GCC 14, while nothing critical, at least it would avoid the new
> > test failure that's being reported. But more generally, I think using a
> > location from a totally different line map is dangerous and could have worse
> > consequences that haven't been seen yet. Does it look OK please? Thanks!
> 
> Can we use the line of the #include, as the test expects, rather than the
> following line?

Thanks, yes, that will work too, it just needs a few changes to
c-family/c-pch.cc to set the location there and then increment it
after. Patch which does that is attached. (This is a new one based on
master, not incremental to the prior patch.) The testcase does not require
any changes this way, and bootstrap + regtest looks good.

-Lewis
[PATCH] c-family: Stabilize the location for macros restored after PCH load 
[PR105608]

libcpp currently lacks the infrastructure to assign correct locations to
macros that were defined prior to loading a PCH and then restored
afterwards. While I plan to address that fully for GCC 15, this patch
improves things by using at least a valid location, even if it's not the
best one. Without this change, libcpp uses pfile->directive_line as the
location for the restored macros, but this location_t applies to the old
line map, not the one that was just restored from the PCH, so the resulting
location is unpredictable and depends on what was stored in the line maps
before. With this change, all restored macros get assigned locations at the
line of the #include that triggered the PCH restore. A future patch will
store the actual file name and line number of each definition and then
synthesize locations in the new line map pointing to the right place.

gcc/c-family/ChangeLog:

PR preprocessor/105608
* c-pch.cc (c_common_read_pch): Adjust line map so that libcpp
assigns a location to restored macros which is the same location
that triggered the PCH include.

libcpp/ChangeLog:

PR preprocessor/105608
* pch.cc (cpp_read_state): Set a valid location for restored
macros.

diff --git a/gcc/c-family/c-pch.cc b/gcc/c-family/c-pch.cc
index 79b4f88fe4d..971af90be0d 100644
--- a/gcc/c-family/c-pch.cc
+++ b/gcc/c-family/c-pch.cc
@@ -318,6 +318,7 @@ c_common_read_pch (cpp_reader *pfile, const char *name,
   struc

Re: [PATCH] c-family: Fix ICE with large column number after restoring a PCH [PR105608]

2024-01-31 Thread Jason Merrill


On 1/31/24 21:09, Lewis Hyatt wrote:

On Wed, Jan 31, 2024 at 03:18:01PM -0500, Jason Merrill wrote:

On 1/30/24 21:49, Lewis Hyatt wrote:

On Fri, Jan 26, 2024 at 04:16:54PM -0500, Jason Merrill wrote:

On 12/5/23 20:52, Lewis Hyatt wrote:

Hello-

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105608

There are two related issues here really, a regression since GCC 11 where we
can ICE after restoring a PCH, and a deeper issue with bogus locations
assigned to macros that were defined prior to restoring a PCH.  This patch
fixes the ICE regression with a simple change, and I think it's appropriate
for GCC 14 as well as backport to 11, 12, 13. The bad locations (wrong, but
not generally causing an ICE, and mostly affecting only the output of
-Wunused-macros) are not as problematic, and will be harder to fix. I could
take a stab at that for GCC 15. In the meantime the patch adds XFAILed
tests for the wrong locations (as well as passing tests for the regression
fix). Does it look OK please? Bootstrap + regtest all languages on x86-64
Linux. Thanks!


OK for trunk and branches, thanks!



Thanks for the review! That is all taken care of. I have one more request if
you don't mind please... There have been some further comments on the PR
indicating that the new xfailed testcase I added is failing in an unexpected
way on at least one architecture. To recap, the idea here was that

1) libcpp needs new logic to be able to output correct locations for this
case. That will be some new code that is suitable for stage 1, not now.

2) In the meantime, we fixed things up enough to avoid an ICE that showed up
in GCC 11, and added an xfailed testcase to remind about #1.

The problem is that, the reason that libcpp outputs the wrong locations, is
that it has always used a location from the old line_map instance to index
into the new line_map instance, and so the exact details of the wrong
locations it outputs depend on the state of those two line maps, which may
differ depending on system includes and things like that. So I was hoping to
make one further one-line change to libcpp, not yet to output correct
locations, but at least to output one which is the same always and doesn't
depend on random things. This would assign all restored macros to a
consistent location, one line following the #include that triggered the PCH
process. I think this probably shouldn't be backported but it would be nice
to get into GCC 14, while nothing critical, at least it would avoid the new
test failure that's being reported. But more generally, I think using a
location from a totally different line map is dangerous and could have worse
consequences that haven't been seen yet. Does it look OK please? Thanks!


Can we use the line of the #include, as the test expects, rather than the
following line?


Thanks, yes, that will work too, it just needs a few changes to
c-family/c-pch.cc to set the location there and then increment it
after. Patch which does that is attached. (This is a new one based on
master, not incremental to the prior patch.) The testcase does not require
any changes this way, and bootstrap + regtest looks good.


OK, thanks.

Jason

Re: [PATCH] c++: ICE with throw inside concept [PR112437]

2024-01-31 Thread Jason Merrill


On 1/31/24 18:41, Marek Polacek wrote:

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/13/12?


OK.


-- >8 --
We crash in the loop at the end of treat_lvalue_as_rvalue_p for code
like

   template <typename T>
   concept Throwable = requires(T x) { throw x; };

because the code assumes that we eventually reach sk_function_parms or
sk_try and bail, but in a concept we're in a sk_namespace.

We're already checking sk_try so we don't crash in a function-try-block,
but I've added a test anyway.

PR c++/112437

gcc/cp/ChangeLog:

* typeck.cc (treat_lvalue_as_rvalue_p): Bail out on sk_namespace in
the move on throw of parms loop.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-throw1.C: New test.
* g++.dg/eh/throw4.C: New test.
---
  gcc/cp/typeck.cc |  4 +++-
  gcc/testsuite/g++.dg/cpp2a/concepts-throw1.C |  8 
  gcc/testsuite/g++.dg/eh/throw4.C | 13 +
  3 files changed, 24 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-throw1.C
  create mode 100644 gcc/testsuite/g++.dg/eh/throw4.C

diff --git a/gcc/cp/typeck.cc b/gcc/cp/typeck.cc
index a15eda3f5f8..4937022ff20 100644
--- a/gcc/cp/typeck.cc
+++ b/gcc/cp/typeck.cc
@@ -10863,7 +10863,9 @@ treat_lvalue_as_rvalue_p (tree expr, bool return_p)
for (tree decl = b->names; decl; decl = TREE_CHAIN (decl))
if (decl == retval)
  return set_implicit_rvalue_p (move (expr));
-  if (b->kind == sk_function_parms || b->kind == sk_try)
+  if (b->kind == sk_function_parms
+ || b->kind == sk_try
+ || b->kind == sk_namespace)
return NULL_TREE;
  }
  }
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-throw1.C 
b/gcc/testsuite/g++.dg/cpp2a/concepts-throw1.C
new file mode 100644
index 000..bc3e3b6891a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-throw1.C
@@ -0,0 +1,8 @@
+// PR c++/112437
+// { dg-do compile { target c++20 } }
+
+struct S {};
+template <typename T>
+concept Throwable = requires(T x) { throw x; };
+
+bool a = Throwable<S>;
diff --git a/gcc/testsuite/g++.dg/eh/throw4.C b/gcc/testsuite/g++.dg/eh/throw4.C
new file mode 100644
index 000..b474472de48
--- /dev/null
+++ b/gcc/testsuite/g++.dg/eh/throw4.C
@@ -0,0 +1,13 @@
+// PR c++/112437
+// { dg-do compile }
+
+struct S {};
+
+S
+foo (S s)
+try {
+  throw s;
+}
+catch (...) {
+  throw s;
+}

base-commit: d22d1a9346f27db41459738c6eb404f8f0956e6f

Re: [PATCH] c++: ttp CTAD equivalence [PR112737]

2024-01-31 Thread Jason Merrill


On 1/31/24 16:03, Patrick Palka wrote:

On Wed, 31 Jan 2024, Jason Merrill wrote:


On 1/31/24 12:12, Patrick Palka wrote:

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

-- >8 --

Here during declaration matching we undesirably consider the two TT{42}
CTAD expressions to be non-equivalent ultimately because for CTAD
placeholder equivalence we compare the TEMPLATE_DECLs (which uses
pointer identity) and here the corresponding TEMPLATE_DECLs for TT are
different since they're from different scopes.  On the other hand, the
corresponding TEMPLATE_TEMPLATE_PARMs are deemed equivalent (since they
have the same position and template parameters).  This turns out to be
the root cause of some of the xtreme-header modules regressions.

We don't have this ttp equivalence issue in other contexts because either
the TEMPLATE_TEMPLATE_PARM is used instead of the TEMPLATE_DECL already
(e.g. when a ttp is used as a targ), or the equivalence logic is relaxed
(e.g. for bound ttps), it seems.

So this patch relaxes ttp CTAD placeholder equivalence accordingly, by
comparing the TEMPLATE_TEMPLATE_PARM instead of the TEMPLATE_DECL.  The
ctp_hasher doesn't need to be adjusted since it currently doesn't include
CLASS_PLACEHOLDER_TEMPLATE in the hash anyway.


Maybe put this handling in cp_tree_equal and call it from here?  Does
iterative_hash_template_arg need something similar?


I was hoping cp_tree_equal would never be called for a ttp TEMPLATE_DECL
after this patch, and so it wouldn't matter either way, but it turns out
to matter for a function template-id:

   template<template<class> class, class T>
   void g(T);

   template<template<class> class TT, class T>
   decltype(g<TT>(T{})) f(T); // #1

   template<template<class> class TT, class T>
   decltype(g<TT>(T{})) f(T); // redeclaration of #1

   template<class T> struct A { A(T); };

   int main() {
     f<A>(0);
   }

Here we represent TT in g<TT> as a TEMPLATE_DECL because it's not until
coercion that convert_template_argument turns it into a
TEMPLATE_TEMPLATE_PARM, but of course we can't coerce until the call is
non-dependent and we know which function we're calling.  (So TT within
a class, variable or alias template-id would be represented as a
TEMPLATE_TEMPLATE_PARM since we can do coercion ahead of time in that
case.)

So indeed it seems desirable to handle this in cp_tree_equal... like so?
Bootstrap and regtest nearly finished.
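
The identity-versus-structure point can be shown with a small standalone sketch. Everything in it is a hypothetical stand-in (`TemplateTemplateParm`, `TemplateDecl`, `equal_relaxed`, `hash_relaxed`), and it simplifies equivalence to level and index only, whereas the discussion above also mentions the template parameters.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Toy stand-ins; hypothetical, not GCC's tree nodes.
struct TemplateTemplateParm { int level; int index; };
struct TemplateDecl { bool is_ttp; TemplateTemplateParm parm; };

// Pointer identity: two declarations of "the same" ttp in different
// scopes compare unequal.
bool equal_by_identity(const TemplateDecl* a, const TemplateDecl* b) {
  return a == b;
}

// Relaxed comparison: for ttps, compare the underlying parameter instead.
bool equal_relaxed(const TemplateDecl* a, const TemplateDecl* b) {
  if (a->is_ttp && b->is_ttp)
    return a->parm.level == b->parm.level && a->parm.index == b->parm.index;
  return a == b;
}

// The hash has to follow the same rule, otherwise equal keys could land
// in different buckets.
std::size_t hash_relaxed(const TemplateDecl* d) {
  if (d->is_ttp)
    return std::hash<int>{}(d->parm.level) * 31u
           + std::hash<int>{}(d->parm.index);
  return std::hash<const TemplateDecl*>{}(d);
}

void demo_ttp_equivalence() {
  TemplateDecl first{true, {1, 0}};   // TT from the first declaration
  TemplateDecl second{true, {1, 0}};  // TT from the redeclaration
  assert(!equal_by_identity(&first, &second));
  assert(equal_relaxed(&first, &second));
  assert(hash_relaxed(&first) == hash_relaxed(&second));
}
```

This is why a change to the equality function is paired with a matching change to the hashing function.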


OK.


-- >8 --

PR c++/112737

gcc/cp/ChangeLog:

* pt.cc (iterative_hash_template_arg) <case TEMPLATE_DECL>:
Adjust hashing to match cp_tree_equal.
(ctp_hasher::hash): Also hash CLASS_PLACEHOLDER_TEMPLATE.
* tree.cc (cp_tree_equal) <case TEMPLATE_DECL>: Return true
for ttp TEMPLATE_DECLs if their TEMPLATE_TEMPLATE_PARMs are
equivalent.
* typeck.cc (structural_comptypes) <case TEMPLATE_TYPE_PARM>:
Use cp_tree_equal to compare CLASS_PLACEHOLDER_TEMPLATE.

gcc/testsuite/ChangeLog:

* g++.dg/template/ttp42.C: New test.
* g++.dg/template/ttp42a.C: New test.
---
  gcc/cp/pt.cc   |  9 +
  gcc/cp/tree.cc |  6 +-
  gcc/cp/typeck.cc   |  4 ++--
  gcc/testsuite/g++.dg/template/ttp42.C  | 14 ++
  gcc/testsuite/g++.dg/template/ttp42a.C | 18 ++
  5 files changed, 48 insertions(+), 3 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/template/ttp42.C
  create mode 100644 gcc/testsuite/g++.dg/template/ttp42a.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 5871cb668d0..ca454758ca7 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -1816,6 +1816,13 @@ iterative_hash_template_arg (tree arg, hashval_t val)
}
return iterative_hash_template_arg (TREE_TYPE (arg), val);
  
+case TEMPLATE_DECL:

+  if (DECL_TEMPLATE_TEMPLATE_PARM_P (arg))
+   return iterative_hash_template_arg (TREE_TYPE (arg), val);
+  else
+   /* Hash it like any other declaration.  */
+   break;
+
  case TARGET_EXPR:
return iterative_hash_template_arg (TARGET_EXPR_INITIAL (arg), val);
  
@@ -4499,6 +4506,8 @@ struct ctp_hasher : ggc_ptr_hash

  hashval_t val = iterative_hash_object (code, 0);
  val = iterative_hash_object (TEMPLATE_TYPE_LEVEL (t), val);
  val = iterative_hash_object (TEMPLATE_TYPE_IDX (t), val);
+if (TREE_CODE (t) == TEMPLATE_TYPE_PARM)
+  val = iterative_hash_template_arg (CLASS_PLACEHOLDER_TEMPLATE (t), val);
  if (TREE_CODE (t) == BOUND_TEMPLATE_TEMPLATE_PARM)
val = iterative_hash_template_arg (TYPE_TI_ARGS (t), val);
  --comparing_specializations;
diff --git a/gcc/cp/tree.cc b/gcc/cp/tree.cc
index 77f57e0f9ac..5c8c05dc168 100644
--- a/gcc/cp/tree.cc
+++ b/gcc/cp/tree.cc
@@ -4084,11 +4084,15 @@ cp_tree_equal (tree t1, tree t2)
}
return false;
  
+case TEMPLATE_DECL:

+  if (DECL_TEMPLATE_TEMPLATE_PARM_P (t1)
+ && DECL_TEMPLATE_TEMPLATE_PARM_P (t2))
+   return cp_tree_equal (TREE_TYPE (t1), TREE_TYPE (t2));
+  /* Fall through.  */
  case VAR_DECL:

Re: [PATCH v2] c++: avoid -Wdangling-reference for std::span-like classes [PR110358]

2024-01-31 Thread Jason Merrill


On 1/31/24 15:56, Marek Polacek wrote:

On Wed, Jan 31, 2024 at 02:57:09PM -0500, Jason Merrill wrote:

On 1/31/24 14:44, Alex Coplan wrote:

Hi Marek,

On 30/01/2024 13:15, Marek Polacek wrote:

On Thu, Jan 25, 2024 at 10:13:10PM -0500, Jason Merrill wrote:

On 1/25/24 20:36, Marek Polacek wrote:

Better version:

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
Real-world experience shows that -Wdangling-reference triggers for
user-defined std::span-like classes a lot.  We can easily avoid that
by considering classes like

   template<typename T>
   struct Span {
 T* data_;
 std::size len_;
   };

to be std::span-like, and not warning for them.  Unlike the previous
patch, this one considers a non-union class template that has a pointer
data member and a trivial destructor as std::span-like.
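
For illustration, here is the kind of code where a warning on such a class would be a false positive. This is a sketch under assumptions: the `operator[]` member, the use of `std::size_t` for the length, and `second_element` are added for the example and are not part of the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// A std::span-like class template: pointer member, trivial destructor.
template <typename T>
struct Span {
  T* data_;
  std::size_t len_;
  T& operator[](std::size_t i) const { return data_[i]; }
};

static_assert(std::is_trivially_destructible<Span<int>>::value,
              "Span only views memory, it does not own it");

int second_element() {
  int arr[3] = {1, 2, 3};
  // The temporary Span is gone at the end of this full expression, but
  // the reference points into arr, which is still alive.
  const int& r = Span<int>{arr, 3}[1];
  return r;
}

void demo_span_like() {
  assert(second_element() == 2);
}
```

The reference `r` never refers to the temporary `Span` itself, only to storage the `Span` points at, so it does not dangle.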

PR c++/110358
PR c++/109640

gcc/cp/ChangeLog:

* call.cc (reference_like_class_p): Don't warn for std::span-like
classes.

gcc/ChangeLog:

* doc/invoke.texi: Update -Wdangling-reference description.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wdangling-reference18.C: New test.
* g++.dg/warn/Wdangling-reference19.C: New test.
* g++.dg/warn/Wdangling-reference20.C: New test.
---
gcc/cp/call.cc| 18 
gcc/doc/invoke.texi   | 14 +++
.../g++.dg/warn/Wdangling-reference18.C   | 24 +++
.../g++.dg/warn/Wdangling-reference19.C   | 25 +++
.../g++.dg/warn/Wdangling-reference20.C   | 42 +++
5 files changed, 123 insertions(+)
create mode 100644 gcc/testsuite/g++.dg/warn/Wdangling-reference18.C
create mode 100644 gcc/testsuite/g++.dg/warn/Wdangling-reference19.C
create mode 100644 gcc/testsuite/g++.dg/warn/Wdangling-reference20.C

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index 9de0d77c423..afd3e1ff024 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -14082,6 +14082,24 @@ reference_like_class_p (tree ctype)
return true;
}
+  /* Avoid warning if CTYPE looks like std::span: it's a class template,
+ has a T* member, and a trivial destructor.  For example,
+
+  template<typename T>
+  struct Span {
+   T* data_;
+   std::size len_;
+  };
+
+ is considered std::span-like.  */
+  if (NON_UNION_CLASS_TYPE_P (ctype)
+  && CLASSTYPE_TEMPLATE_INSTANTIATION (ctype)
+  && TYPE_HAS_TRIVIAL_DESTRUCTOR (ctype))
+for (tree field = next_aggregate_field (TYPE_FIELDS (ctype));
+field; field = next_aggregate_field (DECL_CHAIN (field)))
+  if (TYPE_PTR_P (TREE_TYPE (field)))
+   return true;
+
  /* Some classes, such as std::tuple, have the reference member in its
 (non-direct) base class.  */
  if (dfs_walk_once (TYPE_BINFO (ctype), class_has_reference_member_p_r,
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 6ec56493e59..e0ff18a86f5 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -3916,6 +3916,20 @@ where @code{std::minmax} returns @code{std::pair}, and
both references dangle after the end of the full expression that contains
the call to @code{std::minmax}.
+The warning does not warn for @code{std::span}-like classes.  We consider
+classes of the form:
+
+@smallexample
+template<typename T>
+struct Span @{
+  T* data_;
+  std::size len_;
+@};
+@end smallexample
+
+as @code{std::span}-like; that is, the class is a non-union class template
+that has a pointer data member and a trivial destructor.
+
This warning is enabled by @option{-Wall}.
@opindex Wdelete-non-virtual-dtor
diff --git a/gcc/testsuite/g++.dg/warn/Wdangling-reference18.C b/gcc/testsuite/g++.dg/warn/Wdangling-reference18.C
new file mode 100644
index 000..e088c177769
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wdangling-reference18.C
@@ -0,0 +1,24 @@
+// PR c++/110358
+// { dg-do compile { target c++11 } }
+// { dg-options "-Wdangling-reference" }
+// Don't warn for std::span-like classes.
+
+template <typename T>
+struct Span {
+T* data_;
+int len_;
+
+[[nodiscard]] constexpr auto operator[](int n) const noexcept -> T& { return data_[n]; }
+[[nodiscard]] constexpr auto front() const noexcept -> T& { return data_[0]; }
+[[nodiscard]] constexpr auto back() const noexcept -> T& { return data_[len_ - 1]; }
+};
+
+auto get() -> Span<int>;
+
+auto f() -> int {
+int const& a = get().front(); // { dg-bogus "dangling reference" }
+int const& b = get().back();  // { dg-bogus "dangling reference" }
+int const& c = get()[0];  // { dg-bogus "dangling reference" }
+
+return a + b + c;
+}
diff --git a/gcc/testsuite/g++.dg/warn/Wdangling-reference19.C b/gcc/testsuite/g++.dg/warn/Wdangling-reference19.C
new file mode 100644
index 000..053467d822f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wdangling-reference19.C
@@ -0,0 +1,25 @@
+// PR c++/110358
+// { dg-do compile { target c++11 } }
+// { dg-options

[COMMITTED V3 1/4] RISC-V: Add non-vector types to dfa pipelines

2024-01-31 Thread juzhe.zh...@rivai.ai

Hi, all.

https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=26c34b809cd1a6249027730a8b52bbf6a1c0f4a8
 
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=e56fb037d9d265682f5e7217d8a4c12a8d3fddf8
 
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=4b799a16ae59fc0f508c5931ebf1851a3446b707
 
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=23cd2961bd2ff63583f46e3499a07bd54491d45c
 

These 4 commits cause all of the following testcases to fail (ICEs and dump FAILs).

FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_32-4.c 
scan-tree-dump-times vect "vectorized 1 loops in function" 11
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-1.c 
(internal compiler error: in validate_change_or_fail, at 
config/riscv/riscv-v.cc:4972)
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-1.c 
(test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-1.c 
scan-tree-dump-times vect "vectorized 1 loops in function" 11
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-11.c 
(internal compiler error: in validate_change_or_fail, at 
config/riscv/riscv-v.cc:4972)
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-11.c (test 
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_64-1.c 
(internal compiler error: in validate_change_or_fail, at 
config/riscv/riscv-v.cc:4972)
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_64-1.c (test 
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_64-1.c 
scan-tree-dump-times vect "vectorized 1 loops in function" 11
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_64-4.c 
(internal compiler error: in validate_change_or_fail, at 
config/riscv/riscv-v.cc:4972)
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_64-4.c (test 
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_64-4.c 
scan-tree-dump-times vect "vectorized 1 loops in function" 11
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_64-2.c (internal 
compiler error: in validate_change_or_fail, at config/riscv/riscv-v.cc:4972)
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_64-2.c (test for 
excess errors)
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_64-2.c 
scan-tree-dump-times vect "vectorized 1 loops in function" 11
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_64-8.c (internal 
compiler error: in validate_change_or_fail, at config/riscv/riscv-v.cc:4972)
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_64-8.c (test for 
excess errors)
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_64-8.c 
scan-tree-dump-times vect "vectorized 1 loops in function" 11
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-5.c 
(internal compiler error: in validate_change_or_fail, at 
config/riscv/riscv-v.cc:4972)
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-5.c (test 
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-5.c 
scan-tree-dump-times vect "vectorized 1 loops in function" 11
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_64-3.c 
(internal compiler error: in validate_change_or_fail, at 
config/riscv/riscv-v.cc:4972)
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_64-3.c 
(test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_64-3.c 
scan-tree-dump-times vect "vectorized 1 loops in function" 11
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-9.c 
(internal compiler error: in validate_change_or_fail, at 
config/riscv/riscv-v.cc:4972)
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-9.c 
(test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-3.c 
(internal compiler error: in validate_change_or_fail, at 
config/riscv/riscv-v.cc:4972)
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-3.c 
(test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-3.c 
scan-tree-dump-times vect "vectorized 1 loops in function" 11
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_32-4.c (internal 
compiler error: in validate_change_or_fail, at config/riscv/riscv-v.cc:4972)
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_32-4.c (test for 
excess errors)
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_32-4.c 
scan-tree-dump-times vect "vectorized 1 loops in function" 11
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/offset_extend-1.c (internal 
compiler error: in validate_change_or_fail, at config/riscv/riscv-v.cc:4972)
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/offset_extend-1.c (test for 
excess errors)
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_64-2.c 
(internal compiler err

Re: [PATCH] RISC-V: Support scheduling for sifive p600 series

2024-01-31 Thread Monk Chiang

Thanks, I will push a V2 patch to fix the typo and add a vector cost model
for the p600 series.
As for blocking the div units, I decided to use your suggestion. The P600
series divider handles 4 bits per cycle, so blocking for 3-5 cycles is enough.

On Thu, Feb 1, 2024 at 9:50 AM Edwin Lu  wrote:

> I recently committed changes modifying the scheduling reservations. Some
> things may need to be retested with the newly enabled asserts.
>
> Edwin
>
> On 1/31/2024 1:40 AM, Monk Chiang wrote:
> > Add sifive p600 series scheduler module. For more information
> > see https://www.sifive.com/cores/performance-p650-670.
> > Add sifive-p650, sifive-p670 for mcpu option will come in separate
> patches.
> >
> > gcc/ChangeLog:
> >   * config/riscv/riscv.md: Add "fcvt_i2f", "fcvt_f2i" type
> >   attribute, and include sifive-p600.md.
> >   * config/riscv/generic-ooo.md: Update type attribute.
> >   * config/riscv/sifive-7.md: Update type attribute.
> >   * config/riscv/sifive-p600.md: New file.
> >   * config/riscv/riscv-cores.def (RISCV_TUNE): Add parameter.
> >   * config/riscv/riscv-opts.h (enum riscv_microarchitecture_type):
> >   Add sifive_p600.
> >   * config/riscv/riscv.c (sifive_p600_tune_info): New.
> >   * config/riscv/riscv.h (TARGET_SFB_ALU): Update.
> >   * doc/invoke.texi (RISC-V Options): Add sifive-p600-series
> > ---
> >   gcc/config/riscv/generic-ooo.md  |   2 +-
> >   gcc/config/riscv/generic.md  |   2 +-
> >   gcc/config/riscv/riscv-cores.def |   1 +
> >   gcc/config/riscv/riscv-opts.h|   1 +
> >   gcc/config/riscv/riscv.cc|  17 +++
> >   gcc/config/riscv/riscv.h |   4 +-
> >   gcc/config/riscv/riscv.md|  19 ++--
> >   gcc/config/riscv/sifive-7.md |   2 +-
> >   gcc/config/riscv/sifive-p600.md  | 174 +++
> >   gcc/doc/invoke.texi  |   3 +-
> >   10 files changed, 212 insertions(+), 13 deletions(-)
> >   create mode 100644 gcc/config/riscv/sifive-p600.md
> >
> > diff --git a/gcc/config/riscv/generic-ooo.md
> b/gcc/config/riscv/generic-ooo.md
> > index 421a7bb929d..a22f8a3e079 100644
> > --- a/gcc/config/riscv/generic-ooo.md
> > +++ b/gcc/config/riscv/generic-ooo.md
> > @@ -127,7 +127,7 @@
> >
> >   (define_insn_reservation "generic_ooo_fcvt" 3
> > (and (eq_attr "tune" "generic_ooo")
> > -   (eq_attr "type" "fcvt"))
> > +   (eq_attr "type" "fcvt,fcvt_i2f,fcvt_f2i"))
> > "generic_ooo_issue,generic_ooo_fxu")
> >
> >   (define_insn_reservation "generic_ooo_fcmp" 2
> > diff --git a/gcc/config/riscv/generic.md b/gcc/config/riscv/generic.md
> > index b99ae345bb3..3f0eaa2ea08 100644
> > --- a/gcc/config/riscv/generic.md
> > +++ b/gcc/config/riscv/generic.md
> > @@ -42,7 +42,7 @@
> >
> >   (define_insn_reservation "generic_xfer" 3
> > (and (eq_attr "tune" "generic")
> > -   (eq_attr "type" "mfc,mtc,fcvt,fmove,fcmp"))
> > +   (eq_attr "type" "mfc,mtc,fcvt,fcvt_i2f,fcvt_f2i,fmove,fcmp"))
> > "alu")
> >
> >   (define_insn_reservation "generic_branch" 1
> > diff --git a/gcc/config/riscv/riscv-cores.def
> b/gcc/config/riscv/riscv-cores.def
> > index b30f4dfb08e..a07a79e2cb7 100644
> > --- a/gcc/config/riscv/riscv-cores.def
> > +++ b/gcc/config/riscv/riscv-cores.def
> > @@ -37,6 +37,7 @@ RISCV_TUNE("rocket", generic, rocket_tune_info)
> >   RISCV_TUNE("sifive-3-series", generic, rocket_tune_info)
> >   RISCV_TUNE("sifive-5-series", generic, rocket_tune_info)
> >   RISCV_TUNE("sifive-7-series", sifive_7, sifive_7_tune_info)
> > +RISCV_TUNE("sifive-p600-series", sifive_p600, sifive_p600_tune_info)
> >   RISCV_TUNE("thead-c906", generic, thead_c906_tune_info)
> >   RISCV_TUNE("generic-ooo", generic_ooo, generic_ooo_tune_info)
> >   RISCV_TUNE("size", generic, optimize_size_tune_info)
> > diff --git a/gcc/config/riscv/riscv-opts.h
> b/gcc/config/riscv/riscv-opts.h
> > index 1500f8811ef..25951665b13 100644
> > --- a/gcc/config/riscv/riscv-opts.h
> > +++ b/gcc/config/riscv/riscv-opts.h
> > @@ -55,6 +55,7 @@ extern enum riscv_isa_spec_class riscv_isa_spec;
> >   enum riscv_microarchitecture_type {
> > generic,
> > sifive_7,
> > +  sifive_p600,
> > generic_ooo
> >   };
> >   extern enum riscv_microarchitecture_type riscv_microarchitecture;
> > diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> > index 7b6111aa545..92d6fd5cf47 100644
> > --- a/gcc/config/riscv/riscv.cc
> > +++ b/gcc/config/riscv/riscv.cc
> > @@ -447,6 +447,23 @@ static const struct riscv_tune_param
> sifive_7_tune_info = {
> > NULL, /* vector cost */
> >   };
> >
> > +/* Costs to use when optimizing for Sifive p600 Series.  */
> > +static const struct riscv_tune_param sifive_p600_tune_info = {
> > +  {COSTS_N_INSNS (4), COSTS_N_INSNS (4)},/* fp_add */
> > +  {COSTS_N_INSNS (4), COSTS_N_INSNS (4)},/* fp_mul */
> > +  {COSTS_N_INSNS (20), COSTS_N_INSNS (20)},  /* fp_div */
> > +  {COSTS_N_INSNS (4), COSTS_N_INSNS (4)},/* int_mul */
> >

[PATCH v2] RISC-V: Support scheduling for sifive p600 series

2024-01-31 Thread Monk Chiang

Add sifive p600 series scheduler module. For more information
see https://www.sifive.com/cores/performance-p650-670.
Support for sifive-p650 and sifive-p670 in the -mcpu option will come in
separate patches.

gcc/ChangeLog:
* config/riscv/riscv.md: Add "fcvt_i2f", "fcvt_f2i" type
attribute, and include sifive-p600.md.
* config/riscv/generic-ooo.md: Update type attribute.
* config/riscv/sifive-7.md: Update type attribute.
* config/riscv/sifive-p600.md: New file.
* config/riscv/riscv-cores.def (RISCV_TUNE): Add parameter.
* config/riscv/riscv-opts.h (enum riscv_microarchitecture_type):
Add sifive_p600.
* config/riscv/riscv.cc (sifive_p600_tune_info): New.
* config/riscv/riscv.h (TARGET_SFB_ALU): Update.
* doc/invoke.texi (RISC-V Options): Add sifive-p600-series.
---
 gcc/config/riscv/generic-ooo.md  |   2 +-
 gcc/config/riscv/generic.md  |   2 +-
 gcc/config/riscv/riscv-cores.def |   1 +
 gcc/config/riscv/riscv-opts.h|   1 +
 gcc/config/riscv/riscv.cc|  17 +++
 gcc/config/riscv/riscv.h |   4 +-
 gcc/config/riscv/riscv.md|  19 ++--
 gcc/config/riscv/sifive-7.md |   2 +-
 gcc/config/riscv/sifive-p600.md  | 178 +++
 gcc/doc/invoke.texi  |   3 +-
 10 files changed, 216 insertions(+), 13 deletions(-)
 create mode 100644 gcc/config/riscv/sifive-p600.md

diff --git a/gcc/config/riscv/generic-ooo.md b/gcc/config/riscv/generic-ooo.md
index 421a7bb929d..a22f8a3e079 100644
--- a/gcc/config/riscv/generic-ooo.md
+++ b/gcc/config/riscv/generic-ooo.md
@@ -127,7 +127,7 @@
 
 (define_insn_reservation "generic_ooo_fcvt" 3
   (and (eq_attr "tune" "generic_ooo")
-   (eq_attr "type" "fcvt"))
+   (eq_attr "type" "fcvt,fcvt_i2f,fcvt_f2i"))
   "generic_ooo_issue,generic_ooo_fxu")
 
 (define_insn_reservation "generic_ooo_fcmp" 2
diff --git a/gcc/config/riscv/generic.md b/gcc/config/riscv/generic.md
index b99ae345bb3..3f0eaa2ea08 100644
--- a/gcc/config/riscv/generic.md
+++ b/gcc/config/riscv/generic.md
@@ -42,7 +42,7 @@
 
 (define_insn_reservation "generic_xfer" 3
   (and (eq_attr "tune" "generic")
-   (eq_attr "type" "mfc,mtc,fcvt,fmove,fcmp"))
+   (eq_attr "type" "mfc,mtc,fcvt,fcvt_i2f,fcvt_f2i,fmove,fcmp"))
   "alu")
 
 (define_insn_reservation "generic_branch" 1
diff --git a/gcc/config/riscv/riscv-cores.def b/gcc/config/riscv/riscv-cores.def
index b30f4dfb08e..a07a79e2cb7 100644
--- a/gcc/config/riscv/riscv-cores.def
+++ b/gcc/config/riscv/riscv-cores.def
@@ -37,6 +37,7 @@ RISCV_TUNE("rocket", generic, rocket_tune_info)
 RISCV_TUNE("sifive-3-series", generic, rocket_tune_info)
 RISCV_TUNE("sifive-5-series", generic, rocket_tune_info)
 RISCV_TUNE("sifive-7-series", sifive_7, sifive_7_tune_info)
+RISCV_TUNE("sifive-p600-series", sifive_p600, sifive_p600_tune_info)
 RISCV_TUNE("thead-c906", generic, thead_c906_tune_info)
 RISCV_TUNE("generic-ooo", generic_ooo, generic_ooo_tune_info)
 RISCV_TUNE("size", generic, optimize_size_tune_info)
diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index 1500f8811ef..25951665b13 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h
@@ -55,6 +55,7 @@ extern enum riscv_isa_spec_class riscv_isa_spec;
 enum riscv_microarchitecture_type {
   generic,
   sifive_7,
+  sifive_p600,
   generic_ooo
 };
 extern enum riscv_microarchitecture_type riscv_microarchitecture;
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 7b6111aa545..476533395b5 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -447,6 +447,23 @@ static const struct riscv_tune_param sifive_7_tune_info = {
   NULL,/* vector cost */
 };
 
+/* Costs to use when optimizing for Sifive p600 Series.  */
+static const struct riscv_tune_param sifive_p600_tune_info = {
+  {COSTS_N_INSNS (4), COSTS_N_INSNS (4)},  /* fp_add */
+  {COSTS_N_INSNS (4), COSTS_N_INSNS (4)},  /* fp_mul */
+  {COSTS_N_INSNS (20), COSTS_N_INSNS (20)},/* fp_div */
+  {COSTS_N_INSNS (4), COSTS_N_INSNS (4)},  /* int_mul */
+  {COSTS_N_INSNS (6), COSTS_N_INSNS (6)},  /* int_div */
+  4,   /* issue_rate */
+  4,   /* branch_cost */
+  3,   /* memory_cost */
+  4,   /* fmv_cost */
+  true,   /* slow_unaligned_access */
+  false,   /* use_divmod_expansion */
+  RISCV_FUSE_LUI_ADDI | RISCV_FUSE_AUIPC_ADDI,  /* fusible_ops */
+  &generic_vector_cost,/* vector cost */
+};
+
 /* Costs to use when optimizing for T-HEAD c906.  */
 static const struct riscv_tune_param thead_c906_tune_info = {
   {COSTS_N_INSNS (4), COSTS_N_INSNS (5)}, /* fp_add */
diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv

[PATCH v2] RISC-V: Support scheduling for sifive p600 series

2024-01-31 Thread juzhe.zh...@rivai.ai

Hi, Monk.

This model doesn't include the vector pipeline.  Will you add it in
follow-up patches?



juzhe.zh...@rivai.ai

Re: [COMMITTED V3 1/4] RISC-V: Add non-vector types to dfa pipelines

2024-01-31 Thread Edwin Lu


Hi Juzhe,

I didn't see any ICEs when I tested locally (tested on 
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=8123f3ca3fd891034a8366518e756f161c4ff40d). 
Can you tell me what config you're using?


Edwin


Re: [COMMITTED V3 1/4] RISC-V: Add non-vector types to dfa pipelines

2024-01-31 Thread Edwin Lu

From what I know, if it was a problem with my dfa reservation assert, 
it would have ICEd in riscv.cc and not riscv-v.cc. For now I reverted 
the changes since I don't want to leave things possibly broken overnight 
and not knowing which patch is the root cause. I kicked off another set 
of test runs using our full gcc postcommit testing configurations and 
should have those results in tomorrow. Hopefully it was just a missed 
config target I didn't test and wasn't tested on the precommit ci.


Edwin


Re: Re: [COMMITTED V3 1/4] RISC-V: Add non-vector types to dfa pipelines

2024-01-31 Thread juzhe.zh...@rivai.ai

Maybe my testing was wrong. Let me try again in a clean Linux
environment.



juzhe.zh...@rivai.ai
 

Re: Re: [COMMITTED V3 1/4] RISC-V: Add non-vector types to dfa pipelines

2024-01-31 Thread juzhe.zh...@rivai.ai

Oh, sorry. I think I tested incorrectly, using an incremental build.

With a clean trunk there are no ICEs now, just the following FAILs:
FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-11.c (test 
for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/unop/vfsqrt-run.c (test for excess errors)
FAIL: gcc.target/riscv/rvv/autovec/unop/vfsqrt-run.c (test for excess errors)

Your patch is good.

Thanks for the help.


juzhe.zh...@rivai.ai
 
From: Edwin Lu
Date: 2024-02-01 14:13
To: juzhe.zh...@rivai.ai; gcc-patches
CC: Robin Dapp; kito.cheng; jeffreyalaw; palmer; vineetg; Patrick O'Neill
Subject: Re: [COMMITTED V3 1/4] RISC-V: Add non-vector types to dfa pipelines
From what I know, if it was a problem with my dfa reservation assert, 
it would have ICEd in riscv.cc and not riscv-v.cc. For now I reverted 
the changes since I don't want to leave things possibly broken overnight 
and not knowing which patch is the root cause. I kicked off another set 
of test runs using our full gcc postcommit testing configurations and 
should have those results in tomorrow. Hopefully it was just a missed 
config target I didn't test and wasn't tested on the precommit ci.
 
Edwin
 
On 1/31/2024 9:42 PM, Edwin Lu wrote:
> Hi Juzhe,
> 
> I didn't see any ICEs when I tested locally (tested on 
> https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=8123f3ca3fd891034a8366518e756f161c4ff40d).
>  Can you tell me what config you're using?
> 
> Edwin
> 
> On 1/31/2024 6:57 PM, juzhe.zh...@rivai.ai wrote:
>> Hi, all.
>>
>> https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=26c34b809cd1a6249027730a8b52bbf6a1c0f4a8
>>  
>> 
>> https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=e56fb037d9d265682f5e7217d8a4c12a8d3fddf8
>>  
>> 
>> https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=4b799a16ae59fc0f508c5931ebf1851a3446b707
>>  
>> 
>> https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=23cd2961bd2ff63583f46e3499a07bd54491d45c
>>  
>> 
>>
>> These 4 commits cause all testcases failed (ICE and dump FAILs).
>>
>> FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_32-4.c 
>> scan-tree-dump-times vect "vectorized 1 loops in function" 11
>> FAIL: 
>> gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-1.c 
>> (internal compiler error: in validate_change_or_fail, at 
>> config/riscv/riscv-v.cc:4972)
>> FAIL: 
>> gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-1.c 
>> (test for excess errors)
>> FAIL: 
>> gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-1.c 
>> scan-tree-dump-times vect "vectorized 1 loops in function" 11
>> FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-11.c 
>> (internal compiler error: in validate_change_or_fail, at 
>> config/riscv/riscv-v.cc:4972)
>> FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-11.c 
>> (test for excess errors)
>> FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_64-1.c 
>> (internal compiler error: in validate_change_or_fail, at 
>> config/riscv/riscv-v.cc:4972)
>> FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_64-1.c 
>> (test for excess errors)
>> FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_64-1.c 
>> scan-tree-dump-times vect "vectorized 1 loops in function" 11
>> FAIL: 
>> gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_64-4.c 
>> (internal compiler error: in validate_change_or_fail, at 
>> config/riscv/riscv-v.cc:4972)
>> FAIL: 
>> gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_64-4.c 
>> (test for excess errors)
>> FAIL: 
>> gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_64-4.c 
>> scan-tree-dump-times vect "vectorized 1 loops in function" 11
>> FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_64-2.c 
>> (internal compiler error: in validate_change_or_fail, at 
>> config/riscv/riscv-v.cc:4972)
>> FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_64-2.c 
>> (test for excess errors)
>> FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_64-2.c 
>> scan-tree-dump-times vect "vectorized 1 loops in function" 11
>> FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_64-8.c 
>> (internal compiler error: in validate_change_or_fail, at 
>> config/riscv/riscv-v.cc:4972)
>> FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_64-8.c 
>> (test for excess errors)
>> FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_64-8.c 
>> scan-tree-dump-times vect "vectorized 1 loops in function" 11
>> FAIL: 
>> gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-5.c 
>> (internal compiler error: in validate_change_or_fail, at 
>>

Re: Re: [COMMITTED V3 1/4] RISC-V: Add non-vector types to dfa pipelines

2024-01-31 Thread juzhe.zh...@rivai.ai

Sorry again. I just realized that you have reverted your patches, which is
why the tests pass for me now.

I checked out your latest patch commit:
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=23cd2961bd2ff63583f46e3499a07bd54491d45c
 

Then I can reproduce the ICE now:

bug.c: In function 'popcount32_uint64_tuint64_t':
bug.c:20:3: internal compiler error: in validate_change_or_fail, at 
config/riscv/riscv-v.cc:4972
   20 |   }
  |   ^
bug.c:123:3: note: in expansion of macro 'DEF32'
  123 |   DEF32 (uint64_t, uint64_t)
   \
  |   ^
bug.c:444:1: note: in expansion of macro 'DEF_ALL'
  444 | DEF_ALL ()
  | ^~~
0x1fbf06f riscv_vector::validate_change_or_fail(rtx_def*, rtx_def**, rtx_def*, 
bool)
../../../../gcc/gcc/config/riscv/riscv-v.cc:4972
0x1fe2c60 simplify_replace_vlmax_avl
../../../../gcc/gcc/config/riscv/riscv-avlprop.cc:200
0x1fe3b05 pass_avlprop::execute(function*)
../../../../gcc/gcc/config/riscv/riscv-avlprop.cc:506

Would you mind taking a look at it?



juzhe.zh...@rivai.ai
 
From: Edwin Lu
Date: 2024-02-01 14:13
To: juzhe.zh...@rivai.ai; gcc-patches
CC: Robin Dapp; kito.cheng; jeffreyalaw; palmer; vineetg; Patrick O'Neill
Subject: Re: [COMMITTED V3 1/4] RISC-V: Add non-vector types to dfa pipelines
From what I know, if it was a problem with my dfa reservation assert, 
it would have ICEd in riscv.cc and not riscv-v.cc. For now I reverted 
the changes since I don't want to leave things possibly broken overnight 
and not knowing which patch is the root cause. I kicked off another set 
of test runs using our full gcc postcommit testing configurations and 
should have those results in tomorrow. Hopefully it was just a missed 
config target I didn't test and wasn't tested on the precommit ci.
 
Edwin
 
On 1/31/2024 9:42 PM, Edwin Lu wrote:
> Hi Juzhe,
> 
> I didn't see any ICEs when I tested locally (tested on 
> https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=8123f3ca3fd891034a8366518e756f161c4ff40d).
>  Can you tell me what config you're using?
> 
> Edwin
> 
> On 1/31/2024 6:57 PM, juzhe.zh...@rivai.ai wrote:
>> Hi, all.
>>
>> https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=26c34b809cd1a6249027730a8b52bbf6a1c0f4a8
>>  
>> 
>> https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=e56fb037d9d265682f5e7217d8a4c12a8d3fddf8
>>  
>> 
>> https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=4b799a16ae59fc0f508c5931ebf1851a3446b707
>>  
>> 
>> https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=23cd2961bd2ff63583f46e3499a07bd54491d45c
>>  
>> 
>>
>> These 4 commits cause all testcases failed (ICE and dump FAILs).
>>
>> FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_32-4.c 
>> scan-tree-dump-times vect "vectorized 1 loops in function" 11
>> FAIL: 
>> gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-1.c 
>> (internal compiler error: in validate_change_or_fail, at 
>> config/riscv/riscv-v.cc:4972)
>> FAIL: 
>> gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-1.c 
>> (test for excess errors)
>> FAIL: 
>> gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-1.c 
>> scan-tree-dump-times vect "vectorized 1 loops in function" 11
>> FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-11.c 
>> (internal compiler error: in validate_change_or_fail, at 
>> config/riscv/riscv-v.cc:4972)
>> FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-11.c 
>> (test for excess errors)
>> FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_64-1.c 
>> (internal compiler error: in validate_change_or_fail, at 
>> config/riscv/riscv-v.cc:4972)
>> FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_64-1.c 
>> (test for excess errors)
>> FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_64-1.c 
>> scan-tree-dump-times vect "vectorized 1 loops in function" 11
>> FAIL: 
>> gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_64-4.c 
>> (internal compiler error: in validate_change_or_fail, at 
>> config/riscv/riscv-v.cc:4972)
>> FAIL: 
>> gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_64-4.c 
>> (test for excess errors)
>> FAIL: 
>> gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_64-4.c 
>> scan-tree-dump-times vect "vectorized 1 loops in function" 11
>> FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_64-2.c 
>> (internal compiler error: in validate_change_or_fail, at 
>> config/riscv/riscv-v.cc:4972)
>> FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_64-2.c 
>> (test for excess errors)
>> FAIL: gcc.target/riscv/rvv/autovec/gather-scatter/gather_lo

Re: [PATCH 1/3] vect: Pass stmt_vec_info to TARGET_SIMD_CLONE_USABLE

2024-01-31 Thread Richard Biener

On Wed, 31 Jan 2024, Andre Vieira (lists) wrote:

> 
> 
> On 31/01/2024 14:35, Richard Biener wrote:
> > On Wed, 31 Jan 2024, Andre Vieira (lists) wrote:
> > 
> >>
> >>
> >> On 31/01/2024 13:58, Richard Biener wrote:
> >>> On Wed, 31 Jan 2024, Andre Vieira (lists) wrote:
> >>>
> 
> 
>  On 31/01/2024 12:13, Richard Biener wrote:
> > On Wed, 31 Jan 2024, Richard Biener wrote:
> >
> >> On Tue, 30 Jan 2024, Andre Vieira wrote:
> >>
> >>>
> >>> This patch adds stmt_vec_info to TARGET_SIMD_CLONE_USABLE to make sure
> >>> the
> >>> target can reject a simd_clone based on the vector mode it is using.
> >>> This is needed because for VLS SVE vectorization the vectorizer
> >>> accepts
> >>> Advanced SIMD simd clones when vectorizing using SVE types because the
> >>> simdlens
> >>> might match.  This will cause type errors later on.
> >>>
> >>> Other targets do not currently need to use this argument.
> >>
> >> Can you instead pass down the mode?
> >
> > Thinking about that again the cgraph_simd_clone info in the clone
> > should have sufficient information to disambiguate.  If it doesn't
> > then we should amend it.
> >
> > Richard.
> 
>  Hi Richard,
> 
>  Thanks for the review, I don't think cgraph_simd_clone_info is the right
>  place
>  to pass down this information, since this is information about the caller
>  rather than the simdclone itself. What we are trying to achieve here is
>  making
>  the vectorizer being able to accept or reject simdclones based on the ISA
>  we
>  are vectorizing for. To distinguish between SVE and Advanced SIMD ISAs we
>  use
>  modes, I am also not sure that's ideal but it is what we currently use.
>  So
>  to
>  answer your earlier question, yes I can also pass down mode if that's
>  preferable.
> >>>
> >>> Note cgraph_simd_clone_info has simdlen and we seem to check elsewhere
> >>> whether that's POLY or constant.  I wonder how aarch64_sve_mode_p
> >>> comes into play here which in the end classifies VLS SVE modes as
> >>> non-SVE?
> >>>
> >>
> >> Using -msve-vector-bits=128
> >> (gdb) p TYPE_MODE (STMT_VINFO_VECTYPE (stmt_vinfo))
> >> $4 = E_VNx4SImode
> >> (gdb) p  TYPE_SIZE (STMT_VINFO_VECTYPE (stmt_vinfo))
> >> $5 = (tree) 0xf741c1b0
> >> (gdb) p debug (TYPE_SIZE (STMT_VINFO_VECTYPE (stmt_vinfo)))
> >> 128
> >> (gdb) p aarch64_sve_mode_p (TYPE_MODE (STMT_VINFO_VECTYPE (stmt_vinfo)))
> >> $5 = true
> >>
> >> and for reference without vls codegen:
> >> (gdb) p TYPE_MODE (STMT_VINFO_VECTYPE (stmt_vinfo))
> >> $1 = E_VNx4SImode
> >> (gdb) p  debug (TYPE_SIZE (STMT_VINFO_VECTYPE (stmt_vinfo)))
> >> POLY_INT_CST [128, 128]
> >>
> >> Having said that I believe that the USABLE targethook implementation for
> >> aarch64 should also block other uses, like an Advanced SIMD mode being used
> >> as
> >> input for a SVE VLS SIMDCLONE. The reason being that for instance 'half'
> >> registers like VNx2SI are packed differently from V2SI.
> >>
> >> We could teach the vectorizer to support these of course, but that requires
> >> more work and is not extremely useful just yet. I'll add the extra check
> >> that
> >> to the patch once we agree on how to pass down the information we need.
> >> Happy
> >> to use either mode, or stmt_vec_info and extract the mode from it like it
> >> does
> >> now.
> > 
> > As said, please pass down 'mode'.  But I wonder how to document it,
> > which mode is that supposed to be?  Any of result or any argument
> > mode that happens to be a vector?  I think that we might be able
> > to mix Advanced SIMD modes and SVE modes with -msve-vector-bits=128
> > in the same loop?
> > 
> > Are the simd clones you don't want to use with -msve-vector-bits=128
> > having constant simdlen?  If so why do you generate them in the first
> > place?
> 
> So this is where things get a bit confusing and I will write up some text for
> these cases to put in our ABI document (currently in Beta and in need of some
> tlc).
> 
> Our intended behaviour is for a 'declare simd' without a simdlen to generate
> simdclones for:
> * Advanced SIMD 128 and 64-bit vectors, where possible (we don't allow for
> simdlen 1, Tamar fixed that in gcc recently),
> * SVE VLA vectors.
> 
> Let me illustrate this with an example:
> 
> __attribute__ ((simd (notinbranch), const)) float cosf(float);
> 
> Should tell the compiler the following simd clones are available:
> __ZGVnN4v_cosf 128-bit 4x4 float Advanced SIMD clone
> __ZGVnN2v_cosf 64-bit  4x2 float Advanced SIMD clone
> __ZGVsMxv_cosf [128, 128]-bit 4x4xN SVE SIMD clone
> 
> [To save you looking into the abi let me break this down, _ZGV is prefix, then
> 'n' or 's' picks between Advanced SIMD and SVE, 'N' or 'M' picks between Not
> Masked and Masked (SVE is always masked even if we ask for notinbranch), then
> a digit or 'x' picks between Vector Length or VLA, and after that you get a
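
The name breakdown quoted above can be sketched as a small decoder. This is only an illustration under stated assumptions: the struct and function names are hypothetical and not part of GCC, an optional extra leading underscore is accepted because the quoted examples print one, and everything after the length field is left undecoded because the quoted explanation is cut off at that point.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper type; not part of GCC.  */
struct simd_clone_name
{
  char isa;        /* 'n' = Advanced SIMD, 's' = SVE.  */
  bool masked;     /* 'M' = masked, 'N' = not masked.  */
  bool vla;        /* 'x' = vector-length agnostic.  */
  unsigned lanes;  /* Vector length when !vla, otherwise 0.  */
};

/* Decode only the fields described in the quoted text: the _ZGV prefix,
   the ISA letter, the mask letter and the length.  Whatever follows the
   length is not interpreted.  Returns false if NAME does not match.  */
static bool
decode_simd_clone_name (const char *name, struct simd_clone_name *out)
{
  assert (name != NULL && out != NULL);

  /* Assumption: tolerate one extra leading underscore, as printed in
     the quoted examples.  */
  if (strncmp (name, "__ZGV", 5) == 0)
    name++;
  if (strncmp (name, "_ZGV", 4) != 0)
    return false;
  name += 4;

  if (*name != 'n' && *name != 's')
    return false;
  out->isa = *name++;

  if (*name != 'N' && *name != 'M')
    return false;
  out->masked = (*name++ == 'M');

  if (*name == 'x')
    {
      out->vla = true;
      out->lanes = 0;
      return true;
    }
  if (!isdigit ((unsigned char) *name))
    return false;
  out->vla = false;
  out->lanes = (unsigned) strtoul (name, NULL, 10);
  return true;
}
```

With this sketch, the 'nN4' name decodes as an unmasked Advanced SIMD clone with 4 lanes, and the 'sMx' name as a masked, length-agnostic SVE clone.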

Re: [PATCH v3 4/5] Add tests for C/C++ musttail attributes

2024-01-31 Thread Andi Kleen

> This will run the test only once with -std=c++11.  We'll get better coverage
> with dropping the line above and using
> 
> /* { dg-do compile { target { tail_call && { c || c++11 } } } } */
> 
> but here it may not matter.

The problem is that older C/C++ standards don't support [[]] attributes.
It would make sense to say >= gnu++11 || >= c23 but I don't know how to
express that.

Re: [PATCH 1/3] vect: Pass stmt_vec_info to TARGET_SIMD_CLONE_USABLE

2024-01-31 Thread Richard Sandiford

"Andre Vieira (lists)"  writes:
> [...] The question at hand 
> here is, what can the vectorizer use for a specific loop. If we are 
> using Advanced SIMD modes then it needs to call an Advanced SIMD clone, 
> and if we are using SVE modes then it needs to call an SVE clone. At 
> least until we support the ABI conversion, because like I said for an 
> unpacked argument they behave differently.

Probably also worth noting that multi-byte elements are laid out
differently for big-endian.  E.g. V4SI is loaded as a 128-bit integer
whereas VNx4SI is loaded as an array of 4 32-bit integers, with the
first 32-bit integer going in the least significant bits of the register.

So it would only be possible to use Advanced SIMD clones for SVE modes
and vice versa for little-endian, or if the elements are all bytes,
or if we add some reverses to the inputs and outputs.

Richard
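
To make the layout point above concrete, here is a toy model (not GCC code; all names are hypothetical) of where each 32-bit element of a four-element vector ends up in a 128-bit register. It assumes lane 0 means the least significant 32 bits and encodes only what the message states: V4SI is loaded as one 128-bit integer, while VNx4SI is loaded as an array with the first element in the least significant bits.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: lane 0 is the least significant 32 bits of the register.  */

/* Advanced SIMD V4SI: loaded as one 128-bit integer, so on big-endian
   the first element in memory lands in the most significant lane.  */
static unsigned
v4si_lane_for_element (unsigned elt, bool big_endian)
{
  assert (elt < 4);
  return big_endian ? 3 - elt : elt;
}

/* SVE VNx4SI: loaded as an array; the first element always goes in the
   least significant bits, whatever the endianness.  */
static unsigned
vnx4si_lane_for_element (unsigned elt)
{
  assert (elt < 4);
  return elt;
}

/* True if both layouts put every element in the same lane, i.e. no
   reverse would be needed to pass one kind of vector as the other.  */
static bool
layouts_agree (bool big_endian)
{
  for (unsigned elt = 0; elt < 4; elt++)
    if (v4si_lane_for_element (elt, big_endian)
        != vnx4si_lane_for_element (elt))
      return false;
  return true;
}
```

In this model the layouts agree for little-endian and disagree for big-endian, which matches the restriction described above for multi-byte elements.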

[PATCH] i386: Clear REG_UNUSED and REG_DEAD notes from the IL at the end of vzeroupper pass [PR113059]

2024-01-31 Thread Jakub Jelinek

Hi!

The move of the vzeroupper pass from after the reload pass to after
postreload_cse helped only partially: CSE-like passes can still invalidate
those notes (especially REG_UNUSED) if they use some earlier register
holding some value later on in the IL.

So, either we could try to move it one pass further after gcse2 and hope
no later pass invalidates the notes, or the following patch attempts to
restore the REG_DEAD/REG_UNUSED state from GCC 13 and earlier, where
the LRA or reload passes remove all REG_DEAD/REG_UNUSED notes and the notes
reappear only at the start of dse2 pass when it calls
  df_note_add_problem ();
  df_analyze ();
So, effectively
  NEXT_PASS (pass_postreload_cse);
  NEXT_PASS (pass_gcse2);
  NEXT_PASS (pass_split_after_reload);
  NEXT_PASS (pass_ree);
  NEXT_PASS (pass_compare_elim_after_reload);
  NEXT_PASS (pass_thread_prologue_and_epilogue);
passes operate without those notes in the IL.
While in GCC 14 mode switching computes the notes problem at the start of
vzeroupper, the patch below removes them at the end of the pass again, so
that the above passes continue to operate without them.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-01-31  Jakub Jelinek  

PR target/113059
* config/i386/i386-features.cc (rest_of_handle_insert_vzeroupper):
Remove REG_DEAD/REG_UNUSED notes at the end of the pass before
df_analyze call.

--- gcc/config/i386/i386-features.cc.jj 2024-01-08 12:15:13.611477047 +0100
+++ gcc/config/i386/i386-features.cc 2024-01-30 12:36:27.834515803 +0100
@@ -2664,6 +2664,32 @@ rest_of_handle_insert_vzeroupper (void)
   /* Call optimize_mode_switching.  */
   g->get_passes ()->execute_pass_mode_switching ();
 
+  /* LRA removes all REG_DEAD/REG_UNUSED notes and normally they
+ reappear in the IL only at the start of pass_rtl_dse2, which does
+ df_note_add_problem (); df_analyze ();
+ The vzeroupper is scheduled after postreload_cse pass and mode
+ switching computes the notes as well, the problem is that e.g.
+ pass_gcse2 doesn't maintain the notes, see PR113059 and
+ PR112760.  Remove the notes now to restore status quo ante
+ until we figure out how to maintain the notes or what else
+ to do.  */
+  basic_block bb;
+  rtx_insn *insn;
+  FOR_EACH_BB_FN (bb, cfun)
+FOR_BB_INSNS (bb, insn)
+  if (NONDEBUG_INSN_P (insn))
+   {
+ rtx *pnote = &REG_NOTES (insn);
+ while (*pnote != 0)
+   {
+ if (REG_NOTE_KIND (*pnote) == REG_DEAD
+ || REG_NOTE_KIND (*pnote) == REG_UNUSED)
+   *pnote = XEXP (*pnote, 1);
+ else
+   pnote = &XEXP (*pnote, 1);
+   }
+   }
+
   df_analyze ();
   return 0;
 }

Jakub
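
The note-removal loop in the patch walks the note chain through a pointer to the link rather than a pointer to the node, so unlinking needs no special case for the head. A standalone sketch of the same idiom follows; the enum, struct and function names are hypothetical stand-ins for illustration, not GCC's REG_NOTES machinery.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for reg-note kinds and the note chain.  */
enum note_kind { NOTE_DEAD, NOTE_UNUSED, NOTE_EQUAL };

struct note
{
  enum note_kind kind;
  struct note *next;
};

/* Unlink every NOTE_DEAD/NOTE_UNUSED note from *HEAD, keeping the other
   notes in their original order.  Returns the number of notes unlinked.
   Nodes are not freed here.  */
static unsigned
remove_liveness_notes (struct note **head)
{
  unsigned removed = 0;
  struct note **pnote = head;

  assert (head != NULL);
  while (*pnote != NULL)
    {
      if ((*pnote)->kind == NOTE_DEAD || (*pnote)->kind == NOTE_UNUSED)
        {
          /* Splice the note out; PNOTE now designates its successor.  */
          *pnote = (*pnote)->next;
          removed++;
        }
      else
        /* Keep the note and advance to its link field.  */
        pnote = &(*pnote)->next;
    }
  return removed;
}
```

Because PNOTE always points at the link that would need rewriting, the same assignment handles removal at the head, in the middle and at the tail of the chain.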

[PATCH] lower-bitint: Fix up VIEW_CONVERT_EXPR handling in handle_operand_addr [PR113639]

2024-01-31 Thread Jakub Jelinek

Hi!

Yet another spot where we need to treat VIEW_CONVERT_EXPR differently
from NOP_EXPR/CONVERT_EXPR.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-01-31  Jakub Jelinek  

PR tree-optimization/113639
* gimple-lower-bitint.cc (bitint_large_huge::handle_operand_addr):
For VIEW_CONVERT_EXPR set rhs1 to its operand.

* gcc.dg/bitint-79.c: New test.

--- gcc/gimple-lower-bitint.cc.jj   2024-01-27 13:06:49.183671155 +0100
+++ gcc/gimple-lower-bitint.cc  2024-01-30 17:06:56.829144801 +0100
@@ -2159,6 +2159,8 @@ bitint_large_huge::handle_operand_addr (
  gcc_assert (gimple_assign_cast_p (g));
  tree rhs1 = gimple_assign_rhs1 (g);
  bitint_prec_kind kind = bitint_prec_small;
+ if (TREE_CODE (rhs1) == VIEW_CONVERT_EXPR)
+   rhs1 = TREE_OPERAND (rhs1, 0);
  gcc_assert (INTEGRAL_TYPE_P (TREE_TYPE (rhs1)));
  if (TREE_CODE (TREE_TYPE (rhs1)) == BITINT_TYPE)
kind = bitint_precision_kind (TREE_TYPE (rhs1));
--- gcc/testsuite/gcc.dg/bitint-79.c.jj 2024-01-30 17:18:50.711135054 +0100
+++ gcc/testsuite/gcc.dg/bitint-79.c 2024-01-30 17:18:22.986524397 +0100
@@ -0,0 +1,16 @@
+/* PR tree-optimization/113639 */
+/* { dg-do compile { target bitint } } */
+/* { dg-options "-O2 -std=c23" } */
+
+int j, k;
+#if __BITINT_MAXWIDTH__ >= 162
+struct S { _BitInt(162) n; };
+void bar (_BitInt(162) x);
+
+void
+foo (struct S s)
+{
+  bar (s.n * j);
+  (void) (s.n * k);
+}
+#endif

Jakub

[PATCH] dwarf2out: Fix ICE on large _BitInt in loc_list_from_tree_1 [PR113637]

2024-01-31 Thread Jakub Jelinek

Hi!

This spot uses SCALAR_INT_TYPE_MODE which obviously ICEs for large/huge
BITINT_TYPE types which have BLKmode.  But such large BITINT_TYPEs certainly
don't fit into DWARF2_ADDR_SIZE either, so we can just assume it would be
false if type has BLKmode.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-01-31  Jakub Jelinek  

PR debug/113637
* dwarf2out.cc (loc_list_from_tree_1): Assume integral types
with BLKmode are larger than DWARF2_ADDR_SIZE.

* gcc.dg/bitint-80.c: New test.

--- gcc/dwarf2out.cc.jj 2024-01-24 13:11:21.132468150 +0100
+++ gcc/dwarf2out.cc 2024-01-30 17:23:41.249054946 +0100
@@ -19027,6 +19027,7 @@ loc_list_from_tree_1 (tree loc, int want
&& ! DECL_IGNORED_P (loc)
&& (INTEGRAL_TYPE_P (TREE_TYPE (loc))
|| POINTER_TYPE_P (TREE_TYPE (loc)))
+   && TYPE_MODE (TREE_TYPE (loc)) != BLKmode
&& (GET_MODE_SIZE (SCALAR_INT_TYPE_MODE (TREE_TYPE (loc)))
<= DWARF2_ADDR_SIZE))
  {
--- gcc/testsuite/gcc.dg/bitint-80.c.jj 2024-01-30 17:30:02.843696120 +0100
+++ gcc/testsuite/gcc.dg/bitint-80.c 2024-01-30 17:32:33.301583203 +0100
@@ -0,0 +1,15 @@
+/* PR debug/113637 */
+/* { dg-do compile { target bitint } } */
+/* { dg-options "-g -std=c23" } */
+
+#if __BITINT_MAXWIDTH__ >= 639
+typedef _BitInt(639) B;
+#else
+typedef _BitInt(63) B;
+#endif
+
+void
+foo (B n)
+{
+  extern void bar (int [][n]);
+}

Jakub
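
The fix relies on short-circuit evaluation: the BLKmode test comes before the size query, so the query that would ICE is never reached for large _BitInt types. A miniature model of that ordering is sketched below. The mode enum and helper names are hypothetical, and the asserting size helper only imitates the "valid for scalar integer modes only" behaviour described above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical miniature of machine modes; not GCC's real enum.  */
enum toy_mode { TOY_BLKmode, TOY_SImode, TOY_DImode, TOY_TImode };

/* Size in bytes of a scalar integer mode.  Like the real accessor in
   spirit, this is only valid for scalar integer modes and asserts
   when handed TOY_BLKmode.  */
static unsigned
toy_scalar_int_mode_size (enum toy_mode mode)
{
  assert (mode != TOY_BLKmode);
  switch (mode)
    {
    case TOY_SImode:
      return 4;
    case TOY_DImode:
      return 8;
    default:
      return 16;
    }
}

/* True if a value of MODE fits in ADDR_SIZE bytes.  The BLKmode check
   comes first, so the asserting helper is never called for it and the
   answer is simply "does not fit".  */
static bool
toy_fits_in_addr (enum toy_mode mode, unsigned addr_size)
{
  return mode != TOY_BLKmode
         && toy_scalar_int_mode_size (mode) <= addr_size;
}
```

Swapping the two operands of && in toy_fits_in_addr would reintroduce the assertion failure for TOY_BLKmode, which is the toy analogue of the ICE being fixed.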

[PATCH] simplify-rtx: Fix up last argument to simplify_gen_unary [PR113656]

2024-01-31 Thread Jakub Jelinek

Hi!

When simplifying e.g. (float_truncate:SF (float_truncate:DF (reg:XF)))
or (float_truncate:SF (float_extend:XF (reg:DF))) etc. into
(float_truncate:SF (reg:XF)) or (float_truncate:SF (reg:DF)) we call
simplify_gen_unary with incorrect op_mode argument, it should be
the argument's mode, but we call it with the outer mode instead.
As these are all floating point operations, the argument always
has non-VOIDmode and so we can just use that mode (as done in similar
simplifications a few lines later), but neither FLOAT_TRUNCATE nor
FLOAT_EXTEND are operations that should have the same modes of operand
and result.  This bug hasn't been a problem for years because normally
op_mode is used only if the mode of op is VOIDmode, otherwise it is
redundant, but r10-2139 added an assertion in some spots that op_mode
is right even in such cases.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-01-31  Jakub Jelinek  

PR rtl-optimization/113656
* simplify-rtx.cc (simplify_context::simplify_unary_operation_1)
: Fix up last argument to simplify_gen_unary.

* gcc.target/i386/pr113656.c: New test.

--- gcc/simplify-rtx.cc.jj  2024-01-03 11:51:32.828713189 +0100
+++ gcc/simplify-rtx.cc 2024-01-30 19:34:30.516934480 +0100
@@ -1305,7 +1305,7 @@ simplify_context::simplify_unary_operati
   > GET_MODE_UNIT_SIZE (mode)
   ? FLOAT_TRUNCATE : FLOAT_EXTEND,
   mode,
-  XEXP (op, 0), mode);
+  XEXP (op, 0), GET_MODE (XEXP (op, 0)));
 
   /*  (float_truncate (float x)) is (float x)  */
   if ((GET_CODE (op) == FLOAT || GET_CODE (op) == UNSIGNED_FLOAT)
--- gcc/testsuite/gcc.target/i386/pr113656.c.jj 2024-01-30 19:38:29.029608721 +0100
+++ gcc/testsuite/gcc.target/i386/pr113656.c 2024-01-30 19:37:10.519703443 +0100
@@ -0,0 +1,12 @@
+/* PR rtl-optimization/113656 */
+/* { dg-do compile } */
+/* { dg-options "-O3 -frounding-math -funsafe-math-optimizations -mavx512fp16 -mavx512vl" } */
+
+_Float16 a[8];
+
+void
+foo ()
+{
+  for (int i = 0; i < 8; i++)
+a[i] = i - 8.4;
+}

Jakub
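
A toy model of the folding described above may help show why the last argument has to be the operand's mode. Everything here is hypothetical (the enums, the struct and both functions); the enum values are just width ranks, and the assertion is only a stand-in for the kind of consistency check mentioned above, not the actual r10-2139 code.

```c
#include <assert.h>

/* Toy floating-point modes ordered by width (ranks, not real sizes).  */
enum toy_fmode { TOY_HF = 1, TOY_SF, TOY_DF, TOY_XF };
enum toy_code { TOY_FLOAT_TRUNCATE, TOY_FLOAT_EXTEND };

struct toy_unary
{
  enum toy_code code;
  enum toy_fmode mode;     /* Mode of the result.  */
  enum toy_fmode op_mode;  /* Mode of the operand.  */
};

/* Build a conversion.  A truncate must narrow and an extend must widen,
   so passing the result mode as OP_MODE trips the assertion.  */
static struct toy_unary
toy_gen_unary (enum toy_code code, enum toy_fmode mode,
               enum toy_fmode op_mode)
{
  struct toy_unary u = { code, mode, op_mode };
  assert (code == TOY_FLOAT_TRUNCATE ? op_mode > mode : op_mode < mode);
  return u;
}

/* Collapse (float_truncate:MODE (conversion (x:INNER))) into a single
   conversion from INNER to MODE, choosing the code by comparing widths
   and passing INNER, the operand's own mode, as OP_MODE.  */
static struct toy_unary
toy_fold_truncate_of_conversion (enum toy_fmode mode, enum toy_fmode inner)
{
  enum toy_code code;

  assert (inner != mode);
  code = inner > mode ? TOY_FLOAT_TRUNCATE : TOY_FLOAT_EXTEND;
  return toy_gen_unary (code, mode, inner);
}
```

Passing MODE instead of INNER as the last argument of toy_gen_unary would make the operand and result modes equal, which neither a truncate nor an extend can have; that is the toy counterpart of the bug fixed here.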

Re: [PATCH] c++: add deprecation notice for -fconcepts-ts

2024-01-31 Thread Richard Biener

On Wed, Jan 31, 2024 at 12:19 AM Marek Polacek  wrote:
>
> Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
>
> -- >8 --
> We plan to deprecate -fconcepts-ts in GCC 15 and remove the flag_concepts_ts
> code.  This note is an admonishing reminder to convert the Concepts TS
> code to C++20 Concepts.

What does "deprecated in GCC 15" mean?  Given you output the notice with
GCC 14 it would be better to state when it's going to be removed -
it's effectively "deprecated" right now then?  Or will it continue to
"work" forever until it bitrots?

> gcc/c-family/ChangeLog:
>
> * c-opts.cc (c_common_post_options): Add an inform saying that
> -fconcepts-ts will be deprecated in GCC 15.
> ---
>  gcc/c-family/c-opts.cc | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/gcc/c-family/c-opts.cc b/gcc/c-family/c-opts.cc
> index b38a1225ac4..4cb69c6aefc 100644
> --- a/gcc/c-family/c-opts.cc
> +++ b/gcc/c-family/c-opts.cc
> @@ -1139,6 +1139,11 @@ c_common_post_options (const char **pfilename)
>if (cxx_dialect >= cxx20 || flag_concepts_ts)
>  flag_concepts = 1;
>
> +  /* -fconcepts-ts will be deprecated in GCC 15.  */
> +  if (flag_concepts_ts)
> +inform (input_location, "%<-fconcepts-ts%> will be deprecated in GCC 15; "
> +   "please convert your code to C++20 concepts");
> +
>/* -fimmediate-escalation has no effect when immediate functions are not
>   supported.  */
>if (flag_immediate_escalation && cxx_dialect < cxx20)
>
> base-commit: f2061b2a9641c2228d4e2d86f19532ad7e93d627
> --
> 2.43.0
>

Re: [PATCH RFA] asan: poisoning promoted statics [PR113531]

2024-01-31 Thread Richard Biener

On Wed, Jan 31, 2024 at 4:38 AM Jason Merrill  wrote:
>
> Tested x86_64-pc-linux-gnu, OK for trunk?

It's quite a "late" fixup, I suppose you have tried to avoid marking it
during gimplification?  I see we do parts of this during BIND_EXPR
processing which is indeed a bit early but possibly difficult to rectify.

So, OK if you think fixing during gimplification is overly messy.

Richard.

> -- 8< --
>
> Since my r14-1500-g4d935f52b0d5c0 we promote an initializer_list backing
> array to static storage where appropriate, but this happens after we decided
> to add it to asan_poisoned_variables.  As a result we add unpoison/poison
> for it to the gimple.  But then sanopt removes the unpoison.  So the second
> time we call the function and want to load from the array asan still
> considers it poisoned.
>
> A simple fix seems to be to not expand unpoison/poison for such a variable,
> since by that time we know it's static.
>
> PR c++/113531
>
> gcc/ChangeLog:
>
> * asan.cc (asan_expand_mark_ifn): Check TREE_STATIC.
>
> gcc/testsuite/ChangeLog:
>
> * g++.dg/asan/initlist1.C: New test.
> ---
>  gcc/asan.cc   |  8 
>  gcc/testsuite/g++.dg/asan/initlist1.C | 20 
>  2 files changed, 28 insertions(+)
>  create mode 100644 gcc/testsuite/g++.dg/asan/initlist1.C
>
> diff --git a/gcc/asan.cc b/gcc/asan.cc
> index 0fd7dd1f3ed..efecac2ea2b 100644
> --- a/gcc/asan.cc
> +++ b/gcc/asan.cc
> @@ -3762,6 +3762,14 @@ asan_expand_mark_ifn (gimple_stmt_iterator *iter)
>
>gcc_checking_assert (TREE_CODE (decl) == VAR_DECL);
>
> +  if (TREE_STATIC (decl))
> +{
> +  /* Don't poison a variable with static storage; it might have gotten
> +marked before gimplify_init_constructor promoted it to static.  */
> +  gsi_remove (iter, true);
> +  return false;
> +}
> +
>if (hwasan_sanitize_p ())
>  {
>gcc_assert (param_hwasan_instrument_stack);
> diff --git a/gcc/testsuite/g++.dg/asan/initlist1.C 
> b/gcc/testsuite/g++.dg/asan/initlist1.C
> new file mode 100644
> index 000..6cd5b7d3aba
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/asan/initlist1.C
> @@ -0,0 +1,20 @@
> +// PR c++/113531
> +// { dg-do run { target c++11 } }
> +// { dg-additional-options "-fsanitize=address" }
> +
> +#include <initializer_list>
> +
> +void f(int) { }
> +
> +void g()
> +{
> +  for (auto i : { 1, 2, 3 })
> +f (i);
> +  f(42);
> +}
> +
> +int main()
> +{
> +  g();
> +  g();
> +}
>
> base-commit: 209fc1e5f6c67e55e579b69f617b0b678b1bfdf0
> --
> 2.39.3
>

Re: About 31109 - gprofng not built and installed in a combined binutils+gcc build

2024-01-31 Thread Richard Biener

On Wed, Jan 31, 2024 at 4:46 AM Vladimir Mezentsev
 wrote:
>
> Hi,
>
> I asked in https://sourceware.org/bugzilla/show_bug.cgi?id=31109
>  > I prepared a patch for the releases/gcc-13 branch.
>  > Richard Biener  rejected my patch for
> this branch.
>  > Which branch should I use? master, trunk or something else?

toplevel changes are synced between binutils/gcc master branches only

> Do you really need gprofng in the gcc repo ?
> if yes:
>the fix is trivial.
>I did for the releases/gcc-13 branch:
>   git cherry-pick 24552056fd5fc677c0d032f54a5cad1c4303d312
>Can anyone do the same for the correct branch.
>I have no write permissions for gcc.gnu.org/git/gcc.git
>
>I maintain binutils-gdb/gprofng. Who will maintain gcc/gprofng ?

It's maintained in the binutils-gdb repository.  Shared files are synced
as said above.

I've never seen us care for release branches in the GCC repository,
combined builds are not really "supported" (or even tested regularly).

> If no:
>   may I close 31109 ?

So yes, I'd say that's an INVALID bug since it doesn't use master
branches on both sides.

Richard.

> Thank you,
> -Vladimir
>
> .
>

Re: [PATCH RFA] asan: poisoning promoted statics [PR113531]

2024-01-31 Thread Jakub Jelinek

On Wed, Jan 31, 2024 at 09:51:05AM +0100, Richard Biener wrote:
> On Wed, Jan 31, 2024 at 4:38 AM Jason Merrill  wrote:
> >
> > Tested x86_64-pc-linux-gnu, OK for trunk?
> 
> It's a quite "late" fixup, I suppose you have tried to avoid marking it
> during gimplification?  I see we do parts of this during BIND_EXPR
> processing which is indeed a bit early but possibly difficult to rectify.

Indeed.  But what we could do is try to fold_stmt those .ASAN_MARK calls
away earlier (but sure, the asan.cc change would be still required because
that would be just an optimization).  But that can be handled incrementally,
so I think the patch is ok as is (and I can handle the incremental part
myself).

Note, the handling of global vars in asan is done only at the end
(asan_finish_file), so I think such late TREE_STATIC marked vars will still
be correctly treated as global vars if varpool knows about them (and if
varpool doesn't, then lots of other things would break).

Jakub

Re: [PATCH] lower-bitint: Fix up VIEW_CONVERT_EXPR handling in handle_operand_addr [PR113639]

2024-01-31 Thread Richard Biener

On Wed, 31 Jan 2024, Jakub Jelinek wrote:

> Hi!
> 
> Yet another spot where we need to treat VIEW_CONVERT_EXPR differently
> from NOP_EXPR/CONVERT_EXPR.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK

> 2024-01-31  Jakub Jelinek  
> 
>   PR tree-optimization/113639
>   * gimple-lower-bitint.cc (bitint_large_huge::handle_operand_addr):
>   For VIEW_CONVERT_EXPR set rhs1 to its operand.
> 
>   * gcc.dg/bitint-79.c: New test.
> 
> --- gcc/gimple-lower-bitint.cc.jj 2024-01-27 13:06:49.183671155 +0100
> +++ gcc/gimple-lower-bitint.cc 2024-01-30 17:06:56.829144801 +0100
> @@ -2159,6 +2159,8 @@ bitint_large_huge::handle_operand_addr (
> gcc_assert (gimple_assign_cast_p (g));
> tree rhs1 = gimple_assign_rhs1 (g);
> bitint_prec_kind kind = bitint_prec_small;
> +   if (TREE_CODE (rhs1) == VIEW_CONVERT_EXPR)
> + rhs1 = TREE_OPERAND (rhs1, 0);
> gcc_assert (INTEGRAL_TYPE_P (TREE_TYPE (rhs1)));
> if (TREE_CODE (TREE_TYPE (rhs1)) == BITINT_TYPE)
>   kind = bitint_precision_kind (TREE_TYPE (rhs1));
> --- gcc/testsuite/gcc.dg/bitint-79.c.jj   2024-01-30 17:18:50.711135054 +0100
> +++ gcc/testsuite/gcc.dg/bitint-79.c  2024-01-30 17:18:22.986524397 +0100
> @@ -0,0 +1,16 @@
> +/* PR tree-optimization/113639 */
> +/* { dg-do compile { target bitint } } */
> +/* { dg-options "-O2 -std=c23" } */
> +
> +int j, k;
> +#if __BITINT_MAXWIDTH__ >= 162
> +struct S { _BitInt(162) n; };
> +void bar (_BitInt(162) x);
> +
> +void
> +foo (struct S s)
> +{
> +  bar (s.n * j);
> +  (void) (s.n * k);
> +}
> +#endif
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Re: [PATCH] dwarf2out: Fix ICE on large _BitInt in loc_list_from_tree_1 [PR113637]

2024-01-31 Thread Richard Biener

On Wed, 31 Jan 2024, Jakub Jelinek wrote:

> Hi!
> 
> This spot uses SCALAR_INT_TYPE_MODE which obviously ICEs for large/huge
> BITINT_TYPE types which have BLKmode.  But such large BITINT_TYPEs certainly
> don't fit into DWARF2_ADDR_SIZE either, so we can just assume it would be
> false if type has BLKmode.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK

> 2024-01-31  Jakub Jelinek  
> 
>   PR debug/113637
>   * dwarf2out.cc (loc_list_from_tree_1): Assume integral types
>   with BLKmode are larger than DWARF2_ADDR_SIZE.
> 
>   * gcc.dg/bitint-80.c: New test.
> 
> --- gcc/dwarf2out.cc.jj   2024-01-24 13:11:21.132468150 +0100
> +++ gcc/dwarf2out.cc  2024-01-30 17:23:41.249054946 +0100
> @@ -19027,6 +19027,7 @@ loc_list_from_tree_1 (tree loc, int want
>   && ! DECL_IGNORED_P (loc)
>   && (INTEGRAL_TYPE_P (TREE_TYPE (loc))
>   || POINTER_TYPE_P (TREE_TYPE (loc)))
> + && TYPE_MODE (TREE_TYPE (loc)) != BLKmode
>   && (GET_MODE_SIZE (SCALAR_INT_TYPE_MODE (TREE_TYPE (loc)))
>   <= DWARF2_ADDR_SIZE))
> {
> --- gcc/testsuite/gcc.dg/bitint-80.c.jj   2024-01-30 17:30:02.843696120 +0100
> +++ gcc/testsuite/gcc.dg/bitint-80.c  2024-01-30 17:32:33.301583203 +0100
> @@ -0,0 +1,15 @@
> +/* PR debug/113637 */
> +/* { dg-do compile { target bitint } } */
> +/* { dg-options "-g -std=c23" } */
> +
> +#if __BITINT_MAXWIDTH__ >= 639
> +typedef _BitInt(639) B;
> +#else
> +typedef _BitInt(63) B;
> +#endif
> +
> +void
> +foo (B n)
> +{
> +  extern void bar (int [][n]);
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Re: [PATCH] simplify-rtx: Fix up last argument to simplify_gen_unary [PR113656]

2024-01-31 Thread Richard Biener

On Wed, 31 Jan 2024, Jakub Jelinek wrote:

> Hi!
> 
> When simplifying e.g. (float_truncate:SF (float_truncate:DF (reg:XF))
> or (float_truncate:SF (float_extend:XF (reg:DF)) etc. into
> (float_truncate:SF (reg:XF)) or (float_truncate:SF (reg:DF)) we call
> simplify_gen_unary with incorrect op_mode argument, it should be
> the argument's mode, but we call it with the outer mode instead.
> As these are all floating point operations, the argument always
> has non-VOIDmode and so we can just use that mode (as done in similar
> simplifications a few lines later), but neither FLOAT_TRUNCATE nor
> FLOAT_EXTEND are operations that should have the same modes of operand
> and result.  This bug hasn't been a problem for years because normally
> op_mode is used only if the mode of op is VOIDmode, otherwise it is
> redundant, but r10-2139 added an assertion in some spots that op_mode
> is right even in such cases.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

> 2024-01-31  Jakub Jelinek  
> 
>   PR rtl-optimization/113656
>   * simplify-rtx.cc (simplify_context::simplify_unary_operation_1)
>   : Fix up last argument to simplify_gen_unary.
> 
>   * gcc.target/i386/pr113656.c: New test.
> 
> --- gcc/simplify-rtx.cc.jj2024-01-03 11:51:32.828713189 +0100
> +++ gcc/simplify-rtx.cc   2024-01-30 19:34:30.516934480 +0100
> @@ -1305,7 +1305,7 @@ simplify_context::simplify_unary_operati
>  > GET_MODE_UNIT_SIZE (mode)
>  ? FLOAT_TRUNCATE : FLOAT_EXTEND,
>  mode,
> -XEXP (op, 0), mode);
> +XEXP (op, 0), GET_MODE (XEXP (op, 0)));
>  
>/*  (float_truncate (float x)) is (float x)  */
>if ((GET_CODE (op) == FLOAT || GET_CODE (op) == UNSIGNED_FLOAT)
> --- gcc/testsuite/gcc.target/i386/pr113656.c.jj   2024-01-30 19:38:29.029608721 +0100
> +++ gcc/testsuite/gcc.target/i386/pr113656.c  2024-01-30 19:37:10.519703443 +0100
> @@ -0,0 +1,12 @@
> +/* PR rtl-optimization/113656 */
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -frounding-math -funsafe-math-optimizations -mavx512fp16 -mavx512vl" } */
> +
> +_Float16 a[8];
> +
> +void
> +foo ()
> +{
> +  for (int i = 0; i < 8; i++)
> +a[i] = i - 8.4;
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

[PATCH] tree-optimization/113670 - gather/scatter to/from hard registers

2024-01-31 Thread Richard Biener

The following makes sure we're not taking the address of hard
registers when vectorizing apparent gathers or scatters to/from
them.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/113670
* tree-vect-data-refs.cc (vect_check_gather_scatter):
Make sure we can take the address of the reference base.

* gcc.target/i386/pr113670.c: New testcase.
---
 gcc/testsuite/gcc.target/i386/pr113670.c | 16 
 gcc/tree-vect-data-refs.cc   |  5 +
 2 files changed, 21 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr113670.c

diff --git a/gcc/testsuite/gcc.target/i386/pr113670.c b/gcc/testsuite/gcc.target/i386/pr113670.c
new file mode 100644
index 000..8b9d3744fe2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr113670.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-msse2 -O2 -fno-vect-cost-model" } */
+
+typedef float __attribute__ ((vector_size (16))) vec;
+typedef int __attribute__ ((vector_size (16))) ivec;
+ivec x;
+
+void
+test (void)
+{
+  register vec a asm("xmm3"), b asm("xmm4");
+  register ivec c asm("xmm5");
+  for (int i = 0; i < 4; i++)
+c[i] = a[i] < b[i] ? -1 : 1;
+  x = c;
+}
diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
index f592aeb8028..e6a3035064b 100644
--- a/gcc/tree-vect-data-refs.cc
+++ b/gcc/tree-vect-data-refs.cc
@@ -4325,6 +4325,11 @@ vect_check_gather_scatter (stmt_vec_info stmt_info, loop_vec_info loop_vinfo,
   if (!multiple_p (pbitpos, BITS_PER_UNIT))
 return false;
 
+  /* We need to be able to form an address to the base which for example
+ isn't possible for hard registers.  */
+  if (may_be_nonaddressable_p (base))
+return false;
+
   poly_int64 pbytepos = exact_div (pbitpos, BITS_PER_UNIT);
 
   if (TREE_CODE (base) == MEM_REF)
-- 
2.35.3

Re: [PATCH] aarch64: Avoid out-of-range shrink-wrapped saves [PR111677]

2024-01-31 Thread Richard Sandiford

Alex Coplan  writes:
> Hi,
>
> The PR shows us ICEing due to an unrecognizable TFmode save emitted by
> aarch64_process_components.  The problem is that for T{I,F,D}mode we
> conservatively require mems to be in range for x-register ldp/stp.  That
> is because (at least for TImode) it can be allocated to both GPRs and
> FPRs, and in the GPR case that is an x-reg ldp/stp, and the FPR case is
> a q-register load/store.
>
> As Richard pointed out in the PR, aarch64_get_separate_components
> already checks that the offsets are suitable for a single load, so we
> just need to choose a mode in aarch64_reg_save_mode that gives the full
> q-register range.  In this patch, we choose V16QImode as an alternative
> 16-byte "bag-of-bits" mode that doesn't have the artificial range
> restrictions imposed on T{I,F,D}mode.
>
> For T{F,D}mode in GCC 15 I think we could consider relaxing the
> restriction imposed in aarch64_classify_address, as AFAIK T{F,D}mode can
> only be allocated to FPRs (unlike TImode).  But such a change seems too
> invasive to consider for GCC 14 at this stage (let alone backports).

GPRs can hold all three, due to the way aarch64_hard_regno_mode_ok
is defined.  (They can also hold individual Advanced SIMD vectors.)

But the ABI says that TFmode is passed in FPRs, so I agree that it
seems better to optimise for the FPR range.  Same for TDmode.
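
To make the range difference discussed above concrete, here is a rough sketch
of the two offset windows.  It assumes the usual A64 immediate encodings (a
signed 7-bit scaled immediate for ldp/stp, an unsigned 12-bit scaled or signed
9-bit unscaled immediate for a single ldr/str); the helper names are made up
for illustration and this is not the code in aarch64_classify_address, which
may combine the checks differently.

```cpp
#include <cassert>
#include <cstdint>

// True if OFF is a multiple of SIZE and OFF / SIZE fits a signed 7-bit field
// (the immediate form assumed here for ldp/stp).
constexpr bool fits_signed7_scaled(std::int64_t off, std::int64_t size)
{
  return off % size == 0 && off / size >= -64 && off / size <= 63;
}

// True if OFF is a non-negative multiple of SIZE and OFF / SIZE fits an
// unsigned 12-bit field (the immediate form assumed here for ldr/str).
constexpr bool fits_unsigned12_scaled(std::int64_t off, std::int64_t size)
{
  return off >= 0 && off % size == 0 && off / size <= 4095;
}

// True if OFF fits a signed 9-bit unscaled field (ldur/stur style).
constexpr bool fits_signed9_unscaled(std::int64_t off)
{
  return off >= -256 && off <= 255;
}

// Hypothetical helper: window for an x-register pair (8-byte elements).
constexpr bool in_x_pair_range(std::int64_t off)
{
  return fits_signed7_scaled(off, 8);
}

// Hypothetical helper: window for a single q-register access (16 bytes).
constexpr bool in_q_single_range(std::int64_t off)
{
  return fits_unsigned12_scaled(off, 16) || fits_signed9_unscaled(off);
}

// Run-time self-check of the property the patch relies on: some offsets are
// fine for one q-register save but outside the x-register pair window.
inline void check_ranges()
{
  assert(in_q_single_range(1024));
  assert(!in_x_pair_range(1024));
}
```

Under these assumptions an offset such as 1024 is acceptable for a single
q-register save but would be rejected by a check tied to the x-register pair
window, which is the kind of mismatch the mode change is meant to avoid.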

> Fortunately the new flexible load/store pair patterns in GCC 14 allow
> this mode change to work without further changes.  The backports are
> more involved as we need to adjust the load/store pair handling to cater
> for V16QImode in a few places.
>
> Note that for the testcase we are relying on the torture options to add
> -funroll-loops at -O3 which is necessary to trigger the ICE on trunk
> (but not on the 13 branch).
>
> Bootstrapped/regtested on aarch64-linux-gnu, OK for trunk?
>
> Thanks,
> Alex
>
> gcc/ChangeLog:
>
>   PR target/111677
>   * config/aarch64/aarch64.cc (aarch64_reg_save_mode): Use
>   V16QImode for the full 16-byte FPR saves in the vector PCS case.
>
> gcc/testsuite/ChangeLog:
>
>   PR target/111677
>   * gcc.target/aarch64/torture/pr111677.c: New test.

OK, thanks.

Richard

Re: [PATCH][GCC 13] aarch64: Avoid out-of-range shrink-wrapped saves [PR111677]

2024-01-31 Thread Richard Sandiford

Alex Coplan  writes:
> Bootstrapped/regtested on aarch64-linux-gnu, OK for the 13 branch after
> a week of the trunk fix being in?  OK for the other active branches if
> the same changes test cleanly there?
>
> GCC 14 patch for reference:
> https://gcc.gnu.org/pipermail/gcc-patches/2024-January/61.html
>
> Thanks,
> Alex

OK, thanks.  It might be worth committing sooner than that, since the patch
includes changes that won't be tested on trunk.  It's your call though.

Richard

>
> -- >8 --
>
> The PR shows us ICEing due to an unrecognizable TFmode save emitted by
> aarch64_process_components.  The problem is that for T{I,F,D}mode we
> conservatively require mems to be in range for x-register ldp/stp.  That
> is because (at least for TImode) it can be allocated to both GPRs and
> FPRs, and in the GPR case that is an x-reg ldp/stp, and the FPR case is
> a q-register load/store.
>
> As Richard pointed out in the PR, aarch64_get_separate_components
> already checks that the offsets are suitable for a single load, so we
> just need to choose a mode in aarch64_reg_save_mode that gives the full
> q-register range.  In this patch, we choose V16QImode as an alternative
> 16-byte "bag-of-bits" mode that doesn't have the artificial range
> restrictions imposed on T{I,F,D}mode.
>
> For T{F,D}mode in GCC 15 I think we could consider relaxing the
> restriction imposed in aarch64_classify_address, as AFAIK T{F,D}mode can
> only be allocated to FPRs (unlike TImode).  But such a change seems too
> invasive to consider for GCC 14 at this stage (let alone backports).
>
> Unlike for GCC 14 we need additional handling in the load/store pair
> code as various cases are not expecting to see V16QImode (particularly
> the writeback patterns, but also aarch64_gen_load_pair).
>
> gcc/ChangeLog:
>
>   PR target/111677
>   * config/aarch64/aarch64.cc (aarch64_reg_save_mode): Use
>   V16QImode for the full 16-byte FPR saves in the vector PCS case.
>   (aarch64_gen_storewb_pair): Handle V16QImode.
>   (aarch64_gen_loadwb_pair): Likewise.
>   (aarch64_gen_load_pair): Likewise.
>   * config/aarch64/aarch64.md (loadwb_pair_):
>   Rename to ...
>   (loadwb_pair_): ... this, extending to
>   V16QImode.
>   (storewb_pair_): Rename to ...
>   (storewb_pair_): ... this, extending to
>   V16QImode.
>   * config/aarch64/iterators.md (TX_V16QI): New.
>
> gcc/testsuite/ChangeLog:
>
>   PR target/111677
>   * gcc.target/aarch64/torture/pr111677.c: New test.
>
> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> index 02515d4683a..f546c48ae2d 100644
> --- a/gcc/config/aarch64/aarch64.cc
> +++ b/gcc/config/aarch64/aarch64.cc
> @@ -4074,7 +4074,7 @@ aarch64_reg_save_mode (unsigned int regno)
>case ARM_PCS_SIMD:
>   /* The vector PCS saves the low 128 bits (which is the full
>  register on non-SVE targets).  */
> - return TFmode;
> + return V16QImode;
>  
>case ARM_PCS_SVE:
>   /* Use vectors of DImode for registers that need frame
> @@ -8863,6 +8863,10 @@ aarch64_gen_storewb_pair (machine_mode mode, rtx base, rtx reg, rtx reg2,
>return gen_storewb_pairtf_di (base, base, reg, reg2,
>   GEN_INT (-adjustment),
>   GEN_INT (UNITS_PER_VREG - adjustment));
> +case E_V16QImode:
> +  return gen_storewb_pairv16qi_di (base, base, reg, reg2,
> +GEN_INT (-adjustment),
> +GEN_INT (UNITS_PER_VREG - adjustment));
>  default:
>gcc_unreachable ();
>  }
> @@ -8908,6 +8912,10 @@ aarch64_gen_loadwb_pair (machine_mode mode, rtx base, rtx reg, rtx reg2,
>  case E_TFmode:
>return gen_loadwb_pairtf_di (base, base, reg, reg2, GEN_INT 
> (adjustment),
>  GEN_INT (UNITS_PER_VREG));
> +case E_V16QImode:
> +  return gen_loadwb_pairv16qi_di (base, base, reg, reg2,
> +   GEN_INT (adjustment),
> +   GEN_INT (UNITS_PER_VREG));
>  default:
>gcc_unreachable ();
>  }
> @@ -8991,6 +8999,9 @@ aarch64_gen_load_pair (machine_mode mode, rtx reg1, rtx mem1, rtx reg2,
>  case E_V4SImode:
>return gen_load_pairv4siv4si (reg1, mem1, reg2, mem2);
>  
> +case E_V16QImode:
> +  return gen_load_pairv16qiv16qi (reg1, mem1, reg2, mem2);
> +
>  default:
>gcc_unreachable ();
>  }
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 50239d72fc0..922cc987595 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -1896,17 +1896,18 @@ (define_insn "loadwb_pair_"
>[(set_attr "type" "neon_load1_2reg")]
>  )
>  
> -(define_insn "loadwb_pair_"
> +(define_insn "loadwb_pair_"
>[(parallel
>  [(set (match_operand:P 0 "register_operand" "=k")
> -  (plus:P (match_op

Re: [PATCH] libstdc++: Enable std::text_encoding for darwin and FreeBSD

2024-01-31 Thread Iain Sandoe

Hi Jonathan,

> On 30 Jan 2024, at 15:02, Jonathan Wakely  wrote:
> 
> This should fix the std/text_encoding/* FAILs that Iain sees on darwin.
> I assume it will make it work for FreeBSD too.
> 
> I won't push this until I hear it works for at least one of those.

It works on x86_64-darwin{19,21,23} and on a cross to powerpc-darwin9.
The header is present on all versions we currently support (although I did not yet
test more widely than noted above).

As discussed on IRC, Darwin is qualified to Posix 2003 (and does not, generally,
have all the optional parts), so we do not expect the symbols to be present in
locale.h (even with some _XOPEN_SOURCE= value).
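
For readers unfamiliar with the technique the patch relies on, the following
is a minimal sketch of a __has_include guard.  The header names and DEMO_
macros are purely illustrative (one header that should always exist, one that
should never exist); they are not the header the patch actually probes for.

```cpp
#include <cassert>

// __has_include is standard since C++17; older compilers may not define it,
// so the probe itself is guarded.
#if defined(__has_include)
# if __has_include(<cstddef>)
#  include <cstddef>
#  define DEMO_HAVE_CSTDDEF 1
# endif
# if __has_include(<demo_missing_header_xyz.h>)  // hypothetical, never present
#  include <demo_missing_header_xyz.h>
#  define DEMO_HAVE_MISSING 1
# endif
#endif

#ifndef DEMO_HAVE_CSTDDEF
# define DEMO_HAVE_CSTDDEF 0
#endif
#ifndef DEMO_HAVE_MISSING
# define DEMO_HAVE_MISSING 0
#endif

// Report what the preprocessor found, so the result can be checked at run time.
constexpr bool have_cstddef() { return DEMO_HAVE_CSTDDEF != 0; }
constexpr bool have_missing() { return DEMO_HAVE_MISSING != 0; }

inline void check_probe()
{
  assert(have_cstddef());
  assert(!have_missing());
}
```

The point of the guard is that the optional include simply disappears on
targets that lack the header, so the same source builds everywhere.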

thanks for taking care of this,
Iain
> 
> Tested x86_64-linux.
> 
> -- >8 --
> 
> The  header is needed for newlocale and locale_t on these
> targets.
> 
> libstdc++-v3/ChangeLog:
> 
>   * acinclude.m4 (GLIBCXX_CHECK_TEXT_ENCODING): Use  if
>   needed for newlocale.
>   * configure: Regenerate.
>   * src/c++26/text_encoding.cc: Use .
> ---
> libstdc++-v3/acinclude.m4   | 3 +++
> libstdc++-v3/configure  | 3 +++
> libstdc++-v3/src/c++26/text_encoding.cc | 3 +++
> 3 files changed, 9 insertions(+)
> 
> diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
> index f9ba7ef744b..f72bd0f45b8 100644
> --- a/libstdc++-v3/acinclude.m4
> +++ b/libstdc++-v3/acinclude.m4
> @@ -5834,6 +5834,9 @@ AC_LANG_SAVE
>   AC_MSG_CHECKING([whether nl_langinfo_l is defined in ])
>   AC_TRY_COMPILE([
>   #include 
> +  #if __has_include()
> +  # include 
> +  #endif
>   #include 
>   ],[
> locale_t loc = newlocale(LC_ALL_MASK, "", (locale_t)0);
> diff --git a/libstdc++-v3/configure b/libstdc++-v3/configure
> index 65ce679f1bd..f4bc0486768 100755
> --- a/libstdc++-v3/configure
> +++ b/libstdc++-v3/configure
> @@ -54533,6 +54533,9 @@ $as_echo_n "checking whether nl_langinfo_l is defined in ... " >&6;
> /* end confdefs.h.  */
> 
>   #include 
> +  #if __has_include()
> +  # include 
> +  #endif
>   #include 
> 
> int
> diff --git a/libstdc++-v3/src/c++26/text_encoding.cc b/libstdc++-v3/src/c++26/text_encoding.cc
> index 33c6c07820c..b9a50ef1a00 100644
> --- a/libstdc++-v3/src/c++26/text_encoding.cc
> +++ b/libstdc++-v3/src/c++26/text_encoding.cc
> @@ -27,6 +27,9 @@
> 
> #ifdef _GLIBCXX_USE_NL_LANGINFO_L
> #include 
> +#if __has_include()
> +# include 
> +#endif
> #include 
> 
> #if __CHAR_BIT__ == 8
> -- 
> 2.43.0
>

[PATCH] RISC-V: Support scheduling for sifive p600 series

2024-01-31 Thread Monk Chiang

Add sifive p600 series scheduler module. For more information
see https://www.sifive.com/cores/performance-p650-670.
Adding sifive-p650 and sifive-p670 for the -mcpu option will come in separate patches.

gcc/ChangeLog:
* config/riscv/riscv.md: Add "fcvt_i2f", "fcvt_f2i" type
attribute, and include sifive-p600.md.
* config/riscv/generic-ooo.md: Update type attribute.
* config/riscv/sifive-7.md: Update type attribute.
* config/riscv/sifive-p600.md: New file.
* config/riscv/riscv-cores.def (RISCV_TUNE): Add parameter.
* config/riscv/riscv-opts.h (enum riscv_microarchitecture_type):
Add sifive_p600.
* config/riscv/riscv.cc (sifive_p600_tune_info): New.
* config/riscv/riscv.h (TARGET_SFB_ALU): Update.
* doc/invoke.texi (RISC-V Options): Add sifive-p600-series
---
 gcc/config/riscv/generic-ooo.md  |   2 +-
 gcc/config/riscv/generic.md  |   2 +-
 gcc/config/riscv/riscv-cores.def |   1 +
 gcc/config/riscv/riscv-opts.h|   1 +
 gcc/config/riscv/riscv.cc|  17 +++
 gcc/config/riscv/riscv.h |   4 +-
 gcc/config/riscv/riscv.md|  19 ++--
 gcc/config/riscv/sifive-7.md |   2 +-
 gcc/config/riscv/sifive-p600.md  | 174 +++
 gcc/doc/invoke.texi  |   3 +-
 10 files changed, 212 insertions(+), 13 deletions(-)
 create mode 100644 gcc/config/riscv/sifive-p600.md

diff --git a/gcc/config/riscv/generic-ooo.md b/gcc/config/riscv/generic-ooo.md
index 421a7bb929d..a22f8a3e079 100644
--- a/gcc/config/riscv/generic-ooo.md
+++ b/gcc/config/riscv/generic-ooo.md
@@ -127,7 +127,7 @@
 
 (define_insn_reservation "generic_ooo_fcvt" 3
   (and (eq_attr "tune" "generic_ooo")
-   (eq_attr "type" "fcvt"))
+   (eq_attr "type" "fcvt,fcvt_i2f,fcvt_f2i"))
   "generic_ooo_issue,generic_ooo_fxu")
 
 (define_insn_reservation "generic_ooo_fcmp" 2
diff --git a/gcc/config/riscv/generic.md b/gcc/config/riscv/generic.md
index b99ae345bb3..3f0eaa2ea08 100644
--- a/gcc/config/riscv/generic.md
+++ b/gcc/config/riscv/generic.md
@@ -42,7 +42,7 @@
 
 (define_insn_reservation "generic_xfer" 3
   (and (eq_attr "tune" "generic")
-   (eq_attr "type" "mfc,mtc,fcvt,fmove,fcmp"))
+   (eq_attr "type" "mfc,mtc,fcvt,fcvt_i2f,fcvt_f2i,fmove,fcmp"))
   "alu")
 
 (define_insn_reservation "generic_branch" 1
diff --git a/gcc/config/riscv/riscv-cores.def b/gcc/config/riscv/riscv-cores.def
index b30f4dfb08e..a07a79e2cb7 100644
--- a/gcc/config/riscv/riscv-cores.def
+++ b/gcc/config/riscv/riscv-cores.def
@@ -37,6 +37,7 @@ RISCV_TUNE("rocket", generic, rocket_tune_info)
 RISCV_TUNE("sifive-3-series", generic, rocket_tune_info)
 RISCV_TUNE("sifive-5-series", generic, rocket_tune_info)
 RISCV_TUNE("sifive-7-series", sifive_7, sifive_7_tune_info)
+RISCV_TUNE("sifive-p600-series", sifive_p600, sifive_p600_tune_info)
 RISCV_TUNE("thead-c906", generic, thead_c906_tune_info)
 RISCV_TUNE("generic-ooo", generic_ooo, generic_ooo_tune_info)
 RISCV_TUNE("size", generic, optimize_size_tune_info)
diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index 1500f8811ef..25951665b13 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h
@@ -55,6 +55,7 @@ extern enum riscv_isa_spec_class riscv_isa_spec;
 enum riscv_microarchitecture_type {
   generic,
   sifive_7,
+  sifive_p600,
   generic_ooo
 };
 extern enum riscv_microarchitecture_type riscv_microarchitecture;
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 7b6111aa545..92d6fd5cf47 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -447,6 +447,23 @@ static const struct riscv_tune_param sifive_7_tune_info = {
   NULL,/* vector cost */
 };
 
+/* Costs to use when optimizing for Sifive p600 Series.  */
+static const struct riscv_tune_param sifive_p600_tune_info = {
+  {COSTS_N_INSNS (4), COSTS_N_INSNS (4)},  /* fp_add */
+  {COSTS_N_INSNS (4), COSTS_N_INSNS (4)},  /* fp_mul */
+  {COSTS_N_INSNS (20), COSTS_N_INSNS (20)},/* fp_div */
+  {COSTS_N_INSNS (4), COSTS_N_INSNS (4)},  /* int_mul */
+  {COSTS_N_INSNS (6), COSTS_N_INSNS (6)},  /* int_div */
+  4,   /* issue_rate */
+  4,   /* branch_cost */
+  3,   /* memory_cost */
+  4,   /* fmv_cost */
+  true,   /* slow_unaligned_access */
+  false,   /* use_divmod_expansion */
+  RISCV_FUSE_LUI_ADDI | RISCV_FUSE_AUIPC_ADDI,  /* fusible_ops */
+  NULL,/* vector cost */
+};
+
 /* Costs to use when optimizing for T-HEAD c906.  */
 static const struct riscv_tune_param thead_c906_tune_info = {
   {COSTS_N_INSNS (4), COSTS_N_INSNS (5)}, /* fp_add */
diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv

[committed] libstdc++: Fix -Wshift-count-overflow warning in std::bitset

2024-01-31 Thread Jonathan Wakely

Tested x86_64-linux and aarch64-linux. Pushed to trunk.

-- >8 --

This shift only happens if the unsigned long long type is wider than
unsigned long but the compiler warns when it sees the shift, without
caring if it's reachable.

Use the preprocessor to compare the sizes and just reuse _M_to_ulong()
if sizeof(long) == sizeof(long long).
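
As a rough illustration of why moving the decision into the preprocessor
helps, here is a small stand-alone sketch.  It is not the libstdc++ code:
words_to_ullong and bits_per_word are made-up names, and it compares
ULLONG_MAX with ULONG_MAX instead of the __SIZEOF_*__ macros used in the
patch, simply to stay portable.  It assumes C++17 for "if constexpr".

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <stdexcept>

// Number of bits in one storage word (hypothetical stand-in for
// _GLIBCXX_BITSET_BITS_PER_WORD).
constexpr std::size_t bits_per_word = CHAR_BIT * sizeof(unsigned long);

// Combine an array of words (least significant first) into one value,
// throwing if any bit would not fit.
template <std::size_t Nw>
unsigned long long words_to_ullong(const unsigned long (&w)[Nw])
{
#if ULLONG_MAX == ULONG_MAX
  // Same width: only word 0 can contribute, and no shift is ever compiled,
  // so there is nothing for -Wshift-count-overflow to complain about.
  for (std::size_t i = 1; i < Nw; ++i)
    if (w[i] != 0)
      throw std::overflow_error("words_to_ullong");
  return w[0];
#else
  // unsigned long long is wider: words 0 and 1 both fit.
  for (std::size_t i = 2; i < Nw; ++i)
    if (w[i] != 0)
      throw std::overflow_error("words_to_ullong");
  unsigned long long result = w[0];
  if constexpr (Nw > 1)
    result += static_cast<unsigned long long>(w[1]) << bits_per_word;
  return result;
#endif
}
```

With a run-time "if (__dw)" the shift is still parsed and diagnosed even when
it can never execute; with the #if the compiler never sees it on targets
where the two types have the same width.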

libstdc++-v3/ChangeLog:

* include/std/bitset (_Base_bitset::_M_do_to_ullong): Avoid
-Wshift-count-overflow warning.
---
 libstdc++-v3/include/std/bitset | 13 +++--
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/libstdc++-v3/include/std/bitset b/libstdc++-v3/include/std/bitset
index c169269698a..3243c649731 100644
--- a/libstdc++-v3/include/std/bitset
+++ b/libstdc++-v3/include/std/bitset
@@ -320,17 +320,18 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 _GLIBCXX14_CONSTEXPR unsigned long long
 _Base_bitset<_Nw>::_M_do_to_ullong() const
 {
-  const bool __dw = sizeof(unsigned long long) > sizeof(unsigned long);
-  for (size_t __i = 1 + __dw; __i < _Nw; ++__i)
+#if __SIZEOF_LONG_LONG__ == __SIZEOF_LONG__
+  return _M_do_to_ulong();
+#else
+  for (size_t __i = 2; __i < _Nw; ++__i)
if (_M_w[__i])
  __throw_overflow_error(__N("_Base_bitset::_M_do_to_ullong"));
 
-  if (__dw)
-   return _M_w[0] + (static_cast(_M_w[1])
+  return _M_w[0] + (static_cast(_M_w[1])
  << _GLIBCXX_BITSET_BITS_PER_WORD);
-  return _M_w[0];
-}
 #endif
+}
+#endif // C++11
 
   template
 _GLIBCXX14_CONSTEXPR size_t
-- 
2.43.0

[committed] libstdc++: Add all supported headers to lists in the manual

2024-01-31 Thread Jonathan Wakely

Another piece of the manual that needs to be kept up to date as we add
features.

Pushed to trunk.

-- >8 --

libstdc++-v3/ChangeLog:

* doc/xml/manual/using.xml: Update tables of supported headers.
* doc/html/*: Regenerate.
---
 libstdc++-v3/doc/html/manual/index.html   |   2 +-
 .../doc/html/manual/using_headers.html|  58 --
 libstdc++-v3/doc/xml/manual/using.xml | 165 --
 3 files changed, 192 insertions(+), 33 deletions(-)

diff --git a/libstdc++-v3/doc/xml/manual/using.xml b/libstdc++-v3/doc/xml/manual/using.xml
index 7276cad0feb..b3b0c368e44 100644
--- a/libstdc++-v3/doc/xml/manual/using.xml
+++ b/libstdc++-v3/doc/xml/manual/using.xml
@@ -468,9 +468,9 @@ Unless specified otherwise below, they are also available in later modes
 
 
 
-shows the C++2a include files.
-These are available in C++2a compilation
-mode, i.e. -std=c++2a or -std=gnu++2a.
+shows the C++20 include files.
+These are available in C++20 compilation
+mode, i.e. -std=c++20 or -std=gnu++20.
 Including these headers in earlier modes will not result in
 compilation errors, but will not define anything.
 
 
 
+barrier
 bit
-version
+charconv
+compare
+concepts
+
+
+coroutine
+format
+latch
+numbers
+ranges
+
+
+semaphore
+source_location
+span
+stop_token
+syncstream
+
+
+version
+
 
-
 
 
 
 
 
-  The following headers have been removed in the C++2a working draft.
+  The following headers have been removed in the C++20 standard.
   They are still available when using this implementation, but in future
-  they might start to produce warnings or errors when included in C++2a mode.
+  they might start to produce warnings or errors when included in C++20 mode.
   Programs that intend to be portable should not include them.
 
 
@@ -529,10 +547,86 @@ Unless specified otherwise below, they are also available in later modes
 
 
 
+
+
+shows the C++23 include files.
+These are available in C++23 compilation
+mode, i.e. -std=c++23 or -std=gnu++23.
+Including these headers in earlier modes will not result in
+compilation errors, but will not define anything.
+
+
+
+
+
+C++ 2023 Library Headers
+
+
+
+
+
+
+
+
+
+expected
+generator
+print
+spanstream
+stacktrace
+
+
+stdatomic.h
+stdfloat
+
+
+
+
+
+
+
+
+
+shows the C++26 include files.
+These are available in C++26 compilation
+mode, i.e. -std=c++26 or -std=gnu++26.
+Including these headers in earlier modes will not result in
+compilation errors, but will not define anything.
+
+
+
+
+
+C++ 2026 Library Headers
+
+
+
+
+
+
+text_encoding
+
+
+
+
+
+
+
 
 ,
 shows the additional include file define by the
-File System Technical Specification, ISO/IEC TS 18822.
+File System Technical Specification, ISO/IEC TS 18822:2015.
 This is available in C++11 and later compilation modes.
 Including this header in earlier modes will not result in
 compilation errors, but will not define anything.
@@ -556,8 +650,11 @@ compilation errors, but will not define anything.
 
 ,
 shows the additional include files define by the C++ Extensions for
-Library Fundamentals Technical Specification, ISO/IEC TS 19568.
-These are available in C++14 and later compilation modes.
+Library Fundamentals Technical Specification, ISO/IEC TS 19568:2015,
+ISO/IEC TS 19568:2017, and ISO/IEC TS 19568:2024.
+These are available in C++14 and later compilation modes, except for
+
+which is available in C++20 and later compilation modes.
 Including these headers in earlier modes will not result in
 compilation errors, but will not define anything.
 
@@ -598,22 +695,58 @@ compilation errors, but will not define anything.
 experimental/random
 experimental/ratio
 experimental/regex
+experimental/scope
 experimental/set
-experimental/source_location
 
 
+experimental/source_location
 experimental/string
 experimental/string_view
 experimental/system_error
 experimental/tuple
-experimental/type_traits
 
 
+experimental/type_traits
 experimental/unordered_map
 experimental/unordered_set
 experimental/utility
 experimental/vector
-
+
+
+
+
+
+
+
+,
+shows the additional include files define by the
+Networking Technical Specification, ISO/IEC TS 19216:2018.
+These are available in C++14 and later compilation modes.
+Including these headers in earlier modes will not result in
+compilation errors, but will not define anything.
+
+
+
+
+Networking TS Headers
+
+
+
+
+
+
+
+
+experimental/buffer
+experimental/executor
+experimental/internet
+experimental/io_context
+
+
+experimental/net
+experimental/netfwd
+experimental/socket
+experimental/timer
 
 
 
-- 
2.43.0

[committed] libstdc++: Add "ASCII" as an alias for std::text_encoding::id::ASCII

2024-01-31 Thread Jonathan Wakely

SG16 (Unicode and Text Study Group) and LWG are overwhelmingly in favour
of adding this alias, so let's not wait for the issue to get voted into
the working draft.

Tested aarch64-linux. Pushed to trunk.

-- >8 --

As noted in LWG 4043, "ASCII" is not an alias for any known registered
character encoding, so std::text_encoding("ASCII").mib() == id::other.
Add the alias "ASCII" to the implementation-defined superset of aliases
for that encoding.
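
A simplified sketch of what the extra alias buys us follows.  The table, the
names (alias_entry, lookup_mib, normalize) and the matching rule are
illustrative only: the real comparison also ignores leading zeros in digit
runs, and the value 1 for "other" is an assumption about the enumerator, not
something taken from this patch.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <string_view>
#include <vector>

// One { mib, alias } initializer, as in the generated data file.
struct alias_entry { int mib; std::string_view name; };

constexpr int mib_other = 1;  // assumed value of the "other" enumerator

// Tiny excerpt of the table, sorted by mib, primary name first.
inline const std::vector<alias_entry>& alias_table()
{
  static const std::vector<alias_entry> table = {
    {3, "US-ASCII"}, {3, "ISO646-US"}, {3, "csASCII"},
    {3, "ASCII"},  // the implementation-defined extra alias
    {4, "ISO_8859-1:1987"}, {4, "latin1"},
    {106, "UTF-8"}, {106, "csUTF8"},
  };
  return table;
}

// Simplified name normalization: keep only alphanumerics, lower-cased.
inline std::string normalize(std::string_view s)
{
  std::string out;
  for (unsigned char c : s)
    if (std::isalnum(c))
      out.push_back(static_cast<char>(std::tolower(c)));
  return out;
}

// Return the mib of the first alias matching NAME, or mib_other if none does.
inline int lookup_mib(std::string_view name)
{
  const std::string key = normalize(name);
  for (const alias_entry& e : alias_table())
    if (normalize(e.name) == key)
      return e.mib;
  return mib_other;
}
```

Without the {3, "ASCII"} row, a lookup of "ascii" in this toy table falls
through to mib_other, which mirrors the behaviour described above.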

libstdc++-v3/ChangeLog:

* include/bits/text_encoding-data.h: Regenerate.
* scripts/gen_text_encoding_data.py: Add extra_aliases dict
containing "ASCII".
* testsuite/std/text_encoding/cons.cc: Check "ascii" is known.

Co-authored-by: Ewan Higgs 
Signed-off-by: Ewan Higgs 
---
 .../include/bits/text_encoding-data.h |  3 ++-
 .../scripts/gen_text_encoding_data.py | 24 ++-
 .../testsuite/std/text_encoding/cons.cc   |  5 
 3 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/include/bits/text_encoding-data.h b/libstdc++-v3/include/bits/text_encoding-data.h
index 7ac2e9dc3d9..5041e738d21 100644
--- a/libstdc++-v3/include/bits/text_encoding-data.h
+++ b/libstdc++-v3/include/bits/text_encoding-data.h
@@ -14,6 +14,7 @@
   {3, "IBM367" },
   {3, "cp367" },
   {3, "csASCII" },
+  {3, "ASCII" }, // libstdc++ extension
   {4, "ISO_8859-1:1987" },
   {4, "iso-ir-100" },
   {4, "ISO_8859-1" },
@@ -417,7 +418,7 @@
   {  104, "csISO2022CN" },
   {  105, "ISO-2022-CN-EXT" },
   {  105, "csISO2022CNEXT" },
-#define _GLIBCXX_TEXT_ENCODING_UTF8_OFFSET 413
+#define _GLIBCXX_TEXT_ENCODING_UTF8_OFFSET 414
   {  106, "UTF-8" },
   {  106, "csUTF8" },
   {  109, "ISO-8859-13" },
diff --git a/libstdc++-v3/scripts/gen_text_encoding_data.py b/libstdc++-v3/scripts/gen_text_encoding_data.py
index 2d6f3e4077a..f0ebb42d8c2 100755
--- a/libstdc++-v3/scripts/gen_text_encoding_data.py
+++ b/libstdc++-v3/scripts/gen_text_encoding_data.py
@@ -36,6 +36,18 @@ print("#ifndef _GLIBCXX_GET_ENCODING_DATA")
 print('# error "This is not a public header, do not include it directly"')
 print("#endif\n")
 
+# We need to generate a list of initializers of the form { mib, alias }, e.g.,
+# { 3, "US-ASCII" },
+# { 3, "ISO646-US" },
+# { 3, "csASCII" },
+# { 4, "ISO_8859-1:1987" },
+# { 4, "latin1" },
+# The initializers must be sorted by the mib value. The first entry for
+# a given mib must be the primary name for the encoding. Any aliases for
+# the encoding come after the primary name.
+# We also define a macro _GLIBCXX_TEXT_ENCODING_UTF8_OFFSET which is the
+# offset into the list of the mib=106, alias="UTF-8" entry. This is used
+# to optimize the common case, so we don't need to search for "UTF-8".
 
 charsets = {}
 with open(sys.argv[1], newline='') as f:
@@ -52,10 +64,15 @@ with open(sys.argv[1], newline='') as f:
 aliases.remove(name)
 charsets[mib] = [name] + aliases
 
-# Remove "NATS-DANO" and "NATS-DANO-ADD"
+# Remove "NATS-DANO" and "NATS-DANO-ADD" as specified by the C++ standard.
 charsets.pop(33, None)
 charsets.pop(34, None)
 
+# This is not an official IANA alias, but we include it in the
+# implementation-defined superset of aliases for US-ASCII.
+# See also LWG 4043.
+extra_aliases = {3: ["ASCII"]}
+
 count = 0
 for mib in sorted(charsets.keys()):
 names = charsets[mib]
@@ -64,6 +81,11 @@ for mib in sorted(charsets.keys()):
 for name in names:
 print('  {{ {:4}, "{}" }},'.format(mib, name))
 count += len(names)
+if mib in extra_aliases:
+names = extra_aliases[mib]
+for name in names:
+print('  {{ {:4}, "{}" }}, // libstdc++ extension'.format(mib, name))
+count += len(names)
 
 #  gives an error if this macro is left defined.
 # Do this last, so that the generated output is not usable unless we reach here.
diff --git a/libstdc++-v3/testsuite/std/text_encoding/cons.cc b/libstdc++-v3/testsuite/std/text_encoding/cons.cc
index b9d93641de4..8fcc2ec8c3b 100644
--- a/libstdc++-v3/testsuite/std/text_encoding/cons.cc
+++ b/libstdc++-v3/testsuite/std/text_encoding/cons.cc
@@ -53,6 +53,11 @@ test_construct_by_name()
   VERIFY( e4.name() == s );
   VERIFY( ! e4.aliases().empty() );
   VERIFY( e4.aliases().front() == "US-ASCII"sv ); // primary name
+
+  s = "ascii";
+  std::text_encoding e5(s);
+  VERIFY( e5.mib() == std::text_encoding::ASCII );
+  VERIFY( e5.name() == s );
 }
 
 constexpr void
-- 
2.43.0

Re: [PATCH] libstdc++: Enable std::text_encoding for darwin and FreeBSD

2024-01-31 Thread Jonathan Wakely

On Wed, 31 Jan 2024 at 09:38, Iain Sandoe wrote:
>
> Hi Jonathan,
>
> > On 30 Jan 2024, at 15:02, Jonathan Wakely  wrote:
> >
> > This should fix the std/text_encoding/* FAILs that Iain sees on darwin.
> > I assume it will make it work for FreeBSD too.
> >
> > I won't push this until I hear it works for at least one of those.
>
> It works on x86_64-darwin{19,21,23} and on a cross to powerpc-darwin9.
> The header is present on all versions we currently support (although I did not yet
> test more widely than noted above).
>
> As discussed on IRC, Darwin is qualified to Posix 2003 (and does not, generally,
> have all the optional parts), so we do not expect the symbols to be present in
> locale.h (even with some _XOPEN_SOURCE= value).


OK, thanks for the checks. I've pushed it now.

Re: [PATCH V2] rs6000: New pass for replacement of adjacent loads fusion (lxv).

2024-01-31 Thread Ajit Agarwal

Hello Alex:

Thanks for your valuable review comments.

I am incorporating the comments and will send the patch with rs6000 and
AARCH64 changes.

Thanks & Regards
Ajit

On 24/01/24 10:13 pm, Alex Coplan wrote:
> Hi Ajit,
> 
> On 21/01/2024 19:57, Ajit Agarwal wrote:
>>
>> Hello All:
>>
>> New pass to replace adjacent memory addresses lxv with lxvp.
>> Added common infrastructure for load store fusion for
>> different targets.
> 
> Thanks for this, it would be nice to see the load/store pair pass
> generalized to multiple targets.
> 
> I assume you are targeting GCC 15 for this, as we are in stage 4 at
> the moment?
> 
>>
>> Common routines are refactored in fusion-common.h.
>>
>> AARCH64 load/store fusion pass is not changed with the 
>> common infrastructure.
> 
> I think any patch to generalize the load/store pair fusion pass should
> update the aarch64 code at the same time to use the generic
> infrastructure, instead of duplicating the code.
> 
> As a general comment, I think we should move as much of the code as
> possible to target-independent code, with only the bits that are truly
> target-specific (e.g. deciding which modes to allow for a load/store
> pair operand) in target code.
> 
> In terms of structuring the interface between generic code and target
> code, I think it would be pragmatic to use a class with (in some cases,
> pure) virtual functions that can be overridden by targets to implement
> any target-specific behaviour.
> 
> IMO the generic class should be implemented in its own .cc instead of
> using a header-only approach.  The target code would then define a
> derived class which overrides the virtual functions (where necessary)
> declared in the generic class, and then instantiate the derived class to
> create a target-customized instance of the pass.
> 
> A more traditional GCC approach would be to use optabs and target hooks
> to customize the behaviour of the pass to handle target-specific
> aspects, but:
>  - Target hooks are quite heavyweight, and we'd potentially have to add
>quite a few hooks just for one pass that (at least initially) will
>only be used by a couple of targets.
>  - Using classes allows both sides to easily maintain their own state
>and share that state where appropriate.
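[Editorial sketch] The class-based split suggested above could look roughly like the following. This is only a minimal illustration under stated assumptions: the names pair_fusion, pair_operand_mode_ok_p, handle_stores_p, candidate_pair_p, rs6000_like_pair_fusion and make_target_pass are hypothetical and do not come from the posted patch or from GCC, and the "adjacent" test is reduced to simple offset arithmetic.

```cpp
#include <cassert>
#include <memory>

// Hypothetical generic interface; every name here is illustrative only.
class pair_fusion
{
public:
  virtual ~pair_fusion () = default;

  // Truly target-specific decision: is an access of MODE_BITS bits an
  // acceptable half of a load/store pair on this target?
  virtual bool pair_operand_mode_ok_p (unsigned mode_bits) const = 0;

  // Optional hook with a default that targets may override.
  virtual bool handle_stores_p () const { return true; }

  // Target-independent driver logic: two accesses are candidates when the
  // target accepts their width and the second starts where the first ends.
  bool candidate_pair_p (long off1, long off2, unsigned mode_bits) const
  {
    assert (mode_bits % 8 == 0);
    if (!pair_operand_mode_ok_p (mode_bits))
      return false;
    return off2 - off1 == static_cast<long> (mode_bits / 8);
  }
};

// Hypothetical target customisation, loosely modelled on the lxv/lxvp case:
// only 128-bit accesses are paired, and stores are not handled yet.
class rs6000_like_pair_fusion : public pair_fusion
{
public:
  bool pair_operand_mode_ok_p (unsigned mode_bits) const override
  {
    return mode_bits == 128;
  }
  bool handle_stores_p () const override { return false; }
};

// The target instantiates its derived class to obtain a customised pass.
std::unique_ptr<pair_fusion> make_target_pass ()
{
  return std::make_unique<rs6000_like_pair_fusion> ();
}
```

The generic class keeps the pairing logic (and could keep its own state), while the derived class supplies only the decisions that differ per target.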
> 
> Nit on naming: I understand you want to move away from ldp_fusion, but
> how about pair_fusion or mem_pair_fusion instead of just "fusion" as a
> base name?  IMO just "fusion" isn't very clear as to what the pass is
> trying to achieve.
> 
> In general the code could do with a lot more commentary to explain the
> rationale for various things / explain the high-level intent of the
> code.
> 
> Unfortunately I'm not familiar with the DF framework (I've only really
> worked with RTL-SSA for the aarch64 pass), so I haven't commented on the
> use of that framework, but it would be nice if what you're trying to do
> could be done using RTL-SSA instead of using DF directly.
> 
> Hopefully Richard S can chime in on those aspects.
> 
> My main concerns with the patch at the moment (apart from the code
> duplication) are that it looks like:
> 
>  - The patch removes alias analysis from try_fuse_pair, which is unsafe.
>  - The patch tries to make its own RTL changes inside
>rs6000_gen_load_pair, but it should let fuse_pair make those changes
>using RTL-SSA instead.
> 
> I've left some more specific (but still mostly high-level) comments below.
> 
>>
>> For AARCH64 architectures just include "fusion-common.h"
>> and target dependent code can be added to that.
>>
>>
>> Alex/Richard:
>>
>> If you would like me to add this for AARCH64, I can do that.
>>
>> If you would prefer to do that yourselves, that is fine with me.
>>
>> Bootstrapped and regtested with powerpc64-linux-gnu.
>>
>> Improvement in performance is seen with the SPEC 2017 FP benchmarks.
>>
>> Thanks & Regards
>> Ajit
>>
>> rs6000: New  pass for replacement of adjacent lxv with lxvp.
> 
> Are you looking to handle stores eventually, out of interest?  Looking
> at rs6000-vecload-opt.cc:fusion_bb it looks like you're just handling
> loads at the moment.
> 
>>
>> New pass to replace adjacent memory addresses lxv with lxvp.
>> Added common infrastructure for load store fusion for
>> different targets.
>>
>> Common routines are refactored in fusion-common.h.
> 
> I've just done a very quick scan through this file as it mostly just
> looks to be identical to existing code in aarch64-ldp-fusion.cc.
> 
>>
>> 2024-01-21  Ajit Kumar Agarwal  
>>
>> gcc/ChangeLog:
>>
>>  * config/rs6000/rs6000-passes.def: New vecload pass
>>  before pass_early_remat.
>>  * config/rs6000/rs6000-vecload-opt.cc: Add new pass.
>>  * config.gcc: Add new executable.
>>  * config/rs6000/rs6000-protos.h: Add new prototype for vecload
>>  pass.
>>  * config/rs6000/rs6000.cc: Add new prototype for vecload pass.
>>  * config/rs6000/t-rs6000: Add new rule.
>>  * fusion-common.h: Add common infrastructure for load store
>>  fusion t

GCN, RDNA 3: Adjust 'sync_compare_and_swap_lds_insn'

2024-01-31 Thread Thomas Schwinge

Hi!

OK to push "GCN, RDNA 3: Adjust 'sync_compare_and_swap_lds_insn'",
see attached?

In pre-RDNA 3 ISA manuals, there are notes for 'DS_CMPST_[...]', like:

Caution, the order of src and cmp are the *opposite* of the 
BUFFER_ATOMIC_CMPSWAP opcode.

..., and conversely in the RDNA 3 ISA manual, for 'DS_CMPSTORE_[...]':

In this architecture the order of src and cmp agree with the 
BUFFER_ATOMIC_CMPSWAP opcode.

Is my understanding correct, that this isn't something we have to worry
about at the GCC machine description level; that's resolved at the
assembler level?


Grüße
 Thomas


From df6e031bf4b46d9e5b2de117fecd66b8b9b6dd20 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Wed, 31 Jan 2024 10:19:00 +0100
Subject: [PATCH] GCN, RDNA 3: Adjust 'sync_compare_and_swap_lds_insn'

For OpenACC/GCN '-march=gfx1100', a lot of test cases FAIL:

/tmp/ccGfLJ8a.mkoffload.2.s:406:2: error: instruction not supported on this GPU
ds_cmpst_rtn_b32 v0, v0, v4, v3
^

Apparently, in RDNA 3, 'ds_cmpst_[...]' has been replaced by
'ds_cmpstore_[...]'.

	gcc/
	* config/gcn/gcn.md (sync_compare_and_swap_lds_insn)
	[TARGET_RDNA3]: Adjust.
---
 gcc/config/gcn/gcn.md | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/config/gcn/gcn.md b/gcc/config/gcn/gcn.md
index 8abaef3bbdec..bbb75704140b 100644
--- a/gcc/config/gcn/gcn.md
+++ b/gcc/config/gcn/gcn.md
@@ -2095,7 +2095,12 @@
 	   (match_operand:SIDI 3 "register_operand" "  v")]
 	  UNSPECV_ATOMIC))]
   ""
-  "ds_cmpst_rtn_b %0, %1, %2, %3\;s_waitcnt\tlgkmcnt(0)"
+  {
+if (TARGET_RDNA3)
+  return "ds_cmpstore_rtn_b %0, %1, %2, %3\;s_waitcnt\tlgkmcnt(0)";
+else
+  return "ds_cmpst_rtn_b %0, %1, %2, %3\;s_waitcnt\tlgkmcnt(0)";
+  }
   [(set_attr "type" "ds")
(set_attr "length" "12")])
 
-- 
2.43.0

[PATCH] gimple-fold: Remove .ASAN_MARK calls on TREE_STATIC variables [PR113531]

2024-01-31 Thread Jakub Jelinek

On Wed, Jan 31, 2024 at 10:07:28AM +0100, Jakub Jelinek wrote:
> Indeed.  But what we could do is try to fold_stmt those .ASAN_MARK calls
> away earlier (but sure, the asan.cc change would be still required because
> that would be just an optimization).  But that can be handled incrementally,
> so I think the patch is ok as is (and I can handle the incremental part
> myself).

Like this, so far just tested on the testcase.  Ok for trunk if it passes
bootstrap/regtest on top of Jason's patch?

2024-01-31  Jakub Jelinek  

PR c++/113531
* gimple-fold.cc (gimple_fold_call): Remove .ASAN_MARK calls
on variables which were promoted to TREE_STATIC.

--- gcc/gimple-fold.cc.jj   2024-01-03 11:51:27.236790799 +0100
+++ gcc/gimple-fold.cc  2024-01-31 12:09:14.853348505 +0100
@@ -5722,6 +5722,21 @@ gimple_fold_call (gimple_stmt_iterator *
  }
  }
  break;
+case IFN_ASAN_MARK:
+  {
+tree base = gimple_call_arg (stmt, 1);
+gcc_checking_assert (TREE_CODE (base) == ADDR_EXPR);
+tree decl = TREE_OPERAND (base, 0);
+if (VAR_P (decl) && TREE_STATIC (decl))
+ {
+   /* Don't poison a variable with static storage; it might have
+  gotten marked before gimplify_init_constructor promoted it
+  to static.  */
+   replace_call_with_value (gsi, NULL_TREE);
+   return true;
+ }
+  }
+ break;
case IFN_GOACC_DIM_SIZE:
case IFN_GOACC_DIM_POS:
  result = fold_internal_goacc_dim (stmt);


Jakub

Re: GCN, RDNA 3: Adjust 'sync_compare_and_swap_lds_insn'

2024-01-31 Thread Andrew Stubbs


On 31/01/2024 10:36, Thomas Schwinge wrote:

Hi!

OK to push "GCN, RDNA 3: Adjust 'sync_compare_and_swap_lds_insn'",
see attached?

In pre-RDNA 3 ISA manuals, there are notes for 'DS_CMPST_[...]', like:

 Caution, the order of src and cmp are the *opposite* of the 
BUFFER_ATOMIC_CMPSWAP opcode.

..., and conversely in the RDNA 3 ISA manual, for 'DS_CMPSTORE_[...]':

 In this architecture the order of src and cmp agree with the 
BUFFER_ATOMIC_CMPSWAP opcode.

Is my understanding correct, that this isn't something we have to worry
about at the GCC machine description level; that's resolved at the
assembler level?


Right, the IR uses GCC's operand order and has nothing to do with the 
assembler syntax; the output template does the mapping.



--- a/gcc/config/gcn/gcn.md
+++ b/gcc/config/gcn/gcn.md
@@ -2095,7 +2095,12 @@
   (match_operand:SIDI 3 "register_operand" "  v")]
  UNSPECV_ATOMIC))]
   ""
-  "ds_cmpst_rtn_b %0, %1, %2, %3\;s_waitcnt\tlgkmcnt(0)"
+  {
+if (TARGET_RDNA3)
+  return "ds_cmpstore_rtn_b %0, %1, %2, %3\;s_waitcnt\tlgkmcnt(0)";
+else
+  return "ds_cmpst_rtn_b %0, %1, %2, %3\;s_waitcnt\tlgkmcnt(0)";
+  }
   [(set_attr "type" "ds")
(set_attr "length" "12")])


I think you need to swap %2 and %3 in the new format. ds_cmpst matches 
GCC operand order, but ds_cmpstore has "cmp" and "src" reversed.


Andrew
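
[Editorial sketch] The point that the output template maps GCC's operand order onto assembler syntax can be modelled with a few lines of standalone code. This is not gcn.md code: the functions cmpswap_lds_template and expand are hypothetical, the s_waitcnt suffix is left out for brevity, and the swapped "%3, %2" order for the RDNA 3 variant simply follows the review suggestion above rather than anything verified against the ISA manual.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Pick the assembler template.  Only the mnemonic and the order in which the
// operand numbers appear differ; the operands themselves stay in GCC order.
std::string cmpswap_lds_template (bool target_rdna3)
{
  if (target_rdna3)
    // Assumption taken from the review comment: "cmp" and "src" are reversed.
    return "ds_cmpstore_rtn_b %0, %1, %3, %2";
  return "ds_cmpst_rtn_b %0, %1, %2, %3";
}

// Substitute each "%N" with operands[N], roughly the way an output template
// is expanded when assembly is emitted.
std::string expand (const std::string &tmpl,
                    const std::vector<std::string> &operands)
{
  std::string out;
  for (std::size_t i = 0; i < tmpl.size (); ++i)
    {
      if (tmpl[i] == '%' && i + 1 < tmpl.size ()
          && tmpl[i + 1] >= '0' && tmpl[i + 1] <= '9')
        {
          std::size_t n = tmpl[i + 1] - '0';
          assert (n < operands.size ());
          out += operands[n];
          ++i;  // Skip the digit that was just consumed.
        }
      else
        out += tmpl[i];
    }
  return out;
}
```

With the registers from the failing test case (v0, v0, v4, v3) the first template yields the original operand order, while the second swaps the last two registers without any change to the insn pattern's operands.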

[PATCH] tree-optimization/113630 - invalid code hoisting

2024-01-31 Thread Richard Biener

The following avoids code hoisting (but also PRE insertion) of
expressions that got value-numbered to another one that is not
a valid replacement (but still computes the same value).  This time
the cause is that the access path ends in a structure with a different
size, meaning we consider a related access as not trapping because of
the size of the base of the access.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/113630
* tree-ssa-pre.cc (compute_avail): Avoid registering a
reference with a representation with not matching base
access size.

* gcc.dg/torture/pr113630.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr113630.c |  4 
 gcc/tree-ssa-pre.cc | 14 ++
 2 files changed, 18 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr113630.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr113630.c b/gcc/testsuite/gcc.dg/torture/pr113630.c
new file mode 100644
index 000..72ebdefae27
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr113630.c
@@ -0,0 +1,4 @@
+/* { dg-do run { target { { *-*-linux* *-*-gnu* *-*-uclinux* } && mmap } } } */
+/* { dg-additional-options "-fno-strict-aliasing" } */
+
+#include "pr110799.c"
diff --git a/gcc/tree-ssa-pre.cc b/gcc/tree-ssa-pre.cc
index e72592de5e5..d29214d04f8 100644
--- a/gcc/tree-ssa-pre.cc
+++ b/gcc/tree-ssa-pre.cc
@@ -4281,6 +4281,20 @@ compute_avail (function *fun)
  = wide_int_to_tree (ptr_type_node,
  wi::to_wide (ref1->op2));
}
+ /* We also need to make sure that the access path
+ends in an access of the same size as otherwise
+we might assume an access may not trap while in
+fact it might.  That's independent of whether
+TBAA is in effect.  */
+ if (TYPE_SIZE (ref1->type) != TYPE_SIZE (ref2->type)
+ && (! TYPE_SIZE (ref1->type)
+ || ! TYPE_SIZE (ref2->type)
+ || ! operand_equal_p (TYPE_SIZE (ref1->type),
+   TYPE_SIZE (ref2->type))))
+   {
+ operands.release ();
+ continue;
+   }
  operands.release ();
 
  result = get_or_alloc_expr_for_reference
-- 
2.35.3

Re: [PATCH] LoongArch: Fix soft-float builds of libffi

2024-01-31 Thread Xi Ruoyao

On Sat, 2024-01-27 at 15:09 +0800, Yang Yujie wrote:
> This patch corresponds to the upstream PR:
> https://github.com/libffi/libffi/pull/817
> 
> libffi/ChangeLog:
> 
>   * src/loongarch64/ffi.c: Avoid defining floats
>   in struct call_context if the ABI is soft-float.

You need to wait until the PR is accepted by the libffi maintainers.
Frankly I don't know what the libffi maintainers are busy with and I'm
frustrated as well (having a MIPS patch unreviewed there for a month)
but this is the procedure :(.

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University

[PATCH] match: Fix vcond into conditional op folding [PR113607].

2024-01-31 Thread Robin Dapp

Hi,

in PR113607 we see an invalid fold of

  _429 = .COND_SHL (mask_patt_205.47_276, vect_cst__262, vect_cst__262, { 0, ... });
  vect_prephitmp_129.51_282 = _429;
  vect_iftmp.55_287 = VEC_COND_EXPR ;

to

  Applying pattern match.pd:9607, gimple-match-10.cc:3817
  gimple_simplified to vect_iftmp.55_287 = .COND_SHL (mask_patt_205.47_276, vect_cst__262, vect_cst__262, { 0, ... });

where we essentially use COND_SHL's else instead of VEC_COND_EXPR's.

This patch adjusts the corresponding match.pd pattern and makes it only
match when the else values are the same.

That, however, causes the very test case for which this pattern
was introduced to fail.  XFAIL it for now.

Bootstrapped and regtested on x86. Regtested on riscv.  aarch64
is still running.

Regards
 Robin
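
[Editorial sketch] Why the fold is only valid when the two else values agree can be checked with a small lane-wise model. Everything below is an illustration, not GCC code: cond_shl, vec_cond, mask_and, unfolded and folded are hypothetical helpers that mimic .COND_SHL and VEC_COND_EXPR on plain std::vector lanes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using vec = std::vector<int>;
using mask = std::vector<bool>;

// Masked shift: active lanes compute a << b, inactive lanes take ELS.
vec cond_shl (const mask &m, const vec &a, const vec &b, const vec &els)
{
  assert (m.size () == a.size () && a.size () == b.size ()
          && b.size () == els.size ());
  vec r (a.size ());
  for (std::size_t i = 0; i < a.size (); ++i)
    r[i] = m[i] ? a[i] << b[i] : els[i];
  return r;
}

// Lane-wise select: m ? t : f.
vec vec_cond (const mask &m, const vec &t, const vec &f)
{
  assert (m.size () == t.size () && t.size () == f.size ());
  vec r (t.size ());
  for (std::size_t i = 0; i < t.size (); ++i)
    r[i] = m[i] ? t[i] : f[i];
  return r;
}

mask mask_and (const mask &x, const mask &y)
{
  assert (x.size () == y.size ());
  mask r (x.size ());
  for (std::size_t i = 0; i < x.size (); ++i)
    r[i] = x[i] && y[i];
  return r;
}

// c = mask1 ? (masked_op mask2 a b els) : other
vec unfolded (const mask &mask1, const mask &mask2, const vec &a,
              const vec &b, const vec &els, const vec &other)
{
  return vec_cond (mask1, cond_shl (mask2, a, b, els), other);
}

// c = masked_op (mask1 & mask2) a b els
vec folded (const mask &mask1, const mask &mask2, const vec &a,
            const vec &b, const vec &els)
{
  return cond_shl (mask_and (mask1, mask2), a, b, els);
}
```

When `other` is the same vector as `els` the two forms agree in every lane; when `other` is the first operand instead (as in the PR, where the VEC_COND_EXPR falls back to vect_cst__262 while the COND_SHL falls back to zero), lanes with mask1 clear differ.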


gcc/ChangeLog:

PR middle-end/113607

* match.pd: Make sure else values match when folding a
vec_cond into a conditional operation.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve/pre_cond_share_1.c: XFAIL.
* gcc.target/riscv/rvv/autovec/pr113607-run.c: New test.
* gcc.target/riscv/rvv/autovec/pr113607.c: New test.
---
 gcc/match.pd  |  8 +--
 .../gcc.target/aarch64/sve/pre_cond_share_1.c |  2 +-
 .../riscv/rvv/autovec/pr113607-run.c  |  4 ++
 .../gcc.target/riscv/rvv/autovec/pr113607.c   | 49 +++
 4 files changed, 58 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113607-run.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113607.c

diff --git a/gcc/match.pd b/gcc/match.pd
index e42ecaf9ec7..7c391a8fe20 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -9592,18 +9592,18 @@ and,
 
 /* Detect simplification for vector condition folding where
 
-  c = mask1 ? (masked_op mask2 a b) : b
+  c = mask1 ? (masked_op mask2 a b els) : els
 
   into
 
-  c = masked_op (mask1 & mask2) a b
+  c = masked_op (mask1 & mask2) a b els
 
   where the operation can be partially applied to one operand. */
 
 (for cond_op (COND_BINARY)
  (simplify
   (vec_cond @0
-   (cond_op:s @1 @2 @3 @4) @3)
+   (cond_op:s @1 @2 @3 @4) @4)
   (cond_op (bit_and @1 @0) @2 @3 @4)))
 
 /* And same for ternary expressions.  */
@@ -9611,7 +9611,7 @@ and,
 (for cond_op (COND_TERNARY)
  (simplify
   (vec_cond @0
-   (cond_op:s @1 @2 @3 @4 @5) @4)
+   (cond_op:s @1 @2 @3 @4 @5) @5)
   (cond_op (bit_and @1 @0) @2 @3 @4 @5)))
 
 /* For pointers @0 and @2 and nonnegative constant offset @1, look for
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pre_cond_share_1.c b/gcc/testsuite/gcc.target/aarch64/sve/pre_cond_share_1.c
index b51d0f298ea..e4f754d739c 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/pre_cond_share_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pre_cond_share_1.c
@@ -129,4 +129,4 @@ fasten_main(size_t group, size_t ntypes, size_t nposes, size_t natlig, size_t na
 }
 
 /* { dg-final { scan-tree-dump-times {\.COND_MUL} 1 "optimized" } } */
-/* { dg-final { scan-tree-dump-times {\.VCOND} 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times {\.VCOND} 1 "optimized" { xfail *-*-* } } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113607-run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113607-run.c
new file mode 100644
index 000..06074767ce5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113607-run.c
@@ -0,0 +1,4 @@
+/* { dg-do run { target { riscv_v && rv64 } } } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -fdump-tree-optimized" } */
+
+#include "pr113607.c"
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113607.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113607.c
new file mode 100644
index 000..70a93665497
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113607.c
@@ -0,0 +1,49 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -fdump-tree-optimized" } */
+
+struct {
+  signed b;
+} c, d = {6};
+
+short e, f;
+int g[1000];
+signed char h;
+int i, j;
+long k, l;
+
+long m(long n, long o) {
+  if (n < 1 && o == 0)
+return 0;
+  return n;
+}
+
+static int p() {
+  long q = 0;
+  int a = 0;
+  for (; e < 2; e += 1)
+g[e * 7 + 1] = -1;
+  for (; h < 1; h += 1) {
+k = g[8] || f;
+l = m(g[f * 7 + 1], k);
+a = l;
+j = a < 0 || g[f * 7 + 1] < 0 || g[f * 7 + 1] >= 32 ? a : a << g[f * 7 + 1];
+if (j)
+  ++q;
+  }
+  if (q)
+c = d;
+  return i;
+}
+
+int main() {
+  p();
+  if (c.b != 6)
+__builtin_abort ();
+}
+
+/* We must not fold VEC_COND_EXPR into COND_SHL.
+   Therefore, make sure that we still have 2/4 VCOND_MASKs with real else
+   value.  */
+
+/* { dg-final { scan-tree-dump-times { = \.VCOND_MASK.\([a-z0-9\._]+, [a-z0-9\._\{\}, ]+, [0-9\.\{\},]+\);} 0 "optimized" } } */
+/* { dg-final { scan-tree-dump-times { = \.VCOND_MASK.\([a-z0-9\._]+, [a-z0-9\._\{\}, ]+, [a-z0-9\._]+\);} 4 "optimized" } } */
-- 
2.43.0

[PATCH] libgcc: Fix up i386/t-heap-trampoline [PR113403]

2024-01-31 Thread Jakub Jelinek

On Sun, Jan 28, 2024 at 02:07:32PM +, Iain Sandoe wrote:
> --- a/libgcc/config/aarch64/t-heap-trampoline
> +++ b/libgcc/config/aarch64/t-heap-trampoline
> @@ -16,4 +16,5 @@
>  # along with GCC; see the file COPYING3.  If not see
>  # .
>  
> -LIB2ADD += $(srcdir)/config/aarch64/heap-trampoline.c
> +LIB2ADDEH += $(srcdir)/config/aarch64/heap-trampoline.c
> +LIB2ADDEHSHARED += $(srcdir)/config/aarch64/heap-trampoline.c
> --- a/libgcc/config/i386/t-heap-trampoline
> +++ b/libgcc/config/i386/t-heap-trampoline
> @@ -16,4 +16,5 @@
>  # along with GCC; see the file COPYING3.  If not see
>  # .
>  
> -LIB2ADD += $(srcdir)/config/i386/heap-trampoline.c
> +LIB2ADDEH += $(srcdir)/config/i386/heap-trampoline.c
> +LIB2ADDEHSHARED += $(srcdir)/config/aarch64/heap-trampoline.c

I'm seeing
../../../libgcc/shared-object.mk:14: warning: overriding recipe for target 'heap-trampoline.o'
../../../libgcc/shared-object.mk:14: warning: ignoring old recipe for target 'heap-trampoline.o'
../../../libgcc/shared-object.mk:17: warning: overriding recipe for target 'heap-trampoline_s.o'
../../../libgcc/shared-object.mk:17: warning: ignoring old recipe for target 'heap-trampoline_s.o'

Shouldn't we go with the following patch?
I can test it on x86_64-linux and i686-linux, but can't test it e.g. on
Darwin easily.

2024-01-31  Jakub Jelinek  

* config/i386/t-heap-trampoline: Add to LIB2ADDEHSHARED
i386/heap-trampoline.c rather than aarch64/heap-trampoline.c.

--- libgcc/config/i386/t-heap-trampoline.jj 2024-01-31 10:46:36.491743132 +0100
+++ libgcc/config/i386/t-heap-trampoline 2024-01-31 12:55:59.779101625 +0100
@@ -17,4 +17,4 @@
 # .
 
 LIB2ADDEH += $(srcdir)/config/i386/heap-trampoline.c
-LIB2ADDEHSHARED += $(srcdir)/config/aarch64/heap-trampoline.c
+LIB2ADDEHSHARED += $(srcdir)/config/i386/heap-trampoline.c

Jakub

[PATCH] libgcc: Avoid warnings on __gcc_nested_func_ptr_created [PR113402]

2024-01-31 Thread Jakub Jelinek

On Sun, Jan 28, 2024 at 11:02:33AM +, Iain Sandoe wrote:
>   * config/aarch64/heap-trampoline.c: Rename
>   __builtin_nested_func_ptr_created to __gcc_nested_func_ptr_created and
>   __builtin_nested_func_ptr_deleted to __gcc_nested_func_ptr_deleted.
>   * config/i386/heap-trampoline.c: Likewise.
>   * libgcc2.h: Likewise.

I'm seeing hundreds of
In file included from ../../../libgcc/libgcc2.c:56:
../../../libgcc/libgcc2.h:32:13: warning: conflicting types for built-in function ‘__gcc_nested_func_ptr_created’; expected ‘void(void *, void *, void *)’ [-Wbuiltin-declaration-mismatch]
   32 | extern void __gcc_nested_func_ptr_created (void *, void *, void **);
  | ^
warnings.

Either we need to add like in r14-6218
#pragma GCC diagnostic ignored "-Wbuiltin-declaration-mismatch"
(but in that case because of the libgcc2.h prototype (why is it there?)
it would need to be also with #pragma GCC diagnostic push/pop around),
or we could go with just following how the builtins are prototyped on the
compiler side and only cast to void ** when dereferencing (which is in
a single spot in each TU).
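
[Editorial sketch] The second option, keeping a plain `void *` in the prototype and casting only at the point of dereference, can be shown in isolation. The function name nested_func_ptr_created and the fake_insns buffer below are hypothetical stand-ins, not the libgcc symbols; the sketch only demonstrates the cast pattern.

```cpp
#include <cassert>

// Stand-in for the trampoline instruction storage; purely illustrative.
static unsigned char fake_insns[16];

// The prototype uses plain `void *` for all three arguments, matching how
// the thread says the builtin is typed on the compiler side.
void nested_func_ptr_created (void *chain, void *func, void *dst)
{
  assert (dst != nullptr);
  (void) chain;  // Unused in this sketch.
  (void) func;   // Unused in this sketch.
  // Only here, at the single dereference, is DST treated as `void **`.
  *(void **) dst = fake_insns;
}
```

A caller still passes the address of a pointer variable; that address converts implicitly to `void *`, so no cast is needed at the call site.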

2024-01-31  Jakub Jelinek  

* libgcc2.h (__gcc_nested_func_ptr_created): Change type of last
argument from void ** to void *.
* config/i386/heap-trampoline.c (__gcc_nested_func_ptr_created):
Change type of dst from void ** to void * and cast dst to void **
before dereferencing it.
* config/aarch64/heap-trampoline.c (__gcc_nested_func_ptr_created):
Likewise.

--- libgcc/libgcc2.h.jj 2024-01-29 09:41:20.096387494 +0100
+++ libgcc/libgcc2.h 2024-01-31 12:43:22.702694509 +0100
@@ -29,7 +29,7 @@ see the files COPYING3 and COPYING.RUNTI
 #pragma GCC visibility push(default)
 #endif
 
-extern void __gcc_nested_func_ptr_created (void *, void *, void **);
+extern void __gcc_nested_func_ptr_created (void *, void *, void *);
 extern void __gcc_nested_func_ptr_deleted (void);
 
 extern int __gcc_bcmp (const unsigned char *, const unsigned char *, size_t);
--- libgcc/config/i386/heap-trampoline.c.jj 2024-01-31 10:46:36.491743132 +0100
+++ libgcc/config/i386/heap-trampoline.c 2024-01-31 12:44:44.449550698 +0100
@@ -26,7 +26,7 @@ int get_trampolines_per_page (void);
 struct tramp_ctrl_data *allocate_tramp_ctrl (struct tramp_ctrl_data *parent);
 void *allocate_trampoline_page (void);
 
-void __gcc_nested_func_ptr_created (void *chain, void *func, void **dst);
+void __gcc_nested_func_ptr_created (void *chain, void *func, void *dst);
 void __gcc_nested_func_ptr_deleted (void);
 
 static const uint8_t trampoline_insns[] = {
@@ -115,7 +115,7 @@ allocate_tramp_ctrl (struct tramp_ctrl_d
 
 HEAP_T_ATTR
 void
-__gcc_nested_func_ptr_created (void *chain, void *func, void **dst)
+__gcc_nested_func_ptr_created (void *chain, void *func, void *dst)
 {
   if (tramp_ctrl_curr == NULL)
 {
@@ -158,7 +158,7 @@ __gcc_nested_func_ptr_created (void *cha
   __builtin___clear_cache ((void *)trampoline->insns,
   ((void *)trampoline->insns + sizeof(trampoline->insns)));
 
-  *dst = &trampoline->insns;
+  *(void **) dst = &trampoline->insns;
 }
 
 HEAP_T_ATTR
--- libgcc/config/aarch64/heap-trampoline.c.jj 2024-01-31 10:46:36.491743132 +0100
+++ libgcc/config/aarch64/heap-trampoline.c 2024-01-31 12:45:11.282175257 +0100
@@ -26,7 +26,7 @@ int get_trampolines_per_page (void);
 struct tramp_ctrl_data *allocate_tramp_ctrl (struct tramp_ctrl_data *parent);
 void *allocate_trampoline_page (void);
 
-void __gcc_nested_func_ptr_created (void *chain, void *func, void **dst);
+void __gcc_nested_func_ptr_created (void *chain, void *func, void *dst);
 void __gcc_nested_func_ptr_deleted (void);
 
 #if defined(__gnu_linux__)
@@ -115,7 +115,7 @@ allocate_tramp_ctrl (struct tramp_ctrl_d
 
 HEAP_T_ATTR
 void
-__gcc_nested_func_ptr_created (void *chain, void *func, void **dst)
+__gcc_nested_func_ptr_created (void *chain, void *func, void *dst)
 {
   if (tramp_ctrl_curr == NULL)
 {
@@ -158,7 +158,7 @@ __gcc_nested_func_ptr_created (void *cha
   __builtin___clear_cache ((void *)trampoline->insns,
   ((void *)trampoline->insns + sizeof(trampoline->insns)));
 
-  *dst = &trampoline->insns;
+  *(void **) dst = &trampoline->insns;
 }
 
 HEAP_T_ATTR


Jakub

Re: [PATCH] gimple-fold: Remove .ASAN_MARK calls on TREE_STATIC variables [PR113531]

2024-01-31 Thread Richard Biener

On Wed, Jan 31, 2024 at 12:18 PM Jakub Jelinek  wrote:
>
> On Wed, Jan 31, 2024 at 10:07:28AM +0100, Jakub Jelinek wrote:
> > Indeed.  But what we could do is try to fold_stmt those .ASAN_MARK calls
> > away earlier (but sure, the asan.cc change would be still required because
> > that would be just an optimization).  But that can be handled incrementally,
> > so I think the patch is ok as is (and I can handle the incremental part
> > myself).
>
> Like this, so far just tested on the testcase.  Ok for trunk if it passes
> bootstrap/regtest on top of Jason's patch?

Note we fold all - well, all builtin - calls during gimple lowering.
Maybe we can put this special-casing there instead? (gimple-low.cc:797,
you possibly have to replace with a GIMPLE_NOP)

> 2024-01-31  Jakub Jelinek  
>
> PR c++/113531
> * gimple-fold.cc (gimple_fold_call): Remove .ASAN_MARK calls
> on variables which were promoted to TREE_STATIC.
>
> --- gcc/gimple-fold.cc.jj   2024-01-03 11:51:27.236790799 +0100
> +++ gcc/gimple-fold.cc  2024-01-31 12:09:14.853348505 +0100
> @@ -5722,6 +5722,21 @@ gimple_fold_call (gimple_stmt_iterator *
>   }
>   }
>   break;
> +case IFN_ASAN_MARK:
> +  {
> +tree base = gimple_call_arg (stmt, 1);
> +gcc_checking_assert (TREE_CODE (base) == ADDR_EXPR);
> +tree decl = TREE_OPERAND (base, 0);
> +if (VAR_P (decl) && TREE_STATIC (decl))
> + {
> +   /* Don't poison a variable with static storage; it might have
> +  gotten marked before gimplify_init_constructor promoted it
> +  to static.  */
> +   replace_call_with_value (gsi, NULL_TREE);
> +   return true;
> + }
> +  }
> + break;
> case IFN_GOACC_DIM_SIZE:
> case IFN_GOACC_DIM_POS:
>   result = fold_internal_goacc_dim (stmt);
>
>
> Jakub
>

Re: [PATCH] match: Fix vcond into conditional op folding [PR113607].

2024-01-31 Thread Richard Biener

On Wed, Jan 31, 2024 at 12:50 PM Robin Dapp  wrote:
>
> Hi,
>
> in PR113607 we see an invalid fold of
>
>   _429 = .COND_SHL (mask_patt_205.47_276, vect_cst__262, vect_cst__262, { 0, 
> ... });
>   vect_prephitmp_129.51_282 = _429;
>   vect_iftmp.55_287 = VEC_COND_EXPR  vect_prephitmp_129.51_282, vect_cst__262>;
>
> to
>
>   Applying pattern match.pd:9607, gimple-match-10.cc:3817
>   gimple_simplified to vect_iftmp.55_287 = .COND_SHL (mask_patt_205.47_276, 
> vect_cst__262, vect_cst__262, { 0, ... });
>
> where we essentially use COND_SHL's else instead of VEC_COND_EXPR's.
>
> This patch adjusts the corresponding match.pd pattern and makes it only
> match when the else values are the same.
>
> That, however, causes the very test case for which this pattern
> was introduced to fail.  XFAIL it for now.
>
> Bootstrapped and regtested on x86. Regtested on riscv.  aarch64
> is still running.

OK.

Thanks,
Richard.

> Regards
>  Robin
>
>
> gcc/ChangeLog:
>
> PR middle-end/113607
>
> * match.pd: Make sure else values match when folding a
> vec_cond into a conditional operation.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/aarch64/sve/pre_cond_share_1.c: XFAIL.
> * gcc.target/riscv/rvv/autovec/pr113607-run.c: New test.
> * gcc.target/riscv/rvv/autovec/pr113607.c: New test.
> ---
>  gcc/match.pd  |  8 +--
>  .../gcc.target/aarch64/sve/pre_cond_share_1.c |  2 +-
>  .../riscv/rvv/autovec/pr113607-run.c  |  4 ++
>  .../gcc.target/riscv/rvv/autovec/pr113607.c   | 49 +++
>  4 files changed, 58 insertions(+), 5 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113607-run.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113607.c
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index e42ecaf9ec7..7c391a8fe20 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -9592,18 +9592,18 @@ and,
>
>  /* Detect simplification for vector condition folding where
>
> -  c = mask1 ? (masked_op mask2 a b) : b
> +  c = mask1 ? (masked_op mask2 a b els) : els
>
>into
>
> -  c = masked_op (mask1 & mask2) a b
> +  c = masked_op (mask1 & mask2) a b els
>
>where the operation can be partially applied to one operand. */
>
>  (for cond_op (COND_BINARY)
>   (simplify
>(vec_cond @0
> -   (cond_op:s @1 @2 @3 @4) @3)
> +   (cond_op:s @1 @2 @3 @4) @4)
>(cond_op (bit_and @1 @0) @2 @3 @4)))
>
>  /* And same for ternary expressions.  */
> @@ -9611,7 +9611,7 @@ and,
>  (for cond_op (COND_TERNARY)
>   (simplify
>(vec_cond @0
> -   (cond_op:s @1 @2 @3 @4 @5) @4)
> +   (cond_op:s @1 @2 @3 @4 @5) @5)
>(cond_op (bit_and @1 @0) @2 @3 @4 @5)))
>
>  /* For pointers @0 and @2 and nonnegative constant offset @1, look for
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pre_cond_share_1.c 
> b/gcc/testsuite/gcc.target/aarch64/sve/pre_cond_share_1.c
> index b51d0f298ea..e4f754d739c 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/pre_cond_share_1.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/pre_cond_share_1.c
> @@ -129,4 +129,4 @@ fasten_main(size_t group, size_t ntypes, size_t nposes, 
> size_t natlig, size_t na
>  }
>
>  /* { dg-final { scan-tree-dump-times {\.COND_MUL} 1 "optimized" } } */
> -/* { dg-final { scan-tree-dump-times {\.VCOND} 1 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times {\.VCOND} 1 "optimized" { xfail *-*-* } 
> } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113607-run.c 
> b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113607-run.c
> new file mode 100644
> index 000..06074767ce5
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113607-run.c
> @@ -0,0 +1,4 @@
> +/* { dg-do run { target { riscv_v && rv64 } } } */
> +/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -fdump-tree-optimized" } */
> +
> +#include "pr113607.c"
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113607.c 
> b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113607.c
> new file mode 100644
> index 000..70a93665497
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr113607.c
> @@ -0,0 +1,49 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d -fdump-tree-optimized" } */
> +
> +struct {
> +  signed b;
> +} c, d = {6};
> +
> +short e, f;
> +int g[1000];
> +signed char h;
> +int i, j;
> +long k, l;
> +
> +long m(long n, long o) {
> +  if (n < 1 && o == 0)
> +return 0;
> +  return n;
> +}
> +
> +static int p() {
> +  long q = 0;
> +  int a = 0;
> +  for (; e < 2; e += 1)
> +g[e * 7 + 1] = -1;
> +  for (; h < 1; h += 1) {
> +k = g[8] || f;
> +l = m(g[f * 7 + 1], k);
> +a = l;
> +j = a < 0 || g[f * 7 + 1] < 0 || g[f * 7 + 1] >= 32 ? a : a << g[f * 7 + 
> 1];
> +if (j)
> +  ++q;
> +  }
> +  if (q)
> +c = d;
> +  return i;
> +}
> +
> +int main() {
> +  p();
> +  if (c.b != 6)
> +__builtin_abort ();
> +}
> +
> +/* We must not fold VEC_COND_EXPR in

Re: [PATCH 1/3] vect: Pass stmt_vec_info to TARGET_SIMD_CLONE_USABLE

2024-01-31 Thread Richard Biener

On Tue, 30 Jan 2024, Andre Vieira wrote:

> 
> This patch adds stmt_vec_info to TARGET_SIMD_CLONE_USABLE to make sure the
> target can reject a simd_clone based on the vector mode it is using.
> This is needed because for VLS SVE vectorization the vectorizer accepts
> Advanced SIMD simd clones when vectorizing using SVE types because the
> simdlens might match.  This will cause type errors later on.
> 
> Other targets do not currently need to use this argument.

Can you instead pass down the mode?

> gcc/ChangeLog:
> 
>   * target.def (TARGET_SIMD_CLONE_USABLE): Add argument.
>   * tree-vect-stmts.cc (vectorizable_simd_clone_call): Pass stmt_info to
>   call TARGET_SIMD_CLONE_USABLE.
>   * config/aarch64/aarch64.cc (aarch64_simd_clone_usable): Add argument
>   and use it to reject the use of SVE simd clones with Advanced SIMD
>   modes.
>   * config/gcn/gcn.cc (gcn_simd_clone_usable): Add unused argument.
>   * config/i386/i386.cc (ix86_simd_clone_usable): Likewise.
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Re: [PATCH 1/3] vect: Pass stmt_vec_info to TARGET_SIMD_CLONE_USABLE

2024-01-31 Thread Richard Biener

On Wed, 31 Jan 2024, Richard Biener wrote:

> On Tue, 30 Jan 2024, Andre Vieira wrote:
> 
> > 
> > This patch adds stmt_vec_info to TARGET_SIMD_CLONE_USABLE to make sure the
> > target can reject a simd_clone based on the vector mode it is using.
> > This is needed because for VLS SVE vectorization the vectorizer accepts
> > Advanced SIMD simd clones when vectorizing using SVE types because the
> > simdlens might match.  This will cause type errors later on.
> > 
> > Other targets do not currently need to use this argument.
> 
> Can you instead pass down the mode?

Thinking about that again the cgraph_simd_clone info in the clone
should have sufficient information to disambiguate.  If it doesn't
then we should amend it.

Richard.

Re: [PATCH 2/3] vect: disable multiple calls of poly simdclones

2024-01-31 Thread Richard Biener

On Tue, 30 Jan 2024, Andre Vieira wrote:

> 
> The current codegen code to support VF's that are multiples of a simdclone
> simdlen rely on BIT_FIELD_REF to create multiple input vectors.  This does not
> work for non-constant simdclones, so we should disable using such clones when
> the VF is a multiple of the non-constant simdlen until we change the codegen 
> to
> support those.

OK.

Thanks,
Richard.

> gcc/ChangeLog:
> 
>   * tree-vect-stmts.cc (vectorizable_simd_clone_call): Reject simdclones
>   with non-constant simdlen when VF is not exactly the same.

[COMMITTED] testsuite: Require ucn in g++.dg/cpp0x/udlit-extended-id-1.C

2024-01-31 Thread Rainer Orth

g++.dg/cpp0x/udlit-extended-id-1.C FAILs on Solaris/SPARC and x86 with
the native assembler:

UNRESOLVED: g++.dg/cpp0x/udlit-extended-id-1.C  -std=c++14 compilation failed to produce executable
FAIL: g++.dg/cpp0x/udlit-extended-id-1.C  -std=c++17 (test for excess errors)
UNRESOLVED: g++.dg/cpp0x/udlit-extended-id-1.C  -std=c++17 compilation failed to produce executable
FAIL: g++.dg/cpp0x/udlit-extended-id-1.C  -std=c++20 (test for excess errors)
UNRESOLVED: g++.dg/cpp0x/udlit-extended-id-1.C  -std=c++20 compilation failed to produce executable

/bin/as doesn't support UCN identifiers:

/usr/ccs/bin/as: "/var/tmp//ccCl_9fa.s", line 4: error: invalid character (0xcf)
/usr/ccs/bin/as: "/var/tmp//ccCl_9fa.s", line 4: error: invalid character (0x80)
/usr/ccs/bin/as: "/var/tmp//ccCl_9fa.s", line 4: error: statement syntax
/usr/ccs/bin/as: "/var/tmp//ccCl_9fa.s", line 4: error: statement syntax
[...]

To avoid this, this patch requires ucn support.

Tested on i386-pc-solaris2.11 (as and gas), sparc-sun-solaris2.11 (as
and gas), and i686-pc-linux-gnu.

Committed to trunk.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2024-01-30  Rainer Orth  

gcc/testsuite:
* g++.dg/cpp0x/udlit-extended-id-1.C: Require ucn support.

# HG changeset patch
# Parent  b19044614bf18d15d2ccd8b2b26450678f93acaf
testsuite: Require ucn in g++.dg/cpp0x/udlit-extended-id-1.C

diff --git a/gcc/testsuite/g++.dg/cpp0x/udlit-extended-id-1.C b/gcc/testsuite/g++.dg/cpp0x/udlit-extended-id-1.C
--- a/gcc/testsuite/g++.dg/cpp0x/udlit-extended-id-1.C
+++ b/gcc/testsuite/g++.dg/cpp0x/udlit-extended-id-1.C
@@ -1,5 +1,6 @@
 // { dg-do run { target c++11 } }
 // { dg-additional-options "-Wno-error=normalized" }
+// { dg-require-effective-target ucn }
 #include 
 #include 
 using namespace std;

Re: [PATCH v3 4/5] Add tests for C/C++ musttail attributes

2024-01-31 Thread Prathamesh Kulkarni

On Wed, 31 Jan 2024 at 07:49, Andi Kleen  wrote:
>
> Mostly adopted from the existing C musttail plugin tests.
> ---
>  gcc/testsuite/c-c++-common/musttail1.c  | 17 
>  gcc/testsuite/c-c++-common/musttail2.c  | 36 +
>  gcc/testsuite/c-c++-common/musttail3.c  | 31 +
>  gcc/testsuite/c-c++-common/musttail4.c  | 19 +
>  gcc/testsuite/gcc.dg/musttail-invalid.c | 17 
>  5 files changed, 120 insertions(+)
>  create mode 100644 gcc/testsuite/c-c++-common/musttail1.c
>  create mode 100644 gcc/testsuite/c-c++-common/musttail2.c
>  create mode 100644 gcc/testsuite/c-c++-common/musttail3.c
>  create mode 100644 gcc/testsuite/c-c++-common/musttail4.c
>  create mode 100644 gcc/testsuite/gcc.dg/musttail-invalid.c
>
> diff --git a/gcc/testsuite/c-c++-common/musttail1.c 
> b/gcc/testsuite/c-c++-common/musttail1.c
> new file mode 100644
> index ..476185e3ed4b
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/musttail1.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile { target tail_call } } */
> +/* { dg-options "-O2" } */
> +/* { dg-additional-options "-std=c++11" { target c++ } } */
> +/* { dg-additional-options "-std=c23" { target c } } */
> +/* { dg-additional-options "-fdelayed-branch" { target sparc*-*-* } } */
> +
> +int __attribute__((noinline,noclone))
Hi,
Sorry to nitpick -- Just wondering if it'd be slightly better to use
noipa attribute instead, assuming the intent is to disable IPA opts ?

Thanks,
Prathamesh
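
A minimal sketch of the suggestion above, assuming a compiler that
understands GNU attributes.  The NOIPA macro is a hypothetical helper
introduced only for this illustration and is not part of the patch; it
falls back to noinline where noipa is unavailable.  GCC documents noipa
as implying noinline, noclone and no_icf, so it can stand in for the
noinline,noclone pair used in the tests.

```c
/* NOIPA is a hypothetical helper macro, not something the patch defines.
   Prefer noipa when the compiler reports support for it; otherwise fall
   back to plain noinline.  */
#if defined(__has_attribute)
# if __has_attribute(noipa)
#  define NOIPA __attribute__((noipa))
# endif
#endif
#ifndef NOIPA
# define NOIPA __attribute__((noinline))
#endif

/* Same shape as the callee/caller pair in musttail1.c, minus the
   musttail attribute itself.  */
NOIPA int
callee (int i)
{
  return i * i;
}

NOIPA int
caller (int i)
{
  return callee (i + 1);
}
```

With noipa the optimizers treat the body as unavailable when compiling
callers, which is usually the intent when a test wants a real call to
survive at -O2.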


> +callee (int i)
> +{
> +  return i * i;
> +}
> +
> +int __attribute__((noinline,noclone))
> +caller (int i)
> +{
> +  [[gnu::musttail]] return callee (i + 1);
> +}
> diff --git a/gcc/testsuite/c-c++-common/musttail2.c 
> b/gcc/testsuite/c-c++-common/musttail2.c
> new file mode 100644
> index ..28f2f68ef13d
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/musttail2.c
> @@ -0,0 +1,36 @@
> +/* { dg-do compile { target tail_call } } */
> +/* { dg-additional-options "-std=c++11" { target c++ } } */
> +/* { dg-additional-options "-std=c23" { target c } } */
> +
> +struct box { char field[256]; int i; };
> +
> +int __attribute__((noinline,noclone))
> +test_2_callee (int i, struct box b)
> +{
> +  if (b.field[0])
> +return 5;
> +  return i * i;
> +}
> +
> +int __attribute__((noinline,noclone))
> +test_2_caller (int i)
> +{
> +  struct box b;
> +  [[gnu::musttail]] return test_2_callee (i + 1, b); /* { dg-error "cannot 
> tail-call: " } */
> +}
> +
> +extern void setjmp (void);
> +void
> +test_3 (void)
> +{
> +  [[gnu::musttail]] return setjmp (); /* { dg-error "cannot tail-call: " } */
> +}
> +
> +typedef void (fn_ptr_t) (void);
> +volatile fn_ptr_t fn_ptr;
> +
> +void
> +test_5 (void)
> +{
> +  [[gnu::musttail]] return fn_ptr (); /* { dg-error "cannot tail-call: " } */
> +}
> diff --git a/gcc/testsuite/c-c++-common/musttail3.c 
> b/gcc/testsuite/c-c++-common/musttail3.c
> new file mode 100644
> index ..fdbb292944ad
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/musttail3.c
> @@ -0,0 +1,31 @@
> +/* { dg-do compile { target tail_call } } */
> +/* { dg-additional-options "-std=c++11" { target c++ } } */
> +/* { dg-additional-options "-std=c23" { target c } } */
> +
> +extern int foo2 (int x, ...);
> +
> +struct str
> +{
> +  int a, b;
> +};
> +
> +struct str
> +cstruct (int x)
> +{
> +  if (x < 10)
> +[[clang::musttail]] return cstruct (x + 1);
> +  return ((struct str){ x, 0 });
> +}
> +
> +int
> +foo (int x)
> +{
> +  if (x < 10)
> +[[clang::musttail]] return foo2 (x, 29);
> +  if (x < 100)
> +{
> +  int k = foo (x + 1);
> +  [[clang::musttail]] return k;/* { dg-error "cannot tail-call: " } 
> */
> +}
> +  return x;
> +}
> diff --git a/gcc/testsuite/c-c++-common/musttail4.c 
> b/gcc/testsuite/c-c++-common/musttail4.c
> new file mode 100644
> index ..7bf44816f14a
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/musttail4.c
> @@ -0,0 +1,19 @@
> +/* { dg-do compile { target tail_call } } */
> +/* { dg-additional-options "-std=c++11" { target c++ } } */
> +/* { dg-additional-options "-std=c23" { target c } } */
> +
> +struct box { char field[64]; int i; };
> +
> +struct box __attribute__((noinline,noclone))
> +returns_struct (int i)
> +{
> +  struct box b;
> +  b.i = i * i;
> +  return b;
> +}
> +
> +int __attribute__((noinline,noclone))
> +test_1 (int i)
> +{
> +  [[gnu::musttail]] return returns_struct (i * 5).i; /* { dg-error "cannot 
> tail-call: " } */
> +}
> diff --git a/gcc/testsuite/gcc.dg/musttail-invalid.c 
> b/gcc/testsuite/gcc.dg/musttail-invalid.c
> new file mode 100644
> index ..c4725b4b8226
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/musttail-invalid.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-options "-std=c23" } */
> +
> +[[musttail]] int j; /* { dg-warning "attribute ignored" } */
> +__attribute__((musttail)) int k; /* { dg-warning "attribute directive 
> ignored" } */
> +
> +void foo(void)
> +{
> +   [[mus

[PATCH] tree-optimization/111444 - avoid insertions when skipping defs

2024-01-31 Thread Richard Biener

The following avoids inserting expressions for IPA CP discovered
equivalences into the VN hashtables when we are optimistically
skipping may-defs in the attempt to prove it's redundant.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/111444
* tree-ssa-sccvn.cc (vn_reference_lookup_3): Do not use
vn_reference_lookup_2 when optimistically skipping may-defs.

* gcc.dg/torture/pr111444.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr111444.c | 26 +
 gcc/tree-ssa-sccvn.cc   | 22 -
 2 files changed, 39 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr111444.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr111444.c b/gcc/testsuite/gcc.dg/torture/pr111444.c
new file mode 100644
index 000..e613f255803
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr111444.c
@@ -0,0 +1,26 @@
+/* { dg-do run } */
+
+int a = 3, d, e;
+int *b = &a;
+char c;
+short f;
+const int **g;
+static long h(int **i, int **j)
+{
+  const int *k[46];
+  const int **l = &k[5];
+  *j = &e;
+  g = l;
+  for (; d; d = d + 1)
+;
+  **i = 0;
+  return f;
+}
+int main()
+{
+  int *m = &a;
+  h(&m, &m);
+  c = *b;
+  if (c != 3)
+__builtin_abort ();
+}
diff --git a/gcc/tree-ssa-sccvn.cc b/gcc/tree-ssa-sccvn.cc
index f0fa718a723..9bed9b3cc69 100644
--- a/gcc/tree-ssa-sccvn.cc
+++ b/gcc/tree-ssa-sccvn.cc
@@ -2790,25 +2790,29 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void *data_,
}
  else
{
- tree *saved_last_vuse_ptr = data->last_vuse_ptr;
- /* Do not update last_vuse_ptr in vn_reference_lookup_2.  */
- data->last_vuse_ptr = NULL;
  tree saved_vuse = vr->vuse;
  hashval_t saved_hashcode = vr->hashcode;
- void *res = vn_reference_lookup_2 (ref, gimple_vuse (def_stmt),
-data);
+ if (vr->vuse)
+   vr->hashcode = vr->hashcode - SSA_NAME_VERSION (vr->vuse);
+ vr->vuse = vuse_ssa_val (gimple_vuse (def_stmt));
+ if (vr->vuse)
+   vr->hashcode = vr->hashcode + SSA_NAME_VERSION (vr->vuse);
+ vn_reference_t vnresult = NULL;
+ /* Do not use vn_reference_lookup_2 since that might perform
+expression hashtable insertion but this lookup crosses
+a possible may-alias making such insertion conditionally
+invalid.  */
+ vn_reference_lookup_1 (vr, &vnresult);
  /* Need to restore vr->vuse and vr->hashcode.  */
  vr->vuse = saved_vuse;
  vr->hashcode = saved_hashcode;
- data->last_vuse_ptr = saved_last_vuse_ptr;
- if (res && res != (void *)-1)
+ if (vnresult)
{
- vn_reference_t vnresult = (vn_reference_t) res;
  if (TREE_CODE (rhs) == SSA_NAME)
rhs = SSA_VAL (rhs);
  if (vnresult->result
  && operand_equal_p (vnresult->result, rhs, 0))
-   return res;
+   return vnresult;
}
}
}
-- 
2.35.3
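
The hunk above retargets the lookup key to a different vuse by
subtracting the old SSA_NAME_VERSION from the hashcode and adding the
new one, then restores the saved values afterwards.  A minimal sketch
of that save / adjust / restore pattern follows, assuming the hash
combines the vuse version additively.  The struct and function names
are hypothetical and are not GCC's; version 0 stands in for "no vuse",
where the real code tests the pointer for NULL.

```c
#include <stdint.h>

/* Hypothetical stand-in for a value-numbering lookup key.  */
struct ref_key
{
  uint32_t operands_hash;   /* hash of everything except the vuse */
  uint32_t vuse_version;    /* 0 means "no vuse" in this sketch */
  uint32_t hashcode;        /* operands_hash + vuse_version */
};

/* Point the key at a different vuse and patch the hash incrementally
   instead of rehashing the operands.  */
void
retarget_vuse (struct ref_key *k, uint32_t new_version)
{
  k->hashcode -= k->vuse_version;
  k->vuse_version = new_version;
  k->hashcode += k->vuse_version;
}

/* Compute the hash the key would have with NEW_VERSION, leaving the
   key unchanged on return: save, adjust, read, restore.  */
uint32_t
probe_hash_with_vuse (struct ref_key *k, uint32_t new_version)
{
  uint32_t saved_version = k->vuse_version;
  uint32_t saved_hashcode = k->hashcode;

  retarget_vuse (k, new_version);
  uint32_t probed = k->hashcode;

  k->vuse_version = saved_version;
  k->hashcode = saved_hashcode;
  return probed;
}
```

Only the key is touched here; nothing is inserted into a table, which
is the property the patch wants from the lookup.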

[PATCH] testsuite: i386: Disable .eh_frame in gcc.target/i386/auto-init-5.c etc.

2024-01-31 Thread Rainer Orth

The gcc.target/i386/auto-init-5.c and gcc.target/i386/auto-init-6.c
tests FAIL on 64-bit Solaris/x86 with the native assembler:

FAIL: gcc.target/i386/auto-init-5.c scan-assembler-times .long\\t0 14
FAIL: gcc.target/i386/auto-init-6.c scan-assembler-times long\\t0 8

/bin/as doesn't fully support the CFI directives, so the .eh_frame
sections are emitted directly and contain .long.  Since .eh_frame
doesn't matter for those tests, this patch disables its generation in
the first place.

Tested on i386-pc-solaris2.11 (as and gas) and i686-pc-linux-gnu.

Ok for trunk?

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2024-01-30  Rainer Orth  

gcc/testsuite:
* gcc.target/i386/auto-init-5.c: Add
-fno-asynchronous-unwind-tables to dg-options.
* gcc.target/i386/auto-init-6.c: Likewise.

# HG changeset patch
# Parent  5486f016175658d0c7c2de48c893e9fbd68d3493
testsuite: i386: Disable .eh_frame in gcc.target/i386/auto-init-5.c etc.

diff --git a/gcc/testsuite/gcc.target/i386/auto-init-5.c b/gcc/testsuite/gcc.target/i386/auto-init-5.c
--- a/gcc/testsuite/gcc.target/i386/auto-init-5.c
+++ b/gcc/testsuite/gcc.target/i386/auto-init-5.c
@@ -1,6 +1,6 @@
 /* Verify zero initialization for complex type automatic variables.  */
 /* { dg-do compile } */
-/* { dg-options "-ftrivial-auto-var-init=zero" } */
+/* { dg-options "-ftrivial-auto-var-init=zero -fno-asynchronous-unwind-tables" } */
 
 
 _Complex long double result;
diff --git a/gcc/testsuite/gcc.target/i386/auto-init-6.c b/gcc/testsuite/gcc.target/i386/auto-init-6.c
--- a/gcc/testsuite/gcc.target/i386/auto-init-6.c
+++ b/gcc/testsuite/gcc.target/i386/auto-init-6.c
@@ -2,7 +2,7 @@
 /* Note, _Complex long double is initialized to zeroes due to the current
implemenation limitation.  */
 /* { dg-do compile } */
-/* { dg-options "-ftrivial-auto-var-init=pattern -march=x86-64 -mtune=generic -msse" } */
+/* { dg-options "-ftrivial-auto-var-init=pattern -march=x86-64 -mtune=generic -msse -fno-asynchronous-unwind-tables" } */
 
 
 _Complex long double result;

Re: [PATCH] RISC-V: Support scheduling for sifive p600 series

2024-01-31 Thread Robin Dapp

> +  NULL,  /* vector cost */
> +};

Does the P600 series include a vector unit?  From what I found on
the web it looks like it.  If so I would suggest specifying at least
the default (generic) vector cost model here.  We fall back to the
default one for NULL but I find it more explicit to specify one. 

> +;; The Sifive 8 has six pipelines:

P600?  Is 8 the generation and P600 the official name?

> +(define_insn_reservation "sifive_p600_div" 33
> +  (and (eq_attr "tune" "sifive_p600")
> +   (eq_attr "type" "idiv"))
> +  "sifive_p600_M, sifive_p600_idiv*32")
> +

> +(define_insn_reservation "sifive_p600_fdiv_s" 18
> +  (and (eq_attr "tune" "sifive_p600")
> +   (eq_attr "type" "fdiv,fsqrt")
> +   (eq_attr "mode" "SF"))
> +  "sifive_p600_FM, sifive_p600_fdiv*17")
> +
> +(define_insn_reservation "sifive_p600_fdiv_d" 31
> +  (and (eq_attr "tune" "sifive_p600")
> +   (eq_attr "type" "fdiv,fsqrt")
> +   (eq_attr "mode" "DF"))
> +  "sifive_p600_FM, sifive_p600_fdiv*30")

I would suggest not to block the units for that long.  It will
needlessly increase the automata's complexity causing longer build
times.  Even if you want to keep the latency high (doubtful if
that's beneficial in terms of spilling) you could just block the
unit for maybe 3-5 cycles.  Up to you in the end, though and not
a blocker.

Regards
 Robin

Re: [PATCH] testsuite: i386: Disable .eh_frame in gcc.target/i386/auto-init-5.c etc.

2024-01-31 Thread Jakub Jelinek

On Wed, Jan 31, 2024 at 01:50:33PM +0100, Rainer Orth wrote:
> The gcc.target/i386/auto-init-5.c and gcc.target/i386/auto-init-6.c
> tests FAIL on 64-bit Solaris/x86 with the native assembler:
> 
> FAIL: gcc.target/i386/auto-init-5.c scan-assembler-times .long\\t0 14
> FAIL: gcc.target/i386/auto-init-6.c scan-assembler-times long\\t0 8
> 
> /bin/as doesn't fully support the CFI directives, so the .eh_frame
> sections are emitted directly and contain .long.  Since .eh_frame
> doesn't matter for those tests, this patch disables its generation in
> the first place.
> 
> Tested on i386-pc-solaris2.11 (as and gas) and i686-pc-linux-gnu.
> 
> Ok for trunk?
> 
>   Rainer
> 
> -- 
> -
> Rainer Orth, Center for Biotechnology, Bielefeld University
> 
> 
> 2024-01-30  Rainer Orth  
> 
>   gcc/testsuite:
>   * gcc.target/i386/auto-init-5.c: Add
>   -fno-asynchronous-unwind-tables to dg-options.
>   * gcc.target/i386/auto-init-6.c: Likewise.

LGTM.

Jakub

[PATCH] testsuite: i386: Fix gcc.target/i386/no-callee-saved-1.c etc. on Solaris/x86

2024-01-31 Thread Rainer Orth

The gcc.target/i386/no-callee-saved-[12].c tests FAIL on Solaris/x86:

FAIL: gcc.target/i386/no-callee-saved-1.c scan-assembler-not push
FAIL: gcc.target/i386/no-callee-saved-2.c scan-assembler-not push

In both cases, the tests expect the Linux/x86 default of
-fomit-frame-pointer, while Solaris/x86 defaults to
-fno-omit-frame-pointer.

So this patch explicitly specifies -fomit-frame-pointer.

Tested on i386-pc-solaris2.11 (as and gas) and i686-pc-linux-gnu.

Ok for trunk?

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2024-01-30  Rainer Orth  

gcc/testsuite:
* gcc.target/i386/no-callee-saved-1.c: Add -fomit-frame-pointer to
dg-options.
* gcc.target/i386/no-callee-saved-2.c: Likewise.

# HG changeset patch
# Parent  6cd45b5f542c222744120356f69afdaebd618627
testsuite: i386: Fix gcc.target/i386/no-callee-saved-1.c etc. on Solaris/x86

diff --git a/gcc/testsuite/gcc.target/i386/no-callee-saved-1.c b/gcc/testsuite/gcc.target/i386/no-callee-saved-1.c
--- a/gcc/testsuite/gcc.target/i386/no-callee-saved-1.c
+++ b/gcc/testsuite/gcc.target/i386/no-callee-saved-1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -mtune-ctrl=^prologue_using_move,^epilogue_using_move" } */
+/* { dg-options "-O2 -mtune-ctrl=^prologue_using_move,^epilogue_using_move -fomit-frame-pointer" } */
 
 extern int bar (int)
 #ifndef __x86_64__
diff --git a/gcc/testsuite/gcc.target/i386/no-callee-saved-2.c b/gcc/testsuite/gcc.target/i386/no-callee-saved-2.c
--- a/gcc/testsuite/gcc.target/i386/no-callee-saved-2.c
+++ b/gcc/testsuite/gcc.target/i386/no-callee-saved-2.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -mtune-ctrl=^prologue_using_move,^epilogue_using_move" } */
+/* { dg-options "-O2 -mtune-ctrl=^prologue_using_move,^epilogue_using_move -fomit-frame-pointer" } */
 
 extern int bar (int) __attribute__ ((no_caller_saved_registers))
 #ifndef __x86_64__

[PATCH] testsuite: i386: Fix gcc.target/i386/pr38534-1.c etc. on Solaris/x86

2024-01-31 Thread Rainer Orth

The gcc.target/i386/pr38534-1.c etc. tests FAIL on 32 and 64-bit
Solaris/x86:

FAIL: gcc.target/i386/pr38534-1.c scan-assembler-not push
FAIL: gcc.target/i386/pr38534-2.c scan-assembler-not push
FAIL: gcc.target/i386/pr38534-3.c scan-assembler-not push
FAIL: gcc.target/i386/pr38534-4.c scan-assembler-not push

The tests assume the Linux/x86 default of -fomit-frame-pointer, while
Solaris/x86 defaults to -fno-omit-frame-pointer.

Fixed by specifying -fomit-frame-pointer explicitly.

Tested on i386-pc-solaris2.11 and i686-pc-linux-gnu.

Ok for trunk?

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2024-01-30  Rainer Orth  

gcc/testsuite:
* gcc.target/i386/pr38534-1.c: Add -fomit-frame-pointer to
dg-options.
* gcc.target/i386/pr38534-2.c: Likewise.
* gcc.target/i386/pr38534-3.c: Likewise.
* gcc.target/i386/pr38534-4.c: Likewise.

# HG changeset patch
# Parent  002cd7277f8ae2677784c606659300d27e7342a4
testsuite: i386: Fix gcc.target/i386/pr38534-1.c etc. on Solaris/x86

diff --git a/gcc/testsuite/gcc.target/i386/pr38534-1.c b/gcc/testsuite/gcc.target/i386/pr38534-1.c
--- a/gcc/testsuite/gcc.target/i386/pr38534-1.c
+++ b/gcc/testsuite/gcc.target/i386/pr38534-1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -mtune-ctrl=^prologue_using_move,^epilogue_using_move" } */
+/* { dg-options "-O2 -mtune-ctrl=^prologue_using_move,^epilogue_using_move -fomit-frame-pointer" } */
 
 #define ARRAY_SIZE 256
 
diff --git a/gcc/testsuite/gcc.target/i386/pr38534-2.c b/gcc/testsuite/gcc.target/i386/pr38534-2.c
--- a/gcc/testsuite/gcc.target/i386/pr38534-2.c
+++ b/gcc/testsuite/gcc.target/i386/pr38534-2.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -mtune-ctrl=^prologue_using_move,^epilogue_using_move" } */
+/* { dg-options "-O2 -mtune-ctrl=^prologue_using_move,^epilogue_using_move -fomit-frame-pointer" } */
 
 extern void bar (void) __attribute__ ((no_callee_saved_registers));
 extern void fn (void) __attribute__ ((noreturn));
diff --git a/gcc/testsuite/gcc.target/i386/pr38534-3.c b/gcc/testsuite/gcc.target/i386/pr38534-3.c
--- a/gcc/testsuite/gcc.target/i386/pr38534-3.c
+++ b/gcc/testsuite/gcc.target/i386/pr38534-3.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -mtune-ctrl=^prologue_using_move,^epilogue_using_move" } */
+/* { dg-options "-O2 -mtune-ctrl=^prologue_using_move,^epilogue_using_move -fomit-frame-pointer" } */
 
 typedef void (*fn_t) (void) __attribute__ ((no_callee_saved_registers));
 extern fn_t bar;
diff --git a/gcc/testsuite/gcc.target/i386/pr38534-4.c b/gcc/testsuite/gcc.target/i386/pr38534-4.c
--- a/gcc/testsuite/gcc.target/i386/pr38534-4.c
+++ b/gcc/testsuite/gcc.target/i386/pr38534-4.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -mtune-ctrl=^prologue_using_move,^epilogue_using_move" } */
+/* { dg-options "-O2 -mtune-ctrl=^prologue_using_move,^epilogue_using_move -fomit-frame-pointer" } */
 
 typedef void (*fn_t) (void) __attribute__ ((no_callee_saved_registers));
 extern void fn (void) __attribute__ ((noreturn));

[PATCH] uninit-pr108968-register.c: use __UINTPTR_TYPE__ for LLP64

2024-01-31 Thread Jonathan Yong

Ensure sp variable is long enough by using __UINTPTR_TYPE__ for
rsp.

Attached patch okay? Changes unsigned long to __UINTPTR_TYPE__.

From 8b5e79e1345d99ec6d3595013a20a9c672edb403 Mon Sep 17 00:00:00 2001
From: Jonathan Yong <10wa...@gmail.com>
Date: Wed, 31 Jan 2024 13:31:30 +
Subject: [PATCH] uninit-pr108968-register.c: use __UINTPTR_TYPE__ for LLP64

Ensure sp variable is long enough by using __UINTPTR_TYPE__ for
rsp.

gcc/testsuite/ChangeLog:

	* c-c++-common/analyzer/uninit-pr108968-register.c:
	Use __UINTPTR_TYPE__ instead of unsigned long for LLP64.
---
 gcc/testsuite/c-c++-common/analyzer/uninit-pr108968-register.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/c-c++-common/analyzer/uninit-pr108968-register.c b/gcc/testsuite/c-c++-common/analyzer/uninit-pr108968-register.c
index a76c09e7b14..e9a1c21990b 100644
--- a/gcc/testsuite/c-c++-common/analyzer/uninit-pr108968-register.c
+++ b/gcc/testsuite/c-c++-common/analyzer/uninit-pr108968-register.c
@@ -4,6 +4,6 @@
 struct cpu_info {};
 struct cpu_info *get_cpu_info(void)
 {
-  register unsigned long sp asm("rsp");
+  register __UINTPTR_TYPE__ sp asm("rsp");
   return (struct cpu_info *)((sp | (STACK_SIZE - 1)) + 1) - 1; /* { dg-bogus "use of uninitialized value 'sp'" } */
 }
-- 
2.43.0
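
For background on the change above: on LLP64 targets such as 64-bit
Windows, unsigned long is 32 bits wide while pointers are 64 bits, so
an address held in unsigned long is truncated.  A minimal sketch,
assuming a GCC-compatible compiler that predefines __UINTPTR_TYPE__.
The addr_t alias, the helper names and the STACK_SIZE value are
illustrative assumptions and are not taken from the test.

```c
/* Hypothetical alias for the compiler-provided pointer-sized type.  */
typedef __UINTPTR_TYPE__ addr_t;

/* Illustrative value only; the real test defines its own STACK_SIZE.  */
#define STACK_SIZE 4096

/* Last byte of the STACK_SIZE-aligned block containing ADDR, the same
   "sp | (STACK_SIZE - 1)" step the test performs on the stack pointer.  */
addr_t
block_last_byte (addr_t addr)
{
  return addr | (STACK_SIZE - 1);
}

/* A pointer survives a round trip through addr_t.  */
void *
round_trip (void *p)
{
  return (void *) (addr_t) p;
}

/* Nonzero on LP64 and ILP32, zero on LLP64.  */
int
unsigned_long_holds_pointer (void)
{
  return sizeof (unsigned long) >= sizeof (void *);
}
```

On an LLP64 target unsigned_long_holds_pointer returns 0, which is the
case the patch guards against.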

Re: [PATCH v3 2/2] arm: Add support for MVE Tail-Predicated Low Overhead Loops

2024-01-31 Thread Richard Earnshaw (lists)

On 30/01/2024 14:09, Andre Simoes Dias Vieira wrote:
> Hi Richard,
> 
> Thanks for the reviews, I'm making these changes but just a heads up.
> 
> When hardcoding LR_REGNUM like this we need to change the way we compare the 
> register in doloop_condition_get. This function currently compares the rtx 
> nodes by address, which I think happens to be fine before we assign hard 
> registers, as I suspect we always share the rtx node for the same pseudo, but 
> when assigning registers it seems like we create copies, so things like:
> `XEXP (inc_src, 0) == reg` will fail for
> inc_src: (plus (reg LR) (const_int -n))
> reg: (reg LR)
> 
> Instead I will substitute the operand '==' with calls to 'rtx_equal_p (op1, 
> op2, NULL)'.

Yes, that's fine.

R.
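
The distinction discussed above, address comparison versus structural
comparison of expression nodes, can be sketched with a toy tree.  This
is an illustration under stated assumptions: struct node and
node_equal_p are hypothetical and do not reproduce GCC's rtx
representation or the real rtx_equal_p.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy expression node; not GCC's rtx.  */
enum node_code { NODE_REG, NODE_CONST_INT, NODE_PLUS };

struct node
{
  enum node_code code;
  long value;                   /* register number or constant */
  const struct node *op0, *op1; /* operands, used by NODE_PLUS only */
};

/* Structural equality: two distinct copies of (reg 14) compare equal
   even though their addresses differ.  */
bool
node_equal_p (const struct node *a, const struct node *b)
{
  if (a == b)
    return true;
  if (a == NULL || b == NULL || a->code != b->code)
    return false;

  switch (a->code)
    {
    case NODE_REG:
    case NODE_CONST_INT:
      return a->value == b->value;
    case NODE_PLUS:
      return node_equal_p (a->op0, b->op0) && node_equal_p (a->op1, b->op1);
    }
  return false;
}
```

Comparing with == only works while every use shares one node, which is
the assumption that stops holding once copies of the hard register
appear.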

> 
> Sound good?
> 
> Kind regards,
> Andre
> 
> 
> From: Richard Earnshaw (lists) 
> Sent: Tuesday, January 30, 2024 11:36 AM
> To: Andre Simoes Dias Vieira; gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov; Stam Markianos-Wright
> Subject: Re: [PATCH v3 2/2] arm: Add support for MVE Tail-Predicated Low 
> Overhead Loops
> 
> On 19/01/2024 14:40, Andre Vieira wrote:
>>
>> Respin after comments from Kyrill and rebase. I also removed an if-then-else
>> construct in arm_mve_check_reg_origin_is_num_elems similar to the other 
>> functions
>> Kyrill pointed out.
>>
>> After an earlier comment from Richard Sandiford I also added comments to the
>> two tail predication patterns added to explain the need for the unspecs.
> 
> [missing ChangeLog]
> 
> I'm just going to focus on loop-doloop.c in this reply, I'll respond to the 
> other bits in a follow-up.
> 
>   2)  (set (reg) (plus (reg) (const_int -1))
> - (set (pc) (if_then_else (reg != 0)
> -(label_ref (label))
> -(pc))).
> +(set (pc) (if_then_else (reg != 0)
> +(label_ref (label))
> +(pc))).
> 
>   Some targets (ARM) do the comparison before the branch, as in the
>   following form:
> 
> - 3) (parallel [(set (cc) (compare ((plus (reg) (const_int -1), 0)))
> -   (set (reg) (plus (reg) (const_int -1)))])
> -(set (pc) (if_then_else (cc == NE)
> ...
> 
> 
> This comment is becoming confusing.  Really the text leading up to 3)... 
> should be inside 3.  Something like:
> 
>   3) Some targets (ARM) do the comparison before the branch, as in the
>   following form:
> 
>   (parallel [(set (cc) (compare (plus (reg) (const_int -1)) 0))
>  (set (reg) (plus (reg) (const_int -1)))])
>   (set (pc) (if_then_else (cc == NE)
>   (label_ref (label))
>   (pc)))])
> 
> 
> The same issue on the comment structure also applies to the new point 4...
> 
> +  The ARM target also supports a special case of a counter that 
> decrements
> +  by `n` and terminating in a GTU condition.  In that case, the compare 
> and
> +  branch are all part of one insn, containing an UNSPEC:
> +
> +  4) (parallel [
> +   (set (pc)
> +   (if_then_else (gtu (unspec:SI [(plus:SI (reg:SI 14 lr)
> +   (const_int -n))])
> +  (const_int n-1]))
> +   (label_ref)
> +   (pc)))
> +   (set (reg:SI 14 lr)
> +(plus:SI (reg:SI 14 lr)
> + (const_int -n)))
> + */
> 
> I think this needs a bit more clarification.  Specifically that this 
> construct supports a predicated vectorized do loop.  Also, the placement of 
> the unspec inside the comparison is ugly and unnecessary.  It should be 
> sufficient to have the unspec inside a USE expression, which the mid-end can 
> then ignore entirely.  So
> 
> (parallel
>  [(set (pc) (if_then_else (gtu (plus (reg) (const_int -n))
>(const_int n-1))
>   (label_ref) (pc)))
>   (set (reg) (plus (reg) (const_int -n)))
>   (additional clobbers and uses)])
> 
> For Arm, we then add a (use (unspec [(const_int 0)] N)) that is specific to 
> this pattern to stop anything else from matching it.
> 
> Note that we don't need to mention that the register is 'LR' or the modes, 
> those are specific to a particular backend, not the generic pattern we want 
> to match.
> 
> +  || !CONST_INT_P (XEXP (inc_src, 1))
> +  || INTVAL (XEXP (inc_src, 1)) >= 0)
>  return 0;
> +  int dec_num = abs (INTVAL (XEXP (inc_src, 1)));
> 
> We can just use '-INTVAL(...)' here, we've verified just above that the 
> constant is negative.
> 
> -  if ((XEXP (condition, 0) == reg)
> +  /* For the ARM special case of having a GTU: re-form the condition without
> + the unspec for the benefit of the middle-end.  */
> +  if (GET_CODE (condition) == GTU)
> +{

Re: [PATCH RFA] asan: poisoning promoted statics [PR113531]

2024-01-31 Thread Jason Merrill


On 1/31/24 03:51, Richard Biener wrote:

On Wed, Jan 31, 2024 at 4:38 AM Jason Merrill  wrote:


Tested x86_64-pc-linux-gnu, OK for trunk?


It's a quite "late" fixup, I suppose you have tried to avoid marking it
during gimplification?  I see we do parts of this during BIND_EXPR
processing which is indeed a bit early but possibly difficult to rectify.


I also considered


diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
index 7f79b3cc7e6..c906d927a09 100644
--- a/gcc/gimplify.cc
+++ b/gcc/gimplify.cc
@@ -1249,6 +1249,10 @@ asan_poison_variable (tree decl, bool poison, gimple_stmt_iterator *it,
   if (zerop (unit_size))
 return;

+  /* Or variables in static storage.  */
+  if (TREE_STATIC (decl))
+return;
+
   /* It's necessary to have all stack variables aligned to ASAN granularity
  bytes.  */
   gcc_assert (!hwasan_sanitize_p () || hwasan_sanitize_stack_p ());


which fixes the bug by avoiding the poison mark, but it's too late to 
avoid the unpoison mark--though the unpoison is still removed by sanopt, 
so the end result is the same.  I decided to send the other patch 
because it applies to both, but I'm happy with either approach.


Jason

Re: [PATCH 1/3] vect: Pass stmt_vec_info to TARGET_SIMD_CLONE_USABLE

2024-01-31 Thread Andre Vieira (lists)





On 31/01/2024 12:13, Richard Biener wrote:

On Wed, 31 Jan 2024, Richard Biener wrote:


On Tue, 30 Jan 2024, Andre Vieira wrote:



This patch adds stmt_vec_info to TARGET_SIMD_CLONE_USABLE to make sure the
target can reject a simd_clone based on the vector mode it is using.
This is needed because for VLS SVE vectorization the vectorizer accepts
Advanced SIMD simd clones when vectorizing using SVE types because the simdlens
might match.  This will cause type errors later on.

Other targets do not currently need to use this argument.


Can you instead pass down the mode?


Thinking about that again the cgraph_simd_clone info in the clone
should have sufficient information to disambiguate.  If it doesn't
then we should amend it.

Richard.


Hi Richard,

Thanks for the review, I don't think cgraph_simd_clone_info is the right 
place to pass down this information, since this is information about the 
caller rather than the simdclone itself. What we are trying to achieve 
here is to let the vectorizer accept or reject simdclones 
based on the ISA we are vectorizing for. To distinguish between SVE and 
Advanced SIMD ISAs we use modes, I am also not sure that's ideal but it 
is what we currently use. So to answer your earlier question, yes I can 
also pass down mode if that's preferable.


Regards,
Andre

Re: [PATCH] c++: add deprecation notice for -fconcepts-ts

2024-01-31 Thread Jason Merrill


On 1/31/24 03:40, Richard Biener wrote:

On Wed, Jan 31, 2024 at 12:19 AM Marek Polacek  wrote:


Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
We plan to deprecate -fconcepts-ts in GCC 15 and remove the flag_concepts_ts
code.  This note is an admonishing reminder to convert the Concepts TS
code to C++20 Concepts.


What does "deprecated in GCC 15" mean?  Given you output the notice with
GCC 14 it would be better to state when it's going to be removed -
it's effectively
"deprecated" right now then?  Or will it continue to "work" forever
until it bitrots?


Agreed, it's deprecated now.  We talked about it having no effect in GCC 
15; the message could say that.  Or we could leave it vague and just say 
it's deprecated.


Please also update invoke.texi.

Jason

Re: [PATCH] uninit-pr108968-register.c: use __UINTPTR_TYPE__ for LLP64

2024-01-31 Thread Richard Biener

On Wed, Jan 31, 2024 at 2:39 PM Jonathan Yong <10wa...@gmail.com> wrote:
>
> Ensure sp variable is long enough by using __UINTPTR_TYPE__ for
> rsp.
>
> Attached patch okay? Changes unsigned long to __UINTPTR_TYPE__.

OK.

Re: [PATCH] c++: add deprecation notice for -fconcepts-ts

2024-01-31 Thread Richard Biener

On Wed, Jan 31, 2024 at 2:53 PM Jason Merrill  wrote:
>
> On 1/31/24 03:40, Richard Biener wrote:
> > On Wed, Jan 31, 2024 at 12:19 AM Marek Polacek  wrote:
> >>
> >> Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> >>
> >> -- >8 --
> >> We plan to deprecate -fconcepts-ts in GCC 15 and remove the 
> >> flag_concepts_ts
> >> code.  This note is an admonishing reminder to convert the Concepts TS
> >> code to C++20 Concepts.
> >
> > What does "deprecated in GCC 15" mean?  Given you output the notice with
> > GCC 14 it would be better to state when it's going to be removed -
> > it's effectively
> > "deprecated" right now then?  Or will it continue to "work" forever
> > until it bitrots?
>
> Agreed, it's deprecated now.  We talked about it having no effect in GCC
> 15; the message could say that.  Or we could leave it vague and just say
> it's deprecated.
>
> Please also update invoke.texi.

Btw, should -std=c++20 -fconcepts-ts be rejected?  I suppose -std=c++20
enables -fconcepts by default, it also seems to accept -std=c++20
-fno-concepts ...

Richard.

>
> Jason
>

Re: [PATCH 1/3] vect: Pass stmt_vec_info to TARGET_SIMD_CLONE_USABLE

2024-01-31 Thread Richard Biener

On Wed, 31 Jan 2024, Andre Vieira (lists) wrote:

> 
> 
> On 31/01/2024 12:13, Richard Biener wrote:
> > On Wed, 31 Jan 2024, Richard Biener wrote:
> > 
> >> On Tue, 30 Jan 2024, Andre Vieira wrote:
> >>
> >>>
> >>> This patch adds stmt_vec_info to TARGET_SIMD_CLONE_USABLE to make sure the
> >>> target can reject a simd_clone based on the vector mode it is using.
> >>> This is needed because for VLS SVE vectorization the vectorizer accepts
> >>> Advanced SIMD simd clones when vectorizing using SVE types because the
> >>> simdlens
> >>> might match.  This will cause type errors later on.
> >>>
> >>> Other targets do not currently need to use this argument.
> >>
> >> Can you instead pass down the mode?
> > 
> > Thinking about that again the cgraph_simd_clone info in the clone
> > should have sufficient information to disambiguate.  If it doesn't
> > then we should amend it.
> > 
> > Richard.
> 
> Hi Richard,
> 
> Thanks for the review, I don't think cgraph_simd_clone_info is the right place
> to pass down this information, since this is information about the caller
> rather than the simdclone itself. What we are trying to achieve here is making
> the vectorizer being able to accept or reject simdclones based on the ISA we
> are vectorizing for. To distinguish between SVE and Advanced SIMD ISAs we use
> modes, I am also not sure that's ideal but it is what we currently use. So to
> answer your earlier question, yes I can also pass down mode if that's
> preferable.

Note cgraph_simd_clone_info has simdlen and we seem to check elsewhere
whether that's POLY or constant.  I wonder how aarch64_sve_mode_p
comes into play here which in the end classifies VLS SVE modes as
non-SVE?

> Regards,
> Andre
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Unreviewed patches

2024-01-31 Thread Rainer Orth

Three patches have remained unreviewed for a week or more:

c++: Fix g++.dg/ext/attr-section2.C etc. with Solaris/SPARC as
https://gcc.gnu.org/pipermail/gcc-patches/2024-January/643434.html

This one may even be obvious.

testsuite: i386: Fix gcc.target/i386/pr70321.c on 32-bit Solaris/x86
https://gcc.gnu.org/pipermail/gcc-patches/2024-January/643771.html

testsuite: i386: Fix gcc.target/i386/avx512vl-stv-rotatedi-1.c on 32-bit Solaris/x86
https://gcc.gnu.org/pipermail/gcc-patches/2024-January/643774.html

Those two require an x86 maintainer.

Thanks.
Rainer

-- 
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [PATCH 1/3] vect: Pass stmt_vec_info to TARGET_SIMD_CLONE_USABLE

2024-01-31 Thread Richard Biener

On Wed, 31 Jan 2024, Richard Biener wrote:

> On Wed, 31 Jan 2024, Andre Vieira (lists) wrote:
> 
> > 
> > 
> > On 31/01/2024 12:13, Richard Biener wrote:
> > > On Wed, 31 Jan 2024, Richard Biener wrote:
> > > 
> > >> On Tue, 30 Jan 2024, Andre Vieira wrote:
> > >>
> > >>>
> > >>> This patch adds stmt_vec_info to TARGET_SIMD_CLONE_USABLE to make sure the
> > >>> target can reject a simd_clone based on the vector mode it is using.
> > >>> This is needed because for VLS SVE vectorization the vectorizer accepts
> > >>> Advanced SIMD simd clones when vectorizing using SVE types because the simdlens
> > >>> might match.  This will cause type errors later on.
> > >>>
> > >>> Other targets do not currently need to use this argument.
> > >>
> > >> Can you instead pass down the mode?
> > > 
> > > Thinking about that again the cgraph_simd_clone info in the clone
> > > should have sufficient information to disambiguate.  If it doesn't
> > > then we should amend it.
> > > 
> > > Richard.
> > 
> > Hi Richard,
> > 
> > Thanks for the review, I don't think cgraph_simd_clone_info is the right place
> > to pass down this information, since this is information about the caller
> > rather than the simdclone itself. What we are trying to achieve here is making
> > the vectorizer being able to accept or reject simdclones based on the ISA we
> > are vectorizing for. To distinguish between SVE and Advanced SIMD ISAs we use
> > modes, I am also not sure that's ideal but it is what we currently use. So to
> > answer your earlier question, yes I can also pass down mode if that's
> > preferable.
> 
> Note cgraph_simd_clone_info has simdlen and we seem to check elsewhere
> whether that's POLY or constant.  I wonder how aarch64_sve_mode_p
> comes into play here which in the end classifies VLS SVE modes as
> non-SVE?

Maybe it's just a bit non-obvious as you key on mangling:

 static int
-aarch64_simd_clone_usable (struct cgraph_node *node)
+aarch64_simd_clone_usable (struct cgraph_node *node, stmt_vec_info stmt_vinfo)
 {
   switch (node->simdclone->vecsize_mangle)
     {
     case 'n':
       if (!TARGET_SIMD)
         return -1;
+      if (STMT_VINFO_VECTYPE (stmt_vinfo)
+          && aarch64_sve_mode_p (TYPE_MODE (STMT_VINFO_VECTYPE (stmt_vinfo))))
+        return -1;

?  What does 'n' mean?  It's documented as

  /* The mangling character for a given vector size.  This is used
 to determine the ISA mangling bit as specified in the Intel
 Vector ABI.  */
  unsigned char vecsize_mangle;

which is slightly misleading.
