[PATCH v1] Match: Support form 1 for scalar signed integer .SAT_ADD

2024-08-05 Thread pan2 . li
From: Pan Li 

This patch adds support for form 1 of the scalar signed integer
.SAT_ADD, as in the example below:

Form 1:
  #define DEF_SAT_S_ADD_FMT_1(T)             \
  T __attribute__((noinline))                \
  sat_s_add_##T##_fmt_1 (T x, T y)           \
  {                                          \
    T min = (T)1u << (sizeof (T) * 8 - 1);   \
    T max = min - 1;                         \
    return (x ^ y) < 0                       \
      ? (T)(x + y)                           \
      : ((T)(x + y) ^ x) >= 0                \
        ? (T)(x + y)                         \
        : x < 0 ? min : max;                 \
  }

DEF_SAT_S_ADD_FMT_1 (int64_t)
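
For readers who want to try the selection logic outside the testsuite, here
is a minimal C sketch of form 1 for int64_t.  It is an illustration only, not
part of the patch: the name sat_s_add_i64 is hypothetical, the sum is computed
in uint64_t so that the wrap-around is not signed-overflow undefined
behaviour, and INT64_MIN/INT64_MAX stand in for the shifted constants.  It
assumes the two's-complement wrap on conversion back to int64_t that GCC
documents.

```c
#include <stdint.h>

/* Hypothetical helper mirroring form 1 for int64_t.  */
int64_t
sat_s_add_i64 (int64_t x, int64_t y)
{
  /* Wrapping sum, computed unsigned to avoid signed-overflow UB.  */
  int64_t sum = (int64_t) ((uint64_t) x + (uint64_t) y);

  /* Operands of different sign can never overflow.  */
  if ((x ^ y) < 0)
    return sum;

  /* Same sign: no overflow if the sum keeps the sign of x.  */
  if ((sum ^ x) >= 0)
    return sum;

  /* Overflow: saturate towards the sign of x.  */
  return x < 0 ? INT64_MIN : INT64_MAX;
}
```

Adding 1 to INT64_MAX stays at INT64_MAX, and adding -1 to INT64_MIN stays at
INT64_MIN; all non-overflowing inputs return the plain sum.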

If the backend implements the ssadd3 pattern, the difference before and
after this patch is visible in the dumps below.

Before this patch:
  __attribute__((noinline))
  int64_t sat_s_add_int64_t_fmt_1 (int64_t x, int64_t y)
  {
    long int _1;
    long int _2;
    long int _3;
    int64_t _4;
    long int _7;
    _Bool _9;
    long int _10;
    long int _11;
    long int _12;
    long int _13;

  ;;   basic block 2, loop depth 0
  ;;    pred:       ENTRY
    _1 = x_5(D) ^ y_6(D);
    _13 = x_5(D) + y_6(D);
    _3 = x_5(D) ^ _13;
    _2 = ~_1;
    _7 = _2 & _3;
    if (_7 >= 0)
      goto <bb 4>; [59.00%]
    else
      goto <bb 3>; [41.00%]
  ;;    succ:       4
  ;;                3

  ;;   basic block 3, loop depth 0
  ;;    pred:       2
    _9 = x_5(D) < 0;
    _10 = (long int) _9;
    _11 = -_10;
    _12 = _11 ^ 9223372036854775807;
  ;;    succ:       4

  ;;   basic block 4, loop depth 0
  ;;    pred:       2
  ;;                3
    # _4 = PHI <_13(2), _12(3)>
    return _4;
  ;;    succ:       EXIT

  }
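
Read back as C, the statements in this dump amount to the sketch below.  The
function name is hypothetical and the unsigned addition is my substitution to
keep the example free of signed-overflow undefined behaviour; the comments
name the SSA values each expression corresponds to.

```c
#include <stdint.h>

/* Hypothetical helper: the GIMPLE sequence written back as C.  */
int64_t
sat_s_add_i64_from_dump (int64_t x, int64_t y)
{
  /* _13: wrapping sum.  */
  int64_t sum = (int64_t) ((uint64_t) x + (uint64_t) y);

  /* _7: the sign bit is set only when x and y agree in sign (~_1)
     and the sum disagrees with x (_3).  */
  int64_t overflow = ~(x ^ y) & (x ^ sum);

  if (overflow >= 0)
    return sum;

  /* _12: -(x < 0) is all-ones for negative x and zero otherwise, so the
     XOR yields INT64_MIN for negative x and INT64_MAX otherwise.  */
  return -(int64_t) (x < 0) ^ INT64_MAX;
}
```

This is the shape that the new match.pd pattern has to recognise: a
comparison of the combined sign test against zero, the plain sum on one arm
and the negate/XOR select on the other.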

After this patch:
  __attribute__((noinline))
  int64_t sat_s_add_int64_t_fmt_1 (int64_t x, int64_t y)
  {
    int64_t _4;

  ;;   basic block 2, loop depth 0
  ;;    pred:       ENTRY
    _4 = .SAT_ADD (x_5(D), y_6(D)); [tail call]
    return _4;
  ;;    succ:       EXIT

  }

The following test suites passed for this patch:
* The rv64gcv full regression test.
* The x86 bootstrap test.
* The x86 full regression test.

gcc/ChangeLog:

* match.pd: Add the matching for signed .SAT_ADD.
* tree-ssa-math-opts.cc (gimple_signed_integer_sat_add): Add new
matching func decl.
(match_unsigned_saturation_add): Try signed .SAT_ADD and rename
to ...
(match_saturation_add): ... here.
(math_opts_dom_walker::after_dom_children): Update the above renamed
func from caller.

Signed-off-by: Pan Li 
---
 gcc/match.pd  | 14 +
 gcc/tree-ssa-math-opts.cc | 42 ++-
 2 files changed, 51 insertions(+), 5 deletions(-)

diff --git a/gcc/match.pd b/gcc/match.pd
index c9c8478d286..0a2ffc733d3 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3311,6 +3311,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   }
   (if (otype_precision < itype_precision && wi::eq_p (trunc_max, int_cst))
 
+/* Signed saturation add, case 1:
+   T min = (T)1u << (sizeof (T) * 8 - 1);
+   T max = min - 1;
+   SAT_S_ADD = (X ^ Y) < 0
+ ? (X + Y)
+ : ((T)(X + Y) ^ X) >= 0 ? (X + Y) : X < 0 ? min : max.  */
+(match (signed_integer_sat_add @0 @1)
+  (cond^ (ge (bit_and:c (bit_xor @0 (convert? @2)) (bit_not (bit_xor @0 @1)))
+   integer_zerop)
+   (convert? (plus@2 (convert1? @0) (convert1? @1)))
+   (bit_xor (negate (convert (lt @0 integer_zerop))) max_value))
+ (if (INTEGRAL_TYPE_P (type) && !TYPE_UNSIGNED (type)
+  && types_match (type, @0, @1
+
 /* x >  y  &&  x != XXX_MIN  -->  x > y
x >  y  &&  x == XXX_MIN  -->  false . */
 (for eqne (eq ne)
diff --git a/gcc/tree-ssa-math-opts.cc b/gcc/tree-ssa-math-opts.cc
index 8d96a4c964b..d5c9b475f72 100644
--- a/gcc/tree-ssa-math-opts.cc
+++ b/gcc/tree-ssa-math-opts.cc
@@ -4023,6 +4023,8 @@ extern bool gimple_unsigned_integer_sat_add (tree, tree*, 
tree (*)(tree));
 extern bool gimple_unsigned_integer_sat_sub (tree, tree*, tree (*)(tree));
 extern bool gimple_unsigned_integer_sat_trunc (tree, tree*, tree (*)(tree));
 
+extern bool gimple_signed_integer_sat_add (tree, tree*, tree (*)(tree));
+
 static void
 build_saturation_binary_arith_call (gimple_stmt_iterator *gsi, internal_fn fn,
tree lhs, tree op_0, tree op_1)
@@ -4072,7 +4074,8 @@ match_unsigned_saturation_add (gimple_stmt_iterator *gsi, 
gassign *stmt)
 }
 
 /*
- * Try to match saturation unsigned add with PHI.
+ * Try to match saturation add with PHI.
+ * For unsigned integer:
  *:
  *   _1 = x_3(D) + y_4(D);
  *   if (_1 >= x_3(D))

[committed] libgomp.texi: Add OpenMP TR13 routines to @menu (commented out)

2024-08-05 Thread Tobias Burnus
Not user visible, but I use this to keep track of both implemented OpenMP 
runtime routines that still have to be documented and routines that are 
still to be implemented (and then documented).


This commit (r15-2713-g1a5734135d265a) adds those routines added in 
OpenMP's third 6.0 preview (Technical Report 13).


Tobias

PS: The routines are again reordered in OpenMP; the question is whether 
we want to follow suit or keep the current ordering. I only reordered 
the undocumented ones inside @menu, and only in those @menu blocks that I modified.
commit 1a5734135d265a7b363ead9f821676a2a358969b
Author: Tobias Burnus 
Date:   Mon Aug 5 09:18:29 2024 +0200

libgomp.texi: Add OpenMP TR13 routines to @menu (commented out)

To keep track of missing routine documentation (both implemented and not),
the libgomp.texi file contains all non-OMPT routines as commented items
in @menu. This commit adds the routines added in TR13 as commented fixme
items.

libgomp/ChangeLog:

* libgomp.texi (OpenMP Runtime Library Routines): Add TR13 routines
to @menu (commented out).
---
 libgomp/libgomp.texi | 27 +--
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index 07cd75124b0..c6759dd03bc 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -1591,12 +1591,18 @@ They have C linkage and do not throw exceptions.
 @menu
 * omp_get_num_procs::   Number of processors online
 @c * omp_get_max_progress_width:: /TR11
+@c * omp_get_device_from_uid::  /TR13
+@c * omp_get_uid_from_device::  /TR13
 * omp_set_default_device::  Set the default device for target regions
 * omp_get_default_device::  Get the default device for target regions
 * omp_get_num_devices:: Number of target devices
 * omp_get_device_num::  Get device that current thread is running on
 * omp_is_initial_device::   Whether executing on the host device
 * omp_get_initial_device::  Device number of host device
+@c * omp_get_device_num_teams::  /TR13
+@c * omp_set_device_num_teams::  /TR13
+@c * omp_get_device_teams_thread_limit::  /TR13
+@c * omp_set_device_teams_thread_limit::  /TR13
 @end menu
 
 
@@ -2813,8 +2819,27 @@ Routines to manage and allocate memory on the current device.
 They have C linkage and do not throw exceptions.
 
 @menu
+@c * omp_get_devices_memspace:: /TR13
+@c * omp_get_device_memspace:: /TR13
+@c * omp_get_devices_and_host_memspace:: /TR13
+@c * omp_get_device_and_host_memspace:: /TR13
+@c * omp_get_devices_all_memspace:: /TR13
+@c * omp_get_memspace_num_resources:: /TR11
+@c * omp_get_memspace_pagesize:: /TR13
+@c * omp_get_submemspace:: /TR11
+@c * omp_init_mempartitioner:: /TR13
+@c * omp_destroy_mempartitioner:: /TR13
+@c * omp_init_mempartition:: /TR13
+@c * omp_destroy_mempartition:: /TR13
+@c * omp_mempartition_set_part:: /TR13
+@c * omp_mempartition_get_user_data:: /TR13
 * omp_init_allocator:: Create an allocator
 * omp_destroy_allocator:: Destroy an allocator
+@c * omp_get_devices_allocator:: /TR13
+@c * omp_get_device_allocator:: /TR13
+@c * omp_get_devices_and_host_allocator:: /TR13
+@c * omp_get_device_and_host_allocator:: /TR13
+@c * omp_get_devices_all_allocator:: /TR13
 * omp_set_default_allocator:: Set the default allocator
 * omp_get_default_allocator:: Get the default allocator
 * omp_alloc:: Memory allocation with an allocator
@@ -2823,8 +2848,6 @@ They have C linkage and do not throw exceptions.
 * omp_calloc:: Allocate nullified memory with an allocator
 * omp_aligned_calloc:: Allocate nullified aligned memory with an allocator
 * omp_realloc:: Reallocate memory allocated with OpenMP routines
-@c * omp_get_memspace_num_resources:: /TR11
-@c * omp_get_submemspace:: /TR11
 @end menu
 
 


[PATCH v3] Improve bad error message with stray semicolon in initializer (and related) [PR101232]

2024-08-05 Thread Franciszek Witt
Hi, could someone review this patch?

This is built on top of v2 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101232; the only
difference is a fix for error59.C.

I have tested it on an x86_64 Ubuntu 22 machine.


Regards
Franciszek

---

Author: Franciszek Witt 
Date:   Mon Aug 5 09:00:35 2024 +0200

c++: [PR101232]

PR c++/101232

gcc/cp/ChangeLog:

* parser.cc (cp_parser_postfix_expression): Commit to the
parse in case we know it's either a cast or invalid syntax.
(cp_parser_braced_list): Add a heuristic to inform about
missing comma or operator.

gcc/testsuite/ChangeLog:

* g++.dg/parse/error59.C: Change the test so the new error
message is accepted.
* g++.dg/cpp0x/initlist-err1.C: New test.
* g++.dg/cpp0x/initlist-err2.C: New test.
* g++.dg/cpp0x/initlist-err3.C: New test.

---

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 1dd0efaf963..2e0ce1c6ddb 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -7881,8 +7881,13 @@ cp_parser_postfix_expression (cp_parser
*parser, bool address_p, bool cast_p,
 --parser->prevent_constrained_type_specifiers;
/* Parse the cast itself.  */
if (!cp_parser_error_occurred (parser))
- postfix_expression
-   = cp_parser_functional_cast (parser, type);
+ {
+   if (cp_lexer_next_token_is (parser->lexer, CPP_OPEN_BRACE))
+ /* This can only be a cast.  */
+ cp_parser_commit_to_topmost_tentative_parse (parser);
+   postfix_expression
+ = cp_parser_functional_cast (parser, type);
+ }
/* If that worked, we're done.  */
if (cp_parser_parse_definitely (parser))
  break;
@@ -26350,8 +26355,19 @@ cp_parser_braced_list (cp_parser *parser,
bool *non_constant_p /*=nullptr*/)
   else if (non_constant_p)
 *non_constant_p = false;
   /* Now, there should be a trailing `}'.  */
-  location_t finish_loc = cp_lexer_peek_token (parser->lexer)->location;
-  braces.require_close (parser);
+  cp_token * token = cp_lexer_peek_token (parser->lexer);
+  location_t finish_loc = token->location;
+  if (!braces.require_close (parser))
+{
+  /* This is just a heuristic. */
+  if (token->type != CPP_SEMICOLON)
+{
+  inform (finish_loc,
+"probably missing a comma or an operator before");
+  if (cp_parser_skip_to_closing_brace (parser))
+cp_lexer_consume_token (parser->lexer);
+}
+}
   TREE_TYPE (initializer) = init_list_type_node;
   recompute_constructor_flags (initializer);

diff --git a/gcc/testsuite/g++.dg/cpp0x/initlist-err1.C
b/gcc/testsuite/g++.dg/cpp0x/initlist-err1.C
new file mode 100644
index 000..6ea8afb3273
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/initlist-err1.C
@@ -0,0 +1,11 @@
+// PR c++/101232
+// { dg-do compile { target c++11 } }
+
+struct X {
+int a;
+int b;
+};
+
+void f() {
+auto x = X{ 1, 2; };   // { dg-error "21:" }
+}  // { dg-prune-output "expected declaration" }
diff --git a/gcc/testsuite/g++.dg/cpp0x/initlist-err2.C
b/gcc/testsuite/g++.dg/cpp0x/initlist-err2.C
new file mode 100644
index 000..227f519dc19
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/initlist-err2.C
@@ -0,0 +1,11 @@
+// PR c++/101232
+// { dg-do compile { target c++11 } }
+
+struct X {
+int a;
+int b;
+};
+
+void f() {
+auto x = X{ 1 2 }; // { dg-error "19:.*probably" }
+}
diff --git a/gcc/testsuite/g++.dg/cpp0x/initlist-err3.C
b/gcc/testsuite/g++.dg/cpp0x/initlist-err3.C
new file mode 100644
index 000..b77ec9bf4e9
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/initlist-err3.C
@@ -0,0 +1,11 @@
+// PR c++/101232
+// { dg-do compile { target c++11 } }
+
+struct X {
+int a;
+int b;
+};
+
+void f() {
+auto x = X{ 1, {2 };   // { dg-error "expected.*before" }
+}
diff --git a/gcc/testsuite/g++.dg/parse/error59.C
b/gcc/testsuite/g++.dg/parse/error59.C
index 2c44e210366..d782c9b1616 100644
--- a/gcc/testsuite/g++.dg/parse/error59.C
+++ b/gcc/testsuite/g++.dg/parse/error59.C
@@ -3,4 +3,4 @@
 void foo()
 {
   (struct {}x){}; // { dg-error "" }
-}
+} // { dg-excess-errors "" }


[PATCH v1] RISC-V: Update .SAT_TRUNC dump check due to middle-end change

2024-08-05 Thread pan2 . li
From: Pan Li 

Due to a recent middle-end change, update the expected .SAT_TRUNC count
in the expand dump check from 2 to 4.
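
For context, an unsigned saturating truncation from uint16_t to uint8_t
clamps values that do not fit the narrower type.  The scalar sketch below
shows only that semantics; the helper name is hypothetical, and it is not
claimed to be the exact expression that DEF_VEC_SAT_U_TRUNC_FMT_1 expands to.

```c
#include <stdint.h>

/* Hypothetical helper: unsigned saturating truncation, uint16_t to uint8_t.  */
uint8_t
sat_u_trunc_u16_to_u8 (uint16_t x)
{
  /* Values above the narrow maximum clamp to it; others pass through.  */
  return x > UINT8_MAX ? UINT8_MAX : (uint8_t) x;
}
```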

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-1.c: Adjust
asm check times from 2 to 4.

Signed-off-by: Pan Li 
---
 .../gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-1.c   | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git 
a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-1.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-1.c
index 7f047f3f6a2..ae3e44cd57e 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-1.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-1.c
@@ -16,4 +16,4 @@
 */
 DEF_VEC_SAT_U_TRUNC_FMT_1 (uint8_t, uint16_t)
 
-/* { dg-final { scan-rtl-dump-times ".SAT_TRUNC " 2 "expand" } } */
+/* { dg-final { scan-rtl-dump-times ".SAT_TRUNC " 4 "expand" } } */
-- 
2.43.0



Don't override 'LIBS' if '--enable-languages=rust'; use 'CRAB1_LIBS' (was: [PATCH 005/125] gccrs: libgrust: Add format_parser library)

2024-08-05 Thread Thomas Schwinge
Hi!

On 2024-08-01T16:56:01+0200, Arthur Cohen  wrote:
> Compile libformat_parser and link to it.

> --- a/gcc/rust/Make-lang.in
> +++ b/gcc/rust/Make-lang.in

> +LIBS += -ldl -lpthread

That's still not correct.  I've pushed to trunk branch
commit 816c4de4d062c89f5b7a68f68f29b2b033f5b136
"Don't override 'LIBS' if '--enable-languages=rust'; use 'CRAB1_LIBS'",
see attached.


Grüße
 Thomas


From 816c4de4d062c89f5b7a68f68f29b2b033f5b136 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Mon, 5 Aug 2024 10:06:05 +0200
Subject: [PATCH] Don't override 'LIBS' if '--enable-languages=rust'; use
 'CRAB1_LIBS'

Recent commit 6fef4d6ffcab0fec8518adcb05458cba5dbeac25
"gccrs: libgrust: Add format_parser library", added a general override of
'LIBS += -ldl -lpthread' if '--enable-languages=rust'.  This is wrong
conceptually, and will make the build fail on systems not providing such
libraries.  Instead, 'CRAB1_LIBS', added a while ago in
commit 75299e4fe50aa8d9b3ff529e48db4ed246083e64
"rust: Do not link with libdl and libpthread unconditionally", should be used,
and not generally, but for 'crab1' only.

	gcc/rust/
	* Make-lang.in (LIBS): Don't override.
	(crab1$(exeext):): Use 'CRAB1_LIBS'.
---
 gcc/rust/Make-lang.in | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/gcc/rust/Make-lang.in b/gcc/rust/Make-lang.in
index 24229c02770..c3be5f9d81b 100644
--- a/gcc/rust/Make-lang.in
+++ b/gcc/rust/Make-lang.in
@@ -54,8 +54,6 @@ GCCRS_D_OBJS = \
rust/rustspec.o \
$(END)
 
-LIBS += -ldl -lpthread
-
 gccrs$(exeext): $(GCCRS_D_OBJS) $(EXTRA_GCC_OBJS) libcommon-target.a $(LIBDEPS)
 	+$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) -o $@ \
 	  $(GCCRS_D_OBJS) $(EXTRA_GCC_OBJS) libcommon-target.a \
@@ -237,7 +235,7 @@ RUST_LIBDEPS = $(LIBDEPS) $(LIBPROC_MACRO_INTERNAL)
 crab1$(exeext): $(RUST_ALL_OBJS) attribs.o $(BACKEND) $(RUST_LIBDEPS) $(rust.prev)
 	@$(call LINK_PROGRESS,$(INDEX.rust),start)
 	+$(LLINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) -o $@ \
-	  $(RUST_ALL_OBJS) attribs.o $(BACKEND) $(LIBS) $(LIBPROC_MACRO_INTERNAL) $(LIBFORMAT_PARSER) $(BACKENDLIBS)
+	  $(RUST_ALL_OBJS) attribs.o $(BACKEND) $(LIBS) $(CRAB1_LIBS) $(LIBPROC_MACRO_INTERNAL) $(LIBFORMAT_PARSER) $(BACKENDLIBS)
 	@$(call LINK_PROGRESS,$(INDEX.rust),end)
 
 # Build hooks.
-- 
2.34.1



[PATCH] RISC-V: Clarify that Vector Crypto Extensions require Vector Extensions[PR116150]

2024-08-05 Thread Liao Shihua


PR 116150: Zvk* and Zvb* extensions require the v or a zve* extension, but 
GCC currently implies v.

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc: Remove the zvk extension's
implicit expansion of the v extension.
* config/riscv/arch-canonicalize: Ditto.
* config/riscv/riscv.cc (riscv_override_options_internal): Report an error
when a zvb or zvk extension is used without the v extension.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-47.c: add 
v or zve* to -march.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-48.c: 
Ditto.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-49.c: 
Ditto.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-50.c: 
Ditto.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-51.c: 
Ditto.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-52.c: 
Ditto.
* gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-53.c: 
Ditto.
* gcc.target/riscv/rvv/base/zvbc-intrinsic.c: Ditto.
* gcc.target/riscv/rvv/base/zvbc_vx_constraint-1.c: Ditto.
* gcc.target/riscv/rvv/base/zvbc_vx_constraint-2.c: Ditto.
* gcc.target/riscv/rvv/base/zvknhb-intrinsic.c: Ditto.
* gcc.target/riscv/zvbb.c: Ditto.
* gcc.target/riscv/zvbc.c: Ditto.
* gcc.target/riscv/zvkb.c: Ditto.
* gcc.target/riscv/zvkg.c: Ditto.
* gcc.target/riscv/zvkn-1.c: Ditto.
* gcc.target/riscv/zvkn.c: Ditto.
* gcc.target/riscv/zvknc-1.c: Ditto.
* gcc.target/riscv/zvknc-2.c: Ditto.
* gcc.target/riscv/zvknc.c: Ditto.
* gcc.target/riscv/zvkned.c: Ditto.
* gcc.target/riscv/zvkng-1.c: Ditto.
* gcc.target/riscv/zvkng-2.c: Ditto.
* gcc.target/riscv/zvkng.c: Ditto.
* gcc.target/riscv/zvknha.c: Ditto.
* gcc.target/riscv/zvknhb.c: Ditto.
* gcc.target/riscv/zvks-1.c: Ditto.
* gcc.target/riscv/zvks.c: Ditto.
* gcc.target/riscv/zvksc-1.c: Ditto.
* gcc.target/riscv/zvksc-2.c: Ditto.
* gcc.target/riscv/zvksc.c: Ditto.
* gcc.target/riscv/zvksed.c: Ditto.
* gcc.target/riscv/zvksg-1.c: Ditto.
* gcc.target/riscv/zvksg-2.c: Ditto.
* gcc.target/riscv/zvksg.c: Ditto.
* gcc.target/riscv/zvksh.c: Ditto.
* gcc.target/riscv/pr116150-1.c: New test.
* gcc.target/riscv/pr116150-2.c: New test.
* gcc.target/riscv/pr116150-3.c: New test.
* gcc.target/riscv/pr116150-4.c: New test.

---
 gcc/common/config/riscv/riscv-common.cc|  8 
 gcc/config/riscv/arch-canonicalize |  8 
 gcc/config/riscv/riscv.cc  | 14 ++
 gcc/testsuite/gcc.target/riscv/pr116150-1.c|  9 +
 gcc/testsuite/gcc.target/riscv/pr116150-2.c|  8 
 gcc/testsuite/gcc.target/riscv/pr116150-3.c|  8 
 gcc/testsuite/gcc.target/riscv/pr116150-4.c|  8 
 .../base/target_attribute_v_with_intrinsic-47.c|  2 +-
 .../base/target_attribute_v_with_intrinsic-48.c|  2 +-
 .../base/target_attribute_v_with_intrinsic-49.c|  2 +-
 .../base/target_attribute_v_with_intrinsic-50.c|  2 +-
 .../base/target_attribute_v_with_intrinsic-51.c|  2 +-
 .../base/target_attribute_v_with_intrinsic-52.c|  2 +-
 .../base/target_attribute_v_with_intrinsic-53.c|  2 +-
 .../gcc.target/riscv/rvv/base/zvbc-intrinsic.c |  2 +-
 .../riscv/rvv/base/zvbc_vx_constraint-1.c  |  2 +-
 .../riscv/rvv/base/zvbc_vx_constraint-2.c  |  2 +-
 .../gcc.target/riscv/rvv/base/zvknhb-intrinsic.c   |  2 +-
 gcc/testsuite/gcc.target/riscv/zvbb.c  |  4 ++--
 gcc/testsuite/gcc.target/riscv/zvbc.c  |  4 ++--
 gcc/testsuite/gcc.target/riscv/zvkb.c  |  4 ++--
 gcc/testsuite/gcc.target/riscv/zvkg.c  |  4 ++--
 gcc/testsuite/gcc.target/riscv/zvkn-1.c|  4 ++--
 gcc/testsuite/gcc.target/riscv/zvkn.c  |  4 ++--
 gcc/testsuite/gcc.target/riscv/zvknc-1.c   |  4 ++--
 gcc/testsuite/gcc.target/riscv/zvknc-2.c   |  4 ++--
 gcc/testsuite/gcc.target/riscv/zvknc.c |  4 ++--
 gcc/testsuite/gcc.target/riscv/zvkned.c|  4 ++--
 gcc/testsuite/gcc.target/riscv/zvkng-1.c   |  4 ++--
 gcc/testsuite/gcc.target/riscv/zvkng-2.c   |  4 ++--
 gcc/testsuite/gcc.target/riscv/zvkng.c |  4 ++--
 gcc/testsuite/gcc.target/riscv/zvknha.c|  4 ++--
 gcc/testsuite/gcc.target/riscv/zvknhb.c|  4 ++--
 gcc/testsuite/gcc.target/riscv/zvks-1.c|  4 ++--
 gcc/testsuite/gcc.target/riscv/zvks.c  |  4 ++--
 gcc/testsuite/gcc.target/riscv/zvksc-1.c   |  4 ++--
 gcc/testsuite/gcc.target/riscv/zvksc-2.c   |  4 ++--
 gcc/testsuite/gcc.target/riscv/zvksc.c |  4 ++--
 gcc/testsuite/gcc.target/riscv/z

Inline 'gcc/rust/Make-lang.in:RUST_LIBDEPS' (was: [PATCH 006/125] gccrs: Add 'gcc/rust/Make-lang.in:LIBFORMAT_PARSER')

2024-08-05 Thread Thomas Schwinge
Hi!

On 2024-08-01T16:56:02+0200, Arthur Cohen  wrote:
> --- a/gcc/rust/Make-lang.in
> +++ b/gcc/rust/Make-lang.in
> @@ -212,6 +212,9 @@ RUST_ALL_OBJS = $(GRS_OBJS) $(RUST_TARGET_OBJS)
>  rust_OBJS = $(RUST_ALL_OBJS) rust/rustspec.o
>  
>  LIBPROC_MACRO_INTERNAL = 
> ../libgrust/libproc_macro_internal/libproc_macro_internal.a
> +LIBFORMAT_PARSER = rust/libformat_parser.a
> +
> +RUST_LIBDEPS = $(LIBDEPS) $(LIBPROC_MACRO_INTERNAL) $(LIBFORMAT_PARSER)
>  
>  
>  RUST_LIBDEPS = $(LIBDEPS) $(LIBPROC_MACRO_INTERNAL)

That must've been a mis-merge; my GCC/Rust master branch original of this
commit (as part of 
"Move 'libformat_parser' build into the GCC build directory, and into libgrust")
didn't include a bogus second definition of 'RUST_LIBDEPS'.  I've pushed
to trunk branch commit aab9f33ed1f1b92444a82eb3ea5cab1048593791
"Inline 'gcc/rust/Make-lang.in:RUST_LIBDEPS'", see attached -- this
commit apparently had been omitted from the 2024-08-01 upstream
submission.


Grüße
 Thomas


From aab9f33ed1f1b92444a82eb3ea5cab1048593791 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Wed, 28 Feb 2024 23:06:25 +0100
Subject: [PATCH] Inline 'gcc/rust/Make-lang.in:RUST_LIBDEPS'

..., also fixing up an apparently mis-merged
commit 2340894554334a310b891a1d9e9d5e3f502357ac
"gccrs: Add 'gcc/rust/Make-lang.in:LIBFORMAT_PARSER'", which was adding a bogus
second definition of 'RUST_LIBDEPS'.

	gcc/rust/
	* Make-lang.in (RUST_LIBDEPS): Inline into all users.
---
 gcc/rust/Make-lang.in | 7 +--
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/gcc/rust/Make-lang.in b/gcc/rust/Make-lang.in
index c3be5f9d81b..aed9a998c80 100644
--- a/gcc/rust/Make-lang.in
+++ b/gcc/rust/Make-lang.in
@@ -226,13 +226,8 @@ rust_OBJS = $(RUST_ALL_OBJS) rust/rustspec.o
 LIBPROC_MACRO_INTERNAL = ../libgrust/libproc_macro_internal/libproc_macro_internal.a
 LIBFORMAT_PARSER = ../libgrust/libformat_parser/debug/liblibformat_parser.a
 
-RUST_LIBDEPS = $(LIBDEPS) $(LIBPROC_MACRO_INTERNAL) $(LIBFORMAT_PARSER)
-
-
-RUST_LIBDEPS = $(LIBDEPS) $(LIBPROC_MACRO_INTERNAL)
-
 # The compiler itself is called crab1
-crab1$(exeext): $(RUST_ALL_OBJS) attribs.o $(BACKEND) $(RUST_LIBDEPS) $(rust.prev)
+crab1$(exeext): $(RUST_ALL_OBJS) attribs.o $(BACKEND) $(LIBDEPS) $(LIBPROC_MACRO_INTERNAL) $(LIBFORMAT_PARSER) $(rust.prev)
 	@$(call LINK_PROGRESS,$(INDEX.rust),start)
 	+$(LLINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) -o $@ \
 	  $(RUST_ALL_OBJS) attribs.o $(BACKEND) $(LIBS) $(CRAB1_LIBS) $(LIBPROC_MACRO_INTERNAL) $(LIBFORMAT_PARSER) $(BACKENDLIBS)
-- 
2.34.1



Re: [PATCH] fortran: Fix a pasto in gfc_check_dependency

2024-08-05 Thread Jakub Jelinek
On Fri, Aug 02, 2024 at 06:29:27PM +0200, Mikael Morin wrote:
> I agree with all of that.  Sure keeping the condition around would be the
> safest.  I'm just afraid of keeping code that would remain dead.
> 
> > > > And the pasto fix would guess fix
> > > > aliasing_dummy_5.f90 with
> > > >   arg(2:3) = arr(1:2)
> > > > instead of
> > > >   arr(2:3) = arg(1:2)
> > > > if the original testcase would actually fail.
> > > > 
> > > Mmh, aren't they both actually the same?

> They can alias, and they do alias.  So in the end, writing either line is
> equivalent; what am I missing?

So, I had another look.  It seems the reason the testcase passes is that
gfc_could_be_alias (called from gfc_conv_resolve_dependencies) returns true,
so the assignment goes through a temporary array.
gfc_check_dependency is then only called for
  if (lhs->rank > 0 && gfc_check_dependency (lhs, rhs, true) == 0)
    optimize_binop_array_assignment (c, &rhs, false);

I guess the question is whether one can construct a testcase where it would
make a difference.

Jakub



Re: [PATCH] Make may_trap_p_1 return false for constant pool references [PR116145]

2024-08-05 Thread Richard Sandiford
Jeff Law  writes:
> On 8/2/24 2:23 PM, Andrew Pinski wrote:
>> On Wed, Jul 31, 2024 at 9:41 AM Richard Sandiford
>>  wrote:
>>>
>>> The testcase contains the constant:
>>>
>>>arr2 = svreinterpret_u8(svdup_u32(0x0a0d5c3f));
>>>
>>> which was initially hoisted by hand, but which gimple optimisers later
>>> propagated to each use (as expected).  The constant was then expanded
>>> as a load-and-duplicate from the constant pool.  Normally that load
>>> should then be hoisted back out of the loop, but may_trap_or_fault_p
>>> stopped that from happening in this case.
>>>
>>> The code responsible was:
>>>
>>>    if (/* MEM_NOTRAP_P only relates to the actual position of the memory
>>>           reference; moving it out of context such as when moving code
>>>           when optimizing, might cause its address to become invalid.  */
>>>        code_changed
>>>        || !MEM_NOTRAP_P (x))
>>>      {
>>>        poly_int64 size = MEM_SIZE_KNOWN_P (x) ? MEM_SIZE (x) : -1;
>>>        return rtx_addr_can_trap_p_1 (XEXP (x, 0), 0, size,
>>>                                      GET_MODE (x), code_changed);
>>>      }
>>>
>>> where code_changed is true.  (Arguably it doesn't need to be true in
>>> this case, if we inserted invariants on the preheader edge, but it
>>> would still need to be true for conditionally executed loads.)
>>>
>>> Normally this wouldn't be a problem, since rtx_addr_can_trap_p_1
>>> would recognise that the address refers to the constant pool.
>>> However, the SVE load-and-replicate instructions have a limited
>>> offset range, so it isn't possible for them to have a LO_SUM address.
>>> All we have is a plain pseudo base register.
>>>
>>> MEM_READONLY_P is defined as:
>>>
>>> /* 1 if RTX is a mem that is statically allocated in read-only memory.  */
>>>(RTL_FLAG_CHECK1 ("MEM_READONLY_P", (RTX), MEM)->unchanging)
>>>
>>> and so I think it should be safe to move memory references if both
>>> MEM_READONLY_P and MEM_NOTRAP_P are true.
>>>
>>> The testcase isn't a minimal reproducer, but I think it's good
>>> to have a realistic full routine in the testsuite.
>>>
>>> Bootstrapped & regression-tested on aarch64-linux-gnu.  OK to install?
>> 
>> 
>> This is breaking the build on a few targets (x86_64 and powerpc64le so
>> far reported, see PR 116200).
>> 
>> From what I can tell, it is treating `(plus:DI (ashift:DI
>> (reg:DI 0 ax [690]) (const_int 3 [0x3]))  (label_ref:DI 1620))` as not
>> trapping and allowing it to be moved before the check that ax is in
>> the range [0..2], while eax has the value (unsigned long)(unsigned int)-9.
>> So we get a bogus address which will trap. I put my findings
>> in PR 116200 too.
> I think it's the root cause of the x86_64 libgomp failures on the trunk 
> as well.  I haven't done any debugging beyond that.

Sorry for the breakage.  I've reverted the patch.
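
To illustrate the failure mode Andrew describes, here is a small C sketch.
It is not the code from PR 116200: the table and function names are
hypothetical, and uint64_t/uint32_t stand in for unsigned long/unsigned int
on x86_64.  The load from the table is only valid after the range check, so
moving it above the check would compute an address far outside the table for
the reported index value.

```c
#include <stdint.h>

static const int64_t table[3] = { 10, 20, 30 };

/* The index value from the report, widened the same way.  */
uint64_t
reported_index (void)
{
  return (uint64_t) (uint32_t) -9;
}

/* Hypothetical guarded lookup: the load must stay below the check.  */
int64_t
guarded_lookup (uint64_t idx)
{
  if (idx > 2)
    return -1;
  /* Address is table + idx * 8, only meaningful for idx in [0..2].  */
  return table[idx];
}
```

With the check in place the out-of-range index is rejected; the scaled
address for that index lies roughly 32GB past the table.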

Richard


Re: [Patch, Fortran] PR104626 ICE in gfc_format_decoder, at fortran/error.cc:1071

2024-08-05 Thread Andre Vehreschild
Hi Jerry,

Looks OK to me. Thanks for taking care of this.

- Andre

On Fri, 2 Aug 2024 10:44:58 -0700
Jerry D  wrote:

> Hi all,
>
> Doing some catchup here. I plan to commit the following shortly. This is
> one of Steve's patches posted on bugzilla.
>
> I have created a new test case.
>
> Regression tested on linux x86-64.
>
> git show:
>
> commit 4d4549937b789afe4037c2f8f80dfc2285504a1e (HEAD -> master)
> Author: Steve Kargl 
> Date:   Thu Aug 1 21:50:49 2024 -0700
>
>  Fortran: Fix ICE on invalid in gfc_format_decoder.
>
>  PR fortran/104626
>
>  gcc/fortran/ChangeLog:
>
>  * symbol.cc (gfc_add_save): Add checks for SAVE attribute
>  conflicts and duplicate SAVE attribute.
>
>  gcc/testsuite/ChangeLog:
>
>  * gfortran.dg/pr104626.f90: New test.
>
> diff --git a/gcc/fortran/symbol.cc b/gcc/fortran/symbol.cc
> index a8479b862e3..b5143d9f790 100644
> --- a/gcc/fortran/symbol.cc
> +++ b/gcc/fortran/symbol.cc
> @@ -1307,9 +1307,8 @@ gfc_add_save (symbol_attribute *attr, save_state
> s, const char *name,
>
> if (s == SAVE_EXPLICIT && gfc_pure (NULL))
>   {
> -  gfc_error
> -   ("SAVE attribute at %L cannot be specified in a PURE procedure",
> -where);
> +  gfc_error ("SAVE attribute at %L cannot be specified in a PURE "
> +"procedure", where);
> return false;
>   }
>
> @@ -1319,10 +1318,15 @@ gfc_add_save (symbol_attribute *attr, save_state
> s, const char *name,
> if (s == SAVE_EXPLICIT && attr->save == SAVE_EXPLICIT
> && (flag_automatic || pedantic))
>   {
> -   if (!gfc_notify_std (GFC_STD_LEGACY,
> -"Duplicate SAVE attribute specified at %L",
> -where))
> +  if (!where)
> +   {
> + gfc_error ("Duplicate SAVE attribute specified near %C");
>return false;
> +   }
> +
> +  if (!gfc_notify_std (GFC_STD_LEGACY, "Duplicate SAVE attribute "
> +  "specified at %L", where))
> +   return false;
>   }
>
> attr->save = s;
> diff --git a/gcc/testsuite/gfortran.dg/pr104626.f90
> b/gcc/testsuite/gfortran.dg/pr104626.f90
> new file mode 100644
> index 000..faff65a8c92
> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/pr104626.f90
> @@ -0,0 +1,8 @@
> +! { dg-do compile }
> +program p
> +   procedure(g), save :: f ! { dg-error "PROCEDURE attribute conflicts" }
> +   procedure(g), save :: f ! { dg-error "Duplicate SAVE attribute" }
> +contains
> +   subroutine g
> +   end
> +end


--
Andre Vehreschild * Email: vehre ad gmx dot de


Re: [PATCH 05/15] arm: [MVE intrinsics] add vcvt shape

2024-08-05 Thread Andre Vieira (lists)




On 11/07/2024 22:42, Christophe Lyon wrote:

+  bool
+  check (function_checker &c) const override
+  {
+if (c.mode_suffix_id == MODE_none)
+  return true;
+
+unsigned int bits = c.type_suffix (0).element_bits;
+return c.require_immediate_range (1, 1, bits);
+  }


When trying to understand how this worked I bumped into the following:

If you pass a value that's not in the appropriate range like 33, you'll see:

vcvt.c:5:10: error: passing 33 to argument 2 of 'vcvtq_n', which expects 
a value in the range [1, 32]


If, however, you pick a negative number like -3, you get:
vcvt.c:5:10: error: argument 2 of 'vcvtq_n' must be an integer constant 
expression


This is somewhat confusing. Yes, we could change the message to say 
'must be a positive integer constant expression', which might be clear 
enough in this case, but less so if you do something like 1<<7 for this 
intrinsic, because the signature looks like:


> + build_all (b, "v0,v1,ss8", group, MODE_n, preserve_user_namespace);

which converts the immediate to a signed 8-bit value, making 1<<7 a 
negative number, whereas the intrinsic spec defines the parameter as a 
signed 32-bit integer. So if a user accidentally passes 1<<7 here, they 
will get:
vcvt.c:5:10: error: argument 2 of 'vcvtq_n' must be an integer constant 
expression


This could make users think GCC does not support inline evaluation 
for these parameters, which would be sad.
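
The narrowing effect can be modelled in a few lines of C. This is a sketch 
under assumptions, not the framework code: both helper names are 
hypothetical, and the wrap to 8 bits is written out by hand so the result 
does not depend on implementation-defined conversions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: reduce VALUE to a signed 8-bit quantity by
   two's-complement wrap-around, as an "ss8" argument would be.  */
int
narrow_to_ss8 (int value)
{
  int low = value & 0xff;
  return low >= 128 ? low - 256 : low;
}

/* Hypothetical model of an inclusive immediate range check.  */
bool
immediate_in_range (int value, int min, int max)
{
  return value >= min && value <= max;
}
```

In this model 1<<7 becomes -128 before the range check, so it fails the 
check for the same reason -3 does, while 33 stays positive and is rejected 
only for being above the upper bound.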


So we should at least change the signature to be "v0,v1,ss32", but I 
suggest we deviate and make the last parameter in the signature either a 
su32 or a su64, as the framework code expects this value to fit into a 
uhwi. I suspect that is because SVE always defines these parameters as 
uint64_t constants, e.g. svxar_n_s64 (disclaimer: I didn't check all SVE 
intrinsics).


Initially I was going to suggest fold_converting the arg in 
'function_checker::require_immediate' into a 
long_long_unsigned_type_node, which also does what we want in these 
examples, but perhaps going the signature way is cleaner and more in line 
with the framework as is.  We probably want to keep it as close to SVE as 
possible, so we can 'borrow' code more easily if the need arises.


@Richard S: I added you to this review as you know the SVE intrinsic 
code better than most (if not all).  So hopefully you can let us know if 
you have an opinion on this.


Kind Regards,
Andre




Re: [RFC v3 3/3] c: Add __lengthof__() operator

2024-08-05 Thread Alejandro Colomar
[CC += Kees, Qing]

Hi Joseph,

On Sun, Aug 04, 2024 at 08:34:24PM GMT, Alejandro Colomar wrote:
> On Sun, Aug 04, 2024 at 08:02:25PM GMT, Martin Uecker wrote:
> D'oh!  I screwed it.  I wanted to have written this:
> 
>   $ cat star.c 
>   void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);

I think this answers your question of if we want __lengthof__ to
evaluate its operand if the top-level array is non-VLA but an inner
array is VLA.

We clearly want it to not evaluate, because we want this __lengthof__
to be a constant expression, ...

>   void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]);
>   void foo2(char (*a)[3][*], int (*x)[sizeof(**a)]);
>   void bar2(char (*a)[*][3], int (*x)[sizeof(**a)]);
> 
>   int
>   main(void)
>   {
>   int  i3[3];
>   int  i5[5];
>   char c35[3][5];
>   char c53[5][3];
> 
>   foo(&c35, &i3);
>   foo(&c35, &i5);  // I'd expect this to err

... and thus cause a compile-time error here
(-Wincompatible-pointer-types).

I suspect we need to modify array_type_nelts_minus_one() for that; I'm
going to investigate.
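
As a rough illustration of the property being discussed (the outer
dimension is the constant 3 no matter what the inner VLA dimension is),
here is a sketch using the classic sizeof division idiom in standard C.
The function name is made up, n is assumed to be non-zero, and unlike the
proposed __lengthof__ the result here is not an integer constant
expression:

```c
#include <stddef.h>

/* The outer dimension (3) is fixed even though the inner one (n) is
   variable, so the element count does not depend on n.
   sizeof (*a) is 3 * n and sizeof ((*a)[0]) is n (n must be non-zero).  */
size_t outer_length (size_t n, char (*a)[3][n])
{
  return sizeof (*a) / sizeof ((*a)[0]);
}
```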

Have a lovely day!
Alex

>   bar(&c53, &i3);  // I'd expect this to warn
>   bar(&c53, &i5);
> 
>   foo2(&c35, &i3);  // I'd expect this to warn
>   foo2(&c35, &i5);
>   bar2(&c53, &i3);
>   //bar2(&c53, &i5);  // error: -Wincompatible-pointer-types
>   }
>   $ /opt/local/gnu/gcc/lengthof/bin/gcc -Wall -Wextra star.c -S
>   $ 

-- 





Re: [RFC v3 3/3] c: Add __lengthof__() operator

2024-08-05 Thread Jakub Jelinek
On Mon, Aug 05, 2024 at 11:45:56AM +0200, Alejandro Colomar wrote:
> [CC += Kees, Qing]
> 
> Hi Joseph,
> 
> On Sun, Aug 04, 2024 at 08:34:24PM GMT, Alejandro Colomar wrote:
> > On Sun, Aug 04, 2024 at 08:02:25PM GMT, Martin Uecker wrote:
> > D'oh!  I screwed it.  I wanted to have written this:
> > 
> > $ cat star.c 
> > void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
> 
> I think this answers your question of if we want __lengthof__ to
> evaluate its operand if the top-level array is non-VLA but an inner
> array is VLA.
> 
> We clearly want it to not evaluate, because we want this __lengthof__
> to be a constant expression, ...

But if you don't evaluate the argument, you can't handle counted_by.
Because for counted_by you need the expression (the object on which it is
used).

Jakub



Re: [RFC][PATCH] SVE intrinsics: Fold svdiv (svptrue, x, x) to ones

2024-08-05 Thread Richard Sandiford
Jennifer Schmitz  writes:
> This patch folds the SVE intrinsic svdiv into a vector of 1's in case
> 1) the predicate is svptrue and
> 2) dividend and divisor are equal.
> This is implemented in the gimple_folder for signed and unsigned
> integers. Corresponding test cases were added to the existing test
> suites.
>
> The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
> OK for mainline?
>
> Please also advise whether it makes sense to implement the same optimization
> for float types and if so, under which conditions?

I think we should instead use const_binop to try to fold the division
whenever the predicate is all-true, or if the function uses _x predication.
(As a follow-on, we could handle _z and _m too, using VEC_COND_EXPR.)

We shouldn't need to vet the arguments, since const_binop does that itself.
Using const_binop should also get the conditions right for floating-point
divisions.
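
To illustrate why the floating-point case cannot simply reuse an
"x / x is 1" rule, here is a small stand-alone sketch in plain C.  It
does not use const_binop or any GCC internals, the function name is made
up, and it assumes IEEE 754 arithmetic without -ffast-math:

```c
#include <math.h>

/* Returns nonzero if x / x evaluates to exactly 1.0.  Under IEEE 754
   this is false for zeros, infinities and NaNs, where x / x is NaN.  */
int self_div_is_one (double x)
{
  double q = x / x;
  return !isnan (q) && q == 1.0;
}
```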

Thanks,
Richard


>
> Signed-off-by: Jennifer Schmitz 
>
> gcc/
>
>   * config/aarch64/aarch64-sve-builtins-base.cc (svdiv_impl::fold):
>   Add optimization.
>
> gcc/testsuite/
>
>   * gcc.target/aarch64/sve/acle/asm/div_s32.c: New test.
>   * gcc.target/aarch64/sve/acle/asm/div_s64.c: Likewise.
>   * gcc.target/aarch64/sve/acle/asm/div_u32.c: Likewise.
>   * gcc.target/aarch64/sve/acle/asm/div_u64.c: Likewise.
>
> From 43913cfa47b31d055a0456c863a30e3e44acc2f0 Mon Sep 17 00:00:00 2001
> From: Jennifer Schmitz 
> Date: Fri, 2 Aug 2024 06:41:09 -0700
> Subject: [PATCH] SVE intrinsics: Fold svdiv (svptrue, x, x) to ones
>
> This patch folds the SVE intrinsic svdiv into a vector of 1's in case
> 1) the predicate is svptrue and
> 2) dividend and divisor are equal.
> This is implemented in the gimple_folder for signed and unsigned
> integers. Corresponding test cases were added to the existing test
> suites.
>
> The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
> OK for mainline?
>
> Signed-off-by: Jennifer Schmitz 
>
> gcc/
>
>   * config/aarch64/aarch64-sve-builtins-base.cc (svdiv_impl::fold):
>   Add optimization.
>
> gcc/testsuite/
>
>   * gcc.target/aarch64/sve/acle/asm/div_s32.c: New test.
>   * gcc.target/aarch64/sve/acle/asm/div_s64.c: Likewise.
>   * gcc.target/aarch64/sve/acle/asm/div_u32.c: Likewise.
>   * gcc.target/aarch64/sve/acle/asm/div_u64.c: Likewise.
> ---
>  .../aarch64/aarch64-sve-builtins-base.cc  | 19 ++---
>  .../gcc.target/aarch64/sve/acle/asm/div_s32.c | 27 +++
>  .../gcc.target/aarch64/sve/acle/asm/div_s64.c | 27 +++
>  .../gcc.target/aarch64/sve/acle/asm/div_u32.c | 27 +++
>  .../gcc.target/aarch64/sve/acle/asm/div_u64.c | 27 +++
>  5 files changed, 124 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc 
> b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
> index d55bee0b72f..e347d29c725 100644
> --- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
> +++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
> @@ -755,8 +755,21 @@ public:
>gimple *
>fold (gimple_folder &f) const override
>{
> -tree divisor = gimple_call_arg (f.call, 2);
> -tree divisor_cst = uniform_integer_cst_p (divisor);
> +tree pg = gimple_call_arg (f.call, 0);
> +tree op1 = gimple_call_arg (f.call, 1);
> +tree op2 = gimple_call_arg (f.call, 2);
> +
> +if (f.type_suffix (0).integer_p
> + && is_ptrue (pg, f.type_suffix (0).element_bytes)
> + && operand_equal_p (op1, op2, 0))
> +  {
> + tree lhs_type = TREE_TYPE (f.lhs);
> + tree_vector_builder builder (lhs_type, 1, 1);
> + builder.quick_push (build_each_one_cst (TREE_TYPE (lhs_type)));
> + return gimple_build_assign (f.lhs, builder.build ());
> +  }
> +
> +tree divisor_cst = uniform_integer_cst_p (op2);
>  
>  if (!divisor_cst || !integer_pow2p (divisor_cst))
>return NULL;
> @@ -770,7 +783,7 @@ public:
>   shapes::binary_uint_opt_n, MODE_n,
>   f.type_suffix_ids, GROUP_none, f.pred);
>   call = f.redirect_call (instance);
> - tree d = INTEGRAL_TYPE_P (TREE_TYPE (divisor)) ? divisor : divisor_cst;
> + tree d = INTEGRAL_TYPE_P (TREE_TYPE (op2)) ? op2 : divisor_cst;
>   new_divisor = wide_int_to_tree (TREE_TYPE (d), tree_log2 (d));
>}
>  else
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/div_s32.c 
> b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/div_s32.c
> index d5a23bf0726..09d0419f3ef 100644
> --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/div_s32.c
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/div_s32.c
> @@ -55,6 +55,15 @@ TEST_UNIFORM_ZX (div_w0_s32_m_untied, svint32_t, int32_t,
>z0 = svdiv_n_s32_m (p0, z1, x0),
>z0 = svdiv_m (p0, z1, x0))
>  
> +/*
> +** div_same_s32_m_tied1:
> +**   mov z0\.s, #1
> +**   ret
> +*/
> +TEST_

Re: [PATCH v1] RISC-V: Update .SAT_TRUNC dump check due to middle-end change

2024-08-05 Thread 钟居哲
lgtm



juzhe.zh...@rivai.ai
 
From: pan2.li
Date: 2024-08-05 16:01
To: gcc-patches
CC: juzhe.zhong; kito.cheng; jeffreyalaw; rdapp.gcc; Pan Li
Subject: [PATCH v1] RISC-V: Update .SAT_TRUNC dump check due to middle-end 
change
From: Pan Li 
 
Due to recent middle-end change, update the .SAT_TRUNC expand dump
check from 2 to 4.
 
gcc/testsuite/ChangeLog:
 
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-1.c: Adjust
asm check times from 2 to 4.
 
Signed-off-by: Pan Li 
---
.../gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-1.c   | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
 
diff --git 
a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-1.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-1.c
index 7f047f3f6a2..ae3e44cd57e 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-1.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-1.c
@@ -16,4 +16,4 @@
*/
DEF_VEC_SAT_U_TRUNC_FMT_1 (uint8_t, uint16_t)
-/* { dg-final { scan-rtl-dump-times ".SAT_TRUNC " 2 "expand" } } */
+/* { dg-final { scan-rtl-dump-times ".SAT_TRUNC " 4 "expand" } } */
-- 
2.43.0
 
 


[x86_64 PATCH] Refactor V2DI arithmetic right shift expansion for STV.

2024-08-05 Thread Roger Sayle

This patch refactors ashrv2di RTL expansion into a function so that it may
be reused by a pre-reload splitter, such that DImode right shifts may be
considered candidates during the Scalar-To-Vector (STV) pass.  Currently
DImode arithmetic right shifts are not considered potential candidates
during STV, so for the following testcase:

long long m;
typedef long long v2di __attribute__((vector_size (16)));
void foo(v2di x) { m = x[0]>>63; }

We currently see the following warning/error during STV2
>  r101 use in insn 7 isn't convertible

And end up generating scalar code with an interunit move:

foo:    movq    %xmm0, %rax
        sarq    $63, %rax
        movq    %rax, m(%rip)
        ret

With this patch, we can reuse the RTL expansion logic and produce:

foo:    psrad   $31, %xmm0
        pshufd  $245, %xmm0, %xmm0
        movq    %xmm0, m(%rip)
        ret

Or with the addition of -mavx2, the equivalent:

foo:    vpxor   %xmm1, %xmm1, %xmm1
        vpcmpgtq        %xmm0, %xmm1, %xmm0
        vmovq   %xmm0, m(%rip)
        ret


The only design decision of note is the choice to continue lowering V2DI
into vector sequences during RTL expansion, to enable combine to optimize
things if possible.  Using just define_insn_and_split potentially misses
optimizations, such as reusing the zero vector produced by vpxor above.
It may be necessary to tweak STV's compute gain at some point, but this
patch controls what's possible (rather than what's beneficial).
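
For readers less familiar with these sequences, the following is a scalar
sketch in plain C of what the two vector variants above compute for a
single 64-bit lane when the shift count is 63.  The function names are
made up, and the code assumes GCC's behaviour for right shifts of negative
values (arithmetic) and for out-of-range conversions to signed types
(modular); it is a model, not the code the patch emits:

```c
#include <stdint.h>

/* Reference: arithmetic shift right by 63 gives 0 or -1.  */
int64_t sar63_ref (int64_t x)
{
  return x >> 63;
}

/* Model of psrad $31 followed by pshufd $245 for one lane: shift the
   high 32-bit half by 31 and copy the result into both halves.  */
int64_t sar63_via_32bit (int64_t x)
{
  int32_t hi = (int32_t) ((uint64_t) x >> 32);
  uint32_t sign = (uint32_t) (hi >> 31);
  return (int64_t) (((uint64_t) sign << 32) | sign);
}

/* Model of vpxor followed by vpcmpgtq: compare 0 > x, giving all-ones
   when the comparison is true.  */
int64_t sar63_via_cmp (int64_t x)
{
  return 0 > x ? -1 : 0;
}
```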

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32}
with no new failures.  Ok for mainline?

2024-08-05  Roger Sayle  

gcc/ChangeLog
* config/i386/i386-expand.cc (ix86_expand_v2di_ashiftrt): New
function refactored from define_expand ashrv2di3.
* config/i386/i386-features.cc
(general_scalar_to_vector_candidate_p)
: Handle like other shifts and rotates.
* config/i386/i386-protos.h (ix86_expand_v2di_ashiftrt): Prototype.
* config/i386/sse.md (ashrv2di3): Call ix86_expand_v2di_ashiftrt.
(*ashrv2di3): New define_insn_and_split to enable creation by stv2
pass, and splitting during split1 reusing ix86_expand_v2di_ashiftrt.

gcc/testsuite/ChangeLog
* gcc.target/i386/sse2-stv-2.c: New test case.


Thanks in advance,
Roger
--

diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
index d9ad062..bdbc142 100644
--- a/gcc/config/i386/i386-expand.cc
+++ b/gcc/config/i386/i386-expand.cc
@@ -7471,6 +7471,162 @@ ix86_expand_v1ti_ashiftrt (rtx operands[])
 }
 }
 
+/* Expand V2DI mode ashiftrt.  */
+void
+ix86_expand_v2di_ashiftrt (rtx operands[])
+{
+  if (operands[2] == const0_rtx)
+{
+  emit_move_insn (operands[0], operands[1]);
+  return;
+}
+
+  if (TARGET_SSE4_2
+  && CONST_INT_P (operands[2])
+  && UINTVAL (operands[2]) >= 63
+  && !optimize_insn_for_size_p ())
+{
+  rtx zero = force_reg (V2DImode, CONST0_RTX (V2DImode));
+  emit_insn (gen_sse4_2_gtv2di3 (operands[0], zero, operands[1]));
+  return;
+}
+
+  if (CONST_INT_P (operands[2])
+  && (!TARGET_XOP || UINTVAL (operands[2]) >= 63))
+{
+  vec_perm_builder sel (4, 4, 1);
+  sel.quick_grow (4);
+  rtx arg0, arg1;
+  rtx op1 = lowpart_subreg (V4SImode,
+   force_reg (V2DImode, operands[1]),
+   V2DImode);
+  rtx target = gen_reg_rtx (V4SImode);
+  if (UINTVAL (operands[2]) >= 63)
+   {
+ arg0 = arg1 = gen_reg_rtx (V4SImode);
+ emit_insn (gen_ashrv4si3 (arg0, op1, GEN_INT (31)));
+ sel[0] = 1;
+ sel[1] = 1;
+ sel[2] = 3;
+ sel[3] = 3;
+   }
+  else if (INTVAL (operands[2]) > 32)
+   {
+ arg0 = gen_reg_rtx (V4SImode);
+ arg1 = gen_reg_rtx (V4SImode);
+ emit_insn (gen_ashrv4si3 (arg1, op1, GEN_INT (31)));
+ emit_insn (gen_ashrv4si3 (arg0, op1,
+   GEN_INT (INTVAL (operands[2]) - 32)));
+ sel[0] = 1;
+ sel[1] = 5;
+ sel[2] = 3;
+ sel[3] = 7;
+   }
+  else if (INTVAL (operands[2]) == 32)
+   {
+ arg0 = op1;
+ arg1 = gen_reg_rtx (V4SImode);
+ emit_insn (gen_ashrv4si3 (arg1, op1, GEN_INT (31)));
+ sel[0] = 1;
+ sel[1] = 5;
+ sel[2] = 3;
+ sel[3] = 7;
+   }
+  else
+   {
+ arg0 = gen_reg_rtx (V2DImode);
+ arg1 = gen_reg_rtx (V4SImode);
+ emit_insn (gen_lshrv2di3 (arg0, operands[1], operands[2]));
+ emit_insn (gen_ashrv4si3 (arg1, op1, operands[2]));
+ arg0 = lowpart_subreg (V4SImode, arg0, V2DImode);
+ sel[0] = 0;
+ sel[1] = 5;
+ sel[2] = 2;
+ sel[3] = 7;
+   }
+  vec_perm_indices indices (sel, arg0 != arg1 ? 2 : 1, 4);
+  rtx op0 = operands[0];
+  bool ok = targetm.vectorize.vec_perm_const

Re: [PATCH, gfortran] libgfortran: implement fpu-macppc for Darwin, support IEEE arithmetic

2024-08-05 Thread Sergey Fedorov
On Thu, Jul 25, 2024 at 4:47 PM FX Coudert  wrote:

> Can you post an updated version of the patch, following the first round of
> review?
>
> FX


Sorry for a delay, done.

I dropped a change to the test file, since you have fixed it appropriately,
and switched to Apple libm convention for flags, as you have suggested.

Please let me know if I should do anything further to improve it and make
it acceptable for a merge.

Serge


0001-libgfortran-implement-fpu-macppc-for-Darwin-support-.patch
Description: Binary data


[PATCH v2] Hard register constraints

2024-08-05 Thread Stefan Schulze Frielinghaus
This is a follow-up of
https://gcc.gnu.org/pipermail/gcc-patches/2024-June/654013.html

What has changed?

- Rebased and fixed an issue in constrain_operands which manifested
after late-combine.

- Introduced new test cases for Arm, Intel, POWER, RISCV, S/390 for 32-
and 64-bit where appropriate (including register pairs etc.).  Test
gcc.dg/asm-hard-reg-7.c is a bit controversial since I'm testing for an
anti feature here, i.e., I'm testing for register asm in conjunction
with calls.  I'm fine with removing it in the end but I wanted to keep
it in for demonstration purposes at least during discussion of this
patch.

- Split test pr87600-2.c into pr87600-2.c and pr87600-3.c since test0
now errors out early.  Otherwise, the remaining errors would not be
reported.  Besides that, the error message has changed slightly.

- Modified genoutput.cc in order to allow hard register constraints in
machine descriptions.  For example, on s390 the instruction mvcrl makes
use of the implicit register r0 which we currently deal with as follows:

(define_insn "*mvcrl"
  [(set (match_operand:BLK 0 "memory_operand" "=Q")
   (unspec:BLK [(match_operand:BLK 1 "memory_operand" "Q")
(reg:SI GPR0_REGNUM)]
   UNSPEC_MVCRL))]
  "TARGET_Z15"
  "mvcrl\t%0,%1"
  [(set_attr "op_type" "SSE")])

(define_expand "mvcrl"
  [(set (reg:SI GPR0_REGNUM) (match_operand:SI 2 "general_operand"))
   (set (match_operand:BLK 0 "memory_operand" "=Q")
   (unspec:BLK [(match_operand:BLK 1 "memory_operand" "Q")
(reg:SI GPR0_REGNUM)]
   UNSPEC_MVCRL))]
  "TARGET_Z15"
  "")

In the expander we ensure that GPR0 is set up correctly.  With this patch
we could simply write

(define_insn "mvcrl"
  [(set (match_operand:BLK 0 "memory_operand" "=Q")
(unspec:BLK [(match_operand:BLK 1 "memory_operand" "Q")
 (match_operand:SI 2 "general_operand" "{r0}")]
UNSPEC_MVCRL))]
  "TARGET_Z15"
  "mvcrl\t%0,%1"
  [(set_attr "op_type" "SSE")])

What I dislike is that I didn't find a way to verify hard register names
during genoutput, i.e., to ensure that the name is actually valid.  This
is due to how reg_names is defined, which cannot be accessed by
genoutput.  The same holds true for REGISTER_NAMES et al. which may
reference some target specific variable (see e.g. POWER).  Thus, in case
of an invalid register name in a machine description file we do not
end up with a genoutput-time error but instead fail at run-time in
process_alt_operands():

   case '{':
   {
 int regno = parse_constraint_regname (p);
 gcc_assert (regno >= 0);
 cl = REGNO_REG_CLASS (regno);
 CLEAR_HARD_REG_SET (hregset);
 SET_HARD_REG_BIT (hregset, regno);
 cl_filter = &hregset;
 goto reg;
   }

This is rather unfortunate but I couldn't find a way to validate
register names during genoutput.  If no one else has an idea I will
replace gcc_assert with a more expressive error message.
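
To make the failure mode concrete, below is a stand-alone sketch of the
kind of lookup the '{' case relies on, with an explicit -1 result for an
unknown name instead of an assertion.  Everything here is hypothetical:
the register table and the function are invented for illustration and are
not GCC's parse_constraint_regname or reg_names.

```c
#include <string.h>

/* Hypothetical register table, for illustration only.  */
static const char *const demo_reg_names[] = {
  "r0", "r1", "r2", "r3", "r4", "r5"
};

/* Parse a constraint of the form "{name}" and return the index of NAME
   in demo_reg_names, or -1 if the syntax or the name is invalid.  */
int demo_parse_constraint_regname (const char *p)
{
  if (p[0] != '{')
    return -1;
  const char *end = strchr (p, '}');
  if (end == NULL)
    return -1;
  size_t len = (size_t) (end - (p + 1));
  for (size_t i = 0; i < sizeof demo_reg_names / sizeof demo_reg_names[0]; i++)
    if (strlen (demo_reg_names[i]) == len
        && strncmp (demo_reg_names[i], p + 1, len) == 0)
      return (int) i;
  return -1;
}
```

A caller could then turn the -1 case into a diagnostic that names the
offending constraint string.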

What's next?

I was thinking about replacing register asm with the new hard register
constraint.  This would solve problems like demonstrated by
gcc.dg/asm-hard-reg-7.c.  For example, we could replace the constraint

   register int x asm ("r5") = 42;
   asm ("foo   %0" :: "r" (x));

with

   register int x asm ("r5") = 42;
   asm ("foo   %0" :: "{r5}" (x));

and ignore any further effect of the register asm.  However, I haven't
really thought this through and there are certainly cases which are
currently allowed which cannot trivially be converted as e.g. here:

   register int x asm ("r5") = 42;
   asm ("foo   %0" :: "rd" (x));

Multiple alternatives are kind of strange in combination with register
asm.  For example, on s390 the two constraints "r" and "d" both restrict
to GPRs.  That is not a show stopper but certainly something which needs
some consideration.  If you can think of some wild combinations/edge
cases I would be happy to hear about them.  Anyhow, this is something for
a further patch.

Last but not least, if there is enough consensus to accept this feature, I
will start writing up some documentation.

Bootstrapped and regtested on Arm, Intel, POWER, RISCV, S/390.  I have
only verified the 32-bit tests via cross compilers and didn't execute
them in contrast to 64-bit targets.
---
 gcc/cfgexpand.cc  |  42 -
 gcc/genoutput.cc  |  12 ++
 gcc/genpreds.cc   |   4 +-
 gcc/gimplify.cc   | 134 ++-
 gcc/lra-constraints.cc|  13 ++
 gcc/recog.cc  |  11 +-
 gcc/stmt.cc   | 155 +-
 gcc/stmt.h|  12 +-
 gcc/testsuite/gcc.dg/asm-hard-reg-1.c |  85 ++
 gcc/testsuite/gcc.dg/asm-hard-reg-2.c |  33 
 gcc/testsuite/gcc.dg/asm-hard-reg-3.c |  25 +

Re: [RFC v3 3/3] c: Add __lengthof__() operator

2024-08-05 Thread Martin Uecker
Am Montag, dem 05.08.2024 um 11:50 +0200 schrieb Jakub Jelinek:
> On Mon, Aug 05, 2024 at 11:45:56AM +0200, Alejandro Colomar wrote:
> > [CC += Kees, Qing]
> > 
> > Hi Joseph,
> > 
> > On Sun, Aug 04, 2024 at 08:34:24PM GMT, Alejandro Colomar wrote:
> > > On Sun, Aug 04, 2024 at 08:02:25PM GMT, Martin Uecker wrote:
> > > D'oh!  I screwed it.  I wanted to have written this:
> > > 
> > >   $ cat star.c 
> > >   void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
> > 
> > I think this answers your question of if we want __lengthof__ to
> > evaluate its operand if the top-level array is non-VLA but an inner
> > array is VLA.
> > 
> > We clearly want it to not evaluate, because we want this __lengthof__
> > to be a constant expression, ...
> 
> But if you don't evaluate the argument, you can't handle counted_by.
> Because for counted_by you need the expression (the object on which it is
> used).

You would not evaluate only when the size is an integer constant
expression, which would not apply to counted_by.

Martin




[PATCH] vect: Allow unsigned-to-signed promotion in vect_look_through_possible_promotion [PR115707]

2024-08-05 Thread Feng Xue OS
The function vect_look_through_possible_promotion() fails to find the root
definition if the casts involve multiple promotions with a sign change, as in:

long a = (long)b;   // promotion cast
 -> int b = (int)c; // promotion cast, sign change
   -> unsigned short c = ...;

In this case, the function sees that the 2nd cast has a different sign from
the 1st and stops looking through, even though "unsigned short -> integer" is
a natural sign extension. This patch allows this unsigned-to-signed promotion
in the function.
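
As a quick sanity check of the reasoning (a sketch with made-up function
names, assuming int is wider than unsigned short as on the usual GCC
targets), going through the intermediate signed type yields the same
value as widening the unsigned short directly:

```c
/* Two-step chain from the example: unsigned short -> int -> long.  */
long widen_two_steps (unsigned short c)
{
  int b = (int) c;      /* promotion cast, sign change */
  long a = (long) b;    /* promotion cast */
  return a;
}

/* Looking through both casts: unsigned short -> long directly.  */
long widen_one_step (unsigned short c)
{
  return (long) c;
}
```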

Thanks,
Feng

---
gcc/
* tree-vect-patterns.cc (vect_look_through_possible_promotion): Allow
unsigned-to-signed promotion.
---
 gcc/tree-vect-patterns.cc | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index 4674a16d15f..b2c83cfd219 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -434,7 +434,9 @@ vect_look_through_possible_promotion (vec_info *vinfo, tree op,
 sign of the previous promotion.  */
  if (!res
  || TYPE_PRECISION (unprom->type) == orig_precision
- || TYPE_SIGN (unprom->type) == TYPE_SIGN (op_type))
+ || TYPE_SIGN (unprom->type) == TYPE_SIGN (op_type)
+ || (TYPE_UNSIGNED (op_type)
+ && TYPE_PRECISION (op_type) < TYPE_PRECISION (unprom->type)))
{
  unprom->set_op (op, dt, caster);
  min_precision = TYPE_PRECISION (op_type);
-- 
2.17.1

From 334998e1d991e1d2c8e4c2234663b4d829e88e5c Mon Sep 17 00:00:00 2001
From: Feng Xue 
Date: Mon, 5 Aug 2024 15:23:56 +0800
Subject: [PATCH] vect: Allow unsigned-to-signed promotion in
 vect_look_through_possible_promotion [PR115707]

The function fails to find the root definition if the casts involve
multiple promotions with a sign change, as in:

long a = (long)b;   // promotion cast
 -> int b = (int)c; // promotion cast, sign change
   -> unsigned short c = ...;

In this case, the function sees that the 2nd cast has a different sign from
the 1st and stops looking through, even though "unsigned short -> integer" is
a natural sign extension.

2024-08-05 Feng Xue 

gcc/
	* tree-vect-patterns.cc (vect_look_through_possible_promotion): Allow
	unsigned-to-signed promotion.
---
 gcc/tree-vect-patterns.cc | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index 4674a16d15f..b2c83cfd219 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -434,7 +434,9 @@ vect_look_through_possible_promotion (vec_info *vinfo, tree op,
 	 sign of the previous promotion.  */
 	  if (!res
 	  || TYPE_PRECISION (unprom->type) == orig_precision
-	  || TYPE_SIGN (unprom->type) == TYPE_SIGN (op_type))
+	  || TYPE_SIGN (unprom->type) == TYPE_SIGN (op_type)
+	  || (TYPE_UNSIGNED (op_type)
+		  && TYPE_PRECISION (op_type) < TYPE_PRECISION (unprom->type)))
 	{
 	  unprom->set_op (op, dt, caster);
 	  min_precision = TYPE_PRECISION (op_type);
-- 
2.17.1



[PATCH] vect: Add missed opcodes in vect_get_smallest_scalar_type [PR115228]

2024-08-05 Thread Feng Xue OS
Some opcodes are missed when determining the smallest scalar type for a
vectorizable statement. Currently, this bug does not cause any problem,
because vect_get_smallest_scalar_type is only used to compute the max nunits
vectype, and even if a statement with a missed opcode is incorrectly bypassed,
the max nunits vectype can still be correctly deduced from the def statements
for the operands of the statement.

In the future, if this function is called for other purposes, we may
get something wrong. So fix it in this patch.
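
For context, here is a sketch of the kind of scalar loop behind one of
the affected opcodes.  The function name is made up; the point is only
that the result type (int) is wider than the smallest scalar type
involved (unsigned char), which is the situation the function has to
account for when such a loop is expressed with a widening operation like
SAD_EXPR:

```c
#include <stdlib.h>

/* Sum of absolute differences over unsigned char inputs, accumulated
   in a wider int.  */
int sad_u8 (const unsigned char *a, const unsigned char *b, int n)
{
  int sum = 0;
  for (int i = 0; i < n; i++)
    sum += abs (a[i] - b[i]);
  return sum;
}
```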

Thanks,
Feng

---
gcc/
* tree-vect-data-refs.cc (vect_get_smallest_scalar_type): Add
missed opcodes that involve widening operation.
---
 gcc/tree-vect-data-refs.cc | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
index 39fd887a96b..5b0d548f847 100644
--- a/gcc/tree-vect-data-refs.cc
+++ b/gcc/tree-vect-data-refs.cc
@@ -162,7 +162,10 @@ vect_get_smallest_scalar_type (stmt_vec_info stmt_info, tree scalar_type)
   if (gimple_assign_cast_p (assign)
  || gimple_assign_rhs_code (assign) == DOT_PROD_EXPR
  || gimple_assign_rhs_code (assign) == WIDEN_SUM_EXPR
+ || gimple_assign_rhs_code (assign) == SAD_EXPR
  || gimple_assign_rhs_code (assign) == WIDEN_MULT_EXPR
+ || gimple_assign_rhs_code (assign) == WIDEN_MULT_PLUS_EXPR
+ || gimple_assign_rhs_code (assign) == WIDEN_MULT_MINUS_EXPR
  || gimple_assign_rhs_code (assign) == WIDEN_LSHIFT_EXPR
  || gimple_assign_rhs_code (assign) == FLOAT_EXPR)
{
-- 
2.17.1

Re: [PATCH] vect: Allow unsigned-to-signed promotion in vect_look_through_possible_promotion [PR115707]

2024-08-05 Thread Richard Biener
On Mon, Aug 5, 2024 at 12:34 PM Feng Xue OS  wrote:
>
> The function vect_look_through_possible_promotion() fails to figure out root
> definition if casts involves more than two promotions with sign change as:
>
> long a = (long)b;   // promotion cast
>  -> int b = (int)c; // promotion cast, sign change
>-> unsigned short c = ...;
>
> For this case, the function thinks the 2nd cast has different sign as the 1st,
> so stop looking through, while "unsigned short -> integer" is a nature sign
> extension. This patch allows this unsigned-to-signed promotion in the 
> function.

OK.

Thanks,
Richard.

> Thanks,
> Feng
>
> ---
> gcc/
> * tree-vect-patterns.cc (vect_look_through_possible_promotion): Allow
> unsigned-to-signed promotion.
> ---
>  gcc/tree-vect-patterns.cc | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
> index 4674a16d15f..b2c83cfd219 100644
> --- a/gcc/tree-vect-patterns.cc
> +++ b/gcc/tree-vect-patterns.cc
> @@ -434,7 +434,9 @@ vect_look_through_possible_promotion (vec_info *vinfo, 
> tree op,
>  sign of the previous promotion.  */
>   if (!res
>   || TYPE_PRECISION (unprom->type) == orig_precision
> - || TYPE_SIGN (unprom->type) == TYPE_SIGN (op_type))
> + || TYPE_SIGN (unprom->type) == TYPE_SIGN (op_type)
> + || (TYPE_UNSIGNED (op_type)
> + && TYPE_PRECISION (op_type) < TYPE_PRECISION 
> (unprom->type)))
> {
>   unprom->set_op (op, dt, caster);
>   min_precision = TYPE_PRECISION (op_type);
> --
> 2.17.1


Re: [PATCH] vect: Add missed opcodes in vect_get_smallest_scalar_type [PR115228]

2024-08-05 Thread Richard Biener
On Mon, Aug 5, 2024 at 12:36 PM Feng Xue OS  wrote:
>
> Some opcodes are missed when determining the smallest scalar type for a
> vectorizable statement. Currently, this bug does not cause any problem,
> because vect_get_smallest_scalar_type is only used to compute max nunits
> vectype, and even statement with missed opcode is incorrectly bypassed,
> the max nunits vectype could also be rightly deduced from def statements
> for operands of the statement.
>
> In the future, if this function will be called to do other thing, we may
> get something wrong. So fix it in this patch.

OK.

Thanks,
Richard.

> Thanks,
> Feng
>
> ---
> gcc/
> * tree-vect-data-refs.cc (vect_get_smallest_scalar_type): Add
> missed opcodes that involve widening operation.
> ---
>  gcc/tree-vect-data-refs.cc | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
> index 39fd887a96b..5b0d548f847 100644
> --- a/gcc/tree-vect-data-refs.cc
> +++ b/gcc/tree-vect-data-refs.cc
> @@ -162,7 +162,10 @@ vect_get_smallest_scalar_type (stmt_vec_info stmt_info, 
> tree scalar_type)
>if (gimple_assign_cast_p (assign)
>   || gimple_assign_rhs_code (assign) == DOT_PROD_EXPR
>   || gimple_assign_rhs_code (assign) == WIDEN_SUM_EXPR
> + || gimple_assign_rhs_code (assign) == SAD_EXPR
>   || gimple_assign_rhs_code (assign) == WIDEN_MULT_EXPR
> + || gimple_assign_rhs_code (assign) == WIDEN_MULT_PLUS_EXPR
> + || gimple_assign_rhs_code (assign) == WIDEN_MULT_MINUS_EXPR
>   || gimple_assign_rhs_code (assign) == WIDEN_LSHIFT_EXPR
>   || gimple_assign_rhs_code (assign) == FLOAT_EXPR)
> {
> --
> 2.17.1


Re: [PATCH] vect: Multistep float->int conversion only with no trapping math

2024-08-05 Thread Richard Biener
On Fri, Aug 2, 2024 at 2:43 PM Juergen Christ  wrote:
>
> Do not convert floats to ints in multiple step if trapping math is
> enabled.  This might hide some inexact signals.
>
> Also use correct sign (the sign of the target integer type) for the
> intermediate steps.  This only affects undefined behaviour (casting
> floats to unsigned datatype where the float is negative).
>
> gcc/ChangeLog:
>
> * tree-vect-stmts.cc (vectorizable_conversion): multi-step
>   float to int conversion only with trapping math and correct
>   sign.
>
> Signed-off-by: Juergen Christ 
>
> Bootstrapped and tested on x84 and s390.  Ok for trunk?
>
> ---
>  gcc/tree-vect-stmts.cc | 8 +---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> index fdcda0d2abae..2ddd13383193 100644
> --- a/gcc/tree-vect-stmts.cc
> +++ b/gcc/tree-vect-stmts.cc
> @@ -5448,7 +5448,8 @@ vectorizable_conversion (vec_info *vinfo,
> break;
>
>   cvt_type
> -   = build_nonstandard_integer_type (GET_MODE_BITSIZE (rhs_mode), 0);
> +   = build_nonstandard_integer_type (GET_MODE_BITSIZE (rhs_mode),
> + TYPE_UNSIGNED (lhs_type));

But lhs_type should be a float type here.  The idea is that for a
FLOAT_EXPR (int -> float) a signed integer type is the natural one to use;
as it's 2x wider than the original RHS type, its signedness doesn't matter.
Note all float types should be !TYPE_UNSIGNED, so this hunk is a no-op but
still less clear on the intent IMO.

Please drop it.

>   cvt_type = get_same_sized_vectype (cvt_type, vectype_in);
>   if (cvt_type == NULL_TREE)
> goto unsupported;
> @@ -5505,10 +5506,11 @@ vectorizable_conversion (vec_info *vinfo,
>if (GET_MODE_SIZE (lhs_mode) >= GET_MODE_SIZE (rhs_mode))
> goto unsupported;
>
> -  if (code == FIX_TRUNC_EXPR)
> +  if (code == FIX_TRUNC_EXPR && !flag_trapping_math)
> {
>   cvt_type
> -   = build_nonstandard_integer_type (GET_MODE_BITSIZE (rhs_mode), 0);
> +   = build_nonstandard_integer_type (GET_MODE_BITSIZE (rhs_mode),
> + TYPE_UNSIGNED (lhs_type));

Here it might be relevant for correctness - we have to choose between
sfix and ufix for the float -> [u]int conversion.

Do you have a testcase?  Shouldn't the exactness be independent of the integer
type we convert to?

>   cvt_type = get_same_sized_vectype (cvt_type, vectype_in);
>   if (cvt_type == NULL_TREE)
> goto unsupported;
> --
> 2.43.5
>


Re: [PATCH v1] Match: Add type_has_mode_precision_p check for SAT_TRUNC [PR116202]

2024-08-05 Thread Richard Biener
On Sun, Aug 4, 2024 at 1:47 PM  wrote:
>
> From: Pan Li 
>
> The .SAT_TRUNC matching can only perform the type has its mode
> precision.
>
> g_12 = (long unsigned int) _2;
> _13 = MIN_EXPR ;
> _3 = (_Bool) _13;
>
> The above pattern cannot be recog as .SAT_TRUNC (g_12) because the dest
> only has 1 bit precision but QImode.  Aka the type doesn't have the mode
> precision.  Thus,  add the type_has_mode_precision_p for the dest to
> avoid such case.
>
> The below tests are passed for this patch.
> 1. The rv64gcv fully regression tests.
> 2. The x86 bootstrap tests.
> 3. The x86 fully regression tests.

Isn't that now handled by the direct_internal_fn_supported_p check?  That is,
by the caller which needs to verify the matched operation is supported by
the target?
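
For reference, the mismatch described in the quoted text can be modelled
in plain C.  This is only a sketch with made-up function names: it
contrasts a saturating truncation to an 8-bit destination with the
MIN (x, 1) followed by a _Bool conversion from the test case.

```c
#include <stdint.h>

/* Saturating truncation to an 8-bit destination: clamp at 255.  */
uint8_t sat_trunc_u8 (uint64_t x)
{
  return (uint8_t) (x < 255 ? x : 255);
}

/* The expression from the test case: MIN (x, 1) converted to _Bool.
   _Bool has 1 bit of precision, so the clamp value is 1, not 255.  */
_Bool min1_to_bool (uint64_t x)
{
  return (_Bool) (x < 1 ? x : 1);
}
```

With x = (unsigned long) -6, as in the test, the first function gives 255
and the second gives 1, so treating the latter as a QImode saturating
truncation would produce the wrong value.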

> PR target/116202
>
> gcc/ChangeLog:
>
> * match.pd: Add type_has_mode_precision_p for the dest type
> of the .SAT_TRUNC matching.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/pr116202-run-1.c: New test.
>
> Signed-off-by: Pan Li 
> ---
>  gcc/match.pd  |  6 +++--
>  .../riscv/rvv/base/pr116202-run-1.c   | 24 +++
>  2 files changed, 28 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr116202-run-1.c
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index c9c8478d286..dfa0bba3908 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -3283,7 +3283,8 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> wide_int trunc_max = wi::mask (otype_precision, false, itype_precision);
> wide_int int_cst = wi::to_wide (@1, itype_precision);
>}
> -  (if (otype_precision < itype_precision && wi::eq_p (trunc_max, 
> int_cst))
> +  (if (type_has_mode_precision_p (type) && otype_precision < itype_precision
> +   && wi::eq_p (trunc_max, int_cst))
>
>  /* Unsigned saturation truncate, case 2, sizeof (WT) > sizeof (NT).
> SAT_U_TRUNC = (NT)(MIN_EXPR (X, 255)).  */
> @@ -3309,7 +3310,8 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> wide_int trunc_max = wi::mask (otype_precision, false, itype_precision);
> wide_int int_cst = wi::to_wide (@1, itype_precision);
>}
> -  (if (otype_precision < itype_precision && wi::eq_p (trunc_max, 
> int_cst))
> +  (if (type_has_mode_precision_p (type) && otype_precision < itype_precision
> +   && wi::eq_p (trunc_max, int_cst))
>
>  /* x >  y  &&  x != XXX_MIN  -->  x > y
> x >  y  &&  x == XXX_MIN  -->  false . */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr116202-run-1.c b/gcc/testsuite/gcc.target/riscv/rvv/base/pr116202-run-1.c
> new file mode 100644
> index 000..d150f20b5d9
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr116202-run-1.c
> @@ -0,0 +1,24 @@
> +/* { dg-do run } */
> +/* { dg-options "-O3 -march=rv64gcv_zvl256b -fdump-rtl-expand-details" } */
> +
> +int b[24];
> +_Bool c[24];
> +
> +int main() {
> +  for (int f = 0; f < 4; ++f)
> +b[f] = 6;
> +
> +  for (int f = 0; f < 24; f += 4)
> +c[f] = ({
> +  int g = ({
> +unsigned long g = -b[f];
> +1 < g ? 1 : g;
> +  });
> +  g;
> +});
> +
> +  if (c[0] != 1)
> +__builtin_abort ();
> +}
> +
> +/* { dg-final { scan-rtl-dump-not ".SAT_TRUNC " "expand" } } */
> --
> 2.43.0
>


Re: [PATCH] tree-reassoc.cc: PR tree-optimization/116139 Don't assert when forming fully-pipelined FMAs on wide MULT targets

2024-08-05 Thread Richard Biener
On Mon, Aug 5, 2024 at 8:49 AM Kyrylo Tkachov  wrote:
>
> Hi all,
>
> The code in get_reassociation_width that forms FMAs aggressively when
> they are fully pipelined expects the FMUL reassociation width in the
> target to be less than for FMAs. This doesn't hold for all target
> tunings.
>
> This code shouldn't ICE, just avoid forming these FMAs here.
> This patch does that.
>
> The test case uses -mcpu=neoverse-n3 tuning because it uses the width of 6 
> for the fp reassociation cost. The test case in the PR uses neoverse-v2 but I 
> intend to change the reassociation cost for neoverse-v2 in a future patch 
> that would not trigger this ICE.
>
> Bootstrapped and tested on aarch64-none-linux-gnu.
> Ok for trunk?

OK for trunk and affected branches.

Richard.

> Thanks,
> Kyrill
>
> Signed-off-by: Kyrylo Tkachov 
>
> PR tree-optimization/116139
>
> gcc/ChangeLog:
>
> * tree-ssa-reassoc.cc (get_reassociation_width): Move width_mult
> <= width comparison to if condition rather than assert.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/aarch64/pr116139.c: New test.
> ---
>


Re: [PATCH v1] Match: Support form 1 for scalar signed integer .SAT_ADD

2024-08-05 Thread Richard Biener
On Mon, Aug 5, 2024 at 9:14 AM  wrote:
>
> From: Pan Li 
>
> This patch would like to support the form 1 of the scalar signed
> integer .SAT_ADD.  Aka below example:
>
> Form 1:
>   #define DEF_SAT_S_ADD_FMT_1(T) \
>   T __attribute__((noinline))\
>   sat_s_add_##T##_fmt_1 (T x, T y)   \
>   {  \
> T min = (T)1u << (sizeof (T) * 8 - 1);   \
> T max = min - 1; \
> return (x ^ y) < 0   \
>   ? (T)(x + y)   \
>   : ((T)(x + y) ^ x) >= 0\
> ? (T)(x + y) \
> : x < 0 ? min : max; \
>   }
>
> DEF_SAT_S_ADD_FMT_1 (int64_t)
>
> We can tell the difference before and after this patch if backend
> implemented the ssadd3 pattern similar as below.
>
> Before this patch:
> __attribute__((noinline))
> int64_t sat_s_add_int64_t_fmt_1 (int64_t x, int64_t y)
> {
>   long int _1;
>   long int _2;
>   long int _3;
>   int64_t _4;
>   long int _7;
>   _Bool _9;
>   long int _10;
>   long int _11;
>   long int _12;
>   long int _13;
>
> ;;   basic block 2, loop depth 0
> ;;pred:   ENTRY
>   _1 = x_5(D) ^ y_6(D);
>   _13 = x_5(D) + y_6(D);
>   _3 = x_5(D) ^ _13;
>   _2 = ~_1;
>   _7 = _2 & _3;
>   if (_7 >= 0)
>     goto ; [59.00%]
>   else
>     goto ; [41.00%]
> ;;succ:   4
> ;;3
>
> ;;   basic block 3, loop depth 0
> ;;pred:   2
>   _9 = x_5(D) < 0;
>   _10 = (long int) _9;
>   _11 = -_10;
>   _12 = _11 ^ 9223372036854775807;
> ;;succ:   4
>
> ;;   basic block 4, loop depth 0
> ;;pred:   2
> ;;3
>   # _4 = PHI <_13(2), _12(3)>
>   return _4;
> ;;succ:   EXIT
>
> }
>
> After this patch:
> __attribute__((noinline))
> int64_t sat_s_add_int64_t_fmt_1 (int64_t x, int64_t y)
> {
>   int64_t _4;
>
> ;;   basic block 2, loop depth 0
> ;;pred:   ENTRY
>   _4 = .SAT_ADD (x_5(D), y_6(D)); [tail call]
>   return _4;
> ;;succ:   EXIT
>
> }
>
> The below test suites are passed for this patch.
> * The rv64gcv fully regression test.
> * The x86 bootstrap test.
> * The x86 fully regression test.
>
> gcc/ChangeLog:
>
> * match.pd: Add the matching for signed .SAT_ADD.
> * tree-ssa-math-opts.cc (gimple_signed_integer_sat_add): Add new
> matching func decl.
> (match_unsigned_saturation_add): Try signed .SAT_ADD and rename
> to ...
> (match_saturation_add): ... here.
> (math_opts_dom_walker::after_dom_children): Update the above renamed
> func from caller.
>
> Signed-off-by: Pan Li 
> ---
>  gcc/match.pd  | 14 +
>  gcc/tree-ssa-math-opts.cc | 42 ++-
>  2 files changed, 51 insertions(+), 5 deletions(-)
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index c9c8478d286..0a2ffc733d3 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -3311,6 +3311,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>}
>(if (otype_precision < itype_precision && wi::eq_p (trunc_max, int_cst))
>
> +/* Signed saturation add, case 1:
> +   T min = (T)1u << (sizeof (T) * 8 - 1);
> +   T max = min - 1;
> +   SAT_S_ADD = (X ^ Y) < 0
> + ? (X + Y)
> + : ((T)(X + Y) ^ X) >= 0 ? (X + Y) : X < 0 ? min : max.  */
> +(match (signed_integer_sat_add @0 @1)
> +  (cond^ (ge (bit_and:c (bit_xor @0 (convert? @2)) (bit_not (bit_xor @0 @1)))

This matches arbitrary Z in (X ^ (T)Z) & ~(X ^ Y) which cannot be intended.
The GIMPLE IL in the comment below suggests Z == X + Y?

The convert looks odd to me given @0 is involved in both & operands.
The comment above has the same logic error.

I believe the bit_xor lacks :c.
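
For reference, a minimal C sketch of the overflow test the pattern appears to be after, assuming Z is indeed X + Y as the GIMPLE dump suggests.  The function names and the choice of int32_t are illustrative only and not taken from the patch.  The wrapping add is done in unsigned arithmetic to avoid signed-overflow UB, and the conversion back to int32_t relies on GCC's modulo behaviour for that implementation-defined conversion.

```c
#include <stdint.h>

/* Hypothetical helper: signed saturating add written with the
   branchless overflow test discussed above.  */
int32_t
sat_s_add_i32 (int32_t x, int32_t y)
{
  /* Wrapping sum, computed in unsigned to avoid undefined behaviour.
     The conversion back is implementation-defined; GCC wraps.  */
  int32_t sum = (int32_t) ((uint32_t) x + (uint32_t) y);

  /* Overflow happened iff x and y have the same sign (~(x ^ y) has the
     sign bit set) and the sum's sign differs from x's (x ^ sum has the
     sign bit set).  */
  if (((x ^ sum) & ~(x ^ y)) >= 0)
    return sum;

  /* -(x < 0) is all-ones for negative x and zero otherwise, so the XOR
     yields INT32_MIN or INT32_MAX respectively.  */
  return -(int32_t) (x < 0) ^ INT32_MAX;
}

/* Reference version using the GCC/Clang overflow builtin.  */
int32_t
sat_s_add_i32_ref (int32_t x, int32_t y)
{
  int32_t sum;
  if (!__builtin_add_overflow (x, y, &sum))
    return sum;
  return x < 0 ? INT32_MIN : INT32_MAX;
}
```

With Z tied to the sum, both operands of the & refer to the same X, which is why an unconstrained capture in that position looks wrong.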

> +   integer_zerop)

Please indent this to line up with the first operand of the 'ge' to make it
better readable.

> +   (convert? (plus@2 (convert1? @0) (convert1? @1)))

Same with the converts.  The plus needs :c I think.  Is this about
common sign-conversions being hoisted from (int)x + (int)y -> (int)(x+y)?

Note all the :c and conditional converts makes this a quite heavy pattern
(all combinations of swaps and converts gets code).

> +   (bit_xor (negate (convert (lt @0 integer_zerop))) max_value))
> + (if (INTEGRAL_TYPE_P (type) && !TYPE_UNSIGNED (type)
> +  && types_match (type, @0, @1
> +
>  /* x >  y  &&  x != XXX_MIN  -->  x > y
> x >  y  &&  x == XXX_MIN  -->  false . */
>  (for eqne (eq ne)
> diff 

Re: [PATCH v2] RISC-V: xtheadmemidx: Fix mode test for pre/post-modify addressing

2024-08-05 Thread Christoph Müllner
On Thu, Jul 25, 2024 at 5:40 PM Palmer Dabbelt  wrote:
>
> On Thu, 25 Jul 2024 08:37:05 PDT (-0700), christoph.muell...@vrull.eu wrote:
> > On Thu, Jul 25, 2024 at 5:19 PM Palmer Dabbelt  wrote:
> >>
> >> On Thu, 25 Jul 2024 08:10:25 PDT (-0700), jeffreya...@gmail.com wrote:
> >> >
> >> >
> >> > On 7/25/24 9:06 AM, Christoph Müllner wrote:
> >> >> Ok, also to backport to GCC 14?
> >> > Yes, of course.
> >>
> >> I'm OK with that, but according to the latest status report
> >> , we're
> >> between the RC and the release for 14.2 and the homepage is saying
> >> "frozen for release" (thanks to Andrew Pinski for pointing that out).
> >
> > This popped up when I was about to push, so I did not push yet.
> > The last thing I want to do is interfere with the release process,
> > so apologies for pushing the backport for PR116035 yesterday.
> >
> > I will wait with this patch for the GCC 14.2 release, rebase/retest
> > and push then,
> > unless the release manager or maintainers propose another procedure.
>
> That works for me, thanks!

Committed to releases/gcc-14:
  
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=eccf707e5ceb7e405ffe4edfbcae2f769b8386cf


Re: [PATCHv2 2/2] libiberty/buildargv: handle input consisting of only white space

2024-08-05 Thread Andrew Burgess
Jeff Law  writes:

> On 7/29/24 6:51 AM, Andrew Burgess wrote:
>> Thomas Schwinge  writes:
>> 
>>> Hi!
>>>
>>> On 2024-02-10T17:26:01+, Andrew Burgess  wrote:
>>>> --- a/libiberty/argv.c
>>>> +++ b/libiberty/argv.c
>>>
>>>> @@ -439,17 +442,8 @@ expandargv (int *argcp, char ***argvp)
>>>>      }
>>>>    /* Add a NUL terminator.  */
>>>>    buffer[len] = '\0';
>>>> -  /* If the file is empty or contains only whitespace, buildargv would
>>>> -     return a single empty argument.  In this context we want no arguments,
>>>> -     instead.  */
>>>> -  if (only_whitespace (buffer))
>>>> -    {
>>>> -      file_argv = (char **) xmalloc (sizeof (char *));
>>>> -      file_argv[0] = NULL;
>>>> -    }
>>>> -  else
>>>> -    /* Parse the string.  */
>>>> -    file_argv = buildargv (buffer);
>>>> +  /* Parse the string.  */
>>>> +  file_argv = buildargv (buffer);
>>>>    /* If *ARGVP is not already dynamically allocated, copy it.  */
>>>>    if (*argvp == original_argv)
>>>>      *argvp = dupargv (*argvp);
>>>
>>> With that (single) use of 'only_whitespace' now gone:
>>>
>>>  [...]/source-gcc/libiberty/argv.c:128:1: warning: ‘only_whitespace’ defined but not used [-Wunused-function]
>>>128 | only_whitespace (const char* input)
>>>| ^~~
>>>
>> 
>> Sorry about that.
>> 
>> The patch below is the obvious fix.  OK to apply?
> Of course.
> jeff

Pushed.

Thanks,
Andrew
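
For readers without the source at hand, the now-unused helper was a check of roughly the following shape.  This is a sketch based on the removed comment, with a hypothetical name and plain <ctype.h> rather than libiberty's own character macros; it is not copied from argv.c.

```c
#include <ctype.h>

/* Sketch of a whitespace-only test (hypothetical name).  Returns 1 if
   INPUT is empty or consists solely of whitespace, 0 otherwise.  */
int
is_only_whitespace (const char *input)
{
  /* Skip leading whitespace; the cast avoids undefined behaviour for
     negative char values.  */
  while (*input != '\0' && isspace ((unsigned char) *input))
    input++;

  /* If we reached the terminator, nothing but whitespace was seen.  */
  return *input == '\0';
}
```

Once buildargv itself returns an empty argument vector for such input, a separate pre-check like this has no remaining callers, hence the warning.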



Re: [RFC v3 3/3] c: Add __lengthof__() operator

2024-08-05 Thread Alejandro Colomar
Hi Martin,

On Sun, Aug 04, 2024 at 11:39:26AM GMT, Martin Uecker wrote:
> > BTW, I still don't understand what `if (! TYPE_DOMAIN (type))` means,
> > within array_type_nelts_minus_one().  What code triggers that condition?
> > Am I missing error handling for that?  Thanks!
> 
> For incomplete arrays, basically we have the following different
> variants for arrays:
> 
> T[ ] incomplete: !TYPE_DOMAIN 
> T[1] constant size: TYPE_MAX_VALUE == INTEGER_CST
> T[n] variable size: TYPE_MAX_VALUE != INTEGER_CST
> T[0] flexible array member: !TYPE_MAX_VALUE && !C_TYPE_VARIABLE_SIZE
>   (ISO version T[0] has TYPE_SIZE == NULL_TREE)
> T[*] unspecified variable size: !TYPE_MAX_VALUE && C_TYPE_VARIABLE_SIZE

Could you describe the following types?  I've repeated the ones you
already described, deduplicated some that have a different meaning in
different contexts, and added some multi-dimensional arrays.

T[ ] (incomplete type; function parameter)
T[ ] (flexible array member)
T[0] (zero-size array)
T[0] (GNU flexible array member)
T[1] (old flexible array member)
T[7] (constant size)
T[7][n]  (constant size with inner variable size)
T[7][*]  (constant size with inner unspecified size)
T[n] (variable size)
T[*] (unspecified size)

That would help with the [*] issues I'm investigating.  I think
array_type_nelts_minus_one(T[7][*]) is not giving a constant expression,
and I'd like to fix that.
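
To make the list concrete, here is a small C sketch of where some of these declarators can appear and what the outer length is.  NELTS is only a stand-in built on sizeof division, not the proposed operator, and all names are made up for illustration.

```c
#include <stddef.h>

/* Stand-in for the proposed operator: only valid on real arrays, not
   on pointers or adjusted parameters.  */
#define NELTS(a) (sizeof (a) / sizeof ((a)[0]))

/* T[ ] as a function parameter is adjusted to T *.  */
void takes_incomplete (int a[]);

/* T[7][*] is only allowed in a prototype that is not a definition.  */
void takes_star (int n, int a[7][*]);

/* T[ ] as the last member of a struct is a flexible array member.  */
struct fam
{
  size_t n;
  int data[];
};

/* T[7][4]: the outer length is the constant 7.  */
size_t
len_const_const (void)
{
  int a[7][4];
  return NELTS (a);
}

/* T[7][n]: the inner dimension is variable, the outer one is still 7.  */
size_t
len_const_var (int n)
{
  int a[7][n];
  return NELTS (a);
}

/* T[n][7]: the outer length is only known at run time.  */
size_t
len_var_const (int n)
{
  int a[n][7];
  return NELTS (a);
}
```

In the T[7][n] case the value is always 7 even though sizeof has to evaluate a variably modified type, which is the situation I would like to yield a constant expression.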

Have a lovely day!
Alex


Re: [RFC v3 3/3] c: Add __lengthof__() operator

2024-08-05 Thread Alejandro Colomar
On Mon, Aug 05, 2024 at 01:55:50PM GMT, Alejandro Colomar wrote:
> Hi Martin,
> 
> On Sun, Aug 04, 2024 at 11:39:26AM GMT, Martin Uecker wrote:
> > > BTW, I still don't understand what `if (! TYPE_DOMAIN (type))` means,
> > > within array_type_nelts_minus_one().  What code triggers that condition?
> > > Am I missing error handling for that?  Thanks!
> > 
> > For incomplete arrays, basically we have the following different
> > variants for arrays:
> > 
> > T[ ] incomplete: !TYPE_DOMAIN 
> > T[1] constant size: TYPE_MAX_VALUE == INTEGER_CST
> > T[n] variable size: TYPE_MAX_VALUE != INTEGER_CST
> > T[0] flexible array member: !TYPE_MAX_VALUE && !C_TYPE_VARIABLE_SIZE
> >   (ISO version T[0] has TYPE_SIZE == NULL_TREE)
> > T[*] unspecified variable size: !TYPE_MAX_VALUE && C_TYPE_VARIABLE_SIZE
> 
> Could you describe the following types?  I've repeated the ones you
> already described, deduplicated some that have a different meaning in
> different contexts, and added some multi-dimensional arrays.
> 
> T[ ] (incomplete type; function parameter)
> T[ ] (flexible array member)
> T[0] (zero-size array)
> T[0] (GNU flexible array member)
> T[1] (old flexible array member)
> T[7] (constant size)
> T[7][n]  (constant size with inner variable size)
> T[7][*]  (constant size with inner unspecified size)

And please also describe T[7][4], although I expect that to be just the
same as T[7].

> T[n] (variable size)
> T[*] (unspecified size)
> 
> That would help with the [*] issues I'm investigating.  I think
> array_type_nelts_minus_one(T[7][*]) is not giving a constant expression,
> and I'd like to fix that.
> 
> Have a lovely day!
> Alex
> 
> -- 
> 




Re: [RFC v3 3/3] c: Add __lengthof__() operator

2024-08-05 Thread Alejandro Colomar
On Mon, Aug 05, 2024 at 01:57:35PM GMT, Alejandro Colomar wrote:
> On Mon, Aug 05, 2024 at 01:55:50PM GMT, Alejandro Colomar wrote:
> > Hi Martin,
> > 
> > On Sun, Aug 04, 2024 at 11:39:26AM GMT, Martin Uecker wrote:
> > > > BTW, I still don't understand what `if (! TYPE_DOMAIN (type))` means,
> > > > within array_type_nelts_minus_one().  What code triggers that condition?
> > > > Am I missing error handling for that?  Thanks!
> > > 
> > > For incomplete arrays, basically we have the following different
> > > variants for arrays:
> > > 
> > > T[ ] incomplete: !TYPE_DOMAIN 
> > > T[1] constant size: TYPE_MAX_VALUE == INTEGER_CST
> > > T[n] variable size: TYPE_MAX_VALUE != INTEGER_CST
> > > T[0] flexible array member: !TYPE_MAX_VALUE && !C_TYPE_VARIABLE_SIZE
> > >   (ISO version T[0] has TYPE_SIZE == NULL_TREE)
> > > T[*] unspecified variable size: !TYPE_MAX_VALUE && C_TYPE_VARIABLE_SIZE
> > 
> > Could you describe the following types?  I've repeated the ones you
> > already described, deduplicated some that have a different meaning in
> > different contexts, and added some multi-dimensional arrays.
> > 
> > T[ ] (incomplete type; function parameter)
> > T[ ] (flexible array member)
> > T[0] (zero-size array)
> > T[0] (GNU flexible array member)
> > T[1] (old flexible array member)
> > T[7] (constant size)
> > T[7][n]  (constant size with inner variable size)
> > T[7][*]  (constant size with inner unspecified size)
> 
> And please also describe T[7][4], although I expect that to be just the
> same as T[7].

And it would also be interesting to describe T[7][ ].

> 
> > T[n] (variable size)
> > T[*] (unspecified size)
> > 
> > That would help with the [*] issues I'm investigating.  I think
> > array_type_nelts_minus_one(T[7][*]) is not giving a constant expression,
> > and I'd like to fix that.
> > 
> > Have a lovely day!
> > Alex
> > 
> > -- 
> > 
> 
> 
> 
> -- 
> 




Re: [RFC v3 3/3] c: Add __lengthof__() operator

2024-08-05 Thread Alejandro Colomar
On Mon, Aug 05, 2024 at 01:58:18PM GMT, Alejandro Colomar wrote:
> On Mon, Aug 05, 2024 at 01:57:35PM GMT, Alejandro Colomar wrote:
> > On Mon, Aug 05, 2024 at 01:55:50PM GMT, Alejandro Colomar wrote:
> > > Hi Martin,
> > > 
> > > On Sun, Aug 04, 2024 at 11:39:26AM GMT, Martin Uecker wrote:
> > > > > BTW, I still don't understand what `if (! TYPE_DOMAIN (type))` means,
> > > > > within array_type_nelts_minus_one().  What code triggers that 
> > > > > condition?
> > > > > Am I missing error handling for that?  Thanks!
> > > > 
> > > > For incomplete arrays, basically we have the following different
> > > > variants for arrays:
> > > > 
> > > > T[ ] incomplete: !TYPE_DOMAIN 
> > > > T[1] constant size: TYPE_MAX_VALUE == INTEGER_CST
> > > > T[n] variable size: TYPE_MAX_VALUE != INTEGER_CST
> > > > T[0] flexible array member: !TYPE_MAX_VALUE && !C_TYPE_VARIABLE_SIZE
> > > >   (ISO version T[0] has TYPE_SIZE == NULL_TREE)
> > > > T[*] unspecified variable size: !TYPE_MAX_VALUE && C_TYPE_VARIABLE_SIZE
> > > 
> > > Could you describe the following types?  I've repeated the ones you
> > > already described, deduplicated some that have a different meaning in
> > > different contexts, and added some multi-dimensional arrays.
> > > 
> > > T[ ] (incomplete type; function parameter)
> > > T[ ] (flexible array member)
> > > T[0] (zero-size array)
> > > T[0] (GNU flexible array member)
> > > T[1] (old flexible array member)
> > > T[7] (constant size)
> > > T[7][n]  (constant size with inner variable size)
> > > T[7][*]  (constant size with inner unspecified size)
> > 
> > And please also describe T[7][4], although I expect that to be just the
> > same as T[7].
> 
> And it would also be interesting to describe T[7][ ].

And maybe also:

T[n][m]
T[n][*]
T[n][ ]
T[n][7]

> 
> > 
> > > T[n] (variable size)
> > > T[*] (unspecified size)
> > > 
> > > That would help with the [*] issues I'm investigating.  I think
> > > array_type_nelts_minus_one(T[7][*]) is not giving a constant expression,
> > > and I'd like to fix that.
> > > 
> > > Have a lovely day!
> > > Alex
> > > 
> > > -- 
> > > 
> > 
> > 
> > 
> > -- 
> > 
> 
> 
> 
> -- 
> 




Re: [PATCH] fortran: Fix a pasto in gfc_check_dependency

2024-08-05 Thread Mikael Morin

On 05/08/2024 at 10:59, Jakub Jelinek wrote:

On Fri, Aug 02, 2024 at 06:29:27PM +0200, Mikael Morin wrote:

I agree with all of that.  Sure keeping the condition around would be the
safest.  I'm just afraid of keeping code that would remain dead.


And the pasto fix would guess fix
aliasing_dummy_5.f90 with
   arg(2:3) = arr(1:2)
instead of
   arr(2:3) = arg(1:2)
if the original testcase would actually fail.


Mmh, aren't they both actually the same?



They can alias, and they do alias.  So in the end, writing either line is
equivalent, what do I miss?


So, I had another look.  Seems the reason why the testcase passes is that
gfc_could_be_alias (called from gfc_conv_resolve_dependencies) returns true,
so the assignment goes through a temporary array.
gfc_check_dependency is then only called for
   if (lhs->rank > 0 && gfc_check_dependency (lhs, rhs, true) == 0)
 optimize_binop_array_assignment (c, &rhs, false);


gfc_check_dependency is also used for WHERE and FORALL.


Guess the question is if one can construct a testcase where it would make a
difference.

The one I made up for PR116196 is a candidate, or can serve as base; I 
think it's valid, but a bit twisted.




Re: [PATCH v2] Hard register constraints

2024-08-05 Thread Georg-Johann Lay

On 05.08.24 at 12:28, Stefan Schulze Frielinghaus wrote:

This is a follow-up of
https://gcc.gnu.org/pipermail/gcc-patches/2024-June/654013.html

What has changed?

- Rebased and fixed an issue in constrain_operands which manifested
after late-combine.

- Introduced new test cases for Arm, Intel, POWER, RISCV, S/390 for 32-
and 64-bit where appropriate (including register pairs etc.).  Test
gcc.dg/asm-hard-reg-7.c is a bit controversial since I'm testing for an
anti feature here, i.e., I'm testing for register asm in conjunction
with calls.  I'm fine with removing it in the end but I wanted to keep
it in for demonstration purposes at least during discussion of this
patch.

- Split test pr87600-2.c into pr87600-2.c and pr87600-3.c since test0
errors out early, now.  Otherwise, the remaining errors would not be
reported.  Beside that the error message has slightly changed.

- Modified genoutput.cc in order to allow hard register constraints in
machine descriptions.  For example, on s390 the instruction mvcrl makes


As I already said, such a feature would be great.  Some questions:

Which pass is satisfying that constraint? AFAIK for local reg vars,
it is asmcons, but for register constraints in md it is the register
allocator.

The avr backend has many insns that use explicit hard regs in order to
model some libcalls (ones with footprints smaller than ABI, or that
deviate from the ABI).  A proper way would be to add a register
constraint for each possible hard reg, e.g. R20_1 for QImode in R20,
R20_2 for HImode in R20, etc.  This would require a dozen or more
new register classes, and the problem with that is that register
allocation produces less efficient code even for cases that do
not use these new constraints.  So I gave up that approach.

How does your feature work? Does it imply that for each hreg
constraint there must be an according register class?

Obviously local reg vars don't require respective reg classes,
so I thought about representing such insns as asm_input or
whatever, but that's pure hack and would never pass a review...


use of the implicit register r0 which we currently deal with as follows:

(define_insn "*mvcrl"
   [(set (match_operand:BLK 0 "memory_operand" "=Q")
(unspec:BLK [(match_operand:BLK 1 "memory_operand" "Q")
 (reg:SI GPR0_REGNUM)]
UNSPEC_MVCRL))]
   "TARGET_Z15"
   "mvcrl\t%0,%1"
   [(set_attr "op_type" "SSE")])

(define_expand "mvcrl"
   [(set (reg:SI GPR0_REGNUM) (match_operand:SI 2 "general_operand"))
(set (match_operand:BLK 0 "memory_operand" "=Q")
(unspec:BLK [(match_operand:BLK 1 "memory_operand" "Q")
 (reg:SI GPR0_REGNUM)]
UNSPEC_MVCRL))]
   "TARGET_Z15"
   "")

In the expander we ensure that GPR0 is setup correctly.  With this patch
we could simply write

(define_insn "mvcrl"
   [(set (match_operand:BLK 0 "memory_operand" "=Q")
 (unspec:BLK [(match_operand:BLK 1 "memory_operand" "Q")
  (match_operand:SI 2 "general_operand" "{r0}")]
 UNSPEC_MVCRL))]
   "TARGET_Z15"
   "mvcrl\t%0,%1"
   [(set_attr "op_type" "SSE")])

What I dislike is that I didn't find a way to verify hard register names


Are plain register numbers also supported? Like "{0}" ?
(Provided regno(r0) == 0).


during genoutput, i.e., ensuring that the name is valid after all.  This
is due to the fact how reg_names is defined which cannot be accessed by
genoutput.  The same holds true for REGISTER_NAMES et al. which may
reference some target specific variable (see e.g. POWER).  Thus, in case
of an invalid register name in a machine description file we do not
end-up with a genoutput-time error but instead fail at run-time in
process_alt_operands():

case '{':
{
  int regno = parse_constraint_regname (p);
  gcc_assert (regno >= 0);
  cl = REGNO_REG_CLASS (regno);
  CLEAR_HARD_REG_SET (hregset);
  SET_HARD_REG_BIT (hregset, regno);


Is this correct when hard_regno_nregs(regno) > 1,
i.e. when the register occupies more than one hard register?


  cl_filter = &hregset;
  goto reg;
}

This is rather unfortunate but I couldn't find a way how to validate
register names during genoutput.  If no one else has an idea I will
replace gcc_assert with a more expressive error message.


[ADDITIONAL_]REGISTER_NAMES isn't available?  Though using that might
bypass the effect of target hooks like TARGET_CONDITIONAL_REGISTER_USAGE.

But there are also cases with an asm operand print modifier; you cannot
check that, it's checked by TARGET_PRINT_OPERAND etc. which get a
hard register and not a string for a register name.

Maybe genoutput could add additional information to insn-output.cc or
whatever, and the compiler proper checks that and emits diagnostics
as needed?
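
The idea of diagnosing a bad name instead of asserting can be sketched in isolation.  Everything below is hypothetical: the toy register table, the function name, and in particular the acceptance of plain numbers like "{0}", which is only an assumption about what the syntax might allow.

```c
#include <stdlib.h>
#include <string.h>

/* Toy register table; a real target's names come from REGISTER_NAMES.  */
static const char *const toy_reg_names[] = { "r0", "r1", "r2", "r3" };
#define TOY_NUM_REGS (sizeof toy_reg_names / sizeof toy_reg_names[0])

/* Hypothetical parser for a "{name}" constraint.  Returns the register
   number, or -1 if the constraint is malformed or the name is unknown,
   so that the caller can emit a diagnostic instead of asserting.  */
int
toy_parse_hard_reg (const char *constraint)
{
  if (constraint[0] != '{')
    return -1;
  const char *end = strchr (constraint, '}');
  if (end == NULL || end[1] != '\0')
    return -1;

  const char *name = constraint + 1;
  size_t len = (size_t) (end - name);
  if (len == 0)
    return -1;

  /* Try symbolic names first.  */
  for (size_t i = 0; i < TOY_NUM_REGS; i++)
    if (strlen (toy_reg_names[i]) == len
        && strncmp (toy_reg_names[i], name, len) == 0)
      return (int) i;

  /* Then a plain decimal register number (assumed extension).  */
  if (name[0] >= '0' && name[0] <= '9')
    {
      char *num_end;
      long regno = strtol (name, &num_end, 10);
      if (num_end == end && regno >= 0 && regno < (long) TOY_NUM_REGS)
        return (int) regno;
    }

  return -1;
}
```

A caller would turn the -1 into an error mentioning the offending constraint string, which is more helpful to a backend author than an ICE.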


What's next?

I was thinking about replacing register asm with the new hard register
constraint.  This would solve problems like demon

[MAINTAINERS] Add my email address to write after approval and DCO.

2024-08-05 Thread Jennifer Schmitz
Add my email address to write after approval and DCO.
Pushed to trunk: 219b09215f530e4a4a3763746986b7068e00f000

Signed-off-by: Jennifer Schmitz <jschm...@nvidia.com>

ChangeLog:
* MAINTAINERS: Add myself.



[PATCH] testsuite: Add RISC-V to targets not xfailing gcc.dg/attr-alloc_size-11.c:50, 51.

2024-08-05 Thread Jiawei
The test has been observed to pass on most architectures, including RISC-V:
https://godbolt.org/z/8nYEvW6n1

For the original issue, see:
https://gcc.gnu.org/PR79356#c11

Add the RISC-V target to the pass list.

gcc/testsuite/ChangeLog:

* gcc.dg/attr-alloc_size-11.c: Add RISC-V to the list
of targets excluding xfail on lines 50 and 51.

---
 gcc/testsuite/gcc.dg/attr-alloc_size-11.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/attr-alloc_size-11.c b/gcc/testsuite/gcc.dg/attr-alloc_size-11.c
index a2efe128915..6346d5e084b 100644
--- a/gcc/testsuite/gcc.dg/attr-alloc_size-11.c
+++ b/gcc/testsuite/gcc.dg/attr-alloc_size-11.c
@@ -47,8 +47,8 @@ typedef __SIZE_TYPE__ size_t;
 
 /* The following tests fail because of missing range information.  The xfail
exclusions are PR79356.  */
-TEST (signed char, SCHAR_MIN + 2, ALLOC_MAX);   /* { dg-warning "argument 1 range \\\[13, \[0-9\]+\\\] exceeds maximum object size 12" "missing range info for signed char" { xfail { ! { aarch64*-*-* arm*-*-* avr-*-* alpha*-*-* cris-*-* ia64-*-* mips*-*-* or1k*-*-* pdp11*-*-* powerpc*-*-* sparc*-*-* s390*-*-* visium-*-* msp430-*-* nvptx*-*-*} } } } */
-TEST (short, SHRT_MIN + 2, ALLOC_MAX); /* { dg-warning "argument 1 range \\\[13, \[0-9\]+\\\] exceeds maximum object size 12" "missing range info for short" { xfail { ! { aarch64*-*-* arm*-*-* alpha*-*-* avr-*-* cris-*-* ia64-*-* mips*-*-* or1k*-*-* pdp11*-*-* powerpc*-*-* sparc*-*-* s390x-*-* visium-*-* msp430-*-* nvptx*-*-* } } } } */
+TEST (signed char, SCHAR_MIN + 2, ALLOC_MAX);   /* { dg-warning "argument 1 range \\\[13, \[0-9\]+\\\] exceeds maximum object size 12" "missing range info for signed char" { xfail { ! { aarch64*-*-* arm*-*-* avr-*-* alpha*-*-* cris-*-* ia64-*-* mips*-*-* or1k*-*-* pdp11*-*-* powerpc*-*-* riscv*-*-* sparc*-*-* s390*-*-* visium-*-* msp430-*-* nvptx*-*-*} } } } */
+TEST (short, SHRT_MIN + 2, ALLOC_MAX); /* { dg-warning "argument 1 range \\\[13, \[0-9\]+\\\] exceeds maximum object size 12" "missing range info for short" { xfail { ! { aarch64*-*-* arm*-*-* alpha*-*-* avr-*-* cris-*-* ia64-*-* mips*-*-* or1k*-*-* pdp11*-*-* powerpc*-*-* riscv*-*-* sparc*-*-* s390x-*-* visium-*-* msp430-*-* nvptx*-*-* } } } } */
 TEST (int, INT_MIN + 2, ALLOC_MAX);/* { dg-warning "argument 1 range \\\[13, \[0-9\]+\\\] exceeds maximum object size 12" } */
 TEST (int, -3, ALLOC_MAX); /* { dg-warning "argument 1 range \\\[13, \[0-9\]+\\\] exceeds maximum object size 12" } */
 TEST (int, -2, ALLOC_MAX); /* { dg-warning "argument 1 range \\\[13, \[0-9\]+\\\] exceeds maximum object size 12" } */
-- 
2.25.1



[COMMITED] gimple ssa: Fix a typo in gimple-ssa-sccopy.cc

2024-08-05 Thread Filip Kastl
Hello,

I just committed this as obvious.

Filip Kastl

-- 8< --

Fixes a misplaced comment in gimple-ssa-sccopy.cc.  The comment belongs
to a bitmap definition but was instead placed before the beginning of a
namespace block.

gcc/ChangeLog:

* gimple-ssa-sccopy.cc: Move a misplaced comment.

Signed-off-by: Filip Kastl 
---
 gcc/gimple-ssa-sccopy.cc | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/gimple-ssa-sccopy.cc b/gcc/gimple-ssa-sccopy.cc
index 138ee9a0ac4..191a4c0b451 100644
--- a/gcc/gimple-ssa-sccopy.cc
+++ b/gcc/gimple-ssa-sccopy.cc
@@ -92,10 +92,11 @@ along with GCC; see the file COPYING3.  If not see
  Braun, Buchwald, Hack, Leissa, Mallon, Zwinkau, 2013, LNCS vol. 7791,
  Section 3.2.  */
 
+namespace {
+
 /* Bitmap tracking statements which were propagated to be removed at the end of
the pass.  */
 
-namespace {
 static bitmap dead_stmts;
 
 /* State of vertex during SCC discovery.
-- 
2.45.2



Re: [PATCH] AArch64: Set instruction attribute of TST to logics_imm

2024-08-05 Thread Jennifer Schmitz
Pushed to trunk: 7268d7249b3ca31bf322de99b1d59baf06f83eb3

> On 30 Jul 2024, at 13:39, Richard Sandiford  wrote:
> 
> External email: Use caution opening links or attachments
> 
> 
> Jennifer Schmitz <jschm...@nvidia.com> writes:
>> As suggested in
>> https://gcc.gnu.org/pipermail/gcc-patches/2024-July/658249.html,
>> this patch changes the instruction attribute of "*and_compare0" (TST) 
>> from
>> alus_imm to logics_imm.
>> 
>> The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
>> OK for mainline?
>> 
>> Signed-off-by: Jennifer Schmitz
>> 
>> gcc/
>> 
>>  * config/aarch64/aarch64.md (*and_compare0): Change attribute.
> 
> OK, thanks.
> 
> Richard
> 
>> 
>> From e643211edd212276ddeef87136932da4aa14837c Mon Sep 17 00:00:00 2001
>> From: Jennifer Schmitz <jschm...@nvidia.com>
>> Date: Mon, 29 Jul 2024 07:59:33 -0700
>> Subject: [PATCH] AArch64: Set instruction attribute of TST to logics_imm
>> 
>> As suggested in
>> https://gcc.gnu.org/pipermail/gcc-patches/2024-July/658249.html,
>> this patch changes the instruction attribute of "*and_compare0" (TST) 
>> from
>> alus_imm to logics_imm.
>> 
>> The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
>> OK for mainline?
>> 
>> Signed-off-by: Jennifer Schmitz 
>> 
>> gcc/
>> 
>>  * config/aarch64/aarch64.md (*and_compare0): Change attribute.
>> ---
>> gcc/config/aarch64/aarch64.md | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
>> index ed29127dafb..734a21268dc 100644
>> --- a/gcc/config/aarch64/aarch64.md
>> +++ b/gcc/config/aarch64/aarch64.md
>> @@ -5398,7 +5398,7 @@
>>   (const_int 0)))]
>>   ""
>>   "tst\\t%0, "
>> -  [(set_attr "type" "alus_imm")]
>> +  [(set_attr "type" "logics_imm")]
>> )
>> 
>> (define_insn "*ands_compare0"




RE: [PATCH v1] Match: Add type_has_mode_precision_p check for SAT_TRUNC [PR116202]

2024-08-05 Thread Li, Pan2
> Isn't that now handled by the direct_internal_fn_supported_p check?  That is,
> by the caller which needs to verify the matched operation is supported by
> the target?

type_strictly_matches_mode_p doesn't help here (including the uncommitted one).
It will hit the case below and return true directly, as TYPE_MODE (type) is
E_RVVM1QImode.

  if (VECTOR_TYPE_P (type))
    return VECTOR_MODE_P (TYPE_MODE (type));

It also looks like we cannot use TREE_PRECISION on a vector type here, the way
type_has_mode_precision_p does for scalar types.  Thus, I added the check to
the matching.

It looks like we need to take care of vectors in type_strictly_matches_mode_p,
right?
Pan

-----Original Message-----
From: Richard Biener  
Sent: Monday, August 5, 2024 7:02 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; 
jeffreya...@gmail.com; rdapp@gmail.com
Subject: Re: [PATCH v1] Match: Add type_has_mode_precision_p check for 
SAT_TRUNC [PR116202]

On Sun, Aug 4, 2024 at 1:47 PM  wrote:
>
> From: Pan Li 
>
> The .SAT_TRUNC matching can only perform the type has its mode
> precision.
>
> g_12 = (long unsigned int) _2;
> _13 = MIN_EXPR ;
> _3 = (_Bool) _13;
>
> The above pattern cannot be recog as .SAT_TRUNC (g_12) because the dest
> only has 1 bit precision but QImode.  Aka the type doesn't have the mode
> precision.  Thus,  add the type_has_mode_precision_p for the dest to
> avoid such case.
>
> The below tests are passed for this patch.
> 1. The rv64gcv fully regression tests.
> 2. The x86 bootstrap tests.
> 3. The x86 fully regression tests.

Isn't that now handled by the direct_internal_fn_supported_p check?  That is,
by the caller which needs to verify the matched operation is supported by
the target?

> PR target/116202
>
> gcc/ChangeLog:
>
> * match.pd: Add type_has_mode_precision_p for the dest type
> of the .SAT_TRUNC matching.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/pr116202-run-1.c: New test.
>
> Signed-off-by: Pan Li 
> ---
>  gcc/match.pd  |  6 +++--
>  .../riscv/rvv/base/pr116202-run-1.c   | 24 +++
>  2 files changed, 28 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr116202-run-1.c
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index c9c8478d286..dfa0bba3908 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -3283,7 +3283,8 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> wide_int trunc_max = wi::mask (otype_precision, false, itype_precision);
> wide_int int_cst = wi::to_wide (@1, itype_precision);
>}
> -  (if (otype_precision < itype_precision && wi::eq_p (trunc_max, int_cst))
> +  (if (type_has_mode_precision_p (type) && otype_precision < itype_precision
> +   && wi::eq_p (trunc_max, int_cst))
>
>  /* Unsigned saturation truncate, case 2, sizeof (WT) > sizeof (NT).
> SAT_U_TRUNC = (NT)(MIN_EXPR (X, 255)).  */
> @@ -3309,7 +3310,8 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> wide_int trunc_max = wi::mask (otype_precision, false, itype_precision);
> wide_int int_cst = wi::to_wide (@1, itype_precision);
>}
> -  (if (otype_precision < itype_precision && wi::eq_p (trunc_max, 
> int_cst))
> +  (if (type_has_mode_precision_p (type) && otype_precision < itype_precision
> +   && wi::eq_p (trunc_max, int_cst))
>
>  /* x >  y  &&  x != XXX_MIN  -->  x > y
> x >  y  &&  x == XXX_MIN  -->  false . */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr116202-run-1.c 
> b/gcc/testsuite/gcc.target/riscv/rvv/base/pr116202-run-1.c
> new file mode 100644
> index 000..d150f20b5d9
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr116202-run-1.c
> @@ -0,0 +1,24 @@
> +/* { dg-do run } */
> +/* { dg-options "-O3 -march=rv64gcv_zvl256b -fdump-rtl-expand-details" } */
> +
> +int b[24];
> +_Bool c[24];
> +
> +int main() {
> +  for (int f = 0; f < 4; ++f)
> +b[f] = 6;
> +
> +  for (int f = 0; f < 24; f += 4)
> +c[f] = ({
> +  int g = ({
> +unsigned long g = -b[f];
> +1 < g ? 1 : g;
> +  });
> +  g;
> +});
> +
> +  if (c[0] != 1)
> +__builtin_abort ();
> +}
> +
> +/* { dg-final { scan-rtl-dump-not ".SAT_TRUNC " "expand" } } */
> --
> 2.43.0
>


[PATCH] c++/modules: Fix merging of GM entities in partitions [PR114950]

2024-08-05 Thread Nathaniel Shead
Bootstrapped and regtested (so far just modules.exp) on
x86_64-pc-linux-gnu, OK for trunk if full regtest passes?

-- >8 --

Currently name lookup generally seems to assume that all entities
declared within a named module (partition) are attached to said module,
which is not true for GM entities (e.g. via extern "C++"), and causes
issues with deduplication.

This patch fixes the issue by ensuring that module attachment of a
declaration is consistently used to handle merging.  Handling this
exposes some issues with deduplicating temploid friends; to resolve this
we always create the BINDING_SLOT_PARTITION slot so that we have
somewhere to place attached names (from any module).

PR c++/114950

gcc/cp/ChangeLog:

* module.cc (trees_out::decl_value): Stream bit indicating
imported temploid friends early.
(trees_in::decl_value): Use this bit with key_mergeable.
(trees_in::key_mergeable): Allow merging attached declarations
if they're imported temploid friends.
(module_state::read_cluster): Check for GM entities that may
require merging even when importing from partitions.
* name-lookup.cc (enum binding_slots): Adjust comment.
(get_fixed_binding_slot): Always create partition slot.
(name_lookup::search_namespace_only): Support binding vectors
with both partition and GM entities to dedup.
(walk_module_binding): Likewise.
(name_lookup::adl_namespace_fns): Likewise.
(set_module_binding): Likewise.
(check_module_override): Use attachment of the decl when
checking overrides rather than named_module_p.
(lookup_imported_hidden_friend): Use partition slot for finding
mergeable template bindings.
* name-lookup.h (set_module_binding): Split mod_glob_flag
parameter into separate global_p and partition_p params.

gcc/testsuite/ChangeLog:

* g++.dg/modules/tpl-friend-13_e.C: Adjust error message.
* g++.dg/modules/ambig-2_a.C: New test.
* g++.dg/modules/ambig-2_b.C: New test.
* g++.dg/modules/part-9_a.C: New test.
* g++.dg/modules/part-9_b.C: New test.
* g++.dg/modules/part-9_c.C: New test.
* g++.dg/modules/tpl-friend-15.h: New test.
* g++.dg/modules/tpl-friend-15_a.C: New test.
* g++.dg/modules/tpl-friend-15_b.C: New test.
* g++.dg/modules/tpl-friend-15_c.C: New test.

Signed-off-by: Nathaniel Shead 
---
 gcc/cp/module.cc  | 55 ++--
 gcc/cp/name-lookup.cc | 65 ++-
 gcc/cp/name-lookup.h  |  2 +-
 gcc/testsuite/g++.dg/modules/ambig-2_a.C  |  7 ++
 gcc/testsuite/g++.dg/modules/ambig-2_b.C  | 10 +++
 gcc/testsuite/g++.dg/modules/part-9_a.C   | 10 +++
 gcc/testsuite/g++.dg/modules/part-9_b.C   | 10 +++
 gcc/testsuite/g++.dg/modules/part-9_c.C   |  8 +++
 .../g++.dg/modules/tpl-friend-13_e.C  |  4 +-
 gcc/testsuite/g++.dg/modules/tpl-friend-15.h  | 11 
 .../g++.dg/modules/tpl-friend-15_a.C  |  8 +++
 .../g++.dg/modules/tpl-friend-15_b.C  |  8 +++
 .../g++.dg/modules/tpl-friend-15_c.C  |  7 ++
 13 files changed, 150 insertions(+), 55 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/modules/ambig-2_a.C
 create mode 100644 gcc/testsuite/g++.dg/modules/ambig-2_b.C
 create mode 100644 gcc/testsuite/g++.dg/modules/part-9_a.C
 create mode 100644 gcc/testsuite/g++.dg/modules/part-9_b.C
 create mode 100644 gcc/testsuite/g++.dg/modules/part-9_c.C
 create mode 100644 gcc/testsuite/g++.dg/modules/tpl-friend-15.h
 create mode 100644 gcc/testsuite/g++.dg/modules/tpl-friend-15_a.C
 create mode 100644 gcc/testsuite/g++.dg/modules/tpl-friend-15_b.C
 create mode 100644 gcc/testsuite/g++.dg/modules/tpl-friend-15_c.C

diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
index d1607a06757..e6b569ebca5 100644
--- a/gcc/cp/module.cc
+++ b/gcc/cp/module.cc
@@ -2955,7 +2955,8 @@ private:
 public:
   tree decl_container ();
   tree key_mergeable (int tag, merge_kind, tree decl, tree inner, tree type,
- tree container, bool is_attached);
+ tree container, bool is_attached,
+ bool is_imported_temploid_friend);
   unsigned binfo_mergeable (tree *);
 
 private:
@@ -7803,6 +7804,7 @@ trees_out::decl_value (tree decl, depset *dep)
   || !TYPE_PTRMEMFUNC_P (TREE_TYPE (decl)));
 
   merge_kind mk = get_merge_kind (decl, dep);
+  bool is_imported_temploid_friend = imported_temploid_friends->get (decl);
 
   if (CHECKING_P)
 {
@@ -7838,13 +7840,11 @@ trees_out::decl_value (tree decl, depset *dep)
  && DECL_MODULE_ATTACH_P (not_tmpl))
is_attached = true;
 
- /* But don't consider imported temploid friends as attached,
-since importers will need to merge this decl even if it was
-attached to a different mod

Re: [x86_64 PATCH] Refactor V2DI arithmetic right shift expansion for STV.

2024-08-05 Thread Uros Bizjak
On Mon, Aug 5, 2024 at 12:22 PM Roger Sayle  wrote:
>
>
> This patch refactors ashrv2di RTL expansion into a function so that it may
> be reused by a pre-reload splitter, such that DImode right shifts may be
> considered candidates during the Scalar-To-Vector (STV) pass.  Currently
> DImode arithmetic right shifts are not considered potential candidates
> during STV, so for the following testcase:
>
> long long m;
> typedef long long v2di __attribute__((vector_size (16)));
> void foo(v2di x) { m = x[0]>>63; }
>
> We currently see the following warning/error during STV2
> >  r101 use in insn 7 isn't convertible
>
> And end up generating scalar code with an interunit move:
>
> foo:movq%xmm0, %rax
> sarq$63, %rax
> movq%rax, m(%rip)
> ret
>
> With this patch, we can reuse the RTL expansion logic and produce:
>
> foo:psrad   $31, %xmm0
> pshufd  $245, %xmm0, %xmm0
> movq%xmm0, m(%rip)
> ret
>
> Or with the addition of -mavx2, the equivalent:
>
> foo:vpxor   %xmm1, %xmm1, %xmm1
> vpcmpgtq%xmm0, %xmm1, %xmm0
> vmovq   %xmm0, m(%rip)
> ret
>
>
> The only design decision of note is the choice to continue lowering V2DI
> into vector sequences during RTL expansion, to enable combine to optimize
> things if possible.  Using just define_insn_and_split potentially misses
> optimizations, such as reusing the zero vector produced by vpxor above.
> It may be necessary to tweak STV's compute gain at some point, but this
> patch controls what's possible (rather than what's beneficial).
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32}
> with no new failures.  Ok for mainline?
>
> 2024-08-05  Roger Sayle  
>
> gcc/ChangeLog
> * config/i386/i386-expand.cc (ix86_expand_v2di_ashiftrt): New
> function refactored from define_expand ashrv2di3.
> * config/i386/i386-features.cc
> (general_scalar_to_vector_candidate_p)
> : Handle like other shifts and rotates.
> * config/i386/i386-protos.h (ix86_expand_v2di_ashiftrt): Prototype.
> * config/i386/sse.md (ashrv2di3): Call ix86_expand_v2di_ashiftrt.
> (*ashrv2di3): New define_insn_and_split to enable creation by stv2
> pass, and splitting during split1 reusing ix86_expand_v2di_ashiftrt.
>
> gcc/testsuite/ChangeLog
> * gcc.target/i386/sse2-stv-2.c: New test case.

LGTM.

Thanks,
Uros.
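
For readers following the generated code quoted above: the compare-based
sequence can stand in for the shift because an arithmetic right shift of a
64-bit value by 63 produces an all-ones or all-zeros mask, which is also what
negating the result of a signed "0 > x" comparison produces.  A minimal scalar
sketch in C follows.  The function names are made up for illustration, and
right-shifting a negative value is implementation-defined in ISO C (GCC
documents it as an arithmetic shift), so this is an assumption about the
compiler, not a portable guarantee.

```c
#include <assert.h>
#include <stdint.h>

/* Sign mask via arithmetic right shift, as in "m = x[0] >> 63".
   Assumes the compiler implements >> on negative values as an
   arithmetic shift (true for GCC).  */
int64_t sign_mask_by_shift (int64_t x)
{
  return x >> 63;
}

/* Same mask via a signed comparison against zero: (0 > x) is 1 or 0,
   and negating it gives -1 (all ones) or 0.  */
int64_t sign_mask_by_compare (int64_t x)
{
  return -(int64_t) (0 > x);
}

/* Hypothetical self-check over a few representative inputs.  */
void sign_mask_selfcheck (void)
{
  const int64_t samples[] = { INT64_MIN, -5, -1, 0, 1, 7, INT64_MAX };
  for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
    assert (sign_mask_by_shift (samples[i])
            == sign_mask_by_compare (samples[i]));
}
```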

>
>
> Thanks in advance,
> Roger
> --
>


[PATCH] Explicitly document that the "counted_by" attribute is only supported in C

2024-08-05 Thread Qing Zhao
As discussed in PR116016: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116016#c48

We should explicitly document this limitation and issue error messages for C++.

The "counted_by" attribute is currently supported only in C; mention this
explicitly in the documentation and also issue an error on seeing a "counted_by"
attribute in C++.
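
For context, a minimal sketch of how the attribute is meant to be used in C.
The COUNTED_BY macro and the trailing_alloc helper are hypothetical names
introduced only for this illustration; the macro expands to nothing on
compilers that do not report support for the attribute, so the sketch should
still build there.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical wrapper: use the attribute only where the compiler
   reports that it knows it.  */
#if defined (__has_attribute)
# if __has_attribute (counted_by)
#  define COUNTED_BY(member) __attribute__ ((counted_by (member)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(member)
#endif

struct trailing {
  int count;
  /* The number of elements of "field" is given by "count".  */
  int field[] COUNTED_BY (count);
};

/* Hypothetical helper: allocate room for N elements and record N in
   the counter before the array is touched.  Returns NULL on failure.  */
struct trailing *trailing_alloc (int n)
{
  assert (n >= 0);
  struct trailing *t = malloc (sizeof *t + (size_t) n * sizeof t->field[0]);
  if (t != NULL)
    t->count = n;
  return t;
}
```

The counter is set right after allocation so that any bounds information
derived from it is valid before the first access to the array.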

The patch has been bootstrapped and regression tested on both aarch64 and x86,
with no issues.

Okay for committing?

thanks.

Qing

gcc/c-family/ChangeLog:

* c-attribs.cc (handle_counted_by_attribute): Issue error for C++.

gcc/ChangeLog:

* doc/extend.texi: Explicitly mentions counted_by is available
only for C.

gcc/testsuite/ChangeLog:

* g++.dg/flex-array-counted-by.C: New test.
---
 gcc/c-family/c-attribs.cc|  9 -
 gcc/doc/extend.texi  |  1 +
 gcc/testsuite/g++.dg/flex-array-counted-by.C | 11 +++
 3 files changed, 20 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/flex-array-counted-by.C

diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index 685f212683f..f936058800b 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -2859,8 +2859,15 @@ handle_counted_by_attribute (tree *node, tree name,
   tree argval = TREE_VALUE (args);
   tree old_counted_by = lookup_attribute ("counted_by", DECL_ATTRIBUTES 
(decl));
 
+  /* This attribute is not supported in C++.  */
+  if (c_dialect_cxx ())
+{
+  error_at (DECL_SOURCE_LOCATION (decl),
+   "%qE attribute is not supported for C++", name);
+  *no_add_attrs = true;
+}
   /* This attribute only applies to field decls of a structure.  */
-  if (TREE_CODE (decl) != FIELD_DECL)
+  else if (TREE_CODE (decl) != FIELD_DECL)
 {
   error_at (DECL_SOURCE_LOCATION (decl),
"%qE attribute is not allowed for a non-field"
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 48b27ff9f39..f31f3bdb53d 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -7848,6 +7848,7 @@ The @code{counted_by} attribute may be attached to the 
C99 flexible array
 member of a structure.  It indicates that the number of the elements of the
 array is given by the field "@var{count}" in the same structure as the
 flexible array member.
+This attribute is available only for C.
 GCC may use this information to improve detection of object size information
 for such structures and provide better results in compile-time diagnostics
 and runtime features like the array bound sanitizer and
diff --git a/gcc/testsuite/g++.dg/flex-array-counted-by.C 
b/gcc/testsuite/g++.dg/flex-array-counted-by.C
new file mode 100644
index 000..7f1a345615e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/flex-array-counted-by.C
@@ -0,0 +1,11 @@
+/* Testing the fact that the attribute counted_by is not supported in C++.  */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+int size;
+int x __attribute ((counted_by (size))); /* { dg-error "attribute is not 
supported for C\\+\\+" } */
+
+struct trailing {
+  int count;
+  int field[] __attribute ((counted_by (count))); /* { dg-error "attribute is 
not supported for C\\+\\+" } */
+};
-- 
2.31.1



Re: [RFC v3 3/3] c: Add __lengthof__() operator

2024-08-05 Thread Martin Uecker
On Monday, 05.08.2024 at 13:59 +0200, Alejandro Colomar wrote:
> On Mon, Aug 05, 2024 at 01:58:18PM GMT, Alejandro Colomar wrote:
> > On Mon, Aug 05, 2024 at 01:57:35PM GMT, Alejandro Colomar wrote:
> > > On Mon, Aug 05, 2024 at 01:55:50PM GMT, Alejandro Colomar wrote:
> > > > Hi Martin,
> > > > 
> > > > On Sun, Aug 04, 2024 at 11:39:26AM GMT, Martin Uecker wrote:
> > > > > > BTW, I still don't understand what `if (! TYPE_DOMAIN (type))` 
> > > > > > means,
> > > > > > within array_type_nelts_minus_one().  What code triggers that 
> > > > > > condition?
> > > > > > Am I missing error handling for that?  Thanks!
> > > > > 
> > > > > For incomplete arrays, basically we have the following different
> > > > > variants for arrays:
> > > > > 
> > > > > T[ ] incomplete: !TYPE_DOMAIN 
> > > > > T[1] constant size: TYPE_MAX_VALUE == INTEGER_CST
> > > > > T[n] variable size: TYPE_MAX_VALUE != INTEGER_CST
> > > > > T[0] flexible array member: !TYPE_MAX_VALUE && !C_TYPE_VARIABLE_SIZE
> > > > >   (ISO version T[0] has TYPE_SIZE == NULL_TREE)
> > > > > T[*] unspecified variable size: !TYPE_MAX_VALUE && 
> > > > > C_TYPE_VARIABLE_SIZE
> > > > 
> > > > Could you describe the following types?  I've repeated the ones you
> > > > already described, deduplicated some that have a different meaning in
> > > > different contexts, and added some multi-dimensional arrays.
> > > > 
> > > > T[ ] (incomplete type; function parameter)
> > > > T[ ] (flexible array member)
> > > > T[0] (zero-size array)
> > > > T[0] (GNU flexible array member)
> > > > T[1] (old flexible array member)
> > > > T[7] (constant size)
> > > > T[7][n]  (constant size with inner variable size)
> > > > T[7][*]  (constant size with inner unspecified size)
> > > 
> > > And please also describe T[7][4], although I expect that to be just the
> > > same as T[7].
> > 
> > And it would also be interesting to describe T[7][ ].
> 
> And maybe also:
> 
> T[n][m]
> T[n][*]
> T[n][ ]
> T[n][7]

I do not understand your question. What do you mean by
"describe the type"?

But I think you might make it unnecessarily complicated.  It
should be sufficient to look at the outermost size.  You
can completely ignore whatever happens inside.  There
should be three cases if I am not mistaken:

- incomplete (includes ISO FAM) -> error
- constant (includes GNU FAM) -> return fixed size
- variable (includes unspecified) -> evaluate the
argument and return the size, while making sure it is 
visibly non-constant.

To check that the array has a variable length, you can use
the same logic as in comptypes_internal (cf. d1_variable).
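
The three cases can be sketched in plain C.  This is only an illustration
under stated assumptions: it uses the classic sizeof division as a stand-in
for the proposed operator (the NELEMS macro and the function names are made
up), and it relies on variable length arrays being available.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the proposed operator: number of elements in the
   outermost dimension of an array object.  */
#define NELEMS(a) (sizeof (a) / sizeof ((a)[0]))

/* Constant outermost size: the result is fixed at compile time.  */
size_t constant_outer (void)
{
  int a[7][4];
  return NELEMS (a);
}

/* Constant outermost size with a variable inner size: still 7,
   whatever the inner dimension is.  */
size_t constant_outer_variable_inner (size_t m)
{
  assert (m > 0);
  int a[7][m];
  return NELEMS (a);
}

/* Variable outermost size: the argument has to be evaluated and the
   result is not a constant.  */
size_t variable_outer (size_t n)
{
  assert (n > 0);
  int a[n][4];
  return NELEMS (a);
}

/* Incomplete outermost size, e.g. "extern int b[];": sizeof (b) is a
   constraint violation, which corresponds to the "error" case.  */
```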

It is possible that you can not properly distinguish between

int a[0][n];
int a[*][n];

those two cases. The logic will treat the first as the second.
I think this is ok for now.  All this array stuff should be
simplified and refactored anyway, but this is for another time.


I am also not sure you even need to use array_type_nelts in C
because there is never a non-zero minimum size.


Martin

> 
> > 
> > > 
> > > > T[n] (variable size)
> > > > T[*] (unspecified size)
> > > > 
> > > > That would help with the [*] issues I'm investigating.  I think
> > > > array_type_nelts_minus_one(T[7][*]) is not giving a constant expression,
> > > > and I'd like to fix that.
> > > > 
> > > > Have a lovely day!
> > > > Alex
> > > > 



Re: [PATCH v1] Match: Add type_has_mode_precision_p check for SAT_TRUNC [PR116202]

2024-08-05 Thread Richard Biener
On Mon, Aug 5, 2024 at 3:04 PM Li, Pan2  wrote:
>
> > Isn't that now handled by the direct_internal_fn_supported_p check?  That 
> > is,
> > by the caller which needs to verify the matched operation is supported by
> > the target?
>
> type_strictly_matches_mode_p doesn't help here (include the un-committed one).
> It will hit below case and return true directly as TYPE_MODE (type) is 
> E_RVVM1QImode.
>
>if (VECTOR_TYPE_P (type))
> return VECTOR_MODE_P (TYPE_MODE (type));
>
> And looks we cannot TREE_PRECISION on vector type here similar as 
> type_has_mode_precision_p
> do for scalar types.  Thus, add the check to the matching.
>
> Looks like we need to take care of vector in type_strictly_matches_mode_p, 
> right ?

Well that means the caller (vectorizer pattern recog?) wrongly used a
vector of QImode in the first place, so it needs to check the scalar
mode as well?  Vector type assignment does

  /* For vector types of elements whose mode precision doesn't
     match their types precision we use a element type of mode
     precision.  The vectorization routines will have to make sure
     they support the proper result truncation/extension.
     We also make sure to build vector types with INTEGER_TYPE
     component type only.  */
  if (INTEGRAL_TYPE_P (scalar_type)
      && (GET_MODE_BITSIZE (inner_mode) != TYPE_PRECISION (scalar_type)
          || TREE_CODE (scalar_type) != INTEGER_TYPE))
    scalar_type = build_nonstandard_integer_type (GET_MODE_BITSIZE (inner_mode),
                                                  TYPE_UNSIGNED (scalar_type));

So possibly vectorizable_internal_function would need to be amended or, better,
vector pattern matching be constrained.

Richard.
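
To make the precision mismatch concrete, here is a small scalar sketch in C of
why a destination with 1 bit of precision held in an 8-bit mode must not be
treated as a saturating truncation to that mode.  The function names are made
up for illustration, and the second function only models what an 8-bit
saturating truncation would compute; it is not taken from any backend.

```c
#include <assert.h>
#include <stdint.h>

/* What the source in the test case computes: clamp to 1, then convert
   to _Bool (1 bit of precision, stored in a byte).  */
_Bool min_then_bool (unsigned long g)
{
  unsigned long m = g < 1 ? g : 1;
  return (_Bool) m;
}

/* Model of a saturating truncation to the 8-bit mode: clamp to 255,
   the maximum of the mode rather than of the 1-bit type.  */
uint8_t sat_trunc_to_u8 (unsigned long g)
{
  return g > UINT8_MAX ? UINT8_MAX : (uint8_t) g;
}

/* Hypothetical self-check with the value from the test case.  */
void sat_trunc_selfcheck (void)
{
  unsigned long g = -6UL;   /* like "unsigned long g = -b[f]" with b[f] == 6 */
  assert (min_then_bool (g) == 1);
  assert (sat_trunc_to_u8 (g) == 255);
}
```

The two results differ for any input above 1, so a byte written by the second
form no longer compares equal to 1, which is what the test case checks for.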

> Pan
>
> -Original Message-
> From: Richard Biener 
> Sent: Monday, August 5, 2024 7:02 PM
> To: Li, Pan2 
> Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; 
> jeffreya...@gmail.com; rdapp@gmail.com
> Subject: Re: [PATCH v1] Match: Add type_has_mode_precision_p check for 
> SAT_TRUNC [PR116202]
>
> On Sun, Aug 4, 2024 at 1:47 PM  wrote:
> >
> > From: Pan Li 
> >
> > The .SAT_TRUNC matching can only perform the type has its mode
> > precision.
> >
> > g_12 = (long unsigned int) _2;
> > _13 = MIN_EXPR ;
> > _3 = (_Bool) _13;
> >
> > The above pattern cannot be recog as .SAT_TRUNC (g_12) because the dest
> > only has 1 bit precision but QImode.  Aka the type doesn't have the mode
> > precision.  Thus,  add the type_has_mode_precision_p for the dest to
> > avoid such case.
> >
> > The below tests are passed for this patch.
> > 1. The rv64gcv fully regression tests.
> > 2. The x86 bootstrap tests.
> > 3. The x86 fully regression tests.
>
> Isn't that now handled by the direct_internal_fn_supported_p check?  That is,
> by the caller which needs to verify the matched operation is supported by
> the target?
>
> > PR target/116202
> >
> > gcc/ChangeLog:
> >
> > * match.pd: Add type_has_mode_precision_p for the dest type
> > of the .SAT_TRUNC matching.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/riscv/rvv/base/pr116202-run-1.c: New test.
> >
> > Signed-off-by: Pan Li 
> > ---
> >  gcc/match.pd  |  6 +++--
> >  .../riscv/rvv/base/pr116202-run-1.c   | 24 +++
> >  2 files changed, 28 insertions(+), 2 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr116202-run-1.c
> >
> > diff --git a/gcc/match.pd b/gcc/match.pd
> > index c9c8478d286..dfa0bba3908 100644
> > --- a/gcc/match.pd
> > +++ b/gcc/match.pd
> > @@ -3283,7 +3283,8 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> > wide_int trunc_max = wi::mask (otype_precision, false, itype_precision);
> > wide_int int_cst = wi::to_wide (@1, itype_precision);
> >}
> > -  (if (otype_precision < itype_precision && wi::eq_p (trunc_max, 
> > int_cst))
> > +  (if (type_has_mode_precision_p (type) && otype_precision < 
> > itype_precision
> > +   && wi::eq_p (trunc_max, int_cst))
> >
> >  /* Unsigned saturation truncate, case 2, sizeof (WT) > sizeof (NT).
> > SAT_U_TRUNC = (NT)(MIN_EXPR (X, 255)).  */
> > @@ -3309,7 +3310,8 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> > wide_int trunc_max = wi::mask (otype_precision, false, itype_precision);
> > wide_int int_cst = wi::to_wide (@1, itype_precision);
> >}
> > -  (if (otype_precision < itype_precision && wi::eq_p (trunc_max, 
> > int_cst))
> > +  (if (type_has_mode_precision_p (type) && otype_precision < 
> > itype_precision
> > +   && wi::eq_p (trunc_max, int_cst))
> >
> >  /* x >  y  &&  x != XXX_MIN  -->  x > y
> > x >  y  &&  x == XXX_MIN  -->  false . */
> > diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/pr116202-run-1.c 
> > b/gcc/testsuite/gcc.target/riscv/rvv/base/pr116202-run-1.c
> > new file mode 100644
> > index 000..d150f20b5d9
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/pr116202-run-1.c
> > @@ -

[patch,avr,v2] PR115830: Make better use of SREG

2024-08-05 Thread Georg-Johann Lay

This is a second take on improving SREG (condition code) usage
for avr.

The difference to the 1st patch is that I added a paragraph to
avr.md that explains why we don't use cmpelim:

- The achievable compare mode may depend on the availability of
  a scratch register.  SELECT_CC_MODE doesn't provide that info.

- Operation+setcc may have more demanding scratch reg requirements
  than the pure operation that just clobbers cc.  cmpelim cannot
  provide a scratch reg while peep2 can (provided a scratch is
  available).

- cmpelim does not support operations like (minus reg any_extend)
  while peep2 doesn't have restrictions.

- Pre-reload passes have #ifdef SELECT_CC_MODE, but CCmode only
  comes to life after reload.

Then there are many open questions that are not addressed by the
documentation.  For example, what to do when the operand combination
doesn't set cc in a usable way.  The example in SELECT_CC_MODE
just makes no sense for avr; the example is from sparc and looks
rather like a carry set operation, nothing like that makes sense
for avr.  The documentation has dangling references (to the
deceased cc0 framework), and the SELECT_CC_MODE docs do not
really fit into what compare-elim.cc lists as requirements.

So for the time being, here is a patch that uses peephole2:

The existing patterns for add have been simplified: The QImode
case is now handled as part of the new QImode pattern.

Apart from that, there are 3 new patterns for subtractions
and one for ashift, all for multi-byte modes.

With that patch we have 8 patterns for compare elimination
and 6 insn (2 for add, 2 for sub, 1 for ashift, 1 for 8-bit).

cmpelim would require at least 5 insns for add, 3 for ashift,
4 for sub (all multi-byte), and 8 insns for 8-bit (ashift,
lshiftrt ashiftrt, xor, and, ior, plus, minus).  Some of them
would be required twice: one for CCNmode, one for CCZNmode.

That would be ~30 new insns for cmpelim, so that the verbosity
of the peep2 approach is not that bad.

The patch also has some cleanup / simplification of existing code.

What the patch doesn't do is multi-byte patterns for bit-ops
(estimated 3 patterns) and patterns for fixed-point (could
perhaps be handled by using broader mode iterators for the
add and sub cases, and a new pattern for (minus reg const)).

Ok for trunk?

Johann

--

AVR: target/115830 - Make better use of SREG.N and SREG.Z.

This patch adds new CC modes CCN and CCZN for operations that
set SREG.N, resp. SREG.Z and SREG.N.  Add peephole2 patterns
to generate new compute + branch insns that make use
of the Z and N flags.  Most of these patterns need their own
asm output routines that don't do all the micro-optimizations
that the ordinary outputs may perform, as the latter have no
requirement to set CC in a usable way.

We don't use cmpelim because it cannot provide scratch regs
(which peephole2 can), and some of the patterns require a
scratch reg, whereas the same operations that don't set REG_CC
don't require a scratch.  See the comments in avr.md for details.

The existing add.for.cc* patterns are simplified as they no
longer cover QImode, which is handled in a separate QImode case.
Apart from that, it adds 3 patterns for subtractions and one
pattern for shift left, all for multi-byte cases (HI, PSI, SI).

The add.for.cc* patterns now use CC[Z]Nmode, instead of the
former abuse of CCmode.

PR target/115830
gcc/
* config/avr/avr-modes.def (CCN, CCZN): New CC_MODEs.
* config/avr/avr-protos.h (avr_cond_branch): New from
ret_cond_branch.
(avr_out_plus_set_N, avr_op8_ZN_operator)
(avr_out_op8_set_ZN, avr_len_op8_set_ZN): New protos.
(ccn_reg_rtx, cczn_reg_rtx): New declarations.
* config/avr/avr.cc (avr_cond_branch): New from ret_cond_branch.
(avr_cond_string): Add bool cc_overflow_unusable argument.
(avr_print_operand) ['L']: Like 'j' but overflow unusable.
['K']: Like 'k' but overflow unusable.
(avr_out_plus_set_ZN): Remove handling of QImode.
(avr_out_plus_set_N, avr_op8_ZN_operator)
(avr_out_op8_set_ZN, avr_len_op8_set_ZN): New functions.
(avr_adjust_insn_length) [ADJUST_LEN_ADD_SET_N]: Handle case.
(avr_class_max_nregs): All MODE_CCs occupy one hard reg.
(avr_hard_regno_nregs): Same.
(avr_hard_regno_mode_ok) [REG_CC]: Allow all MODE_CC.
(pass_manager.h): Include it.
(ccn_reg_rtx, cczn_reg_rtx): New GTY variables.
(avr_init_expanders): Initialize them.
(avr_option_override): Run peephole2 a second time.
* config/avr/avr.md (adjust_len) [add_set_N]: New attr value.
(ALLCC, HI_SI): New mode iterators.
(CCname): New mode attribute.
(eqnegtle, cmp_signed, op8_ZN): New code iterators.
(branch): Handle CCNmode and CCZNmode.  Assimilate...
(difficult_branch): ...this insn.
(p1m1): Remove.
(gen_add_for__): Adjust to CCNmode and CCZNmode. Use
HISI as 

RE: [PATCH v1] Match: Support form 1 for scalar signed integer .SAT_ADD

2024-08-05 Thread Li, Pan2
Thanks Richard for comments.

> The convert looks odd to me given @0 is involved in both & operands.

The convert is introduced because the GIMPLE IL for int8_t is somewhat
different from that for int32_t or int64_t.
There are some additional ops that convert to unsigned for the plus; see
lines 8-9 and lines 22-23 below.
We do not see a similar GIMPLE IL for int32_t and int64_t.  To reconcile the
types from int8_t to int64_t, the convert is added here.

Or maybe I made a mistake in the example; let me revisit it and send a v2 if
there are no surprises.

   4   │ __attribute__((noinline))
   5   │ int8_t sat_s_add_int8_t_fmt_1 (int8_t x, int8_t y)
   6   │ {
   7   │   int8_t sum;
   8   │   unsigned char x.1_1;
   9   │   unsigned char y.2_2;
  10   │   unsigned char _3;
  11   │   signed char _4;
  12   │   signed char _5;
  13   │   int8_t _6;
  14   │   _Bool _11;
  15   │   signed char _12;
  16   │   signed char _13;
  17   │   signed char _14;
  18   │   signed char _22;
  19   │   signed char _23;
  20   │
  21   │[local count: 1073741822]:
  22   │   x.1_1 = (unsigned char) x_7(D);
  23   │   y.2_2 = (unsigned char) y_8(D);
  24   │   _3 = x.1_1 + y.2_2;
  25   │   sum_9 = (int8_t) _3;
  26   │   _4 = x_7(D) ^ y_8(D);
  27   │   _5 = x_7(D) ^ sum_9;
  28   │   _23 = ~_4;
  29   │   _22 = _5 & _23;
  30   │   if (_22 < 0)
  31   │ goto ; [41.00%]
  32   │   else
  33   │ goto ; [59.00%]
  34   │
  35   │[local count: 259738146]:
  36   │   _11 = x_7(D) < 0;
  37   │   _12 = (signed char) _11;
  38   │   _13 = -_12;
  39   │   _14 = _13 ^ 127;
  40   │
  41   │[local count: 1073741824]:
  42   │   # _6 = PHI <_14(3), sum_9(2)>
  43   │   return _6;
  44   │
  45   │ }
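
For reference, a sketch of form 1 specialized to int8_t in plain C, with the
sum computed in unsigned char arithmetic in the same spirit as the conversions
in the dump above.  The function names are made up for illustration, and the
conversion of an out-of-range value back to int8_t is implementation-defined
in ISO C (GCC wraps modulo 2^8), so that part is an assumption about the
compiler.

```c
#include <assert.h>
#include <stdint.h>

/* Signed saturating add for int8_t, following the shape of form 1.  */
int8_t sat_s_add_int8 (int8_t x, int8_t y)
{
  /* Wrap-around sum: add as unsigned, then convert back.  The
     conversion back to int8_t assumes modulo behaviour (as on GCC).  */
  int8_t sum = (int8_t) (uint8_t) ((uint8_t) x + (uint8_t) y);

  if ((x ^ y) < 0)      /* Operands differ in sign: cannot overflow.  */
    return sum;
  if ((sum ^ x) >= 0)   /* Sum keeps the sign of x: no overflow.  */
    return sum;
  return x < 0 ? INT8_MIN : INT8_MAX;   /* Saturate toward the sign of x.  */
}

/* Hypothetical self-check covering both saturation directions.  */
void sat_s_add_int8_selfcheck (void)
{
  assert (sat_s_add_int8 (1, 2) == 3);
  assert (sat_s_add_int8 (100, -100) == 0);
  assert (sat_s_add_int8 (100, 100) == INT8_MAX);
  assert (sat_s_add_int8 (-100, -100) == INT8_MIN);
  assert (sat_s_add_int8 (INT8_MAX, 1) == INT8_MAX);
  assert (sat_s_add_int8 (INT8_MIN, -1) == INT8_MIN);
}
```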

Pan

-Original Message-
From: Richard Biener  
Sent: Monday, August 5, 2024 7:16 PM
To: Li, Pan2 
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; 
jeffreya...@gmail.com; rdapp@gmail.com
Subject: Re: [PATCH v1] Match: Support form 1 for scalar signed integer .SAT_ADD

On Mon, Aug 5, 2024 at 9:14 AM  wrote:
>
> From: Pan Li 
>
> This patch would like to support the form 1 of the scalar signed
> integer .SAT_ADD.  Aka below example:
>
> Form 1:
>   #define DEF_SAT_S_ADD_FMT_1(T) \
>   T __attribute__((noinline))\
>   sat_s_add_##T##_fmt_1 (T x, T y)   \
>   {  \
> T min = (T)1u << (sizeof (T) * 8 - 1);   \
> T max = min - 1; \
> return (x ^ y) < 0   \
>   ? (T)(x + y)   \
>   : ((T)(x + y) ^ x) >= 0\
> ? (T)(x + y) \
> : x < 0 ? min : max; \
>   }
>
> DEF_SAT_S_ADD_FMT_1 (int64_t)
>
> We can tell the difference before and after this patch if backend
> implemented the ssadd3 pattern similar as below.
>
> Before this patch:
>4   │ __attribute__((noinline))
>5   │ int64_t sat_s_add_int64_t_fmt_1 (int64_t x, int64_t y)
>6   │ {
>7   │   long int _1;
>8   │   long int _2;
>9   │   long int _3;
>   10   │   int64_t _4;
>   11   │   long int _7;
>   12   │   _Bool _9;
>   13   │   long int _10;
>   14   │   long int _11;
>   15   │   long int _12;
>   16   │   long int _13;
>   17   │
>   18   │ ;;   basic block 2, loop depth 0
>   19   │ ;;pred:   ENTRY
>   20   │   _1 = x_5(D) ^ y_6(D);
>   21   │   _13 = x_5(D) + y_6(D);
>   22   │   _3 = x_5(D) ^ _13;
>   23   │   _2 = ~_1;
>   24   │   _7 = _2 & _3;
>   25   │   if (_7 >= 0)
>   26   │ goto ; [59.00%]
>   27   │   else
>   28   │ goto ; [41.00%]
>   29   │ ;;succ:   4
>   30   │ ;;3
>   31   │
>   32   │ ;;   basic block 3, loop depth 0
>   33   │ ;;pred:   2
>   34   │   _9 = x_5(D) < 0;
>   35   │   _10 = (long int) _9;
>   36   │   _11 = -_10;
>   37   │   _12 = _11 ^ 9223372036854775807;
>   38   │ ;;succ:   4
>   39   │
>   40   │ ;;   basic block 4, loop depth 0
>   41   │ ;;pred:   2
>   42   │ ;;3
>   43   │   # _4 = PHI <_13(2), _12(3)>
>   44   │   return _4;
>   45   │ ;;succ:   EXIT
>   46   │
>   47   │ }
>
> After this patch:
>4   │ __attribute__((noinline))
>5   │ int64_t sat_s_add_int64_t_fmt_1 (int64_t x, int64_t y)
>6   │ {
>7   │   int64_t _4;
>8   │
>9   │ ;;   basic block 2, loop depth 0
>   10   │ ;;pred:   ENTRY
>   11   │   _4 = .SAT_ADD (x_5(D), y_6(D)); [tail call]
>   12   │   return _4;
>   13   │ ;;succ:   EXIT
>   14   │
>   15   │ }
>
> The below test suites are passed for this patch.
> * The rv64gcv fully regression test.
> * The x86 bootstrap test.
> * The x86 fully regression test.
>
> gcc/ChangeLog:
>
> * match.pd: Add the matching for signed .SAT_ADD.
> * tree-ssa-math-opts.cc (gimple_signed_integer_sat_add): Add new
> matching func decl.
> (match_unsigned_saturation_add): Try signed .SAT_ADD and rename
> 

Re: [PATCH] Explicitly document that the "counted_by" attribute is only supported in C

2024-08-05 Thread Jakub Jelinek
On Mon, Aug 05, 2024 at 01:33:01PM +, Qing Zhao wrote:
> As discussed in 
> PR116016:https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116016#c48
> 
> We should explicitly document this limitation and issue error messages for 
> C++.
> 
> The "counted_by" attribute currently is only supported in C, mention this
> explicitly in the documentation and also issue an error on seeing a "counted_by"
> attribute in C++.
> 
> The patch has been bootstrapped and regression tested on both aarch64 and 
> X86,
> no issue.
> 
> Okay for committing?
> 
> thanks.
> 
> Qing
> 
> gcc/c-family/ChangeLog:
> 
>   * c-attribs.cc (handle_counted_by_attribute): Issue error for C++.
> 
> gcc/ChangeLog:
> 
>   * doc/extend.texi: Explicitly mentions counted_by is available
>   only for C.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/flex-array-counted-by.C: New test.
> ---
>  gcc/c-family/c-attribs.cc|  9 -
>  gcc/doc/extend.texi  |  1 +
>  gcc/testsuite/g++.dg/flex-array-counted-by.C | 11 +++
>  3 files changed, 20 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/g++.dg/flex-array-counted-by.C
> 
> diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
> index 685f212683f..f936058800b 100644
> --- a/gcc/c-family/c-attribs.cc
> +++ b/gcc/c-family/c-attribs.cc
> @@ -2859,8 +2859,15 @@ handle_counted_by_attribute (tree *node, tree name,
>tree argval = TREE_VALUE (args);
>tree old_counted_by = lookup_attribute ("counted_by", DECL_ATTRIBUTES 
> (decl));
>  
> +  /* This attribute is not supported in C++.  */
> +  if (c_dialect_cxx ())
> +{
> +  error_at (DECL_SOURCE_LOCATION (decl),
> + "%qE attribute is not supported for C++", name);

This should be sorry_at instead IMHO (at least if there is a plan to support
it later, hopefully in the 15 timeframe).

> +  *no_add_attrs = true;
> +}
>/* This attribute only applies to field decls of a structure.  */
> -  if (TREE_CODE (decl) != FIELD_DECL)
> +  else if (TREE_CODE (decl) != FIELD_DECL)
>  {
>error_at (DECL_SOURCE_LOCATION (decl),
>   "%qE attribute is not allowed for a non-field"
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index 48b27ff9f39..f31f3bdb53d 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -7848,6 +7848,7 @@ The @code{counted_by} attribute may be attached to the 
> C99 flexible array
>  member of a structure.  It indicates that the number of the elements of the
>  array is given by the field "@var{count}" in the same structure as the
>  flexible array member.
> +This attribute is available only for C.

And this should say "for now" or something similar.

>  GCC may use this information to improve detection of object size information
>  for such structures and provide better results in compile-time diagnostics
>  and runtime features like the array bound sanitizer and
> diff --git a/gcc/testsuite/g++.dg/flex-array-counted-by.C 
> b/gcc/testsuite/g++.dg/flex-array-counted-by.C
> new file mode 100644
> index 000..7f1a345615e
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/flex-array-counted-by.C

Tests shouldn't be added directly to g++.dg/ directory, I think this should
go into g++.dg/ext/ as it is an (unsupported) extension.

> @@ -0,0 +1,11 @@
> +/* Testing the fact that the attribute counted_by is not supported in C++.  
> */
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +int size;
> +int x __attribute ((counted_by (size))); /* { dg-error "attribute is not 
> supported for C\\+\\+" } */
> +
> +struct trailing {
> +  int count;
> +  int field[] __attribute ((counted_by (count))); /* { dg-error "attribute 
> is not supported for C\\+\\+" } */
> +};

Maybe it should also test in another { dg-do compile { target c++11 } } test
that the same happens even for [[gnu::counted_by (size)]].
Seems even for C23 there are no tests with [[gnu::counted_by (size)]].
The C++11/C23 standard attributes are more strict on where they can appear
depending on what it appertains to, as it applies to declarations, I think
it needs to go before the [] or at the start of the declaration, so
  [[gnu::counted_by (count)]] int field[];
or
  int field [[gnu::counted_by (count)]] [];
but I could be wrong, better test it...
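
For reference, the C-only usage that the patch leaves in place can be sketched
with the GNU attribute syntax as below.  This is a minimal illustration, not
code from the patch: the COUNTED_BY macro and the trailing_alloc helper are
made-up names, and the macro expands to nothing on compilers that do not
report support for the attribute.  It says nothing about where the
[[gnu::counted_by (...)]] form may appear; that still needs testing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Expand to the attribute only when the compiler reports support for it;
   older compilers see a plain flexible array member.  */
#if defined (__has_attribute)
# if __has_attribute (counted_by)
#  define COUNTED_BY(member) __attribute__ ((counted_by (member)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(member)
#endif

struct trailing
{
  int count;
  int field[] COUNTED_BY (count);
};

/* Allocate a struct trailing with room for N elements (hypothetical helper).
   COUNT is set before FIELD is touched so the annotation stays truthful.  */
struct trailing *
trailing_alloc (int n)
{
  assert (n >= 0);
  struct trailing *t = malloc (sizeof *t + (size_t) n * sizeof t->field[0]);
  if (t != NULL)
    t->count = n;
  return t;
}
```

A caller would allocate with trailing_alloc (n), use field[0] through
field[n - 1], and release the object with free.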

Jakub



Re: [PATCH v1] RISC-V: Update .SAT_TRUNC dump check due to middle-end change

2024-08-05 Thread Jeff Law




On 8/5/24 2:01 AM, pan2...@intel.com wrote:

From: Pan Li 

Due to recent middle-end change, update the .SAT_TRUNC expand dump
check from 2 to 4.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-1.c: Adjust
asm check times from 2 to 4.

OK
jeff



Re: [PATCH v2] Hard register constraints

2024-08-05 Thread Stefan Schulze Frielinghaus
On Mon, Aug 05, 2024 at 02:19:50PM +0200, Georg-Johann Lay wrote:
> On 05.08.24 at 12:28, Stefan Schulze Frielinghaus wrote:
> > This is a follow-up of
> > https://gcc.gnu.org/pipermail/gcc-patches/2024-June/654013.html
> > 
> > What has changed?
> > 
> > - Rebased and fixed an issue in constrain_operands which manifested
> > after late-combine.
> > 
> > - Introduced new test cases for Arm, Intel, POWER, RISCV, S/390 for 32-
> > and 64-bit where appropriate (including register pairs etc.).  Test
> > gcc.dg/asm-hard-reg-7.c is a bit controversial since I'm testing for an
> > anti feature here, i.e., I'm testing for register asm in conjunction
> > with calls.  I'm fine with removing it in the end but I wanted to keep
> > it in for demonstration purposes at least during discussion of this
> > patch.
> > 
> > - Split test pr87600-2.c into pr87600-2.c and pr87600-3.c since test0
> > errors out early, now.  Otherwise, the remaining errors would not be
> > reported.  Besides that, the error message has slightly changed.
> > 
> > - Modified genoutput.cc in order to allow hard register constraints in
> > machine descriptions.  For example, on s390 the instruction mvcrl makes
> 
> As I already said, such a feature would be great.  Some questions:
> 
> Which pass is satisfying that constraint? AFAIK for local reg vars,
> it is asmcons, but for register constraints in md it is the register
> allocator.

This is done by reload during process_alt_operands().  Basically
every other change in gimplify.cc, stmt.cc etc. is only there in order
to do some error checking and have some proper diagnostics.

> The avr backend has many insns that use explicit hard regs in order to
> model some libcalls (ones with footprints smaller than ABI, or that
> deviate from the ABI).  A proper way would be to add a register
> constraint for each possible hard reg, e.g. R20_1 for QImode in R20,
> R20_2 for HImode in R20, etc.  This would require a dozen or more
> new register classes, and the problem with that is that register
> allocation produces less efficient code even for cases that do
> not use these new constraints.  So I gave up that approach.
> 
> How does your feature work? Does it imply that for each hreg
> constraint there must be an according register class?

No.  During reload I limit the set of registers by installing a filter
and let RA solve it.

> 
> Obviously local reg vars don't require respective reg classes,
> so I thought about representing such insns as asm_input or
> whatever, but that's pure hack and would never pass a review...
> 
> > use of the implicit register r0 which we currently deal with as follows:
> > 
> > (define_insn "*mvcrl"
> >[(set (match_operand:BLK 0 "memory_operand" "=Q")
> > (unspec:BLK [(match_operand:BLK 1 "memory_operand" "Q")
> >  (reg:SI GPR0_REGNUM)]
> > UNSPEC_MVCRL))]
> >"TARGET_Z15"
> >"mvcrl\t%0,%1"
> >[(set_attr "op_type" "SSE")])
> > 
> > (define_expand "mvcrl"
> >[(set (reg:SI GPR0_REGNUM) (match_operand:SI 2 "general_operand"))
> > (set (match_operand:BLK 0 "memory_operand" "=Q")
> > (unspec:BLK [(match_operand:BLK 1 "memory_operand" "Q")
> >  (reg:SI GPR0_REGNUM)]
> > UNSPEC_MVCRL))]
> >"TARGET_Z15"
> >"")
> > 
> > In the expander we ensure that GPR0 is setup correctly.  With this patch
> > we could simply write
> > 
> > (define_insn "mvcrl"
> >[(set (match_operand:BLK 0 "memory_operand" "=Q")
> >  (unspec:BLK [(match_operand:BLK 1 "memory_operand" "Q")
> >   (match_operand:SI 2 "general_operand" "{r0}")]
> >  UNSPEC_MVCRL))]
> >"TARGET_Z15"
> >"mvcrl\t%0,%1"
> >[(set_attr "op_type" "SSE")])
> > 
> > What I dislike is that I didn't find a way to verify hard register names
> 
> Are plain register numbers also supported? Like "{0}" ?
> (Provided regno(r0) == 0).

Basically whatever passes decode_reg_name() is allowed.

> 
> > during genoutput, i.e., ensuring that the name is valid after all.  This
> > is due to how reg_names is defined, which cannot be accessed by
> > genoutput.  The same holds true for REGISTER_NAMES et al. which may
> > reference some target specific variable (see e.g. POWER).  Thus, in case
> > of an invalid register name in a machine description file we do not
> > end-up with a genoutput-time error but instead fail at run-time in
> > process_alt_operands():
> > 
> > case '{':
> > {
> >   int regno = parse_constraint_regname (p);
> >   gcc_assert (regno >= 0);
> >   cl = REGNO_REG_CLASS (regno);
> >   CLEAR_HARD_REG_SET (hregset);
> >   SET_HARD_REG_BIT (hregset, regno);
> 
> Is this correct when hard_regno_nregs(regno) > 1,
> i.e. when the register occupies more than one hard register?

This is the actual place where the hard register constraint manifests
(beside all the error handling).  By restricting the possible s

Re: [PATCH] vect: Multistep float->int conversion only with no trapping math

2024-08-05 Thread Juergen Christ
On Mon, Aug 05, 2024 at 01:00:31PM +0200, Richard Biener wrote:
> On Fri, Aug 2, 2024 at 2:43 PM Juergen Christ  wrote:
> >
> > Do not convert floats to ints in multiple steps if trapping math is
> > enabled.  This might hide some inexact signals.
> >
> > Also use correct sign (the sign of the target integer type) for the
> > intermediate steps.  This only affects undefined behaviour (casting
> > floats to unsigned datatype where the float is negative).
> >
> > gcc/ChangeLog:
> >
> > * tree-vect-stmts.cc (vectorizable_conversion): multi-step
> >   float to int conversion only without trapping math and correct
> >   sign.
> >
> > Signed-off-by: Juergen Christ 
> >
> > Bootstrapped and tested on x86 and s390.  Ok for trunk?
> >
> > ---
> >  gcc/tree-vect-stmts.cc | 8 +---
> >  1 file changed, 5 insertions(+), 3 deletions(-)
> >
> > diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> > index fdcda0d2abae..2ddd13383193 100644
> > --- a/gcc/tree-vect-stmts.cc
> > +++ b/gcc/tree-vect-stmts.cc
> > @@ -5448,7 +5448,8 @@ vectorizable_conversion (vec_info *vinfo,
> > break;
> >
> >   cvt_type
> > -   = build_nonstandard_integer_type (GET_MODE_BITSIZE (rhs_mode), 
> > 0);
> > +   = build_nonstandard_integer_type (GET_MODE_BITSIZE (rhs_mode),
> > + TYPE_UNSIGNED (lhs_type));
> 
> But lhs_type should be a float type here, the idea that for a
> FLOAT_EXPR (int -> float)
> a signed integer type is the natural one to use - as it's 2x wider
> than the original
> RHS type its signedness doesn't matter.  Note all float types should be
> !TYPE_UNSIGNED so this hunk is a no-op but still less clear on the intent IMO.
> 
> Please drop it.

Will do.  Sorry about that.

> >   cvt_type = get_same_sized_vectype (cvt_type, vectype_in);
> >   if (cvt_type == NULL_TREE)
> > goto unsupported;
> > @@ -5505,10 +5506,11 @@ vectorizable_conversion (vec_info *vinfo,
> >if (GET_MODE_SIZE (lhs_mode) >= GET_MODE_SIZE (rhs_mode))
> > goto unsupported;
> >
> > -  if (code == FIX_TRUNC_EXPR)
> > +  if (code == FIX_TRUNC_EXPR && !flag_trapping_math)
> > {
> >   cvt_type
> > -   = build_nonstandard_integer_type (GET_MODE_BITSIZE (rhs_mode), 
> > 0);
> > +   = build_nonstandard_integer_type (GET_MODE_BITSIZE (rhs_mode),
> > + TYPE_UNSIGNED (lhs_type));
> 
> Here it might be relevant for correctness - we have to choose between
> sfix and ufix for the float -> [u]int conversion.
> 
> Do  you have a testcase?  Shouldn't the exactness be independent of the 
> integer
> type we convert to?

I was looking at this little program which contains undefined behaviour:

#include <stdio.h>

__attribute__((noinline,noclone,noipa))
void
vec_pack_ufix_trunc_v2df (double *in, unsigned int *out);

void
vec_pack_ufix_trunc_v2df (double *in, unsigned int *out)
{
out[0] = in[0];
out[1] = in[1];
out[2] = in[2];
out[3] = in[3];
}

int main()
{
double in[] = {-1,-2,-3,-4};
unsigned int out[4];

vec_pack_ufix_trunc_v2df (in, out);
for (int i = 0; i < 4; ++i)
printf("out[%d] = %u\n", i, out[i]);
return 0;
}

On s390x, I get different results after vectorization:

out[0] = 4294967295
out[1] = 4294967294
out[2] = 4294967293
out[3] = 4294967292

than without vectorization:

out[0] = 0
out[1] = 0
out[2] = 0
out[3] = 0

Even if this is undefined behaviour, I think it would be nice to have
consistent results here.
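
For reference, C only defines the double-to-unsigned conversion when the value
truncated toward zero fits the destination type, which is why both outputs
above are permissible.  A minimal sketch of a checked conversion follows; the
helper name is invented for illustration and is not part of the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Convert D to unsigned int only when the conversion is defined, i.e. when
   the value truncated toward zero is representable.  Returns false (and
   leaves *OUT alone) for NaN and out-of-range inputs such as -1.0.  */
bool
checked_double_to_uint (double d, unsigned int *out)
{
  assert (out != NULL);
  /* Assumes UINT_MAX + 1.0 is exact in double, true for 32-bit unsigned.  */
  const double limit = (double) UINT_MAX + 1.0;
  if (!(d > -1.0 && d < limit))
    return false;
  *out = (unsigned int) d;
  return true;
}
```

With such a guard the four inputs in the test program (-1 to -4) would all be
rejected before reaching the conversion, so the vectorized and scalar paths
could no longer disagree.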

Also, while I added an expander to circumvent this problem in a
previous patch, reviewers requested to hide this behind trapping math.
Thus, I looked into this.

Seeing the result from the CI for aarch64, I guess there are some
tests that actually expect this vectorization to always happen even
though it might not be safe w.r.t. trapping math.

> 
> >   cvt_type = get_same_sized_vectype (cvt_type, vectype_in);
> >   if (cvt_type == NULL_TREE)
> > goto unsupported;
> > --
> > 2.43.5
> >


Re: [PATCH] RISC-V: Minimal support for Zimop extension.

2024-08-05 Thread Jeff Law




On 8/4/24 8:20 PM, Jiawei wrote:


On 2024/8/5 8:45, Jeff Law wrote:



On 8/2/24 9:32 AM, Jiawei wrote:

https://github.com/riscv/riscv-isa-manual/blob/main/src/zimop.adoc

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc: New extension.
* config/riscv/riscv.opt: New mask.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/arch-42.c: New test.
* gcc.target/riscv/arch-43.c: New test.
Shouldn't the binutils bits go in first?  There's basic support for 
Zimop/Zcmop from Lyut on the binutils list in late 2023 or early 2024. 
I'm pretty sure it was marked as DO NOT MERGE because we were waiting for 
the extension to get ratified.


Christoph informed me that Zimop has been ratified, so we may not need 
to worry about the spec lifecycle status:


https://jira.riscv.org/browse/RVS-1603?src=confmacro

Agreed.  No concerns about spec lifecycle at this point.





I don't know if Lyut is doing any RISC-V work right now, so if you 
wanted to ping the patch on his behalf, it'd be appreciated and I can 
handle the review on the binutils side too.


I found ESWIN's patch to support Zimop on the binutils mailing list 
last month:


https://sourceware.org/pipermail/binutils/2024-June/134592.html

I don't watch binutils as closely as perhaps I should.

That patch looks marginally better than Lyut's version.  It has the 
updated version #s for the spec and handles the implied extensions. 
Let's go with Xiao's version.


Xiao, the Zimop/Zcmop patches are OK for binutils.

Jiawei, the GCC patches are OK once Xiao pushes his changes to the 
binutils repo.  Alternately if you have permissions in the binutils 
repo, you can push them for Xiao.


Jeff




Re: [PATCH] middle-end/111821 - compile-time/memory-hog with large copy

2024-08-05 Thread Jeff Law




On 8/2/24 6:50 AM, Richard Biener wrote:

The following fixes a compile-time/memory-hog when performing a
large aggregate copy to a small object allocated to a register.
While store_bit_field_1 called by store_integral_bit_field will
do nothing for accesses outside of the target the loop over the
source in store_integral_bit_field will still code-generate
the read parts for all words in the source.  The following copies
the noop condition from store_bit_field_1 and terminates the
loop when it runs forward or avoids code-generating the read parts
when not.
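
The loop behaviour described above can be sketched outside of GCC as follows.
This is only an illustration under simplifying assumptions: plain arrays of
words stand in for the RTL operands, and the function name is hypothetical.
The return value counts how many source words were actually read, which is
the quantity the fix keeps bounded by the destination size.

```c
#include <assert.h>
#include <stddef.h>

/* Copy SRC_WORDS words into a destination holding only DST_WORDS words and
   return how many source words were actually read.  */
size_t
copy_words_clipped (unsigned long *dst, size_t dst_words,
                    const unsigned long *src, size_t src_words,
                    int backwards)
{
  assert (dst != NULL && src != NULL);
  size_t reads = 0;
  for (size_t n = 0; n < src_words; n++)
    {
      size_t i = backwards ? src_words - 1 - n : n;
      if (i >= dst_words)
        {
          if (backwards)
            continue;   /* Later iterations may still land inside DST.  */
          break;        /* Forward: every remaining word is outside DST.  */
        }
      dst[i] = src[i];
      reads++;
    }
  return reads;
}
```

In both directions the number of reads is limited by dst_words rather than
src_words, so a huge source no longer produces a huge amount of generated
code.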

Bootstrapped and tested on x86_64-unknown-linux-gnu, OK?

Thanks,
Richard.

PR middle-end/111821
* expmed.cc (store_integral_bit_field): Terminate the
word-wise copy loop when we get out of the destination
and do a forward copy.  Skip the word if it would be
outside of the destination in case of a backward copy.

* gcc.dg/torture/pr111821.c: New testcase.

OK
jeff



Re: [PATCH v2] RISC-V: Add support to vector stack-clash protection

2024-08-05 Thread Jeff Law




On 8/1/24 2:16 PM, Raphael Zinsly wrote:

On Thu, Aug 1, 2024 at 3:40 PM Jeff Law  wrote:

On 8/1/24 6:01 AM, Raphael Moreira Zinsly wrote:

+/* Both prologue temp registers are used in the vector probe loop for when
+   stack-clash protection is enabled, so we need to copy SP to a new register
+   and set it as CFA during the loop, we are using T3 for that.  */
+#define RISCV_STACK_CLASH_VECTOR_CFA_REGNUM (GP_TEMP_FIRST + 23)

"23" looks like a typo.  Shouldn't it be "3"?


GP_TEMP_FIRST + 3 = 8, which is s0/fp.
t3 is register 28.

I'd forgotten the temps are a disjoint set, sorry about goofing that up.
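
The register arithmetic in the exchange above can be checked with a small
table of the RISC-V integer ABI names.  A sketch, assuming GP_TEMP_FIRST is 5
(the value implied by "GP_TEMP_FIRST + 3 = 8"); the table and helper below are
illustrative and not taken from the GCC sources.

```c
#include <assert.h>

/* RISC-V integer register ABI names, indexed by x-register number.  */
static const char *const riscv_gpr_abi_names[32] = {
  "zero", "ra", "sp", "gp", "tp", "t0", "t1", "t2",
  "s0", "s1", "a0", "a1", "a2", "a3", "a4", "a5",
  "a6", "a7", "s2", "s3", "s4", "s5", "s6", "s7",
  "s8", "s9", "s10", "s11", "t3", "t4", "t5", "t6"
};

/* Assumed value: the arithmetic in the reply (GP_TEMP_FIRST + 3 = 8)
   implies 5, i.e. t0.  */
#define GP_TEMP_FIRST 5

/* Return the ABI name of the register at GP_TEMP_FIRST + OFFSET.  */
const char *
gp_temp_offset_name (int offset)
{
  int regno = GP_TEMP_FIRST + offset;
  assert (regno >= 0 && regno < 32);
  return riscv_gpr_abi_names[regno];
}
```

Offset 3 lands on x8 (s0/fp), while offset 23 lands on x28 (t3), because the
temporaries t0-t2 and t3-t6 are not contiguous in the register file.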

The series is OK for the trunk.  It's been a long road.


jeff


Re: [PATCH v2] RISC-V: Add deprecation warning to LP64E abi

2024-08-05 Thread Jeff Law




On 7/30/24 6:32 PM, Patrick O'Neill wrote:

gcc/ChangeLog:

PR 116152
* config/riscv/riscv.cc (riscv_option_override): Add deprecation
warning.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/predef-9.c: Add check for warning.

OK
jeff



RE: Support streaming of poly_int for offloading when it's degree <= accel's NUM_POLY_INT_COEFFS

2024-08-05 Thread Prathamesh Kulkarni


> -Original Message-
> From: Jakub Jelinek 
> Sent: Friday, August 2, 2024 5:43 PM
> To: Prathamesh Kulkarni 
> Cc: Richard Biener ; Richard Sandiford
> ; gcc-patches@gcc.gnu.org
> Subject: Re: Support streaming of poly_int for offloading when it's
> degree <= accel's NUM_POLY_INT_COEFFS
> 
> 
> On Fri, Aug 02, 2024 at 11:58:19AM +, Prathamesh Kulkarni wrote:
> > diff --git a/gcc/data-streamer-in.cc b/gcc/data-streamer-in.cc index
> > 7dce2928ef0..7b9d8cc0129 100644
> > --- a/gcc/data-streamer-in.cc
> > +++ b/gcc/data-streamer-in.cc
> > @@ -182,10 +182,8 @@ streamer_read_hwi (class lto_input_block *ib)
> >  poly_uint64
> >  streamer_read_poly_uint64 (class lto_input_block *ib)  {
> > -  poly_uint64 res;
> > -  for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
> > -res.coeffs[i] = streamer_read_uhwi (ib);
> > -  return res;
> > +  return
> poly_int_read_common::coeff_type>
> > +  (streamer_read_uhwi, ib);
> 
> Can't you use
>   using coeff_type = poly_int_traits ::coeff_type;
>   return poly_int_read_common  (streamer_read_uhwi, ib); ?
> The call arguments on different line from the actual function name are
> ugly.
> 
> > --- a/gcc/data-streamer.cc
> > +++ b/gcc/data-streamer.cc
> > @@ -28,6 +28,12 @@ along with GCC; see the file COPYING3.  If not see
> > #include "cgraph.h"
> >  #include "data-streamer.h"
> >
> > +/* For offloading -- While streaming-out, host NUM_POLY_INT_COEFFS is
> > +   stored at beginning of mode_table. While streaming-in, the value
> is read in
> > +   host_num_poly_int_coeffs.  */
> > +
> > +unsigned host_num_poly_int_coeffs = 0;
> 
> I think it would be better to guard this with #ifdef ACCEL_COMPILER.
> 
> > +template
> > +poly_int poly_int_read_common (F read_coeff,
> > +Args ...args) {
> > +  poly_int x;
> > +  unsigned i;
> > +
> > +#ifndef ACCEL_COMPILER
> > +  host_num_poly_int_coeffs = NUM_POLY_INT_COEFFS; #endif
> 
> And instead of modifying a global var again and again do #ifdef
> ACCEL_COMPILER
>   const unsigned num_poly_int_coeffs = host_num_poly_int_coeffs;
>   gcc_assert (num_poly_int_coeffs > 0);
> #else
>   const unsigned num_poly_int_coeffs = NUM_POLY_INT_COEFFS; #endif and
> use num_poly_int_coeffs in the functions.
Hi Jakub,
I have done the suggested changes in the attached patch.
Does it look OK ?

Thanks,
Prathamesh
> 
> Jakub

Partially support streaming of poly_int for offloading.

The patch streams out host NUM_POLY_INT_COEFFS, and changes
streaming in as follows:

if (host_num_poly_int_coeffs <= NUM_POLY_INT_COEFFS)
{
  for (i = 0; i < host_num_poly_int_coeffs; i++)
poly_int.coeffs[i] = stream_in coeff;
  for (; i < NUM_POLY_INT_COEFFS; i++)
poly_int.coeffs[i] = 0;
}
else
{
  for (i = 0; i < NUM_POLY_INT_COEFFS; i++)
poly_int.coeffs[i] = stream_in coeff;

  /* Ensure that degree of poly_int <= accel NUM_POLY_INT_COEFFS.  */ 
  for (; i < host_num_poly_int_coeffs; i++)
{
  val = stream_in coeff;
  if (val != 0)
error ();
}
}
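
The pseudocode above can be turned into a small standalone C sketch.  It is
not the patch's poly_int_read_common template: an array stands in for the LTO
input stream, ACCEL_NUM_COEFFS is an assumed stand-in for the accelerator's
NUM_POLY_INT_COEFFS, and a false return replaces the call to error ().

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the accelerator's NUM_POLY_INT_COEFFS (assumed value).  */
#define ACCEL_NUM_COEFFS 2

/* Read HOST_N coefficients from STREAM into OUT[ACCEL_NUM_COEFFS].
   Returns false when the host value has a nonzero coefficient the
   accelerator cannot represent.  */
bool
read_poly_coeffs (const long *stream, unsigned host_n,
                  long out[ACCEL_NUM_COEFFS])
{
  unsigned i;
  assert (host_n > 0);

  if (host_n <= ACCEL_NUM_COEFFS)
    {
      /* Host has fewer (or equally many) coefficients: zero-extend.  */
      for (i = 0; i < host_n; i++)
        out[i] = stream[i];
      for (; i < ACCEL_NUM_COEFFS; i++)
        out[i] = 0;
      return true;
    }

  /* Host has more coefficients: the extra ones must all be zero.  */
  for (i = 0; i < ACCEL_NUM_COEFFS; i++)
    out[i] = stream[i];
  for (; i < host_n; i++)
    if (stream[i] != 0)
      return false;
  return true;
}
```

A host value such as {4, 5, 0} is accepted by a two-coefficient accelerator,
whereas {4, 5, 6} is rejected because its degree exceeds what the accelerator
can hold.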

gcc/ChangeLog:
PR ipa/96265
PR ipa/111937
* data-streamer-in.cc (streamer_read_poly_uint64): Remove code for
streaming, and call poly_int_read_common instead. 
(streamer_read_poly_int64): Likewise.
* data-streamer.cc (host_num_poly_int_coeffs): Conditionally define
new variable if ACCEL_COMPILER is defined.
* data-streamer.h (host_num_poly_int_coeffs): Declare.
(poly_int_read_common): New function template.
(bp_unpack_poly_value): Remove code for streaming and call
poly_int_read_common instead.
* lto-streamer-in.cc (lto_input_mode_table): Stream-in host
NUM_POLY_INT_COEFFS into host_num_poly_int_coeffs if ACCEL_COMPILER
is defined.
* lto-streamer-out.cc (lto_write_mode_table): Stream out
NUM_POLY_INT_COEFFS if offloading is enabled.
* poly-int.h (MAX_NUM_POLY_INT_COEFFS_BITS): New macro.
* tree-streamer-in.cc (lto_input_ts_poly_tree_pointers): Adjust
streaming-in of poly_int.

Signed-off-by: Prathamesh Kulkarni 

diff --git a/gcc/data-streamer-in.cc b/gcc/data-streamer-in.cc
index 7dce2928ef0..57955a20091 100644
--- a/gcc/data-streamer-in.cc
+++ b/gcc/data-streamer-in.cc
@@ -182,10 +182,8 @@ streamer_read_hwi (class lto_input_block *ib)
 poly_uint64
 streamer_read_poly_uint64 (class lto_input_block *ib)
 {
-  poly_uint64 res;
-  for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
-res.coeffs[i] = streamer_read_uhwi (ib);
-  return res;
+  using coeff_type = poly_int_traits::coeff_type;
+  return poly_int_read_common (streamer_read_hwi, ib);
 }
 
 /* Read a poly_int64 from IB.  */
@@ -193,10 +191,8 @@ streamer_read_poly_uint64 (class lto_input_block *ib)
 poly_int64
 streamer_read_poly_int64 (class lto_input_block *ib)
 {
-  poly_int64 res;
-  for (unsigned int i = 0; i < NUM_POLY_INT_COEFFS; ++i)
-res.coeffs[i] = streamer_read_hwi (ib);
-  return 

Re: [PATCH] PR tree-optimization/57371: Optimize (float)i == 16777222.0f sometimes.

2024-08-05 Thread Jeff Law




On 7/28/24 6:01 AM, Roger Sayle wrote:


This patch improves the tree-level optimization of (fptype)ivar != CST
in match.pd (historically tracked under PR 57371).  Joseph Myers'
description in comment #1 provides an excellent overview of the issues,
that historically it's the trapping behaviour of (fptype)ivar conversion
that is the primary concern, which is why the current code in match.pd
checks fmt.can_represent_integral_type_p (itype).  The first of the
improvements with this patch is to check flag_trapping_math to control
whether FP_OVERFLOW/FP_INEXACT needs to be preserved, and to use
ranger to determine whether the bounds on ivar confirm that these
traps aren't possible.  For example, the expression (int)var & 15
can't overflow conversion to IEEE float, even though the type of a
32-bit int could potentially overflow the significand of a 32-bit
float.

The next of the optimizations concerns checking whether the comparison
against CST is unambiguous allowing it to be replaced with an integer
comparison.  For reference, consider the table below which shows the
default conversion of integers to IEEE 32-bit float.

(float)16777211 => 16777211.0f;
(float)16777212 => 16777212.0f;
(float)16777213 => 16777213.0f;
(float)16777214 => 16777214.0f;
(float)16777215 => 16777215.0f;
(float)16777216 => 16777216.0f;
(float)16777217 => 16777216.0f;  // rounded
(float)16777218 => 16777218.0f;
(float)16777219 => 16777220.0f;  // rounded
(float)16777220 => 16777220.0f;
(float)16777221 => 16777220.0f;  // rounded
(float)16777222 => 16777222.0f;
(float)16777223 => 16777224.0f;  // rounded
(float)16777224 => 16777224.0f;
(float)16777225 => 16777224.0f;  // rounded
(float)16777226 => 16777226.0f;

Observe that it's safe to optimize (float)i == 16777212.0f to the
equivalent i == 16777212 (as this is the only integer that can
convert to that floating point constant), but that it's unsafe to
optimize (float)i == 16777220.0f, as with default rounding there
are three possible integer values that FLOAT_EXPR to 16777220.0f.
The pragmatic check used in this patch is to confirm that (float)(i-1)
and (float)(i+1) are both distinct from (float)i before simplifying
the comparison to an integer-typed comparison.
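
The neighbour test described above can be tried directly in C.  A minimal
sketch, assuming float is IEEE binary32 and the default round-to-nearest mode
is in effect; the function name is made up and this is not the code added to
fold-const.cc.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Return true when I is the only int that converts to (float) I, judged by
   the neighbour test described above.  Volatile stores force each result to
   be rounded to float even on targets with excess precision.  */
bool
float_image_is_unique (int i)
{
  assert (i > INT_MIN && i < INT_MAX);
  volatile float lo = (float) (i - 1);
  volatile float mid = (float) i;
  volatile float hi = (float) (i + 1);
  return lo != mid && hi != mid;
}
```

Checked against the table, 16777212 and 16777222 pass the test, while 16777216
and 16777220 fail it because a neighbouring integer rounds to the same float.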

Finally, this patch also handles non-default rounding modes.
In the table above, it's safe to optimize (float)i == 16777222.0f
in IEEE's default rounding mode, but not in all FP rounding modes.
This eventuality is handled by testing whether the (float)i, the
(float)(i-1) and the (float)(i+1) are all exactly rounded when
-frounding-math is specified.

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32}
with no new failures.  Ok for mainline?  If the testcases need to be
tweaked for non-IEEE targets (the transformations themselves should
be portable to VAX and IBM floating point formats) hopefully that
can be done as follow-up patches by folks with the appropriate
effective-targets?


2024-07-28  Roger Sayle  

gcc/ChangeLog
 PR tree-optimization/57371
 * fold-const.cc (fold_cmp_float_cst_p): New helper function.
 * fold-const.h (fold_cmp_float_cst_p): Prototype here.
 * match.pd ((FTYPE) N CMP CST): Use ranger to determine
 whether value is exactly representable by floating point type,
 and check flag_trapping_math if not.  Use the new helper
 fold_cmp_float_cst_p to check that transformation to an integer
 comparison is safe.

gcc/testsuite/ChangeLog
 PR tree-optimization/57371
 * c-c++-common/pr57371-6.c: New test case.
 * c-c++-common/pr57371-7.c: Likewise.
 * c-c++-common/pr57371-8.c: Likewise.
 * c-c++-common/pr57371-9.c: Likewise.
 * c-c++-common/pr57371-10.c: Likewise.
Nice.  I was a bit concerned about using Ranger in match.pd as match.pd 
can be used for GENERIC as well as GIMPLE.  But it looks like you 
handled that reasonably.  Similarly for the other FP formats.


As you note, I wouldn't be terribly surprised if the other FP formats 
need testsuite adjustments.  I'd hoped we had a target selector, but we 
don't.


Does it make sense to use "add_options_for_ieee" to conditionally add 
the necessary target options in the tests?   It only affects alpha, sh & 
rx, so it's unlikely to have been needed in your testing.


Jeff


Re: Support streaming of poly_int for offloading when it's degree <= accel's NUM_POLY_INT_COEFFS

2024-08-05 Thread Jakub Jelinek
On Mon, Aug 05, 2024 at 02:24:00PM +, Prathamesh Kulkarni wrote:
> gcc/ChangeLog:
>   PR ipa/96265
>   PR ipa/111937
>   * data-streamer-in.cc (streamer_read_poly_uint64): Remove code for
>   streaming, and call poly_int_read_common instead. 
>   (streamer_read_poly_int64): Likewise.
>   * data-streamer.cc (host_num_poly_int_coeffs): Conditionally define
>   new variable if ACCEL_COMPILER is defined.
>   * data-streamer.h (host_num_poly_int_coeffs): Declare.
>   (poly_int_read_common): New function template.
>   (bp_unpack_poly_value): Remove code for streaming and call
>   poly_int_read_common instead.
>   * lto-streamer-in.cc (lto_input_mode_table): Stream-in host
>   NUM_POLY_INT_COEFFS into host_num_poly_int_coeffs if ACCEL_COMPILER
>   is defined.
>   * lto-streamer-out.cc (lto_write_mode_table): Stream out
>   NUM_POLY_INT_COEFFS if offloading is enabled.
>   * poly-int.h (MAX_NUM_POLY_INT_COEFFS_BITS): New macro.
>   * tree-streamer-in.cc (lto_input_ts_poly_tree_pointers): Adjust
>   streaming-in of poly_int.
> 
> Signed-off-by: Prathamesh Kulkarni 

> --- a/gcc/data-streamer.cc
> +++ b/gcc/data-streamer.cc
> @@ -28,6 +28,14 @@ along with GCC; see the file COPYING3.  If not see
>  #include "cgraph.h"
>  #include "data-streamer.h"
>  
> +/* For offloading -- While streaming-out, host NUM_POLY_INT_COEFFS is
> +   stored at beginning of mode_table. While streaming-in, the value is read 
> in

Two spaces after . rather than just one, and because of that move in on the
next line.

> +   host_num_poly_int_coeffs.  */

Otherwise LGTM.

Jakub



Re: [RFC][PATCH] SVE intrinsics: Fold svdiv (svptrue, x, x) to ones

2024-08-05 Thread Kyrylo Tkachov


> On 5 Aug 2024, at 12:01, Richard Sandiford  wrote:
> 
> 
> Jennifer Schmitz  writes:
>> This patch folds the SVE intrinsic svdiv into a vector of 1's in case
>> 1) the predicate is svptrue and
>> 2) dividend and divisor are equal.
>> This is implemented in the gimple_folder for signed and unsigned
>> integers. Corresponding test cases were added to the existing test
>> suites.
>> 
>> The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
>> OK for mainline?
>> 
>> Please also advise whether it makes sense to implement the same optimization
>> for float types and if so, under which conditions?
> 
> I think we should instead use const_binop to try to fold the division
> whenever the predicate is all-true, or if the function uses _x predication.
> (As a follow-on, we could handle _z and _m too, using VEC_COND_EXPR.)
> 

From what I can see const_binop only works on constant arguments.
Is fold_binary a better interface to use ? I think it’d hook into the match.pd 
machinery for divisions at some point.
Thanks,
Kyrill

> We shouldn't need to vet the arguments, since const_binop does that itself.
> Using const_binop should also get the conditions right for floating-point
> divisions.
> 
> Thanks,
> Richard
> 
> 
>> 
>> Signed-off-by: Jennifer Schmitz 
>> 
>> gcc/
>> 
>>  * config/aarch64/aarch64-sve-builtins-base.cc (svdiv_impl::fold):
>>  Add optimization.
>> 
>> gcc/testsuite/
>> 
>>  * gcc.target/aarch64/sve/acle/asm/div_s32.c: New test.
>>  * gcc.target/aarch64/sve/acle/asm/div_s64.c: Likewise.
>>  * gcc.target/aarch64/sve/acle/asm/div_u32.c: Likewise.
>>  * gcc.target/aarch64/sve/acle/asm/div_u64.c: Likewise.
>> 
>> From 43913cfa47b31d055a0456c863a30e3e44acc2f0 Mon Sep 17 00:00:00 2001
>> From: Jennifer Schmitz 
>> Date: Fri, 2 Aug 2024 06:41:09 -0700
>> Subject: [PATCH] SVE intrinsics: Fold svdiv (svptrue, x, x) to ones
>> 
>> This patch folds the SVE intrinsic svdiv into a vector of 1's in case
>> 1) the predicate is svptrue and
>> 2) dividend and divisor are equal.
>> This is implemented in the gimple_folder for signed and unsigned
>> integers. Corresponding test cases were added to the existing test
>> suites.
>> 
>> The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
>> OK for mainline?
>> 
>> Signed-off-by: Jennifer Schmitz 
>> 
>> gcc/
>> 
>>  * config/aarch64/aarch64-sve-builtins-base.cc (svdiv_impl::fold):
>>  Add optimization.
>> 
>> gcc/testsuite/
>> 
>>  * gcc.target/aarch64/sve/acle/asm/div_s32.c: New test.
>>  * gcc.target/aarch64/sve/acle/asm/div_s64.c: Likewise.
>>  * gcc.target/aarch64/sve/acle/asm/div_u32.c: Likewise.
>>  * gcc.target/aarch64/sve/acle/asm/div_u64.c: Likewise.
>> ---
>> .../aarch64/aarch64-sve-builtins-base.cc  | 19 ++---
>> .../gcc.target/aarch64/sve/acle/asm/div_s32.c | 27 +++
>> .../gcc.target/aarch64/sve/acle/asm/div_s64.c | 27 +++
>> .../gcc.target/aarch64/sve/acle/asm/div_u32.c | 27 +++
>> .../gcc.target/aarch64/sve/acle/asm/div_u64.c | 27 +++
>> 5 files changed, 124 insertions(+), 3 deletions(-)
>> 
>> diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc 
>> b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
>> index d55bee0b72f..e347d29c725 100644
>> --- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
>> +++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
>> @@ -755,8 +755,21 @@ public:
>>   gimple *
>>   fold (gimple_folder &f) const override
>>   {
>> -tree divisor = gimple_call_arg (f.call, 2);
>> -tree divisor_cst = uniform_integer_cst_p (divisor);
>> +tree pg = gimple_call_arg (f.call, 0);
>> +tree op1 = gimple_call_arg (f.call, 1);
>> +tree op2 = gimple_call_arg (f.call, 2);
>> +
>> +if (f.type_suffix (0).integer_p
>> + && is_ptrue (pg, f.type_suffix (0).element_bytes)
>> + && operand_equal_p (op1, op2, 0))
>> +  {
>> + tree lhs_type = TREE_TYPE (f.lhs);
>> + tree_vector_builder builder (lhs_type, 1, 1);
>> + builder.quick_push (build_each_one_cst (TREE_TYPE (lhs_type)));
>> + return gimple_build_assign (f.lhs, builder.build ());
>> +  }
>> +
>> +tree divisor_cst = uniform_integer_cst_p (op2);
>> 
>> if (!divisor_cst || !integer_pow2p (divisor_cst))
>>   return NULL;
>> @@ -770,7 +783,7 @@ public:
>>  shapes::binary_uint_opt_n, MODE_n,
>>  f.type_suffix_ids, GROUP_none, f.pred);
>>  call = f.redirect_call (instance);
>> - tree d = INTEGRAL_TYPE_P (TREE_TYPE (divisor)) ? divisor : divisor_cst;
>> + tree d = INTEGRAL_TYPE_P (TREE_TYPE (op2)) ? op2 : divisor_cst;
>>  new_divisor = wide_int_to_tree (TREE_TYPE (d), tree_log2 (d));
>>   }
>> else
>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/div_s32.c 
>> b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm

Re: [PATCH-1v4] Value Range: Add range op for builtin isinf

2024-08-05 Thread Jeff Law




On 7/23/24 4:39 PM, Andrew MacLeod wrote:
the range is in r, and is set to [0,0].  this is the false part of what 
is being returned for the range.


the "return true" indicates we determined a range, so use what is in R.

returning false means we did not find a range to return, so r is garbage.
Duh.  I guess I should have realized that.  I'll have to take another 
look at Hao's patch.  It's likely OK, but let me take another looksie.
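
The calling convention Andrew describes (a true return means "use what is in
r", a false return means "r is garbage") can be illustrated with a toy
version.  This is a hypothetical sketch, not GCC's range-op interface: the
struct, the function name and the use of plain doubles for the operand bounds
are all assumptions made for the example.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an integer range [lo, hi].  */
struct int_range
{
  int lo;
  int hi;
};

/* Fold isinf over an operand known to lie in [OP_LO, OP_HI].  Returns true
   and sets *R when a range was determined; on false, *R must be ignored.  */
bool
fold_isinf_range (double op_lo, double op_hi, struct int_range *r)
{
  assert (r != NULL);
  if (op_lo > -INFINITY && op_hi < INFINITY)
    {
      /* The operand can never be infinite, so isinf is always false.  */
      r->lo = 0;
      r->hi = 0;
      return true;
    }
  return false;
}
```

A caller therefore has to test the boolean result before reading the range;
the [0,0] stored in r is the "false" answer of isinf, not a failure code.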


jeff



Re: Ping^4 [PATCH-2v4] Value Range: Add range op for builtin isfinite

2024-08-05 Thread Jeff Law




On 7/21/24 8:10 PM, HAO CHEN GUI wrote:

Hi,
   Gently ping it.
https://gcc.gnu.org/pipermail/gcc-patches/2024-May/653094.html

OK.  Sorry for the delays.

jeff



Re: Ping^4 [PATCH-3v2] Value Range: Add range op for builtin isnormal

2024-08-05 Thread Jeff Law




On 7/21/24 8:09 PM, HAO CHEN GUI wrote:

Hi,
   Gently ping it.
https://gcc.gnu.org/pipermail/gcc-patches/2024-May/653095.html

Also OK.

jeff



Re: [PATCHv2, expand] Add const0 move checking for CLEAR_BY_PIECES optabs

2024-08-05 Thread Jeff Law




On 7/26/24 2:55 AM, HAO CHEN GUI wrote:

Hi Jeff,

On 2024/7/24 5:57, Jeff Law wrote:



On 7/21/24 7:58 PM, HAO CHEN GUI wrote:

Hi,
    This patch adds const0 move checking for CLEAR_BY_PIECES. The original
vec_duplicate handles duplicates of non-constant inputs, but 0 is a
constant. So even if a platform doesn't support vec_duplicate, it can
still clear by pieces if it supports a const0 move in that mode.

    Compared to the previous version, the main change is to do a direct
const0 move for by-pieces clear if the target supports a const0 move in
that mode.
https://gcc.gnu.org/pipermail/gcc-patches/2024-January/643063.html

    Bootstrapped and tested on x86 and powerpc64-linux BE and LE with no
regressions. There are several regressions on aarch64. They could be
fixed by enhancing const0 move on V2x8QImode. Is it OK for trunk?

Can you be more specific about the aarch64 regressions?  Execution? Scan-asm?  
ICE?

Ideally we'd include a patch to fix those regressions as well.


aarch64 supports a const0 move on V2x8QImode (the first vector mode it
can find for a 16-byte length), but the generated instructions are not
ideal. aarch64 has a better "stp" instruction, which uses the zero
register to store 16 bytes of zeros to memory.

I tested the following experimental patch on aarch64. It converts a const0
move on V2x8QImode to a const0 move on TImode, which generates an "stp"
instruction. It fixed all the regressions.

We'll probably need Richard S. or someone else to chime in on the actual 
patch, but yea, if they can leverage stp, it's likely going to be better 
than actual vectors.
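
For readers following along, the kind of source that exercises this path can be sketched as below. It is a hypothetical reduction, not the testcase from the thread: `pair16` and `clear16` are made-up names, and whether a given compiler version emits a pair store of the zero register or a vector store for it depends on the target and options.

```c
#include <assert.h>
#include <string.h>

/* A 16-byte object, matching the 16-byte length discussed above.  */
struct pair16
{
  unsigned long long a;
  unsigned long long b;
};

static_assert (sizeof (struct pair16) == 16, "expected a 16-byte object");

/* A small fixed-size clear of this kind is a candidate for being
   expanded inline (clear by pieces) rather than as a memset call.  */
void
clear16 (struct pair16 *p)
{
  memset (p, 0, sizeof *p);
}
```

Compiling such a function at -O2 and inspecting the assembly is one way to see which store sequence the by-pieces expansion chose.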


Do we have a testcase for this issue or was it something you just 
happened to notice?




diff --git a/gcc/config/aarch64/aarch64-simd.md 
b/gcc/config/aarch64/aarch64-simd.md
index 01b084d8ccb..8aa72940b12 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -7766,7 +7766,14 @@ (define_expand "mov<mode>"
 (match_operand:VSTRUCT_QD 1 "general_operand"))]
"TARGET_FLOAT"
  {
-  if (can_create_pseudo_p ())
+  if (<MODE>mode == V2x8QImode
+  && operands[1] == CONST0_RTX (V2x8QImode)
+  && MEM_P (operands[0]))
+{
+  operands[0] = adjust_address (operands[0], TImode, 0);
+  operands[1] = CONST0_RTX (TImode);
+}

There may be other modes where they'll want to do something similar. 
Hence my note above that we'll likely need to get Richard S. or someone 
else with more knowledge of the port involved.







1. Originally it tests vec_duplicate_optab for clear by pieces. According to
md.texi, this pattern should only handle non-constant inputs; constant
vectors should go through the move pattern instead. So shall we remove the
vec_duplicate_optab check entirely?

I think so.  The docs are pretty clear that the vec_duplicate pattern 
should only handle non-constant values.





2. Clear by pieces finally calls emit_move_insn to do the memory set. I
noticed that emit_move_insn doesn't check the predicate of the move expander,
so any operand (mem/reg/constant) should be fine as long as mov_optab exists
for the mode. Do we need the following predicate check for the const0 move?
insn_operand_matches (icode, 1, CONST0_RTX (mode))

I'm pretty confident that if we can't move (const_int 0) into any of the 
standard integer modes (up to and including word_mode) that all kinds of 
things would break.


Perhaps just make it an assert?






3. For clear by pieces, the zero is currently not passed directly to the
generate function. It uses a helper function, builtin_memset_read_str, to
store the zero in a pseudo and then does the move with the pseudo. I saw
that i386 can benefit from this process, as it calls
gen_memset_value_from_prev to generate the value from the previous one. i386
can use its variable-length registers to store different modes of constant
zero. If we pass zero directly to the generate function, this optimization
will be lost. Shall we add a target hook for it?

I probably wouldn't do a target hook for this.  I'd let the x86 backend 
deal with it in their expanders.


jeff



Re: [PATCH] Fix Wstringop-overflow-47.c warning in RISC-V target.

2024-08-05 Thread Jeff Law




On 7/15/24 10:08 PM, Jiawei wrote:


On 2024/07/16 8:28, Jeff Law wrote:
IIRC these fails are dependent upon whether or not the statements turn 
into vector stores or not.


So to remove the xfail don't you have to know if vector is enabled/ 
disabled? 


I am not sure. I tried enabling RVV, but it still passes the test:

https://godbolt.org/z/bvWfffTe5

Probably because it didn't vectorize ;-)  I don't remember all these 
tests, but I do remember some of them are highly sensitive to the 
changes in code generation from vectorization.


OK for the trunk.  Though I wouldn't be surprised if we have to come 
back to this at some point and adjust again.


jeff



Re: [PATCH] testsuite: Add RISC-V to targets not xfailing gcc.dg/attr-alloc_size-11.c:50, 51.

2024-08-05 Thread Jeff Law




On 8/5/24 6:26 AM, Jiawei wrote:

The test has been observed to pass on most architectures including RISC-V:
https://godbolt.org/z/8nYEvW6n1

For the original issue, see:
https://gcc.gnu.org/PR79356#c11

Add the RISC-V target to the pass list.

gcc/testsuite/ChangeLog:

* gcc.dg/attr-alloc_size-11.c: Add RISC-V to the list
of targets excluding xfail on lines 50 and 51.

Almost certainly behaving like the other targets in the list due to how 
promotions work.


OK for the trunk.  Thanks!

jeff



Re: [PATCH] RISC-V: Minimal support for Zimop extension.

2024-08-05 Thread Jiawei



On 2024/8/5 22:15, Jeff Law wrote:



On 8/4/24 8:20 PM, Jiawei wrote:


On 2024/8/5 8:45, Jeff Law wrote:



On 8/2/24 9:32 AM, Jiawei wrote:

https://github.com/riscv/riscv-isa-manual/blob/main/src/zimop.adoc

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc: New extension.
* config/riscv/riscv.opt: New mask.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/arch-42.c: New test.
* gcc.target/riscv/arch-43.c: New test.

Shouldn't the binutils bits go in first?  There's basic support for 
Zimop/Zcmop from Lyut on the binutils list in late 2023 or early 
2024. I'm pretty sure it was marked as DO NOT MERGE because we were 
waiting for the extension to get ratified.


Christoph informed me that Zimop has been ratified, so we may not 
need to worry about the spec lifecycle status:


https://jira.riscv.org/browse/RVS-1603?src=confmacro

Agreed.  No concerns about spec lifecycle at this point.





I don't know if Lyut is doing any RISC-V work right now, so if you 
wanted to ping the patch on his behalf, it'd be appreciated and I 
can handle the review on the binutils side too.


I found ESWIN's patch to support Zimop, posted to the binutils mailing 
list last month:


https://sourceware.org/pipermail/binutils/2024-June/134592.html

I don't watch binutils as closely as perhaps I should.

That patch looks marginally better than Lyut's version.  It has the 
updated version #s for the spec and handles the implied extensions. 
Let's go with Xiao's version.


Xiao, the Zimop/Zcmop patches are OK for binutils.

Jiawei, the GCC patches are OK once Xiao pushes his changes to the 
binutils repo.  Alternately if you have permissions in the binutils 
repo, you can push them for Xiao.


Jeff


Thanks Jeff! I don't think I have permissions in the binutils repo; 
let me contact Nelson and ask him for some help.



BR,

Jiawei



Re: [PATCH] RISC-V: Minimal support for Zimop extension.

2024-08-05 Thread Jeff Law




On 8/5/24 9:21 AM, Jiawei wrote:



Thanks Jeff! I don't think I have permissions in the binutils repo; 
let me contact Nelson and ask him for some help.

Sounds good.  Thanks for taking care of this.   I just wish I'd noticed 
the patch a month ago so that we could have included it in the 2.43 
release that was just made.


jeff



Re: [RFC v3 3/3] c: Add __lengthof__() operator

2024-08-05 Thread Alejandro Colomar
Hi Martin,

On Mon, Aug 05, 2024 at 03:35:06PM GMT, Martin Uecker wrote:
> > > > > > For incomplete arrays, basically we have the following different
> > > > > > variants for arrays:
> > > > > > 
> > > > > > T[ ] incomplete: !TYPE_DOMAIN 
> > > > > > T[1] constant size: TYPE_MAX_VALUE == INTEGER_CST
> > > > > > T[n] variable size: TYPE_MAX_VALUE != INTEGER_CST
> > > > > > T[0] flexible array member: !TYPE_MAX_VALUE && !C_TYPE_VARIABLE_SIZE
> > > > > >   (ISO version T[0] has TYPE_SIZE == NULL_TREE)
> > > > > > T[*] unspecified variable size: !TYPE_MAX_VALUE && 
> > > > > > C_TYPE_VARIABLE_SIZE
> > > > > 
> > > > > Could you describe the following types?  I've repeated the ones you
> > > > > already described, deduplicated some that have a different meaning in
> > > > > different contexts, and added some multi-dimensional arrays.
> > > > > 
> > > > > T[ ] (incomplete type; function parameter)
> > > > > T[ ] (flexible array member)
> > > > > T[0] (zero-size array)
> > > > > T[0] (GNU flexible array member)
> > > > > T[1] (old flexible array member)
> > > > > T[7] (constant size)
> > > > > T[7][n]  (constant size with inner variable size)
> > > > > T[7][*]  (constant size with inner unspecified size)
> > > > 
> > > > And please also describe T[7][4], although I expect that to be just the
> > > > same as T[7].
> > > 
> > > And it would also be interesting to describe T[7][ ].
> > 
> > And maybe also:
> > 
> > T[n][m]
> > T[n][*]
> > T[n][ ]
> > T[n][7]
> 
> I do not understand your question. What do you mean by
> "describe the type"?

I had in mind what you already did above (e.g.,
T[1] constant size: TYPE_MAX_VALUE == INTEGER_CST), but with a more
comprehensive list.  comptypes_internal() seems to be what I wanted.

> But I think you might make it unnecessarily complicated.  It
> should be sufficient to look at the outermost size.  You
> can completely ignore whatever happens inside.  There
> should be three cases if I am not mistaken:
> 
> - incomplete (includes ISO FAM) -> error
> - constant (includes GNU FAM) -> return fixed size
> - variable (includes unspecified) -> evaluate the
> argument and return the size, while making sure it is 
> visibly non-constant.
> 
> To check that the array has a variable length, you can use
> the same logic as in comptypes_internal (cf. d1_variable).
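
The "outermost size only" rule can be illustrated with a small C sketch. It uses a sizeof-based macro, `NELEMS`, as a hypothetical stand-in for the proposed operator (the operator itself is not assumed to be available), and the function names are made up for the example. It relies on variable length array support in the compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sizeof-based stand-in for the proposed operator.  */
#define NELEMS(a) (sizeof (a) / sizeof ((a)[0]))

/* Constant case: the outermost size is 7 no matter what the inner
   dimension is, so the result does not depend on n.  */
size_t
outer_constant (size_t n)
{
  assert (n > 0);
  int a[7][n];
  return NELEMS (a);
}

/* Variable case: the outermost size is n, so the result has to be
   computed at run time.  */
size_t
outer_variable (size_t n)
{
  assert (n > 0);
  int a[n][3];
  return NELEMS (a);
}

/* Incomplete case: for "extern int inc[];" the expression
   NELEMS (inc) does not compile, because sizeof cannot be applied
   to an incomplete type.  */
```

The sketch only shows which dimension decides the answer; it says nothing about how the front end should classify the result as constant or non-constant.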

Hmmm, comptypes_internal() has taught me what I was asking here.
However, it seems to not be enough for what I actually need.

Here's my problem:

The array is correctly considered a fixed-length array.  I know it
because the following debugging code:

+fprintf(stderr, "ALX: %s() %d\n", __func__, __LINE__);
+tree dom = TYPE_DOMAIN (type);
+int zero = !TYPE_MAX_VALUE (dom);
+fprintf(stderr, "ALX: zero: %d\n", zero);
+int var0 = !zero
+&& (TREE_CODE (TYPE_MIN_VALUE (dom)) != INTEGER_CST
+   || TREE_CODE (TYPE_MAX_VALUE (dom)) != INTEGER_CST);
+fprintf(stderr, "ALX: var: %d\n", var0);
+int var = var0 || (zero && TYPE_LANG_FLAG_1(type));
+fprintf(stderr, "ALX: var: %d\n", var);
+  ret = array_type_nelts_top (type);
+fprintf(stderr, "ALX: %s() %d\n", __func__, __LINE__);

prints:

ALX: c_lengthof_type() 4098
ALX: zero: 0
ALX: var: 0
ALX: var: 0
ALX: c_lengthof_type() 4109

for
void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);

That differs from

ALX: c_lengthof_type() 4098
ALX: zero: 1
ALX: var: 0
ALX: var: 1
ALX: c_lengthof_type() 4109

for
void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]);

However, if I turn on -Wvla, both get a warning:

len.c: At top level:
len.c:288:1: warning: ISO C90 forbids variable length array ‘x’ [-Wvla]
  288 | void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
  | ^~~~
len.c:289:1: warning: ISO C90 forbids variable length array ‘x’ [-Wvla]
  289 | void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]);
  | ^~~~

I suspect that the problem is in:

$ grepc -tfd array_type_nelts_minus_one gcc
gcc/tree.cc:tree
array_type_nelts_minus_one (const_tree type)
{
  tree index_type, min, max;

  /* If they did it with unspecified bounds, then we should have already
 given an error about it before we got here.  */
  if (! TYPE_DOMAIN (type))
return error_mark_node;

  index_type = TYPE_DOMAIN (type);
  min = TYPE_MIN_VALUE (index_type);
  max = TYPE_MAX_VALUE (index_type);

  /* TYPE_MAX_VALUE may not be set if the array has unknown length.  */
  if (!max)
{
  /* zero sized arrays are represented from C FE as complete types with
 NULL TYPE_MAX_VALUE and zero TYPE_SIZE, while C++ FE represents
 them as min 0, max -1.  */
  if (COMPLETE_T

Re: [PATCH v2] Hard register constraints

2024-08-05 Thread Georg-Johann Lay

Am 05.08.24 um 15:59 schrieb Stefan Schulze Frielinghaus:

On Mon, Aug 05, 2024 at 02:19:50PM +0200, Georg-Johann Lay wrote:

Am 05.08.24 um 12:28 schrieb Stefan Schulze Frielinghaus:

This is rather unfortunate, but I couldn't find a way to validate
register names during genoutput.  If no one else has an idea, I will
replace gcc_assert with a more expressive error message.


[ADDITIONAL_]REGISTER_NAMES isn't available?  Though using that might
bypass the effect of target hooks like TARGET_CONDITIONAL_REGISTER_USAGE.


REGISTER_NAMES sometimes references target variables (see rs6000, for
example) which aren't linked into genoutput and are therefore unavailable.


But there are also cases with an asm operand print modifier; you cannot
check that, it's checked by TARGET_PRINT_OPERAND etc. which get a
hard register and not a string for a register name.

Maybe genoutput could add additional information to insn-output.cc or
whatever, and the compiler proper checks that and emits diagnostics
as needed?


Though, this would be a run-time check, right?  I was actually hoping
for a "compile-time" check, i.e., something which errors while compiling
GCC and not when GCC is executed.  The latter is already implemented.


Yes, it would be a run-time check.  As compiler options may be involved,
that's perhaps the only way.

Though such a test would always run, independent of the code being
compiled, so any problem would pop up immediately, e.g. in a self-test.
Hence not some nasty ICE that only triggers with specific code in
user land.  The runtime overhead would be negligible.
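
A minimal sketch of the idea, under heavy assumptions: the register table, the list of recorded names, and every function name below are hypothetical, and nothing here reflects how genoutput or the compiler proper actually stores this information. The point is only that a check over build-time data can run once at startup, whatever code is being compiled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define ARRAY_SIZE(a) (sizeof (a) / sizeof ((a)[0]))

/* Hypothetical register-name table for an imaginary target.  */
static const char *const reg_names[] = { "r0", "r1", "r2", "sp" };

/* Hypothetical list of register names mentioned in constraints, as a
   generator program might record them at build time.  */
static const char *const recorded_regs[] = { "r1", "sp" };

/* Return true if NAME appears in the register-name table.  */
static bool
known_register_p (const char *name)
{
  for (size_t i = 0; i < ARRAY_SIZE (reg_names); i++)
    if (strcmp (reg_names[i], name) == 0)
      return true;
  return false;
}

/* Return the first of the N names in NAMES that is not a known
   register, or NULL if all of them are valid.  */
const char *
first_unknown_register (const char *const *names, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (!known_register_p (names[i]))
      return names[i];
  return NULL;
}

/* Meant to be called once at startup, independent of the input, so a
   bad name is reported immediately rather than on some rare path.  */
bool
recorded_registers_valid_p (void)
{
  return first_unknown_register (recorded_regs,
                                 ARRAY_SIZE (recorded_regs)) == NULL;
}
```

Because the data is fixed at build time, the cost is one short loop per run, which matches the claim that the overhead would be negligible.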

Johann


Re: [PATCH v2] c++, coroutines: Simplify separation of the user function body and ramp.

2024-08-05 Thread Jason Merrill

On 8/3/24 11:40 AM, Iain Sandoe wrote:

On 2 Aug 2024, at 15:19, Jason Merrill  wrote:


On 8/2/24 6:50 AM, Iain Sandoe wrote:



This version simplifies the process by extrating the second case directly



typo

thanks, fixed.


+static bool
+use_eh_spec_block (tree fn)
+{
+  return (flag_exceptions && flag_enforce_eh_specs
+ && !type_throw_all_p (TREE_TYPE (fn)));
+}



Rather than (partially) duplicate this function, let's make the one in decl.cc 
non-static.

done.


+static tree
+split_coroutine_body_from_ramp (tree fndecl, tree eh_spec_block)



Rather than pass current_eh_spec_block into this function through a parameter, 
why not refer to it directly?

done.

retested on x86_64, darwin, OK for trunk?


OK.


--- 8< ---

We need to separate the original user-authored function body from the
definition of the ramp function (which is what is called instead).
The function body tree is either in DECL_SAVED_TREE or the first operand
of current_eh_spec_block (for functions with an EH spec).
This version simplifies the process by extracting the second case directly
instead of inspecting the DECL_SAVED_TREE trees to discover it.

gcc/cp/ChangeLog:

* coroutines.cc (split_coroutine_body_from_ramp): New.
(morph_fn_to_coro): Use split_coroutine_body_from_ramp().
* cp-tree.h (use_eh_spec_block): New.
* decl.cc (use_eh_spec_block): Make non-static.

Signed-off-by: Iain Sandoe 
---
  gcc/cp/coroutines.cc | 90 ++--
  gcc/cp/cp-tree.h |  1 +
  gcc/cp/decl.cc   |  2 +-
  3 files changed, 47 insertions(+), 46 deletions(-)

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index af03f5e0f74..95f989fc035 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -4437,6 +4437,43 @@ coro_rewrite_function_body (location_t fn_start, tree 
fnbody, tree orig,
return update_body;
  }
  
+/* Extract the body of the function we are going to outline, leaving
+   the original function decl ready to build the ramp.  */
+
+static tree
+split_coroutine_body_from_ramp (tree fndecl)
+{
+  tree body;
+  /* Once we've tied off the original user-authored body in fn_body.
+ Start the replacement synthesized ramp body.  */
+
+  if (use_eh_spec_block (fndecl))
+{
+  body = pop_stmt_list (TREE_OPERAND (current_eh_spec_block, 0));
+  TREE_OPERAND (current_eh_spec_block, 0) = push_stmt_list ();
+}
+  else
+{
+  body = pop_stmt_list (DECL_SAVED_TREE (fndecl));
+  DECL_SAVED_TREE (fndecl) = push_stmt_list ();
+}
+
+  /* We can't validly get here with an empty statement list, since there's no
+ way for the FE to decide it's a coroutine in the absence of any code.  */
+  gcc_checking_assert (body != NULL_TREE);
+
+  /* If we have an empty or erroneous function body, do not try to transform it
+ since that would potentially wrap errors.  */
+  tree body_start = expr_first (body);
+  if (body_start == NULL_TREE || body_start == error_mark_node)
+{
+  /* Restore the original state.  */
+  add_stmt (body);
+  return NULL_TREE;
+}
+  return body;
+}
+
  /* Here we:
 a) Check that the function and promise type are valid for a
coroutine.
@@ -4483,57 +4520,22 @@ morph_fn_to_coro (tree orig, tree *resumer, tree 
*destroyer)
/* Discard the body, we can't process it further.  */
pop_stmt_list (DECL_SAVED_TREE (orig));
DECL_SAVED_TREE (orig) = push_stmt_list ();
+  /* Match the expected nesting when an eh block is in use.  */
+  if (use_eh_spec_block (orig))
+   current_eh_spec_block = begin_eh_spec_block ();
return false;
  }
  
-  /* We can't validly get here with an empty statement list, since there's no
- way for the FE to decide it's a coroutine in the absence of any code.  */
-  tree fnbody = pop_stmt_list (DECL_SAVED_TREE (orig));
-  gcc_checking_assert (fnbody != NULL_TREE);
-
/* We don't have the locus of the opening brace - it's filled in later (and
   there doesn't really seem to be any easy way to get at it).
   The closing brace is assumed to be input_location.  */
location_t fn_start = DECL_SOURCE_LOCATION (orig);
-  gcc_rich_location fn_start_loc (fn_start);
-
-  /* Initial processing of the function-body.
- If we have no expressions or just an error then punt.  */
-  tree body_start = expr_first (fnbody);
-  if (body_start == NULL_TREE || body_start == error_mark_node)
-{
-  DECL_SAVED_TREE (orig) = push_stmt_list ();
-  append_to_statement_list (fnbody, &DECL_SAVED_TREE (orig));
-  /* Suppress warnings about the missing return value.  */
-  suppress_warning (orig, OPT_Wreturn_type);
-  return false;
-}
-
-  /* So, we've tied off the original user-authored body in fn_body.
-
- Start the replacement synthesized ramp body as newbody.
- If we encounter a fatal error we might return a now-empty body.
-
- Note, the returned ramp body is not 'popped', to be com

Re: [PATCH] c++: permit errors inside uninstantiated templates [PR116064]

2024-08-05 Thread Jason Merrill

On 8/2/24 4:18 PM, Patrick Palka wrote:

On Fri, 2 Aug 2024, Patrick Palka wrote:


On Fri, 2 Aug 2024, Jason Merrill wrote:


On 8/1/24 2:52 PM, Patrick Palka wrote:

In recent versions of GCC we've been diagnosing more and more kinds of
errors inside a template ahead of time.  This is a largely good thing
because it catches bugs, typos, dead code etc sooner.

But if the template never gets instantiated then such errors are
harmless, and can be inconvenient to work around if say the code in
question is third party and in maintenence mode.  So it'd be useful to


"maintenance"


Fixed




diff --git a/gcc/cp/error.cc b/gcc/cp/error.cc
index d80bac822ba..0bb0a482e28 100644
--- a/gcc/cp/error.cc
+++ b/gcc/cp/error.cc
@@ -165,6 +165,58 @@ class cxx_format_postprocessor : public
format_postprocessor
 deferred_printed_type m_type_b;
   };
   +/* A map from TEMPLATE_DECL to the location of the first error (if any)
+   within the template that we permissivly downgraded to a warning.  */


"permissively"


Fixed




+relaxed_template_errors_t *relaxed_template_errors;
+
+/* Callback function diagnostic_context::m_adjust_diagnostic_info.
+
+   In -fpermissive mode we downgrade errors within a template to
+   warnings, and only issue an error if we later need to instantiate
+   the template.  */
+
+static void
+cp_adjust_diagnostic_info (diagnostic_context *context,
+  diagnostic_info *diagnostic)
+{
+  tree ti;
+  if (diagnostic->kind == DK_ERROR
+  && context->m_permissive
+  && !current_instantiation ()
+  && in_template_context
+  && (ti = get_template_info (current_scope ())))
+{
+  if (!relaxed_template_errors)
+   relaxed_template_errors = new relaxed_template_errors_t;
+
+  tree tmpl = TI_TEMPLATE (ti);
+  if (!relaxed_template_errors->get (tmpl))
+   relaxed_template_errors->put (tmpl, diagnostic->richloc->get_loc ());
+  diagnostic->kind = DK_WARNING;


Rather than check m_permissive directly and downgrade to DK_WARNING, how about
downgrading to DK_PERMERROR?  That way people will get the [-fpermissive]
clue.

...though I suppose DK_PERMERROR doesn't work where you call this hook in
report_diagnostic, at which point we've already reassigned it into DK_WARNING
or DK_ERROR in diagnostic_impl.

But we could still set diagnostic->option_index even for DK_ERROR, whether to
context->m_opt_permissive or to its own warning flag, perhaps
-Wno-template-body?


Fixed by adding an enabled-by-default -Wtemplate-body flag and setting
option_index to it for each downgraded error.  Thus -fpermissive
-Wno-template-body would suppress the downgraded warnings entirely, and
only issue a generic error upon instantiation of the erroneous template.


... or did you have in mind to set option_index even when not using
-fpermissive so that eligible non-downgraded errors get the
[-fpermissive] or [-Wtemplate-body] hint as well?


Yes.


IMHO I'm not sure that'd be worth the extra noise since the vast
majority of users appreciate and expect errors to get diagnosed inside
templates.


But people trying to build legacy code should appreciate the pointer for 
how to make it compile, as with other permerrors.



And on second thought I'm not sure what extra value a new warning flag
adds either.  I can't think of a good reason why one would use
-fpermissive -Wno-template-body?


One would use -Wno-template-body (or -Wno-error=template-body) without 
-fpermissive, like with the various permerror_opt cases.


Jason



Re: [PATCH v2] c++: fix -Wdangling-reference false positive [PR115987]

2024-08-05 Thread Jason Merrill

On 8/2/24 3:22 PM, Marek Polacek wrote:

On Thu, Aug 01, 2024 at 05:20:43PM -0400, Jason Merrill wrote:

On 8/1/24 4:19 PM, Marek Polacek wrote:

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
This fixes another false positive.  When a function is taking a
temporary of scalar type that couldn't be bound to the return type
of the function, don't warn; such a program would be ill-formed.

Thanks to Jonathan for reporting the problem.

PR c++/115987

gcc/cp/ChangeLog:

* call.cc (do_warn_dangling_reference): Don't consider a
temporary with a scalar type that cannot bind to the return type.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wdangling-reference22.C: New test.
---
   gcc/cp/call.cc| 15 +--
   .../g++.dg/warn/Wdangling-reference22.C   | 19 +++
   2 files changed, 32 insertions(+), 2 deletions(-)
   create mode 100644 gcc/testsuite/g++.dg/warn/Wdangling-reference22.C

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index 40cb582acc7..375256ebcc4 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -14290,8 +14290,19 @@ do_warn_dangling_reference (tree expr, bool arg_p)
/* Recurse to see if the argument is a temporary.  It could also
   be another call taking a temporary and returning it and
   initializing this reference parameter.  */
-   if (do_warn_dangling_reference (arg, /*arg_p=*/true))
- return expr;
+   if ((arg = do_warn_dangling_reference (arg, /*arg_p=*/true)))
+ {
+   /* If we know the temporary could not bind to the return type,
+  don't warn.  This is for scalars only because for classes
+  we can't be sure we are not returning its sub-object.  */
+   if (SCALAR_TYPE_P (TREE_TYPE (arg))
+   && TYPE_REF_P (rettype)
+   && SCALAR_TYPE_P (TREE_TYPE (rettype))


I don't think we need to check for scalar return type, only argument type.


Oh that was a late change to keep attr-no-dangling6.C working, i.e., to
keep warning for something like

   struct X { X(int); };
   const X& get (const int& i)
   {
  return i;
   }

   void test ()
   {
 [[maybe_unused]] const X& x2 = get (10);
   }

But we already emit a -Wreturn-local-addr warning there.  So, I've dropped
the SCALAR_TYPE_P check and adjusted attr-no-dangling6.C instead:

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?


OK.


-- >8 --
This fixes another false positive.  When a function is taking a
temporary of scalar type that couldn't be bound to the return type
of the function, don't warn; such a program would be ill-formed.

Thanks to Jonathan for reporting the problem.

PR c++/115987

gcc/cp/ChangeLog:

* call.cc (do_warn_dangling_reference): Don't consider a
temporary with a scalar type that cannot bind to the return type.

gcc/testsuite/ChangeLog:

* g++.dg/ext/attr-no-dangling6.C: Adjust.
* g++.dg/ext/attr-no-dangling7.C: Likewise.
* g++.dg/warn/Wdangling-reference22.C: New test.
---
  gcc/cp/call.cc| 14 ++--
  gcc/testsuite/g++.dg/ext/attr-no-dangling6.C  | 22 +--
  gcc/testsuite/g++.dg/ext/attr-no-dangling7.C  |  8 +++
  .../g++.dg/warn/Wdangling-reference22.C   | 19 
  4 files changed, 46 insertions(+), 17 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/warn/Wdangling-reference22.C

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index 40cb582acc7..a75e2e5e3af 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -14290,8 +14290,18 @@ do_warn_dangling_reference (tree expr, bool arg_p)
/* Recurse to see if the argument is a temporary.  It could also
   be another call taking a temporary and returning it and
   initializing this reference parameter.  */
-   if (do_warn_dangling_reference (arg, /*arg_p=*/true))
- return expr;
+   if ((arg = do_warn_dangling_reference (arg, /*arg_p=*/true)))
+ {
+   /* If we know the temporary could not bind to the return type,
+  don't warn.  This is for scalars only because for classes
+  we can't be sure we are not returning its sub-object.  */
+   if (SCALAR_TYPE_P (TREE_TYPE (arg))
+   && TYPE_REF_P (rettype)
+   && !reference_related_p (TREE_TYPE (arg),
+TREE_TYPE (rettype)))
+ continue;
+   return expr;
+ }
  /* Don't warn about member functions like:
  std::any a(...);
  S& s = a.emplace({0}, 0);
diff --git a/gcc/testsuite/g++.dg/ext/attr-no-dangling6.C 
b/gcc/testsuite/g++.dg/ext/attr-no-dangling6.C
index 235a5fd86c5..5b349e8e682 100644
--- a/gcc/testsuite/g++.dg/ext/attr-no-dangling6.C
+++ b/gcc/

Re: [PATCH] testsuite: Add RISC-V to targets not xfailing gcc.dg/attr-alloc_size-11.c:50, 51.

2024-08-05 Thread Jiawei



On 2024/8/5 23:21, Jeff Law wrote:



On 8/5/24 6:26 AM, Jiawei wrote:
The test has been observed to pass on most architectures including 
RISC-V:

https://godbolt.org/z/8nYEvW6n1

For the original issue, see:
https://gcc.gnu.org/PR79356#c11

Add the RISC-V target to the pass list.

gcc/testsuite/ChangeLog:

* gcc.dg/attr-alloc_size-11.c: Add RISC-V to the list
of targets excluding xfail on lines 50 and 51.

Almost certainly behaving like the other targets in the list due to 
how promotions work.


OK for the trunk.  Thanks!

jeff


Okay, thanks for your review, committed.

BR,

jiawei



[x86_64 PATCH] Support memory destinations and wide immediate constants in STV.

2024-08-05 Thread Roger Sayle

Hi Uros,
Very many thanks for the quick review and approval.  Here's another.

This patch implements two improvements/refinements to the i386 backend's
Scalar-To-Vector (STV) pass.  The first is to support memory destinations
in binary logic operations, and the second is to provide more accurate
costs/gains for (wide) immediate constants in binary logic operations.

A motivating example is gcc.target/i386/movti-2.c:

__int128 m;
void foo()
{
m &= ((__int128)0x0123456789abcdefULL<<64) | 0x0123456789abcdefULL;
}

for which STV1 currently generates a warning/error:
> r100 has non convertible use in insn 6

(insn 5 2 6 2 (set (reg:TI 100)
        (const_wide_int 0x123456789abcdef0123456789abcdef)) "movti-2.c":7:7 87 {*movti_internal}
     (nil))
(insn 6 5 0 2 (parallel [
            (set (mem/c:TI (symbol_ref:DI ("m") [flags 0x2]  ) [1 m+0 S16 A128])
                (and:TI (mem/c:TI (symbol_ref:DI ("m") [flags 0x2]  ) [1 m+0 S16 A128])
                    (reg:TI 100)))
            (clobber (reg:CC 17 flags))
        ]) "movti-2.c":7:7 645 {*andti3_doubleword}
     (expr_list:REG_DEAD (reg:TI 100)
        (expr_list:REG_UNUSED (reg:CC 17 flags)
            (nil))))

and therefore generates the following scalar code with -O2 -mavx

foo:    movabsq $81985529216486895, %rax
        andq    %rax, m(%rip)
        andq    %rax, m+8(%rip)
        ret

With this patch we now support read-modify-write instructions (as STV
candidates), splitting them into explicit read-modify instructions
followed by an explicit write instruction.  Hence, we now produce
(when not optimizing for size):

foo:    movabsq $81985529216486895, %rax
        vmovq   %rax, %xmm0
        vpunpcklqdq     %xmm0, %xmm0, %xmm0
        vpand   m(%rip), %xmm0, %xmm0
        vmovdqa %xmm0, m(%rip)
        ret

This code also handles the const_wide_int in the example above, correcting
the costs/gains when the hi/lo words are the same.  One minor complication
is that the middle-end assumes (when generating memset) that SSE constants
will be shared/amortized across multiple consecutive writes.  Hence, to
avoid testsuite regressions, we add a heuristic that considers an immediate
constant to be very cheap if that same immediate value occurs in the
previous instruction or in the instruction after.
This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32}
with no new failures.  Ok for mainline?


2024-08-05  Roger Sayle  

gcc/ChangeLog
* config/i386/i386-features.cc (timode_immed_const_gain): New
function to determine the gain/cost on a CONST_WIDE_INT.
(local_duplicate_constant_p): Helper function to see if the
same immediate constant appears in the previous or next insn.
(timode_scalar_chain::compute_convert_gain): Fix whitespace.
: Provide more accurate estimates using
timode_immed_const_gain and local_duplicate_constant_p.
: Handle MEM_P (dst) and CONSTANT_SCALAR_INT_P (src).
(timode_scalar_to_vector_candidate_p): Support the first operand
of AND, IOR and XOR being MEM_P (i.e. a read-modify-write insn).

gcc/testsuite/ChangeLog
* gcc.target/i386/movti-2.c: Change dg-options to -Os.
* gcc.target/i386/movti-4.c: Expected output of original movti-2.c.


Thanks again,
Roger
--

diff --git a/gcc/config/i386/i386-features.cc b/gcc/config/i386/i386-features.cc
index 3da56dd..8ad0ae7 100644
--- a/gcc/config/i386/i386-features.cc
+++ b/gcc/config/i386/i386-features.cc
@@ -1503,6 +1503,54 @@ general_scalar_chain::convert_insn (rtx_insn *insn)
   df_insn_rescan (insn);
 }
 
+/* Helper function to compute gain for loading an immediate constant.
+   Typically, two movabsq for TImode vs. vmovdqa for V1TImode, but
+   with numerous special cases.  */
+
+static int
+timode_immed_const_gain (rtx cst)
+{
+  /* movabsq vs. movabsq+vmovq+vpunpcklqdq.  */
+  if (CONST_WIDE_INT_P (cst)
+  && CONST_WIDE_INT_NUNITS (cst) == 2
+  && CONST_WIDE_INT_ELT (cst, 0) == CONST_WIDE_INT_ELT (cst, 1))
+return optimize_insn_for_size_p () ? -COSTS_N_BYTES (9)
+  : -COSTS_N_INSNS (2);
+  /* 2x movabsq ~ vmovdqa.  */
+  return 0;
+}
+
+/* Return true if the constant CST in mode MODE is found as an
+   immediate operand in the insn after INSN, or the insn before it.  */
+
+static bool
+local_duplicate_constant_p (rtx_insn *insn, machine_mode mode, rtx cst)
+{
+  rtx set;
+
+  rtx_insn *next = NEXT_INSN (insn);
+  if (next)
+{
+  set = single_set (next);
+  if (set
+ && GET_MODE (SET_DEST (set)) == mode
+ && rtx_equal_p (SET_SRC (set), cst))
+   return true;
+}
+
+  rtx_insn *prev = PREV_INSN (insn);
+  if (prev)
+{
+  set = single_set (prev);
+  if (set
+ && GET_MODE (SET_DEST (set)) == mode
+ && rtx_equal_p (SET_SRC (set), cst))
+   return true;
+}
+  return false;
+}
+
+
 /* Compute a gain for chain 

Re: [PATCH] c++: remove function/var concepts code

2024-08-05 Thread Jason Merrill

On 8/2/24 2:12 PM, Marek Polacek wrote:

Bootstrapped/regtested on x86_64-pc-linux-gnu.  Comments?

-- >8 --
This patch removes vestigial Concepts TS code as discussed in
.

In particular, it removes code related to function/variable concepts.
That includes variable_concept_p and function_concept_p, which then
cascades into removing DECL_DECLARED_CONCEPT_P etc.  So I think we
no longer need to say "standard concept" since there are no non-standard
ones anymore.

I've added two new errors saying that "variable/function concepts are
no longer supported".

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_constant_expression): Don't call
unpack_concept_check.  Add a concept_check_p assert.  Remove
function_concept_p code.
* constraint.cc (check_constraint_atom): Remove function concepts code.
(unpack_concept_check): Remove.
(get_concept_check_template): Remove Concepts TS code.
(resolve_function_concept_overload): Remove.
(resolve_function_concept_check): Remove.
(resolve_concept_check): Remove Concepts TS code.
(get_returned_expression): Remove.
(get_variable_initializer): Remove.
(get_concept_definition): Remove Concepts TS code.
(normalize_concept_check): Likewise.
(build_function_check): Remove.
(build_variable_check): Remove.
(build_standard_check): Use concept_definition_p instead of
standard_concept_p.
(build_concept_check): Remove variable_concept_p/function_concept_p
code.
(build_concept_id): Simplify.
(build_type_constraint): Likewise.
(placeholder_extract_concept_and_args): Likewise.
(satisfy_nondeclaration_constraints): Likewise.
(check_function_concept): Remove.
(get_constraint_error_location): Remove Concepts TS code.
* cp-tree.h (DECL_DECLARED_CONCEPT_P): Remove.
(check_function_concept): Remove.
(unpack_concept_check): Remove.
(standard_concept_p): Remove.
(variable_concept_p): Remove.
(function_concept_p): Remove.
(concept_definition_p): Simplify.
(concept_check_p): Don't check for CALL_EXPR.
* decl.cc (check_concept_refinement): Remove.
(duplicate_decls): Remove check_concept_refinement code.
(is_concept_var): Remove.
(cp_finish_decl): Remove is_concept_var.
(check_concept_fn): Remove.
(grokfndecl): Give an error about function concepts not being supported
anymore.  Remove unused code.
(grokvardecl): Give an error about variable concepts not being
supported anymore.
(finish_function): Remove DECL_DECLARED_CONCEPT_P code.
* decl2.cc (min_vis_expr_r): Use concept_definition_p instead of
standard_concept_p.
(maybe_instantiate_decl): Remove DECL_DECLARED_CONCEPT_P check.
(mark_used): Likewise.
* error.cc (dump_simple_decl): Use concept_definition_p instead of
standard_concept_p.
(dump_function_decl): Remove DECL_DECLARED_CONCEPT_P code.
(print_concept_check_info): Don't call unpack_concept_check.
* mangle.cc (write_type_constraint): Likewise.
* parser.cc (cp_parser_nested_name_specifier_opt): Remove
function_concept_p code.  Only check concept_definition_p, not
variable_concept_p/standard_concept_p.
(add_debug_begin_stmt): Remove DECL_DECLARED_CONCEPT_P code.
(cp_parser_template_declaration_after_parameters): Remove a stale
comment.
* pt.cc (check_explicit_specialization): Remove
DECL_DECLARED_CONCEPT_P code.
(process_partial_specialization): Remove variable_concept_p code.
(lookup_template_variable): Likewise.
(tsubst_expr) : Remove Concepts TS code and simplify.
(do_decl_instantiation): Remove DECL_DECLARED_CONCEPT_P code.
(instantiate_decl): Likewise.
(placeholder_type_constraint_dependent_p): Don't call
unpack_concept_check.  Add a concept_check_p assert.
(convert_generic_types_to_packs): Likewise.
* semantics.cc (finish_call_expr): Remove Concepts TS code and simplify.

gcc/testsuite/ChangeLog:

* g++.dg/concepts/decl-diagnose.C: Adjust dg-error.
* g++.dg/concepts/fn-concept2.C: Likewise.
* g++.dg/concepts/pr71128.C: Likewise.
* g++.dg/concepts/var-concept6.C: Likewise.
* g++.dg/cpp2a/concepts.C: Likewise.
---
  gcc/cp/constexpr.cc   |  13 +-
  gcc/cp/constraint.cc  | 346 +-
  gcc/cp/cp-tree.h  |  71 +---
  gcc/cp/decl.cc| 118 +-
  gcc/cp/decl2.cc   |   4 +-
  gcc/cp/error.cc   |  10 +-
  gcc/cp/mangle.cc  |   4 +-
  gcc/cp/parser.cc  

Re: [RFC][PATCH] SVE intrinsics: Fold svdiv (svptrue, x, x) to ones

2024-08-05 Thread Richard Sandiford
Kyrylo Tkachov  writes:
>> On 5 Aug 2024, at 12:01, Richard Sandiford  wrote:
>> 
>> Jennifer Schmitz  writes:
>>> This patch folds the SVE intrinsic svdiv into a vector of 1's in case
>>> 1) the predicate is svptrue and
>>> 2) dividend and divisor are equal.
>>> This is implemented in the gimple_folder for signed and unsigned
>>> integers. Corresponding test cases were added to the existing test
>>> suites.
>>> 
>>> The patch was bootstrapped and regtested on aarch64-linux-gnu, no 
>>> regression.
>>> OK for mainline?
>>> 
>>> Please also advise whether it makes sense to implement the same optimization
>>> for float types and if so, under which conditions?
>> 
>> I think we should instead use const_binop to try to fold the division
>> whenever the predicate is all-true, or if the function uses _x predication.
>> (As a follow-on, we could handle _z and _m too, using VEC_COND_EXPR.)
>> 
>
> From what I can see const_binop only works on constant arguments.

Yeah, it only produces a result for constant arguments.  I see now
that that isn't the case that the patch is interested in, sorry.

> Is fold_binary a better interface to use ? I think it’d hook into the 
> match.pd machinery for divisions at some point.

We shouldn't use that from gimple folders AIUI, but perhaps I misremember.
(I realise we'd be using it only to test whether the result is constant,
but even so.)

Have you (plural) come across a case where svdiv is used with equal
non-constant arguments?  If it's just being done on first principles
then how about starting with const_binop instead?  If possible, it'd be
good to structure it so that we can reuse the code for svadd, svmul,
svsub, etc.
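
To make the "constant arguments only, shared across operations" idea concrete, here is a minimal scalar model in C. It is a sketch, not GCC code: the enum, the function names and the use of unsigned 32-bit lanes are all assumptions made for illustration, and it only mimics the property that a fold either produces a constant for every lane or produces nothing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical operation codes, standing in for binary tree codes.  */
enum binop { OP_ADD, OP_SUB, OP_MUL, OP_DIV };

/* Fold one lane of constant operands.  Returns false when no constant
   result is produced (here: division by zero).  Unsigned lanes are
   used so that add, sub and mul wrap with defined behavior.  */
static bool
fold_lane (enum binop op, uint32_t a, uint32_t b, uint32_t *out)
{
  switch (op)
    {
    case OP_ADD: *out = a + b; return true;
    case OP_SUB: *out = a - b; return true;
    case OP_MUL: *out = a * b; return true;
    case OP_DIV:
      if (b == 0)
        return false;
      *out = a / b;
      return true;
    }
  return false;
}

/* Fold N lanes under an all-true predicate.  Returns false as soon as
   any lane fails to fold; OUT may then be partially written and must
   be ignored by the caller.  */
bool
fold_all_true (enum binop op, const uint32_t *a, const uint32_t *b,
               uint32_t *out, size_t n)
{
  assert (a && b && out);
  for (size_t i = 0; i < n; i++)
    if (!fold_lane (op, a[i], b[i], &out[i]))
      return false;
  return true;
}
```

The same driver serves all four operations, which is the kind of reuse suggested for svadd, svmul and svsub.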

Thanks,
Richard


> Thanks,
> Kyrill
>
>> We shouldn't need to vet the arguments, since const_binop does that itself.
>> Using const_binop should also get the conditions right for floating-point
>> divisions.
>> 
>> Thanks,
>> Richard
>> 
>> 
>>> 
>>> Signed-off-by: Jennifer Schmitz 
>>> 
>>> gcc/
>>> 
>>>  * config/aarch64/aarch64-sve-builtins-base.cc (svdiv_impl::fold):
>>>  Add optimization.
>>> 
>>> gcc/testsuite/
>>> 
>>>  * gcc.target/aarch64/sve/acle/asm/div_s32.c: New test.
>>>  * gcc.target/aarch64/sve/acle/asm/div_s64.c: Likewise.
>>>  * gcc.target/aarch64/sve/acle/asm/div_u32.c: Likewise.
>>>  * gcc.target/aarch64/sve/acle/asm/div_u64.c: Likewise.
>>> 
>>> From 43913cfa47b31d055a0456c863a30e3e44acc2f0 Mon Sep 17 00:00:00 2001
>>> From: Jennifer Schmitz 
>>> Date: Fri, 2 Aug 2024 06:41:09 -0700
>>> Subject: [PATCH] SVE intrinsics: Fold svdiv (svptrue, x, x) to ones
>>> 
>>> This patch folds the SVE intrinsic svdiv into a vector of 1's in case
>>> 1) the predicate is svptrue and
>>> 2) dividend and divisor are equal.
>>> This is implemented in the gimple_folder for signed and unsigned
>>> integers. Corresponding test cases were added to the existing test
>>> suites.
>>> 
>>> The patch was bootstrapped and regtested on aarch64-linux-gnu, no 
>>> regression.
>>> OK for mainline?
>>> 
>>> Signed-off-by: Jennifer Schmitz 
>>> 
>>> gcc/
>>> 
>>>  * config/aarch64/aarch64-sve-builtins-base.cc (svdiv_impl::fold):
>>>  Add optimization.
>>> 
>>> gcc/testsuite/
>>> 
>>>  * gcc.target/aarch64/sve/acle/asm/div_s32.c: New test.
>>>  * gcc.target/aarch64/sve/acle/asm/div_s64.c: Likewise.
>>>  * gcc.target/aarch64/sve/acle/asm/div_u32.c: Likewise.
>>>  * gcc.target/aarch64/sve/acle/asm/div_u64.c: Likewise.
>>> ---
>>> .../aarch64/aarch64-sve-builtins-base.cc  | 19 ++---
>>> .../gcc.target/aarch64/sve/acle/asm/div_s32.c | 27 +++
>>> .../gcc.target/aarch64/sve/acle/asm/div_s64.c | 27 +++
>>> .../gcc.target/aarch64/sve/acle/asm/div_u32.c | 27 +++
>>> .../gcc.target/aarch64/sve/acle/asm/div_u64.c | 27 +++
>>> 5 files changed, 124 insertions(+), 3 deletions(-)
>>> 
>>> diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc 
>>> b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
>>> index d55bee0b72f..e347d29c725 100644
>>> --- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
>>> +++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
>>> @@ -755,8 +755,21 @@ public:
>>>   gimple *
>>>   fold (gimple_folder &f) const override
>>>   {
>>> -tree divisor = gimple_call_arg (f.call, 2);
>>> -tree divisor_cst = uniform_integer_cst_p (divisor);
>>> +tree pg = gimple_call_arg (f.call, 0);
>>> +tree op1 = gimple_call_arg (f.call, 1);
>>> +tree op2 = gimple_call_arg (f.call, 2);
>>> +
>>> +if (f.type_suffix (0).integer_p
>>> + && is_ptrue (pg, f.type_suffix (0).element_bytes)
>>> + && operand_equal_p (op1, op2, 0))
>>> +  {
>>> + tree lhs_type = TREE_TYPE (f.lhs);
>>> + tree_vector_builder builder (lhs_type, 1, 1);
>>> + builder.quick_push (build_each_one_cst (TREE_TYPE (lhs_type)));
>>> + return gimple_build_assign (f.lhs, builder.bui

Re: [RFC v3 3/3] c: Add __lengthof__() operator

2024-08-05 Thread Martin Uecker
Am Montag, dem 05.08.2024 um 17:27 +0200 schrieb Alejandro Colomar:
> Hi Martin,
> 
...

> > But I think you might make it unnecessarily complicated.  It
> > should be sufficient to look at the outermost size.  You
> > can completely ignore whatever happens.  There
> > should be three cases if I am not mistaken:
> > 
> > - incomplete (includes ISO FAM) -> error
> > - constant (includes GNU FAM) -> return fixed size
> > - variable (includes unspecified) -> evaluate the
> > argument and return the size, while making sure it is 
> > visibly non-constant.
> > 
> > To check that the array has a variable length, you can use
> > the same logic as in comptypes_internal (cf. d1_variable).
> 
> Hmmm, comptypes_internal() has taught me what I was asking here.
> However, it seems to not be enough for what I actually need.
> 
> Here's my problem:
> 
> The array is correctly considered a fixed-length array.  I know it
> because the following debugging code:
> 
>   +fprintf(stderr, "ALX: %s() %d\n", __func__, __LINE__);
>   +tree dom = TYPE_DOMAIN (type);
>   +int zero = !TYPE_MAX_VALUE (dom);
>   +fprintf(stderr, "ALX: zero: %d\n", zero);
>   +int var0 = !zero
>   +&& (TREE_CODE (TYPE_MIN_VALUE (dom)) != INTEGER_CST
>   +   || TREE_CODE (TYPE_MAX_VALUE (dom)) != INTEGER_CST);
>   +fprintf(stderr, "ALX: var: %d\n", var0);
>   +int var = var0 || (zero && TYPE_LANG_FLAG_1(type));
>   +fprintf(stderr, "ALX: var: %d\n", var);
>   +  ret = array_type_nelts_top (type);
>   +fprintf(stderr, "ALX: %s() %d\n", __func__, __LINE__);
> 
> prints:
> 
>   ALX: c_lengthof_type() 4098
>   ALX: zero: 0
>   ALX: var: 0
>   ALX: var: 0
>   ALX: c_lengthof_type() 4109
> 
> for
>   void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
> 
> That differs from
> 
>   ALX: c_lengthof_type() 4098
>   ALX: zero: 1
>   ALX: var: 0
>   ALX: var: 1
>   ALX: c_lengthof_type() 4109
> 
> for
>   void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]);


That looks good.

> 
> However, if I turn on -Wvla, both get a warning:
> 
>   len.c: At top level:
>   len.c:288:1: warning: ISO C90 forbids variable length array ‘x’ [-Wvla]
> 288 | void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
> | ^~~~
>   len.c:289:1: warning: ISO C90 forbids variable length array ‘x’ [-Wvla]
> 289 | void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]);
> | ^~~~
> 

You should check that the result you get from __lengthof__
is an integer constant expression in the first case.
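
The difference between the constant and the variable case can be sketched with plain C. The NELEMS macro below is a hypothetical sizeof-based helper, not the proposed __lengthof__ operator; unlike __lengthof__ it cannot yield a constant for something like char[3][n], so the sketch only uses a fully fixed array and a VLA whose outermost bound is variable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not the proposed operator.  */
#define NELEMS(a) (sizeof (a) / sizeof ((a)[0]))

static int fixed[3][5];

/* Constant case: the result is an integer constant expression, so it
   is accepted where one is required.  */
_Static_assert (NELEMS (fixed) == 3, "outermost length is constant");

size_t
fixed_len (void)
{
  return NELEMS (fixed);
}

/* Variable case: the outermost bound is only known at run time, so
   the operand is evaluated.  N must be greater than zero.  */
size_t
vla_len (size_t n)
{
  assert (n > 0);
  int v[n][3];
  return NELEMS (v);
}
```

Using the result in a _Static_assert is one way to check that an expression is treated as an integer constant expression.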

> I suspect that the problem is in:
> 
>   $ grepc -tfd array_type_nelts_minus_one gcc
>   gcc/tree.cc:tree
>   array_type_nelts_minus_one (const_tree type)
>   {
> tree index_type, min, max;
> 
> /* If they did it with unspecified bounds, then we should have already
>given an error about it before we got here.  */
> if (! TYPE_DOMAIN (type))
>   return error_mark_node;
> 
> index_type = TYPE_DOMAIN (type);
> min = TYPE_MIN_VALUE (index_type);
> max = TYPE_MAX_VALUE (index_type);
> 
> /* TYPE_MAX_VALUE may not be set if the array has unknown length.  */
> if (!max)
>   {
> /* zero sized arrays are represented from C FE as complete types with
>NULL TYPE_MAX_VALUE and zero TYPE_SIZE, while C++ FE represents
>them as min 0, max -1.  */
> if (COMPLETE_TYPE_P (type)
> && integer_zerop (TYPE_SIZE (type))
> && integer_zerop (min))
>   return build_int_cst (TREE_TYPE (min), -1);
> 
> return error_mark_node;
>   }
> 
> return (integer_zerop (min)
> ? max
> : fold_build2 (MINUS_EXPR, TREE_TYPE (max), max, min));
>   }
> 
> With some debugging code, I've seen that in the fixed-length case, this
> reaches the last return (integer_zerop() is true, so it returns max),
> which is exactly the same as with any normal fixed-length array.
> 
> In the variable-length case (i.e., [*][3]), it returns build_int_cst().
> 
> So, it seems my problem is that 'max' does not represent an integer
> constant, even though we know it is.  Can we coerce it to an integer
> constant somehow?  Or maybe it's some of the in_lengthof that's messing
> with me?
> 

I would suspect the logic related to the C_MAYBE_CONST_EXPR.
In your original patch this still used C_TYPE_VARIABLE_SIZE,
which is not what we want for lengthof.

> > 
> > It is possible that you can not properly distinguish between
> > 
> > int a[0][n];
> > int a[*][n];
> > 
> > those two cases. The logic will treat the first as the second.
> 
> Those can be distinguished.  [0] triggers the zero test, while [*]
> triggers the second var test.

Are you sure? Both types should have C_TYPE_VARIABLE_SIZE set to 1.

> 
> > I think this is ok for now.  All

[COMMITTED, BPF] bpf: do not emit BPF non-fetching atomic instructions

2024-08-05 Thread Jose E. Marchesi
When GCC finds a call to one of the __atomic_OP_fetch built-ins in
which the return value is not used it optimizes it into the
corresponding non-fetching atomic operation.  Up to now we had
definitions in gcc/config/bpf/atomic.md to implement both atomic_OP
and atomic_fetch_OP sets of insns:

  atomic_add -> aadd (aka xadd)
  atomic_and -> aand
  atomic_or  -> aor
  atomic_xor -> axor

  atomic_fetch_add -> afadd
  atomic_fetch_and -> afand
  atomic_fetch_or  -> afor
  atomic_fetch_xor -> afxor

This was not correct, because as it happens the non-fetching BPF
atomic instructions imply different memory ordering semantics than the
fetching BPF atomic instructions, and they cannot be used
interchangeably, as it would be expected.

This patch modifies config/bpf/atomic.md in order to not define the
atomic_{add,and,or,xor} insns.  This makes GCC implement them in
terms of the corresponding fetching operations; this is less
efficient, but correct.  It also updates the expected results in the
corresponding tests, which are also updated to cover cases where the
value resulting from the __atomic_fetch_* operations is actually used.
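
The two situations described above look roughly like this at the source level. This is a minimal sketch: the function and variable names are hypothetical (loosely modelled on the test_used_atomic_* names in the ChangeLog below), and which instruction is selected depends on the target and on the patterns its atomic.md provides.

```c
#include <stdint.h>

static int64_t shared;

/* Result unused: a candidate for a non-fetching atomic_add pattern
   when the target defines one.  With this patch BPF no longer does,
   so the fetching form is used instead.  */
void
add_result_unused (int64_t v)
{
  __atomic_add_fetch (&shared, v, __ATOMIC_SEQ_CST);
}

/* Result used: the fetching form is required; returns the value held
   before the addition.  */
int64_t
add_result_used (int64_t v)
{
  return __atomic_fetch_add (&shared, v, __ATOMIC_SEQ_CST);
}

/* Atomic read of the current value.  */
int64_t
read_shared (void)
{
  return __atomic_load_n (&shared, __ATOMIC_SEQ_CST);
}
```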

Tested in bpf-unknown-none target in x86_64-linux-gnu host.

gcc/ChangeLog

* config/bpf/atomic.md ("atomic_add"): Remove insn.
("atomic_and"): Likewise.
("atomic_or"): Likewise.
("atomic_xor"): Likewise.

gcc/testsuite/ChangeLog

* gcc.target/bpf/atomic-op-1.c (test_used_atomic_add): New
function.
(test_used_atomic_sub): Likewise.
(test_used_atomic_and): Likewise.
(test_used_atomic_nand): Likewise.
(test_used_atomic_or): Likewise.
(test_used_atomic_xor): Likewise.
* gcc.target/bpf/atomic-op-2.c (test_used_atomic_add): Likewise.
(test_used_atomic_sub): Likewise.
(test_used_atomic_and): Likewise.
(test_used_atomic_nand): Likewise.
(test_used_atomic_or): Likewise.
(test_used_atomic_xor): Likewise.
* gcc.target/bpf/sync-fetch-and-add.c: Expected results updated.
---
 gcc/config/bpf/atomic.md  | 59 +--
 gcc/testsuite/gcc.target/bpf/atomic-op-1.c| 53 +++--
 gcc/testsuite/gcc.target/bpf/atomic-op-2.c| 53 +++--
 .../gcc.target/bpf/sync-fetch-and-add.c   |  4 +-
 4 files changed, 111 insertions(+), 58 deletions(-)

diff --git a/gcc/config/bpf/atomic.md b/gcc/config/bpf/atomic.md
index be4511bb51b..4e94c0352fe 100644
--- a/gcc/config/bpf/atomic.md
+++ b/gcc/config/bpf/atomic.md
@@ -22,50 +22,21 @@ (define_mode_iterator AMO [SI DI])
 
 ;;; Plain atomic modify operations.
 
-;; Non-fetching atomic add predates all other BPF atomic insns.
-;; Use xadd{w,dw} for compatibility with older GAS without support
-;; for v3 atomics.  Newer GAS supports "aadd[32]" in line with the
-;; other atomic operations.
-(define_insn "atomic_add"
-  [(set (match_operand:AMO 0 "memory_operand" "+m")
-(unspec_volatile:AMO
- [(plus:AMO (match_dup 0)
-(match_operand:AMO 1 "register_operand" "r"))
-  (match_operand:SI 2 "const_int_operand")] ;; Memory model.
- UNSPEC_AADD))]
-  ""
-  "{xadd\t%0,%1|lock *( *)%w0 += %w1}"
-  [(set_attr "type" "atomic")])
-
-(define_insn "atomic_and"
-  [(set (match_operand:AMO 0 "memory_operand" "+m")
-(unspec_volatile:AMO
- [(and:AMO (match_dup 0)
-   (match_operand:AMO 1 "register_operand" "r"))
-  (match_operand:SI 2 "const_int_operand")] ;; Memory model.
- UNSPEC_AAND))]
-  "bpf_has_v3_atomics"
-  "{aand\t%0,%1|lock *( *)%w0 &= %w1}")
-
-(define_insn "atomic_or"
-  [(set (match_operand:AMO 0 "memory_operand" "+m")
-(unspec_volatile:AMO
- [(ior:AMO (match_dup 0)
-   (match_operand:AMO 1 "register_operand" "r"))
-  (match_operand:SI 2 "const_int_operand")] ;; Memory model.
- UNSPEC_AOR))]
-  "bpf_has_v3_atomics"
-  "{aor\t%0,%1|lock *( *)%w0 %|= %w1}")
-
-(define_insn "atomic_xor"
-  [(set (match_operand:AMO 0 "memory_operand" "+m")
-(unspec_volatile:AMO
- [(xor:AMO (match_dup 0)
-   (match_operand:AMO 1 "register_operand" "r"))
-  (match_operand:SI 2 "const_int_operand")] ;; Memory model.
- UNSPEC_AXOR))]
-  "bpf_has_v3_atomics"
-  "{axor\t%0,%1|lock *( *)%w0 ^= %w1}")
+;; The BPF instruction set provides non-fetching atomic instructions
+;; that could be used to implement the corresponding named insns:
+;;
+;;  atomic_add -> aadd (aka xadd)
+;;  atomic_and -> aand 
+;;  atomic_or  -> aor
+;;  atomic_xor -> axor
+;;
+;; However, we are not including insns for these here because the
+;; non-fetching BPF atomic instructions imply different memory ordering
+;; semantics than the fetching BPF atomic instructions used to
+;; implement the atomic_fetch_* insns below (afadd, afand, afor,
+;; afxor) and they cannot be used interchangeably, as it is expected
+;; by GCC when it uses a non-fetch

Re: [RFC][PATCH] SVE intrinsics: Fold svdiv (svptrue, x, x) to ones

2024-08-05 Thread Kyrylo Tkachov


> On 5 Aug 2024, at 18:00, Richard Sandiford  wrote:
> 
> Kyrylo Tkachov  writes:
>>> On 5 Aug 2024, at 12:01, Richard Sandiford  
>>> wrote:
>>> 
>>> Jennifer Schmitz  writes:
>>>> This patch folds the SVE intrinsic svdiv into a vector of 1's in case
>>>> 1) the predicate is svptrue and
>>>> 2) dividend and divisor are equal.
>>>> This is implemented in the gimple_folder for signed and unsigned
>>>> integers. Corresponding test cases were added to the existing test
>>>> suites.
>>>> 
>>>> The patch was bootstrapped and regtested on aarch64-linux-gnu, no 
>>>> regression.
>>>> OK for mainline?
>>>> 
>>>> Please also advise whether it makes sense to implement the same 
>>>> optimization
>>>> for float types and if so, under which conditions?
>>> 
>>> I think we should instead use const_binop to try to fold the division
>>> whenever the predicate is all-true, or if the function uses _x predication.
>>> (As a follow-on, we could handle _z and _m too, using VEC_COND_EXPR.)
>>> 
>> 
>> From what I can see const_binop only works on constant arguments.
> 
> Yeah, it only produces a result for constant arguments.  I see now
> that that isn't the case that the patch is interested in, sorry.
> 
>> Is fold_binary a better interface to use ? I think it’d hook into the 
>> match.pd machinery for divisions at some point.
> 
> We shouldn't use that from gimple folders AIUI, but perhaps I misremember.
> (I realise we'd be using it only to test whether the result is constant,
> but even so.)

I haven’t looked more deeply, is there some more specific helper for folding 
trees we can use?

> 
> Have you (plural) come across a case where svdiv is used with equal
> non-constant arguments?  If it's just being done on first principles
> then how about starting with const_binop instead?  If possible, it'd be
> good to structure it so that we can reuse the code for svadd, svmul,
> svsub, etc.

Currently it’s done from first principles. We had a concrete user who wanted 
optimization of svdivs by constant, and we saw the folding of svdiv (x, x) as a 
natural extension. I tend to agree that we could make a common framework for 
binary SVE intrinsics that map to binary GIMPLE codes and call (and for unary 
codes too).
Do you think it’s worth handling the svdiv (x, x) case explicitly here and 
generalizing the folding of SVE intrinsics in a separate piece of work?

Thanks,
Kyrill


> 
> Thanks,
> Richard
> 
> 
>> Thanks,
>> Kyrill
>> 
>>> We shouldn't need to vet the arguments, since const_binop does that itself.
>>> Using const_binop should also get the conditions right for floating-point
>>> divisions.
>>> 
>>> Thanks,
>>> Richard
>>> 
>>> 
 
>>>> Signed-off-by: Jennifer Schmitz 
>>>> 
>>>> gcc/
>>>> 
>>>> * config/aarch64/aarch64-sve-builtins-base.cc (svdiv_impl::fold):
>>>> Add optimization.
>>>> 
>>>> gcc/testsuite/
>>>> 
>>>> * gcc.target/aarch64/sve/acle/asm/div_s32.c: New test.
>>>> * gcc.target/aarch64/sve/acle/asm/div_s64.c: Likewise.
>>>> * gcc.target/aarch64/sve/acle/asm/div_u32.c: Likewise.
>>>> * gcc.target/aarch64/sve/acle/asm/div_u64.c: Likewise.
>>>> 
>>>> From 43913cfa47b31d055a0456c863a30e3e44acc2f0 Mon Sep 17 00:00:00 2001
>>>> From: Jennifer Schmitz 
>>>> Date: Fri, 2 Aug 2024 06:41:09 -0700
>>>> Subject: [PATCH] SVE intrinsics: Fold svdiv (svptrue, x, x) to ones
>>>> 
>>>> This patch folds the SVE intrinsic svdiv into a vector of 1's in case
>>>> 1) the predicate is svptrue and
>>>> 2) dividend and divisor are equal.
>>>> This is implemented in the gimple_folder for signed and unsigned
>>>> integers. Corresponding test cases were added to the existing test
>>>> suites.
>>>> 
>>>> The patch was bootstrapped and regtested on aarch64-linux-gnu, no 
>>>> regression.
>>>> OK for mainline?
>>>> 
>>>> Signed-off-by: Jennifer Schmitz 
>>>> 
>>>> gcc/
>>>> 
>>>> * config/aarch64/aarch64-sve-builtins-base.cc (svdiv_impl::fold):
>>>> Add optimization.
>>>> 
>>>> gcc/testsuite/
>>>> 
>>>> * gcc.target/aarch64/sve/acle/asm/div_s32.c: New test.
>>>> * gcc.target/aarch64/sve/acle/asm/div_s64.c: Likewise.
>>>> * gcc.target/aarch64/sve/acle/asm/div_u32.c: Likewise.
>>>> * gcc.target/aarch64/sve/acle/asm/div_u64.c: Likewise.
>>>> ---
>>>> .../aarch64/aarch64-sve-builtins-base.cc  | 19 ++---
>>>> .../gcc.target/aarch64/sve/acle/asm/div_s32.c | 27 +++
>>>> .../gcc.target/aarch64/sve/acle/asm/div_s64.c | 27 +++
>>>> .../gcc.target/aarch64/sve/acle/asm/div_u32.c | 27 +++
>>>> .../gcc.target/aarch64/sve/acle/asm/div_u64.c | 27 +++
>>>> 5 files changed, 124 insertions(+), 3 deletions(-)
>>>> 
>>>> diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc 
>>>> b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
>>>> index d55bee0b72f..e347d29c725 100644
>>>> --- a/gcc/config

Re: [PATCH] RISC-V: Clarify that Vector Crypto Extensions require Vector Extensions[PR116150]

2024-08-05 Thread Patrick O'Neill



On 8/5/24 01:23, Liao Shihua wrote:

 PR 116150: Zvk* and Zvb* extensions require the v or a zve* extension, but 
in GCC v is implied.

gcc/ChangeLog:

 * common/config/riscv/riscv-common.cc: Removed the zvk extension's 
implicit expansion of v extension.
 * config/riscv/arch-canonicalize: Ditto.
 * config/riscv/riscv.cc (riscv_override_options_internal): Throw error 
when zvb or zvk extension without v extension.

gcc/testsuite/ChangeLog:

 * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-47.c: 
add v or zve* to -march.
 * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-48.c: 
Ditto.
 * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-49.c: 
Ditto.
 * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-50.c: 
Ditto.
 * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-51.c: 
Ditto.
 * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-52.c: 
Ditto.
 * gcc.target/riscv/rvv/base/target_attribute_v_with_intrinsic-53.c: 
Ditto.
 * gcc.target/riscv/rvv/base/zvbc-intrinsic.c: Ditto.
 * gcc.target/riscv/rvv/base/zvbc_vx_constraint-1.c: Ditto.
 * gcc.target/riscv/rvv/base/zvbc_vx_constraint-2.c: Ditto.
 * gcc.target/riscv/rvv/base/zvknhb-intrinsic.c: Ditto.
 * gcc.target/riscv/zvbb.c: Ditto.
 * gcc.target/riscv/zvbc.c: Ditto.
 * gcc.target/riscv/zvkb.c: Ditto.
 * gcc.target/riscv/zvkg.c: Ditto.
 * gcc.target/riscv/zvkn-1.c: Ditto.
 * gcc.target/riscv/zvkn.c: Ditto.
 * gcc.target/riscv/zvknc-1.c: Ditto.
 * gcc.target/riscv/zvknc-2.c: Ditto.
 * gcc.target/riscv/zvknc.c: Ditto.
 * gcc.target/riscv/zvkned.c: Ditto.
 * gcc.target/riscv/zvkng-1.c: Ditto.
 * gcc.target/riscv/zvkng-2.c: Ditto.
 * gcc.target/riscv/zvkng.c: Ditto.
 * gcc.target/riscv/zvknha.c: Ditto.
 * gcc.target/riscv/zvknhb.c: Ditto.
 * gcc.target/riscv/zvks-1.c: Ditto.
 * gcc.target/riscv/zvks.c: Ditto.
 * gcc.target/riscv/zvksc-1.c: Ditto.
 * gcc.target/riscv/zvksc-2.c: Ditto.
 * gcc.target/riscv/zvksc.c: Ditto.
 * gcc.target/riscv/zvksed.c: Ditto.
 * gcc.target/riscv/zvksg-1.c: Ditto.
 * gcc.target/riscv/zvksg-2.c: Ditto.
 * gcc.target/riscv/zvksg.c: Ditto.
 * gcc.target/riscv/zvksh.c: Ditto.
 * gcc.target/riscv/pr116150-1.c: New test.
 * gcc.target/riscv/pr116150-2.c: New test.
 * gcc.target/riscv/pr116150-3.c: New test.
 * gcc.target/riscv/pr116150-4.c: New test.

---


Thanks for the patch! It's not clear to me if we want to match LLVM's 
behavior here.


Here's where GCC's current behavior is documented: 
https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/doc/invoke.texi;h=0fe99ca8ef6e8868f60369f6329fe29599d89159;hb=HEAD#l31150


Maybe Jeff or Kito can provide some guidance for what we want to do here.

Patrick



Re: [PATCH] Explicitly document that the "counted_by" attribute is only supported in C

2024-08-05 Thread Qing Zhao


> On Aug 5, 2024, at 09:53, Jakub Jelinek  wrote:
> 
> On Mon, Aug 05, 2024 at 01:33:01PM +, Qing Zhao wrote:
>> As discussed in 
>> PR116016:https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116016#c48
>> 
>> We should explicitly document this limitation and issue error messages for 
>> C++.
>> 
>> The "counted_by" attribute is currently supported only in C; mention this
>> explicitly in the documentation and also issue an error when the
>> "counted_by" attribute is seen in C++.
>> 
>> The patch has been bootstrapped and regression tested on both aarch64 and 
>> X86,
>> no issue.
>> 
>> Okay for committing?
>> 
>> thanks.
>> 
>> Qing
>> 
>> gcc/c-family/ChangeLog:
>> 
>> * c-attribs.cc (handle_counted_by_attribute): Issue error for C++.
>> 
>> gcc/ChangeLog:
>> 
>> * doc/extend.texi: Explicitly mentions counted_by is available
>> only for C.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>> * g++.dg/flex-array-counted-by.C: New test.
>> ---
>> gcc/c-family/c-attribs.cc|  9 -
>> gcc/doc/extend.texi  |  1 +
>> gcc/testsuite/g++.dg/flex-array-counted-by.C | 11 +++
>> 3 files changed, 20 insertions(+), 1 deletion(-)
>> create mode 100644 gcc/testsuite/g++.dg/flex-array-counted-by.C
>> 
>> diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
>> index 685f212683f..f936058800b 100644
>> --- a/gcc/c-family/c-attribs.cc
>> +++ b/gcc/c-family/c-attribs.cc
>> @@ -2859,8 +2859,15 @@ handle_counted_by_attribute (tree *node, tree name,
>>   tree argval = TREE_VALUE (args);
>>   tree old_counted_by = lookup_attribute ("counted_by", DECL_ATTRIBUTES (decl));
>> 
>> +  /* This attribute is not supported in C++.  */
>> +  if (c_dialect_cxx ())
>> +{
>> +  error_at (DECL_SOURCE_LOCATION (decl),
>> + "%qE attribute is not supported for C++", name);
> 
> This should be sorry_at instead IMHO (at least if there is a plan to support
> it later, hopefully in the 15 timeframe).
Okay. 
> 
>> +  *no_add_attrs = true;
>> +}
>>   /* This attribute only applies to field decls of a structure.  */
>> -  if (TREE_CODE (decl) != FIELD_DECL)
>> +  else if (TREE_CODE (decl) != FIELD_DECL)
>> {
>>   error_at (DECL_SOURCE_LOCATION (decl),
>> "%qE attribute is not allowed for a non-field"
>> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
>> index 48b27ff9f39..f31f3bdb53d 100644
>> --- a/gcc/doc/extend.texi
>> +++ b/gcc/doc/extend.texi
>> @@ -7848,6 +7848,7 @@ The @code{counted_by} attribute may be attached to the 
>> C99 flexible array
>> member of a structure.  It indicates that the number of the elements of the
>> array is given by the field "@var{count}" in the same structure as the
>> flexible array member.
>> +This attribute is available only for C.
> 
> And this should say for now or something similar.
Okay. 
> 
> 
> 
>> GCC may use this information to improve detection of object size information
>> for such structures and provide better results in compile-time diagnostics
>> and runtime features like the array bound sanitizer and
>> diff --git a/gcc/testsuite/g++.dg/flex-array-counted-by.C 
>> b/gcc/testsuite/g++.dg/flex-array-counted-by.C
>> new file mode 100644
>> index 000..7f1a345615e
>> --- /dev/null
>> +++ b/gcc/testsuite/g++.dg/flex-array-counted-by.C
> 
> Tests shouldn't be added directly to g++.dg/ directory, I think this should
> go into g++.dg/ext/ as it is an (unsupported) extension.
Makes sense.
> 
>> @@ -0,0 +1,11 @@
>> +/* Testing the fact that the attribute counted_by is not supported in C++.  */
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2" } */
>> +
>> +int size;
>> +int x __attribute ((counted_by (size))); /* { dg-error "attribute is not supported for C\\+\\+" } */
>> +
>> +struct trailing {
>> +  int count;
>> +  int field[] __attribute ((counted_by (count))); /* { dg-error "attribute is not supported for C\\+\\+" } */
>> +};
> 
> Maybe it should also test in another { dg-do compile { target c++11 } } test
> that the same happens even for [[gnu::counted_by (size)]].

Okay, will add one more test for c++11. 

> Seems even for C23 there are no tests with [[gnu::counted_by (size)]].
So, you want me to add counted_by test-suite for C23? (Which should be 
supported)
Okay, but I will do it in another separate patch since this patch is for C++. 
> The C++11/C23 standard attributes are more strict on where they can appear
> depending on what it appertains to, as it applies to declarations, I think
> it needs to go before the [] or at the start of the declaration, so
>  [[gnu::counted_by (count)]] int field[];
> or
>  int field [[gnu::counted_by (count)]] [];
> but I could be wrong, better test it…
For C++11, as I just checked:

 int field[] [[gnu::counted_by (count)]];

Is fine. 

thanks.

Qing
> 
> 
> Jakub




Re: [PATCH] Explicitly document that the "counted_by" attribute is only supported in C

2024-08-05 Thread Jakub Jelinek
On Mon, Aug 05, 2024 at 04:46:09PM +, Qing Zhao wrote:
> So, you want me to add counted_by test-suite for C23? (Which should be 
> supported)
> Okay, but I will do it in another separate patch since this patch is for C++. 
> > The C++11/C23 standard attributes are more strict on where they can appear
> > depending on what it appertains to, as it applies to declarations, I think
> > it needs to go before the [] or at the start of the declaration, so
> >  [[gnu::counted_by (count)]] int field[];
> > or
> >  int field [[gnu::counted_by (count)]] [];
> > but I could be wrong, better test it…
> For C++11, as I just checked:
> 
>  int field[] [[gnu::counted_by (count)]];
> 
> Is fine. 

What do you mean by fine, that it emits the sorry?  Yes, but the question
is if it will be ok when the support is added.
struct S {
  int s;
  int f[] [[gnu::counted_by (s)]];
};
with -std=c23 certainly emits
test.c:3:3: warning: ‘counted_by’ attribute does not apply to types 
[-Wattributes]
3 |   int f[] [[gnu::counted_by (s)]];
  |   ^~~
while it is fine for
  int f [[gnu::counted_by (s)]] [];
and
  [[gnu::counted_by (s)]] int f[];
So, I'd use that in the C++ testcase too...

Jakub



Re: [PATCH] c++: permit errors inside uninstantiated templates [PR116064]

2024-08-05 Thread Patrick Palka
On Mon, 5 Aug 2024, Jason Merrill wrote:

> On 8/2/24 4:18 PM, Patrick Palka wrote:
> > On Fri, 2 Aug 2024, Patrick Palka wrote:
> > 
> > > On Fri, 2 Aug 2024, Jason Merrill wrote:
> > > 
> > > > On 8/1/24 2:52 PM, Patrick Palka wrote:
> > > > > In recent versions of GCC we've been diagnosing more and more kinds of
> > > > > errors inside a template ahead of time.  This is a largely good thing
> > > > > because it catches bugs, typos, dead code etc sooner.
> > > > > 
> > > > > But if the template never gets instantiated then such errors are
> > > > > harmless, and can be inconvenient to work around if say the code in
> > > > > question is third party and in maintenence mode.  So it'd be useful to
> > > > 
> > > > "maintenance"
> > > 
> > > Fixed
> > > 
> > > > 
> > > > > diff --git a/gcc/cp/error.cc b/gcc/cp/error.cc
> > > > > index d80bac822ba..0bb0a482e28 100644
> > > > > --- a/gcc/cp/error.cc
> > > > > +++ b/gcc/cp/error.cc
> > > > > @@ -165,6 +165,58 @@ class cxx_format_postprocessor : public
> > > > > format_postprocessor
> > > > >  deferred_printed_type m_type_b;
> > > > >};
> > > > >+/* A map from TEMPLATE_DECL to the location of the first error (if
> > > > > any)
> > > > > +   within the template that we permissivly downgraded to a warning.
> > > > > */
> > > > 
> > > > "permissively"
> > > 
> > > Fixed
> > > 
> > > > 
> > > > > +relaxed_template_errors_t *relaxed_template_errors;
> > > > > +
> > > > > +/* Callback function diagnostic_context::m_adjust_diagnostic_info.
> > > > > +
> > > > > +   In -fpermissive mode we downgrade errors within a template to
> > > > > +   warnings, and only issue an error if we later need to instantiate
> > > > > +   the template.  */
> > > > > +
> > > > > +static void
> > > > > +cp_adjust_diagnostic_info (diagnostic_context *context,
> > > > > +diagnostic_info *diagnostic)
> > > > > +{
> > > > > +  tree ti;
> > > > > +  if (diagnostic->kind == DK_ERROR
> > > > > +  && context->m_permissive
> > > > > +  && !current_instantiation ()
> > > > > +  && in_template_context
> > > > > +  && (ti = get_template_info (current_scope ())))
> > > > > +{
> > > > > +  if (!relaxed_template_errors)
> > > > > + relaxed_template_errors = new relaxed_template_errors_t;
> > > > > +
> > > > > +  tree tmpl = TI_TEMPLATE (ti);
> > > > > +  if (!relaxed_template_errors->get (tmpl))
> > > > > + relaxed_template_errors->put (tmpl,
> > > > > diagnostic->richloc->get_loc ());
> > > > > +  diagnostic->kind = DK_WARNING;
> > > > 
> > > > Rather than check m_permissive directly and downgrade to DK_WARNING, how
> > > > about
> > > > downgrading to DK_PERMERROR?  That way people will get the
> > > > [-fpermissive]
> > > > clue.
> > > > 
> > > > ...though I suppose DK_PERMERROR doesn't work where you call this hook
> > > > in
> > > > report_diagnostic, at which point we've already reassigned it into
> > > > DK_WARNING
> > > > or DK_ERROR in diagnostic_impl.
> > > > 
> > > > But we could still set diagnostic->option_index even for DK_ERROR,
> > > > whether to
> > > > context->m_opt_permissive or to its own warning flag, perhaps
> > > > -Wno-template-body?
> > > 
> > > Fixed by adding an enabled-by-default -Wtemplate-body flag and setting
> > > option_index to it for each downgraded error.  Thus -fpermissive
> > > -Wno-template-body would suppress the downgraded warnings entirely, and
> > > only issue a generic error upon instantiation of the erroneous template.
> > 
> > ... or did you have in mind to set option_index even when not using
> > -fpermissive so that eligible non-downgraded errors get the
> > [-fpermissive] or [-Wtemplate-body] hint as well?
> 
> Yes.
> 
> > IMHO I'm not sure that'd be worth the extra noise since the vast
> > majority of users appreciate and expect errors to get diagnosed inside
> > templates.
> 
> But people trying to build legacy code should appreciate the pointer for how
> to make it compile, as with other permerrors.
> 
> > And on second thought I'm not sure what extra value a new warning flag
> > adds either.  I can't think of a good reason why one would use
> > -fpermissive -Wno-template-body?
> 
> One would use -Wno-template-body (or -Wno-error=template-body) without
> -fpermissive, like with the various permerror_opt cases.

Since compiling legacy/unmaintained code is the only plausible use case,
why have a dedicated warning flag instead of just recommending -fpermissive
when compiling legacy code?  I don't quite understand the motivation for
adding a new permerror_opt flag for this class of errors.

-Wnarrowing is an existing permerror_opt flag, but I can imagine it's
useful to pass -Wno-error=narrowing etc when incrementally migrating
C / C++98 code to modern C++ where you don't want any conformance errors
allowed by -fpermissive to sneak in.  So being able to narrowly control
this class of errors seems useful, so a dedicated flag makes sense.

But there's no parallel for -Wtemplate-body 

Re: [PATCH] Explicitly document that the "counted_by" attribute is only supported in C

2024-08-05 Thread Qing Zhao


> On Aug 5, 2024, at 12:51, Jakub Jelinek  wrote:
> 
> On Mon, Aug 05, 2024 at 04:46:09PM +, Qing Zhao wrote:
>> So, you want me to add counted_by test-suite for C23? (Which should be 
>> supported)
>> Okay, but I will do it in another separate patch since this patch is for C++.
>>> The C++11/C23 standard attributes are more strict on where they can appear
>>> depending on what it appertains to, as it applies to declarations, I think
>>> it needs to go before the [] or at the start of the declaration, so
>>> [[gnu::counted_by (count)]] int field[];
>>> or
>>> int field [[gnu::counted_by (count)]] [];
>>> but I could be wrong, better test it…
>> For C++11, as I just checked:
>> 
>> int field[] [[gnu::counted_by (count)]];
>> 
>> Is fine.
> 
> What do you mean by fine, that it emits the sorry?  Yes, but the question
> is if it will be ok when the support is added.
> struct S {
>  int s;
>  int f[] [[gnu::counted_by (s)]];
> };
> with -std=c23 certainly emits
> test.c:3:3: warning: ‘counted_by’ attribute does not apply to types [-Wattributes]
>    3 |   int f[] [[gnu::counted_by (s)]];
>      |   ^~~
> while it is fine for
>  int f [[gnu::counted_by (s)]] [];
> and
>  [[gnu::counted_by (s)]] int f[];
> So, I'd use that in the C++ testcase too...

Okay.

thanks.

Qing
> 
>   Jakub
> 



Re: [PATCH] c++: permit errors inside uninstantiated templates [PR116064]

2024-08-05 Thread Jason Merrill

On 8/5/24 1:14 PM, Patrick Palka wrote:

On Mon, 5 Aug 2024, Jason Merrill wrote:


On 8/2/24 4:18 PM, Patrick Palka wrote:

On Fri, 2 Aug 2024, Patrick Palka wrote:


On Fri, 2 Aug 2024, Jason Merrill wrote:


On 8/1/24 2:52 PM, Patrick Palka wrote:

In recent versions of GCC we've been diagnosing more and more kinds of
errors inside a template ahead of time.  This is a largely good thing
because it catches bugs, typos, dead code etc sooner.

But if the template never gets instantiated then such errors are
harmless, and can be inconvenient to work around if say the code in
question is third party and in maintenence mode.  So it'd be useful to


"maintenance"


Fixed




diff --git a/gcc/cp/error.cc b/gcc/cp/error.cc
index d80bac822ba..0bb0a482e28 100644
--- a/gcc/cp/error.cc
+++ b/gcc/cp/error.cc
@@ -165,6 +165,58 @@ class cxx_format_postprocessor : public
format_postprocessor
  deferred_printed_type m_type_b;
};
+/* A map from TEMPLATE_DECL to the location of the first error (if
any)
+   within the template that we permissivly downgraded to a warning.
*/


"permissively"


Fixed




+relaxed_template_errors_t *relaxed_template_errors;
+
+/* Callback function diagnostic_context::m_adjust_diagnostic_info.
+
+   In -fpermissive mode we downgrade errors within a template to
+   warnings, and only issue an error if we later need to instantiate
+   the template.  */
+
+static void
+cp_adjust_diagnostic_info (diagnostic_context *context,
+  diagnostic_info *diagnostic)
+{
+  tree ti;
+  if (diagnostic->kind == DK_ERROR
+  && context->m_permissive
+  && !current_instantiation ()
+  && in_template_context
+  && (ti = get_template_info (current_scope ())))
+{
+  if (!relaxed_template_errors)
+   relaxed_template_errors = new relaxed_template_errors_t;
+
+  tree tmpl = TI_TEMPLATE (ti);
+  if (!relaxed_template_errors->get (tmpl))
+   relaxed_template_errors->put (tmpl,
diagnostic->richloc->get_loc ());
+  diagnostic->kind = DK_WARNING;


Rather than check m_permissive directly and downgrade to DK_WARNING, how
about
downgrading to DK_PERMERROR?  That way people will get the
[-fpermissive]
clue.

...though I suppose DK_PERMERROR doesn't work where you call this hook
in
report_diagnostic, at which point we've already reassigned it into
DK_WARNING
or DK_ERROR in diagnostic_impl.

But we could still set diagnostic->option_index even for DK_ERROR,
whether to
context->m_opt_permissive or to its own warning flag, perhaps
-Wno-template-body?


Fixed by adding an enabled-by-default -Wtemplate-body flag and setting
option_index to it for each downgraded error.  Thus -fpermissive
-Wno-template-body would suppress the downgraded warnings entirely, and
only issue a generic error upon instantiation of the erroneous template.


... or did you have in mind to set option_index even when not using
-fpermissive so that eligible non-downgraded errors get the
[-fpermissive] or [-Wtemplate-body] hint as well?


Yes.


IMHO I'm not sure that'd be worth the extra noise since the vast
majority of users appreciate and expect errors to get diagnosed inside
templates.


But people trying to build legacy code should appreciate the pointer for how
to make it compile, as with other permerrors.


And on second thought I'm not sure what extra value a new warning flag
adds either.  I can't think of a good reason why one would use
-fpermissive -Wno-template-body?


One would use -Wno-template-body (or -Wno-error=template-body) without
-fpermissive, like with the various permerror_opt cases.


Since compiling legacy/unmaintained code is the only plausible use case,
why have a dedicated warning flag instead of just recommending -fpermissive
when compiling legacy code?  I don't quite understand the motivation for
adding a new permerror_opt flag for this class of errors.


It seems to me an interesting class of errors, but I don't mind leaving 
it under just -fpermissive if you prefer.



-Wnarrowing is an existing permerror_opt flag, but I can imagine it's
useful to pass -Wno-error=narrowing etc when incrementally migrating
C / C++98 code to modern C++ where you don't want any conformance errors
allowed by -fpermissive to sneak in.  So being able to narrowly control
this class of errors seems useful, so a dedicated flag makes sense.

But there's no parallel for -Wtemplate-body here, since by assumption
the code base is unmaintained / immutable.  Otherwise the more proper
fix would be to just fix and/or delete the uninstantiated erroneous
template.  If say you're #including a legacy header that has such
errors, then doing #pragma GCC diagnostic "-fpermissive -w" around
the #include should be totally fine too.

I just don't see the use case for being able to narrowly control this
class of errors that justifies the extra implementation complexity
(specifically for properly detecting -Wno-error=template-body in the
callback hook)?


The hook shouldn't need to do anything specia

Re: [RFC v3 3/3] c: Add __lengthof__() operator

2024-08-05 Thread Alejandro Colomar
Hi Martin,

On Mon, Aug 05, 2024 at 06:05:15PM GMT, Martin Uecker wrote:
> > 
> > However, if I turn on -Wvla, both get a warning:
> > 
> > len.c: At top level:
> > len.c:288:1: warning: ISO C90 forbids variable length array ‘x’ [-Wvla]
> >   288 | void foo(char (*a)[3][*], int (*x)[__lengthof__(*a)]);
> >   | ^~~~
> > len.c:289:1: warning: ISO C90 forbids variable length array ‘x’ [-Wvla]
> >   289 | void bar(char (*a)[*][3], int (*x)[__lengthof__(*a)]);
> >   | ^~~~
> > 
> 
> You should check that the result you get from __lengthof__
> is an integer constant expression in the first case.
> 
> > I suspect that the problem is in:
> > 
> > $ grepc -tfd array_type_nelts_minus_one gcc
> > gcc/tree.cc:tree
> > array_type_nelts_minus_one (const_tree type)
> > {
> >   tree index_type, min, max;
> > 
> >   /* If they did it with unspecified bounds, then we should have already
> >  given an error about it before we got here.  */
> >   if (! TYPE_DOMAIN (type))
> > return error_mark_node;
> > 
> >   index_type = TYPE_DOMAIN (type);
> >   min = TYPE_MIN_VALUE (index_type);
> >   max = TYPE_MAX_VALUE (index_type);
> > 
> >   /* TYPE_MAX_VALUE may not be set if the array has unknown length.  */
> >   if (!max)
> > {
> > >   /* zero sized arrays are represented from C FE as complete types with
> >  NULL TYPE_MAX_VALUE and zero TYPE_SIZE, while C++ FE represents
> >  them as min 0, max -1.  */
> >   if (COMPLETE_TYPE_P (type)
> >   && integer_zerop (TYPE_SIZE (type))
> >   && integer_zerop (min))
> > return build_int_cst (TREE_TYPE (min), -1);
> > 
> >   return error_mark_node;
> > }
> > 
> >   return (integer_zerop (min)
> >   ? max
> >   : fold_build2 (MINUS_EXPR, TREE_TYPE (max), max, min));
> > }
> > 
> > With some debugging code, I've seen that in the fixed-length case, this
> > reaches the last return (integer_zerop() is true, so it returns max),
> > which is exactly the same as with any normal fixed-length array.
> > 
> > In the variable-length case (i.e., [*][3]), it returns build_int_cst().
> > 
> > So, it seems my problem is that 'max' does not represent an integer
> > constant, even though we know it is.  Can we coerce it to an integer
> > constant somehow?  Or maybe it's some of the in_lengthof that's messing
> > with me?
> > 
> 
> I would suspect the logic related to the C_MAYBE_CONST_EXPR.
> In your original patch this still used C_TYPE_VARIABLE_SIZE,
> which is not what we want for lengthof.

Ahhh, I blindly pasted that from sizeof, IIRC.  I'll check.
Thanks a lot!

> > > It is possible that you can not properly distinguish between
> > > 
> > > int a[0][n];
> > > int a[*][n];
> > > 
> > > those two cases. The logic will treat the first as the second.
> > 
> > Those can be distinguished.  [0] triggers the zero test, while [*]
> > triggers the second var test.
> 
> Are you sure? Both types should have C_TYPE_VARIABLE_SIZE set to 1.

You were right.  They're the same.  I was thinking of [0] vs [*], but
[0][n] is bad.  It gets treated as a VLA.

I won't worry too much about it, since GCC doesn't properly support
0-length arrays.  We'll have to worry about it if we start discussing
full support for 0-length arrays.

> > > I think this is ok for now.  All this array stuff should be 
> > > simplified and refactored anyway, but this is for another time.
> > > 
> > > 
> > > I am also not sure you even need to use array_type_nelts in C
> > > because there is never a non-zero minimum size.
> > 
> > How should I get the number of elements without array_type_nelts()?  Is
> > there any other existing way to get it?  It just had a good name that
> > matched my grep, but maybe I'm missing something easier.
> 
> Maybe it is ok, but there is also code which just adds one
> to TYPE_MAX_VALUE.

Hmmm.  I'll check.

> 
> Martin

Cheers,
Alex
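
As a point of comparison for the discussion above, the element count that the proposed __lengthof__ operator is meant to yield can be approximated today with the traditional sizeof idiom.  The sketch below is only an illustration: NELEMS and the three functions are hypothetical names, not part of the RFC.  In C the minimum index is always zero, so the count is just the maximum index plus one, which is what the sizeof division computes.

```c
#include <assert.h>
#include <stddef.h>

/* Traditional element-count idiom (hypothetical name, not from the RFC).
   Only valid for true arrays, never for pointers.  */
#define NELEMS(a) (sizeof (a) / sizeof ((a)[0]))

/* Outer dimension of a fixed-length 3x5 array.  */
size_t
outer_len (void)
{
  int a[3][5];
  return NELEMS (a);
}

/* Inner dimension of the same array type.  */
size_t
inner_len (void)
{
  int a[3][5];
  return NELEMS (a[0]);
}

/* For a variable-length array the count is only known at run time,
   so sizeof (a) is evaluated when the function executes.
   N must be greater than zero.  */
size_t
vla_len (size_t n)
{
  int a[n];
  return NELEMS (a);
}
```

Unlike the proposed operator, the macro silently gives a wrong answer when handed a pointer, and its result for a VLA is never an integer constant expression.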



Re: [PATCH] Explicitly document that the "counted_by" attribute is only supported in C

2024-08-05 Thread Jason Merrill

On 8/5/24 9:53 AM, Jakub Jelinek wrote:

On Mon, Aug 05, 2024 at 01:33:01PM +, Qing Zhao wrote:

As discussed in PR116016:https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116016#c48

We should explicitly document this limitation and issue error messages for C++.

The "counted_by" attribute is currently supported only in C; mention this
explicitly in the documentation and also issue an error when the "counted_by"
attribute is seen in C++.

The patch has been bootstrapped and regression tested on both aarch64 and X86,
no issue.

+  /* This attribute is not supported in C++.  */
+  if (c_dialect_cxx ())
+{
+  error_at (DECL_SOURCE_LOCATION (decl),
+   "%qE attribute is not supported for C++", name);


This should be sorry_at instead IMHO (at least if there is a plan to support
it later, hopefully in the 15 timeframe).


Why should it be an error at all?  A warning seems sufficient since 
there's no semantic effect.


Jason



Re: [PATCH] Explicitly document that the "counted_by" attribute is only supported in C

2024-08-05 Thread Jakub Jelinek
On Mon, Aug 05, 2024 at 01:48:25PM -0400, Jason Merrill wrote:
> On 8/5/24 9:53 AM, Jakub Jelinek wrote:
> > On Mon, Aug 05, 2024 at 01:33:01PM +, Qing Zhao wrote:
> > > As discussed in 
> > > PR116016:https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116016#c48
> > > 
> > > We should explicitly document this limitation and issue error messages 
> > > for C++.
> > > 
> > > > The "counted_by" attribute is currently supported only in C; mention this
> > > > explicitly in the documentation and also issue an error when the
> > > > "counted_by" attribute is seen in C++.
> > > > 
> > > > The patch has been bootstrapped and regression tested on both aarch64
> > > > and X86, no issue.
> > > 
> > > +  /* This attribute is not supported in C++.  */
> > > +  if (c_dialect_cxx ())
> > > +{
> > > +  error_at (DECL_SOURCE_LOCATION (decl),
> > > + "%qE attribute is not supported for C++", name);
> > 
> > This should be sorry_at instead IMHO (at least if there is a plan to support
> > it later, hopefully in the 15 timeframe).
> 
> Why should it be an error at all?  A warning seems sufficient since there's
> no semantic effect.

Ok.  Guess OPT_Wattributes then.

Jakub



[pushed] wwwdocs: news: Adjust link to 2015 ACM Software System Award

2024-08-05 Thread Gerald Pfeifer
Pushed.

Gerald
---
 htdocs/news.html | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/htdocs/news.html b/htdocs/news.html
index 6fac4ea5..bb04a682 100644
--- a/htdocs/news.html
+++ b/htdocs/news.html
@@ -196,7 +196,7 @@
 [2016-06-03] wwwdocs:
 
 
-http://awards.acm.org/about/2015-technical-awards";>2015 ACM 
Software System Award
+https://awards.acm.org/about/2015-technical-awards";>2015 
ACM Software System Award
 [2016-04-29] wwwdocs:
 
 
-- 
2.45.2


[PATCH] testsuite: Fix struct size check [PR116155]

2024-08-05 Thread Dimitar Dimitrov
The size of "struct only_fam_2" is dependent on the alignment of the
flexible array member "b", and not on the type of the preceding
bit-fields.  For most targets the two are equal.  But on default_packed
targets like pru-unknown-elf, the alignment of int is not equal to the
size of int, so the test failed.

Patch was suggested by Qing Zhao.  Tested on pru-unknown-elf and
x86_64-pc-linux-gnu.

Ok for master?

PR testsuite/116155

gcc/testsuite/ChangeLog:

* c-c++-common/fam-in-union-alone-in-struct-1.c: Adjust
check to account for default_packed targets.

Signed-off-by: Dimitar Dimitrov 
---
 gcc/testsuite/c-c++-common/fam-in-union-alone-in-struct-1.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/c-c++-common/fam-in-union-alone-in-struct-1.c 
b/gcc/testsuite/c-c++-common/fam-in-union-alone-in-struct-1.c
index 39ebf17850b..9979e96fe70 100644
--- a/gcc/testsuite/c-c++-common/fam-in-union-alone-in-struct-1.c
+++ b/gcc/testsuite/c-c++-common/fam-in-union-alone-in-struct-1.c
@@ -45,7 +45,7 @@ int main ()
 __builtin_abort ();
   if (sizeof (struct only_fam) != 0)
 __builtin_abort ();
-  if (sizeof (struct only_fam_2) != sizeof (int))
+  if (sizeof (struct only_fam_2) != __alignof__ (int))
 __builtin_abort ();
   return 0;
 }
-- 
2.45.2
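
The reasoning behind the adjusted check can be sketched with a small stand-alone example.  It rests on assumptions: struct bits_then_fam is a hypothetical stand-in and not the struct from the testcase, and the stated equality has only been reasoned through for common ABIs such as x86_64 and aarch64.  The flexible array member must start at an offset aligned for int, so the few bit-field bits get padded up to that alignment, and the struct size follows the alignment of int rather than its size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the testcase's struct: a few bit-fields
   followed by a flexible array member of int.  */
struct bits_then_fam {
  unsigned int x : 2;
  unsigned int y : 3;
  int fam[];
};

/* Size of the struct; the flexible array member itself adds no storage.  */
size_t
bits_then_fam_size (void)
{
  return sizeof (struct bits_then_fam);
}

/* Offset at which the flexible array member starts.  */
size_t
fam_offset (void)
{
  return offsetof (struct bits_then_fam, fam);
}

/* Alignment requirement of int on this target.  */
size_t
int_alignment (void)
{
  return _Alignof (int);
}
```

On targets where the size and alignment of int coincide, the old and new checks cannot be told apart; the difference only shows up on default_packed targets such as the pru-unknown-elf case described above.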



Re: [PATCH,c++,wwwdocs] bugs: Remove old "export" non-bug

2024-08-05 Thread Gerald Pfeifer
On Mon, 22 Jul 2024, Jonathan Wakely wrote:
>> We have been carrying this note on the "original" export feature for ages,
>> and I believe it's not actually a FAQ, if it ever was.
>>
>> Jonathan moved this down when adding a note on ADL last fall.
>>
>> I now propose to drop it.
> Sounds good to me.

Fine; there was no further response for two weeks, so I went ahead and 
pushed my patch.

> I keep meaning to make a similar change to
> https://isocpp.org/wiki/faq/templates#separate-template-fn-defn-from-decl-export-keyword
> which is also outdated and irrelevant in 2024.

Maybe a good trigger to yank the above now? :-)

Gerald


Re: [PATCH] Explicitly document that the "counted_by" attribute is only supported in C

2024-08-05 Thread Qing Zhao


On Aug 5, 2024, at 13:54, Jakub Jelinek  wrote:

On Mon, Aug 05, 2024 at 01:48:25PM -0400, Jason Merrill wrote:
On 8/5/24 9:53 AM, Jakub Jelinek wrote:
On Mon, Aug 05, 2024 at 01:33:01PM +, Qing Zhao wrote:
As discussed in PR116016:https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116016#c48

We should explicitly document this limitation and issue error messages for C++.

The "counted_by" attribute is currently supported only in C; mention this
explicitly in the documentation and also issue an error when the "counted_by"
attribute is seen in C++.

The patch has been bootstrapped and regression tested on both aarch64 and X86,
no issue.

+  /* This attribute is not supported in C++.  */
+  if (c_dialect_cxx ())
+{
+  error_at (DECL_SOURCE_LOCATION (decl),
+ "%qE attribute is not supported for C++", name);

This should be sorry_at instead IMHO (at least if there is a plan to support
it later, hopefully in the 15 timeframe).

Why should it be an error at all?  A warning seems sufficient since there's
no semantic effect.

Ok.  Guess OPT_Wattributes then.

Okay.

Qing

Jakub



Re: [PATCH] c++: remove function/var concepts code

2024-08-05 Thread Marek Polacek
On Mon, Aug 05, 2024 at 12:00:04PM -0400, Jason Merrill wrote:
> On 8/2/24 2:12 PM, Marek Polacek wrote:
> > Bootstrapped/regtested on x86_64-pc-linux-gnu.  Comments?
> > 
> > -- >8 --
> > This patch removes vestigial Concepts TS code as discussed in
> > .
> > 
> > In particular, it removes code related to function/variable concepts.
> > That includes variable_concept_p and function_concept_p, which then
> > cascades into removing DECL_DECLARED_CONCEPT_P etc.  So I think we
> > no longer need to say "standard concept" since there are no non-standard
> > ones anymore.
> > 
> > I've added two new errors saying that "variable/function concepts are
> > no longer supported".
> > 
> > gcc/cp/ChangeLog:
> > 
> > * constexpr.cc (cxx_eval_constant_expression): Don't call
> > unpack_concept_check.  Add a concept_check_p assert.  Remove
> > function_concept_p code.
> > * constraint.cc (check_constraint_atom): Remove function concepts code.
> > (unpack_concept_check): Remove.
> > (get_concept_check_template): Remove Concepts TS code.
> > (resolve_function_concept_overload): Remove.
> > (resolve_function_concept_check): Remove.
> > (resolve_concept_check): Remove Concepts TS code.
> > (get_returned_expression): Remove.
> > (get_variable_initializer): Remove.
> > (get_concept_definition): Remove Concepts TS code.
> > (normalize_concept_check): Likewise.
> > (build_function_check): Remove.
> > (build_variable_check): Remove.
> > (build_standard_check): Use concept_definition_p instead of
> > standard_concept_p.
> > (build_concept_check): Remove variable_concept_p/function_concept_p
> > code.
> > (build_concept_id): Simplify.
> > (build_type_constraint): Likewise.
> > (placeholder_extract_concept_and_args): Likewise.
> > (satisfy_nondeclaration_constraints): Likewise.
> > (check_function_concept): Remove.
> > (get_constraint_error_location): Remove Concepts TS code.
> > * cp-tree.h (DECL_DECLARED_CONCEPT_P): Remove.
> > (check_function_concept): Remove.
> > (unpack_concept_check): Remove.
> > (standard_concept_p): Remove.
> > (variable_concept_p): Remove.
> > (function_concept_p): Remove.
> > (concept_definition_p): Simplify.
> > (concept_check_p): Don't check for CALL_EXPR.
> > * decl.cc (check_concept_refinement): Remove.
> > (duplicate_decls): Remove check_concept_refinement code.
> > (is_concept_var): Remove.
> > (cp_finish_decl): Remove is_concept_var.
> > (check_concept_fn): Remove.
> > (grokfndecl): Give an error about function concepts not being supported
> > anymore.  Remove unused code.
> > (grokvardecl): Give an error about variable concepts not being
> > supported anymore.
> > (finish_function): Remove DECL_DECLARED_CONCEPT_P code.
> > * decl2.cc (min_vis_expr_r): Use concept_definition_p instead of
> > standard_concept_p.
> > (maybe_instantiate_decl): Remove DECL_DECLARED_CONCEPT_P check.
> > (mark_used): Likewise.
> > * error.cc (dump_simple_decl): Use concept_definition_p instead of
> > standard_concept_p.
> > (dump_function_decl): Remove DECL_DECLARED_CONCEPT_P code.
> > (print_concept_check_info): Don't call unpack_concept_check.
> > * mangle.cc (write_type_constraint): Likewise.
> > * parser.cc (cp_parser_nested_name_specifier_opt): Remove
> > function_concept_p code.  Only check concept_definition_p, not
> > variable_concept_p/standard_concept_p.
> > (add_debug_begin_stmt): Remove DECL_DECLARED_CONCEPT_P code.
> > (cp_parser_template_declaration_after_parameters): Remove a stale
> > comment.
> > * pt.cc (check_explicit_specialization): Remove
> > DECL_DECLARED_CONCEPT_P code.
> > (process_partial_specialization): Remove variable_concept_p code.
> > (lookup_template_variable): Likewise.
> > (tsubst_expr) : Remove Concepts TS code and simplify.
> > (do_decl_instantiation): Remove DECL_DECLARED_CONCEPT_P code.
> > (instantiate_decl): Likewise.
> > (placeholder_type_constraint_dependent_p): Don't call
> > unpack_concept_check.  Add a concept_check_p assert.
> > (convert_generic_types_to_packs): Likewise.
> > * semantics.cc (finish_call_expr): Remove Concepts TS code and simplify.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * g++.dg/concepts/decl-diagnose.C: Adjust dg-error.
> > * g++.dg/concepts/fn-concept2.C: Likewise.
> > * g++.dg/concepts/pr71128.C: Likewise.
> > * g++.dg/concepts/var-concept6.C: Likewise.
> > * g++.dg/cpp2a/concepts.C: Likewise.
> > ---
> >   gcc/cp/constexpr.cc   |  13 +-
> >   gcc/cp/constraint.cc  | 346 +-
> >   gcc/cp/cp-tree.h  |  71 +---
> >   gcc/cp/decl.cc| 118 +-
> >   gcc/cp/decl2.cc   

Re: [PATCH] c++: remove function/var concepts code

2024-08-05 Thread Jason Merrill

On 8/5/24 2:44 PM, Marek Polacek wrote:

On Mon, Aug 05, 2024 at 12:00:04PM -0400, Jason Merrill wrote:



I think we also want to adjust the 'concept bool' handling in
cp_parser_decl_specifier_seq:


   /* Warn for concept as a decl-specifier. We'll rewrite these
as
concept declarations later.  */
   {
 cp_token *next = cp_lexer_peek_token (parser->lexer);
 if (next->keyword == RID_BOOL)
=>permerror (next->location, "the %<bool%> keyword is not "
  "allowed in a C++20 concept definition");
 else
   error_at (token->location, "C++20 concept definition syntax "
 "is %<concept <name> = <expr>%>");
   }


After the permerror let's skip the 'bool' token and continue trying to parse
a concept declaration.  I think that should allow us to remove more of the
code in grokfndecl/grokvardecl?


If by skip you mean cp_lexer_consume_token, then that results in worse
diagnostics for e.g.

   concept bool f3();

where it adds the extra "with no type" error:


Ah, yeah, cp_parser_decl_specifier_seq is too late for what I was 
thinking.  How about in cp_parser_template_declaration_after_parameters:



  else if (flag_concepts
   && cp_lexer_next_token_is_keyword (parser->lexer, RID_CONCEPT)
   && cp_lexer_nth_token_is (parser->lexer, 2, CPP_NAME))
/* -fconcept-ts 'concept bool' syntax is handled below, in  
cp_parser_single_declaration.  */

decl = cp_parser_concept_definition (parser);


What happens if we remove the CPP_NAME check, so we commit to concept 
parsing as soon as we see the keyword?


Jason



[PATCH] doc: Rephrase GM2 Limitations section

2024-08-05 Thread Gerald Pfeifer
I noticed a non-working link, then some other details in that section.

Here is a suggestion to rework it a bit.

Okay?

Gerald



From 83e856355a94bd78afbf19eed32ca1726658f581 Mon Sep 17 00:00:00 2001
From: Gerald Pfeifer 
Date: Mon, 5 Aug 2024 21:06:20 +0200
Subject: [PATCH] doc: Rephrase GM2 Limitations section

gcc:
* doc/gm2.texi (Limitations): Rephrase. Remove invalid link.
---
 gcc/doc/gm2.texi | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/gcc/doc/gm2.texi b/gcc/doc/gm2.texi
index 5bff9eb3829..bfc8dc71f23 100644
--- a/gcc/doc/gm2.texi
+++ b/gcc/doc/gm2.texi
@@ -2970,10 +2970,9 @@ $ @samp{directory to the sources}/contrib/test_summary
 @node Limitations, Objectives, Regression tests, Using
 @section Limitations
 
-Logitech compatibility library is incomplete.  The principle modules
-for this platform exist however for a comprehensive list of completed
-modules please check the documentation
-@url{gm2.html}.
+The Logitech compatibility library is incomplete.  The primary
+modules for this platform exist, though for a comprehensive list
+of completed modules please check the documentation.
 
 @node Objectives, FAQ, Limitations, Using
 @section Objectives
-- 
2.45.2



[pushed] wwwdocs: news: Switch MMIX links to https

2024-08-05 Thread Gerald Pfeifer
Pushed.

Gerald
---
 htdocs/news.html | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/htdocs/news.html b/htdocs/news.html
index bb04a682..a8850dd4 100644
--- a/htdocs/news.html
+++ b/htdocs/news.html
@@ -1228,7 +1228,7 @@ GCC 3.0.3 has been released.
 November 3, 2001
 
 Hans-Peter Nilsson has contributed a port to http://www-cs-faculty.stanford.edu/~knuth/mmix.html";>MMIX, the
+href="https://www-cs-faculty.stanford.edu/~knuth/mmix.html";>MMIX, the
 CPU architecture used in new editions of Donald E. Knuth's The Art of
 Computer Programming.
 
-- 
2.45.2


[pushed] wwwdocs: gcc-3.4: Switch GNU Classpath link to https

2024-08-05 Thread Gerald Pfeifer
Pushed.

Gerald
---
 htdocs/gcc-3.4/changes.html | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/htdocs/gcc-3.4/changes.html b/htdocs/gcc-3.4/changes.html
index 76b4ce87..64042080 100644
--- a/htdocs/gcc-3.4/changes.html
+++ b/htdocs/gcc-3.4/changes.html
@@ -735,7 +735,7 @@ and not your code, that is broken.
   lets URLClassLoader load code from shared
   libraries.
 libgcj has been much more completely merged with http://www.gnu.org/software/classpath/";>GNU Classpath.
+  href="https://www.gnu.org/software/classpath/";>GNU Classpath.
 Class loading is now much more correct; in particular the
   caller's class loader is now used when that is required.
 https://www.eclipse.org";>Eclipse 2.x will run
-- 
2.45.2


Re: [PATCH 1/8] fortran: Add tests covering inline MINLOC/MAXLOC without DIM [PR90608]

2024-08-05 Thread Harald Anlauf

Hi Mikael,

I had only a quick glance at this patch, but am a little concerned
about the tests involving nans.

E.g.:


+  subroutine check_all_nans()
+real, allocatable :: a(:,:,:)
+real :: nan
+integer, allocatable :: m(:)
+nan = 0
+nan = nan / nan
+allocate(a(3,3,3), source = nan)
+m = maxloc(a)
+if (size(m, dim=1) /= 3) stop 161
+if (any(m /= (/ 1, 1, 1 /))) stop 162
+  end subroutine


Is there a reason you do not use the ieee intrinsic module way
to get a quiet nan?  Otherwise, how do you prevent exceptions
to happen, possibly leading to a failing test?
(The test cases need a workaround to run with NAG).

Thanks,
Harald





Re: [PATCH] c++: permit errors inside uninstantiated templates [PR116064]

2024-08-05 Thread Patrick Palka
On Mon, 5 Aug 2024, Jason Merrill wrote:

> On 8/5/24 1:14 PM, Patrick Palka wrote:
> > On Mon, 5 Aug 2024, Jason Merrill wrote:
> > 
> > > On 8/2/24 4:18 PM, Patrick Palka wrote:
> > > > On Fri, 2 Aug 2024, Patrick Palka wrote:
> > > > 
> > > > > On Fri, 2 Aug 2024, Jason Merrill wrote:
> > > > > 
> > > > > > On 8/1/24 2:52 PM, Patrick Palka wrote:
> > > > > > > In recent versions of GCC we've been diagnosing more and more
> > > > > > > kinds of
> > > > > > > errors inside a template ahead of time.  This is a largely good
> > > > > > > thing
> > > > > > > because it catches bugs, typos, dead code etc sooner.
> > > > > > > 
> > > > > > > But if the template never gets instantiated then such errors are
> > > > > > > harmless, and can be inconvenient to work around if say the code
> > > > > > > in
> > > > > > > question is third party and in maintenence mode.  So it'd be
> > > > > > > useful to
> > > > > > 
> > > > > > "maintenance"
> > > > > 
> > > > > Fixed
> > > > > 
> > > > > > 
> > > > > > > diff --git a/gcc/cp/error.cc b/gcc/cp/error.cc
> > > > > > > index d80bac822ba..0bb0a482e28 100644
> > > > > > > --- a/gcc/cp/error.cc
> > > > > > > +++ b/gcc/cp/error.cc
> > > > > > > @@ -165,6 +165,58 @@ class cxx_format_postprocessor : public
> > > > > > > format_postprocessor
> > > > > > >   deferred_printed_type m_type_b;
> > > > > > > };
> > > > > > > +/* A map from TEMPLATE_DECL to the location of the first
> > > > > > > error (if
> > > > > > > any)
> > > > > > > +   within the template that we permissivly downgraded to a
> > > > > > > warning.
> > > > > > > */
> > > > > > 
> > > > > > "permissively"
> > > > > 
> > > > > Fixed
> > > > > 
> > > > > > 
> > > > > > > +relaxed_template_errors_t *relaxed_template_errors;
> > > > > > > +
> > > > > > > +/* Callback function
> > > > > > > diagnostic_context::m_adjust_diagnostic_info.
> > > > > > > +
> > > > > > > +   In -fpermissive mode we downgrade errors within a template to
> > > > > > > +   warnings, and only issue an error if we later need to
> > > > > > > instantiate
> > > > > > > +   the template.  */
> > > > > > > +
> > > > > > > +static void
> > > > > > > +cp_adjust_diagnostic_info (diagnostic_context *context,
> > > > > > > +diagnostic_info *diagnostic)
> > > > > > > +{
> > > > > > > +  tree ti;
> > > > > > > +  if (diagnostic->kind == DK_ERROR
> > > > > > > +  && context->m_permissive
> > > > > > > +  && !current_instantiation ()
> > > > > > > +  && in_template_context
> > > > > > > > +  && (ti = get_template_info (current_scope ())))
> > > > > > > +{
> > > > > > > +  if (!relaxed_template_errors)
> > > > > > > + relaxed_template_errors = new relaxed_template_errors_t;
> > > > > > > +
> > > > > > > +  tree tmpl = TI_TEMPLATE (ti);
> > > > > > > +  if (!relaxed_template_errors->get (tmpl))
> > > > > > > + relaxed_template_errors->put (tmpl,
> > > > > > > diagnostic->richloc->get_loc ());
> > > > > > > +  diagnostic->kind = DK_WARNING;
> > > > > > 
> > > > > > Rather than check m_permissive directly and downgrade to DK_WARNING,
> > > > > > how
> > > > > > about
> > > > > > downgrading to DK_PERMERROR?  That way people will get the
> > > > > > [-fpermissive]
> > > > > > clue.
> > > > > > 
> > > > > > ...though I suppose DK_PERMERROR doesn't work where you call this
> > > > > > hook
> > > > > > in
> > > > > > report_diagnostic, at which point we've already reassigned it into
> > > > > > DK_WARNING
> > > > > > or DK_ERROR in diagnostic_impl.
> > > > > > 
> > > > > > But we could still set diagnostic->option_index even for DK_ERROR,
> > > > > > whether to
> > > > > > context->m_opt_permissive or to its own warning flag, perhaps
> > > > > > -Wno-template-body?
> > > > > 
> > > > > Fixed by adding an enabled-by-default -Wtemplate-body flag and setting
> > > > > > option_index to it for each downgraded error.  Thus -fpermissive
> > > > > -Wno-template-body would suppress the downgraded warnings entirely,
> > > > > and
> > > > > only issue a generic error upon instantiation of the erroneous
> > > > > template.
> > > > 
> > > > ... or did you have in mind to set option_index even when not using
> > > > -fpermissive so that eligible non-downgraded errors get the
> > > > [-fpermissive] or [-Wtemplate-body] hint as well?
> > > 
> > > Yes.
> > > 
> > > > IMHO I'm not sure that'd be worth the extra noise since the vast
> > > > majority of users appreciate and expect errors to get diagnosed inside
> > > > templates.
> > > 
> > > But people trying to build legacy code should appreciate the pointer for
> > > how
> > > to make it compile, as with other permerrors.
> > > 
> > > > And on second thought I'm not sure what extra value a new warning flag
> > > > adds either.  I can't think of a good reason why one would use
> > > > -fpermissive -Wno-template-body?
> > > 
> > > One would use -Wno-template-body (or -Wno-error=template-body) without
> > > -fpermissive, like with the various permerror_opt cases.
> > 
> > S
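
The downgrade scheme discussed in this thread can be sketched as a toy
model.  This is not GCC code: the types Kind, Diagnostic and
RelaxedTemplateErrors are hypothetical stand-ins, templates are
identified by name rather than by TEMPLATE_DECL, and locations are plain
integers.  It shows only the bookkeeping: remember the first downgraded
error per template, and report it if the template is later instantiated.

```cpp
#include <map>
#include <string>

enum class Kind { Error, Warning };

struct Diagnostic {
    Kind kind;
    int location;  // stand-in for a source location
};

// Hypothetical stand-in for the relaxed_template_errors map.
class RelaxedTemplateErrors {
public:
    // Loosely mirrors the idea of cp_adjust_diagnostic_info: in permissive
    // mode, an error inside a template that is not being instantiated is
    // downgraded to a warning, and its first location is remembered.
    void adjust(Diagnostic &d, bool permissive, bool in_uninstantiated_template,
                const std::string &tmpl) {
        if (d.kind != Kind::Error || !permissive || !in_uninstantiated_template)
            return;
        first_error_.emplace(tmpl, d.location);  // keeps an existing entry
        d.kind = Kind::Warning;
    }

    // Called when a template is instantiated.  Returns true and sets 'loc'
    // if an error in that template was downgraded earlier.
    bool first_error_location(const std::string &tmpl, int &loc) const {
        auto it = first_error_.find(tmpl);
        if (it == first_error_.end())
            return false;
        loc = it->second;
        return true;
    }

private:
    std::map<std::string, int> first_error_;
};
```

Using emplace rather than operator[] means a second error in the same
template does not overwrite the recorded location, matching the "first
error (if any)" wording of the comment in the patch.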

Re: [PATCH] c++: remove function/var concepts code

2024-08-05 Thread Marek Polacek
On Mon, Aug 05, 2024 at 02:52:32PM -0400, Jason Merrill wrote:
> On 8/5/24 2:44 PM, Marek Polacek wrote:
> > On Mon, Aug 05, 2024 at 12:00:04PM -0400, Jason Merrill wrote:
> 
> > > I think we also want to adjust the 'concept bool' handling in
> > > cp_parser_decl_specifier_seq:
> > > 
> > > >/* Warn for concept as a decl-specifier. We'll rewrite these
> > > > as
> > > > concept declarations later.  */
> > > >{
> > > >  cp_token *next = cp_lexer_peek_token (parser->lexer);
> > > >  if (next->keyword == RID_BOOL)
> > > > =>permerror (next->location, "the %<bool%> keyword is not "
> > > >   "allowed in a C++20 concept definition");
> > > >  else
> > > >error_at (token->location, "C++20 concept definition syntax "
> > > >  "is %<concept <name> = <expr>%>");
> > > >}
> > > 
> > > After the permerror let's skip the 'bool' token and continue trying to 
> > > parse
> > > a concept declaration.  I think that should allow us to remove more of the
> > > code in grokfndecl/grokvardecl?
> > 
> > If by skip you mean cp_lexer_consume_token, then that results in worse
> > diagnostics for e.g.
> > 
> >concept bool f3();
> > 
> > where it adds the extra "with no type" error:
> 
> Ah, yeah, cp_parser_decl_specifier_seq is too late for what I was thinking.
> How about in cp_parser_template_declaration_after_parameters:
> 
> >   else if (flag_concepts
> >&& cp_lexer_next_token_is_keyword (parser->lexer, RID_CONCEPT)
> >&& cp_lexer_nth_token_is (parser->lexer, 2, CPP_NAME))
> > /* -fconcept-ts 'concept bool' syntax is handled below, in
> > cp_parser_single_declaration.  */
> > decl = cp_parser_concept_definition (parser);
> 
> What happens if we remove the CPP_NAME check, so we commit to concept
> parsing as soon as we see the keyword?

Hmm, for 

  template
  concept int f2() { return 0; }
  concept bool f3();

it produces this output:

t.C:2:9: error: expected identifier before 'int'
2 | concept int f2() { return 0; }
  | ^~~
t.C:2:31: error: expected ';' before 'concept'
2 | concept int f2() { return 0; }
  |   ^
  |   ;
3 | concept bool f3();
  | ~~~

In cp_parser_concept_definition we have

  cp_expr id = cp_parser_identifier (parser);
  if (id == error_mark_node)
{
  cp_parser_skip_to_end_of_statement (parser);
  cp_parser_consume_semicolon_at_end_of_statement (parser);
  return NULL_TREE;
}

cp_parser_identifier emits an error on the "int",
cp_parser_skip_to_end_of_statement consumes all tokens up to the '}'
(including) and then the next token is "concept", not a ';'.  After
cp_parser_consume_semicolon_at_end_of_statement we end up at EOF.  So
the whole f3 decl is skipped.

But the same thing will happen with a valid concept if you forget the ';':

  template
  concept C = true
  concept bool f3();

so I can "fix" it by adding a "stray" ';' in the test.  That sound good?
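
The recovery sequence described above can be modelled with a toy token
stream.  This is a sketch, not the GCC parser: tokens are plain strings,
all three function names are hypothetical, and it assumes that the
semicolon helper, when the next token is not a ';', skips ahead to the
end of the following statement and consumes its ';' if present.

```cpp
#include <cstddef>
#include <string>
#include <vector>

using Tokens = std::vector<std::string>;

// Toy model of skipping to the end of a statement: stop before a
// top-level ';', or just after the '}' closing a block opened in the scan.
std::size_t skip_to_end_of_statement(const Tokens &toks, std::size_t pos) {
    int depth = 0;
    while (pos < toks.size()) {
        const std::string &t = toks[pos];
        if (t == ";" && depth == 0)
            return pos;              // leave the ';' for the caller
        if (t == "{") {
            ++depth;
        } else if (t == "}") {
            if (depth == 0)
                return pos;          // stray '}', do not consume it
            if (--depth == 0)
                return pos + 1;      // consume the closing '}' and stop
        }
        ++pos;
    }
    return pos;                      // reached EOF
}

// Toy model of consuming the ';' at the end of a statement.  If it is
// missing, skip further and consume the next top-level ';' (assumption).
std::size_t consume_semicolon(const Tokens &toks, std::size_t pos) {
    if (pos < toks.size() && toks[pos] == ";")
        return pos + 1;
    pos = skip_to_end_of_statement(toks, pos);
    if (pos < toks.size() && toks[pos] == ";")
        ++pos;
    return pos;
}

// Recovery after the identifier at 'pos' failed to parse.
std::size_t recover_after_bad_identifier(const Tokens &toks, std::size_t pos) {
    pos = skip_to_end_of_statement(toks, pos);
    return consume_semicolon(toks, pos);
}
```

With the tokens of "concept int f2() { return 0; } concept bool f3();"
and recovery starting at "int", the first skip stops right after the
'}', and the missing ';' then causes the second skip to swallow the f3
declaration.  Adding a stray ';' after the '}' lets recovery stop just
before the second "concept".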

Marek


