[PATCH] [i386] Optimize a ^ ((a ^ b) & mask) to (~mask & a) | (b & mask).

2022-01-12 Thread Haochen Jiang via Gcc-patches
Hi all,

This patch targets PR94790, which change the instruction selection under the 
following circumstance.

Regtested on x86_64-pc-linux-gnu. Ok for trunk?

BRs,
Haochen

>From the perspective of the pipeline, `andn + and + ior` version take
2 cycles(AND and ANDN doesn't have dependence), but xor + and + xor
will take 3 cycles.

-   xorl%edi, %esi
andl%edx, %esi
-   movl%esi, %eax
-   xorl%edi, %eax
+   andn%edi, %edx, %eax
+   orl %esi, %eax

gcc/ChangeLog:

PR taeget/94790
* config/i386/i386.md (*xor2andn): New define_insn_and_split.

gcc/testsuite/ChangeLog:

PR taeget/94790
* gcc.target/i386/pr94790-1.c: New test.
* gcc.target/i386/pr94790-2.c: Ditto.
---
 gcc/config/i386/i386.md   | 39 +++
 gcc/testsuite/gcc.target/i386/pr94790-1.c | 14 
 gcc/testsuite/gcc.target/i386/pr94790-2.c |  9 ++
 3 files changed, 62 insertions(+)
 create mode 100755 gcc/testsuite/gcc.target/i386/pr94790-1.c
 create mode 100755 gcc/testsuite/gcc.target/i386/pr94790-2.c

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 9b424a3935b..38efc6d5837 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -10452,6 +10452,45 @@
(set_attr "znver1_decode" "double")
(set_attr "mode" "DI")])
 
+;; PR target/94790: Optimize a ^ ((a ^ b) & mask) to (~mask & a) | (b & mask)
+(define_insn_and_split "*xor2andn"
+  [(set (match_operand:SWI248 0 "nonimmediate_operand")
+   (xor:SWI248
+ (and:SWI248
+   (xor:SWI248
+ (match_operand:SWI248 1 "nonimmediate_operand")
+ (match_operand:SWI248 2 "nonimmediate_operand"))
+   (match_operand:SWI248 3 "nonimmediate_operand"))
+ (match_dup 1)))
+(clobber (reg:CC FLAGS_REG))]
+  "(TARGET_BMI || TARGET_AVX512BW)
+   && ix86_pre_reload_split ()"
+  "#"
+  "&& 1"
+  [(parallel [(set (match_dup 4)
+   (and:SWI248
+ (not:SWI248
+   (match_dup 3))
+ (match_dup 1)))
+ (clobber (reg:CC FLAGS_REG))])
+   (parallel [(set (match_dup 5)
+   (and:SWI248
+ (match_dup 2)
+ (match_dup 3)))
+ (clobber (reg:CC FLAGS_REG))])
+   (parallel [(set (match_dup 0)
+   (ior:SWI248
+ (match_dup 4)
+ (match_dup 5)))
+ (clobber (reg:CC FLAGS_REG))])]
+  {
+operands[1] = force_reg (mode, operands[1]);
+operands[3] = force_reg (mode, operands[3]);
+operands[4] = gen_reg_rtx (mode);
+operands[5] = gen_reg_rtx (mode);
+  }
+)
+
 ;; See comment for addsi_1_zext why we do use nonimmediate_operand
 (define_insn "*si_1_zext"
   [(set (match_operand:DI 0 "register_operand" "=r")
diff --git a/gcc/testsuite/gcc.target/i386/pr94790-1.c 
b/gcc/testsuite/gcc.target/i386/pr94790-1.c
new file mode 100755
index 000..6ebbec15cfd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr94790-1.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mbmi" } */
+/* { dg-final { scan-assembler-times "andn\[ \\t\]" 2 } } */
+/* { dg-final { scan-assembler-not "xorl\[ \\t\]" } } */
+
+unsigned r1(unsigned a, unsigned b, unsigned mask)
+{
+  return a ^ ((a ^ b) & mask);
+}
+
+unsigned r2(unsigned a, unsigned b, unsigned mask)
+{
+  return (~mask & a) | (b & mask);
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr94790-2.c 
b/gcc/testsuite/gcc.target/i386/pr94790-2.c
new file mode 100755
index 000..d7b0eec5bef
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr94790-2.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mbmi" } */
+/* { dg-final { scan-assembler-not "andn\[ \\t\]" } } */
+/* { dg-final { scan-assembler-times "xorl\[ \\t\]" 2 } } */
+
+unsigned r1(unsigned a, unsigned b, unsigned mask)
+{
+  return a ^ ((a ^ b) & mask) + (a ^ b);
+}
-- 
2.18.1



Re: [PATCH] ira: Fix old-reload targets [PR103974]

2022-01-12 Thread Richard Biener via Gcc-patches
On Tue, Jan 11, 2022 at 7:41 PM Vladimir Makarov via Gcc-patches
 wrote:
>
>
> On 2022-01-11 12:42, Richard Sandiford wrote:
> > The new IRA heuristics would need more work on old-reload targets,
> > since flattening needs to be able to undo the cost propagation.
> > It's doable, but hardly seems worth it.
> Agree. It is not worth to spend your time for work for reload.
> > This patch therefore makes all the new calls to
> > ira_subloop_allocnos_can_differ_p return false if !ira_use_lra_p.
> > The color_pass code that predated the new function (and that was
> > the source of ira_subloop_allocnos_can_differ_p) continues to
> > behave as before.
> >
> > It's a hack, but at least it has the advantage that the new parameter
> > would become obviously unused if reload and (!)ira_use_lra_p were
> > removed.  The hack should therefore disappear alongside reload.
> I have a feeling that it will stay for a long time if not forever.

We indeed seem to have 34 targets w/o LRA by default and only 15 with :/

At some point Eric wrote a nice summary for how to transition targets
away from CC0, I wonder if there's something similar for transitioning
a port away from reload to LRA?  In those 34 targets there must be some
for which that's a relatively easy task?  I suppose it depends on how
much of the reload target hooks are put to use and how those translate
to LRA.

Richard.


Re: [PATCH] c++: Silence -Wuseless-cast warnings during move [PR103480]

2022-01-12 Thread Jakub Jelinek via Gcc-patches
On Tue, Jan 11, 2022 at 11:50:42AM -0500, Jason Merrill wrote:
> On 12/31/21 04:30, Jakub Jelinek wrote:
> > Hi!
> > 
> > This is maybe just a shot in the dark, but IMHO we shouldn't be diagnosing
> > -Wuseless-cast on casts the compiler adds on its own when calling its move
> > function.  We don't seem to warn when user calls std::move either.
> > We call move on elinit (*NON_LVALUE_EXPR <(struct C[2] &&) &D.2497->b>)[0]
> > so it is already an xvalue_p and try to static_cast it to struct C &&.
> > But we don't warn e.g. on std::move (std::move (whatever)).
> > 
> > Bootstrapped/regtested on x86_64-linux and i686-linux.
> 
> Maybe we should
> 
> if (xvalue_p (expr))
>   return expr;
> 
> instead?  OK either way.

That worked well, so I've committed following instead:

2022-01-11  Jakub Jelinek  
Jason Merrill  

PR c++/103480
* tree.c (move): If expr is xvalue_p, just return expr without
build_static_cast.

* g++.dg/warn/Wuseless-cast2.C: New test.

--- gcc/cp/tree.c.jj2022-01-10 10:53:48.983925610 +0100
+++ gcc/cp/tree.c   2022-01-11 19:41:12.787433061 +0100
@@ -1304,6 +1304,8 @@ move (tree expr)
 {
   tree type = TREE_TYPE (expr);
   gcc_assert (!TYPE_REF_P (type));
+  if (xvalue_p (expr))
+return expr;
   type = cp_build_reference_type (type, /*rval*/true);
   return build_static_cast (input_location, type, expr,
tf_warning_or_error);
--- gcc/testsuite/g++.dg/warn/Wuseless-cast2.C.jj   2022-01-11 
19:38:08.552039028 +0100
+++ gcc/testsuite/g++.dg/warn/Wuseless-cast2.C  2022-01-11 19:38:08.552039028 
+0100
@@ -0,0 +1,24 @@
+// PR c++/103480
+// { dg-do compile { target c++14 } }
+// { dg-options "-Wuseless-cast" }
+
+template 
+struct A { typedef T t[N]; };
+template 
+struct B { typename A::t b; };
+struct C {
+  constexpr C (C &&) {}
+  template 
+  static auto bar ()
+  {
+B r;
+return r;  // { dg-bogus "useless cast to type" }
+  }
+  C () = default;
+};
+
+void
+foo ()
+{
+  C::bar<2> ();
+}


Jakub



[committed] testsuite: Fix up c-c++-common/builtin-shufflevector-3.c testcase [PR101530]

2022-01-12 Thread Jakub Jelinek via Gcc-patches
Hi!

This fixes on i686-linux:
FAIL: c-c++-common/builtin-shufflevector-3.c  -Wc++-compat  (test for excess 
errors)
Excess errors:
.../gcc/testsuite/c-c++-common/builtin-shufflevector-3.c:6:1: warning: SSE 
vector argument without SSE enabled changes the ABI [-Wpsabi]

Committed as obvious.

2022-01-12  Jakub Jelinek  

PR middle-end/101530
* c-c++-common/builtin-shufflevector-3.c: Add -Wno-psabi to
dg-options.

--- gcc/testsuite/c-c++-common/builtin-shufflevector-3.c.jj 2022-01-11 
23:11:22.830283971 +0100
+++ gcc/testsuite/c-c++-common/builtin-shufflevector-3.c2022-01-12 
09:37:43.013565187 +0100
@@ -1,4 +1,5 @@
 /* { dg-do compile } */
+/* { dg-options "-Wno-psabi" } */
 
 typedef int __attribute__((__vector_size__ (sizeof(int)*4))) V;
 


Jakub



Re: libgfortran bootstrap failure

2022-01-12 Thread Jakub Jelinek via Gcc-patches
On Tue, Jan 11, 2022 at 10:30:26PM -0500, David Edelsohn wrote:
> The recent patch to support Power IEEE128 causes a bootstrap failure
> on AIX and possibly all non-GLIBC systems.
> 
> +#if defined(__powerpc64__) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ \
> +&& defined __GLIBC_PREREQ && __GLIBC_PREREQ (2, 32)
> +#define POWER_IEEE128 1
> +#endif
> 
> __GLIBC_PREREQ is tested on all systems.
> 
> /nasfarm/edelsohn/src/src/libgfortran/libgfortran.h:107:49: error:
> missing binary operator before token "("
>   107 | && defined __GLIBC_PREREQ && __GLIBC_PREREQ (2, 32)
>   | ^

This is what I've committed as obvious:

2022-01-12  Jakub Jelinek  

* libgfortran.h (POWER_IEEE128): Use __GLIBC_PREREQ in a separate
#if directive inside of #if ... && defined __GLIBC_PREREQ.

--- libgfortran/libgfortran.h.jj2022-01-11 23:49:51.759830402 +0100
+++ libgfortran/libgfortran.h   2022-01-12 09:41:48.779107854 +0100
@@ -104,9 +104,11 @@ typedef off_t gfc_offset;
 #endif
 
 #if defined(__powerpc64__) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ \
-&& defined __GLIBC_PREREQ && __GLIBC_PREREQ (2, 32)
+&& defined __GLIBC_PREREQ
+#if __GLIBC_PREREQ (2, 32)
 #define POWER_IEEE128 1
 #endif
+#endif
 
 /* These functions from  should only be used on values that can be
represented as unsigned char, otherwise the behavior is undefined.


Jakub



Re: [PATCH] Mass rename of C++ .c files to .cc suffix

2022-01-12 Thread Martin Liška

On 1/11/22 17:16, Jakub Jelinek wrote:

On Tue, Jan 11, 2022 at 05:03:34PM +0100, Martin Liška wrote:

On 1/11/22 16:56, Jakub Jelinek wrote:

While e.g. libcpp or gcc are in C++.


Which means I should rename .c files under libcpp, right?
Is there any other folder from gcc/ and libcpp/ that would need that as well?


 From the directories that use .c files for C++, I think that is all.
c++tools/, libcody/, libitm/, libsanitizer/, libstdc++-v3/ already
use .cc or .cpp.


All right, I've included libcpp folder in the renaming effort.

Please take a look at V2 of the series here:
https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;a=log;h=refs/users/marxin/heads/cc-renaming

Now I filled up complete ChangeLog entries and I'm asking for a patchset 
approval.

Cheers,
Martin



Jakub





[PATCH] Fix -Wformat-diag for aarch64 target.

2022-01-12 Thread Martin Liška

Hello.

We've got -Wformat-diag for some time and I think we should start using it
in -Werror for GCC bootstrap. The following patch removes last pieces of the 
warning
for aarch64 target.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

* config/aarch64/aarch64.c (aarch64_parse_boolean_options): Use
%qs where possible.
(aarch64_parse_sve_width_string): Likewise.
(aarch64_override_options_internal): Likewise.
(aarch64_print_hint_for_extensions): Likewise.
(aarch64_validate_sls_mitigation): Likewise.
(aarch64_handle_attr_arch): Likewise.
(aarch64_handle_attr_cpu): Likewise.
(aarch64_handle_attr_tune): Likewise.
(aarch64_handle_attr_isa_flags): Likewise.
---
 gcc/config/aarch64/aarch64.c | 26 +-
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 5920b29880a..1bca2a31a1b 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -16337,7 +16337,7 @@ aarch64_parse_boolean_options (const char *option,
   /* We ended with a comma, print something.  */
   if (!(*specs))
 {
-  error ("%s string ill-formed\n", option_name);
+  error ("%qs string ill-formed", option_name);
   return 0;
 }
 
@@ -16393,7 +16393,7 @@ aarch64_parse_sve_width_string (const char *tune_string,

   int n = sscanf (tune_string, "%d", &width);
   if (n == EOF)
 {
-  error ("invalid format for sve_width");
+  error ("invalid format for %");
   return;
 }
   switch (width)
@@ -16405,7 +16405,7 @@ aarch64_parse_sve_width_string (const char *tune_string,
 case SVE_2048:
   break;
 default:
-  error ("invalid sve_width value: %d", width);
+  error ("invalid % value: %d", width);
 }
   tune->sve_width = (enum aarch64_sve_vector_bits_enum) width;
 }
@@ -16628,7 +16628,7 @@ aarch64_override_options_internal (struct gcc_options 
*opts)
   if (opts->x_aarch64_stack_protector_guard_reg_str)
 {
   if (strlen (opts->x_aarch64_stack_protector_guard_reg_str) > 100)
- error ("specify a system register with a small string length.");
+ error ("specify a system register with a small string length");
 }
 
   if (opts->x_aarch64_stack_protector_guard_offset_str)

@@ -16832,7 +16832,7 @@ aarch64_print_hint_for_extensions (const std::string 
&str)
 inform (input_location, "valid arguments are: %s;"
 " did you mean %qs?", s, hint);
   else
-inform (input_location, "valid arguments are: %s;", s);
+inform (input_location, "valid arguments are: %s", s);
 
   XDELETEVEC (s);

 }
@@ -16933,7 +16933,7 @@ aarch64_validate_sls_mitigation (const char *const_str)
temp |= SLS_RETBR;
   else if (strcmp (str, "none") == 0 || strcmp (str, "all") == 0)
{
- error ("%<%s%> must be by itself for %<-mharden-sls=%>", str);
+ error ("%qs must be by itself for %<-mharden-sls=%>", str);
  break;
}
   else
@@ -17572,11 +17572,11 @@ aarch64_handle_attr_arch (const char *str)
error ("missing name in % pragma or attribute");
break;
   case AARCH64_PARSE_INVALID_ARG:
-   error ("invalid name (\"%s\") in % pragma or 
attribute", str);
+   error ("invalid name (%qs) in % pragma or 
attribute", str);
aarch64_print_hint_for_arch (str);
break;
   case AARCH64_PARSE_INVALID_FEATURE:
-   error ("invalid feature modifier %s of value (\"%s\") in "
+   error ("invalid feature modifier %s of value (%qs) in "
   "% pragma or attribute", invalid_extension.c_str (), 
str);
aarch64_print_hint_for_extensions (invalid_extension);
break;
@@ -17614,11 +17614,11 @@ aarch64_handle_attr_cpu (const char *str)
error ("missing name in % pragma or attribute");
break;
   case AARCH64_PARSE_INVALID_ARG:
-   error ("invalid name (\"%s\") in % pragma or 
attribute", str);
+   error ("invalid name (%qs) in % pragma or 
attribute", str);
aarch64_print_hint_for_core (str);
break;
   case AARCH64_PARSE_INVALID_FEATURE:
-   error ("invalid feature modifier %s of value (\"%s\") in "
+   error ("invalid feature modifier %s of value (%qs) in "
   "% pragma or attribute", invalid_extension.c_str (), 
str);
aarch64_print_hint_for_extensions (invalid_extension);
break;
@@ -17645,7 +17645,7 @@ aarch64_handle_attr_cpu (const char *str)
  " attribute");
break;
  case AARCH64_PARSE_INVALID_ARG:
-   error ("invalid protection type (\"%s\") in 
% pragma or attribute", err_str);
break;
  case AARCH64_PARSE_OK:
@@ -17680,7 +17680,7 @@ aarch64_handle_attr_tune (const char *str)
   switch (parse_res)
 {
   case AARCH64_PARSE_INVALID_ARG:
-   error ("invalid name (\"%s\") in % pragma or 
attribute", str);
+   error ("invalid name (

Re: [PATCH] tree-optimization/103551 - Always set EDGE_EXECUTABLE in ranger vrp.

2022-01-12 Thread Richard Biener via Gcc-patches
On Tue, Jan 11, 2022 at 8:49 PM Andrew MacLeod via Gcc-patches
 wrote:
>
> The simplify_and_fold engine uses the EDGE_EXECUTABLE flag to eliminate
> branches:
>
> substitute_and_fold_dom_walker::before_dom_children:
>
>   if (gimple_code (stmt) == GIMPLE_COND)
>  {
>if ((EDGE_SUCC (bb, 0)->flags & EDGE_EXECUTABLE)
>^ (EDGE_SUCC (bb, 1)->flags & EDGE_EXECUTABLE))
>
> ... but if certain passes before VRP2 are turned off, the flag can be in
> an uninitialized state upon entry to VRP2.  This patch simply always
> sets EDGE_EXECUTABLE on all edges before invoking ranger and the S&F engine.
>
> Bootstraps on  x86_64-pc-linux-gnu with no regressions.
>
> OK for trunk?

OK.

> Andrew
>
>
>
>
>


[PATCH] sh-linux fix target cpu

2022-01-12 Thread Yoshinori Sato
sh-linux not supported any SH1 and SH2a little-endian.
Add exceptios it.

gcc/ChangeLog:

* config/sh/t-linux (MULTILIB_EXCEPTIONS): Add m1, mb/m1 and m2a.
---
 gcc/config/sh/t-linux | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/config/sh/t-linux b/gcc/config/sh/t-linux
index d33c6383915..ca323ca25a9 100644
--- a/gcc/config/sh/t-linux
+++ b/gcc/config/sh/t-linux
@@ -1,2 +1,3 @@
 MULTILIB_DIRNAMES= 
-MULTILIB_MATCHES = 
+MULTILIB_MATCHES= 
+MULTILIB_EXCEPTIONS=m1 mb/m1 m2a
-- 
2.30.2



[PATCH] Fix -Wformat-diag for rs6000 target.

2022-01-12 Thread Martin Liška

Hello.

We've got -Wformat-diag for some time and I think we should start using it
in -Werror for GCC bootstrap. The following patch removes last pieces of the 
warning
for rs6000 target.

Ready to be installed?
Thanks,
Martin


gcc/ChangeLog:

* config/rs6000/rs6000-call.c (rs6000_invalid_builtin): Wrap
keywords and use %qs instead of %<%s%>.
(rs6000_expand_builtin): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/bfp/scalar-extract-exp-5.c: Adjust scans in
testcases.
* gcc.target/powerpc/bfp/scalar-extract-sig-5.c: Likewise.
* gcc.target/powerpc/bfp/scalar-insert-exp-11.c: Likewise.
---
 gcc/config/rs6000/rs6000-call.c   | 8 
 .../gcc.target/powerpc/bfp/scalar-extract-exp-5.c | 2 +-
 .../gcc.target/powerpc/bfp/scalar-extract-sig-5.c | 2 +-
 .../gcc.target/powerpc/bfp/scalar-insert-exp-11.c | 2 +-
 4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index c78b8b08c40..becdad73812 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -3307,7 +3307,7 @@ rs6000_invalid_builtin (enum rs6000_gen_builtins fncode)
 "-mvsx");
   break;
 case ENB_IEEE128_HW:
-  error ("%qs requires ISA 3.0 IEEE 128-bit floating point", name);
+  error ("%qs requires ISA 3.0 IEEE 128-bit floating-point", name);
   break;
 case ENB_DFP:
   error ("%qs requires the %qs option", name, "-mhard-dfp");
@@ -5589,20 +5589,20 @@ rs6000_expand_builtin (tree exp, rtx target, rtx /* 
subtarget */,
   if (bif_is_nosoft (*bifaddr)
   && rs6000_isa_flags & OPTION_MASK_SOFT_FLOAT)
 {
-  error ("%<%s%> not supported with %<-msoft-float%>",
+  error ("%qs not supported with %<-msoft-float%>",
 bifaddr->bifname);
   return const0_rtx;
 }
 
   if (bif_is_no32bit (*bifaddr) && TARGET_32BIT)

 {
-  error ("%<%s%> is not supported in 32-bit mode", bifaddr->bifname);
+  error ("%qs is not supported in 32-bit mode", bifaddr->bifname);
   return const0_rtx;
 }
 
   if (bif_is_ibmld (*bifaddr) && !FLOAT128_2REG_P (TFmode))

 {
-  error ("%<%s%> requires % to be IBM 128-bit format",
+  error ("%qs requires % to be IBM 128-bit format",
 bifaddr->bifname);
   return const0_rtx;
 }
diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-5.c 
b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-5.c
index 34184812dc5..1225613b275 100644
--- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-5.c
+++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-5.c
@@ -14,7 +14,7 @@ get_exponent (__ieee128 *p)
 {
   __ieee128 source = *p;
 
-  return scalar_extract_exp (source);	/* { dg-error "requires ISA 3.0 IEEE 128-bit floating point" } */

+  return scalar_extract_exp (source);  /* { dg-error "requires ISA 3.0 IEEE 128-bit 
floating-point" } */
 }
 
 
diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-5.c b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-5.c

index 13c64fc3acf..adf0e4f99df 100644
--- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-5.c
+++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-5.c
@@ -12,5 +12,5 @@ get_significand (__ieee128 *p)
 {
   __ieee128 source = *p;
 
-  return (long long int) __builtin_vec_scalar_extract_sig (source); /* { dg-error "requires ISA 3.0 IEEE 128-bit floating point" } */

+  return (long long int) __builtin_vec_scalar_extract_sig (source); /* { dg-error 
"requires ISA 3.0 IEEE 128-bit floating-point" } */
 }
diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-11.c 
b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-11.c
index a5dd852e60f..6787a67950b 100644
--- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-11.c
+++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-11.c
@@ -16,5 +16,5 @@ insert_exponent (__ieee128 *significand_p,
   __ieee128 significand = *significand_p;
   unsigned long long int exponent = *exponent_p;
 
-  return scalar_insert_exp (significand, exponent); /* { dg-error "requires ISA 3.0 IEEE 128-bit floating point" } */

+  return scalar_insert_exp (significand, exponent); /* { dg-error "requires ISA 3.0 
IEEE 128-bit floating-point" } */
 }
--
2.34.1



[PATCH] Fix -Wformat-diag for s390x target.

2022-01-12 Thread Martin Liška

Hello.

We've got -Wformat-diag for some time and I think we should start using it
in -Werror for GCC bootstrap. The following patch removes last pieces of the 
warning
for s390x target.

Ready to be installed?
Thanks,
Martin


gcc/ChangeLog:

* config/s390/s390-c.c (s390_expand_overloaded_builtin): Wrap
keyword in quotes.
(s390_resolve_overloaded_builtin): Remove trailing dot.
* config/s390/s390.c (s390_const_operand_ok): Use - for range.
(s390_expand_builtin): Remove trailing dot.
(s390_emit_prologue): Likewise, use semicolon.
(s390_option_override_internal): Update keyword.
* varasm.c (do_assemble_alias): Wrap keyword in quotes.
---
 gcc/config/s390/s390-c.c |  9 +
 gcc/config/s390/s390.c   | 28 ++--
 gcc/varasm.c |  2 +-
 3 files changed, 20 insertions(+), 19 deletions(-)

diff --git a/gcc/config/s390/s390-c.c b/gcc/config/s390/s390-c.c
index 600018421df..10bc6ac8900 100644
--- a/gcc/config/s390/s390-c.c
+++ b/gcc/config/s390/s390-c.c
@@ -484,7 +484,8 @@ s390_expand_overloaded_builtin (location_t loc,
 case S390_OVERLOADED_BUILTIN_s390_vec_step:
   if (TREE_CODE (TREE_TYPE ((*arglist)[0])) != VECTOR_TYPE)
{
- error_at (loc, "builtin vec_step can only be used on vector types.");
+ error_at (loc, "builtin %qs can only be used on vector types",
+   "vec_step ");
  return error_mark_node;
}
   return build_int_cst (NULL_TREE,
@@ -905,7 +906,7 @@ s390_resolve_overloaded_builtin (location_t loc,
   if (ob_flags & B_INT)
{
  error_at (loc,
-   "builtin %qF is for GCC internal use only.",
+   "builtin %qF is for GCC internal use only",
ob_fndecl);
  return error_mark_node;
}
@@ -913,7 +914,7 @@ s390_resolve_overloaded_builtin (location_t loc,
 }
 
   if (ob_flags & B_DEP)

-warning_at (loc, 0, "builtin %qF is deprecated.", ob_fndecl);
+warning_at (loc, 0, "builtin %qF is deprecated", ob_fndecl);
 
   if (!TARGET_VX && (ob_flags & B_VX))

 {
@@ -1021,7 +1022,7 @@ s390_resolve_overloaded_builtin (location_t loc,
 }
 
   if (bflags_overloaded_builtin_var[last_match_index] & B_DEP)

-warning_at (loc, 0, "%qs matching variant is deprecated.",
+warning_at (loc, 0, "%qs matching variant is deprecated",
IDENTIFIER_POINTER (DECL_NAME (ob_fndecl)));
 
   /* Overloaded variants which have MAX set as low level builtin are

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 056002e4a4a..bf96cbf7588 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -766,7 +766,7 @@ s390_const_operand_ok (tree arg, int argnum, int op_flags, 
tree decl)
 argnum, decl, values);
}
  else
-   error ("constant argument %d for builtin %qF is out of range 
(0..%wu)",
+   error ("constant argument %d for builtin %qF is out of range 
(0-%wu)",
   argnum, decl, (HOST_WIDE_INT_1U << bitwidth) - 1);
 
 	  return false;

@@ -783,7 +783,7 @@ s390_const_operand_ok (tree arg, int argnum, int op_flags, 
tree decl)
  || tree_to_shwi (arg) > ((HOST_WIDE_INT_1 << (bitwidth - 1)) - 1))
{
  error ("constant argument %d for builtin %qF is out of range "
-"(%wd..%wd)", argnum, decl,
+"(%wd-%wd)", argnum, decl,
 -(HOST_WIDE_INT_1 << (bitwidth - 1)),
 (HOST_WIDE_INT_1 << (bitwidth - 1)) - 1);
  return false;
@@ -832,25 +832,25 @@ s390_expand_builtin (tree exp, rtx target, rtx subtarget 
ATTRIBUTE_UNUSED,
   if ((bflags & B_HTM) && !TARGET_HTM)
{
  error ("builtin %qF is not supported without %<-mhtm%> "
-"(default with %<-march=zEC12%> and higher).", fndecl);
+"(default with %<-march=zEC12%> and higher)", fndecl);
  return const0_rtx;
}
   if (((bflags & B_VX) || (bflags & B_VXE)) && !TARGET_VX)
{
  error ("builtin %qF requires %<-mvx%> "
-"(default with %<-march=z13%> and higher).", fndecl);
+"(default with %<-march=z13%> and higher)", fndecl);
  return const0_rtx;
}
 
   if ((bflags & B_VXE) && !TARGET_VXE)

{
- error ("Builtin %qF requires z14 or higher.", fndecl);
+ error ("Builtin %qF requires z14 or higher", fndecl);
  return const0_rtx;
}
 
   if ((bflags & B_VXE2) && !TARGET_VXE2)

{
- error ("Builtin %qF requires z15 or higher.", fndecl);
+ error ("Builtin %qF requires z15 or higher", fndecl);
  return const0_rtx;
}
 
@@ -11464,8 +11464,8 @@ s390_emit_prologue (void)

{
  warning (0, "frame size of function %qs is %wd"
   " bytes exceeding user provided stack limit of "
-  "

[PATCH] libstdc++: Add attribute to features deprecated in C++17 [PR91260]

2022-01-12 Thread Jonathan Wakely via Gcc-patches
This passes testing (with -std=gnu++98/11/17/20) but is quite a large
patch for this late in stage 3. Does anybody object to doing this now?
The bugs it fixes were closed as INVALID because we're not actually
*required* to remove these or deprecate them. But users are right to
complain about us silently accepting use of things like std::bind1st in
C++20 mode. Probably WONTFIX would have been better, and in fact we can
fix them (as this patch does). If we don't do this, users will keep
reporting bugs about it, but it could wait for stage 1 if needed.

=== >8 === >8 === >8 ===

There are a lot of things in the C++ standard library which were
deprecated in C++11, and more in C++17.  Some of them were removed after
deprecation and are no longer present in the standard at all. We have
not removed these from libstdc++ because keeping them as non-standard
extensions is conforming, and avoids gratuitously breaking user code,
and in some cases we need to keep using them to avoid ABI changes. But
we should at least give a warning for using them. That has not been done
previously because of the library's own uses of them (e.g. the
std::iterator class template used as a base class).

This adds deprecated attributes to the relevant components, and then
goes through the whole library to add diagnostic pragmas where needed to
suppress warnings about our internal uses of them. The tests are updated
to either expect the additional warnings, or to suppress them where we
aren't interested in them.

libstdc++-v3/ChangeLog:

PR libstdc++/91260
PR libstdc++/91383
PR libstdc++/95065
* include/backward/binders.h (bind1st, bind2nd): Add deprecated
attribute.
* include/bits/refwrap.h (_Maybe_unary_or_binary_function):
Disable deprecated warnings for base classes.
(_Reference_wrapper_base): Likewise.
* include/bits/shared_ptr_base.h (_Sp_owner_less): Likewise.
* include/bits/stl_bvector.h (_Bit_iterator_base): Likewise.
* include/bits/stl_function.h (unary_function, binary_function):
Add deprecated attribute.
(unary_negate, not1, binary_negate, not2, ptr_fun)
(pointer_to_unary_function, pointer_to_binary_function)
(mem_fun_t, const_mem_fun_t, mem_fun_ref_t, const_mem_fun_ref_t)
(mem_fun1_t, const_mem_fun1_t, mem_fun_ref1_t)
(const_mem_fun1_ref_t, mem_fun, mem_fun_ref): Add deprecated
attributes.
* include/bits/stl_iterator.h: Disable deprecated warnings for
std::iterator base classes.
* include/bits/stl_iterator_base_types.h (iterator): Add
deprecated attribute.
* include/bits/stl_map.h (map::value_compare): Disable
deprecated warnings for base class.
* include/bits/stl_multimap.h (multimap::value_compare):
Likewise.
* include/bits/stl_raw_storage_iter.h (raw_storage_iterator):
Add deprecated attribute.
* include/bits/stl_tempbuf.h (get_temporary_buffer): Likewise.
* include/bits/stream_iterator.h: Disable deprecated warnings.
* include/bits/streambuf_iterator.h: Likewise.
* include/ext/bitmap_allocator.h: Remove unary_function base
classes.
* include/ext/functional: Disable deprecated warnings.
* include/ext/rope: Likewise.
* include/ext/throw_allocator.h: Likewise.
* include/std/type_traits (result_of): Add deprecated attribute.
* include/tr1/functional: Disable deprecated warnings.
* include/tr1/functional_hash.h: Likewise.
* testsuite/20_util/function_objects/binders/1.cc: Add
-Wno-disable-deprecations.
* testsuite/20_util/function_objects/binders/3113.cc: Likewise.
* testsuite/20_util/function_objects/constexpr.cc: Add
dg-warning.
* testsuite/20_util/raw_storage_iterator/base.cc: Likewise.
* testsuite/20_util/raw_storage_iterator/dr2127.cc: Likewise.
* testsuite/20_util/raw_storage_iterator/requirements/base_classes.cc:
Likewise.
* 
testsuite/20_util/raw_storage_iterator/requirements/explicit_instantiation/1.cc:
Likewise.
* testsuite/20_util/raw_storage_iterator/requirements/typedefs.cc:
Likewise.
* testsuite/20_util/reference_wrapper/24803.cc:
Likewise.
* testsuite/20_util/reference_wrapper/typedefs.cc: Enable for
C++20 and check for absence of nested types.
* testsuite/20_util/shared_ptr/comparison/less.cc: Remove
std::binary_function base class.
* testsuite/20_util/temporary_buffer.cc: Add dg-warning.
* testsuite/21_strings/basic_string/cons/char/69092.cc: Remove
std::iterator base class.
* 
testsuite/24_iterators/back_insert_iterator/requirements/base_classes.cc:
Likewise.
* 
testsuite/24_iterators/front_insert_iterator/requirements/base_classes.cc:
Likewise.
* testsuite/24_iterators/insert_iterator/requir

[PATCH] libgomp, OpenMP: Fix issue for omp_get_device_num on gfx targets.

2022-01-12 Thread Marcel Vollweiler

Hi,

Currently omp_get_device_num does not work on gcn targets with more than
one offload device. The reason is that GOMP_DEVICE_NUM_VAR is static in
icv-device.c and thus "__gomp_device_num" is not visible in the offload
image.

This patch removes "static" such that "__gomp_device_num" is now part of
the offload image and can now be found in GOMP_OFFLOAD_load_image in the
plugin.

This is not an issue for nvptx. There, "__gomp_device_num" is in the
offload image even with "static".

The patch was tested on x86_64-linux with amdgcn offloading with no
regressions.

Marcel
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
libgomp, OpenMP: Fix issue for omp_get_device_num on gcn targets.

Currently omp_get_device_num does not work on gcn targets with more than one
offload device. The reason is that GOMP_DEVICE_NUM_VAR is static in
icv-device.c and thus "__gomp_device_num" is not visible in the offload image.

This patch removes "static" such that "__gomp_device_num" is now part of the
offload image and can now be found in GOMP_OFFLOAD_load_image in the plugin.

This is not an issue for nvptx. There, "__gomp_device_num" is in the offload
image even with "static".

libgomp/ChangeLog:

* config/gcn/icv-device.c: Make GOMP_DEVICE_NUM_VAR public (remove
"static") to make the device num available in the offload image.

diff --git a/libgomp/config/gcn/icv-device.c b/libgomp/config/gcn/icv-device.c
index fcfa0f3..f70b7e6 100644
--- a/libgomp/config/gcn/icv-device.c
+++ b/libgomp/config/gcn/icv-device.c
@@ -60,7 +60,7 @@ omp_is_initial_device (void)
 
 /* This is set to the device number of current GPU during device 
initialization,
when the offload image containing this libgomp portion is loaded.  */
-static volatile int GOMP_DEVICE_NUM_VAR;
+volatile int GOMP_DEVICE_NUM_VAR;
 
 int
 omp_get_device_num (void)


Re: [patch] Fix reverse SSO issues in IPA-SRA

2022-01-12 Thread Eric Botcazou via Gcc-patches
> Thanks for the fixes, the forgotten duplication and streaming were quite
> embarrassing, but one reason is that there are few reverse SSO testcases
> in the suite.  Would it be difficult to add some covering the issues you
> fixed?

No, I think I should be able to cover 3 out of the 4 changes, see below.

> ...is this strictly necessary?  I know it is inconsistent but the
> certain flag is fairly special.  More importantly...

No strong opinion, but that's really inconsistent...

> > @@ -3349,6 +3346,7 @@ pull_accesses_from_callee (cgraph_node *caller,
> > isra_param_desc *param_desc,> 
> >   copy->type = argacc->type;
> >   copy->alias_ptr_type = argacc->alias_ptr_type;
> >   copy->certain = true;
> > 
> > + copy->reverse = argacc->reverse;
> > 
> >   vec_safe_push (param_desc->accesses, copy);
> > 
> > }
> > 
> >else if (prop_kinds[j] == ACC_PROP_CERTAIN)
> 
> ...earlier in the function, there is a check doing:
> 
> if (argacc->alias_ptr_type != pacc->alias_ptr_type
> 
> || !types_compatible_p (argacc->type, pacc->type))
> 
>   return "propagated access types would not match existing 
ones";
> 
> Will the types_compatible_p catch the case when one type is a
> reverse-SSO and the other is not?  If not, we probably need to check
> that the flags are the same too.

That's the change I don't know how to cover and I agree that the check looks 
in order, but I presume that the flag still needs to be copied onto "copy"?

-- 
Eric Botcazou




[PATCH] Fix redefined macro for epiphany.

2022-01-12 Thread Martin Liška

The following warning is emitted gazillion times.

Fixes:

In file included from ./tm.h:23,
 from gcc/genconfig.c:25:
gcc/config/elfos.h:209: warning: "READONLY_DATA_SECTION_ASM_OP" redefined
  209 | #define READONLY_DATA_SECTION_ASM_OP"\t.section\t.rodata"
  |
In file included from ./tm.h:21,
 from gcc/genconfig.c:25:
gcc/config/epiphany/epiphany.h:671: note: this is the location of the previous 
definition
  671 | #define READONLY_DATA_SECTION_ASM_OP"\t.section .rodata"

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

* config/elfos.h (READONLY_DATA_SECTION_ASM_OP): Define only if
not defined.
---
 gcc/config/elfos.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/config/elfos.h b/gcc/config/elfos.h
index 2e0c709e585..add949ec940 100644
--- a/gcc/config/elfos.h
+++ b/gcc/config/elfos.h
@@ -206,7 +206,9 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  
If not, see
 #define ASCII_DATA_ASM_OP  "\t.ascii\t"
 
 /* Support a read-only data section.  */

+#ifndef READONLY_DATA_SECTION_ASM_OP
 #define READONLY_DATA_SECTION_ASM_OP   "\t.section\t.rodata"
+#endif
 
 /* On svr4, we *do* have support for the .init and .fini sections, and we

can put stuff in there to be executed before and after `main'.  We let
--
2.34.1



Re: [PATCH] Fix -Wformat-diag for s390x target.

2022-01-12 Thread Richard Biener via Gcc-patches
On Wed, Jan 12, 2022 at 10:05 AM Martin Liška  wrote:
>
> Hello.
>
> We've got -Wformat-diag for some time and I think we should start using it
> in -Werror for GCC bootstrap. The following patch removes last pieces of the 
> warning
> for s390x target.
>
> Ready to be installed?
> Thanks,
> Martin
>
>
> gcc/ChangeLog:
>
> * config/s390/s390-c.c (s390_expand_overloaded_builtin): Wrap
> keyword in quotes.
> (s390_resolve_overloaded_builtin): Remove trailing dot.
> * config/s390/s390.c (s390_const_operand_ok): Use - for range.
> (s390_expand_builtin): Remove trailing dot.
> (s390_emit_prologue): Likewise, use semicolon.
> (s390_option_override_internal): Update keyword.
> * varasm.c (do_assemble_alias): Wrap keyword in quotes.
> ---
>   gcc/config/s390/s390-c.c |  9 +
>   gcc/config/s390/s390.c   | 28 ++--
>   gcc/varasm.c |  2 +-
>   3 files changed, 20 insertions(+), 19 deletions(-)
>
> diff --git a/gcc/config/s390/s390-c.c b/gcc/config/s390/s390-c.c
> index 600018421df..10bc6ac8900 100644
> --- a/gcc/config/s390/s390-c.c
> +++ b/gcc/config/s390/s390-c.c
> @@ -484,7 +484,8 @@ s390_expand_overloaded_builtin (location_t loc,
>   case S390_OVERLOADED_BUILTIN_s390_vec_step:
> if (TREE_CODE (TREE_TYPE ((*arglist)[0])) != VECTOR_TYPE)
> {
> - error_at (loc, "builtin vec_step can only be used on vector 
> types.");
> + error_at (loc, "builtin %qs can only be used on vector types",
> +   "vec_step ");

extra space in "vec_step "?

>   return error_mark_node;
> }
> return build_int_cst (NULL_TREE,
> @@ -905,7 +906,7 @@ s390_resolve_overloaded_builtin (location_t loc,
> if (ob_flags & B_INT)
> {
>   error_at (loc,
> -   "builtin %qF is for GCC internal use only.",
> +   "builtin %qF is for GCC internal use only",
> ob_fndecl);
>   return error_mark_node;
> }
> @@ -913,7 +914,7 @@ s390_resolve_overloaded_builtin (location_t loc,
>   }
>
> if (ob_flags & B_DEP)
> -warning_at (loc, 0, "builtin %qF is deprecated.", ob_fndecl);
> +warning_at (loc, 0, "builtin %qF is deprecated", ob_fndecl);
>
> if (!TARGET_VX && (ob_flags & B_VX))
>   {
> @@ -1021,7 +1022,7 @@ s390_resolve_overloaded_builtin (location_t loc,
>   }
>
> if (bflags_overloaded_builtin_var[last_match_index] & B_DEP)
> -warning_at (loc, 0, "%qs matching variant is deprecated.",
> +warning_at (loc, 0, "%qs matching variant is deprecated",
> IDENTIFIER_POINTER (DECL_NAME (ob_fndecl)));
>
> /* Overloaded variants which have MAX set as low level builtin are
> diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
> index 056002e4a4a..bf96cbf7588 100644
> --- a/gcc/config/s390/s390.c
> +++ b/gcc/config/s390/s390.c
> @@ -766,7 +766,7 @@ s390_const_operand_ok (tree arg, int argnum, int 
> op_flags, tree decl)
>  argnum, decl, values);
> }
>   else
> -   error ("constant argument %d for builtin %qF is out of range 
> (0..%wu)",
> +   error ("constant argument %d for builtin %qF is out of range 
> (0-%wu)",
>argnum, decl, (HOST_WIDE_INT_1U << bitwidth) - 1);
>
>   return false;
> @@ -783,7 +783,7 @@ s390_const_operand_ok (tree arg, int argnum, int 
> op_flags, tree decl)
>   || tree_to_shwi (arg) > ((HOST_WIDE_INT_1 << (bitwidth - 1)) - 1))
> {
>   error ("constant argument %d for builtin %qF is out of range "
> -"(%wd..%wd)", argnum, decl,
> +"(%wd-%wd)", argnum, decl,
>  -(HOST_WIDE_INT_1 << (bitwidth - 1)),
>  (HOST_WIDE_INT_1 << (bitwidth - 1)) - 1);
>   return false;
> @@ -832,25 +832,25 @@ s390_expand_builtin (tree exp, rtx target, rtx 
> subtarget ATTRIBUTE_UNUSED,
> if ((bflags & B_HTM) && !TARGET_HTM)
> {
>   error ("builtin %qF is not supported without %<-mhtm%> "
> -"(default with %<-march=zEC12%> and higher).", fndecl);
> +"(default with %<-march=zEC12%> and higher)", fndecl);
>   return const0_rtx;
> }
> if (((bflags & B_VX) || (bflags & B_VXE)) && !TARGET_VX)
> {
>   error ("builtin %qF requires %<-mvx%> "
> -"(default with %<-march=z13%> and higher).", fndecl);
> +"(default with %<-march=z13%> and higher)", fndecl);
>   return const0_rtx;
> }
>
> if ((bflags & B_VXE) && !TARGET_VXE)
> {
> - error ("Builtin %qF requires z14 or higher.", fndecl);
> + error ("Builtin %qF requires z14 or higher", fndecl);
>   return const0_rtx;
> }
>
> if ((bflags & B_VXE2) && !TARGET_VXE2)
> {
> - error ("Builtin %qF requires z15 or higher.", fndecl)

[PATCH] epiphany: fix -Wimplicit-fallthrough warnings in epiphany.c.

2022-01-12 Thread Martin Liška

This is RFC.

Cheers,
Martin

gcc/ChangeLog:

* config/epiphany/epiphany.c (epiphany_mode_priority):
Use gcc_unreachable for not handled cases.
---
 gcc/config/epiphany/epiphany.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/config/epiphany/epiphany.c b/gcc/config/epiphany/epiphany.c
index a2743436f38..88283419dc1 100644
--- a/gcc/config/epiphany/epiphany.c
+++ b/gcc/config/epiphany/epiphany.c
@@ -2369,6 +2369,7 @@ epiphany_mode_priority (int entity, int priority)
  case 2: return (epiphany_normal_fp_rounding == FP_MODE_ROUND_NEAREST
  ? FP_MODE_ROUND_TRUNC : FP_MODE_ROUND_NEAREST);
  case 3: return FP_MODE_CALLER;
+ default: gcc_unreachable ();
  }
   case FP_MODE_ROUND_NEAREST:
   case FP_MODE_CALLER:
@@ -2378,6 +2379,7 @@ epiphany_mode_priority (int entity, int priority)
  case 1: return FP_MODE_ROUND_TRUNC;
  case 2: return FP_MODE_INT;
  case 3: return FP_MODE_CALLER;
+ default: gcc_unreachable ();
  }
   case FP_MODE_ROUND_TRUNC:
switch (priority)
@@ -2386,6 +2388,7 @@ epiphany_mode_priority (int entity, int priority)
  case 1: return FP_MODE_ROUND_NEAREST;
  case 2: return FP_MODE_INT;
  case 3: return FP_MODE_CALLER;
+ default: gcc_unreachable ();
  }
   case FP_MODE_ROUND_UNKNOWN:
   case FP_MODE_NONE:
--
2.34.1



[PATCH] epiphany: fix -Wimplicit-fallthrough warnings in epiphany.c.

2022-01-12 Thread Martin Liška

This fixes -Wformat-diag warnings.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

* config/epiphany/epiphany.c (epiphany_mode_priority):
Use gcc_unreachable for not handled cases.
---
 gcc/config/epiphany/epiphany.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/config/epiphany/epiphany.c b/gcc/config/epiphany/epiphany.c
index a2743436f38..88283419dc1 100644
--- a/gcc/config/epiphany/epiphany.c
+++ b/gcc/config/epiphany/epiphany.c
@@ -2369,6 +2369,7 @@ epiphany_mode_priority (int entity, int priority)
  case 2: return (epiphany_normal_fp_rounding == FP_MODE_ROUND_NEAREST
  ? FP_MODE_ROUND_TRUNC : FP_MODE_ROUND_NEAREST);
  case 3: return FP_MODE_CALLER;
+ default: gcc_unreachable ();
  }
   case FP_MODE_ROUND_NEAREST:
   case FP_MODE_CALLER:
@@ -2378,6 +2379,7 @@ epiphany_mode_priority (int entity, int priority)
  case 1: return FP_MODE_ROUND_TRUNC;
  case 2: return FP_MODE_INT;
  case 3: return FP_MODE_CALLER;
+ default: gcc_unreachable ();
  }
   case FP_MODE_ROUND_TRUNC:
switch (priority)
@@ -2386,6 +2388,7 @@ epiphany_mode_priority (int entity, int priority)
  case 1: return FP_MODE_ROUND_NEAREST;
  case 2: return FP_MODE_INT;
  case 3: return FP_MODE_CALLER;
+ default: gcc_unreachable ();
  }
   case FP_MODE_ROUND_UNKNOWN:
   case FP_MODE_NONE:
--
2.34.1



[PATCH] epiphany: fir -Wformat-diag.

2022-01-12 Thread Martin Liška

This fixes -Wformat-diag warnings.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

* config/epiphany/epiphany.c (epiphany_handle_interrupt_attribute):
Use %qs format specifier.
(epiphany_override_options): Wrap keyword in %<, %>.
---
 gcc/config/epiphany/epiphany.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/gcc/config/epiphany/epiphany.c b/gcc/config/epiphany/epiphany.c
index 88283419dc1..c9c321560de 100644
--- a/gcc/config/epiphany/epiphany.c
+++ b/gcc/config/epiphany/epiphany.c
@@ -520,8 +520,10 @@ epiphany_handle_interrupt_attribute (tree *node, tree 
name, tree args,
   && strcmp (TREE_STRING_POINTER (value), "swi"))
 {
   warning (OPT_Wattributes,
-  "argument of %qE attribute is not \"reset\", \"software_exception\", \"page_miss\", \"timer0\", \"timer1\", 
\"message\", \"dma0\", \"dma1\", \"wand\" or \"swi\"",
-  name);
+  "argument of %qE attribute is not %qs, %qs %qs, %qs, %qs, "
+  "%qs, %qs, %qs, %qs or %qs", name,
+  "reset", "software_exception", "page_miss", "timer0", "timer1",
+  "message", "dma0", "dma1", "wand", "swi");
   *no_add_attrs = true;
   return NULL_TREE;
 }
@@ -1538,9 +1540,9 @@ static void
 epiphany_override_options (void)
 {
   if (epiphany_stack_offset < 4)
-error ("stack_offset must be at least 4");
+error ("% must be at least 4");
   if (epiphany_stack_offset & 3)
-error ("stack_offset must be a multiple of 4");
+error ("% must be a multiple of 4");
   epiphany_stack_offset = (epiphany_stack_offset + 3) & -4;
  if (!TARGET_SOFT_CMPSF)
flag_finite_math_only = 1;
--
2.34.1



Re: [PATCH] epiphany: fix -Wimplicit-fallthrough warnings in epiphany.c.

2022-01-12 Thread Martin Liška

On 1/12/22 10:56, Martin Liška wrote:

This fixes -Wformat-diag warnings.

Ready to be installed?
Thanks,
Martin


Please forget about this email, it's a dup.

Martin


Re: [PATCH] Fix redefined macro for epiphany.

2022-01-12 Thread Richard Biener via Gcc-patches
On Wed, Jan 12, 2022 at 10:47 AM Martin Liška  wrote:
>
> The following warning is emitted gazillion times.
>
> Fixes:
>
> In file included from ./tm.h:23,
>   from gcc/genconfig.c:25:
> gcc/config/elfos.h:209: warning: "READONLY_DATA_SECTION_ASM_OP" redefined
>209 | #define READONLY_DATA_SECTION_ASM_OP"\t.section\t.rodata"
>|
> In file included from ./tm.h:21,
>   from gcc/genconfig.c:25:
> gcc/config/epiphany/epiphany.h:671: note: this is the location of the 
> previous definition
>671 | #define READONLY_DATA_SECTION_ASM_OP"\t.section .rodata"
>
> Ready to be installed?

Most other targets #undef READONLY_DATA_SECION_ASM_OP before defining,
isn't the issue that the target doesn't do that and that it has
uncommon order of
tm_file in config.gcc?  (${tm_file} first, before elfos.h)

Richard.

> Thanks,
> Martin
>
> gcc/ChangeLog:
>
> * config/elfos.h (READONLY_DATA_SECTION_ASM_OP): Define only if
> not defined.
> ---
>   gcc/config/elfos.h | 2 ++
>   1 file changed, 2 insertions(+)
>
> diff --git a/gcc/config/elfos.h b/gcc/config/elfos.h
> index 2e0c709e585..add949ec940 100644
> --- a/gcc/config/elfos.h
> +++ b/gcc/config/elfos.h
> @@ -206,7 +206,9 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  
> If not, see
>   #define ASCII_DATA_ASM_OP "\t.ascii\t"
>
>   /* Support a read-only data section.  */
> +#ifndef READONLY_DATA_SECTION_ASM_OP
>   #define READONLY_DATA_SECTION_ASM_OP  "\t.section\t.rodata"
> +#endif
>
>   /* On svr4, we *do* have support for the .init and .fini sections, and we
>  can put stuff in there to be executed before and after `main'.  We let
> --
> 2.34.1
>


Re: [PATCH] Fix redefined macro for epiphany.

2022-01-12 Thread Martin Liška

On 1/12/22 11:03, Richard Biener wrote:

tm_file in config.gcc?  (${tm_file} first, before elfos.h)


Heh, yes, that's the case. The following patch fixes that.

May I install it?
Cheers,
MartinFrom 14fa27ba17cf9aa1a971ae93f8a5ef6aa857c67e Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Wed, 12 Jan 2022 11:09:20 +0100
Subject: [PATCH] Include elfos.h before ${tm_file}.

Fixes:

In file included from ./tm.h:23,
  from gcc/genconfig.c:25:
gcc/config/elfos.h:209: warning: "READONLY_DATA_SECTION_ASM_OP" redefined
   209 | #define READONLY_DATA_SECTION_ASM_OP"\t.section\t.rodata"
   |
In file included from ./tm.h:21,
  from gcc/genconfig.c:25:
gcc/config/epiphany/epiphany.h:671: note: this is the location of the previous definition
   671 | #define READONLY_DATA_SECTION_ASM_OP"\t.section .rodata"

gcc/ChangeLog:

	* config.gcc: Include elfos.h before ${tm_file}.
---
 gcc/config.gcc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 89e0992fda5..f1b8d832c6f 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -1620,7 +1620,7 @@ csky-*-*)
 	esac
 	;;
 epiphany-*-elf | epiphany-*-rtems*)
-	tm_file="${tm_file} dbxelf.h elfos.h"
+	tm_file="dbxelf.h elfos.h ${tm_file}"
 	tmake_file="${tmake_file} epiphany/t-epiphany"
 	case ${target} in
 	epiphany-*-rtems*)
-- 
2.34.1



[vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only

2022-01-12 Thread Andre Vieira (lists) via Gcc-patches

Hi,

This a fix for the regression caused by '[vect] Re-analyze all modes for 
epilogues'. The earlier patch assumed there was always at least one 
other mode than VOIDmode, but that does not need to be the case.
If we are dealing with a target that does not define more modes for 
'autovectorize_vector_modes', the behaviour before the patch would be to 
try to create an epilogue for the same autodetected_vector_mode, which 
unless the target supported partial vectors would always fail. So as a 
fix I suggest trying to vectorize the epilogue with the 
preferred_simd_mode for QI, mimicking autovectorize_vector_mode, which 
will be skipped if it is not a vector_mode (since that already should 
indicate partial vectors aren't possible) or if no partial vectors are 
supported and its pessimistic NUNITS is larger than the main loop's VF.


Currently bootstrapping and regression testing, otherwise OK for trunk? 
Can someone verify this fixes the issue for PR103971 on powerpc?


gcc/ChangeLog:

    * tree-vect-loop.c (vect-analyze-loop): Handle scenario where 
target

    does add autovectorize_vector_modes.
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 
6ed2b5f8724e5ebf27592f67d7f6bdfe1ebcf512..c81ebc411312e649f9cd954895244c60c928fee1
 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -3024,6 +3024,18 @@ vect_analyze_loop (class loop *loop, vec_info_shared 
*shared)
  ordering is not guaranteed, so we could end up picking a mode for the main
  loop that is after the epilogue's optimal mode.  */
   mode_i = 1;
+  /* If we only had VOIDmode then push the AUTODETECTED_VECTOR_MODE to see if
+ an epilogue can be created with that mode.  */
+  if (vector_modes.length () == 1)
+{
+  machine_mode preferred_mode
+   = targetm.vectorize.preferred_simd_mode (QImode);
+  /* If the preferred mode isn't a vector mode we will not be needing an
+ epilogue.  */
+  if (!VECTOR_MODE_P (preferred_mode))
+   return first_loop_vinfo;
+  vector_modes.safe_push (preferred_mode);
+}
   bool supports_partial_vectors = partial_vectors_supported_p ();
   poly_uint64 first_vinfo_vf = LOOP_VINFO_VECT_FACTOR (first_loop_vinfo);
 


Re: [patch] Fix reverse SSO issues in IPA-SRA

2022-01-12 Thread Martin Jambor
Hi,

On Wed, Jan 12 2022, Eric Botcazou wrote:
>> Thanks for the fixes, the forgotten duplication and streaming were quite
>> embarrassing, but one reason is that there are few reverse SSO testcases
>> in the suite.  Would it be difficult to add some covering the issues you
>> fixed?
>
> No, I think I should be able to cover 3 out of the 4 changes, see below.
>
>> ...is this strictly necessary?  I know it is inconsistent but the
>> certain flag is fairly special.  More importantly...
>
> No strong opinion, but that's really inconsistent...
>
>> > @@ -3349,6 +3346,7 @@ pull_accesses_from_callee (cgraph_node *caller,
>> > isra_param_desc *param_desc,> 
>> >  copy->type = argacc->type;
>> >  copy->alias_ptr_type = argacc->alias_ptr_type;
>> >  copy->certain = true;
>> > 
>> > +copy->reverse = argacc->reverse;
>> > 
>> >  vec_safe_push (param_desc->accesses, copy);
>> >
>> >}
>> >
>> >else if (prop_kinds[j] == ACC_PROP_CERTAIN)
>> 
>> ...earlier in the function, there is a check doing:
>> 
>>if (argacc->alias_ptr_type != pacc->alias_ptr_type
>> 
>>|| !types_compatible_p (argacc->type, pacc->type))
>> 
>>  return "propagated access types would not match existing 
> ones";
>> 
>> Will the types_compatible_p catch the case when one type is a
>> reverse-SSO and the other is not?  If not, we probably need to check
>> that the flags are the same too.
>
> That's the change I don't know how to cover and I agree that the check looks 
> in order,

It might be indeed difficult to trigger the inconsistency without some
rather unsavory type casting.  Basically it would mean we would have
something like...

f (struct with_reverse s)
{
  use (s.field);
}

g (struct without_reverse_buth_otherwise_identical s)
{
  f ((struct with_reverse) s);
}

However with LTO, in incorrect programs, this might happen even without
the typecast and so IPA might encounter the situation even without an
explicit typecast.


>  but I presume that the flag still needs to be copied onto "copy"?
>

Yes, it must still be copied.  

Thanks,

Martin


Re: [PATCH] Fortran: make IEEE_CLASS recognize signaling NaNs

2022-01-12 Thread FX via Gcc-patches
Hi,

I can confirm that I don’t see this failure on a Debian bullseye/sid (Linux 
5.11.0-46, glibc 2.31-0ubuntu9.2) with a fresh bootstrap of master:

$ grep signaling testsuite/gfortran/gfortran.sum
PASS: gfortran.dg/ieee/signaling_1.f90   -O0  (test for excess errors)
PASS: gfortran.dg/ieee/signaling_1.f90   -O0  execution test
PASS: gfortran.dg/ieee/signaling_1.f90   -O1  (test for excess errors)
PASS: gfortran.dg/ieee/signaling_1.f90   -O1  execution test
PASS: gfortran.dg/ieee/signaling_1.f90   -O2  (test for excess errors)
PASS: gfortran.dg/ieee/signaling_1.f90   -O2  execution test
PASS: gfortran.dg/ieee/signaling_1.f90   -O3 -fomit-frame-pointer 
-funroll-loops -fpeel-loops -ftracer -finline-functions  (test for excess 
errors)
PASS: gfortran.dg/ieee/signaling_1.f90   -O3 -fomit-frame-pointer 
-funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
PASS: gfortran.dg/ieee/signaling_1.f90   -O3 -g  (test for excess errors)
PASS: gfortran.dg/ieee/signaling_1.f90   -O3 -g  execution test
PASS: gfortran.dg/ieee/signaling_1.f90   -Os  (test for excess errors)
PASS: gfortran.dg/ieee/signaling_1.f90   -Os  execution test

It’s showing on some gcc-testresults for x86_64 and aarch64 linux, so I’ll need 
someone (HJ, Andreas are in CC) to show me what the error is.

It may be related to the use of dg-options instead of dg-additional-options in 
that testcase, so I pushed the attached fix. I’ll check the test results 
mailing-list to see if it improved things (or confirm when the error message is 
posted).

FX




test.patch
Description: Binary data


Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only

2022-01-12 Thread Richard Biener via Gcc-patches
On Wed, 12 Jan 2022, Andre Vieira (lists) wrote:

> Hi,
> 
> This a fix for the regression caused by '[vect] Re-analyze all modes for
> epilogues'. The earlier patch assumed there was always at least one other mode
> than VOIDmode, but that does not need to be the case.
> If we are dealing with a target that does not define more modes for
> 'autovectorize_vector_modes', the behaviour before the patch would be to try
> to create an epilogue for the same autodetected_vector_mode, which unless the
> target supported partial vectors would always fail. So as a fix I suggest
> trying to vectorize the epilogue with the preferred_simd_mode for QI,
> mimicking autovectorize_vector_mode, which will be skipped if it is not a
> vector_mode (since that already should indicate partial vectors aren't
> possible) or if no partial vectors are supported and its pessimistic NUNITS is
> larger than the main loop's VF.
> 
> Currently bootstrapping and regression testing, otherwise OK for trunk? Can
> someone verify this fixes the issue for PR103971 on powerpc?

Why not simply start at mode_i = 0 which means autodetecting the mode
to use for the epilogue?  That appears to be a much simpler solution to
me, including for targets where there are more than one element in the
vector.

Richard.


Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only

2022-01-12 Thread Richard Sandiford via Gcc-patches
Richard Biener  writes:
> On Wed, 12 Jan 2022, Andre Vieira (lists) wrote:
>
>> Hi,
>> 
>> This a fix for the regression caused by '[vect] Re-analyze all modes for
>> epilogues'. The earlier patch assumed there was always at least one other 
>> mode
>> than VOIDmode, but that does not need to be the case.
>> If we are dealing with a target that does not define more modes for
>> 'autovectorize_vector_modes', the behaviour before the patch would be to try
>> to create an epilogue for the same autodetected_vector_mode, which unless the
>> target supported partial vectors would always fail. So as a fix I suggest
>> trying to vectorize the epilogue with the preferred_simd_mode for QI,
>> mimicking autovectorize_vector_mode, which will be skipped if it is not a
>> vector_mode (since that already should indicate partial vectors aren't
>> possible) or if no partial vectors are supported and its pessimistic NUNITS 
>> is
>> larger than the main loop's VF.
>> 
>> Currently bootstrapping and regression testing, otherwise OK for trunk? Can
>> someone verify this fixes the issue for PR103971 on powerpc?
>
> Why not simply start at mode_i = 0 which means autodetecting the mode
> to use for the epilogue?  That appears to be a much simpler solution to
> me, including for targets where there are more than one element in the
> vector.

VOIDmode doesn't tell us anything about what the autodetected mode
will be, so current short-circuit:

  /* If the target does not support partial vectors we can shorten the
 number of modes to analyze for the epilogue as we know we can't pick a
 mode that has at least as many NUNITS as the main loop's vectorization
 factor, since that would imply the epilogue's vectorization factor
 would be at least as high as the main loop's and we would be
 vectorizing for more scalar iterations than there would be left.  */
  if (!supports_partial_vectors
  && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), first_vinfo_vf))
{
  mode_i++;
  if (mode_i == vector_modes.length ())
break;
  continue;
}

wouldn't be effective.

Thanks,
Richard



Re: [PATCH] Fortran: make IEEE_CLASS recognize signaling NaNs

2022-01-12 Thread Jakub Jelinek via Gcc-patches
On Wed, Jan 12, 2022 at 11:23:43AM +0100, FX via Gcc-patches wrote:
> I can confirm that I don’t see this failure on a Debian bullseye/sid (Linux 
> 5.11.0-46, glibc 2.31-0ubuntu9.2) with a fresh bootstrap of master:
> 
> $ grep signaling testsuite/gfortran/gfortran.sum
> PASS: gfortran.dg/ieee/signaling_1.f90   -O0  (test for excess errors)
> PASS: gfortran.dg/ieee/signaling_1.f90   -O0  execution test
> PASS: gfortran.dg/ieee/signaling_1.f90   -O1  (test for excess errors)
> PASS: gfortran.dg/ieee/signaling_1.f90   -O1  execution test
> PASS: gfortran.dg/ieee/signaling_1.f90   -O2  (test for excess errors)
> PASS: gfortran.dg/ieee/signaling_1.f90   -O2  execution test
> PASS: gfortran.dg/ieee/signaling_1.f90   -O3 -fomit-frame-pointer 
> -funroll-loops -fpeel-loops -ftracer -finline-functions  (test for excess 
> errors)
> PASS: gfortran.dg/ieee/signaling_1.f90   -O3 -fomit-frame-pointer 
> -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
> PASS: gfortran.dg/ieee/signaling_1.f90   -O3 -g  (test for excess errors)
> PASS: gfortran.dg/ieee/signaling_1.f90   -O3 -g  execution test
> PASS: gfortran.dg/ieee/signaling_1.f90   -Os  (test for excess errors)
> PASS: gfortran.dg/ieee/signaling_1.f90   -Os  execution test

The error I was getting was:
/home/jakub/src/gcc/obj46/gcc/testsuite/gfortran2/../../gfortran 
-B/home/jakub/src/gcc/obj46/gcc/testsuite/gfortran2/../../ 
-B/home/jakub/src/gcc/obj46/x86_64-pc
-linux-gnu/./libgfortran/ 
/home/jakub/src/gcc/gcc/testsuite/gfortran.dg/ieee/signaling_1.f90 
-fdiagnostics-plain-output -fdiagnostics-plain-output -O1 -fsignaling-nans 
/home/jakub/sr
c/gcc/gcc/testsuite/gfortran.dg/ieee/signaling_1_c.c -dumpbase  
-B/home/jakub/src/gcc/obj46/x86_64-pc-linux-gnu/./libgfortran/.libs 
-L/home/jakub/src/gcc/obj46/x86_64-pc-linux-gnu/./
libgfortran/.libs 
-L/home/jakub/src/gcc/obj46/x86_64-pc-linux-gnu/./libgfortran/.libs 
-L/home/jakub/src/gcc/obj46/x86_64-pc-linux-gnu/./libatomic/.libs 
-B/home/jakub/src/gcc/obj46/x8
6_64-pc-linux-gnu/./libquadmath/.libs 
-L/home/jakub/src/gcc/obj46/x86_64-pc-linux-gnu/./libquadmath/.libs 
-L/home/jakub/src/gcc/obj46/x86_64-pc-linux-gnu/./libquadmath/.libs -lm -o .
/signaling_1.exe
/home/jakub/src/gcc/gcc/testsuite/gfortran.dg/ieee/signaling_1.f90:8:20: Fatal 
Error: Cannot find an intrinsic module named 'ieee_arithmetic' at (1)
compilation terminated.
compiler exited with status 1
FAIL: gfortran.dg/ieee/signaling_1.f90   -O1  (test for excess errors)
Excess errors:
/home/jakub/src/gcc/gcc/testsuite/gfortran.dg/ieee/signaling_1.f90:8:20: Fatal 
Error: Cannot find an intrinsic module named 'ieee_arithmetic' at (1)
compilation terminated.

UNRESOLVED: gfortran.dg/ieee/signaling_1.f90   -O1  compilation failed to 
produce executable
And
-! { dg-options "-fsignaling-nans" }
+! { dg-additional-options "-fsignaling-nans" }
doesn't fix it, it changes the FAIL into:
cc1: warning: command-line option '-fintrinsic-modules-path 
/home/jakub/src/gcc/obj46/x86_64-pc-linux-gnu/./libgfortran/' is valid for 
Fortran but not for C
FAIL: gfortran.dg/ieee/signaling_1.f90   -O1  (test for excess errors)
Excess errors:
cc1: warning: command-line option '-fintrinsic-modules-path 
/home/jakub/src/gcc/obj46/x86_64-pc-linux-gnu/./libgfortran/' is valid for 
Fortran but not for C

We need -fintrinsic-modules-path option for the signalling_1.f90 compilation
but need to make sure it isn't used when the *.c file is compiled, so they
need to be compiled by separate compiler invocations probably.

Jakub



Re: [PATCH] Fix redefined macro for epiphany.

2022-01-12 Thread Richard Biener via Gcc-patches
On Wed, Jan 12, 2022 at 11:10 AM Martin Liška  wrote:
>
> On 1/12/22 11:03, Richard Biener wrote:
> > tm_file in config.gcc?  (${tm_file} first, before elfos.h)
>
> Heh, yes, that's the case. The following patch fixes that.

OK if the target builds with that.

> May I install it?
> Cheers,
> Martin


Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only

2022-01-12 Thread Richard Biener via Gcc-patches
On Wed, 12 Jan 2022, Richard Sandiford wrote:

> Richard Biener  writes:
> > On Wed, 12 Jan 2022, Andre Vieira (lists) wrote:
> >
> >> Hi,
> >> 
> >> This a fix for the regression caused by '[vect] Re-analyze all modes for
> >> epilogues'. The earlier patch assumed there was always at least one other 
> >> mode
> >> than VOIDmode, but that does not need to be the case.
> >> If we are dealing with a target that does not define more modes for
> >> 'autovectorize_vector_modes', the behaviour before the patch would be to 
> >> try
> >> to create an epilogue for the same autodetected_vector_mode, which unless 
> >> the
> >> target supported partial vectors would always fail. So as a fix I suggest
> >> trying to vectorize the epilogue with the preferred_simd_mode for QI,
> >> mimicking autovectorize_vector_mode, which will be skipped if it is not a
> >> vector_mode (since that already should indicate partial vectors aren't
> >> possible) or if no partial vectors are supported and its pessimistic 
> >> NUNITS is
> >> larger than the main loop's VF.
> >> 
> >> Currently bootstrapping and regression testing, otherwise OK for trunk? Can
> >> someone verify this fixes the issue for PR103971 on powerpc?
> >
> > Why not simply start at mode_i = 0 which means autodetecting the mode
> > to use for the epilogue?  That appears to be a much simpler solution to
> > me, including for targets where there are more than one element in the
> > vector.
> 
> VOIDmode doesn't tell us anything about what the autodetected mode
> will be, so current short-circuit:
> 
>   /* If the target does not support partial vectors we can shorten the
>number of modes to analyze for the epilogue as we know we can't pick a
>mode that has at least as many NUNITS as the main loop's vectorization
>factor, since that would imply the epilogue's vectorization factor
>would be at least as high as the main loop's and we would be
>vectorizing for more scalar iterations than there would be left.  */
>   if (!supports_partial_vectors
> && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), first_vinfo_vf))
>   {
> mode_i++;
> if (mode_i == vector_modes.length ())
>   break;
> continue;
>   }
> 
> wouldn't be effective.

Well, before this change we simply did

-  /* Handle the case that the original loop can use partial
- vectorization, but want to only adopt it for the epilogue.
- The retry should be in the same mode as original.  */
-  if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo))
...
-  else
-{
-  mode_i = first_loop_next_i;
-  if (mode_i == vector_modes.length ())
-   return first_loop_vinfo;
-}

and thus didn't bother with epilogue vectorization.  I think we should
then just restore this behavior, not doing epilogue vectorization
if vector_modes.length () == 1?

Richard.


Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only

2022-01-12 Thread Richard Sandiford via Gcc-patches
Richard Biener  writes:
> On Wed, 12 Jan 2022, Richard Sandiford wrote:
>
>> Richard Biener  writes:
>> > On Wed, 12 Jan 2022, Andre Vieira (lists) wrote:
>> >
>> >> Hi,
>> >> 
>> >> This a fix for the regression caused by '[vect] Re-analyze all modes for
>> >> epilogues'. The earlier patch assumed there was always at least one other 
>> >> mode
>> >> than VOIDmode, but that does not need to be the case.
>> >> If we are dealing with a target that does not define more modes for
>> >> 'autovectorize_vector_modes', the behaviour before the patch would be to 
>> >> try
>> >> to create an epilogue for the same autodetected_vector_mode, which unless 
>> >> the
>> >> target supported partial vectors would always fail. So as a fix I suggest
>> >> trying to vectorize the epilogue with the preferred_simd_mode for QI,
>> >> mimicking autovectorize_vector_mode, which will be skipped if it is not a
>> >> vector_mode (since that already should indicate partial vectors aren't
>> >> possible) or if no partial vectors are supported and its pessimistic 
>> >> NUNITS is
>> >> larger than the main loop's VF.
>> >> 
>> >> Currently bootstrapping and regression testing, otherwise OK for trunk? 
>> >> Can
>> >> someone verify this fixes the issue for PR103971 on powerpc?
>> >
>> > Why not simply start at mode_i = 0 which means autodetecting the mode
>> > to use for the epilogue?  That appears to be a much simpler solution to
>> > me, including for targets where there are more than one element in the
>> > vector.
>> 
>> VOIDmode doesn't tell us anything about what the autodetected mode
>> will be, so current short-circuit:
>> 
>>   /* If the target does not support partial vectors we can shorten the
>>   number of modes to analyze for the epilogue as we know we can't pick a
>>   mode that has at least as many NUNITS as the main loop's vectorization
>>   factor, since that would imply the epilogue's vectorization factor
>>   would be at least as high as the main loop's and we would be
>>   vectorizing for more scalar iterations than there would be left.  */
>>   if (!supports_partial_vectors
>>&& maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), first_vinfo_vf))
>>  {
>>mode_i++;
>>if (mode_i == vector_modes.length ())
>>  break;
>>continue;
>>  }
>> 
>> wouldn't be effective.
>
> Well, before this change we simply did
>
> -  /* Handle the case that the original loop can use partial
> - vectorization, but want to only adopt it for the epilogue.
> - The retry should be in the same mode as original.  */
> -  if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo))
> ...
> -  else
> -{
> -  mode_i = first_loop_next_i;
> -  if (mode_i == vector_modes.length ())
> -   return first_loop_vinfo;
> -}
>
> and thus didn't bother with epilogue vectorization.  I think we should
> then just restore this behavior, not doing epilogue vectorization
> if vector_modes.length () == 1?

Yeah, but that case didn't need epilogue vectorisation before.  This
series is adding support for unrolling, and targets with a single vector
size will benefit from epilogues in that case.

Thanks,
Richard


Re: [Patch][V2]Enable -Wuninitialized + -ftrivial-auto-var-init for address taken variables

2022-01-12 Thread Richard Biener via Gcc-patches
On Tue, Jan 11, 2022 at 5:32 PM Qing Zhao  wrote:
>
> Hi, Richard,
>
> > On Jan 11, 2022, at 7:43 AM, Richard Biener  
> > wrote:
> >
> 
> 
>  1.  Add some meaningful temporaries to break the long expression to make 
>  it
> Readable. And also add comments to explain the purpose of the 
>  statement;
> 
>  2.  Resolve the memory leakage of the dynamically created string.
> 
>  The patch has been bootstrapped and regressing tested on both x86 and 
>  aarch64, no issues.
>  Okay for commit?
> >>>
> >>> tree decl_name
> >>> += build_string_literal (IDENTIFIER_LENGTH (DECL_NAME (decl)) + 1,
> >>> +   IDENTIFIER_POINTER (DECL_NAME (decl)));
> >>>
> >>> you need to deal with DECL_NAME being NULL.
> >>
> >> Okay.
> >> Usually under what situation, the decl_name will be NULL?
> >
> > I don't know but it definitely happens.
> >
> >>> It's also a bit awkward
> >>> to build another
> >>> copy of all decl names :/
> >>
> >> Yes, this is awkward. But it might be unavoidable for address taken 
> >> variables since the original variable might be completely deleted by 
> >> optimizations.
> >> See the details at:
> >> https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577431.html
> >>
> >> We had a previous discussion on this issue, and the idea of adding this 
> >> 3rd argument with the name of the variable was proposed by you at that 
> >> time. -:)
> >
> > I know ... I didn't have a better idea.
>
> I think that adding the name string of the auto variable as one parameter to 
> the function .DEFERRED_INIT might be the only solution to this issue? (In the 
> very beginning of the implementation, we added the VAR itself as one 
> parameter to the function .DEFERRED_INIT, but that design didn’t work out)
> >
> >>
> >>
> >>>
> >>> + /* The LHS of the call is a temporary variable, we use it as a
> >>> +placeholder to record the information on whether the warning
> >>> +has been issued or not.  */
> >>> + repl_var = gimple_call_lhs (def_stmt);
> >>>
> >>> this seems to be a change that can be done independently?
> >>
> >> The major purpose of this “repl_var” is used to record the info whether 
> >> the warning has been issued for the variable or not, then avoid emitting 
> >> it again later.
> >> Since the original variable has been completely deleted by optimization, 
> >> we have to use this “repl_var” for a placeholder to record such info.
> >
> > But the ... = .DEFERRED_INIT stmt itself could be used to record this
> > since its location is
> > already used to indicate the original location, like with
> > suppress_warning/warning_suppressed_p?
>
> Ah, I will check on this. Thanks a lot.
> >
> >>>
> >>> + /* Ignore the call to .DEFERRED_INIT that define the original
> >>> +var itself.  */
> >>> + if (is_gimple_assign (context))
> >>> +   {
> >>> + if (TREE_CODE (gimple_assign_lhs (context)) == VAR_DECL)
> >>> +   lhs_var = gimple_assign_lhs (context);
> >>> + else if (TREE_CODE (gimple_assign_lhs (context)) == 
> >>> SSA_NAME)
> >>> +   lhs_var = SSA_NAME_VAR (gimple_assign_lhs (context));
> >>> +   }
> >>> + if (lhs_var
> >>> + && (lhs_var_name = DECL_NAME (lhs_var))
> >>> + && (lhs_var_name_str = IDENTIFIER_POINTER (lhs_var_name))
> >>> + && (strcmp (lhs_var_name_str, var_name_str) == 0))
> >>> +   return;
> >>>
> >>> likewise but I don't really understand what you are doing here.
> >>
> >> The above is to exclude the following case:
> >>
> >>   temp = .DEFERRED_INIT (4, 2, “alt_reloc");
> >>   alt_reloc = temp;
> >>
> >> i.e, a call to .DEFERRED_INIT that define the original variable itself.
> >
> > How can this happen?  It looks like a bug to me.  Do you have a testcase?
> With -ftrivial-auto-var-init, During gimplification phase, almost all address 
> taken variables that do not have an explicit initializer will have the above 
> IR pattern.
>
> For example, gcc.dg/auto-init-uninit-16.c:
>
> [opc@qinzhao-ol8u3-x86 gcc]$ cat 
> /home/opc/Work/GCC/latest-gcc/gcc/testsuite/gcc.dg/auto-init-uninit-16.c
> /* { dg-do compile } */
> /* { dg-options "-O2 -Wuninitialized -ftrivial-auto-var-init=zero" } */
>
> int foo, bar;
>
> static
> void decode_reloc(int reloc, int *is_alt)
> {
>   if (reloc >= 20)
>   *is_alt = 1;
>   else if (reloc >= 10)
>   *is_alt = 0;
> }
>
> void testfunc()
> {
>   int alt_reloc;
>
>   decode_reloc(foo, &alt_reloc);
>
>   if (alt_reloc) /* { dg-warning "may be used uninitialized" "" }  */
> bar = 42;
> }
>
> With  -ftrivial-auto-var-init=zero, the IR after gimplification is:
>
>   _1 = .DEFERRED_INIT (4, 2, &"alt_reloc"[0]);
>   alt_reloc = _1;
>
> And the IR after SSA is similar as the above:
>
>   _1 = .DEFERRED_INIT (4, 2, &"alt_reloc"[0]);
>   alt_reloc = _1;
>
> During the early uninitialized analysis

Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only

2022-01-12 Thread Richard Biener via Gcc-patches
On Wed, 12 Jan 2022, Richard Sandiford wrote:

> Richard Biener  writes:
> > On Wed, 12 Jan 2022, Richard Sandiford wrote:
> >
> >> Richard Biener  writes:
> >> > On Wed, 12 Jan 2022, Andre Vieira (lists) wrote:
> >> >
> >> >> Hi,
> >> >> 
> >> >> This a fix for the regression caused by '[vect] Re-analyze all modes for
> >> >> epilogues'. The earlier patch assumed there was always at least one 
> >> >> other mode
> >> >> than VOIDmode, but that does not need to be the case.
> >> >> If we are dealing with a target that does not define more modes for
> >> >> 'autovectorize_vector_modes', the behaviour before the patch would be 
> >> >> to try
> >> >> to create an epilogue for the same autodetected_vector_mode, which 
> >> >> unless the
> >> >> target supported partial vectors would always fail. So as a fix I 
> >> >> suggest
> >> >> trying to vectorize the epilogue with the preferred_simd_mode for QI,
> >> >> mimicking autovectorize_vector_mode, which will be skipped if it is not 
> >> >> a
> >> >> vector_mode (since that already should indicate partial vectors aren't
> >> >> possible) or if no partial vectors are supported and its pessimistic 
> >> >> NUNITS is
> >> >> larger than the main loop's VF.
> >> >> 
> >> >> Currently bootstrapping and regression testing, otherwise OK for trunk? 
> >> >> Can
> >> >> someone verify this fixes the issue for PR103971 on powerpc?
> >> >
> >> > Why not simply start at mode_i = 0 which means autodetecting the mode
> >> > to use for the epilogue?  That appears to be a much simpler solution to
> >> > me, including for targets where there are more than one element in the
> >> > vector.
> >> 
> >> VOIDmode doesn't tell us anything about what the autodetected mode
> >> will be, so current short-circuit:
> >> 
> >>   /* If the target does not support partial vectors we can shorten the
> >> number of modes to analyze for the epilogue as we know we can't pick a
> >> mode that has at least as many NUNITS as the main loop's vectorization
> >> factor, since that would imply the epilogue's vectorization factor
> >> would be at least as high as the main loop's and we would be
> >> vectorizing for more scalar iterations than there would be left.  */
> >>   if (!supports_partial_vectors
> >>  && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), first_vinfo_vf))
> >>{
> >>  mode_i++;
> >>  if (mode_i == vector_modes.length ())
> >>break;
> >>  continue;
> >>}
> >> 
> >> wouldn't be effective.
> >
> > Well, before this change we simply did
> >
> > -  /* Handle the case that the original loop can use partial
> > - vectorization, but want to only adopt it for the epilogue.
> > - The retry should be in the same mode as original.  */
> > -  if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo))
> > ...
> > -  else
> > -{
> > -  mode_i = first_loop_next_i;
> > -  if (mode_i == vector_modes.length ())
> > -   return first_loop_vinfo;
> > -}
> >
> > and thus didn't bother with epilogue vectorization.  I think we should
> > then just restore this behavior, not doing epilogue vectorization
> > if vector_modes.length () == 1?
> 
> Yeah, but that case didn't need epilogue vectorisation before.  This
> series is adding support for unrolling, and targets with a single vector
> size will benefit from epilogues in that case.

But in that case (which we could detect), we could then just use
autodetected_vector_mode?  Like if we do before epilogue vect

 vector_modes[0] = autodetected_vector_mode;
 mode_i = 0;

thus replace VOIDmode with what we detected and then start at 0?
That is, the proposed patch looks very much like a hack to me.

I suppose the VECTOR_MODE_P check should be added to

  if (!supports_partial_vectors
  && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), 
first_vinfo_vf))
{
  mode_i++;

instead.

> Thanks,
> Richard
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)


Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only

2022-01-12 Thread Richard Sandiford via Gcc-patches
Richard Biener  writes:
> On Wed, 12 Jan 2022, Richard Sandiford wrote:
>
>> Richard Biener  writes:
>> > On Wed, 12 Jan 2022, Richard Sandiford wrote:
>> >
>> >> Richard Biener  writes:
>> >> > On Wed, 12 Jan 2022, Andre Vieira (lists) wrote:
>> >> >
>> >> >> Hi,
>> >> >> 
>> >> >> This a fix for the regression caused by '[vect] Re-analyze all modes 
>> >> >> for
>> >> >> epilogues'. The earlier patch assumed there was always at least one 
>> >> >> other mode
>> >> >> than VOIDmode, but that does not need to be the case.
>> >> >> If we are dealing with a target that does not define more modes for
>> >> >> 'autovectorize_vector_modes', the behaviour before the patch would be 
>> >> >> to try
>> >> >> to create an epilogue for the same autodetected_vector_mode, which 
>> >> >> unless the
>> >> >> target supported partial vectors would always fail. So as a fix I 
>> >> >> suggest
>> >> >> trying to vectorize the epilogue with the preferred_simd_mode for QI,
>> >> >> mimicking autovectorize_vector_mode, which will be skipped if it is 
>> >> >> not a
>> >> >> vector_mode (since that already should indicate partial vectors aren't
>> >> >> possible) or if no partial vectors are supported and its pessimistic 
>> >> >> NUNITS is
>> >> >> larger than the main loop's VF.
>> >> >> 
>> >> >> Currently bootstrapping and regression testing, otherwise OK for 
>> >> >> trunk? Can
>> >> >> someone verify this fixes the issue for PR103971 on powerpc?
>> >> >
>> >> > Why not simply start at mode_i = 0 which means autodetecting the mode
>> >> > to use for the epilogue?  That appears to be a much simpler solution to
>> >> > me, including for targets where there are more than one element in the
>> >> > vector.
>> >> 
>> >> VOIDmode doesn't tell us anything about what the autodetected mode
>> >> will be, so current short-circuit:
>> >> 
>> >>   /* If the target does not support partial vectors we can shorten the
>> >>number of modes to analyze for the epilogue as we know we can't pick a
>> >>mode that has at least as many NUNITS as the main loop's vectorization
>> >>factor, since that would imply the epilogue's vectorization factor
>> >>would be at least as high as the main loop's and we would be
>> >>vectorizing for more scalar iterations than there would be left.  */
>> >>   if (!supports_partial_vectors
>> >> && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), first_vinfo_vf))
>> >>   {
>> >> mode_i++;
>> >> if (mode_i == vector_modes.length ())
>> >>   break;
>> >> continue;
>> >>   }
>> >> 
>> >> wouldn't be effective.
>> >
>> > Well, before this change we simply did
>> >
>> > -  /* Handle the case that the original loop can use partial
>> > - vectorization, but want to only adopt it for the epilogue.
>> > - The retry should be in the same mode as original.  */
>> > -  if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo))
>> > ...
>> > -  else
>> > -{
>> > -  mode_i = first_loop_next_i;
>> > -  if (mode_i == vector_modes.length ())
>> > -   return first_loop_vinfo;
>> > -}
>> >
>> > and thus didn't bother with epilogue vectorization.  I think we should
>> > then just restore this behavior, not doing epilogue vectorization
>> > if vector_modes.length () == 1?
>> 
>> Yeah, but that case didn't need epilogue vectorisation before.  This
>> series is adding support for unrolling, and targets with a single vector
>> size will benefit from epilogues in that case.
>
> But in that case (which we could detect), we could then just use
> autodetected_vector_mode?  Like if we do before epilogue vect
>
>  vector_modes[0] = autodetected_vector_mode;
>  mode_i = 0;
>
> thus replace VOIDmode with what we detected and then start at 0?
> That is, the proposed patch looks very much like a hack to me.

You mean check whether the loop is unrolled?  If so, that's what feels
like a hack to me :-)  The question is whether there are enough elements
for epilogue vectorisation to make sense.  The VF is what tells us that.
Unrolling is just one of the things that influences that VF and I don't
think we should check for the individual influences.  It's just the end
result that matters.

> I suppose the VECTOR_MODE_P check should be added to
>
>   if (!supports_partial_vectors
>   && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), 
> first_vinfo_vf))
> {
>   mode_i++;
>
> instead.

You mean:

  if (!supports_partial_vectors
  && VECTOR_MODE_P (vector_modes[mode_i])
  && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), first_vinfo_vf))
{
  mode_i++;

?  If so, the skip won't be effective the first time round.

Thanks,
Richard


Re: [PATCH] Fortran: make IEEE_CLASS recognize signaling NaNs

2022-01-12 Thread FX via Gcc-patches
Hi Jakub,

> We need -fintrinsic-modules-path option for the signalling_1.f90 compilation
> but need to make sure it isn't used when the *.c file is compiled, so they
> need to be compiled by separate compiler invocations probably.

Thanks for posting the errors! So I wasn’t seeing it because I had run “make 
install” before the testsuite. The patch I pushed at 
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=6b14100b9504800768da726dcb81f1857db3b493
 should fix the failure, then.

FX

Re: [PATCH] Fortran: make IEEE_CLASS recognize signaling NaNs

2022-01-12 Thread Jakub Jelinek via Gcc-patches
On Wed, Jan 12, 2022 at 12:03:40PM +0100, FX wrote:
> > We need -fintrinsic-modules-path option for the signalling_1.f90 compilation
> > but need to make sure it isn't used when the *.c file is compiled, so they
> > need to be compiled by separate compiler invocations probably.
> 
> Thanks for posting the errors!  So I wasn’t seeing it because I had run
> “make install” before the testsuite.  The patch I pushed at
> https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=6b14100b9504800768da726dcb81f1857db3b493
> should fix the failure, then.

Thanks.  Just a nit, it is cc1 that reports the warning, not f951.

Jakub



Re: [PATCH] Fix -Wformat-diag for aarch64 target.

2022-01-12 Thread Richard Earnshaw via Gcc-patches




On 12/01/2022 09:02, Martin Liška wrote:

Hello.

We've got -Wformat-diag for some time and I think we should start using it
in -Werror for GCC bootstrap. The following patch removes last pieces of 
the warning

for aarch64 target.

Ready to be installed?
Thanks,
Martin


OK.

R.



gcc/ChangeLog:

 * config/aarch64/aarch64.c (aarch64_parse_boolean_options): Use
 %qs where possible.
 (aarch64_parse_sve_width_string): Likewise.
 (aarch64_override_options_internal): Likewise.
 (aarch64_print_hint_for_extensions): Likewise.
 (aarch64_validate_sls_mitigation): Likewise.
 (aarch64_handle_attr_arch): Likewise.
 (aarch64_handle_attr_cpu): Likewise.
 (aarch64_handle_attr_tune): Likewise.
 (aarch64_handle_attr_isa_flags): Likewise.
---
  gcc/config/aarch64/aarch64.c | 26 +-
  1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 5920b29880a..1bca2a31a1b 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -16337,7 +16337,7 @@ aarch64_parse_boolean_options (const char *option,
    /* We ended with a comma, print something.  */
    if (!(*specs))
  {
-  error ("%s string ill-formed\n", option_name);
+  error ("%qs string ill-formed", option_name);
    return 0;
  }

@@ -16393,7 +16393,7 @@ aarch64_parse_sve_width_string (const char 
*tune_string,

    int n = sscanf (tune_string, "%d", &width);
    if (n == EOF)
  {
-  error ("invalid format for sve_width");
+  error ("invalid format for %");
    return;
  }
    switch (width)
@@ -16405,7 +16405,7 @@ aarch64_parse_sve_width_string (const char 
*tune_string,

  case SVE_2048:
    break;
  default:
-  error ("invalid sve_width value: %d", width);
+  error ("invalid % value: %d", width);
  }
    tune->sve_width = (enum aarch64_sve_vector_bits_enum) width;
  }
@@ -16628,7 +16628,7 @@ aarch64_override_options_internal (struct 
gcc_options *opts)

    if (opts->x_aarch64_stack_protector_guard_reg_str)
  {
    if (strlen (opts->x_aarch64_stack_protector_guard_reg_str) > 100)
-  error ("specify a system register with a small string length.");
+  error ("specify a system register with a small string length");
  }

    if (opts->x_aarch64_stack_protector_guard_offset_str)
@@ -16832,7 +16832,7 @@ aarch64_print_hint_for_extensions (const 
std::string &str)

  inform (input_location, "valid arguments are: %s;"
   " did you mean %qs?", s, hint);
    else
-    inform (input_location, "valid arguments are: %s;", s);
+    inform (input_location, "valid arguments are: %s", s);

    XDELETEVEC (s);
  }
@@ -16933,7 +16933,7 @@ aarch64_validate_sls_mitigation (const char 
*const_str)

  temp |= SLS_RETBR;
    else if (strcmp (str, "none") == 0 || strcmp (str, "all") == 0)
  {
-  error ("%<%s%> must be by itself for %<-mharden-sls=%>", str);
+  error ("%qs must be by itself for %<-mharden-sls=%>", str);
    break;
  }
    else
@@ -17572,11 +17572,11 @@ aarch64_handle_attr_arch (const char *str)
  error ("missing name in % pragma or attribute");
  break;
    case AARCH64_PARSE_INVALID_ARG:
-    error ("invalid name (\"%s\") in % pragma or 
attribute", str);
+    error ("invalid name (%qs) in % pragma or 
attribute", str);

  aarch64_print_hint_for_arch (str);
  break;
    case AARCH64_PARSE_INVALID_FEATURE:
-    error ("invalid feature modifier %s of value (\"%s\") in "
+    error ("invalid feature modifier %s of value (%qs) in "
     "% pragma or attribute", invalid_extension.c_str 
(), str);

  aarch64_print_hint_for_extensions (invalid_extension);
  break;
@@ -17614,11 +17614,11 @@ aarch64_handle_attr_cpu (const char *str)
  error ("missing name in % pragma or attribute");
  break;
    case AARCH64_PARSE_INVALID_ARG:
-    error ("invalid name (\"%s\") in % pragma or 
attribute", str);
+    error ("invalid name (%qs) in % pragma or 
attribute", str);

  aarch64_print_hint_for_core (str);
  break;
    case AARCH64_PARSE_INVALID_FEATURE:
-    error ("invalid feature modifier %s of value (\"%s\") in "
+    error ("invalid feature modifier %s of value (%qs) in "
     "% pragma or attribute", invalid_extension.c_str 
(), str);

  aarch64_print_hint_for_extensions (invalid_extension);
  break;
@@ -17645,7 +17645,7 @@ aarch64_handle_attr_cpu (const char *str)
    " attribute");
     break;
   case AARCH64_PARSE_INVALID_ARG:
-   error ("invalid protection type (\"%s\") in 
%+   error ("invalid protection type (%qs) in 
%
    "=\")%> pragma or attribute", err_str);
     break;
   case AARCH64_PARSE_OK:
@@ -17680,7 +17680,7 @@ aarch64_handle_attr_tune (const char *str)
    switch (parse_res)
  {
    case AARCH64_PARSE_INVALID_ARG:
-    error ("invalid name (\"%s\") in % 

Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only

2022-01-12 Thread Richard Biener via Gcc-patches
On Wed, 12 Jan 2022, Richard Sandiford wrote:

> Richard Biener  writes:
> > On Wed, 12 Jan 2022, Richard Sandiford wrote:
> >
> >> Richard Biener  writes:
> >> > On Wed, 12 Jan 2022, Richard Sandiford wrote:
> >> >
> >> >> Richard Biener  writes:
> >> >> > On Wed, 12 Jan 2022, Andre Vieira (lists) wrote:
> >> >> >
> >> >> >> Hi,
> >> >> >> 
> >> >> >> This a fix for the regression caused by '[vect] Re-analyze all modes 
> >> >> >> for
> >> >> >> epilogues'. The earlier patch assumed there was always at least one 
> >> >> >> other mode
> >> >> >> than VOIDmode, but that does not need to be the case.
> >> >> >> If we are dealing with a target that does not define more modes for
> >> >> >> 'autovectorize_vector_modes', the behaviour before the patch would 
> >> >> >> be to try
> >> >> >> to create an epilogue for the same autodetected_vector_mode, which 
> >> >> >> unless the
> >> >> >> target supported partial vectors would always fail. So as a fix I 
> >> >> >> suggest
> >> >> >> trying to vectorize the epilogue with the preferred_simd_mode for QI,
> >> >> >> mimicking autovectorize_vector_mode, which will be skipped if it is 
> >> >> >> not a
> >> >> >> vector_mode (since that already should indicate partial vectors 
> >> >> >> aren't
> >> >> >> possible) or if no partial vectors are supported and its pessimistic 
> >> >> >> NUNITS is
> >> >> >> larger than the main loop's VF.
> >> >> >> 
> >> >> >> Currently bootstrapping and regression testing, otherwise OK for 
> >> >> >> trunk? Can
> >> >> >> someone verify this fixes the issue for PR103971 on powerpc?
> >> >> >
> >> >> > Why not simply start at mode_i = 0 which means autodetecting the mode
> >> >> > to use for the epilogue?  That appears to be a much simpler solution 
> >> >> > to
> >> >> > me, including for targets where there are more than one element in the
> >> >> > vector.
> >> >> 
> >> >> VOIDmode doesn't tell us anything about what the autodetected mode
> >> >> will be, so current short-circuit:
> >> >> 
> >> >>   /* If the target does not support partial vectors we can shorten 
> >> >> the
> >> >>  number of modes to analyze for the epilogue as we know we 
> >> >> can't pick a
> >> >>  mode that has at least as many NUNITS as the main loop's 
> >> >> vectorization
> >> >>  factor, since that would imply the epilogue's vectorization 
> >> >> factor
> >> >>  would be at least as high as the main loop's and we would be
> >> >>  vectorizing for more scalar iterations than there would be 
> >> >> left.  */
> >> >>   if (!supports_partial_vectors
> >> >>   && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), 
> >> >> first_vinfo_vf))
> >> >> {
> >> >>   mode_i++;
> >> >>   if (mode_i == vector_modes.length ())
> >> >> break;
> >> >>   continue;
> >> >> }
> >> >> 
> >> >> wouldn't be effective.
> >> >
> >> > Well, before this change we simply did
> >> >
> >> > -  /* Handle the case that the original loop can use partial
> >> > - vectorization, but want to only adopt it for the epilogue.
> >> > - The retry should be in the same mode as original.  */
> >> > -  if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo))
> >> > ...
> >> > -  else
> >> > -{
> >> > -  mode_i = first_loop_next_i;
> >> > -  if (mode_i == vector_modes.length ())
> >> > -   return first_loop_vinfo;
> >> > -}
> >> >
> >> > and thus didn't bother with epilogue vectorization.  I think we should
> >> > then just restore this behavior, not doing epilogue vectorization
> >> > if vector_modes.length () == 1?
> >> 
> >> Yeah, but that case didn't need epilogue vectorisation before.  This
> >> series is adding support for unrolling, and targets with a single vector
> >> size will benefit from epilogues in that case.
> >
> > But in that case (which we could detect), we could then just use
> > autodetected_vector_mode?  Like if we do before epilogue vect
> >
> >  vector_modes[0] = autodetected_vector_mode;
> >  mode_i = 0;
> >
> > thus replace VOIDmode with what we detected and then start at 0?
> > That is, the proposed patch looks very much like a hack to me.
> 
> You mean check whether the loop is unrolled?  If so, that's what feels
> like a hack to me :-)  The question is whether there are enough elements
> for epilogue vectorisation to make sense.  The VF is what tells us that.
> Unrolling is just one of the things that influences that VF and I don't
> think we should check for the individual influences.  It's just the end
> result that matters.
> 
> > I suppose the VECTOR_MODE_P check should be added to
> >
> >   if (!supports_partial_vectors
> >   && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), 
> > first_vinfo_vf))
> > {
> >   mode_i++;
> >
> > instead.
> 
> You mean:
> 
>   if (!supports_partial_vectors
> && VECTOR_MODE_P (vector_modes[mode_i])
> && maybe_ge (GET_MODE_NUNITS (vector_modes[

Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only

2022-01-12 Thread Richard Sandiford via Gcc-patches
Richard Biener  writes:
> On Wed, 12 Jan 2022, Richard Sandiford wrote:
>
>> Richard Biener  writes:
>> > On Wed, 12 Jan 2022, Richard Sandiford wrote:
>> >
>> >> Richard Biener  writes:
>> >> > On Wed, 12 Jan 2022, Richard Sandiford wrote:
>> >> >
>> >> >> Richard Biener  writes:
>> >> >> > On Wed, 12 Jan 2022, Andre Vieira (lists) wrote:
>> >> >> >
>> >> >> >> Hi,
>> >> >> >> 
>> >> >> >> This a fix for the regression caused by '[vect] Re-analyze all 
>> >> >> >> modes for
>> >> >> >> epilogues'. The earlier patch assumed there was always at least one 
>> >> >> >> other mode
>> >> >> >> than VOIDmode, but that does not need to be the case.
>> >> >> >> If we are dealing with a target that does not define more modes for
>> >> >> >> 'autovectorize_vector_modes', the behaviour before the patch would 
>> >> >> >> be to try
>> >> >> >> to create an epilogue for the same autodetected_vector_mode, which 
>> >> >> >> unless the
>> >> >> >> target supported partial vectors would always fail. So as a fix I 
>> >> >> >> suggest
>> >> >> >> trying to vectorize the epilogue with the preferred_simd_mode for 
>> >> >> >> QI,
>> >> >> >> mimicking autovectorize_vector_mode, which will be skipped if it is 
>> >> >> >> not a
>> >> >> >> vector_mode (since that already should indicate partial vectors 
>> >> >> >> aren't
>> >> >> >> possible) or if no partial vectors are supported and its 
>> >> >> >> pessimistic NUNITS is
>> >> >> >> larger than the main loop's VF.
>> >> >> >> 
>> >> >> >> Currently bootstrapping and regression testing, otherwise OK for 
>> >> >> >> trunk? Can
>> >> >> >> someone verify this fixes the issue for PR103971 on powerpc?
>> >> >> >
>> >> >> > Why not simply start at mode_i = 0 which means autodetecting the mode
>> >> >> > to use for the epilogue?  That appears to be a much simpler solution 
>> >> >> > to
>> >> >> > me, including for targets where there are more than one element in 
>> >> >> > the
>> >> >> > vector.
>> >> >> 
>> >> >> VOIDmode doesn't tell us anything about what the autodetected mode
>> >> >> will be, so current short-circuit:
>> >> >> 
>> >> >>   /* If the target does not support partial vectors we can shorten 
>> >> >> the
>> >> >> number of modes to analyze for the epilogue as we know we 
>> >> >> can't pick a
>> >> >> mode that has at least as many NUNITS as the main loop's 
>> >> >> vectorization
>> >> >> factor, since that would imply the epilogue's vectorization 
>> >> >> factor
>> >> >> would be at least as high as the main loop's and we would be
>> >> >> vectorizing for more scalar iterations than there would be 
>> >> >> left.  */
>> >> >>   if (!supports_partial_vectors
>> >> >>  && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), 
>> >> >> first_vinfo_vf))
>> >> >>{
>> >> >>  mode_i++;
>> >> >>  if (mode_i == vector_modes.length ())
>> >> >>break;
>> >> >>  continue;
>> >> >>}
>> >> >> 
>> >> >> wouldn't be effective.
>> >> >
>> >> > Well, before this change we simply did
>> >> >
>> >> > -  /* Handle the case that the original loop can use partial
>> >> > - vectorization, but want to only adopt it for the epilogue.
>> >> > - The retry should be in the same mode as original.  */
>> >> > -  if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo))
>> >> > ...
>> >> > -  else
>> >> > -{
>> >> > -  mode_i = first_loop_next_i;
>> >> > -  if (mode_i == vector_modes.length ())
>> >> > -   return first_loop_vinfo;
>> >> > -}
>> >> >
>> >> > and thus didn't bother with epilogue vectorization.  I think we should
>> >> > then just restore this behavior, not doing epilogue vectorization
>> >> > if vector_modes.length () == 1?
>> >> 
>> >> Yeah, but that case didn't need epilogue vectorisation before.  This
>> >> series is adding support for unrolling, and targets with a single vector
>> >> size will benefit from epilogues in that case.
>> >
>> > But in that case (which we could detect), we could then just use
>> > autodetected_vector_mode?  Like if we do before epilogue vect
>> >
>> >  vector_modes[0] = autodetected_vector_mode;
>> >  mode_i = 0;
>> >
>> > thus replace VOIDmode with what we detected and then start at 0?
>> > That is, the proposed patch looks very much like a hack to me.
>> 
>> You mean check whether the loop is unrolled?  If so, that's what feels
>> like a hack to me :-)  The question is whether there are enough elements
>> for epilogue vectorisation to make sense.  The VF is what tells us that.
>> Unrolling is just one of the things that influences that VF and I don't
>> think we should check for the individual influences.  It's just the end
>> result that matters.
>> 
>> > I suppose the VECTOR_MODE_P check should be added to
>> >
>> >   if (!supports_partial_vectors
>> >   && maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), 
>> > first_vinfo_vf))
>> > {
>> >   mode_i++;
>> >
>> > instead.
>> 

Re: [PATCH] opts: do not do sanity check when an error is seen

2022-01-12 Thread Martin Liška

On 12/16/21 17:44, Jeff Law wrote:



On 12/16/2021 7:37 AM, Martin Liška wrote:

Do not check global options modification when an error is seen in parsing
of options (pragmas or attributes).

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

PR target/103709

gcc/c-family/ChangeLog:

* c-pragma.c (handle_pragma_pop_options): Do not check
global options modification when an error is seen in parsing
of options (pragmas or attributes).

OK
jeff



The following patch handled the same for attributes.

Pushed as obvious.
MartinFrom 98b5359b474e4de89ebc1ea5203ca907738f7d7f Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Wed, 12 Jan 2022 12:48:33 +0100
Subject: [PATCH] opts: do not do sanity check when an error is seen

	PR target/103804

gcc/c-family/ChangeLog:

	* c-attribs.c (handle_optimize_attribute): Do not call
	cl_optimization_compare if we seen an error.
---
 gcc/c-family/c-attribs.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index dbb892e0ec6..bdf72ce385c 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -5516,7 +5516,8 @@ handle_optimize_attribute (tree *node, tree name, tree args,
 
   if (saved_global_options != NULL)
 	{
-	  cl_optimization_compare (saved_global_options, &global_options);
+	  if (!seen_error ())
+	cl_optimization_compare (saved_global_options, &global_options);
 	  free (saved_global_options);
 	}
 }
-- 
2.34.1



Re: [PATCH] Fortran: make IEEE_CLASS recognize signaling NaNs

2022-01-12 Thread FX via Gcc-patches
> Thanks.  Just a nit, it is cc1 that reports the warning, not f951.

I confirm the patch fixes the testcase failure. And I fixed the comment in a 
follow-up commit.

Thanks!

FX

Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only

2022-01-12 Thread Andre Vieira (lists) via Gcc-patches



On 12/01/2022 11:44, Richard Sandiford wrote:

Another alternative would be to push autodetected_vector_mode when the
length is 1 and keep 1 as the starting point.

Richard


I'm guessing we would still want to skip epilogue vectorization if 
!VECTOR_MODE_P (autodetected_vector_mode) in that case?




Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only

2022-01-12 Thread Richard Biener via Gcc-patches
On Wed, 12 Jan 2022, Richard Sandiford wrote:

> Richard Biener  writes:
> > On Wed, 12 Jan 2022, Richard Sandiford wrote:
> >
> >> Richard Biener  writes:
> >> > On Wed, 12 Jan 2022, Richard Sandiford wrote:
> >> >
> >> >> Richard Biener  writes:
> >> >> > On Wed, 12 Jan 2022, Richard Sandiford wrote:
> >> >> >
> >> >> >> Richard Biener  writes:
> >> >> >> > On Wed, 12 Jan 2022, Andre Vieira (lists) wrote:
> >> >> >> >
> >> >> >> >> Hi,
> >> >> >> >> 
> >> >> >> >> This a fix for the regression caused by '[vect] Re-analyze all 
> >> >> >> >> modes for
> >> >> >> >> epilogues'. The earlier patch assumed there was always at least 
> >> >> >> >> one other mode
> >> >> >> >> than VOIDmode, but that does not need to be the case.
> >> >> >> >> If we are dealing with a target that does not define more modes 
> >> >> >> >> for
> >> >> >> >> 'autovectorize_vector_modes', the behaviour before the patch 
> >> >> >> >> would be to try
> >> >> >> >> to create an epilogue for the same autodetected_vector_mode, 
> >> >> >> >> which unless the
> >> >> >> >> target supported partial vectors would always fail. So as a fix I 
> >> >> >> >> suggest
> >> >> >> >> trying to vectorize the epilogue with the preferred_simd_mode for 
> >> >> >> >> QI,
> >> >> >> >> mimicking autovectorize_vector_mode, which will be skipped if it 
> >> >> >> >> is not a
> >> >> >> >> vector_mode (since that already should indicate partial vectors 
> >> >> >> >> aren't
> >> >> >> >> possible) or if no partial vectors are supported and its 
> >> >> >> >> pessimistic NUNITS is
> >> >> >> >> larger than the main loop's VF.
> >> >> >> >> 
> >> >> >> >> Currently bootstrapping and regression testing, otherwise OK for 
> >> >> >> >> trunk? Can
> >> >> >> >> someone verify this fixes the issue for PR103971 on powerpc?
> >> >> >> >
> >> >> >> > Why not simply start at mode_i = 0 which means autodetecting the 
> >> >> >> > mode
> >> >> >> > to use for the epilogue?  That appears to be a much simpler 
> >> >> >> > solution to
> >> >> >> > me, including for targets where there are more than one element in 
> >> >> >> > the
> >> >> >> > vector.
> >> >> >> 
> >> >> >> VOIDmode doesn't tell us anything about what the autodetected mode
> >> >> >> will be, so current short-circuit:
> >> >> >> 
> >> >> >>   /* If the target does not support partial vectors we can 
> >> >> >> shorten the
> >> >> >>   number of modes to analyze for the epilogue as we know we 
> >> >> >> can't pick a
> >> >> >>   mode that has at least as many NUNITS as the main loop's 
> >> >> >> vectorization
> >> >> >>   factor, since that would imply the epilogue's vectorization 
> >> >> >> factor
> >> >> >>   would be at least as high as the main loop's and we would be
> >> >> >>   vectorizing for more scalar iterations than there would be 
> >> >> >> left.  */
> >> >> >>   if (!supports_partial_vectors
> >> >> >>&& maybe_ge (GET_MODE_NUNITS (vector_modes[mode_i]), 
> >> >> >> first_vinfo_vf))
> >> >> >>  {
> >> >> >>mode_i++;
> >> >> >>if (mode_i == vector_modes.length ())
> >> >> >>  break;
> >> >> >>continue;
> >> >> >>  }
> >> >> >> 
> >> >> >> wouldn't be effective.
> >> >> >
> >> >> > Well, before this change we simply did
> >> >> >
> >> >> > -  /* Handle the case that the original loop can use partial
> >> >> > - vectorization, but want to only adopt it for the epilogue.
> >> >> > - The retry should be in the same mode as original.  */
> >> >> > -  if (LOOP_VINFO_EPIL_USING_PARTIAL_VECTORS_P (first_loop_vinfo))
> >> >> > ...
> >> >> > -  else
> >> >> > -{
> >> >> > -  mode_i = first_loop_next_i;
> >> >> > -  if (mode_i == vector_modes.length ())
> >> >> > -   return first_loop_vinfo;
> >> >> > -}
> >> >> >
> >> >> > and thus didn't bother with epilogue vectorization.  I think we should
> >> >> > then just restore this behavior, not doing epilogue vectorization
> >> >> > if vector_modes.length () == 1?
> >> >> 
> >> >> Yeah, but that case didn't need epilogue vectorisation before.  This
> >> >> series is adding support for unrolling, and targets with a single vector
> >> >> size will benefit from epilogues in that case.
> >> >
> >> > But in that case (which we could detect), we could then just use
> >> > autodetected_vector_mode?  Like if we do before epilogue vect
> >> >
> >> >  vector_modes[0] = autodetected_vector_mode;
> >> >  mode_i = 0;
> >> >
> >> > thus replace VOIDmode with what we detected and then start at 0?
> >> > That is, the proposed patch looks very much like a hack to me.
> >> 
> >> You mean check whether the loop is unrolled?  If so, that's what feels
> >> like a hack to me :-)  The question is whether there are enough elements
> >> for epilogue vectorisation to make sense.  The VF is what tells us that.
> >> Unrolling is just one of the things that influences that VF and I don't
> >> think we should check for the individual influences.  It's just the end
> >> result that

Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only

2022-01-12 Thread Richard Biener via Gcc-patches
On Wed, 12 Jan 2022, Andre Vieira (lists) wrote:

> 
> On 12/01/2022 11:44, Richard Sandiford wrote:
> > Another alternative would be to push autodetected_vector_mode when the
> > length is 1 and keep 1 as the starting point.
> >
> > Richard
> 
> I'm guessing we would still want to skip epilogue vectorization if
> !VECTOR_MODE_P (autodetected_vector_mode) in that case?

Practically we currently only support fixed width word_mode there,
but eventually one could end up with 64bit DImode for the main loop
and 32bit V4QImode in the epilogue ... so not sure if it's worth
special-casing.  But I don't mind adding that skip.

Richard.


[PATCH, rs6000] Enable absolute jump table by default

2022-01-12 Thread HAO CHEN GUI via Gcc-patches
Hi,
   This patch enables absolute jump table by default on rs6000. The relative 
jump tables are used when
   it's explicit set by "rs6000_relative_jumptables",
   or jump tables are placed in text section but global relocation is required.

   Bootstrapped and tested on powerpc64-linux BE and LE with no regressions. Is 
this okay for trunk?
Any recommendations? Thanks a lot.

ChangeLog
2022-01-12 Haochen Gui 

gcc/
* config/rs6000/linux64.h (JUMP_TABLES_IN_TEXT_SECTION): Define.
* config/rs6000/rs6000.c (rs6000_gen_pic_addr_diff_vec): Return
true when relative jump table is explicit required or jump tables have
to be put in text section but global relocation is also required.
* config/rs6000/rs6000.opt (rs6000_relative_jumptables): Disable.

patch.diff
diff --git a/gcc/config/rs6000/linux64.h b/gcc/config/rs6000/linux64.h
index d617f346f81..2e257c60f8c 100644
--- a/gcc/config/rs6000/linux64.h
+++ b/gcc/config/rs6000/linux64.h
@@ -239,7 +239,7 @@ extern int dot_symbols;

 /* Indicate that jump tables go in the text section.  */
 #undef  JUMP_TABLES_IN_TEXT_SECTION
-#define JUMP_TABLES_IN_TEXT_SECTION TARGET_64BIT
+#define JUMP_TABLES_IN_TEXT_SECTION 0

 /* The linux ppc64 ABI isn't explicit on whether aggregates smaller
than a doubleword should be padded upward or downward.  You could
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 319182e94d9..9fba893a27a 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -28465,7 +28465,9 @@ rs6000_emit_xxspltidp_v2df (rtx dst, long value)
 static bool
 rs6000_gen_pic_addr_diff_vec (void)
 {
-  return rs6000_relative_jumptables;
+  return rs6000_relative_jumptables
+|| (JUMP_TABLES_IN_TEXT_SECTION
+&& targetm.asm_out.reloc_rw_mask () == 3);
 }

 void
diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
index c2a77182a9e..75e3fa86829 100644
--- a/gcc/config/rs6000/rs6000.opt
+++ b/gcc/config/rs6000/rs6000.opt
@@ -630,7 +630,7 @@ Target Mask(MMA) Var(rs6000_isa_flags)
 Generate (do not generate) MMA instructions.

 mrelative-jumptables
-Target Undocumented Var(rs6000_relative_jumptables) Init(1) Save
+Target Undocumented Var(rs6000_relative_jumptables) Init(0) Save

 mrop-protect
 Target Var(rs6000_rop_protect) Init(0)


Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only

2022-01-12 Thread Andre Vieira (lists) via Gcc-patches


On 12/01/2022 11:59, Richard Biener wrote:

On Wed, 12 Jan 2022, Andre Vieira (lists) wrote:


On 12/01/2022 11:44, Richard Sandiford wrote:

Another alternative would be to push autodetected_vector_mode when the
length is 1 and keep 1 as the starting point.

Richard

I'm guessing we would still want to skip epilogue vectorization if
!VECTOR_MODE_P (autodetected_vector_mode) in that case?

Practically we currently only support fixed width word_mode there,
but eventually one could end up with 64bit DImode for the main loop
and 32bit V4QImode in the epilogue ... so not sure if it's worth
special-casing.  But I don't mind adding that skip.

Richard.


I left out the skip, it shouldn't break anything as it would try that 
same mode before anyway.
Just to clarify what I meant though was to skip if 
autodetected_vector_mode wasn't a vector AND the target didn't define 
autovectorize_vector_modes, so in that scenario it wouldn't ever try  
V4QImode for the epilogue if the mainloop was autodetected DImode, I 
think...
Either way, this is less code, less complicated and doesn't analyze more 
than it did before the original patch, so I'm happy with that too.


Is this what you had in mind?

Andre
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 
6ed2b5f8724e5ebf27592f67d7f6bdfe1ebcf512..03459363afa48f0e2753bc7aa18cbf2771d2a4e5
 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -3023,7 +3023,16 @@ vect_analyze_loop (class loop *loop, vec_info_shared 
*shared)
  array may contain length-agnostic and length-specific modes.  Their
  ordering is not guaranteed, so we could end up picking a mode for the main
  loop that is after the epilogue's optimal mode.  */
-  mode_i = 1;
+  if (vector_modes.length () == 1)
+{
+  /* If we only had VOIDmode then use AUTODETECTED_VECTOR_MODE to see if
+an epilogue can be created with that mode.  */
+  vector_modes[0] = autodetected_vector_mode;
+  mode_i = 0;
+}
+  else
+mode_i = 1;
+
   bool supports_partial_vectors = partial_vectors_supported_p ();
   poly_uint64 first_vinfo_vf = LOOP_VINFO_VECT_FACTOR (first_loop_vinfo);
 


Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only

2022-01-12 Thread Richard Biener via Gcc-patches
On Wed, 12 Jan 2022, Andre Vieira (lists) wrote:

> 
> On 12/01/2022 11:59, Richard Biener wrote:
> > On Wed, 12 Jan 2022, Andre Vieira (lists) wrote:
> >
> >> On 12/01/2022 11:44, Richard Sandiford wrote:
> >>> Another alternative would be to push autodetected_vector_mode when the
> >>> length is 1 and keep 1 as the starting point.
> >>>
> >>> Richard
> >> I'm guessing we would still want to skip epilogue vectorization if
> >> !VECTOR_MODE_P (autodetected_vector_mode) in that case?
> > Practically we currently only support fixed width word_mode there,
> > but eventually one could end up with 64bit DImode for the main loop
> > and 32bit V4QImode in the epilogue ... so not sure if it's worth
> > special-casing.  But I don't mind adding that skip.
> >
> > Richard.
> 
> I left out the skip, it shouldn't break anything as it would try that same
> mode before anyway.
> Just to clarify what I meant though was to skip if autodetected_vector_mode
> wasn't a vector AND the target didn't define autovectorize_vector_modes, so in
> that scenario it wouldn't ever try  V4QImode for the epilogue if the mainloop
> was autodetected DImode, I think...
> Either way, this is less code, less complicated and doesn't analyze more than
> it did before the original patch, so I'm happy with that too.
> 
> Is this what you had in mind?

-  mode_i = 1;
+  if (vector_modes.length () == 1)
+{
+  /* If we only had VOIDmode then use AUTODETECTED_VECTOR_MODE to see
if
+an epilogue can be created with that mode.  */
+  vector_modes[0] = autodetected_vector_mode;
+  mode_i = 0;
+}
+  else
+mode_i = 1;
+

I would have left out the condition and unconditionally do

  vector_modes[0] = autodetected_vector_mode;
  mode_i = 0;

but OK if you think it makes sense to special case length == 1.

Richard.


[PATCH] Fix -Wformat-diag for ARM target.

2022-01-12 Thread Martin Liška

Hello.

We've got -Wformat-diag for some time and I think we should start using it
in -Werror for GCC bootstrap. The following patch removes last pieces of the 
warning
for ARM target.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

* common/config/arm/arm-common.c (arm_target_mode): Wrap
keywords with %<, %> and remove trailing punctuation char.
(arm_canon_arch_option_1): Likewise.
(arm_asm_auto_mfpu): Likewise.
* config/arm/arm-builtins.c (arm_expand_builtin): Likewise.
* config/arm/arm.c (arm_options_perform_arch_sanity_checks): Likewise.
(use_vfp_abi): Likewise.
(aapcs_vfp_is_call_or_return_candidate): Likewise.
(arm_handle_cmse_nonsecure_entry): Likewise.
(arm_handle_cmse_nonsecure_call): Likewise.
(thumb1_md_asm_adjust): Likewise.
---
 gcc/common/config/arm/arm-common.c | 12 +++
 gcc/config/arm/arm-builtins.c  | 50 +++---
 gcc/config/arm/arm.c   | 12 +++
 3 files changed, 37 insertions(+), 37 deletions(-)

diff --git a/gcc/common/config/arm/arm-common.c 
b/gcc/common/config/arm/arm-common.c
index e7e19400263..6a898d8554b 100644
--- a/gcc/common/config/arm/arm-common.c
+++ b/gcc/common/config/arm/arm-common.c
@@ -286,7 +286,7 @@ arm_target_mode (int argc, const char **argv)
 
   if (argc % 2 != 0)

 fatal_error (input_location,
-"%%:target_mode_check takes an even number of parameters");
+"%%:% takes an even number of parameters");
 
   while (argc)

 {
@@ -295,8 +295,8 @@ arm_target_mode (int argc, const char **argv)
   else if (strcmp (argv[0], "cpu") == 0)
cpu = argv[1];
   else
-   fatal_error (input_location,
-"unrecognized option passed to %%:target_mode_check");
+   fatal_error (input_location, "unrecognized option passed to %%:"
+"%>");
   argc -= 2;
   argv += 2;
 }
@@ -662,7 +662,7 @@ arm_canon_arch_option_1 (int argc, const char **argv, bool 
arch_for_multilib)
 
   if (argc & 1)

 fatal_error (input_location,
-"%%:canon_for_mlib takes 1 or more pairs of parameters");
+"%%:% takes 1 or more pairs of parameters");
 
   while (argc)

 {
@@ -676,7 +676,7 @@ arm_canon_arch_option_1 (int argc, const char **argv, bool 
arch_for_multilib)
abi = argv[1];
   else
fatal_error (input_location,
-"unrecognized operand to %%:canon_for_mlib");
+"unrecognized operand to %%:%");
 
   argc -= 2;

   argv += 2;
@@ -1032,7 +1032,7 @@ arm_asm_auto_mfpu (int argc, const char **argv)
arch = argv[1];
   else
fatal_error (input_location,
-"unrecognized operand to %%:asm_auto_mfpu");
+"unrecognized operand to %%:%");
   argc -= 2;
   argv += 2;
 }
diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index 9c645722230..ab5c469b1ba 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -3013,7 +3013,7 @@ constant_arg:
  else
error_at (EXPR_LOCATION (exp),
  "coproc must be a constant immediate in "
- "range [0-%d] enabled with +cdecp",
+ "range [0-%d] enabled with +cdecp%",
  ARM_CDE_CONST_COPROC);
}
  else
@@ -3860,60 +3860,60 @@ arm_expand_builtin (tree exp,
  && (imm < 0 || imm > 32))
{
  if (fcode == ARM_BUILTIN_WRORHI)
-   error ("the range of count should be in 0 to 32.  please check the 
intrinsic _mm_rori_pi16 in code.");
+   error ("the range of count should be in 0 to 32; please check the 
intrinsic %<_mm_rori_pi16%> in code");
  else if (fcode == ARM_BUILTIN_WRORWI)
-   error ("the range of count should be in 0 to 32.  please check the 
intrinsic _mm_rori_pi32 in code.");
+   error ("the range of count should be in 0 to 32; please check the 
intrinsic %<_mm_rori_pi32%> in code");
  else if (fcode == ARM_BUILTIN_WRORH)
-   error ("the range of count should be in 0 to 32.  please check the 
intrinsic _mm_ror_pi16 in code.");
+   error ("the range of count should be in 0 to 32; please check the 
intrinsic %<_mm_ror_pi16%> in code");
  else
-   error ("the range of count should be in 0 to 32.  please check the 
intrinsic _mm_ror_pi32 in code.");
+   error ("the range of count should be in 0 to 32; please check the 
intrinsic %<_mm_ror_pi32%> in code");
}
  else if ((fcode == ARM_BUILTIN_WRORDI || fcode == ARM_BUILTIN_WRORD)
   && (imm < 0 || imm > 64))
{
  if (fcode == AR

[PATCH] Fix -Wformat-diag for s390x-ibm-tpf.

2022-01-12 Thread Martin Liška

This one is related to s390x-ibm-tpf -Wformat-diag warnings.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

* config/s390/s390.c: Fix -Wformat-diag warnings.
---
 gcc/config/s390/s390.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index bf96cbf7588..968672bb42b 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -15606,16 +15606,16 @@ s390_option_override_internal (struct gcc_options 
*opts,
 
 #if TARGET_TPF != 0

   if (!CONST_OK_FOR_J (opts->x_s390_tpf_trace_hook_prologue_check))
-error ("-mtpf-trace-hook-prologue-check requires integer in range 
0..4095");
+error ("%<-mtpf-trace-hook-prologue-check%> requires integer in range 
0-4095");
 
   if (!CONST_OK_FOR_J (opts->x_s390_tpf_trace_hook_prologue_target))

-error ("-mtpf-trace-hook-prologue-target requires integer in range 
0..4095");
+error ("%<-mtpf-trace-hook-prologue-target%> requires integer in range 
0-4095");
 
   if (!CONST_OK_FOR_J (opts->x_s390_tpf_trace_hook_epilogue_check))

-error ("-mtpf-trace-hook-epilogue-check requires integer in range 
0..4095");
+error ("%<-mtpf-trace-hook-epilogue-check%> requires integer in range 
0-4095");
 
   if (!CONST_OK_FOR_J (opts->x_s390_tpf_trace_hook_epilogue_target))

-error ("-mtpf-trace-hook-epilogue-target requires integer in range 
0..4095");
+error ("%<-mtpf-trace-hook-epilogue-target%> requires integer in range 
0-4095");
 
   if (s390_tpf_trace_skip)

 {
--
2.34.1



Re: [PATCH] [12/11/10] Fix invalid format warnings on Windows

2022-01-12 Thread Tomas Kalibera via Gcc-patches


On 1/11/22 2:37 PM, Martin Liška wrote:

Hello.

I do support the patch, but I would ...


Thanks, Martin,  that makes the patch simpler and easier to maintain. 
Would the attached version do?


Thanks
Tomas



On 1/7/22 19:33, Tomas Kalibera wrote:

+  if (is_attribute_p ("format", get_attribute_name (aa)) &&
+  fndecl && fndecl_built_in_p (fndecl, BUILT_IN_NORMAL))
+{
+  switch (DECL_FUNCTION_CODE (fndecl))
+{
+case BUILT_IN_FSCANF:
+case BUILT_IN_PRINTF:
+case BUILT_IN_SCANF:
+case BUILT_IN_SNPRINTF:
+case BUILT_IN_SSCANF:
+case BUILT_IN_VFSCANF:
+case BUILT_IN_VPRINTF:
+case BUILT_IN_VSCANF:
+case BUILT_IN_VSNPRINTF:
+case BUILT_IN_VSSCANF:
+case BUILT_IN_DCGETTEXT:
+case BUILT_IN_DGETTEXT:
+case BUILT_IN_GETTEXT:
+case BUILT_IN_STRFMON:
+case BUILT_IN_STRFTIME:
+case BUILT_IN_SNPRINTF_CHK:
+case BUILT_IN_VSNPRINTF_CHK:
+case BUILT_IN_PRINTF_CHK:
+case BUILT_IN_VPRINTF_CHK:
+  skipped_default_format = 1;
+  break;
+default:
+  break;
+}
+}


... skip this as the listed functions are only these that have defined 
ATTR_FORMAT_*:


$ grep ATTR_FORMAT gcc/builtins.def
DEF_LIB_BUILTIN    (BUILT_IN_FSCANF, "fscanf", 
BT_FN_INT_FILEPTR_CONST_STRING_VAR, ATTR_FORMAT_SCANF_2_3)
DEF_LIB_BUILTIN    (BUILT_IN_PRINTF, "printf", 
BT_FN_INT_CONST_STRING_VAR, ATTR_FORMAT_PRINTF_1_2)
DEF_LIB_BUILTIN    (BUILT_IN_SCANF, "scanf", 
BT_FN_INT_CONST_STRING_VAR, ATTR_FORMAT_SCANF_1_2)
DEF_C99_BUILTIN    (BUILT_IN_SNPRINTF, "snprintf", 
BT_FN_INT_STRING_SIZE_CONST_STRING_VAR, ATTR_FORMAT_PRINTF_NOTHROW_3_4)
DEF_LIB_BUILTIN    (BUILT_IN_SSCANF, "sscanf", 
BT_FN_INT_CONST_STRING_CONST_STRING_VAR, ATTR_FORMAT_SCANF_NOTHROW_2_3)
DEF_C99_BUILTIN    (BUILT_IN_VFSCANF, "vfscanf", 
BT_FN_INT_FILEPTR_CONST_STRING_VALIST_ARG, ATTR_FORMAT_SCANF_2_0)
DEF_LIB_BUILTIN    (BUILT_IN_VPRINTF, "vprintf", 
BT_FN_INT_CONST_STRING_VALIST_ARG, ATTR_FORMAT_PRINTF_1_0)
DEF_C99_BUILTIN    (BUILT_IN_VSCANF, "vscanf", 
BT_FN_INT_CONST_STRING_VALIST_ARG, ATTR_FORMAT_SCANF_1_0)
DEF_C99_BUILTIN    (BUILT_IN_VSNPRINTF, "vsnprintf", 
BT_FN_INT_STRING_SIZE_CONST_STRING_VALIST_ARG, 
ATTR_FORMAT_PRINTF_NOTHROW_3_0)
DEF_C99_BUILTIN    (BUILT_IN_VSSCANF, "vsscanf", 
BT_FN_INT_CONST_STRING_CONST_STRING_VALIST_ARG, 
ATTR_FORMAT_SCANF_NOTHROW_2_0)
DEF_EXT_LIB_BUILTIN    (BUILT_IN_DCGETTEXT, "dcgettext", 
BT_FN_STRING_CONST_STRING_CONST_STRING_INT, ATTR_FORMAT_ARG_2)
DEF_EXT_LIB_BUILTIN    (BUILT_IN_DGETTEXT, "dgettext", 
BT_FN_STRING_CONST_STRING_CONST_STRING, ATTR_FORMAT_ARG_2)
DEF_EXT_LIB_BUILTIN    (BUILT_IN_GETTEXT, "gettext", 
BT_FN_STRING_CONST_STRING, ATTR_FORMAT_ARG_1)
DEF_EXT_LIB_BUILTIN    (BUILT_IN_STRFMON, "strfmon", 
BT_FN_SSIZE_STRING_SIZE_CONST_STRING_VAR, 
ATTR_FORMAT_STRFMON_NOTHROW_3_4)
DEF_LIB_BUILTIN    (BUILT_IN_STRFTIME, "strftime", 
BT_FN_SIZE_STRING_SIZE_CONST_STRING_CONST_TM_PTR, 
ATTR_FORMAT_STRFTIME_NOTHROW_3_0)
DEF_EXT_LIB_BUILTIN    (BUILT_IN_SNPRINTF_CHK, "__snprintf_chk", 
BT_FN_INT_STRING_SIZE_INT_SIZE_CONST_STRING_VAR, 
ATTR_FORMAT_PRINTF_NOTHROW_5_6)
DEF_EXT_LIB_BUILTIN    (BUILT_IN_VSNPRINTF_CHK, "__vsnprintf_chk", 
BT_FN_INT_STRING_SIZE_INT_SIZE_CONST_STRING_VALIST_ARG, 
ATTR_FORMAT_PRINTF_NOTHROW_5_0)
DEF_EXT_LIB_BUILTIN    (BUILT_IN_PRINTF_CHK, "__printf_chk", 
BT_FN_INT_INT_CONST_STRING_VAR, ATTR_FORMAT_PRINTF_2_3)
DEF_EXT_LIB_BUILTIN    (BUILT_IN_VPRINTF_CHK, "__vprintf_chk", 
BT_FN_INT_INT_CONST_STRING_VALIST_ARG, ATTR_FORMAT_PRINTF_2_0)


Martin
>From 82a659c7e5b24bbd39ac567dff3f79cc4c1e083f Mon Sep 17 00:00:00 2001
From: Tomas Kalibera 
Date: Wed, 12 Jan 2022 08:17:21 -0500
Subject: [PATCH] Mingw32 targets use ms_printf format for printf, but
 mingw-w64 when configured for UCRT uses gnu_format (via stdio.h). GCC then
 checks both formats, which means that one cannot print a 64-bit integer
 without a warning. All these lines issue a warning:

  printf("Hello %"PRIu64"\n", x);
  printf("Hello %I64u\n", x);
  printf("Hello %llu\n", x);

because each of them violates one of the formats.  Also, one gets a warning
twice if the format string violates both formats.

Fixed by disabling the built in format in case there are additional ones.

gcc/c-family/ChangeLog:

	PR c/95130
	PR c/92292

	* c-common.c (check_function_arguments): Pass also function
	  declaration to check_function_format.

	* c-common.h (check_function_format): Extra argument - function
	  declaration.

	* c-format.c (check_function_format): For builtin functions with a
	  built in format and at least one more, do not check the first one.
---
 gcc/c-family/c-common.c |  2 +-
 gcc/c-family/c-common.h |  2 +-
 gcc/c-family/c-format.c | 31 +--
 

Re: [PATCH v2][GCC] arm: Add support for dwarf debug directives and pseudo hard-register for PAC feature.

2022-01-12 Thread Srinath Parvathaneni via Gcc-patches
Ping!!

From: Srinath Parvathaneni 
Sent: 13 December 2021 10:44
To: gcc-patches@gcc.gnu.org 
Cc: Kyrylo Tkachov ; Richard Earnshaw 
; Tejas Belagod 
Subject: Re: [PATCH v2][GCC] arm: Add support for dwarf debug directives and 
pseudo hard-register for PAC feature.

Ping!!

From: Srinath Parvathaneni
Sent: 12 November 2021 18:03
To: gcc-patches@gcc.gnu.org 
Cc: Kyrylo Tkachov ; Richard Earnshaw 
; Tejas Belagod 
Subject: [PATCH v2][GCC] arm: Add support for dwarf debug directives and pseudo 
hard-register for PAC feature.

Hello,

This patch teaches the DWARF support in gcc about RA_AUTH_CODE pseudo 
hard-register and also
.save {ra_auth_code} and .cfi_offset ra_auth_code  dwarf directives for 
the PAC feature
in Armv8.1-M architecture.

RA_AUTH_CODE register number is 107 and it's dwarf register number is 143.

When compiled with "arm-none-eabi-gcc -O2  -mthumb -march=armv8.1-m.main+pacbti 
-S -fasynchronous-unwind-tables -g"
command line options, the directives supported in this patch looks like below:

...
push{ip}
.save {ra_auth_code}
.cfi_def_cfa_offset 8
.cfi_offset 143, -8
...

This patch can be committed after the patch at 
https://gcc.gnu.org/pipermail/gcc-patches/2021-November/583407.html
is committed.

Regression tested on arm-none-eabi target and found no regressions.

Ok for master?

Regards,
Srinath.

gcc/ChangeLog:

2021-11-12  Srinath Parvathaneni  

* config/arm/aout.h (ra_auth_code): Add to enum.
* config/arm/arm.c (emit_multi_reg_push): Add RA_AUTH_CODE register to
dwarf frame expression instead of IP_REGNUM.
(arm_expand_prologue): Mark as frame related insn.
(arm_regno_class): Check for pac pseudo reigster.
(arm_dbx_register_number): Assign ra_auth_code register number in dwarf.
(arm_unwind_emit_sequence): Print .save directive with ra_auth_code
register.
(arm_conditional_register_usage): Mark ra_auth_code in fixed reigsters.
* config/arm/arm.h (FIRST_PSEUDO_REGISTER): Modify.
(IS_PAC_Pseudo_REGNUM): Define.
(enum reg_class): Add PAC_REG entry.
* config/arm/arm.md (RA_AUTH_CODE): Define.

gcc/testsuite/ChangeLog:

2021-11-12  Srinath Parvathaneni  

* g++.target/arm/pac-1.C: New test.


### Attachment also inlined for ease of reply###


diff --git a/gcc/config/arm/aout.h b/gcc/config/arm/aout.h
index 
25a2812a663742893b928398b0d3948e97f1905b..c69e299e012f46c8d0711830125dbf2f6b2e93d7
 100644
--- a/gcc/config/arm/aout.h
+++ b/gcc/config/arm/aout.h
@@ -74,7 +74,8 @@
   "wr8",   "wr9",   "wr10",  "wr11",   \
   "wr12",  "wr13",  "wr14",  "wr15",   \
   "wcgr0", "wcgr1", "wcgr2", "wcgr3",  \
-  "cc", "vfpcc", "sfp", "afp", "apsrq", "apsrge", "p0" \
+  "cc", "vfpcc", "sfp", "afp", "apsrq", "apsrge", "p0",\
+  "ra_auth_code"   \
 }
 #endif

diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 
8e6ef41f6b065217d1af3f4f1cb85b2d8fbd0dc0..f31944e85c9ab83501f156d138e2aea1bcb5b79d
 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -815,7 +815,8 @@ extern const int arm_arch_cde_coproc_bits[];
 s16-s31   S  VFP variable (aka d8-d15).
 vfpcc   Not a real register.  Represents the VFP condition
 code flags.
-   vpr Used to represent MVE VPR predication.  */
+   vpr Used to represent MVE VPR predication.
+   ra_auth_codePseudo register to save PAC.  */

 /* The stack backtrace structure is as follows:
   fp points to here:  |  save code pointer  |  [fp]
@@ -856,7 +857,7 @@ extern const int arm_arch_cde_coproc_bits[];
   1,1,1,1,1,1,1,1, \
   1,1,1,1, \
   /* Specials.  */ \
-  1,1,1,1,1,1,1\
+  1,1,1,1,1,1,1,1  \
 }

 /* 1 for registers not available across function calls.
@@ -886,7 +887,7 @@ extern const int arm_arch_cde_coproc_bits[];
   1,1,1,1,1,1,1,1, \
   1,1,1,1, \
   /* Specials.  */ \
-  1,1,1,1,1,1,1\
+  1,1,1,1,1,1,1,1  \
 }

 #ifndef SUBTARGET_CONDITIONAL_REGISTER_USAGE
@@ -1062,10 +1063,10 @@ extern const int arm_arch_cde_coproc_bits[];
&& (LAST_VFP_REGNUM - (REGNUM) >= 2 * (N) - 1))

 /* The number of hard registers is 16 ARM + 1 CC + 1 SFP + 1 AFP
-   + 1 APSRQ + 1 APSRGE + 1 VPR.  */
+   + 1 APSRQ + 1 APSRGE + 1 VPR + 1 Pseudo register to save PAC.  */
 /* Intel Wireless MMX Technology registers add 16 + 4 more.  */
 /* VFP (VFP3) adds 32 (64) + 1 VFPCC.  */
-#define FIRST_PSEUDO_REGISTER   107
+#define FIRST_PSEUDO_REGISTER   108

 #define DBX_REGISTER_NUMBER(REGNO) arm_dbx_register_number (REGNO)

@@ -1248,12 +1249,

Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only

2022-01-12 Thread Bill Schmidt via Gcc-patches
I think we need a fix or a revert for this today, please.  Bootstrap has been 
broken
for a couple of days during the last week of stage 3, which is really 
problematic.

Thanks,
Bill

On 1/12/22 6:57 AM, Richard Biener via Gcc-patches wrote:
> On Wed, 12 Jan 2022, Andre Vieira (lists) wrote:
>
>> On 12/01/2022 11:59, Richard Biener wrote:
>>> On Wed, 12 Jan 2022, Andre Vieira (lists) wrote:
>>>
 On 12/01/2022 11:44, Richard Sandiford wrote:
> Another alternative would be to push autodetected_vector_mode when the
> length is 1 and keep 1 as the starting point.
>
> Richard
 I'm guessing we would still want to skip epilogue vectorization if
 !VECTOR_MODE_P (autodetected_vector_mode) in that case?
>>> Practically we currently only support fixed width word_mode there,
>>> but eventually one could end up with 64bit DImode for the main loop
>>> and 32bit V4QImode in the epilogue ... so not sure if it's worth
>>> special-casing.  But I don't mind adding that skip.
>>>
>>> Richard.
>> I left out the skip, it shouldn't break anything as it would try that same
>> mode before anyway.
>> Just to clarify what I meant though was to skip if autodetected_vector_mode
>> wasn't a vector AND the target didn't define autovectorize_vector_modes, so 
>> in
>> that scenario it wouldn't ever try  V4QImode for the epilogue if the mainloop
>> was autodetected DImode, I think...
>> Either way, this is less code, less complicated and doesn't analyze more than
>> it did before the original patch, so I'm happy with that too.
>>
>> Is this what you had in mind?
> -  mode_i = 1;
> +  if (vector_modes.length () == 1)
> +{
> +  /* If we only had VOIDmode then use AUTODETECTED_VECTOR_MODE to see
> if
> +an epilogue can be created with that mode.  */
> +  vector_modes[0] = autodetected_vector_mode;
> +  mode_i = 0;
> +}
> +  else
> +mode_i = 1;
> +
>
> I would have left out the condition and unconditionally do
>
>   vector_modes[0] = autodetected_vector_mode;
>   mode_i = 0;
>
> but OK if you think it makes sense to special case length == 1.
>
> Richard.


Re: [PATCH] Fix redefined macro for epiphany.

2022-01-12 Thread Jeff Law via Gcc-patches




On 1/12/2022 2:47 AM, Martin Liška wrote:

The following warning is emitted gazillion times.

Fixes:

In file included from ./tm.h:23,
 from gcc/genconfig.c:25:
gcc/config/elfos.h:209: warning: "READONLY_DATA_SECTION_ASM_OP" redefined
  209 | #define READONLY_DATA_SECTION_ASM_OP "\t.section\t.rodata"
  |
In file included from ./tm.h:21,
 from gcc/genconfig.c:25:
gcc/config/epiphany/epiphany.h:671: note: this is the location of the 
previous definition

  671 | #define READONLY_DATA_SECTION_ASM_OP    "\t.section .rodata"

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

* config/elfos.h (READONLY_DATA_SECTION_ASM_OP): Define only if
not defined.

Please do.

However, I think we should deprecate the epiphany.  It's been broken for 
~2 years randomly failing in reload.  I tried to fix it once and quickly 
gave up.  It currently ignores all testing failures in my tester as it's 
been horribly unstable.


jeff


Re: [PATCH] ira: Fix old-reload targets [PR103974]

2022-01-12 Thread Vladimir Makarov via Gcc-patches



On 2022-01-12 03:47, Richard Biener wrote:

On Tue, Jan 11, 2022 at 7:41 PM Vladimir Makarov via Gcc-patches
 wrote:


On 2022-01-11 12:42, Richard Sandiford wrote:

The new IRA heuristics would need more work on old-reload targets,
since flattening needs to be able to undo the cost propagation.
It's doable, but hardly seems worth it.

Agree. It is not worth to spend your time for work for reload.

This patch therefore makes all the new calls to
ira_subloop_allocnos_can_differ_p return false if !ira_use_lra_p.
The color_pass code that predated the new function (and that was
the source of ira_subloop_allocnos_can_differ_p) continues to
behave as before.

It's a hack, but at least it has the advantage that the new parameter
would become obviously unused if reload and (!)ira_use_lra_p were
removed.  The hack should therefore disappear alongside reload.

I have a feeling that it will stay for a long time if not forever.

We indeed seem to have 34 targets w/o LRA by default and only 15 with :/

At some point Eric wrote a nice summary for how to transition targets
away from CC0, I wonder if there's something similar for transitioning
a port away from reload to LRA?  In those 34 targets there must be some
for which that's a relatively easy task?  I suppose it depends on how
much of the reload target hooks are put to use and how those translate
to LRA.


First of all the target should be rid of using CC0.  Then theoretically 
:) the target should work with LRA as LRA uses existing reload hooks 
(more accurately a subset of them).


In practice some work should be done for switching to LRA.  I did first 
4 major targets to work with LRA and unfortunately did not find some 
repeating patterns in this work.  The problems for the first targets 
were mostly unique and required a lot of LRA code modifications.  After 
that people did other target switching pretty easily and spent few time 
for this as I remember.


So based on my experience of porting targets to LRA I can not formalize 
this work.  But probably it can be done by examining all LRA targets 
code (mostly looking at machine dependent code related to use 
lra_in_progress_p) or by collecting information what was done from other 
people who did porting to LRA.




Re: [vect] PR103971, PR103977: Fix epilogue mode selection for autodetect only

2022-01-12 Thread Andre Vieira (lists) via Gcc-patches



On 12/01/2022 12:57, Richard Biener wrote:

On Wed, 12 Jan 2022, Andre Vieira (lists) wrote:


On 12/01/2022 11:59, Richard Biener wrote:

On Wed, 12 Jan 2022, Andre Vieira (lists) wrote:


On 12/01/2022 11:44, Richard Sandiford wrote:

Another alternative would be to push autodetected_vector_mode when the
length is 1 and keep 1 as the starting point.

Richard

I'm guessing we would still want to skip epilogue vectorization if
!VECTOR_MODE_P (autodetected_vector_mode) in that case?

Practically we currently only support fixed width word_mode there,
but eventually one could end up with 64bit DImode for the main loop
and 32bit V4QImode in the epilogue ... so not sure if it's worth
special-casing.  But I don't mind adding that skip.

Richard.

I left out the skip, it shouldn't break anything as it would try that same
mode before anyway.
Just to clarify what I meant though was to skip if autodetected_vector_mode
wasn't a vector AND the target didn't define autovectorize_vector_modes, so in
that scenario it wouldn't ever try  V4QImode for the epilogue if the mainloop
was autodetected DImode, I think...
Either way, this is less code, less complicated and doesn't analyze more than
it did before the original patch, so I'm happy with that too.

Is this what you had in mind?

-  mode_i = 1;
+  if (vector_modes.length () == 1)
+{
+  /* If we only had VOIDmode then use AUTODETECTED_VECTOR_MODE to see
if
+an epilogue can be created with that mode.  */
+  vector_modes[0] = autodetected_vector_mode;
+  mode_i = 0;
+}
+  else
+mode_i = 1;
+

I would have left out the condition and unconditionally do

   vector_modes[0] = autodetected_vector_mode;
   mode_i = 0;

but OK if you think it makes sense to special case length == 1.

Richard.


Tested without the special casing, all good, only have performance 
regressions left (which I'm working on), will commit this to fix the ICEs.


Thanks,
Andre



Re: [PATCH, rs6000] Enable absolute jump table by default

2022-01-12 Thread David Edelsohn via Gcc-patches
On Wed, Jan 12, 2022 at 7:22 AM HAO CHEN GUI  wrote:
>
> Hi,
>This patch enables absolute jump table by default on rs6000. The relative 
> jump tables are used when
>it's explicit set by "rs6000_relative_jumptables",
>or jump tables are placed in text section but global relocation is 
> required.
>
>Bootstrapped and tested on powerpc64-linux BE and LE with no regressions. 
> Is this okay for trunk?
> Any recommendations? Thanks a lot.
>
> ChangeLog
> 2022-01-12 Haochen Gui 
>
> gcc/
> * config/rs6000/linux64.h (JUMP_TABLES_IN_TEXT_SECTION): Define.
> * config/rs6000/rs6000.c (rs6000_gen_pic_addr_diff_vec): Return
> true when relative jump table is explicit required or jump tables have
> to be put in text section but global relocation is also required.
> * config/rs6000/rs6000.opt (rs6000_relative_jumptables): Disable.
>
> patch.diff
> diff --git a/gcc/config/rs6000/linux64.h b/gcc/config/rs6000/linux64.h
> index d617f346f81..2e257c60f8c 100644
> --- a/gcc/config/rs6000/linux64.h
> +++ b/gcc/config/rs6000/linux64.h
> @@ -239,7 +239,7 @@ extern int dot_symbols;
>
>  /* Indicate that jump tables go in the text section.  */
>  #undef  JUMP_TABLES_IN_TEXT_SECTION
> -#define JUMP_TABLES_IN_TEXT_SECTION TARGET_64BIT
> +#define JUMP_TABLES_IN_TEXT_SECTION 0
>
>  /* The linux ppc64 ABI isn't explicit on whether aggregates smaller
> than a doubleword should be padded upward or downward.  You could
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index 319182e94d9..9fba893a27a 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -28465,7 +28465,9 @@ rs6000_emit_xxspltidp_v2df (rtx dst, long value)
>  static bool
>  rs6000_gen_pic_addr_diff_vec (void)
>  {
> -  return rs6000_relative_jumptables;
> +  return rs6000_relative_jumptables
> +|| (JUMP_TABLES_IN_TEXT_SECTION
> +&& targetm.asm_out.reloc_rw_mask () == 3);
>  }

This seems like contorted logic and overriding the
rs6000_relative_jumptables option change.  The later part of the patch
overrides rs6000_relative_jumptables for all rs6000 configurations,
and then changes this one use of rs6000_relative_jumptables to add
more logic to revert to the old meaning for some targets.

What about all of the other uses of rs6000_relative_jumptables in the
target?  What about rs6000.md?

I highly doubt that this patch is correct.

Why not override rs6000_relative_jumptables for PPC64 Linux instead of
changing its value globally and then trying to fix it up?

Thanks, David

>
>  void
> diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
> index c2a77182a9e..75e3fa86829 100644
> --- a/gcc/config/rs6000/rs6000.opt
> +++ b/gcc/config/rs6000/rs6000.opt
> @@ -630,7 +630,7 @@ Target Mask(MMA) Var(rs6000_isa_flags)
>  Generate (do not generate) MMA instructions.
>
>  mrelative-jumptables
> -Target Undocumented Var(rs6000_relative_jumptables) Init(1) Save
> +Target Undocumented Var(rs6000_relative_jumptables) Init(0) Save
>
>  mrop-protect
>  Target Var(rs6000_rop_protect) Init(0)


[PATCH] tree-optimization/103990 - fix CFG cleanup regression from PRE change

2022-01-12 Thread Richard Biener via Gcc-patches
This adjusts the CFG cleanup flow back to what it was before the
last change which fixes the observed regression of 541.leela_r with
LTO and FDO.

Boostrapped on x86_64-unknown-linux-gnu, testing in progress.

2022-01-12  Richard Biener  

PR tree-optimization/103990
* tree-pass.h (tail_merge_optimize): Drop unused argument.
* tree-ssa-tail-merge.c (tail_merge_optimize): Likewise.
* tree-ssa-pre.c (pass_pre::execute): Retain TODO_cleanup_cfg
and adjust call to tail_merge_optimize.
---
 gcc/tree-pass.h   | 2 +-
 gcc/tree-ssa-pre.c| 6 ++
 gcc/tree-ssa-tail-merge.c | 4 ++--
 3 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 36097cf2736..606d1d60b85 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -412,7 +412,7 @@ extern gimple_opt_pass *make_pass_early_thread_jumps 
(gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_split_crit_edges (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_laddress (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_pre (gcc::context *ctxt);
-extern unsigned int tail_merge_optimize (unsigned int, bool);
+extern unsigned int tail_merge_optimize (bool);
 extern gimple_opt_pass *make_pass_profile (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_strip_predict_hints (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_lower_complex_O0 (gcc::context *ctxt);
diff --git a/gcc/tree-ssa-pre.c b/gcc/tree-ssa-pre.c
index ab24fa98a1f..5113256802e 100644
--- a/gcc/tree-ssa-pre.c
+++ b/gcc/tree-ssa-pre.c
@@ -4442,7 +4442,6 @@ pass_pre::execute (function *fun)
   if (todo & TODO_cleanup_cfg)
 {
   cleanup_tree_cfg ();
-  todo &= ~TODO_cleanup_cfg;
   need_crit_edge_split = true;
 }
 
@@ -4458,9 +4457,8 @@ pass_pre::execute (function *fun)
  It should either:
  - call merge_blocks after each tail merge iteration
  - call merge_blocks after all tail merge iterations
- - mark TODO_cleanup_cfg when necessary
- - share the cfg cleanup with fini_pre.  */
-  todo |= tail_merge_optimize (todo, need_crit_edge_split);
+ - mark TODO_cleanup_cfg when necessary.  */
+  todo |= tail_merge_optimize (need_crit_edge_split);
 
   free_rpo_vn ();
 
diff --git a/gcc/tree-ssa-tail-merge.c b/gcc/tree-ssa-tail-merge.c
index fd333800f0f..8e1ea1a982e 100644
--- a/gcc/tree-ssa-tail-merge.c
+++ b/gcc/tree-ssa-tail-merge.c
@@ -1724,7 +1724,7 @@ update_debug_stmts (void)
 /* Runs tail merge optimization.  */
 
 unsigned int
-tail_merge_optimize (unsigned int todo, bool need_crit_edge_split)
+tail_merge_optimize (bool need_crit_edge_split)
 {
   int nr_bbs_removed_total = 0;
   int nr_bbs_removed;
@@ -1814,5 +1814,5 @@ tail_merge_optimize (unsigned int todo, bool 
need_crit_edge_split)
 
   timevar_pop (TV_TREE_TAIL_MERGE);
 
-  return todo;
+  return 0;
 }
-- 
2.31.1


Re: [PATCH] Fix redefined macro for epiphany.

2022-01-12 Thread Richard Biener via Gcc-patches
On Wed, Jan 12, 2022 at 3:23 PM Jeff Law via Gcc-patches
 wrote:
>
>
>
> On 1/12/2022 2:47 AM, Martin Liška wrote:
> > The following warning is emitted gazillion times.
> >
> > Fixes:
> >
> > In file included from ./tm.h:23,
> >  from gcc/genconfig.c:25:
> > gcc/config/elfos.h:209: warning: "READONLY_DATA_SECTION_ASM_OP" redefined
> >   209 | #define READONLY_DATA_SECTION_ASM_OP "\t.section\t.rodata"
> >   |
> > In file included from ./tm.h:21,
> >  from gcc/genconfig.c:25:
> > gcc/config/epiphany/epiphany.h:671: note: this is the location of the
> > previous definition
> >   671 | #define READONLY_DATA_SECTION_ASM_OP"\t.section .rodata"
> >
> > Ready to be installed?
> > Thanks,
> > Martin
> >
> > gcc/ChangeLog:
> >
> > * config/elfos.h (READONLY_DATA_SECTION_ASM_OP): Define only if
> > not defined.
> Please do.
>
> However, I think we should deprecate the epiphany.  It's been broken for
> ~2 years randomly failing in reload.  I tried to fix it once and quickly
> gave up.  It currently ignores all testing failures in my tester as it's
> been horribly unstable.

Joern is still listed as maintainer so let's CC him before doing
anything.  That said,
if it builds it's good enough for GCC 12 - we can discuss any target
deprecations
during stage1 (or deprecate those that don't even build right now,
though I'm not
aware of any).

Richard.

>
> jeff


[committed][nvptx] Improve gcc.target/nvptx/atomic_fetch-*.c test-cases

2022-01-12 Thread Tom de Vries via Gcc-patches
Hi,

Fix a few issues in test-cases gcc.target/nvptx/atomic_fetch-*.c:
- atomic_fetch-1.c uses scan-assembler instead of scan-assembler-times,
  which is less accurate
- atomic_fetch-2.c only contains negative testing using
  scan-assembler-not
- the test-cases use stack variables to generate generic addresses,
  while stack atomics are not natively supported
- the test-cases only test (64-bit) x (generic), instead of
  (32-bit, 64-bit) x (generic, global, shared)
- the test-cases use a hardcoded '0' instead of the clearer
  MEMMODEL_RELAXED

Tested on nvptx.

Committed to trunk.

Thanks,
- Tom

[nvptx] Improve gcc.target/nvptx/atomic_fetch-*.c test-cases

gcc/testsuite/ChangeLog:

2022-01-12  Tom de Vries  

* gcc.target/nvptx/atomic_fetch-1.c: Rewrite.
* gcc.target/nvptx/atomic_fetch-2.c: Rewrite.

---
 gcc/testsuite/gcc.target/nvptx/atomic_fetch-1.c | 98 ++---
 gcc/testsuite/gcc.target/nvptx/atomic_fetch-2.c | 92 ---
 2 files changed, 168 insertions(+), 22 deletions(-)

diff --git a/gcc/testsuite/gcc.target/nvptx/atomic_fetch-1.c 
b/gcc/testsuite/gcc.target/nvptx/atomic_fetch-1.c
index c637caa79a0..941cf3a2ab4 100644
--- a/gcc/testsuite/gcc.target/nvptx/atomic_fetch-1.c
+++ b/gcc/testsuite/gcc.target/nvptx/atomic_fetch-1.c
@@ -4,21 +4,97 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -misa=sm_35" } */
 
+enum memmodel
+{
+  MEMMODEL_RELAXED = 0
+};
+
+unsigned long long int *p64;
+unsigned int *p32;
+
+unsigned long long int g64;
+unsigned int g32;
+
+unsigned int s32 __attribute__((shared));
+unsigned long long int s64 __attribute__((shared));
+
+unsigned long long int v64;
+unsigned int v32;
+
 int
 main()
 {
-  unsigned long long a = ~0;
-  unsigned b = 0xa;
+  /* Generic.  */
 
-  __atomic_fetch_add (&a, b, 0);
-  __atomic_fetch_and (&a, b, 0);
-  __atomic_fetch_or (&a, b, 0);
-  __atomic_fetch_xor (&a, b, 0);
+  __atomic_fetch_add (p64, v64, MEMMODEL_RELAXED);
+  __atomic_fetch_and (p64, v64, MEMMODEL_RELAXED);
+  __atomic_fetch_or (p64, v64, MEMMODEL_RELAXED);
+  __atomic_fetch_xor (p64, v64, MEMMODEL_RELAXED);
   
-  return a;
+  __atomic_fetch_add (p32, v32, MEMMODEL_RELAXED);
+  __atomic_fetch_and (p32, v32, MEMMODEL_RELAXED);
+  __atomic_fetch_or (p32, v32, MEMMODEL_RELAXED);
+  __atomic_fetch_xor (p32, v32, MEMMODEL_RELAXED);
+
+  /* Global.  */
+
+  __atomic_fetch_add (&g64, v64, MEMMODEL_RELAXED);
+  __atomic_fetch_and (&g64, v64, MEMMODEL_RELAXED);
+  __atomic_fetch_or (&g64, v64, MEMMODEL_RELAXED);
+  __atomic_fetch_xor (&g64, v64, MEMMODEL_RELAXED);
+  
+  __atomic_fetch_add (&g32, v32, MEMMODEL_RELAXED);
+  __atomic_fetch_and (&g32, v32, MEMMODEL_RELAXED);
+  __atomic_fetch_or (&g32, v32, MEMMODEL_RELAXED);
+  __atomic_fetch_xor (&g32, v32, MEMMODEL_RELAXED);
+
+  /* Shared.  */
+
+  __atomic_fetch_add (&s64, v64, MEMMODEL_RELAXED);
+  __atomic_fetch_and (&s64, v64, MEMMODEL_RELAXED);
+  __atomic_fetch_or (&s64, v64, MEMMODEL_RELAXED);
+  __atomic_fetch_xor (&s64, v64, MEMMODEL_RELAXED);
+  
+  __atomic_fetch_add (&s32, v32, MEMMODEL_RELAXED);
+  __atomic_fetch_and (&s32, v32, MEMMODEL_RELAXED);
+  __atomic_fetch_or (&s32, v32, MEMMODEL_RELAXED);
+  __atomic_fetch_xor (&s32, v32, MEMMODEL_RELAXED);
+
+  return 0;
 }
 
-/* { dg-final { scan-assembler "atom.add.u64" } } */
-/* { dg-final { scan-assembler "atom.b64.and" } } */
-/* { dg-final { scan-assembler "atom.b64.or" } } */
-/* { dg-final { scan-assembler "atom.b64.xor" } } */
+/* Generic.  */
+
+/* { dg-final { scan-assembler-times "atom.add.u64" 1 } } */
+/* { dg-final { scan-assembler-times "atom.b64.and" 1 } } */
+/* { dg-final { scan-assembler-times "atom.b64.or" 1 } } */
+/* { dg-final { scan-assembler-times "atom.b64.xor" 1 } } */
+
+/* { dg-final { scan-assembler-times "atom.add.u32" 1 } } */
+/* { dg-final { scan-assembler-times "atom.b32.and" 1 } } */
+/* { dg-final { scan-assembler-times "atom.b32.or" 1 } } */
+/* { dg-final { scan-assembler-times "atom.b32.xor" 1 } } */
+
+/* Global.  */
+
+/* { dg-final { scan-assembler-times "atom.global.add.u64" 1 } } */
+/* { dg-final { scan-assembler-times "atom.global.b64.and" 1 } } */
+/* { dg-final { scan-assembler-times "atom.global.b64.or" 1 } } */
+/* { dg-final { scan-assembler-times "atom.global.b64.xor" 1 } } */
+
+/* { dg-final { scan-assembler-times "atom.global.add.u32" 1 } } */
+/* { dg-final { scan-assembler-times "atom.global.b32.and" 1 } } */
+/* { dg-final { scan-assembler-times "atom.global.b32.or" 1 } } */
+/* { dg-final { scan-assembler-times "atom.global.b32.xor" 1 } } */
+
+/* Shared.  */
+
+/* { dg-final { scan-assembler-times "atom.shared.add.u64" 1 } } */
+/* { dg-final { scan-assembler-times "atom.shared.b64.and" 1 } } */
+/* { dg-final { scan-assembler-times "atom.shared.b64.or" 1 } } */
+/* { dg-final { scan-assembler-times "atom.shared.b64.xor" 1 } } */
+
+/* { dg-final { scan-assembler-times "atom.shared.add.u32" 1 } } */
+/* { dg-final { scan-assembler-times "atom.shared.b32.and" 1 } } */
+/* { d

[committed][nvptx] Add gcc.target/nvptx/atomic-exchange-*.c test-cases

2022-01-12 Thread Tom de Vries via Gcc-patches
Hi,

Add a few test-cases that test expansion of __atomic_exchange.

Tested on nvptx.

Committed to trunk.

Thanks,
- Tom

[nvptx] Add gcc.target/nvptx/atomic-exchange-*.c test-cases

gcc/testsuite/ChangeLog:

2022-01-12  Tom de Vries  

* gcc.target/nvptx/atomic-exchange-1.c: New test.
* gcc.target/nvptx/atomic-exchange-2.c: New test.
* gcc.target/nvptx/atomic-exchange-3.c: New test.
* gcc.target/nvptx/atomic-exchange-4.c: New test.

---
 gcc/testsuite/gcc.target/nvptx/atomic-exchange-1.c | 39 
 gcc/testsuite/gcc.target/nvptx/atomic-exchange-2.c | 33 ++
 gcc/testsuite/gcc.target/nvptx/atomic-exchange-3.c | 33 ++
 gcc/testsuite/gcc.target/nvptx/atomic-exchange-4.c | 74 ++
 4 files changed, 179 insertions(+)

diff --git a/gcc/testsuite/gcc.target/nvptx/atomic-exchange-1.c 
b/gcc/testsuite/gcc.target/nvptx/atomic-exchange-1.c
new file mode 100644
index 000..c63f52b168c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nvptx/atomic-exchange-1.c
@@ -0,0 +1,39 @@
+/* Test the atomic exchange expansion, shared state space.  */
+
+/* { dg-do compile } */
+/* { dg-options "-Wno-long-long" } */
+
+enum memmodel
+{
+  MEMMODEL_SEQ_CST = 5
+};
+
+unsigned char u8 __attribute__((shared));
+unsigned short u16 __attribute__((shared));
+unsigned int u32 __attribute__((shared));
+unsigned long long int u64 __attribute__((shared));
+
+int
+main()
+{
+  __atomic_exchange_n (&u8, 0, MEMMODEL_SEQ_CST);
+  __atomic_exchange_n (&u16, 0, MEMMODEL_SEQ_CST);
+  __atomic_exchange_n (&u32, 0, MEMMODEL_SEQ_CST);
+  __atomic_exchange_n (&u64, 0, MEMMODEL_SEQ_CST);
+
+  return 0;
+}
+
+
+/* Not ptx-native, fallback to libatomic.
+   Libatomic uses generic addressing with a global lock and membar.sys 
barriers.
+   We could implement these more efficiently by cloning libatomic for .shared,
+   using a per-CTA lock and membar.cta barrier.  But we'd expect
+   performance-critical code to use the ptx-native atomic sizes 32 and 64 bit,
+   so that doesn't seem to be worth the trouble.  */
+/* { dg-final { scan-assembler-times "(?n)call .* __atomic_exchange_1" 1 } } */
+/* { dg-final { scan-assembler-times "(?n)call .* __atomic_exchange_2" 1 } } */
+
+/* { dg-final { scan-assembler-times "atom.shared.exch.b32" 1 } } */
+/* { dg-final { scan-assembler-times "atom.shared.exch.b64" 1 } } */
+/* { dg-final { scan-assembler-times "membar.cta" 4 } } */
diff --git a/gcc/testsuite/gcc.target/nvptx/atomic-exchange-2.c 
b/gcc/testsuite/gcc.target/nvptx/atomic-exchange-2.c
new file mode 100644
index 000..4301e74e94e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nvptx/atomic-exchange-2.c
@@ -0,0 +1,33 @@
+/* Test the atomic exchange expansion, global state space.  */
+
+/* { dg-do compile } */
+/* { dg-options "-Wno-long-long" } */
+
+enum memmodel
+{
+  MEMMODEL_SEQ_CST = 5
+};
+
+unsigned char u8;
+unsigned short u16;
+unsigned int u32;
+unsigned long long int u64;
+
+int
+main()
+{
+  __atomic_exchange_n (&u8, 0, MEMMODEL_SEQ_CST);
+  __atomic_exchange_n (&u16, 0, MEMMODEL_SEQ_CST);
+  __atomic_exchange_n (&u32, 0, MEMMODEL_SEQ_CST);
+  __atomic_exchange_n (&u64, 0, MEMMODEL_SEQ_CST);
+
+  return 0;
+}
+
+/* Not ptx-native, fallback to libatomic.  */
+/* { dg-final { scan-assembler-times "(?n)call .* __atomic_exchange_1" 1 } } */
+/* { dg-final { scan-assembler-times "(?n)call .* __atomic_exchange_2" 1 } } */
+
+/* { dg-final { scan-assembler-times "atom.global.exch.b32" 1 } } */
+/* { dg-final { scan-assembler-times "atom.global.exch.b64" 1 } } */
+/* { dg-final { scan-assembler-times "membar.sys" 4 } } */
diff --git a/gcc/testsuite/gcc.target/nvptx/atomic-exchange-3.c 
b/gcc/testsuite/gcc.target/nvptx/atomic-exchange-3.c
new file mode 100644
index 000..2f8232f25eb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nvptx/atomic-exchange-3.c
@@ -0,0 +1,33 @@
+/* Test the atomic exchange expansion, generic addressing.  */
+
+/* { dg-do compile } */
+/* { dg-options "-Wno-long-long" } */
+
+enum memmodel
+{
+  MEMMODEL_SEQ_CST = 5
+};
+
+unsigned char *u8;
+unsigned short *u16;
+unsigned int *u32;
+unsigned long long int *u64;
+
+int
+main()
+{
+  __atomic_exchange_n (u8, 0, MEMMODEL_SEQ_CST);
+  __atomic_exchange_n (u16, 0, MEMMODEL_SEQ_CST);
+  __atomic_exchange_n (u32, 0, MEMMODEL_SEQ_CST);
+  __atomic_exchange_n (u64, 0, MEMMODEL_SEQ_CST);
+
+  return 0;
+}
+
+/* Not ptx-native, fallback to libatomic.  */
+/* { dg-final { scan-assembler-times "(?n)call .* __atomic_exchange_1" 1 } } */
+/* { dg-final { scan-assembler-times "(?n)call .* __atomic_exchange_2" 1 } } */
+
+/* { dg-final { scan-assembler-times "atom.exch.b32" 1 } } */
+/* { dg-final { scan-assembler-times "atom.exch.b64" 1 } } */
+/* { dg-final { scan-assembler-times "membar.sys" 4 } } */
diff --git a/gcc/testsuite/gcc.target/nvptx/atomic-exchange-4.c 
b/gcc/testsuite/gcc.target/nvptx/atomic-exchange-4.c
new file mode 100644
index 000..de1d395cccf
--- /dev/null
+++ b/gcc/testsuite/gcc.targ

Re: [PATCH] ira: Fix old-reload targets [PR103974]

2022-01-12 Thread Richard Biener via Gcc-patches
On Wed, Jan 12, 2022 at 3:26 PM Vladimir Makarov  wrote:
>
>
> On 2022-01-12 03:47, Richard Biener wrote:
> > On Tue, Jan 11, 2022 at 7:41 PM Vladimir Makarov via Gcc-patches
> >  wrote:
> >>
> >> On 2022-01-11 12:42, Richard Sandiford wrote:
> >>> The new IRA heuristics would need more work on old-reload targets,
> >>> since flattening needs to be able to undo the cost propagation.
> >>> It's doable, but hardly seems worth it.
> >> Agree. It is not worth to spend your time for work for reload.
> >>> This patch therefore makes all the new calls to
> >>> ira_subloop_allocnos_can_differ_p return false if !ira_use_lra_p.
> >>> The color_pass code that predated the new function (and that was
> >>> the source of ira_subloop_allocnos_can_differ_p) continues to
> >>> behave as before.
> >>>
> >>> It's a hack, but at least it has the advantage that the new parameter
> >>> would become obviously unused if reload and (!)ira_use_lra_p were
> >>> removed.  The hack should therefore disappear alongside reload.
> >> I have a feeling that it will stay for a long time if not forever.
> > We indeed seem to have 34 targets w/o LRA by default and only 15 with :/
> >
> > At some point Eric wrote a nice summary for how to transition targets
> > away from CC0, I wonder if there's something similar for transitioning
> > a port away from reload to LRA?  In those 34 targets there must be some
> > for which that's a relatively easy task?  I suppose it depends on how
> > much of the reload target hooks are put to use and how those translate
> > to LRA.
>
> First of all the target should be rid of using CC0.  Then theoretically
> :) the target should work with LRA as LRA uses existing reload hooks
> (more accurately a subset of them).

CC0 is no more, we've accomplished that feat for GCC 12!

> In practice some work should be done for switching to LRA.  I did first
> 4 major targets to work with LRA and unfortunately did not find some
> repeating patterns in this work.  The problems for the first targets
> were mostly unique and required a lot of LRA code modifications.  After
> that people did other target switching pretty easily and spent few time
> for this as I remember.
>
> So based on my experience of porting targets to LRA I can not formalize
> this work.  But probably it can be done by examining all LRA targets
> code (mostly looking at machine dependent code related to use
> lra_in_progress_p) or by collecting information what was done from other
> people who did porting to LRA.

So in theory it might be just pulling the switch for some?  That is,
removing their definition of TARGET_LRA_P which then defaults
to true?

Jeff might be able to test this for (all) targets on his harness.

Richard.


[committed] analyzer: complain about tainted sizes with "access" attribute [PR103940]

2022-01-12 Thread David Malcolm via Gcc-patches
GCC 10 gained the "access" function and type attribute, which
optionally can take a size-index param:
  https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html

-fanalyzer in trunk (for GCC 12) has gained a -Wanalyzer-tainted-size to
complain about attacker-controlled size values, but this was only being
used deep inside the region-model code when handling the hardcoded known
behavior of certain functions (memset, IIRC).

This patch extends -Wanalyzer-tainted-size to also complain about
unsanitized attacker-controlled values being passed to function
parameters marked as a size via the "access" attribute.

Note that -fanalyzer-checker=taint is currently required in
addition to -fanalyzer to use this warning, due to scaling issues
(see bug 103533).

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r12-6526-g2c16dfe6268eeeb4b7924ff423e274fa00894a4d.

gcc/analyzer/ChangeLog:
PR analyzer/103940
* engine.cc (impl_sm_context::impl_sm_context): Add
"unknown_side_effects" param and use it to initialize
new m_unknown_side_effects field.
(impl_sm_context::unknown_side_effects_p): New.
(impl_sm_context::m_unknown_side_effects): New.
(exploded_node::on_stmt): Pass unknown_side_effects to sm_ctxt
ctor.
* sm-taint.cc: Include "stringpool.h" and "attribs.h".
(tainted_size::tainted_size): Drop "dir" param.
(tainted_size::get_kind): Drop "FINAL".
(tainted_size::emit): Likewise.
(tainted_size::m_dir): Drop unused field.
(class tainted_access_attrib_size): New subclass.
(taint_state_machine::on_stmt): Call check_for_tainted_size_arg on
external functions with unknown side effects.
(taint_state_machine::check_for_tainted_size_arg): New.
(region_model::check_region_for_taint): Drop "dir" param from
tainted_size ctor.
* sm.h (sm_context::unknown_side_effects_p): New.

gcc/testsuite/ChangeLog:
PR analyzer/103940
* gcc.dg/analyzer/taint-size-access-attr-1.c: New test.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/engine.cc|  17 ++-
 gcc/analyzer/sm-taint.cc  | 116 --
 gcc/analyzer/sm.h |   3 +
 .../analyzer/taint-size-access-attr-1.c   |  63 ++
 4 files changed, 187 insertions(+), 12 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/taint-size-access-attr-1.c

diff --git a/gcc/analyzer/engine.cc b/gcc/analyzer/engine.cc
index 18fc00d3857..f2a14f52caa 100644
--- a/gcc/analyzer/engine.cc
+++ b/gcc/analyzer/engine.cc
@@ -293,14 +293,16 @@ public:
   const sm_state_map *old_smap,
   sm_state_map *new_smap,
   path_context *path_ctxt,
-  stmt_finder *stmt_finder = NULL)
+  stmt_finder *stmt_finder = NULL,
+  bool unknown_side_effects = false)
   : sm_context (sm_idx, sm),
 m_logger (eg.get_logger ()),
 m_eg (eg), m_enode_for_diag (enode_for_diag),
 m_old_state (old_state), m_new_state (new_state),
 m_old_smap (old_smap), m_new_smap (new_smap),
 m_path_ctxt (path_ctxt),
-m_stmt_finder (stmt_finder)
+m_stmt_finder (stmt_finder),
+m_unknown_side_effects (unknown_side_effects)
   {
   }
 
@@ -490,6 +492,11 @@ public:
 return m_path_ctxt;
   }
 
+  bool unknown_side_effects_p () const FINAL OVERRIDE
+  {
+return m_unknown_side_effects;
+  }
+
   log_user m_logger;
   exploded_graph &m_eg;
   exploded_node *m_enode_for_diag;
@@ -499,6 +506,9 @@ public:
   sm_state_map *m_new_smap;
   path_context *m_path_ctxt;
   stmt_finder *m_stmt_finder;
+
+  /* Are we handling an external function with unknown side effects?  */
+  bool m_unknown_side_effects;
 };
 
 /* Subclass of stmt_finder for finding the best stmt to report the leak at,
@@ -1325,7 +1335,8 @@ exploded_node::on_stmt (exploded_graph &eg,
= old_state.m_checker_states[sm_idx];
   sm_state_map *new_smap = state->m_checker_states[sm_idx];
   impl_sm_context sm_ctxt (eg, sm_idx, sm, this, &old_state, state,
-  old_smap, new_smap, path_ctxt);
+  old_smap, new_smap, path_ctxt, NULL,
+  unknown_side_effects);
 
   /* Allow the state_machine to handle the stmt.  */
   if (sm.on_stmt (&sm_ctxt, snode, stmt))
diff --git a/gcc/analyzer/sm-taint.cc b/gcc/analyzer/sm-taint.cc
index 8f274df997a..54c7e6015ab 100644
--- a/gcc/analyzer/sm-taint.cc
+++ b/gcc/analyzer/sm-taint.cc
@@ -41,6 +41,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "cgraph.h"
 #include "cfg.h"
 #include "digraph.h"
+#include "stringpool.h"
+#include "attribs.h"
 #include "analyzer/supergraph.h"
 #include "analyzer/call-string.h"
 #include "analyzer/program-point.h"
@@ -102,6 +104,13 @@ public:
 
   state_t combine_states (state_t s0, state_t s1)

Re: [PATCH] ira: Fix old-reload targets [PR103974]

2022-01-12 Thread Richard Biener via Gcc-patches
On Wed, Jan 12, 2022 at 4:00 PM Richard Biener
 wrote:
>
> On Wed, Jan 12, 2022 at 3:26 PM Vladimir Makarov  wrote:
> >
> >
> > On 2022-01-12 03:47, Richard Biener wrote:
> > > On Tue, Jan 11, 2022 at 7:41 PM Vladimir Makarov via Gcc-patches
> > >  wrote:
> > >>
> > >> On 2022-01-11 12:42, Richard Sandiford wrote:
> > >>> The new IRA heuristics would need more work on old-reload targets,
> > >>> since flattening needs to be able to undo the cost propagation.
> > >>> It's doable, but hardly seems worth it.
> > >> Agree. It is not worth to spend your time for work for reload.
> > >>> This patch therefore makes all the new calls to
> > >>> ira_subloop_allocnos_can_differ_p return false if !ira_use_lra_p.
> > >>> The color_pass code that predated the new function (and that was
> > >>> the source of ira_subloop_allocnos_can_differ_p) continues to
> > >>> behave as before.
> > >>>
> > >>> It's a hack, but at least it has the advantage that the new parameter
> > >>> would become obviously unused if reload and (!)ira_use_lra_p were
> > >>> removed.  The hack should therefore disappear alongside reload.
> > >> I have a feeling that it will stay for a long time if not forever.
> > > We indeed seem to have 34 targets w/o LRA by default and only 15 with :/
> > >
> > > At some point Eric wrote a nice summary for how to transition targets
> > > away from CC0, I wonder if there's something similar for transitioning
> > > a port away from reload to LRA?  In those 34 targets there must be some
> > > for which that's a relatively easy task?  I suppose it depends on how
> > > much of the reload target hooks are put to use and how those translate
> > > to LRA.
> >
> > First of all the target should be rid of using CC0.  Then theoretically
> > :) the target should work with LRA as LRA uses existing reload hooks
> > (more accurately a subset of them).
>
> CC0 is no more, we've accomplished that feat for GCC 12!
>
> > In practice some work should be done for switching to LRA.  I did first
> > 4 major targets to work with LRA and unfortunately did not find some
> > repeating patterns in this work.  The problems for the first targets
> > were mostly unique and required a lot of LRA code modifications.  After
> > that people did other target switching pretty easily and spent few time
> > for this as I remember.
> >
> > So based on my experience of porting targets to LRA I can not formalize
> > this work.  But probably it can be done by examining all LRA targets
> > code (mostly looking at machine dependent code related to use
> > lra_in_progress_p) or by collecting information what was done from other
> > people who did porting to LRA.
>
> So in theory it might be just pulling the switch for some?  That is,
> removing their definition of TARGET_LRA_P which then defaults
> to true?

I can confirm it works for visium which builds after such change and
can compile a hello world without crashing.  So it might be a viable
strathegy for the less maintained odd architectures we still have in
our tree.

> Jeff might be able to test this for (all) targets on his harness.
>
> Richard.


Re: PING^2 (C/C++): Re: [PATCH 6/6] Add __attribute__ ((tainted))

2022-01-12 Thread David Malcolm via Gcc-patches
On Tue, 2022-01-11 at 23:36 -0500, Jason Merrill wrote:
> On 1/10/22 16:36, David Malcolm via Gcc-patches wrote:
> > On Thu, 2022-01-06 at 09:08 -0500, David Malcolm wrote:
> > > On Sat, 2021-11-13 at 15:37 -0500, David Malcolm wrote:
> > > > This patch adds a new __attribute__ ((tainted)) to the C/C++
> > > > frontends.
> > > 
> > > Ping for GCC C/C++ mantainers for review of the C/C++ FE parts of
> > > this
> > > patch (attribute registration, documentation, the name of the
> > > attribute, etc).
> > > 
> > > (I believe it's independent of the rest of the patch kit, in that
> > > it
> > > could go into trunk without needing the prior patches)
> > > 
> > > Thanks
> > > Dave
> > 
> > Getting close to end of stage 3 for GCC 12, so pinging this patch
> > again...
> > 
> > https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584376.html
> 
> The c-family change is OK.

Thanks.

I'm retesting the patch now, but it now seems to me that
  __attribute__((tainted_args))
would lead to more readable code than:
  __attribute__((tainted))

in that the name "tainted_args" better conveys the idea that all
arguments are under attacker-control (as opposed to the body of the
function or the function pointer being under attacker-control).

Looking at
  https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html
we already have some attributes with underscores in their names.

Does this sound good?
Dave

> 
> > Thanks
> > Dave
> > 
> > > 
> > > 
> > > > 
> > > > It can be used on function decls: the analyzer will treat as
> > > > tainted
> > > > all parameters to the function and all buffers pointed to by
> > > > parameters
> > > > to the function.  Adding this in one place to the Linux kernel's
> > > > __SYSCALL_DEFINEx macro allows the analyzer to treat all syscalls
> > > > as
> > > > having tainted inputs.  This gives additional testing beyond e.g.
> > > > __user
> > > > pointers added by earlier patches - an example of the use of this
> > > > can
> > > > be
> > > > seen in CVE-2011-2210, where given:
> > > > 
> > > >   SYSCALL_DEFINE5(osf_getsysinfo, unsigned long, op, void __user
> > > > *,
> > > > buffer,
> > > >   unsigned long, nbytes, int __user *, start,
> > > > void
> > > > __user *, arg)
> > > > 
> > > > the analyzer will treat the nbytes param as under attacker
> > > > control,
> > > > and
> > > > can complain accordingly:
> > > > 
> > > > taint-CVE-2011-2210-1.c: In function ‘sys_osf_getsysinfo’:
> > > > taint-CVE-2011-2210-1.c:69:21: warning: use of attacker-
> > > > controlled
> > > > value
> > > >    ‘nbytes’ as size without upper-bounds checking [CWE-129] [-
> > > > Wanalyzer-tainted-size]
> > > >     69 | if (copy_to_user(buffer, hwrpb, nbytes)
> > > > != 0)
> > > >    | ^~~
> > > > 
> > > > Additionally, the patch allows the attribute to be used on field
> > > > decls:
> > > > specifically function pointers.  Any function used as an
> > > > initializer
> > > > for such a field gets treated as tainted.  An example can be seen
> > > > in
> > > > CVE-2020-13143, where adding __attribute__((tainted)) to the
> > > > "store"
> > > > callback of configfs_attribute:
> > > > 
> > > >    struct configfs_attribute {
> > > >   /* [...snip...] */
> > > >   ssize_t (*store)(struct config_item *, const char *,
> > > > size_t)
> > > >     __attribute__((tainted));
> > > >   /* [...snip...] */
> > > >    };
> > > > 
> > > > allows the analyzer to see:
> > > > 
> > > >   CONFIGFS_ATTR(gadget_dev_desc_, UDC);
> > > > 
> > > > and treat gadget_dev_desc_UDC_store as tainted, so that it
> > > > complains:
> > > > 
> > > > taint-CVE-2020-13143-1.c: In function
> > > > ‘gadget_dev_desc_UDC_store’:
> > > > taint-CVE-2020-13143-1.c:33:17: warning: use of attacker-
> > > > controlled
> > > > value
> > > >    ‘len + 18446744073709551615’ as offset without upper-bounds
> > > > checking [CWE-823] [-Wanalyzer-tainted-offset]
> > > >     33 | if (name[len - 1] == '\n')
> > > >    | ^
> > > > 
> > > > Similarly, the attribute could be used on the ioctl callback
> > > > field,
> > > > USB device callbacks, network-handling callbacks etc.  This
> > > > potentially
> > > > gives a lot of test coverage with relatively little code
> > > > annotation,
> > > > and
> > > > without necessarily needing link-time analysis (which -fanalyzer
> > > > can
> > > > only do at present on trivial examples).
> > > > 
> > > > I believe this is the first time we've had an attribute on a
> > > > field.
> > > > If that's an issue, I could prepare a version of the patch that
> > > > merely allowed it on functions themselves.
> > > > 
> > > > As before this currently still needs -fanalyzer-checker=taint (in
> > > > addition to -fanalyzer).
> > > > 
> > > > gcc/analyzer/ChangeLog:
> > > >  * engine.cc: Include "stringpool.h", "attribs.h", and
> > > >  "tree-dfa.h".
> > > >  (mark_params_as_tai

Re: [PATCH] Fix -Wformat-diag for rs6000 target.

2022-01-12 Thread Martin Sebor via Gcc-patches

On 1/12/22 02:02, Martin Liška wrote:

Hello.

We've got -Wformat-diag for some time and I think we should start using it
in -Werror for GCC bootstrap. The following patch removes last pieces of 
the warning

for rs6000 target.

Ready to be installed?
Thanks,
Martin


gcc/ChangeLog:

 * config/rs6000/rs6000-call.c (rs6000_invalid_builtin): Wrap
 keywords and use %qs instead of %<%s%>.
 (rs6000_expand_builtin): Likewise.

gcc/testsuite/ChangeLog:

 * gcc.target/powerpc/bfp/scalar-extract-exp-5.c: Adjust scans in
 testcases.
 * gcc.target/powerpc/bfp/scalar-extract-sig-5.c: Likewise.
 * gcc.target/powerpc/bfp/scalar-insert-exp-11.c: Likewise.
---
  gcc/config/rs6000/rs6000-call.c   | 8 
  .../gcc.target/powerpc/bfp/scalar-extract-exp-5.c | 2 +-
  .../gcc.target/powerpc/bfp/scalar-extract-sig-5.c | 2 +-
  .../gcc.target/powerpc/bfp/scalar-insert-exp-11.c | 2 +-
  4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-call.c 
b/gcc/config/rs6000/rs6000-call.c

index c78b8b08c40..becdad73812 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -3307,7 +3307,7 @@ rs6000_invalid_builtin (enum rs6000_gen_builtins 
fncode)

   "-mvsx");
    break;
  case ENB_IEEE128_HW:
-  error ("%qs requires ISA 3.0 IEEE 128-bit floating point", name);
+  error ("%qs requires ISA 3.0 IEEE 128-bit floating-point", name);


The instances of the warning where floating point is at the end
of a message aren't correct.  The warning should be relaxed to
allow unhyphenated floating point as a noun (as discussed briefly
last March:
https://gcc.gnu.org/pipermail/gcc-patches/2021-March/566881.html)

Martin



    break;
  case ENB_DFP:
    error ("%qs requires the %qs option", name, "-mhard-dfp");
@@ -5589,20 +5589,20 @@ rs6000_expand_builtin (tree exp, rtx target, rtx 
/* subtarget */,

    if (bif_is_nosoft (*bifaddr)
    && rs6000_isa_flags & OPTION_MASK_SOFT_FLOAT)
  {
-  error ("%<%s%> not supported with %<-msoft-float%>",
+  error ("%qs not supported with %<-msoft-float%>",
   bifaddr->bifname);
    return const0_rtx;
  }

    if (bif_is_no32bit (*bifaddr) && TARGET_32BIT)
  {
-  error ("%<%s%> is not supported in 32-bit mode", bifaddr->bifname);
+  error ("%qs is not supported in 32-bit mode", bifaddr->bifname);
    return const0_rtx;
  }

    if (bif_is_ibmld (*bifaddr) && !FLOAT128_2REG_P (TFmode))
  {
-  error ("%<%s%> requires % to be IBM 128-bit format",
+  error ("%qs requires % to be IBM 128-bit format",
   bifaddr->bifname);
    return const0_rtx;
  }
diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-5.c 
b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-5.c

index 34184812dc5..1225613b275 100644
--- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-5.c
+++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-5.c
@@ -14,7 +14,7 @@ get_exponent (__ieee128 *p)
  {
    __ieee128 source = *p;

-  return scalar_extract_exp (source);    /* { dg-error "requires ISA 
3.0 IEEE 128-bit floating point" } */
+  return scalar_extract_exp (source);    /* { dg-error "requires ISA 
3.0 IEEE 128-bit floating-point" } */

  }


diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-5.c 
b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-5.c

index 13c64fc3acf..adf0e4f99df 100644
--- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-5.c
+++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-5.c
@@ -12,5 +12,5 @@ get_significand (__ieee128 *p)
  {
    __ieee128 source = *p;

-  return (long long int) __builtin_vec_scalar_extract_sig (source); /* 
{ dg-error "requires ISA 3.0 IEEE 128-bit floating point" } */
+  return (long long int) __builtin_vec_scalar_extract_sig (source); /* 
{ dg-error "requires ISA 3.0 IEEE 128-bit floating-point" } */

  }
diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-11.c 
b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-11.c

index a5dd852e60f..6787a67950b 100644
--- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-11.c
+++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-11.c
@@ -16,5 +16,5 @@ insert_exponent (__ieee128 *significand_p,
    __ieee128 significand = *significand_p;
    unsigned long long int exponent = *exponent_p;

-  return scalar_insert_exp (significand, exponent); /* { dg-error 
"requires ISA 3.0 IEEE 128-bit floating point" } */
+  return scalar_insert_exp (significand, exponent); /* { dg-error 
"requires ISA 3.0 IEEE 128-bit floating-point" } */

  }




[PATCH] git-backport: support renamed .cc files in commit message.

2022-01-12 Thread Martin Liška

Hi.

There's a patch that enhances git-backport so that it updates commit
messages for files which name ends now with .cc and is still .c on a branch.

Example usage:

$ git show test
commit 8ed4b2cb9aa158c0ef418fd1ac66271664904604 (test)
Author: Martin Liska 
Date:   Wed Jan 12 16:08:13 2022 +0100

Fix file.

gcc/ChangeLog:

* ipa-icf.cc: Test me.

gcc/ada/ChangeLog:

* cal.c: aaa.

libcpp/ChangeLog:

* expr.cc: aaa.


$ git gcc-backport test
Auto-merging gcc/ada/cal.c
Auto-merging gcc/ipa-icf.c
Auto-merging libcpp/expr.c
[test2 ee40feb077e] Fix file.
 Date: Wed Jan 12 16:08:13 2022 +0100
 3 files changed, 3 insertions(+)
Commit message updated: 2 .cc file(s) changed.

$ git show test2
commit f59a1e736c1b68f07d83388a994df8d043e8aa6e (test2)
Author: Martin Liska 
Date:   Wed Jan 12 16:08:13 2022 +0100

Fix file.

gcc/ChangeLog:

* ipa-icf.c: Test me.

gcc/ada/ChangeLog:

* cal.c: aaa.

libcpp/ChangeLog:

* expr.c: aaa.

(cherry picked from commit 8ed4b2cb9aa158c0ef418fd1ac66271664904604)


MartinFrom 647a6dbaf8cde4ee07b95c4530a03f7774500914 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Wed, 12 Jan 2022 16:35:41 +0100
Subject: [PATCH] git-backport: support renamed .cc files in commit message.

The change can automatically update names for *.cc files that
are part of a backport.

contrib/ChangeLog:

	* git-backport.py: Support renaming of .cc files.
---
 contrib/git-backport.py | 50 +
 1 file changed, 50 insertions(+)

diff --git a/contrib/git-backport.py b/contrib/git-backport.py
index 2b8e4686719..627bbd9ee66 100755
--- a/contrib/git-backport.py
+++ b/contrib/git-backport.py
@@ -20,7 +20,32 @@
 # Boston, MA 02110-1301, USA.
 
 import argparse
+import os
 import subprocess
+import sys
+import tempfile
+
+script_folder = os.path.dirname(os.path.abspath(__file__))
+verify_script = os.path.join(script_folder,
+ 'gcc-changelog/git_check_commit.py')
+
+
+def replace_file_in_changelog(lines, filename):
+if not filename.endswith('.cc'):
+return
+
+# consider all componenets of a path: gcc/ipa-icf.cc
+while filename:
+for i, line in enumerate(lines):
+if filename in line:
+line = line.replace(filename, filename[:-1])
+lines[i] = line
+return
+parts = filename.split('/')
+if len(parts) == 1:
+return
+filename = '/'.join(parts[1:])
+
 
 if __name__ == '__main__':
 parser = argparse.ArgumentParser(description='Backport a git revision and '
@@ -63,3 +88,28 @@ if __name__ == '__main__':
 subprocess.check_output(cmd, shell=True)
 else:
 print('Please resolve all remaining file conflicts.')
+sys.exit(1)
+
+# Update commit message if change for a .cc file was taken
+r = subprocess.run(f'{verify_script} HEAD', shell=True, encoding='utf8',
+   stdout=subprocess.PIPE, stderr=subprocess.PIPE)
+if r.returncode != 0:
+lines = r.stdout.splitlines()
+cmd = 'git show -s --format=%B'
+commit_message = subprocess.check_output(cmd, shell=True,
+ encoding='utf8').strip()
+commit_message = commit_message.splitlines()
+
+todo = [line for line in lines if 'unchanged file mentioned' in line]
+for item in todo:
+filename = item.split()[-1].strip('"')
+replace_file_in_changelog(commit_message, filename)
+
+with tempfile.NamedTemporaryFile('w', encoding='utf8',
+ delete=False) as w:
+w.write('\n'.join(commit_message))
+w.close()
+subprocess.check_output(f'git commit --amend -F {w.name}',
+shell=True, encoding='utf8')
+os.unlink(w.name)
+print(f'Commit message updated: {len(todo)} .cc file(s) changed.')
-- 
2.34.1



Re: [PATCH] Fix redefined macro for epiphany.

2022-01-12 Thread Jeff Law via Gcc-patches




On 1/12/2022 7:56 AM, Richard Biener wrote:

On Wed, Jan 12, 2022 at 3:23 PM Jeff Law via Gcc-patches
 wrote:



On 1/12/2022 2:47 AM, Martin Liška wrote:

The following warning is emitted gazillion times.

Fixes:

In file included from ./tm.h:23,
  from gcc/genconfig.c:25:
gcc/config/elfos.h:209: warning: "READONLY_DATA_SECTION_ASM_OP" redefined
   209 | #define READONLY_DATA_SECTION_ASM_OP "\t.section\t.rodata"
   |
In file included from ./tm.h:21,
  from gcc/genconfig.c:25:
gcc/config/epiphany/epiphany.h:671: note: this is the location of the
previous definition
   671 | #define READONLY_DATA_SECTION_ASM_OP"\t.section .rodata"

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

 * config/elfos.h (READONLY_DATA_SECTION_ASM_OP): Define only if
 not defined.

Please do.

However, I think we should deprecate the epiphany.  It's been broken for
~2 years randomly failing in reload.  I tried to fix it once and quickly
gave up.  It currently ignores all testing failures in my tester as it's
been horribly unstable.

Joern is still listed as maintainer so let's CC him before doing
anything.  That said,
if it builds it's good enough for GCC 12 - we can discuss any target
deprecations
during stage1 (or deprecate those that don't even build right now,
though I'm not
aware of any).
cr16 still uses cc0.  It's marked as deprecated in gcc-12 and assuming 
it doesn't get fixed, removal in gcc-13.


m32c should be deprecated.  While it'll build libgcc, it'll fault (due 
to register allocation/reloading issues) building newlib or anything 
relatively complex.  This has been the case for at least 2 years.


epiphany will build newlib consistently, so I guess keeping it another 
release is OK to see if Joern wants to step up and fix its reload issues.







Jeff



Re: [PATCH] ira: Fix old-reload targets [PR103974]

2022-01-12 Thread Jeff Law via Gcc-patches




On 1/12/2022 8:00 AM, Richard Biener wrote:

On Wed, Jan 12, 2022 at 3:26 PM Vladimir Makarov  wrote:


On 2022-01-12 03:47, Richard Biener wrote:

On Tue, Jan 11, 2022 at 7:41 PM Vladimir Makarov via Gcc-patches
 wrote:

On 2022-01-11 12:42, Richard Sandiford wrote:

The new IRA heuristics would need more work on old-reload targets,
since flattening needs to be able to undo the cost propagation.
It's doable, but hardly seems worth it.

Agree. It is not worth to spend your time for work for reload.

This patch therefore makes all the new calls to
ira_subloop_allocnos_can_differ_p return false if !ira_use_lra_p.
The color_pass code that predated the new function (and that was
the source of ira_subloop_allocnos_can_differ_p) continues to
behave as before.

It's a hack, but at least it has the advantage that the new parameter
would become obviously unused if reload and (!)ira_use_lra_p were
removed.  The hack should therefore disappear alongside reload.

I have a feeling that it will stay for a long time if not forever.

We indeed seem to have 34 targets w/o LRA by default and only 15 with :/

At some point Eric wrote a nice summary for how to transition targets
away from CC0, I wonder if there's something similar for transitioning
a port away from reload to LRA?  In those 34 targets there must be some
for which that's a relatively easy task?  I suppose it depends on how
much of the reload target hooks are put to use and how those translate
to LRA.

First of all the target should be rid of using CC0.  Then theoretically
:) the target should work with LRA as LRA uses existing reload hooks
(more accurately a subset of them).

CC0 is no more, we've accomplished that feat for GCC 12!


In practice some work should be done for switching to LRA.  I did first
4 major targets to work with LRA and unfortunately did not find some
repeating patterns in this work.  The problems for the first targets
were mostly unique and required a lot of LRA code modifications.  After
that people did other target switching pretty easily and spent few time
for this as I remember.

So based on my experience of porting targets to LRA I can not formalize
this work.  But probably it can be done by examining all LRA targets
code (mostly looking at machine dependent code related to use
lra_in_progress_p) or by collecting information what was done from other
people who did porting to LRA.

So in theory it might be just pulling the switch for some?  That is,
removing their definition of TARGET_LRA_P which then defaults
to true?

Jeff might be able to test this for (all) targets on his harness.

Given a patch, it's trivial do throw it in and see what the fallout is.

jeff


Re: [PATCH] epiphany: fir -Wformat-diag.

2022-01-12 Thread Jeff Law via Gcc-patches




On 1/12/2022 2:57 AM, Martin Liška wrote:

This fixes -Wformat-diag warnings.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

* config/epiphany/epiphany.c (epiphany_handle_interrupt_attribute):
Use %qs format specifier.
(epiphany_override_options): Wrap keyword in %<, %>.

OK.  And similar changes for other ports are pre-approved.

jeff



Re: [PATCH] epiphany: fix -Wimplicit-fallthrough warnings in epiphany.c.

2022-01-12 Thread Jeff Law via Gcc-patches




On 1/12/2022 2:56 AM, Martin Liška wrote:

This is RFC.

Cheers,
Martin

gcc/ChangeLog:

* config/epiphany/epiphany.c (epiphany_mode_priority):
Use gcc_unreachable for not handled cases.

OK.  As are similar changes in other ports.

jeff



Re: [Patch][V2]Enable -Wuninitialized + -ftrivial-auto-var-init for address taken variables

2022-01-12 Thread Qing Zhao via Gcc-patches


> On Jan 12, 2022, at 4:46 AM, Richard Biener  
> wrote:
> 
> On Tue, Jan 11, 2022 at 5:32 PM Qing Zhao  wrote:
>> 
> + /* Ignore the call to .DEFERRED_INIT that define the original
> +var itself.  */
> + if (is_gimple_assign (context))
> +   {
> + if (TREE_CODE (gimple_assign_lhs (context)) == VAR_DECL)
> +   lhs_var = gimple_assign_lhs (context);
> + else if (TREE_CODE (gimple_assign_lhs (context)) == 
> SSA_NAME)
> +   lhs_var = SSA_NAME_VAR (gimple_assign_lhs (context));
> +   }
> + if (lhs_var
> + && (lhs_var_name = DECL_NAME (lhs_var))
> + && (lhs_var_name_str = IDENTIFIER_POINTER (lhs_var_name))
> + && (strcmp (lhs_var_name_str, var_name_str) == 0))
> +   return;
> 
> likewise but I don't really understand what you are doing here.
 
 The above is to exclude the following case:
 
  temp = .DEFERRED_INIT (4, 2, “alt_reloc");
  alt_reloc = temp;
 
 i.e, a call to .DEFERRED_INIT that define the original variable itself.
>>> 
>>> How can this happen?  It looks like a bug to me.  Do you have a testcase?
>> With -ftrivial-auto-var-init, During gimplification phase, almost all 
>> address taken variables that do not have an explicit initializer will have 
>> the above IR pattern.
>> 
>> For example, gcc.dg/auto-init-uninit-16.c:
>> 
>> [opc@qinzhao-ol8u3-x86 gcc]$ cat 
>> /home/opc/Work/GCC/latest-gcc/gcc/testsuite/gcc.dg/auto-init-uninit-16.c
>> 
>> With  -ftrivial-auto-var-init=zero, the IR after gimplification is:
>> 
>>  _1 = .DEFERRED_INIT (4, 2, &"alt_reloc"[0]);
>>  alt_reloc = _1;
>> 
>> And the IR after SSA is similar as the above:
>> 
>>  _1 = .DEFERRED_INIT (4, 2, &"alt_reloc"[0]);
>>  alt_reloc = _1;
>> 
>> During the early uninitialized analysis phase, the above IR will feed to the 
>> analyzer, we should exclude such
>> IR from issuing fake warnings.
> 
> Yes, but how do we get to a fake warning here?  We should eventually
> run into the _1 def being used
> by alt_reloc = ...; and then by means of using the string "alt_reloc"
> warn about an uninitialized use of
> alt_reloc?  Is it the point of the use that is "misdiagnosed"?  

Yes, that’s the issue, the use point would be misdiagnosed without excluding 
this self definition chain. 

For the following: cat 
/home/opc/Work/GCC/latest-gcc/gcc/testsuite/gcc.dg/auto-init-uninit-16.c
  1 /* { dg-do compile } */
  2 /* { dg-options "-O2 -Wuninitialized -ftrivial-auto-var-init=zero" } */
  3 
  4 int foo, bar;
  5 
  6 static
  7 void decode_reloc(int reloc, int *is_alt)
  8 {
  9   if (reloc >= 20)
 10   *is_alt = 1;
 11   else if (reloc >= 10)
 12   *is_alt = 0;
 13 }
 14 
 15 void testfunc()
 16 {
 17   int alt_reloc;
 18 
 19   decode_reloc(foo, &alt_reloc);
 20 
 21   if (alt_reloc) /* { dg-warning "may be used uninitialized" "" }  */
 22 bar = 42;
 23 }

If I deleted the following from tree-ssa-uninit.c:

+ /* Ignore the call to .DEFERRED_INIT that define the original
+var itself.  */
+ if (is_gimple_assign (context))
+   {
+ if (TREE_CODE (gimple_assign_lhs (context)) == VAR_DECL)
+   lhs_var = gimple_assign_lhs (context);
+ else if (TREE_CODE (gimple_assign_lhs (context)) == SSA_NAME)
+   lhs_var = SSA_NAME_VAR (gimple_assign_lhs (context));
+   }
+ if (lhs_var
+ && (lhs_var_name = DECL_NAME (lhs_var))
+ && (lhs_var_name_str = IDENTIFIER_POINTER (lhs_var_name))
+ && (strcmp (lhs_var_name_str, var_name_str) == 0))
+   return;

We will get the following warning:

/home/opc/Work/GCC/latest_gcc/gcc/testsuite/gcc.dg/auto-init-uninit-16.c: In 
function ‘testfunc’:
/home/opc/Work/GCC/latest_gcc/gcc/testsuite/gcc.dg/auto-init-uninit-16.c:17:7: 
warning: ‘alt_reloc’ is used uninitialized [-Wuninitialized]

In which the line number of the usage point “17:7” is wrong, which is the 
declaration point of the variable. 
This warning is issued during “pass_early_warn_uninitialized” for the following 
IR:

_1 = .DEFERRED_INIT (4, 2, &"alt_reloc"[0]);
alt_reloc = _1;

The above usage point “alt_reloc = _1” actually is a fake usage, we should 
exclude this usage from issuing warning.

The correct warning should be issued during “pass_late_warn_uninitialized” and 
for line 21 “if alt_reloc” for the following IR (the use point is “if (_1 != 
0), which is for line 21:

_1 = .DEFERRED_INIT (4, 2, &"alt_reloc"[0]);
  if (_1 != 0)

Hope this is clear now.

> I was
> expecting the first patch to fix this.
> 
> I'll wait for an update to the second patch.
I will update it today.

thanks.

Qing
> 
> Richard.
> 
>> 
>>> 
 
> I'm
> also not sure
> I understand the case we try to fix with passing the name - is that
> for VLA decls
> that

Re: [PATCH] Fix -Wformat-diag for s390x target.

2022-01-12 Thread Martin Sebor via Gcc-patches

On 1/12/22 02:03, Martin Liška wrote:

Hello.

We've got -Wformat-diag for some time and I think we should start using it
in -Werror for GCC bootstrap. The following patch removes last pieces of 
the warning

for s390x target.

Ready to be installed?
Thanks,
Martin


gcc/ChangeLog:

 * config/s390/s390-c.c (s390_expand_overloaded_builtin): Wrap
 keyword in quotes.
 (s390_resolve_overloaded_builtin): Remove trailing dot.
 * config/s390/s390.c (s390_const_operand_ok): Use - for range.
 (s390_expand_builtin): Remove trailing dot.
 (s390_emit_prologue): Likewise, use semicolon.
 (s390_option_override_internal): Update keyword.
 * varasm.c (do_assemble_alias): Wrap keyword in quotes.
---
  gcc/config/s390/s390-c.c |  9 +
  gcc/config/s390/s390.c   | 28 ++--
  gcc/varasm.c |  2 +-
  3 files changed, 20 insertions(+), 19 deletions(-)

diff --git a/gcc/config/s390/s390-c.c b/gcc/config/s390/s390-c.c
index 600018421df..10bc6ac8900 100644
--- a/gcc/config/s390/s390-c.c
+++ b/gcc/config/s390/s390-c.c
@@ -484,7 +484,8 @@ s390_expand_overloaded_builtin (location_t loc,
  case S390_OVERLOADED_BUILTIN_s390_vec_step:
    if (TREE_CODE (TREE_TYPE ((*arglist)[0])) != VECTOR_TYPE)
  {
-  error_at (loc, "builtin vec_step can only be used on vector 
types.");

+  error_at (loc, "builtin %qs can only be used on vector types",
+    "vec_step ");


I'd have expected the warning to also trigger for the missing
hyphen in "builtin" (as per the GCC coding conventions) but it
looks like the code only looks for "builtin function".  I must
have done that because of the large number of misspellings.

Regardless, if the name of the function is __builtin_vec_step
(or __builtin_s390_vec_step?) then quoting the entire name of
the function as is conventionally done in the rest of GCC would
be one solution.  Rather than hardcoding the name as a string
there should be a way to do that by passing the right tree to
%qD (or %qF like below).  Based on my reading of the rest of
the file I wonder if this might be the decl:

s390_builtin_decls[bt_for_overloaded_builtin_var[S390_OVERLOADED_BUILTIN_s390_vec_step]]

Alternatively, if the preferred name to call the function with
is vec_step then simply vec_step without the "builtin" would
suffice.

(As an aside, there's a spurious space at the end of "vec_step "
above.)


    return error_mark_node;
  }
    return build_int_cst (NULL_TREE,
@@ -905,7 +906,7 @@ s390_resolve_overloaded_builtin (location_t loc,
    if (ob_flags & B_INT)
  {
    error_at (loc,
-    "builtin %qF is for GCC internal use only.",
+    "builtin %qF is for GCC internal use only",
  ob_fndecl);
    return error_mark_node;
  }
@@ -913,7 +914,7 @@ s390_resolve_overloaded_builtin (location_t loc,
  }

    if (ob_flags & B_DEP)
-    warning_at (loc, 0, "builtin %qF is deprecated.", ob_fndecl);
+    warning_at (loc, 0, "builtin %qF is deprecated", ob_fndecl);

    if (!TARGET_VX && (ob_flags & B_VX))
  {
@@ -1021,7 +1022,7 @@ s390_resolve_overloaded_builtin (location_t loc,
  }

    if (bflags_overloaded_builtin_var[last_match_index] & B_DEP)
-    warning_at (loc, 0, "%qs matching variant is deprecated.",
+    warning_at (loc, 0, "%qs matching variant is deprecated",
  IDENTIFIER_POINTER (DECL_NAME (ob_fndecl)));

    /* Overloaded variants which have MAX set as low level builtin are
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 056002e4a4a..bf96cbf7588 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -766,7 +766,7 @@ s390_const_operand_ok (tree arg, int argnum, int 
op_flags, tree decl)

   argnum, decl, values);
  }
    else
-    error ("constant argument %d for builtin %qF is out of range 
(0..%wu)",
+    error ("constant argument %d for builtin %qF is out of range 
(0-%wu)",

     argnum, decl, (HOST_WIDE_INT_1U << bitwidth) - 1);

    return false;
@@ -783,7 +783,7 @@ s390_const_operand_ok (tree arg, int argnum, int 
op_flags, tree decl)

    || tree_to_shwi (arg) > ((HOST_WIDE_INT_1 << (bitwidth - 1)) - 1))
  {
    error ("constant argument %d for builtin %qF is out of range "
- "(%wd..%wd)", argnum, decl,
+ "(%wd-%wd)", argnum, decl,
   -(HOST_WIDE_INT_1 << (bitwidth - 1)),
   (HOST_WIDE_INT_1 << (bitwidth - 1)) - 1);
    return false;
@@ -832,25 +832,25 @@ s390_expand_builtin (tree exp, rtx target, rtx 
subtarget ATTRIBUTE_UNUSED,

    if ((bflags & B_HTM) && !TARGET_HTM)
  {
    error ("builtin %qF is not supported without %<-mhtm%> "
- "(default with %<-march=zEC12%> and higher).", fndecl);
+ "(default with %<-march=zEC12%> and higher)", fndecl);
    return const0_rtx;
  }
    if (((bflags & B_VX) || (bflags & B_VXE)) && !TARGET_VX)
  {
    error ("builtin %qF requires %

[committed] libstdc++: Add explicit dg-do directive to .../103955.cc

2022-01-12 Thread Patrick Palka via Gcc-patches
libstdc++-v3/ChangeLog:

* testsuite/20_util/to_chars/103955.cc: Add explicit dg-do
directive.
---
 libstdc++-v3/testsuite/20_util/to_chars/103955.cc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/libstdc++-v3/testsuite/20_util/to_chars/103955.cc 
b/libstdc++-v3/testsuite/20_util/to_chars/103955.cc
index 3753c87ad8b..e7f17f7a9e8 100644
--- a/libstdc++-v3/testsuite/20_util/to_chars/103955.cc
+++ b/libstdc++-v3/testsuite/20_util/to_chars/103955.cc
@@ -1,6 +1,7 @@
 // PR libstdc++/103955
 // Verify we don't crash when the floating-point to_chars overloads are passed
 // a large precision argument.
+// { dg-do run { target c++17 } }
 
 #include 
 
-- 
2.35.0.rc0



Re: [PATCH] Fix -Wformat-diag for rs6000 target.

2022-01-12 Thread Martin Sebor via Gcc-patches

On 1/12/22 02:02, Martin Liška wrote:

Hello.

We've got -Wformat-diag for some time and I think we should start using it
in -Werror for GCC bootstrap. The following patch removes last pieces of 
the warning

for rs6000 target.

Ready to be installed?
Thanks,
Martin


gcc/ChangeLog:

 * config/rs6000/rs6000-call.c (rs6000_invalid_builtin): Wrap
 keywords and use %qs instead of %<%s%>.
 (rs6000_expand_builtin): Likewise.

gcc/testsuite/ChangeLog:

 * gcc.target/powerpc/bfp/scalar-extract-exp-5.c: Adjust scans in
 testcases.
 * gcc.target/powerpc/bfp/scalar-extract-sig-5.c: Likewise.
 * gcc.target/powerpc/bfp/scalar-insert-exp-11.c: Likewise.
---
  gcc/config/rs6000/rs6000-call.c   | 8 
  .../gcc.target/powerpc/bfp/scalar-extract-exp-5.c | 2 +-
  .../gcc.target/powerpc/bfp/scalar-extract-sig-5.c | 2 +-
  .../gcc.target/powerpc/bfp/scalar-insert-exp-11.c | 2 +-
  4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/gcc/config/rs6000/rs6000-call.c 
b/gcc/config/rs6000/rs6000-call.c

index c78b8b08c40..becdad73812 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -3307,7 +3307,7 @@ rs6000_invalid_builtin (enum rs6000_gen_builtins 
fncode)

   "-mvsx");
    break;
  case ENB_IEEE128_HW:
-  error ("%qs requires ISA 3.0 IEEE 128-bit floating point", name);
+  error ("%qs requires ISA 3.0 IEEE 128-bit floating-point", name);
    break;
  case ENB_DFP:
    error ("%qs requires the %qs option", name, "-mhard-dfp");
@@ -5589,20 +5589,20 @@ rs6000_expand_builtin (tree exp, rtx target, rtx 
/* subtarget */,

    if (bif_is_nosoft (*bifaddr)
    && rs6000_isa_flags & OPTION_MASK_SOFT_FLOAT)
  {
-  error ("%<%s%> not supported with %<-msoft-float%>",
+  error ("%qs not supported with %<-msoft-float%>",
   bifaddr->bifname);
    return const0_rtx;
  }

    if (bif_is_no32bit (*bifaddr) && TARGET_32BIT)
  {
-  error ("%<%s%> is not supported in 32-bit mode", bifaddr->bifname);
+  error ("%qs is not supported in 32-bit mode", bifaddr->bifname);
    return const0_rtx;
  }

    if (bif_is_ibmld (*bifaddr) && !FLOAT128_2REG_P (TFmode))
  {
-  error ("%<%s%> requires % to be IBM 128-bit format",
+  error ("%qs requires % to be IBM 128-bit format",
   bifaddr->bifname);
    return const0_rtx;
  }
diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-5.c 
b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-5.c

index 34184812dc5..1225613b275 100644
--- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-5.c
+++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-exp-5.c
@@ -14,7 +14,7 @@ get_exponent (__ieee128 *p)
  {
    __ieee128 source = *p;

-  return scalar_extract_exp (source);    /* { dg-error "requires ISA 
3.0 IEEE 128-bit floating point" } */
+  return scalar_extract_exp (source);    /* { dg-error "requires ISA 
3.0 IEEE 128-bit floating-point" } */


This instance of the warning isn't correct.  It should be relaxed
to allow unhyphenated floating point as a noun at the end of
a message, as discussed briefly last March:

https://gcc.gnu.org/pipermail/gcc-patches/2021-March/566881.html

Martin


  }


diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-5.c 
b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-5.c

index 13c64fc3acf..adf0e4f99df 100644
--- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-5.c
+++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-extract-sig-5.c
@@ -12,5 +12,5 @@ get_significand (__ieee128 *p)
  {
    __ieee128 source = *p;

-  return (long long int) __builtin_vec_scalar_extract_sig (source); /* 
{ dg-error "requires ISA 3.0 IEEE 128-bit floating point" } */
+  return (long long int) __builtin_vec_scalar_extract_sig (source); /* 
{ dg-error "requires ISA 3.0 IEEE 128-bit floating-point" } */

  }
diff --git a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-11.c 
b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-11.c

index a5dd852e60f..6787a67950b 100644
--- a/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-11.c
+++ b/gcc/testsuite/gcc.target/powerpc/bfp/scalar-insert-exp-11.c
@@ -16,5 +16,5 @@ insert_exponent (__ieee128 *significand_p,
    __ieee128 significand = *significand_p;
    unsigned long long int exponent = *exponent_p;

-  return scalar_insert_exp (significand, exponent); /* { dg-error 
"requires ISA 3.0 IEEE 128-bit floating point" } */
+  return scalar_insert_exp (significand, exponent); /* { dg-error 
"requires ISA 3.0 IEEE 128-bit floating-point" } */

  }




[PATCH] Allow other languages to change long double format on PowerPC

2022-01-12 Thread Michael Meissner via Gcc-patches
Allow other languages to change long double format on PowerPC.

With Fortran adding support for changing the long double format, this
patch removes the code that only allowed C/C++ to change the long double
format for GLIBC 2.32 and later without a warning.

I have bootstraped the compiler with this change and there where no
regressesion.  In addition, I have applied it to the power-ieee128 branch where
the work is being done to add support for both types of 128-bit floating point
on PowerPC.  If I build a compiler from that branch, I can change the 128-bit
floating point format at compile time.

Can I check this patch into the trunk GCC compiler?

gcc/
2022-01-12  Michael Meissner  

* config/rs6000/rs6000.c (rs6000_option_override_internal): Remove
checks for only C/C++ front ends before allowing the long double
format to change without a warning.
---
 gcc/config/rs6000/rs6000.c | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 319182e94d9..0e3481c8327 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -4221,13 +4221,11 @@ rs6000_option_override_internal (bool global_init_p)
   if (rs6000_ieeequad != TARGET_IEEEQUAD_DEFAULT && TARGET_LONG_DOUBLE_128)
{
  /* Determine if the user can change the default long double type at
-compilation time.  Only C and C++ support this, and you need GLIBC
-2.32 or newer.  Only issue one warning.  */
+compilation time.  You need GLIBC 2.32 or newer to be able to
+change the long double type.  Only issue one warning.  */
  static bool warned_change_long_double;
 
- if (!warned_change_long_double
- && (!glibc_supports_ieee_128bit ()
- || (!lang_GNU_C () && !lang_GNU_CXX (
+ if (!warned_change_long_double && !glibc_supports_ieee_128bit ())
{
  warned_change_long_double = true;
  if (TARGET_IEEEQUAD)
-- 
2.33.1


-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com


Re: [PATCH] Allow other languages to change long double format on PowerPC

2022-01-12 Thread Jakub Jelinek via Gcc-patches
On Wed, Jan 12, 2022 at 12:46:16PM -0500, Michael Meissner via Gcc-patches 
wrote:
> Allow other languages to change long double format on PowerPC.
> 
> With Fortran adding support for changing the long double format, this
> patch removes the code that only allowed C/C++ to change the long double
> format for GLIBC 2.32 and later without a warning.
> 
> I have bootstraped the compiler with this change and there where no
> regressesion.  In addition, I have applied it to the power-ieee128 branch 
> where
> the work is being done to add support for both types of 128-bit floating point
> on PowerPC.  If I build a compiler from that branch, I can change the 128-bit
> floating point format at compile time.
> 
> Can I check this patch into the trunk GCC compiler?
> 
> gcc/
> 2022-01-12  Michael Meissner  
> 
>   * config/rs6000/rs6000.c (rs6000_option_override_internal): Remove
>   checks for only C/C++ front ends before allowing the long double
>   format to change without a warning.

This is already on the trunk, I've merged the power-ieee128 branch changes
into trunk yesterday (r12-6491 through r12-6508 commits inclusive).
This patch in particular as r12-6503-g7d8011fa00fca283003c84e23a8ca66286f83dfa.

Jakub



Re: [PATCH] Allow other languages to change long double format on PowerPC

2022-01-12 Thread Michael Meissner via Gcc-patches
On Wed, Jan 12, 2022 at 06:50:04PM +0100, Jakub Jelinek wrote:
> On Wed, Jan 12, 2022 at 12:46:16PM -0500, Michael Meissner via Gcc-patches 
> wrote:
> > Allow other languages to change long double format on PowerPC.
> > 
> > With Fortran adding support for changing the long double format, this
> > patch removes the code that only allowed C/C++ to change the long double
> > format for GLIBC 2.32 and later without a warning.
> > 
> > I have bootstraped the compiler with this change and there where no
> > regressesion.  In addition, I have applied it to the power-ieee128 branch 
> > where
> > the work is being done to add support for both types of 128-bit floating 
> > point
> > on PowerPC.  If I build a compiler from that branch, I can change the 
> > 128-bit
> > floating point format at compile time.
> > 
> > Can I check this patch into the trunk GCC compiler?
> > 
> > gcc/
> > 2022-01-12  Michael Meissner  
> > 
> > * config/rs6000/rs6000.c (rs6000_option_override_internal): Remove
> > checks for only C/C++ front ends before allowing the long double
> > format to change without a warning.
> 
> This is already on the trunk, I've merged the power-ieee128 branch changes
> into trunk yesterday (r12-6491 through r12-6508 commits inclusive).
> This patch in particular as 
> r12-6503-g7d8011fa00fca283003c84e23a8ca66286f83dfa.

Thanks.  I didn't notice that the patches had been merged into the trunk.

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com


[PATCH] Use system default for long double if not specified on PowerPC.

2022-01-12 Thread Michael Meissner via Gcc-patches
Use system default for long double if not specified on PowerPC.

If the user did not specify a default long double format, use the long double
default for the build compiler for the long double default.  This patch will
allow compilers built on a distribution that has changed the 128-bit floating
point format to use the default used on the system.

I did a normal normal bootstrap and make check regression on a little
endian power9 system and there were no regressions.

In addition, I built a compiler where I configured the default to use IEEE
128-bit floating point for long double.  I then used that compiler to
build a bootstrap with this patch applied and I did not set the floating
point format.  I verified that the compiler built with this patch defaults
long double to be IEEE 128-bit.

Can I apply this patch to the trunk for GCC 12?

gcc/
2022-01-12  Michael Meissner  

* config/rs6000/rs6000.c (TARGET_IEEEQUAD_DEFAULT): If the
compiler used to build this compiler defaults long double to IEEE
128-bit, then make it the default for the target.
---
 gcc/config/rs6000/rs6000.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 0e3481c8327..e59f7aa49c8 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -91,14 +91,22 @@
  explicitly redefine TARGET_IEEEQUAD and TARGET_IEEEQUAD_DEFAULT to 0, so
  those systems will not pick up this default.  This needs to be after all
  of the include files, so that POWERPC_LINUX and POWERPC_FREEBSD are
- properly defined.  */
+ properly defined.
+
+ If we are being built by a compiler that uses IEEE 128-bit as the default
+ long double and no explicit long double format was selected, then also
+ default long double to IEEE 128-bit.  */
 #ifndef TARGET_IEEEQUAD_DEFAULT
 #if !defined (POWERPC_LINUX) && !defined (POWERPC_FREEBSD)
 #define TARGET_IEEEQUAD_DEFAULT 1
 #else
+#ifdef __LONG_DOUBLE_IEEE128__
+#define TARGET_IEEEQUAD_DEFAULT 1
+#else
 #define TARGET_IEEEQUAD_DEFAULT 0
 #endif
 #endif
+#endif
 
 /* Don't enable PC-relative addressing if the target does not support it.  */
 #ifndef PCREL_SUPPORTED_BY_OS
-- 
2.34.1


-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com


Re: [PATCH] ira: Fix old-reload targets [PR103974]

2022-01-12 Thread Hans-Peter Nilsson via Gcc-patches
> From: Jeff Law via Gcc-patches 
> Date: Wed, 12 Jan 2022 16:58:50 +0100

> On 1/12/2022 8:00 AM, Richard Biener wrote:
> > On Wed, Jan 12, 2022 at 3:26 PM Vladimir Makarov  
> > wrote:
> >>
> >> On 2022-01-12 03:47, Richard Biener wrote:
> >>> On Tue, Jan 11, 2022 at 7:41 PM Vladimir Makarov via Gcc-patches
> >>>  wrote:
>  On 2022-01-11 12:42, Richard Sandiford wrote:
> > The new IRA heuristics would need more work on old-reload targets,
> > since flattening needs to be able to undo the cost propagation.
> > It's doable, but hardly seems worth it.
>  Agree. It is not worth to spend your time for work for reload.
> > This patch therefore makes all the new calls to
> > ira_subloop_allocnos_can_differ_p return false if !ira_use_lra_p.
> > The color_pass code that predated the new function (and that was
> > the source of ira_subloop_allocnos_can_differ_p) continues to
> > behave as before.
> >
> > It's a hack, but at least it has the advantage that the new parameter
> > would become obviously unused if reload and (!)ira_use_lra_p were
> > removed.  The hack should therefore disappear alongside reload.
>  I have a feeling that it will stay for a long time if not forever.
> >>> We indeed seem to have 34 targets w/o LRA by default and only 15 with :/
> >>>
> >>> At some point Eric wrote a nice summary for how to transition targets
> >>> away from CC0, I wonder if there's something similar for transitioning
> >>> a port away from reload to LRA?  In those 34 targets there must be some
> >>> for which that's a relatively easy task?  I suppose it depends on how
> >>> much of the reload target hooks are put to use and how those translate
> >>> to LRA.
> >> First of all the target should be rid of using CC0.  Then theoretically
> >> :) the target should work with LRA as LRA uses existing reload hooks
> >> (more accurately a subset of them).
> > CC0 is no more, we've accomplished that feat for GCC 12!
> >
> >> In practice some work should be done for switching to LRA.  I did first
> >> 4 major targets to work with LRA and unfortunately did not find some
> >> repeating patterns in this work.  The problems for the first targets
> >> were mostly unique and required a lot of LRA code modifications.  After
> >> that people did other target switching pretty easily and spent few time
> >> for this as I remember.
> >>
> >> So based on my experience of porting targets to LRA I can not formalize
> >> this work.  But probably it can be done by examining all LRA targets
> >> code (mostly looking at machine dependent code related to use
> >> lra_in_progress_p) or by collecting information what was done from other

(About lra_in_progress_p, I see mixes of reload_in_progress
and lra_in_progress (no "_p") in supposed-switched-targets.
And shouldn't reload_completed be renamed (actually: some
kind or alias for) lra_completed to avoid confusion?)

I recall comments about code quality regressions.  Are there
actual numbers?  (Preferably from around the transition
time, because I bet targets still supporting "-mlra" have
regressed on the reload side since then.)

> >> people who did porting to LRA.
> > So in theory it might be just pulling the switch for some?  That is,
> > removing their definition of TARGET_LRA_P which then defaults
> > to true?
> >
> > Jeff might be able to test this for (all) targets on his harness.
> Given a patch, it's trivial do throw it in and see what the fallout is.

Again there's talk about LRA and comparing it to CC0, so
again I'll remind of the lack of documentation for LRA (in
contrast to CC0).  I'm not just referring to guides to use
for switching over a target to LRA, but sure, that'll help
too.

For starters, for each constraint and register-class macro
and hook, what's the difference between reload and LRA;
which ones are unused and which ones are new?

brgds, H-P


Re: [PATCH] ira: Fix old-reload targets [PR103974]

2022-01-12 Thread Hans-Peter Nilsson via Gcc-patches
> From: Jeff Law via Gcc-patches 
> Date: Wed, 12 Jan 2022 16:58:50 +0100

> On 1/12/2022 8:00 AM, Richard Biener wrote:
> > On Wed, Jan 12, 2022 at 3:26 PM Vladimir Makarov  
> > wrote:
> >>
> >> On 2022-01-12 03:47, Richard Biener wrote:
> >>> On Tue, Jan 11, 2022 at 7:41 PM Vladimir Makarov via Gcc-patches
> >>>  wrote:
>  On 2022-01-11 12:42, Richard Sandiford wrote:
> > The new IRA heuristics would need more work on old-reload targets,
> > since flattening needs to be able to undo the cost propagation.
> > It's doable, but hardly seems worth it.
>  Agree. It is not worth to spend your time for work for reload.
> > This patch therefore makes all the new calls to
> > ira_subloop_allocnos_can_differ_p return false if !ira_use_lra_p.
> > The color_pass code that predated the new function (and that was
> > the source of ira_subloop_allocnos_can_differ_p) continues to
> > behave as before.
> >
> > It's a hack, but at least it has the advantage that the new parameter
> > would become obviously unused if reload and (!)ira_use_lra_p were
> > removed.  The hack should therefore disappear alongside reload.
>  I have a feeling that it will stay for a long time if not forever.
> >>> We indeed seem to have 34 targets w/o LRA by default and only 15 with :/
> >>>
> >>> At some point Eric wrote a nice summary for how to transition targets
> >>> away from CC0, I wonder if there's something similar for transitioning
> >>> a port away from reload to LRA?  In those 34 targets there must be some
> >>> for which that's a relatively easy task?  I suppose it depends on how
> >>> much of the reload target hooks are put to use and how those translate
> >>> to LRA.
> >> First of all the target should be rid of using CC0.  Then theoretically
> >> :) the target should work with LRA as LRA uses existing reload hooks
> >> (more accurately a subset of them).
> > CC0 is no more, we've accomplished that feat for GCC 12!
> >
> >> In practice some work should be done for switching to LRA.  I did first
> >> 4 major targets to work with LRA and unfortunately did not find some
> >> repeating patterns in this work.  The problems for the first targets
> >> were mostly unique and required a lot of LRA code modifications.  After
> >> that people did other target switching pretty easily and spent few time
> >> for this as I remember.
> >>
> >> So based on my experience of porting targets to LRA I can not formalize
> >> this work.  But probably it can be done by examining all LRA targets
> >> code (mostly looking at machine dependent code related to use
> >> lra_in_progress_p) or by collecting information what was done from other

(About lra_in_progress_p, I see mixes of reload_in_progress
and lra_in_progress (no "_p") in supposed-switched-targets.
And shouldn't reload_completed be renamed (actually: some
kind or alias for) lra_completed to avoid confusion?)

I recall comments about code quality regressions.  Are there
actual numbers?  (Preferably from around the transition
time, because I bet targets still supporting "-mlra" have
regressed on the reload side since then.)

> >> people who did porting to LRA.
> > So in theory it might be just pulling the switch for some?  That is,
> > removing their definition of TARGET_LRA_P which then defaults
> > to true?
> >
> > Jeff might be able to test this for (all) targets on his harness.
> Given a patch, it's trivial do throw it in and see what the fallout is.

Again there's talk about LRA and comparing it to CC0, so
again I'll remind of the lack of documentation for LRA (in
contrast to CC0).  I'm not just referring to guides to use
for switching over a target to LRA, but sure, that'll help
too.

For starters, for each constraint and register-class macro
and hook, what's the difference between reload and LRA;
which ones are unused and which ones are new?

brgds, H-P


Re: [PATCH v5 1/1] [ARM] Add support for TLS register based stack protector canary access

2022-01-12 Thread Ard Biesheuvel via Gcc-patches
On Wed, 17 Nov 2021 at 18:12, Ard Biesheuvel  wrote:
>
> (+ Ramana)
>

Ping?

> On Mon, 15 Nov 2021 at 19:04, Ard Biesheuvel  wrote:
> >
> > Add support for accessing the stack canary value via the TLS register,
> > so that multiple threads running in the same address space can use
> > distinct canary values. This is intended for the Linux kernel running in
> > SMP mode, where processes entering the kernel are essentially threads
> > running the same program concurrently: using a global variable for the
> > canary in that context is problematic because it can never be rotated,
> > and so the OS is forced to use the same value as long as it remains up.
> >
> > Using the TLS register to index the stack canary helps with this, as it
> > allows each CPU to context switch the TLS register along with the rest
> > of the process, permitting each process to use its own value for the
> > stack canary.
> >
> > 2021-11-15 Ard Biesheuvel 
> >
> > * config/arm/arm-opts.h (enum stack_protector_guard): New
> > * config/arm/arm-protos.h (arm_stack_protect_tls_canary_mem):
> > New
> > * config/arm/arm.c (TARGET_STACK_PROTECT_GUARD): Define
> > (arm_option_override_internal): Handle and put in error checks
> > for stack protector guard options.
> > (arm_option_reconfigure_globals): Likewise
> > (arm_stack_protect_tls_canary_mem): New
> > (arm_stack_protect_guard): New
> > * config/arm/arm.md (stack_protect_set): New
> > (stack_protect_set_tls): Likewise
> > (stack_protect_test): Likewise
> > (stack_protect_test_tls): Likewise
> > (reload_tp_hard): Likewise
> > * config/arm/arm.opt (-mstack-protector-guard): New
> > (-mstack-protector-guard-offset): New.
> > * doc/invoke.texi: Document new options
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/arm/stack-protector-7.c: New test.
> > * gcc.target/arm/stack-protector-8.c: New test.
> >
> > Signed-off-by: Ard Biesheuvel 
> > ---
> >  gcc/config/arm/arm-opts.h|  6 ++
> >  gcc/config/arm/arm-protos.h  |  2 +
> >  gcc/config/arm/arm.c | 55 +++
> >  gcc/config/arm/arm.md| 71 +++-
> >  gcc/config/arm/arm.opt   | 22 ++
> >  gcc/doc/invoke.texi  | 11 +++
> >  gcc/testsuite/gcc.target/arm/stack-protector-7.c | 10 +++
> >  gcc/testsuite/gcc.target/arm/stack-protector-8.c |  5 ++
> >  8 files changed, 180 insertions(+), 2 deletions(-)
> >
> > diff --git a/gcc/config/arm/arm-opts.h b/gcc/config/arm/arm-opts.h
> > index 5c4b62f404f7..581ba3c4fbbb 100644
> > --- a/gcc/config/arm/arm-opts.h
> > +++ b/gcc/config/arm/arm-opts.h
> > @@ -69,4 +69,10 @@ enum arm_tls_type {
> >TLS_GNU,
> >TLS_GNU2
> >  };
> > +
> > +/* Where to get the canary for the stack protector.  */
> > +enum stack_protector_guard {
> > +  SSP_TLSREG,  /* per-thread canary in TLS register */
> > +  SSP_GLOBAL   /* global canary */
> > +};
> >  #endif
> > diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
> > index 9b1f61394ad7..d8d605920c97 100644
> > --- a/gcc/config/arm/arm-protos.h
> > +++ b/gcc/config/arm/arm-protos.h
> > @@ -195,6 +195,8 @@ extern void arm_split_atomic_op (enum rtx_code, rtx, 
> > rtx, rtx, rtx, rtx, rtx);
> >  extern rtx arm_load_tp (rtx);
> >  extern bool arm_coproc_builtin_available (enum unspecv);
> >  extern bool arm_coproc_ldc_stc_legitimate_address (rtx);
> > +extern rtx arm_stack_protect_tls_canary_mem (bool);
> > +
> >
> >  #if defined TREE_CODE
> >  extern void arm_init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, tree);
> > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> > index a5b403eb3e49..e5077348ce07 100644
> > --- a/gcc/config/arm/arm.c
> > +++ b/gcc/config/arm/arm.c
> > @@ -829,6 +829,9 @@ static const struct attribute_spec 
> > arm_attribute_table[] =
> >
> >  #undef TARGET_MD_ASM_ADJUST
> >  #define TARGET_MD_ASM_ADJUST arm_md_asm_adjust
> > +
> > +#undef TARGET_STACK_PROTECT_GUARD
> > +#define TARGET_STACK_PROTECT_GUARD arm_stack_protect_guard
> >
> >  /* Obstack for minipool constant handling.  */
> >  static struct obstack minipool_obstack;
> > @@ -3176,6 +3179,26 @@ arm_option_override_internal (struct gcc_options 
> > *opts,
> >if (TARGET_THUMB2_P (opts->x_target_flags))
> >  opts->x_inline_asm_unified = true;
> >
> > +  if (arm_stack_protector_guard == SSP_GLOBAL
> > +  && opts->x_arm_stack_protector_guard_offset_str)
> > +{
> > +  error ("incompatible options %'-mstack-protector-guard=global%' and"
> > +"%'-mstack-protector-guard-offset=%qs%'",
> > +arm_stack_protector_guard_offset_str);
> > +}
> > +
> > +  if (opts->x_arm_stack_protector_guard_offset_str)
> > +{
> > +  char *end;
> > +  const char *str = arm_st

Re: [PATCH] ira: Fix old-reload targets [PR103974]

2022-01-12 Thread Maciej W. Rozycki
On Wed, 12 Jan 2022, Richard Biener via Gcc-patches wrote:

> > So in theory it might be just pulling the switch for some?  That is,
> > removing their definition of TARGET_LRA_P which then defaults
> > to true?
> 
> I can confirm it works for visium which builds after such change and
> can compile a hello world without crashing.  So it might be a viable
> strathegy for the less maintained odd architectures we still have in
> our tree.

 FWIW I have massaged the VAX backend sufficiently for the `vax-netbsdelf' 
target to successfully build with `-mlra' as the default.  There are still 
several ICE regressions throughout the testsuite however (none for old 
reload), to say nothing about severe code quality regression where it does 
build.  So there seems to be a lot of work to do there yet even if we let 
code quality regress with LRA.  I plan to do it, but it won't be quick.

  Maciej


Re: [PATCH] ira: Fix old-reload targets [PR103974]

2022-01-12 Thread Koning, Paul via Gcc-patches



> On Jan 12, 2022, at 1:13 PM, Hans-Peter Nilsson via Gcc-patches 
>  wrote:
> 
>> ...
> I recall comments about code quality regressions.  Are there
> actual numbers?  (Preferably from around the transition
> time, because I bet targets still supporting "-mlra" have
> regressed on the reload side since then.)

I haven't looked in a while, but it is certainly the case that the -mlra code 
out of pdp11 is not as good as that coming out of the old reload.  My 
understanding is that LRA isn't as friendly to memory-centric targets like 
pdp11 (and vax?).  In particular, from what I understood there is no support, 
or at least no significant support, for the pre-decrement and post-increment 
register indirect references that those targets like so much.

There was some suggestion along the lines of "please feel free to add it to 
LRA" but that's a seriously hairy undertaking for a programmer with no current 
knowledge of LRA at all.

 people who did porting to LRA.
>>> So in theory it might be just pulling the switch for some?  That is,
>>> removing their definition of TARGET_LRA_P which then defaults
>>> to true?
>>> 
>>> Jeff might be able to test this for (all) targets on his harness.
>> Given a patch, it's trivial do throw it in and see what the fallout is.
> 
> Again there's talk about LRA and comparing it to CC0, so
> again I'll remind of the lack of documentation for LRA (in
> contrast to CC0).  I'm not just referring to guides to use
> for switching over a target to LRA, but sure, that'll help
> too.
> 
> For starters, for each constraint and register-class macro
> and hook, what's the difference between reload and LRA;
> which ones are unused and which ones are new?


What I found interesting is that apparently, to first approximation, supporting 
LRA amounted to "just turn it on".  I think there were some issues to fix in 
register classes or constraints, more along the lines of "these are things you 
should not do for either system but the old reload usually lets you get away 
with it".  So those were handled by cleaning up the issue in question 
generally, not as an LRA-specific change.

Compared to CC0 work the effort was vastly smaller, it's a major rewrite of the 
back end vs. a few small changes in a few spots.  But it's quite possible that 
the true picture is different and that there should be more changes.  And yes, 
there really should be documentation saying so.  GCCint has traditionally been 
quite excellent; it would be distressing if the creation of new technology like 
LRA causes it to regress.

paul



[PATCH] rs6000: Convert built-in constraints to form

2022-01-12 Thread Bill Schmidt via Gcc-patches
Hi!

When introducing the new built-in support, I tried to match as many
existing error messages as possible.  One common form was "argument X must
be a Y-bit unsigned literal".  Another was "argument X must be a literal
between X' and  Y', inclusive".  During reviews, Segher requested that I
eventually convert all messages of the first form into the second form for
consistency.  That's what this patch does, replacing all -form
constraints (first form) with -form constraints (second form).

For the moment, the parser will still accept  arguments, but I've added
a note in rs6000-builtins.def that this form is deprecated in favor of
.  I think it's harmless to leave it in, in case a desire for the
distinction comes up in the future.

Bootstrapped and tested on powerpc64le-linux-gnu with no regressions.
Is this okay for trunk?

Thanks!
Bill

2022-01-12  Bill Schmidt  

gcc/
* config/rs6000/rs6000-builtins.def (MTFSB0): Replace -form
constraints with -form constraints.
(MTFSB1): Likewise.
(MTFSF): Likewise.
(UNPACK_IF): Likewise.
(UNPACK_TF): Likewise.
(DSS): Likewise.
(DST): Likewise.
(DSTST): Likewise.
(DSTSTT): Likewise.
(DSTT): Likewise.
(VCFSX): Likewise.
(VCFUX): Likewise.
(VCTSXS): Likewise.
(VCTUXS): Likewise.
(VSLDOI_16QI): Likewise.
(VSLDOI_4SF): Likewise.
(VSLDOI_4SI): Likewise.
(VSLDOI_8HI): Likewise.
(VSPLTB): Likewise.
(VSPLTH): Likewise.
(VSPLTW): Likewise.
(VEC_SET_V16QI): Likewise.
(VEC_SET_V4SF): Likewise.
(VEC_SET_V4SI): Likewise.
(VEC_SET_V8HI): Likewise.
(VSLDOI_2DF): Likewise.
(VSLDOI_2DI): Likewise.
(VEC_SET_V2DF): Likewise.
(VEC_SET_V2DI): Likewise.
(XVCVSXDDP_SCALE): Likewise.
(XVCVUXDDP_SCALE): Likewise.
(XXPERMDI_16QI): Likewise.
(XXPERMDI_1TI): Likewise.
(XXPERMDI_2DF): Likewise.
(XXPERMDI_2DI): Likewise.
(XXPERMDI_4SF): Likewise.
(XXPERMDI_4SI): Likewise.
(XXPERMDI_8HI): Likewise.
(XXSLDWI_16QI): Likewise.
(XXSLDWI_2DF): Likewise.
(XXSLDWI_2DI): Likewise.
(XXSLDWI_4SF): Likewise.
(XXSLDWI_4SI): Likewise.
(XXSLDWI_8HI): Likewise.
(XXSPLTD_V2DF): Likewise.
(XXSPLTD_V2DI): Likewise.
(UNPACK_V1TI): Likewise.
(BCDADD_V1TI): Likewise.
(BCDADD_V16QI): Likewise.
(BCDADD_EQ_V1TI): Likewise.
(BCDADD_EQ_V16QI): Likewise.
(BCDADD_GT_V1TI): Likewise.
(BCDADD_GT_V16QI): Likewise.
(BCDADD_LT_V1TI): Likewise.
(BCDADD_LT_V16QI): Likewise.
(BCDADD_OV_V1TI): Likewise.
(BCDADD_OV_V16QI): Likewise.
(BCDSUB_V1TI): Likewise.
(BCDSUB_V16QI): Likewise.
(BCDSUB_EQ_V1TI): Likewise.
(BCDSUB_EQ_V16QI): Likewise.
(BCDSUB_GT_V1TI): Likewise.
(BCDSUB_GT_V16QI): Likewise.
(BCDSUB_LT_V1TI): Likewise.
(BCDSUB_LT_V16QI): Likewise.
(BCDSUB_OV_V1TI): Likewise.
(BCDSUB_OV_V16QI): Likewise.
(VSTDCDP): Likewise.
(VSTDCSP): Likewise.
(VTDCDP): Likewise.
(VTDCSP): Likewise.
(TSTSFI_EQ_DD): Likewise.
(TSTSFI_EQ_TD): Likewise.
(TSTSFI_GT_DD): Likewise.
(TSTSFI_GT_TD): Likewise.
(TSTSFI_LT_DD): Likewise.
(TSTSFI_LT_TD): Likewise.
(TSTSFI_OV_DD): Likewise.
(TSTSFI_OV_TD): Likewise.
(VSTDCQP): Likewise.
(DDEDPD): Likewise.
(DDEDPDQ): Likewise.
(DENBCD): Likewise.
(DENBCDQ): Likewise.
(DSCLI): Likewise.
(DSCLIQ): Likewise.
(DSCRI): Likewise.
(DSCRIQ): Likewise.
(UNPACK_TD): Likewise.
(VSHASIGMAD): Likewise.
(VSHASIGMAW): Likewise.
(VCNTMBB): Likewise.
(VCNTMBD): Likewise.
(VCNTMBH): Likewise.
(VCNTMBW): Likewise.
(VREPLACE_UN_UV2DI): Likewise.
(VREPLACE_UN_UV4SI): Likewise.
(VREPLACE_UN_V2DF): Likewise.
(VREPLACE_UN_V2DI): Likewise.
(VREPLACE_UN_V4SF): Likewise.
(VREPLACE_UN_V4SI): Likewise.
(VREPLACE_ELT_UV2DI): Likewise.
(VREPLACE_ELT_UV4SI): Likewise.
(VREPLACE_ELT_V2DF): Likewise.
(VREPLACE_ELT_V2DI): Likewise.
(VREPLACE_ELT_V4SF): Likewise.
(VREPLACE_ELT_V4SI): Likewise.
(VSLDB_V16QI): Likewise.
(VSLDB_V2DI): Likewise.
(VSLDB_V4SI): Likewise.
(VSLDB_V8HI): Likewise.
(VSRDB_V16QI): Likewise.
(VSRDB_V2DI): Likewise.
(VSRDB_V4SI): Likewise.
(VSRDB_V8HI): Likewise.
(VXXSPLTI32DX_V4SF): Likewise.
(VXXSPLTI32DX_V4SI): Likewise.
(XXEVAL): Likewise.
(XXGENPCVM_V16QI): Likewise.
(XXGENPCVM_V2DI): Likewise.
(XXGENPCVM_V4SI): Likewise.
(XXGENPCVM_V8HI): Likewise.
   

[PATCH] i386: Add CC clobber and splits for 32-bit vector mode logic insns [PR100673, PR103861]

2022-01-12 Thread Uros Bizjak via Gcc-patches
Add CC clobber to 32-bit vector mode logic insns to allow variants with
general-purpose registers.  Also improve ix86_sse_movcc to emit insn with
CC clobber for narrow vector modes in order to re-enable conditional moves
for 16-bit and 32-bit narrow vector modes with -msse2.

2022-01-12  Uroš Bizjak  

gcc/ChangeLog:

PR target/100637
PR target/103861
* config/i386/i386-expand.c (ix86_emit_vec_binop): New static function.
(ix86_expand_sse_movcc): Use ix86_emit_vec_binop instead of gen_rtx_X
when constructing vector logic RTXes.
(expand_vec_perm_pshufb2): Ditto.
* config/i386/mmx.md (negv2qi): Disparage GPR alternative a bit.
(v2qi3): Ditto.
(vcond): Re-enable for TARGET_SSE2.
(vcondu): Ditto.
(vcond_mask_): Ditto.
(one_cmpl2): Remove expander.
(one_cmpl2): Rename from one_cmplv2qi.
Use VI_16_32 mode iterator.
(one_cmpl2 splitters): Use VI_16_32 mode iterator.
Use lowpart_subreg instead of gen_lowpart to create subreg.
(*andnot3): Merge from "*andnot" and
"*andnotv2qi3" insn patterns using VI_16_32 mode iterator.
Disparage GPR alternative a bit.  Add CC clobber.
(*andnot3 splitters): Use VI_16_32 mode iterator.
Use lowpart_subreg instead of gen_lowpart to create subreg.
(*3): Merge from
"*" and "*v2qi3 insn patterns"
using VI_16_32 mode iterator.  Disparage GPR alternative a bit.
Add CC clobber.
(*3 splitters):Use VI_16_32 mode
iterator.  Use lowpart_subreg instead of gen_lowpart to create subreg.

gcc/testsuite/ChangeLog:

PR target/100637
PR target/103861
* g++.target/i386/pr100637-1b.C (dg-options):
Use -msse2 instead of -msse4.1.
* g++.target/i386/pr100637-1w.C (dg-options): Ditto.
* g++.target/i386/pr103861-1.C (dg-options): Ditto.
* gcc.target/i386/pr100637-4b.c (dg-options): Ditto.
* gcc.target/i386/pr103861-4.c (dg-options): Ditto.
* gcc.target/i386/pr100637-1b.c: Remove scan-assembler
directives for logic instructions.
* gcc.target/i386/pr100637-1w.c: Ditto.
* gcc.target/i386/warn-vect-op-2.c:
Update dg-warning for vector logic operation.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Pushed to master.

Uros.
diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index 8b1266fb9f1..0318f126785 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -3752,6 +3752,27 @@ ix86_expand_sse_cmp (rtx dest, enum rtx_code code, rtx 
cmp_op0, rtx cmp_op1,
   return dest;
 }
 
+/* Emit x86 binary operand CODE in mode MODE for SSE vector
+   instructions that can be performed using GP registers.  */
+
+static void
+ix86_emit_vec_binop (enum rtx_code code, machine_mode mode,
+rtx dst, rtx src1, rtx src2)
+{
+  rtx tmp;
+
+  tmp = gen_rtx_SET (dst, gen_rtx_fmt_ee (code, mode, src1, src2));
+
+  if (GET_MODE_SIZE (mode) <= GET_MODE_SIZE (SImode)
+  && GET_MODE_CLASS (mode) == MODE_VECTOR_INT)
+{
+  rtx clob = gen_rtx_CLOBBER (VOIDmode, gen_rtx_REG (CCmode, FLAGS_REG));
+  tmp = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, tmp, clob));
+}
+
+  emit_insn (tmp);
+}
+
 /* Expand DEST = CMP ? OP_TRUE : OP_FALSE into a sequence of logical
operations.  This is used for both scalar and vector conditional moves.  */
 
@@ -3820,23 +3841,20 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, 
rtx op_false)
   else if (op_false == CONST0_RTX (mode))
 {
   op_true = force_reg (mode, op_true);
-  x = gen_rtx_AND (mode, cmp, op_true);
-  emit_insn (gen_rtx_SET (dest, x));
+  ix86_emit_vec_binop (AND, mode, dest, cmp, op_true);
   return;
 }
   else if (op_true == CONST0_RTX (mode))
 {
   op_false = force_reg (mode, op_false);
   x = gen_rtx_NOT (mode, cmp);
-  x = gen_rtx_AND (mode, x, op_false);
-  emit_insn (gen_rtx_SET (dest, x));
+  ix86_emit_vec_binop (AND, mode, dest, x, op_false);
   return;
 }
   else if (INTEGRAL_MODE_P (mode) && op_true == CONSTM1_RTX (mode))
 {
   op_false = force_reg (mode, op_false);
-  x = gen_rtx_IOR (mode, cmp, op_false);
-  emit_insn (gen_rtx_SET (dest, x));
+  ix86_emit_vec_binop (IOR, mode, dest, cmp, op_false);
   return;
 }
   else if (TARGET_XOP)
@@ -4010,15 +4028,12 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, 
rtx op_false)
   else
t3 = dest;
 
-  x = gen_rtx_AND (mode, op_true, cmp);
-  emit_insn (gen_rtx_SET (t2, x));
+  ix86_emit_vec_binop (AND, mode, t2, op_true, cmp);
 
   x = gen_rtx_NOT (mode, cmp);
-  x = gen_rtx_AND (mode, x, op_false);
-  emit_insn (gen_rtx_SET (t3, x));
+  ix86_emit_vec_binop (AND, mode, t3, x, op_false);
 
-  x = gen_rtx_IOR (mode, t3, t2);
-  emit_insn (gen_rtx_SET (dest, x));
+  ix86_emit_vec_binop (IOR, mode, dest, t3, t2);
 }
 }
 
@@ -20733,7 +20748,7 @@ expand_vec_perm_pshufb2 (struct expand_vec_perm_d *d)
   op = d->target;
   if (d-

Re: [PATCH] x86_64: Improvements to arithmetic right shifts of V1TImode values.

2022-01-12 Thread Uros Bizjak via Gcc-patches
On Tue, Jan 11, 2022 at 2:26 PM Roger Sayle  wrote:
>
>
> This patch to the i386 backend's ix86_expand_v1ti_ashiftrt provides
> improved (shorter) implementations of V1TI mode arithmetic right shifts
> for constant amounts between 111 and 126 bits.  The significance of
> this range is that this functionality is useful for (eventually)
> providing sign extension from HImode and QImode to V1TImode.
>
> For example, x>>112 (to sign extend a 16-bit value), was previously
> generated as a four operation sequence:
>
> movdqa  %xmm0, %xmm1// word7 6 5 4 3 2 1 0
> psrad   $31, %xmm0  // V8HI = [S,S,?,?,?,?,?,?]
> psrad   $16, %xmm1  // V8HI = [S,X,?,?,?,?,?,?]
> punpckhqdq  %xmm0, %xmm1// V8HI = [S,S,?,?,S,X,?,?]
> pshufd  $253, %xmm1, %xmm0  // V8HI = [S,S,S,S,S,S,S,X]
>
> with this patch, we now generates a three operation sequence:
>
> psrad   $16, %xmm0  // V8HI = [S,X,?,?,?,?,?,?]
> pshufhw $254, %xmm0, %xmm0  // V8HI = [S,S,S,X,?,?,?,?]
> pshufd  $254, %xmm0, %xmm0  // V8HI = [S,S,S,S,S,S,S,X]
>
> The correctness of generated code is confirmed by the existing
> run-time test gcc.target/i386/sse2-v1ti-ashiftrt-1.c in the testsuite.
> This idiom is safe to use for shifts by 127, but that case gets handled
> by a two operation sequence earlier in this function.
>
>
> This patch has been tested on x86_64-pc-linux-gnu with a make bootstrap
> and make -k check with no new failures.  OK for mainline?
>
>
> 2022-01-11  Roger Sayle  
>
> gcc/ChangeLog
> * config/i386/i386-expand.c (ix86_expand_v1ti_ashiftrt): Provide
> new three operation implementations for shifts by 111..126 bits.

+  if (bits >= 111)
+{
+  /* Three operations.  */
+  rtx tmp1 = gen_reg_rtx (V4SImode);
+  rtx tmp2 = gen_reg_rtx (V4SImode);
+  emit_move_insn (tmp1, gen_lowpart (V4SImode, op1));
+  emit_insn (gen_ashrv4si3 (tmp2, tmp1, GEN_INT (bits - 96)));

This can be written as:

rtx tmp1 = force_reg (V4SImode, gen_lowpart (V4SImode, op1));
emit_insn (gen_ashrv4i3 (tmp2, tmp1, GEN_INT ...));

+  rtx tmp3 = gen_reg_rtx (V8HImode);
+  rtx tmp4 = gen_reg_rtx (V8HImode);
+  emit_move_insn (tmp3, gen_lowpart (V8HImode, tmp2));
+  emit_insn (gen_sse2_pshufhw (tmp4, tmp3, GEN_INT (0xfe)));

Here in a similar way...

+  rtx tmp5 = gen_reg_rtx (V4SImode);
+  rtx tmp6 = gen_reg_rtx (V4SImode);
+  emit_move_insn (tmp5, gen_lowpart (V4SImode, tmp4));
+  emit_insn (gen_sse2_pshufd (tmp6, tmp5, GEN_INT (0xfe)));

... also here.

+  rtx tmp7 = gen_reg_rtx (V1TImode);
+  emit_move_insn (tmp7, gen_lowpart (V1TImode, tmp6));
+  emit_move_insn (operands[0], tmp7);

And here a simple:

emit_move_insn (operands[0], gen_lowpart (V1TImode, tmp6);

+  return;
+}
+

Uros.


Re: [PATCH] [i386] Optimize a ^ ((a ^ b) & mask) to (~mask & a) | (b & mask).

2022-01-12 Thread Uros Bizjak via Gcc-patches
On Wed, Jan 12, 2022 at 9:11 AM Haochen Jiang  wrote:
>
> Hi all,
>
> This patch targets PR94790, which change the instruction selection under the 
> following circumstance.
>
> Regtested on x86_64-pc-linux-gnu. Ok for trunk?

Please also test with -m32, e.g.:

make -j 12 -k check RUNTESTFLAGS="--target_board=unix\{,-m32\}"

OK (with an it below), if new testcases do not FAIL with -m32.

Thanks,
Uros.

>
> BRs,
> Haochen
>
> From the perspective of the pipeline, `andn + and + ior` version take
> 2 cycles(AND and ANDN doesn't have dependence), but xor + and + xor
> will take 3 cycles.
>
> -   xorl%edi, %esi
> andl%edx, %esi
> -   movl%esi, %eax
> -   xorl%edi, %eax
> +   andn%edi, %edx, %eax
> +   orl %esi, %eax
>
> gcc/ChangeLog:
>
> PR taeget/94790
> * config/i386/i386.md (*xor2andn): New define_insn_and_split.
>
> gcc/testsuite/ChangeLog:
>
> PR taeget/94790
> * gcc.target/i386/pr94790-1.c: New test.
> * gcc.target/i386/pr94790-2.c: Ditto.
> ---
>  gcc/config/i386/i386.md   | 39 +++
>  gcc/testsuite/gcc.target/i386/pr94790-1.c | 14 
>  gcc/testsuite/gcc.target/i386/pr94790-2.c |  9 ++
>  3 files changed, 62 insertions(+)
>  create mode 100755 gcc/testsuite/gcc.target/i386/pr94790-1.c
>  create mode 100755 gcc/testsuite/gcc.target/i386/pr94790-2.c
>
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index 9b424a3935b..38efc6d5837 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -10452,6 +10452,45 @@
> (set_attr "znver1_decode" "double")
> (set_attr "mode" "DI")])
>
> +;; PR target/94790: Optimize a ^ ((a ^ b) & mask) to (~mask & a) | (b & mask)
> +(define_insn_and_split "*xor2andn"
> +  [(set (match_operand:SWI248 0 "nonimmediate_operand")
> +   (xor:SWI248
> + (and:SWI248
> +   (xor:SWI248
> + (match_operand:SWI248 1 "nonimmediate_operand")
> + (match_operand:SWI248 2 "nonimmediate_operand"))
> +   (match_operand:SWI248 3 "nonimmediate_operand"))
> + (match_dup 1)))
> +(clobber (reg:CC FLAGS_REG))]
> +  "(TARGET_BMI || TARGET_AVX512BW)
> +   && ix86_pre_reload_split ()"
> +  "#"
> +  "&& 1"
> +  [(parallel [(set (match_dup 4)
> +   (and:SWI248
> + (not:SWI248
> +   (match_dup 3))
> + (match_dup 1)))
> + (clobber (reg:CC FLAGS_REG))])
> +   (parallel [(set (match_dup 5)
> +   (and:SWI248
> + (match_dup 2)
> + (match_dup 3)))
> + (clobber (reg:CC FLAGS_REG))])
> +   (parallel [(set (match_dup 0)
> +   (ior:SWI248
> + (match_dup 4)
> + (match_dup 5)))
> + (clobber (reg:CC FLAGS_REG))])]
> +  {
> +operands[1] = force_reg (mode, operands[1]);
> +operands[3] = force_reg (mode, operands[3]);
> +operands[4] = gen_reg_rtx (mode);
> +operands[5] = gen_reg_rtx (mode);
> +  }
> +)

Please put brace just after the curved brace, see numerous examples in
.md files.

> +
>  ;; See comment for addsi_1_zext why we do use nonimmediate_operand
>  (define_insn "*si_1_zext"
>[(set (match_operand:DI 0 "register_operand" "=r")
> diff --git a/gcc/testsuite/gcc.target/i386/pr94790-1.c 
> b/gcc/testsuite/gcc.target/i386/pr94790-1.c
> new file mode 100755
> index 000..6ebbec15cfd
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr94790-1.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mbmi" } */
> +/* { dg-final { scan-assembler-times "andn\[ \\t\]" 2 } } */
> +/* { dg-final { scan-assembler-not "xorl\[ \\t\]" } } */
> +
> +unsigned r1(unsigned a, unsigned b, unsigned mask)
> +{
> +  return a ^ ((a ^ b) & mask);
> +}
> +
> +unsigned r2(unsigned a, unsigned b, unsigned mask)
> +{
> +  return (~mask & a) | (b & mask);
> +}
> diff --git a/gcc/testsuite/gcc.target/i386/pr94790-2.c 
> b/gcc/testsuite/gcc.target/i386/pr94790-2.c
> new file mode 100755
> index 000..d7b0eec5bef
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr94790-2.c
> @@ -0,0 +1,9 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mbmi" } */
> +/* { dg-final { scan-assembler-not "andn\[ \\t\]" } } */
> +/* { dg-final { scan-assembler-times "xorl\[ \\t\]" 2 } } */
> +
> +unsigned r1(unsigned a, unsigned b, unsigned mask)
> +{
> +  return a ^ ((a ^ b) & mask) + (a ^ b);
> +}
> --
> 2.18.1
>


[PATCH] testsuite: Compile g++.dg/vect/slp-pr98855.cc only for x86 targets [PR103935]

2022-01-12 Thread Uros Bizjak via Gcc-patches
The testcase is x86 specific, other targets have different costs defined.

2022-01-12  Uroš Bizjak  

gcc/testsuite/ChangeLog:

PR target/103935
* g++.dg/vect/slp-pr98855.cc: Compile only for x86 targets.

Tested on x86_64-linux-gnu {,-m32}.

Pushed to master.

Uros.
diff --git a/gcc/testsuite/g++.dg/vect/slp-pr98855.cc 
b/gcc/testsuite/g++.dg/vect/slp-pr98855.cc
index ff59eb95aca..e0527c492d5 100644
--- a/gcc/testsuite/g++.dg/vect/slp-pr98855.cc
+++ b/gcc/testsuite/g++.dg/vect/slp-pr98855.cc
@@ -1,6 +1,5 @@
-// { dg-do compile }
-// { dg-additional-options "-fvect-cost-model=cheap" }
-// { dg-additional-options "-mavx2" { target x86_64-*-* i?86-*-* } }
+// { dg-do compile { target i?86-*-* x86_64-*-* } }
+// { dg-additional-options "-fvect-cost-model=cheap -mavx2" }
 
 #include 
 #include 


[PATCH] PR fortran/67804 - ICE on data initialization of type(character) with wrong data

2022-01-12 Thread Harald Anlauf via Gcc-patches
Dear Fortranners,

the attached patch improves error recovery after an invalid
structure constructor has been detected in a DATA statement.

Testcase by Gerhard.

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

This should be a rather safe patch which I would like to
backport to 11-branch after a suitable waiting period.

Thanks,
Harald

From 31436189cb2859040703ec6baff816cd63ef Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Wed, 12 Jan 2022 21:24:49 +0100
Subject: [PATCH] Fortran: fix error recovery on bad structure constructor in
 DATA statement

gcc/fortran/ChangeLog:

	PR fortran/67804
	* primary.c (gfc_match_structure_constructor): Recover from errors
	that occurred while checking for a valid structure constructor in
	a DATA statement.

gcc/testsuite/ChangeLog:

	PR fortran/67804
	* gfortran.dg/pr93604.f90: Adjust to changed diagnostics.
	* gfortran.dg/pr67804.f90: New test.
---
 gcc/fortran/primary.c | 15 ---
 gcc/testsuite/gfortran.dg/pr67804.f90 | 25 +
 gcc/testsuite/gfortran.dg/pr93604.f90 |  2 +-
 3 files changed, 38 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/pr67804.f90

diff --git a/gcc/fortran/primary.c b/gcc/fortran/primary.c
index fd4d6af50c0..3f01f67cd49 100644
--- a/gcc/fortran/primary.c
+++ b/gcc/fortran/primary.c
@@ -3364,6 +3364,7 @@ gfc_match_structure_constructor (gfc_symbol *sym, gfc_expr **result)
   match m;
   gfc_expr *e;
   gfc_symtree *symtree;
+  bool t = true;

   gfc_get_ha_sym_tree (sym->name, &symtree);

@@ -3394,10 +3395,18 @@ gfc_match_structure_constructor (gfc_symbol *sym, gfc_expr **result)
  in the structure constructor must be a constant.  Try to reduce the
  expression here.  */
   if (gfc_in_match_data ())
-gfc_reduce_init_expr (e);
+t = gfc_reduce_init_expr (e);

-  *result = e;
-  return MATCH_YES;
+  if (t)
+{
+  *result = e;
+  return MATCH_YES;
+}
+  else
+{
+  gfc_free_expr (e);
+  return MATCH_ERROR;
+}
 }


diff --git a/gcc/testsuite/gfortran.dg/pr67804.f90 b/gcc/testsuite/gfortran.dg/pr67804.f90
new file mode 100644
index 000..e2009a5bfdb
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr67804.f90
@@ -0,0 +1,25 @@
+! { dg-do compile }
+! PR fortran/67804 - ICE on bad type in structure constructor in DATA statement
+! Contributed by G.Steinmetz
+
+program p
+  type t
+ character :: c
+  end type
+  type u
+ character, pointer :: c
+  end type
+  type(t) :: x0, x1, x2, x3, x4, x5, x6, x7, x8, x9
+  type(u) :: y6
+  data x0 /t('a')/ ! OK
+  data x1 /t(1)/   ! { dg-error "Cannot convert" }
+  data x2 /t(1.)/  ! { dg-error "Cannot convert" }
+  data x3 /t(1d1)/ ! { dg-error "Cannot convert" }
+  data x4 /t((0.,1.))/ ! { dg-error "Cannot convert" }
+  data x5 /t(.true.)/  ! { dg-error "Cannot convert" }
+  data x6 /t(null())/  ! { dg-error "neither a POINTER nor ALLOCATABLE" }
+  data x7 /t(['1'])/   ! { dg-error "The rank of the element" }
+  data x8 /t([1])/ ! { dg-error "Cannot convert" }
+  data x9 /t(z'0')/! { dg-error "Cannot convert" }
+  data y6 /u(null())/  ! OK
+end
diff --git a/gcc/testsuite/gfortran.dg/pr93604.f90 b/gcc/testsuite/gfortran.dg/pr93604.f90
index 2c695d37829..4040155120c 100644
--- a/gcc/testsuite/gfortran.dg/pr93604.f90
+++ b/gcc/testsuite/gfortran.dg/pr93604.f90
@@ -5,6 +5,6 @@ program p
   integer :: a
end type
type(t) :: x
-   data x /t(z'1')/ ! { dg-error "cannot appear in a structure constructor" }
+   data x /t(z'1')/ ! { dg-error "BOZ" }
 end

--
2.31.1



RE: [PATCH] [i386] Optimize a ^ ((a ^ b) & mask) to (~mask & a) | (b & mask).

2022-01-12 Thread Jiang, Haochen via Gcc-patches
Hi Uros,

I have also tested on -m32. They do not fail.

Thx,
Haochen

-Original Message-
From: Uros Bizjak  
Sent: Thursday, January 13, 2022 3:22 AM
To: Jiang, Haochen 
Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao 
Subject: Re: [PATCH] [i386] Optimize a ^ ((a ^ b) & mask) to (~mask & a) | (b & 
mask).

On Wed, Jan 12, 2022 at 9:11 AM Haochen Jiang  wrote:
>
> Hi all,
>
> This patch targets PR94790, which change the instruction selection under the 
> following circumstance.
>
> Regtested on x86_64-pc-linux-gnu. Ok for trunk?

Please also test with -m32, e.g.:

make -j 12 -k check RUNTESTFLAGS="--target_board=unix\{,-m32\}"

OK (with an it below), if new testcases do not FAIL with -m32.

Thanks,
Uros.

>
> BRs,
> Haochen
>
> From the perspective of the pipeline, `andn + and + ior` version take
> 2 cycles(AND and ANDN doesn't have dependence), but xor + and + xor 
> will take 3 cycles.
>
> -   xorl%edi, %esi
> andl%edx, %esi
> -   movl%esi, %eax
> -   xorl%edi, %eax
> +   andn%edi, %edx, %eax
> +   orl %esi, %eax
>
> gcc/ChangeLog:
>
> PR taeget/94790
> * config/i386/i386.md (*xor2andn): New define_insn_and_split.
>
> gcc/testsuite/ChangeLog:
>
> PR taeget/94790
> * gcc.target/i386/pr94790-1.c: New test.
> * gcc.target/i386/pr94790-2.c: Ditto.
> ---
>  gcc/config/i386/i386.md   | 39 +++
>  gcc/testsuite/gcc.target/i386/pr94790-1.c | 14   
> gcc/testsuite/gcc.target/i386/pr94790-2.c |  9 ++
>  3 files changed, 62 insertions(+)
>  create mode 100755 gcc/testsuite/gcc.target/i386/pr94790-1.c
>  create mode 100755 gcc/testsuite/gcc.target/i386/pr94790-2.c
>
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 
> 9b424a3935b..38efc6d5837 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -10452,6 +10452,45 @@
> (set_attr "znver1_decode" "double")
> (set_attr "mode" "DI")])
>
> +;; PR target/94790: Optimize a ^ ((a ^ b) & mask) to (~mask & a) | (b 
> +& mask) (define_insn_and_split "*xor2andn"
> +  [(set (match_operand:SWI248 0 "nonimmediate_operand")
> +   (xor:SWI248
> + (and:SWI248
> +   (xor:SWI248
> + (match_operand:SWI248 1 "nonimmediate_operand")
> + (match_operand:SWI248 2 "nonimmediate_operand"))
> +   (match_operand:SWI248 3 "nonimmediate_operand"))
> + (match_dup 1)))
> +(clobber (reg:CC FLAGS_REG))]
> +  "(TARGET_BMI || TARGET_AVX512BW)
> +   && ix86_pre_reload_split ()"
> +  "#"
> +  "&& 1"
> +  [(parallel [(set (match_dup 4)
> +   (and:SWI248
> + (not:SWI248
> +   (match_dup 3))
> + (match_dup 1)))
> + (clobber (reg:CC FLAGS_REG))])
> +   (parallel [(set (match_dup 5)
> +   (and:SWI248
> + (match_dup 2)
> + (match_dup 3)))
> + (clobber (reg:CC FLAGS_REG))])
> +   (parallel [(set (match_dup 0)
> +   (ior:SWI248
> + (match_dup 4)
> + (match_dup 5)))
> + (clobber (reg:CC FLAGS_REG))])]
> +  {
> +operands[1] = force_reg (mode, operands[1]);
> +operands[3] = force_reg (mode, operands[3]);
> +operands[4] = gen_reg_rtx (mode);
> +operands[5] = gen_reg_rtx (mode);
> +  }
> +)

Please put brace just after the curved brace, see numerous examples in .md 
files.

> +
>  ;; See comment for addsi_1_zext why we do use nonimmediate_operand  
> (define_insn "*si_1_zext"
>[(set (match_operand:DI 0 "register_operand" "=r") diff --git 
> a/gcc/testsuite/gcc.target/i386/pr94790-1.c 
> b/gcc/testsuite/gcc.target/i386/pr94790-1.c
> new file mode 100755
> index 000..6ebbec15cfd
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr94790-1.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mbmi" } */
> +/* { dg-final { scan-assembler-times "andn\[ \\t\]" 2 } } */
> +/* { dg-final { scan-assembler-not "xorl\[ \\t\]" } } */
> +
> +unsigned r1(unsigned a, unsigned b, unsigned mask) {
> +  return a ^ ((a ^ b) & mask);
> +}
> +
> +unsigned r2(unsigned a, unsigned b, unsigned mask) {
> +  return (~mask & a) | (b & mask);
> +}
> diff --git a/gcc/testsuite/gcc.target/i386/pr94790-2.c 
> b/gcc/testsuite/gcc.target/i386/pr94790-2.c
> new file mode 100755
> index 000..d7b0eec5bef
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr94790-2.c
> @@ -0,0 +1,9 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mbmi" } */
> +/* { dg-final { scan-assembler-not "andn\[ \\t\]" } } */
> +/* { dg-final { scan-assembler-times "xorl\[ \\t\]" 2 } } */
> +
> +unsigned r1(unsigned a, unsigned b, unsigned mask) {
> +  return a ^ ((a ^ b) & mask) + (a ^ b); }
> --
> 2.18.1
>


Re: [PATCH, rs6000] Enable absolute jump table by default

2022-01-12 Thread HAO CHEN GUI via Gcc-patches
Hi David,

On 12/1/2022 下午 10:44, David Edelsohn wrote:
> On Wed, Jan 12, 2022 at 7:22 AM HAO CHEN GUI  wrote:
>>
>> Hi,
>>This patch enables absolute jump table by default on rs6000. The relative 
>> jump tables are used when
>>it's explicit set by "rs6000_relative_jumptables",
>>or jump tables are placed in text section but global relocation is 
>> required.
>>
>>Bootstrapped and tested on powerpc64-linux BE and LE with no regressions. 
>> Is this okay for trunk?
>> Any recommendations? Thanks a lot.
>>
>> ChangeLog
>> 2022-01-12 Haochen Gui 
>>
>> gcc/
>> * config/rs6000/linux64.h (JUMP_TABLES_IN_TEXT_SECTION): Define.
>> * config/rs6000/rs6000.c (rs6000_gen_pic_addr_diff_vec): Return
>> true when relative jump table is explicit required or jump tables 
>> have
>> to be put in text section but global relocation is also required.
>> * config/rs6000/rs6000.opt (rs6000_relative_jumptables): Disable.
>>
>> patch.diff
>> diff --git a/gcc/config/rs6000/linux64.h b/gcc/config/rs6000/linux64.h
>> index d617f346f81..2e257c60f8c 100644
>> --- a/gcc/config/rs6000/linux64.h
>> +++ b/gcc/config/rs6000/linux64.h
>> @@ -239,7 +239,7 @@ extern int dot_symbols;
>>
>>  /* Indicate that jump tables go in the text section.  */
>>  #undef  JUMP_TABLES_IN_TEXT_SECTION
>> -#define JUMP_TABLES_IN_TEXT_SECTION TARGET_64BIT
>> +#define JUMP_TABLES_IN_TEXT_SECTION 0
>>
>>  /* The linux ppc64 ABI isn't explicit on whether aggregates smaller
>> than a doubleword should be padded upward or downward.  You could
>> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
>> index 319182e94d9..9fba893a27a 100644
>> --- a/gcc/config/rs6000/rs6000.c
>> +++ b/gcc/config/rs6000/rs6000.c
>> @@ -28465,7 +28465,9 @@ rs6000_emit_xxspltidp_v2df (rtx dst, long value)
>>  static bool
>>  rs6000_gen_pic_addr_diff_vec (void)
>>  {
>> -  return rs6000_relative_jumptables;
>> +  return rs6000_relative_jumptables
>> +|| (JUMP_TABLES_IN_TEXT_SECTION
>> +&& targetm.asm_out.reloc_rw_mask () == 3);
>>  }
> 
> This seems like contorted logic and overriding the
> rs6000_relative_jumptables option change.  The later part of the patch
> overrides rs6000_relative_jumptables for all rs6000 configurations,
> and then changes this one use of rs6000_relative_jumptables to add
> more logic to revert to the old meaning for some targets.
> 
> What about all of the other uses of rs6000_relative_jumptables in the
> target?  What about rs6000.md?
> 
> I highly doubt that this patch is correct.
> 
> Why not override rs6000_relative_jumptables for PPC64 Linux instead of
> changing its value globally and then trying to fix it up?
> 
> Thanks, David
  Thanks for your comments.

  In this patch, I tried to enable absolute jump table on all rs6000 targets.
For PPC64 Linux, it supports RELRO section (e.g. .data.rel.ro.local) as
"JUMP_TABLES_IN_TEXT_SECTION" is set to 0. So, absolute jump tables could be 
placed
in RELRO section whatever global relocation is required or not. The absolute 
jump
table can't be placed in text section when global relocation is required as text
section is not writable. So for other rs6000 targets, absolute jump table can't 
be
used if the target doesn't support RELRO and global relocation is required also.

  Looking forward to your advice. Thanks again.
> 
>>
>>  void
>> diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
>> index c2a77182a9e..75e3fa86829 100644
>> --- a/gcc/config/rs6000/rs6000.opt
>> +++ b/gcc/config/rs6000/rs6000.opt
>> @@ -630,7 +630,7 @@ Target Mask(MMA) Var(rs6000_isa_flags)
>>  Generate (do not generate) MMA instructions.
>>
>>  mrelative-jumptables
>> -Target Undocumented Var(rs6000_relative_jumptables) Init(1) Save
>> +Target Undocumented Var(rs6000_relative_jumptables) Init(0) Save
>>
>>  mrop-protect
>>  Target Var(rs6000_rop_protect) Init(0)


[Patch][V4][Patch 2/2]Enable -Wuninitialized + -ftrivial-auto-var-init for address taken variables.

2022-01-12 Thread Qing Zhao via Gcc-patches
Hi, Richard,

This is the updated version for the second patch, which is mainly the change 
for "Enable -Wuninitialized + -ftrivial-auto-var-init for  address taken 
variables”.

In this update, I mainly made the following change:

1.  Delete “repl_var”, use the var_def_stmt, i.e, the call to .DEFERRED_INIT to 
record the warning suppressed info.
2. Add and change the comments in multiple places to make the change more 
readable. 

Now, for the deleted variable, we will get the necessary info to report warning 
from the call to .DEFERRED_INIT.

A. the name string of DECL from the 3rd parameter of the call;
B. the location of the DECL from the location of the call;
C. the call can also be used to hold the information on whether the warning
   has been issued or not to suppress warning messages when needed;

Please see the detailed description below for the problem and solution of this 
patch.

This patch has been bootstrapped and regressing tested on both X86 and aarch64.

Okay for GCC12?

thanks.

Qing.


Enable -Wuninitialized + -ftrivial-auto-var-init for address
 taken variables.

With -ftrivial-auto-var-init, the address taken auto variable is replaced with
a temporary variable during gimplification, and the original auto variable might
be eliminated by compiler optimization completely. As a result, the current
uninitialized warning analysis cannot get enough information from the IR,
therefore the uninitialized warnings for address taken variable cannot be
issued based on the current implemenation of -ftrival-auto-var-init.

For more info please refer to:
https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577431.html

In order to improve this situation, we can improve uninitialized analysis
for address taken auto variables with -ftrivial-auto-var-init as following:

for the following stmt:

_1 = .DEFERRED_INIT (4, 2, &"alt_reloc"[0]);
if (_1 != 0)

The original variable DECL has been eliminated from the IR, all the necessary
information that is needed for reporting the warnings for DECL can be acquired
from the call to .DEFERRED_INIT.

A. the name string of DECL from the 3rd parameter of the call;
B. the location of the DECL from the location of the call;
C. the call can also be used to hold the information on whether the warning
   has been issued or not to suppress warning messages when needed;

The current testing cases for uninitialized warnings + -ftrivial-auto-var-init
are adjusted to reflect the fact that we can issue warnings for address taken
variables.

gcc/ChangeLog:

2022-01-12  qing zhao  

* tree-ssa-uninit.c (warn_uninit): Handle .DEFERRED_INIT call with an
anonymous SSA_NAME specially.
(check_defs): Likewise.
(warn_uninit_phi_uses): Adjust the message format for warn_uninit.
(warn_uninitialized_vars): Likewise.
(warn_uninitialized_phi): Likewise

gcc/testsuite/ChangeLog:

2022-01-12  qing zhao  

* gcc.dg/auto-init-uninit-16.c (testfunc): Delete xfail to reflect
the fact that address taken variable can be warned.
* gcc.dg/auto-init-uninit-34.c (warn_scalar_1): Likewise.
(warn_scalar_2): Likewise.
* gcc.dg/auto-init-uninit-37.c (T1): Likewise.
(T2): Likewise.
* gcc.dg/auto-init-uninit-B.c (baz): Likewise.

The complete patch is attached:



0001-Enable-Wuninitialized-ftrivial-auto-var-init-for-add.patch
Description: 0001-Enable-Wuninitialized-ftrivial-auto-var-init-for-add.patch


[PATCH 1/2] Check negative combined step

2022-01-12 Thread Jiufu Guo via Gcc-patches
Hi,

Previously, there is discussion in:
https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586460.html
I seperate it as two patches.

This first patch is to avoid negative step when combining two ivs.
The second patch is adding more accurate assumptions.

This patch pass bootstrap and regtest on ppc64, ppc64le and x86_64.
Is this ok for trunk?

BR,
Jiufu

PR tree-optimization/100740

gcc/ChangeLog:

* tree-ssa-loop-niter.c (number_of_iterations_cond): Check
sign of combined step.

gcc/testsuite/ChangeLog:

* gcc.c-torture/execute/pr100740.c: New test.



---
 gcc/tree-ssa-loop-niter.c  |  6 --
 gcc/testsuite/gcc.c-torture/execute/pr100740.c | 13 +
 2 files changed, 17 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/pr100740.c

diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index b767056aeb0..439d595a79f 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -1890,8 +1890,10 @@ number_of_iterations_cond (class loop *loop,
   tree step = fold_binary_to_constant (MINUS_EXPR, step_type,
   iv0->step, iv1->step);
 
-  /* No need to check sign of the new step since below code takes care
-of this well.  */
+  /* Like cases shown in PR100740/102131, negtive step is not safe.  */
+  if (tree_int_cst_sign_bit (step))
+   return false;
+
   if (code != NE_EXPR
  && (TREE_CODE (step) != INTEGER_CST
  || !iv0->no_overflow || !iv1->no_overflow))
diff --git a/gcc/testsuite/gcc.c-torture/execute/pr100740.c 
b/gcc/testsuite/gcc.c-torture/execute/pr100740.c
new file mode 100644
index 000..381cdeb947a
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/pr100740.c
@@ -0,0 +1,13 @@
+/* PR tree-optimization/100740 */
+
+unsigned a, b;
+int
+main ()
+{
+  unsigned c = 0;
+  for (a = 0; a < 2; a++)
+for (b = 0; b < 2; b++)
+  if (++c < a)
+   __builtin_abort ();
+  return 0;
+}
-- 
2.17.1



[PATCH 2/2] Add assumption combining iv

2022-01-12 Thread Jiufu Guo via Gcc-patches
This is the second patch for two IVs combining.
When one IV is chasing another one, to make it safe, we should check if
there is wrap/overflow for either IV.

With the assumption, which computed as this patch, the number of iterations
can be caculated, even the no_overflow flag is not updated for unsigned IVs,
like the test case of this patch.

This patch pass bootstrap and regtest on ppc64le and x86_64.
Is this ok for trunk, or it may more suitable for stage1.

BR,
Jiufu

PR tree-optimization/102131

gcc/ChangeLog:

* tree-ssa-loop-niter.c (get_step_count): New function.
(iv_chase_assumption): New function.
(number_of_iterations_cond): Call iv_chase_assumption.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/pr102131.c: New test.


---
 gcc/tree-ssa-loop-niter.c| 73 
 gcc/testsuite/gcc.dg/vect/pr102131.c | 47 ++
 2 files changed, 110 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/pr102131.c

diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index 439d595a79f..56261164f28 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -1788,6 +1788,63 @@ dump_affine_iv (FILE *file, affine_iv *iv)
 }
 }
 
+/* Generate expr: (HIGH - LOW) / STEP, under UTYPE. */
+
+static tree
+get_step_count (tree high, tree low, tree step, tree utype,
+   bool end_inclusive = false)
+{
+  tree delta = fold_build2 (MINUS_EXPR, TREE_TYPE (low), high, low);
+  delta = fold_convert (utype, delta);
+  if (end_inclusive)
+delta = fold_build2 (PLUS_EXPR, utype, delta, build_one_cst (utype));
+
+  if (tree_int_cst_sign_bit (step))
+step = fold_build1 (NEGATE_EXPR, TREE_TYPE (step), step);
+  step = fold_convert (utype, step);
+
+  return fold_build2 (FLOOR_DIV_EXPR, utype, delta, step);
+}
+
+/*  Get the additional assumption if both two steps are not zero.
+Assumptions satisfy that there is no overflow or wrap during
+v0 and v1 chasing.  */
+
+static tree
+iv_chase_assumption (affine_iv *iv0, affine_iv *iv1, tree step,
+enum tree_code code)
+{
+  /* Here, no need additional assumptions for NE.  */
+  if (code == NE_EXPR)
+return boolean_true_node;
+
+  /* No need addition assumption for pointer.  */
+  tree type = TREE_TYPE (iv0->base);
+  if (POINTER_TYPE_P (type))
+return boolean_true_node;
+
+  bool positive0 = !tree_int_cst_sign_bit (iv0->step);
+  bool positive1 = !tree_int_cst_sign_bit (iv1->step);
+  tree utype = unsigned_type_for (type);
+  bool add1 = code == LE_EXPR;
+  tree niter = get_step_count (iv1->base, iv0->base, step, utype, add1);
+
+  int prec = TYPE_PRECISION (type);
+  signop sgn = TYPE_SIGN (type);
+  tree max = wide_int_to_tree (type, wi::max_value (prec, sgn));
+  tree min = wide_int_to_tree (type, wi::min_value (prec, sgn));
+  tree valid_niter0, valid_niter1;
+
+  valid_niter0 = positive0 ? get_step_count (max, iv0->base, iv0->step, utype)
+  : get_step_count (iv0->base, min, iv0->step, utype);
+  valid_niter1 = positive1 ? get_step_count (max, iv1->base, iv1->step, utype)
+  : get_step_count (iv1->base, min, iv1->step, utype);
+
+  tree e0 = fold_build2 (LT_EXPR, boolean_type_node, niter, valid_niter0);
+  tree e1 = fold_build2 (LT_EXPR, boolean_type_node, niter, valid_niter1);
+  return fold_build2 (TRUTH_AND_EXPR, boolean_type_node, e0, e1);
+}
+
 /* Determine the number of iterations according to condition (for staying
inside loop) which compares two induction variables using comparison
operator CODE.  The induction variable on left side of the comparison
@@ -1879,11 +1936,10 @@ number_of_iterations_cond (class loop *loop,
{iv0.base, iv0.step - iv1.step} cmp_code {iv1.base, 0}
 
  provided that either below condition is satisfied:
+ a. iv0.step and iv1.step are integer.
+ b. Additional condition: before iv0 chase up v1, iv0 and iv1 should not
+ step over min or max of the type.  */
 
-   a) the test is NE_EXPR;
-   b) iv0.step - iv1.step is integer and iv0/iv1 don't overflow.
-
- This rarely occurs in practice, but it is simple enough to manage.  */
   if (!integer_zerop (iv0->step) && !integer_zerop (iv1->step))
 {
   tree step_type = POINTER_TYPE_P (type) ? sizetype : type;
@@ -1894,15 +1950,12 @@ number_of_iterations_cond (class loop *loop,
   if (tree_int_cst_sign_bit (step))
return false;
 
-  if (code != NE_EXPR
- && (TREE_CODE (step) != INTEGER_CST
- || !iv0->no_overflow || !iv1->no_overflow))
+  niter->assumptions = iv_chase_assumption (iv0, iv1, step, code);
+  if (integer_zerop (niter->assumptions))
return false;
 
   iv0->step = step;
-  if (!POINTER_TYPE_P (type))
-   iv0->no_overflow = false;
-
+  iv0->no_overflow = true;
   iv1->step = build_int_cst (step_type, 0);
   iv1->no_overflow = true;
 }
diff --git a

RE: [PATCH] [i386] Optimize a ^ ((a ^ b) & mask) to (~mask & a) | (b & mask).

2022-01-12 Thread Jiang, Haochen via Gcc-patches
Hi Uros,

Has fixed that format issue with this new patch. Ok for trunk?

Thx,
Haochen

-Original Message-
From: Uros Bizjak  
Sent: Thursday, January 13, 2022 3:22 AM
To: Jiang, Haochen 
Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao 
Subject: Re: [PATCH] [i386] Optimize a ^ ((a ^ b) & mask) to (~mask & a) | (b & 
mask).

On Wed, Jan 12, 2022 at 9:11 AM Haochen Jiang  wrote:
>
> Hi all,
>
> This patch targets PR94790, which change the instruction selection under the 
> following circumstance.
>
> Regtested on x86_64-pc-linux-gnu. Ok for trunk?

Please also test with -m32, e.g.:

make -j 12 -k check RUNTESTFLAGS="--target_board=unix\{,-m32\}"

OK (with an it below), if new testcases do not FAIL with -m32.

Thanks,
Uros.

>
> BRs,
> Haochen
>
> From the perspective of the pipeline, `andn + and + ior` version take
> 2 cycles(AND and ANDN doesn't have dependence), but xor + and + xor 
> will take 3 cycles.
>
> -   xorl%edi, %esi
> andl%edx, %esi
> -   movl%esi, %eax
> -   xorl%edi, %eax
> +   andn%edi, %edx, %eax
> +   orl %esi, %eax
>
> gcc/ChangeLog:
>
> PR taeget/94790
> * config/i386/i386.md (*xor2andn): New define_insn_and_split.
>
> gcc/testsuite/ChangeLog:
>
> PR taeget/94790
> * gcc.target/i386/pr94790-1.c: New test.
> * gcc.target/i386/pr94790-2.c: Ditto.
> ---
>  gcc/config/i386/i386.md   | 39 +++
>  gcc/testsuite/gcc.target/i386/pr94790-1.c | 14   
> gcc/testsuite/gcc.target/i386/pr94790-2.c |  9 ++
>  3 files changed, 62 insertions(+)
>  create mode 100755 gcc/testsuite/gcc.target/i386/pr94790-1.c
>  create mode 100755 gcc/testsuite/gcc.target/i386/pr94790-2.c
>
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index 
> 9b424a3935b..38efc6d5837 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -10452,6 +10452,45 @@
> (set_attr "znver1_decode" "double")
> (set_attr "mode" "DI")])
>
> +;; PR target/94790: Optimize a ^ ((a ^ b) & mask) to (~mask & a) | (b 
> +& mask) (define_insn_and_split "*xor2andn"
> +  [(set (match_operand:SWI248 0 "nonimmediate_operand")
> +   (xor:SWI248
> + (and:SWI248
> +   (xor:SWI248
> + (match_operand:SWI248 1 "nonimmediate_operand")
> + (match_operand:SWI248 2 "nonimmediate_operand"))
> +   (match_operand:SWI248 3 "nonimmediate_operand"))
> + (match_dup 1)))
> +(clobber (reg:CC FLAGS_REG))]
> +  "(TARGET_BMI || TARGET_AVX512BW)
> +   && ix86_pre_reload_split ()"
> +  "#"
> +  "&& 1"
> +  [(parallel [(set (match_dup 4)
> +   (and:SWI248
> + (not:SWI248
> +   (match_dup 3))
> + (match_dup 1)))
> + (clobber (reg:CC FLAGS_REG))])
> +   (parallel [(set (match_dup 5)
> +   (and:SWI248
> + (match_dup 2)
> + (match_dup 3)))
> + (clobber (reg:CC FLAGS_REG))])
> +   (parallel [(set (match_dup 0)
> +   (ior:SWI248
> + (match_dup 4)
> + (match_dup 5)))
> + (clobber (reg:CC FLAGS_REG))])]
> +  {
> +operands[1] = force_reg (mode, operands[1]);
> +operands[3] = force_reg (mode, operands[3]);
> +operands[4] = gen_reg_rtx (mode);
> +operands[5] = gen_reg_rtx (mode);
> +  }
> +)

Please put brace just after the curved brace, see numerous examples in .md 
files.

> +
>  ;; See comment for addsi_1_zext why we do use nonimmediate_operand  
> (define_insn "*si_1_zext"
>[(set (match_operand:DI 0 "register_operand" "=r") diff --git 
> a/gcc/testsuite/gcc.target/i386/pr94790-1.c 
> b/gcc/testsuite/gcc.target/i386/pr94790-1.c
> new file mode 100755
> index 000..6ebbec15cfd
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr94790-1.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mbmi" } */
> +/* { dg-final { scan-assembler-times "andn\[ \\t\]" 2 } } */
> +/* { dg-final { scan-assembler-not "xorl\[ \\t\]" } } */
> +
> +unsigned r1(unsigned a, unsigned b, unsigned mask) {
> +  return a ^ ((a ^ b) & mask);
> +}
> +
> +unsigned r2(unsigned a, unsigned b, unsigned mask) {
> +  return (~mask & a) | (b & mask);
> +}
> diff --git a/gcc/testsuite/gcc.target/i386/pr94790-2.c 
> b/gcc/testsuite/gcc.target/i386/pr94790-2.c
> new file mode 100755
> index 000..d7b0eec5bef
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr94790-2.c
> @@ -0,0 +1,9 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mbmi" } */
> +/* { dg-final { scan-assembler-not "andn\[ \\t\]" } } */
> +/* { dg-final { scan-assembler-times "xorl\[ \\t\]" 2 } } */
> +
> +unsigned r1(unsigned a, unsigned b, unsigned mask) {
> +  return a ^ ((a ^ b) & mask) + (a ^ b); }
> --
> 2.18.1
>


0001-i386-Optimize-a-a-b-mask-to-mask-a-b-mask.patch
Description: 0001-i386-Optimize-a-a-b-mask-to-mask-a-b-mask.patch


Re: [PATCH, rs6000] Enable absolute jump table by default

2022-01-12 Thread David Edelsohn via Gcc-patches
On Wed, Jan 12, 2022 at 8:40 PM HAO CHEN GUI  wrote:
>
> Hi David,
>
> On 12/1/2022 下午 10:44, David Edelsohn wrote:
> > On Wed, Jan 12, 2022 at 7:22 AM HAO CHEN GUI  wrote:
> >>
> >> Hi,
> >>This patch enables absolute jump table by default on rs6000. The 
> >> relative jump tables are used when
> >>it's explicit set by "rs6000_relative_jumptables",
> >>or jump tables are placed in text section but global relocation is 
> >> required.
> >>
> >>Bootstrapped and tested on powerpc64-linux BE and LE with no 
> >> regressions. Is this okay for trunk?
> >> Any recommendations? Thanks a lot.
> >>
> >> ChangeLog
> >> 2022-01-12 Haochen Gui 
> >>
> >> gcc/
> >> * config/rs6000/linux64.h (JUMP_TABLES_IN_TEXT_SECTION): Define.
> >> * config/rs6000/rs6000.c (rs6000_gen_pic_addr_diff_vec): Return
> >> true when relative jump table is explicit required or jump tables 
> >> have
> >> to be put in text section but global relocation is also required.
> >> * config/rs6000/rs6000.opt (rs6000_relative_jumptables): Disable.
> >>
> >> patch.diff
> >> diff --git a/gcc/config/rs6000/linux64.h b/gcc/config/rs6000/linux64.h
> >> index d617f346f81..2e257c60f8c 100644
> >> --- a/gcc/config/rs6000/linux64.h
> >> +++ b/gcc/config/rs6000/linux64.h
> >> @@ -239,7 +239,7 @@ extern int dot_symbols;
> >>
> >>  /* Indicate that jump tables go in the text section.  */
> >>  #undef  JUMP_TABLES_IN_TEXT_SECTION
> >> -#define JUMP_TABLES_IN_TEXT_SECTION TARGET_64BIT
> >> +#define JUMP_TABLES_IN_TEXT_SECTION 0
> >>
> >>  /* The linux ppc64 ABI isn't explicit on whether aggregates smaller
> >> than a doubleword should be padded upward or downward.  You could
> >> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> >> index 319182e94d9..9fba893a27a 100644
> >> --- a/gcc/config/rs6000/rs6000.c
> >> +++ b/gcc/config/rs6000/rs6000.c
> >> @@ -28465,7 +28465,9 @@ rs6000_emit_xxspltidp_v2df (rtx dst, long value)
> >>  static bool
> >>  rs6000_gen_pic_addr_diff_vec (void)
> >>  {
> >> -  return rs6000_relative_jumptables;
> >> +  return rs6000_relative_jumptables
> >> +|| (JUMP_TABLES_IN_TEXT_SECTION
> >> +&& targetm.asm_out.reloc_rw_mask () == 3);
> >>  }
> >
> > This seems like contorted logic and overriding the
> > rs6000_relative_jumptables option change.  The later part of the patch
> > overrides rs6000_relative_jumptables for all rs6000 configurations,
> > and then changes this one use of rs6000_relative_jumptables to add
> > more logic to revert to the old meaning for some targets.
> >
> > What about all of the other uses of rs6000_relative_jumptables in the
> > target?  What about rs6000.md?
> >
> > I highly doubt that this patch is correct.
> >
> > Why not override rs6000_relative_jumptables for PPC64 Linux instead of
> > changing its value globally and then trying to fix it up?
> >
> > Thanks, David
>   Thanks for your comments.
>
>   In this patch, I tried to enable absolute jump table on all rs6000 targets.
> For PPC64 Linux, it supports RELRO section (e.g. .data.rel.ro.local) as
> "JUMP_TABLES_IN_TEXT_SECTION" is set to 0. So, absolute jump tables could be 
> placed
> in RELRO section whatever global relocation is required or not. The absolute 
> jump
> table can't be placed in text section when global relocation is required as 
> text
> section is not writable. So for other rs6000 targets, absolute jump table 
> can't be
> used if the target doesn't support RELRO and global relocation is required 
> also.
>
>   Looking forward to your advice. Thanks again.

Why not override rs6000_relative_jumptables in
rs6000_option_override_internal() for PPC64 Linux subtarget instead of
changing the default value and then trying to fix the behavior for
other configurations in rs6000_gen_pic_addr_diff_vec()?  Or override
it in SUBSUBTARGET_OVERRIDE_OPTIONS in linux64.h, or whichever header
file(s) is appropriate for the subtarget variants.

Your initial patch also changed rs6000_gen_pic_addr_diff_vec but
didn't address the use of rs6000_relative_jumptables in the definition
of CASE_VECTOR_MODE in rs6000.h.  That cannot have been correct.  At
least without the change to the default value of
rs6000_relative_jumptables you don't need to add kludges to all of the
places where that variable is used for other subtarget variants of the
rs6000 target.

Thanks, David


[PATCH] rs6000: Fix constraint v with rs6000_constraints[RS6000_CONSTRAINT_v]

2022-01-12 Thread Kewen.Lin via Gcc-patches
Hi,

This patch is to fix register constraint v with
rs6000_constraints[RS6000_CONSTRAINT_v] instead of ALTIVEC_REGS,
just like some other existing register constraints with
RS6000_CONSTRAINT_*.

I happened to see this and hope it's not intentional and just
got neglected. 

Bootstrapped and regtested on powerpc64le-linux-gnu P9 and
powerpc64-linux-gnu P8.

Is it ok for trunk?

BR,
Kewen
-
gcc/ChangeLog:

* config/rs6000/constraints.md (register constraint v): Use
rs6000_constraints[RS6000_CONSTRAINT_v] instead of ALTIVEC_REGS.
---
 gcc/config/rs6000/constraints.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md
index a4b05837fa6..c01dcbbc3a3 100644
--- a/gcc/config/rs6000/constraints.md
+++ b/gcc/config/rs6000/constraints.md
@@ -37,7 +37,7 @@ (define_register_constraint "d" 
"rs6000_constraints[RS6000_CONSTRAINT_d]"
historically @code{f} was for single-precision and @code{d} was for
double-precision floating point.")

-(define_register_constraint "v" "ALTIVEC_REGS"
+(define_register_constraint "v" "rs6000_constraints[RS6000_CONSTRAINT_v]"
   "An Altivec vector register (VR), @code{v0}@dots{}@code{v31}.")

 (define_register_constraint "wa" "rs6000_constraints[RS6000_CONSTRAINT_wa]"
--
2.27.0


[PATCH] rs6000: Use known constant for GET_MODE_NUNITS and similar

2022-01-12 Thread Kewen.Lin via Gcc-patches
Hi,

This patch is to clean up some codes with GET_MODE_UNIT_SIZE or
GET_MODE_NUNITS, which can use known constant instead.

Bootstrapped and regtested on powerpc64le-linux-gnu P9 and
powerpc64-linux-gnu P8.

Is it ok for trunk?

BR,
Kewen
-
gcc/ChangeLog:

* config/rs6000/altivec.md (altivec_vreveti2): Use known constant
values to simplify code.
* config/rs6000/vsx.md (*vsx_extract_si, *vsx_extract_si_float_df,
*vsx_extract_si_float_, *vsx_insert_extract_v4sf_p9):
Likewise.
---
 gcc/config/rs6000/altivec.md | 11 +++
 gcc/config/rs6000/vsx.md |  8 
 2 files changed, 7 insertions(+), 12 deletions(-)

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index c2312cc1e0f..d5c4ecfa9b7 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -3957,17 +3957,12 @@ (define_expand "altivec_vreveti2"
  UNSPEC_VREVEV))]
   "TARGET_ALTIVEC"
 {
-  int i, j, size, num_elements;
+  int i;
   rtvec v = rtvec_alloc (16);
   rtx mask = gen_reg_rtx (V16QImode);

-  size = GET_MODE_UNIT_SIZE (TImode);
-  num_elements = GET_MODE_NUNITS (TImode);
-
-  for (j = 0; j < num_elements; j++)
-for (i = 0; i < size; i++)
-  RTVEC_ELT (v, i + j * size)
-   = GEN_INT (i + (num_elements - 1 - j) * size);
+  for (i = 0; i < 16; i++)
+RTVEC_ELT (v, i) = GEN_INT (i);

   emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
   emit_insn (gen_altivec_vperm_ti (operands[0], operands[1],
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 802db0d112b..892b99c6d6b 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -3855,7 +3855,7 @@ (define_insn_and_split  "*vsx_extract_si"
   int value;

   if (!BYTES_BIG_ENDIAN)
-element = GEN_INT (GET_MODE_NUNITS (V4SImode) - 1 - INTVAL (element));
+element = GEN_INT (3 - INTVAL (element));

   /* If the value is in the correct position, we can avoid doing the VSPLT
  instruction.  */
@@ -4231,7 +4231,7 @@ (define_insn_and_split "*vsx_extract_si_float_df"
   int value;

   if (!BYTES_BIG_ENDIAN)
-element = GEN_INT (GET_MODE_NUNITS (V4SImode) - 1 - INTVAL (element));
+element = GEN_INT (3 - INTVAL (element));

   /* If the value is in the correct position, we can avoid doing the VSPLT
  instruction.  */
@@ -4274,7 +4274,7 @@ (define_insn_and_split "*vsx_extract_si_float_"
   int value;

   if (!BYTES_BIG_ENDIAN)
-element = GEN_INT (GET_MODE_NUNITS (V4SImode) - 1 - INTVAL (element));
+element = GEN_INT (3 - INTVAL (element));

   /* If the value is in the correct position, we can avoid doing the VSPLT
  instruction.  */
@@ -4467,7 +4467,7 @@ (define_insn "*vsx_insert_extract_v4sf_p9"
   int ele = INTVAL (operands[4]);

   if (!BYTES_BIG_ENDIAN)
-ele = GET_MODE_NUNITS (V4SFmode) - 1 - ele;
+ele = 3 - ele;

   operands[4] = GEN_INT (GET_MODE_SIZE (SFmode) * ele);
   return "xxinsertw %x0,%x2,%4";
--
2.27.0



PING^1 [PATCH] rs6000: Fix an assertion in update_target_cost_per_stmt [PR103702]

2022-01-12 Thread Kewen.Lin via Gcc-patches
Gentle ping:

https://gcc.gnu.org/pipermail/gcc-patches/2021-December/587309.html

on 2021/12/23 上午10:06, Kewen.Lin via Gcc-patches wrote:
> Hi, 
> 
> This patch is to fix one wrong assertion which is too aggressive.
> Vectorizer can do vec_construct costing for the vector type which
> only has one unit.  For the failed case, the passed-in vector type
> is "vector(1) int", though it doesn't end up with any construction
> eventually.  We have to handle this kind of input in function
> rs6000_cost_data::update_target_cost_per_stmt.
> 
> Bootstrapped and regtested on powerpc64le-linux-gnu P9 and
> powerpc64-linux-gnu P8.
> 
> Is it ok for trunk?
> 
> BR,
> Kewen
> -
> gcc/ChangeLog:
> 
>   PR target/103702
>   * config/rs6000/rs6000.c
>   (rs6000_cost_data::update_target_cost_per_stmt): Fix one wrong
>   assertion with early return.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR target/103702
>   * gcc.target/powerpc/pr103702.c: New test.
> ---
>  gcc/config/rs6000/rs6000.c  |  7 --
>  gcc/testsuite/gcc.target/powerpc/pr103702.c | 24 +
>  2 files changed, 29 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr103702.c
> 
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index 0b09713b2f5..37f07fe5358 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -5461,8 +5461,11 @@ rs6000_cost_data::update_target_cost_per_stmt 
> (vect_cost_for_stmt kind,
>   {
> tree vectype = STMT_VINFO_VECTYPE (stmt_info);
> unsigned int nunits = vect_nunits_for_cost (vectype);
> -   /* We don't expect strided/elementwise loads for just 1 nunit.  */
> -   gcc_assert (nunits > 1);
> +   /* As PR103702 shows, it's possible that vectorizer wants to do
> +  costings for only one unit here, it's no need to do any
> +  penalization for it, so simply early return here.  */
> +   if (nunits == 1)
> + return;
> /* i386 port adopts nunits * stmt_cost as the penalized cost
>for this kind of penalization, we used to follow it but
>found it could result in an unreliable body cost especially
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr103702.c 
> b/gcc/testsuite/gcc.target/powerpc/pr103702.c
> new file mode 100644
> index 000..585946fd64b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr103702.c
> @@ -0,0 +1,24 @@
> +/* We don't have one powerpc.*_ok for Power6, use altivec_ok conservatively. 
>  */
> +/* { dg-require-effective-target powerpc_altivec_ok } */
> +/* { dg-options "-mdejagnu-cpu=power6 -O2 -ftree-loop-vectorize 
> -fno-tree-scev-cprop" } */
> +
> +/* Verify there is no ICE.  */
> +
> +unsigned short a, e;
> +int *b, *d;
> +int c;
> +extern int fn2 ();
> +void
> +fn1 ()
> +{
> +  void *f;
> +  for (;;)
> +{
> +  fn2 ();
> +  b = f;
> +  e = 0;
> +  for (; e < a; ++e)
> + b[e] = d[e * c];
> +}
> +}
> +
> --
> 2.27.0
> 


PING^1 [PATCH] rs6000: Disable MMA if no P9 VECTOR support [PR103627]

2022-01-12 Thread Kewen.Lin via Gcc-patches
Gentle ping:

https://gcc.gnu.org/pipermail/gcc-patches/2021-December/587310.html

on 2021/12/23 上午10:09, Kewen.Lin via Gcc-patches wrote:
> Hi,
> 
> As PR103627 shows, there is an unexpected case where !TARGET_VSX
> and TARGET_MMA co-exist.  As ISA3.1 claims, SIMD is a requirement
> for MMA.  By looking into the ICE, I noticed that the current
> MMA implementation depends on vector pairs load/store, but since
> we don't have a separated option to control Power10 vector, this
> patch is to check for Power9 vector instead.
> 
> Bootstrapped and regtested on powerpc64le-linux-gnu P9 and
> powerpc64-linux-gnu P8.
> 
> Is it ok for trunk?
> 
> BR,
> Kewen
> -
> gcc/ChangeLog:
> 
>   PR target/103627
>   * config/rs6000/rs6000.c (rs6000_option_override_internal): Disable
>   MMA if !TARGET_P9_VECTOR.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR target/103627
>   * gcc.target/powerpc/pr103627-1.c: New test.
>   * gcc.target/powerpc/pr103627-2.c: New test.
> ---
>  gcc/config/rs6000/rs6000.c| 11 +++
>  gcc/testsuite/gcc.target/powerpc/pr103627-1.c | 16 
>  gcc/testsuite/gcc.target/powerpc/pr103627-2.c | 16 
>  3 files changed, 43 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr103627-1.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr103627-2.c
> 
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index c020947abc8..ec3b46682a7 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -4505,6 +4505,17 @@ rs6000_option_override_internal (bool global_init_p)
>rs6000_isa_flags &= ~OPTION_MASK_MMA;
>  }
> 
> +  /* MMA requires SIMD support as ISA 3.1 claims and our implementation
> + such as "*movoo" uses vector pair access which are only supported
> + from ISA 3.1.  But since we don't have one separated option to
> + control Power10 vector, check for Power9 vector instead.  */
> +  if (TARGET_MMA && !TARGET_P9_VECTOR)
> +{
> +  if ((rs6000_isa_flags_explicit & OPTION_MASK_MMA) != 0)
> + error ("%qs requires %qs", "-mmma", "-mpower9-vector");
> +  rs6000_isa_flags &= ~OPTION_MASK_MMA;
> +}
> +
>if (!TARGET_PCREL && TARGET_PCREL_OPT)
>  rs6000_isa_flags &= ~OPTION_MASK_PCREL_OPT;
> 
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr103627-1.c 
> b/gcc/testsuite/gcc.target/powerpc/pr103627-1.c
> new file mode 100644
> index 000..6c6c16188fb
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr103627-1.c
> @@ -0,0 +1,16 @@
> +/* { dg-require-effective-target power10_ok } */
> +/* { dg-options "-mdejagnu-cpu=power10 -mno-power9-vector" } */
> +
> +/* Verify compiler emits error message instead of ICE.  */
> +
> +extern float *dest;
> +extern __vector_quad src;
> +
> +int
> +foo ()
> +{
> +  __builtin_mma_disassemble_acc (dest, &src);
> +  /* { dg-error "'__builtin_mma_disassemble_acc' requires the '-mmma' 
> option" "" { target *-*-* } .-1 } */
> +  return 0;
> +}
> +
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr103627-2.c 
> b/gcc/testsuite/gcc.target/powerpc/pr103627-2.c
> new file mode 100644
> index 000..6604872c0e8
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr103627-2.c
> @@ -0,0 +1,16 @@
> +/* { dg-require-effective-target power10_ok } */
> +/* { dg-options "-mdejagnu-cpu=power10 -mmma -mno-power9-vector" } */
> +
> +/* Verify the emitted error message.  */
> +
> +extern float *dest;
> +extern __vector_quad src;
> +
> +int
> +foo ()
> +{
> +  __builtin_mma_disassemble_acc (dest, &src);
> +  /* { dg-error "'-mmma' requires '-mpower9-vector'" "mma" { target *-*-* } 
> 0 } */
> +  return 0;
> +}
> +
> --
> 2.27.0
> 


  1   2   >