[PATCH PR94708] [8/9/10 Regression] pr94780.c fails with ICE on aarch64

2020-04-29 Thread Zhanghaijian (A)
Hi

This is a simple fix for pr94820.
The PR was only fixed on i386, the same error was also reported on aarch64.
This function, because it is sometimes called even outside of function bodies, 
uses create_tmp_var_raw rather than create_tmp_var. 
But in order for that to work, when first referenced, the VAR_DECLs need to 
appear in a TARGET_EXPR so that during gimplification
the var gets the right DECL_CONTEXT and is added to local decls. Without that, 
e.g. tree-nested.c ICEs on those.
Attached please find the v1 patch which is suitable for git am. Bootstrap and 
tested on aarch64 Linux platform. No new regression witnessed.

>From 538bafed0d4704b588769a97756e4b5a6312b7ad Mon Sep 17 00:00:00 2001
From: z00219097 
Date: Wed, 29 Apr 2020 14:30:23 +0800
Subject: [PATCH] aarch64: Fix up aarch64_atomic_assign_expand_fenv
 [PR94820]

This function, because it is sometimes called even outside of function
bodies, uses create_tmp_var_raw rather than create_tmp_var.  But in order
for that to work, when first referenced, the VAR_DECLs need to appear in a
TARGET_EXPR so that during gimplification the var gets the right
DECL_CONTEXT and is added to local decls.  Without that, e.g. tree-nested.c
ICEs on those.

2020-04-29  Haijian Zhang 

PR target/94820
* config/aarch64/aarch64-builtins.c
(aarch64_atomic_assign_expand_fenv): Use TARGET_EXPR instead of
MODIFY_EXPR for first assignment to fenv_cr, fenv_sr and
new_fenv_var.
---
 gcc/config/aarch64/aarch64-builtins.c | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-builtins.c 
b/gcc/config/aarch64/aarch64-builtins.c
index 5744e68ea08..95213cd70c8 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -2313,10 +2313,12 @@ aarch64_atomic_assign_expand_fenv (tree *hold, tree 
*clear, tree *update)
   mask_sr = build_int_cst (unsigned_type_node,
   ~(AARCH64_FE_ALL_EXCEPT));
 
-  ld_fenv_cr = build2 (MODIFY_EXPR, unsigned_type_node,
-  fenv_cr, build_call_expr (get_fpcr, 0));
-  ld_fenv_sr = build2 (MODIFY_EXPR, unsigned_type_node,
-  fenv_sr, build_call_expr (get_fpsr, 0));
+  ld_fenv_cr = build4 (TARGET_EXPR, unsigned_type_node,
+  fenv_cr, build_call_expr (get_fpcr, 0),
+  NULL_TREE, NULL_TREE);
+  ld_fenv_sr = build4 (TARGET_EXPR, unsigned_type_node,
+  fenv_sr, build_call_expr (get_fpsr, 0),
+  NULL_TREE, NULL_TREE);
 
   masked_fenv_cr = build2 (BIT_AND_EXPR, unsigned_type_node, fenv_cr, mask_cr);
   masked_fenv_sr = build2 (BIT_AND_EXPR, unsigned_type_node, fenv_sr, mask_sr);
@@ -2348,8 +2350,9 @@ aarch64_atomic_assign_expand_fenv (tree *hold, tree 
*clear, tree *update)
__atomic_feraiseexcept (new_fenv_var);  */
 
   new_fenv_var = create_tmp_var_raw (unsigned_type_node);
-  reload_fenv = build2 (MODIFY_EXPR, unsigned_type_node,
-   new_fenv_var, build_call_expr (get_fpsr, 0));
+  reload_fenv = build4 (TARGET_EXPR, unsigned_type_node,
+   new_fenv_var, build_call_expr (get_fpsr, 0),
+   NULL_TREE, NULL_TREE);
   restore_fnenv = build_call_expr (set_fpsr, 1, fenv_sr);
   atomic_feraiseexcept = builtin_decl_implicit (BUILT_IN_ATOMIC_FERAISEEXCEPT);
   update_call = build_call_expr (atomic_feraiseexcept, 1,
-- 
2.19.1


Any suggestion?  

Thanks,
Haijian Zhang


pr94820-v1.patch
Description: pr94820-v1.patch


Re: [PATCH PR94708] [8/9/10 Regression] pr94780.c fails with ICE on aarch64

2020-04-29 Thread Jakub Jelinek via Gcc-patches
On Wed, Apr 29, 2020 at 07:16:53AM +, Zhanghaijian (A) wrote:
> 2020-04-29  Haijian Zhang 

Two spaces before < rather than just one.

> PR target/94820
> * config/aarch64/aarch64-builtins.c
> (aarch64_atomic_assign_expand_fenv): Use TARGET_EXPR instead of
> MODIFY_EXPR for first assignment to fenv_cr, fenv_sr and
> new_fenv_var.

Ok for trunk, thanks.

> diff --git a/gcc/config/aarch64/aarch64-builtins.c 
> b/gcc/config/aarch64/aarch64-builtins.c
> index 5744e68ea08..95213cd70c8 100644
> --- a/gcc/config/aarch64/aarch64-builtins.c
> +++ b/gcc/config/aarch64/aarch64-builtins.c
> @@ -2313,10 +2313,12 @@ aarch64_atomic_assign_expand_fenv (tree *hold, tree 
> *clear, tree *update)
>mask_sr = build_int_cst (unsigned_type_node,
>  ~(AARCH64_FE_ALL_EXCEPT));
>  
> -  ld_fenv_cr = build2 (MODIFY_EXPR, unsigned_type_node,
> -fenv_cr, build_call_expr (get_fpcr, 0));
> -  ld_fenv_sr = build2 (MODIFY_EXPR, unsigned_type_node,
> -fenv_sr, build_call_expr (get_fpsr, 0));
> +  ld_fenv_cr = build4 (TARGET_EXPR, unsigned_type_node,
> +fenv_cr, build_call_expr (get_fpcr, 0),
> +NULL_TREE, NULL_TREE);
> +  ld_fenv_sr = build4 (TARGET_EXPR, unsigned_type_node,
> +fenv_sr, build_call_expr (get_fpsr, 0),
> +NULL_TREE, NULL_TREE);
>  
>masked_fenv_cr = build2 (BIT_AND_EXPR, unsigned_type_node, fenv_cr, 
> mask_cr);
>masked_fenv_sr = build2 (BIT_AND_EXPR, unsigned_type_node, fenv_sr, 
> mask_sr);
> @@ -2348,8 +2350,9 @@ aarch64_atomic_assign_expand_fenv (tree *hold, tree 
> *clear, tree *update)
> __atomic_feraiseexcept (new_fenv_var);  */
>  
>new_fenv_var = create_tmp_var_raw (unsigned_type_node);
> -  reload_fenv = build2 (MODIFY_EXPR, unsigned_type_node,
> - new_fenv_var, build_call_expr (get_fpsr, 0));
> +  reload_fenv = build4 (TARGET_EXPR, unsigned_type_node,
> + new_fenv_var, build_call_expr (get_fpsr, 0),
> + NULL_TREE, NULL_TREE);
>restore_fnenv = build_call_expr (set_fpsr, 1, fenv_sr);
>atomic_feraiseexcept = builtin_decl_implicit 
> (BUILT_IN_ATOMIC_FERAISEEXCEPT);
>update_call = build_call_expr (atomic_feraiseexcept, 1,
> -- 
> 2.19.1

Jakub



[PATCH] aarch64: prefer using csinv, csneg in zero extend contexts

2020-04-29 Thread Alex Coplan
Hello,

The attached patch adds an optimization to the AArch64 backend to catch
additional cases where we can use csinv and csneg.

Given the C code:

unsigned long long inv(unsigned a, unsigned b, unsigned c)
{
  return a ? b : ~c;
}

Prior to this patch, AArch64 GCC at -O2 generates:

inv:
cmp w0, 0
mvn w2, w2
cselw0, w1, w2, ne
ret

and after applying the patch, we get:

inv:
cmp w0, 0
csinv   w0, w1, w2, ne
ret

The new pattern also catches the optimization for the symmetric case where the
body of foo reads a ? ~b : c.

Similarly, with the following code:

unsigned long long neg(unsigned a, unsigned b, unsigned c)
{
  return a ? b : -c;
}

GCC at -O2 previously gave:

neg:
cmp w0, 0
neg w2, w2
cselw0, w1, w2, ne

but now gives:

neg:
cmp w0, 0
csneg   w0, w1, w2, ne
ret

with the corresponding code for the symmetric case as above.

Testing:
 - New regression test which checks all four of these cases.
 - Full bootstrap and regression on aarch64-linux.

Thanks,
Alex

---

gcc/ChangeLog:

2020-04-24  Alex Coplan  

* config/aarch64/aarch64.md (*csinv3_utxw_insn): New.
* config/aarch64/iterators.md (neg_not_cs): New.

gcc/testsuite/ChangeLog:

2020-04-22  Alex Coplan  

* gcc.target/aarch64/csinv-neg.c: New test.
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index c7c4d1dd519..2f7367c0b1a 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -4390,6 +4390,19 @@
   [(set_attr "type" "csel")]
 )
 
+(define_insn "*csinv3_uxtw_insn"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+(if_then_else:DI
+  (match_operand 1 "aarch64_comparison_operation" "")
+  (zero_extend:DI
+(match_operand:SI 2 "aarch64_reg_or_zero" "rZ"))
+  (zero_extend:DI
+(NEG_NOT:SI (match_operand:SI 3 "register_operand" "r")]
+  ""
+  "cs\\t%w0, %w2, %w3, %m1"
+  [(set_attr "type" "csel")]
+)
+
 ;; If X can be loaded by a single CNT[BHWD] instruction,
 ;;
 ;;A = UMAX (B, X)
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 8e434389e59..a568cf21b99 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -1932,6 +1932,9 @@
 ;; Operation names for negate and bitwise complement.
 (define_code_attr neg_not_op [(neg "neg") (not "not")])
 
+;; csinv, csneg insn suffixes.
+(define_code_attr neg_not_cs [(neg "neg") (not "inv")])
+
 ;; Similar, but when the second operand is inverted.
 (define_code_attr nlogical [(and "bic") (ior "orn") (xor "eon")])
 
diff --git a/gcc/testsuite/gcc.target/aarch64/csinv-neg.c 
b/gcc/testsuite/gcc.target/aarch64/csinv-neg.c
new file mode 100644
index 000..4636f3e0906
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/csinv-neg.c
@@ -0,0 +1,54 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+/*
+** inv1:
+** cmp w0, 0
+** csinv   w0, w1, w2, ne
+** ret
+*/
+unsigned long long
+inv1(unsigned a, unsigned b, unsigned c)
+{
+  return a ? b : ~c;
+}
+
+/*
+** inv2:
+** cmp w0, 0
+** csinv   w0, w2, w1, eq
+** ret
+*/
+unsigned long long
+inv2(unsigned a, unsigned b, unsigned c)
+{
+  return a ? ~b : c;
+}
+
+/*
+** neg1:
+** cmp w0, 0
+** csneg   w0, w1, w2, ne
+** ret
+*/
+unsigned long long
+neg1(unsigned a, unsigned b, unsigned c)
+{
+  return a ? b : -c;
+}
+
+
+/*
+** neg2:
+** cmp w0, 0
+** csneg   w0, w2, w1, eq
+** ret
+*/
+unsigned long long
+neg2(unsigned a, unsigned b, unsigned c)
+{
+  return a ? -b : c;
+}
+
+
+/* { dg-final { check-function-bodies "**" "" "" } } */


RE: [PATCH PR94708] [8/9/10 Regression] pr94780.c fails with ICE on aarch64

2020-04-29 Thread Zhanghaijian (A)
Thanks for your suggestions. I have modified accordingly.
Is the v2 patch ok?
Please help install it this patch is ok, as I am not a committer.

>From 15d8675676b3fcced3fa5695b3d9c37f3ec3be18 Mon Sep 17 00:00:00 2001
From: Haijian Zhang  
Date: Wed, 29 Apr 2020 15:00:00 +0800
Subject: [PATCH] aarch64: Fix up aarch64_atomic_assign_expand_fenv [PR94820]

This function, because it is sometimes called even outside of function
bodies, uses create_tmp_var_raw rather than create_tmp_var. But in order
for that to work, when first referenced, the VAR_DECLs need to appear in a
TARGET_EXPR so that during gimplification the var gets the right
DECL_CONTEXT and is added to local decls. Without that, e.g. tree-nested.c
ICEs on those.

2020-04-29  Haijian Zhang  

PR target/94820
* config/aarch64/aarch64-builtins.c
(aarch64_atomic_assign_expand_fenv): Use TARGET_EXPR instead of
MODIFY_EXPR for first assignment to fenv_cr, fenv_sr and
new_fenv_var.
---
 gcc/ChangeLog |  8 
 gcc/config/aarch64/aarch64-builtins.c | 15 +--
 2 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index c4afdb053f7..f87866622c6 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,11 @@
+2020-04-29  Haijian Zhang  
+
+   PR target/94820
+   * config/aarch64/aarch64-builtins.c
+   (aarch64_atomic_assign_expand_fenv): Use TARGET_EXPR instead of
+   MODIFY_EXPR for first assignment to fenv_cr, fenv_sr and
+   new_fenv_var.
+
 2020-04-29  Richard Biener  
 
* tree-ssa-loop-im.c (ref_always_accessed::operator ()):
diff --git a/gcc/config/aarch64/aarch64-builtins.c 
b/gcc/config/aarch64/aarch64-builtins.c
index 5744e68ea08..95213cd70c8 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -2313,10 +2313,12 @@ aarch64_atomic_assign_expand_fenv (tree *hold, tree 
*clear, tree *update)
   mask_sr = build_int_cst (unsigned_type_node,
   ~(AARCH64_FE_ALL_EXCEPT));
 
-  ld_fenv_cr = build2 (MODIFY_EXPR, unsigned_type_node,
-  fenv_cr, build_call_expr (get_fpcr, 0));
-  ld_fenv_sr = build2 (MODIFY_EXPR, unsigned_type_node,
-  fenv_sr, build_call_expr (get_fpsr, 0));
+  ld_fenv_cr = build4 (TARGET_EXPR, unsigned_type_node,
+  fenv_cr, build_call_expr (get_fpcr, 0),
+  NULL_TREE, NULL_TREE);
+  ld_fenv_sr = build4 (TARGET_EXPR, unsigned_type_node,
+  fenv_sr, build_call_expr (get_fpsr, 0),
+  NULL_TREE, NULL_TREE);
 
   masked_fenv_cr = build2 (BIT_AND_EXPR, unsigned_type_node, fenv_cr, mask_cr);
   masked_fenv_sr = build2 (BIT_AND_EXPR, unsigned_type_node, fenv_sr, mask_sr);
@@ -2348,8 +2350,9 @@ aarch64_atomic_assign_expand_fenv (tree *hold, tree 
*clear, tree *update)
__atomic_feraiseexcept (new_fenv_var);  */
 
   new_fenv_var = create_tmp_var_raw (unsigned_type_node);
-  reload_fenv = build2 (MODIFY_EXPR, unsigned_type_node,
-   new_fenv_var, build_call_expr (get_fpsr, 0));
+  reload_fenv = build4 (TARGET_EXPR, unsigned_type_node,
+   new_fenv_var, build_call_expr (get_fpsr, 0),
+   NULL_TREE, NULL_TREE);
   restore_fnenv = build_call_expr (set_fpsr, 1, fenv_sr);
   atomic_feraiseexcept = builtin_decl_implicit (BUILT_IN_ATOMIC_FERAISEEXCEPT);
   update_call = build_call_expr (atomic_feraiseexcept, 1,
-- 
2.19.1

Thanks,
Haijian Zhang


> -Original Message-
> From: Jakub Jelinek [mailto:ja...@redhat.com]
> Sent: Wednesday, April 29, 2020 3:21 PM
> To: Zhanghaijian (A) 
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH PR94708] [8/9/10 Regression] pr94780.c fails with ICE on
> aarch64
> 
> On Wed, Apr 29, 2020 at 07:16:53AM +, Zhanghaijian (A) wrote:
> > 2020-04-29  Haijian Zhang 
> 
> Two spaces before < rather than just one.
> 
> > PR target/94820
> > * config/aarch64/aarch64-builtins.c
> > (aarch64_atomic_assign_expand_fenv): Use TARGET_EXPR
> instead of
> > MODIFY_EXPR for first assignment to fenv_cr, fenv_sr and
> > new_fenv_var.
> 
> Ok for trunk, thanks.
> 
> > diff --git a/gcc/config/aarch64/aarch64-builtins.c
> > b/gcc/config/aarch64/aarch64-builtins.c
> > index 5744e68ea08..95213cd70c8 100644
> > --- a/gcc/config/aarch64/aarch64-builtins.c
> > +++ b/gcc/config/aarch64/aarch64-builtins.c
> > @@ -2313,10 +2313,12 @@ aarch64_atomic_assign_expand_fenv (tree
> *hold, tree *clear, tree *update)
> >mask_sr = build_int_cst (unsigned_type_node,
> >~(AARCH64_FE_ALL_EXCEPT));
> >
> > -  ld_fenv_cr = build2 (MODIFY_EXPR, unsigned_type_node,
> > -  fenv_cr, build_call_expr (get_fpcr, 0));
> > -  ld_fenv_sr = build2 (MODIFY_EXPR, unsigned_type_node,
> > -  fenv_sr, build_call

Re: [PATCH] Add value range info for affine combination to improve store motion (PR83403)

2020-04-29 Thread luoxhu via Gcc-patches



On 2020/4/28 18:30, Richard Biener wrote:

> 
> OK, I guess instead of get_range_info expr_to_aff_combination could
> simply use determine_value_range (op0, &minv, &maxv) == VR_RANGE
> (the && TREE_CODE (op0) == SSA_NAME check can then be removed)?
> 

Tried with determine_value_range, it works and is much more straight forward,
Thanks for your suggestion, update the patch as below with some tests, 
regression testing.


[PATCH] Add handling of MULT_EXPR/PLUS_EXPR for wrapping overflow in affine 
combination(PR83403)

Use determine_value_range to get value range info for fold convert expressions
with internal operation PLUS_EXPR/MINUS_EXPR/MULT_EXPR when not overflow on
wrapping overflow inner type.  i.e.:

(long unsigned int)((unsigned  int)n * 10 + 1)
=>
(long unsigned  int)n * (long unsigned int)10 + (long unsigned int)1

With this patch for affine combination, load/store motion could detect
more address refs independency and promote some memory expressions to
registers within loop.

PS: Replace the previous "(T1)(X + CST) as (T1)X - (T1)(-CST))"
to "(T1)(X + CST) as (T1)X + (T1)(CST))" for wrapping overflow.

gcc/ChangeLog

2020-04-29  Xiong Hu Luo  

PR tree-optimization/83403
* tree-affine.c (expr_to_aff_combination): Replace SSA_NAME with
determine_value_range, Add fold conversion of MULT_EXPR, fix the
previous PLUS_EXPR.

gcc/testsuite/ChangeLog

2020-04-29  Xiong Hu Luo  

PR tree-optimization/83403
* gcc.dg/tree-ssa/pr83403-1.c: New test.
* gcc.dg/tree-ssa/pr83403-2.c: New test.
* gcc.dg/tree-ssa/pr83403.h: New header.
---
 gcc/testsuite/gcc.dg/tree-ssa/pr83403-1.c |  8 ++
 gcc/testsuite/gcc.dg/tree-ssa/pr83403-2.c |  8 ++
 gcc/testsuite/gcc.dg/tree-ssa/pr83403.h   | 30 +++
 gcc/tree-affine.c | 22 +
 4 files changed, 58 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr83403-1.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr83403-2.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr83403.h

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr83403-1.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr83403-1.c
new file mode 100644
index 000..748375b03af
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr83403-1.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -funroll-loops -fdump-tree-lim2-details" } */
+
+#define TYPE unsigned int
+
+#include "pr83403.h"
+
+/* { dg-final { scan-tree-dump-times "Executing store motion of" 10 "lim2" } } 
*/
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr83403-2.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr83403-2.c
new file mode 100644
index 000..ca2e6bbd61c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr83403-2.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -funroll-loops -fdump-tree-lim2-details" } */
+
+#define TYPE int
+
+#include "pr83403.h"
+
+/* { dg-final { scan-tree-dump-times "Executing store motion of" 10 "lim2" } } 
*/
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr83403.h 
b/gcc/testsuite/gcc.dg/tree-ssa/pr83403.h
new file mode 100644
index 000..0da8a835b5f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr83403.h
@@ -0,0 +1,30 @@
+__attribute__ ((noinline)) void
+calculate (const double *__restrict__ A, const double *__restrict__ B,
+  double *__restrict__ C)
+{
+  TYPE m = 0;
+  TYPE n = 0;
+  TYPE k = 0;
+
+  A = (const double *) __builtin_assume_aligned (A, 16);
+  B = (const double *) __builtin_assume_aligned (B, 16);
+  C = (double *) __builtin_assume_aligned (C, 16);
+
+  for (n = 0; n < 9; n++)
+{
+  for (m = 0; m < 10; m++)
+   {
+ C[(n * 10) + m] = 0.0;
+   }
+
+  for (k = 0; k < 17; k++)
+   {
+#pragma simd
+ for (m = 0; m < 10; m++)
+   {
+ C[(n * 10) + m] += A[(k * 20) + m] * B[(n * 20) + k];
+   }
+   }
+}
+}
+
diff --git a/gcc/tree-affine.c b/gcc/tree-affine.c
index 0eb8db1b086..86bb64fe245 100644
--- a/gcc/tree-affine.c
+++ b/gcc/tree-affine.c
@@ -343,23 +343,25 @@ expr_to_aff_combination (aff_tree *comb, tree_code code, 
tree type,
wide_int minv, maxv;
/* If inner type has wrapping overflow behavior, fold conversion
   for below case:
-(T1)(X - CST) -> (T1)X - (T1)CST
-  if X - CST doesn't overflow by range information.  Also handle
-  (T1)(X + CST) as (T1)(X - (-CST)).  */
+(T1)(X *+- CST) -> (T1)X *+- (T1)CST
+  if X *+- CST doesn't overflow by range information.  */
if (TYPE_UNSIGNED (itype)
&& TYPE_OVERFLOW_WRAPS (itype)
-   && TREE_CODE (op0) == SSA_NAME
&& TREE_CODE (op1) == INTEGER_CST
-   && icode != MULT_EXPR
-   && get_range_info (op0, &minv, &maxv) == VR_RANGE)
+   && determine_value_range (op0, &minv, &maxv) == VR_RANGE)
 

Re: [Patch][OpenMP][Fortran] Support absent optional args with use_device_{ptr, addr} (+ OpenACC's use_device clause)

2020-04-29 Thread Thomas Schwinge
Hi!

On 2019-11-11T13:14:43+0100, I wrote:
> On 2019-11-08T16:41:23+0100, Tobias Burnus  wrote:
>> --- /dev/null
>> +++ b/libgomp/testsuite/libgomp.fortran/use_device_ptr-optional-2.f90
>
> When adding '{ dg-do run }' for torture testing (please remember to), I
> see the '-O0' and '-O1' execution testing FAIL, complaining that
> "libgomp: use_device_ptr pointer wasn't mapped".

That'd gotten resolved in the mean time, so I've now myself pushed to
master branch in commit b9dc11b6730a8030cfc85f0222cef523c9c5d27c "Torture
testing: 'libgomp.fortran/use_device_ptr-optional-2.f90'", see attached.


Grüße
 Thomas


-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter
>From b9dc11b6730a8030cfc85f0222cef523c9c5d27c Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Mon, 11 Nov 2019 11:30:33 +0100
Subject: [PATCH] Torture testing:
 'libgomp.fortran/use_device_ptr-optional-2.f90'

Fix-up for commit a2c26c50310a336361d8129ecdd43d3001d6cb3a (r278046) "Fortran]
Support absent optional args with use_device_{ptr,addr}".

	libgomp/
	* testsuite/libgomp.fortran/use_device_ptr-optional-2.f90: Add
	'dg-do run'.
---
 libgomp/ChangeLog| 5 +
 .../testsuite/libgomp.fortran/use_device_ptr-optional-2.f90  | 1 +
 2 files changed, 6 insertions(+)

diff --git a/libgomp/ChangeLog b/libgomp/ChangeLog
index ee1764d4ae31..53bb8d23fa15 100644
--- a/libgomp/ChangeLog
+++ b/libgomp/ChangeLog
@@ -1,3 +1,8 @@
+2020-04-29  Thomas Schwinge  
+
+	* testsuite/libgomp.fortran/use_device_ptr-optional-2.f90: Add
+	'dg-do run'.
+
 2020-04-23  Andrew Stubbs  
 
 	PR other/94629
diff --git a/libgomp/testsuite/libgomp.fortran/use_device_ptr-optional-2.f90 b/libgomp/testsuite/libgomp.fortran/use_device_ptr-optional-2.f90
index 641ebd98962a..7a4aaae52cbd 100644
--- a/libgomp/testsuite/libgomp.fortran/use_device_ptr-optional-2.f90
+++ b/libgomp/testsuite/libgomp.fortran/use_device_ptr-optional-2.f90
@@ -1,3 +1,4 @@
+! { dg-do run }
 ! Check whether absent optional arguments are properly
 ! handled with use_device_{addr,ptr}.
 program main
-- 
2.26.2



Re: [AMD GCN] Use 'radeon' for the environment variable 'ACC_DEVICE_TYPE'

2020-04-29 Thread Thomas Schwinge
Hi!

On 2020-04-23T17:27:45+0100, Andrew Stubbs  wrote:
> On 21/04/2020 13:24, Thomas Schwinge wrote:
>> [...], what about the user-visible 'gcn' in the
>> 'ACC_DEVICE_TYPE' environment variable?  As I'd quoted in
>> :
>>
>> | Per OpenACC 3.0, A.1.2. "AMD
>> | GPU Targets", for example, there is 'acc_device_radeon'
>>
>> ... which you've addressed...
>>
>> | (and "the
>> | case-insensitive name 'radeon' for the environment variable
>> | 'ACC_DEVICE_TYPE'").
>>
>> ..., but this not yet?
>
> Oh, right, the user-visible case ought to accept "radeon", at least. :-(
>
>> Please see the attached "[AMD GCN] Use 'radeon' for the environment
>> variable 'ACC_DEVICE_TYPE'", [...]
>
> This is OK, assuming it tests OK.

Thanks, tested, and now pushed to master branch in commit
4912a04f8b35fadf65973bffc7037432ff7b7980 "[gcn] Use 'radeon' for the
environment variable 'ACC_DEVICE_TYPE'", see attached.


Grüße
 Thomas


-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter
>From 4912a04f8b35fadf65973bffc7037432ff7b7980 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Tue, 21 Apr 2020 14:16:24 +0200
Subject: [PATCH] [gcn] Use 'radeon' for the environment variable
 'ACC_DEVICE_TYPE'

..., per OpenACC 3.0, A.1.2. "AMD GPU Targets".

This complements commit 6687d13a87c42dddc7d1c7adade38d31ba0d1401 "Rename
acc_device_gcn to acc_device_radeon".

	libgomp/
	* oacc-init.c (get_openacc_name): Handle 'gcn'.
	* testsuite/lib/libgomp.exp
	(offload_target_to_openacc_device_type) [amdgcn*]: Return
	'radeon'.  Adjust all users.
	(check_effective_target_openacc_amdgcn_accel_present): Rename
	to...
	(check_effective_target_openacc_radeon_accel_present): ... this.
	Adjust all users.
	(check_effective_target_openacc_amdgcn_accel_selected): Rename to...
	(check_effective_target_openacc_radeon_accel_selected): ... this.
	Adjust all users.
---
 libgomp/ChangeLog  | 12 
 libgomp/oacc-init.c|  4 +++-
 libgomp/testsuite/lib/libgomp.exp  | 16 
 libgomp/testsuite/libgomp.oacc-c++/c++.exp | 18 +-
 .../libgomp.oacc-c++/firstprivate-mappings-1.C |  2 +-
 .../acc_get_property-gcn.c |  2 +-
 .../asyncwait-nop-1.c  |  2 +-
 .../firstprivate-mappings-1.c  |  2 +-
 .../function-not-offloaded.c   |  4 ++--
 .../libgomp.oacc-c-c++-common/loop-auto-1.c|  2 +-
 .../loop-dim-default.c |  2 +-
 .../libgomp.oacc-c-c++-common/routine-wv-2.c   |  2 +-
 .../libgomp.oacc-c-c++-common/tile-1.c |  2 +-
 libgomp/testsuite/libgomp.oacc-c/c.exp | 18 +-
 .../libgomp.oacc-fortran/error_stop-1.f|  2 +-
 .../libgomp.oacc-fortran/error_stop-2.f|  2 +-
 .../libgomp.oacc-fortran/error_stop-3.f|  2 +-
 .../testsuite/libgomp.oacc-fortran/fortran.exp | 14 +++---
 18 files changed, 61 insertions(+), 47 deletions(-)

diff --git a/libgomp/ChangeLog b/libgomp/ChangeLog
index 53bb8d23fa15..cfe6e0653c92 100644
--- a/libgomp/ChangeLog
+++ b/libgomp/ChangeLog
@@ -1,5 +1,17 @@
 2020-04-29  Thomas Schwinge  
 
+	* oacc-init.c (get_openacc_name): Handle 'gcn'.
+	* testsuite/lib/libgomp.exp
+	(offload_target_to_openacc_device_type) [amdgcn*]: Return
+	'radeon'.  Adjust all users.
+	(check_effective_target_openacc_amdgcn_accel_present): Rename
+	to...
+	(check_effective_target_openacc_radeon_accel_present): ... this.
+	Adjust all users.
+	(check_effective_target_openacc_amdgcn_accel_selected): Rename to...
+	(check_effective_target_openacc_radeon_accel_selected): ... this.
+	Adjust all users.
+
 	* testsuite/libgomp.fortran/use_device_ptr-optional-2.f90: Add
 	'dg-do run'.
 
diff --git a/libgomp/oacc-init.c b/libgomp/oacc-init.c
index ef12b4c16d01..5d786a5a2e7c 100644
--- a/libgomp/oacc-init.c
+++ b/libgomp/oacc-init.c
@@ -99,7 +99,9 @@ unknown_device_type_error (acc_device_t invalid_type)
 static const char *
 get_openacc_name (const char *name)
 {
-  if (strcmp (name, "nvptx") == 0)
+  if (strcmp (name, "gcn") == 0)
+return "radeon";
+  else if (strcmp (name, "nvptx") == 0)
 return "nvidia";
   else
 return name;
diff --git a/libgomp/testsuite/lib/libgomp.exp b/libgomp/testsuite/lib/libgomp.exp
index e7ce784314d9..ee5f0e57b190 100644
--- a/libgomp/testsuite/lib/libgomp.exp
+++ b/libgomp/testsuite/lib/libgomp.exp
@@ -319,7 +319,7 @@ proc libgomp_option_proc { option } {
 proc offload_target_to_openacc_device_type { offload_target } {
 switch -glob $offload_target {
 	amdgcn* {
-	return "gcn"
+	return "radeon"
 	}
 	disable {
 	return "host"
@@ -483,10 +483,10 @@ proc check_effective_target_hsa_offloading_selected {} {
 }]
 }
 
-# Return 1 if at least one AMD GCN board is present.
+# 

[gcn] Don't default to building target-libstdc++-v3 [PR92713] (was: [PATCH v3 06/10] GCN back-end config)

2020-04-29 Thread Thomas Schwinge
Hi!

On 2018-12-12T11:52:53+, Andrew Stubbs  wrote:
> This patch contains the configuration adjustments needed to enable the GCN
> back-end.

In addition to default-enabling libgomp, which Tobias already pushed to
master branch in commit 29b1533acd5c57e3e327261b0054d0ee2858a166
"configure - build libgomp by default for amdgcn",
,
I've now, as obvious, pushed to master branch in commit
afa3d80e86fb538ce7f5f1485fa774c11fdaf0f6 "[gcn] Don't default to building
target-libstdc++-v3 [PR92713]", see attached.  That way, a (standard) GCC
'--enable-languages=[...]' configuration including 'c++' now "works"
(... to the extent expected without libstdc++).


Grüße
 Thomas


-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter
>From afa3d80e86fb538ce7f5f1485fa774c11fdaf0f6 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Sat, 21 Mar 2020 11:46:02 +0100
Subject: [PATCH] [gcn] Don't default to building target-libstdc++-v3 [PR92713]

... which hasn't been ported/fails to build when using newlib (with GCC commit
b73f69020f08208d2d969fcf8879bd294a6e3596 sources, and newlib commit
6d79e0a58866548f435527798fbd4a6849d05bc7, tag: newlib-3.3.0 sources):

In file included from [...]/build-gcc-offload-amdgcn-amdhsa/amdgcn-amdhsa/libstdc++-v3/include/csetjmp:42,
 from [...]/source-gcc/libstdc++-v3/include/precompiled/stdc++.h:42:
[...]/source-gcc/newlib/libc/include/setjmp.h:15:6: error: variable or field 'longjmp' declared void
   15 | void longjmp (jmp_buf __jmpb, int __retval)
  |  ^~~
[...]
Makefile:1824: recipe for target 'amdgcn-amdhsa/bits/stdc++.h.gch/O2ggnu++0x.gch' failed
make[3]: *** [amdgcn-amdhsa/bits/stdc++.h.gch/O2ggnu++0x.gch] Error 1

	PR target/92713
	* configure.ac ["${ENABLE_LIBSTDCXX}" = "default" && amdgcn*-*-*]
	(noconfigdirs): Add 'target-libstdc++-v3'.
	* configure: Regenerate.
---
 ChangeLog| 7 +++
 configure| 4 
 configure.ac | 4 
 3 files changed, 15 insertions(+)

diff --git a/ChangeLog b/ChangeLog
index 3b667b18f9f1..a7fcf77b9b26 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,10 @@
+2020-04-29  Thomas Schwinge  
+
+	PR target/92713
+	* configure.ac ["${ENABLE_LIBSTDCXX}" = "default" && amdgcn*-*-*]
+	(noconfigdirs): Add 'target-libstdc++-v3'.
+	* configure: Regenerate.
+
 2020-04-21  Stephen Casner  
 
 	PR 25830
diff --git a/configure b/configure
index 815069c9856d..4cc938ebb7d8 100755
--- a/configure
+++ b/configure
@@ -3387,6 +3387,10 @@ if test "${ENABLE_LIBSTDCXX}" = "default" ; then
   # VxWorks uses the Dinkumware C++ library.
   noconfigdirs="$noconfigdirs target-libstdc++-v3"
   ;;
+amdgcn*-*-*)
+  # Not ported/fails to build when using newlib.
+  noconfigdirs="$noconfigdirs target-libstdc++-v3"
+  ;;
 arm*-wince-pe*)
   # the C++ libraries don't build on top of CE's C libraries
   noconfigdirs="$noconfigdirs target-libstdc++-v3"
diff --git a/configure.ac b/configure.ac
index b95532cae90f..c78d9cbea62d 100644
--- a/configure.ac
+++ b/configure.ac
@@ -668,6 +668,10 @@ if test "${ENABLE_LIBSTDCXX}" = "default" ; then
   # VxWorks uses the Dinkumware C++ library.
   noconfigdirs="$noconfigdirs target-libstdc++-v3"
   ;;
+amdgcn*-*-*)
+  # Not ported/fails to build when using newlib.
+  noconfigdirs="$noconfigdirs target-libstdc++-v3"
+  ;;
 arm*-wince-pe*)
   # the C++ libraries don't build on top of CE's C libraries
   noconfigdirs="$noconfigdirs target-libstdc++-v3"
-- 
2.26.2



Re: [PATCH v3 04/10] GCN machine description

2020-04-29 Thread Thomas Schwinge
Hi!

On 2018-12-12T11:52:23+, Andrew Stubbs  wrote:
> This patch contains the machine description portion of the GCN
> back-end.

> --- /dev/null
> +++ b/gcc/config/gcn/gcn.md

> +(define_insn_and_split "*mov_insn"
> +  [(set (match_operand:DIDF 0 "nonimmediate_operand"
> +   "=SD,SD,SD,RS,Sm,v, v,Sg, v, v,RF,RLRG,   v, v,RM")
> + (match_operand:DIDF 1 "general_operand"
> +   "SSA, C,DB,Sm,RS,v,DB, v,Sv,RF, v,   v,RLRG,RM, v"))]
> +  "GET_CODE(operands[1]) != SYMBOL_REF"
> +  "@
> +  s_mov_b64\t%0, %1
> +  s_mov_b64\t%0, %1
> +  #
> +  s_store_dwordx2\t%1, %A0\;s_waitcnt\tlgkmcnt(0)
> +  s_load_dwordx2\t%0, %A1\;s_waitcnt\tlgkmcnt(0)
> +  #
> +  #
> +  #
> +  #
> +  flat_load_dwordx2\t%0, %A1%O1%g1\;s_waitcnt\t0
> +  flat_store_dwordx2\t%A0, %1%O0%g0\;s_waitcnt\t0
> +  ds_write_b64\t%A0, %1%O0\;s_waitcnt\tlgkmcnt(0)
> +  ds_read_b64\t%0, %A1%O1\;s_waitcnt\tlgkmcnt(0)
> +  global_load_dwordx2\t%0, %A1%O1%g1\;s_waitcnt\tvmcnt(0)
> +  global_store_dwordx2\t%A0, %1%O0%g0\;s_waitcnt\tvmcnt(0)"
> +  "(reload_completed && !MEM_P (operands[0]) && !MEM_P (operands[1])
> +&& !gcn_sgpr_move_p (operands[0], operands[1]))
> +   || (GET_CODE (operands[1]) == CONST_INT && !gcn_constant64_p 
> (operands[1]))"
> +  [(set (match_dup 0) (match_dup 1))
> +   (set (match_dup 2) (match_dup 3))]
> +  {
> +rtx inlo = gen_lowpart (SImode, operands[1]);
> +rtx inhi = gen_highpart_mode (SImode, mode, operands[1]);
> +rtx outlo = gen_lowpart (SImode, operands[0]);
> +rtx outhi = gen_highpart_mode (SImode, mode, operands[0]);
> +
> +/* Ensure that overlapping registers aren't corrupted.  */
> +if (REGNO (outlo) == REGNO (inhi))

As discussed in PR94248 "[amdgcn] Doesn't build with RTL checking", I've
tested Jakub's patch, and now pushed to master branch in commit
ccf93cd0b21e9c0ff0a1d4ace59899fca25ac157 "[gcn] Fix build with RTL
checking [PR94248]".

> +  {
> + operands[0] = outhi;
> + operands[1] = inhi;
> + operands[2] = outlo;
> + operands[3] = inlo;
> +  }
> +else
> +  {
> + operands[0] = outlo;
> + operands[1] = inlo;
> + operands[2] = outhi;
> + operands[3] = inhi;
> +  }
> +  }
> +  [(set_attr "type" "sop1,sop1,mult,smem,smem,vmult,vmult,vmult,vmult,flat,
> +  flat,ds,ds,flat,flat")
> +   (set_attr "length" "4,8,*,12,12,*,*,*,*,12,12,12,12,12,12")])


Grüße
 Thomas


-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter
>From ccf93cd0b21e9c0ff0a1d4ace59899fca25ac157 Mon Sep 17 00:00:00 2001
From: Jakub Jelinek 
Date: Sat, 21 Mar 2020 14:39:56 +
Subject: [PATCH] [gcn] Fix build with RTL checking [PR94248]

Building (for offloading) a '--target=amdgcn-amdhsa' GCC with
'--enable-checking=yes,extra,rtl' fails:

during RTL pass: split2
[...]/source-gcc/libgcc/libgcc2.c: In function '__absvdi2':
[...]/source-gcc/libgcc/libgcc2.c:271:1: internal compiler error: RTL check: expected code 'reg', have 'const_int' in rhs_regno, at rtl.h:1923
  271 | }
	  | ^
0x565847 ???
	[...]/source-gcc/gcc/rtl.c:881
0x59a8a4 ???
	[...]/source-gcc/gcc/rtl.h:1923
0x12e3a5c ???
	[...]/source-gcc/gcc/config/gcn/gcn.md:631
[...]
Makefile:501: recipe for target '_absvdi2.o' failed
make[4]: *** [_absvdi2.o] Error 1
make[4]: Leaving directory '[...]/build-gcc-offload-amdgcn-amdhsa/amdgcn-amdhsa/gfx900/libgcc'

	gcc/
	PR target/94248
	* config/gcn/gcn.md (*mov_insn): Use
	'reg_overlap_mentioned_p' to check for overlap.

Tested-by: Thomas Schwinge 
---
 gcc/ChangeLog | 4 
 gcc/config/gcn/gcn.md | 2 +-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 5541d9c62d9f..3321e8839707 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,9 @@
 2020-04-29  Jakub Jelinek  
 
+	PR target/94248
+	* config/gcn/gcn.md (*mov_insn): Use
+	'reg_overlap_mentioned_p' to check for overlap.
+
 	PR target/94706
 	* config/ia64/ia64.c (hfa_element_mode): Use DECL_FIELD_ABI_IGNORED
 	instead of cxx17_empty_base_field_p.
diff --git a/gcc/config/gcn/gcn.md b/gcc/config/gcn/gcn.md
index 8f5937781b2b..8cfb3a85d256 100644
--- a/gcc/config/gcn/gcn.md
+++ b/gcc/config/gcn/gcn.md
@@ -629,7 +629,7 @@
 rtx outhi = gen_highpart_mode (SImode, mode, operands[0]);
 
 /* Ensure that overlapping registers aren't corrupted.  */
-if (REGNO (outlo) == REGNO (inhi))
+if (reg_overlap_mentioned_p (outlo, inhi))
   {
 	operands[0] = outhi;
 	operands[1] = inhi;
-- 
2.26.2



[gcn] Set 'UI_NONE' for 'TARGET_EXCEPT_UNWIND_INFO' [PR94282] (was: [PATCH] amdgcn: Add stub personality function)

2020-04-29 Thread Thomas Schwinge
Hi!

On 2020-04-23T13:05:33+0200, I wrote:
> On 2020-04-23T09:15:29+0100, Andrew Stubbs  wrote:
>> On 22/04/2020 22:10, Kwok Cheung Yeung wrote:
>>> [...] allows the following tests in the libgomp
>>> testsuite that were previously failing with a linker error to compile
>>> and run, provided that they do not throw any exceptions:
>>>
>>> libgomp.c-c++-common/function-not-offloaded.c
>>> libgomp.c++/for-15.C
>>> libgomp.c++/for-24.C
>>> libgomp.oacc-c-c++-common/routine-1.c
>>> libgomp.oacc-c++/pr71959.C
>>> libgomp.oacc-c++/routine-1-auto.C
>>> libgomp.oacc-c++/routine-1-template-auto.C
>>> libgomp.oacc-c++/routine-1-template-trailing-return-type.C
>>> libgomp.oacc-c++/routine-1-template.C
>>> libgomp.oacc-c++/routine-1-trailing-return-type.C
>
> That's  "[amdgcn] ld: error: undefined
> symbol: __gxx_personality_v0"

> I suggest we [...] apply what I'd proposed a month ago in
>  "[amdgcn] ld: error: undefined symbol:
> __gxx_personality_v0", and now yesterday (coincidentally) posted.

Now pushed to master branch in commit
7f1989249e25af6fc0f124452efa24b3796b767a "[gcn] Set 'UI_NONE' for
'TARGET_EXCEPT_UNWIND_INFO' [PR94282]", see attached.


Grüße
 Thomas


-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter
>From 7f1989249e25af6fc0f124452efa24b3796b767a Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Tue, 21 Apr 2020 22:39:33 +0200
Subject: [PATCH] [gcn] Set 'UI_NONE' for 'TARGET_EXCEPT_UNWIND_INFO' [PR94282]

In libgomp offloading testing, this resolves all the 'ld: error: undefined
symbol: __gxx_personality_v0' FAILs.

	gcc/
	PR target/94282
	* common/config/gcn/gcn-common.c (gcn_except_unwind_info): New
	function.
	(TARGET_EXCEPT_UNWIND_INFO): Define.
	libgomp/
	PR target/94282
	* testsuite/libgomp.c-c++-common/function-not-offloaded.c: Remove
	'dg-allow-blank-lines-in-output'.
---
 gcc/ChangeLog| 7 +++
 gcc/common/config/gcn/gcn-common.c   | 9 +
 libgomp/ChangeLog| 4 
 .../libgomp.c-c++-common/function-not-offloaded.c| 1 -
 4 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 3321e8839707..2ba39f67200f 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,10 @@
+2020-04-29  Thomas Schwinge  
+
+	PR target/94282
+	* common/config/gcn/gcn-common.c (gcn_except_unwind_info): New
+	function.
+	(TARGET_EXCEPT_UNWIND_INFO): Define.
+
 2020-04-29  Jakub Jelinek  
 
 	PR target/94248
diff --git a/gcc/common/config/gcn/gcn-common.c b/gcc/common/config/gcn/gcn-common.c
index 84a567b2..9642f9cc5a6c 100644
--- a/gcc/common/config/gcn/gcn-common.c
+++ b/gcc/common/config/gcn/gcn-common.c
@@ -34,4 +34,13 @@ static const struct default_options gcn_option_optimization_table[] =
 #undef  TARGET_OPTION_OPTIMIZATION_TABLE
 #define TARGET_OPTION_OPTIMIZATION_TABLE gcn_option_optimization_table
 
+static enum unwind_info_type
+gcn_except_unwind_info (struct gcc_options *opts ATTRIBUTE_UNUSED)
+{
+  return UI_NONE;
+}
+
+#undef  TARGET_EXCEPT_UNWIND_INFO
+#define TARGET_EXCEPT_UNWIND_INFO gcn_except_unwind_info
+
 struct gcc_targetm_common targetm_common = TARGETM_COMMON_INITIALIZER;
diff --git a/libgomp/ChangeLog b/libgomp/ChangeLog
index cfe6e0653c92..1a7046f2fc64 100644
--- a/libgomp/ChangeLog
+++ b/libgomp/ChangeLog
@@ -1,5 +1,9 @@
 2020-04-29  Thomas Schwinge  
 
+	PR target/94282
+	* testsuite/libgomp.c-c++-common/function-not-offloaded.c: Remove
+	'dg-allow-blank-lines-in-output'.
+
 	* oacc-init.c (get_openacc_name): Handle 'gcn'.
 	* testsuite/lib/libgomp.exp
 	(offload_target_to_openacc_device_type) [amdgcn*]: Return
diff --git a/libgomp/testsuite/libgomp.c-c++-common/function-not-offloaded.c b/libgomp/testsuite/libgomp.c-c++-common/function-not-offloaded.c
index f01a64e72c07..9e59ef8864e7 100644
--- a/libgomp/testsuite/libgomp.c-c++-common/function-not-offloaded.c
+++ b/libgomp/testsuite/libgomp.c-c++-common/function-not-offloaded.c
@@ -1,6 +1,5 @@
 /* { dg-do link } */
 /* { dg-excess-errors "unresolved symbol foo, lto1, mkoffload and lto-wrapper fatal errors" { target offload_device_nonshared_as } } */
-/* { dg-allow-blank-lines-in-output 1 } */
 /* { dg-additional-sources "function-not-offloaded-aux.c" } */
 
 #pragma omp declare target
-- 
2.26.2



Re: [committed] libphobos: Add power*-*-linux* as a supported target

2020-04-29 Thread Iain Buclaw via Gcc-patches
On 28/04/2020 19:12, Segher Boessenkool wrote:
> Hi!
> 
> On Sun, Apr 26, 2020 at 11:35:21AM +0200, Iain Buclaw wrote:
>> This patch adds power*-*-linux* as a supported target for libphobos.
> 
> Many thanks for doing this!
> 
> A problem though: libphobos/libdruntime is built for -m32 as well, but
> that builds libphobos/libdruntime/config/powerpc64/callwithstack.S,
> which cannot work (that file has all kinds of 64-bit only constructs and
> instructions in it).
> 

Yes, sadly I could not get --enable-multilib/multiarch to work properly
locally, so this one unfornately slipped though.

>> diff --git a/libphobos/configure b/libphobos/configure
>> index c2b49132fda..c923417532f 100755
>> --- a/libphobos/configure
>> +++ b/libphobos/configure
>> @@ -13991,9 +13991,10 @@ fi
>> ;;
>>mips*)   druntime_target_cpu_parsed="mips"
>> ;;
>> -  powerpc) druntime_target_cpu_parsed="powerpc"
>> +  powerpc|powerpcle)
>> +   druntime_target_cpu_parsed="powerpc"
>> ;;
>> -  powerpc64)
>> +  powerpc64|powerpc64le)
>> druntime_target_cpu_parsed="powerpc64"
>> ;;
>>i[34567]86|x86_64)
> 
> We are a biarch target, so both powerpc-* and powerpc64-* configurations
> can do both those configs (potentially, and the default for many configs).
> 

Fixed on master 2110-g8b53086ab6a

Iain.


Re: [rtl] Harden 'set_noop_p' for non-constant selectors [PR94279]

2020-04-29 Thread Thomas Schwinge
Hi!

On 2020-04-22T18:23:24+0100, Richard Sandiford  
wrote:
> Andrew Stubbs  writes:
>> On 22/04/2020 17:43, Thomas Schwinge wrote:
>>> In  "[amdgcn] internal compiler error: RTL
>>> check: expected code 'const_int', have 'reg' in rtx_to_poly_int64, at
>>> rtl.h:2379", we recently found that that it's wrong to expect constant
>>> selectors, at least in the current code and its usage context.  (Thanks,
>>> Richard Biener for the guidance!)  Not too many actually, but of course,
>>> this code has seen some changes since 2013-12-04 (for example, r261530
>>> "Use poly_int rtx accessors instead of hwi accessors"), and also the
>>> context may have changed that it's being used in -- so, I'm not sure
>>> whether the original code (as quoted above) is actually buggy already,
>>> but it already does contain the pattern that 'INTVAL' is used on
>>> something without making sure that we're actually dealing with a constant
>>> selector.  (Has that maybe have been an impossible scenario back then?)
>>
>> I think it was impossible. See
>> https://gcc.gnu.org/legacy-ml/gcc-patches/2018-09/msg00273.html
>
> Ah!  Thanks for the link.

Many thanks indeed! That gives confidence why we're running into this
problem just now, for GCN target only -- Tejas' original patch thus is
not to blame at all, good.

(..., and hopefully we won't find much more fall-out due to the GCN
target doing away with the constant 'vec_select' restriction...)

>>> Then, should this also be backported to release branches?  GCC 9: same
>>> patch as for master branch.  GCC 8: pre poly_int, so only need to guard
>>> 'INTVAL' (by 'CONST_INT_P', right?).  Or, is that not worth it, given
>>> that nobody found this to be a problem until now (as far as I know),
>>> and/or it's maybe really specific to (or, exposed by) AMD GCN's vector
>>> instructions?  (For AMD GCN offloading, we only care about master
>>> branch.)
>>
>> I don't think it's needed prior to GCC 9, and then only for amdgcn which
>> was probably not widely used.
>
> Based on that, OK for master and GCC 9.

Thanks for the quick review.  I'll later see about backporting for GCC 9.
For now pushed to master branch in commit
f2c2eaaf8fb5c66ae372bb526b2b2fe67a9c5c39 "[rtl] Harden 'set_noop_p' for
non-constant selectors [PR94279]", see attached.


Grüße
 Thomas


-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter
>From f2c2eaaf8fb5c66ae372bb526b2b2fe67a9c5c39 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Wed, 22 Apr 2020 16:58:44 +0200
Subject: [PATCH] [rtl] Harden 'set_noop_p' for non-constant selectors
 [PR94279]

... given that the GCN target did away with the constant 'vec_select'
restriction.

	gcc/
	PR target/94279
	* rtlanal.c (set_noop_p): Handle non-constant selectors.
---
 gcc/ChangeLog |  3 +++
 gcc/rtlanal.c | 12 +---
 2 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 2ba39f67200f..ef851ef84626 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,8 @@
 2020-04-29  Thomas Schwinge  
 
+	PR target/94279
+	* rtlanal.c (set_noop_p): Handle non-constant selectors.
+
 	PR target/94282
 	* common/config/gcn/gcn-common.c (gcn_except_unwind_info): New
 	function.
diff --git a/gcc/rtlanal.c b/gcc/rtlanal.c
index c7ab86e228b1..0ebde7622db6 100644
--- a/gcc/rtlanal.c
+++ b/gcc/rtlanal.c
@@ -1631,12 +1631,18 @@ set_noop_p (const_rtx set)
   int i;
   rtx par = XEXP (src, 1);
   rtx src0 = XEXP (src, 0);
-  poly_int64 c0 = rtx_to_poly_int64 (XVECEXP (par, 0, 0));
+  poly_int64 c0;
+  if (!poly_int_rtx_p (XVECEXP (par, 0, 0), &c0))
+	return 0;
   poly_int64 offset = GET_MODE_UNIT_SIZE (GET_MODE (src0)) * c0;
 
   for (i = 1; i < XVECLEN (par, 0); i++)
-	if (maybe_ne (rtx_to_poly_int64 (XVECEXP (par, 0, i)), c0 + i))
-	  return 0;
+	{
+	  poly_int64 c0i;
+	  if (!poly_int_rtx_p (XVECEXP (par, 0, i), &c0i)
+	  || maybe_ne (c0i, c0 + i))
+	return 0;
+	}
   return
 	REG_CAN_CHANGE_MODE_P (REGNO (dst), GET_MODE (src0), GET_MODE (dst))
 	&& simplify_subreg_regno (REGNO (src0), GET_MODE (src0),
-- 
2.26.2



[committed] d: Merge bug fix from upstream dmd 06160ccae

2020-04-29 Thread Iain Buclaw via Gcc-patches
Hi,

This patch adds classKind information to the front-end AST, which in
turn allows us to fix code generation of type names for extern(C) and
extern(C++) structs and classes.  Inspecting such types inside a
debugger now just works without the need to cast(module_name.cxx).

Bootstrapped on x86_64-linux-gnu, committed to mainline.

Regards
Iain.

---
gcc/d/ChangeLog:

* d-codegen.cc (d_decl_context): Don't include module in the name of
class and struct types that aren't extern(D).
---
 gcc/d/d-codegen.cc|  4 ++-
 gcc/d/dmd/MERGE   |  2 +-
 gcc/d/dmd/aggregate.h | 17 --
 gcc/d/dmd/dclass.c| 32 +--
 gcc/d/dmd/declaration.c   |  2 +-
 gcc/d/dmd/dstruct.c   |  4 +++
 gcc/d/dmd/func.c  |  4 +--
 gcc/d/dmd/opover.c|  2 +-
 gcc/d/dmd/traits.c| 24 --
 gcc/testsuite/gdc.test/compilable/test17419.d | 18 +++
 10 files changed, 82 insertions(+), 27 deletions(-)

diff --git a/gcc/d/d-codegen.cc b/gcc/d/d-codegen.cc
index 12c6f138362..b4927a2de10 100644
--- a/gcc/d/d-codegen.cc
+++ b/gcc/d/d-codegen.cc
@@ -66,6 +66,7 @@ d_decl_context (Dsymbol *dsym)
 {
   Dsymbol *parent = dsym;
   Declaration *decl = dsym->isDeclaration ();
+  AggregateDeclaration *ad = dsym->isAggregateDeclaration ();
 
   while ((parent = parent->toParent2 ()))
 {
@@ -74,7 +75,8 @@ d_decl_context (Dsymbol *dsym)
 but only for extern(D) symbols.  */
   if (parent->isModule ())
{
- if (decl != NULL && decl->linkage != LINKd)
+ if ((decl != NULL && decl->linkage != LINKd)
+ || (ad != NULL && ad->classKind != ClassKind::d))
return NULL_TREE;
 
  return build_import_decl (parent);
diff --git a/gcc/d/dmd/MERGE b/gcc/d/dmd/MERGE
index f933cf1c992..a2699d39842 100644
--- a/gcc/d/dmd/MERGE
+++ b/gcc/d/dmd/MERGE
@@ -1,4 +1,4 @@
-f8a1a515346b16ebbd9da56a908540cbef1ee582
+06160ccaed7af7955d169024f417c43beb7a8f9f
 
 The first line of this file holds the git revision number of the last
 merge done from the dlang/dmd repository.
diff --git a/gcc/d/dmd/aggregate.h b/gcc/d/dmd/aggregate.h
index 881be58f2b8..da4a0398522 100644
--- a/gcc/d/dmd/aggregate.h
+++ b/gcc/d/dmd/aggregate.h
@@ -71,6 +71,19 @@ FuncDeclaration *buildDtor(AggregateDeclaration *ad, Scope 
*sc);
 FuncDeclaration *buildInv(AggregateDeclaration *ad, Scope *sc);
 FuncDeclaration *search_toString(StructDeclaration *sd);
 
+struct ClassKind
+{
+enum Type
+{
+/// the class is a d(efault) class
+d,
+/// the class is a C++ interface
+cpp,
+/// the class is an Objective-C class/interface
+objc,
+};
+};
+
 class AggregateDeclaration : public ScopeDsymbol
 {
 public:
@@ -84,6 +97,8 @@ public:
 Dsymbol *deferred;  // any deferred semantic2() or semantic3() 
symbol
 bool isdeprecated;  // true if deprecated
 
+ClassKind::Type classKind;  // specifies the linkage type
+
 /* !=NULL if is nested
  * pointing to the dsymbol that directly enclosing it.
  * 1. The function that enclosing it (nested struct and class)
@@ -274,8 +289,6 @@ public:
 
 TypeInfoClassDeclaration *vclassinfo;   // the ClassInfo object for 
this ClassDeclaration
 bool com;   // true if this is a COM class 
(meaning it derives from IUnknown)
-bool cpp;   // true if this is a C++ interface
-bool isobjc;// true if this is an Objective-C 
class/interface
 bool isscope;   // true if this is a scope class
 Abstract isabstract;// 0: fwdref, 1: is abstract class, 2: 
not abstract
 int inuse;  // to prevent recursive attempts
diff --git a/gcc/d/dmd/dclass.c b/gcc/d/dmd/dclass.c
index 4609d6a9f54..a2009a604a5 100644
--- a/gcc/d/dmd/dclass.c
+++ b/gcc/d/dmd/dclass.c
@@ -240,12 +240,10 @@ ClassDeclaration::ClassDeclaration(Loc loc, Identifier 
*id, BaseClasses *basecla
 }
 
 com = false;
-cpp = false;
 isscope = false;
 isabstract = ABSfwdref;
 inuse = 0;
 baseok = BASEOKnone;
-isobjc = false;
 cpp_type_info_ptr_sym = NULL;
 }
 
@@ -389,7 +387,7 @@ void ClassDeclaration::semantic(Scope *sc)
 userAttribDecl = sc->userAttribDecl;
 
 if (sc->linkage == LINKcpp)
-cpp = true;
+classKind = ClassKind::cpp;
 if (sc->linkage == LINKobjc)
 objc()->setObjc(this);
 }
@@ -555,7 +553,7 @@ void ClassDeclaration::semantic(Scope *sc)
 baseok = BASEOKdone;
 
 // If no base class, and this is not an Object, use Object as base 
class
-if (!baseClass && ident != Id::Object && !cpp)
+if (!baseClass && ident != Id::Object && !

Re: [PATCH]use vnode->get_constructor () to get intial in lto[PR94822]

2020-04-29 Thread Richard Biener via Gcc-patches
On Wed, Apr 29, 2020 at 8:44 AM Richard Biener
 wrote:
>
> On Wed, Apr 29, 2020 at 6:04 AM lizekun (A)  wrote:
> >
> > Hi,
> >
> > This ICE appears because gcc will stream it to the function_body section 
> > when processing the
> > variable with the initial value of the constructor type, and the 
> > error_mark_node to the
> > decls section. When recompiling, the value obtained with DECL_INITIAL will 
> > be error_mark.
> >
> > This patch use vnode->get_constructor () to get intial in lto.
> >
> > Bootstrap and tested on aarch64 Linux platform.
>
> You should use ctor_for_folding to get all the magic correct.  Note that this
> function returns error_mark_node if there is no usable initializer, NULL if
> there is zero-init and sth else for the actual initializer.  You can use
>
>   if (tree init = ctor_for_folding (base))
> if (init && init != error_mark_node)
>   {
> ...
>   }
>
> Not to mention this whole function is a red herring ...:/

So I tested this and ctor_for_folding is too strict but we also do not
stream ctors not usable for folding but to the LTRANS unit that
is responsible for the variable output.

So instead I am simply testing an additional != error_mark_node check.
The size, if "important", should be resolved before LTO.

Richard.

> Thanks,
> Richard.
>
> >
> > Best regulars,
> > Zekun
> >
> >
> > gcc:
> > +2020-04-29  Li Zekun  
> > +
> > +   PR  lto/94822
> > +   * tree.c: use vnode->get_constructor () to get intial in lto
> >
> > gcc/testsuite:
> > +2020-04-29  Li Zekun  
> > +
> > +   PR  lto/94822
> > +   * gcc.dg/lto/pr94822.h: New test.
> > +   * gcc.dg/lto/pr94822_0.c: New test.
> > +   * gcc.dg/lto/pr94822_1.c: New test.


[committed] libphobos: Fix KERNEL_VERSION condition in libphobos testsuite

2020-04-29 Thread Iain Buclaw via Gcc-patches
Hi,

This patch fixes an effective target check in the libphobos testsuite.
A typo in the macro call meant that the #error always triggered.

Regression tested on x86_64-linux-gnu with multilib configurations
{-m64,-m32,-mx32}, committed to mainline.

Regards
Iain.

---
libphobos/ChangeLog:

* testsuite/lib/libphobos.exp (check_effective_target_linux_pre_2639):
Fix KERNEL_VERSION condition.
---
 libphobos/testsuite/lib/libphobos.exp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libphobos/testsuite/lib/libphobos.exp 
b/libphobos/testsuite/lib/libphobos.exp
index 7e6e53e9d5c..2e9da95ac1c 100644
--- a/libphobos/testsuite/lib/libphobos.exp
+++ b/libphobos/testsuite/lib/libphobos.exp
@@ -285,7 +285,7 @@ proc check_effective_target_linux_pre_2639 { } {
 
 if { [check_no_compiler_messages linux_pre_2639 assembly {
#include 
-   #if !defined LINUX_VERSION_CODE || LINUX_VERSION_CODE < 
KERNEL_VERSION(2.6.39)
+   #if !defined LINUX_VERSION_CODE || LINUX_VERSION_CODE < 
KERNEL_VERSION(2,6,39)
#error Yes, it is.
#endif
 }] } {
-- 
2.20.1



Re: [Patch][AMD GCN][OpenMP] Add gcc/config/gcn/t-omp-device for OpenMP declare variant kind/arch/isa

2020-04-29 Thread Thomas Schwinge
Hi!

On 2019-11-04T17:35:32+0100, Jakub Jelinek  wrote:
> On Mon, Nov 04, 2019 at 04:33:27PM +, Andrew Stubbs wrote:
>> On 04/11/2019 15:37, Jakub Jelinek wrote:
>> > My preference would be that arch on amdgcn is something like amdgcn or gcn.
>> > I hope the general distinction between arch and isa will be something that
>> > will be discussed next Tuesday on the language committee, so hopefully 
>> > we'll
>> > know more afterwards and can tweak afterwards depending on the outcome.
>> >
>> > Ok with that change.
>>
>> I'm fine with this, too. The OpenMP support should be posted Real Soon Now.
>>
>> We've not been terribly consistent with "gcn" vs. "amdgcn", which is
>> unfortunate, but we are where we are. Most of the API enums have "GCN", so
>> let's use that, for now.
>
> With the way it is defined in the OpenMP spec, there can be actually aliases, 
> so
> having both gcn and amdgcn as supported arch is fine too.

Has this been resolved yet?


On 2019-11-04T16:31:10+0100, Tobias Burnus  wrote:
> This adds gcc/config/gcn/t-omp-device to augment the nvptx, hsa and x86
> host implementation.

> --- a/gcc/configure.ac
> +++ b/gcc/configure.ac
> @@ -1037,6 +1037,10 @@ for tgt in `echo $enable_offload_targets | sed 's/,/ 
> /g'`; do

> +  gcn*-*)
> + omp_device_property=omp-device-properties-gcn
> + omp_device_property_tmake_file="${omp_device_property_tmake_file} 
> \$(srcdir)/config/gcn/t-omp-device"
> + ;;

After some messy debugging (see the commit log), I've pushed as obvious
to master branch in commit b6a0ae1d22c9675f4374c2cb2b5c0833bb1461f1
"[gcn] Fix 'omp-device-properties-gcn' handling", see attached.

Quoting '"$${props}"' as a safety measure didn't solve the problem, as
we're then just being told 'sed: can't read : No such file or directory'
thrice, without any error condition arising -- sloppy shell programming
without proper error checking/handling, as seen so often...  :-| (Not
Tobias' fault here, of course.)  I'm not going to fix up this one case,
as there are so many other pre-existing ones already...

What I don't understand is why this only did hang occasionally but not
all the time...  (What "junk" could there be on 'stdin' to read, or why
would 'stdin' just sometimes be closed, EOF?)

With that fixed, we now get the expected
'build-gcc/gcc/omp-device-properties-gcn', and:

$ diff -U1 [...] build-gcc/gcc/omp-device-properties.h
--- [...] 2020-04-23 10:03:38.278178193 +0200
+++ build-gcc/gcc/omp-device-properties.h   2020-04-23 
22:16:52.472436595 +0200
@@ -2,2 +2,3 @@
 "amdgcn-amdhsa\0"
+"gpu\0\0"
 "nvptx-none\0"
@@ -9,2 +10,3 @@
 "amdgcn-amdhsa\0"
+"gcn\0\0"
 "nvptx-none\0"
@@ -16,2 +18,3 @@
 "amdgcn-amdhsa\0"
+"fiji\0gfx900\0gfx906\0\0"
 "nvptx-none\0"


Doesn't this thing need some testsuite coverage, like
'c-c++-common/gomp/declare-variant-9.c',
'c-c++-common/gomp/declare-variant-10.c'?  (... which would've caught the
problem I've run into here?)


Also, nvptx testsuite coverage could be better, too...  :-|


Grüße
 Thomas


-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter
>From b6a0ae1d22c9675f4374c2cb2b5c0833bb1461f1 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Thu, 23 Apr 2020 21:45:34 +0200
Subject: [PATCH] [gcn] Fix 'omp-device-properties-gcn' handling

Fix-up for commit 955cd057454b323419e06affa7df7d59dc3cd1fb "Add
gcc/config/gcn/t-omp-device for OpenMP declare variant kind/arch/isa".

With AMD GCN offloading configured, I'm seeing occasional GCC build hangs.
I've now captured and analyzed one of them:

$ ps -f
UIDPID  PPID  C STIME TTY  TIME CMD
[...]
tschwing  5113  4508  0 20:24 pts/500:00:00 /bin/sh -c rm -f tmp-omp-device-properties.h; \ for kind in kind arch isa; do \   echo 'const char omp_offload_device_'${kind}'[] = ' \ >> tmp-omp-device-properties.h; \   for prop in no
tschwing  5126  5113  0 20:24 pts/500:00:00 sed -n s/^kind: //p
tschwing  5127  5113  0 20:24 pts/500:00:00 sed s/[[:blank:]]/ /g;s/  */ /g;s/^ //;s/ $//;s/ /\\0/g;s/^/"/;s/$/\\0\\0"/
[...]
$ pstree -p $$
[...]---sh(5113)-+-sed(5126)
 `-sed(5127)
$ ls -lrt build-gcc/gcc/*omp-device*
-rw-r--r-- 1 tschwing eeg  39 Apr 23 20:24 build-gcc/gcc/omp-device-properties-nvptx
-rw-r--r-- 1 tschwing eeg 634 Apr 23 20:24 build-gcc/gcc/omp-device-properties-i386
-rw-r--r-- 1 tschwing eeg  58 Apr 23 20:24 build-gcc/gcc/tmp-omp-device-properties.h

Notably missing is the 'omp-device-properties-gcn' file...

$ grep ^ build-gcc/gcc/*omp-device*
build-gcc/gcc/omp-device-properties-i386:kind: cpu
build-gcc/gcc/omp-device-properties-i386:arch: x86 x86_64 i386 i486 i586 i686 ia32
build-gcc/gcc/omp-device-properties-i386:isa: sse4 cx16 [...]
build-gcc/gcc/omp-device-properties-nvptx:k

Harden and adjust 'gcc/configure' parsing of '--enable-offload-targets' (was: [committed] Simplify omp-device-properties.h creation (PR bootstrap/92314))

2020-04-29 Thread Thomas Schwinge
Hi Jakub!

On 2019-11-02T00:38:02+0100, Jakub Jelinek  wrote:
> Apparently my recent change broke quite a lot of setups where people were
> configuring --enable-offload-targets= without having the corresponding
> offloading compiler already installed.
> The following patch simplifies it by removing the need to have it installed
> again for compiler building (it is still needed for testing as before),
> by adding new tmake_files with the needed rules and building everything
> while building the host compiler.

ACK, thanks for that re-work.


> --- gcc/configure.ac.jj   2019-10-31 11:05:50.461137028 +0100
> +++ gcc/configure.ac  2019-11-01 21:22:58.417920544 +0100
> @@ -1026,18 +1026,22 @@ AC_SUBST(real_target_noncanonical)
>  AC_SUBST(accel_dir_suffix)
>
>  for tgt in `echo $enable_offload_targets | sed 's/,/ /g'`; do
> -  tgt_dir=`echo $tgt | sed -n 's/.*=//p'`
>tgt=`echo $tgt | sed 's/=.*//'`
>
>if echo "$tgt" | grep "^hsa" > /dev/null ; then
>  enable_hsa=1
>else
>  enable_offloading=1
> -if test -n "$tgt_dir"; then
> -  
> omp_device_property="${tgt_dir}/lib/gcc/\$(real_target_noncanonical)/\$(version)/accel/${tgt}/omp-device-properties"
> -else
> -  omp_device_property="\$(libsubdir)/accel/${tgt}/omp-device-properties"
> -fi
> +case "$tgt" in
> +  *-intelmicemul-*)
> + omp_device_property=omp-device-properties-i386
> + omp_device_property_tmake_file="${omp_device_property_tmake_file} 
> \$(srcdir)/config/i386/t-omp-device"
> + ;;
> +  nvptx*-*)
> + omp_device_property=omp-device-properties-nvptx
> + omp_device_property_tmake_file="${omp_device_property_tmake_file} 
> \$(srcdir)/config/nvptx/t-omp-device"
> + ;;
> +esac
>  omp_device_properties="${omp_device_properties} 
> ${tgt}=${omp_device_property}"
>  omp_device_property_deps="${omp_device_property_deps} 
> ${omp_device_property}"
>fi

Discovered/developed in context of my commit
b6a0ae1d22c9675f4374c2cb2b5c0833bb1461f1 "[gcn] Fix
'omp-device-properties-gcn' handling",
<87ees6brhe.fsf@euler.schwinge.homeip.net">http://mid.mail-archive.com/87ees6brhe.fsf@euler.schwinge.homeip.net>,
I've now as obvious pushed to master branch commit
d20219b5ab26cd003ae78ac9f82be337fe4df261 "Harden and adjust
'gcc/configure' parsing of '--enable-offload-targets'".

Not sure why you'd diverged here, and in particular omitted the error
case, which right away would've caught this 'amdgcn' vs. 'gcn' problem
introduced by Tobias' commit 955cd057454b323419e06affa7df7d59dc3cd1fb
(r277797) "Add gcc/config/gcn/t-omp-device for OpenMP declare variant
kind/arch/isa".


Grüße
 Thomas


-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter
>From d20219b5ab26cd003ae78ac9f82be337fe4df261 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Thu, 23 Apr 2020 21:59:07 +0200
Subject: [PATCH] Harden and adjust 'gcc/configure' parsing of
 '--enable-offload-targets'

Fix-up for commit d228ee80f8578be474595a517a228111fac26c5e "re PR
bootstrap/92314 (missing omp-device-properties', needed by
's-omp-device-properties-h')".

	gcc/
	* configure.ac <$enable_offload_targets>: Do parsing as done
	elsewhere.
	* configure: Regenerate.
---
 gcc/ChangeLog|  4 
 gcc/configure| 13 -
 gcc/configure.ac |  9 ++---
 3 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 85d1c2b6f758..3df46e498f57 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,9 @@
 2020-04-29  Thomas Schwinge  
 
+	* configure.ac <$enable_offload_targets>: Do parsing as done
+	elsewhere.
+	* configure: Regenerate.
+
 	* configure.ac <$enable_offload_targets>: 'amdgcn' is 'gcn'.
 	* configure: Regenerate.
 
diff --git a/gcc/configure b/gcc/configure
index 83101072aea0..c7bf5d1fdc6b 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -7920,18 +7920,21 @@ for tgt in `echo $enable_offload_targets | sed 's/,/ /g'`; do
   else
 enable_offloading=1
 case "$tgt" in
-  *-intelmicemul-*)
+  *-intelmic-* | *-intelmicemul-*)
 	omp_device_property=omp-device-properties-i386
 	omp_device_property_tmake_file="${omp_device_property_tmake_file} \$(srcdir)/config/i386/t-omp-device"
 	;;
-  amdgcn*-*)
+  amdgcn*)
 	omp_device_property=omp-device-properties-gcn
 	omp_device_property_tmake_file="${omp_device_property_tmake_file} \$(srcdir)/config/gcn/t-omp-device"
 	;;
-  nvptx*-*)
+  nvptx*)
 	omp_device_property=omp-device-properties-nvptx
 	omp_device_property_tmake_file="${omp_device_property_tmake_file} \$(srcdir)/config/nvptx/t-omp-device"
 	;;
+  *)
+	as_fn_error $? "unknown offload target specified" "$LINENO" 5
+	;;
 esac
 omp_device_properties="${omp_device_properties} ${tgt}=${omp_device_property}"
 omp_device_property_deps="${omp_device_property_deps} ${omp_device_property}"
@@ -18985,7 +18988,7 @@ el

Re: [PATCH] Add OpenACC 2.6 `acc_get_property' support

2020-04-29 Thread Thomas Schwinge
Hi!

On 2019-12-17T00:00:04+0100, I wrote:
> On 2019-11-14T16:35:31+0100, Frederik Harwath  
> wrote:
>> this patch implements OpenACC 2.6 "acc_get_property" and related functions.

> As I mentioned before ("thinking aloud"):
>
> | [...] 'acc_device_current' [is] relevant only for
> | 'acc_get_property', to return "the value of the property for the current
> | device".  This [now has a] special (negative?) value
> | [...], so that when additional real device types are added
> | later on, we can just add them with increasing numbers, and keep the
> | scanning code simple.

> Should this actually get value '-1' instead of '-3'?  Or, is the OpenACC
> 'acc_device_t' code already paying special attention to negative values
> '-1', '-2'?  (I don't think so.)

Now pushed this change to master branch in commit
a5d0bc12e1bfa956941cd9c49d5b978256bd11ec "[OpenACC] Set
'acc_device_current = -1'", see attached.


Grüße
 Thomas


-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter
>From a5d0bc12e1bfa956941cd9c49d5b978256bd11ec Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Wed, 29 Apr 2020 08:12:36 +0200
Subject: [PATCH] [OpenACC] Set 'acc_device_current = -1'

There's no point in using value '-3', and even though not directly related,
value '-1' does match 'GOMP_DEVICE_ICV'.

	libgomp/
	* config/accel/openacc.f90 (acc_device_current): Set to '-1'.
	* openacc.f90 (acc_device_current): Likewise.
	* openacc.h (acc_device_current): Likewise.
	* openacc_lib.h (acc_device_current): Likewise.
---
 libgomp/ChangeLog| 5 +
 libgomp/config/accel/openacc.f90 | 2 +-
 libgomp/openacc.f90  | 2 +-
 libgomp/openacc.h| 2 +-
 libgomp/openacc_lib.h| 2 +-
 5 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/libgomp/ChangeLog b/libgomp/ChangeLog
index 1a7046f2fc64..b6828adcbe3d 100644
--- a/libgomp/ChangeLog
+++ b/libgomp/ChangeLog
@@ -1,5 +1,10 @@
 2020-04-29  Thomas Schwinge  
 
+	* config/accel/openacc.f90 (acc_device_current): Set to '-1'.
+	* openacc.f90 (acc_device_current): Likewise.
+	* openacc.h (acc_device_current): Likewise.
+	* openacc_lib.h (acc_device_current): Likewise.
+
 	PR target/94282
 	* testsuite/libgomp.c-c++-common/function-not-offloaded.c: Remove
 	'dg-allow-blank-lines-in-output'.
diff --git a/libgomp/config/accel/openacc.f90 b/libgomp/config/accel/openacc.f90
index 275afe43475c..99330733d8f8 100644
--- a/libgomp/config/accel/openacc.f90
+++ b/libgomp/config/accel/openacc.f90
@@ -44,7 +44,7 @@ module openacc_kinds
   integer, parameter :: acc_device_kind = int32
 
   ! Keep in sync with include/gomp-constants.h.
-  integer (acc_device_kind), parameter :: acc_device_current = -3
+  integer (acc_device_kind), parameter :: acc_device_current = -1
   integer (acc_device_kind), parameter :: acc_device_none = 0
   integer (acc_device_kind), parameter :: acc_device_default = 1
   integer (acc_device_kind), parameter :: acc_device_host = 2
diff --git a/libgomp/openacc.f90 b/libgomp/openacc.f90
index 467fb612c548..111705d0fb60 100644
--- a/libgomp/openacc.f90
+++ b/libgomp/openacc.f90
@@ -41,7 +41,7 @@ module openacc_kinds
   integer, parameter :: acc_device_kind = int32
 
   ! Keep in sync with include/gomp-constants.h.
-  integer (acc_device_kind), parameter :: acc_device_current = -3
+  integer (acc_device_kind), parameter :: acc_device_current = -1
   integer (acc_device_kind), parameter :: acc_device_none = 0
   integer (acc_device_kind), parameter :: acc_device_default = 1
   integer (acc_device_kind), parameter :: acc_device_host = 2
diff --git a/libgomp/openacc.h b/libgomp/openacc.h
index 617364634748..1dc471f62bc7 100644
--- a/libgomp/openacc.h
+++ b/libgomp/openacc.h
@@ -49,7 +49,7 @@ extern "C" {
 /* Types */
 typedef enum acc_device_t {
   /* Keep in sync with include/gomp-constants.h.  */
-  acc_device_current = -3,
+  acc_device_current = -1,
   acc_device_none = 0,
   acc_device_default = 1,
   acc_device_host = 2,
diff --git a/libgomp/openacc_lib.h b/libgomp/openacc_lib.h
index ee08e9787cc9..82a3735b1063 100644
--- a/libgomp/openacc_lib.h
+++ b/libgomp/openacc_lib.h
@@ -37,7 +37,7 @@
   integer, parameter :: acc_device_kind = 4
 
 ! Keep in sync with include/gomp-constants.h.
-  integer (acc_device_kind), parameter :: acc_device_current = -3
+  integer (acc_device_kind), parameter :: acc_device_current = -1
   integer (acc_device_kind), parameter :: acc_device_none = 0
   integer (acc_device_kind), parameter :: acc_device_default = 1
   integer (acc_device_kind), parameter :: acc_device_host = 2
-- 
2.26.2



[committed] testsuite: Save dg-do-what-default in mve.exp

2020-04-29 Thread Richard Sandiford
mve.exp changed the default dg-do action to "assemble", but then
left it like that for later exp files.  This meant that in a
two-multilib test run, the first arm.exp run would have a default
of "dg-do compile" and the second would have a default of
"dg-do assemble".

Tested on arm-linux-gnueabihf and armeb-eabi.  Pushed as obvious.

Richard


2020-04-29  Richard Sandiford  

gcc/testsuite/
* g++.target/arm/mve.exp: Restore the original dg-do-what-default
before finishing.
---
 gcc/testsuite/g++.target/arm/mve.exp | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/testsuite/g++.target/arm/mve.exp 
b/gcc/testsuite/g++.target/arm/mve.exp
index 08f8d4d87f6..e5b4b65eb57 100644
--- a/gcc/testsuite/g++.target/arm/mve.exp
+++ b/gcc/testsuite/g++.target/arm/mve.exp
@@ -35,6 +35,7 @@ global dg_runtest_extra_prunes
 set dg_runtest_extra_prunes ""
 lappend dg_runtest_extra_prunes "warning: switch -m(cpu|arch)=.* conflicts 
with -m(cpu|arch)=.* switch"
 
+set save-dg-do-what-default ${dg-do-what-default}
 set dg-do-what-default "assemble"
 
 # Initialize `dg'.
@@ -45,5 +46,6 @@ dg-runtest [lsort [glob -nocomplain 
$srcdir/$subdir/../../gcc.target/arm/mve/int
"" $DEFAULT_CXXFLAGS
 
 # All done.
+set dg-do-what-default ${save-dg-do-what-default}
 set dg_runtest_extra_prunes ""
 dg-finish


[wwwdocs] Correct status and macro for P1907R1

2020-04-29 Thread Jonathan Wakely via Gcc-patches
As noted in PR c++/94765 floating point types are not supported as
non-type template args.

Marek added the P1907R1 paper in commit d59a823fb but didn't give it a
distinct entries for the "Supported in GCC?" and "SD-6 Feature Test"
columns. This keeps it grouped with P0732R2 but in its own row.

The table uses the nth-child CSS pseudo-class for formatting, but
rowspan attributes mess that up by altering the number of children in
the row. This patches uses (non-displayed)  elements to
preserve the ordering for nth-child, so that the table cells are aligned
consistently. Those elements have no other effect, and no effect at all
for non-graphical browsers that ignore the CSS formatting.

OK for wwwdocs?


commit 51e56b08d72034ec904ee640919fc0906e78076f
Author: Jonathan Wakely 
Date:   Wed Apr 29 10:44:55 2020 +0100

Correct status and macro for P1907R1

As noted in PR c++/94765 floating point types are not supported as
non-type template args.

Marek added the P1907R1 paper in commit d59a823fb but didn't give it a
distinct entries for the "Supported in GCC?" and "SD-6 Feature Test"
columns. This keeps it grouped with P0732R2 but in its own row.

The table uses the nth-child CSS pseudo-class for formatting, but
rowspan attributes mess that up by altering the number of children in
the row. This patches uses (non-displayed)  elements to
preserve the ordering for nth-child, so that the table cells are aligned
consistently. Those elements have no other effect, and no effect at all
for non-graphical browsers that ignore the CSS formatting.

diff --git a/htdocs/projects/cxx-status.html b/htdocs/projects/cxx-status.html
index 1a2633ef..3eca8049 100644
--- a/htdocs/projects/cxx-status.html
+++ b/htdocs/projects/cxx-status.html
@@ -293,13 +293,21 @@

 
 
-   Class Types in Non-Type Template Parameters 
-  https://wg21.link/p0732r2";>P0732R2
-   
-   https://wg21.link/p1907r1";>P1907R1
+   Class Types in Non-Type Template Parameters 
+  https://wg21.link/p0732r2";>P0732R2
9 
__cpp_nontype_template_parameter_class >= 201806 
 
+
+  
+  
+  https://wg21.link/p1907r1";>P1907R1
+  
+9
+(partial, no floating point template args)
+  
+   __cpp_nontype_template_args >= 201911 
+
 
Atomic Compare-and-Exchange with Padding Bits 
   https://wg21.link/p0528r3";>P0528R3
@@ -383,6 +391,7 @@

 
 
+  
   https://wg21.link/p1331r2";>P1331R2
   10
__cpp_constexpr >= 201907 


[committed] aarch64: Fix parameter passing for [[no_unique_address]]

2020-04-29 Thread Richard Sandiford
This patch makes the ABI code ignore zero-sized [[no_unique_address]]
fields when deciding whether something is a HFA or HVA.

As things stood, we'd get two sets of -Wpsabi warnings, one when
trying to decide whether something was an SVE function, and another
when actually processing the function definition or function call.
The patch therefore makes aapcs_vfp_sub_candidate honour the
CUMULATIVE_ARGS "silent_p" flag where applicable.

This doesn't stop all duplicate warnings for parameters, and I suspect
we'll get duplicate warnings for return values too, but it should be
better than nothing.

Clang produces equivalent code for all tests except unions Q and R,
which clang passes in s0 rather than w0.  I think this is a clang bug,
since the "x" field contains a single-byte FDT and so shouldn't be
ignored.

Tested on aarch64-linux-gnu and aarch64_be-elf.  Pushed.

Richard


2020-04-29  Richard Sandiford  

gcc/
* config/aarch64/aarch64.c (aarch64_function_arg_alignment): Add a
comment explaining why we consider even zero-sized fields.
(WARN_PSABI_EMPTY_CXX17_BASE): New constant.
(WARN_PSABI_NO_UNIQUE_ADDRESS): Likewise.
(aapcs_vfp_sub_candidate): Replace the boolean pointer parameter
avoid_cxx17_empty_base with a pointer to a bitmask.  Ignore fields
whose DECL_FIELD_ABI_IGNORED bit is set when determining whether
something actually is a HFA or HVA.  Record whether we see a
[[no_unique_address]] field that previous GCCs would not have
ignored in this way.
(aarch64_vfp_is_call_or_return_candidate): Add a parameter to say
whether diagnostics should be suppressed.  Update the calls to
aapcs_vfp_sub_candidate and report a -Wpsabi warning for the
[[no_unique_address]] case.
(aarch64_return_in_msb): Update call accordingly, never silencing
diagnostics.
(aarch64_function_value): Likewise.
(aarch64_return_in_memory_1): Likewise.
(aarch64_init_cumulative_args): Likewise.
(aarch64_gimplify_va_arg_expr): Likewise.
(aarch64_pass_by_reference_1): Take a CUMULATIVE_ARGS pointer and
use it to decide whether arch64_vfp_is_call_or_return_candidate
should be silent.
(aarch64_pass_by_reference): Update calls accordingly.
(aarch64_vfp_is_call_candidate): Use the CUMULATIVE_ARGS argument
to decide whether arch64_vfp_is_call_or_return_candidate should be
silent.

gcc/testsuite/
* g++.target/aarch64/no_unique_address_1.C: New test.
* g++.target/aarch64/no_unique_address_2.C: Likewise.
---
 gcc/config/aarch64/aarch64.c  | 159 +-
 .../g++.target/aarch64/no_unique_address_1.C  | 206 ++
 .../g++.target/aarch64/no_unique_address_2.C  | 206 ++
 3 files changed, 518 insertions(+), 53 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/aarch64/no_unique_address_1.C
 create mode 100644 gcc/testsuite/g++.target/aarch64/no_unique_address_2.C

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 5316350a9da..e996fd12042 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -286,7 +286,7 @@ static bool aarch64_return_in_memory_1 (const_tree);
 static bool aarch64_vfp_is_call_or_return_candidate (machine_mode,
 const_tree,
 machine_mode *, int *,
-bool *);
+bool *, bool);
 static void aarch64_elf_asm_constructor (rtx, int) ATTRIBUTE_UNUSED;
 static void aarch64_elf_asm_destructor (rtx, int) ATTRIBUTE_UNUSED;
 static void aarch64_override_options_after_change (void);
@@ -5369,7 +5369,8 @@ aarch64_function_ok_for_sibcall (tree, tree exp)
passed in SVE registers.  */
 
 static bool
-aarch64_pass_by_reference_1 (const function_arg_info &arg)
+aarch64_pass_by_reference_1 (CUMULATIVE_ARGS *pcum,
+const function_arg_info &arg)
 {
   HOST_WIDE_INT size;
   machine_mode dummymode;
@@ -5393,8 +5394,8 @@ aarch64_pass_by_reference_1 (const function_arg_info &arg)
 
   /* Can this be a candidate to be passed in fp/simd register(s)?  */
   if (aarch64_vfp_is_call_or_return_candidate (arg.mode, arg.type,
-  &dummymode, &nregs,
-  NULL))
+  &dummymode, &nregs, NULL,
+  !pcum || pcum->silent_p))
 return false;
 
   /* Arguments which are variable sized or larger than 2 registers are
@@ -5412,7 +5413,7 @@ aarch64_pass_by_reference (cumulative_args_t pcum_v,
   CUMULATIVE_ARGS *pcum = get_cumulative_args (pcum_v);
 
   if (!arg.type)
-return aarch64_pass_by_reference_1 (arg);
+return aarch64_pass_by_r

Re: [Patch, Fortran] OpenMP/OpenACC – fix more issues with OPTIONAL

2020-04-29 Thread Thomas Schwinge
Hi Tobias!

Do you happen to have any update regarding this one:

On 2020-01-08T09:55:06+0100, Tobias Burnus  wrote:
> On 1/8/20 9:33 AM, Thomas Schwinge wrote:
>> With 'dg-do run' added, on [...] x86_64-pc-linux-gnu with offloading I'm 
>> seeing:

| PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O0  (test for 
excess errors)
| PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O0  execution test
| PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O1  (test for 
excess errors)
| PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O1  execution test
| FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O2  (test for 
excess errors)
| UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -O2  
compilation failed to produce executable
| FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 
-fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  
(test for excess errors)
| UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 
-fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  
compilation failed to produce executable
| FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -g  (test for 
excess errors)
| UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -g  
compilation failed to produce executable
| FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -Os  (test for 
excess errors)
| UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -Os  
compilation failed to produce executable

>> ... due to:
>>  /tmp/cciVc43I.o:(.gnu.offload_vars+0x10): undefined reference to 
>> `A.12.4064'
>> which may be something like PR90779, PR85063, PR84592, PR90779,
>> PR80411, PR71536 -- or something else.
>
> Hmm. It is surely among the listed items, if all fails in the last item.
> Note that PR85063 is fixed and PR84592 a duplicate of PR90779 (which is
> listed twice). To through in another number it could also be a variant
> of PR 92029 to though in yet another number …


Grüße
 Thomas
-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter


[PATCH] arm: Fix parameter passing for [[no_unique_address]]

2020-04-29 Thread Richard Sandiford
This patch makes the ABI code ignore zero-sized [[no_unique_address]]
fields when deciding whether something is a HFA or HVA.

For the tests, I wanted an -march setting that was stable enough
to use check-function-bodies and also wanted to force -mfloat-abi=hard.
I couldn't see any existing way of doing both together, since most
arm-related effective-target keywords are agnostic about the choice
between -mfloat-abi=softfp and -mfloat-abi=hard.  I therefore added
a new effective-target keyword for this combination.

I used the arm_arch_* framework for the effective-target rather than
writing a new set of custom Tcl routines.  This has the nice property
of separating the "compile and assemble" cases from the "link and run"
cases.  I only need compilation to work for the new tests, so requiring
linking to work would be an unnecessary restriction.

However, including an ABI requirement is arguably stretching what the
list was originally intended to handle.  The name arm_arch_v8a_hard
doesn't fit very naturally with some of the NEON-based tests.
On the other hand, the naming convention isn't entirely consistent,
so any choice would be inconsistent with something.

Like with the aarch64 patch, clang produces equivalent code for all
tests except unions Q and R, which clang passes in s0 rather than r0.
I think this is a clang bug, since the "x" field contains a single-byte
FDT and so shouldn't be ignored.

Tested on arm-linux-gnueabihf and armeb-eabi.  OK to install?

Richard


2020-04-29  Richard Sandiford  

gcc/
* doc/sourcebuild.texi (arm_arch_v8a_hard_ok): Document new
effective-target keyword.
(arm_arch_v8a_hard_multilib): Likewise.
(arm_arch_v8a_hard): Document new dg-add-options keyword.
* config/arm/arm.c (arm_return_in_memory): Note that the APCS
code is deprecated and has not been updated to handle
DECL_FIELD_ABI_IGNORED.
(WARN_PSABI_EMPTY_CXX17_BASE): New constant.
(WARN_PSABI_NO_UNIQUE_ADDRESS): Likewise.
(aapcs_vfp_sub_candidate): Replace the boolean pointer parameter
avoid_cxx17_empty_base with a pointer to a bitmask.  Ignore fields
whose DECL_FIELD_ABI_IGNORED bit is set when determining whether
something actually is a HFA or HVA.  Record whether we see a
[[no_unique_address]] field that previous GCCs would not have
ignored in this way.
(aapcs_vfp_is_call_or_return_candidate): Update the calls to
aapcs_vfp_sub_candidate and report a -Wpsabi warning for the
[[no_unique_address]] case.  Use TYPE_MAIN_VARIANT in the
diagnostic messages.
(arm_needs_doubleword_align): Add a comment explaining why we
consider even zero-sized fields.

gcc/testsuite/
* lib/target-supports.exp: Add v8a_hard to the list of arm_arch_*
targets.
* g++.target/arm/no_unique_address_1.C: New test.
* g++.target/arm/no_unique_address_2.C: Likewise.
---
 gcc/config/arm/arm.c  | 108 +++---
 gcc/doc/sourcebuild.texi  |  15 ++
 .../g++.target/arm/no_unique_address_1.C  | 201 ++
 .../g++.target/arm/no_unique_address_2.C  | 201 ++
 gcc/testsuite/lib/target-supports.exp |   1 +
 5 files changed, 498 insertions(+), 28 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/arm/no_unique_address_1.C
 create mode 100644 gcc/testsuite/g++.target/arm/no_unique_address_2.C

diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index b69612024be..d8da77d5ba4 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -1829,6 +1829,16 @@ Some multilibs may be incompatible with these options.
 ARM target supports @code{-mfpu=vfp3 -mfloat-abi=softfp}.
 Some multilibs may be incompatible with these options.
 
+@item arm_arch_v8a_hard_ok
+The compiler is targeting @code{arm*-*-*} and can compile and assemble code
+using the options @code{-march=armv8-a -mfpu=neon-fp-armv8 -mfloat-abi=hard}.
+This is not enough to guarantee that linking works.
+
+@item arm_arch_v8a_hard_multilib
+The compiler is targeting @code{arm*-*-*} and can build programs using
+the options @code{-march=armv8-a -mfpu=neon-fp-armv8 -mfloat-abi=hard}.
+The target can also run the resulting binaries.
+
 @item arm_v8_vfp_ok
 ARM target supports @code{-mfpu=fp-armv8 -mfloat-abi=softfp}.
 Some multilibs may be incompatible with these options.
@@ -2586,6 +2596,11 @@ the @ref{arm_neon_fp16_ok,,arm_neon_fp16_ok effective 
target keyword}.
 arm vfp3 floating point support; see
 the @ref{arm_vfp3_ok,,arm_vfp3_ok effective target keyword}.
 
+@item arm_arch_v8a_hard
+Add options for ARMv8-A and the hard-float variant of the AAPCS,
+if this is supported by the compiler; see the
+@ref{arm_arch_v8a_hard_ok,,arm_arch_v8a_hard_ok} effective target keyword.
+
 @item arm_v8_1a_neon
 Add options for ARMv8.1-A with Adv.SIMD support, if this is supported
 by the target; see the @ref{arm_

Re: [committed] aarch64: Fix parameter passing for [[no_unique_address]]

2020-04-29 Thread Jakub Jelinek via Gcc-patches
On Wed, Apr 29, 2020 at 10:57:24AM +0100, Richard Sandiford wrote:
> This patch makes the ABI code ignore zero-sized [[no_unique_address]]
> fields when deciding whether something is a HFA or HVA.

As you use cxx17_empty_base_field_p, can you please remove the FIXME
in calls.h because it then can't go away, and perhaps update it to
test DECL_ARTIFICIAL (field) instead (or in addition to no attribute,
with with DECL_ARTIFICIAL first)?

Thanks.

Jakub



[PATCH] lto/94822 - fix ICE in component_ref_size

2020-04-29 Thread Richard Biener


Bootstrapped/tested on x86_64-unknown-linux-gnu, pushed.

This ICE appears because gcc will stream it to the function_body section
when processing the variable with the initial value of the constructor
type, and the error_mark_node to the decls section.
When recompiling, the value obtained with DECL_INITIAL will be error_mark.

2020-04-29  Richard Biener  
Li Zekun  

PR lto/94822
* tree.c (component_ref_size): Guard against error_mark_node
DECL_INITIAL as it happens with LTO.

* gcc.dg/lto/pr94822_0.c: New testcase.
* gcc.dg/lto/pr94822_1.c: Alternate file.
* gcc.dg/lto/pr94822.h: Likewise.
---
 gcc/testsuite/gcc.dg/lto/pr94822.h   |  4 
 gcc/testsuite/gcc.dg/lto/pr94822_0.c | 10 
 gcc/testsuite/gcc.dg/lto/pr94822_1.c |  6 +
 gcc/tree.c   | 35 ++--
 4 files changed, 38 insertions(+), 17 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/lto/pr94822.h
 create mode 100644 gcc/testsuite/gcc.dg/lto/pr94822_0.c
 create mode 100644 gcc/testsuite/gcc.dg/lto/pr94822_1.c

diff --git a/gcc/testsuite/gcc.dg/lto/pr94822.h 
b/gcc/testsuite/gcc.dg/lto/pr94822.h
new file mode 100644
index 000..d9e6c3da645
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/pr94822.h
@@ -0,0 +1,4 @@
+typedef struct {
+  int i;
+  int ints[];
+} struct_t;
diff --git a/gcc/testsuite/gcc.dg/lto/pr94822_0.c 
b/gcc/testsuite/gcc.dg/lto/pr94822_0.c
new file mode 100644
index 000..698c0928a81
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/pr94822_0.c
@@ -0,0 +1,10 @@
+/* { dg-lto-do link } */
+
+#include "pr94822.h"
+
+extern struct_t my_struct;
+
+int main() {
+ return my_struct.ints[1];
+}
+
diff --git a/gcc/testsuite/gcc.dg/lto/pr94822_1.c 
b/gcc/testsuite/gcc.dg/lto/pr94822_1.c
new file mode 100644
index 000..a7ace71680f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/pr94822_1.c
@@ -0,0 +1,6 @@
+#include "pr94822.h"
+
+struct_t my_struct = {
+ 20,
+ { 1, 2 }
+};
diff --git a/gcc/tree.c b/gcc/tree.c
index e28b29580ca..e451401822c 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -13723,24 +13723,25 @@ component_ref_size (tree ref, bool 
*interior_zero_length /* = NULL */)
 /* MEMBER is a true flexible array member.  Compute its size from
the initializer of the BASE object if it has one.  */
 if (tree init = DECL_P (base) ? DECL_INITIAL (base) : NULL_TREE)
-  {
-   init = get_initializer_for (init, member);
-   if (init)
- {
-   memsize = TYPE_SIZE_UNIT (TREE_TYPE (init));
-   if (tree refsize = TYPE_SIZE_UNIT (reftype))
- {
-   /* Use the larger of the initializer size and the tail
-  padding in the enclosing struct.  */
-   poly_int64 rsz = tree_to_poly_int64 (refsize);
-   rsz -= baseoff;
-   if (known_lt (tree_to_poly_int64 (memsize), rsz))
- memsize = wide_int_to_tree (TREE_TYPE (memsize), rsz);
- }
+  if (init != error_mark_node)
+   {
+ init = get_initializer_for (init, member);
+ if (init)
+   {
+ memsize = TYPE_SIZE_UNIT (TREE_TYPE (init));
+ if (tree refsize = TYPE_SIZE_UNIT (reftype))
+   {
+ /* Use the larger of the initializer size and the tail
+padding in the enclosing struct.  */
+ poly_int64 rsz = tree_to_poly_int64 (refsize);
+ rsz -= baseoff;
+ if (known_lt (tree_to_poly_int64 (memsize), rsz))
+   memsize = wide_int_to_tree (TREE_TYPE (memsize), rsz);
+   }
 
-   baseoff = 0;
- }
-  }
+ baseoff = 0;
+   }
+   }
 
   if (!memsize)
 {
-- 
2.25.1


[PATCH] arm: Extend the PR94780 fix to arm

2020-04-29 Thread Richard Sandiford
Essentially the same fix as for x86.

Tested on arm-linux-gnueabihf and armeb-eabi.  Bordering on the obvious
I guess, but OK to install?

Richard


2020-04-29  Richard Sandiford  

gcc/
* config/arm/arm-builtins.c (arm_atomic_assign_expand_fenv): Use
TARGET_EXPR instead of MODIFY_EXPR for the first assignments to
fenv_var and new_fenv_var.
---
 gcc/config/arm/arm-builtins.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index aee3fd6e2ff..f64742e6447 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -4167,8 +4167,9 @@ arm_atomic_assign_expand_fenv (tree *hold, tree *clear, 
tree *update)
   mask = build_int_cst (unsigned_type_node,
~((ARM_FE_ALL_EXCEPT << ARM_FE_EXCEPT_SHIFT)
  | ARM_FE_ALL_EXCEPT));
-  ld_fenv = build2 (MODIFY_EXPR, unsigned_type_node,
-   fenv_var, build_call_expr (get_fpscr, 0));
+  ld_fenv = build4 (TARGET_EXPR, unsigned_type_node,
+   fenv_var, build_call_expr (get_fpscr, 0),
+   NULL_TREE, NULL_TREE);
   masked_fenv = build2 (BIT_AND_EXPR, unsigned_type_node, fenv_var, mask);
   hold_fnclex = build_call_expr (set_fpscr, 1, masked_fenv);
   *hold = build2 (COMPOUND_EXPR, void_type_node,
@@ -4189,8 +4190,8 @@ arm_atomic_assign_expand_fenv (tree *hold, tree *clear, 
tree *update)
__atomic_feraiseexcept (new_fenv_var);  */
 
   new_fenv_var = create_tmp_var_raw (unsigned_type_node);
-  reload_fenv = build2 (MODIFY_EXPR, unsigned_type_node, new_fenv_var,
-   build_call_expr (get_fpscr, 0));
+  reload_fenv = build4 (TARGET_EXPR, unsigned_type_node, new_fenv_var,
+   build_call_expr (get_fpscr, 0), NULL_TREE, NULL_TREE);
   restore_fnenv = build_call_expr (set_fpscr, 1, fenv_var);
   atomic_feraiseexcept = builtin_decl_implicit (BUILT_IN_ATOMIC_FERAISEEXCEPT);
   update_call = build_call_expr (atomic_feraiseexcept, 1,


Re: [PATCH] arm: Extend the PR94780 fix to arm

2020-04-29 Thread Ramana Radhakrishnan via Gcc-patches
On Wed, Apr 29, 2020 at 11:30 AM Richard Sandiford
 wrote:
>
> Essentially the same fix as for x86.
>
> Tested on arm-linux-gnueabihf and armeb-eabi.  Bordering on the obvious
> I guess, but OK to install?
>
> Richard
>

Ok.

Ramana

>
> 2020-04-29  Richard Sandiford  
>
> gcc/
> * config/arm/arm-builtins.c (arm_atomic_assign_expand_fenv): Use
> TARGET_EXPR instead of MODIFY_EXPR for the first assignments to
> fenv_var and new_fenv_var.
> ---
>  gcc/config/arm/arm-builtins.c | 9 +
>  1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
> index aee3fd6e2ff..f64742e6447 100644
> --- a/gcc/config/arm/arm-builtins.c
> +++ b/gcc/config/arm/arm-builtins.c
> @@ -4167,8 +4167,9 @@ arm_atomic_assign_expand_fenv (tree *hold, tree *clear, 
> tree *update)
>mask = build_int_cst (unsigned_type_node,
> ~((ARM_FE_ALL_EXCEPT << ARM_FE_EXCEPT_SHIFT)
>   | ARM_FE_ALL_EXCEPT));
> -  ld_fenv = build2 (MODIFY_EXPR, unsigned_type_node,
> -   fenv_var, build_call_expr (get_fpscr, 0));
> +  ld_fenv = build4 (TARGET_EXPR, unsigned_type_node,
> +   fenv_var, build_call_expr (get_fpscr, 0),
> +   NULL_TREE, NULL_TREE);
>masked_fenv = build2 (BIT_AND_EXPR, unsigned_type_node, fenv_var, mask);
>hold_fnclex = build_call_expr (set_fpscr, 1, masked_fenv);
>*hold = build2 (COMPOUND_EXPR, void_type_node,
> @@ -4189,8 +4190,8 @@ arm_atomic_assign_expand_fenv (tree *hold, tree *clear, 
> tree *update)
> __atomic_feraiseexcept (new_fenv_var);  */
>
>new_fenv_var = create_tmp_var_raw (unsigned_type_node);
> -  reload_fenv = build2 (MODIFY_EXPR, unsigned_type_node, new_fenv_var,
> -   build_call_expr (get_fpscr, 0));
> +  reload_fenv = build4 (TARGET_EXPR, unsigned_type_node, new_fenv_var,
> +   build_call_expr (get_fpscr, 0), NULL_TREE, NULL_TREE);
>restore_fnenv = build_call_expr (set_fpscr, 1, fenv_var);
>atomic_feraiseexcept = builtin_decl_implicit 
> (BUILT_IN_ATOMIC_FERAISEEXCEPT);
>update_call = build_call_expr (atomic_feraiseexcept, 1,


RE: [PATCH] arm: Fix parameter passing for [[no_unique_address]]

2020-04-29 Thread Kyrylo Tkachov
Hi Richard,

> -Original Message-
> From: Richard Sandiford 
> Sent: 29 April 2020 11:04
> To: gcc-patches@gcc.gnu.org
> Cc: ni...@redhat.com; Richard Earnshaw ;
> Ramana Radhakrishnan ; Kyrylo
> Tkachov 
> Subject: [PATCH] arm: Fix parameter passing for [[no_unique_address]]
> 
> This patch makes the ABI code ignore zero-sized [[no_unique_address]]
> fields when deciding whether something is a HFA or HVA.
> 
> For the tests, I wanted an -march setting that was stable enough
> to use check-function-bodies and also wanted to force -mfloat-abi=hard.
> I couldn't see any existing way of doing both together, since most
> arm-related effective-target keywords are agnostic about the choice
> between -mfloat-abi=softfp and -mfloat-abi=hard.  I therefore added
> a new effective-target keyword for this combination.
> 
> I used the arm_arch_* framework for the effective-target rather than
> writing a new set of custom Tcl routines.  This has the nice property
> of separating the "compile and assemble" cases from the "link and run"
> cases.  I only need compilation to work for the new tests, so requiring
> linking to work would be an unnecessary restriction.
> 
> However, including an ABI requirement is arguably stretching what the
> list was originally intended to handle.  The name arm_arch_v8a_hard
> doesn't fit very naturally with some of the NEON-based tests.
> On the other hand, the naming convention isn't entirely consistent,
> so any choice would be inconsistent with something.
> 
> Like with the aarch64 patch, clang produces equivalent code for all
> tests except unions Q and R, which clang passes in s0 rather than r0.
> I think this is a clang bug, since the "x" field contains a single-byte
> FDT and so shouldn't be ignored.

Worth reporting it?

> 
> Tested on arm-linux-gnueabihf and armeb-eabi.  OK to install?

Ok.
Thanks,
Kyrill

> 
> Richard
> 
> 
> 2020-04-29  Richard Sandiford  
> 
> gcc/
>   * doc/sourcebuild.texi (arm_arch_v8a_hard_ok): Document new
>   effective-target keyword.
>   (arm_arch_v8a_hard_multilib): Likewise.
>   (arm_arch_v8a_hard): Document new dg-add-options keyword.
>   * config/arm/arm.c (arm_return_in_memory): Note that the APCS
>   code is deprecated and has not been updated to handle
>   DECL_FIELD_ABI_IGNORED.
>   (WARN_PSABI_EMPTY_CXX17_BASE): New constant.
>   (WARN_PSABI_NO_UNIQUE_ADDRESS): Likewise.
>   (aapcs_vfp_sub_candidate): Replace the boolean pointer parameter
>   avoid_cxx17_empty_base with a pointer to a bitmask.  Ignore fields
>   whose DECL_FIELD_ABI_IGNORED bit is set when determining
> whether
>   something actually is a HFA or HVA.  Record whether we see a
>   [[no_unique_address]] field that previous GCCs would not have
>   ignored in this way.
>   (aapcs_vfp_is_call_or_return_candidate): Update the calls to
>   aapcs_vfp_sub_candidate and report a -Wpsabi warning for the
>   [[no_unique_address]] case.  Use TYPE_MAIN_VARIANT in the
>   diagnostic messages.
>   (arm_needs_doubleword_align): Add a comment explaining why we
>   consider even zero-sized fields.
> 
> gcc/testsuite/
>   * lib/target-supports.exp: Add v8a_hard to the list of arm_arch_*
>   targets.
>   * g++.target/arm/no_unique_address_1.C: New test.
>   * g++.target/arm/no_unique_address_2.C: Likewise.
> ---
>  gcc/config/arm/arm.c  | 108 +++---
>  gcc/doc/sourcebuild.texi  |  15 ++
>  .../g++.target/arm/no_unique_address_1.C  | 201 ++
>  .../g++.target/arm/no_unique_address_2.C  | 201 ++
>  gcc/testsuite/lib/target-supports.exp |   1 +
>  5 files changed, 498 insertions(+), 28 deletions(-)
>  create mode 100644 gcc/testsuite/g++.target/arm/no_unique_address_1.C
>  create mode 100644 gcc/testsuite/g++.target/arm/no_unique_address_2.C
> 
> diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
> index b69612024be..d8da77d5ba4 100644
> --- a/gcc/doc/sourcebuild.texi
> +++ b/gcc/doc/sourcebuild.texi
> @@ -1829,6 +1829,16 @@ Some multilibs may be incompatible with these
> options.
>  ARM target supports @code{-mfpu=vfp3 -mfloat-abi=softfp}.
>  Some multilibs may be incompatible with these options.
> 
> +@item arm_arch_v8a_hard_ok
> +The compiler is targeting @code{arm*-*-*} and can compile and assemble
> code
> +using the options @code{-march=armv8-a -mfpu=neon-fp-armv8 -mfloat-
> abi=hard}.
> +This is not enough to guarantee that linking works.
> +
> +@item arm_arch_v8a_hard_multilib
> +The compiler is targeting @code{arm*-*-*} and can build programs using
> +the options @code{-march=armv8-a -mfpu=neon-fp-armv8 -mfloat-
> abi=hard}.
> +The target can also run the resulting binaries.
> +
>  @item arm_v8_vfp_ok
>  ARM target supports @code{-mfpu=fp-armv8 -mfloat-abi=softfp}.
>  Some multilibs may be incompatible with these options.
> @@ -2586,6 +2596,11 @@ the
> @ref{arm_neon_fp16_ok,,arm_neo

Re: [committed] aarch64: Fix parameter passing for [[no_unique_address]]

2020-04-29 Thread Richard Sandiford
Jakub Jelinek  writes:
> On Wed, Apr 29, 2020 at 10:57:24AM +0100, Richard Sandiford wrote:
>> This patch makes the ABI code ignore zero-sized [[no_unique_address]]
>> fields when deciding whether something is a HFA or HVA.
>
> As you use cxx17_empty_base_field_p, can you please remove the FIXME
> in calls.h because it then can't go away, and perhaps update it to
> test DECL_ARTIFICIAL (field) instead (or in addition to no attribute,
> with with DECL_ARTIFICIAL first)?

Sure, how's this?  I added back the RECORD_OR_UNION_TYPE_P test too
for good measure :-)

Only lightly tested on aarch64-linux-gnu and aarch64_be-elf so far.

Richard


2020-04-29  Richard Sandiford  

gcc/
* calls.h (cxx17_empty_base_field_p): Turn into a function declaration.
* calls.c (cxx17_empty_base_field_p): New function.  Check
DECL_ARTIFICIAL and RECORD_OR_UNION_TYPE_P in addition to the
previous checks.
---
 gcc/calls.c | 15 +++
 gcc/calls.h |  5 +
 2 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/gcc/calls.h b/gcc/calls.h
index e1c944efbb6..4ee49360777 100644
--- a/gcc/calls.h
+++ b/gcc/calls.h
@@ -135,9 +135,6 @@ extern tree get_attr_nonstring_decl (tree, tree * = NULL);
 extern void maybe_warn_nonstring_arg (tree, tree);
 extern bool get_size_range (tree, tree[2], bool = false);
 extern rtx rtx_for_static_chain (const_tree, bool);
-/* FIXME: Remove after all backends are converted.  */
-#define cxx17_empty_base_field_p(t) \
-  (DECL_FIELD_ABI_IGNORED (t)  \
-   && !lookup_attribute ("no_unique_address", DECL_ATTRIBUTES (t)))
+extern bool cxx17_empty_base_field_p (const_tree);
 
 #endif // GCC_CALLS_H
diff --git a/gcc/calls.c b/gcc/calls.c
index 5bd922779af..8041388c1d2 100644
--- a/gcc/calls.c
+++ b/gcc/calls.c
@@ -6261,5 +6261,20 @@ must_pass_va_arg_in_stack (tree type)
   return targetm.calls.must_pass_in_stack (arg);
 }
 
+/* Return true if FIELD is the C++17 empty base field that should
+   be ignored for ABI calling convention decisions in order to
+   maintain ABI compatibility between C++14 and earlier, which doesn't
+   add this FIELD to classes with empty bases, and C++17 and later
+   which does.  */
+
+bool
+cxx17_empty_base_field_p (const_tree field)
+{
+  return (DECL_FIELD_ABI_IGNORED (field)
+ && DECL_ARTIFICIAL (field)
+ && RECORD_OR_UNION_TYPE_P (TREE_TYPE (field))
+ && !lookup_attribute ("no_unique_address", DECL_ATTRIBUTES (field)));
+}
+
 /* Tell the garbage collector about GTY markers in this source file.  */
 #include "gt-calls.h"


[PATCH] coroutines: Improve error recovery [PR94817, PR94829].

2020-04-29 Thread Iain Sandoe
Hi,

When we have completely missing key information (e.g. the
coroutine_traits) or a partially transformed function body, we
need to try and balance returning useful information about
failures with the possibility that some part of the diagnostics
machinery or following code will not be able to handle the
state.

The PRs (and revised testcase) point to cases where that processing
has failed.

This revises the process to avoid special handling for the
ramp, and falls back on the same code used for regular function
fails.

There are test-cases (in addition to the ones for the PRs) that now
cover all early exit points [where the transforms are considered
to have failed in a manner that does not allow compilation to
continue].

tested on x86_64-darwin
OK for master after testing on x86-64-linux?
thanks
Iain

gcc/cp/ChangeLog:

2020-04-29  Iain Sandoe  

PR c++/94817
PR c++/94829
* coroutines.cc (morph_fn_to_coro): Set unformed outline
functions to error_mark_node.  For early error returns suppress
warnings about missing ramp return values.  Fix reinstatement
of the function body on pre-existing initial error.
* decl.c (finish_function): Use the normal error path for fails
in the ramp function, do not try to compile the helpers if the
transform fails.

gcc/testsuite/ChangeLog:

2020-04-29  Iain Sandoe  

PR c++/94817
PR c++/94829
* g++.dg/coroutines/coro-missing-final-suspend.C: New test.
* g++.dg/coroutines/coro-missing-initial-suspend.C: New test.
* g++.dg/coroutines/coro-missing-promise-yield.C: Check for
continuation of compilation.
* g++.dg/coroutines/coro-missing-promise.C: Likewise.
* g++.dg/coroutines/coro-missing-ret-value.C: Likewise
* g++.dg/coroutines/coro-missing-ret-void.C: Likewise
* g++.dg/coroutines/coro-missing-ueh-3.C: Likewise
* g++.dg/coroutines/pr94817.C: New test.
* g++.dg/coroutines/pr94829.C: New test.
---
 gcc/cp/coroutines.cc  | 31 +---
 gcc/cp/decl.c | 20 
 .../coroutines/coro-missing-final-suspend.C   | 19 +++
 .../coroutines/coro-missing-initial-suspend.C | 19 +++
 .../coroutines/coro-missing-promise-yield.C   |  6 +++
 .../g++.dg/coroutines/coro-missing-promise.C  |  6 +++
 .../coroutines/coro-missing-ret-value.C   |  6 +++
 .../g++.dg/coroutines/coro-missing-ret-void.C |  6 +++
 .../g++.dg/coroutines/coro-missing-ueh-3.C|  6 +++
 .../coroutines/coro1-ret-int-yield-int.h  |  6 +++
 gcc/testsuite/g++.dg/coroutines/pr94817.C | 10 
 gcc/testsuite/g++.dg/coroutines/pr94829.C | 49 +++
 12 files changed, 168 insertions(+), 16 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/coroutines/coro-missing-final-suspend.C
 create mode 100644 
gcc/testsuite/g++.dg/coroutines/coro-missing-initial-suspend.C
 create mode 100644 gcc/testsuite/g++.dg/coroutines/pr94817.C
 create mode 100644 gcc/testsuite/g++.dg/coroutines/pr94829.C

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index 7bb3e98fe6c..15aa57cd65c 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -3516,14 +3516,21 @@ morph_fn_to_coro (tree orig, tree *resumer, tree 
*destroyer)
 {
   gcc_checking_assert (orig && TREE_CODE (orig) == FUNCTION_DECL);
 
+  *resumer = error_mark_node;
+  *destroyer = error_mark_node;
   if (!coro_function_valid_p (orig))
-return false;
+{
+  /* For early errors, we do not want a diagnostic about the missing
+ramp return value, since the user cannot fix this - a 'return' is
+not allowed in a coroutine.  */
+  TREE_NO_WARNING (orig) = true;
+  return false;
+}
 
   /* We can't validly get here with an empty statement list, since there's no
  way for the FE to decide it's a coroutine in the absence of any code.  */
   tree fnbody = pop_stmt_list (DECL_SAVED_TREE (orig));
-  if (fnbody == NULL_TREE)
-return false;
+  gcc_checking_assert (fnbody != NULL_TREE);
 
   /* We don't have the locus of the opening brace - it's filled in later (and
  there doesn't really seem to be any easy way to get at it).
@@ -3537,7 +3544,9 @@ morph_fn_to_coro (tree orig, tree *resumer, tree 
*destroyer)
   if (body_start == NULL_TREE || body_start == error_mark_node)
 {
   DECL_SAVED_TREE (orig) = push_stmt_list ();
-  append_to_statement_list (DECL_SAVED_TREE (orig), &fnbody);
+  append_to_statement_list (fnbody, &DECL_SAVED_TREE (orig));
+  /* Suppress warnings about the missing return value.  */
+  TREE_NO_WARNING (orig) = true;
   return false;
 }
 
@@ -3606,13 +3615,21 @@ morph_fn_to_coro (tree orig, tree *resumer, tree 
*destroyer)
  the coroutine promise.  */
   tree initial_await = build_init_or_final_await (fn_start, false);
   if (initial_await == error_mark_node)
-return false;
+{
+  /* Suppress warnings about the m

Re: [committed] aarch64: Fix parameter passing for [[no_unique_address]]

2020-04-29 Thread Jakub Jelinek via Gcc-patches
On Wed, Apr 29, 2020 at 11:48:51AM +0100, Richard Sandiford wrote:
> Jakub Jelinek  writes:
> > On Wed, Apr 29, 2020 at 10:57:24AM +0100, Richard Sandiford wrote:
> >> This patch makes the ABI code ignore zero-sized [[no_unique_address]]
> >> fields when deciding whether something is a HFA or HVA.
> >
> > As you use cxx17_empty_base_field_p, can you please remove the FIXME
> > in calls.h because it then can't go away, and perhaps update it to
> > test DECL_ARTIFICIAL (field) instead (or in addition to no attribute,
> > with with DECL_ARTIFICIAL first)?
> 
> Sure, how's this?  I added back the RECORD_OR_UNION_TYPE_P test too
> for good measure :-)
> 
> Only lightly tested on aarch64-linux-gnu and aarch64_be-elf so far.
> 
> Richard
> 
> 
> 2020-04-29  Richard Sandiford  
> 
> gcc/
>   * calls.h (cxx17_empty_base_field_p): Turn into a function declaration.
>   * calls.c (cxx17_empty_base_field_p): New function.  Check
>   DECL_ARTIFICIAL and RECORD_OR_UNION_TYPE_P in addition to the
>   previous checks.

Ok, thanks.

Jakub



[PATCH] x86: Fix -O0 intrinsic *gather*/*scatter* macros [PR94832]

2020-04-29 Thread Jakub Jelinek via Gcc-patches
Hi!

As reported in the PR, while most intrinsic -O0 macro argument uses
are properly wrapped in ()s or used in context where having a complex
expression passed as the argument doesn't pose a problem (e.g. when
macro argument use is in between commas, or between ( and comma, or
between comma and ) etc.), especially the gather/scatter macros don't do
this and if one passes to some macro e.g. x + y as argument, the
corresponding inline function would do cast on the argument, but
the macro does (int) ARG, then it is (int) x + y rather than (int) (x + y).

The following patch fixes those issues in *gather/*scatter*; additionally,
the AVX2 macros were passing incorrect mask of e.g.
(__v2df)_mm_set1_pd((double)(long long int) -1)
which is IMHO equivalent to
(__v2df){-1.0, -1.0}
when it really wants to pass __v2df vector with all bits set.
I've used what the inline functions use for those cases.

Tested on x86_64-linux, ok for trunk if it passes full bootstrap/regtest
on x86_64-linux and i686-linux?

2020-04-29  Jakub Jelinek  

PR target/94832
* config/i386/avx2intrin.h (_mm_mask_i32gather_pd,
_mm256_mask_i32gather_pd, _mm_mask_i64gather_pd,
_mm256_mask_i64gather_pd, _mm_mask_i32gather_ps,
_mm256_mask_i32gather_ps, _mm_mask_i64gather_ps,
_mm256_mask_i64gather_ps, _mm_i32gather_epi64,
_mm_mask_i32gather_epi64, _mm256_i32gather_epi64,
_mm256_mask_i32gather_epi64, _mm_i64gather_epi64,
_mm_mask_i64gather_epi64, _mm256_i64gather_epi64,
_mm256_mask_i64gather_epi64, _mm_i32gather_epi32,
_mm_mask_i32gather_epi32, _mm256_i32gather_epi32,
_mm256_mask_i32gather_epi32, _mm_i64gather_epi32,
_mm_mask_i64gather_epi32, _mm256_i64gather_epi32,
_mm256_mask_i64gather_epi32): Surround macro parameter uses with
parens.
(_mm_i32gather_pd, _mm256_i32gather_pd, _mm_i64gather_pd,
_mm256_i64gather_pd, _mm_i32gather_ps, _mm256_i32gather_ps,
_mm_i64gather_ps, _mm256_i64gather_ps): Likewise.  Don't use
as mask vector containing -1.0 or -1.0f elts, but instead vector
with all bits set using _mm*_cmpeq_p? with zero operands.
* config/i386/avx512fintrin.h (_mm512_i32gather_ps,
_mm512_mask_i32gather_ps, _mm512_i32gather_pd,
_mm512_mask_i32gather_pd, _mm512_i64gather_ps,
_mm512_mask_i64gather_ps, _mm512_i64gather_pd,
_mm512_mask_i64gather_pd, _mm512_i32gather_epi32,
_mm512_mask_i32gather_epi32, _mm512_i32gather_epi64,
_mm512_mask_i32gather_epi64, _mm512_i64gather_epi32,
_mm512_mask_i64gather_epi32, _mm512_i64gather_epi64,
_mm512_mask_i64gather_epi64, _mm512_i32scatter_ps,
_mm512_mask_i32scatter_ps, _mm512_i32scatter_pd,
_mm512_mask_i32scatter_pd, _mm512_i64scatter_ps,
_mm512_mask_i64scatter_ps, _mm512_i64scatter_pd,
_mm512_mask_i64scatter_pd, _mm512_i32scatter_epi32,
_mm512_mask_i32scatter_epi32, _mm512_i32scatter_epi64,
_mm512_mask_i32scatter_epi64, _mm512_i64scatter_epi32,
_mm512_mask_i64scatter_epi32, _mm512_i64scatter_epi64,
_mm512_mask_i64scatter_epi64): Surround macro parameter uses with
parens.
* config/i386/avx512pfintrin.h (_mm512_prefetch_i32gather_pd,
_mm512_prefetch_i32gather_ps, _mm512_mask_prefetch_i32gather_pd,
_mm512_mask_prefetch_i32gather_ps, _mm512_prefetch_i64gather_pd,
_mm512_prefetch_i64gather_ps, _mm512_mask_prefetch_i64gather_pd,
_mm512_mask_prefetch_i64gather_ps, _mm512_prefetch_i32scatter_pd,
_mm512_prefetch_i32scatter_ps, _mm512_mask_prefetch_i32scatter_pd,
_mm512_mask_prefetch_i32scatter_ps, _mm512_prefetch_i64scatter_pd,
_mm512_prefetch_i64scatter_ps, _mm512_mask_prefetch_i64scatter_pd,
_mm512_mask_prefetch_i64scatter_ps): Likewise.
* config/i386/avx512vlintrin.h (_mm256_mmask_i32gather_ps,
_mm_mmask_i32gather_ps, _mm256_mmask_i32gather_pd,
_mm_mmask_i32gather_pd, _mm256_mmask_i64gather_ps,
_mm_mmask_i64gather_ps, _mm256_mmask_i64gather_pd,
_mm_mmask_i64gather_pd, _mm256_mmask_i32gather_epi32,
_mm_mmask_i32gather_epi32, _mm256_mmask_i32gather_epi64,
_mm_mmask_i32gather_epi64, _mm256_mmask_i64gather_epi32,
_mm_mmask_i64gather_epi32, _mm256_mmask_i64gather_epi64,
_mm_mmask_i64gather_epi64, _mm256_i32scatter_ps,
_mm256_mask_i32scatter_ps, _mm_i32scatter_ps, _mm_mask_i32scatter_ps,
_mm256_i32scatter_pd, _mm256_mask_i32scatter_pd, _mm_i32scatter_pd,
_mm_mask_i32scatter_pd, _mm256_i64scatter_ps,
_mm256_mask_i64scatter_ps, _mm_i64scatter_ps, _mm_mask_i64scatter_ps,
_mm256_i64scatter_pd, _mm256_mask_i64scatter_pd, _mm_i64scatter_pd,
_mm_mask_i64scatter_pd, _mm256_i32scatter_epi32,
_mm256_mask_i32scatter_epi32, _mm_i32scatter_epi32,
_mm_mask_i32scatter_epi32, _mm256_i32scatter_epi64,
_mm256_mask_i32scatter_epi64, _m

[committed] doc: Add missing arm_arch_v8a_hard_ok anchor

2020-04-29 Thread Richard Sandiford
Fix a missing anchor in my earlier arm patch.

Sorry for the breakage.  Clearly I didn't test the doc part properly
in the rush to get this out.  (The code itself was fully tested as
committed, honest.)

Richard


2020-04-29  Richard Sandiford  

gcc/
* doc/sourcebuild.texi: Add missing arm_arch_v8a_hard_ok anchor.
---
 gcc/doc/sourcebuild.texi | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index d8da77d5ba4..66f3576cb11 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -1830,6 +1830,7 @@ ARM target supports @code{-mfpu=vfp3 -mfloat-abi=softfp}.
 Some multilibs may be incompatible with these options.
 
 @item arm_arch_v8a_hard_ok
+@anchor{arm_arch_v8a_hard_ok}
 The compiler is targeting @code{arm*-*-*} and can compile and assemble code
 using the options @code{-march=armv8-a -mfpu=neon-fp-armv8 -mfloat-abi=hard}.
 This is not enough to guarantee that linking works.


[PATCH] x86: Fix -O0 remaining intrinsic macros [PR94832]

2020-04-29 Thread Jakub Jelinek via Gcc-patches
Hi!

A few other macros seem to suffer from the same issue.  What I've done was:
cat gcc/config/i386/*intrin.h | sed -e ':x /\\$/ { N; s/\\\n//g ; bx }' \
| grep '^[[:blank:]]*#[[:blank:]]*define[[:blank:]].*(' | sed 's/[  ]\+/ 
/g' \
> /tmp/macros
and then looking for regexps:
)[a-zA-Z]
) [a-zA-Z]
[a-zA-Z][-+*/%]
[a-zA-Z] [-+*/%]
[-+*/%][a-zA-Z]
[-+*/%] [a-zA-Z]
in the resulting file.

Tested on x86_64-linux, ok for trunk if it passes full bootstrap/regtest
on x86_64-linux and i686-linux?

2020-04-29  Jakub Jelinek  

PR target/94832
* config/i386/avx512bwintrin.h (_mm512_alignr_epi8,
_mm512_mask_alignr_epi8, _mm512_maskz_alignr_epi8): Wrap macro operands
used in casts into parens.
* config/i386/avx512fintrin.h (_mm512_cvt_roundps_ph, _mm512_cvtps_ph,
_mm512_mask_cvt_roundps_ph, _mm512_mask_cvtps_ph,
_mm512_maskz_cvt_roundps_ph, _mm512_maskz_cvtps_ph,
_mm512_mask_cmp_epi64_mask, _mm512_mask_cmp_epi32_mask,
_mm512_mask_cmp_epu64_mask, _mm512_mask_cmp_epu32_mask,
_mm512_mask_cmp_round_pd_mask, _mm512_mask_cmp_round_ps_mask,
_mm512_mask_cmp_pd_mask, _mm512_mask_cmp_ps_mask): Likewise.
* config/i386/avx512vlbwintrin.h (_mm256_mask_alignr_epi8,
_mm256_maskz_alignr_epi8, _mm_mask_alignr_epi8, _mm_maskz_alignr_epi8,
_mm256_mask_cmp_epu8_mask): Likewise.
* config/i386/avx512vlintrin.h (_mm_mask_cvtps_ph, _mm_maskz_cvtps_ph,
_mm256_mask_cvtps_ph, _mm256_maskz_cvtps_ph): Likewise.
* config/i386/f16cintrin.h (_mm_cvtps_ph, _mm256_cvtps_ph): Likewise.
* config/i386/shaintrin.h (_mm_sha1rnds4_epu32): Likewise.

--- gcc/config/i386/avx512vlintrin.h.jj 2020-04-29 11:16:27.671094124 +0200
+++ gcc/config/i386/avx512vlintrin.h2020-04-29 11:52:30.746028151 +0200
@@ -13466,19 +13466,19 @@ _mm256_permutex_pd (__m256d __X, const i
 (__mmask8)(U)))
 
 #define _mm_mask_cvtps_ph(W, U, A, I)  
\
-  ((__m128i) __builtin_ia32_vcvtps2ph_mask ((__v4sf)(__m128) A, (int) (I), 
 \
+  ((__m128i) __builtin_ia32_vcvtps2ph_mask ((__v4sf)(__m128) (A), (int) (I),   
\
   (__v8hi)(__m128i) (W), (__mmask8) (U)))
 
 #define _mm_maskz_cvtps_ph(U, A, I)
\
-  ((__m128i) __builtin_ia32_vcvtps2ph_mask ((__v4sf)(__m128) A, (int) (I), 
 \
+  ((__m128i) __builtin_ia32_vcvtps2ph_mask ((__v4sf)(__m128) (A), (int) (I),   
\
   (__v8hi)(__m128i) _mm_setzero_si128 (), (__mmask8) (U)))
 
 #define _mm256_mask_cvtps_ph(W, U, A, I)   
\
-  ((__m128i) __builtin_ia32_vcvtps2ph256_mask ((__v8sf)(__m256) A, (int) (I),  
\
+  ((__m128i) __builtin_ia32_vcvtps2ph256_mask ((__v8sf)(__m256) (A), (int) 
(I),\
   (__v8hi)(__m128i) (W), (__mmask8) (U)))
 
 #define _mm256_maskz_cvtps_ph(U, A, I) 
\
-  ((__m128i) __builtin_ia32_vcvtps2ph256_mask ((__v8sf)(__m256) A, (int) (I),  
 \
+  ((__m128i) __builtin_ia32_vcvtps2ph256_mask ((__v8sf)(__m256) (A), (int) 
(I),\
   (__v8hi)(__m128i) _mm_setzero_si128 (), (__mmask8) (U)))
 
 #define _mm256_mask_srai_epi32(W, U, A, B) \
--- gcc/config/i386/avx512bwintrin.h.jj 2020-01-12 11:54:36.313414917 +0100
+++ gcc/config/i386/avx512bwintrin.h2020-04-29 11:55:52.703026442 +0200
@@ -3128,16 +3128,16 @@ _mm512_bsrli_epi128 (__m512i __A, const
 #define _mm512_alignr_epi8(X, Y, N)
\
   ((__m512i) __builtin_ia32_palignr512 ((__v8di)(__m512i)(X),  
\
(__v8di)(__m512i)(Y),   
\
-   (int)(N * 8)))
+   (int)((N) * 8)))
 
 #define _mm512_mask_alignr_epi8(W, U, X, Y, N) 
\
   ((__m512i) __builtin_ia32_palignr512_mask ((__v8di)(__m512i)(X), 
\
-   (__v8di)(__m512i)(Y), (int)(N * 8), 
\
+   (__v8di)(__m512i)(Y), (int)((N) * 
8),   \
(__v8di)(__m512i)(W), 
(__mmask64)(U)))
 
 #define _mm512_maskz_alignr_epi8(U, X, Y, N)   
\
   ((__m512i) __builtin_ia32_palignr512_mask ((__v8di)(__m512i)(X), 
\
-(__v8di)(__m512i)(Y), (int)(N * 
8),\
+(__v8di)(__m512i)(Y), (int)((N) * 
8),  \
 (__v8di)(__m512i)  
\
 _mm512_setzero_si512 (),   
\
 (__mmask64)(U)))
--- gcc/config/i386/avx512vlbwintrin.h.jj   2020-01-12 11:54:36.315414887 
+0100
+++ gcc/config/i386/avx512vlbwintrin.h  2020-04-29 11:56:45.7662399

Re: [PATCH] x86: Fix -O0 intrinsic *gather*/*scatter* macros [PR94832]

2020-04-29 Thread Uros Bizjak via Gcc-patches
On Wed, Apr 29, 2020 at 1:04 PM Jakub Jelinek  wrote:
>
> Hi!
>
> As reported in the PR, while most intrinsic -O0 macro argument uses
> are properly wrapped in ()s or used in context where having a complex
> expression passed as the argument doesn't pose a problem (e.g. when
> macro argument use is in between commas, or between ( and comma, or
> between comma and ) etc.), especially the gather/scatter macros don't do
> this and if one passes to some macro e.g. x + y as argument, the
> corresponding inline function would do cast on the argument, but
> the macro does (int) ARG, then it is (int) x + y rather than (int) (x + y).
>
> The following patch fixes those issues in *gather/*scatter*; additionally,
> the AVX2 macros were passing incorrect mask of e.g.
> (__v2df)_mm_set1_pd((double)(long long int) -1)
> which is IMHO equivalent to
> (__v2df){-1.0, -1.0}
> when it really wants to pass __v2df vector with all bits set.
> I've used what the inline functions use for those cases.
>
> Tested on x86_64-linux, ok for trunk if it passes full bootstrap/regtest
> on x86_64-linux and i686-linux?
>
> 2020-04-29  Jakub Jelinek  
>
> PR target/94832
> * config/i386/avx2intrin.h (_mm_mask_i32gather_pd,
> _mm256_mask_i32gather_pd, _mm_mask_i64gather_pd,
> _mm256_mask_i64gather_pd, _mm_mask_i32gather_ps,
> _mm256_mask_i32gather_ps, _mm_mask_i64gather_ps,
> _mm256_mask_i64gather_ps, _mm_i32gather_epi64,
> _mm_mask_i32gather_epi64, _mm256_i32gather_epi64,
> _mm256_mask_i32gather_epi64, _mm_i64gather_epi64,
> _mm_mask_i64gather_epi64, _mm256_i64gather_epi64,
> _mm256_mask_i64gather_epi64, _mm_i32gather_epi32,
> _mm_mask_i32gather_epi32, _mm256_i32gather_epi32,
> _mm256_mask_i32gather_epi32, _mm_i64gather_epi32,
> _mm_mask_i64gather_epi32, _mm256_i64gather_epi32,
> _mm256_mask_i64gather_epi32): Surround macro parameter uses with
> parens.
> (_mm_i32gather_pd, _mm256_i32gather_pd, _mm_i64gather_pd,
> _mm256_i64gather_pd, _mm_i32gather_ps, _mm256_i32gather_ps,
> _mm_i64gather_ps, _mm256_i64gather_ps): Likewise.  Don't use
> as mask vector containing -1.0 or -1.0f elts, but instead vector
> with all bits set using _mm*_cmpeq_p? with zero operands.
> * config/i386/avx512fintrin.h (_mm512_i32gather_ps,
> _mm512_mask_i32gather_ps, _mm512_i32gather_pd,
> _mm512_mask_i32gather_pd, _mm512_i64gather_ps,
> _mm512_mask_i64gather_ps, _mm512_i64gather_pd,
> _mm512_mask_i64gather_pd, _mm512_i32gather_epi32,
> _mm512_mask_i32gather_epi32, _mm512_i32gather_epi64,
> _mm512_mask_i32gather_epi64, _mm512_i64gather_epi32,
> _mm512_mask_i64gather_epi32, _mm512_i64gather_epi64,
> _mm512_mask_i64gather_epi64, _mm512_i32scatter_ps,
> _mm512_mask_i32scatter_ps, _mm512_i32scatter_pd,
> _mm512_mask_i32scatter_pd, _mm512_i64scatter_ps,
> _mm512_mask_i64scatter_ps, _mm512_i64scatter_pd,
> _mm512_mask_i64scatter_pd, _mm512_i32scatter_epi32,
> _mm512_mask_i32scatter_epi32, _mm512_i32scatter_epi64,
> _mm512_mask_i32scatter_epi64, _mm512_i64scatter_epi32,
> _mm512_mask_i64scatter_epi32, _mm512_i64scatter_epi64,
> _mm512_mask_i64scatter_epi64): Surround macro parameter uses with
> parens.
> * config/i386/avx512pfintrin.h (_mm512_prefetch_i32gather_pd,
> _mm512_prefetch_i32gather_ps, _mm512_mask_prefetch_i32gather_pd,
> _mm512_mask_prefetch_i32gather_ps, _mm512_prefetch_i64gather_pd,
> _mm512_prefetch_i64gather_ps, _mm512_mask_prefetch_i64gather_pd,
> _mm512_mask_prefetch_i64gather_ps, _mm512_prefetch_i32scatter_pd,
> _mm512_prefetch_i32scatter_ps, _mm512_mask_prefetch_i32scatter_pd,
> _mm512_mask_prefetch_i32scatter_ps, _mm512_prefetch_i64scatter_pd,
> _mm512_prefetch_i64scatter_ps, _mm512_mask_prefetch_i64scatter_pd,
> _mm512_mask_prefetch_i64scatter_ps): Likewise.
> * config/i386/avx512vlintrin.h (_mm256_mmask_i32gather_ps,
> _mm_mmask_i32gather_ps, _mm256_mmask_i32gather_pd,
> _mm_mmask_i32gather_pd, _mm256_mmask_i64gather_ps,
> _mm_mmask_i64gather_ps, _mm256_mmask_i64gather_pd,
> _mm_mmask_i64gather_pd, _mm256_mmask_i32gather_epi32,
> _mm_mmask_i32gather_epi32, _mm256_mmask_i32gather_epi64,
> _mm_mmask_i32gather_epi64, _mm256_mmask_i64gather_epi32,
> _mm_mmask_i64gather_epi32, _mm256_mmask_i64gather_epi64,
> _mm_mmask_i64gather_epi64, _mm256_i32scatter_ps,
> _mm256_mask_i32scatter_ps, _mm_i32scatter_ps, _mm_mask_i32scatter_ps,
> _mm256_i32scatter_pd, _mm256_mask_i32scatter_pd, _mm_i32scatter_pd,
> _mm_mask_i32scatter_pd, _mm256_i64scatter_ps,
> _mm256_mask_i64scatter_ps, _mm_i64scatter_ps, _mm_mask_i64scatter_ps,
> _mm256_i64scatter_pd, _mm256_mask_i64scatter_pd, _mm_i64scatter_

Re: [PATCH] x86: Fix -O0 remaining intrinsic macros [PR94832]

2020-04-29 Thread Uros Bizjak via Gcc-patches
On Wed, Apr 29, 2020 at 1:11 PM Jakub Jelinek  wrote:
>
> Hi!
>
> A few other macros seem to suffer from the same issue.  What I've done was:
> cat gcc/config/i386/*intrin.h | sed -e ':x /\\$/ { N; s/\\\n//g ; bx }' \
> | grep '^[[:blank:]]*#[[:blank:]]*define[[:blank:]].*(' | sed 's/[  ]\+/ 
> /g' \
> > /tmp/macros
> and then looking for regexps:
> )[a-zA-Z]
> ) [a-zA-Z]
> [a-zA-Z][-+*/%]
> [a-zA-Z] [-+*/%]
> [-+*/%][a-zA-Z]
> [-+*/%] [a-zA-Z]
> in the resulting file.
>
> Tested on x86_64-linux, ok for trunk if it passes full bootstrap/regtest
> on x86_64-linux and i686-linux?
>
> 2020-04-29  Jakub Jelinek  
>
> PR target/94832
> * config/i386/avx512bwintrin.h (_mm512_alignr_epi8,
> _mm512_mask_alignr_epi8, _mm512_maskz_alignr_epi8): Wrap macro 
> operands
> used in casts into parens.
> * config/i386/avx512fintrin.h (_mm512_cvt_roundps_ph, _mm512_cvtps_ph,
> _mm512_mask_cvt_roundps_ph, _mm512_mask_cvtps_ph,
> _mm512_maskz_cvt_roundps_ph, _mm512_maskz_cvtps_ph,
> _mm512_mask_cmp_epi64_mask, _mm512_mask_cmp_epi32_mask,
> _mm512_mask_cmp_epu64_mask, _mm512_mask_cmp_epu32_mask,
> _mm512_mask_cmp_round_pd_mask, _mm512_mask_cmp_round_ps_mask,
> _mm512_mask_cmp_pd_mask, _mm512_mask_cmp_ps_mask): Likewise.
> * config/i386/avx512vlbwintrin.h (_mm256_mask_alignr_epi8,
> _mm256_maskz_alignr_epi8, _mm_mask_alignr_epi8, _mm_maskz_alignr_epi8,
> _mm256_mask_cmp_epu8_mask): Likewise.
> * config/i386/avx512vlintrin.h (_mm_mask_cvtps_ph, _mm_maskz_cvtps_ph,
> _mm256_mask_cvtps_ph, _mm256_maskz_cvtps_ph): Likewise.
> * config/i386/f16cintrin.h (_mm_cvtps_ph, _mm256_cvtps_ph): Likewise.
> * config/i386/shaintrin.h (_mm_sha1rnds4_epu32): Likewise.

LGTM.

Thanks,
Uros.

> --- gcc/config/i386/avx512vlintrin.h.jj 2020-04-29 11:16:27.671094124 +0200
> +++ gcc/config/i386/avx512vlintrin.h2020-04-29 11:52:30.746028151 +0200
> @@ -13466,19 +13466,19 @@ _mm256_permutex_pd (__m256d __X, const i
>  (__mmask8)(U)))
>
>  #define _mm_mask_cvtps_ph(W, U, A, I)
>   \
> -  ((__m128i) __builtin_ia32_vcvtps2ph_mask ((__v4sf)(__m128) A, (int) (I),   
>\
> +  ((__m128i) __builtin_ia32_vcvtps2ph_mask ((__v4sf)(__m128) (A), (int) (I), 
>   \
>(__v8hi)(__m128i) (W), (__mmask8) (U)))
>
>  #define _mm_maskz_cvtps_ph(U, A, I)  
>   \
> -  ((__m128i) __builtin_ia32_vcvtps2ph_mask ((__v4sf)(__m128) A, (int) (I),   
>\
> +  ((__m128i) __builtin_ia32_vcvtps2ph_mask ((__v4sf)(__m128) (A), (int) (I), 
>   \
>(__v8hi)(__m128i) _mm_setzero_si128 (), (__mmask8) (U)))
>
>  #define _mm256_mask_cvtps_ph(W, U, A, I) 
>   \
> -  ((__m128i) __builtin_ia32_vcvtps2ph256_mask ((__v8sf)(__m256) A, (int) 
> (I),  \
> +  ((__m128i) __builtin_ia32_vcvtps2ph256_mask ((__v8sf)(__m256) (A), (int) 
> (I),\
>(__v8hi)(__m128i) (W), (__mmask8) (U)))
>
>  #define _mm256_maskz_cvtps_ph(U, A, I)   
>   \
> -  ((__m128i) __builtin_ia32_vcvtps2ph256_mask ((__v8sf)(__m256) A, (int) 
> (I),   \
> +  ((__m128i) __builtin_ia32_vcvtps2ph256_mask ((__v8sf)(__m256) (A), (int) 
> (I),\
>(__v8hi)(__m128i) _mm_setzero_si128 (), (__mmask8) (U)))
>
>  #define _mm256_mask_srai_epi32(W, U, A, B) \
> --- gcc/config/i386/avx512bwintrin.h.jj 2020-01-12 11:54:36.313414917 +0100
> +++ gcc/config/i386/avx512bwintrin.h2020-04-29 11:55:52.703026442 +0200
> @@ -3128,16 +3128,16 @@ _mm512_bsrli_epi128 (__m512i __A, const
>  #define _mm512_alignr_epi8(X, Y, N)  
>   \
>((__m512i) __builtin_ia32_palignr512 ((__v8di)(__m512i)(X),
>   \
> (__v8di)(__m512i)(Y), 
>   \
> -   (int)(N * 8)))
> +   (int)((N) * 8)))
>
>  #define _mm512_mask_alignr_epi8(W, U, X, Y, N)   
>   \
>((__m512i) __builtin_ia32_palignr512_mask ((__v8di)(__m512i)(X),   
>   \
> -   (__v8di)(__m512i)(Y), (int)(N * 
> 8), \
> +   (__v8di)(__m512i)(Y), (int)((N) * 
> 8),   \
> (__v8di)(__m512i)(W), 
> (__mmask64)(U)))
>
>  #define _mm512_maskz_alignr_epi8(U, X, Y, N) 
>   \
>((__m512i) __builtin_ia32_palignr512_mask ((__v8di)(__m512i)(X),   
>   \
> -(__v8di)(__m512i)(Y), (int)(N * 
> 8),\
> +(__v8di)(__m512i)(Y), (int)((N) 
> * 8),  \
>  (__v8di)(__m512i)
>   \
> 

[PATCH] rs6000: Fix rs6000_atomic_assign_expand_fenv [PR94826]

2020-04-29 Thread Jakub Jelinek via Gcc-patches
Hi!

This is the rs6000 version of the earlier committed x86, aarch64 and arm
fixes, as create_tmp_var_raw is used because the C FE can call this outside
of function context, we need to make sure the first references to those
VAR_DECLs are through a TARGET_EXPR, so that it gets gimple_add_tmp_var
marked in whatever function it gets expanded in.  Without that DECL_CONTEXT
is NULL and the vars aren't added as local decls of the containing function.

Bootstrapped/regtested on powerpc64{,le}-linux, ok for trunk?

2020-04-29  Jakub Jelinek  

PR target/94826
* config/rs6000/rs6000.c (rs6000_atomic_assign_expand_fenv): Use
TARGET_EXPR instead of MODIFY_EXPR for first assignment to
fenv_var, fenv_clear and old_fenv variables.  For fenv_addr
take address of TARGET_EXPR of fenv_var with void_node initializer.
Formatting fixes.

--- gcc/config/rs6000/rs6000.c.jj   2020-04-29 09:00:46.0 +0200
+++ gcc/config/rs6000/rs6000.c  2020-04-29 09:33:03.990008223 +0200
@@ -26012,7 +26012,9 @@ rs6000_atomic_assign_expand_fenv (tree *
 
   tree fenv_var = create_tmp_var_raw (double_type_node);
   TREE_ADDRESSABLE (fenv_var) = 1;
-  tree fenv_addr = build1 (ADDR_EXPR, double_ptr_type_node, fenv_var);
+  tree fenv_addr = build1 (ADDR_EXPR, double_ptr_type_node,
+  build4 (TARGET_EXPR, double_type_node, fenv_var,
+  void_node, NULL_TREE, NULL_TREE));
 
   *hold = build_call_expr (atomic_hold_decl, 1, fenv_addr);
   *clear = build_call_expr (atomic_clear_decl, 0);
@@ -26035,12 +26037,13 @@ rs6000_atomic_assign_expand_fenv (tree *
 
   /* Mask to clear everything except for the rounding modes and non-IEEE
  arithmetic flag.  */
-  const unsigned HOST_WIDE_INT hold_exception_mask =
-HOST_WIDE_INT_C (0x0007);
+  const unsigned HOST_WIDE_INT hold_exception_mask
+= HOST_WIDE_INT_C (0x0007);
 
   tree fenv_var = create_tmp_var_raw (double_type_node);
 
-  tree hold_mffs = build2 (MODIFY_EXPR, void_type_node, fenv_var, call_mffs);
+  tree hold_mffs = build4 (TARGET_EXPR, double_type_node, fenv_var, call_mffs,
+  NULL_TREE, NULL_TREE);
 
   tree fenv_llu = build1 (VIEW_CONVERT_EXPR, uint64_type_node, fenv_var);
   tree fenv_llu_and = build2 (BIT_AND_EXPR, uint64_type_node, fenv_llu,
@@ -26064,12 +26067,13 @@ rs6000_atomic_assign_expand_fenv (tree *
 
   /* Mask to clear everything except for the rounding modes and non-IEEE
  arithmetic flag.  */
-  const unsigned HOST_WIDE_INT clear_exception_mask =
-HOST_WIDE_INT_C (0x);
+  const unsigned HOST_WIDE_INT clear_exception_mask
+= HOST_WIDE_INT_C (0x);
 
   tree fenv_clear = create_tmp_var_raw (double_type_node);
 
-  tree clear_mffs = build2 (MODIFY_EXPR, void_type_node, fenv_clear, 
call_mffs);
+  tree clear_mffs = build4 (TARGET_EXPR, double_type_node, fenv_clear,
+   call_mffs, NULL_TREE, NULL_TREE);
 
   tree fenv_clean_llu = build1 (VIEW_CONVERT_EXPR, uint64_type_node, 
fenv_clear);
   tree fenv_clear_llu_and = build2 (BIT_AND_EXPR, uint64_type_node,
@@ -26094,13 +26098,14 @@ rs6000_atomic_assign_expand_fenv (tree *
 (*(uint64_t*)fenv_var 0x1ff80fff);
  __builtin_mtfsf (0xff, fenv_update);  */
 
-  const unsigned HOST_WIDE_INT update_exception_mask =
-HOST_WIDE_INT_C (0x1f00);
-  const unsigned HOST_WIDE_INT new_exception_mask =
-HOST_WIDE_INT_C (0x1ff80fff);
+  const unsigned HOST_WIDE_INT update_exception_mask
+= HOST_WIDE_INT_C (0x1f00);
+  const unsigned HOST_WIDE_INT new_exception_mask
+= HOST_WIDE_INT_C (0x1ff80fff);
 
   tree old_fenv = create_tmp_var_raw (double_type_node);
-  tree update_mffs = build2 (MODIFY_EXPR, void_type_node, old_fenv, call_mffs);
+  tree update_mffs = build4 (TARGET_EXPR, double_type_node, old_fenv,
+call_mffs, NULL_TREE, NULL_TREE);
 
   tree old_llu = build1 (VIEW_CONVERT_EXPR, uint64_type_node, old_fenv);
   tree old_llu_and = build2 (BIT_AND_EXPR, uint64_type_node, old_llu,

Jakub



Re: [PATCH, PR94769] fortran/io.c: Fix use of uninitialized variable num

2020-04-29 Thread Stefan Schulze Frielinghaus via Gcc-patches
Hi Tobias,

On Tue, Apr 28, 2020 at 05:06:15PM +0200, Tobias Burnus wrote:
> Hi Stefan,
> 
> I prefer Jakub's suggestion – his change LGTM.

Please find attached an updated patch.  Bootstrapped and regtested on
S/390.  Ok for master?

Cheers,
Stefan

> On 4/28/20 2:30 PM, Jakub Jelinek via Gcc-patches wrote:
> > On Tue, Apr 28, 2020 at 01:53:07PM +0200, Stefan Schulze Frielinghaus via 
> > Gcc-patches wrote:
> > > gcc/fortran/ChangeLog:
> > > 
> > > 2020-04-28  Stefan Schulze Frielinghaus  
> > > 
> > >  PR fortran/94769
> > >  * io.c (check_io_constraints): Initialize local variable num.
> > > ---
> > >   gcc/fortran/io.c | 2 +-
> > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/gcc/fortran/io.c b/gcc/fortran/io.c
> > > index e06e01d..4526f729d1d 100644
> > > --- a/gcc/fortran/io.c
> > > +++ b/gcc/fortran/io.c
> > > @@ -3840,7 +3840,7 @@ if (condition) \
> > > 
> > > if (dt->asynchronous)
> > >   {
> > > -  int num;
> > > +  int num = 2;
> > > static const char * asynchronous[] = { "YES", "NO", NULL };
> > Just nitpicking, wouldn't -1 be more usual value?
> > And, I think there should be an assertion that it didn't remain -1 after the
> > call, above
> >/* For "YES", mark related symbols as asynchronous.  */
> > do
> >gcc_checking (num != -1);
> > or so.
> > 
> > Note, the reason why this triggers only on s390x is the vastly different
> > inlining parameters the target uses, that causes a lot of headaches
> > everywhere.
> > 
> >   Jakub
> > 
> -
> Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
> Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, 
> Alexander Walter
>From 6118c522e96b978a2a8ba5df5fd90f7b38176e16 Mon Sep 17 00:00:00 2001
From: Stefan Schulze Frielinghaus 
Date: Tue, 28 Apr 2020 13:14:28 +0200
Subject: [PATCH] fortran/io.c: Fix use of uninitialized variable num [PR94769]

While bootstrapping GCC on S/390 the following warning occurs:

gcc/fortran/io.c: In function 'bool gfc_resolve_dt(gfc_code*, gfc_dt*, locus*)':
gcc/fortran/io.c:3857:7: error: 'num' may be used uninitialized in this 
function [-Werror=maybe-uninitialized]
 3857 |   if (num == 0)
  |   ^~
gcc/fortran/io.c:3843:11: note: 'num' was declared here
 3843 |   int num;

Since gfc_resolve_dt is a non-static function we cannot assume anything about
argument DT.  Argument DT gets passed to function check_io_constraints which
passes values depending on DT, namely dt->asynchronous->value.character.string
to function compare_to_allowed_values as well as argument warn which is true as
soon as DT->dterr is true.  Thus both arguments depend on DT.

If function compare_to_allowed_values is called with
dt->asynchronous->value.character.string not being an allowed value, and
ALLOWED_F2003 as well as ALLOWED_GNU being NULL (which is the case at the
particular call side), and WARN equals true, then the function returns with a
non-zero value and leaves num uninitialized which renders the warning true.

Initialized num to -1 and added an assert statement.

gcc/fortran/ChangeLog:

2020-04-28  Stefan Schulze Frielinghaus  

PR fortran/94769
* io.c (check_io_constraints): Initialize local variable num to
-1 and assert that it receives a meaningful value by function
compare_to_allowed_values.
---
 gcc/fortran/io.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/fortran/io.c b/gcc/fortran/io.c
index e06e01d..981cf9e88dd 100644
--- a/gcc/fortran/io.c
+++ b/gcc/fortran/io.c
@@ -3840,7 +3840,7 @@ if (condition) \
 
   if (dt->asynchronous)
 {
-  int num;
+  int num = -1;
   static const char * asynchronous[] = { "YES", "NO", NULL };
 
   /* Note: gfc_reduce_init_expr reports an error if not init-expr.  */
@@ -3853,6 +3853,8 @@ if (condition) \
 io_kind_name (k), warn, &dt->asynchronous->where, &num))
return false;
 
+  gcc_checking_assert (num != -1);
+
   /* For "YES", mark related symbols as asynchronous.  */
   if (num == 0)
{
-- 
2.25.3



Re: [PATCH] x86: Allow -fcf-protection with external thunk

2020-04-29 Thread H.J. Lu via Gcc-patches
On Tue, Apr 28, 2020 at 11:38 PM Richard Biener
 wrote:
>
> On Wed, Apr 29, 2020 at 4:22 AM H.J. Lu via Gcc-patches
>  wrote:
> >
> > Allow -fcf-protection with external thunk since the external thunk can be
> > made compatible with -fcf-protection.
> >
> > OK for master?
>
> OK.  I guess also OK for backporting to branches where CET support is
> in a reasonable
> state (not sure to what extent you backported the various fixes in this area).
>
> Thanks,
> Richard.

Fixed on master.  I am backporting it to GCC 8 and 9 branches.

Thanks.

-- 
H.J.


Re: [PATCH, PR94769] fortran/io.c: Fix use of uninitialized variable num

2020-04-29 Thread Jakub Jelinek via Gcc-patches
On Wed, Apr 29, 2020 at 01:53:01PM +0200, Stefan Schulze Frielinghaus wrote:
> Hi Tobias,
> 
> On Tue, Apr 28, 2020 at 05:06:15PM +0200, Tobias Burnus wrote:
> > Hi Stefan,
> > 
> > I prefer Jakub's suggestion – his change LGTM.
> 
> Please find attached an updated patch.  Bootstrapped and regtested on
> S/390.  Ok for master?

Ok, thanks.

> gcc/fortran/ChangeLog:
> 
> 2020-04-28  Stefan Schulze Frielinghaus  
> 
> PR fortran/94769
> * io.c (check_io_constraints): Initialize local variable num to
> -1 and assert that it receives a meaningful value by function
> compare_to_allowed_values.
> ---
>  gcc/fortran/io.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/gcc/fortran/io.c b/gcc/fortran/io.c
> index e06e01d..981cf9e88dd 100644
> --- a/gcc/fortran/io.c
> +++ b/gcc/fortran/io.c
> @@ -3840,7 +3840,7 @@ if (condition) \
>  
>if (dt->asynchronous)
>  {
> -  int num;
> +  int num = -1;
>static const char * asynchronous[] = { "YES", "NO", NULL };
>  
>/* Note: gfc_reduce_init_expr reports an error if not init-expr.  */
> @@ -3853,6 +3853,8 @@ if (condition) \
>io_kind_name (k), warn, &dt->asynchronous->where, &num))
>   return false;
>  
> +  gcc_checking_assert (num != -1);
> +
>/* For "YES", mark related symbols as asynchronous.  */
>if (num == 0)
>   {
> -- 
> 2.25.3
> 


Jakub



Re: [PATCH] x86: Allow -fcf-protection with external thunk

2020-04-29 Thread H.J. Lu via Gcc-patches
On Wed, Apr 29, 2020 at 5:02 AM H.J. Lu  wrote:
>
> On Tue, Apr 28, 2020 at 11:38 PM Richard Biener
>  wrote:
> >
> > On Wed, Apr 29, 2020 at 4:22 AM H.J. Lu via Gcc-patches
> >  wrote:
> > >
> > > Allow -fcf-protection with external thunk since the external thunk can be
> > > made compatible with -fcf-protection.
> > >
> > > OK for master?
> >
> > OK.  I guess also OK for backporting to branches where CET support is
> > in a reasonable
> > state (not sure to what extent you backported the various fixes in this 
> > area).
> >
> > Thanks,
> > Richard.
>
> Fixed on master.  I am backporting it to GCC 8 and 9 branches.
>

No need for GCC 8.  It is a regression for GCC 9/10.

-- 
H.J.


[PATCH][GCC-8][Aarch64]: Backport Force TImode values into even registers

2020-04-29 Thread Andre Vieira (lists)

Hi,

This is a backport from trunk/gcc-9 that I think we need now that we 
have backported the casp LSE instructions.


Bootstrapped and regression tested on aarch64.

Is this OK for gcc-8?

Cheers,
Andre

The LSE CASP instruction requires values to be placed in even
register pairs.  A solution involving two additional register
classes was rejected in favor of the much simpler solution of
simply requiring all TImode values to be aligned.

gcc/ChangeLog:
2020-04-29  Andre Vieira  

    Backport from mainline.
    2018-10-31  Richard Henderson 

    * config/aarch64/aarch64.c (aarch64_hard_regno_mode_ok): Force
    16-byte modes held in GP registers to use an even regno.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 
5eec1aae54abe04b8320deaf8202621c8e193c01..525deba56ea363a621cccec1a923da241908dd06
 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1369,10 +1369,14 @@ aarch64_hard_regno_mode_ok (unsigned regno, 
machine_mode mode)
   if (regno == FRAME_POINTER_REGNUM || regno == ARG_POINTER_REGNUM)
 return mode == Pmode;
 
-  if (GP_REGNUM_P (regno) && known_le (GET_MODE_SIZE (mode), 16))
-return true;
-
-  if (FP_REGNUM_P (regno))
+  if (GP_REGNUM_P (regno))
+{
+  if (known_le (GET_MODE_SIZE (mode), 8))
+   return true;
+  else if (known_le (GET_MODE_SIZE (mode), 16))
+   return (regno & 1) == 0;
+}
+  else if (FP_REGNUM_P (regno))
 {
   if (vec_flags & VEC_STRUCT)
return end_hard_regno (mode, regno) - 1 <= V31_REGNUM;


RE: [PATCH][GCC-8][Aarch64]: Backport Force TImode values into even registers

2020-04-29 Thread Kyrylo Tkachov


> -Original Message-
> From: Andre Vieira (lists) 
> Sent: 29 April 2020 13:37
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov 
> Subject: [PATCH][GCC-8][Aarch64]: Backport Force TImode values into even
> registers
> 
> Hi,
> 
> This is a backport from trunk/gcc-9 that I think we need now that we
> have backported the casp LSE instructions.
> 
> Bootstrapped and regression tested on aarch64.
> 
> Is this OK for gcc-8?

Ok.
Thanks,
Kyrill

> 
> Cheers,
> Andre
> 
> The LSE CASP instruction requires values to be placed in even
> register pairs.  A solution involving two additional register
> classes was rejected in favor of the much simpler solution of
> simply requiring all TImode values to be aligned.
> 
> gcc/ChangeLog:
> 2020-04-29  Andre Vieira  
> 
>      Backport from mainline.
>      2018-10-31  Richard Henderson 
> 
>      * config/aarch64/aarch64.c (aarch64_hard_regno_mode_ok): Force
>      16-byte modes held in GP registers to use an even regno.



[PATCH] diagnostics: Add %{...%} pretty-format support for URLs and use it in -Wpsabi diagnostcs

2020-04-29 Thread Jakub Jelinek via Gcc-patches
Hi!

The following patch attempts to use the diagnostics URL support if available
to provide more information about the C++17 empty base and C++20
[[no_unique_address]] empty class ABI changes in -Wpsabi diagnostics.

GCC 10.1 at the end of the diagnostics is then in some terminals
underlined with a dotted line and points to a (to be written) anchor in
gcc-10/changes.html which we need to write anyway.

Ok for trunk if this passes bootstrap/regtest?

2020-04-29  Jakub Jelinek  

* configure.ac (-with-changes-root-url): New configure option,
defaulting to https://gcc.gnu.org/.
* Makefile.in (CFLAGS-opts.o): Define CHANGES_ROOT_URL for
opts.c.
* pretty-print.c (end_url_string): New function.
(pp_format): Handle %{ and %} for URLs.
(pp_begin_url): Use pp_string instead of pp_printf.
(pp_end_url): Use end_url_string.
* opts.h (get_changes_url): Declare.
* opts.c (get_changes_url): New function.
* config/rs6000/rs6000-call.c: Include opts.h.
(rs6000_discover_homogeneous_aggregate): Use %{GCC 10.1%} instead of
just GCC 10.1 in diagnostics and add URL.
* config/arm/arm.c (aapcs_vfp_is_call_or_return_candidate): Likewise.
* config/aarch64/aarch64.c (aarch64_vfp_is_call_or_return_candidate):
Likewise.
* configure: Regenerated.

* c-format.c (PP_FORMAT_CHAR_TABLE): Add %{ and %}.

--- gcc/configure.ac.jj 2020-04-29 10:21:25.061999873 +0200
+++ gcc/configure.ac2020-04-29 13:26:51.515516959 +0200
@@ -986,6 +986,20 @@ AC_ARG_WITH(documentation-root-url,
 )
 AC_SUBST(DOCUMENTATION_ROOT_URL)
 
+# Allow overriding the default URL for GCC changes
+AC_ARG_WITH(changes-root-url,
+AS_HELP_STRING([--with-changes-root-url=URL],
+   [Root for GCC changes URLs]),
+[case "$withval" in
+  yes) AC_MSG_ERROR([changes root URL not specified]) ;;
+  no)  AC_MSG_ERROR([changes root URL not specified]) ;;
+  *)   CHANGES_ROOT_URL="$withval"
+  ;;
+ esac],
+ CHANGES_ROOT_URL="https://gcc.gnu.org/";
+)
+AC_SUBST(CHANGES_ROOT_URL)
+
 # Sanity check enable_languages in case someone does not run the toplevel
 # configure # script.
 AC_ARG_ENABLE(languages,
--- gcc/Makefile.in.jj  2020-02-27 09:28:46.129960135 +0100
+++ gcc/Makefile.in 2020-04-29 13:38:42.455008718 +0200
@@ -2187,6 +2187,7 @@ lto-wrapper$(exeext): $(LTO_WRAPPER_OBJS
mv -f T$@ $@
 
 CFLAGS-opts.o += -DDOCUMENTATION_ROOT_URL=\"@DOCUMENTATION_ROOT_URL@\"
+CFLAGS-opts.o += -DCHANGES_ROOT_URL=\"@CHANGES_ROOT_URL@\"
 
 # Files used by all variants of C or by the stand-alone pre-processor.
 
--- gcc/pretty-print.c.jj   2020-04-25 00:08:33.359759328 +0200
+++ gcc/pretty-print.c  2020-04-29 14:30:46.791792554 +0200
@@ -1020,6 +1020,8 @@ pp_indent (pretty_printer *pp)
 pp_space (pp);
 }
 
+static const char *end_url_string (pretty_printer *);
+
 /* The following format specifiers are recognized as being client independent:
%d, %i: (signed) integer in base ten.
%u: unsigned integer in base ten.
@@ -1038,6 +1040,8 @@ pp_indent (pretty_printer *pp)
%%: '%'.
%<: opening quote.
%>: closing quote.
+   %{: URL start.
+   %}: URL end.
%': apostrophe (should only be used in untranslated messages;
translations should use appropriate punctuation directly).
%@: diagnostic_event_id_ptr, for which event_id->known_p () must be true.
@@ -1051,7 +1055,7 @@ pp_indent (pretty_printer *pp)
Arguments can be used sequentially, or through %N$ resp. *N$
notation Nth argument after the format string.  If %N$ / *N$
notation is used, it must be used for all arguments, except %m, %%,
-   %<, %> and %', which may not have a number, as they do not consume
+   %<, %>, %} and %', which may not have a number, as they do not consume
an argument.  When %M$.*N$s is used, M must be N + 1.  (This may
also be written %M$.*s, provided N is not otherwise used.)  The
format string must have conversion specifiers with argument numbers
@@ -1084,7 +1088,7 @@ pp_format (pretty_printer *pp, text_info
   /* Formatting phase 1: split up TEXT->format_spec into chunks in
  pp_buffer (PP)->args[].  Even-numbered chunks are to be output
  verbatim, odd-numbered chunks are format specifiers.
- %m, %%, %<, %>, and %' are replaced with the appropriate text at
+ %m, %%, %<, %>, %} and %' are replaced with the appropriate text at
  this point.  */
 
   memset (formatters, 0, sizeof formatters);
@@ -1133,6 +1137,15 @@ pp_format (pretty_printer *pp, text_info
  p++;
  continue;
 
+   case '}':
+ {
+   const char *endurlstr = end_url_string (pp);
+   obstack_grow (&buffer->chunk_obstack, endurlstr,
+ strlen (endurlstr));
+ }
+ p++;
+ continue;
+
case 'R':
  {
const char *colorstr = colorize_stop (pp_show_color (pp));
@@ -1445,6 +1458,

RE: [PATCH]use vnode->get_constructor () to get intial in lto[PR94822]

2020-04-29 Thread lizekun (A)
Hi Richard,

Thanks you for your reply.
And sorry for not responding in time.

The reason why the ctor_for_folding was not used here before is that the 
variable here is read-only, and  ctor_for_folding would return error_mark_node.
If I want to keep the size check for array, I had to skip the check of 
ctor_for_folding , and use get_constructor () directly.
I think you're right, the boundary warning here is indeed not critical.
Your fix works for me fine.

Thanks,
Zekun

> -Original Message-
> From: Richard Biener [mailto:richard.guent...@gmail.com]
> Sent: 2020年4月29日 16:48
> To: lizekun (A) 
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH]use vnode->get_constructor () to get intial in
> lto[PR94822]
> 
> On Wed, Apr 29, 2020 at 8:44 AM Richard Biener
>  wrote:
> >
> > On Wed, Apr 29, 2020 at 6:04 AM lizekun (A) 
> wrote:
> > >
> > > Hi,
> > >
> > > This ICE appears because gcc will stream it to the function_body
> > > section when processing the variable with the initial value of the
> > > constructor type, and the error_mark_node to the decls section. When
> recompiling, the value obtained with DECL_INITIAL will be error_mark.
> > >
> > > This patch use vnode->get_constructor () to get intial in lto.
> > >
> > > Bootstrap and tested on aarch64 Linux platform.
> >
> > You should use ctor_for_folding to get all the magic correct.  Note
> > that this function returns error_mark_node if there is no usable
> > initializer, NULL if there is zero-init and sth else for the actual
> > initializer.  You can use
> >
> >   if (tree init = ctor_for_folding (base))
> > if (init && init != error_mark_node)
> >   {
> > ...
> >   }
> >
> > Not to mention this whole function is a red herring ...:/
> 
> So I tested this and ctor_for_folding is too strict but we also do not stream
> ctors not usable for folding but to the LTRANS unit that is responsible for 
> the
> variable output.
> 
> So instead I am simply testing an additional != error_mark_node check.
> The size, if "important", should be resolved before LTO.
> 
> Richard.
> 
> > Thanks,
> > Richard.
> >
> > >
> > > Best regulars,
> > > Zekun
> > >
> > >
> > > gcc:
> > > +2020-04-29  Li Zekun  
> > > +
> > > +   PR  lto/94822
> > > +   * tree.c: use vnode->get_constructor () to get intial in lto
> > >
> > > gcc/testsuite:
> > > +2020-04-29  Li Zekun  
> > > +
> > > +   PR  lto/94822
> > > +   * gcc.dg/lto/pr94822.h: New test.
> > > +   * gcc.dg/lto/pr94822_0.c: New test.
> > > +   * gcc.dg/lto/pr94822_1.c: New test.


Re: [PATCH] diagnostics: Add %{...%} pretty-format support for URLs and use it in -Wpsabi diagnostcs

2020-04-29 Thread Richard Sandiford
Jakub Jelinek via Gcc-patches  writes:
> @@ -5747,16 +5748,18 @@ rs6000_discover_homogeneous_aggregate (m
> unsigned uid = TYPE_UID (TYPE_MAIN_VARIANT (type));
> if (uid != last_reported_type_uid)
>   {
> +   const char *url
> + = get_changes_url ("gcc-10/changes.html#empty_base");
> if (empty_base_seen & 1)
>   inform (input_location,
>   "parameter passing for argument of type %qT "
>   "when C++17 is enabled changed to match C++14 "
> - "in GCC 10.1", type);
> + "in %{GCC 10.1%}", type, url);
> else
>   inform (input_location,
>   "parameter passing for argument of type %qT "
>   "with %<[[no_unique_address]]%> members "
> - "changed in GCC 10.1", type);
> + "changed in %{GCC 10.1%}", type, url);
> last_reported_type_uid = uid;
>   }
>   }

Looks like there's a missing free() for this one.

Thanks,
Richard


Re: [wwwdocs] Correct status and macro for P1907R1

2020-04-29 Thread Marek Polacek via Gcc-patches
On Wed, Apr 29, 2020 at 10:51:21AM +0100, Jonathan Wakely wrote:
> As noted in PR c++/94765 floating point types are not supported as
> non-type template args.
> 
> Marek added the P1907R1 paper in commit d59a823fb but didn't give it a
> distinct entries for the "Supported in GCC?" and "SD-6 Feature Test"
> columns. This keeps it grouped with P0732R2 but in its own row.
> 
> The table uses the nth-child CSS pseudo-class for formatting, but
> rowspan attributes mess that up by altering the number of children in
> the row. This patches uses (non-displayed)  elements to
> preserve the ordering for nth-child, so that the table cells are aligned
> consistently. Those elements have no other effect, and no effect at all
> for non-graphical browsers that ignore the CSS formatting.
> 
> OK for wwwdocs?

Yes, thanks.

Marek



Re: [PATCH] diagnostics: Add %{...%} pretty-format support for URLs and use it in -Wpsabi diagnostcs

2020-04-29 Thread David Malcolm via Gcc-patches
On Wed, 2020-04-29 at 14:46 +0200, Jakub Jelinek wrote:
> Hi!
> 
> The following patch attempts to use the diagnostics URL support if
> available
> to provide more information about the C++17 empty base and C++20
> [[no_unique_address]] empty class ABI changes in -Wpsabi diagnostics.
> 
> GCC 10.1 at the end of the diagnostics is then in some terminals
> underlined with a dotted line and points to a (to be written) anchor
> in
> gcc-10/changes.html which we need to write anyway.
> 
> Ok for trunk if this passes bootstrap/regtest?

Thanks for implementing this.

Overall I like the idea; some nits inline below (and I think it won't
bootstrap due to a const-correctness issue).

Ideally we ought to have an integration test for this in DejaGnu
somewhere, somehow, but I don't think we have a way to write one, alas.

[...]

> --- gcc/pretty-print.c.jj 2020-04-25 00:08:33.359759328 +0200
> +++ gcc/pretty-print.c2020-04-29 14:30:46.791792554 +0200
> @@ -1020,6 +1020,8 @@ pp_indent (pretty_printer *pp)
>  pp_space (pp);
>  }
>  
> +static const char *end_url_string (pretty_printer *);
> +
>  /* The following format specifiers are recognized as being client
> independent:
> %d, %i: (signed) integer in base ten.
> %u: unsigned integer in base ten.
> @@ -1038,6 +1040,8 @@ pp_indent (pretty_printer *pp)
> %%: '%'.
> %<: opening quote.
> %>: closing quote.
> +   %{: URL start.
> +   %}: URL end.

I think this needs a little more explanation; perhaps something like:

  %{: URL start.  Consumes a const char * argument for the URL.
  %}: URL end.Does not consume any arguments.

? 

[...]

> 
> +/* Helper function for pp_end_url and pp_format, return the "close
> URL" escape
> +   sequence string.  */
> +
> +static const char *
> +end_url_string (pretty_printer *pp)

Given that this passes in a pp, please rename this to
"get_end_url_string to" avoid possible confusion that "end" might be a
verb here, and thus might emit the codes to the pp.

> +{
> +  switch (pp->url_format)
> +{
> +case URL_FORMAT_NONE:
> +  return "";
> +case URL_FORMAT_ST:
> +  return "\33]8;;\33\\";
> +case URL_FORMAT_BEL:
> +  return "\33]8;;\a";
> +default:
> +  gcc_unreachable ();
> +}
>  }
>  
> /* If URL-printing is enabled, write a "close URL" escape sequence to
> PP.  */
> @@ -2191,19 +2231,8 @@ pp_begin_url (pretty_printer *pp, const
>  void
>  pp_end_url (pretty_printer *pp)
>  {
> -  switch (pp->url_format)
> -  {
> -  case URL_FORMAT_NONE:
> -break;
> -  case URL_FORMAT_ST:
> -pp_string (pp, "\33]8;;\33\\");
> -break;
> -  case URL_FORMAT_BEL:
> -pp_string (pp, "\33]8;;\a");
> -break;
> -  default:
> -gcc_unreachable ();
> -  }
> +  if (pp->url_format != URL_FORMAT_NONE)
> +pp_string (pp, end_url_string (pp));
>  }
>  
>  #if CHECKING_P

[...]

> --- gcc/opts.c.jj 2020-04-27 23:28:18.865609469 +0200
> +++ gcc/opts.c2020-04-29 13:42:13.033891253 +0200
> @@ -3190,6 +3190,15 @@ get_option_url (diagnostic_context *, in
>  return NULL;
>  }
>  
> +/* Given "gcc-10/changes.html#foobar", return that URL under
> +   CHANGES_ROOT_URL (see --with-changes-root-url).  */

Although the return type is a "char *" rather than "const char *" I
think this comment could be improved by clarifying ownership.  How
about:

/* Given "gcc-10/changes.html#foobar", return that URL under
   CHANGES_ROOT_URL (see --with-changes-root-url).
   The caller is responsible for freeing the returned string.  */

> +
> +char *
> +get_changes_url (const char *str)
> +{
> +  return concat (CHANGES_ROOT_URL, str, NULL);
> +}
> +
>  #if CHECKING_P
>  
>  namespace selftest {
> --- gcc/config/arm/arm.c.jj   2020-04-29 13:12:46.736007298 +0200
> +++ gcc/config/arm/arm.c  2020-04-29 14:34:04.878864236 +0200

There's some pre-existing repetition between arm, aarch64, and rs6000,
in which this patch has to make the same changes in multiple places. 
Could these be consolidated e.g.

void maybe_emit_gcc10_param_passing_abi_warning (tree type)
{
  /* something here; not sure what */
}

That said it's a pre-existing thing.

> @@ -6416,6 +6416,8 @@ aapcs_vfp_is_call_or_return_candidate (e
> && ((alt = aapcs_vfp_sub_candidate (type, &new_mode,
> NULL))
> != ag_count))
>   {
> +   const char *url
> + = get_changes_url ("gcc-10/changes.html#empty_base");

"url" is declared here (and elsewhere) as a "const char *", but later
freed; that seems like a violation of const-correctness.  Hopefully we
emit a warning about that.

> gcc_assert (alt == -1);
> last_reported_type_uid = uid;
> /* Use TYPE_MAIN_VARIANT to strip any redundant const
> @@ -6423,11 +6425,14 @@ aapcs_vfp_is_call_or_return_candidate (e
> if (warn_psabi_flags & WARN_PSABI_NO_UNIQUE_ADDRESS)
>   inform (input_location, "parameter passing for argument
> of "
>   "type %qT with %

Re: [PATCH] rs6000: Fix rs6000_atomic_assign_expand_fenv [PR94826]

2020-04-29 Thread David Edelsohn via Gcc-patches
On Wed, Apr 29, 2020 at 7:48 AM Jakub Jelinek  wrote:
>
> Hi!
>
> This is the rs6000 version of the earlier committed x86, aarch64 and arm
> fixes, as create_tmp_var_raw is used because the C FE can call this outside
> of function context, we need to make sure the first references to those
> VAR_DECLs are through a TARGET_EXPR, so that it gets gimple_add_tmp_var
> marked in whatever function it gets expanded in.  Without that DECL_CONTEXT
> is NULL and the vars aren't added as local decls of the containing function.
>
> Bootstrapped/regtested on powerpc64{,le}-linux, ok for trunk?
>
> 2020-04-29  Jakub Jelinek  
>
> PR target/94826
> * config/rs6000/rs6000.c (rs6000_atomic_assign_expand_fenv): Use
> TARGET_EXPR instead of MODIFY_EXPR for first assignment to
> fenv_var, fenv_clear and old_fenv variables.  For fenv_addr
> take address of TARGET_EXPR of fenv_var with void_node initializer.
> Formatting fixes.

Okay.

Thanks, David


[PATCH v2] diagnostics: Add %{...%} pretty-format support for URLs and use it in -Wpsabi diagnostcs

2020-04-29 Thread Jakub Jelinek via Gcc-patches
On Wed, Apr 29, 2020 at 09:24:09AM -0400, David Malcolm wrote:
> Overall I like the idea; some nits inline below (and I think it won't
> bootstrap due to a const-correctness issue).

I'll start a bootstrap soon.

> Ideally we ought to have an integration test for this in DejaGnu
> somewhere, somehow, but I don't think we have a way to write one, alas.

All we have right now is g++.target/ tests that check the wording,
but those test have from the common code the URLs disabled.

> I think this needs a little more explanation; perhaps something like:

Ack, added.

> Given that this passes in a pp, please rename this to
> "get_end_url_string to" avoid possible confusion that "end" might be a
> verb here, and thus might emit the codes to the pp.

Ok.

> Although the return type is a "char *" rather than "const char *" I
> think this comment could be improved by clarifying ownership.  How
> about:

Ok.

> There's some pre-existing repetition between arm, aarch64, and rs6000,
> in which this patch has to make the same changes in multiple places. 
> Could these be consolidated e.g.
> 
> void maybe_emit_gcc10_param_passing_abi_warning (tree type)
> {
>   /* something here; not sure what */
> }
> 
> That said it's a pre-existing thing.

I'd prefer not to at this point, all that could be commonized are
the two inform calls perhaps with a bool or enum arg which one it is,
but usually the backends prefer to keep their -Wpsabi stuff visible there.
Yes, it affects 4 targets (s390 too; haven't touched that one, because
there is a pending patch with two alternatives).

> > + const char *url
> > +   = get_changes_url ("gcc-10/changes.html#empty_base");
> 
> "url" is declared here (and elsewhere) as a "const char *", but later
> freed; that seems like a violation of const-correctness.  Hopefully we
> emit a warning about that.

Changed to char *, one could use free (CONST_CAST (char *, url)) too though.

> > @@ -6423,11 +6425,14 @@ aapcs_vfp_is_call_or_return_candidate (e
> >   if (warn_psabi_flags & WARN_PSABI_NO_UNIQUE_ADDRESS)
> > inform (input_location, "parameter passing for argument
> > of "
> > "type %qT with %<[[no_unique_address]]%>
> > members "
> > -   "changed in GCC 10.1", TYPE_MAIN_VARIANT
> > (type));
> > +   "changed in %{GCC 10.1%}",
> > +   TYPE_MAIN_VARIANT (type), url);
> 
> A different bikeshed: it might be better style for the hyperlink to
> cover the text "%{changed in GCC 10.1%}" given that the link is
> describing a specific change in GCC 10.1 rather than GCC 10.1 itself. 

I think I can't do that, because there are two inform calls and while one
has the changed in GCC 10.1 in that order, the other doesn't, as it has
changed to match C++14 in GCC 10.1.  Additionally, for translations, I think
some translators will need to reorder the words in various ways, and am not
sure what they would do if e.g. the translated changed word is way appart
from in GCC 10.1.  One could use
... type %1$qT ... %2${changed%} ... %2${in GCC 10.1%} perhaps, which
would mean two separate underlined words, but not sure if they'd figure it
out.
Is just %{in GCC 10.1%} good enough (in the patch below)?

I've also included the missing free (url); in rs6000-call.c Richard
Sandiford reported.

2020-04-29  Jakub Jelinek  

* configure.ac (-with-changes-root-url): New configure option,
defaulting to https://gcc.gnu.org/.
* Makefile.in (CFLAGS-opts.o): Define CHANGES_ROOT_URL for
opts.c.
* pretty-print.c (get_end_url_string): New function.
(pp_format): Handle %{ and %} for URLs.
(pp_begin_url): Use pp_string instead of pp_printf.
(pp_end_url): Use get_end_url_string.
* opts.h (get_changes_url): Declare.
* opts.c (get_changes_url): New function.
* config/rs6000/rs6000-call.c: Include opts.h.
(rs6000_discover_homogeneous_aggregate): Use %{in GCC 10.1%} instead
of just in GCC 10.1 in diagnostics and add URL.
* config/arm/arm.c (aapcs_vfp_is_call_or_return_candidate): Likewise.
* config/aarch64/aarch64.c (aarch64_vfp_is_call_or_return_candidate):
Likewise.
* configure: Regenerated.

* c-format.c (PP_FORMAT_CHAR_TABLE): Add %{ and %}.

--- gcc/configure.ac.jj 2020-04-29 15:28:01.907010894 +0200
+++ gcc/configure.ac2020-04-29 15:28:27.748628849 +0200
@@ -986,6 +986,20 @@ AC_ARG_WITH(documentation-root-url,
 )
 AC_SUBST(DOCUMENTATION_ROOT_URL)
 
+# Allow overriding the default URL for GCC changes
+AC_ARG_WITH(changes-root-url,
+AS_HELP_STRING([--with-changes-root-url=URL],
+   [Root for GCC changes URLs]),
+[case "$withval" in
+  yes) AC_MSG_ERROR([changes root URL not specified]) ;;
+  no)  AC_MSG_ERROR([changes root URL not specified]) ;;
+  *)   CHANGES_ROOT_URL="$withval"
+  ;;
+ esac],
+ CHANGES_ROOT_URL="https://gcc.gnu.org/";
+)
+AC_

Re: [Patch, Fortran] OpenMP/OpenACC – fix more issues with OPTIONAL

2020-04-29 Thread Tobias Burnus

Hi Thomas,

was a bit on the backburner but I now digged again.
See: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94848

The problem is a generated static array variable in the
device function:
  static integer(kind=4) A.12[3] = {1, 2, 3};
used as
  _26 = A.12[S.13_67];

With -ftree-pre, the expressions using _26 now directly use
the value (1, 2 and 3). It seems as if the variable A.12 is
eliminated before writing to LTO but the usage (A.12[S.…])
is not – at least it still appears in the device dumps.

Maybe it is also related to something else, but crucial is
the -ftree-pre on the host side; the optimization level on
the device lto1 side is irrelevant.

Cheers,

Tobias

On 4/29/20 12:00 PM, Thomas Schwinge wrote:


Hi Tobias!

Do you happen to have any update regarding this one:

On 2020-01-08T09:55:06+0100, Tobias Burnus  wrote:

On 1/8/20 9:33 AM, Thomas Schwinge wrote:

With 'dg-do run' added, on [...] x86_64-pc-linux-gnu with offloading I'm seeing:

| PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O0  (test for 
excess errors)
| PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O0  execution test
| PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O1  (test for 
excess errors)
| PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O1  execution test
| FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O2  (test for 
excess errors)
| UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -O2  
compilation failed to produce executable
| FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 
-fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  
(test for excess errors)
| UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 
-fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  
compilation failed to produce executable
| FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -g  (test for 
excess errors)
| UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -g  
compilation failed to produce executable
| FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -Os  (test for 
excess errors)
| UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -Os  
compilation failed to produce executable


... due to:
  /tmp/cciVc43I.o:(.gnu.offload_vars+0x10): undefined reference to 
`A.12.4064'
which may be something like PR90779, PR85063, PR84592, PR90779,
PR80411, PR71536 -- or something else.

Hmm. It is surely among the listed items, if all fails in the last item.
Note that PR85063 is fixed and PR84592 a duplicate of PR90779 (which is
listed twice). To through in another number it could also be a variant
of PR 92029 to though in yet another number …


Grüße
  Thomas
-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter

-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter


[committed] Fix a couple more instruction length issues on the H8 port

2020-04-29 Thread Jeff Law via Gcc-patches

And another case were the H8 port had the wrong lengths which resulted in using 
a
short branch instead of the necessary long branch and the short branch going off
into never-never land.   It usually "worked" anyway, but if addresses in the C
runtime line up just right we'd get a fault.

I know we're close to an RC, but this seems safe enough given it just hits the 
H8
port and only the H8SX variant and only corrects the length of two relatively
uncommon instructions.

Installing on the trunk momentarily.

Jeff
commit 392aa7d7adfbd84253121d2ef779bf3c627e8d0b
Author: Jeff Law 
Date:   Wed Apr 29 10:19:22 2020 -0400

Fix some testsuite failures for H8/SX multilibs where short branches 
where used when long branches were necessary.

* config/h8300/h8300.md (H8/SX div patterns): All H8/SX specific
division instructions are 4 bytes long.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 80064da83ce..a2d4a1b82f4 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2020-04-29  Jeff Law  
+
+   * config/h8300/h8300.md (H8/SX div patterns): All H8/SX specific
+   division instructions are 4 bytes long.
+
 2020-04-29  Jakub Jelinek  
 
PR target/94826
diff --git a/gcc/config/h8300/h8300.md b/gcc/config/h8300/h8300.md
index 3e5cdbeeebe..a86b8ea2074 100644
--- a/gcc/config/h8300/h8300.md
+++ b/gcc/config/h8300/h8300.md
@@ -1218,7 +1218,7 @@
  (match_operand:HSI 2 "reg_or_nibble_operand" "r IP4>X")))]
   "TARGET_H8300SX"
   { return mode == HImode ? "divu.w\\t%T2,%T0" : "divu.l\\t%S2,%S0"; }
-  [(set_attr "length" "2")])
+  [(set_attr "length" "4")])
 
 (define_insn "div3"
   [(set (match_operand:HSI 0 "register_operand" "=r")
@@ -1226,7 +1226,7 @@
 (match_operand:HSI 2 "reg_or_nibble_operand" "r IP4>X")))]
   "TARGET_H8300SX"
   { return mode == HImode ? "divs.w\\t%T2,%T0" : "divs.l\\t%S2,%S0"; }
-  [(set_attr "length" "2")])
+  [(set_attr "length" "4")])
 
 (define_insn "udivmodqi4"
   [(set (match_operand:QI 0 "register_operand" "=r")


Re: [PATCH v2] diagnostics: Add %{...%} pretty-format support for URLs and use it in -Wpsabi diagnostcs

2020-04-29 Thread David Malcolm via Gcc-patches
On Wed, 2020-04-29 at 15:52 +0200, Jakub Jelinek wrote:
> On Wed, Apr 29, 2020 at 09:24:09AM -0400, David Malcolm wrote:

[...]

> > There's some pre-existing repetition between arm, aarch64, and
> > rs6000,
> > in which this patch has to make the same changes in multiple
> > places. 
> > Could these be consolidated e.g.
> > 
> > void maybe_emit_gcc10_param_passing_abi_warning (tree type)
> > {
> >   /* something here; not sure what */
> > }
> > 
> > That said it's a pre-existing thing.
> 
> I'd prefer not to at this point, all that could be commonized are
> the two inform calls perhaps with a bool or enum arg which one it is,
> but usually the backends prefer to keep their -Wpsabi stuff visible
> there.
> Yes, it affects 4 targets (s390 too; haven't touched that one,
> because
> there is a pending patch with two alternatives).

One other benefit is to move the allocation/free of the URL string so
that it's guarded by the same condition as the "inform" call.

Is there a chance this patch could be doing a bunch of extra allocation
and freeing of URL strings that never get used?  How often does this
code get called?

Idea: introduce a static allocation for this

const char *
get_url_for_gcc_10_empty_base ()
{
  static const char *url;
  if (!url)
 url = get_changes_url ("gcc-10/changes.html#empty_base");
  return url;
}

or somesuch?

> > > +   const char *url
> > > + = get_changes_url ("gcc-10/changes.html#empty_base");
> > 
> > "url" is declared here (and elsewhere) as a "const char *", but
> > later
> > freed; that seems like a violation of const-
> > correctness.  Hopefully 
> > we
> > emit a warning about that.
> 
> Changed to char *, one could use free (CONST_CAST (char *, url)) too
> though.
> 
> > > @@ -6423,11 +6425,14 @@ aapcs_vfp_is_call_or_return_candidate (e
> > > if (warn_psabi_flags & WARN_PSABI_NO_UNIQUE_ADDRESS)
> > >   inform (input_location, "parameter passing for argument
> > > of "
> > >   "type %qT with %<[[no_unique_address]]%>
> > > members "
> > > - "changed in GCC 10.1", TYPE_MAIN_VARIANT
> > > (type));
> > > + "changed in %{GCC 10.1%}",
> > > + TYPE_MAIN_VARIANT (type), url);
> > 
> > A different bikeshed: it might be better style for the hyperlink to
> > cover the text "%{changed in GCC 10.1%}" given that the link is
> > describing a specific change in GCC 10.1 rather than GCC 10.1
> > itself. 
> 
> I think I can't do that, because there are two inform calls and while
> one
> has the changed in GCC 10.1 in that order, the other doesn't, as it
> has
> changed to match C++14 in GCC 10.1.  Additionally, for translations,
> I think
> some translators will need to reorder the words in various ways, and
> am not
> sure what they would do if e.g. the translated changed word is way
> appart
> from in GCC 10.1.  One could use
> ... type %1$qT ... %2${changed%} ... %2${in GCC 10.1%} perhaps, which
> would mean two separate underlined words, but not sure if they'd
> figure it
> out.
> Is just %{in GCC 10.1%} good enough (in the patch below)?

Yes.

> I've also included the missing free (url); in rs6000-call.c Richard
> Sandiford reported.

Thanks.

> 2020-04-29  Jakub Jelinek  
> 
>   * configure.ac (-with-changes-root-url): New configure option,
>   defaulting to https://gcc.gnu.org/.
>   * Makefile.in (CFLAGS-opts.o): Define CHANGES_ROOT_URL for
>   opts.c.
>   * pretty-print.c (get_end_url_string): New function.
>   (pp_format): Handle %{ and %} for URLs.
>   (pp_begin_url): Use pp_string instead of pp_printf.
>   (pp_end_url): Use get_end_url_string.
>   * opts.h (get_changes_url): Declare.
>   * opts.c (get_changes_url): New function.
>   * config/rs6000/rs6000-call.c: Include opts.h.
>   (rs6000_discover_homogeneous_aggregate): Use %{in GCC 10.1%}
> instead
>   of just in GCC 10.1 in diagnostics and add URL.
>   * config/arm/arm.c (aapcs_vfp_is_call_or_return_candidate):
> Likewise.
>   * config/aarch64/aarch64.c
> (aarch64_vfp_is_call_or_return_candidate):
>   Likewise.
>   * configure: Regenerated.
> 
>   * c-format.c (PP_FORMAT_CHAR_TABLE): Add %{ and %}.
> 

[...]

LGTM, with the caveat about allocation noted above.


Thanks
Dave



Re: [PATCH v2] diagnostics: Add %{...%} pretty-format support for URLs and use it in -Wpsabi diagnostcs

2020-04-29 Thread Jakub Jelinek via Gcc-patches
On Wed, Apr 29, 2020 at 10:25:38AM -0400, David Malcolm wrote:
> > I'd prefer not to at this point, all that could be commonized are
> > the two inform calls perhaps with a bool or enum arg which one it is,
> > but usually the backends prefer to keep their -Wpsabi stuff visible
> > there.
> > Yes, it affects 4 targets (s390 too; haven't touched that one,
> > because
> > there is a pending patch with two alternatives).
> 
> One other benefit is to move the allocation/free of the URL string so
> that it's guarded by the same condition as the "inform" call.
> 
> Is there a chance this patch could be doing a bunch of extra allocation
> and freeing of URL strings that never get used?  How often does this
> code get called?

I think it will be called infrequently, and emitting a diagnostic is already
slow.  E.g. the URLs for warnings are also allocated and freed during the
warning{,_at} call.  So, not sure it is worth wasting a static variable on
it, especially for all the other 52 or how many backends that don't need it.

But if you feel strongly about it, can do it that way, though not really
sure what is the best spot to place it, calls.[ch] ?

Jakub



Re: [PATCH] rtl cse: Fix PR94740, ICE on testsuite/gcc.dg/sso/t5.c with -mcpu=future -mpcrel -O1

2020-04-29 Thread Jeff Law via Gcc-patches
On Tue, 2020-04-28 at 20:03 -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Tue, Apr 28, 2020 at 08:38:56AM +0100, Richard Sandiford wrote:
> > Also, the (const:P ...) ought to be there even outside of a MEM.  E.g. we
> > ought to have:
> > 
> >   (set (reg X) (const:P (plus:P (symbol_ref:P S) (const_int D
> > 
> > rather than:
> > 
> >   (set (reg X) (plus:P (symbol_ref:P S) (const_int D)))
> 
> What are the actual rules?  Where is this documented?
> 
> Of course it is highly desirable to have CONST around such constant
> addresses, but when is it *required*?  And what *is* a constant address
> (in this context)?
I don't know if it's documented, but it's been expected practice for decades.

It includes symbolic constants, and the results of simple arithmetic on symbolic
constants.  It (to the best of my knowledge) does not include logical operations
on symbolic constants.

Jeff



Re: [PATCH] diagnostics: Add %{...%} pretty-format support for URLs and use it in -Wpsabi diagnostcs

2020-04-29 Thread Jonathan Wakely via Gcc-patches

On 29/04/20 09:24 -0400, David Malcolm wrote:

On Wed, 2020-04-29 at 14:46 +0200, Jakub Jelinek wrote:



@@ -6416,6 +6416,8 @@ aapcs_vfp_is_call_or_return_candidate (e
  && ((alt = aapcs_vfp_sub_candidate (type, &new_mode,
NULL))
  != ag_count))
{
+ const char *url
+   = get_changes_url ("gcc-10/changes.html#empty_base");


"url" is declared here (and elsewhere) as a "const char *", but later
freed; that seems like a violation of const-correctness.  Hopefully we
emit a warning about that.


Since this is C++, it should be an error not a warning.





Re: [PATCH] rtl cse: Fix PR94740, ICE on testsuite/gcc.dg/sso/t5.c with -mcpu=future -mpcrel -O1

2020-04-29 Thread Richard Sandiford
Peter Bergner  writes:
> On 4/28/20 5:39 PM, Richard Sandiford wrote:
>> If we use simplify_gen_binary then I don't think we need to validate
>> each change individually.  It should be enough to do something like:
>> 
>>   x0 = cse_process_notes (XEXP (x, 0), object, changed);
>>   x1 = cse_process_notes (XEXP (x, 1), object, changed);
>>   if (x0 == XEXP (x, 0) && x1 == XEXP (x, 1))
>> return x;
>>   return simplify_gen_binary (code, GET_MODE (x), x0, x1);
>
> Ok, I wasn't sure whether simplify_gen_binary verified its changes or not.
>
>> Alternatively, in the hope of reducing redundant rtl, I guess we could
>> add the switch statement after the format walk and do:
>> 
>>   switch (GET_RTX_CLASS (code))
>> {
>> case RTX_BIN_ARITH:
>> case RTX_COMM_ARITH:
>>   if (rtx new_rtx = simplify_binary (code, GET_MODE (x),
>>   XEXP (x, 0), XEXP (x, 1)))
>>  return new_rtx;
>>   break;
>> default:
>>   break;
>>}

To answer your later question, I meant simplify_binary_operation rather
than simplify_binary here.  Sorry, I always forget that the simplify_gen_*
and non-gen versions use different naming schemes.

But then I guess we're effectively back to acting like simplify_rtx,
as in your original patch, but without checking whether we're in a MEM.

The traditional problem with doing recursive replacement followed by
simplification is that we can lose the original mode for the operands
of unary operators.  E.g. (zero_extend:DI (reg:M X)) can become
(zero_extend:DI (const_int -1)), which isn't well-formed.
So calling simplify_rtx for all rtxes (not just binary ones)
could lead to problems.  More below.

> Doesn't moving this after the format walk, create the situation of
> producing constant expressions without (const:P ...) and then fixing
> them up after the fact like my original patch did?  That's why I placed
> the previous version before the format walk, but if you're ok with that
> now, then that's fine with me.  That said, the PLUS will be wrapped in
> a (const:P ...) before it's substituted into the MEM, so maybe it was
> just the MEM you were worried about?

To be honest, when reviewing the original patch, I think I'd missed that
cse_process_notes and cse_process_notes_1 were co-recursive, and so
every level of the MEM would be simplified by your simplify_rtx call.
I originally thought it was building up a new MEM and only then
simplifying it.

But yeah, like you say, the problem is that if we just add the
simplification on top, we're still validating the original unsimplified
replacement beforehand, so there's still a potential for non-canonical
rtl to seep into the validation routine before it gets corrected by
a parent call.  That might cause problems in future, even if it
doesn't yet.

Ugh.

Unfortunately, there doesn't seem to be a perfect off-the-shelf
routine to use here.  validate_replace_rtx_part looks like it has
the right kind of idea, but can only handle a single "from" and "to"
value.  simplify_replace_fn_rtx can handle general replacements,
but generates more garbage rtl and doesn't verify the new MEMs.

If this is a bug we need to fix for GCC 10 then how about this:

(1) change the MEM case to something like:

  {
bool addr_changed = false;
rtx addr = cse_process_notes (XEXP (x, 0), x, &addr_changed);
if (addr_changed && validate_change (x, &XEXP (x, 0), addr, 0))
  *changed = true;
return x;
  }

which makes it safe to do the validation only at the top level
of the MEM.

(2) make the:

   for (i = 0; i < GET_RTX_LENGTH (code); i++)
 if (fmt[i] == 'e')

loop assign directly to XEXP (x, i) if the new rtx is different,
instead of using validate_change.  This ensures that temporarily
invalid rtl doesn't get verified.

(passing a null object to validate_change would have the same effect,
but that feels a bit hacky)

(3) use the simplify_binary_operator thing above, since that at least
should be safe, even if it potentially leaves similar bugs unfixed
for other operators.

Sorry for the runaround :-(

Hopefully this area is something we can clean up for GCC 11.

Thanks,
Richard


[PATCH] arm: Warn if IRQ handler is not compiled with -mgeneral-regs-only [PR target/94743]

2020-04-29 Thread Christophe Lyon via Gcc-patches
The interrupt attribute does not guarantee that the FP registers are
saved, which can result in problems difficult to debug.

Saving the FP registers and status registers can be a large penalty,
so it's probably not desirable to do that all the time.

If the handler calls other functions, we'd likely need to save all of
them, for lack of knowledge of which registers they actually use.

This is even more obscure for the end-user when the compiler inserts
calls to helper functions such as memcpy (some multilibs do use FP
registers to speed it up).

In the PR, we discussed adding routines in libgcc to save the FP
context and saving only locally-clobbered FP registers, but I think
this is too intrusive for stage 4.

In the mean time, emit a warning to suggest re-compiling with
-mgeneral-regs-only. Note that this can lead to errors if the code
uses floating-point and -mfloat-abi=hard, eg:
argument of type 'double' not permitted with -mgeneral-regs-only

This can be troublesome for the user, but at least this would make
them aware of the latent issue.

The patch adds two testcases:
- pr94734-1.c checks that a warning is emitted. One function can make
  implicit calls to runtime floating-point routines, the other one
  doesn't. We can improve the diagnostic later not to warn in the
  second case.

- pr94734-2.c checks that no warning is emitted when using
  -mgeneral-regs-only.

2020-04-29  Christophe Lyon  

PR target/94743
gcc/
* config/arm/arm.c (arm_handle_isr_attribute): Warn if
-mgeneral-regs-only is not used.

gcc/testsuite/
* gcc.target/arm/pr94743-1.c: New test.
* gcc.target/arm/pr94743-2.c: New test.
---
 gcc/config/arm/arm.c |  5 +
 gcc/testsuite/gcc.target/arm/pr94743-1.c | 20 
 gcc/testsuite/gcc.target/arm/pr94743-2.c | 17 +
 3 files changed, 42 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/arm/pr94743-1.c
 create mode 100644 gcc/testsuite/gcc.target/arm/pr94743-2.c

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 6a6e804..34aad1d 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -7176,6 +7176,11 @@ arm_handle_isr_attribute (tree *node, tree name, tree 
args, int flags,
   name);
  *no_add_attrs = true;
}
+  else if (TARGET_VFP_BASE)
+   {
+ warning (OPT_Wattributes, "FP registers might be clobbered despite 
%qE attribute: compile with -mgeneral-regs-only",
+  name);
+   }
   /* FIXME: the argument if any is checked for type attributes;
 should it be checked for decl ones?  */
 }
diff --git a/gcc/testsuite/gcc.target/arm/pr94743-1.c 
b/gcc/testsuite/gcc.target/arm/pr94743-1.c
new file mode 100644
index 000..67700c6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr94743-1.c
@@ -0,0 +1,20 @@
+/* PR target/94743 */
+/* { dg-do compile } */
+
+typedef struct {
+  double fpdata[32];
+} dummy_t;
+
+dummy_t global_d;
+dummy_t global_d1;
+
+__attribute__ ((interrupt("IRQ"))) void IRQ_HDLR_Test(void)
+{ /* { dg-warning { FP registers might be clobbered despite 'interrupt' 
attribute: compile with -mgeneral-regs-only} "" { target *-*-* } . } */
+  global_d.fpdata[3] += global_d.fpdata[3] * global_d1.fpdata[3];
+}
+
+
+__attribute__ ((interrupt("IRQ"))) void IRQ_HDLR_Test2(void)
+{ /* { dg-warning { FP registers might be clobbered despite 'interrupt' 
attribute: compile with -mgeneral-regs-only} "" { target *-*-* } . } */
+  global_d.fpdata[3] = 1.0;
+}
diff --git a/gcc/testsuite/gcc.target/arm/pr94743-2.c 
b/gcc/testsuite/gcc.target/arm/pr94743-2.c
new file mode 100644
index 000..745fd36
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr94743-2.c
@@ -0,0 +1,17 @@
+/* PR target/94743 */
+/* { dg-do compile } */
+/* { dg-options "-mgeneral-regs-only" } */
+
+typedef struct {
+  /* Do not use floating-point types, which are not be compatible with
+ -mgeneral-regs-only under -mfloat-abi=hard */
+  int fpdata[32];
+} dummy_t;
+
+dummy_t global_d;
+dummy_t global_d1;
+
+__attribute__ ((interrupt("IRQ"))) void IRQ_HDLR_Test(void)
+{
+  global_d.fpdata[3] += global_d.fpdata[3] * global_d1.fpdata[3];
+}
-- 
2.7.4



[PATCH] arm: Remove duplicate entries in isr_attribute_args [PR target/57002]

2020-04-29 Thread Christophe Lyon via Gcc-patches
Remove two duplicate entries in isr_attribute_args ("abort" and
"ABORT").

2020-04-29  Christophe Lyon  

PR target/57002
gcc/
* config/arm/arm.c (isr_attribute_args): Remove duplicate entries.
---
 gcc/config/arm/arm.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 30a2a3a..6a6e804 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -3925,8 +3925,6 @@ static const isr_attribute_arg isr_attribute_args [] =
   { "fiq",   ARM_FT_FIQ },
   { "ABORT", ARM_FT_ISR },
   { "abort", ARM_FT_ISR },
-  { "ABORT", ARM_FT_ISR },
-  { "abort", ARM_FT_ISR },
   { "UNDEF", ARM_FT_EXCEPTION },
   { "undef", ARM_FT_EXCEPTION },
   { "SWI",   ARM_FT_EXCEPTION },
-- 
2.7.4



Re: [PATCH] match.pd: Decrease number of nop conversions around bitwise ops [PR94718]

2020-04-29 Thread Jakub Jelinek via Gcc-patches
On Tue, Apr 28, 2020 at 11:16:13AM +0200, Richard Biener wrote:
> I think we generally want to use :s, but OTOH even for your new case,
> since you restrict to nop-conversions, :s isn't strictly needed since
> VN should know how to materialize nop-converted variants.  I think
> it's different for the 2nd case you add since there we deal with
> bitops in different sign.
> 
> So maybe just remove the single_use () calls?

I've tried doing that, but that regresses e.g.
void bar (unsigned int, unsigned int);

unsigned int
foo (int x, int y)
{
  unsigned int x2 = x;
  unsigned int y2 = y;
  bar (x2, y2);
  return x2 | y2;
}

@@ -6,12 +6,14 @@ foo (int x, int y)
   unsigned int y2;
   unsigned int x2;
   unsigned int _7;
+  int _8;
 
[local count: 1073741824]:
   x2_2 = (unsigned int) x_1(D);
   y2_4 = (unsigned int) y_3(D);
   bar (x2_2, y2_4);
-  _7 = x2_2 | y2_4;
+  _8 = x_1(D) | y_3(D);
+  _7 = (unsigned int) _8;
   return _7;
 
 }

As the patch was meant purely as an IL size optimization, that is not the
case here.

> > Or should I e.g. move this GIMPLE rule as a separate simplify (under the
> > same for) and then use :s in there?
> 
> That wouldn't work, if a structural pattern matched others are not
> considered so it wouldn't ever be used.  genmatch even warns for this
> I think.

Ah, ok.

Jakub



Re: [PATCH v2] diagnostics: Add %{...%} pretty-format support for URLs and use it in -Wpsabi diagnostcs

2020-04-29 Thread David Malcolm via Gcc-patches
On Wed, 2020-04-29 at 16:42 +0200, Jakub Jelinek wrote:
> On Wed, Apr 29, 2020 at 10:25:38AM -0400, David Malcolm wrote:
> > > I'd prefer not to at this point, all that could be commonized are
> > > the two inform calls perhaps with a bool or enum arg which one it
> > > is,
> > > but usually the backends prefer to keep their -Wpsabi stuff
> > > visible
> > > there.
> > > Yes, it affects 4 targets (s390 too; haven't touched that one,
> > > because
> > > there is a pending patch with two alternatives).
> > 
> > One other benefit is to move the allocation/free of the URL string
> > so
> > that it's guarded by the same condition as the "inform" call.
> > 
> > Is there a chance this patch could be doing a bunch of extra
> > allocation
> > and freeing of URL strings that never get used?  How often does
> > this
> > code get called?
> 
> I think it will be called infrequently, and emitting a diagnostic is
> already
> slow.  

I was more concerned about the case for which the url is built but then
the "inform" is not called - I was worried it might happen per field of
a struct - but if you think that code path is infrequent, then I trust
your judgment.

> E.g. the URLs for warnings are also allocated and freed during the
> warning{,_at} call.

I believe I fixed that in abb4852434b3dee26301cc406472ac9450c621aa so
that it only builds the URLs if they're going to be used.

>   So, not sure it is worth wasting a static variable on
> it, especially for all the other 52 or how many backends that don't
> need it.
> 
> But if you feel strongly about it, can do it that way, though not
> really
> sure what is the best spot to place it, calls.[ch] ?

I don't feel strongly about it.

Thanks
Dave



[PATCH 1/1] IBM Z: vec_store_len_r/vec_load_len_r fix

2020-04-29 Thread Andreas Krebbel via Gcc-patches
This fixes a problem with the vec_store_len_r intrinsic.  The macros
mapping the intrinsic to a GCC builtin had the wrong signature.

With the patch an immediate length operand of vlrl/vstrl is handled
the same way as if it was passed in a register to vlrlr/vstrlr.
Values bigger than 15 always load the full vector.  If it can be
recognized that it is in effect a full vector register load or store
it is now implemented with vl/vst instead.

I'll commit it to mainline after successful bootstrap and regtest.

gcc/ChangeLog:

2020-04-29  Andreas Krebbel  

* config/s390/constraints.md ("j>f", "jb4"): New constraints.
* config/s390/vecintrin.h (vec_load_len_r, vec_store_len_r): Fix
macro definitions.
* config/s390/vx-builtins.md ("vlrlrv16qi", "vstrlrv16qi"): Add a
separate expander.
("*vlrlrv16qi", "*vstrlrv16qi"): Add alternative for vl/vst.
Change constraint for vlrl/vstrl to jb4.

gcc/testsuite/ChangeLog:

2020-04-29  Andreas Krebbel  

* gcc.target/s390/zvector/vec_load_len_r.c: New test.
* gcc.target/s390/zvector/vec_store_len_r.c: New test.
---
 gcc/config/s390/constraints.md| 14 ++-
 gcc/config/s390/vecintrin.h   |  6 +-
 gcc/config/s390/vx-builtins.md| 58 +---
 .../gcc.target/s390/zvector/vec_load_len_r.c  | 94 +++
 .../gcc.target/s390/zvector/vec_store_len_r.c | 94 +++
 5 files changed, 251 insertions(+), 15 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec_load_len_r.c
 create mode 100644 gcc/testsuite/gcc.target/s390/zvector/vec_store_len_r.c

diff --git a/gcc/config/s390/constraints.md b/gcc/config/s390/constraints.md
index 91e3db7a146..0b05c5ca729 100644
--- a/gcc/config/s390/constraints.md
+++ b/gcc/config/s390/constraints.md
@@ -38,6 +38,8 @@
 ;;  matching K constraint
 ;; jm6: An integer operand with the lowest order 6 bits all ones.
 ;; jdd: A constant operand that fits into the data section.
+;; j>f: An integer operand whose lower 32 bits are greater than or 
equal to 15
+;; jb4: An unsigned constant 4 bit operand.
 ;;t -- Access registers 36 and 37.
 ;;v -- Vector registers v0-v31.
 ;;C -- A signed 8-bit constant (-128..127)
@@ -425,7 +427,7 @@
 
 
 ;;
-;; Vector constraints follow.
+;; Vector and scalar constraints for constant values follow.
 ;;
 
 (define_constraint "j00"
@@ -462,6 +464,16 @@
   "@internal An integer operand with the lowest order 6 bits all ones."
   (match_operand 0 "const_int_6bitset_operand"))
 
+(define_constraint "j>f"
+  "@internal An integer operand whose lower 32 bits are greater than or equal 
to 15."
+  (and (match_code "const_int")
+   (match_test "(unsigned int)(ival & 0x) >= 15")))
+
+(define_constraint "jb4"
+  "@internal Constant unsigned integer 4 bit value"
+  (and (match_code "const_int")
+   (match_test "ival >= 0 && ival <= 15")))
+
 ;;
 ;; Memory constraints follow.
 ;;
diff --git a/gcc/config/s390/vecintrin.h b/gcc/config/s390/vecintrin.h
index bd9355ec9d0..8ef4f44bb34 100644
--- a/gcc/config/s390/vecintrin.h
+++ b/gcc/config/s390/vecintrin.h
@@ -111,8 +111,10 @@ __lcbb(const void *ptr, int bndry)
 #define vec_round(X)  __builtin_s390_vfi((X), 4, 4)
 #define vec_doublee(X) __builtin_s390_vfll((X))
 #define vec_floate(X) __builtin_s390_vflr((X), 0, 0)
-#define vec_load_len_r(X,Y) __builtin_s390_vlrl((Y),(X))
-#define vec_store_len_r(X,Y) __builtin_s390_vstrl((Y),(X))
+#define vec_load_len_r(X,L)\
+  (__vector unsigned char)__builtin_s390_vlrlr((L),(X))
+#define vec_store_len_r(X,Y,L) \
+  __builtin_s390_vstrlr((__vector signed char)(X),(L),(Y))
 
 #define vec_all_nan(a) \
   __extension__ ({ \
diff --git a/gcc/config/s390/vx-builtins.md b/gcc/config/s390/vx-builtins.md
index 0eed31923c5..6f1add02d0b 100644
--- a/gcc/config/s390/vx-builtins.md
+++ b/gcc/config/s390/vx-builtins.md
@@ -202,16 +202,34 @@
   "vlbb\t%v0,%1,%2"
   [(set_attr "op_type" "VRX")])
 
-(define_insn "vlrlrv16qi"
-  [(set (match_operand:V16QI  0 "register_operand"  "=v,v")
-   (unspec:V16QI [(match_operand:BLK 2 "memory_operand" "Q,Q")
-  (match_operand:SI  1 "nonmemory_operand"  "d,C")]
+; Vector load rightmost with length
+
+(define_expand "vlrlrv16qi"
+  [(set (match_operand:V16QI  0 "register_operand"  "")
+   (unspec:V16QI [(match_operand:BLK 2 "memory_operand""")
+  (match_operand:SI  1 "nonmemory_operand" "")]
+ UNSPEC_VEC_LOAD_LEN_R))]
+  "TARGET_VXE"
+{
+  /* vlrlr sets all length values beyond 15 to 15.  Emulate the same
+ behavior for immediate length operands.  vlrl would trigger a
+ SIGILL for too large immediate operands.  */
+  if (CONST_INT_P (operands[1])
+  && (UINTVAL (operands[1]) & 0xff

Re: [PATCH v2] diagnostics: Add %{...%} pretty-format support for URLs and use it in -Wpsabi diagnostcs

2020-04-29 Thread Jakub Jelinek via Gcc-patches
On Wed, Apr 29, 2020 at 11:34:06AM -0400, David Malcolm wrote:
> > I think it will be called infrequently, and emitting a diagnostic is
> > already
> > slow.  
> 
> I was more concerned about the case for which the url is built but then
> the "inform" is not called - I was worried it might happen per field of
> a struct - but if you think that code path is infrequent, then I trust
> your judgment.

When get_changes_url is called, then inform is also always called, and
I think inform even doesn't do any -Wsystem-headers disabling.
On rs6000 it should be clear from the patch itself, on aarch64/arm it is less 
so,
but it does:
if (... && warn_psabi && warn_psabi_flags && ...)
{
char *url = get_changes_url ("...");
if (warn_psabi_flags & WARN_PSABI_NO_UNIQUE_ADDRESS)
  inform (..., url);
else if (warn_psabi_flags & WARN_PSABI_EMPTY_CXX17_BASE)
  inform (..., url);
}
and the two bits are the only ones ever set in warn_psabi_flags.
So, this is not called e.g. with -Wno-psabi or not called again if it is the
same type as reported last time.

BTW, the patch successfully passed bootstrap/regtest on
{x86_64,i686,powerpc64le}-linux.

Jakub



arm: Fix vfp_operand_register for VFP HI regs

2020-04-29 Thread Christophe Lyon via Gcc-patches
Hi,

While looking at PR target/94743 I noticed an ICE when I tried to save
all the FP registers: this was because all HI registers wouldn't match
vfp_register_operand.

Regression-tested and bootstrapped OK.

2020-04-29  Christophe Lyon  

gcc/
* config/arm/predicates.md (vfp_register_operand): Use VFP_HI_REGS
instead of VFP_REGS.

OK?

Thanks,

Christophe
While looking at PR target/94743 I noticed an ICE when I tried to save
all the FP registers: this was because all HI registers wouldn't match
vfp_register_operand.

2020-04-29  Christophe Lyon  

gcc/
* config/arm/predicates.md (vfp_register_operand): Use VFP_HI_REGS
instead of VFP_REGS.
diff --git a/gcc/config/arm/predicates.md b/gcc/config/arm/predicates.md
index 009862e..dbd4fb4 100644
--- a/gcc/config/arm/predicates.md
+++ b/gcc/config/arm/predicates.md
@@ -161,7 +161,7 @@ (define_predicate "vfp_register_operand"
  || REGNO_REG_CLASS (REGNO (op)) == VFP_D0_D7_REGS
  || REGNO_REG_CLASS (REGNO (op)) == VFP_LO_REGS
  || (TARGET_VFPD32
- && REGNO_REG_CLASS (REGNO (op)) == VFP_REGS)));
+ && REGNO_REG_CLASS (REGNO (op)) == VFP_HI_REGS)));
 })
 
 (define_predicate "vfp_hard_register_operand"


Re: [PATCH v2] diagnostics: Add %{...%} pretty-format support for URLs and use it in -Wpsabi diagnostcs

2020-04-29 Thread David Malcolm via Gcc-patches
On Wed, 2020-04-29 at 17:49 +0200, Jakub Jelinek wrote:
> On Wed, Apr 29, 2020 at 11:34:06AM -0400, David Malcolm wrote:
> > > I think it will be called infrequently, and emitting a diagnostic
> > > is
> > > already
> > > slow.  
> > 
> > I was more concerned about the case for which the url is built but
> > then
> > the "inform" is not called - I was worried it might happen per
> > field of
> > a struct - but if you think that code path is infrequent, then I
> > trust
> > your judgment.
> 
> When get_changes_url is called, then inform is also always called,
> and
> I think inform even doesn't do any -Wsystem-headers disabling.
> On rs6000 it should be clear from the patch itself, on aarch64/arm it
> is less so,
> but it does:
> if (... && warn_psabi && warn_psabi_flags && ...)
> {
> char *url = get_changes_url ("...");
> if (warn_psabi_flags & WARN_PSABI_NO_UNIQUE_ADDRESS)
>   inform (..., url);
> else if (warn_psabi_flags & WARN_PSABI_EMPTY_CXX17_BASE)
>   inform (..., url);
> }
> and the two bits are the only ones ever set in warn_psabi_flags.
> So, this is not called e.g. with -Wno-psabi or not called again if it
> is the
> same type as reported last time.

Aha - thanks for clarifying.

> BTW, the patch successfully passed bootstrap/regtest on
> {x86_64,i686,powerpc64le}-linux.

The v2 patch looks good to me.

Thanks
Dave



Re: [PATCH] var-tracking.c: Fix possible use of uninitialized variable pre

2020-04-29 Thread Jeff Law via Gcc-patches
On Tue, 2020-04-28 at 11:44 +0200, Richard Biener via Gcc-patches wrote:
> 
> Btw, does s390 have different inlining parameters somehow?
I think so.  We saw a ton of backend warnings that were specific to the s390
builds in Fedora that appeared to be triggered by different inlining decisions.

It happened enough that one of the ideas we're going to explore is seeing if we
can tune the x86_64 port to use the same heuristics.  That way we can try to 
find
more of these issues in our tester without having to dramatically increase s390x
resourcing.  This hasn't started, but I expect we'll be looking at it in the 
fall
(it's unclear if we can just use flags or will need hackery to the x86 port, but
we're happy to share whatever we find with the wider community).

jeff





Re: [PATCH] s390: Fix up -Wpsabi diagnostcs + analysis [PR94704]

2020-04-29 Thread Andreas Krebbel via Gcc-patches
On 28.04.20 17:44, Jakub Jelinek wrote:
> On Tue, Apr 28, 2020 at 04:23:24PM +0200, Andreas Krebbel via Gcc-patches 
> wrote:
>> Given that this is something which hasn't been covered by the ABI so far I'm 
>> not sure we really need
>> a -Wpsabi warning for that.
> 
> Attached are two (updated) versions of the patch on top of the
> powerpc+middle-end patch just posted.
> 
> The first one emits two separate -Wpsabi warnings like powerpc, one for
> the -std=c++14 vs. -std=c++17 ABI difference and one for GCC 9 vs. 10
> [[no_unique_address]] passing changes, the other one is silent about the
> second case.  Will bootstrap/regtest both and you can choose.

Ok for the first one having both warnings. I don't see a reason to be different 
from Power in that
regard.

Thank you very much!

Andreas


Re: [PATCH] aarch64: prefer using csinv, csneg in zero extend contexts

2020-04-29 Thread Richard Sandiford
Thanks for doing this.

Alex Coplan  writes:
> Hello,
>
> The attached patch adds an optimization to the AArch64 backend to catch
> additional cases where we can use csinv and csneg.
>
> Given the C code:
>
> unsigned long long inv(unsigned a, unsigned b, unsigned c)
> {
>   return a ? b : ~c;
> }
>
> Prior to this patch, AArch64 GCC at -O2 generates:
>
> inv:
> cmp w0, 0
> mvn w2, w2
> cselw0, w1, w2, ne
> ret
>
> and after applying the patch, we get:
>
> inv:
> cmp w0, 0
> csinv   w0, w1, w2, ne
> ret
>
> The new pattern also catches the optimization for the symmetric case where the
> body of foo reads a ? ~b : c.

Yeah.  The thing that surprised me was that the non-extending form
has the operator in the "then" arm of the if_then_else:

(define_insn "*csinv3_insn"
  [(set (match_operand:GPI 0 "register_operand" "=r")
(if_then_else:GPI
  (match_operand 1 "aarch64_comparison_operation" "")
  (not:GPI (match_operand:GPI 2 "register_operand" "r"))
  (match_operand:GPI 3 "aarch64_reg_or_zero" "rZ")))]
  ""
  "csinv\\t%0, %3, %2, %M1"
  [(set_attr "type" "csel")]
)

whereas the new pattern has it in the "else" arm.  I think for the
non-extending form, having the operator in the "then" arm really is
canonical and close to guaranteed, since that's how ifcvt processes
half diamonds.

But when both values are zero-extended, ifcvt has to convert a full
diamond, and I'm not sure we can rely on the two sides coming out in
a particular order.  I think the two ?: orders above work with one pattern
thanks to simplifications that happen before entering gimple.  If instead
the operator is split out into a separate statement:

unsigned long long
inv1(int a, unsigned b, unsigned c)
{
  unsigned d = ~c;
  return a ? b : d;
}

then we go back to using the less optimal CSEL sequence.

So I think it would be worth having a second pattern for the opposite order.
Alternatively, we could add a new rtl canonicalisation rule to force the
if_then_else operands to end up in a particular order, but that's more
complex and is likely to affect other targets.

> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index c7c4d1dd519..2f7367c0b1a 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -4390,6 +4390,19 @@
>[(set_attr "type" "csel")]
>  )
>  
> +(define_insn "*csinv3_uxtw_insn"
> +  [(set (match_operand:DI 0 "register_operand" "=r")
> +(if_then_else:DI
> +  (match_operand 1 "aarch64_comparison_operation" "")
> +  (zero_extend:DI
> +(match_operand:SI 2 "aarch64_reg_or_zero" "rZ"))
> +  (zero_extend:DI
> +(NEG_NOT:SI (match_operand:SI 3 "register_operand" "r")]
> +  ""
> +  "cs\\t%w0, %w2, %w3, %m1"
> +  [(set_attr "type" "csel")]
> +)
> +

The operand to a zero_extend can never be a const_int, so operand 2
should just be a register_operand with an "r" constraint.

At the risk of feature creep :-) a useful third pattern could be
to combine a zero-extended operator result with an existing DImode value.
In that case, the existing DImode value really can be "rZ" and should
always be in the "else" arm of the if_then_else.  E.g.:

unsigned long long
f(int a, unsigned long b, unsigned c)
{
  return a ? b : ~c;
}

unsigned long long
g(int a, unsigned c)
{
  return a ? 0 : ~c;
}

But that's definitely something that can be left for later.

Thanks,
Richard


Re: [PATCH] arm: Remove duplicate entries in isr_attribute_args [PR target/57002]

2020-04-29 Thread Ramana Radhakrishnan via Gcc-patches
On Wed, Apr 29, 2020 at 4:19 PM Christophe Lyon via Gcc-patches
 wrote:
>
> Remove two duplicate entries in isr_attribute_args ("abort" and
> "ABORT").
>
> 2020-04-29  Christophe Lyon  
>
> PR target/57002
> gcc/
> * config/arm/arm.c (isr_attribute_args): Remove duplicate entries.


OK, this would count as obvious.

Ramana
> ---
>  gcc/config/arm/arm.c | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index 30a2a3a..6a6e804 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -3925,8 +3925,6 @@ static const isr_attribute_arg isr_attribute_args [] =
>{ "fiq",   ARM_FT_FIQ },
>{ "ABORT", ARM_FT_ISR },
>{ "abort", ARM_FT_ISR },
> -  { "ABORT", ARM_FT_ISR },
> -  { "abort", ARM_FT_ISR },
>{ "UNDEF", ARM_FT_EXCEPTION },
>{ "undef", ARM_FT_EXCEPTION },
>{ "SWI",   ARM_FT_EXCEPTION },
> --
> 2.7.4
>


RE: arm: Fix vfp_operand_register for VFP HI regs

2020-04-29 Thread Kyrylo Tkachov
Hi Christophe,

> -Original Message-
> From: Gcc-patches  On Behalf Of
> Christophe Lyon via Gcc-patches
> Sent: 29 April 2020 16:53
> To: gcc Patches 
> Subject: arm: Fix vfp_operand_register for VFP HI regs
> 
> Hi,
> 
> While looking at PR target/94743 I noticed an ICE when I tried to save
> all the FP registers: this was because all HI registers wouldn't match
> vfp_register_operand.

Hmm, I see that arm_regno_class indeed never returns VFP_REGS and would return 
VFP_HI_REGS here.
So the patch looks correct to me.
Do you have a testcase for the ICE to add to the testsuite?

Thanks,
Kyrill

> 
> Regression-tested and bootstrapped OK.
> 
> 2020-04-29  Christophe Lyon  
> 
> gcc/
> * config/arm/predicates.md (vfp_register_operand): Use VFP_HI_REGS
> instead of VFP_REGS.
> 
> OK?
> 
> Thanks,
> 
> Christophe


Re: [PATCH] match.pd: Decrease number of nop conversions around bitwise ops [PR94718]

2020-04-29 Thread Richard Biener
On Wed, 29 Apr 2020, Jakub Jelinek wrote:

> On Tue, Apr 28, 2020 at 11:16:13AM +0200, Richard Biener wrote:
> > I think we generally want to use :s, but OTOH even for your new case,
> > since you restrict to nop-conversions, :s isn't strictly needed since
> > VN should know how to materialize nop-converted variants.  I think
> > it's different for the 2nd case you add since there we deal with
> > bitops in different sign.
> > 
> > So maybe just remove the single_use () calls?
> 
> I've tried doing that, but that regresses e.g.
> void bar (unsigned int, unsigned int);
> 
> unsigned int
> foo (int x, int y)
> {
>   unsigned int x2 = x;
>   unsigned int y2 = y;
>   bar (x2, y2);
>   return x2 | y2;
> }
> 
> @@ -6,12 +6,14 @@ foo (int x, int y)
>unsigned int y2;
>unsigned int x2;
>unsigned int _7;
> +  int _8;
>  
> [local count: 1073741824]:
>x2_2 = (unsigned int) x_1(D);
>y2_4 = (unsigned int) y_3(D);
>bar (x2_2, y2_4);
> -  _7 = x2_2 | y2_4;
> +  _8 = x_1(D) | y_3(D);
> +  _7 = (unsigned int) _8;
>return _7;
>  
>  }
> 
> As the patch was meant purely as an IL size optimization, that is not the
> case here.

Hmm OK.  Let's go with your original patch then for GCC 11.

Richard.

> > > Or should I e.g. move this GIMPLE rule as a separate simplify (under the
> > > same for) and then use :s in there?
> > 
> > That wouldn't work, if a structural pattern matched others are not
> > considered so it wouldn't ever be used.  genmatch even warns for this
> > I think.
> 
> Ah, ok.
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)


Re: [PATCH] toplev.c: Check for null argument to fprintf

2020-04-29 Thread Jeff Law via Gcc-patches
On Tue, 2020-04-28 at 09:03 +0200, Stefan Schulze Frielinghaus via Gcc-patches
wrote:
> Ensure that CF does not equal NULL in function output_stack_usage_1
> before calling fprintf.  This fixes the following warning/error:
> 
> gcc/toplev.c:976:13: error: argument 1 null where non-null expected [-
> Werror=nonnull]
>   976 | fprintf (cf, "\\n" HOST_WIDE_INT_PRINT_DEC " bytes (%s)",
>   | ^
>   977 |   stack_usage,
>   |   
>   978 |   stack_usage_kind_str[stack_usage_kind]);
> 
> An example call side where CF is NULL is in function output_stack_usage.
> 
> Bootstrapped and regtested successfully on S/390. Ok for master?
> 
> gcc/ChangeLog:
> 
> 2020-04-28  Stefan Schulze Frielinghaus  
> 
>   * toplev.c (output_stack_usage_1): Ensure that first
>   argument to fprintf is not null.
OK
jeff



Re: [PATCH] contrib/vimrc: Reduce textwidth for commit messages

2020-04-29 Thread Jeff Law via Gcc-patches
On Tue, 2020-04-28 at 00:31 -0400, Patrick Palka via Gcc-patches wrote:
> The bundled vimrc script, when enabled, currently sets the text width to
> 80 characters for all files within the source directory.  Unfortunately
> this means the setting also applies when editing .git/COMMIT_EDITMSG,
> overriding the default and standard text width of 72 for Git commit
> messages.  (A text width of 80 is too much for commit messages because
> Git indents commit messages by four spaces when displaying them via e.g.
> git log, leading to a total displayed width of 84 characters.)
> 
> This patch explicitly sets the text width of Git commit messages to 72
> characters in accordance with standard practice.  (Alternatively we
> could avoid setting textwidth at all in this case and let the defaults
> kick in, but maybe it's better to be explicit?)
> 
> Tested by writing up this commit message :)  Is this OK to commit?
> 
> contrib/ChangeLog:
> 
>   * vimrc: Reduce textwidth to 72 for Git commit messages.
OK
jeff



Re: [PATCH] rs6000, Fix header comment for intrinsic function

2020-04-29 Thread Bill Schmidt via Gcc-patches

On 4/22/20 1:20 PM, Carl Love wrote:

GCC maintainers:

The following is a trivial patch to fix a comment describing the
intrinsic function _mm_movemask_epi8.  The comment was expanded to
clarify the layout of the returned result.

The patch does not make any functional changes.

Please let me know if the patch is OK for mainline and backporting as
appropriate.

Thanks.

  Carl Love
---
rs6000, Fix header comment for intrinsic function _mm_movemask_epi8

gcc/ChangeLog

2020-04-22  Carl Love  

* config/rs6000/emmintrin.h (_mm_movemask_epi8): Fix comment for the
function.


Drop "for the function" as Will suggested.



Signed-off-by: Carl Love 
---
  gcc/config/rs6000/emmintrin.h | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/emmintrin.h b/gcc/config/rs6000/emmintrin.h
index 2462cf5bdac..0872a75c0de 100644
--- a/gcc/config/rs6000/emmintrin.h
+++ b/gcc/config/rs6000/emmintrin.h
@@ -2033,7 +2033,9 @@ _mm_min_epu8 (__m128i __A, __m128i __B)
  #ifdef _ARCH_PWR8
  /* Intrinsic functions that require PowerISA 2.07 minimum.  */

-/* Creates a 4-bit mask from the most significant bits of the SPFP values.  */
+/* Creates a 16-bit mask from the most significant bits of the sixteen 8-bit
+   values.  The 16-bit result is placed in bits[48:63], bits [0:47] and
+   bits [64:127] are  set to zero.  */


The description of the emulated function is

"Create mask from the most significant bit of each 8-bit element in a, and store the 
result in dst." [*]

Therefore I suggest you change the comment to

"Return a mask created from the most significant bit of each 8-bit element in 
A."

OK to commit as obvious with these changes.

Thanks,
Bill

[*] 
https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_mm_movemask_ep&expand=3864


  extern __inline int __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
  _mm_movemask_epi8 (__m128i __A)
  {


Re: [PATCH v5 GCC] libffi/test: Fix compilation for build sysroot

2020-04-29 Thread Jeff Law via Gcc-patches
On Sat, 2020-04-25 at 21:33 +0100, Maciej W. Rozycki wrote:
> On Wed, 22 Apr 2020, Jeff Law wrote:
> 
> > >   libffi/
> > >   * Makefile.am (DISTCLEANFILES): New variable.
> > >   * configure.ac: Produce `local.exp'.
> > >   * Makefile.in: Regenerate.
> > >   * configure: Regenerate.
> > >   * testsuite/Makefile.am (EXTRA_DEJAGNU_SITE_CONFIG): New 
> > >   variable.
> > >   * testsuite/Makefile.in: Regenerate.
> > OK
> > jeff
> 
>  Committed now then, according to schedule, thanks for your review.  We 
> may revisit the backport of improvements I previously proposed when/if the 
> upstream libffi maintainer decides to accept them.
ACK.

FWIW, I'm expecting Florian to be digging into some libffi stuff relatively soon
and once those issues are ironed out, there'll be a push for a libffi upstream
release -- which seems like a good place to re-sync GCC to the upstream libffi
master.

jeff



Re: [PATCH] contrib/vimrc: Reduce textwidth for commit messages

2020-04-29 Thread Jakub Jelinek via Gcc-patches
On Wed, Apr 29, 2020 at 11:06:27AM -0600, Jeff Law via Gcc-patches wrote:
> On Tue, 2020-04-28 at 00:31 -0400, Patrick Palka via Gcc-patches wrote:
> > The bundled vimrc script, when enabled, currently sets the text width to
> > 80 characters for all files within the source directory.  Unfortunately
> > this means the setting also applies when editing .git/COMMIT_EDITMSG,
> > overriding the default and standard text width of 72 for Git commit
> > messages.  (A text width of 80 is too much for commit messages because
> > Git indents commit messages by four spaces when displaying them via e.g.
> > git log, leading to a total displayed width of 84 characters.)
> > 
> > This patch explicitly sets the text width of Git commit messages to 72
> > characters in accordance with standard practice.  (Alternatively we
> > could avoid setting textwidth at all in this case and let the defaults
> > kick in, but maybe it's better to be explicit?)
> > 
> > Tested by writing up this commit message :)  Is this OK to commit?
> > 
> > contrib/ChangeLog:
> > 
> > * vimrc: Reduce textwidth to 72 for Git commit messages.
> OK

Though, if we go for the ChangeLog entries in commit messages as the source
for auto-appended ChangeLog files during DATESTAMP update, as Martin Liska
is working on - see https://github.com/marxin/gcc-changelog/ -
won't this cause people to wrap ChangeLog entries too early in there?
Because for the script what git log --format=%B is what really matters.

Jakub



Re: introduce target tmpnam and require it in tests relying on it

2020-04-29 Thread Jeff Law via Gcc-patches
On Tue, 2020-04-28 at 20:43 -0300, Alexandre Oliva wrote:
> On Apr 28, 2020, Bernhard Reutner-Fischer  wrote:
> 
> > ISTM the corresponding documentation hunk for sourcebuild.texi  is missing.
> 
> Thanks, I wasn't aware that testsuite effective targets were documented
> there.
> 
> Here's the missing documentation.  Tested with 'make info' on
> x86_64-linux-gnu.  Ok to install?
> 
> 
> document effective target fileio
> 
> From: Alexandre Oliva 
> 
> check_effective_target_fileio was added to
> gcc/testsuite/lib/target-supports.exp the other day, without
> documentation.
> 
> This patch adds the corresponding documentation.
> 
> 
> for  gcc/ChangeLog
> 
>   * doc/sourcebuild.texi (Effective-Target Keywords): Document
>   the newly-introduced fileio effective target.
OK
jeff
> 



[committed] libstdc++: Fix outdated comment about std::string instantiations (PR 94854)

2020-04-29 Thread Jonathan Wakely via Gcc-patches
PR libstdc++/94854
* include/bits/basic_string.tcc: Update comment about explicit
instantiations.

Committed to master as obvious. Backports to follow.

commit 8f1591763fd50b143af0dc1770741f326a97583a
Author: Jonathan Wakely 
Date:   Wed Apr 29 18:57:34 2020 +0100

libstdc++: Fix outdated comment about std::string instantiations (PR 94854)

PR libstdc++/94854
* include/bits/basic_string.tcc: Update comment about explicit
instantiations.

diff --git a/libstdc++-v3/include/bits/basic_string.tcc 
b/libstdc++-v3/include/bits/basic_string.tcc
index e78bb2c14b3..e370f439390 100644
--- a/libstdc++-v3/include/bits/basic_string.tcc
+++ b/libstdc++-v3/include/bits/basic_string.tcc
@@ -1609,11 +1609,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   // Inhibit implicit instantiations for required instantiations,
   // which are defined via explicit instantiations elsewhere.
 #if _GLIBCXX_EXTERN_TEMPLATE
-  // The explicit instantiations definitions in src/c++11/string-inst.cc
-  // are compiled as C++14, so the new C++17 members aren't instantiated.
-  // Until those definitions are compiled as C++17 suppress the declaration,
-  // so C++17 code will implicitly instantiate std::string and std::wstring
-  // as needed.
+  // The explicit instantiation definitions in src/c++11/string-inst.cc and
+  // src/c++17/string-inst.cc only instantiate the members required for C++17
+  // and earlier standards (so not C++20's starts_with and ends_with).
+  // Suppress the explicit instantiation declarations for C++20, so C++20
+  // code will implicitly instantiate std::string and std::wstring as needed.
 # if __cplusplus <= 201703L && _GLIBCXX_EXTERN_TEMPLATE > 0
   extern template class basic_string;
 # elif ! _GLIBCXX_USE_CXX11_ABI


Re: [PATCH] rtl cse: Fix PR94740, ICE on testsuite/gcc.dg/sso/t5.c with -mcpu=future -mpcrel -O1

2020-04-29 Thread Peter Bergner via Gcc-patches
On 4/29/20 10:13 AM, Richard Sandiford wrote:
> If this is a bug we need to fix for GCC 10 then how about this:

I do think this is important to fix.  Mike Meissner just switched our
-mcpu=future code to enable -mpcrel by default in GCC10, so we're going
to see a lot more of these types of addresses than before.



> (1) change the MEM case to something like:
> 
>   {
> bool addr_changed = false;
> rtx addr = cse_process_notes (XEXP (x, 0), x, &addr_changed);
> if (addr_changed && validate_change (x, &XEXP (x, 0), addr, 0))
> *changed = true;
> return x;
>   }
> 
> which makes it safe to do the validation only at the top level
> of the MEM.
> 
> (2) make the:
> 
>for (i = 0; i < GET_RTX_LENGTH (code); i++)
>  if (fmt[i] == 'e')
> 
> loop assign directly to XEXP (x, i) if the new rtx is different,
> instead of using validate_change.  This ensures that temporarily
> invalid rtl doesn't get verified.
> 
> (passing a null object to validate_change would have the same effect,
> but that feels a bit hacky)
> 
> (3) use the simplify_binary_operator thing above, since that at least
> should be safe, even if it potentially leaves similar bugs unfixed
> for other operators.
> 
> Sorry for the runaround :-(

No problem.  I appreciate the feedback and I agree we want to get this right.
I'll implement the above and report back.  Thanks!

Peter



[PR c+ 94827]: template parm with default requires

2020-04-29 Thread Nathan Sidwell

Jason,
this is the patch you suggested, as I understood it.  I kept 
finish_nested_require's saving of the (converted) 
current_template_parms, becase of the comment about use in diagnostics.


Is this what you meant?

boostrapped on x86_64-linux.

nathan
--
Nathan Sidwell
2020-04-27  Jason Merrill  
	Nathan Sidwell  

	PR c++/94827
	* constraint.cc (map_arguments): If ARGS is null, it's a
	self-mapping of parms.
	(finish_nested_requirement): Do not pass argified
	current_template_parms to normalization.

diff --git c/gcc/cp/constraint.cc w/gcc/cp/constraint.cc
index 06161b8c8c4..a6009dc0c47 100644
--- c/gcc/cp/constraint.cc
+++ w/gcc/cp/constraint.cc
@@ -546,12 +546,16 @@ static tree
 map_arguments (tree parms, tree args)
 {
   for (tree p = parms; p; p = TREE_CHAIN (p))
-{
-  int level;
-  int index;
-  template_parm_level_and_index (TREE_VALUE (p), &level, &index);
-  TREE_PURPOSE (p) = TMPL_ARG (args, level, index);
-}
+if (args)
+  {
+	int level;
+	int index;
+	template_parm_level_and_index (TREE_VALUE (p), &level, &index);
+	TREE_PURPOSE (p) = TMPL_ARG (args, level, index);
+  }
+else
+  TREE_PURPOSE (p) = TREE_VALUE (p);
+
   return parms;
 }
 
@@ -2953,12 +2957,15 @@ finish_compound_requirement (location_t loc, tree expr, tree type, bool noexcept
 tree
 finish_nested_requirement (location_t loc, tree expr)
 {
+  /* Currently open template headers have dummy arg vectors, so don't
+ pass into normalization.  */
+  tree norm = normalize_constraint_expression (expr, NULL_TREE, false);
+  tree args = current_template_parms
+? template_parms_to_args (current_template_parms) : NULL_TREE;
+
   /* Save the normalized constraint and complete set of normalization
  arguments with the requirement.  We keep the complete set of arguments
  around for re-normalization during diagnostics.  */
-  tree args = current_template_parms
-? template_parms_to_args (current_template_parms) : NULL_TREE;
-  tree norm = normalize_constraint_expression (expr, args, false);
   tree info = build_tree_list (args, norm);
 
   /* Build the constraint, saving its normalization as its type.  */
diff --git c/gcc/testsuite/g++.dg/concepts/pr94827.C w/gcc/testsuite/g++.dg/concepts/pr94827.C
new file mode 100644
index 000..a6c147bf303
--- /dev/null
+++ w/gcc/testsuite/g++.dg/concepts/pr94827.C
@@ -0,0 +1,7 @@
+// PR 94287 ICE looking inside open template-parm level
+// { dg-do compile { target c++17 } }
+// { dg-options -fconcepts }
+template 
+void foo(T) {}
+


Re: [PATCH] rtl cse: Fix PR94740, ICE on testsuite/gcc.dg/sso/t5.c with -mcpu=future -mpcrel -O1

2020-04-29 Thread Richard Sandiford
Peter Bergner  writes:
> On 4/29/20 10:13 AM, Richard Sandiford wrote:
>> If this is a bug we need to fix for GCC 10 then how about this:
>
> I do think this is important to fix.  Mike Meissner just switched our
> -mcpu=future code to enable -mpcrel by default in GCC10, so we're going
> to see a lot more of these types of addresses than before.

OK.

>> (1) change the MEM case to something like:
>> 
>>   {
>> bool addr_changed = false;
>> rtx addr = cse_process_notes (XEXP (x, 0), x, &addr_changed);
>> if (addr_changed && validate_change (x, &XEXP (x, 0), addr, 0))
>>*changed = true;
>> return x;
>>   }
>> 
>> which makes it safe to do the validation only at the top level
>> of the MEM.
>> 
>> (2) make the:
>> 
>>for (i = 0; i < GET_RTX_LENGTH (code); i++)
>>  if (fmt[i] == 'e')
>> 
>> loop assign directly to XEXP (x, i) if the new rtx is different,
>> instead of using validate_change.  This ensures that temporarily
>> invalid rtl doesn't get verified.
>> 
>> (passing a null object to validate_change would have the same effect,
>> but that feels a bit hacky)
>> 
>> (3) use the simplify_binary_operator thing above, since that at least
>> should be safe, even if it potentially leaves similar bugs unfixed
>> for other operators.
>> 
>> Sorry for the runaround :-(
>
> No problem.  I appreciate the feedback and I agree we want to get this right.
> I'll implement the above and report back.  Thanks!

Sigh.  And now of course I realise that that too is flawed.  If we make
a replacement on a subexpression and the containing expression isn't
simplified, the validate_change for the MEM will end up being a no-op,
even if it shouldn't be.  So in one case we really need the recursive
validate_changes, and in one case we really want to avoid them.

I think at this point it would be better to go for the
simplify_replace_fn_rtx thing I mentioned in the original review,
rather than try to hack at the current code even more.  What do you
think about the attached?  Testing in progress.

(Sorry for going ahead and writing an alternative patch, since if we do
go for this, I guess the earlier misdirections will have wasted two days
of your time.  But it seemed like I was just never going to think about
this PR properly unless I actually tried to write something. :-()

Richard

>From 39e20b51af8f977a1786ef5c15af80e47415d352 Mon Sep 17 00:00:00 2001
From: Richard Sandiford 
Date: Wed, 29 Apr 2020 20:38:13 +0100
Subject: [PATCH] cse: Use simplify_replace_fn_rtx to process notes [PR94740]

cse_process_notes did a very simple substitution, which in the wrong
circumstances could create non-canonical RTL and invalid MEMs.
Various sticking plasters have been applied to cse_process_notes_1
to handle cases like ZERO_EXTEND, SIGN_EXTEND and UNSIGNED_FLOAT,
but I think this PR is a plaster too far.

The code is trying hard to avoid creating unnecessary rtl, which of
course is a good thing.  If we continue to do that, then we can end
up changing subexpressions while keeping the containing rtx.
This in turn means that validate_change will be a no-op on the
containing rtx, even if its contents have changed.  So in these
cases we have to apply validate_change to the individual subexpressions.

On the other hand, if we always apply validate_change to the
individual subexpressions, we'll end up calling validate_change
on something before it has been simplified and canonicalised.
And that's one of the situations we're trying to avoid.

There might be a middle ground in which we queue the validate_changes
as part of a group, and so can cancel the pending validate_changes
for subexpressions if there's a change in the outer expression.
But that seems even more ad-hoc than the current code.
It would also be quite an invasive change.

I think the best thing is just to hook into the existing
simplify_replace_fn_rtx function, keeping the REG and MEM handling
from cse_process_notes_1 essentially unchanged.  It can generate
more redundant rtl when a simplification takes place, but it has
the advantage of being relative well-used code (both directly
and via simplify_replace_rtx).

2020-04-29  Richard Sandiford  

gcc/
	PR rtl-optimization/94740
	* cse.c (cse_process_notes_1): Replace with...
	(cse_process_note_1): ...this new function, acting as a
	simplify_replace_fn_rtx callback to process_note.  Handle only
	REGs and MEMs directly.  Validate the MEM if cse_process_note
	changes its address.
	(cse_process_notes): Replace with...
	(cse_process_note): ...this new function.
	(cse_extended_basic_block): Update accordingly, iterating over
	the register notes and passing individual notes to cse_process_note.
---
 gcc/cse.c | 118 --
 1 file changed, 34 insertions(+), 84 deletions(-)

diff --git a/gcc/cse.c b/gcc/cse.c
index 5aaba8d80e0..36bcfc354d8 100644
--- a/gcc/cse.c
+++ b/gcc/cse.c
@@ -585,7 +585,6 @@ static

Re: [PATCH v5 GCC] libffi/test: Fix compilation for build sysroot

2020-04-29 Thread Florian Weimer
* Jeff Law via Libffi-discuss:

> FWIW, I'm expecting Florian to be digging into some libffi stuff
> relatively soon and once those issues are ironed out, there'll be a
> push for a libffi upstream release -- which seems like a good place
> to re-sync GCC to the upstream libffi master.

I'm not going to work on the master branch for now, so this should not
impact the release.  The goal is specifically to support more hardware
for legacy binaries with the old libffi.so.6 soname.


[PATCH] tree: Don't reuse types if TYPE_USER_ALIGN differ [PR94775]

2020-04-29 Thread Marek Polacek via Gcc-patches
Here we trip on the TYPE_USER_ALIGN (t) assert in strip_typedefs: it
gets "const d[0]" with TYPE_USER_ALIGN=0 but the result built by
build_cplus_array_type is "const char[0]" with TYPE_USER_ALIGN=1.

When we strip_typedefs the element of the array "const d", we see it's
a typedef_variant_p, so we look at its DECL_ORIGINAL_TYPE, which is
char, but we need to add the const qualifier, so we call
cp_build_qualified_type -> build_qualified_type
where get_qualified_type checks to see if we already have such a type
by walking the variants list, which in this case is:

  char -> c -> const char -> const char -> d -> const d

Because check_base_type only checks TYPE_ALIGN and not TYPE_USER_ALIGN,
we choose the first const char, which has TYPE_USER_ALIGN set.  If the
element type of an array has TYPE_USER_ALIGN, the array type gets it too.

So we can make check_base_type stricter.  I was afraid that it might make
us reuse types less often, but measuring showed that we build the same
amount of types with and without the patch, while bootstrapping.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/9/8?

PR c++/94775
* tree.c (check_base_type): Return true only if TYPE_USER_ALIGN match.
(check_aligned_type): Check if TYPE_USER_ALIGN match.

* g++.dg/warn/Warray-bounds-10.C: New test.
---
 gcc/testsuite/g++.dg/warn/Warray-bounds-10.C | 40 
 gcc/tree.c   |  4 +-
 2 files changed, 43 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/warn/Warray-bounds-10.C

diff --git a/gcc/testsuite/g++.dg/warn/Warray-bounds-10.C 
b/gcc/testsuite/g++.dg/warn/Warray-bounds-10.C
new file mode 100644
index 000..0a18f637e0e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Warray-bounds-10.C
@@ -0,0 +1,40 @@
+// PR c++/94775
+// { dg-do compile { target c++14 } }
+// { dg-options "-O2 -Warray-bounds" }
+
+template  using a = int;
+template  using b = int;
+typedef char d;
+template  using e = int;
+template  struct h { using i = b>, e>; };
+template  using j = typename h::i;
+long ab, k, aj;
+const d l[]{};
+class m {
+public:
+  m(int);
+};
+class n {
+  void ad() const;
+  template  void o(long) const {
+using c __attribute__((aligned(1))) = const ae;
+  }
+  long p;
+  template 
+  auto s(unsigned long, unsigned long, unsigned long, unsigned long) const;
+  template  auto q(unsigned long, unsigned long) const;
+};
+template 
+auto n::s(unsigned long, unsigned long, unsigned long, unsigned long t) const {
+  o(p);
+  return t;
+}
+template  auto n::q(unsigned long p1, unsigned long p2) const {
+  using r = j<4, false>;
+  using ai = j<4, g>;
+  return s(ab, k, p1, p2);
+}
+void n::ad() const {
+  long f(l[aj]); // { dg-warning "outside array bounds" }
+  m(q(8, f));
+}
diff --git a/gcc/tree.c b/gcc/tree.c
index e451401822c..341766c51e5 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -6493,7 +6493,8 @@ check_base_type (const_tree cand, const_tree base)
TYPE_ATTRIBUTES (base)))
 return false;
   /* Check alignment.  */
-  if (TYPE_ALIGN (cand) == TYPE_ALIGN (base))
+  if (TYPE_ALIGN (cand) == TYPE_ALIGN (base)
+  && TYPE_USER_ALIGN (cand) == TYPE_USER_ALIGN (base))
 return true;
   /* Atomic types increase minimal alignment.  We must to do so as well
  or we get duplicated canonical types. See PR88686.  */
@@ -6528,6 +6529,7 @@ check_aligned_type (const_tree cand, const_tree base, 
unsigned int align)
  && TYPE_CONTEXT (cand) == TYPE_CONTEXT (base)
  /* Check alignment.  */
  && TYPE_ALIGN (cand) == align
+ && TYPE_USER_ALIGN (cand) == TYPE_USER_ALIGN (base)
  && attribute_list_equal (TYPE_ATTRIBUTES (cand),
   TYPE_ATTRIBUTES (base))
  && check_lang_type (cand, base));

base-commit: 8f1591763fd50b143af0dc1770741f326a97583a
-- 
Marek Polacek • Red Hat, Inc. • 300 A St, Boston, MA



Re: [PATCH] c++: Member template function lookup failure [PR94799]

2020-04-29 Thread Jason Merrill via Gcc-patches

On 4/28/20 11:55 PM, Marek Polacek wrote:

Whew, this took a while.  We fail to parse "p->template A::a()"
(where p is of type A *) because since r249752 we treat the RHS of the ->
as dependent and avoid a lookup in the enclosing context: since that rev
cp_parser_template_name checks parser->context->object_type too, which
here is unknown_type_node, signalling a type-dependent object:

  7756   if (dependent_p)
  7757 /* Tell cp_parser_lookup_name that there was an object, even though 
it's
  7758type-dependent.  */
  7759 parser->context->object_type = unknown_type_node;

with which cp_parser_template_name returns identifier 'A', cp_parser_class_name
then creates a TEMPLATE_ID_EXPR A, but then

23735   decl = make_typename_type (scope, decl, tag_type, tf_error);

in cp_parser_class_name fails because scope is NULL.  Then we return
error_mark_node and parse errors ensue.

I've tried various approaches, e.g. keeping TEMPLATE_ID_EXPR around
instead of calling make_typename_type, which didn't work, whereupon I
realized that since we don't want to perform name lookup if we've seen
the template keyword and the scope is dependent, we can adjust
parser->context->object_type and use the type of the object expression
as the scope, even if it's type-dependent.  This should be in line with
[basic.lookup.classref]p4.



The "&& scope != unknown_type_node" line in cp_parser_class_name is there
for diagnostic purposes only (to avoid issuing a confusing error).

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
(Happy to defer to GCC 11 if this doesn't seem very safe.)

PR c++/94799
* parser.c (cp_parser_postfix_dot_deref_expression): If we have
a type-dependent object of class type, stash it to
parser->context->object_type.
(cp_parser_class_name): Consider object scope too.  Don't call
make_typename_type when the scope is unknown_type_node.

* g++.dg/lookup/this1.C: Adjust dg-error.
* g++.dg/template/lookup12.C: New test.
* g++.dg/template/lookup13.C: New test.
* g++.dg/template/lookup14.C: New test.
---
  gcc/cp/parser.c  | 28 ++--
  gcc/testsuite/g++.dg/lookup/this1.C  |  2 +-
  gcc/testsuite/g++.dg/template/lookup12.C | 26 ++
  gcc/testsuite/g++.dg/template/lookup13.C | 28 
  gcc/testsuite/g++.dg/template/lookup14.C | 11 ++
  5 files changed, 87 insertions(+), 8 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/template/lookup12.C
  create mode 100644 gcc/testsuite/g++.dg/template/lookup13.C
  create mode 100644 gcc/testsuite/g++.dg/template/lookup14.C

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index e1f9786893a..b344721fb60 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -7694,11 +7694,16 @@ cp_parser_postfix_dot_deref_expression (cp_parser 
*parser,
bool pseudo_destructor_p;
tree scope = NULL_TREE;
location_t start_loc = postfix_expression.get_start ();
+  tree type = TREE_TYPE (postfix_expression);
  
/* If this is a `->' operator, dereference the pointer.  */

if (token_type == CPP_DEREF)
-postfix_expression = build_x_arrow (location, postfix_expression,
-   tf_warning_or_error);
+{
+  if (type && POINTER_TYPE_P (type))
+   type = TREE_TYPE (type);
+  postfix_expression = build_x_arrow (location, postfix_expression,
+ tf_warning_or_error);
+}
/* Check to see whether or not the expression is type-dependent and
   not the current instantiation.  */
dependent_p = type_dependent_object_expression_p (postfix_expression);
@@ -7754,9 +7759,10 @@ cp_parser_postfix_dot_deref_expression (cp_parser 
*parser,
  }
  
if (dependent_p)

-/* Tell cp_parser_lookup_name that there was an object, even though it's
-   type-dependent.  */
-parser->context->object_type = unknown_type_node;
+/* If we don't have a (type-dependent) object of class type, use
+   unknown_type_node to signal that there was an object.  */
+parser->context->object_type = (type && CLASS_TYPE_P (type)
+   ? type : unknown_type_node);


Anything that depends on CLASS_TYPE_P won't work if 'p' isn't clearly a 
class, i.e. if it has type T*, T, or NULL_TREE.  Why not use any 
non-null type here?


For null type, I wonder if using decltype would make sense.


/* Assume this expression is not a pseudo-destructor access.  */
pseudo_destructor_p = false;
@@ -23625,8 +23631,15 @@ cp_parser_class_name (cp_parser *parser,
  }
  
/* PARSER->SCOPE can be cleared when parsing the template-arguments

- to a template-id, so we save it here.  */
-  scope = parser->scope;
+ to a template-id, so we save it here.  Consider object scope too,
+ so that make_typename_type below can use it (cp_parser_template_name
+ considers object scop

Re: [PATCH] rtl cse: Fix PR94740, ICE on testsuite/gcc.dg/sso/t5.c with -mcpu=future -mpcrel -O1

2020-04-29 Thread Peter Bergner via Gcc-patches
On 4/29/20 3:28 PM, Richard Sandiford wrote:
> (Sorry for going ahead and writing an alternative patch, since if we do
> go for this, I guess the earlier misdirections will have wasted two days
> of your time.  But it seemed like I was just never going to think about
> this PR properly unless I actually tried to write something. :-()

No worries from me!  I'm just glad to see this fixed before the release.
I'll kill off a bootstrap and regtest on powerpc64le-linux too, in addition
to your tests (arm & x86_64?).  Thanks for your help with this!

Peter





Re: [PATCH] rtl cse: Fix PR94740, ICE on testsuite/gcc.dg/sso/t5.c with -mcpu=future -mpcrel -O1

2020-04-29 Thread Peter Bergner via Gcc-patches
On 4/29/20 3:28 PM, Richard Sandiford wrote:
> I think at this point it would be better to go for the
> simplify_replace_fn_rtx thing I mentioned in the original review,
> rather than try to hack at the current code even more.  What do you
> think about the attached?  Testing in progress.

My testing is still running, but I like the patch a lot!
It's nice when a patch not only fixes a bug, but makes the code
*MUCH* simpler too.  Well done!

Peter



Re: [PATCH v2] diagnostics: Add %{...%} pretty-format support for URLs and use it in -Wpsabi diagnostcs

2020-04-29 Thread Joseph Myers
This is missing documentation for the new configure option in 
install.texi.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH v2] diagnostics: Add %{...%} pretty-format support for URLs and use it in -Wpsabi diagnostcs

2020-04-29 Thread Jakub Jelinek via Gcc-patches
On Wed, Apr 29, 2020 at 09:42:59PM +, Joseph Myers wrote:
> This is missing documentation for the new configure option in 
> install.texi.

I was considering it, but then didn't find any docs for the
--with-documentation-root= option either, so punted on that.

I'll try to write something for both.

Jakub



Re: [PATCH] rtl cse: Fix PR94740, ICE on testsuite/gcc.dg/sso/t5.c with -mcpu=future -mpcrel -O1

2020-04-29 Thread Segher Boessenkool
On Wed, Apr 29, 2020 at 08:56:11AM -0600, Jeff Law wrote:
> On Tue, 2020-04-28 at 20:03 -0500, Segher Boessenkool wrote:
> > What are the actual rules?  Where is this documented?
> > 
> > Of course it is highly desirable to have CONST around such constant
> > addresses, but when is it *required*?  And what *is* a constant address
> > (in this context)?
> I don't know if it's documented, but it's been expected practice for decades.
> 
> It includes symbolic constants, and the results of simple arithmetic on 
> symbolic
> constants.  It (to the best of my knowledge) does not include logical 
> operations
> on symbolic constants.

So should it be done for anything that is "i" as constraint?  Or only
for constant addresses?  In what sense constant, anyway?

None of this ever was clear to me...  I always did the "if it doesn't
ICE anymore, we're good" routine, but *everyone* seems to do that, and
that isn't so optimal ;-)


Segher


Re: [PATCH] rtl cse: Fix PR94740, ICE on testsuite/gcc.dg/sso/t5.c with -mcpu=future -mpcrel -O1

2020-04-29 Thread Peter Bergner via Gcc-patches
On 4/29/20 4:15 PM, Peter Bergner wrote:
> On 4/29/20 3:28 PM, Richard Sandiford wrote:
>> (Sorry for going ahead and writing an alternative patch, since if we do
>> go for this, I guess the earlier misdirections will have wasted two days
>> of your time.  But it seemed like I was just never going to think about
>> this PR properly unless I actually tried to write something. :-()
> 
> No worries from me!  I'm just glad to see this fixed before the release.
> I'll kill off a bootstrap and regtest on powerpc64le-linux too, in addition
> to your tests (arm & x86_64?).  Thanks for your help with this!

My bootstrap and regtesting of your patch on powerpc64le-linux was clean.
Ship it! :-)

Peter


Re: [PATCH] rtl cse: Fix PR94740, ICE on testsuite/gcc.dg/sso/t5.c with -mcpu=future -mpcrel -O1

2020-04-29 Thread Jeff Law via Gcc-patches
On Wed, 2020-04-29 at 17:54 -0500, Segher Boessenkool wrote:
> On Wed, Apr 29, 2020 at 08:56:11AM -0600, Jeff Law wrote:
> > On Tue, 2020-04-28 at 20:03 -0500, Segher Boessenkool wrote:
> > > What are the actual rules?  Where is this documented?
> > > 
> > > Of course it is highly desirable to have CONST around such constant
> > > addresses, but when is it *required*?  And what *is* a constant address
> > > (in this context)?
> > I don't know if it's documented, but it's been expected practice for 
> > decades.
> > 
> > It includes symbolic constants, and the results of simple arithmetic on
> > symbolic
> > constants.  It (to the best of my knowledge) does not include logical
> > operations
> > on symbolic constants.
> 
> So should it be done for anything that is "i" as constraint?  Or only
> for constant addresses?  In what sense constant, anyway?
Only symbolic constants.  ie, things involving SYMBOL_REF or LABEL_REF which are
constant when the program is run, but the exact value isn't known at compile
time.  Simple CONST_INT, CONST_DOUBLE and the like do not need the CONST
wrapping.
Jeff



Re: [Patch, Fortran] OpenMP/OpenACC – fix more issues with OPTIONAL

2020-04-29 Thread Richard Biener via Gcc-patches
On Wed, Apr 29, 2020 at 5:05 PM Tobias Burnus  wrote:
>
> Hi Thomas,
>
> was a bit on the backburner but I now digged again.
> See: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94848
>
> The problem is a generated static array variable in the
> device function:
>static integer(kind=4) A.12[3] = {1, 2, 3};
> used as
>_26 = A.12[S.13_67];
>
> With -ftree-pre, the expressions using _26 now directly use
> the value (1, 2 and 3). It seems as if the variable A.12 is
> eliminated before writing to LTO but the usage (A.12[S.…])
> is not – at least it still appears in the device dumps.
>
> Maybe it is also related to something else, but crucial is
> the -ftree-pre on the host side; the optimization level on
> the device lto1 side is irrelevant.

IIRC there was a bugreport similar to this where offload
variables were computed "early" instead of at the point
of streaming.

It's obvious that the offload "part" use is not reflected in
the non-offloat "part" IL (like via a function call parameter
or so), so I assume this is a local static in a function
we simply compile twice (but did not actually clone),
once for each offloat target and once for the host.  So it's
likely the above issue.

Richard.

> Cheers,
>
> Tobias
>
> On 4/29/20 12:00 PM, Thomas Schwinge wrote:
>
> > Hi Tobias!
> >
> > Do you happen to have any update regarding this one:
> >
> > On 2020-01-08T09:55:06+0100, Tobias Burnus  wrote:
> >> On 1/8/20 9:33 AM, Thomas Schwinge wrote:
> >>> With 'dg-do run' added, on [...] x86_64-pc-linux-gnu with offloading I'm 
> >>> seeing:
> > | PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O0  (test for 
> > excess errors)
> > | PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O0  execution 
> > test
> > | PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O1  (test for 
> > excess errors)
> > | PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O1  execution 
> > test
> > | FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O2  (test for 
> > excess errors)
> > | UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -O2  
> > compilation failed to produce executable
> > | FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 
> > -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer 
> > -finline-functions  (test for excess errors)
> > | UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 
> > -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer 
> > -finline-functions  compilation failed to produce executable
> > | FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -g  (test 
> > for excess errors)
> > | UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -g  
> > compilation failed to produce executable
> > | FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -Os  (test for 
> > excess errors)
> > | UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -Os  
> > compilation failed to produce executable
> >
> >>> ... due to:
> >>>   /tmp/cciVc43I.o:(.gnu.offload_vars+0x10): undefined reference to 
> >>> `A.12.4064'
> >>> which may be something like PR90779, PR85063, PR84592, PR90779,
> >>> PR80411, PR71536 -- or something else.
> >> Hmm. It is surely among the listed items, if all fails in the last item.
> >> Note that PR85063 is fixed and PR84592 a duplicate of PR90779 (which is
> >> listed twice). To through in another number it could also be a variant
> >> of PR 92029 to though in yet another number …
> >
> > Grüße
> >   Thomas
> > -
> > Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / 
> > Germany
> > Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, 
> > Alexander Walter
> -
> Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
> Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, 
> Alexander Walter


Re: [PATCH] var-tracking.c: Fix possible use of uninitialized variable pre

2020-04-29 Thread Richard Biener via Gcc-patches
On Wed, Apr 29, 2020 at 5:56 PM Jeff Law  wrote:
>
> On Tue, 2020-04-28 at 11:44 +0200, Richard Biener via Gcc-patches wrote:
> >
> > Btw, does s390 have different inlining parameters somehow?
> I think so.  We saw a ton of backend warnings that were specific to the s390
> builds in Fedora that appeared to be triggered by different inlining 
> decisions.
>
> It happened enough that one of the ideas we're going to explore is seeing if 
> we
> can tune the x86_64 port to use the same heuristics.  That way we can try to 
> find
> more of these issues in our tester without having to dramatically increase 
> s390x
> resourcing.  This hasn't started, but I expect we'll be looking at it in the 
> fall
> (it's unclear if we can just use flags or will need hackery to the x86 port, 
> but
> we're happy to share whatever we find with the wider community).

Hmm, OK.  Guess if it becomes an issue for openSUSE I'll simply revert those
s390 backend changes.

As a general advice to backend developers I _strongly_ discourage to adjust
--param defaults that affect GIMPLE.

Richard.

> jeff
>
>
>


Re: [PATCH] tree: Don't reuse types if TYPE_USER_ALIGN differ [PR94775]

2020-04-29 Thread Richard Biener via Gcc-patches
On Wed, Apr 29, 2020 at 11:43 PM Marek Polacek via Gcc-patches
 wrote:
>
> Here we trip on the TYPE_USER_ALIGN (t) assert in strip_typedefs: it
> gets "const d[0]" with TYPE_USER_ALIGN=0 but the result built by
> build_cplus_array_type is "const char[0]" with TYPE_USER_ALIGN=1.
>
> When we strip_typedefs the element of the array "const d", we see it's
> a typedef_variant_p, so we look at its DECL_ORIGINAL_TYPE, which is
> char, but we need to add the const qualifier, so we call
> cp_build_qualified_type -> build_qualified_type
> where get_qualified_type checks to see if we already have such a type
> by walking the variants list, which in this case is:
>
>   char -> c -> const char -> const char -> d -> const d
>
> Because check_base_type only checks TYPE_ALIGN and not TYPE_USER_ALIGN,
> we choose the first const char, which has TYPE_USER_ALIGN set.  If the
> element type of an array has TYPE_USER_ALIGN, the array type gets it too.
>
> So we can make check_base_type stricter.  I was afraid that it might make
> us reuse types less often, but measuring showed that we build the same
> amount of types with and without the patch, while bootstrapping.
>
> Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/9/8?

Hmm, so while I wonder whether a mismatch in TYPE_USER_ALIGN would
affect program semantics we should make sure to handle this flag consistently.

Thus the patch is OK for trunk and branches after a while.

Note I question that we build new types during diagnostic printing.

Thanks,
Richard.

> PR c++/94775
> * tree.c (check_base_type): Return true only if TYPE_USER_ALIGN match.
> (check_aligned_type): Check if TYPE_USER_ALIGN match.
>
> * g++.dg/warn/Warray-bounds-10.C: New test.
> ---
>  gcc/testsuite/g++.dg/warn/Warray-bounds-10.C | 40 
>  gcc/tree.c   |  4 +-
>  2 files changed, 43 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/g++.dg/warn/Warray-bounds-10.C
>
> diff --git a/gcc/testsuite/g++.dg/warn/Warray-bounds-10.C 
> b/gcc/testsuite/g++.dg/warn/Warray-bounds-10.C
> new file mode 100644
> index 000..0a18f637e0e
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/warn/Warray-bounds-10.C
> @@ -0,0 +1,40 @@
> +// PR c++/94775
> +// { dg-do compile { target c++14 } }
> +// { dg-options "-O2 -Warray-bounds" }
> +
> +template  using a = int;
> +template  using b = int;
> +typedef char d;
> +template  using e = int;
> +template  struct h { using i = b>, e>; };
> +template  using j = typename h::i;
> +long ab, k, aj;
> +const d l[]{};
> +class m {
> +public:
> +  m(int);
> +};
> +class n {
> +  void ad() const;
> +  template  void o(long) const {
> +using c __attribute__((aligned(1))) = const ae;
> +  }
> +  long p;
> +  template 
> +  auto s(unsigned long, unsigned long, unsigned long, unsigned long) const;
> +  template  auto q(unsigned long, unsigned long) const;
> +};
> +template 
> +auto n::s(unsigned long, unsigned long, unsigned long, unsigned long t) 
> const {
> +  o(p);
> +  return t;
> +}
> +template  auto n::q(unsigned long p1, unsigned long p2) const {
> +  using r = j<4, false>;
> +  using ai = j<4, g>;
> +  return s(ab, k, p1, p2);
> +}
> +void n::ad() const {
> +  long f(l[aj]); // { dg-warning "outside array bounds" }
> +  m(q(8, f));
> +}
> diff --git a/gcc/tree.c b/gcc/tree.c
> index e451401822c..341766c51e5 100644
> --- a/gcc/tree.c
> +++ b/gcc/tree.c
> @@ -6493,7 +6493,8 @@ check_base_type (const_tree cand, const_tree base)
> TYPE_ATTRIBUTES (base)))
>  return false;
>/* Check alignment.  */
> -  if (TYPE_ALIGN (cand) == TYPE_ALIGN (base))
> +  if (TYPE_ALIGN (cand) == TYPE_ALIGN (base)
> +  && TYPE_USER_ALIGN (cand) == TYPE_USER_ALIGN (base))
>  return true;
>/* Atomic types increase minimal alignment.  We must to do so as well
>   or we get duplicated canonical types. See PR88686.  */
> @@ -6528,6 +6529,7 @@ check_aligned_type (const_tree cand, const_tree base, 
> unsigned int align)
>   && TYPE_CONTEXT (cand) == TYPE_CONTEXT (base)
>   /* Check alignment.  */
>   && TYPE_ALIGN (cand) == align
> + && TYPE_USER_ALIGN (cand) == TYPE_USER_ALIGN (base)
>   && attribute_list_equal (TYPE_ATTRIBUTES (cand),
>TYPE_ATTRIBUTES (base))
>   && check_lang_type (cand, base));
>
> base-commit: 8f1591763fd50b143af0dc1770741f326a97583a
> --
> Marek Polacek • Red Hat, Inc. • 300 A St, Boston, MA
>


Re: [PATCH] diagnostics: Add %{...%} pretty-format support for URLs and use it in -Wpsabi diagnostcs

2020-04-29 Thread Richard Biener via Gcc-patches
On Wed, Apr 29, 2020 at 3:44 PM Jakub Jelinek via Gcc-patches
 wrote:
>
> Hi!
>
> The following patch attempts to use the diagnostics URL support if available
> to provide more information about the C++17 empty base and C++20
> [[no_unique_address]] empty class ABI changes in -Wpsabi diagnostics.
>
> GCC 10.1 at the end of the diagnostics is then in some terminals
> underlined with a dotted line and points to a (to be written) anchor in
> gcc-10/changes.html which we need to write anyway.
>
> Ok for trunk if this passes bootstrap/regtest?
>
> 2020-04-29  Jakub Jelinek  
>
> * configure.ac (-with-changes-root-url): New configure option,
> defaulting to https://gcc.gnu.org/.
> * Makefile.in (CFLAGS-opts.o): Define CHANGES_ROOT_URL for
> opts.c.
> * pretty-print.c (end_url_string): New function.
> (pp_format): Handle %{ and %} for URLs.
> (pp_begin_url): Use pp_string instead of pp_printf.
> (pp_end_url): Use end_url_string.
> * opts.h (get_changes_url): Declare.
> * opts.c (get_changes_url): New function.
> * config/rs6000/rs6000-call.c: Include opts.h.
> (rs6000_discover_homogeneous_aggregate): Use %{GCC 10.1%} instead of
> just GCC 10.1 in diagnostics and add URL.
> * config/arm/arm.c (aapcs_vfp_is_call_or_return_candidate): Likewise.
> * config/aarch64/aarch64.c (aarch64_vfp_is_call_or_return_candidate):
> Likewise.
> * configure: Regenerated.
>
> * c-format.c (PP_FORMAT_CHAR_TABLE): Add %{ and %}.
>
> --- gcc/configure.ac.jj 2020-04-29 10:21:25.061999873 +0200
> +++ gcc/configure.ac2020-04-29 13:26:51.515516959 +0200
> @@ -986,6 +986,20 @@ AC_ARG_WITH(documentation-root-url,
>  )
>  AC_SUBST(DOCUMENTATION_ROOT_URL)
>
> +# Allow overriding the default URL for GCC changes
> +AC_ARG_WITH(changes-root-url,
> +AS_HELP_STRING([--with-changes-root-url=URL],
> +   [Root for GCC changes URLs]),
> +[case "$withval" in
> +  yes) AC_MSG_ERROR([changes root URL not specified]) ;;
> +  no)  AC_MSG_ERROR([changes root URL not specified]) ;;
> +  *)   CHANGES_ROOT_URL="$withval"
> +  ;;
> + esac],
> + CHANGES_ROOT_URL="https://gcc.gnu.org/";
> +)
> +AC_SUBST(CHANGES_ROOT_URL)
> +
>  # Sanity check enable_languages in case someone does not run the toplevel
>  # configure # script.
>  AC_ARG_ENABLE(languages,
> --- gcc/Makefile.in.jj  2020-02-27 09:28:46.129960135 +0100
> +++ gcc/Makefile.in 2020-04-29 13:38:42.455008718 +0200
> @@ -2187,6 +2187,7 @@ lto-wrapper$(exeext): $(LTO_WRAPPER_OBJS
> mv -f T$@ $@
>
>  CFLAGS-opts.o += -DDOCUMENTATION_ROOT_URL=\"@DOCUMENTATION_ROOT_URL@\"
> +CFLAGS-opts.o += -DCHANGES_ROOT_URL=\"@CHANGES_ROOT_URL@\"
>
>  # Files used by all variants of C or by the stand-alone pre-processor.
>
> --- gcc/pretty-print.c.jj   2020-04-25 00:08:33.359759328 +0200
> +++ gcc/pretty-print.c  2020-04-29 14:30:46.791792554 +0200
> @@ -1020,6 +1020,8 @@ pp_indent (pretty_printer *pp)
>  pp_space (pp);
>  }
>
> +static const char *end_url_string (pretty_printer *);
> +
>  /* The following format specifiers are recognized as being client 
> independent:
> %d, %i: (signed) integer in base ten.
> %u: unsigned integer in base ten.
> @@ -1038,6 +1040,8 @@ pp_indent (pretty_printer *pp)
> %%: '%'.
> %<: opening quote.
> %>: closing quote.
> +   %{: URL start.
> +   %}: URL end.
> %': apostrophe (should only be used in untranslated messages;
> translations should use appropriate punctuation directly).
> %@: diagnostic_event_id_ptr, for which event_id->known_p () must be true.
> @@ -1051,7 +1055,7 @@ pp_indent (pretty_printer *pp)
> Arguments can be used sequentially, or through %N$ resp. *N$
> notation Nth argument after the format string.  If %N$ / *N$
> notation is used, it must be used for all arguments, except %m, %%,
> -   %<, %> and %', which may not have a number, as they do not consume
> +   %<, %>, %} and %', which may not have a number, as they do not consume
> an argument.  When %M$.*N$s is used, M must be N + 1.  (This may
> also be written %M$.*s, provided N is not otherwise used.)  The
> format string must have conversion specifiers with argument numbers
> @@ -1084,7 +1088,7 @@ pp_format (pretty_printer *pp, text_info
>/* Formatting phase 1: split up TEXT->format_spec into chunks in
>   pp_buffer (PP)->args[].  Even-numbered chunks are to be output
>   verbatim, odd-numbered chunks are format specifiers.
> - %m, %%, %<, %>, and %' are replaced with the appropriate text at
> + %m, %%, %<, %>, %} and %' are replaced with the appropriate text at
>   this point.  */
>
>memset (formatters, 0, sizeof formatters);
> @@ -1133,6 +1137,15 @@ pp_format (pretty_printer *pp, text_info
>   p++;
>   continue;
>
> +   case '}':
> + {
> +   const char *endurlstr = end_url_string (pp