[PATCH] Add bypass_p cost check in flag_sched_last_insn_heuristic

2020-11-05 Thread Jojo R
gcc/
* haifa-sched.c (rank_for_schedule): Add bypass_p
cost check in flag_sched_last_insn_heuristic.

---
 gcc/haifa-sched.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/gcc/haifa-sched.c b/gcc/haifa-sched.c
index 62d1816a55d..7d826483f55 100644
--- a/gcc/haifa-sched.c
+++ b/gcc/haifa-sched.c
@@ -2780,10 +2780,14 @@ rank_for_schedule (const void *x, const void *y)
  1) Data dependent on last schedule insn.
  2) Anti/Output dependent on last scheduled insn.
  3) Independent of last scheduled insn, or has latency of one.
+ 4) bypass of last scheduled insn, and has latency of zero.
  Choose the insn from the highest numbered class if different.  */
   dep1 = sd_find_dep_between (last, tmp, true);
 
-  if (dep1 == NULL || dep_cost (dep1) == 1)
+  if (dep1 == NULL || dep_cost (dep1) == 1
+ || (INSN_CODE (DEP_PRO (dep1)) >= 0 && bypass_p (DEP_PRO (dep1))
+ && recog_memoized (DEP_CON (dep1)) >= 0
+ && !insn_latency (DEP_PRO (dep1), DEP_CON (dep1))))
     tmp_class = 3;
   else if (/* Data dependence.  */
   DEP_TYPE (dep1) == REG_DEP_TRUE)
@@ -2793,7 +2797,10 @@ rank_for_schedule (const void *x, const void *y)
 
   dep2 = sd_find_dep_between (last, tmp2, true);
 
-  if (dep2 == NULL || dep_cost (dep2)  == 1)
+  if (dep2 == NULL || dep_cost (dep2)  == 1
+ || (INSN_CODE (DEP_PRO (dep2)) >= 0 && bypass_p (DEP_PRO (dep2))
+ && recog_memoized (DEP_CON (dep2)) >= 0
+ && !insn_latency (DEP_PRO (dep2), DEP_CON (dep2))))
     tmp2_class = 3;
   else if (/* Data dependence.  */
   DEP_TYPE (dep2) == REG_DEP_TRUE)
-- 
2.24.3 (Apple Git-128)
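
For readers unfamiliar with this heuristic, the class selection that the patch extends can be sketched in isolation.  This is only a toy model, not the scheduler's code: `struct toy_dep`, its fields and `toy_insn_class` are hypothetical stand-ins for `dep_t`, `dep_cost`, `bypass_p` and `insn_latency`, and the `recog_memoized`/`INSN_CODE` validity checks are left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the scheduler's dependence data.  All names here are
   hypothetical; the real code works on dep_t and rtx_insn objects.  */
enum toy_dep_type { TOY_DEP_TRUE, TOY_DEP_ANTI, TOY_DEP_OUTPUT };

struct toy_dep
{
  enum toy_dep_type type;    /* Kind of dependence on the last scheduled insn.  */
  int cost;                  /* Stand-in for dep_cost (dep).  */
  bool producer_has_bypass;  /* Stand-in for bypass_p (DEP_PRO (dep)).  */
  int bypass_latency;        /* Stand-in for insn_latency (producer, consumer).  */
};

/* Return the class (1, 2 or 3) of a candidate insn.  DEP is NULL when the
   candidate is independent of the last scheduled insn.  The insn from the
   highest numbered class is preferred.  */
static int
toy_insn_class (const struct toy_dep *dep)
{
  if (dep == NULL
      || dep->cost == 1
      /* The case added by the patch: a bypass with zero latency.  */
      || (dep->producer_has_bypass && dep->bypass_latency == 0))
    return 3;
  else if (dep->type == TOY_DEP_TRUE)
    return 1;   /* Data dependence.  */
  else
    return 2;   /* Anti/output dependence.  */
}
```

In this model a true dependence whose producer has a zero-latency bypass lands in class 3 instead of class 1, which is the behaviour change the patch is after.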



Re: [PATCH v3] rs6000: Use direct move for char/short vector CTOR [PR96933]

2020-11-05 Thread Kewen.Lin via Gcc-patches
Hi Segher,

Thanks for the review!

>>> Why does this test has_arch_pwr9 instead of adding -mdejagnu-cpu=power9?
>>
>> I thought using -mdejagnu-cpu=power9 would force the case run with
>> power9 cpu all the time, while using has_arch_pwr9 seems to be more
>> flexible, it can be compiled with power9 or later (like -mcpu=power10),
>> we can check whether we generate unexpected code on power10 or later.
>> Does it sound good?
> 
> It will not run at all if your compiler (or testsuite invocation) does
> not use at least power9.  Since the default for powerpc64-linux is
> power4, and that for powerpc64le-linux is power8, this will happen for
> many people (not to mention that it is extra important to test the
> default setup, of course).
> 

Good point!  has_arch_pwr9 can reduce test coverage if the default
arch is older than power9.

> It probably would be useful if there was some convenient way to say
> "use at least -mcpu=power9 for this, but some later cpu is fine too" --
> but there is no such thing yet.
> 
> Using something like that might cause more maintenance issues later, see
> "pstb" below for example, but that is not really an argument against
> fixing this.

Yeah, thanks for the good example!

>> +  if (TARGET_POWERPC64)
>> +{
>> +  op[i] = gen_reg_rtx (DImode);
>> +  emit_insn (gen_zero_extendqidi2 (op[i], tmp));
>> +}
>> +  else
>> +{
>> +  op[i] = gen_reg_rtx (SImode);
>> +  emit_insn (gen_zero_extendqisi2 (op[i], tmp));
>> +}
>> +}
> 
> TARGET_POWERPC64 should be TARGET_64BIT afaics?  (See below.)

Yes, fixed.

> 
> You can use Pmode then, too.  The zero_extend thing can be handled by
> changing
>   (define_insn "zero_extendqi2"
> to
>   (define_insn "@zero_extendqi2"
> (and no other changes needed), and then calling
>   emit_insn (gen_zero_extendqi2 (Pmode, op[i], tmp));
> (or so is the theory.  This might need some other changes, and also all
> other gen_zero_extendqi* callers need to change, so that is a separate
> patch if you want to try.  This isn't so bad right now.)

Will deal with this in a separate patch.

> 
>> +  for (i = 0; i < n_elts; i++)
>> +{
>> +  vr_qi[i] = gen_reg_rtx (V16QImode);
>> +  if (TARGET_POWERPC64)
>> +emit_insn (gen_p8_mtvsrd_v16qidi2 (vr_qi[i], op[i]));
>> +  else
>> +emit_insn (gen_p8_mtvsrwz_v16qisi2 (vr_qi[i], op[i]));
>> +}
> 
> TARGET_64BIT here as well.
> 
> TARGET_POWERPC64 means the current machine has the 64-bit insns.  It
> does not mean the code will run in 64-bit mode (e.g. -m32 -mpowerpc64 is
> just fine, and can be useful), but it also does not mean the OS (libc,
> kernel, etc.) will actually save the full 64-bit registers -- making it
> only useful on Darwin currently.
> 
> (You *can* run all of the testsuite flawlessly on Linux with those
> options, but that only works because those are small, short-running
> programs.  More "real", bigger and more complex programs fail in strange
> and exciting ways!)

Fixed as well.  Thanks for the detailed explanation!
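
The distinction above can be illustrated with a small stand-alone model.  It is only a sketch under assumptions: `struct toy_target`, `enum toy_mode` and `toy_extend_mode` are hypothetical names, and the two flags merely mimic what TARGET_POWERPC64 and TARGET_64BIT express in the rs6000 backend.

```c
#include <stdbool.h>

/* Hypothetical model of the two target properties discussed above.  */
struct toy_target
{
  bool has_64bit_insns;  /* Stand-in for TARGET_POWERPC64.  */
  bool runs_64bit;       /* Stand-in for TARGET_64BIT.  */
};

enum toy_mode { TOY_SImode, TOY_DImode };

/* Pick the mode to zero-extend into.  The choice follows the mode the code
   actually runs in, not whether 64-bit instructions exist on the machine.  */
static enum toy_mode
toy_extend_mode (const struct toy_target *t)
{
  return t->runs_64bit ? TOY_DImode : TOY_SImode;
}
```

With `-m32 -mpowerpc64` the model has `has_64bit_insns` set but `runs_64bit` clear, so the SImode path is taken.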

>> +++ b/gcc/testsuite/gcc.target/powerpc/pr96933-1.c
>> @@ -0,0 +1,14 @@
>> +/* { dg-do compile { target { lp64 && has_arch_pwr9 } } } */
>> +/* { dg-require-effective-target powerpc_p9vector_ok } */
>> +/* { dg-options "-O2" } */
> 
> As David said:
> 
> /* { dg-do compile } */
> /* { dg-require-effective-target lp64 } */
> /* { dg-require-effective-target has_arch_pwr9 } */
> /* { dg-require-effective-target powerpc_p9vector_ok } */
> /* { dg-options "-O2 -mdejagnu-cpu=power9" } */
> 

Updated.  I am only guessing at why we recommend one effective-target
requirement per line: is it clearer for people to read?  Easier to find
the check result in the verbose test debugging dump?
Anything else I missed?

> But, you probably don't want that has_arch_pwr9 line at all, this is a
> compile test?

Yeah, removed.

> 
> So, okay for trunk with those TARGET_POWERPC64 fixed, and that one
> remaining testcase.  Thanks!

Thanks!  Bootstrapped/regress-tested on powerpc64le-linux-gnu P8/P9
and powerpc64-linux-gnu P8 again and committed in r11-4731.

BR,
Kewen


Re: [committed] patch to deal with insn scratches in global RA

2020-11-05 Thread Christophe Lyon via Gcc-patches
On Mon, 2 Nov 2020 at 23:01, Vladimir Makarov  wrote:
>
>
> On 2020-11-02 4:30 p.m., Vladimir Makarov via Gcc-patches wrote:
> >
> > On 2020-11-02 3:12 p.m., Christophe Lyon wrote:
> >>
> >> Hi,
> >>
> >> This patch causes ICEs on arm (eg arm-none-linux-gnueabi)
> >>  gcc.c-torture/compile/sync-3.c   -O1  (internal compiler error)
> >>  gcc.c-torture/compile/sync-3.c   -O2  (internal compiler error)
> >>  gcc.c-torture/compile/sync-3.c   -O2 -flto -fno-use-linker-plugin
> >> -flto-partition=none  (internal compiler error)
> >>  gcc.c-torture/compile/sync-3.c   -O2 -flto -fuse-linker-plugin
> >> -fno-fat-lto-objects  (internal compiler error)
> >>  gcc.c-torture/compile/sync-3.c   -O3 -g  (internal compiler error)
> >>  gcc.c-torture/compile/sync-3.c   -Os  (internal compiler error)
> >>
> >> gcc.log says:
> >> FAIL: gcc.c-torture/compile/sync-3.c   -O1  (internal compiler error)
> >> PASS: gcc.c-torture/compile/sync-3.c   -O1   (test for warnings, line )
> >> FAIL: gcc.c-torture/compile/sync-3.c   -O1  (test for excess errors)
> >> Excess errors:
> >> during RTL pass: ira
> >> /gcc/testsuite/gcc.c-torture/compile/sync-3.c:85:1: internal compiler
> >> error: Segmentation fault
> >> 0xcf8b1f crash_signal
> >>  /gcc/toplev.c:330
> >> 0xaeb0a0 fix_reg_equiv_init
> >>  /gcc/ira.c:2671
> >> 0xaf2113 find_moveable_pseudos
> >>  /gcc/ira.c:4874
> >> 0xaf48e8 ira
> >>  /gcc/ira.c:5533
> >> 0xaf48e8 execute
> >>  /gcc/ira.c:5861
> >
> >
> > Thank you for sending this info.  I reproduced the crash with
> > x86-64-arm cross-compiler although it is absent on native arm
> > environment.  I will have a fix tomorrow.
> >
> >
> I've fixed it.
>
Thanks, I confirm I no longer see this error.

Christophe


[PATCH] debug/97718 - fix abstract origin references after last change

2020-11-05 Thread Richard Biener
The change to clear the external_die_map slot after creating
the concrete instance DIE broke abstract origin processing which
tried to make sure to have those point to the early abstract instance
and not the concrete instance.  The following restores this by
eventually following the abstract origin link in the concrete instance.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

2020-11-05  Richard Biener  

PR debug/97718
* dwarf2out.c (add_abstract_origin_attribute): Make sure to
point to the abstract instance.
---
 gcc/dwarf2out.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 64ac94a8cbd..81cb7341a7e 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -21293,7 +21293,16 @@ add_abstract_origin_attribute (dw_die_ref die, tree 
origin)
  here.  */
 
   if (origin_die)
-add_AT_die_ref (die, DW_AT_abstract_origin, origin_die);
+{
+  dw_attr_node *a;
+  /* Like above, if we already created a concrete instance DIE
+do not use that for the abstract origin but the early DIE
+if present.  */
+  if (in_lto_p
+ && (a = get_AT (origin_die, DW_AT_abstract_origin)))
+   origin_die = AT_ref (a);
+  add_AT_die_ref (die, DW_AT_abstract_origin, origin_die);
+}
 }
 
 /* We do not currently support the pure_virtual attribute.  */
-- 
2.26.2
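
The idea of the fix can be shown on a much-simplified model.  This is a sketch only: `struct toy_die` and `toy_pick_origin` are hypothetical and stand in for `dw_die_ref`, `get_AT (..., DW_AT_abstract_origin)` and `AT_ref`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, much-simplified DIE: only the abstract-origin link is
   modelled.  */
struct toy_die
{
  const char *name;
  struct toy_die *abstract_origin;  /* NULL if there is no such attribute.  */
};

/* Return the DIE a new DW_AT_abstract_origin should point to.  If
   ORIGIN_DIE is a concrete instance that already links to an early
   abstract DIE, follow that link once instead of using ORIGIN_DIE.  */
static struct toy_die *
toy_pick_origin (struct toy_die *origin_die, bool in_lto)
{
  if (in_lto && origin_die->abstract_origin != NULL)
    return origin_die->abstract_origin;
  return origin_die;
}
```

As in the patch, the link is followed only in the LTO case; outside of it the origin DIE is used unchanged.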


[PATCH] Fix SLP vectorization of stores from boolean vectors

2020-11-05 Thread Richard Biener
The following fixes SLP vectorization of stores that were
pattern recognized.  Since in SLP vectorization pattern analysis
happens after dataref group analysis we have to adjust the groups
with the pattern stmts.  This has some effects down the pipeline
and exposes cases where we looked at the wrong pattern/non-pattern
stmts.

Bootstrap & regtest running on x86_64-unknown-linux-gnu.

2020-11-05  Richard Biener  

* tree-vect-data-refs.c (vect_slp_analyze_node_dependences):
Use the original stmts.
(vect_slp_analyze_node_alignment): Use the pattern stmt.
* tree-vect-slp.c (vect_fixup_store_groups_with_patterns):
New function.
(vect_slp_analyze_bb_1): Call it.

* gcc.dg/vect/bb-slp-69.c: New testcase.
---
 gcc/testsuite/gcc.dg/vect/bb-slp-69.c | 45 +++
 gcc/tree-vect-data-refs.c |  9 --
 gcc/tree-vect-slp.c   | 42 +
 3 files changed, 93 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/bb-slp-69.c

diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-69.c 
b/gcc/testsuite/gcc.dg/vect/bb-slp-69.c
new file mode 100644
index 000..ca72a6804b7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-69.c
@@ -0,0 +1,45 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_int } */
+
+_Bool arr[16];
+
+void foo(char *q)
+{
+  char *p = __builtin_assume_aligned (q, 16);
+  _Bool b0, b1, b2, b3, b4, b5, b6, b7, b8, b9, b10, b11, b12, b13, b14, b15;
+  b0 = p[0] != 0;
+  b1 = p[1] != 0;
+  b2 = p[2] != 0;
+  b3 = p[3] != 0;
+  b4 = p[4] != 0;
+  b5 = p[5] != 0;
+  b6 = p[6] != 0;
+  b7 = p[7] != 0;
+  b8 = p[8] != 0;
+  b9 = p[9] != 0;
+  b10 = p[10] != 0;
+  b11 = p[11] != 0;
+  b12 = p[12] != 0;
+  b13 = p[13] != 0;
+  b14 = p[14] != 0;
+  b15 = p[15] != 0;
+  arr[0] = b0;
+  arr[1] = b1;
+  arr[2] = b2;
+  arr[3] = b3;
+  arr[4] = b4;
+  arr[5] = b5;
+  arr[6] = b6;
+  arr[7] = b7;
+  arr[8] = b8;
+  arr[9] = b9;
+  arr[10] = b10;
+  arr[11] = b11;
+  arr[12] = b12;
+  arr[13] = b13;
+  arr[14] = b14;
+  arr[15] = b15;
+}
+
+/* { dg-final { scan-tree-dump "transform load" "slp2" } } */
+/* { dg-final { scan-tree-dump "optimized: basic block" "slp2" } } */
diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index fd14b480dbf..8afd3044461 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -688,7 +688,8 @@ vect_slp_analyze_node_dependences (vec_info *vinfo, 
slp_tree node,
   stmt_vec_info last_access_info = vect_find_last_scalar_stmt_in_slp 
(node);
   for (unsigned k = 0; k < SLP_TREE_SCALAR_STMTS (node).length (); ++k)
{
- stmt_vec_info access_info = SLP_TREE_SCALAR_STMTS (node)[k];
+ stmt_vec_info access_info
+   = vect_orig_stmt (SLP_TREE_SCALAR_STMTS (node)[k]);
  if (access_info == last_access_info)
continue;
  data_reference *dr_a = STMT_VINFO_DATA_REF (access_info);
@@ -759,7 +760,8 @@ vect_slp_analyze_node_dependences (vec_info *vinfo, 
slp_tree node,
= vect_find_first_scalar_stmt_in_slp (node);
   for (unsigned k = 0; k < SLP_TREE_SCALAR_STMTS (node).length (); ++k)
{
- stmt_vec_info access_info = SLP_TREE_SCALAR_STMTS (node)[k];
+ stmt_vec_info access_info
+   = vect_orig_stmt (SLP_TREE_SCALAR_STMTS (node)[k]);
  if (access_info == first_access_info)
continue;
  data_reference *dr_a = STMT_VINFO_DATA_REF (access_info);
@@ -2444,7 +2446,8 @@ vect_slp_analyze_node_alignment (vec_info *vinfo, 
slp_tree node)
 
   /* For creating the data-ref pointer we need alignment of the
  first element as well.  */
-  first_stmt_info = vect_find_first_scalar_stmt_in_slp (node);
+  first_stmt_info
+= vect_stmt_to_vectorize (vect_find_first_scalar_stmt_in_slp (node));
   if (first_stmt_info != SLP_TREE_SCALAR_STMTS (node)[0])
 {
   first_dr_info = STMT_VINFO_DR_INFO (first_stmt_info);
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index 420c3c93374..bb580089378 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -4023,6 +4023,45 @@ vect_slp_check_for_constructors (bb_vec_info bb_vinfo)
 }
 }
 
+/* Walk the grouped store chains and replace entries with their
+   pattern variant if any.  */
+
+static void
+vect_fixup_store_groups_with_patterns (vec_info *vinfo)
+{
+  stmt_vec_info first_element;
+  unsigned i;
+
+  FOR_EACH_VEC_ELT (vinfo->grouped_stores, i, first_element)
+{
+  if (STMT_VINFO_IN_PATTERN_P (first_element))
+   {
+ stmt_vec_info orig = first_element;
+ first_element = STMT_VINFO_RELATED_STMT (first_element);
+ DR_GROUP_FIRST_ELEMENT (first_element) = first_element;
+ DR_GROUP_SIZE (first_element) = DR_GROUP_SIZE (orig);
+ DR_GROUP_GAP (first_element) = DR_GROUP_GAP (orig);
+ DR_GROUP_NEXT_ELEMENT (first_element) = DR_GROUP_NEXT_ELEMENT (orig);
+ 

RE: [PATCH] arm: Implement vceqq_p64, vceqz_p64 and vceqzq_p64 intrinsics

2020-11-05 Thread Kyrylo Tkachov via Gcc-patches
Hi Christophe,

> -Original Message-
> From: Gcc-patches  On Behalf Of
> Christophe Lyon via Gcc-patches
> Sent: 15 October 2020 18:23
> To: gcc-patches@gcc.gnu.org
> Subject: [PATCH] arm: Implement vceqq_p64, vceqz_p64 and vceqzq_p64
> intrinsics
> 
> This patch adds implementations for vceqq_p64, vceqz_p64 and
> vceqzq_p64 intrinsics.
> 
> vceqq_p64 uses the existing vceq_p64 after splitting the input vectors
> into their high and low halves.
> 
> vceqz[q] simply call the vceq and vceqq with a second argument equal
> to zero.
> 
> The added (executable) testcases make sure that the poly64x2_t
> variants have results with one element of all zeroes (false) and the
> other element with all bits set to one (true).
> 
> 2020-10-15  Christophe Lyon  
> 
>   gcc/
>   * config/arm/arm_neon.h (vceqz_p64, vceqq_p64, vceqzq_p64):
> New.
> 
>   gcc/testsuite/
>   * gcc.target/aarch64/advsimd-intrinsics/p64_p128.c: Add tests for
>   vceqz_p64, vceqq_p64 and vceqzq_p64.
> ---
>  gcc/config/arm/arm_neon.h  | 31 +++
>  .../aarch64/advsimd-intrinsics/p64_p128.c  | 46
> +-
>  2 files changed, 76 insertions(+), 1 deletion(-)
> 
> diff --git a/gcc/config/arm/arm_neon.h b/gcc/config/arm/arm_neon.h
> index aa21730..f7eff37 100644
> --- a/gcc/config/arm/arm_neon.h
> +++ b/gcc/config/arm/arm_neon.h
> @@ -16912,6 +16912,37 @@ vceq_p64 (poly64x1_t __a, poly64x1_t __b)
>return vreinterpret_u64_u32 (__m);
>  }
> 
> +__extension__ extern __inline uint64x1_t
> +__attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
> +vceqz_p64 (poly64x1_t __a)
> +{
> +  poly64x1_t __b = vreinterpret_p64_u32 (vdup_n_u32 (0));
> +  return vceq_p64 (__a, __b);
> +}

This approach is okay, but can we have some kind of test to confirm it 
generates the VCEQ instruction with immediate zero rather than having a 
separate DUP...
Thanks,
Kyrill

> +
> +/* For vceqq_p64, we rely on vceq_p64 for each of the two elements.  */
> +__extension__ extern __inline uint64x2_t
> +__attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
> +vceqq_p64 (poly64x2_t __a, poly64x2_t __b)
> +{
> +  poly64_t __high_a = vget_high_p64 (__a);
> +  poly64_t __high_b = vget_high_p64 (__b);
> +  uint64x1_t __high = vceq_p64(__high_a, __high_b);
> +
> +  poly64_t __low_a = vget_low_p64 (__a);
> +  poly64_t __low_b = vget_low_p64 (__b);
> +  uint64x1_t __low = vceq_p64(__low_a, __low_b);
> +  return vcombine_u64 (__low, __high);
> +}
> +
> +__extension__ extern __inline uint64x2_t
> +__attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
> +vceqzq_p64 (poly64x2_t __a)
> +{
> +  poly64x2_t __b = vreinterpretq_p64_u32 (vdupq_n_u32 (0));
> +  return vceqq_p64 (__a, __b);
> +}
> +
>  /* The vtst_p64 intrinsic does not map to a single instruction.
> We emulate it in way similar to vceq_p64 above but here we do
> a reduction with max since if any two corresponding bits
> diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/p64_p128.c
> b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/p64_p128.c
> index a3210a9..6aed096 100644
> --- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/p64_p128.c
> +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/p64_p128.c
> @@ -16,6 +16,11 @@ VECT_VAR_DECL(vbsl_expected,poly,64,2) [] =
> { 0xfff1,
> 
>  /* Expected results: vceq.  */
>  VECT_VAR_DECL(vceq_expected,uint,64,1) [] = { 0x0 };
> +VECT_VAR_DECL(vceq_expected,uint,64,2) [] = { 0x0, 0x };
> +
> +/* Expected results: vceqz.  */
> +VECT_VAR_DECL(vceqz_expected,uint,64,1) [] = { 0x0 };
> +VECT_VAR_DECL(vceqz_expected,uint,64,2) [] = { 0x0, 0x };
> 
>  /* Expected results: vcombine.  */
>  VECT_VAR_DECL(vcombine_expected,poly,64,2) [] = { 0xfff0,
> 0x88 };
> @@ -213,7 +218,7 @@ int main (void)
> 
>/* vceq_p64 tests. */
>  #undef TEST_MSG
> -#define TEST_MSG "VCEQ"
> +#define TEST_MSG "VCEQ/VCEQQ"
> 
>  #define TEST_VCOMP1(INSN, Q, T1, T2, T3, W, N)
>   \
>VECT_VAR(vceq_vector_res, T3, W, N) =
>   \
> @@ -227,16 +232,55 @@ int main (void)
>DECL_VARIABLE(vceq_vector, poly, 64, 1);
>DECL_VARIABLE(vceq_vector2, poly, 64, 1);
>DECL_VARIABLE(vceq_vector_res, uint, 64, 1);
> +  DECL_VARIABLE(vceq_vector, poly, 64, 2);
> +  DECL_VARIABLE(vceq_vector2, poly, 64, 2);
> +  DECL_VARIABLE(vceq_vector_res, uint, 64, 2);
> 
>CLEAN(result, uint, 64, 1);
> +  CLEAN(result, uint, 64, 2);
> 
>VLOAD(vceq_vector, buffer, , poly, p, 64, 1);
> +  VLOAD(vceq_vector, buffer, q, poly, p, 64, 2);
> 
>VDUP(vceq_vector2, , poly, p, 64, 1, 0x88);
> +  VSET_LANE(vceq_vector2, q, poly, p, 64, 2, 0, 0x88);
> +  VSET_LANE(vceq_vector2, q, poly, p, 64, 2, 1, 0xFFF1);
> 
>TEST_VCOMP(vceq, , poly, p, uint, 64, 1);
> +  TEST_VCOMP(vceq, q, poly, p, uint, 64, 2);
> 
>CHECK(TEST_MSG, uint, 64, 1, PRIx64, vceq_expected, "");
> +  CHECK(TEST_MSG, uint, 64, 2, PRIx64, vce
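
The lane-wise semantics described in the patch (all bits set for equal lanes, all zeroes otherwise) can be modelled in portable C.  This is a sketch, not the arm_neon.h implementation: `struct toy_u64x2` and the `toy_*` functions are hypothetical stand-ins for `poly64x2_t`/`uint64x2_t` and the intrinsics.

```c
#include <stdint.h>

/* Hypothetical two-lane vector; not the real poly64x2_t or uint64x2_t.  */
struct toy_u64x2
{
  uint64_t lane[2];
};

/* One lane of the comparison: all ones if equal, all zeroes otherwise.  */
static uint64_t
toy_vceq_lane (uint64_t a, uint64_t b)
{
  return a == b ? UINT64_MAX : 0;
}

/* Model of vceqq_p64: compare the low and high halves independently and
   combine the two results.  */
static struct toy_u64x2
toy_vceqq (struct toy_u64x2 a, struct toy_u64x2 b)
{
  struct toy_u64x2 r;
  r.lane[0] = toy_vceq_lane (a.lane[0], b.lane[0]);
  r.lane[1] = toy_vceq_lane (a.lane[1], b.lane[1]);
  return r;
}

/* Model of vceqzq_p64: compare against a vector of zeroes.  */
static struct toy_u64x2
toy_vceqzq (struct toy_u64x2 a)
{
  struct toy_u64x2 zero = { { 0, 0 } };
  return toy_vceqq (a, zero);
}
```

A model like this only pins down the expected results; whether the compiler emits a compare against an immediate zero or a separate DUP still has to be checked on the generated code.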

[committed][PATCH] testsuite: disable vect tests that were accidentally enabled on x86

2020-11-05 Thread Tamar Christina via Gcc-patches
Hi All,

My previous patch accidentally enabled some tests on x86 because my target
selector foo was weak.  This now properly runs them only on AArch64.

Bootstrapped and regtested on aarch64-none-linux-gnu and x86_64-pc-linux-gnu
with no issues.

Committed under the obvious rule.

Thanks,
Tamar

gcc/testsuite/ChangeLog:

* gcc.dg/vect/slp-11b.c: Update testcase.
* gcc.dg/vect/slp-perm-6.c: Update target selector.

-- 
diff --git a/gcc/testsuite/gcc.dg/vect/slp-11b.c b/gcc/testsuite/gcc.dg/vect/slp-11b.c
index 3f16c9cec3e0ab6672d321e1bb265e7853920341..0cc23770badf0e00ef98769a2dd14a92dca32cca 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-11b.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-11b.c
@@ -45,4 +45,4 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { vect_strided4 && vect_int_mult } } } } */
 /* { dg-final { scan-tree-dump-times "vectorized 0 loops" 1 "vect" { target { ! { vect_strided4 && vect_int_mult } } } } } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 0 "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/slp-perm-6.c b/gcc/testsuite/gcc.dg/vect/slp-perm-6.c
index 816486a050d4fbf7d3365c829d48175ff7ba7f60..cc863de76bf9c9ececfcc821c3384959ec49aa6d 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-perm-6.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-perm-6.c
@@ -107,6 +107,6 @@ int main (int argc, const char* argv[])
 /* The epilogues are vectorized using partial vectors.  */
 /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 4 "vect" { target { vect_perm3_int && { {! vect_load_lanes } && vect_partial_vectors_usage_1 } } } } } */
 /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" { target vect_load_lanes } } } */
-/* { dg-final { scan-tree-dump "Built SLP cancelled: can use load/store-lanes" "vect" { xfail { vect_perm3_int && vect_load_lanes } } } } */
-/* { dg-final { scan-tree-dump "LOAD_LANES" "vect" { xfail vect_load_lanes } } } */
-/* { dg-final { scan-tree-dump "STORE_LANES" "vect" { xfail vect_load_lanes } } } */
+/* { dg-final { scan-tree-dump "Built SLP cancelled: can use load/store-lanes" "vect" { target { vect_perm3_int && vect_load_lanes } xfail { vect_perm3_int && vect_load_lanes } } } } */
+/* { dg-final { scan-tree-dump "LOAD_LANES" "vect" { target { vect_load_lanes } xfail { vect_load_lanes } } } } */
+/* { dg-final { scan-tree-dump "STORE_LANES" "vect" { target { vect_load_lanes } xfail { vect_load_lanes } } } } */



RE: [PATCH] SLP: Move load/store-lanes check till late

2020-11-05 Thread Richard Biener
On Wed, 4 Nov 2020, Tamar Christina wrote:

> Hi Richi,
> 
> > -Original Message-
> > From: rguent...@c653.arch.suse.de  On
> > Behalf Of Richard Biener
> > Sent: Wednesday, November 4, 2020 8:07 AM
> > To: Tamar Christina 
> > Cc: gcc-patches@gcc.gnu.org; nd ; o...@ucw.cz
> > Subject: RE: [PATCH] SLP: Move load/store-lanes check till late
> > 
> > On Tue, 3 Nov 2020, Tamar Christina wrote:
> > 
> > > Hi Richi,
> > >
> > > We decided to take the regression in any code-gen this could give and
> > > fix it properly next stage-1.  As such here's a new patch based on
> > > your previous feedback.
> > >
> > > Ok for master?
> > 
> > Looks good so far but be aware that you elide the
> > 
> > - && vect_store_lanes_supported
> > -  (STMT_VINFO_VECTYPE (scalar_stmts[0]), group_size,
> > false))
> > 
> > part of the check - that is, you don't verify the store part of the 
> > instance can
> > use store-lanes.  Btw, this means the original code cancelled an instance 
> > only
> > when the SLP graph entry is a store-lane capable store but your variant
> > would also cancel in case there's a load-lane capable reduction.
> > 
> 
> I do still have it,
> 
> if (loads_permuted
> && vect_store_lanes_supported (vectype, group_size, false))
> 
> I just grab the type from the SLP_TREE_VECTYPE (slp_root); which should be 
> the store if
> one exists. 
> 
> > I think that you eventually want to re-instantiate the store-lane check but
> > treat it the same as any of the load checks (thus not require all instances 
> > to
> > be stores for the cancellation).
> > But at least when a store cannot use store-lanes we probably shouldn't
> > cancel the SLP.
> 
> I did however elide the kind check that was added as part of the rebase; it
> looked like kind wasn't being stored inside the SLP instance and I'd have to
> redo the analysis to find it.
> 
> Does it sound reasonable to include kind as a field in the SLP instance?
> 
> > 
> > Anyway, the patch is OK for master.  The store-lane check part can be re-
> > added as followup.
> > 
> 
> Thanks! Will do.

Btw, the patch regressed

FAIL: gcc.dg/vect/slp-11b.c -flto -ffat-lto-objects  scan-tree-dump-times 
vect "vectorizing stmts using SLP" 1
FAIL: gcc.dg/vect/slp-11b.c scan-tree-dump-times vect "vectorizing stmts 
using SLP" 1
FAIL: gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects  scan-tree-dump 
vect "Built SLP cancelled: can use load/store-lanes"
FAIL: gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects  scan-tree-dump 
vect "LOAD_LANES"
FAIL: gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects  scan-tree-dump 
vect "STORE_LANES"
FAIL: gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "Built SLP cancelled: 
can use load/store-lanes"
FAIL: gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "LOAD_LANES"
FAIL: gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "STORE_LANES"

on x86_64.  The slp-11b.c testcase is interesting since there
extract_muldiv folding makes the group of four stores
not matching so we split into a size of 3 and one remaining store.
This causes us to arrive at

note:   node 0x441a940 (max_nunits=4, refcnt=2)
note:   stmt 0 _2 = in[_1];
note:   stmt 1 _6 = in[_5];
note:   stmt 2 _10 = in[_8];
note:   load permutation { 0 2 1 }

which on x86_64 we in the end cannot handle (without SSE4 I think)
so it fails to SLP there.  Guess arm can do the permute but not
the load-lane here.
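
To make the dump above concrete: a load permutation says which element of the load group each lane of the SLP node reads.  The helper below is a hypothetical illustration of that mapping, not vectorizer code.

```c
#include <stddef.h>

/* Apply a load permutation: lane I of the result reads element PERM[I] of
   the loaded group.  Hypothetical helper for illustration only.  */
static void
toy_apply_load_permutation (const int *in, const unsigned *perm,
                            size_t n, int *out)
{
  for (size_t i = 0; i < n; ++i)
    out[i] = in[perm[i]];
}
```

For the permutation { 0 2 1 } this swaps the last two of three loaded elements, which is the shuffle the target has to support for the node to be vectorized with SLP.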

For gcc.dg/vect/slp-perm-6.c the XFAILs shouldn't be done for
!vect_load_lanes targets.  Not sure if that's possible easily,
like with a { target vect_load_lanes } { xfail vect_load_lanes }
combo ...?  I suggest to make it xfail everywhere instead and
add a comment as to why we're expecting those only for vect_load_lanes
targets.

Richard.

> > Thanks,
> > Richard.
> > 
> > > Thanks,
> > > Tamar
> > >
> > > gcc/ChangeLog:
> > >
> > >   * tree-vect-slp.c (vect_analyze_slp_instance): Moved load/store
> > lanes
> > >   check to ...
> > >   * tree-vect-loop.c (vect_analyze_loop_2): ..Here
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > >   * gcc.dg/vect/slp-11b.c: Update output scan.
> > >   * gcc.dg/vect/slp-perm-6.c: Likewise.
> > >
> > > > -Original Message-
> > > > From: rguent...@c653.arch.suse.de  On
> > > > Behalf Of Richard Biener
> > > > Sent: Thursday, October 22, 2020 9:44 AM
> > > > To: Tamar Christina 
> > > > Cc: gcc-patches@gcc.gnu.org; nd ; o...@ucw.cz
> > > > Subject: Re: [PATCH] SLP: Move load/store-lanes check till late
> > > >
> > > > On Wed, 21 Oct 2020, Tamar Christina wrote:
> > > >
> > > > > Hi All,
> > > > >
> > > > > This moves the code that checks for load/store lanes further in
> > > > > the pipeline and places it after slp_optimize.  This would allow
> > > > > us to perform optimizations on the SLP tree and only bail out if
> > > > > we really have a
> > > > permute.
> > > > >
> > > > > With this change it allows us to handle permutes such as {1,1,1,1}
> > > > > which should be handled by a load and replicate.
> > 

RE: [PATCH] SLP: Move load/store-lanes check till late

2020-11-05 Thread Tamar Christina via Gcc-patches
> -Original Message-
> From: rguent...@c653.arch.suse.de  On
> Behalf Of Richard Biener
> Sent: Thursday, November 5, 2020 10:17 AM
> To: Tamar Christina 
> Cc: gcc-patches@gcc.gnu.org; nd ; o...@ucw.cz
> Subject: RE: [PATCH] SLP: Move load/store-lanes check till late
> 
> On Wed, 4 Nov 2020, Tamar Christina wrote:
> 
> > Hi Richi,
> >
> > > -Original Message-
> > > From: rguent...@c653.arch.suse.de  On
> > > Behalf Of Richard Biener
> > > Sent: Wednesday, November 4, 2020 8:07 AM
> > > To: Tamar Christina 
> > > Cc: gcc-patches@gcc.gnu.org; nd ; o...@ucw.cz
> > > Subject: RE: [PATCH] SLP: Move load/store-lanes check till late
> > >
> > > On Tue, 3 Nov 2020, Tamar Christina wrote:
> > >
> > > > Hi Richi,
> > > >
> > > > We decided to take the regression in any code-gen this could give
> > > > and fix it properly next stage-1.  As such here's a new patch
> > > > based on your previous feedback.
> > > >
> > > > Ok for master?
> > >
> > > Looks good so far but be aware that you elide the
> > >
> > > - && vect_store_lanes_supported
> > > -  (STMT_VINFO_VECTYPE (scalar_stmts[0]), group_size,
> > > false))
> > >
> > > part of the check - that is, you don't verify the store part of the
> > > instance can use store-lanes.  Btw, this means the original code
> > > cancelled an instance only when the SLP graph entry is a store-lane
> > > capable store but your variant would also cancel in case there's a load-
> lane capable reduction.
> > >
> >
> > I do still have it,
> >
> >   if (loads_permuted
> >   && vect_store_lanes_supported (vectype, group_size, false))
> >
> > I just grab the type from the SLP_TREE_VECTYPE (slp_root); which
> > should be the store if one exists.
> >
> > > I think that you eventually want to re-instantiate the store-lane
> > > check but treat it the same as any of the load checks (thus not
> > > require all instances to be stores for the cancellation).
> > > But at least when a store cannot use store-lanes we probably
> > > shouldn't cancel the SLP.
> >
> > I did however elide the kind check that was added as part of the
> > rebase; it looked like kind wasn't being stored inside the SLP instance
> > and I'd have to redo the analysis to find it.
> >
> > Does it sound reasonable to include kind as a field in the SLP instance?
> >
> > >
> > > Anyway, the patch is OK for master.  The store-lane check part can
> > > be re- added as followup.
> > >
> >
> > Thanks! Will do.
> 
> Btw, the patch regressed
> 
> FAIL: gcc.dg/vect/slp-11b.c -flto -ffat-lto-objects  scan-tree-dump-times vect
> "vectorizing stmts using SLP" 1
> FAIL: gcc.dg/vect/slp-11b.c scan-tree-dump-times vect "vectorizing stmts
> using SLP" 1
> FAIL: gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects  scan-tree-dump vect
> "Built SLP cancelled: can use load/store-lanes"
> FAIL: gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects  scan-tree-dump vect
> "LOAD_LANES"
> FAIL: gcc.dg/vect/slp-perm-6.c -flto -ffat-lto-objects  scan-tree-dump vect
> "STORE_LANES"
> FAIL: gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "Built SLP cancelled:
> can use load/store-lanes"
> FAIL: gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "LOAD_LANES"
> FAIL: gcc.dg/vect/slp-perm-6.c scan-tree-dump vect "STORE_LANES"
> 
> on x86_64.  The slp-11b.c testcase is interesting since there extract_muldiv
> folding makes the group of four stores not matching so we split into a size of
> 3 and one remaining store.
> This causes us to arrive at
> 
> note:   node 0x441a940 (max_nunits=4, refcnt=2)
> note:   stmt 0 _2 = in[_1];
> note:   stmt 1 _6 = in[_5];
> note:   stmt 2 _10 = in[_8];
> note:   load permutation { 0 2 1 }
> 
> which on x86_64 we in the end cannot handle (without SSE4 I think) so it fails
> to SLP there.  Guess arm can do the permute but not the load-lane here.
> 
> For gcc.dg/vect/slp-perm-6.c the XFAILs shouldn't be done
> for !vect_load_lanes targets.  Not sure if that's possible easily, like with a
> { target vect_load_lanes } { xfail vect_load_lanes } combo ...?  I suggest to
> make it xfail everywhere instead and add a comment as to why we're expecting
> those only for vect_load_lanes targets.

Yes, I just fixed these: the change in gcc.dg/vect/slp-11b.c shouldn't have
been there, and I updated the target selector properly :)

Cheers,
Tamar

> 
> Richard.
> 
> > > Thanks,
> > > Richard.
> > >
> > > > Thanks,
> > > > Tamar
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > > * tree-vect-slp.c (vect_analyze_slp_instance): Moved load/store
> > > lanes
> > > > check to ...
> > > > * tree-vect-loop.c (vect_analyze_loop_2): ..Here
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > > * gcc.dg/vect/slp-11b.c: Update output scan.
> > > > * gcc.dg/vect/slp-perm-6.c: Likewise.
> > > >
> > > > > -Original Message-
> > > > > From: rguent...@c653.arch.suse.de 
> > > > > On Behalf Of Richard Biener
> > > > > Sent: Thursday, October 22, 2020 9:44 AM
> > > > > To: Tamar Chri

Re: [PATCH] ASAN: disable -Wno-stringop-overflow for 2 tests

2020-11-05 Thread Martin Liška

On 10/31/20 4:59 PM, H.J. Lu wrote:

On Tue, Oct 13, 2020 at 1:17 AM Jakub Jelinek via Gcc-patches
 wrote:


On Tue, Oct 13, 2020 at 10:11:26AM +0200, Martin Liška wrote:

--- a/gcc/testsuite/g++.dg/asan/asan_test.C
+++ b/gcc/testsuite/g++.dg/asan/asan_test.C
@@ -9,6 +9,7 @@
  // { dg-additional-options "-DASAN_AVOID_EXPENSIVE_TESTS=1" { target { ! 
run_expensive_tests } } }
  // { dg-additional-options "-msse2" { target { i?86-*-linux* x86_64-*-linux* 
i?86-*-freebsd* x86_64-*-freebsd*} } }
  // { dg-additional-options "-D__NO_INLINE__" { target { *-*-linux-gnu } } }
+/* { dg-additional-options "-Wno-stringop-overflow" } */


I'd put this one on the dg-options line next to other -Wno-* options.
Otherwise LGTM.


  // { dg-set-target-env-var ASAN_OPTIONS "handle_segv=2" }
  // { dg-final { asan-gtest } }
diff --git a/gcc/testsuite/gcc.dg/asan/pr80166.c 
b/gcc/testsuite/gcc.dg/asan/pr80166.c
index 629dd23a31c..5e153b274fa 100644
--- a/gcc/testsuite/gcc.dg/asan/pr80166.c
+++ b/gcc/testsuite/gcc.dg/asan/pr80166.c
@@ -1,5 +1,6 @@
  /* PR sanitizer/80166 */
  /* { dg-do run } */
+/* { dg-additional-options "-Wno-stringop-overflow" } */
  #include 
  #include 
--
2.28.0


 Jakub



Can you backport this to release branches?


Sure, I've just done that for the GCC 9 and GCC 10 branches.
For GCC 8.x, -Wno-alloc-size-larger-than is also missing,
so I didn't do the backport there.

Thanks,
Martin



Thanks.





Re: [PATCH] "used" attribute saves decl from linker garbage collection

2020-11-05 Thread Jozef Lawrynowicz
On Wed, Nov 04, 2020 at 03:58:56PM -0800, H.J. Lu wrote:
> On Wed, Nov 4, 2020 at 3:00 PM Hans-Peter Nilsson  wrote:
> >
> > On Wed, 4 Nov 2020, H.J. Lu wrote:
> > > On Wed, Nov 4, 2020 at 1:56 PM Hans-Peter Nilsson  
> > > wrote:
> > > > On Wed, 4 Nov 2020, H.J. Lu wrote:
> > > >
> > > > > On Wed, Nov 4, 2020 at 1:03 PM Hans-Peter Nilsson  
> > > > > wrote:
> > > > > >
> > > > > > On Wed, 4 Nov 2020, H.J. Lu wrote:
> > > > > > > On Wed, Nov 4, 2020 at 10:09 AM Hans-Peter Nilsson 
> > > > > > >  wrote:
> > > > > > > > I'm not much more than a random voice, but an assembly directive
> > > > > > > > that specifies the symbol (IIUC your .retain directive) to
> > > > > > >
> > > > > > > But .retain directive DOES NOT adjust symbol attribute.
> > > >
> > > > I see I missed to point out that I was speaking about the *gcc
> > > > symbol* attribute "used".
> > >
> > > There is no such corresponding symbol attribute in ELF.
> >
> > I have not missed that, nor that SHF_GNU_RETAIN is so new that
> > it's not in binutils master.  I have also not missed that gcc
> > caters to other object formats too.  A common symbol-specific
> > directive such as .retain, would be better than messing with
> > section attributes, for gcc.
> 
> This is totally irrelevant to SHF_GNU_RETAIN.
> 
> > > > It's cleaner to the compiler if it can pass on to the assembler
> > > > the specific symbol that needs to be kept.
> > > >
> > >
> > > SHF_GNU_RETAIN is for section and GCC should place the symbol,
> > > which should be kept, in the SHF_GNU_RETAIN section directly, not
> > > through .retain directive.
> >
> > This is where opinions differ.  Anyway, this is now repetition;
> > I'm done.
> 
> .retain is ill-defined.   For example,
> 
> [hjl@gnu-cfl-2 gcc]$ cat /tmp/x.c
> static int xyzzy __attribute__((__used__));
> [hjl@gnu-cfl-2 gcc]$ ./xgcc -B./ -S /tmp/x.c -fcommon
> [hjl@gnu-cfl-2 gcc]$ cat x.s
> .file "x.c"
> .text
> .retain xyzzy  < What does it do?
> .local xyzzy
> .comm xyzzy,4,4
> .ident "GCC: (GNU) 11.0.0 20201103 (experimental)"
> .section .note.GNU-stack,"",@progbits
> [hjl@gnu-cfl-2 gcc]$
> 
> A symbol directive should operate on the symbol table.
> With 'R' flag, we got
> 
> .file "x.c"
> .text
> .section .bss.xyzzy,"awR",@nobits
> .align 4
> .type xyzzy, @object
> .size xyzzy, 4
> xyzzy:
> .zero 4
> .ident "GCC: (GNU) 11.0.0 20201104 (experimental)"
> .section .note.GNU-stack,"",@progbits

I still think it is very wrong for the "used" attribute to place the
symbol in a unique section. The structure of the sections in the object
file should be no different whether the "used" attribute was applied to
a symbol or not.

I will therefore have to make changes to GCC so that we can get the name
of "unnamed" sections, and emit a .section directive with the "R" flag
set on that section name, in order to avoid using a .retain directive.

"used" applied to a function
---
  Before:
TEXT_SECTION_ASM_OP 
func:

  After:
.section TEXT_SECTION_NAME,"axR",%progbits
func:

Where TEXT_SECTION_NAME is a new macro which defines the section name
corresponding to TEXT_SECTION_ASM_OP.
Similar new macros are required for all *SECTION_ASM_OP.

Since we can't use the .retain directive, this is the kludge that will
be required to robustly support all targets.

The alternative is to just infer that the mapping of unnamed sections to
section names is always the following:
  text_section-> .text,"ax",%progbits
  data_section-> .data,"aw"
  bss_section -> .bss,"aw",%nobits
  rodata_section  -> .rodata,"a",
  etc.

This section name assumption does not hold for a couple of ELF targets.
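
To make the inferred mapping above concrete, here is a minimal sketch in plain C of what "emit a .section directive with the R flag" could look like for the unnamed sections. This is not GCC code: the enum, the table and format_retain_directive are hypothetical names, and the table simply hard-codes the assumed names and flags listed above (with no section type for .data and .rodata, which is also an assumption).

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical identifiers for the unnamed sections discussed above.  */
enum unnamed_section { SEC_TEXT, SEC_DATA, SEC_BSS, SEC_RODATA };

struct section_info
{
  const char *name;   /* assumed section name */
  const char *flags;  /* assumed flags, without "R" */
  const char *type;   /* assumed type, or NULL if none is emitted */
};

/* The inferred mapping; real targets may differ, which is the problem.  */
static const struct section_info inferred[] = {
  [SEC_TEXT]   = { ".text",   "ax", "%progbits" },
  [SEC_DATA]   = { ".data",   "aw", NULL },
  [SEC_BSS]    = { ".bss",    "aw", "%nobits" },
  [SEC_RODATA] = { ".rodata", "a",  NULL },
};

/* Write a .section directive for SEC with "R" appended to the flags.
   Returns the number of characters written, or -1 if BUF is too small.  */
int
format_retain_directive (enum unnamed_section sec, char *buf, size_t len)
{
  const struct section_info *si = &inferred[sec];
  int n;

  if (si->type)
    n = snprintf (buf, len, "\t.section %s,\"%sR\",%s",
                  si->name, si->flags, si->type);
  else
    n = snprintf (buf, len, "\t.section %s,\"%sR\"", si->name, si->flags);

  return (n < 0 || (size_t) n >= len) ? -1 : n;
}
```

The point of the sketch is that every entry in the table is a guess about what the assembler would have chosen for the bare ASM_OP, so any target that deviates needs its own entry.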

Also, many targets omit the specification of the flags, leaving that
choice to the assembler. Instead, the compiler will now have to infer
what the assembler will do, all because we can't have the .retain
directive.

.retain <symbol> makes life very easy for GCC, but I understand your
objection from a theoretical point of view.

You previously objected to .retain <section>, to apply
SHF_GNU_RETAIN to <section>. This does not violate your rule about
a directive applying flags to a different type of structure to what is
named in the directive.

If we can have .retain <section>, then we don't have to make
assumptions about section flags in GCC, we can just name the section used
in the ASM_OP.

Do you still oppose .retain <section>?

Another alternative is to disallow "used" from applying SHF_GNU_RETAIN,
unless the symbol is in a named section. Obviously this is pretty gross,
but would mean we don't need to handle *SECTION_ASM_OP sections.

Thanks,
Jozef
> 
> -- 
> H.J.


[PATCH] arm: [testcase] Better narrow some bfloat16 testcase

2020-11-05 Thread Andrea Corallo via Gcc-patches
Christophe Lyon  writes:

[...]

>> I think you need to add -mfloat-abi=hard to the dg-additional-options
>> otherwise vld1_lane_bf16_1.c
>> fails on targets with a soft float-abi default (eg arm-linux-gnueabi).
>>
>> See bf16_vldn_1.c.
>
> Actually that's not sufficient because in turn we get:
> /sysroot-arm-none-linux-gnueabi/usr/include/gnu/stubs.h:10:11: fatal
> error: gnu/stubs-hard.h: No such file or directory
>
> So you should check that -mfloat-abi=hard is supported.
>
> Ditto for the vst tests.

Hi Christophe,

this patch should implement your suggestions.

On my arm-none-linux-gnueabi setup the tests were already skipped
as unsupported, so it would be great if you could test and confirm
that this fixes the issue you see.

Thanks!

  Andrea

>From d27e3f39fa2f348a4b8aa929bbb65808a09f1211 Mon Sep 17 00:00:00 2001
From: Andrea Corallo 
Date: Thu, 5 Nov 2020 08:57:03 +
Subject: [PATCH] arm: [testcase] Better narrow some bfloat16 testcase

2020-11-05  Andrea Corallo  

* gcc.target/arm/simd/vld1_lane_bf16_1.c: Add -mfloat-abi=hard
flag.
* gcc.target/arm/simd/vld1_lane_bf16_indices_1.c: Likewise.
* gcc.target/arm/simd/vld1q_lane_bf16_indices_1.c: Likewise.
* gcc.target/arm/simd/vst1_lane_bf16_1.c: Likewise.
* gcc.target/arm/simd/vst1_lane_bf16_indices_1.c: Likewise.
* gcc.target/arm/simd/vstq1_lane_bf16_indices_1.c: Likewise.
* lib/target-supports.exp
(check_effective_target_arm_v8_2a_bf16_neon_ok_nocache): Require
target to support -mfloat-abi=hard.
---
 gcc/testsuite/gcc.target/arm/simd/vld1_lane_bf16_1.c  | 2 +-
 gcc/testsuite/gcc.target/arm/simd/vld1_lane_bf16_indices_1.c  | 1 +
 gcc/testsuite/gcc.target/arm/simd/vld1q_lane_bf16_indices_1.c | 1 +
 gcc/testsuite/gcc.target/arm/simd/vst1_lane_bf16_1.c  | 2 +-
 gcc/testsuite/gcc.target/arm/simd/vst1_lane_bf16_indices_1.c  | 1 +
 gcc/testsuite/gcc.target/arm/simd/vstq1_lane_bf16_indices_1.c | 1 +
 gcc/testsuite/lib/target-supports.exp | 4 
 7 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/simd/vld1_lane_bf16_1.c 
b/gcc/testsuite/gcc.target/arm/simd/vld1_lane_bf16_1.c
index fa4e45b7217..64e1f394676 100644
--- a/gcc/testsuite/gcc.target/arm/simd/vld1_lane_bf16_1.c
+++ b/gcc/testsuite/gcc.target/arm/simd/vld1_lane_bf16_1.c
@@ -1,7 +1,7 @@
 /* { dg-do assemble } */
 /* { dg-require-effective-target arm_v8_2a_bf16_neon_ok } */
 /* { dg-add-options arm_v8_2a_bf16_neon } */
-/* { dg-additional-options "-O3 --save-temps" } */
+/* { dg-additional-options "-O3 --save-temps -mfloat-abi=hard" } */
 
 #include "arm_neon.h"
 
diff --git a/gcc/testsuite/gcc.target/arm/simd/vld1_lane_bf16_indices_1.c 
b/gcc/testsuite/gcc.target/arm/simd/vld1_lane_bf16_indices_1.c
index c83eb53234d..9e56c25974e 100644
--- a/gcc/testsuite/gcc.target/arm/simd/vld1_lane_bf16_indices_1.c
+++ b/gcc/testsuite/gcc.target/arm/simd/vld1_lane_bf16_indices_1.c
@@ -1,6 +1,7 @@
 /* { dg-do assemble } */
 /* { dg-require-effective-target arm_v8_2a_bf16_neon_ok } */
 /* { dg-add-options arm_v8_2a_bf16_neon } */
+/* { dg-additional-options "-mfloat-abi=hard" } */
 
 #include "arm_neon.h"
 
diff --git a/gcc/testsuite/gcc.target/arm/simd/vld1q_lane_bf16_indices_1.c 
b/gcc/testsuite/gcc.target/arm/simd/vld1q_lane_bf16_indices_1.c
index 8e21e61c9c0..c75d24db11b 100644
--- a/gcc/testsuite/gcc.target/arm/simd/vld1q_lane_bf16_indices_1.c
+++ b/gcc/testsuite/gcc.target/arm/simd/vld1q_lane_bf16_indices_1.c
@@ -1,6 +1,7 @@
 /* { dg-do assemble } */
 /* { dg-require-effective-target arm_v8_2a_bf16_neon_ok } */
 /* { dg-add-options arm_v8_2a_bf16_neon } */
+/* { dg-additional-options "-mfloat-abi=hard" } */
 
 #include "arm_neon.h"
 
diff --git a/gcc/testsuite/gcc.target/arm/simd/vst1_lane_bf16_1.c 
b/gcc/testsuite/gcc.target/arm/simd/vst1_lane_bf16_1.c
index e018ec6592f..77e8a3bd5eb 100644
--- a/gcc/testsuite/gcc.target/arm/simd/vst1_lane_bf16_1.c
+++ b/gcc/testsuite/gcc.target/arm/simd/vst1_lane_bf16_1.c
@@ -1,7 +1,7 @@
 /* { dg-do assemble } */
 /* { dg-require-effective-target arm_v8_2a_bf16_neon_ok } */
 /* { dg-add-options arm_v8_2a_bf16_neon } */
-/* { dg-additional-options "-O3 --save-temps" } */
+/* { dg-additional-options "-O3 --save-temps -mfloat-abi=hard" } */
 
 #include "arm_neon.h"
 
diff --git a/gcc/testsuite/gcc.target/arm/simd/vst1_lane_bf16_indices_1.c 
b/gcc/testsuite/gcc.target/arm/simd/vst1_lane_bf16_indices_1.c
index 39870dc054c..ba4017afd0c 100644
--- a/gcc/testsuite/gcc.target/arm/simd/vst1_lane_bf16_indices_1.c
+++ b/gcc/testsuite/gcc.target/arm/simd/vst1_lane_bf16_indices_1.c
@@ -1,6 +1,7 @@
 /* { dg-do assemble } */
 /* { dg-require-effective-target arm_v8_2a_bf16_neon_ok } */
 /* { dg-add-options arm_v8_2a_bf16_neon } */
+/* { dg-additional-options "-mfloat-abi=hard" } */
 
 #include "arm_neon.h"
 
diff --git a/gcc/testsuite/gcc.target/arm/simd/vstq1_lane_bf16_indices_1.c 
b/gcc/testsuite/gcc.target/arm/simd/vstq1_lane_bf16

Re: [PATCH] "used" attribute saves decl from linker garbage collection

2020-11-05 Thread Hans-Peter Nilsson
On Wed, 4 Nov 2020, H.J. Lu wrote:
> .retain is ill-defined.   For example,
>
> [hjl@gnu-cfl-2 gcc]$ cat /tmp/x.c
> static int xyzzy __attribute__((__used__));
> [hjl@gnu-cfl-2 gcc]$ ./xgcc -B./ -S /tmp/x.c -fcommon
> [hjl@gnu-cfl-2 gcc]$ cat x.s
> .file "x.c"
> .text
> .retain xyzzy  < What does it do?
> .local xyzzy
> .comm xyzzy,4,4
> .ident "GCC: (GNU) 11.0.0 20201103 (experimental)"
> .section .note.GNU-stack,"",@progbits
> [hjl@gnu-cfl-2 gcc]$

To answer that question: it's up to the assembler, but for ELF
and SHF_GNU_RETAIN, it seems obvious it'd tell the assembler to
set SHF_GNU_RETAIN for the section where the symbol ends up.
We both know this isn't rocket science with binutils.

brgds, H-P


Re: [PATCH] "used" attribute saves decl from linker garbage collection

2020-11-05 Thread Jozef Lawrynowicz
On Thu, Nov 05, 2020 at 06:21:21AM -0500, Hans-Peter Nilsson wrote:
> On Wed, 4 Nov 2020, H.J. Lu wrote:
> > .retain is ill-defined.   For example,
> >
> > [hjl@gnu-cfl-2 gcc]$ cat /tmp/x.c
> > static int xyzzy __attribute__((__used__));
> > [hjl@gnu-cfl-2 gcc]$ ./xgcc -B./ -S /tmp/x.c -fcommon
> > [hjl@gnu-cfl-2 gcc]$ cat x.s
> > .file "x.c"
> > .text
> > .retain xyzzy  < What does it do?
> > .local xyzzy
> > .comm xyzzy,4,4
> > .ident "GCC: (GNU) 11.0.0 20201103 (experimental)"
> > .section .note.GNU-stack,"",@progbits
> > [hjl@gnu-cfl-2 gcc]$
> 
> To answer that question: it's up to the assembler, but for ELF
> and SHF_GNU_RETAIN, it seems obvious it'd tell the assembler to
> set SHF_GNU_RETAIN for the section where the symbol ends up.
> We both know this isn't rocket science with binutils.

Indeed, and my patch handles it trivially:
https://sourceware.org/pipermail/binutils/2020-November/113993.html

  +void
  +obj_elf_retain (int arg ATTRIBUTE_UNUSED)
   snip 
  +  sym = get_sym_from_input_line_and_check ();
  +  symbol_get_obj (sym)->retain = 1;

  @@ -2624,6 +2704,9 @@ elf_frob_symbol (symbolS *symp, int *puntp)
}
   }
   
  +  if (symbol_get_obj (symp)->retain)
  +elf_section_flags (S_GET_SEGMENT (symp)) |= SHF_GNU_RETAIN;
  +
 /* Double check weak symbols.  */
 if (S_IS_WEAK (symp))
   {

We could check that the symbol named in the .retain directive has
already been defined; however, this isn't compatible with GCC's
mark_decl_preserved handling, since mark_decl_preserved is called
before the local symbols are defined in the assembly output
file.

GAS should at least validate that the symbol named in the .retain
directive does end up as a symbol though.
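
The deferred handling described above can be illustrated with a small stand-alone model in C. This is only a sketch of the idea, not the binutils implementation: the toy_* types and functions and the TOY_SHF_RETAIN bit are hypothetical stand-ins. The directive only records a per-symbol flag; the section flag is applied later, once it is known where the symbol ended up, and a symbol that never got a section is reported.

```c
#include <stddef.h>

/* Stand-in flag bit for this model; not the real SHF_GNU_RETAIN value.  */
#define TOY_SHF_RETAIN 0x1u

struct toy_section
{
  const char *name;
  unsigned flags;
};

struct toy_symbol
{
  const char *name;
  struct toy_section *section;  /* NULL until the symbol is defined */
  int retain;                   /* set when ".retain" names the symbol */
};

/* Models the directive: only remember the request, since the symbol
   may not have been defined yet.  */
void
toy_mark_retain (struct toy_symbol *sym)
{
  sym->retain = 1;
}

/* Models the late per-symbol fix-up: propagate the request to the
   section holding the symbol.  Returns 0 on success, or -1 if a
   retained symbol never ended up in any section.  */
int
toy_frob_symbol (struct toy_symbol *sym)
{
  if (!sym->retain)
    return 0;
  if (sym->section == NULL)
    return -1;
  sym->section->flags |= TOY_SHF_RETAIN;
  return 0;
}
```

Because the flag is only consulted in the late fix-up, it does not matter whether the directive appears before or after the symbol's definition in the assembly file.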

Thanks,
Jozef


> 
> brgds, H-P


Re: [PATCH] arm: Implement vceqq_p64, vceqz_p64 and vceqzq_p64 intrinsics

2020-11-05 Thread Christophe Lyon via Gcc-patches
On Thu, 5 Nov 2020 at 10:36, Kyrylo Tkachov  wrote:
>
> Hi Christophe,
>
> > -Original Message-
> > From: Gcc-patches  On Behalf Of
> > Christophe Lyon via Gcc-patches
> > Sent: 15 October 2020 18:23
> > To: gcc-patches@gcc.gnu.org
> > Subject: [PATCH] arm: Implement vceqq_p64, vceqz_p64 and vceqzq_p64
> > intrinsics
> >
> > This patch adds implementations for vceqq_p64, vceqz_p64 and
> > vceqzq_p64 intrinsics.
> >
> > vceqq_p64 uses the existing vceq_p64 after splitting the input vectors
> > into their high and low halves.
> >
> > vceqz[q] simply call the vceq and vceqq with a second argument equal
> > to zero.
> >
> > The added (executable) testcases make sure that the poly64x2_t
> > variants have results with one element of all zeroes (false) and the
> > other element with all bits set to one (true).
> >
> > 2020-10-15  Christophe Lyon  
> >
> >   gcc/
> >   * config/arm/arm_neon.h (vceqz_p64, vceqq_p64, vceqzq_p64):
> > New.
> >
> >   gcc/testsuite/
> >   * gcc.target/aarch64/advsimd-intrinsics/p64_p128.c: Add tests for
> >   vceqz_p64, vceqq_p64 and vceqzq_p64.
> > ---
> >  gcc/config/arm/arm_neon.h  | 31 +++
> >  .../aarch64/advsimd-intrinsics/p64_p128.c  | 46
> > +-
> >  2 files changed, 76 insertions(+), 1 deletion(-)
> >
> > diff --git a/gcc/config/arm/arm_neon.h b/gcc/config/arm/arm_neon.h
> > index aa21730..f7eff37 100644
> > --- a/gcc/config/arm/arm_neon.h
> > +++ b/gcc/config/arm/arm_neon.h
> > @@ -16912,6 +16912,37 @@ vceq_p64 (poly64x1_t __a, poly64x1_t __b)
> >return vreinterpret_u64_u32 (__m);
> >  }
> >
> > +__extension__ extern __inline uint64x1_t
> > +__attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
> > +vceqz_p64 (poly64x1_t __a)
> > +{
> > +  poly64x1_t __b = vreinterpret_p64_u32 (vdup_n_u32 (0));
> > +  return vceq_p64 (__a, __b);
> > +}
>
> This approach is okay, but can we have some kind of test to confirm it 
> generates the VCEQ instruction with immediate zero rather than having a 
> separate DUP...

I had checked that manually, but I'll add a test.
However, I have noticed that although vceqz_p64 uses vceq.i32 dX, dY, #0,
the vceqzq_p64 version below first sets
vmov dZ, #0
and then emits two
vmoz dX, dY, dZ

I'm looking at why this happens.
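
For reference while comparing the two code sequences, the lane-wise result these intrinsics are meant to produce can be modelled in portable C. This is a scalar sketch only, assuming the semantics described in the patch (all bits set for equal lanes, zero otherwise); the model_* names and the struct are hypothetical and are not part of arm_neon.h.

```c
#include <stdint.h>

/* Hypothetical two-lane stand-in for poly64x2_t / uint64x2_t.  */
struct model_u64x2
{
  uint64_t lane[2];
};

/* One lane: all ones if equal, zero otherwise.  */
uint64_t
model_vceq_p64 (uint64_t a, uint64_t b)
{
  return a == b ? UINT64_MAX : 0;
}

/* Two lanes: apply the one-lane compare to the low and high halves
   and combine the results, as the patch does with vget_low/vget_high.  */
struct model_u64x2
model_vceqq_p64 (struct model_u64x2 a, struct model_u64x2 b)
{
  struct model_u64x2 r;
  r.lane[0] = model_vceq_p64 (a.lane[0], b.lane[0]);
  r.lane[1] = model_vceq_p64 (a.lane[1], b.lane[1]);
  return r;
}

/* Compare-with-zero variants simply pass a zero second operand.  */
uint64_t
model_vceqz_p64 (uint64_t a)
{
  return model_vceq_p64 (a, 0);
}

struct model_u64x2
model_vceqzq_p64 (struct model_u64x2 a)
{
  struct model_u64x2 zero = { { 0, 0 } };
  return model_vceqq_p64 (a, zero);
}
```

Whatever instruction sequence is finally emitted for the q variant, it should agree with this model lane by lane.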

Thanks,

Christophe


> Thanks,
> Kyrill
>
> > +
> > +/* For vceqq_p64, we rely on vceq_p64 for each of the two elements.  */
> > +__extension__ extern __inline uint64x2_t
> > +__attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
> > +vceqq_p64 (poly64x2_t __a, poly64x2_t __b)
> > +{
> > +  poly64_t __high_a = vget_high_p64 (__a);
> > +  poly64_t __high_b = vget_high_p64 (__b);
> > +  uint64x1_t __high = vceq_p64(__high_a, __high_b);
> > +
> > +  poly64_t __low_a = vget_low_p64 (__a);
> > +  poly64_t __low_b = vget_low_p64 (__b);
> > +  uint64x1_t __low = vceq_p64(__low_a, __low_b);
> > +  return vcombine_u64 (__low, __high);
> > +}
> > +
> > +__extension__ extern __inline uint64x2_t
> > +__attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
> > +vceqzq_p64 (poly64x2_t __a)
> > +{
> > +  poly64x2_t __b = vreinterpretq_p64_u32 (vdupq_n_u32 (0));
> > +  return vceqq_p64 (__a, __b);
> > +}
> > +
> >  /* The vtst_p64 intrinsic does not map to a single instruction.
> > We emulate it in way similar to vceq_p64 above but here we do
> > a reduction with max since if any two corresponding bits
> > diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/p64_p128.c
> > b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/p64_p128.c
> > index a3210a9..6aed096 100644
> > --- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/p64_p128.c
> > +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/p64_p128.c
> > @@ -16,6 +16,11 @@ VECT_VAR_DECL(vbsl_expected,poly,64,2) [] =
> > { 0xfff1,
> >
> >  /* Expected results: vceq.  */
> >  VECT_VAR_DECL(vceq_expected,uint,64,1) [] = { 0x0 };
> > +VECT_VAR_DECL(vceq_expected,uint,64,2) [] = { 0x0, 0x };
> > +
> > +/* Expected results: vceqz.  */
> > +VECT_VAR_DECL(vceqz_expected,uint,64,1) [] = { 0x0 };
> > +VECT_VAR_DECL(vceqz_expected,uint,64,2) [] = { 0x0, 0x };
> >
> >  /* Expected results: vcombine.  */
> >  VECT_VAR_DECL(vcombine_expected,poly,64,2) [] = { 0xfff0,
> > 0x88 };
> > @@ -213,7 +218,7 @@ int main (void)
> >
> >/* vceq_p64 tests. */
> >  #undef TEST_MSG
> > -#define TEST_MSG "VCEQ"
> > +#define TEST_MSG "VCEQ/VCEQQ"
> >
> >  #define TEST_VCOMP1(INSN, Q, T1, T2, T3, W, N)
> >   \
> >VECT_VAR(vceq_vector_res, T3, W, N) =
> >   \
> > @@ -227,16 +232,55 @@ int main (void)
> >DECL_VARIABLE(vceq_vector, poly, 64, 1);
> >DECL_VARIABLE(vceq_vector2, poly, 64, 1);
> >DECL_VARIABLE(vceq_vector_res, uint, 64, 1);
> > +  DECL_VARIABLE(vceq_vector, poly, 64, 2);
> > +  DECL_VARIABLE(vceq_vector2, poly, 64, 2);
> > +  DECL_VARIABLE(vceq_vector_res, uint, 64, 2);
> >
> >

Re: [PATCH] arm: [testcase] Better narrow some bfloat16 testcase

2020-11-05 Thread Christophe Lyon via Gcc-patches
On Thu, 5 Nov 2020 at 12:11, Andrea Corallo  wrote:
>
> Christophe Lyon  writes:
>
> [...]
>
> >> I think you need to add -mfloat-abi=hard to the dg-additional-options
> >> otherwise vld1_lane_bf16_1.c
> >> fails on targets with a soft float-abi default (eg arm-linux-gnueabi).
> >>
> >> See bf16_vldn_1.c.
> >
> > Actually that's not sufficient because in turn we get:
> > /sysroot-arm-none-linux-gnueabi/usr/include/gnu/stubs.h:10:11: fatal
> > error: gnu/stubs-hard.h: No such file or directory
> >
> > So you should check that -mfloat-abi=hard is supported.
> >
> > Ditto for the vst tests.
>
> Hi Christophe,
>
> this patch should implement your suggestions.
>
> On my arm-none-linux-gnueabi setup the tests were already skipped
> as unsupported so if you could test and confirm this fixes the
> issue you see would be great.

Do you know why they are unsupported in your setup?

> diff --git a/gcc/testsuite/lib/target-supports.exp 
> b/gcc/testsuite/lib/target-supports.exp
> index 15f0649f8ae..2ab7e39756d 100644
> --- a/gcc/testsuite/lib/target-supports.exp
> +++ b/gcc/testsuite/lib/target-supports.exp
> @@ -5213,6 +5213,10 @@ proc 
> check_effective_target_arm_v8_2a_bf16_neon_ok_nocache { } {
>  return 0;
>  }
>
> +if { ! [check_effective_target_arm_hard_ok] } {
> + return 0;
> +}
> +
> foreach flags {"" "-mfloat-abi=hard -mfpu=neon-fp-armv8" 
> "-mfloat-abi=softfp -mfpu=neon-fp-armv8" } {
> if { [check_no_compiler_messages_nocache arm_v8_2a_bf16_neon_ok 
> object {
> #include 

This seems strange since you would now exit early if
check_effective_target_arm_hard_ok is false, so you'll never need the
-mfloat-abi=softfp version of the flags.
BTW in general, I think softfp is tried before hard in the other
similar effective targets, any reason the order is different here?

Christophe

>
> Thanks!
>
>   Andrea
>


[Patch] OpenACC (C/C++): Fix 'acc atomic' parsing

2020-11-05 Thread Tobias Burnus

OpenACC piggybacks on OpenMP for the atomic parsing; however, there
are two issues:
* Newer OpenMP versions added additional clauses such as 'seq_cst',
  which do not exist in OpenACC.
* OpenACC 2.6 added 'acc atomic update capture' (besides 'acc atomic capture'),
  which was not accepted.

Actually, while OpenACC 2.6/2.7/3.0 has 'acc atomic update capture' in the
syntax, it never explicitly states that this matches 'atomic capture'.
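
As a stand-alone illustration of how the patch treats 'update capture', the following C sketch mirrors the clause-merging rule outside the parser. It is not the c-parser.c code: the enum and merge_atomic_code are hypothetical names, with CODE_UPDATE standing in for OMP_ATOMIC and CODE_CAPTURE for OMP_ATOMIC_CAPTURE_NEW.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the tree codes used by the parser.  */
enum atomic_code { CODE_NONE, CODE_READ, CODE_WRITE, CODE_UPDATE, CODE_CAPTURE };

/* Merge NEW_CODE into *CODE.  Returns false where the parser would
   report "too many atomic clauses".  For OpenACC only, 'capture'
   following 'update' is accepted and treated as 'capture'.  */
bool
merge_atomic_code (enum atomic_code *code, enum atomic_code new_code,
                   bool openacc)
{
  if (openacc && *code == CODE_UPDATE && new_code == CODE_CAPTURE)
    {
      *code = new_code;
      return true;
    }
  if (*code != CODE_NONE)
    return false;
  *code = new_code;
  return true;
}
```

Note that, as in the patch, only the order 'update capture' is merged; 'capture update' still hits the "too many" path in this sketch.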

NOTE: I did not check whether the expressions supported by OpenMP 5.0/the
current GCC implementation are the same as in OpenACC 2.6/2.7/3.x.

Any comments?

Tobias

-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter
OpenACC (C/C++): Fix 'acc atomic' parsing

gcc/c/ChangeLog:

	* c-parser.c (c_parser_omp_atomic): Add openacc parameter and update
	OpenACC matching.
	(c_parser_omp_construct): Update call.

gcc/cp/ChangeLog:

	* parser.c (cp_parser_omp_atomic): Add openacc parameter and update
	OpenACC matching.
	(cp_parser_omp_construct): Update call.

gcc/testsuite/ChangeLog:

	* c-c++-common/goacc-gomp/atomic.c: New test.
	* c-c++-common/goacc/atomic.c: New test.

 gcc/c/c-parser.c   | 40 +++-
 gcc/cp/parser.c| 39 ++-
 gcc/testsuite/c-c++-common/goacc-gomp/atomic.c | 43 ++
 gcc/testsuite/c-c++-common/goacc/atomic.c  | 30 ++
 4 files changed, 124 insertions(+), 28 deletions(-)

diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index fc97aa3f95f..79037d98f76 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -17304,7 +17304,7 @@ c_parser_oacc_wait (location_t loc, c_parser *parser, char *p_name)
   LOC is the location of the #pragma token.  */
 
 static void
-c_parser_omp_atomic (location_t loc, c_parser *parser)
+c_parser_omp_atomic (location_t loc, c_parser *parser, bool openacc)
 {
   tree lhs = NULL_TREE, rhs = NULL_TREE, v = NULL_TREE;
   tree lhs1 = NULL_TREE, rhs1 = NULL_TREE;
@@ -17343,17 +17343,17 @@ c_parser_omp_atomic (location_t loc, c_parser *parser)
 	new_code = OMP_ATOMIC;
 	  else if (!strcmp (p, "capture"))
 	new_code = OMP_ATOMIC_CAPTURE_NEW;
-	  else if (!strcmp (p, "seq_cst"))
+	  else if (!openacc && !strcmp (p, "seq_cst"))
 	new_memory_order = OMP_MEMORY_ORDER_SEQ_CST;
-	  else if (!strcmp (p, "acq_rel"))
+	  else if (!openacc && !strcmp (p, "acq_rel"))
 	new_memory_order = OMP_MEMORY_ORDER_ACQ_REL;
-	  else if (!strcmp (p, "release"))
+	  else if (!openacc && !strcmp (p, "release"))
 	new_memory_order = OMP_MEMORY_ORDER_RELEASE;
-	  else if (!strcmp (p, "acquire"))
+	  else if (!openacc && !strcmp (p, "acquire"))
 	new_memory_order = OMP_MEMORY_ORDER_ACQUIRE;
-	  else if (!strcmp (p, "relaxed"))
+	  else if (!openacc && !strcmp (p, "relaxed"))
 	new_memory_order = OMP_MEMORY_ORDER_RELAXED;
-	  else if (!strcmp (p, "hint"))
+	  else if (!openacc && !strcmp (p, "hint"))
 	{
 	  c_parser_consume_token (parser);
 	  clauses = c_parser_omp_clause_hint (parser, clauses);
@@ -17362,15 +17362,24 @@ c_parser_omp_atomic (location_t loc, c_parser *parser)
 	  else
 	{
 	  p = NULL;
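-	  error_at (cloc, "expected %<read%>, %<write%>, %<update%>, "
-			  "%<capture%>, %<seq_cst%>, %<acq_rel%>, "
-			  "%<release%>, %<relaxed%> or %<hint%> clause");
+	  if (openacc)
+		error_at (cloc, "expected %<read%>, %<write%>, %<update%>, "
+"or %<capture%> clause");
+	  else
+		error_at (cloc, "expected %<read%>, %<write%>, %<update%>, "
+"%<capture%>, %<seq_cst%>, %<acq_rel%>, "
+"%<release%>, %<relaxed%> or %<hint%> clause");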
 	}
 	  if (p)
 	{
 	  if (new_code != ERROR_MARK)
 		{
-		  if (code != ERROR_MARK)
+		  /* OpenACC permits 'update capture'.  */
+		  if (openacc
+		  && code == OMP_ATOMIC
+		  && new_code == OMP_ATOMIC_CAPTURE_NEW)
+		code = new_code;
+		  else if (code != ERROR_MARK)
 		error_at (cloc, "too many atomic clauses");
 		  else
 		code = new_code;
@@ -17392,7 +17401,9 @@ c_parser_omp_atomic (location_t loc, c_parser *parser)
 
   if (code == ERROR_MARK)
 code = OMP_ATOMIC;
-  if (memory_order == OMP_MEMORY_ORDER_UNSPECIFIED)
+  if (openacc)
+memory_order = OMP_MEMORY_ORDER_RELAXED;
+  else if (memory_order == OMP_MEMORY_ORDER_UNSPECIFIED)
 {
   omp_requires_mask
 	= (enum omp_requires) (omp_requires_mask
@@ -17448,6 +17459,7 @@ c_parser_omp_atomic (location_t loc, c_parser *parser)
 	  }
 	break;
   case OMP_ATOMIC:
+ /* case OMP_ATOMIC_CAPTURE_NEW: - or update to OpenMP 5.1 */
 	if (memory_order == OMP_MEMORY_ORDER_ACQ_REL
 	|| memory_order == OMP_MEMORY_ORDER_ACQUIRE)
 	  {
@@ -21489,7 +21501,7 @@ c_parser_omp_construct (c_parser *parser, bool *if_p)
   switch (p_kind)
 {
 case PRAGMA_OACC_ATOMIC:
-  c_parser_omp_atomic (loc, parser);
+  c_parser_omp_atomic (loc, parser, true);
   return;
 case PRAGMA_OACC_CACHE:
   strcpy (p_name, "#pragma acc");
@@ -21516,7 +21528,7 @@ c_parser_omp_construct (c_parser *pars

Re: [Patch] OpenACC (C/C++): Fix 'acc atomic' parsing

2020-11-05 Thread Jakub Jelinek via Gcc-patches
On Thu, Nov 05, 2020 at 01:03:38PM +0100, Tobias Burnus wrote:
> OpenACC piggybacks on OpenMP for the atomic parsing; however, there
> are two issues:
> * Newer OpenMP versions added additional clauses such as 'seq_cst',
>   which do not exist in OpenACC.
> * OpenACC 2.6 added 'acc atomic update capture' (besides 'acc atomic capture'),
>   which was not accepted.
> 
> Actually, while OpenACC 2.6/2.7/3.0 has 'acc atomic update capture' in the
> syntax, it never explicitly states that this matches 'atomic capture'.
> 
> NOTE: I did not check whether the supported expressions by OpenMP 5.0/the
> current GCC implementation is the same as in OpenACC 2.6/2.7/3.x.
> 
> Any comments?

> OpenACC (C/C++): Fix 'acc atomic' parsing
> 
> gcc/c/ChangeLog:
> 
>   * c-parser.c (c_parser_omp_atomic): Add openacc parameter and update
>   OpenACC matching.
>   (c_parser_omp_construct): Update call.
> 
> gcc/cp/ChangeLog:
> 
>   * parser.c (cp_parser_omp_atomic): Add openacc parameter and update
>   OpenACC matching.
>   (cp_parser_omp_construct): Update call.
> 
> gcc/testsuite/ChangeLog:
> 
>   * c-c++-common/goacc-gomp/atomic.c: New test.
>   * c-c++-common/goacc/atomic.c: New test.
> 
>  gcc/c/c-parser.c   | 40 +++-
>  gcc/cp/parser.c| 39 ++-
>  gcc/testsuite/c-c++-common/goacc-gomp/atomic.c | 43 
> ++
>  gcc/testsuite/c-c++-common/goacc/atomic.c  | 30 ++
>  4 files changed, 124 insertions(+), 28 deletions(-)
> 
> diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
> index fc97aa3f95f..79037d98f76 100644
> --- a/gcc/c/c-parser.c
> +++ b/gcc/c/c-parser.c
> @@ -17304,7 +17304,7 @@ c_parser_oacc_wait (location_t loc, c_parser *parser, 
> char *p_name)
>LOC is the location of the #pragma token.  */
>  
>  static void
> -c_parser_omp_atomic (location_t loc, c_parser *parser)
> +c_parser_omp_atomic (location_t loc, c_parser *parser, bool openacc)
>  {
>tree lhs = NULL_TREE, rhs = NULL_TREE, v = NULL_TREE;
>tree lhs1 = NULL_TREE, rhs1 = NULL_TREE;
> @@ -17343,17 +17343,17 @@ c_parser_omp_atomic (location_t loc, c_parser 
> *parser)
>   new_code = OMP_ATOMIC;
> else if (!strcmp (p, "capture"))
>   new_code = OMP_ATOMIC_CAPTURE_NEW;
> -   else if (!strcmp (p, "seq_cst"))
> +   else if (!openacc && !strcmp (p, "seq_cst"))
>   new_memory_order = OMP_MEMORY_ORDER_SEQ_CST;
> -   else if (!strcmp (p, "acq_rel"))
> +   else if (!openacc && !strcmp (p, "acq_rel"))
>   new_memory_order = OMP_MEMORY_ORDER_ACQ_REL;
> -   else if (!strcmp (p, "release"))
> +   else if (!openacc && !strcmp (p, "release"))
>   new_memory_order = OMP_MEMORY_ORDER_RELEASE;
> -   else if (!strcmp (p, "acquire"))
> +   else if (!openacc && !strcmp (p, "acquire"))
>   new_memory_order = OMP_MEMORY_ORDER_ACQUIRE;
> -   else if (!strcmp (p, "relaxed"))
> +   else if (!openacc && !strcmp (p, "relaxed"))
>   new_memory_order = OMP_MEMORY_ORDER_RELAXED;
> -   else if (!strcmp (p, "hint"))
> +   else if (!openacc && !strcmp (p, "hint"))
>   {
> c_parser_consume_token (parser);
> clauses = c_parser_omp_clause_hint (parser, clauses);
> @@ -17362,15 +17362,24 @@ c_parser_omp_atomic (location_t loc, c_parser 
> *parser)
> else
>   {
> p = NULL;
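> -   error_at (cloc, "expected %<read%>, %<write%>, %<update%>, "
> -   "%<capture%>, %<seq_cst%>, %<acq_rel%>, "
> -   "%<release%>, %<relaxed%> or %<hint%> clause");
> +   if (openacc)
> + error_at (cloc, "expected %<read%>, %<write%>, %<update%>, "
> + "or %<capture%> clause");
> +   else
> + error_at (cloc, "expected %<read%>, %<write%>, %<update%>, "
> + "%<capture%>, %<seq_cst%>, %<acq_rel%>, "
> + "%<release%>, %<relaxed%> or %<hint%> clause");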

Wouldn't it be much simpler and more readable to do:
  else if (!strcmp (p, "capture"))
new_code = OMP_ATOMIC_CAPTURE_NEW;
+ else if (openacc)
+   {
+ p = NULL;
+ error_at (cloc, "expected %<read%>, %<write%>, %<update%>, "
+ "or %<capture%> clause");
+   }
  else if (!strcmp (p, "seq_cst"))
new_memory_order = OMP_MEMORY_ORDER_SEQ_CST;
... - handling of other openmp only clauses here
  else
{
  p = NULL;
  error_at (cloc, "expected %<read%>, %<write%>, %<update%>, "
  "%<capture%>, %<seq_cst%>, %<acq_rel%>, "
  "%<release%>, %<relaxed%> or %<hint%> clause");
}
?
Ditto C++.
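
The shape suggested above can be tried out in isolation with a small C sketch. It is a simplified model, not the parser code: the enum and classify_atomic_clause are hypothetical names, and the memory-order clauses are lumped into one result for brevity. The point is that a single early "else if (openacc)" replaces the per-clause "!openacc &&" tests.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical result kinds for this model.  */
enum clause_kind
{
  CLAUSE_ERROR, CLAUSE_READ, CLAUSE_WRITE, CLAUSE_UPDATE,
  CLAUSE_CAPTURE, CLAUSE_MEMORY_ORDER, CLAUSE_HINT
};

/* Classify clause name P.  With OPENACC set, anything beyond the four
   shared clauses is rejected at one place.  */
enum clause_kind
classify_atomic_clause (const char *p, bool openacc)
{
  if (!strcmp (p, "read"))
    return CLAUSE_READ;
  else if (!strcmp (p, "write"))
    return CLAUSE_WRITE;
  else if (!strcmp (p, "update"))
    return CLAUSE_UPDATE;
  else if (!strcmp (p, "capture"))
    return CLAUSE_CAPTURE;
  else if (openacc)
    return CLAUSE_ERROR;        /* OpenACC: only the four clauses above */
  else if (!strcmp (p, "seq_cst") || !strcmp (p, "acq_rel")
           || !strcmp (p, "release") || !strcmp (p, "acquire")
           || !strcmp (p, "relaxed"))
    return CLAUSE_MEMORY_ORDER; /* OpenMP-only memory-order clauses */
  else if (!strcmp (p, "hint"))
    return CLAUSE_HINT;         /* OpenMP-only */
  return CLAUSE_ERROR;
}
```

In the real parser the two CLAUSE_ERROR paths would each issue their own error_at message, which is what keeps the OpenACC diagnostic separate from the OpenMP one.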

Otherwise LGTM, but I have no idea what OpenACC actually says...

Jakub



[PATCH] Clean up loop-closed PHIs at loopdone pass

2020-11-05 Thread guojiufu via Gcc-patches
In PR87473, there are discussions about loop-closed PHIs, which
are generated for loop optimization passes.  It would be helpful
to clean them up after loop optimization is done, since this may
simplify the work of the following passes.
This patch introduces a cheaper way to propagate them out in
pass_tree_loop_done.
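
As a rough illustration of what "propagating out" a single-argument loop-closed PHI such as x_2 = PHI <x_1> means, here is a toy model in plain C. It is not GIMPLE and not the code in this patch: struct toy_stmt and toy_propagate are hypothetical, names are just integers, and the only safety check modelled is skipping the statement that defines the replacement value.

```c
#include <stddef.h>

#define TOY_MAX_USES 2

/* Hypothetical miniature statement: defines one name, uses up to two.  */
struct toy_stmt
{
  int def;                 /* name defined by this statement */
  int uses[TOY_MAX_USES];  /* names used by this statement */
  size_t nuses;
};

/* Replace every use of LHS with RHS in STMTS[0..N-1], skipping the
   statement that defines RHS.  Returns the number of uses replaced.  */
size_t
toy_propagate (struct toy_stmt *stmts, size_t n, int lhs, int rhs)
{
  size_t replaced = 0;

  for (size_t i = 0; i < n; i++)
    {
      if (stmts[i].def == rhs)
        continue;  /* do not propagate into the definition of RHS */
      for (size_t j = 0; j < stmts[i].nuses; j++)
        if (stmts[i].uses[j] == lhs)
          {
            stmts[i].uses[j] = rhs;
            replaced++;
          }
    }
  return replaced;
}
```

Once no uses of the PHI result remain, the PHI itself is dead and can be removed, which is the clean-up this patch is after.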

This patch passes bootstrap and regtest on ppc64le.  Is this ok for trunk?

gcc/ChangeLog
2020-10-05  Jiufu Guo   

* tree-ssa-loop.h (clean_up_loop_closed_phi): New declaration.
* tree-ssa-loop.c (tree_ssa_loop_done): Call clean_up_loop_closed_phi.
* tree-ssa-propagate.c (propagate_rhs_into_lhs): New function.

gcc/testsuite/ChangeLog
2020-10-05  Jiufu Guo   

* gcc.dg/tree-ssa/loopclosedphi.c: New test.
---
 gcc/testsuite/gcc.dg/tree-ssa/loopclosedphi.c |  21 +++
 gcc/tree-ssa-loop.c   |   1 +
 gcc/tree-ssa-loop.h   |   1 +
 gcc/tree-ssa-propagate.c  | 120 ++
 4 files changed, 143 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/loopclosedphi.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/loopclosedphi.c 
b/gcc/testsuite/gcc.dg/tree-ssa/loopclosedphi.c
new file mode 100644
index 000..d71b757fbca
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/loopclosedphi.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fno-tree-ch -w -fdump-tree-loopdone-details" } */
+
+void
+t6 (int qz, int wh)
+{
+  int jl = wh;
+
+  while (1.0 * qz / wh < 1)
+{
+  qz = wh * (wh + 2);
+
+  while (wh < 1)
+jl = 0;
+}
+
+  while (qz < 1)
+qz = jl * wh;
+}
+
+/* { dg-final { scan-tree-dump-times "Replacing" 2 "loopdone"} } */
diff --git a/gcc/tree-ssa-loop.c b/gcc/tree-ssa-loop.c
index 5e8365d4e83..7d680b2f5d2 100644
--- a/gcc/tree-ssa-loop.c
+++ b/gcc/tree-ssa-loop.c
@@ -530,6 +530,7 @@ tree_ssa_loop_done (void)
   free_numbers_of_iterations_estimates (cfun);
   scev_finalize ();
   loop_optimizer_finalize ();
+  clean_up_loop_closed_phi (cfun);
   return 0;
 }
 
diff --git a/gcc/tree-ssa-loop.h b/gcc/tree-ssa-loop.h
index 9e35125e6e8..baa940b9d1e 100644
--- a/gcc/tree-ssa-loop.h
+++ b/gcc/tree-ssa-loop.h
@@ -67,6 +67,7 @@ public:
 extern bool for_each_index (tree *, bool (*) (tree, tree *, void *), void *);
 extern char *get_lsm_tmp_name (tree ref, unsigned n, const char *suffix = 
NULL);
 extern unsigned tree_num_loop_insns (class loop *, struct eni_weights *);
+extern unsigned clean_up_loop_closed_phi (function *);
 
 /* Returns the loop of the statement STMT.  */
 
diff --git a/gcc/tree-ssa-propagate.c b/gcc/tree-ssa-propagate.c
index 87dbf55fab9..813143852b9 100644
--- a/gcc/tree-ssa-propagate.c
+++ b/gcc/tree-ssa-propagate.c
@@ -1549,4 +1549,123 @@ propagate_tree_value_into_stmt (gimple_stmt_iterator 
*gsi, tree val)
   else
 gcc_unreachable ();
 }
+
+/* Propagate RHS into all uses of LHS (when possible).
+
+   RHS and LHS are derived from STMT, which is passed in solely so
+   that we can remove it if propagation is successful.  */
+
+static bool
+propagate_rhs_into_lhs (gphi *stmt, tree lhs, tree rhs)
+{
+  use_operand_p use_p;
+  imm_use_iterator iter;
+  gimple_stmt_iterator gsi;
+  gimple *use_stmt;
+  bool changed = false;
+  bool all = true;
+
+  /* Dump details.  */
+  if (dump_file && (dump_flags & TDF_DETAILS))
+{
+  fprintf (dump_file, "  Replacing '");
+  print_generic_expr (dump_file, lhs, dump_flags);
+  fprintf (dump_file, "' with '");
+  print_generic_expr (dump_file, rhs, dump_flags);
+  fprintf (dump_file, "'\n");
+}
+
+  /* Walk over every use of LHS and try to replace the use with RHS. */
+  FOR_EACH_IMM_USE_STMT (use_stmt, iter, lhs)
+{
+  /* It is not safe to propagate into below stmts.  */
+  if (gimple_debug_bind_p (use_stmt)
+ || (gimple_code (use_stmt) == GIMPLE_ASM
+ && !may_propagate_copy_into_asm (lhs))
+ || (TREE_CODE (rhs) == SSA_NAME
+ && SSA_NAME_DEF_STMT (rhs) == use_stmt))
+   {
+ all = false;
+ continue;
+   }
+
+  /* Dump details.  */
+  if (dump_file && (dump_flags & TDF_DETAILS))
+   {
+ fprintf (dump_file, "Original statement:");
+ print_gimple_stmt (dump_file, use_stmt, 0, dump_flags);
+   }
+
+  /* Propagate the RHS into this use of the LHS.  */
+  FOR_EACH_IMM_USE_ON_STMT (use_p, iter)
+   propagate_value (use_p, rhs);
+
+  /* Propagation may expose new operands to the renamer.  */
+  update_stmt (use_stmt);
+
+  /* If variable index is replaced with a constant, then
+update the invariant flag for ADDR_EXPRs.  */
+  if (gimple_assign_single_p (use_stmt)
+ && TREE_CODE (gimple_assign_rhs1 (use_stmt)) == ADDR_EXPR)
+   recompute_tree_invariant_for_addr_expr (gimple_assign_rhs1 (use_stmt));
+
+  /* Dump details.  */
+  if (dump_file && (dump_flags & TDF_DETAILS))
+   {
+ fprintf (dump_file,

Re: [21/32] miscelaneous

2020-11-05 Thread Richard Biener via Gcc-patches
On Tue, Nov 3, 2020 at 10:16 PM Nathan Sidwell  wrote:
>
> These are changes to gcc/tree.h adding some raw accessors to nodes,
> which seemed preferable to direct field access.  I also needed access to
> the integral constant cache

Can you please document the adjusted interface to cache_integer_cst in
its (non-existing) function level comment?  It looks like 'replace' == true
turns it into a get_or_insert, whereas it is currently a put with an
assertion that the value wasn't already in the cache.

Otherwise OK.

Thanks,
Richard.

>
> --
> Nathan Sidwell
>


Re: [PATCH] libstdc++: Implement C++20 features for <sstream>

2020-11-05 Thread Jonathan Wakely via Gcc-patches

On 04/11/20 23:41 +, Jonathan Wakely wrote:

On 04/11/20 21:45 +, Jonathan Wakely wrote:

On 04/11/20 12:43 -0800, Thomas Rodgers wrote:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97719


On Nov 4, 2020, at 11:54 AM, Stephan Bergmann  wrote:

On 07/10/2020 18:55, Thomas Rodgers wrote:

From: Thomas Rodgers 
New ctors and ::view() accessor for -
* basic_stringbuf
* basic_istringstream
* basic_ostringstream
* basic_stringstream
New ::get_allocator() accessor for basic_stringbuf.

I found that this "libstdc++: Implement C++20 features for <sstream>" changed the behavior of


$ cat test.cc
#include <iostream>
#include <iterator>
#include <sstream>
int main() {
std::stringstream s("a");
std::istreambuf_iterator<char> i(s);
if (i != std::istreambuf_iterator<char>()) std::cout << *i << '\n';
}
$ g++ -std=c++20 test.cc
$ ./a.out


from printing "a" to printing nothing.  (The `i != ...` comparison appears to change i 
from pointing at "a" to pointing to null, and returns false.)

I ran into this when building LibreOffice, and I hope test.cc is a faithfully 
minimized reproducer.  However, I know little about std::istreambuf_iterator, 
so it may well be that the code isn't even valid.



I'm testing this patch.


Tested powerpc64le-linux. Pushed now.


And this fixes some other bugs in the new constructors.

Tested powerpc64le-linux, pushed to trunk.

commit 432258be4f2cf4f0970f106db319e3dbab4ab13d
Author: Jonathan Wakely 
Date:   Thu Nov 5 12:16:13 2020

libstdc++: Fix new <sstream> constructors

- Add a missing 'explicit' to a basic_stringbuf constructor.
- Set up the get/put area pointers in the constructor from strings using
  different allocator types.
- Remove public basic_stringbuf::__sv_type alias.
- Do not construct temporary basic_string objects with a
  default-constructed allocator.

Also, change which basic_string constructor is used, as a minor
compile-time optimization. Constructing from a basic_string_view
requires more work from the compiler, so just use a pointer and length.
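
As a rough illustration of the last point (a sketch under stated assumptions, not the libstdc++ implementation): a string using one allocator type can be copied into a string using another by passing a pointer and a length, without going through basic_string_view. The names other_alloc_sketch and copy_from_other_alloc_sketch are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical minimal allocator, a different type from std::allocator<char>.
template<typename T>
struct other_alloc_sketch
{
  using value_type = T;

  other_alloc_sketch() = default;

  template<typename U>
    other_alloc_sketch(const other_alloc_sketch<U>&) { }

  T* allocate(std::size_t n)
  { return std::allocator<T>().allocate(n); }

  void deallocate(T* p, std::size_t n)
  { std::allocator<T>().deallocate(p, n); }
};

template<typename T, typename U>
bool operator==(const other_alloc_sketch<T>&, const other_alloc_sketch<U>&)
{ return true; }

template<typename T, typename U>
bool operator!=(const other_alloc_sketch<T>&, const other_alloc_sketch<U>&)
{ return false; }

// Copy S into a std::string that uses allocator A, via pointer and length.
template<typename SAlloc>
std::string
copy_from_other_alloc_sketch(
    const std::basic_string<char, std::char_traits<char>, SAlloc>& s,
    const std::allocator<char>& a)
{ return std::string(s.data(), s.size(), a); }
```

The (pointer, length, allocator) constructor is a plain non-template overload, so no string-view constraint has to be checked to select it.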

libstdc++-v3/ChangeLog:

* include/std/sstream (basic_stringbuf(const allocator_type&)):
Add explicit.
(basic_stringbuf(const basic_string&, openmode, const A&)):
Call _M_stringbuf_init. Construct _M_string from pointer and length
to avoid constraint checks for string view.
(basic_stringbuf::view()): Make __sv_type alias local to the
function.
(basic_istringstream(const basic_string&, openmode, const A&)):
Pass string to _M_streambuf instead of constructing a temporary
with the wrong allocator.
(basic_ostringstream(const basic_string&, openmode, const A&)):
Likewise.
(basic_stringstream(const basic_string&, openmode, const A&)):
Likewise.
* src/c++20/sstream-inst.cc: Use string_view and wstring_view
typedefs in explicit instantiations.
* testsuite/27_io/basic_istringstream/cons/char/1.cc: Add more
tests for constructors.
* testsuite/27_io/basic_ostringstream/cons/char/1.cc: Likewise.
* testsuite/27_io/basic_stringbuf/cons/char/1.cc: Likewise.
* testsuite/27_io/basic_stringbuf/cons/char/2.cc: Likewise.
* testsuite/27_io/basic_stringbuf/cons/wchar_t/1.cc: Likewise.
* testsuite/27_io/basic_stringbuf/cons/wchar_t/2.cc: Likewise.
* testsuite/27_io/basic_stringstream/cons/char/1.cc: Likewise.

diff --git a/libstdc++-v3/include/std/sstream b/libstdc++-v3/include/std/sstream
index 276badfd9657..437e2ba2a5f8 100644
--- a/libstdc++-v3/include/std/sstream
+++ b/libstdc++-v3/include/std/sstream
@@ -166,8 +166,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
 #endif
 
 #if __cplusplus > 201703L && _GLIBCXX_USE_CXX11_ABI
-  using __sv_type = basic_string_view<char_type, traits_type>;
-
+  explicit
   basic_stringbuf(const allocator_type& __a)
   : basic_stringbuf(ios_base::in | std::ios_base::out, __a)
   { }
@@ -185,18 +184,18 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
   { _M_stringbuf_init(__mode); }
 
   template<typename _SAlloc>
-  basic_stringbuf(const basic_string<_CharT, _Traits, _SAlloc>& __s,
-		  const allocator_type& __a)
-  : basic_stringbuf(__s, ios_base::in | std::ios_base::out, __a)
-  { }
+	basic_stringbuf(const basic_string<_CharT, _Traits, _SAlloc>& __s,
+			const allocator_type& __a)
+	: basic_stringbuf(__s, ios_base::in | std::ios_base::out, __a)
+	{ }
 
   template<typename _SAlloc>
-  basic_stringbuf(const basic_string<_CharT, _Traits, _SAlloc>& __s,
-		  ios_base::openmode __mode,
-		  const allocator_type& __a)
-  : __streambuf_type(), _M_mode(__mode),
-  _M_string(static_cast<__sv_type>(__s), __a)
-  { }
+	basic_stringbuf(const basic_string<_CharT, _Traits, _SAlloc>& __s,
+			ios_base::openmode __mode,
+			const allocator_type& __a)
+	: __s

Re: [00/32] C++ 20 Modules

2020-11-05 Thread Richard Biener via Gcc-patches
On Tue, Nov 3, 2020 at 10:12 PM Nathan Sidwell  wrote:
>
> Here is the implementation of C++20 modules that I have been developing
> on the devel/c++-modules branch over the last few years.
>
> It is not a complete implementation.  The major missing pieces are:
>
> 1) Private Module Fragment
>The syntax is recognized and a sorry emitted
>
> 2) textually parsing a duplicate global module definition when a
> definition has already been read from a header-unit.  (the converse is
> supported)
>
> 3) Complete type (in)visibility when provided in implementation
> partitions that are imported into the primary interface.  Users will see
> the type as complete.
>
> 4) Internal linkage reachability rules from exported entities.  We're
> likely to accept ill-formed programs.  This will not cause us to reject
> well-formed programs.
>
> It is some 25K new lines of code (plus testsuite).  There are about 48
> FIXMEs, nearly all in module.cc and the remaining in name-lookup.c. Of
> these 12 are QOI comments.  The remaining 36 probably fall into the
> following categories:
> 1/3 are repeating a FIXME mentioned elsewhere
> 1/3 are already resolved, or have become irrelevant
> 1/3 are defects (an above missing feature, a QOI issue, or something else).
>
> I believe there is time in stage 1 to address the most significant ones.
>
> I have bootstrapped and tested on:
> x86_64-linux
> aarch64-linux
> powerpc8le-linux
> powerpc8-aix
>
> Iain Sandoe has been regularly bootstrapping on x86_64-darwin.  Joseph
> Myers graciously built for i686-mingw host.  We eventually ran into
> compilation errors in the analyzer, as it seemed unprepared for an
> IL32P64 host.
>
> I have attempted to break the patches apart into coherent pieces.  But
> they are somewhat interconnected.
>
> 01-langhooks.diff             New langhooks
> 02-cpp-line-maps.diff         line-map pieces
> 03-cpp-callbacks.diff         Preprocessor callbacks
> 04-cpp-lexer.diff             There are new lexing requirements
> 05-cpp-files.diff             ... and file reading functionality
> 06-cpp-macro.diff             ... and macro expansion rules
> 07-cpp-main.diff              Main file reading
> 08-cpp-mkdeps.diff            Dependency generation
> 09-core-diag.diff             Core diagnostics
> 10-core-config.diff           Autoconf
> 11-core-parmtime.diff         parameters and time instrumentation
> 12-core-doc.diff              User documentation
> 13-family-options.diff        New options
> 14-family-keywords.diff       New keyword
> 15-c++-lexer.diff             New C++ lexing
> 16-c++-infra.diff             C++ infrastructure interfaces
> 17-c++-infra-constexpr.diff   new constexpr interfacing
> 18-c++-infra-template.diff    new template interfacing
> 19-global-trees.diff          Global tree ordering
> 20-c++-dynctor.diff           Dynamic constructor generation
> 21-core-rawbits.diff          Some core node bits
> 22-c++-otherbits.diff         Miscellaneous C++ changes
> 23-libcody.diff               Libcody
> 24-c++-mapper.diff            Module Mapper
> 25-c++-modules.diff           The Modules file
> 26-c++-name-lookup.diff       Name lookup changes
> 27-c++-parser.diff            Parser changes
> 28-c++-langhooks.diff         Lang hooks implementation
> 29-c++-make.diff              Makefile changes
> 30-test-harness.diff          Testharness changes
> 31-testsuite.diff             The testsuite
> 32-aix-fixincl.diff           AIX fixinclude
>
> Nearly all of this is within gcc/cp and libcpp/ directories.  There are
> a few changes to gcc/ and more changes in gcc/c-family/  It is likely
> that this patchset will cause breakages, for that I apologize (please
> try the modules branch and report early).
>
> My understanding is that a Global Maintainer's approval is needed for
> such a large patchset.  It'd be good to get this in as early in stage 3
> as possible (if stage 1 expires).

From an RM perspective this is OK if merging doesn't drag itself too
far along.  Expect build & install fallout from the more weird hosts & targets
we have though.

Moving the module mapper to a more easily (build-)testable location
and to a place where host dependences can be more easily fixed
& customized than in a bootstrapped directory would be nice.  Thus,
I think the module mapper should be in the toplevel somehow
and independently buildable.

Richard.

> Definitely the most important event of today :)  But don't forget to vote.
>
> nathan
>
> --
> Nathan Sidwell


Re: [PATCH] Clean up loop-closed PHIs at loopdone pass

2020-11-05 Thread Richard Biener via Gcc-patches
On Thu, Nov 5, 2020 at 2:19 PM guojiufu via Gcc-patches
 wrote:
>
> In PR87473, there are discussions about loop-closed PHIs which
> are generated for loop optimization passes.  It would be helpful
> to clean them up after loop optimization is done, then this may
> simplify some jobs of following passes.
> This patch introduces a cheaper way to propagate them out in
> pass_tree_loop_done.
>
> This patch passes bootstrap and regtest on ppc64le.  Is this ok for trunk?

Huh, I think this is somewhat useless work, the PHIs won't survive for long
and you certainly cannot expect degenerate PHIs to not occur anyway.
You probably can replace propagate_rhs_into_lhs by the
existing replace_uses_by function.  You're walking loop exits
after loop_optimizer_finalize () - that's wasting work.  If you want to
avoid inconsistent state and we really want to go with this I suggest
to instead add a flag to loop_optimizer_finalize () as to whether to
propagate out LC PHI nodes or not and do this from within there.

Thanks,
Richard.

> gcc/ChangeLog
> 2020-10-05  Jiufu Guo   
>
> * tree-ssa-loop.h (clean_up_loop_closed_phi): New declaration.
> * tree-ssa-loop.c (tree_ssa_loop_done): Call clean_up_loop_closed_phi.
> * tree-ssa-propagate.c (propagate_rhs_into_lhs): New function.
>
> gcc/testsuite/ChangeLog
> 2020-10-05  Jiufu Guo   
>
> * gcc.dg/tree-ssa/loopclosedphi.c: New test.
> ---
>  gcc/testsuite/gcc.dg/tree-ssa/loopclosedphi.c |  21 +++
>  gcc/tree-ssa-loop.c   |   1 +
>  gcc/tree-ssa-loop.h   |   1 +
>  gcc/tree-ssa-propagate.c  | 120 ++
>  4 files changed, 143 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/loopclosedphi.c
>
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/loopclosedphi.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/loopclosedphi.c
> new file mode 100644
> index 000..d71b757fbca
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/loopclosedphi.c
> @@ -0,0 +1,21 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -fno-tree-ch -w -fdump-tree-loopdone-details" } */
> +
> +void
> +t6 (int qz, int wh)
> +{
> +  int jl = wh;
> +
> +  while (1.0 * qz / wh < 1)
> +{
> +  qz = wh * (wh + 2);
> +
> +  while (wh < 1)
> +jl = 0;
> +}
> +
> +  while (qz < 1)
> +qz = jl * wh;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "Replacing" 2 "loopdone"} } */
> diff --git a/gcc/tree-ssa-loop.c b/gcc/tree-ssa-loop.c
> index 5e8365d4e83..7d680b2f5d2 100644
> --- a/gcc/tree-ssa-loop.c
> +++ b/gcc/tree-ssa-loop.c
> @@ -530,6 +530,7 @@ tree_ssa_loop_done (void)
>free_numbers_of_iterations_estimates (cfun);
>scev_finalize ();
>loop_optimizer_finalize ();
> +  clean_up_loop_closed_phi (cfun);
>return 0;
>  }
>
> diff --git a/gcc/tree-ssa-loop.h b/gcc/tree-ssa-loop.h
> index 9e35125e6e8..baa940b9d1e 100644
> --- a/gcc/tree-ssa-loop.h
> +++ b/gcc/tree-ssa-loop.h
> @@ -67,6 +67,7 @@ public:
>  extern bool for_each_index (tree *, bool (*) (tree, tree *, void *), void *);
>  extern char *get_lsm_tmp_name (tree ref, unsigned n, const char *suffix = 
> NULL);
>  extern unsigned tree_num_loop_insns (class loop *, struct eni_weights *);
> +extern unsigned clean_up_loop_closed_phi (function *);
>
>  /* Returns the loop of the statement STMT.  */
>
> diff --git a/gcc/tree-ssa-propagate.c b/gcc/tree-ssa-propagate.c
> index 87dbf55fab9..813143852b9 100644
> --- a/gcc/tree-ssa-propagate.c
> +++ b/gcc/tree-ssa-propagate.c
> @@ -1549,4 +1549,123 @@ propagate_tree_value_into_stmt (gimple_stmt_iterator 
> *gsi, tree val)
>else
>  gcc_unreachable ();
>  }
> +
> +/* Propagate RHS into all uses of LHS (when possible).
> +
> +   RHS and LHS are derived from STMT, which is passed in solely so
> +   that we can remove it if propagation is successful.  */
> +
> +static bool
> +propagate_rhs_into_lhs (gphi *stmt, tree lhs, tree rhs)
> +{
> +  use_operand_p use_p;
> +  imm_use_iterator iter;
> +  gimple_stmt_iterator gsi;
> +  gimple *use_stmt;
> +  bool changed = false;
> +  bool all = true;
> +
> +  /* Dump details.  */
> +  if (dump_file && (dump_flags & TDF_DETAILS))
> +{
> +  fprintf (dump_file, "  Replacing '");
> +  print_generic_expr (dump_file, lhs, dump_flags);
> +  fprintf (dump_file, "' with '");
> +  print_generic_expr (dump_file, rhs, dump_flags);
> +  fprintf (dump_file, "'\n");
> +}
> +
> +  /* Walk over every use of LHS and try to replace the use with RHS. */
> +  FOR_EACH_IMM_USE_STMT (use_stmt, iter, lhs)
> +{
> +  /* It is not safe to propagate into below stmts.  */
> +  if (gimple_debug_bind_p (use_stmt)
> + || (gimple_code (use_stmt) == GIMPLE_ASM
> + && !may_propagate_copy_into_asm (lhs))
> + || (TREE_CODE (rhs) == SSA_NAME
> + && SSA_NAME_DEF_STMT (rhs) == use_stmt))
> +   {
> + all = false;
> + continue;
> +   }
> +
> +  /* Dump details.  */
>

[PATCH] gcc-changelog: prevent double cherry-pick line

2020-11-05 Thread Martin Liška

This patch prevents the creation of duplicate 'cherry picked from'
lines. Quite a few revisions violate that. I'm going
to install it tomorrow so that the DATESTAMP update succeeds
the upcoming night.

Then we can update the server hook.

Martin


contrib/ChangeLog:

* gcc-changelog/git_commit.py: Add new check.
* gcc-changelog/test_email.py: Test it.
* gcc-changelog/test_patches.txt: Add new patch.
---
 contrib/gcc-changelog/git_commit.py|  6 +-
 contrib/gcc-changelog/test_email.py|  4 
 contrib/gcc-changelog/test_patches.txt | 29 ++
 3 files changed, 38 insertions(+), 1 deletion(-)

diff --git a/contrib/gcc-changelog/git_commit.py 
b/contrib/gcc-changelog/git_commit.py
index 0008865338b..80ae0b2a77d 100755
--- a/contrib/gcc-changelog/git_commit.py
+++ b/contrib/gcc-changelog/git_commit.py
@@ -423,7 +423,11 @@ class GitCommit:
                     continue
                 elif line.startswith(CHERRY_PICK_PREFIX):
                     commit = line[len(CHERRY_PICK_PREFIX):].rstrip(')')
-                    self.cherry_pick_commit = commit
+                    if self.cherry_pick_commit:
+                        self.errors.append(Error('multiple cherry pick lines',
+                                                 line))
+                    else:
+                        self.cherry_pick_commit = commit
                     continue
 
                 # ChangeLog name will be deduced later

diff --git a/contrib/gcc-changelog/test_email.py 
b/contrib/gcc-changelog/test_email.py
index df350a41228..e38c3e52158 100755
--- a/contrib/gcc-changelog/test_email.py
+++ b/contrib/gcc-changelog/test_email.py
@@ -362,6 +362,10 @@ class TestGccChangelog(unittest.TestCase):
         assert '\t2020-06-11  Martin Liska  ' in entry
         assert '\t\tJakub Jelinek  ' in entry
 
+    def test_backport_double_cherry_pick(self):
+        email = self.from_patch_glob('double-cherry-pick.patch')
+        assert email.errors[0].message.startswith('multiple cherry pick lines')
+
     def test_square_and_lt_gt(self):
         email = self.from_patch_glob('0001-Check-for-more-missing')
         assert not email.errors
diff --git a/contrib/gcc-changelog/test_patches.txt 
b/contrib/gcc-changelog/test_patches.txt
index bc9cc2e078e..37f49c851ec 100644
--- a/contrib/gcc-changelog/test_patches.txt
+++ b/contrib/gcc-changelog/test_patches.txt
@@ -3160,6 +3160,35 @@ index 823eb539993..4ec22162c12 100644
 --
 2.27.0
 
+=== double-cherry-pick.patch ===
+From e1d68582022cfa2b1dc76646724b397ba2739439 Mon Sep 17 00:00:00 2001
+From: Martin Liska 
+Date: Thu, 11 Jun 2020 09:34:41 +0200
+Subject: [PATCH] asan: fix RTX emission for ilp32
+
+gcc/ChangeLog:
+
+   PR sanitizer/95634
+   * asan.c (asan_emit_stack_protection): Fix emission for ilp32
+   by using Pmode instead of ptr_mode.
+
+Co-Authored-By: Jakub Jelinek 
+(cherry picked from commit 8cff672cb9a132d3d3158c2edfc9a64b55292b80)
+(cherry picked from commit 8cff672cb9a132d3d3158c2edfc9a64b55292b80)
+---
+ gcc/asan.c | 1 +
+ 1 file changed, 1 insertion(+)
+
+diff --git a/gcc/asan.c b/gcc/asan.c
+index 823eb539993..4ec22162c12 100644
+--- a/gcc/asan.c
++++ b/gcc/asan.c
+@@ -1 +1,2 @@
+
++
+--
+2.27.0
+
 === 0001-Check-for-more-missing-math-decls-on-vxworks.patch ===
 From 0edfc1fd22405ee8e946101e44cd8edc0ee12047 Mon Sep 17 00:00:00 2001
 From: Douglas B Rupp 
--
2.29.1



Fix uninitialized memory use in ipa-modref

2020-11-05 Thread Jan Hubicka
Hi,
this patch fixes two uninitialized memory uses in ipa-modref.  The first is
harmless because the values are never used, but they will make valgrind
unhappy.
The second is an actual bug: while breaking the patch in half I forgot to
initialize writes_errno at stream-in time.
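
A minimal sketch of the first kind of fix, assuming a simplified stand-in for the map structure (parm_map_sketch and make_parm_map_sketch are hypothetical names, not the ipa-modref code): the offset fields get defined values before any path that might return without setting them.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for a parameter map entry.
struct parm_map_sketch
{
  int parm_index;
  bool parm_offset_known;
  long parm_offset;
};

parm_map_sketch
make_parm_map_sketch (int index, bool offset_known, long offset)
{
  parm_map_sketch map;

  // Give the offset fields defined values up front, so that no path
  // returns a structure with indeterminate members.
  map.parm_offset_known = false;
  map.parm_offset = 0;

  map.parm_index = index;
  if (offset_known)
    {
      map.parm_offset_known = true;
      map.parm_offset = offset;
    }
  return map;
}
```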

Bootstrapped/regtested x86_64-linux, committed.

* ipa-modref.c (parm_map_for_arg): Initialize parm_offset and
parm_offset_known.
(read_section): Set writes_errno to false.
diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c
index b40f3da3ba2..9df3d2bcf2d 100644
--- a/gcc/ipa-modref.c
+++ b/gcc/ipa-modref.c
@@ -525,6 +525,9 @@ parm_map_for_arg (gimple *stmt, int i)
   poly_int64 offset;
   struct modref_parm_map parm_map;
 
+  parm_map.parm_offset_known = false;
+  parm_map.parm_offset = 0;
+
   offset_known = unadjusted_ptr_and_unit_offset (op, &op, &offset);
   if (TREE_CODE (op) == SSA_NAME
   && SSA_NAME_IS_DEFAULT_DEF (op)
@@ -1533,10 +1536,12 @@ read_section (struct lto_file_decl_data *file_data, 
const char *data,
   modref_summary_lto *modref_sum_lto = summaries_lto
   ? summaries_lto->get_create (node)
   : NULL;
-
   if (optimization_summaries)
modref_sum = optimization_summaries->get_create (node);
 
+  if (modref_sum)
+   modref_sum->writes_errno = false;
+
   gcc_assert (!modref_sum || (!modref_sum->loads
  && !modref_sum->stores));
   gcc_assert (!modref_sum_lto || (!modref_sum_lto->loads


Re: [00/32] C++ 20 Modules

2020-11-05 Thread David Malcolm via Gcc-patches
On Tue, 2020-11-03 at 16:12 -0500, Nathan Sidwell wrote:

[...]

[CCing Joseph]

> I have bootstrapped and tested on:
> x86_64-linux
> aarch64-linux
> powerpc8le-linux
> powerpc8-aix
> 
> Iain Sandoe has been regularly bootstrapping on x86_64-darwin.  Joseph
> Myers graciously built for i686-mingw host.  We eventually ran into
> compilation errors in the analyzer, as it seemed unprepared for an
> IL32P64 host.

Sorry about the issues with the analyzer with IL32P64 hosts.  I pushed
Markus Böck's fix for PR 96608 to master on 2020-10-27 as
942086bf73ee2ba6cfd7fdacc552940048437a6e.

Is anyone still seeing build issues with the analyzer?  (and is there a
machine in the compile farm I can test them out on?).

Dave



Re: [PATCH] arm: [testcase] Better narrow some bfloat16 testcase

2020-11-05 Thread Andrea Corallo via Gcc-patches
Christophe Lyon  writes:

> On Thu, 5 Nov 2020 at 12:11, Andrea Corallo  wrote:
>>
>> Christophe Lyon  writes:
>>
>> [...]
>>
>> >> I think you need to add -mfloat-abi=hard to the dg-additional-options
>> >> otherwise vld1_lane_bf16_1.c
>> >> fails on targets with a soft float-abi default (eg arm-linux-gnueabi).
>> >>
>> >> See bf16_vldn_1.c.
>> >
>> > Actually that's not sufficient because in turn we get:
>> > /sysroot-arm-none-linux-gnueabi/usr/include/gnu/stubs.h:10:11: fatal
>> > error: gnu/stubs-hard.h: No such file or directory
>> >
>> > So you should check that -mfloat-abi=hard is supported.
>> >
>> > Ditto for the vst tests.
>>
>> Hi Christophe,
>>
>> this patch should implement your suggestions.
>>
>> On my arm-none-linux-gnueabi setup the tests were already skipped
>> as unsupported so if you could test and confirm this fixes the
>> issue you see would be great.
>
> Do you know why they are unsupported in your setup?

We probably have a different GCC configuration.  Could you share how
yours is configured?

>> diff --git a/gcc/testsuite/lib/target-supports.exp 
>> b/gcc/testsuite/lib/target-supports.exp
>> index 15f0649f8ae..2ab7e39756d 100644
>> --- a/gcc/testsuite/lib/target-supports.exp
>> +++ b/gcc/testsuite/lib/target-supports.exp
>> @@ -5213,6 +5213,10 @@ proc 
>> check_effective_target_arm_v8_2a_bf16_neon_ok_nocache { } {
>>  return 0;
>>  }
>>
>> +if { ! [check_effective_target_arm_hard_ok] } {
>> + return 0;
>> +}
>> +
>> foreach flags {"" "-mfloat-abi=hard -mfpu=neon-fp-armv8" 
>> "-mfloat-abi=softfp -mfpu=neon-fp-armv8" } {
>> if { [check_no_compiler_messages_nocache arm_v8_2a_bf16_neon_ok 
>> object {
>> #include 
>
> This seems strange since you would now exit early if
> check_effective_target_arm_hard_ok is false, so you'll never need the
> -mfloat-abi=softfp version of the flags.

So IIUC your suggestion would be to test with higher priority softfp and
in case we decide to go for hardfp make sure
check_effective_target_arm_hard_ok is satisfied.  Am I correct?
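
To make that reading concrete, here is a small model of the search order (a sketch of my understanding only, not the Tcl in target-supports.exp; pick_bf16_flags_sketch and its parameters are hypothetical): candidates are tried in priority order, and the hard-float candidate is skipped unless the hard-float ABI is usable.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Try candidate flag sets in priority order.  COMPILES models the
// "does the test program build with these flags" check; HARD_OK models
// the result of the hard-float ABI check.  On success store the flags
// in CHOSEN and return true.
bool
pick_bf16_flags_sketch (bool hard_ok,
                        const std::function<bool (const std::string &)> &compiles,
                        std::string &chosen)
{
  const std::vector<std::string> candidates = {
    "",
    "-mfloat-abi=softfp -mfpu=neon-fp-armv8",
    "-mfloat-abi=hard -mfpu=neon-fp-armv8",
  };

  for (const std::string &flags : candidates)
    {
      bool needs_hard = flags.find ("-mfloat-abi=hard") != std::string::npos;
      // Only consider the hard-float variant when that ABI is supported.
      if (needs_hard && !hard_ok)
        continue;
      if (compiles (flags))
        {
          chosen = flags;
          return true;
        }
    }
  return false;
}
```

With this ordering the softfp variant stays reachable on setups where the hard-float ABI is unavailable.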

> BTW in general, I think softfp is tried before hard in the other
> similar effective targets, any reason the order is different here?

No idea.

Thanks

  Andrea


Re: [PATCH] "used" attribute saves decl from linker garbage collection

2020-11-05 Thread H.J. Lu via Gcc-patches
On Thu, Nov 5, 2020 at 3:37 AM Jozef Lawrynowicz
 wrote:
>
> On Thu, Nov 05, 2020 at 06:21:21AM -0500, Hans-Peter Nilsson wrote:
> > On Wed, 4 Nov 2020, H.J. Lu wrote:
> > > .retain is ill-defined.   For example,
> > >
> > > [hjl@gnu-cfl-2 gcc]$ cat /tmp/x.c
> > > static int xyzzy __attribute__((__used__));
> > > [hjl@gnu-cfl-2 gcc]$ ./xgcc -B./ -S /tmp/x.c -fcommon
> > > [hjl@gnu-cfl-2 gcc]$ cat x.s
> > > .file "x.c"
> > > .text
> > > .retain xyzzy  < What does it do?
> > > .local xyzzy
> > > .comm xyzzy,4,4
> > > .ident "GCC: (GNU) 11.0.0 20201103 (experimental)"
> > > .section .note.GNU-stack,"",@progbits
> > > [hjl@gnu-cfl-2 gcc]$
> >
> > To answer that question: it's up to the assembler, but for ELF
> > and SHF_GNU_RETAIN, it seems obvious it'd tell the assembler to
> > set SHF_GNU_RETAIN for the section where the symbol ends up.
> > We both know this isn't rocket science with binutils.
>
> Indeed, and my patch handles it trivially:
> https://sourceware.org/pipermail/binutils/2020-November/113993.html
>
>   +void
>   +obj_elf_retain (int arg ATTRIBUTE_UNUSED)
>    snip 
>   +  sym = get_sym_from_input_line_and_check ();
>   +  symbol_get_obj (sym)->retain = 1;
>
>   @@ -2624,6 +2704,9 @@ elf_frob_symbol (symbolS *symp, int *puntp)
> }
>}
>
>   +  if (symbol_get_obj (symp)->retain)
>   +elf_section_flags (S_GET_SEGMENT (symp)) |= SHF_GNU_RETAIN;
>   +
>  /* Double check weak symbols.  */
>  if (S_IS_WEAK (symp))
>{
>
> We could check that the symbol named in the .retain directive has
> already been defined, however this isn't compatible with GCC
> mark_decl_preserved handling, since mark_decl_preserved is called
> before the local symbols are defined in the assembly output
> file.
>
> GAS should at least validate that the symbol named in the .retain
> directive does end up as a symbol though.
>

Don't add .retain.

-- 
H.J.


[committed] analyzer: fix ICE comparing COMPLEX_CSTs [PR97668]

2020-11-05 Thread David Malcolm via Gcc-patches
Successfully bootstrapped & regression-tested on x86_64-pc-linux-gnu.
Pushed to master as r11-4740-g54cbdb528df16686290ad26e2130a1896915639d.
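
The ordering used for complex constants is lexicographic: compare the real parts and fall back to the imaginary parts only on a tie. A standalone sketch of that idea, using std::complex<double> in place of trees (cmp_real_sketch and cmp_complex_sketch are hypothetical names, and the bytewise comparison only mirrors the spirit of the REAL_CST case):

```cpp
#include <cassert>
#include <complex>
#include <cstring>

// Bytewise three-way comparison of two doubles.  The resulting order is
// arbitrary but consistent, which is all a sort comparator needs.
static int
cmp_real_sketch (double r1, double r2)
{
  return std::memcmp (&r1, &r2, sizeof (double));
}

// Lexicographic comparison: real part first, then imaginary part.
int
cmp_complex_sketch (const std::complex<double> &c1,
                    const std::complex<double> &c2)
{
  if (int cmp_real = cmp_real_sketch (c1.real (), c2.real ()))
    return cmp_real;
  return cmp_real_sketch (c1.imag (), c2.imag ());
}
```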

gcc/analyzer/ChangeLog:
PR analyzer/97668
* svalue.cc (cmp_cst): Handle COMPLEX_CST.

gcc/testsuite/ChangeLog:
PR analyzer/97668
* gcc.dg/analyzer/pr97668.c: New test.
* gfortran.dg/analyzer/pr97668.f: New test.
---
 gcc/analyzer/svalue.cc   |  4 +++
 gcc/testsuite/gcc.dg/analyzer/pr97668.c  | 27 
 gcc/testsuite/gfortran.dg/analyzer/pr97668.f | 26 +++
 3 files changed, 57 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr97668.c
 create mode 100644 gcc/testsuite/gfortran.dg/analyzer/pr97668.f

diff --git a/gcc/analyzer/svalue.cc b/gcc/analyzer/svalue.cc
index 18d9c376f5e..e9304522b8e 100644
--- a/gcc/analyzer/svalue.cc
+++ b/gcc/analyzer/svalue.cc
@@ -291,6 +291,10 @@ cmp_cst (const_tree cst1, const_tree cst2)
   return memcmp (TREE_REAL_CST_PTR (cst1),
 TREE_REAL_CST_PTR (cst2),
 sizeof (real_value));
+case COMPLEX_CST:
+  if (int cmp_real = cmp_cst (TREE_REALPART (cst1), TREE_REALPART (cst2)))
+   return cmp_real;
+  return cmp_cst (TREE_IMAGPART (cst1), TREE_IMAGPART (cst2));
 case VECTOR_CST:
   if (int cmp_log2_npatterns
= ((int)VECTOR_CST_LOG2_NPATTERNS (cst1)
diff --git a/gcc/testsuite/gcc.dg/analyzer/pr97668.c 
b/gcc/testsuite/gcc.dg/analyzer/pr97668.c
new file mode 100644
index 000..6ec8164e868
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/pr97668.c
@@ -0,0 +1,27 @@
+/* { dg-additional-options "-O1" } */
+
+void
+wb (_Complex double jh)
+{
+  _Complex double af = 0.0;
+
+  do
+{
+  af += jh;
+}
+  while (af != 0.0);
+}
+
+_Complex double
+o6 (void)
+{
+  _Complex double ba = 0.0;
+
+  for (;;)
+{
+  wb (ba);
+  ba = 1.0;
+}
+
+  return ba;
+}
diff --git a/gcc/testsuite/gfortran.dg/analyzer/pr97668.f 
b/gcc/testsuite/gfortran.dg/analyzer/pr97668.f
new file mode 100644
index 000..568c891cdc4
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/analyzer/pr97668.f
@@ -0,0 +1,26 @@
+c { dg-additional-options "-std=legacy" }
+
+  SUBROUTINE PPADD (A, C, BH)
+
+  COMPLEX DD, FP, FPP, R1, R2
+  DIMENSION A(*), C(*), BH(*)
+
+  DO 136 IG=IS,1
+ FP = (0.,0.)
+ FPP = (0.,0.)
+
+ DO 121 J=1,1
+DD = 1./2
+FP = DD
+FPP = DD+1
+ 121 CONTINUE
+
+ R2 = -FP
+ IF (ABS(R1)-ABS(R2)) 129,129,133
+ 129 R1 = R2/FPP
+ 133 IT = IT+1
+
+ 136  CONTINUE
+
+  RETURN
+  END
-- 
2.26.2



[committed] diagnostic paths: loosen coupling between path-printing and path_summary

2020-11-05 Thread David Malcolm via Gcc-patches
Doing this makes followup work to add HTML path-printing cleaner.

Successfully bootstrapped & regression-tested on x86_64-pc-linux-gnu.
Pushed to master as f8cc59ef4941c19d068b9dfe4e13753c9fd402c6.
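
For readers unfamiliar with the code being moved, the grouping idea can be sketched as follows (a simplified model with hypothetical names event_sketch, range_sketch and summarize_sketch; the real consolidation check also looks at locations): consecutive events sharing the same function and stack depth are merged into one range.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified event: just a function id and a stack depth.
struct event_sketch
{
  int fndecl;
  int stack_depth;
};

// A run of consecutive events with the same function and stack depth.
struct range_sketch
{
  int fndecl;
  int stack_depth;
  unsigned start_idx;
  unsigned end_idx;
};

std::vector<range_sketch>
summarize_sketch (const std::vector<event_sketch> &events)
{
  std::vector<range_sketch> ranges;
  for (unsigned idx = 0; idx < events.size (); ++idx)
    {
      const event_sketch &ev = events[idx];
      // Extend the current range when the event belongs to the same frame.
      if (!ranges.empty ()
          && ranges.back ().fndecl == ev.fndecl
          && ranges.back ().stack_depth == ev.stack_depth)
        {
          ranges.back ().end_idx = idx;
          continue;
        }
      // Otherwise start a new range at this event.
      ranges.push_back ({ev.fndecl, ev.stack_depth, idx, idx});
    }
  return ranges;
}
```

Keeping the ranges as plain data, separate from any printing routine, is what lets more than one printer consume the same summary.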

gcc/ChangeLog:
* tree-diagnostic-path.cc (struct path_summary::event_range): Move
out of path_summary to...
(struct event_range): ...here.
(class path_summary): Convert to...
(struct path_summary): ...this.
(path_summary::m_ranges): Drop "private".
(path_summary::print): Convert to...
(print_path_summary_as_text): ...this, passing in the path_summary
explicitly.
(default_tree_diagnostic_path_printer): Update for above change.
(selftest::test_empty_path): Likewise.
(selftest::test_intraprocedural_path): Likewise.
(selftest::test_interprocedural_path_1): Likewise.
(selftest::test_interprocedural_path_2): Likewise.
(selftest::test_recursion): Likewise.
---
 gcc/tree-diagnostic-path.cc | 204 ++--
 1 file changed, 100 insertions(+), 104 deletions(-)

diff --git a/gcc/tree-diagnostic-path.cc b/gcc/tree-diagnostic-path.cc
index 82b3c2d6b6a..164df86037e 100644
--- a/gcc/tree-diagnostic-path.cc
+++ b/gcc/tree-diagnostic-path.cc
@@ -110,112 +110,108 @@ can_consolidate_events (const diagnostic_event &e1,
   return true;
 }
 
-/* A class for grouing together the events in a diagnostic_path into
-   ranges of events, partitioned by stack frame (i.e. by fndecl and
-   stack depth).  */
-
-class path_summary
+/* A range of consecutive events within a diagnostic_path,
+   all with the same fndecl and stack_depth, and which are suitable
+   to print with a single call to diagnostic_show_locus.  */
+struct event_range
 {
-  /* A range of consecutive events within a diagnostic_path,
- all with the same fndecl and stack_depth, and which are suitable
- to print with a single call to diagnostic_show_locus.  */
-  struct event_range
+  event_range (const diagnostic_path *path, unsigned start_idx,
+  const diagnostic_event &initial_event)
+  : m_path (path),
+m_initial_event (initial_event),
+m_fndecl (initial_event.get_fndecl ()),
+m_stack_depth (initial_event.get_stack_depth ()),
+m_start_idx (start_idx), m_end_idx (start_idx),
+m_path_label (path, start_idx),
+m_richloc (initial_event.get_location (), &m_path_label)
+  {}
+
+  bool maybe_add_event (const diagnostic_event &new_ev, unsigned idx,
+   bool check_rich_locations)
   {
-event_range (const diagnostic_path *path, unsigned start_idx,
-const diagnostic_event &initial_event)
-: m_path (path),
-  m_initial_event (initial_event),
-  m_fndecl (initial_event.get_fndecl ()),
-  m_stack_depth (initial_event.get_stack_depth ()),
-  m_start_idx (start_idx), m_end_idx (start_idx),
-  m_path_label (path, start_idx),
-  m_richloc (initial_event.get_location (), &m_path_label)
-{}
-
-bool maybe_add_event (const diagnostic_event &new_ev, unsigned idx,
- bool check_rich_locations)
-{
-  if (!can_consolidate_events (m_initial_event, new_ev,
-  check_rich_locations))
+if (!can_consolidate_events (m_initial_event, new_ev,
+check_rich_locations))
+  return false;
+if (check_rich_locations)
+  if (!m_richloc.add_location_if_nearby (new_ev.get_location (),
+false, &m_path_label))
return false;
-  if (check_rich_locations)
-   if (!m_richloc.add_location_if_nearby (new_ev.get_location (),
-  false, &m_path_label))
- return false;
-  m_end_idx = idx;
-  return true;
-}
+m_end_idx = idx;
+return true;
+  }
 
-/* Print the events in this range to DC, typically as a single
-   call to the printer's diagnostic_show_locus.  */
+  /* Print the events in this range to DC, typically as a single
+ call to the printer's diagnostic_show_locus.  */
 
-void print (diagnostic_context *dc)
-{
-  location_t initial_loc = m_initial_event.get_location ();
+  void print (diagnostic_context *dc)
+  {
+location_t initial_loc = m_initial_event.get_location ();
 
-  /* Emit a span indicating the filename (and line/column) if the
-line has changed relative to the last call to
-diagnostic_show_locus.  */
-  if (dc->show_caret)
-   {
- expanded_location exploc
-   = linemap_client_expand_location_to_spelling_point
-   (initial_loc, LOCATION_ASPECT_CARET);
- if (exploc.file != LOCATION_FILE (dc->last_location))
-   dc->start_span (dc, exploc);
-   }
+/* Emit a span indicating the filename (and line/column) if the
+   line has changed relative to the last call to
+   diagnostic_show_locus.  */
+if (dc->show_care

Re: [PATCH] cache compute_objsize results in strlen/sprintf (PR 97373)

2020-11-05 Thread Martin Sebor via Gcc-patches

On 11/5/20 12:31 AM, Richard Biener wrote:

On Thu, Nov 5, 2020 at 1:59 AM Martin Sebor via Gcc-patches
 wrote:


To determine the target of a pointer expression and the offset into
it, the increasingly widely used compute_objsize function traverses
the IL following the DEF statements of pointer variables, aggregating
offsets from POINTER_PLUS assignments along the way.  It does that
for many statements that involve pointers, including calls to
built-in functions and (so far only) accesses to char arrays.  When
a function has many such statements with pointers to the same objects
but with different offsets, the traversal ends up visiting the same
pointer assignments repeatedly and unnecessarily.

To avoid this repeated traversal, the attached patch adds the ability
to cache results obtained in prior calls to the function.  The cache
is optional and only used when enabled.
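
As a rough illustration of the caching idea (not the code in the patch), the sketch below memoizes the result of walking a chain of pointer definitions.  The names PtrDef, ObjRef and OffsetQuery are hypothetical stand-ins invented for this example, and the chain is assumed to be acyclic; the real pointer_query class works on GCC's IL and is not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical model: pointer i is defined as "defs[i].base + defs[i].delta".
// A negative base means pointer i refers to the start of an object.
struct PtrDef { int base; long delta; };

// Result of a query: the root object and the accumulated byte offset.
struct ObjRef { int root; long offset; };

class OffsetQuery {
public:
  explicit OffsetQuery(std::vector<PtrDef> defs) : defs_(std::move(defs)) {}

  // Follow the definition chain of PTR, reusing cached results for any
  // pointer that an earlier query has already resolved.
  ObjRef resolve(int ptr) {
    assert(ptr >= 0 && static_cast<std::size_t>(ptr) < defs_.size());
    auto hit = cache_.find(ptr);
    if (hit != cache_.end())
      return hit->second;

    ++visited_;  // count definitions actually examined
    const PtrDef &d = defs_[ptr];
    ObjRef r = d.base < 0 ? ObjRef{ptr, 0} : resolve(d.base);
    r.offset += d.delta;
    cache_.emplace(ptr, r);
    return r;
  }

  int visited() const { return visited_; }

private:
  std::vector<PtrDef> defs_;
  std::unordered_map<int, ObjRef> cache_;
  int visited_ = 0;
};
```

Two queries that share a prefix of the chain examine each definition only once; the second query picks up the cached entry for the shared part instead of walking it again.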

To exercise the cache I have enabled it for the strlen pass (which
is probably the heaviest compute_objsize user).  That happens to
resolve PR 97373 which tracks the pass' failure to detect sprintf
overflowing allocated buffers at a non-constant offset.  I thought
about making this a separate patch but the sprintf/strlen changes
are completely mechanical so it didn't seem worth the effort.

In the benchmarking I've done the cache isn't a huge win there but
it does make a measurable difference in the project I'm wrapping up
where most pointer assignments need to be examined.  The space used
for the cache is negligible on average: fewer than 20 entries per
Glibc function and about 6 for GCC.  The worst case in Glibc is
6 thousand entries and 10k in GCC.  Since the entries are sizable
(216 bytes each) the worst case memory consumption can be reduced
by adding a level of indirection.  Further savings can be obtained
by replacing some of the offset_int members of the entries with
HOST_WIDE_INT.

The efficiency benefits of the cache should increase further as more
of the access checking code is integrated into the same pass.  This
should eventually include the checking currently done in the built-in
expanders.

Tested on x86_64-linux, along with Glibc and Binutils/GDB.


I'm quite sure the objsz pass already has a cache, why not
re-use it instead of piggy-backing another one onto its machinery?


compute_objsize() and the objsz pass are completely independent.
The pass is also quite limited in that it doesn't make use of
ranges.  That limitation was also the main reason for introducing
the compute_objsize() function.

I'd love to see the objsize pass and compute_objsize() integrated
and exposed under an interface similar to range_query, with
the information available anywhere, and on demand.  I might tackle
it some day, but I expect it will be a nontrivial project, much
bigger than letting compute_objsize() do this simple caching for
the time being.

Martin



Richard.


Martin

PS The patch adds the new pointer_query class (loosely modeled on
range_query) to builtins.{h,c}.  This should be only temporary,
until the access checking code is moved into a file (and ultimately
a pass) of its own.




Re: [00/32] C++ 20 Modules

2020-11-05 Thread Nathan Sidwell

On 11/5/20 8:33 AM, Richard Biener wrote:


Moving the module mapper to a more easily (build-)testable location
and to a place where host dependences can be more easily fixed
& customized than in a bootstrapped directory would be nice.  Thus,
I think the module mapper should be in the toplevel somehow
and independently buildable.


Ok, that makes sense.  It is where it is, because originally it was much 
more tightly coupled with cc1plus.


The mapper-server and cc1plus do share some (maybe just one?) obj files. 
The in-process resolving and the server's default have the same 
functionality.


For bootstrap cc1plus needs them, so I guess they should remain in
gcc/cp/?  The alternative would be to put them in a new mapper-server dir
and have it provide some kind of library that cc1plus could link with.
However that'll probably mess up bootstrap.


Having a --with-module-mapper configure option seems sensible.

nathan

--
Nathan Sidwell


Re: [PATCH] cache compute_objsize results in strlen/sprintf (PR 97373)

2020-11-05 Thread Jakub Jelinek via Gcc-patches
On Thu, Nov 05, 2020 at 08:20:20AM -0700, Martin Sebor via Gcc-patches wrote:
> compute_objsize() and the objsz pass are completely independent.
> The pass is also quite limited in that it doesn't make use of
> ranges.  That limitation was also the main reason for introducing
> the compute_objsize() function.
> 
> I'd love to see the objsize pass and compute_objsize() integrated
> and exposed under an interface similar to range_query, with
> the information available anywhere, and on demand.  I might tackle

As I said multiple times, that would be a serious security hazard.
_FORTIFY_SOURCE protects against some UBs in the programs, and ranges
are computed on the assumption that UB doesn't happen in the program,
so relying on the ranges etc. in there is highly undesirable.

Jakub



Re: [PATCH v4] c++: Implement -Wvexing-parse [PR25814]

2020-11-05 Thread Marek Polacek via Gcc-patches
On Fri, Oct 30, 2020 at 04:33:48PM -0400, Jason Merrill wrote:
> On 10/29/20 11:00 PM, Marek Polacek wrote:
> > Gotcha.  Now we do most of the work in warn_about_ambiguous_parse.
> 
> Thanks, just a few tweaks left.
> > --- a/gcc/cp/decl.c
> > +++ b/gcc/cp/decl.c
> > @@ -4378,6 +4378,9 @@ cxx_init_decl_processing (void)
> > init_list_type_node = make_node (LANG_TYPE);
> > record_unknown_type (init_list_type_node, "init list");
> > +  /* Used when parsing to distinguish parameter-lists () and (void).  */
> > +  explicit_void_list_node = build_void_list_node ();
> > +
> > {
> >   /* Make sure we get a unique function type, so we can give
> >  its pointer type a name.  (This wins for gdb.) */
> > @@ -14033,7 +14036,7 @@ grokparms (tree parmlist, tree *parms)
> > tree init = TREE_PURPOSE (parm);
> > tree decl = TREE_VALUE (parm);
> > -  if (parm == void_list_node)
> > +  if (parm == void_list_node || parm == explicit_void_list_node)
> > break;
> 
> Is this hunk needed?  I thought explicit_void_type_node would be handled by
> the if (VOID_TYPE_P) block below.

Yeah, because explicit_/void_list_node don't have a type.

> > +static void
> > +warn_about_ambiguous_parse (tree type, const cp_declarator *declarator)
> > +{
> > +  if (declarator->kind != cdk_function
> > +  || !declarator->declarator
> > +  || declarator->declarator->kind != cdk_id
> > +  || !identifier_p (get_unqualified_id
> > +   (const_cast<cp_declarator *>(declarator))))
> > +return;
> > +
> > +  /* Don't warn when the whole declarator (not just the declarator-id!)
> > + was parenthesized.  That is, don't warn for int(n()) but do warn
> > + for int(f)().  */
> > +  if (declarator->parenthesized != UNKNOWN_LOCATION)
> > +return;
> > +
> > +  location_t loc = declarator->u.function.parens_loc;
> > +  if (loc == UNKNOWN_LOCATION)
> > +return;
> 
> Is this still possible?

Looks like it isn't, removed.

> > +  if (TREE_CODE (type) == TYPE_DECL)
> > +   type = TREE_TYPE (type);
> > +
> > +  /* If the return type is void there is no ambiguity.  */
> > +  if (same_type_p (type, void_type_node))
> > +return;
> > +
> > +  auto_diagnostic_group d;
> > +  tree params = declarator->u.function.parameters;
> > +  const bool has_list_ctor_p = CLASS_TYPE_P (type) && TYPE_HAS_LIST_CTOR (type);
> > +
> > +  /* The T t() case.  */
> > +  if (params == void_list_node)
> > +{
> > +  if (warning_at (loc, OPT_Wvexing_parse,
> > + "empty parentheses were disambiguated as a function "
> > + "declaration"))
> > +   {
> > + /* () means value-initialization (C++03 and up); {} (C++11 and up)
> > +means value-initialization or aggregate-initialization, nothing
> > +means default-initialization.  We can only suggest removing the
> > +parentheses/adding {} if T has a default constructor.  */
> > + if (!CLASS_TYPE_P (type) || TYPE_HAS_DEFAULT_CONSTRUCTOR (type))
> > +   {
> > + gcc_rich_location iloc (loc);
> > + iloc.add_fixit_remove ();
> > + inform (&iloc, "remove parentheses to default-initialize "
> > + "a variable");
> > + if (cxx_dialect >= cxx11 && !has_list_ctor_p)
> > +   {
> > + if (CP_AGGREGATE_TYPE_P (type))
> > +   inform (loc, "or replace parentheses with braces to "
> > +   "aggregate-initialize a variable");
> > + else
> > +   inform (loc, "or replace parentheses with braces to "
> > +   "value-initialize a variable");
> > +   }
> > +   }
> > +   }
> > +  return;
> > +}
> > +
> > +  /* If we had (...) or the parameter-list wasn't parenthesized,
> > + we're done.  */
> > +  if (params == NULL_TREE || !PARENTHESIZED_LIST_P (params))
> > +return;
> 
> This needs to be a loop so we check all the elements of the list.

I think this can't be a loop, because we only set PARENTHESIZED_LIST_P
on the whole list.  But I realized that there still was an issue: I was
setting PARENTHESIZED_LIST_P only based on the last element in the list,
but we want to set it only if every parameter was parenthesized.  So I've
fixed setting of PARENTHESIZED_LIST_P instead.
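
For readers who want to see the two parses the warning targets, here is a small self-contained sketch.  The types T and X and the function vexing_parse_demo are made up for the example; it only shows how the language treats the declarations and does not exercise the patch itself.  A compiler that implements such a warning may well diagnose the first two declarations.

```cpp
#include <type_traits>

struct X { };
struct T {
  T () = default;
  explicit T (X) { }
};

// Returns true if the declarations below are parsed the way the
// comments claim.
bool vexing_parse_demo ()
{
  T a ();       // The T t() case: declares a function returning T.
  T b (X ());   // The T t(X()) case: declares a function taking a
                // pointer to function returning X, and returning T.
  T c { };      // Braces: an object of type T.
  T d { X { } };  // Braces: an object of type T built from an X.

  return std::is_function<decltype (a)>::value
         && std::is_function<decltype (b)>::value
         && std::is_same<decltype (c), T>::value
         && std::is_same<decltype (d), T>::value;
}
```

The braced forms are the ones the notes in the patch suggest as replacements, which is why the patch avoids suggesting them for classes with an initializer_list constructor.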

> > +  /* The T t(X()) case.  */
> > +  if (list_length (params) == 2)
> > +{
> > +  if (warning_at (loc, OPT_Wvexing_parse,
> > + "parentheses were disambiguated as a function "
> > + "declaration"))
> > +   {
> > + gcc_rich_location iloc (loc);
> > + /* {}-initialization means that we can use an initializer-list
> > +constructor if no default constructor is available, so don't
> > +suggest using {} for classes that have an initializer_list
> > +constructor.  */
> > + if (cxx_dialect >= cxx11 && !has_list_ctor_p)
> > +   {
> > + iloc.add_fixit_replace (get_start (loc), "{");
> > + iloc.add_f

[PATCH] AArch64: Improve inline memcpy expansion

2020-11-05 Thread Wilco Dijkstra via Gcc-patches
Improve the inline memcpy expansion.  Use integer load/store for copies
<= 24 bytes instead of SIMD.  Set the maximum copy to expand to 256 bytes
by default, except that -Os or no Neon expands up to 128 bytes.  When
using LDP/STP of Q-registers, also use Q-register accesses for the
unaligned tail, saving 2 instructions (e.g. all sizes up to 48 bytes emit
exactly 4 instructions).  Clean up code and comments.
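
The size thresholds described above can be sketched as a standalone function.  It follows the decisions in the posted hunk (256-byte limit, 128 bytes under -Os or without SIMD, 128-bit chunks for copies of at most 24 bytes or when Q-register LDP/STP is tuned as slow), but plan_copy, CopyPlan and the boolean parameters are hypothetical names standing in for GCC's internal state, so treat it as an illustration only.

```cpp
// Hypothetical result type: whether to expand inline, and the widest
// chunk (in bits) the expansion would use.
struct CopyPlan {
  bool expand_inline;
  int chunk_bits;
};

// SIZE is the constant byte count of the copy.  The three flags model
// optimize_function_for_size_p, TARGET_SIMD and the
// AARCH64_EXTRA_TUNE_NO_LDP_STP_QREGS tuning flag respectively.
CopyPlan plan_copy (unsigned long size, bool optimize_size,
                    bool have_simd, bool slow_q_ldp_stp)
{
  // Inline up to 256 bytes for speed, 128 for size or without SIMD.
  unsigned long max_copy_size = (optimize_size || !have_simd) ? 128 : 256;
  if (size > max_copy_size)
    return CopyPlan{false, 0};

  // Default to 256-bit chunks; fall back to 128-bit chunks for small
  // copies, no SIMD, or slow Q-register LDP/STP on mid-sized copies.
  int chunk_bits = 256;
  if (size <= 24 || !have_simd
      || (size <= max_copy_size / 2 && slow_q_ldp_stp))
    chunk_bits = 128;

  return CopyPlan{true, chunk_bits};
}
```

With these rules a 16-byte copy uses 128-bit chunks, while a 48-byte copy on a target with fast Q-register pairs uses 256-bit chunks.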

The codesize gain vs the GCC10 expansion is 0.05% on SPECINT2017.

Passes bootstrap and regress. OK for commit?

ChangeLog:
2020-11-03  Wilco Dijkstra  

* config/aarch64/aarch64.c (aarch64_expand_cpymem): Cleanup code and
comments, tweak expansion decisions and improve tail expansion.

---

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 41e2a699108146e0fa7464743607bd34e91ea9eb..9487c1cb07b0d851c0f085262179470d0d596116 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -21255,35 +21255,36 @@ aarch64_copy_one_block_and_progress_pointers (rtx *src, rtx *dst,
 bool
 aarch64_expand_cpymem (rtx *operands)
 {
-  /* These need to be signed as we need to perform arithmetic on n as
- signed operations.  */
-  int n, mode_bits;
+  int mode_bits;
   rtx dst = operands[0];
   rtx src = operands[1];
   rtx base;
-  machine_mode cur_mode = BLKmode, next_mode;
-  bool speed_p = !optimize_function_for_size_p (cfun);
+  machine_mode cur_mode = BLKmode;
 
-  /* When optimizing for size, give a better estimate of the length of a
- memcpy call, but use the default otherwise.  Moves larger than 8 bytes
- will always require an even number of instructions to do now.  And each
- operation requires both a load+store, so divide the max number by 2.  */
-  unsigned int max_num_moves = (speed_p ? 16 : AARCH64_CALL_RATIO) / 2;
-
-  /* We can't do anything smart if the amount to copy is not constant.  */
+  /* Only expand fixed-size copies.  */
   if (!CONST_INT_P (operands[2]))
 return false;
 
-  unsigned HOST_WIDE_INT tmp = INTVAL (operands[2]);
+  unsigned HOST_WIDE_INT size = INTVAL (operands[2]);
 
-  /* Try to keep the number of instructions low.  For all cases we will do at
- most two moves for the residual amount, since we'll always overlap the
- remainder.  */
-  if (((tmp / 16) + (tmp % 16 ? 2 : 0)) > max_num_moves)
+  /* Inline up to 256 bytes when optimizing for speed.  */
+  unsigned HOST_WIDE_INT max_copy_size = 256;
+
+  if (optimize_function_for_size_p (cfun) || !TARGET_SIMD)
+max_copy_size = 128;
+
+  if (size > max_copy_size)
 return false;
 
-  /* At this point tmp is known to have to fit inside an int.  */
-  n = tmp;
+  int copy_bits = 256;
+
+  /* Default to 256-bit LDP/STP on large copies, however small copies, no SIMD
+ support or slow 256-bit LDP/STP fall back to 128-bit chunks.  */
+  if (size <= 24 || !TARGET_SIMD
+  || (size <= (max_copy_size / 2)
+ && (aarch64_tune_params.extra_tuning_flags
+ & AARCH64_EXTRA_TUNE_NO_LDP_STP_QREGS)))
+copy_bits = GET_MODE_BITSIZE (TImode);
 
   base = copy_to_mode_reg (Pmode, XEXP (dst, 0));
   dst = adjust_automodify_address (dst, VOIDmode, base, 0);
@@ -21291,15 +21292,8 @@ aarch64_expand_cpymem (rtx *operands)
   base = copy_to_mode_reg (Pmode, XEXP (src, 0));
   src = adjust_automodify_address (src, VOIDmode, base, 0);
 
-  /* Convert n to bits to make the rest of the code simpler.  */
-  n = n * BITS_PER_UNIT;
-
-  /* Maximum amount to copy in one go.  We allow 256-bit chunks based on the
- AARCH64_EXTRA_TUNE_NO_LDP_STP_QREGS tuning parameter and TARGET_SIMD.  */
-  const int copy_limit = ((aarch64_tune_params.extra_tuning_flags
-  & AARCH64_EXTRA_TUNE_NO_LDP_STP_QREGS)
- || !TARGET_SIMD)
-? GET_MODE_BITSIZE (TImode) : 256;
+  /* Convert size to bits to make the rest of the code simpler.  */
+  int n = size * BITS_PER_UNIT;
 
   while (n > 0)
 {
@@ -21307,23 +21301,26 @@ aarch64_expand_cpymem (rtx *operands)
 or writing.  */
   opt_scalar_int_mode mode_iter;
   FOR_EACH_MODE_IN_CLASS (mode_iter, MODE_INT)
-   if (GET_MODE_BITSIZE (mode_iter.require ()) <= MIN (n, copy_limit))
+   if (GET_MODE_BITSIZE (mode_iter.require ()) <= MIN (n, copy_bits))
  cur_mode = mode_iter.require ();
 
   gcc_assert (cur_mode != BLKmode);
 
   mode_bits = GET_MODE_BITSIZE (cur_mode).to_constant ();
+
+  /* Prefer Q-register accesses for the last bytes.  */
+  if (mode_bits == 128 && copy_bits == 256)
+   cur_mode = V4SImode;
+
   aarch64_copy_one_block_and_progress_pointers (&src, &dst, cur_mode);
 
   n -= mode_bits;
 
-  /* Do certain trailing copies as overlapping if it's going to be
-cheaper.  i.e. less instructions to do so.  For instance doing a 15
-byte copy it's more efficient to do two overlapping 8 byte copies than
-8 + 6 + 1.  */
-  if (n > 0 && n <= 8 * BITS_PER_

[PATCH] c++: Add -Wexceptions warning option [PR97675]

2020-11-05 Thread Marek Polacek via Gcc-patches
This PR asks that we add a warning option for an existing (very old)
warning, so that it can be disabled selectively.  clang++ uses
-Wexceptions for this, so I added this new option rather than using
e.g. -Wnoexcept.

gcc/c-family/ChangeLog:

PR c++/97675
* c.opt (Wexceptions): New option.

gcc/cp/ChangeLog:

PR c++/97675
* except.c (check_handlers_1): Use OPT_Wexceptions for the
warning.  Use inform for the second part of the warning.

gcc/ChangeLog:

PR c++/97675
* doc/invoke.texi: Document -Wexceptions.

gcc/testsuite/ChangeLog:

PR c++/97675
* g++.old-deja/g++.eh/catch10.C: Adjust dg-warning.
* g++.dg/warn/Wexceptions1.C: New test.
* g++.dg/warn/Wexceptions2.C: New test.
---
 gcc/c-family/c.opt  |  4 
 gcc/cp/except.c |  9 -
 gcc/doc/invoke.texi |  8 +++-
 gcc/testsuite/g++.dg/warn/Wexceptions1.C|  9 +
 gcc/testsuite/g++.dg/warn/Wexceptions2.C| 10 ++
 gcc/testsuite/g++.old-deja/g++.eh/catch10.C |  4 ++--
 6 files changed, 36 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/warn/Wexceptions1.C
 create mode 100644 gcc/testsuite/g++.dg/warn/Wexceptions2.C

diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 426636be839..9493acb82ff 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -579,6 +579,10 @@ Werror-implicit-function-declaration
 C ObjC RejectNegative Warning Alias(Werror=, implicit-function-declaration)
 This switch is deprecated; use -Werror=implicit-function-declaration instead.
 
+Wexceptions
+C++ ObjC++ Var(warn_exceptions) Init(1)
+Warn when an exception handler is shadowed by another handler.
+
 Wextra
 C ObjC C++ ObjC++ Warning
 ; in common.opt
diff --git a/gcc/cp/except.c b/gcc/cp/except.c
index cb1a4105dae..985206f6a64 100644
--- a/gcc/cp/except.c
+++ b/gcc/cp/except.c
@@ -975,11 +975,10 @@ check_handlers_1 (tree master, tree_stmt_iterator i)
   tree handler = tsi_stmt (i);
   if (TREE_TYPE (handler) && can_convert_eh (type, TREE_TYPE (handler)))
{
- warning_at (EXPR_LOCATION (handler), 0,
- "exception of type %qT will be caught",
- TREE_TYPE (handler));
- warning_at (EXPR_LOCATION (master), 0,
- "   by earlier handler for %qT", type);
+ if (warning_at (EXPR_LOCATION (handler), OPT_Wexceptions,
+ "exception of type %qT will be caught by earlier "
+ "handler", TREE_TYPE (handler)))
+   inform (EXPR_LOCATION (master), "for type %qT", type);
  break;
}
 }
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 5320e6c1e1e..4c6435d5e14 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -240,7 +240,7 @@ in the following sections.
 -Wctor-dtor-privacy  -Wno-delete-incomplete @gol
 -Wdelete-non-virtual-dtor  -Wdeprecated-copy  -Wdeprecated-copy-dtor @gol
 -Wno-deprecated-enum-enum-conversion -Wno-deprecated-enum-float-conversion @gol
--Weffc++  -Wextra-semi  -Wno-inaccessible-base @gol
+-Weffc++  -Wno-exceptions -Wextra-semi  -Wno-inaccessible-base @gol
 -Wno-inherited-variadic-ctor  -Wno-init-list-lifetime @gol
 -Wno-invalid-offsetof  -Wno-literal-suffix  -Wmismatched-tags @gol
 -Wmultiple-inheritance  -Wnamespaces  -Wnarrowing @gol
@@ -3738,6 +3738,12 @@ When selecting this option, be aware that the standard library
 headers do not obey all of these guidelines; use @samp{grep -v}
 to filter out those warnings.
 
+@item -Wno-exceptions @r{(C++ and Objective-C++ only)}
+@opindex Wexceptions
+@opindex Wno-exceptions
+Disable the warning about the case when an exception handler is shadowed by
+another handler, which can point out a wrong ordering of exception handlers.
+
 @item -Wstrict-null-sentinel @r{(C++ and Objective-C++ only)}
 @opindex Wstrict-null-sentinel
 @opindex Wno-strict-null-sentinel
diff --git a/gcc/testsuite/g++.dg/warn/Wexceptions1.C b/gcc/testsuite/g++.dg/warn/Wexceptions1.C
new file mode 100644
index 000..af140fd0dc2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wexceptions1.C
@@ -0,0 +1,9 @@
+// PR c++/97675
+
+struct Base { };
+struct Child : Base { };
+int main() {
+try { throw Child(); }
+catch (Base const&) { }
+catch (Child const&) { } // { dg-warning "exception of type .Child. will be caught by earlier handler" }
+}
diff --git a/gcc/testsuite/g++.dg/warn/Wexceptions2.C b/gcc/testsuite/g++.dg/warn/Wexceptions2.C
new file mode 100644
index 000..07c5155ac06
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wexceptions2.C
@@ -0,0 +1,10 @@
+// PR c++/97675
+// { dg-additional-options -Wno-exceptions }
+
+struct Base { };
+struct Child : Base { };
+int main() {
+try { throw Child(); }
+catch (Base const&) { }
+catch (Child const&) { } // { dg-bogus "exception of type .Child. will be caught by earlier handler" }

Re: [PATCH][AArch64] Use intrinsics for upper saturating shift right

2020-11-05 Thread David Candler via Gcc-patches
Hi Richard,

Thanks for the feedback.

Richard Sandiford  writes:
> > diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
> > index 4f33dd936c7..f93f4e29c89 100644
> > --- a/gcc/config/aarch64/aarch64-builtins.c
> > +++ b/gcc/config/aarch64/aarch64-builtins.c
> > @@ -254,6 +254,10 @@ aarch64_types_binop_imm_qualifiers[SIMD_MAX_BUILTIN_ARGS]
> >  #define TYPES_GETREG (aarch64_types_binop_imm_qualifiers)
> >  #define TYPES_SHIFTIMM (aarch64_types_binop_imm_qualifiers)
> >  static enum aarch64_type_qualifiers
> > +aarch64_types_ternop_s_imm_qualifiers[SIMD_MAX_BUILTIN_ARGS]
> > +  = { qualifier_none, qualifier_none, qualifier_none, qualifier_immediate};
> > +#define TYPES_SHIFT2IMM (aarch64_types_ternop_s_imm_qualifiers)
> > +static enum aarch64_type_qualifiers
> >  aarch64_types_shift_to_unsigned_qualifiers[SIMD_MAX_BUILTIN_ARGS]
> >= { qualifier_unsigned, qualifier_none, qualifier_immediate };
> >  #define TYPES_SHIFTIMM_USS (aarch64_types_shift_to_unsigned_qualifiers)
> > @@ -265,14 +269,16 @@ static enum aarch64_type_qualifiers
> >  aarch64_types_unsigned_shift_qualifiers[SIMD_MAX_BUILTIN_ARGS]
> >= { qualifier_unsigned, qualifier_unsigned, qualifier_immediate };
> >  #define TYPES_USHIFTIMM (aarch64_types_unsigned_shift_qualifiers)
> > +#define TYPES_USHIFT2IMM (aarch64_types_ternopu_imm_qualifiers)
> > +static enum aarch64_type_qualifiers
> > +aarch64_types_shift2_to_unsigned_qualifiers[SIMD_MAX_BUILTIN_ARGS]
> > +  = { qualifier_unsigned, qualifier_unsigned, qualifier_none, qualifier_immediate };
> > +#define TYPES_SHIFT2IMM_UUSS (aarch64_types_shift2_to_unsigned_qualifiers)
> >  
> >  static enum aarch64_type_qualifiers
> >  aarch64_types_ternop_s_imm_p_qualifiers[SIMD_MAX_BUILTIN_ARGS]
> >= { qualifier_none, qualifier_none, qualifier_poly, qualifier_immediate};
> >  #define TYPES_SETREGP (aarch64_types_ternop_s_imm_p_qualifiers)
> > -static enum aarch64_type_qualifiers
> > -aarch64_types_ternop_s_imm_qualifiers[SIMD_MAX_BUILTIN_ARGS]
> > -  = { qualifier_none, qualifier_none, qualifier_none, qualifier_immediate};
> >  #define TYPES_SETREG (aarch64_types_ternop_s_imm_qualifiers)
> >  #define TYPES_SHIFTINSERT (aarch64_types_ternop_s_imm_qualifiers)
> >  #define TYPES_SHIFTACC (aarch64_types_ternop_s_imm_qualifiers)
> 
> Very minor, but I think it would be better to keep
> aarch64_types_ternop_s_imm_qualifiers where it is and define
> TYPES_SHIFT2IMM here rather than above.  For better or worse,
> the current style seems to be to keep the defines next to the
> associated arrays, rather than group them based on the TYPES_* name.
> 
> > diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def
> > index d1b21102b2f..0b82b9c072b 100644
> > --- a/gcc/config/aarch64/aarch64-simd-builtins.def
> > +++ b/gcc/config/aarch64/aarch64-simd-builtins.def
> > @@ -285,6 +285,13 @@
> >BUILTIN_VSQN_HSDI (USHIFTIMM, uqshrn_n, 0, ALL)
> >BUILTIN_VSQN_HSDI (SHIFTIMM, sqrshrn_n, 0, ALL)
> >BUILTIN_VSQN_HSDI (USHIFTIMM, uqrshrn_n, 0, ALL)
> > +  /* Implemented by aarch64_qshrn2_n.  */
> > +  BUILTIN_VQN (SHIFT2IMM_UUSS, sqshrun2_n, 0, ALL)
> > +  BUILTIN_VQN (SHIFT2IMM_UUSS, sqrshrun2_n, 0, ALL)
> > +  BUILTIN_VQN (SHIFT2IMM, sqshrn2_n, 0, ALL)
> > +  BUILTIN_VQN (USHIFT2IMM, uqshrn2_n, 0, ALL)
> > +  BUILTIN_VQN (SHIFT2IMM, sqrshrn2_n, 0, ALL)
> > +  BUILTIN_VQN (USHIFT2IMM, uqrshrn2_n, 0, ALL)
> 
> Using ALL is a holdover from the time (until a few weeks ago) when we
> didn't record function attributes.  New intrinsics should therefore
> have something more specific than ALL.
> 
> We discussed offline whether the Q flag side effect of the intrinsics
> should be observable or not, and the conclusion was that it shouldn't.
> I think we can therefore treat these functions as pure functions,
> meaning that they should have flags NONE rather than ALL.
> 
> For that reason, I think we should also remove the Set_Neon_Cumulative_Sat
> and CHECK_CUMULATIVE_SAT parts of the test (sorry).
> 
> Other than that, the patch looks good to go.
> 
> Thanks,
> Richard

I've updated the patch with TYPES_SHIFT2IMM moved, the builtins changed
to NONE, and the Q flag portion of the tests removed.

Thanks,
David


ChangeLog
Description: ChangeLog
diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
index 4f33dd936c7..a9fc0de9de9 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -265,6 +265,11 @@ static enum aarch64_type_qualifiers
 aarch64_types_unsigned_shift_qualifiers[SIMD_MAX_BUILTIN_ARGS]
   = { qualifier_unsigned, qualifier_unsigned, qualifier_immediate };
 #define TYPES_USHIFTIMM (aarch64_types_unsigned_shift_qualifiers)
+#define TYPES_USHIFT2IMM (aarch64_types_ternopu_imm_qualifiers)
+static enum aarch64_type_qualifiers
+aarch64_types_shift2_to_unsigned_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_unsigned, qualifier_uns

[PATCH] c++: Consider only relevant template arguments in sat_hasher

2020-11-05 Thread Patrick Palka via Gcc-patches
[ This patch depends on

  c++: Use two levels of caching in satisfy_atom

  https://gcc.gnu.org/pipermail/gcc-patches/2020-November/558096.html  ]

A large source of cache misses in satisfy_atom is caused by the identity
of an (atom,args) pair within the satisfaction cache being determined by
the entire set of supplied template arguments rather than by the subset
of template arguments that the atom actually depends on.  For instance,
consider

  template <class T>
  concept range = range_v<T>;

  template <class U> void foo () requires range<U>;
  template <class U, class V> void bar () requires range<U>;

The associated constraints of foo and bar are equivalent: they both
consist of the atom range_v<T> (with mapping T -> U).  But the sat_cache
currently will never reuse a satisfaction value between the two atoms
because foo has one template parameter and bar has two, and the
satisfaction cache conservatively assumes that all template parameters
are relevant to a satisfaction value of an atom.

This patch eliminates this assumption and makes the sat_cache instead
care about just the subset of args of an (atom,args) pair that's used
in the targets of an atom's parameter mapping.
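
The idea can be sketched outside of GCC as follows.  This is only a model under stated assumptions: SatKey, hash_relevant and equal_relevant are hypothetical names, template arguments are represented as strings, and the list of relevant argument positions is passed in explicitly, whereas the patch stores it in the TREE_TYPE of the parameter mapping.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical cache key: an atom identity plus the full set of
// supplied template arguments.
struct SatKey {
  const void *atom;
  std::vector<std::string> args;
};

// Hash the atom together with only those arguments whose positions are
// listed in USED, ignoring arguments the atom does not depend on.
std::size_t hash_relevant (const SatKey &k,
                           const std::vector<std::size_t> &used)
{
  std::size_t h = std::hash<const void *>{} (k.atom);
  for (std::size_t i : used)
    {
      assert (i < k.args.size ());
      h = h * 31 + std::hash<std::string>{} (k.args[i]);
    }
  return h;
}

// Two keys are equal if they name the same atom and agree on every
// relevant argument; other arguments are not compared.
bool equal_relevant (const SatKey &a, const SatKey &b,
                     const std::vector<std::size_t> &used)
{
  if (a.atom != b.atom)
    return false;
  for (std::size_t i : used)
    {
      assert (i < a.args.size () && i < b.args.size ());
      if (a.args[i] != b.args[i])
        return false;
    }
  return true;
}
```

In this model a key built from one argument and a key built from two arguments hash and compare equal whenever the atom only uses the first argument, which is the situation of foo and bar above.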

With this patch, compile time and memory usage for the cmcstl2 test
test/algorithm/set_symmetric_difference4.cpp drop from 8.5s/1.2GB to
3.5s/0.4GB.

Bootstrapped and regtested on x86_64-pc-linux-gnu.

gcc/cp/ChangeLog:

* constraint.cc (norm_info::norm_info): Initialize orig_decl.
(norm_info::orig_decl): New data member.
(normalize_atom): When caching an atom for the first time,
compute a list of template parameters used in the targets of the
parameter mapping and store it in the TREE_TYPE of the mapping.
(sat_hasher::hash): Use this list to hash only the template
arguments that are relevant to the atom.
(satisfy_atom): Use this list to compare only the template
arguments that are relevant to the atom.
---
 gcc/cp/constraint.cc | 66 ++--
 1 file changed, 63 insertions(+), 3 deletions(-)

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index c612bfba13b..221c5b21c7f 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -616,7 +616,8 @@ struct norm_info : subst_info
 
   norm_info (tree in_decl, tsubst_flags_t complain)
 : subst_info (tf_warning_or_error | complain, in_decl),
-  context (make_context (in_decl))
+  context (make_context (in_decl)),
+  orig_decl (in_decl)
   {}
 
   bool generate_diagnostics() const
@@ -647,6 +648,12 @@ struct norm_info : subst_info
  for that check.  */
 
   tree context;
+
+  /* The declaration whose constraints we're normalizing.  The targets
+ of the parameter mapping of each atom will be in terms of template
+ parameters of ORIG_DECL.  */
+
+  tree orig_decl = NULL_TREE;
 };
 
 static tree normalize_expression (tree, tree, norm_info);
@@ -758,6 +765,28 @@ normalize_atom (tree t, tree args, norm_info info)
   tree *slot = atom_cache->find_slot (atom, INSERT);
   if (*slot)
return *slot;
+
+  /* Find all template parameters used in the targets of the parameter
+mapping, and store a list of them in the TREE_TYPE of the mapping.
+This list will be used by sat_hasher to determine the subset of
+supplied template arguments that the satisfaction value of the atom
+depends on.  */
+  if (map)
+   {
+ tree targets = make_tree_vec (list_length (map));
+ int i = 0;
+ for (tree node = map; node; node = TREE_CHAIN (node))
+   {
+ tree target = TREE_PURPOSE (node);
+ TREE_VEC_ELT (targets, i++) = target;
+   }
+ tree ctx_parms = (info.orig_decl
+   ? DECL_TEMPLATE_PARMS (info.orig_decl)
+   : current_template_parms);
+ tree target_parms = find_template_parameters (targets, ctx_parms);
+ TREE_TYPE (map) = target_parms;
+   }
+
   *slot = atom;
 }
   return atom;
@@ -2322,7 +2351,20 @@ struct sat_hasher : ggc_ptr_hash
   }
 
 hashval_t value = htab_hash_pointer (e->constr);
-return iterative_hash_template_arg (e->args, value);
+
+tree map = ATOMIC_CONSTR_MAP (e->constr);
+if (map)
+  for (tree target_parms = TREE_TYPE (map);
+  target_parms;
+  target_parms = TREE_CHAIN (target_parms))
+   {
+ int level, index;
+ tree parm = TREE_VALUE (target_parms);
+ template_parm_level_and_index (parm, &level, &index);
+ tree arg = TMPL_ARG (e->args, level, index);
+ value = iterative_hash_template_arg (arg, value);
+   }
+return value;
   }
 
   static bool equal (sat_entry *e1, sat_entry *e2)
@@ -2343,7 +2385,25 @@ struct sat_hasher : ggc_ptr_hash
the caching of ATOMIC_CONSTRs performed therein.  */
 if (e1->constr != e2->constr)
   return false;
-return template_args_equal (e1->args, e2->args);
+

[PATCH] Drop overflow from constants while building ranges in ranger.

2020-11-05 Thread Aldy Hernandez via Gcc-patches
Sometimes the overflow flag will leak into the IL.  Drop it while
creating ranges.

There are various places we could plug this.  This patch just plugs things
at get_tree_range which is the entry point for ranges from tree expressions.
It fixes the PR, and probably fixes the ranger entirely, but we may need
to revisit this.

For example, I looked to see if there were other places that created
ranges with TREE_OVERFLOW set, and there are several.  For example,
the following code pattern appears multiple times in vr-values.c:

  else if (is_gimple_min_invariant (op0))
vr0.set (op0);

This can pick up TREE_OVERFLOW from the IL if present.  However, the
ranger won't see them so we're good.

At some point we should audit all this.  Or perhaps just nuke all
TREE_OVERFLOW's at irange::set.

For now, this will do.

Pushed.

gcc/ChangeLog:

PR tree-optimization/97721
* gimple-range.cc (get_tree_range): Drop overflow from constants.

gcc/testsuite/ChangeLog:

* gcc.dg/pr97721.c: New test.
---
 gcc/gimple-range.cc|  2 ++
 gcc/testsuite/gcc.dg/pr97721.c | 13 +
 2 files changed, 15 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/pr97721.c

diff --git a/gcc/gimple-range.cc b/gcc/gimple-range.cc
index ef65e00cc1d..0c8ec40448f 100644
--- a/gcc/gimple-range.cc
+++ b/gcc/gimple-range.cc
@@ -165,6 +165,8 @@ get_tree_range (irange &r, tree expr)
   switch (TREE_CODE (expr))
 {
   case INTEGER_CST:
+   if (TREE_OVERFLOW_P (expr))
+ expr = drop_tree_overflow (expr);
r.set (expr, expr);
return true;
 
diff --git a/gcc/testsuite/gcc.dg/pr97721.c b/gcc/testsuite/gcc.dg/pr97721.c
new file mode 100644
index 000..c2a2848ba13
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr97721.c
@@ -0,0 +1,13 @@
+// { dg-do compile }
+// { dg-options "-O -fno-tree-dominator-opts" }
+
+int ot;
+
+void
+z6 (char *tw)
+{ 
+  while (ot >= 0)
+--ot;
+
+  __builtin_strcpy (&tw[ot], tw);
+}
-- 
2.26.2



Re: [PATCH] cache compute_objsize results in strlen/sprintf (PR 97373)

2020-11-05 Thread Martin Sebor via Gcc-patches

On 11/5/20 8:29 AM, Jakub Jelinek wrote:

> On Thu, Nov 05, 2020 at 08:20:20AM -0700, Martin Sebor via Gcc-patches wrote:
>> compute_objsize() and the objsz pass are completely independent.
>> The pass is also quite limited in that it doesn't make use of
>> ranges.  That limitation was also the main reason for introducing
>> the compute_objsize() function.
>>
>> I'd love to see the objsize pass and compute_objsize() integrated
>> and exposed under an interface similar to range_query, with
>> the information available anywhere, and on demand.  I might tackle
>
> As I said multiple times, that would be a serious security hazard.
> _FORTIFY_SOURCE protects against some UBs in the programs, and ranges
> are computed on the assumption that UB doesn't happen in the program,
> so relying on the ranges etc. in there is highly undesirable.


As I think has been pointed out as many times in response, failing
to handle ranges in _FORTIFY_SOURCE, or any expressions that can't
be evaluated at compile time for that matter, is the security gap.
It means that the vast majority of allocated objects (by malloc,
alloca, or VLAs) are unprotected, as are accesses into fixed-size
objects involving nonconstant offsets.

It's only thanks to compute_objsize() that GCC diagnoses (but doesn't
prevent) a small subset of buffer overflows involving such accesses,
and only those where the lower bound of the access exceeds the upper
bound of the size.  It's powerless against those where the overflow
is due to the size of the access exceeding just the lower bound of
the object size but not the upper bound.

This serious _FORTIFY_SOURCE shortcoming was recognized by the Clang
developers and rectified by introducing __builtin_dynamic_object_size.

Martin


Re: Fix uninitialized memory use in ipa-modref

2020-11-05 Thread Martin Liška

On 11/5/20 3:27 PM, Jan Hubicka wrote:

> poly_int64 offset;
> struct modref_parm_map parm_map;
> +  parm_map.parm_offset_known = false;
> +  parm_map.parm_offset = 0;
> +


I'm curious, can't we use proper C++ class construction?
The IPA pass is new, so can we make it more C++-ish? Similarly
for all newly introduced structs in mod ref.

Thanks,
Martin


[committed][PATCH] middle-end: guard slp-11b.c testcase on vec_lanes

2020-11-05 Thread Tamar Christina via Gcc-patches
Hi All,

They say third time's the charm. It looks like the testcase
disables the cost model and so on AArch64 we end up being able to
do the permute but on x86 we can't.  However when analyzing the
testcase I didn't disable the cost model, hence the difference.

So I now guard the testcase on vect_load_lanes as there's not a
"can do any permute" test directive and load lanes is what I will
be fixing up next year so this should catch it.

Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu (through the test 
harness) and no issues.

Committed under the obvious rule...

Thanks,
Tamar

gcc/testsuite/ChangeLog:

* gcc.dg/vect/slp-11b.c: Guard statements.

-- 
diff --git a/gcc/testsuite/gcc.dg/vect/slp-11b.c b/gcc/testsuite/gcc.dg/vect/slp-11b.c
index 0cc23770badf0e00ef98769a2dd14a92dca32cca..0aece8092a83ebd5fbdcd8257537a6bb3838a2c2 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-11b.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-11b.c
@@ -45,4 +45,5 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { vect_strided4 && vect_int_mult } } } } */
 /* { dg-final { scan-tree-dump-times "vectorized 0 loops" 1 "vect" { target { ! { vect_strided4 && vect_int_mult } } } } } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 0 "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 0 "vect" { target { ! vect_load_lanes } } } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target { vect_load_lanes } } } } */



Re: [00/32] C++ 20 Modules

2020-11-05 Thread Nathan Sidwell

On 11/5/20 2:08 AM, Boris Kolpackov wrote:



> To give an example of such a likely change, currently the mapper
> has a notion of the central module repository directory that is
> used to resolve all the relative CMI (compiled module interface[1])
> paths (even paths like ./foo.gcm). However, this model will not
> apply to all build systems. For example, in build2 (the build
> system I am involved with), there can be no such central place
> since a project can pull dependencies that are built in other
> places. Currently, the only way to disable this repository
> semantics is to use absolute CMI paths throughout.


The repo is providing a mechanism by which two processes can synchronize 
on a fixed location in the file system that is not /.  You need such a 
capability as the file system is the bulk transfer mechanism.


The alternatives are to always use absolute paths, or require the two 
ends of the communication to have the same working directory, or have 
one end of the communication map file system locations into the other 
end's view.  That'll probably require knowing some fixed point, which 
you have to figure out how to synchronize, and we're back to defining more 
fixed points in the file system.


The location of the repo is entirely under the mapper-server's control. 
Set it to / if you want.
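
The resolution rule discussed in this exchange (relative CMI paths are
taken against the repo directory, absolute paths bypass it) can be
sketched roughly as below.  This is only an illustration under stated
assumptions: resolve_cmi is a hypothetical helper, not a function in
GCC or the mapper, and it assumes POSIX-style paths and C++17
std::filesystem.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

// Hypothetical helper (not part of GCC or its module mapper): resolve a
// CMI path the way the thread describes.  A relative path is taken
// against the repo directory; an absolute path is returned unchanged,
// because operator/ discards the left side when the right is absolute.
inline std::string
resolve_cmi (const std::filesystem::path &repo,
             const std::filesystem::path &cmi)
{
  return (repo / cmi).lexically_normal ().generic_string ();
}
```

With a repo of "/" every relative name simply becomes root-relative,
which is one way to read the "set it to /" suggestion; a build system
that wants no repo semantics at all would pass absolute names.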


nathan
--
Nathan Sidwell


Re: Fix uninitialized memory use in ipa-modref

2020-11-05 Thread Jan Hubicka
> On 11/5/20 3:27 PM, Jan Hubicka wrote:
> > poly_int64 offset;
> > struct modref_parm_map parm_map;
> > +  parm_map.parm_offset_known = false;
> > +  parm_map.parm_offset = 0;
> > +
> 
> I'm curious, can't we use a proper C++ class construction.
> The IPA pass is new and so we can make it more C++-ish? Similarly
> for all newly introduced structs in mod ref.

We can't because our vec does not accept non-pods and this needs to be
GGC safe since it points to trees.
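
To illustrate the language-level point only (this says nothing about
what vec actually checks): adding a user-provided constructor makes a
struct non-trivial, which is why explicit field initialization is used
instead.  The two structs below are hypothetical, simplified stand-ins
for modref_parm_map, with long in place of poly_int64.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in: plain aggregate, no constructor.
struct parm_map_plain
{
  int parm_index;
  bool parm_offset_known;
  long parm_offset;
};

// Same fields, but with a user-provided default constructor.
struct parm_map_with_ctor
{
  int parm_index;
  bool parm_offset_known;
  long parm_offset;

  parm_map_with_ctor ()
    : parm_index (-1), parm_offset_known (false), parm_offset (0)
  {}
};

// Explicit initialization keeps the plain struct trivial while still
// avoiding reads of uninitialized fields.
inline parm_map_plain
make_unknown_parm_map ()
{
  parm_map_plain m;
  m.parm_index = -1;
  m.parm_offset_known = false;
  m.parm_offset = 0;
  return m;
}
```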

Honza
> 
> Thanks,
> Martin


Re: [ping] aarch64: move and adjust PROBE_STACK_*_REG

2020-11-05 Thread Olivier Hainque
Hi Richard,

> On 4 Nov 2020, at 20:04, Richard Sandiford  wrote:
> 
> It's a bit concerning that the second register now overlaps
> STACK_CLASH_SVE_CFA_REGNUM, but I agree that isn't a problem
> in practice, since the two uses are currently mutually-exclusive.

> I think it might be worth having a comment about that,  So maybe add:
> 
>;; Note that the use of these registers is mutually exclusive with the use
>;; of STACK_CLASH_SVE_CFA_REGNUM, which is for -fstack-clash-protection
>;; rather than -fstack-check.
> 
> to the new comment above.

Sure. Yes, the two stack checking modes are definitely
exclusive.

> OK with that change, thanks.  Sorry for the long delay in the review.

Great :) No pb. Thanks for your feedback!

Best Regards,

Olivier



Re: Fix uninitialized memory use in ipa-modref

2020-11-05 Thread Jan Hubicka
> > On 11/5/20 3:27 PM, Jan Hubicka wrote:
> > > poly_int64 offset;
> > > struct modref_parm_map parm_map;
> > > +  parm_map.parm_offset_known = false;
> > > +  parm_map.parm_offset = 0;
> > > +
> > 
> > I'm curious, can't we use a proper C++ class construction.
> > The IPA pass is new and so we can make it more C++-ish? Similarly
> > for all newly introduced structs in mod ref.
> 
> We can't because our vec does not accept non-pods and this needs to be
> GGC safe since it points to trees.

We could probably add construction of writes_errno even though in a correct
run it should never be used (in analysis we need to be able to
reinitialize and during stream in we will always stream it in).
What else do you think can be more C++-ish? The pass even has two
templates :).

diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c
index b40f3da3ba2..e80f6de09f2 100644
--- a/gcc/ipa-modref.c
+++ b/gcc/ipa-modref.c
@@ -124,7 +124,7 @@ static GTY(()) fast_function_summary 
 /* Summary for a single function which this pass produces.  */
 
 modref_summary::modref_summary ()
-  : loads (NULL), stores (NULL)
+  : loads (NULL), stores (NULL), writes_errno (NULL)
 {
 }
 


[committed 1/2] libstdc++: Export basic_stringbuf constructor [PR 97729]

2020-11-05 Thread Jonathan Wakely via Gcc-patches
libstdc++-v3/ChangeLog:

PR libstdc++/97729
* config/abi/pre/gnu.ver (GLIBCXX_3.4.29): Add exports.
* src/c++20/sstream-inst.cc (basic_stringbuf): Instantiate
private constructor taking __xfer_bufptrs.

Tested powerpc64le-linux. Committed to trunk.

commit 50b840ac5e1d6534e345c3fee9a97ae45ced6bc7
Author: Jonathan Wakely 
Date:   Thu Nov 5 13:41:40 2020

libstdc++: Export basic_stringbuf constructor [PR 97729]

libstdc++-v3/ChangeLog:

PR libstdc++/97729
* config/abi/pre/gnu.ver (GLIBCXX_3.4.29): Add exports.
* src/c++20/sstream-inst.cc (basic_stringbuf): Instantiate
private constructor taking __xfer_bufptrs.

diff --git a/libstdc++-v3/config/abi/pre/gnu.ver b/libstdc++-v3/config/abi/pre/gnu.ver
index 707539a17c3a..ed68ffa28723 100644
--- a/libstdc++-v3/config/abi/pre/gnu.ver
+++ b/libstdc++-v3/config/abi/pre/gnu.ver
@@ -2346,6 +2346,7 @@ GLIBCXX_3.4.29 {
 
 # basic_stringbuf::basic_stringbuf(basic_stringbuf&&, allocator const&)
 _ZNSt7__cxx1115basic_stringbufI[cw]St11char_traitsI[cw]ESaI[cw]EEC[12]EOS4_RKS3_;
+_ZNSt7__cxx1115basic_stringbufI[cw]St11char_traitsI[cw]ESaI[cw]EEC[12]EOS4_RKS3_ONS4_14__xfer_bufptrsE;
 
 # basic_stringbuf::get_allocator()
 _ZNKSt7__cxx1115basic_stringbufI[cw]St11char_traitsI[cw]ESaI[cw]EE13get_allocatorEv;
diff --git a/libstdc++-v3/src/c++20/sstream-inst.cc b/libstdc++-v3/src/c++20/sstream-inst.cc
index ada3eabac1f5..7d275de5cc24 100644
--- a/libstdc++-v3/src/c++20/sstream-inst.cc
+++ b/libstdc++-v3/src/c++20/sstream-inst.cc
@@ -41,6 +41,9 @@ template 
basic_stringbuf::basic_stringbuf(__string_type&&,
ios_base::openmode);
 template basic_stringbuf::basic_stringbuf(basic_stringbuf&&,
const allocator_type&);
+template basic_stringbuf::basic_stringbuf(basic_stringbuf&&,
+   const allocator_type&,
+   __xfer_bufptrs&&);
 template basic_stringbuf::allocator_type
 basic_stringbuf::get_allocator() const noexcept;
 template string_view
@@ -75,6 +78,9 @@ template 
basic_stringbuf::basic_stringbuf(__string_type&&,
   ios_base::openmode);
 template basic_stringbuf::basic_stringbuf(basic_stringbuf&&,
   const allocator_type&);
+template basic_stringbuf::basic_stringbuf(basic_stringbuf&&,
+  const allocator_type&,
+  __xfer_bufptrs&&);
 template basic_stringbuf::allocator_type
 basic_stringbuf::get_allocator() const noexcept;
 


[committed] libstdc++: Use non-throwing increment in recursive_directory_iterator [PR 97731]

2020-11-05 Thread Jonathan Wakely via Gcc-patches
As described in the PR, the recursive_directory_iterator constructor
calls advance(ec), but ec is a pointer so it calls _Dir::advance(bool).
The intention was to either call advance() or advance(*ec) depending
on whether the pointer is null or not.

This fixes the bug and renames the parameter to ecptr to make similar
mistakes less likely in future.
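
The overload mix-up is easy to reproduce in isolation.  The following
is a minimal sketch, not the libstdc++ source: Dir is a hypothetical,
simplified stand-in for _Dir with only two overloads, and it records
which one was selected.  A pointer argument cannot bind to
error_code&, but it does convert to bool, so the bool overload wins.

```cpp
#include <cassert>
#include <system_error>

// Hypothetical stand-in for the internal _Dir type.
struct Dir
{
  enum class Called { none, with_bool, with_error_code };
  Called last = Called::none;

  // Throwing form; the flag is an unrelated option.
  bool advance (bool skip_permission_denied = false)
  {
    (void) skip_permission_denied;
    last = Called::with_bool;
    return true;
  }

  // Non-throwing form that reports errors through ec.
  bool advance (std::error_code &ec)
  {
    ec.clear ();
    last = Called::with_error_code;
    return true;
  }
};

// The buggy call shape: the pointer silently converts to bool.
inline Dir::Called
call_with_pointer (std::error_code *ecptr)
{
  Dir d;
  d.advance (ecptr);
  return d.last;
}

// The fixed call shape: dispatch on whether the pointer is null.
inline Dir::Called
call_dispatched (std::error_code *ecptr)
{
  Dir d;
  ecptr ? d.advance (*ecptr) : d.advance ();
  return d.last;
}
```

Passing a non-null pointer to call_with_pointer selects the bool
overload, whereas call_dispatched reaches the error_code overload.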

libstdc++-v3/ChangeLog:

PR libstdc++/97731
* src/filesystem/dir.cc (recursive_directory_iterator): Call the
right overload of _Dir::advance.
* testsuite/experimental/filesystem/iterators/97731.cc: New test.

Tested powerpc64le-linux. Committed to trunk.

I'll backport this to all branches too.


commit 2f93a2a03a343a29f614a530d7657f1ed6347ed5
Author: Jonathan Wakely 
Date:   Thu Nov 5 17:26:13 2020

libstdc++: Use non-throwing increment in recursive_directory_iterator [PR 97731]

As described in the PR, the recursive_directory_iterator constructor
calls advance(ec), but ec is a pointer so it calls _Dir::advance(bool).
The intention was to either call advance() or advance(*ec) depending
on whether the pointer is null or not.

This fixes the bug and renames the parameter to ecptr to make similar
mistakes less likely in future.

libstdc++-v3/ChangeLog:

PR libstdc++/97731
* src/filesystem/dir.cc (recursive_directory_iterator): Call the
right overload of _Dir::advance.
* testsuite/experimental/filesystem/iterators/97731.cc: New test.

diff --git a/libstdc++-v3/src/filesystem/dir.cc b/libstdc++-v3/src/filesystem/dir.cc
index 86aee2ded51a..5109897abbde 100644
--- a/libstdc++-v3/src/filesystem/dir.cc
+++ b/libstdc++-v3/src/filesystem/dir.cc
@@ -187,16 +187,16 @@ struct fs::recursive_directory_iterator::_Dir_stack : std::stack<_Dir>
 
 fs::recursive_directory_iterator::
 recursive_directory_iterator(const path& p, directory_options options,
- error_code* ec)
+ error_code* ecptr)
 : _M_options(options), _M_pending(true)
 {
-  if (ec)
-ec->clear();
   if (posix::DIR* dirp = posix::opendir(p.c_str()))
 {
+  if (ecptr)
+   ecptr->clear();
   auto sp = std::make_shared<_Dir_stack>();
   sp->push(_Dir{ dirp, p });
-  if (sp->top().advance(ec))
+  if (ecptr ? sp->top().advance(*ecptr) : sp->top().advance())
_M_dirs.swap(sp);
 }
   else
@@ -204,14 +204,18 @@ recursive_directory_iterator(const path& p, directory_options options,
   const int err = errno;
   if (err == EACCES
  && is_set(options, fs::directory_options::skip_permission_denied))
-   return;
+   {
+ if (ecptr)
+   ecptr->clear();
+ return;
+   }
 
-  if (!ec)
+  if (!ecptr)
_GLIBCXX_THROW_OR_ABORT(filesystem_error(
  "recursive directory iterator cannot open directory", p,
  std::error_code(err, std::generic_category())));
 
-  ec->assign(err, std::generic_category());
+  ecptr->assign(err, std::generic_category());
 }
 }
 
diff --git a/libstdc++-v3/testsuite/experimental/filesystem/iterators/97731.cc b/libstdc++-v3/testsuite/experimental/filesystem/iterators/97731.cc
new file mode 100644
index ..c6a9d5663fe2
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/filesystem/iterators/97731.cc
@@ -0,0 +1,49 @@
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// .
+
+// { dg-options "-DUSE_FILESYSTEM_TS -lstdc++fs" }
+// { dg-do run { target c++11 } }
+// { dg-require-filesystem-ts "" }
+
+#include 
+#include 
+#include 
+
+bool used_custom_readdir = false;
+
+extern "C" void* readdir(void*)
+{
+  used_custom_readdir = true;
+  errno = EIO;
+  return nullptr;
+}
+
+void
+test01()
+{
+  using std::experimental::filesystem::recursive_directory_iterator;
+  std::error_code ec;
+  recursive_directory_iterator it(".", ec);
+  if (used_custom_readdir)
+VERIFY( ec.value() == EIO );
+}
+
+int
+main()
+{
+  test01();
+}


Re: [committed 1/2] libstdc++: Fix multiple definitions of std::exception_ptr functions [PR 97729]

2020-11-05 Thread Jonathan Wakely via Gcc-patches

This fixes some multiple definition errors caused by the changes for
PR libstdc++/90295. The previous solution for inlining the members of
std::exception_ptr but still exporting them from the library was to
suppress the 'inline' keyword on those functions when compiling
libsupc++/eh_ptr.cc, so they get defined in that file. That produces ODR
violations though, because there are now both inline and non-inline
definitions in the library, due to the use of std::exception_ptr in
other files such as src/c++11/future.cc.

The new solution is to define all the relevant members as 'inline'
unconditionally, but use __attribute__((used)) to cause definitions to
be emitted in libsupc++/eh_ptr.cc as before. This doesn't quite work
however, because PR c++/67453 means the attribute is ignored on
constructors and destructors. As a workaround, the old solution
(conditionally inline) is still used for those members, but they are
given the always_inline attribute so that they aren't emitted in
src/c++11/future.o as inline definitions.


Tested powerpc64le-linux. Committed to trunk.


commit 710508c7b1a2c8e1d75d4c4f1ac79473dbf2b2bb
Author: Jonathan Wakely 
Date:   Thu Nov 5 16:19:15 2020

libstdc++: Fix multiple definitions of std::exception_ptr functions [PR 97729]

This fixes some multiple definition errors caused by the changes for
PR libstdc++/90295. The previous solution for inlining the members of
std::exception_ptr but still exporting them from the library was to
suppress the 'inline' keyword on those functions when compiling
libsupc++/eh_ptr.cc, so they get defined in that file. That produces ODR
violations though, because there are now both inline and non-inline
definitions in the library, due to the use of std::exception_ptr in
other files such as src/c++11/future.cc.

The new solution is to define all the relevant members as 'inline'
unconditionally, but use __attribute__((used)) to cause definitions to
be emitted in libsupc++/eh_ptr.cc as before. This doesn't quite work
however, because PR c++/67453 means the attribute is ignored on
constructors and destructors. As a workaround, the old solution
(conditionally inline) is still used for those members, but they are
given the always_inline attribute so that they aren't emitted in
src/c++11/future.o as inline definitions.

libstdc++-v3/ChangeLog:

PR libstdc++/97729
* include/std/future (__basic_future::_M_get_result): Use
nullptr for null pointer constant.
* libsupc++/eh_ptr.cc (operator==, operator!=): Remove
definitions.
* libsupc++/exception_ptr.h (_GLIBCXX_EH_PTR_USED): Define
macro to conditionally add __attribute__((__used__)).
(operator==, operator!=, exception_ptr::exception_ptr())
(exception_ptr::exception_ptr(const exception_ptr&))
(exception_ptr::~exception_ptr())
(exception_ptr::operator=(const exception_ptr&))
(exception_ptr::swap(exception_ptr&)): Always define as
inline. Add macro to be conditionally "used".

diff --git a/libstdc++-v3/include/std/future b/libstdc++-v3/include/std/future
index 3c2aaa1fab19..5d948018c75c 100644
--- a/libstdc++-v3/include/std/future
+++ b/libstdc++-v3/include/std/future
@@ -709,7 +709,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   {
 _State_base::_S_check(_M_state);
 _Result_base& __res = _M_state->wait();
-if (!(__res._M_error == 0))
+if (!(__res._M_error == nullptr))
   rethrow_exception(__res._M_error);
 return static_cast<__result_type>(__res);
   }
diff --git a/libstdc++-v3/libsupc++/eh_ptr.cc b/libstdc++-v3/libsupc++/eh_ptr.cc
index c41bdca234c7..7e6863550ce4 100644
--- a/libstdc++-v3/libsupc++/eh_ptr.cc
+++ b/libstdc++-v3/libsupc++/eh_ptr.cc
@@ -25,7 +25,12 @@
 #include 
 #include "eh_atomics.h"
 
+#if ! _GLIBCXX_INLINE_VERSION
+// This macro causes exception_ptr to declare an older API (with corresponding
+// definitions in this file) and to mark some inline functions as "used" so
+// that definitions will be emitted in this translation unit.
 #define _GLIBCXX_EH_PTR_COMPAT
+#endif
 
 #include 
 #include 
@@ -61,6 +66,8 @@ static_assert( adjptr<__cxa_exception>()
 #endif
 }
 
+// Define non-inline functions.
+
 std::__exception_ptr::exception_ptr::exception_ptr(void* obj) noexcept
 : _M_exception_object(obj)  { _M_addref(); }
 
@@ -130,19 +137,6 @@ std::__exception_ptr::exception_ptr::__cxa_exception_type() const noexcept
   return eh->exceptionType;
 }
 
-// Retained for compatibility with CXXABI_1.3.12.
-bool
-std::__exception_ptr::operator==(const exception_ptr& lhs,
- const exception_ptr& rhs) noexcept
-{ return lhs._M_exception_object == rhs._M_exception_object; }
-
-// Retained for compatibility with CXXABI_1.3.12.
-bool
-std::__exception_ptr::operator!=(const exception_ptr& lhs,
- const exception_ptr& rhs) no

Re: deprecations in OpenMP 5.0

2020-11-05 Thread Kwok Cheung Yeung

On 04/11/2020 2:33 pm, Jakub Jelinek wrote:

LGTM, except:


+  omp_lock_hint_contended __GOMP_DEPRECATED_5_0 = omp_sync_hint_contended,
omp_sync_hint_nonspeculative = 4,
-  omp_lock_hint_nonspeculative = omp_sync_hint_nonspeculative,
+  omp_lock_hint_nonspeculative __GOMP_DEPRECATED_5_0 = 
omp_sync_hint_nonspeculative,


The above line is too long and needs wrapping.



Fixed.


But it would be nice to also add -Wno-deprecated to dg-additional-options of
tests that do use those.
Perhaps for testing replace the 201811 temporarily with 201511 and run make
check.



I have run the tests (with _OPENMP >= 201511) and added 
-Wno-deprecated-declarations option to the testcases that trigger the 
deprecation warning.


I also found a bug in the previous version of the patch - C++ doesn't like 
having an attribute come before the throw clause at the end of a function 
declaration. This is now fixed.
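
As a rough sketch of the pattern (all names here are hypothetical demo_
stand-ins, and the macro definition is an assumption about how such a
guard could look, not a copy of omp.h.in): the deprecation attribute is
enabled only from a given version on, and on a function declaration it
is placed after the exception specification.

```cpp
#include <cassert>

// Hypothetical stand-in for _OPENMP.  It is kept below 201811 here so
// that the demo compiles without deprecation warnings.
#define DEMO_OPENMP_VERSION 201511

#if defined(__GNUC__) && DEMO_OPENMP_VERSION >= 201811
# define DEMO_DEPRECATED_5_0 __attribute__((__deprecated__))
#else
# define DEMO_DEPRECATED_5_0
#endif

enum demo_sync_hint
{
  demo_sync_hint_contended = 2,
  // Old spelling kept as an alias and marked deprecated when enabled.
  demo_lock_hint_contended DEMO_DEPRECATED_5_0 = demo_sync_hint_contended
};

// The attribute follows the exception specification.
extern int demo_get_nested () noexcept DEMO_DEPRECATED_5_0;

int
demo_get_nested () noexcept
{
  return 0;
}
```

Raising DEMO_OPENMP_VERSION to 201811 would make uses of the two
deprecated names warn under -Wdeprecated-declarations, which is why the
affected tests add -Wno-deprecated-declarations.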


Bootstrapped on x86_64 with no offloading, and tested with nvptx offloading. Is 
this version okay for trunk?



--- a/libgomp/omp_lib.f90.in
+++ b/libgomp/omp_lib.f90.in
@@ -644,4 +644,8 @@
end function
  end interface
  
+#if _OPENMP >= 201811

+!GCC$ ATTRIBUTES DEPRECATED :: omp_get_nested, omp_set_nested
+#endif
+
end module omp_lib


Also, what about omp_lib.h?  Do you plan to change it only when we switch
_OPENMP macro?  I mean, we can't rely on preprocessing in that case...



Since we can't rely on having access to the preprocessor, I don't see what else 
we could do at the moment, except maybe extend the DEPRECATED attribute to take 
a condition (openmp_version >= 201811), and not print when false? Probably more 
trouble than it is worth, and it differs from the behaviour of the C attribute.


Kwok
commit a944f2ab445bb226f65239429d13efdf69a98e4b
Author: Kwok Cheung Yeung 
Date:   Thu Nov 5 10:11:23 2020 -0800

openmp: Mark deprecated symbols in OpenMP 5.0

2020-11-05  Ulrich Drepper  
Kwok Cheung Yeung  

libgomp/
* Makefile.am (%.mod): Add -cpp and -fopenmp to compile flags.
* Makefile.in: Regenerate.
* fortran.c: Wrap uses of omp_set_nested and omp_get_nested with
pragmas to ignore -Wdeprecated-declarations warnings.
* icv.c: Likewise.
* omp.h.in (__GOMP_DEPRECATED_5_0): Define.
Mark omp_lock_hint_* enum values, omp_lock_hint_t, omp_set_nested,
and omp_get_nested with __GOMP_DEPRECATED_5_0.
* omp_lib.f90.in: Mark omp_get_nested and omp_set_nested as
deprecated.
* testsuite/libgomp.c++/affinity-1.C: Add -Wno-deprecated-declarations
to test options.
* testsuite/libgomp.c/affinity-1.c: Likewise.
* testsuite/libgomp.c/affinity-2.c: Likewise.
* testsuite/libgomp.c/appendix-a/a.15.1.c: Likewise.
* testsuite/libgomp.c/lib-1.c: Likewise.
* testsuite/libgomp.c/nested-1.c: Likewise.
* testsuite/libgomp.c/nested-2.c: Likewise.
* testsuite/libgomp.c/nested-3.c: Likewise.
* testsuite/libgomp.c/pr32362-1.c: Likewise.
* testsuite/libgomp.c/pr32362-2.c: Likewise.
* testsuite/libgomp.c/pr32362-3.c: Likewise.
* testsuite/libgomp.c/pr35549.c: Likewise.
* testsuite/libgomp.c/pr42942.c: Likewise.
* testsuite/libgomp.c/pr61200.c: Likewise.
* testsuite/libgomp.c/sort-1.c: Likewise.
* testsuite/libgomp.c/target-5.c: Likewise.
* testsuite/libgomp.c/target-6.c: Likewise.
* testsuite/libgomp.c/teams-1.c: Likewise.
* testsuite/libgomp.c/thread-limit-1.c: Likewise.
* testsuite/libgomp.c/thread-limit-2.c: Likewise.
* testsuite/libgomp.c/thread-limit-4.c: Likewise.
* testsuite/libgomp.fortran/affinity1.f90: Likewise.
* testsuite/libgomp.fortran/lib1.f90: Likewise.
* testsuite/libgomp.fortran/lib2.f: Likewise.
* testsuite/libgomp.fortran/nested1.f90: Likewise.
* testsuite/libgomp.fortran/teams1.f90: Likewise.

diff --git a/libgomp/Makefile.am b/libgomp/Makefile.am
index 586c930..4cf1f58 100644
--- a/libgomp/Makefile.am
+++ b/libgomp/Makefile.am
@@ -92,7 +92,7 @@ openacc_kinds.mod: openacc.mod
 openacc.mod: openacc.lo
:
 %.mod: %.f90
-   $(FC) $(FCFLAGS) -fsyntax-only $<
+   $(FC) $(FCFLAGS) -cpp -fopenmp -fsyntax-only $<
 fortran.lo: libgomp_f.h
 fortran.o: libgomp_f.h
 env.lo: libgomp_f.h
diff --git a/libgomp/Makefile.in b/libgomp/Makefile.in
index 00d5e29..eb868b3 100644
--- a/libgomp/Makefile.in
+++ b/libgomp/Makefile.in
@@ -1382,7 +1382,7 @@ openacc_kinds.mod: openacc.mod
 openacc.mod: openacc.lo
:
 %.mod: %.f90
-   $(FC) $(FCFLAGS) -fsyntax-only $<
+   $(FC) $(FCFLAGS) -cpp -fopenmp -fsyntax-only $<
 fortran.lo: libgomp_f.h
 fortran.o: libgomp_f.h
 env.lo: libgomp_f.h
diff --git a/libgomp/fortran.c b/libgomp/fortran.c
index 029dec1..cd719f9 100644
--- a/libgomp/fortran.c
+++ b/libgomp/fortran.c
@@ -47,10 +47,13 @@ ialias_redirect (omp_test_loc

Re: deprecations in OpenMP 5.0

2020-11-05 Thread Jakub Jelinek via Gcc-patches
On Thu, Nov 05, 2020 at 06:18:11PM +, Kwok Cheung Yeung wrote:
> I have run the tests (with _OPENMP >= 201511) and added
> -Wno-deprecated-declarations option to the testcases that trigger the
> deprecation warning.
> 
> I also found a bug in the previous version of the patch - C++ doesn't like
> having an attribute come before the throw clause at the end of a function
> declaration. This is now fixed.
> 
> Bootstrapped on x86_64 with no offloading, and tested with nvptx offloading.
> Is this version okay for trunk?

Ok, thanks.

Jakub



[patch] vxworks, aarch64: Handle use of r18 as a TCB pointer

2020-11-05 Thread Olivier Hainque

Instead of #define TARGET_OS_USES_R18 which is not
handled, pick R9 as an alternate static chain regnum
and document that the port needs to be configured to
issue -ffixed-r18 by default.

r9 is now available after the approval at

  https://gcc.gnu.org/pipermail/gcc-patches/2020-November/558094.html


We have been using this variant with gcc-9 in-house
for a while and I verified that I could get a build
+ reasonable test results with gcc-10.

Olivier

2020-11-04  Olivier Hainque  

gcc/
* config/aarch64/aarch64-vxworks.h (TARGET_OS_USES_R18): Remove
definition.
(STATIC_CHAIN_REGNUM): Redefine to 9.

--- a/gcc/config/aarch64/aarch64-vxworks.h
+++ b/gcc/config/aarch64/aarch64-vxworks.h
@@ -60,12 +60,14 @@ along with GCC; see the file COPYING3.  If not see
 #undef STACK_CHECK_PROTECT
 #define STACK_CHECK_PROTECT 16384
 
-/* The VxWorks environment on aarch64 is llvm-based only, uses R18 as
-   a TCB pointer.  */
-
+/* The VxWorks environment on aarch64 is llvm-based.  */
 #undef VXWORKS_PERSONALITY
 #define VXWORKS_PERSONALITY "llvm"
 
-#undef  TARGET_OS_USES_R18
-#define TARGET_OS_USES_R18 1
+/* VxWorks uses R18 as a TCB pointer.  We must pick something else as
+   the static chain and R18 needs to be claimed "fixed".  Until we
+   arrange to override the common parts of the port family to
+   acknowledge the latter, configure --with-specs="-ffixed-r18".  */
+#undef  STATIC_CHAIN_REGNUM
+#define STATIC_CHAIN_REGNUM 9
 
-- 
2.17.1



Re: [00/32] C++ 20 Modules

2020-11-05 Thread Richard Biener via Gcc-patches
On November 5, 2020 4:25:23 PM GMT+01:00, Nathan Sidwell  wrote:
>On 11/5/20 8:33 AM, Richard Biener wrote:
>
>> Moving the module mapper to a more easily (build-)testable location
>> and to a place where host dependences can be more easily fixed
>> & customized than in a bootstrapped directory would be nice.  Thus,
>> I think the module mapper should be in the toplevel somehow
>> and independently buildable.
>
>Ok, that makes sense.  It is where it is, because originally it was much
>more tightly coupled with cc1plus.
>
>The mapper-server and cc1plus do share some (maybe just one?) obj files.
>The in-process resolving and the server's default have the same
>functionality.
>
>For bootstrap cc1plus needs them, so I guess they should remain in
>gcc/cp/?  The alternative would be to put them in new mapper-server dir

Guess some file you can include from the mapper dir (and thus build it twice) 
would work? I'm not suggesting another static library; if anything, maybe 
libiberty, if the thing is remotely generic. 

>and have it provide some kind of library that cc1plus could link with. 
>However that'll probably mess up bootstrap.
>
>Having a --with-module-mapper configure option seems sensible.
>
>nathan



[committed] libstdc++: Fix constraints on std::optional comparisons [PR 96269]

2020-11-05 Thread Jonathan Wakely via Gcc-patches
The relational operators for std::optional were using the wrong types
in the declval expressions used to constrain them. Instead of using
const lvalues they were using non-const rvalues, which meant that a type
might satisfy the constraints but then give an error when the function
body was instantiated.
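
A minimal sketch of the mismatch, using hypothetical names (Picky,
eq_rvalue, eq_const_lvalue) that are not part of libstdc++: a type
whose operator== accepts only non-const rvalues passes a constraint
written with declval of the plain type, yet cannot be compared through
the const references that the operator bodies actually use.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical type: equality is declared only for non-const rvalues.
struct Picky
{
  int v;
  friend bool operator== (Picky &&a, Picky &&b) { return a.v == b.v; }
};

// C++11-compatible void_t helper.
template<typename...> struct make_void { using type = void; };

// Old-style check: operands are non-const rvalues.
template<typename T, typename U, typename = void>
struct eq_rvalue : std::false_type {};

template<typename T, typename U>
struct eq_rvalue<T, U, typename make_void<
    decltype (std::declval<T> () == std::declval<U> ())>::type>
  : std::true_type {};

// Check matching the function bodies: operands are const lvalues.
template<typename T, typename U, typename = void>
struct eq_const_lvalue : std::false_type {};

template<typename T, typename U>
struct eq_const_lvalue<T, U, typename make_void<
    decltype (std::declval<const T &> () == std::declval<const U &> ())>::type>
  : std::true_type {};
```

For Picky the first trait is true and the second is false, which is the
gap the description refers to; for ordinary types such as int both hold.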

libstdc++-v3/ChangeLog:

PR libstdc++/96269
* include/std/optional (operator==, operator!=, operator<)
(operator>, operator<=, operator>=): Fix types used in
SFINAE constraints.
* testsuite/20_util/optional/relops/96269.cc: New test.

Tested powerpc64le-linux. Committed to trunk.

I'll backport to gcc-10 too.


commit cdd2d448d8200ed5ebcb232163954367b553291e
Author: Jonathan Wakely 
Date:   Thu Nov 5 18:36:19 2020

libstdc++: Fix constraints on std::optional comparisons [PR 96269]

The relational operators for std::optional were using the wrong types
in the declval expressions used to constrain them. Instead of using
const lvalues they were using non-const rvalues, which meant that a type
might satisfy the constraints but then give an error when the function
body was instantiated.

libstdc++-v3/ChangeLog:

PR libstdc++/96269
* include/std/optional (operator==, operator!=, operator<)
(operator>, operator<=, operator>=): Fix types used in
SFINAE constraints.
* testsuite/20_util/optional/relops/96269.cc: New test.

diff --git a/libstdc++-v3/include/std/optional b/libstdc++-v3/include/std/optional
index f9f42efe09ce..5ea5b39d0e69 100644
--- a/libstdc++-v3/include/std/optional
+++ b/libstdc++-v3/include/std/optional
@@ -1002,11 +1002,41 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 using __optional_relop_t =
   enable_if_t::value, bool>;
 
+  template
+using __optional_eq_t = __optional_relop_t<
+  decltype(std::declval() == std::declval())
+  >;
+
+  template
+using __optional_ne_t = __optional_relop_t<
+  decltype(std::declval() != std::declval())
+  >;
+
+  template
+using __optional_lt_t = __optional_relop_t<
+  decltype(std::declval() < std::declval())
+  >;
+
+  template
+using __optional_gt_t = __optional_relop_t<
+  decltype(std::declval() > std::declval())
+  >;
+
+  template
+using __optional_le_t = __optional_relop_t<
+  decltype(std::declval() <= std::declval())
+  >;
+
+  template
+using __optional_ge_t = __optional_relop_t<
+  decltype(std::declval() >= std::declval())
+  >;
+
   // Comparisons between optional values.
   template
 constexpr auto
 operator==(const optional<_Tp>& __lhs, const optional<_Up>& __rhs)
--> __optional_relop_t() == declval<_Up>())>
+-> __optional_eq_t<_Tp, _Up>
 {
   return static_cast(__lhs) == static_cast(__rhs)
 && (!__lhs || *__lhs == *__rhs);
@@ -1015,7 +1045,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 constexpr auto
 operator!=(const optional<_Tp>& __lhs, const optional<_Up>& __rhs)
--> __optional_relop_t() != declval<_Up>())>
+-> __optional_ne_t<_Tp, _Up>
 {
   return static_cast(__lhs) != static_cast(__rhs)
|| (static_cast(__lhs) && *__lhs != *__rhs);
@@ -1024,7 +1054,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 constexpr auto
 operator<(const optional<_Tp>& __lhs, const optional<_Up>& __rhs)
--> __optional_relop_t() < declval<_Up>())>
+-> __optional_lt_t<_Tp, _Up>
 {
   return static_cast(__rhs) && (!__lhs || *__lhs < *__rhs);
 }
@@ -1032,7 +1062,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 constexpr auto
 operator>(const optional<_Tp>& __lhs, const optional<_Up>& __rhs)
--> __optional_relop_t() > declval<_Up>())>
+-> __optional_gt_t<_Tp, _Up>
 {
   return static_cast(__lhs) && (!__rhs || *__lhs > *__rhs);
 }
@@ -1040,7 +1070,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 constexpr auto
 operator<=(const optional<_Tp>& __lhs, const optional<_Up>& __rhs)
--> __optional_relop_t() <= declval<_Up>())>
+-> __optional_le_t<_Tp, _Up>
 {
   return !__lhs || (static_cast(__rhs) && *__lhs <= *__rhs);
 }
@@ -1048,7 +1078,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 constexpr auto
 operator>=(const optional<_Tp>& __lhs, const optional<_Up>& __rhs)
--> __optional_relop_t() >= declval<_Up>())>
+-> __optional_ge_t<_Tp, _Up>
 {
   return !__rhs || (static_cast(__lhs) && *__lhs >= *__rhs);
 }
@@ -1134,73 +1164,73 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 constexpr auto
 operator==(const optional<_Tp>& __lhs, const _Up& __rhs)
--> __optional_relop_t() == declval<_Up>())>
+-> __optional_eq_t<_Tp, _Up>
 { return __lhs && *__lhs == __rhs; }
 
   template
 constexpr auto
 operator==(const _Up& __lhs, const optional<_Tp>& __rhs)
--> __optional_relop_t() == declval<_Tp>())>
+-> __optional_eq_t<_Up, _Tp>
 { return __rhs && __lhs

[r11-4733 Regression] FAIL: gcc.dg/guality/pr54519-4.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects -DPREVENT_OPTIMIZATION line 17 y == 25 on Linux/x86_64

2020-11-05 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

1436ef2a57e79b6b8ce5b03e32a38dd64f46c97c is the first bad commit
commit 1436ef2a57e79b6b8ce5b03e32a38dd64f46c97c
Author: Richard Biener 
Date:   Thu Nov 5 09:27:28 2020 +0100

debug/97718 - fix abstract origin references after last change

caused

FAIL: gcc.dg/guality/pr54519-3.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  -DPREVENT_OPTIMIZATION line 20 y == 25
FAIL: gcc.dg/guality/pr54519-3.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  -DPREVENT_OPTIMIZATION line 20 z == 6
FAIL: gcc.dg/guality/pr54519-3.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  -DPREVENT_OPTIMIZATION line 23 y == 117
FAIL: gcc.dg/guality/pr54519-3.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  -DPREVENT_OPTIMIZATION line 23 z == 8
FAIL: gcc.dg/guality/pr54519-3.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  -DPREVENT_OPTIMIZATION line 20 y == 25
FAIL: gcc.dg/guality/pr54519-3.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  -DPREVENT_OPTIMIZATION line 20 z == 6
FAIL: gcc.dg/guality/pr54519-3.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  -DPREVENT_OPTIMIZATION line 23 y == 117
FAIL: gcc.dg/guality/pr54519-3.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  -DPREVENT_OPTIMIZATION line 23 z == 8
FAIL: gcc.dg/guality/pr54519-4.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  -DPREVENT_OPTIMIZATION line 17 y == 25
FAIL: gcc.dg/guality/pr54519-4.c   -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects  -DPREVENT_OPTIMIZATION line 17 y == 25

with GCC configured with

../../gcc/configure --prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r11-4733/usr --enable-clocale=gnu --with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl --enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check RUNTESTFLAGS="guality.exp=gcc.dg/guality/pr54519-3.c --target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="guality.exp=gcc.dg/guality/pr54519-3.c --target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="guality.exp=gcc.dg/guality/pr54519-3.c --target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="guality.exp=gcc.dg/guality/pr54519-3.c --target_board='unix{-m64\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="guality.exp=gcc.dg/guality/pr54519-4.c --target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="guality.exp=gcc.dg/guality/pr54519-4.c --target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="guality.exp=gcc.dg/guality/pr54519-4.c --target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="guality.exp=gcc.dg/guality/pr54519-4.c --target_board='unix{-m64\ -march=cascadelake}'"

(Please do not reply to this email. For questions about this report, contact me
at skpgkp2 at gmail dot com.)


Re: [PATCH] generalized range_query class for multiple contexts

2020-11-05 Thread Martin Sebor via Gcc-patches

On 10/1/20 11:25 AM, Martin Sebor wrote:

On 10/1/20 9:34 AM, Aldy Hernandez wrote:



On 10/1/20 3:22 PM, Andrew MacLeod wrote:
 > On 10/1/20 5:05 AM, Aldy Hernandez via Gcc-patches wrote:
 >>> Thanks for doing all this!  There isn't anything I don't understand
 >>> in the sprintf changes so no questions from me (well, almost none).
 >>> Just some comments:
 >> Thanks for your comments on the sprintf/strlen API conversion.
 >>
 >>> The current call statement is available in all functions that take
 >>> a directive argument, as dir->info.callstmt.  There should be no need
 >>> to also add it as a new argument to the functions that now need it.
 >> Fixed.
 >>
 >>> The change adds code along these lines in a bunch of places:
 >>>
 >>> + value_range vr;
 >>> + if (!query->range_of_expr (vr, arg, stmt))
 >>> +   vr.set_varying (TREE_TYPE (arg));
 >>>
 >>> I thought under the new Ranger APIs when a range couldn't be
 >>> determined it would be automatically set to the maximum for
 >>> the type.  I like that and have been moving in that direction
 >>> with my code myself (rather than having an API fail, have it
 >>> set the max range and succeed).
 >> I went through all the above idioms and noticed all are being used on
 >> supported types (integers or pointers).  So range_of_expr will always
 >> return true.  I've removed the if() and the set_varying.
 >>
 >>> Since that isn't so in this case, I think it would still be nice
 >>> if the added code could be written as if the range were set to
 >>> varying in this case and (ideally) reduced to just initialization:
 >>>
 >>> value_range vr = some-function (query, stmt, arg);
 >>>
 >>> some-function could be an inline helper defined just for the sprintf
 >>> pass (and maybe also strlen which also seems to use the same pattern),
 >>> or it could be a value_range AKA irange ctor, or it could be a member
 >>> of range_query, whatever is the most appropriate.
 >>>
 >>> (If assigning/copying a value_range is thought to be too expensive,
 >>> declaring it first and then passing it to that helper to set it
 >>> would work too).
 >>>
 >>> In strlen, is the removed comment no longer relevant?  (I.e., does
 >>> the ranger solve the problem?)
 >>>
 >>> -  /* The range below may be "inaccurate" if a constant has been
 >>> -    substituted earlier for VAL by this pass that hasn't been
 >>> -    propagated through the CFG.  This shoud be fixed by the new
 >>> -    on-demand VRP if/when it becomes available (hopefully in
 >>> -    GCC 11).  */
 >> It should.
 >>
 >>> I'm wondering about the comment added to get_range_strlen_dynamic
 >>> and other places:
 >>>
 >>> + // FIXME: Use range_query instead of global ranges.
 >>>
 >>> Is that something you're planning to do in a followup or should
 >>> I remember to do it at some point?
 >> I'm not planning on doing it.  It's just a reminder that it would be
 >> beneficial to do so.
 >>
 >>> Otherwise I have no concern with the changes.
 >> It's not clear whether Andrew approved all 3 parts of the patchset
 >> or just the valuation part.  I'll wait for his nod before committing
 >> this chunk.
 >>
 >> Aldy
 >>
 > I have no issue with it, so OK.

Pushed all 3 patches.

 >
 > Just an observation that should be pointed out, I believe Aldy has all
 > the code for converting to a ranger, but we have not pursued that any
 > further yet since there is a regression due to our lack of equivalence
 > processing I think?  That should be resolved in the coming month, but at
 > the moment is a holdback/concern for converting these passes...  iirc.

Yes.  Martin, the take away here is that the strlen/sprintf pass has 
been converted to the new API, but ranger is still not up and running 
on it (even on the branch).


With the new API, all you have to do is remove all instances of 
evrp_range_analyzer and replace them with a ranger.  That's it.
Below is an untested patch that would convert you to a ranger once 
it's contributed.


IIRC when I enabled the ranger for your pass a while back, there was 
one or two regressions due to missing equivalences, and the rest were 
because the tests were expecting an actual specific range, and the 
ranger returned a slightly different/better one.  You'll need to 
adjust your tests.


Ack.  I'll be on the lookout for the ranger commit (if you happen
to remember to CC me on it, in case I miss it, that would
be great).


I have applied the patch and ran some tests.  There are quite
a few failures (see the list below).  I have only looked at
a couple.  The one in gcc.dg/tree-ssa/builtin-sprintf-warn-3.c
boils down to the following test case.  There should be no warning
for either sprintf call.  The one in h() is a false positive and
the reason for at least some of the regressions.  Somehow,
the conversions between int and char are causing Ranger to lose
the range.

$ cat t.c && gcc -O2 -S -Wall t.c
char a[2];

extern int x;

signed char f (int

Re: [committed] libstdc++: Fix constraints on std::optional comparisons [PR 96269]

2020-11-05 Thread Jonathan Wakely via Gcc-patches

On 05/11/20 19:09 +, Jonathan Wakely wrote:

The relational operators for std::optional were using the wrong types
in the declval expressions used to constrain them. Instead of using
const lvalues they were using non-const rvalues, which meant that a type
might satisfy the constraints but then give an error when the function
body was instantiated.
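
A minimal sketch of that failure mode, assuming a hypothetical type `Picky`
whose `operator==` only accepts non-const rvalues.  The trait names
`eq_on_rvalues` and `eq_on_const_lvalues` are illustrative and are not
libstdc++ names; the local `void_t` is spelled out so the sketch does not
depend on C++17.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical type: comparable only as non-const rvalues.
struct Picky {};
bool operator==(Picky&&, Picky&&);  // declared only; never called here

template<typename...> struct make_void { using type = void; };
template<typename... Ts> using void_t = typename make_void<Ts...>::type;

// Shape of the old constraint: declval<T>() yields a non-const rvalue.
template<typename T, typename U, typename = void>
struct eq_on_rvalues : std::false_type {};
template<typename T, typename U>
struct eq_on_rvalues<T, U,
    void_t<decltype(std::declval<T>() == std::declval<U>())>>
  : std::true_type {};

// Shape of the fixed constraint: const lvalues, which is what the
// operator bodies actually compare (*__lhs == *__rhs on const optionals).
template<typename T, typename U, typename = void>
struct eq_on_const_lvalues : std::false_type {};
template<typename T, typename U>
struct eq_on_const_lvalues<T, U,
    void_t<decltype(std::declval<const T&>() == std::declval<const U&>())>>
  : std::true_type {};

// The rvalue-based check accepts Picky, the const-lvalue check rejects it.
static_assert(eq_on_rvalues<Picky, Picky>::value,
              "rvalue check is satisfied");
static_assert(!eq_on_const_lvalues<Picky, Picky>::value,
              "const-lvalue check is not satisfied");
static_assert(eq_on_const_lvalues<int, long>::value,
              "ordinary types pass both checks");
```

With the rvalue-based check, a comparison operator constrained this way
would be selected for `Picky` and then fail inside its body; with the
const-lvalue check it is removed from overload resolution instead.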

libstdc++-v3/ChangeLog:

PR libstdc++/96269
* include/std/optional (operator==, operator!=, operator<)
(operator>, operator<=, operator>=): Fix types used in
SFINAE constraints.
* testsuite/20_util/optional/relops/96269.cc: New test.

Tested powerpc64le-linux. Committed to trunk.


When concepts are supported we can make the alias templates
__optional_eq_t et al use a requires-expression instead of SFINAE.
This is potentially faster to compile, given expected improvements
to C++20 compilers.
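
As a rough sketch of the idea (not the libstdc++ code), a constrained alias
template and an equivalent SFINAE fallback can sit behind the same name.
`eq_result_t`, `same`, `can_compare` and `NoEq` are hypothetical names, and
the sketch uses `std::is_convertible` in a nested requirement rather than
`std::convertible_to` so that it only needs `<type_traits>`.

```cpp
#include <type_traits>
#include <utility>

#if defined(__cpp_concepts) && __cpp_concepts >= 201907L
// Constrained alias template: naming eq_result_t<T, U> is only valid
// when comparing const lvalues is well-formed and converts to bool.
template<typename T, typename U>
  requires requires (const T& t, const U& u) {
    requires std::is_convertible<decltype(t == u), bool>::value;
  }
using eq_result_t = bool;
#else
// SFINAE fallback expressing the same condition.
template<typename T, typename U>
using eq_result_t = typename std::enable_if<
    std::is_convertible<
        decltype(std::declval<const T&>() == std::declval<const U&>()),
        bool>::value,
    bool>::type;
#endif

// A comparison function constrained through its return type, in the
// same style as the optional relational operators.
template<typename T, typename U>
constexpr auto same(const T& a, const U& b) -> eq_result_t<T, U>
{ return a == b; }

struct NoEq {};  // has no operator==

// Detects whether a call to same() survives overload resolution.
template<typename T, typename U, typename = void>
struct can_compare : std::false_type {};
template<typename T, typename U>
struct can_compare<T, U,
    decltype(void(same(std::declval<const T&>(),
                       std::declval<const U&>())))>
  : std::true_type {};

static_assert(same(1, 1L), "ints compare equal");
static_assert(can_compare<int, long>::value, "int == long is usable");
static_assert(!can_compare<NoEq, NoEq>::value, "NoEq is rejected");
```

Either branch gives callers the same behaviour; only the mechanism used to
reject unsuitable types differs.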

I'm testing this patch.

commit c5d8e2ba0ad20425cc7778152824d9e5267b0ec5
Author: Jonathan Wakely 
Date:   Thu Nov 5 19:45:52 2020

libstdc++: Use concepts to constrain std::optional relops

When concepts are supported we can make the alias templates
__optional_eq_t et al use a requires-expression instead of SFINAE.
This is potentially faster to compile, given expected improvements
to C++20 compilers.

libstdc++-v3/ChangeLog:

* include/std/optional [__cpp_concepts] (__optional_eq_t)
(__optional_ne_t, __optional_lt_t, __optional_gt_t)
(__optional_le_t, __optional_ge_t): Use requires-clause on
alias template.

diff --git a/libstdc++-v3/include/std/optional b/libstdc++-v3/include/std/optional
index 5ea5b39d0e69..4e9618648250 100644
--- a/libstdc++-v3/include/std/optional
+++ b/libstdc++-v3/include/std/optional
@@ -998,9 +998,48 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   void reset() noexcept { this->_M_reset(); }
 };
 
+#if __cpp_lib_concepts
+  template
+requires requires (const _Tp __t, const _Up __u) {
+	  { __t == __u } -> convertible_to;
+}
+using __optional_eq_t = bool;
+
+  template
+requires requires (const _Tp __t, const _Up __u) {
+	  { __t != __u } -> convertible_to;
+}
+using __optional_ne_t = bool;
+
+  template
+requires requires (const _Tp __t, const _Up __u) {
+	  { __t < __u } -> convertible_to;
+}
+using __optional_lt_t = bool;
+
+  template
+requires requires (const _Tp __t, const _Up __u) {
+	  { __t > __u } -> convertible_to;
+}
+using __optional_gt_t = bool;
+
+  template
+requires requires (const _Tp __t, const _Up __u) {
+	  { __t <= __u } -> convertible_to;
+}
+using __optional_le_t = bool;
+
+  template
+requires requires (const _Tp __t, const _Up __u) {
+	  { __t >= __u } -> convertible_to;
+}
+using __optional_ge_t = bool;
+
+#else // concepts
+
   template
 using __optional_relop_t =
-  enable_if_t::value, bool>;
+  enable_if_t, bool>;
 
   template
 using __optional_eq_t = __optional_relop_t<
@@ -1031,6 +1070,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 using __optional_ge_t = __optional_relop_t<
   decltype(std::declval() >= std::declval())
   >;
+#endif // concepts
 
   // Comparisons between optional values.
   template


Re: [PATCH][AArch64] ACLE intrinsics: get low/high half from BFloat16 vector

2020-11-05 Thread Christophe Lyon via Gcc-patches
On Tue, 3 Nov 2020 at 12:17, Dennis Zhang via Gcc-patches
 wrote:
>
> Hi Richard,
>
> On 10/30/20 2:07 PM, Richard Sandiford wrote:
> > Dennis Zhang  writes:
> >> diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def 
> >> b/gcc/config/aarch64/aarch64-simd-builtins.def
> >> index 332a0b6b1ea..39ebb776d1d 100644
> >> --- a/gcc/config/aarch64/aarch64-simd-builtins.def
> >> +++ b/gcc/config/aarch64/aarch64-simd-builtins.def
> >> @@ -719,6 +719,9 @@
> >> VAR1 (QUADOP_LANE, bfmlalb_lane_q, 0, ALL, v4sf)
> >> VAR1 (QUADOP_LANE, bfmlalt_lane_q, 0, ALL, v4sf)
> >>
> >> +  /* Implemented by aarch64_vget_halfv8bf.  */
> >> +  VAR1 (GETREG, vget_half, 0, ALL, v8bf)
> >
> > This should be AUTO_FP, since it doesn't have any side-effects.
> > (As before, we should probably rename the flag, but that's separate work.)
> >
> >> +
> >> /* Implemented by aarch64_simd_mmlav16qi.  */
> >> VAR1 (TERNOP, simd_smmla, 0, NONE, v16qi)
> >> VAR1 (TERNOPU, simd_ummla, 0, NONE, v16qi)
> >> diff --git a/gcc/config/aarch64/aarch64-simd.md 
> >> b/gcc/config/aarch64/aarch64-simd.md
> >> index 9f0e2bd1e6f..f62c52ca327 100644
> >> --- a/gcc/config/aarch64/aarch64-simd.md
> >> +++ b/gcc/config/aarch64/aarch64-simd.md
> >> @@ -7159,6 +7159,19 @@
> >> [(set_attr "type" "neon_dot")]
> >>   )
> >>
> >> +;; vget_low/high_bf16
> >> +(define_expand "aarch64_vget_halfv8bf"
> >> +  [(match_operand:V4BF 0 "register_operand")
> >> +   (match_operand:V8BF 1 "register_operand")
> >> +   (match_operand:SI 2 "aarch64_zero_or_1")]
> >> +  "TARGET_BF16_SIMD"
> >> +{
> >> +  int hbase = INTVAL (operands[2]);
> >> +  rtx sel = aarch64_gen_stepped_int_parallel (4, hbase * 4, 1);
> >
> > I think this needs to be:
> >
> >aarch64_simd_vect_par_cnst_half
> >
> > instead.  The issue is that on big-endian targets, GCC assumes vector
> > lane 0 is in the high part of the register, whereas for AArch64 it's
> > always in the low part of the register.  So we convert from AArch64
> > numbering to GCC numbering when generating the rtx and then take
> > endianness into account when matching the rtx later.
> >
> > It would be good to have -mbig-endian tests that make sure we generate
> > the right instruction for each function (i.e. we get them the right way
> > round).  I guess it would be good to test that for little-endian too.
> >
>
> I've updated the expander using aarch64_simd_vect_par_cnst_half.
> And the expander is divided into two for getting low and high half
> separately.
> It's tested for aarch64-none-linux-gnu and aarch64_be-none-linux-gnu
> targets with new tests including -mbig-endian option.
>
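
For readers following the lane-numbering point in the review quoted above,
here is a toy model of the convention as described there.  It is not GCC
source: `gcc_lane` and `half_base` are hypothetical helpers, and the real
work is done by aarch64_simd_vect_par_cnst_half.

```cpp
// Toy model: map an architectural (AArch64) lane number to the lane
// number used internally.  On little-endian the two agree; on
// big-endian, internal lane 0 is the highest architectural lane.
constexpr int gcc_lane(int arch_lane, int nunits, bool big_endian)
{
  return big_endian ? nunits - 1 - arch_lane : arch_lane;
}

// Lowest internal lane covered by the requested architectural half
// of an nunits-lane vector.
constexpr int half_base(int nunits, bool high, bool big_endian)
{
  return big_endian
    // Big-endian: the half's highest architectural lane maps lowest.
    ? gcc_lane(high ? nunits - 1 : nunits / 2 - 1, nunits, true)
    // Little-endian: the half's lowest architectural lane maps lowest.
    : gcc_lane(high ? nunits / 2 : 0, nunits, false);
}

// For an 8-lane vector such as V8BF:
static_assert(half_base(8, false, false) == 0, "LE low half: lanes 0..3");
static_assert(half_base(8, true, false) == 4, "LE high half: lanes 4..7");
static_assert(half_base(8, false, true) == 4, "BE low half: lanes 4..7");
static_assert(half_base(8, true, true) == 0, "BE high half: lanes 0..3");
```

Under this model a fixed selector starting at `hbase * 4` picks the wrong
half on big-endian, which is why the review asks for the endian-aware
helper.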

Hi,

When testing with a cross x86_64 -> aarch64-none-linux-gnu, the new
big-endian test fails:
FAIL: gcc.target/aarch64/advsimd-intrinsics/bf16_get-be.c   -O0  (test
for excess errors)
Excess errors:
/aci-gcc-fsf/builds/gcc-fsf-gccsrc/sysroot-aarch64-none-linux-gnu/usr/include/gnu/stubs.h:11:11:
fatal error: gnu/stubs-lp64_be.h: No such file or directory
compilation terminated.

What am I missing, since it works for you?

Thanks

Christophe

> >> +  emit_insn (gen_aarch64_get_halfv8bf (operands[0], operands[1], sel));
> >> +  DONE;
> >> +})
> >> +
> >>   ;; bfmmla
> >>   (define_insn "aarch64_bfmmlaqv4sf"
> >> [(set (match_operand:V4SF 0 "register_operand" "=w")
> >> diff --git a/gcc/config/aarch64/predicates.md 
> >> b/gcc/config/aarch64/predicates.md
> >> index 215fcec5955..0c8bc2b0c73 100644
> >> --- a/gcc/config/aarch64/predicates.md
> >> +++ b/gcc/config/aarch64/predicates.md
> >> @@ -84,6 +84,10 @@
> >>   (ior (match_test "op == constm1_rtx")
> >>(match_test "op == const1_rtx"))
> >>
> >> +(define_predicate "aarch64_zero_or_1"
> >> +  (and (match_code "const_int")
> >> +   (match_test "op == const0_rtx || op == const1_rtx")))
> >
> > zero_or_1 looked odd to me, feels like it should be 0_or_1 or zero_or_one.
> > But I see that it's for consistency with aarch64_reg_zero_or_m1_or_1,
> > so let's keep it as-is.
> >
>
> This predicate is removed since there is no need of the imm operand in
> the new expanders.
>
> Thanks for the reviews.
> Is it OK for trunk now?
>
> Cheers
> Dennis
>
>


Re: [committed] libstdc++: Fix constraints on std::optional comparisons [PR 96269]

2020-11-05 Thread Ville Voutilainen via Gcc-patches
On Thu, 5 Nov 2020 at 21:52, Jonathan Wakely via Libstdc++
 wrote:
>
> On 05/11/20 19:09 +, Jonathan Wakely wrote:
> >The relational operators for std::optional were using the wrong types
> >in the declval expressions used to constrain them. Instead of using
> >const lvalues they were using non-const rvalues, which meant that a type
> >might satisfy the constraints but then give an error when the function
> >body was instantiated.
> >
> >libstdc++-v3/ChangeLog:
> >
> >   PR libstdc++/96269
> >   * include/std/optional (operator==, operator!=, operator<)
> >   (operator>, operator<=, operator>=): Fix types used in
> >   SFINAE constraints.
> >   * testsuite/20_util/optional/relops/96269.cc: New test.
> >
> >Tested powerpc64le-linux. Committed to trunk.
>
> When concepts are supported we can make the alias templates
> __optional_eq_t et al use a requires-expression instead of SFINAE.
> This is potentially faster to compile, given expected improvements
> to C++20 compilers.
>
> I'm testing this patch.

It concerns me that we'd have such conditional conceptifying just
because it's possibly faster to compile. There's more types where
we'd want to conditionally use concepts, but perhaps we want to think
a bit more how to do that in our source code, rather than just make
them preprocessor-conditionals in the same header. We might entertain
conceptifying tuple, when concepts are available. That may end up
being fairly verbose if it's done with preprocessor in .

That's not to say that I'm objecting to this as such; I merely think
we want to be a bit careful with conceptifying, and be rather
instantly prepared to entertain doing it with a slightly different
source code structure, which may involve splitting things across more
files, which would then involve adding more headers that are
installed.


Re: [PATCH] arm: [testcase] Better narrow some bfloat16 testcase

2020-11-05 Thread Christophe Lyon via Gcc-patches
On Thu, 5 Nov 2020 at 15:30, Andrea Corallo  wrote:
>
> Christophe Lyon  writes:
>
> > On Thu, 5 Nov 2020 at 12:11, Andrea Corallo  wrote:
> >>
> >> Christophe Lyon  writes:
> >>
> >> [...]
> >>
> >> >> I think you need to add -mfloat-abi=hard to the dg-additional-options
> >> >> otherwise vld1_lane_bf16_1.c
> >> >> fails on targets with a soft float-abi default (eg arm-linux-gnueabi).
> >> >>
> >> >> See bf16_vldn_1.c.
> >> >
> >> > Actually that's not sufficient because in turn we get:
> >> > /sysroot-arm-none-linux-gnueabi/usr/include/gnu/stubs.h:10:11: fatal
> >> > error: gnu/stubs-hard.h: No such file or directory
> >> >
> >> > So you should check that -mfloat-abi=hard is supported.
> >> >
> >> > Ditto for the vst tests.
> >>
> >> Hi Christophe,
> >>
> >> this patch should implement your suggestions.
> >>
> >> On my arm-none-linux-gnueabi setup the tests were already skipped
> >> as unsupported so if you could test and confirm this fixes the
> >> issue you see would be great.
> >
> > Do you know why they are unsupported in your setup?
>
> We probably have a different GCC configuration.  Could you share how
> yours is configured?
>
Sure, for instance:
--target=arm-none-linux-gnueabi --with-float=soft --with-mode=arm
--with-cpu=cortex-a9

> >> diff --git a/gcc/testsuite/lib/target-supports.exp 
> >> b/gcc/testsuite/lib/target-supports.exp
> >> index 15f0649f8ae..2ab7e39756d 100644
> >> --- a/gcc/testsuite/lib/target-supports.exp
> >> +++ b/gcc/testsuite/lib/target-supports.exp
> >> @@ -5213,6 +5213,10 @@ proc 
> >> check_effective_target_arm_v8_2a_bf16_neon_ok_nocache { } {
> >>  return 0;
> >>  }
> >>
> >> +if { ! [check_effective_target_arm_hard_ok] } {
> >> + return 0;
> >> +}
> >> +
> >> foreach flags {"" "-mfloat-abi=hard -mfpu=neon-fp-armv8" 
> >> "-mfloat-abi=softfp -mfpu=neon-fp-armv8" } {
> >> if { [check_no_compiler_messages_nocache arm_v8_2a_bf16_neon_ok 
> >> object {
> >> #include 
> >
> > This seems strange since you would now exit early if
> > check_effective_target_arm_hard_ok is false, so you'll never need the
> > -mfloat-abi=softfp version of the flags.
>
> So IIUC your suggestion would be to test with higher priority softfp and
> in case we decide to go for hardfp make sure
> check_effective_target_arm_hard_ok is satisfied.  Am I correct?
>
ISTM that other tests that need hardfp check if it's supported in the
test, not in other effective targets.

For instance mve/intrinsics/mve_fpu1.c

I can see that quite a few tests that use -mfloat-abi=hard do not
check whether it's supported. Those I checked do not include
arm_neon.h and thus do not end up with the gnu/stubs-hard.h error
above.

> > BTW in general, I think softfp is tried before hard in the other
> > similar effective targets, any reason the order is different here?
>
> No idea.
>
> Thanks
>
>   Andrea


Re: [PATCH v4] c++: Implement -Wvexing-parse [PR25814]

2020-11-05 Thread Jason Merrill via Gcc-patches

On 11/5/20 10:46 AM, Marek Polacek wrote:

On Fri, Oct 30, 2020 at 04:33:48PM -0400, Jason Merrill wrote:

On 10/29/20 11:00 PM, Marek Polacek wrote:

Gotcha.  Now we do most of the work in warn_about_ambiguous_parse.


Thanks, just a few tweaks left.

--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -4378,6 +4378,9 @@ cxx_init_decl_processing (void)
 init_list_type_node = make_node (LANG_TYPE);
 record_unknown_type (init_list_type_node, "init list");
+  /* Used when parsing to distinguish parameter-lists () and (void).  */
+  explicit_void_list_node = build_void_list_node ();
+
 {
   /* Make sure we get a unique function type, so we can give
  its pointer type a name.  (This wins for gdb.) */
@@ -14033,7 +14036,7 @@ grokparms (tree parmlist, tree *parms)
 tree init = TREE_PURPOSE (parm);
 tree decl = TREE_VALUE (parm);
-  if (parm == void_list_node)
+  if (parm == void_list_node || parm == explicit_void_list_node)
break;


Is this hunk needed?  I thought explicit_void_type_node would be handled by
the if (VOID_TYPE_P) block below.


Yeah, because explicit_/void_list_node don't have a type.


+static void
+warn_about_ambiguous_parse (tree type, const cp_declarator *declarator)
+{
+  if (declarator->kind != cdk_function
+  || !declarator->declarator
+  || declarator->declarator->kind != cdk_id
+  || !identifier_p (get_unqualified_id
+   (const_cast(declarator
+return;
+
+  /* Don't warn when the whole declarator (not just the declarator-id!)
+ was parenthesized.  That is, don't warn for int(n()) but do warn
+ for int(f)().  */
+  if (declarator->parenthesized != UNKNOWN_LOCATION)
+return;
+
+  location_t loc = declarator->u.function.parens_loc;
+  if (loc == UNKNOWN_LOCATION)
+return;


Is this still possible?


Looks like it isn't, removed.


+  if (TREE_CODE (type) == TYPE_DECL)
+   type = TREE_TYPE (type);
+
+  /* If the return type is void there is no ambiguity.  */
+  if (same_type_p (type, void_type_node))
+return;
+
+  auto_diagnostic_group d;
+  tree params = declarator->u.function.parameters;
+  const bool has_list_ctor_p = CLASS_TYPE_P (type) && TYPE_HAS_LIST_CTOR (type);
+
+  /* The T t() case.  */
+  if (params == void_list_node)
+{
+  if (warning_at (loc, OPT_Wvexing_parse,
+ "empty parentheses were disambiguated as a function "
+ "declaration"))
+   {
+ /* () means value-initialization (C++03 and up); {} (C++11 and up)
+means value-initialization or aggregate--initialization, nothing
+means default-initialization.  We can only suggest removing the
+parentheses/adding {} if T has a default constructor.  */
+ if (!CLASS_TYPE_P (type) || TYPE_HAS_DEFAULT_CONSTRUCTOR (type))
+   {
+ gcc_rich_location iloc (loc);
+ iloc.add_fixit_remove ();
+ inform (&iloc, "remove parentheses to default-initialize "
+ "a variable");
+ if (cxx_dialect >= cxx11 && !has_list_ctor_p)
+   {
+ if (CP_AGGREGATE_TYPE_P (type))
+   inform (loc, "or replace parentheses with braces to "
+   "aggregate-initialize a variable");
+ else
+   inform (loc, "or replace parentheses with braces to "
+   "value-initialize a variable");
+   }
+   }
+   }
+  return;
+}
+
+  /* If we had (...) or the parameter-list wasn't parenthesized,
+ we're done.  */
+  if (params == NULL_TREE || !PARENTHESIZED_LIST_P (params))
+return;


This needs to be a loop so we check all the elements of the list.


I think this can't be a loop, because we only set PARENTHESIZED_LIST_P
in the whole list.  But I realized that there still was an issue: I was
setting PARENTHESIZED_LIST_P only based on the last element in the list,
but we want to set it only if every parameter was parenthesized.  So I've
fixed setting of PARENTHESIZED_LIST_P instead.
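
For context, a small sketch of the two declaration forms the patch warns
about, next to the brace forms its fix-it hints suggest.  `X`, `T` and
`vexing_parse_demo` are hypothetical names chosen for the example, and a
compiler may emit exactly this kind of warning when building it.

```cpp
#include <type_traits>

struct X {};
struct T {
  T() = default;
  explicit T(X) {}
};

inline void vexing_parse_demo()
{
  T a();     // the "T t()" case: declares a function a() returning T
  T b(X());  // the "T t(X())" case: declares a function b whose
             // parameter is a (pointer to) function returning X

  T c{};     // braces: value-initializes a variable
  T d{X{}};  // braces: a variable initialized from a temporary X

  static_assert(std::is_function<decltype(a)>::value, "a is a function");
  static_assert(std::is_function<decltype(b)>::value, "b is a function");
  static_assert(std::is_same<decltype(c), T>::value, "c is an object");
  static_assert(std::is_same<decltype(d), T>::value, "d is an object");
  (void) c;
  (void) d;
}
```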


+  /* The T t(X()) case.  */
+  if (list_length (params) == 2)
+{
+  if (warning_at (loc, OPT_Wvexing_parse,
+ "parentheses were disambiguated as a function "
+ "declaration"))
+   {
+ gcc_rich_location iloc (loc);
+ /* {}-initialization means that we can use an initializer-list
+constructor if no default constructor is available, so don't
+suggest using {} for classes that have an initializer_list
+constructor.  */
+ if (cxx_dialect >= cxx11 && !has_list_ctor_p)
+   {
+ iloc.add_fixit_replace (get_start (loc), "{");
+ iloc.add_fixit_replace (get_finish (loc), "}");
+ inform (&iloc, "replace parentheses with braces to declare a "
+ "variable");
+   }
+ else
+   {
+  

[PATCH] Pass multi-range from range_query::value_* routines

2020-11-05 Thread Andrew MacLeod via Gcc-patches

As detailed in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97725

This was a latent bug where we were passing a value_range into
range-ops for a calculation. It was being used by the
not_equal::fold() routine as an intermediary at one point, and it
couldn't represent the full range, so information was lost.


The fix is to pass an int_range_max instead of a value_range, and also
to adjust the equal/not_equal fold routines to use a normal intermediary
range instead of counting on the return range, which they know nothing about.
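
A toy model of the precision issue described above, assuming a range type
with a fixed number of sub-intervals that collapses to one enclosing
interval when given more.  `ToyRange`, `Tri` and both fold functions are
hypothetical and only loosely mirror value_range, int_range_max and the
range-op code; unlike the real routines, the toy functions return the
boolean outcome separately instead of storing it in the result range.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Interval = std::pair<long, long>;  // closed interval [first, second]

// Toy range (not GCC's irange): holds at most `capacity` disjoint
// intervals and collapses to one enclosing interval when given more.
class ToyRange {
public:
  ToyRange(std::size_t capacity, std::vector<Interval> parts)
    : cap_(capacity) { assign(std::move(parts)); }

  void assign(std::vector<Interval> parts) {
    if (parts.size() > cap_) {
      long lo = parts.front().first, hi = parts.front().second;
      for (const Interval& p : parts) {
        lo = std::min(lo, p.first);
        hi = std::max(hi, p.second);
      }
      parts.assign(1, Interval(lo, hi));  // precision is lost here
    }
    parts_ = std::move(parts);
  }

  void intersect(const ToyRange& other) {
    std::vector<Interval> out;
    for (const Interval& a : parts_)
      for (const Interval& b : other.parts_) {
        long lo = std::max(a.first, b.first);
        long hi = std::min(a.second, b.second);
        if (lo <= hi)
          out.push_back(Interval(lo, hi));
      }
    assign(std::move(out));
  }

  bool undefined_p() const { return parts_.empty(); }
  const std::vector<Interval>& parts() const { return parts_; }

private:
  std::size_t cap_;
  std::vector<Interval> parts_;
};

enum class Tri { False, True, Unknown };
const std::size_t kWide = 8;  // capacity of the wide temporary

// Old shape: the caller's result object doubles as scratch space.
inline Tri not_equal_reusing_result(ToyRange& r, const ToyRange& op1,
                                    const ToyRange& op2) {
  r.assign(op1.parts());
  r.intersect(op2);
  return r.undefined_p() ? Tri::True : Tri::Unknown;
}

// New shape: a wide local temporary, independent of the result object.
inline Tri not_equal_with_tmp(const ToyRange& op1, const ToyRange& op2) {
  ToyRange tmp(kWide, op1.parts());
  tmp.intersect(op2);
  return tmp.undefined_p() ? Tri::True : Tri::Unknown;
}

// {[0,1],[10,11]} never equals [5,5], but a one-interval result object
// sees [0,11] and can no longer prove it.
inline bool demo_precision_loss() {
  ToyRange op1(kWide, {Interval(0, 1), Interval(10, 11)});
  ToyRange op2(kWide, {Interval(5, 5)});
  ToyRange narrow_result(1, {});
  return not_equal_reusing_result(narrow_result, op1, op2) == Tri::Unknown
      && not_equal_with_tmp(op1, op2) == Tri::True;
}
```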


Bootstrapped on x86_64-pc-linux-gnu, no regressions.  pushed.

Andrew

commit 22984f3f090921b5ac80ec0057f6754ec458e97e
Author: Andrew MacLeod 
Date:   Thu Nov 5 13:59:45 2020 -0500

Pass multi-range from range_query::value_*  routines

fix range-ops equal/not_equal to not reuse the result range as intermediary.
value_query::value routines should pass a multi-range in as some other rangeop
routines build into this result, so we may need better precision.

gcc/
PR tree-optimization/97725
* range-op.cc (operator_equal::fold_range): Use new tmp value.
(operator_not_equal::fold_range): Ditto.
* value-query.cc (range_query::value_of_expr): Use int_range_max
not a value_range.
(range_query::value_on_edge): Ditto.
(range_query::value_of_stmt): Ditto.
gcc/testsuite/
* gcc.dg/pr97725.c: New.

diff --git a/gcc/range-op.cc b/gcc/range-op.cc
index 74ab2e57fde..f38f02e8d27 100644
--- a/gcc/range-op.cc
+++ b/gcc/range-op.cc
@@ -428,9 +428,9 @@ operator_equal::fold_range (irange &r, tree type,
 {
   // If ranges do not intersect, we know the range is not equal,
   // otherwise we don't know anything for sure.
-  r = op1;
-  r.intersect (op2);
-  if (r.undefined_p ())
+  int_range_max tmp = op1;
+  tmp.intersect (op2);
+  if (tmp.undefined_p ())
 	r = range_false (type);
   else
 	r = range_true_and_false (type);
@@ -513,9 +513,9 @@ operator_not_equal::fold_range (irange &r, tree type,
 {
   // If ranges do not intersect, we know the range is not equal,
   // otherwise we don't know anything for sure.
-  r = op1;
-  r.intersect (op2);
-  if (r.undefined_p ())
+  int_range_max tmp = op1;
+  tmp.intersect (op2);
+  if (tmp.undefined_p ())
 	r = range_true (type);
   else
 	r = range_true_and_false (type);
diff --git a/gcc/testsuite/gcc.dg/pr97725.c b/gcc/testsuite/gcc.dg/pr97725.c
new file mode 100644
index 000..2fcb12cc301
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr97725.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+int a;
+unsigned b;
+
+int main() {
+  if (a) {
+goto L1;
+while (1)
+  while (1) {
+long e = -1L, g;
+int f, h, i;
+  L1:
+a = f;
+  L2:
+g = e;
+f = h || g;
+e = ~(f & b);
+if (i || g < -1L) {
+  ~(g || 0);
+  break;
+}
+goto L2;
+  }
+  }
+  return 0;
+}
diff --git a/gcc/value-query.cc b/gcc/value-query.cc
index 23ba48d73a7..f9a948f3c6c 100644
--- a/gcc/value-query.cc
+++ b/gcc/value-query.cc
@@ -78,7 +78,7 @@ tree
 range_query::value_of_expr (tree name, gimple *stmt)
 {
   tree t;
-  value_range r;
+  int_range_max r;
 
   if (!irange::supports_type_p (TREE_TYPE (name)))
 return NULL_TREE;
@@ -99,7 +99,7 @@ tree
 range_query::value_on_edge (edge e, tree name)
 {
   tree t;
-  value_range r;
+  int_range_max r;
 
   if (!irange::supports_type_p (TREE_TYPE (name)))
 return NULL_TREE;
@@ -120,7 +120,7 @@ tree
 range_query::value_of_stmt (gimple *stmt, tree name)
 {
   tree t;
-  value_range r;
+  int_range_max r;
 
   if (!name)
 name = gimple_get_lhs (stmt);


Re: [PATCH] c++: Reuse identical ATOMIC_CONSTRs during normalization

2020-11-05 Thread Jason Merrill via Gcc-patches

On 11/3/20 3:43 PM, Patrick Palka wrote:

Profiling revealed that sat_hasher::equal accounts for nearly 40% of
compile time in some cmcstl2 tests.

This patch eliminates this bottleneck by caching the ATOMIC_CONSTRs
returned by normalize_atom.  This in turn allows us to replace the
expensive atomic_constraints_identical_p check in sat_hasher::equal
with cheap pointer equality, with no loss in cache hit rate.
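
The general technique here is interning (hash-consing): keep one canonical
node per structurally distinct value, so later comparisons reduce to a
pointer test.  A minimal sketch with hypothetical names (`Atom`,
`AtomCache`, `same_atom`), using a string as a stand-in for the structure
of an atomic constraint; it does not reflect how constraint.cc stores
atoms.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for an atomic constraint: two atoms count as
// identical when their expression text matches.
struct Atom {
  std::string expr;
};

// Interning cache: hands out one canonical Atom per distinct expression.
class AtomCache {
public:
  const Atom* intern(const std::string& expr) {
    auto it = table_.find(expr);
    if (it != table_.end())
      return it->second.get();  // reuse the canonical node
    std::unique_ptr<Atom> node(new Atom{expr});
    const Atom* raw = node.get();
    table_.emplace(expr, std::move(node));
    return raw;
  }

  std::size_t size() const { return table_.size(); }

private:
  std::unordered_map<std::string, std::unique_ptr<Atom>> table_;
};

// The expensive structural comparison happens once, inside intern();
// afterwards equality is a pointer comparison.
inline bool same_atom(const Atom* a, const Atom* b) { return a == b; }

inline bool demo_interning() {
  AtomCache cache;
  const Atom* a = cache.intern("C<T>");
  const Atom* b = cache.intern("C<T>");
  const Atom* c = cache.intern("D<T>");
  return same_atom(a, b) && !same_atom(a, c) && cache.size() == 2;
}
```

The pointer test is only sound while the cache keeps every canonical node
alive; if an entry were dropped and rebuilt, two equivalent atoms could
end up with different addresses.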

With this patch, compile time for the cmcstl2 test
test/algorithm/set_symmetric_difference4.cpp drops from 19s to 11s with
an --enable-checking=release compiler.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

gcc/cp/ChangeLog:

* constraint.cc (struct atom_hasher): New descriptor class for a
hash_table.  Use it to define ...
(atom_cache): ... this.
(normalize_atom): Use it to cache ATOMIC_CONSTRs when not
generating diagnostics.
(sat_hasher::hash): Use htab_hash_pointer instead of
hash_atomic_constraint.
(sat_hasher::equal): Test for pointer equality instead of
atomic_constraints_identical_p.
---
  gcc/cp/constraint.cc | 37 ++---
  1 file changed, 34 insertions(+), 3 deletions(-)

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index b6f6f0d02a5..ce720c641e8 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -710,6 +710,25 @@ normalize_concept_check (tree check, tree args, norm_info info)
return normalize_expression (def, subst, info);
  }
  
+/* Hash functions for ATOMIC_CONSTRs.  */

+
+struct atom_hasher : default_hash_traits
+{
+  static hashval_t hash (tree atom)
+  {
+return hash_atomic_constraint (atom);
+  }
+
+  static bool equal (tree atom1, tree atom2)
+  {
+return atomic_constraints_identical_p (atom1, atom2);
+  }
+};


This is the same as constraint_hash in logic.cc; either they should be 
combined, or (probably) the hash table in logic.cc should be changed to 
also take advantage of pointer equivalence.



+/* Used by normalize_atom to cache ATOMIC_CONSTRs.  */
+
+static GTY((deletable)) hash_table *atom_cache;


If we're relying on pointer identity, this can't be deletable; if GC 
discards it, later normalization will generate a new equivalent 
ATOMIC_CONSTR, breaking the uniqueness assumption.



  /* The normal form of an atom depends on the expression. The normal
 form of a function call to a function concept is a check constraint
 for that concept. The normal form of a reference to a variable
@@ -729,7 +748,19 @@ normalize_atom (tree t, tree args, norm_info info)
/* Build a new info object for the atom.  */
tree ci = build_tree_list (t, info.context);
  
-  return build1 (ATOMIC_CONSTR, ci, map);

+  tree atom = build1 (ATOMIC_CONSTR, ci, map);
+  if (!info.generate_diagnostics ())
+{
+  /* Cache the ATOMIC_CONSTRs that we return, so that sat_hasher::equal
+later can quickly compare two atoms using just pointer equality.  */
+  if (!atom_cache)
+   atom_cache = hash_table::create_ggc (31);
+  tree *slot = atom_cache->find_slot (atom, INSERT);
+  if (*slot)
+   return *slot;
+  *slot = atom;
+}
+  return atom;
  }
  
  /* Returns the normal form of an expression. */

@@ -2284,13 +2315,13 @@ struct sat_hasher : ggc_ptr_hash
  {
static hashval_t hash (sat_entry *e)
{


We could use a comment here about why we can just hash the pointer.


-hashval_t value = hash_atomic_constraint (e->constr);
+hashval_t value = htab_hash_pointer (e->constr);
  return iterative_hash_template_arg (e->args, value);
}
  
static bool equal (sat_entry *e1, sat_entry *e2)

{
-if (!atomic_constraints_identical_p (e1->constr, e2->constr))
+if (e1->constr != e2->constr)
return false;
  return template_args_equal (e1->args, e2->args);
}





Re: [PATCH] c++: Add -Wexceptions warning option [PR97675]

2020-11-05 Thread Jason Merrill via Gcc-patches

On 11/5/20 11:03 AM, Marek Polacek wrote:

This PR asks that we add a warning option for an existing (very old)
warning, so that it can be disabled selectively.  clang++ uses
-Wexceptions for this, so I added this new option rather than using
e.g. -Wnoexcept.


OK.


gcc/c-family/ChangeLog:

PR c++/97675
* c.opt (Wexceptions): New option.

gcc/cp/ChangeLog:

PR c++/97675
* except.c (check_handlers_1): Use OPT_Wexceptions for the
warning.  Use inform for the second part of the warning.

gcc/ChangeLog:

PR c++/97675
* doc/invoke.texi: Document -Wexceptions.

gcc/testsuite/ChangeLog:

PR c++/97675
* g++.old-deja/g++.eh/catch10.C: Adjust dg-warning.
* g++.dg/warn/Wexceptions1.C: New test.
* g++.dg/warn/Wexceptions2.C: New test.
---
  gcc/c-family/c.opt  |  4 
  gcc/cp/except.c |  9 -
  gcc/doc/invoke.texi |  8 +++-
  gcc/testsuite/g++.dg/warn/Wexceptions1.C|  9 +
  gcc/testsuite/g++.dg/warn/Wexceptions2.C| 10 ++
  gcc/testsuite/g++.old-deja/g++.eh/catch10.C |  4 ++--
  6 files changed, 36 insertions(+), 8 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/warn/Wexceptions1.C
  create mode 100644 gcc/testsuite/g++.dg/warn/Wexceptions2.C

diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 426636be839..9493acb82ff 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -579,6 +579,10 @@ Werror-implicit-function-declaration
  C ObjC RejectNegative Warning Alias(Werror=, implicit-function-declaration)
  This switch is deprecated; use -Werror=implicit-function-declaration instead.
  
+Wexceptions

+C++ ObjC++ Var(warn_exceptions) Init(1)
+Warn when an exception handler is shadowed by another handler.
+
  Wextra
  C ObjC C++ ObjC++ Warning
  ; in common.opt
diff --git a/gcc/cp/except.c b/gcc/cp/except.c
index cb1a4105dae..985206f6a64 100644
--- a/gcc/cp/except.c
+++ b/gcc/cp/except.c
@@ -975,11 +975,10 @@ check_handlers_1 (tree master, tree_stmt_iterator i)
tree handler = tsi_stmt (i);
if (TREE_TYPE (handler) && can_convert_eh (type, TREE_TYPE (handler)))
{
- warning_at (EXPR_LOCATION (handler), 0,
- "exception of type %qT will be caught",
- TREE_TYPE (handler));
- warning_at (EXPR_LOCATION (master), 0,
- "   by earlier handler for %qT", type);
+ if (warning_at (EXPR_LOCATION (handler), OPT_Wexceptions,
+ "exception of type %qT will be caught by earlier "
+ "handler", TREE_TYPE (handler)))
+   inform (EXPR_LOCATION (master), "for type %qT", type);
  break;
}
  }
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 5320e6c1e1e..4c6435d5e14 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -240,7 +240,7 @@ in the following sections.
  -Wctor-dtor-privacy  -Wno-delete-incomplete @gol
  -Wdelete-non-virtual-dtor  -Wdeprecated-copy  -Wdeprecated-copy-dtor @gol
  -Wno-deprecated-enum-enum-conversion -Wno-deprecated-enum-float-conversion @gol
--Weffc++  -Wextra-semi  -Wno-inaccessible-base @gol
+-Weffc++  -Wno-exceptions -Wextra-semi  -Wno-inaccessible-base @gol
  -Wno-inherited-variadic-ctor  -Wno-init-list-lifetime @gol
  -Wno-invalid-offsetof  -Wno-literal-suffix  -Wmismatched-tags @gol
  -Wmultiple-inheritance  -Wnamespaces  -Wnarrowing @gol
@@ -3738,6 +3738,12 @@ When selecting this option, be aware that the standard library
  headers do not obey all of these guidelines; use @samp{grep -v}
  to filter out those warnings.
  
+@item -Wno-exceptions @r{(C++ and Objective-C++ only)}

+@opindex Wexceptions
+@opindex Wno-exceptions
+Disable the warning about the case when an exception handler is shadowed by
+another handler, which can point out a wrong ordering of exception handlers.
+
  @item -Wstrict-null-sentinel @r{(C++ and Objective-C++ only)}
  @opindex Wstrict-null-sentinel
  @opindex Wno-strict-null-sentinel
diff --git a/gcc/testsuite/g++.dg/warn/Wexceptions1.C b/gcc/testsuite/g++.dg/warn/Wexceptions1.C
new file mode 100644
index 000..af140fd0dc2
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wexceptions1.C
@@ -0,0 +1,9 @@
+// PR c++/97675
+
+struct Base { };
+struct Child : Base { };
+int main() {
+try { throw Child(); }
+catch (Base const&) { }
+catch (Child const&) { } // { dg-warning "exception of type .Child. will be caught by earlier handler" }
+}
diff --git a/gcc/testsuite/g++.dg/warn/Wexceptions2.C b/gcc/testsuite/g++.dg/warn/Wexceptions2.C
new file mode 100644
index 000..07c5155ac06
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wexceptions2.C
@@ -0,0 +1,10 @@
+// PR c++/97675
+// { dg-additional-options -Wno-exceptions }
+
+struct Base { };
+struct Child : Base { };
+int main() {
+try { throw Child(); }
+catch (Base const&) { }
+catch (Child const&

Re: Use EAF_RETURN_ARG in tree-ssa-ccp.c

2020-11-05 Thread Jeff Law via Gcc-patches


On 10/27/20 3:01 AM, Richard Biener wrote:
> On Tue, 27 Oct 2020, Jan Hubicka wrote:
>
>>> On Mon, 26 Oct 2020, Jan Hubicka wrote:
>>>
>>>> Hi,
>>>> while looking for special cases of builtins I noticed that tree-ssa-ccp
>>>> can use EAF_RETURNS_ARG.  I wonder if same should be done by value
>>>> numbering and other propagators
>>> The issue is that changing
>>>
>>>   q = memcpy (p, r);
>>>   .. use q ...
>>>
>>> to
>>>
>>>   memcpy (p, r);
>>>   .. use p ..
>>>
>>> is bad for RA so we generally do not want to copy-propagate
>>> EAF_RETURNS_ARG.  We eventually do want to optimize a following
>>>
>>>
>>>   if (q == p)
>>>
>>> of course.  And we eventually want to do the _reverse_ transform,
>>> replacing
>>>
>>>   memcpy (p, r)
>>>   .. use p ..
>>>
>>> with
>>>
>>>   tem = memcpy (p, r)
>>>   .. use tem ..
>>>
>>> ISTR playing with patches doing all of the above, would need to dig
>>> them out again.  There's also a PR about this I think.
>>>
>>> Bernd added some code to RTL call expansion, not sure exactly
>>> what it does...
>> It adds a copy instruction to call fusage, so the RTL backend now knows
>> about the equivalence.
>> void *
>> test(void *a, void *b, int l)
>> {
>>   __builtin_memcpy (a,b,l);
>>   return a;
>> }
>> eliminates the extra copy. So I would say that we should not be afraid
>> to propagate in gimple world. It is a minor thing I guess though.
>> (my interest is mostly to get rid of unnecessary special casing of
>> builtins, as these special cases are clearly not well maintained
>> because almost no one knows about them:)
> The complication is when this appears in a loop like
>
>  for (; n; --n)
>{
>  p = memcpy (p, s, k);
>  p += j;
>}
>
> then I assume IVOPTs can do a better job knowing the equivalence
> (guess we'd still need to teach SCEV about this then ...) and
> when it's not present explicitly in the SSA chain any SSA based
> analysis has difficulties seeing it.
>
> ISTR I saw regressions when doing a patch propagating those
> equivalences.

Similarly.  I don't remember the details, but definitely remember being
surprised that the propagation caused regressions and then chasing it
down to a bad interaction with the register allocator.
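
For reference, the two source forms under discussion are sketched below.  This only shows that they are interchangeable at the language level, because memcpy returns its destination argument; it says nothing about which form the register allocator prefers.  The function names are made up for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Form 1: keep using the value returned by memcpy.
inline void *copy_and_use_result (void *p, const void *r, std::size_t n)
{
  void *q = std::memcpy (p, r, n);
  return q;
}

// Form 2: ignore the returned value and keep using the destination argument.
inline void *copy_and_use_argument (void *p, const void *r, std::size_t n)
{
  std::memcpy (p, r, n);
  return p;
}

// Check that both forms hand back the destination and copy the same bytes.
inline bool both_forms_agree ()
{
  const char src[4] = "abc";
  char dst1[4];
  char dst2[4];
  const bool same_pointer1 = copy_and_use_result (dst1, src, sizeof src) == dst1;
  const bool same_pointer2 = copy_and_use_argument (dst2, src, sizeof src) == dst2;
  return same_pointer1 && same_pointer2
         && std::memcmp (dst1, dst2, sizeof dst1) == 0;
}
```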


jeff




[PATCH] c++: Fix decltype(auto) deduction with rvalue ref [PR78209]

2020-11-05 Thread Marek Polacek via Gcc-patches
Here's a small deficiency in decltype(auto).  [dcl.type.auto.deduct]/5:
If the placeholder-type-specifier is of the form decltype(auto), [...]
the type deduced for T is determined [...] as though E had been the operand
of the decltype.  So:

  int &&i = 0;
  decltype(auto) j = i; // should behave like int &&j = i; error

We deduce j's type in do_auto_deduction via finish_decltype_type which
takes an 'id' argument.  Currently we compute 'id' as false, because
stripped_init is *i (a REFERENCE_REF_P).  But it seems to me we should
rather set 'id' to true here, by looking through the REFERENCE_REF_P,
so that finish_decltype_type DTRT.

gcc/cp/ChangeLog:

PR c++/78209
* pt.c (do_auto_deduction): If init is REFERENCE_REF_P, use its
first operand.

gcc/testsuite/ChangeLog:

PR c++/78209
* g++.dg/cpp1y/decltype-auto1.C: New test.
---
 gcc/cp/pt.c | 2 ++
 gcc/testsuite/g++.dg/cpp1y/decltype-auto1.C | 8 
 2 files changed, 10 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/cpp1y/decltype-auto1.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index f401c75b9e5..c033a286407 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -29278,6 +29278,8 @@ do_auto_deduction (tree type, tree init, tree auto_node,
   else if (AUTO_IS_DECLTYPE (auto_node))
 {
   tree stripped_init = tree_strip_any_location_wrapper (init);
+  if (REFERENCE_REF_P (stripped_init))
+   stripped_init = TREE_OPERAND (stripped_init, 0);
   bool id = (DECL_P (stripped_init)
 || ((TREE_CODE (init) == COMPONENT_REF
  || TREE_CODE (init) == SCOPE_REF)
diff --git a/gcc/testsuite/g++.dg/cpp1y/decltype-auto1.C b/gcc/testsuite/g++.dg/cpp1y/decltype-auto1.C
new file mode 100644
index 000..13baf8eba06
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/decltype-auto1.C
@@ -0,0 +1,8 @@
+// PR c++/78209
+// { dg-do compile { target c++14 } }
+
+int main()
+{
+  int &&i = 0;
+  decltype(auto) j = i; // { dg-error "cannot bind rvalue reference" }
+}

base-commit: d16d45655d77d58e3f8430b9cf386b04759e01c7
-- 
2.28.0



Re: [PATCH] Remove vr_values::extract_range_builtin.

2020-11-05 Thread Jeff Law via Gcc-patches


On 10/20/20 10:43 AM, Aldy Hernandez via Gcc-patches wrote:
> As promised.
>
> Now that we know the vr_values and ranger versions are in sync, it
> is safe to remove the vr_values version and just call the ranger one.
>
> I am holding off on pushing this for a week or two, or until Fedora gets
> rebuilt with the current compiler.
>
> gcc/ChangeLog:
>
>   * vr-values.h (class vr_values): Remove extract_range_builtin.
>   * vr-values.c (vr_values::extract_range_basic): Remove call to
>   extract_range_builtin.
>   (vr_values::extract_range_builtin): Remove.

The 10/25 snapshot build is done.  11/01 snapshot testing is in
progress.  Your call when you want to commit.


jeff




Re: Use EAF_RETURN_ARG in tree-ssa-ccp.c

2020-11-05 Thread Jan Hubicka
> 
> On 10/27/20 3:01 AM, Richard Biener wrote:
> > On Tue, 27 Oct 2020, Jan Hubicka wrote:
> >
> >>> On Mon, 26 Oct 2020, Jan Hubicka wrote:
> >>>
> >>>> Hi,
> >>>> while looking for special cases of builtins I noticed that tree-ssa-ccp
> >>>> can use EAF_RETURNS_ARG.  I wonder if same should be done by value
> >>>> numbering and other propagators
> >>> The issue is that changing
> >>>
> >>>   q = memcpy (p, r);
> >>>   .. use q ...
> >>>
> >>> to
> >>>
> >>>   memcpy (p, r);
> >>>   .. use p ..
> >>>
> >>> is bad for RA so we generally do not want to copy-propagate
> >>> EAF_RETURNS_ARG.  We eventually do want to optimize a following
> >>>
> >>>
> >>>   if (q == p)
> >>>
> >>> of course.  And we eventually want to do the _reverse_ transform,
> >>> replacing
> >>>
> >>>   memcpy (p, r)
> >>>   .. use p ..
> >>>
> >>> with
> >>>
> >>>   tem = memcpy (p, r)
> >>>   .. use tem ..
> >>>
> >>> ISTR playing with patches doing all of the above, would need to dig
> >>> them out again.  There's also a PR about this I think.
> >>>
> >>> Bernd added some code to RTL call expansion, not sure exactly
> >>> what it does...
> >> It adds a copy instruction to call fusage, so the RTL backend now knows
> >> about the equivalence.
> >> void *
> >> test(void *a, void *b, int l)
> >> {
> >>   __builtin_memcpy (a,b,l);
> >>   return a;
> >> }
> >> eliminates the extra copy. So I would say that we should not be afraid
> >> to propagate in gimple world. It is a minor thing I guess though.
> >> (my interest is mostly to get rid of unnecessary special casing of
> >> builtins, as these special cases are clearly not well maintained
> >> because almost no one knows about them:)
> > The complication is when this appears in a loop like
> >
> >  for (; n; --n)
> >{
> >  p = memcpy (p, s, k);
> >  p += j;
> >}
> >
> > then I assume IVOPTs can do a better job knowing the equivalence
> > (guess we'd still need to teach SCEV about this then ...) and
> > when it's not present explicitly in the SSA chain any SSA based
> > analysis has difficulties seeing it.
> >
> > ISTR I saw regressions when doing a patch propagating those
> > equivalences.
> 
> Similarly.  I don't remember the details, but definitely remember being
> surprised that the propagation caused regressions and then chasing it
> down to a bad interaction with the register allocator.

I wonder whether that was before or after the CALL_FUSAGE code was added
to calls.c.  It is probably not that important, but given that we have
all the infrastructure in place, it seems a pity not to use it.

Honza
> 
> 
> jeff
> 
> 


Re: [committed] libstdc++: Fix constraints on std::optional comparisons [PR 96269]

2020-11-05 Thread Jonathan Wakely via Gcc-patches

On 05/11/20 22:12 +0200, Ville Voutilainen via Libstdc++ wrote:

On Thu, 5 Nov 2020 at 21:52, Jonathan Wakely via Libstdc++
 wrote:


On 05/11/20 19:09 +, Jonathan Wakely wrote:
>The relational operators for std::optional were using the wrong types
>in the declval expressions used to constrain them. Instead of using
>const lvalues they were using non-const rvalues, which meant that a type
>might satisfy the constraints but then give an error when the function
>body was instantiated.
>
>libstdc++-v3/ChangeLog:
>
>   PR libstdc++/96269
>   * include/std/optional (operator==, operator!=, operator<)
>   (operator>, operator<=, operator>=): Fix types used in
>   SFINAE constraints.
>   * testsuite/20_util/optional/relops/96269.cc: New test.
>
>Tested powerpc64le-linux. Committed to trunk.
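
The mismatch described above can be reproduced in isolation.  This is only a sketch, not the libstdc++ code: the two traits and the two structs are hypothetical, and the traits check only that the comparison is well-formed, whereas the real constraints also require the result to be convertible to bool.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Local void_t so the sketch does not depend on C++17 std::void_t.
template<typename...> struct make_void { using type = void; };
template<typename... Ts> using void_type = typename make_void<Ts...>::type;

// Hypothetical type whose operator< cannot be called on a const object.
struct NonConstLess
{
  bool operator< (const NonConstLess &) { return false; }
};

// Hypothetical type with a conventional const operator<.
struct ConstLess
{
  bool operator< (const ConstLess &) const { return false; }
};

// Constraint written with non-const rvalues (the old, too permissive form).
template<typename T, typename = void>
struct less_on_rvalues : std::false_type { };
template<typename T>
struct less_on_rvalues<T,
    void_type<decltype(std::declval<T> () < std::declval<T> ())>>
  : std::true_type { };

// Constraint written with const lvalues, matching what the body compares.
template<typename T, typename = void>
struct less_on_const_lvalues : std::false_type { };
template<typename T>
struct less_on_const_lvalues<T,
    void_type<decltype(std::declval<const T &> () < std::declval<const T &> ())>>
  : std::true_type { };
```

With these definitions NonConstLess satisfies the rvalue form but not the const lvalue form, which is the situation where the constraint passes and the function body then fails to instantiate.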

When concepts are supported we can make the alias templates
__optional_eq_t et al use a requires-expression instead of SFINAE.
This is potentially faster to compile, given expected improvements
to C++20 compilers.

I'm testing this patch.


It concerns me that we'd have such conditional conceptifying just
because it's possibly faster to compile.  There's more types where
we'd want to conditionally use concepts, but perhaps we want to think
a bit more how to do that in our source code, rather than just make
them preprocessor-conditionals in the same header.  We might entertain
conceptifying tuple, when concepts are available.  That may end up
being fairly verbose if it's done with preprocessor in <tuple>.
That's not to say that I'm objecting to this as such; I merely think
we want to be a bit careful with conceptifying, and be rather instantly
prepared to entertain doing it with a slightly different source code
structure, which may involve splitting things across more files, which
would then involve adding more headers that are installed.


I agree. I only considered doing it here (and am posting it for
comment rather than committing it right away) because we already have
the alias helpers which are used in multiple places in the file.
Without those, every relational operator would look like this if we
used concepts conditionally:

  template<typename _Tp, typename _Up>
    constexpr auto
    operator==(const optional<_Tp>& __lhs, const optional<_Up>& __rhs)
#if __cpp_lib_concepts
    requires requires(const _Tp __t, const _Up __u) {
      { __t == __u } -> convertible_to<bool>;
    }
#else
    -> enable_if_t<is_convertible_v<decltype(std::declval<const _Tp&>()
                                             == std::declval<const _Up&>()),
                                    bool>,
                   bool>
#endif
    {
      return static_cast<bool>(__lhs) == static_cast<bool>(__rhs)
             && (!__lhs || *__lhs == *__rhs);
    }

Or:

  template<typename _Tp, typename _Up>
    constexpr auto
    operator==(const optional<_Tp>& __lhs, const optional<_Up>& __rhs)
#if __cpp_lib_concepts
    requires requires { { *__lhs == *__rhs } -> convertible_to<bool>; }
#else
    -> enable_if_t<is_convertible_v<decltype(*__lhs == *__rhs), bool>, bool>
#endif
    {
      return static_cast<bool>(__lhs) == static_cast<bool>(__rhs)
             && (!__lhs || *__lhs == *__rhs);
    }

Yuck.

The second one is less verbose, but does overload resolution and type
deduction for optional<_Tp>::operator* and optional<_Up>::operator*.
That's unnecessary (and so compiles slower) because we know the result
types are just const _Tp& and const _Up&, so the first version uses
those types directly.

Either way, having that #if-#else-#endif on every relational operator
is NOT appealing. But since all the operators already use aliases like
__optional_eq_t any changes are localized to those helpers. The actual
rel ops themselves don't change.

We definitely want to think about the trade offs though. So far we
only use concepts in code that only has to compile as C++20, so we
don't need to provide a non-concepts fallback for C++17, or where it's
required for conformance (e.g. iterator_traits). That's definitely
more palatable than preprocessor conditions choosing between two
functionally equivalent ways to do the same thing.





Re: [PATCH] generalized range_query class for multiple contexts

2020-11-05 Thread Martin Sebor via Gcc-patches

On 11/5/20 12:29 PM, Martin Sebor wrote:

On 10/1/20 11:25 AM, Martin Sebor wrote:

On 10/1/20 9:34 AM, Aldy Hernandez wrote:



On 10/1/20 3:22 PM, Andrew MacLeod wrote:
 > On 10/1/20 5:05 AM, Aldy Hernandez via Gcc-patches wrote:
 >>> Thanks for doing all this!  There isn't anything I don't understand
 >>> in the sprintf changes so no questions from me (well, almost none).
 >>> Just some comments:
 >> Thanks for your comments on the sprintf/strlen API conversion.
 >>
 >>> The current call statement is available in all functions that take
 >>> a directive argument, as dir->info.callstmt.  There should be no need
 >>> to also add it as a new argument to the functions that now need it.
 >> Fixed.
 >>
 >>> The change adds code along these lines in a bunch of places:
 >>>
 >>> + value_range vr;
 >>> + if (!query->range_of_expr (vr, arg, stmt))
 >>> +   vr.set_varying (TREE_TYPE (arg));
 >>>
 >>> I thought under the new Ranger APIs when a range couldn't be
 >>> determined it would be automatically set to the maximum for
 >>> the type.  I like that and have been moving in that direction
 >>> with my code myself (rather than having an API fail, have it
 >>> set the max range and succeed).
 >> I went through all the above idioms and noticed all are being used on
 >> supported types (integers or pointers).  So range_of_expr will always
 >> return true.  I've removed the if() and the set_varying.
 >>
 >>> Since that isn't so in this case, I think it would still be nice
 >>> if the added code could be written as if the range were set to
 >>> varying in this case and (ideally) reduced to just initialization:
 >>>
 >>> value_range vr = some-function (query, stmt, arg);
 >>>
 >>> some-function could be an inline helper defined just for the sprintf
 >>> pass (and maybe also strlen which also seems to use the same pattern),
 >>> or it could be a value_range AKA irange ctor, or it could be a member
 >>> of range_query, whatever is the most appropriate.
 >>>
 >>> (If assigning/copying a value_range is thought to be too expensive,
 >>> declaring it first and then passing it to that helper to set it
 >>> would work too).
 >>>
 >>> In strlen, is the removed comment no longer relevant?  (I.e., does
 >>> the ranger solve the problem?)
 >>>
 >>> -  /* The range below may be "inaccurate" if a constant has been
 >>> -     substituted earlier for VAL by this pass that hasn't been
 >>> -     propagated through the CFG.  This shoud be fixed by the new
 >>> -     on-demand VRP if/when it becomes available (hopefully in
 >>> -     GCC 11).  */
 >> It should.
 >>
 >>> I'm wondering about the comment added to get_range_strlen_dynamic
 >>> and other places:
 >>>
 >>> + // FIXME: Use range_query instead of global ranges.
 >>>
 >>> Is that something you're planning to do in a followup or should
 >>> I remember to do it at some point?
 >> I'm not planning on doing it.  It's just a reminder that it would be
 >> beneficial to do so.
 >>
 >>> Otherwise I have no concern with the changes.
 >> It's not clear whether Andrew approved all 3 parts of the patchset
 >> or just the valuation part.  I'll wait for his nod before committing
 >> this chunk.
 >>
 >> Aldy
 >>
 > I have no issue with it, so OK.

Pushed all 3 patches.

 >
 > Just an observation that should be pointed out, I believe Aldy has all
 > the code for converting to a ranger, but we have not pursued that any
 > further yet since there is a regression due to our lack of equivalence
 > processing I think?  That should be resolved in the coming month, but at
 > the moment is a holdback/concern for converting these passes...  iirc.


Yes.  Martin, the take away here is that the strlen/sprintf pass has
been converted to the new API, but ranger is still not up and running
on it (even on the branch).

With the new API, all you have to do is remove all instances of
evrp_range_analyzer and replace them with a ranger.  That's it.
Below is an untested patch that would convert you to a ranger once
it's contributed.

IIRC when I enabled the ranger for your pass a while back, there was
one or two regressions due to missing equivalences, and the rest were
because the tests were expecting an actual specific range, and the
ranger returned a slightly different/better one.  You'll need to
adjust your tests.


Ack.  I'll be on the lookout for the ranger commit (if you happen
to remember and CC me on it just in case I might miss it that would
be great).


I have applied the patch and ran some tests.  There are quite
a few failures (see the list below).  I have only looked at
a couple.  The one in gcc.dg/tree-ssa/builtin-sprintf-warn-3.c
boils down to the following test case.  There should be no warning
for either sprintf call.  The one in h() is a false positive and
the reason for at least some of the regressions.  Somehow,
the conversions between int and char are causing Ranger to lose
the range.

$ cat t.c && gcc -O2 -S -

Re: [PATCH] c++: Fix decltype(auto) deduction with rvalue ref [PR78209]

2020-11-05 Thread Jason Merrill via Gcc-patches

On 11/5/20 3:52 PM, Marek Polacek wrote:

Here's a small deficiency in decltype(auto).  [dcl.type.auto.deduct]/5:
If the placeholder-type-specifier is of the form decltype(auto), [...]
the type deduced for T is determined [...] as though E had been the operand
of the decltype.  So:

   int &&i = 0;
   decltype(auto) j = i; // should behave like int &&j = i; error

We deduce j's type in do_auto_deduction via finish_decltype_type which
takes an 'id' argument.  Currently we compute 'id' as false, because
stripped_init is *i (a REFERENCE_REF_P).  But it seems to me we should
rather set 'id' to true here, by looking through the REFERENCE_REF_P,
so that finish_decltype_type DTRT.


OK.


gcc/cp/ChangeLog:

PR c++/78209
* pt.c (do_auto_deduction): If init is REFERENCE_REF_P, use its
first operand.

gcc/testsuite/ChangeLog:

PR c++/78209
* g++.dg/cpp1y/decltype-auto1.C: New test.
---
  gcc/cp/pt.c | 2 ++
  gcc/testsuite/g++.dg/cpp1y/decltype-auto1.C | 8 
  2 files changed, 10 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/cpp1y/decltype-auto1.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index f401c75b9e5..c033a286407 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -29278,6 +29278,8 @@ do_auto_deduction (tree type, tree init, tree auto_node,
else if (AUTO_IS_DECLTYPE (auto_node))
  {
tree stripped_init = tree_strip_any_location_wrapper (init);
+  if (REFERENCE_REF_P (stripped_init))
+   stripped_init = TREE_OPERAND (stripped_init, 0);
bool id = (DECL_P (stripped_init)
 || ((TREE_CODE (init) == COMPONENT_REF
  || TREE_CODE (init) == SCOPE_REF)
diff --git a/gcc/testsuite/g++.dg/cpp1y/decltype-auto1.C b/gcc/testsuite/g++.dg/cpp1y/decltype-auto1.C
new file mode 100644
index 000..13baf8eba06
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/decltype-auto1.C
@@ -0,0 +1,8 @@
+// PR c++/78209
+// { dg-do compile { target c++14 } }
+
+int main()
+{
+  int &&i = 0;
+  decltype(auto) j = i; // { dg-error "cannot bind rvalue reference" }
+}

base-commit: d16d45655d77d58e3f8430b9cf386b04759e01c7





Re: [PATCH] Remove vr_values::extract_range_builtin.

2020-11-05 Thread Aldy Hernandez via Gcc-patches
I'll wait for the 11/01 snapshot to finish then.

Thanks.
Aldy

On Thu, Nov 5, 2020, 21:53 Jeff Law  wrote:

>
> On 10/20/20 10:43 AM, Aldy Hernandez via Gcc-patches wrote:
> > As promised.
> >
> > Now that we know the vr_values and ranger versions are in sync, it
> > is safe to remove the vr_values version and just call the ranger one.
> >
> > I am holding off on pushing this for a week or two, or until Fedora gets
> > rebuilt with the current compiler.
> >
> > gcc/ChangeLog:
> >
> >   * vr-values.h (class vr_values): Remove extract_range_builtin.
> >   * vr-values.c (vr_values::extract_range_basic): Remove call to
> >   extract_range_builtin.
> >   (vr_values::extract_range_builtin): Remove.
>
> The 10/25 snapshot build is done.  11/01 snapshot testing is in
> progress.  Your call when you want to commit.
>
>
> jeff
>
>
>


Re: [PATCH] c++: Use two levels of caching in satisfy_atom

2020-11-05 Thread Jason Merrill via Gcc-patches

On 11/4/20 2:19 PM, Patrick Palka wrote:

[ This patch depends on

   c++: Reuse identical ATOMIC_CONSTRs during normalization

   https://gcc.gnu.org/pipermail/gcc-patches/2020-November/557929.html  ]

This improves the effectiveness of caching in satisfy_atom by querying
the cache again after we've instantiated the atom's parameter mapping.

Before instantiating its mapping, the identity of an (atom,args) pair
within the satisfaction cache is determined by idiosyncratic things such
as the level and index of each template parameter used in targets of the
parameter mapping.  For example, the associated constraints of foo in

   template <typename T> concept range = range_v<T>;
   template <typename U, typename V> void foo () requires range<U> && range<V>;

are range_v<T> (with mapping T -> U) /\ range_v<T> (with mapping T -> V).
If during satisfaction the template arguments supplied for U and V are
the same, then the satisfaction value of these two atoms will be the
same (despite their uninstantiated parameter mappings being different).

But sat_cache doesn't see this because it compares the uninstantiated
parameter mapping and the supplied template arguments of sat_entry's
independently.  So satisfy_atom currently will end up fully evaluating
the latter atom instead of reusing the satisfaction value of the former.

But there is a point when the two atoms do look the same to sat_cache,
and that's after instantiating their parameter mappings.  By querying
the cache again at this point, we're at least able to avoid substituting
the instantiated mapping into the second atom's expression.
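
The idea can be sketched with ordinary containers.  This is a rough model under stated assumptions, not the patch itself: MappedAtom, TwoLevelSatCache and evaluate are hypothetical, an integer id stands in for the pointer identity of an atom, and a mapping is reduced to a single index into the argument list.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical atom: an expression over one parameter T plus a parameter
// mapping T -> args[target_index].  ID stands in for pointer identity.
struct MappedAtom
{
  int id;
  std::string expr;
  std::size_t target_index;
};

class TwoLevelSatCache
{
public:
  bool satisfy (const MappedAtom &atom, const std::vector<std::string> &args)
  {
    // Level 1: keyed on the atom's identity and the full argument list.
    const auto key1 = std::make_pair (atom.id, args);
    const auto hit1 = by_atom_and_args_.find (key1);
    if (hit1 != by_atom_and_args_.end ())
      return hit1->second;

    // "Instantiate" the mapping, then query again with the instantiated form.
    const auto key2 = std::make_pair (atom.expr, args.at (atom.target_index));
    const auto hit2 = by_instantiated_atom_.find (key2);
    bool result;
    if (hit2 != by_instantiated_atom_.end ())
      result = hit2->second;
    else
      {
        ++full_evaluations_;
        result = evaluate (key2.second);
        by_instantiated_atom_.emplace (key2, result);
      }
    by_atom_and_args_.emplace (key1, result);
    return result;
  }

  int full_evaluations () const { return full_evaluations_; }

private:
  // Placeholder for the expensive substitution and evaluation step.
  static bool evaluate (const std::string &type) { return type != "void"; }

  std::map<std::pair<int, std::vector<std::string>>, bool> by_atom_and_args_;
  std::map<std::pair<std::string, std::string>, bool> by_instantiated_atom_;
  int full_evaluations_ = 0;
};

// Model foo's two atoms being satisfied with equal arguments for U and V.
inline int evaluations_for_equal_arguments ()
{
  TwoLevelSatCache cache;
  const MappedAtom first{1, "range_v<T>", 0};   // mapping T -> U
  const MappedAtom second{2, "range_v<T>", 1};  // mapping T -> V
  const std::vector<std::string> args{"int", "int"};
  cache.satisfy (first, args);
  cache.satisfy (second, args);
  return cache.full_evaluations ();
}
```

In this model the second atom misses at the first level but hits at the second, so the expensive step runs once.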

With this patch, compile time and memory usage for the cmcstl2 test
test/algorithm/set_symmetric_difference4.cpp drops from 11s/1.4GB to
8.5s/1.2GB with an --enable-checking=release compiler.


How does the performance compare if we *only* cache after substituting 
into the parameter mapping?  I'd expect that substitution to be pretty 
cheap in general.



Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

gcc/cp/ChangeLog:

* cp-tree.h (ATOMIC_CONSTR_MAP_INSTANTIATED_P): Define this flag
for ATOMIC_CONSTRs.
* constraint.cc (sat_hasher::hash): Use hash_atomic_constraint
if the flag is set, otherwise keep using a pointer hash.
(sat_hasher::equal): Return false if the flag's setting differs
on two atoms.  Call atomic_constraints_identical_p if the flag
is set, otherwise keep using a pointer equality test.
(satisfy_atom): After instantiating the parameter mapping, form
another ATOMIC_CONSTR using the instantiated mapping and query
the cache again.  Cache the satisfaction value of both atoms.
(diagnose_atomic_constraint): Simplify now that the supplied
atom has an instantiated mapping.
---
  gcc/cp/constraint.cc | 47 +++-
  gcc/cp/cp-tree.h |  6 ++
  2 files changed, 44 insertions(+), 9 deletions(-)

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 55dba362ca5..c612bfba13b 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -2315,12 +2315,32 @@ struct sat_hasher : ggc_ptr_hash<sat_entry>
  {
static hashval_t hash (sat_entry *e)
{
+if (ATOMIC_CONSTR_MAP_INSTANTIATED_P (e->constr))
+  {
+   gcc_assert (!e->args);
+   return hash_atomic_constraint (e->constr);
+  }
+
  hashval_t value = htab_hash_pointer (e->constr);
  return iterative_hash_template_arg (e->args, value);
}
  
static bool equal (sat_entry *e1, sat_entry *e2)

{
+if (ATOMIC_CONSTR_MAP_INSTANTIATED_P (e1->constr)
+   != ATOMIC_CONSTR_MAP_INSTANTIATED_P (e2->constr))
+  return false;
+
+if (ATOMIC_CONSTR_MAP_INSTANTIATED_P (e1->constr))
+  {
+   /* Atoms with instantiated mappings are built in satisfy_atom.  */
+   gcc_assert (!e1->args && !e2->args);
+   return atomic_constraints_identical_p (e1->constr, e2->constr);
+  }
+
+/* Atoms with uninstantiated mappings are built in normalize_atom.
+   Their identity is determined by their pointer value due to
+   the caching of ATOMIC_CONSTRs performed therein.  */
  if (e1->constr != e2->constr)
return false;
  return template_args_equal (e1->args, e2->args);
@@ -2614,6 +2634,18 @@ satisfy_atom (tree t, tree args, subst_info info)
return cache.save (boolean_false_node);
  }
  
+  /* Now build a new atom using the instantiated mapping.  We use

+ this atom as a second key to the satisfaction cache, and we
+ also pass it to diagnose_atomic_constraint so that diagnostics
+ which refer to the atom display the instantiated mapping.  */
+  t = copy_node (t);
+  ATOMIC_CONSTR_MAP (t) = map;
+  gcc_assert (!ATOMIC_CONSTR_MAP_INSTANTIATED_P (t));
+  ATOMIC_CONSTR_MAP_INSTANTIATED_P (t) = true;
+  satisfaction_cache inst_cache (t, /*args=*/NULL_TREE, info.complain);
+  if (tree r = inst_cache.get ())
+return cache.save (r);
+
/* Rebuild the argument vector f

Re: [PATCH] Remove vr_values::extract_range_builtin.

2020-11-05 Thread Jeff Law via Gcc-patches


On 11/5/20 2:40 PM, Aldy Hernandez wrote:
> I'll wait for the 11/01 snapshot to finish then.

I'm worried that the 11/01 snapshot is going to generate so many
failures that it may not be useful.  I'm not sure what's going on, but
I'm getting a ton of what appear to be codegen correctness issues.


jeff




Re: [PATCH] c++: Consider only relevant template arguments in sat_hasher

2020-11-05 Thread Jason Merrill via Gcc-patches

On 11/5/20 11:18 AM, Patrick Palka wrote:

[ This patch depends on

   c++: Use two levels of caching in satisfy_atom

   https://gcc.gnu.org/pipermail/gcc-patches/2020-November/558096.html  ]

A large source of cache misses in satisfy_atom is caused by the identity
of an (atom,args) pair within the satisfaction cache being determined by
the entire set of supplied template arguments rather than by the subset
of template arguments that the atom actually depends on.  For instance,
consider

   template <typename T>
   concept range = range_v<T>;

   template <typename U> void foo () requires range<U>;
   template <typename U, typename V> void bar () requires range<U>;

The associated constraints of foo and bar are equivalent: they both
consist of the atom range_v<T> (with mapping T -> U).  But the sat_cache
currently will never reuse a satisfaction value between the two atoms
because foo has one template parameter and bar has two, and the
satisfaction cache conservatively assumes that all template parameters
are relevant to a satisfaction value of an atom.

This patch eliminates this assumption and makes the sat_cache instead
care about just the subset of args of an (atom,args) pair that's used
in the targets of an atom's parameter mapping.
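
A rough model of the keying change, assuming made-up names throughout (DependentAtom, RelevantArgsCache, used_params): the cache key is built from only those arguments whose positions the atom's mapping refers to, so foo and bar from the example above share one entry.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical atom: an expression plus the positions of the template
// parameters that the targets of its parameter mapping actually use.
struct DependentAtom
{
  std::string expr;
  std::vector<std::size_t> used_params;
};

class RelevantArgsCache
{
public:
  bool satisfy (const DependentAtom &atom, const std::vector<std::string> &args)
  {
    // Build the key from the relevant subset of ARGS only.
    std::vector<std::string> relevant;
    for (std::size_t index : atom.used_params)
      relevant.push_back (args.at (index));

    const auto key = std::make_pair (atom.expr, relevant);
    const auto hit = entries_.find (key);
    if (hit != entries_.end ())
      return hit->second;

    // Placeholder for the real evaluation: fail if any relevant argument
    // is "void".
    ++misses_;
    bool result = true;
    for (const std::string &type : relevant)
      if (type == "void")
        result = false;
    entries_.emplace (key, result);
    return result;
  }

  int misses () const { return misses_; }

private:
  std::map<std::pair<std::string, std::vector<std::string>>, bool> entries_;
  int misses_ = 0;
};

// Model satisfying foo<int> and then bar<int, char> with the same atom.
inline int misses_for_foo_and_bar ()
{
  RelevantArgsCache cache;
  DependentAtom atom;
  atom.expr = "range_v<T>";
  atom.used_params.push_back (0);  // mapping T -> U, where U is parameter 0

  const std::vector<std::string> foo_args{"int"};
  const std::vector<std::string> bar_args{"int", "char"};
  cache.satisfy (atom, foo_args);
  cache.satisfy (atom, bar_args);
  return cache.misses ();
}
```

Keying on the full argument lists instead would give two misses here, since the lists differ in length.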

With this patch, compile time and memory usage for the cmcstl2 test
test/algorithm/set_symmetric_difference4.cpp drops from 8.5s/1.2GB to
3.5s/0.4GB.


This seems like another situation where caching only after substitution 
of the parameter mapping might make things simpler.



Bootstrapped and regtested on x86_64-pc-linux-gnu.

gcc/cp/ChangeLog:

* constraint.cc (norm_info::norm_info): Initialize orig_decl.
(norm_info::orig_decl): New data member.
(normalize_atom): When caching an atom for the first time,
compute a list of template parameters used in the targets of the
parameter mapping and store it in the TREE_TYPE of the mapping.
(sat_hasher::hash): Use this list to hash only the template
arguments that are relevant to the atom.
(satisfy_atom): Use this list to compare only the template
arguments that are relevant to the atom.
---
  gcc/cp/constraint.cc | 66 ++--
  1 file changed, 63 insertions(+), 3 deletions(-)

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index c612bfba13b..221c5b21c7f 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -616,7 +616,8 @@ struct norm_info : subst_info
  
norm_info (tree in_decl, tsubst_flags_t complain)

  : subst_info (tf_warning_or_error | complain, in_decl),
-  context (make_context (in_decl))
+  context (make_context (in_decl)),
+  orig_decl (in_decl)
{}
  
bool generate_diagnostics() const

@@ -647,6 +648,12 @@ struct norm_info : subst_info
   for that check.  */
  
tree context;

+
+  /* The declaration whose constraints we're normalizing.  The targets
+ of the parameter mapping of each atom will be in terms of template
+ parameters of ORIG_DECL.  */
+
+  tree orig_decl = NULL_TREE;
  };
  
  static tree normalize_expression (tree, tree, norm_info);

@@ -758,6 +765,28 @@ normalize_atom (tree t, tree args, norm_info info)
tree *slot = atom_cache->find_slot (atom, INSERT);
if (*slot)
return *slot;
+
+  /* Find all template parameters used in the targets of the parameter
+mapping, and store a list of them in the TREE_TYPE of the mapping.
+This list will be used by sat_hasher to determine the subset of
+supplied template arguments that the satisfaction value of the atom
+depends on.  */
+  if (map)
+   {
+ tree targets = make_tree_vec (list_length (map));
+ int i = 0;
+ for (tree node = map; node; node = TREE_CHAIN (node))
+   {
+ tree target = TREE_PURPOSE (node);
+ TREE_VEC_ELT (targets, i++) = target;
+   }
+ tree ctx_parms = (info.orig_decl
+   ? DECL_TEMPLATE_PARMS (info.orig_decl)
+   : current_template_parms);
+ tree target_parms = find_template_parameters (targets, ctx_parms);
+ TREE_TYPE (map) = target_parms;
+   }
+
*slot = atom;
  }
return atom;
@@ -2322,7 +2351,20 @@ struct sat_hasher : ggc_ptr_hash<sat_entry>
}
  
  hashval_t value = htab_hash_pointer (e->constr);

-return iterative_hash_template_arg (e->args, value);
+
+tree map = ATOMIC_CONSTR_MAP (e->constr);
+if (map)
+  for (tree target_parms = TREE_TYPE (map);
+  target_parms;
+  target_parms = TREE_CHAIN (target_parms))
+   {
+ int level, index;
+ tree parm = TREE_VALUE (target_parms);
+ template_parm_level_and_index (parm, &level, &index);
+ tree arg = TMPL_ARG (e->args, level, index);
+ value = iterative_hash_template_arg (arg, value);
+   }
+return value;
}
  
static bool equal (sat_entry *e1, sat_entry *e2)

@@ -2343,7 +2385,25

Re: gcc-wwwdocs branch master updated. 88e29096c36837553fc841bd1fa5df6caa776b44

2020-11-05 Thread Gerald Pfeifer
On Thu, 29 Oct 2020, hongtao Liu via Gcc-cvs-wwwdocs wrote:
> The branch, master has been updated
>via  88e29096c36837553fc841bd1fa5df6caa776b44 (commit)
>   from  053c956f6e9c71efac5be01f8a8ba79f15d87f4b (commit)

>GCC now supports the Intel CPU named Alderlake through
>  -march=alderlake.
> -The switch enables the CLDEMOTE PTWRITE WAITPKG SERIALIZE ISA extensions.
> +The switch enables the CLDEMOTE PTWRITE WAITPKG SERIALIZE KEYLOCKER
> +ISA extensions.

I did not see this posted on gcc-patches.  Should this list of
extensions be separated by commas?

(I can make that change if you agree.)

Also, I did not see you in gcc/MAINTAINERS, or did I miss it?
Since evidently you have write after approval access, please
add yourself there.

Gerald


[Patch] Fortran: Fix function decl's location [PR95847]

2020-11-05 Thread Tobias Burnus

In gfc_get_symbol_decl, if an external procedure is invoked
and sym->backend_decl is NULL_TREE,
  gfc_get_extern_function_decl
is called. This searches the translation unit (or gsym) to
find the declaration – and if found, it returns it.

Well, that worked and the module procedure's decl is returned
with DECL_SOURCE_LOCATION() matching the original declaration
and the associated cfun->function_end_location is also properly
set.

But before this patch, the location is reset to the sym->declared_at,
which is the location of the "use foo" line in the example from
the PR.

Result: The DECL_SOURCE_LOCATION(cfun->decl) is *after*
cfun->function_end_location, which triggers an assert
with --coverage.


The other changes to BIND_EXPR are unrelated. It was just that
I ran with '-fdump-tree-original-lineno' and got:

  foo_suite ()
  [coverage.f90:17:22] {
integer(kind=4) res;

[coverage.f90:16:17] res = bar ([coverage.f90:16:17] sbr);
  }

Where line 17 for '{' looked odd. With the patch, we now have:

  foo_suite ()
  [coverage.f90:12:0] {
integer(kind=4) res;

[coverage.f90:16:17] res = bar ([coverage.f90:16:17] sbr);
  }


OK for the trunk?

Tobias

-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter

Fortran: Fix function decl's location [PR95847]

gcc/fortran/ChangeLog:

	PR fortran/95847
	* trans-decl.c (gfc_get_symbol_decl): Do not (re)set the location
	of an external procedure.
	(build_entry_thunks, generate_coarray_init, create_main_function,
	gfc_generate_function_code): Use fndecl's location in BIND_EXPR.

gcc/testsuite/ChangeLog:

	PR fortran/95847
	* gfortran.dg/coverage.f90: New test.

 gcc/fortran/trans-decl.c   | 19 ++-
 gcc/testsuite/gfortran.dg/coverage.f90 | 17 +
 2 files changed, 27 insertions(+), 9 deletions(-)

diff --git a/gcc/fortran/trans-decl.c b/gcc/fortran/trans-decl.c
index cdef753ea8d..71d5c670e55 100644
--- a/gcc/fortran/trans-decl.c
+++ b/gcc/fortran/trans-decl.c
@@ -1749,7 +1749,6 @@ gfc_get_symbol_decl (gfc_symbol * sym)
 	  || sym->attr.if_source != IFSRC_DECL)
 	{
 	  decl = gfc_get_extern_function_decl (sym);
-	  gfc_set_decl_location (decl, &sym->declared_at);
 	}
   else
 	{
@@ -3021,8 +3020,9 @@ build_entry_thunks (gfc_namespace * ns, bool global)
   poplevel (1, 1);
   BLOCK_SUPERCONTEXT (DECL_INITIAL (thunk_fndecl)) = thunk_fndecl;
   DECL_SAVED_TREE (thunk_fndecl)
-	= build3_v (BIND_EXPR, tmp, DECL_SAVED_TREE (thunk_fndecl),
-		DECL_INITIAL (thunk_fndecl));
+	= fold_build3_loc (DECL_SOURCE_LOCATION (thunk_fndecl), BIND_EXPR,
+			   void_type_node, tmp, DECL_SAVED_TREE (thunk_fndecl),
+			   DECL_INITIAL (thunk_fndecl));
 
   /* Output the GENERIC tree.  */
   dump_function (TDI_original, thunk_fndecl);
@@ -5786,8 +5786,8 @@ generate_coarray_init (gfc_namespace * ns __attribute((unused)))
   BLOCK_SUPERCONTEXT (DECL_INITIAL (fndecl)) = fndecl;
 
   DECL_SAVED_TREE (fndecl)
-= build3_v (BIND_EXPR, decl, DECL_SAVED_TREE (fndecl),
-DECL_INITIAL (fndecl));
+= fold_build3_loc (DECL_SOURCE_LOCATION (fndecl), BIND_EXPR, void_type_node,
+		   decl, DECL_SAVED_TREE (fndecl), DECL_INITIAL (fndecl));
   dump_function (TDI_original, fndecl);
 
   cfun->function_end_locus = input_location;
@@ -6512,8 +6512,9 @@ create_main_function (tree fndecl)
   BLOCK_SUPERCONTEXT (DECL_INITIAL (ftn_main)) = ftn_main;
 
   DECL_SAVED_TREE (ftn_main)
-= build3_v (BIND_EXPR, decl, DECL_SAVED_TREE (ftn_main),
-		DECL_INITIAL (ftn_main));
+= fold_build3_loc (DECL_SOURCE_LOCATION (ftn_main), BIND_EXPR,
+		   void_type_node, decl, DECL_SAVED_TREE (ftn_main),
+		   DECL_INITIAL (ftn_main));
 
   /* Output the GENERIC tree.  */
   dump_function (TDI_original, ftn_main);
@@ -7004,8 +7005,8 @@ gfc_generate_function_code (gfc_namespace * ns)
   BLOCK_SUPERCONTEXT (DECL_INITIAL (fndecl)) = fndecl;
 
   DECL_SAVED_TREE (fndecl)
-= build3_v (BIND_EXPR, decl, DECL_SAVED_TREE (fndecl),
-		DECL_INITIAL (fndecl));
+= fold_build3_loc (DECL_SOURCE_LOCATION (fndecl), BIND_EXPR, void_type_node,
+		   decl, DECL_SAVED_TREE (fndecl), DECL_INITIAL (fndecl));
 
   /* Output the GENERIC tree.  */
   dump_function (TDI_original, fndecl);
diff --git a/gcc/testsuite/gfortran.dg/coverage.f90 b/gcc/testsuite/gfortran.dg/coverage.f90
new file mode 100644
index 000..e0800f869c1
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/coverage.f90
@@ -0,0 +1,17 @@
+! { dg-do compile }
+! { dg-additional-options "-fprofile-arcs -ftest-coverage" }
+!
+! PR fortran/95847
+!
+module foo
+contains
+subroutine sbr()
+end subroutine sbr
+end module foo
+
+function foo_suite() result(suite)
+   use foo
+   integer :: bar
+   integer :: res
+   res = bar(sbr)
+end function foo_suite


Re: [PATCH, rs6000] Optimize pcrel access of globals (updated, ping)

2020-11-05 Thread will schmidt via Gcc-patches
On Wed, 2020-11-04 at 12:10 -0600, acsawdey--- via Gcc-patches wrote:
> From: Aaron Sawdey 
> 
> Ping, as it has been a while.
> This also includes a slight fix to make sure that all references can get
> optimized.
> 


I've read over what I could.  A few nits below; nothing significant
jumped out at me, though this is also not my area of expertise.  :-)

Comments inline below.
Thanks,
-Will


> This patch implements a RTL pass that looks for pc-relative loads of the
> address of an external variable using the PCREL_GOT relocation and a
> single load or store that uses that external address.
> 
> Produced by a cast of thousands:
>  * Michael Meissner
>  * Peter Bergner
>  * Bill Schmidt
>  * Alan Modra
>  * Segher Boessenkool
>  * Aaron Sawdey
> 
> Passes bootstrap/regtest on ppc64le power10. OK for trunk?

Any impact to non-power10 targets?  (power9,power8, or BE, ...)


> 
> gcc/ChangeLog:
> 
>   * config.gcc: Add pcrel-opt.o.

pcrel-opt.c and pcrel-opt.o entries.


>   * config/rs6000/pcrel-opt.c: New file.
>   * config/rs6000/pcrel-opt.md: New file.
>   * config/rs6000/predicates.md: Add d_form_memory predicate.
>   * config/rs6000/rs6000-cpus.def: Add OPTION_MASK_PCREL_OPT.
>   * config/rs6000/rs6000-passes.def: Add pass_pcrel_opt.
>   * config/rs6000/rs6000-protos.h: Add reg_to_non_prefixed(),
>   offsettable_non_prefixed_memory(), output_pcrel_opt_reloc(),
>   and make_pass_pcrel_opt().
>   * config/rs6000/rs6000.c (reg_to_non_prefixed): Make global.
>   (rs6000_option_override_internal): Add pcrel-opt.
>   (rs6000_delegitimize_address): Support pcrel-opt.
>   (rs6000_opt_masks): Add pcrel-opt.
>   (offsettable_non_prefixed_memory): New function.
>   (reg_to_non_prefixed): Make global.
>   (rs6000_asm_output_opcode): Reset next_insn_prefixed_p.
>   (output_pcrel_opt_reloc): New function.
>   * config/rs6000/rs6000.md (loads_extern_addr): New attr.
>   (pcrel_extern_addr): Set loads_extern_addr.
>   Add include for pcrel-opt.md.
>   * config/rs6000/rs6000.opt: Add -mpcrel-opt.
>   * config/rs6000/t-rs6000: Add rules for pcrel-opt.c and
> pcrel-opt.md.

indent.

> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/powerpc/pcrel-opt-inc-di.c: New test.
>   * gcc.target/powerpc/pcrel-opt-ld-df.c: New test.
>   * gcc.target/powerpc/pcrel-opt-ld-di.c: New test.
>   * gcc.target/powerpc/pcrel-opt-ld-hi.c: New test.
>   * gcc.target/powerpc/pcrel-opt-ld-qi.c: New test.
>   * gcc.target/powerpc/pcrel-opt-ld-sf.c: New test.
>   * gcc.target/powerpc/pcrel-opt-ld-si.c: New test.
>   * gcc.target/powerpc/pcrel-opt-ld-vector.c: New test.
>   * gcc.target/powerpc/pcrel-opt-st-df.c: New test.
>   * gcc.target/powerpc/pcrel-opt-st-di.c: New test.
>   * gcc.target/powerpc/pcrel-opt-st-hi.c: New test.
>   * gcc.target/powerpc/pcrel-opt-st-qi.c: New test.
>   * gcc.target/powerpc/pcrel-opt-st-sf.c: New test.
>   * gcc.target/powerpc/pcrel-opt-st-si.c: New test.
>   * gcc.target/powerpc/pcrel-opt-st-vector.c: New test.
> ---
>  gcc/config.gcc|   6 +-
>  gcc/config/rs6000/pcrel-opt.c | 888 ++
>  gcc/config/rs6000/pcrel-opt.md| 386 
>  gcc/config/rs6000/predicates.md   |  23 +
>  gcc/config/rs6000/rs6000-cpus.def |   2 +
>  gcc/config/rs6000/rs6000-passes.def   |   8 +
>  gcc/config/rs6000/rs6000-protos.h |   4 +
>  gcc/config/rs6000/rs6000.c| 116 ++-
>  gcc/config/rs6000/rs6000.md   |   8 +-
>  gcc/config/rs6000/rs6000.opt  |   4 +
>  gcc/config/rs6000/t-rs6000|   7 +-
>  .../gcc.target/powerpc/pcrel-opt-inc-di.c |  18 +
>  .../gcc.target/powerpc/pcrel-opt-ld-df.c  |  36 +
>  .../gcc.target/powerpc/pcrel-opt-ld-di.c  |  43 +
>  .../gcc.target/powerpc/pcrel-opt-ld-hi.c  |  42 +
>  .../gcc.target/powerpc/pcrel-opt-ld-qi.c  |  42 +
>  .../gcc.target/powerpc/pcrel-opt-ld-sf.c  |  42 +
>  .../gcc.target/powerpc/pcrel-opt-ld-si.c  |  41 +
>  .../gcc.target/powerpc/pcrel-opt-ld-vector.c  |  36 +
>  .../gcc.target/powerpc/pcrel-opt-st-df.c  |  36 +
>  .../gcc.target/powerpc/pcrel-opt-st-di.c  |  37 +
>  .../gcc.target/powerpc/pcrel-opt-st-hi.c  |  42 +
>  .../gcc.target/powerpc/pcrel-opt-st-qi.c  |  42 +
>  .../gcc.target/powerpc/pcrel-opt-st-sf.c  |  36 +
>  .../gcc.target/powerpc/pcrel-opt-st-si.c  |  41 +
>  .../gcc.target/powerpc/pcrel-opt-st-vector.c  |  36 +
>  26 files changed, 2013 insertions(+), 9 deletions(-)
>  create mode 100644 gcc/config/rs6000/pcrel-opt.c
>  create mode 100644 gcc/config/rs6000/pcrel-opt.md
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-inc-di.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-df.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pcrel-opt-ld-di.c
>  create mod

Re: [PATCH] configure: Suppress output from multi-do recipes

2020-11-05 Thread Jeff Law via Gcc-patches


On 10/14/20 11:55 AM, Jonathan Wakely via Gcc-patches wrote:
> On 14/10/20 17:29 +0100, Jonathan Wakely wrote:
>> The FIXME comment saying "Leave out until this is tested a bit more" is
>> from 1997. I think it's been sufficiently tested.
>>
>> ChangeLog:
>>
>>     * config-ml.in (multi-do): Add @ to silence recipe. Remove FIXME
>>     comment.
>>
>> OK for trunk?
>>
>> This removes 44 lines of irrelevant noise from various build targets,
>> such as the 'check' target that runs the libstdc++ testsuite.
>
> Actually there are two instances of this FIXME in that file. This
> revised patch deals with both.
>
> It looks like this file is shared with binutils-gdb and newlib-cygwin,
> I've only tested it for GCC.

OK

jeff




Re: [PATCH] PR target/96307: Fix KASAN option checking.

2020-11-05 Thread Jeff Law via Gcc-patches


On 10/16/20 3:01 AM, Martin Liška wrote:
> On 10/16/20 9:41 AM, Kito Cheng wrote:
>> I think it is still useful for other targets which are not supporting
>> libsanitizer yet, so in this patch I also moved related testcases
>> from gcc.target to gcc.dg.
>
> All right, I can't approve the patch, but I support it.

Well, that's good enough for me :-)  Approved.


jeff




Re: [PATCH] Remove vr_values::extract_range_builtin.

2020-11-05 Thread Aldy Hernandez via Gcc-patches
Ugh.  Well, we need to wait for something later than the 25th's snapshot
since I committed the assert patch after that.

Aldy

On Thu, Nov 5, 2020, 22:43 Jeff Law  wrote:

>
> On 11/5/20 2:40 PM, Aldy Hernandez wrote:
> > I'll wait for the 11/01 snapshot to finish then.
>
> I'm worried that the 11/01 snapshot is going to generate so many
> failures that it may not be useful.  I'm not sure what's going on, but
> I'm getting a ton of what appear to be codegen correctness issues.
>
>
> jeff
>
>
>


Re: [PATCH, rs6000] Update instruction attributes for Power10

2020-11-05 Thread will schmidt via Gcc-patches
On Wed, 2020-11-04 at 14:42 -0600, Pat Haugen via Gcc-patches wrote:
> Update instruction attributes for Power10.
> 
> 
> This patch updates the type/prefixed/dot/size attributes for various new 
> instructions (and a couple existing that were incorrect) in preparation for 
> the Power10 scheduling patch that will be following.
> 
> Bootstrap/regtest on powerpc64le (Power8/Power10) with no new regressions. Ok 
> for trunk?
> 
> -Pat
> 
> 
> 2020-11-04  Pat Haugen  
> 
> gcc/
>   * config/rs6000/altivec.md (vsdb_, xxspltiw_v4si,
>   xxspltiw_v4sf_inst, xxspltidp_v2df_inst, xxsplti32dx_v4si_inst,
>   xxsplti32dx_v4sf_inst, xxblend_, xxpermx_inst,
>   vstrir_code_, vstrir_p_code_, vstril_code_,
>   vstril_p_code_, altivec_lvsl_reg, altivec_lvsl_direct,
>   altivec_lvsr_reg, altivec_lvsr_direct, xxeval, vcfuged, vclzdm,
>   vctzdm, vpdepd, vpextd, vgnb, vclrlb, vclrrb): Update instruction
>   attributes for Power10.
>   * config/rs6000/dfp.md (extendddtd2, trunctddd2, *cmp_internal1,
>   floatditd2, ftrunc2, fixdi2, dfp_ddedpd_,
>   dfp_denbcd_, dfp_dxex_, dfp_diex_,
>   *dfp_sgnfcnc_, dfp_dscli_, dfp_dscri_): Likewise.
>   * config/rs6000/mma.md (*movpoi, mma_, mma_,
>   mma_, mma_, mma_, mma_,
>   mma_, mma_, mma_, mma_):
>   Likewise.
>   * config/rs6000/rs6000.c (rs6000_final_prescan_insn): Only add 'p' for
>   PREFIXED_YES.

The code change reads roughly as
- next_insn_prefixed_p != PREFIXED_NO
+ next_insn_prefixed_p == PREFIXED_YES

So is it just an inversion of the logic?  I don't obviously see the 'p'
impact there.
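
One possible reading, sketched here under the assumption that the 'always'
value added to the "prefixed" attribute later in this ChangeLog makes the
attribute three-valued: the two conditions then stop being plain inverses.
The enumerator names below are hypothetical, not the ones in rs6000.

```cpp
#include <cassert>

// Hypothetical three-valued attribute; the real enumerators live in the
// rs6000 back end and may be named differently.
enum class Prefixed { No, Yes, Always };

// Shape of the old test: anything that is not "no" counts.
constexpr bool old_check(Prefixed p) { return p != Prefixed::No; }

// Shape of the new test: only an explicit "yes" counts.
constexpr bool new_check(Prefixed p) { return p == Prefixed::Yes; }

// The two tests agree on No and Yes ...
static_assert(old_check(Prefixed::No) == new_check(Prefixed::No), "No");
static_assert(old_check(Prefixed::Yes) == new_check(Prefixed::Yes), "Yes");
// ... and differ only for the third value.
static_assert(old_check(Prefixed::Always) != new_check(Prefixed::Always),
              "Always");
```

Under that assumption, the change would only affect insns whose attribute is
the third value; whether that matches the intent of the patch is for the
author to confirm.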


>   * config/rs6000/rs6000.md (define_attr "size"): Add 256.
>   (define_attr "prefixed"): Add 'always'.
>   (define_mode_attr bits): Add DD/TD modes.
>   (cfuged, cntlzdm, cnttzdm, pdepd, pextd, bswaphi2_reg, bswapsi2_reg,
>   bswapdi2_brd, setbc_signed_,
>   *setbcr_signed_, *setnbc_signed_,
>   *setnbcr_signed_): Update instruction attributes for
>   Power10.

ok.  (assuming the assorted 'integer' -> 'crypto' changes are correct,
of course).  

>   * config/rs6000/sync.md (load_quadpti, store_quadpti, load_lockedpti,
>   store_conditionalpti): Update instruction attributes for Power10.
>   * config/rs6000/vsx.md (*xvtlsbb_internal, xxgenpcvm__internal,
>   vextractl_internal, vextractr_internal,
>   vinsertvl_internal_, vinsertvr_internal_,
>   vinsertgl_internal_, vinsertgr_internal_,
>   vreplace_elt__inst): Likewise.


lgtm, 
thanks
-Will

> 



Re: [PATCH] c++: Use two levels of caching in satisfy_atom

2020-11-05 Thread Patrick Palka via Gcc-patches
On Thu, 5 Nov 2020, Jason Merrill wrote:

> On 11/4/20 2:19 PM, Patrick Palka wrote:
> > [ This patch depends on
> > 
> >c++: Reuse identical ATOMIC_CONSTRs during normalization
> > 
> >https://gcc.gnu.org/pipermail/gcc-patches/2020-November/557929.html  ]
> > 
> > This improves the effectiveness of caching in satisfy_atom by querying
> > the cache again after we've instantiated the atom's parameter mapping.
> > 
> > Before instantiating its mapping, the identity of an (atom,args) pair
> > within the satisfaction cache is determined by idiosyncratic things such
> > as the level and index of each template parameter used in targets of the
> > parameter mapping.  For example, the associated constraints of foo in
> > 
> > >template <class T> concept range = range_v<T>;
> > >template <class U, class V> void foo () requires range<U> && range<V>;
> > > 
> > > are range_v<T> (with mapping T -> U) /\ range_v<T> (with mapping T -> V).
> > If during satisfaction the template arguments supplied for U and V are
> > the same, then the satisfaction value of these two atoms will be the
> > same (despite their uninstantiated parameter mappings being different).
> > 
> > But sat_cache doesn't see this because it compares the uninstantiated
> > parameter mapping and the supplied template arguments of sat_entry's
> > > independently.  So satisfy_atom currently will end up fully evaluating
> > the latter atom instead of reusing the satisfaction value of the former.
> > 
> > But there is a point when the two atoms do look the same to sat_cache,
> > and that's after instantiating their parameter mappings.  By querying
> > the cache again at this point, we're at least able to avoid substituting
> > the instantiated mapping into the second atom's expression.
> > 
> > With this patch, compile time and memory usage for the cmcstl2 test
> > > test/algorithm/set_symmetric_difference4.cpp drops from 11s/1.4GB to
> > 8.5s/1.2GB with an --enable-checking=release compiler.
> 
> How does the performance compare if we *only* cache after substituting into
> the parameter mapping?  I'd expect that substitution to be pretty cheap in
> general.

tsubst_parameter_mapping is surprisingly expensive.  If we only cache
after substituting into the mapping, then for e.g. the libstdc++ test
std/ranges/adaptor/join.cc the performance stats are 5s/600MB vs
2s/225MB (with the three patches).  Profiling shows that
tsubst_parameter_mapping accounts for ~50% of compile time (and
apparently a ton of garbage generation) and tsubst_expr only for ~10% of
compile time in this scheme.  Compiling the cmcstl2 test mentioned above
required 8+GB memory before I killed the process.

Also, only caching after substituting into the mapping means there's no
way to cache negative satisfaction results that arise from failed
substitution into the parameter mapping, IIUC.
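
To make the two-level scheme concrete, here is a small self-contained model
(a sketch only, with hypothetical names Atom, Args and TwoLevelCache; the
real code operates on trees and sat_entry).  Level 1 is keyed before
substitution and can remember a failed substitution; level 2 is keyed after
substitution and lets atoms with different mappings share a result.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

using Args = std::map<std::string, std::string>;  // parameter -> argument

struct Atom {
  int id;              // identity of the uninstantiated atom
  std::string expr;    // e.g. "range_v"
  std::string target;  // parameter named by the mapping, e.g. "U"
};

struct TwoLevelCache {
  // Level 1: keyed before substitution, by atom identity and all args.
  std::map<std::pair<int, Args>, bool> by_atom;
  // Level 2: keyed after substitution, by expression and instantiated arg.
  std::map<std::pair<std::string, std::string>, bool> by_instantiated;
  int evaluations = 0;

  template <class Eval>
  bool satisfy(const Atom& atom, const Args& args, Eval eval) {
    assert(!atom.target.empty());
    std::pair<int, Args> k1(atom.id, args);
    auto hit1 = by_atom.find(k1);
    if (hit1 != by_atom.end())
      return hit1->second;

    // "Substitute" into the mapping.  A failure is a negative result that
    // only the first level can remember, since no instantiated atom exists.
    auto arg = args.find(atom.target);
    if (arg == args.end()) {
      by_atom[k1] = false;
      return false;
    }

    std::pair<std::string, std::string> k2(atom.expr, arg->second);
    auto hit2 = by_instantiated.find(k2);
    bool result;
    if (hit2 != by_instantiated.end()) {
      result = hit2->second;  // reuse the value computed for another atom
    } else {
      ++evaluations;
      result = eval(atom.expr, arg->second);
      by_instantiated[k2] = result;
    }
    by_atom[k1] = result;  // cache under both keys
    return result;
  }
};
```

In this model, two atoms that map T to different parameters still share one
evaluation when the supplied arguments coincide, and a failed substitution is
recorded without ever reaching the second level.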

> 
> > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> > trunk?
> > 
> > gcc/cp/ChangeLog:
> > 
> > * cp-tree.h (ATOMIC_CONSTR_MAP_INSTANTIATED_P): Define this flag
> > for ATOMIC_CONSTRs.
> > * constraint.cc (sat_hasher::hash): Use hash_atomic_constraint
> > if the flag is set, otherwise keep using a pointer hash.
> > (sat_hasher::equal): Return false if the flag's setting differs
> > on two atoms.  Call atomic_constraints_identical_p if the flag
> > is set, otherwise keep using a pointer equality test.
> > (satisfy_atom): After instantiating the parameter mapping, form
> > another ATOMIC_CONSTR using the instantiated mapping and query
> > the cache again.  Cache the satisfaction value of both atoms.
> > (diagnose_atomic_constraint): Simplify now that the supplied
> > atom has an instantiated mapping.
> > ---
> >   gcc/cp/constraint.cc | 47 +++-
> >   gcc/cp/cp-tree.h |  6 ++
> >   2 files changed, 44 insertions(+), 9 deletions(-)
> > 
> > diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
> > index 55dba362ca5..c612bfba13b 100644
> > --- a/gcc/cp/constraint.cc
> > +++ b/gcc/cp/constraint.cc
> > @@ -2315,12 +2315,32 @@ struct sat_hasher : ggc_ptr_hash<sat_entry>
> >   {
> > static hashval_t hash (sat_entry *e)
> > {
> > +if (ATOMIC_CONSTR_MAP_INSTANTIATED_P (e->constr))
> > +  {
> > +   gcc_assert (!e->args);
> > +   return hash_atomic_constraint (e->constr);
> > +  }
> > +
> >   hashval_t value = htab_hash_pointer (e->constr);
> >   return iterative_hash_template_arg (e->args, value);
> > }
> >   static bool equal (sat_entry *e1, sat_entry *e2)
> > {
> > +if (ATOMIC_CONSTR_MAP_INSTANTIATED_P (e1->constr)
> > +   != ATOMIC_CONSTR_MAP_INSTANTIATED_P (e2->constr))
> > +  return false;
> > +
> > +if (ATOMIC_CONSTR_MAP_INSTANTIATED_P (e1->constr))
> > +  {
> > +   /* Atoms with instantiated mappings are built in satisfy_atom.  */
> > +   gcc_assert (!e1->args && !e2->args);
> > +   return atomic_constraints_identical_p (e1->constr, e2->constr);
> > +  }
> > +
> > +/* Ato

Re: [PATCH,rs6000] Add patterns for combine to support p10 fusion

2020-11-05 Thread will schmidt via Gcc-patches
On Wed, 2020-11-04 at 12:12 -0600, Aaron Sawdey via Gcc-patches wrote:
> Ping.
> 
> Aaron Sawdey, Ph.D. saw...@linux.ibm.com
> IBM Linux on POWER Toolchain
>  
> 
> > On Oct 26, 2020, at 4:44 PM, acsaw...@linux.ibm.com wrote:
> > 
> > From: Aaron Sawdey 
> > 

Hi, 

> > This patch adds the first couple patterns to support p10 fusion. These
> > will allow combine to create a single insn for a pair of instructions
> > that that power10 can fuse and execute. These particular ones have the

that the power10

s/particular ones/particular insns/ 

> > requirement that only cr0 can be used when fusing a load with a compare
> > immediate of -1/0/1, so we want combine to put that requirement in, and
> > if it doesn't work out later the splitter can get used.
> > 
> > This also adds option -mpower10-fusion which defaults on for power10 and
> > will gate all these fusion patterns. In addition I have added an
> > undocumented option -mpower10-fusion-ld-cmpi (which may be removed later)
> > that just controls the load+compare-immediate patterns.

ok

> >  I have make

made

> > these default on for power10 but they are not disallowed for earlier

to on

> > processors because it is still valid code. This allows us to test the
> > correctness of fusion code generation by turning it on explicitly.
> > 
> > The intention is to work through more patterns of this style to support
> > the rest of the power10 fusion pairs.
> > 
> > Bootstrap and regtest looks good on ppc64le power9 with these patterns
> > enabled in stage2/stage3 and for regtest. Ok for trunk?
> > 
> > gcc/ChangeLog:
> > 
> > > * config/rs6000/predicates.md: Add const_m1_to_1_operand.
> > * config/rs6000/rs6000-cpus.def: Add OPTION_MASK_P10_FUSION and
> > OPTION_MASK_P10_FUSION_LD_CMPI to ISA_3_1_MASKS_SERVER.

to ... and OTHER_P9_VECTOR_MASKS

> > * config/rs6000/rs6000-protos.h (address_ok_for_form): Add
> > prototype.



> > * config/rs6000/rs6000.c (rs6000_option_override_internal):
> > automatically set -mpower10-fusion and -mpower10-fusion-ld-cmpi
> > > if target is power10.  (rs6000_opt_masks): Allow -mpower10-fusion
> > in function attributes.  (address_ok_for_form): New function.

ok


> > * config/rs6000/rs6000.h: Add MASK_P10_FUSION.

> > * config/rs6000/rs6000.md (*ld_cmpi_cr0): New
> > define_insn_and_split.
> > (*lwa_cmpdi_cr0): New define_insn_and_split.
> > (*lwa_cmpwi_cr0): New define_insn_and_split.


> > * config/rs6000/rs6000.opt: Add -mpower10-fusion
> > and -mpower10-fusion-ld-cmpi.
> > ---
> > gcc/config/rs6000/predicates.md   |  5 +++
> > gcc/config/rs6000/rs6000-cpus.def |  6 ++-
> > gcc/config/rs6000/rs6000-protos.h |  2 +
> > gcc/config/rs6000/rs6000.c| 34 
> > gcc/config/rs6000/rs6000.h|  1 +
> > gcc/config/rs6000/rs6000.md   | 68 +++
> > gcc/config/rs6000/rs6000.opt  |  8 
> > 7 files changed, 123 insertions(+), 1 deletion(-)
> > 
> > diff --git a/gcc/config/rs6000/predicates.md 
> > b/gcc/config/rs6000/predicates.md
> > index 4c2fe7fa312..b75c1ddfb69 100644
> > --- a/gcc/config/rs6000/predicates.md
> > +++ b/gcc/config/rs6000/predicates.md
> > @@ -297,6 +297,11 @@ (define_predicate "const_0_to_1_operand"
> >   (and (match_code "const_int")
> >(match_test "IN_RANGE (INTVAL (op), 0, 1)")))
> > 
> > +;; Match op = -1, op = 0, or op = 1.
> > +(define_predicate "const_m1_to_1_operand"
> > +  (and (match_code "const_int")
> > +   (match_test "IN_RANGE (INTVAL (op), -1, 1)")))
> > +
> > ;; Match op = 0..3.
> > (define_predicate "const_0_to_3_operand"
> >   (and (match_code "const_int")

ok
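
For readers less familiar with predicates.md, the test amounts to an
inclusive range check on the immediate.  A minimal sketch follows; in_range
and const_m1_to_1 are stand-ins written for this example, not GCC's IN_RANGE
macro or the generated predicate function.

```cpp
#include <cassert>

// Inclusive range test, analogous in spirit to the IN_RANGE check above.
constexpr bool in_range(long value, long lower, long upper) {
  return lower <= value && value <= upper;
}

// What the new predicate accepts: the immediates -1, 0 and 1.
constexpr bool const_m1_to_1(long value) { return in_range(value, -1, 1); }

static_assert(const_m1_to_1(-1) && const_m1_to_1(0) && const_m1_to_1(1),
              "the three compare immediates are accepted");
static_assert(!const_m1_to_1(-2) && !const_m1_to_1(2),
              "anything outside -1..1 is rejected");
```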

> > diff --git a/gcc/config/rs6000/rs6000-cpus.def 
> > b/gcc/config/rs6000/rs6000-cpus.def
> > index 8d2c1ffd6cf..3e65289d8df 100644
> > --- a/gcc/config/rs6000/rs6000-cpus.def
> > +++ b/gcc/config/rs6000/rs6000-cpus.def
> > @@ -82,7 +82,9 @@
> > 
> > #define ISA_3_1_MASKS_SERVER(ISA_3_0_MASKS_SERVER   
> > \
> >  | OPTION_MASK_POWER10  \
> > -| OTHER_POWER10_MASKS)
> > +| OTHER_POWER10_MASKS  \
> > +| OPTION_MASK_P10_FUSION   \
> > +| OPTION_MASK_P10_FUSION_LD_CMPI)
> > 
> > /* Flags that need to be turned off if -mno-power9-vector.  */
> > #define OTHER_P9_VECTOR_MASKS   (OPTION_MASK_FLOAT128_HW
> > \
> > @@ -129,6 +131,8 @@
> >  | OPTION_MASK_FLOAT128_KEYWORD \
> >  | OPTION_MASK_FPRND\
> >  | OPTION_MASK_POWER10  \
> > +| OPTION_MASK_P10_FUSION   \
> > +| OPTION_MASK_P10_FUSION_LD_CMPI   \
> >  | OP

Re: [PATCH] c++: Reuse identical ATOMIC_CONSTRs during normalization

2020-11-05 Thread Patrick Palka via Gcc-patches
On Thu, 5 Nov 2020, Jason Merrill wrote:

> On 11/3/20 3:43 PM, Patrick Palka wrote:
> > Profiling revealed that sat_hasher::equal accounts for nearly 40% of
> > compile time in some cmcstl2 tests.
> > 
> > This patch eliminates this bottleneck by caching the ATOMIC_CONSTRs
> > returned by normalize_atom.  This in turn allows us to replace the
> > expensive atomic_constraints_identical_p check in sat_hasher::equal
> > with cheap pointer equality, with no loss in cache hit rate.
> > 
> > With this patch, compile time for the cmcstl2 test
> > test/algorithm/set_symmetric_difference4.cpp drops from 19s to 11s with
> > an --enable-checking=release compiler.
> > 
> > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> > trunk?
> > 
> > gcc/cp/ChangeLog:
> > 
> > * constraint.cc (struct atom_hasher): New descriptor class for a
> > hash_table.  Use it to define ...
> > (atom_cache): ... this.
> > (normalize_atom): Use it to cache ATOMIC_CONSTRs when not
> > generating diagnostics.
> > (sat_hasher::hash): Use htab_hash_pointer instead of
> > hash_atomic_constraint.
> > (sat_hasher::equal): Test for pointer equality instead of
> > atomic_constraints_identical_p.
> > ---
> >   gcc/cp/constraint.cc | 37 ++---
> >   1 file changed, 34 insertions(+), 3 deletions(-)
> > 
> > diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
> > index b6f6f0d02a5..ce720c641e8 100644
> > --- a/gcc/cp/constraint.cc
> > +++ b/gcc/cp/constraint.cc
> > @@ -710,6 +710,25 @@ normalize_concept_check (tree check, tree args,
> > norm_info info)
> > return normalize_expression (def, subst, info);
> >   }
> >   +/* Hash functions for ATOMIC_CONSTRs.  */
> > +
> > +struct atom_hasher : default_hash_traits<tree>
> > +{
> > +  static hashval_t hash (tree atom)
> > +  {
> > +return hash_atomic_constraint (atom);
> > +  }
> > +
> > +  static bool equal (tree atom1, tree atom2)
> > +  {
> > +return atomic_constraints_identical_p (atom1, atom2);
> > +  }
> > +};
> 
> This is the same as constraint_hash in logic.cc; either they should be
> combined, or (probably) the hash table in logic.cc should be changed to also
> take advantage of pointer equivalence.

Ah, I forgot about the existence of this hasher.  Consider this hasher
changed to take advantage of the pointer equivalence (I'll post a
revised and tested patch later today).

> 
> > +/* Used by normalize_atom to cache ATOMIC_CONSTRs.  */
> > +
> > +static GTY((deletable)) hash_table<atom_hasher> *atom_cache;
> 
> If we're relying on pointer identity, this can't be deletable; if GC discards
> it, later normalization will generate a new equivalent ATOMIC_CONSTR, breaking
> the uniqueness assumption.

But because no ATOMIC_CONSTR is ever reachable from a GC root (since
they live only inside GC-deletable structures), there will never be two
equivalent ATOMIC_CONSTR trees live at once, which is a sufficient
notion of uniqueness for us, I think.
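
A toy model of the argument, assuming nothing about GCC's GC beyond what is
said above (AtomTable, intern and discard are hypothetical names): while the
table is alive, structurally equal atoms share one pointer, so comparing
pointers is enough.  discard() stands in for the collector dropping the
deletable cache, and it is only safe if no pointer handed out earlier is
still in use.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

struct Atom {
  std::string expr;
  std::string mapping;
};

class AtomTable {
 public:
  // Return the unique Atom for (expr, mapping), creating it on first use.
  const Atom* intern(const std::string& expr, const std::string& mapping) {
    std::string key = expr;
    key.push_back('\0');  // separator so ("ab","c") and ("a","bc") differ
    key += mapping;
    auto it = table_.find(key);
    if (it != table_.end())
      return it->second.get();
    std::unique_ptr<Atom> atom(new Atom{expr, mapping});
    const Atom* raw = atom.get();
    auto inserted = table_.emplace(std::move(key), std::move(atom));
    assert(inserted.second);
    return raw;
  }

  // Models the cache being discarded; every pointer handed out before
  // this call becomes invalid and must no longer be referenced.
  void discard() { table_.clear(); }

  std::size_t size() const { return table_.size(); }

 private:
  std::unordered_map<std::string, std::unique_ptr<Atom>> table_;
};

// With interning, identity comparison replaces structural comparison.
inline bool same_atom(const Atom* a, const Atom* b) { return a == b; }
```

The uniqueness claim holds in this model only because nothing outside the
table keeps an atom alive across discard(), which mirrors the reachability
argument made above.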

> 
> >   /* The normal form of an atom depends on the expression. The normal
> >  form of a function call to a function concept is a check constraint
> >  for that concept. The normal form of a reference to a variable
> > @@ -729,7 +748,19 @@ normalize_atom (tree t, tree args, norm_info info)
> > /* Build a new info object for the atom.  */
> > tree ci = build_tree_list (t, info.context);
> >   -  return build1 (ATOMIC_CONSTR, ci, map);
> > +  tree atom = build1 (ATOMIC_CONSTR, ci, map);
> > +  if (!info.generate_diagnostics ())
> > +{
> > +  /* Cache the ATOMIC_CONSTRs that we return, so that sat_hasher::equal
> > +later can quickly compare two atoms using just pointer equality.  */
> > +  if (!atom_cache)
> > +   atom_cache = hash_table<atom_hasher>::create_ggc (31);
> > +  tree *slot = atom_cache->find_slot (atom, INSERT);
> > +  if (*slot)
> > +   return *slot;
> > +  *slot = atom;
> > +}
> > +  return atom;
> >   }
> > /* Returns the normal form of an expression. */
> > @@ -2284,13 +2315,13 @@ struct sat_hasher : ggc_ptr_hash<sat_entry>
> >   {
> > static hashval_t hash (sat_entry *e)
> > {
> 
> We could use a comment here about why we can just hash the pointer.

Will do.  The subsequent patch ("Use two levels of caching in
satisfy_atom") also adds

+/* Atoms with uninstantiated mappings are built in normalize_atom.
+   Their identity is determined by their pointer value due to
+   the caching of ATOMIC_CONSTRs performed therein.  */

to sat_hasher::equal, but it could use repeating in sat_hasher::hash.

> 
> > -hashval_t value = hash_atomic_constraint (e->constr);
> > +hashval_t value = htab_hash_pointer (e->constr);
> >   return iterative_hash_template_arg (e->args, value);
> > }
> >   static bool equal (sat_entry *e1, sat_entry *e2)
> > {
> > -if (!atomic_constraints_identical_p (e1->constr, e2->constr))
> > +if (e1->constr != e2->constr)
> > return false;
> >   return tem

Re: [PATCH] c++: Add -Wexceptions warning option [PR97675]

2020-11-05 Thread David Malcolm via Gcc-patches
On Thu, 2020-11-05 at 11:03 -0500, Marek Polacek via Gcc-patches wrote:
> This PR asks that we add a warning option for an existing (very old)
> warning, so that it can be disabled selectively.  clang++ uses
> -Wexceptions for this, so I added this new option rather than using
> e.g. -Wnoexcept.
> 
> gcc/c-family/ChangeLog:
> 
>   PR c++/97675
>   * c.opt (Wexceptions): New option.
> 
> gcc/cp/ChangeLog:
> 
>   PR c++/97675
>   * except.c (check_handlers_1): Use OPT_Wexceptions for the
>   warning.  Use inform for the second part of the warning.
> 
> gcc/ChangeLog:
> 
>   PR c++/97675
>   * doc/invoke.texi: Document -Wexceptions.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR c++/97675
>   * g++.old-deja/g++.eh/catch10.C: Adjust dg-warning.
>   * g++.dg/warn/Wexceptions1.C: New test.
>   * g++.dg/warn/Wexceptions2.C: New test.
> ---
>  gcc/c-family/c.opt  |  4 
>  gcc/cp/except.c |  9 -
>  gcc/doc/invoke.texi |  8 +++-
>  gcc/testsuite/g++.dg/warn/Wexceptions1.C|  9 +
>  gcc/testsuite/g++.dg/warn/Wexceptions2.C| 10 ++
>  gcc/testsuite/g++.old-deja/g++.eh/catch10.C |  4 ++--
>  6 files changed, 36 insertions(+), 8 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/warn/Wexceptions1.C
>  create mode 100644 gcc/testsuite/g++.dg/warn/Wexceptions2.C
> 
> diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
> index 426636be839..9493acb82ff 100644
> --- a/gcc/c-family/c.opt
> +++ b/gcc/c-family/c.opt
> @@ -579,6 +579,10 @@ Werror-implicit-function-declaration
>  C ObjC RejectNegative Warning Alias(Werror=, implicit-function-
> declaration)
>  This switch is deprecated; use -Werror=implicit-function-declaration 
> instead.
>  
> +Wexceptions
> +C++ ObjC++ Var(warn_exceptions) Init(1)
> +Warn when an exception handler is shadowed by another handler.
> +
>  Wextra
>  C ObjC C++ ObjC++ Warning
>  ; in common.opt
> diff --git a/gcc/cp/except.c b/gcc/cp/except.c
> index cb1a4105dae..985206f6a64 100644
> --- a/gcc/cp/except.c
> +++ b/gcc/cp/except.c
> @@ -975,11 +975,10 @@ check_handlers_1 (tree master,
> tree_stmt_iterator i)
>tree handler = tsi_stmt (i);
>if (TREE_TYPE (handler) && can_convert_eh (type, TREE_TYPE
> (handler)))
>   {

Can you add an auto_diagnostic_group here please.

> -   warning_at (EXPR_LOCATION (handler), 0,
> -   "exception of type %qT will be caught",
> -   TREE_TYPE (handler));
> -   warning_at (EXPR_LOCATION (master), 0,
> -   "   by earlier handler for %qT", type);
> +   if (warning_at (EXPR_LOCATION (handler), OPT_Wexceptions,
> +   "exception of type %qT will be caught by
> earlier "
> +   "handler", TREE_TYPE (handler)))
> + inform (EXPR_LOCATION (master), "for type %qT", type);
> break;
>   }
>  }

Thanks
Dave
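
For readers who have not run into this warning: a minimal sketch of the situation it diagnoses, assuming only the standard exception hierarchy (std::out_of_range derives from std::logic_error).  The function name which_handler is made up for the illustration.  A compiler that implements the warning is expected to diagnose the second handler, since it can never be selected.

```cpp
#include <stdexcept>
#include <string>

// Returns the name of the handler that actually runs.
std::string which_handler() {
  try {
    throw std::out_of_range("index");
  } catch (const std::logic_error&) {
    // Handlers are tried in order, and this base-class handler matches.
    return "logic_error";
  } catch (const std::out_of_range&) {
    // Shadowed: the earlier handler already catches this type.
    return "out_of_range";
  }
  return "no exception";
}
```

Calling which_handler() yields "logic_error"; swapping the two handlers removes the shadowing.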



float.h: C2x decimal signaling NaN macros

2020-11-05 Thread Joseph Myers
C2x adds macros for decimal floating-point signaling NaNs to
<float.h>.  Add these macros to GCC's <float.h> implementation.

Note that the current C2x draft has these under incorrect names
D32_SNAN, D64_SNAN, D128_SNAN.  The intent was to change the naming
convention to be consistent with other <float.h> macros when they were
moved to <float.h>, so DEC32_SNAN, DEC64_SNAN, DEC128_SNAN, which this
patch uses (as does the current draft integration of TS 18661-3 as an
Annex to C2x, for its _Decimal* and _Decimal*x types).
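
As a rough illustration of what "signaling" means here, the sketch below uses a binary double rather than a decimal type, because the _Decimal* types are not available in standard C++.  That substitution is an assumption of the example, not part of the patch.  The helper name add_to_self_raises_invalid is hypothetical.  The first arithmetic use of a signaling NaN is expected to raise FE_INVALID and leave a quiet NaN behind, so a second use raises nothing.

```cpp
#include <cfenv>
#include <limits>

// volatile keeps the compiler from folding the arithmetic at compile time.
volatile double snan_value = std::numeric_limits<double>::signaling_NaN();

// Adds the value to itself and reports whether FE_INVALID was raised.
bool add_to_self_raises_invalid(volatile double& d) {
  std::feclearexcept(FE_ALL_EXCEPT);
  d = d + d;  // a signaling NaN operand raises "invalid" and is quieted
  return std::fetestexcept(FE_INVALID) != 0;
}
```

This depends on the target implementing IEEE exception flags, which is the same condition the fenv_exceptions_dfp effective-target check guards in the decimal tests.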

This patch is relative to a tree with

and

(both pending review) applied.

Bootstrapped with no regressions for x86_64-pc-linux-gnu.  OK to commit?

gcc/
2020-11-05  Joseph Myers  

* ginclude/float.h (DEC32_SNAN, DEC64_SNAN, DEC128_SNAN): New C2x
macros.

gcc/testsuite/
2020-11-05  Joseph Myers  

* gcc.dg/dfp/c2x-float-dfp-7.c, gcc.dg/dfp/c2x-float-dfp-8.c: New
tests.
* gcc.dg/c2x-float-no-dfp-3.c: Also check that DEC32_SNAN,
DEC64_SNAN and DEC128_SNAN are not defined.

diff --git a/gcc/ginclude/float.h b/gcc/ginclude/float.h
index 77446995515..0fa00461230 100644
--- a/gcc/ginclude/float.h
+++ b/gcc/ginclude/float.h
@@ -601,6 +601,14 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  
If not, see
 #undef DEC_NAN
 #define DEC_NAN (__builtin_nand32 (""))
 
+/* Signaling NaN in each decimal floating-point type.  */
+#undef DEC32_SNAN
+#define DEC32_SNAN (__builtin_nansd32 (""))
+#undef DEC64_SNAN
+#define DEC64_SNAN (__builtin_nansd64 (""))
+#undef DEC128_SNAN
+#define DEC128_SNAN (__builtin_nansd128 (""))
+
 #endif /* C2X */
 
 #endif /* __DEC32_MANT_DIG__ */
diff --git a/gcc/testsuite/gcc.dg/c2x-float-no-dfp-3.c 
b/gcc/testsuite/gcc.dg/c2x-float-no-dfp-3.c
index d8a239c787e..aa790c8e21d 100644
--- a/gcc/testsuite/gcc.dg/c2x-float-no-dfp-3.c
+++ b/gcc/testsuite/gcc.dg/c2x-float-no-dfp-3.c
@@ -12,3 +12,15 @@
 #ifdef DEC_NAN
 # error "DEC_NAN defined"
 #endif
+
+#ifdef DEC32_SNAN
+# error "DEC32_SNAN defined"
+#endif
+
+#ifdef DEC64_SNAN
+# error "DEC64_SNAN defined"
+#endif
+
+#ifdef DEC128_SNAN
+# error "DEC128_SNAN defined"
+#endif
diff --git a/gcc/testsuite/gcc.dg/dfp/c2x-float-dfp-7.c 
b/gcc/testsuite/gcc.dg/dfp/c2x-float-dfp-7.c
new file mode 100644
index 000..dec6b500656
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/dfp/c2x-float-dfp-7.c
@@ -0,0 +1,45 @@
+/* Test DEC*_SNAN macros defined in <float.h> with DFP support.  */
+/* { dg-do run } */
+/* { dg-options "-std=c2x" } */
+
+#include <float.h>
+
+#ifndef DEC32_SNAN
+# error "DEC32_SNAN not defined"
+#endif
+
+#ifndef DEC64_SNAN
+# error "DEC64_SNAN not defined"
+#endif
+
+#ifndef DEC128_SNAN
+# error "DEC128_SNAN not defined"
+#endif
+
+volatile _Decimal32 d32 = DEC32_SNAN;
+volatile _Decimal64 d64 = DEC64_SNAN;
+volatile _Decimal128 d128 = DEC128_SNAN;
+
+extern void abort (void);
+extern void exit (int);
+
+int
+main (void)
+{
+  (void) _Generic (DEC32_SNAN, _Decimal32 : 0);
+  if (!__builtin_isnan (DEC32_SNAN))
+abort ();
+  if (!__builtin_isnan (d32))
+abort ();
+  (void) _Generic (DEC64_SNAN, _Decimal64 : 0);
+  if (!__builtin_isnan (DEC64_SNAN))
+abort ();
+  if (!__builtin_isnan (d64))
+abort ();
+  (void) _Generic (DEC128_SNAN, _Decimal128 : 0);
+  if (!__builtin_isnan (DEC128_SNAN))
+abort ();
+  if (!__builtin_isnan (d128))
+abort ();
+  exit (0);
+}
diff --git a/gcc/testsuite/gcc.dg/dfp/c2x-float-dfp-8.c 
b/gcc/testsuite/gcc.dg/dfp/c2x-float-dfp-8.c
new file mode 100644
index 000..4169602fd9c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/dfp/c2x-float-dfp-8.c
@@ -0,0 +1,45 @@
+/* Test DEC*_SNAN macros.  Test requiring runtime exceptions
+   support.  */
+/* { dg-do run } */
+/* { dg-require-effective-target fenv_exceptions_dfp } */
+/* { dg-options "-std=c2x" } */
+
+#include <fenv.h>
+#include <float.h>
+
+volatile _Decimal32 d32 = DEC32_SNAN;
+volatile _Decimal64 d64 = DEC64_SNAN;
+volatile _Decimal128 d128 = DEC128_SNAN;
+
+extern void abort (void);
+extern void exit (int);
+
+int
+main (void)
+{
+  feclearexcept (FE_ALL_EXCEPT);
+  d32 += d32;
+  if (!fetestexcept (FE_INVALID))
+abort ();
+  feclearexcept (FE_ALL_EXCEPT);
+  d32 += d32;
+  if (fetestexcept (FE_INVALID))
+abort ();
+  feclearexcept (FE_ALL_EXCEPT);
+  d64 += d64;
+  if (!fetestexcept (FE_INVALID))
+abort ();
+  feclearexcept (FE_ALL_EXCEPT);
+  d64 += d64;
+  if (fetestexcept (FE_INVALID))
+abort ();
+  feclearexcept (FE_ALL_EXCEPT);
+  d128 += d128;
+  if (!fetestexcept (FE_INVALID))
+abort ();
+  feclearexcept (FE_ALL_EXCEPT);
+  d128 += d128;
+  if (fetestexcept (FE_INVALID))
+abort ();
+  exit (0);
+}

-- 
Joseph S. Myers
jos...@codesourcery.com


Move size time tables from ggc to heap

2020-11-05 Thread Jan Hubicka
Hi,
this patch moves size time tables out of ggc allocated memory.  This makes
the sources a bit cleaner and saves about 60MB of GGC memory, which turns into
about 45MB of heap memory for a cc1plus LTO build.
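
A simplified sketch of the shape of the change, not the real ipa-fnsummary code: once the tables are members rather than pointers to GC-allocated vectors, "no table yet" is expressed as an empty vector, so the old null-pointer test becomes a length test.  FnSummarySketch, SizeTimeEntry and the integer predicates are hypothetical stand-ins, and the real account_size_time does considerably more.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for size_time_entry; predicates reduced to ints.
struct SizeTimeEntry {
  int exec_pred;
  int nonconst_pred;
  int size;
  double time;
};

class FnSummarySketch {
 public:
  void account(int size, double time, int exec_pred, int nonconst_pred,
               bool call) {
    // The tables are plain members, so taking a reference never sees null.
    std::vector<SizeTimeEntry>& table = call ? call_table_ : table_;

    // The initial empty entry is still created, but once the table has
    // entries, empty sizes and times are not accounted.
    if (size == 0 && time == 0 && !table.empty())
      return;

    // Accumulate into an existing entry with the same predicates.
    for (SizeTimeEntry& e : table)
      if (e.exec_pred == exec_pred && e.nonconst_pred == nonconst_pred) {
        e.size += size;
        e.time += time;
        return;
      }
    table.push_back({exec_pred, nonconst_pred, size, time});
  }

  std::size_t entries(bool call) const {
    return call ? call_table_.size() : table_.size();
  }

  int total_size(bool call) const {
    int sum = 0;
    for (const SizeTimeEntry& e : call ? call_table_ : table_)
      sum += e.size;
    return sum;
  }

 private:
  std::vector<SizeTimeEntry> table_;       // freed with the summary object
  std::vector<SizeTimeEntry> call_table_;  // likewise, no GC involvement
};
```

The memory is released by the member destructors, which mirrors why the patch can drop the vec_free calls for these two tables.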

Bootstrapped/regtested x86_64-linux, plan to commit it shortly.

Honza

2020-11-06  Jan Hubicka  

* ipa-fnsummary.h (class size_time_entry): Do not GTY annotate.
(class ipa_fnsummary): Turn size_time_table to auto_vec and
call_size_time_table to efficient vec; update constructors.
* ipa-fnsummary.c (ipa_fn_summary::account_size_time): Update.
(ipa_fn_summary::~ipa_fn_summary): Update.
(ipa_fn_summary_t::duplicate): Update.
(ipa_dump_fn_summary): Update.
(set_switch_stmt_execution_predicate): Update.
(analyze_function_body): Update.
(estimate_calls_size_and_time): Update.
(ipa_call_context::estimate_size_and_time): Update.
(ipa_merge_fn_summary_after_inlining): Update.
(ipa_update_overall_fn_summary): Update.
(inline_read_section): Update.
(ipa_fn_summary_write): Update.

diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
index 0393f2cad11..b8f4a0a9091 100644
--- a/gcc/ipa-fnsummary.c
+++ b/gcc/ipa-fnsummary.c
@@ -168,8 +168,7 @@ ipa_fn_summary::account_size_time (int size, sreal time,
   bool found = false;
   int i;
   predicate nonconst_pred;
-  vec *table = call
-  ? call_size_time_table : size_time_table;
+  vec *table = call ? &call_size_time_table : 
&size_time_table;
 
   if (exec_pred == false)
 return;
@@ -181,13 +180,13 @@ ipa_fn_summary::account_size_time (int size, sreal time,
 
   /* We need to create initial empty unconditional clause, but otherwise
  we don't need to account empty times and sizes.  */
-  if (!size && time == 0 && table)
+  if (!size && time == 0 && table->length ())
 return;
 
   /* Only for calls we are unaccounting what we previously recorded.  */
   gcc_checking_assert (time >= 0 || call);
 
-  for (i = 0; vec_safe_iterate (table, i, &e); i++)
+  for (i = 0; table->iterate (i, &e); i++)
 if (e->exec_predicate == exec_pred
&& e->nonconst_predicate == nonconst_pred)
   {
@@ -227,9 +226,9 @@ ipa_fn_summary::account_size_time (int size, sreal time,
   new_entry.exec_predicate = exec_pred;
   new_entry.nonconst_predicate = nonconst_pred;
   if (call)
-vec_safe_push (call_size_time_table, new_entry);
+   call_size_time_table.safe_push (new_entry);
   else
-vec_safe_push (size_time_table, new_entry);
+   size_time_table.safe_push (new_entry);
 }
   else
 {
@@ -753,8 +752,7 @@ ipa_fn_summary::~ipa_fn_summary ()
   for (unsigned i = 0; i < len; i++)
 edge_predicate_pool.remove ((*loop_strides)[i].predicate);
   vec_free (conds);
-  vec_free (size_time_table);
-  vec_free (call_size_time_table);
+  call_size_time_table.release ();
   vec_free (loop_iterations);
   vec_free (loop_strides);
   builtin_constant_p_parms.release ();
@@ -804,10 +802,10 @@ remap_freqcounting_preds_after_dup 
(vec *v,
 void
 ipa_fn_summary_t::duplicate (cgraph_node *src,
 cgraph_node *dst,
-ipa_fn_summary *,
+ipa_fn_summary *src_info,
 ipa_fn_summary *info)
 {
-  new (info) ipa_fn_summary (*ipa_fn_summaries->get (src));
+  new (info) ipa_fn_summary (*src_info);
   /* TODO: as an optimization, we may avoid copying conditions
  that are known to be false or true.  */
   info->conds = vec_safe_copy (info->conds);
@@ -817,7 +815,6 @@ ipa_fn_summary_t::duplicate (cgraph_node *src,
  out that something was optimized out.  */
   if (ipa_node_params_sum && cinfo && cinfo->tree_map)
 {
-  vec *entry = info->size_time_table;
   /* Use SRC parm info since it may not be copied yet.  */
   class ipa_node_params *parms_info = IPA_NODE_REF (src);
   ipa_auto_call_arg_values avals;
@@ -830,7 +827,7 @@ ipa_fn_summary_t::duplicate (cgraph_node *src,
   bool inlined_to_p = false;
   struct cgraph_edge *edge, *next;
 
-  info->size_time_table = 0;
+  info->size_time_table.release ();
   avals.m_known_vals.safe_grow_cleared (count, true);
   for (i = 0; i < count; i++)
{
@@ -859,7 +856,7 @@ ipa_fn_summary_t::duplicate (cgraph_node *src,
  to be false.
  TODO: as on optimization, we can also eliminate conditions known
  to be true.  */
-  for (i = 0; vec_safe_iterate (entry, i, &e); i++)
+  for (i = 0; src_info->size_time_table.iterate (i, &e); i++)
{
  predicate new_exec_pred;
  predicate new_nonconst_pred;
@@ -935,8 +932,8 @@ ipa_fn_summary_t::duplicate (cgraph_node *src,
 }
   else
 {
-  info->size_time_table = vec_safe_copy (info->size_time_table);
-  info->loop_iterations = vec_safe_copy (info->loop_iterations);
+  info->size_time_table = src_info->

Re: [PATCH] c++: Add -Wexceptions warning option [PR97675]

2020-11-05 Thread Marek Polacek via Gcc-patches
On Thu, Nov 05, 2020 at 06:13:41PM -0500, David Malcolm via Gcc-patches wrote:
> On Thu, 2020-11-05 at 11:03 -0500, Marek Polacek via Gcc-patches wrote:
> > This PR asks that we add a warning option for an existing (very old)
> > warning, so that it can be disabled selectively.  clang++ uses
> > -Wexceptions for this, so I added this new option rather than using
> > e.g. -Wnoexcept.
> > 
> > gcc/c-family/ChangeLog:
> > 
> > PR c++/97675
> > * c.opt (Wexceptions): New option.
> > 
> > gcc/cp/ChangeLog:
> > 
> > PR c++/97675
> > * except.c (check_handlers_1): Use OPT_Wexceptions for the
> > warning.  Use inform for the second part of the warning.
> > 
> > gcc/ChangeLog:
> > 
> > PR c++/97675
> > * doc/invoke.texi: Document -Wexceptions.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > PR c++/97675
> > * g++.old-deja/g++.eh/catch10.C: Adjust dg-warning.
> > * g++.dg/warn/Wexceptions1.C: New test.
> > * g++.dg/warn/Wexceptions2.C: New test.
> > ---
> >  gcc/c-family/c.opt  |  4 
> >  gcc/cp/except.c |  9 -
> >  gcc/doc/invoke.texi |  8 +++-
> >  gcc/testsuite/g++.dg/warn/Wexceptions1.C|  9 +
> >  gcc/testsuite/g++.dg/warn/Wexceptions2.C| 10 ++
> >  gcc/testsuite/g++.old-deja/g++.eh/catch10.C |  4 ++--
> >  6 files changed, 36 insertions(+), 8 deletions(-)
> >  create mode 100644 gcc/testsuite/g++.dg/warn/Wexceptions1.C
> >  create mode 100644 gcc/testsuite/g++.dg/warn/Wexceptions2.C
> > 
> > diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
> > index 426636be839..9493acb82ff 100644
> > --- a/gcc/c-family/c.opt
> > +++ b/gcc/c-family/c.opt
> > @@ -579,6 +579,10 @@ Werror-implicit-function-declaration
> >  C ObjC RejectNegative Warning Alias(Werror=, implicit-function-
> > declaration)
> >  This switch is deprecated; use -Werror=implicit-function-declaration 
> > instead.
> >  
> > +Wexceptions
> > +C++ ObjC++ Var(warn_exceptions) Init(1)
> > +Warn when an exception handler is shadowed by another handler.
> > +
> >  Wextra
> >  C ObjC C++ ObjC++ Warning
> >  ; in common.opt
> > diff --git a/gcc/cp/except.c b/gcc/cp/except.c
> > index cb1a4105dae..985206f6a64 100644
> > --- a/gcc/cp/except.c
> > +++ b/gcc/cp/except.c
> > @@ -975,11 +975,10 @@ check_handlers_1 (tree master,
> > tree_stmt_iterator i)
> >tree handler = tsi_stmt (i);
> >if (TREE_TYPE (handler) && can_convert_eh (type, TREE_TYPE
> > (handler)))
> > {
> 
> Can you add an auto_diagnostic_group here please.

Yup, I've pushed this:

From 44e1f63e20fec07e3a10d8e75336cfda64c911bf Mon Sep 17 00:00:00 2001
From: Marek Polacek 
Date: Thu, 5 Nov 2020 18:23:56 -0500
Subject: [pushed] c++: Add auto_diagnostic_group to check_handlers_1.

This was missing.

gcc/cp/ChangeLog:

* except.c (check_handlers_1): Add auto_diagnostic_group.
---
 gcc/cp/except.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/cp/except.c b/gcc/cp/except.c
index 985206f6a64..b72a28c1aa9 100644
--- a/gcc/cp/except.c
+++ b/gcc/cp/except.c
@@ -975,6 +975,7 @@ check_handlers_1 (tree master, tree_stmt_iterator i)
   tree handler = tsi_stmt (i);
   if (TREE_TYPE (handler) && can_convert_eh (type, TREE_TYPE (handler)))
{
+ auto_diagnostic_group d;
  if (warning_at (EXPR_LOCATION (handler), OPT_Wexceptions,
  "exception of type %qT will be caught by earlier "
  "handler", TREE_TYPE (handler)))

base-commit: e6fd02cc6d874c523466250a1cb724e0c7af9d75
-- 
2.28.0



Handle fnspec in ipa-modref II

2020-11-05 Thread Jan Hubicka
Hi,
this patch adds the IPA propagation support for fnspecs in ipa-modref.
Fnspec strings are collected to a new edge summary (since they depend
on the call statement and not only on the called function) and stored to
summaries.  Since ipa-modref is the first pass that now cares about jump
functions to non-definitions, LTO streaming needed to be updated, too.

Disambiguations on cc1plus:

Alias oracle query stats:   
  refs_may_alias_p: 65808750 disambiguations, 75664890 queries  
  ref_maybe_used_by_call_p: 153485 disambiguations, 66711204 queries
  call_may_clobber_ref_p: 22816 disambiguations, 28889 queries  
  nonoverlapping_component_refs_p: 0 disambiguations, 36846 queries 
  nonoverlapping_refs_since_match_p: 27271 disambiguations, 58917 must 
overlaps, 86958 queries
  aliasing_component_refs_p: 65808 disambiguations, 2067256 queries 
  TBAA oracle: 25929211 disambiguations 60395141 queries
   12391384 are in alias set 0  
   10783783 queries asked about the same object 
   126 queries asked about the same alias set   
   0 access volatile
   9598698 are dependent in the DAG 
   1691939 are aritificially in conflict with void *

Modref stats:   
  modref use: 14284 disambiguations, 53336 queries  
  modref clobber: 1660281 disambiguations, 2130440 queries  
  4311165 tbaa queries (2.023603 per modref query)  
  685304 base compares (0.321673 per modref query)  

PTA query stats:
  pt_solution_includes: 959190 disambiguations, 13169678 queries
  pt_solutions_intersect: 1050969 disambiguations, 13246686 queries 

This is about 10% up compared to the last report, but it may also be caused
by the C++ new/delete operator support that I committed today.

Bootstrapped/regtested x86_64-linux, plan to commit it tomorrow after crafting
a few testcases.

gcc/ChangeLog:

2020-11-06  Jan Hubicka  

* attr-fnspec.h (attr_fnspec::get_str): New accessor.
* ipa-fnsummary.c (read_ipa_call_summary): Store also parm info
for builtins.
* ipa-modref.c (class fnspec_summary): New type.
(class fnspec_summaries_t): New type.
(modref_summary::modref_summary): Initialize writes_errno.
(struct modref_summary_lto): Add writes_errno.
(modref_summary_lto::modref_summary_lto): Initialize writes_errno.
(modref_summary::dump): Check for NULL pointers.
(modref_summary_lto::dump): Dump writes_errno.
(collapse_loads): Move up in source file.
(collapse_stores): New function.
(process_fnspec): Handle also internal calls.
(analyze_call): Likewise.
(analyze_stmt): Store fnspec string if needed.
(analyze_function): Initialize fnspec_summaries.
(modref_summaries_lto::duplicate): Copy writes_errno.
(modref_write): Store writes_errno and fnspec summaries.
(read_section): Read writes_errno and fnspec summaries.
(modref_read): Initialize fnspec summaries.
(update_signature): Fix formatting.
(compute_parm_map): Return true if successful.
(get_parm_type): New function.
(get_access_for_fnspec): New function.
(propagate_unknown_call): New function.
(modref_propagate_in_scc): Use it.
(pass_ipa_modref::execute): Delete fnspec_summaries.
(ipa_modref_c_finalize): Delete fnspec_summaries.
* ipa-prop.c: Include attr-fnspec.h.
(ipa_compute_jump_functions_for_bb):  Also compute jump functions
for functions with fnspecs.
(ipa_read_edge_info): Read jump functions for builtins.

diff --git a/gcc/attr-fnspec.h b/gcc/attr-fnspec.h
index 78b1a5a2b1c..28135328437 100644
--- a/gcc/attr-fnspec.h
+++ b/gcc/attr-fnspec.h
@@ -246,6 +246,13 @@ public:
 
   /* Check validity of the string.  */
   void verify ();
+
+  /* Return the fnspec string.  */
+  const char *
+  get_str ()
+  {
+return str;
+  }
 };
 
 extern attr_fnspec gimple_call_fnspec (const gcall *stmt);
diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
index 0393f2cad11..b8f4a0a9091 100644
--- a/gcc/ipa-fnsummary.c
+++ b/gcc/ipa-fnsummary.c
@@ -4304,7 +4301,11 @@ read_ipa_call_summary (class lto_input_block *ib, struct 
cgraph_edge *e,
   if (es)
 edge_set_predicate (e, &

Re: [PATCH] generalized range_query class for multiple contexts

2020-11-05 Thread Andrew MacLeod via Gcc-patches

On 11/5/20 4:02 PM, Martin Sebor wrote:

On 11/5/20 12:29 PM, Martin Sebor wrote:

On 10/1/20 11:25 AM, Martin Sebor wrote:

On 10/1/20 9:34 AM, Aldy Hernandez wrote:



On 10/1/20 3:22 PM, Andrew MacLeod wrote:
 > On 10/1/20 5:05 AM, Aldy Hernandez via Gcc-patches wrote:
 >>> Thanks for doing all this!  There isn't anything I don't understand
 >>> in the sprintf changes so no questions from me (well, almost none).
 >>> Just some comments:
 >> Thanks for your comments on the sprintf/strlen API conversion.
 >>
 >>> The current call statement is available in all functions that take
 >>> a directive argument, as dir->info.callstmt.  There should be no need
 >>> to also add it as a new argument to the functions that now need it.
 >> Fixed.
 >>
 >>> The change adds code along these lines in a bunch of places:
 >>>
 >>> + value_range vr;
 >>> + if (!query->range_of_expr (vr, arg, stmt))
 >>> +   vr.set_varying (TREE_TYPE (arg));
 >>>
 >>> I thought under the new Ranger APIs when a range couldn't be
 >>> determined it would be automatically set to the maximum for
 >>> the type.  I like that and have been moving in that direction
 >>> with my code myself (rather than having an API fail, have it
 >>> set the max range and succeed).
 >> I went through all the above idioms and noticed all are being used on
 >> supported types (integers or pointers).  So range_of_expr will always
 >> return true.  I've removed the if() and the set_varying.
 >>
 >>> Since that isn't so in this case, I think it would still be nice
 >>> if the added code could be written as if the range were set to
 >>> varying in this case and (ideally) reduced to just initialization:
 >>>
 >>> value_range vr = some-function (query, stmt, arg);
 >>>
 >>> some-function could be an inline helper defined just for the sprintf
 >>> pass (and maybe also strlen which also seems to use the same pattern),
 >>> or it could be a value_range AKA irange ctor, or it could be a member
 >>> of range_query, whatever is the most appropriate.
 >>>
 >>> (If assigning/copying a value_range is thought to be too expensive,
 >>> declaring it first and then passing it to that helper to set it
 >>> would work too).
 >>>
 >>> In strlen, is the removed comment no longer relevant?  (I.e., does
 >>> the ranger solve the problem?)
 >>>
 >>> -  /* The range below may be "inaccurate" if a constant has been
 >>> -    substituted earlier for VAL by this pass that hasn't been
 >>> -    propagated through the CFG.  This shoud be fixed by the new
 >>> -    on-demand VRP if/when it becomes available (hopefully in
 >>> -    GCC 11).  */
 >> It should.
 >>
 >>> I'm wondering about the comment added to get_range_strlen_dynamic
 >>> and other places:
 >>>
 >>> + // FIXME: Use range_query instead of global ranges.
 >>>
 >>> Is that something you're planning to do in a followup or should
 >>> I remember to do it at some point?
 >> I'm not planning on doing it.  It's just a reminder that it would be
 >> beneficial to do so.
 >>
 >>> Otherwise I have no concern with the changes.
 >> It's not clear whether Andrew approved all 3 parts of the patchset
 >> or just the valuation part.  I'll wait for his nod before committing
 >> this chunk.
 >>
 >> Aldy
 >>
 > I have no issue with it, so OK.

Pushed all 3 patches.

 >
 > Just an observation that should be pointed out, I believe Aldy has all
 > the code for converting to a ranger, but we have not pursued that any
 > further yet since there is a regression due to our lack of equivalence
 > processing I think?  That should be resolved in the coming month, but at
 > the moment is a holdback/concern for converting these passes...  iirc.


Yes.  Martin, the takeaway here is that the strlen/sprintf pass 
has been converted to the new API, but ranger is still not up and 
running on it (even on the branch).


With the new API, all you have to do is remove all instances of 
evrp_range_analyzer and replace them with a ranger. That's it.
Below is an untested patch that would convert you to a ranger once 
it's contributed.
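
The helper suggested in the quoted review (a function that returns a range and falls back to "varying" when the query cannot answer) could look roughly like the sketch below.  Everything here is a mock: RangeSketch, QuerySketch, ToyQuery and range_or_varying are hypothetical names, and the types only imitate the general shape of value_range and range_query; none of this is the GCC API.

```cpp
#include <limits>

// Hypothetical, much-simplified stand-in for value_range.
struct RangeSketch {
  long lo;
  long hi;
  static RangeSketch varying() {
    return {std::numeric_limits<long>::min(),
            std::numeric_limits<long>::max()};
  }
  bool is_varying() const {
    return lo == std::numeric_limits<long>::min() &&
           hi == std::numeric_limits<long>::max();
  }
};

// Hypothetical stand-in for range_query.
class QuerySketch {
 public:
  virtual ~QuerySketch() = default;
  // Returns false when nothing is known about the expression.
  virtual bool range_of_expr(RangeSketch& r, int expr_id) const = 0;
};

// The helper: callers get a usable range in a single initialization.
inline RangeSketch range_or_varying(const QuerySketch& query, int expr_id) {
  RangeSketch r = RangeSketch::varying();
  if (!query.range_of_expr(r, expr_id))
    r = RangeSketch::varying();  // the query may have left r half-written
  return r;
}

// Toy query that only knows that expression 1 lies in [0, 9].
class ToyQuery : public QuerySketch {
 public:
  bool range_of_expr(RangeSketch& r, int expr_id) const override {
    if (expr_id != 1)
      return false;
    r = RangeSketch{0, 9};
    return true;
  }
};
```

With such a helper the call sites shrink to one line each, at the cost of copying the range by value, which is the trade-off raised in the quoted text.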


IIRC when I enabled the ranger for your pass a while back, there 
was one or two regressions due to missing equivalences, and the 
rest were because the tests were expecting an actual specific 
range, and the ranger returned a slightly different/better one.  
You'll need to adjust your tests.


Ack.  I'll be on the lookout for the ranger commit (if you happen
to remember and CC me on it just in case I might miss it that would
be great).


I have applied the patch and ran some tests.  There are quite
a few failures (see the list below).  I have only looked at
a couple.  The one in gcc.dg/tree-ssa/builtin-sprintf-warn-3.c
boils down to the following test case.  There should be no warning
for either sprintf call.  The one in h() is a false positive and
the reason for at least some of the regressions.  Somehow,
the conversions between int and char are caus

Re: [PATCH] generalized range_query class for multiple contexts

2020-11-05 Thread Andrew MacLeod via Gcc-patches

On 11/5/20 2:29 PM, Martin Sebor wrote:



signed char g (signed char min, signed char max)
{
  signed char i = x;
  return i < min || max < i ? min : i;
}

void gg (void)
{
  __builtin_sprintf (a, "%i", g (0, 9));   // bogus warning
}

I'm looking at this.  It's actually completely different code that's
generated for this signed char case, and something is being missed that
shouldn't be.  I'll get back to you.


Andrew



Re: [PATCH] generalized range_query class for multiple contexts

2020-11-05 Thread Martin Sebor via Gcc-patches

On 11/5/20 5:02 PM, Andrew MacLeod wrote:

On 11/5/20 4:02 PM, Martin Sebor wrote:

On 11/5/20 12:29 PM, Martin Sebor wrote:

On 10/1/20 11:25 AM, Martin Sebor wrote:

On 10/1/20 9:34 AM, Aldy Hernandez wrote:



On 10/1/20 3:22 PM, Andrew MacLeod wrote:
 > On 10/1/20 5:05 AM, Aldy Hernandez via Gcc-patches wrote:
 >>> Thanks for doing all this!  There isn't anything I don't understand
 >>> in the sprintf changes so no questions from me (well, almost none).
 >>> Just some comments:
 >> Thanks for your comments on the sprintf/strlen API conversion.
 >>
 >>> The current call statement is available in all functions that take
 >>> a directive argument, as dir->info.callstmt.  There should be no need
 >>> to also add it as a new argument to the functions that now need it.
 >> Fixed.
 >>
 >>> The change adds code along these lines in a bunch of places:
 >>>
 >>> + value_range vr;
 >>> + if (!query->range_of_expr (vr, arg, stmt))
 >>> +   vr.set_varying (TREE_TYPE (arg));
 >>>
 >>> I thought under the new Ranger APIs when a range couldn't be
 >>> determined it would be automatically set to the maximum for
 >>> the type.  I like that and have been moving in that direction
 >>> with my code myself (rather than having an API fail, have it
 >>> set the max range and succeed).
 >> I went through all the above idioms and noticed all are being used on
 >> supported types (integers or pointers).  So range_of_expr will always
 >> return true.  I've removed the if() and the set_varying.
 >>
 >>> Since that isn't so in this case, I think it would still be nice
 >>> if the added code could be written as if the range were set to
 >>> varying in this case and (ideally) reduced to just initialization:
 >>>
 >>> value_range vr = some-function (query, stmt, arg);
 >>>
 >>> some-function could be an inline helper defined just for the sprintf
 >>> pass (and maybe also strlen which also seems to use the same pattern),
 >>> or it could be a value_range AKA irange ctor, or it could be a member
 >>> of range_query, whatever is the most appropriate.
 >>>
 >>> (If assigning/copying a value_range is thought to be too expensive,
 >>> declaring it first and then passing it to that helper to set it
 >>> would work too).
 >>>
 >>> In strlen, is the removed comment no longer relevant?  (I.e., does
 >>> the ranger solve the problem?)
 >>>
 >>> -  /* The range below may be "inaccurate" if a constant has been
 >>> -    substituted earlier for VAL by this pass that hasn't been
 >>> -    propagated through the CFG.  This shoud be fixed by the new
 >>> -    on-demand VRP if/when it becomes available (hopefully in
 >>> -    GCC 11).  */
 >> It should.
 >>
 >>> I'm wondering about the comment added to get_range_strlen_dynamic
 >>> and other places:
 >>>
 >>> + // FIXME: Use range_query instead of global ranges.
 >>>
 >>> Is that something you're planning to do in a followup or should
 >>> I remember to do it at some point?
 >> I'm not planning on doing it.  It's just a reminder that it would be
 >> beneficial to do so.
 >>
 >>> Otherwise I have no concern with the changes.
 >> It's not clear whether Andrew approved all 3 parts of the patchset
 >> or just the valuation part.  I'll wait for his nod before committing
 >> this chunk.
 >>
 >> Aldy
 >>
 > I have no issue with it, so OK.

Pushed all 3 patches.

 >
 > Just an observation that should be pointed out, I believe Aldy has all
 > the code for converting to a ranger, but we have not pursued that any
 > further yet since there is a regression due to our lack of equivalence
 > processing I think?  That should be resolved in the coming month, but at
 > the moment is a holdback/concern for converting these passes...  iirc.


Yes.  Martin, the takeaway here is that the strlen/sprintf pass 
has been converted to the new API, but ranger is still not up and 
running on it (even on the branch).


With the new API, all you have to do is remove all instances of 
evrp_range_analyzer and replace them with a ranger. That's it.
Below is an untested patch that would convert you to a ranger once 
it's contributed.


IIRC when I enabled the ranger for your pass a while back, there 
was one or two regressions due to missing equivalences, and the 
rest were because the tests were expecting an actual specific 
range, and the ranger returned a slightly different/better one. 
You'll need to adjust your tests.


Ack.  I'll be on the lookout for the ranger commit (if you happen
to remember and CC me on it just in case I might miss it that would
be great).


I have applied the patch and ran some tests.  There are quite
a few failures (see the list below).  I have only looked at
a couple.  The one in gcc.dg/tree-ssa/builtin-sprintf-warn-3.c
boils down to the following test case.  There should be no warning
for either sprintf call.  The one in h() is a false positive and
the reason for at least some of the regressions.  Somehow,
the 

Re: [PATCH, rs6000] Add non-relative jump table support on Power Linux

2020-11-05 Thread HAO CHEN GUI via Gcc-patches

Hi,

Gentle ping this:

https://gcc.gnu.org/pipermail/gcc-patches/2020-October/556236.html

Thanks.

Gui Haochen


On 15/10/2020 4:46 PM, HAO CHEN GUI wrote:

Segher,

    I re-wrote the patch based on parameterized name.

    The attachments are the patch diff file and change log file.

    Bootstrapped and tested on powerpc64le-linux-gnu with no 
regressions.  Is this okay for trunk? Any recommendations? Thanks a lot.



On 29/9/2020 6:46 AM, Segher Boessenkool wrote:

Hi Hao Chen,

On Wed, Sep 09, 2020 at 04:55:29PM +0800, HAO CHEN GUI wrote:

 Thanks for your advice. I removed macros defined in linux64.h and
linux.h. So they take relative jump tables by default. When
no-relative-jumptables is set, the absolute jump tables are taken. All
things relevant to section relocations are put in another patch. Thanks
again.

[ Please do not insert patches into discussions ]


+/* Implement TARGET_ASM_GENERATE_PIC_ADDR_DIFF_VEC.
+   Return true if rs6000_relative_jumptables is set.  */

Don't just say what the code does (we can see that ;-) ); say *why*.
Of course it is terribly simple in this case, so maybe just that first
line is plenty.


+/* Specify the machine mode that this machine uses
+   for the index in the tablejump instruction.  */
+#define CASE_VECTOR_MODE \
+  (TARGET_32BIT || rs6000_relative_jumptables ? SImode : Pmode)

If TARGET_32BIT is true, SImode and Pmode are the same, so this is
simpler said as

#define CASE_VECTOR_MODE (rs6000_relative_jumptables ? SImode : Pmode)
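
The equivalence claimed above can be checked by enumerating the four flag combinations.  The sketch models the modes as a small enum and assumes, as the review states, that Pmode is SImode on 32-bit targets and DImode otherwise; the function names are made up for the illustration.

```cpp
enum class Mode { SImode, DImode };

// Assumption from the review: Pmode equals SImode when TARGET_32BIT is set.
constexpr Mode pmode(bool target_32bit) {
  return target_32bit ? Mode::SImode : Mode::DImode;
}

// The macro as originally posted.
constexpr Mode case_vector_mode_original(bool target_32bit, bool relative) {
  return (target_32bit || relative) ? Mode::SImode : pmode(target_32bit);
}

// The simplified form suggested in the review.
constexpr Mode case_vector_mode_simplified(bool target_32bit, bool relative) {
  return relative ? Mode::SImode : pmode(target_32bit);
}

// True when both forms agree for every combination of the two flags.
constexpr bool forms_agree() {
  return case_vector_mode_original(false, false) ==
             case_vector_mode_simplified(false, false) &&
         case_vector_mode_original(false, true) ==
             case_vector_mode_simplified(false, true) &&
         case_vector_mode_original(true, false) ==
             case_vector_mode_simplified(true, false) &&
         case_vector_mode_original(true, true) ==
             case_vector_mode_simplified(true, true);
}

static_assert(forms_agree(), "both forms of CASE_VECTOR_MODE must agree");
```

The only case where the TARGET_32BIT test could matter is 32-bit with absolute jump tables, and there Pmode is already SImode.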


I'll have the tablejump* patterns converted to parameterized names
hopefully tonight or tomorrow, which will make your patch much easier
to read.  It looks like it will be fine, thanks :-)


Segher


[Patch] Fortran: Fix type-decl for PDT / wrong-code pdt_14.f03 issue [PR97652]

2020-11-05 Thread Tobias Burnus

Recent IPA work exposed this issue by causing wrong-code for
gfortran.dg/pdt_14.f03 with optimization turned on; this shows up rather
prominently with an endless loop, until after 300s (per optimization level)
the timeout kicks in.

OK? I think it should probably be backported to GCC 9 and 10 as well, even though
the IPA patches are in mainline (GCC 11) only.

Tobias

Fortran: Fix type-decl for PDT / wrong-code pdt_14.f03 issue [PR97652]

Parameterized derived types are handled in a special way and start with 'Pdt'.
If the 'P' is not uppercase, gfc_get_derived_type (which calls
gfc_get_module_backend_decl) does not find the existing declaration and
builds a new type. The middle end then sees those types as being different
and nonaliasing, creating an endless loop for pdt_14.f03.
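
A minimal sketch of the failure mode described above, assuming the lookup is case-sensitive and that gfc_dt_upper_string only uppercases the first character of the name. The table, the `find_or_build` helper and the `upper_first` helper are hypothetical illustrations, not gfortran code.

```cpp
#include <cassert>
#include <cctype>
#include <map>
#include <string>

// Hypothetical stand-in for the set of already-built type declarations,
// keyed case-sensitively by symbol name.
using DeclTable = std::map<std::string, int>;

// Illustrative helper: uppercase only the first character of NAME.
inline std::string upper_first (std::string name)
{
  if (!name.empty ())
    name[0] = static_cast<char> (
      std::toupper (static_cast<unsigned char> (name[0])));
  return name;
}

// Return the id of the existing declaration for NAME, or register a new one.
inline int find_or_build (DeclTable &table, const std::string &name)
{
  auto it = table.find (name);
  if (it != table.end ())
    return it->second;
  int id = static_cast<int> (table.size ());
  table.emplace (name, id);
  return id;
}
```

Looking up "pdtfoo" after "Pdtfoo" was registered builds a second, distinct entry; normalizing the name first finds the existing one.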

gcc/fortran/ChangeLog:

	PR fortran/97652
	* module.c (mio_symbol): Fix symbol name for pdt_type.

diff --git a/gcc/fortran/module.c b/gcc/fortran/module.c
index 33e7df7d6a4..4c6ff22d5c1 100644
--- a/gcc/fortran/module.c
+++ b/gcc/fortran/module.c
@@ -4549,6 +4549,9 @@ mio_symbol (gfc_symbol *sym)
 
   mio_symbol_attribute (&sym->attr);
 
+  if (sym->attr.pdt_type)
+sym->name = gfc_dt_upper_string (sym->name);
+
   /* Note that components are always saved, even if they are supposed
  to be private.  Component access is checked during searching.  */
   mio_component_list (&sym->components, sym->attr.vtype);


Re: [PATCH] Put absolute address jump table in data.rel.ro.local if targets support relocations

2020-11-05 Thread HAO CHEN GUI via Gcc-patches

Hi,

Gentle ping on this:

https://gcc.gnu.org/pipermail/gcc-patches/2020-October/556744.html

Thanks

Gui Haochen


On 22/10/2020 10:53 AM, HAO CHEN GUI wrote:
My email was misconfigured, so I only saw your reply late. I have modified the
patch according to your advice. Could you please review it again? Thanks.


On 2/10/2020 1:47 AM, Richard Sandiford wrote:

Sorry for the slow review.

HAO CHEN GUI via Gcc-patches writes:

diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index 513fc5fe295..6f5bf8d7d73 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -9315,10 +9315,10 @@ mips_select_rtx_section (machine_mode mode, 
rtx x,

 default_function_rodata_section.  */
    static section *
-mips_function_rodata_section (tree decl)
+mips_function_rodata_section (tree decl, bool relocatable 
ATTRIBUTE_UNUSED)

Now that we're C++, it's more idiomatic to leave off the parameter name:

   mips_function_rodata_section (tree decl, bool)

Same for the rest of the patch.
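
For illustration, the two spellings side by side in a standalone sketch. The hook signature and function names are hypothetical simplifications (an int stands in for the tree argument); `__attribute__ ((unused))` is used as a GNU-style stand-in for what ATTRIBUTE_UNUSED conveys.

```cpp
#include <cassert>

// Hypothetical hook type: every implementation must accept both arguments,
// whether or not it uses them.
using section_hook = int (*) (int decl, bool relocatable);

// C-style spelling: the parameter is named and marked as unused.
inline int hook_named (int decl, bool relocatable __attribute__ ((unused)))
{
  return decl;
}

// C++ spelling: leave the name off; no unused-parameter warning and no
// attribute needed.
inline int hook_unnamed (int decl, bool)
{
  return decl;
}
```

Both functions have the same type, so either can be installed where a `section_hook` is expected.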

@@ -2491,9 +2491,19 @@ final_scan_insn_1 (rtx_insn *insn, FILE 
*file, int optimize_p ATTRIBUTE_UNUSED,

    if (! JUMP_TABLES_IN_TEXT_SECTION)
  {
    int log_align;
+  bool relocatable;
+
+  relocatable = 0;

Very minor, but simpler as:

    bool relocatable = false;

Same for the later hunk.

@@ -549,16 +549,17 @@ Whatever the actual target object format, this 
is often good enough.",

   void, (tree decl, int reloc),
   default_unique_section)
  -/* Return the readonly data section associated with function 
DECL.  */

+/* Return the readonly or relocated readonly data section
+   associated with function DECL.  */
  DEFHOOK
  (function_rodata_section,
- "Return the readonly data section associated with\n\
+ "Return the readonly or reloc readonly data section associated 
with\n\

  @samp{DECL_SECTION_NAME (@var{decl})}.\n\

Maybe add “; @var{relocatable} selects the latter over the former.”

  The default version of this function selects 
@code{.gnu.linkonce.r.name} if\n\
  the function's section is @code{.gnu.linkonce.t.name}, 
@code{.rodata.name}\n\
-if function is in @code{.text.name}, and the normal readonly-data 
section\n\

-otherwise.",
- section *, (tree decl),
+or @code{.data.rel.ro.name} if function is in @code{.text.name}, 
and\n\

+the normal readonly-data or reloc readonly data section otherwise.",
+ section *, (tree decl, bool relocatable),
   default_function_rodata_section)
    /* Nonnull if the target wants to override the default ".rodata" 
prefix

diff --git a/gcc/varasm.c b/gcc/varasm.c
index 4070f9c17e8..91ab75aed06 100644
--- a/gcc/varasm.c
+++ b/gcc/varasm.c
@@ -726,12 +726,26 @@ switch_to_other_text_partition (void)
    switch_to_section (current_function_section ());
  }
  -/* Return the read-only data section associated with function 
DECL.  */

+/* Return the read-only or relocated read-only data section
+   associated with function DECL.  */
    section *
-default_function_rodata_section (tree decl)
+default_function_rodata_section (tree decl, bool relocatable)
  {
-  if (decl != NULL_TREE && DECL_SECTION_NAME (decl))
+  const char* sname;
+  unsigned int flags;
+
+  flags = 0;
+
+  if (relocatable)
+    {
+  sname = ".data.rel.ro.local";
+  flags = (SECTION_WRITE | SECTION_RELRO);
+    }
+  else
+    sname = ".rodata";
+
+  if (decl && DECL_SECTION_NAME (decl))
  {
    const char *name = DECL_SECTION_NAME (decl);
  @@ -744,12 +758,12 @@ default_function_rodata_section (tree decl)
    dot = strchr (name + 1, '.');
    if (!dot)
  dot = name;
-  len = strlen (dot) + 8;
+  len = strlen (dot) + strlen (sname) + 1;
    rname = (char *) alloca (len);
  -  strcpy (rname, ".rodata");
+  strcpy (rname, sname);
    strcat (rname, dot);
-  return get_section (rname, SECTION_LINKONCE, decl);
+  return get_section (rname, (SECTION_LINKONCE | flags), decl);
  }
    /* For .gnu.linkonce.t.foo we want to use 
.gnu.linkonce.r.foo.  */

    else if (DECL_COMDAT_GROUP (decl)
@@ -767,15 +781,18 @@ default_function_rodata_section (tree decl)
 && strncmp (name, ".text.", 6) == 0)
  {
    size_t len = strlen (name) + 1;
-  char *rname = (char *) alloca (len + 2);
+  char *rname = (char *) alloca (len + strlen (sname) - 5);
  -  memcpy (rname, ".rodata", 7);
-  memcpy (rname + 7, name + 5, len - 5);
-  return get_section (rname, 0, decl);
+  memcpy (rname, sname, strlen (sname));
+  memcpy (rname + strlen (sname), name + 5, len - 5);
+  return get_section (rname, flags, decl);
  }
  }

Don't we need to handle the .gnu.linkonce.t. case too?  I believe
the suffix there is “.d.rel.ro.local” (replacing “.t”).
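
The prefix swap done by the last hunk above (the ".text.NAME" case) can be sketched as follows. This uses std::string rather than alloca/memcpy, the function name is hypothetical, and it deliberately leaves out the COMDAT and .gnu.linkonce cases.

```cpp
#include <cassert>
#include <string>

// Sketch only: map a per-function text section name to the matching
// read-only data section name.  RELOCATABLE selects the relro prefix.
inline std::string
function_rodata_name (const std::string &text_name, bool relocatable)
{
  const std::string sname = relocatable ? ".data.rel.ro.local" : ".rodata";
  const std::string text_prefix = ".text.";

  // ".text.foo" -> sname + ".foo": skip the 5 characters of ".text" and
  // keep the dot that introduces the function-specific suffix.
  if (text_name.compare (0, text_prefix.size (), text_prefix) == 0)
    return sname + text_name.substr (5);

  // Anything else falls back to the plain section name.
  return sname;
}
```

The resulting length is strlen (sname) + strlen (name) - 5, plus one for the terminator, which matches the alloca size in the hunk.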

My main concern is how this interacts with non-ELF targets.
It looks like AIX/XCOFF, Darwin and Cygwin already pick
default_no_function_rodata_section, so they should be fine.
But at the moment, all the fancy stuff in 
default_function_rodata_section


RE: gcc-wwwdocs branch master updated. 88e29096c36837553fc841bd1fa5df6caa776b44

2020-11-05 Thread Liu, Hongtao via Gcc-patches



>-----Original Message-----
>From: Gerald Pfeifer 
>Sent: Friday, November 6, 2020 5:57 AM
>To: Hongtao Liu ; hongtao Liu
>
>Cc: gcc-patches@gcc.gnu.org
>Subject: Re: gcc-wwwdocs branch master updated.
>88e29096c36837553fc841bd1fa5df6caa776b44
>
>On Thu, 29 Oct 2020, hongtao Liu via Gcc-cvs-wwwdocs wrote:
>> The branch, master has been updated
>>via  88e29096c36837553fc841bd1fa5df6caa776b44 (commit)
>>   from  053c956f6e9c71efac5be01f8a8ba79f15d87f4b (commit)
>
>>GCC now supports the Intel CPU named Alderlake through
>>  -march=alderlake.
>> -The switch enables the CLDEMOTE PTWRITE WAITPKG SERIALIZE ISA
>extensions.
>> +The switch enables the CLDEMOTE PTWRITE WAITPKG SERIALIZE
>KEYLOCKER
>> +ISA extensions.
>
>I did not see this posted on gcc-patches.  Should this list of extensions be
>separated by commas?
>
>(I can make that change if you agree.)
>

Yes, thanks for that.
Patch adding -march=alderlake:
https://gcc.gnu.org/pipermail/gcc-patches/2020-July/549699.html
Patch for Keylocker:
https://gcc.gnu.org/pipermail/gcc-patches/2020-October/556026.html

>Also, I did not see you in gcc/MAINTAINERS, or did I miss it?
>Since evidently you have write after approval access, please add yourself
>there.
>

Will do.

>Gerald


Re: [PATCH] c++: Reuse identical ATOMIC_CONSTRs during normalization

2020-11-05 Thread Patrick Palka via Gcc-patches
On Thu, 5 Nov 2020, Patrick Palka wrote:

> On Thu, 5 Nov 2020, Jason Merrill wrote:
> 
> > On 11/3/20 3:43 PM, Patrick Palka wrote:
> > > Profiling revealed that sat_hasher::equal accounts for nearly 40% of
> > > compile time in some cmcstl2 tests.
> > > 
> > > This patch eliminates this bottleneck by caching the ATOMIC_CONSTRs
> > > returned by normalize_atom.  This in turn allows us to replace the
> > > expensive atomic_constraints_identical_p check in sat_hasher::equal
> > > with cheap pointer equality, with no loss in cache hit rate.
> > > 
> > > With this patch, compile time for the cmcstl2 test
> > > test/algorithm/set_symmetric_difference4.cpp drops from 19s to 11s with
> > > an --enable-checking=release compiler.
> > > 
> > > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> > > trunk?
> > > 
> > > gcc/cp/ChangeLog:
> > > 
> > >   * constraint.cc (struct atom_hasher): New descriptor class for a
> > >   hash_table.  Use it to define ...
> > >   (atom_cache): ... this.
> > >   (normalize_atom): Use it to cache ATOMIC_CONSTRs when not
> > >   generating diagnostics.
> > >   (sat_hasher::hash): Use htab_hash_pointer instead of
> > >   hash_atomic_constraint.
> > >   (sat_hasher::equal): Test for pointer equality instead of
> > >   atomic_constraints_identical_p.
> > > ---
> > >   gcc/cp/constraint.cc | 37 ++---
> > >   1 file changed, 34 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
> > > index b6f6f0d02a5..ce720c641e8 100644
> > > --- a/gcc/cp/constraint.cc
> > > +++ b/gcc/cp/constraint.cc
> > > @@ -710,6 +710,25 @@ normalize_concept_check (tree check, tree args,
> > > norm_info info)
> > > return normalize_expression (def, subst, info);
> > >   }
> > >   +/* Hash functions for ATOMIC_CONSTRs.  */
> > > +
> > > > +struct atom_hasher : default_hash_traits<tree>
> > > +{
> > > +  static hashval_t hash (tree atom)
> > > +  {
> > > +return hash_atomic_constraint (atom);
> > > +  }
> > > +
> > > +  static bool equal (tree atom1, tree atom2)
> > > +  {
> > > +return atomic_constraints_identical_p (atom1, atom2);
> > > +  }
> > > +};
> > 
> > This is the same as constraint_hash in logic.cc; either they should be
> > combined, or (probably) the hash table in logic.cc should be changed to also
> > take advantage of pointer equivalence.
> 
> Ah, I forgot about the existence of this hasher.  Consider this hasher
> changed to take advantage of the pointer equivalence (I'll post a
> revised and tested patch later today).

On second thought, if we make the hasher in logic.cc and other places
(e.g. add_constraint and constraints_equivalent_p) take advantage of
pointer-based identity for ATOMIC_CONSTR, then we'd be relying on the
uniqueness assumption in a crucial way rather than just relying on it in
the satisfaction cache in a benign way that doesn't affect the semantics
of the program if the assumption is somehow violated (we just get a
cache miss if it is violated).

So it seems to me it'd be better to not infect other parts of the code
with this assumption, and to keep it local to the satisfaction cache,
so as to minimize complexity.  So I suppose we should go with combining
these two "structural" hashers.
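
As a rough standalone sketch of the idea (not the GCC implementation): a cache keyed by a structural hasher hands out one canonical pointer per distinct atom, so a downstream cache can compare pointers, and a duplicate that somehow escapes only costs a miss there. `Atom`, `AtomHash`, `AtomEqual` and `AtomCache` are hypothetical stand-ins for ATOMIC_CONSTR, the structural hasher and atom_cache.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for an ATOMIC_CONSTR: expression plus mapping.
struct Atom
{
  std::string expr;
  std::string mapping;
};

// Structural hasher: looks at the contents, not the address.
struct AtomHash
{
  std::size_t operator() (const Atom *a) const
  {
    std::hash<std::string> h;
    return h (a->expr) ^ (h (a->mapping) << 1);
  }
};

struct AtomEqual
{
  bool operator() (const Atom *a, const Atom *b) const
  {
    return a->expr == b->expr && a->mapping == b->mapping;
  }
};

// Returns one canonical pointer per structurally distinct atom.
class AtomCache
{
public:
  const Atom *intern (const std::string &expr, const std::string &mapping)
  {
    std::unique_ptr<Atom> fresh (new Atom{expr, mapping});
    auto it = set_.find (fresh.get ());
    if (it != set_.end ())
      return *it;               // Reuse the existing equivalent atom.
    const Atom *result = fresh.get ();
    set_.insert (result);
    storage_.push_back (std::move (fresh));
    return result;
  }

private:
  std::unordered_set<const Atom *, AtomHash, AtomEqual> set_;
  std::vector<std::unique_ptr<Atom>> storage_;
};
```

Only the interning table needs the structural comparison; anything keyed on the returned pointers can use plain pointer equality.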

> 
> > 
> > > +/* Used by normalize_atom to cache ATOMIC_CONSTRs.  */
> > > +
> > > > +static GTY((deletable)) hash_table<atom_hasher> *atom_cache;
> > 
> > > If we're relying on pointer identity, this can't be deletable; if GC
> > > discards it, later normalization will generate a new equivalent
> > > ATOMIC_CONSTR, breaking the uniqueness assumption.
> 
> > But because no ATOMIC_CONSTR is ever reachable from a GC root (since
> > they live only inside GC-deletable structures), there will never be two
> > equivalent ATOMIC_CONSTR trees live at once, which is a sufficient
> > notion of uniqueness for us, I think.
> 
> > 
> > >   /* The normal form of an atom depends on the expression. The normal
> > >  form of a function call to a function concept is a check constraint
> > >  for that concept. The normal form of a reference to a variable
> > > @@ -729,7 +748,19 @@ normalize_atom (tree t, tree args, norm_info info)
> > > /* Build a new info object for the atom.  */
> > > tree ci = build_tree_list (t, info.context);
> > >   -  return build1 (ATOMIC_CONSTR, ci, map);
> > > +  tree atom = build1 (ATOMIC_CONSTR, ci, map);
> > > +  if (!info.generate_diagnostics ())
> > > +{
> > > > +  /* Cache the ATOMIC_CONSTRs that we return, so that sat_hasher::equal
> > > > +  later can quickly compare two atoms using just pointer equality.  */
> > > +  if (!atom_cache)
> > > > + atom_cache = hash_table<atom_hasher>::create_ggc (31);
> > > +  tree *slot = atom_cache->find_slot (atom, INSERT);
> > > +  if (*slot)
> > > + return *slot;
> > > +  *slot = atom;
> > > +}
> > > +  return atom;
> > >   }
> > > /* Returns the normal form of an expression. */
> > > @@ -2284,13 +2315,13 @@ struct sat_

[PATCH 1/4] c++: Fix ICE with variadic concepts and aliases [PR93907]

2020-11-05 Thread Patrick Palka via Gcc-patches
This patch (naively) extends the PR93907 fix to also apply to variadic
concepts invoked with a type argument pack.  Without this, we ICE on
the below testcase (a variadic version of concepts-using2.C) in the same
manner as we used to on concepts-using2.C before r10-7133.

Patch series bootstrapped and regtested on x86_64-pc-linux-gnu,
and also tested against cmcstl2 and range-v3.

gcc/cp/ChangeLog:

PR c++/93907
* constraint.cc (tsubst_parameter_mapping): Also canonicalize
the type arguments of a TYPE_ARGUMENT_PACK.

gcc/testsuite/ChangeLog:

PR c++/93907
* g++.dg/cpp2a/concepts-using3.C: New test, based off of
concepts-using2.C.
---
 gcc/cp/constraint.cc | 10 
 gcc/testsuite/g++.dg/cpp2a/concepts-using3.C | 52 
 2 files changed, 62 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-using3.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index b6f6f0d02a5..c871a8ab86a 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -2252,6 +2252,16 @@ tsubst_parameter_mapping (tree map, tree args, 
subst_info info)
  new_arg = tsubst_template_arg (arg, args, complain, in_decl);
  if (TYPE_P (new_arg))
new_arg = canonicalize_type_argument (new_arg, complain);
+ if (TREE_CODE (new_arg) == TYPE_ARGUMENT_PACK)
+   {
+ tree pack_args = ARGUMENT_PACK_ARGS (new_arg);
+ for (int i = 0; i < TREE_VEC_LENGTH (pack_args); i++)
+   {
+ tree& pack_arg = TREE_VEC_ELT (pack_args, i);
+ if (TYPE_P (pack_arg))
+   pack_arg = canonicalize_type_argument (pack_arg, complain);
+   }
+   }
}
   if (new_arg == error_mark_node)
return error_mark_node;
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-using3.C 
b/gcc/testsuite/g++.dg/cpp2a/concepts-using3.C
new file mode 100644
index 000..2c8ad40d104
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-using3.C
@@ -0,0 +1,52 @@
+// PR c++/93907
+// { dg-options -std=gnu++20 }
+
+// This testcase is a variadic version of concepts-using2.C; the only
+// difference is that 'cd' and 'ce' are now variadic concepts.
+
+template  struct c {
+  static constexpr int d = a;
+  typedef c e;
+};
+template  struct f;
+template  using g = typename f::e;
+struct b;
+template  struct f { using e = b; };
+template  struct m { typedef g aj; };
+template  struct n { typedef typename m::aj e; };
+template  using an = typename n::e;
+template  constexpr bool ao = c::d;
+template  constexpr bool i = c<1>::d;
+template  concept bb = i;
+#ifdef __SIZEOF_INT128__
+using cc = __int128;
+#else
+using cc = long long;
+#endif
+template  concept cd = bb;
+template  concept ce = requires { requires cd; };
+template  concept h = ce;
+template  concept l = h;
+template  concept cl = ao;
+template  concept cp = requires(b j) {
+  requires h>;
+};
+struct o {
+  template  requires cp auto operator()(b) {}
+};
+template  using cm = decltype(o{}(b()));
+template  concept ct = l;
+template  concept dd = ct>;
+template  concept de = dd;
+struct {
+  template  void operator()(da, b);
+} di;
+struct p {
+  void begin();
+};
+template  using df = p;
+template  void q() {
+  df k;
+  int d;
+  di(k, d);
+}
-- 
2.29.2.154.g7f7ebe054a



[PATCH 2/4 v2] c++: Reuse identical ATOMIC_CONSTRs during normalization

2020-11-05 Thread Patrick Palka via Gcc-patches
Profiling revealed that sat_hasher::equal accounts for nearly 40% of
compile time in some cmcstl2 tests.

This patch eliminates this bottleneck by caching the ATOMIC_CONSTRs
returned by normalize_atom.  This in turn allows us to replace the
expensive atomic_constraints_identical_p check in sat_hasher::equal
with cheap pointer equality, with no loss in cache hit rate.

With this patch, compile time for the cmcstl2 test
test/algorithm/set_symmetric_difference4.cpp drops from 19s to 11s with
an --enable-checking=release compiler.

gcc/cp/ChangeLog:

* constraint.cc (atom_cache): Define this deletable hash_table.
(normalize_atom): Use it to cache ATOMIC_CONSTRs when not
generating diagnostics.
(sat_hasher::hash): Use htab_hash_pointer instead of
hash_atomic_constraint.
(sat_hasher::equal): Test for pointer equality instead of
atomic_constraints_identical_p.
* cp-tree.h (struct atom_hasher): Moved and renamed from ...
* logic.cc (struct constraint_hash): ... here.
(clause::m_set): Adjust accordingly.
---
 gcc/cp/constraint.cc | 27 ---
 gcc/cp/cp-tree.h | 15 +++
 gcc/cp/logic.cc  | 17 +
 3 files changed, 40 insertions(+), 19 deletions(-)

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index c871a8ab86a..613ced26e2b 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -710,6 +710,10 @@ normalize_concept_check (tree check, tree args, norm_info 
info)
   return normalize_expression (def, subst, info);
 }
 
+/* Used by normalize_atom to cache ATOMIC_CONSTRs.  */
+
+static GTY((deletable)) hash_table<atom_hasher> *atom_cache;
+
 /* The normal form of an atom depends on the expression. The normal
form of a function call to a function concept is a check constraint
for that concept. The normal form of a reference to a variable
@@ -729,7 +733,19 @@ normalize_atom (tree t, tree args, norm_info info)
   /* Build a new info object for the atom.  */
   tree ci = build_tree_list (t, info.context);
 
-  return build1 (ATOMIC_CONSTR, ci, map);
+  tree atom = build1 (ATOMIC_CONSTR, ci, map);
+  if (!info.generate_diagnostics ())
+{
+  /* Cache the ATOMIC_CONSTRs that we return, so that sat_hasher::equal
+later can cheaply compare two atoms using just pointer equality.  */
+  if (!atom_cache)
+   atom_cache = hash_table<atom_hasher>::create_ggc (31);
+  tree *slot = atom_cache->find_slot (atom, INSERT);
+  if (*slot)
+   return *slot;
+  *slot = atom;
+}
+  return atom;
 }
 
 /* Returns the normal form of an expression. */
@@ -2294,13 +2310,18 @@ struct sat_hasher : ggc_ptr_hash<sat_entry>
 {
   static hashval_t hash (sat_entry *e)
   {
-hashval_t value = hash_atomic_constraint (e->constr);
+/* Since normalize_atom caches the ATOMIC_CONSTRs it returns,
+   we can assume pointer-based identity for fast hashing and
+   comparison.  Even if this assumption is violated, that's
+   okay, we'll just get a cache miss.  */
+hashval_t value = htab_hash_pointer (e->constr);
 return iterative_hash_template_arg (e->args, value);
   }
 
   static bool equal (sat_entry *e1, sat_entry *e2)
   {
-if (!atomic_constraints_identical_p (e1->constr, e2->constr))
+/* As in sat_hasher::hash.  */
+if (e1->constr != e2->constr)
   return false;
 return template_args_equal (e1->args, e2->args);
   }
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 26852f6f2e3..eda4c56b406 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -7835,6 +7835,21 @@ extern hashval_t iterative_hash_constraint  (tree, 
hashval_t);
 extern hashval_t hash_atomic_constraint (tree);
 extern void diagnose_constraints(location_t, tree, tree);
 
+/* A structural hasher for ATOMIC_CONSTRs.  */
+
+struct atom_hasher : default_hash_traits<tree>
+{
+  static hashval_t hash (tree t)
+  {
+return hash_atomic_constraint (t);
+  }
+
+  static bool equal (tree t1, tree t2)
+  {
+return atomic_constraints_identical_p (t1, t2);
+  }
+};
+
 /* in logic.cc */
 extern bool subsumes(tree, tree);
 
diff --git a/gcc/cp/logic.cc b/gcc/cp/logic.cc
index 194b743192d..6701488bc1c 100644
--- a/gcc/cp/logic.cc
+++ b/gcc/cp/logic.cc
@@ -47,21 +47,6 @@ along with GCC; see the file COPYING3.  If not see
 #include "toplev.h"
 #include "type-utils.h"
 
-/* Hash functions for atomic constrains.  */
-
-struct constraint_hash : default_hash_traits<tree>
-{
-  static hashval_t hash (tree t)
-  {
-return hash_atomic_constraint (t);
-  }
-
-  static bool equal (tree t1, tree t2)
-  {
-return atomic_constraints_identical_p (t1, t2);
-  }
-};
-
 /* A conjunctive or disjunctive clause.
 
Each clause maintains an iterator that refers to the current
@@ -219,7 +204,7 @@ struct clause
   }
 
   std::list m_terms; /* The list of terms.  */
-  hash_set m_set; /* The set of atomic 
constraints.  */
+  hash_set m_set; /* The set of atomic 
