Re: GCC 10 backports

2021-03-03 Thread Martin Liška

One more backport I've just tested.

Martin
>From 1740f6453356fec7926e360b3a379ca0fa80a1da Mon Sep 17 00:00:00 2001
From: Tom de Vries 
Date: Fri, 5 Feb 2021 10:36:38 +0100
Subject: [PATCH] debug: fix switch lowering debug info

gcc/ChangeLog:

	PR debug/98656
	* tree-switch-conversion.c (jump_table_cluster::emit): Add loc
	argument.
	(bit_test_cluster::emit): Reuse location_t for newly created
	gswitch statement.
	(switch_decision_tree::try_switch_expansion): Preserve
	location_t.
	* tree-switch-conversion.h: Change function signatures.

(cherry picked from commit 4ede02a5f2af1205434f0e05aaaeff762b24e329)
---
 gcc/tree-switch-conversion.c | 11 +++
 gcc/tree-switch-conversion.h |  8 
 2 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/gcc/tree-switch-conversion.c b/gcc/tree-switch-conversion.c
index 8da1be1cd99..2bc098d0172 100644
--- a/gcc/tree-switch-conversion.c
+++ b/gcc/tree-switch-conversion.c
@@ -1115,7 +1115,8 @@ group_cluster::dump (FILE *f, bool details)
 
 void
 jump_table_cluster::emit (tree index_expr, tree,
-			  tree default_label_expr, basic_block default_bb)
+			  tree default_label_expr, basic_block default_bb,
+			  location_t loc)
 {
   unsigned HOST_WIDE_INT range = get_range (get_low (), get_high ());
   unsigned HOST_WIDE_INT nondefault_range = 0;
@@ -1134,6 +1135,7 @@ jump_table_cluster::emit (tree index_expr, tree,
 
   gswitch *s = gimple_build_switch (index_expr,
 unshare_expr (default_label_expr), labels);
+  gimple_set_location (s, loc);
   gimple_stmt_iterator gsi = gsi_start_bb (m_case_bb);
   gsi_insert_after (&gsi, s, GSI_NEW_STMT);
 
@@ -1491,7 +1493,7 @@ case_bit_test::cmp (const void *p1, const void *p2)
 
 void
 bit_test_cluster::emit (tree index_expr, tree index_type,
-			tree, basic_block default_bb)
+			tree, basic_block default_bb, location_t)
 {
   case_bit_test test[m_max_case_bit_tests] = { {} };
   unsigned int i, j, k;
@@ -1858,7 +1860,8 @@ switch_decision_tree::try_switch_expansion (vec &clusters)
 {
   cluster *c = clusters[0];
   c->emit (index_expr, index_type,
-	   gimple_switch_default_label (m_switch), m_default_bb);
+	   gimple_switch_default_label (m_switch), m_default_bb,
+	   gimple_location (m_switch));
   redirect_edge_succ (single_succ_edge (bb), c->m_case_bb);
 }
   else
@@ -1870,7 +1873,7 @@ switch_decision_tree::try_switch_expansion (vec &clusters)
 	if (clusters[i]->get_type () != SIMPLE_CASE)
 	  clusters[i]->emit (index_expr, index_type,
 			 gimple_switch_default_label (m_switch),
-			 m_default_bb);
+			 m_default_bb, gimple_location (m_switch));
 }
 
   fix_phi_operands_for_edges ();
diff --git a/gcc/tree-switch-conversion.h b/gcc/tree-switch-conversion.h
index dbfd9eecba2..1ff50fafaa3 100644
--- a/gcc/tree-switch-conversion.h
+++ b/gcc/tree-switch-conversion.h
@@ -71,7 +71,7 @@ public:
   virtual void dump (FILE *f, bool details = false) = 0;
 
   /* Emit GIMPLE code to handle the cluster.  */
-  virtual void emit (tree, tree, tree, basic_block) = 0;
+  virtual void emit (tree, tree, tree, basic_block, location_t) = 0;
 
   /* Return true if a cluster handles only a single case value and the
  value is not a range.  */
@@ -164,7 +164,7 @@ public:
 fprintf (f, " ");
   }
 
-  void emit (tree, tree, tree, basic_block)
+  void emit (tree, tree, tree, basic_block, location_t)
   {
 gcc_unreachable ();
   }
@@ -250,7 +250,7 @@ public:
   }
 
   void emit (tree index_expr, tree index_type,
-	 tree default_label_expr, basic_block default_bb);
+	 tree default_label_expr, basic_block default_bb, location_t loc);
 
   /* Find jump tables of given CLUSTERS, where all members of the vector
  are of type simple_cluster.  New clusters are returned.  */
@@ -368,7 +368,7 @@ public:
 There *MUST* be max_case_bit_tests or less unique case
 node targets.  */
   void emit (tree index_expr, tree index_type,
-	 tree default_label_expr, basic_block default_bb);
+	 tree default_label_expr, basic_block default_bb, location_t loc);
 
   /* Find bit tests of given CLUSTERS, where all members of the vector
  are of type simple_cluster.  New clusters are returned.  */
-- 
2.30.1



Re: [PATCH] correct fix to avoid false positives for vectorized stores (PR 96963, 94655)

2021-03-03 Thread Richard Biener via Gcc-patches
On Tue, Mar 2, 2021 at 9:23 PM Martin Sebor  wrote:
>
> On 3/2/21 3:39 AM, Richard Biener wrote:
> > On Fri, Jan 22, 2021 at 12:39 AM Martin Sebor via Gcc-patches
> >  wrote:
> >>
> >> The hack I put in compute_objsize() last January for pr93200 isn't
> >> quite correct.  It happened to suppress the false positive there
> >> but, due to what looks like a thinko on my part, not in some other
> >> test cases involving vectorized stores.
> >>
> >> The attached change adjusts the hack to have compute_objsize() give
> >> up on all MEM_REFs with a vector type.  This effectively disables
> >> -Wstringop-{overflow,overread} for vector accesses (either written
> >> by the user or synthesized by GCC from ordinary accesses).  It
> >> doesn't affect -Warray-bounds because this warning doesn't use
> >> compute_objsize() yet.  When it does (it should considerably
> >> simplify the code) some additional changes will be needed to
> >> preserve -Warray-bounds for out of bounds vector accesses.
> >> The test this patch adds should serve as a reminder to make
> >> it if we forget.
> >>
> >> Tested on x86_64-linux.  Since PR 94655 was reported against GCC
> >> 10 I'd like to apply this fix to both the trunk and the 10 branch.
> >
> > The proposed code reads even more confusingly than before.
> > What is called 'ptr' is treated as a reference and what you
> > then name ptrtype is the type of the reference.
> >
> > That leads to code like
> >
> > +   if (code == MEM_REF && TREE_CODE (ptrtype) == VECTOR_TYPE)
> >
> > what?  the pointer type is a VECTOR_TYPE?!
> >
> > I think you are looking for whether the reference type is a VECTOR_TYPE.
> >
> > Part of the confusion is likely because the code commons
> >
> >if (code == ARRAY_REF || code == MEM_REF)
> >
> > but in one case operand zero is a pointer to the object (MEM_REF) and
> > in one case
> > it is the object (ARRAY_REF).
> >
> > Please refactor this code so one can actually read it literally
> > without second-guessing proper variable names.
>
> I agree that handling both REFs in the same block makes the code
> more difficult to follow than it needs to be.
>
> In the attached patch I've factored the code out into two helper
> functions, one for ARRAY_REF and another for MEM_REF, and chosen
> (hopefully) clearer names for the local variables.
>
> A similar refactoring could be done for COMPONENT_REF and also
> for SSA_NAME to reduce the size of the function and make it much
> easier to follow.  I stopped short of doing that but I can do it
> for GCC 12 when I move the whole compute_objsize() implementation
> into a more appropriate  place (e.g., its own file).

+  if (!addr && TREE_CODE (TREE_TYPE (reftype)) == POINTER_TYPE)

POINTER_TYPE_P ()

+/* Avoid arrays of pointers.  FIXME: Hande pointers to arrays
+   of known bound.  */
+return false;

+  if (TREE_CODE (TREE_TYPE (mref)) == VECTOR_TYPE)
+{
+  /* Hack: Give up for MEM_REFs of vector types; those may be
+synthesized from multiple assignments to consecutive data
+members (see PR 93200 and 96963).

VECTOR_TYPE_P (TREE_TYPE (mref))

+FIXME: Vectorized assignments should only be present after
+vectorization so this hack is only necessary after it has
+run and could be avoided in calls from prior passes (e.g.,
+tree-ssa-strlen.c).

You could gate this on cfun->curr_properties & PROP_gimple_lvec
which is only set after vectorization.  Users can write vector
accesses using intrinsics or GCCs vector extension and I guess
warning for those is useful (and they less likely will cover multiple
fields).

Note the vectorizer also uses integer types for vector accesses
either when vectorizing using 'generic' vectors (for loads, stores
and bit operations we don't need any vector ISA) or when
composing vectors.

Note store-merging has the same issue (but fortunately runs later?)

OK with the above changes.

Thanks,
Richard.

> Martin
>
> >
> > Thanks,
> > Richard.
> >
> >
> >
> >> Martin
>


Re: [PATCH] IBM Z: Run mul-signed-overflow-*.c only on z14+

2021-03-03 Thread Ilya Leoshkevich via Gcc-patches
On Wed, 2021-03-03 at 07:50 +0100, Andreas Krebbel wrote:
> On 3/2/21 11:59 PM, Ilya Leoshkevich wrote:
> > mul-signed-overflow-*.c execution tests fail on z13, because they
> > contain z14-specific instructions.  Fix by requiring s390_z14_hw
> > target.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * gcc.target/s390/mul-signed-overflow-1.c: Run only on
> > z14+.
> > * gcc.target/s390/mul-signed-overflow-2.c: Likewise.
> 
> I did that change yesterday already.

Ah, I haven't noticed.  One difference between our patches is, though,
that I also have `dg-do compile` - this way, compile tests still run on
z13.

[...]



Re: [PATCH] gcov: use mmap pools for KVP.

2021-03-03 Thread Jan Hubicka
> Hello.
> 
> AS mentioned here, https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97461#c25, I 
> like
> what Richard suggested. So instead of usage of malloc, we should use mmap 
> memory
> chunks that serve as a memory pool for struct gcov_kvp.
> 
> Malloc is used as a fallback when mmap is not available. I also drop 
> statically
> pre-allocated static pool, mmap solves the root problem.
> 
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> 
> Ready to be installed?
> Thanks,
> Martin
> 
> gcc/ChangeLog:
> 
>   PR gcov-profile/97461
>   * gcov-io.h (GCOV_PREALLOCATED_KVP): Remove.
> 
> libgcc/ChangeLog:
> 
>   PR gcov-profile/97461
>   * config.in: Regenerate.
>   * configure: Likewise.
>   * configure.ac: Check sys/mman.h header file
>   * libgcov-driver.c (struct gcov_kvp): Remove static
>   pre-allocated pool and use a dynamic one.
>   * libgcov.h (MMAP_CHUNK_SIZE): New.
>   (gcov_counter_add): Use mmap to allocate pool for struct
>   gcov_kvp.
OK, thanks.
We should be ready to add support for non-mmap targets (especially
mingw).  

Honza
> ---
>  gcc/gcov-io.h   |  3 ---
>  libgcc/config.in|  3 +++
>  libgcc/configure|  4 ++--
>  libgcc/configure.ac |  2 +-
>  libgcc/libgcov-driver.c | 11 +++
>  libgcc/libgcov.h| 42 -
>  6 files changed, 46 insertions(+), 19 deletions(-)
> 
> diff --git a/gcc/gcov-io.h b/gcc/gcov-io.h
> index baed67609e2..75f16a274c7 100644
> --- a/gcc/gcov-io.h
> +++ b/gcc/gcov-io.h
> @@ -292,9 +292,6 @@ GCOV_COUNTERS
>  /* Maximum number of tracked TOP N value profiles.  */
>  #define GCOV_TOPN_MAXIMUM_TRACKED_VALUES 32
> -/* Number of pre-allocated gcov_kvp structures.  */
> -#define GCOV_PREALLOCATED_KVP 64
> -
>  /* Convert a counter index to a tag.  */
>  #define GCOV_TAG_FOR_COUNTER(COUNT)  \
>   (GCOV_TAG_COUNTER_BASE + ((gcov_unsigned_t)(COUNT) << 17))
> diff --git a/libgcc/config.in b/libgcc/config.in
> index 5be5321d258..f93c64a00c3 100644
> --- a/libgcc/config.in
> +++ b/libgcc/config.in
> @@ -49,6 +49,9 @@
>  /* Define to 1 if you have the  header file. */
>  #undef HAVE_SYS_AUXV_H
> +/* Define to 1 if you have the  header file. */
> +#undef HAVE_SYS_MMAN_H
> +
>  /* Define to 1 if you have the  header file. */
>  #undef HAVE_SYS_STAT_H
> diff --git a/libgcc/configure b/libgcc/configure
> index 78fc22a5784..dd3afb2c957 100755
> --- a/libgcc/configure
> +++ b/libgcc/configure
> @@ -4458,7 +4458,7 @@ as_fn_arith $ac_cv_sizeof_long_double \* 8 && 
> long_double_type_size=$as_val
>  for ac_header in inttypes.h stdint.h stdlib.h ftw.h \
>   unistd.h sys/stat.h sys/types.h \
> - string.h strings.h memory.h sys/auxv.h
> + string.h strings.h memory.h sys/auxv.h sys/mman.h
>  do :
>as_ac_Header=`$as_echo "ac_cv_header_$ac_header" | $as_tr_sh`
>  ac_fn_c_check_header_preproc "$LINENO" "$ac_header" "$as_ac_Header"
> @@ -4913,7 +4913,7 @@ case "$host" in
>  case "$enable_cet" in
>auto)
>   # Check if target supports multi-byte NOPs
> - # and if assembler supports CET insn.
> + # and if compiler and assembler support CET insn.
>   cet_save_CFLAGS="$CFLAGS"
>   CFLAGS="$CFLAGS -fcf-protection"
>   cat confdefs.h - <<_ACEOF >conftest.$ac_ext
> diff --git a/libgcc/configure.ac b/libgcc/configure.ac
> index ed50c0e9b49..10ffb046415 100644
> --- a/libgcc/configure.ac
> +++ b/libgcc/configure.ac
> @@ -224,7 +224,7 @@ AC_SUBST(long_double_type_size)
>  AC_CHECK_HEADERS(inttypes.h stdint.h stdlib.h ftw.h \
>   unistd.h sys/stat.h sys/types.h \
> - string.h strings.h memory.h sys/auxv.h)
> + string.h strings.h memory.h sys/auxv.h sys/mman.h)
>  AC_HEADER_STDC
>  # Check for decimal float support.
> diff --git a/libgcc/libgcov-driver.c b/libgcc/libgcov-driver.c
> index e474e032b54..91462350132 100644
> --- a/libgcc/libgcov-driver.c
> +++ b/libgcc/libgcov-driver.c
> @@ -588,11 +588,14 @@ struct gcov_root __gcov_root;
>  struct gcov_master __gcov_master =
>{GCOV_VERSION, 0};
> -/* Pool of pre-allocated gcov_kvp strutures.  */
> -struct gcov_kvp __gcov_kvp_pool[GCOV_PREALLOCATED_KVP];
> +/* Dynamic pool for gcov_kvp structures.  */
> +struct gcov_kvp *__gcov_kvp_dynamic_pool;
> -/* Index to first free gcov_kvp in the pool.  */
> -unsigned __gcov_kvp_pool_index;
> +/* Index into __gcov_kvp_dynamic_pool array.  */
> +unsigned __gcov_kvp_dynamic_pool_index;
> +
> +/* Size of _gcov_kvp_dynamic_pool array.  */
> +unsigned __gcov_kvp_dynamic_pool_size;
>  void
>  __gcov_exit (void)
> diff --git a/libgcc/libgcov.h b/libgcc/libgcov.h
> index b4a7e942a7e..e848811d89d 100644
> --- a/libgcc/libgcov.h
> +++ b/libgcc/libgcov.h
> @@ -45,6 +45,10 @@
>  #include "libgcc_tm.h"
>  #include "gcov.h"
> +#if HAVE_SYS_MMAN_H
> +#include 
> +#endif
> +
>  #if __CHAR_BIT__ == 8
>  typedef unsigned gcov_unsigned_t __attribute__ ((mode (SI)));
>  typedef unsigned gcov_position_t __

Re: [PATCH] profiling: fix streaming of TOPN counters

2021-03-03 Thread Jan Hubicka
> 
> libgcc/ChangeLog:
> 
>   PR gcov-profile/99105
>   * libgcov-driver.c (write_top_counters): Rename to ...
>   (write_topn_counters): ... this.
>   (write_one_data): Pre-allocate buffer for number of items
>   in the corresponding linked lists.
>   * libgcov-merge.c (__gcov_merge_topn): Use renamed function.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR gcov-profile/99105
>   * gcc.dg/tree-prof/indir-call-prof-malloc.c: Use profile
>   correction as the wrapped malloc is called one more time
>   from libgcov.

>for (unsigned i = 0; i < counters; i++)
>  {
> -  gcov_type pair_count = ci_ptr->values[GCOV_TOPN_MEM_COUNTERS * i + 1];
>gcov_write_counter (ci_ptr->values[GCOV_TOPN_MEM_COUNTERS * i]);
> -  gcov_write_counter (pair_count);
> +  gcov_write_counter (list_sizes[i]);
>gcov_type start = ci_ptr->values[GCOV_TOPN_MEM_COUNTERS * i + 2];
> +
> +  unsigned j = 0;
>for (struct gcov_kvp *node = (struct gcov_kvp *)(intptr_t)start;
> -node != NULL; node = node->next)
> +node != NULL; node = node->next, j++)
>   {
> gcov_write_counter (node->value);
> gcov_write_counter (node->count);
> +
> +   /* Stop when we reach expected number of items.  */
> +   if (j + 1 == list_sizes[i])
> + break;

Since you counted number of entries earlier, I would expect loop to
always terminate here and thus the node != NULL condition in for loop
above to be unnecesary.
>   }
>  }
> +
> +  free (list_sizes);

We already have our own mmap allocator, so I wonder if we don't want to
avoid malloc/free pair here.  The counters are per-function, right?  I
wonder if they happen to be large on some bigger project, but it may
reduct risk of user messing this up with his own malloc/free
implementation if we used alloca for counts of reasonable size.
>  }
>  
>  /* Write counters in GI_PTR and the summary in PRG to a gcda file. In
> @@ -425,7 +446,7 @@ write_one_data (const struct gcov_info *gi_ptr,
> n_counts = ci_ptr->num;
>  
> if (t_ix == GCOV_COUNTER_V_TOPN || t_ix == GCOV_COUNTER_V_INDIR)
> - write_top_counters (ci_ptr, t_ix, n_counts);
> + write_topn_counters (ci_ptr, t_ix, n_counts);
> else
>   {
> /* Do not stream when all counters are zero.  */
> diff --git a/libgcc/libgcov-merge.c b/libgcc/libgcov-merge.c
> index 7db188a4f4c..3379b688128 100644
> --- a/libgcc/libgcov-merge.c
> +++ b/libgcc/libgcov-merge.c
> @@ -109,6 +109,7 @@ __gcov_merge_topn (gcov_type *counters, unsigned 
> n_counters)
>/* First value is number of total executions of the profiler.  */
>gcov_type all = gcov_get_counter_ignore_scaling (-1);
>gcov_type n = gcov_get_counter_ignore_scaling (-1);
> +  gcc_assert (n <= GCOV_TOPN_MAXIMUM_TRACKED_VALUES);

I think in the runtime we do not want to have asserts checking
implementation correctness since it bloats it up.  So I would leave it
out.

I wonder if we can have some testcase for parallel updating/streaming in
testsuite?

Otherwise the patch looks good to me.
Honza
>  
>unsigned full = all < 0;
>gcov_type *total = &counters[GCOV_TOPN_MEM_COUNTERS * i];
> -- 
> 2.30.0
> 



Re: [x86] Fix PR target/99264

2021-03-03 Thread Eric Botcazou
> LGTM.

Thanks.  However, I overlooked an issue with pathologically large frames 
(larger than SEH_MAX_FRAME_SIZE, i.e. 2GB and for which we cannot emit CFI 
directives) that is visible in the gnat.dg testsuite under the form of an ICE.

Tested on x86_64-w64-mingw32, applied on mainline/10/9 branches as obvious.


2021-03-03 Eric Botcazou  

PR target/99234
* config/i386/i386.c (ix86_compute_frame_layout): For a SEH target,
point back the hard frame pointer to its default location when the
frame is larger than SEH_MAX_FRAME_SIZE.

-- 
Eric Botcazoudiff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index e85b06b6824..6672da597ba 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -6660,7 +6660,8 @@ ix86_compute_frame_layout (void)
   frame->hard_frame_pointer_offset = frame->sse_reg_save_offset;
 
   /* If we can leave the frame pointer where it is, do so.  Also, return
-	 the establisher frame for __builtin_frame_address (0).  */
+	 the establisher frame for __builtin_frame_address (0) or else if the
+	 frame overflows the SEH maximum frame size.  */
   const HOST_WIDE_INT diff
 	= frame->stack_pointer_offset - frame->hard_frame_pointer_offset;
   if (diff <= 255)
@@ -6678,6 +6679,8 @@ ix86_compute_frame_layout (void)
 	 frame that is addressable with 8-bit offsets.  */
 	  frame->hard_frame_pointer_offset = frame->stack_pointer_offset - 128;
 	}
+  else
+	frame->hard_frame_pointer_offset = frame->hfp_save_offset;
 }
 }
 


[Patch] Fortran: Fix -freal-{4,8}-real* handling [PR99355]

2021-03-03 Thread Tobias Burnus

Rather obvious change – don't apply replacement multiple times.

OK for mainline?

Tobias

-
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München 
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank 
Thürauf
Fortran: Fix -freal-{4,8}-real* handling [PR99355]

Avoid chain kind conversion for, e.g., -freal-4-real-8 -freal-8-real-10.
Note that gfc_default_double_kind/gfc_default_double_kind already
honors the -freal flags.

gcc/fortran/ChangeLog:

	PR fortran/99355
	* decl.c (gfc_match_old_kind_spec, gfc_match_kind_spec): Avoid
	redoing kind conversions.
	* primary.c (match_real_constant): Likewise.

gcc/testsuite/ChangeLog:

	PR fortran/99355
	* gfortran.dg/real4-10-real8-10.f90: New test.
	* gfortran.dg/real4-10-real8-16.f90: New test.
	* gfortran.dg/real4-10-real8-4.f90: New test.
	* gfortran.dg/real4-10.f90: New test.
	* gfortran.dg/real4-16-real8-10.f90: New test.
	* gfortran.dg/real4-16-real8-16.f90: New test.
	* gfortran.dg/real4-16-real8-4.f90: New test.
	* gfortran.dg/real4-16.f90: New test.
	* gfortran.dg/real4-8-real8-10.f90: New test.
	* gfortran.dg/real4-8-real8-16.f90: New test.
	* gfortran.dg/real4-8-real8-4.f90: New test.
	* gfortran.dg/real4-8.f90: New test.
	* gfortran.dg/real8-10.f90: New test.
	* gfortran.dg/real8-16.f90: New test.
	* gfortran.dg/real8-4.f90: New test.

 gcc/fortran/decl.c  |  6 ++--
 gcc/fortran/primary.c   | 40 -
 gcc/testsuite/gfortran.dg/real4-10-real8-10.f90 | 23 ++
 gcc/testsuite/gfortran.dg/real4-10-real8-16.f90 | 24 +++
 gcc/testsuite/gfortran.dg/real4-10-real8-4.f90  | 23 ++
 gcc/testsuite/gfortran.dg/real4-10.f90  | 23 ++
 gcc/testsuite/gfortran.dg/real4-16-real8-10.f90 | 24 +++
 gcc/testsuite/gfortran.dg/real4-16-real8-16.f90 | 24 +++
 gcc/testsuite/gfortran.dg/real4-16-real8-4.f90  | 24 +++
 gcc/testsuite/gfortran.dg/real4-16.f90  | 24 +++
 gcc/testsuite/gfortran.dg/real4-8-real8-10.f90  | 23 ++
 gcc/testsuite/gfortran.dg/real4-8-real8-16.f90  | 24 +++
 gcc/testsuite/gfortran.dg/real4-8-real8-4.f90   | 23 ++
 gcc/testsuite/gfortran.dg/real4-8.f90   | 23 ++
 gcc/testsuite/gfortran.dg/real8-10.f90  | 23 ++
 gcc/testsuite/gfortran.dg/real8-16.f90  | 24 +++
 gcc/testsuite/gfortran.dg/real8-4.f90   | 23 ++
 17 files changed, 354 insertions(+), 44 deletions(-)

diff --git a/gcc/fortran/decl.c b/gcc/fortran/decl.c
index 723915822f3..947e4f868a1 100644
--- a/gcc/fortran/decl.c
+++ b/gcc/fortran/decl.c
@@ -3066,8 +3066,7 @@ gfc_match_old_kind_spec (gfc_typespec *ts)
 	  if (flag_real4_kind == 16)
 	ts->kind = 16;
 	}
-
-  if (ts->kind == 8)
+  else if (ts->kind == 8)
 	{
 	  if (flag_real8_kind == 4)
 	ts->kind = 4;
@@ -3246,8 +3245,7 @@ close_brackets:
 	  if (flag_real4_kind == 16)
 	ts->kind = 16;
 	}
-
-  if (ts->kind == 8)
+  else if (ts->kind == 8)
 	{
 	  if (flag_real8_kind == 4)
 	ts->kind = 4;
diff --git a/gcc/fortran/primary.c b/gcc/fortran/primary.c
index 7633e77909f..9dd1a86f275 100644
--- a/gcc/fortran/primary.c
+++ b/gcc/fortran/primary.c
@@ -676,26 +676,6 @@ done:
 	  goto cleanup;
 	}
   kind = gfc_default_double_kind;
-
-  if (kind == 4)
-	{
-	  if (flag_real4_kind == 8)
-	kind = 8;
-	  if (flag_real4_kind == 10)
-	kind = 10;
-	  if (flag_real4_kind == 16)
-	kind = 16;
-	}
-
-  if (kind == 8)
-	{
-	  if (flag_real8_kind == 4)
-	kind = 4;
-	  if (flag_real8_kind == 10)
-	kind = 10;
-	  if (flag_real8_kind == 16)
-	kind = 16;
-	}
   break;
 
 case 'q':
@@ -726,26 +706,6 @@ done:
   if (kind == -2)
 	kind = gfc_default_real_kind;
 
-  if (kind == 4)
-	{
-	  if (flag_real4_kind == 8)
-	kind = 8;
-	  if (flag_real4_kind == 10)
-	kind = 10;
-	  if (flag_real4_kind == 16)
-	kind = 16;
-	}
-
-  if (kind == 8)
-	{
-	  if (flag_real8_kind == 4)
-	kind = 4;
-	  if (flag_real8_kind == 10)
-	kind = 10;
-	  if (flag_real8_kind == 16)
-	kind = 16;
-	}
-
   if (gfc_validate_kind (BT_REAL, kind, true) < 0)
 	{
 	  gfc_error ("Invalid real kind %d at %C", kind);
diff --git a/gcc/testsuite/gfortran.dg/real4-10-real8-10.f90 b/gcc/testsuite/gfortran.dg/real4-10-real8-10.f90
new file mode 100644
index 000..af4f1b2b4e6
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/real4-10-real8-10.f90
@@ -0,0 +1,23 @@
+! { dg-do run { target i?86-*-* x86_64-*-* } }
+! { dg-additional-options "-w -freal-4-real-10 -freal-8-real-10" }
+!
+! PR fortran/99355
+!
+
+program test
+  real :: r1
+  real*4:: r2
+  real(4) :: r3
+  real(selected_real_kind(p=6)) :: r4
+
+  double precision :: d1
+  real*8 :: d2
+  real(8) :: d3
+  real(kind(1.d0)) :: d4
+  real(selected_real_kind(p=15)) :: d5
+
+  !print '(tr3,a10,10(tr1,i2))',

*PING* — Re: Fortran: Create CLASS(*) early to avoid ICE [PR99254]

2021-03-03 Thread Tobias Burnus

*PING*

I think the patch is an improvement, even though there is a small
regression and many issues are not covered: PR fortran/99266 and as
explained below.

On 25.02.21 12:16, Tobias Burnus wrote:

The issue is that for CLASS – and in particular CLASS(*)
the ts.u.derived->components is not yet created when it is
accessed. – PR99138 shows a similar case (unfixed!
See comment 5 and initial report for two examples).

There is an issue for code like:
  class(*) var
  dimension :: var(..)
that the attribute is not properly encoded in the class.
That's independent of this patch, cf. PR fortran/99266.

NOTE: This patch causes a regression:
  subroutine foo()
class(*) :: x
allocatable :: x
is now rejected with:
  CLASS variable ‘x’ at (1) must be dummy, allocatable or pointer

HOWEVER: While that was accepted before,
  subroutine foo(dummy)
class(*) :: dummy
allocatable :: dummy
failed before and still fails. (→ class_ok handling in
gfc_build_class_symbol).

All those issues are tracked in that PR fortran/99266.

OK for the trunk? What about GCC 10??

I am unsure about GCC 10 due to the new regression;
the issue (PR99254) was reported as new ICE between
20190825 and 20190901.

Tobias

PS: I still do not like the way we represent CLASS internally,
but I think I had complained about this already years ago...
Can proponents of the current way please look at
PR fortran/99266 ?


-
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München 
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank 
Thürauf


c++: namespace reachability [PR 99344]

2021-03-03 Thread Nathan Sidwell

This reworks namespace serializing to avoid some issues I ran into
when working on 99170.  In modules, (non-anonymous) namespaces are
strange beasts, that always have external linkage, but may have
module-specific visibility.  I still don't get the latter 100%
correct, but this is in the right direction.

PR c++/99344
gcc/cp/
* module.cc (trees_out::decl_node): Small refactor.
(depset::hash::add_binding_entity): Return true on meeting an
import.  Set namespace's import here.
(module_state:write_namespaces): Inform of purview too.
(module_state:read_namespaces): Adjust.
* name-lookup.c (implicitly_export_namespace): Delete.
(do_pushdecl): Don't call it.
(push_namespace): Likewise, set purview.
(add_imported_namespace): Reorder parms.
* name-lookup.h (add_imported_namespace): Alter param ordering.
gcc/testsuite/
* g++.dg/modules/namespace-2_a.C
* g++.dg/modules/pr99344_a.C
* g++.dg/modules/pr99344_b.C

--
Nathan Sidwell
diff --git c/gcc/cp/module.cc w/gcc/cp/module.cc
index 369a02604fe..31172824fdd 100644
--- c/gcc/cp/module.cc
+++ w/gcc/cp/module.cc
@@ -2377,8 +2377,10 @@ public:
   }
 
 public:
+  /* This class-member is defined here, but the class was imported.  */
   bool is_member () const
   {
+gcc_checking_assert (get_entity_kind () == EK_DECL);
 return get_flag_bit ();
   }
 public:
@@ -8613,12 +8615,13 @@ trees_out::decl_node (tree decl, walk_kind ref)
   else if (TREE_CODE (ctx) != FUNCTION_DECL
 	   || TREE_CODE (decl) == TEMPLATE_DECL
 	   || (dep_hash->sneakoscope && DECL_IMPLICIT_TYPEDEF_P (decl))
-	   || (DECL_LANG_SPECIFIC (decl)
-	   && DECL_MODULE_IMPORT_P (decl)))
-dep = dep_hash->add_dependency (decl,
-TREE_CODE (decl) == NAMESPACE_DECL
-&& !DECL_NAMESPACE_ALIAS (decl)
-? depset::EK_NAMESPACE : depset::EK_DECL);
+	   || (DECL_LANG_SPECIFIC (decl) && DECL_MODULE_IMPORT_P (decl)))
+{
+  auto kind = (TREE_CODE (decl) == NAMESPACE_DECL
+		   && !DECL_NAMESPACE_ALIAS (decl)
+		   ? depset::EK_NAMESPACE : depset::EK_DECL);
+  dep = dep_hash->add_dependency (decl, kind);
+}
 
   if (!dep)
 {
@@ -12751,12 +12754,14 @@ struct add_binding_data
   bool met_namespace;
 };
 
+/* Return true if we are, or contain something that is exported.  */
+
 bool
 depset::hash::add_binding_entity (tree decl, WMB_Flags flags, void *data_)
 {
   auto data = static_cast  (data_);
 
-  if (TREE_CODE (decl) != NAMESPACE_DECL || DECL_NAMESPACE_ALIAS (decl))
+  if (!(TREE_CODE (decl) == NAMESPACE_DECL && !DECL_NAMESPACE_ALIAS (decl)))
 {
   tree inner = decl;
 
@@ -12811,7 +12816,7 @@ depset::hash::add_binding_entity (tree decl, WMB_Flags flags, void *data_)
 		d->clear_hidden_binding ();
 		  if (flags & WMB_Export)
 		OVL_EXPORT_P (d->get_entity ()) = true;
-		  return false;
+		  return bool (flags & WMB_Export);
 		}
 	}
 	}
@@ -12857,30 +12862,37 @@ depset::hash::add_binding_entity (tree decl, WMB_Flags flags, void *data_)
   /* Binding and contents are mutually dependent.  */
   dep->deps.safe_push (data->binding);
 
-  return true;
+  return (flags & WMB_Using
+	  ? flags & WMB_Export : DECL_MODULE_EXPORT_P (decl));
 }
   else if (DECL_NAME (decl) && !data->met_namespace)
 {
   /* Namespace, walk exactly once.  */
   gcc_checking_assert (TREE_PUBLIC (decl));
   data->met_namespace = true;
-  if (data->hash->add_namespace_entities (decl, data->partitions)
-	  || DECL_MODULE_EXPORT_P (decl))
+  if (data->hash->add_namespace_entities (decl, data->partitions))
+	{
+	  /* It contains an exported thing, so it is exported.  */
+	  gcc_checking_assert (DECL_MODULE_PURVIEW_P (decl));
+	  DECL_MODULE_EXPORT_P (decl) = true;
+	}
+
+  if (DECL_MODULE_PURVIEW_P (decl))
 	{
 	  data->hash->make_dependency (decl, depset::EK_NAMESPACE);
-	  return true;
+
+	  return DECL_MODULE_EXPORT_P (decl);
 	}
 }
 
   return false;
 }
 
-/* Recursively find all the namespace bindings of NS.
-   Add a depset for every binding that contains an export or
-   module-linkage entity.  Add a defining depset for every such decl
-   that we need to write a definition.  Such defining depsets depend
-   on the binding depset.  Returns true if we contain something
-   explicitly exported.  */
+/* Recursively find all the namespace bindings of NS.  Add a depset
+   for every binding that contains an export or module-linkage entity.
+   Add a defining depset for every such decl that we need to write a
+   definition.  Such defining depsets depend on the binding depset.
+   Returns true if we contain something exported.  */
 
 bool
 depset::hash::add_namespace_entities (tree ns, bitmap partitions)
@@ -15088,36 +15100,29 @@ module_state::write_namespaces (elf_out *to, vec spaces,
   tree ns = b->get_entity ();
 
   gcc_checking_assert (TREE_CODE (ns) == NAMESPACE_DECL);
+  /* P1815 ma

[PATCH] testsuite: aarch64: Add tests for narrowing-arithmetic intrinsics

2021-03-03 Thread Jonathan Wright via Gcc-patches
Hi,

As subject, this patch adds tests for v[r]addhn_high and v[r]subhn_high Neon
intrinsics. Since these intrinsics are only supported for AArch64, these tests
are restricted to only run on AArch64 targets.

Ok for master?

Thanks,
Jonathan

---

gcc/testsuite/ChangeLog:

2021-03-02  Jonathan Wright  

* gcc.target/aarch64/advsimd-intrinsics/vXXXhn_high.inc:
New test template.
* gcc.target/aarch64/advsimd-intrinsics/vaddhn_high.c:
New test.
* gcc.target/aarch64/advsimd-intrinsics/vraddhn_high.c:
New test.
* gcc.target/aarch64/advsimd-intrinsics/vrsubhn_high.c:
New test.
* gcc.target/aarch64/advsimd-intrinsics/vsubhn_high.c:
New test.
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vXXXhn_high.inc b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vXXXhn_high.inc
new file mode 100644
index ..e77e84520139069a90cb5d62046744eaf14ff195
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vXXXhn_high.inc
@@ -0,0 +1,65 @@
+#define FNNAME1(NAME) exec_ ## NAME
+#define FNNAME(NAME) FNNAME1(NAME)
+
+void FNNAME (INSN_NAME) (void)
+{
+  /* Basic test: v128_r=vXXXhn_high(v64_r, v128_a, v128_b), store result.  */
+#define TEST_VXXXHN_HIGH1(INSN, T1, T2, W1, W2, N1, N2)\
+  VECT_VAR(v128_r, T1, W2, N2) = INSN##_##T2##W1(VECT_VAR(v64_r, T1, W2, N1),	\
+		 VECT_VAR(v128_a, T1, W1, N1),	\
+		 VECT_VAR(v128_b, T1, W1, N1));	\
+  vst1q_##T2##W2(VECT_VAR(result, T1, W2, N2), VECT_VAR(v128_r, T1, W2, N2))
+
+#define TEST_VXXXHN_HIGH(INSN, T1, T2, W1, W2, N1, N2)\
+  TEST_VXXXHN_HIGH1(INSN, T1, T2, W1, W2, N1, N2)
+
+  DECL_VARIABLE_128BITS_VARIANTS(v128_r);
+  DECL_VARIABLE_64BITS_VARIANTS(v64_r);
+  DECL_VARIABLE_128BITS_VARIANTS(v128_a);
+  DECL_VARIABLE_128BITS_VARIANTS(v128_b);
+
+  clean_results ();
+
+  /* Fill v64_r with a value easy to recognise in the result vector. */
+  VDUP(v64_r, , int, s, 8, 8, 0x5);
+  VDUP(v64_r, , int, s, 16, 4, 0x5);
+  VDUP(v64_r, , int, s, 32, 2, 0x5);
+  VDUP(v64_r, , uint, u, 8, 8, 0x5);
+  VDUP(v64_r, , uint, u, 16, 4, 0x5);
+  VDUP(v64_r, , uint, u, 32, 2, 0x5);
+
+  /* Fill input v128_a and v128_b with arbitrary values. */
+  VDUP(v128_a, q, int, s, 16, 8, 50*(UINT8_MAX+1));
+  VDUP(v128_a, q, int, s, 32, 4, 50*(UINT16_MAX+1));
+  VDUP(v128_a, q, int, s, 64, 2, 24*((uint64_t)UINT32_MAX+1));
+  VDUP(v128_a, q, uint, u, 16, 8, 3*(UINT8_MAX+1));
+  VDUP(v128_a, q, uint, u, 32, 4, 55*(UINT16_MAX+1));
+  VDUP(v128_a, q, uint, u, 64, 2, 3*((uint64_t)UINT32_MAX+1));
+
+  VDUP(v128_b, q, int, s, 16, 8, (uint16_t)UINT8_MAX);
+  VDUP(v128_b, q, int, s, 32, 4, (uint32_t)UINT16_MAX);
+  VDUP(v128_b, q, int, s, 64, 2, (uint64_t)UINT32_MAX);
+  VDUP(v128_b, q, uint, u, 16, 8, (uint16_t)UINT8_MAX);
+  VDUP(v128_b, q, uint, u, 32, 4, (uint32_t)UINT16_MAX);
+  VDUP(v128_b, q, uint, u, 64, 2, (uint64_t)UINT32_MAX);
+
+  TEST_VXXXHN_HIGH(INSN_NAME, int, s, 16, 8, 8, 16);
+  TEST_VXXXHN_HIGH(INSN_NAME, int, s, 32, 16, 4, 8);
+  TEST_VXXXHN_HIGH(INSN_NAME, int, s, 64, 32, 2, 4);
+  TEST_VXXXHN_HIGH(INSN_NAME, uint, u, 16, 8, 8, 16);
+  TEST_VXXXHN_HIGH(INSN_NAME, uint, u, 32, 16, 4, 8);
+  TEST_VXXXHN_HIGH(INSN_NAME, uint, u, 64, 32, 2, 4);
+
+  CHECK(TEST_MSG, int, 8, 16, PRIx8, expected, "");
+  CHECK(TEST_MSG, int, 16, 8, PRIx16, expected, "");
+  CHECK(TEST_MSG, int, 32, 4, PRIx32, expected, "");
+  CHECK(TEST_MSG, uint, 8, 16, PRIx8, expected, "");
+  CHECK(TEST_MSG, uint, 16, 8, PRIx16, expected, "");
+  CHECK(TEST_MSG, uint, 32, 4, PRIx32, expected, "");
+}
+
+int main (void)
+{
+  FNNAME (INSN_NAME) ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vaddhn_high.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vaddhn_high.c
new file mode 100644
index ..329dd494f8b2cd3b9c64187278b55107651ea05a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vaddhn_high.c
@@ -0,0 +1,32 @@
+/* { dg-skip-if "" { arm*-*-* } } */
+
+#include 
+#include "arm-neon-ref.h"
+#include "compute-ref-data.h"
+
+#if defined(__cplusplus)
+#include 
+#else
+#include 
+#endif
+
+#define INSN_NAME vaddhn_high
+#define TEST_MSG "VADDHN_HIGH"
+
+/* Expected results.  */
+VECT_VAR_DECL(expected, int, 8, 16) [] = { 0x5, 0x5, 0x5, 0x5,
+   0x5, 0x5, 0x5, 0x5,
+	   0x32, 0x32, 0x32, 0x32,
+	   0x32, 0x32, 0x32, 0x32 };
+VECT_VAR_DECL(expected, int, 16, 8) [] = { 0x5, 0x5, 0x5, 0x5,
+	   0x32, 0x32, 0x32, 0x32 };
+VECT_VAR_DECL(expected, int, 32, 4) [] = { 0x5, 0x5, 0x18, 0x18 };
+VECT_VAR_DECL(expected, uint, 8, 16) [] = { 0x5, 0x5, 0x5, 0x5,
+	0x5, 0x5, 0x5, 0x5,
+	0x3, 0x3, 0x3, 0x3,
+	0x3, 0x3, 0x3, 0x3 };
+VECT_VAR_DECL(expected, uint, 16, 8) [] = { 0x5, 0x5, 0x5, 0x5,
+	0x37, 0x37, 0x37, 0x37 };
+VECT_VAR_DECL(expected, uint, 32, 4) [] = { 0x5, 0x5, 0x3, 0x3 };
+
+#include "vXXXhn_high.inc"
diff --git a/gc

[PATCH] testsuite: aarch64: Add tests for v[r]shrn_high intrinsics

2021-03-03 Thread Jonathan Wright via Gcc-patches
Hi,

As subject, this patch adds tests for v[r]shrn_high Neon intrinsics. Since
these intrinsics are only supported for AArch64, these tests are restricted
to only run on AArch64 targets.

Ok for master?

Thanks,
Jonathan

---

gcc/testsuite/ChangeLog:

2021-03-02  Jonathan Wright  

* gcc.target/aarch64/advsimd-intrinsics/vrshrn_high_n.c:
New test.
* gcc.target/aarch64/advsimd-intrinsics/vshrn_high_n.c:
New test.
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vrshrn_high_n.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vrshrn_high_n.c
new file mode 100644
index ..b570ddccde9c7cc9a22c0e23d2a852b78d9dc12a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vrshrn_high_n.c
@@ -0,0 +1,177 @@
+/* { dg-skip-if "" { arm*-*-* } } */
+
+#include 
+#include "arm-neon-ref.h"
+#include "compute-ref-data.h"
+
+/* Expected results with input=0.  */
+VECT_VAR_DECL(expected_0, int, 8, 16) [] = { 0x5, 0x5, 0x5, 0x5,
+	 0x5, 0x5, 0x5, 0x5,
+	 0x0, 0x0, 0x0, 0x0,
+	 0x0, 0x0, 0x0, 0x0 };
+VECT_VAR_DECL(expected_0, int, 16, 8) [] = { 0x5, 0x5, 0x5, 0x5,
+	 0x0, 0x0, 0x0, 0x0 };
+VECT_VAR_DECL(expected_0, int, 32, 4) [] = { 0x5, 0x5, 0x0, 0x0 };
+VECT_VAR_DECL(expected_0, uint, 8, 16) [] = { 0x5, 0x5, 0x5, 0x5,
+	  0x5, 0x5, 0x5, 0x5,
+	  0x0, 0x0, 0x0, 0x0,
+	  0x0, 0x0, 0x0, 0x0 };
+VECT_VAR_DECL(expected_0, uint, 16, 8) [] = { 0x5, 0x5, 0x5, 0x5,
+	  0x0, 0x0, 0x0, 0x0 };
+VECT_VAR_DECL(expected_0, uint, 32, 4) [] = { 0x5, 0x5, 0x0, 0x0 };
+
+/* Expected results.  */
+VECT_VAR_DECL(expected, int, 8, 16) [] = { 0x5, 0x5, 0x5, 0x5,
+	   0x5, 0x5, 0x5, 0x5,
+	   0xf8, 0xf9, 0xf9, 0xfa,
+	   0xfa, 0xfb, 0xfb, 0xfc };
+VECT_VAR_DECL(expected, int, 16, 8) [] = { 0x5, 0x5, 0x5, 0x5,
+	   0xfff8, 0xfff9, 0xfff9, 0xfffa };
+VECT_VAR_DECL(expected, int, 32, 4) [] = { 0x5, 0x5, 0xfffc, 0xfffc };
+VECT_VAR_DECL(expected, uint, 8, 16) [] = { 0x5, 0x5, 0x5, 0x5,
+	0x5, 0x5, 0x5, 0x5,
+	0xfc, 0xfc, 0xfd, 0xfd,
+	0xfd, 0xfd, 0xfe, 0xfe };
+VECT_VAR_DECL(expected, uint, 16, 8) [] = { 0x5, 0x5, 0x5, 0x5,
+	0xfffe, 0xfffe, 0xfffe, 0xfffe };
+VECT_VAR_DECL(expected, uint, 32, 4) [] = { 0x5, 0x5, 0xfffe, 0xfffe };
+
+/* Expected results with large shift amount.  */
+VECT_VAR_DECL(expected_sh_large, int, 8, 16) [] = { 0x5, 0x5, 0x5, 0x5,
+		0x5, 0x5, 0x5, 0x5,
+		0x0, 0x0, 0x0, 0x0,
+		0x0, 0x0, 0x0, 0x0 };
+VECT_VAR_DECL(expected_sh_large, int, 16, 8) [] = { 0x5, 0x5, 0x5, 0x5,
+		0x0, 0x0, 0x0, 0x0 };
+VECT_VAR_DECL(expected_sh_large, int, 32, 4) [] = { 0x5, 0x5, 0x0, 0x0 };
+VECT_VAR_DECL(expected_sh_large, uint, 8, 16) [] = { 0x5, 0x5, 0x5, 0x5,
+		 0x5, 0x5, 0x5, 0x5,
+		 0x0, 0x0, 0x0, 0x0,
+		 0x0, 0x0, 0x0, 0x0 };
+VECT_VAR_DECL(expected_sh_large, uint, 16, 8) [] = { 0x5, 0x5, 0x5, 0x5,
+		 0x0, 0x0, 0x0, 0x0 };
+VECT_VAR_DECL(expected_sh_large, uint, 32, 4) [] = { 0x5, 0x5, 0x0, 0x0 };
+
+#define TEST_MSG "VRSHRN_HIGH_N"
+void exec_vrshrn_high_n (void)
+{
+  /* Basic test: y=vrshrn_high_n(r,x,v), then store the result.  */
+#define TEST_VRSHRN_HIGH_N(T1, T2, W1, W2, N1, N2, V)\
+  VECT_VAR(vector_res, T1, W2, N2) =		\
+vrshrn_high_n_##T2##W1(VECT_VAR(vector_res_lo, T1, W2, N1),			\
+			   VECT_VAR(vector, T1, W1, N1),			\
+			   V);			\
+  vst1q_##T2##W2(VECT_VAR(result, T1, W2, N2), VECT_VAR(vector_res, T1, W2, N2))
+
+  DECL_VARIABLE(vector_res_lo, int, 8, 8);
+  DECL_VARIABLE(vector_res_lo, int, 16, 4);
+  DECL_VARIABLE(vector_res_lo, int, 32, 2);
+  DECL_VARIABLE(vector_res_lo, uint, 8, 8);
+  DECL_VARIABLE(vector_res_lo, uint, 16, 4);
+  DECL_VARIABLE(vector_res_lo, uint, 32, 2);
+
+  DECL_VARIABLE(vector, int, 16, 8);
+  DECL_VARIABLE(vector, int, 32, 4);
+  DECL_VARIABLE(vector, int, 64, 2);
+  DECL_VARIABLE(vector, uint, 16, 8);
+  DECL_VARIABLE(vector, uint, 32, 4);
+  DECL_VARIABLE(vector, uint, 64, 2);
+
+  DECL_VARIABLE(vector_res, int, 8, 16);
+  DECL_VARIABLE(vector_res, int, 16, 8);
+  DECL_VARIABLE(vector_res, int, 32, 4);
+  DECL_VARIABLE(vector_res, uint, 8, 16);
+  DECL_VARIABLE(vector_res, uint, 16, 8);
+  DECL_VARIABLE(vector_res, uint, 32, 4);
+
+  clean_results ();
+
+  /* Fill vector_res_lo with a value easy to recognise in the result vector. */
+  VDUP(vector_res_lo, , int, s, 8, 8, 0x5);
+  VDUP(vector_res_lo, , int, s, 16, 4, 0x5);
+  VDUP(vector_res_lo, , int, s, 32, 2, 0x5);
+  VDUP(vector_res_lo, , uint, u, 8, 8, 0x5);
+  VDUP(vector_res_lo, , uint, u, 16, 4, 0x5);
+  VDUP(vector_res_lo, , uint, u, 32, 2, 0x5);
+
+  /* Fill input vector with 0, to check behavior on limits.  */
+  VDUP(vector, q, int, s, 16, 8, 0);
+  VDUP(vector, q, int, s, 32, 4, 0);
+  VDUP(vector, q, int, s, 64, 2, 0);
+  VDUP(vector, q, uint, u, 16, 8, 0);
+  VDUP(vector, q, uint, u, 32, 4, 0);
+  VDUP(vector, q, u

[PATCH] libstdc++: Update Solaris baselines for GCC 11.1

2021-03-03 Thread Rainer Orth
The following patch updates the Solaris baselines for GCC 11.1.  There's
only one caveat: comparing the Solaris 11.3 and 11.4 baselines, I find

+FUNC:_ZSt10from_charsPKcS0_RdSt12chars_format@@GLIBCXX_3.4.29
+FUNC:_ZSt10from_charsPKcS0_ReSt12chars_format@@GLIBCXX_3.4.29
+FUNC:_ZSt10from_charsPKcS0_RfSt12chars_format@@GLIBCXX_3.4.29

i.e.

std::from_chars(char const*, char const*, double&, std::chars_format)

and similarly for long double, float.  Those are from from
src/c++17/floating_from_chars.cc and only defined if
_GLIBCXX_HAVE_USELOCALE, i.e. depend on the XPG7 addition.  Given that
only Solaris 11.4 supports XPG7, I've taken the 11.3 baselines to avoid
having separate ones for 11.3 and 11.4.

Tested on i386-pc-solaris2.11 and sparc-sun-solaris2.11 (sparc and x86,
32 and 64-bit, 11.3 and 11.4).

Ok for master?

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2021-02-10  Rainer Orth  

libstdc++-v3:
* config/abi/post/i386-solaris/baseline_symbols.txt: Regenerate.
* config/abi/post/i386-solaris/amd64/baseline_symbols.txt:
Likewise.
* config/abi/post/sparc-solaris/baseline_symbols.txt: Likewise.
* config/abi/post/sparc-solaris/sparcv9/baseline_symbols.txt:
Likewise.

# HG changeset patch
# Parent  f1e2cd4176dc0123a8f5de36494b007f401b27d1
libstdc++: Update Solaris baselines for GCC 11.1

diff --git a/libstdc++-v3/config/abi/post/i386-solaris/amd64/baseline_symbols.txt b/libstdc++-v3/config/abi/post/i386-solaris/amd64/baseline_symbols.txt
--- a/libstdc++-v3/config/abi/post/i386-solaris/amd64/baseline_symbols.txt
+++ b/libstdc++-v3/config/abi/post/i386-solaris/amd64/baseline_symbols.txt
@@ -194,6 +194,14 @@ FUNC:_ZNK11__gnu_debug16_Error_formatter
 FUNC:_ZNK11__gnu_debug16_Error_formatter8_M_errorEv@@GLIBCXX_3.4
 FUNC:_ZNK11__gnu_debug19_Safe_iterator_base11_M_singularEv@@GLIBCXX_3.4
 FUNC:_ZNK11__gnu_debug19_Safe_iterator_base14_M_can_compareERKS0_@@GLIBCXX_3.4
+FUNC:_ZNKRSt7__cxx1115basic_stringbufIcSt11char_traitsIcESaIcEE3strEv@@GLIBCXX_3.4.29
+FUNC:_ZNKRSt7__cxx1115basic_stringbufIwSt11char_traitsIwESaIwEE3strEv@@GLIBCXX_3.4.29
+FUNC:_ZNKRSt7__cxx1118basic_stringstreamIcSt11char_traitsIcESaIcEE3strEv@@GLIBCXX_3.4.29
+FUNC:_ZNKRSt7__cxx1118basic_stringstreamIwSt11char_traitsIwESaIwEE3strEv@@GLIBCXX_3.4.29
+FUNC:_ZNKRSt7__cxx1119basic_istringstreamIcSt11char_traitsIcESaIcEE3strEv@@GLIBCXX_3.4.29
+FUNC:_ZNKRSt7__cxx1119basic_istringstreamIwSt11char_traitsIwESaIwEE3strEv@@GLIBCXX_3.4.29
+FUNC:_ZNKRSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEE3strEv@@GLIBCXX_3.4.29
+FUNC:_ZNKRSt7__cxx1119basic_ostringstreamIwSt11char_traitsIwESaIwEE3strEv@@GLIBCXX_3.4.29
 FUNC:_ZNKSbIwSt11char_traitsIwESaIwEE11_M_disjunctEPKw@@GLIBCXX_3.4.5
 FUNC:_ZNKSbIwSt11char_traitsIwESaIwEE12find_last_ofEPKwm@@GLIBCXX_3.4
 FUNC:_ZNKSbIwSt11char_traitsIwESaIwEE12find_last_ofEPKwmm@@GLIBCXX_3.4
@@ -841,19 +849,29 @@ FUNC:_ZNKSt7__cxx1112basic_stringIwSt11c
 FUNC:_ZNKSt7__cxx1112basic_stringIwSt11char_traitsIwESaIwEE8max_sizeEv@@GLIBCXX_3.4.21
 FUNC:_ZNKSt7__cxx1112basic_stringIwSt11char_traitsIwESaIwEEcvSt17basic_string_viewIwS2_EEv@@GLIBCXX_3.4.26
 FUNC:_ZNKSt7__cxx1112basic_stringIwSt11char_traitsIwESaIwEEixEm@@GLIBCXX_3.4.21
+FUNC:_ZNKSt7__cxx1115basic_stringbufIcSt11char_traitsIcESaIcEE13get_allocatorEv@@GLIBCXX_3.4.29
 FUNC:_ZNKSt7__cxx1115basic_stringbufIcSt11char_traitsIcESaIcEE3strEv@@GLIBCXX_3.4.21
+FUNC:_ZNKSt7__cxx1115basic_stringbufIcSt11char_traitsIcESaIcEE4viewEv@@GLIBCXX_3.4.29
+FUNC:_ZNKSt7__cxx1115basic_stringbufIwSt11char_traitsIwESaIwEE13get_allocatorEv@@GLIBCXX_3.4.29
 FUNC:_ZNKSt7__cxx1115basic_stringbufIwSt11char_traitsIwESaIwEE3strEv@@GLIBCXX_3.4.21
+FUNC:_ZNKSt7__cxx1115basic_stringbufIwSt11char_traitsIwESaIwEE4viewEv@@GLIBCXX_3.4.29
 FUNC:_ZNKSt7__cxx1118basic_stringstreamIcSt11char_traitsIcESaIcEE3strEv@@GLIBCXX_3.4.21
+FUNC:_ZNKSt7__cxx1118basic_stringstreamIcSt11char_traitsIcESaIcEE4viewEv@@GLIBCXX_3.4.29
 FUNC:_ZNKSt7__cxx1118basic_stringstreamIcSt11char_traitsIcESaIcEE5rdbufEv@@GLIBCXX_3.4.21
 FUNC:_ZNKSt7__cxx1118basic_stringstreamIwSt11char_traitsIwESaIwEE3strEv@@GLIBCXX_3.4.21
+FUNC:_ZNKSt7__cxx1118basic_stringstreamIwSt11char_traitsIwESaIwEE4viewEv@@GLIBCXX_3.4.29
 FUNC:_ZNKSt7__cxx1118basic_stringstreamIwSt11char_traitsIwESaIwEE5rdbufEv@@GLIBCXX_3.4.21
 FUNC:_ZNKSt7__cxx1119basic_istringstreamIcSt11char_traitsIcESaIcEE3strEv@@GLIBCXX_3.4.21
+FUNC:_ZNKSt7__cxx1119basic_istringstreamIcSt11char_traitsIcESaIcEE4viewEv@@GLIBCXX_3.4.29
 FUNC:_ZNKSt7__cxx1119basic_istringstreamIcSt11char_traitsIcESaIcEE5rdbufEv@@GLIBCXX_3.4.21
 FUNC:_ZNKSt7__cxx1119basic_istringstreamIwSt11char_traitsIwESaIwEE3strEv@@GLIBCXX_3.4.21
+FUNC:_ZNKSt7__cxx1119basic_istringstreamIwSt11char_traitsIwESaIwEE4viewEv@@GLIBCXX_3.4.29
 FUNC:_ZNKSt7__cxx1119basic_istringstreamIwSt11char_traitsIwESaIwEE5rdbufEv@@GLIBCXX_3.4.21
 FUNC:_ZNKSt7__cxx1119basic_ostringstre

[PATCH] testsuite: aarch64: Add tests for v[q]mov[u]n_high intrinsics

2021-03-03 Thread Jonathan Wright via Gcc-patches
Hi,

As subject, this patch adds tests for v[q]mov[u]n_high Neon intrinsics. Since
these intrinsics are only supported for AArch64, these tests are restricted
to only run on AArch64 targets.

Ok for master?

Thanks,
Jonathan

---

gcc/testsuite/ChangeLog:

2021-03-02  Jonathan Wright  

* gcc.target/aarch64/advsimd-intrinsics/vmovn_high.c:
New test.
* gcc.target/aarch64/advsimd-intrinsics/vqmovn_high.c:
New test.
* gcc.target/aarch64/advsimd-intrinsics/vqmovun_high.c:
New test.
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmovn_high.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmovn_high.c
new file mode 100644
index ..e05a40f9619d9e817267d1611257820f62c0ffaa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vmovn_high.c
@@ -0,0 +1,73 @@
+/* { dg-skip-if "" { arm*-*-* } } */
+
+#include 
+#include "arm-neon-ref.h"
+#include "compute-ref-data.h"
+
+/* Expected results.  */
+VECT_VAR_DECL(expected, int, 8, 16) [] = { 0x5, 0x5, 0x5, 0x5,
+	   0x5, 0x5, 0x5, 0x5,
+	   0xf0, 0xf1, 0xf2, 0xf3,
+   0xf4, 0xf5, 0xf6, 0xf7 };
+VECT_VAR_DECL(expected, int, 16, 8) [] = { 0x5, 0x5, 0x5, 0x5, 
+	   0xfff0, 0xfff1, 0xfff2, 0xfff3 };
+VECT_VAR_DECL(expected, int, 32, 4) [] = { 0x5, 0x5, 0xfff0, 0xfff1 };
+VECT_VAR_DECL(expected, uint, 8, 16) [] = { 0x5, 0x5, 0x5, 0x5,
+	0x5, 0x5, 0x5, 0x5,
+	0xf0, 0xf1, 0xf2, 0xf3,
+	0xf4, 0xf5, 0xf6, 0xf7 };
+VECT_VAR_DECL(expected, uint, 16, 8) [] = { 0x5, 0x5, 0x5, 0x5,
+	0xfff0, 0xfff1, 0xfff2, 0xfff3 };
+VECT_VAR_DECL(expected, uint, 32, 4) [] = { 0x5, 0x5, 0xfff0, 0xfff1 };
+
+#define TEST_MSG "VMOVN_HIGH"
+void exec_vmovn_high (void)
+{
+  /* Basic test: vec128_r=vmovn_high(vec64_r, vec128_x), store the result.  */
+#define TEST_VMOVN_HIGH(T1, T2, W1, W2, N1, N2)	\
+  VECT_VAR(vec128_r, T1, W2, N2) =		\
+vmovn_high_##T2##W1(VECT_VAR(vec64_r, T1, W2, N1),\
+			VECT_VAR(vec128_x, T1, W1, N1));			\
+  vst1q_##T2##W2(VECT_VAR(result, T1, W2, N2), VECT_VAR(vec128_r, T1, W2, N2))
+
+  DECL_VARIABLE_128BITS_VARIANTS(vec128_r);
+  DECL_VARIABLE_64BITS_VARIANTS(vec64_r);
+  DECL_VARIABLE_128BITS_VARIANTS(vec128_x);
+
+  clean_results ();
+
+  /* Fill vec64_r with a value easy to recognise in the result vector. */
+  VDUP(vec64_r, , int, s, 8, 8, 0x5);
+  VDUP(vec64_r, , int, s, 16, 4, 0x5);
+  VDUP(vec64_r, , int, s, 32, 2, 0x5);
+  VDUP(vec64_r, , uint, u, 8, 8, 0x5);
+  VDUP(vec64_r, , uint, u, 16, 4, 0x5);
+  VDUP(vec64_r, , uint, u, 32, 2, 0x5);
+
+  VLOAD(vec128_x, buffer, q, int, s, 16, 8);
+  VLOAD(vec128_x, buffer, q, int, s, 32, 4);
+  VLOAD(vec128_x, buffer, q, int, s, 64, 2);
+  VLOAD(vec128_x, buffer, q, uint, u, 16, 8);
+  VLOAD(vec128_x, buffer, q, uint, u, 32, 4);
+  VLOAD(vec128_x, buffer, q, uint, u, 64, 2);
+
+  TEST_VMOVN_HIGH(int, s, 16, 8, 8, 16);
+  TEST_VMOVN_HIGH(int, s, 32, 16, 4, 8);
+  TEST_VMOVN_HIGH(int, s, 64, 32, 2, 4);
+  TEST_VMOVN_HIGH(uint, u, 16, 8, 8, 16);
+  TEST_VMOVN_HIGH(uint, u, 32, 16, 4, 8);
+  TEST_VMOVN_HIGH(uint, u, 64, 32, 2, 4);
+
+  CHECK(TEST_MSG, int, 8, 16, PRIx8, expected, "");
+  CHECK(TEST_MSG, int, 16, 8, PRIx16, expected, "");
+  CHECK(TEST_MSG, int, 32, 4, PRIx32, expected, "");
+  CHECK(TEST_MSG, uint, 8, 16, PRIx8, expected, "");
+  CHECK(TEST_MSG, uint, 16, 8, PRIx16, expected, "");
+  CHECK(TEST_MSG, uint, 32, 4, PRIx32, expected, "");
+}
+
+int main (void)
+{
+  exec_vmovn_high ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vqmovn_high.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vqmovn_high.c
new file mode 100644
index ..cb4f5c83de889a420b1f4408d4f95575aa783ae5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vqmovn_high.c
@@ -0,0 +1,121 @@
+/* { dg-skip-if "" { arm*-*-* } } */
+
+#include 
+#include "arm-neon-ref.h"
+#include "compute-ref-data.h"
+
+/* Expected results.  */
+VECT_VAR_DECL(expected, int, 8, 16) [] = { 0x5, 0x5, 0x5, 0x5,
+	   0x5, 0x5, 0x5, 0x5,
+	   0x12, 0x12, 0x12, 0x12,
+	   0x12, 0x12, 0x12, 0x12 };
+VECT_VAR_DECL(expected, int, 16, 8) [] = { 0x5, 0x5, 0x5, 0x5,
+	   0x1278, 0x1278, 0x1278, 0x1278 };
+VECT_VAR_DECL(expected, int, 32, 4) [] = { 0x5, 0x5, 0x12345678, 0x12345678 };
+VECT_VAR_DECL(expected, uint, 8, 16) [] = { 0x5, 0x5, 0x5, 0x5,
+	0x5, 0x5, 0x5, 0x5,
+	0x82, 0x82, 0x82, 0x82,
+	0x82, 0x82, 0x82, 0x82 };
+VECT_VAR_DECL(expected, uint, 16, 8) [] = { 0x5, 0x5, 0x5, 0x5,
+	0x8765, 0x8765, 0x8765, 0x8765 };
+VECT_VAR_DECL(expected, uint, 32, 4) [] = { 0x5, 0x5, 0x87654321, 0x87654321 };
+
+/* Expected results when saturation occurs.  */
+VECT_VAR_DECL(expected1, int, 8, 16) [] = { 0x5, 0x5, 0x5, 0x5,
+	0x5, 0x5, 0x5, 0x5,
+	0x7f, 0x7f, 0x7f, 0x7f,
+	0x7f, 0x7f, 0x7f, 0x7f };
+VECT_VAR_DECL(e

[PATCH] testsuite: i386: Fix gcc.target/i386/pr95798-?.c on Solaris

2021-03-03 Thread Rainer Orth
The new gcc.target/i386/pr95798-?.c tests FAIL on 64-bit Solaris/x86:

+FAIL: gcc.target/i386/pr95798-1.c scan-assembler 1, 
8(%rsp,%r[a-z0-9]*,8)
+FAIL: gcc.target/i386/pr95798-1.c scan-assembler 2, 
16(%rsp,%r[a-z0-9]*,8)
+FAIL: gcc.target/i386/pr95798-1.c scan-assembler 3, 
24(%rsp,%r[a-z0-9]*,8)
+FAIL: gcc.target/i386/pr95798-1.c scan-assembler 4, 
32(%rsp,%r[a-z0-9]*,8)
+FAIL: gcc.target/i386/pr95798-1.c scan-assembler 5, 
40(%rsp,%r[a-z0-9]*,8)
+FAIL: gcc.target/i386/pr95798-1.c scan-assembler 6, 
48(%rsp,%r[a-z0-9]*,8)
+FAIL: gcc.target/i386/pr95798-1.c scan-assembler 7, 
56(%rsp,%r[a-z0-9]*,8)
+FAIL: gcc.target/i386/pr95798-2.c scan-assembler 1, 
8(%rsp,%r[a-z0-9]*,8)
+FAIL: gcc.target/i386/pr95798-2.c scan-assembler 2, 
16(%rsp,%r[a-z0-9]*,8)
+FAIL: gcc.target/i386/pr95798-2.c scan-assembler 3, 
24(%rsp,%r[a-z0-9]*,8)
+FAIL: gcc.target/i386/pr95798-2.c scan-assembler 4, 
32(%rsp,%r[a-z0-9]*,8)
+FAIL: gcc.target/i386/pr95798-2.c scan-assembler 5, 
40(%rsp,%r[a-z0-9]*,8)
+FAIL: gcc.target/i386/pr95798-2.c scan-assembler 6, 
48(%rsp,%r[a-z0-9]*,8)
+FAIL: gcc.target/i386/pr95798-2.c scan-assembler 7, 
56(%rsp,%r[a-z0-9]*,8)

This happens because Solaris/amd64 defaults to -fno-omit-frame-pointer
and can be avoided by always passing -fomit-frame-pointer.

Tested on i386-pc-solaris2.11 and x86_64-pc-linux-gnu.

Ok for master?

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2021-03-02  Rainer Orth  

gcc/testsuite:
* gcc.target/i386/pr95798-1.c: Add -fomit-frame-pointer to
dg-options.
* gcc.target/i386/pr95798-2.c: Likewise.

# HG changeset patch
# Parent  bedda525d3ac004d2d10680b74fcadf8324f1b5c
testsuite: i386: Fix gcc.target/i386/pr95798-?.c on Solaris

diff --git a/gcc/testsuite/gcc.target/i386/pr95798-1.c b/gcc/testsuite/gcc.target/i386/pr95798-1.c
--- a/gcc/testsuite/gcc.target/i386/pr95798-1.c
+++ b/gcc/testsuite/gcc.target/i386/pr95798-1.c
@@ -1,6 +1,6 @@
 /* PR target/95798 */
 /* { dg-do compile } */
-/* { dg-options "-O2 -masm=att" } */
+/* { dg-options "-O2 -masm=att -fomit-frame-pointer" } */
 /* { dg-final { scan-assembler "1, 8\\\(%rsp,%r\[a-z0-9]*,8\\\)" { target lp64 } } } */
 /* { dg-final { scan-assembler "2, 16\\\(%rsp,%r\[a-z0-9]*,8\\\)" { target lp64 } } } */
 /* { dg-final { scan-assembler "3, 24\\\(%rsp,%r\[a-z0-9]*,8\\\)" { target lp64 } } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr95798-2.c b/gcc/testsuite/gcc.target/i386/pr95798-2.c
--- a/gcc/testsuite/gcc.target/i386/pr95798-2.c
+++ b/gcc/testsuite/gcc.target/i386/pr95798-2.c
@@ -1,6 +1,6 @@
 /* PR target/95798 */
 /* { dg-do compile } */
-/* { dg-options "-O2 -masm=att" } */
+/* { dg-options "-O2 -masm=att -fomit-frame-pointer" } */
 /* { dg-final { scan-assembler "1, 8\\\(%rsp,%r\[a-z0-9]*,8\\\)" { target lp64 } } } */
 /* { dg-final { scan-assembler "2, 16\\\(%rsp,%r\[a-z0-9]*,8\\\)" { target lp64 } } } */
 /* { dg-final { scan-assembler "3, 24\\\(%rsp,%r\[a-z0-9]*,8\\\)" { target lp64 } } } */


Re: [PATCH] testsuite: i386: Fix gcc.target/i386/pr95798-?.c on Solaris

2021-03-03 Thread Jakub Jelinek via Gcc-patches
On Wed, Mar 03, 2021 at 01:36:20PM +0100, Rainer Orth wrote:
> The new gcc.target/i386/pr95798-?.c tests FAIL on 64-bit Solaris/x86:
> 
> +FAIL: gcc.target/i386/pr95798-1.c scan-assembler 1, 
> 8(%rsp,%r[a-z0-9]*,8)
> +FAIL: gcc.target/i386/pr95798-1.c scan-assembler 2, 
> 16(%rsp,%r[a-z0-9]*,8)
> +FAIL: gcc.target/i386/pr95798-1.c scan-assembler 3, 
> 24(%rsp,%r[a-z0-9]*,8)
> +FAIL: gcc.target/i386/pr95798-1.c scan-assembler 4, 
> 32(%rsp,%r[a-z0-9]*,8)
> +FAIL: gcc.target/i386/pr95798-1.c scan-assembler 5, 
> 40(%rsp,%r[a-z0-9]*,8)
> +FAIL: gcc.target/i386/pr95798-1.c scan-assembler 6, 
> 48(%rsp,%r[a-z0-9]*,8)
> +FAIL: gcc.target/i386/pr95798-1.c scan-assembler 7, 
> 56(%rsp,%r[a-z0-9]*,8)
> +FAIL: gcc.target/i386/pr95798-2.c scan-assembler 1, 
> 8(%rsp,%r[a-z0-9]*,8)
> +FAIL: gcc.target/i386/pr95798-2.c scan-assembler 2, 
> 16(%rsp,%r[a-z0-9]*,8)
> +FAIL: gcc.target/i386/pr95798-2.c scan-assembler 3, 
> 24(%rsp,%r[a-z0-9]*,8)
> +FAIL: gcc.target/i386/pr95798-2.c scan-assembler 4, 
> 32(%rsp,%r[a-z0-9]*,8)
> +FAIL: gcc.target/i386/pr95798-2.c scan-assembler 5, 
> 40(%rsp,%r[a-z0-9]*,8)
> +FAIL: gcc.target/i386/pr95798-2.c scan-assembler 6, 
> 48(%rsp,%r[a-z0-9]*,8)
> +FAIL: gcc.target/i386/pr95798-2.c scan-assembler 7, 
> 56(%rsp,%r[a-z0-9]*,8)
> 
> This happens because Solaris/amd64 defaults to -fno-omit-frame-pointer
> and can be avoided by always passing -fomit-frame-pointer.
> 
> Tested on i386-pc-solaris2.11 and x86_64-pc-linux-gnu.
> 
> Ok for master?

Ok, thanks.

> 2021-03-02  Rainer Orth  
> 
>   gcc/testsuite:
>   * gcc.target/i386/pr95798-1.c: Add -fomit-frame-pointer to
>   dg-options.
>   * gcc.target/i386/pr95798-2.c: Likewise.

Jakub



[PATCH] testsuite: aarch64: Add tests for vcvt FP intrinsics

2021-03-03 Thread Jonathan Wright via Gcc-patches
Hi,

As subject, this patch adds tests for vcvtx* and vcvt_fXX_fXX floating-point
Neon intrinsics. Since these intrinsics are only supported for AArch64, these
tests are restricted to only run on AArch64 targets.

Ok for master?

Thanks,
Jonathan

---

gcc/testsuite/ChangeLog:

2021-02-18  Jonathan Wright  

* gcc.target/aarch64/advsimd-intrinsics/vcvt_fXX_fXX.c:
New test.
* gcc.target/aarch64/advsimd-intrinsics/vcvtx.c:
New test.
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcvt_fXX_fXX.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcvt_fXX_fXX.c
new file mode 100644
index ..248f37a63a7defda9507172d7a00be0b1a230580
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcvt_fXX_fXX.c
@@ -0,0 +1,100 @@
+/* { dg-skip-if "" { arm*-*-* } } */
+
+#include 
+#include "arm-neon-ref.h"
+#include "compute-ref-data.h"
+#include 
+
+/* Expected results for vcvt_f64_f32.  */
+VECT_VAR_DECL (expected, hfloat, 64, 2) [] = { 0x4030,
+	   0x402e};
+/* Expected results for vcvt_f32_f64.  */
+VECT_VAR_DECL (expected, hfloat, 32, 2) [] = { 0x3fc0, 0x4020 };
+
+/* Expected results for vcvt_high_f64_f32.  */
+VECT_VAR_DECL (expected_high, hfloat, 64, 2) [] = { 0xc02c,
+		0xc02a };
+/* Expected results for vcvt_high_f32_f64.  */
+VECT_VAR_DECL (expected_high, hfloat, 32, 4) [] = { 0x4000, 0x4000,
+		0x3fc0, 0x4020 };
+
+void
+exec_vcvt (void)
+{
+  clean_results ();
+
+#define TEST_MSG "vcvt_f64_f32"
+  {
+VECT_VAR_DECL (buffer_src, float, 32, 2) [] = { 16.0, 15.0 };
+
+DECL_VARIABLE (vector_src, float, 32, 2);
+
+VLOAD (vector_src, buffer_src, , float, f, 32, 2);
+DECL_VARIABLE (vector_res, float, 64, 2) =
+	vcvt_f64_f32 (VECT_VAR (vector_src, float, 32, 2));
+vst1q_f64 (VECT_VAR (result, float, 64, 2),
+	   VECT_VAR (vector_res, float, 64, 2));
+
+CHECK_FP (TEST_MSG, float, 64, 2, PRIx64, expected, "");
+  }
+#undef TEST_MSG
+
+  clean_results ();
+
+#define TEST_MSG "vcvt_f32_f64"
+  {
+VECT_VAR_DECL (buffer_src, float, 64, 2) [] = { 1.50025, 2.50025 };
+DECL_VARIABLE (vector_src, float, 64, 2);
+
+VLOAD (vector_src, buffer_src, q, float, f, 64, 2);
+DECL_VARIABLE (vector_res, float, 32, 2) =
+  vcvt_f32_f64 (VECT_VAR (vector_src, float, 64, 2));
+vst1_f32 (VECT_VAR (result, float, 32, 2),
+	  VECT_VAR (vector_res, float, 32, 2));
+
+CHECK_FP (TEST_MSG, float, 32, 2, PRIx32, expected, "");
+  }
+#undef TEST_MSG
+
+  clean_results ();
+
+#define TEST_MSG "vcvt_high_f64_f32"
+  {
+DECL_VARIABLE (vector_src, float, 32, 4);
+VLOAD (vector_src, buffer, q, float, f, 32, 4);
+DECL_VARIABLE (vector_res, float, 64, 2);
+VECT_VAR (vector_res, float, 64, 2) =
+  vcvt_high_f64_f32 (VECT_VAR (vector_src, float, 32, 4));
+vst1q_f64 (VECT_VAR (result, float, 64, 2),
+	   VECT_VAR (vector_res, float, 64, 2));
+CHECK_FP (TEST_MSG, float, 64, 2, PRIx64, expected_high, "");
+  }
+#undef TEST_MSG
+
+  clean_results ();
+
+#define TEST_MSG "vcvt_high_f32_f64"
+  {
+VECT_VAR_DECL (buffer_src, float, 64, 2) [] = { 1.50025, 2.50025 };
+DECL_VARIABLE (vector_low, float, 32, 2);
+VDUP (vector_low, , float, f, 32, 2, 2.0);
+
+DECL_VARIABLE (vector_src, float, 64, 2);
+VLOAD (vector_src, buffer_src, q, float, f, 64, 2);
+
+DECL_VARIABLE (vector_res, float, 32, 4) =
+  vcvt_high_f32_f64 (VECT_VAR (vector_low, float, 32, 2),
+			 VECT_VAR (vector_src, float, 64, 2));
+vst1q_f32 (VECT_VAR (result, float, 32, 4),
+	   VECT_VAR (vector_res, float, 32, 4));
+
+CHECK_FP (TEST_MSG, float, 32, 4, PRIx32, expected_high, "");
+  }
+}
+
+int
+main (void)
+{
+  exec_vcvt ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcvtx.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcvtx.c
new file mode 100644
index ..8687204ab97310c731144a33943833219a3c341d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vcvtx.c
@@ -0,0 +1,83 @@
+/* { dg-skip-if "" { arm*-*-* } } */
+
+#include 
+#include "arm-neon-ref.h"
+#include "compute-ref-data.h"
+#include 
+
+/* Expected results for vcvtx_f32_f64 and vcvtxd_f32_f64.  */
+VECT_VAR_DECL (expected, hfloat, 32, 2) [] = { 0x3fc1, 0x4021 };
+
+/* Expected results for vcvtx_high_f32_f64.  */
+VECT_VAR_DECL (expected_high, hfloat, 32, 4) [] = { 0x4000, 0x4000,
+		0x3fc1, 0x4021 };
+
+void
+exec_vcvtx (void)
+{
+  clean_results ();
+
+#define TEST_MSG "vcvtx_f32_f64"
+  {
+VECT_VAR_DECL (buffer_src, float, 64, 2) [] = { 1.50025, 2.50025 };
+DECL_VARIABLE (vector_src, float, 64, 2);
+
+VLOAD (vector_src, buffer_src, q, float, f, 64, 2);
+DECL_VARIABLE (vector_res, float, 32, 2) =
+  vcvtx_f32_f64 (VECT_VAR (vector_src, fl

Re: [PATCH] IBM Z: Run mul-signed-overflow-*.c only on z14+

2021-03-03 Thread Andreas Krebbel via Gcc-patches
On 3/3/21 11:50 AM, Ilya Leoshkevich wrote:
> On Wed, 2021-03-03 at 07:50 +0100, Andreas Krebbel wrote:
>> On 3/2/21 11:59 PM, Ilya Leoshkevich wrote:
>>> mul-signed-overflow-*.c execution tests fail on z13, because they
>>> contain z14-specific instructions.  Fix by requiring s390_z14_hw
>>> target.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> * gcc.target/s390/mul-signed-overflow-1.c: Run only on
>>> z14+.
>>> * gcc.target/s390/mul-signed-overflow-2.c: Likewise.
>>
>> I did that change yesterday already.
> 
> Ah, I haven't noticed.  One difference between our patches is, though,
> that I also have `dg-do compile` - this way, compile tests still run on
> z13.

Ok, that's a bit better indeed. Feel free to commit that change ontop.

Andreas


Re: [PATCH] gcc.misc-tests/outputs.exp: assert unique test-names

2021-03-03 Thread Alexandre Oliva
On Feb 25, 2021, Hans-Peter Nilsson  wrote:

> Navigating and debugging causes for failing tests here isn't
> helped by the existence of tests with duplicate names.

Aah, I guess I see what happened: some test sets were copied to cover
additional cases I hadn't covered (cool :-), but the test names were not
changed.

I wonder if we wouldn't be better off using say $bp instead of $b in
test name strings, so that bp could be set to "$b whatever" or "$b"
before a group of tests.

This would make it easier to copy blocks of tests, but it would make it
harder to search for a failing test.

I suppose the changes you proposed, along with code that rejects
duplicates, is preferrable:

> gcc/testsuite:
>   * gcc.misc-tests/outputs.exp: Append discriminating
>   suffixes to tests with duplicate names.
>   (outest): Assert that each running test has a unique
>   name.

LGTM, thanks!

-- 
Alexandre Oliva, happy hacker  https://FSFLA.org/blogs/lxo/
   Free Software Activist GNU Toolchain Engineer
Vim, Vi, Voltei pro Emacs -- GNUlius Caesar


Re: [PATCH] libstdc++: Update Solaris baselines for GCC 11.1

2021-03-03 Thread Jonathan Wakely via Gcc-patches

On 03/03/21 13:29 +0100, Rainer Orth wrote:

The following patch updates the Solaris baselines for GCC 11.1.  There's
only one caveat: comparing the Solaris 11.3 and 11.4 baselines, I find

+FUNC:_ZSt10from_charsPKcS0_RdSt12chars_format@@GLIBCXX_3.4.29
+FUNC:_ZSt10from_charsPKcS0_ReSt12chars_format@@GLIBCXX_3.4.29
+FUNC:_ZSt10from_charsPKcS0_RfSt12chars_format@@GLIBCXX_3.4.29

i.e.

std::from_chars(char const*, char const*, double&, std::chars_format)

and similarly for long double, float.  Those are from from
src/c++17/floating_from_chars.cc and only defined if
_GLIBCXX_HAVE_USELOCALE, i.e. depend on the XPG7 addition.  Given that
only Solaris 11.4 supports XPG7, I've taken the 11.3 baselines to avoid
having separate ones for 11.3 and 11.4.


That makes sense to me.


Tested on i386-pc-solaris2.11 and sparc-sun-solaris2.11 (sparc and x86,
32 and 64-bit, 11.3 and 11.4).

Ok for master?


OK, thanks.



Re: [PATCH] outputs.exp: skip @file -save-temps if target has -L or -I

2021-03-03 Thread Alexandre Oliva
On Feb 25, 2021, Hans-Peter Nilsson  wrote:

> Additional files are created in presence of @file option.

Oh, wow.  I hope nobody uses @file in target board files ;-)

> gcc/testsuite:
>   * gcc.misc-tests/outputs.exp: Skip @file -save-temps
>   tests if target test-framework has -L or -I options.

Thanks, this looks reasonable.

I suppose it might be possible to record the presence of -I and -L flags
separately, and adjust the @file tests so as to conditionally expect
extra files where they'd appear, but I don't see that this would
increase test coverage, so ISTM that would be a pointless effort.  So
we're good.  Thanks!

-- 
Alexandre Oliva, happy hacker  https://FSFLA.org/blogs/lxo/
   Free Software Activist GNU Toolchain Engineer
Vim, Vi, Voltei pro Emacs -- GNUlius Caesar


[pushed] c++: C++17 and decltype of multi-operator expression [PR95675]

2021-03-03 Thread Jason Merrill via Gcc-patches
A call that is the immediate operand of decltype has special semantics: no
temporary is produced, so it's OK for the return type to be e.g. incomplete.
But we were treating (e | f) the same way, which confused overload
resolution when we then tried to evaluate ... | g.

Fixed by making build_temp do what its name says, and force the C++17
temporary materialization conversion.  We could also handle this by
making sure that tf_decltype is only set when appropriate, but we keep
finding more cases where we aren't getting that right yet, and the
approach of this patch is more consistent with the C++17 model.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog:

PR c++/95675
* call.c (build_temp): Wrap a CALL_EXPR in a TARGET_EXPR
if it didn't get one before.

gcc/testsuite/ChangeLog:

PR c++/95675
* g++.dg/cpp0x/decltype-call5.C: New test.
* g++.dg/cpp0x/decltype-call6.C: New test.
---
 gcc/cp/call.c   |  8 
 gcc/testsuite/g++.dg/cpp0x/decltype-call5.C |  7 +++
 gcc/testsuite/g++.dg/cpp0x/decltype-call6.C | 12 
 3 files changed, 27 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/decltype-call5.C
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/decltype-call6.C

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index 0ba0e19ae08..b00334d0919 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -7291,6 +7291,14 @@ build_temp (tree expr, tree type, int flags,
   && !type_has_nontrivial_copy_init (TREE_TYPE (expr)))
 return get_target_expr_sfinae (expr, complain);
 
+  /* In decltype, we might have decided not to wrap this call in a TARGET_EXPR.
+ But it turns out to be a subexpression, so perform temporary
+ materialization now.  */
+  if (TREE_CODE (expr) == CALL_EXPR
+  && CLASS_TYPE_P (type)
+  && same_type_ignoring_top_level_qualifiers_p (type, TREE_TYPE (expr)))
+expr = build_cplus_new (type, expr, complain);
+
   savew = warningcount + werrorcount, savee = errorcount;
   releasing_vec args (make_tree_vector_single (expr));
   expr = build_special_member_call (NULL_TREE, complete_ctor_identifier,
diff --git a/gcc/testsuite/g++.dg/cpp0x/decltype-call5.C 
b/gcc/testsuite/g++.dg/cpp0x/decltype-call5.C
new file mode 100644
index 000..81ef6e6f9c9
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/decltype-call5.C
@@ -0,0 +1,7 @@
+// PR c++/95675
+// { dg-do compile { target c++11 } }
+
+struct b {};
+b operator|(b, b) { return {}; }
+b e, f, g;
+using h = decltype(e | f | g);
diff --git a/gcc/testsuite/g++.dg/cpp0x/decltype-call6.C 
b/gcc/testsuite/g++.dg/cpp0x/decltype-call6.C
new file mode 100644
index 000..4173b607689
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/decltype-call6.C
@@ -0,0 +1,12 @@
+// PR c++/95675
+// { dg-do compile { target c++11 } }
+
+struct a {};
+template  struct b;
+template  struct b {
+  decltype(bq()(br())) c;
+};
+struct e {
+  operator a();
+};
+b d;

base-commit: f1b13064609a41fcaf4d1859663453bba237e277
-- 
2.27.0



[PATCH] arm: Ignore --with-mode when CPU only supports one instruction set.

2021-03-03 Thread Richard Earnshaw (lists) via Gcc-patches
Hopefully this change will reduce the number of times Christophe is
needing to tweak the testsuite.

--

Arm processors can support up to two instruction sets.  Some early
cores only support the traditional A32 (Arm) instructions, while some
more recent devices only support T32 (Thumb) instructions.

When configuring the compiler, --with-mode can be used to select the
default instruction set to target if the user has not made an explicit
choice, but this can cause needless problems if the default is not
supported by the requested CPU.

To fix this this patch adjusts the way that the --with-mode selection
is processed so that it can take into account the selected CPU or
architecture and not create a meaningless combination.

gcc:
* common/config/arm/arm-common.c: Include configargs.h.
(arm_config_default): New function.
(arm_target_mode): Renamed from arm_target_thumb_only.  Handle
processors that do not support Thumb.  Take into account the
--with-mode configuration setting for selecting the default.
* config/arm/arm.h (OPTION_DEFAULT_SPECS): Remove entry for 'mode'.
(TARGET_MODE_SPEC_FUNCTIONS): Update for function name change.
---
 gcc/common/config/arm/arm-common.c | 49 ++
 gcc/config/arm/arm.h   | 10 +++---
 2 files changed, 49 insertions(+), 10 deletions(-)


diff --git a/gcc/common/config/arm/arm-common.c b/gcc/common/config/arm/arm-common.c
index 98824517c7b..5b03b86724d 100644
--- a/gcc/common/config/arm/arm-common.c
+++ b/gcc/common/config/arm/arm-common.c
@@ -33,6 +33,8 @@
 #include "sbitmap.h"
 #include "diagnostic.h"
 
+#include "configargs.h"
+
 /* Set default optimization options.  */
 static const struct default_options arm_option_optimization_table[] =
   {
@@ -240,16 +242,34 @@ check_isa_bits_for (const enum isa_feature* bits, enum isa_feature bit)
   return false;
 }
 
+/* Look up NAME in the configuration defaults for this build of the
+   the compiler.  Return the value associated with that name, or NULL
+   if no value is found.  */
+static const char *
+arm_config_default (const char *name)
+{
+  int i;
+
+  if (configure_default_options[0].name == NULL)
+return NULL;
+
+  for (i = 0; i < ARRAY_SIZE (configure_default_options); i++)
+if (strcmp (configure_default_options[i].name, name) == 0)
+  return configure_default_options[i].value;
+
+  return NULL;
+}
+
 /* Called by the driver to check whether the target denoted by current
-   command line options is a Thumb-only target.  ARGV is an array of
-   tupples (normally only one) where the first element of the tupple
-   is 'cpu' or 'arch' and the second is the option passed to the
-   compiler for that.  An architecture tupple is always taken in
-   preference to a cpu tupple and the last of each type always
+   command line options is a Thumb-only, or ARM-only, target.  ARGV is
+   an array of tupples (normally only one) where the first element of
+   the tupple is 'cpu' or 'arch' and the second is the option passed
+   to the compiler for that.  An architecture tupple is always taken
+   in preference to a cpu tupple and the last of each type always
overrides any earlier setting.  */
 
 const char *
-arm_target_thumb_only (int argc, const char **argv)
+arm_target_mode (int argc, const char **argv)
 {
   const char *arch = NULL;
   const char *cpu = NULL;
@@ -285,6 +305,9 @@ arm_target_thumb_only (int argc, const char **argv)
   if (arch_opt && !check_isa_bits_for (arch_opt->common.isa_bits,
 	   isa_bit_notm))
 	return "-mthumb";
+  if (arch_opt && !check_isa_bits_for (arch_opt->common.isa_bits,
+	   isa_bit_thumb))
+	return "-marm";
 }
   else if (cpu)
 {
@@ -294,6 +317,20 @@ arm_target_thumb_only (int argc, const char **argv)
   if (cpu_opt && !check_isa_bits_for (cpu_opt->common.isa_bits,
 	  isa_bit_notm))
 	return "-mthumb";
+  if (cpu_opt && !check_isa_bits_for (cpu_opt->common.isa_bits,
+	   isa_bit_thumb))
+	return "-marm";
+}
+
+  const char *default_mode = arm_config_default ("mode");
+  if (default_mode)
+{
+  if (strcmp (default_mode, "thumb") == 0)
+	return "-mthumb";
+  else if (strcmp (default_mode, "arm") == 0)
+	return "-marm";
+  else
+	gcc_unreachable ();
 }
 
   /* Compiler hasn't been configured with a default, and the CPU
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 6bc03ada0bf..113c015c455 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -390,7 +390,10 @@ emission of floating point pcs attributes.  */
--with-float is ignored if -mfloat-abi is specified.
--with-fpu is ignored if -mfpu is specified.
--with-abi is ignored if -mabi is specified.
-   --with-tls is ignored if -mtls-dialect is specified. */
+   --with-tls is ignored if -mtls-dialect is specified.
+   Note: --with-mode is not handled here, that has a special rule
+   TARGET_MODE_CHECK that also takes into account the sele

Re: [PATCH v2] c++: ICE with deduction guide in checking type-dep [PR99009, PR97034]

2021-03-03 Thread Jason Merrill via Gcc-patches

On 3/2/21 4:51 PM, Marek Polacek wrote:

On Mon, Mar 01, 2021 at 09:29:19PM -0500, Jason Merrill via Gcc-patches wrote:

On 3/1/21 7:59 PM, Marek Polacek wrote:

On Mon, Mar 01, 2021 at 03:08:58PM -0500, Jason Merrill wrote:

On 3/1/21 2:54 PM, Marek Polacek wrote:

On Thu, Feb 25, 2021 at 10:45:29PM -0500, Jason Merrill via Gcc-patches wrote:

On 2/25/21 5:41 PM, Marek Polacek wrote:

On Thu, Feb 25, 2021 at 10:59:49AM -0500, Jason Merrill wrote:

On 2/12/21 6:12 PM, Marek Polacek wrote:

We represent deduction guides with FUNCTION_DECLs, but they are built
without DECL_CONTEXT


Hmm, that seems wrong: "A deduction-guide shall be declared in the
same scope as the corresponding class template and, for a member class
template, with the same access."  But it probably isn't necessary to change
this:


leading to an ICE in type_dependent_expression_p
on the assert that the type of a function template with no dependent
(innermost!) template arguments must be non-dependent.  Consider the
attached class-deduction79.C: we create a deduction guide:

   template G(T)-> E::G

we deduce T and create a partial instantiation:

   G(T) -> E::G [with T = int]

And then do_class_deduction wants to create a CALL_EXPR from the above
using build_new_function_call -> build_over_call which calls mark_used
-> maybe_instantiate_noexcept -> type_dependent_expression_p.

There, the innermost template arguments are non-dependent (), but
the fntype is dependent -- the return type is a TYPENAME_TYPE, and
since we have no DECL_CONTEXT, this check holds:

   /* Otherwise, if the function decl isn't from a dependent scope, it 
can't be
  type-dependent.  Checking this is important for functions with auto 
return
  type, which looks like a dependent type.  */
   if (TREE_CODE (expression) == FUNCTION_DECL
   && !(DECL_CLASS_SCOPE_P (expression)
&& dependent_type_p (DECL_CONTEXT (expression)))

whereupon we ICE.

Experiments with setting DECL_CONTEXT didn't pan out.


In c8 of the PR it looks like you were using the class itself as
DECL_CONTEXT; the quote above says that the right context is the enclosing
scope of the class.


Sadly, using CP_TYPE_CONTEXT (type) would result in a crash in
retrieve_specialization:

  /* There should be as many levels of arguments as there are
 levels of parameters.  */
  gcc_assert (TMPL_ARGS_DEPTH (args)
  == (TREE_CODE (tmpl) == TEMPLATE_DECL
  ? TMPL_PARMS_DEPTH (DECL_TEMPLATE_PARMS (tmpl))
  : template_class_depth (DECL_CONTEXT (tmpl;


Yeah, probably simpler not to bother.


So perhaps we
just want to skip the assert for deduction guides, because they are
a little special.  Better ideas solicited.


In c3 you mention that one of the variants broke with r269093; this is
because my change to check CLASSTYPE_TEMPLATE_INSTANTIATION is false for the
template pattern itself (E).


And the original test started with my r11-1713 because using TREE_TYPE
directly instead of decltype (which is a non-deduced context) means we
can deduced from the argument.

But I think probably the right answer is to defer this deduction until the
enclosing scope is non-dependent, i.e. (untested)


Thanks.  That mostly works, except the new class-deduction-aggr[89].C
tests.  Consider 8:

namespace N {
template  struct S {
  template  S(T, U);
};
} // namespace N
template  struct E {
  template  struct G { T t; };
  void fn() { G{N::S{'a', 1}}; }
};

void
g ()
{
  E<1> e;
  e.fn ();
}

With your patch, when in do_class_deduction when processing_template_decl,
we just return.  When we call do_class_deduction again when p_t_d is 0,
maybe_aggr_guide returns early here:

  if (!CP_AGGREGATE_TYPE_P (type))
return NULL_TREE

because G is not complete (and rightly so, we didn't instantiate it).  So
we aren't able to deduce the template parameters.  I'm not sure if I should
pursue this direction further.  :(


I think so; we just need to test CP_AGGREGATE_TYPE_P on the original
template pattern instead of the instantiation E<1>::G.


I'm sorry, I've got stuck again.

Yes, using the original template pattern helps us get past the
CP_AGGREGATE_TYPE_P check.

However, using TREE_TYPE (DECL_TI_TEMPLATE (tmpl)) as the type of the deduction 
guide
means the guide will be "template G(T)-> E< >::G" which
results in

 class-deduction-aggr8.C:11:15: error: invalid use of dependent type 'typename 
E< >::G >'

which makes sense I guess: when we defer building up the guide until
we've instantiated E<1>, finding the dependent type E<> is not expected.


Yeah, I was only thinking to use the pattern for the aggregate check.


Ack.  Though I think I also have to use the pattern here:

init = reshape_init (type, init, complain);

otherwise reshape_init returns a TARGET_EXPR and we immediately
crash in collect_ctor_idx_types because that only expects a CONSTRUCTOR.
And what we need to get

Re: [arm/testsuite]: Skip pr97969.c if -mthumb is not compatible [PR target/97969]

2021-03-03 Thread Richard Earnshaw via Gcc-patches
On 02/03/2021 18:35, Christophe Lyon wrote:
> On Tue, 2 Mar 2021 at 19:18, Richard Earnshaw
>  wrote:
>>
>> On 02/03/2021 18:14, Richard Earnshaw via Gcc-patches wrote:
>>> On 02/03/2021 18:10, Christophe Lyon wrote:
 On Tue, 2 Mar 2021 at 17:25, Richard Earnshaw
  wrote:
>
> On 02/03/2021 16:19, Richard Earnshaw via Gcc-patches wrote:
>> On 01/03/2021 15:26, Christophe Lyon via Gcc-patches wrote:
>>> Ping?
>>>
>>> On Wed, 3 Feb 2021 at 10:01, Christophe Lyon 
>>>  wrote:

 Ping?
 I guess that's obvious enough?

 On Wed, 27 Jan 2021 at 10:03, Christophe Lyon
  wrote:
>
> Depending on how the toolchain is configured or how the testsuite is
> executed, -mthumb may not be compatible. Like for other tests, skip
> pr97969.c in this case.
>
> For instance arm-linux-gnueabihf and -march=armv5t in RUNTESTFLAGS.
>
> 2021-01-27  Christophe Lyon  
>
> gcc/testsuite/
> PR target/97969
> * gcc.target/arm/pr97969.c: Skip if thumb mode is not available.
>
> diff --git a/gcc/testsuite/gcc.target/arm/pr97969.c
> b/gcc/testsuite/gcc.target/arm/pr97969.c
> index 714a1d1..0b5d07f 100644
> --- a/gcc/testsuite/gcc.target/arm/pr97969.c
> +++ b/gcc/testsuite/gcc.target/arm/pr97969.c
> @@ -1,4 +1,5 @@
>  /* { dg-do compile } */
> +/* { dg-skip-if "" { ! { arm_thumb1_ok || arm_thumb2_ok } } } */
>  /* { dg-options "-std=c99 -fno-omit-frame-pointer -mthumb -w -Os" } 
> */
>
>  typedef a[23];
>>
>> I'm working on a patch to make this sort of change unnecessary (I hope).
>>  Just running some final checks.
>>
>> R.
>>
>
> Ah, wait.  This one already has an explicit -mthumb, so my patch won't
> affect this.  But why is -mthumb needed for this test anyway?  It's just
> a compilation test, so why not drop that and we'll generally get better
> coverage all round.
>

 For instance I see the test fail for target arm-none-linux-gnueabihf
 --with-mode arm --with-cpu cortex-a9 --with-fpu vfp
 and running the tests with -march=armv5t

 We get the famous thumb-1 + hard-float ABI not supported.

 I guess -mthumb is inherited from the bug report?

 Christophe

>>>
>>> dropping the -mthumb should fix that though?
>>>
>>> In fact, I'd drop -Os as well, it's not needed as -Os is just one of the
>>> many options that are used to build this test already.
>>>
>>> R.
>>>
>>
>> But maybe then we need to change dg-options into dg-add-options.
>>
> 
> Not sure to follow: the test is compiled only once, with:
> -std=c99 -fno-omit-frame-pointer -mthumb -w -Os
> in my logs
> 

I think it's only run the once /because/ the test sets dg-options rather
than dg-add-options.

R.


[PATCH] sparcv9: Disable -Wuninitialized warnings breaking bootstrap [PR92002]

2021-03-03 Thread Rainer Orth
sparcv9 bootstrap has been broken for 1 1/2 years now by spurious
-Wuninitialized warnings:

In function ‘wide_int wi::max_value(unsigned int, signop)’,
inlined from ‘wide_int wi::max_value(unsigned int, signop)’ at 
/vol/gcc/src/hg/master/local/gcc/wide-int.cc:330:1:
/vol/gcc/src/hg/master/local/gcc/wide-int.cc:335:31: error: 
‘.generic_wide_int::.wide_int_storage::val[1]’
 may be used uninitialized [-Werror=maybe-uninitialized]
  335 | return shwi (-1, precision);
  |   ^
[...]
In function ‘wide_int get_nonzero_bits(const_tree)’,
inlined from ‘wide_int get_nonzero_bits(const_tree)’ at 
/vol/gcc/src/hg/master/local/gcc/tree-ssanames.c:531:1:
/vol/gcc/src/hg/master/local/gcc/tree-ssanames.c:544:67: error: 
‘.generic_wide_int::.wide_int_storage::val[1]’
 may be used uninitialized [-Werror=maybe-uninitialized]
  544 |  | (HOST_WIDE_INT) pi->misalign, precision);
  |   ^
[...]

Before we ship yet another release with this issue, I suggest to at
least include a workaround of demoting them to warnings.

Tested on sparcv9-sun-solaris2.11.

Ok for master?

I originally meant to propose the patch for the gcc-10 branch as well,
but when I tried a sparcv9-sun-solaris2.11 bootstrap there some time
ago, it wasn't affected any longer.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2021-03-03  Rainer Orth  

gcc:
PR bootstrap/92002
* config/sparc/t-sparc (tree-ssanames.o-warn): Don't error for
-Wuninitialized, -Wmaybe-uninitialized.
(wide-int.o-warn): Likewise.

# HG changeset patch
# Parent  710ee5be5a4d126389146bc2e4ab9bbcc36a44e1
Hack around 64-bit SPARC bootstrap failure (PR bootstrap/92002)

diff --git a/gcc/config/sparc/t-sparc b/gcc/config/sparc/t-sparc
--- a/gcc/config/sparc/t-sparc
+++ b/gcc/config/sparc/t-sparc
@@ -27,3 +27,7 @@ sparc-c.o: $(srcdir)/config/sparc/sparc-
 sparc-d.o: $(srcdir)/config/sparc/sparc-d.c
 	$(COMPILE) $<
 	$(POSTCOMPILE)
+
+# Hack around PR bootstrap/92002.
+tree-ssanames.o-warn += -Wno-error=uninitialized -Wno-error=maybe-uninitialized
+wide-int.o-warn += -Wno-error=uninitialized -Wno-error=maybe-uninitialized


Re: [PATCH] arm: Ignore --with-mode when CPU only supports one instruction set.

2021-03-03 Thread Christophe Lyon via Gcc-patches
On Wed, 3 Mar 2021 at 14:55, Richard Earnshaw (lists)
 wrote:
>
> Hopefully this change will reduce the number of times Christophe is
> needing to tweak the testsuite.
>

Thanks!

I guess this means we can now do some cleanup in the testsuite?
Did you have a quick look at the amount of tests involved?

Christophe

> --
>
> Arm processors can support up to two instruction sets.  Some early
> cores only support the traditional A32 (Arm) instructions, while some
> more recent devices only support T32 (Thumb) instructions.
>
> When configuring the compiler, --with-mode can be used to select the
> default instruction set to target if the user has not made an explicit
> choice, but this can cause needless problems if the default is not
> supported by the requested CPU.
>
> To fix this this patch adjusts the way that the --with-mode selection
> is processed so that it can take into account the selected CPU or
> architecture and not create a meaningless combination.
>
> gcc:
> * common/config/arm/arm-common.c: Include configargs.h.
> (arm_config_default): New function.
> (arm_target_mode): Renamed from arm_target_thumb_only.  Handle
> processors that do not support Thumb.  Take into account the
> --with-mode configuration setting for selecting the default.
> * config/arm/arm.h (OPTION_DEFAULT_SPECS): Remove entry for 'mode'.
> (TARGET_MODE_SPEC_FUNCTIONS): Update for function name change.
> ---
>  gcc/common/config/arm/arm-common.c | 49 ++
>  gcc/config/arm/arm.h   | 10 +++---
>  2 files changed, 49 insertions(+), 10 deletions(-)
>
>


Re: [PATCH] arm: Ignore --with-mode when CPU only supports one instruction set.

2021-03-03 Thread Richard Earnshaw (lists) via Gcc-patches
On 03/03/2021 14:11, Christophe Lyon via Gcc-patches wrote:
> On Wed, 3 Mar 2021 at 14:55, Richard Earnshaw (lists)
>  wrote:
>>
>> Hopefully this change will reduce the number of times Christophe is
>> needing to tweak the testsuite.
>>
> 
> Thanks!
> 
> I guess this means we can now do some cleanup in the testsuite?
> Did you have a quick look at the amount of tests involved?
> 

No, I wasn't expecting to change the existing tests again where you've
already done this.  But hopefully you won't need to do any more changes
for this reason in future.

R.

> Christophe
> 
>> --
>>
>> Arm processors can support up to two instruction sets.  Some early
>> cores only support the traditional A32 (Arm) instructions, while some
>> more recent devices only support T32 (Thumb) instructions.
>>
>> When configuring the compiler, --with-mode can be used to select the
>> default instruction set to target if the user has not made an explicit
>> choice, but this can cause needless problems if the default is not
>> supported by the requested CPU.
>>
>> To fix this this patch adjusts the way that the --with-mode selection
>> is processed so that it can take into account the selected CPU or
>> architecture and not create a meaningless combination.
>>
>> gcc:
>> * common/config/arm/arm-common.c: Include configargs.h.
>> (arm_config_default): New function.
>> (arm_target_mode): Renamed from arm_target_thumb_only.  Handle
>> processors that do not support Thumb.  Take into account the
>> --with-mode configuration setting for selecting the default.
>> * config/arm/arm.h (OPTION_DEFAULT_SPECS): Remove entry for 'mode'.
>> (TARGET_MODE_SPEC_FUNCTIONS): Update for function name change.
>> ---
>>  gcc/common/config/arm/arm-common.c | 49 ++
>>  gcc/config/arm/arm.h   | 10 +++---
>>  2 files changed, 49 insertions(+), 10 deletions(-)
>>
>>



Re: [PATCH] arm: Ignore --with-mode when CPU only supports one instruction set.

2021-03-03 Thread Christophe Lyon via Gcc-patches
On Wed, 3 Mar 2021 at 15:13, Richard Earnshaw (lists)
 wrote:
>
> On 03/03/2021 14:11, Christophe Lyon via Gcc-patches wrote:
> > On Wed, 3 Mar 2021 at 14:55, Richard Earnshaw (lists)
> >  wrote:
> >>
> >> Hopefully this change will reduce the number of times Christophe is
> >> needing to tweak the testsuite.
> >>
> >
> > Thanks!
> >
> > I guess this means we can now do some cleanup in the testsuite?
> > Did you have a quick look at the amount of tests involved?
> >
>
> No, I wasn't expecting to change the existing tests again where you've
> already done this.  But hopefully you won't need to do any more changes
> for this reason in future.
>
OK thanks!

> R.
>
> > Christophe
> >
> >> --
> >>
> >> Arm processors can support up to two instruction sets.  Some early
> >> cores only support the traditional A32 (Arm) instructions, while some
> >> more recent devices only support T32 (Thumb) instructions.
> >>
> >> When configuring the compiler, --with-mode can be used to select the
> >> default instruction set to target if the user has not made an explicit
> >> choice, but this can cause needless problems if the default is not
> >> supported by the requested CPU.
> >>
> >> To fix this this patch adjusts the way that the --with-mode selection
> >> is processed so that it can take into account the selected CPU or
> >> architecture and not create a meaningless combination.
> >>
> >> gcc:
> >> * common/config/arm/arm-common.c: Include configargs.h.
> >> (arm_config_default): New function.
> >> (arm_target_mode): Renamed from arm_target_thumb_only.  Handle
> >> processors that do not support Thumb.  Take into account the
> >> --with-mode configuration setting for selecting the default.
> >> * config/arm/arm.h (OPTION_DEFAULT_SPECS): Remove entry for 'mode'.
> >> (TARGET_MODE_SPEC_FUNCTIONS): Update for function name change.
> >> ---
> >>  gcc/common/config/arm/arm-common.c | 49 ++
> >>  gcc/config/arm/arm.h   | 10 +++---
> >>  2 files changed, 49 insertions(+), 10 deletions(-)
> >>
> >>
>


Re: [PATCH 1/5] std::latch: reduce internal implementation from ptrdiff_t to int

2021-03-03 Thread Jonathan Wakely via Gcc-patches

On 26/02/21 07:59 -0800, Thiago Macieira via Libstdc++ wrote:

ints can be used as futex on Linux. ptrdiff_t on 64-bit Linux can't.


Yes, we should do this for GCC 11.



libstdc++-v3/include/std/latch | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/include/std/latch b/libstdc++-v3/include/std/latch
index ef8c301e5e9..156aea5c5e5 100644
--- a/libstdc++-v3/include/std/latch
+++ b/libstdc++-v3/include/std/latch
@@ -48,7 +48,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  public:
static constexpr ptrdiff_t
max() noexcept
-{ return __gnu_cxx::__int_traits::__max; }
+{ return __gnu_cxx::__int_traits::__max; }

constexpr explicit latch(ptrdiff_t __expected) noexcept
  : _M_a(__expected) { }
@@ -85,7 +85,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
}

  private:
-alignas(__alignof__(ptrdiff_t)) ptrdiff_t _M_a;
+alignas(__alignof__(int)) int _M_a;
  };
_GLIBCXX_END_NAMESPACE_VERSION
} // namespace
--
2.30.1





Re: [PATCH 2/5] Atomic __platform_wait: accept any 32-bit type, not just int

2021-03-03 Thread Jonathan Wakely via Gcc-patches

On 26/02/21 07:59 -0800, Thiago Macieira via Libstdc++ wrote:

The kernel doesn't care what we store in those 32 bits, only that they are
comparable. This commit adds:
* pointers and long on 32-bit architectures
* unsigned
* untyped enums and typed enums on int & unsigned int
* float

We're not using FUTEX_OP anywhere today. The kernel reserves 4 bits for
this field but has only used 6 values so far, so it can be extended to
unsigned compares in the future, if needed.


And this one for GCC 11 too.


libstdc++-v3/include/bits/atomic_wait.h | 7 ---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/libstdc++-v3/include/bits/atomic_wait.h 
b/libstdc++-v3/include/bits/atomic_wait.h
index 424fccbe4c5..1c6bda2e2b6 100644
--- a/libstdc++-v3/include/bits/atomic_wait.h
+++ b/libstdc++-v3/include/bits/atomic_wait.h
@@ -65,7 +65,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
template
  inline constexpr bool __platform_wait_uses_type
#ifdef _GLIBCXX_HAVE_LINUX_FUTEX
-   = is_same_v, __platform_wait_t>;
+   = is_scalar_v> && sizeof(_Tp) == 
sizeof(__platform_wait_t)
+ && alignof(_Tp) >= alignof(__platform_wait_t);
#else
= false;
#endif
@@ -91,13 +92,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION

template
  void
-  __platform_wait(const _Tp* __addr, __platform_wait_t __val) noexcept
+  __platform_wait(const _Tp* __addr, _Tp __val) noexcept
  {
for(;;)
  {
auto __e = syscall (SYS_futex, static_cast(__addr),
  
static_cast(__futex_wait_flags::__wait_private),
-   __val, nullptr);
+ static_cast<__platform_wait_t>(__val), 
nullptr);
if (!__e || errno == EAGAIN)
  break;
else if (errno != EINTR)
--
2.30.1





Re: [PATCH 5/5] barrier: optimise by not having the hasher in a loop

2021-03-03 Thread Jonathan Wakely via Gcc-patches

On 26/02/21 07:59 -0800, Thiago Macieira via Libstdc++ wrote:

Our thread's ID does not change so we don't have to get it every time
and hash it every time.


This looks good for GCC 11.

(This one wouldn't be an ABI break to change later, but we might as
well do it now, as we have the patch now).



libstdc++-v3/include/std/barrier | 10 +-
1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/libstdc++-v3/include/std/barrier b/libstdc++-v3/include/std/barrier
index ae058bd3dc3..eb31a89b175 100644
--- a/libstdc++-v3/include/std/barrier
+++ b/libstdc++-v3/include/std/barrier
@@ -101,12 +101,10 @@ It looks different from literature pseudocode for two 
main reasons:
  }

  bool
-  _M_arrive(__barrier_phase_t __old_phase)
+  _M_arrive(__barrier_phase_t __old_phase, size_t __current)
  {
size_t __current_expected = _M_expected;
-   std::hash __hasher;
-   size_t __current = __hasher(std::this_thread::get_id())
- % ((_M_expected + 1) >> 1);
+__current %= ((__current_expected + 1) >> 1);

const auto __half_step = _S_add_to_phase(__old_phase, 1);
const auto __full_step = _S_add_to_phase(__old_phase, 2);
@@ -167,11 +165,13 @@ It looks different from literature pseudocode for two 
main reasons:
  [[nodiscard]] arrival_token
  arrive(ptrdiff_t __update)
  {
+   std::hash __hasher;
+   size_t __current = __hasher(std::this_thread::get_id());
__atomic_phase_ref_t __phase(_M_phase);
const auto __old_phase = __phase.load(memory_order_relaxed);
for(; __update; --__update)
  {
-   if(_M_arrive(__old_phase))
+   if(_M_arrive(__old_phase, __current))
  {
_M_completion();
_M_expected += 
_M_expected_adjustment.load(memory_order_relaxed);
--
2.30.1





[PATCH, C++, OG10, OpenACC/OpenMP, committed] Allow static constexpr fields in mappable types

2021-03-03 Thread Chung-Lin Tang

On 2020/1/21 12:49 AM, Jakub Jelinek wrote:

The OpenMP 4.5 definition of mappable type for C++ is that
   - All data members must be non-static.
among other requirements.  In OpenMP 5.0 that has been removed.
So, if we follow the 4.5 definition, it shouldn't change, if we follow 5.0
definition, the whole loop should be dropped, but in no case shall static
constexpr data members be treated any differently from any other static data
members.


We have merged the patch as is (only static constexprs) to devel/omp/gcc-10
for now. Its possible that the entire checking loop should be eventually removed
to allow the full 5.0 range, but wondered if things like (automatic) 
accessibility
of the static members within target regions is an issue to resolve?
For now, I've committed the patch in its current state to OG10.

Re-tested on OG10, and committed with an additional testcase (same for OpenMP)

Chung-Lin

cp/
* decl2.c (cp_omp_mappable_type_1): Allow fields with
DECL_DECLARED_CONSTEXPR_P to be mapped.

testsuite/
* g++.dg/goacc/static-constexpr-1.C: New test.
* g++.dg/gomp/static-constexpr-1.C: New test.

From 1c3f38b30c1db0aef5ccbf6d20fb5fd13785d482 Mon Sep 17 00:00:00 2001
From: Chung-Lin Tang 
Date: Wed, 3 Mar 2021 22:39:10 +0800
Subject: [PATCH] Allow static constexpr fields in mappable types for C++

This patch is a merge of:
https://gcc.gnu.org/legacy-ml/gcc-patches/2020-01/msg01246.html

Static members in general disqualify a C++ class from being target mappable,
but static constexprs are inline optimized away, so should not interfere.

OpenMP 5.0 in general lifts the static member limitation, so this
patch will probably further adjusted later.

2021-03-03  Chung-Lin Tang  

gcc/cp/ChangeLog:

* decl2.c (cp_omp_mappable_type_1): Allow fields with
DECL_DECLARED_CONSTEXPR_P to be mapped.

gcc/testsuite/ChangeLog:

* g++.dg/goacc/static-constexpr-1.C: New test.
* g++.dg/gomp/static-constexpr-1.C: New test.
---
 gcc/cp/decl2.c  |  5 -
 gcc/testsuite/g++.dg/goacc/static-constexpr-1.C | 17 +
 gcc/testsuite/g++.dg/gomp/static-constexpr-1.C  | 17 +
 3 files changed, 38 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/goacc/static-constexpr-1.C
 create mode 100644 gcc/testsuite/g++.dg/gomp/static-constexpr-1.C

diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index 5343ea3b068..872122fe83c 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -1460,7 +1460,10 @@ cp_omp_mappable_type_1 (tree type, bool notes)
 {
   tree field;
   for (field = TYPE_FIELDS (type); field; field = DECL_CHAIN (field))
-   if (VAR_P (field))
+   if (VAR_P (field)
+   /* Fields that are 'static constexpr' can be folded away at compile
+  time, thus does not interfere with mapping.  */
+   && !DECL_DECLARED_CONSTEXPR_P (field))
  {
if (notes)
  inform (DECL_SOURCE_LOCATION (field),
diff --git a/gcc/testsuite/g++.dg/goacc/static-constexpr-1.C 
b/gcc/testsuite/g++.dg/goacc/static-constexpr-1.C
new file mode 100644
index 000..edf5f1a7628
--- /dev/null
+++ b/gcc/testsuite/g++.dg/goacc/static-constexpr-1.C
@@ -0,0 +1,17 @@
+// { dg-do compile }
+// { dg-require-effective-target c++11 }
+
+/* Test that static constexpr members do not interfere with offloading.  */
+struct rec
+{
+  static constexpr int x = 1;
+  int y, z;
+};
+
+void foo (rec& r)
+{
+  #pragma acc parallel copy(r)
+  {
+r.y = r.y = r.x;
+  }
+}
diff --git a/gcc/testsuite/g++.dg/gomp/static-constexpr-1.C 
b/gcc/testsuite/g++.dg/gomp/static-constexpr-1.C
new file mode 100644
index 000..39eee92
--- /dev/null
+++ b/gcc/testsuite/g++.dg/gomp/static-constexpr-1.C
@@ -0,0 +1,17 @@
+// { dg-do compile }
+// { dg-require-effective-target c++11 }
+
+/* Test that static constexpr members do not interfere with offloading.  */
+struct rec
+{
+  static constexpr int x = 1;
+  int y, z;
+};
+
+void foo (rec& r)
+{
+  #pragma omp target map(r)
+  {
+r.y = r.y = r.x;
+  }
+}
-- 
2.17.1



Re: [PATCH] sparcv9: Disable -Wuninitialized warnings breaking bootstrap [PR92002]

2021-03-03 Thread Eric Botcazou
> sparcv9 bootstrap has been broken for 1 1/2 years now by spurious
> -Wuninitialized warnings:

IIRC we used to have 3 of them, now we have 2 so there is some progress. ;-)

> Before we ship yet another release with this issue, I suggest to at
> least include a workaround of demoting them to warnings.
> 
> Tested on sparcv9-sun-solaris2.11.
> 
> Ok for master?

Sure.

> I originally meant to propose the patch for the gcc-10 branch as well,
> but when I tried a sparcv9-sun-solaris2.11 bootstrap there some time
> ago, it wasn't affected any longer.

But release branches do not have -Werror set, do they?

-- 
Eric Botcazou




Re: [PATCH] profiling: fix streaming of TOPN counters

2021-03-03 Thread Martin Liška

On 3/3/21 12:26 PM, Jan Hubicka wrote:


libgcc/ChangeLog:

PR gcov-profile/99105
* libgcov-driver.c (write_top_counters): Rename to ...
(write_topn_counters): ... this.
(write_one_data): Pre-allocate buffer for number of items
in the corresponding linked lists.
* libgcov-merge.c (__gcov_merge_topn): Use renamed function.

gcc/testsuite/ChangeLog:

PR gcov-profile/99105
* gcc.dg/tree-prof/indir-call-prof-malloc.c: Use profile
correction as the wrapped malloc is called one more time
from libgcov.



for (unsigned i = 0; i < counters; i++)
  {
-  gcov_type pair_count = ci_ptr->values[GCOV_TOPN_MEM_COUNTERS * i + 1];
gcov_write_counter (ci_ptr->values[GCOV_TOPN_MEM_COUNTERS * i]);
-  gcov_write_counter (pair_count);
+  gcov_write_counter (list_sizes[i]);
gcov_type start = ci_ptr->values[GCOV_TOPN_MEM_COUNTERS * i + 2];
+
+  unsigned j = 0;
for (struct gcov_kvp *node = (struct gcov_kvp *)(intptr_t)start;
-  node != NULL; node = node->next)
+  node != NULL; node = node->next, j++)
{
  gcov_write_counter (node->value);
  gcov_write_counter (node->count);
+
+ /* Stop when we reach expected number of items.  */
+ if (j + 1 == list_sizes[i])
+   break;


Since you counted number of entries earlier, I would expect loop to
always terminate here and thus the node != NULL condition in for loop
above to be unnecesary.


Yes, fixed that.


}
  }
+
+  free (list_sizes);


We already have our own mmap allocator, so I wonder if we don't want to
avoid malloc/free pair here.  The counters are per-function, right?  I
wonder if they happen to be large on some bigger project, but it may
reduct risk of user messing this up with his own malloc/free
implementation if we used alloca for counts of reasonable size.


Good idea, I've also implemented that.


  }
  
  /* Write counters in GI_PTR and the summary in PRG to a gcda file. In

@@ -425,7 +446,7 @@ write_one_data (const struct gcov_info *gi_ptr,
  n_counts = ci_ptr->num;
  
  	  if (t_ix == GCOV_COUNTER_V_TOPN || t_ix == GCOV_COUNTER_V_INDIR)

-   write_top_counters (ci_ptr, t_ix, n_counts);
+   write_topn_counters (ci_ptr, t_ix, n_counts);
  else
{
  /* Do not stream when all counters are zero.  */
diff --git a/libgcc/libgcov-merge.c b/libgcc/libgcov-merge.c
index 7db188a4f4c..3379b688128 100644
--- a/libgcc/libgcov-merge.c
+++ b/libgcc/libgcov-merge.c
@@ -109,6 +109,7 @@ __gcov_merge_topn (gcov_type *counters, unsigned n_counters)
/* First value is number of total executions of the profiler.  */
gcov_type all = gcov_get_counter_ignore_scaling (-1);
gcov_type n = gcov_get_counter_ignore_scaling (-1);
+  gcc_assert (n <= GCOV_TOPN_MAXIMUM_TRACKED_VALUES);


I think in the runtime we do not want to have asserts checking
implementation correctness since it bloats it up.  So I would leave it
out.


Makes sense.



I wonder if we can have some testcase for parallel updating/streaming in
testsuite?


Well, I tried something like this:

#include 
#include "gcov.h"

#define LOOPS 100
#define THREADS 8

void *
worker (void *data)
{
  char memory[1024];
  for (unsigned i = 0; i < LOOPS; i++)
  {
  asm volatile("" ::: "memory");
  __builtin_memset (memory, 0, i % 15);
  }

  return NULL;
}

int main()
{
  pthread_t threads[THREADS];

  for (unsigned i = 0; i < THREADS; i++)
if (pthread_create (&threads[i], NULL, worker, NULL))
  return 0;

  __gcov_dump ();

  return 0;
}

But it's quite difficult to make a setup where we hit point when we append to 
the linked list.

I've just tested instrumented clang:
$ make -j128 check-clang
...
Testing Time: 365.93s
  Unsupported  :   787
  Passed   : 26513
  Expectedly Failed:25
[100%] Built target check-clang

May I install the patch?
Thanks,
Martin



Otherwise the patch looks good to me.
Honza
  
unsigned full = all < 0;

gcov_type *total = &counters[GCOV_TOPN_MEM_COUNTERS * i];
--
2.30.0





>From de61040043c30400a2b8662d81a24387f1a13b3f Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Tue, 16 Feb 2021 16:28:06 +0100
Subject: [PATCH] profiling: fix streaming of TOPN counters

libgcc/ChangeLog:

	PR gcov-profile/99105
	* libgcov-driver.c (write_top_counters): Rename to ...
	(write_topn_counters): ... this.
	(write_one_data): Pre-allocate buffer for number of items
	in the corresponding linked lists.
	* libgcov-merge.c (__gcov_merge_topn): Use renamed function.

gcc/testsuite/ChangeLog:

	PR gcov-profile/99105
	* gcc.dg/tree-prof/indir-call-prof-malloc.c: Use profile
	correction as the wrapped malloc is called one more time
	from libgcov.
	* gcc.dg/tree-prof/pr97461.c: Likewise.
---
 .../gcc.dg/tree-prof/indir-call-prof-malloc.c |  2 +-
 gcc/testsuite/gcc.dg/tree-prof/pr97461.c  |  2 +-
 libgcc/libgcov-dr

Re: [PATCH 1/5] std::latch: reduce internal implementation from ptrdiff_t to int

2021-03-03 Thread Jonathan Wakely via Gcc-patches

On 01/03/21 09:56 +0100, Richard Biener via Libstdc++ wrote:

On Sun, Feb 28, 2021 at 10:53 PM Hans-Peter Nilsson  wrote:




On Fri, 26 Feb 2021, Thiago Macieira via Gcc-patches wrote:

> On Friday, 26 February 2021 11:31:00 PST Andreas Schwab wrote:
> > On Feb 26 2021, Thiago Macieira wrote:
> > > On Friday, 26 February 2021 10:14:42 PST Andreas Schwab wrote:
> > >> On Feb 26 2021, Thiago Macieira via Gcc-patches wrote:
> > >> > -alignas(__alignof__(ptrdiff_t)) ptrdiff_t _M_a;
> > >> > +alignas(__alignof__(int)) int _M_a;
> > >>
> > >> Futexes must be aligned to 4 bytes.
> > >
> > > Agreed, but doesn't this accomplish that?
> >
> > No.  It uses whatever alignment the type already has, and is an
> > elaborate no-op.
>
> I thought so too when I read the original line. But I expected it was written
> like that for a reason, especially since the same pattern appears in other
> places.
>
> I can change to "alignas(4)" (which is a GCC extension, I believe). Is that
> the correct solution?

IMNSHO make use of the corresponding atomic type.  Then there'd
be no need for separate what's-the-right-align-curse games.


That won't work though, because we need direct access to the integer
object, not to a std::atomic which contains it.


Or align as the corresponding atomic type (in case using an actual
std::atomic is undesirable).  OTOH the proposed code isn't
any more bogus than the previous ...


The previous code accounts for the fact that ptrdiff_t is a typedef
for an unspecified type, and that some ABIs allow struct members to have
weaker alignment than they would have otherwise.

e.g. __alignof__(long long) != alignof(long long) on x86.

Yes, I know ptrdiff_t isn't long long on x86, but since it's a typedef
for some target-specific type, it's still possible that
alignof != __alignof__. So alignas(__alignof__(T)) is not necessarily
a no-op. So not bogus.

For int, there shouldn't be any need to force the alignment. I don't
think any ABI supported by GCC allows int members to be aligned to
less than __alignof__(int). Users could break it by compiling with
#pragma pack, but I have no sympathy for such silliness.



Re: [PATCH] middle-end/97855 - fix diagnostic with default pretty printer

2021-03-03 Thread David Malcolm via Gcc-patches
On Wed, 2021-03-03 at 08:48 +0100, Richard Biener wrote:
> On Tue, 2 Mar 2021, Martin Sebor wrote:
> 
> > On 3/2/21 9:52 AM, Jeff Law via Gcc-patches wrote:
> > > 
> > > 
> > > On 3/1/21 1:39 AM, Richard Biener wrote:
> > > > The default diagnostic tree printer relies on dump_generic_node
> > > > which
> > > > for some reason manages to clobber the diagnostic pretty-
> > > > printer state
> > > > so we see garbled diagnostics like
> > > > 
> > > > /home/rguenther/src/trunk/gcc/calls.c: In function
> > > > 'expand_call':
> > > > D.6750.coeffs[0]'/home/rguenther/src/trunk/gcc/dojump.c:118:28:
> > > > warning:
> > > > may be used uninitialized in this function [-Wmaybe-
> > > > uninitialized]
> > > > 
> > > > when the diagnostic is emitted by the LTO fronted.  The
> > > > following
> > > > approach using a temporary pretty-printer for the
> > > > dump_generic_node
> > > > call fixes this for some unknown reason and we issue
> > > > 
> > > > /home/rguenther/src/trunk/gcc/calls.c: In function
> > > > 'expand_call':
> > > > /home/rguenther/src/trunk/gcc/dojump.c:118:28: warning:
> > > > 'MEM[(struct
> > > > poly_int *)&save].D.6750.coeffs[0]' may be used uninitialized
> > > > in this
> > > > function [-Wmaybe-uninitialized]
> > > > 
> > > > [LTO] Bootstrapped and tested on x86_64-unknown-linux-gnu, OK
> > > > for trunk?
> > > > 
> > > > Thanks,
> > > > Richard.
> > > > 
> > > > 2021-02-26  Richard Biener  
> > > > 
> > > >  PR middle-end/97855
> > > >  * tree-diagnostic.c (default_tree_printer): Use a temporary
> > > >  pretty-printer when formatting a tree via dump_generic_node.
> > > It'd be good to know why this helps, but I trust your judgment
> > > that this
> > > is an improvement.
> > 
> > I don't know if it's related but pr98492 tracks a problem in the
> > C++
> > front end caused by reinitializing the pretty printer in a number
> > of
> > functions in cp/error.c.  When one of these functions is called
> > while
> > the pretty printer is formatting something, the effect of
> > the reinitialization is to drop the already formatted contents
> > of the printer's buffer.
> > 
> > IIRC, I tripped over this when working on the MEM_REF formatting
> > improvement for -Wuninitialized.
> 
> I've poked quite a bit with breakpoints on suspicious pretty-printer
> functions and watch points on the pp state but found nothing in the
> case I was looking at (curiously also -Wuninitialized).  But I also
> wasn't able to understand why the caller should work at all.  And
> yes, the C/C++ tree printers also simply format to the passed
> pretty-printer...
> 
> Hoping that David could shed some light on how this should play
> together.

This looks very much like the issue I ran into in
c46d057f55748520e819dcd8e04bca71be9902b2 (and, in retrospect, that
commit may have just been papering over the problem).

The issue there was that pp_printf is not reentrant - if a handler for
a pp_printf format code ends up making a nested call to pp_printf, I
got behavior that looks like what you're seeing.

That said, I've been poring over the output in PR middle-end/97855 and
comparing it to the various pretty-printer usage in the tree, and I'm
not seeing anywhere where a pp_printf seems to be used when generating:
  MEM[(struct poly_int *)&save + 8B].D.6750.coeffs[0]

Is there a minimal reproducer (or a .i file?)

Dave



>   Most specifically
> 
>   pp_format (context->printer, &diagnostic->message);
> 
> ^^^ this is the path affected by the patch
> 
>   (*diagnostic_starter (context)) (context, diagnostic);
> 
> ^^^ this somehow messes things up, it does pp_set_prefix on
> context->printer but also does some formatting
> 
>   pp_output_formatted_text (context->printer);
> 
> and now we expect this to magically output the composed pieces.
> 
> Note swapping the first two lines didn't have any effect (I don't
> remember if it changed anything so details might have changed but
> it was definitely still broken).
> 
> That said, the only hint I have is that the diagnostic plus prefix
> is quite long, but any problem in the generic code should eventually
> affect non-LTO as well but the PR is reported for LTO only
> (bogus diagnostics shown during LTO bootstrap).  The patch fixes
> all bogus diagnostics during LTO bootstrap.
> 
> I originally thought there's maybe a pp_flush too much but maybe
> there's a pp_flush missing ...
> 
> I'll wait for Davids feedback but will eventually install the
> patch to close the bug.
> 
> Richard.
> 




Re: [PATCH 1/5] std::latch: reduce internal implementation from ptrdiff_t to int

2021-03-03 Thread Andreas Schwab
On Mär 03 2021, Jonathan Wakely wrote:

> For int, there shouldn't be any need to force the alignment. I don't
> think any ABI supported by GCC allows int members to be aligned to
> less than __alignof__(int).

There is no requirement that __alignof__(int) is big enough.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."


Re: [PATCH] c++: Fix -fstrong-eval-order for operator &&, || and , [PR82959]

2021-03-03 Thread Jason Merrill via Gcc-patches

On 3/2/21 6:10 PM, Jakub Jelinek wrote:

Hi!

P0145R3 added
"However, the operands are sequenced in the order prescribed for the built-in
operator" rule for overloaded operator calls when using the operator syntax.
op_is_ordered follows that, but added just the overloaded operators
added in that paper.  &&, || and comma operators had rules that
lhs is sequenced before rhs already in C++98.
The following patch adds those cases to op_is_ordered.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?


OK.


2021-03-02  Jakub Jelinek  

PR c++/82959
* call.c (op_is_ordered): Handle TRUTH_ANDIF_EXPR, TRUTH_ORIF_EXPR
and COMPOUND_EXPR.

* g++.dg/cpp1z/eval-order10.C: New test.

--- gcc/cp/call.c.jj2021-02-19 12:14:10.965113839 +0100
+++ gcc/cp/call.c   2021-03-02 13:32:08.423924095 +0100
@@ -6083,6 +6083,15 @@ op_is_ordered (tree_code code)
  case LSHIFT_EXPR:
// 8. a >> b
  case RSHIFT_EXPR:
+  // a && b
+  // Predates P0145R3.
+case TRUTH_ANDIF_EXPR:
+  // a || b
+  // Predates P0145R3.
+case TRUTH_ORIF_EXPR:
+  // a , b
+  // Predates P0145R3.
+case COMPOUND_EXPR:
return (flag_strong_eval_order ? 1 : 0);
  
  default:

--- gcc/testsuite/g++.dg/cpp1z/eval-order10.C.jj2021-03-02 
14:15:21.735278240 +0100
+++ gcc/testsuite/g++.dg/cpp1z/eval-order10.C   2021-03-02 14:24:36.984143267 
+0100
@@ -0,0 +1,27 @@
+// PR c++/82959
+// { dg-do run }
+// { dg-additional-options -fstrong-eval-order }
+
+struct A {};
+
+void operator && (const A x, const A) {}
+void operator || (const A x, const A) {}
+void operator , (const A x, const A) {}
+
+int i;
+
+A f () { if (i != 0) __builtin_abort (); i = 1; return A (); }
+A g () { if (i != 1) __builtin_abort (); i = 2; return A (); }
+
+int
+main ()
+{
+  f () && g ();
+  if (i != 2) __builtin_abort ();
+  i = 0;
+  f () || g ();
+  if (i != 2) __builtin_abort ();
+  i = 0;
+  f (), g ();
+  if (i != 2) __builtin_abort ();
+}

Jakub





Re: [PATCH 1/5] std::latch: reduce internal implementation from ptrdiff_t to int

2021-03-03 Thread Jonathan Wakely via Gcc-patches

On 03/03/21 14:56 +, Jonathan Wakely wrote:

On 01/03/21 09:56 +0100, Richard Biener via Libstdc++ wrote:

On Sun, Feb 28, 2021 at 10:53 PM Hans-Peter Nilsson  wrote:




On Fri, 26 Feb 2021, Thiago Macieira via Gcc-patches wrote:


On Friday, 26 February 2021 11:31:00 PST Andreas Schwab wrote:
> On Feb 26 2021, Thiago Macieira wrote:
> > On Friday, 26 February 2021 10:14:42 PST Andreas Schwab wrote:
> >> On Feb 26 2021, Thiago Macieira via Gcc-patches wrote:
> >> > -alignas(__alignof__(ptrdiff_t)) ptrdiff_t _M_a;
> >> > +alignas(__alignof__(int)) int _M_a;
> >>
> >> Futexes must be aligned to 4 bytes.
> >
> > Agreed, but doesn't this accomplish that?
>
> No.  It uses whatever alignment the type already has, and is an
> elaborate no-op.

I thought so too when I read the original line. But I expected it was written
like that for a reason, especially since the same pattern appears in other
places.

I can change to "alignas(4)" (which is a GCC extension, I believe). Is that
the correct solution?


It's not a GCC extensions. You're thinking of alignas(obj) which is a
GCC extension.


IMNSHO make use of the corresponding atomic type.  Then there'd
be no need for separate what's-the-right-align-curse games.


That won't work though, because we need direct access to the integer
object, not to a std::atomic which contains it.


Or align as the corresponding atomic type (in case using an actual
std::atomic is undesirable).  OTOH the proposed code isn't
any more bogus than the previous ...


The previous code accounts for the fact that ptrdiff_t is a typedef
for an unspecified type, and that some ABIs allow struct members to have
weaker alignment than they would have otherwise.

e.g. __alignof__(long long) != alignof(long long) on x86.

Yes, I know ptrdiff_t isn't long long on x86, but since it's a typedef
for some target-specific type, it's still possible that
alignof != __alignof__. So alignas(__alignof__(T)) is not necessarily
a no-op. So not bogus.

For int, there shouldn't be any need to force the alignment. I don't
think any ABI supported by GCC allows int members to be aligned to
less than __alignof__(int). Users could break it by compiling with
#pragma pack, but I have no sympathy for such silliness.


Jakub said on IRC that m68k might have alignof(int) == 2, so we'd need
to increase that alignment to use it as a futex.

N.B. std::atomic and std::atomic_ref don't use alignas(__alignof__(T))
they do this instead:

  static_assert(is_trivially_copyable_v<_Tp>);

  // 1/2/4/8/16-byte types must be aligned to at least their size.
  static constexpr int _S_min_alignment
= (sizeof(_Tp) & (sizeof(_Tp) - 1)) || sizeof(_Tp) > 16
? 0 : sizeof(_Tp);

public:

  static constexpr size_t required_alignment
= _S_min_alignment > alignof(_Tp) ? _S_min_alignment : alignof(_Tp);

Using std::atomic_ref::required_alignment would give that value.

For something being used as a futex we should just use alignas(4)
though, since that's what the kernel requires.



Re: [PATCH] sparcv9: Disable -Wuninitialized warnings breaking bootstrap [PR92002]

2021-03-03 Thread Rainer Orth
Hi Eric,

>> sparcv9 bootstrap has been broken for 1 1/2 years now by spurious
>> -Wuninitialized warnings:
>
> IIRC we used to have 3 of them, now we have 2 so there is some progress. ;-)
>
>> Before we ship yet another release with this issue, I suggest to at
>> least include a workaround of demoting them to warnings.
>> 
>> Tested on sparcv9-sun-solaris2.11.
>> 
>> Ok for master?
>
> Sure.

Thanks, installed.

>> I originally meant to propose the patch for the gcc-10 branch as well,
>> but when I tried a sparcv9-sun-solaris2.11 bootstrap there some time
>> ago, it wasn't affected any longer.
>
> But release branches do not have -Werror set, do they?

Right, I forgot.  That's what you get for mostly ignoring older branches
except for a weekly bootstrap or two ;-)

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH] [libstdc++] Refactor/cleanup of atomic wait implementation

2021-03-03 Thread Jonathan Wakely via Gcc-patches

On 23/02/21 13:57 -0800, Thomas Rodgers wrote:

diff --git a/libstdc++-v3/include/bits/atomic_wait.h 
b/libstdc++-v3/include/bits/atomic_wait.h
index 1a0f0943ebd..fa83ef6c231 100644
--- a/libstdc++-v3/include/bits/atomic_wait.h
+++ b/libstdc++-v3/include/bits/atomic_wait.h
@@ -39,17 +39,16 @@
#include 

#ifdef _GLIBCXX_HAVE_LINUX_FUTEX
+#define _GLIBCXX_HAVE_PLATFORM_WAIT 1


This is defined here (to 1) and then ...


# include 
# include 
# include 
# include 
# include 
-// TODO get this from Autoconf
-# define _GLIBCXX_HAVE_LINUX_FUTEX_PRIVATE 1
-#else
-# include   // std::mutex, std::__condvar
#endif

+# include   // std::mutex, std::__condvar
+
#define __cpp_lib_atomic_wait 201907L

namespace std _GLIBCXX_VISIBILITY(default)
@@ -57,20 +56,27 @@ namespace std _GLIBCXX_VISIBILITY(default)
_GLIBCXX_BEGIN_NAMESPACE_VERSION
  namespace __detail
  {
+#ifdef _GLIBCXX_HAVE_LINUX_FUTEX
using __platform_wait_t = int;
+#else
+using __platform_wait_t = uint64_t;
+#endif
+  } // namespace __detail

-constexpr auto __atomic_spin_count_1 = 16;
-constexpr auto __atomic_spin_count_2 = 12;
-
-template
-  inline constexpr bool __platform_wait_uses_type
-#ifdef _GLIBCXX_HAVE_LINUX_FUTEX
-   = is_same_v, __platform_wait_t>;
+  template
+inline constexpr bool __platform_wait_uses_type
+#ifdef _GLIBCXX_HAVE_PLATFORM_WAIT
+  = is_same_v, __detail::__platform_wait_t>
+   || ((sizeof(_Tp) == sizeof(__detail::__platform_wait_t))
+   && (alignof(_Tp*) == alignof(__detail::__platform_wait_t)));
#else
-   = false;
+  = false;
#endif

+  namespace __detail
+  {
#ifdef _GLIBCXX_HAVE_LINUX_FUTEX
+#define _GLIBCXX_HAVE_PLATFORM_WAIT


Redefined here (to empty), after it's already been tested.

Presumably this redefinition shouldn't be here.

Also the HAVE_PLATFORM_TIMED_WAIT macro is defined to empty. I think
they should both be defined to 1 (or both empty, but not
inconsistently).

I'm still going through the rest of the patch.




Re: [PATCH] middle-end/97855 - fix diagnostic with default pretty printer

2021-03-03 Thread Richard Biener
On Wed, 3 Mar 2021, David Malcolm wrote:

> On Wed, 2021-03-03 at 08:48 +0100, Richard Biener wrote:
> > On Tue, 2 Mar 2021, Martin Sebor wrote:
> > 
> > > On 3/2/21 9:52 AM, Jeff Law via Gcc-patches wrote:
> > > > 
> > > > 
> > > > On 3/1/21 1:39 AM, Richard Biener wrote:
> > > > > The default diagnostic tree printer relies on dump_generic_node
> > > > > which
> > > > > for some reason manages to clobber the diagnostic pretty-
> > > > > printer state
> > > > > so we see garbled diagnostics like
> > > > > 
> > > > > /home/rguenther/src/trunk/gcc/calls.c: In function
> > > > > 'expand_call':
> > > > > D.6750.coeffs[0]'/home/rguenther/src/trunk/gcc/dojump.c:118:28:
> > > > > warning:
> > > > > may be used uninitialized in this function [-Wmaybe-
> > > > > uninitialized]
> > > > > 
> > > > > when the diagnostic is emitted by the LTO fronted.  The
> > > > > following
> > > > > approach using a temporary pretty-printer for the
> > > > > dump_generic_node
> > > > > call fixes this for some unknown reason and we issue
> > > > > 
> > > > > /home/rguenther/src/trunk/gcc/calls.c: In function
> > > > > 'expand_call':
> > > > > /home/rguenther/src/trunk/gcc/dojump.c:118:28: warning:
> > > > > 'MEM[(struct
> > > > > poly_int *)&save].D.6750.coeffs[0]' may be used uninitialized
> > > > > in this
> > > > > function [-Wmaybe-uninitialized]
> > > > > 
> > > > > [LTO] Bootstrapped and tested on x86_64-unknown-linux-gnu, OK
> > > > > for trunk?
> > > > > 
> > > > > Thanks,
> > > > > Richard.
> > > > > 
> > > > > 2021-02-26  Richard Biener  
> > > > > 
> > > > >  PR middle-end/97855
> > > > >  * tree-diagnostic.c (default_tree_printer): Use a temporary
> > > > >  pretty-printer when formatting a tree via dump_generic_node.
> > > > It'd be good to know why this helps, but I trust your judgment
> > > > that this
> > > > is an improvement.
> > > 
> > > I don't know if it's related but pr98492 tracks a problem in the
> > > C++
> > > front end caused by reinitializing the pretty printer in a number
> > > of
> > > functions in cp/error.c.  When one of these functions is called
> > > while
> > > the pretty printer is formatting something, the effect of
> > > the reinitialization is to drop the already formatted contents
> > > of the printer's buffer.
> > > 
> > > IIRC, I tripped over this when working on the MEM_REF formatting
> > > improvement for -Wuninitialized.
> > 
> > I've poked quite a bit with breakpoints on suspicious pretty-printer
> > functions and watch points on the pp state but found nothing in the
> > case I was looking at (curiously also -Wuninitialized).  But I also
> > wasn't able to understand why the caller should work at all.  And
> > yes, the C/C++ tree printers also simply format to the passed
> > pretty-printer...
> > 
> > Hoping that David could shed some light on how this should play
> > together.
> 
> This looks very much like the issue I ran into in
> c46d057f55748520e819dcd8e04bca71be9902b2 (and, in retrospect, that
> commit may have just been papering over the problem).
> 
> The issue there was that pp_printf is not reentrant - if a handler for
> a pp_printf format code ends up making a nested call to pp_printf, I
> got behavior that looks like what you're seeing.
> 
> That said, I've been poring over the output in PR middle-end/97855 and
> comparing it to the various pretty-printer usage in the tree, and I'm
> not seeing anywhere where a pp_printf seems to be used when generating:
>   MEM[(struct poly_int *)&save + 8B].D.6750.coeffs[0]

I think it's the D.6750 which is printed via

  else if (TREE_CODE (node) == DEBUG_EXPR_DECL)
{
  if (flags & TDF_NOUID)
pp_string (pp, "D#");
  else
pp_printf (pp, "D#%i", DEBUG_TEMP_UID (node));

because this is a DECL_DEBUG_EXPR.  One could experiment with
avoiding pp_printf in dump_decl_name.

> Is there a minimal reproducer (or a .i file?)

No, you need to do a LTO bootstrap, repeat the link step of
for example cc1 with -v -save-temps and pick an ltrans invocation
that exhibits the issue ...

I can poke at the above tomorrow again.  I suppose we could
also add some checking-assert into the pp_printf code at
the problematic place (or is any recursion bogus?) to catch
the case with an ICE.

Richard.

> Dave
> 
> 
> 
> >   Most specifically
> > 
> >   pp_format (context->printer, &diagnostic->message);
> > 
> > ^^^ this is the path affected by the patch
> > 
> >   (*diagnostic_starter (context)) (context, diagnostic);
> > 
> > ^^^ this somehow messes things up, it does pp_set_prefix on
> > context->printer but also does some formatting
> > 
> >   pp_output_formatted_text (context->printer);
> > 
> > and now we expect this to magically output the composed pieces.
> > 
> > Note swapping the first two lines didn't have any effect (I don't
> > remember if it changed anything so details might have changed but
> > it was definitely still broken).
> > 
> > That said, the only hint I have is that the diagnostic plus pref

Re: [PATCH 1/5] std::latch: reduce internal implementation from ptrdiff_t to int

2021-03-03 Thread Hans-Peter Nilsson
On Wed, 3 Mar 2021, Jonathan Wakely wrote:
> For int, there shouldn't be any need to force the alignment. I don't
> think any ABI supported by GCC allows int members to be aligned to
> less than __alignof__(int).

(sizeof(int) last)

The CRIS ABI does as in default packed, and ISTR there was some
other gcc port too, but I don't recall whether that (too) had no
(current) Linux port.

brgds, H-P


RE: [PATCH] testsuite: aarch64: Add tests for narrowing-arithmetic intrinsics

2021-03-03 Thread Kyrylo Tkachov via Gcc-patches



> -Original Message-
> From: Jonathan Wright 
> Sent: 03 March 2021 12:24
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov 
> Subject: [PATCH] testsuite: aarch64: Add tests for narrowing-arithmetic
> intrinsics
> 
> Hi,
> 
> As subject, this patch adds tests for v[r]addhn_high and v[r]subhn_high Neon
> intrinsics. Since these intrinsics are only supported for AArch64, these tests
> are restricted to only run on AArch64 targets.
> 
> Ok for master?

Ok. These plug holes in testing of existing functionality so I'd like them in 
at this stage.
Thanks,
Kyrill

> 
> Thanks,
> Jonathan
> 
> ---
> 
> gcc/testsuite/ChangeLog:
> 
> 2021-03-02  Jonathan Wright  
> 
> * gcc.target/aarch64/advsimd-intrinsics/vXXXhn_high.inc:
> New test template.
> * gcc.target/aarch64/advsimd-intrinsics/vaddhn_high.c:
> New test.
> * gcc.target/aarch64/advsimd-intrinsics/vraddhn_high.c:
> New test.
> * gcc.target/aarch64/advsimd-intrinsics/vrsubhn_high.c:
> New test.
> * gcc.target/aarch64/advsimd-intrinsics/vsubhn_high.c:
> New test.



RE: [PATCH] testsuite: aarch64: Add tests for v[r]shrn_high intrinsics

2021-03-03 Thread Kyrylo Tkachov via Gcc-patches



> -Original Message-
> From: Jonathan Wright 
> Sent: 03 March 2021 12:29
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov 
> Subject: [PATCH] testsuite: aarch64: Add tests for v[r]shrn_high intrinsics
> 
> Hi,
> 
> As subject, this patch adds tests for v[r]shrn_high Neon intrinsics. Since
> these intrinsics are only supported for AArch64, these tests are restricted
> to only run on AArch64 targets.
> 
> Ok for master?

Ok.
Thanks,
Kyrill

> 
> Thanks,
> Jonathan
> 
> ---
> 
> gcc/testsuite/ChangeLog:
> 
> 2021-03-02  Jonathan Wright  
> 
> * gcc.target/aarch64/advsimd-intrinsics/vrshrn_high_n.c:
> New test.
> * gcc.target/aarch64/advsimd-intrinsics/vshrn_high_n.c:
> New test.



RE: [PATCH] testsuite: aarch64: Add tests for v[q]mov[u]n_high intrinsics

2021-03-03 Thread Kyrylo Tkachov via Gcc-patches



> -Original Message-
> From: Jonathan Wright 
> Sent: 03 March 2021 12:33
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov 
> Subject: [PATCH] testsuite: aarch64: Add tests for v[q]mov[u]n_high intrinsics
> 
> Hi,
> 
> As subject, this patch adds tests for v[q]mov[u]n_high Neon intrinsics. Since
> these intrinsics are only supported for AArch64, these tests are restricted
> to only run on AArch64 targets.
> 
> Ok for master?

Ok.
Thanks,
Kyrill

> 
> Thanks,
> Jonathan
> 
> ---
> 
> gcc/testsuite/ChangeLog:
> 
> 2021-03-02  Jonathan Wright  
> 
> * gcc.target/aarch64/advsimd-intrinsics/vmovn_high.c:
> New test.
> * gcc.target/aarch64/advsimd-intrinsics/vqmovn_high.c:
> New test.
> * gcc.target/aarch64/advsimd-intrinsics/vqmovun_high.c:
> New test.



RE: [PATCH] testsuite: aarch64: Add tests for vcvt FP intrinsics

2021-03-03 Thread Kyrylo Tkachov via Gcc-patches



> -Original Message-
> From: Jonathan Wright 
> Sent: 03 March 2021 12:38
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov 
> Subject: [PATCH] testsuite: aarch64: Add tests for vcvt FP intrinsics
> 
> Hi,
> 
> As subject, this patch adds tests for vcvtx* and vcvt_fXX_fXX floating-point
> Neon intrinsics. Since these intrinsics are only supported for AArch64, these
> tests are restricted to only run on AArch64 targets.
> 
> Ok for master?

Ok.
Thanks,
Kyrill

> 
> Thanks,
> Jonathan
> 
> ---
> 
> gcc/testsuite/ChangeLog:
> 
> 2021-02-18  Jonathan Wright  
> 
> * gcc.target/aarch64/advsimd-intrinsics/vcvt_fXX_fXX.c:
> New test.
> * gcc.target/aarch64/advsimd-intrinsics/vcvtx.c:
> New test.



Re: [PATCH] middle-end/97855 - fix diagnostic with default pretty printer

2021-03-03 Thread Jakub Jelinek via Gcc-patches
On Wed, Mar 03, 2021 at 04:23:59PM +0100, Richard Biener wrote:
> I think it's the D.6750 which is printed via
> 
>   else if (TREE_CODE (node) == DEBUG_EXPR_DECL)
> {
>   if (flags & TDF_NOUID)
> pp_string (pp, "D#");
>   else
> pp_printf (pp, "D#%i", DEBUG_TEMP_UID (node));
> 
> because this is a DECL_DEBUG_EXPR.  One could experiment with
> avoiding pp_printf in dump_decl_name.

Sure,
  {
pp_string (pp, "D#");
pp_decimal_int (pp, DEBUG_TEMP_UID (node));
  }
(or pp_wide_int) looks like the way to go.
But dump_decl_name has several such pp_printf calls.

Jakub



Re: [PATCH 2/5] Atomic __platform_wait: accept any 32-bit type, not just int

2021-03-03 Thread Jonathan Wakely via Gcc-patches

On 03/03/21 14:34 +, Jonathan Wakely wrote:

On 26/02/21 07:59 -0800, Thiago Macieira via Libstdc++ wrote:

The kernel doesn't care what we store in those 32 bits, only that they are
comparable. This commit adds:
* pointers and long on 32-bit architectures
* unsigned
* untyped enums and typed enums on int & unsigned int
* float

We're not using FUTEX_OP anywhere today. The kernel reserves 4 bits for
this field but has only used 6 values so far, so it can be extended to
unsigned compares in the future, if needed.


And this one for GCC 11 too.


libstdc++-v3/include/bits/atomic_wait.h | 7 ---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/libstdc++-v3/include/bits/atomic_wait.h 
b/libstdc++-v3/include/bits/atomic_wait.h
index 424fccbe4c5..1c6bda2e2b6 100644
--- a/libstdc++-v3/include/bits/atomic_wait.h
+++ b/libstdc++-v3/include/bits/atomic_wait.h
@@ -65,7 +65,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 inline constexpr bool __platform_wait_uses_type
#ifdef _GLIBCXX_HAVE_LINUX_FUTEX
-   = is_same_v, __platform_wait_t>;
+   = is_scalar_v> && sizeof(_Tp) == 
sizeof(__platform_wait_t)


Oh, except that is_scalar is surprisingly expensive to instantiate
(its defined in a really expensive way) and since we control all uses
of this constant, I don't think we need it. It's only ever used from
atomic waiting functions which are only defined for scalar types.


+ && alignof(_Tp) >= alignof(__platform_wait_t);
#else
= false;
#endif
@@ -91,13 +92,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION

   template
 void
-  __platform_wait(const _Tp* __addr, __platform_wait_t __val) noexcept
+  __platform_wait(const _Tp* __addr, _Tp __val) noexcept
 {
for(;;)
  {
auto __e = syscall (SYS_futex, static_cast(__addr),
  
static_cast(__futex_wait_flags::__wait_private),
-   __val, nullptr);
+ static_cast<__platform_wait_t>(__val), 
nullptr);
if (!__e || errno == EAGAIN)
  break;
else if (errno != EINTR)
--
2.30.1





Re: [PATCH 6/6] c++: Consolidate REQUIRES_EXPR evaluation/diagnostic routines

2021-03-03 Thread Patrick Palka via Gcc-patches



On Tue, 2 Mar 2021, Jason Merrill wrote:

> On 3/2/21 11:45 AM, Patrick Palka wrote:
> > On Mon, 1 Mar 2021, Jason Merrill wrote:
> > 
> > > On 2/28/21 12:59 PM, Patrick Palka wrote:
> > > > This folds the diagnose_requires_expr routines into the corresponding
> > > > tsubst_requires_expr ones.  This is achieved by making the latter
> > > > routines take a sat_info instead of a subst_info, and assigning the
> > > > appropriate meanings to the flags sat_info::noisy and
> > > > sat_info::diagnose_unsatisfaction_p during tsubst_requires_expr:
> > > > info.noisy() controls whether to diagnose invalid types and expressions
> > > > inside the requires-expression, and info.diagnose_unsatisfaction_p()
> > > > controls whether to diagnose why the requires-expression evaluates to
> > > > false.
> > > > 
> > > > gcc/cp/ChangeLog:
> > > > 
> > > > * constraint.cc (struct sat_info): Document the different
> > > > meanings of noisy() and diagnose_unsatisfaction_p() during
> > > > satisfaction and requires-expression evaluation.
> > > > (tsubst_valid_expression_requirement): Take a sat_info instead
> > > > of a subst_info.  Perform the substitution quietly first.  Fold
> > > > in error-replaying code from diagnose_valid_expression.
> > > > (tsubst_simple_requirement): Take a sat_info instead of a
> > > > subst_info.
> > > > (tsubst_type_requirement_1): New.  Fold in error-replaying code
> > > > from diagnose_valid_type.
> > > > (tsubst_type_requirement): Use the above.  Take a sat_info
> > > > instead of a subst_info.
> > > > (tsubst_compound_requirement): Likewise.  Fold in
> > > > error-replaying code from diagnose_compound_requirement.
> > > > (tsubst_nested_requirement): Take a sat_info instead of a
> > > > subst_info.  Fold in error-replaying code from
> > > > diagnose_nested_requirement.
> > > > (tsubst_requirement): Take a sat_info instead of a subst_info.
> > > > (tsubst_requires_expr): Split into two versions, one that takes
> > > > a sat_info argument and another that takes a complain and
> > > > in_decl argument.  Remove outdated documentation.  Document the
> > > > effects of the sat_info argument.
> > > > (diagnose_trait_expr): Make static.  Take a template argument
> > > > vector instead of a parameter mapping.
> > > > (diagnose_valid_expression): Remove.
> > > > (diagnose_valid_type): Remove.
> > > > (diagnose_simple_requirement): Remove.
> > > > (diagnose_compound_requirement): Remove.
> > > > (diagnose_type_requirement): Remove.
> > > > (diagnose_nested_requirement): Remove.
> > > > (diagnose_requirement): Remove.
> > > > (diagnose_requires_expr): Remove.
> > > > (diagnose_atomic_constraint): Take a sat_info instead of a
> > > > subst_info.  Adjust call to diagnose_trait_expr.  Call
> > > > tsubst_requires_expr instead of diagnose_requires_expr.
> > > > (diagnose_constraints): Call tsubst_requires_expr instead of
> > > > diagnose_requires_expr.
> > > > 
> > > > gcc/testsuite/ChangeLog:
> > > > 
> > > > * g++.dg/concepts/diagnostic1.C: Adjust expected diagnostics
> > > > now that we diagnose only the first failed requirement of a
> > > > requires-expression.
> > > > ---
> > > >gcc/cp/constraint.cc| 416
> > > > +---
> > > >gcc/testsuite/g++.dg/concepts/diagnostic1.C |   2 +-
> > > >2 files changed, 179 insertions(+), 239 deletions(-)
> > > > 
> > > > diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
> > > > index cf319b34da0..31f32c25dfe 100644
> > > > --- a/gcc/cp/constraint.cc
> > > > +++ b/gcc/cp/constraint.cc
> > > > @@ -100,17 +100,30 @@ struct subst_info
> > > >  /* Provides additional context for satisfaction.
> > > >-   The flag noisy() controls whether to diagnose ill-formed
> > > > satisfaction,
> > > > -   such as the satisfaction value of an atom being non-bool or
> > > > non-constant.
> > > > -
> > > > -   The flag diagnose_unsatisfaction_p() controls whether to explain why
> > > > -   a constraint is not satisfied.
> > > > -
> > > > -   The entrypoints to satisfaction for which we set noisy+unsat are
> > > > -   diagnose_constraints and diagnose_nested_requirement.  The
> > > > entrypoint
> > > > for
> > > > -   which we set noisy-unsat is the replay inside
> > > > constraint_satisfaction_value.
> > > > -   From constraints_satisfied_p, we enter satisfaction quietly (both
> > > > flags
> > > > -   cleared).  */
> > > > +   During satisfaction:
> > > > +- The flag noisy() controls whether to diagnose ill-formed
> > > > satisfaction,
> > > > +  such as the satisfaction value of an atom being non-bool or
> > > > non-constant.
> > > > +- The flag diagnose_unsatisfaction_p() controls whether to explain

Re: [PATCH 1/5] std::latch: reduce internal implementation from ptrdiff_t to int

2021-03-03 Thread Thiago Macieira via Gcc-patches
On Wednesday, 3 March 2021 06:34:26 PST Jonathan Wakely wrote:
> On 26/02/21 07:59 -0800, Thiago Macieira via Libstdc++ wrote:
> >ints can be used as futex on Linux. ptrdiff_t on 64-bit Linux can't.
> 
> Yes, we should do this for GCC 11.

Want me to re-submit this one alone, with the "alignas(4)" with a commend 
indicating it's what the kernel requires?

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel DPG Cloud Engineering





Re: [RFC] ldist: Recognize rawmemchr loop patterns

2021-03-03 Thread Stefan Schulze Frielinghaus via Gcc-patches
On Tue, Mar 02, 2021 at 01:29:59PM +0100, Richard Biener wrote:
> On Sun, Feb 14, 2021 at 11:27 AM Stefan Schulze Frielinghaus
>  wrote:
> >
> > On Tue, Feb 09, 2021 at 09:57:58AM +0100, Richard Biener wrote:
> > > On Mon, Feb 8, 2021 at 3:11 PM Stefan Schulze Frielinghaus via
> > > Gcc-patches  wrote:
> > > >
> > > > This patch adds support for recognizing loops which mimic the behaviour
> > > > of function rawmemchr, and replaces those with an internal function call
> > > > in case a target provides them.  In contrast to the original rawmemchr
> > > > function, this patch also supports different instances where the memory
> > > > pointed to and the pattern are interpreted as 8, 16, and 32 bit sized,
> > > > respectively.
> > > >
> > > > This patch is not final and I'm looking for some feedback:
> > > >
> > > > Previously, only loops which mimic the behaviours of functions memset,
> > > > memcpy, and memmove have been detected and replaced by corresponding
> > > > function calls.  One characteristic of those loops/partitions is that
> > > > they don't have a reduction.  In contrast, loops which mimic the
> > > > behaviour of rawmemchr compute a result and therefore have a reduction.
> > > > My current attempt is to ensure that the reduction statement is not used
> > > > in any other partition and only in that case ignore the reduction and
> > > > replace the loop by a function call.  We then only need to replace the
> > > > reduction variable of the loop which contained the loop result by the
> > > > variable of the lhs of the internal function call.  This should ensure
> > > > that the transformation is correct independently of how partitions are
> > > > fused/distributed in the end.  Any thoughts about this?
> > >
> > > Currently we're forcing reduction partitions last (and force to have a 
> > > single
> > > one by fusing all partitions containing a reduction) because 
> > > code-generation
> > > does not properly update SSA form for the reduction results.  ISTR that
> > > might be just because we do not copy the LC PHI nodes or do not adjust
> > > them when copying.  That might not be an issue in case you replace the
> > > partition with a call.  I guess you can try to have a testcase with
> > > two rawmemchr patterns and a regular loop part that has to be scheduled
> > > inbetween both for correctness.
> >
> > Ah ok, in that case I updated my patch by removing the constraint that
> > the reduction statement must be in precisely one partition.  Please find
> > attached the testcases I came up so far.  Since transforming a loop into
> > a rawmemchr function call is backend dependend, I planned to include
> > those only in my backend patch.  I wasn't able to come up with any
> > testcase where a loop is distributed into multiple partitions and where
> > one is classified as a rawmemchr builtin.  The latter boils down to a
> > for loop with an empty body only in which case I suspect that loop
> > distribution shouldn't be done anyway.
> >
> > > > Furthermore, I simply added two new members (pattern, fn) to structure
> > > > builtin_info which I consider rather hacky.  For the long run I thought
> > > > about to split up structure builtin_info into a union where each member
> > > > is a structure for a particular builtin of a partition, i.e., something
> > > > like this:
> > > >
> > > > union builtin_info
> > > > {
> > > >   struct binfo_memset *memset;
> > > >   struct binfo_memcpymove *memcpymove;
> > > >   struct binfo_rawmemchr *rawmemchr;
> > > > };
> > > >
> > > > Such that a structure for one builtin does not get "polluted" by a
> > > > different one.  Any thoughts about this?
> > >
> > > Probably makes sense if the list of recognized patterns grow further.
> > >
> > > I see you use internal functions rather than builtin functions.  I guess
> > > that's OK.  But you use new target hooks for expansion where I think
> > > new optab entries similar to cmpmem would be more appropriate
> > > where the distinction between 8, 16 or 32 bits can be encoded in
> > > the modes.
> >
> > The optab implementation is really nice which allows me to use iterators
> > in the backend which in the end saves me some boiler plate code compared
> > to the previous implementation :)
> >
> > While using optabs now, I only require one additional member (pattern)
> > in the builtin_info struct.  Thus I didn't want to overcomplicate things
> > and kept the single struct approach as is.
> >
> > For the long run, should I resubmit this patch once stage 1 opens or how
> > would you propose to proceed?
> 
> Yes, and sorry for the delay.  Few comments on the patch given I had a
> quick look:
> 
> +void
> +expand_RAWMEMCHR (internal_fn, gcall *stmt)
> +{
> +  expand_operand ops[3];
> +
> +  tree lhs = gimple_call_lhs (stmt);
> 
> I think that give we have people testing with -fno-tree-dce you
> should try to handle a NULL LHS gracefully.  I suppose by
> simply doing nothing.
> 
> +  tree result_old = build_fold_addr_expr (DR_REF (d

Re: [PATCH 1/5] std::latch: reduce internal implementation from ptrdiff_t to int

2021-03-03 Thread Jonathan Wakely via Gcc-patches

On 03/03/21 09:14 -0800, Thiago Macieira via Libstdc++ wrote:

On Wednesday, 3 March 2021 06:34:26 PST Jonathan Wakely wrote:

On 26/02/21 07:59 -0800, Thiago Macieira via Libstdc++ wrote:
>ints can be used as futex on Linux. ptrdiff_t on 64-bit Linux can't.

Yes, we should do this for GCC 11.


Want me to re-submit this one alone, with the "alignas(4)" with a commend
indicating it's what the kernel requires?


Yes please.



[PATCH] libstdc++: Avoid accidental ADL when calling make_reverse_iterator

2021-03-03 Thread Moritz Sichert via Gcc-patches

std::ranges::reverse_view uses make_reverse_iterator in its
implementation as described in [range.reverse.view]. This accidentally
allows ADL as an unqualified name is used in the call. According to
[contents], however, this should be treated as a qualified lookup into
the std namespace.

This leads to errors due to ambiguous name lookups when another
make_reverse_iterator function is found via ADL.

I found this as llvm has its own implementation of ranges which also has 
llvm::make_reverse_iterator. This lead to a compile error with 
std::vector | std::ranges::views::reverse.

Best,
Moritz
From f9aaf991c75047f83f12a98311cb62278c32dcda Mon Sep 17 00:00:00 2001
From: Moritz Sichert 
Date: Wed, 3 Mar 2021 18:14:28 +0100
Subject: [PATCH] libstdc++: Avoid accidental ADL when calling
 make_reverse_iterator

std::ranges::reverse_view uses make_reverse_iterator in its
implementation as described in [range.reverse.view]. This accidentally
allows ADL as an unqualified name is used in the call. According to
[contents], however, this should be treated as a qualified lookup into
the std namespace.

This leads to errors due to ambiguous name lookups when another
make_reverse_iterator function is found via ADL.

libstdc++-v3/Changelog:

	* include/std/ranges: Avoid accidental ADL when calling
make_reverse_iterator.
* testsuite/std/ranges/adaptors/reverse_no_adl.cc: New test.
Test that views::reverse works when make_reverse_iterator is
defined in a namespace that could be found via ADL.
---
 libstdc++-v3/include/std/ranges   | 12 +++---
 .../std/ranges/adaptors/reverse_no_adl.cc | 39 +++
 2 files changed, 45 insertions(+), 6 deletions(-)
 create mode 100644 libstdc++-v3/testsuite/std/ranges/adaptors/reverse_no_adl.cc

diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
index 1be74beb860..adbc6d7b274 100644
--- a/libstdc++-v3/include/std/ranges
+++ b/libstdc++-v3/include/std/ranges
@@ -2958,29 +2958,29 @@ namespace views
   {
 	if constexpr (_S_needs_cached_begin)
 	  if (_M_cached_begin._M_has_value())
-	return make_reverse_iterator(_M_cached_begin._M_get(_M_base));
+	return std::make_reverse_iterator(_M_cached_begin._M_get(_M_base));
 
 	auto __it = ranges::next(ranges::begin(_M_base), ranges::end(_M_base));
 	if constexpr (_S_needs_cached_begin)
 	  _M_cached_begin._M_set(_M_base, __it);
-	return make_reverse_iterator(std::move(__it));
+	return std::make_reverse_iterator(std::move(__it));
   }
 
   constexpr auto
   begin() requires common_range<_Vp>
-  { return make_reverse_iterator(ranges::end(_M_base)); }
+  { return std::make_reverse_iterator(ranges::end(_M_base)); }
 
   constexpr auto
   begin() const requires common_range
-  { return make_reverse_iterator(ranges::end(_M_base)); }
+  { return std::make_reverse_iterator(ranges::end(_M_base)); }
 
   constexpr reverse_iterator>
   end()
-  { return make_reverse_iterator(ranges::begin(_M_base)); }
+  { return std::make_reverse_iterator(ranges::begin(_M_base)); }
 
   constexpr auto
   end() const requires common_range
-  { return make_reverse_iterator(ranges::begin(_M_base)); }
+  { return std::make_reverse_iterator(ranges::begin(_M_base)); }
 
   constexpr auto
   size() requires sized_range<_Vp>
diff --git a/libstdc++-v3/testsuite/std/ranges/adaptors/reverse_no_adl.cc b/libstdc++-v3/testsuite/std/ranges/adaptors/reverse_no_adl.cc
new file mode 100644
index 000..9aa96c7d40e
--- /dev/null
+++ b/libstdc++-v3/testsuite/std/ranges/adaptors/reverse_no_adl.cc
@@ -0,0 +1,39 @@
+// Copyright (C) 2020-2021 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// .
+
+// { dg-options "-std=gnu++2a" }
+// { dg-do run { target c++2a } }
+
+#include 
+
+// This test tests that views::reverse works and does not use ADL which could
+// lead to accidentally finding test_ns::make_reverse_iterator(const A&).
+
+namespace test_ns
+{
+  struct A {};
+  template 
+  void make_reverse_iterator(T&&) {}
+} // namespace test_ns
+
+void test()
+{
+  test_ns::A as[] = {{}, {}};
+  auto v = as | std::views::reverse;
+  static_assert(std::ranges::view);
+}
+
-- 
2.30.1



Re: [PATCH 2/5] Atomic __platform_wait: accept any 32-bit type, not just int

2021-03-03 Thread Thiago Macieira via Gcc-patches
On Wednesday, 3 March 2021 08:21:51 PST Jonathan Wakely wrote:
> >>-   = is_same_v, __platform_wait_t>;
> >>+   = is_scalar_v> && sizeof(_Tp) ==
> >>sizeof(__platform_wait_t)
> Oh, except that is_scalar is surprisingly expensive to instantiate
> (its defined in a really expensive way) and since we control all uses
> of this constant, I don't think we need it. It's only ever used from
> atomic waiting functions which are only defined for scalar types.

Thanks. Will update and re-submit.

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel DPG Cloud Engineering





Re: [PATCH] [libstdc++] Refactor/cleanup of atomic wait implementation

2021-03-03 Thread Jonathan Wakely via Gcc-patches

On 23/02/21 13:57 -0800, Thomas Rodgers wrote:

From: Thomas Rodgers 

* This revises the previous version to fix std::__condvar::wait_until() usage.

This is a substantial rewrite of the atomic wait/notify (and timed wait
counterparts) implementation.

The previous __platform_wait looped on EINTR however this behavior is
not required by the standard. A new _GLIBCXX_HAVE_PLATFORM_WAIT macro
now controls whether wait/notify are implemented using a platform
specific primitive or with a platform agnostic mutex/condvar. This
patch only supplies a definition for linux futexes. A future update
could add support __ulock_wait/wake on Darwin, for instance.

The members of __waiters were lifted to a new base class. The members
are now arranged such that overall sizeof(__waiters_base) fits in two
cache lines (on platforms with at least 64 byte cache lines). The
definition will also use destructive_interference_size for this if it
is available.

The __waiters type is now specific to untimed waits. Timed waits have a
corresponding __timed_waiters type. Much of the code has been moved from
the previous __atomic_wait() free function to the __waiter_base template
and a __waiter derived type is provided to implement the un-timed wait
operations. A similar change has been made to the timed wait
implementation.

The __atomic_spin code has been extended to take a spin policy which is
invoked after the initial busy wait loop. The default policy is to
return from the spin. The timed wait code adds a timed backoff spinning
policy. The code from  which implements this_thread::sleep_for,
sleep_until has been moved to a new  header
which allows the thread sleep code to be consumed without pulling in the
whole of .

The entry points into the wait/notify code have been restructured to
support either -
  * Testing the current value of the atomic stored at the given address
and waiting on a notification.
  * Applying a predicate to determine if the wait was satisfied.
The entry points were renamed to make it clear that the wait and wake
operations operate on addresses. The first variant takes the expected
value and a function which returns the current value that should be used
in comparison operations, these operations are named with a _v suffix
(e.g. 'value'). All atomic<_Tp> wait/notify operations use the first
variant. Barriers, latches and semaphores use the predicate variant.

This change also centralizes what it means to compare values for the
purposes of atomic::wait rather than scattering through individual
predicates.

This change also centralizes the repetitive code which adjusts for
different user supplied clocks (this should be moved elsewhere
and all such adjustments should use a common implementation).

libstdc++-v3/ChangeLog:
* include/Makefile.am: Add new  header.
* include/Makefile.in: Regenerate.
* include/bits/atomic_base.h: Adjust all calls
to __atomic_wait/__atomic_notify for new call signatures.
* include/bits/atomic_wait.h: Extensive rewrite.
* include/bits/atomic_timed_wait.h: Likewise.
* include/bits/semaphore_base.h: Adjust all calls
to __atomic_wait/__atomic_notify for new call signatures.
* include/bits/std_thread_sleep.h: New file.
* include/std/atomic: Likewise.
* include/std/barrier: Likewise.
* include/std/latch: Likewise.
* testsuite/29_atomics/atomic/wait_notify/bool.cc: Simplify
test.
* testsuite/29_atomics/atomic/wait_notify/generic.cc: Likewise.
* testsuite/29_atomics/atomic/wait_notify/pointers.cc: Likewise.
* testsuite/29_atomics/atomic_flag/wait_notify.cc: Likewise.
* testsuite/29_atomics/atomic_float/wait_notify.cc: Likewise.
* testsuite/29_atomics/atomic_integral/wait_notify.cc: Likewise.
* testsuite/29_atomics/atomic_ref/wait_notify.cc: Likewise.


Some of this diff is very confusing, where the context being shown as
removed is actually a completely different function. Please try
--diff-algorithm=histogram for the next version of this patch. It
might make it easier to read.


+struct __timed_backoff_spin_policy
+{
+  __wait_clock_t::time_point _M_deadline;
+  __wait_clock_t::time_point _M_t0;
+
+  template
+   __timed_backoff_spin_policy(chrono::time_point<_Clock, _Dur>
+ __deadline = _Clock::time_point::max(),
+   chrono::time_point<_Clock, _Dur>
+ __t0 = _Clock::now()) noexcept
+ : _M_deadline(__to_wait_clock(__deadline))
+ , _M_t0(__to_wait_clock(__t0))


If this policy object is constructed with a time_point using the
steady_clock then it will still call __to_wait_clock to convert it to
the steady_clock, making multiple unnecessary (and expensive) calls to
steady_clock::now().

I think you either need to overload the constructor or overload
__to_wait_clock.


+   { }
+
+  bool
+

Re: [PATCH 2/5] Atomic __platform_wait: accept any 32-bit type, not just int

2021-03-03 Thread Ville Voutilainen via Gcc-patches
On Wed, 3 Mar 2021 at 19:25, Jonathan Wakely via Libstdc++
 wrote:
> Oh, except that is_scalar is surprisingly expensive to instantiate
> (its defined in a really expensive way) and since we control all uses

I'll be more than happy to write you an __is_scalar for GCC 12. :P


Re: [PATCH 2/5] Atomic __platform_wait: accept any 32-bit type, not just int

2021-03-03 Thread Jonathan Wakely via Gcc-patches

On 03/03/21 19:34 +0200, Ville Voutilainen wrote:

On Wed, 3 Mar 2021 at 19:25, Jonathan Wakely via Libstdc++
 wrote:

Oh, except that is_scalar is surprisingly expensive to instantiate
(its defined in a really expensive way) and since we control all uses


I'll be more than happy to write you an __is_scalar for GCC 12. :P


That would help with https://gcc.gnu.org/PR96710 too. The current
implementation is inefficient and arguably wrong.




Re: [PATCH] middle-end/97855 - fix diagnostic with default pretty printer

2021-03-03 Thread David Malcolm via Gcc-patches
On Wed, 2021-03-03 at 16:23 +0100, Richard Biener wrote:
> On Wed, 3 Mar 2021, David Malcolm wrote:
> 
> > On Wed, 2021-03-03 at 08:48 +0100, Richard Biener wrote:
> > > On Tue, 2 Mar 2021, Martin Sebor wrote:
> > > 
> > > > On 3/2/21 9:52 AM, Jeff Law via Gcc-patches wrote:
> > > > > 
> > > > > 
> > > > > On 3/1/21 1:39 AM, Richard Biener wrote:
> > > > > > The default diagnostic tree printer relies on
> > > > > > dump_generic_node
> > > > > > which
> > > > > > for some reason manages to clobber the diagnostic pretty-
> > > > > > printer state
> > > > > > so we see garbled diagnostics like
> > > > > > 
> > > > > > /home/rguenther/src/trunk/gcc/calls.c: In function
> > > > > > 'expand_call':
> > > > > > D.6750.coeffs[0]'/home/rguenther/src/trunk/gcc/dojump.c:118
> > > > > > :28:
> > > > > > warning:
> > > > > > may be used uninitialized in this function [-Wmaybe-
> > > > > > uninitialized]
> > > > > > 
> > > > > > when the diagnostic is emitted by the LTO fronted.  The
> > > > > > following
> > > > > > approach using a temporary pretty-printer for the
> > > > > > dump_generic_node
> > > > > > call fixes this for some unknown reason and we issue
> > > > > > 
> > > > > > /home/rguenther/src/trunk/gcc/calls.c: In function
> > > > > > 'expand_call':
> > > > > > /home/rguenther/src/trunk/gcc/dojump.c:118:28: warning:
> > > > > > 'MEM[(struct
> > > > > > poly_int *)&save].D.6750.coeffs[0]' may be used
> > > > > > uninitialized
> > > > > > in this
> > > > > > function [-Wmaybe-uninitialized]
> > > > > > 
> > > > > > [LTO] Bootstrapped and tested on x86_64-unknown-linux-gnu,
> > > > > > OK
> > > > > > for trunk?
> > > > > > 
> > > > > > Thanks,
> > > > > > Richard.
> > > > > > 
> > > > > > 2021-02-26  Richard Biener  
> > > > > > 
> > > > > >  PR middle-end/97855
> > > > > >  * tree-diagnostic.c (default_tree_printer): Use a
> > > > > > temporary
> > > > > >  pretty-printer when formatting a tree via
> > > > > > dump_generic_node.
> > > > > It'd be good to know why this helps, but I trust your
> > > > > judgment
> > > > > that this
> > > > > is an improvement.
> > > > 
> > > > I don't know if it's related but pr98492 tracks a problem in
> > > > the
> > > > C++
> > > > front end caused by reinitializing the pretty printer in a
> > > > number
> > > > of
> > > > functions in cp/error.c.  When one of these functions is called
> > > > while
> > > > the pretty printer is formatting something, the effect of
> > > > the reinitialization is to drop the already formatted contents
> > > > of the printer's buffer.
> > > > 
> > > > IIRC, I tripped over this when working on the MEM_REF
> > > > formatting
> > > > improvement for -Wuninitialized.
> > > 
> > > I've poked quite a bit with breakpoints on suspicious pretty-
> > > printer
> > > functions and watch points on the pp state but found nothing in
> > > the
> > > case I was looking at (curiously also -Wuninitialized).  But I
> > > also
> > > wasn't able to understand why the caller should work at all.  And
> > > yes, the C/C++ tree printers also simply format to the passed
> > > pretty-printer...
> > > 
> > > Hoping that David could shed some light on how this should play
> > > together.
> > 
> > This looks very much like the issue I ran into in
> > c46d057f55748520e819dcd8e04bca71be9902b2 (and, in retrospect, that
> > commit may have just been papering over the problem).
> > 
> > The issue there was that pp_printf is not reentrant - if a handler
> > for
> > a pp_printf format code ends up making a nested call to pp_printf,
> > I
> > got behavior that looks like what you're seeing.
> > 
> > That said, I've been poring over the output in PR middle-end/97855
> > and
> > comparing it to the various pretty-printer usage in the tree, and
> > I'm
> > not seeing anywhere where a pp_printf seems to be used when
> > generating:
> >   MEM[(struct poly_int *)&save + 8B].D.6750.coeffs[0]
> 
> I think it's the D.6750 which is printed via
> 
>   else if (TREE_CODE (node) == DEBUG_EXPR_DECL)
>     {
>   if (flags & TDF_NOUID)
>     pp_string (pp, "D#");
>   else
>     pp_printf (pp, "D#%i", DEBUG_TEMP_UID (node));
> 
> because this is a DECL_DEBUG_EXPR.  

Wouldn't that print
  D#6750
rather than
  D.6750
?


> One could experiment with
> avoiding pp_printf in dump_decl_name.
> 
> > Is there a minimal reproducer (or a .i file?)
> 
> No, you need to do a LTO bootstrap, repeat the link step of
> for example cc1 with -v -save-temps and pick an ltrans invocation
> that exhibits the issue ...
> 
> I can poke at the above tomorrow again.  I suppose we could
> also add some checking-assert into the pp_printf code at
> the problematic place (or is any recursion bogus?) to catch
> the case with an ICE.
> 
> Richard.
> 
> > Dave
> > 
> > 
> > 
> > >   Most specifically
> > > 
> > >   pp_format (context->printer, &diagnostic->message);
> > > 
> > > ^^^ this is the path affected by the patch
> > > 
> > >   (*diagnostic_starter (context)) (context, diagnostic

Re: [PATCH] libstdc++: Avoid accidental ADL when calling make_reverse_iterator

2021-03-03 Thread Patrick Palka via Gcc-patches
On Wed, 3 Mar 2021, Moritz Sichert via Libstdc++ wrote:

> std::ranges::reverse_view uses make_reverse_iterator in its
> implementation as described in [range.reverse.view]. This accidentally
> allows ADL as an unqualified name is used in the call. According to
> [contents], however, this should be treated as a qualified lookup into
> the std namespace.
> 
> This leads to errors due to ambiguous name lookups when another
> make_reverse_iterator function is found via ADL.
> 
> I found this as llvm has its own implementation of ranges which also has
> llvm::make_reverse_iterator. This lead to a compile error with
> std::vector | std::ranges::views::reverse.

Thanks for the patch!  It looks good to me.  Some very minor comments
below.

> 
> Best,
> Moritz
> 

> libstdc++-v3/Changelog:
> 
>   * include/std/ranges: Avoid accidental ADL when calling
> make_reverse_iterator.
> * testsuite/std/ranges/adaptors/reverse_no_adl.cc: New test.
> Test that views::reverse works when make_reverse_iterator is
> defined in a namespace that could be found via ADL.
> ---
>  libstdc++-v3/include/std/ranges   | 12 +++---
>  .../std/ranges/adaptors/reverse_no_adl.cc | 39 +++
>  2 files changed, 45 insertions(+), 6 deletions(-)
>  create mode 100644 
> libstdc++-v3/testsuite/std/ranges/adaptors/reverse_no_adl.cc
> 
> diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
> index 1be74beb860..adbc6d7b274 100644
> --- a/libstdc++-v3/include/std/ranges
> +++ b/libstdc++-v3/include/std/ranges
> @@ -2958,29 +2958,29 @@ namespace views
>{
>   if constexpr (_S_needs_cached_begin)
> if (_M_cached_begin._M_has_value())
> - return make_reverse_iterator(_M_cached_begin._M_get(_M_base));
> + return std::make_reverse_iterator(_M_cached_begin._M_get(_M_base));
>  
>   auto __it = ranges::next(ranges::begin(_M_base), ranges::end(_M_base));
>   if constexpr (_S_needs_cached_begin)
> _M_cached_begin._M_set(_M_base, __it);
> - return make_reverse_iterator(std::move(__it));
> + return std::make_reverse_iterator(std::move(__it));
>}
>  
>constexpr auto
>begin() requires common_range<_Vp>
> -  { return make_reverse_iterator(ranges::end(_M_base)); }
> +  { return std::make_reverse_iterator(ranges::end(_M_base)); }
>  
>constexpr auto
>begin() const requires common_range
> -  { return make_reverse_iterator(ranges::end(_M_base)); }
> +  { return std::make_reverse_iterator(ranges::end(_M_base)); }
>  
>constexpr reverse_iterator>
>end()
> -  { return make_reverse_iterator(ranges::begin(_M_base)); }
> +  { return std::make_reverse_iterator(ranges::begin(_M_base)); }
>  
>constexpr auto
>end() const requires common_range
> -  { return make_reverse_iterator(ranges::begin(_M_base)); }
> +  { return std::make_reverse_iterator(ranges::begin(_M_base)); }
>  
>constexpr auto
>size() requires sized_range<_Vp>
> diff --git a/libstdc++-v3/testsuite/std/ranges/adaptors/reverse_no_adl.cc 
> b/libstdc++-v3/testsuite/std/ranges/adaptors/reverse_no_adl.cc
> new file mode 100644
> index 000..9aa96c7d40e
> --- /dev/null
> +++ b/libstdc++-v3/testsuite/std/ranges/adaptors/reverse_no_adl.cc
> @@ -0,0 +1,39 @@
> +// Copyright (C) 2020-2021 Free Software Foundation, Inc.
> +//
> +// This file is part of the GNU ISO C++ Library.  This library is free
> +// software; you can redistribute it and/or modify it under the
> +// terms of the GNU General Public License as published by the
> +// Free Software Foundation; either version 3, or (at your option)
> +// any later version.
> +
> +// This library is distributed in the hope that it will be useful,
> +// but WITHOUT ANY WARRANTY; without even the implied warranty of
> +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +// GNU General Public License for more details.
> +
> +// You should have received a copy of the GNU General Public License along
> +// with this library; see the file COPYING3.  If not see
> +// .
> +
> +// { dg-options "-std=gnu++2a" }
> +// { dg-do run { target c++2a } }

Since this isn't an execution test, we should use "compile" instead of
"run" here.

> 
> +
> +#include 
> +
> +// This test tests that views::reverse works and does not use ADL which could
> +// lead to accidentally finding test_ns::make_reverse_iterator(const A&).
> +
> +namespace test_ns
> +{
> +  struct A {};
> +  template 
> +  void make_reverse_iterator(T&&) {}
> +} // namespace test_ns
> +
> +void test()
> +{
> +  test_ns::A as[] = {{}, {}};
> +  auto v = as | std::views::reverse;
> +  static_assert(std::ranges::view);

I suppose we could also check

  static_assert(std::ranges::range)

which'll additionally verify the std:: qualification inside the const
begin()/end() overloads.

> +}
> +
> -- 
> 2.30.1
> 



Re: [PATCH] middle-end/97855 - fix diagnostic with default pretty printer

2021-03-03 Thread Jakub Jelinek via Gcc-patches
On Wed, Mar 03, 2021 at 12:45:54PM -0500, David Malcolm wrote:
> > I think it's the D.6750 which is printed via
> > 
> >   else if (TREE_CODE (node) == DEBUG_EXPR_DECL)
> >     {
> >   if (flags & TDF_NOUID)
> >     pp_string (pp, "D#");
> >   else
> >     pp_printf (pp, "D#%i", DEBUG_TEMP_UID (node));
> > 
> > because this is a DECL_DEBUG_EXPR.  
> 
> Wouldn't that print
>   D#6750
> rather than
>   D.6750

Sure.  The D.6750 case is a few lines later:
pp_printf (pp, "%c%c%u", c, uid_sep, DECL_UID (node));
so we'd use
pp_character (pp, c);
pp_character (pp, uid_sep);
pp_decimal_int (pp, DECL_UID (node));
for that one.

Jakub



[committed] d: Fix heap-buffer-overflow in checkModFileAlias [PR 99337]

2021-03-03 Thread Iain Buclaw via Gcc-patches
Hi,

This patch merges the D front-end implementation with upstream dmd,
fixing a heap-buffer-overflow in checkModFileAlias.  The code wrongly
assumed memcmp did not read past the mismatch.

Bootstrapped and regression tested on x86_64-linux-gnu/-m32/-mx32,
committed to mainline, and backported to the gcc-10 and gcc-9 release
branches.

Regards,
Iain.

---
gcc/d/ChangeLog:

PR d/99337
* dmd/MERGE: Merge upstream dmd a3c9bf422.
---
 gcc/d/dmd/MERGE | 2 +-
 gcc/d/dmd/dmodule.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/d/dmd/MERGE b/gcc/d/dmd/MERGE
index 690fe407278..78b454c1c64 100644
--- a/gcc/d/dmd/MERGE
+++ b/gcc/d/dmd/MERGE
@@ -1,4 +1,4 @@
-7132b3537dc27cb353da75798082ffe7ea3d69a6
+a3c9bf422e7ff54d45846b8c577ee82da4234db1
 
 The first line of this file holds the git revision number of the last
 merge done from the dlang/dmd repository.
diff --git a/gcc/d/dmd/dmodule.c b/gcc/d/dmd/dmodule.c
index a2b01f534eb..ed01858f06b 100644
--- a/gcc/d/dmd/dmodule.c
+++ b/gcc/d/dmd/dmodule.c
@@ -195,7 +195,7 @@ static void checkModFileAlias(OutBuffer *buf, OutBuffer 
*dotmods,
 const char *m = (*ms)[j];
 const char *q = strchr(m, '=');
 assert(q);
-if (dotmods->length() <= (size_t)(q - m) && 
memcmp(dotmods->peekChars(), m, q - m) == 0)
+if (dotmods->length() == (size_t)(q - m) && 
memcmp(dotmods->peekChars(), m, q - m) == 0)
 {
 buf->reset();
 size_t qlen = strlen(q + 1);
-- 
2.27.0



Re: [PATCH] fwprop: Fix single_use_p calculation

2021-03-03 Thread Jeff Law via Gcc-patches



On 3/2/21 3:37 PM, Ilya Leoshkevich via Gcc-patches wrote:
> Bootstrapped and regtested on x86_64-redhat-linux, ppc64le-redhat-linux
> and s390x-redhat-linux.  Ok for master?
>
>
>
> Commit efb6bc55a93a ("fwprop: Allow (subreg (mem)) simplifications")
> introduced a check that was supposed to look at the propagated def's
> number of uses.  It uses insn_info::num_uses (), which in reality
> returns the number of uses def's insn has.  The whole change therefore
> works only by accident.
>
> Fix by looking at def_info's uses instead of insn_info's uses.  This
> requires passing around def_info instead of insn_info.
>
> gcc/ChangeLog:
>
> 2021-03-02  Ilya Leoshkevich  
>
>   * fwprop.c (def_has_single_use_p): New function.
>   (fwprop_propagation::fwprop_propagation): Look at
>   def_info's uses.
>   (try_fwprop_subst_note): Use def_info instead of insn_info.
>   (try_fwprop_subst_pattern): Likewise.
>   (try_fwprop_subst_notes): Likewise.
>   (try_fwprop_subst): Likewise.
>   (forward_propagate_subreg): Likewise.
>   (forward_propagate_and_simplify): Likewise.
>   (forward_propagate_into): Likewise.
>   * iterator-utils.h (single_element_p): New function.
Given we're well into stage4, I'd recommend deferring to gcc-12 unless
this fixes a code correctness issue.

Jeff



[r11-7469 Regression] FAIL: gcc.dg/attr-flatten-1.c (test for warnings, line 13) on Linux/x86_64

2021-03-03 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

f8e7f3f3f33e22721a28772cc3f9b616e48cd1c9 is the first bad commit
commit f8e7f3f3f33e22721a28772cc3f9b616e48cd1c9
Author: Jason Merrill 
Date:   Thu Feb 11 22:01:19 2021 -0500

cgraph: flatten and same_body aliases [PR96078]

caused

FAIL: gcc.dg/attr-flatten-1.c  (test for warnings, line 13)

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r11-7469/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=gcc.dg/attr-flatten-1.c 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=gcc.dg/attr-flatten-1.c 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=gcc.dg/attr-flatten-1.c 
--target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=gcc.dg/attr-flatten-1.c 
--target_board='unix{-m64\ -march=cascadelake}'"

(Please do not reply to this email, for question about this report, contact me 
at skpgkp2 at gmail dot com)


Re: [PATCH 2/3 V2] Do not include stdio.h in libgcc's Decimal/Float128 conversions.

2021-03-03 Thread Michael Meissner via Gcc-patches
On Tue, Mar 02, 2021 at 03:53:06PM -0600, Segher Boessenkool wrote:
> If you want to make decimal and/or QP float work only on 64-bit LE Linux
> you should say so.  And in that case, that is certainly not acceptable
> if it doesn't "sorry" at configure time already.

Well in general the only supported configuration for _Float128 is 64-bit LE
Linux, but this is more due to the infrastructure not supporting it.

If you want to support _Float128 on big endian, you need a GLIBC that provides
the necessary support.  As far as I know, GLIBC does not build the _Float128
support on big endian.

You also need to set the minimum CPU to power7, because _Float128 is passed in
Altivec registers.  I think some of the configure tests use the default cpu
used in building GCC (i.e. power4 if you don't use --with-cpu).

As we have discussed many times, on 32-bit BE, you cannot use hardware
_Float128 support on power9/power10 because there is no TImode in 32-bit.
Various machine independent parts of GCC require an integer type to be the same
size as basic types.  If somebody made TImode work, we can remove the
restriction and allow _Float128 to work on 32-bit.

With the current compiler on BE, if you use -mcpu=power7 or newer, it will
enable the _Float128/__float128 keywords, and generate the code.  But until
there is an expectation of library support, it won't work for the general
user.

> > The second is whether you get an error at link time because there is not
> > sprintf or strtold functions.  For normal archive libraries, this would 
> > only be
> > an issue if you actually do the conversion.  I have my doubts whether there 
> > is
> > any extant code that wants to convert _Float128 to one of the Decimal types 
> > or
> > vice versa.
> 
> So what are these patches for at all, again?

The whole reason for the patches is to provide the support when we flip the
default long double type from the IBM 128-bit type to the IEEE 128-bit type
that conversions between long double and the Decimal types will succeed.  I
suspect in real life it won't be an issue, since Decimal is not used that
much.  But it is more likely people will want to convert between binary long
double and Decimal128.

I missed the fact that it had hidden dependencies for cross compilers.  So I am
trying to fix this.

> Anyway, if some conversion doesn't work (or cannot work correctly at
> all, which is the same thing), you should simply disable the conversion
> routines themselves (and then cascade that through the possible callers:just 
> make them say "sorry, unimplemented").
> 
> > Note the second issue would affect x86_64 cross compilers as well, since 
> > they
> > use those two functions to do the _Float128/Decimal versions.
> 
> Yes, it cannot work correctly at all there, either.

If you remember, the original versions of the patch would only work if you
configured the compiler to use GLIBC 2.32 or newer (such as using the
--with-advance-toolchain at14.0 option).  You did not like this because as you
pointed out, you might use a different GLIBC when linking.

So I wrote the current patch that uses weak references to see if we did link
with GLIBC 2.32.  If the library is present, we use the functions in that
library.  If not, we have to give an approximate answer.  I used the existing
IBM 128-bit support to do the conversion.

Given GLIBC 2.32 is the minimum version of GLIBC that supports IEEE 128-bit
long doubles, if you link against an earlier library, you will not get the
conversions from long double.  You will only get the conversion if you
explicitly use _Float128.

> > I am open to suggestions of how to move forward.  I think we have to have a 
> > way
> > to build cross compilers without decimal support (i.e. my third patch).
> > Secondarily, I think we should allow the compiler to be built if there is no
> > stdio.h, but the user did not disable decimal floating point.
> 
> Yes.  So just do not use it then.  Disable the feature that would use it.
>
> > I don't think rewriting the Decimal conversions not to use GLIBC is really
> > practical.
> 
> It is necessary if you want to support it on any other systems than the
> one you use.

Sure, but it is a massive amount of work.  And I don't have the necessary
skills to insure that the conversion process will not suffer from errors.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


Re: add -mpowerpc-gpopt to options for sqrt insn on PowerPC

2021-03-03 Thread Alexandre Oliva
On Mar  2, 2021, David Edelsohn  wrote:

> The procs are used in more than that one test.

Err, are you looking at the trunk?  In my tree, there are only two tests
that mention sqrt_insn, and it's two rather than one just because I
added the flag to another test myself, in a patch yet to be contributed.
Here's the patch that introduced the only use of the proc that is known
to me: https://gcc.gnu.org/legacy-ml/gcc-patches/2018-12/msg01204.html

As for the to-be-contributed patch, it adds the sqrt_insn feature option
to cdce3.c.  Like gimplefe-28.c, a compile-only front-end test that
depends on the existence of a sqrt insn to do its job, cdce3.c is a
compile-only test for a middle-end shrink-wrap optimization, that is
only performed when there is a sqrt insn.


While we could hide the bug/missing feature in add_options_for_sqrt_insn
by constraining check_effective_target_sqrt_insn, the result would be
just reduced test coverage for powerpc builds that defaulted to
-mno-powerpc-gpopt.  A downside without any upside.

Whereas if we fix the former proc to perform its documented function on
powerpc, namely return the options required to enable the fsqrt insn,
the end result is that, if the options do the job, we get those two
tests performed, whereas if they happen to be incompatible with other
settings to the point of raising an error, we (should?) skip the tests.
In my book, that's upsides without downsides.


Now, if the grounds for rejecting the patch was based on the
understanding that the proc was used elsewhere, with a different
function, I hope this demonstrates that this understanding was based on
incorrect information (that I may have hinted at myself; sorry if so),
and now that it's been corrected, I request the rejection of this
approach to be reconsidered.

I suppose the patch as-is might still be rejected, because it introduces
only -mpowerpc-gpopt, without -mno-soft-float.  Since in some
configurations it may take both of them to enable the fsqrt insn, would
an alternate version that returned both be deemed acceptable instead?

Maybe we also want to document in the proc that these added options can
only be used in compile tests, because there's no expectation or
guarantee that the so-enabled feature is available in the target under
test.  But AFAIK this has always been the case, and now I see and
confirm that a feature-option-adding call is always followed by a
feature-available-check call.


Thanks,

-- 
Alexandre Oliva, happy hacker  https://FSFLA.org/blogs/lxo/
   Free Software Activist GNU Toolchain Engineer
Vim, Vi, Voltei pro Emacs -- GNUlius Caesar


Re: [PATCH] libstdc++: Avoid accidental ADL when calling make_reverse_iterator

2021-03-03 Thread Moritz Sichert via Gcc-patches

Thanks for the review. I attached the updated patch.

Can you commit this for me or point me to what I should do next? This is my 
first contribution here.

Best,
Moritz

Am 03.03.21 um 19:02 schrieb Patrick Palka:

On Wed, 3 Mar 2021, Moritz Sichert via Libstdc++ wrote:


std::ranges::reverse_view uses make_reverse_iterator in its
implementation as described in [range.reverse.view]. This accidentally
allows ADL as an unqualified name is used in the call. According to
[contents], however, this should be treated as a qualified lookup into
the std namespace.

This leads to errors due to ambiguous name lookups when another
make_reverse_iterator function is found via ADL.

I found this as llvm has its own implementation of ranges which also has
llvm::make_reverse_iterator. This lead to a compile error with
std::vector | std::ranges::views::reverse.


Thanks for the patch!  It looks good to me.  Some very minor comments
below.



Best,
Moritz




libstdc++-v3/Changelog:

* include/std/ranges: Avoid accidental ADL when calling
 make_reverse_iterator.
 * testsuite/std/ranges/adaptors/reverse_no_adl.cc: New test.
 Test that views::reverse works when make_reverse_iterator is
 defined in a namespace that could be found via ADL.
---
  libstdc++-v3/include/std/ranges   | 12 +++---
  .../std/ranges/adaptors/reverse_no_adl.cc | 39 +++
  2 files changed, 45 insertions(+), 6 deletions(-)
  create mode 100644 
libstdc++-v3/testsuite/std/ranges/adaptors/reverse_no_adl.cc

diff --git a/libstdc++-v3/include/std/ranges b/libstdc++-v3/include/std/ranges
index 1be74beb860..adbc6d7b274 100644
--- a/libstdc++-v3/include/std/ranges
+++ b/libstdc++-v3/include/std/ranges
@@ -2958,29 +2958,29 @@ namespace views
{
if constexpr (_S_needs_cached_begin)
  if (_M_cached_begin._M_has_value())
-   return make_reverse_iterator(_M_cached_begin._M_get(_M_base));
+   return std::make_reverse_iterator(_M_cached_begin._M_get(_M_base));
  
  	auto __it = ranges::next(ranges::begin(_M_base), ranges::end(_M_base));

if constexpr (_S_needs_cached_begin)
  _M_cached_begin._M_set(_M_base, __it);
-   return make_reverse_iterator(std::move(__it));
+   return std::make_reverse_iterator(std::move(__it));
}
  
constexpr auto

begin() requires common_range<_Vp>
-  { return make_reverse_iterator(ranges::end(_M_base)); }
+  { return std::make_reverse_iterator(ranges::end(_M_base)); }
  
constexpr auto

begin() const requires common_range
-  { return make_reverse_iterator(ranges::end(_M_base)); }
+  { return std::make_reverse_iterator(ranges::end(_M_base)); }
  
constexpr reverse_iterator>

end()
-  { return make_reverse_iterator(ranges::begin(_M_base)); }
+  { return std::make_reverse_iterator(ranges::begin(_M_base)); }
  
constexpr auto

end() const requires common_range
-  { return make_reverse_iterator(ranges::begin(_M_base)); }
+  { return std::make_reverse_iterator(ranges::begin(_M_base)); }
  
constexpr auto

size() requires sized_range<_Vp>
diff --git a/libstdc++-v3/testsuite/std/ranges/adaptors/reverse_no_adl.cc 
b/libstdc++-v3/testsuite/std/ranges/adaptors/reverse_no_adl.cc
new file mode 100644
index 000..9aa96c7d40e
--- /dev/null
+++ b/libstdc++-v3/testsuite/std/ranges/adaptors/reverse_no_adl.cc
@@ -0,0 +1,39 @@
+// Copyright (C) 2020-2021 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// .
+
+// { dg-options "-std=gnu++2a" }
+// { dg-do run { target c++2a } }


Since this isn't an execution test, we should use "compile" instead of
"run" here.



+
+#include 
+
+// This test tests that views::reverse works and does not use ADL which could
+// lead to accidentally finding test_ns::make_reverse_iterator(const A&).
+
+namespace test_ns
+{
+  struct A {};
+  template 
+  void make_reverse_iterator(T&&) {}
+} // namespace test_ns
+
+void test()
+{
+  test_ns::A as[] = {{}, {}};
+  auto v = as | std::views::reverse;
+  static_assert(std::ranges::view);


I suppose we could also check

   static_assert(std::ranges::range)

which'll additionally verify the std:: qualification inside the const
begin()/end() overloa

c++: Defer specialization registration [PR 99170]

2021-03-03 Thread Nathan Sidwell
This defers inserting specializations into the specialization table, 
until we have completed their streaming.  When streaming a cluster we 
ensure that all imports are populated before any of the cluster, so they 
need no visibility of other specializations.  Further within the same 
import, we've already partitioned the graph, so no earlier cluster can 
be refering to a specialization in a later cluster. Inserting them early 
causes problems when other specializations of the same template are 
inserted.  (This doesn't fix 99170, but is a necessary change for that PR).


Earlier	on, I had less deferred	processing, but	it has become clearer 
that deferred worklists are the right way of handling a few things. This 
patch highlights a fixme, in that we're streaming a key twice, and need 
not do that, but I wanted to get correctness first.  Besides the second 
streaming will end up being a back reference, which is of course much 
cheaper than a by-value stream.


PR c++/99170
gcc/cp/
* module.cc (trees_out::decl_value): Stream specialization keys
after decl.
(trees_in::decl_value): Stream them back and insert after
completing the decl.
(trees_out::key_mergeable): Drop some streaming here ...
(trees_in::key_mergeable): ... and here.  Don't insert into
specialization tables.

--
Nathan Sidwell
diff --git i/gcc/cp/module.cc w/gcc/cp/module.cc
index 31172824fdd..b068acea3bd 100644
--- i/gcc/cp/module.cc
+++ w/gcc/cp/module.cc
@@ -7808,7 +7808,31 @@ trees_out::decl_value (tree decl, depset *dep)
 }
 
   if (!is_key_order ())
-tree_node (get_constraints (decl));
+{
+  if (mk & MK_template_mask
+	  || mk == MK_partial
+	  || mk == MK_friend_spec)
+	{
+	  if (mk != MK_partial)
+	{
+	  // FIXME: We should make use of the merge-key by
+	  // exposing it outside of key_mergeable.  But this gets
+	  // the job done.
+	  auto *entry = reinterpret_cast  (dep->deps[0]);
+
+	  if (streaming_p ())
+		u (get_mergeable_specialization_flags (entry->tmpl, decl));
+	  tree_node (entry->tmpl);
+	  tree_node (entry->args);
+	}
+	  else
+	{
+	  tree_node (CLASSTYPE_TI_TEMPLATE (TREE_TYPE (inner)));
+	  tree_node (CLASSTYPE_TI_ARGS (TREE_TYPE (inner)));
+	}
+	}
+  tree_node (get_constraints (decl));
+}
 
   if (streaming_p ())
 {
@@ -8096,7 +8120,22 @@ trees_in::decl_value ()
 	   || (stub_decl && !tree_node_vals (stub_decl
 goto bail;
 
-  tree constraints = tree_node ();
+  spec_entry spec;
+  unsigned spec_flags = 0;
+  if (mk & MK_template_mask
+  || mk == MK_partial
+  || mk == MK_friend_spec)
+{
+  if (mk == MK_partial)
+	spec_flags = 2;
+  else
+	spec_flags = u ();
+
+  spec.tmpl = tree_node ();
+  spec.args = tree_node ();
+}
+  /* Hold constraints on the spec field, for a short while.  */
+  spec.spec = tree_node ();
 
   dump (dumper::TREE) && dump ("Read:%d %C:%N", tag, TREE_CODE (decl), decl);
 
@@ -8157,8 +8196,13 @@ trees_in::decl_value ()
 	}
 	}
 
-  if (constraints)
-	set_constraints (decl, constraints);
+  if (spec.spec)
+	set_constraints (decl, spec.spec);
+  if (mk & MK_template_mask
+	  || mk == MK_partial)
+	/* Add to specialization tables now that constraints etc are
+	   added.  */
+	add_mergeable_specialization (spec.tmpl, spec.args, decl, spec_flags);
 
   if (TREE_CODE (decl) == INTEGER_CST && !TREE_OVERFLOW (decl))
 	{
@@ -8247,6 +8291,15 @@ trees_in::decl_value ()
   decl = existing;
 }
 
+  if (mk == MK_friend_spec)
+{
+  tree e = match_mergeable_specialization (true, &spec);
+  if (!e)
+	add_mergeable_specialization (spec.tmpl, spec.args, decl, spec_flags);
+  else if (e != existing)
+	set_overrun ();
+}
+
   if (is_typedef)
 {
   /* Insert the type into the array now.  */
@@ -10415,8 +10468,6 @@ trees_out::key_mergeable (int tag, merge_kind mk, tree decl, tree inner,
 
   tree_node (entry->tmpl);
   tree_node (entry->args);
-  if (streaming_p ())
-	u (get_mergeable_specialization_flags (entry->tmpl, decl));
   if (mk & MK_tmpl_decl_mask)
 	if (flag_concepts && TREE_CODE (inner) == VAR_DECL)
 	  {
@@ -10504,16 +10555,6 @@ trees_out::key_mergeable (int tag, merge_kind mk, tree decl, tree inner,
 		key.ret = fndecl_declared_return_type (inner);
 	}
 
-	  if (mk == MK_friend_spec)
-	{
-	  gcc_checking_assert (dep && dep->is_special ());
-	  spec_entry *entry = reinterpret_cast  (dep->deps[0]);
-
-	  tree_node (entry->tmpl);
-	  tree_node (entry->args);
-	  if (streaming_p ())
-		u (get_mergeable_specialization_flags (entry->tmpl, decl));
-	}
 	  break;
 
 	case MK_field:
@@ -10763,7 +10804,6 @@ trees_in::key_mergeable (int tag, merge_kind mk, tree decl, tree inner,
   spec_entry spec;
   spec.tmpl = tree_node ();
   spec.args = tree_node ();
-  unsigned flags = u ();
 
   DECL_NAME (decl) 

ping^1 Re: [PATCH] Darwin, testsuite : Prune 'object file not found for object'.

2021-03-03 Thread Iain Sandoe

Iain Sandoe  wrote:


This is not a GCC problem, but a fault in the static linker where,
when a source file is used multiple times, with conditional compilation
the source file is only referenced by the linker for the first object.
Then, when dsymutil tries to find the source file for next object based
off that source there is no record for it.

I’ve had this patch kicking around for some time, in the hope that the
problem would be fixed in the XCode tools, but it’s still present in the
XC12.5b2 (and will never be fixed in older toolkits).

tested on *-darwin* and x86_64-linux-gnu,
OK for master?
Iain
(if were not in stage 4, I’d have applied it as obvious).


^^^


gcc/testsuite/ChangeLog:

* lib/prune.exp: Prune useless output caused by a linker bug.
---
gcc/testsuite/lib/prune.exp | 3 +++
1 file changed, 3 insertions(+)

diff --git a/gcc/testsuite/lib/prune.exp b/gcc/testsuite/lib/prune.exp
index a349c8ace3e..2809f88b16f 100644
--- a/gcc/testsuite/lib/prune.exp
+++ b/gcc/testsuite/lib/prune.exp
@@ -84,6 +84,9 @@ proc prune_gcc_output { text } {
# Ignore harmless warnings from Xcode 4.0.
regsub -all "(^|\n)\[^\n\]*ld: warning: could not create compact unwind for\[^\n\]*" 
$text "" text

+# Ignore dsymutil warning (tool bug is actually linker)
+regsub -all "(^|\n)\[^\n\]*could not find object file symbol for  
symbol\[^\n\]*" $text "" text

+
# If dg-enable-nn-line-numbers was provided, then obscure source-margin
# line numbers by converting them to "NN" form.
set text [maybe-handle-nn-line-numbers $text]
--
2.24.1





Re: add -mpowerpc-gpopt to options for sqrt insn on PowerPC

2021-03-03 Thread Alexandre Oliva
On Mar  2, 2021, Segher Boessenkool  wrote:

> This is PR99352 now.

Thanks.  I've filed PR99371 for the add_options_for_sqrt_insn
incompleteness, and PR99372 for the gimplefe-28.c ICE.

-- 
Alexandre Oliva, happy hacker  https://FSFLA.org/blogs/lxo/
   Free Software Activist GNU Toolchain Engineer
Vim, Vi, Voltei pro Emacs -- GNUlius Caesar


Re: [PATCH PING^3] Add input_modes parameter to TARGET_MD_ASM_ADJUST hook

2021-03-03 Thread Jeff Law via Gcc-patches



On 3/2/21 4:50 PM, Ilya Leoshkevich via Gcc-patches wrote:
> Hello,
>
> I would like to ping the following patch:
>
> Add input_modes parameter to TARGET_MD_ASM_ADJUST hook
> https://gcc.gnu.org/pipermail/gcc-patches/2021-January/562898.html
>
> It is needed for the following regression fix:
>
> IBM Z: Fix usage of "f" constraint with long doubles
> https://gcc.gnu.org/pipermail/gcc-patches/2021-January/564380.html
>
>
> Jakub, who would be the right person to review this change?  I've
> decided to ask you, since `git shortlog -ns gcc/cfgexpand.c` shows that
> you deal with this code a lot.
>
> Best regards,
> Ilya
>
>
>
>
> If TARGET_MD_ASM_ADJUST changes a mode of an input operand (which
> should be ok as long as the hook itself as well as after_md_seq make up
> for it), input_mode will contain stale information.
>
> It might be tempting to fix this by removing input_mode altogether and
> just using GET_MODE (), but this will not work correctly with constants.
> So add input_modes parameter and document that it should be updated
> whenever inputs parameter is updated.
>
> gcc/ChangeLog:
>
> 2021-01-05  Ilya Leoshkevich  
>
>   * cfgexpand.c (expand_asm_loc): Pass new parameter.
>   (expand_asm_stmt): Likewise.
>   * config/arm/aarch-common-protos.h (arm_md_asm_adjust): Add new
>   parameter.
>   * config/arm/aarch-common.c (arm_md_asm_adjust): Likewise.
>   * config/arm/arm.c (thumb1_md_asm_adjust): Likewise.
>   * config/cris/cris.c (cris_md_asm_adjust): Likewise.
>   * config/i386/i386.c (ix86_md_asm_adjust): Likewise.
>   * config/mn10300/mn10300.c (mn10300_md_asm_adjust): Likewise.
>   * config/nds32/nds32.c (nds32_md_asm_adjust): Likewise.
>   * config/pdp11/pdp11.c (pdp11_md_asm_adjust): Likewise.
>   * config/rs6000/rs6000.c (rs6000_md_asm_adjust): Likewise.
>   * config/vax/vax.c (vax_md_asm_adjust): Likewise.
>   * config/visium/visium.c (visium_md_asm_adjust): Likewise.
>   * target.def (md_asm_adjust): Likewise.
Ugh.    A couple questions
Are there any cases where you're going to want to change modes for
arguments that were constants?   I'm a bit surprised that we don't have
a mode for constants for the cases that we care about.  Presumably we
can get a (modeless) CONST_INT here and we're not restricted to
CONST_DOUBLE and friends (which have modes).

Is input_modes read after the call to md_asm_adjust?   I'm trying to
figure out why we'd need to update it.

Not acking or naking at this point, I just want to make sure I
understand what's going on.

jeff



Re: add -mpowerpc-gpopt to options for sqrt insn on PowerPC

2021-03-03 Thread Segher Boessenkool
hi!

On Wed, Mar 03, 2021 at 04:22:29PM -0300, Alexandre Oliva wrote:
> While we could hide the bug/missing feature in add_options_for_sqrt_insn
> by constraining check_effective_target_sqrt_insn, the result would be
> just reduced test coverage for powerpc builds that defaulted to
> -mno-powerpc-gpopt.  A downside without any upside.

You just *cannot* use add_options_for_sqrt_insn usefully ever (on our
configurations): unless you override which -mcpu= is used, and whether
hard float is enabled, you cannot depend on having classic float (and
you need to disable VSX or detect x[sv]sqrtdp etc. as well as fsqrt).

> Whereas if we fix the former proc to perform its documented function on
> powerpc, namely return the options required to enable the fsqrt insn,

But in all configurations where you *can*, it already is, unless it is
"manually" disabled by using -msoft-float.  Which a) almost no one does,
and b) overriding that causes *less* testing, not *more*.

Anyway, this is PR99352 now, and I have a patch (that works with
-msoft-float as well, etc.)  Will commit later today.


Segher


Re: [PATCH] fwprop: Fix single_use_p calculation

2021-03-03 Thread Ilya Leoshkevich via Gcc-patches
On Wed, 2021-03-03 at 11:34 -0700, Jeff Law wrote:
> 
> 
> On 3/2/21 3:37 PM, Ilya Leoshkevich via Gcc-patches wrote:
> > Bootstrapped and regtested on x86_64-redhat-linux, ppc64le-redhat-
> > linux
> > and s390x-redhat-linux.  Ok for master?
> > 
> > 
> > 
> > Commit efb6bc55a93a ("fwprop: Allow (subreg (mem))
> > simplifications")
> > introduced a check that was supposed to look at the propagated
> > def's
> > number of uses.  It uses insn_info::num_uses (), which in reality
> > returns the number of uses def's insn has.  The whole change
> > therefore
> > works only by accident.
> > 
> > Fix by looking at def_info's uses instead of insn_info's uses. 
> > This
> > requires passing around def_info instead of insn_info.
> > 
> > gcc/ChangeLog:
> > 
> > 2021-03-02  Ilya Leoshkevich  
> > 
> > * fwprop.c (def_has_single_use_p): New function.
> > (fwprop_propagation::fwprop_propagation): Look at
> > def_info's uses.
> > (try_fwprop_subst_note): Use def_info instead of insn_info.
> > (try_fwprop_subst_pattern): Likewise.
> > (try_fwprop_subst_notes): Likewise.
> > (try_fwprop_subst): Likewise.
> > (forward_propagate_subreg): Likewise.
> > (forward_propagate_and_simplify): Likewise.
> > (forward_propagate_into): Likewise.
> > * iterator-utils.h (single_element_p): New function.
> Given we're well into stage4, I'd recommend deferring to gcc-12
> unless
> this fixes a code correctness issue.
> 
> Jeff
> 

Fortunately the issue here is not a miscompilation, but it's still
a regression: on s390 small functions that use long doubles get
a number of useless load/stores as well as a stack frame, where none
was required before.  Basically, the same issue efb6bc55a93a failed to
fully fix due to the num_uses() / nondebug_insn_uses() mixup.



[pushed] c++: Add fixed test [PR96474]

2021-03-03 Thread Marek Polacek via Gcc-patches
Was happy to find out that my recent dguide fix (r11-7483) fixed
this test too.  In particular, the

+  /* Wait until the enclosing scope is non-dependent.  */
+  if (DECL_CLASS_SCOPE_P (tmpl)
+  && dependent_type_p (DECL_CONTEXT (tmpl)))
+return ptype;

bit.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/testsuite/ChangeLog:

PR c++/96474
* g++.dg/cpp1z/class-deduction83.C: New test.
---
 gcc/testsuite/g++.dg/cpp1z/class-deduction83.C | 13 +
 1 file changed, 13 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/cpp1z/class-deduction83.C

diff --git a/gcc/testsuite/g++.dg/cpp1z/class-deduction83.C 
b/gcc/testsuite/g++.dg/cpp1z/class-deduction83.C
new file mode 100644
index 000..63eab6a8442
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/class-deduction83.C
@@ -0,0 +1,13 @@
+// PR c++/96474
+// { dg-do compile { target c++17 } }
+
+template 
+struct A
+{
+template 
+struct B
+{
+};
+};
+
+A<>::B b;

base-commit: 499193a692efa33c9b2fe3ad8da0f4d5e5fd0e0c
-- 
2.29.2



Re: add -mpowerpc-gpopt to options for sqrt insn on PowerPC

2021-03-03 Thread Segher Boessenkool
On Wed, Mar 03, 2021 at 04:45:23PM -0300, Alexandre Oliva wrote:
> On Mar  2, 2021, Segher Boessenkool  wrote:
> 
> > This is PR99352 now.
> 
> Thanks.  I've filed PR99371 for the add_options_for_sqrt_insn
> incompleteness,

As I said before, I don't agree with that.  We do no longer support
enabling separate features / insns, because that requires exponential
testing effort, and an exponential number of backend patterns as well --
and very often some are missed.

And on all systems where these insns exist, they are enabled by default.
If a user disabled it, we should *not* reenable it, that reduces
testing surface.

> and PR99372 for the gimplefe-28.c ICE.

This is fixed trivially by the PR99352 patch as far as I can see.
Please verify (I'll post it later today).


Segher


Re: [PATCH PING^3] Add input_modes parameter to TARGET_MD_ASM_ADJUST hook

2021-03-03 Thread Ilya Leoshkevich via Gcc-patches
On Wed, 2021-03-03 at 13:02 -0700, Jeff Law wrote:
> 
> 
> On 3/2/21 4:50 PM, Ilya Leoshkevich via Gcc-patches wrote:
> > Hello,
> > 
> > I would like to ping the following patch:
> > 
> > Add input_modes parameter to TARGET_MD_ASM_ADJUST hook
> > https://gcc.gnu.org/pipermail/gcc-patches/2021-January/562898.html
> > 
> > It is needed for the following regression fix:
> > 
> > IBM Z: Fix usage of "f" constraint with long doubles
> > https://gcc.gnu.org/pipermail/gcc-patches/2021-January/564380.html
> > 
> > 
> > Jakub, who would be the right person to review this change?  I've
> > decided to ask you, since `git shortlog -ns gcc/cfgexpand.c` shows
> > that
> > you deal with this code a lot.
> > 
> > Best regards,
> > Ilya
> > 
> > 
> > 
> > 
> > If TARGET_MD_ASM_ADJUST changes a mode of an input operand (which
> > should be ok as long as the hook itself as well as after_md_seq
> > make up
> > for it), input_mode will contain stale information.
> > 
> > It might be tempting to fix this by removing input_mode altogether
> > and
> > just using GET_MODE (), but this will not work correctly with
> > constants.
> > So add input_modes parameter and document that it should be updated
> > whenever inputs parameter is updated.
> > 
> > gcc/ChangeLog:
> > 
> > 2021-01-05  Ilya Leoshkevich  
> > 
> > * cfgexpand.c (expand_asm_loc): Pass new parameter.
> > (expand_asm_stmt): Likewise.
> > * config/arm/aarch-common-protos.h (arm_md_asm_adjust): Add
> > new
> > parameter.
> > * config/arm/aarch-common.c (arm_md_asm_adjust): Likewise.
> > * config/arm/arm.c (thumb1_md_asm_adjust): Likewise.
> > * config/cris/cris.c (cris_md_asm_adjust): Likewise.
> > * config/i386/i386.c (ix86_md_asm_adjust): Likewise.
> > * config/mn10300/mn10300.c (mn10300_md_asm_adjust):
> > Likewise.
> > * config/nds32/nds32.c (nds32_md_asm_adjust): Likewise.
> > * config/pdp11/pdp11.c (pdp11_md_asm_adjust): Likewise.
> > * config/rs6000/rs6000.c (rs6000_md_asm_adjust): Likewise.
> > * config/vax/vax.c (vax_md_asm_adjust): Likewise.
> > * config/visium/visium.c (visium_md_asm_adjust): Likewise.
> > * target.def (md_asm_adjust): Likewise.
> Ugh.    A couple questions
> Are there any cases where you're going to want to change modes for
> arguments that were constants?   I'm a bit surprised that we don't
> have
> a mode for constants for the cases that we care about.  Presumably we
> can get a (modeless) CONST_INT here and we're not restricted to
> CONST_DOUBLE and friends (which have modes).

Yes, this might happen.  For example, here:

asm("sqxbr\t%0,%1" : "=f"(res) : "f"(0x1.1p+0L));

the (const_double) and the corresponding operand will initially have 
the mode TFmode.  s390_md_asm_adjust () will add a conversion from
TFmode to FPRX2mode and change the argument accordingly.

However, this is not the problematic case that I refer to in the
commit message:  I caught some failures in the testsuite that I
tracked down to (const_int)s, which, like you mentioned, don't have
a mode.

> Is input_modes read after the call to md_asm_adjust?   I'm trying to
> figure out why we'd need to update it.

Yes, its contents goes into (asm_operand)'s (asm_input). If we don't
adjust it, (asm_input)s will no longer be consistent with input operand
RTXes.

> Not acking or naking at this point, I just want to make sure I
> understand what's going on.
> 
> jeff



Re: [PATCH, rs6000] Update "size" attribute for Power10

2021-03-03 Thread Segher Boessenkool
Hi!

On Tue, Dec 08, 2020 at 03:46:31PM -0600, Pat Haugen wrote:
> gcc/
>   * config/rs6000/dfp.md (extendddtd2, trunctddd2, *cmp_internal1,
>   floatditd2, ftrunc2, fixdi2, dfp_ddedpd_,
>   dfp_denbcd_, dfp_dxex_, dfp_diex_,
>   *dfp_sgnfcnc_, dfp_dscli_, dfp_dscri_): Update size
>   attribute for Power10.
>   * config/rs6000/mma.md (*movoo): Likewise.
>   * config/rs6000/rs6000.md (define_attr "size"): Add 256.
>   (define_mode_attr bits): Add DD/TD modes.
>   * config/rs6000/sync.md (load_quadpti, store_quadpti, load_lockedpti,
>   store_conditionalpti): Update size attribute for Power10.

> --- a/gcc/config/rs6000/mma.md
> +++ b/gcc/config/rs6000/mma.md
> @@ -288,6 +288,7 @@ (define_insn_and_split "*movoo"
>DONE;
>  }
>[(set_attr "type" "vecload,vecstore,veclogical")
> +   (set_attr "size" "256,256,*")
> (set_attr "length" "*,*,8")])

Does "*" do the right thing here?  It has "32" as the default, is that
wanted?  It doesn't matter much at all of course, but I'd expect 128 or
even just 256 for all alternatives.

Okay for trunk with or without that tweaked.  Thanks!


Segher


Re: [PATCH PING^3] Add input_modes parameter to TARGET_MD_ASM_ADJUST hook

2021-03-03 Thread Ilya Leoshkevich via Gcc-patches
On Wed, 2021-03-03 at 21:26 +0100, Ilya Leoshkevich via Gcc-patches
wrote:
> On Wed, 2021-03-03 at 13:02 -0700, Jeff Law wrote:
> > 
> > 
> > On 3/2/21 4:50 PM, Ilya Leoshkevich via Gcc-patches wrote:
> > > Hello,
> > > 
> > > I would like to ping the following patch:
> > > 
> > > Add input_modes parameter to TARGET_MD_ASM_ADJUST hook
> > >  https://gcc.gnu.org/pipermail/gcc-patches/2021-January/562898.html
> > > 
> > > It is needed for the following regression fix:
> > > 
> > > IBM Z: Fix usage of "f" constraint with long doubles
> > >  https://gcc.gnu.org/pipermail/gcc-patches/2021-January/564380.html
> > > 
> > > 
> > > Jakub, who would be the right person to review this change?  I've
> > > decided to ask you, since `git shortlog -ns gcc/cfgexpand.c` shows
> > > that
> > > you deal with this code a lot.
> > > 
> > > Best regards,
> > > Ilya
> > > 
> > > 
> > > 
> > > 
> > > If TARGET_MD_ASM_ADJUST changes a mode of an input operand (which
> > > should be ok as long as the hook itself as well as after_md_seq
> > > make up
> > > for it), input_mode will contain stale information.
> > > 
> > > It might be tempting to fix this by removing input_mode altogether
> > > and
> > > just using GET_MODE (), but this will not work correctly with
> > > constants.
> > > So add input_modes parameter and document that it should be updated
> > > whenever inputs parameter is updated.
> > > 
> > > gcc/ChangeLog:
> > > 
> > > 2021-01-05  Ilya Leoshkevich  
> > > 
> > > * cfgexpand.c (expand_asm_loc): Pass new parameter.
> > > (expand_asm_stmt): Likewise.
> > > * config/arm/aarch-common-protos.h (arm_md_asm_adjust): Add
> > > new
> > > parameter.
> > > * config/arm/aarch-common.c (arm_md_asm_adjust): Likewise.
> > > * config/arm/arm.c (thumb1_md_asm_adjust): Likewise.
> > > * config/cris/cris.c (cris_md_asm_adjust): Likewise.
> > > * config/i386/i386.c (ix86_md_asm_adjust): Likewise.
> > > * config/mn10300/mn10300.c (mn10300_md_asm_adjust):
> > > Likewise.
> > > * config/nds32/nds32.c (nds32_md_asm_adjust): Likewise.
> > > * config/pdp11/pdp11.c (pdp11_md_asm_adjust): Likewise.
> > > * config/rs6000/rs6000.c (rs6000_md_asm_adjust): Likewise.
> > > * config/vax/vax.c (vax_md_asm_adjust): Likewise.
> > > * config/visium/visium.c (visium_md_asm_adjust): Likewise.
> > > * target.def (md_asm_adjust): Likewise.
> > Ugh.    A couple questions
> > Are there any cases where you're going to want to change modes for
> > arguments that were constants?   I'm a bit surprised that we don't
> > have
> > a mode for constants for the cases that we care about.  Presumably we
> > can get a (modeless) CONST_INT here and we're not restricted to
> > CONST_DOUBLE and friends (which have modes).
> 
> Yes, this might happen.  For example, here:
> 
>     asm("sqxbr\t%0,%1" : "=f"(res) : "f"(0x1.1p+0L));
> 
> the (const_double) and the corresponding operand will initially have 
> the mode TFmode.  s390_md_asm_adjust () will add a conversion from
> TFmode to FPRX2mode and change the argument accordingly.

Just to be more precise: the mode of the (const_double) itself will not
change.  Here is the resulting RTL for the asm statement above:

# s390_md_asm_adjust () step 1: put the (const_double) operand into a
# new (reg) with the same mode
(insn (set (reg:TF 63)
   (const_double:TF ...)))

# s390_md_asm_adjust () step 2: convert a reg from TFmode to FPRX2mode
(insn (set (reg:FPRX2 65)
   (subreg:FPRX2 (reg:TF 63) 0)))

# s390_md_asm_adjust () step 3: replace the original operand with the
# resulting (reg), adjust (asm_input) accordingly
(insn (set (reg:FPRX2 64)
   (asm_operands:FPRX2 ("sqxbr %0,%1") ("=f") 0
   [(reg:FPRX2 65)]
   [(asm_input:FPRX2 ("f"))])))



c++: Defer cloning to post-loading [PR 99170]

2021-03-03 Thread Nathan Sidwell


It turns out that cloning can cause use to load things. Specifically 
when checking paramter shadows (this is avoidable), and also the delete 
operator of a deleting dtor (not avoidable).  Doing that in the middle 
of loading is a bad thing.  This defers it to a post-load worklist.  If 
it causes more loading at that point there is no problem, as we've 
completed the first set of loads, bar this bit of cleanup.


Again, this doesn't fix 99170, but is a step towards a solution.

PR c++/99170
gcc/cp/
* module.cc (post_load_decls): New.
(lazy_snum, recursive_lazy): Move earlier.
(module_state::read_cluster): Push cloning onto post_load_decls.
(post_load_processing): New.  Do the cloning here.
(module_state::read_inits): Call post_load_processing.
(module_state::read_language): Likewise.
(lazy_load_binding, lazy_load_specializations): Likewise
(lazy_load_members): Likewise

--
Nathan Sidwell
diff --git c/gcc/cp/module.cc w/gcc/cp/module.cc
index b068acea3bd..b7b9c3734f2 100644
--- c/gcc/cp/module.cc
+++ w/gcc/cp/module.cc
@@ -2644,6 +2644,10 @@ depset *depset::make_entity (tree entity, entity_kind ek, bool is_defn)
   return r;
 }
 
+/* Decls that need some post processing once a batch of lazy loads has
+   completed.  */
+vec *post_load_decls;
+
 /* Values keyed to some unsigned integer.  This is not GTY'd, so if
T is tree they must be reachable via some other path.  */
 
@@ -14023,6 +14027,21 @@ make_mapper (location_t loc)
   return mapper;
 }
 
+static unsigned lazy_snum;
+
+static bool
+recursive_lazy (unsigned snum = ~0u)
+{
+  if (lazy_snum)
+{
+  error_at (input_location, "recursive lazy load");
+  return true;
+}
+
+  lazy_snum = snum;
+  return false;
+}
+
 /* If THIS is the current purview, issue an import error and return false.  */
 
 bool
@@ -15016,11 +15035,7 @@ module_state::read_cluster (unsigned snum)
   if (abstract)
 	;
   else if (DECL_ABSTRACT_P (decl))
-	{
-	  bool cloned = maybe_clone_body (decl);
-	  if (!cloned)
-	from ()->set_error ();
-	}
+	vec_safe_push (post_load_decls, decl);
   else
 	{
 	  bool aggr = aggregate_value_p (DECL_RESULT (decl), decl);
@@ -17267,6 +17282,33 @@ module_state::write_inits (elf_out *to, depset::hash &table, unsigned *crc_ptr)
   return count;
 }
 
+/* We have to defer some post-load processing until we've completed
+   reading, because they can cause more reading.  */
+
+static void
+post_load_processing ()
+{
+  if (!post_load_decls)
+return;
+
+  tree old_cfd = current_function_decl;
+  struct function *old_cfun = cfun;
+  while (post_load_decls->length ())
+{
+  tree decl = post_load_decls->pop ();
+
+  dump () && dump ("Post-load processing of %N", decl);
+
+  gcc_checking_assert (DECL_ABSTRACT_P (decl));
+  /* Cloning can cause loading -- specifically operator delete for
+	 the deleting dtor.  */
+  maybe_clone_body (decl);
+}
+
+  cfun = old_cfun;
+  current_function_decl = old_cfd;
+}
+
 bool
 module_state::read_inits (unsigned count)
 {
@@ -17276,6 +17318,7 @@ module_state::read_inits (unsigned count)
   dump () && dump ("Reading %u initializers", count);
   dump.indent ();
 
+  lazy_snum = ~0u;
   for (unsigned ix = 0; ix != count; ix++)
 {
   /* Merely referencing the decl causes its initializer to be read
@@ -17287,6 +17330,8 @@ module_state::read_inits (unsigned count)
   if (decl)
 	dump ("Initializer:%u for %N", count, decl);
 }
+  lazy_snum = 0;
+  post_load_processing ();
   dump.outdent ();
   if (!sec.end (from ()))
 return false;  
@@ -18025,21 +18070,6 @@ module_state::read_preprocessor (bool outermost)
   return check_read (outermost, ok);
 }
 
-static unsigned lazy_snum;
-
-static bool
-recursive_lazy (unsigned snum = ~0u)
-{
-  if (lazy_snum)
-{
-  error_at (input_location, "recursive lazy load");
-  return true;
-}
-
-  lazy_snum = snum;
-  return false;
-}
-
 /* Read language state.  */
 
 bool
@@ -18114,16 +18144,15 @@ module_state::read_language (bool outermost)
 
   unsigned hwm = counts[MSC_sec_hwm];
   for (unsigned ix = counts[MSC_sec_lwm]; ok && ix != hwm; ix++)
-	{
-	  if (!load_section (ix, NULL))
-	{
-	  ok = false;
-	  break;
-	}
-	  ggc_collect ();
-	}
-
+	if (!load_section (ix, NULL))
+	  {
+	ok = false;
+	break;
+	  }
   lazy_snum = 0;
+  post_load_processing ();
+
+  ggc_collect ();
 
   if (ok && CHECKING_P)
 	for (unsigned ix = 0; ix != entity_num; ix++)
@@ -18873,6 +18902,7 @@ lazy_load_binding (unsigned mod, tree ns, tree id, binding_slot *mslot)
 {
   ok = module->load_section (snum, mslot);
   lazy_snum = 0;
+  post_load_processing ();
 }
 
   dump.pop (n);
@@ -18929,6 +18959,7 @@ lazy_load_specializations (tree tmpl)
 	  function_depth--;
 	}
   lazy_snum = 0;
+  post_load_processing ();
 }
 
   timevar_stop (TV_MODULE_IMPORT);
@@ -18969,6 +19000,

Re: add -mpowerpc-gpopt to options for sqrt insn on PowerPC

2021-03-03 Thread Alexandre Oliva
On Mar  3, 2021, Segher Boessenkool  wrote:

> On Wed, Mar 03, 2021 at 04:45:23PM -0300, Alexandre Oliva wrote:
>> On Mar  2, 2021, Segher Boessenkool  wrote:
>> 
>> > This is PR99352 now.
>> 
>> Thanks.  I've filed PR99371 for the add_options_for_sqrt_insn
>> incompleteness,

> As I said before, I don't agree with that.

Maybe you'll change of mind if you try to make sense of why the proc
has, since its inception, added vfp options to enable sqrt on arm,
regardless of whether vfp is available with the processor being tested,
and realize this is not different from the case of powerpc.

> If a user disabled it, we should *not* reenable it, that reduces
> testing surface.

It's skipping the test, as the change you propose, that reduces testing
surface, when testing only a configuration that ends up skipping it.
Now, if you're testing multiple combinations, skipping or running does
*not* change the test surface.

So rejecting Eric's patch makes for a no-op in one case, and a reduction
in another.

Whereas installing it makes for a no-op in one case, and an increase in
another.  Please explain how you came to the conclusion that this
amounts to reducing hte test surface.  Something appears to be amiss in
that reasoning.


>> and PR99372 for the gimplefe-28.c ICE.

> This is fixed trivially by the PR99352 patch as far as I can see.

If your patch also deals with the ICE that appears with the options
named in PR99372, great.

> Please verify (I'll post it later today).

Please Cc me so I don't miss it, thanks,

-- 
Alexandre Oliva, happy hacker  https://FSFLA.org/blogs/lxo/
   Free Software Activist GNU Toolchain Engineer
Vim, Vi, Voltei pro Emacs -- GNUlius Caesar


Re: ping^1 Re: [PATCH] Darwin, testsuite : Prune 'object file not found for object'.

2021-03-03 Thread Jeff Law via Gcc-patches



On 3/3/21 12:39 PM, Iain Sandoe wrote:
> Iain Sandoe  wrote:
>
>> This is not a GCC problem, but a fault in the static linker where,
>> when a source file is used multiple times, with conditional compilation
>> the source file is only referenced by the linker for the first object.
>> Then, when dsymutil tries to find the source file for next object based
>> off that source there is no record for it.
>>
>> I’ve had this patch kicking around for some time, in the hope that the
>> problem would be fixed in the XCode tools, but it’s still present in the
>> XC12.5b2 (and will never be fixed in older toolkits).
>>
>> tested on *-darwin* and x86_64-linux-gnu,
>> OK for master?
>> Iain
>> (if were not in stage 4, I’d have applied it as obvious).
>
> ^^^
Then I'll give it an explicit OK :-)
jeff



Re: [PATCH PING^3] Add input_modes parameter to TARGET_MD_ASM_ADJUST hook

2021-03-03 Thread Jeff Law via Gcc-patches



On 3/3/21 1:42 PM, Ilya Leoshkevich wrote:
> On Wed, 2021-03-03 at 21:26 +0100, Ilya Leoshkevich via Gcc-patches
> wrote:
>> On Wed, 2021-03-03 at 13:02 -0700, Jeff Law wrote:
>>>
>>> On 3/2/21 4:50 PM, Ilya Leoshkevich via Gcc-patches wrote:
 Hello,

 I would like to ping the following patch:

 Add input_modes parameter to TARGET_MD_ASM_ADJUST hook
  https://gcc.gnu.org/pipermail/gcc-patches/2021-January/562898.html

 It is needed for the following regression fix:

 IBM Z: Fix usage of "f" constraint with long doubles
  https://gcc.gnu.org/pipermail/gcc-patches/2021-January/564380.html


 Jakub, who would be the right person to review this change?  I've
 decided to ask you, since `git shortlog -ns gcc/cfgexpand.c` shows
 that
 you deal with this code a lot.

 Best regards,
 Ilya




 If TARGET_MD_ASM_ADJUST changes a mode of an input operand (which
 should be ok as long as the hook itself as well as after_md_seq
 make up
 for it), input_mode will contain stale information.

 It might be tempting to fix this by removing input_mode altogether
 and
 just using GET_MODE (), but this will not work correctly with
 constants.
 So add input_modes parameter and document that it should be updated
 whenever inputs parameter is updated.

 gcc/ChangeLog:

 2021-01-05  Ilya Leoshkevich  

 * cfgexpand.c (expand_asm_loc): Pass new parameter.
 (expand_asm_stmt): Likewise.
 * config/arm/aarch-common-protos.h (arm_md_asm_adjust): Add
 new
 parameter.
 * config/arm/aarch-common.c (arm_md_asm_adjust): Likewise.
 * config/arm/arm.c (thumb1_md_asm_adjust): Likewise.
 * config/cris/cris.c (cris_md_asm_adjust): Likewise.
 * config/i386/i386.c (ix86_md_asm_adjust): Likewise.
 * config/mn10300/mn10300.c (mn10300_md_asm_adjust):
 Likewise.
 * config/nds32/nds32.c (nds32_md_asm_adjust): Likewise.
 * config/pdp11/pdp11.c (pdp11_md_asm_adjust): Likewise.
 * config/rs6000/rs6000.c (rs6000_md_asm_adjust): Likewise.
 * config/vax/vax.c (vax_md_asm_adjust): Likewise.
 * config/visium/visium.c (visium_md_asm_adjust): Likewise.
 * target.def (md_asm_adjust): Likewise.
>>> Ugh.    A couple questions
>>> Are there any cases where you're going to want to change modes for
>>> arguments that were constants?   I'm a bit surprised that we don't
>>> have
>>> a mode for constants for the cases that we care about.  Presumably we
>>> can get a (modeless) CONST_INT here and we're not restricted to
>>> CONST_DOUBLE and friends (which have modes).
>> Yes, this might happen.  For example, here:
>>
>>     asm("sqxbr\t%0,%1" : "=f"(res) : "f"(0x1.1p+0L));
>>
>> the (const_double) and the corresponding operand will initially have 
>> the mode TFmode.  s390_md_asm_adjust () will add a conversion from
>> TFmode to FPRX2mode and change the argument accordingly.
> Just to be more precise: the mode of the (const_double) itself will not
> change.  Here is the resulting RTL for the asm statement above:
>
> # s390_md_asm_adjust () step 1: put the (const_double) operand into a
> # new (reg) with the same mode
> (insn (set (reg:TF 63)
>(const_double:TF ...)))
OK, but you still need to know the constant's original mode so that you
can get the right  mode on the insn above and in the SUBREG_REG for the
insn in step #2.  Those cases were exposed by the testsuite.

I'd originally wondered if we could just not adjust in the CONST_INT
case, but I think I was mis-understanding the implementation of
s390_md_asm_adjust in an important way.

I also think I focused too much on expand_asm_loc where it doesn't look
like we read input_mode after the call to MD_ASM_ADJUST.   But we
certainly do in expand_asm_stmt.


OK for the trunk.

Thanks,
jeff



Re: [PATCH] coroutines: Correct frame capture of compiler temps [PR95591+4].

2021-03-03 Thread Jason Merrill via Gcc-patches
I notice this patch includes

+  var_nest_node () = default;

which will break GCC 10 bootstrap with a C++98 compiler; we only
switched to C++11 for GCC 11.


On Fri, Jul 17, 2020 at 8:57 AM Nathan Sidwell  wrote:
>
> On 7/17/20 3:37 AM, Richard Biener wrote:
> > On Thu, Jul 16, 2020 at 7:39 PM Nathan Sidwell  wrote:
> >>
> >> On 7/16/20 11:56 AM, Iain Sandoe wrote:
> >>> Hello folks,
> >>
> >> It's unfortunate the original plan of handling lifetime issues in a
> >> gimple pass didn't work out.
> >>
> >>> OK for master?
> >>>
> >>> OK for 10.x?
> >
> > IMHO too big of a change for 10.2 but of course fine after the
> > 10.2 release which gives it some beating-time on trunk as well.
>
> I agree.  I forgot to add that after reviewing.  We (FB) will apply this
> patch when updating to 10.2.  Given we fell over many of the bugs, it'd
> be rude not to!
>
> >
> > Richard.
> >
> >> ok, some nits
> >>> +struct interesting
> >>> +{
> >>
> >> not a very good name.  As it's not in an anonumous namespace, it has
> >> external linkage, and hence ripe for ODR breakage, if some other TU is
> >> interested about something else :)  Is there a more specific name it
> >> could have?
> >>
> >>
> >>> +  else if ((tmp_target_expr_p (expr)
> >>> + && !p->temps_used->contains (expr)))
> >>
> >> too many parens here.
> >>
> >> nathan
> >>
> >> --
> >> Nathan Sidwell
>
>
> --
> Nathan Sidwell
>



Re: [PATCH v4 1/2] MIPS: Not trigger error for pre-R6 and -mcompact-branches=always

2021-03-03 Thread Jeff Law via Gcc-patches



On 2/23/21 3:14 AM, YunQiang Su wrote:
> For MIPSr6, we may wish to use compact-branches only.
> Currently, we have to use `always' option, while it is mark as conflict
> with pre-R6.
>   cc1: error: unsupported combination: ‘mips32r2’ -mcompact-branches=always
> Just ignore -mcompact-branches=always for pre-R6.
>
> This patch also defines
> __mips_compact_branches_never
> __mips_compact_branches_always
> __mips_compact_branches_optimal
> predefined macros
>
> gcc/ChangeLog:
>   * config/mips/mips.c (mips_option_override):
>   * config/mips/mips.h (TARGET_RTP_PIC): not trigger error for
>   compact-branches=always for pre-R6.
>   (TARGET_CB_NEVER): Likewise.
>   (TARGET_CB_ALWAYS): Likewise.
>   (struct mips_cpu_info): define macros for compact branch policy.
>   * doc/invoke.texi: Document "always" with pre-R6.
>
> gcc/testsuite/ChangeLog:
>   * gcc.target/mips/compact-branches-1.c: add isa_rev>=6.
>   * gcc.target/mips/mips.exp: don't add -mipsXXr6 option for
>   -mcompact-branches=always. It is usable for pre-R6 now.
>   * gcc.target/mips/compact-branches-8.c: New test.
>   * gcc.target/mips/compact-branches-9.c: New test.
So I think Maciej's comment was that you simply shouldn't be using
-mcompact-branches=always at mips32r2 (or anything pre-r6) together.

I think what you're trying to do here is set up a scenario where you're
defaulting to mips32r6 and compact-branches, but not error if something
specifies -mcpu=mips32r2 or something similar, right?

jeff



Re: [PATCH] ipa: Fix resolving speculations through cgraph_edge::set_call_stmt (PR 98078)

2021-03-03 Thread Jeff Law via Gcc-patches



On 1/21/21 2:59 AM, Martin Jambor wrote:
> Hi,
>
> in the PR 98078 testcase, speculative call-graph edges which were
> created by IPA-CP are confirmed during inlining but
> cgraph_edge::set_call_stmt does not take it very well.
>
> The function enters the update_speculative branch and updates the edges
> in the speculation bundle separately (by a recursive call), but when it
> processes the first direct edge, most of the bundle actually ceases to
> exist because it is devirtualized.  It nevertheless goes on to attempt
> to update the indirect edge (that has just been removed), which
> surprisingly gets as far as adding the edge to the call_site_hash, the
> same devirtualized edge for the second time, and that triggers an
> assert.
>
> Fixed by this patch which makes the function aware that it is about to
> resolve a speculation and do so instead of updating components of
> speculation.  Also, it does so before dealing with the hash because
> the speculation resolution code needs the hash to point to the first
> speculative direct edge and also cleans the hash up by calling
> update_call_stmt_hash_for_removing_direct_edge.
>
> I don't have a testcase, at least not yet, the one in BZ does not link
> when it does not ICE.  I can try to find some time to make it link it is
> deemed very important.
>
> Bootstrapped and tested on x86_64-linux, also profile-LTO-bootstrapped
> on the same system.  OK for trunk?  What about gcc10, where we cannot
> trigger it but I suppose the bug is there?
>
> Thanks,
>
> Martin
>
>
> gcc/ChangeLog:
>
> 2021-01-20  Martin Jambor  
>
>   PR ipa/98078
>   * cgraph.c (cgraph_edge::set_call_stmt): Do not update all
>   corresponding speculative edges if we are about to resolve
>   sepculation.  Make edge direct (and so resolve speculations) before
>   removing it from call_site_hash.
>   (cgraph_edge::make_direct): Relax the initial assert to allow calling
>   the function on speculative direct edges.
OK.

jeff



Re: PING [PATCH] avoid -Warray-bounds checks for vtable assignments (PR 98266)

2021-03-03 Thread Martin Sebor via Gcc-patches

On 3/2/21 7:11 AM, Jason Merrill wrote:

On 3/1/21 6:11 PM, Martin Sebor wrote:

On 2/24/21 5:35 PM, Jason Merrill wrote:

On 2/23/21 6:07 PM, Martin Sebor wrote:

On 2/23/21 2:52 PM, Jason Merrill wrote:

On 2/23/21 11:02 AM, Martin Sebor wrote:

[CC Jason for any further comments/clarification]

On 2/9/21 10:49 AM, Martin Sebor wrote:

On 2/8/21 4:11 PM, Jeff Law wrote:



On 2/8/21 3:44 PM, Martin Sebor wrote:

On 2/8/21 3:26 PM, Jeff Law wrote:



On 2/8/21 2:56 PM, Martin Sebor wrote:

On 2/8/21 12:59 PM, Jeff Law wrote:



On 1/19/21 5:56 PM, Martin Sebor via Gcc-patches wrote:
Similar to the problem reported for -Wstringop-overflow in 
pr98266

and already fixed, -Warray-bounds is also susceptible to false
positives in assignments and copies involving virtual 
inheritance.
Because the two warnings don't share code yet (hopefully in 
GCC 12)

the attached patch adds its own workaround for this problem to
gimple-array-bounds.cc, this one slightly more crude 
because of
the limited insight the array bounds checking has into the 
checked

expressions.

Tested on x86_64-linux.

Martin

gcc-98266.diff

PR middle-end/98266 - bogus array subscript is partly 
outside array

bounds on virtual inheritance

gcc/ChangeLog:

  PR middle-end/98266
  * gimple-array-bounds.cc
(array_bounds_checker::check_array_bounds):
  Avoid checking references involving artificial members.

gcc/testsuite/ChangeLog:

  PR middle-end/98266
  * g++.dg/warn/Warray-bounds-15.C: New test.
It seems to me that we've got the full statement at some 
point and

thus
the full expression so at some point couldn't we detect when
TYPE_SIZE_UNIT!= DECL_SIZE_UNIT?  Or should we be using 
TYPE_SIZE_UNIT

rather than DECL_SIZE_UNIT in gimple-array-bounds.cc

Am I missing something?


The expression we're looking at when the false positive is 
issued

is the MEM_REF in the LHS of:

MEM[(struct D *)&D.2652 + 24B]._vptr.D = &MEM  
[(void

*)&_ZTC1E24_1D + 24B];

TREE_TYPE(LHS) is D, DECL_SIZE_UNIT (D.2652) is 24, and
TYPE_SIZE_UNIT(D) is also 24, so there's no discrepancy between
DECL_SIZE and TYPE_SIZE.
So that seems like it's a different issue then, unrelated to 
97595.

Right?


I think the underlying problem is the same.  We're getting a size
that doesn't correspond to what's actually being accessed, and it
happens because of the virtual inheritance.  In pr97595 Jason
suggested to use the decl/type size inequality to identify this
case but I think we could have just as well used DECL_ARTIFICIAL
instead.  At least the test cases from pr97595 both pass with
this change.
But in the 98266 case the type and decl sizes are the same.  So 
to be

true that would mean that the underlying type we're using to access
memory differs from its actual type.  Is that the case in the 
IL? And
does this have wider implications for diagnostics or 
optimizations that

rely on accurate type sizing?

I'm just trying to make sure I understand, not accepting or 
rejecting

the patch yet.


The part of the IL with the MEM_REF is this:

void g ()
{
   void * D.2789;
   struct E D.2652;

    [local count: 1073741824]:
   E::E (&D.2652, "");
   f (&D.2652);

    [local count: 1073741824]:
   MEM[(struct D *)&D.2652 + 24B]._vptr.D = &MEM  
[(void *)&_ZTC1E24_1D + 24B];

   ...

The access here is to the _vptr.D pointer member of D.2652 which is
just past the end of the parent object (as reflected by its SIZE):
it sets sets up the virtual table pointer.

The access in pr97595 is to the member subobject, which, as Jason
explained (and I accordingly documented under DECL_SIZE in tree.h),
is also laid out separately from the parent object.

These cases aren't exactly the same (which is also why the test
I added for -Warray-bounds in pr97595 didn't expose this bug) but
they are closely related.  The one here can be distinguished by
DECL_ARTIFICAL.  The other by the DECL_SIZE != TYPE_SIZE member
inequality.

Might this impact other warnings?  I'd say so if they don't take
these things into account.  I just learned about this in pr97595
which was a -Wstringop-overflow false positive but I also saw
a similar instance of -Warray-bounds with my patch to improve
caching and enhance array bounds checking.  I dealt with that
instance of the warning in that patch but proactively added
a test case to the fix for pr97595.  But the test case is focused
on the subobject access and not on one to the virtual table so
(as I said above) it didn't expose this bug.

Might this also impact optimizations?  I can imagine someone
unaware of this "gotcha" making the same "naive" assumption
I did, but I'd also expect such an invalid assumption to be
found either in code review or quickly cause problems in
testing.


Jeff, does this answer your question?


I don't see how the issue here depends on the artificiality of the 
vptr; I'd expect to see the same problem with a data member.  The 
problem is that a D base subobject is smaller than a complete D 
object, and in this case the base 

Re: [PATCH 2/3 V2] Do not include stdio.h in libgcc's Decimal/Float128 conversions.

2021-03-03 Thread Joseph Myers
On Wed, 3 Mar 2021, Michael Meissner via Gcc-patches wrote:

> As we have discussed many times, on 32-bit BE, you cannot use hardware
> _Float128 support on power9/power10 because there is no TImode in 32-bit.
> Various machine independent parts of GCC require an integer type to be the 
> same
> size as basic types.  If somebody made TImode work, we can remove the
> restriction and allow _Float128 to work on 32-bit.

I'm not sure exactly what the machine-independent requirement is, but 
_Float128 is supported for 32-bit x86, riscv, sparc and s390 at least.

There's no support for _Float128 in the 32-bit powerpc ABI, but I don't 
think there is anything architecture-independent preventing such support 
on 32-bit platforms where TImode is unsupported (i.e. not supported by the 
scalar_mode_supported_p hook).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH 2/3] MIPS: add builtime option for -mcompact-branches

2021-03-03 Thread Maciej W. Rozycki
On Fri, 19 Feb 2021, YunQiang Su wrote:

> >  My understanding therefore is that the original assumption that `optimal'
> > will serve people best is no longer true.
> >
> 
> I guess that `optimal' can still produce the best performance, while
> the delay slot
> make MIPS quite differnent with other architectures.
> And the hardware engineers seems hate it also.

 Right, but what does it have to do with compiler defaults?  Given what we 
have available in hardware we want the best results possible, except for 
research or special use cases (such as GAS's `-minsn32' option with 
microMIPS code).  I would like to understand what the use case is here.

> And we expect that MIPS can have as few as possible differnece delta
> with other major architectures,
> to ultily all of new framworks of community.

 Well, machine code is inherently architecture-specific, so you can't 
have a single template that fits all.  The difference betwen processor 
architectures is more than just the bit patterns for individual opcodes
and operand encodings (and the corresponding mnemonics and syntax for 
the assembly language).

 For example one of the major architectures is ARM, which has conditions 
encoded with all the instructions.  And you cant mimic it with other ISAs.  
Similarly Power has 8 sets of condition codes and dedicated instructions 
to make ALU operations between these codes.  You can't do those elsewhere 
either.  Well, the MIPS or RISC-V ISAs do not have condition codes at all. 
And x86 is not a load-store architecture at all, so you'll see operations 
made directly on memory, as a destination even (let's ignore the even more 
arcane original 32-bit instruction set).

 These are all considered major architectures nowadays.

 So we have got the MIPS ISA and its delay slots.  Some subsets/variations 
of the ISA have already either greatly reduced their use or eliminated 
them completely, but we went into great lengths with GCC to produce good 
code making use of these delay slots, so I think it would be a shame to 
get this effort wasted on one hand, and MIPS code put at a disadvantage 
due to cycles wasted for pipeline stalls that could be avoided if delay 
slots were scheduled -- on the other.

> >  First, I think it would be good if we knew why.  I find proliferating
> > variants of defaults, especially for the less obvious cases, will cause
> > user confusion as one won't know what code model to expect, especially as
> > (please correct me if I am wrong) we don't actually provide a way to dump
> > the compiler's overridable configuration defaults.
> >
> 
> So, should we provide a predefined macro for it?

 I've been thinking more along `gcc -v --version' dumping the invocation 
of `configure' used, but I have to correct myself here in that it already 
happens, so nothing to do.  I'm not sure why I forgot it and/or could not 
have figured it out previously.  Sorry about the confusion.

> >  Second, I wonder if it makes sense to just keep things simple, and rather
> > than adding `prefer' (to stand for "`always' if available"), and possibly
> > `avoid' (to stand for "`never' if available"), whether we shouldn't just
> > relax the checks for `always' and `never', and let them through whether
> > the architecture selected provides for the option chosen or not.
> >
> 
> relax the `always' is what I would like to do first.
> But I was afread to break some complatiable.

 Hmm, honestly I don't think there could be any compatibility to care of 
here given that the compiler currently refuses to run with such an option 
combination.  Nobody may have relied on that then and the extra protection 
given is in my opinion a bit excessive.  Garbage in, garbage out: you get 
what you have requested.  Our usual policy with irrelevant options has 
been to silently ignore them, which helps users override Makefile defaults 
by just having CFLAGS, etc. appended to whatever the defaults are.

> >  Please note that in the discussion quoted Catherine has already raised a
> > concern I agree with of adding a complication here, and now we would
> > complicate it even further by not only adding a fourth choice, but another
> > overridable configuration default as well.
> 
> I am still concern about whether we should just set the `always' as default.
> My short team plan is to set it default for Debian r6 Port.
> So, at least, I wish that we can provide a buildtime option for other need.

 You're free with what you do with your distribution, although it seems to 
me like it's going to be a performance regression.  I suggest that you do 
some benchmarking with real code and hardware before you decide.  Maybe 
you can prove me wrong and there will be no loss of any significance.

  Maciej


Re: [PATCH v4 1/2] MIPS: Not trigger error for pre-R6 and -mcompact-branches=always

2021-03-03 Thread Maciej W. Rozycki
On Wed, 3 Mar 2021, Jeff Law wrote:

> > gcc/testsuite/ChangeLog:
> > * gcc.target/mips/compact-branches-1.c: add isa_rev>=6.
> > * gcc.target/mips/mips.exp: don't add -mipsXXr6 option for
> > -mcompact-branches=always. It is usable for pre-R6 now.
> > * gcc.target/mips/compact-branches-8.c: New test.
> > * gcc.target/mips/compact-branches-9.c: New test.
> So I think Maciej's comment was that you simply shouldn't be using
> -mcompact-branches=always at mips32r2 (or anything pre-r6) together.
> 
> I think what you're trying to do here is set up a scenario where you're
> defaulting to mips32r6 and compact-branches, but not error if something
> specifies -mcpu=mips32r2 or something similar, right?

 Actually what I had in mind was relaxing the strictness of the `always' 
case.  I have replied in a more elaborate manner in the original thread 
where I raised my concerns.  I have not looked through the code of any 
newer proposals.

  Maciej


Re: [[PATCH] 1/3] aarch64: Sanitize access to cfun in aarch64_declare_function_name()

2021-03-03 Thread Christoph Müllner via Gcc-patches
[CCing Andrew Pinski, who had the same question]

On 2/15/21 1:12 PM, Richard Earnshaw wrote:
> On 09/12/2020 17:21, Christoph Müllner wrote:
>> From: Christoph Muellner 
>>
>> It is possible to call aarch64_declare_function_name() and
>> have cfun not set. Let's sanitize the access to this variable.
>>
>> gcc/
>>
>> * config/aarch64/aarch64.c (aarch64_declare_function_name):
>> Santize access to cfun.
>> ---
>>  gcc/config/aarch64/aarch64.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
>> index 67ffba02d3e..264ccb8beb2 100644
>> --- a/gcc/config/aarch64/aarch64.c
>> +++ b/gcc/config/aarch64/aarch64.c
>> @@ -19317,7 +19317,8 @@ aarch64_declare_function_name (FILE *stream, const 
>> char* name,
>>ASM_OUTPUT_TYPE_DIRECTIVE (stream, name, "function");
>>ASM_OUTPUT_LABEL (stream, name);
>>  
>> -  cfun->machine->label_is_assembled = true;
>> +  if (cfun)
>> +cfun->machine->label_is_assembled = true;
>>  }
>>  
>>  /* Implement PRINT_PATCHABLE_FUNCTION_ENTRY.  Check if the patch area is 
>> after
>>
> 
> Do you have a simple testcase that demonstrates this?  If so, we likely
> have a regression here that will not only need fixing, but back-porting
> as well.

I was testing this on master back then in December and it this patch
had an influence on the tests (make check), but I can't recall which ones.
I rebased the patches today and cannot reproduce the issues anymore with "make 
check".
I.e. this patch does not influence any tests as of today's master branch 
(anymore).

However, if I also apply patch 3/3 of this series, then this patch is needed.
More or less all new tests of patch 3/3 trigger this test otherwise.

Thanks,
Christoph


Re: [PATCH] correct fix to avoid false positives for vectorized stores (PR 96963, 94655)

2021-03-03 Thread Martin Sebor via Gcc-patches

On 3/3/21 3:31 AM, Richard Biener wrote:

On Tue, Mar 2, 2021 at 9:23 PM Martin Sebor  wrote:


On 3/2/21 3:39 AM, Richard Biener wrote:

On Fri, Jan 22, 2021 at 12:39 AM Martin Sebor via Gcc-patches
 wrote:


The hack I put in compute_objsize() last January for pr93200 isn't
quite correct.  It happened to suppress the false positive there
but, due to what looks like a thinko on my part, not in some other
test cases involving vectorized stores.

The attached change adjusts the hack to have compute_objsize() give
up on all MEM_REFs with a vector type.  This effectively disables
-Wstringop-{overflow,overread} for vector accesses (either written
by the user or synthesized by GCC from ordinary accesses).  It
doesn't affect -Warray-bounds because this warning doesn't use
compute_objsize() yet.  When it does (it should considerably
simplify the code) some additional changes will be needed to
preserve -Warray-bounds for out of bounds vector accesses.
The test this patch adds should serve as a reminder to make
it if we forget.

Tested on x86_64-linux.  Since PR 94655 was reported against GCC
10 I'd like to apply this fix to both the trunk and the 10 branch.


The proposed code reads even more confusingly than before.
What is called 'ptr' is treated as a reference and what you
then name ptrtype is the type of the reference.

That leads to code like

+   if (code == MEM_REF && TREE_CODE (ptrtype) == VECTOR_TYPE)

what?  the pointer type is a VECTOR_TYPE?!

I think you are looking for whether the reference type is a VECTOR_TYPE.

Part of the confusion is likely because the code commons

if (code == ARRAY_REF || code == MEM_REF)

but in one case operand zero is a pointer to the object (MEM_REF) and
in one case
it is the object (ARRAY_REF).

Please refactor this code so one can actually read it literally
without second-guessing proper variable names.


I agree that handling both REFs in the same block makes the code
more difficult to follow than it needs to be.

In the attached patch I've factored the code out into two helper
functions, one for ARRAY_REF and another for MEM_REF, and chosen
(hopefully) clearer names for the local variables.

A similar refactoring could be done for COMPONENT_REF and also
for SSA_NAME to reduce the size of the function and make it much
easier to follow.  I stopped short of doing that but I can do it
for GCC 12 when I move the whole compute_objsize() implementation
into a more appropriate  place (e.g., its own file).


+  if (!addr && TREE_CODE (TREE_TYPE (reftype)) == POINTER_TYPE)

POINTER_TYPE_P ()


I think this is intentional: POINTER_TYPE_P() includes C++ references
and although they don't come up here (there's no such thing as arrays
of references) it didn't seem appropriate to be including them in
the test even if it's benign.



+/* Avoid arrays of pointers.  FIXME: Hande pointers to arrays
+   of known bound.  */
+return false;

+  if (TREE_CODE (TREE_TYPE (mref)) == VECTOR_TYPE)
+{
+  /* Hack: Give up for MEM_REFs of vector types; those may be
+synthesized from multiple assignments to consecutive data
+members (see PR 93200 and 96963).

VECTOR_TYPE_P (TREE_TYPE (mref))


Okay.



+FIXME: Vectorized assignments should only be present after
+vectorization so this hack is only necessary after it has
+run and could be avoided in calls from prior passes (e.g.,
+tree-ssa-strlen.c).

You could gate this on cfun->curr_properties & PROP_gimple_lvec
which is only set after vectorization.  Users can write vector
accesses using intrinsics or GCCs vector extension and I guess
warning for those is useful (and they less likely will cover multiple
fields).


Thanks, that's useful to know!  I'll keep it in mind for GCC 12.



Note the vectorizer also uses integer types for vector accesses
either when vectorizing using 'generic' vectors (for loads, stores
and bit operations we don't need any vector ISA) or when
composing vectors.

Note store-merging has the same issue (but fortunately runs later?)


Yes, it's the same issue (as is FRE/PRE using one member to write
to the next) and it does cause false positives depending on where
the code is called from.  They're on my to do list for GCC 12 to
avoid as we discussed, by transform those into MEM_REFs with
the enclosing object + offset instead of the member.



OK with the above changes.


I've committed the patch without the POINTER_TYPE_P change (I'm
okay with it if you feel it necessary.)

Martin



Thanks,
Richard.


Martin



Thanks,
Richard.




Martin






gcc.misc-tests/outputs.exp: enumerate tests

2021-03-03 Thread Hans-Peter Nilsson via Gcc-patches
> From: Hans-Peter Nilsson 
> Date: Tue, 2 Mar 2021 23:58:08 +0100

> > From: Jeff Law 
> > Date: Tue, 2 Mar 2021 23:39:27 +0100

> > On 2/24/21 10:17 PM, Hans-Peter Nilsson via Gcc-patches wrote:

> > > Ok to commit?  Or is a renaming patch appending a
> > > number-suffix, like:
> > >
> > > --- outputs.exp.orig3 2021-02-25 06:13:28.304243791 +0100
> > > +++ outputs.exp   2021-02-25 06:13:51.575457825 +0100
> > > @@ -280,8 +280,8 @@ if { "$aout" != "" } then {
> > >  }
> > >  
> > >  # Driver-chosen outputs.
> > > -outest "$b asm default 1" $sing "-S" {} {{-0.s}}
> > > -outest "$b asm default 2" $mult "-S" {} {{-1.s -2.s}}
> > > +outest "$b-1 asm default 1" $sing "-S" {} {{-0.s}}
> > > +outest "$b-2 asm default 2" $mult "-S" {} {{-1.s -2.s}}
> > > ...
> > >
> > > ...better and ok to commit?  (IMHO: yes, much easier to follow)
> > >
> > > gcc/testsuite:
> > >   * gcc.misc-tests/outputs.exp: Append discriminating
> > >   suffixes to tests with duplicate names.
> > >   (outest): Assert that each running test has a unique
> > >   name.
> > OK.  And I think that in general changes which fix duplicated testnames
> > should be considered as not needing approval.  Just fix 'em and post the
> > patch for the historical record :-)
> 
> Ok, thanks for the review!
> 
> Since there were two versions suggested and (to me) the "OK"
> was ambiguous, I'll go ahead with the suggested (but
> scripted) renaming of e.g. "$b blah", "$b foo", "$b blah
> with futz", "$b frob" (etc) into "$b-1 blah", "$b-2 foo",
> "$b-3 blah with futz", "$b-4 frob" (etc) ;-) But, I'm going
> to wait 24h.

I applied this on top of the other patches, separately committed:

Go from:
PASS: outputs asm default 1: outputs-0.s
PASS: outputs asm default 1: extra
PASS: outputs asm default 1: std out
PASS: outputs asm default 2: outputs-1.s
PASS: outputs asm default 2: outputs-2.s
PASS: outputs asm default 2: extra
PASS: outputs asm default 2: std out
PASS: outputs obj default 1: outputs-0.o
PASS: outputs obj default 1: extra
PASS: outputs obj default 1: std out
...

to:
PASS: outputs-1 asm default 1: outputs-0.s
PASS: outputs-1 asm default 1: extra
PASS: outputs-1 asm default 1: std out
PASS: outputs-2 asm default 2: outputs-1.s
PASS: outputs-2 asm default 2: outputs-2.s
PASS: outputs-2 asm default 2: extra
PASS: outputs-2 asm default 2: std out
PASS: outputs-3 obj default 1: outputs-0.o
PASS: outputs-3 obj default 1: extra
PASS: outputs-3 obj default 1: std out
...

thereby helping a human to quickly pin-point the result of a
specific test.

This mechanical patch was produced by applying
perl -pe 's/^(outest "\$b) /sprintf("%s-%d ",$1, ++$n)/e;'
to outputs.exp.

gcc/testsuite:
* gcc.misc-tests/outputs.exp: Enumerate tests.
---
 gcc/testsuite/gcc.misc-tests/outputs.exp | 790 +++
 1 file changed, 395 insertions(+), 395 deletions(-)

diff --git a/gcc/testsuite/gcc.misc-tests/outputs.exp 
b/gcc/testsuite/gcc.misc-tests/outputs.exp
index 0e5c1a55ce87..b2dd0880ba1e 100644
--- a/gcc/testsuite/gcc.misc-tests/outputs.exp
+++ b/gcc/testsuite/gcc.misc-tests/outputs.exp
@@ -280,238 +280,238 @@ if { "$aout" != "" } then {
 }
 
 # Driver-chosen outputs.
-outest "$b asm default 1" $sing "-S" {} {{-0.s}}
-outest "$b asm default 2" $mult "-S" {} {{-1.s -2.s}}
+outest "$b-1 asm default 1" $sing "-S" {} {{-0.s}}
+outest "$b-2 asm default 2" $mult "-S" {} {{-1.s -2.s}}
 
-outest "$b obj default 1" $sing "-c" {} {{-0.o}}
-outest "$b obj default 2" $mult "-c" {} {{-1.o -2.o}}
+outest "$b-3 obj default 1" $sing "-c" {} {{-0.o}}
+outest "$b-4 obj default 2" $mult "-c" {} {{-1.o -2.o}}
 
-outest "$b exe default 1" $sing "$oaout" {} {{$aout}}
-outest "$b exe default 2" $mult "$oaout" {} {{$aout}}
+outest "$b-5 exe default 1" $sing "$oaout" {} {{$aout}}
+outest "$b-6 exe default 2" $mult "$oaout" {} {{$aout}}
 
 # Driver-chosen aux outputs.
-outest "$b asm savetmp 1" $sing "-S -save-temps" {} {{-0.i -0.s}}
-outest "$b asm savetmp 2" $mult "-S -save-temps" {} {{-1.i -1.s -2.i -2.s}}
-outest "$b obj savetmp unnamed1" $sing "-c -save-temps" {} {{-0.i -0.s -0.o}}
-outest "$b obj savetmp unnamed2" $mult "-c -save-temps" {} {{-1.i -1.s -1.o 
-2.i -2.s -2.o}}
+outest "$b-7 asm savetmp 1" $sing "-S -save-temps" {} {{-0.i -0.s}}
+outest "$b-8 asm savetmp 2" $mult "-S -save-temps" {} {{-1.i -1.s -2.i -2.s}}
+outest "$b-9 obj savetmp unnamed1" $sing "-c -save-temps" {} {{-0.i -0.s -0.o}}
+outest "$b-10 obj savetmp unnamed2" $mult "-c -save-temps" {} {{-1.i -1.s -1.o 
-2.i -2.s -2.o}}
 
 # Aux outputs computed within the driver, based on output name (and
 # input).
-outest "$b cpp savetmp named0" $sing "-E -o $b-0.i -save-temps" {} {{-0.i}}
-outest "$b asm savetmp named0" $sing "-S -o $b-0.s -save-temps" {} {{-0.i 
-0.s}}
-outest "$b obj savetmp named0" $sing "-c -o $b-0.o -save-temps" {} {{-0.i -0.s 
-0.o}}
-outest "$b cpp savetmp namedb" $sing "-E -o $b.i -save-temps" {} {{.i}}
-outest "$b asm savetmp namedb" $sing "-S -o $b.s -save-temps" {} {{.i .s}}

Re: [[PATCH] 1/3] aarch64: Sanitize access to cfun in aarch64_declare_function_name()

2021-03-03 Thread Andrew Pinski via Gcc-patches
On Wed, Mar 3, 2021 at 4:02 PM Christoph Müllner  wrote:
>
> [CCing Andrew Pinski, who had the same question]
>
> On 2/15/21 1:12 PM, Richard Earnshaw wrote:
> > On 09/12/2020 17:21, Christoph Müllner wrote:
> >> From: Christoph Muellner 
> >>
> >> It is possible to call aarch64_declare_function_name() and
> >> have cfun not set. Let's sanitize the access to this variable.
> >>
> >> gcc/
> >>
> >> * config/aarch64/aarch64.c (aarch64_declare_function_name):
> >> Santize access to cfun.
> >> ---
> >>  gcc/config/aarch64/aarch64.c | 3 ++-
> >>  1 file changed, 2 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> >> index 67ffba02d3e..264ccb8beb2 100644
> >> --- a/gcc/config/aarch64/aarch64.c
> >> +++ b/gcc/config/aarch64/aarch64.c
> >> @@ -19317,7 +19317,8 @@ aarch64_declare_function_name (FILE *stream, const 
> >> char* name,
> >>ASM_OUTPUT_TYPE_DIRECTIVE (stream, name, "function");
> >>ASM_OUTPUT_LABEL (stream, name);
> >>
> >> -  cfun->machine->label_is_assembled = true;
> >> +  if (cfun)
> >> +cfun->machine->label_is_assembled = true;
> >>  }
> >>
> >>  /* Implement PRINT_PATCHABLE_FUNCTION_ENTRY.  Check if the patch area is 
> >> after
> >>
> >
> > Do you have a simple testcase that demonstrates this?  If so, we likely
> > have a regression here that will not only need fixing, but back-porting
> > as well.
>
> I was testing this on master back then in December and it this patch
> had an influence on the tests (make check), but I can't recall which ones.
> I rebased the patches today and cannot reproduce the issues anymore with 
> "make check".
> I.e. this patch does not influence any tests as of today's master branch 
> (anymore).
>
> However, if I also apply patch 3/3 of this series, then this patch is needed.
> More or less all new tests of patch 3/3 trigger this test otherwise.

That still does not describe why cfun is null in the case of patch 3/3?
>From looking into that patch I noticed you call set_cfun wtih null in
output_indirect_thunk_function shouldn't you push and pop the cfun
instead?

Thanks,
Andrew Pinski

>
> Thanks,
> Christoph


[PATCH] c++: ICE with real-to-int conversion in template [PR97973]

2021-03-03 Thread Marek Polacek via Gcc-patches
In this test we are building a call in a template, but since neither
the function nor any of its arguments are dependent, we go down the
normal path in finish_call_expr.  convert_arguments sees that we're
binding a reference to int to double and therein convert_to_integer
creates a FIX_TRUNC_EXPR.  Later, we call check_function_arguments
which folds the arguments, and, in a template, fold_for_warn calls
fold_non_dependent_expr.  But tsubst_copy_and_build should not see
a FIX_TRUNC_EXPR (see the patch discussed in
)
or we crash.

So let's not create a FIX_TRUNC_EXPR in a template in the first place
and instead use IMPLICIT_CONV_EXPR.

(I'm not sure why tsubst* also doesn't crash on a FLOAT_EXPR; it'd
make sense to me to also create an IMPLICIT_CONV_EXPR for integer
to real convs and add "case FLOAT_EXPR" just under FIX_TRUNC_EXPR.
But perhaps that's too risky to do now.)

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/10?

gcc/cp/ChangeLog:

PR c++/97973
* call.c (convert_like): When converting real to integer in
a template, use IMPLICIT_CONV_EXPR.

gcc/testsuite/ChangeLog:

PR c++/97973
* g++.dg/conversion/real-to-int1.C: New test.
---
 gcc/cp/call.c  |  7 ++-
 gcc/testsuite/g++.dg/conversion/real-to-int1.C | 17 +
 2 files changed, 23 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/conversion/real-to-int1.C

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index 123f06b1f2b..8074d8fdba8 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -8062,7 +8062,12 @@ convert_like (conversion *convs, tree expr, tree fn, int 
argnum,
   tree conv_expr = NULL_TREE;
   if (processing_template_decl
   && convs->kind != ck_identity
-  && (CLASS_TYPE_P (convs->type) || CLASS_TYPE_P (TREE_TYPE (expr
+  && (CLASS_TYPE_P (convs->type)
+ || CLASS_TYPE_P (TREE_TYPE (expr))
+ /* Converting real to integer produces FIX_TRUNC_EXPR which
+tsubst also doesn't grok.  */
+ || (SCALAR_FLOAT_TYPE_P (TREE_TYPE (expr))
+ && INTEGRAL_OR_ENUMERATION_TYPE_P (convs->type
 {
   conv_expr = build1 (IMPLICIT_CONV_EXPR, convs->type, expr);
   if (convs->kind != ck_ref_bind)
diff --git a/gcc/testsuite/g++.dg/conversion/real-to-int1.C 
b/gcc/testsuite/g++.dg/conversion/real-to-int1.C
new file mode 100644
index 000..f7b990b3f4b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/conversion/real-to-int1.C
@@ -0,0 +1,17 @@
+// PR c++/97973
+
+void (*foo[1])(const int &);
+void (*foo2[1])(const double &);
+
+template
+void f ()
+{
+  (foo[0])(1.1);
+  (foo2[0])(1);
+}
+
+void
+g ()
+{
+  f ();
+}

base-commit: 8d57bdadd2d9c2e5c95515ca7a583d7b407b55c4
-- 
2.29.2



Re: [PATCH 2/3 V2] Do not include stdio.h in libgcc's Decimal/Float128 conversions.

2021-03-03 Thread Michael Meissner via Gcc-patches
On Wed, Mar 03, 2021 at 11:33:52PM +, Joseph Myers wrote:
> On Wed, 3 Mar 2021, Michael Meissner via Gcc-patches wrote:
> 
> > As we have discussed many times, on 32-bit BE, you cannot use hardware
> > _Float128 support on power9/power10 because there is no TImode in 32-bit.
> > Various machine independent parts of GCC require an integer type to be the 
> > same
> > size as basic types.  If somebody made TImode work, we can remove the
> > restriction and allow _Float128 to work on 32-bit.
> 
> I'm not sure exactly what the machine-independent requirement is, but 
> _Float128 is supported for 32-bit x86, riscv, sparc and s390 at least.
> 
> There's no support for _Float128 in the 32-bit powerpc ABI, but I don't 
> think there is anything architecture-independent preventing such support 
> on 32-bit platforms where TImode is unsupported (i.e. not supported by the 
> scalar_mode_supported_p hook).

The problem is several parts of GCC insists that there be an integer MODE that
is the same size as the scalar MODE.  I recall it happens in moves (such as
during calls), but it could be other places as well.

In theory, it would be nice to get this fixed, but the practicality is you can
go down a lot of places to find the next bug and fix that.

Maybe there is somebody who has enough time and patience to fix all of the
places that demand that there be an integer type large enough to hold scalar
types like _Float128 and generate alternate code.

Or perhaps implement enough of TImode so that it doesn't cause issues.

But in order to do the work and make sure it does not cause issues for 32-bit
and hardware _Float128, you need to build on a big endian power9 system.  In
the compile farm there is gcc135, but it is run little endian.

BTW, during the initial development of MMA, I ran into the same issue for the
vector pair and vector quad types.  Each of the approaches that we took had
some serious issues.  What I did in the initial stages of development is add
dummy moves that would never succeed for 256 and 512-bit integers, and then use
partial integers to represent that these weren't quite real integers.  However,
even then we discovered various parts of the compiler would strip off the
partial integer type and use the full integer type, even though there was no
real support for that type.  Peter and Aaron eventually solved this by adding
support for opaque modes.

However, Float128 needs to be a scalar floating point mode, not an opaque mode.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


Re: [PATCH] coroutines: Correct frame capture of compiler temps [PR95591+4].

2021-03-03 Thread Iain Sandoe

Jason Merrill  wrote:


I notice this patch includes

+  var_nest_node () = default;

which will break GCC 10 bootstrap with a C++98 compiler; we only
switched to C++11 for GCC 11.


Hmm, the patch was already backported…
… I will fix this.

I missed the warning during testing.
Iain



Re: [Patch] Fortran: Fix -freal-{4,8}-real* handling [PR99355]

2021-03-03 Thread Jerry DeLisle

Yes OK got mainline.

On 3/3/21 3:45 AM, Tobias Burnus wrote:

Rather obvious change – don't apply replacement multiple times.

OK for mainline?

Tobias

-
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München 
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, 
Frank Thürauf




Re: [[PATCH] 1/3] aarch64: Sanitize access to cfun in aarch64_declare_function_name()

2021-03-03 Thread Christoph Müllner via Gcc-patches
On Thu, Mar 4, 2021 at 1:19 AM Andrew Pinski  wrote:

> On Wed, Mar 3, 2021 at 4:02 PM Christoph Müllner 
> wrote:
> >
> > [CCing Andrew Pinski, who had the same question]
> >
> > On 2/15/21 1:12 PM, Richard Earnshaw wrote:
> > > On 09/12/2020 17:21, Christoph Müllner wrote:
> > >> From: Christoph Muellner 
> > >>
> > >> It is possible to call aarch64_declare_function_name() and
> > >> have cfun not set. Let's sanitize the access to this variable.
> > >>
> > >> gcc/
> > >>
> > >> * config/aarch64/aarch64.c (aarch64_declare_function_name):
> > >> Santize access to cfun.
> > >> ---
> > >>  gcc/config/aarch64/aarch64.c | 3 ++-
> > >>  1 file changed, 2 insertions(+), 1 deletion(-)
> > >>
> > >> diff --git a/gcc/config/aarch64/aarch64.c
> b/gcc/config/aarch64/aarch64.c
> > >> index 67ffba02d3e..264ccb8beb2 100644
> > >> --- a/gcc/config/aarch64/aarch64.c
> > >> +++ b/gcc/config/aarch64/aarch64.c
> > >> @@ -19317,7 +19317,8 @@ aarch64_declare_function_name (FILE *stream,
> const char* name,
> > >>ASM_OUTPUT_TYPE_DIRECTIVE (stream, name, "function");
> > >>ASM_OUTPUT_LABEL (stream, name);
> > >>
> > >> -  cfun->machine->label_is_assembled = true;
> > >> +  if (cfun)
> > >> +cfun->machine->label_is_assembled = true;
> > >>  }
> > >>
> > >>  /* Implement PRINT_PATCHABLE_FUNCTION_ENTRY.  Check if the patch
> area is after
> > >>
> > >
> > > Do you have a simple testcase that demonstrates this?  If so, we likely
> > > have a regression here that will not only need fixing, but back-porting
> > > as well.
> >
> > I was testing this on master back then in December and it this patch
> > had an influence on the tests (make check), but I can't recall which
> ones.
> > I rebased the patches today and cannot reproduce the issues anymore with
> "make check".
> > I.e. this patch does not influence any tests as of today's master branch
> (anymore).
> >
> > However, if I also apply patch 3/3 of this series, then this patch is
> needed.
> > More or less all new tests of patch 3/3 trigger this test otherwise.
>
> That still does not describe why cfun is null in the case of patch 3/3?
> From looking into that patch I noticed you call set_cfun wtih null in
> output_indirect_thunk_function shouldn't you push and pop the cfun
> instead?
>

I got very much inspired by the function with the same name in
gcc/config/i386/i386.c. Therefore I assumed, set_cfun(NULL) is fine.
I guess, that the code does so, because it is called via the macro
TARGET_ASM_CODE_END, and setting cfun to NULL is considered
as prevention of mistakes.

Thanks,
Christoph


>
> Thanks,
> Andrew Pinski
>
> >
> > Thanks,
> > Christoph
>


Re: [PATCH] PR libfortran/99218 - [8/9/10/11 Regression] matmul on temporary array accesses invalid memory

2021-03-03 Thread Jerry DeLisle
Yes, OK, however, have you been able to test performance. I am only 
curious. There was a test program we used back when this code was first 
implemented in bugzilla. I do not remember the PR number off hand.


Jerry

On 2/23/21 1:46 PM, Harald Anlauf via Fortran wrote:

Dear all,

under certain circumstances a call to MATMUL for rank-2 times rank-1
would invoke a highly tuned rank-2 times rank-2 algorithm which could
lead to invalid reads and writes.  The solution is to check the rank
of the second argument to matmul and fall back to a regular algorithm
for rank-1.  The invalid accesses did show up with valgrind.

I have not been able to create a testcase that gives wrong results.

Regtested on x86_64-pc-linux-gnu, and verified with valgrind.

OK for master?

As this affects all open branches down to 8, ok for backports?

Thanks,
Harald


PR libfortran/99218 - matmul on temporary array accesses invalid memory

Do not invoke tuned rank-2 times rank-2 matmul if rank(b) == 1.

libgfortran/ChangeLog:

PR libfortran/99218
* m4/matmul_internal.m4: Invoke tuned matmul only for rank(b)>1.
* generated/matmul_c10.c: Regenerated.
 * generated/matmul_c16.c: Likewise.
 * generated/matmul_c4.c: Likewise.
 * generated/matmul_c8.c: Likewise.
 * generated/matmul_i1.c: Likewise.
 * generated/matmul_i16.c: Likewise.
 * generated/matmul_i2.c: Likewise.
 * generated/matmul_i4.c: Likewise.
 * generated/matmul_i8.c: Likewise.
 * generated/matmul_r10.c: Likewise.
 * generated/matmul_r16.c: Likewise.
 * generated/matmul_r4.c: Likewise.
 * generated/matmul_r8.c: Likewise.
 * generated/matmulavx128_c10.c: Likewise.
 * generated/matmulavx128_c16.c: Likewise.
 * generated/matmulavx128_c4.c: Likewise.
 * generated/matmulavx128_c8.c: Likewise.
 * generated/matmulavx128_i1.c: Likewise.
 * generated/matmulavx128_i16.c: Likewise.
 * generated/matmulavx128_i2.c: Likewise.
 * generated/matmulavx128_i4.c: Likewise.
 * generated/matmulavx128_i8.c: Likewise.
 * generated/matmulavx128_r10.c: Likewise.
 * generated/matmulavx128_r16.c: Likewise.
 * generated/matmulavx128_r4.c: Likewise.
 * generated/matmulavx128_r8.c: Likewise.





回复: [PATCH v4 1/2] MIPS: Not trigger error for pre-R6 and -mcompact-branches=always

2021-03-03 Thread yunqiang.su
> 
> On 2/23/21 3:14 AM, YunQiang Su wrote:
> > For MIPSr6, we may wish to use compact-branches only.
> > Currently, we have to use `always' option, while it is mark as
> > conflict with pre-R6.
> >   cc1: error: unsupported combination: ‘mips32r2’
> > -mcompact-branches=always Just ignore -mcompact-branches=always for
> pre-R6.
> >
> > This patch also defines
> > __mips_compact_branches_never
> > __mips_compact_branches_always
> > __mips_compact_branches_optimal
> > predefined macros
> >
> > gcc/ChangeLog:
> > * config/mips/mips.c (mips_option_override):
> > * config/mips/mips.h (TARGET_RTP_PIC): not trigger error for
> > compact-branches=always for pre-R6.
> > (TARGET_CB_NEVER): Likewise.
> > (TARGET_CB_ALWAYS): Likewise.
> > (struct mips_cpu_info): define macros for compact branch policy.
> > * doc/invoke.texi: Document "always" with pre-R6.
> >
> > gcc/testsuite/ChangeLog:
> > * gcc.target/mips/compact-branches-1.c: add isa_rev>=6.
> > * gcc.target/mips/mips.exp: don't add -mipsXXr6 option for
> > -mcompact-branches=always. It is usable for pre-R6 now.
> > * gcc.target/mips/compact-branches-8.c: New test.
> > * gcc.target/mips/compact-branches-9.c: New test.
> So I think Maciej's comment was that you simply shouldn't be using
> -mcompact-branches=always at mips32r2 (or anything pre-r6) together.
> 
> I think what you're trying to do here is set up a scenario where you're
> defaulting to mips32r6 and compact-branches, but not error if something
> specifies -mcpu=mips32r2 or something similar, right?
> 

Yes. If we introduce the build time option, and configure gcc with always, then 
gcc will always try to
Pass -mconpact-branches=always to cc1, even we use something like:
mipsisa32r6el-linux-gnu-gcc -mips32r2 -c xx.c
It may break something.

> jeff




  1   2   >