Re: Ping^3 Re: float.h: C2x NaN and Inf macros

2020-11-17 Thread Richard Biener via Gcc-patches
On Mon, Nov 16, 2020 at 5:10 PM Joseph Myers  wrote:
>
> Ping^3.  This patch is still pending review (the DFP sNaN followup has
> been approved).  (The independent C2x patches are also pending review.)

These patches are all OK.

Thanks,
Richard.

>
> --
> Joseph S. Myers
> jos...@codesourcery.com


Re: [PATCH V2] Clean up loop-closed PHIs after loop finalize

2020-11-17 Thread Jiufu Guo via Gcc-patches
Jiufu Guo  writes:

> On 2020-11-16 17:35, Richard Biener wrote:
>> On Mon, Nov 16, 2020 at 10:26 AM Jiufu Guo 
>> wrote:
>>>
>>> Jiufu Guo  writes:
>>>
>>> > Richard Biener  writes:
>>> >
>>> >> On Wed, 11 Nov 2020, Jiufu Guo wrote:
>>> >>
..
>>> +
>>> +  /* Check dominator info before get loop-close PHIs from loop
>>> exits.  */
>>> +  if (dom_info_state (CDI_DOMINATORS) != DOM_OK)
>>
>> Please change this to
>>
>>/* Avoid possibly quadratic work when scanning for loop exits
>> across
>>   all loops of a nest.  */
>>if (!loop_state_satisfies_p (LOOPS_HAVE_RECORDED_EXITS))
>>  return 0;
>>
>
> Great suggestion, thanks!
>
> The patch for loop-init.c is also updated slightly, as below: it calls
> clean_up_loop_closed_phi before release_recorded_exits, so that the flag
> LOOPS_HAVE_RECORDED_EXITS is not cleared before it is checked.
>
> -
> diff --git a/gcc/loop-init.c b/gcc/loop-init.c
> index 401e5282907..ac87dafef6e 100644
> --- a/gcc/loop-init.c
> +++ b/gcc/loop-init.c
> @@ -33,6 +33,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "tree-ssa-loop-niter.h"
>  #include "loop-unroll.h"
>  #include "tree-scalar-evolution.h"
> +#include "tree-cfgcleanup.h"
>
>  ^L
>  /* Apply FLAGS to the loop state.  */
> @@ -133,13 +134,19 @@ loop_optimizer_init (unsigned flags)
>  /* Finalize loop structures.  */
>
>  void
> -loop_optimizer_finalize (struct function *fn)
> +loop_optimizer_finalize (struct function *fn, bool
> clean_loop_closed_phi)
>  {
>class loop *loop;
>basic_block bb;
>
>timevar_push (TV_LOOP_FINI);
>
> +  if (clean_loop_closed_phi && loops_state_satisfies_p (fn,
> LOOP_CLOSED_SSA))
> +{
> +  clean_up_loop_closed_phi (fn);
> +  loops_state_clear (fn, LOOP_CLOSED_SSA);
> +}
> +
>if (loops_state_satisfies_p (fn, LOOPS_HAVE_RECORDED_EXITS))
>  release_recorded_exits (fn);
> 
>>> +return 0;
>>> +
..
>>> +   {
>>> + phi = gsi.phi ();
>>> + rhs = degenerate_phi_result (phi);
>>
>>   rhs = gimple_phi_arg_def (phi, 0);
> Thanks, sorry for missing this, you mentioned in previous mail.
>

>>> > ..
>>> >>> +
>>> >>> + replace_uses_by (lhs, rhs);
>>> >>> + remove_phi_node (&psi, true);
>>> >>> + cfg_altered = true;
>>> >>
>>> >> in the end the return value is unused but I think we should avoid
>>> >> altering the CFG since doing so requires it to be cleaned up for
>>> >> unreachable blocks.  That means to open-code replace_uses_by as
>>> >>
>>> >>   imm_use_iterator imm_iter;
>>> >>   use_operand_p use;
>>> >>   gimple *stmt;
>>> >>   FOR_EACH_IMM_USE_STMT (stmt, imm_iter, name)
>>> >> {
>>> >>   FOR_EACH_IMM_USE_ON_STMT (use, imm_iter)
>>> >> replace_exp (use, val);
>>> >>   update_stmt (stmt);
>>> >> }
>>> >
>>> > Thanks! This could also save some code in replace_uses_by.
After checking `replace_uses_by` more closely and running more tests: when a
constant is propagated into an assignment stmt that contains an ADDR_EXPR, the
invariant flag of the ADDR_EXPR needs to be updated:


  /* Update the invariant flag for ADDR_EXPR if replacing
     a variable index with a constant.  */
  if (gimple_assign_single_p (use_stmt)
      && TREE_CODE (gimple_assign_rhs1 (use_stmt)) == ADDR_EXPR)
    recompute_tree_invariant_for_addr_expr (
      gimple_assign_rhs1 (use_stmt));


And then the updated patch looks like:

This updated patch propagates loop-closed PHIs out at
loop_optimizer_finalize.  In some cases, cleaning up loop-closed PHIs
saves work for the optimization passes that run after loopdone.
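
To illustrate the idea outside of GCC, here is a minimal sketch on a toy IR.
The Stmt type and the function name are hypothetical and have nothing to do
with the real gimple data structures; the sketch only shows the shape of the
transformation (rewrite each use of a single-argument PHI result to the PHI
argument, then delete the PHI, without touching the CFG):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy IR, purely illustrative: these types are hypothetical and are not
// GCC's gimple/tree data structures.
struct Stmt {
  int lhs;                // SSA name defined by this statement
  std::vector<int> uses;  // SSA names used by this statement
  bool is_phi;            // true for PHI nodes
};

// Replace every use of each single-argument PHI result with the PHI's only
// argument, then drop the PHI.  Returns the number of PHIs removed.
inline int clean_up_single_arg_phis(std::vector<Stmt> &body) {
  int removed = 0;
  for (std::size_t i = 0; i < body.size();) {
    const Stmt phi = body[i];  // copy, because body is modified below
    if (!phi.is_phi || phi.uses.size() != 1) {
      ++i;
      continue;
    }
    for (Stmt &s : body)
      for (int &u : s.uses)
        if (u == phi.lhs)
          u = phi.uses[0];
    body.erase(body.begin() + static_cast<std::ptrdiff_t>(i));
    ++removed;
  }
  return removed;
}
```

With a body of "x_1 = ...; x_2 = PHI <x_1>; x_3 = x_2 + x_2", the PHI goes
away and x_3 uses x_1 directly.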

This patch passes bootstrap and regtest on ppc64le.
Thanks for any comments.

Thanks,
Jiufu Guo.

diff --git a/gcc/cfgloop.h b/gcc/cfgloop.h
index d14689dc31f..438b1f779bb 100644
--- a/gcc/cfgloop.h
+++ b/gcc/cfgloop.h
@@ -824,7 +824,7 @@ extern void init_set_costs (void);
 
 /* Loop optimizer initialization.  */
 extern void loop_optimizer_init (unsigned);
-extern void loop_optimizer_finalize (function *);
+extern void loop_optimizer_finalize (function *, bool = false);
 inline void
 loop_optimizer_finalize ()
 {
diff --git a/gcc/loop-init.c b/gcc/loop-init.c
index 401e5282907..ac87dafef6e 100644
--- a/gcc/loop-init.c
+++ b/gcc/loop-init.c
@@ -33,6 +33,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-ssa-loop-niter.h"
 #include "loop-unroll.h"
 #include "tree-scalar-evolution.h"
+#include "tree-cfgcleanup.h"
 
 
 /* Apply FLAGS to the loop state.  */
@@ -133,13 +134,19 @@ loop_optimizer_init (unsigned flags)
 /* Finalize loop structures.  */
 
 void
-loop_optimizer_finalize (struct function *fn)
+loop_optimizer_finalize (struct function *fn, bool clean_loop_closed_phi)
 {
   class loop *loop;
   basic_block bb;
 

Re: [patch] Fix build when source directory includes @ character

2020-11-17 Thread FX via Gcc-patches
> OK.  You have commit privs, right?

Yes, and I did commit after Richard’s OK: 
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=034db20e2ea8301b5dc251bf10a97ce1cf90655f

… but I forgot to send an email saying I had, sorry.

FX

Re: [PATCH] c++: Implement -Wuninitialized for mem-initializers [PR19808]

2020-11-17 Thread Jakub Jelinek via Gcc-patches
On Tue, Nov 17, 2020 at 01:33:48AM -0500, Jason Merrill via Gcc-patches wrote:
> > > Why doesn't the middle-end warning work for inline functions?
> > 
> > It does but only when they're called (and, as usual, also unless
> > the uninitialized use is eliminated).
> 
> Yes, but why?  I assume because we don't bother going through all the phases
> of compilation for unused inlines, but couldn't we change that when we're
> asking for (certain) warnings?

CCing Richard and Honza on this.

I think we don't even gimplify unused functions; the
cgraph code just throws them away.  Even trying just to run the first few
passes (gimplification up to uninit1) would have several high costs:
the tree body of everything unneeded would be unshared, reemitted as GIMPLE,
then cfg created for it and turned into SSA form only to get the
uninitialized warnings (for that warning we could stop there and throw the
bodies away).
I think we would need to try that on some larger C++ codebases and see what
the compile time memory and time hit would be.

Anyway, taking e.g. Marek's Wuninitialized-17.C testcase, if I add
S s;
at the end to make the S::S() ctor used, uninit1 does warn
(guess many of the warnings would be dependent on -flifetime-dse=2
inserted CLOBBERs, thankfully that is the default), but the warnings then
look fairly cryptic:
Wuninitialized-17.C: In constructor ‘S::S()’:
Wuninitialized-17.C:22:14: warning: ‘*.S::y’ is used uninitialized [-Wuninitialized]
   22 |   S() : a{1, y} { } // { dg-warning "field .S::y. is used uninitialized" }
      |              ^
Guess we should at least try to improve this special case's printing for C++,
the IL is:
  *this_3(D) ={v} {CLOBBER};
  this_3(D)->a.a = 1;
  _1 = this_3(D)->y;
  this_3(D)->a.b = _1;
and we could figure out that this is in a METHOD_TYPE fndecl, the MEM_REF
has as address the default def of this SSA_NAME where this is the first
argument of the method and the PARM_DECL has this name and is
DECL_ARTIFICIAL to print it as this->S::y instead of *.S::y.
I bet that code is in error.c ...

If the costs of gimplifying, creating cfg and SSA form for all functions
would be too high (I think it would be, but haven't measured it), then
perhaps it might be useful for Marek's FE code to handle the easiest cases
and punt when it gets more complicated?
I mean e.g. the a(b=1) cases, or can't e.g. some member be initialized only
in some other function - a (foo (&b)) where foo would store to what the
pointer points to (or reference refers to)?
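
To make those two shapes concrete, here is a minimal sketch.  The types A, S1
and S2 and the body of foo are made up for illustration and are not taken from
Marek's testcase.  In both constructors b has no mem-initializer of its own,
yet it is stored to before the constructor finishes, so a FE-only check would
presumably have to recognize or punt on these:

```cpp
#include <cassert>

// Hypothetical types for illustration only; not from Wuninitialized-17.C.
struct A {
  int v;
  explicit A(int x) : v(x) {}
};

// Stores through the pointer, then returns the stored value.
inline int foo(int *p) {
  *p = 42;
  return *p;
}

// The a(b = 1) shape: b is assigned inside a's initializer.
struct S1 {
  A a;
  int b;
  S1() : a(b = 1) {}
};

// The a(foo(&b)) shape: b is stored to by another function.
struct S2 {
  A a;
  int b;
  S2() : a(foo(&b)) {}
};

// Returns true when both members ended up holding the stored values.
inline bool members_were_stored() {
  S1 s1;
  S2 s2;
  return s1.a.v == 1 && s1.b == 1 && s2.a.v == 42 && s2.b == 42;
}
```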

Jakub



Re: [PATCH] c++: Implement -Wuninitialized for mem-initializers [PR19808]

2020-11-17 Thread Jan Hubicka
> On Tue, Nov 17, 2020 at 01:33:48AM -0500, Jason Merrill via Gcc-patches wrote:
> > > > Why doesn't the middle-end warning work for inline functions?
> > > 
> > > It does but only when they're called (and, as usual, also unless
> > > the uninitialized use is eliminated).
> > 
> > Yes, but why?  I assume because we don't bother going through all the phases
> > of compilation for unused inlines, but couldn't we change that when we're
> > asking for (certain) warnings?
> 
> CCing Richard and Honza on this.
> 
> I think for unused functions we don't even gimplify unused functions, the
> cgraph code just throws them away.  Even trying just to run the first few
> passes (gimplification up to uninit1) would have several high costs,
Note that uninit1 is a late pass, so it is not just a few passes we are
speaking about.  Late passes are run only on code that really lands in the
.s file, so enabling them would mean splitting the pass queue and running
another unreachable code removal somewhere.  That would confuse the inliner
and other IPA passes, since they would have to somehow deal with dead code
in their program size estimates, and it would also affect LTO.

Even early passes are run only on the reachable portion of the program, since
functions are analyzed by cgraphunit on demand (only if they are
analyzed by someone else).  Similar logic is also used by the C++ FE to decide
what templates to instantiate.  Changing this would also have quite some
compile time/memory use impact.

There is -fkeep-inline-functions.

> the tree body of everything unneeded would be unshared, reemitted as GIMPLE,
> then cfg created for it and turned into SSA form only to get the
> uninitialized warnings (for that warning we could stop there and throw the
> bodies away).
> I think we would need to try that on some larger C++ codebases and see what
> the compile time memory and time hit would be.
> 
> Anyway, taking e.g. Marek's Wuninitialized-17.C testcase, if I add
> S s;
> at the end to make the S::S() ctor used, uninit1 does warn
> (guess many of the warnings would be dependent on -flifetime-dse=2
> inserted CLOBBERs, thankfully that is the default), but the warnings then
> look fairly cryptic:
> Wuninitialized-17.C: In constructor ‘S::S()’:
> Wuninitialized-17.C:22:14: warning: ‘*.S::y’ is used uninitialized [-Wuninitialized]
>    22 |   S() : a{1, y} { } // { dg-warning "field .S::y. is used uninitialized" }
>       |              ^
> Guess we should at least try to improve this special case's printing for C++,
> the IL is:
>   *this_3(D) ={v} {CLOBBER};
>   this_3(D)->a.a = 1;
>   _1 = this_3(D)->y;
>   this_3(D)->a.b = _1;
> and we could figure out that this is in a METHOD_TYPE fndecl, the MEM_REF
> has as address the default def of this SSA_NAME where this is the first
> argument of the method and the PARM_DECL has this name and is
> DECL_ARTIFICIAL to print it as this->S::y instead of *.S::y.
> I bet that code is in error.c ...
> 
> If the costs of gimplifying, creating cfg and SSA form for all functions
> would be too high (I think it would be, but haven't measured it), then
> perhaps it might be useful for Marek's FE code to handle the easiest cases
> and punt when it gets more complicated?
> I mean e.g. the a(b=1) cases, or can't e.g. some member be initialized only
> in some other function - a (foo (&b)) where foo would store to what the
> pointer points to (or reference refers to)?
> 
>   Jakub
> 


Re: [PATCH] c++: Implement -Wuninitialized for mem-initializers [PR19808]

2020-11-17 Thread Jakub Jelinek via Gcc-patches
On Tue, Nov 17, 2020 at 09:44:26AM +0100, Jan Hubicka wrote:
> > I think for unused functions we don't even gimplify unused functions, the
> > cgraph code just throws them away.  Even trying just to run the first few
> > passes (gimplification up to uninit1) would have several high costs,
> Note that uninit1 is a late pass so it is not just few passes we speak

You are speaking about uninit2?
The one I'm talking about is:
  PUSH_INSERT_PASSES_WITHIN (pass_build_ssa_passes)
  NEXT_PASS (pass_fixup_cfg);
  NEXT_PASS (pass_build_ssa);
  NEXT_PASS (pass_warn_nonnull_compare);
  NEXT_PASS (pass_early_warn_uninitialized);
and that is run right after 023t.ssa

Jakub



Re: [PATCH] c++: Implement -Wuninitialized for mem-initializers [PR19808]

2020-11-17 Thread Jan Hubicka
> On Tue, Nov 17, 2020 at 09:44:26AM +0100, Jan Hubicka wrote:
> > > I think for unused functions we don't even gimplify unused functions, the
> > > cgraph code just throws them away.  Even trying just to run the first few
> > > passes (gimplification up to uninit1) would have several high costs,
> > Note that uninit1 is a late pass so it is not just few passes we speak
> 
> You are speaking about uninit2?
> The one I'm talking about is:
>   PUSH_INSERT_PASSES_WITHIN (pass_build_ssa_passes)
>   NEXT_PASS (pass_fixup_cfg);
>   NEXT_PASS (pass_build_ssa);
>   NEXT_PASS (pass_warn_nonnull_compare);
>   NEXT_PASS (pass_early_warn_uninitialized);
> and that is run right after 023t.ssa

That one is called "*early_warn_uninitialized"
and thus has no dump file.  The one that comes with the uninit1 dump is the
late pass.

Honza
> 
>   Jakub
> 


[PATCH] ada: c++: Get rid of libposix4, librt on Solaris

2020-11-17 Thread Rainer Orth
I recently noticed that neither libposix4 nor librt is needed on
Solaris 11 any longer:

* libposix4 was renamed to librt in Solaris 7 back in 1998.

* librt was folded into libc in the OpenSolaris timeframe, leaving librt
  only as a filter on libc.  Thus, it's no longer needed on either
  Solaris 11 or Illumos.

The following patch removes both uses.  At the same time, Ada's use of
libthread has gone: it was folded into libc in Solaris 10 already.
TIME_LIBRARY and friends in g++ are likewise removed: Solaris was the
only user.

Bootstrapped without regressions on i386-pc-solaris2.11,
sparc-sun-solaris2.11, and x86_64-pc-linux-gnu.

Ok for master?

There are two more uses of librt left:

* On glibc targets before 2.17 it's needed for clock_gettime.  I've no
  idea how long gcc is supposed to support such targets (glibc 2.17 was
  released in December 2012).

* On HP-UX, it is needed for sem_init in libgomp and various specs for
  -fopenmp etc.  There are no public HP-UX systems in the compile farm,
  and the sem_init(2) man page in the public docs on hpe.com was just a
  dangling link, so I cannot tell if this is still true.
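
For anyone who wants to check a particular system, a small probe along these
lines could be built without -lrt.  This is only a sketch and not part of the
patch; the function name is made up, and it assumes a libc that provides
clock_gettime itself (for example glibc 2.17 or later):

```cpp
#include <cassert>
#include <time.h>

// Reads the monotonic clock via POSIX clock_gettime; returns true on success.
// Assumes a libc that provides clock_gettime itself (for example glibc 2.17
// or later), so no explicit -lrt is needed on the link line.
inline bool monotonic_clock_works() {
  struct timespec ts;
  if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
    return false;
  return ts.tv_nsec >= 0 && ts.tv_nsec < 1000000000L;
}
```

If the link fails with an undefined reference to clock_gettime, the target
still needs librt.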

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2020-11-16  Rainer Orth  

gcc/cp:
* g++spec.c (TIMELIB, TIME_LIBRARY): Remove.
(lang_specific_driver): Remove TIME_LIBRARY handling.

gcc:
* config/sol2.h (TIME_LIBRARY): Remove.

libstdc++-v3:
* acinclude.m4 (GLIBCXX_ENABLE_LIBSTDCXX_TIME): Remove libposix4
references.
: Don't use -lrt any longer.
* configure: Regenerate.

* doc/xml/manual/configure.xml (--enable-libstdcxx-time=OPTION):
Remove libposix4 reference.

gcc/ada:
* Makefile.rtl  (THREADSLIB): Remove.
(MISCLIB): Remove -lposix4.
<*86-*-solaris2*>: Likewise.
* libgnarl/s-osinte__solaris.ads (System.OS_Interface): Remove
-lposix4 -lthread.

# HG changeset patch
# Parent  aea7401c4d83f8a1fd609b22ec7a9131c857c98d
ada: c++: Get rid of libposix4, librt on Solaris

diff --git a/gcc/ada/Makefile.rtl b/gcc/ada/Makefile.rtl
--- a/gcc/ada/Makefile.rtl
+++ b/gcc/ada/Makefile.rtl
@@ -1641,8 +1641,7 @@ ifeq ($(strip $(filter-out sparc% sun so
   endif
 
   EH_MECHANISM=-gcc
-  THREADSLIB = -lposix4 -lthread
-  MISCLIB = -lposix4 -lnsl -lsocket
+  MISCLIB = -lnsl -lsocket
   SO_OPTS = -Wl,-h,
   GNATLIB_SHARED = gnatlib-shared-dual
   GMEM_LIB = gmemlib
@@ -1695,8 +1694,7 @@ ifeq ($(strip $(filter-out %86 %x86_64 s
   EXTRA_GNATRTL_NONTASKING_OBJS += $(TRASYM_DWARF_UNIX_OBJS)
 
   EH_MECHANISM=-gcc
-  THREADSLIB = -lposix4 -lthread
-  MISCLIB = -lposix4 -lnsl -lsocket
+  MISCLIB = -lnsl -lsocket
   SO_OPTS = -Wl,-h,
   GNATLIB_SHARED = gnatlib-shared-dual
   GMEM_LIB = gmemlib
diff --git a/gcc/ada/libgnarl/s-osinte__solaris.ads b/gcc/ada/libgnarl/s-osinte__solaris.ads
--- a/gcc/ada/libgnarl/s-osinte__solaris.ads
+++ b/gcc/ada/libgnarl/s-osinte__solaris.ads
@@ -45,9 +45,6 @@ with Ada.Unchecked_Conversion;
 package System.OS_Interface is
pragma Preelaborate;
 
-   pragma Linker_Options ("-lposix4");
-   pragma Linker_Options ("-lthread");
-
subtype int    is Interfaces.C.int;
subtype short  is Interfaces.C.short;
subtype long   is Interfaces.C.long;
diff --git a/gcc/config/sol2.h b/gcc/config/sol2.h
--- a/gcc/config/sol2.h
+++ b/gcc/config/sol2.h
@@ -381,9 +381,6 @@ along with GCC; see the file COPYING3.  
   { "endfile_vtv",		ENDFILE_VTV_SPEC },		\
   SUBTARGET_CPU_EXTRA_SPECS
 
-/* C++11 programs need -lrt for nanosleep.  */
-#define TIME_LIBRARY "rt"
-
 #ifndef USE_GLD
 /* With Sun ld, -rdynamic is a no-op.  */
 #define RDYNAMIC_SPEC ""
diff --git a/gcc/cp/g++spec.c b/gcc/cp/g++spec.c
--- a/gcc/cp/g++spec.c
+++ b/gcc/cp/g++spec.c
@@ -27,12 +27,10 @@ along with GCC; see the file COPYING3.  
 #define LANGSPEC	(1<<1)
 /* This bit is set if they did `-lm' or `-lmath'.  */
 #define MATHLIB		(1<<2)
-/* This bit is set if they did `-lrt' or equivalent.  */
-#define TIMELIB		(1<<3)
 /* This bit is set if they did `-lc'.  */
-#define WITHLIBC	(1<<4)
+#define WITHLIBC	(1<<3)
 /* Skip this option.  */
-#define SKIPOPT		(1<<5)
+#define SKIPOPT		(1<<4)
 
 #ifndef MATH_LIBRARY
 #define MATH_LIBRARY "m"
@@ -41,10 +39,6 @@ along with GCC; see the file COPYING3.  
 #define MATH_LIBRARY_PROFILE MATH_LIBRARY
 #endif
 
-#ifndef TIME_LIBRARY
-#define TIME_LIBRARY ""
-#endif
-
 #ifndef LIBSTDCXX
 #define LIBSTDCXX "stdc++"
 #endif
@@ -95,15 +89,12 @@ lang_specific_driver (struct cl_decoded_
   const struct cl_decoded_option *saw_libc = NULL;
 
   /* An array used to flag each argument that needs a bit set for
- LANGSPEC, MATHLIB, TIMELIB, or WITHLIBC.  */
+ LANGSPEC, MATHLIB, or WITHLIBC.  */
   int *args;
 
   /* By default, we throw on the math library if we have one.  */
   int need_math = (MATH_LIBRARY[0] !

[PATCH] gcov: Add __gcov_info_to_gdca()

2020-11-17 Thread Sebastian Huber
This is a proposal to get the gcda data for a gcda info in a free-standing
environment.  It is intended to be used with the -fprofile-info-section option.
A crude test program which doesn't use a linker script is:

  #include <gcov.h>
  #include <stdio.h>

  extern const struct gcov_info *my_info;

  static void
  filename(const char *f, void *arg)
  {
    printf("filename: %s\n", f);
  }

  static void
  dump(const void *d, unsigned n, void *arg)
  {
    const unsigned char *c;
    unsigned i;

    c = d;

    for (i = 0; i < n; ++i) {
      printf("%02x", c[i]);
    }
  }

  int main()
  {
    __asm__ volatile (".set my_info, .LPBX2");
    __gcov_info_to_gcda(my_info, filename, dump, NULL);
    return 0;
  }
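
The test program ignores the arg parameter.  As a rough sketch of how it could
carry state instead, the following collects the output into a string.  Note
that fake_info_to_gcda is a hypothetical stand-in that only mimics the calling
convention proposed here (one filename callback, then data chunks); it is not
the libgcov function, and the bytes it emits are arbitrary:

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical stand-in for the proposed __gcov_info_to_gcda(): it only
// mimics the calling convention and emits arbitrary bytes.
inline void fake_info_to_gcda(void (*filename_fn)(const char *, void *),
                              void (*dump_fn)(const void *, unsigned, void *),
                              void *arg) {
  static const unsigned char chunk1[] = {0x01, 0x02, 0xab, 0xcd};
  static const unsigned char chunk2[] = {0x00, 0xff};
  filename_fn("example.gcda", arg);
  dump_fn(chunk1, static_cast<unsigned>(sizeof chunk1), arg);
  dump_fn(chunk2, static_cast<unsigned>(sizeof chunk2), arg);
}

// Appends the file name and a ':' separator to the std::string behind ARG.
inline void collect_name(const char *f, void *arg) {
  *static_cast<std::string *>(arg) += std::string(f) + ":";
}

// Appends each byte as two hex digits to the std::string behind ARG.
inline void collect_hex(const void *d, unsigned n, void *arg) {
  const unsigned char *c = static_cast<const unsigned char *>(d);
  char buf[3];
  for (unsigned i = 0; i < n; ++i) {
    std::snprintf(buf, sizeof buf, "%02x", c[i]);
    *static_cast<std::string *>(arg) += buf;
  }
}

// Runs the stand-in with both callbacks and returns the collected text.
inline std::string dump_to_string() {
  std::string out;
  fake_info_to_gcda(collect_name, collect_hex, &out);
  return out;
}
```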

gcc/

* doc/invoke.texi (fprofile-info-section): Mention
__gcov_info_to_gcda().

libgcc/

* Makefile.in (LIBGCOV_DRIVER): Add _gcov_info_to_gcda.
* gcov.h (gcov_info): Declare.
(__gcov_info_to_gcda): Likewise.
* libgcov-driver.c (gcov_are_all_counters_zero): New.
(write_one_data): Use gcov_are_all_counters_zero().
(gcov_fn_info_to_gcda): New.
(__gcov_info_to_gcda): Likewise.
---
 gcc/doc/invoke.texi |  73 
 libgcc/Makefile.in  |   2 +-
 libgcc/gcov.h   |  15 +
 libgcc/libgcov-driver.c | 120 
 4 files changed, 188 insertions(+), 22 deletions(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 3510a54c6c4..09cb4922f5e 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -14248,17 +14248,17 @@ To optimize the program based on the collected 
profile information, use
 Register the profile information in the specified section instead of using a
 constructor/destructor.  The section name is @var{name} if it is specified,
 otherwise the section name defaults to @code{.gcov_info}.  A pointer to the
-profile information generated by @option{-fprofile-arcs} or
-@option{-ftest-coverage} is placed in the specified section for each
-translation unit.  This option disables the profile information registration
-through a constructor and it disables the profile information processing
-through a destructor.  This option is not intended to be used in hosted
-environments such as GNU/Linux.  It targets systems with limited resources
-which do not support constructors and destructors.  The linker could collect
-the input sections in a continuous memory block and define start and end
-symbols.  The runtime support could dump the profiling information registered
-in this linker set during program termination to a serial line for example.  A
-GNU linker script example which defines a linker output section follows:
+profile information generated by @option{-fprofile-arcs} is placed in the
+specified section for each translation unit.  This option disables the profile
+information registration through a constructor and it disables the profile
+information processing through a destructor.  This option is not intended to be
+used in hosted environments such as GNU/Linux.  It targets free-standing
+environments (for example embedded systems) with limited resources which do not
+support constructors/destructors or the C library file I/O.
+
+The linker could collect the input sections in a continuous memory block and
+define start and end symbols.  A GNU linker script example which defines a
+linker output section follows:
 
 @smallexample
   .gcov_info  :
@@ -14269,6 +14269,57 @@ GNU linker script example which defines a linker 
output section follows:
   @}
 @end smallexample
 
+The program could dump the profiling information registered in this linker set
+for example like this:
+
+@smallexample
+#include <gcov.h>
+#include <stdio.h>
+
+extern const struct gcov_info *__gcov_info_start[];
+extern const struct gcov_info *__gcov_info_end[];
+
+static void
+filename (const char *f, void *arg)
+@{
+  puts (f);
+@}
+
+static void
+dump (const void *d, unsigned n, void *arg)
+@{
+  const unsigned char *c = d;
+
+  for (unsigned i = 0; i < n; ++i)
+    printf ("%02x", c[i]);
+@}
+
+static void
+dump_gcov_info (void)
+@{
+  const struct gcov_info **info = __gcov_info_start;
+  const struct gcov_info **end = __gcov_info_end;
+
+  /* Obfuscate variable to prevent compiler optimizations.  */
+  __asm__ ("" : "+r" (end));
+
+  while (info != end)
+  @{
+    void *arg = NULL;
+    __gcov_info_to_gcda (*info, filename, dump, arg);
+    putchar ('\n');
+    ++info;
+  @}
+@}
+
+int
+main()
+@{
+  dump_gcov_info();
+  return 0;
+@}
+@end smallexample
+
 @item -fprofile-note=@var{path}
 @opindex fprofile-note
 
diff --git a/libgcc/Makefile.in b/libgcc/Makefile.in
index d6075d32bd4..c22413d768c 100644
--- a/libgcc/Makefile.in
+++ b/libgcc/Makefile.in
@@ -908,7 +908,7 @@ LIBGCOV_INTERFACE = _gcov_dump _gcov_fork   
\
_gcov_execl _gcov_execlp\
_gcov_execle _gcov_execv _gcov_execvp _gcov_execve _gcov_reset  \
_gcov_lock_

Re: [PATCH V2] Clean up loop-closed PHIs after loop finalize

2020-11-17 Thread Richard Biener
On Tue, 17 Nov 2020, Jiufu Guo wrote:

> Jiufu Guo  writes:
> 
> > On 2020-11-16 17:35, Richard Biener wrote:
> >> On Mon, Nov 16, 2020 at 10:26 AM Jiufu Guo 
> >> wrote:
> >>>
> >>> Jiufu Guo  writes:
> >>>
> >>> > Richard Biener  writes:
> >>> >
> >>> >> On Wed, 11 Nov 2020, Jiufu Guo wrote:
> >>> >>
> ..
> >>> +
> >>> +  /* Check dominator info before get loop-close PHIs from loop
> >>> exits.  */
> >>> +  if (dom_info_state (CDI_DOMINATORS) != DOM_OK)
> >>
> >> Please change this to
> >>
> >>/* Avoid possibly quadratic work when scanning for loop exits
> >> across
> >>   all loops of a nest.  */
> >>if (!loop_state_satisfies_p (LOOPS_HAVE_RECORDED_EXITS))
> >>  return 0;
> >>
> >
> > Great suggestion, thanks!
> >
> > The patch for loop-init.c is also updated slightly, as below: it calls
> > clean_up_loop_closed_phi before release_recorded_exits, so that the flag
> > LOOPS_HAVE_RECORDED_EXITS is not cleared before it is checked.
> >
> > -
> > diff --git a/gcc/loop-init.c b/gcc/loop-init.c
> > index 401e5282907..ac87dafef6e 100644
> > --- a/gcc/loop-init.c
> > +++ b/gcc/loop-init.c
> > @@ -33,6 +33,7 @@ along with GCC; see the file COPYING3.  If not see
> >  #include "tree-ssa-loop-niter.h"
> >  #include "loop-unroll.h"
> >  #include "tree-scalar-evolution.h"
> > +#include "tree-cfgcleanup.h"
> >
> >  ^L
> >  /* Apply FLAGS to the loop state.  */
> > @@ -133,13 +134,19 @@ loop_optimizer_init (unsigned flags)
> >  /* Finalize loop structures.  */
> >
> >  void
> > -loop_optimizer_finalize (struct function *fn)
> > +loop_optimizer_finalize (struct function *fn, bool
> > clean_loop_closed_phi)
> >  {
> >class loop *loop;
> >basic_block bb;
> >
> >timevar_push (TV_LOOP_FINI);
> >
> > +  if (clean_loop_closed_phi && loops_state_satisfies_p (fn,
> > LOOP_CLOSED_SSA))
> > +{
> > +  clean_up_loop_closed_phi (fn);
> > +  loops_state_clear (fn, LOOP_CLOSED_SSA);
> > +}
> > +
> >if (loops_state_satisfies_p (fn, LOOPS_HAVE_RECORDED_EXITS))
> >  release_recorded_exits (fn);
> > 
> >>> +return 0;
> >>> +
> ..
> >>> +   {
> >>> + phi = gsi.phi ();
> >>> + rhs = degenerate_phi_result (phi);
> >>
> >>   rhs = gimple_phi_arg_def (phi, 0);
> > Thanks, sorry for missing this, you mentioned in previous mail.
> >
> 
> >>> > ..
> >>> >>> +
> >>> >>> + replace_uses_by (lhs, rhs);
> >>> >>> + remove_phi_node (&psi, true);
> >>> >>> + cfg_altered = true;
> >>> >>
> >>> >> in the end the return value is unused but I think we should avoid
> >>> >> altering the CFG since doing so requires it to be cleaned up for
> >>> >> unreachable blocks.  That means to open-code replace_uses_by as
> >>> >>
> >>> >>   imm_use_iterator imm_iter;
> >>> >>   use_operand_p use;
> >>> >>   gimple *stmt;
> >>> >>   FOR_EACH_IMM_USE_STMT (stmt, imm_iter, name)
> >>> >> {
> >>> >>   FOR_EACH_IMM_USE_ON_STMT (use, imm_iter)
> >>> >> replace_exp (use, val);
> >>> >>   update_stmt (stmt);
> >>> >> }
> >>> >
> >>> > Thanks! This could also save some code in replace_uses_by.
> With more checking on `replace_uses_by` and tests, when a const is propagated
> into an assignment stmt that contains ADDR_EXPR, invariant flag of the stmt 
> would be updated.
> 
> 
>   /* Update the invariant flag for ADDR_EXPR if replacing
>      a variable index with a constant.  */
>   if (gimple_assign_single_p (use_stmt)
>       && TREE_CODE (gimple_assign_rhs1 (use_stmt)) == ADDR_EXPR)
>     recompute_tree_invariant_for_addr_expr (
>       gimple_assign_rhs1 (use_stmt));
> 
> 
> And then the updated patch looks like:
> 
> This updated patch propagates loop-closed PHIs out at
> loop_optimizer_finalize.  For some cases, to clean up loop-closed PHIs
> would save efforts of optimization passes after loopdone.
> 
> This patch passes bootstrap and regtest on ppc64le.
> Thanks for any comments.

OK.

Thanks,
Richard.


> Thanks,
> Jiufu Guo.
> 
> diff --git a/gcc/cfgloop.h b/gcc/cfgloop.h
> index d14689dc31f..438b1f779bb 100644
> --- a/gcc/cfgloop.h
> +++ b/gcc/cfgloop.h
> @@ -824,7 +824,7 @@ extern void init_set_costs (void);
>  
>  /* Loop optimizer initialization.  */
>  extern void loop_optimizer_init (unsigned);
> -extern void loop_optimizer_finalize (function *);
> +extern void loop_optimizer_finalize (function *, bool = false);
>  inline void
>  loop_optimizer_finalize ()
>  {
> diff --git a/gcc/loop-init.c b/gcc/loop-init.c
> index 401e5282907..ac87dafef6e 100644
> --- a/gcc/loop-init.c
> +++ b/gcc/loop-init.c
> @@ -33,6 +33,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "tree-ssa-loop-niter.h"
>  #include "loop-unroll.h"
>  #include "tree-sca

Re: [PATCH] ada: c++: Get rid of libposix4, librt on Solaris

2020-11-17 Thread Arnaud Charlet
> I recently noticed that neither libposix4 nor librt are needed on
> Solaris 11 any longer:
> 
> * libposix4 was renamed to librt in Solaris 7 back in 1998.
> 
> * librt was folded into libc in the OpenSolaris timeframe, leaving librt
>   only as a filter on libc.  Thus, it's no longer needed on either
>   Solaris 11 or Illumos.
> 
> The following patch removes both uses.  At the same time, Ada's use of
> libthread has gone: it was folded into libc in Solaris 10 already.
> TIME_LIBRARY and friends in g++ are likewise removed: Solaris was the
> only user.
> 
> Bootstrapped without regressions on i386-pc-solaris2.11,
> sparc-sun-solaris2.11, and x86_64-pc-linux-gnu.
> 
> Ok for master?

The Ada part is OK, thanks.

> 2020-11-16  Rainer Orth  
> 
>   gcc/cp:
>   * g++spec.c (TIMELIB, TIME_LIBRARY): Remove.
>   (lang_specific_driver): Remove TIME_LIBRARY handling.
> 
>   gcc:
>   * config/sol2.h (TIME_LIBRARY): Remove.
> 
>   libstdc++-v3:
>   * acinclude.m4 (GLIBCXX_ENABLE_LIBSTDCXX_TIME): Remove libposix4
>   references.
>   : Don't use -lrt any longer.
>   * configure: Regenerate.
> 
>   * doc/xml/manual/configure.xml (--enable-libstdcxx-time=OPTION):
>   Remove libposix4 reference.
> 
>   gcc/ada:
>   * Makefile.rtl  (THREADSLIB): Remove.
>   (MISCLIB): Remove -lposix4.
>   <*86-*-solaris2*>: Likewise.
>   * libgnarl/s-osinte__solaris.ads (System.OS_Interface): Remove
>   -lposix4 -lthread.


Re: [PATCH] c++: Implement -Wuninitialized for mem-initializers [PR19808]

2020-11-17 Thread Richard Biener
On Tue, 17 Nov 2020, Jan Hubicka wrote:

> > On Tue, Nov 17, 2020 at 01:33:48AM -0500, Jason Merrill via Gcc-patches 
> > wrote:
> > > > > Why doesn't the middle-end warning work for inline functions?
> > > > 
> > > > It does but only when they're called (and, as usual, also unless
> > > > the uninitialized use is eliminated).
> > > 
> > > Yes, but why?  I assume because we don't bother going through all the 
> > > phases
> > > of compilation for unused inlines, but couldn't we change that when we're
> > > asking for (certain) warnings?
> > 
> > CCing Richard and Honza on this.
> > 
> > I think for unused functions we don't even gimplify unused functions, the
> > cgraph code just throws them away.  Even trying just to run the first few
> > passes (gimplification up to uninit1) would have several high costs,
> Note that uninit1 is a late pass so it is not just few passes we speak
> about.  Late passes are run only on code that really lands in .s file
> so enabling them would mean splitting the pass queue and running another
> unreachable code somewhere.  That would confuse inliner and other IPA
> passes since they will have to somehow deal with dead code in their
> program size estimate and also affect LTO.
> 
> Even early passes are run only on reachable portion of program, since
> functions are analyzed by cgraphunit on demand (only if they are
> analyzed by someone else). Similar logic is also done by C++ FE to decide
> what templates.  Changing this would also have quite some compile
> time/memory use impact.

There's also an impact on various IPA heuristics: we'd need to make sure
not to account for anything done to unreachable functions, or for their
effects while they are still there (do we want to inline into them?  do we
want the heuristics for inlining into all functions to be affected?)

So IMHO not a good idea.  GCC isn't a static analysis engine ...

Richard.

> There is -fkeep-inline-functions.
> 
> > the tree body of everything unneeded would be unshared, reemitted as GIMPLE,
> > then cfg created for it and turned into SSA form only to get the
> > uninitialized warnings (for that warning we could stop there and throw the
> > bodies away).
> > I think we would need to try that on some larger C++ codebases and see what
> > the compile time memory and time hit would be.
> > 
> > Anyway, taking e.g. Marek's Wuninitialized-17.C testcase, if I add
> > S s;
> > at the end to make the S::S() ctor used, uninit1 does warn
> > (guess many of the warnings would be dependent on -flifetime-dse=2
> > inserted CLOBBERs, thankfully that is the default), but the warnings then
> > look fairly cryptic:
> > Wuninitialized-17.C: In constructor ‘S::S()’:
> > Wuninitialized-17.C:22:14: warning: ‘*.S::y’ is used uninitialized [-Wuninitialized]
> >    22 |   S() : a{1, y} { } // { dg-warning "field .S::y. is used uninitialized" }
> >       |              ^
> > Guess we should at least try to improve this special case's printing for
> > C++, the IL is:
> >   *this_3(D) ={v} {CLOBBER};
> >   this_3(D)->a.a = 1;
> >   _1 = this_3(D)->y;
> >   this_3(D)->a.b = _1;
> > and we could figure out that this is in a METHOD_TYPE fndecl, the MEM_REF
> > has as address the default def of this SSA_NAME where this is the first
> > argument of the method and the PARM_DECL has this name and is
> > DECL_ARTIFICIAL to print it as this->S::y instead of *.S::y.
> > I bet that code is in error.c ...
> > 
> > If the costs of gimplifying, creating cfg and SSA form for all functions
> > would be too high (I think it would be, but haven't measured it), then
> > perhaps it might be useful for Marek's FE code to handle the easiest cases
> > and punt when it gets more complicated?
> > I mean e.g. the a(b=1) cases, or can't e.g. some member be initialized only
> > in some other function - a (foo (&b)) where foo would store to what the
> > pointer points to (or reference refers to)?
> > 
> > Jakub
> > 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imend


Re: [PATCH][RFC] Make mingw-w64 printf/scanf attribute alias to ms_printf/ms_scanf only for C89

2020-11-17 Thread Jonathan Yong via Gcc-patches

On 11/16/20 5:40 AM, Jonathan Yong wrote:

On 11/14/20 12:29 PM, Liu Hao via Gcc-patches wrote:

This is the third revision of my patch:

1. Two typos in the commit message have been fixed.
2. Support for `%a` and `%A` has been added. Documentation can be
    found on the same page in the commit message.
3. GCC will no longer warn about 'ISO C does not support the ‘L’
    ms_printf length modifier'. This was caused by mistaken array
    indices in `TARGET_OVERRIDES_FORMAT_INIT`.




I'll be pushing this soon if there aren't any more complaints.


Pushed to master, thanks all for contributing and reviewing.




[PATCH] tree-optimization/97832 - avoid breaking linearization in reassoc

2020-11-17 Thread Richard Biener
The first reassoc pass is supposed to strictly linearize expressions
but re-propagation of negates can break this in case it is faced
with (-a + -b) + c which it turns into c - (a + b).  The root-cause
is swap_ops_for_binary_stmt which prematurely associates two negates
together.  The following patch avoids this and allows the two new
testcases to be SLP vectorized - previously reassoc made the
expression trees not match up for SLP discovery.
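As a rough illustration of the guard described above, here is a toy model of the first swap condition as it appears in the patch. The struct and function names (toy_operand, toy_first_swap_condition) are hypothetical stand-ins for operand_entry and the logic in swap_ops_for_binary_stmt; only the rank comparison and the negate check are modelled, not the rest of reassoc.

```c
/* Toy stand-in for an operand_entry: just a rank and a flag saying
   whether the operand is defined by a NEGATE_EXPR (hypothetical type).  */
struct toy_operand
{
  unsigned rank;
  int negated;
};

/* Return nonzero if the first swap condition from the patch holds:
   oe1 and oe2 share a rank that differs from oe3's, and the swap would
   not pair up two negated operands unless oe3 is negated as well.  */
int
toy_first_swap_condition (const struct toy_operand *oe1,
                          const struct toy_operand *oe2,
                          const struct toy_operand *oe3)
{
  return (oe1->rank == oe2->rank
          && oe2->rank != oe3->rank
          && (!(oe1->negated && oe2->negated) || oe3->negated));
}
```

With -a and -b at the same rank and a non-negated c at a different rank, the condition is false, so the two negates are not associated together.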

This fixes the reassoc part of the PR.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

OK?

Thanks,
Richard.

2020-11-17  Richard Biener  

PR tree-optimization/97832
* tree-ssa-reassoc.c (op_negated_p): New function.
(swap_ops_for_binary_stmt): Avoid swapping two negated operands
together.
(get_rank): Simplify.

* gcc.dg/vect/pr97832-1.c: New testcase.
* gcc.dg/vect/pr97832-2.c: Likewise.
---
 gcc/testsuite/gcc.dg/vect/pr97832-1.c | 17 +++
 gcc/testsuite/gcc.dg/vect/pr97832-2.c | 29 ++
 gcc/tree-ssa-reassoc.c| 30 ++-
 3 files changed, 71 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/pr97832-1.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/pr97832-2.c

diff --git a/gcc/testsuite/gcc.dg/vect/pr97832-1.c 
b/gcc/testsuite/gcc.dg/vect/pr97832-1.c
new file mode 100644
index 000..063fc7bd717
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr97832-1.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-Ofast" } */
+/* { dg-require-effective-target vect_double } */
+
+double a[1024], b[1024], c[1024];
+
+void foo()
+{
+  for (int i = 0; i < 256; ++i)
+{
+  a[2*i] = a[2*i] + b[2*i] - c[2*i];
+  a[2*i+1] = a[2*i+1] - b[2*i+1] - c[2*i+1];
+}
+}
+
+/* { dg-final { scan-tree-dump "vectorizing stmts using SLP" "vect" } } */
+/* { dg-final { scan-tree-dump "Loop contains only SLP stmts" "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/pr97832-2.c 
b/gcc/testsuite/gcc.dg/vect/pr97832-2.c
new file mode 100644
index 000..4f0578120ee
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr97832-2.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-Ofast" } */
+/* { dg-require-effective-target vect_double } */
+
+void foo1x1(double* restrict y, const double* restrict x, int clen)
+{
+  int xi = clen & 2;
+  double f_re = x[0+xi+0];
+  double f_im = x[4+xi+0];
+  int clen2 = (clen+xi) * 2;
+#pragma GCC unroll 0
+  for (int c = 0; c < clen2; c += 8) {
+// y[c] = y[c] - x[c]*conj(f);
+#pragma GCC unroll 4
+for (int k = 0; k < 4; ++k) {
+  double x_re = x[c+0+k];
+  double x_im = x[c+4+k];
+  double y_re = y[c+0+k];
+  double y_im = y[c+4+k];
+  y_re = y_re - x_re * f_re - x_im * f_im;;
+  y_im = y_im + x_re * f_im - x_im * f_re;
+  y[c+0+k] = y_re;
+  y[c+4+k] = y_im;
+}
+  }
+}
+
+/* { dg-final { scan-tree-dump "vectorizing stmts using SLP" "vect" } } */
+/* { dg-final { scan-tree-dump "Loop contains only SLP stmts" "vect" } } */
diff --git a/gcc/tree-ssa-reassoc.c b/gcc/tree-ssa-reassoc.c
index a2ca1713d4b..4b728ee69de 100644
--- a/gcc/tree-ssa-reassoc.c
+++ b/gcc/tree-ssa-reassoc.c
@@ -450,16 +450,18 @@ get_rank (tree e)
   FOR_EACH_SSA_TREE_OPERAND (op, stmt, iter, SSA_OP_USE)
rank = propagate_rank (rank, op);
 
+  rank += 1;
+
   if (dump_file && (dump_flags & TDF_DETAILS))
{
  fprintf (dump_file, "Rank for ");
  print_generic_expr (dump_file, e);
- fprintf (dump_file, " is %ld\n", (rank + 1));
+ fprintf (dump_file, " is %ld\n", rank);
}
 
   /* Note the rank in the hashtable so we don't recompute it.  */
-  insert_operand_rank (e, (rank + 1));
-  return (rank + 1);
+  insert_operand_rank (e, rank);
+  return rank;
 }
 
   /* Constants, globals, etc., are rank 0 */
@@ -4890,6 +4892,18 @@ remove_visited_stmt_chain (tree var)
 }
 }
 
+/* Return true if OE is negated.  */
+
+static bool
+op_negated_p (operand_entry *oe)
+{
+  if (TREE_CODE (oe->op) != SSA_NAME)
+return false;
+  if (gassign *ass = dyn_cast <gassign *> (SSA_NAME_DEF_STMT (oe->op)))
+return gimple_assign_rhs_code (ass) == NEGATE_EXPR;
+  return false;
+}
+
 /* This function checks three consequtive operands in
passed operands vector OPS starting from OPINDEX and
swaps two operands if it is profitable for binary operation
@@ -4921,13 +4935,19 @@ swap_ops_for_binary_stmt (vec<operand_entry *> ops,
   oe3 = ops[opindex + 2];
 
   if ((oe1->rank == oe2->rank
-   && oe2->rank != oe3->rank)
+   && oe2->rank != oe3->rank
+   /* Avoid associating two negated operands together, this
+ undoes linearization during negate repropagation.  */
+   && (!(op_negated_p (oe1) && op_negated_p (oe2))
+  || op_negated_p (oe3)))
   || (stmt && is_phi_for_stmt (stmt, oe3->op)
  && !is_phi_for_stmt (stmt, oe1->op)
  && !is_phi_

[committed] testsuite: Extend vector() regexp

2020-11-17 Thread Richard Sandiford via Gcc-patches
For variable-length vectors, the N inside “vector(N) T” can
contain the characters ‘[’, ‘]’ and ‘,’.
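To make the difference between the old and new patterns concrete, here is a small sketch using POSIX extended regular expressions. This is an assumption-laden stand-in: the testsuite itself uses Tcl regexps, and the helper name dump_matches and the sample dump strings are made up for illustration. For a pattern this simple the two regexp flavours should agree.

```c
#include <regex.h>
#include <stddef.h>

/* Old pattern: only digits allowed inside vector(...).  */
const char *const old_pattern = "vector\\([0-9]*\\) unsigned int";
/* New pattern: also allow '[', ']' and ',' inside vector(...).  */
const char *const new_pattern = "vector\\([][0-9,]*\\) unsigned int";

/* Return 1 if PATTERN matches somewhere in TEXT, 0 if it does not,
   and -1 if the pattern fails to compile.  */
int
dump_matches (const char *pattern, const char *text)
{
  regex_t re;
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  int rc = regexec (&re, text, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}
```

A fixed-length dump line such as "vector(4) unsigned int" matches both patterns, while a variable-length one such as "vector([4,4]) unsigned int" only matches the new pattern.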

Tested on aarch64-linux-gnu (with and without SVE), arm-linux-gnueabihf
and x86_64-linux-gnu.  Pushed as obvious.

Richard


gcc/testsuite/
* gcc.dg/vect/pr91750.c: Allow "[]," inside a vector(...) lane count.
---
 gcc/testsuite/gcc.dg/vect/pr91750.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/vect/pr91750.c 
b/gcc/testsuite/gcc.dg/vect/pr91750.c
index fe914b2d939..3586f1168ae 100644
--- a/gcc/testsuite/gcc.dg/vect/pr91750.c
+++ b/gcc/testsuite/gcc.dg/vect/pr91750.c
@@ -11,5 +11,5 @@ foo (int n)
 }
 
 /* Make sure the induction IV uses an unsigned increment.  */
-/* { dg-final { scan-tree-dump "vector\\\(\[0-9\]*\\\) unsigned int" "vect" } 
} */
+/* { dg-final { scan-tree-dump {vector\([][0-9,]*\) unsigned int} "vect" } } */
 /* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
-- 
2.17.1



[committed] testsuite: Remove XFAIL for variable-length vectors

2020-11-17 Thread Richard Sandiford via Gcc-patches
The XFAIL for variable-length vectors is no longer needed since
we can't build the required constant vector and so fall back to
fixed-length alternatives.

Tested on aarch64-linux-gnu (with and without SVE), arm-linux-gnueabihf
and x86_64-linux-gnu.  Pushed as obvious.

Richard


gcc/testsuite/
* gcc.dg/vect/bb-slp-43.c: Remove XFAIL for vect_variable_length.
---
 gcc/testsuite/gcc.dg/vect/bb-slp-43.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-43.c 
b/gcc/testsuite/gcc.dg/vect/bb-slp-43.c
index a65d9513c4d..40bd2e0dfbf 100644
--- a/gcc/testsuite/gcc.dg/vect/bb-slp-43.c
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-43.c
@@ -14,4 +14,4 @@ f (int *restrict x, short *restrict y)
 }
 
 /* { dg-final { scan-tree-dump-not "mixed mask and nonmask" "slp2" } } */
-/* { dg-final { scan-tree-dump-not "vector operands from scalars" "slp2" { 
target { { vect_int && vect_bool_cmp } && { vect_unpack && vect_hw_misalign } } 
xfail vect_variable_length } } } */
+/* { dg-final { scan-tree-dump-not "vector operands from scalars" "slp2" { 
target { { vect_int && vect_bool_cmp } && { vect_unpack && vect_hw_misalign } } 
} } } */
-- 
2.17.1



[committed] testsuite: XFAIL some SLP reduction tests for VLA SVE

2020-11-17 Thread Richard Sandiford via Gcc-patches
For variable-length SVE, we can only use SLP for N scalars of type
T if the number of Ts in a vector is a multiple of N.  For ints
this means that N must be 4 or 2, so this patch XFAILs two tests
for N==8.

The exact limit seems inherently target-specific -- variable-length
vectors with a 256-bit granule would work fine -- so I used aarch64_sve
selectors on the XFAILs.
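The arithmetic behind the N==8 limit can be sketched as follows. This is a simplified model, not vectorizer code: slp_group_fits is a hypothetical helper, and it assumes that the only thing known about a variable-length vector is that its size is a multiple of the granule.

```c
/* Return nonzero if a group of GROUP_SIZE scalars of ELEM_BITS bits each
   evenly divides the number of such elements in one granule of
   GRANULE_BITS bits.  A variable-length vector is some whole number of
   granules, so this is the only element count that is guaranteed.  */
int
slp_group_fits (unsigned granule_bits, unsigned elem_bits,
                unsigned group_size)
{
  if (elem_bits == 0 || group_size == 0)
    return 0;
  unsigned elems_per_granule = granule_bits / elem_bits;
  return elems_per_granule % group_size == 0;
}
```

For 32-bit ints and a 128-bit granule there are 4 elements per granule, so groups of 2 or 4 fit and a group of 8 does not. With a hypothetical 256-bit granule a group of 8 would fit.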

Tested on aarch64-linux-gnu (with and without SVE), arm-linux-gnueabihf
and x86_64-linux-gnu.  Pushed as obvious.

Richard


gcc/testsuite/
* gcc.dg/vect/slp-reduc-4.c: XFAIL test for SLP vectorization
for variable-length SVE.
* gcc.dg/vect/slp-reduc-7.c: Likewise.
---
 gcc/testsuite/gcc.dg/vect/slp-reduc-4.c | 6 --
 gcc/testsuite/gcc.dg/vect/slp-reduc-7.c | 6 --
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/vect/slp-reduc-4.c 
b/gcc/testsuite/gcc.dg/vect/slp-reduc-4.c
index 266b439f0a6..cffb0114bcb 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-reduc-4.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-reduc-4.c
@@ -57,6 +57,8 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail 
vect_no_int_min_max } } } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { 
xfail { vect_no_int_min_max || vect_variable_length } } } } */
-/* { dg-final { scan-tree-dump-times "VEC_PERM_EXPR" 0 "vect" } } */
+/* For variable-length SVE, the number of scalar statements in the
+   reduction exceeds the number of elements in a 128-bit granule.  */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { 
xfail { vect_no_int_min_max || { aarch64_sve && vect_variable_length } } } } } 
*/
+/* { dg-final { scan-tree-dump-times "VEC_PERM_EXPR" 0 "vect" { xfail { 
aarch64_sve && vect_variable_length } } } } */
 
diff --git a/gcc/testsuite/gcc.dg/vect/slp-reduc-7.c 
b/gcc/testsuite/gcc.dg/vect/slp-reduc-7.c
index 05cc9eddacb..7a958f24733 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-reduc-7.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-reduc-7.c
@@ -55,5 +55,7 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail 
vect_no_int_add } } } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { 
xfail { vect_no_int_add || vect_variable_length } } } } */
-/* { dg-final { scan-tree-dump-times "VEC_PERM_EXPR" 0 "vect" } } */
+/* For variable-length SVE, the number of scalar statements in the
+   reduction exceeds the number of elements in a 128-bit granule.  */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { 
xfail { vect_no_int_add || { aarch64_sve && vect_variable_length } } } } } */
+/* { dg-final { scan-tree-dump-times "VEC_PERM_EXPR" 0 "vect" { xfail { 
aarch64_sve && vect_variable_length } } } } */


[committed] testsuite: XFAIL SLP induction tests for VL vectors

2020-11-17 Thread Richard Sandiford via Gcc-patches
We don't yet support SLP inductions for variable-length vectors,
so this patch XFAILs some associated tests.

(Inductions aren't inherently difficult to support.  It just hasn't
been done yet.)

Tested on aarch64-linux-gnu (with and without SVE), arm-linux-gnueabihf
and x86_64-linux-gnu.  Pushed as obvious.

Richard


gcc/testsuite/
* gcc.dg/vect/pr97678.c: XFAIL test for SLP vectorization
for variable-length vectors.
* gcc.dg/vect/pr97835.c: Likewise.
* gcc.dg/vect/slp-49.c: Likewise.
* gcc.dg/vect/vect-outer-slp-1.c: Likewise.
* gcc.dg/vect/vect-outer-slp-2.c: Likewise.
* gcc.dg/vect/vect-outer-slp-3.c: Likewise.
---
 gcc/testsuite/gcc.dg/vect/pr97678.c  | 3 ++-
 gcc/testsuite/gcc.dg/vect/pr97835.c  | 3 ++-
 gcc/testsuite/gcc.dg/vect/slp-49.c   | 3 ++-
 gcc/testsuite/gcc.dg/vect/vect-outer-slp-1.c | 3 ++-
 gcc/testsuite/gcc.dg/vect/vect-outer-slp-2.c | 3 ++-
 gcc/testsuite/gcc.dg/vect/vect-outer-slp-3.c | 3 ++-
 6 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/vect/pr97678.c 
b/gcc/testsuite/gcc.dg/vect/pr97678.c
index ebe4a35bb3f..d9ffb7a169b 100644
--- a/gcc/testsuite/gcc.dg/vect/pr97678.c
+++ b/gcc/testsuite/gcc.dg/vect/pr97678.c
@@ -26,4 +26,5 @@ main ()
 }
 
 /* The init loop should be vectorized with SLP.  */
-/* { dg-final { scan-tree-dump "vectorizing stmts using SLP" "vect" } } */
+/* We don't yet support SLP inductions for variable length vectors.  */
+/* { dg-final { scan-tree-dump "vectorizing stmts using SLP" "vect" { xfail 
vect_variable_length } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/pr97835.c 
b/gcc/testsuite/gcc.dg/vect/pr97835.c
index 5ca477bf806..a90c773eac9 100644
--- a/gcc/testsuite/gcc.dg/vect/pr97835.c
+++ b/gcc/testsuite/gcc.dg/vect/pr97835.c
@@ -18,4 +18,5 @@ x0 (struct co *yy, long int kc, int wi, int md)
 }
 }
 
-/* { dg-final { scan-tree-dump "vectorizing stmts using SLP" "vect" } } */
+/* We don't yet support SLP inductions for variable length vectors.  */
+/* { dg-final { scan-tree-dump "vectorizing stmts using SLP" "vect" { xfail 
vect_variable_length } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/slp-49.c 
b/gcc/testsuite/gcc.dg/vect/slp-49.c
index 3f53baf707b..4141a09ed97 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-49.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-49.c
@@ -34,5 +34,6 @@ main()
   return 0;
 }
 
-/* { dg-final { scan-tree-dump "vectorizing stmts using SLP" "vect" } } */
+/* We don't yet support SLP inductions for variable length vectors.  */
+/* { dg-final { scan-tree-dump "vectorizing stmts using SLP" "vect" { xfail 
vect_variable_length } } } */
 /* { dg-final { scan-tree-dump "Loop contains only SLP stmts" "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-outer-slp-1.c 
b/gcc/testsuite/gcc.dg/vect/vect-outer-slp-1.c
index 62b18bd5764..445157d39b5 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-outer-slp-1.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-outer-slp-1.c
@@ -27,5 +27,6 @@ void foo (void)
 
 /* We should vectorize this outer loop with SLP.  */
 /* { dg-final { scan-tree-dump "OUTER LOOP VECTORIZED" "vect" } } */
-/* { dg-final { scan-tree-dump "vectorizing stmts using SLP" "vect" } } */
+/* We don't yet support SLP inductions for variable length vectors.  */
+/* { dg-final { scan-tree-dump "vectorizing stmts using SLP" "vect" { xfail 
vect_variable_length } } } */
 /* { dg-final { scan-tree-dump-not "VEC_PERM_EXPR" "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-outer-slp-2.c 
b/gcc/testsuite/gcc.dg/vect/vect-outer-slp-2.c
index 08b4fc52430..ec1e1036f57 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-outer-slp-2.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-outer-slp-2.c
@@ -48,4 +48,5 @@ int main ()
 }
 
 /* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" } } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" } 
} */
+/* We don't yet support SLP inductions for variable length vectors.  */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { 
xfail vect_variable_length } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/vect-outer-slp-3.c 
b/gcc/testsuite/gcc.dg/vect/vect-outer-slp-3.c
index c67d3690bb4..53865d4737b 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-outer-slp-3.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-outer-slp-3.c
@@ -59,4 +59,5 @@ int main ()
 }
 
 /* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED" 1 "vect" } } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" } 
} */
+/* We don't yet support SLP inductions for variable length vectors.  */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { 
xfail vect_variable_length } } } */


[committed] testsuite: Adjust vect/pr65947-8.c for SVE

2020-11-17 Thread Richard Sandiford via Gcc-patches
We can vectorise vect/pr65947-8.c for SVE, as we can for GCN.

Tested on aarch64-linux-gnu (with and without SVE), arm-linux-gnueabihf
and x86_64-linux-gnu.  Pushed as obvious.

Richard


gcc/testsuite/
* gcc.dg/vect/pr65947-8.c: Expect the loop to be vectorized for SVE.
---
 gcc/testsuite/gcc.dg/vect/pr65947-8.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/vect/pr65947-8.c 
b/gcc/testsuite/gcc.dg/vect/pr65947-8.c
index a2a940daf1a..d0426792e35 100644
--- a/gcc/testsuite/gcc.dg/vect/pr65947-8.c
+++ b/gcc/testsuite/gcc.dg/vect/pr65947-8.c
@@ -41,6 +41,6 @@ main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-not "LOOP VECTORIZED" "vect" { target { ! 
amdgcn*-*-* } } } } */
-/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" { target amdgcn*-*-* } 
} } */
-/* { dg-final { scan-tree-dump "multiple types in double reduction or 
condition reduction" "vect" { target { ! amdgcn*-*-* } } } } */
+/* { dg-final { scan-tree-dump-not "LOOP VECTORIZED" "vect" { target { ! { 
amdgcn*-*-* || aarch64_sve } } } } } */
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" { target { amdgcn*-*-* 
|| aarch64_sve } } } } */
+/* { dg-final { scan-tree-dump "multiple types in double reduction or 
condition reduction" "vect" { target { ! { amdgcn*-*-* || aarch64_sve } } } } } 
*/


[committed] testsuite: Adjust vect/bb-slp-subgroups-3.c for VL vectors

2020-11-17 Thread Richard Sandiford via Gcc-patches
Because we disable the cost model, targets with variable-length
vectors can end up vectorising the store to a[0..7] on its own.
With the cost model we do something sensible.

Tested on aarch64-linux-gnu (with and without SVE), arm-linux-gnueabihf
and x86_64-linux-gnu.  Pushed as obvious.

Richard


gcc/testsuite/
* gcc.dg/vect/bb-slp-subgroups-3.c: XFAIL for variable-length vectors.
---
 gcc/testsuite/gcc.dg/vect/bb-slp-subgroups-3.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-subgroups-3.c 
b/gcc/testsuite/gcc.dg/vect/bb-slp-subgroups-3.c
index fe36f90bb90..e27f956d7b8 100644
--- a/gcc/testsuite/gcc.dg/vect/bb-slp-subgroups-3.c
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-subgroups-3.c
@@ -38,4 +38,7 @@ main (int argc, char **argv)
 }
 
 /* { dg-final { scan-tree-dump-times "Basic block will be vectorized using 
SLP" 1 "slp2" } } */
-/* { dg-final { scan-tree-dump-times "optimized: basic block" 2 "slp2" } } */
+/* Because we disable the cost model, targets with variable-length
+   vectors can end up vectorizing the store to a[0..7] on its own.
+   With the cost model we do something sensible.  */
+/* { dg-final { scan-tree-dump-times "optimized: basic block" 2 "slp2" { xfail 
vect_variable_length } } } */


[committed] testsuite: Add a vect_element_align_preferred guard

2020-11-17 Thread Richard Sandiford via Gcc-patches
We don't try to increase the alignment of decls if
vect_element_align_preferred.

Tested on aarch64-linux-gnu (with and without SVE), arm-linux-gnueabihf
and x86_64-linux-gnu.  Pushed as obvious.

Richard


gcc/testsuite/
* gcc.dg/vect/aligned-section-anchors-nest-1.c: XFAIL alignment
test if vect_element_align_preferred.
---
 gcc/testsuite/gcc.dg/vect/aligned-section-anchors-nest-1.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/vect/aligned-section-anchors-nest-1.c 
b/gcc/testsuite/gcc.dg/vect/aligned-section-anchors-nest-1.c
index f8c36a89fc7..24b2fa86da2 100644
--- a/gcc/testsuite/gcc.dg/vect/aligned-section-anchors-nest-1.c
+++ b/gcc/testsuite/gcc.dg/vect/aligned-section-anchors-nest-1.c
@@ -30,4 +30,4 @@ int *foo(void)
   return &c[0][0];
 }
 
-/* { dg-final { scan-ipa-dump-times "Increasing alignment of decl" 3 
"increase_alignment" } } */
+/* { dg-final { scan-ipa-dump-times "Increasing alignment of decl" 3 
"increase_alignment" { xfail vect_element_align_preferred } } } */


[committed] testsuite: Add a vect_load_lanes guard

2020-11-17 Thread Richard Sandiford via Gcc-patches
We still fall back to load/store-lanes for slp-46.c, if the target
supports it.

Tested on aarch64-linux-gnu (with and without SVE), arm-linux-gnueabihf
and x86_64-linux-gnu.  Pushed as obvious.

Richard


gcc/testsuite/
* gcc.dg/vect/slp-46.c: XFAIL test for SLP on vect_load_lanes targets.
---
 gcc/testsuite/gcc.dg/vect/slp-46.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/vect/slp-46.c 
b/gcc/testsuite/gcc.dg/vect/slp-46.c
index 58a238aad6c..18476a43d3f 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-46.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-46.c
@@ -94,4 +94,4 @@ main ()
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" } 
} */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" { 
xfail vect_load_lanes } } } */


[PATCH 1/5] testsuite: Fix vect/vect-sdiv-pow2-1.c

2020-11-17 Thread Richard Sandiford via Gcc-patches
We're now able to vectorise the set-up loop:

  int p = power2 (fns[i].po2);
  for (int j = 0; j < N; j++)
a[j] = ((p << 4) * j) / (N - 1) - (p << 5);

Rather than adjust the expected output for that, it seemed better
to disable optimisation for the testing code.

Tested on aarch64-linux-gnu (with and without SVE), arm-linux-gnueabihf
and x86_64-linux-gnu.  OK to install?

Richard


gcc/testsuite/
* gcc.dg/vect/vect-sdiv-pow2-1.c (main): Disable optimization.
---
 gcc/testsuite/gcc.dg/vect/vect-sdiv-pow2-1.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/vect/vect-sdiv-pow2-1.c 
b/gcc/testsuite/gcc.dg/vect/vect-sdiv-pow2-1.c
index be70bc6c47e..bf387133d01 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-sdiv-pow2-1.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-sdiv-pow2-1.c
@@ -53,7 +53,7 @@ power2 (int x)
 
 #define N 50
 
-int
+int __attribute__ ((optimize (0)))
 main (void)
 {
   int a[N], b[N], c[N];


[PATCH 2/5] testsuite: Add a vect_partial_vectors_usage_2 guard

2020-11-17 Thread Richard Sandiford via Gcc-patches
We don't need an epilogue loop if the main loop can operate on
partial vectors, so this patch disables an associated test.
The alternative would be to force partial-vectors-usage=1
on the command line.

Tested on aarch64-linux-gnu (with and without SVE), arm-linux-gnueabihf
and x86_64-linux-gnu.  OK to install?

Richard


gcc/testsuite/
* gcc.dg/vect/vect-epilogues.c: XFAIL test for epilogue loop
vectorization if vect_partial_vectors_usage_2.
---
 gcc/testsuite/gcc.dg/vect/vect-epilogues.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/vect/vect-epilogues.c 
b/gcc/testsuite/gcc.dg/vect/vect-epilogues.c
index a146bb6518a..ab7e8a1a759 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-epilogues.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-epilogues.c
@@ -16,4 +16,4 @@ void pixel_avg( unsigned char *dst, int i_dst_stride,
  }
  }
 
-/* { dg-final { scan-tree-dump "LOOP EPILOGUE VECTORIZED" "vect" { target 
vect_multiple_sizes xfail { arm32 && be } } } }  */
+/* { dg-final { scan-tree-dump "LOOP EPILOGUE VECTORIZED" "vect" { target 
vect_multiple_sizes xfail { { arm32 && be } || vect_partial_vectors_usage_2 } } 
} } */


[PATCH 3/5] testsuite: Add vect_perm3_int guards

2020-11-17 Thread Richard Sandiford via Gcc-patches
SLP vectorisation of gcc.dg/vect/fast-math-vect-call-1.c involves
a group of 3 floats, which requires the same permutation as
vect_perm3_int.

The load/store_lanes XFAILs in gcc.dg/vect/slp-perm-6.c implicitly
assumed vect_perm3_int, which is true for Advanced SIMD but not for
VLA SVE.  Whether it's true for fixed-length SVE depends on the
vector length.

The xfail selector applies on top of the target selector, so it's
not necessary to make the xfail selector a strict subset of the
target selector.
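The interaction between the two selectors can be modelled roughly like this. The enum and function names are hypothetical and this is only a sketch of the behaviour described above, not DejaGnu code.

```c
/* What the harness expects from one dg-final directive (hypothetical).  */
enum dg_expectation
{
  DG_NOT_RUN,      /* target selector is false: directive is skipped */
  DG_EXPECT_PASS,  /* target true, xfail false */
  DG_EXPECT_FAIL   /* target true, xfail true */
};

/* The xfail selector is only consulted once the target selector has
   matched, so it does not need to repeat the target condition.  */
enum dg_expectation
dg_expectation_for (int target_matches, int xfail_matches)
{
  if (!target_matches)
    return DG_NOT_RUN;
  return xfail_matches ? DG_EXPECT_FAIL : DG_EXPECT_PASS;
}
```

So "target vect_load_lanes xfail vect_perm3_int" expects a failure only where both selectors hold, and skips the directive wherever vect_load_lanes is false, whatever vect_perm3_int says.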

Tested on aarch64-linux-gnu (with and without SVE), arm-linux-gnueabihf
and x86_64-linux-gnu.  OK to install?

Richard


gcc/testsuite/
* gcc.dg/vect/fast-math-vect-call-1.c: Only expect SLP to be used
on vect_perm3_int targets.
* gcc.dg/vect/slp-perm-6.c: Likewise.  Only XFAIL the LOAD/STORE_LANES
tests on vect_perm3_int targets.
---
 gcc/testsuite/gcc.dg/vect/fast-math-vect-call-1.c | 2 +-
 gcc/testsuite/gcc.dg/vect/slp-perm-6.c| 8 
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/vect/fast-math-vect-call-1.c 
b/gcc/testsuite/gcc.dg/vect/fast-math-vect-call-1.c
index 877de4eb5be..495c0319c9d 100644
--- a/gcc/testsuite/gcc.dg/vect/fast-math-vect-call-1.c
+++ b/gcc/testsuite/gcc.dg/vect/fast-math-vect-call-1.c
@@ -97,4 +97,4 @@ main ()
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 4 "vect" { target { 
vect_call_copysignf && vect_call_sqrtf } } } } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 3 "vect" { 
target { vect_call_copysignf && vect_call_sqrtf } } } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 3 "vect" { 
target { { vect_call_copysignf && vect_call_sqrtf } && vect_perm3_int } } } } */
diff --git a/gcc/testsuite/gcc.dg/vect/slp-perm-6.c 
b/gcc/testsuite/gcc.dg/vect/slp-perm-6.c
index cc863de76bf..5f121b52ffb 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-perm-6.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-perm-6.c
@@ -106,7 +106,7 @@ int main (int argc, const char* argv[])
 /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" { 
target { vect_perm3_int && { {! vect_load_lanes } && {! 
vect_partial_vectors_usage_1 } } } } } } */
 /* The epilogues are vectorized using partial vectors.  */
 /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 4 "vect" { 
target { vect_perm3_int && { {! vect_load_lanes } && 
vect_partial_vectors_usage_1 } } } } } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" { 
target vect_load_lanes } } } */
-/* { dg-final { scan-tree-dump "Built SLP cancelled: can use load/store-lanes" 
"vect" { target { vect_perm3_int && vect_load_lanes } xfail { vect_perm3_int && 
vect_load_lanes } } } } */
-/* { dg-final { scan-tree-dump "LOAD_LANES" "vect" { target { vect_load_lanes 
} xfail { vect_load_lanes } } } } */
-/* { dg-final { scan-tree-dump "STORE_LANES" "vect" { target { vect_load_lanes 
} xfail { vect_load_lanes } } } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" { 
target { vect_perm3_int && vect_load_lanes } } } } */
+/* { dg-final { scan-tree-dump "Built SLP cancelled: can use load/store-lanes" 
"vect" { target { vect_perm3_int && vect_load_lanes } xfail *-*-* } } } */
+/* { dg-final { scan-tree-dump "LOAD_LANES" "vect" { target vect_load_lanes 
xfail vect_perm3_int } } } */
+/* { dg-final { scan-tree-dump "STORE_LANES" "vect" { target vect_load_lanes 
xfail vect_perm3_int } } } */


[PATCH 4/5] testsuite: Adjust gcc.dg/vect/slp-21.c for Arm targets

2020-11-17 Thread Richard Sandiford via Gcc-patches
On arm* and aarch64* targets, we can vectorise the second of the main
loops using SLP, not just the third.  As the comments say, whether this
is supported depends on a very specific permutation, so it seemed better
to use direct target selectors.
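For reference, the three selectors listed in the patch comment can be reproduced with a short sketch. It rests on two assumptions that are not stated in the patch: that the 4->3 permutation keeps the first three of every four elements, and that each output vector selects from two adjacent input vectors, with indices numbered from the start of the first one. The function name perm_4_to_3 is made up.

```c
/* Fill SEL[0..NLANES-1] with the selector for output vector OUT_VEC.
   Indices are relative to the concatenation of input vectors OUT_VEC
   and OUT_VEC + 1 (an assumption about the numbering in the comment).  */
void
perm_4_to_3 (int out_vec, int nlanes, int *sel)
{
  for (int lane = 0; lane < nlanes; ++lane)
    {
      /* Position of this lane in the packed output stream.  */
      int k = out_vec * nlanes + lane;
      /* Matching input position, skipping every fourth element.  */
      int src = (k / 3) * 4 + k % 3;
      /* Rebase onto the first of the two input vectors.  */
      sel[lane] = src - out_vec * nlanes;
    }
}
```

With 8 lanes, output vectors 0, 1 and 2 give the three selectors shown in the comment, and every index stays below 16, so each is a two-input permute.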

Tested on aarch64-linux-gnu (with and without SVE), arm-linux-gnueabihf
and x86_64-linux-gnu.  OK to install?

Richard


gcc/testsuite/
* gcc.dg/vect/slp-21.c: Expect 4 SLP instances to be vectorized
on arm* and aarch64* targets.
---
 gcc/testsuite/gcc.dg/vect/slp-21.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/vect/slp-21.c 
b/gcc/testsuite/gcc.dg/vect/slp-21.c
index 1f8c82e8ba8..117d65c5ddb 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-21.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-21.c
@@ -201,6 +201,16 @@ int main (void)
 
 /* { dg-final { scan-tree-dump-times "vectorized 4 loops" 1 "vect"  { target { 
vect_strided4 || vect_extract_even_odd } } } } */
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  { target  
{ ! { vect_strided4 || vect_extract_even_odd } } } } } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" { 
target vect_strided4 } } } */
+/* Some targets can vectorize the second of the three main loops using
+   hybrid SLP.  For 128-bit vectors, the required 4->3 permutations are:
+
+   { 0, 1, 2, 4, 5, 6, 8, 9 }
+   { 2, 4, 5, 6, 8, 9, 10, 12 }
+   { 5, 6, 8, 9, 10, 12, 13, 14 }
+
+   Not all vect_perm targets support that, and it's a bit too specific to have
+   its own effective-target selector, so we just test targets directly.  */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 4 "vect" { 
target { aarch64*-*-* arm*-*-* } } } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" { 
target { vect_strided4 && { ! { aarch64*-*-* arm*-*-* } } } } } } */
 /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 0 "vect"  { 
target { ! { vect_strided4 } } } } } */
   


[PATCH 5/5] testsuite: Adjust bb-slp-pr68892.c for AArch64

2020-11-17 Thread Richard Sandiford via Gcc-patches
AArch64 passes the "not profitable" test because it treats vec_construct
as having a high-enough cost.  This means that we can try other vector
modes, which in turn causes "BB vectorization with gaps at the end of
a load is not supported" to be printed more than once.  The number of
times that we print the message doesn't seem important, so the patch
converts it to a plain scan-tree-dump.
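The difference between the two directive styles can be sketched with plain substring counting. This is a simplification: the real directives match regular expressions against the dump file, and the helper names below are hypothetical.

```c
#include <string.h>

/* Count non-overlapping occurrences of MSG in DUMP.  */
int
count_occurrences (const char *dump, const char *msg)
{
  size_t len = strlen (msg);
  int n = 0;
  if (len == 0)
    return 0;
  for (const char *p = strstr (dump, msg); p; p = strstr (p + len, msg))
    n++;
  return n;
}

/* Model of scan-tree-dump: passes if MSG appears at least once.  */
int
scan_dump (const char *dump, const char *msg)
{
  return count_occurrences (dump, msg) >= 1;
}

/* Model of scan-tree-dump-times: passes only for exactly TIMES hits.  */
int
scan_dump_times (const char *dump, const char *msg, int times)
{
  return count_occurrences (dump, msg) == times;
}
```

If trying a second vector mode prints the message twice, the "-times ... 1" form fails while the plain form still passes, which is why the plain form is less sensitive to how many modes a target tries.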

Tested on aarch64-linux-gnu (with and without SVE), arm-linux-gnueabihf
and x86_64-linux-gnu.  OK to install?

Richard


gcc/testsuite/
* gcc.dg/vect/bb-slp-pr68892.c: Don't XFAIL the profitability
test for aarch64*-*-*.  Allow the "BB vectorization with gaps"
message to be printed more than once.
---
 gcc/testsuite/gcc.dg/vect/bb-slp-pr68892.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-pr68892.c 
b/gcc/testsuite/gcc.dg/vect/bb-slp-pr68892.c
index 8cd3a6a1274..e9909cf0dfa 100644
--- a/gcc/testsuite/gcc.dg/vect/bb-slp-pr68892.c
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-pr68892.c
@@ -15,6 +15,6 @@ void foo(void)
 
 /* ???  Due to the gaps we fall back to scalar loads which makes the
vectorization profitable.  */
-/* { dg-final { scan-tree-dump "not profitable" "slp2" { xfail *-*-* } } } */
-/* { dg-final { scan-tree-dump-times "BB vectorization with gaps at the end of 
a load is not supported" 1 "slp2" } } */
-/* { dg-final { scan-tree-dump-times "Basic block will be vectorized" 1 "slp2" 
} } */
+/* { dg-final { scan-tree-dump "not profitable" "slp2" { xfail { ! 
aarch64*-*-* } } } } */
+/* { dg-final { scan-tree-dump "BB vectorization with gaps at the end of a 
load is not supported" "slp2" } } */
+/* { dg-final { scan-tree-dump-times "Basic block will be vectorized" 1 "slp2" 
{ xfail aarch64*-*-* } } } */


[committed] PR97693: Specify required vectype in vectorizable_call

2020-11-17 Thread Richard Sandiford via Gcc-patches
The vectorizable_call part of r11-1143 dropped the required
vectype when moving from vect_get_vec_def_for_operand to
vect_get_vec_defs_for_operand.  This caused an ICE on the
testcase for SVE, because we ended up with a non-predicate
value being passed to a predicate input.

AFAICT this was the only instance of that happening.  The types
seemed to get carried forward for all the other converted calls.

Tested on aarch64-linux-gnu (with and without SVE), arm-linux-gnueabihf
and x86_64-linux-gnu.  Pushed as obvious.

Richard


gcc/
PR tree-optimization/97693
* tree-vect-stmts.c (vectorizable_call): Pass the required vectype
to vect_get_vec_defs_for_operand.

gcc/testsuite/
PR tree-optimization/97693
* gcc.dg/vect/pr97693.c: New test.
---
 gcc/testsuite/gcc.dg/vect/pr97693.c | 15 +++
 gcc/tree-vect-stmts.c   |  3 ++-
 2 files changed, 17 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/pr97693.c

diff --git a/gcc/testsuite/gcc.dg/vect/pr97693.c 
b/gcc/testsuite/gcc.dg/vect/pr97693.c
new file mode 100644
index 000..4da44c70555
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr97693.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+
+extern short a[];
+int b;
+short c, d;
+unsigned e() {
+  if (c)
+return c;
+  return d;
+}
+void f() {
+  for (unsigned g = b; g; g += 6)
+for (_Bool h = 0; h < (_Bool)e(); h = 1)
+  a[g] = 1 / b;
+}
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 2c7a8a70913..4e535fec9ca 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -3427,7 +3427,8 @@ vectorizable_call (vec_info *vinfo,
{
  vec_defs.quick_push (vNULL);
  vect_get_vec_defs_for_operand (vinfo, stmt_info, ncopies,
-op, &vec_defs[i]);
+op, &vec_defs[i],
+vectypes[i]);
}
  orig_vargs[i] = vargs[i] = vec_defs[i][j];
}


Re: [PATCH v1 1/2] Simplify shifts wider than the bitwidth of types

2020-11-17 Thread Philipp Tomsich
Jeff,

On Tue, 17 Nov 2020 at 00:38, Jeff Law  wrote:

>
> On 11/16/20 11:57 AM, Philipp Tomsich wrote:
> > From: Philipp Tomsich 
> >
> > While most shifts wider than the bitwidth of a type will be caught by
> > other passes, it is possible that these show up for VRP.
> > Consider the following example:
> >   int func (int a, int b, int c)
> >   {
> > return (a << ((b && c) - 1));
> >   }
> >
> > This adds simplify_using_ranges::simplify_lshift_using_ranges to
> > detect and rewrite such cases.  If the intersection of meaningful
> > shift amounts for the underlying type and the value-range computed
> > for the shift-amount (whether an integer constant or a variable) is
> > empty, the statement is replaced with the zero-constant of the same
> > precision as the result.
> >
> > gcc/ChangeLog:
> >
> >* vr-values.h (simplify_using_ranges): Declare.
> >* vr-values.c (simplify_lshift_using_ranges): New function.
> >(simplify): Use simplify_lshift_using_ranges for LSHIFT_EXPR.
>
> Umm, isn't this a shift wider than the bitwidth undefined behavior?  We
> should be generating warnings for that, not trying to further optimize
> it :-)
>

The shift is undefined behavior at the language level (for C), and a warning
will be generated if such a shift is encountered; additionally, the shift
will be replaced with the value 0.

However, in the above case, the shift is generated only in the middle end:
At 136t.walloca, I still have:

>   # RANGE [-1, 0]
>   _1 = iftmp.1_2 + -1;
>   _6 = a_5(D) << _1;

Whereas at 137t.pre, this is changed into:

> Found partial redundancy for expression {lshift_expr,a_5(D),_1} (0006)
> Inserted _9 = a_5(D) << -1;


In other words, the change to VRP canonicalizes what a lshift_expr with a
shift-amount outside of the type width means... it doesn't assume anything
about the original language.
Do we assume that a LSHIFT_EXPR has the same semantics as for a
C-language shift-left? If so, then pre should not generate the LSHIFT_EXPR
for _9... or we might even catch this later in path isolation (as undefined
behavior, insert a __builtin_trap() and emit a warning)?
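
To make the range argument concrete, here is a minimal sketch in plain C (not
GCC internals) of the emptiness test described in the patch summary.  The
helper name shift_range_meaningful_p and the use of plain long bounds are
assumptions made for illustration only; the patch itself works on the value
ranges tracked in vr-values.c.

```c
#include <stdbool.h>

/* Hypothetical helper, not part of GCC: return true if the shift-amount
   range [lo, hi] overlaps the meaningful shift amounts [0, prec - 1]
   for a type with a precision of PREC bits.  */
static bool
shift_range_meaningful_p (long lo, long hi, unsigned int prec)
{
  return lo <= hi && hi >= 0 && lo <= (long) prec - 1;
}
```

For a 32-bit int, the range [-1, 0] recorded for _1 still overlaps the
meaningful amounts (0 is valid), so under this reading nothing would be
folded for _6.  The constant -1 that PRE inserts corresponds to [-1, -1],
which does not overlap, and that is the case the description says would be
replaced with zero.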

Note that in his comment to patch 2/2, Jim has noted that user code for
RISC-V may assume a truncation of the shift-operand...

Philipp.


[committed] aarch64: Remove XFAILs for two SVE tests

2020-11-17 Thread Richard Sandiford via Gcc-patches
These tests started passing a while ago, so remove the XFAILs.

Tested on aarch64-linux-gnu, pushed to trunk.

Richard


gcc/testsuite/
* gcc.target/aarch64/sve/cond_cnot_1.c: Remove XFAIL.
* gcc.target/aarch64/sve/cond_unary_1.c: Likewise.
---
 gcc/testsuite/gcc.target/aarch64/sve/cond_cnot_1.c  | 3 +--
 gcc/testsuite/gcc.target/aarch64/sve/cond_unary_1.c | 4 +---
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_cnot_1.c 
b/gcc/testsuite/gcc.target/aarch64/sve/cond_cnot_1.c
index bd877663723..49f0b18a5a5 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/cond_cnot_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_cnot_1.c
@@ -31,5 +31,4 @@ TEST_ALL (DEF_LOOP)
 
 /* { dg-final { scan-assembler-not {\tmov\tz} } } */
 /* { dg-final { scan-assembler-not {\tmovprfx\t} } } */
-/* Currently we canonicalize the ?: so that !b[i] is the "false" value.  */
-/* { dg-final { scan-assembler-not {\tsel\t} { xfail *-*-* } } } */
+/* { dg-final { scan-assembler-not {\tsel\t} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_unary_1.c 
b/gcc/testsuite/gcc.target/aarch64/sve/cond_unary_1.c
index 2b5f9c345ab..0492476715d 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/cond_unary_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_unary_1.c
@@ -54,6 +54,4 @@ TEST_ALL (DEF_LOOP)
 
 /* { dg-final { scan-assembler-not {\tmov\tz} } } */
 /* { dg-final { scan-assembler-not {\tmovprfx\t} } } */
-/* XFAILed because the ?: gets canonicalized so that the operation is in
-   the false arm.  */
-/* { dg-final { scan-assembler-not {\tsel\t} { xfail *-*-* } } } */
+/* { dg-final { scan-assembler-not {\tsel\t} } } */


[PATCH]AArch64[GCC-8] Fix overflow in memcopy expansion on aarch64.

2020-11-17 Thread Tamar Christina via Gcc-patches
Hi All,

This is a partial backport of 0f801e0b6cc9f67c9a8983127e23161f6025c5b6, which
fixes a truncation error for the inline memcopy on AArch64 on GCC-8.
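
As a rough illustration of the kind of truncation involved (a sketch, not the
GCC code): when a byte count that needs more than 32 bits is stored in a
32-bit variable, the upper bits are dropped and a very large copy can look
like a small one.  The helper names below are hypothetical, and uint32_t
stands in for "unsigned int" on typical hosts.

```c
#include <stdint.h>

/* Hypothetical model, not the GCC code: the value a byte count takes
   when it is stored in a 32-bit unsigned variable.  */
static uint32_t
count_as_32bit (uint64_t n)
{
  return (uint32_t) n;
}

/* Hypothetical check: return nonzero if narrowing N to 32 bits loses
   information, i.e. the stored count no longer equals the real one.  */
static int
count_truncated_p (uint64_t n)
{
  return (uint64_t) count_as_32bit (n) != n;
}
```

A count of 0x100000010 bytes, for example, would be seen as 0x10 after
narrowing, which is why keeping the count in a wider type avoids the
problem.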

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for GCC-8?

gcc/ChangeLog:

PR target/97535
* config/aarch64/aarch64.c (aarch64_expand_movmem): Use
unsigned HOST_WIDE_INT.

gcc/testsuite/ChangeLog:

PR target/97535
* gcc.target/aarch64/pr97535.c: New test.

Thanks,
Tamar

--- inline copy of patch -- 
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 
72c11e3565908916d42c2d6481b5177f6cc07a5e..8faadabc996232cb37d876c973aba7f9aff39b6f
 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -15968,7 +15968,7 @@ aarch64_copy_one_block_and_progress_pointers (rtx *src, 
rtx *dst,
 bool
 aarch64_expand_movmem (rtx *operands)
 {
-  unsigned int n;
+  unsigned HOST_WIDE_INT n;
   rtx dst = operands[0];
   rtx src = operands[1];
   rtx base;
diff --git a/gcc/testsuite/gcc.target/aarch64/pr97535.c 
b/gcc/testsuite/gcc.target/aarch64/pr97535.c
new file mode 100644
index 
..55586c6e5c5f74f0422ec52484459e31cda99cf0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr97535.c
@@ -0,0 +1,16 @@
+/* { dg-do compile { target { ! ilp32 } } } */
+
+#include <string.h>
+
+#define SIZE 0xFF
+
+extern char raw_buffer[SIZE];
+
+void setRaw(const void *raw)
+{
+memcpy(raw_buffer, raw, SIZE);
+}
+
+/* At any optimization level this should be a function call
+   and not inlined.  */
+/* { dg-final { scan-assembler "bl\tmemcpy" } } */


Re: [PATCH]AArch64[GCC-8] Fix overflow in memcopy expansion on aarch64.

2020-11-17 Thread Richard Sandiford via Gcc-patches
Tamar Christina  writes:
> Hi All,
>
> This a partial backport for 0f801e0b6cc9f67c9a8983127e23161f6025c5b6 which 
> fixes
> a truncation error for the inline memcopy on AArch64 on GCC-8.
>
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
>
> Ok for GCC-8?

OK, thanks.

Richard

>
> gcc/ChangeLog:
>
>   PR target/97535
>   * config/aarch64/aarch64.c (aarch64_expand_movme): Use
>   unsigned HOST_WIDE_INT.
>
> gcc/testsuite/ChangeLog:
>
>   PR target/97535
>   * gcc.target/aarch64/pr97535.c: New test.
>
> Thanks,
> Tamar
>
> --- inline copy of patch -- 
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 
> 72c11e3565908916d42c2d6481b5177f6cc07a5e..8faadabc996232cb37d876c973aba7f9aff39b6f
>  100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -15968,7 +15968,7 @@ aarch64_copy_one_block_and_progress_pointers (rtx 
> *src, rtx *dst,
>  bool
>  aarch64_expand_movmem (rtx *operands)
>  {
> -  unsigned int n;
> +  unsigned HOST_WIDE_INT n;
>rtx dst = operands[0];
>rtx src = operands[1];
>rtx base;
> diff --git a/gcc/testsuite/gcc.target/aarch64/pr97535.c 
> b/gcc/testsuite/gcc.target/aarch64/pr97535.c
> new file mode 100644
> index 
> ..55586c6e5c5f74f0422ec52484459e31cda99cf0
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/pr97535.c
> @@ -0,0 +1,16 @@
> +/* { dg-do compile { target { ! ilp32 } } } */
> +
> +#include <string.h>
> +
> +#define SIZE 0xFF
> +
> +extern char raw_buffer[SIZE];
> +
> +void setRaw(const void *raw)
> +{
> +memcpy(raw_buffer, raw, SIZE);
> +}
> +
> +/* At any optimization level this should be a function call
> +   and not inlined.  */
> +/* { dg-final { scan-assembler "bl\tmemcpy" } } */


Re: [PATCH] libgcc: Add a weak stub for __sync_synchronize

2020-11-17 Thread Richard Earnshaw (lists) via Gcc-patches
On 03/11/2020 15:08, Bernd Edlinger wrote:
> Hi,
> 
> this fixes a problem with a missing symbol __sync_synchronize
> which happens when newlib is used together with libstdc++ for
> the non-threaded simulator target arm-none-eabi.
> 
> There are several questions on stackoverflow about this issue.
> 
> I would like to add a weak symbol for this target, since this
> is only a default implementation and not meant to override a
> possibly more sophisticated synchronization function from the
> c-runtime.
> 
> 
> Regression tested successfully on arm-none-eabi with newlib-3.3.0.
> 
> Is it OK for trunk?
> 
> 
> Thanks
> Bernd.
> 

I seem to recall that this was a deliberate decision - you can't guess
this correctly, at least when trying to build portable code - you just
have to know which runtime you will be using.

I think Ramana had some changes in the works at one point to address
(some) of this, but I'm not sure what happened to them.  Ramana?


+#if defined (__ARM_ARCH_6__) || defined (__ARM_ARCH_6J__)   \
+|| defined (__ARM_ARCH_6K__) || defined (__ARM_ARCH_6T2__)  \
+|| defined (__ARM_ARCH_6Z__) || defined (__ARM_ARCH_6ZK__)  \
+|| defined (__ARM_ARCH_7__) || defined (__ARM_ARCH_7A__)
+#if defined (__ARM_ARCH_7__) || defined (__ARM_ARCH_7A__)

Ug, no!  Use the ACLE macros to avoid this sort of mess.
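
For illustration only, a sketch of what keying on an ACLE macro could look
like.  __ARM_ARCH is the ACLE architecture-version macro; the helper name,
the strings it returns and the exact version cut-offs are assumptions made
for this example, not what the patch or libgcc does.

```c
/* Hypothetical helper for illustration: name the barrier strategy a
   __sync_synchronize stub might select, keyed on the ACLE macro
   __ARM_ARCH instead of a list of per-variant __ARM_ARCH_xx__ macros.
   The mapping below is an assumption, not taken from the patch.  */
static const char *
barrier_strategy (void)
{
#if defined (__ARM_ARCH) && __ARM_ARCH >= 7
  return "dmb";
#elif defined (__ARM_ARCH) && __ARM_ARCH == 6
  return "cp15";
#else
  return "none";
#endif
}
```

On a non-Arm host the macro is undefined and the last branch is taken.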

R.


c-family: token streamer

2020-11-17 Thread Nathan Sidwell


This is broken out of modules patch 01-langhooks.diff.  I realized that
this part is independent, and it removes some duplicated code, which is
now handled by the token_streamer class.

gcc/c-family/
* c-ppoutput.c (scan_translation_unit): Use token_streamer, remove
code duplicating that functionality.

pushing to trunk
--
Nathan Sidwell
diff --git i/gcc/c-family/c-ppoutput.c w/gcc/c-family/c-ppoutput.c
index 44c6f30e06b..517de15d97c 100644
--- i/gcc/c-family/c-ppoutput.c
+++ w/gcc/c-family/c-ppoutput.c
@@ -304,120 +304,18 @@ token_streamer::stream (cpp_reader *pfile, const cpp_token *token,
 static void
 scan_translation_unit (cpp_reader *pfile)
 {
-  bool avoid_paste = false;
-  bool do_line_adjustments
-= cpp_get_options (parse_in)->lang != CLK_ASM
-  && !flag_no_line_commands;
-  bool in_pragma = false;
-  bool line_marker_emitted = false;
+  token_streamer streamer (pfile);
 
   print.source = NULL;
   for (;;)
 {
-  location_t loc;
-  const cpp_token *token = cpp_get_token_with_location (pfile, &loc);
-
-  if (token->type == CPP_PADDING)
-	{
-	  avoid_paste = true;
-	  if (print.source == NULL
-	  || (!(print.source->flags & PREV_WHITE)
-		  && token->val.source == NULL))
-	print.source = token->val.source;
-	  continue;
-	}
+  location_t spelling_loc;
+  const cpp_token *token
+	= cpp_get_token_with_location (pfile, &spelling_loc);
 
+  streamer.stream (pfile, token, spelling_loc);
   if (token->type == CPP_EOF)
 	break;
-
-  /* Subtle logic to output a space if and only if necessary.  */
-  if (avoid_paste)
-	{
-	  int src_line = LOCATION_LINE (loc);
-
-	  if (print.source == NULL)
-	print.source = token;
-
-	  if (src_line != print.src_line
-	  && do_line_adjustments
-	  && !in_pragma)
-	{
-	  line_marker_emitted = do_line_change (pfile, token, loc, false);
-	  putc (' ', print.outf);
-	  print.printed = true;
-	}
-	  else if (print.source->flags & PREV_WHITE
-		   || (print.prev
-		   && cpp_avoid_paste (pfile, print.prev, token))
-		   || (print.prev == NULL && token->type == CPP_HASH))
-	{
-	  putc (' ', print.outf);
-	  print.printed = true;
-	}
-	}
-  else if (token->flags & PREV_WHITE)
-	{
-	  int src_line = LOCATION_LINE (loc);
-
-	  if (src_line != print.src_line
-	  && do_line_adjustments
-	  && !in_pragma)
-	line_marker_emitted = do_line_change (pfile, token, loc, false);
-	  putc (' ', print.outf);
-	  print.printed = true;
-	}
-
-  avoid_paste = false;
-  print.source = NULL;
-  print.prev = token;
-  if (token->type == CPP_PRAGMA)
-	{
-	  const char *space;
-	  const char *name;
-
-	  line_marker_emitted = maybe_print_line (token->src_loc);
-	  fputs ("#pragma ", print.outf);
-	  c_pp_lookup_pragma (token->val.pragma, &space, &name);
-	  if (space)
-	fprintf (print.outf, "%s %s", space, name);
-	  else
-	fprintf (print.outf, "%s", name);
-	  print.printed = true;
-	  in_pragma = true;
-	}
-  else if (token->type == CPP_PRAGMA_EOL)
-	{
-	  maybe_print_line (token->src_loc);
-	  in_pragma = false;
-	}
-  else
-	{
-	  if (cpp_get_options (parse_in)->debug)
-	linemap_dump_location (line_table, token->src_loc, print.outf);
-
-	  if (do_line_adjustments
-	  && !in_pragma
-	  && !line_marker_emitted
-	  && print.prev_was_system_token != !!in_system_header_at (loc)
-	  && !is_location_from_builtin_token (loc))
-	/* The system-ness of this token is different from the one
-	   of the previous token.  Let's emit a line change to
-	   mark the new system-ness before we emit the token.  */
-	{
-	  do_line_change (pfile, token, loc, false);
-	  print.prev_was_system_token = !!in_system_header_at (loc);
-	}
-	  cpp_output_token (token, print.outf);
-	  line_marker_emitted = false;
-	  print.printed = true;
-	}
-
-  /* CPP_COMMENT tokens and raw-string literal tokens can
-	 have embedded new-line characters.  Rather than enumerating
-	 all the possible token types just check if token uses
-	 val.str union member.  */
-  if (cpp_token_val_index (token) == CPP_TOKEN_FLD_STR)
-	account_for_newlines (token->val.str.text, token->val.str.len);
 }
 }
 


Make ltrans type canonicals compatible with WPA ones

2020-11-17 Thread Jan Hubicka
Hi,
this patch fixes profiledbootstrap failure with LTO enabled.
What happens is that alias_ptr_types_compatible_p relies on the
fact that alias sets are not refined from WPA to ltrans time:

/* This function originally abstracts from simply comparing
   get_deref_alias_set so that we are sure this still computes
   the same result after LTO type merging is applied.
   When in LTO type merging is done we can actually do this compare.
*/
  if (in_lto_p)
return get_deref_alias_set (t1) == get_deref_alias_set (t2);
  else
return (TYPE_MAIN_VARIANT (TREE_TYPE (t1))
== TYPE_MAIN_VARIANT (TREE_TYPE (t2)));

This conditional is confused - it pessimizes code with -fno-lto
for no good reason. I will fix that separately: we now have
lto_streaming_expected_p so I think it should read

 if (!lto_streaming_expected_p () || flag_wpa)
   use alias sets
 else
   use main variants as conservative estimate.

(so if we ever get the idea to ICF during incremental link or deduplicate in
early passes, things will work safely).  I will send separate patch on
this.
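
The proposed condition can be written out as a tiny stand-alone model.  This
is only a sketch: the function name and the two boolean parameters are
hypothetical stand-ins for lto_streaming_expected_p () and flag_wpa, and the
real follow-up patch may end up looking different.

```c
#include <stdbool.h>

/* Hypothetical model of the proposed condition: return true when alias
   sets can be compared directly, false when the main variants should be
   used as the conservative estimate.  */
static bool
compare_alias_sets_p (bool lto_streaming_expected, bool wpa)
{
  return !lto_streaming_expected || wpa;
}
```

In this model, alias sets are used when no LTO streaming is expected (for
example with -fno-lto) and at WPA time; the main variants are used only when
streaming is still expected and we are not in WPA.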


Not refining alias sets from WPA to ltrans time is a good invariant to
maintain and the canonical type hash behaves this way.  However I broke
this with the ODR logic.

Normally we define canonical types for C++ ODR types according to their
type names.  However to make ODR types compatible with C types we check
if a structurally equivalent C type exists and if so, we ignore ODR
names, giving up on the precision.

This however is not stable between WPA and ltrans since at ltrans the
type merging does not see as many types as WPA does.  To make this
consistent the patch makes WPA ODR_TYPE_P == 0 for ODR types that
conflicted with a non-ODR type.

I had to drop one sanity check in ipa-utils.h (that I think is not very
important - I added it while introducing CXX_ODR_P machinery) and also
it now may happen that we query odr_based_tbaa_p before registering
the first ODR type so we do not want to ICE here.
ODR type registration happens early to produce ODR violation warnings.
Those are not done at ltrans, so dropping the registration is safe. The
type will still be added while computing the type inheritance graph if
needed for devirtualization (and late devirtualization is not very
useful anyway since it won't enable inlining).

Bootstrapped, regtested x86_64-linux, OK?
I think this should go to release branches after some soaking in
mainline too, even if we don't have direct reproducers.

Note that Martin implemented a type checker to sanity check that alias
sets are never getting refined from compile to WPA and from WPA to
ltrans.  I recovered the patch and will play with it more.
I think we should eventually establish this (if alias sets are refined
from compile time to WPA it is either a wrong code issue or frontend alias
sets are not as good as they should be), but of course there are fun
issues.  My plan is to see if I can identify some wrong code bugs and
leave the rest for early next stage1.

PR bootstrap/97857
* ipa-devirt.c (odr_based_tbaa_p): Do not ICE when
odr_hash is not initialized.
* ipa-utils.h (type_with_linkage_p): Do not sanity check
CXX_ODR_P.
* lto-common.c (gimple_register_canonical_type_1): Only
register types with TYPE_CXX_ODR_P flag; sanity check that no
conflict happens at ltrans time.
* tree-streamer-out.c (pack_ts_type_common_value_fields): Set
CXX_ODR_P according to the canonical type.
diff --git a/gcc/ipa-devirt.c b/gcc/ipa-devirt.c
index 067ed5ba073..6e6df0b2af5 100644
--- a/gcc/ipa-devirt.c
+++ b/gcc/ipa-devirt.c
@@ -2032,6 +2032,8 @@ odr_based_tbaa_p (const_tree type)
 {
   if (!RECORD_OR_UNION_TYPE_P (type))
 return false;
+  if (!odr_hash)
+return false;
   odr_type t = get_odr_type (const_cast <tree> (type), false);
   if (!t || !t->tbaa_enabled)
 return false;
diff --git a/gcc/ipa-utils.h b/gcc/ipa-utils.h
index 880e527c590..91571d8e82a 100644
--- a/gcc/ipa-utils.h
+++ b/gcc/ipa-utils.h
@@ -211,8 +211,6 @@ type_with_linkage_p (const_tree t)
   if (!TYPE_CONTEXT (t))
 return false;
 
-  gcc_checking_assert (TREE_CODE (t) == ENUMERAL_TYPE || TYPE_CXX_ODR_P (t));
-
   return true;
 }
 
diff --git a/gcc/lto/lto-common.c b/gcc/lto/lto-common.c
index 6944c469f89..0a3033c3695 100644
--- a/gcc/lto/lto-common.c
+++ b/gcc/lto/lto-common.c
@@ -415,8 +415,8 @@ gimple_register_canonical_type_1 (tree t, hashval_t hash)
  that we can use to lookup structurally equivalent non-ODR type.
  In case we decide to treat type as unique ODR type we recompute hash based
  on name and let TBAA machinery know about our decision.  */
-  if (RECORD_OR_UNION_TYPE_P (t)
-  && odr_type_p (t) && !odr_type_violation_reported_p (t))
+  if (RECORD_OR_UNION_TYPE_P (t) && odr_type_p (t)
+  && TYPE_CXX_ODR_P (t) && !odr_type_violation_reported_p (t))
 {
   /* Anonymous namespace types never conflict with non-C++ types.  */
   if (type_w

Re: [PATCH 1/5] testsuite: Fix vect/vect-sdiv-pow2-1.c

2020-11-17 Thread Richard Biener via Gcc-patches
On Tue, Nov 17, 2020 at 12:24 PM Richard Sandiford via Gcc-patches
 wrote:
>
> We're now able to vectorise the set-up loop:
>
>   int p = power2 (fns[i].po2);
>   for (int j = 0; j < N; j++)
> a[j] = ((p << 4) * j) / (N - 1) - (p << 5);
>
> Rather than adjust the expected output for that, it seemed better
> to disable optimisation for the testing code.
>
> Tested on aarch64-linux-gnu (with and without SVE), arm-linux-gnueabihf
> and x86_64-linux-gnu.  OK to install?

In other places we just add an asm ("" : : : "memory") to the loop body, can you
do it like that?
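
A minimal sketch of that idiom, applied to the set-up loop quoted above.  The
function name init_input is hypothetical and the loop is lifted out of main
only so that the example stands alone; the asm statement is the empty one
with a "memory" clobber mentioned in the suggestion.

```c
#define N 50

/* Sketch only: the set-up loop from the test, with an empty asm that
   clobbers memory in the loop body.  The asm emits no instructions but
   acts as a barrier the compiler cannot see through, which is the usual
   way of keeping such a loop from being vectorized.  */
static void
init_input (int *a, int p)
{
  for (int j = 0; j < N; j++)
    {
      a[j] = ((p << 4) * j) / (N - 1) - (p << 5);
      __asm__ volatile ("" : : : "memory");
    }
}
```

Unlike the optimize (0) attribute, this leaves the rest of main compiled at
the test's normal optimization level.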

Thanks,
Richard.

> Richard
>
>
> gcc/testsuite/
> * gcc.dg/vect/vect-sdiv-pow2-1.c (main): Disable optimization.
> ---
>  gcc/testsuite/gcc.dg/vect/vect-sdiv-pow2-1.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-sdiv-pow2-1.c 
> b/gcc/testsuite/gcc.dg/vect/vect-sdiv-pow2-1.c
> index be70bc6c47e..bf387133d01 100644
> --- a/gcc/testsuite/gcc.dg/vect/vect-sdiv-pow2-1.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-sdiv-pow2-1.c
> @@ -53,7 +53,7 @@ power2 (int x)
>
>  #define N 50
>
> -int
> +int __attribute__ ((optimize (0)))
>  main (void)
>  {
>int a[N], b[N], c[N];


Re: [PATCH 2/5] testsuite: Add a vect_partial_vectors_usage_2 guard

2020-11-17 Thread Richard Biener via Gcc-patches
On Tue, Nov 17, 2020 at 12:24 PM Richard Sandiford via Gcc-patches
 wrote:
>
> We don't need an epilogue loop if the main loop can operate on
> partial vectors, so this patch disables an associated test.
> The alternative would be to force partial-vectors-usage=1
> on the command line.
>
> Tested on aarch64-linux-gnu (with and without SVE), arm-linux-gnueabihf
> and x86_64-linux-gnu.  OK to install?

OK

> Richard
>
>
> gcc/testsuite/
> * gcc.dg/vect/vect-epilogues.c: XFAIL test for epilogue loop
> vectorization if vect_partial_vectors_usage_2.
> ---
>  gcc/testsuite/gcc.dg/vect/vect-epilogues.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-epilogues.c 
> b/gcc/testsuite/gcc.dg/vect/vect-epilogues.c
> index a146bb6518a..ab7e8a1a759 100644
> --- a/gcc/testsuite/gcc.dg/vect/vect-epilogues.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-epilogues.c
> @@ -16,4 +16,4 @@ void pixel_avg( unsigned char *dst, int i_dst_stride,
>   }
>   }
>
> -/* { dg-final { scan-tree-dump "LOOP EPILOGUE VECTORIZED" "vect" { target 
> vect_multiple_sizes xfail { arm32 && be } } } }  */
> +/* { dg-final { scan-tree-dump "LOOP EPILOGUE VECTORIZED" "vect" { target 
> vect_multiple_sizes xfail { { arm32 && be } || vect_partial_vectors_usage_2 } 
> } } } */


Re: [PATCH 3/5] testsuite: Add vect_perm3_int guards

2020-11-17 Thread Richard Biener via Gcc-patches
On Tue, Nov 17, 2020 at 12:25 PM Richard Sandiford via Gcc-patches
 wrote:
>
> SLP vectorisation of gcc.dg/vect/fast-math-vect-call-1.c involves
> a group of 3 floats, which requires the same permutation as
> vect_perm3_int.
>
> The load/store_lanes XFAILs in gcc.dg/vect/slp-perm-6.c implicitly
> assumed vect_perm3_int, which is true for Advanced SIMD but not for
> VLA SVE.  Whether it's true for fixed-length SVE depends on the
> vector length.
>
> The xfail selector applies on top of the target selector, so it's
> not necessary to make the xfail selector a strict subset of the
> target selector.
>
> Tested on aarch64-linux-gnu (with and without SVE), arm-linux-gnueabihf
> and x86_64-linux-gnu.  OK to install?

OK

> Richard
>
>
> gcc/testsuite/
> * gcc.dg/vect/fast-math-vect-call-1.c: Only expect SLP to be used
> on vect_perm3_int targets.
> * gcc.dg/vect/slp-perm-6.c: Likewise.  Only XFAIL the LOAD/STORE_LANES
> tests on vect_perm3_int targets.
> ---
>  gcc/testsuite/gcc.dg/vect/fast-math-vect-call-1.c | 2 +-
>  gcc/testsuite/gcc.dg/vect/slp-perm-6.c| 8 
>  2 files changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/gcc/testsuite/gcc.dg/vect/fast-math-vect-call-1.c 
> b/gcc/testsuite/gcc.dg/vect/fast-math-vect-call-1.c
> index 877de4eb5be..495c0319c9d 100644
> --- a/gcc/testsuite/gcc.dg/vect/fast-math-vect-call-1.c
> +++ b/gcc/testsuite/gcc.dg/vect/fast-math-vect-call-1.c
> @@ -97,4 +97,4 @@ main ()
>  }
>
>  /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 4 "vect" { target 
> { vect_call_copysignf && vect_call_sqrtf } } } } */
> -/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 3 "vect" 
> { target { vect_call_copysignf && vect_call_sqrtf } } } } */
> +/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 3 "vect" 
> { target { { vect_call_copysignf && vect_call_sqrtf } && vect_perm3_int } } } 
> } */
> diff --git a/gcc/testsuite/gcc.dg/vect/slp-perm-6.c 
> b/gcc/testsuite/gcc.dg/vect/slp-perm-6.c
> index cc863de76bf..5f121b52ffb 100644
> --- a/gcc/testsuite/gcc.dg/vect/slp-perm-6.c
> +++ b/gcc/testsuite/gcc.dg/vect/slp-perm-6.c
> @@ -106,7 +106,7 @@ int main (int argc, const char* argv[])
>  /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" 
> { target { vect_perm3_int && { {! vect_load_lanes } && {! 
> vect_partial_vectors_usage_1 } } } } } } */
>  /* The epilogues are vectorized using partial vectors.  */
>  /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 4 "vect" 
> { target { vect_perm3_int && { {! vect_load_lanes } && 
> vect_partial_vectors_usage_1 } } } } } */
> -/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" 
> { target vect_load_lanes } } } */
> -/* { dg-final { scan-tree-dump "Built SLP cancelled: can use 
> load/store-lanes" "vect" { target { vect_perm3_int && vect_load_lanes } xfail 
> { vect_perm3_int && vect_load_lanes } } } } */
> -/* { dg-final { scan-tree-dump "LOAD_LANES" "vect" { target { 
> vect_load_lanes } xfail { vect_load_lanes } } } } */
> -/* { dg-final { scan-tree-dump "STORE_LANES" "vect" { target { 
> vect_load_lanes } xfail { vect_load_lanes } } } } */
> +/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" 
> { target { vect_perm3_int && vect_load_lanes } } } } */
> +/* { dg-final { scan-tree-dump "Built SLP cancelled: can use 
> load/store-lanes" "vect" { target { vect_perm3_int && vect_load_lanes } xfail 
> *-*-* } } } */
> +/* { dg-final { scan-tree-dump "LOAD_LANES" "vect" { target vect_load_lanes 
> xfail vect_perm3_int } } } */
> +/* { dg-final { scan-tree-dump "STORE_LANES" "vect" { target vect_load_lanes 
> xfail vect_perm3_int } } } */


Re: [PATCH 4/5] testsuite: Adjust gcc.dg/vect/slp-21.c for Arm targets

2020-11-17 Thread Richard Biener via Gcc-patches
On Tue, Nov 17, 2020 at 12:29 PM Richard Sandiford via Gcc-patches
 wrote:
>
> On arm* and aarch64* targets, we can vectorise the second of the main
> loops using SLP, not just the third.  As the comments say, whether this
> is supported depends on a very specific permutation, so it seemed better
> to use direct target selectors.
>
> Tested on aarch64-linux-gnu (with and without SVE), arm-linux-gnueabihf
> and x86_64-linux-gnu.  OK to install?

OK

> Richard
>
>
> gcc/testsuite/
> * gcc.dg/vect/slp-21.c: Expect 4 SLP instances to be vectorized
> on arm* and aarch64* targets.
> ---
>  gcc/testsuite/gcc.dg/vect/slp-21.c | 12 +++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/testsuite/gcc.dg/vect/slp-21.c 
> b/gcc/testsuite/gcc.dg/vect/slp-21.c
> index 1f8c82e8ba8..117d65c5ddb 100644
> --- a/gcc/testsuite/gcc.dg/vect/slp-21.c
> +++ b/gcc/testsuite/gcc.dg/vect/slp-21.c
> @@ -201,6 +201,16 @@ int main (void)
>
>  /* { dg-final { scan-tree-dump-times "vectorized 4 loops" 1 "vect"  { target 
> { vect_strided4 || vect_extract_even_odd } } } } */
>  /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  { target 
>  { ! { vect_strided4 || vect_extract_even_odd } } } } } */
> -/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" 
> { target vect_strided4 } } } */
> +/* Some targets can vectorize the second of the three main loops using
> +   hybrid SLP.  For 128-bit vectors, the required 4->3 permutations are:
> +
> +   { 0, 1, 2, 4, 5, 6, 8, 9 }
> +   { 2, 4, 5, 6, 8, 9, 10, 12 }
> +   { 5, 6, 8, 9, 10, 12, 13, 14 }
> +
> +   Not all vect_perm targets support that, and it's a bit too specific to 
> have
> +   its own effective-target selector, so we just test targets directly.  */
> +/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 4 "vect" 
> { target { aarch64*-*-* arm*-*-* } } } } */
> +/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 2 "vect" 
> { target { vect_strided4 && { ! { aarch64*-*-* arm*-*-* } } } } } } */
>  /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 0 "vect"  
> { target { ! { vect_strided4 } } } } } */
>


Re: [PATCH 5/5] testsuite: Adjust bb-slp-pr68892.c for AArch64

2020-11-17 Thread Richard Biener via Gcc-patches
On Tue, Nov 17, 2020 at 12:29 PM Richard Sandiford via Gcc-patches
 wrote:
>
> AArch64 passes the "not profitable" test because it treats vec_construct
> as having a high-enough cost.  This means that we can try other vector
> modes, which in turn causes "BB vectorization with gaps at the end of
> a load is not supported" to be printed more than once.  The number of
> times that we print the message doesn't seem important, so the patch
> converts it to a plain scan-tree-dump.
>
> Tested on aarch64-linux-gnu (with and without SVE), arm-linux-gnueabihf
> and x86_64-linux-gnu.  OK to install?

OK

> Richard
>
>
> gcc/testsuite/
> * gcc.dg/vect/bb-slp-pr68892.c: Don't XFAIL the profitability
> test for aarch64*-*-*.  Allow the "BB vectorization with gaps"
> message to be printed more than once.
> ---
>  gcc/testsuite/gcc.dg/vect/bb-slp-pr68892.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-pr68892.c 
> b/gcc/testsuite/gcc.dg/vect/bb-slp-pr68892.c
> index 8cd3a6a1274..e9909cf0dfa 100644
> --- a/gcc/testsuite/gcc.dg/vect/bb-slp-pr68892.c
> +++ b/gcc/testsuite/gcc.dg/vect/bb-slp-pr68892.c
> @@ -15,6 +15,6 @@ void foo(void)
>
>  /* ???  Due to the gaps we fall back to scalar loads which makes the
> vectorization profitable.  */
> -/* { dg-final { scan-tree-dump "not profitable" "slp2" { xfail *-*-* } } } */
> -/* { dg-final { scan-tree-dump-times "BB vectorization with gaps at the end 
> of a load is not supported" 1 "slp2" } } */
> -/* { dg-final { scan-tree-dump-times "Basic block will be vectorized" 1 
> "slp2" } } */
> +/* { dg-final { scan-tree-dump "not profitable" "slp2" { xfail { ! 
> aarch64*-*-* } } } } */
> +/* { dg-final { scan-tree-dump "BB vectorization with gaps at the end of a 
> load is not supported" "slp2" } } */
> +/* { dg-final { scan-tree-dump-times "Basic block will be vectorized" 1 
> "slp2" { xfail aarch64*-*-* } } } */


Re: [PATCH] Add MODE_OPAQUE

2020-11-17 Thread Richard Sandiford via Gcc-patches
acsaw...@linux.ibm.com writes:
> From: Aaron Sawdey 
>
> Richard,
>   Thanks for the review. I think I have resolved everything, as follows:
>
> * I was able to remove the const_tiny_rtx initialization for
> MODE_OPAQUE.  If that becomes a problem it's a pretty simple matter to
> use an UNSPEC to assign a constant to an opaque mode if necessary. The
> whole point of this exercise was not to have this thing treated as an
> integral type so I think it's best to leave this out if at all
> possible.

OK, sounds good.

> * I ended up adding a precision to opaque after I had put in that hack
> in get_nonzero_bits(). Now that it has a precision (equal to bitsize
> as you say) this is no longer needed. The underlying problem there was
> that without a precision, you ended up returning wi::shwi(-1,0) which
> did not get treated as -1.

OK.

> * I have documented OPAQUE_TYPE in generic.texi and MODE_OPAQUE in
> rtl.texi.
>
> OK for trunk if bootstrap/regtest passes on x86_64 and ppc64le?
>
> Thanks,
> Aaron
>
> gcc/ChangeLog
>   PR target/96791
>   * mode-classes.def: Add MODE_OPAQUE.
>   * machmode.def: Add OPAQUE_MODE.
>   * tree.def: Add OPAQUE_TYPE for types that will use
>   MODE_OPAQUE.
>   * doc/generic.texi: Document OPAQUE_TYPE.
>   * doc/rtl.texi: Document MODE_OPAQUE.
>   * machmode.h: Add OPAQUE_MODE_P().
>   * genmodes.c (complete_mode): Add MODE_OPAQUE.
>   (opaque_mode): New function.
>   * tree.c (tree_code_size): Add OPAQUE_TYPE.
>   * tree.h: Add OPAQUE_TYPE_P().
>   * stor-layout.c (int_mode_for_mode): Treat MODE_OPAQUE modes
>   like BLKmode.
>   * ira.c (find_moveable_pseudos): Treat MODE_OPAQUE modes more
>   like integer/float modes here.
>   * dbxout.c (dbxout_type): Treat OPAQUE_TYPE like VOID_TYPE.
>   * tree-pretty-print.c (dump_generic_node): Treat OPAQUE_TYPE
>   like other types.
> ---
>  gcc/dbxout.c|  1 +
>  gcc/doc/generic.texi|  8 
>  gcc/doc/rtl.texi|  6 ++
>  gcc/genmodes.c  | 22 ++
>  gcc/ira.c   |  4 +++-
>  gcc/machmode.def|  3 +++
>  gcc/machmode.h  |  4 
>  gcc/mode-classes.def|  3 ++-
>  gcc/stor-layout.c   |  3 +++
>  gcc/tree-pretty-print.c |  1 +
>  gcc/tree.c  |  1 +
>  gcc/tree.def|  6 ++
>  gcc/tree.h  |  3 +++
>  13 files changed, 63 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/dbxout.c b/gcc/dbxout.c
> index 5a20fdecdcc..eaee2f19ce0 100644
> --- a/gcc/dbxout.c
> +++ b/gcc/dbxout.c
> @@ -1963,6 +1963,7 @@ dbxout_type (tree type, int full)
>  case VOID_TYPE:
>  case NULLPTR_TYPE:
>  case LANG_TYPE:
> +case OPAQUE_TYPE:
>/* For a void type, just define it as itself; i.e., "5=5".
>This makes us consider it defined
>without saying what it is.  The debugger will make it
> diff --git a/gcc/doc/generic.texi b/gcc/doc/generic.texi
> index 7373266c69f..7e7b74c6c8b 100644
> --- a/gcc/doc/generic.texi
> +++ b/gcc/doc/generic.texi
> @@ -302,6 +302,7 @@ The elements are indexed from zero.
>  @tindex ARRAY_TYPE
>  @tindex RECORD_TYPE
>  @tindex UNION_TYPE
> +@tindex OPAQUE_TYPE
>  @tindex UNKNOWN_TYPE
>  @tindex OFFSET_TYPE
>  @findex TYPE_UNQUALIFIED
> @@ -487,6 +488,13 @@ assigned to that constant.  These constants will appear 
> in the order in
>  which they were declared.  The @code{TREE_TYPE} of each of these
>  constants will be the type of enumeration type itself.
>  
> +@item OPAQUE_TYPE
> +Used for things that use a @code{MODE_OPAQUE} mode class in the

Maybe s/use/have/?  Just a suggestion though -- it's ok either way.

> +backend. Opaque types have a size and precision, and can be held in
> +memory or registers. They are used when we do not want the compiler to
> +make assumptions about the availability of other operations as would
> +happen with integer types.
> +
>  @item BOOLEAN_TYPE
>  Used to represent the @code{bool} type.
>  
> diff --git a/gcc/doc/rtl.texi b/gcc/doc/rtl.texi
> index 22af5731bb6..cf892d425a2 100644
> --- a/gcc/doc/rtl.texi
> +++ b/gcc/doc/rtl.texi
> @@ -1406,6 +1406,12 @@ Pointer bounds modes.  Used to represent values of 
> pointer bounds type.
>  Operations in these modes may be executed as NOPs depending on hardware
>  features and environment setup.
>  
> +@findex MODE_OPAQUE
> +@item MODE_OPAQUE
> +This is a mode class for modes that don't want to provide operations
> +other than moves between registers/memory. They have a size and

How about “other than register moves, memory moves, loads, stores and
@code{unspec}s?”.  I don't think there's anything stopping us using
unspecs for these modes, and it wasn't clear to me which combinations
were included in “moves between registers/memory”.

OK with those changes if you agree.

Thanks,
Richard

> +precision and that's all.
> +
>  @findex MODE_RANDOM
>  @item MODE_RANDOM
>  This is a catchall mode class for modes which don't fi

Re: [PATCH 1/5] testsuite: Fix vect/vect-sdiv-pow2-1.c

2020-11-17 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches  writes:
> On Tue, Nov 17, 2020 at 12:24 PM Richard Sandiford via Gcc-patches
>  wrote:
>>
>> We're now able to vectorise the set-up loop:
>>
>>   int p = power2 (fns[i].po2);
>>   for (int j = 0; j < N; j++)
>> a[j] = ((p << 4) * j) / (N - 1) - (p << 5);
>>
>> Rather than adjust the expected output for that, it seemed better
>> to disable optimisation for the testing code.
>>
>> Tested on aarch64-linux-gnu (with and without SVE), arm-linux-gnueabihf
>> and x86_64-linux-gnu.  OK to install?
>
> In other places we just add an asm ("" : : : "memory") to the loop body; can
> you do it like that?

I wondered about that, but I don't think it's reliable long-term.
We could (perhaps rightly) decide that it's a win to vectorise the
rhs of a[j] even if the asm prevents us from doing a vector store.

Thanks,
Richard
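
For readers comparing the two options discussed above, here is a minimal C
sketch.  It is not taken from the testsuite, and the function names are made
up for illustration.  It writes the set-up loop once with the optimize (0)
attribute used in the patch and once with an empty asm carrying a "memory"
clobber.  Whether a given GCC still vectorises part of the second form is
exactly the open question; the sketch only shows that both forms compute the
same values.

```c
#include <assert.h>

#define N 50

/* Approach taken in the patch: ask GCC not to optimise this function,
   so the set-up loop stays scalar.  */
void __attribute__ ((optimize (0)))
fill_unoptimized (int *a, int p)
{
  for (int j = 0; j < N; j++)
    a[j] = ((p << 4) * j) / (N - 1) - (p << 5);
}

/* Alternative raised in review: an empty asm with a "memory" clobber
   in the loop body.  */
void
fill_with_barrier (int *a, int p)
{
  for (int j = 0; j < N; j++)
    {
      a[j] = ((p << 4) * j) / (N - 1) - (p << 5);
      __asm__ volatile ("" : : : "memory");
    }
}

/* Return 1 if both variants produce identical array contents.  */
int
fills_agree (int p)
{
  int a[N], b[N];

  /* Keep P small so the shifts and multiplications cannot overflow.  */
  assert (p >= 0 && p < 1024);
  fill_unoptimized (a, p);
  fill_with_barrier (b, p);
  for (int j = 0; j < N; j++)
    if (a[j] != b[j])
      return 0;
  return 1;
}
```

The attribute applies to the whole function, which is why it does not depend
on how clever the vectoriser becomes about the asm in the loop body.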

>
> Thanks,
> Richard.
>
>> Richard
>>
>>
>> gcc/testsuite/
>> * gcc.dg/vect/vect-sdiv-pow2-1.c (main): Disable optimization.
>> ---
>>  gcc/testsuite/gcc.dg/vect/vect-sdiv-pow2-1.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/gcc/testsuite/gcc.dg/vect/vect-sdiv-pow2-1.c 
>> b/gcc/testsuite/gcc.dg/vect/vect-sdiv-pow2-1.c
>> index be70bc6c47e..bf387133d01 100644
>> --- a/gcc/testsuite/gcc.dg/vect/vect-sdiv-pow2-1.c
>> +++ b/gcc/testsuite/gcc.dg/vect/vect-sdiv-pow2-1.c
>> @@ -53,7 +53,7 @@ power2 (int x)
>>
>>  #define N 50
>>
>> -int
>> +int __attribute__ ((optimize (0)))
>>  main (void)
>>  {
>>int a[N], b[N], c[N];


Re: Make ltrans type canonicals compatible with WPA ones

2020-11-17 Thread Richard Biener
On Tue, 17 Nov 2020, Jan Hubicka wrote:

> Hi,
> this patch fixes profiledbootstrap failure with LTO enabled.
> What happens is that alias_ptr_types_compatible_p relies on the
> fact that alias sets are not refined from WPA to ltrans time:
> 
> /* This function originally abstracts from simply comparing
>get_deref_alias_set so that we are sure this still computes
>the same result after LTO type merging is applied.
>When in LTO type merging is done we can actually do this compare.
> */
>   if (in_lto_p)
> return get_deref_alias_set (t1) == get_deref_alias_set (t2);
>   else
> return (TYPE_MAIN_VARIANT (TREE_TYPE (t1))
> == TYPE_MAIN_VARIANT (TREE_TYPE (t2)));
> 
> This conditional is confused - it pessimizes code with -fno-lto
> for no good reason. I will fix that separately: we now have
> lto_streaming_expected_p so I think it should read
> 
>  if (!lto_streaming_expected_p () || flag_wpa)
>use alias sets
>  else
>use main variants as conservative estimate.
> 
> (so if we ever get idea to ICF during incremental link or deduplicate in
> early passes, things will work safely).  I will send separate patch on
> this.
> 
> 
> Not refining alias sets from WPA to ltrans time is a good invariant to
> maintain and the canonical type hash behaves this way.  However I broke
> this with the ODR logic.
> 
> Normally we define canonical types for C++ ODR types according to their
> type names.  However to make ODR types compatible with C types we check
> if structurally equivalent C type exists and if so, we ignore ODR
> names giving up on the precision.
> 
> This however is not stable between WPA and ltrans since at ltrans the
> type merging does not see as many types as WPA does.  To make this
> consistent the patch makes WPA ODR_TYPE_P == 0 for ODR types that
> conflicted with non-ODR type.
> 
> I had to drop one sanity check in ipa-utils.h (that I think is not very
> important - I added it while introducing CXX_ODR_P machinery) and also
> it now may happen that we query odr_based_tbaa_p before registering
> first ODR type so we do not want to ICE here.
> ODR type registration happens early to produce ODR violation warnings.
> Those are not done at ltrans, so dropping the registration is safe. The
> type will still be added while computing the type inheritance graph if
> needed for devirtualization (and late devirtualization is not very
> useful anyway since it won't enable inlining).
> 
> Bootstrapped, regtested x86_64-linux, OK?

OK.

> I think this should go to release branches after some soaking in
> mainline too, even if we don't have direct reproducers.
> 
> Note that Martin implemented type checker to sanity check that alias
> sets are never getting refined from compile to WPA and from WPA to
> ltrans.  I recovered the patch and will play with it more.
> I think we should eventually establish this (if alias sets are refined
> from compile time to WPA it is either wrong code issue or frontend alias
> sets are not as good as they should be), but of course there are fun
> issues.  My plan is to see if I can identify some wrong code bugs and
> leave rest for early next stage1.

So do we want to actually compute alias sets and stream them,
"freeing up" TYPE_CANONICAL again?  We're sort-of taking it away
from FEs which use it before and recompute it possibly in different
ways ...

Richard.

>   PR bootstrap/97857
>   * ipa-devirt.c (odr_based_tbaa_p): Do not ICE when
>   odr_hash is not initialized.
>   * ipa-utils.h (type_with_linkage_p): Do not sanity check
>   CXX_ODR_P.
>   * lto-common.c (gimple_register_canonical_type_1): Only
>   register types with TYPE_CXX_ODR_P flag; sanity check that no
>   conflict happens at ltrans time.
>   * tree-streamer-out.c (pack_ts_type_common_value_fields): Set
>   CXX_ODR_P according to the canonical type.
> diff --git a/gcc/ipa-devirt.c b/gcc/ipa-devirt.c
> index 067ed5ba073..6e6df0b2af5 100644
> --- a/gcc/ipa-devirt.c
> +++ b/gcc/ipa-devirt.c
> @@ -2032,6 +2032,8 @@ odr_based_tbaa_p (const_tree type)
>  {
>if (!RECORD_OR_UNION_TYPE_P (type))
>  return false;
> +  if (!odr_hash)
> +return false;
>odr_type t = get_odr_type (const_cast <tree> (type), false);
>if (!t || !t->tbaa_enabled)
>  return false;
> diff --git a/gcc/ipa-utils.h b/gcc/ipa-utils.h
> index 880e527c590..91571d8e82a 100644
> --- a/gcc/ipa-utils.h
> +++ b/gcc/ipa-utils.h
> @@ -211,8 +211,6 @@ type_with_linkage_p (const_tree t)
>if (!TYPE_CONTEXT (t))
>  return false;
>  
> -  gcc_checking_assert (TREE_CODE (t) == ENUMERAL_TYPE || TYPE_CXX_ODR_P (t));
> -
>return true;
>  }
>  
> diff --git a/gcc/lto/lto-common.c b/gcc/lto/lto-common.c
> index 6944c469f89..0a3033c3695 100644
> --- a/gcc/lto/lto-common.c
> +++ b/gcc/lto/lto-common.c
> @@ -415,8 +415,8 @@ gimple_register_canonical_type_1 (tree t, hashval_t hash)
>   that we can use to lookup structurally equivalent non-ODR type.
> 

Re: [PATCH] ada: c++: Get rid of libposix4, librt on Solaris

2020-11-17 Thread Jonathan Wakely via Gcc-patches

On 17/11/20 10:47 +0100, Rainer Orth wrote:

>I recently noticed that neither libposix4 nor librt are needed on
>Solaris 11 any longer:
>
>* libposix4 was renamed to librt in Solaris 7 back in 1998.
>
>* librt was folded into libc in the OpenSolaris timeframe, leaving librt
>  only as a filter on libc.  Thus, it's no longer needed on either
>  Solaris 11 or Illumos.
>
>The following patch removes both uses.  At the same time, Ada's use of
>libthread has gone: it was folded into libc in Solaris 10 already.
>TIME_LIBRARY and friends in g++ are likewise removed: Solaris was the
>only user.
>
>Bootstrapped without regressions on i386-pc-solaris2.11,
>sparc-sun-solaris2.11, and x86_64-pc-linux-gnu.
>
>Ok for master?
>
>There are two more uses of librt left:
>
>* On glibc targets before 2.17 it's needed for clock_gettime.  I've no
>  idea how long gcc is supposed to support such targets (glibc 2.17 was
>  released in December 2012).


RHEL 7 uses glibc 2.17, so it will still be in use for some time.
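
For context, the librt dependency discussed here comes from clock_gettime:
glibc moved it into libc proper in 2.17, so only older glibc needs -lrt to
link a program like the sketch below.  The helper name monotonic_ns is made
up for this illustration and is not part of the patch.

```c
/* Request the POSIX.1-2008 API unless the build already chose a level.  */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <time.h>

/* Return the monotonic clock reading in nanoseconds, or -1 on failure.
   On glibc older than 2.17 this needs -lrt at link time.  */
long long
monotonic_ns (void)
{
  struct timespec ts;

  if (clock_gettime (CLOCK_MONOTONIC, &ts) != 0)
    return -1;
  /* POSIX guarantees tv_nsec is within [0, 1e9).  */
  assert (ts.tv_nsec >= 0 && ts.tv_nsec < 1000000000L);
  return (long long) ts.tv_sec * 1000000000LL + ts.tv_nsec;
}
```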



>diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
>--- a/libstdc++-v3/acinclude.m4
>+++ b/libstdc++-v3/acinclude.m4
>@@ -1381,8 +1381,8 @@ dnl
>dnl --enable-libstdcxx-time
>dnl --enable-libstdcxx-time=yes
>dnlchecks for the availability of monotonic and realtime clocks,
>-dnlnanosleep and sched_yield in libc and libposix4 and, if needed,
>-dnllinks in the latter.
>+dnlnanosleep and sched_yield in libc and, if needed, links in the
>+dnllatter.


"The latter" was referring to libposix4, and we always link to libc,
so "if needed" doesn't apply to it.

So I think it should be:

 dnlchecks for the availability of monotonic and realtime clocks,
-dnlnanosleep and sched_yield in libc and libposix4 and, if needed,
-dnllinks in the latter.
+dnlnanosleep and sched_yield in libc.




>diff --git a/libstdc++-v3/doc/xml/manual/configure.xml
>b/libstdc++-v3/doc/xml/manual/configure.xml
>--- a/libstdc++-v3/doc/xml/manual/configure.xml
>+++ b/libstdc++-v3/doc/xml/manual/configure.xml
>@@ -171,7 +171,7 @@
>sched_yield functions, used in the
>implementation of [thread.thread.this] of the 2011 ISO C++ standard.
>The choice OPTION=yes checks for the availability of the facilities
>-   in libc and libposix4.  In case it's needed the latter is also linked
>+   in libc.  In case it's needed the latter is also linked
>to libstdc++ as part of the build process.  OPTION=rt also checks in


Similarly, the whole "In case it's needed the latter is also linked to
libstdc++ as part of the build process." sentence should be removed. It
only applied to libposix4.


>librt (and, if it's needed, links to it).  Note that linking to librt
>is not always desirable because for glibc it requires linking to


The libstdc++ part is OK with those adjustments. Thanks for doing
this, it's really helpful to trim these checks so the unnecessary
parts don't hang around indefinitely.




Re: V2 [PATCH] Use SHF_GNU_RETAIN to preserve symbol definitions

2020-11-17 Thread H.J. Lu via Gcc-patches
On Mon, Nov 16, 2020 at 7:59 PM Hans-Peter Nilsson  wrote:
>
> On Fri, 13 Nov 2020, H.J. Lu via Gcc-patches wrote:
> > Done.  Here is the updated patch.
>
> Hi.  I see a test-case for this kind of construct:
>
>  int foo __attribute__((__used__, __section__ (".bar"))) = 42;
>
> and IIUC that it's handled as I'd hope (setting "R" on the named
> section, not another derived section), good.
>
> Could you also add a test-case that the same construct
> *without* a specific initializer is handled the same way?
> I.e.:
>  int foo __attribute__((__used__, __section__ (".bar")));
>

Done.  The only changes are

/* { dg-final { scan-assembler ".data.used_bar_sec,\"awR\"" } } */
...
int __attribute__((used,section(".data.used_bar_sec"))) used_bar;

and 2 additional tests for -fcommon.
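
As a rough illustration of the two constructs under discussion (with and
without an initializer), a sketch along these lines could be used on an ELF
target.  The section names and the helper function are assumptions made for
this example, not the contents of the new tests, and whether the emitted
.section directive carries the "R" flag depends on the toolchain supporting
SHF_GNU_RETAIN.

```c
#include <assert.h>

/* Initialised variable placed in a named section and marked used.  */
int retained_init
  __attribute__ ((__used__, __section__ (".data.retain_init"))) = 42;

/* The same construct without an initializer; it has static storage
   duration, so it starts out as zero.  */
int retained_noinit
  __attribute__ ((__used__, __section__ (".data.retain_noinit")));

/* Read both variables so the example has observable behaviour.  */
int
sum_retained (void)
{
  assert (retained_noinit == 0);
  return retained_init + retained_noinit;
}
```

Compiling this with -S and looking for the section flags in the assembly is
the same kind of check the scan-assembler directive above performs.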

Thanks.

-- 
H.J.
From d19f2e2ec7f0f47121a2a4c05ffe20af8972c1bb Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Mon, 3 Feb 2020 11:55:43 -0800
Subject: [PATCH] Use SHF_GNU_RETAIN to preserve symbol definitions

In assembly code, the section flag 'R' sets the SHF_GNU_RETAIN flag to
indicate that the section must be preserved by the linker.

Add SECTION_RETAIN to indicate a section should be retained by the linker
and set SECTION_RETAIN on section for the preserved symbol if assembler
supports SHF_GNU_RETAIN.  All retained symbols are placed in separate
sections with

	.section .data.rel.local.preserved_symbol,"awR"
preserved_symbol:
...
	.section .data.rel.local,"aw"
not_preserved_symbol:
...

to avoid

	.section .data.rel.local,"awR"
preserved_symbol:
...
not_preserved_symbol:
...

which places not_preserved_symbol definition in the SHF_GNU_RETAIN
section.

gcc/

2020-11-XX  H.J. Lu  

	* configure.ac (HAVE_GAS_SHF_GNU_RETAIN): New.  Define 1 if
	the assembler supports marking sections with SHF_GNU_RETAIN flag.
	* output.h (SECTION_RETAIN): New.  Defined as 0x400.
	(SECTION_MACH_DEP): Changed from 0x400 to 0x800.
	(default_unique_section): Add a bool argument.
	* varasm.c (get_section): Set SECTION_RETAIN for the preserved
	symbol with HAVE_GAS_SHF_GNU_RETAIN.
	(resolve_unique_section): Use named section for the preserved
	symbol if assembler supports SHF_GNU_RETAIN.
	(get_variable_section): Handle the preserved common symbol with
	HAVE_GAS_SHF_GNU_RETAIN.
	(default_elf_asm_named_section): Require the full declaration and
	use the 'R' flag for SECTION_RETAIN.
	* config.in: Regenerated.
	* configure: Likewise.

gcc/testsuite/

2020-11-XX  H.J. Lu  
	Jozef Lawrynowicz  

	* c-c++-common/attr-used.c: Check the 'R' flag.
	* c-c++-common/attr-used-2.c: Likewise.
	* c-c++-common/attr-used-3.c: New test.
	* c-c++-common/attr-used-4.c: Likewise.
	* gcc.c-torture/compile/attr-used-retain-1.c: Likewise.
	* gcc.c-torture/compile/attr-used-retain-2.c: Likewise.
	* gcc.c-torture/compile/attr-used-retain-3.c: Likewise.
	* gcc.c-torture/compile/attr-used-retain-4.c: Likewise.
	* lib/target-supports.exp
	(check_effective_target_R_flag_in_section): New proc.
---
 gcc/config.in |  7 +++
 gcc/configure | 51 +++
 gcc/configure.ac  | 20 
 gcc/output.h  |  6 ++-
 gcc/testsuite/c-c++-common/attr-used-2.c  |  1 +
 gcc/testsuite/c-c++-common/attr-used-3.c  |  7 +++
 gcc/testsuite/c-c++-common/attr-used-4.c  |  7 +++
 gcc/testsuite/c-c++-common/attr-used.c|  1 +
 .../compile/attr-used-retain-1.c  | 37 ++
 .../compile/attr-used-retain-2.c  | 17 +++
 .../compile/attr-used-retain-3.c  | 17 +++
 .../compile/attr-used-retain-4.c  | 17 +++
 gcc/testsuite/lib/target-supports.exp | 40 +++
 gcc/varasm.c  | 17 +--
 14 files changed, 241 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/attr-used-3.c
 create mode 100644 gcc/testsuite/c-c++-common/attr-used-4.c
 create mode 100644 gcc/testsuite/gcc.c-torture/compile/attr-used-retain-1.c
 create mode 100644 gcc/testsuite/gcc.c-torture/compile/attr-used-retain-2.c
 create mode 100644 gcc/testsuite/gcc.c-torture/compile/attr-used-retain-3.c
 create mode 100644 gcc/testsuite/gcc.c-torture/compile/attr-used-retain-4.c

diff --git a/gcc/config.in b/gcc/config.in
index b7c3107bfe3..23ae2f9bc1b 100644
--- a/gcc/config.in
+++ b/gcc/config.in
@@ -1352,6 +1352,13 @@
 #endif
 
 
+/* Define 0/1 if your assembler supports marking sections with SHF_GNU_RETAIN
+   flag. */
+#ifndef USED_FOR_TARGET
+#undef HAVE_GAS_SHF_GNU_RETAIN
+#endif
+
+
 /* Define 0/1 if your assembler supports marking sections with SHF_MERGE flag.
*/
 #ifndef USED_FOR_TARGET
diff --git a/gcc/configure b/gcc/configure
index dbda4415a17..a925a6e5efb 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -24272,6 +24272,57 @@ cat >>confdefs.h <<_ACEOF
 _ACEOF
 
 
+# Test if the assembler supports the section flag 'R' for specifying
+# sec

Re: [PATCH] ada: c++: Get rid of libposix4, librt on Solaris

2020-11-17 Thread Rainer Orth
Hi Jonathan,

>>There are two more uses of librt left:
>>
>>* On glibc targets before 2.17 it's needed for clock_gettime.  I've no
>>  idea how long gcc is supposed to support such targets (glibc 2.17 was
>>  released in December 2012).
>
> RHEL 7 uses glibc 2.17, so it will still be in use for some time.

but at least the comments say < 2.17, so RHEL 7 wouldn't be affected.

>>diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
>>--- a/libstdc++-v3/acinclude.m4
>>+++ b/libstdc++-v3/acinclude.m4
>>@@ -1381,8 +1381,8 @@ dnl
>> dnl --enable-libstdcxx-time
>> dnl --enable-libstdcxx-time=yes
>> dnlchecks for the availability of monotonic and realtime clocks,
>>-dnlnanosleep and sched_yield in libc and libposix4 and, if needed,
>>-dnllinks in the latter.
>>+dnlnanosleep and sched_yield in libc and, if needed, links in the
>>+dnllatter.
>
> "The latter" was referring to libposix4, and we always link to libc,
> so "if needed" doesn't apply to it.
>
> So I think it should be:
>
>  dnlchecks for the availability of monotonic and realtime clocks,
> -dnlnanosleep and sched_yield in libc and libposix4 and, if needed,
> -dnllinks in the latter.
> +dnlnanosleep and sched_yield in libc.
>
>
>
>>diff --git a/libstdc++-v3/doc/xml/manual/configure.xml
>> b/libstdc++-v3/doc/xml/manual/configure.xml
>>--- a/libstdc++-v3/doc/xml/manual/configure.xml
>>+++ b/libstdc++-v3/doc/xml/manual/configure.xml
>>@@ -171,7 +171,7 @@
>>  sched_yield functions, used in the
>>  implementation of [thread.thread.this] of the 2011 ISO C++ standard.
>>  The choice OPTION=yes checks for the availability of the facilities
>>- in libc and libposix4.  In case it's needed the latter is also linked
>>+ in libc.  In case it's needed the latter is also linked
>>  to libstdc++ as part of the build process.  OPTION=rt also checks in
>
> Similarly, the whole "In case it's needed the latter is also linked to
> libstdc++ as part of the build process." sentence should be removed. It
> only applied to libposix4.

Good catch: I've been too mechanical in my updates.  Btw., can you take
care of regenerating the html files there?

>>  librt (and, if it's needed, links to it).  Note that linking to librt
>>  is not always desirable because for glibc it requires linking to
>
> The libstdc++ part is OK with those adjustments. Thanks for doing
> this, it's really helpful to trim these checks so the unnecessary
> parts don't hang around indefinitely.

My pleasure: they are easy enough to miss, unfortunately, since they are
seldom labeled with `for OS version X.Y' or some such.  E.g. we still
have a libexc test in gcc/configure.in, which was only added for Tru64
UNIX, I believe (unless Linux/alpha needs it, too).

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: Make ltrans type canonicals compatible with WPA ones

2020-11-17 Thread Jan Hubicka
Hi,
thanks!
> 
> So do we want to actually compute alias sets and stream them,
> "freeing up" TYPE_CANONICAL again?  We're sort-of taking it away

I am not sure what you mean by freeing up TYPE_CANONICAL again :) but I
was playing with the idea of streaming the alias sets from WPA to ltrans. It
may simplify things, especially if our canonical type logic gets more
complex... In particular I would eventually like to avoid pessimizing
C/C++ code just because we worry about compatibility with Fortran and
such, and I had some patches in this direction, but so far there were more
pressing issues with TBAA than this.

We do not really use TYPE_CANONICAL outside of tree-ssa-alias and
alias.c

Honza
> from FEs which use it before and recompute it possibly in different
> ways ...
> 
> Richard.


Re: [1/3][aarch64] Add aarch64 support for vec_widen_add, vec_widen_sub patterns

2020-11-17 Thread Richard Sandiford via Gcc-patches
Joel Hutton  writes:
> Tests are still running, but I believe I've addressed the comment.
>
>> There are ways in which we could reduce the amount of cut-&-paste here,
>> but I guess everything is a trade-off between clarity and compactness.
>> One extreme is to write them all out explicitly, another extreme would
>> be to have one define_expand and various iterators and attributes.
>>
>> I think the vec_widen_mult_*_ patterns strike a good balance:
>> the use ANY_EXTEND to hide the sign difference while still having
>> separate hi and lo patterns:
>
> Done
>
> gcc/ChangeLog:
>
> 2020-11-13  Joel Hutton  
>
> * config/aarch64/aarch64-simd.md: New patterns
>   vec_widen_saddl_lo/hi_.
>
> From c52fd11f5d471200c1292fad3bc04056e7721f06 Mon Sep 17 00:00:00 2001
> From: Joel Hutton 
> Date: Mon, 9 Nov 2020 15:35:57 +
> Subject: [PATCH 1/3] [aarch64] Add vec_widen patterns to aarch64
>
> Add widening add and subtract patterns to the aarch64
> backend. These allow taking vectors of N elements of size S
> and performing an add/subtract on the high or low half,
> widening the resulting elements and storing N/2 elements of size 2*S.
> These correspond to the addl,addl2,subl,subl2 instructions.
> ---
>  gcc/config/aarch64/aarch64-simd.md | 47 ++
>  1 file changed, 47 insertions(+)
>
> diff --git a/gcc/config/aarch64/aarch64-simd.md 
> b/gcc/config/aarch64/aarch64-simd.md
> index 2cf6fe9154a..30299610635 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -3382,6 +3382,53 @@
>[(set_attr "type" "neon__long")]
>  )
>  
> +(define_expand "vec_widen_addl_lo_"
> +  [(match_operand: 0 "register_operand")
> +   (ANY_EXTEND: (match_operand:VQW 1 "register_operand"))
> +   (ANY_EXTEND: (match_operand:VQW 2 "register_operand"))]
> +  "TARGET_SIMD"
> +{
> +  rtx p = aarch64_simd_vect_par_cnst_half (mode, , false);
> +  emit_insn (gen_aarch64_addl_lo_internal (operands[0], 
> operands[1],
> +   operands[2], p));

Nit: operands[2] should be indented three more columns now that “s” and
“u” have changed to “”.

OK with that change, thanks.

Richard
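
For reference, the operation described in the quoted commit message can be
modelled in scalar C roughly as follows.  This is only a sketch under stated
assumptions: signed 16-bit inputs, eight elements per vector, and "lo"
meaning the lower-numbered elements (which half that is in a register depends
on endianness).  The function names are invented for the example and are not
the pattern names.

```c
#include <assert.h>
#include <stdint.h>

#define N 8			/* Elements in each input vector.  */

/* Add the low halves of A and B, widening each element to 32 bits.  */
void
widen_add_lo (const int16_t *a, const int16_t *b, int32_t *out)
{
  assert (a && b && out);
  for (int i = 0; i < N / 2; i++)
    out[i] = (int32_t) a[i] + (int32_t) b[i];
}

/* Add the high halves of A and B, widening each element to 32 bits.  */
void
widen_add_hi (const int16_t *a, const int16_t *b, int32_t *out)
{
  assert (a && b && out);
  for (int i = 0; i < N / 2; i++)
    out[i] = (int32_t) a[i + N / 2] + (int32_t) b[i + N / 2];
}
```

Because the elements are widened before the addition, sums such as
32767 + 32767 do not wrap, which is the point of the widening forms.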


Re: [PATCH] libstdc++: Fix ranges::search_n for random access iterators [PR97828]

2020-11-17 Thread Jonathan Wakely via Gcc-patches

On 16/11/20 15:25 -0500, Patrick Palka via Libstdc++ wrote:

My ranges transcription of the std::search_n implementation for random
access iterators missed a crucial part of the algorithm which the tests
unfortunately didn't catch.  When __remainder is less than __count at
the start of an iteration of the outer while loop, it means we're
continuing a run of __count - __remainder identical elements from the
previous iteration in which the backwards scan got interrupted by the
predicate failing.  If the backwards scan gets interrupted again at this
next iteration, we need to reset __remainder so that it's offset only by
the size of the most recent run of identical elements, rather than by the
sum of all the (disjoint) runs.

This patch fixes this appropriately, mirroring how it's done in the
corresponding std::search_n implementation.

Tested on x86_64-pc-linux-gnu, does this look OK for trunk and the 10
branch?


OK, thanks.
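
To make the description above easier to follow, here is a small C sketch of
the same random-access strategy for plain int arrays.  It is an illustration
only, not the libstdc++ code: the name search_n_int, the index-based
interface and the variable names are assumptions for the example.  The
assignment to remainder after the backwards scan is the reset step the patch
description talks about.

```c
#include <assert.h>
#include <stddef.h>

/* Return the index of the first run of COUNT consecutive elements equal
   to VALUE in A[0..N), or N if there is no such run.  */
ptrdiff_t
search_n_int (const int *a, ptrdiff_t n, ptrdiff_t count, int value)
{
  assert (count > 0);

  ptrdiff_t first = 0;		/* One past the end of the candidate run.  */
  ptrdiff_t tail = n;		/* Elements at or after FIRST.  */
  ptrdiff_t remainder = count;	/* Matches still needed for this candidate.  */

  while (remainder <= tail)
    {
      first += remainder;
      tail -= remainder;

      /* Scan backwards from the end of the candidate run.  */
      ptrdiff_t back = first;
      while (a[--back] == value)
	if (--remainder == 0)
	  return first - count;	/* Success.  */

      /* The scan stopped at A[BACK].  Only the matches after BACK, i.e.
	 the most recent run, count towards the next candidate.  */
      remainder = count + 1 - (first - back);
    }
  return n;			/* Failure.  */
}
```

Without the reset, matches from several disjoint runs would be added
together, which is the failure mode described in the report.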



Re: [PATCH] ada: c++: Get rid of libposix4, librt on Solaris

2020-11-17 Thread Jonathan Wakely via Gcc-patches

On 17/11/20 14:25 +0100, Rainer Orth wrote:

>Hi Jonathan,
>
>>>There are two more uses of librt left:
>>>
>>>* On glibc targets before 2.17 it's needed for clock_gettime.  I've no
>>>  idea how long gcc is supposed to support such targets (glibc 2.17 was
>>>  released in December 2012).
>>
>>RHEL 7 uses glibc 2.17, so it will still be in use for some time.
>
>but at least the comments say < 2.17, so RHEL 7 wouldn't be affected.


Ah right, sorry, I read too quickly. Yes, < 2.17 probably isn't very
relevant now, although historically libstdc++ has not explicitly
dropped support for older glibc versions. If it builds, then it builds.

We could consider doing some housekeeping in that area, or just
documenting our requirements more carefully (for example, we now
require Linux kernel version 2.6.22 for the FUTEX_PRIVATE_FLAG but I
don't think we say that anywhere).


>>>diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
>>>--- a/libstdc++-v3/acinclude.m4
>>>+++ b/libstdc++-v3/acinclude.m4
>>>@@ -1381,8 +1381,8 @@ dnl
>>>dnl --enable-libstdcxx-time
>>>dnl --enable-libstdcxx-time=yes
>>>dnlchecks for the availability of monotonic and realtime clocks,
>>>-dnlnanosleep and sched_yield in libc and libposix4 and, if needed,
>>>-dnllinks in the latter.
>>>+dnlnanosleep and sched_yield in libc and, if needed, links in the
>>>+dnllatter.
>>
>>"The latter" was referring to libposix4, and we always link to libc,
>>so "if needed" doesn't apply to it.
>>
>>So I think it should be:
>>
>> dnlchecks for the availability of monotonic and realtime clocks,
>>-dnlnanosleep and sched_yield in libc and libposix4 and, if needed,
>>-dnllinks in the latter.
>>+dnlnanosleep and sched_yield in libc.
>>
>>>diff --git a/libstdc++-v3/doc/xml/manual/configure.xml
>>>b/libstdc++-v3/doc/xml/manual/configure.xml
>>>--- a/libstdc++-v3/doc/xml/manual/configure.xml
>>>+++ b/libstdc++-v3/doc/xml/manual/configure.xml
>>>@@ -171,7 +171,7 @@
>>>sched_yield functions, used in the
>>>implementation of [thread.thread.this] of the 2011 ISO C++ standard.
>>>The choice OPTION=yes checks for the availability of the facilities
>>>-   in libc and libposix4.  In case it's needed the latter is also linked
>>>+   in libc.  In case it's needed the latter is also linked
>>>to libstdc++ as part of the build process.  OPTION=rt also checks in
>>
>>Similarly, the whole "In case it's needed the latter is also linked to
>>libstdc++ as part of the build process." sentence should be removed. It
>>only applied to libposix4.
>
>Good catch: I've been too mechanical in my updates.  Btw., can you take
>care of regenerating the html files there?


Yes, no problem.


>>>librt (and, if it's needed, links to it).  Note that linking to librt
>>>is not always desirable because for glibc it requires linking to
>>
>>The libstdc++ part is OK with those adjustments. Thanks for doing
>>this, it's really helpful to trim these checks so the unnecessary
>>parts don't hang around indefinitely.
>
>My pleasure: they are easy enough to miss, unfortunately, since they are
>seldom labeled with `for OS version X.Y' or some such.  E.g. we still
>have a libexc test in gcc/configure.in, which was only added for Tru64
>UNIX, I believe (unless Linux/alpha needs it, too).


Do we even have anybody still using alpha?




Re: [2/3][vect] Add widening add, subtract vect patterns

2020-11-17 Thread Richard Sandiford via Gcc-patches
Richard Biener  writes:
> On Fri, 13 Nov 2020, Joel Hutton wrote:
>
>> Tests are still running, but I believe I've addressed all the comments.
>> 
>> > Like Richard said, the new patterns need to be documented in md.texi
>> > and the new tree codes need to be documented in generic.texi.
>> 
>> Done.
>> 
>> > While we're using tree codes, I think we need to make the naming
>> > consistent with other tree codes: WIDEN_PLUS_EXPR instead of
>> > WIDEN_ADD_EXPR and WIDEN_MINUS_EXPR instead of WIDEN_SUB_EXPR.
>> > Same idea for the VEC_* codes.
>> 
>> Fixed.
>> 
>> > > gcc/ChangeLog:
>> > >
>> > > 2020-11-12  Joel Hutton  
>> > >
>> > > * expr.c (expand_expr_real_2): add widen_add,widen_subtract cases
>> > 
>> > Not that I personally care about this stuff (would love to see changelogs
>> > go away :-)) but some nits:
>> > 
>> > Each description is supposed to start with a capital letter and end with
>> > a full stop (even if it's not a complete sentence).  Same for the rest
>> 
>> Fixed.
>> 
>> > > * optabs-tree.c (optab_for_tree_code): optabs for widening 
>> > > adds,subtracts
>> > 
>> > The line limit for changelogs is 80 characters.  The entry should say
>> > what changed, so "Handle ..." or "Add case for ..." or something.
>> 
>> Fixed.
>> 
>> > > * tree-vect-patterns.c (vect_recog_widen_add_pattern): New recog 
>> > > ptatern
>> > 
>> > typo: pattern
>> 
>> Fixed.
>> 
>> > > Add widening add, subtract patterns to tree-vect-patterns.
>> > > Add aarch64 tests for patterns.
>> > >
>> > > fix sad
>> > 
>> > Would be good to expand on this for the final commit message.
>> 
>> 'fix sad' was accidentally included when I squashed two commits. I've made 
>> all the commit messages more descriptive.
>> 
>> > > +
>> > > +case VEC_WIDEN_SUB_HI_EXPR:
>> > > +  return (TYPE_UNSIGNED (type)
>> > > +   ? vec_widen_usubl_hi_optab  : vec_widen_ssubl_hi_optab);
>> > > +
>> > > +
>> > 
>> > Nits: excess blank line at the end and excess space before the ":"s.
>> 
>> Fixed.
>> 
>> > > +OPTAB_D (vec_widen_usubl_lo_optab, "vec_widen_usubl_lo_$a")
>> > > +OPTAB_D (vec_widen_uaddl_hi_optab, "vec_widen_uaddl_hi_$a")
>> > > +OPTAB_D (vec_widen_uaddl_lo_optab, "vec_widen_uaddl_lo_$a")
>> > >  OPTAB_D (vec_widen_sshiftl_hi_optab, "vec_widen_sshiftl_hi_$a")
>> > >  OPTAB_D (vec_widen_sshiftl_lo_optab, "vec_widen_sshiftl_lo_$a")
>> > >  OPTAB_D (vec_widen_umult_even_optab, "vec_widen_umult_even_$a")
>> > 
>> > Looks like the current code groups signed stuff together and
>> > unsigned stuff together, so would be good to follow that.
>> 
>> Fixed.
>> 
>> > Same comments as the previous patch about having a "+nosve" pragma
>> > and about the scan-assembler-times lines.  Same for the sub test.
>> 
>> Fixed.
>> 
>> > I am missing documentation in md.texi for the new patterns.  In
>> > particular I wonder why you need signed and unsigned variants
>> > for the add/subtract patterns.
>> 
>> Fixed. Signed and unsigned variants because they correspond to signed and
>> unsigned instructions, (uaddl/uaddl2, saddl/saddl2).
>> 
>> > The new functions should have comments before them.  Can probably
>> > just use the vect_recog_widen_mult_pattern comment as a template.
>> 
>> Fixed.
>> 
>> > > +case VEC_WIDEN_SUB_HI_EXPR:
>> > > +case VEC_WIDEN_SUB_LO_EXPR:
>> > > +case VEC_WIDEN_ADD_HI_EXPR:
>> > > +case VEC_WIDEN_ADD_LO_EXPR:
>> > > +  return false;
>> > > +
>> >
>> > I think these should get the same validity checking as
>> > VEC_WIDEN_MULT_HI_EXPR etc.
>> 
>> Fixed.
>> 
>> > > --- a/gcc/tree-vect-patterns.c
>> > > +++ b/gcc/tree-vect-patterns.c
>> > > @@ -1086,8 +1086,10 @@ vect_recog_sad_pattern (vec_info *vinfo,
>> > >   of the above pattern.  */
>> > >
>> > >tree plus_oprnd0, plus_oprnd1;
>> > > -  if (!vect_reassociating_reduction_p (vinfo, stmt_vinfo, PLUS_EXPR,
>> > > -&plus_oprnd0, &plus_oprnd1))
>> > > +  if (!(vect_reassociating_reduction_p (vinfo, stmt_vinfo, PLUS_EXPR,
>> > > +&plus_oprnd0, &plus_oprnd1)
>> > > + || vect_reassociating_reduction_p (vinfo, stmt_vinfo, 
>> > > WIDEN_ADD_EXPR,
>> > > +&plus_oprnd0, &plus_oprnd1)))
>> > >  return NULL;
>> > >
>> > > tree sum_type = gimple_expr_type (last_stmt);
>> >
>> > I think we should make:
>> >
>> >   /* Any non-truncating sequence of conversions is OK here, since
>> >  with a successful match, the result of the ABS(U) is known to fit
>> >  within the nonnegative range of the result type.  (It cannot be the
>> >  negative of the minimum signed value due to the range of the widening
>> >  MINUS_EXPR.)  */
>> >   vect_unpromoted_value unprom_abs;
>> >   plus_oprnd0 = vect_look_through_possible_promotion (vinfo, plus_oprnd0,
>> >   &unprom_abs);
>> >
>> > specific to the PLUS_EXPR case.  If we look through promotions on
>> > the operands of a W
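
The reply above says signed and unsigned variants are needed because they correspond to different instructions (uaddl/uaddl2 versus saddl/saddl2). A minimal scalar sketch of why the two differ, assuming 8-bit lanes widened to 16 bits; widen_add_lo, unsigned_lane0 and signed_lane0 are hypothetical names for illustration and are not part of the patch:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of a widening add on the low half of a vector.
// Each narrow lane is extended to the wide type before the addition, so the
// kind of extension (zero or sign) is decided by the element type.
template <typename Narrow, typename Wide, std::size_t N>
std::array<Wide, N / 2>
widen_add_lo (const std::array<Narrow, N> &a, const std::array<Narrow, N> &b)
{
  std::array<Wide, N / 2> out{};
  for (std::size_t i = 0; i < N / 2; ++i)
    out[i] = static_cast<Wide> (static_cast<Wide> (a[i])
                                + static_cast<Wide> (b[i]));
  return out;
}

// Lane 0 holds the bit pattern 0xff, read as unsigned 255 (uaddl-like).
inline std::uint16_t
unsigned_lane0 ()
{
  std::array<std::uint8_t, 4> a{0xff, 1, 2, 3};
  std::array<std::uint8_t, 4> b{0x01, 1, 1, 1};
  return widen_add_lo<std::uint8_t, std::uint16_t> (a, b)[0];
}

// The same bit pattern 0xff, read as signed -1 (saddl-like).
inline std::int16_t
signed_lane0 ()
{
  std::array<std::int8_t, 4> a{-1, 1, 2, 3};
  std::array<std::int8_t, 4> b{1, 1, 1, 1};
  return widen_add_lo<std::int8_t, std::int16_t> (a, b)[0];
}
```

With identical input bits, the unsigned form yields 256 in lane 0 and the signed form yields 0, so a single sign-agnostic optab could not describe both instructions.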

Re: [3/3][aarch64] Add support for vec_widen_shift pattern

2020-11-17 Thread Richard Sandiford via Gcc-patches
Richard Biener  writes:
> On Fri, 13 Nov 2020, Joel Hutton wrote:
>
>> Tests are still running, but I believe I've addressed all the comments.
>> 
>> > > +#include 
>> > > +
>> > 
>> > SVE targets will need a:
>> > 
>> > #pragma GCC target "+nosve"
>> > 
>> > here, since we'll generate different code for SVE.
>> 
>> Fixed.
>> 
>> > > +/* { dg-final { scan-assembler-times "shll\t" 1} } */
>> > > +/* { dg-final { scan-assembler-times "shll2\t" 1} } */
>> > 
>> > Very minor nit, sorry, but I think:
>> > 
>> > /* { dg-final { scan-assembler-times {\tshll\t} 1 } } */
>> > 
>> > would be better.  Using "...\t" works, but IIRC it shows up as a tab
>> > character in the testsuite result summary too.
>> 
>> Fixed. Minor nits welcome. :)
>> 
>> 
>> > OK for the aarch64 bits with the testsuite changes above.
>> ok?
>
> The gcc/tree-vect-stmts.c parts are OK.

Same for the AArch64 stuff.

Thanks,
Richard


Re: [PATCH] ada: c++: Get rid of libposix4, librt on Solaris

2020-11-17 Thread Rainer Orth
Hi Jonathan,

There are two more uses of librt left:

* On glibc targets before 2.17 it's needed for clock_gettime.  I've no
  idea how long gcc is supposed to support such targets (glibc 2.17 was
  released in December 2012).
>>>
>>> RHEL 7 uses glibc 2.17, so it will still be in use for some time.
>>
>>but at least the comments say < 2.17, so RHEL 7 wouldn't be affected.
>
> Ah right, sorry, I read too quickly. Yes, < 2.17 probably isn't very
> relevant now, although historically libstdc++ has not explicitly
> dropped support for older glibc versions. If it builds, then it builds.
>
> We could consider doing some housekeeping in that area, or just
> documenting our requirements more carefully (for example, we now
> require Linux kernel version 2.6.22 for the FUTEX_PRIVATE_FLAG but I
> don't think we say that anywhere).

that would certainly help, if only to set user expectations.  When
e.g. I tried to get any info from the LLVM community which macOS
versions were supposed to be still supported, it was like pulling teeth
and in the end got me nothing.  Not a particularly pleasant experience.
We should be able to do better than that.

>>> The libstdc++ part is OK with those adjustments. Thanks for doing
>>> this, it's really helpful to trim these checks so the unnecessary
>>> parts don't hang around indefinitely.
>>
>>My pleasure: they are easy enough to miss, unfortunately, since they are
>>seldom labeled with `for OS version X.Y' or some such.  E.g. we still
>>have a libexc test in gcc/configure.in, which was only added for Tru64
>>UNIX, I believe (unless Linux/alpha needs it, too).
>
> Do we even have anybody still using alpha?

I have no idea: I donated my last alpha systems to some sort of computer
museum years ago once I had removed Tru64 UNIX support.  There are always
some enthusiasts around, some of which clamour for keeping their pet
target around a bit longer ;-)

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


[PATCH] x86: Add a testcase for PR target/31799

2020-11-17 Thread H.J. Lu via Gcc-patches
Add a testcase for PR target/31799 which was fixed by

commit 4f0473fe89e68bf7c09542ee5c3684da25a5b435
Author: Uros Bizjak 
Date:   Fri May 12 21:04:05 2017 +0200

compare-elim.c (try_eliminate_compare): Canonicalize operation with 
embedded compare to [(set (reg:CCM) (compare:CCM...

* compare-elim.c (try_eliminate_compare): Canonicalize
operation with embedded compare to
[(set (reg:CCM) (compare:CCM (operation) (immediate)))
 (set (reg) (operation)].

* config/i386/i386.c (TARGET_FLAGS_REGNUM): New define.

in GCC 8.

PR target/31799
* gcc.target/i386/pr31799.c: New test.
---
 gcc/testsuite/gcc.target/i386/pr31799.c | 12 
 1 file changed, 12 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr31799.c

diff --git a/gcc/testsuite/gcc.target/i386/pr31799.c 
b/gcc/testsuite/gcc.target/i386/pr31799.c
new file mode 100644
index 000..c72c4eab986
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr31799.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+void
+foo (int x, int *y, int *z)
+{
+  *z = ++x;
+  if (x != 0)
+    *y = 1;
+}
+
+/* { dg-final { scan-assembler-not "test" } } */
-- 
2.28.0



langhooks: preprocessor hooks for c++ modules

2020-11-17 Thread Nathan Sidwell


This is a slightly modified version of 01-langhooks.def.  I realized I
didn't need the deferred macro langhook -- that can be directly
installed into the preprocessor callbacks via preprocess_options lang
hook.

gcc/
* langhooks-def.h (LANG_HOOKS_PREPROCESS_MAIN_FILE)
(LANG_HOOKS_PREPROCESS_OPTIONS, LANG_HOOKS_PREPROCESS_UNDEF)
(LANG_HOOKS_PREPROCESS_TOKEN): New.
(LANG_HOOKS_INITIALIZER): Add them.
* langhooks.h (struct lang_hooks): Add preprocess_main_file,
preprocess_options, preprocess_undef, preprocess_token hooks.  Add
enum PT_flags.
gcc/c-family.
* c-lex.c: #include "langhooks.h".
(cb_undef): Maybe call preprocess_undef lang hook.
* c-opts.c (c_common_post_options): Maybe call preprocess_options
lang hook.
(push_command_line_include): Maybe call preprocess_main_file lang
hook.
(cb_file_change): Likewise.
* c-ppoutput.c: #include "langhooks.h".
(scan_translation_unit): Maybe call preprocess_token lang hook.
(class do_streamer): New, derive from token_streamer.
(directives_only_cb): Data pointer is do_streamer, call
preprocess_token lang hook.
(scan_translation_unit_directives_only): Use do_streamer.
(print_line_1): Move src_line recording to after string output.
(cb_undef): Maybe call preprocess_undef lang hook.

pushing to trunk
--
Nathan Sidwell
diff --git i/gcc/c-family/c-lex.c w/gcc/c-family/c-lex.c
index 6cd3df7c96f..8dd1420d10d 100644
--- i/gcc/c-family/c-lex.c
+++ w/gcc/c-family/c-lex.c
@@ -28,7 +28,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "c-pragma.h"
 #include "debug.h"
 #include "file-prefix-map.h" /* remap_macro_filename()  */
-
+#include "langhooks.h"
 #include "attribs.h"
 
 /* We may keep statistics about how long which files took to compile.  */
@@ -274,9 +274,11 @@ cb_define (cpp_reader *pfile, location_t loc, cpp_hashnode *node)
 
 /* #undef callback for DWARF and DWARF2 debug info.  */
 static void
-cb_undef (cpp_reader * ARG_UNUSED (pfile), location_t loc,
-	  cpp_hashnode *node)
+cb_undef (cpp_reader *pfile, location_t loc, cpp_hashnode *node)
 {
+  if (lang_hooks.preprocess_undef)
+lang_hooks.preprocess_undef (pfile, loc, node);
+
   const struct line_map *map = linemap_lookup (line_table, loc);
   (*debug_hooks->undef) (SOURCE_LINE (linemap_check_ordinary (map), loc),
 			 (const char *) NODE_NAME (node));
diff --git i/gcc/c-family/c-opts.c w/gcc/c-family/c-opts.c
index 40e92229d8a..77844d7daf1 100644
--- i/gcc/c-family/c-opts.c
+++ w/gcc/c-family/c-opts.c
@@ -1106,6 +1106,8 @@ c_common_post_options (const char **pfilename)
   struct cpp_callbacks *cb = cpp_get_callbacks (parse_in);
   cb->file_change = cb_file_change;
   cb->dir_change = cb_dir_change;
+  if (lang_hooks.preprocess_options)
+lang_hooks.preprocess_options (parse_in);
   cpp_post_options (parse_in);
   init_global_opts_from_cpp (&global_options, cpp_get_options (parse_in));
 
@@ -1548,7 +1550,13 @@ push_command_line_include (void)
   cpp_opts->warn_unused_macros = cpp_warn_unused_macros;
   /* Restore the line map back to the main file.  */
   if (!cpp_opts->preprocessed)
-	cpp_change_file (parse_in, LC_RENAME, this_input_filename);
+	{
+	  cpp_change_file (parse_in, LC_RENAME, this_input_filename);
+	  if (lang_hooks.preprocess_main_file)
+	/* We're starting the main file.  Inform the FE of that.  */
+	lang_hooks.preprocess_main_file
+	  (parse_in, line_table, LINEMAPS_LAST_ORDINARY_MAP (line_table));
+	}
 
   /* Set this here so the client can change the option if it wishes,
 	 and after stacking the main file so we don't trace the main file.  */
@@ -1558,14 +1566,19 @@ push_command_line_include (void)
 
 /* File change callback.  Has to handle -include files.  */
 static void
-cb_file_change (cpp_reader * ARG_UNUSED (pfile),
-		const line_map_ordinary *new_map)
+cb_file_change (cpp_reader *reader, const line_map_ordinary *new_map)
 {
   if (flag_preprocess_only)
 pp_file_change (new_map);
   else
 fe_file_change (new_map);
 
+  if (new_map && cpp_opts->preprocessed
+  && lang_hooks.preprocess_main_file && MAIN_FILE_P (new_map)
+  && ORDINARY_MAP_STARTING_LINE_NUMBER (new_map))
+/* We're starting the main file.  Inform the FE of that.  */
+lang_hooks.preprocess_main_file (reader, line_table, new_map);
+
   if (new_map 
   && (new_map->reason == LC_ENTER || new_map->reason == LC_RENAME))
 {
diff --git i/gcc/c-family/c-ppoutput.c w/gcc/c-family/c-ppoutput.c
index 517de15d97c..e3e0e59fcc7 100644
--- i/gcc/c-family/c-ppoutput.c
+++ w/gcc/c-family/c-ppoutput.c
@@ -21,6 +21,7 @@
 #include "coretypes.h"
 #include "c-common.h"		/* For flags.  */
 #include "../libcpp/internal.h"
+#include "langhooks.h"
 #include "c-pragma.h"		/* For parse_in.  */
 #include "file-prefix-map.h"/* remap_macro_filename()  */
 
@@ -301,10 +302,15 @@ token_str
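
The ChangeLog entries that say "Maybe call ... lang hook" all follow one pattern: the hook is a function pointer that may be null, and the caller tests it before calling. A minimal sketch of that pattern, assuming a much reduced hook table; demo_hooks, record_undef, demo_cb_undef and demo_log are hypothetical and do not match the real struct lang_hooks signatures:

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a hook table.  The real struct lang_hooks has
// many more members and different signatures.
struct demo_hooks
{
  // Null means "this front end does not care about #undef".
  void (*preprocess_undef) (const std::string &name);
};

// Records what happened, so the call order is observable.
static std::vector<std::string> demo_log;

static void
record_undef (const std::string &name)
{
  demo_log.push_back ("hook:" + name);
}

// Mirrors the shape of the patched cb_undef: maybe call the hook, then
// carry on with the unconditional work.
static void
demo_cb_undef (const demo_hooks &hooks, const std::string &name)
{
  if (hooks.preprocess_undef)
    hooks.preprocess_undef (name);
  demo_log.push_back ("debug:" + name);
}
```

A front end that leaves the pointer null pays only for the test, which is presumably why the hooks can be added to the shared c-family code without affecting C.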

Re: Make ltrans type canonicals compatible with WPA ones

2020-11-17 Thread Richard Biener
On Tue, 17 Nov 2020, Jan Hubicka wrote:

> Hi,
> thanks!
> > 
> > So do we want to actually compute alias sets and stream them,
> > "freeing up" TYPE_CANONICAL again?  We're sort-of taking it away
> 
> I am not sure what you mean by freeing up TYPE_CANONICAL again :) but I
> was playing with idea of streaming the alias sets from WPA to ltrans. It
> may simplify things especially if our canonical type logic gets more
> complex... In particular I would eventually like to avoid pessimizing
> C/C++ code just because we worry about compatibility with Fortran and
> such and had some patches for this direction  but so far there were more
> pressing issues with TBAA than this.
> 
> We do not really use TYPE_CANONICAL outside of tree-ssa-alias and
> alias.c

GIMPLE type checking uses it heavily (aka useless_type_conversion_p).

Richard.

> Honza
> > from FEs which use it before and recompute it possibly in different
> > ways ...
> > 
> > Richard.
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imend


preprocessor: Fix profiled bootstrap warning [pr97858]

2020-11-17 Thread Nathan Sidwell


As Jakub points out, we only ever pass a single variadic parm (if at
all), so just an optional arg is fine.

PR 97858
libcpp/
* mkdeps.c (munge): Drop variadic args, we only ever use one.

pushing to trunk

--
Nathan Sidwell
diff --git i/libcpp/mkdeps.c w/libcpp/mkdeps.c
index ea5f060c380..a989ed355fa 100644
--- i/libcpp/mkdeps.c
+++ w/libcpp/mkdeps.c
@@ -105,23 +105,20 @@ public:
   unsigned short quote_lwm;
 };
 
-/* Apply Make quoting to STR, TRAIL etc.  Note that it's not possible
-   to quote all such characters - e.g. \n, %, *, ?, [, \ (in some
+/* Apply Make quoting to STR, TRAIL.  Note that it's not possible to
+   quote all such characters - e.g. \n, %, *, ?, [, \ (in some
contexts), and ~ are not properly handled.  It isn't possible to
get this right in any current version of Make.  (??? Still true?
Old comment referred to 3.76.1.)  */
 
 static const char *
-munge (const char *str, const char *trail = NULL, ...)
+munge (const char *str, const char *trail = nullptr)
 {
   static unsigned alloc;
   static char *buf;
   unsigned dst = 0;
-  va_list args;
-  if (trail)
-va_start (args, trail);
 
-  for (bool first = true; str; first = false)
+  for (; str; str = trail, trail = nullptr)
 {
   unsigned slashes = 0;
   char c;
@@ -169,14 +166,7 @@ munge (const char *str, const char *trail = NULL, ...)
 
 	  buf[dst++] = c;
 	}
-
-  if (first)
-	str = trail;
-  else
-	str = va_arg (args, const char *);
 }
-  if (trail)
-va_end (args);
 
   buf[dst] = 0;
   return buf;
@@ -332,7 +322,7 @@ make_write_name (const char *name, FILE *fp, unsigned col, unsigned colmax,
 		 bool quote = true, const char *trail = NULL)
 {
   if (quote)
-name = munge (name, trail, NULL);
+name = munge (name, trail);
   unsigned size = strlen (name);
 
   if (col)
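
The loop header in the patch, `for (; str; str = trail, trail = nullptr)`, replaces the va_list walk with at most two iterations: one for STR and one for the optional TRAIL. A minimal sketch of that control flow, assuming a much simplified quoting rule (only spaces are escaped); demo_munge is a hypothetical name and returns a std::string rather than the static buffer the real munge uses:

```cpp
#include <string>

// Hypothetical simplification: concatenates STR and the optional TRAIL,
// escaping each space with a backslash.  The real munge also handles
// other characters and runs of backslashes.
static std::string
demo_munge (const char *str, const char *trail = nullptr)
{
  std::string out;
  // The first pass handles STR.  The step expression then moves TRAIL into
  // STR and clears TRAIL, so the loop body runs at most twice.
  for (; str; str = trail, trail = nullptr)
    for (const char *p = str; *p; ++p)
      {
        if (*p == ' ')
          out += '\\';
        out += *p;
      }
  return out;
}
```

Because the second argument has a default, callers that have no trailing string simply omit it, and no sentinel NULL is needed at the call site.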


Re: [PATCH 0/2] Improve MSP430 hardware multiply support

2020-11-17 Thread Jozef Lawrynowicz
On Mon, Nov 16, 2020 at 06:36:17PM -0700, Jeff Law via Gcc-patches wrote:
> 
> 
> On 11/15/20 2:14 PM, Jozef Lawrynowicz wrote:
> > The attached patch series improves MSP430 hardware multiply support, by
> > improving code generation when setting up the 16-bit and 32-bit hardware
> > multiply functions, and adding a 64-bit hardware multiply library
> > function for devices that have a 32-bit hardware multiplier.
> >
> > Successfully regtested GCC and G++ testsuites for:
> > msp430-sim
> > msp430-sim/-mcpu=msp430
> > msp430-sim/-mhwmult=f5series
> > 
> > msp430-sim/-mhwmult=f5series/-mlarge/-mdata-region=either/-mcode-region=either
> > msp430-sim/-mlarge
> > msp430-sim/-mlarge/-mdata-region=either/-mcode-region=either
> >
> > Additionally regtested GCC execute.exp for:
> > msp430-sim/-mhwmult=16bit
> > msp430-sim/-mhwmult=32bit
> > msp430-sim/-mhwmult=f5series
> > msp430-sim/-mhwmult=none
> > 
> > msp430-sim/-mlarge/-mcode-region=either/-mdata-region=either/-mhwmult=16bit
> > 
> > msp430-sim/-mlarge/-mcode-region=either/-mdata-region=either/-mhwmult=32bit
> > 
> > msp430-sim/-mlarge/-mcode-region=either/-mdata-region=either/-mhwmult=f5series
> > 
> > msp430-sim/-mlarge/-mcode-region=either/-mdata-region=either/-mhwmult=none
> >
> > Ok for trunk?
> >
> > Jozef Lawrynowicz (2):
> >   MSP430: Add mulhi, mulsi and {u,}mulsidi3  expanders
> >   MSP430: Add 64-bit hardware multiply support
> >
> >  gcc/config/msp430/msp430.md   | 61 ++--
> >  libgcc/config/msp430/lib2hw_mul.S | 77 +--
> >  libgcc/config/msp430/lib2mul.c| 52 +
> >  libgcc/config/msp430/t-msp430 |  5 ++
> >  4 files changed, 186 insertions(+), 9 deletions(-)
> Both are fine.

Thanks.

> 
> BTW, what would be a reasonable set of multilibs for automated testing?
> My tester has the ability to define them on a per-target basis, but I
> haven't tried to do that except for targets that I happen to know
> well.   So right now it's just using the default via
> -target_board=msp430-sim.    Figure we've probably got a time budget to
> add 3 multilibs without causing headaches.  What 3 might you suggest?

In addition to the default config, I would suggest:
  msp430-sim/-mcpu=msp430
Test the 430 ISA
  msp430-sim/-mlarge/-mcode-region=either
Test the large memory model with data assumed to be in the lower
memory region (default, reduces code size penalty of using -mlarge),
whilst shuffling code between the upper and lower memory regions to
make the program fit.
  msp430-sim/-mlarge/-mdata-region=either/-mcode-region=either
   Test the large memory model, shuffling code and data between upper
   and lower memory regions.

I should really use -mlarge/-mcode-region=either, instead of just
-mlarge, as well. -mcode-region=either doesn't change code gen, just
allows the linker shuffling of text sections so more tests build and so
we get better test coverage.

With limited testing capacity, testing hwmult configs is not very useful
unless hwmult behavior is specifically changed. There are msp430
specific tests to verify the options basically work.

Thanks,
Jozef


Re: OpenACC 'kernels' testsuite failures

2020-11-17 Thread David Edelsohn via Gcc-patches
Hi, Thomas

The standard version of Tcl installed on AIX currently is Tcl 8.4.
I'll see if I can have a newer version on the side.

The patch resolves the "no such variable" error message.  (Great!
Thanks!)  I now see:

during GIMPLE pass: omplower

as an Excess error.  Any idea where that comes from and why it doesn't
appear on other targets?  Does that need to be captured and ignored?

Thanks, David

On Mon, Nov 16, 2020 at 11:46 AM Thomas Schwinge
 wrote:
>
> Hi David!
>
> While you were writing your email, I've also been busy:
>
> On 2020-11-16T11:32:46-0500, David Edelsohn  wrote:
> > On Mon, Nov 16, 2020 at 9:16 AM Thomas Schwinge  
> > wrote:
> >> On 2020-11-15T15:47:15-0500, David Edelsohn  wrote:
> >> > I am seeing a number of new failures on AIX related to the OpenACC
> >> > kernels patches.
> >> >
> >> > c-c++-common/goacc/kernels-decompose-1.c
> >> > c-c++-common/goacc/kernels-decompose-2.c
> >> > gfortran.dg/goacc/kernels-decompose-1.f95
> >> > gfortran.dg/goacc/kernels-decompose-2.f95
> >> > libgomp.oacc-c++/../libgomp.oacc-c-c++-common/kernels-decompose-1.c
> >> > libgomp.oacc-fortran/pr94358-1.f90
> >>
> >> I suppose what you're asking about is what appears in
> >>  reports as:
> >>
> >> ERROR: c-c++-common/goacc/kernels-decompose-1.c: can't read 
> >> "c_loop_i": no such variable for " dg-line 24 l_loop_i[incr c_loop_i] "
> >> UNRESOLVED: c-c++-common/goacc/kernels-decompose-1.c: can't read 
> >> "c_loop_i": no such variable for " dg-line 24 l_loop_i[incr c_loop_i] "
> >>
> >> Etc.
>
> In the mean time, I did remember that weeks ago, I had noticed this
> following "detail": on  we
> read that "Starting with the Tcl 8.5 release, the variable 'varName'
> passed to 'incr' may be unset, and in that case, it will be set to
> [...]".  Tcl 8.5 has been released in 2007.
>
> Per 'gcc/doc/install.texi' we require:
>
> @item DejaGnu 1.4.4
> @itemx Expect
> @itemx Tcl
>
> Necessary to run the GCC testsuite; [...]
>
> DejaGnu has been released in 2004 (so cannot have required Tcl 8.5
> released in 2007).
>
> >> However, per the reports posted there, these really only (!) appear to
> >> fail for your "Native configuration is powerpc-ibm-aix7.2.3.0" testing,
> >> strange.  Which versions of DejaGnu and Tcl are used?
> >
> > For my internal tester DejaGNU reports the following:
> >
> > Expect version is 5.42.1
> > Tcl version is 8.4
> > Framework version is 1.5.3
>
> There we go: you're on Tcl 8.4.  ;-D
>
> >> On  I see there are AIX
> >> systems gcc111, gcc119 -- maybe I'll have luck reproducing the issue
> >> there.
>
> On these, we've got:
>
> tschwinge@gcc111:[/home/tschwinge]/opt/freeware/bin/runtest --version
> WARNING: Couldn't find the global config file.
> Expect version is   5.45.4
> Tcl version is  8.6
> Framework version is1.4.4
>
> tschwinge@gcc119:[/home/tschwinge]/opt/freeware/bin/runtest --version
> WARNING: Couldn't find the global config file.
> Expect version is   5.44.1.15
> Tcl version is  8.5
> Framework version is1.5.3
>
> ..., so can't (easily) be used to reproduce the issue.  (... but it
> wouldn't be specific to AIX, anyway.)
>
> Before I spend time on building/verifying with Tcl 8.4: are you able to
> give the attached patch "[testsuite] Avoid Tcl 8.5-specific behavior" a
> try?
>
>
> Grüße
>  Thomas
>
>
> >> Admittedly, using Tcl code inside DejaGnu directives is not most common,
> >> but it does make sense conceptually (at least to me), and reportedly does
> >> work with a large number of DejaGnu/Tcl version combinations.
> >>
> >> > Looking at the testsuite logs I see:
> >> >
> >> > fatal error: GCC is not configured to support amdgcn-amdhsa as offload 
> >> > target
> >>
> >> That one's not actually related to the new OpenACC 'kernels' testcases:
> >> it's just the testsuite harness checking whether GCN offloading is
> >> configured.  (See  "GCC is not configured to
> >> support amdgcn-unknown-amdhsa as offload target"; this should one appear
> >> once per testsuite.)
> >>
> >> > I don't know why this is different from the other OpenACC tests.
> >>
> >> It's not.  At least not intentionally.
> >
> > I don't see any obvious difference in the style of the additional
> > options for the kernels testcases versus others, although it
> > specifically is using an option for that test.  I only see the "GCC is
> > not configured ... amdhsa" for those tests.
> >
> >>
> >> > How
> >> > should these tests be skipped or adjusted to not fail on other
> >> > systems?
> >>
> >> They are expected to work fine on all systems; they're not specific to
> >> actual code offloading.  So if something FAILs, we shall resolve it.
> >
> > Thanks, David
>
>
> -
> Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
> Registergericht München HRB 1

Re: [PATCH] libgcc: Add a weak stub for __sync_synchronize

2020-11-17 Thread Bernd Edlinger
On 11/17/20 1:44 PM, Richard Earnshaw (lists) wrote:
> On 03/11/2020 15:08, Bernd Edlinger wrote:
>> Hi,
>>
>> this fixes a problem with a missing symbol __sync_synchronize
>> which happens when newlib is used together with libstdc++ for
>> the non-threaded simulator target arm-none-eabi.
>>
>> There are several questions on stackoverflow about this issue.
>>
>> I would like to add a weak symbol for this target, since this
>> is only a default implementation and not meant to override a
>> possibly more sophisticated synchronization function from the
>> c-runtime.
>>
>>
>> Regression tested successfully on arm-none-eabi with newlib-3.3.0.
>>
>> Is it OK for trunk?
>>
>>
>> Thanks
>> Bernd.
>>
> 
> I seem to recall that this was a deliberate decision - you can't guess
> this correctly, at least when trying to build portable code - you just
> have to know which runtime you will be using.
> 

Therefore I suggest to use the weak attribute.  It is on purpose not
implementing all of the atomics.

The use case is a C++ program which initializes a local static variable.

$ cat test.cc
#include <string>
main(int argc, char **argv)
{
  static std::string x = "test";
  return 0;
}

compiles to this:
sub sp, sp, #20
str r0, [fp, #-24]
str r1, [fp, #-28]
ldr r3, .L14
ldrb r4, [r3]
bl  __sync_synchronize
and r3, r4, #255
and r3, r3, #1
cmp r3, #0
moveq   r3, #1
movne   r3, #0
and r3, r3, #255
cmp r3, #0
beq .L8
ldr r0, .L14
bl  __cxa_guard_acquire
mov r3, r0

so __sync_synchronize is not defined in newlib since the target (arm-sim)
is known to be not multi-threaded,
but __cxa_guard_acquire is also not a thread safe function,
because __GTHREADS is not defined by libgcc, since it is known
at configure time, that the target does not support threads.
So libstdc++ does not try to use a mutex or any atomics either,
because it is not compiled with __GTHREADS.

I can further narrow down the patch by only defining this function when
__GTHREADS is not defined, to make it more clear.
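
For readers unfamiliar with weak symbols, a minimal sketch of the mechanism being proposed, assuming the GCC/Clang `weak` attribute; demo_sync_synchronize and demo_barrier_calls are hypothetical names, whereas the patch itself defines __sync_synchronize in libgcc:

```cpp
// Counter that makes calls observable in this sketch only.
static int demo_barrier_calls = 0;

// GCC/Clang extension: a weak definition is used only when no other
// (strong) definition of the same symbol is linked in, so a C runtime
// that provides its own version takes precedence.
extern "C" __attribute__ ((weak)) void
demo_sync_synchronize (void)
{
  // A target with a single thread of execution has nothing to order
  // against, so the default does no synchronization work.
  ++demo_barrier_calls;
}
```

This only illustrates how the fallback is selected at link time; whether a do-nothing default is safe depends on the target really being single-threaded.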


> I think Ramana had some changes in the works at one point to address
> (some) of this, but I'm not sure what happened to them.  Ramana?
> 
> 
> +#if defined (__ARM_ARCH_6__) || defined (__ARM_ARCH_6J__)   \
> +|| defined (__ARM_ARCH_6K__) || defined (__ARM_ARCH_6T2__)  \
> +|| defined (__ARM_ARCH_6Z__) || defined (__ARM_ARCH_6ZK__)  \
> +|| defined (__ARM_ARCH_7__) || defined (__ARM_ARCH_7A__)
> +#if defined (__ARM_ARCH_7__) || defined (__ARM_ARCH_7A__)
> 
> Ug, no!  Use the ACLE macros to avoid this sort of mess.
> 

Ah, thanks, copy-paste from freebsd-atomic.c :)


I've attached the updated patch.
Is it OK?


Thanks
Bernd.
From ca44e1fcb4b991306cbcde6293d20c77ce74ad68 Mon Sep 17 00:00:00 2001
From: Bernd Edlinger 
Date: Mon, 2 Nov 2020 11:43:44 +0100
Subject: [PATCH] libgcc: Add a weak stub for __sync_synchronize

This patch adds a default implementation for __sync_synchronize,
which prevents many unresolved symbol errors on arm-none-eabi.
This happens often in C++ programs even without any threading.

libgcc:
2020-11-17  Bernd Edlinger  

	* config.host: Use t-eabi for arm-none-eabi.
	* config/arm/t-eabi: New.
	* config/arm/eabi-sync.c: New.
---
 libgcc/config.host|  2 +-
 libgcc/config/arm/eabi-sync.c | 38 ++
 libgcc/config/arm/t-eabi  |  1 +
 3 files changed, 40 insertions(+), 1 deletion(-)
 create mode 100644 libgcc/config/arm/eabi-sync.c
 create mode 100644 libgcc/config/arm/t-eabi

diff --git a/libgcc/config.host b/libgcc/config.host
index 66af834..eaf74f1 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -496,7 +496,7 @@ arm*-*-eabi* | arm*-*-symbianelf* | arm*-*-rtems*)
 	tm_file="$tm_file arm/bpabi-lib.h"
 	case ${host} in
 	arm*-*-eabi* | arm*-*-rtems*)
-	  tmake_file="${tmake_file} arm/t-bpabi t-crtfm"
+	  tmake_file="${tmake_file} arm/t-bpabi t-crtfm arm/t-eabi"
 	  extra_parts="crtbegin.o crtend.o crti.o crtn.o"
 	  ;;
 	arm*-*-symbianelf*)
diff --git a/libgcc/config/arm/eabi-sync.c b/libgcc/config/arm/eabi-sync.c
new file mode 100644
index 000..c37bacf
--- /dev/null
+++ b/libgcc/config/arm/eabi-sync.c
@@ -0,0 +1,38 @@
+/* Copyright (C) 2020 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, v

Re: (VAX) cc0 anyone? (was: [PATCH 0/2] Fixes for old version NetBSD targets)

2020-11-17 Thread Kamil Rytarowski
On 17.11.2020 04:49, Hans-Peter Nilsson wrote:
> On Sun, 15 Nov 2020, Maciej W. Rozycki wrote:
> 
>> Hi,
>>
>>  In the course of my recent VAX backend modernisation effort
> 
> Hi.  That reminds me that VAX is "still" a cc0 target.
> 
> Are you aware of anyone planning on that level of modernization?
> 
> More than a year ago, there was a major heads-up that all
> remaining cc0 targets would be "retired" in the next release.
> 
> Now, I'm thinking that was just an empty threat. 0:-)
> 
> brgds, H-P
> PS. not volunteering, sorry.
> 

The "VAX backend modernisation effort" means upgrading out of cc0.



signature.asc
Description: OpenPGP digital signature


Re: [PATCH] libgcc: Add a weak stub for __sync_synchronize

2020-11-17 Thread Richard Earnshaw (lists) via Gcc-patches
On 17/11/2020 15:18, Bernd Edlinger wrote:
> On 11/17/20 1:44 PM, Richard Earnshaw (lists) wrote:
>> On 03/11/2020 15:08, Bernd Edlinger wrote:
>>> Hi,
>>>
>>> this fixes a problem with a missing symbol __sync_synchronize
>>> which happens when newlib is used together with libstdc++ for
>>> the non-threaded simulator target arm-none-eabi.
>>>
>>> There are several questions on stackoverflow about this issue.
>>>
>>> I would like to add a weak symbol for this target, since this
>>> is only a default implementation and not meant to override a
>>> possibly more sophisticated synchronization function from the
>>> c-runtime.
>>>
>>>
>>> Regression tested successfully on arm-none-eabi with newlib-3.3.0.
>>>
>>> Is it OK for trunk?
>>>
>>>
>>> Thanks
>>> Bernd.
>>>
>>
>> I seem to recall that this was a deliberate decision - you can't guess
>> this correctly, at least when trying to build portable code - you just
>> have to know which runtime you will be using.
>>
> 
> Therefore I suggest to use the weak attribute.  It is on purpose not
> implementing all of the atomics.
> 
> The use case, is a C++ program which initializes a local static variable.
> 
> $ cat test.cc
> #include <string>
> main(int argc, char **argv)
> {
>   static std::string x = "test";
>   return 0;
> }
> 
> compiles to this:
> sub sp, sp, #20
> str r0, [fp, #-24]
> str r1, [fp, #-28]
> ldr r3, .L14
> ldrbr4, [r3]
> bl  __sync_synchronize
> and r3, r4, #255
> and r3, r3, #1
> cmp r3, #0
> moveq   r3, #1
> movne   r3, #0
> and r3, r3, #255
> cmp r3, #0
> beq .L8
> ldr r0, .L14
> bl  __cxa_guard_acquire
> mov r3, r0
> 
> so __sync_synchronize is not defined in newlib since the target (arm-sim)
> is known to be not multi-threaded,
> but __cxa_guard_acquire is also not a thread safe function,
> because __GTHREADS is not defined by libgcc, since it is known
> at configure time, that the target does not support threads.
> So libstdc++ does not try to use a mutex or any atomics either,
> because it is not compiled with __GTHREADS.
> 
> I can further narrow down the patch by only defining this function when
> __GTHREADS is not defined, to make it more clear.
> 
> 
>> I think Ramana had some changes in the works at one point to address
>> (some) of this, but I'm not sure what happened to them.  Ramana?
>>
>>
>> +#if defined (__ARM_ARCH_6__) || defined (__ARM_ARCH_6J__)   \
>> +|| defined (__ARM_ARCH_6K__) || defined (__ARM_ARCH_6T2__)  \
>> +|| defined (__ARM_ARCH_6Z__) || defined (__ARM_ARCH_6ZK__)  \
>> +|| defined (__ARM_ARCH_7__) || defined (__ARM_ARCH_7A__)
>> +#if defined (__ARM_ARCH_7__) || defined (__ARM_ARCH_7A__)
>>
>> Ug, no!  Use the ACLE macros to avoid this sort of mess.
>>
> 
> Ah, thanks, copy-paste from freebsd-atomic.c :)
> 
> 
> I've attached the updated patch.
> Is it OK?
> 
> 
> Thanks
> Bernd.
> 

libgcc is *still* the wrong place for this.  It belongs in the system
library (eg newlib, or glibc, or whatever), which knows about the system
it's running on.  (Sorry, I should have said this before, but I've
context-switched this out since it's been a long time since it came up).

This hack will just lead to silent code failure of the worst kind
(non-reproducible, racy) at runtime.

R.


Re: [PATCH] libgcc: Add a weak stub for __sync_synchronize

2020-11-17 Thread Christophe Lyon via Gcc-patches
On Tue, 17 Nov 2020 at 16:41, Richard Earnshaw (lists) via Gcc-patches
 wrote:
>
> On 17/11/2020 15:18, Bernd Edlinger wrote:
> > On 11/17/20 1:44 PM, Richard Earnshaw (lists) wrote:
> >> On 03/11/2020 15:08, Bernd Edlinger wrote:
> >>> Hi,
> >>>
> >>> this fixes a problem with a missing symbol __sync_synchronize
> >>> which happens when newlib is used together with libstdc++ for
> >>> the non-threaded simulator target arm-none-eabi.
> >>>
> >>> There are several questions on stackoverflow about this issue.
> >>>
> >>> I would like to add a weak symbol for this target, since this
> >>> is only a default implementation and not meant to override a
> >>> possibly more sophisticated synchronization function from the
> >>> c-runtime.
> >>>
> >>>
> >>> Regression tested successfully on arm-none-eabi with newlib-3.3.0.
> >>>
> >>> Is it OK for trunk?
> >>>
> >>>
> >>> Thanks
> >>> Bernd.
> >>>
> >>
> >> I seem to recall that this was a deliberate decision - you can't guess
> >> this correctly, at least when trying to build portable code - you just
> >> have to know which runtime you will be using.
> >>
> >
> > Therefore I suggest to use the weak attribute.  It is on purpose not
> > implementing all of the atomics.
> >
> > The use case is a C++ program that initializes a local static variable.
> >
> > $ cat test.cc
> > #include <string>
> > int main(int argc, char **argv)
> > {
> >   static std::string x = "test";
> >   return 0;
> > }
> >
> > compiles to this:
> > sub sp, sp, #20
> > str r0, [fp, #-24]
> > str r1, [fp, #-28]
> > ldr r3, .L14
> > ldrb r4, [r3]
> > bl  __sync_synchronize
> > and r3, r4, #255
> > and r3, r3, #1
> > cmp r3, #0
> > moveq   r3, #1
> > movne   r3, #0
> > and r3, r3, #255
> > cmp r3, #0
> > beq .L8
> > ldr r0, .L14
> > bl  __cxa_guard_acquire
> > mov r3, r0
> >
> > so __sync_synchronize is not defined in newlib since the target (arm-sim)
> > is known to be not multi-threaded,
> > but __cxa_guard_acquire is also not a thread safe function,
> > because __GTHREADS is not defined by libgcc, since it is known
> > at configure time, that the target does not support threads.
> > So libstdc++ does not try to use a mutex or any atomics either,
> > because it is not compiled with __GTHREADS.
> >
> > I can further narrow down the patch by only defining this function when
> > __GTHREADS is not defined, to make it more clear.
> >
> >
> >> I think Ramana had some changes in the works at one point to address
> >> (some) of this, but I'm not sure what happened to them.  Ramana?
> >>
> >>
> >> +#if defined (__ARM_ARCH_6__) || defined (__ARM_ARCH_6J__)   \
> >> +|| defined (__ARM_ARCH_6K__) || defined (__ARM_ARCH_6T2__)  \
> >> +|| defined (__ARM_ARCH_6Z__) || defined (__ARM_ARCH_6ZK__)  \
> >> +|| defined (__ARM_ARCH_7__) || defined (__ARM_ARCH_7A__)
> >> +#if defined (__ARM_ARCH_7__) || defined (__ARM_ARCH_7A__)
> >>
> >> Ug, no!  Use the ACLE macros to avoid this sort of mess.
> >>
> >
> > Ah, thanks, copy-paste from freebsd-atomic.c :)
> >
> >
> > I've attached the updated patch.
> > Is it OK?
> >
> >
> > Thanks
> > Bernd.
> >
>
> libgcc is *still* the wrong place for this.  It belongs in the system
> library (eg newlib, or glibc, or whatever), which knows about the system
> it's running on.  (Sorry, I should have said this before, but I've
> context-switched this out since it's been a long time since it came up).
>
> This hack will just lead to silent code failure of the worst kind
> (non-reproducible, racy) at runtime.
>

I haven't fully re-read the thread, but I think this is related to an
old discussion,
not very well archived in:
https://gcc.gnu.org/pipermail/gcc-patches/2016-November/462299.html

There's a pointer to a newlib patch from Ramana.

> R.
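
For readers following the thread, the kind of weak default being debated can be sketched roughly as below. This is not the patch attached to the thread (which is not reproduced here). The name demo_sync_synchronize and the call counter are hypothetical, chosen so the sketch does not clash with the real __sync_synchronize. The ACLE macro __ARM_ARCH selects a hardware barrier on Armv7 and later; everywhere else only a compiler barrier is emitted, which bakes in the single-threaded assumption that Richard objects to.

```c
/* Hypothetical stand-in for a weak default barrier.  A strong definition
   of the same symbol in the C runtime would override it at link time.  */
int demo_barrier_calls;  /* Illustration only: counts calls.  */

__attribute__ ((weak)) void
demo_sync_synchronize (void)
{
#if defined (__ARM_ARCH) && __ARM_ARCH >= 7
  /* Full data memory barrier on Armv7 and later.  */
  __asm__ __volatile__ ("dmb ish" : : : "memory");
#else
  /* Compiler barrier only; assumes a single-threaded target.  */
  __asm__ __volatile__ ("" : : : "memory");
#endif
  demo_barrier_calls++;
}
```

Because the definition is weak, a system library that knows the target can supply its own strong definition without a link conflict. Whether libgcc should provide such a default at all is the open question in the thread.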


Re: [PATCH v1 1/2] Simplify shifts wider than the bitwidth of types

2020-11-17 Thread Jeff Law via Gcc-patches



On 11/17/20 4:53 AM, Philipp Tomsich wrote:
> Jeff,
>
> On Tue, 17 Nov 2020 at 00:38, Jeff Law wrote:
>
>
> On 11/16/20 11:57 AM, Philipp Tomsich wrote:
> > From: Philipp Tomsich <p...@gnu.org>
> >
> > While most shifts wider than the bitwidth of a type will be
> caught by
> > other passes, it is possible that these show up for VRP.
> > Consider the following example:
> >   int func (int a, int b, int c)
> >   {
> >     return (a << ((b && c) - 1));
> >   }
> >
> > This adds simplify_using_ranges::simplify_lshift_using_ranges to
> > detect and rewrite such cases.  If the intersection of meaningful
> > shift amounts for the underlying type and the value-range computed
> > for the shift-amount (whether an integer constant or a variable) is
> > empty, the statement is replaced with the zero-constant of the same
> > precision as the result.
> >
> > gcc/ChangeLog:
> >
> >        * vr-values.h (simplify_using_ranges): Declare.
> >        * vr-values.c (simplify_lshift_using_ranges): New function.
> >        (simplify): Use simplify_lshift_using_ranges for LSHIFT_EXPR.
>
> Umm, isn't this a shift wider than the bitwidth undefined
> behavior?  We
> should be generating warnings for that, not trying to further optimize
> it :-)
>
>
> The shift is undefined behavior on the language level (for C) and a
> warning
> will be generated, if such a shift is encountered; additionally, the
> shift will be
> replaced with the value 0.
>
> However, in the above case, the shift is generated only in the middle end:
> At 136t.walloca, I still have:
>
>   # RANGE [-1, 0]
>   _1 = iftmp.1_2 + -1;
>   _6 = a_5(D) << _1;
>
> Whereas at 137t.pre, this is changed into:
>
> Found partial redundancy for expression {lshift_expr,a_5(D),_1} (0006)
> Inserted _9 = a_5(D) << -1;
>
>
> In other words, the change to VRP canonicalizes what a lshift_expr with an
> shift-amount outside of the type width means... it doesn't assume anything
> about the original language.
> Do we assume that a LSHIFT_EXPR has the same semantics as for a
> C-language shift-left? If so, then pre should not generate the LSHIFT_EXPR
> for _9... or we might even catch this later in path isolation (as
> undefined
> behavior, insert a __builtin_trap() and emit a warning)?
>
> Note that in his comment to patch 2/2, Jim has noted that user code for
> RISC-V may assume a truncation of the shift-operand...
What I'd suggest doing would be to leave the invalid shift count in the
IL in VRP, then extend the erroneous path isolation code to turn an
invalid shift into a trap (conditionally of course).

jeff
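
As an aside on the range test being discussed, here is a minimal sketch of the "empty intersection" check from the patch description. It assumes a shift-count range [lo, hi] with lo <= hi and a type of PREC bits. The helper names are hypothetical and this is not the code in vr-values.c.

```c
#include <stdbool.h>

/* Hypothetical helper: true when no value in [lo, hi] is a valid shift
   count for a type of PREC bits, i.e. the range misses [0, PREC - 1].  */
bool
shift_count_always_invalid (long lo, long hi, unsigned prec)
{
  return hi < 0 || lo > (long) prec - 1;
}

/* Hypothetical helper: true when at least one value in [lo, hi] is
   outside [0, PREC - 1].  */
bool
shift_count_may_be_invalid (long lo, long hi, unsigned prec)
{
  return lo < 0 || hi > (long) prec - 1;
}
```

With the range [-1, 0] from the dump above and a 32-bit int, the count may be invalid but is not always invalid, since 0 is a legal shift. The constant -1 that PRE inserts is always invalid, and that is the case the patch would rewrite and that path isolation could turn into a trap.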



[committed] libstdc++: Fix unconditional definition of __cpp_lib_span in <version> [PR 97869]

2020-11-17 Thread Jonathan Wakely via Gcc-patches
The <span> header is empty unless Concepts are supported, but <version>
defines the __cpp_lib_span feature test macro unconditionally. It should
be guarded by the same conditions as in <span>.

libstdc++-v3/ChangeLog:

PR libstdc++/97869
* include/precompiled/stdc++.h: Include <coroutine>.
* include/std/version (__cpp_lib_span): Check __cpp_lib_concepts
before defining.

Tested powerpc64le-linux. Committed to trunk.

This also needs to be backported to gcc-10.

commit ecf65330c11544ebf35e198087b4a42be089c620
Author: Jonathan Wakely 
Date:   Tue Nov 17 15:26:29 2020

libstdc++: Fix unconditional definition of __cpp_lib_span in <version> [PR 97869]

The <span> header is empty unless Concepts are supported, but <version>
defines the __cpp_lib_span feature test macro unconditionally. It should
be guarded by the same conditions as in <span>.

libstdc++-v3/ChangeLog:

PR libstdc++/97869
* include/precompiled/stdc++.h: Include <coroutine>.
* include/std/version (__cpp_lib_span): Check __cpp_lib_concepts
before defining.

diff --git a/libstdc++-v3/include/precompiled/stdc++.h 
b/libstdc++-v3/include/precompiled/stdc++.h
index 8899c323a281..a418c46288de 100644
--- a/libstdc++-v3/include/precompiled/stdc++.h
+++ b/libstdc++-v3/include/precompiled/stdc++.h
@@ -137,6 +137,9 @@
 #include 
 #include 
 #include 
+#if __cpp_impl_coroutine
+# include <coroutine>
+#endif
 #include 
 #include 
 #include 
diff --git a/libstdc++-v3/include/std/version b/libstdc++-v3/include/std/version
index 7f51ef3a6c4f..12455ad93146 100644
--- a/libstdc++-v3/include/std/version
+++ b/libstdc++-v3/include/std/version
@@ -226,7 +226,9 @@
 # define __cpp_lib_ranges 201911L
 #endif
 #define __cpp_lib_shift 201806L
-#define __cpp_lib_span 202002L
+#if __cpp_lib_concepts
+# define __cpp_lib_span 202002L
+#endif
 #define __cpp_lib_ssize 201902L
 #define __cpp_lib_starts_ends_with 201711L
 # if _GLIBCXX_USE_CXX11_ABI


preprocessor: module line maps

2020-11-17 Thread Nathan Sidwell


This patch adds LC_MODULE as a map kind, used to indicate a c++
module.  Unlike a regular source file, it only contains a single
location, and the source locations in that module are represented by
ordinary locations whose 'included_from' location is the module.

It also exposes some entry points that modules will use to create
blocks of line maps.

In the original posting, I'd missed the deletion of the
linemap_enter_macro from internal.h.  That's included here.

libcpp/
* include/line-map.h (enum lc_reason): Add LC_MODULE.
(MAP_MODULE_P): New.
(line_map_new_raw): Declare.
(linemap_enter_macro): Move declaration from internal.h
(linemap_module_loc, linemap_module_reparent)
(linemap_module_restore): Declare.
(linemap_lookup_macro_index): Declare.
* internal.h (linemap_enter_macro): Moved to line-map.h.
* linemap.c (line_map_new_raw): New, broken out of ...
(new_linemap): ... here.  Call it.
(LAST_SOURCE_LINE_LOCATION): New.
(linemap_module_loc, linemap_module_reparent)
(linemap_module_restore): New.
(linemap_lookup_macro_index): New, broken out of ...
(linemap_macro_map_lookup): ... here.  Call it.
(linemap_dump): Add module dump.

pushing to trunk
--
Nathan Sidwell
diff --git i/libcpp/include/line-map.h w/libcpp/include/line-map.h
index 44008be5c08..50b2e4ff91a 100644
--- i/libcpp/include/line-map.h
+++ w/libcpp/include/line-map.h
@@ -72,6 +72,7 @@ enum lc_reason
   LC_RENAME,		/* Other reason for name change.  */
   LC_RENAME_VERBATIM,	/* Likewise, but "" != stdin.  */
   LC_ENTER_MACRO,	/* Begin macro expansion.  */
+  LC_MODULE,		/* A (C++) Module.  */
   /* FIXME: add support for stringize and paste.  */
   LC_HWM /* High Water Mark.  */
 };
@@ -439,7 +440,8 @@ struct GTY((tag ("1"))) line_map_ordinary : public line_map {
 
   /* Location from whence this line map was included.  For regular
  #includes, this location will be the last location of a map.  For
- outermost file, this is 0.  */
+ outermost file, this is 0.  For modules it could be anywhere
+ within a map.  */
   location_t included_from;
 
   /* Size is 20 or 24 bytes, no padding  */
@@ -662,6 +664,15 @@ ORDINARY_MAP_IN_SYSTEM_HEADER_P (const line_map_ordinary *ord_map)
   return ord_map->sysp;
 }
 
+/* TRUE if this line map is for a module (not a source file).  */
+
+inline bool
+MAP_MODULE_P (const line_map *map)
+{
+  return (MAP_ORDINARY_P (map)
+	  && linemap_check_ordinary (map)->reason == LC_MODULE);
+}
+
 /* Get the filename of ordinary map MAP.  */
 
 inline const char *
@@ -1076,6 +1087,9 @@ extern void linemap_check_files_exited (class line_maps *);
 extern location_t linemap_line_start
 (class line_maps *set, linenum_type to_line,  unsigned int max_column_hint);
 
+/* Allocate a raw block of line maps, zero initialized.  */
+extern line_map *line_map_new_raw (line_maps *, bool, unsigned);
+
 /* Add a mapping of logical source line to physical source file and
line number. This function creates an "ordinary map", which is a
map that records locations of tokens that are not part of macro
@@ -1093,6 +1107,39 @@ extern const line_map *linemap_add
   (class line_maps *, enum lc_reason, unsigned int sysp,
const char *to_file, linenum_type to_line);
 
+/* Create a macro map.  A macro map encodes source locations of tokens
+   that are part of a macro replacement-list, at a macro expansion
+   point. See the extensive comments of struct line_map and struct
+   line_map_macro, in line-map.h.
+
+   This map shall be created when the macro is expanded. The map
+   encodes the source location of the expansion point of the macro as
+   well as the "original" source location of each token that is part
+   of the macro replacement-list. If a macro is defined but never
+   expanded, it has no macro map.  SET is the set of maps the macro
+   map should be part of.  MACRO_NODE is the macro which the new macro
+   map should encode source locations for.  EXPANSION is the location
+   of the expansion point of MACRO. For function-like macros
+   invocations, it's best to make it point to the closing parenthesis
+   of the macro, rather than the the location of the first character
+   of the macro.  NUM_TOKENS is the number of tokens that are part of
+   the replacement-list of MACRO.  */
+const line_map_macro *linemap_enter_macro (line_maps *, cpp_hashnode *,
+	   location_t, unsigned int);
+
+/* Create a source location for a module.  The creator must either do
+   this after the TU is tokenized, or deal with saving and restoring
+   map state.  */
+
+extern location_t linemap_module_loc
+  (line_maps *, location_t from, const char *name);
+extern void linemap_module_reparent
+  (line_maps *, location_t loc, location_t new_parent);
+
+/* Restore the linemap state such that the map at LWM-1 continues.  */
+extern void linemap_module_restore
+  (line_maps *, unsigned lwm);
+
 /* Given

preprocessor: new callbacks

2020-11-17 Thread Nathan Sidwell


These two callbacks are needed for C++ modules.  The first is for
handling macros from header-units.  These are resolved lazily.  The
second is for include-translation -- whether a #include gets turned
into a header-unit import.

libcpp/
* include/cpplib.h (struct cpp_callbacks): Add
user_deferred_macro & translate_include.

pushing to trunk

--
Nathan Sidwell
diff --git c/libcpp/include/cpplib.h w/libcpp/include/cpplib.h
index 8e398863cf6..81be6457951 100644
--- c/libcpp/include/cpplib.h
+++ w/libcpp/include/cpplib.h
@@ -680,6 +695,9 @@ struct cpp_callbacks
   /* Callback that can change a user lazy into normal macro.  */
   void (*user_lazy_macro) (cpp_reader *, cpp_macro *, unsigned);
 
+  /* Callback to handle deferred cpp_macros.  */
+  cpp_macro *(*user_deferred_macro) (cpp_reader *, location_t, cpp_hashnode *);
+
   /* Callback to parse SOURCE_DATE_EPOCH from environment.  */
   time_t (*get_source_date_epoch) (cpp_reader *);
 
@@ -698,6 +716,11 @@ struct cpp_callbacks
   /* Callback for filename remapping in __FILE__ and __BASE_FILE__ macro
  expansions.  */
   const char *(*remap_filename) (const char*);
+
+  /* Maybe translate a #include into something else.  Return a
+ cpp_buffer containing the translation if translating.  */
+  char *(*translate_include) (cpp_reader *, line_maps *, location_t,
+			  const char *path);
 };
 
 #ifdef VMS


Re: [patch] Fix build when source directory includes @ character

2020-11-17 Thread Jeff Law via Gcc-patches



On 11/17/20 1:22 AM, FX wrote:
>> OK.  You have commit privs, right?
> Yes, and I did commit after Richard’s OK: 
> https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=034db20e2ea8301b5dc251bf10a97ce1cf90655f
>
> … but I forgot to send an email saying I had, sorry.
No worries.  Thanks.
jeff



Re: [PATCH v1 1/2] Simplify shifts wider than the bitwidth of types

2020-11-17 Thread Philipp Tomsich via Gcc-patches
Jakub,

On Tue, 17 Nov 2020 at 16:56, Jeff Law  wrote:
>
>
>
> On 11/17/20 4:53 AM, Philipp Tomsich wrote:
> > Jeff,
> >
> > On Tue, 17 Nov 2020 at 00:38, Jeff Law wrote:
> >
> >
> > On 11/16/20 11:57 AM, Philipp Tomsich wrote:
> > > From: Philipp Tomsich <p...@gnu.org>
> > >
> > > While most shifts wider than the bitwidth of a type will be
> > caught by
> > > other passes, it is possible that these show up for VRP.
> > > Consider the following example:
> > >   int func (int a, int b, int c)
> > >   {
> > > return (a << ((b && c) - 1));
> > >   }
> > >
> > > This adds simplify_using_ranges::simplify_lshift_using_ranges to
> > > detect and rewrite such cases.  If the intersection of meaningful
> > > shift amounts for the underlying type and the value-range computed
> > > for the shift-amount (whether an integer constant or a variable) is
> > > empty, the statement is replaced with the zero-constant of the same
> > > precision as the result.
> > >
> > > gcc/ChangeLog:
> > >
> > >* vr-values.h (simplify_using_ranges): Declare.
> > >* vr-values.c (simplify_lshift_using_ranges): New function.
> > >(simplify): Use simplify_lshift_using_ranges for LSHIFT_EXPR.
> >
> > Umm, isn't this a shift wider than the bitwidth undefined
> > behavior?  We
> > should be generating warnings for that, not trying to further optimize
> > it :-)
> >
> >
> > The shift is undefined behavior on the language level (for C) and a
> > warning
> > will be generated, if such a shift is encountered; additionally, the
> > shift will be
> > replaced with the value 0.
> >
> > However, in the above case, the shift is generated only in the middle end:
> > At 136t.walloca, I still have:
> >
> >   # RANGE [-1, 0]
> >   _1 = iftmp.1_2 + -1;
> >   _6 = a_5(D) << _1;
> >
> > Whereas at 137t.pre, this is changed into:
> >
> > Found partial redundancy for expression {lshift_expr,a_5(D),_1} (0006)
> > Inserted _9 = a_5(D) << -1;
> >
> >
> > In other words, the change to VRP canonicalizes what a lshift_expr with an
> > shift-amount outside of the type width means... it doesn't assume anything
> > about the original language.
> > Do we assume that a LSHIFT_EXPR has the same semantics as for a
> > C-language shift-left? If so, then pre should not generate the LSHIFT_EXPR
> > for _9... or we might even catch this later in path isolation (as
> > undefined
> > behavior, insert a __builtin_trap() and emit a warning)?
> >
> > Note that in his comment to patch 2/2, Jim has noted that user code for
> > RISC-V may assume a truncation of the shift-operand...
> What I'd suggest doing would be to leave the invalid shift count in the
> IL in VRP, then extend the erroneous path isolation code to turn an
> invalid shift into a trap (conditionally of course).

You had originally suggested to add this to VRP ...
Given the various comments to this patch, do you still want any of
this in VRP or
would you rather see this only in path isolation?

Philipp.


Re: [PATCH v1 1/2] Simplify shifts wider than the bitwidth of types

2020-11-17 Thread Philipp Tomsich via Gcc-patches
Jeff,

On Tue, 17 Nov 2020 at 16:56, Jeff Law  wrote:
> > Note that in his comment to patch 2/2, Jim has noted that user code for
> > RISC-V may assume a truncation of the shift-operand...
> What I'd suggest doing would be to leave the invalid shift count in the
> IL in VRP, then extend the erroneous path isolation code to turn an
> invalid shift into a trap (conditionally of course).

As I remember, FORTRAN allows both LSHIFT(i, shift) or SHIFTL(i, shift) with
'shift' less than or equal to BITSIZE(i) ... this leaves i <<
BITSIZE(i) defined for
FORTRAN and undefined for C.

This seems to indicate that an LSHIFT_EXPR is intentionally not constrained
either to C language (or any other) semantics at this time.  To handle this in
path isolation, should we have different tree expressions for a
left-shift with C
semantics and one with FORTRAN semantics?

Philipp.


Re: [PATCH v1 1/2] Simplify shifts wider than the bitwidth of types

2020-11-17 Thread Jakub Jelinek via Gcc-patches
On Tue, Nov 17, 2020 at 05:29:57PM +0100, Philipp Tomsich wrote:
> > > In other words, the change to VRP canonicalizes what a lshift_expr with an
> > > shift-amount outside of the type width means... it doesn't assume anything
> > > about the original language.
> > > Do we assume that a LSHIFT_EXPR has the same semantics as for a
> > > C-language shift-left? If so, then pre should not generate the LSHIFT_EXPR
> > > for _9... or we might even catch this later in path isolation (as
> > > undefined
> > > behavior, insert a __builtin_trap() and emit a warning)?
> > >
> > > Note that in his comment to patch 2/2, Jim has noted that user code for
> > > RISC-V may assume a truncation of the shift-operand...
> > What I'd suggest doing would be to leave the invalid shift count in the
> > IL in VRP, then extend the erroneous path isolation code to turn an
> > invalid shift into a trap (conditionally of course).
> 
> You had originally suggested to add this to VRP ...
> Given the various comments to this patch, do you still want any of
> this in VRP or
> would you rather see this only in path isolation?

Well, I said if we want to do it at all, it should be done in VRP, because
there is not really a difference between ((int) x) << 32 and ((int) x) << y
for y in [32, 137] etc.
Otherwise it is the general question of what to do upon proven UB, and that
is a topic discussed several years during Cauldron that it would be nice to
have switches where users can choose what to do in that case,
__builtin_unreachable (), __builtin_trap (), ... and another thing is where
we should warn about it (tied e.g. to the __builtin_warning thing, because
we don't want these warnings for dead code).

So, e.g. if we had __builtin_warning (dunno where Martin S. is with that),
we could e.g. queue a __builtin_warning and add __builtin_unreachable (or
other possibilities), or e.g. during VRP just canonicalize proven always
out of bound shifts to shifts by an out of bound constant and let some later
pass warn and/or add __builtin_warning.

Jakub



Re: [PATCH v1 1/2] Simplify shifts wider than the bitwidth of types

2020-11-17 Thread Jeff Law via Gcc-patches



On 11/17/20 9:46 AM, Jakub Jelinek wrote:
> On Tue, Nov 17, 2020 at 05:29:57PM +0100, Philipp Tomsich wrote:
>>>> In other words, the change to VRP canonicalizes what a lshift_expr with an
>>>> shift-amount outside of the type width means... it doesn't assume anything
>>>> about the original language.
>>>> Do we assume that a LSHIFT_EXPR has the same semantics as for a
>>>> C-language shift-left? If so, then pre should not generate the LSHIFT_EXPR
>>>> for _9... or we might even catch this later in path isolation (as
>>>> undefined
>>>> behavior, insert a __builtin_trap() and emit a warning)?
>>>>
>>>> Note that in his comment to patch 2/2, Jim has noted that user code for
>>>> RISC-V may assume a truncation of the shift-operand...
>>> What I'd suggest doing would be to leave the invalid shift count in the
>>> IL in VRP, then extend the erroneous path isolation code to turn an
>>> invalid shift into a trap (conditionally of course).
>> You had originally suggested to add this to VRP ...
>> Given the various comments to this patch, do you still want any of
>> this in VRP or
>> would you rather see this only in path isolation?
> Well, I said if we want to do it at all, it should be done in VRP, because
> there is not really a difference between ((int) x) << 32 and ((int) x) << y
> for y in [32, 137] etc.
Right.  VRP is the right place to discover, but I'm not sure it's the
right place to clean up the mess though.

> Otherwise it is the general question of what to do upon proven UB, and that
> is a topic discussed several years during Cauldron that it would be nice to
> have switches where users can choose what to do in that case,
> __builtin_unreachable (), __builtin_trap (), ... and another thing is where
> we should warn about it (tight e.g. to the __builtin_warning thing, because
> we don't want these warnings for dead code).
>
> So, e.g. if we had __builtin_warning (dunno where Martin S. is with that),
> we could e.g. queue a __builtin_warning and add __builtin_unreachable (or
> other possibilities), or e.g. during VRP just canonicalize proven always
> out of bound shifts to shifts by an out of bound constant and let some later
> pass warn and/or add __builtin_warning.
So the idea is to start funneling this through the path isolation code
and handle the various strategies there.

__builtin_warning is on hold pending a rework to make it act more like a
debug statement than a builtin function call.  The latter impacts
various heuristics, which would mean that it could impact codegen which
would be highly undesirable.

jeff
>
>   Jakub



Re: [PATCH v1 1/2] Simplify shifts wider than the bitwidth of types

2020-11-17 Thread Jakub Jelinek via Gcc-patches
On Tue, Nov 17, 2020 at 09:54:46AM -0700, Jeff Law wrote:
> > So, e.g. if we had __builtin_warning (dunno where Martin S. is with that),
> > we could e.g. queue a __builtin_warning and add __builtin_unreachable (or
> > other possibilities), or e.g. during VRP just canonicalize proven always
> > out of bound shifts to shifts by an out of bound constant and let some later
> > pass warn and/or add __builtin_warning.
> So the idea is to start funneling this through the path isolation code
> and handle the various strategies there.

If the path isolation code would use the ranger for this, it wouldn't need
to be in VRP but could be anywhere, sure.

> __builtin_warning is on hold pending a rework to make it act more like a
> debug statement than a builtin function call.  The latter impacts
> various heuristics, which would mean that it could impact codegen which
> would be highly undesirable.

Agreed on that.

Jakub



[pushed] C++ : Remove an overzealous checking assert [PR97871]

2020-11-17 Thread Iain Sandoe

Hi,

This amends my commit from r11-4927 to remove an assert.

tested on x86_64-darwin,
pushed to master
Iain

---

It seems we accept __attribute__(()) without any diagnostic at present,
so my added checking assert fires for something like:

__attribute__ (()) int a;

Fixed by removing the assert; in the case that the user enters something
like:

__attribute__ (()) extern "C" int foo;

The diagnostic about attributes before linkage specs will fire and show
the empty attributes.

gcc/cp/ChangeLog:

PR c++/97871
* parser.c (cp_parser_declaration): Remove checking assert.
---
 gcc/cp/parser.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 42f705266bb..b7ef259b048 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -13536,7 +13536,6 @@ cp_parser_declaration (cp_parser* parser, tree prefix_attrs)

 {
   cp_lexer_save_tokens (parser->lexer);
   attributes = cp_parser_attributes_opt (parser);
-  gcc_checking_assert (attributes);
   cp_token *t1 = cp_lexer_peek_token (parser->lexer);
   cp_token *t2 = (t1->type == CPP_EOF
  ? t1 : cp_lexer_peek_nth_token (parser->lexer, 2));
--
2.24.1




Re: [PATCH] Practical Improvement to libgcc Complex Divide

2020-11-17 Thread Patrick McGehearty via Gcc-patches

Joseph, thank you for your detailed review and comments.

I will get to work on the necessary revisions and also
find a suitable place for sharing my random number
generating tests.

- patrick

On 11/16/2020 8:34 PM, Joseph Myers wrote:

On Tue, 8 Sep 2020, Patrick McGehearty via Gcc-patches wrote:


This project started with an investigation related to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59714.  Study of Beebe[1]
provided an overview of past and recent practice for computing complex
divide. The current glibc implementation is based on Robert Smith's
algorithm [2] from 1962.  A google search found the paper by Baudin
and Smith [3] (same Robert Smith) published in 2012. Elen Kalda's
proposed patch [4] is based on that paper.

Thanks, I've now read Baudin and Smith so can review the patch properly.
I'm fine with the overall algorithm, so my comments generally relate to
how the code should best be integrated into libgcc while keeping it
properly machine-mode-generic as far as possible.
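
As a point of reference for the discussion, Smith's 1962 method mentioned above can be sketched as follows for (a + ib) / (c + id). This is a textbook rendering in plain double, not the libgcc2.c code and not the Baudin and Smith refinement. The function name smith_cdiv is hypothetical, and the sketch makes no attempt to handle zero divisors, infinities or NaNs.

```c
/* Sketch of Smith's 1962 method for (a + ib) / (c + id).  The result is
   stored in *re and *im.  No special handling of zero, Inf or NaN.  */
void
smith_cdiv (double a, double b, double c, double d, double *re, double *im)
{
  double abs_c = c < 0 ? -c : c;
  double abs_d = d < 0 ? -d : d;

  if (abs_c < abs_d)
    {
      /* Scale by d so the ratio r = c / d has magnitude below 1.  */
      double r = c / d;
      double den = c * r + d;
      *re = (a * r + b) / den;
      *im = (b * r - a) / den;
    }
  else
    {
      /* Scale by c so the ratio r = d / c has magnitude at most 1.  */
      double r = d / c;
      double den = c + d * r;
      *re = (a + b * r) / den;
      *im = (b - a * r) / den;
    }
}
```

The ratio r is where this method loses accuracy. Per the ChangeLog entry quoted further down, the proposed patch targets the cases where that ratio underflows or where the arguments have very large or very small exponents.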


I developed two sets of tests by randomly distributing values over
a restricted range and the full range of input values. The current

Are these tests available somewhere?


Support for half, float, double, extended, and long double precision
is included as all are handled with suitable preprocessor symbols in a
single source routine. Since half precision is computed with float
precision as per current libgcc practice, the enhanced algorithm
provides no benefit for half precision and would cost performance.
Therefore half precision is left unchanged.

The existing constants for each precision:
float: FLT_MAX, FLT_MIN;
double: DBL_MAX, DBL_MIN;
extended and/or long double: LDBL_MAX, LDBL_MIN
are used for avoiding the more common overflow/underflow cases.

In general, libgcc code works with modes, not types; hardcoding references
to a particular mapping between modes and types is problematic.  Rather,
the existing code in c-cppbuiltin.c that has a loop over modes should be
extended to provide whatever information is needed, as macros defined for
each machine mode.

   /* For libgcc-internal use only.  */
   if (flag_building_libgcc)
 {
   /* Properties of floating-point modes for libgcc2.c.  */
   opt_scalar_float_mode mode_iter;
   FOR_EACH_MODE_IN_CLASS (mode_iter, MODE_FLOAT)
 {
[...]

For example, that defines macros such as __LIBGCC_DF_FUNC_EXT__ and
__LIBGCC_DF_MANT_DIG__.  The _FUNC_EXT__ definition involves that code
computing a mapping to types.

I'd suggest defining additional macros such as __LIBGCC_DF_MAX__ in the
same code - for each supported floating-point mode.  They can be defined
to __FLT_MAX__ etc. (the predefined macros rather than the ones in
float.h) - the existing code that computes a suffix for functions can be
adjusted so it also computes the string such as "FLT", "DBL", "LDBL",
"FLT128" etc.

(I suggest defining to __FLT_MAX__ rather than to the expansion of
__FLT_MAX__ because that avoids any tricky interactions with the logic to
compute such expansions lazily.  I suggest __FLT_MAX__ rather than the
FLT_MAX name from float.h because that way you avoid any need to define
feature test macros to access names such as FLT128_MAX.)


diff --git a/gcc/c-family/c-cppbuiltin.c b/gcc/c-family/c-cppbuiltin.c
index 74ecca8..02c06d8 100644
--- a/gcc/c-family/c-cppbuiltin.c
+++ b/gcc/c-family/c-cppbuiltin.c
@@ -1343,6 +1343,11 @@ c_cpp_builtins (cpp_reader *pfile)
builtin_define_with_value ("__LIBGCC_INIT_SECTION_ASM_OP__",
 INIT_SECTION_ASM_OP, 1);
  #endif
+  /* For libgcc float/double optimization */
+#ifdef HAVE_adddf3
+  builtin_define_with_int_value ("__LIBGCC_HAVE_HWDBL__",
+HAVE_adddf3);
+#endif

This is another thing to handle more generically - possibly with something
like the mode_has_fma function, and defining a macro for each mode, named
after the mode, rather than only for DFmode.  For an alternative, see the
discussion below.


diff --git a/libgcc/ChangeLog b/libgcc/ChangeLog
index ccfd6f6..8bd66c5 100644
--- a/libgcc/ChangeLog
+++ b/libgcc/ChangeLog
@@ -1,3 +1,10 @@
+2020-08-27  Patrick McGehearty 
+
+   * libgcc2.c (__divsc3, __divdc3, __divxc3, __divtc3): Enhance
+   accuracy of complex divide by avoiding underflow/overflow when
+   ratio underflows or when arguments have very large or very
+   small exponents.

Note that diffs to ChangeLog files should not now be included in patches;
the ChangeLog content needs to be included in the proposed commit message
instead for automatic ChangeLog generation.


+#if defined(L_divsc3)
+#define RBIG   ((FLT_MAX)/2.0)
+#define RMIN   (FLT_MIN)
+#define RMIN2  (0x1.0p-21)
+#define RMINSCAL (0x1.0p+19)
+#define RMAX2  ((RBIG)*(RMIN2))
+#endif

I'd expect all of these to use generic macros based on the mode.  For the
division by 2.0, probably also divide by integer 2 not 2.0 to avoid
unwanted conversions to/from double.


+#i

[PATCH] IOR with nonzero, range cannot contain 0.

2020-11-17 Thread Andrew MacLeod via Gcc-patches
PR 83072 mentions that we have lost the ability to recognize that when 
we see

  c |= 1;
c cannot be zero.   We can at least put it back for multi-ranges. Added 
a new testcase to make sure EVRP is tracking it.


bootstrapped on x86_64-pc-linux-gnu, no regressions.  pushed.

Andrew

commit a5f9c27bfc4417224e332392bb81a2d733b2b5bf
Author: Andrew MacLeod 
Date:   Tue Nov 17 10:04:38 2020 -0500

IOR with nonzero, range cannot contain 0.

Remove zero from IOR ranges with non-zero masks.

gcc/
PR tree-optimization/83072
* range-op.cc (wi_optimize_and_or): Remove zero from IOR range when
mask is non-zero.
gcc/testsuite/
* gcc.dg/pr83072.c: New.

diff --git a/gcc/range-op.cc b/gcc/range-op.cc
index b746aadb603..d0adc95527a 100644
--- a/gcc/range-op.cc
+++ b/gcc/range-op.cc
@@ -2163,6 +2163,14 @@ wi_optimize_and_or (irange &r,
   else
 gcc_unreachable ();
   value_range_with_overflow (r, type, res_lb, res_ub);
+
+  // Furthermore, if the mask is non-zero, an IOR cannot contain zero.
+  if (code == BIT_IOR_EXPR && wi::ne_p (mask, 0))
+    {
+      int_range<2> tmp;
+      tmp.set_nonzero (type);
+      r.intersect (tmp);
+    }
   return true;
 }
 
diff --git a/gcc/testsuite/gcc.dg/pr83072.c b/gcc/testsuite/gcc.dg/pr83072.c
new file mode 100644
index 00000000000..3bed8d89013
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr83072.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-evrp -fno-tree-ccp -fno-tree-forwprop -fno-tree-fre" } */
+
+void kill (void);
+
+int f(int c){
+  c |= 1;
+  if (c == 0)
+    kill ();
+
+  return c;
+}
+
+/* { dg-final { scan-tree-dump-not "kill" "evrp" } }  */
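
The property the patch relies on can be illustrated with a small toy model. This is only a sketch for unsigned values. The struct and function names are hypothetical and it is not how range-op.cc represents ranges (the patch itself intersects with an int_range<2> set to nonzero).

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical toy model: an unsigned range [lb, ub] plus a flag saying
   whether zero is still a possible value.  */
struct toy_range
{
  unsigned lb;
  unsigned ub;
  bool may_be_zero;
};

/* Range of (x | mask) for an arbitrary unsigned x.  Every bit set in
   MASK is set in the result, so the result is at least MASK and can
   only be zero when MASK itself is zero.  */
struct toy_range
toy_ior_with_mask (unsigned mask)
{
  struct toy_range r;
  r.lb = mask;
  r.ub = UINT_MAX;
  r.may_be_zero = (mask == 0);
  return r;
}
```

For the testcase above, toy_ior_with_mask (1) reports that zero is impossible, which is why the call to kill can be removed.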


Re: [PATCH v1 2/2] RISC-V: Adjust predicates for immediate shift operands

2020-11-17 Thread Jim Wilson
On Mon, Nov 16, 2020 at 2:45 PM Philipp Tomsich 
wrote:

> This is a de-optimization only if applied without patch 1 from the
> series: the change to VRP ensures that the backend will never see a shift
> wider than the immediate field.
> The problem is that if a negative shift-amount makes it to the backend,
> unintended code may be generated (as a shift-amount, in my reading, should
> always be interpreted as unsigned).
>

I doubt that you are catching every possible case.  What kind of testing
have you done to prove this?  The code you are removing does nothing
harmful, but removing it may de-optimize code.  So it should not be removed
without a very good reason and a lot of testing to make sure there is no
regression.

Shift counts do not have to be unsigned at the assembly language level.
Consider this testcase:
  int sub (int i, int j) { return i << (32 - j); }
Compiling with rv32 -O2 gives me
  li a5,32
  sub a5,a5,a1
  sll a0,a0,a5
which is 3 instructions.  But since we know that shift counts are
truncated, we should be compiling this as
  neg a1,a1
  sll a0,a0,a1
which gives the exact same result with 2 instructions.  This is on my todo
list.  This deliberately uses a negative shift count, but this is well
defined in the RISC-V ISA, so this poses no problems.
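
To make the equivalence concrete, here is a rough sketch that models the instruction in C.  It assumes, as stated above, that an RV32 sll uses only the low 5 bits of the count; the helper names (sll32 and friends) are invented for this example and are not GCC or RISC-V APIs.

```c
#include <assert.h>
#include <stdint.h>

/* Models an RV32 `sll`: only the low 5 bits of the count are used.  */
uint32_t
sll32 (uint32_t value, uint32_t count)
{
  return value << (count & 31u);
}

/* i << (32 - j), as the "li; sub; sll" sequence computes it.  */
uint32_t
shift_by_32_minus_j (uint32_t i, uint32_t j)
{
  return sll32 (i, 32u - j);
}

/* The same shift, as the "neg; sll" sequence computes it.  */
uint32_t
shift_by_neg_j (uint32_t i, uint32_t j)
{
  return sll32 (i, 0u - j);
}

/* Checks that both sequences agree for a range of j values.  */
void
check_equivalence (uint32_t i)
{
  for (uint32_t j = 0; j < 128; j++)
    assert (shift_by_32_minus_j (i, j) == shift_by_neg_j (i, j));
}
```

The two counts differ by exactly 32, so they agree in their low 5 bits.  Note that this models the hardware instruction, not C semantics; for j == 0 the C expression i << 32 is still undefined.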

There are other related things we can do here to improve code generation
for shifts.  We should not be limiting optimization at the assembly level
by what the C standard says.

You also might want to look at SHIFT_COUNT_TRUNCATED and
TARGET_SHIFT_TRUNCATION_MASK.  We already have infrastructure to handle
out-of-range shifts as the hardware dictates.  Your changes conflict with
this.  I think we should define TARGET_SHIFT_TRUNCATION_MASK for RISC-V.
That is another item on my todo list.

Jim


Re: [PATCH v1 1/2] Simplify shifts wider than the bitwidth of types

2020-11-17 Thread Jim Wilson
On Tue, Nov 17, 2020 at 8:46 AM Jakub Jelinek via Gcc-patches <
gcc-patches@gcc.gnu.org> wrote:

> On Tue, Nov 17, 2020 at 05:29:57PM +0100, Philipp Tomsich wrote:
> > > > In other words, the change to VRP canonicalizes what a lshift_expr
> with an
> > > > shift-amount outside of the type width means... it doesn't assume
> anything
> > > > about the original language.
>
> Well, I said if we want to do it at all, it should be done in VRP, because
> there is not really a difference between ((int) x) << 32 and ((int) x) << y
> for y in [32, 137] etc.
>

How does this stuff interact with SHIFT_COUNT_TRUNCATED and
TARGET_SHIFT_TRUNCATION_MASK?  We already provide a mechanism to
truncate shift counts to fit based on how the hardware handles out-of-range
shift counts.  Handling out-of-range shift counts differently in VRP would
confuse things.  Maybe VRP should be using SHIFT_COUNT_TRUNCATED and/or
TARGET_SHIFT_TRUNCATION_MASK here?  Or maybe we give up on the shift
truncation macros?

Jim


Re: [PATCH] libgcc: Add a weak stub for __sync_synchronize

2020-11-17 Thread Bernd Edlinger
On 11/17/20 4:41 PM, Richard Earnshaw (lists) wrote:
> 
> libgcc is *still* the wrong place for this.  It belongs in the system
> library (eg newlib, or glibc, or whatever), which knows about the system
> it's running on.  (Sorry, I should have said this before, but I've
> context-switched this out since it's been a long time since it came up).
> 

No problem.  I just saw it from the other end.

It is odd that this problem does not go away even if gcc is configured
with --disable-threads, which should be the default for arm-none-eabi
anyway.

If we assume a threaded environment then it is still libgcc
which does not define __GTHREADS in libgcc/gthr.h, and libstdc++'s
__cxa_guard_acquire is not making use of functions like __gthread_mutex_lock.
But that appears to be this way by design.

Of course the race is not fixed if you ask newlib to implement just this
__sync_synchronize function.

> This hack will just lead to silent code failure of the worst kind
> (non-reproducible, racy) at runtime.
> 

So in an arm-none-eabi system with armv6 or higher where the intrinsic
__sync_synchronize is not a library call but an instruction
we have exactly this worst kind scenario, already.

It is however possible that the default of -fthreadsafe-statics
is inappropriate for --disable-threads?


Bernd.


> R.
> 


Re: [PATCH v1 1/2] Simplify shifts wider than the bitwidth of types

2020-11-17 Thread Philipp Tomsich
Jeff & Jakub,

I went back to reread the C language standard and it turns out that the
delineation between defined and undefined is not as simple as I thought
that I remembered (see below).

On Tue, 17 Nov 2020 at 17:54, Jeff Law  wrote:

>
>
> On 11/17/20 9:46 AM, Jakub Jelinek wrote:
> > On Tue, Nov 17, 2020 at 05:29:57PM +0100, Philipp Tomsich wrote:
>  In other words, the change to VRP canonicalizes what a lshift_expr
> with an
>  shift-amount outside of the type width means... it doesn't assume
> anything
>  about the original language.
>  Do we assume that a LSHIFT_EXPR has the same semantics as for a
>  C-language shift-left? If so, then pre should not generate the
> LSHIFT_EXPR
>  for _9... or we might even catch this later in path isolation (as
>  undefined
>  behavior, insert a __builtin_trap() and emit a warning)?
> 
>  Note that in his comment to patch 2/2, Jim has noted that user code
> for
>  RISC-V may assume a truncation of the shift-operand...
> >>> What I'd suggest doing would be to leave the invalid shift count in the
> >>> IL in VRP, then extend the erroneous path isolation code to turn an
> >>> invalid shift into a trap (conditionally of course).
> >> You had originally suggested to add this to VRP ...
> >> Given the various comments to this patch, do you still want any of
> >> this in VRP or
> >> would you rather see this only in path isolation?
> > Well, I said if we want to do it at all, it should be done in VRP,
> because
> > there is not really a difference between ((int) x) << 32 and ((int) x)
> << y
> > for y in [32, 137] etc.
> Right.  VRP is the right place to discover, but I'm not sure it's the
> right place to clean up the mess though.
>

The rules for E1 << E2 are:
  - if E2 is negative => undefined
  - if E1 is unsigned => E1 x 2^E2, reduced modulo one more than the
maximum representable value
  - if E1 is signed and non-negative => E1 x 2^E2, if E1 x 2^E2 is
representable; otherwise, undefined

So the test case is 'undefined' due to E2 being negative.
However, if it was a large positive number, the transform would be valid if
E1 is unsigned.

I would propose the following revision to this patch:
 1. tighten the logic in VRP to handle the case of E1 being unsigned and E2
being positive...
 2. catch the undefined case of E2 being negative in path isolation

Agreed?
Philipp.


[PATCH,rs6000] Make MMA builtins use opaque modes

2020-11-17 Thread acsawdey--- via Gcc-patches
From: Aaron Sawdey 

This patch changes powerpc MMA builtins to use the new opaque
mode class and use modes OO (32 bytes) and XO (64 bytes)
instead of POI/PXI. Using the opaque modes prevents
optimization from trying to do anything with vector
pair/quad, which was the problem we were seeing with the
partial integer modes.

OK for trunk if bootstrap/regtest passes? 

gcc/
* gcc/config/rs6000/mma.md (unspec):
Add assemble/extract UNSPECs.
(movoi): Change to movoo.
(*movpoi): Change to *movoo.
(movxi): Change to movxo.
(*movpxi): Change to *movxo.
(mma_assemble_pair): Change to OO mode.
(*mma_assemble_pair): New define_insn_and_split.
(mma_disassemble_pair): New define_expand.
(*mma_disassemble_pair): New define_insn_and_split.
(mma_assemble_acc): Change to XO mode.
(*mma_assemble_acc): Change to XO mode.
(mma_disassemble_acc): New define_expand.
(*mma_disassemble_acc): New define_insn_and_split.
(mma_): Change to XO mode.
(mma_): Change to XO mode.
(mma_): Change to XO mode.
(mma_): Change to OO mode.
(mma_): Change to XO/OO mode.
(mma_): Change to XO mode.
(mma_): Change to XO mode.
(mma_): Change to XO mode.
(mma_): Change to XO mode.
(mma_): Change to XO mode.
(mma_): Change to XO mode.
(mma_): Change to XO/OO mode.
(mma_): Change to XO/OO mode.
(mma_): Change to XO mode.
(mma_): Change to XO mode.
* gcc/config/rs6000/predicates.md (input_operand): Allow opaque.
(mma_disassemble_output_operand): New predicate.
* gcc/config/rs6000/rs6000-builtin.def:
Changes to disassemble builtins.
* gcc/config/rs6000/rs6000-call.c (rs6000_return_in_memory):
Disallow __vector_pair/__vector_quad as return types.
(rs6000_promote_function_mode): Remove function return type
check because we can't test it here any more.
(rs6000_function_arg): Do not allow __vector_pair/__vector_quad
as function arguments.
(rs6000_gimple_fold_mma_builtin):
Handle mma_disassemble_* builtins.
(rs6000_init_builtins): Create types for XO/OO modes.
* gcc/config/rs6000/rs6000-modes.def: Create XO and OO modes.
* gcc/config/rs6000/rs6000-string.c (expand_block_move):
Update to OO mode.
* gcc/config/rs6000/rs6000.c (rs6000_hard_regno_mode_ok_uncached):
Update for XO/OO modes.
(rs6000_modes_tieable_p): Update for XO/OO modes.
(rs6000_debug_reg_global): Update for XO/OO modes.
(rs6000_setup_reg_addr_masks): Update for XO/OO modes.
(rs6000_init_hard_regno_mode_ok): Update for XO/OO modes.
(reg_offset_addressing_ok_p): Update for XO/OO modes.
(rs6000_emit_move): Update for XO/OO modes.
(rs6000_preferred_reload_class): Update for XO/OO modes.
(rs6000_split_multireg_move): Update for XO/OO modes.
(rs6000_mangle_type): Update for opaque types.
(rs6000_invalid_conversion): Update for XO/OO modes.
* gcc/config/rs6000/rs6000.h (VECTOR_ALIGNMENT_P):
Update for XO/OO modes.
* gcc/config/rs6000/rs6000.md (RELOAD): Update for XO/OO modes.
gcc/testsuite/
* gcc.target/powerpc/mma-double-test.c (main): Call abort for failure.
* gcc.target/powerpc/mma-single-test.c (main): Call abort for failure.
* gcc.target/powerpc/pr96506.c: Rename to pr96506-1.c.
* gcc.target/powerpc/pr96506-2.c: New test.
---
 gcc/config/rs6000/mma.md  | 385 ++
 gcc/config/rs6000/predicates.md   |  14 +-
 gcc/config/rs6000/rs6000-builtin.def  |  14 +-
 gcc/config/rs6000/rs6000-call.c   | 144 ---
 gcc/config/rs6000/rs6000-modes.def|  10 +-
 gcc/config/rs6000/rs6000-string.c |   6 +-
 gcc/config/rs6000/rs6000.c| 189 +
 gcc/config/rs6000/rs6000.h|   3 +-
 gcc/config/rs6000/rs6000.md   |   2 +-
 .../gcc.target/powerpc/mma-double-test.c  |   3 +
 .../gcc.target/powerpc/mma-single-test.c  |   3 +
 .../powerpc/{pr96506.c => pr96506-1.c}|  24 --
 gcc/testsuite/gcc.target/powerpc/pr96506-2.c  |  38 ++
 13 files changed, 482 insertions(+), 353 deletions(-)
 rename gcc/testsuite/gcc.target/powerpc/{pr96506.c => pr96506-1.c} (61%)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr96506-2.c

diff --git a/gcc/config/rs6000/mma.md b/gcc/config/rs6000/mma.md
index a3fd28bdd0a..7d520e19b0d 100644
--- a/gcc/config/rs6000/mma.md
+++ b/gcc/config/rs6000/mma.md
@@ -19,24 +19,19 @@
 ;; along with GCC; see the file COPYING3.  If not see
 ;; .
 
-;; The MMA patterns use the multi-register PXImode and POImode partial
+;; The MMA patterns use the multi-register XOmode and OOmode partial
 ;; integer modes to impl

Re: [PATCH v1 1/2] Simplify shifts wider than the bitwidth of types

2020-11-17 Thread Jakub Jelinek via Gcc-patches
On Tue, Nov 17, 2020 at 09:14:31AM -0800, Jim Wilson wrote:
> On Tue, Nov 17, 2020 at 8:46 AM Jakub Jelinek via Gcc-patches <
> gcc-patches@gcc.gnu.org> wrote:
> 
> > On Tue, Nov 17, 2020 at 05:29:57PM +0100, Philipp Tomsich wrote:
> > > > > In other words, the change to VRP canonicalizes what a lshift_expr
> > with an
> > > > > shift-amount outside of the type width means... it doesn't assume
> > anything
> > > > > about the original language.
> >
> > Well, I said if we want to do it at all, it should be done in VRP, because
> > there is not really a difference between ((int) x) << 32 and ((int) x) << y
> > for y in [32, 137] etc.
> >
> 
> How does this stuff interact with SHIFT_COUNT_TRUNCATED and
> TARGET_SHIFT_TRUNCATION_MASK?  We already provide a mechanism to
> truncate shift counts to fit based on how the hardware handles out-of-range
> shift counts.  Handling out-of-range shift counts differently in VRP would
> confuse things.  Maybe VRP should be using SHIFT_COUNT_TRUNCATED and/or
> TARGET_SHIFT_TRUNCATION_MASK here?  Or maybe we give up on the shift
> truncation macros?

Those are RTL only, aren't they?  So, in GIMPLE we can still (and already do
in various places) assume that out of bounds shifts are UB, and only in RTL
follow the rules of those macros/hooks, so only at the RTL level we can e.g.
optimize x << (y & 31) to x << y if the macros/hooks say the target handles
it that way.

Jakub



Re: [PATCH v1 1/2] Simplify shifts wider than the bitwidth of types

2020-11-17 Thread Jakub Jelinek via Gcc-patches
On Tue, Nov 17, 2020 at 06:23:51PM +0100, Philipp Tomsich wrote:
> The rules for E1 << E2 are:
>   - if E2 is negative => undefined
>   - if E1 is unsigned => E1 x 2^E2, reduced modulo one more than the
> maximum representable value
>   - if E1 is signed and non-negative => E1 x 2^E2, if E1 x 2^E2 is
> representable; otherwise, undefined

Those are rules about UB -fsanitize=shift-base diagnoses, and that greatly
differs between different languages and versions of those languages, and as
we don't really record what it comes from, for the GIMPLE IL everything is
well defined.
What we were talking about before is written earlier in the
"If the value of the right operand is negative or is
greater than or equal to the width of the promoted left operand, the behavior is
undefined."
sentence and is what -fsanitize=shift-exponent diagnoses.  In the GIMPLE IL
such shifts are still UB and in RTL only depending on some target macros
(i.e. undefined for some targets, wrapping with some mask or saturating on
others).
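
For illustration only, the shift-exponent condition in the sentence quoted above can be sketched as a small C check.  The names shift_exponent_ok and shl are hypothetical, and the width is computed from sizeof, which assumes int has no padding bits.

```c
#include <assert.h>
#include <limits.h>

/* Returns 1 if COUNT is a valid right operand when the promoted left
   operand has type int or unsigned int: it must be non-negative and
   less than the width of the type.  */
int
shift_exponent_ok (long count)
{
  const long width = (long) (sizeof (int) * CHAR_BIT);
  return count >= 0 && count < width;
}

/* Left shift that insists on a valid exponent.  */
unsigned
shl (unsigned value, long count)
{
  assert (shift_exponent_ok (count));
  return value << count;
}
```

This covers only the exponent rule; the rules about the value of the left operand (the shift-base case) are a separate matter.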

Jakub



Re: [PATCH] add -Wmismatched-new-delete to middle end (PR 90629)

2020-11-17 Thread Martin Sebor via Gcc-patches

On 11/16/20 5:54 PM, Jeff Law wrote:


On 11/3/20 4:56 PM, Martin Sebor via Gcc-patches wrote:

Attached is a simple middle end implementation of detection of
mismatched pairs of calls to C++ new and delete, along with
a substantially enhanced implementation of -Wfree-nonheap-object.
The latter option has been in place since 2011 but detected only
the most trivial bugs.

Unlike the Clang -Wmismatched-new-delete which diagnoses
declarations of "overloaded operator new() and operator delete()
functions that do not have a corresponding free store function
defined within the same scope", this patch detects mismatches
between calls to allocation and deallocation functions, such as
calling free() on the result of new, or delete on the result of
array new.  The functionality provided by Clang can be added on
top of what this feature does and since they are so close I think
it's fine to have both under the same option (a new level could
be introduced to distinguish the two).

The -Wfree-nonheap-object enhancement lets the warning detect all
calls to free, realloc, or C++ delete, with pointers that can be
proven not to point to the first byte of an allocated object.
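
A small C sketch of the kind of pointer this paragraph describes: one that points into an allocated object but not at its first byte.  The function name is made up, the invalid call is left commented out so the example stays well defined, and no claim is made here about exactly which forms the patch diagnoses.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Allocates a buffer, forms an interior pointer, and releases the
   buffer through the original pointer.  Returns 1 on success and -1
   if the allocation fails.  */
int
interior_pointer_demo (void)
{
  char *base = malloc (16);
  if (base == NULL)
    return -1;
  memset (base, 0, 16);

  char *interior = base + 4;	/* Not the first byte of the object.  */
  interior[0] = 'x';

  /* free (interior);  <- invalid: not a pointer returned by malloc.  */
  assert (interior - 4 == base);

  int ok = base[4] == 'x';
  free (base);			/* Valid: the pointer malloc returned.  */
  return ok;
}
```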

The patch relies on the well-tested compute_objsize() function
for the determination of pointer provenance and makes use of
the changes in the following patch submitted for review just
yesterday:
https://gcc.gnu.org/pipermail/gcc-patches/2020-November/557807.html

As usual, I tested on x86_64-linux with Glibc & Binutils/GDB with
no new false positives.

Martin

PS A few words on the implementation choices:

The new code is in builtins.c only because -Wfree-nonheap-object
is there.  I still plan to move all of the invalid access checking
code into its own module or pass at some point but I didn't want
to make this improvement contingent on that restructuring.
Even though it's all in builtins.c, the code is called from calls.c.
This is so that simple mismatches can be diagnosed even when free
isn't handled in builtins.c (i.e., without optimization).
The warning makes no attempt to analyze the CFG or handle
conditional mismatches.  That will have to wait until the code
is moved to a GIMPLE pass.

gcc-90629.diff

PR c++/90629 - Support for -Wmismatched-new-delete

gcc/ChangeLog:

PR c++/90629
* builtins.c (access_ref::access_ref): Initialize new member.
(compute_objsize): Use access_ref::deref.  Handle simple pointer
assignment.
(expand_builtin): Remove handling of the free built-in.
(find_assignment_location): New function.
(gimple_call_alloc_p): Same.
(gimple_call_dealloc_argno): Same.
(gimple_call_dealloc_p): Same.
(matching_alloc_calls_p): Same.
(warn_dealloc_offset): Same.
* builtins.h (struct access_ref): Declare new member.
(maybe_emit_free_warning): Make extern.  Make use of access_ref.
Handle -Wmismatched-new-delete.
* calls.c (initialize_argument_information): Call
maybe_emit_free_warning.
* doc/invoke.texi (-Wfree-nonheap-object): Expand documentation.
(-Wmismatched-new-delete): Document new option.

gcc/c-family/ChangeLog:

PR c++/90629
* c.opt (-Wmismatched-new-delete): New option.

gcc/testsuite/ChangeLog:

PR c++/90629
* g++.dg/warn/delete-array-1.C: Add expected warning.
* g++.old-deja/g++.other/delete2.C: Add expected warning.
* g++.dg/warn/Wfree-nonheap-object-2.C: New test.
* g++.dg/warn/Wfree-nonheap-object.C: New test.
* g++.dg/warn/Wmismatched-new-delete.C: New test.
* gcc.dg/Wfree-nonheap-object-2.c: New test.
* gcc.dg/Wfree-nonheap-object-3.c: New test.
* gcc.dg/Wfree-nonheap-object.c: New test.


So do we need to reconcile with David M's patch that adds the
"deallocated_by" attribute?  In that thread I raised the question about 
using the same attribute to track both pointers as well as things like 
file descriptors.  It looks like this patch ignores everything except 
builtins/language intrinsics as allocation functions.  ISTM that would 
need to be fixed for David's patch to be useful here, right?


Our emails have crossed mid air.  I replied to your comments on
David's patch here:
https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559309.html
The work I point to there extends this warning to user-defined
functions.

I also gave some preliminary comments on David's initial RFC.
It has some more background:
https://gcc.gnu.org/pipermail/gcc-patches/2020-October/555667.html

As we discussed privately, this can trigger false positives, 
particularly for unoptimized code, as we've seen in Fedora testing.  But 
I'm not terribly worried about these.


I haven't seen anything failing with a valid error because of this, but 
that's because the packages where it's triggering aren't compiling with 
-Werror.  I did some quick grepping of the logs and I think I found 
instances of each of the new warnings.


[PATCH, rs6000] Re-enable vector pair memcpy/memmove expansion

2020-11-17 Thread acsawdey--- via Gcc-patches
From: Aaron Sawdey 

After the MMA opaque mode patch goes in, we can re-enable
use of vector pair in the inline expansion of memcpy/memmove.

After bootstrap/regtest, OK for trunk?

Thanks,
Aaron

gcc/
* config/rs6000/rs6000.c (rs6000_option_override_internal):
Enable vector pair memcpy/memmove expansion.
---
 gcc/config/rs6000/rs6000.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index bb48ed92aef..53f92970414 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -4117,11 +4117,10 @@ rs6000_option_override_internal (bool global_init_p)
 
   if (!(rs6000_isa_flags_explicit & OPTION_MASK_BLOCK_OPS_VECTOR_PAIR))
 {
-  /* When the POImode issues of PR96791 are resolved, then we can
-once again enable use of vector pair for memcpy/memmove on
-P10 if we have TARGET_MMA.  For now we make it disabled by
-default for all targets.  */
-  rs6000_isa_flags &= ~OPTION_MASK_BLOCK_OPS_VECTOR_PAIR;
+  if (TARGET_MMA && TARGET_EFFICIENT_UNALIGNED_VSX)
+   rs6000_isa_flags |= OPTION_MASK_BLOCK_OPS_VECTOR_PAIR;
+  else
+   rs6000_isa_flags &= ~OPTION_MASK_BLOCK_OPS_VECTOR_PAIR;
 }
 
   /* Use long double size to select the appropriate long double.  We use
-- 
2.18.4



[PATCH] c++: Allow template lambdas without lambda-declarator [PR97839]

2020-11-17 Thread Marek Polacek via Gcc-patches
Our implementation of template lambdas incorrectly requires the optional
lambda-declarator.  This was probably required by an early draft of
generic lambdas, but now the production is [expr.prim.lambda.general]:

 lambda-expression:
lambda-introducer lambda-declarator [opt] compound-statement
lambda-introducer < template-parameter-list > requires-clause [opt]
  lambda-declarator [opt] compound-statement

Therefore, we should accept the following test.

Incidentally, I noticed we give a terrible diagnostic when the user uses
'mutable', but forgets to type '()' before it, which sounds like a common
mistake.  So it seems to me we should handle that specifically, rather
than to emit this:

lambda-generic8.C: In lambda function:
lambda-generic8.C:8:18: error: expected '{' before 'mutable'
8 |   [] mutable {}.operator()();
  |  ^~~
lambda-generic8.C: In function 'int main()':
lambda-generic8.C:8:17: error: expected ';' before 'mutable'
8 |   [] mutable {}.operator()();
  | ^~~~
  | ;
lambda-generic8.C:8:28: error: expected primary-expression before '.' token
8 |   [] mutable {}.operator()();
  |^
lambda-generic8.C:8:40: error: expected primary-expression before 'int'
8 |   [] mutable {}.operator()();
  |^~~

Is it okay to fix this in stage3?

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

gcc/cp/ChangeLog:

PR c++/97839
* parser.c (cp_parser_lambda_declarator_opt): Don't require ().

gcc/testsuite/ChangeLog:

PR c++/97839
* g++.dg/cpp2a/lambda-generic8.C: New test.
---
 gcc/cp/parser.c  | 14 ++
 gcc/testsuite/g++.dg/cpp2a/lambda-generic8.C |  9 +
 2 files changed, 15 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/lambda-generic8.C

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 42f705266bb..9f09c778c29 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -10604,6 +10604,8 @@ cp_parser_trait_expr (cp_parser* parser, enum rid keyword)
 
lambda-expression:
  lambda-introducer lambda-declarator [opt] compound-statement
+ lambda-introducer < template-parameter-list > requires-clause [opt]
+   lambda-declarator [opt] compound-statement
 
Returns a representation of the expression.  */
 
@@ -11061,13 +11063,11 @@ cp_parser_lambda_introducer (cp_parser* parser, tree lambda_expr)
 /* Parse the (optional) middle of a lambda expression.
 
lambda-declarator:
- < template-parameter-list [opt] >
-   requires-clause [opt]
- ( parameter-declaration-clause [opt] )
-   attribute-specifier [opt]
+ ( parameter-declaration-clause )
decl-specifier-seq [opt]
-   exception-specification [opt]
-   lambda-return-type-clause [opt]
+   noexcept-specifier [opt]
+   attribute-specifier-seq [opt]
+   trailing-return-type [opt]
requires-clause [opt]
 
LAMBDA_EXPR is the current representation of the lambda expression.  */
@@ -11217,8 +11217,6 @@ cp_parser_lambda_declarator_opt (cp_parser* parser, tree lambda_expr)
  trailing-return-type in case of decltype.  */
   pop_bindings_and_leave_scope ();
 }
-  else if (template_param_list != NULL_TREE) // generate diagnostic
-cp_parser_require (parser, CPP_OPEN_PAREN, RT_OPEN_PAREN);
 
   /* Create the function call operator.
 
diff --git a/gcc/testsuite/g++.dg/cpp2a/lambda-generic8.C b/gcc/testsuite/g++.dg/cpp2a/lambda-generic8.C
new file mode 100644
index 000..f3c3809b36d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/lambda-generic8.C
@@ -0,0 +1,9 @@
+// PR c++/97839
+// { dg-do compile { target c++20 } }
+// Test that a lambda with  doesn't require
+// a lambda-declarator.
+
+int main()
+{
+  []{}.operator()();
+}

base-commit: 8661f4faa875f361cd22a197774c1fa04cd0580b
-- 
2.28.0



global trees

2020-11-17 Thread Nathan Sidwell


This reorders the common and c++ global tree arrays.  It introduces a
module-specific High Water Mark, below which are the immutable slots
initialized at startup and beyond which are the lazily filled slots
(and a few immutables we need to locate by name lookup anyway).
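
The layout can be pictured with a toy enum; this is only a sketch with invented DTI_* names, not the real c_tree_index or cp_tree_index contents.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical index enum laid out in the way described above.  */
enum demo_tree_index
{
  DTI_VOID_TYPE,	/* Immutable, initialized at startup.  */
  DTI_NULL,		/* Immutable, initialized at startup.  */
  DTI_MODULE_HWM,	/* High water mark: a marker, not a real slot.  */
  DTI_SAVED_NAMES,	/* Filled lazily during compilation.  */
  DTI_MAX
};

/* Returns 1 if slot IX lies below the high water mark.  */
int
demo_slot_is_immutable (enum demo_tree_index ix)
{
  assert (ix < DTI_MAX);
  return ix < DTI_MODULE_HWM;
}

/* Number of leading slots that are fixed after startup.  */
size_t
demo_fixed_slot_count (void)
{
  return DTI_MODULE_HWM;
}
```

With this ordering a single comparison against the marker separates the fixed slots from the lazy ones, instead of a per-slot list.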

gcc/c-family/
* c-common.h (enum c_tree_index): Reorder to place lazy fields
after newly-added CTI_MODULE_HWM.
gcc/cp/
* cp-tree.h (enum cp_tree_index): Reorder to place lazy fields
after newly-added CPTI_MODULE_HWM.

Pushing to trunk
--
Nathan Sidwell
diff --git i/gcc/c-family/c-common.h w/gcc/c-family/c-common.h
index 3c508979b14..f413e8773f5 100644
--- i/gcc/c-family/c-common.h
+++ w/gcc/c-family/c-common.h
@@ -364,13 +364,17 @@ enum c_tree_index
 
 CTI_DEFAULT_FUNCTION_TYPE,
 
+CTI_NULL,
+
 /* These are not types, but we have to look them up all the time.  */
 CTI_FUNCTION_NAME_DECL,
 CTI_PRETTY_FUNCTION_NAME_DECL,
 CTI_C99_FUNCTION_NAME_DECL,
-CTI_SAVED_FUNCTION_NAME_DECLS,
 
-CTI_NULL,
+CTI_MODULE_HWM,
+/* Below here entities change during compilation.  */
+
+CTI_SAVED_FUNCTION_NAME_DECLS,
 
 CTI_MAX
 };
diff --git i/gcc/cp/cp-tree.h w/gcc/cp/cp-tree.h
index 9ae6ff5f7a2..81485de94f9 100644
--- i/gcc/cp/cp-tree.h
+++ w/gcc/cp/cp-tree.h
@@ -128,12 +128,8 @@ enum cp_tree_index
 CPTI_EXPLICIT_VOID_LIST,
 CPTI_VTBL_TYPE,
 CPTI_VTBL_PTR_TYPE,
-CPTI_STD,
-CPTI_ABI,
 CPTI_GLOBAL,
 CPTI_GLOBAL_TYPE,
-CPTI_CONST_TYPE_INFO_TYPE,
-CPTI_TYPE_INFO_PTR_TYPE,
 CPTI_ABORT_FNDECL,
 CPTI_AGGR_TAG,
 CPTI_CONV_OP_MARKER,
@@ -190,8 +186,28 @@ enum cp_tree_index
 CPTI_NOEXCEPT_FALSE_SPEC,
 CPTI_NOEXCEPT_DEFERRED_SPEC,
 
+CPTI_NULLPTR,
+CPTI_NULLPTR_TYPE,
+
+CPTI_ANY_TARG,
+
+CPTI_MODULE_HWM,
+/* Nodes after here change during compilation, or should not be in
+   the module's global tree table.  */
+
+/* We must find these via the global namespace.  */
+CPTI_STD,
+CPTI_ABI,
+
+/* These are created at init time, but the library/headers provide
+   definitions.  */
+CPTI_ALIGN_TYPE,
+CPTI_CONST_TYPE_INFO_TYPE,
+CPTI_TYPE_INFO_PTR_TYPE,
 CPTI_TERMINATE_FN,
 CPTI_CALL_UNEXPECTED_FN,
+
+/* These are lazily inited.  */
 CPTI_GET_EXCEPTION_PTR_FN,
 CPTI_BEGIN_CATCH_FN,
 CPTI_END_CATCH_FN,
@@ -204,13 +220,6 @@ enum cp_tree_index
 CPTI_DSO_HANDLE,
 CPTI_DCAST,
 
-CPTI_NULLPTR,
-CPTI_NULLPTR_TYPE,
-
-CPTI_ALIGN_TYPE,
-
-CPTI_ANY_TARG,
-
 CPTI_SOURCE_LOCATION_IMPL,
 
 CPTI_FALLBACK_DFLOAT32_TYPE,


Re: [gcc r9-8794] aarch64: Clear canary value after stack_protect_test [PR96191]

2020-11-17 Thread Richard Sandiford via Gcc-patches
Sebastian Pop  writes:
> Hi,
>
> On Fri, Aug 7, 2020 at 6:18 AM Richard Sandiford  wrote:
>>
>> https://gcc.gnu.org/g:5380912a17ea09a8996720fb62b1a70c16c8f9f2
>>
>> commit r9-8794-g5380912a17ea09a8996720fb62b1a70c16c8f9f2
>> Author: Richard Sandiford 
>> Date:   Fri Aug 7 12:17:37 2020 +0100
>
> could you please also apply this change to the gcc-8 branch?

I've now pushed the attached patch to GCC 8.  It's somewhat simpler
than the GCC 9+ version since GCC 8 didn't support the sysreg model.

Tested on aarch64-linux-gnu.

Thanks,
Richard


gcc/
PR target/96191
* config/aarch64/aarch64.md (stack_protect_test_): Set the
CC register directly, instead of a GPR.  Replace the original GPR
destination with an extra scratch register.  Zero out operand 3
after use.
(stack_protect_test): Update accordingly.

gcc/testsuite/
PR target/96191
* gcc.target/aarch64/stack-protector-1.c: New test.
* gcc.target/aarch64/stack-protector-2.c: Likewise.

(cherry picked from commit fe1a26429038d7cd17abc53f96a6f3e2639b605f)
---
 gcc/config/aarch64/aarch64.md | 35 
 .../gcc.target/aarch64/stack-protector-1.c| 89 +++
 .../gcc.target/aarch64/stack-protector-2.c|  6 ++
 3 files changed, 110 insertions(+), 20 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/stack-protector-1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/stack-protector-2.c

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 9fc555c4006..ea1319c56a4 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -5995,35 +5995,30 @@
(match_operand 2)]
   ""
 {
-  rtx result;
   machine_mode mode = GET_MODE (operands[0]);
 
-  result = gen_reg_rtx(mode);
-
   emit_insn ((mode == DImode
- ? gen_stack_protect_test_di
- : gen_stack_protect_test_si) (result,
-   operands[0],
-   operands[1]));
-
-  if (mode == DImode)
-emit_jump_insn (gen_cbranchdi4 (gen_rtx_EQ (VOIDmode, result, const0_rtx),
-   result, const0_rtx, operands[2]));
-  else
-emit_jump_insn (gen_cbranchsi4 (gen_rtx_EQ (VOIDmode, result, const0_rtx),
-   result, const0_rtx, operands[2]));
+? gen_stack_protect_test_di
+: gen_stack_protect_test_si) (operands[0], operands[1]));
+
+  rtx cc_reg = gen_rtx_REG (CCmode, CC_REGNUM);
+  emit_jump_insn (gen_condjump (gen_rtx_EQ (VOIDmode, cc_reg, const0_rtx),
+   cc_reg, operands[2]));
   DONE;
 })
 
+;; DO NOT SPLIT THIS PATTERN.  It is important for security reasons that the
+;; canary value does not live beyond the end of this sequence.
 (define_insn "stack_protect_test_"
-  [(set (match_operand:PTR 0 "register_operand" "=r")
-   (unspec:PTR [(match_operand:PTR 1 "memory_operand" "m")
-(match_operand:PTR 2 "memory_operand" "m")]
-UNSPEC_SP_TEST))
+  [(set (reg:CC CC_REGNUM)
+   (unspec:CC [(match_operand:PTR 0 "memory_operand" "m")
+   (match_operand:PTR 1 "memory_operand" "m")]
+  UNSPEC_SP_TEST))
+   (clobber (match_scratch:PTR 2 "=&r"))
(clobber (match_scratch:PTR 3 "=&r"))]
   ""
-  "ldr\t%3, %1\;ldr\t%0, %2\;eor\t%0, %3, %0"
-  [(set_attr "length" "12")
+  "ldr\t%2, %0\;ldr\t%3, %1\;subs\t%2, %2, %3\;mov\t%3, 0"
+  [(set_attr "length" "16")
(set_attr "type" "multiple")])
 
 ;; Write Floating-point Control Register.
diff --git a/gcc/testsuite/gcc.target/aarch64/stack-protector-1.c b/gcc/testsuite/gcc.target/aarch64/stack-protector-1.c
new file mode 100644
index 000..73e83bc413f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/stack-protector-1.c
@@ -0,0 +1,89 @@
+/* { dg-do run } */
+/* { dg-require-effective-target fstack_protector } */
+/* { dg-options "-fstack-protector-all -O2" } */
+
+extern volatile long *stack_chk_guard_ptr;
+
+volatile long *
+get_ptr (void)
+{
+  return stack_chk_guard_ptr;
+}
+
+void __attribute__ ((noipa))
+f (void)
+{
+  volatile int x;
+  x = 1;
+  x += 1;
+}
+
+#define CHECK(REG) "\tcmp\tx0, " #REG "\n\tbeq\t1f\n"
+
+asm (
+"  .pushsection .data\n"
+"  .align  3\n"
+"  .globl  stack_chk_guard_ptr\n"
+"stack_chk_guard_ptr:\n"
+#if __ILP32__
+"  .word   __stack_chk_guard\n"
+#else
+"  .xword  __stack_chk_guard\n"
+#endif
+"  .weak   __stack_chk_guard\n"
+"__stack_chk_guard:\n"
+"  .word   0xdead4321\n"
+"  .word   0xbeef8765\n"
+"  .text\n"
+"  .globl  main\n"
+"  .type   main, %function\n"
+"main:\n"
+"  bl  get_ptr\n"
+"  str x0, [sp, #-16]!\n"
+"  bl  f\n"
+"  str x0, [sp, #8]\n"
+"  ldr x0, [sp]\n"
+#if __ILP32__
+"  ldr w0, [x0]\n"
+#else
+"  ldr x0, [x0]\n"
+#endif
+   CHECK (x1)
+   CHECK (x2)
+   CHECK (x3)
+  

[AArch64] Add --with-tune configure flag

2020-11-17 Thread Pop, Sebastian via Gcc-patches
Hi,

The attached patch fixes a configure error on Arm64 when passing
--with-tune=... to configure:
```
This target does not support --with-tune.
Valid --with options are: abi cpu arch
```
The missing flag sets target tuning to a different value than the generic 
tuning.

gcc/
* config.gcc: Add --with-tune to AArch64 configure flags.

Tested on aarch64-linux with bootstrap and regression test.
Ok to commit to trunk and back-port to active branches?

Thanks,
Sebastian



0001-AArch64-add-with-tune-configure-flag.patch
Description: 0001-AArch64-add-with-tune-configure-flag.patch


Re: [PATCH,rs6000] Make MMA builtins use opaque modes

2020-11-17 Thread Peter Bergner via Gcc-patches
On 11/17/20 11:48 AM, acsaw...@linux.ibm.com wrote:
> -;; The MMA patterns use the multi-register PXImode and POImode partial
> +;; The MMA patterns use the multi-register XOmode and OOmode partial
>  ;; integer modes to implement the target specific __vector_quad and

XOmode and OOmode are not partial integer modes, so change to opaque mode.


> +;; Return 1 if this operand is valid for an MMA disassemble insn.
> +(define_predicate "mma_disassemble_output_operand"
> +  (match_code "reg,subreg,mem")
> +{
> +  if (REG_P (op) && !vsx_register_operand (op, mode))
> +return false;
> +  return true;
> +})

Do we really want to accept subregs here?  If so, why are they not also required
to be vsx_register_operand()?


> -   if ((attr & RS6000_BTC_QUAD) == 0)
> +   if ( !( d->code == MMA_BUILTIN_DISASSEMBLE_ACC_INTERNAL
> +   || d->code == MMA_BUILTIN_DISASSEMBLE_PAIR_INTERNAL)
> +&& ((attr & RS6000_BTC_QUAD) == 0))

No white space after the '('.



> -  if (icode == CODE_FOR_nothing)
> +  /* This is a disassemble pair/acc function. */
> +  if ( d->code == MMA_BUILTIN_DISASSEMBLE_ACC
> +|| d->code == MMA_BUILTIN_DISASSEMBLE_PAIR)

Ditto.



> +  /* The __vector_pair and __vector_quad modes are multi-register
> + modes, so if have to load or store the registers, we have to be
> + careful to properly swap them if we're in little endian mode

s/so if have to/so if we have to/


> -   /* We are writing an accumulator register, so we have to
> -  prime it after we've written it.  */
> -   emit_insn (gen_mma_xxmtacc (dst, dst));
> +   if ( GET_MODE (src) == XOmode )

White space again.


>/* Move register range backwards, if we might have destructive
>overlap.  */
>int i;
> -  for (i = nregs - 1; i >= 0; i--)
> - emit_insn (gen_rtx_SET (simplify_gen_subreg (reg_mode, dst, mode,
> -  i * reg_mode_size),
> - simplify_gen_subreg (reg_mode, src, mode,
> -  i * reg_mode_size)));
> +  /* XO/OO are opaque so cannot use subregs. */
> +  if ( mode == OOmode || mode == XOmode )

Ditto.


> +   /* XO/OO are opaque so cannot use subregs. */
> +   if ( mode == OOmode || mode == XOmode )

Ditto.


Peter


[PATCH] c++: Fix ICE-on-invalid with -Wvexing-parse [PR97881]

2020-11-17 Thread Marek Polacek via Gcc-patches
This invalid (?) code broke my assumption that if decl_specifiers->type
is null, there must be any type-specifiers.  Turn the assert into an if
to fix this crash.
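
The resulting control flow can be sketched outside the parser as follows. This is only a model under stated assumptions: `DeclSpecs` and `type_for_vexing_parse` are hypothetical names, and type names are plain strings rather than trees.

```cpp
#include <cstring>

// Hypothetical stand-in for the two fields the patch looks at.
struct DeclSpecs {
  const char *type;            // null models a null decl_specifiers->type
  bool any_type_specifiers_p;  // models ->any_type_specifiers_p
};

// Returns the type name the warning would reason about, or nullptr when
// the function should return early without warning.
const char *type_for_vexing_parse(const DeclSpecs &ds)
{
  if (ds.type)
    {
      if (std::strcmp(ds.type, "void") == 0)
        return nullptr;        // a void type bails out early
      return ds.type;
    }
  else if (ds.any_type_specifiers_p)
    return "int";              // e.g. `long f();` is treated as int
  else
    return nullptr;            // this case used to trip the assert
}
```

The last branch is the one the invalid test case reaches: no type and no type-specifiers, so the function just gives up instead of asserting.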

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

gcc/cp/ChangeLog:

PR c++/97881
* parser.c (warn_about_ambiguous_parse): Only assume "int" if we
actually saw any type-specifiers.

gcc/testsuite/ChangeLog:

PR c++/97881
* g++.dg/warn/Wvexing-parse9.C: New test.
---
 gcc/cp/parser.c| 11 +--
 gcc/testsuite/g++.dg/warn/Wvexing-parse9.C |  8 
 2 files changed, 13 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/warn/Wvexing-parse9.C

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index b7ef259b048..7a6bf4ad2cf 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -20792,13 +20792,12 @@ warn_about_ambiguous_parse (const 
cp_decl_specifier_seq *decl_specifiers,
   if (same_type_p (type, void_type_node))
return;
 }
+  else if (decl_specifiers->any_type_specifiers_p)
+/* Code like long f(); will have null ->type.  If we have any
+   type-specifiers, pretend we've seen int.  */
+type = integer_type_node;
   else
-{
-  /* Code like long f(); will have null ->type.  If we have any
-type-specifiers, pretend we've seen int.  */
-  gcc_checking_assert (decl_specifiers->any_type_specifiers_p);
-  type = integer_type_node;
-}
+return;
 
   auto_diagnostic_group d;
   location_t loc = declarator->u.function.parens_loc;
diff --git a/gcc/testsuite/g++.dg/warn/Wvexing-parse9.C 
b/gcc/testsuite/g++.dg/warn/Wvexing-parse9.C
new file mode 100644
index 000..dc4198d6c5e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wvexing-parse9.C
@@ -0,0 +1,8 @@
+// PR c++/97881
+// { dg-do compile }
+
+void
+cb ()
+{
+  volatile _Atomic (int) a1; // { dg-error "expected initializer" }
+}

base-commit: a5f9c27bfc4417224e332392bb81a2d733b2b5bf
-- 
2.28.0



[r11-5094 Regression] FAIL: gcc.dg/torture/pr8081.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects (test for excess errors) on Linux/x86_64

2020-11-17 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

afa6adbd6c83eeef6d75655140f7c0c9a02a479e is the first bad commit
commit afa6adbd6c83eeef6d75655140f7c0c9a02a479e
Author: Jan Hubicka 
Date:   Tue Nov 17 15:41:06 2020 +0100

Improve handling of memory operands in ipa-icf 3/4

caused

FAIL: gcc.c-torture/execute/20020412-1.c   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  (internal compiler error)
FAIL: gcc.c-torture/execute/20020412-1.c   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  (test for excess errors)
FAIL: gcc.c-torture/execute/20020412-1.c   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  (internal compiler error)
FAIL: gcc.c-torture/execute/20020412-1.c   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  (test for excess errors)
FAIL: gcc.c-torture/execute/pr82210.c   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  (internal compiler error)
FAIL: gcc.c-torture/execute/pr82210.c   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  (test for excess errors)
FAIL: gcc.c-torture/execute/pr82210.c   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  (internal compiler error)
FAIL: gcc.c-torture/execute/pr82210.c   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  (test for excess errors)
FAIL: gcc.dg/pr34457-1.c (internal compiler error)
FAIL: gcc.dg/pr34457-1.c (test for excess errors)
FAIL: gcc.dg/torture/pr8081.c   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  (internal compiler error)
FAIL: gcc.dg/torture/pr8081.c   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  (test for excess errors)
FAIL: gcc.dg/torture/pr8081.c   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  (internal compiler error)
FAIL: gcc.dg/torture/pr8081.c   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  (test for excess errors)

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r11-5094/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="execute.exp=gcc.c-torture/execute/20020412-1.c 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="execute.exp=gcc.c-torture/execute/20020412-1.c 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="execute.exp=gcc.c-torture/execute/20020412-1.c 
--target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="execute.exp=gcc.c-torture/execute/20020412-1.c 
--target_board='unix{-m64\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="execute.exp=gcc.c-torture/execute/pr82210.c 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="execute.exp=gcc.c-torture/execute/pr82210.c 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="execute.exp=gcc.c-torture/execute/pr82210.c 
--target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="execute.exp=gcc.c-torture/execute/pr82210.c 
--target_board='unix{-m64\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=gcc.dg/pr34457-1.c 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=gcc.dg/pr34457-1.c 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=gcc.dg/pr34457-1.c 
--target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=gcc.dg/pr34457-1.c 
--target_board='unix{-m64\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg-torture.exp=gcc.dg/torture/pr8081.c 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg-torture.exp=gcc.dg/torture/pr8081.c --target_board='unix{-m32\ 
-march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg-torture.exp=gcc.dg/torture/pr8081.c 
--target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg-torture.exp=gcc.dg/torture/pr8081.c --target_board='unix{-m64\ 
-march=cascadelake}'"

(Please do not reply to this email; for questions about this report, contact me 
at skpgkp2 at gmail dot com.)


[PATCH] Remove lambdas from _Rb_tree

2020-11-17 Thread François Dumont via Gcc-patches
This is a change that has been done to _Hashtable and that I forgot to 
propose for _Rb_tree.


The _GLIBCXX_XREF macro can be easily removed of course.

    libstdc++: _Rb_tree code cleanup, remove lambdas.

    Use an additional template parameter on the clone method to propagate
    whether the values must be copied or moved, rather than using lambdas.

    libstdc++-v3/ChangeLog:

    * include/bits/move.h (_GLIBCXX_XREF): New.
    * include/bits/stl_tree.h: Adapt to use latter.
    (_Rb_tree<>::_S_fwd_value_for): New.
    (_Rb_tree<>::_M_clone_node): Add _Tree template parameter.
    Use _S_fwd_value_for.
    (_Rb_tree<>::_M_cbegin): New.
    (_Rb_tree<>::_M_begin): Use latter.
    (_Rb_tree<>::_M_copy): Add _Tree template parameter.
    (_Rb_tree<>::_M_move_data): Use rvalue reference for _Rb_tree
    parameter.
    (_Rb_tree<>::_M_move_assign): Likewise.
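
The idea of letting a template parameter decide between copy and move can be sketched in isolation like this. It is a minimal C++11 illustration, not the libstdc++ code: `fwd_value_for`, `Node` and `clone_node` are hypothetical names loosely modelled on `_S_fwd_value_for` and `_M_clone_node`.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// If Tree is an lvalue reference type the value is handed on as a const
// reference (so it gets copied); otherwise it is handed on as an rvalue
// reference (so it can be moved from).
template<typename Tree, typename Value>
typename std::conditional<std::is_lvalue_reference<Tree>::value,
                          const Value&, Value&&>::type
fwd_value_for(Value& val) noexcept
{ return std::move(val); }

// Hypothetical node type standing in for a tree node.
struct Node { std::string value; };

// Clone a node; the Tree argument carries the copy-or-move decision that
// a lambda would otherwise have had to encode.
template<typename Tree>
Node clone_node(Node& src)
{ return Node{ fwd_value_for<Tree>(src.value) }; }
```

Calling `clone_node<Node&>(n)` copies the string and leaves `n` untouched, while `clone_node<Node>(n)` is allowed to steal its contents.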

Tested under Linux x86_64.

Ok to commit ?

François

diff --git a/libstdc++-v3/include/bits/move.h b/libstdc++-v3/include/bits/move.h
index 5a4dbdc823c..e0d68ca9108 100644
--- a/libstdc++-v3/include/bits/move.h
+++ b/libstdc++-v3/include/bits/move.h
@@ -158,9 +158,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   /// @} group utilities
 
+#define _GLIBCXX_XREF(_Tp) _Tp&&
 #define _GLIBCXX_MOVE(__val) std::move(__val)
 #define _GLIBCXX_FORWARD(_Tp, __val) std::forward<_Tp>(__val)
 #else
+#define _GLIBCXX_XREF(_Tp) const _Tp&
 #define _GLIBCXX_MOVE(__val) (__val)
 #define _GLIBCXX_FORWARD(_Tp, __val) (__val)
 #endif
diff --git a/libstdc++-v3/include/bits/stl_tree.h b/libstdc++-v3/include/bits/stl_tree.h
index ec141ea01c7..128c7e2c892 100644
--- a/libstdc++-v3/include/bits/stl_tree.h
+++ b/libstdc++-v3/include/bits/stl_tree.h
@@ -478,11 +478,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 	template
 	  _Link_type
-#if __cplusplus < 201103L
-	  operator()(const _Arg& __arg)
-#else
-	  operator()(_Arg&& __arg)
-#endif
+	  operator()(_GLIBCXX_XREF(_Arg) __arg)
 	  {
 	_Link_type __node = static_cast<_Link_type>(_M_extract());
 	if (__node)
@@ -544,11 +540,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 	template
 	  _Link_type
-#if __cplusplus < 201103L
-	  operator()(const _Arg& __arg) const
-#else
-	  operator()(_Arg&& __arg) const
-#endif
+	  operator()(_GLIBCXX_XREF(_Arg) __arg) const
 	  { return _M_t._M_create_node(_GLIBCXX_FORWARD(_Arg, __arg)); }
 
   private:
@@ -655,11 +647,27 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	_M_put_node(__p);
   }
 
-  template
+#if __cplusplus >= 201103L
+  template
+	static constexpr
+	typename conditional::value,
+			 const value_type&, value_type&&>::type
+	_S_fwd_value_for(value_type& __val) noexcept
+	{ return std::move(__val); }
+#else
+  template
+	static const value_type&
+	_S_fwd_value_for(value_type& __val)
+	{ return __val; }
+#endif
+
+  template
 	_Link_type
-	_M_clone_node(_Const_Link_type __x, _NodeGen& __node_gen)
+	_M_clone_node(_GLIBCXX_XREF(_Tree),
+		  _Link_type __x, _NodeGen& __node_gen)
 	{
-	  _Link_type __tmp = __node_gen(*__x->_M_valptr());
+	  _Link_type __tmp
+	= __node_gen(_S_fwd_value_for<_Tree>(*__x->_M_valptr()));
 	  __tmp->_M_color = __x->_M_color;
 	  __tmp->_M_left = 0;
 	  __tmp->_M_right = 0;
@@ -748,9 +756,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   { return this->_M_impl._M_header._M_right; }
 
   _Link_type
-  _M_begin() _GLIBCXX_NOEXCEPT
+  _M_cbegin() const _GLIBCXX_NOEXCEPT
   { return static_cast<_Link_type>(this->_M_impl._M_header._M_parent); }
 
+  _Link_type
+  _M_begin() _GLIBCXX_NOEXCEPT
+  { return _M_cbegin(); }
+
   _Const_Link_type
   _M_begin() const _GLIBCXX_NOEXCEPT
   {
@@ -889,15 +901,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _M_insert_equal_lower(const value_type& __x);
 #endif
 
-  template
+  template
 	_Link_type
-	_M_copy(_Const_Link_type __x, _Base_ptr __p, _NodeGen&);
+	_M_copy(_GLIBCXX_XREF(_Tree), _Link_type, _Base_ptr, _NodeGen&);
 
-  template
+  template
 	_Link_type
-	_M_copy(const _Rb_tree& __x, _NodeGen& __gen)
+	_M_copy(_GLIBCXX_XREF(_Tree) __x, _NodeGen& __gen)
 	{
-	  _Link_type __root = _M_copy(__x._M_begin(), _M_end(), __gen);
+	  _Link_type __root = _M_copy(_GLIBCXX_FORWARD(_Tree, __x),
+  __x._M_cbegin(), _M_end(), __gen);
 	  _M_leftmost() = _S_minimum(__root);
 	  _M_rightmost() = _S_maximum(__root);
 	  _M_impl._M_node_count = __x._M_impl._M_node_count;
@@ -977,7 +990,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   : _M_impl(__x._M_impl._M_key_compare, std::move(__a))
   {
 	if (__x._M_root() != nullptr)
-	  _M_move_data(__x, false_type{});
+	  _M_move_data(std::move(__x), false_type{});
   }
 
 public:
@@ -1426,22 +1439,22 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 private:
   // Move elements from container with equal allocator.
   void
-  _M_move_data(_Rb_tree& __x, true_type)
+  _M_move_data(_Rb_tree&& __x, true_type)
   { _M_impl._M_move_data(__x._M_

Re: [r11-5094 Regression] FAIL: gcc.dg/torture/pr8081.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects (test for excess errors) on Linux/x86_64

2020-11-17 Thread Jan Hubicka
Hi,
I am testing the following fix.  I manually applied a rejected hunk and
for some reason managed to reverse the conditional :(

Honza

* ipa-icf.c (sem_function::hash_stmt): Fix conditional on
variably_modified_type_p.
diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c
index 27eeda3a319..6ae842766e6 100644
--- a/gcc/ipa-icf.c
+++ b/gcc/ipa-icf.c
@@ -1459,10 +1459,10 @@ sem_function::hash_stmt (gimple *stmt, inchash::hash 
&hstate)
 
ao_ref_init (&ref, gimple_op (stmt, i));
tree t = ao_ref_alias_ptr_type (&ref);
-   if (variably_modified_type_p (t, NULL_TREE))
+   if (!variably_modified_type_p (t, NULL_TREE))
  memory_access_types.safe_push (t);
t = ao_ref_base_alias_ptr_type (&ref);
-   if (variably_modified_type_p (t, NULL_TREE))
+   if (!variably_modified_type_p (t, NULL_TREE))
  memory_access_types.safe_push (t);
  }
  }


c++: duplicate block-scope extern [PR 97877]

2020-11-17 Thread Nathan Sidwell

We ICEd on a duplicated block-scope extern, because duplicate_decls was
dropping the decl_lang_specific of olddecl.  Simply adding the
appropriate retrofitting and copying turned out to be insufficient,
because a block-scope using-decl can also match the extern.
The latter seems a little suspicious and I have asked CWG for advice.
While there, I robustified the assert about releasing olddecl's
lang-specific -- if it had one, the new decl had better have one.
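
For readers unfamiliar with the construct, a duplicated block-scope extern looks like the sketch below. This is a hypothetical reduction written for illustration; the actual contents of pr97877.C are not shown in this message and may differ.

```cpp
// Namespace-scope definition that the block-scope declarations refer to.
int counter = 42;

int read_counter()
{
  extern int counter;   // block-scope extern declaration
  extern int counter;   // the same declaration again, which is valid C++
  return counter;
}
```

The second declaration is what sends the front end through duplicate_decls with two DECL_LOCAL_DECL_P decls.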

PR c++/97877
gcc/cp/
* decl.c (duplicate_decls): Deal with duplicated DECL_LOCAL_DECL_P
decls.  Extend decl_lang_specific checking assert.
gcc/testsuite/
* g++.dg/lookup/pr97877.C: New.


pushing to trunk

--
Nathan Sidwell
diff --git i/gcc/cp/decl.c w/gcc/cp/decl.c
index 89bae06cd6b..d90e9840f40 100644
--- i/gcc/cp/decl.c
+++ w/gcc/cp/decl.c
@@ -2452,6 +2452,20 @@ duplicate_decls (tree newdecl, tree olddecl, bool hiding, bool was_hidden)
   if (! DECL_COMDAT (olddecl))
 DECL_COMDAT (newdecl) = 0;
 
+  if (VAR_OR_FUNCTION_DECL_P (newdecl) && DECL_LOCAL_DECL_P (newdecl))
+{
+  if (!DECL_LOCAL_DECL_P (olddecl))
+	/* This can happen if olddecl was brought in from the
+	   enclosing namespace via a using-decl.  The new decl is
+	   then not a block-scope extern at all.  */
+	DECL_LOCAL_DECL_P (newdecl) = false;
+  else
+	{
+	  retrofit_lang_decl (newdecl);
+	  DECL_LOCAL_DECL_ALIAS (newdecl) = DECL_LOCAL_DECL_ALIAS (olddecl);
+	}
+}
+
   new_template_info = NULL_TREE;
   if (DECL_LANG_SPECIFIC (newdecl) && DECL_LANG_SPECIFIC (olddecl))
 {
@@ -2735,8 +2749,9 @@ duplicate_decls (tree newdecl, tree olddecl, bool hiding, bool was_hidden)
  with that from NEWDECL below.  */
   if (DECL_LANG_SPECIFIC (olddecl))
 {
-  gcc_assert (DECL_LANG_SPECIFIC (olddecl)
-		  != DECL_LANG_SPECIFIC (newdecl));
+  gcc_checking_assert (DECL_LANG_SPECIFIC (newdecl)
+			   && (DECL_LANG_SPECIFIC (olddecl)
+			   != DECL_LANG_SPECIFIC (newdecl)));
   ggc_free (DECL_LANG_SPECIFIC (olddecl));
 }
 


Re: [PATCH 2/3] RISC-V: Support zicsr and zifencei extension for -march.

2020-11-17 Thread Jim Wilson
On Thu, Nov 12, 2020 at 11:29 PM Kito Cheng  wrote:

>  - CSR related instructions and fence instructions has to be splitted from
>baseline ISA, zicsr and zifencei are corresponding sub-extension.
>

It is actually only fence.i that is split off.  fence is still part of the
base ISA.  This is why it is called zifencei.

diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
> index 738556539f6..2aaa8e96451 100644
> --- a/gcc/config/riscv/riscv.c
> +++ b/gcc/config/riscv/riscv.c
> @@ -3337,6 +3337,9 @@ riscv_memmodel_needs_amo_acquire (enum memmodel
> model)
>  static bool
>  riscv_memmodel_needs_release_fence (enum memmodel model)
>  {
> +  if (!TARGET_ZIFENCEI)
> +return false;
> +
>switch (model)
>  {
>case MEMMODEL_ACQ_REL:
>

This part looks wrong, as riscv_memmodel_needs_release_fence is only used
for fence instructions, not for fence.i.

> diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> index f15bad3b29e..756b35fb8c0 100644
> --- a/gcc/config/riscv/riscv.md
> +++ b/gcc/config/riscv/riscv.md
> @@ -1543,19 +1543,20 @@
>  LCT_NORMAL, VOIDmode, operands[0], Pmode,
>  operands[1], Pmode, const0_rtx, Pmode);
>  #else
> -  emit_insn (gen_fence_i ());
> +  if (TARGET_ZIFENCEI)
> +emit_insn (gen_fence_i ());
>  #endif
>DONE;
>  })
>
>  (define_insn "fence"
>[(unspec_volatile [(const_int 0)] UNSPECV_FENCE)]
> -  ""
> +  "TARGET_ZIFENCEI"
>"%|fence%-")
>
>  (define_insn "fence_i"
>[(unspec_volatile [(const_int 0)] UNSPECV_FENCE_I)]
> -  ""
> +  "TARGET_ZIFENCEI"
>"fence.i")
>
>  ;;
>

The fence_i and clear_cache patterns are OK.  The fence pattern change is
wrong.

You didn't change sync.md, but it only uses fence, so it needs no change.
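
To illustrate the distinction at the source level, here is a small sketch. It assumes, as the quoted clear_cache pattern suggests, that `__builtin___clear_cache` is what ends up needing fence.i on RISC-V, and that an atomic thread fence lowers to a plain fence; the function names are hypothetical.

```cpp
#include <atomic>

// Ordering ordinary data accesses.  On RISC-V this is expected to use
// `fence`, which stays in the base ISA and must not depend on zifencei.
void publish(std::atomic<int>& flag, int& payload, int value)
{
  payload = value;
  std::atomic_thread_fence(std::memory_order_release);
  flag.store(1, std::memory_order_relaxed);
}

// Making earlier stores visible to instruction fetch.  This is the
// operation that maps to `fence.i`, i.e. the zifencei extension.
void flush_icache(char* begin, char* end)
{
  __builtin___clear_cache(begin, end);
}
```

Only the second function should be affected by whether zifencei is present in -march.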

Jim


Re: [AArch64] Add --with-tune configure flag

2020-11-17 Thread Pop, Sebastian via Gcc-patches
Hi,

here is a follow-up patch to add missing Arm64 configure flags as aliases to 
the existing flags.

gcc/
* config.gcc: add configure flags --with-{cpu,arch,tune}-{32,64}
as alias flags for --with-{cpu,arch,tune} on AArch64.
* doc/install.texi: Document new flags for aarch64.

Tested on aarch64-linux with bootstrap and regression test.
Ok to commit to trunk and back-port to active branches?

Thanks,
Sebastian



0001-AArch64-add-with-cpu-arch-tune-32-64-as-alias-flags-.patch
Description: 0001-AArch64-add-with-cpu-arch-tune-32-64-as-alias-flags-.patch


extend cache_integer_cst

2020-11-17 Thread Nathan Sidwell

This modules-related patch extends cache_integer_cst.  Currently, when
given a small cst, that cst is added to the type's small cache and /must
not/ already be there.  Large values are fine if they are already in
the large cache.  This adds a parameter to indicate small duplicates
are ok, and it returns the cached value -- either what was already
there, or the newly inserted const.
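
The new contract for small values can be modelled with a toy cache. This is a sketch under assumptions: `Cst` and `cache_cst` are hypothetical, and a `std::map` stands in for the per-type vector of cached values.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Hypothetical stand-in for an INTEGER_CST node.
struct Cst { long value; };

// Insert T into CACHE and return the cached constant, which may or may
// not be T.  Finding an existing entry is only legal when the caller
// passed might_duplicate = true.
Cst* cache_cst(std::map<long, Cst*>& cache, Cst* t,
               bool might_duplicate = false)
{
  std::pair<std::map<long, Cst*>::iterator, bool> res
    = cache.insert(std::make_pair(t->value, t));
  if (!res.second)
    {
      assert(might_duplicate);   // models the gcc_checking_assert
      return res.first->second;  // hand back the existing constant
    }
  return t;                      // T itself is now the cached constant
}
```

Existing callers that rely on the default argument keep the old "must not already be there" behaviour.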

gcc/
* tree.h (cache_integer_cst): Add defaulted might_duplicate parm.
* tree.c (cache_integer_cst): Return the integer cst, add
might_duplicate parm to permit finding a small duplicate.

applying to trunk

--
Nathan Sidwell
diff --git i/gcc/tree.c w/gcc/tree.c
index 569a9b9317b..004385548c9 100644
--- i/gcc/tree.c
+++ w/gcc/tree.c
@@ -1755,8 +1755,15 @@ wide_int_to_tree (tree type, const poly_wide_int_ref &value)
   return build_poly_int_cst (type, value);
 }
 
-void
-cache_integer_cst (tree t)
+/* Insert INTEGER_CST T into a cache of integer constants.  And return
+   the cached constant (which may or may not be T).  If MIGHT_DUPLICATE
+   is false, and T falls into the type's 'smaller values' range, there
+   cannot be an existing entry.  Otherwise, if MIGHT_DUPLICATE is true,
+   or the value is large, should an existing entry exist, it is
+   returned (rather than inserting T).  */
+
+tree
+cache_integer_cst (tree t, bool might_duplicate ATTRIBUTE_UNUSED)
 {
   tree type = TREE_TYPE (t);
   int ix = -1;
@@ -1770,7 +1777,7 @@ cache_integer_cst (tree t)
   switch (TREE_CODE (type))
 {
 case NULLPTR_TYPE:
-  gcc_assert (integer_zerop (t));
+  gcc_checking_assert (integer_zerop (t));
   /* Fallthru.  */
 
 case POINTER_TYPE:
@@ -1850,21 +1857,32 @@ cache_integer_cst (tree t)
 	  TYPE_CACHED_VALUES (type) = make_tree_vec (limit);
 	}
 
-  gcc_assert (TREE_VEC_ELT (TYPE_CACHED_VALUES (type), ix) == NULL_TREE);
-  TREE_VEC_ELT (TYPE_CACHED_VALUES (type), ix) = t;
+  if (tree r = TREE_VEC_ELT (TYPE_CACHED_VALUES (type), ix))
+	{
+	  gcc_checking_assert (might_duplicate);
+	  t = r;
+	}
+  else
+	TREE_VEC_ELT (TYPE_CACHED_VALUES (type), ix) = t;
 }
   else
 {
   /* Use the cache of larger shared ints.  */
   tree *slot = int_cst_hash_table->find_slot (t, INSERT);
-  /* If there is already an entry for the number verify it's the
- same.  */
-  if (*slot)
-	gcc_assert (wi::to_wide (tree (*slot)) == wi::to_wide (t));
+  if (tree r = *slot)
+	{
+	  /* If there is already an entry for the number verify it's the
+	 same value.  */
+	  gcc_checking_assert (wi::to_wide (tree (r)) == wi::to_wide (t));
+	  /* And return the cached value.  */
+	  t = r;
+	}
   else
 	/* Otherwise insert this one into the hash table.  */
 	*slot = t;
 }
+
+  return t;
 }
 
 
diff --git i/gcc/tree.h w/gcc/tree.h
index bea3e16c091..20f66a02403 100644
--- i/gcc/tree.h
+++ w/gcc/tree.h
@@ -5124,7 +5124,7 @@ extern const_tree strip_invariant_refs (const_tree);
 extern tree lhd_gcc_personality (void);
 extern void assign_assembler_name_if_needed (tree);
 extern bool warn_deprecated_use (tree, tree);
-extern void cache_integer_cst (tree);
+extern tree cache_integer_cst (tree, bool might_duplicate = false);
 extern const char *combined_fn_name (combined_fn);
 
 /* Compare and hash for any structure which begins with a canonical


[Patch, Fortran, committed]

2020-11-17 Thread Harald Anlauf
Committed to master as obvious.

Thanks,
Harald


Fortran texi: Fix description of GFC_RTCHECK_* macros.

gcc/fortran/ChangeLog:

* gfortran.texi: Fix description of GFC_RTCHECK_* to match actual
code.
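
The corrected values form a plain bitmask, one bit per check. The sketch below restates them as a C++ enum using the values from the fixed documentation text; the helper `rtcheck_enabled` is hypothetical and only shows how the or-ed value is meant to be read.

```cpp
// Values as listed in the corrected documentation.
enum
{
  GFC_RTCHECK_BOUNDS      = 1 << 0,  // 1
  GFC_RTCHECK_ARRAY_TEMPS = 1 << 1,  // 2
  GFC_RTCHECK_RECURSION   = 1 << 2,  // 4
  GFC_RTCHECK_DO          = 1 << 3,  // 8
  GFC_RTCHECK_POINTER     = 1 << 4,  // 16
  GFC_RTCHECK_MEM         = 1 << 5,  // 32
  GFC_RTCHECK_BITS        = 1 << 6   // 64
};

// True if FLAG is set in the or-ed OPTIONS value.
bool rtcheck_enabled(unsigned options, unsigned flag)
{
  return (options & flag) != 0;
}
```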


diff --git a/gcc/fortran/gfortran.texi b/gcc/fortran/gfortran.texi
index 453b30f7c61..3b27217d369 100644
--- a/gcc/fortran/gfortran.texi
+++ b/gcc/fortran/gfortran.texi
@@ -3866,8 +3866,8 @@ initialization using @code{_gfortran_set_args}.
 Default: enabled.
 @item @var{option}[6] @tab Enables run-time checking.  Possible values
 are (bitwise or-ed): GFC_RTCHECK_BOUNDS (1), GFC_RTCHECK_ARRAY_TEMPS (2),
-GFC_RTCHECK_RECURSION (4), GFC_RTCHECK_DO (16), GFC_RTCHECK_POINTER (32),
-GFC_RTCHECK_BITS (64).
+GFC_RTCHECK_RECURSION (4), GFC_RTCHECK_DO (8), GFC_RTCHECK_POINTER (16),
+GFC_RTCHECK_MEM (32), GFC_RTCHECK_BITS (64).
 Default: disabled.
 @item @var{option}[7] @tab Unused.
 @item @var{option}[8] @tab Show a warning when invoking @code{STOP} and



Re: [PATCH 3/3] RISC-V: Support version controling for ISA standard extensions

2020-11-17 Thread Jim Wilson
On Thu, Nov 12, 2020 at 11:28 PM Kito Cheng  wrote:

> +#ifndef HAVE_AS_MARCH_ZIFENCE
> +  /* Skip since older binutils don't recognize zifencei,
> + we mad a mistake that is binutils 2.35 support zicsr but not support
> + zifencei.  */
> +  skip_zifencei = true;
> +#endif
>

I'd suggest something like "Skip since older binutils doesn't recognize
zifencei, we made a mistake in that binutils 2.35 supports zicsr but not
zifencei."

> diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
> index 172c7ca7c98..9dec5415eab 100644
> --- a/gcc/config/riscv/riscv.h
> +++ b/gcc/config/riscv/riscv.h
> @@ -70,13 +70,20 @@ extern const char *riscv_default_mtune (int argc,
> const char **argv);
>  #define TARGET_64BIT   (__riscv_xlen == 64)
>  #endif /* IN_LIBGCC2 */
>
> +#ifdef HAVE_AS_MISA_SPEC
> +#define ASM_MISA_SPEC ""
> +#else
> +#define ASM_MISA_SPEC "%{misa-spec=*}"
> +#endif
>

This is backwards.  We do want to pass -misa-spec to the assembler if it
supports it.  Or maybe you meant ifndef?
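
In other words, the intended shape is presumably the one sketched below, with the two branches swapped relative to the quoted hunk. The accessor `asm_misa_spec` is a hypothetical addition so the fragment can be compiled and checked on its own.

```cpp
// Define HAVE_AS_MISA_SPEC before this point to model an assembler that
// understands -misa-spec; configure would normally provide it.
#ifdef HAVE_AS_MISA_SPEC
#define ASM_MISA_SPEC "%{misa-spec=*}"   /* pass the option through */
#else
#define ASM_MISA_SPEC ""                 /* old assembler: drop it */
#endif

// Hypothetical accessor, only here so the result can be inspected.
const char* asm_misa_spec() { return ASM_MISA_SPEC; }
```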

Jim

