date:20160809

Re: [PATCH] Fix ldist memset discovery with -0.0 (PR tree-optimization/72824)

2016-08-09 Thread Jakub Jelinek

On Tue, Aug 09, 2016 at 08:53:54AM +0200, Richard Biener wrote:
> Hmm, I don't think we should see -0.0 as +0.0 with -fno-signed-zeros.
> As far as I can see this is a memory load/store op and we may not
> transform, say,
> 
>   double x = a[i];
>   b[i] = x;
> 
> into a copy that changes -0.0 to +0.0.  loop distribution looks for
> a value with all-zero bytes.

So shall I just drop the if (!HONOR_SIGNED_ZEROS (val)) return 0; from the
patch?  With that removed, it will not transform b[i] = -0.0; into
memset (b, 0, ...); even with -fno-signed-zeros.

> > 2016-08-08  Jakub Jelinek  
> > 
> > PR tree-optimization/72824
> > * tree-loop-distribution.c (const_with_all_bytes_same): If honoring
> > signed zeros, verify real_zerop is not negative.
> > 
> > * gcc.c-torture/execute/ieee/pr72824.c: New test.
> > 
> > --- gcc/tree-loop-distribution.c.jj 2016-07-16 10:41:04.0 +0200
> > +++ gcc/tree-loop-distribution.c2016-08-07 13:55:19.704681784 +0200
> > @@ -750,12 +750,40 @@ const_with_all_bytes_same (tree val)
> >int i, len;
> >  
> >if (integer_zerop (val)
> > -  || real_zerop (val)
> >|| (TREE_CODE (val) == CONSTRUCTOR
> >&& !TREE_CLOBBER_P (val)
> >&& CONSTRUCTOR_NELTS (val) == 0))
> >  return 0;
> >  
> > +  if (real_zerop (val))
> > +{
> > +  if (!HONOR_SIGNED_ZEROS (val))
> > +   return 0;
> > +  /* If honoring signed zeros, only return 0 for +0.0, not for -0.0.  
> > */
> > +  switch (TREE_CODE (val))
> > +   {
> > +   case REAL_CST:
> > + if (!real_isneg (TREE_REAL_CST_PTR (val)))
> > +   return 0;
> > + break;
> > +   case COMPLEX_CST:
> > + if (!const_with_all_bytes_same (TREE_REALPART (val))
> > + && !const_with_all_bytes_same (TREE_IMAGPART (val)))
> > +   return 0;
> > + break;
> > +   case VECTOR_CST:
> > + unsigned int j;
> > + for (j = 0; j < VECTOR_CST_NELTS (val); ++j)
> > +   if (const_with_all_bytes_same (val))
> > + break;
> > + if (j == VECTOR_CST_NELTS (val))
> > +   return 0;
> > + break;
> > +   default:
> > + break;
> > +   }
> > +}
> > +
> >if (CHAR_BIT != 8 || BITS_PER_UNIT != 8)
> >  return -1;
> >  

Jakub

Re: [PATCH, COMMITTED] Add branch_changer.py script to maintainer-scripts

2016-08-09 Thread Gerald Pfeifer

On Wed, 3 Aug 2016, Martin Liška wrote:
> I've installed (r239066) the script which is used by maintainers to 
> update PRs in a batch mode.

I think it would be good to add a comment at the top that
describes what the scripts does and how to invoke the script?

Gerald

Re: [PATCH] Fix unaligned access when predictive commoning (PR 71083)

2016-08-09 Thread Richard Biener

On Mon, 8 Aug 2016, Bernd Edlinger wrote:

> Hi!
> 
> 
> As described in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71083
> we generate unaligned accesses because tree-predcom rewrites
> bitfield and packed accesses in a way that looses the proper
> alignment information.
> 
> The attached patch fixes this by re-using the gimple memory
> expression from the original access.
> 
> I was not sure, if any non-constant array references would be
> valid at where the ref_at_iteration is expanded, I set these
> to the array-lower-bound, as the DR_OFFSET contains the folded
> value of all non-constant array references.
> 
> The patch compensates for the constant offset from the outer
> reference to the inner reference, and replaces the inner
> reference instead of the outer reference.
> 
> 
> Boot-strapped and reg-tested on x86_64-linux-gnu.
> OK for trunk?

I don't think what you do is correct.  Consider

 for (i)
  for (j)
   {
 ... = a[i][j];
   }

predcom considers innermost loops only and thus the DRs will
be analyzed with respect to that which means DR_BASE_ADDRESS
is &a[i][0] but you end up generating (*(&a) + i * ..)[0][0]
which is at best suspicious with respect to further analyses
by data-ref and TBAA.  Also note that this may get alignment
wrong as well as changing [i] to [0] may lead to false alignment
(consider a [2][2] char array aligned to 4 bytes).

Your patch goes halfway back to code we had in the past
(looking at the 4.3 branch right now) which had numerous
issues (don't remember exactly).

I believe that if we'd handle bitfields by

  if (TREE_CODE (DR_REF (dr)) == COMPONENT_REF
  && DECL_BIT_FIELD_TYPE (TREE_OPERAND (DR_REF (dr), 1)))
ref = TREE_OPERAND (DR_REF (dr), 0);
  else
ref = DR_REF (dr);
  unsigned align = get_object_alignment (ref);

and use the type / alignment of that ref for the built MEM_REF
(with coff adjusted by the split bitfield component-ref offset)
and build a COMPONENT_REF around the MEM_REF to handle the
bitfield part this should work ok.

Richard.

> 
> Thanks
> Bernd.
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

Re: [PATCH] Fix ldist memset discovery with -0.0 (PR tree-optimization/72824)

2016-08-09 Thread Richard Biener

On Tue, 9 Aug 2016, Jakub Jelinek wrote:

> On Tue, Aug 09, 2016 at 08:53:54AM +0200, Richard Biener wrote:
> > Hmm, I don't think we should see -0.0 as +0.0 with -fno-signed-zeros.
> > As far as I can see this is a memory load/store op and we may not
> > transform, say,
> > 
> >   double x = a[i];
> >   b[i] = x;
> > 
> > into a copy that changes -0.0 to +0.0.  loop distribution looks for
> > a value with all-zero bytes.
> 
> So shall I just drop the if (!HONOR_SIGNED_ZEROS (val)) return 0; from the
> patch?  With that removed, it will not transform b[i] = -0.0; into
> memset (b, 0, ...); even with -fno-signed-zeros.

Yes, ok with that change (maybe add a comment that we do this on purpose).

Thanks,
Richard.

> > > 2016-08-08  Jakub Jelinek  
> > > 
> > >   PR tree-optimization/72824
> > >   * tree-loop-distribution.c (const_with_all_bytes_same): If honoring
> > >   signed zeros, verify real_zerop is not negative.
> > > 
> > >   * gcc.c-torture/execute/ieee/pr72824.c: New test.
> > > 
> > > --- gcc/tree-loop-distribution.c.jj   2016-07-16 10:41:04.0 
> > > +0200
> > > +++ gcc/tree-loop-distribution.c  2016-08-07 13:55:19.704681784 +0200
> > > @@ -750,12 +750,40 @@ const_with_all_bytes_same (tree val)
> > >int i, len;
> > >  
> > >if (integer_zerop (val)
> > > -  || real_zerop (val)
> > >|| (TREE_CODE (val) == CONSTRUCTOR
> > >&& !TREE_CLOBBER_P (val)
> > >&& CONSTRUCTOR_NELTS (val) == 0))
> > >  return 0;
> > >  
> > > +  if (real_zerop (val))
> > > +{
> > > +  if (!HONOR_SIGNED_ZEROS (val))
> > > + return 0;
> > > +  /* If honoring signed zeros, only return 0 for +0.0, not for -0.0. 
> > >  */
> > > +  switch (TREE_CODE (val))
> > > + {
> > > + case REAL_CST:
> > > +   if (!real_isneg (TREE_REAL_CST_PTR (val)))
> > > + return 0;
> > > +   break;
> > > + case COMPLEX_CST:
> > > +   if (!const_with_all_bytes_same (TREE_REALPART (val))
> > > +   && !const_with_all_bytes_same (TREE_IMAGPART (val)))
> > > + return 0;
> > > +   break;
> > > + case VECTOR_CST:
> > > +   unsigned int j;
> > > +   for (j = 0; j < VECTOR_CST_NELTS (val); ++j)
> > > + if (const_with_all_bytes_same (val))
> > > +   break;
> > > +   if (j == VECTOR_CST_NELTS (val))
> > > + return 0;
> > > +   break;
> > > + default:
> > > +   break;
> > > + }
> > > +}
> > > +
> > >if (CHAR_BIT != 8 || BITS_PER_UNIT != 8)
> > >  return -1;
> > >  
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

[PATCH] Fix PR 71802

2016-08-09 Thread Richard Biener


The following fixes a missed CFG cleanup opportunity (requiring its
iteration) that ultimatively causes us to create a dead if-converted
loop variant (discovered as dead by CFG cleanup run by ifcvt).

The issue is that when removing unreachable code CFG cleanup does not
schedule predecessors of path exits for revisiting but only the path
exit itself - this leads to a missed BB merging opportunity.  Fixed
by slightly re-organizing the BB merging code.

Bootstrapped / tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2016-08-09  Richard Biener  

PR tree-optimization/71802
* tree-cfgcleanup.c (cleanup_tree_cfg_bb): Make sure to catch
all merge opportunities with the predecessor.

* gcc.dg/torture/pr71802.c: New testcase.

Index: gcc/tree-cfgcleanup.c
===
*** gcc/tree-cfgcleanup.c   (revision 239164)
--- gcc/tree-cfgcleanup.c   (working copy)
*** cleanup_tree_cfg_bb (basic_block bb)
*** 641,664 
&& remove_forwarder_block (bb))
  return true;
  
/* Merging the blocks may create new opportunities for folding
   conditional branches (due to the elimination of single-valued PHI
   nodes).  */
!   if (single_succ_p (bb)
!   && can_merge_blocks_p (bb, single_succ (bb)))
  {
!   /* If there is a merge opportunity with the predecessor
!  do nothing now but wait until we process the predecessor.
!This happens when we visit BBs in a non-optimal order and
!avoids quadratic behavior with adjusting stmts BB pointer.  */
!   if (single_pred_p (bb)
! && can_merge_blocks_p (single_pred (bb), bb))
!   ;
!   else
!   {
! merge_blocks (bb, single_succ (bb));
! return true;
!   }
  }
  
return false;
--- 641,665 
&& remove_forwarder_block (bb))
  return true;
  
+   /* If there is a merge opportunity with the predecessor
+  do nothing now but wait until we process the predecessor.
+  This happens when we visit BBs in a non-optimal order and
+  avoids quadratic behavior with adjusting stmts BB pointer.  */
+   if (single_pred_p (bb)
+   && can_merge_blocks_p (single_pred (bb), bb))
+ /* But make sure we _do_ visit it.  When we remove unreachable paths
+ending in a backedge we fail to mark the destinations predecessors
+as changed.  */
+ bitmap_set_bit (cfgcleanup_altered_bbs, single_pred (bb)->index);
+ 
/* Merging the blocks may create new opportunities for folding
   conditional branches (due to the elimination of single-valued PHI
   nodes).  */
!   else if (single_succ_p (bb)
!  && can_merge_blocks_p (bb, single_succ (bb)))
  {
!   merge_blocks (bb, single_succ (bb));
!   return true;
  }
  
return false;
Index: gcc/testsuite/gcc.dg/torture/pr71802.c
===
*** gcc/testsuite/gcc.dg/torture/pr71802.c  (revision 0)
--- gcc/testsuite/gcc.dg/torture/pr71802.c  (working copy)
***
*** 0 
--- 1,37 
+ /* { dg-do compile } */
+ 
+ int b, c;
+ long d, f;
+ void fn1()
+ {
+   char g;
+   long long h = 0;
+   int *i;
+   if (0) {
+ L2:
+   b && (b = f);
+   d = 3;
+   for (; d;) {
+ char *j = &g;
+ c = *j = 0;
+ L3:
+ *j %= b;
+ for (; g <= 4;)
+   ;
+   }
+   goto L2;
+   }
+   for (; *i; *i = 1) {
+   if ((h -= 4) == (h != (b ?: d))) {
+ g = 3;
+ goto L3;
+   }
+   i = (int *)&h;
+   *i = f;
+   i = (int *)&f;
+   if ((h && 6) - (h = 0))
+   goto L2;
+   }
+   for (; d;)
+ goto L3;
+ }

[PATCH, libgo]: Fix syscall test failure on CentOS 5.11

2016-08-09 Thread Uros Bizjak

Hello!

syscall test fails on CentOS 5.11 with:

exec_linux_test.go:167:25: error: reference to undefined identifier
'syscall.CLONE_NEWNET'
   Unshareflags: syscall.CLONE_NEWNET,
 ^
FAIL: syscall

Attached patch fixes the failure by providing CLONE_NEWNET definition.

Tested on x86_64-linux-gnu.

Uros.
diff --git a/libgo/mksysinfo.sh b/libgo/mksysinfo.sh
index 026bef2..ac5f086 100755
--- a/libgo/mksysinfo.sh
+++ b/libgo/mksysinfo.sh
@@ -1475,6 +1475,9 @@ grep '^const _CLONE_' gen-sysinfo.go | \
 if ! grep '^const CLONE_NEWUSER ' ${OUT} > /dev/null 2>&1; then
   echo "const CLONE_NEWUSER = 0x1000" >> ${OUT}
 fi
+if ! grep '^const CLONE_NEWNET ' ${OUT} > /dev/null 2>&1; then
+  echo "const CLONE_NEWNET = 0x4000" >> ${OUT}
+fi
 
 # Struct sizes.
 set cmsghdr Cmsghdr ip_mreq IPMreq ip_mreqn IPMreqn ipv6_mreq IPv6Mreq \

Re: [PATCH] Improve backwards threading

2016-08-09 Thread Richard Biener

On Fri, 5 Aug 2016, Jeff Law wrote:

> On 08/05/2016 07:36 AM, Richard Biener wrote:
> > @@ -560,6 +562,14 @@ fsm_find_control_statement_thread_paths
> >   edge taken_edge = profitable_jump_thread_path (path, bbi, name,
> > arg);
> >   if (taken_edge)
> > {
> > + /* If the taken edge is a loop entry avoid mashing two
> > +loops into one with multiple latches by splitting
> > +the edge.  This only works if that block won't become
> > +a latch of this loop.  */
> > + if ((bb_loop_depth (taken_edge->src)
> > +  < bb_loop_depth (taken_edge->dest))
> > + && ! single_succ_p (bbi))
> > +   split_edge (taken_edge);
> >   if (bb_loop_depth (taken_edge->src)
> >   >= bb_loop_depth (taken_edge->dest))
> > convert_and_register_jump_thread_path (path, taken_edge);
> > 
> > note you need the PR72772 fix to trigger all this.
> I'm a little confused here.  In the case where the taken edge goes into a
> deeper loop nest you're splitting the edge -- to what end?  The backwards
> threader isn't going to register that jump thread.  So if this is fixing
> something, then we've got the fix in the wrong place.

Note that the case is not going "into" a deeper loop nest but as the
threading path ends at taken_edge it is threading to a loop header of
an inner loop.  And yes, the fix doesn't work (not sure how I thought
it could).  But the condition also doesn't make sense to me and we
miss a guard for the case I added a comment for - generating wrong-code
because loop meta data is wrong after the transform (I think this
may not only apply to the niter upper bound but also things like
safelen).

> FWIW, I the anonymous name fix ought to go in separately, it looks like the
> right thing to do independent of this stuff.

I have applied that part now.

I'm not sure what to do with the pre-header creation fix (PR72772) now,
but I am considering to put that into the tree regardless of the
"fallout" (a few FAILs for transforms we no longer perform).  I spent
half a week now but am quite lost with the threading cases (which
I think are latent issues).

Richard.

Re: [RFC] ipa bitwise constant propagation

2016-08-09 Thread Prathamesh Kulkarni

On 8 August 2016 at 19:33, Martin Jambor  wrote:
> Hi,
>
> thanks for following through.  You'll need an approval from Honza, but
> I think the code looks good (I have looked at the places that I
> believe have changed since the last week).  However, I have discovered
> one new thing I don't like and still believe you need to handle
> different precisions in lattice need:
>
> On Mon, Aug 08, 2016 at 03:08:35AM +0530, Prathamesh Kulkarni wrote:
>> On 5 August 2016 at 18:06, Martin Jambor  wrote:
>>
>> ...
>>
>> >> diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
>> >> index 5b6cb9a..b770f6a 100644
>> >> --- a/gcc/ipa-cp.c
>> >> +++ b/gcc/ipa-cp.c
>> >> @@ -120,6 +120,7 @@ along with GCC; see the file COPYING3.  If not see
>> >>  #include "params.h"
>> >>  #include "ipa-inline.h"
>> >>  #include "ipa-utils.h"
>> >> +#include "tree-ssa-ccp.h"
>> >>
>> >>  template  class ipcp_value;
>> >>
>> >> @@ -266,6 +267,40 @@ private:
>> >>bool meet_with_1 (unsigned new_align, unsigned new_misalign);
>> >>  };
>> >>
>> >> +/* Lattice of known bits, only capable of holding one value.
>> >> +   Similar to ccp_prop_value_t, mask represents which bits of value are 
>> >> constant.
>> >> +   If a bit in mask is set to 0, then the corresponding bit in
>> >> +   value is known to be constant.  */
>> >> +
>> >> +class ipcp_bits_lattice
>> >> +{
>> >> +public:
>> >> +  bool bottom_p () { return lattice_val == IPA_BITS_VARYING; }
>> >> +  bool top_p () { return lattice_val == IPA_BITS_UNDEFINED; }
>> >> +  bool constant_p () { return lattice_val == IPA_BITS_CONSTANT; }
>> >> +  bool set_to_bottom ();
>> >> +  bool set_to_constant (widest_int, widest_int, signop, unsigned);
>> >> +
>> >> +  widest_int get_value () { return value; }
>> >> +  widest_int get_mask () { return mask; }
>> >> +  signop get_sign () { return sgn; }
>> >> +  unsigned get_precision () { return precision; }
>> >> +
>> >> +  bool meet_with (ipcp_bits_lattice& other, enum tree_code, tree);
>> >> +  bool meet_with (widest_int, widest_int, signop, unsigned);
>> >> +
>> >> +  void print (FILE *);
>> >> +
>> >> +private:
>> >> +  enum { IPA_BITS_UNDEFINED, IPA_BITS_CONSTANT, IPA_BITS_VARYING } 
>> >> lattice_val;
>> >> +  widest_int value, mask;
>> >> +  signop sgn;
>> >> +  unsigned precision;
>
> I know that the existing code in ipa-cp.c does not do this, but please
> prefix member variables with "m_" like our coding style guidelines
> suggest (or even require?).  You routinely reuse those same names in
> names of parameters of meet_with and I believe that is a practice that
> will sooner or later lead to confusing the two and bugs.
Sorry about this, will change to m_ prefix in followup patch.
>
>> >> +
>> >> +  bool meet_with_1 (widest_int, widest_int);
>> >> +  void get_value_and_mask (tree, widest_int *, widest_int *);
>> >> +};
>> >> +
>> >>  /* Structure containing lattices for a parameter itself and for pieces of
>> >> aggregates that are passed in the parameter or by a reference in a 
>> >> parameter
>> >> plus some other useful flags.  */
>> >> @@ -281,6 +316,8 @@ public:
>> >>ipcp_agg_lattice *aggs;
>> >>/* Lattice describing known alignment.  */
>> >>ipcp_alignment_lattice alignment;
>> >> +  /* Lattice describing known bits.  */
>> >> +  ipcp_bits_lattice bits_lattice;
>> >>/* Number of aggregate lattices */
>> >>int aggs_count;
>> >>/* True if aggregate data were passed by reference (as opposed to by
>> >> @@ -458,6 +495,21 @@ ipcp_alignment_lattice::print (FILE * f)
>> >>  fprintf (f, " Alignment %u, misalignment %u\n", align, 
>> >> misalign);
>> >>  }
>> >>
>>
>> ...
>>
>> >> +/* Convert operand to value, mask form.  */
>> >> +
>> >> +void
>> >> +ipcp_bits_lattice::get_value_and_mask (tree operand, widest_int *valuep, 
>> >> widest_int *maskp)
>> >> +{
>> >> +  wide_int get_nonzero_bits (const_tree);
>> >> +
>> >> +  if (TREE_CODE (operand) == INTEGER_CST)
>> >> +{
>> >> +  *valuep = wi::to_widest (operand);
>> >> +  *maskp = 0;
>> >> +}
>> >> +  else if (TREE_CODE (operand) == SSA_NAME)
>> >
>> > IIUC, operand is the operand from pass-through jump function and that
>> > should never be an SSA_NAME.  I have even looked at how we generate
>> > them and it seems fairly safe to say that they never are.  If you have
>> > seen an SSA_NAME here, it is a bug and please let me know because
>> > sooner or later it will cause an assert.
>> >
>> >> +{
>> >> +  *valuep = 0;
>> >> +  *maskp = widest_int::from (get_nonzero_bits (operand), UNSIGNED);
>> >> +}
>> >> +  else
>> >> +gcc_unreachable ();
>> >
>> > The operand however can be any any other is_gimple_ip_invariant tree.
>> > I assume that you could hit this gcc_unreachable only in a program
>> > with undefined behavior (or with a Fortran CONST_DECL?) but you should
>> > not ICE here.
>> Changed to:
>> if (TREE_CODE (operand) == INTEGER_CST)
>> {
>>   *valuep = wi::to_widest (operand);
>>   *maskp = 0;
>> }
>>   els

Re: [PATCH][expr.c] PR middle-end/71700: zero-extend sub-word value when widening constructor element

2016-08-09 Thread Richard Biener

On Fri, 5 Aug 2016, Kyrill Tkachov wrote:

> 
> On 01/08/16 15:31, Kyrill Tkachov wrote:
> > 
> > On 11/07/16 18:55, Bernd Schmidt wrote:
> > > On 07/11/2016 04:52 PM, Kyrill Tkachov wrote:
> > > > Based on that, I think that code block is a useful optimisation, we just
> > > > need
> > > > to take care with immediates.
> > > > 
> > > > What do you think?
> > > 
> > > Yeah, I think the patch is ok.
> > > 
> > 
> > This patch (https://gcc.gnu.org/ml/gcc-patches/2016-07/msg00017.html) has
> > been in trunk
> > for a month with no reported problems.
> > 
> > I'd like to backport it to the GCC 5 and 6 branches.
> > I've bootstrapped and tested it there on arm-none-linux-gnueabihf.
> > 
> 
> Sorry for the early ping but given that GCC 6.2 is due shortly and I'm away
> next week,
> could I please backport this patch to the GCC 6 branch?

Generally patches approved for the trunk that fix regressions can be
backported without further approval.

Richard.

> Thanks,
> Kyrill
> 
> > Ok?
> > 
> > Thanks,
> > Kyrill
> > 
> > > 
> > > Bernd
> > > 
> > 
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

[PATCH 2/N] Fix usage of POW2 histogram

2016-08-09 Thread Martin Liška

Another small follow-up changes instrumentation of MOD exprs
to not instrument divisors different from SSA_NAME.

Patch survives:
make check -k -j10 RUNTESTFLAGS="tree-prof.exp"

Ready for trunk?
Thanks,
Martin
>From 00aecc0dd74c4546a1882bdbbb0f66fbf39a5408 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Mon, 8 Aug 2016 19:18:11 +0200
Subject: [PATCH 2/2] Fix usage of POW2 histogram

gcc/ChangeLog:

2016-08-08  Martin Liska  

	* value-prof.c (gimple_divmod_values_to_profile): Do not
	instrument MOD histogram if a value is not a SSA name.

gcc/testsuite/ChangeLog:

2016-08-08  Martin Liska  

	* gcc.dg/tree-prof/val-prof-9.c: New test.
---
 gcc/testsuite/gcc.dg/tree-prof/val-prof-9.c | 18 ++
 gcc/value-prof.c|  3 ++-
 2 files changed, 20 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-prof/val-prof-9.c

diff --git a/gcc/testsuite/gcc.dg/tree-prof/val-prof-9.c b/gcc/testsuite/gcc.dg/tree-prof/val-prof-9.c
new file mode 100644
index 000..8fc2301
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-prof/val-prof-9.c
@@ -0,0 +1,18 @@
+/* { dg-options "-O0 -fdump-tree-optimized" } */
+
+int
+main (int argc, char **argv)
+{
+  unsigned u = (argc - 1);
+  int counter = 0;
+
+  for (unsigned i = 0; i < 100; i++)
+  {
+counter += u % 16;
+  }
+
+  return counter;
+}
+
+/* autofdo does not do value profiling so far */
+/* { dg-final-use-not-autofdo { scan-tree-dump-times "__gcov_pow2_profiler" 0 "optimized" } } */
diff --git a/gcc/value-prof.c b/gcc/value-prof.c
index 0527c2c..a4653aa 100644
--- a/gcc/value-prof.c
+++ b/gcc/value-prof.c
@@ -1950,7 +1950,8 @@ gimple_divmod_values_to_profile (gimple *stmt, histogram_values *values)
   /* For mod, check whether it is not often a noop (or replaceable by
 	 a few subtractions).  */
   if (gimple_assign_rhs_code (stmt) == TRUNC_MOD_EXPR
-	  && TYPE_UNSIGNED (type))
+	  && TYPE_UNSIGNED (type)
+	  && TREE_CODE (divisor) == SSA_NAME)
 	{
   tree val;
   /* Check for a special case where the divisor is power of 2.  */
-- 
2.9.2

Re: [PATCH, LRA] PR71680, Reload of slow mems

2016-08-09 Thread Alan Modra

On Tue, Aug 02, 2016 at 11:02:56PM +0930, Alan Modra wrote:
> This is a patch for a problem in lra, triggered by the rs6000
> backend not allowing SImode in floating point registers.

Ping?  https://gcc.gnu.org/ml/gcc-patches/2016-08/msg00113.html

Note that to recreate the problem with the comment #1 testcase in the
PR you'll need to revert svn 239011, 239012, 239013 and 239217.

-- 
Alan Modra
Australia Development Lab, IBM

Re: [PATCH] PR middle-end/21137: Folding if (((int)x >> 31) & 64) into if ((int)x < 0)

2016-08-09 Thread Richard Biener

On Mon, Aug 8, 2016 at 7:04 PM, Jeff Law  wrote:
> On 08/08/2016 05:58 AM, Marek Polacek wrote:
>>
>> On Mon, Aug 08, 2016 at 12:56:20PM +0100, Roger Sayle wrote:
>>>
>>>
>>> The following patch is an attempt to finally fully close PR
>>> middle-end/21137.
>>> As explained in the PR, my original patch from 2006 didn't handle the
>>> case
>>> where there's a sign preserving NOP in the tree.  Easily fixed by calling
>>> tree_strip_nop_conversions at the appropriate point in fold-const.c.
>>> Most of this patch is the resulting re-indentation.
>>
>>
>> Shouldn't this be rather handled through the match.pd interface?
>
> I don't see this transformation in match.pd.  So the whole transformation
> would need to be pulled out of fold-const.c and reimplemented in the
> match.pd framework.
>
> Roger, are you up for that?  match.pd is certainly where we want much of
> this kind of stuff happening.

Moving shouldn't be a requirement for improving things here, so I think
the patch is ok as-is.

Richard.

> jeff

Re: [PATCH build/doc] Replacing libiberty with gnulib

2016-08-09 Thread Pedro Alves

On 08/06/2016 05:34 AM, ayush goel wrote:
> On 5 August 2016 at 4:09:00 AM, Pedro Alves (pal...@redhat.com) wrote:
>> On 08/02/2016 12:38 AM, Manuel López-Ibáñez wrote:
>>>
>>> If there is something wrong or missing, ideally we would like to know
>>> so that Ayush can work on fixing it before the Summer of Code is over
>>> in less than two weeks.
>>
>> I couldn't see anywhere gnulib's config.h file is included.
>>
>> gnulib's replacement headers and (and .c files) depend on
>> that being included.
>>
>> Did I simply miss it?
>>
> 
> gnulib’s config.h is created on compile time. After building the
> library it is present inside gnulib build folder.

Sure, but that was not the question.  The question is how are
the gcc files including that new config.h file.

E.g., how come you're not getting this:

 gcc/foo.c
   -> #include "config.h" (pick up gcc's config.h not the new gnulib one)
   -> #include  (e.g., #include 
)
 -> trip on #error in gnulib replacement header:
#ifndef _GL_INLINE_HEADER_BEGIN
 #error "Please include config.h first."
#endif

As explained here:

  https://gcc.gnu.org/ml/gcc/2016-06/msg00144.html

and here in more detail:

  https://sourceware.org/ml/gdb-patches/2012-04/msg00426.html

the scheme of configuring gnulib in a separate directory as borrowed from gdb
requires including two config.h headers -- the gnulib client's, and gnulib's.

Did you do something different that avoids needing that somehow?

In gdb, .c files don't include "config.h" directly.  Instead all .c files
include a "defs.h" file first thing, and that in turn (after another 
indirection)
is what includes both gdb's "config.h" and gnulib's "config.h":

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob;f=gdb/common/common-defs.h;h=2c9411704531b510d176a4a22a718ae8313294e7;hb=HEAD#l23

Thanks,
Pedro Alves

RE: [PATCH] Fix (parts of) PR68273

2016-08-09 Thread Matthew Fortune

Richard Biener  writes:
> The following patch avoids overaligned types created from IPA parameter
> replacement.  It is said to help mipsel which still suffers from the
> backend-looks-at-type-alignment-for-parameter-passing-ABI bug.
> 
> Bootstrap and regtest pending on x86_64-unknown-linux-gnu.
> 
> mips testing appreciated.

Thanks Richard.

Regtest on mips-linux-gnu for mips32r2 is fine. My bootstrap failed for
unrelated reasons.

Matthew

Re: Set nonnull attribute to ptr_info_def based on VRP

2016-08-09 Thread Richard Biener

On Tue, Aug 9, 2016 at 12:58 AM, kugan
 wrote:
> Hi Jakub,
>
> Thanks for the review.
>
> On 08/08/16 16:40, Jakub Jelinek wrote:
>>
>> On Mon, Aug 08, 2016 at 01:36:51PM +1000, kugan wrote:
>>>
>>> diff --git a/gcc/tree-ssanames.h b/gcc/tree-ssanames.h
>>> index c81b1a1..6e34433 100644
>>> --- a/gcc/tree-ssanames.h
>>> +++ b/gcc/tree-ssanames.h
>>> @@ -43,6 +43,9 @@ struct GTY(()) ptr_info_def
>>>   above alignment.  Access only through the same helper functions as
>>> align
>>>   above.  */
>>>unsigned int misalign;
>>> +  /* When this pointer is knonw to be nnonnull this would be true
>>> otherwise
>>> + false.  */
>>> +  bool  nonnull_p;
>>>  };
>>
>>
>> Why do you need this?  Doesn't the pt.null bit represent that already?
>
>
> It looks like I can use this. As in gcc/tree-ssa-alias.h:
>
>   /* Nonzero if the points-to set includes 'nothing', the points-to set
>  includes memory at address NULL.  */
>   unsigned int null : 1;
>
> But in gcc/tree-ssa-alias.c, ptrs_compare_unequal has the following comment
> which says:
>
> /* ???  We'd like to handle ptr1 != NULL and ptr1 != ptr2
>  but those require pt.null to be conservatively correct.  */
>
> Does that means it can be wrong at times? I haven't looked it in detail yet
> but if it is, it would be a problem.

Yes, this bit is not correctly computed for all cases (well, it works
for trivial
ones but for example misses weaks and maybe some other corner-cases).

Currently there are no consumers of this bit so the incorrectness
doesn't matter.

I suggest you simply use that bit but set it conservatively from PTA (to true)
for now and adjust it from VRP only.

I do have some patches for fixinig some of the .nonnull_p issues in PTA but
they come at a cost of more constraints.

Richard.

>> Also, formatting and spelling:
>> s/knonw/known/
>> s/nnon/non/
>> s/bool  /bool /
>
>
> I will change this.
>
> Thanks,
> Kugan
>>
>>
>> Jakub
>>
>

[RS6000] e500 part of pr71680

2016-08-09 Thread Alan Modra

The fallback part of HARD_REGNO_CALLER_SAVE_MODE, choose_hard_reg_mode,
returns DFmode for SImode when TARGET_E500_DOUBLE.  This confuses
lra when attempting to save ctr around a call.

Arseny, the bug reporter, has regression tested this patch.
OK to apply?

PR target/71680
* config/rs6000/rs6000.h (HARD_REGNO_CALLER_SAVE_MODE): Return
SImode for TARGET_E500_DOUBLE when given SImode.

diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index af77258..353f388 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -1273,6 +1273,8 @@ enum data_align { align_abi, align_opt, align_both };
&& ((MODE) == VOIDmode || ALTIVEC_OR_VSX_VECTOR_MODE (MODE))
\
&& FP_REGNO_P (REGNO)   \
? V2DFmode  \
+   : TARGET_E500_DOUBLE && (MODE) == SImode\
+   ? SImode\
: TARGET_E500_DOUBLE && ((MODE) == VOIDmode || (MODE) == DFmode)\
? DFmode\
: !TARGET_E500_DOUBLE && FLOAT128_IBM_P (MODE) && FP_REGNO_P (REGNO)
\

-- 
Alan Modra
Australia Development Lab, IBM

Re: [Fortran, patch, pr71936, v1] [6/7 Regression] ICE in wide_int_to_tree, at tree.c:1487

2016-08-09 Thread Richard Biener

On Mon, Aug 8, 2016 at 11:58 AM, Andre Vehreschild  wrote:
> Hi Paul,
>
> thanks for the review. Committed to trunk as r239237.
>
> I will commit to gcc-6 in one week.

If you want to catch the 6.2 release then please make sure to have it on the
branch this Friday.

Richard.

> Regards,
> Andre
>
> On Mon, 8 Aug 2016 11:33:06 +0200
> Paul Richard Thomas  wrote:
>
>> Hi Andre,
>>
>> That looks fine to me. OK for 6-branch and trunk.
>>
>> Thanks for the patch.
>>
>> Paul
>>
>> On 7 August 2016 at 15:32, Andre Vehreschild  wrote:
>> > Hi all,
>> >
>> > attached patch fixes the ICE by ensuring that when the SOURCE=/MOLD=
>> > expression is an array-valued function call with no ref, the ref of
>> > object to allocate is taken. The array properties nevertheless are
>> > taken from the function's result.
>> >
>> > Bootstraps and regtests ok on x86_64-linux-gnu/F23. Ok for trunk and
>> > gcc-6?
>> >
>> > Regards,
>> > Andre
>> > --
>> > Andre Vehreschild * Email: vehre ad gmx dot de
>>
>>
>>
>
>
> --
> Andre Vehreschild * Email: vehre ad gmx dot de

Re: [RFC] ipa bitwise constant propagation

2016-08-09 Thread Richard Biener

On Tue, 9 Aug 2016, Prathamesh Kulkarni wrote:

> On 8 August 2016 at 19:33, Martin Jambor  wrote:
> > Hi,
> >
> > thanks for following through.  You'll need an approval from Honza, but
> > I think the code looks good (I have looked at the places that I
> > believe have changed since the last week).  However, I have discovered
> > one new thing I don't like and still believe you need to handle
> > different precisions in lattice need:
> >
> > On Mon, Aug 08, 2016 at 03:08:35AM +0530, Prathamesh Kulkarni wrote:
> >> On 5 August 2016 at 18:06, Martin Jambor  wrote:
> >>
> >> ...
> >>
> >> >> diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
> >> >> index 5b6cb9a..b770f6a 100644
> >> >> --- a/gcc/ipa-cp.c
> >> >> +++ b/gcc/ipa-cp.c
> >> >> @@ -120,6 +120,7 @@ along with GCC; see the file COPYING3.  If not see
> >> >>  #include "params.h"
> >> >>  #include "ipa-inline.h"
> >> >>  #include "ipa-utils.h"
> >> >> +#include "tree-ssa-ccp.h"
> >> >>
> >> >>  template  class ipcp_value;
> >> >>
> >> >> @@ -266,6 +267,40 @@ private:
> >> >>bool meet_with_1 (unsigned new_align, unsigned new_misalign);
> >> >>  };
> >> >>
> >> >> +/* Lattice of known bits, only capable of holding one value.
> >> >> +   Similar to ccp_prop_value_t, mask represents which bits of value 
> >> >> are constant.
> >> >> +   If a bit in mask is set to 0, then the corresponding bit in
> >> >> +   value is known to be constant.  */
> >> >> +
> >> >> +class ipcp_bits_lattice
> >> >> +{
> >> >> +public:
> >> >> +  bool bottom_p () { return lattice_val == IPA_BITS_VARYING; }
> >> >> +  bool top_p () { return lattice_val == IPA_BITS_UNDEFINED; }
> >> >> +  bool constant_p () { return lattice_val == IPA_BITS_CONSTANT; }
> >> >> +  bool set_to_bottom ();
> >> >> +  bool set_to_constant (widest_int, widest_int, signop, unsigned);
> >> >> +
> >> >> +  widest_int get_value () { return value; }
> >> >> +  widest_int get_mask () { return mask; }
> >> >> +  signop get_sign () { return sgn; }
> >> >> +  unsigned get_precision () { return precision; }
> >> >> +
> >> >> +  bool meet_with (ipcp_bits_lattice& other, enum tree_code, tree);
> >> >> +  bool meet_with (widest_int, widest_int, signop, unsigned);
> >> >> +
> >> >> +  void print (FILE *);
> >> >> +
> >> >> +private:
> >> >> +  enum { IPA_BITS_UNDEFINED, IPA_BITS_CONSTANT, IPA_BITS_VARYING } 
> >> >> lattice_val;
> >> >> +  widest_int value, mask;
> >> >> +  signop sgn;
> >> >> +  unsigned precision;
> >
> > I know that the existing code in ipa-cp.c does not do this, but please
> > prefix member variables with "m_" like our coding style guidelines
> > suggest (or even require?).  You routinely reuse those same names in
> > names of parameters of meet_with and I believe that is a practice that
> > will sooner or later lead to confusing the two and bugs.
> Sorry about this, will change to m_ prefix in followup patch.
> >
> >> >> +
> >> >> +  bool meet_with_1 (widest_int, widest_int);
> >> >> +  void get_value_and_mask (tree, widest_int *, widest_int *);
> >> >> +};
> >> >> +
> >> >>  /* Structure containing lattices for a parameter itself and for pieces 
> >> >> of
> >> >> aggregates that are passed in the parameter or by a reference in a 
> >> >> parameter
> >> >> plus some other useful flags.  */
> >> >> @@ -281,6 +316,8 @@ public:
> >> >>ipcp_agg_lattice *aggs;
> >> >>/* Lattice describing known alignment.  */
> >> >>ipcp_alignment_lattice alignment;
> >> >> +  /* Lattice describing known bits.  */
> >> >> +  ipcp_bits_lattice bits_lattice;
> >> >>/* Number of aggregate lattices */
> >> >>int aggs_count;
> >> >>/* True if aggregate data were passed by reference (as opposed to by
> >> >> @@ -458,6 +495,21 @@ ipcp_alignment_lattice::print (FILE * f)
> >> >>  fprintf (f, " Alignment %u, misalignment %u\n", align, 
> >> >> misalign);
> >> >>  }
> >> >>
> >>
> >> ...
> >>
> >> >> +/* Convert operand to value, mask form.  */
> >> >> +
> >> >> +void
> >> >> +ipcp_bits_lattice::get_value_and_mask (tree operand, widest_int 
> >> >> *valuep, widest_int *maskp)
> >> >> +{
> >> >> +  wide_int get_nonzero_bits (const_tree);
> >> >> +
> >> >> +  if (TREE_CODE (operand) == INTEGER_CST)
> >> >> +{
> >> >> +  *valuep = wi::to_widest (operand);
> >> >> +  *maskp = 0;
> >> >> +}
> >> >> +  else if (TREE_CODE (operand) == SSA_NAME)
> >> >
> >> > IIUC, operand is the operand from pass-through jump function and that
> >> > should never be an SSA_NAME.  I have even looked at how we generate
> >> > them and it seems fairly safe to say that they never are.  If you have
> >> > seen an SSA_NAME here, it is a bug and please let me know because
> >> > sooner or later it will cause an assert.
> >> >
> >> >> +{
> >> >> +  *valuep = 0;
> >> >> +  *maskp = widest_int::from (get_nonzero_bits (operand), UNSIGNED);
> >> >> +}
> >> >> +  else
> >> >> +gcc_unreachable ();
> >> >
> >> > The operand however can be any any other is_gimple_ip_invariant tree.
> >> > I as

Re: [RFC] [ipa bitwise cp] tree-ssa-ccp changes

2016-08-09 Thread Richard Biener

On Mon, 8 Aug 2016, Prathamesh Kulkarni wrote:

> Hi Richard,
> In the attached version, I tried to address your suggestions from:
> https://gcc.gnu.org/ml/gcc-patches/2016-08/msg00279.html
> 
> In ccp_finalize we do:
> wide_int nonzero_bits = wide_int::from (val->mask, precision,
>   UNSIGNED) | val->value;
> 
> Similar to the change to extend_mask to extend based on signop,
> should this be changed to:
> wide_int::from (val->mask, precision, TYPE_SIGN (TREE_TYPE (val->value))) ?
> (although I guess we are narrowing the type here rather than extending).

Yeah, we're truncating here so it doesn't really matter what sign the
larger mask has.

The patch is ok if the IPA bits are approved.

Thanks,
Richard.

Re: [PATCH/AARCH64] Improve ThunderX code generation slightly with load/store pair

2016-08-09 Thread Richard Earnshaw (lists)

On 08/08/16 18:48, Andrew Pinski wrote:
> On Fri, Aug 5, 2016 at 12:18 AM, Andrew Pinski  wrote:
>> Hi,
>>   On ThunderX, load (and store) pair that does a pair of two word
>> (32bits) load/stores is slower in some cases than doing two
>> load/stores.  For some internal benchmarks, it provides a 2-5%
>> improvement.
>>
>> This patch disables the forming of the load/store pairs for SImode if
>> we are tuning for ThunderX.  I used the tuning flags route so it can
>> be overridden if needed later on or if someone else wants to use the
>> same method for their core.
>>
>> OK?  Bootstrapped and tested on aarch64-linux-gnu with no regressions.
> 
> 
> Here is a new version based on feedback both on the list and off.
> I added a check for alignment to greater than 8 bytes as that is
> alignment < 8 causes the slow down.
> I also added two new testcases testing this to make sure it did the
> load pair optimization when it is profitable.
> 
> OK?  Bootstrapped and tested on aarch64-linux-gnu with no regressions.
> 
> Thanks,
> Andrew Pinski
> 
> ChangeLog:
> * config/aarch64/aarch64-tuning-flags.def (slow_ldpw): New tuning option.
> * config/aarch64/aarch64.c (thunderx_tunings): Enable
> AARCH64_EXTRA_TUNE_SLOW_LDPW.
> (aarch64_operands_ok_for_ldpstp): Return false if
> AARCH64_EXTRA_TUNE_SLOW_LDPW and the mode was SImode
> and the alignment is less than 8 byte.
> (aarch64_operands_adjust_ok_for_ldpstp): Likewise.
> 
> testsuite/ChangeLog:
> * gcc.target/aarch64/thunderxloadpair.c: New testcase.
> * gcc.target/aarch64/thunderxnoloadpair.c: New testcase.
> 
>>
>> Thanks,
>> Andrew Pinski
>>
>> ChangeLog:
>> * config/aarch64/aarch64-tuning-flags.def (slow_ldpw): New tuning option.
>> * config/aarch64/aarch64.c (thunderx_tunings): Enable
>> AARCH64_EXTRA_TUNE_SLOW_LDPW.
>> (aarch64_operands_ok_for_ldpstp): Return false if
>> AARCH64_EXTRA_TUNE_SLOW_LDPW and the mode was SImode.
>> (aarch64_operands_adjust_ok_for_ldpstp): Likewise.
>>

OK with the changes noted below.

R.

>> stldpw.diff.txt
>>
>>
>> Index: config/aarch64/aarch64-tuning-flags.def
>> ===
>> --- config/aarch64/aarch64-tuning-flags.def  (revision 239228)
>> +++ config/aarch64/aarch64-tuning-flags.def  (working copy)
>> @@ -29,3 +29,4 @@
>>   AARCH64_TUNE_ to give an enum name. */
>>  
>>  AARCH64_EXTRA_TUNING_OPTION ("rename_fma_regs", RENAME_FMA_REGS)
>> +AARCH64_EXTRA_TUNING_OPTION ("slow_ldpw", SLOW_LDPW)

I think this should now be renamed SLOW_UNALIGNED_LDPW.  I think it also
should be commented as to what it does at this point.

>> Index: config/aarch64/aarch64.c
>> ===
>> --- config/aarch64/aarch64.c (revision 239228)
>> +++ config/aarch64/aarch64.c (working copy)
>> @@ -712,7 +712,7 @@
>>0,/* max_case_values.  */
>>0,/* cache_line_size.  */
>>tune_params::AUTOPREFETCHER_OFF,  /* autoprefetcher_model.  */
>> -  (AARCH64_EXTRA_TUNE_NONE) /* tune_flags.  */
>> +  (AARCH64_EXTRA_TUNE_SLOW_LDPW)/* tune_flags.  */
>>  };
>>  
>>  static const struct tune_params xgene1_tunings =
>> @@ -13593,6 +13593,15 @@
>>if (MEM_VOLATILE_P (mem_1) || MEM_VOLATILE_P (mem_2))
>>  return false;
>>  
>> +  /* If we have SImode and slow ldp, check the alignment to be greater
>> + than 8 byte. */

Comment doesn't match code.  Should be "at least 8 byte alignment".

>> +  if (mode == SImode
>> +  && (aarch64_tune_params.extra_tuning_flags
>> +  & AARCH64_EXTRA_TUNE_SLOW_LDPW)
>> +  && !optimize_size
>> +  && MEM_ALIGN (mem_1) < 8 * BITS_PER_UNIT)
>> +return false;
>> +
>>/* Check if the addresses are in the form of [base+offset].  */
>>extract_base_offset_in_addr (mem_1, &base_1, &offset_1);
>>if (base_1 == NULL_RTX || offset_1 == NULL_RTX)
>> @@ -13752,6 +13761,15 @@
>>  return false;
>>  }
>>  
>> +  /* If we have SImode and slow ldp, check the alignment to be greater
>> + than 8 byte. */

Likewise.

>> +  if (mode == SImode
>> +  && (aarch64_tune_params.extra_tuning_flags
>> +  & AARCH64_EXTRA_TUNE_SLOW_LDPW)
>> +  && !optimize_size
>> +  && MEM_ALIGN (mem_1) < 8 * BITS_PER_UNIT)
>> +return false;
>> +
>>if (REG_P (reg_1) && FP_REGNUM_P (REGNO (reg_1)))
>>  rclass_1 = FP_REGS;
>>else
>> Index: testsuite/gcc.target/aarch64/thunderxloadpair.c
>> ===
>> --- testsuite/gcc.target/aarch64/thunderxloadpair.c  (nonexistent)
>> +++ testsuite/gcc.target/aarch64/thunderxloadpair.c  (working copy)
>> @@ -0,0 +1,20 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2 -mcpu=thunderx" } */
>> +
>> +struct ldp
>> +{
>> +  long long c;
>> +  int a, b;
>> +};
>> +
>> +
>> +int f(struct ldp *a)
>> +{
>> +  return a->a + a->b;
>> +}
>> +
>> +
>> +/* We know the alignement of a->a to be 8 byte aligned so it is profitable
>> +   to do ldp. */
>> +/* { dg-final { scan-asse

Re: [PATCH PR71734] Add missed check that reference defined inside loop.

2016-08-09 Thread Richard Biener

On Tue, Aug 9, 2016 at 11:20 AM, Yuri Rumyantsev  wrote:
> Yes it is impossible since all basic blocks are handled from outer
> loops to innermost so we can not have the sequence with wrong
> dependence, i.e. we cached that reference is independent (due to
> safelen) but the same reference in outer loop must be evaluated as
> dependent. So we must re-evaluate only dependent references in loops
> having non-zero safelen attribute.

Hmm.  I don't like depending on this implementation detail.  Does the
attached patch work
which simply avoids any positive/negative caching on safelen affected
refs?  It also makes
the query cheaper by avoiding the dive into child loops.

Richard.

> 2016-08-09 11:44 GMT+03:00 Richard Biener :
>> On Fri, Aug 5, 2016 at 4:57 PM, Yuri Rumyantsev  wrote:
>>> Richard,
>>>
>>> I added additional check before caching dependencies since (1) all
>>> statements in loop are handled in loop postorder order, i.e. form
>>> outer to inner; (2) we can change dependency for reference in subloops
>>> which have non-zero safelen attribute. So I propose to re-evaluate it
>>> in such cases. I don't see why we need to avoid dependence caching for
>>> all loop nests since pragma omp simd is used very rarely.
>>
>> You think it is impossible to construct a testcase which hits the
>> correctness issue?
>> "very rarely" is not a good argument to generate wrong code.
>>
>> Richard.
>>
>>> 2016-08-05 16:50 GMT+03:00 Richard Biener :
 On Fri, Aug 5, 2016 at 3:28 PM, Yuri Rumyantsev  wrote:
> Richard,
>
> Here is updated patch which implements your proposal - I pass loop
> instead of stmt to determine either REF is defined inside LOOP nest or
> not. I checked that for pr70729-nest.cc the reference this->S_n  for
> statements which are out of innermost loop are  not considered as
> independent as you pointed out.
>
> Regression testing did not show any new failures and both failed tests
> from libgomp.fortran suite now passed.
>
> Is it OK for trunk?

 I don't quite understand

 +  /* Ignore dependence for loops having greater safelen.  */
 +  if (new_safelen == safelen
 +  && bitmap_bit_p (&ref->dep_loop, LOOP_DEP_BIT (loop->num, 
 stored_p)))
  return false;

 this seems to suggest (correctly I think) that we cannot rely on the 
 caching
 for safelen, neither for optimal results (you seem to address that) but 
 also
 not for correctness (we cache the no-dep result from a safelen run and
 then happily re-use that info for a ref that is not safe for safelen).

 It seems to me we need to avoid any caching if we made things independent
 because of safelen and simply not do the dep test afterwards.  this means
 inlining ref_indep_loop_p_1 partly into _2 (not sure if there's a great way
 to do this w/o confusing the control flow).

 Richard.

> ChangeLog:
> 2016-08-05  Yuri Rumyantsev  
>
> PR tree-optimization/71734
> * tree-ssa-loop-im.c (ref_indep_loop_p): Add new argument REF_LOOP.
> (outermost_indep_loop): Pass LOOP argumnet where REF was defined to
> ref_indep_loop_p.
> (ref_indep_loop_p_1): Fix commentary.
> (ref_indep_loop_p_2): Add additional argument REF_LOOP, introduce new
> variable NEW_SAFELEN which may have new value for SAFELEN, ignore
> dependencde for loop having greater safelen value, pass REF_LOOP in
> recursive call.
> (can_sm_ref_p): Pass LOOP as additional argument to ref_indep_loop_p.
>
> 2016-08-03 16:44 GMT+03:00 Richard Biener :
>> On Fri, Jul 29, 2016 at 4:00 PM, Yuri Rumyantsev  
>> wrote:
>>> Hi Richard.
>>>
>>> It turned out that the fix proposed by you does not work for liggomp
>>> tests simd3 and simd4.
>>> The reason is that we can't change safelen value for references not
>>> defined inside loop. So I add missed check on it to patch.
>>> Is it OK for trunk?
>>
>> Hmm, I don't like the walk of all subloops in ref_defined_in_loop_p as
>> that operation can end up being quadratic in the loop depth/width.
>>
>> But I also wonder about correctness given that LIM "commons"
>> references.  So we can have
>>
>>   for (;;)
>> .. = ref;  (1)
>> for (;;) // safelen == 2  (2)
>>   ... = ref;
>>
>> and when looking at the ref at (1) which according to you should not
>> have safelen applied your function will happily return that ref is 
>> defined
>> in the inner loop.
>>
>> So it looks like to be able to apply safelen the caller of 
>> ref_indep_loop_p
>> needs to pass down a ref plus a location (a stmt).  In which case your
>> function can simply use flow_loop_nested_p (loop, gimple_bb
>> (stmt)->loop_father);
>>
>> Richard.
>>
>>> ChangeLog:
>>> 2016-07-29  Yuri Rumyantsev  
>>>
>>> PR tree-optimization/71734
>>> * tr

Re: Implement -Wimplicit-fallthrough (take 2): the rest

2016-08-09 Thread Marek Polacek

On Thu, Aug 04, 2016 at 10:47:27AM -0600, Jeff Law wrote:
> On 08/04/2016 06:36 AM, Michael Matz wrote:
> > Hi,
> > 
> > On Wed, 27 Jul 2016, Marek Polacek wrote:
> > 
> > > And this is the rest.  Either I just adjusted a falls through comment,
> > > or I added __builtin_fallthrough ().  These were the cases where I was
> > > fairly sure that the fall through is intentional.
> > 
> > I saw one case where I think the warning is a bit over-active:
> > 
> > @@ -42072,6 +42089,7 @@ rdseed_step:
> >  case IX86_BUILTIN_ADDCARRYX64:
> >icode = CODE_FOR_addcarrydi;
> >mode0 = DImode;
> > +  gcc_fallthrough ();
> > 
> >  handlecarry:
> >arg0 = CALL_EXPR_ARG (exp, 0); /* unsigned char c_in.  */
> > 
> > I.e. it also warns if the following label is not a case label but a normal
> > one.  I don't think this counts as a classical fall-through and it IMHO
> > should not be warned about nor should it be marked.

Ok, I'll buy that.

> It's probably the same underlying issue I saw with a false-positive in one
> of the other patches.

Yes.  I'll adjust the warning to not warn for non-case labels.

Thanks,

Marek

Re: [RFC][PR61839]Convert CST BINOP COND_EXPR to COND_EXPR ? (CST BINOP 1) : (CST BINOP 0)

2016-08-09 Thread Richard Biener

On Tue, Aug 9, 2016 at 4:51 AM, Kugan Vivekanandarajah
 wrote:
> Hi Richard,
>
> Thanks for the review.
>
> On 29 April 2016 at 20:47, Richard Biener  wrote:
>> On Sun, Apr 17, 2016 at 1:14 AM, kugan
>>  wrote:
>>> As explained in PR61839,
>>>
>>> Following difference results in extra instructions:
>>> -  c = b != 0 ? 486097858 : 972195717;
>>> +  c = a + 972195718 >> (b != 0);
>>>
>>> As suggested in PR, attached patch converts CST BINOP COND_EXPR to COND_EXPR
>>> ? (CST BINOP 1) : (CST BINOP 0).
>>>
>>> Bootstrapped and regression tested for x86-64-linux-gnu with no new
>>> regression. Is this OK for statege-1.
>>
>> You are missing a testcase.
>>
>> I think the transform can be generalized to any two-value value-range by
>> instead of
>>
>>   lhs = cond_res ? (cst binop 1) : (cst binop 0)
>>
>> emitting
>>
>>   lhs = tmp == val1 ? (cst binop val1) : (cst binop val2);
>>
>> In the PR I asked the transform to be only carried out if cond_res and
>> tmp have a single use (and thus they'd eventually vanish).
>>
>> I'm not sure if a general two-value "constant" propagation is profitable
>> which is why I was originally asking for the pattern to only apply
>> if the resulting value is used in a comparison which we could then
>> in turn simplify by substituting COND_RES (or ! COND_RES) for it.
>> For the general two-value case we'd substitute it with tmp [=!]= val[12]
>> dependent on which constant is cheaper to test for.
>>
>> So I think this needs some exploring work on which way to go
>> and which transform is profitable in the end.  I think the general
>> two-value case feeding a condition will be always profitable.
>
>
> Please find a modified version which checks for two-valued variable
> and uses this to optimize. In the attached test case (in function
> bar), we end up doing the conversion twice.
>
> Bootstrapped and regression tested on x86_64-linux-gnu without no new
> regressions. Is this OK for trunk?

+/* Return true if VAR is a two-valued variable.  Set MIN and MAX when it is
+   true.  Return false otherwise.  */
+
+static bool
+two_valued_val_range_p (tree var, tree *min, tree *max)
+{

I'd use A and B, not MIN/MAX given it's two values, not necessarily
a two-valued range (for example for ~[1, UINT_MAX-1] which you
don't handle).  In theory VRP might get a more sophisticated range
representation to also allow a range consisting of just 3 and 7 for example.

+  tree tmp
+= int_const_binop (PLUS_EXPR,
+  vr->min,
+  build_int_cst_type (TREE_TYPE (var), 1));
+  if (0 != compare_values (tmp, vr->max))
+return false;

I think simply

   if (wi::sub (vr->max, vr->min) == 1)

might work as well and avoid building a tree node.

+  /* Convert:
+LHS = CST BINOP VAR
+where VAR is two-valued.
+
+To:
+LHS = VAR == VAL1 ? (CST BINOP VAL1) : (CST BINOP VAL2) */
+
+  if (TREE_CODE_CLASS (rhs_code) == tcc_binary
+ && TREE_CODE (rhs1) == INTEGER_CST
+ && TREE_CODE (rhs2) == SSA_NAME

Note that for all commutative tcc_binary operators the constant will be on the
other operand.  I think you need to handle the constant appearing in both places
(and for division for example watch out for a zero divisor).

+ && has_single_use (rhs2)
+ && two_valued_val_range_p (rhs2, &min, &max))
+
+   {
+ tree cond = build2 (EQ_EXPR, TREE_TYPE (rhs2), rhs2, min);
+ tree new_rhs1 =  int_const_binop (rhs_code, rhs1, min);
+ tree new_rhs2 =  int_const_binop (rhs_code, rhs1, max);

too many spaces after '='.

+
+ if (new_rhs1 && new_rhs2)

You didn't address my point about profitability - you test for a single use
but not for the kind of use.  Please instead use

&& single_imm_use (rhs2, &use_p, &use_stmt)
&& gimple_code (use_stmt) == GIMPLE_COND

The testcase won't work on targets with small integers thus please
require int32plus.  With the existing scan-dumps it's not obvious
what transform it is testing for - please add a comment before
the dump scan reflecting the desired transform.  Maybe also scan
"optimized" instead to also verify that followup transforms trigger.

Thanks,
Richard.

> Thanks,
> Kugan
>
> gcc/testsuite/ChangeLog:
>
> 2016-08-09  Kugan Vivekanandarajah  
>
> PR tree-optimization/61839
> * gcc.dg/tree-ssa/pr61839.c: New test.
>
> gcc/ChangeLog:
>
> 2016-08-09  Kugan Vivekanandarajah  
>
> PR tree-optimization/61839
> * tree-vrp.c (two_valued_val_range_p): New.
> (simplify_stmt_using_ranges): Convert CST BINOP VAR where VAR is
> two-valued to VAR == VAL1 ? (CST BINOP VAL1) : (CST BINOP VAL2).

Re: Implement -Wimplicit-fallthrough (take 2): the rest

2016-08-09 Thread Jakub Jelinek

On Tue, Aug 09, 2016 at 12:18:11PM +0200, Marek Polacek wrote:
> > > On Wed, 27 Jul 2016, Marek Polacek wrote:
> > > 
> > > > And this is the rest.  Either I just adjusted a falls through comment,
> > > > or I added __builtin_fallthrough ().  These were the cases where I was
> > > > fairly sure that the fall through is intentional.
> > > 
> > > I saw one case where I think the warning is a bit over-active:
> > > 
> > > @@ -42072,6 +42089,7 @@ rdseed_step:
> > >  case IX86_BUILTIN_ADDCARRYX64:
> > >icode = CODE_FOR_addcarrydi;
> > >mode0 = DImode;
> > > +  gcc_fallthrough ();
> > > 
> > >  handlecarry:
> > >arg0 = CALL_EXPR_ARG (exp, 0); /* unsigned char c_in.  */
> > > 
> > > I.e. it also warns if the following label is not a case label but a normal
> > > one.  I don't think this counts as a classical fall-through and it IMHO
> > > should not be warned about nor should it be marked.
> 
> Ok, I'll buy that.
> 
> > It's probably the same underlying issue I saw with a false-positive in one
> > of the other patches.
> 
> Yes.  I'll adjust the warning to not warn for non-case labels.

What about the case where you have non-case labels followed by case labels?
I think warning for:
  case 1:
something ();
  lab:
  case 2:
something_else ();
is still desirable, so just avoid the warning for
  case 1:
something ();
  lab:
something_else ();
?

Jakub

Re: RFC [1/2] divmod transform

2016-08-09 Thread Prathamesh Kulkarni

ping https://gcc.gnu.org/ml/gcc-patches/2016-07/msg01867.html

Thanks,
Prathamesh

On 28 July 2016 at 19:05, Prathamesh Kulkarni
 wrote:
> On 8 June 2016 at 19:53, Richard Biener  wrote:
>> On Fri, 3 Jun 2016, Jim Wilson wrote:
>>
>>> On Mon, May 30, 2016 at 12:45 AM, Richard Biener  wrote:
>>> > Joseph - do you know sth about why there's not a full set of divmod
>>> > libfuncs in libgcc?
>>>
>>> Because udivmoddi4 isn't a libfunc, it is a helper function for the
>>> div and mov libfuncs.  Since we can compute the signed div and mod
>>> results from udivmoddi4, there was no need to also add a signed
>>> version of it.  It was given a libfunc style name so that we had the
>>> option of making it a libfunc in the future, but that never happened.
>>> There was no support for calling any divmod libfunc until it was added
>>> as a special case to call an ARM library (not libgcc) function.  This
>>> happened here
>>>
>>> 2004-08-09  Mark Mitchell  
>>>
>>> * config.gcc (arm*-*-eabi*): New target.
>>> * defaults.h (TARGET_LIBGCC_FUNCS): New macro.
>>> (TARGET_LIB_INT_CMP_BIASED): Likewise.
>>> * expmed.c (expand_divmod): Try a two-valued divmod function as a
>>> last resort.
>>> ...
>>> * config/arm/arm.c (arm_init_libfuncs): New function.
>>> (arm_compute_initial_eliminatino_offset): Return HOST_WIDE_INT.
>>> (TARGET_INIT_LIBFUNCS): Define it.
>>> ...
>>>
>>> Later, two ports added their own divmod libfuncs, but I don't see any
>>> evidence that they were ever used, since there is no support for
>>> calling divmod other than the expand_divmod last resort code that only
>>> triggers for ARM.
>>>
>>> It is only now that Prathamesh is adding gimple support for divmod
>>> operations that we need to worry about getting this right, without
>>> breaking the existing ARM library support or the existing udivmoddi4
>>> support.
>>
>> Ok, so as he is primarily targeting the special arm divmod libcall
>> I suppose we can live with special-casing libcall handling to
>> udivmoddi3.  It would be nice to not lie about divmod availablilty
>> as libcall though... - it looks like the libcall is also guarded
>> on TARGET_HAS_NO_HW_DIVIDE (unless it was available historically
>> like on x86).
>>
>> So not sure where to go from here.
> Hi,
> I have attached patch, which is rebased on trunk.
> Needed to update divmod-7.c, which now gets transformed to divmod
> thanks to your code-hoisting patch -;)
> We still have the issue of optab_libfunc() returning non-existent
> libcalls. As in previous patch, I am checking
> explicitly for "__udivmoddi4", with a FIXME note.
> I hope that's okay for now ?
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu,
> armv8l-unknown-linux-gnueabihf.
> Bootstrap+test in progress on i686-linux-gnu.
> Cross-tested on arm*-*-*.
>
> Thanks,
> Prathamesh
>>
>> Richard.

Re: [RFC] [2/2] divmod transform: override expand_divmod_libfunc for ARM and add test-cases

2016-08-09 Thread Prathamesh Kulkarni

ping https://gcc.gnu.org/ml/gcc-patches/2016-07/msg01969.html

Thanks,
Prathamesh

On 29 July 2016 at 22:13, Prathamesh Kulkarni
 wrote:
> On 29 July 2016 at 05:40, Prathamesh Kulkarni
>  wrote:
>> On 28 July 2016 at 20:14, Ramana Radhakrishnan
>>  wrote:
>>>
 appear UNSUPPORTED.
 That's because this config appears to define
 __ARM_ARCH_EXT_IDIV__ however idiv appears not to be present.

 For instance __aeabi_div is called to perform
 division for the following test-case:
 int f(int x, int y)
 {
   int r = x / y;
   return r;
 }

 Compiling with -O2:
 f:
 @ args = 0, pretend = 0, frame = 0
 @ frame_needed = 0, uses_anonymous_args = 0
 push{r4, lr}
 bl  __aeabi_idiv
 pop {r4, pc}

 I assumed if __ARM_ARCH_EXT_IDIV was defined, then
 there should have been idiv instead of call to __aeabi_div
 or am I missing something ?

 Um I had configured with --with-tune=cortex-a9. Is that incorrect for
 armv8l-unknown-linux-gnueabihf ?
>>>
>>> --with-tune shouldn't make a difference to code generation settings. The 
>>> code generation you are showing is certainly odd for this testcase  - and 
>>> not something I can reproduce on pristine trunk - so sounds like 
>>> something's broken by your patch . You should be seeing an sdiv in this 
>>> case in the output - Look at the .arch directive at the top of your file - 
>>> maybe that gives you a clue in terms of making sure that you had configured 
>>> the toolchain correctly.
>> Hi,
>> There is no .arch in the assembly however there's .cpu arm10dtmi at
>> the top, full assembly: http://pastebin.com/6tzckiG0
>> With pristine trunk (r238800), I still get __aeabi_idiv for the above 
>> test-case.
>> config opts: --enable-languages=c,c++ --target=armv8l-linux-gnueabihf
>> --with-arch=armv8-a --with-fpu=neon-fp-armv8 --with-float=hard
>> --with-mode=thumb --enable-multiarch --disable-multilib
>> Tried with native stage-1 build and cross build.
>> I verified that __ARM_ARCH_EXT_IDIV__ is defined, with following
>> test-case, which fails to compile.
>> #ifdef __ARM_ARCH_EXT_IDIV__
>> #error "has div insn"
>> #endif
>> int x;
> Apparently looks like I screwed sth in my build :/
> After re-building from scratch, I could get sdiv in the output -;)
> Verified that the patch does not regress on armv8l-unknown-linux-gnu
> and cross-tested on arm*-*-*.
> Ok for trunk ?
>
> Thanks,
> Prathamesh
>>
>> Thanks,
>> Prathamesh
>>>
>>>
>>> regards
>>> Ramana
>>>

 xgcc -v:
 Using built-in specs.
 COLLECT_GCC=armhf-bootstrap-build/gcc/xgcc
 Target: armv8l-unknown-linux-gnueabihf
 Configured with: ../gcc/configure --enable-languages=c,c++,fortran
 --with-arch=armv8-a --with-fpu=neon-fp-armv8 --with-float=hard
 --with-mode=thumb --enable-multiarch --with-tune=cortex-a9
 --disable-multilib
 Thread model: posix
 gcc version 7.0.0 20160727 (experimental) (GCC)

 Thanks,
 Prathamesh
>
> Thanks,
> Ramana
>>>

Re: divmod transform: add test-cases

2016-08-09 Thread Prathamesh Kulkarni

ping https://gcc.gnu.org/ml/gcc-patches/2016-07/msg01869.html

Thanks,
Prathamesh

On 28 July 2016 at 19:07, Prathamesh Kulkarni
 wrote:
> Hi,
> The following patch adds test-cases for divmod transform.
> I separated the SImode tests into separate file from DImode tests
> because certain arm configs (cortex-15) have hardware div insn for
> SImode but not for DImode,
> and for that config we want SImode tests to be disabled but not DImode tests.
> The patch therefore has two target-effective checks: divmod and divmod_simode.
> Is it OK for trunk ?
>
> Thanks,
> Prathamesh

Re: [RFC] ipa bitwise constant propagation

2016-08-09 Thread Martin Jambor

Hi,

On Tue, Aug 09, 2016 at 01:41:21PM +0530, Prathamesh Kulkarni wrote:
> On 8 August 2016 at 19:33, Martin Jambor  wrote:
> >> >> +class ipcp_bits_lattice
> >> >> +{
> >> >> +public:
> >> >> +  bool bottom_p () { return lattice_val == IPA_BITS_VARYING; }
> >> >> +  bool top_p () { return lattice_val == IPA_BITS_UNDEFINED; }
> >> >> +  bool constant_p () { return lattice_val == IPA_BITS_CONSTANT; }
> >> >> +  bool set_to_bottom ();
> >> >> +  bool set_to_constant (widest_int, widest_int, signop, unsigned);
> >> >> +
> >> >> +  widest_int get_value () { return value; }
> >> >> +  widest_int get_mask () { return mask; }
> >> >> +  signop get_sign () { return sgn; }
> >> >> +  unsigned get_precision () { return precision; }
> >> >> +
> >> >> +  bool meet_with (ipcp_bits_lattice& other, enum tree_code, tree);
> >> >> +  bool meet_with (widest_int, widest_int, signop, unsigned);
> >> >> +
> >> >> +  void print (FILE *);
> >> >> +
> >> >> +private:
> >> >> +  enum { IPA_BITS_UNDEFINED, IPA_BITS_CONSTANT, IPA_BITS_VARYING } 
> >> >> lattice_val;
> >> >> +  widest_int value, mask;
> >> >> +  signop sgn;
> >> >> +  unsigned precision;
> >
> > I know that the existing code in ipa-cp.c does not do this, but please
> > prefix member variables with "m_" like our coding style guidelines
> > suggest (or even require?).  You routinely reuse those same names in
> > names of parameters of meet_with and I believe that is a practice that
> > will sooner or later lead to confusing the two and bugs.
> Sorry about this, will change to m_ prefix in followup patch.

Thanks a lot.

> >
> >> >> +
> >> >> +  bool meet_with_1 (widest_int, widest_int);
> >> >> +  void get_value_and_mask (tree, widest_int *, widest_int *);
> >> >> +};
> >> >> +
> >> >>  /* Structure containing lattices for a parameter itself and for pieces 
> >> >> of
> >> >> aggregates that are passed in the parameter or by a reference in a 
> >> >> parameter
> >> >> plus some other useful flags.  */
> >> >> @@ -281,6 +316,8 @@ public:
> >> >>ipcp_agg_lattice *aggs;
> >> >>/* Lattice describing known alignment.  */
> >> >>ipcp_alignment_lattice alignment;
> >> >> +  /* Lattice describing known bits.  */
> >> >> +  ipcp_bits_lattice bits_lattice;
> >> >>/* Number of aggregate lattices */
> >> >>int aggs_count;
> >> >>/* True if aggregate data were passed by reference (as opposed to by
> >> >> @@ -458,6 +495,21 @@ ipcp_alignment_lattice::print (FILE * f)
> >> >>  fprintf (f, " Alignment %u, misalignment %u\n", align, 
> >> >> misalign);
> >> >>  }
> >> >>
> >>
> >> ...
> >>
> >> >> +}
> >> >> +
> >> >> +/* Meet operation, similar to ccp_lattice_meet, we xor values
> >> >> +   if this->value, value have different values at same bit positions, 
> >> >> we want
> >> >> +   to drop that bit to varying. Return true if mask is changed.
> >> >> +   This function assumes that the lattice value is in CONSTANT state  
> >> >> */
> >> >> +
> >> >> +bool
> >> >> +ipcp_bits_lattice::meet_with_1 (widest_int value, widest_int mask)
> >> >> +{
> >> >> +  gcc_assert (constant_p ());
> >> >> +
> >> >> +  widest_int old_mask = this->mask;
> >> >> +  this->mask = (this->mask | mask) | (this->value ^ value);
> >> >> +
> >> >> +  if (wi::sext (this->mask, this->precision) == -1)
> >> >> +return set_to_bottom ();
> >> >> +
> >> >> +  bool changed = this->mask != old_mask;
> >> >> +  return changed;
> >> >> +}
> >> >> +
> >> >> +/* Meet the bits lattice with operand
> >> >> +   described by  >> >> +
> >> >> +bool
> >> >> +ipcp_bits_lattice::meet_with (widest_int value, widest_int mask,
> >> >> +   signop sgn, unsigned precision)
> >> >> +{
> >> >> +  if (bottom_p ())
> >> >> +return false;
> >> >> +
> >> >> +  if (top_p ())
> >> >> +{
> >> >> +  if (wi::sext (mask, precision) == -1)
> >> >> + return set_to_bottom ();
> >> >> +  return set_to_constant (value, mask, sgn, precision);
> >> >> +}
> >> >> +
> >> >> +  return meet_with_1 (value, mask);
> >> >
> >> > What if precisions do not match?
> >> Sorry I don't understand. Since we extend to widest_int, precision
> >> would be same ?
> >
> > I meant what if:
> >
> >   this->precision != precision /* the parameter value */
> >
> > (you see, giving both the same name is error-prone).  You are
> > propagating the recorded precision gathered from types of the actual
> > arguments in calls, those can be different.  For example, one caller
> > can pass a direct integer value with full integer precision, another
> > caller can pass in that same argument an enum value with a very
> > limited precision.  Your code ignores that difference and the
> > resulting precision is the one that arrives first.  If it is the enum,
> > it might be too small for the integer value from the other call-site?
> Ah indeed the patch incorrectly propagates precision of argument.
> So IIUC in ipcp_bits_lattice, we want m_precision to be the precision
> of parameter's type and _no

Re: [Fortran, Patch] First patch for coarray FAILED IMAGES (TS 18508)

2016-08-09 Thread Paul Richard Thomas

Hi Sandro,

As far as I can see, this is OK barring a couple of minor wrinkles and
a question:

For coarray_failed_images_err.f90 and coarray_image_status_err.f90 you
have used the option -fdump-tree-original without making use of the
tree dump.

Mikael asked you to provide an executable test with -fcoarray=single.
Is this not possible for some reason?

Otherwise, this is OK for trunk.

Thanks for the patch.

Paul

On 4 August 2016 at 05:07, Alessandro Fanfarillo
 wrote:
> * PING *
>
> 2016-07-21 13:05 GMT-06:00 Alessandro Fanfarillo :
>> Dear Mikael and all,
>>
>> in attachment the new patch, built and regtested on x86_64-pc-linux-gnu.
>>
>> Cheers,
>> Alessandro
>>
>> 2016-07-20 13:17 GMT-06:00 Mikael Morin :
>>> Le 20/07/2016 à 11:39, Andre Vehreschild a écrit :

 Hi Mikael,


>> +  if(st == ST_FAIL_IMAGE)
>> +new_st.op = EXEC_FAIL_IMAGE;
>> +  else
>> +gcc_unreachable();
>
> You can use
> gcc_assert (st == ST_FAIL_IMAGE);
> foo...;
> instead of
> if (st == ST_FAIL_IMAGE)
> foo...;
> else
> gcc_unreachable ();


 Be careful, this is not 100% identical in the general case. For older
 gcc version (gcc < 4008) gcc_assert() is mapped to nothing, esp. not to
 an abort(), so the behavior can change. But in this case everything is
 fine, because the patch is most likely not backported.

>>> Didn't know about this. The difference seems to be very subtle.
>>> I don't mind much anyway. The original version can stay if preferred, this
>>> was just a suggestion.
>>>
>>> By the way, if the function is inlined in its single caller, the assert or
>>> unreachable statement can be removed, which avoids choosing between them.
>>> That's another suggestion.
>>>
>>>
>> +
>> +  return MATCH_YES;
>> +
>> + syntax:
>> +  gfc_syntax_error (st);
>> +
>> +  return MATCH_ERROR;
>> +}
>> +
>> +match
>> +gfc_match_fail_image (void)
>> +{
>> +  /* if (!gfc_notify_std (GFC_STD_F2008_TS, "FAIL IMAGE statement
>> at %C")) */
>> +  /*   return MATCH_ERROR; */
>> +
>
> Can this be uncommented?
>
>> +  return fail_image_statement (ST_FAIL_IMAGE);
>> +}
>>
>>  /* Match LOCK/UNLOCK statement. Syntax:
>>   LOCK ( lock-variable [ , lock-stat-list ] )
>> diff --git a/gcc/fortran/trans-intrinsic.c
>> b/gcc/fortran/trans-intrinsic.c index 1aaf4e2..b2f5596 100644
>> --- a/gcc/fortran/trans-intrinsic.c
>> +++ b/gcc/fortran/trans-intrinsic.c
>> @@ -1647,6 +1647,24 @@ trans_this_image (gfc_se * se, gfc_expr
>> *expr) m, lbound));
>>  }
>>
>> +static void
>> +gfc_conv_intrinsic_image_status (gfc_se *se, gfc_expr *expr)
>> +{
>> +  unsigned int num_args;
>> +  tree *args,tmp;
>> +
>> +  num_args = gfc_intrinsic_argument_list_length (expr);
>> +  args = XALLOCAVEC (tree, num_args);
>> +
>> +  gfc_conv_intrinsic_function_args (se, expr, args, num_args);
>> +
>> +  if (flag_coarray == GFC_FCOARRAY_LIB)
>> +{
>
> Can everything be put under the if?
> Does it work with -fcoarray=single?


 IMO coarray=single should not generate code here, therefore putting
 everything under the if should to fine.

>>> My point was more avoiding generating code for the arguments if they are not
>>> used in the end.
>>> Regarding the -fcoarray=single case, the function returns a result, which
>>> can be used in an expression, so I don't think it will work without at least
>>> hardcoding a fixed value as result in that case.
>>> But even that wouldn't be enough, as the function wouldn't work consistently
>>> with the fail image statement.
>>>
 Sorry for the comments ...

>>> Comments are welcome here, as far as I know. ;-)
>>>
>>> Mikael



-- 
The difference between genius and stupidity is; genius has its limits.

Albert Einstein

[PATCH] Set -fprofile-update=atomic when -pthread is present

2016-08-09 Thread Martin Liška

Hi.

As mention in [1], enabling -fprofile-update=atomic when -pthread is logical
thing and is quite expected default behavior.

Ready for trunk?

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58306#c21
>From 7135fcf5f7b8f2f3c55a6e80048660c1beea5052 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Tue, 9 Aug 2016 13:19:26 +0200
Subject: [PATCH] Set -fprofile-update=atomic when -pthread is present

gcc/ChangeLog:

2016-08-09  Martin Liska  

	* gcc.c: Add -fprofile-update=atomic when
	-pthread is present.
---
 gcc/gcc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/gcc.c b/gcc/gcc.c
index 7460f6a..2f42619 100644
--- a/gcc/gcc.c
+++ b/gcc/gcc.c
@@ -1141,7 +1141,8 @@ static const char *cc1_options =
  %{-help=*:--help=%*}\
  %{!fsyntax-only:%{S:%W{o*}%{!o*:-o %b.s}}}\
  %{fsyntax-only:-o %j} %{-param*}\
- %{coverage:-fprofile-arcs -ftest-coverage}";
+ %{coverage:-fprofile-arcs -ftest-coverage}\
+ %{pthread:-fprofile-update=atomic}";
 
 static const char *asm_options =
 "%{-target-help:%:print-asm-header()} "
-- 
2.9.2

Re: [PATCH PR71734] Add missed check that reference defined inside loop.

2016-08-09 Thread Yuri Rumyantsev

Richard,

The patch proposed by you does not work properly for
g++.dg/vect/pr70729-nest.cc test since the reference for this->S_n has
been cached as dependent for outer loop and loop is not vectorized:

 g++ -Ofast -fopenmp -mavx2 pr70729-nest.cc -c
-fdump-tree-vect-details
grep 'LOOP VECTORIZED' pr70729-nest.cc.149t.vect


You missed additional check I added before check on cached dependence.

2016-08-09 13:00 GMT+03:00 Richard Biener :
> On Tue, Aug 9, 2016 at 11:20 AM, Yuri Rumyantsev  wrote:
>> Yes it is impossible since all basic blocks are handled from outer
>> loops to innermost so we can not have the sequence with wrong
>> dependence, i.e. we cached that reference is independent (due to
>> safelen) but the same reference in outer loop must be evaluated as
>> dependent. So we must re-evaluate only dependent references in loops
>> having non-zero safelen attribute.
>
> Hmm.  I don't like depending on this implementation detail.  Does the
> attached patch work
> which simply avoids any positive/negative caching on safelen affected
> refs?  It also makes
> the query cheaper by avoiding the dive into child loops.
>
> Richard.
>
>> 2016-08-09 11:44 GMT+03:00 Richard Biener :
>>> On Fri, Aug 5, 2016 at 4:57 PM, Yuri Rumyantsev  wrote:
 Richard,

 I added additional check before caching dependencies since (1) all
 statements in loop are handled in loop postorder order, i.e. form
 outer to inner; (2) we can change dependency for reference in subloops
 which have non-zero safelen attribute. So I propose to re-evaluate it
 in such cases. I don't see why we need to avoid dependence caching for
 all loop nests since pragma omp simd is used very rarely.
>>>
>>> You think it is impossible to construct a testcase which hits the
>>> correctness issue?
>>> "very rarely" is not a good argument to generate wrong code.
>>>
>>> Richard.
>>>
 2016-08-05 16:50 GMT+03:00 Richard Biener :
> On Fri, Aug 5, 2016 at 3:28 PM, Yuri Rumyantsev  
> wrote:
>> Richard,
>>
>> Here is updated patch which implements your proposal - I pass loop
>> instead of stmt to determine either REF is defined inside LOOP nest or
>> not. I checked that for pr70729-nest.cc the reference this->S_n  for
>> statements which are out of innermost loop are  not considered as
>> independent as you pointed out.
>>
>> Regression testing did not show any new failures and both failed tests
>> from libgomp.fortran suite now passed.
>>
>> Is it OK for trunk?
>
> I don't quite understand
>
> +  /* Ignore dependence for loops having greater safelen.  */
> +  if (new_safelen == safelen
> +  && bitmap_bit_p (&ref->dep_loop, LOOP_DEP_BIT (loop->num, 
> stored_p)))
>  return false;
>
> this seems to suggest (correctly I think) that we cannot rely on the 
> caching
> for safelen, neither for optimal results (you seem to address that) but 
> also
> not for correctness (we cache the no-dep result from a safelen run and
> then happily re-use that info for a ref that is not safe for safelen).
>
> It seems to me we need to avoid any caching if we made things independent
> because of safelen and simply not do the dep test afterwards.  this means
> inlining ref_indep_loop_p_1 partly into _2 (not sure if there's a great 
> way
> to do this w/o confusing the control flow).
>
> Richard.
>
>> ChangeLog:
>> 2016-08-05  Yuri Rumyantsev  
>>
>> PR tree-optimization/71734
>> * tree-ssa-loop-im.c (ref_indep_loop_p): Add new argument REF_LOOP.
>> (outermost_indep_loop): Pass LOOP argumnet where REF was defined to
>> ref_indep_loop_p.
>> (ref_indep_loop_p_1): Fix commentary.
>> (ref_indep_loop_p_2): Add additional argument REF_LOOP, introduce new
>> variable NEW_SAFELEN which may have new value for SAFELEN, ignore
>> dependencde for loop having greater safelen value, pass REF_LOOP in
>> recursive call.
>> (can_sm_ref_p): Pass LOOP as additional argument to ref_indep_loop_p.
>>
>> 2016-08-03 16:44 GMT+03:00 Richard Biener :
>>> On Fri, Jul 29, 2016 at 4:00 PM, Yuri Rumyantsev  
>>> wrote:
 Hi Richard.

 It turned out that the fix proposed by you does not work for liggomp
 tests simd3 and simd4.
 The reason is that we can't change safelen value for references not
 defined inside loop. So I add missed check on it to patch.
 Is it OK for trunk?
>>>
>>> Hmm, I don't like the walk of all subloops in ref_defined_in_loop_p as
>>> that operation can end up being quadratic in the loop depth/width.
>>>
>>> But I also wonder about correctness given that LIM "commons"
>>> references.  So we can have
>>>
>>>   for (;;)
>>> .. = ref;  (1)
>>> for (;;) // safelen == 2  (2)
>>>   ... = ref;
>>>
>>> and when looking at the ref a

Re: protected alloca class for malloc fallback

2016-08-09 Thread Oleg Endo

On Mon, 2016-08-08 at 13:39 -0400, Trevor Saunders wrote:

> First sizeof std::string is 32 on x86_64, a char *, a size_t for the
> length, and a 16 byte union of a size_t for allocated size and a 16 
> byte buffer for short strings.  I suppose some of this is required by 
> the C++ standard,

I recommend checking what others have figured regarding that matter
https://github.com/elliotgoodrich/SSO-23

>  but I doubt we really need to care about strings longer than 2^32 in
> gcc.  If we put the length and allocated size in the buffer,
> and have a separate class for stack allocated buffers I think we can 
> have a string that is just sizeof void *.

This idea has one flaw in that it does not allow having objects by
value.  It essentially *requires* one to *always* allocate them on the
heap via new/delete.  Being able to store objects by value is useful in
many situations.  So if you do the above, the next logical step that
will follow is a smart-pointer-like wrapper that allows value
semantics.  Because eventually somebody will want that operator == or
operator < e.g. for associative containers.

> 
> Second it would be useful performance wise to have a std::string_view
> type class, but that is c++14 or 17? only so we'd need to import it 
> into gcc or something.

http://en.cppreference.com/w/cpp/experimental/basic_string_view
http://en.cppreference.com/w/cpp/string/basic_string_view

It's "funny" that GCC contains a C++ stdlib implementation and so
little is actually used by GCC itself.

Cheers,
Oleg

Re: [PATCH PR71734] Add missed check that reference defined inside loop.

2016-08-09 Thread Richard Biener

On Tue, Aug 9, 2016 at 1:26 PM, Yuri Rumyantsev  wrote:
> Richard,
>
> The patch proposed by you does not work properly for
> g++.dg/vect/pr70729-nest.cc test since the reference for this->S_n has
> been cached as dependent for outer loop and loop is not vectorized:
>
>  g++ -Ofast -fopenmp -mavx2 pr70729-nest.cc -c
> -fdump-tree-vect-details
> grep 'LOOP VECTORIZED' pr70729-nest.cc.149t.vect
> 
>
> You missed additional check I added before check on cached dependence.

Ok, but it should get the correctness right?

I suppose that if you move the cache checks inside the else clause it
would work?
I'd be ok with that change.

Richard.

> 2016-08-09 13:00 GMT+03:00 Richard Biener :
>> On Tue, Aug 9, 2016 at 11:20 AM, Yuri Rumyantsev  wrote:
>>> Yes it is impossible since all basic blocks are handled from outer
>>> loops to innermost so we can not have the sequence with wrong
>>> dependence, i.e. we cached that reference is independent (due to
>>> safelen) but the same reference in outer loop must be evaluated as
>>> dependent. So we must re-evaluate only dependent references in loops
>>> having non-zero safelen attribute.
>>
>> Hmm.  I don't like depending on this implementation detail.  Does the
>> attached patch work
>> which simply avoids any positive/negative caching on safelen affected
>> refs?  It also makes
>> the query cheaper by avoiding the dive into child loops.
>>
>> Richard.
>>
>>> 2016-08-09 11:44 GMT+03:00 Richard Biener :
 On Fri, Aug 5, 2016 at 4:57 PM, Yuri Rumyantsev  wrote:
> Richard,
>
> I added additional check before caching dependencies since (1) all
> statements in loop are handled in loop postorder order, i.e. form
> outer to inner; (2) we can change dependency for reference in subloops
> which have non-zero safelen attribute. So I propose to re-evaluate it
> in such cases. I don't see why we need to avoid dependence caching for
> all loop nests since pragma omp simd is used very rarely.

 You think it is impossible to construct a testcase which hits the
 correctness issue?
 "very rarely" is not a good argument to generate wrong code.

 Richard.

> 2016-08-05 16:50 GMT+03:00 Richard Biener :
>> On Fri, Aug 5, 2016 at 3:28 PM, Yuri Rumyantsev  
>> wrote:
>>> Richard,
>>>
>>> Here is updated patch which implements your proposal - I pass loop
>>> instead of stmt to determine either REF is defined inside LOOP nest or
>>> not. I checked that for pr70729-nest.cc the reference this->S_n  for
>>> statements which are out of innermost loop are  not considered as
>>> independent as you pointed out.
>>>
>>> Regression testing did not show any new failures and both failed tests
>>> from libgomp.fortran suite now passed.
>>>
>>> Is it OK for trunk?
>>
>> I don't quite understand
>>
>> +  /* Ignore dependence for loops having greater safelen.  */
>> +  if (new_safelen == safelen
>> +  && bitmap_bit_p (&ref->dep_loop, LOOP_DEP_BIT (loop->num, 
>> stored_p)))
>>  return false;
>>
>> this seems to suggest (correctly I think) that we cannot rely on the 
>> caching
>> for safelen, neither for optimal results (you seem to address that) but 
>> also
>> not for correctness (we cache the no-dep result from a safelen run and
>> then happily re-use that info for a ref that is not safe for safelen).
>>
>> It seems to me we need to avoid any caching if we made things independent
>> because of safelen and simply not do the dep test afterwards.  this means
>> inlining ref_indep_loop_p_1 partly into _2 (not sure if there's a great 
>> way
>> to do this w/o confusing the control flow).
>>
>> Richard.
>>
>>> ChangeLog:
>>> 2016-08-05  Yuri Rumyantsev  
>>>
>>> PR tree-optimization/71734
>>> * tree-ssa-loop-im.c (ref_indep_loop_p): Add new argument REF_LOOP.
>>> (outermost_indep_loop): Pass LOOP argumnet where REF was defined to
>>> ref_indep_loop_p.
>>> (ref_indep_loop_p_1): Fix commentary.
>>> (ref_indep_loop_p_2): Add additional argument REF_LOOP, introduce new
>>> variable NEW_SAFELEN which may have new value for SAFELEN, ignore
>>> dependencde for loop having greater safelen value, pass REF_LOOP in
>>> recursive call.
>>> (can_sm_ref_p): Pass LOOP as additional argument to ref_indep_loop_p.
>>>
>>> 2016-08-03 16:44 GMT+03:00 Richard Biener :
 On Fri, Jul 29, 2016 at 4:00 PM, Yuri Rumyantsev  
 wrote:
> Hi Richard.
>
> It turned out that the fix proposed by you does not work for liggomp
> tests simd3 and simd4.
> The reason is that we can't change safelen value for references not
> defined inside loop. So I add missed check on it to patch.
> Is it OK for trunk?

 Hmm, I don't like the walk of all subloops in ref_defined_in_loop_p as
 t

Re: [RFC] ipa bitwise constant propagation

2016-08-09 Thread Prathamesh Kulkarni

On 9 August 2016 at 16:39, Martin Jambor  wrote:
> Hi,
>
> On Tue, Aug 09, 2016 at 01:41:21PM +0530, Prathamesh Kulkarni wrote:
>> On 8 August 2016 at 19:33, Martin Jambor  wrote:
>> >> >> +class ipcp_bits_lattice
>> >> >> +{
>> >> >> +public:
>> >> >> +  bool bottom_p () { return lattice_val == IPA_BITS_VARYING; }
>> >> >> +  bool top_p () { return lattice_val == IPA_BITS_UNDEFINED; }
>> >> >> +  bool constant_p () { return lattice_val == IPA_BITS_CONSTANT; }
>> >> >> +  bool set_to_bottom ();
>> >> >> +  bool set_to_constant (widest_int, widest_int, signop, unsigned);
>> >> >> +
>> >> >> +  widest_int get_value () { return value; }
>> >> >> +  widest_int get_mask () { return mask; }
>> >> >> +  signop get_sign () { return sgn; }
>> >> >> +  unsigned get_precision () { return precision; }
>> >> >> +
>> >> >> +  bool meet_with (ipcp_bits_lattice& other, enum tree_code, tree);
>> >> >> +  bool meet_with (widest_int, widest_int, signop, unsigned);
>> >> >> +
>> >> >> +  void print (FILE *);
>> >> >> +
>> >> >> +private:
>> >> >> +  enum { IPA_BITS_UNDEFINED, IPA_BITS_CONSTANT, IPA_BITS_VARYING } 
>> >> >> lattice_val;
>> >> >> +  widest_int value, mask;
>> >> >> +  signop sgn;
>> >> >> +  unsigned precision;
>> >
>> > I know that the existing code in ipa-cp.c does not do this, but please
>> > prefix member variables with "m_" like our coding style guidelines
>> > suggest (or even require?).  You routinely reuse those same names in
>> > names of parameters of meet_with and I believe that is a practice that
>> > will sooner or later lead to confusing the two and bugs.
>> Sorry about this, will change to m_ prefix in followup patch.
>
> Thanks a lot.
>
>> >
>> >> >> +
>> >> >> +  bool meet_with_1 (widest_int, widest_int);
>> >> >> +  void get_value_and_mask (tree, widest_int *, widest_int *);
>> >> >> +};
>> >> >> +
>> >> >>  /* Structure containing lattices for a parameter itself and for 
>> >> >> pieces of
>> >> >> aggregates that are passed in the parameter or by a reference in a 
>> >> >> parameter
>> >> >> plus some other useful flags.  */
>> >> >> @@ -281,6 +316,8 @@ public:
>> >> >>ipcp_agg_lattice *aggs;
>> >> >>/* Lattice describing known alignment.  */
>> >> >>ipcp_alignment_lattice alignment;
>> >> >> +  /* Lattice describing known bits.  */
>> >> >> +  ipcp_bits_lattice bits_lattice;
>> >> >>/* Number of aggregate lattices */
>> >> >>int aggs_count;
>> >> >>/* True if aggregate data were passed by reference (as opposed to by
>> >> >> @@ -458,6 +495,21 @@ ipcp_alignment_lattice::print (FILE * f)
>> >> >>  fprintf (f, " Alignment %u, misalignment %u\n", align, 
>> >> >> misalign);
>> >> >>  }
>> >> >>
>> >>
>> >> ...
>> >>
>> >> >> +}
>> >> >> +
>> >> >> +/* Meet operation, similar to ccp_lattice_meet, we xor values
>> >> >> +   if this->value, value have different values at same bit positions, 
>> >> >> we want
>> >> >> +   to drop that bit to varying. Return true if mask is changed.
>> >> >> +   This function assumes that the lattice value is in CONSTANT state  
>> >> >> */
>> >> >> +
>> >> >> +bool
>> >> >> +ipcp_bits_lattice::meet_with_1 (widest_int value, widest_int mask)
>> >> >> +{
>> >> >> +  gcc_assert (constant_p ());
>> >> >> +
>> >> >> +  widest_int old_mask = this->mask;
>> >> >> +  this->mask = (this->mask | mask) | (this->value ^ value);
>> >> >> +
>> >> >> +  if (wi::sext (this->mask, this->precision) == -1)
>> >> >> +return set_to_bottom ();
>> >> >> +
>> >> >> +  bool changed = this->mask != old_mask;
>> >> >> +  return changed;
>> >> >> +}
>> >> >> +
>> >> >> +/* Meet the bits lattice with operand
>> >> >> +   described by > >> >> +
>> >> >> +bool
>> >> >> +ipcp_bits_lattice::meet_with (widest_int value, widest_int mask,
>> >> >> +   signop sgn, unsigned precision)
>> >> >> +{
>> >> >> +  if (bottom_p ())
>> >> >> +return false;
>> >> >> +
>> >> >> +  if (top_p ())
>> >> >> +{
>> >> >> +  if (wi::sext (mask, precision) == -1)
>> >> >> + return set_to_bottom ();
>> >> >> +  return set_to_constant (value, mask, sgn, precision);
>> >> >> +}
>> >> >> +
>> >> >> +  return meet_with_1 (value, mask);
>> >> >
>> >> > What if precisions do not match?
>> >> Sorry I don't understand. Since we extend to widest_int, precision
>> >> would be same ?
>> >
>> > I meant what if:
>> >
>> >   this->precision != precision /* the parameter value */
>> >
>> > (you see, giving both the same name is error-prone).  You are
>> > propagating the recorded precision gathered from types of the actual
>> > arguments in calls, those can be different.  For example, one caller
>> > can pass a direct integer value with full integer precision, another
>> > caller can pass in that same argument an enum value with a very
>> > limited precision.  Your code ignores that difference and the
>> > resulting precision is the one that arrives first.  If it is the enum,
>> > it might be too small for the integer value from the other ca

Re: [PATCH 0/7, GCC, V8M] ARMv8-M Security Extensions

2016-08-09 Thread Andre Vieira (lists)

On 08/08/16 05:19, Sandra Loosemore wrote:
> On 07/25/2016 07:17 AM, Andre Vieira (lists) wrote:
>> [PATCH 0/7, GCC, V8M] ARMv8-M Security Extensions
>>
>> Hello,
>>
>> This is a respin of a previous patch series for ARMv8-M Security
>> Extensions. In this version I have removed one patch, rebased the rest
>> and changed some of them.
>>
>> This patch series aims at implementing support for ARMv8-M's Security
>> Extensions. You can find the specification of ARMV8-M Security
>> Extensions in: ARM®v8-M Security Extensions: Requirements on Development
>> Tools
>> (http://infocenter.arm.com/help/topic/com.arm.doc.ecm0359818/index.html).
>>
>> We currently:
>> - do not support passing arguments or returning on the stack for
>> cmse_nonsecure_{call,entry} functions,
>> - only test Security Extensions for -mfpu=fpv5-d16 and fpv5-sp-d16 and
>> only support single and double precision FPU's with d16.
>>
>> Bootstrapped and tested on arm-none-linux-gnueabihf and tested on
>> arm-none-eabi with ARMv8-M Baseline and Mainline targets.
>>
>> Andre Vieira (7):
>>   Add support for ARMv8-M's Security Extensions flag and intrinsics
>>   Handling ARMv8-M Security Extension's cmse_nonsecure_entry attribute
>>   ARMv8-M Security Extension's cmse_nonsecure_entry: __acle_se label and
>> bxns return
>>   ARMv8-M Security Extension's cmse_nonsecure_entry: clear registers
>>   Handling ARMv8-M Security Extension's cmse_nonsecure_call attribute
>>   ARMv8-M Security Extension's cmse_nonsecure_call: use
>> __gnu_cmse_nonsecure_call
>>   Added support for ARMV8-M Security Extension cmse_nonsecure_caller
>> intrinsic
> 
> I didn't see any documentation here for the new attributes and built-in
> function.
> 
> -Sandra
> 

Hi Sandra,

The documentation is in the ARMV8-M Security Extensions in: ARM®v8-M
Security Extensions: Requirements on Development Tools document I linked
in the email above and subsequent emails
(http://infocenter.arm.com/help/topic/com.arm.doc.ecm0359818/index.html).

Also per patch I refer to the relevant sections. So for instance in
PATCH 3/7 refers to Section 5.4, which describes Entry functions and
mentions the cmse_nonsecure_entry attribute. Whereas PATCH 7/7 refers to
Section 5.4.3 of the same document which describes the
cmse_nonsecure_caller intrinsic which that patch implements.

Is there a specific intrinsic/attribute you are missing?

Cheers,
Andre

Re: [PATCH PR71734] Add missed check that reference defined inside loop.

2016-08-09 Thread Yuri Rumyantsev

Richard,

I checked that this move helps.
Does it mean that I've got approval to integrate it to trunk?

2016-08-09 14:33 GMT+03:00 Richard Biener :
> On Tue, Aug 9, 2016 at 1:26 PM, Yuri Rumyantsev  wrote:
>> Richard,
>>
>> The patch proposed by you does not work properly for
>> g++.dg/vect/pr70729-nest.cc test since the reference for this->S_n has
>> been cached as dependent for outer loop and loop is not vectorized:
>>
>>  g++ -Ofast -fopenmp -mavx2 pr70729-nest.cc -c
>> -fdump-tree-vect-details
>> grep 'LOOP VECTORIZED' pr70729-nest.cc.149t.vect
>> 
>>
>> You missed additional check I added before check on cached dependence.
>
> Ok, but it should get the correctness right?
>
> I suppose that if you move the cache checks inside the else clause it
> would work?
> I'd be ok with that change.
>
> Richard.
>
>> 2016-08-09 13:00 GMT+03:00 Richard Biener :
>>> On Tue, Aug 9, 2016 at 11:20 AM, Yuri Rumyantsev  wrote:
 Yes it is impossible since all basic blocks are handled from outer
 loops to innermost so we can not have the sequence with wrong
 dependence, i.e. we cached that reference is independent (due to
 safelen) but the same reference in outer loop must be evaluated as
 dependent. So we must re-evaluate only dependent references in loops
 having non-zero safelen attribute.
>>>
>>> Hmm.  I don't like depending on this implementation detail.  Does the
>>> attached patch work
>>> which simply avoids any positive/negative caching on safelen affected
>>> refs?  It also makes
>>> the query cheaper by avoiding the dive into child loops.
>>>
>>> Richard.
>>>
 2016-08-09 11:44 GMT+03:00 Richard Biener :
> On Fri, Aug 5, 2016 at 4:57 PM, Yuri Rumyantsev  
> wrote:
>> Richard,
>>
>> I added additional check before caching dependencies since (1) all
>> statements in loop are handled in loop postorder order, i.e. form
>> outer to inner; (2) we can change dependency for reference in subloops
>> which have non-zero safelen attribute. So I propose to re-evaluate it
>> in such cases. I don't see why we need to avoid dependence caching for
>> all loop nests since pragma omp simd is used very rarely.
>
> You think it is impossible to construct a testcase which hits the
> correctness issue?
> "very rarely" is not a good argument to generate wrong code.
>
> Richard.
>
>> 2016-08-05 16:50 GMT+03:00 Richard Biener :
>>> On Fri, Aug 5, 2016 at 3:28 PM, Yuri Rumyantsev  
>>> wrote:
 Richard,

 Here is updated patch which implements your proposal - I pass loop
 instead of stmt to determine either REF is defined inside LOOP nest or
 not. I checked that for pr70729-nest.cc the reference this->S_n  for
 statements which are out of innermost loop are  not considered as
 independent as you pointed out.

 Regression testing did not show any new failures and both failed tests
 from libgomp.fortran suite now passed.

 Is it OK for trunk?
>>>
>>> I don't quite understand
>>>
>>> +  /* Ignore dependence for loops having greater safelen.  */
>>> +  if (new_safelen == safelen
>>> +  && bitmap_bit_p (&ref->dep_loop, LOOP_DEP_BIT (loop->num, 
>>> stored_p)))
>>>  return false;
>>>
>>> this seems to suggest (correctly I think) that we cannot rely on the 
>>> caching
>>> for safelen, neither for optimal results (you seem to address that) but 
>>> also
>>> not for correctness (we cache the no-dep result from a safelen run and
>>> then happily re-use that info for a ref that is not safe for safelen).
>>>
>>> It seems to me we need to avoid any caching if we made things 
>>> independent
>>> because of safelen and simply not do the dep test afterwards.  this 
>>> means
>>> inlining ref_indep_loop_p_1 partly into _2 (not sure if there's a great 
>>> way
>>> to do this w/o confusing the control flow).
>>>
>>> Richard.
>>>
 ChangeLog:
 2016-08-05  Yuri Rumyantsev  

 PR tree-optimization/71734
 * tree-ssa-loop-im.c (ref_indep_loop_p): Add new argument REF_LOOP.
 (outermost_indep_loop): Pass LOOP argumnet where REF was defined to
 ref_indep_loop_p.
 (ref_indep_loop_p_1): Fix commentary.
 (ref_indep_loop_p_2): Add additional argument REF_LOOP, introduce new
 variable NEW_SAFELEN which may have new value for SAFELEN, ignore
 dependencde for loop having greater safelen value, pass REF_LOOP in
 recursive call.
 (can_sm_ref_p): Pass LOOP as additional argument to ref_indep_loop_p.

 2016-08-03 16:44 GMT+03:00 Richard Biener :
> On Fri, Jul 29, 2016 at 4:00 PM, Yuri Rumyantsev  
> wrote:
>> Hi Richard.
>>
>> It turned out that the fix proposed by you does not work for liggomp
>> tes

Re: [Patch, tentative, reload] Make push_reload work with more types of subregs?

2016-08-09 Thread Bernd Schmidt


On 08/08/2016 05:26 PM, Senthil Kumar Selvaraj wrote:


I picked out the commit where you'd added SYMBOL_REF handling (rev
#190252), and patched that with the below code. Bootstrapped and
regtested on x86_64-pc-linux - the results were identical with and
without the patch. Is this good enough for trunk?



-  else if (GET_CODE (SUBREG_REG (in)) == SYMBOL_REF)
+  else if (GET_CODE (SUBREG_REG (in)) == SYMBOL_REF
+   || CONSTANT_P (SUBREG_REG (in))
+   || GET_CODE (SUBREG_REG (in)) == PLUS)
subreg_in_class = find_valid_class_1 (inmode,
  GET_MODE (SUBREG_REG (in)),
  rclass);


Actually SYMBOL_REF should also be CONSTANT_P. For integers it looks 
like it'll pass VOIDmode to find_valid_class_1 and just return NO_REGS. 
which I suppose is OK.


Would you mind removing the SYMBOL_REF test and retesting? Ok with that 
change.



Bernd

[PATCH, fixincludes] Fix PR bootstrap/72833

2016-08-09 Thread Bernd Edlinger

Hi!

Due to my recent cleanup of special_function_p the darwin bootstrap is
currently broken because the longjmp function misses the noreturn
attribute.  Therefore I revived an old solaris_longjmp_noreturn
fixinclude rule.

See the attached patch.

Boot-strap on i686-apple-darwin11 and x86_64-apple-darwin11 was
successful.

Is it OK for trunk?


Thanks
Bernd.
2016-08-09  Bernd Edlinger  

	PR bootstrap/72833
	* fixincl.tpl (version-compare): Fix generation with autogen 5.18.
	* inclhack.def (darwin_longjmp_noreturn): New fix.
	* fixincl.x: Regenerated.
	* tests/base/i386/setjmp.h [DARWIN_LONGJMP_NORETURN_CHECK]: new test.

Index: fixincludes/fixincl.tpl
===
--- fixincludes/fixincl.tpl	(revision 239193)
+++ fixincludes/fixincl.tpl	(working copy)
@@ -1,7 +1,7 @@
 [= AutoGen5 Template -*- Mode: C -*-
 x=fixincl.x =]
 [=
- (if (version-compare >= autogen-version "5.18")
+ (if (version-compare >= autogen-version "5.18.1")
  (dne "-D" " * " "/* ")
  (dne " * " "/* ") ) =]
  */
Index: fixincludes/fixincl.x
===
--- fixincludes/fixincl.x	(revision 239193)
+++ fixincludes/fixincl.x	(working copy)
@@ -1,12 +1,12 @@
 /*  -*- buffer-read-only: t -*- vi: set ro:
- * 
+ *
  * DO NOT EDIT THIS FILE   (fixincl.x)
- * 
- * It has been AutoGen-ed  June 10, 2016 at 12:56:52 PM by AutoGen 5.18.3
+ *
+ * It has been AutoGen-ed  August  8, 2016 at 08:46:37 PM by AutoGen 5.18
  * From the definitionsinclhack.def
  * and the template file   fixincl
  */
-/* DO NOT SVN-MERGE THIS FILE, EITHER Fri Jun 10 12:56:52 UTC 2016
+/* DO NOT SVN-MERGE THIS FILE, EITHER Mon Aug  8 20:46:37 CEST 2016
  *
  * You must regenerate it.  Use the ./genfixes script.
  *
@@ -15,7 +15,7 @@
  * certain ANSI-incompatible system header files which are fixed to work
  * correctly with ANSI C and placed in a directory that GNU C will search.
  *
- * This file contains 235 fixup descriptions.
+ * This file contains 236 fixup descriptions.
  *
  * See README for more information.
  *
@@ -2699,6 +2699,50 @@
 
 /* * * * * * * * * * * * * * * * * * * * * * * * * *
  *
+ *  Description of Darwin_Longjmp_Noreturn fix
+ */
+tSCC zDarwin_Longjmp_NoreturnName[] =
+ "darwin_longjmp_noreturn";
+
+/*
+ *  File name selection pattern
+ */
+tSCC zDarwin_Longjmp_NoreturnList[] =
+  "i386/setjmp.h\0";
+/*
+ *  Machine/OS name selection pattern
+ */
+tSCC* apzDarwin_Longjmp_NoreturnMachs[] = {
+"*-*-darwin*",
+(const char*)NULL };
+
+/*
+ *  content selection pattern - do fix if pattern found
+ */
+tSCC zDarwin_Longjmp_NoreturnSelect0[] =
+   "(.*longjmp\\(.*jmp_buf.*[^)]+\\));";
+
+/*
+ *  content bypass pattern - skip fix if pattern found
+ */
+tSCC zDarwin_Longjmp_NoreturnBypass0[] =
+   "__dead2";
+
+#defineDARWIN_LONGJMP_NORETURN_TEST_CT  2
+static tTestDesc aDarwin_Longjmp_NoreturnTests[] = {
+  { TT_NEGREP,   zDarwin_Longjmp_NoreturnBypass0, (regex_t*)NULL },
+  { TT_EGREP,zDarwin_Longjmp_NoreturnSelect0, (regex_t*)NULL }, };
+
+/*
+ *  Fix Command Arguments for Darwin_Longjmp_Noreturn
+ */
+static const char* apzDarwin_Longjmp_NoreturnPatch[] = {
+"format",
+"%1 __attribute__ ((__noreturn__));",
+(char*)NULL };
+
+/* * * * * * * * * * * * * * * * * * * * * * * * * *
+ *
  *  Description of Darwin_Private_Extern fix
  */
 tSCC zDarwin_Private_ExternName[] =
@@ -9633,9 +9677,9 @@
  *
  *  List of all fixes
  */
-#define REGEX_COUNT  272
+#define REGEX_COUNT  274
 #define MACH_LIST_SIZE_LIMIT 187
-#define FIX_COUNT235
+#define FIX_COUNT236
 
 /*
  *  Enumerate the fixes
@@ -9703,6 +9747,7 @@
 DARWIN_9_LONG_DOUBLE_FUNCS_2_FIXIDX,
 DARWIN_EXTERNC_FIXIDX,
 DARWIN_GCC4_BREAKAGE_FIXIDX,
+DARWIN_LONGJMP_NORETURN_FIXIDX,
 DARWIN_PRIVATE_EXTERN_FIXIDX,
 DARWIN_STDINT_1_FIXIDX,
 DARWIN_STDINT_2_FIXIDX,
@@ -10189,6 +10234,11 @@
  DARWIN_GCC4_BREAKAGE_TEST_CT, FD_MACH_ONLY | FD_SUBROUTINE,
  aDarwin_Gcc4_BreakageTests,   apzDarwin_Gcc4_BreakagePatch, 0 },
 
+  {  zDarwin_Longjmp_NoreturnName,zDarwin_Longjmp_NoreturnList,
+ apzDarwin_Longjmp_NoreturnMachs,
+ DARWIN_LONGJMP_NORETURN_TEST_CT, FD_MACH_ONLY | FD_SUBROUTINE,
+ aDarwin_Longjmp_NoreturnTests,   apzDarwin_Longjmp_NoreturnPatch, 0 },
+
   {  zDarwin_Private_ExternName,zDarwin_Private_ExternList,
  apzDarwin_Private_ExternMachs,
  DARWIN_PRIVATE_EXTERN_TEST_CT, FD_MACH_ONLY | FD_SUBROUTINE,
Index: fixincludes/inclhack.def
===
--- fixincludes/inclhack.def	(revision 239193)
+++ fixincludes/inclhack.def	(working copy)
@@ -1395,6 +1395,21 @@
 };
 
 /*
+ *  Before Mac OS X 10.8  doesn't mark longjump noreturn.
+ */
+fix = {
+hackname  = darwin_longjmp_noreturn;
+mach  = "*-*-darwin*";
+files = "i386/setjmp.h";
+bypass= "__dead2";
+select= "(.*lo

Re: Implement -Wimplicit-fallthrough (take 2): the rest

2016-08-09 Thread Marek Polacek

On Tue, Aug 09, 2016 at 12:24:03PM +0200, Jakub Jelinek wrote:
> On Tue, Aug 09, 2016 at 12:18:11PM +0200, Marek Polacek wrote:
> > > > On Wed, 27 Jul 2016, Marek Polacek wrote:
> > > > 
> > > > > And this is the rest.  Either I just adjusted a falls through comment,
> > > > > or I added __builtin_fallthrough ().  These were the cases where I was
> > > > > fairly sure that the fall through is intentional.
> > > > 
> > > > I saw one case where I think the warning is a bit over-active:
> > > > 
> > > > @@ -42072,6 +42089,7 @@ rdseed_step:
> > > >  case IX86_BUILTIN_ADDCARRYX64:
> > > >icode = CODE_FOR_addcarrydi;
> > > >mode0 = DImode;
> > > > +  gcc_fallthrough ();
> > > > 
> > > >  handlecarry:
> > > >arg0 = CALL_EXPR_ARG (exp, 0); /* unsigned char c_in.  */
> > > > 
> > > > I.e. it also warns if the following label is not a case label but a 
> > > > normal
> > > > one.  I don't think this counts as a classical fall-through and it IMHO
> > > > should not be warned about nor should it be marked.
> > 
> > Ok, I'll buy that.
> > 
> > > It's probably the same underlying issue I saw with a false-positive in one
> > > of the other patches.
> > 
> > Yes.  I'll adjust the warning to not warn for non-case labels.
> 
> What about the case where you have non-case labels followed by case labels?
> I think warning for:
>   case 1:
> something ();
>   lab:
>   case 2:
> something_else ();
> is still desirable, so just avoid the warning for
>   case 1:
> something ();
>   lab:
> something_else ();
> ?

All right, done, and an umpteenth test added.  Will post the new version later.
Thanks,

Marek

Re: [PATCH PR71734] Add missed check that reference defined inside loop.

2016-08-09 Thread Richard Biener

On Tue, Aug 9, 2016 at 2:05 PM, Yuri Rumyantsev  wrote:
> Richard,
>
> I checked that this move helps.
> Does it mean that I've got approval to integrate it to trunk?

Yes, if it survives bootstrap & regtest.

Richard.

> 2016-08-09 14:33 GMT+03:00 Richard Biener :
>> On Tue, Aug 9, 2016 at 1:26 PM, Yuri Rumyantsev  wrote:
>>> Richard,
>>>
>>> The patch proposed by you does not work properly for
>>> g++.dg/vect/pr70729-nest.cc test since the reference for this->S_n has
>>> been cached as dependent for outer loop and loop is not vectorized:
>>>
>>>  g++ -Ofast -fopenmp -mavx2 pr70729-nest.cc -c
>>> -fdump-tree-vect-details
>>> grep 'LOOP VECTORIZED' pr70729-nest.cc.149t.vect
>>> 
>>>
>>> You missed additional check I added before check on cached dependence.
>>
>> Ok, but it should get the correctness right?
>>
>> I suppose that if you move the cache checks inside the else clause it
>> would work?
>> I'd be ok with that change.
>>
>> Richard.
>>
>>> 2016-08-09 13:00 GMT+03:00 Richard Biener :
 On Tue, Aug 9, 2016 at 11:20 AM, Yuri Rumyantsev  
 wrote:
> Yes it is impossible since all basic blocks are handled from outer
> loops to innermost so we can not have the sequence with wrong
> dependence, i.e. we cached that reference is independent (due to
> safelen) but the same reference in outer loop must be evaluated as
> dependent. So we must re-evaluate only dependent references in loops
> having non-zero safelen attribute.

 Hmm.  I don't like depending on this implementation detail.  Does the
 attached patch work
 which simply avoids any positive/negative caching on safelen affected
 refs?  It also makes
 the query cheaper by avoiding the dive into child loops.

 Richard.

> 2016-08-09 11:44 GMT+03:00 Richard Biener :
>> On Fri, Aug 5, 2016 at 4:57 PM, Yuri Rumyantsev  
>> wrote:
>>> Richard,
>>>
>>> I added additional check before caching dependencies since (1) all
>>> statements in loop are handled in loop postorder order, i.e. form
>>> outer to inner; (2) we can change dependency for reference in subloops
>>> which have non-zero safelen attribute. So I propose to re-evaluate it
>>> in such cases. I don't see why we need to avoid dependence caching for
>>> all loop nests since pragma omp simd is used very rarely.
>>
>> You think it is impossible to construct a testcase which hits the
>> correctness issue?
>> "very rarely" is not a good argument to generate wrong code.
>>
>> Richard.
>>
>>> 2016-08-05 16:50 GMT+03:00 Richard Biener :
 On Fri, Aug 5, 2016 at 3:28 PM, Yuri Rumyantsev  
 wrote:
> Richard,
>
> Here is updated patch which implements your proposal - I pass loop
> instead of stmt to determine either REF is defined inside LOOP nest or
> not. I checked that for pr70729-nest.cc the reference this->S_n  for
> statements which are out of innermost loop are  not considered as
> independent as you pointed out.
>
> Regression testing did not show any new failures and both failed tests
> from libgomp.fortran suite now passed.
>
> Is it OK for trunk?

 I don't quite understand

 +  /* Ignore dependence for loops having greater safelen.  */
 +  if (new_safelen == safelen
 +  && bitmap_bit_p (&ref->dep_loop, LOOP_DEP_BIT (loop->num, 
 stored_p)))
  return false;

 this seems to suggest (correctly I think) that we cannot rely on the 
 caching
 for safelen, neither for optimal results (you seem to address that) 
 but also
 not for correctness (we cache the no-dep result from a safelen run and
 then happily re-use that info for a ref that is not safe for safelen).

 It seems to me we need to avoid any caching if we made things 
 independent
 because of safelen and simply not do the dep test afterwards.  this 
 means
 inlining ref_indep_loop_p_1 partly into _2 (not sure if there's a 
 great way
 to do this w/o confusing the control flow).

 Richard.

> ChangeLog:
> 2016-08-05  Yuri Rumyantsev  
>
> PR tree-optimization/71734
> * tree-ssa-loop-im.c (ref_indep_loop_p): Add new argument REF_LOOP.
> (outermost_indep_loop): Pass LOOP argumnet where REF was defined to
> ref_indep_loop_p.
> (ref_indep_loop_p_1): Fix commentary.
> (ref_indep_loop_p_2): Add additional argument REF_LOOP, introduce new
> variable NEW_SAFELEN which may have new value for SAFELEN, ignore
> dependencde for loop having greater safelen value, pass REF_LOOP in
> recursive call.
> (can_sm_ref_p): Pass LOOP as additional argument to ref_indep_loop_p.
>
> 2016-08-03 16:

Re: [PATCH] Fix POW2 histogram

2016-08-09 Thread Nathan Sidwell


On 08/08/16 12:56, Martin Liška wrote:

Hello.

Currently, we utilize pow2 profile histogram to track gimple STMTs like this: 
ssa_name_x % value.

void
__gcov_pow2_profiler (gcov_type *counters, gcov_type value)
{
  if (value & (value - 1))
counters[0]++;
  else
counters[1]++;
}

Although __gcov_pow2_profiler function wrongly handles 0 (which is not power of 
two), it's impossible
to write a test-case which would not expose a division by zero. As one can 
potentially use the same profiler
for a different purpose, I would like to fix it. Apart from that, we've got a 
small bug in a dump function
of the POW2 histograms.

Survives make check -k -j10 RUNTESTFLAGS="tree-prof.exp"
Ready for trunk?


Ok.

nathan

Re: [PATCH 5/N] Add new *_atomic counter update function, (-fprofile-update=atomic)

2016-08-09 Thread Nathan Sidwell


On 08/08/16 13:03, Martin Liška wrote:



v3: fixed wrong defines in libgcc/Makefine.in


ok, thanks

Re: [PATCH 2/N] Fix usage of POW2 histogram

2016-08-09 Thread Nathan Sidwell


On 08/09/16 04:41, Martin Liška wrote:

Another small follow-up changes instrumentation of MOD exprs
to not instrument divisors different from SSA_NAME.

Patch survives:
make check -k -j10 RUNTESTFLAGS="tree-prof.exp"



ok, thanks

[MIPS,committed] Use create_tmp_var_raw in mips_atomic_assign_expand_fenv

2016-08-09 Thread Matthew Fortune

I have applied the MIPS specific fix for PR65345.  It is a mechanical
change.  This fixes the MIPS failures in atomic/pr65345-4.c.

gcc/
PR c/65345
* config/mips/mips.c (mips_atomic_assign_expand_fenv):
Use create_tmp_var_raw instead of create_tmp_var.

Thanks,
Matthew

diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index 57b7633..88f4038 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -21991,9 +21991,9 @@ mips_atomic_assign_expand_fenv (tree *hold, tree 
*clear, tree *update)
 {
   if (!TARGET_HARD_FLOAT_ABI)
 return;
-  tree exceptions_var = create_tmp_var (MIPS_ATYPE_USI);
-  tree fcsr_orig_var = create_tmp_var (MIPS_ATYPE_USI);
-  tree fcsr_mod_var = create_tmp_var (MIPS_ATYPE_USI);
+  tree exceptions_var = create_tmp_var_raw (MIPS_ATYPE_USI);
+  tree fcsr_orig_var = create_tmp_var_raw (MIPS_ATYPE_USI);
+  tree fcsr_mod_var = create_tmp_var_raw (MIPS_ATYPE_USI);
   tree get_fcsr = mips_builtin_decls[MIPS_GET_FCSR];
   tree set_fcsr = mips_builtin_decls[MIPS_SET_FCSR];
   tree get_fcsr_hold_call = build_call_expr (get_fcsr, 0);

Re: [PATCH] Set -fprofile-update=atomic when -pthread is present

2016-08-09 Thread Nathan Sidwell


On 08/09/16 07:24, Martin Liška wrote:

Hi.

As mention in [1], enabling -fprofile-update=atomic when -pthread is logical
thing and is quite expected default behavior.

Ready for trunk?

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58306#c21


This certainly requires changes to invoke.texi, and  possibly NEWS.

nathan

Re: [PATCH] Improve backwards threading

2016-08-09 Thread Richard Biener

On Fri, 5 Aug 2016, Jeff Law wrote:

> On 08/05/2016 07:36 AM, Richard Biener wrote:
> > @@ -560,6 +562,14 @@ fsm_find_control_statement_thread_paths
> >   edge taken_edge = profitable_jump_thread_path (path, bbi, name,
> > arg);
> >   if (taken_edge)
> > {
> > + /* If the taken edge is a loop entry avoid mashing two
> > +loops into one with multiple latches by splitting
> > +the edge.  This only works if that block won't become
> > +a latch of this loop.  */
> > + if ((bb_loop_depth (taken_edge->src)
> > +  < bb_loop_depth (taken_edge->dest))
> > + && ! single_succ_p (bbi))
> > +   split_edge (taken_edge);
> >   if (bb_loop_depth (taken_edge->src)
> >   >= bb_loop_depth (taken_edge->dest))
> > convert_and_register_jump_thread_path (path, taken_edge);
> > 
> > note you need the PR72772 fix to trigger all this.
> I'm a little confused here.  In the case where the taken edge goes into a
> deeper loop nest you're splitting the edge -- to what end?  The backwards
> threader isn't going to register that jump thread.  So if this is fixing
> something, then we've got the fix in the wrong place.

Ok, so I've figured that splitting the edge is indeed pointless unless
it does exactly the same as creating a forwarder.  We may not blindly
thread backwards to a loop header because of correctness issues in
re-using old loop meta-data for that loop (and in the 
ubsan.exp=pr71403-2.c case miscompiling the testcase).  What we need
is a forwarder block we can thread to which eventually becomes the
new loop header.  Note this is also what we'd achieve with properly
initializing loops in the threader - sth we should do anyway with
looking at loop meta data.  This is likely also why the old threader has
(in DOM):

  /* We need to know loop structures in order to avoid destroying them
 in jump threading.  Note that we still can e.g. thread through loop
 headers to an exit edge, or through loop header to the loop body, 
assuming
 that we update the loop info.

 TODO: We don't need to set LOOPS_HAVE_PREHEADERS generally, but due
 to several overly conservative bail-outs in jump threading, case
 gcc.dg/tree-ssa/pr21417.c can't be threaded if loop preheader is
 missing.  We should improve jump threading in future then
 LOOPS_HAVE_PREHEADERS won't be needed here.  */
  loop_optimizer_init (LOOPS_HAVE_PREHEADERS | LOOPS_HAVE_SIMPLE_LATCHES);

thus we run into exactly those cases now in the FSM threader.  Thus
the following patch fixes the two testcases with the PR72772 fix
applied as well.

It's also possible to create a forwarder on-demand at the place I
splitted the edge and avoid creating preheaders for all loops but I
as we should init the loop optimizer to be able to look at loop
metadata anyway there's not much point in doing that.

Bootstrap / regtest pending on x86_64-unknown-linux-gnu, ok for trunk?

Thanks,
Richard.

2016-08-09  Richard Biener  

* tree-ssa-threadbackward.c (pass_data_thread_jumps): Remove
unconditional TODO_cleanup_cfg.
(pass_thread_jumps::execute): Initialize loops, perform a CFG
cleanup only if we threaded a jump.

Index: gcc/tree-ssa-threadbackward.c
===
*** gcc/tree-ssa-threadbackward.c   (revision 239276)
--- gcc/tree-ssa-threadbackward.c   (working copy)
*** const pass_data pass_data_thread_jumps =
*** 674,680 
0,
0,
0,
!   ( TODO_cleanup_cfg | TODO_update_ssa ),
  };

  class pass_thread_jumps : public gimple_opt_pass
--- 674,680 
0,
0,
0,
!   TODO_update_ssa,
  };

  class pass_thread_jumps : public gimple_opt_pass
*** pass_thread_jumps::gate (function *fun A
*** 699,704 
--- 699,706 
  unsigned int
  pass_thread_jumps::execute (function *fun)
  {
+   loop_optimizer_init (LOOPS_HAVE_PREHEADERS | LOOPS_HAVE_SIMPLE_LATCHES);
+ 
/* Try to thread each block with more than one successor.  */
basic_block bb;
FOR_EACH_BB_FN (bb, fun)
*** pass_thread_jumps::execute (function *fu
*** 706,713 
if (EDGE_COUNT (bb->succs) > 1)
find_jump_threads_backwards (bb);
  }
!   thread_through_all_blocks (true);
!   return 0;
  }

  }
--- 708,717 
if (EDGE_COUNT (bb->succs) > 1)
find_jump_threads_backwards (bb);
  }
!   bool changed = thread_through_all_blocks (true);
! 
!   loop_optimizer_finalize ();
!   return changed ? TODO_cleanup_cfg : 0;
  }

  }

Re: Ping^2: [PATCH build/doc] Replacing libiberty with gnulib

2016-08-09 Thread DJ Delorie

ayush goel  writes:
> This patch https://gcc.gnu.org/ml/gcc-patches/2016-07/msg01302.html
> is still pending review.

The build portions are OK, assuming a global maintainer has approved of
adding gnulib in general.

Re: protected alloca class for malloc fallback

2016-08-09 Thread Aldy Hernandez


On 08/05/2016 01:55 PM, Richard Biener wrote:

Hi Richard.


Please don't use std::string.  For string building you can use obstacks.


Alright let's talk details then so I can write things up in a way you 
approve of.


Take for instance simple uses like all the tree_*check_failed routines, 
which I thought were great candidates for std::string-- they're going to 
be outputted to the screen or disk which is clearly many times more 
expensive than the malloc or overhead of std::string:


  length += strlen ("expected ");
  buffer = tmp = (char *) alloca (length);
  length = 0;
  while ((code = (enum tree_code) va_arg (args, int)))
{
  const char *prefix = length ? " or " : "expected ";

  strcpy (tmp + length, prefix);
  length += strlen (prefix);
  strcpy (tmp + length, get_tree_code_name (code));
  length += strlen (get_tree_code_name (code));
}

Do you suggest using obstacks here, or did you have something else in mind?

How about if it's something like the above but there are multiple exit 
points from the function.  I mean, we still need to call obstack_free() 
on all exit points, which is what I wanted to avoid for something as 
simple as building a throw-away string.


Aldy

Re: protected alloca class for malloc fallback

2016-08-09 Thread Bernd Schmidt




On 08/09/2016 03:17 PM, Aldy Hernandez wrote:

On 08/05/2016 01:55 PM, Richard Biener wrote:

Hi Richard.


Please don't use std::string.  For string building you can use obstacks.


Alright let's talk details then so I can write things up in a way you
approve of.

Take for instance simple uses like all the tree_*check_failed routines,
which I thought were great candidates for std::string-- they're going to
be outputted to the screen or disk which is clearly many times more
expensive than the malloc or overhead of std::string:

  length += strlen ("expected ");
  buffer = tmp = (char *) alloca (length);
  length = 0;
  while ((code = (enum tree_code) va_arg (args, int)))
{
  const char *prefix = length ? " or " : "expected ";

  strcpy (tmp + length, prefix);
  length += strlen (prefix);
  strcpy (tmp + length, get_tree_code_name (code));
  length += strlen (get_tree_code_name (code));
}

Do you suggest using obstacks here, or did you have something else in mind?

How about if it's something like the above but there are multiple exit
points from the function.  I mean, we still need to call obstack_free()
on all exit points, which is what I wanted to avoid for something as
simple as building a throw-away string.


I think that given our choice to switch to C++, we should use the 
language in idiomatic ways, and the standard library is part of that. 
Otherwise the language switch makes even less sense than it did in the 
first place.



Bernd

[PR72835] Incorrect arithmetic optimization involving bitfield arguments

2016-08-09 Thread kugan


Hi,

The test-case in PR72835 is failing with -O2 and passing with 
-fno-tree-reassoc. It also passes with -O2 -fno-tree-vrp.


diff of .115t.dse2 and .116t.reassoc1 for the c++ testcase is as 
follows, which looks OK.


+  unsigned int _16;
+  unsigned int _17;
+  unsigned int _18;

   :
   _1 = s1.m2;
   _2 = (unsigned int) _1;
   _3 = s1.m3;
   _4 = (unsigned int) _3;
-  _5 = -_4;
-  _6 = _2 * _5;
+  _5 = _4;
+  _6 = _5 * _2;
   var_32.0_7 = var_32;
   _8 = (unsigned int) var_32.0_7;
   _9 = s1.m1;
   _10 = (unsigned int) _9;
-  _11 = -_10;
-  _12 = _8 * _11;
-  c_14 = _6 + _12;
+  _11 = _10;
+  _12 = _11 * _8;
+  _16 = _12 + _6;
+  _18 = _16;
+  _17 = -_18;
+  c_14 = _17;
   if (c_14 != 4098873984)


However, I noticed that when we re-associate and assign different 
operands to the stmts, we are not resetting the flow information to the 
LHS. This looks wrong. Attached patch resets it. With this, the 
testcases in TH PR is now working.



Bootstrap and regression testing for x86_64-linux-gnu is in progress. Is 
this OK for trunk if there is no regression.


Thanks,
Kugan

gcc/testsuite/ChangeLog:

2016-08-09  Kugan Vivekanandarajah  

PR tree-optimization/72835
* gcc.dg/torture/pr72835.c: New test.

gcc/ChangeLog:

2016-08-09  Kugan Vivekanandarajah  

PR tree-optimization/72835
* tree-ssa-reassoc.c (rewrite_expr_tree): Reset value_range of LHS when
operands are changed.
(rewrite_expr_tree_parallel): Likewise.
diff --git a/gcc/testsuite/gcc.dg/torture/pr72835.c 
b/gcc/testsuite/gcc.dg/torture/pr72835.c
index e69de29..d7d0a8d 100644
--- a/gcc/testsuite/gcc.dg/torture/pr72835.c
+++ b/gcc/testsuite/gcc.dg/torture/pr72835.c
@@ -0,0 +1,36 @@
+/* PR tree-optimization/72835 */
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+struct struct_1 {
+unsigned int m1 : 6 ;
+unsigned int m2 : 24 ;
+unsigned int m3 : 6 ;
+};
+
+unsigned short var_32 = 0x2d10;
+
+struct struct_1 s1;
+
+void init ()
+{
+  s1.m1 = 4;
+  s1.m2 = 0x7ca4b8;
+  s1.m3 = 24;
+}
+
+void foo ()
+{
+  unsigned int c =
+((unsigned int) s1.m2) * (-((unsigned int) s1.m3))
++ (var_32) * (-((unsigned int) (s1.m1)));
+  if (c != 4098873984)
+__builtin_abort ();
+}
+
+int main ()
+{
+init ();
+foo ();
+return 0;
+}
diff --git a/gcc/tree-ssa-reassoc.c b/gcc/tree-ssa-reassoc.c
index 7fd7550..b864ed1 100644
--- a/gcc/tree-ssa-reassoc.c
+++ b/gcc/tree-ssa-reassoc.c
@@ -3945,6 +3945,7 @@ rewrite_expr_tree (gimple *stmt, unsigned int opindex,
  gimple_assign_set_rhs1 (stmt, oe1->op);
  gimple_assign_set_rhs2 (stmt, oe2->op);
  update_stmt (stmt);
+ reset_flow_sensitive_info (lhs);
}
 
  if (rhs1 != oe1->op && rhs1 != oe2->op)
@@ -4193,9 +4194,11 @@ rewrite_expr_tree_parallel (gassign *stmt, int width,
 others are recreated.  */
   if (i == stmt_num - 1)
{
+ tree lhs = gimple_assign_lhs (stmts[i]);
  gimple_assign_set_rhs1 (stmts[i], op1);
  gimple_assign_set_rhs2 (stmts[i], op2);
  update_stmt (stmts[i]);
+ reset_flow_sensitive_info (lhs);
}
   else
{

[PATCH] Fix PR middle-end/71654 (missed shortening of a compare)

2016-08-09 Thread Patrick Palka

Hi, this patch makes match.pd to shorten comparisons that involve a
sign-changing cast from a shorter unsigned type to a wider signed type.
This should be safe to do because a wider signed type can hold all the
possible values of a shorter unsigned type.

This change causes vrp23.c and vrp24.c to fail because it makes the
"n_sets > 0" tests within get trivially simplified by forwprop and so
there is nothing left to simplify by VRP.  Fixed by passing
-fno-tree-forwprop in these tests.

Bootstrapped + regtested on x86_64-pc-linux-gnu.  Does this look OK to commit?

gcc/ChangeLog:

PR middle-end/71654
* match.pd ((T)A CMP (T)B -> A CMP B): Allow (T)A to be a
sign-changing cast from a shorter unsigned type to a wider
signed type.

gcc/testsuite/ChangeLog:

PR middle-end/71654
* gcc.dg/c-c++-common/pr71654.c: New test.
* gcc.dg/tree-ssa/vrp23: Add -fno-tree-forwprop to
dg-options.
* gcc.dg/tree-ssa/vrp24: Likewise.
---
 gcc/match.pd  |  4 +++-
 gcc/testsuite/c-c++-common/pr71654.c  | 28 
 gcc/testsuite/gcc.dg/tree-ssa/vrp23.c |  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp24.c |  2 +-
 4 files changed, 33 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/pr71654.c

diff --git a/gcc/match.pd b/gcc/match.pd
index ac7cfff..da87b02 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2350,7 +2350,9 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (if (TYPE_PRECISION (TREE_TYPE (@0)) > TYPE_PRECISION (TREE_TYPE (@00)))
  /* If possible, express the comparison in the shorter mode.  */
  (if ((cmp == EQ_EXPR || cmp == NE_EXPR
-  || TYPE_UNSIGNED (TREE_TYPE (@0)) == TYPE_UNSIGNED (TREE_TYPE (@00)))
+  || TYPE_UNSIGNED (TREE_TYPE (@0)) == TYPE_UNSIGNED (TREE_TYPE (@00))
+  || (!TYPE_UNSIGNED (TREE_TYPE (@0))
+  && TYPE_UNSIGNED (TREE_TYPE (@00
  && (types_match (TREE_TYPE (@10), TREE_TYPE (@00))
  || ((TYPE_PRECISION (TREE_TYPE (@00))
   >= TYPE_PRECISION (TREE_TYPE (@10)))
diff --git a/gcc/testsuite/c-c++-common/pr71654.c 
b/gcc/testsuite/c-c++-common/pr71654.c
new file mode 100644
index 000..2942493
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/pr71654.c
@@ -0,0 +1,28 @@
+/* PR middle-end/71654  */
+/* { dg-do link }  */
+/* { dg-options "-O2" }  */
+
+unsigned char i0, i1;
+
+void foo (void);
+
+int
+main (void)
+{
+  int j = i0;
+  if (j < 4)
+{
+  if (i0 & 4)
+foo ();
+}
+
+  unsigned int k = i1;
+  if (k < 8)
+{
+  if (i1 & 8)
+foo ();
+}
+
+  return 0;
+}
+
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp23.c 
b/gcc/testsuite/gcc.dg/tree-ssa/vrp23.c
index b877ccc..ae68c090 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vrp23.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp23.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-vrp1-details" } */
+/* { dg-options "-O2 -fno-tree-forwprop -fdump-tree-vrp1-details" } */
 
 void aa (void);
 void aos (void);
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp24.c 
b/gcc/testsuite/gcc.dg/tree-ssa/vrp24.c
index e740575..853ee21 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vrp24.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp24.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-vrp1-details" } */
+/* { dg-options "-O2 -fno-tree-forwprop -fdump-tree-vrp1-details" } */
 
 
 struct rtx_def;
-- 
2.9.2.614.g990027a

Re: [PATCH AArch64/V3]Add new patterns for vcond_mask and vec_cmp

2016-08-09 Thread James Greenhalgh

On Mon, Aug 08, 2016 at 12:10:00PM +0100, Bin.Cheng wrote:
> On Mon, Aug 8, 2016 at 11:40 AM, James Greenhalgh
>  wrote:
> > On Mon, Aug 01, 2016 at 01:18:53PM +, Bin Cheng wrote:
> >> Hi,
> >> This is the 3rd version patch implementing vcond_mask and vec_cmp patterns 
> >> on
> >> AArch64.  Bootstrap and test along with next patch on AArch64, is it OK?
> >
> > OK, with a couple of comments below, one on an extension and once style nit.
> >
> >> 2016-07-28  Alan Lawrence  
> >>   Renlin Li  
> >>   Bin Cheng  
> >>
> >>   * config/aarch64/aarch64-simd.md (vec_cmp): New pattern.
> >>   (vec_cmp): New pattern.
> >>   (vec_cmpu): New pattern.
> >>   (vcond_mask_): New pattern.
> >
> >> +(define_expand "vcond_mask_"
> >> +  [(match_operand:VALLDI 0 "register_operand")
> >> +   (match_operand:VALLDI 1 "nonmemory_operand")
> >> +   (match_operand:VALLDI 2 "nonmemory_operand")
> >> +   (match_operand: 3 "register_operand")]
> >> +  "TARGET_SIMD"
> >> +{
> >> +  /* If we have (a = (P) ? -1 : 0);
> >> + Then we can simply move the generated mask (result must be int).  */
> >> +  if (operands[1] == CONSTM1_RTX (mode)
> >> +  && operands[2] == CONST0_RTX (mode))
> >> +emit_move_insn (operands[0], operands[3]);
> >> +  /* Similarly, (a = (P) ? 0 : -1) is just inverting the generated mask.  
> >> */
> >> +  else if (operands[1] == CONST0_RTX (mode)
> >> +&& operands[2] == CONSTM1_RTX (mode))
> >> +emit_insn (gen_one_cmpl2 (operands[0], operands[3]));
> Hi,
> Thanks for reviewing, here is updated patch.
> >
> > Should we be also be catching these in a generic way before expanding?
> This pattern was copied unchanged from Alan's original patch.
> Yes, standard match-and-simplify pattern may be added to check such
> special simplifications.  Not sure what the current behavior is, we'd
> better keep this at expanding too?

Yes, that's what I'd expect.

> Comment issues fixed.  Is this version OK?

Yes this is OK.

Thanks,
James

Go patch committed: Rewrite compiler directive support

2016-08-09 Thread Ian Lance Taylor

This patch to the Go frontend rewrites the compiler directive support
to recognize all the compiler directives handled by the current gc
compiler.  They are now turned into flags attached to a function or
function declaration, or, for the case of go:linkname, into a map
attached to the Lex object.  No new directives are handled here, they
are just recognized for later support.  Bootstrapped and ran Go
testsuite on x86_64-pc-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 239261)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-3b9c57a35370f26e6cf5839084e367e75e45ec97
+58be5c6c7d92182dec50a62c8e319d2d7aab12a4
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/gogo.cc
===
--- gcc/go/gofrontend/gogo.cc   (revision 238653)
+++ gcc/go/gofrontend/gogo.cc   (working copy)
@@ -4439,7 +4439,7 @@ Function::Function(Function_type* type,
   : type_(type), enclosing_(enclosing), results_(NULL),
 closure_var_(NULL), block_(block), location_(location), labels_(),
 local_type_count_(0), descriptor_(NULL), fndecl_(NULL), defer_stack_(NULL),
-is_sink_(false), results_are_named_(false), nointerface_(false),
+pragmas_(0), is_sink_(false), results_are_named_(false),
 is_unnamed_type_stub_method_(false), calls_recover_(false),
 is_recover_thunk_(false), has_recover_thunk_(false),
 calls_defer_retaddr_(false), is_type_specific_function_(false),
@@ -4511,6 +4511,24 @@ Function::update_result_variables()
 (*p)->result_var_value()->set_function(this);
 }
 
+// Whether this method should not be included in the type descriptor.
+
+bool
+Function::nointerface() const
+{
+  go_assert(this->is_method());
+  return (this->pragmas_ & GOPRAGMA_NOINTERFACE) != 0;
+}
+
+// Record that this method should not be included in the type
+// descriptor.
+
+void
+Function::set_nointerface()
+{
+  this->pragmas_ |= GOPRAGMA_NOINTERFACE;
+}
+
 // Return the closure variable, creating it if necessary.
 
 Named_object*
@@ -5042,7 +5060,8 @@ Function::get_or_make_decl(Gogo* gogo, N
   // We want to put a nointerface function into a unique section
   // because there is a good chance that the linker garbage
   // collection can discard it.
-  bool in_unique_section = this->in_unique_section_ || this->nointerface_;
+  bool in_unique_section = (this->in_unique_section_
+   || (this->is_method() && this->nointerface()));
 
   Btype* functype = this->type_->get_backend_fntype(gogo);
   this->fndecl_ =
Index: gcc/go/gofrontend/gogo.h
===
--- gcc/go/gofrontend/gogo.h(revision 239002)
+++ gcc/go/gofrontend/gogo.h(working copy)
@@ -975,23 +975,22 @@ class Function
   results_are_named() const
   { return this->results_are_named_; }
 
+  // Set the pragmas for this function.
+  void
+  set_pragmas(unsigned int pragmas)
+  {
+this->pragmas_ = pragmas;
+  }
+
   // Whether this method should not be included in the type
   // descriptor.
   bool
-  nointerface() const
-  {
-go_assert(this->is_method());
-return this->nointerface_;
-  }
+  nointerface() const;
 
   // Record that this method should not be included in the type
   // descriptor.
   void
-  set_nointerface()
-  {
-go_assert(this->is_method());
-this->nointerface_ = true;
-  }
+  set_nointerface();
 
   // Record that this function is a stub method created for an unnamed
   // type.
@@ -1238,12 +1237,12 @@ class Function
   // distinguish the defer stack for one function from another.  This
   // is NULL unless we actually need a defer stack.
   Temporary_statement* defer_stack_;
+  // Pragmas for this function.  This is a set of GOPRAGMA bits.
+  unsigned int pragmas_;
   // True if this function is sink-named.  No code is generated.
   bool is_sink_ : 1;
   // True if the result variables are named.
   bool results_are_named_ : 1;
-  // True if this method should not be included in the type descriptor.
-  bool nointerface_ : 1;
   // True if this function is a stub method created for an unnamed
   // type.
   bool is_unnamed_type_stub_method_ : 1;
@@ -1305,7 +1304,7 @@ class Function_declaration
  public:
   Function_declaration(Function_type* fntype, Location location)
 : fntype_(fntype), location_(location), asm_name_(), descriptor_(NULL),
-  fndecl_(NULL)
+  fndecl_(NULL), pragmas_(0)
   { }
 
   Function_type*
@@ -1325,6 +1324,13 @@ class Function_declaration
   set_asm_name(const std::string& asm_name)
   { this->asm_name_ = asm_name; }
 
+  // Set the pragmas for this function.
+  void
+  set_pragmas(unsigned int pragmas)
+  {
+this->pragmas_ = pragmas;
+  }
+
   // Return an expression for the function desc

C++ PATCH for C++17 constexpr lambda

2016-08-09 Thread Jason Merrill

The first patch fixes a constexpr issue whereby we were complaining
about passing a non-constant empty class to a constexpr function; this
is OK because there's no actual copying involved.

The second patch implements the C++17 constexpr lambda proposal.  This
was mostly pretty straightforward.  A few notes:

* We now call potential_constant_expression on the entire
DECL_SAVED_TREE of a function, which turned up a few unhandled tree
codes.

* When we diagnose constexpr failure, we want to look through the
static thunk returned from the lambda conversion operator, since it
isn't part of the language.

* Looking at those diagnostics also led me to notice that we were
printing temploid functions as template and bindings even when the
function appears as part of an expression.  This is extremely
convoluted and unhelpful when we're looking at a call to the
conversion operator of a generic lambda, which involves a complicated
decltype.  So we don't do that anymore in expression context.

Tested x86_64-pc-linux-gnu, applied to trunk.
commit a906fc606677b942489aa21392aca1dc0837c856
Author: Jason Merrill 
Date:   Sat Aug 6 23:46:27 2016 -0400

Fix empty class parameters with constexpr.

PR c++/67131
* class.c (is_really_empty_class): Call complete_type.
* constexpr.c (cxx_eval_constant_expression): Check
is_really_empty_class.
(potential_constant_expression_1): Likewise.  Check for error type.

diff --git a/gcc/cp/class.c b/gcc/cp/class.c
index f834965..d898c38 100644
--- a/gcc/cp/class.c
+++ b/gcc/cp/class.c
@@ -8419,7 +8419,7 @@ is_really_empty_class (tree type)
 
   /* CLASSTYPE_EMPTY_P isn't set properly until the class is actually laid
 out, but we'd like to be able to check this before then.  */
-  if (COMPLETE_TYPE_P (type) && is_empty_class (type))
+  if (COMPLETE_TYPE_P (complete_type (type)) && is_empty_class (type))
return true;
 
   for (binfo = TYPE_BINFO (type), i = 0;
diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index e7b08c8..2ced9c6 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -3670,7 +3670,13 @@ cxx_eval_constant_expression (const constexpr_ctx *ctx, 
tree t,
 CONST_DECL for aggregate constants.  */
   if (lval)
return t;
-  if (ctx->strict)
+  if (is_really_empty_class (TREE_TYPE (t)))
+   {
+ /* If the class is empty, we aren't actually loading anything.  */
+ r = build_constructor (TREE_TYPE (t), NULL);
+ TREE_CONSTANT (r) = true;
+   }
+  else if (ctx->strict)
r = decl_really_constant_value (t);
   else
r = decl_constant_value (t);
@@ -3705,7 +3711,7 @@ cxx_eval_constant_expression (const constexpr_ctx *ctx, 
tree t,
/* Defer in case this is only used for its type.  */;
   else if (TREE_CODE (TREE_TYPE (t)) == REFERENCE_TYPE)
/* Defer, there's no lvalue->rvalue conversion.  */;
-  else if (is_empty_class (TREE_TYPE (t)))
+  else if (is_really_empty_class (TREE_TYPE (t)))
{
  /* If the class is empty, we aren't actually loading anything.  */
  r = build_constructor (TREE_TYPE (t), NULL);
@@ -4736,6 +4742,9 @@ potential_constant_expression_1 (tree t, bool want_rval, 
bool strict,
 }
   if (CONSTANT_CLASS_P (t))
 return true;
+  if (CODE_CONTAINS_STRUCT (TREE_CODE (t), TS_TYPED)
+  && TREE_TYPE (t) == error_mark_node)
+return false;
 
   switch (TREE_CODE (t))
 {
@@ -4891,12 +4900,14 @@ potential_constant_expression_1 (tree t, bool 
want_rval, bool strict,
 
 case VAR_DECL:
   if (want_rval
+ && !var_in_constexpr_fn (t)
+ && !type_dependent_expression_p (t)
  && !decl_constant_var_p (t)
  && (strict
  || !CP_TYPE_CONST_NON_VOLATILE_P (TREE_TYPE (t))
  || !DECL_INITIALIZED_BY_CONSTANT_EXPRESSION_P (t))
- && !var_in_constexpr_fn (t)
- && !type_dependent_expression_p (t))
+ && COMPLETE_TYPE_P (TREE_TYPE (t))
+ && !is_really_empty_class (TREE_TYPE (t)))
 {
   if (flags & tf_error)
 non_const_var_error (t);
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-empty12.C 
b/gcc/testsuite/g++.dg/cpp0x/constexpr-empty12.C
new file mode 100644
index 000..7463442
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-empty12.C
@@ -0,0 +1,5 @@
+// { dg-do compile { target c++11 } }
+
+struct A { } a;
+constexpr int f (A a) { return 42; }
+constexpr int i = f(a);
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-empty13.C 
b/gcc/testsuite/g++.dg/cpp0x/constexpr-empty13.C
new file mode 100644
index 000..1896ead
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-empty13.C
@@ -0,0 +1,7 @@
+// { dg-do compile { target c++11 } }
+
+struct A {
+  struct B { } b;
+} a;
+constexpr int f (A a) { return 42; }
+constexpr int i = f(a);
diff --git a/gcc/testsuite/g++.dg/cpp1y/var-templ42.C 
b/gcc/testsuite/g++.dg/cpp1y/var-templ4

Re: C++ PATCH for c++/56701 (treating 'this' as an rvalue)

2016-08-09 Thread Jason Merrill

On Thu, Mar 28, 2013 at 2:14 PM, Jason Merrill  wrote:
> When 'this' appears in an expression, it should be an rvalue rather than a
> const lvalue; in C++11, the distinction matters.

And likewise for &*this.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit 9258170ab23137c255e656ac76c96736a2e175d6
Author: Jason Merrill 
Date:   Tue Aug 9 00:32:10 2016 -0400

PR c++/56701 - wrong type of &*this

* typeck.c (cp_build_addr_expr_1): Remove special *this handling.

diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c
index 78d443b..bedc453 100644
--- a/gcc/cp/typeck.c
+++ b/gcc/cp/typeck.c
@@ -5626,11 +5626,6 @@ cp_build_addr_expr_1 (tree arg, bool strict_lvalue, 
tsubst_flags_t complain)
   /* Let &* cancel out to simplify resulting code.  */
   if (INDIRECT_REF_P (arg))
 {
-  /* We don't need to have `current_class_ptr' wrapped in a
-NON_LVALUE_EXPR node.  */
-  if (arg == current_class_ref)
-   return current_class_ptr;
-
   arg = TREE_OPERAND (arg, 0);
   if (TREE_CODE (TREE_TYPE (arg)) == REFERENCE_TYPE)
{
diff --git a/gcc/testsuite/g++.dg/cpp0x/rv-this2.C 
b/gcc/testsuite/g++.dg/cpp0x/rv-this2.C
new file mode 100644
index 000..fcfd265
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/rv-this2.C
@@ -0,0 +1,8 @@
+// PR c++/56701
+// { dg-do compile { target c++11 } }
+
+struct A
+{
+  void f(){ A*&& a = &*this; }
+};
+int main(){}

[hsa-branch] Avoid register moves for unnecessary conversions

2016-08-09 Thread Martin Jambor

Hi,

in the first pre-release HSAIL/BRIG specifications, registers had a type
which they now don't, they only have size.  We have thought about
removing it from our internal representation but it is actually useful
to know what the type of the stored value is when it is actually
necessary to perform conversions.  Nevertheless, we still often create
unnecessary register moves, which however can be avoided by relaxing our
main conversion function like the patch below does.

Thanks,

Martin

2016-08-03  Martin Jambor  

* hsa-gen.c (get_in_type): Return this if it is a register of
matching size.
---
 gcc/hsa-gen.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/gcc/hsa-gen.c b/gcc/hsa-gen.c
index ecd6f8a..0e48377 100644
--- a/gcc/hsa-gen.c
+++ b/gcc/hsa-gen.c
@@ -984,6 +984,14 @@ hsa_op_with_type::get_in_type (BrigType16_t dtype, hsa_bb 
*hbb)
   dest = new hsa_op_reg (dtype);
   hbb->append_insn (new hsa_insn_cvt (dest, this));
 }
+  else if (is_a  (this))
+{
+  /* In the end, HSA registers do not really have types, only sizes, so if
+the sizes match, we can use the register directly.  */
+  gcc_checking_assert (hsa_type_bit_size (dtype)
+  == hsa_type_bit_size (m_type));
+  return this;
+}
   else
 {
   dest = new hsa_op_reg (m_type);
-- 
2.9.2

[hsa-branch] Remove useless build of kernel_dependencies_vector_type

2016-08-09 Thread Martin Jambor

Hi,

the type is built twice and the first one is not used.  This patch
removes it.

Thanks,

Martin

2016-08-04  Martin Jambor  

* hsa-brig.c (hsa_output_kernels): Remove unnecessary building of
kernel_dependencies_vector_type.
---
 gcc/hsa-brig.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/gcc/hsa-brig.c b/gcc/hsa-brig.c
index 34231d7..6128c4b 100644
--- a/gcc/hsa-brig.c
+++ b/gcc/hsa-brig.c
@@ -2232,11 +2232,6 @@ hsa_output_kernels (tree *host_func_table, tree *kernels)
   tree gridified_kernel_p_tree = build_int_cstu (boolean_type_node,
 gridified_kernel_p);
   unsigned count = 0;
-
-  kernel_dependencies_vector_type
-   = build_array_type (build_pointer_type (char_type_node),
-   build_index_type (size_int (0)));
-
   vec *kernel_dependencies_vec = NULL;
   if (hsa_decl_kernel_dependencies)
{
@@ -2286,6 +2281,7 @@ hsa_output_kernels (tree *host_func_table, tree *kernels)
   if (count > 0)
{
  ASM_GENERATE_INTERNAL_LABEL (tmp_name, "__hsa_dependencies_list", i);
+ gcc_checking_assert (kernel_dependencies_vector_type);
  tree dependencies_list = build_decl (UNKNOWN_LOCATION, VAR_DECL,
   get_identifier (tmp_name),
   kernel_dependencies_vector_type);
-- 
2.9.2

[hsa-branch] Remove typedef.*_p from hsa.h

2016-08-09 Thread Martin Jambor

Hi,

the rest of hsa code uses direct pointers, which I find preferable (and
since we have moved to explicit pointers also for example with the
gimple type, I believe I am not alone, so I removed the two _p typedefs
we had in hsa.h.  They were not in widespread use anyway.

Thanks,

Martin

2016-08-03  Martin Jambor  

* hsa.h (hsa_insn_basic_p): Remove typedef.
(hsa_op_with_type): Change hsa_insn_basic_p into plain pointers.
(hsa_op_reg_p): Remove typedef.
(hsa_function_representation): Change hsa_op_reg_p into plain
pointers.
---
 gcc/hsa.h | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/gcc/hsa.h b/gcc/hsa.h
index 092fd3b..242103c 100644
--- a/gcc/hsa.h
+++ b/gcc/hsa.h
@@ -50,7 +50,6 @@ class hsa_insn_basic;
 class hsa_op_address;
 class hsa_op_reg;
 class hsa_bb;
-typedef hsa_insn_basic *hsa_insn_basic_p;
 
 /* Class representing an input argument, output argument (result) or a
variable, that will eventually end up being a symbol directive.  */
@@ -259,11 +258,9 @@ private:
   /* Set definition where the register is defined.  */
   void set_definition (hsa_insn_basic *insn);
   /* Uses of the value while still in SSA.  */
-  auto_vec  m_uses;
+  auto_vec  m_uses;
 };
 
-typedef class hsa_op_reg *hsa_op_reg_p;
-
 /* Report whether or not P is a register operand.  */
 
 template <>
@@ -1269,7 +1266,7 @@ public:
   unsigned m_temp_symbol_count;
 
   /* SSA names mapping.  */
-  vec  m_ssa_map;
+  vec  m_ssa_map;
 
   /* Flag whether a function needs update of dominators before RA.  */
   bool m_modified_cfg;
-- 
2.9.2

[hsa-branch] Use hsa_obstack to allocate temporary names

2016-08-09 Thread Martin Jambor

Hi,

HSA backend now uses ggc_strdup to allocate name strings for stuff that
doesn't have any and therefore we rely on the garbage collector not
running.  That works but is not nice and since the HSA branch now has an
obstack, not necessary.  This patch changes the allocation to be from
the obstack.  Since I did not want to export the obstack, I have moved
the function instead.

Committed to the branch and queued for committing to trunk later.

Thanks,

Martin

2016-08-04  Martin Jambor  

* hsa.c (hsa_get_declaration_name): Moved to...
* hsa-gen.c (hsa_get_declaration_name): ...here.  Allocate
temporary string on an obstack instead from ggc.
---
 gcc/hsa-gen.c | 30 ++
 gcc/hsa.c | 28 
 2 files changed, 30 insertions(+), 28 deletions(-)

diff --git a/gcc/hsa-gen.c b/gcc/hsa-gen.c
index 0e48377..c578294 100644
--- a/gcc/hsa-gen.c
+++ b/gcc/hsa-gen.c
@@ -779,6 +779,36 @@ hsa_needs_cvt (BrigType16_t dtype, BrigType16_t stype)
   return false;
 }
 
+/* Return declaration name if exists.  */
+
+const char *
+hsa_get_declaration_name (tree decl)
+{
+  if (!DECL_NAME (decl))
+{
+  char buf[64];
+  snprintf (buf, 64, "__hsa_anon_%i", DECL_UID (decl));
+  size_t len = strlen (buf);
+  char *copy = (char *) obstack_alloc (&hsa_obstack, len + 1);
+  memcpy (copy, buf, len + 1);
+  return copy;
+}
+
+  tree name_tree;
+  if (TREE_CODE (decl) == FUNCTION_DECL
+  || (TREE_CODE (decl) == VAR_DECL && is_global_var (decl)))
+name_tree = DECL_ASSEMBLER_NAME (decl);
+  else
+name_tree = DECL_NAME (decl);
+
+  const char *name = IDENTIFIER_POINTER (name_tree);
+  /* User-defined assembly names have prepended asterisk symbol.  */
+  if (name[0] == '*')
+name++;
+
+  return name;
+}
+
 /* Lookup or create the associated hsa_symbol structure with a given VAR_DECL
or lookup the hsa_structure corresponding to a PARM_DECL.  */
 
diff --git a/gcc/hsa.c b/gcc/hsa.c
index 01520e8..88058cf 100644
--- a/gcc/hsa.c
+++ b/gcc/hsa.c
@@ -786,34 +786,6 @@ hsa_brig_function_name (const char *p)
   return buf;
 }
 
-/* Return declaration name if exists.  */
-
-const char *
-hsa_get_declaration_name (tree decl)
-{
-  if (!DECL_NAME (decl))
-{
-  char buf[64];
-  snprintf (buf, 64, "__hsa_anonymous_%i", DECL_UID (decl));
-  const char *ggc_str = ggc_strdup (buf);
-  return ggc_str;
-}
-
-  tree name_tree;
-  if (TREE_CODE (decl) == FUNCTION_DECL
-  || (TREE_CODE (decl) == VAR_DECL && is_global_var (decl)))
-name_tree = DECL_ASSEMBLER_NAME (decl);
-  else
-name_tree = DECL_NAME (decl);
-
-  const char *name = IDENTIFIER_POINTER (name_tree);
-  /* User-defined assembly names have prepended asterisk symbol.  */
-  if (name[0] == '*')
-name++;
-
-  return name;
-}
-
 /* Add a flatten attribute and disable vectorization for gpu implementation
function decl GDECL.  */
 
-- 
2.9.2

[hsa-branch] Make section::add return pointer to data

2016-08-09 Thread Martin Jambor

Hi,

when emitting a BRIG function definition, we need to fixup the leading
directive after all of the instructions and everything has been added to
the section (well, we could pre-compute the value but it is not worth
it).  At the moment the function emitting code calculates the pointer in
an ugly way, when it is much better to just add that ability to the
interface, like this patch does.

I'll commit it to the HSA branch for now but will commit it to trunk
after a while too.

Martin

2016-08-04  Martin Jambor  

* hsa-brig.c (hsa_brig_section): Make allocate_new_chunk, chunks
and cur_chunk provate, add a default NULL parameter to add method.
(hsa_brig_section::add): Added a new parameter, store pointer to
output data there if it is non-NULL.
(emit_function_directives): Use this new parameter instead of
calculating the pointer itself, fix function comment.
(hsa_brig_emit_function): Add forgotten endian conversion.
---
 gcc/hsa-brig.c | 54 +++---
 1 file changed, 27 insertions(+), 27 deletions(-)

diff --git a/gcc/hsa-brig.c b/gcc/hsa-brig.c
index 1edd126..34231d7 100644
--- a/gcc/hsa-brig.c
+++ b/gcc/hsa-brig.c
@@ -160,19 +160,21 @@ public:
   /* The size of the header of the section without any padding.  */
   unsigned header_byte_delta;
 
-  /* Buffers of binary data, each containing BRIG_CHUNK_MAX_SIZE bytes.  */
-  vec  chunks;
-
-  /* More convenient access to the last chunk from the vector above.  */
-  struct hsa_brig_data_chunk *cur_chunk;
-
-  void allocate_new_chunk ();
   void init (const char *name);
   void release ();
   void output ();
-  unsigned add (const void *data, unsigned len);
+  unsigned add (const void *data, unsigned len, void **output = NULL);
   void round_size_up (int factor);
   void *get_ptr_by_offset (unsigned int offset);
+
+private:
+  void allocate_new_chunk ();
+
+  /* Buffers of binary data, each containing BRIG_CHUNK_MAX_SIZE bytes.  */
+  vec  chunks;
+
+  /* More convenient access to the last chunk from the vector above.  */
+  struct hsa_brig_data_chunk *cur_chunk;
 };
 
 static struct hsa_brig_section brig_data, brig_code, brig_operand;
@@ -270,10 +272,11 @@ hsa_brig_section::output ()
 }
 
 /* Add to the stream LEN bytes of opaque binary DATA.  Return the offset at
-   which it was stored.  */
+   which it was stored.  If OUTPUT is not NULL, store into it the pointer to
+   the place where DATA was actually stored.  */
 
 unsigned
-hsa_brig_section::add (const void *data, unsigned len)
+hsa_brig_section::add (const void *data, unsigned len, void **output)
 {
   unsigned offset = total_size;
 
@@ -281,7 +284,10 @@ hsa_brig_section::add (const void *data, unsigned len)
   if (cur_chunk->size > (BRIG_CHUNK_MAX_SIZE - len))
 allocate_new_chunk ();
 
-  memcpy (cur_chunk->data + cur_chunk->size, data, len);
+  char *dst = cur_chunk->data + cur_chunk->size;
+  memcpy (dst, data, len);
+  if (output)
+*output = dst;
   cur_chunk->size += len;
   total_size += len;
 
@@ -625,8 +631,12 @@ emit_directive_variable (struct hsa_symbol *symbol)
   return symbol->m_directive_offset;
 }
 
-/* Emit directives describing either a function declaration or
-   definition F.  */
+/* Emit directives describing either a function declaration or definition F and
+   return the produced BrigDirectiveExecutable structure.  The function does
+   not take into account any instructions when calculating nextModuleEntry
+   field of the produced BrigDirectiveExecutable structure so when emitting
+   actual definitions, this field needs to be updated after all of the function
+   is actually added to the code section.  */
 
 static BrigDirectiveExecutable *
 emit_function_directives (hsa_function_representation *f, bool is_declaration)
@@ -634,7 +644,7 @@ emit_function_directives (hsa_function_representation *f, 
bool is_declaration)
   struct BrigDirectiveExecutable fndir;
   unsigned name_offset, inarg_off, scoped_off, next_toplev_off;
   int count = 0;
-  BrigDirectiveExecutable *ptr_to_fndir;
+  void *ptr_to_fndir;
   hsa_symbol *sym;
 
   if (!f->m_declaration_p)
@@ -692,17 +702,7 @@ emit_function_directives (hsa_function_representation *f, 
bool is_declaration)
   *slot = int_fn;
 }
 
-  brig_code.add (&fndir, sizeof (fndir));
-  /* terrible hack: we need to set instCount after we emit all
- insns, but we need to emit directive in order, and we emit directives
- during insn emitting.  So we need to emit the FUNCTION directive
- early, then the insns, and then we need to set instCount, so remember
- a pointer to it, in some horrible way.  cur_chunk.data+size points
- directly to after fndir here.  */
-  ptr_to_fndir
-  = (BrigDirectiveExecutable *)(brig_code.cur_chunk->data
-   + brig_code.cur_chunk->size
-   - sizeof (fndir));
+  brig_code.add (&fndir, sizeof (fndir), &ptr_to_fndir);

[hsa-branch] Remove unnecessary operators

2016-08-09 Thread Martin Jambor

Hi,

since the HSA branch switched to a simple obstack from alloc-pools, it
is not necessary to have a separate new operator for each instruction
kind and it is enough if the common ancestor has one.  Therefore, this
patch removes them.  Similarly, I learned that it is enough to make the
delete operator private in the ancestor to make inaccessible in the
descendants so I removed those as well.

Unlike instructions, not all operand kinds use obstacks, so we have to
have the operands in those that do.  But at least there are fewer
operand kinds than instruction kinds.

Thanks,

Martin

2016-08-04  Martin Jambor  

* hsa.h (hsa_insn_phi): Removed new and delete operators.
(hsa_insn_br): Likewise.
(hsa_insn_cbr): Likewise.
(hsa_insn_sbr): Likewise.
(hsa_insn_cmp): Likewise.
(hsa_insn_mem): Likewise.
(hsa_insn_atomic): Likewise.
(hsa_insn_signal): Likewise.
(hsa_insn_seg): Likewise.
(hsa_insn_call): Likewise.
(hsa_insn_arg_block): Likewise.
(hsa_insn_comment): Likewise.
(hsa_insn_srctype): Likewise.
(hsa_insn_packed): Likewise.
(hsa_insn_cvt): Likewise.
(hsa_insn_alloca): Likewise.

* hsa-gen.c (hsa_insn_phi::operator new): Removed.
(hsa_insn_br::operator new): Likewise.
(hsa_insn_cbr::operator new): Likewise.
(hsa_insn_sbr::operator new): Likewise.
(hsa_insn_cmp::operator new): Likewise.
(hsa_insn_mem::operator new): Likewise.
(hsa_insn_atomic::operator new): Likewise.
(hsa_insn_signal::operator new): Likewise.
(hsa_insn_seg::operator new): Likewise.
(hsa_insn_call::operator new): Likewise.
(hsa_insn_arg_block::operator new): Likewise.
(hsa_insn_comment::operator new): Likewise.
(hsa_insn_srctype::operator new): Likewise.
(hsa_insn_packed::operator new): Likewise.
(hsa_insn_cvt::operator new): Likewise.
(hsa_insn_alloca::operator new): Likewise.
---
 gcc/hsa-gen.c | 128 --
 gcc/hsa.h |  70 
 2 files changed, 198 deletions(-)

diff --git a/gcc/hsa-gen.c b/gcc/hsa-gen.c
index c578294..a9f4a8f 100644
--- a/gcc/hsa-gen.c
+++ b/gcc/hsa-gen.c
@@ -1452,14 +1452,6 @@ hsa_insn_phi::hsa_insn_phi (unsigned nops, hsa_op_reg 
*dst)
   dst->set_definition (this);
 }
 
-/* New operator to allocate PHI instruction from obstack.  */
-
-void *
-hsa_insn_phi::operator new (size_t size)
-{
-  return obstack_alloc (&hsa_obstack, size);
-}
-
 /* Constructor of class representing instructions for control flow and
sychronization,   */
 
@@ -1472,14 +1464,6 @@ hsa_insn_br::hsa_insn_br (unsigned nops, int opc, 
BrigType16_t t,
 {
 }
 
-/* New operator to allocate branch instruction from obstack.  */
-
-void *
-hsa_insn_br::operator new (size_t size)
-{
-  return obstack_alloc (&hsa_obstack, size);
-}
-
 /* Constructor of class representing instruction for conditional jump, CTRL is
the control register determining whether the jump will be carried out, the
new instruction is automatically added to its uses list.  */
@@ -1489,14 +1473,6 @@ hsa_insn_cbr::hsa_insn_cbr (hsa_op_reg *ctrl)
 {
 }
 
-/* New operator to allocate branch instruction from obstack.  */
-
-void *
-hsa_insn_cbr::operator new (size_t size)
-{
-  return obstack_alloc (&hsa_obstack, size);
-}
-
 /* Constructor of class representing instruction for switch jump, CTRL is
the index register.  */
 
@@ -1507,14 +1483,6 @@ hsa_insn_sbr::hsa_insn_sbr (hsa_op_reg *index, unsigned 
jump_count)
 {
 }
 
-/* New operator to allocate switch branch instruction from obstack.  */
-
-void *
-hsa_insn_sbr::operator new (size_t size)
-{
-  return obstack_alloc (&hsa_obstack, size);
-}
-
 /* Replace all occurrences of OLD_BB with NEW_BB in the statements
jump table.  */
 
@@ -1541,14 +1509,6 @@ hsa_insn_cmp::hsa_insn_cmp (BrigCompareOperation8_t cmp, 
BrigType16_t t,
 {
 }
 
-/* New operator to allocate compare instruction from obstack.  */
-
-void *
-hsa_insn_cmp::operator new (size_t size)
-{
-  return obstack_alloc (&hsa_obstack, size);
-}
-
 /* Constructor of classes representing memory accesses.  OPC is the opcode 
(must
be BRIG_OPCODE_ST or BRIG_OPCODE_LD) and T is the type.  The instruction
operands are provided as ARG0 and ARG1.  */
@@ -1574,14 +1534,6 @@ hsa_insn_mem::hsa_insn_mem (unsigned nops, int opc, 
BrigType16_t t,
 {
 }
 
-/* New operator to allocate memory instruction from obstack.  */
-
-void *
-hsa_insn_mem::operator new (size_t size)
-{
-  return obstack_alloc (&hsa_obstack, size);
-}
-
 /* Constructor of class representing atomic instructions.  OPC is the principal
opcode, AOP is the specific atomic operation opcode.  T is the type of the
instruction.  The instruction operands are provided as ARG[0-3].  */
@@ -1602,14 +1554,6 @@ hsa_insn_atomic::hsa_insn_atomic (int nops, int opc,

[hsa-branch] Add support for CONST_DECLs

2016-08-09 Thread Martin Jambor

Hi,

the HSA/BRIG back-end does not have any support for CONST_DECSs so far.
This patch adds it, emitting readonly segment symbols with initializers
for them.  One complication is that addresses pertaining to HSA readonly
segment cannot be converted to flat addresses.  So when we need an
address of a CONST_DECL, we take an address of a normal private symbol
instead.

Will commit to the HSA branch now and queue for trunk.

Martin

2016-08-08  Martin Jambor  

* hsa-gen.c (get_symbol_for_decl): Accept CONST_DECLs, put them to
readonly segment.
(gen_hsa_addr): Also process CONST_DECLs.
(gen_hsa_addr_insns): Process CONST_DECLs by creating private
copies.

* hsa-brig.c (emit_immediate_operand): Declare.
(emit_directive_variable): Also emit initializers of CONST_DECLs.
---
 gcc/hsa-brig.c | 10 +-
 gcc/hsa-gen.c  | 53 +
 gcc/hsa.h  |  3 ++-
 3 files changed, 56 insertions(+), 10 deletions(-)

diff --git a/gcc/hsa-brig.c b/gcc/hsa-brig.c
index 6128c4b..237069e 100644
--- a/gcc/hsa-brig.c
+++ b/gcc/hsa-brig.c
@@ -570,6 +570,7 @@ enqueue_op (hsa_op_base *op)
   return ret;
 }
 
+static void emit_immediate_operand (hsa_op_immed *imm);
 
 /* Emit directive describing a symbol if it has not been emitted already.
Return the offset of the directive.  */
@@ -608,7 +609,14 @@ emit_directive_variable (struct hsa_symbol *symbol)
 }
 
   dirvar.name = lendian32 (name_offset);
-  dirvar.init = 0;
+
+  if (symbol->m_decl && TREE_CODE (symbol->m_decl) == CONST_DECL)
+{
+  hsa_op_immed *tmp = new hsa_op_immed (DECL_INITIAL (symbol->m_decl));
+  dirvar.init = lendian32 (enqueue_op (tmp));
+}
+  else
+dirvar.init = 0;
   dirvar.type = lendian16 (symbol->m_type);
   dirvar.segment = symbol->m_segment;
   dirvar.align = symbol->m_align;
diff --git a/gcc/hsa-gen.c b/gcc/hsa-gen.c
index a9f4a8f..baa20b9 100644
--- a/gcc/hsa-gen.c
+++ b/gcc/hsa-gen.c
@@ -820,7 +820,8 @@ get_symbol_for_decl (tree decl)
 
   gcc_assert (TREE_CODE (decl) == PARM_DECL
  || TREE_CODE (decl) == RESULT_DECL
- || TREE_CODE (decl) == VAR_DECL);
+ || TREE_CODE (decl) == VAR_DECL
+ || TREE_CODE (decl) == CONST_DECL);
 
   dummy.m_decl = decl;
 
@@ -862,11 +863,14 @@ get_symbol_for_decl (tree decl)
   else
 {
   hsa_symbol *sym;
-  gcc_assert (TREE_CODE (decl) == VAR_DECL);
+  /* PARM_DECLs and RESULT_DECL should be already in m_local_symbols.  */
+  gcc_assert (TREE_CODE (decl) == VAR_DECL
+ || TREE_CODE (decl) == CONST_DECL);
   BrigAlignment8_t align = hsa_object_alignment (decl);
 
   if (is_in_global_vars)
{
+ gcc_checking_assert (TREE_CODE (decl) != CONST_DECL);
  sym = new hsa_symbol (BRIG_TYPE_NONE, BRIG_SEGMENT_GLOBAL,
BRIG_LINKAGE_PROGRAM, true,
BRIG_ALLOCATION_PROGRAM, align);
@@ -888,11 +892,15 @@ get_symbol_for_decl (tree decl)
  if (AGGREGATE_TYPE_P (TREE_TYPE (decl)))
align = MAX ((BrigAlignment8_t) BRIG_ALIGNMENT_8, align);
 
- /* PARM_DECL and RESULT_DECL should be already in m_local_symbols.  */
- gcc_assert (TREE_CODE (decl) == VAR_DECL);
-
+ BrigAllocation allocation = BRIG_ALLOCATION_AUTOMATIC;
  BrigSegment8_t segment;
- if (lookup_attribute ("hsa_group_segment", DECL_ATTRIBUTES (decl)))
+ if (TREE_CODE (decl) == CONST_DECL)
+   {
+ segment = BRIG_SEGMENT_READONLY;
+ allocation = BRIG_ALLOCATION_AGENT;
+   }
+ else if (lookup_attribute ("hsa_group_segment",
+DECL_ATTRIBUTES (decl)))
segment = BRIG_SEGMENT_GROUP;
  else if (TREE_STATIC (decl)
   || lookup_attribute ("hsa_global_segment",
@@ -901,8 +909,8 @@ get_symbol_for_decl (tree decl)
  else
segment = BRIG_SEGMENT_PRIVATE;
 
- sym = new hsa_symbol (BRIG_TYPE_NONE, segment, BRIG_LINKAGE_FUNCTION);
- sym->m_align = align;
+ sym = new hsa_symbol (BRIG_TYPE_NONE, segment, BRIG_LINKAGE_FUNCTION,
+   false, allocation, align);
  sym->fillup_for_decl (decl);
  hsa_cfun->m_private_variables.safe_push (sym);
}
@@ -1944,6 +1952,7 @@ gen_hsa_addr (tree ref, hsa_bb *hbb, HOST_WIDE_INT 
*output_bitsize = NULL,
 case PARM_DECL:
 case VAR_DECL:
 case RESULT_DECL:
+case CONST_DECL:
   gcc_assert (!symbol);
   symbol = get_symbol_for_decl (ref);
   addrtype = hsa_get_segment_addr_type (symbol->m_segment);
@@ -2161,6 +2170,34 @@ gen_hsa_addr_insns (tree val, hsa_op_reg *dest, hsa_bb 
*hbb)
 val = TREE_OPERAND (val, 0);
   addr = gen_hsa_addr (val, hbb);
 
+  if (TREE_CODE (val) == CONST_DECL
+  && is_gimple_reg_type (TREE_TYPE (val)))
+{
+  gcc_assert (addr->m_sym

[PATCH] gcov: add new option (--hash-names) (PR gcov-profile/36412).

2016-08-09 Thread Martin Liška

Hello.

Following enhancement for gcov solves issues when we cannot create a file due 
to a filesystem
path length limit. Selected approach utilizes existing md5sum functions.

Patch survives make check -k RUNTESTFLAGS="gcov.exp" on x86_64-linux-gnu.

Ready for trunk?
Thanks,
Martin

>From 42393a73c0433d7de28dc560f47ff61e615718bf Mon Sep 17 00:00:00 2001
From: marxin 
Date: Tue, 9 Aug 2016 16:27:10 +0200
Subject: [PATCH] gcov: add new option (--hash-names) (PR gcov-profile/36412).

gcc/ChangeLog:

2016-08-09  Martin Liska  

	PR gcov-profile/36412
	* doc/gcov.texi: Document --hash-names (-e).
	* gcov.c (print_usage): Add the option.
	(process_args): Process the option.
	(md5sum_to_hex): New function.
	(make_gcov_file_name): Do the md5sum and append it to a
	filename.
---
 gcc/doc/gcov.texi |  6 ++
 gcc/gcov.c| 47 +--
 2 files changed, 51 insertions(+), 2 deletions(-)

diff --git a/gcc/doc/gcov.texi b/gcc/doc/gcov.texi
index 89d8049..78c0c75 100644
--- a/gcc/doc/gcov.texi
+++ b/gcc/doc/gcov.texi
@@ -123,6 +123,7 @@ gcov [@option{-v}|@option{--version}] [@option{-h}|@option{--help}]
  [@option{-b}|@option{--branch-probabilities}]
  [@option{-c}|@option{--branch-counts}]
  [@option{-d}|@option{--display-progress}]
+ [@option{-e}|@option{--hash-names}]
  [@option{-f}|@option{--function-summaries}]
  [@option{-i}|@option{--intermediate-format}]
  [@option{-l}|@option{--long-file-names}]
@@ -171,6 +172,11 @@ be shown, unless the @option{-u} option is given.
 Write branch frequencies as the number of branches taken, rather than
 the percentage of branches taken.
 
+@item -e
+@itemx --hash-names
+For situations when a long name can potentially hit filesystem path limit,
+let's calculate md5sum of the patch and create file x.gcov##md5sum.gcov.
+
 @item -n
 @itemx --no-output
 Do not create the @command{gcov} output file.
diff --git a/gcc/gcov.c b/gcc/gcov.c
index f05ef7c..97739b7 100644
--- a/gcc/gcov.c
+++ b/gcc/gcov.c
@@ -43,6 +43,7 @@ along with Gcov; see the file COPYING3.  If not see
 
 #include 
 #include 
+#include "md5.h"
 
 using namespace std;
 
@@ -359,6 +360,11 @@ static int flag_demangled_names = 0;
 
 static int flag_long_names = 0;
 
+/* For situations when a long name can potentially hit filesystem path limit,
+   let's calculate md5sum of the patch and create file x.gcov##md5sum.gcov.  */
+
+static int flag_hash_names = 0;
+
 /* Output count information for every basic block, not merely those
that contain line number information.  */
 
@@ -655,6 +661,8 @@ print_usage (int error_p)
   fnotice (file, "  -c, --branch-counts Output counts of branches taken\n\
 rather than percentages\n");
   fnotice (file, "  -d, --display-progress  Display progress information\n");
+  fnotice (file, "  -e, --hash-namesUse hash of file path in "
+	   "file names\n");
   fnotice (file, "  -f, --function-summariesOutput summaries for each function\n");
   fnotice (file, "  -i, --intermediate-format   Output .gcov file in intermediate text format\n");
   fnotice (file, "  -l, --long-file-names   Use long output file names for included\n\
@@ -694,6 +702,7 @@ static const struct option options[] =
   { "all-blocks",   no_argument,   NULL, 'a' },
   { "branch-probabilities", no_argument,   NULL, 'b' },
   { "branch-counts",no_argument,   NULL, 'c' },
+  { "long-file-names",  no_argument,   NULL, 'e' },
   { "intermediate-format",  no_argument,   NULL, 'i' },
   { "no-output",no_argument,   NULL, 'n' },
   { "long-file-names",  no_argument,   NULL, 'l' },
@@ -716,8 +725,8 @@ process_args (int argc, char **argv)
 {
   int opt;
 
-  while ((opt = getopt_long (argc, argv, "abcdfhilmno:s:pruv", options, NULL)) !=
- -1)
+  const char *opts = "abcdefhilmno:s:pruv";
+  while ((opt = getopt_long (argc, argv, opts, options, NULL)) != -1)
 {
   switch (opt)
 	{
@@ -730,6 +739,9 @@ process_args (int argc, char **argv)
 	case 'c':
 	  flag_counts = 1;
 	  break;
+	case 'e':
+	  flag_hash_names = 1;
+	  break;
 	case 'f':
 	  flag_function_summary = 1;
 	  break;
@@ -2147,6 +2159,15 @@ canonicalize_name (const char *name)
   return result;
 }
 
+/* Print hex representation of 16 bytes from SUM and write it to BUFFER.  */
+
+static void
+md5sum_to_hex (const char *sum, char *buffer)
+{
+  for (unsigned i = 0; i < 16; i++)
+sprintf (buffer + (2 * i), "%02x", sum[i]);
+}
+
 /* Generate an output file name. INPUT_NAME is the canonicalized main
input file and SRC_NAME is the canonicalized file name.
LONG_OUTPUT_NAMES and PRESERVE_PATHS affect name generation.  With
@@ -2184,6 +2205,28 @@ make_gcov_file_name (const char *input_name, const char *src_name)
   ptr = mangle_name (src_name, ptr);
   strcpy (ptr, ".gcov");
 
+  if (flag_hash_names)
+{
+  md5_ctx ctx;
+  c

[MIPS, committed] Skip gcc.dg/loop-8.c

2016-08-09 Thread Matthew Fortune

MIPS creates more invariants than gcc.dg/loop-8.c expects.  There is
no way to represent what the original test was trying to verify given
the additional invariants being moved out of the loop.

Thanks,
Matthew

gcc/
PR rtl-optimization/9
* gcc.dg/loop-8.c: Skip for MIPS due to extra invariants.

diff --git a/gcc/testsuite/gcc.dg/loop-8.c b/gcc/testsuite/gcc.dg/loop-8.c
index 463c5d0..8a4b209 100644
--- a/gcc/testsuite/gcc.dg/loop-8.c
+++ b/gcc/testsuite/gcc.dg/loop-8.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O1 -fdump-rtl-loop2_invariant" } */
-/* { dg-skip-if "unexpected IV" { "hppa*-*-* visium-*-*" } { "*" } { "" } } */
+/* { dg-skip-if "unexpected IV" { "hppa*-*-* mips*-*-* visium-*-*" } { "*" } { 
"" } } */
 
 void
 f (int *a, int *b)
-- 
2.2.1

Re: [PATCH] Fix early debug regression with DW_AT_string_length (PR debug/71906)

2016-08-09 Thread Jakub Jelinek

On Mon, Aug 01, 2016 at 12:44:30PM +0200, Richard Biener wrote:
> Note that reading the dwarf standard, it looks like it accepts a location
> which means we could do an implicit location description using
> DW_OP_stack_value which gives us access to arbitrary dwarf
> expressions (and thus the possibility to handle it similar to VLAs).

Sure.  But for the DW_AT_string_length, the issue is that with early debug,
we want the attribute early, but locations within a function obviously
can be provided late, after the function is compiled.

Which is why the patch uses DW_OP_call4 as the holder of the information
that the string length is provided by the value of some variable.
In certain cases it can stay that way even during late debug, if the var
disappears or has no location info etc., then DW_AT_string_length is
dropped, or in other cases the location of the var is copied into the
attribute etc.

This is different from the VLA bounds in DW_AT_upper_bound etc., where
it is allowed to just refer to a DIE that holds the value (so we can stick
it into the attribute early and just supply the var's location later), for
DW_AT_string_length it is not allowed.

> But maybe I am missing something?  (now running into the issue
> with LTO debug and gfortran.dg/save_5.f90 where during early debug
> we emit a location that ends up refering to a symbol that might be
> optimized away later - early debug cannot sanitize referenced symbols
> via resolv_addr obviously).  Annotating the DIE late is also not
> what I want to do as I'd need to pull in all type DIEs into the late CU
> that way (well, at least selected ones, but I'm really trying to avoid
> going down that route).

In some cases like location of file scope vars, or this DW_AT_string_length,
you really need to adjust late the debug info created early, there is no
workaround for that.

Jakub

[v3 PATCH] Implement LWG 2744 and LWG 2754.

2016-08-09 Thread Ville Voutilainen

Tested on Linux-x64.

2016-08-09  Ville Voutilainen  

Implement LWG 2744 and LWG 2754.
* include/std/any (any(ValueType&&)): Constrain with __is_in_place.
(any(in_place_type_t<_ValueType>, _Args&&...)): Use _Decay.
(any(in_place_type_t<_ValueType>, initializer_list<_Up>, _Args&&...)):
Likewise.
(emplace(_Args&&...)): Likewise.
(emplace(initializer_list<_Up>, _Args&&...)): Likewise.
* include/std/utility: (__is_in_place_impl, __is_in_place): New.
* testsuite/20_util/any/assign/emplace.cc: Add tests for decaying
emplace.
* testsuite/20_util/any/cons/in_place.cc: Add tests for decaying
in_place constructor.
* testsuite/20_util/any/misc/any_cast_neg.cc: Adjust.
diff --git a/libstdc++-v3/include/std/any b/libstdc++-v3/include/std/any
index 4add118..27325c5 100644
--- a/libstdc++-v3/include/std/any
+++ b/libstdc++-v3/include/std/any
@@ -153,7 +153,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 /// Construct with a copy of @p __value as the contained object.
 template ,
  typename _Mgr = _Manager<_Tp>,
-  __any_constructible_t<_Tp, _ValueType&&> = true>
+  __any_constructible_t<_Tp, _ValueType&&> = true,
+ enable_if_t::value, bool> = true>
   any(_ValueType&& __value)
   : _M_manager(&_Mgr::_S_manage)
   {
@@ -164,9 +165,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 template ,
  typename _Mgr = _Manager<_Tp>,
   enable_if_t<__and_,
-__not_<
-  is_constructible<_Tp,
-   _ValueType&&>>>::value,
+__not_>,
+__not_<__is_in_place<_Tp>>>::value,
  bool> = false>
   any(_ValueType&& __value)
   : _M_manager(&_Mgr::_S_manage)
@@ -175,10 +175,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   }
 
 /// Construct with an object created from @p __args as the contained 
object.
-template ,
  typename _Mgr = _Manager<_Tp>,
   __any_constructible_t<_Tp, _Args&&...> = false>
-  any(in_place_type_t<_Tp>, _Args&&... __args)
+  any(in_place_type_t<_ValueType>, _Args&&... __args)
   : _M_manager(&_Mgr::_S_manage)
   {
 _Mgr::_S_create(_M_storage, std::forward<_Args>(__args)...);
@@ -186,11 +187,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 /// Construct with an object created from @p __il and @p __args as
 /// the contained object.
-template ,
  typename _Mgr = _Manager<_Tp>,
   __any_constructible_t<_Tp, initializer_list<_Up>,
_Args&&...> = false>
-  any(in_place_type_t<_Tp>, initializer_list<_Up> __il, _Args&&... __args)
+  any(in_place_type_t<_ValueType>,
+ initializer_list<_Up> __il, _Args&&... __args)
   : _M_manager(&_Mgr::_S_manage)
   {
 _Mgr::_S_create(_M_storage, __il, std::forward<_Args>(__args)...);
@@ -248,7 +251,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   }
 
 /// Emplace with an object created from @p __args as the contained object.
-template ,
  typename _Mgr = _Manager<_Tp>,
   __any_constructible_t<_Tp, _Args&&...> = false>
   void emplace(_Args&&... __args)
@@ -260,7 +264,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 /// Emplace with an object created from @p __il and @p __args as
 /// the contained object.
-template ,
  typename _Mgr = _Manager<_Tp>,
   __any_constructible_t<_Tp, initializer_list<_Up>,
_Args&&...> = false>
diff --git a/libstdc++-v3/include/std/utility b/libstdc++-v3/include/std/utility
index 0c03644..e1a523f 100644
--- a/libstdc++-v3/include/std/utility
+++ b/libstdc++-v3/include/std/utility
@@ -356,6 +356,19 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template 
 in_place_tag in_place(__in_place_index<_Idx>*) {terminate();}
 
+  template
+struct __is_in_place_impl : false_type
+{ };
+
+  template
+  struct __is_in_place_impl> : true_type
+{ };
+
+  template
+struct __is_in_place
+: public __is_in_place_impl>>
+{ };
+
 #define  __cpp_lib_as_const 201510
   template
 constexpr add_const_t<_Tp>& as_const(_Tp& __t) noexcept { return __t; }
@@ -363,7 +376,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template
 void as_const(const _Tp&&) = delete;
 
-#endif
+#endif // C++17
 
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace
diff --git a/libstdc++-v3/testsuite/20_util/any/assign/emplace.cc 
b/libstdc++-v3/testsuite/20_util/any/assign/emplace.cc
index 663bae2..df37f78 100644
--- a/libstdc++-v3/testsuite/20_util/any/assign/emplace.cc
+++ b/libstdc++-v3/testsuite/20_util/any/assign/emplace.cc
@@ -59,4 +59,18 @@ int main()
   combined& c2 = std::any_cast(o5);
   VERIFY(c2.v[0] == 1 && c2.v[1] == 2
 && std::get<0>(c2.t) == 3 && std::get<1>(c2.t) == 4 );
+  std::any o6;
+  o6.emplace(i);
+  V

Re: [PATCH] Fix PR72772

2016-08-09 Thread Bin.Cheng

On Fri, Aug 5, 2016 at 11:48 AM, Richard Biener  wrote:
>
> This fixes PR72772 by avoing placing a degenerate PHI in each
> forwarder block loop init creates when creating simple preheaders.
> The solution is to simply split the single loop entry edge which
> is also way cheaper than using the forwarder block creation path.
>
> You've seen a load of fallout fixes already, so this is the final
> patch adjusting two testcases (for gcc.dg/tree-ssa/pr59597.c we
> no longer register the unwanted threadings as the forwarders
> no longer contain PHIs).
>
> This patch will cause
>
> +FAIL: gcc.dg/graphite/scop-dsyr2k.c scan-tree-dump-times graphite "number
> of SCoPs: 1" 1
> +FAIL: gcc.dg/graphite/scop-dsyrk.c scan-tree-dump-times graphite "number
> of SCoPs: 1" 1
>
> on x86_64 with -m32 as niter analysis is confused by us now generating
> an optimized loop nest via threading that has all redundant checks
> removed.  We no longer can prove that the loops do not eventually
> iterate zero times.  I will open a PR for this (it is a latent issue).
> The tests would pass with VRP disabled but I choose to leave them
> FAILing for now.
>
> Bootstrapped / tested many times on x86_64-unknown-linux-gnu, re-doing
> this after the latest fallout fix now.
>
> Richard.
>
> 2016-08-05  Richard Biener  
>
> PR tree-optimization/72772
> * cfgloopmanip.c (create_preheader): Use split_edge if there
> is a single loop entry, avoiding degenerate PHIs.
>
> * gcc.dg/tree-ssa/ldist-24.c: New testcase.
FYI, I committed the same test as gcc.dg/tree-ssa/pr72772.c.  If it's
appropriate, this ldist test can be saved here.

Thanks,
bin

[PATCH] Fix a few fall through cases

2016-08-09 Thread Marek Polacek

Testing with -Wimplicit-fallthrough turned up a few spots where
fall through is either a bug, or just too confusing.  This patch
should be desirable even though the warning is still not in.

Eric, can you look at the c-ada-spec.c part?
Martin, does the hsa-gen.c part look correct?

What about the rest?

Bootstrapped/regtested on x86_64-linux and ppc64le-redhat-linux.

2016-08-09  Marek Polacek  

PR c/7652
gcc/c-family/
* c-ada-spec.c (dump_generic_ada_node): Add return.
gcc/
* cselib.c (cselib_expand_value_rtx_1): Add return.
* gengtype.c (dbgprint_count_type_at): Likewise.
* hsa-gen.c (gen_hsa_insn_for_internal_fn_call): Likewise.
* reg-stack.c (get_true_reg): Restructure to avoid fallthrough warning.

diff --git gcc/gcc/c-family/c-ada-spec.c gcc/gcc/c-family/c-ada-spec.c
index e33fdff..17b8610 100644
--- gcc/gcc/c-family/c-ada-spec.c
+++ gcc/gcc/c-family/c-ada-spec.c
@@ -1862,6 +1862,7 @@ dump_generic_ada_node (pretty_printer *buffer, tree node, 
tree type, int spc,
 case TREE_BINFO:
   dump_generic_ada_node
(buffer, BINFO_TYPE (node), type, spc, limited_access, name_only);
+  return 0;
 
 case TREE_VEC:
   pp_string (buffer, "--- unexpected node: TREE_VEC");
diff --git gcc/gcc/cselib.c gcc/gcc/cselib.c
index 0c5183c..1277842 100644
--- gcc/gcc/cselib.c
+++ gcc/gcc/cselib.c
@@ -1618,6 +1618,7 @@ cselib_expand_value_rtx_1 (rtx orig, struct 
expand_value_data *evd,
  else
return orig;
}
+   return orig;
   }
 
 CASE_CONST_ANY:
diff --git gcc/gcc/gengtype.c gcc/gcc/gengtype.c
index 5479b8f..0518355 100644
--- gcc/gcc/gengtype.c
+++ gcc/gcc/gengtype.c
@@ -175,6 +175,7 @@ dbgprint_count_type_at (const char *fil, int lin, const 
char *msg, type_p t)
{
case TYPE_UNDEFINED:
  nb_undefined++;
+ break;
case TYPE_SCALAR:
  nb_scalar++;
  break;
diff --git gcc/gcc/hsa-gen.c gcc/gcc/hsa-gen.c
index 6cf1538..fa4ef12 100644
--- gcc/gcc/hsa-gen.c
+++ gcc/gcc/hsa-gen.c
@@ -5039,6 +5039,7 @@ gen_hsa_insn_for_internal_fn_call (gcall *stmt, hsa_bb 
*hbb)
 case IFN_FMIN:
 case IFN_FMAX:
   gen_hsa_insns_for_call_of_internal_fn (stmt, hbb);
+  break;
 
 default:
   HSA_SORRY_ATV (gimple_location (stmt),
diff --git gcc/gcc/reg-stack.c gcc/gcc/reg-stack.c
index c931349..907f28d 100644
--- gcc/gcc/reg-stack.c
+++ gcc/gcc/reg-stack.c
@@ -423,23 +423,25 @@ get_true_reg (rtx *pat)
  GET_MODE (subreg));
  return pat;
}
+ pat = &XEXP (*pat, 0);
+ break;
}
   case FLOAT:
   case FIX:
   case FLOAT_EXTEND:
-   pat = & XEXP (*pat, 0);
+   pat = &XEXP (*pat, 0);
break;
 
   case UNSPEC:
if (XINT (*pat, 1) == UNSPEC_TRUNC_NOOP
|| XINT (*pat, 1) == UNSPEC_FILD_ATOMIC)
- pat = & XVECEXP (*pat, 0, 0);
+ pat = &XVECEXP (*pat, 0, 0);
return pat;
 
   case FLOAT_TRUNCATE:
if (!flag_unsafe_math_optimizations)
  return pat;
-   pat = & XEXP (*pat, 0);
+   pat = &XEXP (*pat, 0);
break;
 
   default:

Marek

Re: Implement -Wimplicit-fallthrough (take 2): questionable code

2016-08-09 Thread Marek Polacek

On Wed, Aug 03, 2016 at 10:56:08AM -0600, Jeff Law wrote:
> On 07/27/2016 10:53 AM, Marek Polacek wrote:
> > These are the cases where I wasn't sure if the falls through were 
> > intentional
> > or not.
> > 
> > This patch has been tested on powerpc64le-unknown-linux-gnu, 
> > aarch64-linux-gnu,
> > and x86_64-redhat-linux.
> > 
> > 2016-07-27  Marek Polacek  
> > 
> > PR c/7652
> > gcc/
> > * config/i386/i386.c (ix86_expand_branch): Add gcc_fallthrough..
> I'm pretty sure this is an intended fallthru.

Ok.
 
> > * cselib.c (cselib_expand_value_rtx_1): Likewise.
> I'm pretty sure this is an intended fallthru.  But rather than fallthru, can
> we just "return orig;"?  If more code gets added here it'll be less and less
> clear if we continue to fall thru.

Done in the patch I just posted.
 
> > * gensupport.c (get_alternatives_number): Likewise.
> > (subst_dup): Likewise.
> These are definitely an intended fallthrough.  E & V are essentially the
> same thing, except that for V the vector may be NULL.

Ok.
 
> > * gimplify.c (goa_stabilize_expr): Likewise.
> Definitely intended fallthrough.  We're "cleverly" handling the 2nd operand
> of binary expressions, then falling into the unary expression cases which
> handle the 1st operand.

Ok.
 
> > * hsa-gen.c (gen_hsa_insn_for_internal_fn_call): Likewise.
> I think this is a bug, but please contact Martin Jambor directly on this to
> confirm.

Done.

> > * reg-stack.c (get_true_reg): Likewise.
> Pretty sure this is intentional.  Like the cselib.c case, I'd prefer to just
> duplicate the code from teh fallthru path since it's trivial and makes the
> intent clearer.

Done.

> > * tree-complex.c (expand_complex_division): Likewise.
> No sure on this one.
 
Ok, I kept it as it was.
 
> > * tree-data-ref.c (get_references_in_stmt): Likewise.
> Intentional fallthru.  See how we change ref.is_read for IFN_MASK_LOAD, then
> test it in the fallthru path.

Ok. 

> > * tree-pretty-print.c (dump_generic_node): Likewise.
> ?!?  This looks like a false positive from your warning.  We're not falling
> into any case statements here AFAICT.

I've fixed the warning to not warn here anymore.

> > * var-tracking.c (adjust_mems): Likewise.
> Definitely an intended fallthru.   Though I don't like the way this code is
> structured at all.

Yea, it's ugly.

> > gcc/cp/
> > * call.c (add_builtin_candidate): Add gcc_fallthrough.
> > * cxx-pretty-print.c (pp_cxx_unqualified_id): Likewise.
> > * parser.c (cp_parser_skip_to_end_of_statement): Likewise.
> > (cp_parser_cache_defarg): Likewise.
> No idea on these.
> 
> 
> > gcc/c-family/
> > * c-ada-spec.c (dump_generic_ada_node): Add gcc_fallthrough.
> I don't think this is supposed to fall thru.  If it falls through then it's
> going to issue an error about an unexpected TREE_VEC node when we were
> actually working on a TREE_BINFO node.   I think a return 0 is appropriate
> here.
> 
> 
> > libcpp/
> > * pch.c (write_macdef): Add CPP_FALLTHRU.
> > * lex.c (_cpp_lex_direct): Likewise.
> No idea on these.
> 
> Sooo.  For those where I've indicated fallthru is intentional, drop the
> comment and/or rewrite to avoid fallthru as indicated.
 
Done.

> You've got one that looks like a error in your warning
> (tree-pretty-print.c).
 
Fixed.

> Sync with Martin J. on the hsa warning.  I think it's a bug in the hsa code,
> but confirm with Martin.
 
Done.

> I think the c-ada-spec should be twiddled to avoid the fallthru with a
> simple "return 0;"
 
Done.

> For the rest, use your patch as-is.  That preserves behavior and indicates
> we're not sure if fallthru is appropriate or not.  For cp/ feel free to ping
> Jason directly on those for his thoughts.

Thanks a lot for reviewing this.  I'll post an updated version later on.

Marek

Re: Implement C _FloatN, _FloatNx types [version 3]

2016-08-09 Thread James Greenhalgh

On Thu, Jul 28, 2016 at 10:43:25PM +, Joseph Myers wrote:
> On Tue, 19 Jul 2016, James Greenhalgh wrote:
> 
> > These slightly complicate the description you give above as we now want
> > two behaviours. Where the 16-bit floating point extensions are available,
> > we want to use the native operations (FLT_EVAL_METHOD == 16). Where they
> > are not available we want to use promotion to float (FLT_EVAL_METHOD == 0).
> 
> That's not a complication since TARGET_FLT_EVAL_METHOD already can and 
> does depend on target options.
> 
> The complications are that the present excess precision support includes 
> code to default the options (in c-opts.c:c_common_post_options and 
> toplev.c:init_excess_precision), and then to use them (in 
> tree.c:excess_precision_type and c-cppbuiltin.c:c_cpp_builtins), that (a) 
> only handles the C99/C11 values, (b) ignores the possibility of types 
> narrower than float and (c) assumes, when determining the defaults, that 
> excess precision is x86-style, i.e. the back end pretending to have 
> operations it does not have and C99-style excess precision being 
> inefficient so only desired in standards conformance modes.
> 
> (a) and (b) are straightforward to fix.  But to fix (c), first you need to 
> work out the design for how you actually want the compiler to behave for 
> ARM / AArch64, for both FLT_EVAL_METHOD values, outside standards 
> conformance modes.  Then you need to work out what that means for the 
> initialization of the relevant variables, inside and outside standards 
> conformance modes (remembering that for most targets, all the excess 
> precision logic *can* be elided because there aren't any types narrower 
> than float and so FLT_EVAL_METHOD == 0 really does mean no excess 
> precision).  And you need to make all the functions I listed respect the 
> new values and the new semantics for the value 0, and add appropriate 
> testcases.
> 
> > I'm hoping that enabling _Float16 for ARM/AArch64 should not be too
> > difficult with the groundwork in your patches, but I would appreciate any
> > pointers on where I am likely to run in to trouble; I haven't worked in the
> > front-end before.
> 
> The actual C front-end code, outside of the functions I listed, should not 
> require any changes; it just uses excess_precision_type.
> 
> You do need to make appropriate arrangements for _Complex _Float16 
> arithmetic.  See PR 63250.  For things to work when you don't promote 
> (FLT_EVAL_METHOD == 16) you'll need to arrange for the relevant libgcc 
> functions to be built for HCmode (cf. the powerpc float128 issue of 
> needing to build them for KCmode to avoid the present duplicate copies of 
> those functions).

Hi Joseph,

Thanks for this reply, with this I've been able to prototype the _Float16
support for AArch64 and get many of the _Float16 tests you add in this patch,
and the patch for builtin functions, passing.

However, I still have some trouble with TFmode and TImode conversions. In
particular, I'm missing support for the standard names:

  extendhftf2
  trunctfhf2
  floattihf2
  fixhfti

In addition, where the ARMv8.2-A extensions are not available, I have no
support for the standard names: 

  floatdihf
  floatundihf

Providing these names through soft-fp in libgcc should be possible. Certainly
I can enable extendhftf2 and trunctfhf2 using
softfp_extensions/softfp_truncations and providing the appropriate files.
Though, I'd rather avoid adding the full suite of soft-float routines, in
particular I don't need arithmetic like divhf3. There are probably
modifications I can make to the soft-fp build machinery to enable building
only specified conversion functions.

Is this a sensible approach, or would I be better simply the full suite
of HFmode operations that soft-fp expects?

Thanks,
James

Re: [openacc] add a warning for non-contiguous data clauses

2016-08-09 Thread Jakub Jelinek

On Tue, Aug 02, 2016 at 09:16:00PM -0700, Cesar Philippidis wrote:
> 2016-08-02  Cesar Philippidis  
> 
>   gcc/fortran/
>   * openmp.c (resolve_oacc_data_clauses): Emit a warning about a
>   potentially non-contiguous array pointer.
>   (resolve_omp_clauses): Extend coverage of OpenACC data clause
>   validation.
>   * trans-openmp.c (gfc_omp_privatize_by_reference): Use the
>   TREE_VISITED bit on decl to determine if decl is used inside an
>   OpenACC region.  If so, emit a warning whenever decl is an array
>   pointer.
> 
>   gcc/
>   * gimplify.c (oacc_default_clause): Use TREE_VISITED to denote that
>   decl is used inside an OpenACC region.
> 
>   gcc/testsuite/
>   * gfortran.dg/goacc/contiguous.f90: New test.
>   * gfortran.dg/goacc/kernels-alias-4.f95:
> 
> diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
> index e463df7..f3ea9d8 100644
> --- a/gcc/fortran/openmp.c
> +++ b/gcc/fortran/openmp.c
> @@ -3268,6 +3268,10 @@ resolve_oacc_data_clauses (gfc_symbol *sym, locus loc, 
> const char *name)
> && CLASS_DATA (sym)->attr.allocatable))
>  gfc_error ("ALLOCATABLE object %qs of polymorphic type "
>  "in %s clause at %L", sym->name, name, &loc);
> +  if (sym->as && sym->as->type == AS_DEFERRED && sym->attr.pointer
> +  && !sym->attr.contiguous)
> +gfc_warning (0, "Potentially noncontiguous deferred shape array %qs in 
> %s "
> +  "clause at %L", sym->name, name, &loc);
>check_symbol_not_pointer (sym, loc, name);
>check_array_not_assumed (sym, loc, name);
>  }

That doesn't make sense.
>From what I can see, the OpenACC 2.0 standard refers to Fortran 2003, which
doesn't have CONTIGUOUS attribute.  That attribute has been only introduced
in Fortran 2008, IMNSHO you can't force OpenACC users even through such
warnings (which can't be disabled except for -w even) to use newer Fortran
standard in their sources.
I'm pretty sure the standard requires something similar to OpenMP, i.e. that
the array sections are contiguous.  The attribute just asserts something is
contiguous, it doesn't mean if the attribute is missing, it isn't
contiguous.

So, IMNSHO if you want to warn about something, only warn when you provably
detect at compile time a non-contiguous array section.

Jakub

Re: [PATCH, Fortran, OpenACC] Fix PR70598, Fortran host_data ICE

2016-08-09 Thread Jakub Jelinek

On Fri, Jul 29, 2016 at 11:47:18PM +0800, Chung-Lin Tang wrote:
> On 2016/7/21 07:13 PM, Jakub Jelinek wrote:
> > Better put every && on a separate line if the whole if (...) doesn't fit
> > on a single line.
> > 
> >> > +  && !n->sym->attr.cray_pointer
> >> > +  && !n->sym->attr.cray_pointee)
> > This is too ugly.  I'd instead move the if after the cray pointer/pointee
> > tests, i.e.
> > if (n->sym->attr.cray_pointer)
> >   gfc_error (...);
> > else if (n->sym->attr.cray_pointee)
> >   gfc_error (...);
> > else if (n->sym->attr.flavor == FL_VARIABLE
> >  && !n->sym->as
> >  && !n->sym->attr.pointer)
> >   gfc_error (...);
> > 
> 
> Hi Jakub, I've adjusted the patch like you suggested.
> 
> Patch has been re-tested and applied to gomp-4_0-branch,
> okay for trunk as well?

Ok with suitable ChangeLog entry.

Jakub

[PATCH, i386]: Fix PR 72843, ICE in lra_set_insn_recog_data

2016-08-09 Thread Uros Bizjak

Hello!

Insn predicates have to allow all constraints during LRA, not only
after reload_completed is true.

2016-08-09  Uros Bizjak  

PR target/72843
* config/i386/i386.md (*movtf_internal): Use
lra_in_progress || reload_completed instead of !can_create_pseudo_p
in the insn constraint.
(*movxf_internal): Ditto.
(*movdf_internal): Ditto.
(*movsf_internal): Ditto.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32} and
committed to mainline.

The patch will be backported to to other release branches after a week.

Uros.
Index: config/i386/i386.md
===
--- config/i386/i386.md (revision 239247)
+++ config/i386/i386.md (working copy)
@@ -3119,7 +3119,7 @@
(match_operand:TF 1 "general_operand"  "C ,vm,v,*roF,*rC"))]
   "(TARGET_64BIT || TARGET_SSE)
&& !(MEM_P (operands[0]) && MEM_P (operands[1]))
-   && (!can_create_pseudo_p ()
+   && (lra_in_progress || reload_completed
|| !CONST_DOUBLE_P (operands[1])
|| ((optimize_function_for_size_p (cfun)
|| (ix86_cmodel == CM_LARGE || ix86_cmodel == CM_LARGE_PIC))
@@ -3205,7 +3205,7 @@
(match_operand:XF 1 "general_operand"
 "fm,f,G,roF,r , *roF,*r,F ,C,roF,rF"))]
   "!(MEM_P (operands[0]) && MEM_P (operands[1]))
-   && (!can_create_pseudo_p ()
+   && (lra_in_progress || reload_completed
|| !CONST_DOUBLE_P (operands[1])
|| ((optimize_function_for_size_p (cfun)
|| (ix86_cmodel == CM_LARGE || ix86_cmodel == CM_LARGE_PIC))
@@ -3278,7 +3278,7 @@
(match_operand:DF 1 "general_operand"
 "Yf*fm,Yf*f,G   ,roF,r ,*roF,*r,F ,rm,rC,C ,F ,C,v,m,v,C ,*x,m ,*x,Yj,r 
,roF,rF,rmF,rC"))]
   "!(MEM_P (operands[0]) && MEM_P (operands[1]))
-   && (!can_create_pseudo_p ()
+   && (lra_in_progress || reload_completed
|| !CONST_DOUBLE_P (operands[1])
|| ((optimize_function_for_size_p (cfun)
|| (ix86_cmodel == CM_LARGE || ix86_cmodel == CM_LARGE_PIC))
@@ -3480,7 +3480,7 @@
(match_operand:SF 1 "general_operand"
  "Yf*fm,Yf*f,G   ,rmF,rF,C,v,m,v,Yj,r  ,*y ,m  ,*y,*Yn,r   ,rmF,rF"))]
   "!(MEM_P (operands[0]) && MEM_P (operands[1]))
-   && (!can_create_pseudo_p ()
+   && (lra_in_progress || reload_completed
|| !CONST_DOUBLE_P (operands[1])
|| ((optimize_function_for_size_p (cfun)
|| (ix86_cmodel == CM_LARGE || ix86_cmodel == CM_LARGE_PIC))

Re: [PATCH] Fix a few fall through cases

2016-08-09 Thread Jakub Jelinek

On Tue, Aug 09, 2016 at 05:21:26PM +0200, Marek Polacek wrote:
> Bootstrapped/regtested on x86_64-linux and ppc64le-redhat-linux.
> 
> 2016-08-09  Marek Polacek  
> 
>   PR c/7652
> gcc/c-family/
>   * c-ada-spec.c (dump_generic_ada_node): Add return.
> gcc/
>   * cselib.c (cselib_expand_value_rtx_1): Add return.
>   * gengtype.c (dbgprint_count_type_at): Likewise.
>   * hsa-gen.c (gen_hsa_insn_for_internal_fn_call): Likewise.
>   * reg-stack.c (get_true_reg): Restructure to avoid fallthrough warning.
> 
> --- gcc/gcc/cselib.c
> +++ gcc/gcc/cselib.c
> @@ -1618,6 +1618,7 @@ cselib_expand_value_rtx_1 (rtx orig, struct 
> expand_value_data *evd,
> else
>   return orig;
>   }
> + return orig;
>}
>  
>  CASE_CONST_ANY:
> --- gcc/gcc/gengtype.c
> +++ gcc/gcc/gengtype.c
> @@ -175,6 +175,7 @@ dbgprint_count_type_at (const char *fil, int lin, const 
> char *msg, type_p t)
>   {
>   case TYPE_UNDEFINED:
> nb_undefined++;
> +   break;
>   case TYPE_SCALAR:
> nb_scalar++;
> break;

These two are ok.

> --- gcc/gcc/reg-stack.c
> +++ gcc/gcc/reg-stack.c
> @@ -423,23 +423,25 @@ get_true_reg (rtx *pat)
> GET_MODE (subreg));
> return pat;
>   }
> +   pat = &XEXP (*pat, 0);
> +   break;
>   }

While this one is similar to the cselib.c case, I wonder if it wouldn't be
nicer to drop the {}s, reindent the case SUBREG: and just /* FALLTHRU */
at the end.  Uninitialized decls can be declared anywhere in the switch
in C++ or in C99+.  I don't feel strongly about it, so if you don't want to
do that, this hunk is ok too.

>case FLOAT:
>case FIX:
>case FLOAT_EXTEND:
> - pat = & XEXP (*pat, 0);
> + pat = &XEXP (*pat, 0);
>   break;
>  
>case UNSPEC:
>   if (XINT (*pat, 1) == UNSPEC_TRUNC_NOOP
>   || XINT (*pat, 1) == UNSPEC_FILD_ATOMIC)
> -   pat = & XVECEXP (*pat, 0, 0);
> +   pat = &XVECEXP (*pat, 0, 0);
>   return pat;
>  
>case FLOAT_TRUNCATE:
>   if (!flag_unsafe_math_optimizations)
> return pat;
> - pat = & XEXP (*pat, 0);
> + pat = &XEXP (*pat, 0);
>   break;
>  
>default:

The above 3 are obviously ok.

Jakub

Re: [PATCH GCC]Resolve compilation time known alias checks in vectorizer

2016-08-09 Thread Bin.Cheng

On Sat, Aug 6, 2016 at 9:20 PM, Andreas Schwab  wrote:
> On Mi, Jul 13 2016, "Bin.Cheng"  wrote:
>
>> Patch re-tested/applied on trunk as r238301.
>
> This breaks gcc.dg/vect/vect-117.c on powerpc.
Hi Andreas,
Sorry for the inconvenience, I will have a look.

Thanks,
bin
>
> Andreas.
>
> --
> Andreas Schwab, sch...@linux-m68k.org
> GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
> "And now for something completely different."

Re: libgo patch committed: Change build procedure to use build tags

2016-08-09 Thread Lynn A. Boger

libgo test results in gcc-testresults for ppc64le & ppc64 are back to 
what they were before the change.


Thanks.

On 08/08/2016 03:34 PM, Ian Lance Taylor wrote:

On Mon, Aug 8, 2016 at 11:14 AM, Lynn A. Boger
 wrote:

The libgo tests on ppc64le and ppc64 have all been failing in
gcc-testresults since this change went in and continues to fail after the
recent fixes for failures on other platforms.

Built myself and got the same failures.  I set keep=true in gotest to save
the test dirs.  Just running a single package:

make bufio/check
file /home/boger/gccgo.work/trunk/bld/../src/libgo/go/bufio/bufio.go not
found
Keeping gotest9734
FAIL: bufio
Makefile:3645: recipe for target 'bufio/check' failed
make: *** [bufio/check] Error 1
boger@willow3:~/gccgo.work/trunk/bld/powerpc64le-linux/libgo$ ls
/home/boger/gccgo.work/trunk/bld/../src/libgo/go/bufio/bufio.go
/home/boger/gccgo.work/trunk/bld/../src/libgo/go/bufio/bufio.go
boger@willow3:~/gccgo.work/trunk/bld/powerpc64le-linux/libgo$ file
/home/boger/gccgo.work/trunk/bld/../src/libgo/go/bufio/bufio.go
/home/boger/gccgo.work/trunk/bld/../src/libgo/go/bufio/bufio.go: ASCII text

I found the problem.  I always configure with a relative srcdir.  You
are using an absolute srcdir.  The recent changes to use build tags
changed the libgo Makefile so that when invoked with with an absolute
srcdir it would pass absolute path names to the gotest script.  That
never worked.  This patch makes it work, and should fix your problem.
Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu with a
relative srcdir and an absolute srcdir.  Committed to mainline.

Ian

Re: [PATCH] Fix a few fall through cases

2016-08-09 Thread Martin Jambor

Hi,

On Tue, Aug 09, 2016 at 05:21:26PM +0200, Marek Polacek wrote:
> Testing with -Wimplicit-fallthrough turned up a few spots where
> fall through is either a bug, or just too confusing.  This patch
> should be desirable even though the warning is still not in.
> 
> Eric, can you look at the c-ada-spec.c part?
> Martin, does the hsa-gen.c part look correct?
> 
> --- gcc/gcc/hsa-gen.c
> +++ gcc/gcc/hsa-gen.c
> @@ -5039,6 +5039,7 @@ gen_hsa_insn_for_internal_fn_call (gcall *stmt, hsa_bb 
> *hbb)
>  case IFN_FMIN:
>  case IFN_FMAX:
>gen_hsa_insns_for_call_of_internal_fn (stmt, hbb);
> +  break;
>  
>  default:
>HSA_SORRY_ATV (gimple_location (stmt),


Yes it is correct.  Thanks for catching this.
-Wimplicit-fallthrough is really useful work!

Martin

Re: Use verify_oacc_routine_clauses for Fortran (was: Use verify_oacc_routine_clauses for C/C++)

2016-08-09 Thread Cesar Philippidis

On 08/01/2016 08:29 AM, Thomas Schwinge wrote:

> On Mon, 01 Aug 2016 17:21:37 +0200, I wrote:
>> Some checking of OpenACC clauses currently done in the front ends should
>> be moved later, and be unified.  (Also, I suppose, for supporting of the
>> device_type clause, such checking actually *must* be moved later, into
>> the oaccdevlow pass, or similar.)  Here is a first preparatory patch.  OK
>> for trunk?
>>
>> commit e02a9b65c505b404f8d985b0ec6ccb99d73515d3
>> Author: Thomas Schwinge 
>> Date:   Wed Jul 27 15:54:38 2016 +0200
>>
>> Use verify_oacc_routine_clauses for C/C++
> 
> Here is a Fortran patch.  This depends on other Fortran patches in flight
> (such as Cesar's), and on PR72741 "Fortran OpenACC routine directive
> doesn't properly handle clauses specifying the level of parallelism" be
> resolved, and thereabouts, but I'm posting it anyway, in case anyone has
> any review comments already.  I suppose, to begin with, the call of
> gfc_oacc_routine_dims will move later into the Fortran front end
> pipeline, to the point then function declarations' attributes are set, or
> similar.  Also, as discussed already, the Fortran front end currently is
> very "forgetful" in regards to OpenACC/OpenMP clauses' specific location
> information, so we're not able at present to produce diagnostics with
> precise location information.

Are you planning on staging this patch in gomp4? I know it's a work in
progress, but is this function ...


> diff --git gcc/fortran/trans-openmp.c gcc/fortran/trans-openmp.c
> index 0d646ed..254732c 100644
> --- gcc/fortran/trans-openmp.c
> +++ gcc/fortran/trans-openmp.c
> @@ -4570,3 +4570,70 @@ gfc_trans_omp_declare_simd (gfc_namespace *ns)
>DECL_ATTRIBUTES (fndecl) =;
>  }
>  }
> +
> +/* Determine and verify the level of parallelism for an OpenACC routine.  */
> +
> +oacc_function
> +gfc_oacc_routine_dims (gfc_omp_clauses *clauses, locus location)
> +{
> +  /* This is implemented in terms of OMP_CLAUSE trees, so that we can use the
> + generic functions for checking validity.  This has a little bit of
> + overhead, but as the number of clauses on OpenACC routine directives as
> + well as the number of OpenACC routine directives will both be rather
> + small, this is acceptable.  */
> +  tree clauses_t =ULL_TREE;
> +  /* We don't have specific location information available for the individual
> + clauses...  */
> +  location_t loc =ocation.lb->location;
> +  if (clauses)
> +{
> +  if (clauses->gang)
> + {
> +   tree c =uild_omp_clause (loc, OMP_CLAUSE_GANG);
> +   OMP_CLAUSE_CHAIN (c) =lauses_t;
> +   clauses_t =;
> + }
> +  if (clauses->worker)
> + {
> +   tree c =uild_omp_clause (loc, OMP_CLAUSE_WORKER);
> +   OMP_CLAUSE_CHAIN (c) =lauses_t;
> +   clauses_t =;
> + }
> +  if (clauses->vector)
> + {
> +   tree c =uild_omp_clause (loc, OMP_CLAUSE_VECTOR);
> +   OMP_CLAUSE_CHAIN (c) =lauses_t;
> +   clauses_t =;
> + }
> +  if (clauses->seq)
> + {
> +   tree c =uild_omp_clause (loc, OMP_CLAUSE_SEQ);
> +   OMP_CLAUSE_CHAIN (c) =lauses_t;
> +   clauses_t =;
> + }
> +}
> +  verify_oacc_routine_clauses (&clauses_t, loc);

included elsewhere in your patch set? I'll take over this patch as I
work on PR72741.

Cesar

Re: Implement -Wimplicit-fallthrough (take 2): questionable code

2016-08-09 Thread Jakub Jelinek

On Wed, Jul 27, 2016 at 06:53:39PM +0200, Marek Polacek wrote:
> --- gcc/gcc/var-tracking.c
> +++ gcc/gcc/var-tracking.c
> @@ -1056,6 +1056,8 @@ adjust_mems (rtx loc, const_rtx old_rtx, void *data)
>? GET_MODE_SIZE (amd->mem_mode)
>: -GET_MODE_SIZE (amd->mem_mode),
>GET_MODE (loc)));
> +  /* XXX Really fallthru?  */
> +  gcc_fallthrough ();
>  case POST_INC:
>  case POST_DEC:
>if (addr == loc)
> @@ -1076,6 +1078,8 @@ adjust_mems (rtx loc, const_rtx old_rtx, void *data)
>return addr;
>  case PRE_MODIFY:
>addr = XEXP (loc, 1);
> +  /* XXX Really fallthru?  */
> +  gcc_fallthrough ();
>  case POST_MODIFY:
>if (addr == loc)
>   addr = XEXP (loc, 0);

Indeed, these two are intentional fallthrus.  But, can you just add
/* FALLTHRU */
comment in there, rather than gcc_fallthrough () or similar?
Patch preapproved.

Jakub

[PATCH] cpp/c: Add -Wexpansion-to-defined

2016-08-09 Thread Paolo Bonzini

clang recently added a new warning -Wexpansion-to-defined, which
warns when `defined' is used outside a #if expression (including the
case of a macro that is then used in a #if expression).

While I disagree with their inclusion of the warning to -Wall, I think
it is a good addition overall.  First, it is a logical extension of the
existing warning for breaking defined across a macro and its caller.
Second, it is good to make these warnings for `defined' available with
a command-line option other than -pedantic.

This patch adds -Wexpansion-to-defined, and enables it automatically
for both -pedantic (as before) and -Wextra.

Bootstrapped and regression tested x86_64-pc-linux-gnu, ok?

Paolo

libcpp:
2016-08-09  Paolo Bonzini  

* include/cpplib.h (struct cpp_options): Add new member
warn_expansion_to_defined.
(CPP_W_EXPANSION_TO_DEFINED): New enum member.
* expr.c (parse_defined): Warn for all uses of "defined"
in macros, and tie warning to CPP_W_EXPANSION_TO_DEFINED.
* system.h (HAVE_DESIGNATED_INITIALIZERS): Do not use
"defined" in macros.

gcc:
2016-08-09  Paolo Bonzini  

* system.h (HAVE_DESIGNATED_INITIALIZERS,
HAVE_DESIGNATED_UNION_INITIALIZERS): Do not use
"defined" in macros.

gcc/c-family:
2016-08-09  Paolo Bonzini  

* c-opts (sanitize_cpp_opts): Compute default value
for -Wexpansion-to-defined.
* c.opt (Wexpansion-to-defined): New.

gcc/doc:
2016-08-09  Paolo Bonzini  

* cpp.texi (Defined): Mention -Wexpansion-to-defined.
* cppopts.texi (Invocation): Document -Wexpansion-to-defined.
* invoke.texi (Warning Options): Document -Wexpansion-to-defined.

gcc/testsuite:
2016-08-09  Paolo Bonzini  

* gcc.dg/cpp/defined.c: Mark newly introduced warnings.
* gcc.dg/cpp/defined-Wexpansion-to-defined.c,
gcc.dg/cpp/defined-Wextra-Wno-expansion-to-defined.c,
gcc.dg/cpp/defined-Wextra.c,
gcc.dg/cpp/defined-Wno-expansion-to-defined.c: New testcases.

Index: gcc/c-family/c-opts.c
===
--- gcc/c-family/c-opts.c   (revision 239276)
+++ gcc/c-family/c-opts.c   (working copy)
@@ -1256,6 +1256,10 @@ sanitize_cpp_opts (void)
   cpp_opts->unsigned_char = !flag_signed_char;
   cpp_opts->stdc_0_in_system_headers = STDC_0_IN_SYSTEM_HEADERS;
 
+  cpp_opts->warn_expansion_to_defined = cpp_warn_expansion_to_defined;
+  if (cpp_warn_expansion_to_defined == -1)
+cpp_warn_expansion_to_defined = pedantic || extra_warnings;
+
   /* Wlong-long is disabled by default. It is enabled by:
   [-Wpedantic | -Wtraditional] -std=[gnu|c]++98 ; or
   [-Wpedantic | -Wtraditional] -std=non-c99 
Index: gcc/c-family/c.opt
===
--- gcc/c-family/c.opt  (revision 239276)
+++ gcc/c-family/c.opt  (working copy)
@@ -506,6 +506,10 @@ Wdouble-promotion
 C ObjC C++ ObjC++ Var(warn_double_promotion) Warning
 Warn about implicit conversions from \"float\" to \"double\".
 
+Wexpansion-to-defined
+C ObjC C++ ObjC++ CppReason(CPP_W_EXPANSION_TO_DEFINED) 
Var(cpp_warn_expansion_to_defined) Init(-1) Warning
+Warn if an undefined macro is used in an #if directive.
+
 Wimplicit-function-declaration
 C ObjC Var(warn_implicit_function_declaration) Init(-1) Warning 
LangEnabledBy(C ObjC,Wimplicit)
 Warn about implicit function declarations.
Index: gcc/doc/cpp.texi
===
--- gcc/doc/cpp.texi(revision 239276)
+++ gcc/doc/cpp.texi(working copy)
@@ -3342,7 +3342,8 @@ If the @code{defined} operator appears as a result
 the C standard says the behavior is undefined.  GNU cpp treats it as a
 genuine @code{defined} operator and evaluates it normally.  It will warn
 wherever your code uses this feature if you use the command-line option
-@option{-pedantic}, since other compilers may handle it differently.
+@option{-pedantic}, since other compilers may handle it differently.  The
+warning can also be enabled individually with @option{-Wexpansion-to-defined}.
 
 @node Else
 @subsection Else
Index: gcc/doc/cppopts.texi
===
--- gcc/doc/cppopts.texi(revision 239276)
+++ gcc/doc/cppopts.texi(working copy)
@@ -120,6 +120,12 @@ Warn whenever an identifier which is not a macro i
 @samp{#if} directive, outside of @samp{defined}.  Such identifiers are
 replaced with zero.
 
+@item -Wexpansion-to-defined
+@opindex Wexpansion-to-defined
+Warn whenever @samp{defined} is encountered in the expansion of a macro
+(including the case where the macro is expanded by an @samp{#if} directive).
+Such usage is not portable.
+
 @item -Wunused-macros
 @opindex Wunused-macros
 Warn about macros defined in the main file that are unused.  A macro
Index: gcc/doc/invoke.texi
===
--- g

[PR ipa/71981] Make get_dynamic_type grok MEM_REF

2016-08-09 Thread Martin Jambor

Hi,

in the PR ipa_polymorphic_call_context::get_dynamic_type hits an assert
when it tries to figure out dynamic type changes in sequence:

  _3 = MEM[(char * *)""];
  fn1.isra.0 (_3);

and gets the MEM_REF as its instance parameter.  This patch makes it
bail out early instead.

Pre-approved by Honza on IRC, I have committed it to the trunk and the
gcc-6-branch after bootstrapping and testing it on both.

Thanks,

Martin

2016-08-09  Martin Jambor  

PR ipa/71981
* ipa-polymorphic-call.c (get_dynamic_type): Bail out gracefully
if instance is a MEM_REF.

testsuite/
PR ipa/71981
* gcc.dg/ipa/pr71981.c: New test.


diff --git a/gcc/ipa-polymorphic-call.c b/gcc/ipa-polymorphic-call.c
index 56f3344..f7ef6aa 100644
--- a/gcc/ipa-polymorphic-call.c
+++ b/gcc/ipa-polymorphic-call.c
@@ -1544,6 +1544,11 @@ ipa_polymorphic_call_context::get_dynamic_type (tree 
instance,
   if (!maybe_in_construction && !maybe_derived_type)
 return false;
 
+  /* If we are in fact not looking at any object object or the instance is
+ some placement new into a random load, give up straight away.  */
+  if (TREE_CODE (instance) == MEM_REF)
+return false;
+
   /* We need to obtain refernce to virtual table pointer.  It is better
  to look it up in the code rather than build our own.  This require bit
  of pattern matching, but we end up verifying that what we found is
@@ -1664,7 +1669,6 @@ ipa_polymorphic_call_context::get_dynamic_type (tree 
instance,
   tci.offset = instance_offset;
   tci.instance = instance;
   tci.vtbl_ptr_ref = instance_ref;
-  gcc_assert (TREE_CODE (instance) != MEM_REF);
   tci.known_current_type = NULL_TREE;
   tci.known_current_offset = 0;
   tci.otr_type = otr_type;
diff --git a/gcc/testsuite/gcc.dg/ipa/pr71981.c 
b/gcc/testsuite/gcc.dg/ipa/pr71981.c
new file mode 100644
index 000..1b21602
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/pr71981.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -w" } */
+
+int **a;
+static void fn1(char **p1) {
+  char s = *p1, b = &s;
+  while (*fn2()[a])
+;
+}
+int main() { fn1(""); return 0; }

Re: [PATCH] Fix a few fall through cases

2016-08-09 Thread Eric Botcazou

> 2016-08-09  Marek Polacek  
> 
>   PR c/7652
> gcc/c-family/
>   * c-ada-spec.c (dump_generic_ada_node): Add return.

OK, thanks.

-- 
Eric Botcazou

Re: [PATCH, fixincludes] Fix PR bootstrap/72833

2016-08-09 Thread Bruce Korb

Index: fixincludes/fixincl.tpl

===

--- fixincludes/fixincl.tpl (revision 239193)

+++ fixincludes/fixincl.tpl (working copy)

@@ -1,7 +1,7 @@

 [= AutoGen5 Template -*- Mode: C -*-

 x=fixincl.x =]

 [=

- (if (version-compare >= autogen-version "5.18")

+ (if (version-compare >= autogen-version "5.18.1")

  (dne "-D" " * " "/* ")

  (dne " * " "/* ") ) =]

Was this causing a problem?  If so, then it all looks good to me.  I
had thought the "-D" was implemented in 5.18, but that was long, long
ago now.

On Tue, Aug 9, 2016 at 5:17 AM, Bernd Edlinger
 wrote:
> Hi!
>
> Due to my recent cleanup of special_function_p the darwin bootstrap is
> currently broken because the longjmp function misses the noreturn
> attribute.  Therefore I revived an old solaris_longjmp_noreturn
> fixinclude rule.
>
> See the attached patch.
>
> Boot-strap on i686-apple-darwin11 and x86_64-apple-darwin11 was
> successful.
>
> Is it OK for trunk?
>
>
> Thanks
> Bernd.

Re: [PATCH, libgo]: Fix syscall test failure on CentOS 5.11

2016-08-09 Thread Ian Lance Taylor

On Tue, Aug 9, 2016 at 1:03 AM, Uros Bizjak  wrote:
>
> syscall test fails on CentOS 5.11 with:
>
> exec_linux_test.go:167:25: error: reference to undefined identifier
> 'syscall.CLONE_NEWNET'
>Unshareflags: syscall.CLONE_NEWNET,
>  ^
> FAIL: syscall
>
> Attached patch fixes the failure by providing CLONE_NEWNET definition.
>
> Tested on x86_64-linux-gnu.

Thanks.  Committed.

Ian

Re: [PATCH 0/7, GCC, V8M] ARMv8-M Security Extensions

2016-08-09 Thread Sandra Loosemore


On 08/09/2016 06:01 AM, Andre Vieira (lists) wrote:

[snip]

The documentation is in the ARMV8-M Security Extensions in: ARM®v8-M
Security Extensions: Requirements on Development Tools document I linked
in the email above and subsequent emails
(http://infocenter.arm.com/help/topic/com.arm.doc.ecm0359818/index.html).

Also per patch I refer to the relevant sections. So for instance in
PATCH 3/7 refers to Section 5.4, which describes Entry functions and
mentions the cmse_nonsecure_entry attribute. Whereas PATCH 7/7 refers to
Section 5.4.3 of the same document which describes the
cmse_nonsecure_caller intrinsic which that patch implements.

Is there a specific intrinsic/attribute you are missing?


You need to at least add entries to the relevant sections in extend.texi 
for the new target-specific intrinsic and attributes.  The documentation 
there doesn't need to be terribly detailed (one sentence and a link to 
the external document is probably all you need), but it's important that 
these things be listed in GCC's supported extensions so that users know 
they can use them and so that people who see them in code written by 
other people can find out what they mean.


-Sandra

C++ PATCHes for abi_tag bugs

2016-08-09 Thread Jason Merrill

The first patch corrects a remaining issue with the GCC 6 fix for ABI
tags on template member functions.  We were missing tags on them, and
I fixed that by setting tags on the template, but the proper fix is to
look at the instantiation rather than the template to find the
relevant tags, since the type we mangle is the instantiated type.

The second patch corrects a bug whereby we were mangling tags on a
conversion function twice.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit 47a130c94ba180704779c0e024adee84a8f001e5
Author: Jason Merrill 
Date:   Fri Jul 29 22:33:04 2016 -0400

Adjust mangling of ABI tags on class template member functions.

* class.c (missing_abi_tags): New.
(check_abi_tags): Don't check template. Add just_checking mode.
* mangle.c (abi_flag_at_least, any_abi_below, equal_abi_tags): New.
(sorted_abi_tags): Split out from write_abi_tags.
(struct releasing_vec): New.
(write_unqualified_name): Only look for the primary
template for types.  Implement backward compatibility.

diff --git a/gcc/cp/class.c b/gcc/cp/class.c
index 8249d93..e21647a 100644
--- a/gcc/cp/class.c
+++ b/gcc/cp/class.c
@@ -1561,20 +1561,20 @@ mark_abi_tags (tree t, bool val)
 
 /* Check that T has all the ABI tags that subobject SUBOB has, or
warn if not.  If T is a (variable or function) declaration, also
-   add any missing tags.  */
+   return any missing tags, and add them to T if JUST_CHECKING is false.  */
 
-static void
-check_abi_tags (tree t, tree subob)
+static tree
+check_abi_tags (tree t, tree subob, bool just_checking = false)
 {
   bool inherit = DECL_P (t);
 
   if (!inherit && !warn_abi_tag)
-return;
+return NULL_TREE;
 
   tree decl = TYPE_P (t) ? TYPE_NAME (t) : t;
   if (!TREE_PUBLIC (decl))
 /* No need to worry about things local to this TU.  */
-return;
+return NULL_TREE;
 
   mark_abi_tags (t, true);
 
@@ -1585,7 +1585,15 @@ check_abi_tags (tree t, tree subob)
 
   cp_walk_tree_without_duplicates (&subtype, find_abi_tags_r, &data);
 
-  if (inherit && data.tags)
+  if (!(inherit && data.tags))
+/* We don't need to do anything with data.tags.  */;
+  else if (just_checking)
+for (tree t = data.tags; t; t = TREE_CHAIN (t))
+  {
+   tree id = get_identifier (TREE_STRING_POINTER (TREE_VALUE (t)));
+   IDENTIFIER_MARKED (id) = false;
+  }
+  else
 {
   tree attr = lookup_attribute ("abi_tag", DECL_ATTRIBUTES (t));
   if (attr)
@@ -1597,6 +1605,8 @@ check_abi_tags (tree t, tree subob)
 }
 
   mark_abi_tags (t, false);
+
+  return data.tags;
 }
 
 /* Check that DECL has all the ABI tags that are used in parts of its type
@@ -1605,15 +1615,6 @@ check_abi_tags (tree t, tree subob)
 void
 check_abi_tags (tree decl)
 {
-  tree t;
-  if (abi_version_at_least (10)
-  && DECL_LANG_SPECIFIC (decl)
-  && DECL_USE_TEMPLATE (decl)
-  && (t = DECL_TEMPLATE_RESULT (DECL_TI_TEMPLATE (decl)),
- t != decl))
-/* Make sure that our template has the appropriate tags, since
-   write_unqualified_name looks for them there.  */
-check_abi_tags (t);
   if (VAR_P (decl))
 check_abi_tags (decl, TREE_TYPE (decl));
   else if (TREE_CODE (decl) == FUNCTION_DECL
@@ -1621,6 +1622,22 @@ check_abi_tags (tree decl)
 check_abi_tags (decl, TREE_TYPE (TREE_TYPE (decl)));
 }
 
+/* Return any ABI tags that are used in parts of the type of DECL
+   that are not reflected in its mangled name.  This function is only
+   used in backward-compatible mangling for ABI <11.  */
+
+tree
+missing_abi_tags (tree decl)
+{
+  if (VAR_P (decl))
+return check_abi_tags (decl, TREE_TYPE (decl), true);
+  else if (TREE_CODE (decl) == FUNCTION_DECL
+  && !mangle_return_type_p (decl))
+return check_abi_tags (decl, TREE_TYPE (TREE_TYPE (decl)), true);
+  else
+return NULL_TREE;
+}
+
 void
 inherit_targ_abi_tags (tree t)
 {
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 70a42f8..f32613c 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -5686,6 +5686,7 @@ extern void inherit_targ_abi_tags (tree);
 extern void defaulted_late_check   (tree);
 extern bool defaultable_fn_check   (tree);
 extern void check_abi_tags (tree);
+extern tree missing_abi_tags   (tree);
 extern void fixup_type_variants(tree);
 extern void fixup_attribute_variants   (tree);
 extern tree* decl_cloned_function_p(const_tree, bool);
diff --git a/gcc/cp/mangle.c b/gcc/cp/mangle.c
index b6c9628..f7ff221 100644
--- a/gcc/cp/mangle.c
+++ b/gcc/cp/mangle.c
@@ -91,6 +91,14 @@ along with GCC; see the file COPYING3.  If not see
 #define abi_warn_or_compat_version_crosses(N) \
   (abi_version_crosses (N) || abi_compat_version_crosses (N))
 
+/* And sometimes we can simplify the code path if we don't need to worry about
+   previous ABIs.  */
+#define abi_flag_at_least(flag,N) (flag == 0 ||

Re: [PATCH, fixincludes] Fix PR bootstrap/72833

2016-08-09 Thread Bernd Edlinger

On 08/09/16 18:28, Bruce Korb wrote:
> Index: fixincludes/fixincl.tpl
>
> ===
>
> --- fixincludes/fixincl.tpl (revision 239193)
>
> +++ fixincludes/fixincl.tpl (working copy)
>
> @@ -1,7 +1,7 @@
>
>   [= AutoGen5 Template -*- Mode: C -*-
>
>   x=fixincl.x =]
>
>   [=
>
> - (if (version-compare >= autogen-version "5.18")
>
> + (if (version-compare >= autogen-version "5.18.1")
>
>(dne "-D" " * " "/* ")
>
>(dne " * " "/* ") ) =]
>
> Was this causing a problem?  If so, then it all looks good to me.  I
> had thought the "-D" was implemented in 5.18, but that was long, long
> ago now.
>

Yes. I was surprised that genfixes generated a bogus fixincl.x
there was no warning, but the result did not compile in the end.

You can see the result as the first obsoleted patch in the PR 72833.

I use autogen 5.18 from Ubuntu 14.04.


Thanks
Bernd.

> On Tue, Aug 9, 2016 at 5:17 AM, Bernd Edlinger
>  wrote:
>> Hi!
>>
>> Due to my recent cleanup of special_function_p the darwin bootstrap is
>> currently broken because the longjmp function misses the noreturn
>> attribute.  Therefore I revived an old solaris_longjmp_noreturn
>> fixinclude rule.
>>
>> See the attached patch.
>>
>> Boot-strap on i686-apple-darwin11 and x86_64-apple-darwin11 was
>> successful.
>>
>> Is it OK for trunk?
>>
>>
>> Thanks
>> Bernd.

Re: [PATCH] Fix early debug regression with DW_AT_string_length (PR debug/71906)

2016-08-09 Thread Richard Biener

On August 9, 2016 4:36:24 PM GMT+02:00, Jakub Jelinek  wrote:
>On Mon, Aug 01, 2016 at 12:44:30PM +0200, Richard Biener wrote:
>> Note that reading the dwarf standard, it looks like it accepts a
>location
>> which means we could do an implicit location description using
>> DW_OP_stack_value which gives us access to arbitrary dwarf
>> expressions (and thus the possibility to handle it similar to VLAs).
>
>Sure.  But for the DW_AT_string_length, the issue is that with early
>debug,
>we want the attribute early, but locations within a function obviously
>can be provided late, after the function is compiled.
>
>Which is why the patch uses DW_OP_call4 as the holder of the
>information
>that the string length is provided by the value of some variable.
>In certain cases it can stay that way even during late debug, if the
>var
>disappears or has no location info etc., then DW_AT_string_length is
>dropped, or in other cases the location of the var is copied into the
>attribute etc.
>
>This is different from the VLA bounds in DW_AT_upper_bound etc., where
>it is allowed to just refer to a DIE that holds the value (so we can
>stick
>it into the attribute early and just supply the var's location later),
>for
>DW_AT_string_length it is not allowed.
>
>> But maybe I am missing something?  (now running into the issue
>> with LTO debug and gfortran.dg/save_5.f90 where during early debug
>> we emit a location that ends up refering to a symbol that might be
>> optimized away later - early debug cannot sanitize referenced symbols
>> via resolv_addr obviously).  Annotating the DIE late is also not
>> what I want to do as I'd need to pull in all type DIEs into the late
>CU
>> that way (well, at least selected ones, but I'm really trying to
>avoid
>> going down that route).
>
>In some cases like location of file scope vars, or this
>DW_AT_string_length,
>you really need to adjust late the debug info created early, there is
>no
>workaround for that.

I suppose the workaround is to fix/extend DWARF... (DW_AT_GNU_string_length 
that allows us to refer to another DIE?)

Richard.

>   Jakub

Re: [PATCH] Fix unaligned access when predictive commoning (PR 71083)

2016-08-09 Thread Bernd Edlinger

On 08/09/16 09:29, Richard Biener wrote:
> On Mon, 8 Aug 2016, Bernd Edlinger wrote:
>
>> Hi!
>>
>>
>> As described in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71083
>> we generate unaligned accesses because tree-predcom rewrites
>> bitfield and packed accesses in a way that looses the proper
>> alignment information.
>>
>> The attached patch fixes this by re-using the gimple memory
>> expression from the original access.
>>
>> I was not sure, if any non-constant array references would be
>> valid at where the ref_at_iteration is expanded, I set these
>> to the array-lower-bound, as the DR_OFFSET contains the folded
>> value of all non-constant array references.
>>
>> The patch compensates for the constant offset from the outer
>> reference to the inner reference, and replaces the inner
>> reference instead of the outer reference.
>>
>>
>> Boot-strapped and reg-tested on x86_64-linux-gnu.
>> OK for trunk?
>
> I don't think what you do is correct.  Consider
>
>   for (i)
>for (j)
> {
>   ... = a[i][j];
> }
>
> predcom considers innermost loops only and thus the DRs will
> be analyzed with respect to that which means DR_BASE_ADDRESS
> is &a[i][0] but you end up generating (*(&a) + i * ..)[0][0]
> which is at best suspicious with respect to further analyses
> by data-ref and TBAA.  Also note that this may get alignment
> wrong as well as changing [i] to [0] may lead to false alignment
> (consider a [2][2] char array aligned to 4 bytes).
>
> Your patch goes halfway back to code we had in the past
> (looking at the 4.3 branch right now) which had numerous
> issues (don't remember exactly).
>

Hmm.  Yes.  These ARRAY_REFs ruin the whole idea :)

> I believe that if we'd handle bitfields by
>
>if (TREE_CODE (DR_REF (dr)) == COMPONENT_REF
>&& DECL_BIT_FIELD_TYPE (TREE_OPERAND (DR_REF (dr), 1)))
>  ref = TREE_OPERAND (DR_REF (dr), 0);
>else
>  ref = DR_REF (dr);
>unsigned align = get_object_alignment (ref);
>
> and use the type / alignment of that ref for the built MEM_REF
> (with coff adjusted by the split bitfield component-ref offset)
> and build a COMPONENT_REF around the MEM_REF to handle the
> bitfield part this should work ok.
>

Ooookay.  I think your idea works in principle.

Attached a new version of the patch, I hope it did not become
too ugly.

I think from Eric's comment in get_inner_ref it can be possible
in Ada that the outer object is accidentally byte-aligned
but the inner reference is not.  In that case I think it is
better to generate a BIT_FIELD_REF instead of a COMPONENT_REF.

Eric do you agree?  Are there any Ada test cases where the
pcom optimization jumps in?

We prefer the COMPONENT_REF just because it is likely more
aligned than the bit field itself.  The mem_ref should be
correctly aligned in both cases.

I was not sure if I should pass the TREE_OPERAND(ref, 2) to
the COMPONENT_REF constructor. Is that used at all? All other
places where COMPONENT_REF are built, seen to pass NULL there.


Bootstrap on x86_64-linux-gnu, reg-test is still running.

Is it OK it no regressions?


Thanks
Bernd.
2016-08-08  Bernd Edlinger  

PR tree-optimization/71083
* tree-predcom.c (ref_at_iteration): Correctly align the
inner reference.

testsuite:
2016-08-08  Bernd Edlinger  

PR tree-optimization/71083
* gcc.c-torture/execute/pr71083.c: New test.
Index: gcc/tree-predcom.c
===
--- gcc/tree-predcom.c	(revision 239193)
+++ gcc/tree-predcom.c	(working copy)
@@ -213,6 +213,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-scalar-evolution.h"
 #include "params.h"
 #include "tree-affine.h"
+#include "builtins.h"
 
 /* The maximum number of iterations between the considered memory
references.  */
@@ -1364,11 +1365,16 @@ replace_ref_with (gimple *stmt, tree new_tree, boo
 /* Returns a memory reference to DR in the ITER-th iteration of
the loop it was analyzed in.  Append init stmts to STMTS.  */
 
-static tree 
+static tree
 ref_at_iteration (data_reference_p dr, int iter, gimple_seq *stmts)
 {
   tree off = DR_OFFSET (dr);
   tree coff = DR_INIT (dr);
+  tree ref = DR_REF (dr);
+  enum tree_code ref_code = ERROR_MARK;
+  tree ref_type = NULL_TREE;
+  tree ref_op1 = NULL_TREE;
+  tree ref_op2 = NULL_TREE;
   if (iter == 0)
 ;
   else if (TREE_CODE (DR_STEP (dr)) == INTEGER_CST)
@@ -1377,27 +1383,43 @@ ref_at_iteration (data_reference_p dr, int iter, g
   else
 off = size_binop (PLUS_EXPR, off,
 		  size_binop (MULT_EXPR, DR_STEP (dr), ssize_int (iter)));
+  if (TREE_CODE (ref) == COMPONENT_REF
+  && DECL_BIT_FIELD (TREE_OPERAND (ref, 1)))
+{
+  unsigned HOST_WIDE_INT boff;
+  tree field = TREE_OPERAND (ref, 1);
+  ref_type = TREE_TYPE (ref);
+  boff = tree_to_uhwi (DECL_FIELD_BIT_OFFSET (field));
+  /* This can occur in Ada.  See Eric's comment in get_bit_range.  */
+  if ((boff % BITS_PER_UN

Re: protected alloca class for malloc fallback

2016-08-09 Thread Trevor Saunders

On Tue, Aug 09, 2016 at 08:33:49PM +0900, Oleg Endo wrote:
> On Mon, 2016-08-08 at 13:39 -0400, Trevor Saunders wrote:
> 
> 
> > First sizeof std::string is 32 on x86_64, a char *, a size_t for the
> > length, and a 16 byte union of a size_t for allocated size and a 16 
> > byte buffer for short strings.  I suppose some of this is required by 
> > the C++ standard,
> 
> I recommend checking what others have figured regarding that matter
> https://github.com/elliotgoodrich/SSO-23

 I think one of the big mistakes in the C++ stl strings is that they try
 to shove stack allocated strings where you want an internal buffer for
 SSO and a class to put in structs into the same type.  imho it would be
 better to have std::string for the second use case, and a separate
 std::stack_string or something for the first case.

> >  but I doubt we really need to care about strings longer than 2^32 in
> > gcc.  If we put the length and allocated size in the buffer,
> > and have a separate class for stack allocated buffers I think we can 
> > have a string that is just sizeof void *.
> 
> This idea has one flaw in that it does not allow having objects by
> value.  It essentially *requires* one to *always* allocate them on the
> heap via new/delete.  Being able to store objects by value is useful in
> many situations.  So if you do the above, the next logical step that
> will follow is a smart-pointer-like wrapper that allows value
> semantics.  Because eventually somebody will want that operator == or
> operator < e.g. for associative containers.

 If what you want is the ability to put the buffer on the stack instead
 of the heap then I think a stack_string class that interoperates with
 your string class is the thing you want.

 I don't really see anything wrong with a string class being a really
 fancy smart pointer that has a bunch of useful string stuff on it.  As
 for operator == I'd be fairly ok with that, other than it hiding a O(N)
 operation in ==.

> > 
> > Second it would be useful performance wise to have a std::string_view
> > type class, but that is c++14 or 17? only so we'd need to import it 
> > into gcc or something.
> 
> http://en.cppreference.com/w/cpp/experimental/basic_string_view
> http://en.cppreference.com/w/cpp/string/basic_string_view
> 
> It's "funny" that GCC contains a C++ stdlib implementation and so
> little is actually used by GCC itself.

Regretably necessary sure, but I'm not sure its funny.  The first big
problem with using the stl is that the subset available in C++98 isn't
that great, you could maybe hack up libstdc++ so that you can use newer
types just without the bits that use newer language features, but that
would take some work.  The other big problem is that the stl is often
too general, and tries to be too simple.  std::string is actually a good
example of both of those problems.  There's really no reason it should
use size_t instead of uint32_t for the string length / capacity.  It
would also be a lot better if it had separate types for strings where
you want an internal buffer or don't.

Trev

Re: [Fortran, Patch] First patch for coarray FAILED IMAGES (TS 18508)

2016-08-09 Thread Alessandro Fanfarillo

Thanks Paul,

I fixed the unused -fdump-tree-original on the tests.

About -fcoarray=single, I agree with Andre about not producing code
for failed images functions when running in single-image mode. If you,
or anybody else, thing otherwise I can adjust the functions to return
a constant value (except for fail image... :)).


2016-08-09 5:22 GMT-06:00 Paul Richard Thomas :
> Hi Sandro,
>
> As far as I can see, this is OK barring a couple of minor wrinkles and
> a question:
>
> For coarray_failed_images_err.f90 and coarray_image_status_err.f90 you
> have used the option -fdump-tree-original without making use of the
> tree dump.
>
> Mikael asked you to provide an executable test with -fcoarray=single.
> Is this not possible for some reason?
>
> Otherwise, this is OK for trunk.
>
> Thanks for the patch.
>
> Paul
>
> On 4 August 2016 at 05:07, Alessandro Fanfarillo
>  wrote:
>> * PING *
>>
>> 2016-07-21 13:05 GMT-06:00 Alessandro Fanfarillo :
>>> Dear Mikael and all,
>>>
>>> in attachment the new patch, built and regtested on x86_64-pc-linux-gnu.
>>>
>>> Cheers,
>>> Alessandro
>>>
>>> 2016-07-20 13:17 GMT-06:00 Mikael Morin :
 Le 20/07/2016 à 11:39, Andre Vehreschild a écrit :
>
> Hi Mikael,
>
>
>>> +  if(st == ST_FAIL_IMAGE)
>>> +new_st.op = EXEC_FAIL_IMAGE;
>>> +  else
>>> +gcc_unreachable();
>>
>> You can use
>> gcc_assert (st == ST_FAIL_IMAGE);
>> foo...;
>> instead of
>> if (st == ST_FAIL_IMAGE)
>> foo...;
>> else
>> gcc_unreachable ();
>
>
> Be careful, this is not 100% identical in the general case. For older
> gcc version (gcc < 4008) gcc_assert() is mapped to nothing, esp. not to
> an abort(), so the behavior can change. But in this case everything is
> fine, because the patch is most likely not backported.
>
 Didn't know about this. The difference seems to be very subtle.
 I don't mind much anyway. The original version can stay if preferred, this
 was just a suggestion.

 By the way, if the function is inlined in its single caller, the assert or
 unreachable statement can be removed, which avoids choosing between them.
 That's another suggestion.


>>> +
>>> +  return MATCH_YES;
>>> +
>>> + syntax:
>>> +  gfc_syntax_error (st);
>>> +
>>> +  return MATCH_ERROR;
>>> +}
>>> +
>>> +match
>>> +gfc_match_fail_image (void)
>>> +{
>>> +  /* if (!gfc_notify_std (GFC_STD_F2008_TS, "FAIL IMAGE statement
>>> at %C")) */
>>> +  /*   return MATCH_ERROR; */
>>> +
>>
>> Can this be uncommented?
>>
>>> +  return fail_image_statement (ST_FAIL_IMAGE);
>>> +}
>>>
>>>  /* Match LOCK/UNLOCK statement. Syntax:
>>>   LOCK ( lock-variable [ , lock-stat-list ] )
>>> diff --git a/gcc/fortran/trans-intrinsic.c
>>> b/gcc/fortran/trans-intrinsic.c index 1aaf4e2..b2f5596 100644
>>> --- a/gcc/fortran/trans-intrinsic.c
>>> +++ b/gcc/fortran/trans-intrinsic.c
>>> @@ -1647,6 +1647,24 @@ trans_this_image (gfc_se * se, gfc_expr
>>> *expr) m, lbound));
>>>  }
>>>
>>> +static void
>>> +gfc_conv_intrinsic_image_status (gfc_se *se, gfc_expr *expr)
>>> +{
>>> +  unsigned int num_args;
>>> +  tree *args,tmp;
>>> +
>>> +  num_args = gfc_intrinsic_argument_list_length (expr);
>>> +  args = XALLOCAVEC (tree, num_args);
>>> +
>>> +  gfc_conv_intrinsic_function_args (se, expr, args, num_args);
>>> +
>>> +  if (flag_coarray == GFC_FCOARRAY_LIB)
>>> +{
>>
>> Can everything be put under the if?
>> Does it work with -fcoarray=single?
>
>
> IMO coarray=single should not generate code here, therefore putting
> everything under the if should to fine.
>
 My point was more avoiding generating code for the arguments if they are 
 not
 used in the end.
 Regarding the -fcoarray=single case, the function returns a result, which
 can be used in an expression, so I don't think it will work without at 
 least
 hardcoding a fixed value as result in that case.
 But even that wouldn't be enough, as the function wouldn't work 
 consistently
 with the fail image statement.

> Sorry for the comments ...
>
 Comments are welcome here, as far as I know. ;-)

 Mikael
>
>
>
> --
> The difference between genius and stupidity is; genius has its limits.
>
> Albert Einstein

Re: [RFC] ipa bitwise constant propagation

2016-08-09 Thread Martin Jambor

Hi,

On Tue, Aug 09, 2016 at 05:17:31PM +0530, Prathamesh Kulkarni wrote:
> On 9 August 2016 at 16:39, Martin Jambor  wrote:
>
> ...
>
> >> Instead of storing arg's precision and sign, we should store
> >> parameter's precision and sign in ipa_compute_jump_functions_for_edge ().
> >> Diff with respect to previous patch:
> >>
> >> @@ -1688,9 +1690,9 @@ ipa_compute_jump_functions_for_edge (struct
> >> ipa_func_body_info *fbi,
> >>&& (TREE_CODE (arg) == SSA_NAME || TREE_CODE (arg) == INTEGER_CST))
> >>   {
> >>jfunc->bits.known = true;
> >> -  jfunc->bits.sgn = TYPE_SIGN (TREE_TYPE (arg));
> >> -  jfunc->bits.precision = TYPE_PRECISION (TREE_TYPE (arg));
> >> -
> >> +  jfunc->bits.sgn = TYPE_SIGN (param_type);
> >> +  jfunc->bits.precision = TYPE_PRECISION (param_type);
> >> +
> >
> > If you want to use the precision of the formal parameter then you do
> > not need to store it to jump functions.  Parameter DECLs along with
> > their types are readily accessible in IPA (even with LTO).  It would
> > also be much clearer what is going on, IMHO.
> Could you please point out how to access parameter decl in wpa ?
> The only reason I ended up putting this in jump function was because
> I couldn't figure out how to access param decl during WPA.
> I see there's ipa_get_param() in ipa-prop.h however it's gated on
> gcc_checking_assert (!flag_wpa), so I suppose I can't use this
> during WPA ?
> 
> Alternatively I think I could access cs->callee->decl and get to the param 
> decl
> by walking DECL_ARGUMENTS ?

Actually, we no longer have DECL_ARGUMENTS during LTO WPA.  But in
most cases, you can still get at the type with something like the
following (only very lightly tested) patch, if Honza does not think it
is too crazy.

Note that= for old K&R C sources we do not have TYPE_ARG_TYPES and so
ipa_get_type can return NULL(!) ...however I wonder whether for such
programs the type assumptions made in callers when constructing jump
functions can be trusted either.

I have to run, we will continue the discussion later.

Martin


2016-08-09  Martin Jambor  

* ipa-prop.h (ipa_param_descriptor): Renamed decl to decl_or_type.
Update comment.
(ipa_get_param): Updated comment, added assert that we have a
PARM_DECL.
(ipa_get_type): New function.
* ipa-cp.c (ipcp_propagate_stage): Fill in argument types in LTO mode.
* ipa-prop.c (ipa_get_param_decl_index_1): Use decl_or_type
instead of decl;
(ipa_populate_param_decls): Likewise.
(ipa_dump_param): Likewise.


diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index 5b6cb9a..3465da5 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -1952,11 +1952,21 @@ propagate_constants_accross_call (struct cgraph_edge 
*cs)
   else
 i = 0;
 
+  /* !!! The following dump is of course only a demonstration that it works: */
+  debug_generic_expr (callee->decl);
+  fprintf (stderr, "\n");
+
   for (; (i < args_count) && (i < parms_count); i++)
 {
   struct ipa_jump_func *jump_func = ipa_get_ith_jump_func (args, i);
   struct ipcp_param_lattices *dest_plats;
 
+  /* !!! The following dump is of course only a demonstration that it
+ works: */
+  fprintf (stderr, "  The type of parameter %i is: ", i);
+  debug_generic_expr (ipa_get_type (callee_info, i));
+  fprintf (stderr, "\n");
+
   dest_plats = ipa_get_parm_lattices (callee_info, i);
   if (availability == AVAIL_INTERPOSABLE)
ret |= set_all_contains_variable (dest_plats);
@@ -2936,6 +2946,19 @@ ipcp_propagate_stage (struct ipa_topo_info *topo)
   {
 struct ipa_node_params *info = IPA_NODE_REF (node);
 
+/* In LTO we do not have PARM_DECLs but we would still like to be able to
+   look at types of parameters.  */
+if (in_lto_p)
+  {
+   tree t = TYPE_ARG_TYPES (TREE_TYPE (node->decl));
+   for (int k = 0; k < ipa_get_param_count (info); k++)
+ {
+   gcc_assert (t != void_list_node);
+   info->descriptors[k].decl_or_type = TREE_VALUE (t);
+   t = t ? TREE_CHAIN (t) : NULL;
+ }
+  }
+
 determine_versionability (node, info);
 if (node->has_gimple_body_p ())
   {
diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c
index 132b622..1eaccdf 100644
--- a/gcc/ipa-prop.c
+++ b/gcc/ipa-prop.c
@@ -103,9 +103,10 @@ ipa_get_param_decl_index_1 (vec 
descriptors, tree ptree)
 {
   int i, count;
 
+  gcc_checking_assert (!flag_wpa);
   count = descriptors.length ();
   for (i = 0; i < count; i++)
-if (descriptors[i].decl == ptree)
+if (descriptors[i].decl_or_type == ptree)
   return i;
 
   return -1;
@@ -138,7 +139,7 @@ ipa_populate_param_decls (struct cgraph_node *node,
   param_num = 0;
   for (parm = fnargs; parm; parm = DECL_CHAIN (parm))
 {
-  descriptors[param_num].decl = parm;
+  descriptors[param_num].decl_or_type = parm;
   descriptors[param_num].move_cost = estimate_move_cost (TREE_TYPE (parm),

Re: [PATCH] cpp/c: Add -Wexpansion-to-defined

2016-08-09 Thread Manuel López-Ibáñez


On 09/08/16 16:59, Paolo Bonzini wrote:

Index: gcc/c-family/c-opts.c
===
--- gcc/c-family/c-opts.c   (revision 239276)
+++ gcc/c-family/c-opts.c   (working copy)
@@ -1256,6 +1256,10 @@ sanitize_cpp_opts (void)
   cpp_opts->unsigned_char = !flag_signed_char;
   cpp_opts->stdc_0_in_system_headers = STDC_0_IN_SYSTEM_HEADERS;

+  cpp_opts->warn_expansion_to_defined = cpp_warn_expansion_to_defined;
+  if (cpp_warn_expansion_to_defined == -1)
+cpp_warn_expansion_to_defined = pedantic || extra_warnings;
+


Instead of the above, plase use LangEnabledBy() or EnabledBy() in c.opt. See 
Wendif-labels and other examples. Then, you do not need Init(-1).



   /* Wlong-long is disabled by default. It is enabled by:
   [-Wpedantic | -Wtraditional] -std=[gnu|c]++98 ; or
   [-Wpedantic | -Wtraditional] -std=non-c99
Index: gcc/c-family/c.opt
===
--- gcc/c-family/c.opt  (revision 239276)
+++ gcc/c-family/c.opt  (working copy)
@@ -506,6 +506,10 @@ Wdouble-promotion
 C ObjC C++ ObjC++ Var(warn_double_promotion) Warning
 Warn about implicit conversions from \"float\" to \"double\".

+Wexpansion-to-defined
+C ObjC C++ ObjC++ CppReason(CPP_W_EXPANSION_TO_DEFINED) 
Var(cpp_warn_expansion_to_defined) Init(-1) Warning
+Warn if an undefined macro is used in an #if directive.
+


You are also missing CPP(warn_expansion_to_defined) so that the cpp and gcc 
sides are in sync even when using pragmas.


Cheers,

Manuel.

Re: [PATCH] cpp/c: Add -Wexpansion-to-defined

2016-08-09 Thread Eric Gallager

Hi, I'm just a lurker, but...

On 8/9/16, Paolo Bonzini  wrote:
> Index: gcc/c-family/c.opt
> ===
> --- gcc/c-family/c.opt(revision 239276)
> +++ gcc/c-family/c.opt(working copy)
> @@ -506,6 +506,10 @@ Wdouble-promotion
>  C ObjC C++ ObjC++ Var(warn_double_promotion) Warning
>  Warn about implicit conversions from \"float\" to \"double\".
>
> +Wexpansion-to-defined
> +C ObjC C++ ObjC++ CppReason(CPP_W_EXPANSION_TO_DEFINED)
> Var(cpp_warn_expansion_to_defined) Init(-1) Warning
> +Warn if an undefined macro is used in an #if directive.
> +

Isn't "Warn if an undefined macro is used in an #if directive" the
same description that -Wundef uses in c.opt? Did you copy and paste
and forget to update that part?

Eric

Re: [PATCH] Set -fprofile-update=atomic when -pthread is present

2016-08-09 Thread Andi Kleen

Martin Liška  writes:

> Hi.
>
> As mention in [1], enabling -fprofile-update=atomic when -pthread is logical
> thing and is quite expected default behavior.
>
> Ready for trunk?

It could potentially make things a lot slower. I don't think it's a good
idea to do this by default.

-Andi

Re: Init df for split pass since some target use REG_NOTE in split pattern

2016-08-09 Thread Steven Bosscher

On Mon, Aug 8, 2016 at 9:45 PM, Jeff Law wrote:

>
> I'm pretty sure we do _not_ want it for split_all_insns_noflow since that's
> used when the CFG is not necessarily correct and thus I don't see how we can
> reliably have death/unused notes.

Actually, trying to initializing DF without a valid CFG will most
likely just crash the compiler. DF doesn't work without a valid CFG.

I expect the patch to fail for at least MIPS, SPARC, and ARM.

Ciao!
Steven

Re: [RS6000] e500 part of pr71680

2016-08-09 Thread Segher Boessenkool

On Tue, Aug 09, 2016 at 06:34:46PM +0930, Alan Modra wrote:
> The fallback part of HARD_REGNO_CALLER_SAVE_MODE, choose_hard_reg_mode,
> returns DFmode for SImode when TARGET_E500_DOUBLE.  This confuses
> lra when attempting to save ctr around a call.
> 
> Arseny, the bug reporter, has regression tested this patch.
> OK to apply?

Okay for trunk, thanks,


Segher


>   PR target/71680
>   * config/rs6000/rs6000.h (HARD_REGNO_CALLER_SAVE_MODE): Return
>   SImode for TARGET_E500_DOUBLE when given SImode.
> 
> diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
> index af77258..353f388 100644
> --- a/gcc/config/rs6000/rs6000.h
> +++ b/gcc/config/rs6000/rs6000.h
> @@ -1273,6 +1273,8 @@ enum data_align { align_abi, align_opt, align_both };
> && ((MODE) == VOIDmode || ALTIVEC_OR_VSX_VECTOR_MODE (MODE))  
> \
> && FP_REGNO_P (REGNO) \
> ? V2DFmode
> \
> +   : TARGET_E500_DOUBLE && (MODE) == SImode  \
> +   ? SImode  \
> : TARGET_E500_DOUBLE && ((MODE) == VOIDmode || (MODE) == DFmode)  \
> ? DFmode  \
> : !TARGET_E500_DOUBLE && FLOAT128_IBM_P (MODE) && FP_REGNO_P (REGNO)  
> \

Re: Implement C _FloatN, _FloatNx types [version 3]

2016-08-09 Thread Joseph Myers

On Tue, 9 Aug 2016, James Greenhalgh wrote:

> However, I still have some trouble with TFmode and TImode conversions. In
> particular, I'm missing support for the standard names:
> 
>   extendhftf2
>   trunctfhf2
>   floattihf2
>   fixhfti
> 
> In addition, where the ARMv8.2-A extensions are not available, I have no
> support for the standard names: 
> 
>   floatdihf
>   floatundihf
> 
> Providing these names through soft-fp in libgcc should be possible. Certainly
> I can enable extendhftf2 and trunctfhf2 using
> softfp_extensions/softfp_truncations and providing the appropriate files.
> Though, I'd rather avoid adding the full suite of soft-float routines, in
> particular I don't need arithmetic like divhf3. There are probably
> modifications I can make to the soft-fp build machinery to enable building
> only specified conversion functions.
> 
> Is this a sensible approach, or would I be better simply the full suite
> of HFmode operations that soft-fp expects?

The first question is: is using soft-fp for these operations optimal?  For 
conversions to DImode, for example, maybe converting via SFmode would be 
better when that's supported in hardware?  (That's something that would 
need to be benchmarked.)

soft-fp changes always go to glibc first, and it's now established that 
glibc gets all the soft-fp function / format implementations even those 
that are only actually used in libgcc.

HFmode support for soft-fp is at 
, though of 
course it needs updating for global changes to soft-fp since then.

If you do conclude that soft-fp should be used, but only a limited set of 
functions is needed, that's what softfp_extras in t-softfp is for.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH] Fix unaligned access when predictive commoning (PR 71083)

2016-08-09 Thread Eric Botcazou

> I think from Eric's comment in get_inner_ref it can be possible
> in Ada that the outer object is accidentally byte-aligned
> but the inner reference is not.  In that case I think it is
> better to generate a BIT_FIELD_REF instead of a COMPONENT_REF.

The patch says get_bit_range instead...  The comment therein means that 
bitfields can be nested in Ada: you can have a bitfield in a record which is 
itself a bitfield in another record.

> Eric do you agree?  Are there any Ada test cases where the
> pcom optimization jumps in?

According to the comment, data-ref analysis punts on bit offsets so I'm not 
sure how boff can be not byte-aligned.

-- 
Eric Botcazou

Re: [PATCH] cpp/c: Add -Wexpansion-to-defined

2016-08-09 Thread Paolo Bonzini



On 09/08/2016 20:30, Manuel López-Ibáñez wrote:
>>
>>
>> +  cpp_opts->warn_expansion_to_defined = cpp_warn_expansion_to_defined;
>> +  if (cpp_warn_expansion_to_defined == -1)
>> +cpp_warn_expansion_to_defined = pedantic || extra_warnings;
>> +
> 
> Instead of the above, plase use LangEnabledBy() or EnabledBy() in c.opt.
> See Wendif-labels and other examples. Then, you do not need Init(-1).

This causes this to produce an error if -pedantic-errors is provided,
because -pedantic-errors does

  control_warning_option (OPT_Wpedantic, DK_ERROR, NULL, value,
  loc, lang_mask,
  handlers, opts, opts_set,
  dc);

and this enables -Wexpansion-to-defined at DK_ERROR level.  Bug or fix?

Paolo

1 2 >

1 - 100 of 123 matches

Mail list logo