date:20140815

Re: [PATCH] Asan static optimization (draft)

2014-08-15 Thread Yuri Gribov

On Fri, Aug 8, 2014 at 2:43 PM, Dmitry Vyukov  wrote:
>> Similar optimization could be used for tsan builtins, or some of the ubsan
>> builtins (which is the reason why the pass is called sanopt).
>
> That would be great.
> Note that tsan optimizations must be based on different
> criteria. Any unlock/release operation must reset the set of already
> checked locations.

Right, I think the general algorithm should be quite similar, only the
KILL sets and actual check instructions would be different.
Unfortunately I don't (yet) have any experience with either TSan or
UBSan, so I'd prefer to leave their optimization for the future. The
code should be straightforward to adapt (via templates or callbacks
in a couple of places).
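
To illustrate the point about KILL sets, here is a minimal sketch of how a release operation could reset the set of already-checked locations. It is a toy model, not code from the patch: `checked_set`, `event`, `EV_ACCESS`, `EV_RELEASE` and `count_required_checks` are all hypothetical names, and a real pass would work on basic blocks and dataflow sets rather than a flat event list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model, not GCC code: a small set of addresses that have
   already been checked in the current region.  */
#define MAX_CHECKED 16

struct checked_set
{
  const void *addr[MAX_CHECKED];
  size_t n;
};

enum event_kind { EV_ACCESS, EV_RELEASE };

struct event
{
  enum event_kind kind;
  const void *addr;	/* Only meaningful for EV_ACCESS.  */
};

static bool
checked_set_contains (const struct checked_set *s, const void *p)
{
  for (size_t i = 0; i < s->n; i++)
    if (s->addr[i] == p)
      return true;
  return false;
}

/* Return how many accesses still need an explicit check.  A release
   operation kills the whole set, so later accesses are checked again.  */
size_t
count_required_checks (const struct event *ev, size_t len)
{
  struct checked_set seen = { { 0 }, 0 };
  size_t required = 0;

  assert (ev != NULL || len == 0);
  for (size_t i = 0; i < len; i++)
    {
      if (ev[i].kind == EV_RELEASE)
	{
	  seen.n = 0;	/* KILL: forget everything checked so far.  */
	  continue;
	}
      if (checked_set_contains (&seen, ev[i].addr))
	continue;	/* Redundant check, could be dropped.  */
      required++;
      if (seen.n < MAX_CHECKED)
	seen.addr[seen.n++] = ev[i].addr;
    }
  return required;
}
```

For an ASan-style variant the `EV_RELEASE` case would be replaced by whatever invalidates a check there (for example a call that may free memory), which is the "different KILL sets" part.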

-Y

Re: [PATCH][LTO] Streamer re-org (what's left)

2014-08-15 Thread Richard Biener

On Thu, 14 Aug 2014, Jan Hubicka wrote:

> > 
> > And this fixes the thinko in zlib use.
> > 
> > du  /tmp/*ltrans* | awk '{ sum += $1 } END { print sum }'
> > stage3 cc1 WPA ltrans file size 178740 (patched)
> > stage3 cc1 WPA ltrans file size 460068 (unpatched)
> 
> The patch works now with Firefox.  For libxul linktime I get
> 
> Unpatched:
> real    5m59.514s
> user    47m55.468s
> sys     4m36.717s
> 
> WPA is 125s, stream in 30s, stream out 5.98s.
> Note that usually WPA is around 80-90s, it is now up because of
> devirtualization having too many contexts (it takes 40s). Will fix that next
> week.
> 
> patched:
> real    6m12.437s
> user    51m18.829s
> sys     4m30.809s
> 
> WPA is 129s, stream in 29.23s, stream out 12.16s.
> 
> Patched + fast compression
> real    6m4.383s
> user    49m15.123s
> sys     4m31.166s
> 
> WPA is 124s, stream in 29.39s, stream out 7.33s.
> 
> So I guess the difference is close to noise factor now. I am sure there 
> are better compression backends than zlib for this purpose but it seems 
> to work well enough.

Yeah, we might want to pursue that lz4 thing at some point.

I'll take the above as an ok to go forward with this change
(moving compression to the "stream" level from section level).

Richard.

[committed] Fix handling of shared(global_var) on omp parallel inside target region (PR middle-end/62092)

2014-08-15 Thread Jakub Jelinek

Hi!

We need OMP_CLAUSE_SHARED even for global vars if they are mapped
in an outer target region, because then the parallel region can't
access the global, but has to go through the mapping.

Testcase already in the testsuite (but failed only with non-shared
address space obviously).

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk
and 4.9.

2014-08-15  Jakub Jelinek  

PR middle-end/62092
* gimplify.c (gimplify_adjust_omp_clauses_1): Don't remove
OMP_CLAUSE_SHARED for global vars if the global var is mentioned
in OMP_CLAUSE_MAP in some outer target region.

--- gcc/gimplify.c.jj   2014-08-12 15:43:07.0 +0200
+++ gcc/gimplify.c  2014-08-14 13:39:22.263181286 +0200
@@ -6308,7 +6308,7 @@ gimplify_adjust_omp_clauses_1 (splay_tre
= splay_tree_lookup (ctx->variables, (splay_tree_key) decl);
  if (on && (on->value & (GOVD_FIRSTPRIVATE | GOVD_LASTPRIVATE
  | GOVD_PRIVATE | GOVD_REDUCTION
- | GOVD_LINEAR)) != 0)
+ | GOVD_LINEAR | GOVD_MAP)) != 0)
break;
  ctx = ctx->outer_context;
}

Jakub

Re: [PATCH] Fix PR62081

2014-08-15 Thread Richard Biener

On Thu, 14 Aug 2014, Sebastian Pop wrote:

> Richard Biener wrote:
> > 
> > The following fixes missing dominator computation before fixing loops.
> > Rather than doing even more such weird stuff in a pass gate function
> > this puts this into a new pass scheduled before the loop passes gate.
> > 
> 
> Ok.
> 
> > +unsigned int
> > +pass_fix_loops::execute (function *)
> > +{
> 
> I would add an early exit if there are no loops in the function
> (like in the original code below...)
> 
> if (!loops_for_fn (fn))
>   return 0;

Note that's not how things work today - loops_for_fn () returns
non-NULL even for zero-loop functions as soon as we have CFG.
The "hack" below was for -fdump-passes which calls each gate
of every pass before the CFG is set up and thus would crash
gate_loop if we didn't do that check.  Of course -fdump-passes
reports sth that is not true for all functions here.
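
A toy model of the distinction being made here, as a hedged sketch: it assumes (as I understand the loop infrastructure) that the loop tree always contains a root pseudo-loop for the function body once the CFG exists, which is why the gate compares the loop count against 1 rather than testing the pointer. The `toy_*` types and functions are hypothetical stand-ins, not the real `loops_for_fn` / `number_of_loops` API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-function loop tree.  */
struct toy_loops
{
  unsigned num_loops;	/* Assumed to include the root pseudo-loop.  */
};

struct toy_function
{
  struct toy_loops *loops;	/* NULL until the CFG has been built.  */
};

/* Pointer test: only tells us whether loop info exists at all.  */
bool
toy_has_loop_info (const struct toy_function *fn)
{
  return fn->loops != NULL;
}

/* Count test: true only if there is a real loop besides the root.  */
bool
toy_has_real_loops (const struct toy_function *fn)
{
  if (fn->loops == NULL)
    return false;
  assert (fn->loops->num_loops >= 1);
  return fn->loops->num_loops > 1;
}
```

With this model a function that has a CFG but no loops passes the pointer test and fails the count test, so an early exit on the pointer alone would never trigger for it.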

Richard.

> > +  if (loops_state_satisfies_p (LOOPS_NEED_FIXUP))
> > +{
> > +  calculate_dominance_info (CDI_DOMINATORS);
> > +  fix_loop_structure (NULL);
> > +}
> > +  return 0;
> > +}
> 
> [...]
> 
> >  /* Gate for loop pass group.  The group is controlled by -ftree-loop-optimize
> > but we also avoid running it when the IL doesn't contain any loop.  */
> >  
> > @@ -57,9 +107,6 @@ gate_loop (function *fn)
> >if (!loops_for_fn (fn))
> >  return true;
> 
> ... here.
> 
> >  
> > -  /* Make sure to drop / re-discover loops when necessary.  */
> > -  if (loops_state_satisfies_p (LOOPS_NEED_FIXUP))
> > -fix_loop_structure (NULL);
> >return number_of_loops (fn) > 1;
> >  }
>

[committed] Fix map clause handling for scalar pointers/allocatables passed by reference (PR fortran/62107)

2014-08-15 Thread Jakub Jelinek

Hi!

For these two, we need to actually map 3 things, the reference
(pointer-assign), the pointer (pointer-assign) and what the pointer points
to.

Testcase already in the testsuite, just needs non-shared address space.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to
trunk/4.9.

2014-08-15  Jakub Jelinek  

PR fortran/62107
* trans-openmp.c (gfc_omp_finish_clause): Handle scalar pointer
or allocatable passed by reference.
(gfc_trans_omp_clauses) : Likewise.

--- gcc/fortran/trans-openmp.c.jj   2014-06-25 11:14:46.0 +0200
+++ gcc/fortran/trans-openmp.c  2014-08-14 16:18:25.594334849 +0200
@@ -1022,6 +1022,7 @@ gfc_omp_finish_clause (tree c, gimple_se
  && !GFC_DECL_CRAY_POINTEE (decl)
  && !GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (TREE_TYPE (decl
return;
+  tree orig_decl = decl;
   c4 = build_omp_clause (OMP_CLAUSE_LOCATION (c), OMP_CLAUSE_MAP);
   OMP_CLAUSE_MAP_KIND (c4) = OMP_CLAUSE_MAP_POINTER;
   OMP_CLAUSE_DECL (c4) = decl;
@@ -1029,6 +1030,17 @@ gfc_omp_finish_clause (tree c, gimple_se
   decl = build_fold_indirect_ref (decl);
   OMP_CLAUSE_DECL (c) = decl;
   OMP_CLAUSE_SIZE (c) = NULL_TREE;
+  if (TREE_CODE (TREE_TYPE (orig_decl)) == REFERENCE_TYPE
+ && (GFC_DECL_GET_SCALAR_POINTER (orig_decl)
+ || GFC_DECL_GET_SCALAR_ALLOCATABLE (orig_decl)))
+   {
+ c3 = build_omp_clause (OMP_CLAUSE_LOCATION (c), OMP_CLAUSE_MAP);
+ OMP_CLAUSE_MAP_KIND (c3) = OMP_CLAUSE_MAP_POINTER;
+ OMP_CLAUSE_DECL (c3) = unshare_expr (decl);
+ OMP_CLAUSE_SIZE (c3) = size_int (0);
+ decl = build_fold_indirect_ref (decl);
+ OMP_CLAUSE_DECL (c) = decl;
+   }
 }
   if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (decl)))
 {
@@ -1884,14 +1896,32 @@ gfc_trans_omp_clauses (stmtblock_t *bloc
TREE_ADDRESSABLE (decl) = 1;
  if (n->expr == NULL || n->expr->ref->u.ar.type == AR_FULL)
{
- if (POINTER_TYPE_P (TREE_TYPE (decl)))
+ if (POINTER_TYPE_P (TREE_TYPE (decl))
+ && (gfc_omp_privatize_by_reference (decl)
+ || GFC_DECL_GET_SCALAR_POINTER (decl)
+ || GFC_DECL_GET_SCALAR_ALLOCATABLE (decl)
+ || GFC_DECL_CRAY_POINTEE (decl)
+ || GFC_DESCRIPTOR_TYPE_P
+   (TREE_TYPE (TREE_TYPE (decl)
{
+ tree orig_decl = decl;
  node4 = build_omp_clause (input_location,
OMP_CLAUSE_MAP);
  OMP_CLAUSE_MAP_KIND (node4) = OMP_CLAUSE_MAP_POINTER;
  OMP_CLAUSE_DECL (node4) = decl;
  OMP_CLAUSE_SIZE (node4) = size_int (0);
  decl = build_fold_indirect_ref (decl);
+ if (TREE_CODE (TREE_TYPE (orig_decl)) == REFERENCE_TYPE
+ && (GFC_DECL_GET_SCALAR_POINTER (orig_decl)
+ || GFC_DECL_GET_SCALAR_ALLOCATABLE (orig_decl)))
+   {
+ node3 = build_omp_clause (input_location,
+   OMP_CLAUSE_MAP);
+ OMP_CLAUSE_MAP_KIND (node3) = OMP_CLAUSE_MAP_POINTER;
+ OMP_CLAUSE_DECL (node3) = decl;
+ OMP_CLAUSE_SIZE (node3) = size_int (0);
+ decl = build_fold_indirect_ref (decl);
+   }
}
  if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (decl)))
{

Jakub

Re: [PATCH][LTO] Streamer re-org (what's left)

2014-08-15 Thread Jan Hubicka

> > patched:
> > real    6m12.437s
> > user    51m18.829s
> > sys     4m30.809s
> > 
> > WPA is 129s, stream in 29.23s, stream out 12.16s.
> > 
> > Patched + fast compression
> > real    6m4.383s
> > user    49m15.123s
> > sys     4m31.166s
> > 
> > WPA is 124s, stream in 29.39s, stream out 7.33s.
> > 
> > So I guess the difference is close to noise factor now. I am sure there 
> > are better compression backends than zlib for this purpose but it seems 
> > to work well enough.
> 
> Yeah, we might want to pursue that lz4 thing at some point.
> 
> I'll take the above as an ok to go forward with this change
> (moving compression to the "stream" level from section level).

Yep, I would go with fast compression for wpa->ltrans objects. Those are going
to be consumed just once and the compression level increase is probably not
terrible (i.e. not as bad as current growth caused by not compressing strings
:)

3-fold decrease in /tmp usage is nice, but +-10% does not matter much.  BTW
if I remember well, zlib algorithm works on 64Kb blocks independently, so
perhaps having a 2MB buffer is unnecessarily large.

Honza
> 
> Richard.

Re: [PATCH][LTO] Streamer re-org (what's left)

2014-08-15 Thread Richard Biener

On Fri, 15 Aug 2014, Jan Hubicka wrote:

> > > patched:
> > > real    6m12.437s
> > > user    51m18.829s
> > > sys     4m30.809s
> > > 
> > > WPA is 129s, stream in 29.23s, stream out 12.16s.
> > > 
> > > Patched + fast compression
> > > real    6m4.383s
> > > user    49m15.123s
> > > sys     4m31.166s
> > > 
> > > WPA is 124s, stream in 29.39s, stream out 7.33s.
> > > 
> > > So I guess the difference is close to noise factor now. I am sure there 
> > > are better compression backends than zlib for this purpose but it seems 
> > > to work well enough.
> > 
> > Yeah, we might want to pursue that lz4 thing at some point.
> > 
> > I'll take the above as an ok to go forward with this change
> > (moving compression to the "stream" level from section level).
> 
> Yep, I would go with fast compression for wpa->ltrans objects. Those are going
> to be consumed just once and the compression level increase is probably not
> terrible (i.e. not as bad as current growth caused by not compressing strings
> :)
> 
> 3-fold decrease in /tmp usage is nice, but +-10% does not matter much.  BTW
> if I remember well, zlib algorithm works on 64Kb blocks independently, so
> perhaps having a 2MB buffer is unnecessarily large.

Yeah, the 2MB was just a "guess", I'll change it to 64k blocks.  Note
the original code exponentially increased block size to not have
too many blocks (for whatever reason).  An 800MB compressed decl section
would need 12800 64k blocks.  But in the end it matters only that
the block allocations are "efficient" for the memory allocator
(so don't allocate 1-byte blocks).  Our internal overhead is
one pointer (to point to the next buffer).
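
The block-count arithmetic above can be checked with a few lines of C. This is only a worked example: `blocks_needed` and `chain_overhead_bytes` are hypothetical helpers, MB and k are taken as binary units, and the 8-byte pointer size in the usage note is an assumption about a 64-bit host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of fixed-size blocks needed to hold TOTAL_BYTES, rounding up.  */
uint64_t
blocks_needed (uint64_t total_bytes, uint64_t block_bytes)
{
  assert (block_bytes > 0);
  return (total_bytes + block_bytes - 1) / block_bytes;
}

/* Bookkeeping cost when each block carries one "next" pointer.  */
uint64_t
chain_overhead_bytes (uint64_t nblocks, size_t pointer_bytes)
{
  return nblocks * pointer_bytes;
}
```

With 800MB of data and 64k blocks this gives 12800 blocks; at 8 bytes per link the chain costs 102400 bytes, which is negligible next to the payload. The same data in 2MB blocks needs 400 blocks.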

Of course in the end I want to implement streaming right into the
file rather than queuing up the whole compressed data (or
mmapping it).

Btw, I'll first try to get rid of the separate string section
which would also make it compressed again and be less awkwardly
abusing the data-streamer.

Richard.

[PATCH] Put all constants last in tree_swap_operands_p, remove odd -Os check

2014-08-15 Thread Richard Biener


The following makes tree_swap_operands_p put all constants in second place,
also looks through sign-changes when considering further canonicalizations
and removes the odd -Os guard for those.  That was put in with
https://gcc.gnu.org/ml/gcc-patches/2003-10/msg01208.html just
motivated by CSiBE numbers - but rather than disabling canonicalization
this should have disabled the actual harmful transforms.
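
As a rough illustration of "constants last", here is a toy version of the swap predicate. It is a sketch under simplifying assumptions: `toy_expr`, `toy_code` and `toy_swap_operands_p` are hypothetical, and the real function goes on to consider further cases after the constant checks that are not modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature of an expression node.  */
enum toy_code { TOY_INTEGER_CST, TOY_REAL_CST, TOY_VAR, TOY_PLUS };

struct toy_expr
{
  enum toy_code code;
};

/* One test covers every constant kind instead of one test per kind.  */
static bool
toy_constant_p (const struct toy_expr *e)
{
  return e->code == TOY_INTEGER_CST || e->code == TOY_REAL_CST;
}

/* Return true if ARG0 and ARG1 should be swapped so that a constant
   operand, if there is one, ends up in second place.  */
bool
toy_swap_operands_p (const struct toy_expr *arg0,
		     const struct toy_expr *arg1)
{
  assert (arg0 != NULL && arg1 != NULL);
  if (toy_constant_p (arg1))
    return false;	/* Already canonical: constant is last.  */
  if (toy_constant_p (arg0))
    return true;	/* Constant first: swap.  */
  return false;		/* No constants: leave the order alone here.  */
}
```

So `1 + x` is reported as needing a swap while `x + 1` and `x + y` are not, regardless of which constant kind is involved.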

Bootstrap and regtest ongoing on x86_64-unknown-linux-gnu.

Richard.

2014-08-15  Richard Biener  

* fold-const.c (tree_swap_operands_p): Put all constants
last, also strip sign-changing NOPs when considering further
canonicalization.  Canonicalize also when optimizing for size.

Index: gcc/fold-const.c
===
--- gcc/fold-const.c	(revision 214007)
+++ gcc/fold-const.c	(working copy)
@@ -6642,37 +6650,19 @@ reorder_operands_p (const_tree arg0, con
 bool
 tree_swap_operands_p (const_tree arg0, const_tree arg1, bool reorder)
 {
-  STRIP_SIGN_NOPS (arg0);
-  STRIP_SIGN_NOPS (arg1);
-
-  if (TREE_CODE (arg1) == INTEGER_CST)
+  if (CONSTANT_CLASS_P (arg1) == INTEGER_CST)
 return 0;
-  if (TREE_CODE (arg0) == INTEGER_CST)
+  if (CONSTANT_CLASS_P (arg0) == INTEGER_CST)
 return 1;
 
-  if (TREE_CODE (arg1) == REAL_CST)
-return 0;
-  if (TREE_CODE (arg0) == REAL_CST)
-return 1;
-
-  if (TREE_CODE (arg1) == FIXED_CST)
-return 0;
-  if (TREE_CODE (arg0) == FIXED_CST)
-return 1;
-
-  if (TREE_CODE (arg1) == COMPLEX_CST)
-return 0;
-  if (TREE_CODE (arg0) == COMPLEX_CST)
-return 1;
+  STRIP_NOPS (arg0);
+  STRIP_NOPS (arg1);
 
   if (TREE_CONSTANT (arg1))
 return 0;
   if (TREE_CONSTANT (arg0))
 return 1;
 
-  if (optimize_function_for_size_p (cfun))
-return 0;
-
   if (reorder && flag_evaluation_order
   && (TREE_SIDE_EFFECTS (arg0) || TREE_SIDE_EFFECTS (arg1)))
 return 0;

[PATCH] Cleanup STRING_CST handling in PTA

2014-08-15 Thread Richard Biener


points-to is one place that needs special handling of us
allowing &"string" as operands (rather than having a
CONST_DECL for each string literal).  The following patch
realizes that readonly_id is only used for string literals
and adjusts code accordingly.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

2014-08-15  Richard Biener  

* tree-ssa-structalias.c (readonly_id): Rename to string_id.
(get_constraint_for_ssa_var): Remove dead code.
(get_constraint_for_1): Adjust.
(find_what_var_points_to): Likewise.
(init_base_vars): Likewise.  STRING_CSTs do not contain pointers.

Index: gcc/tree-ssa-structalias.c
===
--- gcc/tree-ssa-structalias.c  (revision 214007)
+++ gcc/tree-ssa-structalias.c  (working copy)
@@ -344,7 +344,7 @@ vi_next (varinfo_t vi)
 
 /* Static IDs for the special variables.  Variable ID zero is unused
and used as terminator for the sub-variable chain.  */
-enum { nothing_id = 1, anything_id = 2, readonly_id = 3,
+enum { nothing_id = 1, anything_id = 2, string_id = 3,
escaped_id = 4, nonlocal_id = 5,
storedanything_id = 6, integer_id = 7 };
 
@@ -2938,14 +2938,6 @@ get_constraint_for_ssa_var (tree t, vec<
   cexpr.var = vi->id;
   cexpr.type = SCALAR;
   cexpr.offset = 0;
-  /* If we determine the result is "anything", and we know this is readonly,
- say it points to readonly memory instead.  */
-  if (cexpr.var == anything_id && TREE_READONLY (t))
-{
-  gcc_unreachable ();
-  cexpr.type = ADDRESSOF;
-  cexpr.var = readonly_id;
-}
 
   /* If we are not taking the address of the constraint expr, add all
  sub-fiels of the variable as well.  */
@@ -3380,10 +3372,11 @@ get_constraint_for_1 (tree t, vec
   return;
 }
 
-  /* String constants are read-only.  */
+  /* String constants are read-only, ideally we'd have a CONST_DECL
+ for those.  */
   if (TREE_CODE (t) == STRING_CST)
 {
-  temp.var = readonly_id;
+  temp.var = string_id;
   temp.type = SCALAR;
   temp.offset = 0;
   results->safe_push (temp);
@@ -6112,8 +6105,8 @@ find_what_var_points_to (varinfo_t orig_
  else if (vi->is_heap_var)
/* We represent heapvars in the points-to set properly.  */
;
- else if (vi->id == readonly_id)
-   /* Nobody cares.  */
+ else if (vi->id == string_id)
+   /* Nobody cares - STRING_CSTs are read-only entities.  */
;
  else if (vi->id == anything_id
   || vi->id == integer_id)
@@ -6501,7 +6494,7 @@ init_base_vars (void)
   struct constraint_expr lhs, rhs;
   varinfo_t var_anything;
   varinfo_t var_nothing;
-  varinfo_t var_readonly;
+  varinfo_t var_string;
   varinfo_t var_escaped;
   varinfo_t var_nonlocal;
   varinfo_t var_storedanything;
@@ -6547,27 +6540,17 @@ init_base_vars (void)
  but this one are redundant.  */
   constraints.safe_push (new_constraint (lhs, rhs));
 
-  /* Create the READONLY variable, used to represent that a variable
- points to readonly memory.  */
-  var_readonly = new_var_info (NULL_TREE, "READONLY");
-  gcc_assert (var_readonly->id == readonly_id);
-  var_readonly->is_artificial_var = 1;
-  var_readonly->offset = 0;
-  var_readonly->size = ~0;
-  var_readonly->fullsize = ~0;
-  var_readonly->is_special_var = 1;
-
-  /* readonly memory points to anything, in order to make deref
- easier.  In reality, it points to anything the particular
- readonly variable can point to, but we don't track this
- separately. */
-  lhs.type = SCALAR;
-  lhs.var = readonly_id;
-  lhs.offset = 0;
-  rhs.type = ADDRESSOF;
-  rhs.var = readonly_id;  /* FIXME */
-  rhs.offset = 0;
-  process_constraint (new_constraint (lhs, rhs));
+  /* Create the STRING variable, used to represent that a variable
+ points to a string literal.  String literals don't contain
+ pointers so STRING doesn't point to anything.  */
+  var_string = new_var_info (NULL_TREE, "STRING");
+  gcc_assert (var_string->id == string_id);
+  var_string->is_artificial_var = 1;
+  var_string->offset = 0;
+  var_string->size = ~0;
+  var_string->fullsize = ~0;
+  var_string->is_special_var = 1;
+  var_string->may_have_pointers = 0;
 
   /* Create the ESCAPED variable, used to represent the set of escaped
  memory.  */

Re: [PATCH] Put all constants last in tree_swap_operands_p, remove odd -Os check

2014-08-15 Thread Manuel López-Ibáñez

On 15 August 2014 11:07, Richard Biener  wrote:
> -  if (TREE_CODE (arg1) == INTEGER_CST)
> +  if (CONSTANT_CLASS_P (arg1) == INTEGER_CST)

Huh?

/* Nonzero if NODE represents a constant.  */

#define CONSTANT_CLASS_P(NODE)\
(TREE_CODE_CLASS (TREE_CODE (NODE)) == tcc_constant)

Sadly, we don't have a warning for this, but clang++ has one:

test.c:4:16: warning: comparison of constant 2 with expression of type
'bool' is always false [-Wtautological-constant-out-of-range-compare]
  if ((a == 1) == 2) {
   ^  ~

I'll open a PR

Cheers,

Manuel.
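
A small self-contained reproduction of the problem, as a sketch: the `demo_*` names and enumerator values are made up for illustration and are not GCC's tree codes. The predicate result is stored in an `int` first only to keep the example free of compiler warnings; the comparison is the same one flagged above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical codes; the values are chosen so that none is 0 or 1.  */
enum demo_code { DEMO_VAR_DECL = 2, DEMO_INTEGER_CST = 3, DEMO_REAL_CST = 4 };

/* Predicate macro: evaluates to 0 or 1, never to a demo_code value.  */
#define DEMO_CONSTANT_CLASS_P(CODE) \
  ((CODE) == DEMO_INTEGER_CST || (CODE) == DEMO_REAL_CST)

/* Mirrors the mistake: a 0/1 predicate result compared with a code.  */
bool
buggy_is_constant (enum demo_code code)
{
  int is_constant = DEMO_CONSTANT_CLASS_P (code);
  return is_constant == DEMO_INTEGER_CST;	/* Never true.  */
}

/* What was presumably intended: use the predicate on its own.  */
bool
fixed_is_constant (enum demo_code code)
{
  return DEMO_CONSTANT_CLASS_P (code);
}

/* Usage: the buggy form rejects even an integer constant.  */
void
demo_check (void)
{
  assert (!buggy_is_constant (DEMO_INTEGER_CST));
  assert (fixed_is_constant (DEMO_INTEGER_CST));
  assert (!fixed_is_constant (DEMO_VAR_DECL));
}
```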

Re: [PATCH] Put all constants last in tree_swap_operands_p, remove odd -Os check

2014-08-15 Thread Richard Biener

On Fri, 15 Aug 2014, Manuel López-Ibáñez wrote:

> On 15 August 2014 11:07, Richard Biener  wrote:
> > -  if (TREE_CODE (arg1) == INTEGER_CST)
> > +  if (CONSTANT_CLASS_P (arg1) == INTEGER_CST)
> 
> Huh?

Eh ;)

> /* Nonzero if NODE represents a constant.  */
> 
> #define CONSTANT_CLASS_P(NODE)\
> (TREE_CODE_CLASS (TREE_CODE (NODE)) == tcc_constant)
> 
> Sadly, we don't have a warning for this, but clang++ has one:
> 
> test.c:4:16: warning: comparison of constant 2 with expression of type
> 'bool' is always false [-Wtautological-constant-out-of-range-compare]
>   if ((a == 1) == 2) {
>    ^  ~
> 
> I'll open a PR

Thx.
Richard.

Re: RFC: Patch for switch elimination (PR 54742)

2014-08-15 Thread Richard Biener

On Thu, Aug 14, 2014 at 8:45 PM, Sebastian Pop  wrote:
> Steve Ellcey wrote:
>> I understand the desire not to add optimizations just for benchmarks but
>> we do know other compilers have added this optimization for coremark
>> (See
>> http://community.arm.com/groups/embedded/blog/2013/02/21/coremark-and-compiler-performance)
>> and the 13 people on the CC list for this bug certainly show interest in
>> having it even if it is just for a benchmark.  Does 'competing against other
>> compilers' sound better than 'optimizing for a benchmark'?
>
> I definitely would like to see GCC trunk do this transform.  What about we
> integrate the new pass, and then when jump-threading manages to catch the
> coremark loop, we remove the pass?

It never worked that way.

A new pass takes compile-time, if we disable it by default it won't help
coremark (and it will bitrot quickly).

So - please fix DOM instead.

Richard.

> Thanks,
> Sebastian

Re: RFC: Patch for switch elimination (PR 54742)

2014-08-15 Thread Richard Biener

On Thu, Aug 14, 2014 at 8:25 PM, Steve Ellcey  wrote:
> On Thu, 2014-08-14 at 10:21 -0600, Jeff Law wrote:
>> On 08/14/14 10:12, David Malcolm wrote:
>> > On Thu, 2014-08-14 at 09:56 -0600, Jeff Law wrote:
>> >> On 08/14/14 04:32, Richard Biener wrote:
>> >>>> You'll note in a separate thread Steve and I discussed this during
>> >>>> Cauldron
>> >>>> and it was at my recommendation Steve resurrected his proof of concept
>> >>>> plugin and started beating it into shape.
>> >>>
>> >>> But do we really want a pass just to help coremark?
>> >> And that's the biggest argument against Steve's work.  In theory it
>> >> should be applicable to other FSMs, but nobody's come forth with
>> >> additional testcases from real world applications.
>> >
>> > Maybe a regex library?  Perhaps:
>> > http://vcs.pcre.org/viewvc/code/trunk/pcre_dfa_exec.c?revision=1477 ?
>> The key is that at least some states tell you at compile time what state
>> you'll be in during the next loop iteration.  Thus instead of coming
>> around the loop, evaluating the switch condition, then doing the
>> multi-way branch, we just directly jump to the case for the next iteration.
>>
>> I've never looked at the PCRE code to know if it's got cases like that.
>>
>> jeff
>
> I compiled PCRE but it never triggered this optimization (even if I
> bumped up the parameters for instruction counts and paths).
>
> I understand the desire not to add optimizations just for benchmarks but
> we do know other compilers have added this optimization for coremark
> (See
> http://community.arm.com/groups/embedded/blog/2013/02/21/coremark-and-compiler-performance)
>  and the 13 people on the CC list for this bug certainly show interest in 
> having it even if it is just for a benchmark.  Does 'competing against other 
> compilers' sound better than 'optimizing for a benchmark'?

Well - as an open-source compiler we have the luxury to not care
about "benchmark compilers" ;)  At least that's what our non-existent
sales-team told me.

There are plenty of "real" interpreters around which may have states
that deterministically forward to another state.  If you optimize
those as well, fine.

Btw - the patch doesn't contain a single testcase 

With coremark being "secret" what's the real-world testcase this
optimizes?  Note that the benchmarks used in SPEC are usually
available and taken from real-world apps.  I don't know coremark
at all, but from its name it sounds like sth like nullstone?

Richard.

> Steve Ellcey
> sell...@mips.com
>

[PATCH] Allow components of allocatables in !$omp atomic (PR fortran/62131)

2014-08-15 Thread Jakub Jelinek

Hi!

As discussed in the PR, nerrs(i,io) isn't an allocatable variable even
when nerrs is allocatable, thus a reasonable reading of the new OpenMP 4.0
requirement is that the testcase is still valid.

I went ahead and committed the change suggested by Tobias.  For the rest
mentioned in the PR, more thought and a much larger patch will be needed.

2014-08-15  Jakub Jelinek  
Tobias Burnus  

PR fortran/62131
* openmp.c (resolve_omp_atomic): Only complain if code->expr1's attr
is allocatable, rather than whenever var->attr.allocatable.

* gfortran.dg/gomp/pr62131.f90: New test.

--- gcc/fortran/openmp.c.jj 2014-08-14 18:38:46.0 +0200
+++ gcc/fortran/openmp.c    2014-08-15 12:02:13.025699623 +0200
@@ -2744,7 +2744,7 @@ resolve_omp_atomic (gfc_code *code)
   break;
 }
 
-  if (var->attr.allocatable)
+  if (gfc_expr_attr (code->expr1).allocatable)
 {
   gfc_error ("!$OMP ATOMIC with ALLOCATABLE variable at %L",
 &code->loc);
--- gcc/testsuite/gfortran.dg/gomp/pr62131.f90.jj   2014-08-15 12:02:37.510575517 +0200
+++ gcc/testsuite/gfortran.dg/gomp/pr62131.f90  2014-08-15 12:03:28.421317788 +0200
@@ -0,0 +1,19 @@
+! PR fortran/62131
+! { dg-do compile }
+! { dg-options "-fopenmp" }
+
+program pr62131
+  integer,allocatable :: nerrs(:,:)
+  allocate(nerrs(10,10))
+  nerrs(:,:) = 0
+!$omp parallel do
+  do k=1,10
+call uperrs(k,1)
+  end do
+contains
+  subroutine uperrs(i,io)
+integer,intent(in) :: i,io
+!$omp atomic
+nerrs(i,io)=nerrs(i,io)+1
+  end subroutine
+end

Jakub

[COMMITTED] Add myself to MAINTAINERS file (Write After Approval)

2014-08-15 Thread Ilya Tocar

Hi,

This patch adds myself to the MAINTAINERS file.  Committed as 214012.

---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 87fb9dd..a40a537 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -555,6 +555,7 @@ Dinar Temirbulatov  dtemirbula...@gmail.com
 Kresten Krab Thorup  k...@gcc.gnu.org
 Caroline Tice  cmt...@google.com
 Kyrylo Tkachov kyrylo.tkac...@arm.com
+Tocar Ilya toca...@gmail.com
 Konrad Trifunovic  konrad.trifuno...@inria.fr
 Markus Trippelsdorf  mar...@trippelsdorf.de
 David Ung  dav...@mips.com
-- 
1.8.3.1

Re: RFC: Patch for switch elimination (PR 54742)

2014-08-15 Thread Richard Biener

On Fri, Aug 15, 2014 at 12:13 PM, Richard Biener
 wrote:
> On Thu, Aug 14, 2014 at 8:25 PM, Steve Ellcey  wrote:
>> On Thu, 2014-08-14 at 10:21 -0600, Jeff Law wrote:
>>> On 08/14/14 10:12, David Malcolm wrote:
>>> > On Thu, 2014-08-14 at 09:56 -0600, Jeff Law wrote:
>>> >> On 08/14/14 04:32, Richard Biener wrote:
>>> >>>> You'll note in a separate thread Steve and I discussed this during
>>> >>>> Cauldron
>>> >>>> and it was at my recommendation Steve resurrected his proof of concept
>>> >>>> plugin and started beating it into shape.
>>> >>>
>>> >>> But do we really want a pass just to help coremark?
>>> >> And that's the biggest argument against Steve's work.  In theory it
>>> >> should be applicable to other FSMs, but nobody's come forth with
>>> >> additional testcases from real world applications.
>>> >
>>> > Maybe a regex library?  Perhaps:
>>> > http://vcs.pcre.org/viewvc/code/trunk/pcre_dfa_exec.c?revision=1477 ?
>>> The key is that at least some states tell you at compile time what state
>>> you'll be in during the next loop iteration.  Thus instead of coming
>>> around the loop, evaluating the switch condition, then doing the
>>> multi-way branch, we just directly jump to the case for the next iteration.
>>>
>>> I've never looked at the PCRE code to know if it's got cases like that.
>>>
>>> jeff
>>
>> I compiled PCRE but it never triggered this optimization (even if I
>> bumped up the parameters for instruction counts and paths).
>>
>> I understand the desire not to add optimizations just for benchmarks but
>> we do know other compilers have added this optimization for coremark
>> (See
>> http://community.arm.com/groups/embedded/blog/2013/02/21/coremark-and-compiler-performance)
>>  and the 13 people on the CC list for this bug certainly show interest in 
>> having it even if it is just for a benchmark.  Does 'competing against other 
>> compilers' sound better than 'optimizing for a benchmark'?
>
> Well - as an open-source compiler we have the luxury to not care
> about "benchmark compilers" ;)  At least that's what our non-existent
> sales-team told me.
>
> There are plenty of "real" interpreters around which may have states
> that deterministically forward to another state.  If you optimize
> those as well, fine.
>
> Btw - the patch doesn't contain a single testcase 
>
> With coremark being "secret" what's the real-world testcase this
> optimizes?  Note that the benchmarks used in SPEC are usually
> available and taken from real-world apps.  I don't know coremark
> at all, but from its name it sounds like sth like nullstone?

Ok, seems you can download coremark so I did that.  The benchmark
doesn't resemble any reasonable state machine as the "state"
is simply compiler-visibly looping 0, 1, 2, 3, 4, 5, 6, 7, 0, 1, 2, 3, 4, 5...
In a real state machine the next state would come from input
(uninteresting) or be determined by a previous state (well - it is in
the artificial case, but for all states, which makes the switch moot).
Thus this benchmark is so artificial that it isn't worth a special
pass.
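
To make the shape of such code concrete, here is a hedged sketch (not taken from coremark, and with made-up function names) of a loop whose "state" merely cycles 0, 1, 2, 0, ... in a compiler-visible way, next to the straight-line form it reduces to once each case's known successor is followed directly instead of going back through the switch.

```c
#include <assert.h>

/* Loop plus switch where every case assigns a constant next state.  */
unsigned
run_with_switch (unsigned steps)
{
  unsigned state = 0, acc = 0;

  for (unsigned i = 0; i < steps; i++)
    switch (state)
      {
      case 0: acc += 1;   state = 1; break;
      case 1: acc += 10;  state = 2; break;
      case 2: acc += 100; state = 0; break;
      }
  return acc;
}

/* The same computation with the switch removed: each case body is
   followed directly by the body of its known successor.  */
unsigned
run_unobfuscated (unsigned steps)
{
  unsigned acc = 0, i = 0;

  for (;;)
    {
      if (i++ == steps)		/* Was "case 0".  */
	return acc;
      acc += 1;
      if (i++ == steps)		/* Was "case 1".  */
	return acc;
      acc += 10;
      if (i++ == steps)		/* Was "case 2".  */
	return acc;
      acc += 100;
    }
}

/* Check that both forms agree for every step count below LIMIT.  */
void
check_equivalence (unsigned limit)
{
  for (unsigned n = 0; n < limit; n++)
    assert (run_with_switch (n) == run_unobfuscated (n));
}
```

In a state machine driven by input the `state = ...` assignments would not be constants, and this rewrite would not be possible.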

I suppose the "expected" optimization is to un-obfuscate this into
the non-state-machine this really is after peeling the first
iterations to make 'seed' compile-time determinable (here I note
the function comment that says it's important that 'seed' is
_not_ compile-time known - hah, peeling is such a nice tool
to "workaround" that idea).

Stupid benchmarks.

Richard.

> Richard.
>
>> Steve Ellcey
>> sell...@mips.com
>>

Re: [GSoC] Elimination of CLooG library installation dependency

2014-08-15 Thread Roman Gareev

> I've attached the patch, which should eliminate CLooG library
> installation dependency from GCC. The CLooG AST generator is still the
> main code generator, but the isl ast generator will be chosen in case
> of nonavailability of CLooG library.
>
> However, I've found out a problem. Almost all the functions of the ISL
> cannot be used without installed CLooG. (I get errors which contain
> “undefined reference to...”). Maybe I missed something. What do you
> think about this?

I’ve attached a patch, which contains the mentioned changes and doesn’t
cause the error. I want to commit it in the coming days. What do you think
about it?
Maybe we should make the ISL AST generator the main code
generator of Graphite (patch1 implements this). What do you think
about it?


-- 
Cheers, Roman Gareev.
2014-08-15 Roman Gareev  

* configure.ac: Eliminate CLooG installation dependency.
* configure: Regenerate.
* Makefile.tpl: Add definition of ISLLIBS and HOST_ISLLIBS.
* Makefile.in: Regenerate.

[config/]

* cloog.m4: Remove the path to isllibs from clooglibs.
* isl.m4: Add paths to islinc, isllibs.

[gcc/]

* Makefile.in: Add definition of ISLLIBS. Update LIBS.
* config.in: Add undef of HAVE_isl.
* configure: Regenerate.
* configure.ac: Add definition of HAVE_isl.
* graphite-blocking.c: Add checking of HAVE_isl.
* graphite-dependences.c: Likewise.
* graphite-interchange.c: Likewise.
* graphite-isl-ast-to-gimple.c: Likewise.
* graphite-optimize-isl.c: Likewise.
* graphite-poly.c: Likewise.
* graphite-scop-detection.c: Likewise.
* graphite-sese-to-poly.c: Likewise.
* graphite.c:
* toplev.c: Replace the checking of HAVE_cloog with the checking
of HAVE_isl.

2014-08-15 Roman Gareev  

[gcc/]

* common.opt: Make the ISL AST generator to be the main code generator
of Graphite.
Index: Makefile.in
===
--- Makefile.in (revision 214008)
+++ Makefile.in (working copy)
@@ -219,6 +219,7 @@
HOST_LIBS="$(STAGE1_LIBS)"; export HOST_LIBS; \
GMPLIBS="$(HOST_GMPLIBS)"; export GMPLIBS; \
GMPINC="$(HOST_GMPINC)"; export GMPINC; \
+   ISLLIBS="$(HOST_ISLLIBS)"; export ISLLIBS; \
ISLINC="$(HOST_ISLINC)"; export ISLINC; \
CLOOGLIBS="$(HOST_CLOOGLIBS)"; export CLOOGLIBS; \
CLOOGINC="$(HOST_CLOOGINC)"; export CLOOGINC; \
@@ -310,6 +311,7 @@
 HOST_GMPINC = @gmpinc@
 
 # Where to find ISL
+HOST_ISLLIBS = @isllibs@
 HOST_ISLINC = @islinc@
 
 # Where to find CLOOG
Index: Makefile.tpl
===
--- Makefile.tpl	(revision 214008)
+++ Makefile.tpl	(working copy)
@@ -222,6 +222,7 @@
HOST_LIBS="$(STAGE1_LIBS)"; export HOST_LIBS; \
GMPLIBS="$(HOST_GMPLIBS)"; export GMPLIBS; \
GMPINC="$(HOST_GMPINC)"; export GMPINC; \
+   ISLLIBS="$(HOST_ISLLIBS)"; export ISLLIBS; \
ISLINC="$(HOST_ISLINC)"; export ISLINC; \
CLOOGLIBS="$(HOST_CLOOGLIBS)"; export CLOOGLIBS; \
CLOOGINC="$(HOST_CLOOGINC)"; export CLOOGINC; \
@@ -313,6 +314,7 @@
 HOST_GMPINC = @gmpinc@
 
 # Where to find ISL
+HOST_ISLLIBS = @isllibs@
 HOST_ISLINC = @islinc@
 
 # Where to find CLOOG
Index: config/cloog.m4
===
--- config/cloog.m4 (revision 214008)
+++ config/cloog.m4 (working copy)
@@ -69,7 +69,7 @@
   fi
 
   clooginc="-DCLOOG_INT_GMP ${clooginc}"
-  clooglibs="${clooglibs} -lcloog-isl ${isllibs} -lisl"
+  clooglibs="${clooglibs} -lcloog-isl"
 ]
 )
 
Index: config/isl.m4
===
--- config/isl.m4   (revision 214008)
+++ config/isl.m4   (working copy)
@@ -68,6 +68,9 @@
 ENABLE_ISL_CHECK=no
 AC_MSG_WARN([using in-tree ISL, disabling version check])
   fi
+
+  islinc="-DCLOOG_INT_GMP ${islinc}"
+  isllibs="${isllibs} -lisl"
 ]
 )
 
Index: configure
===
--- configure   (revision 214008)
+++ configure   (working copy)
@@ -649,6 +649,7 @@
 clooginc
 clooglibs
 islinc
+isllibs
 poststage1_ldflags
 poststage1_libs
 stage1_ldflags
@@ -2760,7 +2761,7 @@
 build_tools="build-texinfo build-flex build-bison build-m4 build-fixincludes"
 
 # these libraries are used by various programs built for the host environment
-#
+#f
 host_libs="intl libiberty opcodes bfd readline tcl tk itcl libgui zlib 
libbacktrace libcpp libdecnumber gmp mpfr mpc isl cloog libelf libiconv"
 
 # these tools are built for the host environment
@@ -5835,10 +5836,9 @@
 fi
 
 
-# Treat either --without-cloog or --without-isl as a request to disable
+# Treat --without-isl as a request to disable
 # GRAPHITE support and skip all following checks.
-if test "x$with_isl" != "xno" &

[PATCH i386 AVX512] [16/n] Add AVX-512BW's psadbw insn.

2014-08-15 Thread Kirill Yukhin

Hello,
This patch introduces AVX-512BW's psadbw insn pattern.
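
For readers unfamiliar with the instruction, a scalar model may help. The sketch below assumes the usual psadbw semantics: for each 64-bit lane, the sum of absolute differences of eight unsigned bytes, zero-extended. The names sad_u8x8 and psadbw_model are hypothetical and exist only for illustration; this is not GCC code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model of psadbw for one 64-bit lane: sum of absolute differences
   of eight unsigned bytes.  The maximum is 8 * 255 = 2040, so the sum fits
   in the low 16 bits and the upper bits of the lane stay zero.  */
static uint64_t
sad_u8x8 (const uint8_t a[8], const uint8_t b[8])
{
  uint64_t sum = 0;
  for (size_t i = 0; i < 8; i++)
    sum += a[i] > b[i] ? (uint64_t) (a[i] - b[i]) : (uint64_t) (b[i] - a[i]);
  return sum;
}

/* Hypothetical helper: apply the lane model to a vector of NBYTES bytes
   (16, 32 or 64 for V2DI, V4DI or V8DI), one result per 64-bit lane.  */
static void
psadbw_model (const uint8_t *a, const uint8_t *b, size_t nbytes,
              uint64_t *dst)
{
  assert (nbytes % 8 == 0);
  for (size_t lane = 0; lane < nbytes / 8; lane++)
    dst[lane] = sad_u8x8 (a + 8 * lane, b + 8 * lane);
}
```

With nbytes set to 64 the model corresponds to the V8DI case that the new VI8_AVX2_AVX512BW iterator enables under TARGET_AVX512BW.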

Bootstrapped.
New tests on top of patch-set all pass
under simulator.

Is it ok for trunk?

gcc/
* config/i386/sse.md
(define_mode_iterator VI8_AVX2_AVX512BW): New.
(define_insn "_psadbw"): Add evex version.

--
Thanks, K

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 0660ae4..5f51c3a 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -288,6 +288,9 @@
   [(V16SI "TARGET_AVX512F") (V8SI "TARGET_AVX2") V4SI
(V8DI "TARGET_AVX512F")])
 
+(define_mode_iterator VI8_AVX2_AVX512BW
+  [(V8DI "TARGET_AVX512BW") (V4DI "TARGET_AVX2") V2DI])
+
 (define_mode_iterator VI8_AVX2
   [(V4DI "TARGET_AVX2") V2DI])
 
@@ -10976,10 +10979,10 @@
 ;; The correct representation for this is absolutely enormous, and
 ;; surely not generally useful.
 (define_insn "_psadbw"
-  [(set (match_operand:VI8_AVX2 0 "register_operand" "=x,x")
-   (unspec:VI8_AVX2
- [(match_operand: 1 "register_operand" "0,x")
-  (match_operand: 2 "nonimmediate_operand" "xm,xm")]
+  [(set (match_operand:VI8_AVX2_AVX512BW 0 "register_operand" "=x,v")
+   (unspec:VI8_AVX2_AVX512BW
+ [(match_operand: 1 "register_operand" "0,v")
+  (match_operand: 2 "nonimmediate_operand" "xm,vm")]
  UNSPEC_PSADBW))]
   "TARGET_SSE2"
   "@
@@ -10989,7 +10992,7 @@
(set_attr "type" "sseiadd")
(set_attr "atom_unit" "simul")
(set_attr "prefix_data16" "1,*")
-   (set_attr "prefix" "orig,vex")
+   (set_attr "prefix" "orig,maybe_evex")
(set_attr "mode" "")])
 
 (define_insn "_movmsk"

[PATCH i386 AVX512] [17/n] Split VI48_AVX512F into VI4_AVX512VL and VI248_AVX512, extend vcvtps2udq,vpbroadcastmb2d.

2014-08-15 Thread Kirill Yukhin

Hello,
This patch splits the VI48_AVX512F iterator into two.
It extends the vcvtps2udq and vpbroadcastmb2d patterns as well.
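
As a reading aid for the vpbroadcastmb2d pattern, here is a scalar sketch of what its RTL (a vec_duplicate of a zero_extend:SI of an HI mask register) describes. The function name broadcast_mask_w_to_d is hypothetical; the element counts are assumed from the modes in VI4_AVX512VL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of (vec_duplicate (zero_extend:SI (reg:HI mask))): every 32-bit
   element receives the 16-bit mask value, zero-extended.
   NELTS is 16, 8 or 4 for V16SI, V8SI or V4SI.  */
static void
broadcast_mask_w_to_d (uint16_t mask, uint32_t *dst, size_t nelts)
{
  assert (nelts == 4 || nelts == 8 || nelts == 16);
  for (size_t i = 0; i < nelts; i++)
    dst[i] = (uint32_t) mask;
}
```

In the iterator, V16SI is unconditional while V8SI and V4SI are gated on TARGET_AVX512VL, so the 8- and 4-element cases only exist when that ISA is enabled.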

Bootstrapped.
New tests on top of patch-set all pass
under simulator.

Is it ok for trunk?

gcc/
* config/i386/sse.md
(define_mode_iterator VI48_AVX512F): Delete.
(define_mode_iterator VI4_AVX512VL): New.
(define_mode_iterator VI248_AVX512): New.
(define_insn "avx512f_ufix_notruncv16sfv16si"): Delete.
(define_insn "_ufix_notrunc"): New.
(define_insn "avx512cd_maskw_vec_dup"): Macroize.
(define_insn "_ashrv"): Delete.
(define_insn "_ashrv"): New.

--
Thanks, K

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 5f51c3a..f932b16 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -284,9 +284,16 @@
 (define_mode_iterator VI4_AVX512F
   [(V16SI "TARGET_AVX512F") (V8SI "TARGET_AVX2") V4SI])
 
-(define_mode_iterator VI48_AVX512F
-  [(V16SI "TARGET_AVX512F") (V8SI "TARGET_AVX2") V4SI
-   (V8DI "TARGET_AVX512F")])
+(define_mode_iterator VI4_AVX512VL
+  [V16SI (V8SI "TARGET_AVX512VL") (V4SI "TARGET_AVX512VL")])
+
+(define_mode_iterator VI248_AVX512
+  [(V16SI "TARGET_AVX512F") (V8SI "TARGET_AVX2") (V4SI "TARGET_AVX2")
+   (V32HI "TARGET_AVX512BW")
+   (V16HI "TARGET_AVX512BW && TARGET_AVX512VL")
+   (V8HI "TARGET_AVX512BW && TARGET_AVX512VL")
+   (V8DI "TARGET_AVX512F") (V4DI "TARGET_AVX512VL") (V2DI "TARGET_AVX512VL")])
+
 
 (define_mode_iterator VI8_AVX2_AVX512BW
   [(V8DI "TARGET_AVX512BW") (V4DI "TARGET_AVX2") V2DI])
@@ -3744,16 +3751,16 @@
(set_attr "prefix" "evex")
(set_attr "mode" "XI")])
 
-(define_insn 
"avx512f_ufix_notruncv16sfv16si"
-  [(set (match_operand:V16SI 0 "register_operand" "=v")
-   (unspec:V16SI
- [(match_operand:V16SF 1 "" 
"")]
+(define_insn 
"_ufix_notrunc"
+  [(set (match_operand:VI4_AVX512VL 0 "register_operand" "=v")
+   (unspec:VI4_AVX512VL
+ [(match_operand: 1 "nonimmediate_operand" 
"")]
  UNSPEC_UNSIGNED_FIX_NOTRUNC))]
   "TARGET_AVX512F"
   "vcvtps2udq\t{%1, %0|%0, 
%1}"
   [(set_attr "type" "ssecvt")
(set_attr "prefix" "evex")
-   (set_attr "mode" "XI")])
+   (set_attr "mode" "")])
 
 (define_insn "fix_truncv16sfv16si2"
   [(set (match_operand:V16SI 0 "register_operand" "=v")
@@ -14483,9 +14490,9 @@
(set_attr "prefix" "evex")
(set_attr "mode" "XI")])
 
-(define_insn "avx512cd_maskw_vec_dupv16si"
-  [(set (match_operand:V16SI 0 "register_operand" "=v")
-   (vec_duplicate:V16SI
+(define_insn "avx512cd_maskw_vec_dup"
+  [(set (match_operand:VI4_AVX512VL 0 "register_operand" "=v")
+   (vec_duplicate:VI4_AVX512VL
  (zero_extend:SI
(match_operand:HI 1 "register_operand" "Yk"]
   "TARGET_AVX512CD"
@@ -15167,12 +15174,16 @@
   DONE;
 })
 
-(define_insn "_ashrv"
-  [(set (match_operand:VI48_AVX512F 0 "register_operand" "=v")
-   (ashiftrt:VI48_AVX512F
- (match_operand:VI48_AVX512F 1 "register_operand" "v")
- (match_operand:VI48_AVX512F 2 "nonimmediate_operand" "vm")))]
-  "TARGET_AVX2 && "
+(define_insn "_ashrv"
+  [(set (match_operand:VI248_AVX512 0 "register_operand" "=v")
+   (ashiftrt:VI248_AVX512
+ (match_operand:VI248_AVX512 1 "register_operand" "v")
+ (match_operand:VI248_AVX512 2 "nonimmediate_operand" "vm")))]
+  "TARGET_AVX2
+   && (!
+   || (TARGET_AVX512BW && (mode == V32HImode || TARGET_AVX512VL))
+   || (TARGET_AVX512VL && GET_MODE_INNER (mode) != HImode)
+   || ( == 64 && GET_MODE_INNER (mode) != HImode))"
   "vpsrav\t{%2, %1, %0|%0, %1, 
%2}"
   [(set_attr "type" "sseishft")
(set_attr "prefix" "maybe_evex")

Re: [COMMITTED] Add myself to MAINTAINERS file (Write After Approval)

2014-08-15 Thread Richard Biener

On Fri, Aug 15, 2014 at 12:42 PM, Ilya Tocar  wrote:
> Hi,
>
> This patch adds myself to the MAINTAINERS file.  Committed as 214012.

Please keep this list sorted alphabetically.

> ---
>  MAINTAINERS | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 87fb9dd..a40a537 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -555,6 +555,7 @@ Dinar Temirbulatov  
> dtemirbula...@gmail.com
>  Kresten Krab Thorupk...@gcc.gnu.org
>  Caroline Tice  cmt...@google.com
>  Kyrylo Tkachov kyrylo.tkac...@arm.com
> +Tocar Ilya toca...@gmail.com
>  Konrad Trifunovic  konrad.trifuno...@inria.fr
>  Markus Trippelsdorfmar...@trippelsdorf.de
>  David Ung  dav...@mips.com
> --
> 1.8.3.1
>

Re: [GSoC] Elimination of CLooG library installation dependency

2014-08-15 Thread Richard Biener

On Fri, Aug 15, 2014 at 1:13 PM, Roman Gareev  wrote:
>> I've attached the patch, which should eliminate CLooG library
>> installation dependency from GCC. The CLooG AST generator is still the
>> main code generator, but the isl ast generator will be chosen in case
>> of nonavailability of CLooG library.
>>
>> However, I've found out a problem. Almost all the functions of the ISL
>> cannot be used without installed CLooG. (I get errors which contain
>> “undefined reference to...”). Maybe I missed something. What do you
>> think about this?
>
> I’ve attached a patch, which contains mentioned changes and doesn’t
> cause the error. I want to commit it in coming days. What do you think
> about it?
> Maybe we should make the ISL AST generator to be the main code
> generator of Graphite (the patch1 implements this). What do you think
> about it?

We definitely should do that (and rip out the cloog code after a while).

Richard.

>
> --
> Cheers, Roman Gareev.

Re: [PATCH PR62011]

2014-08-15 Thread Yuri Rumyantsev

I checked that zeroing the destination operand for unary bit-manipulation
instructions is helpful in 64- and 32-bit modes only, so the patch was
changed accordingly.
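
To make the conditions discussed in this thread concrete, here is a minimal sketch of the decision. It is not the body of ix86_avoid_false_dep_for_bm, which is not shown here. The struct bm_operands and the function should_zero_dest_first are hypothetical names. The 32/64-bit restriction comes from the message above; the overlap check is my own assumption, based on the fact that zeroing a register the instruction still reads would destroy its input.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical operand description; not a GCC data structure.  */
struct bm_operands
{
  int dest_regno;        /* destination register number */
  int src_regno;         /* source register, or -1 if the source is memory */
  int src_addr_regs[2];  /* registers used in a memory address, -1 if unused */
  int mode_bits;         /* operand width: 16, 32 or 64 */
};

/* Return true if "xor dest, dest" may usefully be emitted before
   tzcnt/lzcnt/popcnt to break the false dependency on DEST.  */
static bool
should_zero_dest_first (bool tune_avoid_false_dep,
                        const struct bm_operands *op)
{
  assert (op->mode_bits == 16 || op->mode_bits == 32 || op->mode_bits == 64);

  /* Only for targets whose tuning asks for it.  */
  if (!tune_avoid_false_dep)
    return false;

  /* Reported as helpful for 32- and 64-bit operands only.  */
  if (op->mode_bits == 16)
    return false;

  /* Assumption: the destination must not be read by the instruction,
     neither as the source register nor as part of a memory address.  */
  if (op->dest_regno == op->src_regno)
    return false;
  for (int i = 0; i < 2; i++)
    if (op->src_addr_regs[i] == op->dest_regno)
      return false;

  return true;
}
```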

Is it OK for trunk?

gcc/ChangeLog
2014-08-15  Yuri Rumyantsev  

PR target/62011
* config/i386/i386-protos.h (ix86_avoid_false_dep_for_bm): New function
 prototype.
* config/i386/i386.c (ix86_avoid_false_dep_for_bm): New function.
* config/i386/i386.h (TARGET_AVOID_FALSE_DEP_FOR_BM): New macro.
* config/i386/i386.md (ctz2, clz2_lzcnt, popcount2,
 *popcount2_cmp, *popcountsi2_cmp_zext): Output zeroing
 destination register for unary bit-manipulation instructions
 if required.
* config/i386/x86-tune.def (X86_TUNE_AVOID_FALSE_DEP_FOR_BM): New.

2014-08-14 19:39 GMT+04:00 Yuri Rumyantsev :
> It does not help Silvermont, i.e. only Haswell and SandyBridge are affected.
> I don't use splitter since (1) it deletes zeroing of dest reg; (2)
> scheduler can hoist them up.  I will try r16/r32 variants and tell you
> later.
>
> 2014-08-14 19:18 GMT+04:00 H.J. Lu :
>> On Thu, Aug 14, 2014 at 4:50 AM, Yuri Rumyantsev  wrote:
>>> Hi All,
>>>
>>> Here is a fix for PR 62011 - remove false dependency for unary
>>> bit-manipulation instructions for latest BigCore chips (Sandybridge
>>> and Haswell) by outputting in assembly file zeroing destination
>>> register before bmi instruction. I checked that performance restored
>>> for popcnt, lzcnt and tzcnt instructions.
>>>
>>> Bootstrap and regression testing did not show any new failures.
>>>
>>> Is it OK for trunk?
>>>
>>> gcc/ChangeLog
>>> 2014-08-14  Yuri Rumyantsev  
>>>
>>> PR target/62011
>>> * config/i386/i386-protos.h (ix86_avoid_false_dep_for_bm): New function
>>>  prototype.
>>> * config/i386/i386.c (ix86_avoid_false_dep_for_bm): New function.
>>> * config/i386/i386.h (TARGET_AVOID_FALSE_DEP_FOR_BM) New macros.
>>> * config/i386/i386.md (ctz2, clz2_lzcnt, popcount2,
>>>  *popcount2_cmp, *popcountsi2_cmp_zext): Output zeroing
>>>  destination register for unary bit-manipulation instructions
>>>  if required.
>>
>> Why don't you use a splitter to generate XOR?
>>
>>> * config/i386/x86-tune.def (X86_TUNE_AVOID_FALSE_DEP_FOR_BM): New.
>>
>> Is this needed for r16 and r32?  The original report says that only
>> r64 is affected:
>>
>> http://stackoverflow.com/questions/25078285/replacing-a-32-bit-loop-count-variable-with-64-bit-introduces-crazy-performance
>>
>> Have you tried this on Silvermont?  Does it help Silvermont?
>>
>> --
>> H.J.


patch1
Description: Binary data

Re: [COMMITTED] Add myself to MAINTAINERS file (Write After Approval)

2014-08-15 Thread Jakub Jelinek

On Fri, Aug 15, 2014 at 01:43:55PM +0200, Richard Biener wrote:
> On Fri, Aug 15, 2014 at 12:42 PM, Ilya Tocar  wrote:
> > Hi,
> >
> > This patch adds myself to the MAINTAINERS file.  Committed as 214012.
> 
> Please keep this list sorted alphabetically.

Well, Ilya did, just swapped surname and given name from the standard practice
in MAINTAINERS.  So, please just do
sed -i -e 's/Tocar Ilya/Ilya Tocar/' MAINTAINERS

> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -555,6 +555,7 @@ Dinar Temirbulatov  
> > dtemirbula...@gmail.com
> >  Kresten Krab Thorupk...@gcc.gnu.org
> >  Caroline Tice  cmt...@google.com
> >  Kyrylo Tkachov kyrylo.tkac...@arm.com
> > +Tocar Ilya toca...@gmail.com
> >  Konrad Trifunovic  konrad.trifuno...@inria.fr
> >  Markus Trippelsdorfmar...@trippelsdorf.de
> >  David Ung  dav...@mips.com

Jakub

[PATCH i386 AVX512] [18/n] Extend vpbroadcastmb2q.

2014-08-15 Thread Kirill Yukhin

Hello,
This patch extends the pattern for the vpbroadcastmb2q
insn.

Bootstrapped.
New tests on top of patch-set all pass
under simulator.

Is it ok for trunk?

gcc/
* config/i386/sse.md
(define_mode_iterator VI8_AVX512VL): New.
(define_insn "avx512cd_maskb_vec_dup"): Macroize.

--
Thanks, K

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index f932b16..54753f9 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -266,6 +266,9 @@
 (define_mode_iterator VI8
   [(V8DI "TARGET_AVX512F") (V4DI "TARGET_AVX") V2DI])
 
+(define_mode_iterator VI8_AVX512VL
+  [V8DI (V4DI "TARGET_AVX512VL") (V2DI "TARGET_AVX512VL")])
+
 (define_mode_iterator VI1_AVX2
   [(V32QI "TARGET_AVX2") V16QI])
 
@@ -14479,9 +14482,9 @@
(set_attr "prefix" "vex")
(set_attr "mode" "")])
 
-(define_insn "avx512cd_maskb_vec_dupv8di"
-  [(set (match_operand:V8DI 0 "register_operand" "=v")
-   (vec_duplicate:V8DI
+(define_insn "avx512cd_maskb_vec_dup"
+  [(set (match_operand:VI8_AVX512VL 0 "register_operand" "=v")
+   (vec_duplicate:VI8_AVX512VL
  (zero_extend:DI
(match_operand:QI 1 "register_operand" "Yk"]
   "TARGET_AVX512CD"

Re: [COMMITTED] Add myself to MAINTAINERS file (Write After Approval)

2014-08-15 Thread Ilya Tocar

> > This patch adds myself to the MAINTAINERS file.  Committed as 214012.
> 
> Please keep this list sorted alphabetically.
>
Sorry, I attached the wrong version of the patch.
The actually committed version (rev 214012) is in alphabetical order.

Index: ChangeLog
===
--- ChangeLog   (revision 214011)
+++ ChangeLog   (revision 214012)
@@ -1,3 +1,7 @@
+2014-08-15  Ilya Tocar   
+
+   * MAINTAINERS (Write After Approval): Add myself.
+
 2014-08-01  Jiong Wang  
 
* MAINTAINERS (Write After Approval): Add myself.
Index: MAINTAINERS
===
--- MAINTAINERS (revision 214011)
+++ MAINTAINERS (revision 214012)
@@ -555,6 +555,7 @@
 Kresten Krab Thorupk...@gcc.gnu.org
 Caroline Tice  cmt...@google.com
 Kyrylo Tkachov kyrylo.tkac...@arm.com
+Ilya Tocar toca...@gmail.com
 Konrad Trifunovic  konrad.trifuno...@inria.fr
 Markus Trippelsdorfmar...@trippelsdorf.de
 David Ung  dav...@mips.com

[PATCH i386 AVX512] [19/n] Extends AVX-512 broadcasts.

2014-08-15 Thread Kirill Yukhin

Hello,
This patch introduces new patterns to support
AVX-512VL,DQ broadcast insns.
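
Several of the new patterns use a vshuf 32x4 with an immediate of 0x44 or 0x0 as the register alternative for a broadcast. The sketch below models the 512-bit lane selection as I understand the instruction's documented rule: the two low destination lanes are picked from the first source, the two high lanes from the second, two immediate bits each. Each 128-bit lane is represented by an integer tag; shuf32x4_512_model is a hypothetical name, and the 256-bit form is not modeled.

```c
#include <assert.h>

/* Model of the 512-bit 32x4 shuffle: each argument array holds one tag per
   128-bit lane.  Destination lanes 0 and 1 come from SRC1, lanes 2 and 3
   from SRC2, each selected by a 2-bit field of IMM8.  */
static void
shuf32x4_512_model (const int src1[4], const int src2[4], unsigned imm8,
                    int dst[4])
{
  assert (imm8 <= 0xff);
  dst[0] = src1[imm8 & 3];
  dst[1] = src1[(imm8 >> 2) & 3];
  dst[2] = src2[(imm8 >> 4) & 3];
  dst[3] = src2[(imm8 >> 6) & 3];
}
```

With both sources set to the same register, 0x44 yields lanes 0,1,0,1 (a broadcast of the low 256 bits) and 0x0 yields lane 0 four times (a broadcast of the low 128 bits), which matches how the patterns use these immediates.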

Bootstrapped.
New tests on top of patch-set all pass
under simulator.

Is it ok for trunk?

gcc/
* config/i386/sse.md
(define_mode_iterator VI4F_BRCST32x2): New.
(define_mode_attr 64x2_mode): New.
(define_mode_attr 32x2mode): New.
(define_insn "avx512dq_broadcast"): New.
(define_insn "avx512vl_broadcast_1"): New.
(define_insn "avx512dq_broadcast_1"): New.
(define_insn "avx512dq_broadcast_1"): New.

--
Thanks, K

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 54753f9..a8c7ba8 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -14482,6 +14482,81 @@
(set_attr "prefix" "vex")
(set_attr "mode" "")])
 
+;; For broadcast[i|f]32x2.  Yes there is no v4sf version, only v4si.
+(define_mode_iterator VI4F_BRCST32x2
+  [V16SI (V8SI "TARGET_AVX512VL") (V4SI "TARGET_AVX512VL")
+   V16SF (V8SF "TARGET_AVX512VL")])
+
+(define_mode_attr 64x2_mode
+  [(V8DF "V2DF") (V8DI "V2DI") (V4DI "V2DI") (V4DF "V2DF")])
+
+(define_mode_attr 32x2mode
+  [(V16SF "V2SF") (V16SI "V2SI") (V8SI "V2SI")
+  (V8SF "V2SF") (V4SI "V2SI")])
+
+(define_insn "avx512dq_broadcast"
+  [(set (match_operand:VI4F_BRCST32x2 0 "register_operand" "=v")
+   (vec_duplicate:VI4F_BRCST32x2
+ (vec_select:<32x2mode>
+   (match_operand: 1 "nonimmediate_operand" "vm")
+   (parallel [(const_int 0) (const_int 1)]]
+  "TARGET_AVX512DQ"
+  "vbroadcast32x2\t{%1, %0|%0, %1}"
+  [(set_attr "type" "ssemov")
+   (set_attr "prefix_extra" "1")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "")])
+
+(define_insn "avx512vl_broadcast_1"
+  [(set (match_operand:VI4F_256 0 "register_operand" "=v,v")
+(vec_duplicate:VI4F_256
+ (match_operand: 1 "nonimmediate_operand" "v,m")))]
+  "TARGET_AVX512VL"
+  "@
+   vshuf32x4\t{$0x0, %t1, %t1, 
%0|%0, %t1, %t1, 0x0}
+   vbroadcast32x4\t{%1, %0|%0, %1}"
+  [(set_attr "type" "ssemov")
+   (set_attr "prefix_extra" "1")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "")])
+
+(define_insn "avx512dq_broadcast_1"
+  [(set (match_operand:V16FI 0 "register_operand" "=v,v")
+   (vec_duplicate:V16FI
+ (match_operand: 1 "nonimmediate_operand" "v,m")))]
+  "TARGET_AVX512DQ"
+  "@
+   vshuf32x4\t{$0x44, %g1, %g1, 
%0|%0, %g1, %g1, 0x44}
+   vbroadcast32x8\t{%1, %0|%0, %1}"
+  [(set_attr "type" "ssemov")
+   (set_attr "prefix_extra" "1")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "")])
+
+(define_insn "avx512dq_broadcast_1"
+  [(set (match_operand:VI8F_256_512 0 "register_operand" "=v,v")
+   (vec_duplicate:VI8F_256_512
+ (match_operand:<64x2_mode> 1 "nonimmediate_operand" "v,m")))]
+  "TARGET_AVX512DQ && ( == 64 || TARGET_AVX512VL)"
+{
+  switch (which_alternative)
+{
+case 0:
+  if (GET_MODE_SIZE (mode) == 64)
+return "vshuf64x2\t{$0x0, %g1, %g1, 
%0|%0, %g1, %g1, 0x0}";
+  else
+return "vshuf64x2\t{$0x0, %t1, %t1, 
%0|%0, %t1, %t1, 0x0}";
+case 1:
+  return "vbroadcast64x2\t{%1, 
%0|%0, %1}";
+default:
+  gcc_unreachable ();
+}
+}
+  [(set_attr "type" "ssemov")
+   (set_attr "prefix_extra" "1")
+   (set_attr "prefix" "evex")
+   (set_attr "mode" "")])
+
 (define_insn "avx512cd_maskb_vec_dup"
   [(set (match_operand:VI8_AVX512VL 0 "register_operand" "=v")
(vec_duplicate:VI8_AVX512VL

Re: [PATCH PR62011]

2014-08-15 Thread Jakub Jelinek

On Fri, Aug 15, 2014 at 03:45:33PM +0400, Yuri Rumyantsev wrote:
> gcc/ChangeLog
> 2014-08-15  Yuri Rumyantsev  
> 
> PR target/62011
> * config/i386/i386-protos.h (ix86_avoid_false_dep_for_bm): New function
>  prototype.
> * config/i386/i386.c (ix86_avoid_false_dep_for_bm): New function.
> * config/i386/i386.h (TARGET_AVOID_FALSE_DEP_FOR_BM) New macros.
> * config/i386/i386.md (ctz2, clz2_lzcnt, popcount2,
>  *popcount2_cmp, *popcountsi2_cmp_zext): Output zeroing
>  destination register for unary bit-manipulation instructions
>  if required.
> * config/i386/x86-tune.def (X86_TUNE_AVOID_FALSE_DEP_FOR_BM): New.

--- config/i386/i386.md (revision 213842)
+++ config/i386/i386.md (working copy)
@@ -12111,7 +12111,13 @@
   ""
 {
   if (TARGET_BMI)
-return "tzcnt{}\t{%1, %0|%0, %1}";
+{
+  if (ix86_avoid_false_dep_for_bm (insn, operands))
+   return "xor{}\t%0, %0\n\t"
+   "tzcnt{}\t{%1, %0|%0, %1}";
+  else
+return "tzcnt{}\t{%1, %0|%0, %1}";
+}
   else if (optimize_function_for_size_p (cfun))
 ;
   else if (TARGET_GENERIC)

etc., this will make the length attribute incorrect though.

Jakub

[PATCH i386 AVX512] [20/n] AVX-512 integer shift pattern.

2014-08-15 Thread Kirill Yukhin

Hello,
This patch extends the shift pattern to support the new
AVX-512 insns.

Bootstrapped.
New tests on top of patch-set all pass
under simulator.

Is it ok for trunk?

gcc/
* config/i386/sse.md
(define_mode_iterator VI248_AVX2): Add V32HI mode.
(define_insn "3"): Add masking.

--
Thanks, K

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index a8c7ba8..c219523 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -348,7 +348,7 @@
(V8SI "TARGET_AVX2") V4SI])
 
 (define_mode_iterator VI248_AVX2
-  [(V16HI "TARGET_AVX2") V8HI
+  [(V32HI "TARGET_AVX512BW") (V16HI "TARGET_AVX2") V8HI
(V8SI "TARGET_AVX2") V4SI
(V4DI "TARGET_AVX2") V2DI])
 
@@ -8552,15 +8552,19 @@
(const_string "0")))
(set_attr "mode" "")])
 
-(define_insn "3"
-  [(set (match_operand:VI248_AVX2 0 "register_operand" "=x,x")
+(define_insn "3"
+  [(set (match_operand:VI248_AVX2 0 "register_operand" "=x,v")
(any_lshift:VI248_AVX2
- (match_operand:VI248_AVX2 1 "register_operand" "0,x")
- (match_operand:SI 2 "nonmemory_operand" "xN,xN")))]
-  "TARGET_SSE2"
+ (match_operand:VI248_AVX2 1 "register_operand" "0,v")
+ (match_operand:SI 2 "nonmemory_operand" "xN,vN")))]
+  "TARGET_SSE2
+   && 
+   && ((mode != V16HImode && mode != V8HImode)
+   || TARGET_AVX512BW
+   || !)"
   "@
p\t{%2, %0|%0, %2}
-   vp\t{%2, %1, %0|%0, %1, %2}"
+   vp\t{%2, %1, %0|%0, 
%1, %2}"
   [(set_attr "isa" "noavx,avx")
(set_attr "type" "sseishft")
(set (attr "length_immediate")

Re: [PATCH] Extended if-conversion for loops marked with pragma omp simd.

2014-08-15 Thread Yuri Rumyantsev

Richard!
Here is an updated patch with the following changes:

1. All restrictions on phi functions were eliminated for extended conversion.
2. The predicate for a critical edge is now stored in the 'aux' field of the
edge, i.e. negate_predicate was deleted.
3. Deleted splitting of critical edges, i.e. both outgoing edges can
be critical.
4. Use the notion of cd-equivalence to set up the predicate for join basic
blocks, which simplifies it.
5. I decided not to design a pre-pass, since it would lead to generating a
chain of cond expressions for phi-node if-conversion, whereas for a phi
of the kind
  x = PHI <1(2), 1(3), 2(4)>
only one cond expression is required, and this is treated as a simple
optimization for an arbitrary phi function. More precisely,
if a phi function has only two different arguments and one of them has a
single occurrence, if-conversion is performed as if the phi had only 2
arguments.
For an arbitrary phi function a chain of cond expressions is produced.
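
The special case in item 5 can be sketched as follows. This is an illustrative model only and not the code from the attached patch: phi arguments are reduced to plain integers, and the name two_args_one_single is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if ARGS holds exactly two distinct values and at least one of
   them occurs exactly once.  On success *SINGLE_IDX is the position of a
   single-occurrence argument (the second value is preferred when both
   values occur once, as in an ordinary two-argument phi).  */
static bool
two_args_one_single (const int *args, size_t n, size_t *single_idx)
{
  assert (args != NULL && single_idx != NULL);
  if (n < 2)
    return false;

  int first = args[0];
  int second = 0;
  bool have_second = false;
  size_t n_first = 0, n_second = 0;
  size_t last_first = 0, last_second = 0;

  for (size_t i = 0; i < n; i++)
    {
      if (args[i] == first)
        {
          n_first++;
          last_first = i;
        }
      else if (!have_second)
        {
          second = args[i];
          have_second = true;
          n_second = 1;
          last_second = i;
        }
      else if (args[i] == second)
        {
          n_second++;
          last_second = i;
        }
      else
        return false;  /* a third distinct value: needs a cond chain */
    }

  if (!have_second)
    return false;
  if (n_second == 1)
    {
      *single_idx = last_second;
      return true;
    }
  if (n_first == 1)
    {
      *single_idx = last_first;
      return true;
    }
  return false;
}
```

For x = PHI <1(2), 1(3), 2(4)> the model reports index 2, so a single select on the predicate of the edge from block 4 (2 if it holds, 1 otherwise) is enough.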

Updated patch is attached.

Any comments will be appreciated.

2014-08-15  Yuri Rumyantsev  

* tree-if-conv.c (cgraph.h): Add include file to detect function clone.
(flag_force_vectorize): New variable.
(edge_predicate): New function.
(set_edge_predicate): New function.
(add_stmt_to_bb_predicate_gimplified_stmts): New function.
(init_bb_predicate): Add initialization of negate_predicate field.
(reset_bb_predicate): Reset negate_predicate to NULL_TREE.
(convert_name_to_cmp): New function.
(get_type_for_cond): New function.
(convert_bool_predicate): New function.
(predicate_disjunction): New function.
(predicate_conjunction): New function.
(add_to_predicate_list): Add convert_bool argument.
Use predicate of cd-equivalent block if convert_bool is true and
such bb exists; save it in static variable for further possible use.
Add call of predicate_disjunction if convert_bool argument is true.
(add_to_dst_predicate_list): Add convert_bool argument.
Add early function exit if edge target block is always executed.
Add call of predicate_conjunction if convert_bool argument is true.
Pass convert_bool argument for add_to_predicate_list.
Set-up predicate for critical edge if convert_bool is true.
(equal_phi_args): New function.
(phi_has_two_different_args): New function.
(if_convertible_phi_p): Accept phi nodes with more than two args
if flag_force_vectorize was set-up.
(ifcvt_can_use_mask_load_store): Add test on flag_force_vectorize.
(if_convertible_stmt_p): Allow calls of function clones if
flag_force_vectorize was set-up.
(all_edges_are_critical): New function.
(if_convertible_bb_p): Allow bb has more than two predecessors if
flag_force_vectorize was set-up. Use call of all_edges_are_critical
to reject block if-conversion with incoming critical edges only if
flag_force_vectorize was not set-up.
(walk_cond_tree): New function.
(vect_bool_pattern_is_applicable): New function.
(predicate_bbs): Add convert_bool argument which is used to transform
comparison expressions of boolean type into conditional expressions
with integral operands. If convert_bool argument was set-up and
vect bool pattern can be applied, perform the following transformation:
(bool) x != 0  --> y = (int) x; x != 0;
Add check that if fold_build2 produces bool conversion if convert_bool
was set-up, recompute predicate using build2_loc. Additional argument
'convert_bool' is passed to add_to_dst_predicate_list and
add_to_predicate_list.
(if_convertible_loop_p_1): Recompute POST_DOMINATOR tree if
flag_force_vectorize was set-up to calculate cd equivalent bb's.
Call predicate_bbs with additional argument equal to false.
(find_phi_replacement_condition): Extend function interface:
it returns NULL if given phi node must be handled by means of
extended phi node predication. If number of predecessors of phi-block
is equal to 2 and at least one incoming edge is not critical, the original
algorithm is used.
(is_cond_scalar_reduction): Add 'extended' argument which signals that
phi arguments must be evaluated through phi_has_two_different_args.
(predicate_scalar_phi): Add invocation of convert_name_to_cmp if cond
is SSA_NAME. Add 'false' argument to call of is_cond_scalar_reduction.
(get_predicate_for_edge): New function.
(find_insertion_point): New function.
(predicate_arbitrary_phi): New function.
(predicate_extended_scalar_phi): New function.
(predicate_all_scalar_phis): Add code to set-up gimple statement
iterator for predication of extended scalar phi's for insertion.
(insert_gimplified_predicates): Add test for non-predicated basic
blocks that there are no gimplified statements to insert. Insert
predicates at the block beginning for extended if-conversion.
(predicate_mem_writes): Invoke convert_name_to_cmp for extended
predication to build mask.
(combine_blocks): Pass flag_force_vectorize to predicate_bbs.
(tree_if_conversion): Initialize flag_force_vectorize from current
loop or outer loop (to support pragma omp declare).  Do loop versioning
for innermost loop marked with pragma omp simd.

2014-08-01 13:40 GMT+04:00 Richard Biener :
> On Wed, Jun 25, 2014 at 4:06

Re: [c++-concepts] explicit instantiation and specialization

2014-08-15 Thread Andrew Sutton

Just committed this patch, fixing the bootstrap.

2014-08-13  Andrew Sutton  

Fix regression in bootstrap.
  * gcc/cp/call.c (get_temploid): Removed. No longer called.
  (joust): Remove unused variable declarations.

Andrew


On Wed, Aug 13, 2014 at 9:50 PM, Andrew Sutton
 wrote:
> Ah... sorry. Leftovers. I didn't have time to run a full bootstrap
> build before heading out for a few days. I'll try to get those out
> tomorrow afternoon-ish.
>
> Andrew
>
>
> On Wed, Aug 13, 2014 at 9:13 PM, Ed Smith-Rowland <3dw...@verizon.net> wrote:
>> I get build fail:
>>
>> ../../gcc_concepts/gcc/cp/call.c:8793:8: error: unused variable ‘m1’
>> [-Werror=unused-variable]
>>tree m1 = get_temploid (cand1);
>> ^
>> ../../gcc_concepts/gcc/cp/call.c:8794:8: error: unused variable ‘m2’
>> [-Werror=unused-variable]
>>tree m2 = get_temploid (cand2);
>> ^
>> cc1plus: all warnings being treated as errors
>>
>> Commenting the lines let the build finish.
>>
>> Ed
>>
Index: gcc/cp/call.c
===
--- gcc/cp/call.c	(revision 213924)
+++ gcc/cp/call.c	(working copy)
@@ -8755,24 +8755,6 @@ add_warning (struct z_candidate *winner,
   winner->warnings = cw;
 }
 
-// When a CANDidate function is a member function of a class template
-// specialization, return the temploid describing that function.
-// Returns NULL_TREE otherwise.
-static inline tree
-get_temploid (struct z_candidate *cand)
-{
-  gcc_assert (cand);
-  tree t = NULL_TREE;
-  if (!cand->template_decl)
-{
-  if (DECL_P (cand->fn) && DECL_USE_TEMPLATE (cand->fn))
-t = DECL_TI_TEMPLATE (cand->fn);
-  if (t && TREE_CODE (t) == TEMPLATE_INFO)
-t = TI_TEMPLATE (t);
-}
-return t;
-}
-
 /* Compare two candidates for overloading as described in
[over.match.best].  Return values:
 
@@ -8789,10 +8771,6 @@ joust (struct z_candidate *cand1, struct
   size_t i;
   size_t len;
 
-  // Try to get a temploid describing each candidate. 
-  tree m1 = get_temploid (cand1);
-  tree m2 = get_temploid (cand2);
-
   /* Candidates that involve bad conversions are always worse than those
  that don't.  */
   if (cand1->viable > cand2->viable)

Re: [PATCH PR62011]

2014-08-15 Thread Yuri Rumyantsev

Jakub,

Is it important to have a correct value of the length attribute for Big Cores?
As far as I know, this attribute is used for code layout alignment.

2014-08-15 15:54 GMT+04:00 Jakub Jelinek :
> On Fri, Aug 15, 2014 at 03:45:33PM +0400, Yuri Rumyantsev wrote:
>> gcc/ChangeLog
>> 2014-08-15  Yuri Rumyantsev  
>>
>> PR target/62011
>> * config/i386/i386-protos.h (ix86_avoid_false_dep_for_bm): New function
>>  prototype.
>> * config/i386/i386.c (ix86_avoid_false_dep_for_bm): New function.
>> * config/i386/i386.h (TARGET_AVOID_FALSE_DEP_FOR_BM) New macros.
>> * config/i386/i386.md (ctz2, clz2_lzcnt, popcount2,
>>  *popcount2_cmp, *popcountsi2_cmp_zext): Output zeroing
>>  destination register for unary bit-manipulation instructions
>>  if required.
>> * config/i386/x86-tune.def (X86_TUNE_AVOID_FALSE_DEP_FOR_BM): New.
>
> --- config/i386/i386.md (revision 213842)
> +++ config/i386/i386.md (working copy)
> @@ -12111,7 +12111,13 @@
>""
>  {
>if (TARGET_BMI)
> -return "tzcnt{}\t{%1, %0|%0, %1}";
> +{
> +  if (ix86_avoid_false_dep_for_bm (insn, operands))
> +   return "xor{}\t%0, %0\n\t"
> +   "tzcnt{}\t{%1, %0|%0, %1}";
> +  else
> +return "tzcnt{}\t{%1, %0|%0, %1}";
> +}
>else if (optimize_function_for_size_p (cfun))
>  ;
>else if (TARGET_GENERIC)
>
> etc., this will make the length attribute incorrect though.
>
> Jakub

Re: [PATCH] Warn about unclosed pragma omp declare target.

2014-08-15 Thread Ilya Tocar

Ping.

On 29 Jul 18:45, Ilya Tocar wrote:
> Hi,
> 
> As discussed here in https://gcc.gnu.org/ml/gcc/2014-01/msg00189.html
> Gcc should complain about pragma omp declare target without
> corresponding pragma omp end declare target. This patch adds a warning
> for those cases.
> Bootstraps/passes make-check.
> Ok for trunk?
> 
> ChangeLog:
> 
> 2014-07-29  Ilya Tocar  
> 
>   * c-decl.c (omp_declare_target_location_stack): New.
>   * c-lang.h (omp_declare_target_location_stack): Declare.
>   * c-parser.c (warn_unclosed_pragma_omp_target): New.
>   (c_parser_translation_unit): Call it.
>   (c_parser_omp_declare_target): Remember location.
>   (c_parser_omp_end_declare_target): Forget location.
> 
> And ChangeLog for testsuite:
> 
> 2014-07-29  Ilya Tocar  
> 
>   * gcc.dg/gomp//target-3.c: New testcase.
> 
> ---
>  gcc/c/c-decl.c   |  3 +++
>  gcc/c/c-lang.h   |  3 +++
>  gcc/c/c-parser.c | 22 +-
>  gcc/testsuite/gcc.dg/gomp/target-3.c | 33 +
>  4 files changed, 60 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.dg/gomp/target-3.c
> 
> diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
> index 2a4b439..2dd5b2c 100644
> --- a/gcc/c/c-decl.c
> +++ b/gcc/c/c-decl.c
> @@ -158,6 +158,9 @@ enum machine_mode c_default_pointer_mode = VOIDmode;
>  /* If non-zero, implicit "omp declare target" attribute is added into the
> attribute lists.  */
>  int current_omp_declare_target_attribute;
> +
> +/* Holds locations of currently open "omp declare target" pragmas.  */
> +vec omp_declare_target_location_stack;
>  
>  /* Each c_binding structure describes one binding of an identifier to
> a decl.  All the decls in a scope - irrespective of namespace - are
> diff --git a/gcc/c/c-lang.h b/gcc/c/c-lang.h
> index e974906..cef995c 100644
> --- a/gcc/c/c-lang.h
> +++ b/gcc/c/c-lang.h
> @@ -59,4 +59,7 @@ struct GTY(()) language_function {
> attribute lists.  */
>  extern GTY(()) int current_omp_declare_target_attribute;
>  
> +/* Holds locations of currently open "omp declare target" pragmas.  */
> +extern vec omp_declare_target_location_stack;
> +
>  #endif /* ! GCC_C_LANG_H */
> diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
> index e32bf04..0b96fe9 100644
> --- a/gcc/c/c-parser.c
> +++ b/gcc/c/c-parser.c
> @@ -1255,6 +1255,8 @@ static bool c_parser_cilk_verify_simd (c_parser *, enum 
> pragma_context);
>  static tree c_parser_array_notation (location_t, c_parser *, tree, tree);
>  static tree c_parser_cilk_clause_vectorlength (c_parser *, tree, bool);
>  
> +static void warn_unclosed_pragma_omp_target ();
> +
>  /* Parse a translation unit (C90 6.7, C99 6.9).
>  
> translation-unit:
> @@ -1290,6 +1292,8 @@ c_parser_translation_unit (c_parser *parser)
>   }
>while (c_parser_next_token_is_not (parser, CPP_EOF));
>  }
> +
> +  warn_unclosed_pragma_omp_target ();
>  }
>  
>  /* Parse an external declaration (C90 6.7, C99 6.9).
> @@ -13068,8 +13072,10 @@ c_finish_omp_declare_simd (c_parser *parser, tree 
> fndecl, tree parms,
>  static void
>  c_parser_omp_declare_target (c_parser *parser)
>  {
> +  location_t loc = c_parser_peek_token (parser)->location;
>c_parser_skip_to_pragma_eol (parser);
>current_omp_declare_target_attribute++;
> +  omp_declare_target_location_stack.safe_push (loc);
>  }
>  
>  static void
> @@ -13104,7 +13110,10 @@ c_parser_omp_end_declare_target (c_parser *parser)
>  error_at (loc, "%<#pragma omp end declare target%> without corresponding 
> "
>  "%<#pragma omp declare target%>");
>else
> -current_omp_declare_target_attribute--;
> +{
> +  current_omp_declare_target_attribute--;
> +  omp_declare_target_location_stack.pop ();
> +}
>  }
>  
>  
> @@ -14267,4 +14276,15 @@ c_parser_array_notation (location_t loc, c_parser 
> *parser, tree initial_index,
>return value_tree;
>  }
>  
> +static void
> +warn_unclosed_pragma_omp_target ()
> +{
> +  int i;
> +  for (i = 0; i < current_omp_declare_target_attribute; i++)
> +warning_at (omp_declare_target_location_stack[i], 0,
> + "%<#pragma omp declare target%> without corresponding "
> + "%<#pragma omp end declare target%>");
> +  omp_declare_target_location_stack.release ();
> +}
> +
>  #include "gt-c-c-parser.h"
> diff --git a/gcc/testsuite/gcc.dg/gomp/target-3.c 
> b/gcc/testsuite/gcc.dg/gomp/target-3.c
> new file mode 100644
> index 000..d50604f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/gomp/target-3.c
> @@ -0,0 +1,33 @@
> +/* { dg-do compile } */
> +/* { dg-options "-fopenmp" } */
> +
> +#pragma omp declare target
> +int tgtv = 6;
> +
> +int
> +tgt (void)
> +{
> +  tgtv++;
> +  return 0;
> +}
> +#pragma omp end declare target
> +
> +#pragma omp declare target/* { dg-warning "'#pragma omp declare 
> target' without corresponding '#pragma omp end declare target'" } */
> +int tgtv1 = 6

[PATCH][match-and-simplify] Refactor the deprecated APIs

2014-08-15 Thread Richard Biener


This removes the deprecated APIs and inlines them into their single
caller.  This publicly exposes the 'code_helper' class, which will
also be needed for manually written simplifiers (that's next).

Bootstrapped on x86_64-unknown-linux-gnu, applied.

Richard.

2014-08-15  Richard Biener  

* gimple-match.h: New file.
* gimple-fold.c: Include gimple-match.h.
(fold_stmt_1): Use stmt-based gimple_simplify API.
(gimple_fold_stmt_to_constant_1): Likewise.
* gimple-match-head.c: Include gimple-match.h.
(class code_helper): Move to gimple-match.h.
(maybe_push_res_to_seq): Export.
(gimple_simplify): Likewise.
(gimple_simplify): New overload for functions with tree arguments.
(gimple_simplify): Remove gsi and SSA name overloads.
* gimple-fold.h (gimple_simplify): Remove gsi and SSA name
overloads.

Index: gcc/gimple-fold.c
===
--- gcc/gimple-fold.c   (revision 214012)
+++ gcc/gimple-fold.c   (working copy)
@@ -55,6 +55,8 @@ along with GCC; see the file COPYING3.
 #include "dbgcnt.h"
 #include "builtins.h"
 #include "output.h"
+#include "gimple-match.h"
+
 
 /* Return true when DECL can be referenced from current unit.
FROM_DECL (if non-null) specify constructor of variable DECL was taken from.
@@ -2698,15 +2700,62 @@ fold_stmt_1 (gimple_stmt_iterator *gsi,
 
   /* Dispatch to pattern-based folding.
  ???  Do this after the previous stuff as fold_stmt is used to make
- stmts valid gimple again via maybe_fold_reference of ops.
- ???  Use a lower-level API using a NULL sequence for inplace
- operation, basically inline gimple_simplify (gsi)
- as we are the only caller.  */
-  if (!inplace
-  && gimple_simplify (gsi, valueize))
-changed = true;
+ stmts valid gimple again via maybe_fold_reference of ops.  */
+  /* ???  Change "inplace" semantics to allow replacing a stmt if
+ no further stmts need to be inserted (basically disallow
+ creating of new SSA names).  */
+  if (inplace
+  && !is_gimple_assign (stmt))
+return changed;
 
-  return changed;
+  gimple_seq seq = NULL;
+  code_helper rcode;
+  tree ops[3] = {};
+  if (!gimple_simplify (stmt, &rcode, ops, inplace ? NULL : &seq, valueize))
+return changed;
+
+  if (is_gimple_assign (stmt)
+  && rcode.is_tree_code ())
+{
+  if (inplace
+ && gimple_num_ops (stmt) <= get_gimple_rhs_num_ops (rcode))
+   return changed;
+  /* Play safe and do not allow abnormals to be mentioned in
+ newly created statements.  */
+  if ((TREE_CODE (ops[0]) == SSA_NAME
+  && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (ops[0]))
+ || (ops[1]
+ && TREE_CODE (ops[1]) == SSA_NAME
+ && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (ops[1]))
+ || (ops[2]
+ && TREE_CODE (ops[2]) == SSA_NAME
+ && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (ops[2])))
+   return changed;
+  gimple_assign_set_rhs_with_ops_1 (gsi, rcode, ops[0], ops[1], ops[2]);
+}
+  else
+{
+  if (inplace)
+   return changed;
+  if (gimple_has_lhs (stmt))
+   {
+ gimple_seq tail = NULL;
+ tree lhs = gimple_get_lhs (stmt);
+ maybe_push_res_to_seq (rcode, TREE_TYPE (lhs),
+ops, &tail, lhs);
+ gcc_assert (gimple_seq_singleton_p (tail));
+ gimple with = gimple_seq_first_stmt (tail);
+ gimple_set_vdef (with, gimple_vdef (stmt));
+ gimple_set_vuse (with, gimple_vuse (stmt));
+ gsi_replace (gsi, with, false);
+   }
+  else
+   gcc_unreachable ();
+}
+
+  if (!inplace)
+gsi_insert_seq_before (gsi, seq, GSI_SAME_STMT);
+  return true;
 }
 
 /* Fold the statement pointed to by GSI.  In some cases, this function may
@@ -4103,24 +4152,29 @@ gimple_fold_stmt_to_constant_2 (gimple s
 tree
 gimple_fold_stmt_to_constant_1 (gimple stmt, tree (*valueize) (tree))
 {
-  tree lhs = gimple_get_lhs (stmt);
-  if (lhs)
-{
-  tree res = gimple_simplify (lhs, NULL, valueize);
-  if (res)
-   {
- if (dump_file && dump_flags & TDF_DETAILS)
-   {
- fprintf (dump_file, "Match-and-simplified ");
- print_gimple_expr (dump_file, stmt, 0, TDF_SLIM);
- fprintf (dump_file, " to ");
- print_generic_expr (dump_file, res, 0);
- fprintf (dump_file, "\n");
-   }
- return res;
-   }
-}
-  /* ???  For now, to avoid regressions.  */
+  code_helper rcode;
+  tree ops[3] = {};
+  if (gimple_simplify (stmt, &rcode, ops, NULL, valueize)
+  && rcode.is_tree_code ()
+  && (TREE_CODE_LENGTH ((tree_code) rcode) == 0
+ || ((tree_code) rcode) == ADDR_EXPR)
+  && is_gimple_val (ops[0]))
+{
+  tree res = ops[0];
+  if (dump_file && dump_flags & TDF_DETAILS)
+   {
+ fprintf (dump_file, "Match-and-sim

[COMMITTED] Add myself to MAINTAINERS file (Write After Approval)

2014-08-15 Thread Ilya Verbin

Hi,

This patch adds myself to the MAINTAINERS file.  Committed as 214017.


Index: MAINTAINERS
===
--- MAINTAINERS (revision 214014)
+++ MAINTAINERS (working copy)
@@ -561,6 +561,7 @@
 David Ung  dav...@mips.com
 Neil Vachharajani  nvach...@gmail.com
 Kris Van Hees  kris.van.h...@oracle.com
+Ilya Verbiniver...@gmail.com
 Kugan Vivekanandarajah kug...@linaro.org   
 Tom de Vries   t...@codesourcery.com
 Nenad Vukicevicne...@intrepid.com


  -- Ilya

Re: [PATCH] Put all constants last in tree_swap_operands_p, remove odd -Os check

2014-08-15 Thread Franz Sirl

On 15.08.2014 at 11:32, Manuel López-Ibáñez wrote:
> On 15 August 2014 11:07, Richard Biener  wrote:
>> -  if (TREE_CODE (arg1) == INTEGER_CST)
>> +  if (CONSTANT_CLASS_P (arg1) == INTEGER_CST)
> 
> Huh?
> 
> /* Nonzero if NODE represents a constant.  */
> 
> #define CONSTANT_CLASS_P(NODE)\
> (TREE_CODE_CLASS (TREE_CODE (NODE)) == tcc_constant)
> 
> Sadly, we don't have a warning for this, but clang++ has one:
> 
> test.c:4:16: warning: comparison of constant 2 with expression of type
> 'bool' is always false [-Wtautological-constant-out-of-range-compare]
>   if ((a == 1) == 2) {
>    ^  ~
> 
> I'll open a PR

See also PR 44077

Franz

[PATCH][RFC][match-and-simplify] "Manually" written patterns

2014-08-15 Thread Richard Biener


The following introduces "manually" written patterns.  That is,
part of the matching and the transform are fully manual.  An
example where this is necessary is when the result isn't really
an "expression" but a series of statements.

For example, take simplifications of the memset builtin.  With
the proposal we could write

(simplify
  (BUILT_IN_MEMSET @1 @2 integer_zerop)
  @1)
(simplify
  (BUILT_IN_MEMSET (addr@1 @4) INTEGER_CST_P@2 tree_fits_uhwi_p@3)
  (if (gimple_simplify_memset (@1, @2, @3, res_code, res_ops, seq, 
valueize)))
  /* Note "result" intentionally omitted.  The predicate if applying is
 supposed to have populated *res_code and *res_ops and seq.  */)

covering the zero-length case with a regular pattern and the rest
with an if-expr predicate that also does the transform.  Note
that part of the argument constraining is done via regular
matching predicates and the pattern is inserted into the decision
tree as usual.
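
The idea of a predicate that also performs the transform can be sketched
outside of GCC as follows.  This is only a toy model under stated
assumptions: toy_store and toy_simplify_memset are hypothetical names, not
part of the patch, and the size rules are simplified stand-ins for the
checks in gimple_simplify_memset.  The function either rejects the call
(returning false and leaving the output untouched) or accepts it and fills
in the replacement, here a single store of the replicated byte value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of the store that replaces the memset call.  */
struct toy_store
{
  size_t width;    /* Number of bytes written by the single store.  */
  uint64_t value;  /* Byte value replicated WIDTH times.  */
};

/* Toy predicate-plus-transform: return true and populate *OUT when
   memset (dest, c, len) on an object of ELEM_SIZE bytes can become one
   store; return false (leaving *OUT alone) otherwise.  */
bool
toy_simplify_memset (int c, size_t len, size_t elem_size,
                     struct toy_store *out)
{
  assert (out != NULL);

  /* The zero-length case belongs to the other, regular pattern.  */
  if (len == 0)
    return false;

  /* Only handle a memset that covers the element exactly.  */
  if (len != elem_size)
    return false;

  /* The replicated value has to fit in the widest integer we model.  */
  if (len > sizeof (uint64_t))
    return false;

  uint64_t byte = (unsigned char) c;
  uint64_t value = 0;
  for (size_t i = 0; i < len; i++)
    value = (value << 8) | byte;

  out->width = len;
  out->value = value;
  return true;
}
```

A caller would treat a false return as "pattern did not match" and move on
to the next candidate, which is the role the (if ...) expression plays in
the proposal.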

What gimple_simplify_memset looks like is visible in the patch.

Note that this exposes the implementation details of the _GIMPLE_
code-path (so the above doesn't even apply to GENERIC - luckily
I've not implemented builtin function simplification for GENERIC
so the above doesn't fall over ;)).

The syntax for the trailing args could be made nicer, but we use
'type' freely as well.

It clearly "abuses" (if ...) but it fits kind-of well.  Makes
simply omitting the result pattern in a regular simplify
fail in interesting ways though...

Caveat: runs into the issue that it's not possible to
query the number of arguments to a function (thus no
re-simplification yet).  I can look up the decl for the
builtin and parse its DECL_ARGUMENTS, but well...
Similar issue exists when parsing built-in calls,
we can't error on not enough arguments.

Status: it builds.

Comments?

Thanks,
Richard.

2014-08-15  Richard Biener  

* match.pd: Add example memset simplification with manual
implemented part.
* gimple-fold.c (gimple_simplify_memset): New function.
* gimple-fold.h (gimple_simplify_memset): Declare.
* gimple-match-head.c (gimple_resimplify): New function.
* genmatch.c (check_no_user_id): Guard against NULL result.
(write_header): Likewise.
(dt_simplify::gen_gimple): Deal with NULL result.
(parse_simplify): Allow missing result.

Index: gcc/match.pd
===
--- gcc/match.pd(revision 214018)
+++ gcc/match.pd(working copy)
@@ -113,6 +113,21 @@ along with GCC; see the file COPYING3.
 #include "match-builtin.pd"
 #include "match-constant-folding.pd"
 
+
+/* "Manual" simplifications but still in the decision tree.
+   Allows us to strip off "easy" parts and (parts of) the
+   pattern/predicate matching.  */
+
+(simplify
+  (BUILT_IN_MEMSET @1 @2 integer_zerop)
+  @1)
+(simplify
+  (BUILT_IN_MEMSET (addr@1 @4) INTEGER_CST_P@2 tree_fits_uhwi_p@3)
+  (if (gimple_simplify_memset (@1, @2, @3, res_code, res_ops, seq, valueize)))
+  /* Note "result" intentionally omitted.  The predicate if applying is
+ supposed to have populated *res_code and *res_ops and seq.  */)
+
+
 /* s
 
We cannot reasonably match vector CONSTRUCTORs or vector constants
Index: gcc/gimple-fold.c
===
--- gcc/gimple-fold.c   (revision 214018)
+++ gcc/gimple-fold.c   (working copy)
@@ -1335,6 +1335,80 @@ gimple_fold_builtin_memset (gimple_stmt_
   return true;
 }
 
+/* Manual simplification example.
+   Fold function call to builtin memset or bzero setting the
+   memory of size LEN to VAL.  Return whether a simplification was made.  */
+
+bool
+gimple_simplify_memset (tree dest, tree c, tree len,
+   code_helper *res_code, tree *res_ops,
+   gimple_seq *seq, tree (*valueize)(tree))
+{
+  tree etype;
+  unsigned HOST_WIDE_INT length, cval;
+
+  if (!seq)
+return false;
+
+  /* If the LEN parameter is zero, this is handled by another pattern.
+ But as they are only differing in predicates we can still arrive
+ here (there isn't a integer_nonzerop).  */
+  if (integer_zerop (len))
+return false;
+
+  gcc_assert (tree_fits_uhwi_p (len));
+
+  gcc_assert (TREE_CODE (c) == INTEGER_CST);
+
+  tree var = dest;
+  gcc_assert (TREE_CODE (var) == ADDR_EXPR);
+
+  var = TREE_OPERAND (var, 0);
+  if (TREE_THIS_VOLATILE (var))
+return false;
+
+  etype = TREE_TYPE (var);
+  if (TREE_CODE (etype) == ARRAY_TYPE)
+etype = TREE_TYPE (etype);
+
+  if (!INTEGRAL_TYPE_P (etype)
+  && !POINTER_TYPE_P (etype))
+return false;
+
+  if (! var_decl_component_p (var))
+return false;
+
+  length = tree_to_uhwi (len);
+  if (GET_MODE_SIZE (TYPE_MODE (etype)) != length
+  || get_pointer_alignment (dest) / BITS_PER_UNIT < length)
+return false;
+
+  if (length > HOST_BITS_PER_WIDE_INT / BITS_PER_UNIT)
+return false;
+
+  if (integer_zerop (c))
+cval =

Re: [PATCH] Put all constants last in tree_swap_operands_p, remove odd -Os check

2014-08-15 Thread Manuel López-Ibáñez

On 15 August 2014 14:43, Franz Sirl  wrote:
> On 15.08.2014 at 11:32, Manuel López-Ibáñez wrote:
>> On 15 August 2014 11:07, Richard Biener  wrote:
>>> -  if (TREE_CODE (arg1) == INTEGER_CST)
>>> +  if (CONSTANT_CLASS_P (arg1) == INTEGER_CST)
>>
>> Huh?
>>
>> /* Nonzero if NODE represents a constant.  */
>>
>> #define CONSTANT_CLASS_P(NODE)\
>> (TREE_CODE_CLASS (TREE_CODE (NODE)) == tcc_constant)
>>
>> Sadly, we don't have a warning for this, but clang++ has one:
>>
>> test.c:4:16: warning: comparison of constant 2 with expression of type
>> 'bool' is always false [-Wtautological-constant-out-of-range-compare]
>>   if ((a == 1) == 2) {
>>    ^  ~
>>
>> I'll open a PR
>
> See also PR 44077

Thanks!

I marked it as duplicate since Marek took the other one.

Hopefully this will be fixed for GCC 5.
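
For readers skimming the thread, the bug class under discussion can be
reproduced in a few lines.  This is a toy sketch, not GCC code: the
toy_code enum, its numeric values and the function names are all made up
for illustration, and the "fixed" variant is only what the hunk presumably
intended.  A predicate yields 0 or 1, so comparing its result against a
tree code whose value is neither 0 nor 1 can never be true.

```c
#include <assert.h>

/* Hypothetical stand-ins for tree codes; the values are arbitrary,
   chosen so that TOY_INTEGER_CST is neither 0 nor 1.  */
enum toy_code
{
  TOY_INTEGER_CST = 2,
  TOY_REAL_CST = 3,
  TOY_PLUS_EXPR = 10
};

/* Toy analogue of a "constant class" predicate: yields 0 or 1.  */
int
toy_constant_class_p (enum toy_code code)
{
  return code == TOY_INTEGER_CST || code == TOY_REAL_CST;
}

/* Buggy form: compares a 0/1 predicate result against a code value,
   so with the values above it is false for every input.  */
int
toy_buggy_is_constant (enum toy_code code)
{
  return toy_constant_class_p (code) == TOY_INTEGER_CST;
}

/* Presumably intended form: use the predicate result directly.  */
int
toy_fixed_is_constant (enum toy_code code)
{
  return toy_constant_class_p (code);
}
```

With these definitions toy_buggy_is_constant returns 0 even for
TOY_INTEGER_CST, which is the "always false" comparison the clang++
warning quoted above points out.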

Re: RFC: Patch for switch elimination (PR 54742)

2014-08-15 Thread Jeff Law


On 08/15/14 04:07, Richard Biener wrote:

On Thu, Aug 14, 2014 at 8:45 PM, Sebastian Pop  wrote:

Steve Ellcey wrote:

I understand the desire not to add optimizations just for benchmarks but
we do know other compilers have added this optimization for coremark
(See
http://community.arm.com/groups/embedded/blog/2013/02/21/coremark-and-compiler-performance)
and the 13 people on the CC list for this bug certainly shows interest in
having it even if it is just for a benchmark.  Does 'competing against other
compilers' sound better than 'optimizing for a benchmark'?


I definitely would like to see GCC trunk do this transform.  What about we
integrate the new pass, and then when jump-threading manages to catch the
coremark loop, we remove the pass?


It never worked that way.

A new pass takes compile-time, if we disable it by default it won't help
coremark (and it will bitrot quickly).

So - please fix DOM instead.

Steve's work is highly likely to be faster than further extending the 
threading code -- that's one of the primary reasons I suggested Steve 
resurrect his work.




Jeff

Re: [PATCH Fortran/Diagnostics] Move Fortran to common diagnostics machinery

2014-08-15 Thread Tobias Burnus


On 06.08.2014 18:50, Manuel López-Ibáñez wrote:

This is the first step for moving Fortran to use the common
diagnostics machinery. This patch makes Fortran use the common
machinery for those warnings that don't have a location or a
controlling option.
Bootstrapped and regression tested on x86_64-linux-gnu.
OK?


Sorry for the belated reply; I was on vacation and, seemingly, so were a 
few others.


The patch looks good to me, except for gfc-diagnostic.def which contains 
the file content twice. With that fixed, the patch is OK. Thanks a lot!


Tobias


2014-08-03  Manuel López-Ibáñez  

 PR fortran/44054
c-family/
 * c-format.c: Handle Fortran flags.
 * diagnostic.c (build_message_string): Make it extern.
 * diagnostic.h (build_message_string): Make it extern.
fortran/
 * gfortran.h: Define GCC_DIAG_STYLE.
 (gfc_diagnostics_init,gfc_warning_cmdline): Declare.
 * trans-array.c: Include gfortran.h before diagnostic-core.h.
 * trans-expr.c: Likewise.
 * trans-openmp.c: Likewise.
 * trans-const.c: Likewise.
 * trans.c: Likewise.
 * trans-types.c: Likewise.
 * f95-lang.c: Likewise.
 * trans-decl.c: Likewise.
 * trans-io.c: Likewise.
 * trans-intrinsic.c: Likewise.
 * error.c: Include diagnostic.h and diagnostic-color.h.
 (gfc_diagnostic_build_prefix): New.
 (gfc_diagnostic_starter): New.
 (gfc_diagnostic_finalizer): New.
 (gfc_warning_cmdline): New.
 (gfc_diagnostics_init): New.
 * gfc-diagnostic.def: New.
 * options.c (gfc_init_options): Call gfc_diagnostics_init.
 (gfc_post_options): Use gfc_warning_cmdline.

Re: [patch, testsuite] Applying non_bionic effective target to particular tests

2014-08-15 Thread Alexander Ivchenko

Hi Joseph,

I believe I have addressed your comments.

However, I kept the non_bionic check for three tests
(namely builtins-67.c, strlenopt-14g.c and strlenopt-14gf.c),

because dedicated checks for the presence of mempcpy, stpcpy and rintl
(sorry, I didn't mention it last time) would be very narrow; I don't
think they would bring any value.


I tested 'make check' on x86_64-unknown-linux-gnu and i686-pc-linux-android.


2014-08-15  Alexander Ivchenko  

   * lib/target-supports.exp (error_h): New check.
   (libc_has_complex_functions): Ditto.
   (tgmath_h): Ditto.
   * gcc.dg/builtins-59.c: Add libc_has_complex_functions check.
   * gcc.dg/builtins-61.c: Likewise.
   * gcc.dg/builtins-67.c: Disable test for Bionic.
   * gcc.dg/strlenopt-14g.c: Likewise.
   * gcc.dg/strlenopt-14gf.c: Likewise.
   * gcc.dg/c99-tgmath-1.c: Add tgmath_h check.
   * gcc.dg/c99-tgmath-2.c: Likewise.
   * gcc.dg/c99-tgmath-3.c: Likewise.
   * gcc.dg/c99-tgmath-4.c: Likewise.
   * gcc.dg/dfp/convert-dfp-round-thread.c: Add error_h check.


Here is the updated patch:





diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 505df55..02268e6 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,19 @@
+2014-08-15  Alexander Ivchenko  
+
+ * lib/target-supports.exp (error_h): New check.
+ (libc_has_complex_functions): Ditto.
+ (tgmath_h): Ditto.
+ * gcc.dg/builtins-59.c: Add libc_has_complex_functions check.
+ * gcc.dg/builtins-61.c: Likewise.
+ * gcc.dg/builtins-67.c: Disable test for Bionic.
+ * gcc.dg/strlenopt-14g.c: Likewise.
+ * gcc.dg/strlenopt-14gf.c: Likewise.
+ * gcc.dg/c99-tgmath-1.c: Add tgmath_h check.
+ * gcc.dg/c99-tgmath-2.c: Likewise.
+ * gcc.dg/c99-tgmath-3.c: Likewise.
+ * gcc.dg/c99-tgmath-4.c: Likewise.
+ * gcc.dg/dfp/convert-dfp-round-thread.c: Add error_h check.
+
 2014-08-15  Jakub Jelinek  
 Tobias Burnus  

diff --git a/gcc/testsuite/gcc.dg/builtins-59.c
b/gcc/testsuite/gcc.dg/builtins-59.c
index b940d39..f5c1803 100644
--- a/gcc/testsuite/gcc.dg/builtins-59.c
+++ b/gcc/testsuite/gcc.dg/builtins-59.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
 /* { dg-options "-fdump-tree-gimple" } */
 /* { dg-require-effective-target c99_runtime } */
+/* { dg-require-effective-target libc_has_complex_functions } */

 double test (double x)
 {
diff --git a/gcc/testsuite/gcc.dg/builtins-61.c
b/gcc/testsuite/gcc.dg/builtins-61.c
index dff163f..a3310af 100644
--- a/gcc/testsuite/gcc.dg/builtins-61.c
+++ b/gcc/testsuite/gcc.dg/builtins-61.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
 /* { dg-options "-O -ffast-math -fdump-tree-optimized" } */
 /* { dg-require-effective-target c99_runtime } */
+/* { dg-require-effective-target libc_has_complex_functions } */

 double test1 (double x)
 {
diff --git a/gcc/testsuite/gcc.dg/builtins-67.c
b/gcc/testsuite/gcc.dg/builtins-67.c
index 22267bd..0992fe1 100644
--- a/gcc/testsuite/gcc.dg/builtins-67.c
+++ b/gcc/testsuite/gcc.dg/builtins-67.c
@@ -3,6 +3,8 @@
 /* { dg-do link } */
 /* { dg-options "-ffast-math -lm" }  */
 /* { dg-add-options c99_runtime } */
+/* Bionic doesn't have rintl */
+/* { dg-require-effective-target non_bionic } */

 #include "builtins-config.h"

diff --git a/gcc/testsuite/gcc.dg/c99-tgmath-1.c
b/gcc/testsuite/gcc.dg/c99-tgmath-1.c
index c7d848c..cfa02a9 100644
--- a/gcc/testsuite/gcc.dg/c99-tgmath-1.c
+++ b/gcc/testsuite/gcc.dg/c99-tgmath-1.c
@@ -3,6 +3,7 @@
 /* { dg-do preprocess { target c99_runtime } } */
 /* { dg-options "-std=iso9899:1999" } */
 /* { dg-add-options c99_runtime } */
+/* { dg-require-effective-target tgmath_h } */

 /* Test that tgmath defines the macros it's supposed to. */
 #include 
diff --git a/gcc/testsuite/gcc.dg/c99-tgmath-2.c
b/gcc/testsuite/gcc.dg/c99-tgmath-2.c
index d4f1f87..1a1153c 100644
--- a/gcc/testsuite/gcc.dg/c99-tgmath-2.c
+++ b/gcc/testsuite/gcc.dg/c99-tgmath-2.c
@@ -3,6 +3,7 @@
 /* { dg-do compile { target c99_runtime } } */
 /* { dg-options "-std=iso9899:1999" } */
 /* { dg-add-options c99_runtime } */
+/* { dg-require-effective-target tgmath_h } */

 /* Test that invoking type-generic sin on a float invokes sinf. */
 #include 
diff --git a/gcc/testsuite/gcc.dg/c99-tgmath-3.c
b/gcc/testsuite/gcc.dg/c99-tgmath-3.c
index 3e98304..a595cf6 100644
--- a/gcc/testsuite/gcc.dg/c99-tgmath-3.c
+++ b/gcc/testsuite/gcc.dg/c99-tgmath-3.c
@@ -3,6 +3,7 @@
 /* { dg-do compile { target c99_runtime } } */
 /* { dg-options "-std=iso9899:1999" } */
 /* { dg-add-options c99_runtime } */
+/* { dg-require-effective-target tgmath_h } */

 /* Test that invoking type-generic exp on a complex invokes cexp. */
 #include 
diff --git a/gcc/testsuite/gcc.dg/c99-tgmath-4.c
b/gcc/testsuite/gcc.dg/c99-tgmath-4.c
index d8dc043..c05a1c5 100644
--- a/gcc/testsuite/gcc.dg/c99-tgmath-4.c
+++ b/gcc/testsuite/gcc.dg/c99-tgmath-4.c
@@ -3,6 +3,7 @@
 /* { dg-do compile { target c99_runtime } } */
 /* { dg-options "-std=iso9899:1999" } */
 /* { dg-add-options c99_runtime } */
+/* { dg-require

[PR c/52952] More precise locations for Wformat

2014-08-15 Thread Manuel López-Ibáñez

Hi,

This patch moves the location of various Wformat warnings from
pointing to the first character of the function name (such as printf)
to the actual format string. This is especially useful when you have
something like

printf (cond ? "format 1" : "format 2");

It also moves some of the warnings about too many arguments to the
first argument that is unused. Since the format string and/or the
arguments might not be expressions or strings, this doesn't always
work, and then we are back to the status quo.
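
To illustrate why a per-leaf location helps in the conditional case, here
is a small self-contained sketch.  It is a toy model only: toy_expr,
toy_count_specs and toy_mismatch_column are hypothetical names, columns
are plain integers rather than location_t, and the "check" merely counts
conversion specifications.  The point is that the walk over the leaves of
the format expression can report the column of the offending literal
instead of the column of the call.

```c
#include <assert.h>
#include <stddef.h>

enum toy_kind { TOY_LITERAL, TOY_COND };

/* Hypothetical format-argument expression: either a string literal or
   a conditional with two sub-expressions.  */
struct toy_expr
{
  enum toy_kind kind;
  int column;                       /* Column where the expression starts.  */
  const char *text;                 /* Literal text (TOY_LITERAL only).  */
  const struct toy_expr *if_true;   /* Operands (TOY_COND only).  */
  const struct toy_expr *if_false;
};

/* Count conversion specifications in FMT, treating "%%" as a literal
   percent sign and ignoring a trailing lone '%'.  */
int
toy_count_specs (const char *fmt)
{
  int count = 0;
  for (size_t i = 0; fmt[i] != '\0'; i++)
    {
      if (fmt[i] != '%')
        continue;
      if (fmt[i + 1] == '%')
        {
          i++;
          continue;
        }
      if (fmt[i + 1] != '\0')
        count++;
    }
  return count;
}

/* Return the column of the first literal leaf whose number of
   conversion specifications differs from NARGS, or -1 if every leaf
   matches.  */
int
toy_mismatch_column (const struct toy_expr *e, int nargs)
{
  assert (e != NULL);
  if (e->kind == TOY_LITERAL)
    return toy_count_specs (e->text) == nargs ? -1 : e->column;

  int col = toy_mismatch_column (e->if_true, nargs);
  return col != -1 ? col : toy_mismatch_column (e->if_false, nargs);
}
```

For a conditional whose second literal is the bad one, the returned column
is that of the second literal, not that of the enclosing call.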

This is anyway a first step. I have some patches to point within the
format string for some cases. This is why I added column markers to
testcases even if we don't actually improve the location: it will
help us to identify testcases that need updating as the location info
improves.

Unfortunately, reaching the level of Clang seems well beyond the time
I can dedicate to this, so any help would be appreciated.

OK for trunk?


gcc/c-family/ChangeLog:

2014-08-15  Manuel López-Ibáñez  

PR c/52952
* c-format.c: Add extra_arg_loc and format_string_loc to struct
format_check_results.
(check_function_format): Use true and add comment for boolean
argument.
(finish_dollar_format_checking): Use explicit location when warning.
(check_format_info): Likewise.
(check_format_arg): Set extra_arg_loc and format_string_loc.
(check_format_info_main): Use explicit location when warning.
(check_format_types): Pass explicit location.
(format_type_warning): Likewise.

gcc/testsuite/ChangeLog:

2014-08-15  Manuel López-Ibáñez  

PR c/52952
* gcc.dg/redecl-4.c: Add column markers.
* gcc.dg/format/bitfld-1.c: Likewise.
* gcc.dg/format/attr-2.c: Likewise.
* gcc.dg/format/attr-6.c: Likewise.
* gcc.dg/format/array-1.c: Likewise.
* gcc.dg/format/attr-7.c: Likewise.
* gcc.dg/format/asm_fprintf-1.c: Likewise.
* gcc.dg/format/attr-4.c: Likewise.
* gcc.dg/format/branch-1.c: Likewise.
* gcc.dg/format/c90-printf-1.c: Likewise.
Index: gcc/c-family/c-format.c
===
--- gcc/c-family/c-format.c (revision 213927)
+++ gcc/c-family/c-format.c (working copy)
@@ -894,10 +894,11 @@ typedef struct
  as they were not string literals.  */
   int number_non_literal;
   /* Number of leaves of the format argument that were null pointers or
  string literals, but had extra format arguments.  */
   int number_extra_args;
+  location_t extra_arg_loc;
   /* Number of leaves of the format argument that were null pointers or
  string literals, but had extra format arguments and used $ operand
  numbers.  */
   int number_dollar_extra_args;
   /* Number of leaves of the format argument that were wide string
@@ -908,10 +909,12 @@ typedef struct
   /* Number of leaves of the format argument that were unterminated
  strings.  */
   int number_unterminated;
   /* Number of leaves of the format argument that were not counted above.  */
   int number_other;
+  /* Location of the format string.  */
+  location_t format_string_loc;
 } format_check_results;
 
 typedef struct
 {
   format_check_results *res;
@@ -953,12 +956,12 @@ static bool avoid_dollar_number (const c
 static void finish_dollar_format_checking (format_check_results *, int);
 
 static const format_flag_spec *get_flag_spec (const format_flag_spec *,
  int, const char *);
 
-static void check_format_types (format_wanted_type *);
-static void format_type_warning (format_wanted_type *, tree, tree);
+static void check_format_types (location_t, format_wanted_type *);
+static void format_type_warning (location_t, format_wanted_type *, tree, tree);
 
 /* Decode a format type from a string, returning the type, or
format_type_error if not valid, in which case the caller should print an
error message.  */
 static int
@@ -1001,11 +1004,11 @@ check_function_format (tree attrs, int n
 {
   if (is_attribute_p ("format", TREE_PURPOSE (a)))
{
  /* Yup; check it.  */
  function_format_info info;
- decode_format_attr (TREE_VALUE (a), &info, 1);
+ decode_format_attr (TREE_VALUE (a), &info, /*validated=*/true);
  if (warn_format)
{
  /* FIXME: Rewrite all the internal functions in this file
 to use the ARGARRAY directly instead of constructing this
 temporary list.  */
@@ -1255,13 +1258,13 @@ finish_dollar_format_checking (format_ch
{
  if (pointer_gap_ok && (dollar_first_arg_num == 0
 || dollar_arguments_pointer_p[i]))
found_pointer_gap = true;
  else
-   warning (OPT_Wformat_,
-"format argument %d unused before used argument %d in 
$-style format",
-i + 1, dollar_max_arg_used);
+   warning_at (res->format_string_loc, OPT_Wformat_,
+   "format a

Re: [PATCH PR62011]

2014-08-15 Thread Uros Bizjak

On Fri, Aug 15, 2014 at 2:26 PM, Yuri Rumyantsev  wrote:

> Is it important to have a correct value for the length attribute for Big Cores?
> As far as I know, this attribute is used for code layout alignment.
>
> 2014-08-15 15:54 GMT+04:00 Jakub Jelinek :
>> On Fri, Aug 15, 2014 at 03:45:33PM +0400, Yuri Rumyantsev wrote:
>>> gcc/ChangeLog
>>> 2014-08-15  Yuri Rumyantsev  
>>>
>>> PR target/62011
>>> * config/i386/i386-protos.h (ix86_avoid_false_dep_for_bm): New function
>>>  prototype.
>>> * config/i386/i386.c (ix86_avoid_false_dep_for_bm): New function.
>>> * config/i386/i386.h (TARGET_AVOID_FALSE_DEP_FOR_BM) New macros.
>>> * config/i386/i386.md (ctz2, clz2_lzcnt, popcount2,
>>>  *popcount2_cmp, *popcountsi2_cmp_zext): Output zeroing
>>>  destination register for unary bit-manipulation instructions
>>>  if required.
>>> * config/i386/x86-tune.def (X86_TUNE_AVOID_FALSE_DEP_FOR_BM): New.

I am testing a different approach, outlined in the attached patch. In
the patch, the insn is split after reload into separate insns.

As far as popcnt is concerned, we don't need the _cmp pattern; the generic
code is clever enough to substitute "if (popcnt (a))" with "if (a)".
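
The equivalence behind that substitution is easy to check in plain C.  The
sketch below is illustrative only: toy_popcount is a portable loop written
for this example (not the builtin and not the popcnt insn), and the two
wrapper names are made up.  A value has a nonzero population count exactly
when the value itself is nonzero, so the comparison against zero never
needs the count.

```c
#include <stdbool.h>

/* Portable population count: clears the lowest set bit each round.  */
int
toy_popcount (unsigned int x)
{
  int n = 0;
  while (x != 0)
    {
      x &= x - 1;
      n++;
    }
  return n;
}

/* "if (popcnt (a))" spelled out.  */
bool
toy_nonzero_via_popcount (unsigned int a)
{
  return toy_popcount (a) != 0;
}

/* "if (a)" spelled out; agrees with the function above for every A.  */
bool
toy_nonzero_direct (unsigned int a)
{
  return a != 0;
}
```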

Uros.
Index: i386.h
===
--- i386.h  (revision 214000)
+++ i386.h  (working copy)
@@ -473,6 +473,8 @@ extern unsigned char ix86_tune_features[X86_TUNE_L
ix86_tune_features[X86_TUNE_SPLIT_MEM_OPND_FOR_FP_CONVERTS]
 #define TARGET_ADJUST_UNROLL \
 ix86_tune_features[X86_TUNE_ADJUST_UNROLL]
+#define TARGET_AVOID_FALSE_DEP_FOR_BMI \
+   ix86_tune_features[X86_TUNE_AVOID_FALSE_DEP_FOR_BMI]
 
 /* Feature tests against the various architecture variations.  */
 enum ix86_arch_indices {
Index: i386.md
===
--- i386.md (revision 214000)
+++ i386.md (working copy)
@@ -112,6 +112,7 @@
   UNSPEC_XBEGIN_ABORT
   UNSPEC_STOS
   UNSPEC_PEEPSIB
+  UNSPEC_INSN_FALSE_DEP
 
   ;; For SSE/MMX support:
   UNSPEC_FIX_NOTRUNC
@@ -12569,11 +12570,37 @@
(set_attr "prefix_0f" "1")
(set_attr "mode" "HI")])
 
-(define_insn "popcount2"
-  [(set (match_operand:SWI248 0 "register_operand" "=r")
-   (popcount:SWI248
- (match_operand:SWI248 1 "nonimmediate_operand" "rm")))
+(define_expand "popcount2"
+  [(parallel
+[(set (match_operand:SWI248 0 "register_operand")
+ (popcount:SWI248
+   (match_operand:SWI248 1 "nonimmediate_operand")))
+ (clobber (reg:CC FLAGS_REG))])]
+  "TARGET_POPCNT")
+
+(define_insn_and_split "*popcount2_falsedep_1"
+  [(set (match_operand:SWI48 0 "register_operand" "=&r")
+   (popcount:SWI48
+ (match_operand:SWI48 1 "nonimmediate_operand" "rm")))
(clobber (reg:CC FLAGS_REG))]
+  "TARGET_POPCNT
+   && TARGET_AVOID_FALSE_DEP_FOR_BMI && optimize_insn_for_speed_p ()"
+  "#"
+  "&& reload_completed"
+  [(parallel
+[(set (match_dup 0)
+ (popcount:SWI48 (match_dup 1)))
+ (unspec [(match_dup 0)] UNSPEC_INSN_FALSE_DEP)
+ (clobber (reg:CC FLAGS_REG))])]
+  "ix86_expand_clear (operands[0]);")
+
+(define_insn "*popcount2_falsedep"
+  [(set (match_operand:SWI48 0 "register_operand" "=r")
+   (popcount:SWI48
+ (match_operand:SWI48 1 "nonimmediate_operand" "rm")))
+   (unspec [(match_operand:SWI48 2 "register_operand" "0")]
+  UNSPEC_INSN_FALSE_DEP)
+   (clobber (reg:CC FLAGS_REG))]
   "TARGET_POPCNT"
 {
 #if TARGET_MACHO
@@ -12586,15 +12613,12 @@
(set_attr "type" "bitmanip")
(set_attr "mode" "")])
 
-(define_insn "*popcount2_cmp"
-  [(set (reg FLAGS_REG)
-   (compare
- (popcount:SWI248
-   (match_operand:SWI248 1 "nonimmediate_operand" "rm"))
- (const_int 0)))
-   (set (match_operand:SWI248 0 "register_operand" "=r")
-   (popcount:SWI248 (match_dup 1)))]
-  "TARGET_POPCNT && ix86_match_ccmode (insn, CCZmode)"
+(define_insn "*popcount2"
+  [(set (match_operand:SWI248 0 "register_operand" "=r")
+   (popcount:SWI248
+ (match_operand:SWI248 1 "nonimmediate_operand" "rm")))
+   (clobber (reg:CC FLAGS_REG))]
+  "TARGET_POPCNT"
 {
 #if TARGET_MACHO
   return "popcnt\t{%1, %0|%0, %1}";
@@ -12606,25 +12630,6 @@
(set_attr "type" "bitmanip")
(set_attr "mode" "")])
 
-(define_insn "*popcountsi2_cmp_zext"
-  [(set (reg FLAGS_REG)
-(compare
-  (popcount:SI (match_operand:SI 1 "nonimmediate_operand" "rm"))
-  (const_int 0)))
-   (set (match_operand:DI 0 "register_operand" "=r")
-(zero_extend:DI(popcount:SI (match_dup 1]
-  "TARGET_64BIT && TARGET_POPCNT && ix86_match_ccmode (insn, CCZmode)"
-{
-#if TARGET_MACHO
-  return "popcnt\t{%1, %0|%0, %1}";
-#else
-  return "popcnt{l}\t{%1, %0|%0, %1}";
-#endif
-}
-  [(set_attr "prefix_rep" "1")
-   (set_attr "type" "bitmanip")
-   (set_attr "mode" "SI")])
-
 (define_expand "bswapdi2"
   [(set (match_operand:DI 0 "register_operand")
(bswap:DI (match_operand:DI 1 "nonimmediate_operand")))]
Index: x86-tune.def
=

Re: [PATCH] Fix for PR/62089 (enable missing Asan checks)

2014-08-15 Thread Yury Gribov


On 08/14/2014 12:04 PM, Jakub Jelinek wrote:

No, this should be if, not else if, and be after the } below.
We really can't handle it otherwise.
Generally, the bitfield COMPONENT_REFs should have
DECL_BIT_FIELD_REPRESENTATIVE which is not a bitfield, therefore the common
case will be handled.


Makes sense, I've attached a new patch (retested as usual).
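
The order of checks in the revised instrument_derefs hunk can be modelled
as a small decision function.  This is a sketch under assumptions:
toy_action and toy_classify are hypothetical names, the boolean parameter
stands in for "the field has a DECL_BIT_FIELD_REPRESENTATIVE", and
CHAR_BIT stands in for BITS_PER_UNIT.  The representative check comes
first and is no longer nested inside the alignment/size test; only
afterwards are unaligned or partially covered accesses skipped.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

enum toy_action
{
  TOY_INSTRUMENT_REPRESENTATIVE,  /* Check the representative field.  */
  TOY_SKIP,                       /* Cannot be handled, emit no check.  */
  TOY_INSTRUMENT_ACCESS           /* Check the access itself.  */
};

/* Toy model of the decision order: representative first, then the
   byte-alignment and full-coverage test.  */
enum toy_action
toy_classify (bool has_representative, unsigned bitpos, unsigned bitsize,
              unsigned size_in_bytes)
{
  assert (size_in_bytes > 0);

  if (has_representative)
    return TOY_INSTRUMENT_REPRESENTATIVE;

  if (bitpos % CHAR_BIT != 0 || bitsize != size_in_bytes * CHAR_BIT)
    return TOY_SKIP;

  return TOY_INSTRUMENT_ACCESS;
}
```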

-Y
commit 77f65357c65fd86650027ce9498c4960953a2760
Author: Yury Gribov 
Date:   Mon Aug 11 15:09:45 2014 +0400

2014-08-15  Yury Gribov  

gcc/
	PR sanitizer/62089
	* asan.c (instrument_derefs): Fix bitfield check.

gcc/testsuite/
	PR sanitizer/62089
	* c-c++-common/asan/pr62089.c: New test.
	* c-c++-common/asan/bitfield-1.c: New test.
	* c-c++-common/asan/bitfield-2.c: New test.
	* c-c++-common/asan/bitfield-3.c: New test.
	* c-c++-common/asan/bitfield-4.c: New test.

diff --git a/gcc/asan.c b/gcc/asan.c
index 4e6f438..15c0737 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -1690,21 +1690,19 @@ instrument_derefs (gimple_stmt_iterator *iter, tree t,
   int volatilep = 0, unsignedp = 0;
   tree inner = get_inner_reference (t, &bitsize, &bitpos, &offset,
 &mode, &unsignedp, &volatilep, false);
-  if (((size_in_bytes & (size_in_bytes - 1)) == 0
-   && (bitpos % (size_in_bytes * BITS_PER_UNIT)))
-  || bitsize != size_in_bytes * BITS_PER_UNIT)
+
+  if (TREE_CODE (t) == COMPONENT_REF
+  && DECL_BIT_FIELD_REPRESENTATIVE (TREE_OPERAND (t, 1)) != NULL_TREE)
 {
-  if (TREE_CODE (t) == COMPONENT_REF
-	  && DECL_BIT_FIELD_REPRESENTATIVE (TREE_OPERAND (t, 1)) != NULL_TREE)
-	{
-	  tree repr = DECL_BIT_FIELD_REPRESENTATIVE (TREE_OPERAND (t, 1));
-	  instrument_derefs (iter, build3 (COMPONENT_REF, TREE_TYPE (repr),
-	   TREE_OPERAND (t, 0), repr,
-	   NULL_TREE), location, is_store);
-	}
+  tree repr = DECL_BIT_FIELD_REPRESENTATIVE (TREE_OPERAND (t, 1));
+  instrument_derefs (iter, build3 (COMPONENT_REF, TREE_TYPE (repr),
+   TREE_OPERAND (t, 0), repr,
+   NULL_TREE), location, is_store);
   return;
 }
-  if (bitpos % BITS_PER_UNIT)
+
+  if (bitpos % BITS_PER_UNIT
+  || bitsize != size_in_bytes * BITS_PER_UNIT)
 return;
 
   if (TREE_CODE (inner) == VAR_DECL
diff --git a/gcc/testsuite/c-c++-common/asan/bitfield-1.c b/gcc/testsuite/c-c++-common/asan/bitfield-1.c
new file mode 100644
index 000..b3f300c
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/asan/bitfield-1.c
@@ -0,0 +1,25 @@
+/* Check that Asan correctly instruments bitfields with non-round size.  */
+
+/* { dg-do run } */
+/* { dg-shouldfail "asan" } */
+
+struct A
+{
+  char base;
+  int : 4;
+  long x : 7;
+};
+
+int __attribute__ ((noinline, noclone))
+f (void *p) {
+  return ((struct A *)p)->x;
+}
+
+int
+main ()
+{
+  char a = 0;
+  return f (&a);
+}
+
+/* { dg-output "ERROR: AddressSanitizer: stack-buffer-overflow" } */
diff --git a/gcc/testsuite/c-c++-common/asan/bitfield-2.c b/gcc/testsuite/c-c++-common/asan/bitfield-2.c
new file mode 100644
index 000..8ab0f80
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/asan/bitfield-2.c
@@ -0,0 +1,25 @@
+/* Check that Asan correctly instruments bitfields with non-round offset.  */
+
+/* { dg-do run } */
+/* { dg-shouldfail "asan" } */
+
+struct A
+{
+  char base;
+  int : 7;
+  int x : 8;
+};
+
+int __attribute__ ((noinline, noclone))
+f (void *p) {
+  return ((struct A *)p)->x;
+}
+
+int
+main ()
+{
+  char a = 0;
+  return f (&a);
+}
+
+/* { dg-output "ERROR: AddressSanitizer: stack-buffer-overflow" } */
diff --git a/gcc/testsuite/c-c++-common/asan/bitfield-3.c b/gcc/testsuite/c-c++-common/asan/bitfield-3.c
new file mode 100644
index 000..c590778
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/asan/bitfield-3.c
@@ -0,0 +1,25 @@
+/* Check that Asan correctly instruments bitfields with round offset.  */
+
+/* { dg-do run } */
+/* { dg-shouldfail "asan" } */
+
+struct A
+{
+  char base;
+  int : 8;
+  int x : 8;
+};
+
+int __attribute__ ((noinline, noclone))
+f (void *p) {
+  return ((struct A *)p)->x;
+}
+
+int
+main ()
+{
+  char a = 0;
+  return f (&a);
+}
+
+/* { dg-output "ERROR: AddressSanitizer: stack-buffer-overflow" } */
diff --git a/gcc/testsuite/c-c++-common/asan/bitfield-4.c b/gcc/testsuite/c-c++-common/asan/bitfield-4.c
new file mode 100644
index 000..94de9a4
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/asan/bitfield-4.c
@@ -0,0 +1,25 @@
+/* Check that Asan correctly instruments bitfields with round offset.  */
+
+/* { dg-do run } */
+/* { dg-shouldfail "asan" } */
+
+struct A
+{
+  char base;
+  int : 0;
+  int x : 8;
+};
+
+int __attribute__ ((noinline, noclone))
+f (void *p) {
+  return ((struct A *)p)->x;
+}
+
+int
+main ()
+{
+  char a = 0;
+  return f (&a);
+}
+
+/* { dg-output "ERROR: AddressSanitizer: stack-buffer-overflow" } */
diff --git a/gcc/testsuite/c-c++-common/asan/pr62089.c b/gcc/testsuite/c-c++-common/asan/pr62089.c
new file mode 100644
index 000..22b877b
--- /dev/null
+++ b/gcc/testsuite/c-c++-com

Re: [PATCH] Fix for PR/62089 (enable missing Asan checks)

2014-08-15 Thread Jakub Jelinek

On Fri, Aug 15, 2014 at 06:53:25PM +0400, Yury Gribov wrote:
> 2014-08-15  Yury Gribov  
> 
> gcc/
>   PR sanitizer/62089
>   * asan.c (instrument_derefs): Fix bitfield check.
> 
> gcc/testsuite/
>   PR sanitizer/62089
>   * c-c++-common/asan/pr62089.c: New test.
>   * c-c++-common/asan/bitfield-1.c: New test.
>   * c-c++-common/asan/bitfield-2.c: New test.
>   * c-c++-common/asan/bitfield-3.c: New test.
>   * c-c++-common/asan/bitfield-4.c: New test.

Ok.

Jakub
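
The control-flow change in instrument_derefs is easier to follow when the
two versions are reduced to plain predicates.  The sketch below is a
simplified model read off the diff above, not GCC code: has_representative
stands for "t is a COMPONENT_REF whose field has a
DECL_BIT_FIELD_REPRESENTATIVE", and the Action enum, the function names and
kBitsPerUnit are invented for illustration.  CONTINUE only means that the
early-exit tests pass; the later checks in instrument_derefs are not
modelled.

```cpp
// Simplified model of the early exits in instrument_derefs, before and
// after the patch.  All names here are hypothetical.
enum Action { SKIP, CHECK_REPRESENTATIVE, CONTINUE };

const long kBitsPerUnit = 8;  // assumption: 8-bit bytes

// Old logic: the bit-field test was nested inside the size/alignment test,
// so a non-bit-field access at an offset that is not a multiple of its
// power-of-two size was dropped without any check.
Action action_before(bool has_representative, long bitpos, long bitsize,
                     long size_in_bytes)
{
  bool pow2 = (size_in_bytes & (size_in_bytes - 1)) == 0;
  if ((pow2 && bitpos % (size_in_bytes * kBitsPerUnit) != 0)
      || bitsize != size_in_bytes * kBitsPerUnit)
    return has_representative ? CHECK_REPRESENTATIVE : SKIP;
  if (bitpos % kBitsPerUnit != 0)
    return SKIP;
  return CONTINUE;
}

// New logic: bit-fields are always redirected to their representative;
// everything else only needs to be byte-aligned and of full size.
Action action_after(bool has_representative, long bitpos, long bitsize,
                    long size_in_bytes)
{
  if (has_representative)
    return CHECK_REPRESENTATIVE;
  if (bitpos % kBitsPerUnit != 0
      || bitsize != size_in_bytes * kBitsPerUnit)
    return SKIP;
  return CONTINUE;
}
```

Under this model, a 4-byte non-bit-field access at bit offset 8 is skipped
by the old code and reaches the remaining checks with the new code.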

Re: [PR c/52952] More precise locations for Wformat

2014-08-15 Thread Manuel López-Ibáñez

I should have pointed out that this is based on a preliminary patch by
Steven attached to PR 52952. I will update the Changelog to reflect
this when committing.

Cheers,

Manuel.


On 15 August 2014 15:51, Manuel López-Ibáñez  wrote:
> Hi,
>
> This patch moves the location of various Wformat warnings from
> pointing to the first character of the function name (such as printf)
> to the actual format string. This is especially useful when you have
> something like
>
> printf (cond ? "format 1" : "format 2");
>
> It also moves (some) of the warnings about too many arguments to the
> first argument that is unused. Since the format string and/or the
> arguments might not be expressions or strings, this doesn't always
> work and then we are back to the status-quo.
>
> This is anyway a first step. I have some patches to point within the
> format string for some cases. This is why I added column markers to
> testcases even if we don't actually improve the location: it will
> help us to identify testcases that need updating as the location info
> improves.
>
> Unfortunately, reaching the level of Clang seems well beyond the time
> I can dedicate to this, so any help would be appreciated.
>
> OK for trunk?
>
>
> gcc/c-family/ChangeLog:
>
> 2014-08-15  Manuel López-Ibáñez  
>
> PR c/52952
> * c-format.c: Add extra_arg_loc and format_string_loc to struct
> format_check_results.
> (check_function_format): Use true and add comment for boolean
> argument.
> (finish_dollar_format_checking): Use explicit location when warning.
> (check_format_info): Likewise.
> (check_format_arg): Set extra_arg_loc and format_string_loc.
> (check_format_info_main): Use explicit location when warning.
> (check_format_types): Pass explicit location.
> (format_type_warning): Likewise.
>
> gcc/testsuite/ChangeLog:
>
> 2014-08-15  Manuel López-Ibáñez  
>
> PR c/52952
> * gcc.dg/redecl-4.c: Add column markers.
> * gcc.dg/format/bitfld-1.c: Likewise.
> * gcc.dg/format/attr-2.c: Likewise.
> * gcc.dg/format/attr-6.c: Likewise.
> * gcc.dg/format/array-1.c: Likewise.
> * gcc.dg/format/attr-7.c: Likewise.
> * gcc.dg/format/asm_fprintf-1.c: Likewise.
> * gcc.dg/format/attr-4.c: Likewise.
> * gcc.dg/format/branch-1.c: Likewise.
> * gcc.dg/format/c90-printf-1.c: Likewise.

Re: [PATCH, Pointer Bounds Checker 3/x] Target hooks for Pointer Bounds Checker

2014-08-15 Thread Ilya Enkovich

On 17 Jul 03:36, Jeff Law wrote:
> On 04/16/14 05:52, Ilya Enkovich wrote:
> >diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> >index b8ca17e..d868129 100644
> >--- a/gcc/doc/tm.texi
> >+++ b/gcc/doc/tm.texi
> >@@ -4333,6 +4333,13 @@ This hook returns the va_list type of the calling 
> >convention specified by
> >  The default version of this hook returns @code{va_list_type_node}.
> >  @end deftypefn
> >
> >+@deftypefn {Target Hook} tree TARGET_FN_ABI_VA_LIST_BOUNDS_SIZE (tree 
> >@var{fndecl})
> >+This hook returns size for @code{va_list} object in function specified
> >+by @var{fndecl}.  This hook is used by Pointer Bounds Checker to build 
> >bounds
> >+for @code{va_list} object.  Return @code{integer_zero_node} if no bounds
> >+should be used (e.g. @code{va_list} is a scalar pointer to the stack).
> >+@end deftypefn
> What if va_list is an aggregate, but lives in registers?  I'm not
> familiar with the different va_list implementations on all the
> targets, but GCC has supported aggregates in registers for various
> ABIs through the years.
> 
> >+@deftypefn {Built-in Function} size_t __chkp_sizeof (const void *@var{ptr})
> >+Function code - @code{BUILT_IN_CHKP_SIZEOF}.  This built-in function
> >+returns size of object referenced by @var{ptr}. @var{ptr} is always
> >+@code{ADDR_EXPR} of @code{VAR_DECL}.  This built-in is used by
> >+Pointer Boudns Checker when bounds of object cannot be computed statically
> >+(e.g. object has incomplete type).
> s/Boudns/Bounds/
> 
> OK for the trunk with those two doc fixes.  As with the other
> patches, wait for the remainder to be approved before committing.
> 
> jeff
> 

Thanks for comments!

TARGET_FN_ABI_VA_LIST_BOUNDS_SIZE was supposed to be used when va_list is a 
pointer to a structure holding args (as it is for x86_64 where we have a 
structure holding all incoming registers).  I decided to remove 
TARGET_FN_ABI_VA_LIST_BOUNDS_SIZE hook because all loads from va_list are 
generated by compiler and should be safely within its bounds.

Here is an updated patch.

Thanks,
Ilya
--
2014-08-15  Ilya Enkovich  

* target.def (builtin_chkp_function): New.
(chkp_bound_type): New.
(chkp_bound_mode): New.
(chkp_make_bounds_constant): New.
(chkp_initialize_bounds): New.
(load_bounds_for_arg): New.
(store_bounds_for_arg): New.
(load_returned_bounds): New.
(store_returned_bounds): New.
(chkp_function_value_bounds): New.
(setup_incoming_vararg_bounds): New.
* targhooks.h (default_load_bounds_for_arg): New.
(default_store_bounds_for_arg): New.
(default_load_returned_bounds): New.
(default_store_returned_bounds): New.
(default_chkp_bound_type): New.
(default_chkp_bound_mode): New.
(default_builtin_chkp_function): New.
(default_chkp_function_value_bounds): New.
(default_chkp_make_bounds_constant): New.
(default_chkp_initialize_bounds): New.
(default_setup_incoming_vararg_bounds): New.
* targhooks.c (default_load_bounds_for_arg): New.
(default_store_bounds_for_arg): New.
(default_load_returned_bounds): New.
(default_store_returned_bounds): New.
(default_chkp_bound_type): New.
(default_chkp_bound_mode): New.
(default_builtin_chkp_function): New.
(default_chkp_function_value_bounds): New.
(default_chkp_make_bounds_constant): New.
(default_chkp_initialize_bounds): New.
(default_setup_incoming_vararg_bounds): New.
* doc/tm.texi.in (TARGET_LOAD_BOUNDS_FOR_ARG): New.
(TARGET_STORE_BOUNDS_FOR_ARG): New.
(TARGET_LOAD_RETURNED_BOUNDS): New.
(TARGET_STORE_RETURNED_BOUNDS): New.
(TARGET_CHKP_FUNCTION_VALUE_BOUNDS): New.
(TARGET_SETUP_INCOMING_VARARG_BOUNDS): New.
(TARGET_BUILTIN_CHKP_FUNCTION): New.
(TARGET_CHKP_BOUND_TYPE): New.
(TARGET_CHKP_BOUND_MODE): New.
(TARGET_CHKP_MAKE_BOUNDS_CONSTANT): New.
(TARGET_CHKP_INITIALIZE_BOUNDS): New.
* doc/tm.texi: Regenerated.

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 9dd8d68..5d9ac43 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -5028,6 +5028,49 @@ defined, then define this hook to return @code{true} if
 Otherwise, you should not define this hook.
 @end deftypefn

+@deftypefn {Target Hook} rtx TARGET_LOAD_BOUNDS_FOR_ARG (rtx @var{slot}, rtx 
@var{arg}, rtx @var{slot_no})
+This hook is used by expand pass to emit insn to load bounds of
+@var{arg} passed in @var{slot}.  Expand pass uses this hook in case
+bounds of @var{arg} are not passed in register.  If @var{slot} is a
+memory, then bounds are loaded as for regular pointer loaded from
+memory.  If @var{slot} is not a memory then @var{slot_no} is an integer
+constant holding number of the target dependent special slot which
+should be used to obtain bounds.  Hook returns RTX holding loaded bounds.
+@end deftypefn
+
+@deftype

[patch] libstdc++/62154 fix throw_with_nested and rethrow_if_nested

2014-08-15 Thread Jonathan Wakely


Rewritten to meet the C++11 spec, not the 2009 draft I used when
adding nested_exception.

I've used the builtin traits directly, to avoid making libsupc++
depend on <type_traits>, which shouldn't be necessary because
<type_traits> should be present for freestanding implementations, but
it isn't (PR 62159).

Tested x86_64-linux, committed to trunk.
commit 0203c348d18fabe2f48b4701f9130439710adee0
Author: Jonathan Wakely 
Date:   Fri Aug 15 13:33:00 2014 +0100

	PR libstdc++/62154
	* libsupc++/nested_exception.h (throw_with_nested, rethrow_if_nested):
	Rewrite to conform to C++11 requirements.
	* testsuite/18_support/nested_exception/62154.cc: New.

diff --git a/libstdc++-v3/libsupc++/nested_exception.h b/libstdc++-v3/libsupc++/nested_exception.h
index 7e2b2f2..841c223 100644
--- a/libstdc++-v3/libsupc++/nested_exception.h
+++ b/libstdc++-v3/libsupc++/nested_exception.h
@@ -59,101 +59,108 @@ namespace std
   public:
 nested_exception() noexcept : _M_ptr(current_exception()) { }
 
-nested_exception(const nested_exception&) = default;
+nested_exception(const nested_exception&) noexcept = default;
 
-nested_exception& operator=(const nested_exception&) = default;
+nested_exception& operator=(const nested_exception&) noexcept = default;
 
 virtual ~nested_exception() noexcept;
 
+[[noreturn]]
 void
-rethrow_nested() const __attribute__ ((__noreturn__))
-{ rethrow_exception(_M_ptr); }
+rethrow_nested() const
+{
+  if (_M_ptr)
+	rethrow_exception(_M_ptr);
+  std::terminate();
+}
 
 exception_ptr
-nested_ptr() const
+nested_ptr() const noexcept
 { return _M_ptr; }
   };
 
   template
 struct _Nested_exception : public _Except, public nested_exception
 {
+  explicit _Nested_exception(const _Except& __ex)
+  : _Except(__ex)
+  { }
+
   explicit _Nested_exception(_Except&& __ex)
   : _Except(static_cast<_Except&&>(__ex))
   { }
 };
 
-  template
-struct __get_nested_helper
+  template
+struct _Throw_with_nested_impl
 {
-  static const nested_exception*
-  _S_get(const _Ex& __ex)
-  { return dynamic_cast(&__ex); }
+  template
+	static void _S_throw(_Up&& __t)
+	{ throw _Nested_exception<_Tp>{static_cast<_Up&&>(__t)}; }
 };
 
-  template
-struct __get_nested_helper<_Ex*>
+  template
+struct _Throw_with_nested_impl<_Tp, false>
 {
-  static const nested_exception*
-  _S_get(const _Ex* __ex)
-  { return dynamic_cast(__ex); }
+  template
+	static void _S_throw(_Up&& __t)
+	{ throw static_cast<_Up&&>(__t); }
 };
 
-  template
-inline const nested_exception*
-__get_nested_exception(const _Ex& __ex)
-{ return __get_nested_helper<_Ex>::_S_get(__ex); }
-
-  template
-void
-__throw_with_nested(_Ex&&, const nested_exception* = 0)
-__attribute__ ((__noreturn__));
-
-  template
-void
-__throw_with_nested(_Ex&&, ...) __attribute__ ((__noreturn__));
-
-  // This function should never be called, but is needed to avoid a warning
-  // about ambiguous base classes when instantiating throw_with_nested<_Ex>()
-  // with a type that has an accessible nested_exception base.
-  template
+  template
+struct _Throw_with_nested_helper : _Throw_with_nested_impl<_Tp>
+{ };
+
+  template
+struct _Throw_with_nested_helper<_Tp, false>
+: _Throw_with_nested_impl<_Tp, false>
+{ };
+
+  template
+struct _Throw_with_nested_helper<_Tp&, false>
+: _Throw_with_nested_helper<_Tp>
+{ };
+
+  template
+struct _Throw_with_nested_helper<_Tp&&, false>
+: _Throw_with_nested_helper<_Tp>
+{ };
+
+  /// If @p __t is derived from nested_exception, throws @p __t.
+  /// Else, throws an implementation-defined object derived from both.
+  template
+[[noreturn]]
 inline void
-__throw_with_nested(_Ex&& __ex, const nested_exception*)
-{ throw __ex; }
+throw_with_nested(_Tp&& __t)
+{
+  _Throw_with_nested_helper<_Tp>::_S_throw(static_cast<_Tp&&>(__t));
+}
 
-  template
-inline void
-__throw_with_nested(_Ex&& __ex, ...)
-{ throw _Nested_exception<_Ex>(static_cast<_Ex&&>(__ex)); }
-  
-  template
-void
-throw_with_nested(_Ex __ex) __attribute__ ((__noreturn__));
+  template
+struct _Rethrow_if_nested_impl
+{
+  static void _S_rethrow(const _Tp& __t)
+  {
+	if (auto __tp = dynamic_cast(&__t))
+	  __tp->rethrow_nested();
+  }
+};
 
-  /// If @p __ex is derived from nested_exception, @p __ex. 
-  /// Else, an implementation-defined object derived from both.
-  template
-inline void
-throw_with_nested(_Ex __ex)
+  template
+struct _Rethrow_if_nested_impl<_Tp, false>
 {
-  if (__get_nested_exception(__ex))
-throw __ex;
-  __throw_with_nested(static_cast<_Ex&&>(__ex), &__ex);
-}
+  static void _S_rethrow(const _Tp&) { }
+};
 
   /// If @p __ex is derived from nested_exception, @p __ex.rethrow_nested().
   template
 inline

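
For readers unfamiliar with the two functions reworked in the
libstdc++/62154 patch, the following is a minimal usage sketch of the
standard C++11 interface (std::throw_with_nested and
std::rethrow_if_nested).  It does not exercise the specific failure from
the PR; describe, low_level, high_level and run_sketch are names made up
for the example.

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// Collect the what() strings of an exception and of everything nested in it.
std::string describe(const std::exception& e)
{
  std::string msg = e.what();
  try {
    // Does nothing unless the dynamic type of e also derives from
    // std::nested_exception.
    std::rethrow_if_nested(e);
  } catch (const std::exception& inner) {
    msg += " <- " + describe(inner);
  }
  return msg;
}

void low_level() { throw std::runtime_error("disk read failed"); }

void high_level()
{
  try {
    low_level();
  } catch (...) {
    // Throws an object of an unspecified type derived from both
    // std::logic_error and std::nested_exception; the nested_exception
    // base captures the exception currently being handled.
    std::throw_with_nested(std::logic_error("config load failed"));
  }
}

// Runs the chain and returns the combined description.
std::string run_sketch()
{
  try {
    high_level();
  } catch (const std::exception& e) {
    return describe(e);
  }
  return "no exception";
}
```

Calling run_sketch() walks the chain from the outer logic_error to the
inner runtime_error.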
[RFC, PATCH][LRA, MIPS] ICE: in decompose_normal_address, at rtlanal.c:5817

2014-08-15 Thread Robert Suchanek

Hi Vladimir,

The following testcase fails when compiled with -O2 -mips32r2: 

long long a[];
long long b, c, d, k, m, n, o, p, q, r, s, t, u, v, w;
int e, f, g, h, i, j, l, x;
fn1() {
  for (; x; x++)
if (x & 1)
  s = h | g;
else
  s = f | e;
  l = ~0;
  m = 1 | k;
  n = i;
  o = j;
  p = f | e;
  q = h | g;
  w = d | c | a[1];
  t = c;
  v = b | c;
  u = v;
  r = b | a[4];
}

It is reproducible on mips-linux-gnu using SVN revision 212763. After a patch 
to p5600.md the bug is not triggered but it may reappear in the future.

The decompose_normal_address function throws an assertion because it cannot
decompose the following RTL:

(mem/c:SI (lo_sum:SI (high:SI (symbol_ref:SI ("w")))  
(const:SI (plus:SI (symbol_ref:SI ("w"))
(const_int 4 [0x4]))

It appears that it all starts in the following instruction and how LRA deals
with REG_EQUIV notes:

(insn 107 119 123 10 (set (reg:SI 283)
        (ior:SI (reg:SI 284 [ a+12 ])
            (reg:SI 309 [ D.1467+4 ]))) init.i:16 163 {*iorsi3}
     (expr_list:REG_DEAD (reg:SI 309 [ D.1467+4 ])
        (expr_list:REG_DEAD (reg:SI 284 [ a+12 ])
            (expr_list:REG_EQUIV (mem/c:SI (lo_sum:SI (reg/f:SI 274)
                        (const:SI (plus:SI (symbol_ref:SI ("w"))
                                (const_int 4 [0x4])
                (nil)

There are two conditions necessary to trigger the ICE.
Firstly, pseudo 274 is marked by IRA as spilled to memory; thus, during the
LRA pass, pseudo 274 is replaced with 'high' when equivalences are updated
for pseudos. Secondly, pseudo 283 also gets spilled and LRA starts using
equivalences, but LO_SUM and HIGH are already combined, leading to an
assertion error.

Accepting HIGH as the base does seem to solve the problem. HIGH is also 
reloaded after the decomposition splitting HIGH/LO_SUM into a pair again. 
However, is this an acceptable solution?

Regards,
Robert

gcc/
* rtlanal.c (get_base_term): Accept HIGH as the base term.


diff --git gcc/rtlanal.c gcc/rtlanal.c
index 82cfc1bf..2bea2ca 100644
--- gcc/rtlanal.c
+++ gcc/rtlanal.c
@@ -5624,6 +5624,7 @@ get_base_term (rtx *inner)
 inner = strip_address_mutations (&XEXP (*inner, 0));
   if (REG_P (*inner)
   || MEM_P (*inner)
+  || GET_CODE (*inner) == HIGH
   || GET_CODE (*inner) == SUBREG)
 return inner;
   return 0;

[patch] libstdc++/62159 add missing headers for freestanding implementation

2014-08-15 Thread Jonathan Wakely


This seems to fix PR 62159, in that I can now include all the required
headers in a C++11 program using a compiler built with
--disable-libstdcxx-hosted, but 'make check' doesn't do anything for
freestanding builds - should I be doing anything else to test it?


commit bfaeb3f86ca436a0e32b1f0bdcc9fa7128ae
Author: Jonathan Wakely 
Date:   Fri Aug 15 16:38:55 2014 +0100

	PR libstdc++/62159
	* include/Makefile.am (install-freestanding-headers): Add missing
	C++11 headers.
	* include/Makefile.in: Regenerate.

diff --git a/libstdc++-v3/include/Makefile.am b/libstdc++-v3/include/Makefile.am
index e469586..be19b5b 100644
--- a/libstdc++-v3/include/Makefile.am
+++ b/libstdc++-v3/include/Makefile.am
@@ -1231,6 +1231,8 @@ endif
 install-freestanding-headers:
 	$(mkinstalldirs) $(DESTDIR)${gxx_include_dir}/bits
 	$(mkinstalldirs) $(DESTDIR)${host_installdir}
+	$(INSTALL_DATA) ${glibcxx_srcdir}/include/bits/atomic_base.h \
+	  $(DESTDIR)${gxx_include_dir}/bits
 	$(INSTALL_DATA) ${glibcxx_srcdir}/include/bits/c++0x_warning.h \
 	  $(DESTDIR)${gxx_include_dir}/bits
 	for file in ${host_srcdir}/os_defines.h ${host_builddir}/c++config.h \
@@ -1238,9 +1240,12 @@ install-freestanding-headers:
 	  ${glibcxx_srcdir}/$(CPU_DEFINES_SRCDIR)/cpu_defines.h; do \
 	  $(INSTALL_DATA) $${file} $(DESTDIR)${host_installdir}; done
 	$(mkinstalldirs) $(DESTDIR)${gxx_include_dir}/${std_builddir}
+	$(INSTALL_DATA) ${std_builddir}/atomic $(DESTDIR)${gxx_include_dir}/${std_builddir}
 	$(INSTALL_DATA) ${std_builddir}/limits $(DESTDIR)${gxx_include_dir}/${std_builddir}
+	$(INSTALL_DATA) ${std_builddir}/type_traits $(DESTDIR)${gxx_include_dir}/${std_builddir}
 	$(mkinstalldirs) $(DESTDIR)${gxx_include_dir}/${c_base_builddir}
-	for file in cstddef cstdlib cstdarg; do \
+	for file in ciso646 cstddef cfloat climits cstdint cstdlib \
+	  cstdalign cstdarg cstdbool; do \
 	  $(INSTALL_DATA) ${c_base_builddir}/$${file} $(DESTDIR)${gxx_include_dir}/${c_base_builddir}; done
 
 # The real deal.

Re: RFC: Patch for switch elimination (PR 54742)

2014-08-15 Thread Ramana Radhakrishnan




On 14/08/14 19:25, Steve Ellcey wrote:

On Thu, 2014-08-14 at 10:21 -0600, Jeff Law wrote:

On 08/14/14 10:12, David Malcolm wrote:

On Thu, 2014-08-14 at 09:56 -0600, Jeff Law wrote:

On 08/14/14 04:32, Richard Biener wrote:

You'll note in a separate thread Steve and I discussed this during Cauldron
and it was at my recommendation Steve resurrected his proof of concept
plugin and started beating it into shape.


But do we really want a pass just to help coremark?

And that's the biggest argument against Steve's work.  In theory it
should be applicable to other FSMs, but nobody's come forth with
additional testcases from real world applications.


Maybe a regex library?  Perhaps:
http://vcs.pcre.org/viewvc/code/trunk/pcre_dfa_exec.c?revision=1477 ?

The key is that at least some states tell you at compile time what state
you'll be in during the next loop iteration.  Thus instead of coming
around the loop, evaluating the switch condition, then doing the
multi-way branch, we just directly jump to the case for the next iteration.

I've never looked at the PCRE code to know if it's got cases like that.

jeff


I compiled PCRE but it never triggered this optimization (even if I
bumped up the parameters for instruction counts and paths).

I understand the desire not to add optimizations just for benchmarks but
we do know other compilers have added this optimization for coremark
(See
http://community.arm.com/groups/embedded/blog/2013/02/21/coremark-and-compiler-performance)
 and the 13 people on the CC list for this bug certainly shows interest in 
having it even if it is just for a benchmark.  Does 'competing against other 
compilers' sound better than 'optimizing for a benchmark'?

Steve Ellcey
sell...@mips.com




I've been told and have seen empirical evidence of this triggering in 
another well-known popular embedded benchmark suite.


Why not try this on SPEC2k(6) and see if it triggers with bumped up 
parameters to see if it applies elsewhere?



regards
Ramana
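
To make the transformation Jeff describes concrete, here is a hand-written
sketch of a switch-driven state machine next to a version where each
transition jumps straight to the code for the next state.  It is only an
illustration of the idea, not output of the proposed pass; the function
names and the digit-run example are made up.

```cpp
#include <cstddef>
#include <string>

static bool is_digit_sketch(char c) { return c >= '0' && c <= '9'; }

// Classic form: every iteration re-evaluates the switch on `state`, even
// though each assignment to `state` stores a compile-time constant.
int count_runs_switch(const std::string& s)
{
  enum { OUTSIDE, INSIDE } state = OUTSIDE;
  int runs = 0;
  for (char c : s) {
    switch (state) {
    case OUTSIDE:
      if (is_digit_sketch(c)) { ++runs; state = INSIDE; }
      break;
    case INSIDE:
      if (!is_digit_sketch(c)) state = OUTSIDE;
      break;
    }
  }
  return runs;
}

// Threaded form: the next state is known at each transition, so control
// jumps directly to that state's code and the state variable disappears.
int count_runs_threaded(const std::string& s)
{
  int runs = 0;
  std::size_t i = 0;
outside:
  if (i == s.size()) return runs;
  if (is_digit_sketch(s[i++])) { ++runs; goto inside; }
  goto outside;
inside:
  if (i == s.size()) return runs;
  if (!is_digit_sketch(s[i++])) goto outside;
  goto inside;
}
```

Both functions count maximal runs of digits and should agree on every
input; the second simply avoids the multi-way branch on each iteration.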

[PATCH] Add OPT_Wextra to warning call

2014-08-15 Thread Manuel López-Ibáñez

I committed the below as obvious in r214026.

Since the condition is already controlled by extra_warnings, the only
difference is that we now print [-Wextra] and that -Werror=extra now
works.

Cheers,

Manuel.
Index: gcc/cp/ChangeLog
===
--- gcc/cp/ChangeLog(revision 214023)
+++ gcc/cp/ChangeLog(working copy)
@@ -1,3 +1,7 @@
+2014-08-15  Manuel Lopez-Ibanez  
+
+   * call.c (build_conditional_expr_1): Use OPT_Wextra in warning.
+
 2014-08-14  Paolo Carlini  
 
* typeck.c (composite_pointer_type, cxx_sizeof_or_alignof_type,
Index: gcc/cp/call.c
===
--- gcc/cp/call.c   (revision 214023)
+++ gcc/cp/call.c   (working copy)
@@ -5017,7 +5017,7 @@
type_promotes_to (arg3_type)
 {
   if (complain & tf_warning)
-warning_at (loc, 0, "enumeral and non-enumeral type in "
+warning_at (loc, OPT_Wextra, "enumeral and non-enumeral type in "
"conditional expression");
 }

Re: [C++ Patch] PR 57466 (DR 1584)

2014-08-15 Thread Paolo Carlini

.. unfortunately something is wrong with this commit, thus c++/62072. 
For the time being I'm simply reverting it and adding the new testcase.


By the way, more generally, I don't understand at the moment how we can 
safely use complete_type in the middle of tsubst & co: it can emit hard 
errors about, eg, incompleteness (see c++/62072) irrespective of the 
tsubst_flags_t argument...


Thanks,
Paolo.

[PATCH, AArch64] Fix typo

2014-08-15 Thread Evandro Menezes

I tripped at a typo that goes undetected because the macro NAMED_PARAM
doesn't apply in the absence of designated initializers.

Since struct scale_addr_mode_cost has the cost for DI, but not for QI, the
instances of struct cpu_addrcost_table are not initialized as intended due
to the different order of the structure members.  

-- 
Evandro Menezes Austin, USA
e.mene...@samsung.com   +1-512-425-3365



aarch64.diff
Description: Binary data
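
The failure mode described above can be reproduced in miniature.  The
sketch below is hypothetical: ModeCost and NAMED_PARAM_SKETCH are stand-ins
invented for the example, not the actual aarch64 definitions.  It assumes,
as the message says, a macro that degrades to a plain positional
initializer when designated initializers are unavailable.

```cpp
// Hypothetical cost table: it has a DI slot but no QI slot.
struct ModeCost {
  int hi;
  int si;
  int di;
  int ti;
};

// With designated initializers such a macro could expand to
// ".name = value", and a wrong member name would be a compile error.
// Without them the name is dropped, so nothing ties it to a slot.
#define NAMED_PARAM_SKETCH(name, value) value

// Written as if the members were qi, hi, si, ti.  The names are ignored,
// so every value lands in whichever slot comes next.
const ModeCost as_written = {
  NAMED_PARAM_SKETCH(qi, 0),
  NAMED_PARAM_SKETCH(hi, 1),
  NAMED_PARAM_SKETCH(si, 2),
  NAMED_PARAM_SKETCH(ti, 3),
};
```

In this sketch the value meant for hi ends up in si, the one meant for si
ends up in di, and the compiler accepts it silently, which is why the typo
went undetected.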

Re: [PATCH i386 AVX512] [10/n] Add vector move/load/store.

2014-08-15 Thread Uros Bizjak

On Thu, Aug 14, 2014 at 2:51 PM, Kirill Yukhin  wrote:
> On 14 Aug 13:45, Uros Bizjak wrote:
>> Please update the above entry.
> Whoops. Updated ChangeLog:
> gcc/
> * config/i386/i386.c
> (ix86_expand_special_args_builtin): Handle avx512vl_storev8sf_mask,
> avx512vl_storev8si_mask, avx512vl_storev4df_mask, 
> avx512vl_storev4di_mask,
> avx512vl_storev4sf_mask, avx512vl_storev4si_mask, 
> avx512vl_storev2df_mask,
> avx512vl_storev2di_mask, avx512vl_loadv8sf_mask, 
> avx512vl_loadv8si_mask,
> avx512vl_loadv4df_mask, avx512vl_loadv4di_mask, 
> avx512vl_loadv4sf_mask,
> avx512vl_loadv4si_mask, avx512vl_loadv2df_mask, 
> avx512vl_loadv2di_mask,
> avx512bw_loadv64qi_mask, avx512vl_loadv32qi_mask, 
> avx512vl_loadv16qi_mask,
> avx512bw_loadv32hi_mask, avx512vl_loadv16hi_mask, 
> avx512vl_loadv8hi_mask.
> * config/i386/i386.md (define_mode_attr ssemodesuffix): Allow V32HI 
> mode.
> * config/i386/sse.md
> (define_mode_iterator VMOVE): Allow V4TI mode.
> (define_mode_iterator V_AVX512VL): New.
> (define_mode_iterator V): New handling for AVX512VL.
> (define_insn "avx512f_load_mask"): Delete.
> (define_insn "_load_mask"): New.
> (define_insn "avx512f_store_mask"): Delete.
> (define_insn "_store_mask"): New.

OK.

Thanks,
Uros.

Re: [Patch] PR55189 enable -Wreturn-type by default

2014-08-15 Thread Sylvestre Ledru

On 14/08/2014 20:48, Manuel López-Ibáñez wrote:
> --- a/gcc/fortran/options.c
> +++ b/gcc/fortran/options.c
> @@ -693,6 +693,10 @@ gfc_handle_option (size_t scode, const char *arg,
> int value,
>gfc_option.warn_line_truncation = value;
>break;
>
> +case OPT_Wmissing_return:
> +  warn_missing_return = value;
> +  break;
> +
>  case OPT_Wrealloc_lhs:
>gfc_option.warn_realloc_lhs = value;
>break;
>
> The entry in c.opt says this is a C/C++ option, why you need this?
>
>
It is indeed useless. I removed it. Thanks
http://sylvestre.ledru.info/0001-Enable-warning-Wreturn-type-by-default.patch


>
> --- a/gcc/c-family/c.opt
> +++ b/gcc/c-family/c.opt
> @@ -472,7 +472,7 @@ C ObjC Var(warn_implicit_function_declaration)
> Init(-1) Warning LangEnabledBy(C
>  Warn about implicit function declarations
>
>  Wimplicit-int
> -C ObjC Var(warn_implicit_int) Warning LangEnabledBy(C ObjC,Wimplicit)
> +C ObjC Var(warn_implicit_int) Warning
>  Warn when a declaration does not specify a type
>
>  Wimport
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 5ae910c..3f2019a 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -3615,7 +3615,7 @@ This warning is enabled by @option{-Wall} in C++.
>  @opindex Wimplicit-int
>  @opindex Wno-implicit-int
>  Warn when a declaration does not specify a type.
> -This warning is enabled by @option{-Wall}.
> +This warning is enabled by default.
>
>  @item -Wimplicit-function-declaration @r{(C and Objective-C only)}
>  @opindex Wimplicit-function-declaration
>
>
> Does this patch actually enables -Wimplicit-int by default? The
> default without Init() should be zero!
>
> And according to this: 
> https://gcc.gnu.org/ml/gcc-patches/2014-06/msg01367.html
>
> we still want -Wno-implicit to disable -Wimplicit-int (and
> -Werror=implicit to set -Werror=implicit-int), so the LangEnabledBy()
> should stay. The documentation could say: "This warning is enabled by
> default and it is also controlled by -Wimplicit."
>
OK. I will go back on this once the first patch is committed.

Thanks,
Sylvestre
Full changelog:
gcc/c-family/ChangeLog:

2014-08-13  Sylvestre Ledru  

* c.opt: Enable -Wreturn-type by default
Add -Wmissing-return:
Warn whenever control may reach end of non-void function

gcc/ChangeLog:

2014-08-13  Sylvestre Ledru  

* doc/invoke.texi: Document new flag -Wmissing-return
Update -Wreturn-type
* tree-cfg.c (pass_warn_function_return::execute):
Introduce -Wreturn-type management

libgomp/ChangeLog:

2014-08-13  Sylvestre Ledru  

* testsuite/libgomp.c++/loop-2.C: Update the test with -Wreturn-type by
default and -Wmissing-return
* testsuite/libgomp.c++/loop-4.C: likewise
* testsuite/libgomp.c++/parallel-1.C: likewise
* testsuite/libgomp.c++/shared-1.C: likewise
* testsuite/libgomp.c++/single-1.C: likewise
* testsuite/libgomp.c++/single-2.C: likewise
* testsuite/libgomp.c/omp-loop02.c: likewise
* testsuite/libgomp.c/omp-parallel-for.c: likewise
* testsuite/libgomp.c/omp-parallel-if.c: likewise
* testsuite/libgomp.c/omp-single-1.c: likewise
* testsuite/libgomp.c/omp-single-2.c: likewise
* testsuite/libgomp.c/omp_matvec.c: likewise
* testsuite/libgomp.c/omp_workshare3.c: likewise
* testsuite/libgomp.c/omp_workshare4.c: likewise
* testsuite/libgomp.c/pr30494.c (check): likewise
* testsuite/libgomp.c/shared-1.c: likewise

gcc/testsuite/ChangeLog:

2014-08-13  Sylvestre Ledru  

* gcc.dg/Wmissing-return1.c: New test which tests the new behavior
* gcc.dg/Wmissing-return2.c: New test which tests the new behavior
* gcc.dg/Wmissing-return3.c: New test which tests the new behavior
* gcc.dg/Wmissing-return4.c: New test which tests the new behavior
* gcc.dg/Wmissing-return5.c: New test which tests the new behavior
* c-c++-common/asan/no-redundant-instrumentation-2.c (main):
Update the test with -Wreturn-type by default and -Wmissing-return
* c-c++-common/cilk-plus/AN/decl-ptr-colon.c (int main): likewise
* c-c++-common/cilk-plus/AN/parser_errors.c: likewise
* c-c++-common/cilk-plus/AN/parser_errors2.c: likewise
* c-c++-common/cilk-plus/AN/parser_errors3.c: likewise
* c-c++-common/cilk-plus/AN/pr57457-2.c: likewise
* c-c++-common/cilk-plus/AN/pr57541-2.c (void foo1): likewise
(void foo2): likewise
* c-c++-common/cilk-plus/AN/pr57541.c (int foo): likewise
(int foo1): likewise
* c-c++-common/cilk-plus/CK/pr60197.c: likewise
* c-c++-common/cilk-plus/CK/spawn_in_return.c: likewise
* c-c++-common/convert-vec-1.c: likewise
* c-c++-common/dfp/call-by-value.c (int foo32): likewise
(int foo64): likewise
(int foo128): likewise
* c-c++-common/pr36513-2.c (int main2): likewise
* c-c++-common/pr36513.c (int main1): likewise
* c-c++-common/pr43772.c: likewise
* c-c++-common/pr49706-2.c (same): likewise
* c-c++-common/raw-

[Patch, Fortran] Fix CRITICAL handling with -fcoarray=lib

2014-08-15 Thread Tobias Burnus


It turned out that the CRITICAL patch had two issues:

a) The lock variable was named "__lock_var@0". That's unambiguous but 
the linker didn't like the file name. As the variable is unused (only 
the associated token gets used), the assembler error only occurred with 
-O0 and hence not in the test suite. That's now fixed by using a valid 
mangled name; I did the same for the type, which shouldn't show up in 
the assembly except for the DWARF type output. But for completeness, I 
have also mangled it properly.


b) I somehow mixed up the arguments of LOCK; the lock_acquired argument 
is a pointer to a Boolean variable, telling whether the lock could be 
obtained. For CRITICAL, we want to pass NULL, which means that LOCK 
waits until the lock can be obtained. That issue was caught by 
coarray/sync_{1,3}.f90, but somehow, I had missed it.


Committed as obvious in Rev. 214029.

Tobias
Index: ChangeLog
===
--- ChangeLog	(revision 214027)
+++ ChangeLog	(working copy)
@@ -1,3 +1,8 @@
+2014-08-15  Tobias Burnus  
+
+	* resolve.c (resolve_critical): Fix name mangling.
+	* trans-stmt.c (gfc_trans_critical): Fix lock call.
+
 2014-08-15  Manuel López-Ibáñez  
 
 	PR fortran/44054
Index: resolve.c
===
--- resolve.c	(revision 214027)
+++ resolve.c	(working copy)
@@ -8485,13 +8485,14 @@ resolve_critical (gfc_code *code)
   if (gfc_option.coarray != GFC_FCOARRAY_LIB)
 return;
 
-  symtree = gfc_find_symtree (gfc_current_ns->sym_root, "__lock_type@0");
+  symtree = gfc_find_symtree (gfc_current_ns->sym_root,
+			  GFC_PREFIX ("lock_type"));
   if (symtree)
 lock_type = symtree->n.sym;
   else
 {
-  if (gfc_get_sym_tree ("__lock_type@0", gfc_current_ns, &symtree,
-	  false) != 0)
+  if (gfc_get_sym_tree (GFC_PREFIX ("lock_type"), gfc_current_ns, &symtree,
+			false) != 0)
 	gcc_unreachable ();
   lock_type = symtree->n.sym;
   lock_type->attr.flavor = FL_DERIVED;
@@ -8500,7 +8501,7 @@ resolve_critical (gfc_code *code)
   lock_type->intmod_sym_id = ISOFORTRAN_LOCK_TYPE;
 }
 
-  sprintf(name, "__lock_var@%d",serial++);
+  sprintf(name, GFC_PREFIX ("lock_var") "%d",serial++);
   if (gfc_get_sym_tree (name, gfc_current_ns, &symtree, false) != 0)
 gcc_unreachable ();
 
Index: trans-stmt.c
===
--- trans-stmt.c	(revision 214027)
+++ trans-stmt.c	(working copy)
@@ -1121,7 +1121,7 @@ gfc_trans_critical (gfc_code *code)
   token = GFC_TYPE_ARRAY_CAF_TOKEN (TREE_TYPE (token));
   tmp = build_call_expr_loc (input_location, gfor_fndecl_caf_lock, 7,
  token, integer_zero_node, integer_one_node,
- boolean_true_node, null_pointer_node,
+ null_pointer_node, null_pointer_node,
  null_pointer_node, integer_zero_node);
   gfc_add_expr_to_block (&block, tmp);
 }

Re: [PATCH, AArch64] Fix typo

2014-08-15 Thread James Greenhalgh

On Fri, Aug 15, 2014 at 05:24:58PM +0100, Evandro Menezes wrote:
> I tripped at a typo that goes undetected because the macro NAMED_PARAM
> doesn't apply in the absence of designated initializers.
> 
> Since struct scale_addr_mode_cost has the cost for DI, but not for QI, the
> instances of struct cpu_addrcost_table are not initialized as intended due
> to the different order of the structure members.  

Thanks for spotting and fixing this.
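
For readers who have not hit this class of bug, here is a minimal sketch of how it arises. The names `SKETCH_PARAM`, `AddrCostSketch` and the values are hypothetical and are not taken from aarch64.c; the sketch only assumes that the naming macro degrades to a plain positional value when designated initializers are unavailable.

```cpp
// Fallback form of a "named parameter" macro: the name is dropped and only
// the value survives, so the initializer becomes purely positional.
#define SKETCH_PARAM(NAME, VAL) (VAL)

// Hypothetical cost structure; members are declared in the order hi, si, di.
struct AddrCostSketch
{
  int hi;
  int si;
  int di;
};

// Written as if the members were declared in the order di, hi, si.
// Nothing diagnoses the mismatch because the names are discarded.
constexpr AddrCostSketch wrong_order = {
  SKETCH_PARAM (di, 3),
  SKETCH_PARAM (hi, 1),
  SKETCH_PARAM (si, 2),
};

// Same values listed in declaration order.
constexpr AddrCostSketch right_order = {
  SKETCH_PARAM (hi, 1),
  SKETCH_PARAM (si, 2),
  SKETCH_PARAM (di, 3),
};

static_assert (wrong_order.hi == 3, "value meant for di landed in hi");
static_assert (right_order.di == 3, "declaration order gives the intended value");
```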

The ChangeLog entry should be added to gcc/ChangeLog, and should look
like this:

2014-08-14  Evandro Menezes  

* config/aarch64/aarch64.c (generic_addrcost_table): Initialize
elements in the correct order.
(cortexa57_addrcost_table): Likewise.

My fixes were:
  * Two spaces between your name and email address.
  * Name the structure/function/thing changed.
  * Set the path relative to the ChangeLog being modified.

Otherwise, this patch looks correct to me. However, you will need approval
from an AArch64 port maintainer (For AArch64 this is Richard Earnshaw
or Marcus Shawcroft - both added to CC).

Thanks,
James

Re: [PATCH] Fix PR62077

2014-08-15 Thread Jason Merrill


On 08/14/2014 12:28 PM, Jason Merrill wrote:

On 08/14/2014 05:07 AM, Richard Biener wrote:

So - can you take over this C++ frontend issue?


OK.


Here's what I'm applying to trunk:


commit 76cce1d62861b6c99e3ecd97bcd607cd242d1efa
Author: Jason Merrill 
Date:   Wed Aug 13 13:11:09 2014 -0400

	PR bootstrap/62077
	* tree.c (build_min_array_type, set_array_type_canon): Split out...
	(build_cplus_array_type): ...from here.  Only call build_array_type
	for main variants.

diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c
index 3b53039..c9199f2 100644
--- a/gcc/cp/tree.c
+++ b/gcc/cp/tree.c
@@ -757,7 +757,40 @@ cplus_array_compare (const void * k1, const void * k2)
the language-independent type hash table.  */
 static GTY ((param_is (union tree_node))) htab_t cplus_array_htab;
 
-/* Like build_array_type, but handle special C++ semantics.  */
+/* Build an ARRAY_TYPE without laying it out.  */
+
+static tree
+build_min_array_type (tree elt_type, tree index_type)
+{
+  tree t = cxx_make_type (ARRAY_TYPE);
+  TREE_TYPE (t) = elt_type;
+  TYPE_DOMAIN (t) = index_type;
+  return t;
+}
+
+/* Set TYPE_CANONICAL like build_array_type_1, but using
+   build_cplus_array_type.  */
+
+static void
+set_array_type_canon (tree t, tree elt_type, tree index_type)
+{
+  /* Set the canonical type for this new node.  */
+  if (TYPE_STRUCTURAL_EQUALITY_P (elt_type)
+  || (index_type && TYPE_STRUCTURAL_EQUALITY_P (index_type)))
+SET_TYPE_STRUCTURAL_EQUALITY (t);
+  else if (TYPE_CANONICAL (elt_type) != elt_type
+	   || (index_type && TYPE_CANONICAL (index_type) != index_type))
+TYPE_CANONICAL (t)
+  = build_cplus_array_type (TYPE_CANONICAL (elt_type),
+index_type
+? TYPE_CANONICAL (index_type) : index_type);
+  else
+TYPE_CANONICAL (t) = t;
+}
+
+/* Like build_array_type, but handle special C++ semantics: an array of a
+   variant element type is a variant of the array of the main variant of
+   the element type.  */
 
 tree
 build_cplus_array_type (tree elt_type, tree index_type)
@@ -767,10 +800,19 @@ build_cplus_array_type (tree elt_type, tree index_type)
   if (elt_type == error_mark_node || index_type == error_mark_node)
 return error_mark_node;
 
-  if (processing_template_decl
-  && (dependent_type_p (elt_type)
-	  || (index_type && !TREE_CONSTANT (TYPE_MAX_VALUE (index_type)
+  bool dependent
+= (processing_template_decl
+   && (dependent_type_p (elt_type)
+	   || (index_type && !TREE_CONSTANT (TYPE_MAX_VALUE (index_type);
+
+  if (elt_type != TYPE_MAIN_VARIANT (elt_type))
+/* Start with an array of the TYPE_MAIN_VARIANT.  */
+t = build_cplus_array_type (TYPE_MAIN_VARIANT (elt_type),
+index_type);
+  else if (dependent)
 {
+  /* Since type_hash_canon calls layout_type, we need to use our own
+	 hash table.  */
   void **e;
   cplus_array_info cai;
   hashval_t hash;
@@ -792,82 +834,33 @@ build_cplus_array_type (tree elt_type, tree index_type)
   else
 	{
 	  /* Build a new array type.  */
-	  t = cxx_make_type (ARRAY_TYPE);
-	  TREE_TYPE (t) = elt_type;
-	  TYPE_DOMAIN (t) = index_type;
+	  t = build_min_array_type (elt_type, index_type);
 
 	  /* Store it in the hash table. */
 	  *e = t;
 
 	  /* Set the canonical type for this new node.  */
-	  if (TYPE_STRUCTURAL_EQUALITY_P (elt_type)
-	  || (index_type && TYPE_STRUCTURAL_EQUALITY_P (index_type)))
-	SET_TYPE_STRUCTURAL_EQUALITY (t);
-	  else if (TYPE_CANONICAL (elt_type) != elt_type
-		   || (index_type 
-		   && TYPE_CANONICAL (index_type) != index_type))
-	TYPE_CANONICAL (t)
-		= build_cplus_array_type 
-		   (TYPE_CANONICAL (elt_type),
-		index_type ? TYPE_CANONICAL (index_type) : index_type);
-	  else
-	TYPE_CANONICAL (t) = t;
+	  set_array_type_canon (t, elt_type, index_type);
 	}
 }
   else
 {
-  if (!TYPE_STRUCTURAL_EQUALITY_P (elt_type)
-	  && !(index_type && TYPE_STRUCTURAL_EQUALITY_P (index_type))
-	  && (TYPE_CANONICAL (elt_type) != elt_type
-	  || (index_type && TYPE_CANONICAL (index_type) != index_type)))
-	/* Make sure that the canonical type is on the appropriate
-	   variants list.  */
-	build_cplus_array_type
-	  (TYPE_CANONICAL (elt_type),
-	   index_type ? TYPE_CANONICAL (index_type) : index_type);
   t = build_array_type (elt_type, index_type);
 }
 
-  /* Push these needs up so that initialization takes place
- more easily.  */
-  bool needs_ctor
-= TYPE_NEEDS_CONSTRUCTING (TYPE_MAIN_VARIANT (elt_type));
-  TYPE_NEEDS_CONSTRUCTING (t) = needs_ctor;
-  bool needs_dtor
-= TYPE_HAS_NONTRIVIAL_DESTRUCTOR (TYPE_MAIN_VARIANT (elt_type));
-  TYPE_HAS_NONTRIVIAL_DESTRUCTOR (t) = needs_dtor;
-
-  /* We want TYPE_MAIN_VARIANT of an array to strip cv-quals from the
- element type as well, so fix it up if needed.  */
+  /* Now check whether we already have this array variant.  */
   if (elt_type != TYPE_MAIN_VARIANT (elt_type))
 {
-  tree m = build_cplus_array_type (TYPE_MAIN_VARIANT (elt_t

Re: [patch, testsuite] Applying non_bionic effective target to particular tests

2014-08-15 Thread Mike Stump

On Aug 15, 2014, at 6:49 AM, Alexander Ivchenko  wrote:
>   * lib/target-supports.exp (error_h): New check.
>   (libc_has_complex_functions): Ditto.
>   (tgmath_h): Ditto.
>   * gcc.dg/builtins-59.c: Add libc_has_complex_functions check.
>   * gcc.dg/builtins-61.c: Likewise.
>   * gcc.dg/builtins-67.c: Disable test for Bionic.
>   * gcc.dg/strlenopt-14g.c: Likewise.
>   * gcc.dg/strlenopt-14gf.c: Likewise.
>   * gcc.dg/c99-tgmath-1.c: Add tgmath_h check.
>   * gcc.dg/c99-tgmath-2.c: Likewise.
>   * gcc.dg/c99-tgmath-3.c: Likewise.
>   * gcc.dg/c99-tgmath-4.c: Likewise.
>   * gcc.dg/dfp/convert-dfp-round-thread.c: Add error_h check.

Ok, thanks.

Re: [C++ Patch] PR 57466 (DR 1584)

2014-08-15 Thread Jason Merrill


On 08/15/2014 12:17 PM, Paolo Carlini wrote:

By the way, more generally, I don't understand at the moment how we can
safely use complete_type in the middle of tsubst & co: it can emit hard
errors about, eg, incompleteness (see c++/62072) irrespective of the
tsubst_flags_t argument...


Yes, it can, that's part of the language; remember "Only invalid types 
and expressions in the immediate context of the function type and its 
template parameter types can result in a deduction failure", invalid 
constructs outside that context result in hard errors.
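
A small sketch of the distinction, using made-up names (`has_type`, `WithType`, `Wrapper`, `Incomplete` are illustrative and unrelated to the PR): a failure while substituting into the function type itself is a deduction failure, whereas an error that arises inside a class template instantiation triggered by the substitution is a hard error.

```cpp
// Immediate context: for T = int, "typename T::type" is invalid in the
// function type itself, so this overload is silently dropped.
template <class T>
constexpr bool has_type (typename T::type *) { return true; }

// Fallback chosen when the overload above is not viable.
template <class T>
constexpr bool has_type (...) { return false; }

struct WithType { using type = int; };

static_assert (has_type<WithType> (nullptr), "nested type found");
static_assert (!has_type<int> (nullptr), "deduction failure, not an error");

// Outside the immediate context (left as a comment because it would not
// compile):
//
//   struct Incomplete;
//   template <class T> struct Wrapper { T member; };
//   template <class T> auto probe (int) -> decltype (sizeof (Wrapper<T>), true);
//   ... probe<Incomplete> (0) ...
//
// Evaluating sizeof requires instantiating Wrapper<Incomplete>; the error
// about the incomplete member is raised inside that instantiation, so it
// is a hard error rather than a deduction failure.
```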


Jason

Re: [patch, testsuite] Applying non_bionic effective target to particular tests

2014-08-15 Thread enh

can you file bugs against bionic for stuff like this? use
b.android.com (and feel free to mail me to ensure that they get
noticed).

one thing we'd like to do is get to a point where we're building
gcc/gdb et cetera without any local hacks, and when we've got to that
point, we're going to have to go through anything that made it
upstream to check that that's sane. (the weird "-shared implies
-Bsymbolic" GCC hack springs to mind.)

On Fri, Aug 15, 2014 at 10:25 AM, Mike Stump  wrote:
> On Aug 15, 2014, at 6:49 AM, Alexander Ivchenko  wrote:
>>   * lib/target-supports.exp (error_h): New check.
>>   (libc_has_complex_functions): Ditto.
>>   (tgmath_h): Ditto.
>>   * gcc.dg/builtins-59.c: Add libc_has_complex_functions check.
>>   * gcc.dg/builtins-61.c: Likewise.
>>   * gcc.dg/builtins-67.c: Disable test for Bionic.
>>   * gcc.dg/strlenopt-14g.c: Likewise.
>>   * gcc.dg/strlenopt-14gf.c: Likewise.
>>   * gcc.dg/c99-tgmath-1.c: Add tgmath_h check.
>>   * gcc.dg/c99-tgmath-2.c: Likewise.
>>   * gcc.dg/c99-tgmath-3.c: Likewise.
>>   * gcc.dg/c99-tgmath-4.c: Likewise.
>>   * gcc.dg/dfp/convert-dfp-round-thread.c: Add error_h check.
>
> Ok, thanks.

Re: [patch, testsuite] Applying non_bionic effective target to particular tests

2014-08-15 Thread Mike Stump

On Aug 15, 2014, at 10:32 AM, enh  wrote:
> can you file bugs against bionic for stuff like this?

I suspect you meant to ask Alexander this question…  but just in case you did 
intend to ask me, no.  I don’t have any visibility into android local patches 
or procedures.

In the end, people with local patches that submit them, have to track them 
themselves.  I usually do this by merging into the local tree and the merge 
tree the FSF version of the work once that work makes it into the FSF tree.  
Works well and eliminates merge conflicts.

Re: [patch, testsuite] Applying non_bionic effective target to particular tests

2014-08-15 Thread enh

On Fri, Aug 15, 2014 at 10:56 AM, Mike Stump  wrote:
> On Aug 15, 2014, at 10:32 AM, enh  wrote:
>> can you file bugs against bionic for stuff like this?
>
> I suspect you meant to ask Alexander this question…  but just in case you did 
> intend to ask me, no.  I don’t have any visibility into android local patches 
> or procedures.

yeah, i meant people proposing Android-related changes. in this case, Alexander.

 --elliott

Re: [PATCH i386 AVX512] [16/n] Add AVX-512BW's psadbw insn.

2014-08-15 Thread Uros Bizjak

On Fri, Aug 15, 2014 at 1:37 PM, Kirill Yukhin  wrote:
> Hello,
> This patch introduces AVX-512BW's psadbw insn pattern.
>
> Bootstrapped.
> New tests on top of patch-set all pass
> under simulator.
>
> Is it ok for trunk?
>
> gcc/
> * config/i386/sse.md
> (define_mode_iterator VI8_AVX2_AVX512BW): New.
> (define_insn "_psadbw"): Add evex version.

OK.

Thanks,
Uros.

Re: [RFC, PATCH][LRA, MIPS] ICE: in decompose_normal_address, at rtlanal.c:5817

2014-08-15 Thread Steven Bosscher

On Fri, Aug 15, 2014 at 5:45 PM, Robert Suchanek wrote:
> gcc/
> * rtlanal.c (get_base_term): Accept HIGH as the base term.
>
>
> diff --git gcc/rtlanal.c gcc/rtlanal.c
> index 82cfc1bf..2bea2ca 100644
> --- gcc/rtlanal.c
> +++ gcc/rtlanal.c
> @@ -5624,6 +5624,7 @@ get_base_term (rtx *inner)
>  inner = strip_address_mutations (&XEXP (*inner, 0));
>if (REG_P (*inner)
>|| MEM_P (*inner)
> +  || GET_CODE (*inner) == HIGH
>|| GET_CODE (*inner) == SUBREG)
>  return inner;
>return 0;

This is not correct, BASE is a *variable* expression, HIGH is a
*constant* expression.

It's hard to say what the correct fix should be, but it sounds like
the address you get after the substitutions should be simplified
(folded).

B.R.,
Steven

Re: [C++ Patch] PR 57466 (DR 1584)

2014-08-15 Thread Paolo Carlini


Hi,

On 08/15/2014 07:30 PM, Jason Merrill wrote:

On 08/15/2014 12:17 PM, Paolo Carlini wrote:

By the way, more generally, I don't understand at the moment how we can
safely use complete_type in the middle of tsubst & co: it can emit hard
errors about, eg, incompleteness (see c++/62072) irrespective of the
tsubst_flags_t argument...
Yes, it can, that's part of the language; remember "Only invalid types 
and expressions in the immediate context of the function type and its 
template parameter types can result in a deduction failure", invalid 
constructs outside that context result in hard errors.

I see, it boils down again to that "famous" wording...

Now, is it possible that the issue we are facing with implementing DR 
1584 has to do with the fact that our unify doesn't tell template 
functions vs template classes?!? Thus we should return 1 from 
check_cv_quals_for_unify when arg is a FUNCTION_TYPE *and* 
DECL_TYPE_TEMPLATE_P (TREE_TYPE (tparms)) is true?!? (we could pass the 
information in a flag)


Because I don't think an equivalent of the key bits of c++/62072:

template struct tuple_size { };
template struct tuple_size : tuple_size { };

can be constructed for template functions?!?

Paolo.

[PATCH,rs6000] Add __VEC_ELEMENT_REG_ORDER__ builtin define for PowerPC

2014-08-15 Thread Bill Schmidt

Hi,

This adds a macro to indicate the order in which vector elements appear
in a register on PowerPC.  Elements may appear in right-to-left order
for little endian, or in left-to-right order for big endian and when
-maltivec=be is selected for little endian.  The same macro is being
implemented in the IBM XL compilers.

Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no
regressions.  Verified the new macro takes on the correct value in the
circumstances listed above.  Is this ok for trunk?  It would be
preferable to backport this to GCC 4.9 as well.

Thanks,
Bill


2014-08-15  Bill Schmidt  

* config/rs6000/rs6000-c.c (rs6000_cpu_cpp_builtins): Provide
builtin define __VEC_ELEMENT_REG_ORDER__.


Index: gcc/config/rs6000/rs6000-c.c
===
--- gcc/config/rs6000/rs6000-c.c	(revision 214025)
+++ gcc/config/rs6000/rs6000-c.c	(working copy)
@@ -497,6 +497,12 @@ rs6000_cpu_cpp_builtins (cpp_reader *pfile)
   break;
 }
 
+  /* Vector element order.  */
+  if (BYTES_BIG_ENDIAN || (rs6000_altivec_element_order == 2))
+builtin_define ("__VEC_ELEMENT_REG_ORDER__=__ORDER_BIG_ENDIAN__");
+  else
+builtin_define ("__VEC_ELEMENT_REG_ORDER__=__ORDER_LITTLE_ENDIAN__");
+
   /* Let the compiled code know if 'f' class registers will not be available.  */
   if (TARGET_SOFT_FLOAT || !TARGET_FPRS)
 builtin_define ("__NO_FPRS__");

Re: [PATCH i386 AVX512] [17/n] Split VI48_AVX512F into VI4_AVX512VL and VI248_AVX512, extend vcvtps2udq,vpbroadcastmb2d.

2014-08-15 Thread Uros Bizjak

On Fri, Aug 15, 2014 at 1:42 PM, Kirill Yukhin  wrote:
> Hello,
> This patch splits VI48_AVX512F iterator into two.
> It extends vcvtps2udq,vpbroadcastmb2d patterns as well.
>
> Bootstrapped.
> New tests on top of patch-set all pass
> under simulator.
>
> Is it ok for trunk?
>
> gcc/
> * config/i386/sse.md
> (define_mode_iterator VI48_AVX512F): Delete.
> (define_mode_iterator VI4_AVX512VL): New.
> (define_mode_iterator VI248_AVX512): New.
> (define_insn 
> "avx512f_ufix_notruncv16sfv16si"):
> Delete.
> (define_insn
> 
> "_ufix_notrunc"):
> New.
> (define_insn "avx512cd_maskw_vec_dup"): Macroize.
> (define_insn "_ashrv"): Delete.
> (define_insn "_ashrv"): New.

It looks to me that the macroization is somehow wrong for ashrv. I'd
split the mode iterator to:

> +(define_mode_iterator VI248_AVX512
> +  [(V16SI "TARGET_AVX512F") (V8SI "TARGET_AVX2") (V4SI "TARGET_AVX2")
> +   (V32HI "TARGET_AVX512BW")
> +   (V16HI "TARGET_AVX512BW && TARGET_AVX512VL")
> +   (V8HI "TARGET_AVX512BW && TARGET_AVX512VL")
> +   (V8DI "TARGET_AVX512F") (V4DI "TARGET_AVX512VL") (V2DI "TARGET_AVX512VL")])
> +

V4SI, V8SI, V16SI (AVX512F), V8DI (AVX512F), V4DI (AVX512VL), V2DI
(AVX512VL), with AVX2 as the baseline

and

V32HI, V16HI (AVX512VL), V8HI (AVX512VL), with AVX512BW as the baseline

Uros.

Re: [PATCH i386 AVX512] [18/n] Extend vpbroadcastmb2q.

2014-08-15 Thread Uros Bizjak

On Fri, Aug 15, 2014 at 1:47 PM, Kirill Yukhin  wrote:
> Hello,
> This patch extends the pattern for the vpbroadcastmb2q insn.
>
> Bootstrapped.
> New tests on top of patch-set all pass
> under simulator.
>
> Is it ok for trunk?
>
> gcc/
> * config/i386/sse.md
> (define_mode_iterator VI8_AVX512VL): New.
> (define_insn "avx512cd_maskb_vec_dup"): Macroize.

OK.

Thanks,
Uros.

Re: [PATCH i386 AVX512] [19/n] Extends AVX-512 broadcasts.

2014-08-15 Thread Uros Bizjak

On Fri, Aug 15, 2014 at 1:52 PM, Kirill Yukhin  wrote:
> Hello,
> This patch introduces new patterns to support
> AVX-512VL,DQ broadcast insns.
>
> Bootstrapped.
> New tests on top of patch-set all pass
> under simulator.
>
> Is it ok for trunk?
>
> gcc/
> * config/i386/sse.md
> (define_mode_iterator VI4F_BRCST32x2): New.
> (define_mode_attr 64x2_mode): New.
> (define_mode_attr 32x2mode): New.
> (define_insn "avx512dq_broadcast"): 
> New.
> (define_insn "avx512vl_broadcast_1"): 
> New.
> (define_insn "avx512dq_broadcast_1"): 
> New.
> (define_insn "avx512dq_broadcast_1"): 
> New.

Can you avoid insn constraints like:

> +  "TARGET_AVX512DQ && ( == 64 || TARGET_AVX512VL)"

This should be split to two insn patterns, each with different
baseline insn constraint.

Uros.

Re: [PATCH] Asan static optimization (draft)

2014-08-15 Thread Konstantin Serebryany

On Thu, Aug 14, 2014 at 11:55 PM, Yuri Gribov  wrote:
> On Thu, Aug 14, 2014 at 8:53 PM, Konstantin Serebryany
>  wrote:
>> In order for your work to be generally useful, I'd ask several things:
>> - Update 
>> https://code.google.com/p/address-sanitizer/wiki/CompileTimeOptimizations
>> with examples that will be handled
>
> Done (to be honest I only plan to do full redundancy elimination for
> now, hoisting would hopefully follow later). Note I'm still
> experimenting so there may be some changes in actual implementation.

Thanks.
If we are running at -O0, we should not care about optimizing asan
instrumentation.
If this is -O1 or higher, then most (but not all) of your cases
*should* be optimized by the compiler before asan kicks in.
(This may be different for GCC-asan because GCC-asan runs a bit too
early, AFAICT. Maybe this *is* the problem we need to fix.)
If there is a case where the regular optimizer can optimize the code
before asan but doesn't, we should fix the general optimizer or the
phase ordering instead of enhancing the asan opt phase.
I am mainly interested in cases where the general optimizer cannot
possibly improve the code but asan opt can eliminate redundant checks.
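
As an illustration of the kind of case meant here, consider the two small functions below. They are a sketch for discussion, not tests taken from the patch or from the wiki, and the function names are made up. In the first, the load and the store are both required, so the general optimizer cannot remove either access, yet a check for the store repeats the check already done for the load. In the second, an unknown call sits between the accesses and might free the memory, so the second check has to stay.

```cpp
// Candidate for redundant-check elimination: the load and the store hit
// the same location with the same size, and nothing in between can change
// the validity of the address.
void increment (int *p)
{
  (*p)++;
}

// Not a candidate: the callback is opaque and could free or poison *p,
// so the access after the call needs its own check.
int read_call_read (int *p, void (*callback) (int *))
{
  int before = *p;
  callback (p);
  return before + *p;
}
```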

>
>> - Create small standalone test cases in C
>> - Don't put the tests under GPL (otherwise we'll not be able to reuse
>> them in LLVM)
>
> I already have a bunch of tests (which I plan to extend further). How
> should I submit them s.t. they could be reused by LLVM?

Maybe just start accumulating tests on the CompileTimeOptimizations wiki
(as full functions that one can copy-paste to a .c file and build)?
Once some new optimization is implemented in a compiler X,
we'll copy the test with proper harness code (FileCheck/dejagnu/etc.)
to X's repository.

>

>>> make sure that sanopt performs conservative optimization
>> Yes. I don't know a good solution to this problem, which is why we did
>> not attack it before.
>> Increasing asan speed will save lots of CPU time, but missing a single 
>> software
>> bug due to an overly aggressive optimization may cost much more.
>
> Yeah. I thought about manually inspecting optimizations that are
> performed for some large files (e.g. GCC's asan.c) and maybe doing
> some  random verifications of Asan trophies
> (http://code.google.com/p/address-sanitizer/wiki/FoundBugs). Ideally
> we'd have some test generator but making a reasonable one for C sounds
> laughable. Perhaps there is some prior work on verification of Java
> range checks optimizers?

There is a ton of work on range check optimizers, but none of it
fully applies to asan,
since asan also checks use-after-free and init-order-fiasco.

>
> -Y

Re: [PATCH i386 AVX512] [20/n] AVX-512 integer shift pattern.

2014-08-15 Thread Uros Bizjak

On Fri, Aug 15, 2014 at 1:56 PM, Kirill Yukhin  wrote:
> Hello,
> This patch extends shift pattern to support AVX-512
> new insn.
>
> Bootstrapped.
> New tests on top of patch-set all pass
> under simulator.
>
> Is it ok for trunk?
>
> gcc/
> * config/i386/sse.md
> (define_mode_iterator VI248_AVX2): Add V32HI mode.
> (define_insn "3"): Add masking.

Again, please split insn pattern to avoid:

+  "TARGET_SSE2
+   && 
+   && ((mode != V16HImode && mode != V8HImode)
+   || TARGET_AVX512BW
+   || !)"

insn constraints. The insn constraint should use baseline TARGET_* and
mode iterator should use TARGET_* that results in "baseline TARGET_ &&
iterator TARGET_" for certain mode. If these are properly used, then
there is no need to use mode checks in the insn constraint.
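
The rule can be modelled in a few lines of C++. This is only a sketch of the logic, not of sse.md: the flag set, the mode list and the per-mode conditions below are illustrative assumptions.

```cpp
// Hypothetical subset of target flags.
struct TargetFlagsSketch
{
  bool sse2;
  bool avx2;
  bool avx512bw;
};

enum class ModeSketch { V8HI, V16HI, V32HI };

// Role of the mode iterator: one extra condition per mode.
constexpr bool iterator_condition (ModeSketch m, TargetFlagsSketch f)
{
  return m == ModeSketch::V8HI ? true
         : m == ModeSketch::V16HI ? f.avx2
         : f.avx512bw;
}

// Role of the insn condition: the baseline only.  The effective condition
// for a given mode is "baseline && iterator condition", so the pattern
// itself never needs to test the mode.
constexpr bool insn_enabled (ModeSketch m, TargetFlagsSketch f)
{
  return f.sse2 && iterator_condition (m, f);
}
```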

Uros.

Re: [PATCH][LTO] Streamer re-org (what's left)

2014-08-15 Thread Jan Hubicka

> 
> Yeah, the 2MB was just a "guess", I'll change it to 64k blocks.  Note
> the original code exponentially increased block size to not have
> too many blocks (for whatever reason).  A 800MB compressed decl section
> would need 12800 64k blocks.  But in the end it matters only that
> the block allocations are "efficient" for the memory allocator
> (so don't allocate 1-byte blocks).  Our internal overhead is
> one pointer (to point to the next buffer).
> 
> Of course in the end I want to implement streaming right into the
> file rather than queuing up the whole compressed data (or
> mmapping it).

Yep, would be nice for WPA stream out memory usage. Also getting rid of the
gcc->gas->object file way for slim LTO files may be a huge win for kernel
times...
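
The block-count arithmetic quoted above can be checked with a short sketch. It assumes 1 MB = 2^20 bytes and 64k = 2^16 bytes; the helper names are made up for illustration and do not come from the streamer code.

```cpp
#include <cstdint>

// Number of fixed-size blocks needed to hold `bytes` bytes (rounded up).
constexpr std::uint64_t blocks_needed (std::uint64_t bytes,
                                       std::uint64_t block_size)
{
  return (bytes + block_size - 1) / block_size;
}

// Bookkeeping cost when each block carries one pointer to the next block.
constexpr std::uint64_t link_overhead_bytes (std::uint64_t bytes,
                                             std::uint64_t block_size)
{
  return blocks_needed (bytes, block_size) * sizeof (void *);
}

// 800MB in 64k blocks gives the 12800 blocks mentioned in the quote.
static_assert (blocks_needed (800ull * 1024 * 1024, 64 * 1024) == 12800,
               "800MB / 64k");
// The earlier 2MB guess would need only 400 blocks.
static_assert (blocks_needed (800ull * 1024 * 1024, 2 * 1024 * 1024) == 400,
               "800MB / 2MB");
```

Even at 12800 blocks the per-block pointer adds up to about 100 KB on a 64-bit host, which is negligible next to the payload.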
> 
> Btw, I'll first try to get rid of the separate string section
> which would also make it compressed again and be less awkwardly
> abusing the data-streamer.

Sounds good :)

Honza
> 
> Richard.

Re: [C++ Patch] PR 57466 (DR 1584)

2014-08-15 Thread Paolo Carlini


... in practice, something like the below.

Paolo.

///
Index: cp/pt.c
===
--- cp/pt.c (revision 214027)
+++ cp/pt.c (working copy)
@@ -162,7 +162,7 @@ static tree tsubst_friend_class (tree, tree);
 static int can_complete_type_without_circularity (tree);
 static tree get_bindings (tree, tree, tree, bool);
 static int template_decl_level (tree);
-static int check_cv_quals_for_unify (int, tree, tree);
+static int check_cv_quals_for_unify (int, tree, tree, bool);
 static void template_parm_level_and_index (tree, int*, int*);
 static int unify_pack_expansion (tree, tree, tree,
 tree, unification_kind_t, bool, bool);
@@ -17279,11 +17279,16 @@ template_decl_level (tree decl)
Returns nonzero iff the unification is OK on that basis.  */
 
 static int
-check_cv_quals_for_unify (int strict, tree arg, tree parm)
+check_cv_quals_for_unify (int strict, tree arg, tree parm, bool in_function)
 {
   int arg_quals = cp_type_quals (arg);
   int parm_quals = cp_type_quals (parm);
 
+  /* DR 1584: cv-qualification of a deduced function type is
+ ignored; see 8.3.5 [dcl.fct].  */
+  if (in_function && TREE_CODE (arg) == FUNCTION_TYPE)
+return 1;
+
   if (TREE_CODE (parm) == TEMPLATE_TYPE_PARM
   && !(strict & UNIFY_ALLOW_OUTER_MORE_CV_QUAL))
 {
@@ -17644,6 +17649,8 @@ unify (tree tparms, tree targs, tree parm, tree ar
   tree targ;
   tree tparm;
   int strict_in = strict;
+  bool in_function = (TREE_TYPE (tparms)
+ && DECL_FUNCTION_TEMPLATE_P (TREE_TYPE (tparms)));
 
   /* I don't think this will do the right thing with respect to types.
  But the only case I've seen it in so far has been array bounds, where
@@ -17750,7 +17757,7 @@ unify (tree tparms, tree targs, tree parm, tree ar
 PARM `T' for example, when computing which of two templates
 is more specialized, for example.  */
   && TREE_CODE (arg) != TEMPLATE_TYPE_PARM
-  && !check_cv_quals_for_unify (strict_in, arg, parm))
+  && !check_cv_quals_for_unify (strict_in, arg, parm, in_function))
 return unify_cv_qual_mismatch (explain_p, parm, arg);
 
   if (!(strict & UNIFY_ALLOW_OUTER_LEVEL)
@@ -17927,7 +17934,7 @@ unify (tree tparms, tree targs, tree parm, tree ar
 If ARG is `const int' and PARM is just `T' that's OK;
 that binds `const int' to `T'.  */
  if (!check_cv_quals_for_unify (strict_in | UNIFY_ALLOW_LESS_CV_QUAL,
-arg, parm))
+arg, parm, in_function))
return unify_cv_qual_mismatch (explain_p, parm, arg);
 
  /* Consider the case where ARG is `const volatile int' and
@@ -18273,7 +18280,7 @@ unify (tree tparms, tree targs, tree parm, tree ar
&& (!check_cv_quals_for_unify
(UNIFY_ALLOW_NONE,
 class_of_this_parm (arg),
-class_of_this_parm (parm
+class_of_this_parm (parm), in_function)))
  return unify_cv_qual_mismatch (explain_p, parm, arg);
 
RECUR_AND_CHECK_FAILURE (tparms, targs, TREE_TYPE (parm),
@@ -18298,7 +18305,8 @@ unify (tree tparms, tree targs, tree parm, tree ar
   if (TYPE_PTRMEMFUNC_P (arg))
{
  /* Check top-level cv qualifiers */
- if (!check_cv_quals_for_unify (UNIFY_ALLOW_NONE, arg, parm))
+ if (!check_cv_quals_for_unify (UNIFY_ALLOW_NONE, arg, parm,
+in_function))
return unify_cv_qual_mismatch (explain_p, parm, arg);
 
  RECUR_AND_CHECK_FAILURE (tparms, targs, TYPE_OFFSET_BASETYPE (parm),
Index: testsuite/g++.dg/cpp0x/pr57466.C
===
--- testsuite/g++.dg/cpp0x/pr57466.C(revision 0)
+++ testsuite/g++.dg/cpp0x/pr57466.C(working copy)
@@ -0,0 +1,18 @@
+// PR c++/57466
+// { dg-do compile { target c++11 } }
+
+template
+  constexpr bool
+  is_pointer(const T*)
+  { return true; }
+
+template
+  constexpr bool
+  is_pointer(const T&)
+  { return false; }
+
+using F = void();
+
+constexpr F* f = nullptr;
+
+static_assert( is_pointer(f), "function pointer is a pointer" );
Index: testsuite/g++.dg/template/pr57466.C
===
--- testsuite/g++.dg/template/pr57466.C (revision 0)
+++ testsuite/g++.dg/template/pr57466.C (working copy)
@@ -0,0 +1,8 @@
+// DR 1584, PR c++/57466
+
+template void f2(const T*);
+void g2();
+
+void m() {
+  f2(g2);// OK: cv-qualification of deduced function type ignored
+}
Index: testsuite/g++.dg/template/unify6.C
===
--- testsuite/g++.dg/template/unify6.C  (revision 214027)
+++ testsuite/g++.dg/template/unify6.C  (working copy)
@@ -3,21 +3,20 @@
 
 void Baz ();
 
-template  void Foo1 (T *); // #1
-template  void Foo1 (T const

Re: [C++ Patch] PR 57466 (DR 1584)

2014-08-15 Thread Jason Merrill


On 08/15/2014 03:16 PM, Paolo Carlini wrote:

+  bool in_function = (TREE_TYPE (tparms)
+ && DECL_FUNCTION_TEMPLATE_P (TREE_TYPE (tparms)));


Huh?  There's no such thing as a template parameter of function type.

Jason

Re: [C++ Patch] PR 57466 (DR 1584)

2014-08-15 Thread Paolo Carlini


Hi,

On 08/15/2014 09:22 PM, Jason Merrill wrote:

On 08/15/2014 03:16 PM, Paolo Carlini wrote:

+  bool in_function = (TREE_TYPE (tparms)
+  && DECL_FUNCTION_TEMPLATE_P (TREE_TYPE (tparms)));


Huh?  There's no such thing as a template parameter of function type.

Works fine, in fact, I have just finished regtesting the patch. 
Consider, eg, from the DR:


template<class T> void f2(const T*);
void g2();

void m() {
  f2(g2);// OK: cv-qualification of deduced function type ignored
}

when unify is called by unify_one_argument, tparms is a TREE_VEC and its 
TREE_TYPE contains the information we need:


 type 0x76c48f18 void>

type_0 type_6 QI
size 
unit size 
align 8 symtab 0 alias set -1 canonical type 0x76daa150
arg-types 0x76daa0a8>
chain 0x76c48f18 void

VOID file 57466_1.C line 3 col 24
align 1 context 
full-name "template void f2(const T*)"

chain 
volatile public external QI file  line 0 col 0 
align 8

full-name "void __cxa_call_unexpected(void*)"
chain >>

elt 0 value 0x76d97f18 T>

decl_0 VOID file 57466_1.C line 3 col 10
align 1>>>

Paolo.

RE: [PATCH, AArch64] Fix typo

2014-08-15 Thread Evandro Menezes

Thanks for the review.

-- 
Evandro Menezes Austin, USA
e.mene...@samsung.com   +1-512-425-3365

-Original Message-
From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org]
On Behalf Of James Greenhalgh
Sent: Friday, August 15, 2014 11:36
To: Evandro Menezes
Cc: gcc-patches@gcc.gnu.org; 'James Greenhalgh'; richard.earns...@arm.com;
marcus.shawcr...@arm.com
Subject: Re: [PATCH, AArch64] Fix typo

On Fri, Aug 15, 2014 at 05:24:58PM +0100, Evandro Menezes wrote:
> I tripped at a typo that goes undetected because the macro NAMED_PARAM 
> doesn't apply in the absence of designated initializers.
> 
> Since struct scale_addr_mode_cost has the cost for DI, but not for QI, 
> the instances of struct cpu_addrcost_table are not initialized as 
> intended due to the different order of the structure members.

Thanks for spotting and fixing this.

The ChangeLog entry should be added to gcc/ChangeLog, and should look like
this:

2014-08-14  Evandro Menezes  

* config/aarch64/aarch64.c (generic_addrcost_table): Initialize
elements in the correct order.
(cortexa57_addrcost_table): Likewise.

My fixes were:
  * Two spaces between your name and email address.
  * Name the structure/function/thing changed.
  * Set the path relative to the ChangeLog being modified.

Otherwise, this patch looks correct to me. However, you will need approval
from an AArch64 port maintainer (For AArch64 this is Richard Earnshaw or
Marcus Shawcroft - both added to CC).

Thanks,
James


Re: [patch, testsuite] Applying non_bionic effective target to particular tests

2014-08-15 Thread Alexander Ivchenko

2014-08-15 21:32 GMT+04:00 enh :
> can you file bugs against bionic for stuff like this? use
> b.android.com (and feel free to mail me to ensure that they get
> noticed).

Sure, I will do that.

> one thing we'd like to do is get to a point where we're building
> gcc/gdb et cetera without any local hacks, and when we've got to that
> point, we're going to have to go through anything that made it
> upstream to check that that's sane. (the weird "-shared implies
> -Bsymbolic" GCC hack springs to mind.)

There are more of those hacks for sure. (Actually, "-shared implies
-Bsymbolic" is in GCC trunk, so it is not a local hack, although that
doesn't necessarily mean that it doesn't have to be changed.) But there
are certainly other things that are local and have to be upstreamed.
From our side, we are trying to upstream things first and then, if
necessary, port them to the NDK.

--Alexander

Re: [PATCH] Fix PR62077

2014-08-15 Thread Richard Biener

On August 15, 2014 7:25:55 PM CEST, Jason Merrill  wrote:
>On 08/14/2014 12:28 PM, Jason Merrill wrote:
>> On 08/14/2014 05:07 AM, Richard Biener wrote:
>>> So - can you take over this C++ frontend issue?
>>
>> OK.
>
>Here's what I'm applying to trunk:

Thanks Jason.

Richard.

Re: [patch, testsuite] Applying non_bionic effective target to particular tests

2014-08-15 Thread enh

On Fri, Aug 15, 2014 at 1:05 PM, Alexander Ivchenko  wrote:
> 2014-08-15 21:32 GMT+04:00 enh :
>> can you file bugs against bionic for stuff like this? use
>> b.android.com (and feel free to mail me to ensure that they get
>> noticed).
>
> Sure, I will do that.
>
>> one thing we'd like to do is get to a point where we're building
>> gcc/gdb et cetera without any local hacks, and when we've got to that
>> point, we're going to have to go through anything that made it
>> upstream to check that that's sane. (the weird "-shared implies
>> -Bsymbolic" GCC hack springs to mind.)
>
> There are more of those hacks for sure (actually "shared implies
> -Bsymbolic" is in gcc trunk, so it is not a local hack (Although, it
> doesn't necessarily mean that it doesn't have to be changed.

yeah, this is the kind of thing that worries me most: where bad ideas
have been upstreamed, so now not only do we need to remove them from
Android's copy, we need to get them out of upstream GCC too.

> But there
> are certainly other things that are local and have to be upstreamed).
> From our side we are trying to upstream things first and then, if
> necessary, to port them to ndk.
>
> --Alexander

[patch] fix guality/nrv-1.c LTO failure

2014-08-15 Thread Aldy Hernandez

This test is failing with LTO because in the LTRANS phase (DCE) we 
realize that the call to f() is useless, so we don't generate it.  This 
leads to an uncalled f() which also gets deleted.  We end up with an 
empty main(), and rightly so, gdb has nothing good to print.


Marking `a1' as used keeps things in good enough shape so everyone is happy.
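
For reference, a reduced sketch of the shape of such a test follows. It is not the actual nrv-1.c: the function name `make_a` and the stored value are made up, and only the `used` and `noinline` attributes mirror what the message describes. Both are GNU extensions, so the sketch assumes GCC or a compatible compiler.

```cpp
struct A
{
  int i[100];
};

// `used` asks the compiler to keep a1 even if it looks unreferenced,
// which is the effect the fix relies on.
struct A a1 __attribute__((used)), a3;

// Returns a large struct by value, the pattern that the named return
// value optimization applies to; noinline keeps the call visible to a
// debugger.
__attribute__((noinline)) struct A
make_a ()
{
  struct A a2 = {};
  a2.i[0] = 1;
  return a2;
}
```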

OK for mainline?

* guality/nrv-1.c: Add `used' attribute to a1.

diff --git a/gcc/testsuite/gcc.dg/guality/nrv-1.c b/gcc/testsuite/gcc.dg/guality/nrv-1.c
index 6e70050..2f4e654 100644
--- a/gcc/testsuite/gcc.dg/guality/nrv-1.c
+++ b/gcc/testsuite/gcc.dg/guality/nrv-1.c
@@ -8,7 +8,7 @@ struct A
   int i[100];
 };

-struct A a1, a3;
+struct A a1 __attribute__((used)), a3;

 __attribute__((noinline)) struct A
 f ()

Re: [wwwdocs] Re: gcc.gnu.org/simtest-howto.html (was: Question for ARM person re asm_fprintf)

2014-08-15 Thread Oleg Endo

On Fri, 2014-08-15 at 22:58 +0200, Oleg Endo wrote:
> On Mon, 2014-08-04 at 08:19 +0200, Oleg Endo wrote:
> > 
> > On Aug 4, 2014, at 6:00 AM, Gerald Pfeifer  wrote:
> > 
> > > On Wed, 23 Jul 2014, Hans-Peter Nilsson wrote:
> > >> The page  is
> > >> unfortunately out of date (e.g. binutils+sim now lives in the
> > >> same git repo) but it gives you the idea.
> > > 
> > > Sooo, any volunteer to update this page?  Doesn't have to be
> > > perfect, even incremental improvements help.
> > > 
> > > Or is it bad enough that we should rather remove this unless/
> > > until someone steps up?
> > 
> > Since I'm basically doing all the testing in sh-sim, I could try to update 
> > that page.
> 
> How about the attached .html as a replacement for the current one?
> I removed the requirement of setting up a combined tree, as I believe
> it makes things much easier.  At least it's been working for me
> that way.  Is this helpful / OK to commit?

Maybe a patch is better in this case, instead of the whole .html.

Cheers,
Oleg
? simtest_howto.patch
Index: simtest-howto.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/simtest-howto.html,v
retrieving revision 1.29
diff -u -r1.29 simtest-howto.html
--- simtest-howto.html	27 Jun 2014 11:48:46 -	1.29
+++ simtest-howto.html	15 Aug 2014 21:00:31 -
@@ -1,9 +1,41 @@
-
-  
-How to test GCC on a simulator
-  
+
+  http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd";>
+ 
+
+
+ 
+
+
+
+
+
+
+
+
+
+
+ http://www.w3.org/1999/xhtml"; xml:lang="en" lang="en">
+  
+   
+ 
+
+
+https://gcc.gnu.org/favicon.ico"; />
+https://gcc.gnu.org/gcc.css"; />
+  
+ 
+How to test GCC on a simulator
+- GNU Project - Free Software Foundation (FSF)
+  
+   
+ 
 
   
+
+
+
 How to test GCC on a simulator
 
 Several GCC targets can be tested using simulators.  These allow
@@ -11,270 +43,280 @@
 access to hardware, or for targets that have general characteristics
 you'd like to test, like endianness or word size.
 
-All of the instructions here start out from a directory we'll
-call ${TOP}, which is initially empty.
-
-Set up sources
-
-Testing with a simulator requires use of a combined tree;
-you can't easily build newlib, required for simulator testing,
-outside of a combined tree, and the build of the other components
-is easiest in a combined tree.
-
-The combined tree contains GCC sources plus several modules of
-the src tree: binutils and
-newlib for the build and sim for the
-simulators. If you already build with a combined tree you can use
-your current setup; if not, these instructions will get you the
-sources you need.
-
-Check out initial CVS trees
-
-If you don't yet have either tree you'll need to do an initial
-check-outs.
+Setup
 
-Check out mainline GCC:
+Testing with a simulator requires an installation of a working
+GCC toolchain and a GDB simulator.  For a list of supported targets please
+refer to the http://www.sourceware.org/gdb/download/onlinedocs/gdb.html#Embedded-Processors";>GDB documentation.
+The following describes how to create a Renesas SuperH cross compiler
+setup that can be used for simulator testing.
+
 
+All of the instructions here start out from a directory we'll
+call ${TOP}, which contains the following directories with
+unpacked sources:
+
+${TOP}/binutils-src
+${TOP}/gcc-src
+${TOP}/newlib-src
+${TOP}/gdb-src
+
+and the following corresponding build directories which are initially
+empty:
+
+${TOP}/binutils-build
+${TOP}/gcc-build
+${TOP}/newlib-build
+${TOP}/gdb-build
+
+To keep things simple, all parts of the cross toolchain will be installed
+into /usr/local, which usually requires superuser rights for
+writing.  If that is inconvenient, you might also install it into another
+place.
+
+
+Build and install Binutils
 
-cd ${TOP}
-svn checkout svn://gcc.gnu.org/svn/gcc/trunk gcc
-# This makes sure that file timestamps are in order initially.
-cd ${TOP}/gcc
-contrib/gcc_update --touch
+cd ${TOP}/binutils-build
+../binutils-src/configure --target=sh-elf --prefix=/usr/local --disable-nls --disable-werror
+make all
+sudo make install
 
+This should have installed the cross binutils for the target, e.g.
+sh-elf-as is the assembler for our SuperH target.
 
-Check out the src tree:
 
+Build and install GCC (compiler only)
 
-cd ${TOP}
-cvs -d :pserver:anon...@sourceware.org:/cvs/src login
-# You will be prompted for a password; reply with "anoncvs".
-cvs -d :pserver:anon...@sourceware.org:/cvs/src co binutils newlib sim
+cd ${TOP}/gcc-build
+../gcc-src/configure --target=sh-elf --prefix=/usr/local --enable-languages=c,c++ --disable-nls --disable-werror --with-newlib --enable-lto --enable-multilib
+make all-g

Re: [PATCH] Remove current_function_decl usage from get_polymorphic_call_info

2014-08-15 Thread Jan Hubicka

> Hi,
> 
> Testing 'mpx' branch after merge with trunk I got a segfault in ipa-devirt.c. 
>  It appears that cgraph_node cloning with indirect edge causes call to 
> get_polymorphic_call_info which uses current_function_decl.  It happens in 
> IPA pass and therefore current_function_decl is NULL which causes segfault.  
> Also even within a GIMPLE pass it seems wrong to use current_function_decl 
> because examined call may belong to another function and passed fndecl should 
> be used instead.
> 
> Proposed patch was bootstrapped and regtested on linux-x86_64.  OK for trunk?

OK,
thanks

Honza
> 
> Thanks,
> Ilya
> --
> 
> 2014-08-13  Ilya Enkovich  
> 
>   * ipa-devirt.c (get_polymorphic_call_info): Use fndecl instead of
>   current_function_decl.
> 
> 
> diff --git a/gcc/ipa-devirt.c b/gcc/ipa-devirt.c
> index 3650b43..0f38655 100644
> --- a/gcc/ipa-devirt.c
> +++ b/gcc/ipa-devirt.c
> @@ -2319,7 +2319,7 @@ get_polymorphic_call_info (tree fndecl,
>= decl_maybe_in_construction_p (base,
>context->outer_type,
>call,
> -  current_function_decl);
> +  fndecl);
> return base;
>   }
> else

Re: [PATCH] Avoid redundant indirect_info computation during indirect edge cloning

2014-08-15 Thread Jan Hubicka

> Hi,
> 
> I get a segfault in decl_maybe_in_construction_p during function versioning.  
> We have the following steps in clone creation (e.g. as in 
> create_version_clone_with_body):
>  1. Create function decl
>  2. Create clone of cgraph node
>  3. Copy function body
> After the first step there is no body attached to function and 
> DECL_STRUCT_FUNCTION for new decl is NULL.  It is initialized on the third 
> step.  But on the second step get_polymorphic_call_info may be called for new 
> function; it calls decl_maybe_in_construction_p which assumes 
> DECL_STRUCT_FUNCTION already exists.
> 
> I first wanted to fix decl_maybe_in_construction_p but then realized that 
> cgraph_clone_edge copies indirect_info from the original edge anyway, so 
> its computation is not required at all.
> 
> The following patch removes the redundant indirect_info computation.  
> Bootstrapped and regtested on linux-x86_64.  Does it look OK for trunk?

OK, please also add the testcase from 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61800

Thanks,
Honza
> 
> Thanks,
> Ilya
> --
> 2014-08-15  Ilya Enkovich  
> 
>   * cgraph.h (cgraph_node::create_indirect_edge): Add
>   compute_indirect_info param.
>   * cgraph.c (cgraph_node::create_indirect_edge): Compute
>   indirect_info only when it is required.
>   * cgraphclones.c (cgraph_clone_edge): Do not recompute
>   indirect_info for cloned indirect edge.
> 
> 
> diff --git a/gcc/cgraph.c b/gcc/cgraph.c
> index 370a96a..cb49cdc 100644
> --- a/gcc/cgraph.c
> +++ b/gcc/cgraph.c
> @@ -942,7 +942,8 @@ cgraph_allocate_init_indirect_info (void)
>  
>  struct cgraph_edge *
>  cgraph_node::create_indirect_edge (gimple call_stmt, int ecf_flags,
> -gcov_type count, int freq)
> +gcov_type count, int freq,
> +bool compute_indirect_info)
>  {
>struct cgraph_edge *edge = cgraph_node::create_edge (this, NULL, call_stmt,
>  count, freq, true);
> @@ -954,7 +955,8 @@ cgraph_node::create_indirect_edge (gimple call_stmt, int 
> ecf_flags,
>edge->indirect_info->ecf_flags = ecf_flags;
>  
>/* Record polymorphic call info.  */
> -  if (call_stmt
> +  if (compute_indirect_info
> +  && call_stmt
>&& (target = gimple_call_fn (call_stmt))
>&& virtual_method_call_p (target))
>  {
> diff --git a/gcc/cgraph.h b/gcc/cgraph.h
> index 13c09af..2594ae5 100644
> --- a/gcc/cgraph.h
> +++ b/gcc/cgraph.h
> @@ -915,7 +915,8 @@ public:
>   statement destination is a formal parameter of the caller with index
>   PARAM_INDEX. */
>struct cgraph_edge *create_indirect_edge (gimple call_stmt, int ecf_flags,
> - gcov_type count, int freq);
> + gcov_type count, int freq,
> + bool compute_indirect_info = true);
>  
>/* Like cgraph_create_edge walk the clone tree and update all clones 
> sharing
> same function body.  If clones already have edge for OLD_STMT; only
> diff --git a/gcc/cgraphclones.c b/gcc/cgraphclones.c
> index c04b5c8..557f734 100644
> --- a/gcc/cgraphclones.c
> +++ b/gcc/cgraphclones.c
> @@ -136,7 +136,7 @@ cgraph_clone_edge (struct cgraph_edge *e, struct 
> cgraph_node *n,
>   {
> new_edge = n->create_indirect_edge (call_stmt,
> e->indirect_info->ecf_flags,
> -   count, freq);
> +   count, freq, false);
> *new_edge->indirect_info = *e->indirect_info;
>   }
>  }

[patch, fortran] Fix PR 62142

2014-08-15 Thread Thomas Koenig

Hello world,

I committed the attached patch as obvious to trunk after regression
testing.  It fixes the regression in the test case by adding
a NULL check to a pointer.  Will commit to 4.9 soon.
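
The kind of guard the patch adds can be sketched as follows.  This is only an illustration, not the gfortran code: Expr, ActualArg and all_conformable are hypothetical stand-ins, and the assumption is that an argument list can hold entries whose expression pointer is null, for instance when an optional argument is omitted.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for gfc_expr and gfc_actual_arglist.
struct Expr {
  int rank;
};

struct ActualArg {
  Expr *expr;       // assumed to be null when the argument is absent
  ActualArg *next;
};

// Returns true when every present array argument has the wanted rank.
// Absent arguments (null expr) are skipped instead of dereferenced.
bool all_conformable(const ActualArg *args, int wanted_rank) {
  for (const ActualArg *a = args; a != nullptr; a = a->next) {
    const Expr *e = a->expr;
    if (e && e->rank > 0 && e->rank != wanted_rank)
      return false;
  }
  return true;
}
```

Without the leading `e &&` test, the loop would dereference a null pointer as soon as it reached the absent argument.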

Regards

Thomas

2014-08-15  Thomas Koenig  

PR fortran/62142
* trans-expr.c (is_runtime_conformable):  Add NULL pointer check.

2014-08-15  Thomas Koenig  

PR fortran/62142
* gfortran.dg/realloc_on_assign_24.f90:  New test.
Index: trans-expr.c
===
--- trans-expr.c	(Revision 213778)
+++ trans-expr.c	(Arbeitskopie)
@@ -7895,7 +7895,7 @@ is_runtime_conformable (gfc_expr *expr1, gfc_expr
 	  for (a = expr2->value.function.actual; a != NULL; a = a->next)
 	{
 	  e1 = a->expr;
-	  if (e1->rank > 0 && !is_runtime_conformable (expr1, e1))
+	  if (e1 && e1->rank > 0 && !is_runtime_conformable (expr1, e1))
 		return false;
 	}
 	  return true;
@@ -7906,7 +7906,7 @@ is_runtime_conformable (gfc_expr *expr1, gfc_expr
 	  for (a = expr2->value.function.actual; a != NULL; a = a->next)
 	{
 	  e1 = a->expr;
-	  if (e1->rank > 0 && !is_runtime_conformable (expr1, e1))
+	  if (e1 && e1->rank > 0 && !is_runtime_conformable (expr1, e1))
 		return false;
 	}
 	  return true;
! { dg-do compile }
! PR 62142 - this used to segfault
! Original test case by Ondřej Čertík.
program test_segfault
  implicit none
  real, allocatable :: X(:)
  allocate (x(1))
  x = 1.
  X = floor(X)
end program

Re: [PATCH 169/236] Strengthen haifa_sched_info callbacks and 3 scheduler hooks

2014-08-15 Thread Jeff Law


On 08/06/14 11:22, David Malcolm wrote:

gcc/
* target.def (reorder): Strengthen param "ready" of this DEFHOOK
from rtx * to rtx_insn **.
(reorder2): Likewise.
(dependencies_evaluation_hook): Strengthen params "head", "tail"
from rtx to rtx_insn *.

* doc/tm.texi: Update mechanically for above change to target.def.

* sched-int.h (note_list): Strengthen this variable from rtx to
rtx_insn *.
(remove_notes): Likewise for both params.
(restore_other_notes): Likewise for return type and first param.
(struct ready_list): Strengthen field "vec" from rtx * to
rtx_insn **.
(struct dep_replacement): Strengthen field "insn" from rtx to
rtx_insn *.
(struct deps_desc): Likewise for fields "last_debug_insn",
"last_args_size".
(struct haifa_sched_info): Likewise for callback field
"can_schedule_ready_p"'s param, for first param of "new_ready"
callback field, for both params of "rank" callback field, for
first field of "print_insn" callback field (with a const), for
both params of "contributes_to_priority" callback, for param
of "insn_finishes_block_p" callback, for fields "prev_head",
"next_tail", "head", "tail", for first param of "add_remove_insn"
callback, for first param of "begin_schedule_ready" callback, for
both params of "begin_move_insn" callback, and for second param
of "advance_target_bb" callback.
(add_dependence): Likewise for params 1 and 2.
(sched_analyze): Likewise for params 2 and 3.
(deps_analyze_insn): Likewise for param 2.
(ready_element): Likewise for return type.
(ready_lastpos): Strengthen return type from rtx * to rtx_insn **.
(try_ready): Strengthen param from rtx to rtx_insn *.
(sched_emit_insn): Likewise for return type.
(record_delay_slot_pair): Likewise for params 1 and 2.
(add_delay_dependencies): Likewise for param.
(contributes_to_priority): Likewise for both params.
(find_modifiable_mems): Likewise.

* config/arm/arm.c (cortexa7_sched_reorder):  Strengthen param
"ready" from rtx * to rtx_insn **.  Strengthen locals "insn",
"first_older_only_insn" from rtx to rtx_insn *.
(arm_sched_reorder):  Strengthen param "ready"  from rtx * to
rtx_insn **.

* config/c6x/c6x.c (struct c6x_sched_context): Strengthen field
"last_scheduled_iter0" from rtx to rtx_insn *.
(init_sched_state): Replace use of NULL_RTX with NULL for insn.
(c6x_sched_reorder_1): Strengthen param "ready" and locals
"e_ready", "insnp" from rtx * to rtx_insn **.  Strengthen local
"insn" from rtx to rtx_insn *.
(c6x_sched_reorder): Strengthen param "ready" from rtx * to
rtx_insn **.
(c6x_sched_reorder2): Strengthen param "ready" and locals
"e_ready", "insnp" from rtx * to rtx_insn **. Strengthen local
"insn" from rtx to rtx_insn *.
(c6x_variable_issue):  Add a checked cast when assigning from insn
to ss.last_scheduled_iter0.
(split_delayed_branch): Strengthen param "insn" and local "i1"
from rtx to rtx_insn *.
(split_delayed_nonbranch): Likewise.
(undo_split_delayed_nonbranch): Likewise for local "insn".
(hwloop_optimize): Likewise for locals "seq", "insn", "prev",
"entry_after", "end_packet", "head_insn", "tail_insn",
"new_insns", "last_insn", "this_iter", "prev_stage_insn".
Strengthen locals "orig_vec", "copies", "insn_copies" from rtx *
to rtx_insn **.  Remove now-redundant checked cast on last_insn,
but add a checked cast on loop->start_label.  Consolidate calls to
avoid assigning result of gen_spkernel to "insn", now an
rtx_insn *.

* config/i386/i386.c (do_reorder_for_imul): Strengthen param
"ready" from rtx * to rtx_insn **.  Strengthen local "insn" from
rtx to rtx_insn *.
(swap_top_of_ready_list): Strengthen param "ready" from rtx * to
rtx_insn **.  Strengthen locals "top", "next" from rtx to
rtx_insn *.
(ix86_sched_reorder): Strengthen param "ready" from rtx * to
rtx_insn **.  Strengthen local "insn" from rtx to rtx_insn *.
(add_parameter_dependencies): Strengthen params "call", "head" and
locals "insn", "last", "first_arg" from rtx to rtx_insn *.
(avoid_func_arg_motion): Likewise for params "first_arg", "insn".
(add_dependee_for_func_arg): Likewise for param "arg" and local
"insn".
(ix86_dependencies_evaluation_hook): Likewise for params "head",
"tail" and locals "insn", "first_arg".

* config/ia64/ia64.c (ia64_dependencies_evaluation_hook): Likewise
for params "head", "tail" and locals "insn", "next", "next_tail".
(ia64_dfa_sched_reorder): Strengthen pa

[Patch, Fortran] Fix DECL of namelist I/O function; fix FINALIZATION

2014-08-15 Thread Tobias Burnus


This patch fixes two minor issues

a) The argument issue mentioned in 
https://gcc.gnu.org/ml/fortran/2014-08/msg7.html
The main issue is that the decl uses "void" as argument; the FE passes 
IARG() alias gfc_array_index_type while the library expects a 
GFC_INTEGER_4. As n_dim and ts->kind are small, I have chosen to keep 
GFC_INTEGER_4 in the library and use int32_t for the argument and in the 
decl.


b) At its end, resolve_finalizer calls the function that obtains the 
vtab, which in turn calls the vtab function of the parent, which tries 
to generate the _final entry, which requires that the finalizers are 
resolved. In the test case, the parent's finalizer wasn't ready, leading 
to an ICE in an assert. The patch now first resolves the parent's 
finalizers before taking care of its own.
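
The ordering in (b) can be sketched roughly like this.  It is a simplified model, not the resolve.c code: Type and resolve_finalizers are hypothetical names, and the early return on an already-resolved type is an assumption added for the sketch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a derived type with an optional parent.
struct Type {
  std::string name;
  Type *parent;
  bool finalizers_resolved;
};

// Resolve the parent's finalizers before the type's own, recording the order.
void resolve_finalizers(Type *t, std::vector<std::string> &order) {
  if (t->finalizers_resolved)
    return;
  if (t->parent)
    resolve_finalizers(t->parent, order);
  // Work done from here on (such as building a vtab entry) can rely on
  // the parent being fully resolved.
  t->finalizers_resolved = true;
  order.push_back(t->name);
}
```

With the types from the new test case, resolving bar would first resolve p_vec, so anything that later looks at the parent finds it ready.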


Built and regtested on x86-64-gnu-linux.
OK for the trunk?

Tobias
2014-08-15  Tobias Burnus  

	* trans-io.c (gfc_build_io_library_fndecls): Fix decl of
	IOCALL_SET_NML_VAL.
	(transfer_namelist_element): Use proper int type as argument.

diff --git a/gcc/fortran/trans-io.c b/gcc/fortran/trans-io.c
index cbe54ab..4340afb 100644
--- a/gcc/fortran/trans-io.c
+++ b/gcc/fortran/trans-io.c
@@ -467,7 +467,7 @@ gfc_build_io_library_fndecls (void)
   iocall[IOCALL_SET_NML_VAL] = gfc_build_library_function_decl_with_spec (
 	get_identifier (PREFIX("st_set_nml_var")), ".w.R",
 	void_type_node, 6, dt_parm_type, pvoid_type_node, pvoid_type_node,
-	void_type_node, gfc_charlen_type_node, gfc_int4_type_node);
+	gfc_int4_type_node, gfc_charlen_type_node, gfc_int4_type_node);
 
   iocall[IOCALL_SET_NML_VAL_DIM] = gfc_build_library_function_decl_with_spec (
 	get_identifier (PREFIX("st_set_nml_var_dim")), ".w",
@@ -1557,6 +1557,7 @@ transfer_namelist_element (stmtblock_t * block, const char * var_name,
   tree dtype;
   tree dt_parm_addr;
   tree decl = NULL_TREE;
+  tree gfc_int4_type_node = gfc_get_int_type (4);
   int n_dim;
   int itype;
   int rank = 0;
@@ -1605,7 +1606,8 @@ transfer_namelist_element (stmtblock_t * block, const char * var_name,
   tmp = build_call_expr_loc (input_location,
 			 iocall[IOCALL_SET_NML_VAL], 6,
 			 dt_parm_addr, addr_expr, string,
-			 IARG (ts->kind), tmp, dtype);
+			 build_int_cst (gfc_int4_type_node, ts->kind),
+			 tmp, dtype);
   gfc_add_expr_to_block (block, tmp);
 
   /* If the object is an array, transfer rank times:
@@ -1616,7 +1618,7 @@ transfer_namelist_element (stmtblock_t * block, const char * var_name,
   tmp = build_call_expr_loc (input_location,
 			 iocall[IOCALL_SET_NML_VAL_DIM], 5,
 			 dt_parm_addr,
-			 IARG (n_dim),
+			 build_int_cst (gfc_int4_type_node, n_dim),
 			 gfc_conv_array_stride (decl, n_dim),
 			 gfc_conv_array_lbound (decl, n_dim),
 			 gfc_conv_array_ubound (decl, n_dim));
2014-08-15  Tobias Burnus  

	* resolve.c (gfc_resolve_finalizers): Ensure that parents are
	resolved first.

2014-08-15  Tobias Burnus  

	* gfortran.dg/finalize_27.f90: New.

diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c
index ea28ef4..32ff9dd 100644
--- a/gcc/fortran/resolve.c
+++ b/gcc/fortran/resolve.c
@@ -11416,6 +11416,10 @@ gfc_resolve_finalizers (gfc_symbol* derived, bool *finalizable)
   bool seen_scalar = false;
   gfc_symbol *vtab;
   gfc_component *c;
+  gfc_symbol *parent = gfc_get_derived_super_type (derived);
+
+  if (parent)
+gfc_resolve_finalizers (parent, finalizable);
 
   /* Return early when not finalizable. Additionally, ensure that derived-type
  components have a their finalizables resolved.  */
diff --git a/gcc/testsuite/gfortran.dg/finalize_27.f90 b/gcc/testsuite/gfortran.dg/finalize_27.f90
new file mode 100644
index 000..bdc7c45
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/finalize_27.f90
@@ -0,0 +1,25 @@
+! { dg-do compile }
+!
+! Was ICEing before
+!
+! Contributed by Reinhold Bader
+!
+
+module mod_fin_04
+  implicit none
+  type :: p_vec
+  contains
+ final :: delete
+  end type p_vec
+  type, extends(p_vec) :: bar
+  contains
+final :: del2
+  end type bar
+contains
+  subroutine delete(this)
+type(p_vec) :: this
+  end subroutine delete
+  subroutine del2(this)
+type(bar) :: this
+  end subroutine del2
+end module

Re: [PATCH 194/236] Use rtx_insn for various target.def hooks

2014-08-15 Thread Jeff Law


On 08/06/14 11:22, David Malcolm wrote:

This patch updates params of 22 of the target hooks to pass an rtx_insn *
rather than an rtx, as appropriate.

Known to compile on:
alpha
arc
arm
bfin
c6x
epiphany
ia64
m32c
m32r
m68k
mep
microblaze
mips
pa
pdp11
picochip
rs6000
s390
sh
sparc
spu
tilegx
tilepro
x86_64

gcc/
* target.def (unwind_emit): Strengthen param "insn" from rtx to
rtx_insn *.
(final_postscan_insn): Likewise.
(adjust_cost): Likewise.
(adjust_priority): Likewise.
(variable_issue): Likewise.
(macro_fusion_pair_p): Likewise.
(dfa_post_cycle_insn): Likewise.
(first_cycle_multipass_dfa_lookahead_guard): Likewise.
(first_cycle_multipass_issue): Likewise.
(dfa_new_cycle): Likewise.
(adjust_cost_2): Likewise for params "insn" and "dep_insn".
(speculate_insn): Likewise for param "insn".
(gen_spec_check): Likewise for params "insn" and "label".
(get_insn_spec_ds): Likewise for param "insn".
(get_insn_checked_ds): Likewise.
(dispatch_do): Likewise.
(dispatch): Likewise.
(cannot_copy_insn_p): Likewise.
(invalid_within_doloop): Likewise.
(legitimate_combined_insn): Likewise.
(needed): Likewise.
(after): Likewise.

* doc/tm.texi: Automatically updated to reflect changes to
target.def.

* haifa-sched.c (choose_ready): Convert NULL_RTX to NULL when
working with insn.
(schedule_block): Likewise.
(sched_init): Likewise.
(sched_speculate_insn): Strengthen param "insn" from rtx to
rtx_insn *.
(ready_remove_first_dispatch): Convert NULL_RTX to NULL when
working with insn.
* hooks.c (hook_bool_rtx_true): Rename to...
(hook_bool_rtx_insn_true): ...this, and strengthen first param from
rtx to rtx_insn *.
(hook_constcharptr_const_rtx_null): Rename to...
(hook_constcharptr_const_rtx_insn_null): ...this, and strengthen
first param from const_rtx to const rtx_insn *.
(hook_bool_rtx_int_false): Rename to...
(hook_bool_rtx_insn_int_false): ...this, and strengthen first
param from rtx to rtx_insn *.
(hook_void_rtx_int): Rename to...
(hook_void_rtx_insn_int): ...this, and strengthen first param from
rtx to rtx_insn *.

* hooks.h (hook_bool_rtx_true): Rename to...
(hook_bool_rtx_insn_true): ...this, and strengthen first param from
rtx to rtx_insn *.
(hook_bool_rtx_int_false): Rename to...
(hook_bool_rtx_insn_int_false): ...this, and strengthen first
param from rtx to rtx_insn *.
(hook_void_rtx_int): Rename to...
(hook_void_rtx_insn_int): ...this, and strengthen first param from
rtx to rtx_insn *.
(hook_constcharptr_const_rtx_null): Rename to...
(hook_constcharptr_const_rtx_insn_null): ...this, and strengthen
first param from const_rtx to const rtx_insn *.

* sched-deps.c (try_group_insn): Strengthen param "insn" and local
"prev" from rtx to rtx_insn *.

* sched-int.h (sched_speculate_insn): Strengthen first param from
rtx to rtx_insn *.

* sel-sched.c (create_speculation_check): Likewise for local "label".
* targhooks.c (default_invalid_within_doloop): Strengthen param
"insn" from const_rtx to const rtx_insn *.
* targhooks.h (default_invalid_within_doloop): Strengthen param
from const_rtx to const rtx_insn *.

* config/alpha/alpha.c (alpha_cannot_copy_insn_p): Likewise.
(alpha_adjust_cost): Likewise for params "insn", "dep_insn".

* config/arc/arc.c (arc_sched_adjust_priority): Likewise for param 
"insn".
(arc_invalid_within_doloop): Likewise, with const.

* config/arm/arm.c (arm_adjust_cost): Likewise for params "insn", "dep".
(arm_cannot_copy_insn_p): Likewise for param "insn".
(arm_unwind_emit): Likewise.

* config/bfin/bfin.c (bfin_adjust_cost): Likewise for params "insn",
"dep_insn".

* config/c6x/c6x.c (c6x_dfa_new_cycle): Likewise for param "insn".
(c6x_variable_issue): Likewise.  Removed now-redundant checked
cast.
(c6x_adjust_cost): Likewise for params "insn", "dep_insn".

* config/epiphany/epiphany-protos.h (epiphany_mode_needed):
Likewise for param "insn".
(epiphany_mode_after): Likewise.
* config/epiphany/epiphany.c (epiphany_adjust_cost): Likewise for
params "insn", "dep_insn".
(epiphany_mode_needed): Likewise for param "insn".
(epiphany_mode_after): Likewise.

* config/i386/i386-protos.h (i386_pe_seh_unwind_emit): Likewise.
* config/i386/i386.c (ix86_legitimate_combined_insn): Likewise.
(ix86_avx_u128_mode_needed): Likewise.
(ix86_i387_mode_needed): Likewise.
(ix86_mode_needed): Likewise.
(i

Re: [PATCH 204/236] final.c: Use rtx_sequence

2014-08-15 Thread Jeff Law


On 08/06/14 11:23, David Malcolm wrote:

gcc/
* final.c (get_attr_length_1): Replace GET_CODE check with a
dyn_cast, introducing local "seq" and the use of methods of
rtx_sequence.
(shorten_branches): Likewise, introducing local "body_seq".
Strengthen local "inner_insn" from rtx to rtx_insn *.
(reemit_insn_block_notes): Replace GET_CODE check with a
dyn_cast, strengthening local "body" from rtx to rtx_sequence *.
Use methods of rtx_sequence.
(final_scan_insn): Likewise, introducing local "seq" for when
"body" is known to be a SEQUENCE, using its methods.

So presumably a dyn_cast isn't terribly expensive here?  I guess I'm a 
bit fuzzy on whether or not we agreed to allow using dynamic casts.  
Doesn't that have to check the RTTI info, which I would think would be 
considerably more expensive than just checking the code?  Or am I 
missing something here?


Jeff

Re: [PATCH 205/236] function.c: Use rtx_sequence

2014-08-15 Thread Jeff Law


On 08/06/14 11:23, David Malcolm wrote:

gcc/
* function.c (contains): Introduce local "seq" for PATTERN (insn),
with a checked cast, in the region for where we know it's a
SEQUENCE.  Use methods of rtx_sequence.

OK.  As is #206.

Jeff

Re: [PATCH 207/236] reorg.c: Use rtx_sequence

2014-08-15 Thread Jeff Law


On 08/06/14 11:23, David Malcolm wrote:

gcc/
* reorg.c (redundant_insn): In two places in the function, replace
a check of GET_CODE with a dyn_cast, introducing local "seq", and
using methods of rtx_sequence to clarify the code.

Some concerns here with the dynamic cast.

jeff

Re: C++ PATCH for c++/61566 (ICE with lambda in template default arg)

2014-08-15 Thread Jason Merrill


On 06/30/2014 02:49 PM, Jason Merrill wrote:

decl_mangling_context was failing to recognize a lambda in template
context as a lambda.


It turns out that was far from the only issue with a lambda in a member 
template.


Tested x86_64-pc-linux-gnu, applying to trunk.

commit 55ef57277be5885a78bacf4e979a8def08e4fbb6
Author: Jason Merrill 
Date:   Fri Aug 15 01:53:54 2014 -0400

	PR c++/61566
	* pt.c (instantiate_class_template_1): Ignore lambda on
	CLASSTYPE_DECL_LIST.
	(push_template_decl_real): A lambda is not primary.
	* lambda.c (maybe_add_lambda_conv_op): Distinguish between being
	currently in a function and the lambda living in a function.

diff --git a/gcc/cp/lambda.c b/gcc/cp/lambda.c
index 169f438..ddaa940 100644
--- a/gcc/cp/lambda.c
+++ b/gcc/cp/lambda.c
@@ -824,6 +824,7 @@ void
 maybe_add_lambda_conv_op (tree type)
 {
   bool nested = (current_function_decl != NULL_TREE);
+  bool nested_def = decl_function_context (TYPE_MAIN_DECL (type));
   tree callop = lambda_function (type);
 
   if (LAMBDA_EXPR_CAPTURE_LIST (CLASSTYPE_LAMBDA_EXPR (type)) != NULL_TREE)
@@ -976,7 +977,7 @@ maybe_add_lambda_conv_op (tree type)
   DECL_NOT_REALLY_EXTERN (fn) = 1;
   DECL_DECLARED_INLINE_P (fn) = 1;
   DECL_ARGUMENTS (fn) = build_this_parm (fntype, TYPE_QUAL_CONST);
-  if (nested)
+  if (nested_def)
 DECL_INTERFACE_KNOWN (fn) = 1;
 
   if (generic_lambda_p)
@@ -1016,7 +1017,7 @@ maybe_add_lambda_conv_op (tree type)
   DECL_NAME (arg) = NULL_TREE;
   DECL_CONTEXT (arg) = fn;
 }
-  if (nested)
+  if (nested_def)
 DECL_INTERFACE_KNOWN (fn) = 1;
 
   if (generic_lambda_p)
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 6a7bcb8..611bfd6 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -4722,6 +4722,9 @@ push_template_decl_real (tree decl, bool is_friend)
  template  friend void A::f();
is not primary.  */
 is_primary = false;
+  else if (TREE_CODE (decl) == TYPE_DECL
+	   && LAMBDA_TYPE_P (TREE_TYPE (decl)))
+is_primary = false;
   else
 is_primary = template_parm_scope_p ();
 
@@ -9237,6 +9242,11 @@ instantiate_class_template_1 (tree type)
 		  && DECL_OMP_DECLARE_REDUCTION_P (r))
 		cp_check_omp_declare_reduction (r);
 	}
+	  else if (DECL_CLASS_TEMPLATE_P (t)
+		   && LAMBDA_TYPE_P (TREE_TYPE (t)))
+	/* A closure type for a lambda in a default argument for a
+	   member template.  Ignore it; it will be instantiated with
+	   the default argument.  */;
 	  else
 	{
 	  /* Build new TYPE_FIELDS.  */
diff --git a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-template13.C b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-template13.C
index adbb4db..2b1a605 100644
--- a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-template13.C
+++ b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-template13.C
@@ -7,6 +7,7 @@ struct function
   function (_Functor);
 };
 
+template 
 struct C
 {
   template 
@@ -15,6 +16,9 @@ struct C
 
 void bar ()
 {
-  C c;
+  C c;
   c.foo (1);
 }
+
+// { dg-final { scan-assembler "_ZN8functionC1IZN1CIiE3fooIiEEvT_S_Ed_UlvE_EET_" } }
+// { dg-final { scan-assembler-not "_ZZN1CIiE3fooIiEEvT_8functionEd_NKUlvE_clEv" } }

libgo patch committed: Don't lose track of m value in GC

2014-08-15 Thread Ian Lance Taylor

The runtime_gc function in libgo invokes the garbage collector proper on
the g0 thread (a thread with a large stack that is not involved in
scheduling).  This is done via runtime_mcall.  Upon return from
runtime_mcall, the caller may be running on a different thread.
Unfortunately, the runtime_gc function called runtime_m to get the Go
thread info, and did not refresh that value after calling
runtime_mcall.  The result mostly worked, but could fail in a very busy
program doing lots of work that started going as soon as the GC
stopped.  This patch fixes the problem.  Bootstrapped and ran Go
testsuite on x86_64-unknown-linux-gnu.  Committed to mainline and 4.9
branch.
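
A toy model of the stale-pointer problem, for illustration only.  None of this is libgo code: M, current_m and switch_and_run are hypothetical stand-ins for the thread info, runtime_m and runtime_mcall, and the thread migration is simulated by flipping a global pointer.

```cpp
#include <cassert>

// Hypothetical per-thread state.
struct M {
  int id;
  int gc_runs;
};

static M threads[2] = {{0, 0}, {1, 0}};
static M *active = &threads[0];

// Plays the role of runtime_m(): returns the M of the running thread.
M *current_m() { return active; }

// Plays the role of runtime_mcall(): runs fn, after which the caller may
// resume on a different thread.  Here it always moves to the other M.
void switch_and_run(void (*fn)()) {
  fn();
  active = (active == &threads[0]) ? &threads[1] : &threads[0];
}

void collect() {}

// Returns the M that receives the bookkeeping update after the collection.
M *gc_with_refresh() {
  M *m = current_m();
  switch_and_run(collect);
  m = current_m();  // refresh: the old pointer may describe another thread
  m->gc_runs++;
  return m;
}
```

Dropping the second current_m() call would update the M of the thread the caller started on, which is the kind of mismatch described above.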

Ian

diff -r 52dcc874d3b7 libgo/runtime/mgc0.c
--- a/libgo/runtime/mgc0.c	Wed Aug 13 14:52:52 2014 -0700
+++ b/libgo/runtime/mgc0.c	Fri Aug 15 15:07:02 2014 -0700
@@ -2204,6 +2204,7 @@
 		g->status = Gwaiting;
 		g->waitreason = "garbage collection";
 		runtime_mcall(mgc);
+		m = runtime_m();
 	}
 
 	// all done

Re: [PATCH 204/236] final.c: Use rtx_sequence

2014-08-15 Thread Trevor Saunders

On Fri, Aug 15, 2014 at 04:24:49PM -0600, Jeff Law wrote:
> On 08/06/14 11:23, David Malcolm wrote:
> >gcc/
> > * final.c (get_attr_length_1): Replace GET_CODE check with a
> > dyn_cast, introducing local "seq" and the use of methods of
> > rtx_sequence.
> > (shorten_branches): Likewise, introducing local "body_seq".
> > Strengthen local "inner_insn" from rtx to rtx_insn *.
> > (reemit_insn_block_notes): Replace GET_CODE check with a
> > dyn_cast, strengthening local "body" from rtx to rtx_sequence *.
> > Use methods of rtx_sequence.
> > (final_scan_insn): Likewise, introducing local "seq" for when
> > "body" is known to be a SEQUENCE, using its methods.
> So presumably a dyn_cast isn't terribly expensive here?  I guess I'm a bit
> fuzzy on whether or not we agreed to allow using dynamic casts?!? Doesn't
> that have to check the RTTI info which I would think would be considerably
> more expensive than just checking the code.  Or am I missing something here?

 You're missing that dyn_cast != dynamic_cast; the first is just a wrapper
 around as_a / is_a, and so doesn't use RTTI.
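
 To make that concrete, here is a much simplified sketch of the idea.  It is
 not the code in is-a.h: Rtx, RtxSequence and the Code enum are hypothetical
 stand-ins, and the real helpers are more general.  The point is only that
 the check is a comparison of a code field, so it builds with -fno-rtti.

```cpp
#include <cassert>

// Hypothetical, simplified model of an rtx with a code tag.
enum Code { INSN, SEQUENCE };

struct Rtx {
  Code code;
};

struct RtxSequence : Rtx {
  int len;
};

// is_a: a plain comparison of the code field, no RTTI involved.
template <typename T> bool is_a(const Rtx *p);
template <> bool is_a<RtxSequence>(const Rtx *p) {
  return p->code == SEQUENCE;
}

// dyn_cast: the downcast pointer when the code matches, otherwise null.
template <typename T> T *dyn_cast(Rtx *p) {
  return is_a<T>(p) ? static_cast<T *>(p) : nullptr;
}
```

 So the cost is the same GET_CODE-style test as before, plus a static_cast.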

Trev

> 
> Jeff
>

[SH][committed] Update SH options documentation

2014-08-15 Thread Oleg Endo

Hi,

The attached patch updates the SH options documentation.
Tested with 'make info dvi pdf'.  Committed to trunk, 4.9 and 4.8
branches.

Cheers,
Oleg

gcc/ChangeLog:
* doc/invoke.texi (SH options): Document missing processor variant
options.  Remove references to Hitachi.  Undocument deprecated mspace
option.
Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi	(revision 214044)
+++ gcc/doc/invoke.texi	(working copy)
@@ -20704,6 +20704,72 @@
 @opindex m4
 Generate code for the SH4.
 
+@item -m4-100
+@opindex m4-100
+Generate code for SH4-100.
+
+@item -m4-100-nofpu
+@opindex m4-100-nofpu
+Generate code for SH4-100 in such a way that the
+floating-point unit is not used.
+
+@item -m4-100-single
+@opindex m4-100-single
+Generate code for SH4-100 assuming the floating-point unit is in
+single-precision mode by default.
+
+@item -m4-100-single-only
+@opindex m4-100-single-only
+Generate code for SH4-100 in such a way that no double-precision
+floating-point operations are used.
+
+@item -m4-200
+@opindex m4-200
+Generate code for SH4-200.
+
+@item -m4-200-nofpu
+@opindex m4-200-nofpu
+Generate code for SH4-200 in such a way that the
+floating-point unit is not used.
+
+@item -m4-200-single
+@opindex m4-200-single
+Generate code for SH4-200 assuming the floating-point unit is in
+single-precision mode by default.
+
+@item -m4-200-single-only
+@opindex m4-200-single-only
+Generate code for SH4-200 in such a way that no double-precision
+floating-point operations are used.
+
+@item -m4-300
+@opindex m4-300
+Generate code for SH4-300.
+
+@item -m4-300-nofpu
+@opindex m4-300-nofpu
+Generate code for SH4-300 in such a way that the
+floating-point unit is not used.
+
+@item -m4-300-single
+@opindex m4-300-single
+Generate code for SH4-300 assuming the floating-point unit is in
+single-precision mode by default.
+
+@item -m4-300-single-only
+@opindex m4-300-single-only
+Generate code for SH4-300 in such a way that no double-precision
+floating-point operations are used.
+
+@item -m4-340
+@opindex m4-340
+Generate code for SH4-340 (no MMU, no FPU).
+
+@item -m4-500
+@opindex m4-500
+Generate code for SH4-500 (no FPU).  Passes @option{-isa=sh4-nofpu} to the
+assembler.
+
 @item -m4a-nofpu
 @opindex m4a-nofpu
 Generate code for the SH4al-dsp, or for a SH4a in such a way that the
@@ -20729,6 +20795,33 @@
 @option{-dsp} to the assembler.  GCC doesn't generate any DSP
 instructions at the moment.
 
+@item -m5-32media
+@opindex m5-32media
+Generate 32-bit code for SHmedia.
+
+@item -m5-32media-nofpu
+@opindex m5-32media-nofpu
+Generate 32-bit code for SHmedia in such a way that the
+floating-point unit is not used.
+
+@item -m5-64media
+@opindex m5-64media
+Generate 64-bit code for SHmedia.
+
+@item -m5-64media-nofpu
+@opindex m5-64media-nofpu
+Generate 64-bit code for SHmedia in such a way that the
+floating-point unit is not used.
+
+@item -m5-compact
+@opindex m5-compact
+Generate code for SHcompact.
+
+@item -m5-compact-nofpu
+@opindex m5-compact-nofpu
+Generate code for SHcompact in such a way that the
+floating-point unit is not used.
+
 @item -mb
 @opindex mb
 Compile code for the processor in big-endian mode.
@@ -20762,16 +20855,12 @@
 Enable the use of the instruction @code{fmovd}.  Check @option{-mdalign} for
 alignment constraints.
 
-@item -mhitachi
-@opindex mhitachi
-Comply with the calling conventions defined by Renesas.
-
 @item -mrenesas
-@opindex mhitachi
+@opindex mrenesas
 Comply with the calling conventions defined by Renesas.
 
 @item -mno-renesas
-@opindex mhitachi
+@opindex mno-renesas
 Comply with the calling conventions defined for GCC before the Renesas
 conventions were available.  This option is the default for all
 targets of the SH toolchain.
@@ -20779,7 +20868,7 @@
 @item -mnomacsave
 @opindex mnomacsave
 Mark the @code{MAC} register as call-clobbered, even if
-@option{-mhitachi} is given.
+@option{-mrenesas} is given.
 
 @item -mieee
 @itemx -mno-ieee
@@ -20885,10 +20974,6 @@
 processors the @code{tas.b} instruction must be used with caution since it
 can result in data corruption for certain cache configurations.
 
-@item -mspace
-@opindex mspace
-Optimize for space instead of speed.  Implied by @option{-Os}.
-
 @item -mprefergot
 @opindex mprefergot
 When generating position-independent code, emit function calls using

Re: [PATCH 156/236] PHASE 4: Removal of scaffolding

2014-08-15 Thread David Malcolm

On Thu, 2014-08-14 at 23:30 -0600, Jeff Law wrote:
> On 08/06/14 11:22, David Malcolm wrote:
> > /
> > * rtx-classes-status.txt: Update.
> > ---
> >   rtx-classes-status.txt | 4 ++--
> >   1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/rtx-classes-status.txt b/rtx-classes-status.txt
> > index b22cb1e..90d6efd 100644
> > --- a/rtx-classes-status.txt
> > +++ b/rtx-classes-status.txt
> > @@ -3,8 +3,8 @@ exists to be modified by marker commits.
> >
> >   Phase 1: initial "scaffolding" commits:DONE
> >   Phase 2: per-file commits in main source dir:  DONE
> > -Phase 3: per-file commits within "config" subdirs: IN PROGRESS
> > -Phase 4: removal of "scaffolding": TODO
> > +Phase 3: per-file commits within "config" subdirs: DONE
> > +Phase 4: removal of "scaffolding": IN PROGRESS
> >   Phase 5: additional rtx_def subclasses:TODO
> >   Phase 6: use extra rtx_def subclasses: TODO
> OK.
> 
> As are patches #157-#168
> 
> Patch #160's ChangeLog is a bit goofy in that it's unclear (to the reader 
> without context) what the "Likewise" refers to for the function.c 
> changes.  I'll assume you'll fix that up appropriately.

Oops, yes; I think I either merged or reordered the hunks in the
ChangeLog, and a "Likewise" ended up *before* the thing it was referring
to.  Will fixup.

Re: [PATCH 169/236] Strengthen haifa_sched_info callbacks and 3 scheduler hooks

2014-08-15 Thread David Malcolm

On Fri, 2014-08-15 at 16:03 -0600, Jeff Law wrote:
> On 08/06/14 11:22, David Malcolm wrote:
> > gcc/
> > * target.def (reorder): Strengthen param "ready" of this DEFHOOK
> > from rtx * to rtx_insn **.
> > (reorder2): Likewise.
> > (dependencies_evaluation_hook): Strengthen params "head", "tail"
> > from rtx to rtx_insn *.
> >
> > * doc/tm.texi: Update mechanically for above change to target.def.
> >
> > * sched-int.h (note_list): Strengthen this variable from rtx to
> > rtx_insn *.
> > (remove_notes): Likewise for both params.
> > (restore_other_notes): Likewise for return type and first param.
> > (struct ready_list): Strengthen field "vec" from rtx * to
> > rtx_insn **.
> > (struct dep_replacement): Strengthen field "insn" from rtx to
> > rtx_insn *.
> > (struct deps_desc): Likewise for fields "last_debug_insn",
> > "last_args_size".
> > (struct haifa_sched_info): Likewise for callback field
> > "can_schedule_ready_p"'s param, for first param of "new_ready"
> > callback field, for both params of "rank" callback field, for
> > first field of "print_insn" callback field (with a const), for
> > both params of "contributes_to_priority" callback, for param
> > of "insn_finishes_block_p" callback, for fields "prev_head",
> > "next_tail", "head", "tail", for first param of "add_remove_insn"
> > callback, for first param of "begin_schedule_ready" callback, for
> > both params of "begin_move_insn" callback, and for second param
> > of "advance_target_bb" callback.
> > (add_dependence): Likewise for params 1 and 2.
> > (sched_analyze): Likewise for params 2 and 3.
> > (deps_analyze_insn): Likewise for param 2.
> > (ready_element): Likewise for return type.
> > (ready_lastpos): Strengthen return type from rtx * to rtx_insn **.
> > (try_ready): Strengthen param from rtx to rtx_insn *.
> > (sched_emit_insn): Likewise for return type.
> > (record_delay_slot_pair): Likewise for params 1 and 2.
> > (add_delay_dependencies): Likewise for param.
> > (contributes_to_priority): Likewise for both params.
> > (find_modifiable_mems): Likewise.
> >
> > * config/arm/arm.c (cortexa7_sched_reorder):  Strengthen param
> > "ready" from rtx * to rtx_insn **.  Strengthen locals "insn",
> > "first_older_only_insn" from rtx to rtx_insn *.
> > (arm_sched_reorder):  Strengthen param "ready"  from rtx * to
> > rtx_insn **.
> >
> > * config/c6x/c6x.c (struct c6x_sched_context): Strengthen field
> > "last_scheduled_iter0" from rtx to rtx_insn *.
> > (init_sched_state): Replace use of NULL_RTX with NULL for insn.
> > (c6x_sched_reorder_1): Strengthen param "ready" and locals
> > "e_ready", "insnp" from rtx * to rtx_insn **.  Strengthen local
> > "insn" from rtx to rtx_insn *.
> > (c6x_sched_reorder): Strengthen param "ready" from rtx * to
> > rtx_insn **.
> > (c6x_sched_reorder2): Strengthen param "ready" and locals
> > "e_ready", "insnp" from rtx * to rtx_insn **. Strengthen local
> > "insn" from rtx to rtx_insn *.
> > (c6x_variable_issue):  Add a checked cast when assigning from insn
> > to ss.last_scheduled_iter0.
> > (split_delayed_branch): Strengthen param "insn" and local "i1"
> > from rtx to rtx_insn *.
> > (split_delayed_nonbranch): Likewise.
> > (undo_split_delayed_nonbranch): Likewise for local "insn".
> > (hwloop_optimize): Likewise for locals "seq", "insn", "prev",
> > "entry_after", "end_packet", "head_insn", "tail_insn",
> > "new_insns", "last_insn", "this_iter", "prev_stage_insn".
> > Strengthen locals "orig_vec", "copies", "insn_copies" from rtx *
> > to rtx_insn **.  Remove now-redundant checked cast on last_insn,
> > but add a checked cast on loop->start_label.  Consolidate calls to
> > avoid assigning result of gen_spkernel to "insn", now an
> > rtx_insn *.
> >
> > * config/i386/i386.c (do_reorder_for_imul): Strengthen param
> > "ready" from rtx * to rtx_insn **.  Strengthen local "insn" from
> > rtx to rtx_insn *.
> > (swap_top_of_ready_list): Strengthen param "ready" from rtx * to
> > rtx_insn **.  Strengthen locals "top", "next" from rtx to
> > rtx_insn *.
> > (ix86_sched_reorder): Strengthen param "ready" from rtx * to
> > rtx_insn **.  Strengthen local "insn" from rtx to rtx_insn *.
> > (add_parameter_dependencies): Strengthen params "call", "head" and
> > locals "insn", "last", "first_arg" from rtx to rtx_insn *.
> > (avoid_func_arg_motion): Likewise for params "first_arg", "insn".
> > (add_dependee_for_func_arg): Likewise for param "arg" and local
> > "insn".
> > (ix86_dependencies_evaluation_hook): Likewise for params "head",
> > "tail" and locals "insn", "first_arg".
> >
> > * config/ia64/ia64.c (ia64_dependencies_evaluation_hook): Likewise
> > for params "head", "tail" and locals "in

[PATCH 1/4] rs6000: Merge boolsi3 and booldi3

2014-08-15 Thread Segher Boessenkool

This adds a new output modifier "e" that prints an 's' for things like
xoris, and changes "u" to work for both xoris and xori.  With that, both
SI and DI can simply use an "n" constraint, where previously they needed
"K,L" resp. "K,J" (and it used "JF" in fact, but the F doesn't do anything
there).
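
As a rough illustration of the two modifiers described above, the sketch below mimics their logic outside of GCC. The function names, the "rD,rA" register placeholders, and the way the mnemonic is assembled are assumptions made for this example; only the 's' suffix rule and the choice of the non-zero 16-bit half come from the description. The mask 0xffff is assumed from the "low 16 bits" wording.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Sketch of the 'e' modifier: yields "s" when the low 16 bits are zero
// but some other bit is set, and an empty string otherwise.
std::string modifier_e(int64_t value) {
  uint64_t uval = static_cast<uint64_t>(value);
  return ((uval & 0xffff) == 0 && uval != 0) ? "s" : "";
}

// Sketch of the 'u' modifier: prints the low-order 16 bits if they are
// non-zero, otherwise the next 16 bits up.
std::string modifier_u(int64_t value) {
  uint64_t uval = static_cast<uint64_t>(value);
  if ((uval & 0xffff) == 0)
    uval >>= 16;
  char buf[32];
  std::snprintf(buf, sizeof buf, "0x%llx",
                static_cast<unsigned long long>(uval & 0xffff));
  return buf;
}

// Hypothetical helper: combine both modifiers into one xor-immediate line.
std::string format_xor_immediate(int64_t value) {
  return "xori" + modifier_e(value) + " rD,rA," + modifier_u(value);
}
```

With these definitions, format_xor_immediate(0x1234) gives "xori rD,rA,0x1234" and format_xor_immediate(0x12340000) gives "xoris rD,rA,0x1234", so a single template can cover both instruction forms. Constants with bits set in both halves are outside what one such instruction can express and need to be handled separately.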


2014-08-15  Segher Boessenkool  

gcc/
* config/rs6000/rs6000.c (print_operand) <'e'>: New.
<'u'>: Also support printing the low-order 16 bits.
* config/rs6000/rs6000.md (iorsi3, xorsi3, *boolsi3_internal1,
*boolsi3_internal2 and split, *boolsi3_internal3 and split): Delete.
(iordi3, xordi3, *booldi3_internal1, *booldi3_internal2 and split,
*booldi3_internal3 and split): Delete.
(ior<mode>3, xor<mode>3, *bool<mode>3, *bool<mode>3_dot,
*bool<mode>3_dot2): New.
(two anonymous define_splits for non_logical_cint_operand): Merge.

---
 gcc/config/rs6000/rs6000.c  |  30 +++-
 gcc/config/rs6000/rs6000.md | 328 
 2 files changed, 114 insertions(+), 244 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index d90afcc..f7673de 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -17996,6 +17996,19 @@ print_operand (FILE *file, rtx x, int code)
   fprintf (file, "%d", i + 1);
   return;
 
+case 'e':
+  /* If the low 16 bits are 0, but some other bit is set, write 's'.  */
+  if (! INT_P (x))
+   {
+ output_operand_lossage ("invalid %%e value");
+ return;
+   }
+
+  uval = INTVAL (x);
+  if ((uval & 0xffff) == 0 && uval != 0)
+   putc ('s', file);
+  return;
+
 case 'E':
   /* X is a CR register.  Print the number of the EQ bit of the CR */
   if (GET_CODE (x) != REG || ! CR_REGNO_P (REGNO (x)))
@@ -18298,12 +18311,19 @@ print_operand (FILE *file, rtx x, int code)
   return;
 
 case 'u':
-  /* High-order 16 bits of constant for use in unsigned operand.  */
+  /* High-order or low-order 16 bits of constant, whichever is non-zero,
+for use in unsigned operand.  */
   if (! INT_P (x))
-   output_operand_lossage ("invalid %%u value");
-  else
-   fprintf (file, HOST_WIDE_INT_PRINT_HEX,
-(INTVAL (x) >> 16) & 0xffff);
+   {
+ output_operand_lossage ("invalid %%u value");
+ return;
+   }
+
+  uval = INTVAL (x);
+  if ((uval & 0xffff) == 0)
+   uval >>= 16;
+
+  fprintf (file, HOST_WIDE_INT_PRINT_HEX, uval & 0xffff);
   return;
 
 case 'v':
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 7a99957..2e4df11 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -3159,142 +3159,144 @@ (define_insn_and_split "*andsi3_internal6"
 }"
   [(set_attr "length" "8")])
 
-(define_expand "iorsi3"
-  [(set (match_operand:SI 0 "gpc_reg_operand" "")
-   (ior:SI (match_operand:SI 1 "gpc_reg_operand" "")
-   (match_operand:SI 2 "reg_or_logical_cint_operand" "")))]
+
+(define_expand "ior<mode>3"
+  [(set (match_operand:SDI 0 "gpc_reg_operand" "")
+   (ior:SDI (match_operand:SDI 1 "gpc_reg_operand" "")
+(match_operand:SDI 2 "reg_or_cint_operand" "")))]
   ""
-  "
 {
-  if (GET_CODE (operands[2]) == CONST_INT
-  && ! logical_operand (operands[2], SImode))
+  if (<MODE>mode == DImode && !TARGET_POWERPC64)
+{
+  rs6000_split_logical (operands, IOR, false, false, false, NULL_RTX);
+  DONE;
+}
+
+  if (non_logical_cint_operand (operands[2], <MODE>mode))
 {
-  HOST_WIDE_INT value = INTVAL (operands[2]);
   rtx tmp = ((!can_create_pseudo_p ()
  || rtx_equal_p (operands[0], operands[1]))
-? operands[0] : gen_reg_rtx (SImode));
+? operands[0] : gen_reg_rtx (<MODE>mode));
+  HOST_WIDE_INT value = INTVAL (operands[2]);
 
-  emit_insn (gen_iorsi3 (tmp, operands[1],
+  emit_insn (gen_ior<mode>3 (tmp, operands[1],
 GEN_INT (value & (~ (HOST_WIDE_INT) 0xffff))));
-  emit_insn (gen_iorsi3 (operands[0], tmp, GEN_INT (value & 0xffff)));
+
+  emit_insn (gen_ior<mode>3 (operands[0], tmp, GEN_INT (value & 0xffff)));
   DONE;
 }
-}")
 
-(define_expand "xorsi3"
-  [(set (match_operand:SI 0 "gpc_reg_operand" "")
-   (xor:SI (match_operand:SI 1 "gpc_reg_operand" "")
-   (match_operand:SI 2 "reg_or_logical_cint_operand" "")))]
+  if (!reg_or_logical_cint_operand (operands[2], <MODE>mode))
+operands[2] = force_reg (<MODE>mode, operands[2]);
+})
+
+(define_expand "xor<mode>3"
+  [(set (match_operand:SDI 0 "gpc_reg_operand" "")
+   (xor:SDI (match_operand:SDI 1 "gpc_reg_operand" "")
+(match_operand:SDI 2 "reg_or_cint_operand" "")))]
   ""
-  "
 {
-  if (GET_CODE (operands[2]) == CONST_INT
-  && ! logical_operand (operands[2], SImode))
+  if (<MODE>mode == DImode && !TARGET_POWERPC64)
+{
+  rs6000_split_logical (operands, XOR, false, false, false, NULL_RTX)

[PATCH 2/4] rs6000: Merge boolcsi3 and boolcdi3

2014-08-15 Thread Segher Boessenkool

2014-08-15  Segher Boessenkool  

gcc/
* config/rs6000/rs6000.md (*boolcsi3_internal1, *boolcsi3_internal2
and split, *boolcsi3_internal3 and split): Delete.
(*boolcdi3_internal1, *boolcdi3_internal2 and split,
*boolcdi3_internal3 and split): Delete.
(*boolc<mode>3, *boolc<mode>3_dot, *boolc<mode>3_dot2): New.

---
 gcc/config/rs6000/rs6000.md | 162 +++-
 1 file changed, 41 insertions(+), 121 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 2e4df11..46f4f55 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -3298,71 +3298,59 @@ (define_split
 })
 
 
-(define_insn "*boolcsi3_internal1"
-  [(set (match_operand:SI 0 "gpc_reg_operand" "=r")
-   (match_operator:SI 3 "boolean_operator"
-[(not:SI (match_operand:SI 1 "gpc_reg_operand" "r"))
- (match_operand:SI 2 "gpc_reg_operand" "r")]))]
+(define_insn "*boolc<mode>3"
+  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
+   (match_operator:GPR 3 "boolean_operator"
+[(not:GPR (match_operand:GPR 2 "gpc_reg_operand" "r"))
+ (match_operand:GPR 1 "gpc_reg_operand" "r")]))]
   ""
-  "%q3 %0,%2,%1")
+  "%q3 %0,%1,%2"
+  [(set_attr "type" "logical")])
 
-(define_insn "*boolcsi3_internal2"
-  [(set (match_operand:CC 0 "cc_reg_operand" "=x,?y")
-   (compare:CC (match_operator:SI 4 "boolean_operator"
-[(not:SI (match_operand:SI 1 "gpc_reg_operand" "r,r"))
- (match_operand:SI 2 "gpc_reg_operand" "r,r")])
+(define_insn_and_split "*boolc<mode>3_dot"
+  [(set (match_operand:CC 4 "cc_reg_operand" "=x,?y")
+   (compare:CC (match_operator:GPR 3 "boolean_operator"
+[(not:GPR (match_operand:GPR 2 "gpc_reg_operand" "r,r"))
+ (match_operand:GPR 1 "gpc_reg_operand" "r,r")])
 (const_int 0)))
-   (clobber (match_scratch:SI 3 "=r,r"))]
-  "TARGET_32BIT"
+   (clobber (match_scratch:GPR 0 "=r,r"))]
+  "<MODE>mode == Pmode && rs6000_gen_cell_microcode"
   "@
-   %q4. %3,%2,%1
+   %q3. %0,%1,%2
#"
-  [(set_attr "type" "compare")
-   (set_attr "length" "4,8")])
-
-(define_split
-  [(set (match_operand:CC 0 "cc_reg_not_micro_cr0_operand" "")
-   (compare:CC (match_operator:SI 4 "boolean_operator"
-[(not:SI (match_operand:SI 1 "gpc_reg_operand" ""))
- (match_operand:SI 2 "gpc_reg_operand" "")])
-(const_int 0)))
-   (clobber (match_scratch:SI 3 ""))]
-  "TARGET_32BIT && reload_completed"
-  [(set (match_dup 3) (match_dup 4))
-   (set (match_dup 0)
-   (compare:CC (match_dup 3)
+  "&& reload_completed && cc_reg_not_cr0_operand (operands[4], CCmode)"
+  [(set (match_dup 0)
+   (match_dup 3))
+   (set (match_dup 4)
+   (compare:CC (match_dup 0)
(const_int 0)))]
-  "")
+  ""
+  [(set_attr "type" "logical")
+   (set_attr "dot" "yes")
+   (set_attr "length" "4,8")])
 
-(define_insn "*boolcsi3_internal3"
-  [(set (match_operand:CC 3 "cc_reg_operand" "=x,?y")
-   (compare:CC (match_operator:SI 4 "boolean_operator"
-[(not:SI (match_operand:SI 1 "gpc_reg_operand" "%r,r"))
- (match_operand:SI 2 "gpc_reg_operand" "r,r")])
+(define_insn_and_split "*boolc<mode>3_dot2"
+  [(set (match_operand:CC 4 "cc_reg_operand" "=x,?y")
+   (compare:CC (match_operator:GPR 3 "boolean_operator"
+[(not:GPR (match_operand:GPR 2 "gpc_reg_operand" "r,r"))
+ (match_operand:GPR 1 "gpc_reg_operand" "r,r")])
 (const_int 0)))
-   (set (match_operand:SI 0 "gpc_reg_operand" "=r,r")
-   (match_dup 4))]
-  "TARGET_32BIT"
+   (set (match_operand:GPR 0 "gpc_reg_operand" "=r,r")
+   (match_dup 3))]
+  "<MODE>mode == Pmode && rs6000_gen_cell_microcode"
   "@
-   %q4. %0,%2,%1
+   %q3. %0,%1,%2
#"
-  [(set_attr "type" "compare")
-   (set_attr "length" "4,8")])
-
-(define_split
-  [(set (match_operand:CC 3 "cc_reg_not_micro_cr0_operand" "")
-   (compare:CC (match_operator:SI 4 "boolean_operator"
-[(not:SI (match_operand:SI 1 "gpc_reg_operand" ""))
- (match_operand:SI 2 "gpc_reg_operand" "")])
-(const_int 0)))
-   (set (match_operand:SI 0 "gpc_reg_operand" "")
-   (match_dup 4))]
-  "TARGET_32BIT && reload_completed"
-  [(set (match_dup 0) (match_dup 4))
-   (set (match_dup 3)
+  "&& reload_completed && cc_reg_not_cr0_operand (operands[4], CCmode)"
+  [(set (match_dup 0)
+   (match_dup 3))
+   (set (match_dup 4)
(compare:CC (match_dup 0)
(const_int 0)))]
-  "")
+  ""
+  [(set_attr "type" "logical")
+   (set_attr "dot" "yes")
+   (set_attr "length" "4,8")])
 
 (define_insn "*boolccsi3_internal1"
   [(set (match_operand:SI 0 "gpc_reg_operand" "=r")
@@ -7840,74 +7828,6 @@ (define_split
   build_mask64_2_operands (operands[2], &operands[5]);
 }")
 
-(define_insn "*boolcdi3_internal1"
-  [(set (match_operand:DI 0 "gpc_reg_operand" "=r")
-   (match_operator:DI 3 "boolean_operator"
-[(not:DI (match_operand:DI 1 "gpc_reg_operand" "r"))
- (match_operand:DI 2 "gpc_re

[PATCH 0/4] rs6000: Merge most logical SI and DI patterns

2014-08-15 Thread Segher Boessenkool

All patches were bootstrapped and regression checked separately, on
powerpc64-linux -m64,-m32,-m32/-mpowerpc64; no regressions.

Is this okay to apply?


Segher


 gcc/config/rs6000/constraints.md  |3 +-
 gcc/config/rs6000/htm.md  |6 +-
 gcc/config/rs6000/predicates.md   |   23 +-
 gcc/config/rs6000/rs6000-protos.h |2 +-
 gcc/config/rs6000/rs6000.c|   83 +--
 gcc/config/rs6000/rs6000.md   | 1319 -
 gcc/config/rs6000/vector.md   |   22 +-
 7 files changed, 508 insertions(+), 950 deletions(-)

-- 
1.8.1.4
