Re: coverage, propagate visibility

2011-11-15 Thread Nathan Sidwell

On 11/15/11 07:55, Markus Trippelsdorf wrote:


Nothing changed the set of emitted functions. The error message above
was just an example. Will post the full error message in the PR (51113).
I will also try to come up with a smaller testcase.


Something's changed the linkage of _ZN2js11WeakMapBaseD0Ev though.  Where is it
being emitted?  Anyway, a small testcase would certainly be helpful.




Re: [PATCH 2/3] Predication support

2011-11-15 Thread Andrey Belevantsev

Hello,

On 26.10.2011 21:49, Alexander Monakov wrote:

2011-10-26  Alexander Monakov

* common.opt: Add -fsel-sched-predication option.
* config/ia64/ia64.c (get_mode_no_for_insn): Support conditional loads.
* rtl.h (COND_SET_SRC_PTR, COND_SET_SRC_PTR): New macros.
* sched-deps.c (conditions_mutex_with_rev_p): Rename from
conditions_mutex_p.  Adjust the caller.
(conditions_mutex_p, conditions_same_p,
conditions_same_or_mutex_p): New functions.
(sched_analyze_insn): Extract LHS/RHS for conditional assignments.
* sched-int.h (conditions_mutex_p, conditions_same_p,
conditions_same_or_mutex_p): Declare.
* sel-sched-ir.c (vinsn_equal_p): Support conditional insns.
(vinsn_equal_skip_cond_p): New function.
(init_expr): Add new argument (was_predicated).  Update all callers.
(merge_expr_data): Assert equality of merged predicates.
(av_set_add_nocopy, join_distinct_sets): Export.
(av_set_tail, av_set_concat): New functions.
(av_set_union_and_live): Add parameter to_fallthru_p, generate
predicated insns.
(setup_id_lhs_rhs): Support conditional insns.
(setup_id_cond): New function.
(init_id_from_df, deps_init_id): Use it.
(get_dest_and_mode): Support conditional insns.
* sel-sched-ir.h (enum local_trans_type): Add TRANS_PREDICATION.
(struct _expr): Add new member (was_predicated).
(EXPR_COND, EXPR_WAS_PREDICATED): New accessor macros.
(struct idata_def): Add new member (cond).
(IDATA_COND, VINSN_COND, INSN_COND): New accessor macros.
(vinsn_equal_skip_cond_p, maybe_predicate_expr_into,
av_set_add_nocopy, av_set_intersect, join_distinct_sets,
av_set_concat): Declare.
(av_set_union_and_live): Change prototype.

* sel-sched.c is lost somewhere here.




diff --git a/gcc/sel-sched-ir.c b/gcc/sel-sched-ir.c
index 1a73308..85e469a 100644
--- a/gcc/sel-sched-ir.c
+++ b/gcc/sel-sched-ir.c
@@ -1607,6 +1614,39 @@ vinsn_equal_p (vinsn_t x, vinsn_t y)
return rtx_equal_p_cb (VINSN_PATTERN (x), VINSN_PATTERN (y), repcf);
  }
  
+/* Compare two vinsns as rhses if possible and as vinsns otherwise.
+   Ignore potential differences in conditions.  */
+bool
+vinsn_equal_skip_cond_p (vinsn_t x, vinsn_t y)
+{

As we discussed offline, this functionality can be integrated into
vinsn_equal_p, so we don't need an extra predicate for that.



@@ -2221,22 +2287,41 @@ av_set_union_and_clear (av_set_t *top, av_set_t *fromp, insn_t insn)
  }

  /* Same as above, but also update availability of target register in
-   TOP judging by TO_LV_SET and FROM_LV_SET.  */
+   TOP judging by TO_LV_SET and FROM_LV_SET.  INSN is the conditional branch,
+   and TO_FALLTHRU_P indicates whether TOP is from the fallthrough arm.  */
  void
  av_set_union_and_live (av_set_t *top, av_set_t *fromp, regset to_lv_set,

from the fallthru edge



@@ -3567,6 +3685,10 @@ get_dest_and_mode (rtx insn, rtx *dst_loc, enum machine_mode *mode)
rtx pat = PATTERN (insn);

gcc_assert (dst_loc);
+
+  if (GET_CODE (pat) == COND_EXEC)
+pat = COND_EXEC_CODE (pat);
+

Do we need to conditionalize this on flag_sel_sched_predication?


diff --git a/gcc/sel-sched.c b/gcc/sel-sched.c
index 91fb0fe..f5c6f8b 100644
--- a/gcc/sel-sched.c
+++ b/gcc/sel-sched.c
@@ -2712,6 +2716,181 @@ is_ineligible_successor (insn_t insn, ilist_t p)
  return false;
  }

+/* An entry in the predication cache.  */
+struct predication_cache
+{
+  /* A predicate that would be added.  */
+  rtx cond;
+  /* Original vinsn.  */
+  vinsn_t vinsn_old;
+  /* The vinsn resulting from applying predicate COND to vinsn VINSN_OLD.  */
+  vinsn_t vinsn_new;
+};
+
+static htab_t predication_cache;

There is no comment before this variable.  Also, this section is worth a
comment describing why we need a separate cache for predicated insns.



+/* Try to construct NEW_EXPR as the result of applying predicate COND to
+   EXPR, using uid of INSN to record the transformation in the history vector.
+   Return value indicates whether transformation is valid.  */
+static bool
+predicate_expr (expr_t new_expr, expr_t expr, insn_t insn, rtx cond)
+{
+  struct predication_cache *entry;
+  vinsn_t vi = EXPR_VINSN (expr), new_vi;
+  rtx pat = PATTERN (EXPR_INSN_RTX (expr));
+
+  if (VINSN_UNIQUE_P (vi)
+  || modified_in_p (cond, EXPR_INSN_RTX (expr)))
+return false;
+
+  entry = find_in_predication_cache (cond, vi);
+
+  if (entry)
+{
+  new_vi = entry->vinsn_new;
+  if (!new_vi)
+   return false;
+  if (INSN_IN_STREAM_P (VINSN_INSN_RTX (new_vi)))
+   new_vi = vinsn_copy (new_vi, false);
+}
+  else
+{
+  entry = XNEW (struct predication_cache);
+  entry->cond = cond;
+  entry->vinsn_old = vi;
+  vinsn_attach (vi);
+
+  pat = gen_rtx_COND_EXEC (VOIDmode, copy_rtx (cond), copy_rtx (pat));
+  if (insn_invalid_p (make_ins

Re: [PATCH 3/3, RFC] Fixup COND_EXECs before reload

2011-11-15 Thread Andrey Belevantsev

Hello,

On 26.10.2011 21:58, Alexander Monakov wrote:


This RFC patch implements conversion of COND_EXEC instructions to control flow
for pre-RA selective scheduler.  Something like this is needed to employ
predication support before reload.

Each COND_EXEC is converted separately to a new basic block with the
unconditional variant of the instruction, and a conditional jump around it.

I'm not sure what would be an acceptable approach here.  I'm also not sure
about the recommended way to emit JUMPs.

This code looks good, but before checking it in we need to understand 
whether this functionality is needed by other passes (SMS?). If yes, then 
we need to move this to cfgrtl.c or similar.  If no, then at least we need 
to conditionalize this code on flag_sel_sched_predication as it is yet 
another pass over all insns.  Can it be unified with other initialization 
passes of sel-sched?
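
For readers following the thread, the conversion described in the RFC text (each predicated instruction becomes an unconditional instruction with a conditional jump around it) can be sketched on a toy instruction list. This is only an illustration: `ToyInsn` and `lower_predicated` are hypothetical names, the instructions are plain strings, and nothing here uses GCC's RTL or CFG interfaces.

```cpp
#include <string>
#include <vector>

// Hypothetical toy instruction, not GCC's RTL representation.
struct ToyInsn {
  std::string pred;  // predicate register name; empty means unconditional
  std::string op;    // the operation itself, e.g. "r1 = r2"
};

// Rewrite every predicated instruction "(pred) op" as
//   if (!pred) goto Lk
//   op
//   Lk:
// so that the operation sits in its own block guarded by a jump.
std::vector<std::string> lower_predicated(const std::vector<ToyInsn>& insns) {
  std::vector<std::string> out;
  int next_label = 0;
  for (const ToyInsn& insn : insns) {
    if (insn.pred.empty()) {
      out.push_back(insn.op);  // unconditional: copy through unchanged
      continue;
    }
    const std::string label = "L" + std::to_string(next_label++);
    out.push_back("if (!" + insn.pred + ") goto " + label);  // reversed condition
    out.push_back(insn.op);                                  // unpredicated variant
    out.push_back(label + ":");                              // start of the next block
  }
  return out;
}
```

The reversed condition in the sketch plays the role that `reversed_comparison` plays in the patch; the real code additionally has to split basic blocks and keep edges and label use counts consistent.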


Andrey






2011-10-26  Sergey Grechanik

* sel-sched.c (convert_cond_execs): New.  Use it...
(sel_global_finish): ... here.

diff --git a/gcc/sel-sched.c b/gcc/sel-sched.c
index f5c6f8b..b8f2663 100644
--- a/gcc/sel-sched.c
+++ b/gcc/sel-sched.c
@@ -45,6 +45,7 @@ along with GCC; see the file COPYING3.  If not see
  #include "rtlhooks-def.h"
  #include "output.h"
  #include "emit-rtl.h"
+#include "cfghooks.h"

  #ifdef INSN_SCHEDULING
  #include "sel-sched-ir.h"
@@ -7978,6 +7979,60 @@ sel_global_init (void)
init_hard_regs_data ();
  }

+/* Convert cond_execs to jumps before reload.  */
+static void
+convert_cond_execs (void)
+{
+  basic_block bb;
+  rtx insn;
+
+  if (reload_completed)
+return;
+
+  FOR_EACH_BB (bb)
+   /* We don't need the safe variant because we break immediately after
+  removing the current instruction.  */
+FOR_BB_INSNS (bb, insn)
+  if (INSN_P (insn) && GET_CODE (PATTERN (insn)) == COND_EXEC)
+   {
+ rtx jump;
+ rtx cond = COND_EXEC_TEST (PATTERN (insn));
+ rtx rcond = reversed_comparison (cond, GET_MODE (cond));
+ rtx unpredicated = copy_rtx (COND_EXEC_CODE (PATTERN (insn)));
+
+ /* Split bb into BB, NEW_BB, NEXT_BB (in that order).  */
+ edge e1 = split_block (bb, insn);
+ basic_block next_bb = e1->dest;
+ edge e2 = split_block (bb, insn);
+ basic_block new_bb = e2->dest;
+
+ /* Emit conditional jump at the end of bb.  */
+ rtx label = block_label (next_bb);
+
+  /* FIXME  There should be a better way.  */
+ rtx jump_pat
+  = gen_rtx_SET (GET_MODE (pc_rtx), pc_rtx,
+ gen_rtx_IF_THEN_ELSE (GET_MODE (pc_rtx),
+   rcond,
+   gen_rtx_LABEL_REF (VOIDmode,
+  label),
+   pc_rtx));
+
+ make_edge (bb, next_bb, 0);
+ jump = emit_jump_insn_after (jump_pat, BB_END (bb));
+ JUMP_LABEL (jump) = label;
+ LABEL_NUSES (label)++;
+
+ emit_insn_after_noloc (unpredicated, BB_HEAD (new_bb), new_bb);
+
+ delete_insn (insn);
+ break;
+   }
+#ifdef ENABLE_CHECKING
+  verify_flow_info ();
+#endif
+}
+
  /* Free the global data of the scheduler.  */
  static void
  sel_global_finish (void)
@@ -7998,6 +8053,8 @@ sel_global_finish (void)

free_sched_pools ();
free_dominance_info (CDI_DOMINATORS);
+
+  convert_cond_execs ();
  }

  /* Return true when we need to skip selective scheduling.  Used for debugging.  */





[PATCH, take 2] Fix PR tree-optimization/49960 ,Fix self data dependence

2011-11-15 Thread Razya Ladelsky
> > I hope it's clearer now, I will add a comment to the code, and submit it
> > before committing it.
> 
> No, it's not clearer, because it is not clear why you need to add the hack
> instead of avoiding the 2nd access function. And iff you add the hack it
> needs a comment why zero should be special (any other constant would
> be the same I suppose).
> 
> Btw, your fortran example does not compile and I don't believe the issue
> is still present after my last changes to dr_analyze_indices.  So, did
> you verify this on trunk?
> 
> Richard.

This patch fixes the failures described in 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49960
It also fixes bzips when run with autopar enabled.

In both cases the self dependences are not handled correctly.
In the first case, a non-affine access is analyzed;
in the second, the distance vector is not calculated correctly (the
distance vector considered for self dependences is always (0,0,...)).
As a result, the loops get wrongly parallelized.
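
To illustrate why a self dependence cannot always be treated as distance zero, here is a small sketch. It is not taken from the PR or its testcases; the function name `run` and the access `a[i / 2]` are assumptions chosen only to show a single reference that touches the same element in different iterations.

```cpp
#include <cstddef>
#include <vector>

// One store, a[i / 2] = i, executed for i in [0, n).  Iterations 2k and 2k+1
// write the same element, so the reference depends on itself across
// iterations.  Running the iterations in a different order (as a parallelized
// loop may do) changes the final contents of the array.
std::vector<int> run(std::size_t n, bool reverse) {
  std::vector<int> a((n + 1) / 2, -1);
  for (std::size_t k = 0; k < n; ++k) {
    const std::size_t i = reverse ? n - 1 - k : k;  // iteration order
    a[i / 2] = static_cast<int>(i);                 // non-affine subscript
  }
  return a;
}
```

With n = 8 the forward order leaves the odd values in the array and the reverse order leaves the even ones, which is the kind of difference a wrongly parallelized loop can expose.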

I modified the previous patch according to the latest changes in the trunk,
which indeed do not require special handling for the 2nd access function
(as mentioned by Richard).
Another change is that the previous version of the patch eliminated the
compute_self_dependence function, as the calls to it were redundant, while
this version takes into account the new call to compute_self_dependence from
the vect code for gather (recently added by Jakub).

ChangeLog:
PR tree-optimization/49960

* tree-data-ref.c (initialize_data_dependence_relation): Add
initializations.  Remove call to compute_self_dependence.
(compute_affine_dependence): Remove the !DDR_SELF_REFERENCE condition.
(compute_self_dependence): Remove old code.  Add call to
compute_affine_dependence.
(compute_all_dependences): Remove call to compute_self_dependence.
Add call to compute_affine_dependence.
 
testsuite/ChangeLog:
PR tree-optimization/49960

* gcc.dg/autopar/pr49960.c: New test.
* gcc.dg/autopar/pr49960-1.c: New test.







Bootstrap and testsuite pass successfully for ppc64-redhat-linux.

OK for trunk?
Thank you,
Razya



pr49960.c
Description: Binary data


pr49960-1.c
Description: Binary data
Index: gcc/tree-data-ref.c
===
--- gcc/tree-data-ref.c (revision 181168)
+++ gcc/tree-data-ref.c (working copy)
@@ -1389,13 +1389,30 @@ initialize_data_dependence_relation (struct data_r
  the data dependence tests, just initialize the ddr and return.  */
   if (operand_equal_p (DR_REF (a), DR_REF (b), 0))
 {
+ if (loop_nest
+&& !object_address_invariant_in_loop_p (VEC_index (loop_p, loop_nest, 0),
+   DR_BASE_OBJECT (a)))
+  {
+DDR_ARE_DEPENDENT (res) = chrec_dont_know;
+return res;
+  }
   DDR_AFFINE_P (res) = true;
   DDR_ARE_DEPENDENT (res) = NULL_TREE;
   DDR_SUBSCRIPTS (res) = VEC_alloc (subscript_p, heap, DR_NUM_DIMENSIONS (a));
   DDR_LOOP_NEST (res) = loop_nest;
   DDR_INNER_LOOP (res) = 0;
   DDR_SELF_REFERENCE (res) = true;
-  compute_self_dependence (res);
+  for (i = 0; i < DR_NUM_DIMENSIONS (a); i++)
+   {
+ struct subscript *subscript;
+
+ subscript = XNEW (struct subscript);
+ SUB_CONFLICTS_IN_A (subscript) = conflict_fn_not_known ();
+ SUB_CONFLICTS_IN_B (subscript) = conflict_fn_not_known ();
+ SUB_LAST_CONFLICT (subscript) = chrec_dont_know;
+ SUB_DISTANCE (subscript) = chrec_dont_know;
+ VEC_safe_push (subscript_p, heap, DDR_SUBSCRIPTS (res), subscript);
+   }
   return res;
 }
 
@@ -4040,8 +4057,7 @@ compute_affine_dependence (struct data_dependence_
 }
 
   /* Analyze only when the dependence relation is not yet known.  */
-  if (DDR_ARE_DEPENDENT (ddr) == NULL_TREE
-  && !DDR_SELF_REFERENCE (ddr))
+  if (DDR_ARE_DEPENDENT (ddr) == NULL_TREE)
 {
   dependence_stats.num_dependence_tests++;
 
@@ -4122,31 +4138,11 @@ compute_affine_dependence (struct data_dependence_
 void
 compute_self_dependence (struct data_dependence_relation *ddr)
 {
-  unsigned int i;
-  struct subscript *subscript;
-
   if (DDR_ARE_DEPENDENT (ddr) != NULL_TREE)
 return;
 
-  for (i = 0; VEC_iterate (subscript_p, DDR_SUBSCRIPTS (ddr), i, subscript);
-   i++)
-{
-  if (SUB_CONFLICTS_IN_A (subscript))
-   free_conflict_function (SUB_CONFLICTS_IN_A (subscript));
-  if (SUB_CONFLICTS_IN_B (subscript))
-   free_conflict_function (SUB_CONFLICTS_IN_B (subscript));
-
-  /* The accessed index overlaps for each iteration.  */
-  SUB_CONFLICTS_IN_A (subscript)
-   = conflict_fn (1, affine_fn_cst (integer_zero_node));
-  SUB_CONFLICTS_IN_B (subscript)
-   = conflict_fn (1, affine_fn_cst (integer_zero_node));
-  SUB_LAST_CONFLICT (subscript) = chrec_dont_know;
-}
-
- 

[v3] libstdc++/51133

2011-11-15 Thread Paolo Carlini

Hi,

tested x86_64-linux multilib, committed mainline and 4_6-branch.

Thanks,
Paolo.

///
2011-11-15  Jason Dick  

PR libstdc++/51133
* include/tr1/poly_hermite.tcc (__poly_hermite_recursion): Fix
wrong sign in recursion relation.
Index: include/tr1/poly_hermite.tcc
===
--- include/tr1/poly_hermite.tcc (revision 181359)
+++ include/tr1/poly_hermite.tcc (working copy)
@@ -1,6 +1,6 @@
 // Special functions -*- C++ -*-
 
-// Copyright (C) 2006, 2007, 2008, 2009, 2010
+// Copyright (C) 2006, 2007, 2008, 2009, 2010, 2011
 // Free Software Foundation, Inc.
 //
 // This file is part of the GNU ISO C++ Library.  This library is free
@@ -84,7 +84,7 @@
   unsigned int __i;
   for  (__H_nm2 = __H_0, __H_nm1 = __H_1, __i = 2; __i <= __n; ++__i)
 {
-  __H_n = 2 * (__x * __H_nm1 + (__i - 1) * __H_nm2);
+  __H_n = 2 * (__x * __H_nm1 - (__i - 1) * __H_nm2);
   __H_nm2 = __H_nm1;
   __H_nm1 = __H_n;
 }
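
For reference, the corrected line matches the standard recurrence for the (physicists') Hermite polynomials, H_n(x) = 2x H_{n-1}(x) - 2(n-1) H_{n-2}(x), with H_0 = 1 and H_1 = 2x. A minimal standalone sketch follows; the function name `hermite` and the use of plain `double` are assumptions for illustration and this is not the libstdc++ implementation.

```cpp
// Evaluate H_n(x) by the three-term recurrence
//   H_n = 2 * (x * H_{n-1} - (n - 1) * H_{n-2}),  H_0 = 1,  H_1 = 2x.
double hermite(unsigned n, double x) {
  double h_nm2 = 1.0;        // H_0
  if (n == 0)
    return h_nm2;
  double h_nm1 = 2.0 * x;    // H_1
  for (unsigned i = 2; i <= n; ++i) {
    const double h_n = 2.0 * (x * h_nm1 - (i - 1) * h_nm2);  // note the minus sign
    h_nm2 = h_nm1;
    h_nm1 = h_n;
  }
  return h_nm1;
}
```

A quick sanity check is H_2(x) = 4x^2 - 2, which is negative at x = 0; with the old plus sign the value at x = 0 would come out positive.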


Re: GCC 4.7.0 Status Report (2011-10-27), Stage 1 will end Nov 7th

2011-11-15 Thread Michael Zolotukhin
Hello!

The x86-specific part of this patch was committed to the trunk recently.
There is also a target-independent part, which covers memset/memcpy for
the smallest sizes (from 1 to ~256 bytes). In contrast to the existing
implementation, it has a cost model to choose the fastest move-mode
(which could be a vector move-mode). This helps to increase the
performance on small sizes - these cases are especially important,
because libcalls can't be used efficiently here due to call overheads.

Could any of the middle-end maintainers review it once I have updated it
to the latest changes?

Thanks!

On 27 October 2011 17:24, Uros Bizjak  wrote:
> Hello!
>
>> The GCC trunk is still in stage1.  Stage1 will last until
>> Nov 7th (including, use your timezone to your advantage) after
>> which we will have been in stage1 for nearly 8 months.
>> In stage3 the trunk will be open for general bugfixing, no
>> new features will be accepted.
>
> There is a patch that implements usage of vector instructions in
> memmov/memset expanding [1]. The patch was not reviewed for quite some
> time, but IIRC, we said that patches that were submitted before Stage
> 1 closes are still eligible for later stages (after a review of
> course).
>
> I think that this feature certainly improves gcc (also taking into
> account recent glibc changes in this area), and IMO implements an
> important feature for recent processors. I would like to motivate
> middle-end and target maintainers to consider the patch for a review
> before stage 1 closes, and ultimately ask Release Managers to decide
> how to proceed with this patch.
>
> [1] http://gcc.gnu.org/ml/gcc-patches/2011-10/msg02392.html
>
> Thanks,
> Uros.
>



-- 
---
Best regards,
Michael V. Zolotukhin,
Software Engineer
Intel Corporation.


[Patch] Fix compilation of libgcc/config/alpha/qrnnd.S on VMS

2011-11-15 Thread Tristan Gingold
Hi,

latest versions of gas are picky about the use of the pseudos.  Consequently we 
need to adjust them in qrnnd.S.

Tested by building gcc for alpha-vms.

Ok for the trunk ?

Tristan.

2011-11-07  Tristan Gingold  

* config/alpha/qrnnd.S: Use specific pseudos for VMS.

diff --git a/libgcc/config/alpha/qrnnd.S b/libgcc/config/alpha/qrnnd.S
index 51b13bc..794cf65 100644
--- a/libgcc/config/alpha/qrnnd.S
+++ b/libgcc/config/alpha/qrnnd.S
@@ -33,9 +33,15 @@
 
.globl __udiv_qrnnd
.ent __udiv_qrnnd
+#ifdef __VMS__
+__udiv_qrnnd..en:
+   .frame $29,0,$26,0
+   .prologue
+#else
 __udiv_qrnnd:
.frame $30,0,$26,0
.prologue 0
+#endif
 
 #define cnt $2
 #define tmp $3
@@ -160,4 +166,10 @@ $Odd:
bis $31,n0,$0
ret $31,($26),1
 
+#ifdef __VMS__
+   .link
+   .align 3
+__udiv_qrnnd:
+   .pdesc  __udiv_qrnnd..en,null
+#endif
.end __udiv_qrnnd



[C++11] Streamline user-defined literal error messages and some code reformatting.

2011-11-15 Thread Ed Smith-Rowland

I just wanted to do a little clean up on the user-defined literal code.
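
As background for the "Possible missing length argument" hint in the new diagnostics: a literal operator applied to a string literal is called with both the pointer and the length, while the single `const char*` (raw) form is only considered for numeric literals. A small sketch, where the suffixes `_len` and `_digits` are made-up names for illustration and are not part of the patch or its tests:

```cpp
#include <cstddef>

// String literal operator: takes the pointer and the length.
constexpr std::size_t operator""_len(const char*, std::size_t n) {
  return n;
}

// Raw literal operator: receives the spelling of a numeric literal as a
// NUL-terminated string.  It is not a candidate for string literals.
unsigned long long operator""_digits(const char* s) {
  unsigned long long count = 0;
  while (*s++)
    ++count;
  return count;
}
```

With these definitions, `"Boo!"_len` yields 4 and `12345_digits` yields 5, whereas `"Boo!"_digits` is ill-formed, which is the situation the udlit-raw-op-string-neg.C test exercises.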

Index: gcc/testsuite/g++.dg/cpp0x/udlit-raw-op-string-neg.C
===
--- gcc/testsuite/g++.dg/cpp0x/udlit-raw-op-string-neg.C (revision 181376)
+++ gcc/testsuite/g++.dg/cpp0x/udlit-raw-op-string-neg.C (working copy)
@@ -5,4 +5,4 @@
 int operator"" _embedraw(const char*)
 { return 41; };
 
-int k = "Boo!"_embedraw;  //  { dg-error "unable to find valid user-defined string literal operator" }
+int k = "Boo!"_embedraw;  //  { dg-error "unable to find valid string literal operator|Possible missing length argument" }
Index: gcc/testsuite/g++.dg/cpp0x/udlit-member-neg.C
===
--- gcc/testsuite/g++.dg/cpp0x/udlit-member-neg.C   (revision 181376)
+++ gcc/testsuite/g++.dg/cpp0x/udlit-member-neg.C   (working copy)
@@ -8,7 +8,7 @@
 };
 
 int i = operator"" _Bar(U'x');  // { dg-error "was not declared in this scope" }
-int j = U'x'_Bar;  // { dg-error "unable to find user-defined character literal operator" }
+int j = U'x'_Bar;  // { dg-error "unable to find character literal operator" }
 
 int
 Foo::operator"" _Bar(char32_t)  // { dg-error "must be a non-member function" }
Index: gcc/testsuite/g++.dg/cpp0x/udlit-declare-neg.C
===
--- gcc/testsuite/g++.dg/cpp0x/udlit-declare-neg.C  (revision 181376)
+++ gcc/testsuite/g++.dg/cpp0x/udlit-declare-neg.C  (working copy)
@@ -3,13 +3,13 @@
 //  Check that undeclared literal operator calls and literals give appropriate errors.
 
 int i = operator"" _Bar('x');  // { dg-error "was not declared in this scope" }
-int j = 'x'_Bar;  // { dg-error "unable to find user-defined character literal operator" }
+int j = 'x'_Bar;  // { dg-error "unable to find character literal operator" }
 
 int ii = operator"" _BarCharStr("Howdy, Pardner!");  // { dg-error "was not declared in this scope" }
-int jj = "Howdy, Pardner!"_BarCharStr;  // { dg-error "unable to find user-defined string literal operator" }
+int jj = "Howdy, Pardner!"_BarCharStr;  // { dg-error "unable to find string literal operator" }
 
 unsigned long long iULL = operator"" _BarULL(666ULL);  // { dg-error "was not declared in this scope" }
-unsigned long long jULL = 666_BarULL;  // { dg-error "unable to find user-defined numeric literal operator" }
+unsigned long long jULL = 666_BarULL;  // { dg-error "unable to find numeric literal operator" }
 
 long double iLD = operator"" _BarLD(666.0L);  // { dg-error "was not declared in this scope" }
-long double jLD = 666.0_BarLD;  // { dg-error "unable to find user-defined numeric literal operator" }
+long double jLD = 666.0_BarLD;  // { dg-error "unable to find numeric literal operator" }
Index: gcc/cp/parser.c
===
--- gcc/cp/parser.c (revision 181376)
+++ gcc/cp/parser.c (working copy)
@@ -3553,34 +3553,32 @@
 static tree
 cp_parser_userdef_char_literal (cp_parser *parser)
 {
-  cp_token *token = NULL;
-  tree literal, suffix_id, value;
-  tree name, decl;
-  tree result;
-  VEC(tree,gc) *vec;
+  cp_token *token = cp_lexer_consume_token (parser->lexer);
+  tree literal = token->u.value;
+  tree suffix_id = USERDEF_LITERAL_SUFFIX_ID (literal);
+  tree value = USERDEF_LITERAL_VALUE (literal);
+  tree name = cp_literal_operator_id (IDENTIFIER_POINTER (suffix_id));
+  tree decl, result;
 
-  token = cp_lexer_consume_token (parser->lexer);
-  literal = token->u.value;
-  suffix_id = USERDEF_LITERAL_SUFFIX_ID (literal);
-  value = USERDEF_LITERAL_VALUE (literal);
-  name = cp_literal_operator_id (IDENTIFIER_POINTER (suffix_id));
-
   /* Build up a call to the user-defined operator  */
   /* Lookup the name we got back from the id-expression.  */
-  vec = make_tree_vector ();
-  VEC_safe_push (tree, gc, vec, value);
-  decl = lookup_function_nonclass (name, vec, /*block_p=*/false);
+  VEC(tree,gc) *args = make_tree_vector ();
+  VEC_safe_push (tree, gc, args, value);
+  decl = lookup_function_nonclass (name, args, /*block_p=*/false);
   if (!decl || decl == error_mark_node)
 {
-  error ("unable to find user-defined character literal operator %qD",
-name);
-  release_tree_vector (vec);
+  error ("unable to find character literal operator %qD", name);
+  release_tree_vector (args);
   return error_mark_node;
 }
-  result = finish_call_expr (decl, &vec, false, true, tf_warning_or_error);
-  release_tree_vector (vec);
+  result = finish_call_expr (decl, &args, false, true, tf_warning_or_error);
+  release_tree_vector (args);
+  if (result != error_mark_node)
+return result;
 
-  return result;
+  error ("unable to find character literal operator %qD with %qT argument",
+name, TREE_TYPE (value));
+  return error_mark_node;
 }
 
 /* A subroutine of cp_parser_userdef_numeric_

Re: [wwwdocs] gcc-4.6/porting_to.html

2011-11-15 Thread Richard Sandiford
[Sorry for the delay, catching up after being away]

Gerald Pfeifer  writes:
> On Mon, 10 Oct 2011, Gerald Pfeifer wrote:
>> I realized this one hasn't made it in, but is really nice.  I made a 
>> number of minor edits (typos, markup, simplifying headings,... among 
>> others).  What do you think -- should we include this?
>
> Checking mailing list archives I realized that Jakub had provided
> feedback ( http://gcc.gnu.org/ml/gcc-patches/2011-03/msg00987.html )
> that the strict overflow warnings had been fixed.
>
> Hence I went ahead and committed the removal below.

Thanks for doing this.  I noticed a typo while reading it, so I committed
the patch below as obvious.

Richard


Index: porting_to.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.6/porting_to.html,v
retrieving revision 1.4
diff -u -r1.4 porting_to.html
--- porting_to.html 24 Oct 2011 00:57:54 -  1.4
+++ porting_to.html 15 Nov 2011 13:38:10 -
@@ -13,7 +13,7 @@
 http://gcc.gnu.org/gcc-4.6/changes.html";>changes. Some of
 these are a result of bug fixing, and some old behaviors have been
 intentionally changed in order to support new standards, or relaxed
-instandards-conforming ways to facilitate compilation or runtime
+in standards-conforming ways to facilitate compilation or runtime
 performance.  Some of these changes are not visible to the naked eye
 and will not cause problems when updating from older versions.
 


Re: Memset/memcpy patch

2011-11-15 Thread Michael Zolotukhin
> Looks like we have a bootstrap issue, thus sorry if my message may appear 
> stupid nitpicking: why Zolotukhin Michael instead of Michael Zolotukhin in 
> the ChangeLog? Is Michael the family name?
Michael is the first name, Zolotukhin - last name. I probably swapped
them accidentally in the changelog.

Michael


Re: [PATCH, debug] Emit basic block markers in .debug_line section

2011-11-15 Thread Tom Tromey
> "Roberto" == Roberto Agostino Vitillo  writes:

Roberto> With this patch DW_LNS_set_basic_block opcodes are emitted in
Roberto> the .debug_line section marking the instructions that indicate
Roberto> the beginning of a basic block as specified by the dwarf
Roberto> standards 2,3 and 4.

I'm curious to know what use you have for this.

Tom


Re: Memset/memcpy patch

2011-11-15 Thread Paolo Carlini

On 11/15/2011 04:12 PM, Michael Zolotukhin wrote:

>> Looks like we have a bootstrap issue, thus sorry if my message may appear
>> stupid nitpicking: why Zolotukhin Michael instead of Michael Zolotukhin in the
>> ChangeLog? Is Michael the family name?
>
> Michael is the first name, Zolotukhin - last name. I probably swapped
> them accidentally in the changelog.

Ah, ok, thanks. Many years ago I learned this "funny" (from my parochial 
Italian point of view, sorry) story:


http://en.wikipedia.org/wiki/Bui_Tuong_Phong

and I'm still quite sensitive to the issue.

Paolo.


[RFA/ARM] Make libgcc use UDIV/SDIV instructions when they are available.

2011-11-15 Thread Matthew Gretton-Dann

All,

The attached patch causes libgcc to use the UDIV and SDIV instructions 
when possible in the implementation of the ARM div/mod functions in libgcc.


This will benefit Cortex-M3, Cortex-M4, all Cortex-R* CPUs, Cortex-A7, 
and Cortex-A15.


The special case of some Cortex-R* CPUs where the UDIV/SDIV instructions 
are only available in Thumb mode, making it beneficial to force these 
library functions into Thumb mode to make use of those instructions, is 
not handled.


This was tested by configuring GCC --with-cpu cortex-a15, and then 
running the testsuite with -mcpu=cortex-a9.  I've also manually 
inspected libgcc to make sure the functions are being built as expected.
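
As a reading aid for the new code paths, the UDIV/SDIV plus MLS sequences compute the quotient with one divide and then recover the remainder as dividend minus quotient times divisor. A small C++ model of that arithmetic follows; the struct and function names are hypothetical, and the division-by-zero branch to Ldiv0 is not modelled (the divisor is assumed to be non-zero).

```cpp
#include <cstdint>

struct UDivMod { std::uint32_t quot; std::uint32_t rem; };
struct IDivMod { std::int32_t quot; std::int32_t rem; };

// Models: mov r2, r0 ; udiv r0, r0, r1 ; mls r1, r0, r1, r2
// Precondition (assumed, not checked): d != 0.
UDivMod uidivmod_model(std::uint32_t n, std::uint32_t d) {
  const std::uint32_t q = n / d;      // udiv
  const std::uint32_t r = n - q * d;  // mls: r = n - q * d
  return {q, r};
}

// Models: mov r2, r0 ; sdiv r0, r0, r1 ; mls r1, r0, r1, r2
// Precondition (assumed, not checked): d != 0, and not INT32_MIN / -1,
// which is undefined behaviour in C++ although SDIV itself defines a result.
IDivMod idivmod_model(std::int32_t n, std::int32_t d) {
  const std::int32_t q = n / d;       // sdiv truncates toward zero, like C++ '/'
  const std::int32_t r = n - q * d;   // remainder takes the sign of the dividend
  return {q, r};
}
```

Since MLS computes Ra - Rn * Rm, the remainder needs no second division, which is why the modulo routines in the patch are only a few instructions long.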


Please can someone review?

Thanks,

Matt

libgcc/ChangeLog:

2011-11-15  Matthew Gretton-Dann  

* config/arm/lib1funcs.asm (udivsi3): Add support for divide
functions.
(aeabi_uidivmod): Likewise. 
(umodsi3): Likewise.
(divsi3): Likewise.
(aeabi_idivmod): Likewise.
(modsi3): Likewise.

Thanks,

Matt

--
Matthew Gretton-Dann
Principal Engineer, PD Software - Tools, ARM Ltd

diff --git a/libgcc/config/arm/lib1funcs.S b/libgcc/config/arm/lib1funcs.S
index 2e76c01..094d79a 100644
--- a/libgcc/config/arm/lib1funcs.S
+++ b/libgcc/config/arm/lib1funcs.S
@@ -951,6 +951,17 @@ LSYM(udivsi3_skip_div0_test):
pop { work }
RET
 
+#elif defined(__ARM_ARCH_EXT_IDIV__)
+
+   ARM_FUNC_START udivsi3
+   ARM_FUNC_ALIAS aeabi_uidiv udivsi3
+
+   cmp r1, #0
+   beq LSYM(Ldiv0)
+
+   udiv r0, r0, r1
+   RET
+
 #else /* ARM version/Thumb-2.  */
 
ARM_FUNC_START udivsi3
@@ -997,6 +1008,14 @@ FUNC_START aeabi_uidivmod
mul r2, r0
sub r1, r1, r2
bx  r3
+#elif defined(__ARM_ARCH_EXT_IDIV__)
+ARM_FUNC_START aeabi_uidivmod
+   cmp r1, #0
+   beq LSYM(Ldiv0)
+   mov r2, r0 
+   udiv r0, r0, r1
+   mls r1, r0, r1, r2
+   RET
 #else
 ARM_FUNC_START aeabi_uidivmod
cmp r1, #0
@@ -1014,9 +1033,19 @@ ARM_FUNC_START aeabi_uidivmod
 /*  */
 #ifdef L_umodsi3
 
-   FUNC_START umodsi3
+#ifdef __ARM_ARCH_EXT_IDIV__
 
-#ifdef __thumb__
+   ARM_FUNC_START umodsi3
+
+   cmp r1, #0
+   beq LSYM(Ldiv0)
+   udiv r2, r0, r1
+   mls r0, r1, r2, r0
+   RET
+
+#elif defined(__thumb__)
+
+   FUNC_START umodsi3
 
cmp divisor, #0
beq LSYM(Ldiv0)
@@ -1035,6 +1064,8 @@ LSYM(Lover10):

 #else  /* ARM version.  */

+   FUNC_START umodsi3
+
subs r2, r1, #1  @ compare divisor with 1
bcc LSYM(Ldiv0)
cmpne   r0, r1  @ compare dividend with divisor
@@ -1091,6 +1122,16 @@ LSYM(Lover12):
pop { work }
RET
 
+#elif defined(__ARM_ARCH_EXT_IDIV__)
+
+   ARM_FUNC_START divsi3
+   ARM_FUNC_ALIAS aeabi_idiv divsi3
+
+   cmp r1, #0
+   beq LSYM(Ldiv0)
+   sdiv r0, r0, r1
+   RET
+
 #else /* ARM/Thumb-2 version.  */

ARM_FUNC_START divsi3   
@@ -1153,6 +1194,14 @@ FUNC_START aeabi_idivmod
mul r2, r0
sub r1, r1, r2
bx  r3
+#elif defined(__ARM_ARCH_EXT_IDIV__)
+ARM_FUNC_START aeabi_idivmod
+   cmp r1, #0
+   beq LSYM(Ldiv0)
+   mov r2, r0
+   sdiv r0, r0, r1
+   mls r1, r0, r1, r2
+   RET
 #else
 ARM_FUNC_START aeabi_idivmod
cmp r1, #0
@@ -1170,9 +1219,20 @@ ARM_FUNC_START aeabi_idivmod
 /*  */
 #ifdef L_modsi3
 
-   FUNC_START modsi3
+#if defined(__ARM_ARCH_EXT_IDIV__)
 
-#ifdef __thumb__
+   ARM_FUNC_START modsi3
+
+   cmp r1, #0
+   beq LSYM(Ldiv0)
+
+   sdiv r2, r0, r1
+   mls r0, r1, r2, r0
+   RET
+
+#elif defined(__thumb__)
+
+   FUNC_START modsi3
 
mov curbit, #1
cmp divisor, #0
@@ -1204,6 +1264,8 @@ LSYM(Lover12):
 
 #else /* ARM version.  */

+   FUNC_START modsi3
+
cmp r1, #0
beq LSYM(Ldiv0)
rsbmi   r1, r1, #0  @ loops below use unsigned.

Re: PATCH for to use tree clobbers for c++/51060 (temporary re-use)

2011-11-15 Thread Benjamin Kosnik

> Now that we have a way of explicitly marking a variable as dead, we
> can use that to indicate the end of a temporary's lifetime by adding
> it as a cleanup for that temporary.  Since gimple_push_cleanup still
> deals in trees I needed to tweak a couple of places to avoid trying
> to treat a clobber as a real CONSTRUCTOR, but the changes were small.
> 
> One somewhat surprising thing that showed up as a result of this
> change were some failures in the libstdc++ testsuite.  When I
> investigated, I found that the tests were relying on temporaries
> living longer than they should:
> 
> >  const int& x = std::max(1, 2);
> 
> Since std::max takes its arguments by const reference and returns one
> of them, x ends up as a dangling reference to the temporary
> containing 2, which dies at the end of the declaration-statement.
> This patch breaks the testcase because now the temporary stack slots
> get reused by the next statement, so by the time we look at x it no
> longer points to a 2. I've fixed the tests by removing the references
> on the variables.

Interesting. Thanks for the heads up. This should definitely be
mentioned in 4.7/porting_to.html.
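
A short sketch of the pattern for anyone hitting this: `std::max` returns a reference to one of its arguments, so binding the result to a reference keeps a pointer to a temporary that is gone after the full expression. Copying the result is the simple fix, in line with the change made to the tests. The function name `max_by_value` is made up for the example.

```cpp
#include <algorithm>

// Problematic form (left as a comment on purpose, reading x afterwards is
// undefined behaviour because the temporaries holding 1 and 2 are dead):
//   const int& x = std::max(1, 2);
//
// Safe form: copy the value out before the temporaries are destroyed.
int max_by_value() {
  const int x = std::max(1, 2);  // x is an object, not a reference
  return x;
}
```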

-benjamin


Re: [Patch] Fix compilation of libgcc/config/alpha/qrnnd.S on VMS

2011-11-15 Thread Richard Henderson
On 11/15/2011 01:58 AM, Tristan Gingold wrote:
> * config/alpha/qrnnd.S: Use specific pseudos for VMS.

Fine by me.  I know nothing about vms.


Re: Minor contrib.texi update

2011-11-15 Thread Joseph S. Myers
The attachment doesn't seem to match the rest of your message.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Minor contrib.texi update

2011-11-15 Thread Jeff Law

On 11/15/11 10:19, Joseph S. Myers wrote:
> The attachment doesn't seem to match the rest of your message.
> 
Attached wrong file as Andrey pointed out earlier...  Here's the right
one...  Oh how I wish thunderbird would show the attachment inline...

jeff


Index: contrib.texi
===
*** contrib.texi(revision 180836)
--- contrib.texi(working copy)
*** improved alias analysis, plus migrating
*** 66,71 
--- 66,75 
  Geoff Berry for his Java object serialization work and various patches.
  
  @item
+ David Binderman tests weekly snapshots of GCC trunk against the Fedora rawhide
+ archive for several architectures.
+ 
+ @item
  Uros Bizjak for the implementation of x87 math built-in functions and
  for various middle end and i386 back end improvements and bug fixes.
  


[PING] PR50325: store_bit_field: Fix for big endian targets

2011-11-15 Thread Andreas Krebbel
This fixes many C++ tests on s390x and PPC64:
http://gcc.gnu.org/ml/gcc-patches/2011-09/msg01220.html

Bye,

-Andreas-


Re: [PATCH, take 2] Fix PR tree-optimization/49960 ,Fix self data dependence

2011-11-15 Thread Richard Guenther
On Tue, Nov 15, 2011 at 11:31 AM, Razya Ladelsky  wrote:
>> > I hope it's clearer now, I will add a comment to the code, and submit it
>> > before committing it.
>>
>> No, it's not clearer, because it is not clear why you need to add the hack
>> instead of avoiding the 2nd access function. And iff you add the hack it
>> needs a comment why zero should be special (any other constant would
>> be the same I suppose).
>>
>> Btw, your fortran example does not compile and I don't believe the issue
>> is still present after my last changes to dr_analyze_indices.  So, did
>> you verify this on trunk?
>>
>> Richard.
>
> This patch fixes the failures described in
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49960
> It also fixes bzips when run with autopar enabled.
>
> In both cases the self dependences are not handled correctly.
> In the first case, a non-affine access is analyzed;
> in the second, the distance vector is not calculated correctly (the
> distance vector considered for self dependences is always (0,0,...)).
> As a result, the loops get wrongfully parallelized.
>
> I modified the previous patch according to the last changes in the trunk,
> which indeed do not require special handling for the 2nd access function
> (as mentioned by Richard).
> Another change is that the previous version of the patch eliminated
> compute_self_dependences function
> as the calls to it were redundant, while this version considers the new
> call to compute_self_dependences from the vect code for gather (inserted
> lately by Jakub).
> ChangeLog:
>        PR tree-optimization/49960
>
>        * tree-data-ref.c (initialize_data_dependence_relation): Add
>        initializations.
>        Remove call to compute_self_dependence.
>        (compute_affine_dependence): Remove the !DDR_SELF_REFERENCE
>        condition.
>        (compute_self_dependence): Remove old code. Add call to
>        compute_affine_dependence.
>        (compute_all_dependences): Remove call to compute_self_dependence.
>        Add call to compute_affine_dependence.
>
>        testsuite/ChangeLog:
>        PR tree-optimization/49960
>
>        * gcc.dg/autopar/pr49960.c: New test.
>        * gcc.dg/autopar/pr49960-1.c: New test.
>
>
>
>
>
>
>
> Bootstrap and testsuite pass successfully for ppc64-redhat-linux.
>
> OK for trunk?

Ok.

Thanks,
Richard.

> Thank you,
> Razya
>
>


Re: CFG review needed for fix of "PowerPC shrink-wrap support 3 of 3"

2011-11-15 Thread Richard Henderson
On 11/14/2011 11:56 AM, Alan Modra wrote:
>   * function.c (thread_prologue_and_epilogue_insns): Guard
>   emitting return with single_succ_p test.

Ok.


r~


Re: [PING] PR50325: store_bit_field: Fix for big endian targets

2011-11-15 Thread Richard Henderson
On 11/15/2011 07:22 AM, Andreas Krebbel wrote:
> This fixes many C++ tests on s390x and PPC64:
> http://gcc.gnu.org/ml/gcc-patches/2011-09/msg01220.html

Ok.


r~


[wwwdocs] add news items for TM work

2011-11-15 Thread Aldy Hernandez

Is this OK?

Index: index.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/index.html,v
retrieving revision 1.824
diff -c -p -r1.824 index.html
*** index.html  15 Nov 2011 06:01:24 -  1.824
--- index.html  15 Nov 2011 18:16:43 -
*** mission statement.
*** 53,58 
--- 53,70 

  

+ Transactional memory support
+ [2011-11-15]
+ An implementation of the
+ ongoing href="http://gcc.gnu.org/wiki/TransactionalMemory";>transactional

+ memory standard has been added.  Code was contributed by Richard
+ Henderson, Aldy Hernandez, and Torvald Riegel, all of Red Hat, Inc.
+ The project was partially funded by
+ the http://www.velox-project.eu/";>Velox project.  This
+ feature is experimental and is available for C and C++ on selected
+ platforms.
+ 
+
  POWER7 on the GCC Compile Farm
  [2011-11-10]
  IBM has donated a 64 processor POWER7 machine (3.55 GHz, 64 GB RAM)
Index: gcc-4.7/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.7/changes.html,v
retrieving revision 1.56
diff -c -p -r1.56 changes.html
*** gcc-4.7/changes.html8 Nov 2011 11:49:53 -   1.56
--- gcc-4.7/changes.html15 Nov 2011 18:16:43 -
*** void foo (char *a, const char *b, const
*** 195,200 
--- 195,221 
through which the compiler can be hinted about pointer alignment
and can use it to improve generated code.

+
+   Experimental support for transactional memory has been added.
+   It includes support for the compiler, as well as a supporting
+   runtime library called libitm.  To compile code
+   with transactional memory constructs, use
+   the -fgnu-tm option.
+
+   
+   Support is currently available for the x86 and Alpha platforms.
+   
+
+   
+   This work was contributed by Red Hat and was partly funded by
+   the http://www.velox-project.eu/";>Velox project.
+   
+
+   
+   For more details on transactional memory
+   see http://gcc.gnu.org/wiki/TransactionalMemory";>here.
+   
+   
  

  C++



Re: [wwwdocs] add news items for TM work

2011-11-15 Thread Aldy Hernandez

On 11/15/11 12:18, Aldy Hernandez wrote:

Is this OK?



BTW, I have updated the wiki here for more information:

http://gcc.gnu.org/wiki/TransactionalMemory

And included a relevant link from the news items.


Re: [PATCH, debug] Emit basic block markers in .debug_line section

2011-11-15 Thread Roberto Agostino Vitillo
I considered using it as a starting point to build the control-flow graph of 
a function in order to display it in a custom profiler we use internally, since
I could assume that I had the debugging information and I had to read the 
.debug_line section anyway to get the source lines.
I ended up not using it, and it's clear to me that its practical uses may be 
questionable ("fast" single stepping in a debugger?), but I thought it might 
be worth sharing.

r

On Nov 15, 2011, at 7:17 AM, Tom Tromey wrote:

>> "Roberto" == Roberto Agostino Vitillo  writes:
> 
> Roberto> With this patch DW_LNS_set_basic_block opcodes are emitted in
> Roberto> the .debug_line section marking the instructions that indicate
> Roberto> the beginning of a basic block as specified by the dwarf
> Roberto> standards 2,3 and 4.
> 
> I'm curious to know what use you have for this.
> 
> Tom



[PATCH 1/2] ia64: Use define_c_enum for unspec constants.

2011-11-15 Thread Richard Henderson
---
 gcc/config/ia64/ia64.md |  104 +++---
 1 files changed, 52 insertions(+), 52 deletions(-)

diff --git a/gcc/config/ia64/ia64.md b/gcc/config/ia64/ia64.md
index 46eebc2..df744e7 100644
--- a/gcc/config/ia64/ia64.md
+++ b/gcc/config/ia64/ia64.md
@@ -48,61 +48,61 @@
 
 ;; ??? Need a better way to describe alternate fp status registers.
 
-(define_constants
+(define_c_enum "unspec"
   [; Relocations
-   (UNSPEC_LTOFF_DTPMOD    0)
-   (UNSPEC_LTOFF_DTPREL    1)
-   (UNSPEC_DTPREL          2)
-   (UNSPEC_LTOFF_TPREL     3)
-   (UNSPEC_TPREL           4)
-   (UNSPEC_DTPMOD          5)
-
-   (UNSPEC_LD_BASE         9)
-   (UNSPEC_GR_SPILL        10)
-   (UNSPEC_GR_RESTORE      11)
-   (UNSPEC_FR_SPILL        12)
-   (UNSPEC_FR_RESTORE      13)
-   (UNSPEC_FR_RECIP_APPROX 14)
-   (UNSPEC_PRED_REL_MUTEX  15)
-   (UNSPEC_GETF_EXP        16)
-   (UNSPEC_PIC_CALL        17)
-   (UNSPEC_MF              18)
-   (UNSPEC_CMPXCHG_ACQ     19)
-   (UNSPEC_FETCHADD_ACQ    20)
-   (UNSPEC_BSP_VALUE       21)
-   (UNSPEC_FLUSHRS         22)
-   (UNSPEC_BUNDLE_SELECTOR 23)
-   (UNSPEC_ADDP4           24)
-   (UNSPEC_PROLOGUE_USE    25)
-   (UNSPEC_RET_ADDR        26)
-   (UNSPEC_SETF_EXP        27)
-   (UNSPEC_FR_SQRT_RECIP_APPROX 28)
-   (UNSPEC_SHRP            29)
-   (UNSPEC_COPYSIGN        30)
-   (UNSPEC_VECT_EXTR       31)
-   (UNSPEC_LDA             40)
-   (UNSPEC_LDS             41)
-   (UNSPEC_LDS_A           42)
-   (UNSPEC_LDSA            43)
-   (UNSPEC_LDCCLR          44)
-   (UNSPEC_LDCNC           45)
-   (UNSPEC_CHKACLR         46)
-   (UNSPEC_CHKANC          47)
-   (UNSPEC_CHKS            48)
-   (UNSPEC_FR_RECIP_APPROX_RES  49)
-   (UNSPEC_FR_SQRT_RECIP_APPROX_RES 50)
+   UNSPEC_LTOFF_DTPMOD
+   UNSPEC_LTOFF_DTPREL
+   UNSPEC_DTPREL
+   UNSPEC_LTOFF_TPREL
+   UNSPEC_TPREL
+   UNSPEC_DTPMOD
+
+   UNSPEC_LD_BASE
+   UNSPEC_GR_SPILL
+   UNSPEC_GR_RESTORE
+   UNSPEC_FR_SPILL
+   UNSPEC_FR_RESTORE
+   UNSPEC_FR_RECIP_APPROX
+   UNSPEC_PRED_REL_MUTEX
+   UNSPEC_GETF_EXP
+   UNSPEC_PIC_CALL
+   UNSPEC_MF
+   UNSPEC_CMPXCHG_ACQ
+   UNSPEC_FETCHADD_ACQ
+   UNSPEC_BSP_VALUE
+   UNSPEC_FLUSHRS
+   UNSPEC_BUNDLE_SELECTOR
+   UNSPEC_ADDP4
+   UNSPEC_PROLOGUE_USE
+   UNSPEC_RET_ADDR
+   UNSPEC_SETF_EXP
+   UNSPEC_FR_SQRT_RECIP_APPROX
+   UNSPEC_SHRP
+   UNSPEC_COPYSIGN
+   UNSPEC_VECT_EXTR
+   UNSPEC_LDA
+   UNSPEC_LDS
+   UNSPEC_LDS_A
+   UNSPEC_LDSA
+   UNSPEC_LDCCLR
+   UNSPEC_LDCNC
+   UNSPEC_CHKACLR
+   UNSPEC_CHKANC
+   UNSPEC_CHKS
+   UNSPEC_FR_RECIP_APPROX_RES
+   UNSPEC_FR_SQRT_RECIP_APPROX_RES
   ])
 
-(define_constants
-  [(UNSPECV_ALLOC              0)
-   (UNSPECV_BLOCKAGE           1)
-   (UNSPECV_INSN_GROUP_BARRIER 2)
-   (UNSPECV_BREAK              3)
-   (UNSPECV_SET_BSP            4)
-   (UNSPECV_PSAC_ALL           5)  ; pred.safe_across_calls
-   (UNSPECV_PSAC_NORMAL        6)
-   (UNSPECV_SETJMP_RECEIVER    7)
-   (UNSPECV_GOTO_RECEIVER      8)
+(define_c_enum "unspecv" [
+   UNSPECV_ALLOC
+   UNSPECV_BLOCKAGE
+   UNSPECV_INSN_GROUP_BARRIER
+   UNSPECV_BREAK
+   UNSPECV_SET_BSP
+   UNSPECV_PSAC_ALL            ; pred.safe_across_calls
+   UNSPECV_PSAC_NORMAL
+   UNSPECV_SETJMP_RECEIVER
+   UNSPECV_GOTO_RECEIVER
   ])
 
 (include "predicates.md")
-- 
1.7.4.4



[PATCH 0/2] Convert ia64 to atomic optabs

2011-11-15 Thread Richard Henderson
This is relatively straight-forward, given that most of the
language actually matches up with the opcodes.  ;-)

As mentioned in the patch itself, this is based on the data
presented in 

   http://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html 

which has a few non-obvious points.

Tested on ia64-linux.  Steve, I'd be obliged if you could
also test hpux.


r~



Richard Henderson (2):
  ia64: Use define_c_enum for unspec constants.
  ia64: Update to atomic optabs

 gcc/config/ia64/ia64-protos.h |3 +-
 gcc/config/ia64/ia64.c|   75 --
 gcc/config/ia64/ia64.md   |  106 +++---
 gcc/config/ia64/sync.md   |  312 ++---
 4 files changed, 346 insertions(+), 150 deletions(-)

-- 
1.7.4.4
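
As a rough illustration of the mapping that patch 2/2 applies to fetch-and-add, the selection logic can be sketched in standalone C.  This is not code from the patch: the enum, struct, and function names below are hypothetical stand-ins, and only the grouping of memory models follows the switch statement in ia64_expand_atomic_op.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for GCC's memmodel values (illustrative names).  */
enum memmodel_sketch
{
  MM_RELAXED, MM_CONSUME, MM_ACQUIRE, MM_RELEASE, MM_ACQ_REL, MM_SEQ_CST
};

/* Which fetchadd pattern gets selected: the acquire or the release form.  */
enum fetchadd_form { FETCHADD_ACQ, FETCHADD_REL };

struct fetchadd_plan
{
  bool leading_barrier;     /* emit a full memory barrier first */
  enum fetchadd_form form;  /* then this fetchadd variant */
};

/* Mirror the grouping used in the patch: acq_rel and seq_cst get a
   barrier followed by the acquire form; relaxed, consume and acquire get
   the acquire form alone; release gets the release form alone.  */
struct fetchadd_plan
plan_fetchadd (enum memmodel_sketch model)
{
  struct fetchadd_plan plan = { false, FETCHADD_ACQ };

  switch (model)
    {
    case MM_ACQ_REL:
    case MM_SEQ_CST:
      plan.leading_barrier = true;
      /* FALLTHRU */
    case MM_RELAXED:
    case MM_ACQUIRE:
    case MM_CONSUME:
      plan.form = FETCHADD_ACQ;
      break;
    case MM_RELEASE:
      plan.form = FETCHADD_REL;
      break;
    default:
      assert (!"unknown memory model");
    }

  return plan;
}
```

Note that the relaxed case simply reuses the acquire form here, which is stronger than strictly required but keeps the table small.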



[PATCH 2/2] ia64: Update to atomic optabs

2011-11-15 Thread Richard Henderson
---
 gcc/config/ia64/ia64-protos.h |3 +-
 gcc/config/ia64/ia64.c|   75 --
 gcc/config/ia64/ia64.md   |2 +
 gcc/config/ia64/sync.md   |  312 ++---
 4 files changed, 294 insertions(+), 98 deletions(-)

diff --git a/gcc/config/ia64/ia64-protos.h b/gcc/config/ia64/ia64-protos.h
index 893ed88..c24f831 100644
--- a/gcc/config/ia64/ia64-protos.h
+++ b/gcc/config/ia64/ia64-protos.h
@@ -47,7 +47,8 @@ extern void ia64_expand_dot_prod_v8qi (rtx[], bool);
 extern void ia64_expand_call (rtx, rtx, rtx, int);
 extern void ia64_split_call (rtx, rtx, rtx, rtx, rtx, int, int);
 extern void ia64_reload_gp (void);
-extern void ia64_expand_atomic_op (enum rtx_code, rtx, rtx, rtx, rtx);
+extern void ia64_expand_atomic_op (enum rtx_code, rtx, rtx, rtx, rtx,
+  enum memmodel);
 
 extern HOST_WIDE_INT ia64_initial_elimination_offset (int, int);
 extern void ia64_expand_prologue (void);
diff --git a/gcc/config/ia64/ia64.c b/gcc/config/ia64/ia64.c
index cad6d0f..1499367 100644
--- a/gcc/config/ia64/ia64.c
+++ b/gcc/config/ia64/ia64.c
@@ -2266,7 +2266,7 @@ ia64_split_call (rtx retval, rtx addr, rtx retaddr, rtx scratch_r,
 
 void
 ia64_expand_atomic_op (enum rtx_code code, rtx mem, rtx val,
-  rtx old_dst, rtx new_dst)
+  rtx old_dst, rtx new_dst, enum memmodel model)
 {
   enum machine_mode mode = GET_MODE (mem);
   rtx old_reg, new_reg, cmp_reg, ar_ccv, label;
@@ -2283,12 +2283,31 @@ ia64_expand_atomic_op (enum rtx_code code, rtx mem, rtx val,
   if (!old_dst)
 old_dst = gen_reg_rtx (mode);
 
-  emit_insn (gen_memory_barrier ());
+  switch (model)
+   {
+   case MEMMODEL_ACQ_REL:
+   case MEMMODEL_SEQ_CST:
+ emit_insn (gen_memory_barrier ());
+ /* FALLTHRU */
+   case MEMMODEL_RELAXED:
+   case MEMMODEL_ACQUIRE:
+   case MEMMODEL_CONSUME:
+ if (mode == SImode)
+   icode = CODE_FOR_fetchadd_acq_si;
+ else
+   icode = CODE_FOR_fetchadd_acq_di;
+ break;
+   case MEMMODEL_RELEASE:
+ if (mode == SImode)
+   icode = CODE_FOR_fetchadd_rel_si;
+ else
+   icode = CODE_FOR_fetchadd_rel_di;
+ break;
+
+   default:
+ gcc_unreachable ();
+   }
 
-  if (mode == SImode)
-   icode = CODE_FOR_fetchadd_acq_si;
-  else
-   icode = CODE_FOR_fetchadd_acq_di;
   emit_insn (GEN_FCN (icode) (old_dst, mem, val));
 
   if (new_dst)
@@ -2302,8 +2321,12 @@ ia64_expand_atomic_op (enum rtx_code code, rtx mem, rtx val,
 }
 
   /* Because of the volatile mem read, we get an ld.acq, which is the
- front half of the full barrier.  The end half is the cmpxchg.rel.  */
-  gcc_assert (MEM_VOLATILE_P (mem));
+ front half of the full barrier.  The end half is the cmpxchg.rel.
+ For relaxed and release memory models, we don't need this.  But we
+ also don't bother trying to prevent it either.  */
+  gcc_assert (model == MEMMODEL_RELAXED
+ || model == MEMMODEL_RELEASE
+ || MEM_VOLATILE_P (mem));
 
   old_reg = gen_reg_rtx (DImode);
   cmp_reg = gen_reg_rtx (DImode);
@@ -2342,12 +2365,36 @@ ia64_expand_atomic_op (enum rtx_code code, rtx mem, rtx val,
   if (new_dst)
 emit_move_insn (new_dst, new_reg);
 
-  switch (mode)
+  switch (model)
 {
-case QImode:  icode = CODE_FOR_cmpxchg_rel_qi;  break;
-case HImode:  icode = CODE_FOR_cmpxchg_rel_hi;  break;
-case SImode:  icode = CODE_FOR_cmpxchg_rel_si;  break;
-case DImode:  icode = CODE_FOR_cmpxchg_rel_di;  break;
+case MEMMODEL_RELAXED:
+case MEMMODEL_ACQUIRE:
+case MEMMODEL_CONSUME:
+  switch (mode)
+   {
+   case QImode: icode = CODE_FOR_cmpxchg_acq_qi;  break;
+   case HImode: icode = CODE_FOR_cmpxchg_acq_hi;  break;
+   case SImode: icode = CODE_FOR_cmpxchg_acq_si;  break;
+   case DImode: icode = CODE_FOR_cmpxchg_acq_di;  break;
+   default:
+ gcc_unreachable ();
+   }
+  break;
+
+case MEMMODEL_RELEASE:
+case MEMMODEL_ACQ_REL:
+case MEMMODEL_SEQ_CST:
+  switch (mode)
+   {
+   case QImode: icode = CODE_FOR_cmpxchg_rel_qi;  break;
+   case HImode: icode = CODE_FOR_cmpxchg_rel_hi;  break;
+   case SImode: icode = CODE_FOR_cmpxchg_rel_si;  break;
+   case DImode: icode = CODE_FOR_cmpxchg_rel_di;  break;
+   default:
+ gcc_unreachable ();
+   }
+  break;
+
 default:
   gcc_unreachable ();
 }
@@ -6342,6 +6389,7 @@ rtx_needs_barrier (rtx x, struct reg_flags flags, int pred)
case UNSPEC_PIC_CALL:
 case UNSPEC_MF:
 case UNSPEC_FETCHADD_ACQ:
+case UNSPEC_FETCHADD_REL:
case UNSPEC_BSP_VALUE:
case UNSPEC_FLUSHRS:
case UNSPEC_BUNDLE_SELECTOR:
@@ -6385,6 +6433,7 @@ rtx_needs_barrier (rtx x, struct reg_flags flags, int pred)
  break;
 
 case UNSPEC_CM

Committed: fix typo in epiphany.md:movcc

2011-11-15 Thread Joern Rennecke

David Bremner alerted me to the presence of a typo in the Epiphany
movcc pattern.  I have checked in the attached patch as obvious.
2011-11-15  Joern Rennecke  

* config/epiphany/epiphany.md (movcc): Fix code to
get mode from CMP_OP1 if CMP_OP0 is VOIDmode.
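
For readers who want the intent of the fix spelled out, here is a minimal standalone sketch of the fallback chain.  It is not the Epiphany code: the enum and function names are hypothetical, and it only models the idea that an operand without a mode (VOIDmode, as for a constant) should make the code look at the other operand before defaulting to SImode.

```c
#include <assert.h>

/* Hypothetical stand-in for GCC's machine modes (illustrative names).  */
enum mode_sketch { VOID_MODE, SI_MODE, DI_MODE, SF_MODE };

/* Pick the comparison mode: first operand, else second operand, else the
   SImode default.  Before the fix, the second step re-read the first
   operand, so the second operand's mode was never consulted.  */
enum mode_sketch
comparison_mode (enum mode_sketch op0_mode, enum mode_sketch op1_mode)
{
  enum mode_sketch mode = op0_mode;

  if (mode == VOID_MODE)
    mode = op1_mode;   /* the fix: consult the second operand */
  if (mode == VOID_MODE)
    mode = SI_MODE;    /* both operands modeless: fall back to SImode */

  assert (mode != VOID_MODE);
  return mode;
}
```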

Index: config/epiphany/epiphany.md
===
--- config/epiphany/epiphany.md (revision 181387)
+++ config/epiphany/epiphany.md (working copy)
@@ -1711,7 +1711,7 @@ (define_expand "movcc"
 
   cmp_in_mode = GET_MODE (cmp_op0);
   if (cmp_in_mode == VOIDmode)
-cmp_in_mode = GET_MODE (cmp_op0);
+cmp_in_mode = GET_MODE (cmp_op1);
   if (cmp_in_mode == VOIDmode)
 cmp_in_mode = SImode;
   /* If the operands are a better match when reversed, swap them now.


Re: New port^2: Renesas RL78

2011-11-15 Thread DJ Delorie

> Otherwise the port is looking ok.

What else need I do for this port?


[PATCH] tail_merge_optimize frequency fix

2011-11-15 Thread Tom de Vries
Richard,

this patch fixes up the basic block frequencies after merging 2 bbs in
tail_merge_optimize, and prevents tree-dump messages like:
'Invalid sum of incoming frequencies x, should be y'.

Bootstrapped and reg-tested on x86_64 and i686, built and reg-tested on ARM and
MIPS.

OK for trunk?

Thanks,
- Tom

2011-11-15  Tom de Vries  

* tree-ssa-tail-merge.c (replace_block_by): Add frequency of bb2 to bb1.

* gcc.dg/pr43864.c: Check for absence of 'Invalid sum' in pre tree-dump.
* gcc.dg/pr43864-2.c: Same.
* gcc.dg/pr43864-3.c: Same.
* gcc.dg/pr43864-4.c: Same.
Index: gcc/tree-ssa-tail-merge.c
===
--- gcc/tree-ssa-tail-merge.c (revision 181377)
+++ gcc/tree-ssa-tail-merge.c (working copy)
@@ -1396,6 +1396,9 @@ replace_block_by (basic_block bb1, basic
 		   pred_edge, UNKNOWN_LOCATION);
 }
 
+  bb2->frequency += bb1->frequency;
+  bb1->frequency = 0;
+
   /* Do updates that use bb1, before deleting bb1.  */
   release_last_vdef (bb1);
   same_succ_flush_bb (bb1);
Index: gcc/testsuite/gcc.dg/pr43864-2.c
===
--- gcc/testsuite/gcc.dg/pr43864-2.c (revision 181377)
+++ gcc/testsuite/gcc.dg/pr43864-2.c (working copy)
@@ -19,4 +19,5 @@ f (int c, int b, int d)
 
 /* { dg-final { scan-tree-dump-times "if " 0 "pre"} } */
 /* { dg-final { scan-tree-dump-times "_.*\\\+.*_" 1 "pre"} } */
+/* { dg-final { scan-tree-dump-not "Invalid sum" "pre"} } */
 /* { dg-final { cleanup-tree-dump "pre" } } */
Index: gcc/testsuite/gcc.dg/pr43864-3.c
===
--- gcc/testsuite/gcc.dg/pr43864-3.c (revision 181377)
+++ gcc/testsuite/gcc.dg/pr43864-3.c (working copy)
@@ -20,4 +20,5 @@ int f(int c, int b, int d)
 
 /* { dg-final { scan-tree-dump-times "if " 0 "pre"} } */
 /* { dg-final { scan-tree-dump-times "_.*\\\+.*_" 1 "pre"} } */
+/* { dg-final { scan-tree-dump-not "Invalid sum" "pre"} } */
 /* { dg-final { cleanup-tree-dump "pre" } } */
Index: gcc/testsuite/gcc.dg/pr43864-4.c
===
--- gcc/testsuite/gcc.dg/pr43864-4.c (revision 181377)
+++ gcc/testsuite/gcc.dg/pr43864-4.c (working copy)
@@ -25,4 +25,5 @@ int f(int c, int b, int d)
 /* { dg-final { scan-tree-dump-times "if " 0 "pre"} } */
 /* { dg-final { scan-tree-dump-times "_.*\\\+.*_" 1 "pre"} } */
 /* { dg-final { scan-tree-dump-times " - " 2 "pre"} } */
+/* { dg-final { scan-tree-dump-not "Invalid sum" "pre"} } */
 /* { dg-final { cleanup-tree-dump "pre" } } */
Index: gcc/testsuite/gcc.dg/pr43864.c
===
--- gcc/testsuite/gcc.dg/pr43864.c (revision 181377)
+++ gcc/testsuite/gcc.dg/pr43864.c (working copy)
@@ -32,4 +32,5 @@ hprofStartupp (char *outputFileName, cha
 }
 
 /* { dg-final { scan-tree-dump-times "myfree \\(" 1 "pre"} } */
+/* { dg-final { scan-tree-dump-not "Invalid sum" "pre"} } */
 /* { dg-final { cleanup-tree-dump "pre" } } */


[PATCH, i386]: Optimize v2df (x2) -> v4sf,v4si conversion sequences for AVX.

2011-11-15 Thread Uros Bizjak
Hello!

The attached patch optimizes v2df (x2) -> v4sf,v4si conversion sequences
for AVX from:

vcvtpd2psx      48(%rsp), %xmm1
vcvtpd2psx      64(%rsp), %xmm0
vmovlhps        %xmm0, %xmm1, %xmm0
vmovaps         %xmm0, 32(%rsp)

to

vmovapd         64(%rsp), %xmm0
vinsertf128     $0x1, 80(%rsp), %ymm0, %ymm0
vcvtpd2psy      %ymm0, %xmm0
vmovaps         %xmm0, 32(%rsp)

Please note only one conversion instruction.
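
For reference, both instruction sequences above compute the same result, which can be modelled in plain scalar C.  This is only a sketch of the semantics, with a hypothetical function name; it does not use intrinsics and says nothing about how the patterns in sse.md are written.

```c
#include <assert.h>

/* Scalar model of the v2df (x2) -> v4sf pack: the two doubles of LO land
   in elements 0..1 of OUT and the two doubles of HI in elements 2..3,
   each narrowed to float.  */
void
pack_trunc_v2df (const double lo[2], const double hi[2], float out[4])
{
  assert (lo && hi && out);

  for (int i = 0; i < 2; i++)
    {
      out[i] = (float) lo[i];
      out[i + 2] = (float) hi[i];
    }
}
```

The SSE2 sequence narrows each half separately and then merges the results, while the AVX sequence first concatenates the halves into one 256-bit value and narrows once.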

In a similar way, the patch optimizes floor/ceil/round from:

vroundpd        $1, 32(%rsp), %xmm1
vroundpd        $1, 48(%rsp), %xmm0
vcvttpd2dqx     %xmm1, %xmm1
vcvttpd2dqx     %xmm0, %xmm0
vpunpcklqdq     %xmm0, %xmm1, %xmm0
vmovdqa         %xmm0, 16(%rsp)

to

vroundpd        $1, 64(%rsp), %xmm1
vroundpd        $1, 80(%rsp), %xmm0
vinsertf128     $0x1, %xmm0, %ymm1, %ymm0
vcvttpd2dqy     %ymm0, %xmm0
vmovdqa         %xmm0, 32(%rsp)

Ideally, this would be just "vcvtpd2psy 64(%rsp), %xmm0" or "vroundpd
$1, 64(%rsp), %ymm1", but the vectorizer does not (yet) support mixed
vectorization factors.

The patch also changes a couple of patterns to use simpler SSE
patterns with vec-concat pattern to generate equivalent code.

2011-11-15  Uros Bizjak  

* config/i386/sse.md (vec_pack_trunc_v2df): Optimize sequence for AVX.
(vec_pack_sfix_trunc_v2df): Ditto.
(vec_pack_sfix_v2df): Ditto.
(vec_pack_sfix_trunc_v4df): Generate fix_truncv4dfv4si2 and
avx_vec_concatv8si patterns.
(vec_pack_sfix_v4df): Generate avx_cvtpd2dq256 and
avx_vec_concatv8si patterns.

testsuite/ChangeLog:

2011-11-15  Uros Bizjak  

* gcc.target/i386/avx-cvt-2-vec.c: New test.
* gcc.target/i386/avx-floor-sfix-2-vec.c: Ditto.
* gcc.target/i386/avx-ceil-sfix-2-vec.c: Ditto.
* gcc.target/i386/avx-rint-sfix-2-vec.c: Ditto.
* gcc.target/i386/avx-round-sfix-2-vec.c: Ditto.

Tested on x86_64-pc-linux-gnu {,-m32} AVX target, committed to mainline SVN.

Uros.
Index: config/i386/sse.md
===
--- config/i386/sse.md  (revision 181378)
+++ config/i386/sse.md  (working copy)
@@ -3038,14 +3038,25 @@
(match_operand:V2DF 2 "nonimmediate_operand" "")]
   "TARGET_SSE2"
 {
-  rtx r1, r2;
+  rtx tmp0, tmp1;
 
-  r1 = gen_reg_rtx (V4SFmode);
-  r2 = gen_reg_rtx (V4SFmode);
+  if (TARGET_AVX && !TARGET_PREFER_AVX128)
+{
+  tmp0 = gen_reg_rtx (V4DFmode);
+  tmp1 = force_reg (V2DFmode, operands[1]);
 
-  emit_insn (gen_sse2_cvtpd2ps (r1, operands[1]));
-  emit_insn (gen_sse2_cvtpd2ps (r2, operands[2]));
-  emit_insn (gen_sse_movlhps (operands[0], r1, r2));
+  emit_insn (gen_avx_vec_concatv4df (tmp0, tmp1, operands[2]));
+  emit_insn (gen_avx_cvtpd2ps256 (operands[0], tmp0));
+}
+  else
+{
+  tmp0 = gen_reg_rtx (V4SFmode);
+  tmp1 = gen_reg_rtx (V4SFmode);
+
+  emit_insn (gen_sse2_cvtpd2ps (tmp0, operands[1]));
+  emit_insn (gen_sse2_cvtpd2ps (tmp1, operands[2]));
+  emit_insn (gen_sse_movlhps (operands[0], tmp0, tmp1));
+}
   DONE;
 })
 
@@ -3057,12 +3068,12 @@
 {
   rtx r1, r2;
 
-  r1 = gen_reg_rtx (V8SImode);
-  r2 = gen_reg_rtx (V8SImode);
+  r1 = gen_reg_rtx (V4SImode);
+  r2 = gen_reg_rtx (V4SImode);
 
-  emit_insn (gen_avx_cvttpd2dq256_2 (r1, operands[1]));
-  emit_insn (gen_avx_cvttpd2dq256_2 (r2, operands[2]));
-  emit_insn (gen_avx_vperm2f128v8si3 (operands[0], r1, r2, GEN_INT (0x20)));
+  emit_insn (gen_fix_truncv4dfv4si2 (r1, operands[1]));
+  emit_insn (gen_fix_truncv4dfv4si2 (r2, operands[2]));
+  emit_insn (gen_avx_vec_concatv8si (operands[0], r1, r2));
   DONE;
 })
 
@@ -3072,16 +3083,28 @@
(match_operand:V2DF 2 "nonimmediate_operand" "")]
   "TARGET_SSE2"
 {
-  rtx r1, r2;
+  rtx tmp0, tmp1;
 
-  r1 = gen_reg_rtx (V4SImode);
-  r2 = gen_reg_rtx (V4SImode);
+  if (TARGET_AVX && !TARGET_PREFER_AVX128)
+{
+  tmp0 = gen_reg_rtx (V4DFmode);
+  tmp1 = force_reg (V2DFmode, operands[1]);
 
-  emit_insn (gen_sse2_cvttpd2dq (r1, operands[1]));
-  emit_insn (gen_sse2_cvttpd2dq (r2, operands[2]));
-  emit_insn (gen_vec_interleave_lowv2di (gen_lowpart (V2DImode, operands[0]),
-gen_lowpart (V2DImode, r1),
-gen_lowpart (V2DImode, r2)));
+  emit_insn (gen_avx_vec_concatv4df (tmp0, tmp1, operands[2]));
+  emit_insn (gen_fix_truncv4dfv4si2 (operands[0], tmp0));
+}
+  else
+{
+  tmp0 = gen_reg_rtx (V4SImode);
+  tmp1 = gen_reg_rtx (V4SImode);
+
+  emit_insn (gen_sse2_cvttpd2dq (tmp0, operands[1]));
+  emit_insn (gen_sse2_cvttpd2dq (tmp1, operands[2]));
+  emit_insn
+   (gen_vec_interleave_lowv2di (gen_lowpart (V2DImode, operands[0]),
+   gen_lowpart (V2DImode, tmp0),
+   gen_lowpart (V2DImode, tmp1)));
+}
   DONE;
 })
 
@@ -3126,12 +3149,12 @@
 {
   rt

Re: [PATCH 0/2] Convert ia64 to atomic optabs

2011-11-15 Thread Steve Ellcey
On Tue, 2011-11-15 at 08:53 -1000, Richard Henderson wrote:
> This is relatively straight-forward, given that most of the
> language actually matches up with the opcodes.  ;-)
> 
> As mentioned in the patch itself, this is based on the data
> presented in 
> 
>http://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html 
> 
> which has a few non-obvious points.
> 
> Tested on ia64-linux.  Steve, I'd be obliged if you could
> also test hpux.
> 
> 
> r~

I'll try to test it but I am currently trying to track down a bootstrap
failure on IA64 HP-UX that appears to have started somewhere between
r181238 and r181287.  My IA64 HP-UX build currently fails with:

build/genmddeps /ctires/gcc/nightly/src/trunk/gcc/config/ia64/ia64.md > tmp-mddeps
/ctires/gcc/nightly/src/trunk/gcc/config/ia64/ia64.md:52: expected character `[', found ` '
/ctires/gcc/nightly/src/trunk/gcc/config/ia64/ia64.md:52: following context is ` [; Relocations'
make[3]: *** [s-mddeps] Error 1
make[3]: Leaving directory `/ctires/gcc/nightly/build-ia64-hp-hpux11.23-trunk/obj_gcc/gcc'
make[2]: *** [all-stage2-gcc] Error 2
make[2]: Leaving directory `/ctires/gcc/nightly/build-ia64-hp-hpux11.23-trunk/obj_gcc'
make[1]: *** [stage2-bubble] Error 2

ia64.md didn't change so it looks like it is a problem caused by how
genmddeps was compiled with the stage 1 GCC.

Steve Ellcey
s...@cup.hp.com



Re: [PATCH 0/2] Convert ia64 to atomic optabs

2011-11-15 Thread Andrew Pinski
On Tue, Nov 15, 2011 at 11:30 AM, Steve Ellcey  wrote:
> I'll try to test it but I am currently trying to track down a bootstrap
> failure on IA64 HP-UX that appears to have started somewhere between
> r181238 and r181287.  My IA64 HP-UX build currently fails with:
>
> build/genmddeps /ctires/gcc/nightly/src/trunk/gcc/config/ia64/ia64.md > tmp-mddeps
> /ctires/gcc/nightly/src/trunk/gcc/config/ia64/ia64.md:52: expected character `[', found ` '
> /ctires/gcc/nightly/src/trunk/gcc/config/ia64/ia64.md:52: following context is ` [; Relocations'
> make[3]: *** [s-mddeps] Error 1
> make[3]: Leaving directory `/ctires/gcc/nightly/build-ia64-hp-hpux11.23-trunk/obj_gcc/gcc'
> make[2]: *** [all-stage2-gcc] Error 2
> make[2]: Leaving directory `/ctires/gcc/nightly/build-ia64-hp-hpux11.23-trunk/obj_gcc'
> make[1]: *** [stage2-bubble] Error 2

This is interesting because it was just reported that s390 fails the same way:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51144

Thanks,
Andrew Pinski


Re: [PATCH] tail_merge_optimize frequency fix

2011-11-15 Thread Jeff Law
On 11/15/11 12:24, Tom de Vries wrote:
> Richard,
> 
> this patch fixes up the basic block frequencies after merging 2 bbs
> in tail_merge_optimize, and prevents tree-dump messages like: 
> 'Invalid sum of incoming frequencies x, should be y'.
> 
> Bootstrapped and reg-tested on x86_64 and i686, build and
> reg-tested on ARM and MIPS.
> 
> OK for trunk?
> 
> Thanks, - Tom
> 
> 2011-11-15  Tom de Vries  
> 
> * tree-ssa-tail-merge.c (replace_block_by): Add frequency of bb2 to
> bb1.
> 
> * gcc.dg/pr43864.c: Check for absence of 'Invalid sum' in pre
> tree-dump.
> * gcc.dg/pr43864-2.c: Same.
> * gcc.dg/pr43864-3.c: Same.
> * gcc.dg/pr43864-4.c: Same.
Just to be safe, can you clamp bb2's frequency at BB_FREQ_MAX, i.e.,
after summing the frequencies:

if (bb2->frequency > BB_FREQ_MAX)
  bb2->frequency = BB_FREQ_MAX;

In theory this shouldn't happen, but often the frequencies go goofy.

Approved with that fix.
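
Put together, the frequency update with the requested clamp might look like the following standalone sketch.  The struct, the function name, and the value 10000 for the maximum are assumptions made for illustration only; the real code operates on GCC's basic_block and BB_FREQ_MAX.

```c
#include <assert.h>

/* Assumed maximum block frequency, standing in for GCC's BB_FREQ_MAX.  */
#define BB_FREQ_MAX_SKETCH 10000

/* Hypothetical stand-in for the frequency field of a basic block.  */
struct bb_sketch
{
  int frequency;
};

/* Fold the frequency of the block being removed into the block that is
   kept, clamping the sum in case the incoming numbers are inconsistent.  */
void
merge_frequency (struct bb_sketch *removed, struct bb_sketch *kept)
{
  assert (removed != kept);

  kept->frequency += removed->frequency;
  if (kept->frequency > BB_FREQ_MAX_SKETCH)
    kept->frequency = BB_FREQ_MAX_SKETCH;
  removed->frequency = 0;
}
```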

jeff


Re: [PR50764, PATCH] Fix for ICE in maybe_record_trace_start with -fsched2-use-superblocks

2011-11-15 Thread Maxim Kuvyrkov
On 30/10/2011, at 8:17 AM, Tom de Vries wrote:

> Richard,
> 
> I have a tentative fix for PR50764.

Richard,

Tom's patch is good (with the comments below addressed), and I would appreciate 
you validating my review with your formal approval.

> 
> In the example from the test-case, -fsched2-use-superblocks moves an insn from
> block 4 to block 3.
> 
>   2
>  bar
>   |
>---+-
>   / \
>  *   *
>  5 * 3
> abortbar
>  |
>  |
>  *
>  4
>return
> 
> 
> The insn has a REG_CFA_DEF_CFA note and is frame-related.
> ...
> (insn/f 51 50 52 4 (set (reg:DI 39 r10)
>(mem/c:DI (plus:DI (reg/f:DI 6 bp)
>(const_int -8 [0xfff8])) [3 S8 A8])) pr50764.c:13
> 62 {*movdi_internal_rex64}
> (expr_list:REG_CFA_DEF_CFA (reg:DI 39 r10)
>(nil)))
> ...
> 
> This causes the assert in maybe_record_trace_start to trigger:
> ...
>  /* We ought to have the same state incoming to a given trace no
>matter how we arrive at the trace.  Anything else means we've
>got some kind of optimization error.  */
>  gcc_checking_assert (cfi_row_equal_p (cur_row, ti->beg_row));
> ...
> 
> The assert does not occur with -fno-tree-tail-merge, but that is due to the
> following:
> - -fsched2-use-superblocks does not handle dead labels explicitly
> - -freorder-blocks introduces a dead label, which is not removed until after
>  sched2
> - -ftree-tail-merge makes a difference in which block -freorder-blocks
>  introduces the dead label. In the case of -ftree-tail-merge, the dead label
>  is introduced at the start of block 3, and block 3 and 4 end up in the same
>  ebb. In the case of -fno-tree-tail-merge, the dead label is introduced at the
>  start of block 4, and block 3 and 4 don't end up in the same ebb.
> 
> attached untested patch fixes PR50764 in a similar way as the patch for PR49994,
> which is also about an ICE in maybe_record_trace_start with
> -fsched2-use-superblocks.
> 
> The patch for PR49994 makes sure frame-related instructions are not moved past
> the following jump.
> 
> Attached patch makes sure frame-related instructions are not moved past the
> preceding jump.
> 
> Is this the way to fix this PR?

Tom,

Thank you for the good analysis; your patch is the right way to go.

The scheduler should not move frame-related insns from either prologue or epilogue 
basic blocks.  Currently sched-deps analysis handles the prologue case, and 
your patch fixes the epilogue case.  The primary reason we didn't hit the 
assert before is that we do interblock scheduling after reload 
on only a few architectures.  With single-block scheduling after reload, which is 
what we do for most architectures, this issue cannot arise.

> 
> Index: gcc/sched-deps.c
> ===
> --- gcc/sched-deps.c (revision 180521)
> +++ gcc/sched-deps.c (working copy)
> @@ -2812,8 +2812,13 @@ sched_analyze_insn (struct deps_desc *de
>  during prologue generation and avoid marking the frame pointer setup
>  as frame-related at all.  */
>   if (RTX_FRAME_RELATED_P (insn))
> -deps->sched_before_next_jump
> -  = alloc_INSN_LIST (insn, deps->sched_before_next_jump);
> +{
> +  deps->sched_before_next_jump
> + = alloc_INSN_LIST (insn, deps->sched_before_next_jump);

This code is rather obscure, so additional comments would be helpful.  Please 
add something like "Make sure prologue INSN is scheduled before next jump." 
before the first statement; and add something like "Make sure epilogue INSN is 
not moved before preceding jumps." before the second statement.

> +
> +  if (deps->pending_jump_insns)
> + add_dependence (insn, XEXP (deps->pending_jump_insns, 0), REG_DEP_ANTI);

Please use "add_dependence_list (insn, deps->pending_jump_insns, 1, 
REG_DEP_ANTI);" instead.  We want INSN to depend upon all of pending jumps, not 
just one of them.  The situation where pending_jump_insns has more than a 
single jump does not happen in the current setup of scheduling runs (as sched-rgn 
does not do interblock scheduling after reload), but that may change in the 
future.
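
To make the difference concrete, here is a small standalone sketch contrasting the two approaches.  None of it is scheduler code: the list and insn structures and all function names are hypothetical, and the point is only that one form records a dependence on the head of the pending-jump list while the other walks the whole list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a list of pending jump insns.  */
struct jump_node
{
  int jump_id;
  struct jump_node *next;
};

#define MAX_DEPS 8

/* Hypothetical insn that records the ids of the jumps it depends on.  */
struct insn_sketch
{
  int deps[MAX_DEPS];
  size_t n_deps;
};

static void
add_dep (struct insn_sketch *insn, int jump_id)
{
  assert (insn->n_deps < MAX_DEPS);
  insn->deps[insn->n_deps++] = jump_id;
}

/* Depend on the head of the list only, as in the first version.  */
void
depend_on_first (struct insn_sketch *insn, const struct jump_node *pending)
{
  if (pending)
    add_dep (insn, pending->jump_id);
}

/* Depend on every pending jump, as the list-based call would.  */
void
depend_on_all (struct insn_sketch *insn, const struct jump_node *pending)
{
  for (const struct jump_node *j = pending; j; j = j->next)
    add_dep (insn, j->jump_id);
}
```

With a single pending jump the two behave identically, which is why the difference does not show up in the current scheduling setup.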

OK upon formal approval from Richard or other reviewer.

Thank you,

--
Maxim Kuvyrkov
CodeSourcery / Mentor Graphics



[PATCH RFC] Correct sparc's REGMODE_NATURAL_SIZE and MODES_TIEABLE_P wrt. vector modes.

2011-11-15 Thread David Miller

Eric, this is just something I noticed while trying to fix the
vec_init problems last week.

I'm confident that the issue is real; however, I can't point to any
real bugs that are caused by this.

Therefore I'm reluctant to commit this change.

What do you think?

gcc/

* config/sparc/sparc.c (sparc_regmode_natural_size): New function
implementing REGMODE_NATURAL_SIZE taking into consideration vector
modes.
(sparc_modes_tieable_p): Similarly for MODES_TIEABLE_P.
* config/sparc/sparc-protos.h (sparc_regmode_natural_size,
sparc_modes_tieable_p): Declare.
* gcc/config/sparc/sparc.h (REGMODE_NATURAL_SIZE,
MODES_TIEABLE_P): Use new helper functions.
---
 gcc/ChangeLog   |9 +
 gcc/config/sparc/sparc-protos.h |2 +
 gcc/config/sparc/sparc.c|   65 +++
 gcc/config/sparc/sparc.h|   18 +-
 4 files changed, 78 insertions(+), 16 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index cf4e66b..3544d38 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,14 @@
 2011-11-11  David S. Miller  
 
+   * config/sparc/sparc.c (sparc_regmode_natural_size): New function
+   implementing REGMODE_NATURAL_SIZE taking into consideration vector
+   modes.
+   (sparc_modes_tieable_p): Similarly for MODES_TIEABLE_P.
+   * config/sparc/sparc-protos.h (sparc_regmode_natural_size,
+   sparc_modes_tieable_p): Declare.
+   * gcc/config/sparc/sparc.h (REGMODE_NATURAL_SIZE,
+   MODES_TIEABLE_P): Use new helper functions.
+
Revert
2011-11-05  David S. Miller  
 
diff --git a/gcc/config/sparc/sparc-protos.h b/gcc/config/sparc/sparc-protos.h
index ccf20b1..10fa5ed 100644
--- a/gcc/config/sparc/sparc-protos.h
+++ b/gcc/config/sparc/sparc-protos.h
@@ -109,6 +109,8 @@ extern void sparc_expand_vector_init (rtx, rtx);
 extern void sparc_expand_vec_perm_bmask(enum machine_mode, rtx);
 extern bool sparc_expand_conditional_move (enum machine_mode, rtx *);
 extern void sparc_expand_vcond (enum machine_mode, rtx *, int, int);
+unsigned int sparc_regmode_natural_size (enum machine_mode);
+bool sparc_modes_tieable_p (enum machine_mode, enum machine_mode);
 #endif /* RTX_CODE */
 
 #endif /* __SPARC_PROTOS_H__ */
diff --git a/gcc/config/sparc/sparc.c b/gcc/config/sparc/sparc.c
index 55759a0..b315698 100644
--- a/gcc/config/sparc/sparc.c
+++ b/gcc/config/sparc/sparc.c
@@ -11616,4 +11616,69 @@ sparc_expand_vcond (enum machine_mode mode, rtx *operands, int ccode, int fcode)
   emit_insn (gen_rtx_SET (VOIDmode, operands[0], bshuf));
 }
 
+/* On sparc, any mode which naturally allocates into the float
+   registers should return 4 here.  */
+
+unsigned int
+sparc_regmode_natural_size (enum machine_mode mode)
+{
+  int size = UNITS_PER_WORD;
+
+  if (TARGET_ARCH64)
+    {
+      enum mode_class mclass = GET_MODE_CLASS (mode);
+
+      if (mclass == MODE_FLOAT || mclass == MODE_VECTOR_INT)
+        size = 4;
+    }
+
+  return size;
+}
+
+/* Return TRUE if it is a good idea to tie two pseudo registers
+   when one has mode MODE1 and one has mode MODE2.
+   If HARD_REGNO_MODE_OK could produce different values for MODE1 and MODE2,
+   for any hard reg, then this must be FALSE for correct output.
+
+   For V9 we have to deal with the fact that only the lower 32 floating
+   point registers are 32-bit addressable.  */
+
+bool
+sparc_modes_tieable_p (enum machine_mode mode1, enum machine_mode mode2)
+{
+  enum mode_class mclass1, mclass2;
+  unsigned short size1, size2;
+
+  if (mode1 == mode2)
+    return true;
+
+  mclass1 = GET_MODE_CLASS (mode1);
+  mclass2 = GET_MODE_CLASS (mode2);
+  if (mclass1 != mclass2)
+    return false;
+
+  if (! TARGET_V9)
+    return true;
+
+  /* Classes are the same and we are V9 so we have to deal with upper
+     vs. lower floating point registers.  If one of the modes is a
+     4-byte mode, and the other is not, we have to mark them as not
+     tieable because only the lower 32 floating point registers are
+     addressable 32-bits at a time.
+
+     We can't just test explicitly for SFmode, otherwise we won't
+     cover the vector mode cases properly.  */
+
+  if (mclass1 != MODE_FLOAT && mclass1 != MODE_VECTOR_INT)
+    return true;
+
+  size1 = GET_MODE_SIZE (mode1);
+  size2 = GET_MODE_SIZE (mode2);
+  if ((size1 > 4 && size2 == 4)
+      || (size2 > 4 && size1 == 4))
+    return false;
+
+  return true;
+}
+
 #include "gt-sparc.h"
diff --git a/gcc/config/sparc/sparc.h b/gcc/config/sparc/sparc.h
index e8707f5..32f8c10 100644
--- a/gcc/config/sparc/sparc.h
+++ b/gcc/config/sparc/sparc.h
@@ -716,8 +716,7 @@ extern enum cmodel sparc_cmodel;
 
 /* Due to the ARCH64 discrepancy above we must override this next
macro too.  */
-#define REGMODE_NATURAL_SIZE(MODE) \
-  ((TARGET_ARCH64 && FLOAT_MODE_P (MODE)) ? 4 : UNITS_PER_WORD)
+#define REGMODE_NATURAL_SIZE(MODE) sparc_regmode_natural_size (MODE)
 
 /* Value is 1 if hard reg
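
The decision sequence in sparc_modes_tieable_p above can be tried outside the compiler with a toy model.  The sketch below is only an illustration: toy_class, toy_mode and the is_v9 flag are made-up stand-ins for GET_MODE_CLASS, GET_MODE_SIZE and TARGET_V9, and "same class and same size" is used as an approximation of mode1 == mode2.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy mode description; not GCC's enum machine_mode.  */
enum toy_class { TOY_INT, TOY_FLOAT, TOY_VECTOR_INT };

struct toy_mode
{
  enum toy_class mclass;
  unsigned size;		/* Size in bytes.  */
};

/* Follows the same order of checks as sparc_modes_tieable_p, with
   IS_V9 standing in for TARGET_V9.  */
bool
toy_modes_tieable_p (struct toy_mode m1, struct toy_mode m2, bool is_v9)
{
  assert (m1.size > 0 && m2.size > 0);

  /* Identical modes always tie.  */
  if (m1.mclass == m2.mclass && m1.size == m2.size)
    return true;

  /* Different classes never tie.  */
  if (m1.mclass != m2.mclass)
    return false;

  /* Before V9 there is no upper/lower float register split.  */
  if (!is_v9)
    return true;

  /* Only float and integer-vector modes live in the float registers.  */
  if (m1.mclass != TOY_FLOAT && m1.mclass != TOY_VECTOR_INT)
    return true;

  /* A 4-byte mode cannot be tied with a wider one.  */
  if ((m1.size > 4 && m2.size == 4) || (m2.size > 4 && m1.size == 4))
    return false;

  return true;
}
```

In this model a 4-byte float and an 8-byte float stop being tieable once is_v9 is set, and the same holds for a 4-byte and an 8-byte integer vector, which is the case a plain SFmode test would miss.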

Re: [PATCH] reload: Try alternative with swapped operands before going to the next

2011-11-15 Thread Maxim Kuvyrkov
On 15/11/2011, at 6:21 AM, Andreas Krebbel wrote:

> Hi,
> 
> find_reloads currently loops over all alternatives in an insn and
> restarts the whole process after swapping commutative operands.  This
> together with the early exit for a perfectly matching alternative
> leads to a behavior which does not exactly match the manual.
> 
> From http://gcc.gnu.org/onlinedocs/gccint/Multi_002dAlternative.html
> "If two alternatives need the same amount of copying, the one that
> comes first is chosen."
> 
> An earlier alternative could be turned into a perfect match by
> swapping the operands.  But this will not happen if a later
> alternative already provides a perfect match without swapping the
> operands.
...
> 
> The attached patch changes the loop nesting in find_reloads in order
> to try each alternative with swapped operands before picking the next.
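
The effect of the loop nesting described above can be seen in a much simplified model.  The sketch below is not find_reloads: it reduces each (alternative, swapped) combination to a single made-up cost, with 0 meaning a perfect match, and keeps only the early exit and the "first one wins on ties" rule.  All names are hypothetical.

```c
#include <assert.h>
#include <limits.h>

#define N_SWAP 2

struct toy_choice
{
  int alternative;
  int swapped;
};

/* COST[a][s] models the amount of copying needed for alternative A
   with the commutative operands unswapped (s == 0) or swapped
   (s == 1).  */

/* Old nesting: all alternatives unswapped, then all of them swapped.  */
struct toy_choice
toy_pick_swap_outer (int cost[][N_SWAP], int n_alternatives)
{
  struct toy_choice best = { -1, 0 };
  int best_cost = INT_MAX;

  assert (n_alternatives > 0);
  for (int s = 0; s < N_SWAP; s++)
    for (int a = 0; a < n_alternatives; a++)
      {
	if (cost[a][s] < best_cost)
	  {
	    best_cost = cost[a][s];
	    best.alternative = a;
	    best.swapped = s;
	  }
	if (best_cost == 0)
	  return best;		/* Early exit on a perfect match.  */
      }
  return best;
}

/* New nesting: try each alternative both ways before the next one.  */
struct toy_choice
toy_pick_alternative_outer (int cost[][N_SWAP], int n_alternatives)
{
  struct toy_choice best = { -1, 0 };
  int best_cost = INT_MAX;

  assert (n_alternatives > 0);
  for (int a = 0; a < n_alternatives; a++)
    for (int s = 0; s < N_SWAP; s++)
      {
	if (cost[a][s] < best_cost)
	  {
	    best_cost = cost[a][s];
	    best.alternative = a;
	    best.swapped = s;
	  }
	if (best_cost == 0)
	  return best;		/* Early exit on a perfect match.  */
      }
  return best;
}
```

With cost = {{2, 0}, {0, 3}} the swap-outer scan settles on alternative 1 unswapped, while the alternative-outer scan settles on alternative 0 with the operands swapped, which is the earlier alternative the manual's wording would lead one to expect.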

I have eyeballed this patch for a good half an hour and couldn't poke any holes
in it.  I can't approve this patch, but below are some review comments.  Mostly
these are suggested comments to make reload easier to understand for future
generations.

> 
> Bootstrapped on s390x, x86_64 and PPC64. No regressions
> 
> Ok for mainline?

A good portion of the code you're changing was written by Richard K. ages ago, so
CC'ing him to get his approval.

> 
> Bye,
> 
> -Andreas-
> 
> 
> 2011-11-14  Andreas Krebbel  
> 
>   * reload.c (find_reloads): Change the loop nesting when trying an
>   alternative with swapped operands.
> 
> 
> The first version of the patch is without adjusting the indentation in
> order to allow for easier review.
> 
> Index: gcc/reload.c
> ===
> *** gcc/reload.c.orig
> --- gcc/reload.c
> *** find_reloads (rtx insn, int replace, int
> *** 2592,2598 
>char this_alternative_offmemok[MAX_RECOG_OPERANDS];
>char this_alternative_earlyclobber[MAX_RECOG_OPERANDS];
>int this_alternative_matches[MAX_RECOG_OPERANDS];
> -   int swapped;
>reg_class_t goal_alternative[MAX_RECOG_OPERANDS];
>int this_alternative_number;
>int goal_alternative_number = 0;
> --- 2592,2597 
> *** find_reloads (rtx insn, int replace, int
> *** 2938,2946 
> 
>best = MAX_RECOG_OPERANDS * 2 + 600;
> 
> -   swapped = 0;
>goal_alternative_swapped = 0;
> -  try_swapped:
> 
>/* The constraints are made of several alternatives.
>   Each operand's constraint looks like foo,bar,... with commas
> --- 2937,2943 
> *** find_reloads (rtx insn, int replace, int
> *** 2953,2958 
> --- 2950,2975 
> this_alternative_number < n_alternatives;
> this_alternative_number++)
>  {
> +   int swapped;
> +   const char *old_constraints[MAX_RECOG_OPERANDS];
> + 
/* Skip disabled alternatives.  */
> +   if (!recog_data.alternative_enabled_p[this_alternative_number])
> + {
> +   int i;
> + 
> +   for (i = 0; i < recog_data.n_operands; i++)
> + constraints[i] = skip_alternative (constraints[i]);
> + 
> +   continue;
> + }
> + 
> +   /* If insn is commutative (it's safe to exchange a certain pair
> +  of operands) then we need to try each alternative twice, the
> +  second time matching those two operands as if we had
> +  exchanged them.  To do this, really exchange them in
> +  operands.  */
> +   for (swapped = 0; swapped < (commutative >= 0 ? 2 : 1); swapped++)
> + {
>/* Loop over operands for one constraint alternative.  */
>/* LOSERS counts those that don't fit this alternative
>and would require loading.  */
> *** find_reloads (rtx insn, int replace, int
> *** 2968,2982 
>a bad register class to only count 1/3 as much.  */
>int reject = 0;
> 
> !   if (!recog_data.alternative_enabled_p[this_alternative_number])
> ! {
> !   int i;
> 
> !   for (i = 0; i < recog_data.n_operands; i++)
> ! constraints[i] = skip_alternative (constraints[i]);
> 
> !   continue;
> ! }
> 
>this_earlyclobber = 0;
> 
> --- 2985,3021 
>a bad register class to only count 1/3 as much.  */
>int reject = 0;
> 
> !   if (swapped)
/* Trying THIS_ALTERNATIVE with commutative operands swapped.
Do the swapping.  */
> ! {
> !   enum reg_class tclass;
> !   int t;
> 
> !   recog_data.operand[commutative] = substed_operand[commutative + 1];
> !   recog_data.operand[commutative + 1] = substed_operand[commutative];
> !   /* Swap the duplicates too.  */
> !   for (i = 0; i < recog_data.n_dups; i++)
> ! if (recog_data.dup_num[i] == commutative
> ! || recog_data.dup_num[i] == commutative + 1)
> !   *recog_data.dup_loc[i]
> ! = recog_data.operand[(int) recog_data.dup_num[i]];
> ! 
> !   tclass = preferred_class[commutative];
> !   

[v3] libstdc++/51142

2011-11-15 Thread Paolo Carlini

Hi,

this fixes the problem the submitter noticed by implementing LWG 2059, which
seems a good thing to do anyway, instead of just fixing the specific
erase calls in the debug-mode code. The patch seems big, but is actually
straightforward and limited to C++11, thus I mean to apply it to the
branch too pretty soon.


Tested x86_64-linux, normal-mode and debug-mode.

Thanks,
Paolo.


2011-11-15  Paolo Carlini  

PR libstdc++/51142
* include/debug/unordered_map (unordered_map<>::erase(iterator),
unordered_multimap<>::erase(iterator)): Add, consistently with
LWG 2059.
* include/debug/unordered_set (unordered_set<>::erase(iterator),
unordered_multiset<>::erase(iterator)): Likewise.
* include/debug/map.h (map<>::erase(iterator)): Likewise.
* include/debug/multimap.h (multimap<>::erase(iterator)): Likewise.
* include/profile/map.h (map<>::erase(iterator)): Likewise.
* include/profile/multimap.h (multimap<>::erase(iterator)): Likewise.
* include/bits/hashtable.h (_Hashtable<>::erase(iterator)): Likewise.
* include/bits/stl_map.h (map<>::erase(iterator)): Likewise.
* include/bits/stl_multimap.h (multimap<>::erase(iterator)): Likewise.
* include/bits/stl_tree.h (_Rb_tree<>::erase(iterator)): Likewise.
* testsuite/23_containers/unordered_map/erase/51142.cc: New.
* testsuite/23_containers/multimap/modifiers/erase/51142.cc: Likewise.
* testsuite/23_containers/set/modifiers/erase/51142.cc: Likewise.
* testsuite/23_containers/unordered_multimap/erase/51142.cc: Likewise.
* testsuite/23_containers/unordered_set/erase/51142.cc: Likewise.
* testsuite/23_containers/multiset/modifiers/erase/51142.cc: Likewise.
* testsuite/23_containers/unordered_multiset/erase/51142.cc: Likewise.
* testsuite/23_containers/map/modifiers/erase/51142.cc: Likewise.
Index: include/debug/unordered_map
===
--- include/debug/unordered_map (revision 181385)
+++ include/debug/unordered_map (working copy)
@@ -334,6 +334,10 @@
   }
 
   iterator
+  erase(iterator __it)
+  { return erase(const_iterator(__it)); }
+
+  iterator
   erase(const_iterator __first, const_iterator __last)
   {
__glibcxx_check_erase_range(__first, __last);
@@ -709,6 +713,10 @@
   }
 
   iterator
+  erase(iterator __it)
+  { return erase(const_iterator(__it)); }
+
+  iterator
   erase(const_iterator __first, const_iterator __last)
   {
__glibcxx_check_erase_range(__first, __last);
Index: include/debug/unordered_set
===
--- include/debug/unordered_set (revision 181385)
+++ include/debug/unordered_set (working copy)
@@ -330,6 +330,10 @@
   }
 
   iterator
+  erase(iterator __it)
+  { return erase(const_iterator(__it)); }
+
+  iterator
   erase(const_iterator __first, const_iterator __last)
   {
__glibcxx_check_erase_range(__first, __last);
@@ -696,6 +700,10 @@
   }
 
   iterator
+  erase(iterator __it)
+  { return erase(const_iterator(__it)); }
+
+  iterator
   erase(const_iterator __first, const_iterator __last)
   {
__glibcxx_check_erase_range(__first, __last);
Index: include/debug/map.h
===
--- include/debug/map.h (revision 181385)
+++ include/debug/map.h (working copy)
@@ -271,6 +271,10 @@
this->_M_invalidate_if(_Equal(__position.base()));
return iterator(_Base::erase(__position.base()), this);
   }
+
+  iterator
+  erase(iterator __position)
+  { return erase(const_iterator(__position)); }
 #else
   void
   erase(iterator __position)
Index: include/debug/multimap.h
===
--- include/debug/multimap.h(revision 181385)
+++ include/debug/multimap.h(working copy)
@@ -254,6 +254,10 @@
this->_M_invalidate_if(_Equal(__position.base()));
return iterator(_Base::erase(__position.base()), this);
   }
+
+  iterator
+  erase(iterator __position)
+  { return erase(const_iterator(__position)); }
 #else
   void
   erase(iterator __position)
Index: include/profile/map.h
===
--- include/profile/map.h   (revision 181385)
+++ include/profile/map.h   (working copy)
@@ -327,6 +327,10 @@
 __profcxx_map_to_unordered_map_erase(this, size(), 1);
 return __i;
   }
+
+  iterator
+  erase(iterator __position)
+  { return erase(const_iterator(__position)); }
 #else
   void
   erase(iterator __position)
Index: include/profile/multimap.h
===
--- include/profile/mul

Fix x86-elf build

2011-11-15 Thread Joseph S. Myers
config/i386/i386elf.h wasn't updated for the change of STRING_LIMIT to 
ELF_STRING_LIMIT, so breaking builds for i?86-elf.  I've committed this 
patch as obvious to fix this.  Tested building cc1 and xgcc for cross to 
i686-elf.

Index: gcc/ChangeLog
===
--- gcc/ChangeLog   (revision 181399)
+++ gcc/ChangeLog   (working copy)
@@ -1,3 +1,8 @@
+2011-11-15  Joseph Myers  
+
+   * config/i386/i386elf.h (ASM_OUTPUT_ASCII): Change STRING_LIMIT to
+   ELF_STRING_LIMIT.
+
 2011-11-15  Richard Henderson  
 
* config/alpha/alpha.c (alpha_pre_atomic_barrier): New.
Index: gcc/config/i386/i386elf.h
===
--- gcc/config/i386/i386elf.h   (revision 181399)
+++ gcc/config/i386/i386elf.h   (working copy)
@@ -50,7 +50,7 @@
generated assembly code more compact (and thus faster to assemble)
as well as more readable.  Note that if we find subparts of the
character sequence which end with NUL (and which are shorter than
-   STRING_LIMIT) we output those using ASM_OUTPUT_LIMITED_STRING.  */
+   ELF_STRING_LIMIT) we output those using ASM_OUTPUT_LIMITED_STRING.  */
 
 #undef ASM_OUTPUT_ASCII
 #define ASM_OUTPUT_ASCII(FILE, STR, LENGTH)\
@@ -70,7 +70,7 @@
}   \
  for (p = _ascii_bytes; p < limit && *p != '\0'; p++)  \
continue;   \
- if (p < limit && (p - _ascii_bytes) <= (long) STRING_LIMIT)   \
+ if (p < limit && (p - _ascii_bytes) <= (long) ELF_STRING_LIMIT) \
{   \
  if (bytes_in_chunk > 0)   \
{   \

-- 
Joseph S. Myers
jos...@codesourcery.com


Make x86-elf use DWARF-2 not stabs

2011-11-15 Thread Joseph S. Myers
i?86-elf and x86_64-elf targets default to stabs debugging format.
This default dates back to when x86-elf support was added in
 - and given
the copyright dates in the original submission, the support was pretty
old even then.  Even if this default made sense then, deviating from
the usual DWARF-2 used on ELF targets certainly doesn't make sense
now.  As I explained in
 the
configuration of supported and default debug formats is generally
rather a mess and could probably do with being simplified, but I think
this simple local fix, removing the old PREFERRED_DEBUGGING_TYPE
definition, is appropriate for now for these targets (and in any case
it makes sense to separate semantic changes to what the defaults are
from any changes to how those defaults are implemented).

This patch removes PREFERRED_DEBUGGING_TYPE from
config/i386/i386elf.h, so changing the default to the normal ELF
default of DWARF-2.  This header is also used on i[34567]86-*-rtems*
so it's possible those targets will be affected as well.  I'm not set
up for full testing of x86-elf with FSF sources, but did a sanity
check on this patch by building cc1 and xgcc for i686-elf.  OK to
commit?
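
The override mechanism being removed can be pictured with a small preprocessor sketch.  This is only a model, assuming a generic ELF header supplies the default and a target header may replace it; the TOY_ names and enum values are invented here and are not the real GCC definitions.

```c
#include <assert.h>

/* Toy debug-format codes; the names echo GCC's but the values are
   made up for this sketch.  */
enum toy_debug_type { TOY_DBX_DEBUG = 1, TOY_DWARF2_DEBUG = 2 };

/* Stand-in for the generic ELF configuration header.  */
#define TOY_PREFERRED_DEBUGGING_TYPE TOY_DWARF2_DEBUG

/* Stand-in for the target header.  Building with
   -DTOY_TARGET_WANTS_STABS models the override that the patch
   removes.  */
#ifdef TOY_TARGET_WANTS_STABS
#undef TOY_PREFERRED_DEBUGGING_TYPE
#define TOY_PREFERRED_DEBUGGING_TYPE TOY_DBX_DEBUG
#endif

/* Return the debug format used when the user gives a plain -g.  */
enum toy_debug_type
toy_default_debug_type (void)
{
  enum toy_debug_type type = TOY_PREFERRED_DEBUGGING_TYPE;

  assert (type == TOY_DBX_DEBUG || type == TOY_DWARF2_DEBUG);
  return type;
}
```

Without the target-specific block the generic default simply shows through, which is the whole effect intended by the change.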

2011-11-15  Joseph Myers  

* config/i386/i386elf.h (PREFERRED_DEBUGGING_TYPE): Remove.

Index: gcc/config/i386/i386elf.h
===
--- gcc/config/i386/i386elf.h   (revision 181400)
+++ gcc/config/i386/i386elf.h   (working copy)
@@ -20,10 +20,6 @@
 along with GCC; see the file COPYING3.  If not see
 .  */
 
-/* Use stabs instead of DWARF debug format.  */
-#undef  PREFERRED_DEBUGGING_TYPE
-#define PREFERRED_DEBUGGING_TYPE DBX_DEBUG
-
 /* The ELF ABI for the i386 says that records and unions are returned
in memory.  */
 

-- 
Joseph S. Myers
jos...@codesourcery.com


[trunk] RFS: translate built-in include paths for sysroot (issue5394041)

2011-11-15 Thread Han Shen
2011-11-15   Han Shen  

* gcc/Makefile.in:
* gcc/configure:
* gcc/cppdefault.c:

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index ae4f4da..0a05783 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -615,6 +615,7 @@ gcc_tooldir = @gcc_tooldir@
 build_tooldir = $(exec_prefix)/$(target_noncanonical)
 # Directory in which the compiler finds target-independent g++ includes.
 gcc_gxx_include_dir = @gcc_gxx_include_dir@
+gcc_gxx_include_dir_add_sysroot = @gcc_gxx_include_dir_add_sysroot@
 # Directory to search for site-specific includes.
 local_includedir = $(local_prefix)/include
 includedir = $(prefix)/include
@@ -3979,6 +3980,7 @@ PREPROCESSOR_DEFINES = \
   -DGCC_INCLUDE_DIR=\"$(libsubdir)/include\" \
   -DFIXED_INCLUDE_DIR=\"$(libsubdir)/include-fixed\" \
   -DGPLUSPLUS_INCLUDE_DIR=\"$(gcc_gxx_include_dir)\" \
+  -DGPLUSPLUS_INCLUDE_DIR_ADD_SYSROOT=$(gcc_gxx_include_dir_add_sysroot) \
   -DGPLUSPLUS_TOOL_INCLUDE_DIR=\"$(gcc_gxx_include_dir)/$(target_noncanonical)\" \
   -DGPLUSPLUS_BACKWARD_INCLUDE_DIR=\"$(gcc_gxx_include_dir)/backward\" \
   -DLOCAL_INCLUDE_DIR=\"$(local_includedir)\" \
diff --git a/gcc/configure b/gcc/configure
index 99334ce..364d8c2 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -638,6 +638,7 @@ host_xm_include_list
 host_xm_file_list
 host_exeext
 gcc_gxx_include_dir
+gcc_gxx_include_dir_add_sysroot
 gcc_config_arguments
 float_h_file
 extra_programs
@@ -3291,12 +3292,20 @@ gcc_gxx_include_dir=
 # Specify the g++ header file directory
 
 # Check whether --with-gxx-include-dir was given.
+gcc_gxx_include_dir_add_sysroot=0
 if test "${with_gxx_include_dir+set}" = set; then :
   withval=$with_gxx_include_dir; case "${withval}" in
 yes)   as_fn_error "bad value ${withval} given for g++ include directory" "$LINENO" 5 ;;
 no);;
 *) gcc_gxx_include_dir=$with_gxx_include_dir ;;
 esac
+  if test "${with_sysroot+set}" = set; then :
+    gcc_gxx_without_sysroot=`expr "${gcc_gxx_include_dir}" : "${with_sysroot}"'\(.*\)'`
+    if test "${gcc_gxx_without_sysroot}"; then :
+      gcc_gxx_include_dir="${gcc_gxx_without_sysroot}"
+      gcc_gxx_include_dir_add_sysroot=1
+    fi
+  fi
 fi
 
 
diff --git a/gcc/cppdefault.c b/gcc/cppdefault.c
index 099899a..e8341d5 100644
--- a/gcc/cppdefault.c
+++ b/gcc/cppdefault.c
@@ -44,15 +44,15 @@ const struct default_include cpp_include_defaults[]
 = {
 #ifdef GPLUSPLUS_INCLUDE_DIR
 /* Pick up GNU C++ generic include files.  */
-{ GPLUSPLUS_INCLUDE_DIR, "G++", 1, 1, 0, 0 },
+{ GPLUSPLUS_INCLUDE_DIR, "G++", 1, 1, GPLUSPLUS_INCLUDE_DIR_ADD_SYSROOT, 0 },
 #endif
 #ifdef GPLUSPLUS_TOOL_INCLUDE_DIR
 /* Pick up GNU C++ target-dependent include files.  */
-{ GPLUSPLUS_TOOL_INCLUDE_DIR, "G++", 1, 1, 0, 1 },
+{ GPLUSPLUS_TOOL_INCLUDE_DIR, "G++", 1, 1, GPLUSPLUS_INCLUDE_DIR_ADD_SYSROOT, 1 },
 #endif
 #ifdef GPLUSPLUS_BACKWARD_INCLUDE_DIR
 /* Pick up GNU C++ backward and deprecated include files.  */
-{ GPLUSPLUS_BACKWARD_INCLUDE_DIR, "G++", 1, 1, 0, 0 },
+{ GPLUSPLUS_BACKWARD_INCLUDE_DIR, "G++", 1, 1, GPLUSPLUS_INCLUDE_DIR_ADD_SYSROOT, 0 },
 #endif
 #ifdef GCC_INCLUDE_DIR
 /* This is the dir for gcc's private headers.  */

--
This patch is available for review at http://codereview.appspot.com/5394041


[alpha] Convert to atomic optabs

2011-11-15 Thread Richard Henderson
Tested with qemu.  Thankfully most of the changes really only
have to do with passing around the memory model, and eliding
one or both of the mb insns.
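
As a rough illustration of which barriers get elided for which memory model, the two predicates below restate the case analysis of alpha_pre_atomic_barrier and alpha_post_atomic_barrier from the patch in stand-alone form.  The toy_memmodel enum and function names are invented for this sketch and only mirror the MEMMODEL_* names.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy copy of the memory model names used in the patch.  */
enum toy_memmodel
{
  TOY_RELAXED,
  TOY_CONSUME,
  TOY_ACQUIRE,
  TOY_RELEASE,
  TOY_ACQ_REL,
  TOY_SEQ_CST
};

/* True if a barrier is wanted before the atomic sequence.  */
bool
toy_needs_pre_barrier (enum toy_memmodel model)
{
  assert (model <= TOY_SEQ_CST);
  return (model == TOY_RELEASE
	  || model == TOY_ACQ_REL
	  || model == TOY_SEQ_CST);
}

/* True if a barrier is wanted after the atomic sequence.  */
bool
toy_needs_post_barrier (enum toy_memmodel model)
{
  assert (model <= TOY_SEQ_CST);
  return (model == TOY_ACQUIRE
	  || model == TOY_ACQ_REL
	  || model == TOY_SEQ_CST);
}
```

So relaxed and consume operations get no barrier at all, release keeps only the leading one, acquire keeps only the trailing one, and acq_rel and seq_cst keep both.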

I didn't implement mem_thread_fence or atomic_load/store patterns,
because, with the existing memory_barrier pattern, the generic
fallback patterns do the right thing.

Committed.


r~
2011-11-15  Richard Henderson  

* config/alpha/alpha.c (alpha_pre_atomic_barrier): New.
(alpha_post_atomic_barrier): New.
(alpha_split_atomic_op): New memmodel argument; honor it.
(alpha_split_compare_and_swap): Take array of operands.  Honor
memmodel; always set bool output.
(alpha_expand_compare_and_swap_12): Similarly.
(alpha_split_compare_and_swap_12): Similarly.
(alpha_split_atomic_exchange): Similarly.  Rename from
alpha_split_lock_test_and_set.
(alpha_expand_atomic_exchange_12): Similarly.  Rename from
alpha_expand_lock_test_and_set_12.
(alpha_split_atomic_exchange_12): Similarly.  Rename from
alpha_split_lock_test_and_set_12.
* config/alpha/alpha-protos.h: Update.
* config/alpha/alpha.md (UNSPECV_CMPXCHG): New.
* config/alpha/constraints.md ("w"): New.
* config/alpha/predicates.md (mem_noofs_operand): New.
* config/alpha/sync.md (atomic_compare_and_swap): Rename from
sync_compare_and_swap; add the new parameters.
(atomic_exchange): Update from sync_test_and_set.
(atomic_fetch_): Update from sync_old_.
(atomic__fetch): Update from sync_new_.
(atomic_): Update from sync_.



diff --git a/gcc/config/alpha/alpha-protos.h b/gcc/config/alpha/alpha-protos.h
index 3155168..42b34d3 100644
--- a/gcc/config/alpha/alpha-protos.h
+++ b/gcc/config/alpha/alpha-protos.h
@@ -88,15 +88,14 @@ extern bool alpha_emit_setcc (rtx[], enum machine_mode);
 extern int alpha_split_conditional_move (enum rtx_code, rtx, rtx, rtx, rtx);
 extern void alpha_emit_xfloating_arith (enum rtx_code, rtx[]);
 extern void alpha_emit_xfloating_cvt (enum rtx_code, rtx[]);
-extern void alpha_split_atomic_op (enum rtx_code, rtx, rtx, rtx, rtx, rtx);
-extern void alpha_split_compare_and_swap (rtx, rtx, rtx, rtx, rtx);
-extern void alpha_expand_compare_and_swap_12 (rtx, rtx, rtx, rtx);
-extern void alpha_split_compare_and_swap_12 (enum machine_mode, rtx, rtx,
-rtx, rtx, rtx, rtx, rtx);
-extern void alpha_split_lock_test_and_set (rtx, rtx, rtx, rtx);
-extern void alpha_expand_lock_test_and_set_12 (rtx, rtx, rtx);
-extern void alpha_split_lock_test_and_set_12 (enum machine_mode, rtx, rtx,
- rtx, rtx, rtx);
+extern void alpha_split_atomic_op (enum rtx_code, rtx, rtx, rtx, rtx, rtx,
+  enum memmodel);
+extern void alpha_split_compare_and_swap (rtx op[]);
+extern void alpha_expand_compare_and_swap_12 (rtx op[]);
+extern void alpha_split_compare_and_swap_12 (rtx op[]);
+extern void alpha_split_atomic_exchange (rtx op[]);
+extern void alpha_expand_atomic_exchange_12 (rtx op[]);
+extern void alpha_split_atomic_exchange_12 (rtx op[]);
 #endif
 
 extern rtx alpha_use_linkage (rtx, bool, bool);
diff --git a/gcc/config/alpha/alpha.c b/gcc/config/alpha/alpha.c
index 9a43f80..78717f9 100644
--- a/gcc/config/alpha/alpha.c
+++ b/gcc/config/alpha/alpha.c
@@ -4196,6 +4196,47 @@ emit_store_conditional (enum machine_mode mode, rtx res, rtx mem, rtx val)
   emit_insn (fn (res, mem, val));
 }
 
+/* Subroutines of the atomic operation splitters.  Emit barriers
+   as needed for the memory MODEL.  */
+
+static void
+alpha_pre_atomic_barrier (enum memmodel model)
+{
+  switch (model)
+    {
+    case MEMMODEL_RELAXED:
+    case MEMMODEL_CONSUME:
+    case MEMMODEL_ACQUIRE:
+      break;
+    case MEMMODEL_RELEASE:
+    case MEMMODEL_ACQ_REL:
+    case MEMMODEL_SEQ_CST:
+      emit_insn (gen_memory_barrier ());
+      break;
+    default:
+      gcc_unreachable ();
+    }
+}
+
+static void
+alpha_post_atomic_barrier (enum memmodel model)
+{
+  switch (model)
+    {
+    case MEMMODEL_RELAXED:
+    case MEMMODEL_CONSUME:
+    case MEMMODEL_RELEASE:
+      break;
+    case MEMMODEL_ACQUIRE:
+    case MEMMODEL_ACQ_REL:
+    case MEMMODEL_SEQ_CST:
+      emit_insn (gen_memory_barrier ());
+      break;
+    default:
+      gcc_unreachable ();
+    }
+}
+
 /* A subroutine of the atomic operation splitters.  Emit an insxl
instruction in MODE.  */
 
@@ -4236,13 +4277,13 @@ emit_insxl (enum machine_mode mode, rtx op1, rtx op2)
a scratch register.  */
 
 void
-alpha_split_atomic_op (enum rtx_code code, rtx mem, rtx val,
-  rtx before, rtx after, rtx scratch)
+alpha_split_atomic_op (enum rtx_code code, rtx mem, rtx val, rtx before,
+  rtx after, rtx scratch, enum memmodel model)
 {
   enum machine_mode mode = GET_MODE (mem);
   rtx label, x, cond = gen_rtx_REG (DImode, REGNO (scratch));
 
-  emit_insn 

Re: [trunk] RFS: translate built-in include paths for sysroot (issue 5394041)

2011-11-15 Thread shenhan

Hi, this is a follow-up for issue
"http://codereview.appspot.com/4641076".

The issue description from that issue is copied below:

=
The setup:

Configuring a toolchain targeting x86-64 GNU Linux (Ubuntu Lucid), as a
cross-compiler.  Using a sysroot to provide the Lucid headers+libraries,
with the sysroot path being within the GCC install tree.  Want to use the
Lucid system libstdc++ and headers, which means that I'm not
building/installing libstdc++-v3.

So, configuring with:
  --with-sysroot="$SYSROOT" \
  --disable-libstdc++-v3 \
  --with-gxx-include-dir="$SYSROOT/usr/include/c++/4.4" \
(among other options).

Hoping to support two usage models with this configuration, w.r.t. use of
the sysroot:

(1) somebody installs the sysroot in the normal location relative to the
GCC install, and relocates the whole bundle (sysroot+GCC).  This works
great AFAICT, GCC finds its includes (including the C++ includes) thanks
to the add_standard_paths iprefix handling.

(2) somebody installs the sysroot in a non-standard location, and uses
--sysroot to try to access it.  This works fine for the C headers, but
doesn't work for the C++ headers.

For the C headers, add_standard_paths prepends the sysroot location to
the /usr/include path (since that's what's specified in cppdefault.c for
that path).  It doesn't do the same for the C++ include path, though
(again, as specified in cppdefault.c).
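
The two halves of the idea (strip the configure-time sysroot prefix from the g++ include dir, then re-add whatever sysroot is in effect when the path is searched) can be sketched in isolation.  The helpers below are hypothetical and simplified; they are not the configure fragment or add_standard_paths itself, and the prefix match is as naive as the expr pattern in the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* If DIR starts with SYSROOT and something follows the prefix, return
   the remainder and set *ADD_SYSROOT to 1.  Otherwise return DIR
   unchanged and set *ADD_SYSROOT to 0.  */
const char *
toy_strip_sysroot (const char *dir, const char *sysroot, int *add_sysroot)
{
  size_t len;

  assert (dir != NULL && sysroot != NULL && add_sysroot != NULL);
  len = strlen (sysroot);
  if (len > 0 && strncmp (dir, sysroot, len) == 0 && dir[len] != '\0')
    {
      *add_sysroot = 1;
      return dir + len;
    }
  *add_sysroot = 0;
  return dir;
}

/* Build the directory actually searched into BUF: RUNTIME_SYSROOT is
   prepended only when ADD_SYSROOT is set.  Returns the length written,
   or -1 if BUF is too small.  */
int
toy_search_dir (char *buf, size_t size, const char *dir, int add_sysroot,
		const char *runtime_sysroot)
{
  int n = snprintf (buf, size, "%s%s",
		    add_sysroot ? runtime_sysroot : "", dir);

  return (n < 0 || (size_t) n >= size) ? -1 : n;
}
```

With a configure-time sysroot of /opt/sr and an include dir of /opt/sr/usr/include/c++/4.4, the stored path becomes /usr/include/c++/4.4 with the flag set, so a different --sysroot given later is applied to the C++ headers in the same way as to /usr/include.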

add_standard_paths doesn't attempt to relocate built-in include paths that
start with the compiled-in sysroot location (e.g., the g++ include dir, in
this case).  This isn't surprising really: normally you 

Re: Fix x86-elf build

2011-11-15 Thread Joseph S. Myers
I should add: I don't know any reason why this target should need its own 
ASM_OUTPUT_ASCII definition (instead of the default ELF version) at all, 
but haven't tried removing the definition.

-- 
Joseph S. Myers
jos...@codesourcery.com


RE: [PATCH RFA] rtl-optimization/PR50663, conditional propagation missed in cprop.c pass

2011-11-15 Thread Bin Cheng
Hi,
Thanks for your review.
Here is the second version of the patch, modified according to your comments.
Is it OK?  Also, could you please commit it if it is OK, since I have no write
access?

The new patch is tested against x86-linux-gnu.

Thanks.

2011-11-15  Bin Cheng  

PR rtl-optimization/50663
* cprop.c (implicit_set_indexes): New global variable.
(insert_set_in_table): Add additional parameter, record implicit set
info.
(hash_scan_set): Add additional parameter.
(compute_hash_table_work): And
(hash_scan_insn): Pass implicit to hash_scan_set.
(compute_cprop_data): Add implicit set to AVIN of block which the
implicit set is recorded for.
(one_cprop_pass): Handle implicit_set_indexes array.

> -----Original Message-----
> From: Eric Botcazou [mailto:ebotca...@adacore.com]
> Sent: Saturday, November 12, 2011 12:35 AM
> To: Bin Cheng
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH RFA] rtl-optimization/PR50663, conditional propagation
> missed in cprop.c pass
> 
> > 2011-11-07  Bin Cheng  
> >
> > PR rtl-optimization/50663
> > * cprop.c (bb_implicit): New global variable.
> > (insert_set_in_table): Add additional parameter, record implicit set
> > info.
> > (hash_scan_set): Add additional parameter.
> > (compute_hash_table_work): And
> > (hash_scan_insn): Pass implicit to hash_scan_set.
> > (compute_cprop_data): Add implicit set to AVIN of block which the
> > implicit set is recorded for.
> > (one_cprop_pass): Handle bb_implicit array.
> 
> [80 columns at most for ChangeLog entries as well]
> 
> The patch is OK with the following changes:
> 
> @@ -116,6 +116,10 @@
>  /* Array of implicit set patterns indexed by basic block index.  */
>  static rtx *implicit_sets;
> 
> +/* Array of bitmap_index of corresponding implicit set, indexed by
> +   basic block index.  */
> +static int *bb_implicit;
> 
> A better name is implicit_set_indexes:
> 
> /* Array of indexes of expressions for implicit set patterns indexed by basic
>    block index.  In other words, implicit_set_indexes[i] is the bitmap_index
>    of the expression whose RTX is implicit_sets[i].  */
> static int *implicit_set_indexes;
> 
> 
> +  /* Record bitmap_index of the implicit set in bb_implicit.  */
> +  if (implicit)
> +bb_implicit[BLOCK_FOR_INSN(cur_occr->insn)->index] =
> +  cur_expr->bitmap_index;
> 
> cur_occr->insn is just insn.
> 
> 
> +  /* Merge implicit set into CPROP_AVIN. There are always
> + available at the entry of corresponding basic block.  */
> 
> "...implicit sets into CPROP_AVIN.  They are..."
> 
> +  FOR_EACH_BB (bb)
> +{
> +  int index = bb_implicit[bb->index];
> +  if (index != -1)
> + SET_BIT (cprop_avin[bb->index], (unsigned int)index);
> 
> The cast is superfluous.
> 
> I think that an explanation as to why we need to do this is in order (after
> all, this went unnoticed until now) along the lines of: "We need to do this
> because 1) implicit sets aren't recorded for the local pass so they cannot
> be propagated within their basic block by this pass and 2) the global pass
> would otherwise propagate them only in the successors of their basic block."
> 
> Btw, you'll need to slightly adjust the patch because of my changes to
> cprop.c.
> 
> Thanks for investigating and addressing this issue.
> 
> --
> Eric Botcazou
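
The AVIN merge step discussed in the review above can be sketched on its own.  This is a toy model, assuming one machine word of bits per basic block in place of the real bitmaps; the names and the fixed block count are made up for the illustration, and -1 marks a block with no implicit set, as in the quoted hunk.

```c
#include <assert.h>

#define TOY_N_BLOCKS 4

/* For every block that has an implicit set recorded, mark that
   expression as available on entry to the block.  AVIN holds one word
   of bits per block; IMPLICIT_SET_INDEXES holds the expression's bit
   index, or -1 if the block has no implicit set.  */
void
toy_merge_implicit_sets (unsigned long avin[TOY_N_BLOCKS],
			 const int implicit_set_indexes[TOY_N_BLOCKS])
{
  for (int bb = 0; bb < TOY_N_BLOCKS; bb++)
    {
      int index = implicit_set_indexes[bb];

      if (index == -1)
	continue;		/* No implicit set for this block.  */
      assert (index >= 0 && index < (int) (sizeof (unsigned long) * 8));
      avin[bb] |= 1UL << index;
    }
}
```

Blocks without an implicit set keep whatever the dataflow solution computed for them; only the block the implicit set was recorded for gains the extra bit.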


pr50663-2016.patch
Description: Binary data


Re: [PATCH] reload: Try alternative with swapped operands before going to the next

2011-11-15 Thread Jeff Law

On 11/15/11 15:31, Maxim Kuvyrkov wrote:
> On 15/11/2011, at 6:21 AM, Andreas Krebbel wrote:

>> 
>> Bootstrapped on s390x, x86_64 and PPC64. No regressions
>> 
>> Ok for mainline?
> 
> Good portion of the code you're changing was written by Richard K.
> ages ago, so CC'ing him to get his approval.
I'd ask the meta question, is there a compelling reason to push this
patch into the tree now?  My obvious concern is that this change
potentially affects every target and twiddles one of GCC's most
sensitive areas.  Even if the patch is sound it's the kind of change
that could easily have unforeseen consequences.

If there isn't a compelling reason, then I think it should be queued
for 4.8-stage1.