date:20131117

Re: Recent Go patch fails several tests on 32bit CentOS 5.10

2013-11-17 Thread Uros Bizjak

On Fri, Nov 15, 2013 at 5:57 PM, Ian Lance Taylor wrote:

>>> I still see panic in runtime (trace below), segfault in sync,
>>> database/sql, net/http and abort in sync/atomic on 32bit CentOS 5.10
>>> library.
>
> The problems on 32-bit are a recently introduced middle-end bug:
> http://gcc.gnu.org/PR59099 .

I can confirm that the patch [1] from the PR fixes all remaining failures.

[1] http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01820.html

Uros.

[wide-int] Remove tree_fits_hwi_p and tree_to_hwi

2013-11-17 Thread Richard Sandiford

AIUI the two-argument tree_fits_hwi_p and tree_to_hwi were replacements
for host_integerp and tree_low_cst with variable "pos" arguments.
I removed those uses from trunk this week, and Mike's merge has
brought that into branch.

The only remaining use is in a branch-local change to the way that
match_case_to_enum_1 prints constants.  The old code was:

  /* ??? Not working too hard to print the double-word value.
     Should perhaps be done with %lwd in the diagnostic routines?  */
  if (TREE_INT_CST_HIGH (key) == 0)
    snprintf (buf, sizeof (buf), HOST_WIDE_INT_PRINT_UNSIGNED,
              TREE_INT_CST_LOW (key));
  else if (!TYPE_UNSIGNED (type)
           && TREE_INT_CST_HIGH (key) == -1
           && TREE_INT_CST_LOW (key) != 0)
    snprintf (buf, sizeof (buf), "-" HOST_WIDE_INT_PRINT_UNSIGNED,
              -TREE_INT_CST_LOW (key));
  else
    snprintf (buf, sizeof (buf), HOST_WIDE_INT_PRINT_DOUBLE_HEX,
              (unsigned HOST_WIDE_INT) TREE_INT_CST_HIGH (key),
              (unsigned HOST_WIDE_INT) TREE_INT_CST_LOW (key));

The first arm prints KEY as an unsigned HWI if the "infinite precision"
value of KEY fits in an unsigned HWI (regardless of whether the type
is signed or not, e.g. it could be a signed 128-bit type with a value
that happens to fit in an unsigned 64-bit HWI).  The second prints it
as a negative HWI if KEY is negative and fits.  The third arm is a hex
fallback.  But on branch we only print KEYs with signed types as decimal
if they fit in signed HWIs, which is different from the first arm on trunk.
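
[Editorial sketch, not part of the original message.] The three arms
described above can be modelled outside GCC roughly as follows.  This
assumes a 64-bit HWI and a constant stored as a sign-extended high/low
word pair; toy_format_key and the hex layout are hypothetical stand-ins
for the TREE_INT_CST_* accessors and HOST_WIDE_INT_PRINT_DOUBLE_HEX.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model of the old trunk logic: HIGH/LOW are the two
   64-bit words of a double-word constant, TYPE_UNSIGNED says whether
   the constant's type is unsigned.  */
void
toy_format_key (char *buf, size_t size, int64_t high, uint64_t low,
                bool type_unsigned)
{
  assert (size > 0);
  if (high == 0)
    /* Arm 1: the value fits in one unsigned word, whatever the type.  */
    snprintf (buf, size, "%" PRIu64, low);
  else if (!type_unsigned && high == -1 && low != 0)
    /* Arm 2: negative value of a signed type; print the magnitude
       after a minus sign (unsigned negation is well defined).  */
    snprintf (buf, size, "-%" PRIu64, (uint64_t) (0 - low));
  else
    /* Arm 3: hex fallback printing both words.  */
    snprintf (buf, size, "0x%" PRIx64 "%016" PRIx64,
              (uint64_t) high, low);
}
```

For instance, (high, low) = (-1, all ones) prints as "-1" for a signed
type but falls through to the hex arm for an unsigned type.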

This patch restores the trunk choices and gets rid of the then-unused
functions.

Also, the single-argument tree_fits_hwi_p was used only in one place,
sdbout.c.  On trunk it's a "host_integerp (..., 0)" call, so that
translates to tree_fits_shwi_p on branch.
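
[Editorial sketch, not part of the original message.] To make the
difference between the two predicates concrete, here is a toy model on
a two-word constant, again assuming a 64-bit HWI.  The toy_* names and
the enum are hypothetical; only the uhwi-then-shwi-then-hex ordering is
taken from the patch below.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical two-word constant, sign-extended into HIGH.  */
struct toy_cst { int64_t high; uint64_t low; };

enum toy_format { TOY_DEC_UNSIGNED, TOY_DEC_SIGNED, TOY_HEX };

/* Fits an unsigned HWI: non-negative and no bits in the high word.  */
bool
toy_fits_uhwi_p (struct toy_cst c)
{
  return c.high == 0;
}

/* Fits a signed HWI: the high word is just the sign extension of
   the low word.  */
bool
toy_fits_shwi_p (struct toy_cst c)
{
  bool low_negative = (c.low >> 63) != 0;
  return low_negative ? c.high == -1 : c.high == 0;
}

/* Same ordering as the patched match_case_to_enum_1: try unsigned
   decimal first, then signed decimal, then fall back to hex.  */
enum toy_format
toy_choose_format (struct toy_cst key)
{
  if (toy_fits_uhwi_p (key))
    return TOY_DEC_UNSIGNED;
  if (toy_fits_shwi_p (key))
    return TOY_DEC_SIGNED;
  return TOY_HEX;
}
```

A value such as 2^64 - 1 fits the unsigned predicate but not the signed
one, which is why the order of the two tests matters.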

Tested on x86_64-linux-gnu.  OK to install?

Thanks,
Richard


Index: gcc/c-family/c-common.c
===
--- gcc/c-family/c-common.c 2013-11-16 22:21:20.716272495 +
+++ gcc/c-family/c-common.c 2013-11-16 22:36:56.575137937 +
@@ -6056,8 +6056,10 @@ match_case_to_enum_1 (tree key, tree typ
 {
   char buf[WIDE_INT_PRINT_BUFFER_SIZE];
 
-  if (tree_fits_hwi_p (key, TYPE_SIGN (type)))
-    print_dec (key, buf, TYPE_SIGN (type));
+  if (tree_fits_uhwi_p (key))
+    print_dec (key, buf, UNSIGNED);
+  else if (tree_fits_shwi_p (key))
+    print_dec (key, buf, SIGNED);
   else
     print_hex (key, buf);
 
Index: gcc/doc/generic.texi
===
--- gcc/doc/generic.texi 2013-11-16 22:21:23.034289829 +
+++ gcc/doc/generic.texi 2013-11-16 22:40:47.479832766 +
@@ -1024,10 +1024,8 @@ As this example indicates, the operands
 @tindex INTEGER_CST
 @tindex tree_fits_uhwi_p
 @tindex tree_fits_shwi_p
-@tindex tree_fits_hwi_p
 @tindex tree_to_uhwi
 @tindex tree_to_shwi
-@tindex tree_to_hwi
 @tindex REAL_CST
 @tindex FIXED_CST
 @tindex COMPLEX_CST
@@ -1050,16 +1048,11 @@ represented in an array of HOST_WIDE_INT
 in the array to represent the value without taking extra elements for
 redundant 0s or -1.
 
-The functions @code{tree_fits_uhwi_p}, @code{tree_fits_shwi_p}, and
-@code{tree_fits_hwi_p} can be used to tell if the value is small
-enough to fit in a HOST_WIDE_INT, as either a signed value, an unsiged
-value or a value whose sign is given as a parameter.  The value can
-then be extracted using the @code{tree_to_uhwi}, @code{tree_to_shwi},
-or @code{tree_to_hwi}.  The @code{tree_to_hwi} comes in both checked
-and unchecked flavors.  However, when the value is used in a context
-where it may represent a value that is larger than can be represented
-in HOST_BITS_PER_WIDE_INT bits, the wide_int class should be used to
-manipulate the constant.
+The functions @code{tree_fits_shwi_p} and @code{tree_fits_uhwi_p}
+can be used to tell if the value is small enough to fit in a
+signed HOST_WIDE_INT or an unsigned HOST_WIDE_INT respectively.
+The value can then be extracted using @code{tree_to_shwi} and
+@code{tree_to_uhwi}.
 
 @item REAL_CST
 
Index: gcc/sdbout.c
===
--- gcc/sdbout.c 2013-10-22 10:15:14.050507121 +0100
+++ gcc/sdbout.c 2013-11-16 22:37:51.168539468 +
@@ -1152,7 +1152,7 @@ sdbout_one_type (tree type)
if (TREE_CODE (value) == CONST_DECL)
  value = DECL_INITIAL (value);
 
-   if (tree_fits_hwi_p (value))
+   if (tree_fits_shwi_p (value))
  {
PUT_SDB_DEF (IDENTIFIER_POINTER (TREE_PURPOSE (tem)));
PUT_SDB_INT_VAL (tree_to_shwi (value));
Index: gcc/tree.h
===
--- gcc/tree.h  2013-11-16 22:21:25.473308066 +
+++ gcc/tree.h  2013-11-16 22:42:05.148400937 +
@@ -3235,37 +3235,6 @@ tree_fits_shwi_p (const_tree cst)
   return false;
 }
 
-/

[patch,libgfortran] Fix binary128 ERFC_SCALED

2013-11-17 Thread FX

This patch fixes libgfortran’s binary128 [aka real(kind=16)] variant of 
ERFC_SCALED. The original code, which I had lifted from netlib, gives only 18 
significant decimal digits, which is not enough for binary128 (33 decimal 
digits).

I thus implemented a new variant for binary128. For arguments < 12, it simply 
calls erfcq() then multiplies by expq(x*x). For larger arguments, it uses a 
power expansion in 1/x. The new implementation provides answers to within 2 ulp 
of the correct value.
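
[Editorial note, not part of the original message.] For reference, the
scaled function and the standard asymptotic series in 1/x for large
positive x are given below.  The patch itself is attached as binary
data, so whether it uses exactly this series, and how many terms N it
keeps, is an assumption.

```latex
\operatorname{erfc\_scaled}(x) = e^{x^2}\,\operatorname{erfc}(x)

% Asymptotic expansion for large positive x, truncated after N terms:
\operatorname{erfc\_scaled}(x) \sim \frac{1}{x\sqrt{\pi}}
  \sum_{n=0}^{N} (-1)^n \frac{(2n-1)!!}{(2x^2)^n},
\qquad (-1)!! = 1
```

The first few terms of the sum are 1 - 1/(2x^2) + 3/(4x^4) - 15/(8x^6).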

Regtested on x86_64-apple-darwin13, comes with a testcase. OK to commit?
FX



erfc_scaled.ChangeLog
Description: Binary data


erfc_scaled.diff
Description: Binary data

[C++ Patch] PR 59123

2013-11-17 Thread Paolo Carlini


Hi,

C++11 7.1.5 seems very clear that redeclarations of *variables* 
differing in constexpr are fine (clang and icc agree). Tested x86_64-linux.


Thanks,
Paolo.

///
/cp
2013-11-17  Paolo Carlini  

PR c++/59123
* decl.c (validate_constexpr_redeclaration): Redeclarations of
variables can differ in constexpr.

/testsuite
2013-11-17  Paolo Carlini  

PR c++/59123
* g++.dg/cpp0x/constexpr-redeclaration1.C: New.
* g++.dg/cpp0x/constexpr-decl.C: Adjust.
Index: cp/decl.c
===
--- cp/decl.c   (revision 204913)
+++ cp/decl.c   (working copy)
@@ -1216,10 +1216,12 @@ validate_constexpr_redeclaration (tree old_decl, t
   if (! DECL_TEMPLATE_SPECIALIZATION (old_decl)
  && DECL_TEMPLATE_SPECIALIZATION (new_decl))
return true;
+
+  error ("redeclaration %qD differs in %<constexpr%>", new_decl);
+  error ("from previous declaration %q+D", old_decl);
+  return false;
 }
-  error ("redeclaration %qD differs in %<constexpr%>", new_decl);
-  error ("from previous declaration %q+D", old_decl);
-  return false;
+  return true;
 }
 
 #define GNU_INLINE_P(fn) (DECL_DECLARED_INLINE_P (fn)  \
Index: testsuite/g++.dg/cpp0x/constexpr-decl.C
===
--- testsuite/g++.dg/cpp0x/constexpr-decl.C (revision 204913)
+++ testsuite/g++.dg/cpp0x/constexpr-decl.C (working copy)
@@ -3,8 +3,7 @@
 
 struct S {
   static constexpr int size;   // { dg-error "must have an initializer" "must have" }
-  // { dg-error "previous declaration" "previous" { target *-*-* } 5 }
 };
 
 const int limit = 2 * S::size;
-constexpr int S::size = 256;   // { dg-error "" }
+constexpr int S::size = 256;
Index: testsuite/g++.dg/cpp0x/constexpr-redeclaration1.C
===
--- testsuite/g++.dg/cpp0x/constexpr-redeclaration1.C   (revision 0)
+++ testsuite/g++.dg/cpp0x/constexpr-redeclaration1.C   (working copy)
@@ -0,0 +1,10 @@
+// PR c++/59123
+// { dg-do compile { target c++11 } }
+
+// Fwd-declarations
+struct S;
+extern const S s;
+
+// (... later) definitions
+struct S {};
+constexpr S s {};

[libgfortran,patch] Silence a warning

2013-11-17 Thread FX

The attached patch adds an assert() in the library to fix PR 51828, i.e. silence 
a “may be used uninitialized” warning.
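
[Editorial sketch, not part of the original message.] The general
pattern looks like the following.  The actual libgfortran change is in
the binary attachment and is not shown here, so toy_last_positive is a
purely hypothetical function; whether a given compiler stops warning
also depends on its version and optimization level.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: index of the last positive element of V, or 0
   if there is none.  IDX is only assigned inside the loop, so a
   compiler that cannot prove N > 0 may warn that it is used
   uninitialized.  */
size_t
toy_last_positive (const int *v, size_t n)
{
  size_t idx;

  /* States the precondition that makes the loop body run at least
     once (and therefore assign IDX on the first iteration).  */
  assert (n > 0);

  for (size_t i = 0; i < n; i++)
    if (i == 0 || v[i] > 0)
      idx = i;

  return idx;
}
```

Note that the assertion disappears when NDEBUG is defined, so it
documents and checks the precondition rather than enforcing it in
release builds.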

Built and regtested on x86_64-apple-darwin13. OK to commit?

FX



libwarning.ChangeLog
Description: Binary data


libwarning.diff
Description: Binary data

Re: [RFA][PATCH]Fix 59019

2013-11-17 Thread Steven Bosscher

On Sun, Nov 17, 2013 at 7:48 AM, Jeff Law wrote:
>
> * combine.c (try_combine): If we have created an unconditional trap,
> make sure to fixup the insn stream & CFG appropriately.
>
> diff --git a/gcc/combine.c b/gcc/combine.c
> index 13f5e29..b3d20f2 100644
> --- a/gcc/combine.c
> +++ b/gcc/combine.c
> @@ -4348,6 +4348,37 @@ try_combine (rtx i3, rtx i2, rtx i1, rtx i0, int *new_direct_jump_p,
>update_cfg_for_uncondjump (undobuf.other_insn);
>  }
>
> +  /* If we might have created an unconditional trap, then we have
> + cleanup work to do.
> +
> + The fundamental problem is a conditional trap is not considered
> + control flow altering, while an unconditional trap is considered
> + control flow altering.
> +
> + So while we could have a conditional trap in the middle of a block
> + we can not have an unconditional trap in the middle of a block.  */
> +  if (GET_CODE (i3) == INSN
> +  && GET_CODE (PATTERN (i3)) == TRAP_IF
> +  && XEXP (PATTERN (i3), 0) == const1_rtx)

TRAP_CONDITION (PATTERN (i3)) == const1_rtx

But shouldn't the check be on const_true_rtx? Or does combine put a
const1_rtx there?

> +{
> +  basic_block bb = BLOCK_FOR_INSN (i3);
> +  rtx last = get_last_bb_insn (bb);

This won't work: get_last_bb_insn() is intended to be used only in
cfgrtl mode and "combine" works in cfglayout mode. If you use it in
cfglayout mode on a block that ends in a tablejump, you get back the
JUMP_TABLE_DATA insn that is in BB_FOOTER and there is no NEXT_INSN
path from BB_END to any insns in the footer.

rtx last = BB_END (bb);

Any dead jump tables will be dealt with later.

> +  /* First remove all the insns after the trap.  */
> +  if (i3 != last)
> +   delete_insn_chain (NEXT_INSN (i3), last, true);
> +
> +  /* And ensure there's no outgoing edges anymore.  */
> +  while (EDGE_COUNT (bb->succs) > 0)
> +   remove_edge (EDGE_SUCC (bb, 0));

Alternatively, you could do "split_block (bb, i3);" and let cfgcleanup
deal with the new, unreachable basic block.

> +  /* And ensure cfglayout knows this block does not fall through.  */
> +  emit_barrier_after_bb (bb);

Bah... Emitting the barrier is necessary here because
fixup_reorder_chain doesn't handle cases where a basic block is a dead
end. That is actually a bug in fixup_reorder_chain: Other passes could
create dead ends in the CFG in cfglayout mode and not emit a barrier
into BB_FOOTER, and fixup_reorder_chain wouldn't be able to handle
that (resulting in verify_flow_info failure).

fixup_reorder_chain should emit a BARRIER if a block has no successor edges.

(It's a general shortcoming of cfglayout mode that barriers are
still there at all. Ideally all barriers would be removed going into
cfglayout mode, and fixup_reorder_chain would put them back where
necessary. That would simplify the job of updating the CFG elsewhere
in the compiler, e.g. update_cfg_for_uncondjump)
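
[Editorial sketch, not part of the original message.] The cleanup being
discussed (drop everything after an unconditional trap, drop the
outgoing edges, record that the block does not fall through) can be
pictured on a toy block structure.  None of these names exist in GCC;
the real code works on RTL insn chains and the CFG edge vectors.

```c
#include <stdbool.h>
#include <stddef.h>

enum toy_insn_kind { TOY_INSN, TOY_TRAP_ALWAYS, TOY_JUMP };

/* Hypothetical basic block: a short insn array, successor block
   indices, and a flag standing in for a BARRIER after the block.  */
struct toy_block
{
  enum toy_insn_kind insns[8];
  size_t n_insns;
  int succs[4];
  size_t n_succs;
  bool barrier_after;
};

/* If BB contains an unconditional trap, make the trap the last insn,
   remove all successor edges and mark the block as a dead end.
   Returns true if BB was changed.  */
bool
toy_cut_at_trap (struct toy_block *bb)
{
  for (size_t i = 0; i < bb->n_insns; i++)
    if (bb->insns[i] == TOY_TRAP_ALWAYS)
      {
        bb->n_insns = i + 1;      /* delete insns after the trap */
        bb->n_succs = 0;          /* no outgoing edges anymore */
        bb->barrier_after = true; /* block does not fall through */
        return true;
      }
  return false;
}
```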

> +  /* Not exactly true, but gets the effect we want.  */
> +  *new_direct_jump_p = 1;
> +}
> +
>/* A noop might also need cleaning up of CFG, if it comes from the
>   simplification of a jump.  */
>if (JUMP_P (i3)
>

Would you mind if I try to spend some time making conditional traps be
control flow insns? It should make all of this a little bit less ugly.
And I have no fish to fry at all :-) Give me a week or two please, to
see if I can figure out those issues you've been running into.

Ciao!
Steven

Re: [PATCH] Time profiler - phase 2

2013-11-17 Thread Martin Liška

Dear Jan,

On 16 November 2013 12:24, Jan Hubicka wrote:
>> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
>> index c566a85..1562098 100644
>> --- a/gcc/ChangeLog
>> +++ b/gcc/ChangeLog
>> @@ -1,3 +1,15 @@
>> +2013-11-13   Martin Liska
>> + Jan Hubicka  
>> +
>> + * cgraphunit.c (node_cmp): New function.
>> + (expand_all_functions): Function ordering added.
>> + * common.opt: New profile based function reordering flag introduced.
>> + * coverage.c (get_coverage_counts): Wrong profile handled.
>> + * ipa.c (cgraph_externally_visible_p): New late flag introduced.
>> + * lto-partition.c: Support for time profile added.
>> + * lto.c: Likewise.
>> + * value-prof.c: Histogram instrumentation switch added.
>> +
>>  2013-11-13  Vladimir Makarov  
>>
>>   PR rtl-optimization/59036
>> diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
>> index 4765e6a..7cdd9a4 100644
>> --- a/gcc/cgraphunit.c
>> +++ b/gcc/cgraphunit.c
>> @@ -1821,6 +1821,17 @@ expand_function (struct cgraph_node *node)
>>ipa_remove_all_references (&node->ref_list);
>>  }
>>
>> +/* Node comparer that is responsible for the order that corresponds
>> +   to time when a function was launched for the first time.  */
>> +
>> +static int
>> +node_cmp (const void *pa, const void *pb)
>> +{
>> +  const struct cgraph_node *a = *(const struct cgraph_node * const *) pa;
>> +  const struct cgraph_node *b = *(const struct cgraph_node * const *) pb;
>> +
>> +  return b->tp_first_run - a->tp_first_run;
>
> Please stabilize this by using node->order when tp_first_run is equivalent.
> Later we ought to use better heuristic here, but order may be good enough to

Done.

> start with.
>> diff --git a/gcc/ipa.c b/gcc/ipa.c
>> index a11b1c7..d92a332 100644
>> --- a/gcc/ipa.c
>> +++ b/gcc/ipa.c
>> @@ -761,10 +761,14 @@ cgraph_externally_visible_p (struct cgraph_node *node,
>>   This improves code quality and we know we will duplicate them at most twice
>>   (in the case that we are not using plugin and link with object file
>>implementing same COMDAT)  */
>> -  if ((in_lto_p || whole_program)
>> -  && DECL_COMDAT (node->decl)
>> -  && comdat_can_be_unshared_p (node))
>> -return false;
>> +  if ((in_lto_p || whole_program || profile_arc_flag)
>> + && DECL_COMDAT (node->decl)
>> + && comdat_can_be_unshared_p (node))
>> +{
>> +  gcc_checking_assert (cgraph_function_body_availability (node)
>> +> AVAIL_OVERWRITABLE);
>> +  return false;
>> +}
>>
>>/* When doing link time optimizations, hidden symbols become local.  */
>>if (in_lto_p
>> @@ -932,7 +936,7 @@ function_and_variable_visibility (bool whole_program)
>>   }
>>gcc_assert ((!DECL_WEAK (node->decl)
>> && !DECL_COMDAT (node->decl))
>> -   || TREE_PUBLIC (node->decl)
>> +   || TREE_PUBLIC (node->decl)
>> || node->weakref
>> || DECL_EXTERNAL (node->decl));
>>if (cgraph_externally_visible_p (node, whole_program))
>> @@ -949,7 +953,7 @@ function_and_variable_visibility (bool whole_program)
>> && node->definition && !node->weakref
>> && !DECL_EXTERNAL (node->decl))
>>   {
>> -   gcc_assert (whole_program || in_lto_p
>> +   gcc_assert (whole_program || in_lto_p || profile_arc_flag
>> || !TREE_PUBLIC (node->decl));
>> node->unique_name = ((node->resolution == LDPR_PREVAILING_DEF_IRONLY
>> || node->resolution == LDPR_PREVAILING_DEF_IRONLY_EXP)
>
> These changes are unrelated, please remove them.
>> @@ -395,6 +397,20 @@ node_cmp (const void *pa, const void *pb)
>>  {
>>const struct cgraph_node *a = *(const struct cgraph_node * const *) pa;
>>const struct cgraph_node *b = *(const struct cgraph_node * const *) pb;
>> +
>> +  /* Profile reorder flag enables function reordering based on first execution
>> + of a function. All functions with profile are placed in ascending
>> + order at the beginning.  */
>> +
>> +  if (flag_profile_reorder_functions)
>&& a->tp_first_run != b->tp_first_run
>> +  {
>> +if (a->tp_first_run && b->tp_first_run)
>> +  return a->tp_first_run - b->tp_first_run;
>> +
>> +if (a->tp_first_run || b->tp_first_run)
>> +  return b->tp_first_run - a->tp_first_run;
>
> Drop a comment explaining the logic here ;)
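
[Editorial sketch, not part of the original message.] One way to spell
out the ordering under review: functions with a first-run time sort
first in ascending order, functions without a profile follow, and ties
fall back to a stable key.  The struct and the tie-break on "order" are
assumptions made for illustration (the tie-break is borrowed from the
earlier review comment on cgraphunit.c); explicit comparisons are used
here instead of subtraction so the result cannot overflow.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the two cgraph_node fields involved.
   tp_first_run == 0 means "no profile data".  */
struct toy_node
{
  int tp_first_run;
  int order;
};

int
toy_node_cmp (const void *pa, const void *pb)
{
  const struct toy_node *a = pa;
  const struct toy_node *b = pb;

  if (a->tp_first_run != b->tp_first_run)
    {
      /* Exactly one of them may be unprofiled: it goes last.  */
      if (a->tp_first_run == 0)
        return 1;
      if (b->tp_first_run == 0)
        return -1;
      /* Both profiled: earlier first run sorts first.  */
      return a->tp_first_run < b->tp_first_run ? -1 : 1;
    }

  /* Same first-run time: use ORDER to keep the sort deterministic.  */
  return (a->order > b->order) - (a->order < b->order);
}

void
toy_sort_nodes (struct toy_node *nodes, size_t n)
{
  qsort (nodes, n, sizeof nodes[0], toy_node_cmp);
}
```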
>> @@ -449,7 +465,7 @@ void
>>  lto_balanced_map (void)
>>  {
>>int n_nodes = 0;
>> -  int n_varpool_nodes = 0, varpool_pos = 0, best_varpool_pos = 0;
>> +  int n_varpool_nodes = 0, varpool_pos = 0;
>>struct cgraph_node **order = XNEWVEC (struct cgraph_node *, cgraph_max_uid);
>>struct varpool_node **varpool_order = NULL;
>>int i;
>> @@ -481,10 +497,13 @@ lto_balanced_map (void)
>>   get better about minimizing the function bounday, but until that
>>

Re: [PowerPC] libffi fixes and support for PowerPC64 ELFv2

2013-11-17 Thread David Edelsohn

On Sun, Nov 17, 2013 at 1:25 AM, Alan Modra wrote:
> On Sat, Nov 16, 2013 at 10:18:05PM +1030, Alan Modra wrote:
>> The following six patches correspond to patches posted to the libffi
>> mailing list a few days ago to add support for PowerPC64 ELFv2.  The
>
> The ChangeLog just became easier to write.  :)
>
> * src/powerpc/ffitarget.h: Import from upstream.
> * src/powerpc/ffi.c: Likewise.
> * src/powerpc/linux64.S: Likewise.
> * src/powerpc/linux64_closure.S: Likewise.
> * doc/libffi.texi: Likewise.
> * testsuite/libffi.call/cls_double_va.c: Likewise.
> * testsuite/libffi.call/cls_longdouble_va.c: Likewise.
>
> OK to apply?

Okay.

Thanks, David

Re: [PATCH, rs6000] Emit correct note for DWARF CFI information on LE prolog VSX stores

2013-11-17 Thread David Edelsohn

On Sat, Nov 16, 2013 at 10:32 PM, Bill Schmidt wrote:
> Hi,
>
> For VSX in little endian we currently split vector register stores into
> a permute/store pair.  For prolog stores, this results in a
> REG_FRAME_RELATED_EXPR note that doesn't have a simple register for its
> RHS, which it needs to have.  This patch detects that situation and
> ensures we produce the correct note.
>
> This problem was breaking bootstrap when configured with
> --with-cpu=power7, something we hadn't tried before.  With the patch we
> now get past stage 1.  There is at least one wrong-code bug to track
> down in stage 2, but modifying this note is clearly not involved with
> that.
>
> Otherwise bootstrapped and tested on powerpc64-unknown-linux-gnu with no
> regressions on the big-endian side, also bootstrapped with
> --with-cpu=power7.  Is this ok for trunk?
>
> Thanks,
> Bill
>
>
> 2011-11-16  Bill Schmidt  
>
> * config/rs6000/rs6000.c (rs6000_frame_related): Add split_reg
> parameter and use it in REG_FRAME_RELATED_EXPR note.
> (emit_frame_save): Call rs6000_frame_related with extra NULL_RTX
> parameter.
> (rs6000_emit_prologue): Likewise, but for little endian VSX
> stores, pass the source register of the store instead.

Okay.

Thanks, David

Re: [PowerPC] libffi fixes and support for PowerPC64 ELFv2

2013-11-17 Thread Alan Modra

On Sun, Nov 17, 2013 at 07:53:59AM -0500, David Edelsohn wrote:
> On Sun, Nov 17, 2013 at 1:25 AM, Alan Modra  wrote:
> > On Sat, Nov 16, 2013 at 10:18:05PM +1030, Alan Modra wrote:
> >> The following six patches correspond to patches posted to the libffi
> >> mailing list a few days ago to add support for PowerPC64 ELFv2.  The
> >
> > The ChangeLog just became easier to write.  :)
> >
> > * src/powerpc/ffitarget.h: Import from upstream.
> > * src/powerpc/ffi.c: Likewise.
> > * src/powerpc/linux64.S: Likewise.
> > * src/powerpc/linux64_closure.S: Likewise.
> > * doc/libffi.texi: Likewise.
> > * testsuite/libffi.call/cls_double_va.c: Likewise.
> > * testsuite/libffi.call/cls_longdouble_va.c: Likewise.
> >
> > OK to apply?
> 
> Okay.
> 
> Thanks, David

Committed revision 204917.  I'm also going to apply the following as
obvious.  The error I'm fixing here doesn't cause a runtime failure,
but does mess with the processor return branch prediction.

Index: ChangeLog
===
--- ChangeLog   (revision 204917)
+++ ChangeLog   (working copy)
@@ -1,5 +1,7 @@
 2013-11-18  Alan Modra  
 
+   * src/powerpc/ppc_closure.S: Don't bl .Luint128.
+
* src/powerpc/ffitarget.h: Import from upstream.
* src/powerpc/ffi.c: Likewise.
* src/powerpc/linux64.S: Likewise.
Index: src/powerpc/ppc_closure.S
===
--- src/powerpc/ppc_closure.S   (revision 204916)
+++ src/powerpc/ppc_closure.S   (working copy)
@@ -238,7 +238,7 @@
lwz %r3,112+0(%r1)
lwz %r4,112+4(%r1)
lwz %r5,112+8(%r1)
-   bl .Luint128
+   b .Luint128
 
 # The return types below are only used when the ABI type is FFI_SYSV.
 # case FFI_SYSV_TYPE_SMALL_STRUCT + 1. One byte struct.

-- 
Alan Modra
Australia Development Lab, IBM

Re: [wide-int] Remove tree_fits_hwi_p and tree_to_hwi

2013-11-17 Thread Kenneth Zadeck


On 11/17/2013 05:29 AM, Richard Sandiford wrote:
> AIUI the two-argument tree_fits_hwi_p and tree_to_hwi were replacements
> for host_integerp and tree_low_cst with variable "pos" arguments.
> I removed those uses from trunk this week, and Mike's merge has
> brought that into branch.

i think that i am a little uncomfortable with this.  last night when i 
was testing some stuff, i noticed a syntax error that i assume was a 
merge problem in dwarf2out.c, but i did not run down where it came from.


The code was at line 17499 and it used to be:

  if (tree_fits_hwi_p (value)
  && (simple_type_size_in_bits (TREE_TYPE (value))
  <= HOST_BITS_PER_WIDE_INT || tree_fits_shwi_p (value)))

someone (I assume you) had removed the tree_fits_hwi_p && and had left 
the last parentheses.


However the code on the trunk is

  if (host_integerp (value, TYPE_UNSIGNED (TREE_TYPE (value)))
  && (simple_type_size_in_bits (TREE_TYPE (value))
  <= HOST_BITS_PER_WIDE_INT || host_integerp (value, 0)))

so i do not see how you have matched this functionality by removing that 
call.

Re: [wide-int] Remove tree_fits_hwi_p and tree_to_hwi

2013-11-17 Thread Richard Sandiford

Kenneth Zadeck  writes:
> On 11/17/2013 05:29 AM, Richard Sandiford wrote:
>> AIUI the two-argument tree_fits_hwi_p and tree_to_hwi were replacements
>> for host_integerp and tree_low_cst with variable "pos" arguments.
>> I removed those uses from trunk this week, and Mike's merge has
>> brought that into branch.
> i think that i am a little uncomfortable with this.  last night when i 
> was testing some stuff, i noticed a syntax error that i assume was a 
> merge problem in dwarf2out.c, but i did not run down where it came from.
>
> The code was at line 17499 and it used to be:
>
>if (tree_fits_hwi_p (value)
>&& (simple_type_size_in_bits (TREE_TYPE (value))
><= HOST_BITS_PER_WIDE_INT || tree_fits_shwi_p (value)))
>
> someone (I assume you) had removed the tree_fits_hwi_p && and had left 
> the last parentheses.

Well, I removed the call from trunk and Mike left the extra parenthesis
when dealing with the resulting merge conflict.

As per the off-list message, I committed a patch to get the branch
building again this morning.

> However the code on the trunk is
>
>if (host_integerp (value, TYPE_UNSIGNED (TREE_TYPE (value)))
>&& (simple_type_size_in_bits (TREE_TYPE (value))
><= HOST_BITS_PER_WIDE_INT || host_integerp (value, 0)))
>
> so i do not see how you have matched this functionality by removing that 
> call.

Are you sure you're looking at an up-to-date trunk?  It was removed by:

http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01673.html

which was committed on Friday as r204846.  There was also a change
to the C frontend committed as r204847.  Those were the changes I was
talking about in the quote above.  So current trunk doesn't have any
calls to host_integerp or tree_low_cst with variable "pos" arguments.

Thanks,
Richard

Re: [wide-int] Remove tree_fits_hwi_p and tree_to_hwi

2013-11-17 Thread Kenneth Zadeck


On 11/17/2013 10:58 AM, Richard Sandiford wrote:
> Kenneth Zadeck writes:
>> On 11/17/2013 05:29 AM, Richard Sandiford wrote:
>>> AIUI the two-argument tree_fits_hwi_p and tree_to_hwi were replacements
>>> for host_integerp and tree_low_cst with variable "pos" arguments.
>>> I removed those uses from trunk this week, and Mike's merge has
>>> brought that into branch.
>> i think that i am a little uncomfortable with this.  last night when i
>> was testing some stuff, i noticed a syntax error that i assume was a
>> merge problem in dwarf2out.c, but i did not run down where it came from.
>>
>> The code was at line 17499 and it used to be:
>>
>>   if (tree_fits_hwi_p (value)
>>       && (simple_type_size_in_bits (TREE_TYPE (value))
>>           <= HOST_BITS_PER_WIDE_INT || tree_fits_shwi_p (value)))
>>
>> someone (I assume you) had removed the tree_fits_hwi_p && and had left
>> the last parentheses.
>
> Well, I removed the call from trunk and Mike left the extra parenthesis
> when dealing with the resulting merge conflict.
>
> As per the off-list message, I committed a patch to get the branch
> building again this morning.
>
>> However the code on the trunk is
>>
>>   if (host_integerp (value, TYPE_UNSIGNED (TREE_TYPE (value)))
>>       && (simple_type_size_in_bits (TREE_TYPE (value))
>>           <= HOST_BITS_PER_WIDE_INT || host_integerp (value, 0)))
>>
>> so i do not see how you have matched this functionality by removing that
>> call.
>
> Are you sure you're looking at an up-to-date trunk?  It was removed by:
>
>  http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01673.html
>
> which was committed on Friday as r204846.  There was also a change
> to the C frontend committed as r204847.  Those were the changes I was
> talking about in the quote above.  So current trunk doesn't have any
> calls to host_integerp or tree_low_cst with variable "pos" arguments.
>
> Thanks,
> Richard

i am likely off by a few days.   if there are no more calls on trunk 
then my objection is removed.


kenny

The only remaining use is in a branch-local change to the way that
match_case_to_enum_1 prints constants.  The old code was:

/* ??? Not working too hard to print the double-word value.
   Should perhaps be done with %lwd in the diagnostic routines?  */
if (TREE_INT_CST_HIGH (key) == 0)
  snprintf (buf, sizeof (buf), HOST_WIDE_INT_PRINT_UNSIGNED,
  TREE_INT_CST_LOW (key));
else if (!TYPE_UNSIGNED (type)
   && TREE_INT_CST_HIGH (key) == -1
   && TREE_INT_CST_LOW (key) != 0)
  snprintf (buf, sizeof (buf), "-" HOST_WIDE_INT_PRINT_UNSIGNED,
  -TREE_INT_CST_LOW (key));
else
  snprintf (buf, sizeof (buf), HOST_WIDE_INT_PRINT_DOUBLE_HEX,
  (unsigned HOST_WIDE_INT) TREE_INT_CST_HIGH (key),
  (unsigned HOST_WIDE_INT) TREE_INT_CST_LOW (key));

The first arm prints KEY as an unsigned HWI if the "infinite precision"
value of KEY fits in an unsigned HWI (regardless of whether the type
is signed or not, e.g. it could be a signed 128-bit type with a value
that happens to fit in an unsigned 64-bit HWI).  The second prints it
as a negative HWI if KEY is negative and fits.  The third arm is a hex
fallback.  But on branch we only print KEYs with signed types as decimal
if they fit in signed HWIs, which is different from the first arm on trunk.

This patch restores the trunk choices and gets rid of the then-unused
functions.
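
For readers following the thread, the three-way choice described above can be sketched in plain C. This is only an illustration under stated assumptions: `struct wide_key` and `format_key` are hypothetical names standing in for a 128-bit two's-complement constant and the printing logic; they are not GCC's tree or wide-int API.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for a double-word constant: a 128-bit
   two's-complement value split into a high and a low 64-bit half.  */
struct wide_key
{
  int64_t high;
  uint64_t low;
};

/* Format KEY into BUF using the three arms described in the mail.  */
static void
format_key (struct wide_key key, char *buf, size_t size)
{
  if (key.high == 0)
    /* Fits in an unsigned 64-bit word: print as unsigned decimal.  */
    snprintf (buf, size, "%" PRIu64, key.low);
  else if (key.high == -1 && (key.low >> 63) != 0)
    /* Negative and fits in a signed 64-bit word: print the magnitude
       with a leading minus sign (unsigned negation is well defined).  */
    snprintf (buf, size, "-%" PRIu64, (uint64_t) (0 - key.low));
  else
    /* Everything else falls back to hex.  */
    snprintf (buf, size, "0x%" PRIx64 "%016" PRIx64,
              (uint64_t) key.high, key.low);
}
```

The order of the tests matters: trying the unsigned fit first is what makes a value such as 2^64 - 1 print as decimal even when the type is signed.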

Also, the single-argument tree_fits_hwi_p was used only in one place,
sdbout.c.  On trunk it's a "host_integerp (..., 0)" call, so that
translates to tree_fits_shwi_p on branch.

Tested on x86_64-linux-gnu.  OK to install?

Thanks,
Richard


Index: gcc/c-family/c-common.c
===
--- gcc/c-family/c-common.c 2013-11-16 22:21:20.716272495 +
+++ gcc/c-family/c-common.c 2013-11-16 22:36:56.575137937 +
@@ -6056,8 +6056,10 @@ match_case_to_enum_1 (tree key, tree typ
   {
 char buf[WIDE_INT_PRINT_BUFFER_SIZE];
   
-  if (tree_fits_hwi_p (key, TYPE_SIGN (type)))
-print_dec (key, buf, TYPE_SIGN (type));
+  if (tree_fits_uhwi_p (key))
+print_dec (key, buf, UNSIGNED);
+  else if (tree_fits_shwi_p (key))
+print_dec (key, buf, SIGNED);
 else
   print_hex (key, buf);
   
Index: gcc/doc/generic.texi
===
--- gcc/doc/generic.texi2013-11-16 22:21:23.034289829 +
+++ gcc/doc/generic.texi2013-11-16 22:40:47.479832766 +
@@ -1024,10 +1024,8 @@ As this example indicates, the operands
   @tindex INTEGER_CST
   @tindex tree_fits_uhwi_p
   @tindex tree_fits_shwi_p
-@tindex tree_fits_hwi_p
   @tindex tree_to_uhwi
   @tindex tree_to_shwi
-@tindex tree_to_hwi
   @tindex REAL_CST
   @tindex FIXED_CST
   @tindex COMPLEX_CST
@@ -1050,16 +1048,11 @@ represented in an array of HOST_WIDE_INT
   in the array to represent the value without taking extra elements for
   redundant 0s or -1.
   
-The functions @code{tree_fits_uhwi_p}, @code{tree_fits_shwi_p

Re: [C++ Patch] PR 59123

2013-11-17 Thread Jason Merrill


OK.

Jason

Re: [PATCH] Time profiler - phase 2

2013-11-17 Thread Jan Hubicka

> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 5cb07b7..754f882 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,13 @@
> +2013-11-17  Martin Liska  
> + Jan Hubicka  
> +
> + * cgraphunit.c (node_cmp): New function.
> + (expand_all_functions): Function ordering added.
> + * common.opt: New profile based function reordering flag introduced.
> + * lto-partition.c: Support for time profile added.
> + * lto.c: Likewise.
> + * predict.c (handle_missing_profiles): Time profile handled in
> +   missing profiles.

OK.
> @@ -8933,6 +8933,14 @@ from profiling values of expressions for usage in 
> optimizations.
>  
>  Enabled with @option{-fprofile-generate} and @option{-fprofile-use}.
>  
> +@item -fprofile-reoder-functions
> +@opindex fprofile-reorder-functions
> +Function reordering based on profile instrumentation collects
> +first time of execution of a function and orders these functions
> +in ascending order.
> +
> +Enabled with @option{-fprofile-generate} and @option{-fprofile-use}.

I wonder if we don't want to enable it only for -fprofile-use -flto.
You do not need to enable it with -fprofile-generate.
> --- a/gcc/opts.c
> +++ b/gcc/opts.c
> @@ -1690,6 +1690,8 @@ common_handle_option (struct gcc_options *opts,
>   opts->x_flag_vect_cost_model = VECT_COST_MODEL_DYNAMIC;
>if (!opts_set->x_flag_tree_loop_distribute_patterns)
>   opts->x_flag_tree_loop_distribute_patterns = value;
> +  if (!opts_set->x_flag_profile_reorder_functions)
> + opts->x_flag_profile_reorder_functions = value;
>/* Indirect call profiling should do all useful transformations
>speculative devirutalization does.  */
>if (!opts_set->x_flag_devirtualize_speculatively
> @@ -1708,6 +1710,8 @@ common_handle_option (struct gcc_options *opts,
>   opts->x_flag_profile_values = value;
>if (!opts_set->x_flag_inline_functions)
>   opts->x_flag_inline_functions = value;
> +  if (!opts_set->x_flag_profile_reorder_functions)
> + opts->x_flag_profile_reorder_functions = value;
>/* FIXME: Instrumentation we insert makes ipa-reference bitmaps
>quadratic.  Disable the pass until better memory representation
>is done.  */

Remove the -fprofile-generate path here.
> +
> +  /* If time profile is missing, let assign the maximum that comes from
> +  caller functions.  */
> +  if (!node->tp_first_run)
> + node->tp_first_run = max_tp_first_run;

Probably +1 here; you want the function to appear
afterwards.
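
A rough sketch of what the +1 buys, assuming a simplified node type: `struct fn_node`, `fill_missing_first_run` and `first_run_cmp` are hypothetical names for illustration, not the cgraph structures or the node_cmp function from the patch.

```c
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a call-graph node.  */
struct fn_node
{
  const char *name;
  int tp_first_run;   /* 0 means no time profile was recorded.  */
};

/* If NODE has no time profile, place it just after the latest caller.  */
static void
fill_missing_first_run (struct fn_node *node, int max_caller_first_run)
{
  if (!node->tp_first_run)
    node->tp_first_run = max_caller_first_run + 1;
}

/* qsort comparator: ascending order of first execution.  */
static int
first_run_cmp (const void *pa, const void *pb)
{
  const struct fn_node *a = pa;
  const struct fn_node *b = pb;
  return (a->tp_first_run > b->tp_first_run)
         - (a->tp_first_run < b->tp_first_run);
}
```

Without the +1 the unprofiled function ties with its latest caller, and a non-stable sort may place it on either side.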

Honza
> +
>if (call_count
>&& fn && fn->cfg
>&& (call_count * unlikely_count_fraction >= profile_info->runs))

[PATCH, rs6000] Fix little-endian access to sdmode_stack_slot

2013-11-17 Thread Ulrich Weigand

Hello,

when accessing the sdmode_stack_slot, code in rs6000_emit_move would
unconditionally use
rtx mem = adjust_address_nv (operands[0], mode, 4);

This is wrong in little-endian mode; we always need to access the
low word there too.
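
To illustrate the point outside the backend, here is a small host-side sketch of where the low 32-bit word of an 8-byte slot lives. The helper names are made up for this example, and it assumes the host is either plain big-endian or plain little-endian.

```c
#include <stdint.h>
#include <string.h>

/* Byte offset of the low-order 32-bit word inside an 8-byte slot.  */
static size_t
low_word_offset (int bytes_big_endian)
{
  return bytes_big_endian ? 4 : 0;
}

/* Return nonzero if the host stores the most significant byte first.  */
static int
host_is_big_endian (void)
{
  const uint16_t probe = 0x0102;
  unsigned char first;
  memcpy (&first, &probe, 1);
  return first == 0x01;
}

/* Read the low word of SLOT, adjusting the address only on big-endian.  */
static uint32_t
read_low_word (const uint64_t *slot)
{
  uint32_t word;
  const unsigned char *base = (const unsigned char *) slot;
  memcpy (&word, base + low_word_offset (host_is_big_endian ()),
          sizeof word);
  return word;
}
```

An unconditional offset of 4 would return the high word on a little-endian host, which is the bug being fixed.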

Fixed by the patch below, which fixes a large number of DFP test
suite failures in little-endian.

Tested on powerpc64-linux and powerpc64le-linux.

OK for mainline?

Bye,
Ulrich



ChangeLog:

* config/rs6000/rs6000.c (rs6000_emit_move): Use low word of
sdmode_stack_slot also in little-endian mode.


Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 204919)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -8188,7 +8188,9 @@
}
   else if (INT_REGNO_P (REGNO (operands[1])))
{
- rtx mem = adjust_address_nv (operands[0], mode, 4);
+ rtx mem = operands[0];
+ if (BYTES_BIG_ENDIAN)
+   mem = adjust_address_nv (mem, mode, 4);
  mem = eliminate_regs (mem, VOIDmode, NULL_RTX);
  emit_insn (gen_movsd_hardfloat (mem, operands[1]));
}
@@ -8211,7 +8213,9 @@
}
   else if (INT_REGNO_P (REGNO (operands[0])))
{
- rtx mem = adjust_address_nv (operands[1], mode, 4);
+ rtx mem = operands[1];
+ if (BYTES_BIG_ENDIAN)
+   mem = adjust_address_nv (mem, mode, 4);
  mem = eliminate_regs (mem, VOIDmode, NULL_RTX);
  emit_insn (gen_movsd_hardfloat (operands[0], mem));
}
-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com

Re: [PATCH][1-3] New configure option to enable Position independent executable as default.

2013-11-17 Thread Magnus Granberg

On Saturday, 16 November 2013, 20:37:58, Ryan Hill wrote:
> On Wed, 13 Nov 2013 23:28:45 +0100
> 
> Magnus Granberg  wrote:
> > Hi
> > This patchset adds a new configure option, --enable-default-pie.
> > When the option is enabled, the gcc and g++ front ends pass -fPIE and
> > -pie by default. I have only added support for two targets, but it
> > should work on more targets. The new option is added in configure.ac.
> > We can't compile the compiler or the crt files with -fPIE, because that
> > would break PCH and the crtbegin and crtend files; this is disabled in
> > the Makefiles. The needed spec is added to DRIVER_SELF_SPECS. We disable
> > all the profiling tests because linking would fail. Tested on x86_64
> > linux (Gentoo).
> > 
> > /Magnus Granberg
> 
> Hey Magnus.  Some nits:
..
> > +for C, C++, ObjC, ObjC++, if none of @option{-fno-PIE},
> > @option{-fno-pie},
> > +@option{-fPIC}, @option{-fpic}, @option{-fno-PIC}, @option{-fno-pic},
> > +@option{-nostdlib}, @option{-nostartfiles}, @option{-shared},
> > +@option{-nodefaultlibs}, nor @option{static} are found.
> 
> Looks like nodefaultlibs is missing from PIE_DRIVER_SELF_SPECS or this needs
> to be updated.
> 
> Thanks!

Thank you for the nits. I have updated the patches with the fixes.
The ChangeLog from my first post about this new option should still apply.

>Mike Stump wrote
>Ick.  Would be nice to figure out on what systems one can do this and just do 
>it without the configure option.  Is there some reason that we need an option 
>for it?
It would work well on most *-*-linux* targets, but I don't have all the 
hardware for testing, and I agree with Ian that it should not be enabled by default.

/Magnus Granberg
--- a/gcc/testsuite/gcc.dg/default-pie.c	2013-11-09 21:07:16.741479728 +0100
+++ b/gcc/testsuite/gcc.dg/default-pie.c	2013-11-09 21:05:07.801479218 +0100
@@ -0,0 +1,12 @@
+/* { dg-do compile { target *-*-linux* *-*-gnu* } } */
+/* { dg-require-effective-target default_pie } */
+/* { dg-options "-O2" } */
+int foo (void);
+
+int
+main (void)
+{
+	return foo ();
+}
+
+/* { dg-final { scan-assembler "foo@PLT" } } */
--- a/gcc/testsuite/g++.dg/other/anon5.C	2012-11-10 15:34:42.0 +0100
+++ b/gcc/testsuite/g++.dg/other/anon5.C	2013-11-09 14:49:52.281390127 +0100
@@ -1,5 +1,6 @@
 // PR c++/34094
 // { dg-do link { target { ! { *-*-darwin* *-*-hpux* *-*-solaris2.* } } } }
+// { dg-skip-if "" { default_pie } { "*" } { "" } }
 // { dg-options "-g" }
 // Ignore additional message on powerpc-ibm-aix
 // { dg-prune-output "obtain more information" } */
--- a/gcc/testsuite/lib/target-supports.exp	2013-10-01 11:18:30.0 +0200
+++ b/gcc/testsuite/lib/target-supports.exp	2013-10-25 22:01:46.743388469 +0200
@@ -474,6 +474,11 @@ proc check_profiling_available { test_wh
 	}
 }
 
+# Profiling don't work with default -fPIE -pie.
+if { [check_effective_target_default_pie] } {
+  return 0
+}
+
 # Support for -p on solaris2 relies on mcrt1.o which comes with the
 # vendor compiler.  We cannot reliably predict the directory where the
 # vendor compiler (and thus mcrt1.o) is installed so we can't
@@ -839,6 +844,14 @@ proc check_effective_target_pie { } {
 return 0
 }
 
+# Return 1 if -pie, -fPIE are default enable, 0 otherwise.
+
+proc check_effective_target_default_pie { } {
+global ENABLE_DEFAULT_PIE
+return [info exists ENABLE_DEFAULT_PIE]
+return 0
+}
+
 # Return true if the target supports -mpaired-single (as used on MIPS).
 
 proc check_effective_target_mpaired_single { } {
--- a/gcc/config/gnu-user.h	2013-08-20 10:31:40.0 +0200
+++ b/gcc/config/gnu-user.h	2013-10-23 22:01:42.337238981 +0200
@@ -134,3 +134,17 @@ see the files COPYING3 and COPYING.RUNTI
 /* Additional libraries needed by -static-libtsan.  */
 #undef STATIC_LIBTSAN_LIBS
 #define STATIC_LIBTSAN_LIBS "-ldl -lpthread"
+
+/* We use this to make the compiler use -fPIE as default and link
+   with -pie.  */
+#ifdef ENABLE_DEFAULT_PIE
+#define PIE_DRIVER_SELF_SPECS \
+"%{pie|fpic|fPIC|fpie|fPIE|fno-pic|fno-PIC|fno-pie|fno-PIE| \
+  shared|static|nostdlib|nodefaultlibs|nostartfiles:;:-fPIE -pie}"
+#else
+#define PIE_DRIVER_SELF_SPECS ""
+#endif
+
+#ifndef GNU_DRIVER_SELF_SPECS
+#define GNU_DRIVER_SELF_SPECS PIE_DRIVER_SELF_SPECS
+#endif
--- a/gcc/config/i386/gnu-user-common.h	2013-01-10 21:38:27.0 +0100
+++ b/gcc/config/i386/gnu-user-common.h	2013-10-23 17:37:45.432767049 +0200
@@ -70,3 +70,8 @@ along with GCC; see the file COPYING3.
 
 /* Static stack checking is supported by means of probes.  */
 #define STACK_CHECK_STATIC_BUILTIN 1
+
+/* Use GNU_DRIVER_SELF_SPECS.  */
+#ifndef DRIVER_SELF_SPECS
+#define DRIVER_SELF_SPECS GNU_DRIVER_SELF_SPECS
+#endif
--- a/gcc/configure.ac	2013-09-25 18:10:35.0 +0200
+++ b/gcc/configure.ac	2013-10-22 21:26:56.287602139 +0200
@@ -5434,6 +5434,30 @@ if test x"${LINKER_HASH_STYLE}" != x; th
  [The linker hash style])
 fi
 
+# Check whether --enable-default-pie was

Re: [libgfortran,patch] Silence a warning

2013-11-17 Thread Janne Blomqvist

On Sun, Nov 17, 2013 at 1:05 PM, FX  wrote:
> The attached patch adds an assert() in the library to fix PR 51828, i.e. 
> silence a “may be used uninitialized” warning.
>
> Built and regtested on x86_64-apple-darwin13. OK to commit?
>
> FX
>

Ok, thanks.

-- 
Janne Blomqvist

[PATCH, i386]: Fix PR59153, ICE: in memory_address_length, at config/i386/i386.c

2013-11-17 Thread Uros Bizjak

Hello!

ix86_decompose_address is called from many places in i386.c, including to
calculate various attributes of the insn (length, etc.). The testcase
failed because the addsi_1 pattern was declared as TYPE_LEA and its pattern
(involving subregs of SFmode) was passed to the length attribute
calculation as a memory operand. The failure was in
ix86_address_subreg_operand, which rejected non-integer subregs.

ix86_decompose_address should fail only if the *structure* of the address
is totally wrong, and should not bother too much about its operands.
Operand checks should be done in ix86_legitimate_address_p.

The patch moves a couple of checks to ix86_legitimate_address_p. The
prevention of non-integer registers in addresses was already there, so
the check in ix86_address_subreg_operand was not needed anyway. The
check for invalid x86_64 constant addresses was also already present in
ix86_legitimate_address_p, so the corresponding x32 check was also moved
there.
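
As a standalone illustration of the x32 check that moves: a bare constant address whose 32-bit sign bit is set changes value when it is sign extended to 64 bits, so it no longer points into the 32-bit address space. The helpers below are hypothetical and only model the arithmetic; they are not the i386.c functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if VAL has the sign bit of a 32-bit value set.  */
static bool
si_signbit_set_p (int64_t val)
{
  return ((uint64_t) val & UINT64_C (0x80000000)) != 0;
}

/* Sign-extend the 32-bit ADDR to 64 bits, as happens to a constant
   displacement.  */
static uint64_t
sign_extend_32 (uint32_t addr)
{
  uint64_t wide = addr;
  if (addr & UINT32_C (0x80000000))
    wide |= UINT64_C (0xffffffff00000000);
  return wide;
}

/* Hypothetical helper: a bare constant address is acceptable only if
   sign extension leaves it unchanged.  */
static bool
x32_const_address_ok_p (uint32_t addr)
{
  return !si_signbit_set_p ((int64_t) addr);
}
```

In this model every address from 0x80000000 upwards is rejected, since those are exactly the values with the 32-bit sign bit set.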

2013-11-17  Uros Bizjak  

PR target/59153
* config/i386/i386.c (ix86_address_subreg_operand): Do not
reject non-integer subregs.
(ix86_decompose_address): Do not reject invalid CONST_INT RTXes.
Move check for invalid x32 constant addresses ...
(ix86_legitimate_address_p): ... here.

testsuite/ChangeLog:

2013-11-17  Uros Bizjak  

PR target/59153
* gcc.target/i386/pr59153.c: New test.

Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu
{,-m32} and committed to mainline. The patch will be backported to
other release branches after a couple of days without problems in
mainline.

Uros.
Index: config/i386/i386.c
===
--- config/i386/i386.c  (revision 204921)
+++ config/i386/i386.c  (working copy)
@@ -11785,9 +11785,6 @@ ix86_address_subreg_operand (rtx op)
 
   mode = GET_MODE (op);
 
-  if (GET_MODE_CLASS (mode) != MODE_INT)
-return false;
-
   /* Don't allow SUBREGs that span more than a word.  It can lead to spill
  failures when the register is one word out of a two word structure.  */
   if (GET_MODE_SIZE (mode) > UNITS_PER_WORD)
@@ -11962,19 +11959,6 @@ ix86_decompose_address (rtx addr, struct ix86_addr
   scale = 1 << scale;
   retval = -1;
 }
-  else if (CONST_INT_P (addr))
-{
-  if (!x86_64_immediate_operand (addr, VOIDmode))
-   return 0;
-
-  /* Constant addresses are sign extended to 64bit, we have to
-prevent addresses from 0x80000000 to 0xffffffff in x32 mode.  */
-  if (TARGET_X32
- && val_signbit_known_set_p (SImode, INTVAL (addr)))
-   return 0;
-
-  disp = addr;
-}
   else
 disp = addr;   /* displacement */
 
@@ -12706,6 +12690,12 @@ ix86_legitimate_address_p (enum machine_mode mode
   && !x86_64_immediate_operand (disp, VOIDmode))
/* Displacement is out of range.  */
return false;
+  /* In x32 mode, constant addresses are sign extended to 64bit, so
+we have to prevent addresses from 0x80000000 to 0xffffffff.  */
+  else if (TARGET_X32 && !(index || base)
+  && CONST_INT_P (disp)
+  && val_signbit_known_set_p (SImode, INTVAL (disp)))
+   return false;
 }
 
   /* Everything looks valid.  */
Index: testsuite/gcc.target/i386/pr59153.c
===
--- testsuite/gcc.target/i386/pr59153.c (revision 0)
+++ testsuite/gcc.target/i386/pr59153.c (working copy)
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O -flive-range-shrinkage -mdispatch-scheduler -march=bdver1" } */
+
+int foo (float f)
+{
+  union
+  {
+float f;
+int i;
+  } z = { .f = f };
+
+  return z.i - 1;
+}

Re: [PATCH, rs6000] Fix little-endian access to sdmode_stack_slot

2013-11-17 Thread David Edelsohn

On Sun, Nov 17, 2013 at 4:12 PM, Ulrich Weigand  wrote:
> Hello,
>
> when accessing the sdmode_stack_slot, code in rs6000_emit_move would
> unconditionally use
> rtx mem = adjust_address_nv (operands[0], mode, 4);
>
> This is wrong in little-endian mode; we always need to access the
> low word there too.
>
> Fixed by the patch below, which fixes a large number of DFP test
> suite failures in little-endian.
>
> Tested on powerpc64-linux and powerpc64le-linux.
>
> OK for mainline?

> ChangeLog:
>
> * config/rs6000/rs6000.c (rs6000_emit_move): Use low word of
> sdmode_stack_slot also in little-endian mode.

Okay.

Thanks, David

Re: [PATCH] Time profiler - phase 2

2013-11-17 Thread Jan Hubicka

> Hello,
>there is a new version of the patch, I disabled the branch with
> profile-generate. Could you please advise me how should I force to use
> profile-reorder-functions just with enable LTO optimization?
> 
> I also attach reordering results:
> o gimp-reoder-latest.html (latest patch)
> o gimp-reoder-without-fix.html (without the code in gcc/predict.c)
> 
> Thank you,
> Martin
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 5cb07b7..754f882 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,13 @@
> +2013-11-17  Martin Liska  
> + Jan Hubicka  
> +
> + * cgraphunit.c (node_cmp): New function.
> + (expand_all_functions): Function ordering added.
> + * common.opt: New profile based function reordering flag introduced.
> + * lto-partition.c: Support for time profile added.
> + * lto.c: Likewise.
> + * predict.c (handle_missing_profiles): Time profile handled in
> +   missing profiles.

OK,
thanks!
> @@ -8645,7 +8645,7 @@ profile useful for later recompilation with profile 
> feedback based
>  optimization.  You must use @option{-fprofile-generate} both when
>  compiling and when linking your program.
>  
> -The following options are enabled: @code{-fprofile-arcs}, 
> @code{-fprofile-values}, @code{-fvpt}.
> +The following options are enabled: @code{-fprofile-arcs}, 
> @code{-fprofile-values}, @code{-fprofile-reorder-functions}, @code{-fvpt}.
>  
>  If @var{path} is specified, GCC looks at the @var{path} to find
>  the profile feedback data files. See @option{-fprofile-dir}.

Skip this change, it is only about the profiling options used.  I think it is 
enough to mention it later.
> @@ -8933,6 +8933,14 @@ from profiling values of expressions for usage in 
> optimizations.
>  
>  Enabled with @option{-fprofile-generate} and @option{-fprofile-use}.
>  
> +@item -fprofile-reoder-functions
> +@opindex fprofile-reorder-functions
> +Function reordering based on profile instrumentation collects
> +first time of execution of a function and orders these functions
> +in ascending order.
> +
> +Enabled with @option{-fprofile-generate} and @option{-fprofile-use}.

Only with -fprofile-use.

What happened with the plans for linker support? Perhaps we can implement the 
numbered sections by Carry's proposal and hope that binutils will catch up in 
next release?

Honza

Re: Add value range support into memcpy/memset expansion

2013-11-17 Thread Jan Hubicka

Hi,
this is the version I committed. It also adds a testcase and enables the support in 
the i386 backend.
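
For orientation, the bound computation can be sketched roughly as below. This is a simplified model, not the committed determine_block_size: `struct len_info` and `block_size_bounds` are hypothetical, and the fallback uses only the mode mask where the real code also consults the type's minimum and maximum values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical summary of what is known about a length operand.  */
struct len_info
{
  bool is_constant;     /* Length is a compile-time constant.  */
  uint64_t constant;    /* Its value when is_constant is true.  */
  bool has_range;       /* A value range is known for the length.  */
  uint64_t range_min;
  uint64_t range_max;
  uint64_t mode_mask;   /* Largest value the length's mode can hold.  */
};

/* Store the smallest and largest possible block size for LEN.  */
static void
block_size_bounds (const struct len_info *len,
                   uint64_t *min_size, uint64_t *max_size)
{
  if (len->is_constant)
    /* A constant length pins both bounds.  */
    *min_size = *max_size = len->constant;
  else if (len->has_range)
    {
      /* Use the known value range.  */
      *min_size = len->range_min;
      *max_size = len->range_max;
    }
  else
    {
      /* Nothing known: anything the mode can represent.  */
      *min_size = 0;
      *max_size = len->mode_mask;
    }
}
```

An expander receiving these bounds can then skip code paths that only matter for sizes outside the range.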

Honza

* doc/md.texi (setmem, movstr): Update documentation.
* builtins.c (determine_block_size): New function.
(expand_builtin_memcpy): Use it and pass it to
emit_block_move_hints.
(expand_builtin_memset_args): Use it and pass it to
set_storage_via_setmem.
* expr.c (emit_block_move_via_movmem): Add min_size/max_size parameters;
update call to expander.
(emit_block_move_hints): Add min_size/max_size parameters.
(clear_storage_hints): Likewise.
(set_storage_via_setmem): Likewise.
(clear_storage): Update.
* expr.h (emit_block_move_hints, clear_storage_hints,
set_storage_via_setmem): Update prototype.
* i386.c (ix86_expand_set_or_movmem): Add bounds; export.
(ix86_expand_movmem, ix86_expand_setmem): Remove.
* i386.md (movmem, setmem): Pass parameters.

* testsuite/gcc.target/i386/memcpy-2.c: New testcase.
Index: doc/md.texi
===
--- doc/md.texi (revision 204899)
+++ doc/md.texi (working copy)
@@ -5328,6 +5328,9 @@ destination and source strings are opera
 the expansion of this pattern should store in operand 0 the address in
 which the @code{NUL} terminator was stored in the destination string.
 
+This patern has also several optional operands that are same as in
+@code{setmem}.
+
 @cindex @code{setmem@var{m}} instruction pattern
 @item @samp{setmem@var{m}}
 Block set instruction.  The destination string is the first operand,
@@ -5347,6 +5350,8 @@ respectively.  The expected alignment di
 in a way that the blocks are not required to be aligned according to it in
 all cases. This expected alignment is also in bytes, just like operand 4.
 Expected size, when unknown, is set to @code{(const_int -1)}.
+Operand 7 is the minimal size of the block and operand 8 is the
+maximal size of the block (NULL if it can not be represented as CONST_INT).
 
 The use for multiple @code{setmem@var{m}} is as for @code{movmem@var{m}}.
 
Index: builtins.c
===
--- builtins.c  (revision 204899)
+++ builtins.c  (working copy)
@@ -3095,6 +3095,51 @@ builtin_memcpy_read_str (void *data, HOS
   return c_readstr (str + offset, mode);
 }
 
+/* LEN specify length of the block of memcpy/memset operation.
+   Figure out its range and put it into MIN_SIZE/MAX_SIZE.  */
+
+static void
+determine_block_size (tree len, rtx len_rtx,
+ unsigned HOST_WIDE_INT *min_size,
+ unsigned HOST_WIDE_INT *max_size)
+{
+  if (CONST_INT_P (len_rtx))
+{
+  *min_size = *max_size = UINTVAL (len_rtx);
+  return;
+}
+  else
+{
+  double_int min, max;
+  if (TREE_CODE (len) == SSA_NAME 
+ && get_range_info (len, &min, &max) == VR_RANGE)
+   {
+ if (min.fits_uhwi ())
+   *min_size = min.to_uhwi ();
+ else
+   *min_size = 0;
+ if (max.fits_uhwi ())
+   *max_size = max.to_uhwi ();
+ else
+   *max_size = (HOST_WIDE_INT)-1;
+   }
+  else
+   {
+ if (host_integerp (TYPE_MIN_VALUE (TREE_TYPE (len)), 1))
+   *min_size = tree_low_cst (TYPE_MIN_VALUE (TREE_TYPE (len)), 1);
+ else
+   *min_size = 0;
+ if (host_integerp (TYPE_MAX_VALUE (TREE_TYPE (len)), 1))
+   *max_size = tree_low_cst (TYPE_MAX_VALUE (TREE_TYPE (len)), 1);
+ else
+   *max_size = GET_MODE_MASK (GET_MODE (len_rtx));
+   }
+}
+  gcc_checking_assert (*max_size <=
+  (unsigned HOST_WIDE_INT)
+ GET_MODE_MASK (GET_MODE (len_rtx)));
+}
+
 /* Expand a call EXP to the memcpy builtin.
Return NULL_RTX if we failed, the caller should emit a normal call,
otherwise try to get the result in TARGET, if convenient (and in
@@ -3117,6 +3162,8 @@ expand_builtin_memcpy (tree exp, rtx tar
   rtx dest_mem, src_mem, dest_addr, len_rtx;
   HOST_WIDE_INT expected_size = -1;
   unsigned int expected_align = 0;
+  unsigned HOST_WIDE_INT min_size;
+  unsigned HOST_WIDE_INT max_size;
 
   /* If DEST is not a pointer type, call the normal function.  */
   if (dest_align == 0)
@@ -3136,6 +3183,7 @@ expand_builtin_memcpy (tree exp, rtx tar
   dest_mem = get_memory_rtx (dest, len);
   set_mem_align (dest_mem, dest_align);
   len_rtx = expand_normal (len);
+  determine_block_size (len, len_rtx, &min_size, &max_size);
   src_str = c_getstr (src);
 
   /* If SRC is a string constant and block move would be done
@@ -3164,7 +3212,8 @@ expand_builtin_memcpy (tree exp, rtx tar
   dest_addr = emit_block_move_hints (dest_mem, src_mem, len_rtx,
 CALL_EXPR_TAI

Re: Some wide-int review comments

2013-11-17 Thread Kenneth Zadeck


On 11/08/2013 05:30 AM, Richard Sandiford wrote:

Some comments from looking through the diff with the merge point,
ignoring wide-int.h and wide-int.cc.  A few more to follow in the
form of patches.



dwarf2out.c has:

+case CONST_WIDE_INT:
+  if (mode == VOIDmode)
+   mode = GET_MODE (rtl);
+
+  if (mode != VOIDmode && (dwarf_version >= 4 || !dwarf_strict))
+   {
+ gcc_assert (mode == GET_MODE (rtl) || VOIDmode == GET_MODE (rtl));
+
+ /* Note that a CONST_DOUBLE rtx could represent either an integer
+or a floating-point constant.  A CONST_DOUBLE is used whenever
+the constant requires more than one word in order to be
+adequately represented.  We output CONST_DOUBLEs as blocks.  */
+ loc_result = new_loc_descr (DW_OP_implicit_value,
+ GET_MODE_SIZE (mode), 0);
+ loc_result->dw_loc_oprnd2.val_class = dw_val_class_wide_int;
+ loc_result->dw_loc_oprnd2.v.val_wide = ggc_alloc_cleared_wide_int ();
+ *loc_result->dw_loc_oprnd2.v.val_wide = std::make_pair (rtl, mode);

The comment looks like a cut-&-paste.  The "mode == GET_MODE (rtl)"
bit should never be true.

i removed the assertion completely.  it really is unnecessary.



 From fold-const.c:

@@ -13686,14 +13548,17 @@ fold_binary_loc (location_t loc,
  break;
}
  
-	else if (TREE_INT_CST_HIGH (arg1) == signed_max_hi

-&& TREE_INT_CST_LOW (arg1) == signed_max_lo
+   else if (wi::eq_p (arg1, signed_max)
 && TYPE_UNSIGNED (arg1_type)
+/* KENNY QUESTIONS THE CHECKING OF THE BITSIZE
+   HERE.  HE FEELS THAT THE PRECISION SHOULD BE
+   CHECKED */
+
 /* We will flip the signedness of the comparison operator
associated with the mode of arg1, so the sign bit is
specified by this mode.  Check that arg1 is the signed
max associated with this sign bit.  */
-&& width == GET_MODE_BITSIZE (TYPE_MODE (arg1_type))
+&& prec == GET_MODE_BITSIZE (TYPE_MODE (arg1_type))
 /* signed_type does not work on pointer types.  */
 && INTEGRAL_TYPE_P (arg1_type))
  {

Looks like it should be resolved one way or the other before the merge.

I have posted a patch to fix this on trunk.   the enclosed patch is 
consistent with that fix.





 From gcse.c:

--- wide-int-base/gcc/gcc/gcse.c2013-11-05 13:09:32.148376180 +
+++ wide-int/gcc/gcc/gcse.c 2013-11-05 13:07:28.431495118 +
@@ -1997,6 +1997,13 @@ prune_insertions_deletions (int n_elems)
bitmap_clear_bit (pre_delete_map[i], j);
  }
  
+  if (dump_file)
+{
+  dump_bitmap_vector (dump_file, "pre_insert_map", "", pre_insert_map, n_edges);
+  dump_bitmap_vector (dump_file, "pre_delete_map", "", pre_delete_map,
+  last_basic_block);
+}
+
sbitmap_free (prune_exprs);
free (insertions);
free (deletions);

This doesn't look related.

removed




 From lcm.c:

diff -udpr '--exclude=.svn' '--exclude=.pc' '--exclude=patches' wide-int-base/gcc/gcc/lcm.c wide-int/gcc/gcc/lcm.c
--- wide-int-base/gcc/gcc/lcm.c 2013-08-22 09:00:23.068716382 +0100
+++ wide-int/gcc/gcc/lcm.c  2013-10-26 13:19:16.287277520 +0100
@@ -64,6 +64,7 @@ along with GCC; see the file COPYING3.
  #include "sbitmap.h"
  #include "dumpfile.h"
  
+#define LCM_DEBUG_INFO 1
  /* Edge based LCM routines.  */
  static void compute_antinout_edge (sbitmap *, sbitmap *, sbitmap *, sbitmap *);
  static void compute_earliest (struct edge_list *, int, sbitmap *, sbitmap *,
@@ -106,6 +107,7 @@ compute_antinout_edge (sbitmap *antloc,
/* We want a maximal solution, so make an optimistic initialization of
   ANTIN.  */
bitmap_vector_ones (antin, last_basic_block);
+  bitmap_vector_clear (antout, last_basic_block);
  
/* Put every block on the worklist; this is necessary because of the
   optimistic initialization of ANTIN above.  */
@@ -432,6 +434,7 @@ pre_edge_lcm (int n_exprs, sbitmap *tran
  
/* Allocate an extra element for the exit block in the laterin vector.  */
laterin = sbitmap_vector_alloc (last_basic_block + 1, n_exprs);
+  bitmap_vector_clear (laterin, last_basic_block);
compute_laterin (edge_list, earliest, antloc, later, laterin);
  
  #ifdef LCM_DEBUG_INFO


Same here.

i removed this also.   this was done for the sanity of debugging. but i 
do not think that it is important enough for the trunk.   I assume that 
the code was done this way because it is faster to do this than 
explicitly clear the structures.




 From real.c:

@@ -2144,43 +2148,131 @@ real_from_string3 (REAL_VALUE_TYPE *r, c
  real_

[PATCH, rs6000] Fix libcpp/lex.c Altivec code to be correct for little endian

2013-11-17 Thread Bill Schmidt

Hi, 

As Ulrich Weigand discovered, libcpp/lex.c contains some code optimized
for use with Altivec that is incorrect for little endian targets.  This
breaks bootstrap on powerpc64le-unknown-linux-gnu when configured with
--with-cpu=power7.

This patch makes appropriate modifications for little endian.  The
transformation of lvsr/vperm(x,y,z) into lvsl/vperm(y,x,z) is familiar
from a previous patch.  The other obvious change is converting
count-leading-zeroes into count-trailing-zeroes.
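
The clz/ctz swap can be shown with a scalar sketch that needs no Altivec. It assumes a GCC-compatible compiler for `__builtin_clzll` and `__builtin_ctzll`, and a host that is either plain big-endian or plain little-endian. `first_match_index` and `host_is_big_endian` are names invented for this example, not code from lex.c.

```c
#include <stdint.h>
#include <string.h>

/* Return nonzero if the host stores the most significant byte first.  */
static int
host_is_big_endian (void)
{
  const uint16_t probe = 0x0102;
  unsigned char first;
  memcpy (&first, &probe, 1);
  return first == 0x01;
}

/* BYTES holds 0xff where a character matched and 0x00 elsewhere, with at
   least one match (the builtins are undefined for a zero argument).
   Return the index of the first matching byte in memory order.  */
static unsigned
first_match_index (const unsigned char bytes[8])
{
  uint64_t word;
  memcpy (&word, bytes, sizeof word);
  if (host_is_big_endian ())
    /* Byte 0 is the most significant: count leading zero bits.  */
    return (unsigned) __builtin_clzll (word) >> 3;
  /* Byte 0 is the least significant: count trailing zero bits.  */
  return (unsigned) __builtin_ctzll (word) >> 3;
}
```

Using the leading-zero count on a little-endian load would return the index of the last match counted from the wrong end, which is why the patch switches builtins under `__BIG_ENDIAN__`.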

Bootstrapped on powerpc64-unknown-linux-gnu (BE) using --with-cpu=power7
with no regressions.  Bootstrap for powerpc64le-unknown-linux-gnu (LE)
using --with-cpu=power7 now completes with this patch.  There are still
failures for --with-cpu=power7 that are not present using
--with-cpu=power6 that need to be investigated, but they are unrelated
to this change.

Ok for trunk?

Thanks,
Bill


2013-11-17  Bill Schmidt  

* lex.c (search_line_fast): Correct for little endian.


Index: libcpp/lex.c
===
--- libcpp/lex.c(revision 204928)
+++ libcpp/lex.c(working copy)
@@ -559,8 +559,13 @@ search_line_fast (const uchar *s, const uchar *end
  beginning with all ones and shifting in zeros according to the
  mis-alignment.  The LVSR instruction pulls the exact shift we
  want from the address.  */
+#ifdef __BIG_ENDIAN__
   mask = __builtin_vec_lvsr(0, s);
   mask = __builtin_vec_perm(zero, ones, mask);
+#else
+  mask = __builtin_vec_lvsl(0, s);
+  mask = __builtin_vec_perm(ones, zero, mask);
+#endif
   data &= mask;
 
   /* While altivec loads mask addresses, we still need to align S so
@@ -624,7 +629,11 @@ search_line_fast (const uchar *s, const uchar *end
 /* L now contains 0xff in bytes for which we matched one of the
relevant characters.  We can find the byte index by finding
its bit index and dividing by 8.  */
+#ifdef __BIG_ENDIAN__
 l = __builtin_clzl(l) >> 3;
+#else
+l = __builtin_ctzl(l) >> 3;
+#endif
 return s + l;
 
 #undef N

RE: [patch] [arm] ARM Cortex-M3/M4 tuning

2013-11-17 Thread Joey Ye

Sorry about this. I should have run x86 make check.

- Joey

> -Original Message-
> From: Richard Biener [mailto:richard.guent...@gmail.com]
> Sent: Thursday, November 14, 2013 22:16
> To: H.J. Lu
> Cc: Joey Ye; Janis Johnson; GCC Patches; Ramana Radhakrishnan
> Subject: Re: [patch] [arm] ARM Cortex-M3/M4 tuning
> 
> On Thu, Nov 14, 2013 at 1:35 PM, H.J. Lu  wrote:
> > On Thu, Nov 14, 2013 at 2:24 AM, Joey Ye  wrote:
> >> In mainline and arm/embedded-4_8-branch now.
> >>
> >>> -Original Message-
> >>> From: Janis Johnson [mailto:janis_john...@mentor.com]
> >>> Sent: Thursday, November 14, 2013 1:45
> >>> To: Joey Ye; jani...@codesourcery.com
> >>> Cc: gcc-patches@gcc.gnu.org; Ramana Radhakrishnan
> >>> Subject: Re: [patch] [arm] ARM Cortex-M3/M4 tuning
> >>>
> >>> On 11/12/2013 10:20 PM, Joey Ye wrote:
> >>> > Janis, can you please take a look at test case changes.
> >>> >
> >>> > Thanks,
> >>> > Joey
> >>>
> >>> They look fine.
> >>>
> >
> > I got
> >
> > ERROR: (DejaGnu) proc "{ scan-tree-dump-times "Threaded" 1 "vrp1" } ||
> > { arm_cortex_m }" does not exist.
> >
> > on Linux/x86-64.
> 
> me too, this stops testing tree-ssa.exp at this point which is bad.  Fixed as
> attached.
> 
> Richard.
> 
> >
> >
> >
> >
> >
> > --
> > H.J.

Re: cilking away

2013-11-17 Thread Mike Stump

On Nov 15, 2013, at 8:23 PM, "Iyer, Balaji V"  wrote:
> This is already done in my patch for _Cilk_spawn and _Cilk_sync support for 
> C++. The patch was submitted ~3-4 weeks ago.

Ping once a day until reviewed.  :-)  You should form a new patch and post it 
and be sure to cc jason on that email.  It is possible he just hasn't seen it 
yet.  Be sure to post and ping the work.

> It is currently under review 
> (http://gcc.gnu.org/ml/gcc-patches/2013-10/msg01807.html).

I saw no evidence it is currently under review.  I think the state is that jason 
hasn't been cc'ed on the C++ patch and doesn't know of its existence.

>> The -fcilkplus in *.exp is redundant with that option in *.{c,cc}.  Please
>> remove one instance of them.
> 
> It is sort of like a safety measure

We don't want a sort of safety measure.

> since there are some error tests which may have dg-options omitted.

Once you decide if those tests are wrong for not having dg-options, or not, the 
rest will just fall out.

> if it is really bad, I can put it in the tests and remove from the options.

Yes, it is really bad, thanks.

Re: Clean up atomic tests

2013-11-17 Thread Richard Henderson

On 11/08/2013 02:43 AM, Joseph S. Myers wrote:
> This patch cleans up various issues with the tests of atomics built-in
> functions and libatomic functions, in preparation for adapting those
> tests to add test coverage of stdatomic.h macros.  The tests were
> missing a return type for main (C11 doesn't allow implicit int return
> types).  Some tests were incrementing a variable in one part of an
> expression and using its value in another part without a sequence
> point in between.  And the libatomic tests shouldn't have been
> restricted to targets with hardware sync_* support, or adding
> command-line options to make such support available, since they are
> built with -fno-inline-atomics anyway to ensure the library
> functionality is what's tested, and the library is meant to cover all
> types regardless of what hardware support may be available.
> 
> Tested for x86_64-unknown-linux-gnu.  OK to commit?

Ok.


r~
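
For readers of the archive, here is a minimal sketch of the sequence-point problem described in the quoted message. It is not taken from the patch: the variable names and the function check_exchange are hypothetical, and the only real interface used is GCC's __atomic_exchange_n built-in. The problematic shape is kept in a comment because its behaviour is undefined.

```c
static int v;      /* the object operated on atomically */
static int count;  /* the value we expect to read back */

/* Problematic shape, shown only as a comment because its behaviour is
   undefined: COUNT is read in one operand of != and modified in the
   other, with no sequence point between the two.

     if (__atomic_exchange_n (&v, count + 1, __ATOMIC_RELAXED) != count++)
       abort ();  */

/* Cleaned-up shape: compare first, then increment in its own statement.
   Returns 0 on success, 1 on the first mismatch.  */
int
check_exchange (void)
{
  v = 0;
  count = 0;

  if (__atomic_exchange_n (&v, count + 1, __ATOMIC_RELAXED) != count)
    return 1;
  count++;

  if (__atomic_exchange_n (&v, count + 1, __ATOMIC_SEQ_CST) != count)
    return 1;
  count++;

  return (v == 2 && count == 2) ? 0 : 1;
}
```

Moving the increment into its own statement places a sequence point between the read and the write of count, so the comparison no longer depends on the order in which the operands of != are evaluated.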

_Cilk_spawn and _Cilk_sync for C++

2013-11-17 Thread Iyer, Balaji V

Hello Jason et al.,
Mike Stump mentioned that my _Cilk_spawn and _Cilk_sync for C++ may 
have been lost in the email pile. So, attached is an updated _Cilk_spawn and 
_Cilk_sync for C++ patch. Is this Ok to install?

Here are the ChangeLog entries (they shouldn't have changed since the last 
submission):

gcc/cp/ChangeLog
2013-11-17  Balaji V. Iyer  

* Make-lang.in (CXX_AND_OBJCXX_OBJS): Added cp/cp-cilk.o.
* cp-cilk.c: New file.
* cp-tree.h (cilk_valid_spawn): New prototype.
(gimplify_cilk_spawn): Likewise.
(cp_cilk_install_body_wframe_cleanup): Likewise.
(cilk_create_lambda_fn_tmp_var): Likewise.
* decl.c (finish_function): Insert Cilk function-calls when a
_Cilk_spawn is used in a function.
* except.c (do_begin_catch): Made the function non-static.
(do_end_catch): Likewise.
* parser.c (cp_parser_postfix_expression): Added RID_CILK_SPAWN and
RID_CILK_SYNC cases.
* parser.h (IN_CILK_SPAWN): New #define.
* cp-objcp-common.h (LANG_HOOKS_CILKPLUS_GIMPLIFY_SPAWN): Likewise.
(LANG_HOOKS_CILKPLUS_DETECT_SPAWN_AND_UNWRAP): Likewise.
(LANG_HOOKS_CILKPLUS_FRAME_CLEANUP): Likewise.
* pt.c (tsubst_expr): Added CILK_SPAWN_STMT and CILK_SYNC_STMT cases.
* semantics.c (potential_constant_expression_1): Likewise.
(finish_call_expr): Stored the lambda function to a variable when Cilk
Plus is enabled.
* typeck.c (cp_build_compound_expr): Reject a spawned function in a
compound expression.
(check_return_expr): Reject a spawned function in a return expression.

gcc/testsuite/ChangeLog
2013-11-17  Balaji V. Iyer  

* g++.dg/cilk-plus/CK/catch_exc.cc: New test case.
* g++.dg/cilk-plus/CK/const_spawn.cc: Likewise.
* g++.dg/cilk-plus/CK/fib-opr-overload.cc: Likewise.
* g++.dg/cilk-plus/CK/fib-tplt.cc: Likewise.
* g++.dg/cilk-plus/CK/lambda_spawns.cc: Likewise.
* g++.dg/cilk-plus/CK/lambda_spawns_tplt.cc: Likewise.
* g++.dg/cilk-plus/cilk-plus.exp: Added support to run Cilk Keywords
test stored in c-c++-common.  Also, added the Cilk runtime's library
to the ld_library_path.

Thanks,

Balaji V. Iyer.
diff --git a/gcc/cp/Make-lang.in b/gcc/cp/Make-lang.in
index 424f2e6..e046ee3 100644
--- a/gcc/cp/Make-lang.in
+++ b/gcc/cp/Make-lang.in
@@ -77,7 +77,7 @@ CXX_AND_OBJCXX_OBJS = cp/call.o cp/decl.o cp/expr.o cp/pt.o 
cp/typeck2.o \
  cp/search.o cp/semantics.o cp/tree.o cp/repo.o cp/dump.o cp/optimize.o \
  cp/mangle.o cp/cp-objcp-common.o cp/name-lookup.o cp/cxx-pretty-print.o \
  cp/cp-cilkplus.o \
- cp/cp-gimplify.o cp/cp-array-notation.o cp/lambda.o \
+ cp/cp-gimplify.o cp/cp-array-notation.o cp/lambda.o cp/cp-cilk.o \
  cp/vtable-class-hierarchy.o $(CXX_C_OBJS)
 
 # Language-specific object files for C++.
diff --git a/gcc/cp/cp-cilk.c b/gcc/cp/cp-cilk.c
new file mode 100644
index 000..0da95e8
--- /dev/null
+++ b/gcc/cp/cp-cilk.c
@@ -0,0 +1,118 @@
+/* Functions to handle Cilk keywords support in C++.
+   Copyright (C) 2011-2013  Free Software Foundation, Inc.
+   Contributed by Balaji V. Iyer ,
+   Intel Corporation.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful, but
+   WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3.  If not see
+   <http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "cp-tree.h"
+#include "tree-iterator.h"
+#include "cilk.h"
+
+/* Sets the EXCEPTION bit (0x10) in the FRAME.flags field.  */
+
+static tree
+set_cilk_except_flag (tree frame)
+{
+  tree flags = cilk_dot (frame, CILK_TI_FRAME_FLAGS, 0);
+
+  flags = build2 (MODIFY_EXPR, void_type_node, flags,
+ build2 (BIT_IOR_EXPR, TREE_TYPE (flags), flags,
+ build_int_cst (TREE_TYPE (flags),
+CILK_FRAME_EXCEPTING)));
+  return flags;
+}
+
+/* Sets the frame.EXCEPT_DATA field to the head of the exception pointer.  */
+
+static tree
+set_cilk_except_data (tree frame)
+{
+  tree except_data = cilk_dot (frame, CILK_TI_FRAME_EXCEPTION, 0);
+  tree uresume_fn = builtin_decl_implicit (BUILT_IN_EH_POINTER);
+  tree ret_expr;
+  uresume_fn  = build_call_expr (uresume_fn, 1,
+build_int_cst (integer_type_node, 0));
+  ret_expr = build2 (MODIFY_EXPR, void_type_node, except_data, uresume_fn);
+  return ret_expr;
+}
+
+/* Installs BO

Re: [PATCH, rs6000] Fix libcpp/lex.c Altivec code to be correct for little endian

2013-11-17 Thread David Edelsohn

On Sun, Nov 17, 2013 at 8:33 PM, Bill Schmidt
 wrote:
> Hi,
>
> As Ulrich Weigand discovered, libcpp/lex.c contains some code optimized
> for use with Altivec that is incorrect for little endian targets.  This
> breaks bootstrap on powerpc64le-unknown-linux-gnu when configured with
> --with-cpu=power7.
>
> This patch makes appropriate modifications for little endian.  The
> transformation of lvsr/vperm(x,y,z) into lvsl/vperm(y,x,z) is familiar
> from a previous patch.  The other obvious change is converting
> count-leading-zeroes into count-trailing-zeroes.
>
> Bootstrapped on powerpc64-unknown-linux-gnu (BE) using --with-cpu=power7
> with no regressions.  Bootstrap for powerpc64le-unknown-linux-gnu (LE)
> using --with-cpu=power7 now completes with this patch.  There are still
> failures for --with-cpu=power7 that are not present using
> --with-cpu=power6 that need to be investigated, but they are unrelated
> to this change.
>
> Ok for trunk?
>
> Thanks,
> Bill
>
>
> 2013-11-17  Bill Schmidt  
>
> * lex.c (search_line_fast): Correct for little endian.

Okay.

Thanks, David
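
The switch from count-leading-zeroes to count-trailing-zeroes mentioned in the quoted description can be illustrated with a scalar analogy. This is a sketch only, not the Altivec code from libcpp/lex.c: the helper names match_mask and first_match, the 16-byte chunk size, and the assumption that unsigned int is 32 bits wide are all mine. The real interfaces used are GCC's __builtin_clz and __builtin_ctz.

```c
#include <stdint.h>

#define CHUNK 16

/* Build a bitmask of the positions in S[0..CHUNK-1] that equal C.
   With MSB_FIRST, byte 0 maps to bit 31 (a big-endian-style layout);
   otherwise byte 0 maps to bit 0 (a little-endian-style layout).  */
static uint32_t
match_mask (const unsigned char *s, unsigned char c, int msb_first)
{
  uint32_t m = 0;
  for (int i = 0; i < CHUNK; i++)
    if (s[i] == c)
      m |= msb_first ? (UINT32_C (1) << (31 - i)) : (UINT32_C (1) << i);
  return m;
}

/* Return the index of the first byte equal to C, or -1 if there is none.
   The first match is the highest set bit in the MSB-first layout, so we
   count leading zeroes; in the LSB-first layout it is the lowest set
   bit, so we count trailing zeroes.  Assumes a 32-bit unsigned int.  */
int
first_match (const unsigned char *s, unsigned char c, int msb_first)
{
  uint32_t m = match_mask (s, c, msb_first);
  if (m == 0)
    return -1;   /* both built-ins are undefined for a zero argument */
  return msb_first ? __builtin_clz (m) : __builtin_ctz (m);
}
```

Both layouts report the same index for the same input; only the bit-scanning direction changes with the mapping of bytes to bits.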

Re: [RFA][PATCH]Fix 59019

2013-11-17 Thread Jeff Law


On 11/17/13 04:28, Steven Bosscher wrote:
>> TRAP_CONDITION (PATTERN (i3)) == const1_rtx
>
> But shouldn't the check be on const_true_rtx? Or does combine put a
> const1_rtx there?

I took const1_rtx from control_flow_insn_p.  That's ultimately what we 
need to be consistent with.

>> +{
>> +  basic_block bb = BLOCK_FOR_INSN (i3);
>> +  rtx last = get_last_bb_insn (bb);
>
> This won't work, get_last_bb_insn() is intended to be used only in
> cfgrtl mode and "combine" works in cfglayout mode. If you use it in
> cfglayout mode on a block that ends in a tablejump, you get back the
> JUMP_TABLE_DATA insn that is in BB_FOOTER and there is no NEXT_INSN
> path from BB_END to any insns in the footer.
>
> rtx last = BB_END (bb);
>
> Any dead jump tables will be dealt with later.

OK.

>> +  /* First remove all the insns after the trap.  */
>> +  if (i3 != last)
>> +   delete_insn_chain (NEXT_INSN (i3), last, true);
>> +
>> +  /* And ensure there's no outgoing edges anymore.  */
>> +  while (EDGE_COUNT (bb->succs) > 0)
>> +   remove_edge (EDGE_SUCC (bb, 0));
>
> Alternatively, you could do "split_block (bb, i3);" and let cfgcleanup
> deal with the new, unreachable basic block.

My first iteration split the block and let cfgcleanup take care of the 
rest.  That seemed wasteful when all we really need to do is zap the 
trailing instructions.

>> +  /* And ensure cfglayout knows this block does not fall through.  */
>> +  emit_barrier_after_bb (bb);
>
> Bah... Emitting the barrier is necessary here because
> fixup_reorder_chain doesn't handle cases where a basic block is a dead
> end. That is actually a bug in fixup_reorder_chain: Other passes could
> create dead ends in the CFG in cfglayout mode and not emit a barrier
> into BB_FOOTER, and fixup_reorder_chain wouldn't be able to handle
> that (resulting in verify_flow_info failure).

Umm, no.  Failure to emit the barrier will result in a checking failure. 
Been there, done that.

> Would you mind if I try to spend some time making conditional traps be
> control flow insns? It should make all of this a little bit less ugly.
> And I have no fish to fry at all :-) Give me a week or two please, to
> see if I can figure out those issues you've been running into.

Feel free.  I'm not terribly concerned about this issue right now.

To trigger, use the test in 59019 with an itanic cross compiler and 
comment out these two lines from gimple-ssa-isolate-paths.c:

  TREE_THIS_VOLATILE (op) = 1;
  TREE_SIDE_EFFECTS (op) = 1;

jeff

Re: [0/10] Replace host_integerp and tree_low_cst

2013-11-17 Thread Jeff Law


On 11/16/13 05:53, Richard Sandiford wrote:
> After the patch that went in yesterday, all calls to host_integerp and
> tree_low_cst pass a constant "pos" argument.  This series replaces each
> function with two separate ones:

[ ... ]

So I've almost entirely ignored the whole wide-int conversion discussion 
and I suspect I'm not entirely alone.

Can you briefly summarize what y'all are trying to accomplish with the 
wide-int changes?


jeff

[PATCH] Fix static libasan link

2013-11-17 Thread Yury Gribov


Hi,

This patch is supposed to fix PR59106 
(http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59106).


The bug manifests when we link static sanitizer libs (asan, tsan or 
ubsan) against pure C programs.


The patch adds -fno-rtti to the AM_CXXFLAGS of the sanitizer runtime libs (based on 
Kcc's recommendation). This does not fix libubsan, but that library does not seem 
to support static linking anyway.


Tested against gcc asan testsuite on x86_64.

-Y
diff --git a/libsanitizer/asan/Makefile.am b/libsanitizer/asan/Makefile.am
index 8764007..0e1ee11 100644
--- a/libsanitizer/asan/Makefile.am
+++ b/libsanitizer/asan/Makefile.am
@@ -7,7 +7,7 @@ DEFS = -D_GNU_SOURCE -D_DEBUG -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D
 if USING_MAC_INTERPOSE
 DEFS += -DMAC_INTERPOSE_FUNCTIONS -DMISSING_BLOCKS_SUPPORT
 endif
-AM_CXXFLAGS = -Wall -W -Wno-unused-parameter -Wwrite-strings -pedantic -Wno-long-long  -fPIC -fno-builtin -fno-exceptions -fomit-frame-pointer -funwind-tables -fvisibility=hidden -Wno-variadic-macros
+AM_CXXFLAGS = -Wall -W -Wno-unused-parameter -Wwrite-strings -pedantic -Wno-long-long  -fPIC -fno-builtin -fno-exceptions -fno-rtti -fomit-frame-pointer -funwind-tables -fvisibility=hidden -Wno-variadic-macros
 AM_CXXFLAGS += $(LIBSTDCXX_RAW_CXX_CXXFLAGS)
 ACLOCAL_AMFLAGS = -I $(top_srcdir) -I $(top_srcdir)/config
 
diff --git a/libsanitizer/asan/Makefile.in b/libsanitizer/asan/Makefile.in
index be8b879..c604474 100644
--- a/libsanitizer/asan/Makefile.in
+++ b/libsanitizer/asan/Makefile.in
@@ -254,7 +254,7 @@ AM_CPPFLAGS = -I $(top_srcdir)/include -I $(top_srcdir)
 # May be used by toolexeclibdir.
 gcc_version := $(shell cat $(top_srcdir)/../gcc/BASE-VER)
 AM_CXXFLAGS = -Wall -W -Wno-unused-parameter -Wwrite-strings -pedantic \
-	-Wno-long-long -fPIC -fno-builtin -fno-exceptions \
+	-Wno-long-long -fPIC -fno-builtin -fno-exceptions -fno-rtti \
 	-fomit-frame-pointer -funwind-tables -fvisibility=hidden \
 	-Wno-variadic-macros $(LIBSTDCXX_RAW_CXX_CXXFLAGS)
 ACLOCAL_AMFLAGS = -I $(top_srcdir) -I $(top_srcdir)/config
diff --git a/libsanitizer/interception/Makefile.am b/libsanitizer/interception/Makefile.am
index 4218983..e9fbe6a 100644
--- a/libsanitizer/interception/Makefile.am
+++ b/libsanitizer/interception/Makefile.am
@@ -4,7 +4,7 @@ AM_CPPFLAGS = -I $(top_srcdir)/include -I $(top_srcdir)
 gcc_version := $(shell cat $(top_srcdir)/../gcc/BASE-VER)
 
 DEFS = -D_GNU_SOURCE -D_DEBUG -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS 
-AM_CXXFLAGS = -Wall -W -Wno-unused-parameter -Wwrite-strings -pedantic -Wno-long-long  -fPIC -fno-builtin -fno-exceptions -fomit-frame-pointer -funwind-tables -fvisibility=hidden -Wno-variadic-macros
+AM_CXXFLAGS = -Wall -W -Wno-unused-parameter -Wwrite-strings -pedantic -Wno-long-long  -fPIC -fno-builtin -fno-exceptions -fno-rtti -fomit-frame-pointer -funwind-tables -fvisibility=hidden -Wno-variadic-macros
 AM_CXXFLAGS += $(LIBSTDCXX_RAW_CXX_CXXFLAGS)
 ACLOCAL_AMFLAGS = -I m4
 
diff --git a/libsanitizer/interception/Makefile.in b/libsanitizer/interception/Makefile.in
index 59b9a9a..f3a2f41 100644
--- a/libsanitizer/interception/Makefile.in
+++ b/libsanitizer/interception/Makefile.in
@@ -211,7 +211,7 @@ AM_CPPFLAGS = -I $(top_srcdir)/include -I $(top_srcdir)
 # May be used by toolexeclibdir.
 gcc_version := $(shell cat $(top_srcdir)/../gcc/BASE-VER)
 AM_CXXFLAGS = -Wall -W -Wno-unused-parameter -Wwrite-strings -pedantic \
-	-Wno-long-long -fPIC -fno-builtin -fno-exceptions \
+	-Wno-long-long -fPIC -fno-builtin -fno-exceptions -fno-rtti \
 	-fomit-frame-pointer -funwind-tables -fvisibility=hidden \
 	-Wno-variadic-macros $(LIBSTDCXX_RAW_CXX_CXXFLAGS)
 ACLOCAL_AMFLAGS = -I m4
diff --git a/libsanitizer/lsan/Makefile.am b/libsanitizer/lsan/Makefile.am
index 5c8726f..3d500f3 100644
--- a/libsanitizer/lsan/Makefile.am
+++ b/libsanitizer/lsan/Makefile.am
@@ -4,7 +4,7 @@ AM_CPPFLAGS = -I $(top_srcdir)/include -I $(top_srcdir)
 gcc_version := $(shell cat $(top_srcdir)/../gcc/BASE-VER)
 
 DEFS = -D_GNU_SOURCE -D_DEBUG -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS 
-AM_CXXFLAGS = -Wall -W -Wno-unused-parameter -Wwrite-strings -pedantic -Wno-long-long  -fPIC -fno-builtin -fno-exceptions -fomit-frame-pointer -funwind-tables -fvisibility=hidden -Wno-variadic-macros
+AM_CXXFLAGS = -Wall -W -Wno-unused-parameter -Wwrite-strings -pedantic -Wno-long-long  -fPIC -fno-builtin -fno-exceptions -fno-rtti -fomit-frame-pointer -funwind-tables -fvisibility=hidden -Wno-variadic-macros
 AM_CXXFLAGS += $(LIBSTDCXX_RAW_CXX_CXXFLAGS)
 ACLOCAL_AMFLAGS = -I m4
 
diff --git a/libsanitizer/lsan/Makefile.in b/libsanitizer/lsan/Makefile.in
index e01f65b..c5c07e7 100644
--- a/libsanitizer/lsan/Makefile.in
+++ b/libsanitizer/lsan/Makefile.in
@@ -210,7 +210,7 @@ AM_CPPFLAGS = -I $(top_srcdir)/include -I $(top_srcdir)
 # May be used by toolexeclibdir.
 gcc_version := $(shell cat $(top_srcdir)/../gcc/BASE-VER)
 AM_CXXFLAGS = -Wall -W -Wno-unused-parameter -Wwrite-strings -pedantic \
-	-Wno-long-long

Re: [ia64] [PR target/57491] internal compiler error: in ia64_split_tmode -O2, quadmath

2013-11-17 Thread Kirill Yukhin

On 16 Nov 10:03, Eric Botcazou wrote:
> > As far as I understand semantics of this insn:
> >   (insn 200 199 0 (set (reg:DI 15 r15)
> >   (mem:DI (post_dec:DI (reg/f:DI 15 r15 [447])) [3
> >   *_61[_12]{lb: 1 sz: 64}.text+8 S8 A64])) -1 (nil))
> > What is done is (in that sequence).
> >   1. Calculate address of MEM: get r15 value.
> >   2. Decrement r15 value.
> >   3. Load MEM in to r15.
> > 
> > Point 2 is useless as we kill it by 3.
> > So, it is clobbered and as mention in comment this is sometimes ok to
> > override pointer with pointer value.
> 
> That depends on the semantics of the hardware instruction though, does it 
> really guarantee 1/2/3 in that order?
I've read the IA-64 spec, and it states that if the destination and
address registers are the same in the post-update form, then
`Illegal instruction' should be raised.
This does not affect my recent patch.

> The patch looks good to me if you also adjust the last sentence in the 
> comment 
> just above the block:
> 
>   /* It is possible for reload to decide to overwrite a pointer with
>  the value it points to.  In that case we have to do the loads in
>  the appropriate order so that the pointer is not destroyed too
>  early.  Also we must not generate a postmodify for that second
>  load, or rws_access_regno will die.  */
> 
> Something like "And we must not generate a postmodify for the second load if
> the destination register overlaps with the base register".
Thanks, I'll check it in with proposed comment fix today.
Bootstrap (with all languages) passes.

--
Thanks, K
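
The 1/2/3 reading of the RTL discussed above can be made concrete with a toy model. This is a sketch of the RTL-level interpretation only, not of IA-64 hardware (which, per the message above, raises `Illegal instruction' when destination and base coincide in the post-update form). The struct machine, the register and memory sizes, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define NREGS 32
#define MEMWORDS 16

struct machine
{
  uint64_t reg[NREGS];
  uint64_t mem[MEMWORDS];   /* backing store, one entry per 8-byte word */
};

/* Toy model of (set (reg DST) (mem (post_dec (reg BASE)))), applying the
   three steps in the order listed in the quoted message.  */
static void
load_post_dec (struct machine *m, int dst, int base)
{
  uint64_t addr = m->reg[base];            /* 1. address of MEM */
  assert (addr % 8 == 0 && addr / 8 < MEMWORDS);
  m->reg[base] = addr - 8;                 /* 2. decrement the base */
  m->reg[dst] = m->mem[addr / 8];          /* 3. load into the destination */
}

/* Load from the address held in r15 into register DST and return the
   value left in r15 afterwards.  */
uint64_t
demo_base_after_load (int dst)
{
  struct machine m = { { 0 }, { 0 } };
  m.mem[2] = 0x1234;
  m.reg[15] = 16;
  load_post_dec (&m, dst, 15);
  return m.reg[15];
}
```

With a distinct destination, r15 ends up decremented by 8. With r15 as the destination, step 3 overwrites the result of step 2, which is why the post-decrement is useless in that case.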

Re: Group static constructors and destructors in specific subsections

2013-11-17 Thread Martin Liška

Dear Cary,
   I've been merging my patches to GCC mainline and I would really
appreciate the new section naming convention that you suggested in
the previous post. Is there any progress on the implementation? Should I
participate in this change and write a patch that will introduce this
new section model?

Thank you,
Martin

On 20 July 2013 14:58, Martin Liška  wrote:
> On 17 July 2013 20:22, Cary Coutant  wrote:
>>>>> Yep, the problem is where to produce the section ordering file.
>>>>> The scheme is as follows:
>>>>>   - with -fprofile-generate instrument every function entry point and record
>>>>>     time of first and last invocation of the function
>>>>>   - At compile time we take functions that are executed during the startup
>>>>>     and we want to order them in the increasing order of the first invocation
>>>>>     time measured at FDO time. So we know the relative position of given function
>>>>>     in the program, but not the complete function order.
>>>>
>>>> Perhaps I misunderstand, but you can use --section-ordering-file
>>>> without knowing the complete function ordering.  Just specify the
>>>> functions you care about.
>>>
>>> The thing is that when compiling given object file, you know only functions in
>>> that object file, so you can not produce full --section-ordering-file.  We
>>> would need a tool collecting the partial orders from all objects into a single file,
>>> which I think may be just done in the linker.
>>
>> How granular a solution do you need? If you need something fine-grain,
>> like microseconds since startup, we'd also need some way of ensuring
>> that all compilation units are using the same scale. What if someone
>> else wants to order by execution count instead? We could do something
>> coarse-grain by adding a few more "buckets" after "unlikely", "exit",
>> "startup", and "hot", but you probably would need to see the whole
>> program before you could translate something like time-since-startup
>> into a bucket.
>
> I cooperate on function reordering with Jan; we are primarily motivated
> to reorder all functions called during startup. I made a small
> observation: e.g. Inkscape has about 14K functions, of which 2.5K (~20%)
> are called during startup. Apart from that, we would also like to
> reorder the rest of the functions that are not in the first collection of
> functions (e.g. according to edge call graph profile). Thus, it would
> be nice if we could de facto set up an order for all functions.
>
>> In another old thread, I suggested modifying the section naming
>> convention to remove the ambiguity between a function named "unlikely"
>> compiled with -ffunction-sections, and an arbitrary function placed
>> into the "unlikely" bucket. Namely, instead of using
>> ".text.function_name" and ".text.bucket", we combine these into
>> ".text.bucket.function_name". Without -ffunction-sections, we'd just
>> have ".text.bucket" like we do today, but with -ffunction-sections,
>> we'd have ".text..function_name" in the case where there is no bucket.
>> In order to distinguish between old and new conventions, I'd amend
>> that suggestion to use a different set of delimiters -- perhaps
>> ".text[bucket](function_name)". That at least makes it more obvious
>> that the input section goes into an output section named ".text", and
>> we can have a general rule rather than the collection of special cases
>> we have now.
>>
>> To support your use case, we could allow, in addition to the four
>> buckets we already have defined, numeric buckets ranging from, say, 0
>> to 9. You could map whatever ordering criterion you want to use
>> into that range, and the linker would order the text sections by
>> bucket, placing the numbered buckets after "hot" and before all the
>> unbucketed sections. I might further suggest moving "unlikely" to the
>> end, after all the unbucketed sections.
>
> I would enhance the capacity of such ranging buckets (f.e. Firefox
> does have more than 10^5 functions).
>
>> (I can't believe I'm suggesting this -- I don't like the increasing
>> effect that section names have on the linker behavior, but I don't
>> think we really have any better options in ELF.)
>>
>> As an aside, is there any reason why the function name must appear in
>> the section name when we use -ffunction-sections? ELF doesn't require
>> sections to have unique names, so they could all be named ".text". We
>> could do section reordering based on the symbol names rather than the
>> section names, so it's not necessary for reordering. As far as I can
>> tell, it's just an assembler limitation, which we could fix by
>> modifying the syntax of the .section directive to allow both an
>> assembler name and a linker name. All those function names just bloat
>> the section string table for no good reason.
>
> It's not necessary to produce function sections, but it's a question for
> Jan and other compiler developers: how could symbol renaming help
> with function ordering?
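
To make the dotted naming convention from the quoted proposal concrete, here is a small sketch that only formats names. It assumes the ".text.bucket.function_name" spelling described above, including ".text..function_name" when there is no bucket. The function names text_section_name and section_name_is are hypothetical, and nothing here reflects an actual GCC or linker interface. The bracketed variant is left out because the message does not say how an empty bucket would be spelled in it.

```c
#include <stdio.h>
#include <string.h>

/* Write the section name for FUNC in BUCKET into BUF (at most SIZE bytes).
   BUCKET may be NULL or "" when the function is in no bucket; FUNC is
   NULL when compiling without -ffunction-sections.  Returns what
   snprintf returns.  */
int
text_section_name (char *buf, size_t size, const char *bucket,
                   const char *func)
{
  if (bucket == NULL)
    bucket = "";
  if (func != NULL)
    return snprintf (buf, size, ".text.%s.%s", bucket, func);
  if (bucket[0] != '\0')
    return snprintf (buf, size, ".text.%s", bucket);
  return snprintf (buf, size, ".text");
}

/* Convenience check: nonzero if the generated name equals EXPECTED.  */
int
section_name_is (const char *bucket, const char *func, const char *expected)
{
  char buf[128];
  text_section_name (buf, sizeof buf, bucket, func);
  return strcmp (buf, expected) == 0;
}
```

Under this scheme a function that happens to be named "unlikely" gets ".text..unlikely", while the unlikely bucket itself is ".text.unlikely", so the two no longer collide. A numbered bucket would simply be passed as a string such as "3".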

Re: [PATCH] Time profiler - phase 2

2013-11-17 Thread Martin Liška

Hello,
   there's a new version of the patch. I wrote an email to Cary to
negotiate how to implement the gold linker patch.

Thanks,
Martin

On 18 November 2013 00:37, Jan Hubicka  wrote:
>> Hello,
>>there is a new version of the patch; I disabled the branch with
>> profile-generate. Could you please advise me how I should force
>> profile-reorder-functions to be used only when LTO optimization is enabled?
>>
>> I also attach reordering results:
>> o gimp-reoder-latest.html (latest patch)
>> o gimp-reoder-without-fix.html (without the code in gcc/predict.c)
>>
>> Thank you,
>> Martin
>> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
>> index 5cb07b7..754f882 100644
>> --- a/gcc/ChangeLog
>> +++ b/gcc/ChangeLog
>> @@ -1,3 +1,13 @@
>> +2013-11-17  Martin Liska  
>> + Jan Hubicka  
>> +
>> + * cgraphunit.c (node_cmp): New function.
>> + (expand_all_functions): Function ordering added.
>> + * common.opt: New profile based function reordering flag introduced.
>> + * lto-partition.c: Support for time profile added.
>> + * lto.c: Likewise.
>> + * predict.c (handle_missing_profiles): Time profile handled in
>> +   missing profiles.
>
> OK,
> thanks!
>> @@ -8645,7 +8645,7 @@ profile useful for later recompilation with profile 
>> feedback based
>>  optimization.  You must use @option{-fprofile-generate} both when
>>  compiling and when linking your program.
>>
>> -The following options are enabled: @code{-fprofile-arcs}, 
>> @code{-fprofile-values}, @code{-fvpt}.
>> +The following options are enabled: @code{-fprofile-arcs}, 
>> @code{-fprofile-values}, @code{-fprofile-reorder-functions}, @code{-fvpt}.
>>
>>  If @var{path} is specified, GCC looks at the @var{path} to find
>>  the profile feedback data files. See @option{-fprofile-dir}.
>
> Skip this change, it is only about the profiling options used.  I think it is 
> enough to mention it later.
>> @@ -8933,6 +8933,14 @@ from profiling values of expressions for usage in 
>> optimizations.
>>
>>  Enabled with @option{-fprofile-generate} and @option{-fprofile-use}.
>>
>> +@item -fprofile-reorder-functions
>> +@opindex fprofile-reorder-functions
>> +Function reordering based on profile instrumentation collects
>> +first time of execution of a function and orders these functions
>> +in ascending order.
>> +
>> +Enabled with @option{-fprofile-generate} and @option{-fprofile-use}.
>
> Only with -fprofile-use.
>
> What happened with the plans for linker support? Perhaps we can implement the 
> numbered sections from Cary's proposal and hope that binutils will catch up in 
> the next release?
>
> Honza
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 5cb07b7..754f882 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,13 @@
+2013-11-17  Martin Liska  
+	Jan Hubicka  
+
+	* cgraphunit.c (node_cmp): New function.
+	(expand_all_functions): Function ordering added.
+	* common.opt: New profile based function reordering flag introduced.
+	* lto-partition.c: Support for time profile added.
+	* lto.c: Likewise.
+	* predict.c (handle_missing_profiles): Time profile handled in
+	  missing profiles.
 2013-11-16  Joern Rennecke  
 
 	* config/arc/arc.c (arc_predicate_delay_insns): New function.
diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
index 8ab274b..ea722b8 100644
--- a/gcc/cgraphunit.c
+++ b/gcc/cgraphunit.c
@@ -1824,6 +1824,19 @@ expand_function (struct cgraph_node *node)
   ipa_remove_all_references (&node->ref_list);
 }
 
+/* Node comparer that is responsible for the order that corresponds
+   to time when a function was launched for the first time.  */
+
+static int
+node_cmp (const void *pa, const void *pb)
+{
+  const struct cgraph_node *a = *(const struct cgraph_node * const *) pa;
+  const struct cgraph_node *b = *(const struct cgraph_node * const *) pb;
+
+  return a->tp_first_run != b->tp_first_run
+	 ? b->tp_first_run - a->tp_first_run
+	 : b->order - a->order;
+}
 
 /* Expand all functions that must be output.
 
@@ -1835,11 +1848,14 @@ expand_function (struct cgraph_node *node)
to use subsections to make the output functions appear in top-down
order).  */
 
+
 static void
 expand_all_functions (void)
 {
   struct cgraph_node *node;
   struct cgraph_node **order = XCNEWVEC (struct cgraph_node *, cgraph_n_nodes);
+
+  unsigned int expanded_func_count = 0, profiled_func_count = 0;
   int order_pos, new_order_pos = 0;
   int i;
 
@@ -1852,19 +1868,35 @@ expand_all_functions (void)
 if (order[i]->process)
   order[new_order_pos++] = order[i];
 
+  if (flag_profile_reorder_functions)
+qsort (order, new_order_pos, sizeof (struct cgraph_node *), node_cmp);
+
   for (i = new_order_pos - 1; i >= 0; i--)
 {
   node = order[i];
+
   if (node->process)
 	{
+ expanded_func_count++;
+ if (node->tp_first_run)
+   profiled_func_count++;
+
 	  node->process = 0;
 	  expand_function (node);
 	}
 }
+
+if (in_lto_p && dump_file)
+  fprintf (dump_file, "Expanded functions with time profile (%s):%u/%u\n",

Re: [PATCH] Fix static libasan link

2013-11-17 Thread Jakub Jelinek

On Mon, Nov 18, 2013 at 10:45:16AM +0400, Yury Gribov wrote:
> 2013-11-18  Yury Gribov  
> 
>   PR sanitizer/59106
>   * asan/Makefile.am: Disable RTTI.

* asan/Makefile.am (AM_CXXFLAGS): Add -fno-rtti.

>   * interception/Makefile.am: Likewise.

* interception/Makefile.am (AM_CXXFLAGS): Likewise.

>   * lsan/Makefile.am: Likewise.
>   * sanitizer_common/Makefile.am: Likewise.
>   * tsan/Makefile.am: Likewise.

Likewise for the above 3.

>   * asan/Makefile.in: Regenerate.
>   * interception/Makefile.in: Regenerate.
>   * tsan/Makefile.in: Regenerate.
>   * lsan/Makefile.in: Regenerate.
>   * sanitizer_common/Makefile.in: Regenerate.
> 

> 2013-11-18  Yury Gribov  
> 
>   PR sanitizer/59106
>   * c-c++-common/asan/pr59106.c: New test.

Ok with those changes.

Jakub

