The following patch can fix an ICE when compiling with LIPO. OK for google-4_9?
Thanks,
Dehao
Index: gcc/l-ipo.c
===
--- gcc/l-ipo.c (revision 225685)
+++ gcc/l-ipo.c (working copy)
@@ -731,6 +731,7 @@ lipo_cmp_type (tree t1, tree
. Any comments?
Bootstrapped and test on-going.
OK for trunk?
Thanks,
Dehao
ChangeLog:
2015-06-23 Dehao Chen
* opts.c (finish_options): Disable reorder_blocks_and_partition for DWARF2.
Index: opts.c
===
--- opts.c (revision 2
ok for google branch.
Dehao
On Tue, Mar 3, 2015 at 12:26 PM, Cary Coutant wrote:
>>> @@ -21817,22 +21823,39 @@ out_subprog_directive (subprog_entry *su
>>> {
>>>tree decl = subprog->decl;
>>>tree decl_name = DECL_NAME (decl);
>
TER (DECL_ASSEMBLER_NAME (origin));
> + if (name[0] == '*')
> + name++;
> + }
> + else
> + name = dwarf2_name (origin, 0);
> }
>else
> -name = dwarf2_name (decl, 0);
> +{
> + /* To save space, we don't
ok.
Dehao
On Mon, Feb 23, 2015 at 11:02 AM, Cary Coutant wrote:
> Minor changes to -ftwo-level-line-tables.
>
> This patch is for the google/gcc-4_9 branch.
>
> Originally, -ftwo-level-line-tables would output .subprog directives
> only for inlined subprograms, and not fo
The offset overflow warning causes build failures when a function's
start line is missing (0). Until the start line issue is fixed, we
will suppress this warning.
Testing on-going. OK for google-4_9?
Thanks,
Dehao
Index: gcc/auto-prof
patch is ok for google branch.
Dehao
On Thu, Jan 29, 2015 at 1:11 PM, Cary Coutant wrote:
> Here's a very slightly revised patch, fixing a couple of bugs found
> during GDB testing.
>
> In out_logical_entry, I should pass along the value of is_stmt when
> creating a log
ble for each
> block_num in the function tree. But two or more blocks may map to a
> single logical, and some blocks may not correspond to a logical at all
> -- if dwarf2out_source_line() is never called for a block, I'll never
> create a logical for it.
I don't understand why multiple blocks may map to a single
logical_entry. Can you give an example?
Thanks,
Dehao
>
> -cary
>> +static hash_table *block_table;
> >
> > Not quite clear why we need block_table. This table is not gonna be
> > emitted. And we can easily get subprog_entry through block->block_num
>
> When final_scan_insn() calls dwarf2out_begin_block(), all it pa
On Sun, Jan 25, 2015 at 6:06 PM, Cary Coutant wrote:
> Add -ftwo-level-line-tables and -gline-tables-only options.
>
> With -ftwo-level-line-tables, GCC will generate two-level line tables,
> which adds inline call information to the line tables, obviating the
> need to keep bulky debug info aroun
promote the
indirect call anyway.
Dehao
On Tue, Dec 16, 2014 at 2:45 PM, Xinliang David Li wrote:
> Does it paper over the real bug?
>
> David
>
> On Tue, Dec 16, 2014 at 2:38 PM, Dehao Chen wrote:
>> This patch fixes the bug for undefined symbol in AutoFDO build.
>>
ping...
Thanks,
Dehao
On Tue, Nov 18, 2014 at 2:29 PM, Dehao Chen wrote:
> This patch updates ssa and inline summary in the correct location for AutoFDO.
>
> Bootstrapped and passed regression test. OK for trunk?
>
> Thanks,
> Dehao
>
> gcc/ChangeLog:
This patch fixes the bug for undefined symbol in AutoFDO build.
Testing on going. OK for google-4_9 branch?
Thanks,
Dehao
Index: gcc/auto-profile.c
===
--- gcc/auto-profile.c (revision 218784)
+++ gcc/auto-profile.c (working copy
This patch syncs google-4_9 autofdo implementation to trunk (as much
as possible).
Bootstrapped and passed regression test and performance test.
OK for google-4_9?
Thanks,
Dehao
Index: gcc/auto-profile.c
===
--- gcc/auto-profile.c
This patch updates ssa and inline summary in the correct location for AutoFDO.
Bootstrapped and passed regression test. OK for trunk?
Thanks,
Dehao
gcc/ChangeLog:
2014-11-18 Dehao Chen
* auto-profile.c (afdo_annotate_cfg): Invoke update_ssa in the right
place
There are actually two patches needed to port to mainline. I'll send
out the patch tomorrow.
Dehao
On Mon, Nov 17, 2014 at 4:58 PM, Andi Kleen wrote:
> Xinliang David Li writes:
>
>> OK for now as a workaround, but this is probably not a long-term fix.
>
> Is the wo
The patch was updated to ignore comdat einline tuning for AutoFDO.
Performance testing is green.
OK for google-4_9?
Thanks,
Dehao
Index: gcc/auto-profile.c
===
--- gcc/auto-profile.c (revision 217523)
+++ gcc/auto-profile.c
We do not do sophisticated recursive call detection in the einline phase.
It only happens in the ipa-inline phase.
Dehao
On Thu, Nov 13, 2014 at 3:18 PM, Xinliang David Li wrote:
> On Thu, Nov 13, 2014 at 2:57 PM, Dehao Chen wrote:
>> IIRC, the actual iteration for AutoFDO is mostly
this case, recomputing inline summary does not help because the
code was bloated in first einline phase.
Dehao
>
> David
>
> On Thu, Nov 13, 2014 at 2:48 PM, Xinliang David Li wrote:
>> Is there a need to have 10 iterations of early inline for autofdo?
>>
>> David
>>
function, we need to
recompute inline parameters because rebuild_cgraph_edges will zero out
all inline parameters.
The patch is attached below, bootstrapped and perf test on-going. OK
for google-4_9?
Thanks,
Dehao
Index: gcc/auto-profile.c
The patch tested OK. I think it's a trivial patch, so I have already
committed it to trunk.
About the perf parser: I'm syncing the toolchain to head, which should
already have newer kernel support.
Thanks,
Dehao
On Wed, Oct 22, 2014 at 10:07 AM, Xinliang David Li wrote:
> Can someon
It looks like the perf data type is incompatible with quipper (the perf
data parser). Can you send me the perf.data file so that I can take a look?
Thanks,
Dehao
On Tue, Oct 21, 2014 at 2:25 PM, Markus Trippelsdorf
wrote:
> On 2014.10.21 at 13:53 -0700, Dehao Chen wrote:
>> Everything will be
non-Intel CPU. But you are
more than welcome to tune the propagation algorithm to get the most out of
an inaccurate instruction profile.
Cheers,
Dehao
On Tue, Oct 21, 2014 at 12:30 PM, Markus Trippelsdorf
wrote:
> On 2014.10.20 at 14:21 -0700, Dehao Chen wrote:
>> >> +If @var{path} is sp
The updated patch is attached. I will commit it in 2-3 hours if no
objections are received.
Thanks,
Dehao
On Sun, Oct 19, 2014 at 2:58 AM, Jan Hubicka wrote:
>> >> +/* Member functions for string_table. */
>> >> +
>> >> +string_table *
>> >> +str
Hi, Honza,
I've integrated all your comments into the patch. New patch attached.
Thanks,
Dehao
On Wed, Oct 15, 2014 at 7:28 AM, Jan Hubicka wrote:
>> Index: gcc/cfgloop.c
>> ===
>> --- gcc/cfgloop
ly_inline ();
autofdo::afdo_annotate_cfg (promoted_stmts);
compute_function_frequency ();
- update_ssa (TODO_update_ssa);
/* Local pure-const may imply need to fixup the cfg. */
if (execute_fixup_cfg () & TODO_cleanup_cfg)
Dehao
On Wed, Oct 15, 2014 at 10:50 AM, Xinl
This patch recalculates dominance info before the update_ssa call in
AutoFDO. This fixes a bug where dominance info is out of date and causes
segfaults during update_ssa.
Bootstrapped and regression test on-going.
OK for google-4_9 branch?
Thanks,
Dehao
Index: gcc/auto-profile.c
The new patch is attached. I used clang-format to format auto-profile.{c|h}.
Thanks,
Dehao
On Tue, Oct 14, 2014 at 2:05 PM, Dehao Chen wrote:
> On Tue, Oct 14, 2014 at 8:02 AM, Jan Hubicka wrote:
>>> Index: gcc/cg
let follow-up logic to
decide if it needs to promote and inline.
And you are right, for the "before annotation" case, we can simply
call "mark speculative" and "inline". But we still need the logic to
fake histogram for "after annotation"
r fixing
>> > broken profiles, perhaps
>> > it could be useful here?
>>
>> The initial SampleFDO implementation uses MCF algorithm to calculate
>> edge counts (the current mcf.c actually comes from that effort).
>> However, from our performance tuning effort, we found that MCF is
>> overkill for AutoFDO purposes, and it's extremely slow and hard to
>> debug. Thus we implemented this ad hoc heuristic for propagation.
>
> OK, however I do not see why we do not share one of these two solutions for
> both cases. Again, something that may be cleaned up later.
> How slow is MCF in practice?
It depends. For large functions it is extremely slow, so we need to
limit the number of iterations. As a result, the graph after MCF may
still be inconsistent.
My ad hoc algorithm does not aim to make the flow consistent; it aims to
guess the right branch probability from limited BB counts. MCF can do
the same thing, but if it reaches the maximum iteration threshold, the
impact is severe.
>>
>> >> +
>> >> +/* Perform value profile transformation using AutoFDO profile. Add the
>> >> + promoted stmts to PROMOTED_STMTS. Return TRUE if there is any
>> >> + indirect call promoted. */
>> >> +
>> >> +static bool
>> >> +afdo_vpt_for_early_inline (stmt_set *promoted_stmts)
>> >
>> > What exactly does this function do? Does it inline according to the
>> > inline decisions made at train run time and recorded in the profile
>> > feedback?
>>
>> As replied above, we need iterative vpt-einline to make sure the IR
>> looks exactly the same as profiling binary before annotation.
>
> So the main reason is that by vpt you turn indirect calls into direct calls?
> Perhaps this can happen during the einline itself (so we do not go through the
> redundant inline transform passes)?
I think it makes sense to add vpt into einline. In this way, we don't
need to expose early_inliner to auto-profile.c
Thanks,
Dehao
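For readers unfamiliar with the idea of guessing a branch probability from limited basic-block counts, here is a minimal sketch. It is not the heuristic in auto-profile.c; the function name guess_branch_prob and the convention of returning -1 for "unknown" are hypothetical. REG_BR_PROB_BASE is given the value 10000, matching GCC's fixed-point probability base.

```c
#include <assert.h>
#include <stdint.h>

/* Fixed-point base for probabilities (10000 == 100%), as in GCC.  */
#define REG_BR_PROB_BASE 10000

/* Hypothetical helper: estimate the probability of taking the edge to a
   successor from the sampled count of the block (BB_COUNT) and of the
   successor (SUCC_COUNT).  Returns -1 when the block has no samples, so
   the caller can fall back to static prediction.  */
static int
guess_branch_prob (int64_t bb_count, int64_t succ_count)
{
  assert (bb_count >= 0 && succ_count >= 0);
  if (bb_count == 0)
    return -1;
  /* Sampled counts can be inconsistent; never report more than 100%.  */
  if (succ_count >= bb_count)
    return REG_BR_PROB_BASE;
  /* Rounded integer division.  */
  return (int) ((succ_count * REG_BR_PROB_BASE + bb_count / 2) / bb_count);
}
```

The clamp reflects the point made in the reply: the sketch does not try to make the flow consistent, it only derives a usable probability from whatever counts are available.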
final version of the
patch). The performance of the new patch is the same as the original
patch.
Thanks,
Dehao
>> Index: gcc/bb-reorder.c
>> ===
>> --- gcc/bb-reorder.c (revision 210180)
>> +++ gc
OK for google-4_8 and google-4_9. David and Teresa may have further comments.
Dehao
On Wed, Aug 6, 2014 at 3:36 PM, Yi Yang wrote:
> This currently puts split sections together again in the specified
> section and breaks DWARF output. This patch disables the partitioning
> for such
This patch replaces getline with fgets so that gcc builds fine on darwin.
Testing on-going. OK for google-4_9 if the test passes?
Thanks,
Dehao
Index: gcc/coverage.c
===
--- gcc/coverage.c (revision 212523)
+++ gcc/coverage.c (working
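As an illustration of the kind of substitution described in this message, the following is a minimal sketch of reading a line with fgets rather than getline. It is not taken from the patch; the helper name read_line_fgets and the caller-supplied fixed-size buffer are assumptions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line from FP into BUF (capacity SIZE)
   using only ISO C fgets, and strip the trailing newline if present.
   Returns 1 if a line was read, 0 on end of file or error.  */
static int
read_line_fgets (FILE *fp, char *buf, size_t size)
{
  assert (fp != NULL && buf != NULL);
  if (size == 0 || fgets (buf, (int) size, fp) == NULL)
    return 0;
  buf[strcspn (buf, "\n")] = '\0';
  return 1;
}
```

Unlike getline, this does not grow the buffer: a line longer than SIZE - 1 characters is returned in pieces across successive calls, so the caller has to pick a buffer large enough for its input.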
OK for google-4_8 after testing.
Thanks,
Dehao
On Tue, Jul 1, 2014 at 1:04 PM, Yi Yang wrote:
> Per offline discussion,
> * do not export function start line number. Instead, hash branch
> offset and discriminator into the "function_hash" (renamed to just
> "hash"
There is no need for fill_invalid_locus_information. Just initialize
every field to 0, and if the location is unknown, do not output this
line.
Dehao
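A minimal sketch of the suggestion above, assuming a simplified record: zero-initialize every field and skip output when the location is unknown. The type locus_sketch, its fields, and format_locus are hypothetical and are not the locus_information_t from the patch under review.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical location record.  A zero-initialized value means
   "unknown location".  */
typedef struct
{
  const char *filename;   /* Borrowed pointer, NULL when unknown.  */
  int line;               /* 0 when unknown.  */
  int discriminator;
} locus_sketch;

/* Format LOC as "file:line" into BUF.  Returns the number of characters
   written, or 0 (writing nothing) when the location is unknown.  */
static int
format_locus (char *buf, size_t size, const locus_sketch *loc)
{
  assert (buf != NULL && loc != NULL);
  if (loc->filename == NULL || loc->line == 0)
    return 0;
  return snprintf (buf, size, "%s:%d", loc->filename, loc->line);
}
```

With this shape, `locus_sketch loc = {0};` is enough to represent an unknown location, so no separate "fill invalid" routine is needed.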
On Mon, Jun 30, 2014 at 4:26 PM, Yi Yang wrote:
> Instead of storing percentages of the branch probabilities, store them
Let's use %d instead of %f (convert manually, and print it as xx%).
Dehao
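One way to read this suggestion is to compute a whole-number percentage with integer arithmetic and print it with %d. The sketch below assumes that reading; the helper name format_percent and the round-to-nearest choice are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write COUNT / TOTAL as a whole-number percentage
   such as "33%" into BUF, using integer arithmetic and %d only.
   Returns the snprintf result.  */
static int
format_percent (char *buf, size_t size, long long count, long long total)
{
  assert (buf != NULL && total > 0 && count >= 0);
  /* Round to nearest instead of truncating.  */
  int percent = (int) ((count * 100 + total / 2) / total);
  return snprintf (buf, size, "%d%%", percent);
}
```

Avoiding floating point here also sidesteps the "%lf" versus "%f" warnings mentioned in the quoted message.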
On Mon, Jun 30, 2014 at 2:06 PM, Yi Yang wrote:
> Fixed.
>
> Also, I spotted some warnings caused by me using "%lf"s in snprintf().
> I changed these to "%f" and tested.
>
>
>
You don't need extra space to store the file name in locus_information_t.
Use a pointer instead.
Dehao
On Mon, Jun 30, 2014 at 1:36 PM, Yi Yang wrote:
>
> I refactored the code and added comments. A bug (prematurely breaking
> from a loop) was fixed during the refactoring.
>
OK for google-4_8 and google-4_9
Thanks,
Dehao
On Tue, Jun 24, 2014 at 3:09 PM, Yi Yang wrote:
> Hi,
>
> This patch removes unnecessary edge probability calculations in
> afdo_propagate_circuit() that would eventually be overridden by
> afdo_calculate_branch_prob().
>
> Th
I think the patch looks good. David and Rong, any comments?
Dehao
On Thu, Jun 12, 2014 at 11:23 AM, Teresa Johnson wrote:
> These two patches fix multiple ICEs that occurred due to DFE being
> recently enabled after AutoFDO LIPO linking.
>
> Passes regression and internal testing. O
e really meaningful -- if you omit the C benchmarks, the
> geomean will be a bit higher. Why, I wonder, is 483 affected so much
> more than 471 and 473?
483 is heavily templated code with very deep inline stacks. And the
function name for 483 is also much longer than 471 and 473.
Dehao
>
ping...
Dehao
On Fri, May 30, 2014 at 4:13 PM, Dehao Chen wrote:
> This will increase c++ g1/g2 binary size a little. For all spec
> cint2006 benchmarks, the binary size change is shown below.
>
> 400 0.00% 0.00% 0.00% 0.00%
> 401 0.00% 0.00% 0.00% 0.00%
> 403 0.00% 0.00%
I just tried Teresa's patch; the ICE in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61384 is not resolved.
Dehao
On Mon, Jun 2, 2014 at 9:45 AM, Jeff Law wrote:
> On 06/02/14 10:17, Dehao Chen wrote:
>>
>> We need to rebuild frequency after vrp, otherwise the following
s[0]->frequency < BB_FREQ_MAX * 2)
rd->dup_blocks[0]->frequency += EDGE_FREQUENCY (e);
This is referring to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61384
Thanks,
Dehao
On Mon, Jun 2, 2014 at 9:13 AM, Jan Hubicka wrote:
>> This patch rebuilds frequency after vrp.
This patch rebuilds frequency after vrp.
Bootstrapped and testing on-going. OK for trunk if tests pass?
Thanks,
Dehao
gcc/ChangeLog:
2014-06-02 Dehao Chen
PR tree-optimization/61384
* tree-vrp.c (execute_vrp): Rebuild frequency after vrp.
gcc/testsuite/ChangeLog:
2014-06-02
Thanks for the suggestion. I actually want this function to be inlined
in ipa-inline phase, not einline phase.
Dehao
On Fri, May 30, 2014 at 4:50 PM, Steven Bosscher wrote:
> On Fri, May 30, 2014 at 11:43 PM, Dehao Chen wrote:
>> Index: gcc/testsuite/gcc.dg/tree-prof/merg
0.00% 0.00% 0.00% 0.00%
462 0.00% 0.00% 0.00% 0.00%
464 0.00% 0.00% 0.00% 0.00%
471 1.28% 0.20% 1.23% 0.15%
473 0.36% 0.00% 0.35% 0.01%
483 12.79% 1.73% 13.65% 2.12%
geomean 1.14% 0.16% 1.20% 0.19%
The 4 columns are: -O0 -g1, -O0 -g2, -O2 -g1, -O2 -g2.
Thanks,
Dehao
On Fri, May 30, 2014 at 3:23 PM, Dehao
As we are pushing AutoFDO patch upstream, is this patch OK for trunk?
Thanks,
Dehao
On Mon, Aug 19, 2013 at 1:32 PM, Dehao Chen wrote:
> After rerunning the tests, this fails one gcc regression test, so I
> updated the patch to make sure all tests pass:
>
> Index: gcc
This patch updates the merged bb count only when they are in the same loop.
Bootstrapped and passed regression test.
Ok for trunk?
Thanks,
Dehao
gcc/ChangeLog:
2014-05-30 Dehao Chen
* tree-cfg.c (gimple_merge_blocks): Only reset count when BBs are in
the same loop.
gcc
This patch fixes a LIPO ICE where an unresolved node escaped after LIPO fixup.
Testing on-going. OK for google-4_9?
Thanks,
Dehao
Index: gcc/ipa.c
===
--- gcc/ipa.c (revision 210864)
+++ gcc/ipa.c (working copy)
@@ -39,6 +39,7
If a loop's header count is less than iteration count, the iteration
estimation is apparently incorrect for this loop. Thus disable
unrolling of such loops.
Testing on-going. OK for trunk if tests pass?
Thanks,
Dehao
gcc/ChangeLog:
2014-05-21 Dehao Chen
* cfgl
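The check described in this message can be sketched as a simple predicate. This is an illustration only, not the code from the patch; the function name and parameters are hypothetical, and where the counts come from is left to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical predicate: a loop whose header executed fewer times than
   its estimated iteration count has an implausible estimate, so the
   caller should not unroll based on it.  */
static bool
iteration_estimate_is_plausible (int64_t header_count,
                                 int64_t estimated_iterations)
{
  assert (header_count >= 0 && estimated_iterations >= 0);
  return header_count >= estimated_iterations;
}
```

A caller would skip unrolling whenever this returns false, since the header must run at least once per iteration.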
check what ipa-cp is doing.
I checked ipa-cp, but didn't see count propagation anywhere. Could you
point me to the function?
Thanks,
Dehao
>
> Patch is OK (with Changelog)
> Honza
>>
>> Thanks,
>> Dehao
>>
>> Index: gcc/ipa-inline-transform.c
>> ==
I've updated the patch. Shall I move the check inside cgraph_clone_node?
Thanks,
Dehao
Index: gcc/ipa-inline-transform.c
===
--- gcc/ipa-inline-transform.c (revision 210535)
+++ gcc/ipa-inline-transform.c (working copy)
@@ -
Do you mean adjusting bb->count? expand_call_inline (tree-inline.c)
passes bb->count into copy_body to calculate count_scale.
Thanks,
Dehao
On Fri, May 16, 2014 at 5:22 PM, Jan Hubicka wrote:
>> In AutoFDO, a basic block's count can be much larger
On Fri, May 16, 2014 at 4:41 PM, Jan Hubicka wrote:
>
> > Is this patch ok for trunk? Bootstrapped and regression test on-going.
> >
> > Thanks,
> > Dehao
> >
> > 2014-05-16 Dehao Chen
> >
> > * tree-inline.c (initialize_cfun
Is this patch ok for trunk? Bootstrapped and regression test on-going.
Thanks,
Dehao
2014-05-16 Dehao Chen
* tree-inline.c (initialize_cfun): Ensure count_scale is no larger
than REG_BR_PROB_BASE.
(copy_cfg_body): Likewise.
Index: gcc/tree-inline.c
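To illustrate what "no larger than REG_BR_PROB_BASE" means for a scale factor, here is a minimal sketch. It is loosely modeled on a fixed-point ratio of two counts and is not the code from the patch; the function name compute_count_scale and its parameters are hypothetical. REG_BR_PROB_BASE is 10000, as in GCC.

```c
#include <assert.h>
#include <stdint.h>

/* Fixed-point base for probabilities and scales (10000 == 1.0).  */
#define REG_BR_PROB_BASE 10000

/* Hypothetical helper: scale factor applied to the callee's counts when
   its body is copied to a call site.  With sampled profiles the call
   site count can exceed the callee entry count, so the result is capped
   at REG_BR_PROB_BASE (a factor of 1.0).  */
static int64_t
compute_count_scale (int64_t callsite_count, int64_t callee_entry_count)
{
  assert (callsite_count >= 0 && callee_entry_count >= 0);
  if (callee_entry_count == 0)
    return REG_BR_PROB_BASE;
  int64_t scale = (callsite_count * REG_BR_PROB_BASE
                   + callee_entry_count / 2) / callee_entry_count;
  return scale > REG_BR_PROB_BASE ? REG_BR_PROB_BASE : scale;
}
```

Without the cap, a call site count of 300 against an entry count of 100 would scale every copied block count by 3.0.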
This patch makes sure max count is used when merging two basic blocks.
Bootstrapped and testing on-going.
OK for trunk if test is ok?
Thanks,
Dehao
gcc/ChangeLog:
2014-05-16 Dehao Chen
* tree-cfg.c (gimple_merge_blocks): Update bb count with max count.
Index: gcc/tree-cfg.c
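The rule stated in this message, sketched in isolation. The helper name merged_bb_count is hypothetical and the sketch ignores everything else that block merging involves.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: when two basic blocks are merged, keep the larger
   of the two sampled counts rather than the count of either one.  */
static int64_t
merged_bb_count (int64_t a_count, int64_t b_count)
{
  assert (a_count >= 0 && b_count >= 0);
  return a_count > b_count ? a_count : b_count;
}
```

Taking the maximum is a conservative choice for sampled profiles, where either block may have been undersampled.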
This patch uses optimize_function_for_size_p in place of the old
optimize_size check in regs.h and ira-int.h for consistency.
Bootstrapped and testing on-going.
OK for trunk if test passes?
Thanks,
Dehao
gcc/ChangeLog:
2014-05-16 Dehao Chen
* ira-int.h (REG_FREQ_FROM_EDGE_FREQ
trunk if test pass?
Thanks,
Dehao
gcc/ChangeLog:
2014-05-16 Dehao Chen
* cfghooks.c (make_forwarder_block): Use direct computation to
get fall-through edge's count and frequency.
Index: gcc/cfghooks.c
=
Attached patch passes regression tests and benchmark test. OK for google-4_9?
Thanks,
Dehao
On Tue, May 13, 2014 at 10:43 AM, Dehao Chen wrote:
> As discussed offline, this is actually due to missing parts of the
> previous patch (some changes do not appear in the change log of
>
As discussed offline, this is actually due to missing parts of the
previous patch (some changes do not appear in the change log of
r199154). I've updated the patch to include those missing pieces.
Testing on going.
Dehao
On Tue, May 13, 2014 at 10:04 AM, Cary Coutant wrote:
>> The
is actually not a macro location.
Dehao
On Tue, May 13, 2014 at 9:47 AM, Cary Coutant wrote:
>> Index: gcc/input.c
>> ===
>> --- gcc/input.c (revision 210338)
>> +++ gcc/input.c (workin
The previous check-in breaks the build for most applications:
http://gcc.gnu.org/viewcvs/gcc/branches/google/gcc-4_9/gcc/?view=log
This patch fixes the regression by updating highest_location.
Testing on-going,
OK for google-4_9 branch?
Thanks,
Dehao
Index: gcc/input.c
Yes, this patch is a combination of all these patches. Some of them
are already in trunk.
Dehao
On Mon, May 12, 2014 at 1:28 PM, Cary Coutant wrote:
> On Mon, May 12, 2014 at 1:11 PM, Dehao Chen wrote:
>> This patch backports r199154 from google-4_8 to google-4_9
>>
>> Bo
This patch backports r199154 from google-4_8 to google-4_9
Bootstrapped and passed regression test.
OK for google-4_9 branch?
Thanks,
Dehao
Index: gcc/final.c
===
--- gcc/final.c (revision 210329)
+++ gcc/final.c (working copy
add more content to the wiki page
(https://github.com/google/autofdo/wiki). Feel free to send me emails
or discuss on github if you have any questions.
Cheers,
Dehao
This patch handles TYPE_PACK_EXPANSION in lipo_cmp_type.
Testing on-going. OK for google-4_8?
Thanks,
Dehao
Index: gcc/l-ipo.c
===
--- gcc/l-ipo.c (revision 209226)
+++ gcc/l-ipo.c (working copy)
@@ -676,6 +676,7
This patch calls add_fake_edge for the AutoFDO+LIPO path.
Bootstrapped and passed regression test and performance test.
OK for google-4_8?
Thanks,
Dehao
Index: gcc/auto-profile.c
===
--- gcc/auto-profile.c (revision 209123
This patch updates SSA after VPT transformation. This is needed
because compute_inline_parameters will ICE without updated SSA.
Testing on-going.
OK for google-4_8?
Thanks,
Dehao
Index: gcc/auto-profile.c
===
--- gcc/auto
On Wed, Mar 26, 2014 at 4:05 PM, Xinliang David Li wrote:
> is cgraph_init_gid_map called after linking?
Oh, forgot that part. It's interesting that the test can pass without
another cgraph_init_gid_map call.
Patch updated. Retested and the performance is OK.
Dehao
>
> David
Patch updated, passed performance tests.
Dehao
On Tue, Mar 25, 2014 at 4:03 PM, Xinliang David Li wrote:
> Add a comment to the new function. init_node_map is better invoked after
> the link step to avoid creating entries for dead nodes.
>
> Ok if large perf testing is fine.
>
On Sat, Mar 22, 2014 at 6:32 PM, Jan Hubicka wrote:
>> Hi,
>>
>> This patch updates node's inline summary after edge_summary is
>> updated. Otherwise it could lead to incorrect inline summary.
>>
>> Bootstrapped and gcc regression test on-going.
,
Dehao
Index: gcc/tree-profile.c
===
--- gcc/tree-profile.c (revision 208818)
+++ gcc/tree-profile.c (working copy)
@@ -1119,18 +1119,12 @@ tree_profiling (void)
cgraphunit.c:ipa_passes(). */
gcc_assert (cgraph_state
Hi,
This patch updates node's inline summary after edge_summary is
updated. Otherwise it could lead to incorrect inline summary.
Bootstrapped and gcc regression test on-going.
OK for trunk?
Thanks,
Dehao
gcc/ChangeLog:
2014-03-21 Dehao Chen
* ipa-inline.c (early_inliner): updates ov
ping ^2...
Dehao
On Mon, Feb 10, 2014 at 8:35 AM, Dehao Chen wrote:
> ping...
>
> Dehao
>
> On Fri, Jan 24, 2014 at 1:54 PM, Dehao Chen wrote:
>> Thanks, test updated:
>>
>> Index:
This patch guards autofdo annotation coverage recording with a flag.
Test on-going.
OK for google-4_8 if test passes?
Thanks,
Dehao
Index: gcc/auto-profile.c
===
--- gcc/auto-profile.c (revision 208753)
+++ gcc/auto-profile.c
On Thu, Mar 20, 2014 at 1:02 PM, Xinliang David Li wrote:
> On Thu, Mar 20, 2014 at 12:40 PM, Dehao Chen wrote:
>> Patch updated to add a wrapper early_inline function
>>
>> Index: gcc/auto-profile.c
>> ==
un->last_verified &= ~TODO_verify_ssa;
> }
>
> David
>
> On Thu, Mar 20, 2014 at 10:39 AM, Dehao Chen wrote:
>> This patch calls update_ssa before compute_inline_parameters.
>>
>> Bootstrapped and perf test on-going.
>>
>> OK for google-4_8?
This patch calls update_ssa before compute_inline_parameters.
Bootstrapped and perf test on-going.
OK for google-4_8?
Thanks,
Dehao
Index: gcc/auto-profile.c
===
--- gcc/auto-profile.c (revision 208726)
+++ gcc/auto-profile.c
Looks good to me.
Dehao
On Wed, Mar 12, 2014 at 3:35 PM, Hán Shěn (沈涵) wrote:
> ARM build (on chrome) is broken because of duplicate entries in arm.md
> and unspecs.md. Fixed by removing duplication and merge those in
> arm.md into unspecs.md.
>
> (We had a similar fix for goog
Thanks Cary for the comments.
Patch updated. I also added a tool in contrib/ to dump the profile
annotation coverage.
Dehao
>
>
> On Wed, Mar 12, 2014 at 9:48 AM, Cary Coutant wrote:
>>
>> +void autofdo_source_profile::write_annotated_count () const
>> +{
>>
performance test on-going.
OK for google-4_8?
Thanks,
Dehao
Index: gcc/auto-profile.c
===
--- gcc/auto-profile.c (revision 208283)
+++ gcc/auto-profile.c (working copy)
@@ -49,6 +49,8 @@ along with GCC; see the file COPYING3. If not see
Looks good to me.
Dehao
On Tue, Mar 11, 2014 at 3:22 PM, Hán Shěn (沈涵) wrote:
> Hi, current google/main fails to build for arm because of duplicated
> header file entries in gtyp-input.list.
>
> Fixed by removing duplication in macro tm_file. This only affects arm
> plat
This patch removes the size limit for loop unroll/peel when the loop
is truly hot. This makes the implementation easier to maintain between
FDO and AutoFDO.
Bootstrapped and loadtest perf show neutral impact.
OK for google-4_8?
Thanks,
Dehao
Index: gcc/loop-unroll.c
+1532,7 @@ auto_profile (void)
early_inliner ();
}
+ compute_inline_parameters (cgraph_get_node
(current_function_decl), true);
early_inliner ();
autofdo::afdo_annotate_cfg (promoted_stmts);
compute_function_frequency ();
Dehao
On Wed, Feb 26, 2014 at 3:25 PM
This patch fixes the bug of not calling compute_inline_parameters
before early_inliner, which would lead to ICE.
Testing on going, OK for google-4_8 if test passes?
Thanks,
Dehao
Index: gcc/auto-profile.c
===
--- gcc/auto-profile.c
ping...
Dehao
On Fri, Jan 24, 2014 at 1:54 PM, Dehao Chen wrote:
> Thanks, test updated:
>
> Index: gcc/testsuite/gcc.dg/predict-8.c
> ===
> --- gcc/testsuite/gcc.dg/predict-8.c (revision 0)
> +++ gcc/testsuite/
This patch fixes a performance regression for AutoFDO. When the entry
block count is 0, which is quite possible in AutoFDO, it can still
make the right optimization decision.
Bootstrapped and passed regression test and performance test (improves
0.5% on average).
OK for google-4_8?
Thanks,
Dehao
Index
} } */
On Fri, Jan 24, 2014 at 11:38 AM, H.J. Lu wrote:
> On Fri, Jan 24, 2014 at 10:57 AM, Jakub Jelinek wrote:
>> On Fri, Jan 24, 2014 at 10:20:53AM -0800, Dehao Chen wrote:
>>> --- gcc/testsuite/gcc.dg/predict-8.c (revision 0)
>>> +++ gcc/testsuite/gcc.dg/predict-8.c
A new test is added:
gcc/testsuite/ChangeLog:
2014-01-24 Dehao Chen
* gcc.dg/predict-8.c: New test.
Index: gcc/testsuite/gcc.dg/predict-8.c
===
--- gcc/testsuite/gcc.dg/predict-8.c (revision 0)
+++ gcc/testsuite/gcc.dg
as 1%.
Bootstrapped and passed regression test.
OK for trunk?
Thanks,
Dehao
gcc/ChangeLog:
2014-01-22 Dehao Chen
* dojump.c (do_compare_rtx_and_jump): Set correct probability for
compiler-inserted conditional jumps for NaN float check.
Index: gcc/dojump.c
Unfortunately, copy_cfg_body is actually using basic block count
instead of cgraph edge count. Thus even fixing up the call graph does
not solve the problem. The 2nd chunk of the patch (cgraphclones.c) is
actually not necessary. We only need the first part (tree-inline.c).
Thanks,
Dehao
On Fri
_8 if performance test is ok?
Thanks,
Dehao
Index: gcc/tree-inline.c
===
--- gcc/tree-inline.c (revision 206721)
+++ gcc/tree-inline.c (working copy)
@@ -2262,6 +2262,9 @@ copy_cfg_body (copy_body_data * id, gcov_type coun
If a loop is cunrolled/vectorized, the AutoFDO computed trip count
will be very small. This patch disallows overwriting of the precomputed
loop bound in AutoFDO mode.
Bootstrapped and passed regression test. Performance test on-going.
OK for Google branches?
Thanks,
Dehao
Index: tree-ssa-loop
This patch moves the LIPO linking before profile annotation so that
iterative-early-inline can cover functions from aux-module.
Bootstrapped and passed regression test and benchmark test.
OK for google-4_8?
Thanks,
Dehao
Index: gcc/auto-profile.c
This patch removes the mod_id_to_name map because the info is already
there in module_infos. Also, AutoFDO doesn't have access to update
this map because it's a file-static structure.
Bootstrapped and passed regression test.
OK for google branch?
Thanks,
Dehao
Index: gcc/cover
This patch fixes a bug so that max-lipo-group is honored for AutoFDO.
Bootstrapped and passed regression test.
OK for google-4_8 branch?
Thanks,
Dehao
Index: gcc/auto-profile.c
===
--- gcc/auto-profile.c (revision 206135)
+++ gcc/auto
entry_edge is the edge that enters the
loop.
Dehao
>
>
> Diego.
of the block itself. Do you see
> any problems with that heuristic?
In this case, the propagate_edge function will keep increasing the BB
count. We set a threshold (PARAM_AUTOFDO_MAX_PROPAGATE_ITERATIONS) to
prevent it from making BB count too large.
Dehao
>
>
> T
afdo_propagate_multi_edge can do everything afdo_propagate_single_edge
does. So we refactor the code to keep only one afdo_propagate_edge
function.
Bootstrapped and passed all unittests and performance tests.
OK for google branch?
Thanks,
Dehao
Index: gcc/auto-profile.c
Patch updated...
There is no performance change with or without the patch. I think this
was used to work around the debug info accuracy issue. Now that debug
info has improved, the heuristic is no longer needed.
Thanks,
Dehao
Index: gcc/auto-profile.c