In xalancbmk, with the partition option, most of the object files have
nonzero-size cold sections generated. The text size of the binary
increases from 3466790 bytes to 3572728 bytes (about 3%). Profiling the
program using the training input shows the following differences. With
partitioning, the number of executed branch instructions increases
slightly (about 2.5%), but iTLB misses and icache load misses are
significantly lower than in the binary without partitioning.


David

With partition:
-----------------
    53654937239  branches
      306751458  L1-icache-load-misses
        8146112  iTLB-load-misses

Without partition:
---------------------
    52348639025  branches
      454417666  L1-icache-load-misses
       14470953  iTLB-load-misses
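
For a quick sense of scale, the relative changes can be worked out from
the raw counts above. This is only a minimal sketch in Python; the helper
name pct_change and the dictionary layout are my own choices for
illustration and do not come from any profiling tool.

```python
# Raw counts copied from the two profiles above.
with_partition = {
    "branches": 53654937239,
    "L1-icache-load-misses": 306751458,
    "iTLB-load-misses": 8146112,
}
without_partition = {
    "branches": 52348639025,
    "L1-icache-load-misses": 454417666,
    "iTLB-load-misses": 14470953,
}


def pct_change(new, old):
    """Relative change of `new` versus `old`, in percent (hypothetical helper)."""
    return (new - old) / old * 100.0


# Change in each counter when partitioning is enabled.
changes = {
    event: pct_change(with_partition[event], without_partition[event])
    for event in without_partition
}

# Change in the text size of the binary (bytes, from the message above).
text_size_change = pct_change(3572728, 3466790)

for event, delta in changes.items():
    print(f"{event:>24}: {delta:+.1f}%")
print(f"{'text size':>24}: {text_size_change:+.1f}%")
```

That works out to roughly 2.5% more executed branches, about 32% fewer
icache load misses, about 44% fewer iTLB misses, and about 3% more text.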


On Mon, Jul 25, 2011 at 3:23 AM, Paolo Bonzini <bonz...@gnu.org> wrote:
> On 07/25/2011 06:42 AM, Xinliang David Li wrote:
>>
>> FYI  the performance impact of this option with SPEC06 (built with
>> google_46 compiler and measured on a core2 box).  The base line number
>> is FDO, and ref number is FDO + reorder_with_partitioning.
>>
>> xalancbmk improves > 3.5%
>> perlbench improves > 1.5%
>> dealII and bzip2 degrade by about 1.4%.
>>
>> Note the partitioning scheme is not tuned at all -- there is not even
>> a tunable parameter to play with.
>
> Did you check what is pushed down to the cold section in these cases?
>
> Paolo
>
