FYI, here is the performance impact of this option on SPEC06 (built
with the google_46 compiler and measured on a core2 box).  The baseline
is FDO, and the ref number is FDO + reorder_with_partitioning.

xalancbmk improves > 3.5%
perlbench improves > 1.5%
dealII and bzip2 degrade by about 1.4%.

Note the partitioning scheme is not tuned at all -- there is not even
a tunable parameter to play with.

David




On Tue, Jul 19, 2011 at 2:33 PM, Richard Henderson <r...@redhat.com> wrote:
> There are a number of problems with this code that affect
> its ability to work with any non-x86-like target, that is,
> anyone that doesn't define at least HAS_LONG_UNCOND_BRANCH
> and possibly HAS_LONG_COND_BRANCH.
>
> We begin, quite sensibly, with pass_partition_blocks, which
> performs a number of transformations upon the code that,
> while the actual code could be better factored, are quite
> easy to follow.  Depending on the features of the target,
> fallthrus are turned into unconditional jumps, conditional
> jumps are split into branch around branch, unconditional
> jumps are turned into indirect jumps.
>
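
To make the three fixups described above concrete, here is a toy model
in C.  It is only a sketch: the enums, the struct and the function name
are hypothetical and are not GCC internals.  The assumption that the
conditional fixup is gated by HAS_LONG_COND_BRANCH and the unconditional
one by HAS_LONG_UNCOND_BRANCH is my reading of the description, not a
quote from the code.

```c
#include <stdbool.h>

/* Toy model only: hypothetical types, not GCC's own data structures. */
enum edge_kind { EDGE_FALLTHRU, EDGE_COND_JUMP, EDGE_UNCOND_JUMP };

enum fixup {
    FIXUP_NONE,                 /* target can encode the jump directly  */
    FIXUP_MAKE_UNCOND_JUMP,     /* fallthru becomes an explicit jump    */
    FIXUP_BRANCH_AROUND_BRANCH, /* short cond branch over a long jump   */
    FIXUP_MAKE_INDIRECT_JUMP    /* load the address, then jump via reg  */
};

/* Stand-ins for the target macros named in the quoted message. */
struct target_caps {
    bool has_long_cond_branch;
    bool has_long_uncond_branch;
};

/* First rewrite applied to an edge that crosses the hot/cold boundary.
   A fallthru that has become an unconditional jump would then be
   subject to the EDGE_UNCOND_JUMP rule on a less capable target. */
enum fixup first_fixup_for_crossing_edge(enum edge_kind kind,
                                         struct target_caps caps)
{
    switch (kind) {
    case EDGE_FALLTHRU:
        /* Sections may be placed far apart, so nothing can fall through. */
        return FIXUP_MAKE_UNCOND_JUMP;
    case EDGE_COND_JUMP:
        return caps.has_long_cond_branch ? FIXUP_NONE
                                         : FIXUP_BRANCH_AROUND_BRANCH;
    case EDGE_UNCOND_JUMP:
        return caps.has_long_uncond_branch ? FIXUP_NONE
                                           : FIXUP_MAKE_INDIRECT_JUMP;
    }
    return FIXUP_NONE;
}
```

The point of the model is that any later pass that creates a new
crossing edge would need to run the same decision again, which is
exactly what the passes listed below do not do.
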
> There are nice bits of commentary that say why things are
> implemented this way, including exposing the indirect jumps
> to the register allocator.
>
> But after pass_partition_blocks, we run into trouble.  There
> are no fewer than 4 other passes that add *new* crossing jumps
> without doing *any* of the subsequent fixups for less capable
> targets: pass_outof_cfg_layout_mode, pass_reorder_blocks,
> pass_sched2 (ia64 only? it's in code in haifa that looks like
> speculative load fixups), and pass_convert_to_eh_region_ranges.
>
> The worst part is that test coverage for this feature is
> extremely poor.  It's very difficult to tell if any cleanup
> in this area is likely to introduce more bugs than it fixes.
>
> After 3 days fighting with this code, I had a bit of a
> cathartic whine on IRC.  I got two votes to just rip the
> whole thing out.
>
> Andrew Pinski points out that the feature could probably be
> equivalently implemented via outlining and function calls
> (I assume well back at the gimple level).  At which point we
> no longer have cross-segment jump_insns at the rtl level,
> which seems like a Really Big Win to me at this point.
> Not that I'm volunteering to actually do the work to implement
> any such scheme.
>
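
For illustration, the outlining idea can be approximated by hand at the
source level.  This is a minimal sketch, not a proposal for how the
compiler would do it: the function names and the "rare" negative-value
case are made up, and it relies on the GCC `cold` and `noinline`
function attributes and on `__builtin_expect`.

```c
#include <stddef.h>

/* Hypothetical rarely-executed path, outlined into its own function.
   The cold attribute asks GCC to treat the function as unlikely to run
   and to group it with other cold functions. */
__attribute__((cold, noinline))
static long handle_negative(long value)
{
    return -value;
}

/* Hot loop: the cold path is reached through an ordinary call, so no
   jump inside this function has to cross a section boundary. */
long sum_magnitudes(const long *values, size_t count)
{
    long total = 0;
    for (size_t i = 0; i < count; i++) {
        if (__builtin_expect(values[i] < 0, 0))
            total += handle_negative(values[i]);
        else
            total += values[i];
    }
    return total;
}
```

With this shape the hot/cold transfer is a call and return, which every
target already knows how to emit at any distance, at the cost of call
overhead and of passing live values as arguments.
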
> Thoughts?
>
>
> r~
>
