> We're not able to enable BB reordering with -Os.  The behaviour is
> hard-coded via this if statement in rest_of_handle_reorder_blocks():
>   if ((flag_reorder_blocks || flag_reorder_blocks_and_partition)
>       /* Don't reorder blocks when optimizing for size because extra
>          jump insns may be created; also barrier may create extra
>          padding.
>
>          More correctly we should have a block reordering mode that
>          tried to minimize the combined size of all the jumps.  This
>          would more or less automatically remove extra jumps, but
>          would also try to use more short jumps instead of long
>          jumps.  */
>       && optimize_function_for_speed_p (cfun))
>     {
>       reorder_basic_blocks ();
> If you comment out the "&& optimize_function_for_speed_p (cfun)" then
> BB reordering takes place as desired (although this obviously isn't a
> solution).
> In a private message Ian indicated that this had a small code size
> impact for the ISA he's working with but gave a significant
> performance gain.  I tried the same thing with the ISA I work on
> (Ubicom32) and this change typically increased code sizes by between
> 0.1% and 0.3% but improved performance by anything from 0.8% to 3%,
> so on balance this is definitely winning for most of our users (this
> is for a couple of benchmarks, the Linux kernel, busybox and smbd).

Note that commenting out the optimize_function_for_speed_p (cfun)
check enables BB reordering for all functions, even cold ones.  So I
think whatever gains came from this hacky change could increase further
if BB reordering were enabled only for hot functions when compiling
with -Os.  (Certainly the code size increases could be minimised,
whilst hopefully retaining the performance gains.)

I am in no way suggesting this should be the default behaviour for
-Os, only that it should be switchable via the flags, just like other
optimisations are.  Once it is switchable, though, I would expect
turning it on for -Os not to enable BB reordering for every function
(the opposite extreme of the current universal disabling), but to pick
a sensible half-way point based on heat, so that you get the
performance wins on selected functions with minimal code size increase.
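
To make the half-way point concrete, here is a rough standalone sketch
of the decision I have in mind.  It is only a model, not a patch: the
enum, the struct and should_reorder_blocks() are all made-up names for
illustration, and the non-Os case is simplified to "always reorder"
rather than mirroring what optimize_function_for_speed_p() really does.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not GCC's internal types.  */
enum func_heat { FUNC_COLD, FUNC_NORMAL, FUNC_HOT };

struct reorder_opts
{
  bool reorder_requested;  /* models a block-reordering flag being on */
  bool optimize_size;      /* models compiling with -Os */
};

/* Return true if the blocks of a function with the given heat should
   be reordered.  Assumption: without -Os every function is reordered,
   which simplifies whatever the real speed/size predicate does.  */
static bool
should_reorder_blocks (const struct reorder_opts *opts, enum func_heat heat)
{
  if (!opts->reorder_requested)
    return false;
  if (!opts->optimize_size)
    return true;
  /* With -Os, pay the code size cost only where it buys speed.  */
  return heat == FUNC_HOT;
}
```

With both fields set, only FUNC_HOT functions get reordered; cold and
normal functions keep the current -Os behaviour, and with reordering
not requested nothing changes at all.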
