> On 13 Sep 2017, at 1:57 AM, Wilco Dijkstra <wilco.dijks...@arm.com> wrote:
> 
> Hi all,
> 
> At the GNU Cauldron I was inspired by several interesting talks about improving
> GCC in various ways. While GCC has many great optimizations, a common theme is
> that its default settings are rather conservative. As a result users are 
> required to enable several additional optimizations by hand to get good code.
> Other compilers enable more optimizations at -O2 (loop unrolling in LLVM was
> mentioned repeatedly) which GCC could/should do as well.

There are some nuances to -O2. Please consider -O2 users who wish to use it like 
Clang/LLVM’s -Os (-O2 without loop vectorisation, IIRC).

Clang/LLVM has an -Os that is like -O2, so optimisations that increase code 
size can be left out of -Os without drastically affecting performance.

This is not the case with GCC, where -Os is a size-at-all-costs optimisation 
mode. The GCC user's option for small code without sacrificing speed is -O2.

Clang           GCC
-Oz     ~=      -Os
-Os     ~=      -O2

So if adding optimisations to -O2 that increase code size, please consider 
adding an -O2s that maintains the compact code size of -O2, or otherwise add 
the optimisations that increase code size to -O3. -O2 generates pretty compact 
code, as many performance optimisations tend to reduce code size. Adding loop 
unrolling only makes sense in the Clang/LLVM context, where they have a compact 
code model with good performance, i.e. -Os. In GCC this is -O2.

So if you want to enable more optimisations at -O2, please copy the current 
-O2 optimisations to a new -O2s, or rename -Os to -Oz and copy the current -O2 
optimisation defaults to a new -Os.

The present reality is that any project that wishes to optimize for size at all 
costs will need to run a configure test for -Oz, and then fall back to -Os, 
given the current disparity between Clang/LLVM and GCC flags here.
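
As a rough illustration of the kind of configure test meant above, here is a 
minimal sketch in Python. The names compiler_accepts and pick_size_flag are 
hypothetical helpers made up for this example, not part of any build system, 
and the probe simply assumes the compiler driver exits non-zero when it 
rejects a flag (with -Werror added so that a mere warning about an unknown 
flag also counts as a rejection).

```python
import os
import subprocess
import tempfile


def compiler_accepts(compiler, flag):
    """Return True if `compiler` compiles a trivial C file with `flag`.

    Hypothetical helper: assumes a GCC- or Clang-style driver that
    understands -Werror, -c and -o.
    """
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "probe.c")
        obj = os.path.join(tmp, "probe.o")
        with open(src, "w") as f:
            f.write("int main(void) { return 0; }\n")
        try:
            result = subprocess.run(
                [compiler, "-Werror", flag, "-c", src, "-o", obj],
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
        except OSError:
            # Compiler not found or not executable: treat as "not accepted".
            return False
        return result.returncode == 0


def pick_size_flag(compiler, candidates=("-Oz", "-Os"), probe=compiler_accepts):
    """Return the first flag in `candidates` that the compiler accepts.

    Tries -Oz first and falls back to -Os; returns None if neither works.
    `probe` is injectable so the selection logic can be checked without
    a compiler installed.
    """
    for flag in candidates:
        if probe(compiler, flag):
            return flag
    return None
```

A build script might then call pick_size_flag("cc") and append the result to 
CFLAGS. Every project that wants minimal size across both compilers ends up 
carrying some variant of this probe.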

> Here are a few concrete proposals to improve GCC's option settings which will
> enable better code generation for most targets:
> 
> * Make -fno-math-errno the default - this mostly affects the code generated for
>  sqrt, which should be treated just like floating point division and not set
>  errno by default (unless you explicitly select C89 mode).
> 
> * Make -fno-trapping-math the default - another obvious one. From the docs:
>  "Compile code assuming that floating-point operations cannot generate 
>   user-visible traps."
>  There isn't a lot of code that actually uses user-visible traps (if any -
>  many CPUs don't even support user traps as it's an optional IEEE feature). 
>  So assuming trapping math by default is way too conservative since there is
>  no obvious benefit to users. 
> 
> * Make -fno-common the default - this was originally needed for pre-ANSI C, but
>  is optional in C (not sure whether it is still in C99/C11). This can
>  significantly improve code generation on targets that use anchors for globals
>  (note the linker could report a more helpful message when ancient code that
>  requires -fcommon fails to link).
> 
> * Make -fomit-frame-pointer the default - various targets already do this at
>  higher optimization levels, but this could easily be done for all targets.
>  Frame pointers haven't been needed for debugging for decades, however if there
>  are still good reasons to keep it enabled with -O0 or -O1 (I can't think of any
>  unless it is for last-resort backtrace when there is no unwind info at a crash),
>  we could just disable the frame pointer from -O2 onwards.
> 
> These are just a few ideas to start. What do people think? I'd welcome discussion
> and other proposals for similar improvements.
> 
> Wilco
