> On 13 Sep 2017, at 1:57 AM, Wilco Dijkstra <wilco.dijks...@arm.com> wrote:
>
> Hi all,
>
> At the GNU Cauldron I was inspired by several interesting talks about improving
> GCC in various ways. While GCC has many great optimizations, a common theme is
> that its default settings are rather conservative. As a result users are
> required to enable several additional optimizations by hand to get good code.
> Other compilers enable more optimizations at -O2 (loop unrolling in LLVM was
> mentioned repeatedly) which GCC could/should do as well.

There are some nuances to -O2. Please consider -O2 users who wish to use it
like Clang/LLVM’s -Os (-O2 without loop vectorisation, IIRC).

Clang/LLVM has an -Os that is like -O2, so optimisations that increase code
size can be left out of -Os without drastically affecting performance. This is
not the case with GCC, where -Os is a size-at-all-costs optimisation mode. The
GCC user's option for size not at the expense of speed is -O2.

    Clang      GCC
    -Oz    ~=  -Os
    -Os    ~=  -O2

So if adding optimisations to -O2 that increase code size, please consider
adding an -O2s that maintains the compact code size of -O2. -O2 generates
pretty compact code, as many performance optimisations tend to reduce code
size. Otherwise, add the optimisations that increase code size to -O3.

Enabling loop unrolling makes sense in the Clang/LLVM context, where they have
a compact code model with good performance, i.e. -Os. In GCC this is -O2.

So if you want to enable more optimisations at -O2, please copy the -O2
optimisations to -O2s, or rename -Os to -Oz and copy the -O2 optimisation
defaults to a new -Os.

The present reality is that any project that wishes to optimize for size at
all costs needs to run a configure test for -Oz and then fall back to -Os,
given the current disparity between Clang/LLVM and GCC flags here.

> Here are a few concrete proposals to improve GCC's option settings which will
> enable better code generation for most targets:
>
> * Make -fno-math-errno the default - this mostly affects the code generated for
>   sqrt, which should be treated just like floating point division and not set
>   errno by default (unless you explicitly select C89 mode).
>
> * Make -fno-trapping-math the default - another obvious one. From the docs:
>   "Compile code assuming that floating-point operations cannot generate
>   user-visible traps."
>   There isn't a lot of code that actually uses user-visible traps (if any -
>   many CPUs don't even support user traps as it's an optional IEEE feature).
>   So assuming trapping math by default is way too conservative since there is
>   no obvious benefit to users.
>
> * Make -fno-common the default - this was originally needed for pre-ANSI C, but
>   is optional in C (not sure whether it is still in C99/C11). This can
>   significantly improve code generation on targets that use anchors for globals
>   (note the linker could report a more helpful message when ancient code that
>   requires -fcommon fails to link).
>
> * Make -fomit-frame-pointer the default - various targets already do this at
>   higher optimization levels, but this could easily be done for all targets.
>   Frame pointers haven't been needed for debugging for decades, however if there
>   are still good reasons to keep it enabled with -O0 or -O1 (I can't think of any
>   unless it is for last-resort backtrace when there is no unwind info at a crash),
>   we could just disable the frame pointer from -O2 onwards.
>
> These are just a few ideas to start. What do people think? I'd welcome discussion
> and other proposals for similar improvements.
>
> Wilco
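
To make the -Oz/-Os configure test mentioned above concrete, here is a rough sketch in Python of what such a probe might look like. The helper names compiler_accepts and pick_size_flag are hypothetical, and the sketch assumes that a compiler which does not support a flag exits with a non-zero status when asked to compile a trivial file with it. A real project would more likely do this in its build system (autoconf, CMake, Meson and so on).

```python
import os
import subprocess
import tempfile


def compiler_accepts(cc, flag):
    """Return True if `cc` compiles a trivial C file with `flag` (hypothetical helper)."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "probe.c")
        obj = os.path.join(tmp, "probe.o")
        with open(src, "w") as f:
            f.write("int main(void) { return 0; }\n")
        try:
            # Assumption: an unsupported flag makes the compiler exit non-zero.
            # -Werror turns "unknown option" warnings into failures as well.
            result = subprocess.run(
                [cc, "-Werror", flag, "-c", src, "-o", obj],
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
        except OSError:
            # Compiler missing or not executable: treat the flag as unsupported.
            return False
        return result.returncode == 0


def pick_size_flag(cc, probe=compiler_accepts, candidates=("-Oz", "-Os")):
    """Return the first candidate flag that `probe` accepts, or None."""
    for flag in candidates:
        if probe(cc, flag):
            return flag
    return None
```

A build script could call pick_size_flag(os.environ.get("CC", "cc")) and append the result to CFLAGS. The probe argument is only there so the selection logic can be exercised without a real compiler.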