On Wed, Mar 27, 2019 at 1:27 PM Martin Liška <mli...@suse.cz> wrote:
>
> On 3/25/19 1:36 PM, Moritz Strübe wrote:
> > Hi,
> >
> > I have an issue with the optimization options. We are on an stm32 and it
> > only has a prefetcher, but no cache. Thus it's nice to have a linear
> > default path. For example, we use __builtin_expect in our asserts. Yet
> > it seems that this does not work when using -Os. I confirmed that this
> > is not an arm issue, but can also be seen on x86.
> > I have the following code:
> > ----------
> > #include <stdint.h>
> > #ifdef UN
> > #define UNLIKELY(x) __builtin_expect((x),0)
> > #else
> > #define UNLIKELY(x) (x)
> > #endif
> >
> > float a = 66;
> >
> > int test (float b, int test) {
> >   if(UNLIKELY(test)) {
> >     return b / a;
> >   } else {
> >     return b * a;
> >   }
> > }
> > ----------
> > "gcc -O2" reorders the code depending on a passed -DUN, while -Os always
> > produces the same output (see https://godbolt.org/z/cL-Pbg)
> >
> > I played around with different options running
> > gcc -O{s|2} -Q --help=optimizers
> > , but didn't manage to get -Os to do that optimization.
>
> Hi.
>
> So first we have misleading documentation for 8.3.0, you're hitting:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87829
>
> So with -Os we enable BB reordering, the only difference is that we set
> -freorder-blocks-algorithm=simple with -Os. However, on x86_64, it stays
> with -freorder-blocks-algorithm=stc.
>
> Then you can use -fdump-rtl-bbro and see a dump file:
> pr-expect.c.300r.bbro
>
> The issue you're seeing is caused by the fact that bbro uses
> optimize_function_for_size_p functions that return true with -Os.
> That's probably why you see the difference. That's my quick analysis.

Yeah, I think its use of optimize_function_for_size_p is at least odd. It
seems to be a defensive check, in place to do as little as possible and
keep the original order. I think the only thing that should be disabled
is tracing of cold paths.

But this is all stage1 material.

Richard.

> Maybe Honza can help here?
>
> Martin
>
> > Contrary to what the manual[1] says, this only differs in
> > -finline-functions and -foptimize-strlen for 8.3
> > (OT: Especially the info about freorder-blocks-algorithm seems to be
> > outdated for gcc 8.3 (my arm 7.3.1 produces smaller code using stc, too).)
> > Since adjusting all those options didn't help I tried
> > gcc -O{s|2} -Q --help={param|common|target|c++}
> > but that didn't give me any new insight. (BTW: "-Q --help=param" should
> > probably be documented in the --param-section)
> >
> > Cheers
> > Moritz
> >
> >
> > [1] https://gcc.gnu.org/onlinedocs/gcc-8.3.0/gcc/Optimize-Options.html
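
For anyone who wants to compare the two optimization levels locally, here is a minimal sketch built on the test case quoted in this thread. The names CHECK and check_failed and the file name repro.c are hypothetical and not from the original report. The "cold" function attribute is a documented GCC extension that marks paths leading to a call as unlikely; whether it changes the block layout chosen at -Os is an assumption to verify in the bbro dump, not something established in this thread.

```c
#include <stdlib.h>  /* abort */

/* Hint that the condition is expected to be false (GCC/Clang builtin). */
#define UNLIKELY(x) __builtin_expect(!!(x), 0)

/* Hypothetical failure handler. "cold" and "noreturn" are GCC function
   attributes; "cold" tells the compiler that calls to it are unlikely. */
__attribute__((cold, noreturn)) void check_failed(void);

void check_failed(void)
{
    abort();
}

/* Hypothetical assert-style macro: the failure branch is the unlikely one,
   so the fall-through path is the one we would like to stay linear. */
#define CHECK(cond) do { if (UNLIKELY(!(cond))) check_failed(); } while (0)

float a = 66;

/* Same function as in the report, with the hint always applied
   (equivalent to building the original with -DUN). */
int test(float b, int test)
{
    if (UNLIKELY(test)) {
        return b / a;   /* hinted as the unlikely branch */
    } else {
        return b * a;   /* hinted as the likely branch */
    }
}
```

Compiling the file twice, e.g. "gcc -O2 -S -fdump-rtl-bbro repro.c" and "gcc -Os -S -fdump-rtl-bbro repro.c", and comparing the assembly and the bbro dumps should show whether the hinted branch is moved out of the fall-through path at each level.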