On September 13, 2017 5:35:11 PM GMT+02:00, Jan Hubicka <hubi...@ucw.cz> wrote:
>> On Wed, Sep 13, 2017 at 3:46 PM, Jakub Jelinek <ja...@redhat.com> wrote:
>> > On Wed, Sep 13, 2017 at 03:41:19PM +0200, Richard Biener wrote:
>> >> On its own -O3 doesn't add much (some loop opts and slightly more
>> >> aggressive inlining/unrolling), so whatever it does we
>> >> should consider doing at -O2 eventually.
>> >
>> > Well, -O3 adds vectorization, which we don't enable at -O2 by default.
>>
>> As said, -fprofile-use enables it, so -O2 should eventually do the same
>> for "really hot code".
>
>I don't see static profile prediction as very useful here for finding
>"really hot code" - neither in the current implementation nor in the
>future. The problem with -O2 is that we kind of know that only 10% of
>the code somewhere matters for performance, but we have no way to
>reliably identify it.
It's hard to do better than statically looking at (ipa) loop depth. But
shouldn't that be good enough?

>It would make sense to have less aggressive vectorization at -O2 and
>more at -Ofast/-O3.

We tried that, but the runtime effects were not offsetting the compile
time cost.

>Adding -Os and -Oz would make sense to me - even with hot/cold info it
>is not desirable to optimize as aggressively for size as we do, because
>mistakes happen and one does not want to make code paths 1000 times
>slower to save one byte of binary.
>
>We could handle this gracefully internally by having logic for "known
>to be cold" and "guessed to be cold". The new profile code can make a
>difference here.
>
>Honza
>>
>> Richard.
>>
>> > Jakub