On Mon, 30 May 2016, Jan Hubicka wrote:

> > On Sat, 28 May 2016, Jan Hubicka wrote:
> >
> > > Hello,
> > > thanks for feedback.  I updated the patch and also noticed that
> > > -fpeel-all-loops gives up when upper bound is known but it is large
> > > and when the max-peel-insns is too small to permit peeling
> > > max-peel-times.  This patch also updates pr61743-2.c which are now
> > > peeled before we manage to propagate the proper loop bound.
> > >
> > > Bootstrapped/regtested x86_64-linux.  OK?
> >
> > Humm, so why add -fpeel-all-loops?  I don't think -funroll-all-loops
> > is useful.
>
> It is mostly there to trigger the transform to see if it is useful.  Not
> something you want to enable by default.
>
> -fpeel-all-loops helps when you know your code have internal loops that
> iterate few times.  I.e. one can get good speedup for the sudoku solver
> benchmark because it has loops that iterate either once or 10 times.
>
> http://www.ucw.cz/~hubicka/papers/amd64/node4.html also claims that
> -funroll-all-loops improves specint by 2.5%, while -funroll-loops by
> 2.23%, so it seemed somewhat useful back then.

Still I'm hesitant to introduce new user-visible options.

> > Did you check code-size/compile-time/performance effects of enabling
> > -fpeel-loops at -O3 for, say, SPEC CPU 2006?
>
> Martin Liska run it on the SPEC2006 and v6 (not with latest fixes to
> heuristics).  Without FDO the loop peeling triggers only for loops where
> we have likely_upper_bound != upper_bound.  We do not predict that too
> often (we may in future as there is room for improvement in niter).  The
> code size effect was +0.9% for SPECint and +2.2% on SPECfp.  The
> off-noise improvements were vrp 94.5->89.7 and John the ripper
> 106.9->100 (less than 0.1% in geomavg).  I have cut down the code size
> effects since that (there was a bug that made us to peel without control
> when maxiter overflowed), but I did not re-run full specs since then.
>
> My motivation was mainly to reduce number of optimizations that are not
> good enough to be enabled by default and also observation that it helps
> to some benchmarks.
>
> We can re-run benchmarks with current patch after fixing the profile
> update issues I will send shortly.  No fine-tuning of the parameters was
> done and I guess they are still set the way they was set for RTL peeling
> in 2003.

Sounds good.  The patch is ok if you omit the new flag for now.

Thanks,
Richard.