On Wed, Oct 29, 2014 at 12:53 PM, Tejas Belagod <tejas.bela...@arm.com> wrote:
> On 29/10/14 09:32, Richard Biener wrote:
>>
>> On Tue, Oct 28, 2014 at 4:55 PM, Evandro Menezes <e.mene...@samsung.com>
>> wrote:
>>>
>>> While doing some benchmark flag mining on AArch64, I noticed that
>>> -fpeel-loops was a mined option often.  As a matter of fact, when using
>>> it always, even without FDO, it seemed to raise most benchmarks and to
>>> leave almost all of the rest flat, with a barely noticeable cost in
>>> code-size.  It seems to me that it might be safe enough to be implied
>>> perhaps at -O3.  Is there any reason why this never came into being?
>
> Loop peeling is done by default on AArch64 unless, IIRC,
> -fvect-cost-model=cheap is specified which switches it off. There was a
> general thread on loop peeling around the same time last year
> (https://gcc.gnu.org/ml/gcc/2013-11/msg00307.html) where Richard suggested
> that peeling vs. non-peeling should be factored into the vector cost model
> and is a more generic improvement.

Oh, you are talking about the vectorizer pro-/epilogue loops where we know
a (low) upper bound for the number of iterations.  I think that is enabled
by default at -O3 as it is a "completely peeling" operation.  Only regular
peeling which looks at the _estimated_ loop trip count (peeling that number
of times) is guarded by -fpeel-loops.

Richard.

> Thanks,
> Tejas.
>
>>
>> Not sure, but peeling is/was very stupid (peeling 8 times unconditionally
>> or not at all).  At least without FDO (and with -fprofile-use it is
>> enabled).
>> Similar case for -funroll-loops.
>>
>> For GCC 5 peeling now moved to GIMPLE, so maybe things changed
>> for that (but I'd doubt that).  Honza?
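
For readers following the thread, the distinction between the two kinds of peeling can be sketched by hand in C. This is only an illustration of the source-level shape of each transformation, not actual GCC output. The function names, the peel count of 2, and the upper bound of 3 are all assumptions chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Reference loop: the trip count n is only known at run time. */
long sum_plain(const int *a, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* "Regular" peeling, done by hand with an assumed peel count of 2:
   two guarded copies of the body run before the loop, and the loop
   handles whatever iterations remain. */
long sum_peeled(const int *a, size_t n)
{
    long s = 0;
    size_t i = 0;
    if (i < n) s += a[i++];   /* peeled iteration 1 */
    if (i < n) s += a[i++];   /* peeled iteration 2 */
    for (; i < n; i++)        /* remaining iterations, if any */
        s += a[i];
    return s;
}

/* "Complete" peeling of a loop with a known low upper bound (assumed
   to be 3 here, as for the tail of a loop vectorized by 4): the loop
   disappears entirely and only guarded copies of the body remain. */
long sum_tail_completely_peeled(const int *a, size_t n)
{
    assert(n <= 3);           /* the known upper bound on the trip count */
    long s = 0;
    if (n > 0) s += a[0];
    if (n > 1) s += a[1];
    if (n > 2) s += a[2];
    return s;
}
```

All three functions compute the same sum for inputs within their stated bounds. The second form pays off only if the estimated trip count is really small, which is why it depends on a trip-count estimate, whereas the third form is safe whenever the bound is known.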