On Mon, Dec 10, 2012 at 10:02 AM, Richard Biener <richard.guent...@gmail.com> wrote:
> On Fri, Dec 7, 2012 at 6:30 PM, Richard Earnshaw <rearn...@arm.com> wrote:
>> On 07/12/12 15:13, Christophe Lyon wrote:
>>>
>>> Hi,
>>>
>>> As ARM supports unaligned vector accesses for almost no penalty, I'd
>>> like to disable loop peeling on ARM targets.
>>>
>>> I have ran benchmarks on cortex-A9 (hard-float) and noticed these
>>> significant improvements:
>>> * 1.5% improvement on a popular embedded benchmark (with peaks at +20% and
>>> +29%)
>>> * 2.1% on spec2k mesa
>>> * 9.2% on spec2k eon
>>> * up to 3.4% on some part of another embedded benchmark
>>>
>>> The largest regression I noticed is 1%.
>>>
>>> I have attached a preliminary patch to discuss how acceptable it would
>>> be, and to discuss the needed changes in the testsuite. Indeed; quite
>>> a few tests now fail because they count the number of "vectorizing an
>>> unaligned access" and "alignment of access forced using peeling"
>>> occurrences in the vectorizer traces.
>>>
>>> I could add a property to target-supports.exp, which would currently
>>> be only true on ARM to select whether to rely on peeling or not, and
>>> updated all the affected tests accordingly.
>>>
>>> As there are quite a few tests to update, I'd like opinions first.
>>>
>>> Thanks,
>>>
>>> Christophe.
>>>
>>
>> This feels a bit like a sledge-hammer for a nut that really needs just a
>> normal hammer. I guess the crux of the question comes down to do we know
>> how many times the loop will be executed? If the answer is no, then OK we
>> assume that the execution count will be small and don't peel. If the answer
>> is yes (or we know the minimum iteration count), then we should be able to
>> work out what the saving will be by peeling to reach alignment.
>>
>> So I think your hook should pass the known (minimum) iteration count as well
>> -- with 0 indicating that we don't know what the minimum is.
>>
>> Note, it may be that today we can't work out what the minimum will be and
>> that for now we always pass zero. But that doesn't mean we shouldn't code
>> for the time when we can work this out.
>
> I agree that this is a sledgehammer. If aligned/unaligned loads/stores have
> the same cost then reflect that in the vectorized stmt cost hook. If that
> alone does not prevent peeling for alignment to happen then the fix is to
> not consider doing peeling for alignment if aligned/unaligned costs are the
> same, not adding a new hook.

Btw, arm does not have _any_ of the vectorizer cost hooks implemented!

Richard.
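
For illustration only, here is a rough sketch of the decision discussed in this thread: skip peeling when aligned and unaligned accesses cost the same, skip it when the minimum iteration count is unknown (passed as 0), and otherwise compare the estimated saving with the cost of the peeled iterations. Everything in it is an assumption made up for the example (the struct vec_costs, the function should_peel, its parameters and the cost units); it is not GCC's cost hook interface or the vectorizer's actual logic.

```c
/* Hypothetical cost description; NOT GCC's real vectorizer cost hooks. */
struct vec_costs {
    int aligned_cost;      /* assumed cost of one aligned vector access */
    int unaligned_cost;    /* assumed cost of one unaligned vector access */
    int scalar_iter_cost;  /* assumed cost of one peeled scalar iteration */
};

/* Return 1 if peeling for alignment looks profitable, 0 otherwise.
   min_iters  : known minimum iteration count, 0 meaning "unknown"
   vf         : vectorization factor (scalar iterations per vector iteration)
   peel_iters : scalar iterations that must be peeled to reach alignment */
int should_peel(const struct vec_costs *c, unsigned min_iters,
                unsigned vf, unsigned peel_iters)
{
    /* Same cost either way: peeling cannot save anything. */
    if (c->unaligned_cost <= c->aligned_cost)
        return 0;

    /* Unknown trip count: assume it is small and do not peel. */
    if (min_iters == 0 || vf == 0)
        return 0;

    /* The loop may finish before alignment is even reached. */
    if (min_iters <= peel_iters)
        return 0;

    /* Vector iterations left once the prologue has run. */
    unsigned vec_iters = (min_iters - peel_iters) / vf;

    long saving = (long)vec_iters * (c->unaligned_cost - c->aligned_cost);
    long overhead = (long)peel_iters * c->scalar_iter_cost;

    return saving > overhead;
}
```

With equal aligned and unaligned costs the function always returns 0, which is the case described above where no new hook is needed. With an unaligned access costing one unit more, a vectorization factor of 4 and 3 peeled iterations, it rejects peeling for a minimum of 8 iterations and accepts it for a minimum of 1000.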