On 10 December 2012 10:02, Richard Biener <richard.guent...@gmail.com> wrote:
> On Fri, Dec 7, 2012 at 6:30 PM, Richard Earnshaw <rearn...@arm.com> wrote:
>> On 07/12/12 15:13, Christophe Lyon wrote:
>>>
>>> Hi,
>>>
>>> As ARM supports unaligned vector accesses for almost no penalty, I'd
>>> like to disable loop peeling on ARM targets.
>>>
>>> I have run benchmarks on Cortex-A9 (hard-float) and noticed these
>>> significant improvements:
>>> * 1.5% improvement on a popular embedded benchmark (with peaks at +20% and
>>> +29%)
>>> * 2.1% on spec2k mesa
>>> * 9.2% on spec2k eon
>>> * up to 3.4% on some parts of another embedded benchmark
>>>
>>> The largest regression I noticed is 1%.
>>>
>>> I have attached a preliminary patch to discuss how acceptable it would
>>> be, and to discuss the needed changes in the testsuite. Indeed, quite
>>> a few tests now fail because they count the number of "vectorizing an
>>> unaligned access" and "alignment of access forced using peeling"
>>> occurrences in the vectorizer traces.
>>>
>>> I could add a property to target-supports.exp, which would currently
>>> be true only on ARM, to select whether to rely on peeling or not, and
>>> update all the affected tests accordingly.
>>>
>>> As there are quite a few tests to update, I'd like opinions first.
>>>
>>> Thanks,
>>>
>>> Christophe.
>>>
>>
>> This feels a bit like a sledge-hammer for a nut that really needs just a
>> normal hammer.  I guess the crux of the question comes down to whether we
>> know how many times the loop will be executed.  If the answer is no, then OK we
>> assume that the execution count will be small and don't peel.  If the answer
>> is yes (or we know the minimum iteration count), then we should be able to
>> work out what the saving will be by peeling to reach alignment.
>>
>> So I think your hook should pass the known (minimum) iteration count as well
>> -- with 0 indicating that we don't know what the minimum is.
>>
>> Note, it may be that today we can't work out what the minimum will be and
>> that for now we always pass zero.  But that doesn't mean we shouldn't code
>> for the time when we can work this out.
>
> I agree that this is a sledgehammer.  If aligned/unaligned loads/stores have
> the same cost then reflect that in the vectorized stmt cost hook.  If that

I am not sure I understand which hook you are referring to.
My understanding of vect_enhance_data_refs_alignment() is that it uses
costs to check whether the target's misaligned stores are more expensive
than misaligned loads, but at that point it has already decided to
perform peeling. On simple loops, it has no reason to later decide not
to perform peeling.

> alone does not prevent peeling for alignment to happen then the fix is to
> not consider doing peeling for alignment if aligned/unaligned costs are the
> same, not adding a new hook.
>
I thought that a new hook could enable target variations on this: if
the cost is very slightly different, it might be worth peeling or not,
depending on the peeling amount or the number of iterations as Richard
Earnshaw mentioned.
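
To make the idea concrete, here is a rough sketch of the kind of decision
such a hook could make. It is only an illustration under stated
assumptions: the struct, the function name and the cost units are all
hypothetical and are not existing GCC interfaces. It follows the
convention suggested above that a minimum iteration count of 0 means
"unknown".

```c
#include <stdbool.h>

/* Hypothetical per-target costs in arbitrary units.
   This is NOT an existing GCC data structure.  */
struct peel_costs
{
  unsigned aligned_access;    /* cost of one aligned vector access */
  unsigned unaligned_access;  /* cost of one unaligned vector access */
  unsigned peel_overhead;     /* fixed cost of the peeled prologue */
};

/* Hypothetical helper: return true if peeling for alignment looks
   profitable.  MIN_NITERS is the known minimum scalar iteration count,
   with 0 meaning "unknown"; VF is the vectorization factor; NPEEL is
   the number of scalar iterations peeled to reach alignment.  */
bool
should_peel_for_alignment (const struct peel_costs *c,
                           unsigned min_niters, unsigned vf,
                           unsigned npeel)
{
  /* Same cost for aligned and unaligned accesses: nothing to gain.  */
  if (c->unaligned_access <= c->aligned_access)
    return false;

  /* Unknown iteration count: assume it is small and do not peel.  */
  if (min_niters == 0 || vf == 0 || min_niters <= npeel)
    return false;

  /* Saving accumulated over the vector iterations left after peeling.  */
  unsigned long long vec_iters = (min_niters - npeel) / vf;
  unsigned long long saving
    = vec_iters * (c->unaligned_access - c->aligned_access);

  return saving > c->peel_overhead;
}
```

With equal aligned and unaligned costs this never peels, which matches
the suggestion above; with slightly different costs the answer depends
on the peeling amount and the iteration count.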

Thanks for your comments,

Christophe.
