> On Jun 8, 2017, at 7:31 PM, James Greenhalgh <james.greenha...@arm.com> wrote:
> 
> On Fri, Feb 03, 2017 at 02:58:23PM +0300, Maxim Kuvyrkov wrote:
>>> On Jan 30, 2017, at 5:50 PM, Maxim Kuvyrkov <maxim.kuvyr...@linaro.org> 
>>> wrote:
>>> 
>>>> On Jan 30, 2017, at 3:23 PM, Kyrill Tkachov <kyrylo.tkac...@foss.arm.com> 
>>>> wrote:
>>>> 
>>>> Hi Maxim,
>>>> 
>>>> On 30/01/17 12:06, Maxim Kuvyrkov wrote:
>>>>> This patch enables prefetching at -O3 for aarch64 cores that set 
>>>>> "simultaneous prefetches" parameter above 0.  There are currently no such 
>>>>> settings, so this patch doesn't change default code generation.
>>>>> 
>>>>> I'm now working on improvements to -fprefetch-loop-arrays pass to make it 
>>>>> suitable for -O2. I'll post this work in the next month.
>>>>> 
>>>>> Bootstrapped and regtested on x86_64-linux-gnu and aarch64-linux-gnu.
>>>>> 
>>>> 
>>>> Are you aiming to get this in for GCC 8?
>>>> I have one small comment on this patch:
>>>> 
>>>> +  /* Enable sw prefetching at -O3 for CPUS that have prefetch, and we
>>>> +     have deemed it beneficial (signified by setting
>>>> +     prefetch.num_slots to 1 or more).  */
>>>> +  if (flag_prefetch_loop_arrays < 0
>>>> +      && HAVE_prefetch
>>>> 
>>>> HAVE_prefetch will always be true on aarch64.
>>>> I imagine midend code that had logic like this would need this check, but 
>>>> aarch64-specific code shouldn't need it.
>>> 
>>> Agree, I'll remove HAVE_prefetch.
>>> 
>>> This pattern was copied from other backends, and HAVE_prefetch is most 
>>> likely a historical artifact.
>> 
>> Andrew raised a good point in the review of his patch that it is a bad idea
>> to use one of prefetching parameters (simultaneous_prefetches) as indicator
>> for whether to enable prefetching pass by default.  Indeed there are cases
>> when we want to set simultaneous_prefetch according to HW documentation (or
>> experimental results), but not enable prefetching pass by default.
>> 
>> This update to the patch addresses it.  The patch adds a new explicit field
>> to prefetch tuning structure "default_opt_level" that sets optimization level
>> from which prefetching should be enabled by default.  The current value is to
>> enable prefetching at -O3; additionally, this parameter will come handy for
>> enabling prefetching at -O2 [when it is ready].
> 
> I really don't like the scheme of changing the optimisation threshold when
> profiling data is used.
> 
> I've seen too many reports and presentations by the uninitiated who believe
> that the use of profiling data has made the difference, when in reality
> it is just GCC changing behaviour on which passes run. It is very
> misleading!

OK, agree.  That line came from i386 backend.  I'll run benchmarks for enabling 
prefetching at -O2 at a later date, and, possibly, we will have a performance 
argument for reducing prefetch threshold when profile data is available.

> 
> With that line removed, and any rebasing needed over changes to the macro,
> I'm happy with this patch.

Attached is updated patch.  I'll commit it after bootstrap / regtest.

Thanks,

--
Maxim Kuvyrkov
www.linaro.org


Attachment: 0002-Enable-fprefetch-loop-arrays-at-given-optimization-l.patch
Description: Binary data

Reply via email to