> On Jun 8, 2017, at 7:31 PM, James Greenhalgh <james.greenha...@arm.com> wrote: > > On Fri, Feb 03, 2017 at 02:58:23PM +0300, Maxim Kuvyrkov wrote: >>> On Jan 30, 2017, at 5:50 PM, Maxim Kuvyrkov <maxim.kuvyr...@linaro.org> >>> wrote: >>> >>>> On Jan 30, 2017, at 3:23 PM, Kyrill Tkachov <kyrylo.tkac...@foss.arm.com> >>>> wrote: >>>> >>>> Hi Maxim, >>>> >>>> On 30/01/17 12:06, Maxim Kuvyrkov wrote: >>>>> This patch enables prefetching at -O3 for aarch64 cores that set >>>>> "simultaneous prefetches" parameter above 0. There are currently no such >>>>> settings, so this patch doesn't change default code generation. >>>>> >>>>> I'm now working on improvements to -fprefetch-loop-arrays pass to make it >>>>> suitable for -O2. I'll post this work in the next month. >>>>> >>>>> Bootstrapped and regtested on x86_64-linux-gnu and aarch64-linux-gnu. >>>>> >>>> >>>> Are you aiming to get this in for GCC 8? >>>> I have one small comment on this patch: >>>> >>>> + /* Enable sw prefetching at -O3 for CPUS that have prefetch, and we >>>> + have deemed it beneficial (signified by setting >>>> + prefetch.num_slots to 1 or more). */ >>>> + if (flag_prefetch_loop_arrays < 0 >>>> + && HAVE_prefetch >>>> >>>> HAVE_prefetch will always be true on aarch64. >>>> I imagine midend code that had logic like this would need this check, but >>>> aarch64-specific code shouldn't need it. >>> >>> Agree, I'll remove HAVE_prefetch. >>> >>> This pattern was copied from other backends, and HAVE_prefetch is most >>> likely a historical artifact. >> >> Andrew raised a good point in the review of his patch that it is a bad idea >> to use one of prefetching parameters (simultaneous_prefetches) as indicator >> for whether to enable prefetching pass by default. Indeed there are cases >> when we want to set simultaneous_prefetch according to HW documentation (or >> experimental results), but not enable prefetching pass by default. >> >> This update to the patch addresses it. The patch adds a new explicit field >> to prefetch tuning structure "default_opt_level" that sets optimization level >> from which prefetching should be enabled by default. The current value is to >> enable prefetching at -O3; additionally, this parameter will come handy for >> enabling prefetching at -O2 [when it is ready]. > > I really don't like the scheme of changing the optimisation threshold when > profiling data is used. > > I've seen too many reports and presentations by the uninitiated who believe > that the use of profiling data has made the difference, when in reality > it is just GCC changing behaviour on which passes run. It is very > misleading!
OK, agree. That line came from i386 backend. I'll run benchmarks for enabling prefetching at -O2 at a later date, and, possibly, we will have a performance argument for reducing prefetch threshold when profile data is available. > > With that line removed, and any rebasing needed over changes to the macro, > I'm happy with this patch. Attached is updated patch. I'll commit it after bootstrap / regtest. Thanks, -- Maxim Kuvyrkov www.linaro.org
0002-Enable-fprefetch-loop-arrays-at-given-optimization-l.patch
Description: Binary data