Hi Maxim,
On 30/01/17 12:06, Maxim Kuvyrkov wrote:
> This patch enables prefetching at -O3 for aarch64 cores that set "simultaneous
> prefetches" parameter above 0. There are currently no such settings, so this patch
> doesn't change default code generation.
>
> I'm now working on improvements to -fprefetch-loop-arrays pass to make it
> suitable for -O2. I'll post this work in the next month.
>
> Bootstrapped and regtested on x86_64-linux-gnu and aarch64-linux-gnu.

Are you aiming to get this in for GCC 8?
I have one small comment on this patch:

> +  /* Enable sw prefetching at -O3 for CPUS that have prefetch, and we
> +     have deemed it beneficial (signified by setting
> +     prefetch.num_slots to 1 or more).  */
> +  if (flag_prefetch_loop_arrays < 0
> +      && HAVE_prefetch

HAVE_prefetch will always be true on aarch64.
I imagine midend code that had logic like this would need this check, but
aarch64-specific code shouldn't need it.

> +      && optimize >= 3
> +      && aarch64_tune_params.prefetch.num_slots > 0)
> +    flag_prefetch_loop_arrays = 1;
> +

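To illustrate the comment above, here is a rough standalone sketch of how the condition would read with the HAVE_prefetch check dropped. It is only a model of the logic, not the actual aarch64.c code: the struct prefetch_tune type and the resolve_prefetch_flag helper are hypothetical stand-ins for aarch64_tune_params.prefetch and the option-override code, and it assumes a negative flag value means the user gave neither -fprefetch-loop-arrays nor -fno-prefetch-loop-arrays.

```c
/* Hypothetical stand-in for the prefetch part of the tuning parameters.  */
struct prefetch_tune
{
  int num_slots;   /* models the "simultaneous prefetches" setting */
};

/* Return the value flag_prefetch_loop_arrays would end up with.
   Assumption: a negative input means the user did not set the option.  */
int
resolve_prefetch_flag (int flag_prefetch_loop_arrays, int optimize,
                       const struct prefetch_tune *tune)
{
  /* No HAVE_prefetch test: it is always true on aarch64.  */
  if (flag_prefetch_loop_arrays < 0
      && optimize >= 3
      && tune->num_slots > 0)
    return 1;

  /* Otherwise leave the setting as it was.  */
  return flag_prefetch_loop_arrays;
}
```

With this shape, an explicit -fno-prefetch-loop-arrays (modelled as 0) is still respected at -O3, and a core whose num_slots is 0 keeps the current default.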
Cheers,
Kyrill