Hi Maxim,

On 30/01/17 12:06, Maxim Kuvyrkov wrote:
> This patch enables prefetching at -O3 for aarch64 cores that set "simultaneous
> prefetches" parameter above 0.  There are currently no such settings, so this patch
> doesn't change default code generation.
>
> I'm now working on improvements to -fprefetch-loop-arrays pass to make it
> suitable for -O2.  I'll post this work in the next month.
>
> Bootstrapped and regtested on x86_64-linux-gnu and aarch64-linux-gnu.


Are you aiming to get this in for GCC 8?
I have one small comment on this patch:

+  /* Enable sw prefetching at -O3 for CPUS that have prefetch, and we
+     have deemed it beneficial (signified by setting
+     prefetch.num_slots to 1 or more).  */
+  if (flag_prefetch_loop_arrays < 0
+      && HAVE_prefetch

HAVE_prefetch will always be true on aarch64.
I imagine midend code that had logic like this would need this check, but 
aarch64-specific code shouldn't need it.

+      && optimize >= 3
+      && aarch64_tune_params.prefetch.num_slots > 0)
+    flag_prefetch_loop_arrays = 1;
+
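
For illustration, here is a minimal standalone sketch of how the check could
read once the HAVE_prefetch test is dropped. It is not the actual patch: the
struct and function names below are hypothetical stand-ins for
aarch64_tune_params.prefetch and the option-override code, and it assumes a
negative flag value means the user passed neither -fprefetch-loop-arrays nor
-fno-prefetch-loop-arrays.

```c
#include <assert.h>

/* Hypothetical stand-in for the prefetch part of aarch64_tune_params.  */
struct prefetch_tune
{
  int num_slots;   /* "simultaneous prefetches"; 0 means not beneficial.  */
};

/* Return the value flag_prefetch_loop_arrays should end up with.
   FLAG < 0 is assumed to mean "not set on the command line".
   There is no HAVE_prefetch test: it is always true on aarch64.  */
int
resolve_prefetch_flag (int flag, int optimize,
                       const struct prefetch_tune *tune)
{
  if (flag < 0 && optimize >= 3 && tune->num_slots > 0)
    return 1;
  return flag;
}

/* A few sanity checks on the sketch above.  */
void
prefetch_flag_examples (void)
{
  struct prefetch_tune tuned = { 4 };
  struct prefetch_tune untuned = { 0 };

  /* -O3 on a core that advertises prefetch slots: enabled.  */
  assert (resolve_prefetch_flag (-1, 3, &tuned) == 1);
  /* No slots advertised: left unset, as with all current tunings.  */
  assert (resolve_prefetch_flag (-1, 3, &untuned) == -1);
}
```

An explicit user choice (a flag value of 0 or 1) passes through unchanged, so
in this sketch only the unset case is affected.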

Cheers,
Kyrill
