On Thu, Jan 26, 2017 at 9:56 PM, Andrew Pinski <apin...@cavium.com> wrote:
> Hi,
>   This patch enables -fprefetch-loop-arrays for -mcpu=thunderxt88 and
> -mcpu=thunderxt88p1.  I filled out the tuning structures for both
> thunderx and thunderx2t99.  No other core current enables software
> prefetching so I set them to 0 which does not change the default
> parameters.
>
> OK?  Bootstrapped and tested on both ThunderX2 CN99xx and ThunderX
> CN88xx with no regressions.  I got a 2x improvement for 462.libquantum
> on CN88xx, overall a 10% improvement on SPEC INT on CN88xx at -Ofast.
> CN99xx's SPEC did not change.

Heh, quite impressive for this kind of bit-rotten (and broken?) pass ;)

Richard.

> Thanks,
> Andrew Pinski
>
> ChangeLog:
> * config/aarch64/aarch64-protos.h (struct tune_params): Add
> prefetch_latency, simultaneous_prefetches, l1_cache_size, and
> l2_cache_size fields.
> (enum aarch64_autoprefetch_model): Add AUTOPREFETCHER_SW.
> * config/aarch64/aarch64.c (generic_tunings): Update to include
> prefetch_latency, simultaneous_prefetches, l1_cache_size, and
> l2_cache_size fields to 0.
> (cortexa35_tunings): Likewise.
> (cortexa53_tunings): Likewise.
> (cortexa57_tunings): Likewise.
> (cortexa72_tunings): Likewise.
> (cortexa73_tunings): Likewise.
> (exynosm1_tunings): Likewise.
> (thunderx_tunings): Fill out some of the new fields.
> (thunderxt88_tunings): New variable.
> (xgene1_tunings): Update to include prefetch_latency,
> simultaneous_prefetches, l1_cache_size, and l2_cache_size fields to 0.
> (qdf24xx_tunings): Likewise.
> (thunderx2t99_tunings): Fill out some of the new fields.
> (aarch64_override_options_internal): Consider AUTOPREFETCHER_SW like
> AUTOPREFETCHER_OFF.
> Set param values if the fields are non-zero.  Turn on
> prefetch-loop-arrays if AUTOPREFETCHER_SW and optimize level is at
> least 3 or profile feed usage is enabled.
> * config/aarch64/aarch64-cores.def (thunderxt88p1): Use thunderxt88 tuning.
> (thunderxt88): Likewise.
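
For readers following the ChangeLog, the behaviour described for aarch64_override_options_internal can be sketched roughly as below. This is only an illustration, not the code from the patch: the types tune_sketch and params_sketch and the function apply_prefetch_tuning are hypothetical stand-ins, the enum is reduced to the two values the ChangeLog mentions, and the condition is read as "AUTOPREFETCHER_SW and (optimize level >= 3 or profile use)", which is an assumption about how that sentence parses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the real GCC structures.  */
enum autoprefetch_model_sketch
{
  AUTOPREFETCHER_OFF,
  AUTOPREFETCHER_SW
};

/* The four per-core fields the ChangeLog adds to struct tune_params.  */
struct tune_sketch
{
  enum autoprefetch_model_sketch model;
  int prefetch_latency;
  int simultaneous_prefetches;
  int l1_cache_size;
  int l2_cache_size;
};

/* Stand-in for the --param values and the -fprefetch-loop-arrays flag.  */
struct params_sketch
{
  int prefetch_latency;
  int simultaneous_prefetches;
  int l1_cache_size;
  int l2_cache_size;
  bool prefetch_loop_arrays;
};

/* Copy each non-zero tuning field over the default parameter, then
   enable loop-array prefetching for software-prefetch cores when
   optimizing at level 3 or above, or when profile use is enabled.  */
void
apply_prefetch_tuning (const struct tune_sketch *tune, int optimize_level,
		       bool profile_use, struct params_sketch *params)
{
  assert (tune != NULL && params != NULL);

  /* A zero field means "keep the default", as the patch description says.  */
  if (tune->prefetch_latency != 0)
    params->prefetch_latency = tune->prefetch_latency;
  if (tune->simultaneous_prefetches != 0)
    params->simultaneous_prefetches = tune->simultaneous_prefetches;
  if (tune->l1_cache_size != 0)
    params->l1_cache_size = tune->l1_cache_size;
  if (tune->l2_cache_size != 0)
    params->l2_cache_size = tune->l2_cache_size;

  /* Assumed grouping: SW model AND (level >= 3 OR profile use).  */
  if (tune->model == AUTOPREFETCHER_SW
      && (optimize_level >= 3 || profile_use))
    params->prefetch_loop_arrays = true;
}
```

With this reading, a core whose tuning fields are all 0 and whose model is not AUTOPREFETCHER_SW leaves every parameter and the flag untouched, which matches the statement that the other cores see no change in defaults.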