On Tue, Feb 28, 2017 at 1:53 AM, Maxim Kuvyrkov <maxim.kuvyr...@linaro.org> wrote:
>> On Feb 20, 2017, at 5:38 PM, Kyrill Tkachov <kyrylo.tkac...@foss.arm.com> wrote:
>>
>> Hi Maxim,
>>
>> On 30/01/17 11:24, Maxim Kuvyrkov wrote:
>>> This patch series improves the -fprefetch-loop-arrays pass through small
>>> fixes and tweaks, and then enables it for several AArch64 cores.
>>>
>>> My tunings were done on and for Qualcomm hardware, with results varying
>>> between +0.5-1.9% for SPEC2006 INT and +0.25%-1.0% for SPEC2006 FP at -O3,
>>> depending on hardware revision.
>>>
>>> This patch series enables restricted -fprefetch-loop-arrays at -O2, which
>>> also improves SPEC2006 numbers.
>>>
>>> Biggest progressions are on 419.mcf and 437.leslie3d, with no serious
>>> regressions on other benchmarks.
>>>
>>> I'm now investigating making -fprefetch-loop-arrays more aggressive for
>>> Qualcomm hardware, which improves performance on most benchmarks, but also
>>> causes big regressions on 454.calculix and 462.libquantum. If I can fix
>>> these two regressions, prefetching will give another boost to AArch64.
>>>
>>> Andrew just posted similar prefetching tunings for Cavium's cores, and the
>>> two patches have trivial conflicts. I'll post mine as-is, since it addresses
>>> one of the comments on Andrew's review (adding a stand-alone struct for
>>> tuning parameters).
>>>
>>> Andrew, feel free to just copy-paste it to your patch, since it is just a
>>> mechanical change.
>>>
>>> All patches were bootstrapped and regtested on x86_64-linux-gnu and
>>> aarch64-linux-gnu.
>>>
>>
>> I've tried these patches out on Cortex-A72 and Cortex-A53, with the tuning
>> struct entries appropriately modified to enable the changes on those cores.
>> I'm seeing the mcf and leslie3d improvements as well on Cortex-A72 and
>> Cortex-A53 and no noticeable regressions.
>> I've also verified that the improvements are due to the prefetch
>> instructions rather than just the unrolling that the pass does.
>> So I'm in favor of enabling this for the cores that benefit from it.
>>
>> Do you plan to get this in for GCC 8?
>
> Hi Kyrill,
>
> My hope was to push them in time for GCC 7, but it seems too late now. I'll
> return to these patches at the beginning of Stage 1.

Ping on this patch set, as I really want to get the prefetching side in for
ThunderX 1 and 2. Or should I resubmit my patch set?

Thanks,
Andrew

>
> --
> Maxim Kuvyrkov
> www.linaro.org
>