On 01/22/2018 06:46 AM, Luis Machado wrote:
> This patch adds a new option to control the minimum stride that a memory
> reference must have before the loop prefetch pass may issue software
> prefetch hints for it. There are two motivations:
>
> * Make the pass less aggressive, only issuing prefetch hints for bigger
> strides that are more likely to benefit from prefetching. I've noticed a
> case in cpu2017 where we were issuing thousands of hints, for example.
>
> * For processors that have a hardware prefetcher, like Falkor, it allows the
> loop prefetch pass to defer prefetching of smaller (less than the threshold)
> strides to the hardware prefetcher instead. This prevents conflicts between
> the software prefetcher and the hardware prefetcher.
>
> I've noticed a considerable reduction in the number of prefetch hints and
> slightly positive performance numbers. This aligns GCC and LLVM in terms of
> prefetch behavior for Falkor.
>
> The default settings should guarantee no changes for existing targets. Those
> are free to tweak the settings as necessary.
>
> No regressions in the testsuite and bootstrapped ok on aarch64-linux.
>
> Ok?
>
> 2018-01-22 Luis Machado <luis.mach...@linaro.org>
>
> Introduce option to limit software prefetching to known constant
> strides above a specific threshold with the goal of preventing
> conflicts with a hardware prefetcher.
>
> gcc/
> * config/aarch64/aarch64-protos.h (cpu_prefetch_tune)
> <minimum_stride>: New const int field.
> * config/aarch64/aarch64.c (generic_prefetch_tune): Update to include
> minimum_stride field.
> (exynosm1_prefetch_tune): Likewise.
> (thunderxt88_prefetch_tune): Likewise.
> (thunderx_prefetch_tune): Likewise.
> (thunderx2t99_prefetch_tune): Likewise.
> (qdf24xx_prefetch_tune): Likewise. Set minimum_stride to 2048.
> (aarch64_override_options_internal): Update to set
> PARAM_PREFETCH_MINIMUM_STRIDE.
> * doc/invoke.texi (prefetch-minimum-stride): Document new option.
> * params.def (PARAM_PREFETCH_MINIMUM_STRIDE): New.
> * params.h (PARAM_PREFETCH_MINIMUM_STRIDE): Define.
> * tree-ssa-loop-prefetch.c (should_issue_prefetch_p): Return false if
> stride is constant and is below the minimum stride threshold.
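
For readers following along, the check described in the last ChangeLog
entry can be sketched roughly as below. This is a minimal standalone
illustration, not the code from the patch: the struct, the function name
should_issue_prefetch_sketch, and the convention that a negative
threshold disables the check are all assumptions made for the example.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified view of a memory reference in a loop.
   The real pass uses its own data structures; this is illustration only.  */
struct mem_ref_sketch
{
  bool stride_is_constant;  /* is the stride known at compile time?  */
  long stride;              /* bytes advanced per iteration; may be negative  */
};

/* Return false when the stride is a known constant whose magnitude is
   below MINIMUM_STRIDE, leaving such accesses to a hardware prefetcher.
   A negative MINIMUM_STRIDE is assumed here to mean "no threshold".  */
static bool
should_issue_prefetch_sketch (const struct mem_ref_sketch *ref,
                              long minimum_stride)
{
  if (minimum_stride < 0)
    return true;  /* threshold disabled: behave as before  */

  if (ref->stride_is_constant && labs (ref->stride) < minimum_stride)
    return false; /* small constant stride: skip the software hint  */

  /* Large or non-constant strides are still candidates.  */
  return true;
}
```

With a threshold of 2048 (the value the ChangeLog sets for qdf24xx), a
constant 64-byte stride would be skipped in this sketch, while a
4096-byte stride or a stride that is not a compile-time constant would
still be eligible for a hint.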
OK for the trunk.
jeff