On 30/07/2019 10:31, Ramana Radhakrishnan wrote:
On 30/07/2019 10:08, Christophe Lyon wrote:
On Mon, 29 Jul 2019 at 18:49, Wilco Dijkstra <wilco.dijks...@arm.com> wrote:
Currently the Arm backend selects the alternative sched pressure
algorithm. The issue is that this doesn't take register pressure into
account, and so it causes significant additional spilling on Arm where
there are only 14 allocatable registers. SPEC2006 shows significant
codesize reduction with the default pressure algorithm, so switch back
to that. PR77308 shows ~800 fewer instructions.
SPECINT2006 is ~0.6% faster on Cortex-A57 together with the other DImode
patches. Overall SPEC codesize is 1.1% smaller.
Hi Wilco,
Do you know which benchmarks were used when this was checked-in?
It isn't clear from
https://gcc.gnu.org/ml/gcc-patches/2012-07/msg00706.html
It was from my time in Linaro and thus would have been a famous embedded
benchmark, CoreMark and SPEC2000, all probably tested on Cortex-A9 and
Cortex-A15. In addition to this, I would like to see what the impact of
this is on something like Cortex-A53, as the issue rates are likely to
differ between the schedulers and so cause different behaviour.
I don't have all the notes for that today; maybe you can look in the
Linaro wiki.
I am concerned about taking this patch in without some more data across
a variety of cores.
My concern is that the original patch
(https://gcc.gnu.org/ml/gcc-patches/2012-07/msg00706.html) is lacking in
any real detail as to the reasons for the choice of the second algorithm
over the first.
- It's not clear what the win was
- It's not clear what outliers there were and whether they were significant.
And finally, it's not clear if, 7 years later, this is still the best
choice.
If the second algorithm really is better, why is no other target using
it by default?
I think we need a bit more information (both ways). In particular I'm
concerned not just by the overall benchmark average, but also the amount
of variance across the benchmarks. I think the default needs to avoid
significant outliers if at all possible, even if it is marginally less
good on the average.
R.
Thanks,
Ramana
Thanks,
Christophe
Bootstrap & regress OK on arm-none-linux-gnueabihf --with-cpu=cortex-a57
ChangeLog:
2019-07-29  Wilco Dijkstra  <wdijk...@arm.com>

	* config/arm/arm.c (arm_option_override): Don't override sched
	pressure algorithm.
--
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 81286cadf32f908e045d704128c5e06842e0cc92..628cf02f23fb29392a63d87f561c3ee2fb73a515 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -3575,11 +3575,6 @@ arm_option_override (void)
   if (use_neon_for_64bits == 1)
      prefer_neon_for_64bits = true;
 
-  /* Use the alternative scheduling-pressure algorithm by default.  */
-  maybe_set_param_value (PARAM_SCHED_PRESSURE_ALGORITHM, SCHED_PRESSURE_MODEL,
-			 global_options.x_param_values,
-			 global_options_set.x_param_values);
-
   /* Look through ready list and all of queue for instructions
      relevant for L2 auto-prefetcher.  */
   int param_sched_autopref_queue_depth;
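
The request earlier in the thread, to look at variance and outliers
across benchmarks rather than only the overall average, could be
sketched roughly as below. This is only an illustration: the function
name `summarize`, the benchmark names, the 2% outlier threshold and all
the numbers are made up for the example and are not measurements from
this patch.

```python
from statistics import mean, pstdev


def summarize(baseline, candidate, outlier_pct=2.0):
    """Compare two sets of per-benchmark results (e.g. run time or code size).

    Returns the mean and spread of the per-benchmark percentage change,
    plus every benchmark whose change is at least `outlier_pct` percent
    in either direction. A positive change means the candidate is larger
    (slower or bigger) than the baseline.
    """
    deltas = {
        name: 100.0 * (candidate[name] - baseline[name]) / baseline[name]
        for name in baseline
    }
    values = list(deltas.values())
    outliers = {n: d for n, d in deltas.items() if abs(d) >= outlier_pct}
    return {
        "mean_pct": mean(values),
        "stdev_pct": pstdev(values),
        "worst": max(deltas, key=deltas.get),
        "outliers": outliers,
    }


# Hypothetical numbers, purely to show the shape of the report.
baseline = {"bench_a": 100.0, "bench_b": 200.0, "bench_c": 50.0}
candidate = {"bench_a": 99.0, "bench_b": 196.0, "bench_c": 53.0}

report = summarize(baseline, candidate)
print(report)
```

Reporting the worst case and the outlier list alongside the mean makes
it visible when a small average change hides one large regression,
which is the situation a default setting should avoid.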