On Mon, May 16, 2016 at 11:38:04AM +0100, Wilco Dijkstra wrote:
> GCC expands switch statements in a very simplistic way and tries to use a table
> expansion even when it is a bad idea for performance or codesize.
> GCC typically emits extremely sparse tables that contain mostly default
> entri
On 05/23/16 15:32, Evandro Menezes wrote:
I'm fine with this patch, as it achieves in part what I intended
before: going beyond the default_case_values_threshold, too
conservative for Exynos M1. My concern is particularly what happens
to in-order targets, like the ubiquitous A53.
I'll get
On 05/24/16 07:08, Wilco Dijkstra wrote:
Jim Wilson wrote:
It looks like a slight loss on qdf24xx on SPEC CPU2006 at -O3. I see
about a 0.37% loss on the integer benchmarks, and no significant
change on the FP benchmarks. The integer loss is mainly due to
458.sjeng which drops 2%. We had trie
Jim Wilson wrote:
> It looks like a slight loss on qdf24xx on SPEC CPU2006 at -O3. I see
> about a 0.37% loss on the integer benchmarks, and no significant
> change on the FP benchmarks. The integer loss is mainly due to
> 458.sjeng which drops 2%. We had tried various values for
> max_case_valu
On 05/18/16 20:03, Jim Wilson wrote:
Though I see that the original patch from Samsung that added the
max_case_values field has the -O3 check, so there was apparently some
reason why they wanted it to work that way. The value that the
exynos-m1 is using, 48, looks pretty large, so maybe they th
On Mon, May 16, 2016 at 4:30 AM, James Greenhalgh wrote:
> As this change will change code generation for all cores (except
> Exynos-M1), I'd like to hear from those with more detailed knowledge of
> ThunderX, X-Gene and qdf24xx before I take this patch.
It looks like a slight loss on qdf24xx on
James Greenhalgh wrote:
> As this change will change code generation for all cores (except
> Exynos-M1), I'd like to hear from those with more detailed knowledge of
> ThunderX, X-Gene and qdf24xx before I take this patch.
>
> Let's give it another week or so for comments, and expand the CC list.
N
tex-A53 built for generic, but there is no
> difference in perlbench.
Where were these changes if not perlbench?
Thanks,
James
>
> From: Wilco Dijkstra
> Sent: 22 April 2016 17:15
> To: gcc-patches@gcc.gnu.org
> Cc: nd
> Subject: [PATCH][
ping
From: Wilco Dijkstra
Sent: 22 April 2016 17:15
To: gcc-patches@gcc.gnu.org
Cc: nd
Subject: [PATCH][AArch64] Improve aarch64_case_values_threshold setting
GCC expands switch statements in a very simplistic way and tries to use a table
expansion even
Kyrill Tkachov wrote:
> On 25/04/16 20:21, Wilco Dijkstra wrote:
> > The GCC switch expansion is awful, so
> > even with a good indirect predictor it is better to use conditional
> > branches.
>
> In what way is it awful? If there's something we can do better at
> can you file a bug report with a
Hi Wilco,
On 25/04/16 20:21, Wilco Dijkstra wrote:
Evandro Menezes wrote:
I assume that you mean that such improvements are true for
-mcpu=generic, yes? On which target, A53 or A57 or other?
It's true for any CPU setting. The SPEC results are for Cortex-A57;
however, I wrote a microbenchmark th
On 04/26/16 11:14, Wilco Dijkstra wrote:
Evandro Menezes wrote:
True, but the results when running on A53 could be quite different.
GCC is ~1.2% faster on Cortex-A53 built for generic, but there is no
difference in perlbench.
Looks good, then. Fine by me.
Thanks for your patience,
--
Evand
Evandro Menezes wrote:
>
> True, but the results when running on A53 could be quite different.
GCC is ~1.2% faster on Cortex-A53 built for generic, but there is no
difference in perlbench.
Wilco
On 04/25/16 14:58, Wilco Dijkstra wrote:
Evandro Menezes wrote:
I agree with your assessment, but I'm more curious to understand how
this change affects code built with the default -mcpu=generic when run
on both A53 and A57, the typical configuration of big.LITTLE machines.
I wouldn't expect th
Evandro Menezes wrote:
> I agree with your assessment, but I'm more curious to understand how
> this change affects code built with the default -mcpu=generic when run
> on both A53 and A57, the typical configuration of big.LITTLE machines.
I wouldn't expect the result to be any different as the -m
On 04/25/16 14:21, Wilco Dijkstra wrote:
Evandro Menezes wrote:
I assume that you mean that such improvements are true for
-mcpu=generic, yes? On which target, A53 or A57 or other?
It's true for any CPU setting. The SPEC results are for Cortex-A57;
however, I wrote a microbenchmark that shows im
Evandro Menezes wrote:
> I assume that you mean that such improvements are true for
> -mcpu=generic, yes? On which target, A53 or A57 or other?
It's true for any CPU setting. The SPEC results are for Cortex-A57;
however, I wrote a microbenchmark that shows improvements on
all targets I have access
On 04/22/16 11:15, Wilco Dijkstra wrote:
This patch fixes that by setting the default aarch64_case_values_threshold to
16 when the per-CPU tuning is not set. On SPEC2006 this improves the
switch-heavy benchmarks GCC and perlbench in both performance (1-2%) and size
(0.5-1% smaller).
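
A minimal sketch of the selection rule described above, not the patch
itself: a per-CPU value is used when one is set, otherwise the default of
16 applies. The struct, the function name and the convention that 0 means
"not set" are assumptions made for illustration, and the -O3 check
discussed elsewhere in this thread is left out.

```c
/* Hypothetical stand-in for a per-CPU tuning record.  A value of 0 is
   assumed here to mean "no per-CPU tuning set".  */
struct sketch_tune_params
{
  unsigned int max_case_values;
};

/* Default taken from the patch description above.  */
#define SKETCH_DEFAULT_CASE_VALUES 16u

/* Return the number of case values to use as the threshold: the
   per-CPU value when one is set, otherwise the default.  */
static unsigned int
sketch_case_values_threshold (const struct sketch_tune_params *tune)
{
  if (tune->max_case_values != 0)
    return tune->max_case_values;
  return SKETCH_DEFAULT_CASE_VALUES;
}
```

With a per-CPU value of 48 (the exynos-m1 setting mentioned in this
thread) the sketch returns 48; with no per-CPU value it returns 16.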
I a
GCC expands switch statements in a very simplistic way and tries to use a table
expansion even when it is a bad idea for performance or codesize.
GCC typically emits extremely sparse tables that contain mostly default entries
(something which currently cannot be tuned by backends). Additionally th