On 06/03/16 17:22, Evandro Menezes wrote:
> On 06/03/16 05:51, Wilco Dijkstra wrote:
>> It looks like almost all AArch64 cores agree on an alignment of 16 for
>> functions, and 8 for loops and branches, so we should change
>> -mcpu=generic as well if there is no disagreement - feedback welcome.
>
> I'll see what sets of values Exynos M1 would be most comfortable with,
> but I also wonder whether -falign-labels should be a parameter in
> tune_params as well.
>
> Thoughts?

FWIW, here are the alignment values for functions, branches and loops
(in that order) that fare best on Exynos M1 with -mcpu=generic, in
order of preference:
1. 4-4-4
2. 16-4-16
3. 8-4-4
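
For anyone wanting to try these combinations, here is a minimal sketch
that turns each function-branch-loop triple into GCC command-line
options. It assumes that "branches" corresponds to -falign-jumps; the
helper name alignment_flags is made up for illustration.

```python
# Each triple is (function, branch, loop) alignment in bytes, listed in
# the order of preference reported for Exynos M1.
CANDIDATES = [(4, 4, 4), (16, 4, 16), (8, 4, 4)]


def alignment_flags(function, branch, loop):
    """Return GCC options requesting the given alignments.

    Assumption: "branches" maps to -falign-jumps.
    """
    return [
        f"-falign-functions={function}",
        f"-falign-jumps={branch}",
        f"-falign-loops={loop}",
    ]


if __name__ == "__main__":
    for triple in CANDIDATES:
        label = "-".join(str(n) for n in triple)
        print(label, "->", " ".join(alignment_flags(*triple)))
```

The resulting options can be appended to a build that already uses
-mcpu=generic to compare the three settings.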
I also checked the code size and, whenever the branch alignment was 8
or 16 bytes, it grew quickly, with no proportional improvement in
performance on Exynos M1.
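
As a rough illustration of why larger branch alignments cost code size,
the sketch below computes the average padding needed to reach a given
alignment. It is a simplified model, not a measurement: it assumes
4-byte instructions, branch targets spread uniformly over the possible
offsets, and no limit on how much padding the compiler will insert.

```python
INSN_BYTES = 4  # AArch64 instructions are 4 bytes wide


def padding(address, alignment):
    """Bytes of padding needed to round address up to alignment."""
    return (-address) % alignment


def average_padding(alignment):
    """Mean padding over every instruction-aligned offset within one
    alignment window (assumes offsets are uniformly distributed)."""
    offsets = range(0, max(alignment, INSN_BYTES), INSN_BYTES)
    return sum(padding(o, alignment) for o in offsets) / len(offsets)


if __name__ == "__main__":
    for alignment in (4, 8, 16):
        print(alignment, average_padding(alignment))
```

Under these assumptions the average padding per aligned target is 0
bytes for an alignment of 4, 2 bytes for 8 and 6 bytes for 16, so the
cost scales with the number of branch targets in the code.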
HTH
--
Evandro Menezes