On 09/27/2016 01:53 PM, Bernd Schmidt wrote:
On 09/26/2016 09:08 PM, Denys Vlasenko wrote:
+@gccoptlist{-faggressive-loop-optimizations @gol
+-falign-functions[=@var{n}[,@var{m},[@var{n}[,@var{m}]]]] @gol
+-falign-jumps[=@var{n}[,@var{m}]] @gol
+-falign-labels[=@var{n}[,@var{m}]] -falign-loops[=@var{n}[,@var{m}]] @gol
@itemx -falign-functions=@var{n}
+@itemx -falign-functions=@var{n},@var{m}
@opindex falign-functions
+If @var{m} is not specified, it defaults to @var{n}.
There are inconsistencies here about how many arguments these take.
There's no documentation of what it would mean to have more than two.
The first paragraph seems to imply it's only allowed for -falign-functions,
but the same implementation is used for all three.
Noted. I will improve this part.
+#if defined (__i386__) || defined (__x86_64__)
+ /* Before -falign-foo=N,M,N2,M2 was introduced, x86 had a tweak.
+ -falign-functions=N with N > 8 was adding secondary alignment.
+ -falign-functions=10 was emitting this before every function:
+ .p2align 4,,9
+ .p2align 3
+ Now this behavior (and more) can be explicitly requested:
+ -falign-functions=16,10,8
+ Retain old behavior if N2 is missing: */
+ else if (a[0].log > 3)
+ a[1].log = 3;
+#endif
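For reference, the arithmetic described in that comment can be sketched in plain C. This is a standalone illustration, not the code from the patch or from the x86 backend: `ceil_log2` and `old_x86_align` are hypothetical names, and the rule "max skip is N-1, with a secondary 8-byte alignment when the primary one exceeds 8 bytes" is taken only from the comment above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Smallest L such that (1 << L) >= n; returns 0 for n <= 1.  */
static unsigned ceil_log2(unsigned n)
{
    unsigned log = 0;
    while (log < 31 && (1u << log) < n)
        log++;
    return log;
}

/* Hypothetical helper: write into BUF (SIZE > 0) the directives that the
   old x86 tweak is described as emitting for -falign-functions=N.  */
static void old_x86_align(unsigned n, char *buf, size_t size)
{
    unsigned log = ceil_log2(n);
    unsigned max_skip = n ? n - 1 : 0;
    int len;

    buf[0] = '\0';
    if (log == 0)
        return;                         /* No alignment requested.  */

    if (max_skip >= (1u << log) - 1) {
        /* N is a power of two: an unconditional alignment suffices.  */
        snprintf(buf, size, ".p2align %u\n", log);
        return;
    }

    /* Align to 2^log only if that costs at most max_skip padding bytes.  */
    len = snprintf(buf, size, ".p2align %u,,%u\n", log, max_skip);

    /* Secondary alignment: fall back to 8 bytes when the primary is larger.  */
    if (log > 3 && len > 0 && (size_t) len < size)
        snprintf(buf + len, size - (size_t) len, ".p2align 3\n");
}
```

Calling `old_x86_align(10, buf, sizeof buf)` yields the two directives quoted in the comment, while a power-of-two argument such as 16 yields a single `.p2align 4`.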
Can't have such #ifdef blocks in generic code. To start with, this
changes behaviour based on the host, when you want it to change
depending on the target. If there's no way to detect such a
situation from the x86 backend, such as in the option_override
function, then you'll need a hook.
Ok, will rework this part.
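One possible shape for that rework, purely as a sketch: the generic parser asks the selected target for its default instead of testing host macros. Every name here (`struct align_level`, `struct target_hooks`, `default_secondary_align`, `finish_align_parse`) is hypothetical and is not an existing GCC interface; only the `a[0].log > 3` / `a[1].log = 3` logic comes from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the patch's parsed values a[0] and a[1].  */
struct align_level {
    int log;      /* log2 of the requested alignment */
    int maxskip;  /* maximum number of padding bytes */
};

/* Hypothetical per-target hook table, filled in by each backend.  */
struct target_hooks {
    /* Called when the user did not supply N2.  May be NULL.  */
    void (*default_secondary_align)(struct align_level a[2]);
};

/* x86-style default: keep 8-byte alignment when the primary one is larger.  */
static void x86_default_secondary_align(struct align_level a[2])
{
    if (a[0].log > 3)
        a[1].log = 3;
}

static const struct target_hooks x86_hooks = { x86_default_secondary_align };
static const struct target_hooks generic_hooks = { NULL };

/* Generic code: no #if on host macros.  The decision depends on the
   target selected at run time, so a cross compiler behaves correctly.  */
static void finish_align_parse(const struct target_hooks *target,
                               struct align_level a[2], int have_n2)
{
    if (!have_n2 && target->default_secondary_align != NULL)
        target->default_secondary_align(a);
}
```

With this arrangement the old x86 behaviour is retained only when the x86 table is the active target and N2 was not given; other targets leave `a[1]` untouched.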
IIUC the intention for the whole patch is that behaviour is unchanged
by default, but there are additional options for users to choose?
Exactly.
Since it seems this is mostly for x86, maybe Uros should have a say
in whether this patch is a good idea or not.
Not really. I imagine the same considerations apply to any CPU:
cachelines are getting wider irrespective of architecture.