On 09/29/2016 04:45 PM, Bernd Schmidt wrote:
On 09/28/2016 02:57 PM, Denys Vlasenko wrote:
No change from past behavior: Tested that "-falign-functions" uses
an arch-dependent alignment. Tested that "-O2" uses an
arch-dependent alignment. Tested that "-O2 -falign-functions=N"
uses explicitly given alignment.

I suspect we may want some testcases to cover this as well as the new
functionality. Look for existing ones that can maybe be adapted.

I added one -  testsuite/gcc.target/i386/falign-functions.c


Index: gcc/common/config/i386/i386-common.c
===================================================================
--- gcc/common/config/i386/i386-common.c    (revision 239860)
+++ gcc/common/config/i386/i386-common.c    (working copy)
@@ -998,36 +998,17 @@ ix86_handle_option (struct gcc_options *opts,
     }
       return true;

-
-  /* Comes from final.c -- no real reason to change it.  */
-#define MAX_CODE_ALIGN 16
-
     case OPT_malign_loops_:
       warning_at (loc, 0, "-malign-loops is obsolete, use -falign-loops");
-      if (value > MAX_CODE_ALIGN)
-    error_at (loc, "-malign-loops=%d is not between 0 and %d",
-          value, MAX_CODE_ALIGN);
-      else
-    opts->x_align_loops = 1 << value;
       return true;

That does seem to be a functional change. I'll defer to Uros.

It would be awkward to translate -malign-loops=%d et al
to comma-separated string format.
Since this warning is there for some 15 years already,
anyone who actually cares should have converted to new options
long ago. With patch, -malign-loops=%d will still emit a warning,
but be ignored. At worst, this would result in not aligning
code as requested via obsolete options.


Index: gcc/config/arm/arm.c
===================================================================
--- gcc/config/arm/arm.c    (revision 239860)
+++ gcc/config/arm/arm.c    (working copy)
@@ -2899,9 +2899,10 @@ static GTY(()) tree init_optimize;
 static void
 arm_override_options_after_change_1 (struct gcc_options *opts)
 {
-  if (opts->x_align_functions <= 0)
+  if (opts->x_flag_align_functions && !opts->x_str_align_functions)

Are these conditions really equivalent? It looks like zero was
the default even when no -falign-functions was specified.
 Or is that overriden by init_alignments?

 {
-  if (opts->x_align_loops == 0)
+  /* -falign-foo without argument: supply one */
+  if (opts->x_flag_align_loops && !opts->x_str_align_loops)

Same here.

The execution flow for option parsing is somewhat convoluted, no doubt.

I found it experimentally that these are locations where
default alignment parameters are set when -falign-functions
is given with no arguments (or when it is implied by -O2).


+@gccoptlist{-faggressive-loop-optimizations @gol
+-falign-functions[=@var{n}[,@var{m},[@var{n}[,@var{m}]]]] @gol
+-falign-jumps[=@var{n}[,@var{m},[@var{n}[,@var{m}]]]] @gol
+-falign-labels[=@var{n}[,@var{m},[@var{n}[,@var{m}]]]] @gol
+-falign-loops[=@var{n}[,@var{m},[@var{n}[,@var{m}]]]] @gol

I think it would be best not to use the same name for different arguments. 
Maybe call the second set n2, m2 (everywhere).

 @item -falign-functions
 @itemx -falign-functions=@var{n}
+@itemx -falign-functions=@var{n},@var{m}
+@itemx -falign-functions=@var{n},@var{m},@var{n},@var{m}

Three args is also valid, isn't it (here and for the other options)?

Yes.
 @item -falign-labels
 @itemx -falign-labels=@var{n}
+@itemx -falign-labels=@var{n},@var{m}
+@itemx -falign-labels=@var{n},@var{m},@var{n},@var{m}
 @opindex falign-labels
+If @var{m} is not specified, it defaults to @var{n}.
 Align all branch targets to a power-of-two boundary, skipping up to
-@var{n} bytes like @option{-falign-functions}.  This option can easily
+@var{m}-1 bytes like @option{-falign-functions}.

Maybe just write "Align all branch targets to a power-of-two boundary. The options 
are exactly as described for @option{-falign-functions}."

Why m-1, by the way, Wouldn't it be simpler to just specify the maximum number 
of bytes skipped?

Two reasons.

(1) Because of the defaults. What M defaults to in -falign-labels=N? It's easy
to remember that missing M defaults to N.
The rule "M defaults to N-1" would be more awkward to remember.

(2) This parameter can be seen not as "maximum number of bytes skipped", but as
"minimum number of bytes available before boundary".

IOW: -falign-functions=N,M is
"align to N so as to ensure that at least M [without -1!] bytes can be fetched
before N-byte boundary".
Example: -falign-functions=16 == -falign-functions=16,16 ==
"align to 16 so as to ensure that at least [or in this case, always] 16 bytes
are available before 16-byte boundary".

This is actually a more useful point of view:
in choosing M, you need to decide what is the typical instruction length.
For example, http://lkml.iu.edu/hypermail/linux/kernel/1505.2/03292.html

        > defconfig vmlinux (w/o FRAME_POINTER) has 42141 functions.
        > 6923 of them have 1st insn 5 or more bytes long,
        > 5841 of them have 1st insn 6 or more bytes long,
        > 5095 of them have 1st insn 7 or more bytes long,
        > 786 of them have 1st insn 8 or more bytes long,
        > 548 of them have 1st insn 9 or more bytes long,
        > 375 of them have 1st insn 10 or more bytes long,
        > 73 of them have 1st insn 11 or more bytes long,
        > one of them has 1st insn 12 bytes long
        >
        > Thus ensuring that at least seven first bytes do not cross
        > 64-byte boundary would cover >98% of all functions.

With the current patch logic, if you want at least 7 bytes to be fetched without
crossing the cacheline, you simply set M=7 in -falign-functions=N,M. Not M=7-1.
Not M=7+1. Not M="I forgot that stupid rule again, + or - ???"


+#ifdef SUBALIGN_LOG

We want to avoid adding new #defines; existing ones are being converted
to target hooks. I suspect the best way is to record whether an M value
was specified, and override it in the x86 option_override if required.
If that's infeasible for some reason we can revisit this.

+      /* Before -falign-foo=N,M,N2,M2 was introduced, x86 had a tweak.
+     -falign-functions=N with N > 8 was adding secondary alignment.
+     -falign-functions=10 was emitting this before every function:
+        .p2align 4,,9
+        .p2align 3
+     Now this behavior (and more) can be explicitly requested:
+     -falign-functions=16,10,8
+     Retain old behavior if N2 is missing: */
+      else if (a[0].log > SUBALIGN_LOG)
+    a[1].log = SUBALIGN_LOG;

So this would live in i386.c, probably.

Thanks, I'm working on it...

Reply via email to