On 08/11/2016 10:59 PM, Andrew Pinski wrote:
On Thu, Aug 11, 2016 at 1:49 PM, Denys Vlasenko <dvlas...@redhat.com> wrote:
falign-functions=N is too simplistic.

Ingo Molnar ran some tests and it looks on latest x86 CPUs, 64-byte alignment
runs fastest (he tried many other possibilites).

However, developers are less than thrilled by the idea of a slam-dunk 64-byte
aligning everything. Too much waste:
        On 05/20/2015 02:47 AM, Linus Torvalds wrote:
        > At the same time, I have to admit that I abhor a 64-byte function
        > alignment, when we have a fair number of functions that are (much)
        > smaller than that.
        >
        > Is there some way to get gcc to take the size of the function into
        > account? Because aligning a 16-byte or 32-byte function on a 64-byte
        > alignment is just criminally nasty and wasteful.

This change makes it possible to align function to 64-byte boundaries *IF*
this does not introduce huge amount of padding.

Patch drops forced alignment to 8 if requested alignment is higher than 8:
before the patch, -falign-functions=9 was generating

        .p2align 4,,8
        .p2align 3

which means: "align to 16 if the skip is 8 bytes or less; else align to 8".
After this change, ".p2align 3" is not emitted.

It is dropped because I ultimately want to do something
like -falign-functions=64,8 - IOW, I want to align functions to 64 bytes,
but only if that generates padding of less than 8 bytes - otherwise I want
*no alignment at all*. The forced ".p2align 3" interferes with that intention.

Testing:
tested that with -falign-functions=N (tried 8, 15, 16, 17...) the alignment
directives are the same before and after the patch.
Tested that -falign-functions=N,N (two equal paramenters) works exactly
like -falign-functions=N.

Two things I noticed you missed:
1) ChangeLog entries (this is required as per the GNU coding style)

Is this form okay? -

...
Testing:
tested that with -falign-functions=N (tried 8, 15, 16, 17...) the alignment
directives are the same before and after the patch.
Tested that -falign-functions=N,N (two equal paramenters) works exactly
like -falign-functions=N.

2016-08-11  Denys Vlasenko  <dvlas...@redhat.com>

    * common.opt: Change definitions so that -falign_FOO= accepts a string.
    * config/i386/freebsd.h: Remove "If N is large, do at least 8 byte
    alignment" code.
    * config/i386/gnu-user.h: Likewise.
    * config/i386/iamcu.h: Likewise.
    * config/i386/openbsdelf.h: Likewise.
    * config/i386/x86-64.h: Likewise.
    * flags.h (struct target_flag_state): Add x_align_functions_max_skip
    member.
    * toplev.c (parse_N_M): New function.
    (init_alignments): Set align_FOO_log, align_FOO, align_FOO_max_skip
    from specified --align_FOO=N{,M} option
    * varasm.c (assemble_start_function): Use align_functions_max_skip
    instead of align_functions - 1.

Index: gcc/common.opt
===================================================================
....


2) Documentation changes.


Index: gcc/common.opt
===================================================================
...

 #define ASM_OUTPUT_MAX_SKIP_ALIGN(FILE,LOG,MAX_SKIP)                   \
   do {                                                                 \
     if ((LOG) != 0) {                                                  \
-      if ((MAX_SKIP) == 0) fprintf ((FILE), "\t.p2align %d\n", (LOG)); \
-      else {                                                           \
+      if ((MAX_SKIP) == 0 || (MAX_SKIP) >= (1<<(LOG)))                 \
+        fprintf ((FILE), "\t.p2align %d\n", (LOG));                    \
+      else                                                             \
        fprintf ((FILE), "\t.p2align %d,,%d\n", (LOG), (MAX_SKIP));     \
-       /* Make sure that we have at least 8 byte alignment if > 8 byte \
-          alignment is preferred.  */                                  \
-       if ((LOG) > 3                                                   \
-           && (1 << (LOG)) > ((MAX_SKIP) + 1)                          \
-           && (MAX_SKIP) >= 7)                                         \
-         fputs ("\t.p2align 3\n", (FILE));                             \
-      }                                                                        
\
     }                                                                  \
   } while (0)
 #undef  ASM_OUTPUT_MAX_SKIP_PAD

These looks like a bug fix, it most likely should be sent separately;
maybe even first.

ok

Also can you make sure the generic ASM_OUTPUT_MAX_SKIP_ALIGN is done correctly?

If by generic you mean "non-arch-specific", the situation is as follows:
if arch does not define ASM_OUTPUT_MAX_SKIP_ALIGN(file, align, max_skip), then
ASM_OUTPUT_ALIGN_WITH_NOP(file, align) or
ASM_OUTPUT_ALIGN (file, align) macros are used.
As you se from their arguments, those macros don't take "max_skip" argument.
IOW: even in existing code, they ignore max_skip, and always pad to the
specified alignment.
Whether this is considered "correct" or not, is up to you. The patch
changes nothing for those arches.

I checked all arches which do have ASM_OUTPUT_MAX_SKIP_ALIGN.

visium has the same "align at least to 8" logic like many x86 flavors,
do you think it should be removed?

rs6000, aarch64, arm use ".p2align %d,,%d" logic with no special cases,
look good to me.

rx uses ".balign %d,3,%d" and special-cases TARGET_AS100_SYNTAX,
this arch is probably fine too.

Reply via email to