Denys has submitted some patches to add more capabilities to the
-falign-* options, but these still have some issues. The original
idea seems to have been to allow for large alignments without
over-aligning small functions. The following patch implements that idea
by taking into account the function size as computed during
shorten_branches, rounded up to a power of two. For example, with
"-falign-functions=128 -flimit-function-alignment", a 15-byte function
will be 16-byte aligned rather than 128-byte aligned.
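
The limiting rule described above can be sketched as standalone C. This is
only an illustration that mirrors the arithmetic in the varasm.c hunk below,
not the patch itself: the names sketch_ceil_log2, sketch_align and
sketch_limit_alignment are hypothetical, the requested alignment is assumed
to be a power of two, and sketch_ceil_log2 is a stand-in for GCC's ceil_log2.

```c
#include <assert.h>

/* Smallest n such that (1 << n) >= x.  Hypothetical stand-in for GCC's
   ceil_log2; only meant for small function sizes.  */
static int
sketch_ceil_log2 (unsigned long x)
{
  assert (x >= 1 && x <= (1UL << 30));
  int n = 0;
  while ((1UL << n) < x)
    n++;
  return n;
}

/* The two values handed to the alignment directive: log2 of the
   alignment and the maximum number of padding bytes to skip.  */
struct sketch_align
{
  int log;
  int max_skip;
};

/* REQUESTED_LOG is log2 of the -falign-functions value (assumed to be a
   power of two), FN_SIZE is the size estimate in bytes, and LIMIT is
   nonzero when -flimit-function-alignment is in effect.  */
static struct sketch_align
sketch_limit_alignment (int requested_log, unsigned long fn_size, int limit)
{
  assert (requested_log >= 0 && requested_log <= 30);
  struct sketch_align a = { requested_log, (1 << requested_log) - 1 };

  if (limit && fn_size > 0)
    {
      /* Round the function size up to a power of two.  */
      int size_log = sketch_ceil_log2 (fn_size);
      if (size_log < a.log)
        a.log = size_log;

      /* Never pad by more than the rounded-up size minus one.  */
      int sz = 1 << size_log;
      if (sz < a.max_skip)
        a.max_skip = sz - 1;
    }
  return a;
}
```

With a requested log of 7 (128 bytes) and a 15-byte function, the sketch
yields log 4 and a max skip of 15, i.e. 16-byte alignment. A function
assumed to be 2 bytes long under a 64-byte request would yield log 1 and a
max skip of 1, which has the same shape as the ".p2align 1,,1" directive
the new test scans for.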
Bootstrapped and tested on x86_64-linux. Denys, does this solve your
problem? Anyone else, ok to commit?
Bernd
gcc/
* common.opt (flimit-function-alignment): New.
* doc/invoke.texi (-flimit-function-alignment): Document.
* emit-rtl.h (struct rtl_data): Add max_insn_address field.
* final.c (shorten_branches): Set it.
* varasm.c (assemble_start_function): Limit alignment to it
if requested.
gcc/testsuite/
* gcc.target/i386/align-limit.c: New test.
Index: gcc/common.opt
===================================================================
--- gcc/common.opt (revision 240861)
+++ gcc/common.opt (working copy)
@@ -906,6 +906,9 @@ Align the start of functions.
falign-functions=
Common RejectNegative Joined UInteger Var(align_functions)
+flimit-function-alignment
+Common Report Var(flag_limit_function_alignment) Optimization Init(0)
+
falign-jumps
Common Report Var(align_jumps,0) Optimization UInteger
Align labels which are only reached by jumping.
Index: gcc/doc/invoke.texi
===================================================================
--- gcc/doc/invoke.texi (revision 240861)
+++ gcc/doc/invoke.texi (working copy)
@@ -368,7 +368,7 @@ Objective-C and Objective-C++ Dialects}.
-fno-ira-share-spill-slots @gol
-fisolate-erroneous-paths-dereference -fisolate-erroneous-paths-attribute @gol
-fivopts -fkeep-inline-functions -fkeep-static-functions @gol
--fkeep-static-consts -flive-range-shrinkage @gol
+-fkeep-static-consts -flimit-function-alignment -flive-range-shrinkage @gol
-floop-block -floop-interchange -floop-strip-mine @gol
-floop-unroll-and-jam -floop-nest-optimize @gol
-floop-parallelize-all -flra-remat -flto -flto-compression-level @gol
@@ -8262,6 +8262,11 @@ If @var{n} is not specified or is zero,
Enabled at levels @option{-O2}, @option{-O3}.
+@item -flimit-function-alignment
+When function alignment is requested by @option{-falign-functions}, use
+a size estimate to prevent functions from being over-aligned. This
+limits the alignment to be no larger than the function itself.
+
@item -falign-labels
@itemx -falign-labels=@var{n}
@opindex falign-labels
Index: gcc/emit-rtl.h
===================================================================
--- gcc/emit-rtl.h (revision 240861)
+++ gcc/emit-rtl.h (working copy)
@@ -284,6 +284,9 @@ struct GTY(()) rtl_data {
to eliminable regs (like the frame pointer) are set if an asm
sets them. */
HARD_REG_SET asm_clobbers;
+
+ /* The highest address seen during shorten_branches. */
+ unsigned HOST_WIDE_INT max_insn_address;
};
#define return_label (crtl->x_return_label)
Index: gcc/final.c
===================================================================
--- gcc/final.c (revision 240861)
+++ gcc/final.c (working copy)
@@ -1462,7 +1462,7 @@ shorten_branches (rtx_insn *first)
if (!increasing)
break;
}
-
+ crtl->max_insn_address = insn_current_address;
free (varying_length);
}
Index: gcc/varasm.c
===================================================================
--- gcc/varasm.c (revision 240861)
+++ gcc/varasm.c (working copy)
@@ -1791,9 +1791,19 @@ assemble_start_function (tree decl, cons
&& align_functions_log > align
&& optimize_function_for_speed_p (cfun))
{
+ int align_log = align_functions_log;
+ int max_skip = align_functions - 1;
+ if (flag_limit_function_alignment && crtl->max_insn_address > 0)
+ {
+ int size_log = ceil_log2 (crtl->max_insn_address);
+ if (size_log < align_log)
+ align_log = size_log;
+ int sz = 1 << size_log;
+ if (sz < max_skip)
+ max_skip = sz - 1;
+ }
#ifdef ASM_OUTPUT_MAX_SKIP_ALIGN
- ASM_OUTPUT_MAX_SKIP_ALIGN (asm_out_file,
- align_functions_log, align_functions - 1);
+ ASM_OUTPUT_MAX_SKIP_ALIGN (asm_out_file, align_log, max_skip);
#else
ASM_OUTPUT_ALIGN (asm_out_file, align_functions_log);
#endif
Index: gcc/testsuite/gcc.target/i386/align-limit.c
===================================================================
--- gcc/testsuite/gcc.target/i386/align-limit.c (nonexistent)
+++ gcc/testsuite/gcc.target/i386/align-limit.c (working copy)
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -falign-functions=64 -flimit-function-alignment" } */
+/* { dg-final { scan-assembler ".p2align 1,,1" } } */
+/* { dg-final { scan-assembler-not ".p2align 6,,63" } } */
+
+void
+test_func (void)
+{
+}