https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88345
Kito Cheng <kito at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kito at gcc dot gnu.org

--- Comment #7 from Kito Cheng <kito at gcc dot gnu.org> ---
We are hitting this issue on RISC-V and have received complaints from Linux
kernel developers, though in a different form from the original report. We
found that cold functions, including any function marked as cold by
`-fguess-branch-probability`, do not honor the -falign-functions=N setting.
That is a problem for some Linux kernel features, which need to control the
minimum alignment so that they can atomically update an instruction, and that
requires 4-byte alignment.

However, current GCC behavior can't guarantee that even when
-falign-functions=4 is given. There are four options in my mind:

1. Fix -falign-functions=N so that it works as expected at -Os and for all
   cold functions.
2. Force 4-byte alignment if -fpatchable-function-entry is given; that should
   be doable by adjusting RISC-V's FUNCTION_BOUNDARY.
3. Adjust RISC-V's FUNCTION_BOUNDARY so that it honors -falign-functions=N.
4. Add a -malign-functions=N... Okay, I know that idea sucks; x86 already
   deprecated that.

But I think ideally this should be fixed by option 1 if possible.

Testcase from a RISC-V kernel developer:
```
/* { dg-do compile } */
/* { dg-options "-march=rv64gc -mabi=lp64d -O1 -falign-functions=128" } */
/* { dg-final { scan-assembler-times ".align 7" 2 } } */

// Use 128-byte alignment rather than 4-byte alignment since it is easier to observe.

__attribute__((__cold__)) void a() {} // This function isn't aligned to 128 bytes.
void b() {} // This function is aligned to 128 bytes.
```

Proposed fix:
```
diff --git a/gcc/varasm.c b/gcc/varasm.c
index 49d5cda122f..6f8ed85fea9 100644
--- a/gcc/varasm.c
+++ b/gcc/varasm.c
@@ -1907,8 +1907,7 @@ assemble_start_function (tree decl, const char *fnname)
      Note that we still need to align to DECL_ALIGN, as above,
      because ASM_OUTPUT_MAX_SKIP_ALIGN might not do any alignment at all.  */
   if (! DECL_USER_ALIGN (decl)
-      && align_functions.levels[0].log > align
-      && optimize_function_for_speed_p (cfun))
+      && align_functions.levels[0].log > align)
     {
 #ifdef ASM_OUTPUT_MAX_SKIP_ALIGN
       int align_log = align_functions.levels[0].log;
```
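
To make the effect of the proposed fix easier to see, here is a minimal standalone sketch of the condition before and after the change. It is only a model, not GCC code: `struct fn_model`, `log2_align`, `emits_falign_before` and `emits_falign_after` are hypothetical names, and each field merely stands in for the GCC expression named in its comment. The model also assumes N is a power of two.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the inputs to the check in assemble_start_function.
   None of these names exist in GCC. */
struct fn_model {
    bool user_align;          /* stands in for DECL_USER_ALIGN (decl) */
    bool optimized_for_speed; /* stands in for optimize_function_for_speed_p (cfun) */
    int decl_align_log;       /* stands in for `align`, log2 of the default alignment in bytes */
};

/* log2 of a power-of-two alignment, e.g. 128 -> 7, matching ".align 7".
   Assumption of this sketch: n is a nonzero power of two. */
int log2_align(unsigned n)
{
    assert(n != 0 && (n & (n - 1)) == 0);
    int log = 0;
    while (n > 1) {
        n >>= 1;
        log++;
    }
    return log;
}

/* Condition before the patch: functions not optimized for speed
   (cold functions, -Os) never get the -falign-functions=N alignment. */
bool emits_falign_before(const struct fn_model *f, int falign_log)
{
    return !f->user_align
           && falign_log > f->decl_align_log
           && f->optimized_for_speed;
}

/* Condition after the patch: the speed check is dropped, so only an explicit
   user alignment or an already sufficient default alignment suppresses it. */
bool emits_falign_after(const struct fn_model *f, int falign_log)
{
    return !f->user_align
           && falign_log > f->decl_align_log;
}
```

In this model, a cold function such as `a()` in the testcase (with `optimized_for_speed` false and, as an illustrative assumption, a `decl_align_log` of 1) is skipped by the old condition and aligned by the new one, while a hot function such as `b()` is aligned either way.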