http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57932
Bug ID: 57932
Summary: Aligned stack wastes more than k bytes, if preferred stack boundary k=2**n, n>=4
Product: gcc
Version: 4.8.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: meisenmann....@fh-salzburg.ac.at

I have noticed that in many cases the "waste" of local stack space exceeds the minimum/optimal value implied by the option '-mpreferred-stack-boundary'. For example, the following code:

    extern int TestCall(int);
    int Caller(int val) { return TestCall(val); }

produces the assembler code:

    pushl %ebp
    movl %esp, %ebp
    subl $24, %esp
    movl 8(%ebp), %eax
    movl %eax, (%esp)
    call TestCall
    leave
    ret

Note: Compiled with an i386-elf cross compiler, GCC 4.8.1, based on MinGW. The same or a similar result can also be produced with i386-elf-gcc 4.6.4 and the original MinGW version 4.7.2.

The stack adjustment (subl $24, %esp) has to account for the return address, the saved previous frame pointer (and possibly pushed registers), local variables and arguments [...], and has to fulfill the requested stack boundary. In this case: -mpreferred-stack-boundary=4 (which implies -mincoming-stack-boundary=4). But the operand value 24 is not the optimal value to fulfill a stack boundary that is a multiple of k=16 bytes (k=2**n, where n = 4). The minimal (optimal) value should be 8, i.e. 16 - 4 (return address) - 4 (saved ebp), which still provides at least 4 bytes for the argument (of the call to 'TestCall').

Further investigation with different values of '-mpreferred-stack-boundary' has shown that the wasted gap exceeds k = 2**n bytes if n >= 4. If I use '-mpreferred-stack-boundary=3', the assembler code contains 'subl $8, %esp', fulfilling the requested 8-byte boundary. Assuming that the incoming stack is already aligned to a multiple of 16, this would also ensure a (local) stack boundary which is a multiple of 16 ...
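The arithmetic behind the expected operand can be written down as a small C sketch. This is only an illustration of the reasoning in this report, not how GCC computes frame sizes: the function name min_stack_adjust and its parameters are hypothetical, and it assumes the stack is aligned to k bytes just before the call pushes the return address.

```c
#include <assert.h>

/* Hypothetical helper (not part of GCC): smallest operand A for
   'subl $A, %esp' that leaves %esp on a k-byte boundary again.
     k      - stack boundary in bytes, a power of two (16 for n = 4)
     pushed - bytes already pushed since the aligned point
              (return address + saved %ebp = 8 in the example)
     needed - bytes the function itself needs (4 for one argument)
   Assumes the stack was k-aligned before the return address was pushed. */
unsigned min_stack_adjust(unsigned k, unsigned pushed, unsigned needed)
{
    assert(k != 0 && (k & (k - 1)) == 0);           /* k must be a power of two */
    unsigned total = pushed + needed;               /* bytes in use below the aligned point */
    unsigned rounded = (total + k - 1) & ~(k - 1);  /* round up to a multiple of k */
    return rounded - pushed;                        /* part that 'subl' still has to cover */
}
```

Under these assumptions, min_stack_adjust(16, 8, 4) and min_stack_adjust(8, 8, 4) both give 8, which matches the 'subl $8, %esp' seen with '-mpreferred-stack-boundary=3' and the value expected for n = 4.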
A more mathematical view: Given an incoming and preferred stack boundary of k = 2**n (where n >= 2), the assembler output will contain an instruction 'subl $A, %esp', and the unused (wasted) stack is less than k. With another incoming and preferred stack boundary k' = 2**n' and n' > n, the operand A' (of the stack-adjustment instruction) will (optimally) satisfy:

    A <= A' <= A + 2**n' - 2**n

i.e. the difference A' - A is less than or equal to 2**n' - 2**n (it never exceeds the difference of the preferred stack alignments).

A better "test case" is to use '-mpreferred-stack-boundary=8'. With the simple code sample above, I had assumed that the operand of the stack adjustment would never exceed 256 (bytes), but the result is:

    subl $504, %esp

The requirement of stack alignment to a 256-byte boundary (assuming that the incoming stack is already aligned) is fulfilled. But there is an "extra" wasted stack area of 256 bytes. In this case (according to my understanding of stack alignment), the operand should be 248 (i.e. 256 - 8).
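To make the comparison between observed and expected operands explicit, here is a minimal C sketch. The helper names excess_bytes and alignment_preserved are illustrative only; the numbers fed to them are the ones quoted in this report (24 vs. 8 for n = 4, 504 vs. 248 for n = 8).

```c
#include <assert.h>

/* Illustrative helper: bytes by which the observed 'subl' operand
   exceeds the expected (minimal) one. */
unsigned excess_bytes(unsigned observed, unsigned expected)
{
    assert(observed >= expected);
    return observed - expected;
}

/* Nonzero if removing 'excess' bytes from the adjustment keeps the
   k-byte alignment, i.e. if 'excess' is a multiple of k.
   k must be a power of two. */
int alignment_preserved(unsigned excess, unsigned k)
{
    assert(k != 0 && (k & (k - 1)) == 0);
    return (excess & (k - 1)) == 0;
}
```

With the values above, excess_bytes(24, 8) is 16 and excess_bytes(504, 248) is 256. In both cases the surplus is exactly one boundary k, so by this arithmetic dropping it would not disturb the requested alignment.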