tbp wrote:
On 1/28/07, Richard Guenther <[EMAIL PROTECTED]> wrote:
On 1/28/07, tbp <[EMAIL PROTECTED]> wrote:
> objdump -wdrfC --no-show-raw-insn $1|perl -pe 's/^\s+\w+:\s+//'|perl
> -ne 'printf "%4d\n", hex($1) if /sub\s+\$(0x\w+),%esp/'|sort -r| head
> -n 10
>
> msvc: 2196 2100 1772 1692 1688 1444 1428 1312 1308 1160
> icc: 2412 2280 2172 2044 1928 1848 1820 1588 1428 1396
> gcc: 2604 2596 2412 2076 2028 1932 1900 1756 1720 1132
It would have been nice to tell us what the particular columns in
this table mean - now we have to decrypt objdump params and
perl postprocessing ourselves.
I should have known better than to post on a sunday morning. Sorry.
That's the sorted 10 largest stack allocations in binaries produced by
each compiler (presuming most everything falls in place).
Each time i verify codegen for a function across all 3, gcc always has
the largest frame by a substantial amount (on ia32). And that's what
that rigorous table is trying to demonstrate ;)
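For readers who would rather not decode the objdump/perl pipeline
quoted above, here is a rough Python equivalent of its filtering step.
It is only a sketch: the function name largest_frames and the sample
text are made up for illustration, and it assumes AT&T-syntax
disassembly in which a frame is allocated with a single
"sub $0x...,%esp". It sorts the sizes numerically, whereas the original
runs sort -r over the %4d-padded strings.

```python
import re

# Matches frame allocations such as "sub    $0x82c,%esp" in AT&T syntax.
FRAME_RE = re.compile(r"sub\s+\$(0x[0-9a-fA-F]+),%esp")


def largest_frames(disassembly, count=10):
    """Return the `count` largest stack allocations (in bytes), largest first."""
    sizes = [int(match.group(1), 16) for match in FRAME_RE.finditer(disassembly)]
    return sorted(sizes, reverse=True)[:count]


# Hypothetical disassembly fragment, not taken from any real binary.
sample = """
  sub    $0x82c,%esp
  sub    $0x10,%esp
  add    $0x10,%esp
  sub    $0xa2c,%esp
"""

for size in largest_frames(sample):
    print(f"{size:4d}")
```

Passing the text produced by objdump -d for a whole binary to
largest_frames should give the same kind of list as one row of the
table.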
Basically i'm wondering if a stack frame shrinking pass [ ] is
possible, [ ] makes no sense, [ ] has been done, [ ] is planned etc...
The current gcc register allocator (more correctly, reload) reserves a
separate stack slot for each pseudo register which did not get a hard
register. Stack slot sharing, or stack slot coloring, has been
implemented on the IRA branch. This is very useful for x86 and x86_64.
Besides a smaller stack and better code locality, it results in smaller
displacements, which means smaller insns (code) and even better code
locality.
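To make the idea concrete, here is a small Python sketch of slot
sharing by live range. It is not the IRA implementation: it assumes
each spilled pseudo has one contiguous live range, that every slot is
4 bytes, and that a simple greedy assignment is good enough. All names
(share_slots, frame_size, displacement_bytes, the p1..p4 pseudos) are
invented for the example. The last helper reflects the x86 encoding
rule that a nonzero displacement fits in one byte only if it lies in
[-128, 127], and otherwise takes four.

```python
SLOT_SIZE = 4  # assumption: every spilled pseudo needs one 4-byte slot


def share_slots(live_ranges):
    """Map each pseudo to a slot index, reusing slots whose occupant is dead.

    live_ranges maps a name to (start, end), with end exclusive.
    """
    slot_free_at = []  # slot_free_at[i] is the point where slot i becomes free
    assignment = {}
    for name, (start, end) in sorted(live_ranges.items(), key=lambda kv: kv[1]):
        for slot, free_at in enumerate(slot_free_at):
            if free_at <= start:  # previous occupant is dead, so share the slot
                slot_free_at[slot] = end
                assignment[name] = slot
                break
        else:  # every existing slot is still live, so open a new one
            slot_free_at.append(end)
            assignment[name] = len(slot_free_at) - 1
    return assignment


def frame_size(assignment):
    """Bytes of stack needed for the slots used by an assignment."""
    return (max(assignment.values()) + 1) * SLOT_SIZE if assignment else 0


def displacement_bytes(offset):
    """Bytes needed to encode a nonzero displacement in an x86 memory operand."""
    return 1 if -128 <= offset <= 127 else 4


# Four hypothetical spilled pseudos; only two are ever live at the same time.
spilled = {"p1": (0, 4), "p2": (2, 6), "p3": (4, 8), "p4": (6, 10)}
shared = share_slots(spilled)
print(shared, frame_size(shared), "bytes instead of", len(spilled) * SLOT_SIZE)
```

With one slot per pseudo, a function with a few dozen spills quickly
pushes frame offsets past -128, and every access to those slots then
pays for a 4-byte displacement instead of a 1-byte one.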
Another thing is sharing slots for saving call-used hard registers
across calls. The current gcc register allocator reserves a separate
slot for each call-used register assigned to a pseudo living through a
call. It is not so important for x86 but it is more important for
x86_64 (there are more call-used hard registers and they are bigger,
64-bit). Such stack slot sharing has also been implemented on IRA.
I am focused on making IRA available for gcc 4.4.