Hi,
  I noticed that stack instructions are not used much on Intel platforms. I think this is a pity, because stack operations are small (a size optimization) and usually fast (since the Pentium, two consecutive push/pop instructions can be executed together, which is a speed optimization). Consider this small piece of code:

  extern int foo1(int *a);

  int foo2(int a)
  {
          int b = a + 2;
          return foo1(&b);
  }

Compiling with

  $ gcc -O2 -mpreferred-stack-boundary=2 -fomit-frame-pointer -S optim1.c
  $ gcc --version
  gcc (GCC) 4.2.0 20060107 (experimental)

produces the following code:

  foo2:
          subl    $8, %esp
          movl    12(%esp), %eax
          addl    $2, %eax
          movl    %eax, 4(%esp)
          leal    4(%esp), %eax
          movl    %eax, (%esp)
          call    foo1
          addl    $8, %esp
          ret

Compiling with -Os instead

  $ gcc -Os -mpreferred-stack-boundary=2 -fomit-frame-pointer -S optim1.c

produces:

  foo2:
          subl    $4, %esp
          movl    8(%esp), %eax
          addl    $2, %eax
          movl    %eax, (%esp)
          movl    %esp, %eax
          pushl   %eax
          call    foo1
          popl    %edx
          popl    %ecx
          ret

This is worse than what 4.0.2 produces:

  $ gcc -O2 -mpreferred-stack-boundary=2 -fomit-frame-pointer -S optim1.c
  $ gcc --version
  gcc (GCC) 4.0.2 20051125 (Red Hat 4.0.2-8)

  foo2:
          pushl   %eax
          movl    8(%esp), %eax
          addl    $2, %eax
          movl    %eax, (%esp)
          movl    %esp, %eax
          pushl   %eax
          call    foo1
          addl    $8, %esp
          ret

(Note the pushl %eax used as a size optimization instead of subl $4, %esp.)

Would it be possible, instead of allocating memory with subl/pushl and storing to it afterwards, to allocate and set the memory with a single pushl? Something like:

  foo2:
          movl    4(%esp), %eax
          addl    $2, %eax
          pushl   %eax
          pushl   %esp
          call    foo1
          popl    %edx
          popl    %ecx
          ret

(Note that the first pushl both allocates the variable on the stack and sets its value.)

Is anyone working in this direction?

bye
  Frediano Ziglio