Hi,
  I noticed that GCC does not use stack instructions very much on
Intel platforms. I think this is a pity, because stack operations are
small (size optimization) and usually fast (since the Pentium, two
consecutive push/pop instructions can be executed together -> speed
optimization). Consider this small piece of code:

extern int foo1(int *a);

int foo2(int a)
{
        int b = a + 2;
        return foo1(&b);
}

Compiling it with

$ gcc -O2 -mpreferred-stack-boundary=2 -fomit-frame-pointer  -S optim1.c

$ gcc --version
gcc (GCC) 4.2.0 20060107 (experimental)

produces the following code:

foo2:
        subl    $8, %esp
        movl    12(%esp), %eax
        addl    $2, %eax
        movl    %eax, 4(%esp)
        leal    4(%esp), %eax
        movl    %eax, (%esp)
        call    foo1
        addl    $8, %esp
        ret

Compiling with -Os instead:

$ gcc -Os -mpreferred-stack-boundary=2 -fomit-frame-pointer  -S optim1.c

foo2:
        subl    $4, %esp
        movl    8(%esp), %eax
        addl    $2, %eax
        movl    %eax, (%esp)
        movl    %esp, %eax
        pushl   %eax
        call    foo1
        popl    %edx
        popl    %ecx
        ret

This is worse than what 4.0.2 produces:

$ gcc -O2 -mpreferred-stack-boundary=2 -fomit-frame-pointer  -S optim1.c

$ gcc --version
gcc (GCC) 4.0.2 20051125 (Red Hat 4.0.2-8)

foo2:
        pushl   %eax
        movl    8(%esp), %eax
        addl    $2, %eax
        movl    %eax, (%esp)
        movl    %esp, %eax
        pushl   %eax
        call    foo1
        addl    $8, %esp
        ret

(note the pushl %eax size optimization, used instead of subl $4, %esp)
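
To put rough numbers on the size difference, here is a small sketch in C
that adds up the encoded length of each listing above. The byte counts
are my own assumption, based on the usual 32-bit encodings with 8-bit
immediates and displacements (for example pushl %eax is 1 byte, subl $4,
%esp is 3 bytes, call with a 32-bit relative target is 5 bytes); they
are not taken from the compiler output, so check them with objdump -d
before relying on them. The names insn, total_bytes and COUNT are made
up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* One instruction and its assumed encoded length in bytes (32-bit mode). */
struct insn { const char *text; int bytes; };

#define COUNT(a) (sizeof(a) / sizeof((a)[0]))

/* gcc 4.2.0 -O2 */
static const struct insn gcc42_O2[] = {
    {"subl $8, %esp", 3},       {"movl 12(%esp), %eax", 4},
    {"addl $2, %eax", 3},       {"movl %eax, 4(%esp)", 4},
    {"leal 4(%esp), %eax", 4},  {"movl %eax, (%esp)", 3},
    {"call foo1", 5},           {"addl $8, %esp", 3},
    {"ret", 1},
};

/* gcc 4.2.0 -Os */
static const struct insn gcc42_Os[] = {
    {"subl $4, %esp", 3},       {"movl 8(%esp), %eax", 4},
    {"addl $2, %eax", 3},       {"movl %eax, (%esp)", 3},
    {"movl %esp, %eax", 2},     {"pushl %eax", 1},
    {"call foo1", 5},           {"popl %edx", 1},
    {"popl %ecx", 1},           {"ret", 1},
};

/* gcc 4.0.2 -O2 */
static const struct insn gcc402_O2[] = {
    {"pushl %eax", 1},          {"movl 8(%esp), %eax", 4},
    {"addl $2, %eax", 3},       {"movl %eax, (%esp)", 3},
    {"movl %esp, %eax", 2},     {"pushl %eax", 1},
    {"call foo1", 5},           {"addl $8, %esp", 3},
    {"ret", 1},
};

/* Sum the assumed encoded lengths of a sequence of n instructions. */
static int total_bytes(const struct insn *seq, size_t n)
{
    int sum = 0;
    assert(seq != NULL);
    for (size_t i = 0; i < n; i++)
        sum += seq[i].bytes;
    return sum;
}
```

Under these assumptions, total_bytes(gcc42_O2, COUNT(gcc42_O2)) comes to
30 bytes, the -Os listing to 24 and the 4.0.2 listing to 23.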

Would it be possible, instead of allocating memory with subl/pushl, to
allocate and set memory with pushl only? Something like:

foo2:
        movl    4(%esp), %eax
        addl    $2, %eax
        pushl   %eax
        pushl   %esp
        call    foo1
        popl    %edx
        popl    %ecx
        ret

(note that the first pushl both allocates and sets the variable on the
stack)
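
The sequence relies on pushl %esp storing the value %esp had before the
decrement (the behaviour on 286 and later processors), so the pushed
word is the address of b. Here is a toy model in C that steps through
the hand-written listing to check this. It is only a sketch: struct cpu,
push, push_esp, load and simulate_foo2 are hypothetical names, the
"stack" is a small array, and the return address is a dummy value.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16

/* Toy machine: a small stack of 32-bit words plus two registers. */
struct cpu {
    uint32_t mem[STACK_WORDS]; /* stack memory, one slot per 32-bit word */
    uint32_t esp;              /* byte offset of the top of the stack    */
    uint32_t eax;
};

/* pushl <value>: decrement esp by 4, then store the value there. */
static void push(struct cpu *c, uint32_t value)
{
    assert(c->esp >= 4);
    c->esp -= 4;
    c->mem[c->esp / 4] = value;
}

/* pushl %esp: the operand is read BEFORE esp is decremented. */
static void push_esp(struct cpu *c)
{
    push(c, c->esp); /* c->esp is evaluated before push() changes it */
}

/* Read the 32-bit word at byte offset addr. */
static uint32_t load(const struct cpu *c, uint32_t addr)
{
    assert(addr % 4 == 0 && addr / 4 < STACK_WORDS);
    return c->mem[addr / 4];
}

/* Run the proposed foo2 up to the call and return what foo1 would
   read through its pointer argument, i.e. *(&b). */
static uint32_t simulate_foo2(uint32_t a)
{
    struct cpu c = { .esp = STACK_WORDS * 4, .eax = 0 };

    push(&c, a);                 /* caller pushes the argument a        */
    push(&c, 0xdeadbeefu);       /* call foo2: dummy return address     */

    c.eax = load(&c, c.esp + 4); /* movl 4(%esp), %eax                  */
    c.eax += 2;                  /* addl $2, %eax                       */
    push(&c, c.eax);             /* pushl %eax: allocates and sets b    */
    push_esp(&c);                /* pushl %esp: pushes &b for foo1      */

    uint32_t arg = load(&c, c.esp); /* foo1's argument, found at (%esp) */
    return load(&c, arg);           /* dereference it: the value of b   */
}
```

In this model simulate_foo2(40) returns 42, i.e. the pointer handed to
foo1 does refer to the slot holding a + 2.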

Is anyone working in this direction?

bye
  Frediano Ziglio

