------- Comment #4 from ramana at gcc dot gnu dot org  2009-07-02 09:39 -------
(In reply to comment #3)
> Is there a C test case?  Can you add objdump of the gcc-generated asm and the
> fixed asm to show the impact on code size? (/me is surprised that 3*"add
> r0,sp,4" is smaller than 1**"add r0,sp,4"+3*"mov r0,r4"... Thumb is amazing 
> :-)

The length of add r0,sp,4 and mov r0,r4 is the same for Thumb1 (16 bits).


I suppose the ideal generated code would look something like the following,
modulo errors in the stack alignment in the prologue and the epilogue.

We also don't need r4 in that case :). So we can save a load and a store, as
well as 1 instruction overall. Smaller and faster by 1 instruction, with
reduced register usage.



        push    {lr}
        sub     sp, sp, #12     @ keep sp 8-byte aligned
        add     r0, sp, #4      @ r0 = address of the local X
        bl      _ZN1XC1Ev       @ X::X()
        add     r0, sp, #4      @ recompute the address instead of using r4
        bl      _Z3barP1X       @ bar(X*)
        add     r0, sp, #4
        bl      _ZN1XD1Ev       @ X::~X()
        add     sp, sp, #12     @ keep sp 8-byte aligned
        @ sp needed for prologue
        pop     {pc}
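
For reference, a source shape that would lead to the three calls above can be
sketched from the mangled names: _ZN1XC1Ev is X::X(), _Z3barP1X is bar(X*) and
_ZN1XD1Ev is X::~X(). This is only a guess at the test case, not the one
attached to the bug. The function name foo, the member v, the two counters and
all function bodies are assumptions added so that the snippet is
self-contained.

```cpp
// Hypothetical reconstruction. Only X::X(), bar(X*) and X::~X() are implied
// by the mangled names in the listing; everything else is assumed.
struct X {
    int v;      // assumed member, not taken from the report
    X();        // _ZN1XC1Ev
    ~X();       // _ZN1XD1Ev
};

static int live_objects = 0;    // bookkeeping for this sketch only
static int bar_calls = 0;       // bookkeeping for this sketch only

X::X() : v(0) { ++live_objects; }
X::~X() { --live_objects; }

// _Z3barP1X
void bar(X *p) {
    p->v = 42;
    ++bar_calls;
}

// foo is an assumed name. The address of the one local X is needed three
// times: as "this" for the constructor, as the argument to bar, and as
// "this" for the destructor.
void foo() {
    X x;
    bar(&x);
}
```

With everything in one translation unit the compiler may inline these calls,
so to see the three bl instructions one would presumably keep only the
declarations of X and bar next to foo and build for Thumb with -mthumb -Os.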


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40615
