On 01/04/2010 10:51 AM, Mark Colby wrote:
> I know this may sound like a dumb question, but the following code
> snippet compiles to many more machine instructions under 4.4.2 than under
> 2.9.5 (I am running a Cygwin->PowerPC cross compiler):
> 
>   typedef unsigned int U32;
>   typedef union
>   {
>     U32 R;
>     struct
>     {
>       U32 BF1:2;
>       U32 :8;
>       U32 BF2:2;
>       U32 BF3:2;
>       U32 :18;
>     } B;
>   } TEST_t;
>   U32 testFunc(void)
>   {
>     TEST_t t;
>     t.R=0;
>     t.B.BF1=2;
>     t.B.BF2=3;
>     t.B.BF3=1;
>     return t.R;
>   }
> 
> Output under 4.4.2 (powerpc-eabi-gcc-4-4-2 -O3 -S gcc-test.cpp -o
> gcc-test-442.s):
> 
>   li 0,2
>   li 3,0
>   rlwimi 3,0,30,0,1
>   li 0,3
>   rlwimi 3,0,20,10,11
>   li 0,1
>   rlwimi 3,0,18,12,13
>   blr
> 
> Output under 2.9.5 (powerpc-eabi-gcc-2-9-5 -O3 -S gcc-test.cpp -o
> gcc-test-295.s):
> 
>   lis 3,0x8034
>   blr
> 
> Is there any way to improve this behaviour? I have been using 2.9.5 very
> successfully for years and am now looking at 4.4.2, but have many such
> examples in my code (for clarity of commenting and maintainability).

This is very strange.  On x86_64, gcc 4.4.1 generates

        movl    $7170, %eax
        ret

On x86_64 the constant folding is done by the first RTL CSE pass.  I can't
see why the same pass isn't folding the bit-field stores for your target.
I guess this will need a PowerPC expert.
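
For reference, the two constants (0x8034 << 16 from 2.9.5 on PowerPC, 7170 on x86_64) are the same three stores folded under different bit-field allocation orders. The sketch below reproduces both with explicit shifts and masks. It assumes bit-fields are allocated from the most significant bit on the PowerPC EABI target and from the least significant bit on x86_64; the helper names are hypothetical and not part of GCC or the original code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: insert a `width`-bit field that starts `pos` bits
   from the most significant end (assumed PowerPC EABI allocation order). */
static uint32_t put_msb_first(uint32_t word, uint32_t val,
                              unsigned pos, unsigned width)
{
    unsigned shift = 32u - pos - width;
    uint32_t mask = ((1u << width) - 1u) << shift;
    return (word & ~mask) | ((val << shift) & mask);
}

/* Hypothetical helper: insert a `width`-bit field that starts `pos` bits
   from the least significant end (assumed x86_64 allocation order). */
static uint32_t put_lsb_first(uint32_t word, uint32_t val,
                              unsigned pos, unsigned width)
{
    uint32_t mask = ((1u << width) - 1u) << pos;
    return (word & ~mask) | ((val << pos) & mask);
}

/* Field offsets follow the struct: BF1 at 0, 8 bits padding, BF2 at 10,
   BF3 at 12, each 2 bits wide. */
uint32_t expected_msb_first(void)
{
    uint32_t r = 0;
    r = put_msb_first(r, 2u, 0u, 2u);   /* t.B.BF1 = 2, shift 30 */
    r = put_msb_first(r, 3u, 10u, 2u);  /* t.B.BF2 = 3, shift 20 */
    r = put_msb_first(r, 1u, 12u, 2u);  /* t.B.BF3 = 1, shift 18 */
    return r;
}

uint32_t expected_lsb_first(void)
{
    uint32_t r = 0;
    r = put_lsb_first(r, 2u, 0u, 2u);   /* t.B.BF1 = 2 */
    r = put_lsb_first(r, 3u, 10u, 2u);  /* t.B.BF2 = 3 */
    r = put_lsb_first(r, 1u, 12u, 2u);  /* t.B.BF3 = 1 */
    return r;
}

/* Self-check against the constants seen in the two compiler outputs. */
void check_layout_constants(void)
{
    assert(expected_msb_first() == 0x80340000u); /* lis 3,0x8034 */
    assert(expected_lsb_first() == 7170u);       /* movl $7170, %eax */
}
```

The MSB-first shifts (30, 20, 18) are the same rotate counts that appear in the three rlwimi instructions from 4.4.2, so that output computes the same value; it just does so at run time instead of folding it to a single lis.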

Andrew.
