On 01/04/2010 10:51 AM, Mark Colby wrote:
> This sounds like a dumb question I know. However the following code
> snippet results in many more machine instructions under 4.4.2 than under
> 2.9.5 (I am running a cygwin->PowerPC cross):
>
>   typedef unsigned int U32;
>   typedef union
>   {
>     U32 R;
>     struct
>     {
>       U32 BF1:2;
>       U32 :8;
>       U32 BF2:2;
>       U32 BF3:2;
>       U32 :18;
>     } B;
>   } TEST_t;
>   U32 testFunc(void)
>   {
>     TEST_t t;
>     t.R=0;
>     t.B.BF1=2;
>     t.B.BF2=3;
>     t.B.BF3=1;
>     return t.R;
>   }
>
> Output under 4.4.2 (powerpc-eabi-gcc-4-4-2 -O3 -S gcc-test.cpp -o
> gcc-test-442.s):
>
>   li 0,2
>   li 3,0
>   rlwimi 3,0,30,0,1
>   li 0,3
>   rlwimi 3,0,20,10,11
>   li 0,1
>   rlwimi 3,0,18,12,13
>   blr
>
> Output under 2.9.5 (powerpc-eabi-gcc-2-9-5 -O3 -S gcc-test.cpp -o
> gcc-test-295.s):
>
>   lis 3,0x8034
>   blr
>
> Is there any way to improve this behaviour? I have been using 2.9.5 very
> successfully for years and am now looking at 4.4.2, but have many such
> examples in my code (for clarity of commenting and maintainability).

This is very strange.  On x86_64, gcc 4.4.1 generates

	movl	$7170, %eax
	ret

This optimization is done by the first RTL cse pass.  I can't understand
why it's not being done for your target.  I guess this will need a
powerpc expert.

Andrew.
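
For reference, the two folded constants in this thread (7170 on x86_64, 0x8034 in the upper halfword on PowerPC) can be reproduced by hand. The sketch below is only an illustration: it assumes a 32-bit unsigned int, with bit-fields allocated from the least significant bit on x86_64 and from the most significant bit on big-endian PowerPC. Bit-field layout is implementation-defined, so this is a model of those two ABIs and not portable code. The function names are made up for this example and do not appear in the original post.

```c
#include <assert.h>
#include <stdint.h>

/* Field widths taken from the quoted TEST_t:
   BF1:2, 8 unnamed bits, BF2:2, BF3:2, 18 unnamed bits. */

/* Hypothetical helper: fields allocated starting at the least
   significant bit (assumed layout for little-endian x86_64). */
uint32_t pack_lsb_first(uint32_t bf1, uint32_t bf2, uint32_t bf3)
{
    return ((bf1 & 3u) << 0)     /* BF1 occupies bits 0..1   */
         | ((bf2 & 3u) << 10)    /* BF2 occupies bits 10..11 */
         | ((bf3 & 3u) << 12);   /* BF3 occupies bits 12..13 */
}

/* Hypothetical helper: fields allocated starting at the most
   significant bit (assumed layout for big-endian PowerPC). */
uint32_t pack_msb_first(uint32_t bf1, uint32_t bf2, uint32_t bf3)
{
    return ((bf1 & 3u) << 30)    /* BF1 is the top two bits        */
         | ((bf2 & 3u) << 20)    /* BF2 follows the 8 unnamed bits */
         | ((bf3 & 3u) << 18);   /* BF3 follows BF2                */
}

/* Checks the values assigned in testFunc (BF1=2, BF2=3, BF3=1)
   against the constants seen in the two folded outputs. */
void check_thread_constants(void)
{
    assert(pack_lsb_first(2u, 3u, 1u) == 7170u);       /* movl $7170, %eax */
    assert(pack_msb_first(2u, 3u, 1u) == 0x80340000u); /* lis 3,0x8034     */
}
```

The shift counts 30, 20 and 18 in the MSB-first helper are the same rotate amounts that appear in the three rlwimi instructions of the 4.4.2 output, so that sequence computes the same value at run time that 2.9.5 folded into a single lis.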