On Wed, Sep 19, 2012 at 02:33:46PM -0700, Richard Henderson wrote: > On 09/19/2012 01:00 PM, Aurelien Jarno wrote: > > The copy propagation doesn't check the types of the temps during copy > > propagation. However TCG is using the mov_i32 for the i64 to i32 > > conversion and thus the two are not equivalent. > > > > With this patch tcg_opt_gen_mov() doesn't consider two temps with > > different types as copies anymore. > > > > So far it seems the optimization was not aggressive enough to trigger > > this bug, but it will be triggered later in this series once the copy > > propagation is improved. > > Exactly where/how does this manifest as problematic?
The problem arise when a 64-bit value is moved to a 32-bit value, and later this 64-bit value is reused. With the copy propagation if you consider them as identical, the 32-bit value might be used instead of the 64-bit value. This happens for example for the umull instruction on arm: | OP: | ---- 0xf67e5ea0 | mov_i32 tmp5,r3 | mov_i32 tmp6,r8 | ext32u_i64 tmp7,tmp5 | ext32u_i64 tmp8,tmp6 | mul_i64 tmp7,tmp7,tmp8 tmp7 is a 64-bit value | mov_i32 tmp6,tmp7 Now transfered to a 32-bit tmp | mov_i32 r1,tmp6 and a 32-bit register. | movi_i64 tmp8,$0x20 | shr_i64 tmp7,tmp7,tmp8 | mov_i32 tmp6,tmp7 | mov_i32 r3,tmp6 | goto_tb $0x1 | movi_i32 pc,$0xf67e5ea4 | exit_tb $0x7f16948b0299 | | OP after optimization and liveness analysis: | ---- 0xf67e5ea0 | nopn $0x2,$0x2 | nopn $0x2,$0x2 | ext32u_i64 tmp7,r3 | ext32u_i64 tmp8,r8 | mul_i64 tmp7,tmp7,tmp8 | mov_i32 tmp6,tmp7 | mov_i32 r1,tmp6 | movi_i64 tmp8,$0x20 | shr_i64 tmp7,r1,tmp8 Here, tmp7 is replaced by r1. However r1 only contains the 32-bit low part of tmp7, thus returning 0. | mov_i32 tmp6,tmp7 | mov_i32 r3,tmp6 | goto_tb $0x1 | movi_i32 pc,$0xf67e5ea4 | exit_tb $0x7f16948b0299 | end > We do this mov_i32 trick from i64->i32 when we're truncating. > Given that we're not (yet) targeting mips64 and having to > maintain 32-bit sign-extended quantities, I can't see how > that would matter. It does matter on architectures using different instructions to work on the 32 part of a register the registers. Actually in the case a above with a register shift right, shr_i32 and shr_i64 are always implemented differently to not shift bits from the 32bit high part to the low part. > We do the i32->i64 trick immediately before a proper extension. > > In either case I can't see how plain copy propagation should > matter until some other operation is involved. So, do we have > some other data propagation bug that is being masked here? I think it's a bug of the copy propagation considering these registers are equivalent. If on x86-64 you replace all accesses to rax by eax, your code will be smaller, but I am not sure it is going to work correctly. The only latent bug I can see there, is having mov_i32 both being used to move data between 32-bit temps and between a 64-bit temp and a 32-bit temp. -- Aurelien Jarno GPG: 1024D/F1BCDB73 aurel...@aurel32.net http://www.aurel32.net