On 05/18/2010 11:47 PM, Aurelien Jarno wrote:
> The reg allocator is able to issue move if needed, so the only
> improvement this patch is for doing a ext8u on both "q" registers.
>
> OTOH the reg allocator knows this situation and will try to avoid this
> situation during the allocation. Cheating on the reg allocator might
> have some wrong effects, especially after your patch "Allocate
> call-saved registers first". I am thinking of the scenario where the
> value is in memory (which is likely to be the case given the limited
> number of registers), it will be likely loaded in a "r" register (they
> are now at the top priority), and then ext8u will be called, which will
> issue "mov" + "and" instructions instead of a "movzbl" instruction.

The case I was concerned with is the fact that if we have a value
allocated to, say, %esi, and we need to do an ext8u, then the
register allocator has been told that it must move the value to a
"q" register in order to perform the movzbl.  In this case, the
new code will simply emit the andl.

I.e. the real problem is that we've told the register allocator
one way that the extend can be implemented, but not every way.

> All of that is purely theoretical. Do you know how does it behave in
> practice?

Picking the i386 target since it seems to use more extensions than
any other target, from

  linux-user-test -d op_opt,out_asm i386/ls

There are 176 instances of ext8u.
Of those, 83 instances are in-place, i.e. "ext8u_i32 tmp0,tmp0"

I examined the first 2 dozen appearances in the output assembly:

There are several instances of the value being in an "r" register:

  shr_i32 tmp1,edx,tmp13
  ext8u_i32 tmp1,tmp1
  =>
  0x601c5468:  shr    $0x8,%edi
  0x601c546b:  and    $0xff,%edi

All of the instances that I looked at that were not in-place happened
to already be using a "q" register -- usually %ebx.  I assume that's
because we place %ebx as the first allocation register and that's just
how things happen to work out once we've flushed the registers before
the qemu_ld.

  qemu_ld8u tmp0,tmp2,$0xffffffff
  ext8u_i32 tmp13,tmp0
  =>
  0x601c82f9:  movzbl (%esi),%ebx
  0x601c82fc:  movzbl %bl,%ebx


r~
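
P.S. To make the two code paths discussed above concrete, here is a minimal
standalone C sketch of the selection logic.  It is not the actual tcg/i386
backend code: sketch_ext8u, is_q_reg, the register tables and the textual
output are all made up for illustration.  The only facts it relies on are
that on i386 just %eax, %ecx, %edx and %ebx have a low-byte subregister
(the "q" class), and that any 32-bit register can be masked with 0xff.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* i386 general registers in hardware encoding order. */
enum { R_EAX, R_ECX, R_EDX, R_EBX, R_ESP, R_EBP, R_ESI, R_EDI };

const char *const reg32[8] = {
    "%eax", "%ecx", "%edx", "%ebx", "%esp", "%ebp", "%esi", "%edi"
};
const char *const reg8[4] = { "%al", "%cl", "%dl", "%bl" };

/* Only the first four registers have a byte subregister on i386 ("q"). */
int is_q_reg(int r)
{
    return r < 4;
}

/*
 * Hypothetical helper: write the instruction(s) an 8-bit zero extension
 * from src to dest could use, as text.
 *   - src in a "q" register:     a single movzbl
 *   - otherwise, in place:       a single and
 *   - otherwise, not in place:   mov followed by and
 */
void sketch_ext8u(char *out, size_t len, int dest, int src)
{
    assert(dest >= 0 && dest < 8 && src >= 0 && src < 8);

    if (is_q_reg(src)) {
        snprintf(out, len, "movzbl %s,%s", reg8[src], reg32[dest]);
    } else if (dest == src) {
        snprintf(out, len, "and $0xff,%s", reg32[dest]);
    } else {
        snprintf(out, len, "mov %s,%s; and $0xff,%s",
                 reg32[src], reg32[dest], reg32[dest]);
    }
}
```

With this sketch, an in-place extend of %edi produces the same "and" seen
in the first dump above, and an extend of %ebx produces the movzbl from
the second one.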