On 01/26/2011 01:23 AM, Alexander Graf wrote:
> agraf@toonie:/studio/s390/qemu-s390> grep deposit target-s390x/translate.c
> tcg_gen_deposit_i64(regs[reg], regs[reg], tmp, 0, 32);
> tcg_gen_deposit_i64(regs[reg], regs[reg], v, 0, 32);
> tcg_gen_deposit_i64(regs[reg], regs[reg], tmp, 0, 16);
> tcg_gen_deposit_i64(regs[reg], regs[reg], v, 0, 8);
> tcg_gen_deposit_i64(regs[r1], regs[r1], tmp, 48, 16);
> tcg_gen_deposit_i64(regs[r1], regs[r1], tmp, 32, 16);
>
> The 0, 32 and 0, 16 versions should get accelerated pretty well while
> the 32, 16 and 48, 16 are not I assume?

No, only the 0,16 and 0,8 deposits correspond to a hardware insn on x86.
Given that the 0,32 lowpart writeback is almost certainly the most
common operation for s390x, I doubt the deposit patch will help with an
x86 host.

Have you thought about buffering the lowpart writeback in the
translator? I.e. when a 32-bit insn writes to a register, remember that
value without writing it back. If the next insn in the TB is also
32-bit, reuse the saved value, etc. Only perform the writeback for
64-bit insns using the register as a source, end of TB, and places that
can take an exception.

r~
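
The buffering idea above can be sketched in plain C. This is only a
rough model under stated assumptions, not QEMU code: LazyRegs,
write32, read32, read64, write64, flush_reg and flush_all are
hypothetical names, and the dirty flags are tracked at run time here,
whereas a real translator would track them at translation time.
deposit64 models the deposit semantics (replace bits [ofs, ofs+len) of
dst with the low len bits of src). For context on the 0,32 case: on
x86-64 a 32-bit register write zero-extends into the upper half, while
8- and 16-bit writes leave the remaining bits alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NREGS 16

/* Reference model of deposit: replace bits [ofs, ofs+len) of dst
   with the low len bits of src. */
static uint64_t deposit64(uint64_t dst, uint64_t src,
                          unsigned ofs, unsigned len)
{
    assert(len > 0 && len <= 64 && ofs <= 64 - len);
    uint64_t mask = (len == 64 ? ~0ULL : ((1ULL << len) - 1)) << ofs;
    return (dst & ~mask) | ((src << ofs) & mask);
}

/* Hypothetical state for the sketch; not a QEMU structure. */
typedef struct {
    uint64_t regs[NREGS];  /* architectural 64-bit registers */
    uint32_t low[NREGS];   /* buffered low halves, valid when dirty */
    bool dirty[NREGS];     /* low[r] has not been written back yet */
    unsigned deposits;     /* number of writebacks actually performed */
} LazyRegs;

/* Perform the delayed 0,32 deposit for one register, if pending. */
static void flush_reg(LazyRegs *s, unsigned r)
{
    assert(r < NREGS);
    if (s->dirty[r]) {
        s->regs[r] = deposit64(s->regs[r], s->low[r], 0, 32);
        s->dirty[r] = false;
        s->deposits++;
    }
}

/* 32-bit insn result: remember the value, do not write it back. */
static void write32(LazyRegs *s, unsigned r, uint32_t v)
{
    assert(r < NREGS);
    s->low[r] = v;
    s->dirty[r] = true;
}

/* 32-bit insn source: reuse the saved value when there is one. */
static uint32_t read32(const LazyRegs *s, unsigned r)
{
    assert(r < NREGS);
    return s->dirty[r] ? s->low[r] : (uint32_t)s->regs[r];
}

/* 64-bit insn source: the full register is needed, so write back. */
static uint64_t read64(LazyRegs *s, unsigned r)
{
    flush_reg(s, r);
    return s->regs[r];
}

/* 64-bit insn result: overwrites everything, pending half is dead. */
static void write64(LazyRegs *s, unsigned r, uint64_t v)
{
    assert(r < NREGS);
    s->regs[r] = v;
    s->dirty[r] = false;
}

/* End of TB, or a point that can take an exception. */
static void flush_all(LazyRegs *s)
{
    for (unsigned r = 0; r < NREGS; r++) {
        flush_reg(s, r);
    }
}
```

With this model, a run of 32-bit operations on one register followed
by a single 64-bit read performs one deposit instead of one per
operation, and a 32-bit write that is overwritten by a 64-bit write
before any flush point costs no deposit at all.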