On 12/17/24 14:35, Pierrick Bouvier wrote:
@@ -3001,11 +3010,18 @@ void tcg_optimize(TCGContext *s)
break;
case INDEX_op_qemu_ld_a32_i32:
case INDEX_op_qemu_ld_a64_i32:
+ done = fold_qemu_ld_1reg(&ctx, op);
+ break;
case INDEX_op_qemu_ld_a32_i64:
case INDEX_op_qemu_ld_a64_i64:
+ if (TCG_TARGET_REG_BITS == 64) {
+ done = fold_qemu_ld_1reg(&ctx, op);
+ break;
+ }
+ QEMU_FALLTHROUGH;
case INDEX_op_qemu_ld_a32_i128:
case INDEX_op_qemu_ld_a64_i128:
- done = fold_qemu_ld(&ctx, op);
+ done = fold_qemu_ld_2reg(&ctx, op);
break;
case INDEX_op_qemu_st8_a32_i32:
case INDEX_op_qemu_st8_a64_i32:
Couldn't we handle this case in fold_masks instead (at least the 64-bit store on 32-bit
guest case)?
No, not with the assertion that the TCGOp passed to fold_masks has a single
output.
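
As a rough illustration of the dispatch in the hunk above, here is a standalone
sketch. It is not QEMU code: LoadKind, load_output_count and
can_use_single_output_fold are hypothetical names made up for this example, and
host_reg_bits stands in for TCG_TARGET_REG_BITS. It only models which loads take
the 1reg path and which take the 2reg path.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the three groups of qemu_ld opcodes in the hunk. */
typedef enum {
    LD_I32,   /* INDEX_op_qemu_ld_a{32,64}_i32  */
    LD_I64,   /* INDEX_op_qemu_ld_a{32,64}_i64  */
    LD_I128   /* INDEX_op_qemu_ld_a{32,64}_i128 */
} LoadKind;

/*
 * Number of output registers, mirroring the switch in the patch:
 * i32 always takes the 1reg path, i64 takes it only when the host
 * register width is 64, and everything else falls through to 2reg.
 */
static int load_output_count(LoadKind kind, int host_reg_bits)
{
    switch (kind) {
    case LD_I32:
        return 1;
    case LD_I64:
        return host_reg_bits == 64 ? 1 : 2;
    case LD_I128:
        return 2;
    }
    return 0; /* not reached for valid LoadKind values */
}

/* A helper that asserts a single output is only usable when the count is 1. */
static bool can_use_single_output_fold(LoadKind kind, int host_reg_bits)
{
    return load_output_count(kind, host_reg_bits) == 1;
}
```

With this model, an i64 load with host_reg_bits == 32 reports two outputs, so a
helper that requires exactly one output cannot be used for it.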
r~