On 12/17/24 14:35, Pierrick Bouvier wrote:
@@ -3001,11 +3010,18 @@ void tcg_optimize(TCGContext *s)
              break;
          case INDEX_op_qemu_ld_a32_i32:
          case INDEX_op_qemu_ld_a64_i32:
+            done = fold_qemu_ld_1reg(&ctx, op);
+            break;
          case INDEX_op_qemu_ld_a32_i64:
          case INDEX_op_qemu_ld_a64_i64:
+            if (TCG_TARGET_REG_BITS == 64) {
+                done = fold_qemu_ld_1reg(&ctx, op);
+                break;
+            }
+            QEMU_FALLTHROUGH;
          case INDEX_op_qemu_ld_a32_i128:
          case INDEX_op_qemu_ld_a64_i128:
-            done = fold_qemu_ld(&ctx, op);
+            done = fold_qemu_ld_2reg(&ctx, op);
              break;
          case INDEX_op_qemu_st8_a32_i32:
          case INDEX_op_qemu_st8_a64_i32:

Couldn't we handle this case in fold_masks instead (at least the 64-bit load on a 32-bit host case)?

No, not with the assertion that the TCGOp passed to fold_masks has a single
output.
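
For illustration only, here is a toy C model of that constraint (hypothetical
names, not the actual optimize.c code): a mask-folding helper can record the
known bits of exactly one output temp, so it asserts a single output, while a
64-bit load on a 32-bit host writes two registers and needs a separate path
(the fold_qemu_ld_2reg split in the hunk above).

/* Toy sketch, not QEMU code: why a single-output fold helper cannot
 * describe a load that produces two registers. */
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

typedef struct {
    int nb_outputs;          /* how many output temps the op writes */
    uint64_t z_mask[2];      /* known-zero bits, one mask per output */
} ToyOp;

/* Records known bits for the one and only output; mirrors the
 * single-output assertion mentioned above. */
static void fold_masks_toy(ToyOp *op, uint64_t z_mask)
{
    assert(op->nb_outputs == 1);   /* would fire for a 2-register load */
    op->z_mask[0] = z_mask;
}

int main(void)
{
    ToyOp ld_i32 = { .nb_outputs = 1 };               /* one output temp */
    ToyOp ld_i64_on_32bit_host = { .nb_outputs = 2 }; /* lo/hi halves */

    fold_masks_toy(&ld_i32, 0xffffffffull);           /* fine */
    /* fold_masks_toy(&ld_i64_on_32bit_host, ...) would trip the assert,
     * hence the separate 2-register handling in the patch. */
    printf("ld_i32 z_mask = 0x%" PRIx64 "\n", ld_i32.z_mask[0]);
    (void)ld_i64_on_32bit_host;
    return 0;
}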


r~
