VFP load/store multiple instructions can be slightly optimized by loading the register offset constant into a variable once, outside the register loop, and using that preloaded variable inside the loop instead of reloading the offset value into a temporary variable on each loop iteration. This causes fewer TCG ops to be generated for a VFP load/store multiple instruction.
Signed-off-by: Juha Riihimäki <juha.riihim...@nokia.com>
---
diff --git a/target-arm/translate.c b/target-arm/translate.c
index 07ee638..e5a2881 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -3222,6 +3222,7 @@ static int disas_vfp_insn(CPUState * env, DisasContext *s, uint32_t insn)
                     offset = 8;
                 else
                     offset = 4;
+                tmp = tcg_const_i32(offset);
                 for (i = 0; i < n; i++) {
                     if (insn & ARM_CP_RW_BIT) {
                         /* load */
@@ -3232,8 +3233,9 @@ static int disas_vfp_insn(CPUState * env, DisasContext *s, uint32_t insn)
                         gen_mov_F0_vreg(dp, rd + i);
                         gen_vfp_st(s, dp, addr);
                     }
-                    tcg_gen_addi_i32(addr, addr, offset);
+                    tcg_gen_add_i32(addr, addr, tmp);
                 }
+                tcg_temp_free_i32(tmp);
                 if (insn & (1 << 21)) {
                     /* writeback */
                     if (insn & (1 << 24))
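The saving described in the commit message can be illustrated with a small toy model. This is only a sketch and does not use the real TCG API: `emit_movi`, `emit_add`, `emit_addi`, `count_before` and `count_after` are hypothetical names, and the model assumes (as the commit message implies) that the immediate-add helper materialises its constant into a temporary on every call. It counts only the address-increment ops, not the loads and stores themselves.

```c
#include <assert.h>

/* Toy model only: counts ops instead of emitting real TCG ops. */
static int op_count;

/* Hypothetical stand-ins for "load constant into temp" and "add". */
static void emit_movi(void) { op_count++; }
static void emit_add(void)  { op_count++; }

/* Assumption: an immediate add first materialises the constant,
 * then performs a register-register add. */
static void emit_addi(void)
{
    emit_movi();
    emit_add();
}

/* Before the patch: the offset constant is reloaded on every iteration. */
static int count_before(int n)
{
    assert(n >= 0);
    op_count = 0;
    for (int i = 0; i < n; i++) {
        emit_addi();
    }
    return op_count;
}

/* After the patch: the constant is loaded once, outside the loop. */
static int count_after(int n)
{
    assert(n >= 0);
    op_count = 0;
    emit_movi();
    for (int i = 0; i < n; i++) {
        emit_add();
    }
    return op_count;
}
```

Under these assumptions, a transfer of n registers needs 2n increment ops before the change and n + 1 after it, so `count_before(16)` returns 32 while `count_after(16)` returns 17. For a single register the two counts are equal, which matches the commit message's description of the gain as slight.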