VFP load/store multiple instructions can be slightly optimized by
loading the register offset constant into a TCG temporary once, before
the register loop, and adding that temporary to the address inside the
loop instead of reloading the offset into a new temporary on every
iteration. This generates fewer TCG ops for each VFP load/store
multiple instruction.

Signed-off-by: Juha Riihimäki <juha.riihim...@nokia.com>
---
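As a rough illustration of the saving, here is a minimal sketch that
only counts ops. It assumes that an add-immediate expands to a constant
load followed by an add, as the description above implies. The helper
names (emit_movi, emit_add, emit_addi, ops_before, ops_after) are
hypothetical stand-ins for this sketch and are not QEMU or TCG APIs.

```c
#include <assert.h>

/* Toy op counter. All helpers here are hypothetical, not QEMU APIs. */
static int op_count;

/* Models loading a constant into a temporary: one op. */
static void emit_movi(void) { op_count++; }

/* Models one register-register add: one op. */
static void emit_add(void) { op_count++; }

/* Assumed expansion of add-immediate: constant load, then add. */
static void emit_addi(void) { emit_movi(); emit_add(); }

/* Address-update ops for n registers with add-immediate in the loop. */
static int ops_before(int n)
{
    assert(n >= 0);
    op_count = 0;
    for (int i = 0; i < n; i++)
        emit_addi();
    return op_count;
}

/* Address-update ops for n registers with the constant hoisted out. */
static int ops_after(int n)
{
    assert(n >= 0);
    op_count = 0;
    emit_movi();            /* constant loaded once, before the loop */
    for (int i = 0; i < n; i++)
        emit_add();
    return op_count;
}
```

Under that assumption the address updates cost 2n ops before and n + 1
ops after, so the two are equal for a single register and the saving
grows by one op for each additional register transferred.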
diff --git a/target-arm/translate.c b/target-arm/translate.c
index 07ee638..e5a2881 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -3222,6 +3222,7 @@ static int disas_vfp_insn(CPUState * env, DisasContext *s, uint32_t insn)
                      offset = 8;
                  else
                      offset = 4;
+                tmp = tcg_const_i32(offset);
                  for (i = 0; i < n; i++) {
                      if (insn & ARM_CP_RW_BIT) {
                          /* load */
@@ -3232,8 +3233,9 @@ static int disas_vfp_insn(CPUState * env, DisasContext *s, uint32_t insn)
                          gen_mov_F0_vreg(dp, rd + i);
                          gen_vfp_st(s, dp, addr);
                      }
-                    tcg_gen_addi_i32(addr, addr, offset);
+                    tcg_gen_add_i32(addr, addr, tmp);
                  }
+                tcg_temp_free_i32(tmp);
                  if (insn & (1 << 21)) {
                      /* writeback */
                      if (insn & (1 << 24))

Attachment: translate.c.vfpldmstm.diff