On 4/17/19 5:33 AM, Mateja Marjanovic wrote:
> From: Mateja Marjanovic <mateja.marjano...@rt-rk.com>
>
> Optimize set of MSA instructions ILVEV.<B|H|W|D>, using
> directly tcg registers and performing logic on them
> instead of using helpers.
>
> In the following table, the first column is the performance
> before this patch. The second represents the performance
> after converting from helpers to tcg, but without using
> tcg_gen_deposit function. The third one is with using the
> tcg_gen_deposit function and with using a uint64_t constant
> bit mask, and the fourth is with using the tcg_gen_deposit
> function and with a mask which is a tcg constant. The fourth
> is implemented in this patch.
>
> Performance measurement is done by executing the
> instructions 10 million times on a computer
> with Intel Core i7-3770 CPU @ 3.40GHz×8.
>
> ==================================================================
> || instruction ||     1     ||     2    ||     3    ||     4    ||
> ==================================================================
> || ilvev.b     || 126.92 ms || 24.52 ms || 25.19 ms || 23.89 ms ||
> || ilvev.h     ||  93.67 ms || 23.92 ms || 24.76 ms || 24.31 ms ||
> || ilvev.w     || 117.86 ms || 23.83 ms || 21.84 ms || 21.99 ms ||
> || ilvev.d     ||  45.49 ms || 19.74 ms || 20.21 ms || 20.07 ms ||
> ==================================================================
> 1 - before
> 2 - no-deposit-no-mask-as-tcg-constant
> 3 - with-deposit-no-mask-as-tcg-constant
> 4 - with-deposit-with-mask-as-tcg-constant (final)
>
> The deposit function is used only in ILVEV.W.
>
> No-deposit version of the ILVEV.W implementation:
>
> static inline void gen_ilvev_w(CPUMIPSState *env, uint32_t wd,
>                                uint32_t ws, uint32_t wt)
> {
>     TCGv_i64 t1 = tcg_temp_new_i64();
>     TCGv_i64 t2 = tcg_temp_new_i64();
>     uint64_t mask = 0x00000000ffffffffULL;
>
>     tcg_gen_andi_i64(t1, msa_wr_d[wt * 2], mask);
>     tcg_gen_andi_i64(t2, msa_wr_d[ws * 2], mask);
>     tcg_gen_shli_i64(t2, t2, 32);
>     tcg_gen_or_i64(msa_wr_d[wd * 2], t1, t2);
>
>     tcg_gen_andi_i64(t1, msa_wr_d[wt * 2 + 1], mask);
>     tcg_gen_andi_i64(t2, msa_wr_d[ws * 2 + 1], mask);
>     tcg_gen_shli_i64(t2, t2, 32);
>     tcg_gen_or_i64(msa_wr_d[wd * 2 + 1], t1, t2);
>
>     tcg_temp_free_i64(t1);
>     tcg_temp_free_i64(t2);
> }
>
> Suggested-by: Aleksandar Markovic <amarko...@wavecomp.com>
> Suggested-by: Philippe Mathieu-Daudé <phi...@redhat.com>
> Suggested-by: Richard Henderson <richard.hender...@linaro.org>
> Signed-off-by: Mateja Marjanovic <mateja.marjano...@rt-rk.com>
> ---
>  target/mips/helper.h     |  1 -
>  target/mips/msa_helper.c |  9 -----
>  target/mips/translate.c  | 87 +++++++++++++++++++++++++++++++++++++++++++++++-
>  3 files changed, 86 insertions(+), 11 deletions(-)
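
For readers following along, the no-deposit ILVEV.W sequence quoted above can be modelled in plain C on one 64-bit half of the vector register, next to a deposit-style formulation. This is only a sketch: model_deposit64 and the two ilvev_w_half_* functions are hypothetical names made up for illustration, they are not QEMU code, and the deposit variant is an assumption about how such a formulation could look rather than a copy of what the patch does.

```c
#include <stdint.h>

/*
 * Hypothetical plain-C model of a 64-bit deposit: replace the len-bit
 * field of base that starts at bit ofs with the low len bits of field.
 * Assumes 0 < len, ofs < 64 and ofs + len <= 64.
 */
static inline uint64_t model_deposit64(uint64_t base, uint64_t field,
                                       unsigned ofs, unsigned len)
{
    uint64_t mask = (len >= 64 ? ~0ULL : ((1ULL << len) - 1)) << ofs;

    return (base & ~mask) | ((field << ofs) & mask);
}

/* Mirrors the quoted and/and/shl/or sequence for one 64-bit half. */
static inline uint64_t ilvev_w_half_mask(uint64_t ws, uint64_t wt)
{
    const uint64_t mask = 0x00000000ffffffffULL;
    uint64_t t1 = wt & mask;          /* even word of wt stays in bits 31..0 */
    uint64_t t2 = (ws & mask) << 32;  /* even word of ws moves to bits 63..32 */

    return t1 | t2;
}

/* Same result with a single deposit: keep wt's low word, insert ws's. */
static inline uint64_t ilvev_w_half_deposit(uint64_t ws, uint64_t wt)
{
    return model_deposit64(wt, ws, 32, 32);
}
```

Both functions return the same value for any inputs, since the deposit keeps bits 31..0 of wt and overwrites bits 63..32 with the low word of ws.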

Reviewed-by: Richard Henderson <richard.hender...@linaro.org>

r~