On 4/17/19 5:33 AM, Mateja Marjanovic wrote:
> From: Mateja Marjanovic <mateja.marjano...@rt-rk.com>
>
> Optimize set of MSA instructions ILVOD.<B|H|W|D>, using
> directly tcg registers and performing logic on them instead
> of using helpers.
>
> In the following table, the first column is the performance
> before this patch. The second represents the performance
> after converting from helpers to tcg, but without using
> tcg_gen_deposit function. The third one is with the deposit
> function and with using a uint64_t constant bit mask, and
> the fourth is with the deposit function and with a mask
> which is a tcg constant. The fourth is implemented in this
> patch.
>
> Performance measurement is done by executing the
> instructions 10 million times on a computer
> with Intel Core i7-3770 CPU @ 3.40GHz×8.
>
> ==================================================================
> || instruction ||     1     ||     2    ||     3    ||     4    ||
> ==================================================================
> || ilvod.b     || 117.50 ms || 24.13 ms || 24.45 ms || 23.24 ms ||
> || ilvod.h     ||  93.16 ms || 24.21 ms || 24.28 ms || 23.20 ms ||
> || ilvod.w     || 119.90 ms || 24.15 ms || 23.19 ms || 22.95 ms ||
> || ilvod.d     ||  43.01 ms || 21.17 ms || 23.07 ms || 22.59 ms ||
> ==================================================================
> 1 - before
> 2 - no-deposit-no-mask-as-tcg-constant
> 3 - with-deposit-no-mask-as-tcg-constant
> 4 - with-deposit-with-mask-as-tcg-constant (final)
>
> The deposit function is used only in ILVOD.W.
>
> No-deposit version of the ILVOD.W implementation:
>
> static inline void gen_ilvod_w(CPUMIPSState *env, uint32_t wd,
>                                uint32_t ws, uint32_t wt)
> {
>     TCGv_i64 t1 = tcg_temp_new_i64();
>     TCGv_i64 t2 = tcg_temp_new_i64();
>     TCGv_i64 mask = tcg_const_i64(0xffffffff00000000ULL);
>
>     tcg_gen_and_i64(t1, msa_wr_d[wt * 2], mask);
>     tcg_gen_shri_i64(t1, t1, 32);
>     tcg_gen_and_i64(t2, msa_wr_d[ws * 2], mask);
>     tcg_gen_or_i64(msa_wr_d[wd * 2], t1, t2);
>
>     tcg_gen_and_i64(t1, msa_wr_d[wt * 2 + 1], mask);
>     tcg_gen_shri_i64(t1, t1, 32);
>     tcg_gen_and_i64(t2, msa_wr_d[ws * 2 + 1], mask);
>     tcg_gen_or_i64(msa_wr_d[wd * 2 + 1], t1, t2);
>
>     tcg_temp_free_i64(mask);
>     tcg_temp_free_i64(t1);
>     tcg_temp_free_i64(t2);
> }
>
> Suggested-by: Aleksandar Markovic <amarko...@wavecomp.com>
> Suggested-by: Philippe Mathieu-Daudé <phi...@redhat.com>
> Suggested-by: Richard Henderson <richard.hender...@linaro.org>
> Signed-off-by: Mateja Marjanovic <mateja.marjano...@rt-rk.com>
> ---
>  target/mips/helper.h     |  1 -
>  target/mips/msa_helper.c |  7 ----
>  target/mips/translate.c  | 91 +++++++++++++++++++++++++++++++++++++++++++++++-
>  3 files changed, 90 insertions(+), 9 deletions(-)
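
For reference, here is a plain-C model of what the quoted no-deposit
sequence computes for one 64-bit half of the vector register, next to a
deposit-based form. This is only a sketch: the function names are
hypothetical and none of this is QEMU code. deposit64_model() is a
hand-written stand-in for a deposit operation, and the way the patch
actually uses the deposit op for ILVOD.W is assumed, since it is not
shown in the quoted text.

```c
#include <assert.h>
#include <stdint.h>

/* Mask-and-shift form, mirroring the quoted no-deposit TCG sequence
 * for one 64-bit half (hypothetical name, not QEMU code). */
static uint64_t ilvod_w_mask(uint64_t ws, uint64_t wt)
{
    const uint64_t mask = 0xffffffff00000000ULL;
    uint64_t t1 = (wt & mask) >> 32; /* odd word of wt moves to the low half */
    uint64_t t2 = ws & mask;         /* odd word of ws stays in the high half */
    return t1 | t2;
}

/* Stand-in for a deposit: replace len bits of base, starting at bit
 * ofs, with the low len bits of val. Assumes ofs + len <= 64. */
static uint64_t deposit64_model(uint64_t base, unsigned ofs, unsigned len,
                                uint64_t val)
{
    uint64_t field = (len < 64 ? (1ULL << len) - 1 : ~0ULL) << ofs;
    return (base & ~field) | ((val << ofs) & field);
}

/* Deposit form (assumed shape): keep the high word of ws and drop the
 * high word of wt into bits 0..31. */
static uint64_t ilvod_w_deposit(uint64_t ws, uint64_t wt)
{
    return deposit64_model(ws, 0, 32, wt >> 32);
}

/* Self-check on one sample pair: both forms must agree. */
static void ilvod_w_check(void)
{
    uint64_t ws = 0x1111111122222222ULL;
    uint64_t wt = 0x3333333344444444ULL;
    assert(ilvod_w_mask(ws, wt) == ilvod_w_deposit(ws, wt));
}
```

In this model the AND on wt before the right shift does not change the
result, because the shift by 32 already discards the low word. The
deposit form needs one shift and one deposit per half and no mask
constant.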

Reviewed-by: Richard Henderson <richard.hender...@linaro.org>


r~