On 9/23/22 21:47, Lucas Mateus Castro(alqotel) wrote:
From: "Lucas Mateus Castro (alqotel)"<lucas.ara...@eldorado.org.br>
This patch moves VMLADDUHM to decodetree and creates a gvec implementation
using mul_vec and add_vec.
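
For readers unfamiliar with the instruction, here is a minimal scalar
sketch of what VMLADDUHM computes per lane: an unsigned 16-bit multiply,
an add, and truncation to the low 16 bits. This is the shape that maps
onto a vector multiply followed by a vector add. The function and macro
names below are hypothetical and are not taken from the patch or from
QEMU.

```c
#include <stdint.h>

#define VMLADDUHM_LANES 8  /* eight 16-bit lanes in a 128-bit vector */

/*
 * Hypothetical reference model, not QEMU code:
 * r[i] = (a[i] * b[i] + c[i]) mod 2^16 for each halfword lane.
 */
static void vmladduhm_ref(uint16_t r[VMLADDUHM_LANES],
                          const uint16_t a[VMLADDUHM_LANES],
                          const uint16_t b[VMLADDUHM_LANES],
                          const uint16_t c[VMLADDUHM_LANES])
{
    for (int i = 0; i < VMLADDUHM_LANES; i++) {
        /* Widen first so the product is computed as unsigned 32-bit. */
        uint32_t prod = (uint32_t)a[i] * b[i];

        /* The cast keeps only the low 16 bits (the "modulo" part). */
        r[i] = (uint16_t)(prod + c[i]);
    }
}
```

Because only the low 16 bits of each lane survive, the multiply and the
add can be done as two independent element-wise operations of the same
width, with no wider intermediate needed.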
rept    loop     master        patch
8       12500    0,01810500    0,00903100 (-50.1%)
25      4000     0,01739400    0,00747700 (-57.0%)
100     1000     0,01843600    0,00901400 (-51.1%)
500     200      0,02574600    0,01971000 (-23.4%)
2500    40       0,05921600    0,07121800 (+20.3%)
8000    12       0,15326700    0,21725200 (+41.7%)
I think the significant difference in performance when REPT is low and
LOOP is high is due to the fact that the new implementation has a higher
translation time: when using a helper only 5 TCG ops are used, but with
the patch a total of 10 TCG ops are needed (Power lacks a direct mul_vec
equivalent, so this instruction is implemented with the help of 5 others:
vmuleu, vmulou, vmrgh, vmrgl and vpkum).
Signed-off-by: Lucas Mateus Castro (alqotel)<lucas.ara...@eldorado.org.br>
---
target/ppc/helper.h | 2 +-
target/ppc/insn32.decode | 2 ++
target/ppc/int_helper.c | 3 +-
target/ppc/translate.c | 1 -
target/ppc/translate/vmx-impl.c.inc | 48 ++++++++++++++++++-----------
5 files changed, 35 insertions(+), 21 deletions(-)
Reviewed-by: Richard Henderson <richard.hender...@linaro.org>
r~