Thanks for the reviews.
Queued in gitlab.com/danielhb/qemu/tree/ppc-next.

Daniel

On 10/19/22 09:50, Lucas Mateus Castro(alqotel) wrote:
From: "Lucas Mateus Castro (alqotel)" <lucas.ara...@eldorado.org.br>

Patches missing review: 12

v2 -> v3:
    - Used ctpop in i32 and i64 vprtyb
    - Changed gvec set up in xvtstdc[ds]p

v1 -> v2:
    - Implemented instructions with fni4/fni8 and dropped the helper:
        * VSUBCUW
        * VADDCUW
        * VPRTYBW
        * VPRTYBD
    - Reworked patch12 to only use gvec implementation with a few
      immediates.
    - Used bitsel_vec on patch9
    - Changed vec variables to tcg_constant_vec when possible

This patch series moves some instructions from the legacy decoder to
decodetree and translates them with gvec. In some cases the gvec
version ended up bigger, more complex and slower, so those
instructions were only moved to decodetree.

Each patch includes a comparison of the execution time before and
after the patch is applied. Each reported result is the sum of 10
executions.

The program used to time the execution worked like this:

    clock_t start = clock();
    for (int i = 0; i < LOOP; i++) {
        asm (
             load values in registers, between 2 and 3 instructions
             ".rept REPT\n\t"
             "INSTRUCTION registers\n\t"
             ".endr\n\t"
             save result from register, 1 instruction
        );
    }
    clock_t end = clock();
    printf("INSTRUCTION rept=REPT loop=LOOP, time taken: %.12lf\n",
           ((double)(end - start))/ CLOCKS_PER_SEC);

Here the rept column is the value used in .rept in the inline
assembly, and the loop column is the value used for the for loop.

All of these tests were run on a Power9.

The TCG op comparisons use data gathered with '-d op' and
'-d op_opt'.
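For readers who want to try something similar, the timing program
described above could be fleshed out roughly as follows. This is only
a sketch under assumptions, not the program used for the series: the
LOOP and REPT values, the choice of vaddcuw as the timed instruction,
the helper names timed_body(), time_instruction() and report(), and
the portable fallback for non-Power hosts are all hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical values; the series reports several rept/loop pairs. */
#define LOOP 1000
#define REPT 64
#define STR_(x) #x
#define STR(x) STR_(x)

/* One pass of the timed body: the instruction repeated REPT times. */
static unsigned int timed_body(unsigned int a, unsigned int b)
{
#if defined(__powerpc64__) && defined(__ALTIVEC__)
    /* Assumed Power path: vaddcuw yields the carry-out of a + b. */
    __vector unsigned int va = {a, a, a, a};
    __vector unsigned int vb = {b, b, b, b};
    __vector unsigned int vt;
    __asm__ volatile(
        ".rept " STR(REPT) "\n\t"
        "vaddcuw %0, %1, %2\n\t"
        ".endr\n\t"
        : "=&v"(vt)
        : "v"(va), "v"(vb));
    return vt[0];
#else
    /* Stand-in for other hosts: same carry-out, computed in plain C. */
    volatile unsigned int carry = 0;
    for (int r = 0; r < REPT; r++) {
        carry = (unsigned int)(((unsigned long long)a + b) >> 32);
    }
    return carry;
#endif
}

/* Time LOOP passes of the body and return the CPU time in seconds. */
static double time_instruction(void)
{
    unsigned int sink = 0;
    clock_t start = clock();
    assert(start != (clock_t)-1);   /* clock() reports failure as -1 */
    for (int i = 0; i < LOOP; i++) {
        sink ^= timed_body(0xFFFFFFFFu, (unsigned int)i);
    }
    clock_t end = clock();
    (void)sink;
    return ((double)(end - start)) / CLOCKS_PER_SEC;
}

/* Print one result line in the same shape as the cover letter. */
static void report(const char *name, double secs)
{
    printf("%s rept=%d loop=%d, time taken: %.12lf\n",
           name, REPT, LOOP, secs);
}
```

Calling report("vaddcuw", time_instruction()) from main() prints one
result line. Summing 10 such runs would give a figure comparable in
shape to the ones quoted in each patch.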

Lucas Mateus Castro (alqotel) (12):
  target/ppc: Moved VMLADDUHM to decodetree and use gvec
  target/ppc: Move VMH[R]ADDSHS instruction to decodetree
  target/ppc: Move V(ADD|SUB)CUW to decodetree and use gvec
  target/ppc: Move VNEG[WD] to decodtree and use gvec
  target/ppc: Move VPRTYB[WDQ] to decodetree and use gvec
  target/ppc: Move VAVG[SU][BHW] to decodetree and use gvec
  target/ppc: Move VABSDU[BHW] to decodetree and use gvec
  target/ppc: Use gvec to decode XV[N]ABS[DS]P/XVNEG[DS]P
  target/ppc: Use gvec to decode XVCPSGN[SD]P
  target/ppc: Moved XVTSTDC[DS]P to decodetree
  target/ppc: Moved XSTSTDC[QDS]P to decodetree
  target/ppc: Use gvec to decode XVTSTDC[DS]P

 target/ppc/fpu_helper.c             | 137 +++++-----
 target/ppc/helper.h                 |  42 ++--
 target/ppc/insn32.decode            |  50 ++++
 target/ppc/int_helper.c             | 107 ++------
 target/ppc/translate.c              |   1 -
 target/ppc/translate/vmx-impl.c.inc | 352 ++++++++++++++++++++++----
 target/ppc/translate/vmx-ops.c.inc  |  15 +-
 target/ppc/translate/vsx-impl.c.inc | 372 +++++++++++++++++++++++-----
 target/ppc/translate/vsx-ops.c.inc  |  21 --
 9 files changed, 771 insertions(+), 326 deletions(-)