On 9/1/23 02:30, Jiajie Chen wrote:
Signed-off-by: Jiajie Chen <c...@jia.je>
---
tcg/loongarch64/tcg-target-con-set.h | 1 +
tcg/loongarch64/tcg-target.c.inc | 60 ++++++++++++++++++++++++++++
2 files changed, 61 insertions(+)
Reviewed-by: Richard Henderson <richard.hender...@linaro.org>
diff --git a/tcg/loongarch64/tcg-target-con-set.h b/tcg/loongarch64/tcg-target-con-set.h
index 37b3f80bf9..d04916db25 100644
--- a/tcg/loongarch64/tcg-target-con-set.h
+++ b/tcg/loongarch64/tcg-target-con-set.h
@@ -31,4 +31,5 @@ C_O1_I2(r, 0, rZ)
C_O1_I2(r, rZ, ri)
C_O1_I2(r, rZ, rJ)
C_O1_I2(r, rZ, rZ)
+C_O1_I2(w, w, wJ)
Notes for improvement: 'J' is a signed 32-bit immediate.
+ if (const_args[2]) {
+ /*
+ * cmp_vec dest, src, value
+ * Try vseqi/vslei/vslti
+ */
+ int64_t value = sextract64(a2, 0, 8 << vece);
+ if ((cond == TCG_COND_EQ || cond == TCG_COND_LE || \
+ cond == TCG_COND_LT) && (-0x10 <= value && value <= 0x0f)) {
+ tcg_out32(s, encode_vdvjsk5_insn(cmp_vec_imm_insn[cond][vece], \
+ a0, a1, value));
+ break;
+ } else if ((cond == TCG_COND_LEU || cond == TCG_COND_LTU) &&
+ (0x00 <= value && value <= 0x1f)) {
+ tcg_out32(s, encode_vdvjuk5_insn(cmp_vec_imm_insn[cond][vece], \
+ a0, a1, value));
Better would be a new constraint that matches only
-0x10 <= x <= 0x1f
If the sign is wrong for the comparison, the constant can *always* be loaded
with just vldi.
Whereas at present, using J,
+ tcg_out_dupi_vec(s, type, vece, temp_vec, a2);
+ a2 = temp_vec;
this may require 3 instructions (lu12i.w + ori + vreplgr2vr).
By constraining the constants allowed, you let the register allocator see that
a register is required, and that register may then be reused by another
instruction.
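
To make the suggested range concrete, here is a rough standalone sketch of the check such a constraint could perform, plus the split between the signed and unsigned immediate forms used in the quoted hunk. The names (Cond, ImmKind, sext, fits_cmp_vec_imm, classify_cmp_vec_imm) are hypothetical and not taken from the patch or from QEMU; sext is only a local stand-in for sextract64(value, 0, len), and it assumes an arithmetic right shift on signed values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the TCG conditions discussed in the review. */
typedef enum {
    COND_EQ, COND_LE, COND_LT, COND_LEU, COND_LTU
} Cond;

typedef enum {
    IMM_SIGNED5,    /* fits the signed 5-bit form (vseqi/vslei/vslti) */
    IMM_UNSIGNED5,  /* fits the unsigned 5-bit form (LEU/LTU compares) */
    IMM_NEEDS_REG   /* must be materialized in a vector register first */
} ImmKind;

/* Sign-extend the low `len` bits of `value`; assumes 1 <= len <= 64. */
static int64_t sext(uint64_t value, unsigned len)
{
    assert(len >= 1 && len <= 64);
    return (int64_t)(value << (64 - len)) >> (64 - len);
}

/*
 * The range proposed in the review: after sign-extension to the element
 * width (8 << vece bits), accept only -0x10 <= x <= 0x1f.
 */
static bool fits_cmp_vec_imm(int64_t arg, unsigned vece)
{
    assert(vece <= 3);
    int64_t v = sext((uint64_t)arg, 8u << vece);
    return v >= -0x10 && v <= 0x1f;
}

/* Pick the immediate form for a compare, mirroring the quoted conditions. */
static ImmKind classify_cmp_vec_imm(Cond cond, int64_t arg, unsigned vece)
{
    assert(vece <= 3);
    int64_t v = sext((uint64_t)arg, 8u << vece);

    if (cond == COND_EQ || cond == COND_LE || cond == COND_LT) {
        return (v >= -0x10 && v <= 0x0f) ? IMM_SIGNED5 : IMM_NEEDS_REG;
    }
    return (v >= 0x00 && v <= 0x1f) ? IMM_UNSIGNED5 : IMM_NEEDS_REG;
}
```

With a constraint like fits_cmp_vec_imm, anything that reaches the backend as a constant but lands in IMM_NEEDS_REG (for example 0x10 with a signed compare, or -1 with an unsigned one) is still inside -0x10..0x1f, which is the case described above as loadable with a single vldi.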
r~