On 2023/9/2 01:24, Richard Henderson wrote:
On 9/1/23 02:30, Jiajie Chen wrote:
Signed-off-by: Jiajie Chen <c...@jia.je>
---
tcg/loongarch64/tcg-target-con-set.h | 1 +
tcg/loongarch64/tcg-target.c.inc | 60 ++++++++++++++++++++++++++++
2 files changed, 61 insertions(+)
Reviewed-by: Richard Henderson <richard.hender...@linaro.org>
diff --git a/tcg/loongarch64/tcg-target-con-set.h b/tcg/loongarch64/tcg-target-con-set.h
index 37b3f80bf9..d04916db25 100644
--- a/tcg/loongarch64/tcg-target-con-set.h
+++ b/tcg/loongarch64/tcg-target-con-set.h
@@ -31,4 +31,5 @@ C_O1_I2(r, 0, rZ)
C_O1_I2(r, rZ, ri)
C_O1_I2(r, rZ, rJ)
C_O1_I2(r, rZ, rZ)
+C_O1_I2(w, w, wJ)
Notes for improvement: 'J' is a signed 32-bit immediate.
I was wondering about the behavior of 'J' on i128 types: in
tcg_target_const_match(), the argument type is int, so will the higher
bits be truncated?
Besides, tcg_target_const_match() does not know the vector element width.
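To make the element-width concern concrete, here is a minimal standalone sketch, not QEMU code. It assumes the vector constant reaches the constraint check already replicated across 64 bits; the patch's own use of sextract64(a2, 0, 8 << vece) suggests this, but it is an assumption here. replicate_elem() and fits_s32() are hypothetical stand-ins for the replication step and for a 'J'-style signed 32-bit test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in: replicate the low (8 << vece) bits of c across
 * 64 bits.  vece: 0 = 8-bit, 1 = 16-bit, 2 = 32-bit, 3 = 64-bit elements.
 */
static uint64_t replicate_elem(unsigned vece, uint64_t c)
{
    assert(vece <= 3);
    switch (vece) {
    case 0:
        return 0x0101010101010101ull * (uint8_t)c;
    case 1:
        return 0x0001000100010001ull * (uint16_t)c;
    case 2:
        return 0x0000000100000001ull * (uint32_t)c;
    default:
        return c;
    }
}

/* Hypothetical 'J'-style test: is val representable as a signed 32-bit value? */
static bool fits_s32(int64_t val)
{
    return val >= INT32_MIN && val <= INT32_MAX;
}
```

Under that assumption, an 8-bit element of 0x0f becomes 0x0f0f0f0f0f0f0f0f, which fails the signed 32-bit test even though the element itself is a small immediate, while 0xff replicates to all ones (-1) and passes. A matcher without the element width cannot tell these cases apart from genuinely wide constants.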
+ if (const_args[2]) {
+ /*
+ * cmp_vec dest, src, value
+ * Try vseqi/vslei/vslti
+ */
+ int64_t value = sextract64(a2, 0, 8 << vece);
+ if ((cond == TCG_COND_EQ || cond == TCG_COND_LE || \
+ cond == TCG_COND_LT) && (-0x10 <= value && value <= 0x0f)) {
+ tcg_out32(s, encode_vdvjsk5_insn(cmp_vec_imm_insn[cond][vece], \
+ a0, a1, value));
+ break;
+ } else if ((cond == TCG_COND_LEU || cond == TCG_COND_LTU) &&
+ (0x00 <= value && value <= 0x1f)) {
+ tcg_out32(s, encode_vdvjuk5_insn(cmp_vec_imm_insn[cond][vece], \
+ a0, a1, value));
Better would be a new constraint that only matches
-0x10 <= x <= 0x1f
If the sign is wrong for the comparison, it can *always* be loaded
with just vldi.
Whereas at present, using J,
+ tcg_out_dupi_vec(s, type, vece, temp_vec, a2);
+ a2 = temp_vec;
this may require 3 instructions (lu12i.w + ori + vreplgr2vr).
By constraining the constants allowed, you allow the register
allocator to see that a register is required, which may be reused for
another instruction.
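A rough sketch of the narrower constraint suggested above, as standalone C rather than a patch. All names here (sext_low_bits, vcmp_const_match, pick_imm_form, ImmForm) are hypothetical, and the predicate takes vece as a parameter, which is an assumption about how the element width would be made available to the check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Sign-extend the low `len` bits of value (1 <= len <= 64).
 * Stand-in for sextract64(value, 0, len).  Relies on arithmetic right
 * shift of signed values, which mainstream compilers provide.
 */
static int64_t sext_low_bits(uint64_t value, unsigned len)
{
    assert(len >= 1 && len <= 64);
    return (int64_t)(value << (64 - len)) >> (64 - len);
}

/* Hypothetical constraint: accept only element values in [-0x10, 0x1f]. */
static bool vcmp_const_match(int64_t val, unsigned vece)
{
    int64_t elem = sext_low_bits((uint64_t)val, 8u << vece);
    return elem >= -0x10 && elem <= 0x1f;
}

typedef enum {
    IMM_SIGNED5,    /* fits the signed 5-bit compare-immediate form */
    IMM_UNSIGNED5,  /* fits the unsigned 5-bit compare-immediate form */
    IMM_VIA_VLDI    /* wrong sign for this compare: load with vldi first */
} ImmForm;

/* Decide how an accepted element value is used for a given compare. */
static ImmForm pick_imm_form(int64_t elem, bool unsigned_cmp)
{
    if (!unsigned_cmp && elem >= -0x10 && elem <= 0x0f) {
        return IMM_SIGNED5;
    }
    if (unsigned_cmp && elem >= 0 && elem <= 0x1f) {
        return IMM_UNSIGNED5;
    }
    return IMM_VIA_VLDI;
}
```

With this shape, anything outside [-0x10, 0x1f] is rejected by the constraint, so the register allocator materializes it in a register, and the backend's fallback only ever has to handle small values such as 0x1f for a signed compare or -1 for an unsigned one.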
r~