Re: [PATCH v2 03/14] tcg/loongarch64: Lower cmp_vec to vseq/vsle/vslt

Jiajie Chen Fri, 01 Sep 2023 10:29:06 -0700


On 2023/9/2 01:24, Richard Henderson wrote:

On 9/1/23 02:30, Jiajie Chen wrote:

Signed-off-by: Jiajie Chen <c...@jia.je>
---
  tcg/loongarch64/tcg-target-con-set.h |  1 +
  tcg/loongarch64/tcg-target.c.inc     | 60 ++++++++++++++++++++++++++++
  2 files changed, 61 insertions(+)


Reviewed-by: Richard Henderson <richard.hender...@linaro.org>

diff --git a/tcg/loongarch64/tcg-target-con-set.h b/tcg/loongarch64/tcg-target-con-set.h
index 37b3f80bf9..d04916db25 100644
--- a/tcg/loongarch64/tcg-target-con-set.h
+++ b/tcg/loongarch64/tcg-target-con-set.h
@@ -31,4 +31,5 @@ C_O1_I2(r, 0, rZ)
  C_O1_I2(r, rZ, ri)
  C_O1_I2(r, rZ, rJ)
  C_O1_I2(r, rZ, rZ)
+C_O1_I2(w, w, wJ)


Notes for improvement: 'J' is a signed 32-bit immediate.

I was wondering about the behavior of 'J' on i128 types: in tcg_target_const_match(), the argument type is int, so will the higher bits be truncated?


Besides, tcg_target_const_match() does not know the vector element width.
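
To make the two concerns above concrete, here is a small standalone sketch (not QEMU code). sext_field() is a hypothetical stand-in for what sextract64(value, 0, len) does in the patch, and fits_after_narrowing() is a hypothetical helper that checks whether a constant survives being narrowed to 32 bits. It only illustrates the arithmetic; it makes no claim about how tcg_target_const_match() is actually called.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for sextract64(value, 0, len):
 * keep the low `len` bits (1 <= len <= 64) and sign-extend them.
 */
static int64_t sext_field(uint64_t value, unsigned len)
{
    uint64_t field = len < 64 ? (value & ((UINT64_C(1) << len) - 1)) : value;
    uint64_t sign = UINT64_C(1) << (len - 1);

    /* Classic xor/subtract sign extension, done in unsigned arithmetic. */
    return (int64_t)((field ^ sign) - sign);
}

/*
 * Hypothetical check: does the constant keep its value if only the
 * low 32 bits are looked at (as a 32-bit-wide argument would see it)?
 */
static bool fits_after_narrowing(uint64_t value)
{
    return sext_field(value, 32) == (int64_t)value;
}
```

For the constant 0x00000001000000ff, sext_field(c, 8 << 0) gives -1, sext_field(c, 8 << 1) gives 255 and sext_field(c, 8 << 3) gives 4294967551, so the same bits pass or fail an immediate range check depending on vece. fits_after_narrowing(c) is false for this constant: seen through 32 bits it looks like 255.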

+        if (const_args[2]) {
+            /*
+             * cmp_vec dest, src, value
+             * Try vseqi/vslei/vslti
+             */
+            int64_t value = sextract64(a2, 0, 8 << vece);
+            if ((cond == TCG_COND_EQ || cond == TCG_COND_LE || \
+                 cond == TCG_COND_LT) && (-0x10 <= value && value <= 0x0f)) {
+                tcg_out32(s, encode_vdvjsk5_insn(cmp_vec_imm_insn[cond][vece], \
+                                                 a0, a1, value));
+                break;
+            } else if ((cond == TCG_COND_LEU || cond == TCG_COND_LTU) &&
+                (0x00 <= value && value <= 0x1f)) {
+                tcg_out32(s, encode_vdvjuk5_insn(cmp_vec_imm_insn[cond][vece], \
+                                                 a0, a1, value));

Better would be a new constraint that only matches

    -0x10 <= x <= 0x1f

If the sign is wrong for the comparison, it can *always* be loaded with just vldi.

Whereas at present, using J,

+            tcg_out_dupi_vec(s, type, vece, temp_vec, a2);
+            a2 = temp_vec;

this may require 3 instructions (lu12i.w + ori + vreplgr2vr).

By constraining the constants allowed, you allow the register allocator to see that a register is required, which may be reused for another instruction.

r~
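
As a rough illustration of the suggestion in the review, the range check and the choice between the signed and unsigned immediate forms could look like the standalone sketch below. All names here (cmp_cond, imm_plan, cmp_imm_in_range, plan_cmp_imm) are hypothetical and do not exist in QEMU; the real patch uses TCG_COND_* and a constraint letter, and how such a predicate would be wired into the constraint machinery is not shown.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical local enums; the patch itself uses QEMU's TCG_COND_* values. */
enum cmp_cond { CMP_EQ, CMP_LE, CMP_LT, CMP_LEU, CMP_LTU, CMP_OTHER };
enum imm_plan { PLAN_SIGNED_IMM5, PLAN_UNSIGNED_IMM5, PLAN_LOAD_CONST };

/* The range from the review: the union of the signed and unsigned 5-bit ranges. */
static bool cmp_imm_in_range(int64_t value)
{
    return value >= -0x10 && value <= 0x1f;
}

/*
 * Decide how a comparison against `value` would be emitted, using the
 * same per-condition ranges as the quoted patch. Anything that does not
 * fit the immediate form for `cond` falls back to loading the constant
 * into a vector register first.
 */
static enum imm_plan plan_cmp_imm(enum cmp_cond cond, int64_t value)
{
    bool signed_cond = cond == CMP_EQ || cond == CMP_LE || cond == CMP_LT;
    bool unsigned_cond = cond == CMP_LEU || cond == CMP_LTU;

    if (signed_cond && value >= -0x10 && value <= 0x0f) {
        return PLAN_SIGNED_IMM5;    /* vseqi/vslei/vslti */
    }
    if (unsigned_cond && value >= 0x00 && value <= 0x1f) {
        return PLAN_UNSIGNED_IMM5;  /* unsigned vslei/vslti forms */
    }
    return PLAN_LOAD_CONST;         /* materialize the constant, then compare */
}
```

With a constraint limited to cmp_imm_in_range(), every accepted constant that lands on PLAN_LOAD_CONST (for example a signed compare against 0x1f) is still small, which is the case the review says can be loaded with a single vldi.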

Re: [PATCH v2 03/14] tcg/loongarch64: Lower cmp_vec to vseq/vsle/vslt
