On 2023/9/14 at 11:16 AM, Richard Henderson wrote:
On 9/13/23 19:26, Song Gao wrote:
+static bool gen_xvrepl128(DisasContext *ctx, arg_vv_i *a, MemOp mop)
{
- int ofs;
- TCGv_i64 desthigh, destlow, high, low;
+ int index = LSX_LEN / (8 * (1 << mop));
- if (!avail_LSX(ctx)) {
- return false;
- }
-
- if (!check_vec(ctx, 16)) {
+ if (!check_vec(ctx, 32)) {
return true;
}
- desthigh = tcg_temp_new_i64();
- destlow = tcg_temp_new_i64();
- high = tcg_temp_new_i64();
- low = tcg_temp_new_i64();
+ tcg_gen_gvec_dup_mem(mop, vec_reg_offset(a->vd, 0, mop),
+ vec_reg_offset(a->vj, a->imm, mop), 16, 16);
+ tcg_gen_gvec_dup_mem(mop, vec_reg_offset(a->vd, index, mop),
+ vec_reg_offset(a->vj, a->imm + index , mop),
16, 16);
I think this isn't right, because vec_reg_offset(a->vd, 0, mop) is not
the beginning of the vector for a big-endian host -- remember the xor in
vec_reg_offset.
You are right.
Better as
    for (i = 0; i < 32; i += 16) {
        tcg_gen_gvec_dup_mem(mop, vec_full_offset(a->vd) + i,
                             vec_reg_offset(a->vj, a->imm, mop) + i,
                             16, 16);
    }
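
To make the big-endian point concrete, here is a minimal standalone sketch of the offset arithmetic. It is not the QEMU code: vec_full_offset_model() and vec_reg_offset_model() are hypothetical stand-ins, the 32-byte register stride is an assumption, and the xor with (8 - size) for elements smaller than 8 bytes is only my reading of "the xor in vec_reg_offset". The host_big_endian parameter is likewise added for illustration, so both cases can be compared on any host.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model: byte offset of the start of vector register
 * 'regno' inside the CPU state. Assumes 32-byte registers packed
 * back to back; the real layout may differ.
 */
int vec_full_offset_model(int regno)
{
    return regno * 32;
}

/*
 * Hypothetical model of an element offset helper. 'mop' is log2 of
 * the element size in bytes (0 = byte, 1 = half, 2 = word, 3 = double).
 * On a big-endian host, elements smaller than 8 bytes are assumed to
 * be mirrored within each 8-byte unit, which is what the xor does.
 */
int vec_reg_offset_model(int regno, int index, int mop, bool host_big_endian)
{
    assert(mop >= 0 && mop <= 3);

    int size = 1 << mop;
    int offs = index * size;

    if (host_big_endian && size < 8) {
        offs ^= (8 - size);
    }
    return offs + vec_full_offset_model(regno);
}
```

Under these assumptions, element 0 of a byte vector sits at byte 0 on a little-endian host but at byte 7 on a big-endian host. A 16-byte dup_mem whose destination starts at that element offset would therefore cover bytes 7 to 22 instead of the low 128-bit lane, whereas vec_full_offset(a->vd) + i always starts at a lane boundary.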
Got it.
Thanks.
Song Gao