The ISA [1] specifies the load order to be the target one, hence the use of MO_TEQ in my patch (in both lxvwsx and lxvdsx).
I believe the error is hidden in some of the .mak files: I could not reproduce this problem with Qemu's user-mode emulation in either BE nor LE mode, this lead me to discover that ppc64-softmmu.mak is always defining TARGET_WORDS_BIGENDIAN=y. The user-mode targets are correctly split into ppc64 and ppc64le, where only the former is declared as BE. The presence of that define is unconditionally making MO_TE an alias for MO_BE, that's why Paul's patch seems to fix the problem. I didn't catch this problem earlier as pretty much of our testing is done using the Linux user-mode emulation. Cheers, G.M. [1] https://ibm.ent.box.com/s/1hzcwkwf8rbju5h9iyf44wm94amnlcrv On 18/05/21 03:34, David Gibson wrote: > > On Mon, May 17, 2021 at 04:40:32PM -0500, Paul A. Clarke wrote: >> `lxvdsx` is byte-swapping the data it loads, which it should not >> do. Fix it. >> >> Fixes #212. >> >> Fixes: bcb0b7b1a1c05707304f80ca6f523d557816f85c >> Signed-off-by: Paul A. Clarke <p...@us.ibm.com > nit, missing '>' ...^ > > I'm having a hard time convincing myself this is correct in all cases. > Have you tested it with all combinations of BE/LE host and BE/LE guest > code? > > The description in the ISA is pretty inscrutable, since it's in terms > of the confusing numbering if different element types in BE vs LE > mode. > > It looks to me like before bcb0b7b1a1c0 this originally resolved to > MO_Q modified by ctx->default_tcg_memop_mask, which appears to depend > on the current guest endian mode. That's pretty hard to trace through > the various layers of macros, but for reference, before bcb0b7b1a1c0 > this used gen_qemu_ld64_i64(), which appears to be constructed by the > line GEN_QEMU_LOAD_64(ld64, DEF_MEMOP(MO_Q)) in translate.c. > > Richard or Giuseppe, care to weigh in? > >> --- >> target/ppc/translate/vsx-impl.c.inc | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/target/ppc/translate/vsx-impl.c.inc >> b/target/ppc/translate/vsx-impl.c.inc >> index b817d31260bb..46f97c029ca8 100644 >> --- a/target/ppc/translate/vsx-impl.c.inc >> +++ b/target/ppc/translate/vsx-impl.c.inc >> @@ -162,7 +162,7 @@ static void gen_lxvdsx(DisasContext *ctx) >> gen_addr_reg_index(ctx, EA); >> >> data = tcg_temp_new_i64(); >> - tcg_gen_qemu_ld_i64(data, EA, ctx->mem_idx, MO_TEQ); >> + tcg_gen_qemu_ld_i64(data, EA, ctx->mem_idx, MO_LEQ); >> tcg_gen_gvec_dup_i64(MO_Q, vsr_full_offset(xT(ctx->opcode)), 16, 16, >> data); >> >> tcg_temp_free(EA); >