Re: [PATCH 30/37] target/i386: reimplement 0x0f 0x10-0x17, add AVX

2022-09-15 Thread Richard Henderson
On 9/14/22 23:45, Paolo Bonzini wrote: You've just been moving i64 pieces in the other functions, why is this one different using a gvec move in the middle? I do wonder if a generic helper moving offset->offset, with the comparison wouldn't be helpful within these functions, even when you know

Re: [PATCH 30/37] target/i386: reimplement 0x0f 0x10-0x17, add AVX

2022-09-14 Thread Paolo Bonzini
On Tue, Sep 13, 2022 at 12:14 PM Richard Henderson wrote:
> > +static void gen_VMOVLPx(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
> > +{
> > +int vec_len = sse_vec_len(s, decode);
> > +
> > +tcg_gen_ld_i64(s->tmp1_i64, cpu_env, decode->op[2].offset + offsetof(XMMR

Re: [PATCH 30/37] target/i386: reimplement 0x0f 0x10-0x17, add AVX

2022-09-13 Thread Richard Henderson
On 9/12/22 00:04, Paolo Bonzini wrote:
> +tcg_gen_qemu_ld_i64(s->tmp1_i64, s->A0, s->mem_index, MO_64);

I just noticed this here, but please examine any other direct loads: you've forgotten the endian specification: MO_64 | MO_LE, or MO_LEUQ for short.

r~
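For readers following along, the point of the endian flag can be illustrated outside QEMU. x86 guest memory is little-endian, so a 64-bit guest load should presumably yield the same value on any host. The sketch below is plain C, not QEMU code; load_le64, load_host64 and host_is_little_endian are hypothetical names chosen for illustration. It contrasts a load that fixes the byte order with one that inherits the host's.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: build a 64-bit value from 8 bytes stored in
 * little-endian order. The result does not depend on the host. */
static uint64_t load_le64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--) {
        v = (v << 8) | p[i];    /* p[7] ends up most significant */
    }
    return v;
}

/* Hypothetical helper: copy 8 bytes as-is, so the value depends on
 * the host byte order. */
static uint64_t load_host64(const uint8_t *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof(v));
    return v;
}

/* Returns 1 when the lowest-addressed byte of an integer is its
 * least significant byte. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Small self-check: the two loads agree only on little-endian hosts. */
static void load_le64_selfcheck(void)
{
    const uint8_t buf[8] = { 0x88, 0x77, 0x66, 0x55,
                             0x44, 0x33, 0x22, 0x11 };

    assert(load_le64(buf) == 0x1122334455667788ULL);
    if (host_is_little_endian()) {
        assert(load_host64(buf) == load_le64(buf));
    } else {
        assert(load_host64(buf) != load_le64(buf));
    }
}
```

On a little-endian host both loads return the same value, which is one way a missing byte-order specification can go unnoticed in testing.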

Re: [PATCH 30/37] target/i386: reimplement 0x0f 0x10-0x17, add AVX

2022-09-13 Thread Richard Henderson
On 9/12/22 00:04, Paolo Bonzini wrote: +static void gen_VMOVHPx_ld(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode) +{ +if (decode->op[0].offset != decode->op[1].offset) { +tcg_gen_ld_i64(s->tmp1_i64, cpu_env, decode->op[1].offset + offsetof(XMMReg, XMM_Q(0))); +tc

[PATCH 30/37] target/i386: reimplement 0x0f 0x10-0x17, add AVX

2022-09-11 Thread Paolo Bonzini
These are mostly moves, and yet are a total pain. The main issue is that: 1) some instructions are selected by mod==11 (register operand) vs. mod=00/01/10 (memory operand) 2) stores to memory are two-operand operations, while the 3-register and load-from-memory versions operate on the entire con