On 4/30/20 11:09 AM, Peter Maydell wrote: > +static bool do_3same_qs32(DisasContext *s, arg_3same *a, NeonGenTwoOpEnvFn > *fn) > +{ > + /* > + * Saturating shift operations handled elementwise 32 bits at a > + * time which need to pass cpu_env to the helper and where the rn > + * and rm operands are reversed from the usual do_3same() order. > + */
Perhaps better to handle this as you did in "Convert Neon 64-bit element 3-reg-same insns", by adding a shim expander that adds env? It would appear we can then merge > +{ > + VQSHL_S64_3s 1111 001 0 0 . .. .... .... 0100 . . . 1 .... @3same_64 > + VQSHL_S_3s 1111 001 0 0 . .. .... .... 0100 . . . 1 .... @3same > +} back into a single pattern: void gen_gvec_srshl(unsigned vece, uint32_t rd_ofs, uint32_t rn_ofs, uint32_t rm_ofs, uint32_t oprsz, uint32_t maxsz) { static const GVecGen3 ops[4] = { { .fni4 = gen_helper_neon_rshl_s8 }, { .fni4 = gen_helper_neon_rshl_s16 }, { .fni4 = gen_helper_neon_rshl_s32 }, { .fni8 = gen_helper_neon_rshl_s64 } }; tcg_gen_gvec_3(rd_ofs, rn_ofs, rm_ofs, oprsz, maxsz, &ops[vece]); } I'm not 100% sure how best to handle the swapped operands issue. I don't think we want to do it here in gen_gvec_srshl, because we don't have the same reverse operand problem in the aarch64 encoding, and I'm looking forward to re-using this generator function in aa64 and sve2. Maybe it would be better to have @3same .... ... . . . size:2 .... .... .... . q:1 . . .... \ &3same vm=%vm_dp vn=%vn_dp vd=%vd_dp @3same_rev .... ... . . . size:2 .... .... .... . q:1 . . .... \ &3same vn=%vm_dp vm=%vn_dp vd=%vd_dp and swap the operands to "normal" during decode. FWIW, over in sve.decode, I prepared for reversed operands from the start (to handle things like SUBR), so the formats have the register names in order: @rd_rn_rm vs @rd_rm_rn. r~