On 2/26/19 3:38 AM, David Hildenbrand wrote:
> To avoid an helper, we have to do the actual calculation of the element
> address (offset in cpu_env + cpu_env) manually. Factor that out into
> get_vec_element_ptr_i64(). The same logic will be reused for "VECTOR
> LOAD VR ELEMENT FROM GR".
>
> Signed-off-by: David Hildenbrand <da...@redhat.com>
> ---
>  target/s390x/insn-data.def      |  2 ++
>  target/s390x/translate_vx.inc.c | 55 +++++++++++++++++++++++++++++++++
>  2 files changed, 57 insertions(+)
>
> diff --git a/target/s390x/insn-data.def b/target/s390x/insn-data.def
> index 46610e808f..f4201ff55a 100644
> --- a/target/s390x/insn-data.def
> +++ b/target/s390x/insn-data.def
> @@ -996,6 +996,8 @@
>      E(0xe741, VLEIH, VRI_a, V, 0, 0, 0, 0, vlei, 0, MO_16, IF_VEC)
>      E(0xe743, VLEIF, VRI_a, V, 0, 0, 0, 0, vlei, 0, MO_32, IF_VEC)
>      E(0xe742, VLEIG, VRI_a, V, 0, 0, 0, 0, vlei, 0, MO_64, IF_VEC)
> +/* VECTOR LOAD GR FROM VR ELEMENT */
> +    F(0xe721, VLGV, VRS_c, V, la2, 0, r1, 0, vlgv, 0, IF_VEC)
>
>  #ifndef CONFIG_USER_ONLY
>  /* COMPARE AND SWAP AND PURGE */
> diff --git a/target/s390x/translate_vx.inc.c b/target/s390x/translate_vx.inc.c
> index 1bf654ff4e..a02a3ba81f 100644
> --- a/target/s390x/translate_vx.inc.c
> +++ b/target/s390x/translate_vx.inc.c
> @@ -137,6 +137,28 @@ static void load_vec_element(DisasContext *s, uint8_t reg, uint8_t enr,
>      tcg_temp_free_i64(tmp);
>  }
>
> +static void get_vec_element_ptr_i64(TCGv_ptr ptr, uint8_t reg, TCGv_i64 enr,
> +                                    uint8_t es)
> +{
> +    TCGv_i64 tmp = tcg_temp_new_i64();
> +
> +    /* mask off invalid parts from the element nr */
> +    tcg_gen_andi_i64(tmp, enr, NUM_VEC_ELEMENTS(es) - 1);
> +
> +    /* convert it to an element offset relative to cpu_env (vec_reg_offset() */
> +    tcg_gen_muli_i64(tmp, tmp, NUM_VEC_ELEMENT_BYTES(es));
Or tcg_gen_shli_i64(tmp, tmp, es);

> +    /* generate the final ptr by adding cpu_env */
> +    tcg_gen_trunc_i64_ptr(ptr, tmp);
> +    tcg_gen_add_ptr(ptr, ptr, cpu_env);

Sadly, there's nothing in the optimizer that will propagate this...

> +    case MO_8:
> +        tcg_gen_ld8u_i64(o->out, ptr, 0);

... into this.

Is it easy for you to objdump|grep some binaries to tell if my hunch is
correct, in that virtually all direct element access is with a constant,
i.e. with c(r0) as the address?

It would be nice if this could be (o->out, cpu_env, ofs) for those cases...

But what's here is correct, and what I'm suggesting is mere refinement,

Reviewed-by: Richard Henderson <richard.hender...@linaro.org>


r~
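For illustration only, here is a host-side C sketch of the offset arithmetic discussed above. It is not QEMU code: VEC_REG_BYTES, ELEM_BYTES(), NUM_ELEMS() and both function names are hypothetical stand-ins, and the sketch assumes a 16-byte vector register with an element size of (1 << es) bytes for es = 0..3. It ignores the register's base offset inside cpu_env and any host byte-order handling. The point is only that, once the element number is masked, multiplying by the element size and shifting left by es give the same offset.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the QEMU macros themselves. */
#define VEC_REG_BYTES   16u
#define ELEM_BYTES(es)  (1u << (es))             /* 1, 2, 4, 8 bytes   */
#define NUM_ELEMS(es)   (VEC_REG_BYTES >> (es))  /* 16, 8, 4, 2 elems  */

/* Mask the element number, then multiply by the element size (muli form). */
static uint64_t elem_offset_mul(uint64_t enr, unsigned es)
{
    assert(es <= 3);  /* assumed range of element-size codes */
    return (enr & (NUM_ELEMS(es) - 1)) * ELEM_BYTES(es);
}

/* Mask the element number, then shift left by es (shli form). */
static uint64_t elem_offset_shl(uint64_t enr, unsigned es)
{
    assert(es <= 3);  /* assumed range of element-size codes */
    return (enr & (NUM_ELEMS(es) - 1)) << es;
}
```

If the element number is already known at translation time, the same arithmetic could in principle be done once in C and the result used as an immediate offset from cpu_env, which is the (o->out, cpu_env, ofs) form suggested above.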