On Wed, 6 Jul 2022 at 10:11, Richard Henderson
<richard.hender...@linaro.org> wrote:
>
> We can reuse the SVE functions for implementing moves to/from
> horizontal tile slices, but we need new ones for moves to/from
> vertical tile slices.
>
> Signed-off-by: Richard Henderson <richard.hender...@linaro.org>

> +/*
> + * Move Zreg vector to ZArray column.
> + */
> +#define DO_MOVA_C(NAME, TYPE, H) \
> +void HELPER(NAME)(void *za, void *vn, void *vg, uint32_t desc) \
> +{ \
> +    int i, oprsz = simd_oprsz(desc); \
> +    for (i = 0; i < oprsz; ) { \
> +        uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
> +        do { \
> +            if (pg & 1) { \
> +                *(TYPE *)(za + tile_vslice_offset(i)) = *(TYPE *)(vn + H(i)); \
> +            } \
> +            i += sizeof(TYPE); \
> +            pg >>= sizeof(TYPE); \
> +        } while (i & 15); \
> +    } \
> +}
> +
> +DO_MOVA_C(sme_mova_cz_b, uint8_t, H1)
> +DO_MOVA_C(sme_mova_cz_h, uint16_t, H2)
> +DO_MOVA_C(sme_mova_cz_s, uint32_t, H4)

i is a byte offset in this loop, so shouldn't these be using
H1_2 and H1_4?

> +/*
> + * Move ZArray column to Zreg vector.
> + */
> +#define DO_MOVA_Z(NAME, TYPE, H) \
> +void HELPER(NAME)(void *vd, void *za, void *vg, uint32_t desc) \
> +{ \
> +    int i, oprsz = simd_oprsz(desc); \
> +    for (i = 0; i < oprsz; ) { \
> +        uint16_t pg = *(uint16_t *)(vg + H1_2(i >> 3)); \
> +        do { \
> +            if (pg & 1) { \
> +                *(TYPE *)(vd + H(i)) = *(TYPE *)(za + tile_vslice_offset(i)); \
> +            } \
> +            i += sizeof(TYPE); \
> +            pg >>= sizeof(TYPE); \
> +        } while (i & 15); \
> +    } \
> +}
> +
> +DO_MOVA_Z(sme_mova_zc_b, uint8_t, H1)
> +DO_MOVA_Z(sme_mova_zc_h, uint16_t, H2)
> +DO_MOVA_Z(sme_mova_zc_s, uint32_t, H4)

Similarly here?

Otherwise
Reviewed-by: Peter Maydell <peter.mayd...@linaro.org>

thanks
-- PMM
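
To illustrate the byte-offset question raised above, here is a minimal standalone sketch, not code from the patch. The BE_* macros are hypothetical local copies of what the big-endian-host variants of H2, H4, H1_2 and H1_4 are assumed to look like (XOR constants 3, 1, 6 and 4); please check target/arm/vec_internal.h before relying on them. The sketch only shows that an element-index macro and a byte-offset macro agree when each is given the kind of value it expects, and disagree when a byte offset is passed to the element-index one.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-ins for the big-endian-host adjustment macros.
 * Assumption: the XOR constants match QEMU's definitions.
 */
#define BE_H2(x)   ((x) ^ 3)  /* takes a 16-bit element index */
#define BE_H4(x)   ((x) ^ 1)  /* takes a 32-bit element index */
#define BE_H1_2(x) ((x) ^ 6)  /* takes the byte offset of a 16-bit element */
#define BE_H1_4(x) ((x) ^ 4)  /* takes the byte offset of a 32-bit element */

/* Self-check over one 64-bit chunk of a vector register. */
static void check_be_offsets(void)
{
    /* 16-bit elements: index -> adjust -> scale equals offset -> adjust. */
    for (size_t elem = 0; elem < 4; elem++) {
        assert(BE_H2(elem) * 2 == BE_H1_2(elem * 2));
    }
    /* 32-bit elements: same comparison with a scale of 4. */
    for (size_t elem = 0; elem < 2; elem++) {
        assert(BE_H4(elem) * 4 == BE_H1_4(elem * 4));
    }
    /*
     * Passing a byte offset to the element-index macro gives a
     * different (and here misaligned) result: 2 ^ 3 is 1, 2 ^ 6 is 4.
     */
    assert(BE_H2((size_t)2) != BE_H1_2((size_t)2));
}
```

Calling check_be_offsets() from a test harness passes under these assumptions, which is consistent with the suggestion that a loop stepping i in bytes wants the H1_2/H1_4 forms.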