On 02/23/2018 06:34 AM, Peter Maydell wrote:
> On 17 February 2018 at 18:22, Richard Henderson
> <richard.hender...@linaro.org> wrote:
>> Signed-off-by: Richard Henderson <richard.hender...@linaro.org>
>> ---
>>  target/arm/helper-sve.h    |  23 +++++++++
>>  target/arm/translate-a64.h |  14 +++---
>>  target/arm/sve_helper.c    | 114 +++++++++++++++++++++++++++++++++++++++++++++
>>  target/arm/translate-sve.c | 113 ++++++++++++++++++++++++++++++++++++++++++++
>>  target/arm/sve.decode      |  29 +++++++++++-
>>  5 files changed, 285 insertions(+), 8 deletions(-)
>>
>> diff --git a/target/arm/translate-a64.h b/target/arm/translate-a64.h
>> index e519aee314..328aa7fce1 100644
>> --- a/target/arm/translate-a64.h
>> +++ b/target/arm/translate-a64.h
>> @@ -66,18 +66,18 @@ static inline void assert_fp_access_checked(DisasContext *s)
>>  static inline int vec_reg_offset(DisasContext *s, int regno,
>>                                   int element, TCGMemOp size)
>>  {
>> -    int offs = 0;
>> +    int element_size = 1 << size;
>> +    int offs = element * element_size;
>>  #ifdef HOST_WORDS_BIGENDIAN
>>      /* This is complicated slightly because vfp.zregs[n].d[0] is
>>       * still the low half and vfp.zregs[n].d[1] the high half
>>       * of the 128 bit vector, even on big endian systems.
>> -     * Calculate the offset assuming a fully bigendian 128 bits,
>> -     * then XOR to account for the order of the two 64 bit halves.
>> +     * Calculate the offset assuming a fully little-endian 128 bits,
>> +     * then XOR to account for the order of the 64 bit units.
>>       */
>> -    offs += (16 - ((element + 1) * (1 << size)));
>> -    offs ^= 8;
>> -#else
>> -    offs += element * (1 << size);
>> +    if (element_size < 8) {
>> +        offs ^= 8 - element_size;
>> +    }
>>  #endif
>>      offs += offsetof(CPUARMState, vfp.zregs[regno]);
>>      assert_fp_access_checked(s);
>
> This looks like it should have been in an earlier patch?

Hah!  For the first time, no.  But perhaps a separate patch.

What this does is allow proper computation with size > 3.  In particular, I
want to support size==4, aka a 128-bit element.  I think it's cleaner to extend
this function than expose some internals where otherwise needed.


r~
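
To make the size==4 point concrete, here is a minimal standalone sketch of the
offset arithmetic from the hunk above.  It is an illustration, not the QEMU
code: the names sketch_vec_reg_offset and sketch_old_be_offset are
hypothetical, the host endianness is passed as a parameter instead of being
selected with HOST_WORDS_BIGENDIAN, and the offsetof(CPUARMState, ...) and
assert_fp_access_checked() parts are left out, so the result is only the byte
offset within one register.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the patched vec_reg_offset(): returns the
 * byte offset of an element inside one vector register only.
 * size is log2 of the element size in bytes (0..4). */
static int sketch_vec_reg_offset(int element, unsigned size,
                                 bool host_big_endian)
{
    assert(element >= 0 && size <= 4);

    int element_size = 1 << size;
    /* Offset as if the register were fully little-endian. */
    int offs = element * element_size;

    /* Each 64-bit unit is stored in host order, so on a big-endian
     * host a sub-64-bit element moves within its 8-byte unit.
     * Elements of 8 or 16 bytes start on a unit boundary and need
     * no adjustment. */
    if (host_big_endian && element_size < 8) {
        offs ^= 8 - element_size;
    }
    return offs;
}

/* Hypothetical copy of the removed big-endian formula, kept only for
 * comparison.  It assumes exactly 16 bytes per register. */
static int sketch_old_be_offset(int element, unsigned size)
{
    int offs = 16 - ((element + 1) * (1 << size));
    return offs ^ 8;
}
```

For a byte element (size 0, element 0) both formulas give offset 7 on a
big-endian host.  For size 4, element 0, the removed formula gives 8, which is
the middle of the 128-bit element, while the new one gives 0.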