On 4/11/22 17:18, Peter Maydell wrote:
> Looking a bit more closely, this won't work on big-endian hosts,
> because there we want to copy across the last 16 bytes of the struct,
> not the first 16. So I think we need some more macro magic:
>
> #if SHIFT == 0
> #define COPY_REG(DEST, SRC) (DEST) = (SRC)
> #else
> #define COPY_REG(DEST, SRC) do {    \
>     (DEST).Q(0) = (SRC).Q(0);       \
>     (DEST).Q(1) = (SRC).Q(1);       \
> } while (0)
> #endif
>
> and then use COPY_REG(*d, r);

Right, I have written something similar after seeing your response to the bug.

> We could probably try to write endian-specific flavours of memcpy()
> invocation, but "do two 64-bit word copies" is what the compiler
> would hopefully turn the memcpy into anyway :-)

Yeah, I actually wrote the memcpy() invocation because I was going to
look at AVX later this year, which of course you couldn't know. :)

What I came up with after stealing parts of your nice comment is the
following:

/*
 * Copy the relevant parts of a Reg value around. In the case where
 * sizeof(Reg) > SIZE, these helpers operate only on the lower bytes of
 * a 64 byte ZMMReg, so we must copy only those and keep the top bytes
 * untouched in the guest-visible destination register.
 * Note that the "lower bytes" are placed last in memory on big-endian
 * hosts, which store the vector backwards in memory. In that case the
 * copy *starts* at B(SIZE - 1) and ends at B(0), the opposite of
 * the little-endian case.
 */
#ifdef HOST_WORDS_BIGENDIAN
#define MOVE(d, r) memcpy(&((d).B(SIZE - 1)), &(r).B(SIZE - 1), SIZE)
#else
#define MOVE(d, r) memcpy(&(d).B(0), &(r).B(0), SIZE)
#endif

I'll steal your nice comment and submit a patch later when 7.1 opens.

Paolo
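
For anyone who wants to convince themselves of the B(SIZE - 1) starting
point without a big-endian box, here is a rough standalone sketch. It is
not QEMU code: DemoReg, byte_offset(), demo_move() and demo_check() are
made-up names, and the host layout is passed in as a runtime flag rather
than taken from the build configuration. The one assumption borrowed
from the discussion above is that on a big-endian host logical byte n of
a 64-byte register sits at memory offset 63 - n.

```c
#include <stdint.h>
#include <string.h>

#define REG_BYTES 64   /* full register size, as for a 64 byte ZMMReg */
#define SIZE      16   /* bytes the helper is allowed to touch */

/* Hypothetical stand-in for the register type; not the QEMU struct. */
typedef struct { uint8_t b[REG_BYTES]; } DemoReg;

/* Memory offset of logical byte n; assumed mirrored on big-endian hosts. */
static size_t byte_offset(int big_endian, size_t n)
{
    return big_endian ? REG_BYTES - 1 - n : n;
}

/*
 * Copy logical bytes 0..SIZE-1 from r to d and leave the rest of d alone.
 * The lowest *address* of that range is B(SIZE - 1) on big-endian and
 * B(0) on little-endian, so that is where the memcpy has to start.
 */
static void demo_move(DemoReg *d, const DemoReg *r, int big_endian)
{
    size_t start = byte_offset(big_endian, big_endian ? SIZE - 1 : 0);
    memcpy(&d->b[start], &r->b[start], SIZE);
}

/* Returns 1 if exactly the logical low SIZE bytes were copied. */
static int demo_check(int big_endian)
{
    DemoReg d, r;

    memset(&d, 0xAA, sizeof(d));            /* marker for "untouched" */
    for (size_t n = 0; n < REG_BYTES; n++) {
        r.b[byte_offset(big_endian, n)] = (uint8_t)n;
    }

    demo_move(&d, &r, big_endian);

    for (size_t n = 0; n < REG_BYTES; n++) {
        uint8_t want = n < SIZE ? (uint8_t)n : 0xAA;
        if (d.b[byte_offset(big_endian, n)] != want) {
            return 0;
        }
    }
    return 1;
}
```

Calling demo_check(0) and demo_check(1) exercises both layouts. With the
mirrored layout the copy covers memory offsets 48..63, which is the
"last 16 bytes of the struct" case Peter described.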
