On 10/10/19 2:16 AM, Andrew Jones wrote:
>> It might be best to avoid the ifdef altogether:
>>
>>   for (i = 0; i < 32; ++i) {
>>       uint64_t *d = (uint64_t *)&buf[sve_zreg_offset(vq, i)];
>>       for (j = 0; j < vq * 2; ++j) {
>>           d[j] = cpu_to_le64(env->vfp.zregs[i].d[j]);
>>       }
>>   }
>>
>> The compiler may well transform the inner loop to memcpy for little-endian
>> host, but even if it doesn't core dumping is hardly performance sensitive.
>
> True. I even had something like the above at first, but then
> overcomplicated it with the #ifdef-ing.

Ah, I wonder if you changed things around with the ifdefs due to the pregs.
There's no trivial solution for those. It'd be nice to share the bswapping
subroutine that you add in the SVE KVM patch set, and size the temporary
array using ARM_MAX_VQ.

r~
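
To make the preg suggestion concrete, here is a minimal standalone sketch of one way it could look. It is not the subroutine from the SVE KVM patch set. The names store_le64 and preg_to_le are hypothetical, and ARM_MAX_VQ is defined locally as 16 on the assumption that this matches the QEMU value. A predicate register holds vq * 2 bytes, which need not be a multiple of 8, so each 64-bit word is first written in little-endian order into a temporary sized for the maximum vector length, and only the live bytes are then copied out.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: 16 matches QEMU's ARM_MAX_VQ; defined here so the sketch
 * compiles on its own. */
#define ARM_MAX_VQ 16

/* 64-bit words backing one predicate register at the maximum vector
 * length (2 bytes of predicate per quadword). */
#define PREG_WORDS ((2 * ARM_MAX_VQ + 7) / 8)

/* Hypothetical helper: store v as 8 little-endian bytes, independent of
 * host endianness, so no #ifdef is needed. */
static void store_le64(uint8_t *dst, uint64_t v)
{
    for (int b = 0; b < 8; ++b) {
        dst[b] = (uint8_t)(v >> (8 * b));
    }
}

/* Hypothetical helper: write the vq * 2 live bytes of predicate
 * register p to out in little-endian byte order. */
static void preg_to_le(uint8_t *out, const uint64_t *p, int vq)
{
    uint8_t tmp[PREG_WORDS * 8];    /* sized for ARM_MAX_VQ */
    int nbytes = vq * 2;
    int nwords = (nbytes + 7) / 8;  /* round up to whole words */

    for (int j = 0; j < nwords; ++j) {
        store_le64(&tmp[j * 8], p[j]);
    }
    memcpy(out, tmp, nbytes);       /* drop the padding bytes */
}
```

Going through a fixed-size temporary keeps the caller's buffer from being overrun when vq * 2 is not a multiple of 8, at the cost of one extra copy, which should not matter for a core dump.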