> On 31 dec. 2014, at 18:25, Peter Maydell <peter.mayd...@linaro.org> wrote:
>> On 31 December 2014 at 18:08, Ard Biesheuvel <ard.biesheu...@linaro.org> 
>> wrote:
>>> On 31 December 2014 at 17:37, Peter Maydell <peter.mayd...@linaro.org> 
>>> wrote:
>>> It doesn't happen on LE TCG hosts. Joy :-)
>> I will need to find out first if this is the kernel driver or the
>> emulation failing. I never tested the former on BE, as the models
>> don't support the crypto extensions (at least, not without a plugin
>> that I don't have access to) and the actual hardware boots via EFI
>> which is LE only, and with no 'ground truth' as a reference, it is
>> hard to be 100% certain both implementations are in spec, even if they
>> work in combination.
> It's an LE guest in both cases (AArch64); the only difference is the
> host system (LE x86-64 vs BE PPC64). If our emulated CPU behaves
> differently based on the endianness of the host, it's always an
> emulation bug by definition. (Of course it's possible that we're
> currently correct on BE host, wrong on LE host, and with the
> corresponding bug in LE guest kernels, but that seems unlikely -- I
> did do cross-checks against the fast models for the crypto insns
> using an LE-host QEMU.)

Ok, I had missed that bit of context, but obviously, if the only difference is 
the endianness of the host, it must be an emulation bug.

>> The SHA emulation code only operates on 32-bit quantities, and the
>> only memory access it does is populating the CRYPTO_STATE union using
>> float64_val(). Does anyone feel there is anything obviously wrong with
>> that?
> How do you extract the 32 bit quantities from the float64_val ?
> Definition of vfp.regs[] from cpu.h:
>   * In AArch32:
>   *  Qn = regs[2n+1]:regs[2n]
>   *  Dn = regs[n]
>   *  Sn = regs[n/2] bits 31..0 for even n, and bits 63..32 for odd n
>   * (and regs[32] to regs[63] are inaccessible)
>   * In AArch64:
>   *  Qn = regs[2n+1]:regs[2n]
>   *  Dn = regs[2n]
>   *  Sn = regs[2n] bits 31..0
> where regs[x] is always a 64-bit value in host endianness
> (ie if you store 0x1122334455667788 as
> a 64 bit write to VFP register D0 then regs[0] will be
> 0x1122334455667788 regardless of host
> endianness). That is, the least significant 8 bits of D0
> will be (regs[0] & 0xff). (On a big-endian host this isn't
> the same number as if you do the union-type-punning thing
> with union { uint64_t l; uint8_t b[8]; } and look at b[0].)

Ok, that must be the culprit then. I indeed use a union to convert between the 
various operand sizes.
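
To make the difference concrete, here is a minimal sketch of the two ways of getting at the low byte. This is not QEMU code: the union type, the function names and the example value are all made up for illustration, and uint64_t stands in for a single regs[] slot.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one 64-bit slot of vfp.regs[]. */
typedef union {
    uint64_t l;
    uint8_t  b[8];
} punned64;

/* Value-based access: the least significant 8 bits on any host. */
static inline uint8_t low_byte_by_value(uint64_t reg)
{
    return (uint8_t)(reg & 0xff);
}

/* Memory-based access: the first byte as laid out in host memory,
 * so the result depends on host byte order. */
static inline uint8_t first_byte_in_memory(uint64_t reg)
{
    punned64 u;
    u.l = reg;
    return u.b[0];
}

/* Returns 1 on a little-endian host, 0 on a big-endian host. */
static inline int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* The two accessors agree only when the host is little-endian. */
static inline void check_low_byte(void)
{
    const uint64_t d0 = 0x1122334455667788ULL;   /* example value */
    assert(low_byte_by_value(d0) == 0x88);
    assert(first_byte_in_memory(d0) ==
           (host_is_little_endian() ? 0x88 : 0x11));
}
```

On an LE host both accessors return 0x88, which would explain why the problem never showed up there.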

> Notice that for 128-bit AArch64 vectors, the high half
> is always regs[2n+1] and the low half regs[2n], regardless
> of host endianness. This implies that on big-endian hosts
> a 128-bit vector is not represented as a 128-bit host
> endian vector at any offset in regs[]. The function
> vec_reg_offset() in translate-a64.c may be helpful.

I cannot easily test this myself, but I will try to cook up a patch anyway.
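
One host-endian-independent way to pick the 32-bit words out of a Q register would be to use only shifts and masks on the 64-bit regs[] values. The sketch below assumes the AArch64 layout quoted above (Qn = regs[2n+1]:regs[2n]) and treats regs[] as plain uint64_t. The helper names qreg_get32() and qreg_set32() are hypothetical, not existing QEMU functions, and this is untested against the real helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: read 32-bit element idx (0 = least significant,
 * 3 = most significant) of Qn. No byte-level access to regs[] is made,
 * so the result does not depend on host byte order. */
static inline uint32_t qreg_get32(const uint64_t *regs, unsigned n,
                                  unsigned idx)
{
    assert(idx < 4);
    uint64_t half = regs[2 * n + idx / 2];      /* low or high 64 bits */
    return (uint32_t)(half >> (32 * (idx % 2)));
}

/* Hypothetical helper: write 32-bit element idx of Qn, leaving the
 * other three elements untouched. */
static inline void qreg_set32(uint64_t *regs, unsigned n, unsigned idx,
                              uint32_t val)
{
    assert(idx < 4);
    unsigned shift = 32 * (idx % 2);
    uint64_t *half = &regs[2 * n + idx / 2];
    *half = (*half & ~((uint64_t)0xffffffffu << shift))
            | ((uint64_t)val << shift);
}
```

With regs[0] = 0x1122334455667788 and regs[1] = 0x99aabbccddeeff00, elements 0 to 3 of Q0 come out as 0x55667788, 0x11223344, 0xddeeff00 and 0x99aabbcc on any host.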

