arm: Fix short-vector increment behaviour

Richard Henderson Sat, 08 Jun 2019 13:03:17 -0700

On 6/6/19 12:46 PM, Peter Maydell wrote:
> For VFP short vectors, the VFP registers are divided into a
> series of banks: for single-precision these are s0-s7, s8-s15,
> s16-s23 and s24-s31; for double-precision they are d0-d3,
> d4-d7, ... d28-d31. Some banks are "scalar" meaning that
> use of a register within them triggers a pure-scalar or
> mixed vector-scalar operation rather than a full vector
> operation. The scalar banks are s0-s7, d0-d3 and d16-d19.
> When using a bank as part of a vector operation, we
> iterate through it, increasing the register number by
> the specified stride each time, and wrapping around to
> the beginning of the bank.
> 
> Unfortunately our calculation of the "increment" part of this
> was incorrect:
>  vd = ((vd + delta_d) & (bank_mask - 1)) | (vd & bank_mask)
> will only do the intended thing if bank_mask has exactly
> one set high bit. For instance for doubles (bank_mask = 0xc),
> if we start with vd = 6 and delta_d = 2 then vd is updated
> to 12 rather than the intended 4.
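
To make the arithmetic above concrete, here is a minimal sketch that wraps the quoted expression in a function. The function name old_advance is illustrative only and is not taken from the patch; the expression itself is copied from the commit message.

```c
#include <stdio.h>

/*
 * The increment expression quoted in the commit message, unchanged.
 * The name old_advance is hypothetical, chosen for this example.
 */
static inline int old_advance(int vd, int delta_d, int bank_mask)
{
    /*
     * For doubles, bank_mask - 1 is 0xb (binary 1011), so bit 3 of the
     * sum leaks through when the register number wraps past the bank.
     */
    return ((vd + delta_d) & (bank_mask - 1)) | (vd & bank_mask);
}

/* Print the two cases discussed in the commit message. */
static inline void show_old_advance(void)
{
    /* Wrapping case: (8 & 0xb) | (6 & 0xc) = 8 | 4 */
    printf("d6 + 2 -> d%d\n", old_advance(6, 2, 0xc));
    /* Non-wrapping case: (6 & 0xb) | (4 & 0xc) = 2 | 4 */
    printf("d4 + 2 -> d%d\n", old_advance(4, 2, 0xc));
}
```

Starting from d6 the result leaves the d4-d7 bank, while starting from d4 the same expression gives d6 as intended, which matches the observation that only the wrap-around case is affected.
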
> 
> This only causes problems in the unlikely case that the
> starting register is not the first in its bank: if the
> register number doesn't have to wrap around then the
> expression happens to give the right answer.
> 
> Fix this bug by abstracting out the "check whether register
> is in a scalar bank" and "advance register within bank"
> operations to utility functions which use the right
> bit masking operations.
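
The diff itself is not quoted in this message, so the following is only a sketch of what such utility functions could look like, based on the bank layout described above (banks of 8 single-precision or 4 double-precision registers). All function names here are assumptions made for illustration and may differ from the ones in the patch.

```c
#include <stdbool.h>

/*
 * Hypothetical helpers: advance a register number within its bank,
 * wrapping to the start of the bank. The low bits select the register
 * within the bank, the remaining bits select the bank itself.
 */
static inline int advance_sreg(int reg, int delta)
{
    /* Single-precision banks hold 8 registers: low 3 bits wrap. */
    return ((reg + delta) & 0x7) | (reg & ~0x7);
}

static inline int advance_dreg(int reg, int delta)
{
    /* Double-precision banks hold 4 registers: low 2 bits wrap. */
    return ((reg + delta) & 0x3) | (reg & ~0x3);
}

/*
 * Hypothetical helpers: is the register in a scalar bank?
 * Scalar banks are s0-s7, d0-d3 and d16-d19, as stated above.
 */
static inline bool sreg_is_scalar(int reg)
{
    /* Valid for reg in 0..31: true only for s0-s7. */
    return (reg & 0x18) == 0;
}

static inline bool dreg_is_scalar(int reg)
{
    /* Valid for reg in 0..31: true for d0-d3 and d16-d19. */
    return (reg & 0xc) == 0;
}
```

With these definitions, advancing d6 by a stride of 2 gives (8 & 0x3) | (6 & ~0x3) = 4, the intended wrap to the start of the d4-d7 bank. Keeping the "position within bank" mask separate from the "which bank" mask avoids deriving one from the other with bank_mask - 1.
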
> 
> Signed-off-by: Peter Maydell <peter.mayd...@linaro.org>
> ---
>  target/arm/translate-vfp.inc.c | 100 ++++++++++++++++++++-------------
>  1 file changed, 60 insertions(+), 40 deletions(-)


Reviewed-by: Richard Henderson <richard.hender...@linaro.org>


r~

Re: [Qemu-devel] [PATCH 42/42] target/arm: Fix short-vector increment behaviour
