On 6/6/19 12:46 PM, Peter Maydell wrote:
> For VFP short vectors, the VFP registers are divided into a
> series of banks: for single-precision these are s0-s7, s8-s15,
> s16-s23 and s24-s31; for double-precision they are d0-d3,
> d4-d7, ... d28-d31. Some banks are "scalar" meaning that
> use of a register within them triggers a pure-scalar or
> mixed vector-scalar operation rather than a full vector
> operation. The scalar banks are s0-s7, d0-d3 and d16-d19.
> When using a bank as part of a vector operation, we
> iterate through it, increasing the register number by
> the specified stride each time, and wrapping around to
> the beginning of the bank.
>
> Unfortunately our calculation of the "increment" part of this
> was incorrect:
>  vd = ((vd + delta_d) & (bank_mask - 1)) | (vd & bank_mask)
> will only do the intended thing if bank_mask has exactly
> one set high bit. For instance for doubles (bank_mask = 0xc),
> if we start with vd = 6 and delta_d = 2 then vd is updated
> to 12 rather than the intended 4.
>
> This only causes problems in the unlikely case that the
> starting register is not the first in its bank: if the
> register number doesn't have to wrap around then the
> expression happens to give the right answer.
>
> Fix this bug by abstracting out the "check whether register
> is in a scalar bank" and "advance register within bank"
> operations to utility functions which use the right
> bit masking operations.
>
> Signed-off-by: Peter Maydell <peter.mayd...@linaro.org>
> ---
>  target/arm/translate-vfp.inc.c | 100 ++++++++++++++++++++-------------
>  1 file changed, 60 insertions(+), 40 deletions(-)

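For anyone following along, the arithmetic in the quoted commit message can
be checked with a small standalone sketch like the one below. The helper
names are made up for illustration and are not necessarily the functions the
patch adds. The in-bank masks (0x3 for doubles, 0x7 for singles) are an
assumption derived from the bank sizes described above: 4 doubles or 8
singles per bank.

```c
#include <assert.h>
#include <stdbool.h>

/* Old expression from the commit message, specialised to doubles
 * (bank_mask = 0xc, so bank_mask - 1 = 0xb, which is not a clean
 * "offset within bank" mask). */
static int advance_dreg_buggy(int vd, int delta_d)
{
    const int bank_mask = 0xc;
    return ((vd + delta_d) & (bank_mask - 1)) | (vd & bank_mask);
}

/* Hypothetical corrected helper: keep the bank-selecting bits of reg,
 * and wrap only the 2-bit offset within the 4-register bank. */
static int advance_dreg_fixed(int reg, int delta)
{
    return ((reg + delta) & 0x3) | (reg & ~0x3);
}

/* Same idea for single precision: 8 registers per bank, 3-bit offset. */
static int advance_sreg_fixed(int reg, int delta)
{
    return ((reg + delta) & 0x7) | (reg & ~0x7);
}

/* Scalar banks per the commit message: d0-d3 and d16-d19 ... */
static bool dreg_is_scalar(int reg)
{
    return (reg & 0xc) == 0;
}

/* ... and s0-s7. */
static bool sreg_is_scalar(int reg)
{
    return (reg & 0x18) == 0;
}

/* Self-check reproducing the numbers from the commit message. */
static void check_examples(void)
{
    assert(advance_dreg_buggy(6, 2) == 12); /* wrong: leaves the bank */
    assert(advance_dreg_fixed(6, 2) == 4);  /* wraps back to d4 */
    assert(advance_dreg_buggy(4, 2) == advance_dreg_fixed(4, 2)); /* no wrap */
}
```

Calling check_examples() from a test harness confirms the vd = 6,
delta_d = 2 case: the old expression yields 12 and the masked version
yields 4. When no wrap is needed the two expressions agree, which matches
the observation that the bug only shows up when the starting register is
not the first in its bank.
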
Reviewed-by: Richard Henderson <richard.hender...@linaro.org>

r~