> On 31 Jan 2016, at 23:50, David Gibson <da...@gibson.dropbear.id.au> wrote: > On Fri, Jan 29, 2016 at 06:40:21PM +0000, James Clarke wrote: >> Here is the description of the mcrfs instruction from the PowerPC >> Architecture >> Book, Version 2.02, Book I: PowerPC User Instruction Set Architecture >> (http://www.ibm.com/developerworks/systems/library/es-archguide-v2.html), >> found >> on page 120: >> >> The contents of FPSCR field BFA are copied to Condition Register field BF. >> All exception bits copied are set to 0 in the FPSCR. If the FX bit is >> copied, it is set to 0 in the FPSCR. >> >> Special Registers Altered: >> CR field BF >> FX OX (if BFA=0) >> UX ZX XX VXSNAN (if BFA=1) >> VXISI VXIDI VXZDZ VXIMZ (if BFA=2) >> VXVC (if BFA=3) >> VXSOFT VXSQRT VXCVI (if BFA=5) >> >> However, currently every bit in FPSCR field BFA is set to 0, including ones >> not >> on that list. >> >> This can be seen in the following simple C program: >> >> #include <fenv.h> >> #include <stdio.h> >> >> int main(int argc, char **argv) { >> int ret; >> ret = fegetround(); >> printf("Current rounding: %d\n", ret); >> ret = fesetround(FE_UPWARD); >> printf("Setting to FE_UPWARD (%d): %d\n", FE_UPWARD, ret); >> ret = fegetround(); >> printf("Current rounding: %d\n", ret); >> ret = fegetround(); >> printf("Current rounding: %d\n", ret); >> return 0; >> } >> >> which gave the output (before this commit): >> >> Current rounding: 0 >> Setting to FE_UPWARD (2): 0 >> Current rounding: 2 >> Current rounding: 0 >> >> instead of (after this commit): >> >> Current rounding: 0 >> Setting to FE_UPWARD (2): 0 >> Current rounding: 2 >> Current rounding: 2 >> >> The relevant disassembly is in fegetround(), which, on my system, is: >> >> __GI___fegetround: >> <+0>: mcrfs cr7, cr7 >> <+4>: mfcr r3 >> <+8>: clrldi r3, r3, 62 >> <+12>: blr >> >> What happens is that, the first time fegetround() is called, FPSCR field 7 is >> retrieved. However, because of the bug in mcrfs, the entirety of field 7 is >> set >> to 0, which includes the rounding mode. >> >> There are other issues this will fix, such as condition flags not persisting >> when they should if read, and if you were to read a specific field with some >> exception bits set, but no others were set in the entire register, then the >> bits would be cleared correctly, but FEX/VX would not be updated to 0 as they >> should be. >> >> Signed-off-by: James Clarke <jrt...@jrtc27.com> > > Thanks for the fixup. It actually looks like helper_store_fpscr() > should really take a target_ulong instead of u64 and have the (single) > caller which wants to pass a 64 do the truncate. But that can be a > cleanup for another day. > > Applied to ppc-for-2.6.
Great, thanks. I agree it seems odd, especially given the argument is cast to target_ulong, but that’s a more invasive change. > >> --- >> target-ppc/cpu.h | 6 ++++++ >> target-ppc/translate.c | 21 +++++++++++++++++---- >> 2 files changed, 23 insertions(+), 4 deletions(-) >> >> diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h >> index 3a967b7..d811bc9 100644 >> --- a/target-ppc/cpu.h >> +++ b/target-ppc/cpu.h >> @@ -718,6 +718,12 @@ enum { >> #define FP_RN1 (1ull << FPSCR_RN1) >> #define FP_RN (1ull << FPSCR_RN) >> >> +/* the exception bits which can be cleared by mcrfs - includes FX */ >> +#define FP_EX_CLEAR_BITS (FP_FX | FP_OX | FP_UX | FP_ZX | \ >> + FP_XX | FP_VXSNAN | FP_VXISI | FP_VXIDI | \ >> + FP_VXZDZ | FP_VXIMZ | FP_VXVC | FP_VXSOFT | \ >> + FP_VXSQRT | FP_VXCVI) >> + >> /*****************************************************************************/ >> /* Vector status and control register */ >> #define VSCR_NJ 16 /* Vector non-java */ >> diff --git a/target-ppc/translate.c b/target-ppc/translate.c >> index 4be7eaa..ca10bd1 100644 >> --- a/target-ppc/translate.c >> +++ b/target-ppc/translate.c >> @@ -2500,18 +2500,31 @@ static void gen_fmrgow(DisasContext *ctx) >> static void gen_mcrfs(DisasContext *ctx) >> { >> TCGv tmp = tcg_temp_new(); >> + TCGv_i32 tmask; >> + TCGv_i64 tnew_fpscr = tcg_temp_new_i64(); >> int bfa; >> + int nibble; >> + int shift; >> >> if (unlikely(!ctx->fpu_enabled)) { >> gen_exception(ctx, POWERPC_EXCP_FPU); >> return; >> } >> - bfa = 4 * (7 - crfS(ctx->opcode)); >> - tcg_gen_shri_tl(tmp, cpu_fpscr, bfa); >> + bfa = crfS(ctx->opcode); >> + nibble = 7 - bfa; >> + shift = 4 * nibble; >> + tcg_gen_shri_tl(tmp, cpu_fpscr, shift); >> tcg_gen_trunc_tl_i32(cpu_crf[crfD(ctx->opcode)], tmp); >> - tcg_temp_free(tmp); >> tcg_gen_andi_i32(cpu_crf[crfD(ctx->opcode)], cpu_crf[crfD(ctx->opcode)], >> 0xf); >> - tcg_gen_andi_tl(cpu_fpscr, cpu_fpscr, ~(0xF << bfa)); >> + tcg_temp_free(tmp); >> + tcg_gen_extu_tl_i64(tnew_fpscr, cpu_fpscr); >> + /* Only the exception bits (including FX) should be cleared if read */ >> + tcg_gen_andi_i64(tnew_fpscr, tnew_fpscr, ~((0xF << shift) & >> FP_EX_CLEAR_BITS)); >> + /* FEX and VX need to be updated, so don't set fpscr directly */ >> + tmask = tcg_const_i32(1 << nibble); >> + gen_helper_store_fpscr(cpu_env, tnew_fpscr, tmask); >> + tcg_temp_free_i32(tmask); >> + tcg_temp_free_i64(tnew_fpscr); >> } >> >> /* mffs */ > > -- > David Gibson | I'll have my music baroque, and my code > david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ > _other_ > | _way_ _around_! > http://www.ozlabs.org/~dgibson Regards, James
signature.asc
Description: Message signed with OpenPGP using GPGMail