On Thu, Jun 28, 2012 at 2:25 PM, Peter Maydell <peter.mayd...@linaro.org> wrote: > > /*---------------------------------------------------------------------------- > > +| Returns 1 if the extended double-precision floating-point value `a' is an > > +| unsupported; otherwise returns 0. > > an unsupported what? I think it should be "unsupported value" - fixed.
> > +/* > > + * BEGIN from Bochs rev 11224 dated 2012-06-21 10:33:37 -0700 > > + * > > + * Converted to use Qemu type aliases, use C features only, etc. > > + */ > > I'm not convinced we want these markers in the source code > (though they should probably be in the commit message). Ok, moved to the commit message. > Can you confirm that the Bochs licence is compatible with > the QEMU one? In particular, is it compatible with/the same > as the license as stated at the top of softfloat.c? Bochs is licensed under LGPL 2.1, and QEMU is GPL 2. According to the GNU license compatibility matrix [1], it appears to be allowed to copy code from Bochs to QEMU. However, each copy of SoftFloat is under its own license, which has a warranty disclaimer and the following: Derivative works are acceptable, even for commercial purposes, so long as (1) the source code for the derivative work includes prominent notice that the work is derivative, and (2) the source code includes prominent notice with these four paragraphs for those parts of this code that are retained. I'm not sure which license takes precedence. Perhaps because each project has modified SoftFloat, each copy is now licensed under the broader project license. In either case, I don't think there are any issues (though IANAL). > > +static uint64 remainder_kernel(uint64 aSig0, uint64 bSig, int expDiff, > > uint64 *zSig0, uint64 *zSig1) > > +{ > > + uint64 term0, term1; > > + uint64 aSig1 = 0; > > No new code should be using the uint64 &c types (which are > "at least NN bits wide") -- uint64_t or uint_fast64_t please. Ok, changed some {int -> flag} and {uint64 -> uint64_t}. There are some int32s left, but these seem to be tied to extractFloatx80Exp's return type, which is int32 despite the underlying floatx80.high being an int16_t. Is this intentional? Fixing this would imply many other changes so I would punt on this for now. > > + // handle unsupported extended double-precision floating encodings > > + if (floatx80_is_unsupported(a) || floatx80_is_unsupported(b)) > > + { > > + float_raise(float_flag_invalid, status); > > + *r = floatx80_default_nan; > > + return -1; > > + } > > So are we mishandling unsupported 80bit floats in other operations > (eg addition), or do those functions just opencode things? Yes, I believe addition (and likely others) mishandles these. I put together a small C program that clears FP exceptions, adds two long doubles, and checks FP exceptions. (I do these things using glibc functions like feclearexcept - I don't think this is any different from doing it at the machine level, but just to be sure - are you aware of any issues with this approach?) The operands were (decimal notation, then hexadecimal exponent.significand): st0: 1.000000000000 3FFF.8000000000000000 + st1: 0.000000000000 3FFF.0000000000000000 // Intel IA-32 [2] defines the latter as a "positive unnormal" (Table 8-3), one of those "unsupported" values. The result on bare metal was: result: -nan FFFF.C000000000000000 (with FE_INVALID) qemu i386-linux-user kvm-less: result: 3.000000000000 4000.C000000000000000 (no exceptions) I'm not sure what you mean by "opencode". > > +/*---------------------------------------------------------------------------- > > +| Returns the remainder of the extended double-precision floating-point > > value > > +| `a' with respect to the corresponding value `b'. The operation is > > performed > > +| according to the IEC/IEEE Standard for Binary Floating-Point Arithmetic. > > +*----------------------------------------------------------------------------*/ > This claims to return the remainder, but do_fprem only returns > 0, 1 or -1... Ok, fixed the comments. > > + /* TODO(catalinp): > > + * > > + * - What is the defined purpose of fp_status in Qemu? Seems many > > things > > + * write into it, but few if any translate into env->fpus. > > fp_status is a float_status structure which has two purposes: > * tells the softfloat code how to behave in certain circumstances > (how to handle denormals, rounding modes, etc) > * tracks the IEEE "cumulative exception flags" (inexact, overflow, etc) > Every softfloat routine which needs to set exception flags or behave > differently according to the flags in fp_status will take a float_status* > argument. (This is obscured slightly by the silly STATUS_PARAM #define.) > > (Note that for some CPU architectures there may be more than one float_status > struct; eg ARM has both an fp_status and a standard_fp_status, because Neon > operations behave differently from VFP ones; see comments in target-arm/cpu.h. > The Neon operations pass the softfloat code the standard_fp_status, and > the VFP ops pass in the fp_status.) > > The way I would recommend using it is: > * the "master copy" of this information is in the float_status > * when we do a read of whatever CPU register(s) expose this info > to the guest, construct that register value on the fly from the > float_status struct > * when we do a write of those registers (or a CPU reset), change > the float_status struct accordingly > * for most floating point arithmetic operations, no messing with > float_status should then be necessary (we just pass a pointer to > fp_status into the softfloat function) > The idea here is that fp operations are common but reading or writing > the registers which change rounding mode or reveal the exception > flags are rare, so we do the conversion only when we have to. > > target-i386 is a bit of a latecomer to using softfloat and not > particularly maintained either, so it is quite possible it is > buggy in this regard. (Very few programs actually care about > the cumulative exception flags so it is quite easy for bugs in > this area to go unnoticed.) target-arm gets it right. > > Also the functions which are still doing native arithmetic > (like the ones you're trying to convert) have to do their own > detection and handling of exception conditions. In theory at > least some of that code just vanishes into the generic softfloat > handling. Ok, this all makes sense, thanks for the clarification. > > + * - Bochs' FPU_pre_exception_handling resets a few more fields in > > fp_status > > + * before doing the operation. What is the purpose of that and is > > this > > + * necessary here? > > + */ > > + env->fp_status.float_exception_flags = 0; > > This will trash all the cumulative exception flags and is very unlikely > to be correct. I've changed exception handling around slightly. The helpers now make a copy of fp_status into local_status, and reset exception flags only there. This way I can detect newly-raised exceptions without modifying the global state. New exceptions are reported through fpu_set_exception (even though I guess that's going away eventually). Once env->fp_status becomes the authority for FPU state, local_status in these helpers can simply go away. > > + /* TODO(catalinp): Set only unmasked exceptions? */ > > + fpu_set_exception(fpu_exception_from_fp_status()); > > Check the implementation of fpu_set_exception -- it handles > the check for whether the exceptions are masked or not. Yes, you are right. I was confusing the role of the mask bit, thought it should also prevent reporting into the status word. Comment removed. As a sidenote, I did a quick check just to test my understanding of FPREM exceptions on real hardware. I set st0 to 1.0 (arbitrary positive value) and st1 to 0.0. I expected to get a div-by-zero exception, which seems to be what's defined in IA-32 Table 3-41. (ST0=+F and ST1=+0 gives "**" which means div-by-zero). This is the snippet I used, wrapped with some variable initialization and printfs: asm("fnclex\n" "fprem\n" "fnstsw %1" : "=t"(result), "=m"(sw) : "f"(st0), "f"(st1) : "memory" ); Instead of div-by-zero, I got only invalid-operand. This is on an i5. If I replace the FPREM with a (result = st0 / st1) I get div-by-zero. Any idea what might be going on here? I will wait to hear feedback on this mail before sending a patch v2. Stefan, I've done a run of checkpatch.pl and I think I've addressed nearly all warnings. When v2 goes out, please let me know if you still see any style issues. Catalin [1] http://www.gnu.org/licenses/gpl-faq.html#AllCompatibility [2] http://download.intel.com/products/processor/manual/325462.pdf