Re: [Qemu-devel] [PATCH] target-i386: implement FPREM and FPREM1 using softfloat only

Catalin Patulea Thu, 28 Jun 2012 18:51:02 -0700

On Thu, Jun 28, 2012 at 2:25 PM, Peter Maydell <peter.mayd...@linaro.org> wrote:
> >  /*----------------------------------------------------------------------------
> > +| Returns 1 if the extended double-precision floating-point value `a' is an
> > +| unsupported; otherwise returns 0.
>
> an unsupported what?
I think it should be "unsupported value" - fixed.


> > +/*
> > + * BEGIN from Bochs rev 11224 dated 2012-06-21 10:33:37 -0700
> > + *
> > + * Converted to use Qemu type aliases, use C features only, etc.
> > + */
>
> I'm not convinced we want these markers in the source code
> (though they should probably be in the commit message).
Ok, moved to the commit message.

> Can you confirm that the Bochs licence is compatible with
> the QEMU one? In particular, is it compatible with/the same
> as the license as stated at the top of softfloat.c?
Bochs is licensed under LGPL 2.1, and QEMU is GPL 2. According to the
GNU license compatibility matrix [1], it appears to be allowed to copy
code from Bochs to QEMU.

However, each copy of SoftFloat is under its own license, which has a
warranty disclaimer and the following:

Derivative works are acceptable, even for commercial purposes, so long as
(1) the source code for the derivative work includes prominent notice that
the work is derivative, and (2) the source code includes prominent notice with
these four paragraphs for those parts of this code that are retained.

I'm not sure which license takes precedence. Perhaps because each
project has modified SoftFloat, each copy is now licensed under the
broader project license. In either case, I don't think there are any
issues (though IANAL).

> > +static uint64 remainder_kernel(uint64 aSig0, uint64 bSig, int expDiff, 
> > uint64 *zSig0, uint64 *zSig1)
> > +{
> > +    uint64 term0, term1;
> > +    uint64 aSig1 = 0;
>
> No new code should be using the uint64 &c types (which are
> "at least NN bits wide") -- uint64_t or uint_fast64_t please.
Ok, changed some {int -> flag} and {uint64 -> uint64_t}.

There are some int32s left, but these seem to be tied to
extractFloatx80Exp's return type, which is int32 despite the
underlying floatx80.high being an int16_t. Is this intentional? Fixing
this would imply many other changes so I would punt on this for now.

> > +    // handle unsupported extended double-precision floating encodings
> > +    if (floatx80_is_unsupported(a) || floatx80_is_unsupported(b))
> > +    {
> > +        float_raise(float_flag_invalid, status);
> > +        *r = floatx80_default_nan;
> > +        return -1;
> > +    }
>
> So are we mishandling unsupported 80bit floats in other operations
> (eg addition), or do those functions just opencode things?
Yes, I believe addition (and likely others) mishandles these. I put
together a small C program that clears FP exceptions, adds two long
doubles, and checks FP exceptions. (I do these things using glibc
functions like feclearexcept - I don't think this is any different
from doing it at the machine level, but just to be sure - are you
aware of any issues with this approach?)

The operands were (decimal notation, then hexadecimal exponent.significand):
st0: 1.000000000000 3FFF.8000000000000000
+
st1: 0.000000000000 3FFF.0000000000000000
// Intel IA-32 [2] defines the latter as a "positive unnormal" (Table
8-3), one of those "unsupported" values.

The result on bare metal was:
result: -nan FFFF.C000000000000000  (with FE_INVALID)

qemu i386-linux-user kvm-less:
result: 3.000000000000 4000.C000000000000000 (no exceptions)

I'm not sure what you mean by "opencode".

> > +/*----------------------------------------------------------------------------
> > +| Returns the remainder of the extended double-precision floating-point 
> > value
> > +| `a' with respect to the corresponding value `b'.  The operation is 
> > performed
> > +| according to the IEC/IEEE Standard for Binary Floating-Point Arithmetic.
> > +*----------------------------------------------------------------------------*/
> This claims to return the remainder, but do_fprem only returns
> 0, 1 or -1...
Ok, fixed the comments.

> > +    /* TODO(catalinp):
> > +     *
> > +     * - What is the defined purpose of fp_status in Qemu? Seems many 
> > things
> > +     *   write into it, but few if any translate into env->fpus.
>
> fp_status is a float_status structure which has two purposes:
>  * tells the softfloat code how to behave in certain circumstances
>   (how to handle denormals, rounding modes, etc)
>  * tracks the IEEE "cumulative exception flags" (inexact, overflow, etc)
> Every softfloat routine which needs to set exception flags or behave
> differently according to the flags in fp_status will take a float_status*
> argument. (This is obscured slightly by the silly STATUS_PARAM #define.)
>
> (Note that for some CPU architectures there may be more than one float_status
> struct; eg ARM has both an fp_status and a standard_fp_status, because Neon
> operations behave differently from VFP ones; see comments in target-arm/cpu.h.
> The Neon operations pass the softfloat code the standard_fp_status, and
> the VFP ops pass in the fp_status.)
>
> The way I would recommend using it is:
>  * the "master copy" of this information is in the float_status
>  * when we do a read of whatever CPU register(s) expose this info
>   to the guest, construct that register value on the fly from the
>   float_status struct
>  * when we do a write of those registers (or a CPU reset), change
>   the float_status struct accordingly
>  * for most floating point arithmetic operations, no messing with
>   float_status should then be necessary (we just pass a pointer to
>   fp_status into the softfloat function)
> The idea here is that fp operations are common but reading or writing
> the registers which change rounding mode or reveal the exception
> flags are rare, so we do the conversion only when we have to.
>
> target-i386 is a bit of a latecomer to using softfloat and not
> particularly maintained either, so it is quite possible it is
> buggy in this regard. (Very few programs actually care about
> the cumulative exception flags so it is quite easy for bugs in
> this area to go unnoticed.) target-arm gets it right.
>
> Also the functions which are still doing native arithmetic
> (like the ones you're trying to convert) have to do their own
> detection and handling of exception conditions. In theory at
> least some of that code just vanishes into the generic softfloat
> handling.
Ok, this all makes sense, thanks for the clarification.

> > +     * - Bochs' FPU_pre_exception_handling resets a few more fields in 
> > fp_status
> > +     *   before doing the operation. What is the purpose of that and is 
> > this
> > +     *   necessary here?
> > +     */
> > +    env->fp_status.float_exception_flags = 0;
>
> This will trash all the cumulative exception flags and is very unlikely
> to be correct.
I've changed exception handling around slightly. The helpers now make
a copy of fp_status into local_status, and reset exception flags only
there. This way I can detect newly-raised exceptions without modifying
the global state. New exceptions are reported through
fpu_set_exception (even though I guess that's going away eventually).
Once env->fp_status becomes the authority for FPU state, local_status
in these helpers can simply go away.

> > +    /* TODO(catalinp): Set only unmasked exceptions? */
> > +    fpu_set_exception(fpu_exception_from_fp_status());
>
> Check the implementation of fpu_set_exception -- it handles
> the check for whether the exceptions are masked or not.
Yes, you are right. I was confusing the role of the mask bit, thought
it should also prevent reporting into the status word. Comment
removed.

As a sidenote, I did a quick check just to test my understanding of
FPREM exceptions on real hardware. I set st0 to 1.0 (arbitrary
positive value) and st1 to 0.0. I expected to get a div-by-zero
exception, which seems to be what's defined in IA-32 Table 3-41.
(ST0=+F and ST1=+0 gives "**" which means div-by-zero). This is the
snippet I used, wrapped with some variable initialization and printfs:

 asm("fnclex\n"
     "fprem\n"
     "fnstsw %1"
     : "=t"(result), "=m"(sw)
     : "f"(st0), "f"(st1)
     : "memory"
     );

Instead of div-by-zero, I got only invalid-operand. This is on an i5.
If I replace the FPREM with a (result = st0 / st1) I get div-by-zero.
Any idea what might be going on here?

I will wait to hear feedback on this mail before sending a patch v2.
Stefan, I've done a run of checkpatch.pl and I think I've addressed
nearly all warnings. When v2 goes out, please let me know if you still
see any style issues.

Catalin

[1] http://www.gnu.org/licenses/gpl-faq.html#AllCompatibility
[2] http://download.intel.com/products/processor/manual/325462.pdf

Re: [Qemu-devel] [PATCH] target-i386: implement FPREM and FPREM1 using softfloat only

Reply via email to