On Fri, 2 Aug 2024 at 08:45, Michael Tokarev <m...@tls.msk.ru> wrote:
>
> 01.08.2024 17:23, Peter Maydell wrote:
> > The FMOPA (widening) SME instruction takes pairs of half-precision
> > floating point values, widens them to single-precision, does a
> > two-way dot product and accumulates the results into a
> > single-precision destination.  We don't quite correctly handle the
> > FPCR bits FZ and FZ16 which control flushing of denormal inputs and
> > outputs.  This is because at the moment we pass a single float_status
> > value to the helper function, which then uses that configuration for
> > all the fp operations it does.  However, because the inputs to this
> > operation are float16 and the outputs are float32 we need to use the
> > fp_status_f16 for the float16 input widening but the normal fp_status
> > for everything else.  Otherwise we will apply the flushing control
> > FPCR.FZ16 to the 32-bit output rather than the FPCR.FZ control, and
> > incorrectly flush a denormal output to zero when we should not (or
> > vice-versa).
> >
> > (In commit 207d30b5fdb5b we tried to fix the FZ handling but
> > didn't get it right, switching from "use FPCR.FZ for everything" to
> > "use FPCR.FZ16 for everything".)
> >
> > Pass the CPU env to the sme_fmopa_h helper instead of an fp_status
> > pointer, and have the helper pass an extra fp_status into the
> > f16_dotadd() function so that we can use the right status for the
> > right parts of this operation.
> >
> > Cc: qemu-sta...@nongnu.org
> > Fixes: 207d30b5fdb5 ("target/arm: Use FPST_F16 for SME FMOPA (widening)")
> > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2373
>
> I know it's too late already, but it looks like this Fixes needs to be:
> Fixes: 3916841ac75 ("target/arm: Implement FMOPA, FMOPS (widening)")

It's fixing a mistake in 207d30b5fdb5, which is in turn fixing
a mistake in 3916841ac75 (but didn't quite get it right).
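
For readers following the thread, here is a toy C sketch of the
two-status split the commit message describes. It is not QEMU code and
does not use QEMU's softfloat API: toy_status, toy_flush_f32(),
toy_f16_to_f32() and toy_f16_dotadd() are hypothetical names made up
for this illustration, the status struct models only the flush-to-zero
control, and the arithmetic is not claimed to be bit-exact with the real
helper. It assumes the host uses IEEE binary32 floats without
flush-to-zero enabled.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a float_status: only the flush control. */
typedef struct {
    bool flush_to_zero;
} toy_status;

static uint32_t f32_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

static float f32_from_bits(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Replace a float32 denormal by a same-signed zero if st asks for it. */
static float toy_flush_f32(float f, const toy_status *st)
{
    uint32_t u = f32_bits(f);
    bool denormal = (u & 0x7f800000u) == 0 && (u & 0x007fffffu) != 0;

    return (st->flush_to_zero && denormal) ? f32_from_bits(u & 0x80000000u) : f;
}

/* Widen a half-precision bit pattern; st_f16 plays the role of FZ16. */
static float toy_f16_to_f32(uint16_t h, const toy_status *st_f16)
{
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t exp = (h >> 10) & 0x1fu;
    uint32_t frac = h & 0x3ffu;

    if (exp == 0) {
        /* Zero or float16 denormal: value is frac * 2^-24. */
        if (st_f16->flush_to_zero) {
            frac = 0;
        }
        float mag = (float)frac * 0x1p-24f;
        return sign ? -mag : mag;
    }
    /* Inf/NaN keep an all-ones exponent; normals are rebiased 15 -> 127. */
    uint32_t exp32 = (exp == 0x1fu) ? 0xffu : exp + 112u;
    return f32_from_bits(sign | (exp32 << 23) | (frac << 13));
}

/*
 * Two-way dot product of packed float16 pairs, accumulated into float32.
 * st_f16 governs only the input widening; st_f32 (the role of FZ)
 * governs the float32 accumulator and the float32 result.
 */
static float toy_f16_dotadd(float acc, uint32_t e1, uint32_t e2,
                            const toy_status *st_f16,
                            const toy_status *st_f32)
{
    double a0 = toy_f16_to_f32((uint16_t)(e1 & 0xffffu), st_f16);
    double a1 = toy_f16_to_f32((uint16_t)(e1 >> 16), st_f16);
    double b0 = toy_f16_to_f32((uint16_t)(e2 & 0xffffu), st_f16);
    double b1 = toy_f16_to_f32((uint16_t)(e2 >> 16), st_f16);
    float dot = (float)(a0 * b0 + a1 * b1);

    acc = toy_flush_f32(acc, st_f32);
    return toy_flush_f32(acc + dot, st_f32);
}
```

Passing the same toy_status for both arguments models the two earlier
behaviours: with only the f16 flush enabled, a denormal float32
accumulator is wrongly flushed to zero; with only the f32 flush
enabled, a denormal float16 input is wrongly flushed before widening.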

-- PMM
