On Fri, 2 Aug 2024 at 08:45, Michael Tokarev <m...@tls.msk.ru> wrote:
>
> 01.08.2024 17:23, Peter Maydell wrote:
> > The FMOPA (widening) SME instruction takes pairs of half-precision
> > floating point values, widens them to single-precision, does a
> > two-way dot product and accumulates the results into a
> > single-precision destination. We don't quite correctly handle the
> > FPCR bits FZ and FZ16 which control flushing of denormal inputs and
> > outputs. This is because at the moment we pass a single float_status
> > value to the helper function, which then uses that configuration for
> > all the fp operations it does. However, because the inputs to this
> > operation are float16 and the outputs are float32 we need to use the
> > fp_status_f16 for the float16 input widening but the normal fp_status
> > for everything else. Otherwise we will apply the flushing control
> > FPCR.FZ16 to the 32-bit output rather than the FPCR.FZ control, and
> > incorrectly flush a denormal output to zero when we should not (or
> > vice-versa).
> >
> > (In commit 207d30b5fdb5b we tried to fix the FZ handling but
> > didn't get it right, switching from "use FPCR.FZ for everything" to
> > "use FPCR.FZ16 for everything".)
> >
> > Pass the CPU env to the sme_fmopa_h helper instead of an fp_status
> > pointer, and have the helper pass an extra fp_status into the
> > f16_dotadd() function so that we can use the right status for the
> > right parts of this operation.
> >
> > Cc: qemu-sta...@nongnu.org
> > Fixes: 207d30b5fdb5 ("target/arm: Use FPST_F16 for SME FMOPA (widening)")
> > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2373
>
> I know it's too late already, but it looks like this Fixes needs to be:
> Fixes: 3916841ac75 ("target/arm: Implement FMOPA, FMOPS (widening)")

It's fixing a mistake in 207d30b5fdb5, which is in turn
fixing a mistake in 3916841ac75 (but didn't quite get it right).

-- PMM