On Fri, 21 Feb 2020, Peter Maydell wrote:
On Fri, 21 Feb 2020 at 16:05, BALATON Zoltan <bala...@eik.bme.hu> wrote:
On Thu, 20 Feb 2020, Richard Henderson wrote:
On 2/18/20 9:10 AM, BALATON Zoltan wrote:
+ DEFINE_PROP_BOOL("hardfloat", PowerPCCPU, hardfloat, true),
I would also prefer a different name here -- perhaps x-no-fp-fi.
What's wrong with hardfloat? That's how the code refers to this, so anyone
searching for what it does would turn up some meaningful results.
This prompted me to check what you're using the property for.
The cover letter says:
This patch implements a simple way to keep the inexact flag set for
hardfloat while still allowing to revert to softfloat for workloads
that need more accurate albeit slower emulation. (Set hardfloat
property of CPU, i.e. -cpu name,hardfloat=false for that.)
I think that is the wrong approach. Enabling use of the host
FPU should not affect the accuracy of the emulation, which
should remain bitwise-correct. We should only be using the
host FPU to the extent that we can do that without discarding
accuracy. As far as I'm aware that's how the hardfloat support
for other guest CPUs that use it works.
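
To make the constraint above concrete, here is a minimal sketch of the kind of guard a hardfloat fast path can use. As far as I recall, QEMU's fpu/softfloat.c gates its host-FPU path on a check along these lines, but the types and names below (sk_float_status, sk_can_use_host_fpu and the SK_* constants) are hypothetical stand-ins for illustration, not the real QEMU definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the guest FP status; not QEMU's real types. */
enum { SK_FLAG_INEXACT = 1 };
enum { SK_ROUND_NEAREST_EVEN = 0, SK_ROUND_TO_ZERO = 1 };

typedef struct {
    uint8_t exception_flags; /* sticky guest exception flags */
    uint8_t rounding_mode;   /* current guest rounding mode */
} sk_float_status;

/*
 * The host FPU may be used only when doing so cannot lose information:
 *  - the sticky inexact flag is already set, so another inexact result
 *    would not change the guest-visible flags, and
 *  - the guest rounding mode matches the host default (nearest-even).
 * In every other case the caller is expected to take the softfloat path.
 */
static bool sk_can_use_host_fpu(const sk_float_status *s)
{
    return (s->exception_flags & SK_FLAG_INEXACT) != 0 &&
           s->rounding_mode == SK_ROUND_NEAREST_EVEN;
}
```

A caller would test sk_can_use_host_fpu() before each operation and fall back to the software implementation whenever it returns false, which is what keeps the result and the sticky flags identical to a pure softfloat run.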
I don't know of a better approach. Please see section 4.2.2, Floating-Point
Status and Control Register, on page 124 of this document:
https://openpowerfoundation.org/?resource_lib=power-isa-version-3-0
especially the definitions of the FR and FI bits, and tell me how we can
emulate these accurately and still use the host FPU. Not using the FPU even
when these bits are not needed (which seems to be the case for all workloads
we've tested so far) seriously limits emulation speed. Spending time
emulating an obscure and unused part of the architecture when it is not
actually needed, just to keep emulation accurate but unusably slow, does not
seem to be the right approach. In an ideal world this should of course be
both fast and accurate, but nobody has managed to achieve that in the past
two years. So maybe we could give up some accuracy now to get usable speed,
and worry about emulating obscure features when we come across a workload
that actually needs them. We would still have the option to revert to
accurate but slow emulation for that until a way can be devised that is both
fast and accurate. Insisting on accuracy without offering any solution to
the current state just hinders progress.
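
To illustrate why FI and FR are the awkward part, here is a minimal sketch of the bookkeeping as I read the ISA text: XX is a sticky inexact exception bit, while FI and FR are non-sticky and describe only the most recent arithmetic instruction. The struct and function names (sk_fpscr, sk_fpscr_update) are hypothetical, and the sketch ignores the other conditions that can set FI, such as a disabled overflow exception.

```c
#include <stdbool.h>

/* Hypothetical model of three FPSCR bits; not QEMU's representation. */
typedef struct {
    bool xx; /* sticky: set once any operation has been inexact */
    bool fi; /* non-sticky: was the last operation inexact? */
    bool fr; /* non-sticky: did the last rounding increment the fraction? */
} sk_fpscr;

/*
 * Update the bits after one arithmetic operation.
 * 'inexact' and 'rounded_up' must be known for this specific operation,
 * which is exactly the information a host FPU fast path does not track
 * once it relies on the sticky flag already being set.
 */
static void sk_fpscr_update(sk_fpscr *f, bool inexact, bool rounded_up)
{
    f->fi = inexact;               /* overwritten by every operation */
    f->fr = inexact && rounded_up; /* overwritten by every operation */
    f->xx = f->xx || inexact;      /* only ever goes from 0 to 1 */
}
```

Once xx is set it stays set whatever the following operations do, so skipping the per-operation check cannot make it wrong, whereas fi and fr can flip back to 0 on the very next exact result.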
Other PowerPC emulators also seem not to bother, or have similar
optimisations. I've quickly checked three that I know about:
https://github.com/mamedev/mame/blob/master/src/devices/cpu/powerpc/ppcdrc.cpp#L1893
https://github.com/mamedev/mame/blob/master/src/devices/cpu/powerpc/ppcdrc.cpp#L3503
There's also something here, but no mention of the FI bit that I could see:
https://github.com/mamedev/mame/blob/master/src/devices/cpu/powerpc/ppccom.cpp#L2023
https://github.com/xenia-project/xenia/blob/master/src/xenia/cpu/ppc/ppc_hir_builder.cc#L428
https://github.com/dolphin-emu/dolphin/blob/master/Source/Core/Core/PowerPC/Jit64/Jit_FloatingPoint.cpp
But I'm not sure I understand all of the above, so I hope this makes more
sense to someone who can advise.
Regards,
BALATON Zoltan