Hello,

This is an RFC series to start exploring the possibility of enabling hardfloat for the PPC target, an effort that hasn't progressed in the last two years. Hopefully we can work something out now. Previously I've explored this here:
https://lists.nongnu.org/archive/html/qemu-ppc/2018-07/msg00261.html

where an ad-hoc benchmark using the lame mp3 encoder is also explained. The encoder has two versions: one using VMX and another using only FP. Both are mostly floating-point bound. I've run this test on mac99 under MorphOS before and after my patches, also verifying that the md5sum of the resulting mp3 matches (this is no proof of correctness but maybe shows that it did not break too much, at least for those ops used by this program).

I've got these measurements on an Intel i7-9700K CPU @ 3.60GHz (I did not bother to take multiple samples so these are just approximate):

1) before the patch series, using softfloat:
   lame: 4:01, lame_vmx: 3:14

2) only enabling hardfloat in fpu/softfloat.c without other changes:
   lame: 4:06, lame_vmx: 2:06
   (this shows why hardfloat was disabled, but also that VMX can benefit from it)

3) with this series, hardfloat=true:
   lame: 3:15, lame_vmx: 1:59
   (so the patch does something, even if there are probably more places where the inexact flag should be preserved to fully use hardfloat)

4) with this series but forcing softfloat with hardfloat=false:
   lame: 4:11, lame_vmx: 2:08
   (unfortunately this is slower than before, likely due to the if () added to helper_reset_fpstatus(). That should be avoided to at least get back to the previous hardfloat-enabled case, which is still slower than softfloat. So at the moment this series only makes sense if the default can be hardfloat=true, but even that would need more testing)

I hope others can contribute to this by doing more testing to find out what else this would break, or by giving some ideas on how this could be improved.

Regards,
BALATON Zoltan

BALATON Zoltan (2):
  target/ppc/cpu: Add hardfloat property
  target/ppc: Enable hardfloat for PPC

 fpu/softfloat.c                 | 14 +++++++-------
 target/ppc/cpu.h                |  2 ++
 target/ppc/fpu_helper.c         |  7 ++++++-
 target/ppc/translate_init.inc.c |  2 ++
 4 files changed, 17 insertions(+), 8 deletions(-)

-- 
2.21.1