On 12.09.2017 17:13, Paolo Bonzini wrote:
> On 12/09/2017 16:56, Thomas Huth wrote:
>> The problem is that the SLOF firmware just performs very badly with TCG
>> (it's fine on real hardware). It executes a lot of Forth code, and the
>> Forth interpreter uses things like computed gotos or other tricks that
>> basically prevent proper JIT operation here. I've done quite a bit of
>> optimizations in SLOF in the past already, but I've got hardly any ideas
>> left how to fix that further.
>
> Two ideas for QEMU based on a quick "perf record" test:
>
> - 25% of the time is spent in cpu_exec. PPC doesn't use
> tcg_gen_lookup_and_goto_ptr.

I just realized that Richard already posted a patch for this recently:

 https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg07124.html

I've applied it locally, and indeed, it speeds up a simple test with
-prom-env by a factor of two. Before the change:

$ time ppc64-softmmu/qemu-system-ppc64 -nographic -vga none -prom-env 'use-nvramrc?=true' -prom-env 'nvramrc=power-off'
[...]
real    0m28.784s
user    0m28.700s
sys     0m0.031s

After the change:

$ time ppc64-softmmu/qemu-system-ppc64 -nographic -vga none -prom-env 'use-nvramrc?=true' -prom-env 'nvramrc=power-off'
[...]
real    0m13.953s
user    0m13.904s
sys     0m0.046s

That's impressive! Richard, may I ask what the current state of this is?
Do you plan to merge it soon, or are there still issues (like the ones
that Paolo mentioned)?

However, I only see that speed-up with the normal x86 backend. I've also
tried it with TCI, but I hardly saw any improvement there ... is there
still something missing in the TCI backend that is required for the
tcg_gen_lookup_and_goto_ptr feature?

 Thomas