On 12.09.2017 17:13, Paolo Bonzini wrote:
> On 12/09/2017 16:56, Thomas Huth wrote:
>> The problem is that the SLOF firmware just performs very badly with TCG
>> (it's fine on real hardware). It executes a lot of Forth code, and the
>> Forth interpreter uses things like computed gotos or other tricks that
>> basically prevent proper JIT operation here. I've done quite a bit of
>> optimizations in SLOF in the past already, but I've got hardly any ideas
>> left how to fix that further.
> 
> Two ideas for QEMU based on a quick "perf record" test:
> 
> - 25% of the time is spent in cpu_exec.  PPC doesn't use
> tcg_gen_lookup_and_goto_ptr.

I just realized that Richard recently already posted a patch for this:

https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg07124.html

I've applied it locally, and indeed, it speeds up a simple test with
-prom-env by a factor of two. Before the change:

$ time ppc64-softmmu/qemu-system-ppc64 -nographic -vga none -prom-env \
  'use-nvramrc?=true' -prom-env 'nvramrc=power-off'
[...]
real    0m28.784s
user    0m28.700s
sys     0m0.031s

After the change:

$ time ppc64-softmmu/qemu-system-ppc64 -nographic -vga none -prom-env \
  'use-nvramrc?=true' -prom-env 'nvramrc=power-off'
[...]
real    0m13.953s
user    0m13.904s
sys     0m0.046s
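
As a quick sanity check of the "factor of two" figure, the ratio of the two
wall-clock times can be worked out directly. This is only a small sketch; the
variable names are made up for illustration, and the numbers are the "real"
times copied from the two runs above (a single run each, so treat the result
as approximate):

```python
# Wall-clock ("real") times in seconds, copied from the two runs above.
before_s = 28.784  # without the lookup_and_goto_ptr patch
after_s = 13.953   # with the patch applied

# Speed-up factor and relative reduction in run time.
speedup = before_s / after_s
saved_pct = 100.0 * (before_s - after_s) / before_s

print(f"speed-up: {speedup:.2f}x, time saved: {saved_pct:.1f}%")
```

So the measured speed-up is slightly above 2x for this particular test.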

That's impressive! Richard, may I ask what the current state of this is?
Do you plan to merge it soon, or are there still issues (like the ones
that Paolo mentioned)?

However, I only see that speed-up with the normal x86 backend. I've also
tried it with TCI, but saw hardly any improvement there. Is there still
something missing in the TCI backend that is required for the
tcg_gen_lookup_and_goto_ptr feature?

 Thomas
