On 10/25/22 16:29, Daniel Henrique Barboza wrote:
On 10/21/22 14:01, Leandro Lupori wrote:
Profiling QEMU during Fedora 35 for PPC64 boot revealed that
6.39% of total time was being spent in helper_insns_inc(), on a
POWER9 machine. To avoid calling this helper every time PMCs had
to be incremented, an inline implementation of PMC5 increment and
check for overflow was developed. This led to a reduction of
about 12% in Fedora's boot time.
Signed-off-by: Leandro Lupori <leandro.lup...@eldorado.org.br>
---
Given that PMC5 is the counter that is most likely to be active, yeah,
isolating the case where PMC5 is incremented standalone makes sense.
Still, 12% performance gain is not too shabby. Not too shabby at all.
I've tried to move more of helper_insns_inc() to the inline
implementation, but then performance started to decrease.
Initially I found this strange, but perf revealed a considerable
increase of time spent in functions such as tcg_gen_code and
liveness_pass_1.
Since this code has to be generated and optimized for most TBs, making
it too big seems to slow down code generation itself.
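
For readers following along, the fast-path/slow-path split discussed above
can be sketched in plain C. This is only an illustration, not the code from
the patch: ToyPMU, toy_insns_inc() and toy_insns_inc_slow() are hypothetical
names, and the overflow condition (bit 31 of the counter set) is an
assumption of the sketch. The real inline code is emitted as TCG ops at
translation time, which is why its size matters for every TB.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: a counter is treated as overflowed once bit 31 is set. */
#define TOY_PMC_OVERFLOW 0x80000000u

/* Hypothetical stand-in for the PMU state; not QEMU's CPUPPCState. */
typedef struct {
    uint32_t pmc5;          /* instruction counter */
    bool pmc5_only;         /* true when PMC5 is the only counter to update */
    bool overflow_pending;  /* set by the slow path on overflow */
    unsigned slow_calls;    /* how many times the out-of-line path ran */
} ToyPMU;

/* Slow path: stands in for the full out-of-line helper. */
static void toy_insns_inc_slow(ToyPMU *pmu, uint32_t num_insns)
{
    pmu->slow_calls++;
    pmu->pmc5 += num_insns;
    if (pmu->pmc5 >= TOY_PMC_OVERFLOW) {
        pmu->overflow_pending = true;
    }
}

/*
 * Fast path: kept deliberately small. It handles only the common case
 * (PMC5 alone, no overflow) and defers everything else to the slow path.
 */
static inline void toy_insns_inc(ToyPMU *pmu, uint32_t num_insns)
{
    if (pmu->pmc5_only) {
        uint32_t next = pmu->pmc5 + num_insns;
        if (next < TOY_PMC_OVERFLOW) {
            pmu->pmc5 = next;
            return;
        }
    }
    toy_insns_inc_slow(pmu, num_insns);
}
```

The design point is the one made above: the inline part pays off only while
it stays small, so the rarer cases remain in the out-of-line path.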
Reviewed-by: Daniel Henrique Barboza <danielhb...@gmail.com>