On 01/07/2021 16:25, Richard Henderson wrote:
Based-on: <20210630183226.3290849-1-richard.hender...@linaro.org>
("[PATCH v2 00/28] accel/tcg: Introduce translator_use_goto_tb")
This is my attempt at fixing #404 ("windows xp boot takes much longer...").
I don't actually have Windows XP available myself, so I don't know
whether this has really worked. I can still boot Windows 7, but from
the lack of tracepoint firings I guess it doesn't play any silly
games with breakpoints.
This scheme is not without its drawbacks. In exchange for no bookkeeping
and invalidation whatsoever, other code on the same page as an active
breakpoint runs one insn per tb, doing indirect chaining through
helper_lookup_tb_ptr to see if we hit the breakpoint.
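The trade-off described above can be illustrated with a minimal sketch. This is not the code from the series: the names choose_block_flags, page_has_breakpoint, FLAG_NO_GOTO_TB, PAGE_BITS and MAX_INSNS are hypothetical, and the 4 KiB page size and 512-insn limit are assumptions made for the example. It only shows the idea that a block starting on a page with an active breakpoint is limited to one insn and is not chained directly, while every other block keeps the normal limit.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants for this sketch; not QEMU's definitions. */
#define PAGE_BITS        12                               /* assumed 4 KiB pages */
#define PAGE_MASK        (~(((uint64_t)1 << PAGE_BITS) - 1))
#define MAX_INSNS        512u                             /* assumed normal per-block insn limit */
#define FLAG_NO_GOTO_TB  0x1000u                          /* assumed flag: no direct chaining */

/* Return true if any breakpoint address lies on the same page as pc. */
static bool page_has_breakpoint(uint64_t pc, const uint64_t *bps, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if ((bps[i] & PAGE_MASK) == (pc & PAGE_MASK)) {
            return true;
        }
    }
    return false;
}

/*
 * Pick the flags for a block starting at pc. The low bits hold the
 * insn count limit; FLAG_NO_GOTO_TB forces the next block to be found
 * by a lookup, which is where a breakpoint hit could be detected.
 */
static uint32_t choose_block_flags(uint64_t pc, const uint64_t *bps, size_t n)
{
    if (page_has_breakpoint(pc, bps, n)) {
        return FLAG_NO_GOTO_TB | 1u;   /* one insn per block on this page */
    }
    return MAX_INSNS;                  /* unaffected pages keep the normal limit */
}
```

Because the decision depends only on the current breakpoint list and the page of pc, nothing has to be invalidated when a breakpoint is added or removed. The cost is the slow path for all code sharing a page with a breakpoint.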
The minor testing that I did seemed fast enough though, with gdb
responding quickly. So before I go off and try to complicate
things again with extra bookkeeping, I thought I'd get some feedback.
r~
Richard Henderson (17):
target/i386: Use cpu_breakpoint_test in breakpoint_handler
accel/tcg: Move helper_lookup_tb_ptr to cpu-exec.c
accel/tcg: Move tb_lookup to cpu-exec.c
accel/tcg: Split out log_cpu_exec
accel/tcg: Log tb->cflags with -d exec
tcg: Remove TCG_TARGET_HAS_goto_ptr
accel/tcg: Reduce CF_COUNT_MASK to match TCG_MAX_INSNS
accel/tcg: Move curr_cflags into cpu-exec.c
accel/tcg: Add CF_NO_GOTO_TB and CF_NO_GOTO_PTR
accel/tcg: Drop CF_NO_GOTO_PTR from -d nochain
accel/tcg: Handle -singlestep in curr_cflags
accel/tcg: Use CF_NO_GOTO_{TB,PTR} in cpu_exec_step_atomic
accel/tcg: Move cflags lookup into tb_find
accel/tcg: Adjust interface of TranslatorOps.breakpoint_check
accel/tcg: Hoist tb_cflags to a local in translator_loop
accel/tcg: Encode breakpoint info into tb->cflags
cpu: Add breakpoint tracepoints
accel/tcg/tb-lookup.h | 49 ------
include/exec/exec-all.h | 30 ++--
include/exec/translator.h | 17 +-
include/tcg/tcg-opc.h | 3 +-
tcg/aarch64/tcg-target.h | 1 -
tcg/arm/tcg-target.h | 1 -
tcg/i386/tcg-target.h | 1 -
tcg/mips/tcg-target.h | 1 -
tcg/ppc/tcg-target.h | 1 -
tcg/riscv/tcg-target.h | 1 -
tcg/s390/tcg-target.h | 1 -
tcg/sparc/tcg-target.h | 1 -
tcg/tci/tcg-target.h | 1 -
accel/tcg/cpu-exec.c | 238 +++++++++++++++++++++++-----
accel/tcg/tcg-runtime.c | 22 ---
accel/tcg/translate-all.c | 7 +-
accel/tcg/translator.c | 79 ++++++---
cpu.c | 35 +---
target/alpha/translate.c | 12 +-
target/arm/translate-a64.c | 14 +-
target/arm/translate.c | 20 +--
target/avr/translate.c | 6 +-
target/cris/translate.c | 14 +-
target/hexagon/translate.c | 13 +-
target/hppa/translate.c | 7 +-
target/i386/tcg/sysemu/bpt_helper.c | 12 +-
target/i386/tcg/translate.c | 15 +-
target/m68k/translate.c | 14 +-
target/microblaze/translate.c | 14 +-
target/mips/tcg/translate.c | 14 +-
target/nios2/translate.c | 13 +-
target/openrisc/translate.c | 11 +-
target/ppc/translate.c | 13 +-
target/riscv/translate.c | 11 +-
target/rx/translate.c | 8 +-
target/s390x/translate.c | 12 +-
target/sh4/translate.c | 12 +-
target/sparc/translate.c | 9 +-
target/tricore/translate.c | 13 +-
target/xtensa/translate.c | 12 +-
tcg/tcg-op.c | 28 ++--
tcg/tcg.c | 8 +-
trace-events | 5 +
43 files changed, 386 insertions(+), 413 deletions(-)
delete mode 100644 accel/tcg/tb-lookup.h
Thanks Richard! I grabbed the git tag from patchew and gave it a quick smoke test
booting a pre-installed WinXP image to the login screen. I've included some extra
times below for comparison taken from my rather modest laptop:
b55f54bc~1 (i.e. last known good commit): 43s
current master (9c2647f750): 2m 40s
breakpoint reorg patchew tag: 25s(!)
I can certainly report that booting WinXP from the patchew tag is substantially
faster than both git master (roughly 6.4x, 160s vs 25s) and the last known good
commit before the TLB flush logic was altered (roughly 1.7x, 43s vs 25s), so the
initial results are extremely promising.
ATB,
Mark.