Hi Richard,

On Thu, Feb 14, 2019 at 5:07 AM Richard Henderson
<richard.hender...@linaro.org> wrote:
>
> We've talked about this before, caching state to reduce the
> amount of computation that happens looking up each TB.
>
> I know that Peter has been concerned that we would not be able to
> reliably maintain all of the places that need to be updated to
> keep this up-to-date.
>
> Well, modulo dirty tricks within linux-user, it appears as if
> exception delivery and return, plus after every TB-ending write
> to a system register is sufficient.
>
> There seems to be a noticeable improvement, although wall-time
> is harder to come by -- all of my system-level measurements
> include user input, and my user-level measurements seem to be
> too small to matter.

FWIW, this patch series made a run of linux-user AArch64 176.gcc 166.i
go from 29.5s down to 24.5s (on an E5-2650 v2), a reduction of roughly
17%. That would need more benchmarks to confirm, but it looks quite good
to me.

Thanks,

Laurent

>
> r~
>
>
> Richard Henderson (4):
>   target/arm: Split out recompute_hflags et al
>   target/arm: Rebuild hflags at el changes and MSR writes
>   target/arm: Assert hflags is correct in cpu_get_tb_cpu_state
>   target/arm: Rely on hflags correct in cpu_get_tb_cpu_state
>
>  target/arm/cpu.h           |  22 ++-
>  target/arm/helper.h        |   3 +
>  target/arm/internals.h     |   4 +
>  linux-user/syscall.c       |   1 +
>  target/arm/cpu.c           |   1 +
>  target/arm/helper-a64.c    |   3 +
>  target/arm/helper.c        | 267 ++++++++++++++++++++++---------------
>  target/arm/machine.c       |   1 +
>  target/arm/op_helper.c     |   1 +
>  target/arm/translate-a64.c |   6 +-
>  target/arm/translate.c     |  14 +-
>  11 files changed, 204 insertions(+), 119 deletions(-)
>
> --
> 2.17.1
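
For anyone skimming the thread, here is a rough sketch of the idea in the cover letter: compute the flags word once when the state it depends on changes, then have the per-TB lookup just read the cached copy. This is not the code from the series. ToyCPU, compute_hflags, rebuild_hflags and the toy_* functions are all hypothetical names invented for illustration, and the flag layout is made up. The assert loosely mirrors the third patch's idea of checking the cached value before relying on it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, heavily simplified CPU state; not QEMU's CPUARMState. */
typedef struct {
    unsigned el;        /* current exception level, 0..3 */
    bool mmu_on;        /* stand-in for one system-register control bit */
    bool fp_enabled;    /* stand-in for another control bit */
    uint32_t hflags;    /* cached summary, valid after rebuild_hflags() */
} ToyCPU;

/* The "expensive" derivation, recomputed from scratch from raw state. */
static uint32_t compute_hflags(const ToyCPU *cpu)
{
    uint32_t flags = cpu->el & 3u;      /* bits 0-1: exception level */

    if (cpu->mmu_on) {
        flags |= 1u << 2;               /* bit 2: MMU enabled */
    }
    if (cpu->fp_enabled) {
        flags |= 1u << 3;               /* bit 3: FP enabled */
    }
    return flags;
}

/* Refresh the cache; call this wherever the inputs above can change. */
static void rebuild_hflags(ToyCPU *cpu)
{
    cpu->hflags = compute_hflags(cpu);
}

/* Exception delivery or return changes the EL, so rebuild afterwards. */
static void toy_exception_entry(ToyCPU *cpu, unsigned target_el)
{
    cpu->el = target_el;
    rebuild_hflags(cpu);
}

/* A TB-ending system register write changes control bits, so rebuild. */
static void toy_sysreg_write(ToyCPU *cpu, bool mmu_on, bool fp_enabled)
{
    cpu->mmu_on = mmu_on;
    cpu->fp_enabled = fp_enabled;
    rebuild_hflags(cpu);
}

/*
 * Hot path: return the cached value instead of recomputing it.
 * The assert catches a missed rebuild site in debug builds and is
 * compiled out when NDEBUG is defined.
 */
static uint32_t toy_get_tb_flags(const ToyCPU *cpu)
{
    assert(cpu->hflags == compute_hflags(cpu));
    return cpu->hflags;
}
```

The trade-off is the one raised in the cover letter: the lookup gets cheaper, but every place that modifies an input has to remember to call the rebuild function, which is why a debug-time consistency check is useful while the rebuild sites are being audited.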