The "faster helper" in the last patch is not very impressive. GCC does not inline the common helper into the TCG-specific helper; instead it emits a tail call from the TCG-specific helper to the common helper. This results in useless shuffling of registers and stack, and the stack protector kicks in a second time. I also tried modifying the common helper (passing a boolean flag to select the missed case first), but GCC still generated exactly the same code.
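For illustration, the two-level helper arrangement described above can be sketched in plain C roughly as follows. This is not the code from the patch: every name here (common_ldb, tcg_ldb_miss, tlb_fill, the TLB layout and the sizes) is hypothetical, and the meaning given to the boolean flag is only an assumed reading of "select missed case first". Whether a given compiler inlines the common helper or emits a tail call for the wrapper depends on the compiler version and options.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sizes, chosen only to keep the sketch small. */
#define GUEST_PAGE_BITS 12
#define GUEST_PAGE_SIZE ((uintptr_t)1 << GUEST_PAGE_BITS)
#define GUEST_PAGE_MASK (~(GUEST_PAGE_SIZE - 1))
#define TLB_SIZE        16
#define RAM_SIZE        (TLB_SIZE * GUEST_PAGE_SIZE)

/* Hypothetical, much simplified software TLB entry. */
typedef struct {
    bool valid;          /* entry has been filled at least once */
    uintptr_t tag;       /* page-aligned guest address */
    uint8_t *host_page;  /* host memory backing that page */
} TLBEntry;

uint8_t guest_ram[RAM_SIZE];
TLBEntry tlb[TLB_SIZE];
unsigned tlb_fills;      /* counts refills, for demonstration only */

/* Slow path: (re)load the TLB entry for the page containing addr. */
void tlb_fill(TLBEntry *e, uintptr_t addr)
{
    uintptr_t page = addr & GUEST_PAGE_MASK;

    assert(page < RAM_SIZE);
    e->valid = true;
    e->tag = page;
    e->host_page = &guest_ram[page];
    tlb_fills++;
}

/*
 * Common helper. When miss_first is true the caller already knows the
 * lookup missed, so the refill is done without repeating the compare.
 */
uint8_t common_ldb(uintptr_t addr, bool miss_first)
{
    TLBEntry *e = &tlb[(addr >> GUEST_PAGE_BITS) & (TLB_SIZE - 1)];

    if (miss_first || !e->valid || e->tag != (addr & GUEST_PAGE_MASK)) {
        tlb_fill(e, addr);
    }
    return e->host_page[addr & ~GUEST_PAGE_MASK];
}

/*
 * TCG-specific helper: called from generated code only after the
 * inline TLB compare has failed. If the compiler does not inline
 * common_ldb() here, this becomes a second call (or a tail call).
 */
uint8_t tcg_ldb_miss(uintptr_t addr)
{
    return common_ldb(addr, true);
}
```

Comparing the output of `gcc -O2 -S` for tcg_ldb_miss() with and without inlining of common_ldb() is one way to see the extra register and stack traffic on a particular compiler.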
Anyway, patches 1 to 3 could serve as a basis to optimize the interface between TCG generated code and the memory access helpers further. For example, coding the helpers in assembly would be possible. The TLB indices etc. which have been calculated in generated code could be reused in the access helper.

Blue Swirl (4):
  softmmu: move target alignment definition to configure stage
  softmmu: make unaligned access helper global
  softmmu: move TCG memory access helpers to TCG targets
  softmmu: add a faster helper for TCG

 configure                  | 11 +++++++
 exec-all.h                 |  9 ++++++
 exec.c                     |  6 ++++
 softmmu_template.h         | 73 +++++++++++++++++++++++++++++++++++---------
 target-alpha/mem_helper.c  | 19 ++----------
 target-mips/op_helper.c    | 18 ++++++++---
 target-sparc/cpu.h         |  3 --
 target-sparc/ldst_helper.c | 18 ++---------
 target-xtensa/op_helper.c  | 19 ++++++++----
 tcg/arm/tcg-target.c       | 33 ++++++++++++++------
 tcg/hppa/tcg-target.c      | 31 ++++++++++++++-----
 tcg/i386/tcg-target.c      | 31 ++++++++++++++-----
 tcg/ia64/tcg-target.c      | 31 ++++++++++++++-----
 tcg/mips/tcg-target.c      | 31 ++++++++++++++-----
 tcg/ppc/tcg-target.c       | 31 ++++++++++++++-----
 tcg/ppc64/tcg-target.c     | 31 ++++++++++++++-----
 tcg/s390/tcg-target.c      | 31 ++++++++++++++-----
 tcg/sparc/tcg-target.c     | 31 ++++++++++++++-----
 tci.c                      | 21 +++++++++++++
 19 files changed, 344 insertions(+), 134 deletions(-)

--
1.7.10