About half of these patches are focused on reducing the number of full 64-bit constants that need to be generated for addresses:

E.g. patch 5, looking through the function descriptor.  If the
program is built --disable-pie, the elements of the function
descriptors are all 32-bit constants.

E.g. the end result of indirect jump threading + TCG_REG_TB.  Before,
we reserve 6 insn slots to generate the full 64-bit address.  After,
we use 2 insns -- addis + ld -- to load the full 64-bit address from
the indirection slot.

The second patch could probably be reverted.  I'd planned to be able
to use the same conditional call + tail call scheme as ARM, but I'd
forgotten the need for a conditional store to go along with that.
OTOH, it might still turn out to be useful somewhere.


r~


Richard Henderson (15):
  tcg-ppc64: Avoid code for nop move
  tcg-ppc64: Add an LK argument to tcg_out_call
  tcg-ppc64: Use the branch absolute instruction when possible
  tcg-ppc64: Don't load the static chain from TCG
  tcg-ppc64: Look through the function descriptor when profitable
  tcg-ppc64: Move AREG0 to r31
  tcg-ppc64: Tidy register allocation order
  tcg-ppc64: Create PowerOpcode
  tcg-ppc64: Handle long offsets better
  tcg-ppc64: Use indirect jump threading
  tcg-ppc64: Setup TCG_REG_TB
  tcg-ppc64: Use TCG_REG_TB in tcg_out_movi and tcg_out_mem_long
  tcg-ppc64: Tidy tcg_target_qemu_prologue
  tcg-ppc64: Streamline tcg_out_tlb_read
  tcg-ppc64: Implement CONFIG_QEMU_LDST_OPTIMIZATION

 configure               |    2 +-
 include/exec/exec-all.h |    7 +-
 tcg/ppc64/tcg-target.c  | 1079 ++++++++++++++++++++++++++---------------------
 tcg/ppc64/tcg-target.h  |    2 +-
 4 files changed, 598 insertions(+), 492 deletions(-)

-- 
1.8.3.1