This includes improved translation of checks, microoptimization of the helpers, and improvements to the cc_op_* functions from Richard.
Unlike his original patches[1], I didn't convert cc_op_live() to a switch statement; instead I kept the array but made sure that all of its entries are nonzero. The only zero entry was CC_OP_CLR, which is now changed to spill the constant value of EFLAGS to cc_op_src.

While this has a 0.2% cost in number of TCG ops, getting rid of the special case for CC_OP_CLR makes it even easier to optimize computation of ZF from CC_OP_DYNAMIC; this is quite common, for example in switch statements that have CMP/JG/JE sequences (JE followed by JL/JG/JA/JB seems less common than the opposite, though that's not universal).

On a quick-and-dirty run of "ls -lR", the changes add ~750 spills of 0x44 to cc_op_src, but they also halve the number of calls to cc_compute_all (most of them are completely eliminated), and those calls are a lot more expensive than the spills.

One thing I noticed is that those spills are really huge (11 bytes). It might help to move cc_* to the very beginning of CPUX86State, because the number of accesses to cc_* is comparable to the number of accesses to registers (despite cc_* being mostly written, while registers are both read and written).
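
As a rough illustration of the idea (a minimal sketch, not the code in the patches): once every entry of the liveness array is nonzero, a zero can only mean "unknown or invalid op", so the lookup can be wrapped in a cheap validity check. All names prefixed with demo_/DEMO_ are hypothetical, and the enum is a heavily trimmed stand-in for the real CCOp. The 0x44 constant is ZF | PF, i.e. the flags left behind by an instruction whose result is zero, which is the value that now gets spilled in place of CC_OP_CLR.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the flag-variable liveness bits. */
enum {
    USES_CC_DST  = 1 << 0,
    USES_CC_SRC  = 1 << 1,
    USES_CC_SRC2 = 1 << 2,
};

/* Hypothetical, trimmed-down version of the CCOp enumeration. */
typedef enum {
    DEMO_CC_OP_DYNAMIC,
    DEMO_CC_OP_EFLAGS,
    DEMO_CC_OP_ADDB,
    DEMO_CC_OP_LOGICB,
    DEMO_CC_OP_NB,
} DemoCCOp;

/* x86 EFLAGS bits: PF is bit 2, ZF is bit 6. */
#define DEMO_CC_P 0x04
#define DEMO_CC_Z 0x40

/* Every valid op has at least one live variable, so no entry is zero. */
static const uint8_t demo_cc_op_live_table[DEMO_CC_OP_NB] = {
    [DEMO_CC_OP_DYNAMIC] = USES_CC_DST | USES_CC_SRC | USES_CC_SRC2,
    [DEMO_CC_OP_EFLAGS]  = USES_CC_SRC,
    [DEMO_CC_OP_ADDB]    = USES_CC_DST | USES_CC_SRC,
    [DEMO_CC_OP_LOGICB]  = USES_CC_DST,
};

/* Table lookup wrapped with a validity check. */
static uint8_t demo_cc_op_live(DemoCCOp op)
{
    uint8_t live;

    assert((unsigned)op < DEMO_CC_OP_NB);
    live = demo_cc_op_live_table[op];
    /* A zero entry would indicate a missing or invalid table slot. */
    assert(live != 0);
    return live;
}

/* EFLAGS after clearing a register: result is zero, parity is even. */
static uint32_t demo_eflags_after_clear(void)
{
    return DEMO_CC_Z | DEMO_CC_P;
}
```

With this shape, clearing a register needs no dedicated op: the translator can store the constant from demo_eflags_after_clear() in the variable that the EFLAGS-style op reads, at the price of the extra spill mentioned above.
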
Thanks,

Paolo

[1] https://patchew.org/QEMU/20240701025115.1265117-1-richard.hender...@linaro.org/

Paolo Bonzini (10):
  target/i386: use tcg_gen_ext_tl when applicable
  target/i386: remove CC_OP_CLR
  target/i386: optimize computation of ZF from CC_OP_DYNAMIC
  target/i386: optimize TEST+Jxx sequences
  target/i386: add a few more trivial CCPrepare cases
  target/i386: add a note about gen_jcc1
  target/i386: make flag variables unsigned
  target/i386: use builtin popcnt or parity to compute PF, if available
  target/i386: use higher-precision arithmetic to compute CF
  target/i386: use + to put flags together

Richard Henderson (4):
  target/i386: Tidy cc_op_str usage
  target/i386: Rearrange CCOp
  target/i386: Introduce cc_op_size
  target/i386: Wrap cc_op_live with a validity check

 include/qemu/host-utils.h                |  16 +++
 target/i386/cpu.h                        |  33 ++++--
 target/i386/helper.h                     |   1 +
 target/i386/tcg/helper-tcg.h             |  12 +++
 target/i386/tcg/cc_helper_template.h.inc | 127 +++++++++++++++--------
 target/i386/cpu-dump.c                   |  18 ++--
 target/i386/tcg/cc_helper.c              |  25 ++++-
 target/i386/tcg/int_helper.c             |   4 +-
 target/i386/tcg/translate.c              | 103 ++++++++++++------
 target/i386/tcg/decode-new.c.inc         |   2 +-
 target/i386/tcg/emit.c.inc               |  24 ++---
 11 files changed, 249 insertions(+), 116 deletions(-)

-- 
2.46.2