On 18/06/2015 11:42, Aurelien Jarno wrote:
>> > QEMU could just always compute and store the restore_state information.
>> > TCG needs to help filling it in (a new TCG opcode?), but it should be
>> > easy.
> Yes, that was another approach I have in mind (I called it exception
> table in my other mail),

Okay, understood. My idea was more like always generating the gen_opc_*
arrays.

> but it requires a tiny bit more work than just
> saving the CPU state all the time. The problem is that the state
> information we want to save varies from target to target. Going
> through a TCG opcode means we can use the liveness analysis pass to save
> the minimum amount of data.

I mentioned a TCG opcode because the host PC is not available inside
the translator. So the translator could pepper the TCG instruction
stream with things like

    checkpoint $target_pc, $target_cc_op, $0

TCG can then use them to fill in an array stored inside the
TranslationBlock, together with the host PC.

Since the gen_opc_pc, gen_opc_instr_start, gen_opc_icount arrays are
inside tcg_ctx, it may be a good idea to store the checkpoint
information compressed in a byte array (e.g. as a series of ULEB128
values; the host and target PCs can even be stored as deltas from the
last value).

As a first step, gen_intermediate_code_pc and tcg_gen_code_search_pc
can then be merged into a single target-independent function that
uncompresses the byte array up to the required host PC into tcg_ctx.
Later you can optimize them to remove the tcg_ctx arrays altogether.

So the patches could be something like this:

1) SPARC: put the jump target information directly in gen_opc_*
   without using gen_opc_jump_pc (not trivial)
2) a few targets: instead of gen_opc_* arrays, use a new generic
   member of tcg_ctx (similar to how csbase is used generically), e.g.
   tcg_ctx.gen_opc_target1[] and tcg_ctx.gen_opc_target2[].
3) all targets: always fill in tcg_ctx.gen_*, even if search_pc is
   false
4) TCG: add support for a checkpoint operation, make it fill in
   tcg_ctx.gen_*
5) all targets: change explicit filling of tcg_ctx.gen_* to use the
   checkpoint operation
6) TCG/translate-all: convert gen_intermediate_code_pc as outlined
   above

> That said I would like to push further the idea of always saving the CPU
> state a bit more to see if we can keep the same performances. There are
> still improvements to do, by removing more code on the core side (like
> finding the call to tb_find_pc which is now useless), or on the target
> side by checking/improving helper flags. We might save the CPU state too
> often if a helper doesn't declare it doesn't touch globals.

True, on the other hand there are a lot of helpers to audit...

Paolo
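
P.S. To make the compressed checkpoint array more concrete, here is a
rough, standalone sketch. It is not existing QEMU code. The Checkpoint
struct and the encode_checkpoints/find_checkpoint names are made up for
illustration. It assumes that host and target PCs never decrease within
a TB (so unsigned deltas suffice), and that each checkpoint carries
exactly one extra word (the cc_op operand from the example above).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record, one per checkpoint op (not a QEMU type). */
typedef struct Checkpoint {
    uint64_t host_pc;    /* offset of the host code within the TB */
    uint64_t target_pc;  /* guest PC of the instruction */
    uint64_t cc_op;      /* one extra target-specific word */
} Checkpoint;

/* Append v as ULEB128: 7 bits per byte, high bit set means "more". */
static uint8_t *encode_uleb128(uint8_t *p, uint64_t v)
{
    do {
        uint8_t byte = v & 0x7f;
        v >>= 7;
        if (v != 0) {
            byte |= 0x80;
        }
        *p++ = byte;
    } while (v != 0);
    return p;
}

/* Read one ULEB128 value; returns the pointer past its last byte. */
static const uint8_t *decode_uleb128(const uint8_t *p, uint64_t *v)
{
    uint64_t result = 0;
    unsigned shift = 0;
    uint8_t byte;

    do {
        byte = *p++;
        result |= (uint64_t)(byte & 0x7f) << shift;
        shift += 7;
    } while (byte & 0x80);
    *v = result;
    return p;
}

/* Store both PCs as deltas from the previous checkpoint (assumed
 * non-decreasing) and cc_op as is. Returns the number of bytes used. */
static size_t encode_checkpoints(uint8_t *buf, const Checkpoint *cp, size_t n)
{
    uint8_t *p = buf;
    uint64_t prev_host = 0, prev_target = 0;

    for (size_t i = 0; i < n; i++) {
        p = encode_uleb128(p, cp[i].host_pc - prev_host);
        p = encode_uleb128(p, cp[i].target_pc - prev_target);
        p = encode_uleb128(p, cp[i].cc_op);
        prev_host = cp[i].host_pc;
        prev_target = cp[i].target_pc;
    }
    return (size_t)(p - buf);
}

/* Uncompress up to search_pc: yields the last checkpoint whose host_pc
 * is <= search_pc. Returns 1 if one exists, 0 otherwise. */
static int find_checkpoint(const uint8_t *buf, size_t len,
                           uint64_t search_pc, Checkpoint *out)
{
    const uint8_t *p = buf, *end = buf + len;
    Checkpoint cur = { 0, 0, 0 };
    int found = 0;

    while (p < end) {
        uint64_t dh, dt, cc;

        p = decode_uleb128(p, &dh);
        p = decode_uleb128(p, &dt);
        p = decode_uleb128(p, &cc);
        assert(p <= end);           /* buffer must hold whole triples */
        if (cur.host_pc + dh > search_pc) {
            break;                  /* next checkpoint is past the fault */
        }
        cur.host_pc += dh;
        cur.target_pc += dt;
        cur.cc_op = cc;
        found = 1;
    }
    if (found) {
        *out = cur;
    }
    return found;
}
```

With deltas, a typical checkpoint costs three or four bytes instead of
three full-width array slots. If target PCs can go backwards within a
TB, a signed variant (SLEB128) would be needed for that field.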