On 2015-06-18 12:02, Paolo Bonzini wrote: > > > On 18/06/2015 11:42, Aurelien Jarno wrote: > >> > QEMU could just always compute and store the restore_state information. > >> > TCG needs to help filling it in (a new TCG opcode?), but it should be > >> > easy. > > Yes, that was another approach I have in mind (I called it exception > > table in my other mail), > > Okay, understood. My idea was more like always generating the gen_op_* > arrays. > > > but it requires a tiny more work than just > > saving the CPU state all the time. The problem is that the state > > information we want to save are varying for target to target. Going > > through a TCG opcode means we can use the liveness analysis pass to save > > the minimum amount of data. > > I mentioned a TCG opcode because the target PC is not available inside > the translator. So the translator could pepper the TCG instruction
Well it is available through s->gen_opc_pc, but that's not that clean. > stream with things like > > checkpoint $target_pc, $target_cc_op, $0 Yes, it's clearly better to add an explicit instruction for that. As said we can pass it through the liveness analysis. But it means that this instruction will have a variable number of arguments. > TCG can then use them to fill in an array stored inside the > TranslationBlock, together with the host PC. Since the gen_opc_pc, > gen_opc_instr_start, gen_opc_icount arrays are inside tcg_ctx, it may be > a good idea to store the checkpoint information compressed in a byte > array (e.g. as a series of ULEB128 values---the host and target PCs can > even be stored as deltas from the last value). Either as deltas to the last value or as delta from the start of the TB. What I am worried about is the size of the checkpoint information, even if we do some compression, we might have one per guest instruction. I have implemented a naive version of that without compression, storing the checkpoint data at the end of the generated code, and it's about 30% of the size of the TB for MIPS. It's probably smaller on architectures storing only the PC. Also it's size is quite variable. That's why it's probably not a good idea to store it directly in the TranslationBlock. I don't like storing it directly in the generated code either, especially given this part is supposed to be executable. > As a first step, gen_intermediate_code_pc and tcg_gen_code_search_pc can > then be merged into a single target-independent function that > uncompresses the byte array up to the required host PC into tcg_ctx. > Later you can optimize them to remove the tcg_ctx arrays altogether. > > So the patches could be something like this: > > 1) SPARC: put the jump target information directly in gen_opc_* without > using gen_opc_jump_pc (not trivial) > > 2) a few targets: instead of gen_opc_* arrays, use a new generic member > of tcg_ctx (similar to how csbase is used generically), e.g. > tcg_ctx.gen_opc_target1[] and tcg_ctx.gen_opc_target2[]. > > 3) all targets: always fill in tcg_ctx.gen_*, even if search_pc is false > > 4) TCG: add support for a checkpoint operation, make it fill in > tcg_ctx.gen_* > > 5) all targets: change explicit filling of tcg_ctx.gen_* to use the > checkpoint operation > > 6) TCG/translate-all: convert gen_intermediate_code_pc as outlined above That's sounds like a plan when I have more time ;-) Aurelien -- Aurelien Jarno GPG: 4096R/1DDD8C9B aurel...@aurel32.net http://www.aurel32.net