Ilya Leoshkevich <i...@linux.ibm.com> writes: > On Wed, 2025-01-15 at 16:08 -0800, Richard Henderson wrote: >> On 1/15/25 15:20, Ilya Leoshkevich wrote: >> > Currently single-insn TBs created from I/O memory are not added to >> > region_trees. Therefore, when they generate exceptions, they are >> > not >> > handled by cpu_restore_state_from_tb(). For x86 this is not a >> > problem, >> > because x86_restore_state_to_opc() only restores pc and cc, which >> > are >> > already correct. However, on several other architectures, >> > restore_state_to_opc() restores more registers, and guests can >> > notice >> > incorrect values. >> > >> > Fix by always calling tcg_tb_insert(). This may increase the size >> > of >> > region_trees, but tcg_region_reset_all() clears it once >> > code_gen_buffer >> > fills up, so it will not grow uncontrollably. >> > >> > Co-developed-by: Nina Schoetterl-Glausch <n...@linux.ibm.com> >> > Signed-off-by: Ilya Leoshkevich <i...@linux.ibm.com> >> > --- >> >> This needs something else. The reason why they're not insertted is >> that they're not valid >> for a second execution. We need to not find them in the search tree. > > I have the impression that code_gen_buffer is append-only, so after we > create a new TB for the second execution, the first TB should not > be deleted - is this correct? At least I haven't found code_gen_ptr > decrements, besides the rollback at the end of tb_gen_code(). Then, > since region_trees are indexed by code_gen_buffer pointers, and not > guest pointers, this should not introduce any stale entries.
We used to generate a temporary TB, execute it and then wind codeptr back. We simplified the code to generate the TB but then not add it to QHT - I think the original reasoning was saving on scarce CF_ flags and not over complicating the tb_gen_code function. See around 873d64ac30 (accel/tcg: re-factor non-RAM execution code). This does have the effect of potentially regenerating the same TB over and over again but usually there only a few insns executed out of IO space. However some arches do more of this than others. And now we have devices such as CXL where people do want to run large amounts of code directly out of io address space. This eventually exhausts the codegen buffer so we do a tb_flush() and start again. This is obviously sloooooow. > > While we might not need to find the ones created for the first > execution, we still need to find the ones for executions that fail - > and there is no way to tell in advance, which ones these are going to > be, so the idea here is to register all of them. > > Am I missing something? > >> r~ -- Alex Bennée Virtualisation Tech Lead @ Linaro