On 1/31/21 1:38 AM, Liren Wei wrote:
> However, similar to the situation described in:
> https://lists.nongnu.org/archive/html/qemu-devel/2018-02/msg02529.html
>
> When we have 2 vCPUs with one of them writing to the code page while
> the other just translated some code within that same page, the following
> situation might happen:
>
>    vCPU thread 1 - writing      vCPU thread 2 - translating
>    -----------------------      -----------------------
>    TLB check -> slow path
>      notdirty_write()
>        set dirty flag
>    write to RAM
>                                 tb_gen_code()
>                                   tb_page_add()
>                                     tlb_protect_code()
>
>    TLB check -> fast path
>                                       set TLB_NOTDIRTY
>    write to RAM
>                                 executing unmodified code for this time
>                                 and maybe also for the next time, never
>                                 re-translate modified TBs.
>
>
> My question is:
>   Should the situation described above be considered as a bug or,
>   an intended behavior for QEMU (, so it's the programmer's fault
>   for not flushing the icache after modifying shared code page)?

Yes, this is a bug, because we are trying to support e.g. x86, which does not
require an icache flush.

I think the page lock, the TLB_NOTDIRTY setting, and a possible sync on the
setting all need to happen before the bytes are read during translation.
Otherwise we don't catch the case above, nor do we catch

   CPU1                  CPU2
   ------------------    --------------------------
   TLB check -> fast
                         tb_gen_code() -> all of it
   write to ram

Also because of x86 (and other architectures in which a single instruction can
span a page boundary), I think this lock+set+sync sequence needs to happen on
demand in something called from the function set defined in
include/exec/translator.h.

That also means that any target/cpu/ which has not been converted to use that
interface remains broken, and should be converted or deprecated.

Are you planning to work on this?


r~
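
The ordering argument above can be illustrated with a small single-threaded toy model. This is only a sketch and not QEMU code: `struct page`, `guest_write()` and `translation_goes_stale()` are hypothetical names, the `notdirty` field merely stands in for TLB_NOTDIRTY, and the page lock is modelled as "a slow-path store waits until the translator unlocks". The interleaving is fixed so the result is deterministic: the other vCPU's store always arrives right after the translator has read the code byte.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only; every name here is hypothetical, none is a QEMU API. */
struct page {
    uint8_t code;       /* one byte of guest code */
    bool locked;        /* page lock held by the translating vCPU */
    bool notdirty;      /* stand-in for TLB_NOTDIRTY: stores take the slow path */
    bool tb_valid;      /* a translation of this page exists */
    uint8_t tb_source;  /* the byte the translation was generated from */
};

/*
 * Guest store. The fast path takes no lock and writes directly.
 * The slow path needs the page lock; if it is held, the store
 * "blocks" (returns false) and must be retried after unlock.
 */
static bool guest_write(struct page *p, uint8_t val)
{
    if (p->notdirty) {
        if (p->locked) {
            return false;       /* would wait for the translator */
        }
        p->tb_valid = false;    /* invalidate stale translations */
        p->notdirty = false;    /* page is dirty again */
    }
    p->code = val;
    return true;
}

/*
 * Run one translation with a concurrent store landing just after
 * the code byte has been read. Returns true if a valid translation
 * that no longer matches the page contents survives.
 */
static bool translation_goes_stale(bool protect_first)
{
    struct page p = { .code = 1 };
    bool stored;

    p.locked = true;
    if (protect_first) {
        p.notdirty = true;      /* lock + set before reading bytes */
    }
    p.tb_source = p.code;       /* translator reads the code byte */
    p.tb_valid = true;
    stored = guest_write(&p, 2);    /* other vCPU stores to the page */
    if (!protect_first) {
        p.notdirty = true;      /* protection arrives too late */
    }
    p.locked = false;
    if (!stored) {
        guest_write(&p, 2);     /* blocked store resumes, invalidates */
    }
    return p.tb_valid && p.tb_source != p.code;
}
```

In this model, `translation_goes_stale(false)` (read, then protect) leaves a stale translation behind, while `translation_goes_stale(true)` (protect, then read) forces the store onto the slow path, where it invalidates the translation once the lock is released.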