On Mon, Jan 10, 2011 at 03:07:52PM +0100, Alexander Graf wrote: > > On 10.01.2011, at 15:00, Aurelien Jarno wrote: > > > On Mon, Jan 10, 2011 at 01:13:25PM +0100, Alexander Graf wrote: > >> > >> On 06.01.2011, at 23:12, Aurelien Jarno wrote: > >> > >>> Hi, > >>> > >>> I have just sent a tcg/arm patch concerning code retranslation. You > >>> might want to look at the description (copied below), as from a first > >>> glance ppc, s390 and sparc TCG targets might be affected. If you see > >>> guest kernel panics, some segmentation fault of qemu or in the guest, > >>> strange behaviors, that happen randomly and that looks difficult to > >>> debug it might be the issue. > >>> > >>> Aurelien > >>> > >>> > >>> | QEMU uses code retranslation to restore the CPU state when an exception > >>> | happens. For it to work the retranslation must not modify the generated > >>> | code. This is what is currently implemented in ARM TCG. > >>> | > >>> | However on CPU that don't have icache/dcache/memory synchronised like > >>> | ARM, this requirement is stronger and code retranslation must not modify > >>> | the generated code "atomically", as the cache line might be flushed > >>> | at any moment (interrupt, exception, task switching), even if not > >>> | triggered by QEMU. The probability for this to happen is very low, and > >>> | depends on cache size and associativiy, machine load, interrupts, so the > >>> | symptoms are might happen randomly. > >>> | > >>> | This requirement is currently not followed in tcg/arm, for the > >>> | load/store code, which basically has the following structure: > >>> | 1) tlb access code is written > >>> | 2) conditional fast path code is written > >>> | 3) branch is written with a temporary target > >>> | 4) slow path code is written > >>> | 5) branch target is updated > >>> | The cache lines corresponding to the retranslated code is not flushed > >>> | after code retranslation as the generated code is supposed to be the > >>> | same. However if the cache line corresponding to the branch instruction > >>> | is flushed between step 3 and 5, and is not flushed again before the > >>> | code is executed again, the branch target is wrong. In the guest, the > >>> | symptoms are MMU page fault at a random addresses, which leads to > >>> | kernel page fault or segmentation faults. > >> > >> I don't see the problem. If you have separate icache from dcache, the code > >> in question doesn't get executed during the rewrite, so all should be > >> fine. If you have both combined, the data write should automatically > >> modify the cache line, so all is great too. > > > > As far as I understand, this is what happens to the branch target > > instruction, or at least one possibility: > > > > operation | icache | dcache | mem/L2 | > > ------------------------------------------+--------+--------+--------+ > > tlb access code is written | absent | absent | ok | > > conditional fast path code is written | absent | absent | ok | > > branch is written with a temporary target | absent | wrong | ok | > > cache line is flushed to memory | absent | absent | wrong | > > slow path code is written | absent | absent | wrong | > > branch target is updated | absent | ok | wrong | > > TB is re-executed | wrong | ok | wrong | > > > > Note that the issue is real, the patch really fixes the issue on ARM and > > MIPS. It's not impossible that I don't fully understand it given I can't > > analyse the cache values at a given time. However, when QEMU crashes > > because of that, we have seen that the executed code doesn't match what > > GDB says, and changing the temporary branch value also changes the way > > QEMU or the guest crash. > > > > The problem doesn't happens very often though (for sure less than 1 > > every few millions). The other way to fix it is to issue a cache flush > > after a code retranslation, however, it is something very costly on some > > architectures. > > I'd guess it only happens when code is overwritten. Only then icache can > still be stale in that code region. Can't we just invalidate the icache on > every tb flush? That should also fix the issue I'd guess.
That's a solution that works (tested). However making sure that the retranslation doesn't change the code looks better. > > > >> As long as the code in question doesn't get run in between, I don't see > >> where breakage could occur? > > > > The breakage can occur, because the same TB can be re-executed later, > > and we don't have an explicit cache flush after the retranslation. > > Oh, I think I'm starting to understand the issue. Have you seen this with > system or user emulation? I'd guess with user it makes sense: > > [T1] trap > [T2] run code > [T1] write branch code to find guest IP > [T2] run exactly that branch code > [T1] rewrite branch code > [T2] has invalid code in its cache > > If that's the case, a global lock on retranslation should do the trick, no? > I haven't seen that during user emulation, however your scenario, which is different is also possible. Note however that they are far less retranslation on linux-user code. Also far less code being translated. In any case, a global lock also doesn't seems to be the best solution, just making sure to not change the code during retranslation seems a better solution. > If it was during system emulation, I'm still puzzled :). > Yes, it was during system emulation. -- Aurelien Jarno GPG: 1024D/F1BCDB73 aurel...@aurel32.net http://www.aurel32.net