[Qemu-devel] Re: tcg/{ppc, s390, sparc}: branch target and code retranslation

Aurelien Jarno Mon, 10 Jan 2011 06:16:07 -0800

On Mon, Jan 10, 2011 at 03:07:52PM +0100, Alexander Graf wrote:
> 
> On 10.01.2011, at 15:00, Aurelien Jarno wrote:
> 
> > On Mon, Jan 10, 2011 at 01:13:25PM +0100, Alexander Graf wrote:
> >> 
> >> On 06.01.2011, at 23:12, Aurelien Jarno wrote:
> >> 
> >>> Hi,
> >>> 
> >>> I have just sent a tcg/arm patch concerning code retranslation. You
> >>> might want to look at the description (copied below), as from a first
> >>> glance ppc, s390 and sparc TCG targets might be affected. If you see
> >>> guest kernel panics, some segmentation fault of qemu or in the guest,
> >>> strange behaviors, that happen randomly and that looks difficult to
> >>> debug it might be the issue.
> >>> 
> >>> Aurelien
> >>> 
> >>> 
> >>> | QEMU uses code retranslation to restore the CPU state when an exception
> >>> | happens. For it to work the retranslation must not modify the generated
> >>> | code. This is what is currently implemented in ARM TCG.
> >>> |
> >>> | However on CPU that don't have icache/dcache/memory synchronised like
> >>> | ARM, this requirement is stronger and code retranslation must not modify
> >>> | the generated code "atomically", as the cache line might be flushed
> >>> | at any moment (interrupt, exception, task switching), even if not
> >>> | triggered by QEMU. The probability for this to happen is very low, and
> >>> | depends on cache size and associativiy, machine load, interrupts, so the
> >>> | symptoms are might happen randomly.
> >>> |
> >>> | This requirement is currently not followed in tcg/arm, for the
> >>> | load/store code, which basically has the following structure:
> >>> |   1) tlb access code is written
> >>> |   2) conditional fast path code is written
> >>> |   3) branch is written with a temporary target
> >>> |   4) slow path code is written
> >>> |   5) branch target is updated
> >>> | The cache lines corresponding to the retranslated code is not flushed
> >>> | after code retranslation as the generated code is supposed to be the
> >>> | same. However if the cache line corresponding to the branch instruction
> >>> | is flushed between step 3 and 5, and is not flushed again before the
> >>> | code is executed again, the branch target is wrong. In the guest, the
> >>> | symptoms are MMU page fault at a random addresses, which leads to
> >>> | kernel page fault or segmentation faults.
> >> 
> >> I don't see the problem. If you have separate icache from dcache, the code 
> >> in question doesn't get executed during the rewrite, so all should be 
> >> fine. If you have both combined, the data write should automatically 
> >> modify the cache line, so all is great too.
> > 
> > As far as I understand, this is what happens to the branch target
> > instruction, or at least one possibility:
> > 
> > operation                                 | icache | dcache | mem/L2 |
> > ------------------------------------------+--------+--------+--------+
> > tlb access code is written                | absent | absent |   ok   |
> > conditional fast path code is written     | absent | absent |   ok   |
> > branch is written with a temporary target | absent | wrong  |   ok   |
> > cache line is flushed to memory           | absent | absent |  wrong |
> > slow path code is written                 | absent | absent |  wrong |
> > branch target is updated                  | absent |   ok   |  wrong |
> > TB is re-executed                         | wrong  |   ok   |  wrong |
> > 
> > Note that the issue is real, the patch really fixes the issue on ARM and
> > MIPS. It's not impossible that I don't fully understand it given I can't
> > analyse the cache values at a given time. However, when QEMU crashes 
> > because of that, we have seen that the executed code doesn't match what
> > GDB says, and changing the temporary branch value also changes the way 
> > QEMU or the guest crash.
> > 
> > The problem doesn't happens very often though (for sure less than 1 
> > every few millions). The other way to fix it is to issue a cache flush
> > after a code retranslation, however, it is something very costly on some
> > architectures.
> 
> I'd guess it only happens when code is overwritten. Only then icache can 
> still be stale in that code region. Can't we just invalidate the icache on 
> every tb flush? That should also fix the issue I'd guess.


That's a solution that works (tested). However making sure that the
retranslation doesn't change the code looks better.

> > 
> >> As long as the code in question doesn't get run in between, I don't see 
> >> where breakage could occur?
> > 
> > The breakage can occur, because the same TB can be re-executed later,
> > and we don't have an explicit cache flush after the retranslation.
> 
> Oh, I think I'm starting to understand the issue. Have you seen this with 
> system or user emulation? I'd guess with user it makes sense:
> 
> [T1] trap
> [T2] run code
> [T1] write branch code to find guest IP
> [T2] run exactly that branch code
> [T1] rewrite branch code
> [T2] has invalid code in its cache
> 
> If that's the case, a global lock on retranslation should do the trick, no?
>

I haven't seen that during user emulation, however your scenario, which
is different is also possible. Note however that they are far less
retranslation on linux-user code. Also far less code being translated.

In any case, a global lock also doesn't seems to be the best solution,
just making sure to not change the code during retranslation seems a
better solution.

> If it was during system emulation, I'm still puzzled :).
> 

Yes, it was during system emulation.

-- 
Aurelien Jarno                          GPG: 1024D/F1BCDB73
[email protected]                 http://www.aurel32.net

[Qemu-devel] Re: tcg/{ppc, s390, sparc}: branch target and code retranslation

Reply via email to