On 02/24/2015 05:46 AM, Stefan Hajnoczi wrote:
> On Tue, Feb 24, 2015 at 11:35 AM, Stefan Hajnoczi <stefa...@gmail.com> wrote:
>> On Thu, Feb 19, 2015 at 08:24:19PM +0100, Radim Krčmář wrote:
>>> The Windows 8.0 driver has a particular behavior for a small time frame
>>> after it enables rx interrupts: the interrupt handler never clears
>>> E1000_ICR_RXT0. The handler does something like this:
>>>   set_imc(-1)              (1) disable all interrupts
>>>   val = read_icr()         (2) clear ICR
>>>   handled = magic(val)     (3) do nothing to E1000_ICR_RXT0
>>>   set_ics(val & ~handled)  (4) set unhandled interrupts back to ICR
>>>   set_ims(157)             (5) enable some interrupts
>>>
>>> so if we started with RXT0, then every time the handler re-enables e1000
>>> interrupts, it receives one. This likely wouldn't matter in real
>>> hardware, because it is slow enough to make some progress between
>>> interrupts, but KVM instantly interrupts it, and boot hangs.
>>> (If we have multiple VCPUs, the interrupt gets load-balanced and
>>> everything is fine.)
>>>
>>> I haven't found any problem in the earlier phase of initialization and
>>> Windows writes 0 to RADV and RDTR, so some workaround looks like the
>>> only way if we want to support win8.0 on uniprocessors. (I vote NO.)
>>>
>>> This workaround uses the fact that a constant is cleared from ICR and
>>> later set back to it. After detecting this situation, we reuse the
>>> mitigation framework to inject an interrupt 10 microseconds later.
>>> (It's not exactly 10 microseconds, to keep the existing logic intact.)
>>>
>>> The detection is done by checking at (1), (2), and (5). (2) and (5)
>>> require that the only bit in ICR is RXT0. We could also check at (4),
>>> and on writes to any other register, but it would most likely only add
>>> more useless code, because normal operations shouldn't behave like that
>>> anyway. (An OS that deliberately keeps bits in ICR to notify itself
>>> that there are more packets, or for more creative reasons, is nothing we
>>> should care about.)
>>>
>>> Signed-off-by: Radim Krčmář <rkrc...@redhat.com>
>>> ---
>>> The patch is still untested -- it only approximates the behavior of RHEL
>>> patches that worked, I'll try to get a reproducer ...
>>>
>>> hw/net/e1000.c | 29 ++++++++++++++++++++++-------
>>> 1 file changed, 22 insertions(+), 7 deletions(-)
>>
>> Hi Alex,
>> I've CCed you in case you have any advice regarding QEMU's e1000
>> emulation. It seems Windows 8 gets itself into a kind of interrupt
>> storm and a workaround in QEMU will be necessary.
>>
>> Any thoughts?
>
> Okay, I guess Alex has changed jobs since the email has bounced. Too
> bad, it was worth a shot.
>
> Regarding the workaround, I'm okay with it. It's a hack for sure but
> what other option do we have?
>

I wasn't able to reproduce this problem with upstream QEMU. According to
Radim, this bug requires very subtle timing during guest installation,
so my testing probably didn't hit the right timing. Additionally, our QE
confirmed that this patch fixed a Win8 installation issue that was seen
on in-house QEMU (e.g. qemu-kvm-rhev).

With that, I am OK with this patch. The only thing left is to fix the
compilation in this patch (as Radim pointed out). Anyway,
Reviewed-by: Wei Huang <w...@redhat.com>

Thanks,
-Wei

> Stefan
>
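
For anyone following the thread, the handler sequence (1)-(5) quoted above can be reproduced with a toy register model. This is only a sketch, not QEMU code: the class ToyE1000, the helper count_passes, and the handled_mask parameter (a stand-in for magic()) are made up for illustration. The only values taken from real sources are E1000_ICR_RXT0 = 0x80 and the mask 157 from the quoted handler. It assumes the interrupt line is asserted whenever ICR & IMS is non-zero and that the guest is re-interrupted immediately, as described for KVM with a single VCPU.

```python
# Toy model of the e1000 interrupt registers; NOT QEMU code.
E1000_ICR_RXT0 = 0x80  # receiver timer interrupt cause bit


class ToyE1000:
    """Hypothetical minimal device: only ICR and IMS are modelled."""

    def __init__(self):
        self.icr = 0  # interrupt cause register
        self.ims = 0  # interrupt mask (enabled causes)

    def irq_asserted(self):
        # Assumption: the line is raised whenever an enabled cause is set.
        return (self.icr & self.ims) != 0

    def set_imc(self, mask):
        self.ims &= ~mask & 0xFFFFFFFF  # clear the given mask bits

    def read_icr(self):
        val, self.icr = self.icr, 0  # reading ICR clears it
        return val

    def set_ics(self, val):
        self.icr |= val  # software sets causes

    def set_ims(self, mask):
        self.ims |= mask  # enable the given causes


def guest_handler(dev, handled_mask):
    """Steps (1)-(5) from the quoted description."""
    dev.set_imc(0xFFFFFFFF)       # (1) disable all interrupts
    val = dev.read_icr()          # (2) clear ICR
    handled = val & handled_mask  # (3) stand-in for magic(val)
    dev.set_ics(val & ~handled)   # (4) put unhandled causes back
    dev.set_ims(157)              # (5) enable some interrupts (incl. RXT0)


def count_passes(dev, handled_mask=0, limit=1000):
    """Run the handler while the IRQ stays asserted, up to `limit` times."""
    passes = 0
    while dev.irq_asserted() and passes < limit:
        guest_handler(dev, handled_mask)
        passes += 1
    return passes
```

With handled_mask=0 (the handler never clears RXT0) the loop only stops at the limit, which corresponds to the hang in the report. With handled_mask=E1000_ICR_RXT0 a single pass is enough. The workaround in the patch breaks the loop from the device side instead, by delaying the re-injected interrupt so the guest can make progress.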