On 29/11/2018 09:58, Paolo Bonzini wrote: > On 28/11/18 22:56, Guenter Roeck wrote: >> The guest OS reads RSTAT, RSEQ, and RINTR, and expects those registers >> to reflect a consistent state. However, it is possible that the registers >> can change after RSTAT was read, but before RINTR is read. >> >> Guest OS qemu >> -------- ---- >> Read RSTAT >> esp_command_complete() >> RSTAT = STAT_ST >> esp_dma_done() >> RSTAT |= STAT_TC >> RSEQ = 0 >> RINTR = INTR_BS >> >> Read RSEQ >> Read RINTR RINTR = 0 >> RSTAT &= ~STAT_TC >> RSEQ = SEQ_CD >> >> The guest OS would then try to handle INTR_BS combined with an old >> value of RSTAT. This sometimes resulted in lost events, spurious >> interrupts, guest OS confusion, and stalled SCSI operations. > > The question is, why was the guest running the interrupt routine before > STAT_INT was set in RSTAT? The code in esp_raise_irq seems good: > > if (!(s->rregs[ESP_RSTAT] & STAT_INT)) { > s->rregs[ESP_RSTAT] |= STAT_INT; > qemu_irq_raise(s->irq); > trace_esp_raise_irq(); > } > > Paolo
This patch is very interesting, as I have a long-running regression trying to boot NextSTEP 3.3 on qemu-system-sparc which I eventually bisected down to the commit that turned on iothread by default in QEMU. The symptom is that ESP SCSI requests hang/timeout before the kernel is able to get to the userspace installer: however if you launch QEMU with "taskset –cpu-list 1 qemu-system-sparc ..." then it works and you can complete the installation. So certainly this suggests that there is a race condition still present in ESP somewhere. I've given this patch a spin, and in a few quick tests here I was able to consistently get further in kernel boot, but it still doesn't completely solve issue for me :/ ATB, Mark.