On Mon, Mar 04, 2019 at 11:54:38PM -0800, Mike Larkin wrote: > On Mon, Mar 04, 2019 at 01:25:16PM +0200, Wictor Lund wrote: > > Hi misc@! > > > > I have figured out that it is possible to get vmd(8) into a state where > > 1) com1_dev.rcv_pending != 0 > > 2) there is data pending on com1_dev.fd > > 3) the guest doesn't seem to care > > > > This results in a locked up situation where com_rcv_event() is called on > > indefinitely. It seems to me that an interrupt is lost somewhere, leading > > to a situation where the guest OS is happily ignorant of the available data, > > while the vmm is waiting for the guest to eat it up. > > > > This has made it impossible to install Linux via the serial console on > > vmm(4). It seems that people previously have reported "freezing" problems > > in vmm(4) form time to time, but when reported no one else have been able to > > reproduce it. > > > > I have solved the problem for myself by changing com_rcv_event() to the > > following: > > > > static void > > com_rcv_event(int fd, short kind, void *arg) > > { > > mutex_lock(&com1_dev.mutex); > > > > /* > > * We already have other data pending to be received. The data that > > * has become available now will be moved to the com port later. > > */ > > if (com1_dev.rcv_pending) { > > /* If pending interrupt, inject */ > > if ((com1_dev.regs.iir & IIR_NOPEND) == 0) { > > utrace("comrcv injintr", &com1_dev.regs.lsr, > > sizeof(com1_dev.regs.lsr)); > > /* XXX: vcpu_id */ > > vcpu_assert_pic_irq((uintptr_t) arg, 0, > > com1_dev.irq); > > vcpu_deassert_pic_irq((uintptr_t) arg, 0, > > com1_dev.irq); > > } > > mutex_unlock(&com1_dev.mutex); > > return; > > } > > if (com1_dev.regs.lsr & LSR_RXRDY) > > com1_dev.rcv_pending = 1; > > else { > > com_rcv(&com1_dev, (uintptr_t) arg, 0); > > > > /* If pending interrupt, inject */ > > if ((com1_dev.regs.iir & IIR_NOPEND) == 0) { > > /* XXX: vcpu_id */ > > vcpu_assert_pic_irq((uintptr_t) arg, 0, > > com1_dev.irq); > > vcpu_deassert_pic_irq((uintptr_t) arg, 0, > > com1_dev.irq); > > } > > } > > > > mutex_unlock(&com1_dev.mutex); > > } > > > > However, I have little experience in the interrupt behaviour on x86. I'm > > also aware of that there has been an attempt to fix this behaviour [1]. > > > > I think the problem is that when com_rcv() is called from > > vcpu_process_com_data(), the interrupt is triggered using vcpu_exit_inout(), > > which was not touched in the previous attempt [1] to fix the "freezing" > > problem. vcpu_exit_inout() still uses a simple vcpu_assert_pic_irq() call > > to trigger the interrupt while for example com_rcv_event() uses the > > vcpu_assert_pic_irq(); vcpu_deassert_pic_irq() sequence to trigger it. > > > > With my modifications to com_rcv_event() I was able to install not only > > alpine linux, but even debian using the serial console. Without the > > modification I can't even install alpine linux via the serial console. > > > > Any thoughts on this? If people think my change is a sound one, I can make > > a proper patch for it. If people think the change is unsound, I would have > > to look into changing vcpu_exit_inout() and probably extend the interface to > > it to decide how the interrupt should be triggered. > > > > 1. https://marc.info/?l=openbsd-cvs&m=153115270302514&w=2 > > > > -- > > Wictor Lund > > > > Thanks Wictor! > > Can you make a proper diff and resend please?
The diff is appended below. I noticed that uint8_t elcr[2] is handled under pthread_mutex_t pic_mtx in i8259_deassert_irq() (which is called from vcpu_deassert_pic_irq()), while elcr is not handled under pic_mtx in pic_set_elcr() and vcpu_exit_elcr(). This may or may not be of importance because i8259_deassert_irq() is a no-op depending on the value of elcr. I could write a diff immediately that makes those functions take the mutex, but I haven't studied the locking strategies in vmd(8) well enough to know whether it may cause a deadlock. -- Wictor Lund diff --git a/usr.sbin/vmd/ns8250.c b/usr.sbin/vmd/ns8250.c index 6ed614b0a9f..3efbf357702 100644 --- a/usr.sbin/vmd/ns8250.c +++ b/usr.sbin/vmd/ns8250.c @@ -115,6 +115,12 @@ com_rcv_event(int fd, short kind, void *arg) * has become available now will be moved to the com port later. */ if (com1_dev.rcv_pending) { + /* If no interrupt is currently pending, inject one */ + if ((com1_dev.regs.iir & IIR_NOPEND) == 0) { + /* XXX: vcpu_id */ + vcpu_assert_pic_irq((uintptr_t) arg, 0, com1_dev.irq); + vcpu_deassert_pic_irq((uintptr_t) arg, 0, com1_dev.irq); + } mutex_unlock(&com1_dev.mutex); return; }