On 03/30/2014 06:25 PM, Andreas Färber wrote: > Hi, > > Am 30.03.2014 22:27, schrieb Cole Robinson: >> With git master, loadvm hits an assert failure if using ehci and usb tablet. >> Steps to reproduce: >> >> $ qemu-img create -f qcow2 foo.qcow2 10G >> $ ./x86_64-softmmu/qemu-system-x86_64 \ >> -enable-kvm -m 4096 \ >> -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x5.0x7 \ >> -device >> ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x5 >> \ >> -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x5.0x1 \ >> -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x5.0x2 \ >> -device usb-tablet,id=input0 \ >> -hda foo.qcow2 \ >> -cdrom Fedora-20-x86_64-Live-Desktop.iso \ >> -boot d -monitor stdio >> >> <wait until guest boot to 'welcome to fedora' window> >> (qemu) savevm foo >> (qemu) loadvm foo >> qemu-system-x86_64: hw/pci/pci.c:250: pcibus_reset: Assertion >> `bus->irq_count[i] == 0' failed. >> >> The relevant backtrace bits for the assertion: >> >> #4 0x00007f8f7241971e in pcibus_reset (qbus=0x7f8f74082fd0) >> at hw/pci/pci.c:250 >> #5 0x00007f8f723bd36d in qbus_reset_one (bus=0x7f8f74082fd0, >> opaque=<optimized out>) at hw/core/qdev.c:249 >> #6 0x00007f8f723bec88 in qdev_walk_children (dev=0x7f8f73efb320, >> pre_devfn=0x0, pre_busfn=0x0, post_devfn=0x7f8f723bf4f0 <qdev_reset_one>, >> post_busfn=0x7f8f723bd320 <qbus_reset_one>, opaque=0x0) >> at hw/core/qdev.c:403 >> #7 0x00007f8f723bedb8 in qbus_walk_children (bus=0x7f8f740706e0, >> pre_devfn=0x0, pre_busfn=0x0, post_devfn=0x7f8f723bf4f0 <qdev_reset_one>, >> post_busfn=0x7f8f723bd320 <qbus_reset_one>, opaque=0x0) >> at hw/core/qdev.c:369 >> #8 0x00007f8f724f5c5d in qemu_devices_reset () at vl.c:1867 >> #9 qemu_system_reset (report=report@entry=false) at vl.c:1880 >> #10 0x00007f8f7256dba2 in load_vmstate (name=name@entry=0x7f8f7417a160 "foo") >> at /home/crobinso/src/qemu/savevm.c:1098 >> >> The 'cause' is this: >> >> #0 ehci_detach (port=0x555556436968) at hw/usb/hcd-ehci.c:810 >> #1 0x0000555555727b5e in usb_detach (port=port@entry=0x555556436968) >> at hw/usb/core.c:49 >> #2 0x0000555555736bf3 in ehci_reset (opaque=0x5555564364d8) >> at hw/usb/hcd-ehci.c:941 >> #3 0x00005555557e1fcd in qemu_devices_reset () at vl.c:1867 >> #4 qemu_system_reset (report=report@entry=false) at vl.c:1880 >> #5 0x0000555555859f12 in load_vmstate (name=name@entry=0x555556458210 "foo") >> at /home/crobinso/src/qemu/savevm.c:1098 >> >> ehci_reset calls usb_detach which sets pcibus->irq_count[3] = 1. pcibus_reset >> runs and hits the assertion. But I don't understand this stuff enough to >> determine what's actually wrong here :) >> >> I bisected the issue to: >> >> commit 31b030d4abc5bea89c2b33b39d3b302836f6b6ee >> Author: Andreas Färber <afaer...@suse.de> >> Date: Wed Sep 4 01:29:02 2013 +0200 >> >> cputlb: Change tlb_flush_page() argument to CPUState >> >> Signed-off-by: Andreas Färber <afaer...@suse.de> >> >> ...and then I double checked it since that sounds unrelated. Same result. > > You are running into an unrelated migration bug: > http://git.qemu.org/?p=qemu.git;a=commit;h=c01a71c1a56fa27f43449ff59e5d03b2483658a2 > > Sorry about that. You'll need to patch -p1 the above commit on top of > each git-bisect commit to find the actual breakage if the above commit > is already bad (can't test right now). >
Indeed, that seemed to be messing up my search, thanks. So the real culprit is: commit 9bdbbfc3a04c28dc43af5afffb32066623cb0022 Author: Paolo Bonzini <pbonz...@redhat.com> Date: Fri Dec 6 17:54:25 2013 +0100 pci: clean up resetting of IRQs pci_device_reset will deassert the INTX pins, and this will make the irq_count array all-zeroes. Check that this is the case, and remove the existing loop which might even unsync irq_count and irq_state. Which is what adds the assert. Looking at pci_device_reset, there is an issue: dev->irq_state = 0; pci_update_irq_status(dev); pci_device_deassert_intx(dev); irq_state is cleared before pci_device_deassert_intx. But tries to clear all irqs via pci_irq_handler, but that function will exit without taking any action if the requested irq level matches what we already track in irq_state. Since irq_state is 0, pci_device_deassert_intx is basically a no-op. Any interrupts with level=1 will not be cleared, which is the case with the usb tablet after usb_detach. This fixes things for me, but I have no idea if it's the proper fix: diff --git a/hw/pci/pci.c b/hw/pci/pci.c index 8f722dd..1912dfb 100644 --- a/hw/pci/pci.c +++ b/hw/pci/pci.c @@ -189,9 +189,9 @@ static void pci_do_device_reset(PCIDevice *dev) { int r; + pci_device_deassert_intx(dev); dev->irq_state = 0; pci_update_irq_status(dev); - pci_device_deassert_intx(dev); /* Clear all writable bits */ pci_word_test_and_clear_mask(dev->config + PCI_COMMAND, pci_get_word(dev->wmask + PCI_COMMAND) | - Cole