>> If the PCI_EXP_LNKSTA_DLLLA capability is set by default, linux kernel will >> send >> PDC event to detect whether there is a device in pcie slot. If a device is >> pluged >> in the pcie-root-port at the same time, hot-plug device will send ABP + PDC >> events to the kernel. The VM kernel will wrongly unplug the device if two PDC >> events get too close. Thus we'd better set the PCI_EXP_LNKSTA_DLLLA >> capability only in hotplug callback. > >Could you please describe a reproducer in a bit more detail? > Step1: start a VM with qemu, the VM boots up within 500ms. /path/to/qemu-2.8.1/aarch64-softmmu/qemu-system-aarch64 \ -name test-c65961652639ccf9ce0b8476a325421811d4fdc873e90c27168497bc9e204776 \ -uuid a8ed4a86-3f49-45a3-a8ce-28d61b2f2914 \ -machine virt,usb=off,accel=kvm,gic-version=3 \ -cpu host \ -m 2048M,slots=10,maxmem=239477M \ -qmp unix:/var/run/qmp.sock,server,nowait \ -device pcie-root-port,port=0x8,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x1 \ -device pcie-root-port,port=0x9,chassis=2,id=pci.2,bus=pcie.0,addr=0x1.0x1 \ -device pcie-root-port,port=0xa,chassis=3,id=pci.3,bus=pcie.0,addr=0x1.0x2 \ -device pcie-root-port,port=0xb,chassis=4,id=pci.4,bus=pcie.0,addr=0x1.0x3 \ -device pcie-root-port,port=0xc,chassis=5,id=pci.5,bus=pcie.0,addr=0x1.0x4 \ -device pcie-root-port,port=0xd,chassis=6,id=pci.6,bus=pcie.0,addr=0x1.0x5 \ -device pcie-root-port,port=0xe,chassis=7,id=pci.7,bus=pcie.0,addr=0x1.0x6 \ -device pcie-pci-bridge,id=pci.8,bus=pci.1,addr=0x0 \ -device pcie-root-port,port=0xf,chassis=9,id=pci.9,bus=pcie.0,addr=0x1.0x7 \ ....... Step2: Immediately hotplug a pcie device: qmp_msg='{ "execute": "qmp_capabilities" } {"arguments":{"addr":"0x0","bus":"pci.4","driver":"virtio-net-pci","id":"virtio-e1356802-4b9f-44bb-b8f0-5f98bf765823","mac":"02:42:20:6e:a2:59"},"execute":"device_del"} {"arguments":{"id":"netport_test_1","ifname":"nfs_tap"},"execute":"netdev_del"}'
echo $qmp_msg | nc -U /var/run/qmp.sock Result expected: hotplug successful, the pcie device could be seen inside the VM Result in fact: we found a "hotplug" and "unplug" message inside the VM, it failed in hotplug. > >> >> By the way, we should clean up the PCI_EXP_LNKSTA_DLLLA capability during >> unplug to avoid VM restart or migration failure which will enter the same >> abnormal scenario as above. >> >> Signed-off-by: limingw...@huawei.com >> Signed-off-by: fangyi...@huawei.com >> Signed-off-by: oscar.zhan...@huawei.com > >So looking at linux I see: > > * pciehp_card_present_or_link_active() - whether given slot is occupied > * @ctrl: PCIe hotplug controller > * > * Unlike pciehp_card_present(), which determines presence solely from the > * Presence Detect State bit, this helper also returns true if the Link Active > * bit is set. This is a concession to broken hotplug ports which hardwire > * Presence Detect State to zero, such as Wilocity's [1ae9:0200]. > >so it looks like linux actually looks at presence detect state, >but we have a bug just like Wilocity's and keeping >link active up fixes that. Can't we fix the bug instead? > @wangxiongfeng >> --- >> hw/pci/pcie.c | 9 +++++---- >> 1 file changed, 5 insertions(+), 4 deletions(-) >> >> diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c >> index a6beb56..174f392 100644 >> --- a/hw/pci/pcie.c >> +++ b/hw/pci/pcie.c >> @@ -75,10 +75,6 @@ pcie_cap_v1_fill(PCIDevice *dev, uint8_t port, uint8_t >type, uint8_t version) >> >QEMU_PCI_EXP_LNKSTA_NLW(QEMU_PCI_EXP_LNK_X1) | >> >QEMU_PCI_EXP_LNKSTA_CLS(QEMU_PCI_EXP_LNK_2_5GT)); >> >> - if (dev->cap_present & QEMU_PCIE_LNKSTA_DLLLA) { >> - pci_word_test_and_set_mask(exp_cap + PCI_EXP_LNKSTA, >> - PCI_EXP_LNKSTA_DLLLA); >> - } >> >> /* We changed link status bits over time, and changing them across >> * migrations is generally fine as hardware changes them too. > >Does this actually change anything? > >I don't know why do we bother setting it here but we do >set it later in pcie_cap_slot_plug_cb, correct? > >I'd like to understand whether this is part of fix or >just a cleanup. > > >> @@ -484,6 +480,11 @@ void >pcie_cap_slot_unplug_request_cb(HotplugHandler *hotplug_dev, >> return; >> } >> >> + if (pci_dev->cap_present & QEMU_PCIE_LNKSTA_DLLLA) { >> + pci_word_test_and_clear_mask(exp_cap + PCI_EXP_LNKSTA, >> + PCI_EXP_LNKSTA_DLLLA); >> + } >> + >> pcie_cap_slot_push_attention_button(PCI_DEVICE(hotplug_dev)); >> } > >So this reports data link inactive immediately after >unplug request. Seems a bit questionable: guest did not >respond yet. I'd like to see a comment explaining why >this makes sense. > > >> -- >> 1.8.3.1