On Thu, Sep 14, 2017 at 10:59:05AM +0200, Igor Mammedov wrote: > On Thu, 14 Sep 2017 13:48:26 +0530 > Bharata B Rao <bhar...@linux.vnet.ibm.com> wrote: > > > On Thu, Sep 14, 2017 at 10:00:11AM +0200, Igor Mammedov wrote: > > > On Thu, 14 Sep 2017 12:31:18 +0530 > > > Bharata B Rao <bhar...@linux.vnet.ibm.com> wrote: > > > > > > > Hi, > > > > > > > > QEMU hits the below assert > > > > > > > > qemu-system-ppc64: used ring relocated for ring 2 > > > > qemu-system-ppc64: qemu/hw/virtio/vhost.c:649: vhost_commit: Assertion > > > > `r >= 0' failed. > > > > > > > > in the following scenario: > > > > > > > > 1. Boot guest with vhost=on > > > > -netdev > > > > tap,id=mynet0,script=qemu-ifup,downscript=qemu-ifdown,vhost=on -device > > > > virtio-net-pci,netdev=mynet0 > > > > 2. Hot add a DIMM device > > > > 3. Reboot > > > > When the guest reboots, we can see > > > > vhost_virtqueue_start:vq->used_phys getting assigned an address that > > > > falls in the hotplugged memory range. > > > > 4. Remove the DIMM device > > > > Guest refuses the removal as the hotplugged memory is under use. > > > > 5. Reboot > > > > > > > QEMU forces the removal of the DIMM device during reset and that's > > > > when we hit the above assert. > > > I don't recall implementing forced removal om DIMM, > > > could you point out to the related code, pls? > > > > This is ppc specific. We have DR Connector objects for each LMB (multiple > > LMBs make up one DIMM device) and during reset we invoke the > > release routine for these LMBs which will further invoke > > pc_dimm_memory_unplug(). > > > > See hw/ppc/spapr_drc.c: spapr_drc_reset() > > hw/ppc/spapr.c: spapr_lmb_release() > > > > > > > > > Any pointers on why we are hitting this assert ? Shouldn't vhost be > > > > done with using the hotplugged memory when we hit reset ? > > > > > > >From another point of view, > > > DIMM shouldn't be removed unless guest explicitly ejects it > > > (at least that should be so in x86 case). > > > > While that is true for ppc also, shouldn't we start fresh from reset ? > we should. > > when it aborts vhost should print out error from vhost_verify_ring_mappings() > > if (r == -ENOMEM) { > error_report("Unable to map %s for ring %d", part_name[j], i); > > } else if (r == -EBUSY) { > > error_report("%s relocated for ring %d", part_name[j], i); > > that might give a clue where that memory stuck in. > > Michael might point out where to start look at, but he's on vacation > so ...
Michael (or anyone else) - Any pointers on this problem ? Regards, Bharata.