On Tue, Jan 2, 2024 at 6:33 AM Peter Xu <pet...@redhat.com> wrote:
>
> Jason, Eugenio,
>
> Apologies for a late reply; just back from the long holiday.
>
> On Thu, Dec 21, 2023 at 09:20:40AM +0100, Eugenio Perez Martin wrote:
> > Si-Wei did the actual profiling as he is the one with the 128G guests,
> > but most of the time was spent in the memory pinning. Si-Wei, please
> > correct me if I'm wrong.
>
> IIUC we're talking about no-vIOMMU use case. The pinning should indeed
> take a lot of time if it's similar to what VFIO does.
>
> >
> > I didn't check VFIO, but I think it just maps at realize phase with
> > vfio_realize -> vfio_attach_device -> vfio_connect_container(). In
> > previous testings, this delayed the VM initialization by a lot, as
> > we're moving that 20s of blocking to every VM start.
> >
> > Investigating a way to do it only in the case of being the destination
> > of a live migration, I think the right place is .load_setup migration
> > handler. But I'm ok to move it for sure.
>
> If it's destined to map the 128G, it does sound sensible to me to do it
> when VM starts, rather than anytime afterwards.
>
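For context, the .load_setup hook mentioned above would be wired up more
or less like this. This is only a sketch: SaveVMHandlers and its
load_setup/load_cleanup members are the actual migration API (registered
with register_savevm_live() at device init), but the
vhost_vdpa_map/unmap_guest_memory helpers below are made up for
illustration, they are not the code in this series:

    #include "migration/register.h"  /* SaveVMHandlers */
    #include "hw/virtio/vhost-vdpa.h"

    /* Map (and therefore pin) all guest memory before the guest starts
     * the device, i.e. at the very beginning of incoming migration. */
    static int vhost_vdpa_load_setup(QEMUFile *f, void *opaque)
    {
        struct vhost_vdpa *v = opaque;

        return vhost_vdpa_map_guest_memory(v); /* illustrative helper */
    }

    /* Undo the pinning if the incoming migration fails or the guest
     * never starts the device. */
    static int vhost_vdpa_load_cleanup(void *opaque)
    {
        struct vhost_vdpa *v = opaque;

        return vhost_vdpa_unmap_guest_memory(v); /* illustrative helper */
    }

    static const SaveVMHandlers vdpa_savevm_handlers = {
        .load_setup   = vhost_vdpa_load_setup,
        .load_cleanup = vhost_vdpa_load_cleanup,
    };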
Just for completeness: it is not 100% certain that the guest driver will
start the device, although it is very likely.

> Could anyone help to explain what's the problem if vDPA maps 128G at VM
> init just like what VFIO does?

The main problem was the delay it added to VM start. In the master
branch, the pinning is done when the driver starts the device. Although
that path holds the BQL, the vCPUs only block when they themselves need
the BQL, so they can keep making progress while the host pins memory and
the impact is not so evident. Moving the pinning to initialization time
made it very noticeable, and, to make things worse, QEMU stopped
responding to QMP commands and the like. That's why it was done only
when the VM is the destination of a live migration.

However, we've added the memory map thread in this version, so this
might not be a problem anymore. We could move the spawn of the thread to
initialization time.

But how do we undo the pinning in case the guest never starts the
device? In this series it is done at the destination with
vhost_vdpa_load_cleanup (see the sketch in the PS below). Or is it ok to
just keep the memory mapped for as long as QEMU has the vDPA device?

Thanks!
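PS: for the "spawn the map thread at initialization" variant, I am
thinking of something along these lines. Again, just a sketch:
qemu_thread_create()/qemu_thread_join() are QEMU's real threading API,
while the map/unmap helpers and the map_thread field are made up for
illustration:

    #include "qemu/osdep.h"
    #include "qemu/thread.h"
    #include "hw/virtio/vhost-vdpa.h"

    /* Runs outside the BQL, so vCPUs and QMP stay responsive while the
     * host pins the guest memory. */
    static void *vhost_vdpa_map_thread(void *opaque)
    {
        struct vhost_vdpa *v = opaque;

        vhost_vdpa_map_guest_memory(v); /* illustrative helper */
        return NULL;
    }

    /* Called at device initialization instead of at driver start. */
    static void vhost_vdpa_spawn_map_thread(struct vhost_vdpa *v)
    {
        qemu_thread_create(&v->map_thread, "vdpa-map",
                           vhost_vdpa_map_thread, v,
                           QEMU_THREAD_JOINABLE);
    }

    /* If the guest never starts the device, join the thread and drop
     * the mappings so the pinned memory is released. */
    static void vhost_vdpa_undo_map(struct vhost_vdpa *v)
    {
        qemu_thread_join(&v->map_thread);
        vhost_vdpa_unmap_guest_memory(v); /* illustrative helper */
    }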