> -----Original Message-----
> From: Maxime Coquelin [mailto:maxime.coque...@redhat.com]
> Sent: Friday, December 8, 2017 4:36 PM
> To: Tan, Jianfeng; Victor Kaplansky; dev@dpdk.org; y...@fridaylinux.org; Bie, Tiwei
> Cc: sta...@dpdk.org; jfrei...@redhat.com
> Subject: Re: [PATCH] vhost_user: protect active rings from async ring changes
>
>
>
> On 12/08/2017 03:14 AM, Tan, Jianfeng wrote:
> >
> >
> >> -----Original Message-----
> >> From: Maxime Coquelin [mailto:maxime.coque...@redhat.com]
> >> Sent: Thursday, December 7, 2017 6:02 PM
> >> To: Tan, Jianfeng; Victor Kaplansky; dev@dpdk.org; y...@fridaylinux.org; Bie, Tiwei
> >> Cc: sta...@dpdk.org; jfrei...@redhat.com
> >> Subject: Re: [PATCH] vhost_user: protect active rings from async ring changes
> >>
> >>
> >>
> >> On 12/07/2017 10:33 AM, Tan, Jianfeng wrote:
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: Victor Kaplansky [mailto:vkapl...@redhat.com]
> >>>> Sent: Wednesday, December 6, 2017 9:56 PM
> >>>> To: dev@dpdk.org; y...@fridaylinux.org; Bie, Tiwei; Tan, Jianfeng; vkapl...@redhat.com
> >>>> Cc: sta...@dpdk.org; jfrei...@redhat.com; Maxime Coquelin
> >>>> Subject: [PATCH] vhost_user: protect active rings from async ring changes
> >>>>
> >>>> When performing live migration or memory hot-plugging,
> >>>> the changes to the device and vrings made by message handler
> >>>> done independently from vring usage by PMD threads.
> >>>>
> >>>> This causes for example segfauls during live-migration
> >>>
> >>> segfauls -> segfaults?
> >>>
> >>>> with MQ enable, but in general virtually any request
> >>>> sent by qemu changing the state of device can cause
> >>>> problems.
> >>>>
> >>>> These patches fixes all above issues by adding a spinlock
> >>>> to every vring and requiring message handler to start operation
> >>>> only after ensuring that all PMD threads related to the divece
> >>>
> >>> Another typo: divece.
> >>>
> >>>> are out of critical section accessing the vring data.
> >>>>
> >>>> Each vring has its own lock in order to not create contention
> >>>> between PMD threads of different vrings and to prevent
> >>>> performance degradation by scaling queue pair number.
> >>>
> >>> Also wonder how much overhead it brings.
> >>>
> >>> Instead of locking each vring, can we just wait a while (10us for
> >>> example) after calling the destroy_device() callback, so that every
> >>> PMD thread has enough time to skip out of the critical area?
> >>
> >> No, because we are not destroying the device when it is needed.
> >> Actually, once destroy_device() is called, it is likely that the
> >> application has taken care that the rings aren't being processed
> >> anymore before returning from the callback (this is at least the
> >> case with the Vhost PMD).
> >
> > OK, I did not put it the right way, as there are multiple cases
> > above: migration and memory hot plug. Let me try again:
> >
> > Whenever a vhost thread handles a message affecting PMD threads
> > (like SET_MEM_TABLE, GET_VRING_BASE, etc.), we can clear the dev
> > flag VIRTIO_DEV_RUNNING and wait for a while so that PMD threads
> > skip out of those critical areas. After message handling, set the
> > VIRTIO_DEV_RUNNING flag again.
>
> I think you mean clearing the vq's enabled flag, because PMD threads
> never check the VIRTIO_DEV_RUNNING flag.
Ah, yes.

>
> > I suppose it can work, based on the assumption that PMD threads work
> > in polling mode and can skip the critical area quickly and inevitably.
>
> That sounds very fragile, because if the CPU isn't perfectly isolated,
> your PMD thread can be preempted, for interrupt handling for example.
>
> Or what if for some reason the PMD thread's CPU stalls for a short while?
>
> The latter is unlikely, but if it happens, it will be hard to debug.
>
> Let's see first the performance impact of using the spinlock. It might
> not be that important, because 99.9999% of the time it will not even
> spin.

Fair enough.

Thanks,
Jianfeng

>
> Thanks,
> Maxime
>
> >>
> >> The reason we need the lock is to protect PMD threads from the
> >> handling of some vhost-user protocol requests.
> >> For example SET_MEM_TABLE in the case of memory hotplug, or
> >> SET_LOG_BASE in case of multiqueue, which is sent for every
> >> queue pair and results in unmapping/remapping the logging area.
> >
> > Yes, I understand how it fails.
> >
> > Thanks,
> > Jianfeng
> >
> >>
> >> Maxime
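
For reference, below is a minimal sketch of the per-vring locking scheme
discussed above. It is illustrative only, not the actual patch: the names
(vring_state, access_lock, pmd_process_ring(), handle_state_changing_msg())
are hypothetical stand-ins for the real vhost structures and handlers, and
plain pthread spinlocks are used instead of DPDK's rte_spinlock so the
sketch is self-contained. The point it shows is that the datapath takes
only its own ring's lock (uncontended in the common case), while the
message handler takes every ring's lock before touching shared state.

/* Illustrative sketch only; not the actual DPDK implementation. */
#include <pthread.h>
#include <stdbool.h>

#define MAX_VRINGS 8

struct vring_state {
	/* One lock per vring: no contention between PMD threads
	 * serving different rings. */
	pthread_spinlock_t access_lock;
	bool enabled;
};

struct device_state {
	struct vring_state vring[MAX_VRINGS];
	unsigned int nr_vrings;
};

static void
device_init(struct device_state *dev, unsigned int nr_vrings)
{
	unsigned int i;

	dev->nr_vrings = nr_vrings;
	for (i = 0; i < nr_vrings; i++) {
		pthread_spin_init(&dev->vring[i].access_lock,
				  PTHREAD_PROCESS_PRIVATE);
		dev->vring[i].enabled = true;
	}
}

/* Datapath side: a PMD thread locks only its own ring, so in the
 * common case this is an uncontended lock/unlock pair. */
static void
pmd_process_ring(struct vring_state *vr)
{
	pthread_spin_lock(&vr->access_lock);
	if (vr->enabled) {
		/* enqueue/dequeue on the vring would go here */
	}
	pthread_spin_unlock(&vr->access_lock);
}

/* Control side: before handling a request that changes device state
 * (e.g. SET_MEM_TABLE on memory hot-plug), take every ring's lock so
 * no PMD thread can be inside its critical section while mappings or
 * ring addresses change. */
static void
handle_state_changing_msg(struct device_state *dev)
{
	unsigned int i;

	for (i = 0; i < dev->nr_vrings; i++)
		pthread_spin_lock(&dev->vring[i].access_lock);

	/* remap guest memory / logging area, update ring addresses, ... */

	for (i = 0; i < dev->nr_vrings; i++)
		pthread_spin_unlock(&dev->vring[i].access_lock);
}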