Re: [PATCH 1/1] vduse: moving kvfree into caller
On Mon, Dec 6, 2021 at 3:54 PM Guanjun wrote: > > From: Guanjun > > This free action should be moved into caller 'vduse_ioctl' in > concert with the allocation. > > No functional change. > > Fixes: c8a6153b6c59 ("vduse: Introduce VDUSE - vDPA Device in Userspace") Does this fix a real problem? If not, let's try not using fixes tags here. Thanks > Signed-off-by: Guanjun > --- > drivers/vdpa/vdpa_user/vduse_dev.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c > b/drivers/vdpa/vdpa_user/vduse_dev.c > index c9204c62f339..477a5a592002 100644 > --- a/drivers/vdpa/vdpa_user/vduse_dev.c > +++ b/drivers/vdpa/vdpa_user/vduse_dev.c > @@ -1355,7 +1355,6 @@ static int vduse_create_dev(struct vduse_dev_config > *config, > err_str: > vduse_dev_destroy(dev); > err: > - kvfree(config_buf); > return ret; > } > > @@ -1406,6 +1405,8 @@ static long vduse_ioctl(struct file *file, unsigned int > cmd, > } > config.name[VDUSE_NAME_MAX - 1] = '\0'; > ret = vduse_create_dev(&config, buf, control->api_version); > + if (ret) > + kvfree(buf); > break; > } > case VDUSE_DESTROY_DEV: { > -- > 2.27.0 > ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH] eni_vdpa: alibaba: select VIRTIO_PCI_LIB
On Mon, Dec 6, 2021 at 4:14 PM Arnd Bergmann wrote: > > On Mon, Dec 6, 2021 at 4:12 AM Jason Wang wrote: > > > > On Sat, Dec 4, 2021 at 2:55 AM Arnd Bergmann wrote: > > > > > > From: Arnd Bergmann > > > > > > When VIRTIO_PCI_LIB is not built-in but the alibaba driver is, the > > > kernel runs into a link error: > > > > > > x86_64-linux-ld: drivers/vdpa/alibaba/eni_vdpa.o: in function > > > `eni_vdpa_set_features': > > > eni_vdpa.c:(.text+0x23f): undefined reference to `vp_legacy_set_features' > > > x86_64-linux-ld: drivers/vdpa/alibaba/eni_vdpa.o: in function > > > `eni_vdpa_set_vq_state': > > > eni_vdpa.c:(.text+0x2fe): undefined reference to > > > `vp_legacy_get_queue_enable' > > > x86_64-linux-ld: drivers/vdpa/alibaba/eni_vdpa.o: in function > > > `eni_vdpa_set_vq_address': > > > eni_vdpa.c:(.text+0x376): undefined reference to > > > `vp_legacy_set_queue_address' > > > x86_64-linux-ld: drivers/vdpa/alibaba/eni_vdpa.o: in function > > > `eni_vdpa_set_vq_ready': > > > eni_vdpa.c:(.text+0x3b4): undefined reference to > > > `vp_legacy_set_queue_address' > > > x86_64-linux-ld: drivers/vdpa/alibaba/eni_vdpa.o: in function > > > `eni_vdpa_free_irq': > > > eni_vdpa.c:(.text+0x460): undefined reference to `vp_legacy_queue_vector' > > > x86_64-linux-ld: eni_vdpa.c:(.text+0x4b7): undefined reference to > > > `vp_legacy_config_vector' > > > x86_64-linux-ld: drivers/vdpa/alibaba/eni_vdpa.o: in function > > > `eni_vdpa_reset': > > > > Intersting, all those belongs to the legacy library. > > > > And I just have a try and I can complie alibaba eni without > > VIRTIO_PCI_LIB is set. > > Ah, so the problem is in drivers/Makefile: > > obj-$(CONFIG_VIRTIO)+= virtio/ > obj-$(CONFIG_VIRTIO_PCI_LIB)+= virtio/ > > We only enter this directory when one of these two symbols is set, but > in my randconfig > build, neither one is. I'll send a new patch. Yes, we need to include when VIRTIO_PCI_LIB_LEGACY is set. Thanks > > Arnd > ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH] drm: Return error codes from struct drm_driver.gem_create_object
On Tue, Nov 30, 2021 at 10:52:55AM +0100, Thomas Zimmermann wrote: > GEM helper libraries use struct drm_driver.gem_create_object to let > drivers override GEM object allocation. On failure, the call returns > NULL. > > Change the semantics to make the calls return a pointer-encoded error. > This aligns the callback with its callers. Fixes the ingenic driver, > which already returns an error pointer. > > Also update the callers to handle the involved types more strictly. > > Signed-off-by: Thomas Zimmermann > --- > There is an alternative patch at [1] that updates the value returned > by ingenics' gem_create_object to NULL. Fixing the interface to return > an errno code is more consistent with the rest of the GEM functions. > > [1] https://lore.kernel.org/dri-devel/2028111522.GD1147@kili/ My fix was already applied and backported to -stable etc... Your patch is not developed against a current tree so you broke it. That's the tricky thing with changing the API because say people wrote their code last week where returning NULL was correct. When they submit their driver upstream, everything will merge and build but it will break at runtime. For now, it's only vc4_create_object() which is broken. regards, dan carpenter ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH] drm: Return error codes from struct drm_driver.gem_create_object
Hi Am 06.12.21 um 11:42 schrieb Dan Carpenter: On Tue, Nov 30, 2021 at 10:52:55AM +0100, Thomas Zimmermann wrote: GEM helper libraries use struct drm_driver.gem_create_object to let drivers override GEM object allocation. On failure, the call returns NULL. Change the semantics to make the calls return a pointer-encoded error. This aligns the callback with its callers. Fixes the ingenic driver, which already returns an error pointer. Also update the callers to handle the involved types more strictly. Signed-off-by: Thomas Zimmermann --- There is an alternative patch at [1] that updates the value returned by ingenics' gem_create_object to NULL. Fixing the interface to return an errno code is more consistent with the rest of the GEM functions. [1] https://lore.kernel.org/dri-devel/2028111522.GD1147@kili/ My fix was already applied and backported to -stable etc... Your patch is not developed against a current tree so you broke it. Do you have a specific link? I just checked the stable tree at [1] and there no trace of your patch. Patches for DRM should go through through DRM trees; drm-misc-fixes in this case. Exceptions should at least be announce on dri-devel. Neither is the case here. Best regards Thomas [1] https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/gpu/drm/ingenic/ingenic-drm-drv.c That's the tricky thing with changing the API because say people wrote their code last week where returning NULL was correct. When they submit their driver upstream, everything will merge and build but it will break at runtime. For now, it's only vc4_create_object() which is broken. regards, dan carpenter -- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Maxfeldstr. 5, 90409 Nürnberg, Germany (HRB 36809, AG Nürnberg) Geschäftsführer: Ivo Totev OpenPGP_signature Description: OpenPGP digital signature ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v3 0/5] iommu/virtio: Add identity domains
On Wed, Dec 01, 2021 at 05:33:20PM +, Jean-Philippe Brucker wrote: > Jean-Philippe Brucker (5): > iommu/virtio: Add definitions for VIRTIO_IOMMU_F_BYPASS_CONFIG > iommu/virtio: Support bypass domains > iommu/virtio: Sort reserved regions > iommu/virtio: Pass end address to viommu_add_mapping() > iommu/virtio: Support identity-mapped domains Applied, thanks. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH] drm: Return error codes from struct drm_driver.gem_create_object
On Mon, Dec 06, 2021 at 12:16:24PM +0100, Thomas Zimmermann wrote: > Hi > > Am 06.12.21 um 11:42 schrieb Dan Carpenter: > > On Tue, Nov 30, 2021 at 10:52:55AM +0100, Thomas Zimmermann wrote: > > > GEM helper libraries use struct drm_driver.gem_create_object to let > > > drivers override GEM object allocation. On failure, the call returns > > > NULL. > > > > > > Change the semantics to make the calls return a pointer-encoded error. > > > This aligns the callback with its callers. Fixes the ingenic driver, > > > which already returns an error pointer. > > > > > > Also update the callers to handle the involved types more strictly. > > > > > > Signed-off-by: Thomas Zimmermann > > > --- > > > There is an alternative patch at [1] that updates the value returned > > > by ingenics' gem_create_object to NULL. Fixing the interface to return > > > an errno code is more consistent with the rest of the GEM functions. > > > > > > [1] https://lore.kernel.org/dri-devel/2028111522.GD1147@kili/ > > > > My fix was already applied and backported to -stable etc... Your > > patch is not developed against a current tree so you broke it. > > Do you have a specific link? I just checked the stable tree at [1] and there > no trace of your patch. It's in 5.15.6 and probably all the other supported -stable trees. https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/gpu/drm/vc4/vc4_bo.c?h=v5.15.6#n387 > > Patches for DRM should go through through DRM trees; drm-misc-fixes in this > case. Exceptions should at least be announce on dri-devel. Neither is the > case here. Yeah. That's a good question. I don't know, because I just work against linux-next... regards, dan carpenter ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
[PATCH 0/7] vhost flush cleanups
The following patches are Andrey Ryabinin's flush cleanups and some from me. They reduce the number of flush calls and remove some bogus ones where we don't even have a worker running anymore. I wanted to send these patches now, because my vhost threading patches have conflicts and are now built over them, and they seem like nice cleanups in general. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
[PATCH 3/7] vhost_net: get rid of vhost_net_flush_vq() and extra flush calls
From: Andrey Ryabinin vhost_net_flush_vq() calls vhost_work_dev_flush() twice passing vhost_dev pointer obtained via 'n->poll[index].dev' and 'n->vqs[index].vq.poll.dev'. This is actually the same pointer, initialized in vhost_net_open()/vhost_dev_init()/vhost_poll_init() Remove vhost_net_flush_vq() and call vhost_work_dev_flush() directly. Do the flushes only once instead of several flush calls in a row which seems rather useless. Signed-off-by: Andrey Ryabinin [drop vhost_dev forward declaration in vhost.h] Signed-off-by: Mike Christie --- drivers/vhost/net.c | 11 ++- 1 file changed, 2 insertions(+), 9 deletions(-) diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c index 11221f6d11b8..b1feb5e0571e 100644 --- a/drivers/vhost/net.c +++ b/drivers/vhost/net.c @@ -1373,16 +1373,9 @@ static void vhost_net_stop(struct vhost_net *n, struct socket **tx_sock, *rx_sock = vhost_net_stop_vq(n, &n->vqs[VHOST_NET_VQ_RX].vq); } -static void vhost_net_flush_vq(struct vhost_net *n, int index) -{ - vhost_work_dev_flush(n->poll[index].dev); - vhost_work_dev_flush(n->vqs[index].vq.poll.dev); -} - static void vhost_net_flush(struct vhost_net *n) { - vhost_net_flush_vq(n, VHOST_NET_VQ_TX); - vhost_net_flush_vq(n, VHOST_NET_VQ_RX); + vhost_work_dev_flush(&n->dev); if (n->vqs[VHOST_NET_VQ_TX].ubufs) { mutex_lock(&n->vqs[VHOST_NET_VQ_TX].vq.mutex); n->tx_flush = true; @@ -1572,7 +1565,7 @@ static long vhost_net_set_backend(struct vhost_net *n, unsigned index, int fd) } if (oldsock) { - vhost_net_flush_vq(n, index); + vhost_work_dev_flush(&n->dev); sockfd_put(oldsock); } -- 2.25.1 ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
[PATCH 1/7] vhost: get rid of vhost_poll_flush() wrapper
vhost_poll_flush() is a simple wrapper around vhost_work_dev_flush(). It gives wrong impression that we are doing some work over vhost_poll, while in fact it flushes vhost_poll->dev. It only complicate understanding of the code and leads to mistakes like flushing the same vhost_dev several times in a row. Just remove vhost_poll_flush() and call vhost_work_dev_flush() directly. Signed-off-by: Andrey Ryabinin [merge vhost_poll_flush removal from Stefano Garzarella] Signed-off-by: Mike Christie --- drivers/vhost/net.c | 4 ++-- drivers/vhost/test.c | 2 +- drivers/vhost/vhost.c | 12 ++-- drivers/vhost/vhost.h | 1 - drivers/vhost/vsock.c | 2 +- 5 files changed, 6 insertions(+), 15 deletions(-) diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c index 28ef323882fb..11221f6d11b8 100644 --- a/drivers/vhost/net.c +++ b/drivers/vhost/net.c @@ -1375,8 +1375,8 @@ static void vhost_net_stop(struct vhost_net *n, struct socket **tx_sock, static void vhost_net_flush_vq(struct vhost_net *n, int index) { - vhost_poll_flush(n->poll + index); - vhost_poll_flush(&n->vqs[index].vq.poll); + vhost_work_dev_flush(n->poll[index].dev); + vhost_work_dev_flush(n->vqs[index].vq.poll.dev); } static void vhost_net_flush(struct vhost_net *n) diff --git a/drivers/vhost/test.c b/drivers/vhost/test.c index a09dedc79f68..1a8ab1d8cb1c 100644 --- a/drivers/vhost/test.c +++ b/drivers/vhost/test.c @@ -146,7 +146,7 @@ static void vhost_test_stop(struct vhost_test *n, void **privatep) static void vhost_test_flush_vq(struct vhost_test *n, int index) { - vhost_poll_flush(&n->vqs[index].poll); + vhost_work_dev_flush(n->vqs[index].poll.dev); } static void vhost_test_flush(struct vhost_test *n) diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c index 8cf259d798c0..7346fa519eb6 100644 --- a/drivers/vhost/vhost.c +++ b/drivers/vhost/vhost.c @@ -244,14 +244,6 @@ void vhost_work_dev_flush(struct vhost_dev *dev) } EXPORT_SYMBOL_GPL(vhost_work_dev_flush); -/* Flush any work that has been scheduled. When calling this, don't hold any - * locks that are also used by the callback. */ -void vhost_poll_flush(struct vhost_poll *poll) -{ - vhost_work_dev_flush(poll->dev); -} -EXPORT_SYMBOL_GPL(vhost_poll_flush); - void vhost_work_queue(struct vhost_dev *dev, struct vhost_work *work) { if (!dev->worker) @@ -677,7 +669,7 @@ void vhost_dev_stop(struct vhost_dev *dev) for (i = 0; i < dev->nvqs; ++i) { if (dev->vqs[i]->kick && dev->vqs[i]->handle_kick) { vhost_poll_stop(&dev->vqs[i]->poll); - vhost_poll_flush(&dev->vqs[i]->poll); + vhost_work_dev_flush(dev->vqs[i]->poll.dev); } } } @@ -1721,7 +1713,7 @@ long vhost_vring_ioctl(struct vhost_dev *d, unsigned int ioctl, void __user *arg mutex_unlock(&vq->mutex); if (pollstop && vq->handle_kick) - vhost_poll_flush(&vq->poll); + vhost_work_dev_flush(vq->poll.dev); return r; } EXPORT_SYMBOL_GPL(vhost_vring_ioctl); diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h index 09748694cb66..67b23e178812 100644 --- a/drivers/vhost/vhost.h +++ b/drivers/vhost/vhost.h @@ -56,7 +56,6 @@ void vhost_poll_init(struct vhost_poll *poll, vhost_work_fn_t fn, __poll_t mask, struct vhost_dev *dev); int vhost_poll_start(struct vhost_poll *poll, struct file *file); void vhost_poll_stop(struct vhost_poll *poll); -void vhost_poll_flush(struct vhost_poll *poll); void vhost_poll_queue(struct vhost_poll *poll); void vhost_work_dev_flush(struct vhost_dev *dev); diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c index d6ca1c7ad513..2339587bcd31 100644 --- a/drivers/vhost/vsock.c +++ b/drivers/vhost/vsock.c @@ -707,7 +707,7 @@ static void vhost_vsock_flush(struct vhost_vsock *vsock) for (i = 0; i < ARRAY_SIZE(vsock->vqs); i++) if (vsock->vqs[i].handle_kick) - vhost_poll_flush(&vsock->vqs[i].poll); + vhost_work_dev_flush(vsock->vqs[i].poll.dev); vhost_work_dev_flush(&vsock->dev); } -- 2.25.1 ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
[PATCH 5/7] vhost_vsock: simplify vhost_vsock_flush()
From: Andrey Ryabinin vhost_vsock_flush() calls vhost_work_dev_flush(vsock->vqs[i].poll.dev) before vhost_work_dev_flush(&vsock->dev). This seems pointless as vsock->vqs[i].poll.dev is the same as &vsock->dev and several flushes in a row doesn't do anything useful, one is just enough. Signed-off-by: Andrey Ryabinin Reviewed-by: Stefano Garzarella Signed-off-by: Mike Christie --- drivers/vhost/vsock.c | 5 - 1 file changed, 5 deletions(-) diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c index 2339587bcd31..1f38160b249d 100644 --- a/drivers/vhost/vsock.c +++ b/drivers/vhost/vsock.c @@ -703,11 +703,6 @@ static int vhost_vsock_dev_open(struct inode *inode, struct file *file) static void vhost_vsock_flush(struct vhost_vsock *vsock) { - int i; - - for (i = 0; i < ARRAY_SIZE(vsock->vqs); i++) - if (vsock->vqs[i].handle_kick) - vhost_work_dev_flush(vsock->vqs[i].poll.dev); vhost_work_dev_flush(&vsock->dev); } -- 2.25.1 ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
[PATCH 2/7] vhost: flush dev once during vhost_dev_stop
When vhost_work_dev_flush returns all work queued at that time will have completed. There is then no need to flush after every vhost_poll_stop call, and we can move the flush call to after the loop that stops the pollers. Signed-off-by: Mike Christie --- drivers/vhost/vhost.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c index 7346fa519eb6..17e5956e7424 100644 --- a/drivers/vhost/vhost.c +++ b/drivers/vhost/vhost.c @@ -667,11 +667,11 @@ void vhost_dev_stop(struct vhost_dev *dev) int i; for (i = 0; i < dev->nvqs; ++i) { - if (dev->vqs[i]->kick && dev->vqs[i]->handle_kick) { + if (dev->vqs[i]->kick && dev->vqs[i]->handle_kick) vhost_poll_stop(&dev->vqs[i]->poll); - vhost_work_dev_flush(dev->vqs[i]->poll.dev); - } } + + vhost_work_dev_flush(dev); } EXPORT_SYMBOL_GPL(vhost_dev_stop); -- 2.25.1 ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
[PATCH 4/7] vhost_test: remove vhost_test_flush_vq()
From: Andrey Ryabinin vhost_test_flush_vq() just a simple wrapper around vhost_work_dev_flush() which seems have no value. It's just easier to call vhost_work_dev_flush() directly. Besides there is no point in obtaining vhost_dev pointer via 'n->vqs[index].poll.dev' while we can just use &n->dev. It's the same pointers, see vhost_test_open()/vhost_dev_init(). Signed-off-by: Andrey Ryabinin Signed-off-by: Mike Christie --- drivers/vhost/test.c | 11 +++ 1 file changed, 3 insertions(+), 8 deletions(-) diff --git a/drivers/vhost/test.c b/drivers/vhost/test.c index 1a8ab1d8cb1c..d4f63068d762 100644 --- a/drivers/vhost/test.c +++ b/drivers/vhost/test.c @@ -144,14 +144,9 @@ static void vhost_test_stop(struct vhost_test *n, void **privatep) *privatep = vhost_test_stop_vq(n, n->vqs + VHOST_TEST_VQ); } -static void vhost_test_flush_vq(struct vhost_test *n, int index) -{ - vhost_work_dev_flush(n->vqs[index].poll.dev); -} - static void vhost_test_flush(struct vhost_test *n) { - vhost_test_flush_vq(n, VHOST_TEST_VQ); + vhost_work_dev_flush(&n->dev); } static int vhost_test_release(struct inode *inode, struct file *f) @@ -209,7 +204,7 @@ static long vhost_test_run(struct vhost_test *n, int test) goto err; if (oldpriv) { - vhost_test_flush_vq(n, index); + vhost_test_flush(n, index); } } @@ -302,7 +297,7 @@ static long vhost_test_set_backend(struct vhost_test *n, unsigned index, int fd) mutex_unlock(&vq->mutex); if (enable) { - vhost_test_flush_vq(n, index); + vhost_test_flush(n); } mutex_unlock(&n->dev.mutex); -- 2.25.1 ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
[PATCH 6/7] vhost-scsi: drop flush after vhost_dev_cleanup
The flush after vhost_dev_cleanup is not needed because: 1. It doesn't do anything. vhost_dev_cleanup will stop the worker thread so the flush call will just return since the worker has not device. 2. It's not needed for the re-queue case. vhost_scsi_evt_handle_kick grabs the mutex and if the backend is NULL will return without queueing a work. vhost_scsi_clear_endpoint will set the backend to NULL under the vq->mutex then drops the mutex and does a flush. So we know when vhost_scsi_clear_endpoint has dropped the mutex after clearing the backend no evt related work will be able to requeue. The flush would then make sure any queued evts are run and return. Signed-off-by: Mike Christie --- drivers/vhost/scsi.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c index 532e204f2b1b..94535c813ef7 100644 --- a/drivers/vhost/scsi.c +++ b/drivers/vhost/scsi.c @@ -1827,8 +1827,6 @@ static int vhost_scsi_release(struct inode *inode, struct file *f) vhost_scsi_clear_endpoint(vs, &t); vhost_dev_stop(&vs->dev); vhost_dev_cleanup(&vs->dev); - /* Jobs can re-queue themselves in evt kick handler. Do extra flush. */ - vhost_scsi_flush(vs); kfree(vs->dev.vqs); kvfree(vs); return 0; -- 2.25.1 ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
[PATCH 7/7] vhost-test: drop flush after vhost_dev_cleanup
The flush after vhost_dev_cleanup is not needed because: 1. It doesn't do anything. vhost_dev_cleanup will stop the worker thread so the flush call will just return since the worker has not device. 2. It's not needed. The comment about jobs re-queueing themselves does not look correct because handle_vq does not requeue work. Signed-off-by: Mike Christie --- drivers/vhost/test.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/drivers/vhost/test.c b/drivers/vhost/test.c index d4f63068d762..57e24eceff27 100644 --- a/drivers/vhost/test.c +++ b/drivers/vhost/test.c @@ -158,9 +158,6 @@ static int vhost_test_release(struct inode *inode, struct file *f) vhost_test_flush(n); vhost_dev_stop(&n->dev); vhost_dev_cleanup(&n->dev); - /* We do an extra flush before freeing memory, -* since jobs can re-queue themselves. */ - vhost_test_flush(n); kfree(n); return 0; } -- 2.25.1 ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
[PATCH V5 01/12] vhost: add vhost_worker pointer to vhost_virtqueue
This patchset allows userspace to map vqs to different workers. This patch adds a worker pointer to the vq so we can store that info. Signed-off-by: Mike Christie --- drivers/vhost/vhost.c | 24 +--- drivers/vhost/vhost.h | 1 + 2 files changed, 14 insertions(+), 11 deletions(-) diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c index 17e5956e7424..a314f050413a 100644 --- a/drivers/vhost/vhost.c +++ b/drivers/vhost/vhost.c @@ -488,6 +488,7 @@ void vhost_dev_init(struct vhost_dev *dev, vq->log = NULL; vq->indirect = NULL; vq->heads = NULL; + vq->worker = NULL; vq->dev = dev; mutex_init(&vq->mutex); vhost_vq_reset(dev, vq); @@ -566,15 +567,14 @@ static void vhost_worker_free(struct vhost_dev *dev) kfree(worker); } -static int vhost_worker_create(struct vhost_dev *dev) +static struct vhost_worker *vhost_worker_create(struct vhost_dev *dev) { struct vhost_worker *worker; struct task_struct *task; - int ret; worker = kzalloc(sizeof(*worker), GFP_KERNEL_ACCOUNT); if (!worker) - return -ENOMEM; + return NULL; dev->worker = worker; worker->kcov_handle = kcov_common_handle(); @@ -586,25 +586,24 @@ static int vhost_worker_create(struct vhost_dev *dev) */ task = user_worker_create(vhost_worker, worker, NUMA_NO_NODE, CLONE_FS, USER_WORKER_NO_FILES | USER_WORKER_SIG_IGN); - if (IS_ERR(task)) { - ret = PTR_ERR(task); + if (IS_ERR(task)) goto free_worker; - } worker->task = task; user_worker_start(task, "vhost-%d", current->pid); - return 0; + return worker; free_worker: kfree(worker); dev->worker = NULL; - return ret; + return NULL; } /* Caller should have device mutex */ long vhost_dev_set_owner(struct vhost_dev *dev) { - int err; + struct vhost_worker *worker; + int err, i; /* Is there an owner already? */ if (vhost_dev_has_owner(dev)) { @@ -615,9 +614,12 @@ long vhost_dev_set_owner(struct vhost_dev *dev) vhost_attach_mm(dev); if (dev->use_worker) { - err = vhost_worker_create(dev); - if (err) + worker = vhost_worker_create(dev); + if (!worker) goto err_worker; + + for (i = 0; i < dev->nvqs; i++) + dev->vqs[i]->worker = worker; } err = vhost_dev_alloc_iovecs(dev); diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h index 67b23e178812..31d074724fe8 100644 --- a/drivers/vhost/vhost.h +++ b/drivers/vhost/vhost.h @@ -79,6 +79,7 @@ struct vhost_vring_call { /* The virtqueue structure describes a queue attached to a device. */ struct vhost_virtqueue { struct vhost_dev *dev; + struct vhost_worker *worker; /* The actual ring of buffers. */ struct mutex mutex; -- 2.25.1 ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
[PATCH V5 00/12] vhost: multiple worker support
The following patches apply over linus's tree and the user_worker patchset here: https://lore.kernel.org/virtualization/20211129194707.5863-1-michael.chris...@oracle.com/T/#t which allows us to check the vhost owner thread's RLIMITs, and they are built over Andrey's flush cleanups: https://lore.kernel.org/virtualization/20211207024510.23292-1-michael.chris...@oracle.com/T/#t The patches allow us to support multiple vhost workers per device. The design is a modified version of Stefan's original idea where userspace has the kernel create a worker and we pass back the pid. In this version, V4, instead of passing the pid between user/kernel space we use a worker_id which is just an integer managed by the vhost driver and we allow userspace to create and free workers and then attach them to virtqueues at setup time or while IO is running. All review comments from the past reviews should be handled. If I didn't reply to a review comment, I agreed with the comment and should have handled it in this posting. Let me know if I missed one. Results: fio jobs1 2 4 8 12 16 -- 1 worker84k492k510k- - - worker per vq 184k 380k744k1422k 2256k 2434k Notes: 0. This used a simple fio command: fio --filename=/dev/sdb --direct=1 --rw=randrw --bs=4k \ --ioengine=libaio --iodepth=128 --numjobs=$JOBS_ABOVE and I used a VM with 16 vCPUs and 16 virtqueues. 1. The patches were tested with emulate_pr=0 and these patches: https://lore.kernel.org/all/yq1tuhge4bg@ca-mkp.ca.oracle.com/t/ which are in mkp's scsi branches for the next kernel. They fix the perf issues where IOPs dropped at 12 vqs/jobs. 2. Because we have a hard limit of 1024 cmds, if the num jobs * iodepth was greater than 1024, I would decrease iodepth. So 12 jobs used 85 cmds, and 16 used 64. 3. The perf issue above at 2 jobs is because when we only have 1 worker we execute more cmds per vhost_work due to all vqs funneling to one worker. This results in less context switches and better batching without having to tweak any settings. I'm working on patches to add back batching during lio completion and do polling on the submission side. We will still want the threading patches, because if we batch at the fio level plus use the vhost theading patches, we can see a big boost like below. So hopefully doing it at the kernel will allow apps to just work without having to be smart like fio. fio using io_uring and batching with the iodepth_batch* settings: fio jobs1 2 4 8 12 16 - 1 worker494k520k- - - - worker per vq 496k878k1542k 2436k 2304k 2590k V5: - Rebase against user_worker patchset. - Rebase against flush patchset. - Redo vhost-scsi tmf flush handling so it doesn't access vq->worker. V4: - fix vhost-sock VSOCK_VQ_RX use. - name functions called directly by ioctl cmd's to match the ioctl cmd. - break up VHOST_SET_VRING_WORKER into a new, free and attach cmd. - document worker lifetime, and cgroup, namespace, mm, rlimit inheritance, make it clear we currently only support sharing within the device. - add support to attach workers while IO is running. - instead of passing a pid_t of the kernel thread, pass a int allocated by the vhost layer with an idr. V3: - fully convert vhost code to use vq based APIs instead of leaving it half per dev and half per vq. - rebase against kernel worker API. - Drop delayed worker creation. We always create the default worker at VHOST_SET_OWNER time. Userspace can create and bind workers after that. V2: - change loop that we take a refcount to the worker in - replaced pid == -1 with define. - fixed tabbing/spacing coding style issue - use hash instead of list to lookup workers. - I dropped the patch that added an ioctl cmd to get a vq's worker's pid. I saw we might do a generic netlink interface instead. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
[PATCH V5 03/12] vhost: take worker or vq instead of dev for queueing
This patch has the core work queueing function take a worker for when we support multiple workers. It also adds a helper that takes a vq during queueing so modules can control which vq/worker to queue work on. This temp leaves vhost_work_queue. It will be removed when the drivers are converted in the next patches. Signed-off-by: Mike Christie --- drivers/vhost/vhost.c | 44 +++ drivers/vhost/vhost.h | 1 + 2 files changed, 29 insertions(+), 16 deletions(-) diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c index 2fe7c2e9f6ad..86192e1658af 100644 --- a/drivers/vhost/vhost.c +++ b/drivers/vhost/vhost.c @@ -230,6 +230,34 @@ void vhost_poll_stop(struct vhost_poll *poll) } EXPORT_SYMBOL_GPL(vhost_poll_stop); +static void vhost_work_queue_on(struct vhost_worker *worker, + struct vhost_work *work) +{ + if (!worker) + return; + + if (!test_and_set_bit(VHOST_WORK_QUEUED, &work->flags)) { + /* We can only add the work to the list after we're +* sure it was not in the list. +* test_and_set_bit() implies a memory barrier. +*/ + llist_add(&work->node, &worker->work_list); + wake_up_process(worker->task); + } +} + +void vhost_work_queue(struct vhost_dev *dev, struct vhost_work *work) +{ + vhost_work_queue_on(dev->worker, work); +} +EXPORT_SYMBOL_GPL(vhost_work_queue); + +void vhost_vq_work_queue(struct vhost_virtqueue *vq, struct vhost_work *work) +{ + vhost_work_queue_on(vq->worker, work); +} +EXPORT_SYMBOL_GPL(vhost_vq_work_queue); + void vhost_work_dev_flush(struct vhost_dev *dev) { struct vhost_flush_struct flush; @@ -244,22 +272,6 @@ void vhost_work_dev_flush(struct vhost_dev *dev) } EXPORT_SYMBOL_GPL(vhost_work_dev_flush); -void vhost_work_queue(struct vhost_dev *dev, struct vhost_work *work) -{ - if (!dev->worker) - return; - - if (!test_and_set_bit(VHOST_WORK_QUEUED, &work->flags)) { - /* We can only add the work to the list after we're -* sure it was not in the list. -* test_and_set_bit() implies a memory barrier. -*/ - llist_add(&work->node, &dev->worker->work_list); - wake_up_process(dev->worker->task); - } -} -EXPORT_SYMBOL_GPL(vhost_work_queue); - /* A lockless hint for busy polling code to exit the loop */ bool vhost_vq_has_work(struct vhost_virtqueue *vq) { diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h index f0f2fb334b13..f5251cf902ce 100644 --- a/drivers/vhost/vhost.h +++ b/drivers/vhost/vhost.h @@ -198,6 +198,7 @@ int vhost_get_vq_desc(struct vhost_virtqueue *, struct vhost_log *log, unsigned int *log_num); void vhost_discard_vq_desc(struct vhost_virtqueue *, int n); +void vhost_vq_work_queue(struct vhost_virtqueue *vq, struct vhost_work *work); bool vhost_vq_has_work(struct vhost_virtqueue *vq); bool vhost_vq_is_setup(struct vhost_virtqueue *vq); int vhost_vq_init_access(struct vhost_virtqueue *); -- 2.25.1 ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
[PATCH V5 05/12] vhost: convert poll work to be vq based
This has the drivers pass in their poll to vq mapping and then converts the core poll code to use the vq based helpers. Signed-off-by: Mike Christie --- drivers/vhost/net.c | 6 -- drivers/vhost/vhost.c | 8 +--- drivers/vhost/vhost.h | 4 +++- 3 files changed, 12 insertions(+), 6 deletions(-) diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c index f225cb9dcb10..6e2b1670ae51 100644 --- a/drivers/vhost/net.c +++ b/drivers/vhost/net.c @@ -1339,8 +1339,10 @@ static int vhost_net_open(struct inode *inode, struct file *f) VHOST_NET_PKT_WEIGHT, VHOST_NET_WEIGHT, true, NULL); - vhost_poll_init(n->poll + VHOST_NET_VQ_TX, handle_tx_net, EPOLLOUT, dev); - vhost_poll_init(n->poll + VHOST_NET_VQ_RX, handle_rx_net, EPOLLIN, dev); + vhost_poll_init(n->poll + VHOST_NET_VQ_TX, handle_tx_net, EPOLLOUT, dev, + vqs[VHOST_NET_VQ_TX]); + vhost_poll_init(n->poll + VHOST_NET_VQ_RX, handle_rx_net, EPOLLIN, dev, + vqs[VHOST_NET_VQ_RX]); f->private_data = n; n->page_frag.page = NULL; diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c index e5f5acc7e648..e9f78f0d72cb 100644 --- a/drivers/vhost/vhost.c +++ b/drivers/vhost/vhost.c @@ -186,13 +186,15 @@ EXPORT_SYMBOL_GPL(vhost_work_init); /* Init poll structure */ void vhost_poll_init(struct vhost_poll *poll, vhost_work_fn_t fn, -__poll_t mask, struct vhost_dev *dev) +__poll_t mask, struct vhost_dev *dev, +struct vhost_virtqueue *vq) { init_waitqueue_func_entry(&poll->wait, vhost_poll_wakeup); init_poll_funcptr(&poll->table, vhost_poll_func); poll->mask = mask; poll->dev = dev; poll->wqh = NULL; + poll->vq = vq; vhost_work_init(&poll->work, fn); } @@ -287,7 +289,7 @@ EXPORT_SYMBOL_GPL(vhost_vq_has_work); void vhost_poll_queue(struct vhost_poll *poll) { - vhost_work_queue(poll->dev, &poll->work); + vhost_vq_work_queue(poll->vq, &poll->work); } EXPORT_SYMBOL_GPL(vhost_poll_queue); @@ -512,7 +514,7 @@ void vhost_dev_init(struct vhost_dev *dev, vhost_vq_reset(dev, vq); if (vq->handle_kick) vhost_poll_init(&vq->poll, vq->handle_kick, - EPOLLIN, dev); + EPOLLIN, dev, vq); } } EXPORT_SYMBOL_GPL(vhost_dev_init); diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h index f5251cf902ce..58032429bd05 100644 --- a/drivers/vhost/vhost.h +++ b/drivers/vhost/vhost.h @@ -46,13 +46,15 @@ struct vhost_poll { struct vhost_work work; __poll_tmask; struct vhost_dev*dev; + struct vhost_virtqueue *vq; }; void vhost_work_init(struct vhost_work *work, vhost_work_fn_t fn); void vhost_work_queue(struct vhost_dev *dev, struct vhost_work *work); void vhost_poll_init(struct vhost_poll *poll, vhost_work_fn_t fn, -__poll_t mask, struct vhost_dev *dev); +__poll_t mask, struct vhost_dev *dev, +struct vhost_virtqueue *vq); int vhost_poll_start(struct vhost_poll *poll, struct file *file); void vhost_poll_stop(struct vhost_poll *poll); void vhost_poll_queue(struct vhost_poll *poll); -- 2.25.1 ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
[PATCH V5 04/12] vhost: take worker or vq instead of dev for flushing
This patch has the core work flush function take a worker for when we support multiple workers. Signed-off-by: Mike Christie --- drivers/vhost/vhost.c | 24 +++- 1 file changed, 15 insertions(+), 9 deletions(-) diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c index 86192e1658af..e5f5acc7e648 100644 --- a/drivers/vhost/vhost.c +++ b/drivers/vhost/vhost.c @@ -246,6 +246,20 @@ static void vhost_work_queue_on(struct vhost_worker *worker, } } +static void vhost_work_flush_on(struct vhost_worker *worker) +{ + struct vhost_flush_struct flush; + + if (!worker) + return; + + init_completion(&flush.wait_event); + vhost_work_init(&flush.work, vhost_flush_work); + + vhost_work_queue_on(worker, &flush.work); + wait_for_completion(&flush.wait_event); +} + void vhost_work_queue(struct vhost_dev *dev, struct vhost_work *work) { vhost_work_queue_on(dev->worker, work); @@ -260,15 +274,7 @@ EXPORT_SYMBOL_GPL(vhost_vq_work_queue); void vhost_work_dev_flush(struct vhost_dev *dev) { - struct vhost_flush_struct flush; - - if (dev->worker) { - init_completion(&flush.wait_event); - vhost_work_init(&flush.work, vhost_flush_work); - - vhost_work_queue(dev, &flush.work); - wait_for_completion(&flush.wait_event); - } + vhost_work_flush_on(dev->worker); } EXPORT_SYMBOL_GPL(vhost_work_dev_flush); -- 2.25.1 ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
[PATCH V5 06/12] vhost-sock: convert to vhost_vq_work_queue
Convert from vhost_work_queue to vhost_vq_work_queue. Signed-off-by: Mike Christie --- drivers/vhost/vsock.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c index 1f38160b249d..068ccdbd3bcd 100644 --- a/drivers/vhost/vsock.c +++ b/drivers/vhost/vsock.c @@ -300,7 +300,7 @@ vhost_transport_send_pkt(struct virtio_vsock_pkt *pkt) list_add_tail(&pkt->list, &vsock->send_pkt_list); spin_unlock_bh(&vsock->send_pkt_list_lock); - vhost_work_queue(&vsock->dev, &vsock->send_pkt_work); + vhost_vq_work_queue(&vsock->vqs[VSOCK_VQ_RX], &vsock->send_pkt_work); rcu_read_unlock(); return len; @@ -608,7 +608,7 @@ static int vhost_vsock_start(struct vhost_vsock *vsock) /* Some packets may have been queued before the device was started, * let's kick the send worker to send them. */ - vhost_work_queue(&vsock->dev, &vsock->send_pkt_work); + vhost_vq_work_queue(&vsock->vqs[VSOCK_VQ_RX], &vsock->send_pkt_work); mutex_unlock(&vsock->dev.mutex); return 0; -- 2.25.1 ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
[PATCH V5 08/12] vhost-scsi: convert to vhost_vq_work_queue
Convert from vhost_work_queue to vhost_vq_work_queue. Signed-off-by: Mike Christie --- drivers/vhost/scsi.c | 20 ++-- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c index b2592e927316..93c6ad1246eb 100644 --- a/drivers/vhost/scsi.c +++ b/drivers/vhost/scsi.c @@ -361,8 +361,9 @@ static void vhost_scsi_release_cmd(struct se_cmd *se_cmd) if (se_cmd->se_cmd_flags & SCF_SCSI_TMR_CDB) { struct vhost_scsi_tmf *tmf = container_of(se_cmd, struct vhost_scsi_tmf, se_cmd); + struct vhost_virtqueue *vq = &tmf->svq->vq; - vhost_work_queue(&tmf->vhost->dev, &tmf->vwork); + vhost_vq_work_queue(vq, &tmf->vwork); } else { struct vhost_scsi_cmd *cmd = container_of(se_cmd, struct vhost_scsi_cmd, tvc_se_cmd); @@ -1360,11 +1361,9 @@ static void vhost_scsi_ctl_handle_kick(struct vhost_work *work) } static void -vhost_scsi_send_evt(struct vhost_scsi *vs, - struct vhost_scsi_tpg *tpg, - struct se_lun *lun, - u32 event, - u32 reason) +vhost_scsi_send_evt(struct vhost_scsi *vs, struct vhost_virtqueue *vq, + struct vhost_scsi_tpg *tpg, struct se_lun *lun, + u32 event, u32 reason) { struct vhost_scsi_evt *evt; @@ -1386,7 +1385,7 @@ vhost_scsi_send_evt(struct vhost_scsi *vs, } llist_add(&evt->list, &vs->vs_event_list); - vhost_work_queue(&vs->dev, &vs->vs_event_work); + vhost_vq_work_queue(vq, &vs->vs_event_work); } static void vhost_scsi_evt_handle_kick(struct vhost_work *work) @@ -1400,7 +1399,8 @@ static void vhost_scsi_evt_handle_kick(struct vhost_work *work) goto out; if (vs->vs_events_missed) - vhost_scsi_send_evt(vs, NULL, NULL, VIRTIO_SCSI_T_NO_EVENT, 0); + vhost_scsi_send_evt(vs, vq, NULL, NULL, VIRTIO_SCSI_T_NO_EVENT, + 0); out: mutex_unlock(&vq->mutex); } @@ -1965,8 +1965,8 @@ vhost_scsi_do_plug(struct vhost_scsi_tpg *tpg, vq = &vs->vqs[VHOST_SCSI_VQ_EVT].vq; mutex_lock(&vq->mutex); if (vhost_has_feature(vq, VIRTIO_SCSI_F_HOTPLUG)) - vhost_scsi_send_evt(vs, tpg, lun, - VIRTIO_SCSI_T_TRANSPORT_RESET, reason); + vhost_scsi_send_evt(vs, vq, tpg, lun, + VIRTIO_SCSI_T_TRANSPORT_RESET, reason); mutex_unlock(&vq->mutex); mutex_unlock(&vs->dev.mutex); } -- 2.25.1 ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
[PATCH V5 09/12] vhost: remove vhost_work_queue
vhost_work_queue is no longer used. Each driver is using the poll or vq based queueing, so remove vhost_work_queue. Signed-off-by: Mike Christie --- drivers/vhost/vhost.c | 6 -- drivers/vhost/vhost.h | 1 - 2 files changed, 7 deletions(-) diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c index e9f78f0d72cb..050b8007db8b 100644 --- a/drivers/vhost/vhost.c +++ b/drivers/vhost/vhost.c @@ -262,12 +262,6 @@ static void vhost_work_flush_on(struct vhost_worker *worker) wait_for_completion(&flush.wait_event); } -void vhost_work_queue(struct vhost_dev *dev, struct vhost_work *work) -{ - vhost_work_queue_on(dev->worker, work); -} -EXPORT_SYMBOL_GPL(vhost_work_queue); - void vhost_vq_work_queue(struct vhost_virtqueue *vq, struct vhost_work *work) { vhost_work_queue_on(vq->worker, work); diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h index 58032429bd05..4423b2420c90 100644 --- a/drivers/vhost/vhost.h +++ b/drivers/vhost/vhost.h @@ -50,7 +50,6 @@ struct vhost_poll { }; void vhost_work_init(struct vhost_work *work, vhost_work_fn_t fn); -void vhost_work_queue(struct vhost_dev *dev, struct vhost_work *work); void vhost_poll_init(struct vhost_poll *poll, vhost_work_fn_t fn, __poll_t mask, struct vhost_dev *dev, -- 2.25.1 ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
[PATCH V5 12/12] vhost: allow worker attachment after initial setup
This patch allows userspace to change the vq to worker mapping while it's in use so tools can do this setup post device creation if needed. Signed-off-by: Mike Christie --- drivers/vhost/vhost.c | 76 +++--- drivers/vhost/vhost.h | 2 +- include/uapi/linux/vhost.h | 2 +- 3 files changed, 57 insertions(+), 23 deletions(-) diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c index 1af43b5d1dbd..f702df0ce33f 100644 --- a/drivers/vhost/vhost.c +++ b/drivers/vhost/vhost.c @@ -232,12 +232,9 @@ void vhost_poll_stop(struct vhost_poll *poll) } EXPORT_SYMBOL_GPL(vhost_poll_stop); -static void vhost_work_queue_on(struct vhost_worker *worker, - struct vhost_work *work) +static void vhost_worker_work_queue(struct vhost_worker *worker, + struct vhost_work *work) { - if (!worker) - return; - if (!test_and_set_bit(VHOST_WORK_QUEUED, &work->flags)) { /* We can only add the work to the list after we're * sure it was not in the list. @@ -248,21 +245,32 @@ static void vhost_work_queue_on(struct vhost_worker *worker, } } -static int vhost_workers_idr_flush_iter(int id, void *worker, void *dev) +static void vhost_worker_flush(struct vhost_worker *worker) { struct vhost_flush_struct flush; init_completion(&flush.wait_event); vhost_work_init(&flush.work, vhost_flush_work); - vhost_work_queue_on(worker, &flush.work); + vhost_worker_work_queue(worker, &flush.work); wait_for_completion(&flush.wait_event); +} + +static int vhost_workers_idr_flush_iter(int id, void *worker, void *dev) +{ + vhost_worker_flush(worker); return 0; } void vhost_vq_work_queue(struct vhost_virtqueue *vq, struct vhost_work *work) { - vhost_work_queue_on(vq->worker, work); + struct vhost_worker *worker; + + rcu_read_lock(); + worker = rcu_dereference(vq->worker); + if (worker) + vhost_worker_work_queue(worker, work); + rcu_read_unlock(); } EXPORT_SYMBOL_GPL(vhost_vq_work_queue); @@ -282,7 +290,16 @@ EXPORT_SYMBOL_GPL(vhost_work_dev_flush); /* A lockless hint for busy polling code to exit the loop */ bool vhost_vq_has_work(struct vhost_virtqueue *vq) { - return vq->worker && !llist_empty(&vq->worker->work_list); + struct vhost_worker *worker; + bool has_work = false; + + rcu_read_lock(); + worker = rcu_dereference(vq->worker); + if (worker && !llist_empty(&worker->work_list)) + has_work = true; + rcu_read_unlock(); + + return has_work; } EXPORT_SYMBOL_GPL(vhost_vq_has_work); @@ -507,7 +524,7 @@ void vhost_dev_init(struct vhost_dev *dev, vq->log = NULL; vq->indirect = NULL; vq->heads = NULL; - vq->worker = NULL; + rcu_assign_pointer(vq->worker, NULL); vq->dev = dev; mutex_init(&vq->mutex); vhost_vq_reset(dev, vq); @@ -587,11 +604,30 @@ static void vhost_worker_put(struct vhost_dev *dev, struct vhost_worker *worker) kfree(worker); } -static void vhost_vq_detach_worker(struct vhost_virtqueue *vq) +static void vhost_vq_swap_worker(struct vhost_virtqueue *vq, +struct vhost_worker *new_worker, bool flush) { - if (vq->worker) - vhost_worker_put(vq->dev, vq->worker); - vq->worker = NULL; + struct vhost_worker *old_worker; + + old_worker = rcu_dereference_check(vq->worker, + lockdep_is_held(&vq->dev->mutex)); + rcu_assign_pointer(vq->worker, new_worker); + + if (!old_worker) + return; + + if (flush) { + /* +* For dev cleanup we won't have work running, but for the +* dynamic attach case we might so make sure we see the new +* worker and there is no work in the old worker when we +* return. +*/ + synchronize_rcu(); + vhost_worker_flush(old_worker); + } + + vhost_worker_put(vq->dev, old_worker); } static int vhost_workers_idr_iter(int id, void *worker, void *dev) @@ -608,7 +644,7 @@ static void vhost_workers_free(struct vhost_dev *dev) return; for (i = 0; i < dev->nvqs; i++) - vhost_vq_detach_worker(dev->vqs[i]); + vhost_vq_swap_worker(dev->vqs[i], NULL, false); idr_for_each(&dev->worker_idr, vhost_workers_idr_iter, dev); } @@ -664,18 +700,13 @@ static int vhost_vq_attach_worker(struct vhost_virtqueue *vq, if (!dev->use_worker) return -EINVAL; - /* We don't support setting a worker on an active vq */ - if (vq->private_data) - return -EBUSY; - worker = idr_fi
[PATCH V5 11/12] vhost: allow userspace to create workers
For vhost-scsi with 3 vqs and a workload like that tries to use those vqs like: fio --filename=/dev/sdb --direct=1 --rw=randrw --bs=4k \ --ioengine=libaio --iodepth=128 --numjobs=3 the single vhost worker thread will become a bottlneck. To better utilize virtqueues and available CPUs, this patch allows userspace to create workers and bind them to vqs. You can have N workers per dev and also share N workers with M vqs. Signed-off-by: Mike Christie --- drivers/vhost/vhost.c| 164 --- drivers/vhost/vhost.h| 4 +- include/uapi/linux/vhost.h | 22 + include/uapi/linux/vhost_types.h | 15 +++ 4 files changed, 188 insertions(+), 17 deletions(-) diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c index 050b8007db8b..1af43b5d1dbd 100644 --- a/drivers/vhost/vhost.c +++ b/drivers/vhost/vhost.c @@ -248,18 +248,16 @@ static void vhost_work_queue_on(struct vhost_worker *worker, } } -static void vhost_work_flush_on(struct vhost_worker *worker) +static int vhost_workers_idr_flush_iter(int id, void *worker, void *dev) { struct vhost_flush_struct flush; - if (!worker) - return; - init_completion(&flush.wait_event); vhost_work_init(&flush.work, vhost_flush_work); vhost_work_queue_on(worker, &flush.work); wait_for_completion(&flush.wait_event); + return 0; } void vhost_vq_work_queue(struct vhost_virtqueue *vq, struct vhost_work *work) @@ -268,9 +266,16 @@ void vhost_vq_work_queue(struct vhost_virtqueue *vq, struct vhost_work *work) } EXPORT_SYMBOL_GPL(vhost_vq_work_queue); +/** + * vhost_work_dev_flush - flush vhost work in device + * @dev: vhost dev to flush + * + * This must be called with the device mutex or from the device's release + * function when workers cannot be swapped. + */ void vhost_work_dev_flush(struct vhost_dev *dev) { - vhost_work_flush_on(dev->worker); + idr_for_each(&dev->worker_idr, vhost_workers_idr_flush_iter, dev); } EXPORT_SYMBOL_GPL(vhost_work_dev_flush); @@ -485,7 +490,6 @@ void vhost_dev_init(struct vhost_dev *dev, dev->umem = NULL; dev->iotlb = NULL; dev->mm = NULL; - dev->worker = NULL; dev->iov_limit = iov_limit; dev->weight = weight; dev->byte_weight = byte_weight; @@ -495,6 +499,7 @@ void vhost_dev_init(struct vhost_dev *dev, INIT_LIST_HEAD(&dev->read_list); INIT_LIST_HEAD(&dev->pending_list); spin_lock_init(&dev->iotlb_lock); + idr_init(&dev->worker_idr); for (i = 0; i < dev->nvqs; ++i) { @@ -568,31 +573,59 @@ static void vhost_worker_stop(struct vhost_worker *worker) wait_for_completion(worker->exit_done); } -static void vhost_worker_free(struct vhost_dev *dev) +static void vhost_worker_put(struct vhost_dev *dev, struct vhost_worker *worker) { - struct vhost_worker *worker = dev->worker; - if (!worker) return; - dev->worker = NULL; + if (!refcount_dec_and_test(&worker->refcount)) + return; + WARN_ON(!llist_empty(&worker->work_list)); vhost_worker_stop(worker); + idr_remove(&dev->worker_idr, worker->id); kfree(worker); } +static void vhost_vq_detach_worker(struct vhost_virtqueue *vq) +{ + if (vq->worker) + vhost_worker_put(vq->dev, vq->worker); + vq->worker = NULL; +} + +static int vhost_workers_idr_iter(int id, void *worker, void *dev) +{ + vhost_worker_put(dev, worker); + return 0; +} + +static void vhost_workers_free(struct vhost_dev *dev) +{ + int i; + + if (!dev->use_worker) + return; + + for (i = 0; i < dev->nvqs; i++) + vhost_vq_detach_worker(dev->vqs[i]); + + idr_for_each(&dev->worker_idr, vhost_workers_idr_iter, dev); +} + static struct vhost_worker *vhost_worker_create(struct vhost_dev *dev) { struct vhost_worker *worker; struct task_struct *task; + int id; worker = kzalloc(sizeof(*worker), GFP_KERNEL_ACCOUNT); if (!worker) return NULL; - dev->worker = worker; worker->kcov_handle = kcov_common_handle(); init_llist_head(&worker->work_list); + refcount_set(&worker->refcount, 1); /* * vhost used to use the kthread API which ignores all signals by @@ -605,14 +638,88 @@ static struct vhost_worker *vhost_worker_create(struct vhost_dev *dev) worker->task = task; user_worker_start(task, "vhost-%d", current->pid); + + /* idr accesses are done under the vhost_dev mutex */ + id = idr_alloc(&dev->worker_idr, worker, 0, INT_MAX, GFP_KERNEL); + if (id < 0) + goto stop_worker; + worker->id = id; + return worker; +stop_worker: + vhost_worker_stop(worker); free_worker: kfree(worker); - dev->worker = NULL; return NULL; } +/*
[PATCH V5 07/12] vhost-scsi: make SCSI cmd completion per vq
This patch separates the scsi cmd completion code paths so we can complete cmds based on their vq instead of having all cmds complete on the same worker/CPU. This will be useful with the next patches that allow us to create mulitple worker threads and bind them to different vqs, so we can have completions running on different threads/CPUs. Signed-off-by: Mike Christie Reviewed-by: Stefan Hajnoczi --- drivers/vhost/scsi.c | 48 +++- 1 file changed, 25 insertions(+), 23 deletions(-) diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c index 94535c813ef7..b2592e927316 100644 --- a/drivers/vhost/scsi.c +++ b/drivers/vhost/scsi.c @@ -164,6 +164,7 @@ enum { struct vhost_scsi_virtqueue { struct vhost_virtqueue vq; + struct vhost_scsi *vs; /* * Reference counting for inflight reqs, used for flush operation. At * each time, one reference tracks new commands submitted, while we @@ -178,6 +179,9 @@ struct vhost_scsi_virtqueue { struct vhost_scsi_cmd *scsi_cmds; struct sbitmap scsi_tags; int max_cmds; + + struct vhost_work completion_work; + struct llist_head completion_list; }; struct vhost_scsi { @@ -188,9 +192,6 @@ struct vhost_scsi { struct vhost_dev dev; struct vhost_scsi_virtqueue vqs[VHOST_SCSI_MAX_VQ]; - struct vhost_work vs_completion_work; /* cmd completion work item */ - struct llist_head vs_completion_list; /* cmd completion queue */ - struct vhost_work vs_event_work; /* evt injection work item */ struct llist_head vs_event_list; /* evt injection queue */ @@ -365,10 +366,11 @@ static void vhost_scsi_release_cmd(struct se_cmd *se_cmd) } else { struct vhost_scsi_cmd *cmd = container_of(se_cmd, struct vhost_scsi_cmd, tvc_se_cmd); - struct vhost_scsi *vs = cmd->tvc_vhost; + struct vhost_scsi_virtqueue *svq = container_of(cmd->tvc_vq, + struct vhost_scsi_virtqueue, vq); - llist_add(&cmd->tvc_completion_list, &vs->vs_completion_list); - vhost_work_queue(&vs->dev, &vs->vs_completion_work); + llist_add(&cmd->tvc_completion_list, &svq->completion_list); + vhost_vq_work_queue(&svq->vq, &svq->completion_work); } } @@ -531,18 +533,17 @@ static void vhost_scsi_evt_work(struct vhost_work *work) */ static void vhost_scsi_complete_cmd_work(struct vhost_work *work) { - struct vhost_scsi *vs = container_of(work, struct vhost_scsi, - vs_completion_work); - DECLARE_BITMAP(signal, VHOST_SCSI_MAX_VQ); + struct vhost_scsi_virtqueue *svq = container_of(work, + struct vhost_scsi_virtqueue, completion_work); struct virtio_scsi_cmd_resp v_rsp; struct vhost_scsi_cmd *cmd, *t; struct llist_node *llnode; struct se_cmd *se_cmd; struct iov_iter iov_iter; - int ret, vq; + bool signal = false; + int ret; - bitmap_zero(signal, VHOST_SCSI_MAX_VQ); - llnode = llist_del_all(&vs->vs_completion_list); + llnode = llist_del_all(&svq->completion_list); llist_for_each_entry_safe(cmd, t, llnode, tvc_completion_list) { se_cmd = &cmd->tvc_se_cmd; @@ -562,21 +563,16 @@ static void vhost_scsi_complete_cmd_work(struct vhost_work *work) cmd->tvc_in_iovs, sizeof(v_rsp)); ret = copy_to_iter(&v_rsp, sizeof(v_rsp), &iov_iter); if (likely(ret == sizeof(v_rsp))) { - struct vhost_scsi_virtqueue *q; + signal = true; vhost_add_used(cmd->tvc_vq, cmd->tvc_vq_desc, 0); - q = container_of(cmd->tvc_vq, struct vhost_scsi_virtqueue, vq); - vq = q - vs->vqs; - __set_bit(vq, signal); } else pr_err("Faulted on virtio_scsi_cmd_resp\n"); vhost_scsi_release_cmd_res(se_cmd); } - vq = -1; - while ((vq = find_next_bit(signal, VHOST_SCSI_MAX_VQ, vq + 1)) - < VHOST_SCSI_MAX_VQ) - vhost_signal(&vs->dev, &vs->vqs[vq].vq); + if (signal) + vhost_signal(&svq->vs->dev, &svq->vq); } static struct vhost_scsi_cmd * @@ -1776,6 +1772,7 @@ static int vhost_scsi_set_features(struct vhost_scsi *vs, u64 features) static int vhost_scsi_open(struct inode *inode, struct file *f) { + struct vhost_scsi_virtqueue *svq; struct vhost_scsi *vs; struct vhost_virtqueue **vqs; int r = -ENOMEM, i; @@ -1788,7 +1785,6 @@ static int vhost_scsi_open(struct inode *inode, struct file *f) if (!vqs) goto err_vqs; - vhost_work_init(&vs->vs_completion_work, vhost_scs
[PATCH V5 02/12] vhost, vhost-net: add helper to check if vq has work
This adds a helper to check if a vq has work pending and converts vhost-net to use it. Signed-off-by: Mike Christie --- drivers/vhost/net.c | 2 +- drivers/vhost/vhost.c | 6 +++--- drivers/vhost/vhost.h | 2 +- 3 files changed, 5 insertions(+), 5 deletions(-) diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c index b1feb5e0571e..f225cb9dcb10 100644 --- a/drivers/vhost/net.c +++ b/drivers/vhost/net.c @@ -543,7 +543,7 @@ static void vhost_net_busy_poll(struct vhost_net *net, endtime = busy_clock() + busyloop_timeout; while (vhost_can_busy_poll(endtime)) { - if (vhost_has_work(&net->dev)) { + if (vhost_vq_has_work(vq)) { *busyloop_intr = true; break; } diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c index a314f050413a..2fe7c2e9f6ad 100644 --- a/drivers/vhost/vhost.c +++ b/drivers/vhost/vhost.c @@ -261,11 +261,11 @@ void vhost_work_queue(struct vhost_dev *dev, struct vhost_work *work) EXPORT_SYMBOL_GPL(vhost_work_queue); /* A lockless hint for busy polling code to exit the loop */ -bool vhost_has_work(struct vhost_dev *dev) +bool vhost_vq_has_work(struct vhost_virtqueue *vq) { - return dev->worker && !llist_empty(&dev->worker->work_list); + return vq->worker && !llist_empty(&vq->worker->work_list); } -EXPORT_SYMBOL_GPL(vhost_has_work); +EXPORT_SYMBOL_GPL(vhost_vq_has_work); void vhost_poll_queue(struct vhost_poll *poll) { diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h index 31d074724fe8..f0f2fb334b13 100644 --- a/drivers/vhost/vhost.h +++ b/drivers/vhost/vhost.h @@ -50,7 +50,6 @@ struct vhost_poll { void vhost_work_init(struct vhost_work *work, vhost_work_fn_t fn); void vhost_work_queue(struct vhost_dev *dev, struct vhost_work *work); -bool vhost_has_work(struct vhost_dev *dev); void vhost_poll_init(struct vhost_poll *poll, vhost_work_fn_t fn, __poll_t mask, struct vhost_dev *dev); @@ -199,6 +198,7 @@ int vhost_get_vq_desc(struct vhost_virtqueue *, struct vhost_log *log, unsigned int *log_num); void vhost_discard_vq_desc(struct vhost_virtqueue *, int n); +bool vhost_vq_has_work(struct vhost_virtqueue *vq); bool vhost_vq_is_setup(struct vhost_virtqueue *vq); int vhost_vq_init_access(struct vhost_virtqueue *); int vhost_add_used(struct vhost_virtqueue *, unsigned int head, int len); -- 2.25.1 ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
[PATCH V5 10/12] vhost-scsi: flush IO vqs then send TMF rsp
With one worker we will always send the scsi cmd responses then send the TMF rsp, because LIO will always complete the scsi cmds first then call into us to send the TMF response. With multiple workers, the IO vq workers could be running while the TMF/ctl vq worker is so this has us do a flush before completing the TMF to make sure cmds are completed when it's work is later queued and run. Signed-off-by: Mike Christie --- drivers/vhost/scsi.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c index 93c6ad1246eb..33e3ff4c1f38 100644 --- a/drivers/vhost/scsi.c +++ b/drivers/vhost/scsi.c @@ -413,7 +413,13 @@ static void vhost_scsi_queue_tm_rsp(struct se_cmd *se_cmd) { struct vhost_scsi_tmf *tmf = container_of(se_cmd, struct vhost_scsi_tmf, se_cmd); - + /* +* LIO will complete the cmds this TMF has cleaned up, then call +* this function. If we have vqs that do not share a worker with the +* ctl vq, then those cmds/works could still be completing. Do a +* flush here to make sure when the tmf work runs the cmds are done. +*/ + vhost_work_dev_flush(&tmf->vhost->dev); tmf->scsi_resp = se_cmd->se_tmr_req->response; transport_generic_free_cmd(&tmf->se_cmd, 0); } -- 2.25.1 ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
[PATCH] hv_sock: Extract hvs_send_data() helper that takes only header
When building under -Warray-bounds, the compiler is especially conservative when faced with casts from a smaller object to a larger object. While this has found many real bugs, there are some cases that are currently false positives (like here). With this as one of the last few instances of the warning in the kernel before -Warray-bounds can be enabled globally, rearrange the functions so that there is a header-only version of hvs_send_data(). Silences this warning: net/vmw_vsock/hyperv_transport.c: In function 'hvs_shutdown_lock_held.constprop': net/vmw_vsock/hyperv_transport.c:231:32: warning: array subscript 'struct hvs_send_buf[0]' is partly outside array bounds of 'struct vmpipe_proto_header[1]' [-Warray-bounds] 231 | send_buf->hdr.pkt_type = 1; | ~~~^~~ net/vmw_vsock/hyperv_transport.c:465:36: note: while referencing 'hdr' 465 | struct vmpipe_proto_header hdr; |^~~ This change results in no executable instruction differences. Signed-off-by: Kees Cook --- net/vmw_vsock/hyperv_transport.c | 18 -- 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/net/vmw_vsock/hyperv_transport.c b/net/vmw_vsock/hyperv_transport.c index 19189cf30a72..e111e13b6660 100644 --- a/net/vmw_vsock/hyperv_transport.c +++ b/net/vmw_vsock/hyperv_transport.c @@ -225,14 +225,20 @@ static size_t hvs_channel_writable_bytes(struct vmbus_channel *chan) return round_down(ret, 8); } +static int __hvs_send_data(struct vmbus_channel *chan, + struct vmpipe_proto_header *hdr, + size_t to_write) +{ + hdr->pkt_type = 1; + hdr->data_size = to_write; + return vmbus_sendpacket(chan, hdr, sizeof(*hdr) + to_write, + 0, VM_PKT_DATA_INBAND, 0); +} + static int hvs_send_data(struct vmbus_channel *chan, struct hvs_send_buf *send_buf, size_t to_write) { - send_buf->hdr.pkt_type = 1; - send_buf->hdr.data_size = to_write; - return vmbus_sendpacket(chan, &send_buf->hdr, - sizeof(send_buf->hdr) + to_write, - 0, VM_PKT_DATA_INBAND, 0); + return __hvs_send_data(chan, &send_buf->hdr, to_write); } static void hvs_channel_cb(void *ctx) @@ -468,7 +474,7 @@ static void hvs_shutdown_lock_held(struct hvsock *hvs, int mode) return; /* It can't fail: see hvs_channel_writable_bytes(). */ - (void)hvs_send_data(hvs->chan, (struct hvs_send_buf *)&hdr, 0); + (void)__hvs_send_data(hvs->chan, &hdr, 0); hvs->fin_sent = true; } -- 2.30.2 ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization