Re: [PATCH net 5/6] virtio-net: fix the vq coalescing setting for vq resize
On Fri, Sep 22, 2023 at 1:02 PM Heng Qi wrote:
>
> On 2023/9/22 12:29 PM, Jason Wang wrote:
> > On Tue, Sep 19, 2023 at 3:49 PM Heng Qi wrote:
> >> According to the definition of virtqueue coalescing spec[1]:
> >>
> >>    Upon disabling and re-enabling a transmit virtqueue, the device MUST set
> >>    the coalescing parameters of the virtqueue to those configured through the
> >>    VIRTIO_NET_CTRL_NOTF_COAL_TX_SET command, or, if the driver did not set
> >>    any TX coalescing parameters, to 0.
> >>
> >>    Upon disabling and re-enabling a receive virtqueue, the device MUST set
> >>    the coalescing parameters of the virtqueue to those configured through the
> >>    VIRTIO_NET_CTRL_NOTF_COAL_RX_SET command, or, if the driver did not set
> >>    any RX coalescing parameters, to 0.
> >>
> >> We need to add this setting for vq resize (ethtool -G) where vq_reset happens.
> >>
> >> [1] https://lists.oasis-open.org/archives/virtio-dev/202303/msg00415.html
> >>
> >> Fixes: 394bd87764b6 ("virtio_net: support per queue interrupt coalesce command")
> > I'm not sure this is a real fix as spec allows it to go zero?
>
> The spec says that if the user has configured interrupt coalescing parameters,
> the parameters need to be restored after vq_reset, otherwise set to 0.
> vi->intr_coal_tx and vi->intr_coal_rx always save the newest global parameters,
> regardless of whether the command is sent or not. So I think we need this patch;
> it complies with the specification requirements.

How can we make sure the old coalescing parameters still make sense
for the new ring size?

Thanks

> Thanks!
>
> > Thanks
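For readers following the thread, the restore being argued about amounts to roughly the sketch below: after a ring resize (which resets the queue), re-send the per-queue parameters cached in vi->intr_coal_rx/vi->intr_coal_tx. The virtnet_send_rx_coal_vq_cmd()/virtnet_send_tx_coal_vq_cmd() helpers are placeholders for whatever wrapper issues VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET; this is an illustration, not the hunk from the patch itself.

/* Sketch only: re-apply cached per-queue coalescing after a vq resize/reset.
 * The virtnet_send_*_coal_vq_cmd() helpers are hypothetical wrappers around
 * VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET for a single queue pair.
 */
static int virtnet_restore_vq_coal(struct virtnet_info *vi, u16 qpair)
{
	int err;

	/* vi->intr_coal_rx/tx hold the last user-configured parameters
	 * (0/0 if none were ever set), which is what the spec requires the
	 * queue to come back to after disable/re-enable.
	 */
	err = virtnet_send_rx_coal_vq_cmd(vi, qpair,
					  vi->intr_coal_rx.max_usecs,
					  vi->intr_coal_rx.max_packets);
	if (err)
		return err;

	return virtnet_send_tx_coal_vq_cmd(vi, qpair,
					   vi->intr_coal_tx.max_usecs,
					   vi->intr_coal_tx.max_packets);
}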
Re: [PATCH v2 1/2] iommu/virtio: Make use of ops->iotlb_sync_map
On Tue, Sep 19, 2023 at 09:28:08AM +0100, Robin Murphy wrote: > On 2023-09-19 09:15, Jean-Philippe Brucker wrote: > > On Mon, Sep 18, 2023 at 05:37:47PM +0100, Robin Murphy wrote: > > > > diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c > > > > index 17dcd826f5c2..3649586f0e5c 100644 > > > > --- a/drivers/iommu/virtio-iommu.c > > > > +++ b/drivers/iommu/virtio-iommu.c > > > > @@ -189,6 +189,12 @@ static int viommu_sync_req(struct viommu_dev > > > > *viommu) > > > > int ret; > > > > unsigned long flags; > > > > + /* > > > > +* .iotlb_sync_map and .flush_iotlb_all may be called before > > > > the viommu > > > > +* is initialized e.g. via iommu_create_device_direct_mappings() > > > > +*/ > > > > + if (!viommu) > > > > + return 0; > > > > > > Minor nit: I'd be inclined to make that check explicitly in the places > > > where > > > it definitely is expected, rather than allowing *any* sync to silently do > > > nothing if called incorrectly. Plus then they could use > > > vdomain->nr_endpoints for consistency with the equivalent checks elsewhere > > > (it did take me a moment to figure out how we could get to .iotlb_sync_map > > > with a NULL viommu without viommu_map_pages() blowing up first...) > > > > They're not strictly equivalent: this check works around a temporary issue > > with the IOMMU core, which calls map/unmap before the domain is finalized. > > Once we merge domain_alloc() and finalize(), then this check disappears, > > but we still need to test nr_endpoints in map/unmap to handle detached > > domains (and we still need to fix the synchronization of nr_endpoints > > against attach/detach). That's why I preferred doing this on viommu and > > keeping it in one place. > > Fair enough - it just seems to me that in both cases it's a detached domain, > so its previous history of whether it's ever been otherwise or not shouldn't > matter. Even once viommu is initialised, does it really make sense to send > sync commands for a mapping on a detached domain where we haven't actually > sent any map/unmap commands? If no requests were added by map/unmap, then viommu_sync_req() is essentially a nop because virtio doesn't use sync commands (and virtqueue_kick() only kicks the host when the queue's not empty, if I remember correctly). It still does a bit of work so is less efficient than a preliminary check on nr_endpoints, but it feels nicer to streamline the case where the domain is attached, since map/unmap on detached domains happens rarely, if ever. Either is fine by me. An extra test won't make much difference performance wise, and I guess will look less confusing. Niklas, do you mind resending the version with nr_endpoints check (and updated commit messages)? Thanks, Jean ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
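For reference, the nr_endpoints variant being asked for would look roughly like the sketch below (illustrative only, not the actual respin; the names follow those already used in virtio-iommu.c, and the same check would apply to .flush_iotlb_all):

/* Sketch: skip the sync when the domain has no endpoints attached, which
 * also covers the not-yet-finalized case discussed above.
 */
static int viommu_iotlb_sync_map(struct iommu_domain *domain,
				 unsigned long iova, size_t size)
{
	struct viommu_domain *vdomain = to_viommu_domain(domain);

	if (!vdomain->nr_endpoints)
		return 0;

	return viommu_sync_req(vdomain->viommu);
}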
Re: [PATCH v2 1/2] iommu/virtio: Make use of ops->iotlb_sync_map
On Tue, Sep 19, 2023 at 11:46:49AM -0300, Jason Gunthorpe wrote: > On Tue, Sep 19, 2023 at 09:15:19AM +0100, Jean-Philippe Brucker wrote: > > On Mon, Sep 18, 2023 at 05:37:47PM +0100, Robin Murphy wrote: > > > > diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c > > > > index 17dcd826f5c2..3649586f0e5c 100644 > > > > --- a/drivers/iommu/virtio-iommu.c > > > > +++ b/drivers/iommu/virtio-iommu.c > > > > @@ -189,6 +189,12 @@ static int viommu_sync_req(struct viommu_dev > > > > *viommu) > > > > int ret; > > > > unsigned long flags; > > > > + /* > > > > +* .iotlb_sync_map and .flush_iotlb_all may be called before > > > > the viommu > > > > +* is initialized e.g. via iommu_create_device_direct_mappings() > > > > +*/ > > > > + if (!viommu) > > > > + return 0; > > > > > > Minor nit: I'd be inclined to make that check explicitly in the places > > > where > > > it definitely is expected, rather than allowing *any* sync to silently do > > > nothing if called incorrectly. Plus then they could use > > > vdomain->nr_endpoints for consistency with the equivalent checks elsewhere > > > (it did take me a moment to figure out how we could get to .iotlb_sync_map > > > with a NULL viommu without viommu_map_pages() blowing up first...) > > This makes more sense to me > > Ultimately this driver should reach a point where every iommu_domain > always has a non-null domain->viommu because it will be set during > alloc. > > But it can still have nr_endpoints == 0, doesn't it make sense to > avoid sync in this case? > > (btw this driver is missing locking around vdomain->nr_endpoints) Yes, that's on my list > > > They're not strictly equivalent: this check works around a temporary issue > > with the IOMMU core, which calls map/unmap before the domain is > > finalized. > > Where? The above points to iommu_create_device_direct_mappings() but > it doesn't because the pgsize_bitmap == 0: __iommu_domain_alloc() sets pgsize_bitmap in this case: /* * If not already set, assume all sizes by default; the driver * may override this later */ if (!domain->pgsize_bitmap) domain->pgsize_bitmap = bus->iommu_ops->pgsize_bitmap; Thanks, Jean ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Weird locking hierarchy in virtcrypto
While backporting an upstream patch to one of our older distro kernels I've
found a weird locking hierarchy in the virtio crypto driver:

virtio_config_changed():
  spin_lock_irqsave()
  ->__virtio_config_changed()
    ->drv->config_changed()        <- assume drv == virtio_crypto_driver
      virtcrypto_config_changed():
      ->virtcrypto_update_status()
        ->virtcrypto_dev_start()
          ->virtio_crypto_skcipher_algs_register()
            mutex_lock()

I don't think trying to take a mutex while holding a spinlock with interrupts
disabled is a proper thing to do.

Am I overlooking something here?


Juergen
Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices
On Thu, Sep 21, 2023 at 07:48:36PM -0300, Jason Gunthorpe wrote: > On Thu, Sep 21, 2023 at 04:16:25PM -0400, Michael S. Tsirkin wrote: > > On Thu, Sep 21, 2023 at 04:53:45PM -0300, Jason Gunthorpe wrote: > > > On Thu, Sep 21, 2023 at 03:34:03PM -0400, Michael S. Tsirkin wrote: > > > > > > > that's easy/practical. If instead VDPA gives the same speed with just > > > > shadow vq then keeping this hack in vfio seems like less of a problem. > > > > Finally if VDPA is faster then maybe you will reconsider using it ;) > > > > > > It is not all about the speed. > > > > > > VDPA presents another large and complex software stack in the > > > hypervisor that can be eliminated by simply using VFIO. > > > > If all you want is passing through your card to guest > > then yes this can be addressed "by simply using VFIO". > > That is pretty much the goal, yes. > > > And let me give you a simple example just from this patchset: > > it assumes guest uses MSIX and just breaks if it doesn't. > > It does? Really? Where did you see that? This thing apparently: + opcode = (pos < VIRTIO_PCI_CONFIG_OFF(true)) ? + VIRTIO_ADMIN_CMD_LEGACY_COMMON_CFG_READ : + VIRTIO_ADMIN_CMD_LEGACY_DEV_CFG_READ; That "true" is supposed to be whether guest enabled MSI or not. > > > VFIO is > > > already required for other scenarios. > > > > Required ... by some people? Most VMs I run don't use anything > > outside of virtio. > > Yes, some people. The sorts of people who run large data centers. > > > It seems to deal with emulating virtio which seems more like a vdpa > > thing. > > Alex described it right, it creates an SW trapped IO bar that relays > the doorbell to an admin queue command. > > > If you start adding virtio emulation to vfio then won't > > you just end up with another vdpa? And if no why not? > > And I don't buy the "we already invested in this vfio based solution", > > sorry - that's not a reason upstream has to maintain it. > > I think you would be well justified to object to actual mediation, > like processing queues in VFIO or otherwise complex things. This mediation is kind of smallish, I agree. Not completely devoid of logic though. > Fortunately there is no need to do that with DPU HW. The legacy IO BAR > is a weird quirk that just cannot be done without a software trap, and > the OASIS standardization effort was for exactly this kind of > simplistic transformation. > > I also don't buy the "upstream has to maintain it" line. The team that > submitted it will maintain it just fine, thank you. it will require maintainance effort when virtio changes are made. For example it pokes at the device state - I don't see specific races right now but in the past we did e.g. reset the device to recover from errors and we might start doing it again. If more of the logic is under virtio directory where we'll remember to keep it in loop, and will be able to reuse it from vdpa down the road, I would be more sympathetic. -- MST ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH vfio 10/11] vfio/virtio: Expose admin commands over virtio device
On Thu, Sep 21, 2023 at 03:40:39PM +0300, Yishai Hadas wrote: > Expose admin commands over the virtio device, to be used by the > vfio-virtio driver in the next patches. > > It includes: list query/use, legacy write/read, read notify_info. > > Signed-off-by: Yishai Hadas This stuff is pure virtio spec. I think it should live under drivers/virtio, too. > --- > drivers/vfio/pci/virtio/cmd.c | 146 ++ > drivers/vfio/pci/virtio/cmd.h | 27 +++ > 2 files changed, 173 insertions(+) > create mode 100644 drivers/vfio/pci/virtio/cmd.c > create mode 100644 drivers/vfio/pci/virtio/cmd.h > > diff --git a/drivers/vfio/pci/virtio/cmd.c b/drivers/vfio/pci/virtio/cmd.c > new file mode 100644 > index ..f068239cdbb0 > --- /dev/null > +++ b/drivers/vfio/pci/virtio/cmd.c > @@ -0,0 +1,146 @@ > +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB > +/* > + * Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved > + */ > + > +#include "cmd.h" > + > +int virtiovf_cmd_list_query(struct pci_dev *pdev, u8 *buf, int buf_size) > +{ > + struct virtio_device *virtio_dev = virtio_pci_vf_get_pf_dev(pdev); > + struct scatterlist out_sg; > + struct virtio_admin_cmd cmd = {}; > + > + if (!virtio_dev) > + return -ENOTCONN; > + > + sg_init_one(&out_sg, buf, buf_size); > + cmd.opcode = VIRTIO_ADMIN_CMD_LIST_QUERY; > + cmd.group_type = VIRTIO_ADMIN_GROUP_TYPE_SRIOV; > + cmd.result_sg = &out_sg; > + > + return virtio_admin_cmd_exec(virtio_dev, &cmd); > +} > + > +int virtiovf_cmd_list_use(struct pci_dev *pdev, u8 *buf, int buf_size) > +{ > + struct virtio_device *virtio_dev = virtio_pci_vf_get_pf_dev(pdev); > + struct scatterlist in_sg; > + struct virtio_admin_cmd cmd = {}; > + > + if (!virtio_dev) > + return -ENOTCONN; > + > + sg_init_one(&in_sg, buf, buf_size); > + cmd.opcode = VIRTIO_ADMIN_CMD_LIST_USE; > + cmd.group_type = VIRTIO_ADMIN_GROUP_TYPE_SRIOV; > + cmd.data_sg = &in_sg; > + > + return virtio_admin_cmd_exec(virtio_dev, &cmd); > +} > + > +int virtiovf_cmd_lr_write(struct virtiovf_pci_core_device *virtvdev, u16 > opcode, > + u8 offset, u8 size, u8 *buf) > +{ > + struct virtio_device *virtio_dev = > + virtio_pci_vf_get_pf_dev(virtvdev->core_device.pdev); > + struct virtio_admin_cmd_data_lr_write *in; > + struct scatterlist in_sg; > + struct virtio_admin_cmd cmd = {}; > + int ret; > + > + if (!virtio_dev) > + return -ENOTCONN; > + > + in = kzalloc(sizeof(*in) + size, GFP_KERNEL); > + if (!in) > + return -ENOMEM; > + > + in->offset = offset; > + memcpy(in->registers, buf, size); > + sg_init_one(&in_sg, in, sizeof(*in) + size); > + cmd.opcode = opcode; > + cmd.group_type = VIRTIO_ADMIN_GROUP_TYPE_SRIOV; > + cmd.group_member_id = virtvdev->vf_id + 1; > + cmd.data_sg = &in_sg; > + ret = virtio_admin_cmd_exec(virtio_dev, &cmd); > + > + kfree(in); > + return ret; > +} > + > +int virtiovf_cmd_lr_read(struct virtiovf_pci_core_device *virtvdev, u16 > opcode, > + u8 offset, u8 size, u8 *buf) > +{ > + struct virtio_device *virtio_dev = > + virtio_pci_vf_get_pf_dev(virtvdev->core_device.pdev); > + struct virtio_admin_cmd_data_lr_read *in; > + struct scatterlist in_sg, out_sg; > + struct virtio_admin_cmd cmd = {}; > + int ret; > + > + if (!virtio_dev) > + return -ENOTCONN; > + > + in = kzalloc(sizeof(*in), GFP_KERNEL); > + if (!in) > + return -ENOMEM; > + > + in->offset = offset; > + sg_init_one(&in_sg, in, sizeof(*in)); > + sg_init_one(&out_sg, buf, size); > + cmd.opcode = opcode; > + cmd.group_type = VIRTIO_ADMIN_GROUP_TYPE_SRIOV; > + cmd.data_sg = &in_sg; > + cmd.result_sg = &out_sg; > + cmd.group_member_id = 
virtvdev->vf_id + 1; > + ret = virtio_admin_cmd_exec(virtio_dev, &cmd); > + > + kfree(in); > + return ret; > +} > + > +int virtiovf_cmd_lq_read_notify(struct virtiovf_pci_core_device *virtvdev, > + u8 req_bar_flags, u8 *bar, u64 *bar_offset) > +{ > + struct virtio_device *virtio_dev = > + virtio_pci_vf_get_pf_dev(virtvdev->core_device.pdev); > + struct virtio_admin_cmd_notify_info_result *out; > + struct scatterlist out_sg; > + struct virtio_admin_cmd cmd = {}; > + int ret; > + > + if (!virtio_dev) > + return -ENOTCONN; > + > + out = kzalloc(sizeof(*out), GFP_KERNEL); > + if (!out) > + return -ENOMEM; > + > + sg_init_one(&out_sg, out, sizeof(*out)); > + cmd.opcode = VIRTIO_ADMIN_CMD_LEGACY_NOTIFY_INFO; > + cmd.group_type = VIRTIO_ADMIN_GROUP_TYPE_SRIOV; > + cmd.result_sg = &out_sg; > + cmd.group_member_id = virtvdev->vf_id + 1; > + ret = virtio_admin_cmd_exec(virtio_dev, &cmd); > + if (!ret)
Re: [PATCH net 5/6] virtio-net: fix the vq coalescing setting for vq resize
On Fri, 22 Sep 2023 15:32:39 +0800, Jason Wang wrote:
> On Fri, Sep 22, 2023 at 1:02 PM Heng Qi wrote:
> >
> > On 2023/9/22 12:29 PM, Jason Wang wrote:
> > > On Tue, Sep 19, 2023 at 3:49 PM Heng Qi wrote:
> > >> According to the definition of virtqueue coalescing spec[1]:
> > >>
> > >>    Upon disabling and re-enabling a transmit virtqueue, the device MUST set
> > >>    the coalescing parameters of the virtqueue to those configured through the
> > >>    VIRTIO_NET_CTRL_NOTF_COAL_TX_SET command, or, if the driver did not set
> > >>    any TX coalescing parameters, to 0.
> > >>
> > >>    Upon disabling and re-enabling a receive virtqueue, the device MUST set
> > >>    the coalescing parameters of the virtqueue to those configured through the
> > >>    VIRTIO_NET_CTRL_NOTF_COAL_RX_SET command, or, if the driver did not set
> > >>    any RX coalescing parameters, to 0.
> > >>
> > >> We need to add this setting for vq resize (ethtool -G) where vq_reset happens.
> > >>
> > >> [1] https://lists.oasis-open.org/archives/virtio-dev/202303/msg00415.html
> > >>
> > >> Fixes: 394bd87764b6 ("virtio_net: support per queue interrupt coalesce command")
> > > I'm not sure this is a real fix as spec allows it to go zero?
> >
> > The spec says that if the user has configured interrupt coalescing parameters,
> > the parameters need to be restored after vq_reset, otherwise set to 0.
> > vi->intr_coal_tx and vi->intr_coal_rx always save the newest global parameters,
> > regardless of whether the command is sent or not. So I think we need this patch;
> > it complies with the specification requirements.
>
> How can we make sure the old coalescing parameters still make sense
> for the new ring size?

For the user, I don't think we should drop the coalescing configuration.
Maybe the configuration does not make sense for the new ring size, but if
it simply goes missing when the user only changes the ring size, that is
not good.

Thanks.

> Thanks
>
> > Thanks!
> >
> > > Thanks
Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices
On Thu, Sep 21, 2023 at 03:40:40PM +0300, Yishai Hadas wrote: > Introduce a vfio driver over virtio devices to support the legacy > interface functionality for VFs. > > Background, from the virtio spec [1]. > > In some systems, there is a need to support a virtio legacy driver with > a device that does not directly support the legacy interface. In such > scenarios, a group owner device can provide the legacy interface > functionality for the group member devices. The driver of the owner > device can then access the legacy interface of a member device on behalf > of the legacy member device driver. > > For example, with the SR-IOV group type, group members (VFs) can not > present the legacy interface in an I/O BAR in BAR0 as expected by the > legacy pci driver. If the legacy driver is running inside a virtual > machine, the hypervisor executing the virtual machine can present a > virtual device with an I/O BAR in BAR0. The hypervisor intercepts the > legacy driver accesses to this I/O BAR and forwards them to the group > owner device (PF) using group administration commands. > > > Specifically, this driver adds support for a virtio-net VF to be exposed > as a transitional device to a guest driver and allows the legacy IO BAR > functionality on top. > > This allows a VM which uses a legacy virtio-net driver in the guest to > work transparently over a VF which its driver in the host is that new > driver. > > The driver can be extended easily to support some other types of virtio > devices (e.g virtio-blk), by adding in a few places the specific type > properties as was done for virtio-net. > > For now, only the virtio-net use case was tested and as such we introduce > the support only for such a device. > > Practically, > Upon probing a VF for a virtio-net device, in case its PF supports > legacy access over the virtio admin commands and the VF doesn't have BAR > 0, we set some specific 'vfio_device_ops' to be able to simulate in SW a > transitional device with I/O BAR in BAR 0. > > The existence of the simulated I/O bar is reported later on by > overwriting the VFIO_DEVICE_GET_REGION_INFO command and the device > exposes itself as a transitional device by overwriting some properties > upon reading its config space. > > Once we report the existence of I/O BAR as BAR 0 a legacy driver in the > guest may use it via read/write calls according to the virtio > specification. > > Any read/write towards the control parts of the BAR will be captured by > the new driver and will be translated into admin commands towards the > device. > > Any data path read/write access (i.e. virtio driver notifications) will > be forwarded to the physical BAR which its properties were supplied by > the command VIRTIO_PCI_QUEUE_NOTIFY upon the probing/init flow. > > With that code in place a legacy driver in the guest has the look and > feel as if having a transitional device with legacy support for both its > control and data path flows. 
> > [1] > https://github.com/oasis-tcs/virtio-spec/commit/03c2d32e5093ca9f2a17797242fbef88efe94b8c > > Signed-off-by: Yishai Hadas > --- > MAINTAINERS | 6 + > drivers/vfio/pci/Kconfig | 2 + > drivers/vfio/pci/Makefile| 2 + > drivers/vfio/pci/virtio/Kconfig | 15 + > drivers/vfio/pci/virtio/Makefile | 4 + > drivers/vfio/pci/virtio/cmd.c| 4 +- > drivers/vfio/pci/virtio/cmd.h| 8 + > drivers/vfio/pci/virtio/main.c | 546 +++ > 8 files changed, 585 insertions(+), 2 deletions(-) > create mode 100644 drivers/vfio/pci/virtio/Kconfig > create mode 100644 drivers/vfio/pci/virtio/Makefile > create mode 100644 drivers/vfio/pci/virtio/main.c > > diff --git a/MAINTAINERS b/MAINTAINERS > index bf0f54c24f81..5098418c8389 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -22624,6 +22624,12 @@ L: k...@vger.kernel.org > S: Maintained > F: drivers/vfio/pci/mlx5/ > > +VFIO VIRTIO PCI DRIVER > +M: Yishai Hadas > +L: k...@vger.kernel.org > +S: Maintained > +F: drivers/vfio/pci/virtio > + > VFIO PCI DEVICE SPECIFIC DRIVERS > R: Jason Gunthorpe > R: Yishai Hadas > diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig > index 8125e5f37832..18c397df566d 100644 > --- a/drivers/vfio/pci/Kconfig > +++ b/drivers/vfio/pci/Kconfig > @@ -65,4 +65,6 @@ source "drivers/vfio/pci/hisilicon/Kconfig" > > source "drivers/vfio/pci/pds/Kconfig" > > +source "drivers/vfio/pci/virtio/Kconfig" > + > endmenu > diff --git a/drivers/vfio/pci/Makefile b/drivers/vfio/pci/Makefile > index 45167be462d8..046139a4eca5 100644 > --- a/drivers/vfio/pci/Makefile > +++ b/drivers/vfio/pci/Makefile > @@ -13,3 +13,5 @@ obj-$(CONFIG_MLX5_VFIO_PCI) += mlx5/ > obj-$(CONFIG_HISI_ACC_VFIO_PCI) += hisilicon/ > > obj-$(CONFIG_PDS_VFIO_PCI) += pds/ > + > +obj-$(CO
Re: Weird locking hierarchy in virtcrypto
On Fri, Sep 22, 2023 at 10:36:59AM +0200, Juergen Gross wrote:
> While backporting an upstream patch to one of our older distro kernels I've
> found a weird locking hierarchy in the virtio crypto driver:
>
> virtio_config_changed():
>   spin_lock_irqsave()
>   ->__virtio_config_changed()
>     ->drv->config_changed()        <- assume drv == virtio_crypto_driver
>       virtcrypto_config_changed():
>       ->virtcrypto_update_status()
>         ->virtcrypto_dev_start()
>           ->virtio_crypto_skcipher_algs_register()
>             mutex_lock()
>
> I don't think trying to take a mutex while holding a spinlock with interrupts
> disabled is a proper thing to do.
>
> Am I overlooking something here?
>
>
> Juergen

Yes - this seems like a bug in virtcrypto - for example, virtio-blk queues a
work item for this reason. It's so common that it might make sense to move
the config work into virtio core.

In any case, patches welcome.

-- 
MST
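The virtio-blk pattern referred to above is, roughly, to defer the status handling to process context so that registering algorithms (which takes a mutex) no longer happens under virtio_config_changed()'s spinlock. A minimal sketch, assuming a config_work member is added to struct virtio_crypto, and not actual driver code:

static void virtcrypto_config_changed_work(struct work_struct *work)
{
	struct virtio_crypto *vcrypto =
		container_of(work, struct virtio_crypto, config_work);

	/* Runs in process context, so taking mutexes is fine here. */
	virtcrypto_update_status(vcrypto);
}

static void virtcrypto_config_changed(struct virtio_device *vdev)
{
	struct virtio_crypto *vcrypto = vdev->priv;

	/* Called under a spinlock with interrupts disabled: only schedule. */
	schedule_work(&vcrypto->config_work);
}

The work item would be initialized with INIT_WORK() in the probe path and cancelled on remove.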
Re: [PATCH v14 30/42] virtio_pci: introduce helper to get/set queue reset
On Fri, Sep 22, 2023 at 09:49:18AM +0800, Xuan Zhuo wrote: > On Thu, 21 Sep 2023 10:02:53 -0400, "Michael S. Tsirkin" > wrote: > > On Mon, Aug 01, 2022 at 02:38:50PM +0800, Xuan Zhuo wrote: > > > Introduce new helpers to implement queue reset and get queue reset > > > status. > > > > > > https://github.com/oasis-tcs/virtio-spec/issues/124 > > > https://github.com/oasis-tcs/virtio-spec/issues/139 > > > > > > Signed-off-by: Xuan Zhuo > > > Acked-by: Jason Wang > > > --- > > > drivers/virtio/virtio_pci_modern_dev.c | 39 ++ > > > include/linux/virtio_pci_modern.h | 2 ++ > > > 2 files changed, 41 insertions(+) > > > > > > diff --git a/drivers/virtio/virtio_pci_modern_dev.c > > > b/drivers/virtio/virtio_pci_modern_dev.c > > > index fa2a9445bb18..869cb46bef96 100644 > > > --- a/drivers/virtio/virtio_pci_modern_dev.c > > > +++ b/drivers/virtio/virtio_pci_modern_dev.c > > > @@ -3,6 +3,7 @@ > > > #include > > > #include > > > #include > > > +#include > > > > > > /* > > > * vp_modern_map_capability - map a part of virtio pci capability > > > @@ -474,6 +475,44 @@ void vp_modern_set_status(struct > > > virtio_pci_modern_device *mdev, > > > } > > > EXPORT_SYMBOL_GPL(vp_modern_set_status); > > > > > > +/* > > > + * vp_modern_get_queue_reset - get the queue reset status > > > + * @mdev: the modern virtio-pci device > > > + * @index: queue index > > > + */ > > > +int vp_modern_get_queue_reset(struct virtio_pci_modern_device *mdev, u16 > > > index) > > > +{ > > > + struct virtio_pci_modern_common_cfg __iomem *cfg; > > > + > > > + cfg = (struct virtio_pci_modern_common_cfg __iomem *)mdev->common; > > > + > > > + vp_iowrite16(index, &cfg->cfg.queue_select); > > > + return vp_ioread16(&cfg->queue_reset); > > > +} > > > +EXPORT_SYMBOL_GPL(vp_modern_get_queue_reset); > > > + > > > > Actually, this does not validate that the config structure is big > > enough. So it can access some unrelated memory. Don't know whether > > that's exploitable e.g. for CoCo but not nice, anyway. > > Need to validate the size and disable reset if it's too small. > > > static int vp_modern_disable_vq_and_reset(struct virtqueue *vq) > { > struct virtio_pci_device *vp_dev = to_vp_device(vq->vdev); > struct virtio_pci_modern_device *mdev = &vp_dev->mdev; > struct virtio_pci_vq_info *info; > unsigned long flags; > > ->if (!virtio_has_feature(vq->vdev, VIRTIO_F_RING_RESET)) > return -ENOENT; > > vp_modern_set_queue_reset(mdev, vq->index); > > > I checked VIRTIO_F_RING_RESET before call this. Yes but the point is that virtio is used with untrusted devices (e.g. for SEV/TDX), so you can't really assume config structures are in sync with feature bits. > Do you mean, we should put the check to this function. > > > Thanks. 
> > > > > > > > > > +/* > > > + * vp_modern_set_queue_reset - reset the queue > > > + * @mdev: the modern virtio-pci device > > > + * @index: queue index > > > + */ > > > +void vp_modern_set_queue_reset(struct virtio_pci_modern_device *mdev, > > > u16 index) > > > +{ > > > + struct virtio_pci_modern_common_cfg __iomem *cfg; > > > + > > > + cfg = (struct virtio_pci_modern_common_cfg __iomem *)mdev->common; > > > + > > > + vp_iowrite16(index, &cfg->cfg.queue_select); > > > + vp_iowrite16(1, &cfg->queue_reset); > > > + > > > + while (vp_ioread16(&cfg->queue_reset)) > > > + msleep(1); > > > + > > > + while (vp_ioread16(&cfg->cfg.queue_enable)) > > > + msleep(1); > > > +} > > > +EXPORT_SYMBOL_GPL(vp_modern_set_queue_reset); > > > + > > > /* > > > * vp_modern_queue_vector - set the MSIX vector for a specific virtqueue > > > * @mdev: the modern virtio-pci device > > > diff --git a/include/linux/virtio_pci_modern.h > > > b/include/linux/virtio_pci_modern.h > > > index 05123b9a606f..c4eeb79b0139 100644 > > > --- a/include/linux/virtio_pci_modern.h > > > +++ b/include/linux/virtio_pci_modern.h > > > @@ -113,4 +113,6 @@ void __iomem * vp_modern_map_vq_notify(struct > > > virtio_pci_modern_device *mdev, > > > u16 index, resource_size_t *pa); > > > int vp_modern_probe(struct virtio_pci_modern_device *mdev); > > > void vp_modern_remove(struct virtio_pci_modern_device *mdev); > > > +int vp_modern_get_queue_reset(struct virtio_pci_modern_device *mdev, u16 > > > index); > > > +void vp_modern_set_queue_reset(struct virtio_pci_modern_device *mdev, > > > u16 index); > > > #endif > > > -- > > > 2.31.0 > > ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
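Concretely, the size validation being asked for is roughly the check below: before touching queue_reset, verify that the mapped common config region is large enough to contain the field, and treat VIRTIO_F_RING_RESET as unsupported otherwise. The common_len member is an assumption here (something recording the length actually mapped in vp_modern_probe()), not an existing field.

/* Sketch: only trust queue_reset if the device's common cfg capability is
 * big enough to contain it. "common_len" is assumed to hold the length of
 * the region mapped for mdev->common.
 */
static bool vp_modern_has_queue_reset_cfg(struct virtio_pci_modern_device *mdev)
{
	return mdev->common_len >=
	       offsetofend(struct virtio_pci_modern_common_cfg, queue_reset);
}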
Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices
On Thu, Sep 21, 2023 at 07:55:26PM -0300, Jason Gunthorpe wrote: > On Thu, Sep 21, 2023 at 04:45:45PM -0400, Michael S. Tsirkin wrote: > > On Thu, Sep 21, 2023 at 04:49:46PM -0300, Jason Gunthorpe wrote: > > > On Thu, Sep 21, 2023 at 03:13:10PM -0400, Michael S. Tsirkin wrote: > > > > On Thu, Sep 21, 2023 at 03:39:26PM -0300, Jason Gunthorpe wrote: > > > > > On Thu, Sep 21, 2023 at 12:53:04PM -0400, Michael S. Tsirkin wrote: > > > > > > > vdpa is not vfio, I don't know how you can suggest vdpa is a > > > > > > > replacement for a vfio driver. They are completely different > > > > > > > things. > > > > > > > Each side has its own strengths, and vfio especially is > > > > > > > accelerating > > > > > > > in its capability in way that vpda is not. eg if an iommufd > > > > > > > conversion > > > > > > > had been done by now for vdpa I might be more sympathetic. > > > > > > > > > > > > Yea, I agree iommufd is a big problem with vdpa right now. Cindy was > > > > > > sick and I didn't know and kept assuming she's working on this. I > > > > > > don't > > > > > > think it's a huge amount of work though. I'll take a look. > > > > > > Is there anything else though? Do tell. > > > > > > > > > > Confidential compute will never work with VDPA's approach. > > > > > > > > I don't see how what this patchset is doing is different > > > > wrt to Confidential compute - you trap IO accesses and emulate. > > > > Care to elaborate? > > > > > > This patch series isn't about confidential compute, you asked about > > > the future. VFIO will support confidential compute in the future, VDPA > > > will not. > > > > Nonsense it already works. > > That isn't what I'm talking about. With a real PCI function and TDISP > we can actually DMA directly from the guest's memory without needing > the ugly bounce buffer hack. Then you can get decent performance. Aha, TDISP. But that one clearly does not need and can not use this kind of hack? > > But I did not ask about the future since I do not believe it > > can be confidently predicted. I asked what is missing in VDPA > > now for you to add this feature there and not in VFIO. > > I don't see that VDPA needs this, VDPA should process the IO BAR on > its own with its own logic, just like everything else it does. First there's some logic here such as translating legacy IO offsets to modern ones that could be reused. But also, this is not just IO BAR, that indeed can be easily done in software. When a device operates in legacy mode there are subtle differences with modern mode such as a different header size for the net device. > This is specifically about avoiding mediation by relaying directly the > IO BAR operations to the device itself. > > That is the entire irony, this whole scheme was designed and > standardized *specifically* to avoid complex mediation and here you > are saying we should just use mediation. > > Jason Not exactly. What I had in mind is just having the logic in the vdpa module so users don't need to know what does the device support and what it doesn't. If we can we bypass mediation (to simplify the software stack) if we can not we do not. Looking at it from user's POV, it is just super confusing that card ABC would need to be used with VDPA to drive legacy while card DEF needs to be used with VFIO. And both VFIO and VDPA will happily bind, too. Oh man ... -- MST ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
RE: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices
> From: Jason Gunthorpe
> Sent: Friday, September 22, 2023 5:53 PM
>
> > And what's more, using MMIO BAR0 then it can work for legacy.
>
> Oh? How? Our team didn't think so.

It does not. It was already discussed.
The device reset in legacy is not synchronous.
The drivers do not wait for reset to complete; it was written for the sw backend.
Hence MMIO BAR0 is not the best option in real implementations.
Re: [PATCH v2 1/2] iommu/virtio: Make use of ops->iotlb_sync_map
On Fri, Sep 22, 2023 at 08:57:19AM +0100, Jean-Philippe Brucker wrote: > > > They're not strictly equivalent: this check works around a temporary issue > > > with the IOMMU core, which calls map/unmap before the domain is > > > finalized. > > > > Where? The above points to iommu_create_device_direct_mappings() but > > it doesn't because the pgsize_bitmap == 0: > > __iommu_domain_alloc() sets pgsize_bitmap in this case: > > /* > * If not already set, assume all sizes by default; the driver > * may override this later > */ > if (!domain->pgsize_bitmap) > domain->pgsize_bitmap = bus->iommu_ops->pgsize_bitmap; Dirver's shouldn't do that. The core code was fixed to try again with mapping reserved regions to support these kinds of drivers. Jason ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
RE: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices
> From: Jason Gunthorpe > Sent: Friday, September 22, 2023 6:07 PM > > On Thu, Sep 21, 2023 at 01:58:32PM -0600, Alex Williamson wrote: > > > If the heart of this driver is simply pretending to have an I/O BAR > > where I/O accesses into that BAR are translated to accesses in the > > MMIO BAR, why can't this be done in the VMM, ie. QEMU? > > That isn't exactly what it does, the IO bar access is translated into an admin > queue command on the PF and excuted by the PCI function. > > So it would be difficult to do that in qemu without also somehow wiring up > qemu to access the PF's kernel driver's admin queue. > > It would have been nice if it was a trivial 1:1 translation to the MMIO bar, > but it > seems that didn't entirely work with existing VMs. So OASIS standardized this > approach. > > The bigger picture is there is also a live migration standard & driver in the > works that will re-use all this admin queue infrastructure anyhow, so the best > course is to keep this in the kernel. Additionally in the future the AQ of the PF will also be used to provision the VFs (virtio OASIS calls them member devices), such framework also resides in the kernel. Such PFs are in use by the kernel driver. +1 for keeping this framework in the kernel. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v2 1/2] iommu/virtio: Make use of ops->iotlb_sync_map
On 22/09/2023 1:41 pm, Jason Gunthorpe wrote: On Fri, Sep 22, 2023 at 08:57:19AM +0100, Jean-Philippe Brucker wrote: They're not strictly equivalent: this check works around a temporary issue with the IOMMU core, which calls map/unmap before the domain is finalized. Where? The above points to iommu_create_device_direct_mappings() but it doesn't because the pgsize_bitmap == 0: __iommu_domain_alloc() sets pgsize_bitmap in this case: /* * If not already set, assume all sizes by default; the driver * may override this later */ if (!domain->pgsize_bitmap) domain->pgsize_bitmap = bus->iommu_ops->pgsize_bitmap; Dirver's shouldn't do that. The core code was fixed to try again with mapping reserved regions to support these kinds of drivers. This is still the "normal" code path, really; I think it's only AMD that started initialising the domain bitmap "early" and warranted making it conditional. However we *do* ultimately want all the drivers to do the same, so we can get rid of ops->pgsize_bitmap, because it's already pretty redundant and meaningless in the face of per-domain pagetable formats. Thanks, Robin. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
BUG: virtcrypto_dataq_callback calls crypto_finalize_request() from irq context
Hi Gonglei! Our CI has found that virtio-crypto does not honor the requirement of crypto_finalize_request() being called in softirq context which is asserted in that function via lockdep_assert_in_softirq() since commit 4058cf08945c ("crypto: engine - check if BH is disabled during completion"). The problem was originally found on s390x but Marc Hartmayer was so kind to reproduce it on amd64. Please find the corresponding kernel messages at the end of this email. The call chain looks like this. interrupt handler for queue notification --> virtcrypto_dataq_callback() --> via vc_req->alg_cb either virtio_crypto_skcipher_finalize_req() or virtio_crypto_akcipher_finalize_req() --> crypto_finalize_skcipher_request() or crypto_finalize_akcipher_request() --> crypto_finalize_request() Everything above is happening in the interrupt handler (and in "hard" irq context). I'm not really familiar with the implementation of virtio_crypto or with the crypto_engine interfaces. I assume the problem is on the side of virtio-crypto so I would like to kindly ask you as the maintainer of virtio-crypt to have a look at it. But if you think it is rather in the crypto_engine, please clarify that with Herbert. I have no strong opinion on this issue. Regards, Halil [ 31.033415][ C0] WARNING: CPU: 0 PID: 136 at crypto/crypto_engine.c:58 crypto_finalize_request (crypto/crypto_engine.c:58 (discriminator 23)) crypto_engine [ 31.034131][C0] Modules linked in: virtio_crypto(+) vmw_vsock_virtio_transport_common(+) crypto_engine vsock [ 31.035326][C0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014 [ 31.035917][ C0] RIP: 0010:crypto_finalize_request (crypto/crypto_engine.c:58 (discriminator 23)) crypto_engine [ 31.036398][ C0] Code: 08 5b 5d 41 5c 41 5d e9 bf 88 1c c1 65 8b 05 b0 36 01 40 f6 c4 ff 74 12 a9 00 00 0f 00 75 0b a9 00 00 f0 00 0f 84 54 ff ff ff <0f> 0b e9 4d ff ff ff 4c 8d 6b 38 4c 89 ef e8 8e 47 1b c4 48 8d bb All code 0: 08 5b 5dor %bl,0x5d(%rbx) 3: 41 5c pop%r12 5: 41 5d pop%r13 7: e9 bf 88 1c c1 jmp0xc11c88cb c: 65 8b 05 b0 36 01 40mov%gs:0x400136b0(%rip),%eax# 0x400136c3 13: f6 c4 fftest $0xff,%ah 16: 74 12 je 0x2a 18: a9 00 00 0f 00 test $0xf,%eax 1d: 75 0b jne0x2a 1f: a9 00 00 f0 00 test $0xf0,%eax 24: 0f 84 54 ff ff ff je 0xff7e 2a:* 0f 0b ud2 <-- trapping instruction 2c: e9 4d ff ff ff jmp0xff7e 31: 4c 8d 6b 38 lea0x38(%rbx),%r13 35: 4c 89 efmov%r13,%rdi 38: e8 8e 47 1b c4 call 0xc41b47cb 3d: 48 rex.W 3e: 8d .byte 0x8d 3f: bb .byte 0xbb Code starting with the faulting instruction === 0: 0f 0b ud2 2: e9 4d ff ff ff jmp0xff54 7: 4c 8d 6b 38 lea0x38(%rbx),%r13 b: 4c 89 efmov%r13,%rdi e: e8 8e 47 1b c4 call 0xc41b47a1 13: 48 rex.W 14: 8d .byte 0x8d 15: bb .byte 0xbb [ 31.037591][C0] RSP: 0018:c9007da0 EFLAGS: 00010046 [ 31.037976][C0] RAX: 80010002 RBX: 888006c87428 RCX: 10c0e523 [ 31.038471][C0] RDX: RSI: 88810d0819e8 RDI: 888006c87449 [ 31.038967][C0] RBP: 88810d0819e8 R08: R09: fbfff0b04f04 [ 31.039463][C0] R10: 85827823 R11: 842013e6 R12: [ 31.039963][C0] R13: 0001 R14: 88810d081a18 R15: dc00 [ 31.040475][C0] FS: 7f80c0cc6800() GS:88811ae0() knlGS: [ 31.041058][C0] CS: 0010 DS: ES: CR0: 80050033 [ 31.041473][C0] CR2: 7f22ad455270 CR3: 000106b22000 CR4: 06f0 [ 31.042024][C0] Call Trace: [ 31.042250][C0] [ 31.042433][ C0] ? __warn (kernel/panic.c:673) [ 31.042710][ C0] ? crypto_finalize_request (crypto/crypto_engine.c:58 (discriminator 23)) crypto_engine [ 31.043161][ C0] ? report_bug (lib/bug.c:180 lib/bug.c:219) [ 31.043451][ C0] ? 
handle_bug (arch/x86/kernel/traps.c:237 (discriminator 1)) [ 31.043728][ C0] ? exc_invalid_op (arch/x86/kernel/traps.c:258 (discriminator 1)) [ 31.044039][ C0] ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:568) [ 31.044385][ C0] ? asm_common_interrupt (./arch/x86/include/asm/idtentry.h:636) [ 31.044746][ C0] ? crypto_finalize_request (crypto/crypto_engine.c:58 (discriminator 23)) crypto_engine [ 31.045202][
Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices
On Fri, Sep 22, 2023 at 12:25:06PM +, Parav Pandit wrote: > > > From: Jason Gunthorpe > > Sent: Friday, September 22, 2023 5:53 PM > > > > > And what's more, using MMIO BAR0 then it can work for legacy. > > > > Oh? How? Our team didn't think so. > > It does not. It was already discussed. > The device reset in legacy is not synchronous. > The drivers do not wait for reset to complete; it was written for the sw > backend. > Hence MMIO BAR0 is not the best option in real implementations. Or maybe they made it synchronous in hardware, that's all. After all same is true for the IO BAR0 e.g. for the PF: IO writes are posted anyway. Whether that's possible would depend on the hardware architecture. -- MST ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices
On Fri, Sep 22, 2023 at 09:25:01AM -0300, Jason Gunthorpe wrote: > On Fri, Sep 22, 2023 at 11:02:50AM +0800, Jason Wang wrote: > > On Fri, Sep 22, 2023 at 3:53 AM Jason Gunthorpe wrote: > > > > > > On Thu, Sep 21, 2023 at 03:34:03PM -0400, Michael S. Tsirkin wrote: > > > > > > > that's easy/practical. If instead VDPA gives the same speed with just > > > > shadow vq then keeping this hack in vfio seems like less of a problem. > > > > Finally if VDPA is faster then maybe you will reconsider using it ;) > > > > > > It is not all about the speed. > > > > > > VDPA presents another large and complex software stack in the > > > hypervisor that can be eliminated by simply using VFIO. > > > > vDPA supports standard virtio devices so how did you define > > complexity? > > As I said, VFIO is already required for other devices in these VMs. So > anything incremental over base-line vfio-pci is complexity to > minimize. > > Everything vdpa does is either redundant or unnecessary compared to > VFIO in these environments. > > Jason Yes but you know. There are all kind of environments. I guess you consider yours the most mainstream and important, and are sure it will always stay like this. But if there's a driver that does what you need then you use that. You really should be explaining what vdpa *does not* do that you need. But anyway, if Alex wants to maintain this it's not too bad, but I would like to see more code move into a library living under the virtio directory. As it is structured now it will make virtio core development harder. -- MST ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices
On Fri, Sep 22, 2023 at 12:15:34PM -0300, Jason Gunthorpe wrote:
> On Fri, Sep 22, 2023 at 11:13:18AM -0400, Michael S. Tsirkin wrote:
> > On Fri, Sep 22, 2023 at 12:25:06PM +0000, Parav Pandit wrote:
> > >
> > > > From: Jason Gunthorpe
> > > > Sent: Friday, September 22, 2023 5:53 PM
> > > >
> > > > > And what's more, using MMIO BAR0 then it can work for legacy.
> > > >
> > > > Oh? How? Our team didn't think so.
> > >
> > > It does not. It was already discussed.
> > > The device reset in legacy is not synchronous.
> > > The drivers do not wait for reset to complete; it was written for the sw backend.
> > > Hence MMIO BAR0 is not the best option in real implementations.
> >
> > Or maybe they made it synchronous in hardware, that's all.
> > After all same is true for the IO BAR0 e.g. for the PF: IO writes
> > are posted anyway.
>
> IO writes are not posted in PCI.

Aha, I was confused. Thanks for the correction.
I guess you just buffer subsequent transactions while reset is going on
and reset quickly enough for it to be seamless then?

-- 
MST
Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices
On Fri, Sep 22, 2023 at 09:23:28AM -0300, Jason Gunthorpe wrote:
> On Fri, Sep 22, 2023 at 05:47:23AM -0400, Michael S. Tsirkin wrote:
> > it will require maintainance effort when virtio changes are made. For
> > example it pokes at the device state - I don't see specific races right
> > now but in the past we did e.g. reset the device to recover from errors
> > and we might start doing it again.
> >
> > If more of the logic is under virtio directory where we'll remember
> > to keep it in loop, and will be able to reuse it from vdpa
> > down the road, I would be more sympathetic.
>
> This is inevitable, the VFIO live migration driver will need all this
> infrastructure too.
>
> Jason

I am not sure what you are saying and what is inevitable. VDPA for sure
will want live migration support. I am not at all sympathetic to efforts
that want to duplicate that support for virtio under VFIO. Put it in a
library under the virtio directory, with a sane, well-documented interface.

I don't maintain VFIO and Alex can merge what he wants, but I won't merge
patches that export virtio internals in a way that will make virtio
maintenance harder.

-- 
MST
Re: [PATCH vfio 11/11] vfio/virtio: Introduce a vfio driver over virtio devices
On Thu, Sep 21, 2023 at 03:40:40PM +0300, Yishai Hadas wrote: > Introduce a vfio driver over virtio devices to support the legacy > interface functionality for VFs. > > Background, from the virtio spec [1]. > > In some systems, there is a need to support a virtio legacy driver with > a device that does not directly support the legacy interface. In such > scenarios, a group owner device can provide the legacy interface > functionality for the group member devices. The driver of the owner > device can then access the legacy interface of a member device on behalf > of the legacy member device driver. > > For example, with the SR-IOV group type, group members (VFs) can not > present the legacy interface in an I/O BAR in BAR0 as expected by the > legacy pci driver. If the legacy driver is running inside a virtual > machine, the hypervisor executing the virtual machine can present a > virtual device with an I/O BAR in BAR0. The hypervisor intercepts the > legacy driver accesses to this I/O BAR and forwards them to the group > owner device (PF) using group administration commands. > > > Specifically, this driver adds support for a virtio-net VF to be exposed > as a transitional device to a guest driver and allows the legacy IO BAR > functionality on top. > > This allows a VM which uses a legacy virtio-net driver in the guest to > work transparently over a VF which its driver in the host is that new > driver. > > The driver can be extended easily to support some other types of virtio > devices (e.g virtio-blk), by adding in a few places the specific type > properties as was done for virtio-net. > > For now, only the virtio-net use case was tested and as such we introduce > the support only for such a device. > > Practically, > Upon probing a VF for a virtio-net device, in case its PF supports > legacy access over the virtio admin commands and the VF doesn't have BAR > 0, we set some specific 'vfio_device_ops' to be able to simulate in SW a > transitional device with I/O BAR in BAR 0. > > The existence of the simulated I/O bar is reported later on by > overwriting the VFIO_DEVICE_GET_REGION_INFO command and the device > exposes itself as a transitional device by overwriting some properties > upon reading its config space. > > Once we report the existence of I/O BAR as BAR 0 a legacy driver in the > guest may use it via read/write calls according to the virtio > specification. > > Any read/write towards the control parts of the BAR will be captured by > the new driver and will be translated into admin commands towards the > device. > > Any data path read/write access (i.e. virtio driver notifications) will > be forwarded to the physical BAR which its properties were supplied by > the command VIRTIO_PCI_QUEUE_NOTIFY upon the probing/init flow. > > With that code in place a legacy driver in the guest has the look and > feel as if having a transitional device with legacy support for both its > control and data path flows. 
> > [1] > https://github.com/oasis-tcs/virtio-spec/commit/03c2d32e5093ca9f2a17797242fbef88efe94b8c > > Signed-off-by: Yishai Hadas > --- > MAINTAINERS | 6 + > drivers/vfio/pci/Kconfig | 2 + > drivers/vfio/pci/Makefile| 2 + > drivers/vfio/pci/virtio/Kconfig | 15 + > drivers/vfio/pci/virtio/Makefile | 4 + > drivers/vfio/pci/virtio/cmd.c| 4 +- > drivers/vfio/pci/virtio/cmd.h| 8 + > drivers/vfio/pci/virtio/main.c | 546 +++ > 8 files changed, 585 insertions(+), 2 deletions(-) > create mode 100644 drivers/vfio/pci/virtio/Kconfig > create mode 100644 drivers/vfio/pci/virtio/Makefile > create mode 100644 drivers/vfio/pci/virtio/main.c > > diff --git a/MAINTAINERS b/MAINTAINERS > index bf0f54c24f81..5098418c8389 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -22624,6 +22624,12 @@ L: k...@vger.kernel.org > S: Maintained > F: drivers/vfio/pci/mlx5/ > > +VFIO VIRTIO PCI DRIVER > +M: Yishai Hadas > +L: k...@vger.kernel.org > +S: Maintained > +F: drivers/vfio/pci/virtio > + > VFIO PCI DEVICE SPECIFIC DRIVERS > R: Jason Gunthorpe > R: Yishai Hadas > diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig > index 8125e5f37832..18c397df566d 100644 > --- a/drivers/vfio/pci/Kconfig > +++ b/drivers/vfio/pci/Kconfig > @@ -65,4 +65,6 @@ source "drivers/vfio/pci/hisilicon/Kconfig" > > source "drivers/vfio/pci/pds/Kconfig" > > +source "drivers/vfio/pci/virtio/Kconfig" > + > endmenu > diff --git a/drivers/vfio/pci/Makefile b/drivers/vfio/pci/Makefile > index 45167be462d8..046139a4eca5 100644 > --- a/drivers/vfio/pci/Makefile > +++ b/drivers/vfio/pci/Makefile > @@ -13,3 +13,5 @@ obj-$(CONFIG_MLX5_VFIO_PCI) += mlx5/ > obj-$(CONFIG_HISI_ACC_VFIO_PCI) += hisilicon/ > > obj-$(CONFIG_PDS_VFIO_PCI) += pds/ > + > +obj-$(CO
Re: [PATCH v2 1/2] iommu/virtio: Make use of ops->iotlb_sync_map
On Fri, Sep 22, 2023 at 02:13:18PM +0100, Robin Murphy wrote: > On 22/09/2023 1:41 pm, Jason Gunthorpe wrote: > > On Fri, Sep 22, 2023 at 08:57:19AM +0100, Jean-Philippe Brucker wrote: > > > > > They're not strictly equivalent: this check works around a temporary > > > > > issue > > > > > with the IOMMU core, which calls map/unmap before the domain is > > > > > finalized. > > > > > > > > Where? The above points to iommu_create_device_direct_mappings() but > > > > it doesn't because the pgsize_bitmap == 0: > > > > > > __iommu_domain_alloc() sets pgsize_bitmap in this case: > > > > > > /* > > > * If not already set, assume all sizes by default; the driver > > > * may override this later > > > */ > > > if (!domain->pgsize_bitmap) > > > domain->pgsize_bitmap = bus->iommu_ops->pgsize_bitmap; > > > > Dirver's shouldn't do that. > > > > The core code was fixed to try again with mapping reserved regions to > > support these kinds of drivers. > > This is still the "normal" code path, really; I think it's only AMD that > started initialising the domain bitmap "early" and warranted making it > conditional. My main point was that iommu_create_device_direct_mappings() should fail for unfinalized domains, setting pgsize_bitmap to allow it to succeed is not a nice hack, and not necessary now. What do you think about something like this to replace iommu_create_device_direct_mappings(), that does enforce things properly? static int resv_cmp(void *priv, const struct list_head *llhs, const struct list_head *lrhs) { struct iommu_resv_region *lhs = list_entry(llhs, struct iommu_resv_region, list); struct iommu_resv_region *rhs = list_entry(lrhs, struct iommu_resv_region, list); if (lhs->start == rhs->start) return 0; if (lhs->start < rhs->start) return -1; return 1; } static int iommu_create_device_direct_mappings(struct iommu_domain *domain, struct device *dev) { struct iommu_resv_region *entry; struct iommu_resv_region *tmp; struct list_head mappings; struct list_head direct; phys_addr_t cur = 0; int ret = 0; INIT_LIST_HEAD(&mappings); INIT_LIST_HEAD(&direct); iommu_get_resv_regions(dev, &mappings); list_for_each_entry_safe(entry, tmp, &mappings, list) { if (entry->type == IOMMU_RESV_DIRECT) dev->iommu->require_direct = 1; if ((domain->type & __IOMMU_DOMAIN_PAGING) && (entry->type == IOMMU_RESV_DIRECT || entry->type == IOMMU_RESV_DIRECT_RELAXABLE)) { if (domain->geometry.aperture_start > entry->start || domain->geometry.aperture_end == 0 || (domain->geometry.aperture_end - 1) < (entry->start + entry->length - 1)) { ret = -EINVAL; goto out; } list_move(&entry->list, &direct); } } if (list_empty(&direct)) goto out; /* * FW can have overlapping ranges, sort the list by start address * and map any duplicated IOVA only once. */ list_sort(NULL, &direct, resv_cmp); list_for_each_entry(entry, &direct, list) { phys_addr_t start_pfn = entry->start / PAGE_SIZE; phys_addr_t last_pfn = (entry->length - 1 + entry->start) / PAGE_SIZE; if (start_pfn < cur) start_pfn = cur; if (start_pfn <= last_pfn) { ret = iommu_map(domain, start_pfn * PAGE_SIZE, start_pfn * PAGE_SIZE, (last_pfn - start_pfn + 1) * PAGE_SIZE, entry->prot, GFP_KERNEL); if (ret) goto out; cur = last_pfn + 1; } } out: list_splice(&direct, &mappings); iommu_put_resv_regions(dev, &mappings); return ret; } ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
[PATCH 1/9] drm/amd/pm: Annotate struct smu10_voltage_dependency_table with __counted_by
Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have their
accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS (for
array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).

As found with Coccinelle[1], add __counted_by for struct
smu10_voltage_dependency_table.

[1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci

Cc: Evan Quan
Cc: Alex Deucher
Cc: "Christian König"
Cc: "Pan, Xinhui"
Cc: David Airlie
Cc: Daniel Vetter
Cc: Xiaojian Du
Cc: Huang Rui
Cc: Kevin Wang
Cc: amd-...@lists.freedesktop.org
Cc: dri-de...@lists.freedesktop.org
Signed-off-by: Kees Cook
---
 drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.h b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.h
index 808e0ecbe1f0..42adc2a3dcbc 100644
--- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.h
+++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.h
@@ -192,7 +192,7 @@ struct smu10_clock_voltage_dependency_record {
 
 struct smu10_voltage_dependency_table {
 	uint32_t count;
-	struct smu10_clock_voltage_dependency_record entries[];
+	struct smu10_clock_voltage_dependency_record entries[] __counted_by(count);
 };
 
 struct smu10_clock_voltage_information {
-- 
2.34.1
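As a side note on how the annotation is used (a generic illustration, not part of this patch): once entries[] is tied to count, the bounds checks compare indices against the current value of count, so code must assign the counter before it starts touching the array, typically right after allocation:

/* Illustration only: typical allocation pattern for a __counted_by() array.
 * The function name and its caller are hypothetical.
 */
static struct smu10_voltage_dependency_table *
alloc_voltage_dep_table(u32 num_entries)
{
	struct smu10_voltage_dependency_table *table;

	table = kzalloc(struct_size(table, entries, num_entries), GFP_KERNEL);
	if (!table)
		return NULL;

	/* Set the counter first so later entries[i] accesses pass the check. */
	table->count = num_entries;
	return table;
}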
[PATCH 7/9] drm/virtio: Annotate struct virtio_gpu_object_array with __counted_by
Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have their
accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS (for
array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).

As found with Coccinelle[1], add __counted_by for struct
virtio_gpu_object_array.

[1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci

Cc: David Airlie
Cc: Gerd Hoffmann
Cc: Gurchetan Singh
Cc: Chia-I Wu
Cc: Daniel Vetter
Cc: dri-de...@lists.freedesktop.org
Cc: virtualization@lists.linux-foundation.org
Signed-off-by: Kees Cook
---
 drivers/gpu/drm/virtio/virtgpu_drv.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h
index 8513b671f871..96365a772f77 100644
--- a/drivers/gpu/drm/virtio/virtgpu_drv.h
+++ b/drivers/gpu/drm/virtio/virtgpu_drv.h
@@ -119,7 +119,7 @@ struct virtio_gpu_object_array {
 	struct ww_acquire_ctx ticket;
 	struct list_head next;
 	u32 nents, total;
-	struct drm_gem_object *objs[];
+	struct drm_gem_object *objs[] __counted_by(total);
 };
 
 struct virtio_gpu_vbuffer;
-- 
2.34.1
[PATCH 6/9] drm/vc4: Annotate struct vc4_perfmon with __counted_by
Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct vc4_perfmon. [1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci Cc: Emma Anholt Cc: Maxime Ripard Cc: David Airlie Cc: Daniel Vetter Cc: dri-de...@lists.freedesktop.org Signed-off-by: Kees Cook --- drivers/gpu/drm/vc4/vc4_drv.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/vc4/vc4_drv.h b/drivers/gpu/drm/vc4/vc4_drv.h index bf66499765fb..ab61e96e7e14 100644 --- a/drivers/gpu/drm/vc4/vc4_drv.h +++ b/drivers/gpu/drm/vc4/vc4_drv.h @@ -76,7 +76,7 @@ struct vc4_perfmon { * Note that counter values can't be reset, but you can fake a reset by * destroying the perfmon and creating a new one. */ - u64 counters[]; + u64 counters[] __counted_by(ncounters); }; struct vc4_dev { -- 2.34.1 ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
[PATCH 4/9] drm/msm/dpu: Annotate struct dpu_hw_intr with __counted_by
Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct dpu_hw_intr. [1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci Cc: Rob Clark Cc: Abhinav Kumar Cc: Dmitry Baryshkov Cc: Sean Paul Cc: Marijn Suijten Cc: David Airlie Cc: Daniel Vetter Cc: Bjorn Andersson Cc: linux-arm-...@vger.kernel.org Cc: dri-de...@lists.freedesktop.org Cc: freedr...@lists.freedesktop.org Signed-off-by: Kees Cook --- drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.h index dab761e54863..50cf9523d367 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.h +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.h @@ -61,7 +61,7 @@ struct dpu_hw_intr { void (*cb)(void *arg, int irq_idx); void *arg; atomic_t count; - } irq_tbl[]; + } irq_tbl[] __counted_by(total_irqs); }; /** -- 2.34.1 ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
[PATCH 5/9] drm/nouveau/pm: Annotate struct nvkm_perfdom with __counted_by
Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct nvkm_perfdom. [1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci Cc: Ben Skeggs Cc: Karol Herbst Cc: Lyude Paul Cc: David Airlie Cc: Daniel Vetter Cc: dri-de...@lists.freedesktop.org Cc: nouv...@lists.freedesktop.org Signed-off-by: Kees Cook --- drivers/gpu/drm/nouveau/nvkm/engine/pm/priv.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/pm/priv.h b/drivers/gpu/drm/nouveau/nvkm/engine/pm/priv.h index 6ae25d3e7f45..c011227f7052 100644 --- a/drivers/gpu/drm/nouveau/nvkm/engine/pm/priv.h +++ b/drivers/gpu/drm/nouveau/nvkm/engine/pm/priv.h @@ -82,7 +82,7 @@ struct nvkm_perfdom { u8 mode; u32 clk; u16 signal_nr; - struct nvkm_perfsig signal[]; + struct nvkm_perfsig signal[] __counted_by(signal_nr); }; struct nvkm_funcdom { -- 2.34.1 ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
[PATCH 3/9] drm/i915/selftests: Annotate struct perf_series with __counted_by
Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct perf_series. [1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci Cc: Jani Nikula Cc: Joonas Lahtinen Cc: Rodrigo Vivi Cc: Tvrtko Ursulin Cc: David Airlie Cc: Daniel Vetter Cc: Chris Wilson Cc: John Harrison Cc: Andi Shyti Cc: Matthew Brost Cc: intel-...@lists.freedesktop.org Cc: dri-de...@lists.freedesktop.org Signed-off-by: Kees Cook --- drivers/gpu/drm/i915/selftests/i915_request.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/selftests/i915_request.c b/drivers/gpu/drm/i915/selftests/i915_request.c index a9b79888c193..acae30a04a94 100644 --- a/drivers/gpu/drm/i915/selftests/i915_request.c +++ b/drivers/gpu/drm/i915/selftests/i915_request.c @@ -1924,7 +1924,7 @@ struct perf_stats { struct perf_series { struct drm_i915_private *i915; unsigned int nengines; - struct intel_context *ce[]; + struct intel_context *ce[] __counted_by(nengines); }; static int cmp_u32(const void *A, const void *B) -- 2.34.1 ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
[PATCH 0/9] drm: Annotate structs with __counted_by
Hi,

This is a batch of patches touching drm in preparation for the coming
implementation by GCC and Clang of the __counted_by attribute. Flexible
array members annotated with __counted_by can have their accesses
bounds-checked at run-time via CONFIG_UBSAN_BOUNDS (for array indexing)
and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions).

As found with Coccinelle[1], add __counted_by to structs that would
benefit from the annotation. Since the element count member must be set
before accessing the annotated flexible array member, some patches also
move the member's initialization earlier. (These are noted in the
individual patches.)

-Kees

[1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci

Kees Cook (9):
  drm/amd/pm: Annotate struct smu10_voltage_dependency_table with __counted_by
  drm/amdgpu/discovery: Annotate struct ip_hw_instance with __counted_by
  drm/i915/selftests: Annotate struct perf_series with __counted_by
  drm/msm/dpu: Annotate struct dpu_hw_intr with __counted_by
  drm/nouveau/pm: Annotate struct nvkm_perfdom with __counted_by
  drm/vc4: Annotate struct vc4_perfmon with __counted_by
  drm/virtio: Annotate struct virtio_gpu_object_array with __counted_by
  drm/vmwgfx: Annotate struct vmw_surface_dirty with __counted_by
  drm/v3d: Annotate struct v3d_perfmon with __counted_by

 drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c        | 2 +-
 drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.h | 2 +-
 drivers/gpu/drm/i915/selftests/i915_request.c        | 2 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.h    | 2 +-
 drivers/gpu/drm/nouveau/nvkm/engine/pm/priv.h        | 2 +-
 drivers/gpu/drm/v3d/v3d_drv.h                        | 2 +-
 drivers/gpu/drm/vc4/vc4_drv.h                        | 2 +-
 drivers/gpu/drm/virtio/virtgpu_drv.h                 | 2 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_surface.c              | 2 +-
 9 files changed, 9 insertions(+), 9 deletions(-)

-- 
2.34.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
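The ordering requirement mentioned in the cover letter — the count member must be set before the annotated array is touched — is the one non-mechanical part of these conversions. A hypothetical before/after sketch (not taken from any patch in the series):

struct demo {
	int nr;
	u32 vals[] __counted_by(nr);
};

/* Problematic order: the array is written while 'nr' is still 0. */
static struct demo *demo_alloc_bad(int nr)
{
	struct demo *d = kzalloc(struct_size(d, vals, nr), GFP_KERNEL);

	if (!d)
		return NULL;
	d->vals[0] = 1;		/* UBSAN_BOUNDS sees a bound of 0 here and warns */
	d->nr = nr;
	return d;
}

/* Fixed order: initialise the counter first, then fill the array. */
static struct demo *demo_alloc_good(int nr)
{
	struct demo *d = kzalloc(struct_size(d, vals, nr), GFP_KERNEL);

	if (!d)
		return NULL;
	d->nr = nr;
	d->vals[0] = 1;
	return d;
}

This is exactly the kind of reordering the individual patches note where it was needed.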
[PATCH 2/9] drm/amdgpu/discovery: Annotate struct ip_hw_instance with __counted_by
Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct ip_hw_instance. [1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci Cc: Alex Deucher Cc: "Christian König" Cc: "Pan, Xinhui" Cc: David Airlie Cc: Daniel Vetter Cc: Hawking Zhang Cc: amd-...@lists.freedesktop.org Cc: dri-de...@lists.freedesktop.org Signed-off-by: Kees Cook --- drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c index d1bc7b212520..be4c97a3d7bf 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c @@ -662,7 +662,7 @@ struct ip_hw_instance { u8 harvest; int num_base_addresses; - u32 base_addr[]; + u32 base_addr[] __counted_by(num_base_addresses); }; struct ip_hw_id { -- 2.34.1 ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
[PATCH 8/9] drm/vmwgfx: Annotate struct vmw_surface_dirty with __counted_by
Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct vmw_surface_dirty. [1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci Cc: Zack Rusin Cc: VMware Graphics Reviewers Cc: David Airlie Cc: Daniel Vetter Cc: dri-de...@lists.freedesktop.org Signed-off-by: Kees Cook --- drivers/gpu/drm/vmwgfx/vmwgfx_surface.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c b/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c index 5db403ee8261..2d1d857f99ae 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c @@ -77,7 +77,7 @@ struct vmw_surface_offset { struct vmw_surface_dirty { struct vmw_surface_cache cache; u32 num_subres; - SVGA3dBox boxes[]; + SVGA3dBox boxes[] __counted_by(num_subres); }; static void vmw_user_surface_free(struct vmw_resource *res); -- 2.34.1 ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH 1/9] drm/amd/pm: Annotate struct smu10_voltage_dependency_table with __counted_by
On Fri, Sep 22, 2023 at 1:32 PM Kees Cook wrote: > > Prepare for the coming implementation by GCC and Clang of the __counted_by > attribute. Flexible array members annotated with __counted_by can have > their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS > (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family > functions). > > As found with Coccinelle[1], add __counted_by for struct > smu10_voltage_dependency_table. > > [1] > https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci > > Cc: Evan Quan > Cc: Alex Deucher > Cc: "Christian König" > Cc: "Pan, Xinhui" > Cc: David Airlie > Cc: Daniel Vetter > Cc: Xiaojian Du > Cc: Huang Rui > Cc: Kevin Wang > Cc: amd-...@lists.freedesktop.org > Cc: dri-de...@lists.freedesktop.org > Signed-off-by: Kees Cook Acked-by: Alex Deucher > --- > drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.h > b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.h > index 808e0ecbe1f0..42adc2a3dcbc 100644 > --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.h > +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.h > @@ -192,7 +192,7 @@ struct smu10_clock_voltage_dependency_record { > > struct smu10_voltage_dependency_table { > uint32_t count; > - struct smu10_clock_voltage_dependency_record entries[]; > + struct smu10_clock_voltage_dependency_record entries[] > __counted_by(count); > }; > > struct smu10_clock_voltage_information { > -- > 2.34.1 > ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
[PATCH 9/9] drm/v3d: Annotate struct v3d_perfmon with __counted_by
Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct v3d_perfmon. [1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci Cc: Emma Anholt Cc: Melissa Wen Cc: David Airlie Cc: Daniel Vetter Cc: dri-de...@lists.freedesktop.org Signed-off-by: Kees Cook --- drivers/gpu/drm/v3d/v3d_drv.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/v3d/v3d_drv.h b/drivers/gpu/drm/v3d/v3d_drv.h index 7f664a4b2a75..106454f28956 100644 --- a/drivers/gpu/drm/v3d/v3d_drv.h +++ b/drivers/gpu/drm/v3d/v3d_drv.h @@ -59,7 +59,7 @@ struct v3d_perfmon { * values can't be reset, but you can fake a reset by * destroying the perfmon and creating a new one. */ - u64 values[]; + u64 values[] __counted_by(ncounters); }; struct v3d_dev { -- 2.34.1 ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH 2/9] drm/amdgpu/discovery: Annotate struct ip_hw_instance with __counted_by
On Fri, Sep 22, 2023 at 1:33 PM Kees Cook wrote: > > Prepare for the coming implementation by GCC and Clang of the __counted_by > attribute. Flexible array members annotated with __counted_by can have > their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS > (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family > functions). > > As found with Coccinelle[1], add __counted_by for struct ip_hw_instance. > > [1] > https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci > > Cc: Alex Deucher > Cc: "Christian König" > Cc: "Pan, Xinhui" > Cc: David Airlie > Cc: Daniel Vetter > Cc: Hawking Zhang > Cc: amd-...@lists.freedesktop.org > Cc: dri-de...@lists.freedesktop.org > Signed-off-by: Kees Cook Acked-by: Alex Deucher > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c > index d1bc7b212520..be4c97a3d7bf 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c > @@ -662,7 +662,7 @@ struct ip_hw_instance { > u8 harvest; > > int num_base_addresses; > - u32 base_addr[]; > + u32 base_addr[] __counted_by(num_base_addresses); > }; > > struct ip_hw_id { > -- > 2.34.1 > ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
[PATCH] virtio_console: Annotate struct port_buffer with __counted_by
Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct port_buffer. [1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci Cc: Amit Shah Cc: Arnd Bergmann Cc: Greg Kroah-Hartman Cc: virtualization@lists.linux-foundation.org Signed-off-by: Kees Cook --- drivers/char/virtio_console.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c index 680d1ef2a217..431e9e5bf9c1 100644 --- a/drivers/char/virtio_console.c +++ b/drivers/char/virtio_console.c @@ -106,7 +106,7 @@ struct port_buffer { unsigned int sgpages; /* sg is used if spages > 0. sg must be the last in is struct */ - struct scatterlist sg[]; + struct scatterlist sg[] __counted_by(sgpages); }; /* -- 2.34.1 ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
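The commit message's mention of CONFIG_FORTIFY_SOURCE covers the other half of the checking: memcpy()-family calls into a counted flexible array can be bounded by the annotated count rather than only by the enclosing object's compile-time size. A hypothetical sketch, unrelated to the actual port_buffer code paths:

#include <linux/slab.h>
#include <linux/string.h>

struct demo_msg {
	size_t len;
	u8 data[] __counted_by(len);
};

static int demo_msg_fill(struct demo_msg *msg, const u8 *src, size_t n)
{
	if (n > msg->len)
		return -EINVAL;
	memcpy(msg->data, src, n);	/* FORTIFY can now bound data[] by msg->len */
	return 0;
}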
Re: [PATCH v2 1/2] iommu/virtio: Make use of ops->iotlb_sync_map
On 22/09/2023 5:27 pm, Jason Gunthorpe wrote:
> On Fri, Sep 22, 2023 at 02:13:18PM +0100, Robin Murphy wrote:
> > On 22/09/2023 1:41 pm, Jason Gunthorpe wrote:
> > > On Fri, Sep 22, 2023 at 08:57:19AM +0100, Jean-Philippe Brucker wrote:
> > > > > > They're not strictly equivalent: this check works around a temporary
> > > > > > issue with the IOMMU core, which calls map/unmap before the domain is
> > > > > > finalized.
> > > > >
> > > > > Where? The above points to iommu_create_device_direct_mappings() but
> > > > > it doesn't because the pgsize_bitmap == 0:
> > > >
> > > > __iommu_domain_alloc() sets pgsize_bitmap in this case:
> > > >
> > > >         /*
> > > >          * If not already set, assume all sizes by default; the driver
> > > >          * may override this later
> > > >          */
> > > >         if (!domain->pgsize_bitmap)
> > > >                 domain->pgsize_bitmap = bus->iommu_ops->pgsize_bitmap;
> > >
> > > Drivers shouldn't do that.
> > >
> > > The core code was fixed to try again with mapping reserved regions to
> > > support these kinds of drivers.
> >
> > This is still the "normal" code path, really; I think it's only AMD that
> > started initialising the domain bitmap "early" and warranted making it
> > conditional.
>
> My main point was that iommu_create_device_direct_mappings() should fail
> for unfinalized domains, setting pgsize_bitmap to allow it to succeed is
> not a nice hack, and not necessary now.

Sure, but it's the whole "unfinalised domains" and rewriting
domain->pgsize_bitmap after attach thing that is itself the massive hack.
AMD doesn't do that, and doesn't need to; it knows the appropriate format
at allocation time and can quite happily return a fully working domain
which allows map before attach, but the old ops->pgsize_bitmap mechanism
fundamentally doesn't work for multiple formats with different page sizes.
The only thing I'd accuse it of doing wrong is the weird half-and-half
thing of having one format as a default via one mechanism, and the other
as an override through the other, rather than setting both explicitly.

virtio isn't setting ops->pgsize_bitmap for the sake of direct mappings
either; it sets it once it's discovered any instance, since apparently
it's assuming that all instances must support identical page sizes, and
thus once it's seen one it can work "normally" per the core code's
assumptions. It's also I think the only driver which has a "finalise"
bodge but *can* still properly support map-before-attach, by virtue of
having to replay mappings to every new endpoint anyway.

> What do you think about something like this to replace
> iommu_create_device_direct_mappings(), that does enforce things properly?

I fail to see how that would make any practical difference. Either the
mappings can be correctly set up in a pagetable *before* the relevant
device is attached to that pagetable, or they can't (if the driver doesn't
have enough information to be able to do so) and we just have to really
hope nothing blows up in the race window between attaching the device to
an empty pagetable and having a second try at
iommu_create_device_direct_mappings(). That's a driver-level issue and has
nothing to do with pgsize_bitmap either way.

Thanks,
Robin.

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH 5/9] drm/nouveau/pm: Annotate struct nvkm_perfdom with __counted_by
Reviewed-by: Lyude Paul Thanks! On Fri, 2023-09-22 at 10:32 -0700, Kees Cook wrote: > Prepare for the coming implementation by GCC and Clang of the __counted_by > attribute. Flexible array members annotated with __counted_by can have > their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS > (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family > functions). > > As found with Coccinelle[1], add __counted_by for struct nvkm_perfdom. > > [1] > https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci > > Cc: Ben Skeggs > Cc: Karol Herbst > Cc: Lyude Paul > Cc: David Airlie > Cc: Daniel Vetter > Cc: dri-de...@lists.freedesktop.org > Cc: nouv...@lists.freedesktop.org > Signed-off-by: Kees Cook > --- > drivers/gpu/drm/nouveau/nvkm/engine/pm/priv.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/pm/priv.h > b/drivers/gpu/drm/nouveau/nvkm/engine/pm/priv.h > index 6ae25d3e7f45..c011227f7052 100644 > --- a/drivers/gpu/drm/nouveau/nvkm/engine/pm/priv.h > +++ b/drivers/gpu/drm/nouveau/nvkm/engine/pm/priv.h > @@ -82,7 +82,7 @@ struct nvkm_perfdom { > u8 mode; > u32 clk; > u16 signal_nr; > - struct nvkm_perfsig signal[]; > + struct nvkm_perfsig signal[] __counted_by(signal_nr); > }; > > struct nvkm_funcdom { -- Cheers, Lyude Paul (she/her) Software Engineer at Red Hat ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH 3/9] drm/i915/selftests: Annotate struct perf_series with __counted_by
On 9/22/23 11:32, Kees Cook wrote: Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct perf_series. [1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci Cc: Jani Nikula Cc: Joonas Lahtinen Cc: Rodrigo Vivi Cc: Tvrtko Ursulin Cc: David Airlie Cc: Daniel Vetter Cc: Chris Wilson Cc: John Harrison Cc: Andi Shyti Cc: Matthew Brost Cc: intel-...@lists.freedesktop.org Cc: dri-de...@lists.freedesktop.org Signed-off-by: Kees Cook Reviewed-by: Gustavo A. R. Silva Thanks -- Gustavo --- drivers/gpu/drm/i915/selftests/i915_request.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/selftests/i915_request.c b/drivers/gpu/drm/i915/selftests/i915_request.c index a9b79888c193..acae30a04a94 100644 --- a/drivers/gpu/drm/i915/selftests/i915_request.c +++ b/drivers/gpu/drm/i915/selftests/i915_request.c @@ -1924,7 +1924,7 @@ struct perf_stats { struct perf_series { struct drm_i915_private *i915; unsigned int nengines; - struct intel_context *ce[]; + struct intel_context *ce[] __counted_by(nengines); }; static int cmp_u32(const void *A, const void *B) ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH 4/9] drm/msm/dpu: Annotate struct dpu_hw_intr with __counted_by
On 9/22/23 11:32, Kees Cook wrote: Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct dpu_hw_intr. [1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci Cc: Rob Clark Cc: Abhinav Kumar Cc: Dmitry Baryshkov Cc: Sean Paul Cc: Marijn Suijten Cc: David Airlie Cc: Daniel Vetter Cc: Bjorn Andersson Cc: linux-arm-...@vger.kernel.org Cc: dri-de...@lists.freedesktop.org Cc: freedr...@lists.freedesktop.org Signed-off-by: Kees Cook Reviewed-by: Gustavo A. R. Silva Thanks -- Gustavo --- drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.h index dab761e54863..50cf9523d367 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.h +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_hw_interrupts.h @@ -61,7 +61,7 @@ struct dpu_hw_intr { void (*cb)(void *arg, int irq_idx); void *arg; atomic_t count; - } irq_tbl[]; + } irq_tbl[] __counted_by(total_irqs); }; /** ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH 2/9] drm/amdgpu/discovery: Annotate struct ip_hw_instance with __counted_by
On 9/22/23 11:32, Kees Cook wrote: Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct ip_hw_instance. [1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci Cc: Alex Deucher Cc: "Christian König" Cc: "Pan, Xinhui" Cc: David Airlie Cc: Daniel Vetter Cc: Hawking Zhang Cc: amd-...@lists.freedesktop.org Cc: dri-de...@lists.freedesktop.org Signed-off-by: Kees Cook Reviewed-by: Gustavo A. R. Silva Thanks -- Gustavo --- drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c index d1bc7b212520..be4c97a3d7bf 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c @@ -662,7 +662,7 @@ struct ip_hw_instance { u8 harvest; int num_base_addresses; - u32 base_addr[]; + u32 base_addr[] __counted_by(num_base_addresses); }; struct ip_hw_id { ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v2 1/2] iommu/virtio: Make use of ops->iotlb_sync_map
On Fri, Sep 22, 2023 at 07:07:40PM +0100, Robin Murphy wrote: > virtio isn't setting ops->pgsize_bitmap for the sake of direct mappings > either; it sets it once it's discovered any instance, since apparently it's > assuming that all instances must support identical page sizes, and thus once > it's seen one it can work "normally" per the core code's assumptions. It's > also I think the only driver which has a "finalise" bodge but *can* still > properly support map-before-attach, by virtue of having to replay mappings > to every new endpoint anyway. Well it can't quite do that since it doesn't know the geometry - it all is sort of guessing and hoping it doesn't explode on replay. If it knows the geometry it wouldn't need finalize... > > What do you think about something like this to replace > > iommu_create_device_direct_mappings(), that does enforce things > > properly? > > I fail to see how that would make any practical difference. Either the > mappings can be correctly set up in a pagetable *before* the relevant device > is attached to that pagetable, or they can't (if the driver doesn't have > enough information to be able to do so) and we just have to really hope > nothing blows up in the race window between attaching the device to an empty > pagetable and having a second try at iommu_create_device_direct_mappings(). > That's a driver-level issue and has nothing to do with pgsize_bitmap either > way. Except we don't detect this in the core code correctly, that is my point. We should detect the aperture conflict, not pgsize_bitmap to check if it is the first or second try. Jason ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
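To make the disagreement concrete: the "aperture conflict" test Jason refers to is the -EINVAL check in his earlier proposal. Pulled out as a standalone predicate it would look something like the sketch below (illustrative only, mirroring the comparison used in that proposal rather than any code actually in the tree):

#include <linux/iommu.h>

/* Sketch: does a reserved region fit inside the domain's reported aperture? */
static bool resv_fits_aperture(const struct iommu_domain *domain,
			       const struct iommu_resv_region *entry)
{
	if (domain->geometry.aperture_end == 0)
		return false;	/* geometry not set up yet */
	if (domain->geometry.aperture_start > entry->start)
		return false;
	if ((domain->geometry.aperture_end - 1) <
	    (entry->start + entry->length - 1))
		return false;
	return true;
}

The core could then fail iommu_create_device_direct_mappings() when this returns false, instead of inferring "not finalised yet" from pgsize_bitmap == 0.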
Re: [PATCH 6/9] drm/vc4: Annotate struct vc4_perfmon with __counted_by
On 9/22/23 11:32, Kees Cook wrote: Prepare for the coming implementation by GCC and Clang of the __counted_by attribute. Flexible array members annotated with __counted_by can have their accesses bounds-checked at run-time checking via CONFIG_UBSAN_BOUNDS (for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family functions). As found with Coccinelle[1], add __counted_by for struct vc4_perfmon. [1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci Cc: Emma Anholt Cc: Maxime Ripard Cc: David Airlie Cc: Daniel Vetter Cc: dri-de...@lists.freedesktop.org Signed-off-by: Kees Cook Reviewed-by: Gustavo A. R. Silva Thanks -- Gustavo --- drivers/gpu/drm/vc4/vc4_drv.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/vc4/vc4_drv.h b/drivers/gpu/drm/vc4/vc4_drv.h index bf66499765fb..ab61e96e7e14 100644 --- a/drivers/gpu/drm/vc4/vc4_drv.h +++ b/drivers/gpu/drm/vc4/vc4_drv.h @@ -76,7 +76,7 @@ struct vc4_perfmon { * Note that counter values can't be reset, but you can fake a reset by * destroying the perfmon and creating a new one. */ - u64 counters[]; + u64 counters[] __counted_by(ncounters); }; struct vc4_dev { ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization