Hi Maxime,

> -----Original Message-----
> From: Maxime Coquelin [mailto:maxime.coque...@redhat.com]
> Sent: Thursday, February 8, 2018 5:09 PM
> To: Wang, Xiao W <xiao.w.w...@intel.com>; dev@dpdk.org
> Cc: Tan, Jianfeng <jianfeng....@intel.com>; Bie, Tiwei <tiwei....@intel.com>;
> y...@fridaylinux.org; Liang, Cunming <cunming.li...@intel.com>; Daly, Dan
> <dan.d...@intel.com>; Wang, Zhihong <zhihong.w...@intel.com>
> Subject: Re: [PATCH 2/3] net/vdpa_virtio_pci: introduce vdpa sample driver
>
> Hi Xiao,
>
> On 02/08/2018 03:23 AM, Wang, Xiao W wrote:
> > Hi Maxime,
> >
> >> -----Original Message-----
> >> From: Maxime Coquelin [mailto:maxime.coque...@redhat.com]
> >> Sent: Tuesday, February 6, 2018 10:24 PM
> >> To: Wang, Xiao W <xiao.w.w...@intel.com>; dev@dpdk.org
> >> Cc: Tan, Jianfeng <jianfeng....@intel.com>; Bie, Tiwei <tiwei....@intel.com>;
> >> y...@fridaylinux.org; Liang, Cunming <cunming.li...@intel.com>; Daly, Dan
> >> <dan.d...@intel.com>; Wang, Zhihong <zhihong.w...@intel.com>
> >> Subject: Re: [PATCH 2/3] net/vdpa_virtio_pci: introduce vdpa sample driver
> >>
> >> Hi Xiao,
> >>
> >> On 02/04/2018 03:55 PM, Xiao Wang wrote:
> >>> This driver is a reference sample of building a vDPA device driver
> >>> based on the vhost lib. It uses a standard virtio-net PCI device as
> >>> the vDPA device, so it can serve as a backend for a virtio-net PCI
> >>> device in a nested VM.
> >>>
> >>> The key driver ops implemented are:
> >>>
> >>> * vdpa_virtio_eng_init
> >>>   Map the virtio PCI device into userspace with VFIO, read the device
> >>>   capabilities and initialize internal data.
> >>>
> >>> * vdpa_virtio_eng_uninit
> >>>   Release the mapped device.
> >>>
> >>> * vdpa_virtio_info_query
> >>>   Report device capabilities, e.g. queue number, features.
> >>>
> >>> * vdpa_virtio_dev_config
> >>>   With the guest virtio information provided by the vhost lib, this
> >>>   function configures the device and the IOMMU to set up the vhost
> >>>   datapath, which includes: Rx/Tx vrings, VFIO interrupts, kick relay.
> >>>
> >>> * vdpa_virtio_dev_close
> >>>   Unset what was previously configured by dev_conf.
> >>>
> >>> This driver requires the virtio device to support
> >>> VIRTIO_F_IOMMU_PLATFORM, because the buffer addresses written in the
> >>> descriptors are IOVAs.
> >>>
> >>> Because the vDPA driver needs to set up MSI-X vectors to interrupt the
> >>> guest, only vfio-pci is supported currently.
> >>>
> >>> Signed-off-by: Xiao Wang <xiao.w.w...@intel.com>
> >>> ---
> >>>  config/common_base                                |    6 +
> >>>  config/common_linuxapp                            |    1 +
> >>>  drivers/net/Makefile                              |    1 +
> >>>  drivers/net/vdpa_virtio_pci/Makefile              |   31 +
> >>>  .../net/vdpa_virtio_pci/rte_eth_vdpa_virtio_pci.c | 1527 ++++++++++++++++++++
> >>>  .../rte_vdpa_virtio_pci_version.map               |    4 +
> >>>  mk/rte.app.mk                                     |    1 +
> >>>  7 files changed, 1571 insertions(+)
> >>>  create mode 100644 drivers/net/vdpa_virtio_pci/Makefile
> >>>  create mode 100644 drivers/net/vdpa_virtio_pci/rte_eth_vdpa_virtio_pci.c
> >>>  create mode 100644 drivers/net/vdpa_virtio_pci/rte_vdpa_virtio_pci_version.map
> >>
> >> Is there a specific constraint that makes you expose PCI functions and
> >> duplicate a lot of vfio code into the driver?
> >
> > The existing vfio code doesn't fit vDPA well: this vDPA driver needs to
> > program the IOMMU for a vDPA device with a VM's memory table, while
> > eal/vfio uses a single struct vfio_cfg that takes all regular devices,
> > adds them to one vfio_container, and programs the IOMMU with the DPDK
> > process's own memory table.
> >
> > Having the driver do its own PCI VFIO initialization avoids affecting
> > the global vfio_cfg structure.
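
To be concrete about the "VM's memory table" part quoted above: per
device, the programming boils down to roughly the loop below. This is
only a simplified sketch (the function name is illustrative, error
handling is omitted, and container_fd stands for the device's own
private VFIO container fd); the vhost lib call and the VFIO ioctl are
the standard ones.

#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>
#include <rte_vhost.h>

/* Sketch: map every region of the guest memory table into this
 * device's private IOMMU domain, so the IOVAs found in the descriptors
 * resolve to the right guest buffers. */
static void
dma_map_guest_mem(int container_fd, int vid)
{
	struct rte_vhost_memory *mem = NULL;
	uint32_t i;

	rte_vhost_get_mem_table(vid, &mem);

	for (i = 0; i < mem->nregions; i++) {
		struct rte_vhost_mem_region *reg = &mem->regions[i];
		struct vfio_iommu_type1_dma_map dma_map;

		memset(&dma_map, 0, sizeof(dma_map));
		dma_map.argsz = sizeof(dma_map);
		/* host virtual address backing the region */
		dma_map.vaddr = reg->host_user_addr;
		/* IOVA: the guest physical address used in the descs */
		dma_map.iova = reg->guest_phys_addr;
		dma_map.size = reg->size;
		dma_map.flags = VFIO_DMA_MAP_FLAG_READ |
				VFIO_DMA_MAP_FLAG_WRITE;
		ioctl(container_fd, VFIO_IOMMU_MAP_DMA, &dma_map);
	}

	free(mem);
}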

> Ok, I get it.
> So I think what you have to do is to extend eal/vfio for this case.
> Or at least, have a vdpa layer to perform this, else every offload
> driver will have to duplicate the code.

I think I need to extend eal/vfio to provide container-based APIs, such as
creating a container, binding a VFIO group fd to a container, DMAR
programming, etc.
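
As a very rough sketch (all the names below are only illustrative, not a
final proposal), the new surface could look like:

/* Hypothetical container-based eal/vfio APIs (illustrative names). */

/* Open a new, private /dev/vfio/vfio container; returns its fd. */
int rte_vfio_container_create(void);

/* Close a container created by the function above. */
int rte_vfio_container_destroy(int container_fd);

/* Bind a device's VFIO group to a given container instead of the
 * default global one. */
int rte_vfio_container_group_bind(int container_fd, int iommu_group_num);

/* Program/unprogram the container's IOMMU domain (DMAR) with an
 * arbitrary memory table, e.g. a VM's rather than the DPDK process's. */
int rte_vfio_container_dma_map(int container_fd, uint64_t vaddr,
			       uint64_t iova, uint64_t len);
int rte_vfio_container_dma_unmap(int container_fd, uint64_t vaddr,
				 uint64_t iova, uint64_t len);

With something like this, each vDPA device can own a private container
programmed with its guest's memory table, and the global vfio_cfg stays
untouched.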

> >
> >>
> >> Wouldn't it be better (if possible) to use RTE_PMD_REGISTER_PCI() & co.
> >> to benefit from all the existing infrastructure?
> >
> > RTE_PMD_REGISTER_PCI() & co. would make this driver a PCI driver (for a
> > physical device), and that would conflict with the virtio_pmd.
> > So I made the vDPA device driver a vdev driver.
>
> Yes, but it is a PCI device, not a virtual device. You have to extend
> the EAL to support this new class of devices/drivers. Think of it as in
> the kernel, where a NIC device can be bound either to its NIC driver,
> VFIO or UIO.
>
> If I look at patch 3, you have to set --no-pci, or at least I think to
> blacklist the Virtio device.
>
> I wonder if real vDPA cards will either support vDPA mode or behave
> like a regular NIC, like the Virtio case in your example.
> If this is the case, maybe the vDPA code for a NIC could be in the same
> driver as the "NIC" mode.
> A new struct rte_pci_driver flag could be introduced to specify that
> the driver supports vDPA.
> Then, in the EAL arguments, if a vhost vdev specifies it wants the Virtio
> device at PCI addr 00:01:00 as offload, the PCI layer could probe this
> device in "vdpa" mode.

Considering that we could have a pool of vDPA devices, we need a port that
supports port representors; it defines the control domain to which these
vDPA devices belong. We can have a vdev port for this purpose, and this
vdev helps to register the vDPA ports through the port-representor library
(patch submitted).

                                   +------+
                                   | vdev |
+---+                              |------|
|app|--register representor port-->|broker|-->add port with vDPA device 0/1/2...
+---+                              +------+

I plan to submit a vdpa driver patch for a real vDPA card; that card will
have a different sub device_id/vendor_id, so we won't have the conflict
issue with that driver.

> Also, I don't know if this will be possible with real vDPA cards, but we
> could have the application doing packet switching between the vhost-user
> vdev and the Virtio device. And at some point, at runtime, switch into
> vDPA mode. This use-case would be much easier to implement if vDPA
> relied on the existing PCI layer.

In vDPA mode, each vhost-user datapath is performed by a vDPA device. If
we switch over to normal SW packet switching, it will typically be many
vhost-user ports and one uplink port.

Thanks,
Xiao

> I may not be very clear, don't hesitate to ask questions.
> But generally, I think vDPA has to fit in the existing DPDK architecture,
> and not try to live outside of it.
>
> Thanks,
> Maxime
> >>
> >> Maxime
> >
> > Thanks for the comments,
> > Xiao
> >