On Tue, Jan 26, 2016 at 6:30 PM, Thomas Monjalon
<thomas.monjalon at 6wind.com> wrote:
> 2016-01-26 15:56, Santosh Shukla:
>> On Mon, Jan 25, 2016 at 8:59 PM, Thomas Monjalon
>> <thomas.monjalon at 6wind.com> wrote:
>> > 2016-01-21 22:47, Santosh Shukla:
>> >> On Thu, Jan 21, 2016 at 8:16 PM, Thomas Monjalon
>> >> <thomas.monjalon at 6wind.com> wrote:
>> >> > 2016-01-21 17:34, Santosh Shukla:
>> >> >> On Thu, Jan 21, 2016 at 4:58 PM, Thomas Monjalon
>> >> >> <thomas.monjalon at 6wind.com> wrote:
>> >> >> > 2016-01-21 16:43, Santosh Shukla:
>> >> >> >> David Marchand <david.marchand at 6wind.com> wrote:
>> >> >> >> > This is a mode (specific to vfio), not a new kernel driver.
>> >> >> >> >
>> >> >> >> Yes, specific to VFIO, and that is why noiommu is appended after
>> >> >> >> vfio, i.e. __VFIO and __VFIO_NOIOMMU.
>> >> >> >
>> >> >> > Woaaa! Your logic is really disappointing :)
>> >> >> > Specific to VFIO => append _NOIOMMU
>> >> >> > If it's for VFIO, it should be called VFIO (that's my logic).
>> >> >> >
>> >> >> I am confused by your comment. vfio works with the default iommu
>> >> >> mode and now with noiommu. drv->kdrv needs to carry the driver mode
>> >> >> in the vfio case, so that a user can simply read the drv->kdrv value
>> >> >> in their driver and use the vfio rd/wr APIs accordingly, for example
>> >> >> {pread/pwrite}. This is how the rte_eal_pci_vfio_read/write_bar()
>> >> >> APIs are implemented.
>> >> >
>> >> > Sorry, I don't understand. Why should the EAL read/write functions be
>> >> > different depending on the VFIO mode?
>> >>
>> >> They are not; the EAL rd/wr functions are the same across vfio modes
>> >> {iommu or noiommu}. Please see the pci_eal_read/write_bar() APIs.
>> >> Those APIs are currently used for VFIO, irrespective of the vfio mode.
>> >> If required, we can add UIO bar rd/wr APIs too. pci_eal_rd/wr_bar()
>> >> are abstract APIs; the implementation underneath can be of vfio or uio
>> >> type.
>> >
>> > It means you agree the suffix _NOIOMMU is not needed?
>> > It seems we go nowhere in this discussion. You said
>> > "drv->kdrv need to know driver mode for vfio"
>>
>> In my observation, virtio currently works with vfio-noiommu; that's why
>> I said drv->kdrv needs to know the vfio mode.
>
> It is your observation. It may change in the near future.
>
So does that mean that, until then, virtio support for non-x86 archs has to
wait? We have a working model with vfio-noiommu. Don't you think it makes
sense to let the vfio_noiommu implementation exist now, and later, in case
virtio+iommu gets mainlined, switch to a vfio __mode__-agnostic approach?
All that takes is replacing the __noiommu suffix with the default.

>> > and after
>> > "Those apis currently used for VFIO, Irrespective of vfio mode"
>> > That's why I assume your first assumption was wrong.
>> >
>> The newly introduced DPDK global APIs pci_eal_rd/wr_bar() can be used
>> for both vfio and uio, and for both vfio w/ IOMMU and vfio w/o IOMMU.
>>
>> >> >> > Why do we care to parse noiommu only?
>> >> >>
>> >> >> Because pmd drivers such as virtio can work with vfio only in
>> >> >> _noiommu_ mode. In particular, virtio spec 0.95 / legacy virtio.
>> >> >
>> >> > Please could you explain the limitation (except IOMMU availability)?
>> >>
>> >> Ok.
>> >>
>> >> I believe we both agree that noiommu mode is a need for pmd drivers
>> >> like virtio, right? If so, the other reason is implementation-driven,
>> >
>> > No, noiommu is a need for some environments having no IOMMU.
>> > But in my understanding, virtio could run with a nested IOMMU.
>>
>> Interesting; I'd like to understand the nested one. I tried it in the
>> past by passing "iommu=pt intel_iommu=on kvm-intel.nested=1" on the
>> cmdline for x86 (for guest and host both), but binding the virtio pci
>> device to the vfio-pci driver fails. Tried on a 4.2 kernel (qemu version
>> 2.5); does it work for a >4.2 kernel / qemu version?
>
> I haven't tried.
>
>> >> i.e.:
>> >>
>> >> Please look at virtio_pci.c in this patch, the VIRTIO_RD/WR_1/2/4()
>> >> implementation. They are in use and applicable to virtio spec 0.95,
>> >> and so far support the uio/ioport way of rd/wr. Now, to support the
>> >> vfio way of rd/wr, we need to check the drv->kdrv value, and that
>> >> value should be of the vfio_noiommu type, __not__ the generic vfio
>> >> type.
>> >
>> > I still don't understand why it would not work with VFIO w/IOMMU.
>>
>> With vfio+iommu, binding the virtio pci device to the vfio-pci driver
>> fails, giving the below error:
>> [ 53.053464] VFIO - User Level meta-driver version: 0.3
>> [ 73.077805] vfio-pci: probe of 0000:00:03.0 failed with error -22
>> [ 73.077852] vfio-pci: probe of 0000:00:03.0 failed with error -22
>>
>> vfio_pci_probe() --> vfio_iommu_group_get() --> iommu_group_get()
>> fails: the iommu doesn't have a group for the virtio pci device.
>
> Yes it fails when binding.
> So the later check in the virtio PMD is useless.
>

Which check?

>> In the noiommu case, it prepares the group / adds the device to the
>> iommu group, so it passes.
>>
>> Jason mentioned in another thread that he is working on a virtio+iommu
>> approach [1]. The patches are not merged, and I have yet to evaluate his
>> patches for the virtio pmd driver with iommu(+vfio), so I am wondering
>> how a virtio pci device could work unless Jason's patches are used.
>>
>> [1] https://www.mail-archive.com/qemu-devel at nongnu.org/msg337079.html
>
> I haven't tried nested IOMMU.
> All this thread was about the kernel module in use, i.e. VFIO.
> We are saying that virtio could work in both VFIO modes.

I agree, and I am looking at what works now: vfio-noiommu is in linux-next
and will soon be in mainline. This series lets non-x86 archs use the virtio
pmd driver in the first place, so they can take advantage of, for example,
vhost, ovs-dpdk-vhost-user, etc.

> Furthermore restricting virtio to no-iommu mode doesn't bring
> any improvement.
We're not __restricting__ anything. As soon as virtio+iommu reaches a
working state, we'll simply replace __noiommu with the default. Then it's
up to the user to try virtio with default vfio or vfio_noiommu.

> That's why I suggest to keep the initial semantic of kdrv and
> not pollute it with VFIO modes.
>

I am okay to live with the default and forget the __noiommu suffix, but
there are implementation problems, which were discussed in the other
thread:

- The virtio pmd driver should avoid interface parsing, i.e.
  virtio_resource_init_uio/vfio() etc. For the vfio case we could easily
  get rid of it by moving the /sys parsing to the pci_eal layer, right?
  Since virtio currently works with vfio-noiommu, it makes sense to me
  that the pci_eal layer does the parsing for the pmd driver before the
  pmd driver gets initialized.
- Another case could be an iommu-less pmd driver: the eal layer does the
  parsing before updating drv->kdrv.

>> >> >> So at
>> >> >> the initialization (for example, virtio-net) of such a pmd driver,
>> >> >> the pmd driver should know whether vfio-with-noiommu mode is
>> >> >> enabled or not; for that, the pmd driver simply checks the
>> >> >> drv->kdrv value.
>> >> >
>> >> > If a check is needed, I would prefer using your function
>> >> > pci_vfio_is_noiommu() and remove driver modes from struct
>> >> > rte_kernel_driver.
>> >>
>> >> I don't think calling pci_vfio_is_noiommu() inside
>> >> virtio_reg_rd/wr_1/2/4() would be a good idea.
>> >
>> > Why? The value may be cached in the priv properties.
>> >
>> pci_vfio_is_noiommu() parses /sys for
>> - the enable_noiommu param
>> - whether the attached driver name is vfio-noiommu or not.
>>
>> It does file operations for that. I meant to say that calling this API
>> within the register rd/wr functions is not correct. It would be better
>> if those low-level register rd/wr APIs only checked driver types.
>
> Yes, that's why I said the return of pci_vfio_is_noiommu() may be cached
> to keep efficiency.

I am not convinced, though. I still find the pmd driver checking the
driver type via drv->kdrv a better approach than introducing a new global
variable, which may look something like this:

At the pci_eal layer:

    bool vfio_mode;
    vfio_mode = pci_vfio_is_noiommu();

At the virtio pmd driver layer: check the value of the vfio_mode variable
before doing virtio rd/wr for the vfio interface.

Instead, the virtio pmd driver doing

    virtio_reg_rd/wr_1/2/4()
    {
        if (drv->kdrv == VFIO)
            pread()/pwrite();
        else
            in()/out();
    }

is the better approach. Let me know if you still think the former is
better than the latter, and I'll send a patch revision right away.
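To make the comparison concrete, here is a minimal sketch of the latter
approach. It is illustrative only: the KDRV_* enum stands in for enum
rte_kernel_driver, and the struct fields (vfio_dev_fd, bar_offset,
io_base) are hypothetical placeholders for the real drv->kdrv plumbing,
not the actual patch code.

    #include <stdint.h>
    #include <unistd.h>   /* pread() */
    #include <sys/io.h>   /* inb(), x86 ioport access */

    /* Stand-in for enum rte_kernel_driver; only the values this
     * sketch needs. */
    enum kdrv { KDRV_UIO, KDRV_VFIO };

    /* Hypothetical per-device state, filled in once at init time
     * from drv->kdrv and the vfio/uio probe results. */
    struct virtio_dev {
        enum kdrv kdrv;       /* kernel driver bound to the device */
        int vfio_dev_fd;      /* vfio device fd, valid for KDRV_VFIO */
        uint64_t bar_offset;  /* offset of the I/O BAR in the vfio fd */
        uint16_t io_base;     /* ioport base, valid for the uio case */
    };

    static uint8_t
    virtio_reg_read_1(struct virtio_dev *dev, uint64_t offset)
    {
        uint8_t val = 0;

        if (dev->kdrv == KDRV_VFIO) {
            /* vfio exposes BARs through the device fd, so a plain
             * pread() at the BAR region offset does the access. */
            pread(dev->vfio_dev_fd, &val, sizeof(val),
                  (off_t)(dev->bar_offset + offset));
        } else {
            /* legacy virtio (spec 0.95) via uio uses port I/O */
            val = inb(dev->io_base + offset);
        }
        return val;
    }

The write side (virtio_reg_write_1/2/4) would mirror this with
pwrite()/outb() etc. Note there is no /sys parsing on the fast path: the
driver type is read once from drv->kdrv, which is the whole point of
keeping the mode in kdrv.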