> -----Original Message-----
> From: Alex Williamson [mailto:alex.william...@redhat.com]
> Sent: Tuesday, February 19, 2013 9:55 AM
> To: Bhushan Bharat-R65777
> Cc: Yoder Stuart-B08248; Wood Scott-B07421; Graf Alexander-B36701;
> qemu-de...@nongnu.org; qemu-...@nongnu.org
> Subject: Re: VFIO: Not require to make VFIO_IOMMU_MAP_DMA for MMAPED PCI bars
>
> On Tue, 2013-02-19 at 04:05 +0000, Bhushan Bharat-R65777 wrote:
> >
> > > -----Original Message-----
> > > From: Alex Williamson [mailto:alex.william...@redhat.com]
> > > Sent: Thursday, February 14, 2013 11:57 PM
> > > To: Bhushan Bharat-R65777
> > > Cc: Yoder Stuart-B08248; Wood Scott-B07421; Graf Alexander-B36701;
> > > qemu-de...@nongnu.org; qemu-...@nongnu.org
> > > Subject: Re: VFIO: Not require to make VFIO_IOMMU_MAP_DMA for MMAPED
> > > PCI bars
> > >
> > > On Thu, 2013-02-14 at 08:22 +0000, Bhushan Bharat-R65777 wrote:
> > > > Hi Alex Williamson,
> > > >
> > > > I am able (with some hacks :)) to directly assign the e1000 PCI
> > > > device to a KVM guest using VFIO on a Freescale device. One of the
> > > > problems I am facing is the DMA mapping in the IOMMU for the PCI
> > > > device BARs. On Freescale devices, the mmaped PCI device BARs are
> > > > not required to be mapped in the IOMMU. Typically the flow of an
> > > > in/out transaction (from the CPU) is:
> > > >
> > > > Incoming flow:
> > > >
> > > > |------|     |-------|     |----------------|     |------------|
> > > > | CORE |<---<| IOMMU |<---<| PCI-Controller |<---<| PCI device |
> > > > |------|     |-------|     |----------------|     |------------|
> > > >
> > > > Outgoing flow: the IOMMU is bypassed for outgoing transactions
> > > >
> > > > |------|     |-------|     |----------------|     |------------|
> > > > | CORE |>--+ | IOMMU | +-->| PCI-Controller |>--->| PCI device |
> > > > |------|   | |-------| |   |----------------|     |------------|
> > > >            |           |
> > > >            +----->-----+
> > >
> > > Mapping the BAR into the IOMMU isn't for this second use case, CPU to
> > > device is never IOMMU translated. Instead, it's for this:
> > >
> > > |-------|     |----------------|     |------------|
> > > | IOMMU |<---<| PCI-Controller |<---<| PCI device |
> > > |-------|     |----------------|     |------------|
> > >     |
> > >     |         |----------------|     |------------|
> > >     +---->--->| PCI-Controller |>--->| PCI device |
> > >               |----------------|     |------------|
> > >
> > > It's perfectly fine to not support peer-to-peer DMA, but let's only
> > > skip it where it's not supported, in case it might actually work in
> > > some cases.
> > >
> > > > Also, because of some hardware limitations of our IOMMU it is
> > > > difficult to map these BAR regions along with the RAM (DDR)
> > > > regions. So on Freescale devices we want the VFIO_IOMMU_MAP_DMA
> > > > ioctl to be called for RAM (DDR) regions only and _not_ for these
> > > > mmaped RAM regions of the PCI device BARs. I understand that we
> > > > need to add these mmaped PCI BARs as RAM type in the qemu
> > > > memory_region_*() calls, so I tried to skip these regions in the
> > > > VFIO memory listener. Below are the changes which work for me. I am
> > > > not sure whether this is the correct approach, please suggest.
> > > >
> > > > -------------
> > > > diff --git a/hw/vfio_pci.c b/hw/vfio_pci.c
> > > > index c51ae67..63728d8 100644
> > > > --- a/hw/vfio_pci.c
> > > > +++ b/hw/vfio_pci.c
> > > > @@ -1115,9 +1115,35 @@ static int vfio_dma_map(VFIOContainer *container, hwaddr iova,
> > > >          return -errno;
> > > >  }
> > > >
> > > > -static bool vfio_listener_skipped_section(MemoryRegionSection *section)
> > > > +static int memory_region_is_mmap_bars(VFIOContainer *container,
> > > > +                                      MemoryRegionSection *section)
> > > >  {
> > > > -    return !memory_region_is_ram(section->mr);
> > > > +    VFIOGroup *group;
> > > > +    VFIODevice *vdev;
> > > > +    int i;
> > > > +
> > > > +    QLIST_FOREACH(group, &container->group_list, next) {
> > >
> > > I think you want to start at the global &group_list
> >
> > Why do you think global? My understanding is that the dma_map/unmap
> > operation is limited to a group in the given container.
>
> There can theoretically be more than one container, so if you're only
> searching groups within a container then you may not be searching all the
> devices.
>
> > > > +        QLIST_FOREACH(vdev, &group->device_list, next) {
> > > > +            if (vdev->msix->mmap_mem.ram_addr == section->mr->ram_addr)
> > >
> > > Note the comment in memory.h:
> > >
> > > struct MemoryRegion {
> > >     /* All fields are private - violators will be prosecuted */
> > >     ...
> > >     ram_addr_t ram_addr;
> > >
> > > You'll need to invent some kind of memory region comparison interfaces
> > > instead of accessing these private fields.
> >
> > You mean adding some api in memory.c?
>
> Yes

Ok. A rough sketch of the kind of helper I have in mind is at the end of
this mail.

> > > > +                return 1;
> > > > +            for (i = 0; i < PCI_ROM_SLOT; i++) {
> > > > +                VFIOBAR *bar = &vdev->bars[i];
> > > > +                if (bar->mmap_mem.ram_addr == section->mr->ram_addr)
> > > > +                    return 1;
> > > > +            }
> > > > +        }
> > > > +    }
> > > > +
> > > > +    return 0;
> > > > +}
> > >
> > > It's a pretty heavyweight function, I think the memory API should help
> > > us differentiate MMIO from RAM.
> >
> > What about adding a bool in struct MemoryRegion:
> >
> >     bool require_iommu_map;
> >
> > There would be APIs to set/clear this bool, and by default all regions
> > with "ram = true" would also have "require_iommu_map = true":
> >
> >     if (!(memory_region->ram == true && memory_region->require_iommu_map == true))
> >         skip_iommumap;
>
> I don't think memory.c wants to care about requiring iommu mapping or not.
> Perhaps an mmio address space vs a ram address space.
Can you please describe what you mean by an mmio address space vs a ram
address space here? Which elements of struct MemoryRegion are you referring
to?

> >
> > > > +
> > > > +static bool vfio_listener_skipped_section(VFIOContainer *container,
> > > > +                                          MemoryRegionSection *section)
> > > > +{
> > > > +    if (!memory_region_is_ram(section->mr))
> > > > +        return 1;
> > >
> > > Need some kind of conditional here for platforms that support
> > > peer-to-peer.  Thanks,
> >
> > What about adding an iommu_type that would help here (VFIO_TYPE2_IOMMU),
> > which does not require BARs to be mapped in the iommu?
>
> "Type2" seems excessive, there's already a VFIO_IOMMU_GET_INFO ioctl that
> could easily add a flag and maybe a field.  I think you're doing something
> with iommu geometry, perhaps inferring the geometry somehow.  Why not add
> that to the GET_INFO data and use that to decide whether to restrict
> mapping to ram?  Thanks,

I agree that getting the iommu type via VFIO_IOMMU_GET_INFO and then using a
flag is a good idea. What I meant was that the flag would have the value
either TYPE1 or TYPE2/FSL_PAMU etc. A rough, untested sketch of how qemu
could consume such a flag is at the end of this mail.

BTW: currently hw/vfio_pci.c uses the VFIO_TYPE1_IOMMU defines.

Thanks
-Bharat
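
PS: Regarding the memory region comparison API in memory.c discussed above,
this is only a sketch of the kind of helper I have in mind; the name
memory_region_same_ram() is made up, it is not an existing memory.c API. The
point is to keep the private ram_addr field inside memory.c:

/* memory.c: hypothetical helper, name and placement are only a sketch */
bool memory_region_same_ram(MemoryRegion *a, MemoryRegion *b)
{
    /* Both regions must be RAM-backed (the mmap'ed BARs are registered as
     * RAM) and must refer to the same RAM block. */
    return memory_region_is_ram(a) && memory_region_is_ram(b) &&
           a->ram_addr == b->ram_addr;
}

hw/vfio_pci.c could then call memory_region_same_ram(&vdev->msix->mmap_mem,
section->mr) (and likewise for each bar->mmap_mem) instead of reading
section->mr->ram_addr directly.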
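
And here is a rough, untested sketch of the qemu side for the GET_INFO based
approach. The flag name VFIO_IOMMU_INFO_NO_MMIO_MAP and the skip_mmio_map
field are invented only to illustrate the idea (they are not in the current
kernel or qemu headers); memory_region_is_mmap_bars() is the helper from the
patch above:

/* hw/vfio_pci.c, at container setup time after VFIO_SET_IOMMU (sketch) */
struct vfio_iommu_type1_info info = { .argsz = sizeof(info) };

if (ioctl(container->fd, VFIO_IOMMU_GET_INFO, &info) == 0) {
    /* Hypothetical flag: this IOMMU (e.g. FSL PAMU) cannot/need not map
     * device MMIO, so only real RAM should be DMA mapped. */
    container->skip_mmio_map = !!(info.flags & VFIO_IOMMU_INFO_NO_MMIO_MAP);
}

/* The listener then skips mmap'ed BAR regions only when the IOMMU asked for
 * it, so peer-to-peer DMA keeps working where the IOMMU can map MMIO. */
static bool vfio_listener_skipped_section(VFIOContainer *container,
                                          MemoryRegionSection *section)
{
    if (!memory_region_is_ram(section->mr)) {
        return true;
    }

    if (container->skip_mmio_map &&
        memory_region_is_mmap_bars(container, section)) {
        return true;
    }

    return false;
}

The flag would live on the container since it describes the IOMMU backing
that container, not any particular group or device.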