> -----Original Message-----
> From: Alex Williamson [mailto:alex.william...@redhat.com]
> Sent: Thursday, February 14, 2013 11:57 PM
> To: Bhushan Bharat-R65777
> Cc: Yoder Stuart-B08248; Wood Scott-B07421; Graf Alexander-B36701;
>     qemu-de...@nongnu.org; qemu-...@nongnu.org
> Subject: Re: VFIO: Not require to make VFIO_IOMMU_MAP_DMA for MMAPED PCI bars
>
> On Thu, 2013-02-14 at 08:22 +0000, Bhushan Bharat-R65777 wrote:
> > Hi Alex Williamson,
> >
> > I am able (with some hacks :)) to directly assign the e1000 PCI device
> > to KVM guest using VFIO on Freescale device. One of the problem I am
> > facing is about the DMA mapping in IOMMU for PCI device BARs. On
> > Freescale devices, the mmaped PCI device BARs are not required to be
> > mapped in IOMMU. Typically the flow of in/out transaction (from CPU)
> > is:
> >
> > Incoming flow:
> >
> > -----|                     |----------|         |---------------|                   |-------------|
> > CORE |<----<------<-----<--| IOMMU    |<---<---<| PCI-Controller|<------<-----<----<| PCI device  |
> > -----|                     |----------|         |---------------|                   |-------------|
> >
> > Outgoing Flow: IOMMU is bypassed for out transactions
> >
> > -----|                    |----------|      |---------------|                   |-------------|
> > CORE |>---->------>----|  | IOMMU    | ->-->| PCI-Controller|>------>----->---->| PCI device  |
> > -----|                 |  |----------|   ^  |---------------|                   |-------------|
> >                        |                 |
> >                        |-----------------|
>
> Mapping the BAR into the IOMMU isn't for this second use case, CPU to device
> is never IOMMU translated. Instead, it's for this:
>
> |----------|         |---------------|                    |-------------|
> | IOMMU    |<---<---<| PCI-Controller|<------<-----<----<| PCI device  |
> |----------|         |---------------|                    |-------------|
>      |
>      |               |---------------|                    |-------------|
>      +-->---->---- ->| PCI-Controller|>------>----->---->| PCI device  |
>                      |---------------|                    |-------------|
>
> It's perfectly fine to not support peer-to-peer DMA, but let's skip it where
> it's not supported in case it might actually work in some cases.
>
> > Also because of some hardware limitations on our IOMMU it is difficult
> > to map these BAR regions with RAM (DDR) regions. So on Freescale
> > device we want the VFIO_IOMMU_MAP_DMA ioctl to be called for RAM
> > regions (DDR) only and _not_ for these mmaped ram regions of PCI
> > device bars. I can understand that we need to add these mmaped PCI
> > bars as RAM type in qemu memory_region_*(). So for that I tried to
> > skip these regions in VFIO memory_listeners. Below changes which works
> > for me. I am not sure whether this is correct approach, please
> > suggest.
> >
> > -------------
> > diff --git a/hw/vfio_pci.c b/hw/vfio_pci.c
> > index c51ae67..63728d8 100644
> > --- a/hw/vfio_pci.c
> > +++ b/hw/vfio_pci.c
> > @@ -1115,9 +1115,35 @@ static int vfio_dma_map(VFIOContainer *container, hwaddr iova,
> >          return -errno;
> >  }
> >
> > -static bool vfio_listener_skipped_section(MemoryRegionSection *section)
> > +static int memory_region_is_mmap_bars(VFIOContainer *container,
> > +                                      MemoryRegionSection *section)
> >  {
> > -    return !memory_region_is_ram(section->mr);
> > +    VFIOGroup *group;
> > +    VFIODevice *vdev;
> > +    int i;
> > +
> > +    QLIST_FOREACH(group, &container->group_list, next) {
>
> I think you want to start at the global &group_list
Why do you think global? My understanding is that the dma_map/unmap operations
are limited to the groups in the given container.

> > +        QLIST_FOREACH(vdev, &group->device_list, next) {
> > +            if (vdev->msix->mmap_mem.ram_addr == section->mr->ram_addr)
>
> Note the comment in memory.h:
>
> struct MemoryRegion {
>     /* All fields are private - violators will be prosecuted */
>     ...
>     ram_addr_t ram_addr;
>
> You'll need to invent some kind of memory region comparison interfaces
> instead of accessing these private fields.

You mean adding some API in memory.c?

> > +                return 1;
> > +
> > +            for (i = 0; i < PCI_ROM_SLOT; i++) {
> > +                VFIOBAR *bar = &vdev->bars[i];
> > +
> > +                if (bar->mmap_mem.ram_addr == section->mr->ram_addr)
> > +                    return 1;
> > +            }
> > +        }
> > +    }
> > +
> > +    return 0;
> > +}
>
> It's a pretty heavy weight function, I think the memory API should help us
> differentiate MMIO from RAM.

What about adding a bool in "struct MemoryRegion":

    bool require_iommu_map;

There will be APIs to set/clear this bool, and by default every region with
"ram = true" will also have "require_iommu_map = true". The listener would
then skip a region unless both are set:

    if (!(memory_region->ram == true && memory_region->require_iommu_map == true))
        skip_iommumap;

> > +static bool vfio_listener_skipped_section(VFIOContainer *container,
> > +                                          MemoryRegionSection *section)
> > +{
> > +    if (!memory_region_is_ram(section->mr))
> > +        return 1;
>
> Need some kind of conditional here for platforms that support peer-to-peer.
> Thanks,

Would adding an iommu_type (VFIO_TYPE2_IOMMU), one which does not require
BARs to be mapped in the IOMMU, help here?

Thanks
-Bharat

> Alex
>
> > +    return memory_region_is_mmap_bars(container, section);
> >  }
> >
> >  static void vfio_listener_region_add(MemoryListener *listener,
> > @@ -1129,7 +1155,7 @@ static void vfio_listener_region_add(MemoryListener *listener,
> >      void *vaddr;
> >      int ret;
> >
> > -    if (vfio_listener_skipped_section(section)) {
> > +    if (vfio_listener_skipped_section(container, section)) {
> >          DPRINTF("vfio: SKIPPING region_add %"HWADDR_PRIx" - %"PRIx64"\n",
> >                  section->offset_within_address_space,
> >                  section->offset_within_address_space + section->size - 1);
> > @@ -1173,7 +1199,7 @@ static void vfio_listener_region_del(MemoryListener *listener,
> >      hwaddr iova, end;
> >      int ret;
> >
> > -    if (vfio_listener_skipped_section(section)) {
> > +    if (vfio_listener_skipped_section(container, section)) {
> >          DPRINTF("vfio: SKIPPING region_del %"HWADDR_PRIx" - %"PRIx64"\n",
> >                  section->offset_within_address_space,
> >                  section->offset_within_address_space + section->size - 1);
> >
> > -----------------
> >
> >
> > Thanks
> > -Bharat
> >
> >
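
PS: To make the "API in memory.c" idea concrete, here is a minimal sketch of
what such a comparison interface could look like. This is untested and the
name memory_region_same_ram() is invented for illustration; it is not an
existing memory API:

    /* memory.c: compare two regions without callers touching the
     * private fields.  True when both regions are RAM-backed and
     * share the same underlying RAM block. */
    bool memory_region_same_ram(MemoryRegion *a, MemoryRegion *b)
    {
        return a->ram && b->ram && a->ram_addr == b->ram_addr;
    }

The BAR check in memory_region_is_mmap_bars() would then become:

    if (memory_region_same_ram(&bar->mmap_mem, section->mr))
        return 1;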
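
PPS: And a rough sketch of the require_iommu_map variant, again untested and
with invented names, just to show the shape of it:

    /* memory.c: accessors so vfio never reads the private fields */
    void memory_region_set_require_iommu_map(MemoryRegion *mr, bool required)
    {
        mr->require_iommu_map = required;
    }

    bool memory_region_requires_iommu_map(MemoryRegion *mr)
    {
        /* Only RAM-backed regions are candidates for IOMMU mapping;
         * mmap'd BARs are RAM-backed but would have the flag cleared. */
        return mr->ram && mr->require_iommu_map;
    }

    /* hw/vfio_pci.c: the listener check then collapses to one call
     * and no longer needs to walk the container's groups/devices. */
    static bool vfio_listener_skipped_section(MemoryRegionSection *section)
    {
        return !memory_region_requires_iommu_map(section->mr);
    }

memory_region_init_ram() would set the flag by default, and vfio_map_bar()
would clear it on the mmap_mem regions it creates (likewise wherever the
MSI-X mmap_mem region is set up).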