+Tomasz, so he can reply to the thread.
+Marek and Russell, as recommended by Tomasz.
On Oct 30, 2015 22:27, "Robin Murphy" <robin.mur...@arm.com> wrote:
>
> Hi Dan,
>
> On 30/10/15 01:17, Daniel Kurtz wrote:
>>
>> +linux-media & VIDEOBUF2 FRAMEWORK maintainers since this is about the
>> v4l2-contig's usage of the DMA API.
>>
>> Hi Robin,
>>
>> On Tue, Oct 27, 2015 at 12:55 AM, Robin Murphy <robin.mur...@arm.com> wrote:
>>>
>>> On 26/10/15 13:44, Yong Wu wrote:
>>>>
>>>> On Thu, 2015-10-01 at 20:13 +0100, Robin Murphy wrote:
>>>> [...]
>>>>>
>>>>> +/*
>>>>> + * The DMA API client is passing in a scatterlist which could describe
>>>>> + * any old buffer layout, but the IOMMU API requires everything to be
>>>>> + * aligned to IOMMU pages. Hence the need for this complicated bit of
>>>>> + * impedance-matching, to be able to hand off a suitably-aligned list,
>>>>> + * but still preserve the original offsets and sizes for the caller.
>>>>> + */
>>>>> +int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
>>>>> +               int nents, int prot)
>>>>> +{
>>>>> +       struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
>>>>> +       struct iova_domain *iovad = domain->iova_cookie;
>>>>> +       struct iova *iova;
>>>>> +       struct scatterlist *s, *prev = NULL;
>>>>> +       dma_addr_t dma_addr;
>>>>> +       size_t iova_len = 0;
>>>>> +       int i;
>>>>> +
>>>>> +       /*
>>>>> +        * Work out how much IOVA space we need, and align the segments to
>>>>> +        * IOVA granules for the IOMMU driver to handle. With some clever
>>>>> +        * trickery we can modify the list in-place, but reversibly, by
>>>>> +        * hiding the original data in the as-yet-unused DMA fields.
>>>>> +        */
>>>>> +       for_each_sg(sg, s, nents, i) {
>>>>> +               size_t s_offset = iova_offset(iovad, s->offset);
>>>>> +               size_t s_length = s->length;
>>>>> +
>>>>> +               sg_dma_address(s) = s->offset;
>>>>> +               sg_dma_len(s) = s_length;
>>>>> +               s->offset -= s_offset;
>>>>> +               s_length = iova_align(iovad, s_length + s_offset);
>>>>> +               s->length = s_length;
>>>>> +
>>>>> +               /*
>>>>> +                * The simple way to avoid the rare case of a segment
>>>>> +                * crossing the boundary mask is to pad the previous one
>>>>> +                * to end at a naturally-aligned IOVA for this one's size,
>>>>> +                * at the cost of potentially over-allocating a little.
>>>>> +                */
>>>>> +               if (prev) {
>>>>> +                       size_t pad_len = roundup_pow_of_two(s_length);
>>>>> +
>>>>> +                       pad_len = (pad_len - iova_len) & (pad_len - 1);
>>>>> +                       prev->length += pad_len;
>>>>
>>>> Hi Robin,
>>>>      During our v4l2 testing, it seems that we hit a problem here.
>>>>      Here we update prev->length again; do we need to update
>>>> sg_dma_len(prev) too?
>>>>
>>>>      Some functions like vb2_dc_get_contiguous_size[1] always compare
>>>> sg_dma_len(s) instead of s->length, so they may break unexpectedly
>>>> when sg_dma_len(s) is not the same as s->length.
>>>
>>> This is just tweaking the faked-up length that we hand off to iommu_map_sg()
>>> (see also the iova_align() above), to trick it into bumping this segment up
>>> to a suitable starting IOVA. The real length at this point is stashed in
>>> sg_dma_len(s), and will be copied back into s->length in __finalise_sg(), so
>>> both will hold the same true length once we return to the caller.
>>>
>>> Yes, it does mean that if you have a list where the segment lengths are page
>>> aligned but not monotonically decreasing, e.g.
>>> {64k, 16k, 64k}, then you'll
>>> still end up with a gap between the second and third segments, but that's
>>> fine because the DMA API offers no guarantees about what the resulting DMA
>>> addresses will be (consider the no-IOMMU case where they would each just be
>>> "mapped" to their physical address). If that breaks v4l, then it's probably
>>> v4l's DMA API use that needs looking at (again).
>>
>> Hmm, I thought the DMA API maps a (possibly) non-contiguous set of
>> memory pages into a contiguous block in device memory address space.
>> This would allow passing a dma mapped buffer to device dma using just
>> a device address and length.
>
> Not at all. The streaming DMA API (dma_map_* and friends) has two
> responsibilities: performing any necessary cache maintenance to ensure the
> device will correctly see data from the CPU, and the CPU will correctly see
> data from the device; and working out an address for that buffer from the
> device's point of view to actually hand off to the hardware (which is
> perfectly well allowed to fail).
>
> Consider SWIOTLB's implementation - segments which already lie at physical
> addresses within the device's DMA mask just get passed through, while those
> that lie outside it get mapped into the bounce buffer, but still as
> individual allocations (arch code just handles cache maintenance on the
> resulting physical addresses and can apply any hard-wired DMA offset for the
> device concerned).
>
>> IIUC, the change above breaks this model by inserting gaps in how the
>> buffer is mapped to device memory, such that the buffer is no longer
>> contiguous in dma address space.
>
> Even the existing arch/arm IOMMU DMA code which I guess this implicitly
> relies on doesn't guarantee that behaviour - if the mapping happens to reach
> one of the segment length/boundary limits it won't just leave a gap, it'll
> start an entirely new IOVA allocation which could well start at a wildly
> different address[0].
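To make the {64k, 16k, 64k} example above concrete: walking the padding logic quoted at the top of the thread, the second segment needs no padding (the running total of 64k is already 16k-aligned), but the second segment gets padded by 48KiB before the third is placed, so the three segments land at relative IOVAs 0x0, 0x10000 and 0x20000, with a hole from 0x14000 to 0x20000. Below is a small user-space sketch of that arithmetic; it only illustrates the quoted logic (roundup_pow_of_two() is reimplemented locally as a stand-in for the kernel helper), it is not kernel code.

/*
 * User-space illustration of the padding arithmetic in the quoted
 * iommu_dma_map_sg() hunk, walked over the {64k, 16k, 64k} example.
 */
#include <stdio.h>
#include <stddef.h>

static size_t roundup_pow_of_two(size_t x)
{
        size_t p = 1;

        while (p < x)
                p <<= 1;
        return p;
}

int main(void)
{
        size_t lengths[] = { 64 << 10, 16 << 10, 64 << 10 };
        size_t iova_len = 0;    /* running length of the IOVA allocation */
        size_t i;

        for (i = 0; i < 3; i++) {
                size_t s_length = lengths[i];

                if (i > 0) {
                        /* pad the previous segment so this one starts at a
                         * naturally-aligned offset for its rounded-up size */
                        size_t pad_len = roundup_pow_of_two(s_length);

                        pad_len = (pad_len - iova_len) & (pad_len - 1);
                        iova_len += pad_len;    /* i.e. prev->length += pad_len */
                }
                printf("segment %zu: IOVA offset 0x%zx, length 0x%zx\n",
                       i, iova_len, s_length);
                iova_len += s_length;
        }
        /* prints offsets 0x0, 0x10000, 0x20000: a 48KiB hole after segment 1 */
        return 0;
}

That hole is exactly what makes vb2_dc_get_contiguous_size() in the code quoted below bail out early.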
>
>> Here is the code in question from
>> drivers/media/v4l2-core/videobuf2-dma-contig.c :
>>
>> static unsigned long vb2_dc_get_contiguous_size(struct sg_table *sgt)
>> {
>>      struct scatterlist *s;
>>      dma_addr_t expected = sg_dma_address(sgt->sgl);
>>      unsigned int i;
>>      unsigned long size = 0;
>>
>>      for_each_sg(sgt->sgl, s, sgt->nents, i) {
>>              if (sg_dma_address(s) != expected)
>>                      break;
>>              expected = sg_dma_address(s) + sg_dma_len(s);
>>              size += sg_dma_len(s);
>>      }
>>      return size;
>> }
>>
>>
>> static void *vb2_dc_get_userptr(void *alloc_ctx, unsigned long vaddr,
>>      unsigned long size, enum dma_data_direction dma_dir)
>> {
>>      struct vb2_dc_conf *conf = alloc_ctx;
>>      struct vb2_dc_buf *buf;
>>      struct frame_vector *vec;
>>      unsigned long offset;
>>      int n_pages, i;
>>      int ret = 0;
>>      struct sg_table *sgt;
>>      unsigned long contig_size;
>>      unsigned long dma_align = dma_get_cache_alignment();
>>      DEFINE_DMA_ATTRS(attrs);
>>
>>      dma_set_attr(DMA_ATTR_SKIP_CPU_SYNC, &attrs);
>>
>>      buf = kzalloc(sizeof *buf, GFP_KERNEL);
>>      buf->dma_dir = dma_dir;
>>
>>      offset = vaddr & ~PAGE_MASK;
>>      vec = vb2_create_framevec(vaddr, size, dma_dir == DMA_FROM_DEVICE);
>>      buf->vec = vec;
>>      n_pages = frame_vector_count(vec);
>>
>>      sgt = kzalloc(sizeof(*sgt), GFP_KERNEL);
>>
>>      ret = sg_alloc_table_from_pages(sgt, frame_vector_pages(vec),
>>              n_pages, offset, size, GFP_KERNEL);
>>
>>      sgt->nents = dma_map_sg_attrs(buf->dev, sgt->sgl, sgt->orig_nents,
>>              buf->dma_dir, &attrs);
>>
>>      contig_size = vb2_dc_get_contiguous_size(sgt);
>
> (as an aside, it's rather unintuitive that the handling of the dma_map_sg
> call actually failing is entirely implicit here)
>
>>      if (contig_size < size) {
>>
>> <<<=== if the original buffer had sg entries that were not
>> aligned on the "natural" alignment for their size, the new arm64 iommu
>> core code inserts a 'gap' in the iommu mapping, which causes
>> vb2_dc_get_contiguous_size() to exit early (and return a smaller size
>> than expected).
>>
>>              pr_err("contiguous mapping is too small %lu/%lu\n",
>>                      contig_size, size);
>>              ret = -EFAULT;
>>              goto fail_map_sg;
>>      }
>>
>>
>> So, is videobuf2-dma-contig.c based on an incorrect assumption
>> about how the DMA API is supposed to work?
>> Is it even possible to map a "contiguous-in-iova-range" mapping for a
>> buffer given as an sg_table with an arbitrary set of pages?
>
> From the Streaming DMA mappings section of Documentation/DMA-API.txt:
>
>       Note also that the above constraints on physical contiguity and
>       dma_mask may not apply if the platform has an IOMMU (a device which
>       maps an I/O DMA address to a physical memory address). However, to be
>       portable, device driver writers may *not* assume that such an IOMMU
>       exists.
>
> There's not strictly any harm in using the DMA API this way and *hoping* you
> get what you want, as long as you're happy for it to fail pretty much 100% of
> the time on some systems, and still in a minority of corner cases on any
> system. However, if there's a real dependency on IOMMUs and tight control of
> IOVA allocation here, then the DMA API isn't really the right tool for the
> job, and maybe it's time to start looking at how to better fit these
> multimedia-subsystem-type use cases into the IOMMU API - as far as I
> understand it there's at least some conceptual overlap with the HSA PASID
> stuff being prototyped in PCI/x86-land at the moment, so it could be an
> apposite time to try and bang out some common requirements.
>
> Robin.
>
> [0]:http://article.gmane.org/gmane.linux.kernel.iommu/11185
>
>>
>> Thanks for helping to move this forward.
>>
>> -Dan
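For what it's worth, if the conclusion is that v4l2 really does depend on a single contiguous IOVA range, the direction Robin points at (driving the IOMMU API directly rather than dma_map_sg()) might look roughly like the sketch below. This is only an illustration under several assumptions: the driver owns its own iommu_domain and picks iova_base itself, map_pages_contiguously() is a hypothetical helper rather than an existing kernel function, and cache maintenance plus IOVA-range management (the other half of what the DMA API normally does) still have to be handled separately.

#include <linux/iommu.h>
#include <linux/io.h>
#include <linux/mm.h>

/*
 * Hypothetical sketch: map an arbitrary set of pages at consecutive IOVAs
 * in a driver-owned domain, so the device sees one contiguous range.
 */
static int map_pages_contiguously(struct iommu_domain *domain,
                                  unsigned long iova_base,
                                  struct page **pages, int n_pages, int prot)
{
        unsigned long iova = iova_base;
        int i, ret;

        for (i = 0; i < n_pages; i++) {
                ret = iommu_map(domain, iova, page_to_phys(pages[i]),
                                PAGE_SIZE, prot);
                if (ret) {
                        /* roll back whatever was mapped so far */
                        if (iova > iova_base)
                                iommu_unmap(domain, iova_base,
                                            iova - iova_base);
                        return ret;
                }
                iova += PAGE_SIZE;
        }
        return 0;
}

A caller would attach the device to the domain with iommu_attach_device() beforehand and hand iova_base to the hardware afterwards; whether that is preferable to teaching the DMA API about this use case is exactly the open question in this thread.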