On 6/5/26 20:44, Bobby Eshleman wrote:
> On Fri, Jun 05, 2026 at 11:30:07AM +0200, Christian König wrote:
>> On 6/4/26 02:42, Bobby Eshleman wrote:
>>> From: Bobby Eshleman <[email protected]>
>>>
>>> get_sg_table() emitted one PAGE_SIZE sg entry per page even when the
>>> underlying folio was larger.
>>>
>>> Instead, walk folios[] and emit one sg entry per folio. When folios
>>> represent large pages (as is for MFD_HUGETLB), each sg entry is a large
>>> page. Normal PAGE_SIZE sg tables are unchanged.
>>>
>>> Required by net/core/devmem to support rx-buf-size > PAGE_SIZE with
>>> udmabuf.
>>
>> That doesn't explain why this is required.
>
> Sure, can definitely add. Devmem currently requires dmabuf sg entries to
> be length and size aligned when it allocates niovs for NIC page pools.
> Though udmabuf is not violating any dmabuf contract by emitting
> PAGE_SIZE entries and the above restriction is probably more a
> shortfalling of devmem, by emitting a single entry per folio this patch
> allows udmabuf to be used by devmem for large pages.
>
>>
>> Please note that accessing the pages/folio of an sg-table returned by
>> DMA-buf is illegal and strictly forbidden!
>>
>> Regards,
>> Christian.
>
> It seems both devmem and io_uring zcrx at least introspect through to
> the sg-table to build NIC page pools (not accessing the memory itself,
> however). Is there a better way?

That's an absolute NO-GO! We need to stop that immediately.

Touching the underlying struct page of a DMA-buf exported sg-table is
strictly forbidden.

We even have code to wrap the sg_table and hide the struct pages on
debug builds to catch those issues, see function dma_buf_wrap_sg_table().

My last status is that the NIC page pools are built directly from the
DMA addresses exposed by the sg_table. Was there any change I'm not
aware of?

Regards,
Christian.

>
> Best,
> Bobby
>
>>
>>> Signed-off-by: Bobby Eshleman <[email protected]>
>>> ---
>>>  drivers/dma-buf/udmabuf.c | 47
>>> ++++++++++++++++++++++++++++++++++++++++++-----
>>>  1 file changed, 42 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c
>>> index 94b8ecb892bb..f28dd3788ada 100644
>>> --- a/drivers/dma-buf/udmabuf.c
>>> +++ b/drivers/dma-buf/udmabuf.c
>>> @@ -141,26 +141,63 @@ static void vunmap_udmabuf(struct dma_buf *buf,
>>> struct iosys_map *map)
>>>  	vm_unmap_ram(map->vaddr, ubuf->pagecount);
>>>  }
>>>
>>> +/* Return the number of contiguous pages backed by the folio at @i.
>>> + * A udmabuf may map only part of a folio, or reference the same folio
>>> + * in multiple non-contiguous runs, so folio_nr_pages() can't be used.
>>> + */
>>> +static pgoff_t udmabuf_folio_nr_pages(struct udmabuf *ubuf, pgoff_t i)
>>> +{
>>> +	struct folio *f = ubuf->folios[i];
>>> +	pgoff_t j;
>>> +
>>> +	for (j = 1; i + j < ubuf->pagecount; j++) {
>>> +		if (ubuf->folios[i + j] != f)
>>> +			break;
>>> +		/* Same folio, but not a sequential offset within it. */
>>> +		if (ubuf->offsets[i + j] != ubuf->offsets[i] + j *
>>> PAGE_SIZE)
>>> +			break;
>>> +	}
>>> +	return j;
>>> +}
>>> +
>>> +/* Count the contiguous folio runs in @ubuf, one sg entry per run. */
>>> +static unsigned int udmabuf_sg_nents(struct udmabuf *ubuf)
>>> +{
>>> +	unsigned int nents = 0;
>>> +	pgoff_t i;
>>> +
>>> +	for (i = 0; i < ubuf->pagecount; i += udmabuf_folio_nr_pages(ubuf,
>>> i))
>>> +		nents++;
>>> +	return nents;
>>> +}
>>> +
>>>  static struct sg_table *get_sg_table(struct device *dev, struct dma_buf
>>> *buf,
>>>  				     enum dma_data_direction direction)
>>>  {
>>>  	struct udmabuf *ubuf = buf->priv;
>>> -	struct sg_table *sg;
>>>  	struct scatterlist *sgl;
>>> -	unsigned int i = 0;
>>> +	struct sg_table *sg;
>>> +	pgoff_t i, run;
>>> +	unsigned int nents;
>>>  	int ret;
>>>
>>> +	nents = udmabuf_sg_nents(ubuf);
>>> +
>>>  	sg = kzalloc_obj(*sg);
>>>  	if (!sg)
>>>  		return ERR_PTR(-ENOMEM);
>>>
>>> -	ret = sg_alloc_table(sg, ubuf->pagecount, GFP_KERNEL);
>>> +	ret = sg_alloc_table(sg, nents, GFP_KERNEL);
>>>  	if (ret < 0)
>>>  		goto err_alloc;
>>>
>>> -	for_each_sg(sg->sgl, sgl, ubuf->pagecount, i)
>>> -		sg_set_folio(sgl, ubuf->folios[i], PAGE_SIZE,
>>> +	sgl = sg->sgl;
>>> +	for (i = 0; i < ubuf->pagecount; i += run) {
>>> +		run = udmabuf_folio_nr_pages(ubuf, i);
>>> +		sg_set_folio(sgl, ubuf->folios[i], run << PAGE_SHIFT,
>>>  			     ubuf->offsets[i]);
>>> +		sgl = sg_next(sgl);
>>> +	}
>>>
>>>  	ret = dma_map_sgtable(dev, sg, direction, 0);
>>>  	if (ret < 0)
>>>
>>> --
>>> 2.53.0-Meta
>>>
>>
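
For anyone following the run-coalescing logic in the quoted patch, here is
a rough userspace model of it. This is only a sketch for reasoning about
the entry count, not kernel code: struct model_udmabuf, the integer folio
ids, MODEL_PAGE_SIZE and both model_* helpers are hypothetical stand-ins
for struct udmabuf, struct folio pointers, PAGE_SIZE and the helpers in
the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed 4 KiB base page; stands in for the kernel's PAGE_SIZE. */
#define MODEL_PAGE_SIZE 4096UL

/*
 * Hypothetical userspace stand-in for the udmabuf fields the patch reads.
 * Folios are modeled as integer ids instead of struct folio pointers.
 */
struct model_udmabuf {
	size_t pagecount;
	const int *folios;            /* folio id backing each page */
	const unsigned long *offsets; /* byte offset of each page in its folio */
};

/*
 * Length, in pages, of the run starting at page @i: consecutive pages that
 * share the folio at @i and sit at sequential offsets within it.
 */
static size_t model_folio_nr_pages(const struct model_udmabuf *ubuf, size_t i)
{
	size_t j;

	assert(i < ubuf->pagecount);

	for (j = 1; i + j < ubuf->pagecount; j++) {
		if (ubuf->folios[i + j] != ubuf->folios[i])
			break;
		/* Same folio, but not a sequential offset within it. */
		if (ubuf->offsets[i + j] !=
		    ubuf->offsets[i] + j * MODEL_PAGE_SIZE)
			break;
	}
	return j;
}

/* Number of runs in @ubuf, i.e. the sg entries the patch would allocate. */
static unsigned int model_sg_nents(const struct model_udmabuf *ubuf)
{
	unsigned int nents = 0;
	size_t i;

	for (i = 0; i < ubuf->pagecount; i += model_folio_nr_pages(ubuf, i))
		nents++;
	return nents;
}
```

With seven pages where pages 0-3 are the first four pages of folio 1,
page 4 is folio 2, and pages 5-6 reference folio 1 again from offset 0,
the model yields three runs of 4, 1 and 2 pages. The repeated reference
to folio 1 starts a new run rather than extending the first one, which is
the case the comment in the patch calls out.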

