On Mon, Jun 08, 2026 at 03:59:04PM +0200, Christian König wrote:
> On 6/8/26 15:55, Bobby Eshleman wrote:
> >
> > On Sun, Jun 7, 2026 at 11:42 PM Christian König <[email protected]> wrote:
> >
> > On 6/5/26 20:44, Bobby Eshleman wrote:
> > > On Fri, Jun 05, 2026 at 11:30:07AM +0200, Christian König wrote:
> > >> On 6/4/26 02:42, Bobby Eshleman wrote:
> > >>> From: Bobby Eshleman <[email protected]>
> > >>>
> > >>> get_sg_table() emitted one PAGE_SIZE sg entry per page even when the
> > >>> underlying folio was larger.
> > >>>
> > >>> Instead, walk folios[] and emit one sg entry per folio. When folios
> > >>> represent large pages (as is for MFD_HUGETLB), each sg entry is a large
> > >>> page. Normal PAGE_SIZE sg tables are unchanged.
> > >>>
> > >>> Required by net/core/devmem to support rx-buf-size > PAGE_SIZE with
> > >>> udmabuf.
> > >>
> > >> That doesn't explain why this is required.
> > >
> > > Sure, can definitely add. Devmem currently requires dmabuf sg entries to
> > > be length and size aligned when it allocates niovs for NIC page pools.
> > > Though udmabuf is not violating any dmabuf contract by emitting
> > > PAGE_SIZE entries and the above restriction is probably more a
> > > shortfalling of devmem, by emitting a single entry per folio this patch
> > > allows udmabuf to be used by devmem for large pages.
> > >
> > >>
> > >> Please note that accessing the pages/folio of an sg-table returned
> > >> by DMA-buf is illegal and strictly forbidden!
> > >>
> > >> Regards,
> > >> Christian.
> > >
> > > It seems both devmem and io_uring zcrx at least introspect through to
> > > the sg-table to build NIC page pools (not accessing the memory itself,
> > > however). Is there a better way?
> >
> > That's an absolute NO-GO! We need to stop that immediately.
> >
> > Touching the underlying struct page of an DMA-buf exported sg-table is
> > strictly forbidden.
> >
> > We even have code to wrap the sg_table and hide the struct pages on
> > debug builds to catch those issues, see function dma_buf_wrap_sg_table().
> >
> > My last status is that the NIC page pools are build directly from the
> > DMA addresses exposed by the sg_table.
> >
> > Was there any change I'm not aware of?
> >
> > Regards,
> > Christian.
> >
> >
> > Oh no change, your mental model is still current.
> > They just go through each sg and use sg_dma_address() on each.
>
> Ah, thanks! That was a near heart attack :D
>
> Yeah that is perfectly correct, question is do you then still really need
> this udmabuf change? I mean the DMA API usually merges together contiguous
> DMA addresses.
>
> Regards,
> Christian.
>
Hey Christian, sorry for the delay, I just wanted to double-check what I'm
seeing...

I reverted the udmabuf patch and confirmed that devmem still ends up with 4K
pages, even for a hugepage-backed udmabuf.

I see that the dma_map_direct() path is being taken. If I am reading the code
correctly, that results in sg_dma_len(sg) inheriting sg->length directly
(set by udmabuf's sg_set_folio(..., PAGE_SIZE) call), whereas the
iommu_dma_map_phys() path looks like it does merge when possible.

Best,
Bobby
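
To make the difference between the two paths concrete, here is a small
userspace model of the behaviour described above. It is only a sketch and not
kernel code: struct model_sg and the model_*() helpers are hypothetical names
made up for illustration, and the merging rule is simplified to "coalesce when
the next entry starts where the previous one ends", ignoring things like
segment size limits that a real IOMMU mapping has to respect.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u

/* Hypothetical stand-in for one sg entry; NOT the kernel's struct scatterlist. */
struct model_sg {
	uint64_t dma_addr;
	uint32_t len;
};

/* Build one PAGE_SIZE entry per page of a contiguous region, which is
 * what a per-page sg table looks like for a single large folio. */
static size_t model_fill_per_page(struct model_sg *sg, size_t cap,
				  uint64_t base, size_t npages)
{
	assert(npages <= cap);
	for (size_t i = 0; i < npages; i++) {
		sg[i].dma_addr = base + (uint64_t)i * MODEL_PAGE_SIZE;
		sg[i].len = MODEL_PAGE_SIZE;
	}
	return npages;
}

/* Assumed direct-mapping behaviour: entries map 1:1 and each DMA length
 * is simply inherited from the entry's length. */
static size_t model_map_direct(const struct model_sg *in, size_t n,
			       struct model_sg *out)
{
	for (size_t i = 0; i < n; i++)
		out[i] = in[i];
	return n;
}

/* Assumed merging behaviour: adjacent entries are coalesced whenever
 * they are contiguous in DMA address space. */
static size_t model_map_merge(const struct model_sg *in, size_t n,
			      struct model_sg *out)
{
	size_t m = 0;

	for (size_t i = 0; i < n; i++) {
		if (m > 0 && out[m - 1].dma_addr + out[m - 1].len == in[i].dma_addr)
			out[m - 1].len += in[i].len;
		else
			out[m++] = in[i];
	}
	return m;
}
```

For a 2M hugepage described as 512 per-page entries, the direct model hands
back 512 segments of 4K each, while the merging model hands back a single 2M
segment. A consumer that sizes its buffers from the DMA segment lengths would
therefore only see large segments in the second case.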

