On Mon, Jun 08, 2026 at 03:59:04PM +0200, Christian König wrote:
> On 6/8/26 15:55, Bobby Eshleman wrote:
> > 
> > On Sun, Jun 7, 2026 at 11:42 PM Christian König <[email protected]> wrote:
> > 
> >     On 6/5/26 20:44, Bobby Eshleman wrote:
> >     > On Fri, Jun 05, 2026 at 11:30:07AM +0200, Christian König wrote:
> >     >> On 6/4/26 02:42, Bobby Eshleman wrote:
> >     >>> From: Bobby Eshleman <[email protected]>
> >     >>>
> >     >>> get_sg_table() emitted one PAGE_SIZE sg entry per page even when the
> >     >>> underlying folio was larger.
> >     >>>
> >     >>> Instead, walk folios[] and emit one sg entry per folio. When folios
> >     >>> represent large pages (as is for MFD_HUGETLB), each sg entry is a large
> >     >>> page. Normal PAGE_SIZE sg tables are unchanged.
> >     >>>
> >     >>> Required by net/core/devmem to support rx-buf-size > PAGE_SIZE with
> >     >>> udmabuf.
> >     >>
> >     >> That doesn't explain why this is required.
> >     >
> >     > Sure, can definitely add. Devmem currently requires dmabuf sg entries to
> >     > be length and size aligned when it allocates niovs for NIC page pools.
> >     > Though udmabuf is not violating any dmabuf contract by emitting
> >     > PAGE_SIZE entries and the above restriction is probably more a
> >     > shortcoming of devmem, by emitting a single entry per folio this patch
> >     > allows udmabuf to be used by devmem for large pages.
> >     >
> >     >>
> >     >> Please note that accessing the pages/folio of an sg-table returned by DMA-buf is illegal and strictly forbidden!
> >     >>
> >     >> Regards,
> >     >> Christian.
> >     >
> >     > It seems both devmem and io_uring zcrx at least introspect through to
> >     > the sg-table to build NIC page pools (not accessing the memory itself,
> >     > however). Is there a better way?
> > 
> >     That's an absolute NO-GO! We need to stop that immediately.
> > 
> >     Touching the underlying struct page of a DMA-buf exported sg-table is strictly forbidden.
> > 
> >     We even have code to wrap the sg_table and hide the struct pages on debug builds to catch those issues, see function dma_buf_wrap_sg_table().
> > 
> >     My last status is that the NIC page pools are built directly from the DMA addresses exposed by the sg_table.
> > 
> >     Was there any change I'm not aware of?
> > 
> >     Regards,
> >     Christian.
> > 
> > 
> > Oh no change, your mental model is still current.
> > They just go through each sg and use sg_dma_address() on each.
> 
> Ah, thanks! That was a near heart attack :D
> 
> Yeah that is perfectly correct, question is do you then still really need 
> this udmabuf change? I mean the DMA API usually merges together contiguous 
> DMA addresses.
> 
> Regards,
> Christian.
> 

Hey Christian, sorry for the delay, I just wanted to double-check what
I'm seeing...

I reverted the udmabuf patch and confirmed that devmem still ends up with
4K sg entries even for a hugepage-backed udmabuf. I see that the
dma_map_direct() path is being taken, which, if I am reading the code
correctly, results in sg_dma_len(sg) inheriting sg->length directly (set
by udmabuf's sg_set_folio(..., PAGE_SIZE) call). The iommu_dma_map_phys()
path, by contrast, looks like it does merge when possible.
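
To make the difference concrete, here is a rough userspace model of what
I think is happening. This is not kernel code: struct seg, fill_pages(),
map_direct() and map_merging() are made-up names for illustration, and
the real merging path has constraints (segment size limits, boundaries)
that this sketch ignores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u /* assumed 4K base page */

/* Hypothetical stand-in for one mapped scatterlist entry. */
struct seg {
	uint64_t dma_addr; /* models sg_dma_address(sg) */
	size_t len;        /* models sg_dma_len(sg) */
};

/* Describe one contiguous region as n page-sized entries, the way
 * the unpatched exporter is described above. */
static void fill_pages(struct seg *s, size_t n, uint64_t base)
{
	assert(s != NULL);
	for (size_t i = 0; i < n; i++) {
		s[i].dma_addr = base + (uint64_t)i * MODEL_PAGE_SIZE;
		s[i].len = MODEL_PAGE_SIZE;
	}
}

/* Model of a direct mapping: every entry is carried over as-is, so
 * the mapped length is simply inherited from the input entry. */
static size_t map_direct(const struct seg *in, size_t n, struct seg *out)
{
	for (size_t i = 0; i < n; i++)
		out[i] = in[i];
	return n;
}

/* Model of a merging mapping: an entry that starts exactly where the
 * previous output entry ends is folded into that entry. */
static size_t map_merging(const struct seg *in, size_t n, struct seg *out)
{
	size_t m = 0;

	for (size_t i = 0; i < n; i++) {
		if (m > 0 && out[m - 1].dma_addr + out[m - 1].len == in[i].dma_addr)
			out[m - 1].len += in[i].len;
		else
			out[m++] = in[i];
	}
	return m;
}
```

In this model, a 2M hugepage described as 512 page-sized entries stays at
512 entries of 4K through map_direct(), while map_merging() collapses it
into a single 2M entry. That matches what devmem sees in my test, as far
as I can tell.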

Best,
Bobby
