On Wed, Apr 09, 2025 at 12:37:14PM -0300, Jason Gunthorpe wrote:
> On Wed, Apr 09, 2025 at 04:58:16PM +0300, Mike Rapoport wrote:
> > >
> > > I think we still don't really know what will be needed, so I'd stick
> > > with folio only as that allows building the memfd and a potential slab
> > > preservation system.
> >
> > void * seems to me much more reasonable than folio one as the starting
> > point because it allows preserving folios with the right order but it's not
> > limited to it.
>
> It would just call kho_preserve_folio() under the covers though.

How will that work for memblock and 1G pages?

> > I don't mind having kho_preserve_folio() from day 1 and even stretching the
> > use case we have right now to use it to preserve FDT memory.
> >
> > But kho_preserve_folio() does not make sense for reserve_mem and it won't
> > make sense for vmalloc.
>
> It does for vmalloc too, just stop thinking about it as a
> folio-for-pagecache and instead as an arbitrary order handle to buddy
> allocator memory that will someday be changed to a memdesc :|

But we have memdesc today, it's struct page. It will be shrunk and maybe
renamed, it will contain a pointer rather than data, but that's what basic
memdesc is. And when the data structure that memdesc points to is
allocated separately, folios won't make sense for order-0 allocations.

> > The weird games slab does with casting back and forth to folio also seem to
> > me like transitional and there won't be that folios in slab later.
>
> Yes transitional, but we are at the transitional point and KHO should
> fit in.
>
> The lowest allocator primitive returns folios, which can represent any
> order, and the caller casts to their own memdesc.

The lowest allocation primitive returns pages.

struct folio *__folio_alloc_noprof(gfp_t gfp, unsigned int order,
		int preferred_nid, nodemask_t *nodemask)
{
	struct page *page = __alloc_pages_noprof(gfp | __GFP_COMP, order,
					preferred_nid, nodemask);
	return page_rmappable_folio(page);
}
EXPORT_SYMBOL(__folio_alloc_noprof);

And page_rmappable_folio() hints at folio-for-pagecache very clearly.

And I don't think folio will be the lowest primitive buddy returns anytime
soon, if ever.

> Jason
>

-- 
Sincerely yours,
Mike.