On Fri, Apr 04, 2025 at 09:47:29AM -0300, Jason Gunthorpe wrote:
> On Fri, Apr 04, 2025 at 12:54:25PM +0300, Mike Rapoport wrote:
> > > IMHO it should not call kho_preserve_phys() at all.
> >
> > Do you mean that for preserving large physical ranges we need something
> > entirely different?
>
> If they don't use the buddy allocator, then yes?
>
> > Then we don't need the bitmaps at this point, as we don't have any users
> > for kho_preserve_folio() and we should not worry ourself with orders and
> > restoration of high order folios until then ;-)
>
> Arguably yes :\
>
> Maybe change the reserved regions code to put the region list in a
> folio and preserve the folio instead of using FDT as a "demo" for the
> functionality.

Folios are not available when we restore reserved regions, so this just
won't work.

> > The xarrays + bitmaps do have the limitation that we cannot store any
> > information about the folio except its order and if we are anyway need
> > something else to preserve physical ranges, I suggest starting with
> > preserving ranges and then adding optimizations for the folio case.
>
> Why? What is the use case for physical ranges that isn't handled
> entirely by reserved_mem_add()?
>
> We know what the future use case is for the folio preservation, all
> the drivers and the iommu are going to rely on this.

We don't know how much of the preservation will be based on folios.

Most drivers do not use folios, and for preserving memfd* and hugetlb we'd
need to have some dance around that memory anyway. So I think
kho_preserve_folio() would be a part of the fdbox or whatever that
functionality will be called.

> > Here's something that implements preservation of ranges (compile tested
> > only) and adding folios with their orders and maybe other information would
> > be quite easy.
>
> But folios and their orders is the *whole point*, again I don't see
> any use case for preserving ranges, beyond it being a way to optimize
> the memblock reserve path. But that path should be fixed up to just
> use the bitmap directly..

Are they?
The purpose of basic KHO is to make sure the memory we want to preserve is
not trampled over. Preserving folios with their orders means we need to
make sure the memory range of the folio is preserved and that we carry
additional information to actually recreate the folio object, in case it is
needed and in case it is possible. Hugetlb, for instance, has its own way
of initializing folios, and just keeping the order won't be enough for that.

As for the optimizations of the memblock reserve path, currently it is what
hurts the most in my and Pratyush's experiments.
They are not very representative, but still, preserving lots of
pages/folios spread all over would take its toll on the mm initialization.
And I don't think invasive changes to buddy and memory map initialization
are the best way to move forward and optimize that. Quite possibly we'd
want to be able to minimize the number of *ranges* that we preserve.

So of the three alternatives we have now (xarrays + bitmaps, tables +
bitmaps and maple tree for ranges), maple tree seems to be the simplest and
efficient enough to start with.

Preserving folio orders with it is really straightforward, and until we see
some real data on how the entire KHO machinery is used, I'd prefer simple
over anything else.

> Jason

-- 
Sincerely yours,
Mike.
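
A minimal userspace sketch of the range-minimization point made above: pages
preserved one by one collapse into far fewer ranges once neighbours are
merged, and it is the number of ranges, not pages, that the reserve path
would have to walk. Everything here is an assumption for illustration:
struct phys_range and coalesce_ranges() are hypothetical names, not KHO or
memblock API, and a sorted array stands in for the maple tree.

```c
/* Userspace illustration only; not kernel code. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Half-open physical range [start, end); hypothetical type. */
struct phys_range {
	uint64_t start;
	uint64_t end;
};

/* qsort() comparator: order ranges by start address. */
static int range_cmp(const void *a, const void *b)
{
	const struct phys_range *ra = a, *rb = b;

	if (ra->start < rb->start)
		return -1;
	return ra->start > rb->start;
}

/*
 * Sort r[] by start and merge ranges that overlap or touch.
 * Returns the number of ranges left at the front of r[].
 */
static size_t coalesce_ranges(struct phys_range *r, size_t n)
{
	size_t out = 0;

	if (n == 0)
		return 0;

	qsort(r, n, sizeof(*r), range_cmp);
	assert(r[0].start < r[0].end);

	for (size_t i = 1; i < n; i++) {
		assert(r[i].start < r[i].end);
		if (r[i].start <= r[out].end) {
			/* Overlapping or adjacent: extend the current range. */
			if (r[i].end > r[out].end)
				r[out].end = r[i].end;
		} else {
			/* Gap: start a new output range. */
			r[++out] = r[i];
		}
	}
	return out + 1;
}
```

With three adjacent 4K pages and one distant page as input, the four
entries reduce to two ranges, which is the effect a range-based store gets
when neighbouring entries are merged on insertion.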