On 6 Feb 2025, at 3:01, Andrew Morton wrote:

> On Tue, 4 Feb 2025 22:14:10 -0500 Zi Yan <z...@nvidia.com> wrote:
>
>> This patchset adds a new buddy allocator like (or non-uniform) large folio
>> split to reduce the total number of after-split folios, the amount of memory
>> needed for multi-index xarray split, and keep more large folios after a
>> split.
>
> It would be useful (vital, really) to provide some measurements which
> help others understand the magnitude of these resource savings, please.

Hi Andrew,

Can you please drop this series for now? After your request above, I found
that I had misunderstood how xas_split_alloc() and xas_split() work in
xarray. As a result, my current implementation allocates more xa_nodes than
needed during a non-uniform split, although the excess ones are freed at the
end. That defeats the purpose of reducing the memory consumption of
multi-index xarray split, even though folio_split() has no functional issue
AFAICT. I am working on a better implementation that might require new
xarray operations, and I will post it as v7 later.

I really appreciate that you asked for more info above. :)

More details on the memory saving for multi-index xarray split during
non-uniform split, compared to the existing uniform split (I will add this
to the commit log in the next version):

The existing uniform split requires 2^(order % XA_CHUNK_SHIFT) xa_node
allocations when the folio needs to be split to order-0, whereas non-uniform
split requires at most 1 xa_node allocation. For example, to split an
order-9 folio, uniform split needs 8 xa_nodes (2^(9 % 6) = 8, assuming
XA_CHUNK_SHIFT is 6), since the folio takes 8 multi-index slots in the
xarray. For non-uniform split, only the slot containing the given struct
page needs an xa_node after the split. That is a saving of 7 xa_nodes.

Hi Matthew,

Do you mind checking my statement above on the xarray memory saving? Please
correct me if I missed anything. Thanks.

Best Regards,
Yan, Zi
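
The xa_node arithmetic in the message above can be checked with a small
userspace sketch. It takes the stated formula at face value and is not
kernel code: the helper names are hypothetical, and XA_CHUNK_SHIFT is
assumed to be 6 (its value on kernels built without CONFIG_BASE_SMALL).

```c
#include <assert.h>

/* Assumption: XA_CHUNK_SHIFT is 6, i.e. 64 slots per xa_node. */
#define XA_CHUNK_SHIFT 6

/*
 * Hypothetical helper: xa_node allocations for a uniform split to order-0,
 * using the formula quoted in the message: 2^(order % XA_CHUNK_SHIFT).
 */
unsigned long uniform_split_xa_nodes(unsigned int order)
{
	/* Guard against nonsensical orders in this sketch. */
	assert(order < 64);
	return 1UL << (order % XA_CHUNK_SHIFT);
}

/*
 * Hypothetical helper: upper bound stated in the message for a non-uniform
 * split, where only the slot holding the target page needs a new xa_node.
 */
unsigned long nonuniform_split_xa_nodes_max(unsigned int order)
{
	(void)order;	/* the stated bound does not depend on the order */
	return 1;
}
```

For the order-9 example, uniform_split_xa_nodes(9) evaluates 1 << (9 % 6),
and the difference from nonuniform_split_xa_nodes_max(9) is the saving
quoted in the message.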