Hi Mike,

On Fri, Apr 04 2025, Mike Rapoport wrote:

[...]
> As for the optimizations of memblock reserve path, currently it what hurts
> the most in my and Pratyush experiments. They are not very representative,
> but still, preserving lots of pages/folios spread all over would have it's
> toll on the mm initialization. And I don't think invasive changes to how
> buddy and memory map initialization are the best way to move forward and
> optimize that. Quite possibly we'd want to be able to minimize amount of
> *ranges* that we preserve.
>
> So from the three alternatives we have now (xarrays + bitmaps, tables +
> bitmaps and maple tree for ranges) maple tree seems to be the simplest and
> efficient enough to start with.

But you'd need to somehow serialize the maple tree ranges into some
format. So you would either end up going back to the kho_mem ranges we
had, or have to invent something more complex. The sample code you wrote
is pretty much going back to having kho_mem ranges.

And if you say that we should minimize the amount of ranges, the table +
bitmaps is still a fairly good data structure. You can very well have a
higher order table where your entire range is a handful of bits. This
lets you track a small number of ranges fairly efficiently -- both in
terms of memory and in terms of CPU. I think the only place where it
doesn't work as well as a maple tree is if you want to merge or split a
lot ranges quickly. But if you say that you only want to have a handful
of ranges, does that really matter?

Also, I think the allocation pattern depends on which use case you have
in mind. For hypervisor live update, you might very well only have a
handful of ranges. The use case I have in mind is for taking a userspace
process, quickly checkpointing it by dumping its memory contents to a
memfd, and restoring it after KHO. For that, the ability to do random
sparse allocations quickly helps a lot.

So IMO the table works well for both sparse and dense allocations. So
why have a data structure that only solves one problem when we can have
one that solves both? And honestly, I don't think the table is that much
more complex either -- both in terms of understanding the idea and in
terms of code -- the whole thing is like 200 lines.

Also, I think changes to buddy initialization _is_ the way to optimize
boot times. Having maple tree ranges and moving them around into
memblock ranges does not really scale very well for anything other than
a handful of ranges, and we shouldn't limit ourselves to that without
good reason.

>  
> Preserving folio orders with it is really straighforward and until we see
> some real data of how the entire KHO machinery is used, I'd prefer simple
> over anything else.

-- 
Regards,
Pratyush Yadav

Reply via email to