On Tue, Oct 31, 2023 at 08:18:06PM -0300, Fabiano Rosas wrote: > Peter Xu <pet...@redhat.com> writes: > > > On Mon, Oct 23, 2023 at 05:36:00PM -0300, Fabiano Rosas wrote: > >> Currently multifd does not need to have knowledge of pages on the > >> receiving side because all the information needed is within the > >> packets that come in the stream. > >> > >> We're about to add support to fixed-ram migration, which cannot use > >> packets because it expects the ramblock section in the migration file > >> to contain only the guest pages data. > >> > >> Add a pointer to MultiFDPages in the multifd_recv_state and use the > >> pages similarly to what we already do on the sending side. The pages > >> are used to transfer data between the ram migration code in the main > >> migration thread and the multifd receiving threads. > >> > >> Signed-off-by: Fabiano Rosas <faro...@suse.de> > > > > If it'll be new code to maintain anyway, I think we don't necessarily > > always use multifd structs, right? > > > > For the sending side, unrelated to this series, I'm experimenting with > defining a generic structure to be passed into multifd: > > struct MultiFDData_t { > void *opaque; > size_t size; > bool ready; > void (*cleanup_fn)(void *); > }; > > The client code (ram.c) would use the opaque field to put whatever it > wants in it. Maybe we could have a similar concept on the receiving > side? > > Here's a PoC I'm writing, if you're interested: > > https://github.com/farosas/qemu/commits/multifd-packet-cleanups > > (I'm delaying sending this to the list because we already have a > reasonable backlog of features and refactorings to merge.)
I went through the idea, I agree it's reasonable to generalize multifd to drop the page constraints. Actually I'm wondering maybe it should be better that we have a thread pool model for migration, then multifd can be based on that. Something like: job submissions, proper locks, notifications, quits, etc. with a bunch of API to manipulate the thread pool. And actually.. I just noticed we have. :) See util/thread-pool.c. I didn't have closer look, but that looks like something good if we can work on top (e.g., I don't think we want the bottom halfs..), or refactor to satisfy all our needs from migration pov. Not something I'm asking right away, but maybe we can at least keep an eye on. > > > Rather than introducing MultiFDPages_t into recv side, can we allow pages > > to be distributed in chunks of (ramblock, start_offset, end_offset) tuples? > > That'll be much more efficient than per-page. We don't need page granule > > here on recv side, we want to load chunks of mem fast. > > > > We don't even need page granule on sender side, but since only myself cared > > about perf.. and obviously the plan is to even drop auto-pause, then VM can > > be running there, so sender must do that per-page for now. But now on recv > > side VM must be stopped before all ram loaded, so there's no such problem. > > And since we'll introduce new code anyway, IMHO we can decide how to do > > that even if we want to reuse multifd. > > > > Main thread can assign these (ramblock, start_offset, end_offset) jobs to > > recv threads. If ramblock is too small (e.g. 1M), assign it anyway to one > > thread. If ramblock is >512MB, cut it into slices and feed them to multifd > > threads one by one. All the rest can be the same. > > > > Would that be better? I would expect measurable loading speed difference > > with much larger chunks and with that range-based tuples. > > I need to check how that would interact with the existing recv_thread > code. Hopefully there's nothing there preventing us from using a > different data structure. Sure, thanks. Maybe there's a good way to provide a middle ground on both "less code changes" and "easily maintainable", if that helps on this series being merged. What I want to make sure is we don't introduce new complicated logic but even not doing the job as correct as we can. -- Peter Xu