On Thu, Feb 03, 2022 at 05:45:32PM +0000, Dr. David Alan Gilbert wrote:
> * Peter Xu (pet...@redhat.com) wrote:
> > This patch enables the postcopy-preempt feature.
> > 
> > It contains two major changes to the migration logic:
> > 
> > (1) Postcopy requests are now sent via a different socket from the
> >     precopy background migration stream, so as to be isolated from
> >     very high page request delays.
> > 
> > (2) For huge-page-enabled hosts: when there are postcopy requests,
> >     they can now intercept a partial sending of huge host pages on
> >     the src QEMU.
> > 
> > After this patch, we'll have two "channels" (or say, sockets, because
> > it's only supported on socket-based channels) for postcopy: (1) the
> > PRECOPY channel (the default channel, which transfers background
> > pages), and (2) the POSTCOPY channel (which only transfers requested
> > pages).
> > 
> > On the source QEMU, when we find a postcopy request, we'll interrupt
> > the PRECOPY channel's sending process and quickly switch to the
> > POSTCOPY channel.  After we have serviced all the high-priority
> > postcopy pages, we'll switch back to the PRECOPY channel and continue
> > sending the interrupted huge page.  No new thread is introduced on
> > the source.
> > 
> > On the destination QEMU, one new thread is introduced to receive page
> > data from the postcopy-specific socket.
> > 
> > This patch has a side effect.  Previously, after sending postcopy
> > pages, we'd assume the guest would go on to access the following
> > pages, so we kept sending from there.  Now that's changed: instead of
> > continuing from a postcopy requested page, we go back and resume
> > sending the precopy huge page (which may have been intercepted by a
> > postcopy request and hence partially sent before).
> > 
> > Whether that's a problem is debatable, because "assuming the guest
> > will continue to access the next page" doesn't really suit huge
> > pages, especially large ones (e.g. 1GB pages), so that locality hint
> > is largely meaningless when huge pages are used.
> > 
> > If postcopy preempt is enabled, a separate channel is created so that
> > it can be used later for postcopy-specific page requests.  On the dst
> > node, a standalone thread is used to receive postcopy requested
> > pages.  The thread is created along with the ram listen thread during
> > the POSTCOPY_LISTEN phase.
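To make the switching described above easier to follow, here is a minimal
sketch of one iteration of the sender loop.  All identifiers below are made
up for illustration; they are not the actual functions or types in this
series:

#include <stdbool.h>

/* Hypothetical stand-ins for the real migration structures. */
typedef struct MigrationState MigrationState;
typedef struct Page Page;

enum Channel { CH_PRECOPY, CH_POSTCOPY };

bool postcopy_has_request(MigrationState *s);
Page *postcopy_pop_request(MigrationState *s);
void send_page(MigrationState *s, enum Channel ch, Page *p);
void save_precopy_state(MigrationState *s);    /* remember block + offset */
void restore_precopy_state(MigrationState *s); /* resume interrupted page */
void send_next_background_page(MigrationState *s, enum Channel ch);

/* One iteration of the (single) sender thread's loop. */
void ram_send_iteration(MigrationState *s)
{
    if (postcopy_has_request(s)) {
        /* Interrupt the precopy stream, possibly mid-huge-page. */
        save_precopy_state(s);
        while (postcopy_has_request(s)) {
            /* Urgent pages go out on the dedicated postcopy channel. */
            send_page(s, CH_POSTCOPY, postcopy_pop_request(s));
        }
        /* Switch back and re-continue the interrupted huge page. */
        restore_precopy_state(s);
    }
    send_next_background_page(s, CH_PRECOPY);
}

The key point is that the precopy send state (the current block and offset
within the huge page) survives the preemption, so the interrupted huge page
resumes exactly where it left off.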
> I think this patch could do with being split into two; the first one that
> deals with closing/opening channels; and the second that handles the data
> on the two channels and does the preemption.

Sounds good, I'll give it a shot on the split.

> Another thought is whether, if in the future we allow multifd + postcopy,
> the multifd code would change - I think it would end up closer to using
> multiple channels taking different pages on each one.

Right, so potentially the postcopy channels can themselves be multi-threaded
too.

We had a quick discussion on IRC; just to recap: I didn't reuse the multifd
infrastructure because IMO multifd is designed with the ideas below in mind:

  (1) Every multifd thread is equal,
  (2) Throughput oriented.

However, I found that postcopy needs something different when it's mixed
with multifd.  On one hand, we want some channels sending as much as we
can, where latency is not an issue (i.e. background pages).  On the other
hand, that model is not suitable for page requests, so we also want
channels that service page faults from the dst.

In short, there are two types of channels/threads, and we may want to treat
them differently.  The current model has only 1 postcopy channel and 1
precopy channel, but it should be easy to grow that to "N post + 1 pre"
based on this series.

So far all the send()s are still done in the migration thread, so there is
no new sender thread, only 1 more receiver thread.  If we want to grow the
postcopy channels from 1 to N, we may want to move the sending out as well,
just like what we do with multifd.  I'm not sure whether anything there can
be reused; I haven't explored that yet.  This series should already share a
common piece of code, though: the refactoring of things like the tmp huge
page on the dst node so that it can receive into multiple huge pages.

This also reminded me: instead of a new capability, should I simply expose
a parameter "postcopy-channels=N" on the CLI so that we're prepared for
multiple postcopy channels?  A rough sketch of what the dispatch could look
like follows.
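Again, everything below is hypothetical, not code from this series; it only
illustrates the "N post + 1 pre" split, with urgent pages round-robined
across the request-servicing channels:

#include <stddef.h>

/* Hypothetical stand-ins; not the real migration types. */
typedef struct Page Page;
typedef struct Channel Channel;

void channel_send_page(Channel *ch, Page *p);

typedef struct {
    Channel *precopy;    /* the 1 background channel */
    Channel **postcopy;  /* the N request-servicing channels */
    size_t n_postcopy;   /* e.g. what "postcopy-channels=N" would set */
    size_t next;         /* round-robin cursor */
} MigChannels;

/* Urgent (page-fault) pages: spread over the postcopy channels so one
 * slow request cannot delay the others. */
void dispatch_urgent(MigChannels *mc, Page *p)
{
    channel_send_page(mc->postcopy[mc->next], p);
    mc->next = (mc->next + 1) % mc->n_postcopy;
}

/* Background pages keep flowing on the dedicated precopy channel. */
void dispatch_background(MigChannels *mc, Page *p)
{
    channel_send_page(mc->precopy, p);
}

Round-robin is only the simplest possible policy here; the point is merely
that the request-servicing channels stay separate from the background
stream.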