On Fri, Oct 27, 2023 at 5:30 AM Fabiano Rosas <faro...@suse.de> wrote:
>
> Hao Xiang <hao.xi...@bytedance.com> writes:
>
> > Juan Quintela had a patchset enabling zero page checking in multifd
> > threads.
> >
> > https://lore.kernel.org/all/20220802063907.18882-13-quint...@redhat.com/
>
> Hmm, risky to base your series on code more than an year old. We should
> bother Juan so he sends an updated version for review.
>
> I have concerns about that series. First is why are we doing payload
> processing (i.e. zero page detection) in the multifd thread. And that
> affects your series directly, because AFAICS we're now doing more
> processing still.
>
I am pretty new to QEMU, so my take could be wrong; we can wait for Juan
to comment here. My understanding is that the migration main loop was
originally designed around a single sender thread (before the multifd
feature). Zero page checking is a fairly CPU-intensive operation, and
with multifd we scale up the number of sender threads precisely to
saturate network bandwidth, so keeping zero page checking in the main
loop does not scale with that design. In fact, we (Bytedance) have
merged Juan's change into our internal QEMU and have been using this
feature since last year; I was told it improved performance quite
significantly. Ideally, I would love to see zero page checking done in
a separate thread pool so it can be scaled independently of the sender
threads, but doing it in the sender thread is an inexpensive way to
scale. (A rough sketch of what I mean by detection in the sender thread
is at the end of this mail.)

> Second is more abstract but the multifd packet header is becoming just
> about small details about pages. We should probably take the time now
> and split that into a multifd header and a payload specific header. With
> some versioning stuck to them for migration compatibility.
>
> Now, I don't want to block this series due to my idealistic views on the
> code base, so I'll keep those aside while reviewing this, but I
> definitely think we should look at the big picture before we get too
> tangled up.
>

Totally agree. I actually have a local implementation that does exactly
that. The problem I see is that we currently use a fixed page count per
payload, while the payload size varies depending on how many zero pages
are actually detected. The sender/receiver pair runs a synchronous loop
per payload transfer, so on a long fat pipe the current behavior is not
optimal for network bandwidth utilization. We could instead make sure
we accumulate enough normal pages before sending a large packet, and
for zero pages accumulate them until we have a very large page count
and send them all at once. (Strawman sketches of the split header and
of the accumulation logic are also appended below.)
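
To make the first point concrete, here is roughly what I mean by doing
the detection in the sender thread. This is only a sketch: the
MultiFDSendBatch structure and all of its fields are invented for
illustration; the only real API used is buffer_is_zero() from
util/bufferiszero.c (declared in qemu/cutils.h).

    #include "qemu/osdep.h"
    #include "qemu/cutils.h"   /* buffer_is_zero() */

    /* Hypothetical batch of pages handed to one multifd sender thread. */
    typedef struct MultiFDSendBatch {
        uint8_t *host;          /* host base address of the RAMBlock */
        uint64_t *offset;       /* page offsets queued on this channel */
        uint32_t page_count;    /* number of queued pages */
        uint32_t page_size;     /* e.g. 4096 */
        uint64_t *normal;       /* offsets of non-zero pages to transmit */
        uint32_t normal_num;
        uint64_t *zero;         /* offsets of detected zero pages */
        uint32_t zero_num;
    } MultiFDSendBatch;

    /* Partition the queued pages into normal and zero pages inside the
     * sender thread, so the migration main loop never has to touch the
     * page contents. */
    static void multifd_detect_zero_pages(MultiFDSendBatch *b)
    {
        b->normal_num = 0;
        b->zero_num = 0;

        for (uint32_t i = 0; i < b->page_count; i++) {
            uint8_t *page = b->host + b->offset[i];

            if (buffer_is_zero(page, b->page_size)) {
                b->zero[b->zero_num++] = b->offset[i];
            } else {
                b->normal[b->normal_num++] = b->offset[i];
            }
        }
    }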
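
On the header split, here is a strawman of what a two-level layout
could look like. The names, field layout and constants are invented to
illustrate the idea, not a proposed wire format; QEMU_PACKED is the
existing packing macro.

    #include "qemu/osdep.h"

    /* Generic multifd header, identical for every payload type, with
     * its own version for migration compatibility. */
    typedef struct QEMU_PACKED {
        uint32_t magic;
        uint32_t version;          /* version of this common header */
        uint32_t payload_type;     /* e.g. a MULTIFD_PAYLOAD_RAM_PAGES value */
        uint32_t payload_version;  /* versioned per payload type */
        uint32_t flags;
        uint32_t payload_size;     /* number of bytes following this header */
    } MultiFDCommonHeader;

    /* RAM-pages-specific header: the "small details about pages" live
     * here, so other payload types don't have to carry them. */
    typedef struct QEMU_PACKED {
        uint64_t packet_num;
        uint32_t normal_pages;     /* non-zero pages whose data follows */
        uint32_t zero_pages;       /* zero pages sent as offsets only */
        char ramblock[256];
        /* followed by (normal_pages + zero_pages) 64-bit offsets and
         * then the data of the normal pages */
    } MultiFDRamPagesHeader;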
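
And the accumulation idea from my last paragraph, reusing the
hypothetical MultiFDSendBatch above. The thresholds are arbitrary
placeholders; the point is only that zero pages (offsets only) can pile
up much longer than normal pages (full data) before a flush is
worthwhile.

    /* Arbitrary illustrative thresholds. */
    #define NORMAL_PAGES_FLUSH_THRESHOLD   128   /* full pages of data */
    #define ZERO_PAGES_FLUSH_THRESHOLD    4096   /* 8-byte offsets only */

    /* Flush one big packet only once enough payload has accumulated,
     * instead of sending a fixed page count per packet. */
    static bool multifd_batch_should_flush(const MultiFDSendBatch *b)
    {
        return b->normal_num >= NORMAL_PAGES_FLUSH_THRESHOLD ||
               b->zero_num   >= ZERO_PAGES_FLUSH_THRESHOLD;
    }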