On 17/03/2017 20:36, Dr. David Alan Gilbert wrote:
> * Paolo Bonzini (pbonz...@redhat.com) wrote:
>> On 17/03/2017 14:02, Dr. David Alan Gilbert wrote:
>>>>>          case RAM_SAVE_FLAG_MULTIFD_PAGE:
>>>>>              fd_num = qemu_get_be16(f);
>>>>> -            if (fd_num != 0) {
>>>>> -                /* this is yet an unused variable, changed later */
>>>>> -                fd_num = fd_num;
>>>>> -            }
>>>>> +            multifd_recv_page(host, fd_num);
>>>>>              qemu_get_buffer(f, host, TARGET_PAGE_SIZE);
>>>>>              break;
>>>>
>>>> I still believe this design is a mistake.
>>>
>>> Is it a use of a separate FD carrying all of the flags/addresses that
>>> you object to?
>>
>> Yes, it introduces a serialization point unnecessarily, and I don't
>> believe the rationale that Juan offered was strong enough.
>>
>> This is certainly true on the receive side, but serialization is not
>> even necessary on the send side.
>
> Is there an easy way to benchmark it (without writing both) to figure
> out whether sending (word)(page) pairs on one fd is less efficient than
> using two fds, with the pages and the words separate?
I think it shouldn't be hard to write a version which keeps the central
distributor but puts the metadata in the auxiliary fds too.  But I think
what matters is not efficiency, but rather being more future-proof.
Besides the freedom to change the implementation later, Juan's current
code simply has no commands in the auxiliary file descriptors, which can
be very limiting (a sketch of what a self-describing packet could look
like is at the end of this message).

Paolo

>> Multiple threads can efficiently split the work among themselves and
>> visit the dirty bitmap without a central distributor.
>
> I mostly agree; I kind of fancy the idea of having one per NUMA node;
> but a central distributor might be a good idea anyway in the cases
> where the heavy writers all happen to be in the same area.
>
>> I need to study the code more to understand another issue.  Say you
>> have a page that is sent to two different threads in two different
>> iterations, like
>>
>>     thread 1
>>       iteration 1: pages 3, 7
>>     thread 2
>>       iteration 1: page 3
>>       iteration 2: page 7
>>
>> Does the code ensure that all threads wait at the end of an iteration?
>> Otherwise, thread 2 could process page 7 from iteration 2 before or
>> while thread 1 processes the same page from iteration 1.
>
> I think there's a sync at the end of each iteration in Juan's current
> code that stops that.
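
Right, a per-iteration barrier is enough to rule that out.  Purely as a
sketch of the two ideas above (this is not Juan's code; the helper
names, the static page split and the use of raw pthreads are all
invented for illustration), each sender thread could own a disjoint
shard of the dirty bitmap, so no central distributor is needed, and a
barrier would keep the iterations from overlapping:

/*
 * Sketch only, NOT the code under review.  send_page() and
 * migration_done() stand in for the real send path and convergence
 * logic.
 */
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_THREADS     4
#define BITMAP_PAGES   (1 << 20)                  /* pages tracked */
#define BITS_PER_LONG  (8 * sizeof(unsigned long))

static unsigned long dirty_bitmap[BITMAP_PAGES / BITS_PER_LONG];
static pthread_barrier_t iter_barrier;            /* init'd for NR_THREADS */

extern void send_page(int thread_id, size_t page);
extern bool migration_done(void);

static bool test_and_clear_dirty(size_t page)
{
    unsigned long *word = &dirty_bitmap[page / BITS_PER_LONG];
    unsigned long mask = 1UL << (page % BITS_PER_LONG);

    /* No atomics needed here: the shard size below is a multiple of
     * BITS_PER_LONG, so no bitmap word is shared between threads. */
    if (*word & mask) {
        *word &= ~mask;
        return true;
    }
    return false;
}

static void *sender_thread(void *opaque)
{
    int id = (int)(intptr_t)opaque;
    size_t shard = BITMAP_PAGES / NR_THREADS;

    /* Termination is glossed over; a real implementation would make
     * the "are we done?" decision collectively, not per thread. */
    while (!migration_done()) {
        /* This thread owns pages [id*shard, (id+1)*shard): the split
         * is static, so no central distributor is on the hot path. */
        for (size_t page = id * shard; page < (id + 1) * shard; page++) {
            if (test_and_clear_dirty(page)) {
                send_page(id, page);
            }
        }
        /* Nobody starts iteration N+1 until everyone has finished
         * iteration N, so two versions of the same page can never be
         * in flight out of order. */
        pthread_barrier_wait(&iter_barrier);
    }
    return NULL;
}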
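
And on the command point: "commands in the auxiliary file descriptors"
needn't be more than a small self-describing header in front of every
packet.  Again a made-up layout, not the wire format of this series:

#include <stdint.h>

/* Hypothetical command space; the point is only that a receiver can
 * reject commands it doesn't know instead of misparsing the stream. */
enum multifd_cmd {
    MULTIFD_CMD_PAGES = 0,      /* payload: guest addresses + page data */
    MULTIFD_CMD_SYNC  = 1,      /* end-of-iteration marker */
    MULTIFD_CMD_QUIT  = 2,      /* orderly channel teardown */
};

struct multifd_packet_header {
    uint32_t magic;             /* catches framing mixups early */
    uint32_t version;           /* bumped whenever the layout changes */
    uint32_t cmd;               /* one of enum multifd_cmd */
    uint32_t num_pages;         /* meaningful for MULTIFD_CMD_PAGES */
    /* followed by num_pages big-endian 64-bit guest addresses, then
     * num_pages * TARGET_PAGE_SIZE bytes of page data */
};

A MULTIFD_CMD_SYNC on every channel would also be a natural way to
implement the end-of-iteration handshake from the sketch above, without
bouncing through the main migration stream.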