Peter Xu <pet...@redhat.com> writes:

> On Tue, Aug 27, 2024 at 04:17:59PM -0300, Fabiano Rosas wrote:
>> Peter Xu <pet...@redhat.com> writes:
>>
>> > On Tue, Aug 27, 2024 at 03:54:51PM -0300, Fabiano Rosas wrote:
>> >> Peter Xu <pet...@redhat.com> writes:
>> >>
>> >> > On Tue, Aug 27, 2024 at 02:46:06PM -0300, Fabiano Rosas wrote:
>> >> >> Add documentation clarifying the usage of the multifd methods. The
>> >> >> general idea is that the client code calls into multifd to trigger
>> >> >> send/recv of data and multifd then calls these hooks back from the
>> >> >> worker threads at opportune moments so the client can process a
>> >> >> portion of the data.
>> >> >>
>> >> >> Suggested-by: Peter Xu <pet...@redhat.com>
>> >> >> Signed-off-by: Fabiano Rosas <faro...@suse.de>
>> >> >> ---
>> >> >> Note that the doc is not symmetrical among send/recv because the recv
>> >> >> side is still wonky. It doesn't give the packet to the hooks, which
>> >> >> forces the p->normal, p->zero, etc. to be processed at the top level
>> >> >> of the threads, where no client-specific information should be.
>> >> >> ---
>> >> >>  migration/multifd.h | 76 +++++++++++++++++++++++++++++++++++++++++----
>> >> >>  1 file changed, 70 insertions(+), 6 deletions(-)
>> >> >>
>> >> >> diff --git a/migration/multifd.h b/migration/multifd.h
>> >> >> index 13e7a88c01..ebb17bdbcf 100644
>> >> >> --- a/migration/multifd.h
>> >> >> +++ b/migration/multifd.h
>> >> >> @@ -229,17 +229,81 @@ typedef struct {
>> >> >>  } MultiFDRecvParams;
>> >> >>
>> >> >>  typedef struct {
>> >> >> -    /* Setup for sending side */
>> >> >> +    /*
>> >> >> +     * The send_setup, send_cleanup, send_prepare are only called on
>> >> >> +     * the QEMU instance at the migration source.
>> >> >> +     */
>> >> >> +
>> >> >> +    /*
>> >> >> +     * Setup for sending side. Called once per channel during channel
>> >> >> +     * setup phase.
>> >> >> +     *
>> >> >> +     * Must allocate p->iov. If packets are in use (default), one
>> >> >
>> >> > Pure thoughts: wonder whether we can assert(p->iov) that after the hook
>> >> > returns in code to match this line.
>> >>
>> >> Not worth the extra instructions in my opinion. It would crash
>> >> immediately once the thread touches p->iov anyway.
>> >
>> > It might still be good IMHO to have that assert(), not only to abort
>> > earlier, but also as a code-styled comment. Your call when resend.
>> >
>> > PS: feel free to queue existing patches into your own tree without
>> > resending the whole series!
>> >
>> >>
>> >> >
>> >> >> +     * extra iovec must be allocated for the packet header. Any memory
>> >> >> +     * allocated in this hook must be released at send_cleanup.
>> >> >> +     *
>> >> >> +     * p->write_flags may be used for passing flags to the QIOChannel.
>> >> >> +     *
>> >> >> +     * p->compression_data may be used by compression methods to store
>> >> >> +     * compression data.
>> >> >> +     */
>> >> >>      int (*send_setup)(MultiFDSendParams *p, Error **errp);
>> >> >> -    /* Cleanup for sending side */
>> >> >> +
>> >> >> +    /*
>> >> >> +     * Cleanup for sending side. Called once per channel during
>> >> >> +     * channel cleanup phase. May be empty.
>> >> >
>> >> > Hmm, if we require p->iov allocation per-ops, then they must free it
>> >> > here?
>> >> > I wonder whether we leaked it in most compressors.
>> >>
>> >> Sorry, this one shouldn't have that text.
>> >
>> > I still want to double check with you: we leaked iov[] in most compressors
>> > here, or did I overlook something?
>>
>> They have their own send_cleanup function where p->iov is freed.
>
> Oh, so I guess I just accidentally stumbled upon
> multifd_uadk_send_cleanup() when looking..

Yeah, this is a bit worrying. The reason this has not shown on valgrind
or the asan that Peter ran recently is that uadk, qpl and soon qat are
never enabled in a regular build. I have myself introduced compilation
errors in those files that I only caught by accident at a later point
(before sending to the ml).

>
> I thought I looked a few more but now when I check most of them are indeed
> there but looks like uadk is missing that.
>
> I think it might still be a good idea to assert(iov==NULL) after the
> cleanup..

Should we maybe just free p->iov at the top level then?
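
To make the two ideas above concrete (asserting p->iov after the setup hook, and freeing it at the top level), here is a rough standalone sketch. It is not the QEMU code: SendParams, SendMethods, channel_setup() and channel_cleanup() are hypothetical stand-ins for MultiFDSendParams, MultiFDMethods and the multifd core, and plain calloc()/free() are used where QEMU would use its own allocators.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/uio.h>

/* Hypothetical stand-in for MultiFDSendParams: only the field discussed. */
typedef struct {
    struct iovec *iov;
} SendParams;

/* Hypothetical stand-in for the send-side hooks of MultiFDMethods. */
typedef struct {
    int (*send_setup)(SendParams *p);
    void (*send_cleanup)(SendParams *p);
} SendMethods;

/* A method that allocates p->iov in setup... */
static int leaky_setup(SendParams *p)
{
    p->iov = calloc(4, sizeof(struct iovec));
    return p->iov ? 0 : -1;
}

/* ...but forgets to release it in cleanup (the pattern spotted in uadk). */
static void leaky_cleanup(SendParams *p)
{
    (void)p;
}

/* Top level: the assert doubles as a code-styled comment on the contract. */
static int channel_setup(const SendMethods *ops, SendParams *p)
{
    int ret = ops->send_setup(p);

    if (ret == 0) {
        assert(p->iov);
    }
    return ret;
}

/*
 * Top level: free p->iov here so no method can leak it. This assumes the
 * methods either leave p->iov alone in send_cleanup or set it to NULL
 * after freeing it themselves; otherwise this would be a double free.
 */
static void channel_cleanup(const SendMethods *ops, SendParams *p)
{
    ops->send_cleanup(p);
    free(p->iov);
    p->iov = NULL;
}
```

With the free done at the top level, an assert(iov == NULL) after cleanup holds trivially, so the remaining value is in the setup-side assert.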