Peter Xu <pet...@redhat.com> writes:

> On Tue, Aug 27, 2024 at 04:17:59PM -0300, Fabiano Rosas wrote:
>> Peter Xu <pet...@redhat.com> writes:
>> 
>> > On Tue, Aug 27, 2024 at 03:54:51PM -0300, Fabiano Rosas wrote:
>> >> Peter Xu <pet...@redhat.com> writes:
>> >> 
>> >> > On Tue, Aug 27, 2024 at 02:46:06PM -0300, Fabiano Rosas wrote:
>> >> >> Add documentation clarifying the usage of the multifd methods. The
>> >> >> general idea is that the client code calls into multifd to trigger
>> >> >> send/recv of data and multifd then calls these hooks back from the
>> >> >> worker threads at opportune moments so the client can process a
>> >> >> portion of the data.
>> >> >> 
>> >> >> Suggested-by: Peter Xu <pet...@redhat.com>
>> >> >> Signed-off-by: Fabiano Rosas <faro...@suse.de>
>> >> >> ---
>> >> >> Note that the doc is not symmetrical among send/recv because the recv
>> >> >> side is still wonky. It doesn't give the packet to the hooks, which
>> >> >> forces the p->normal, p->zero, etc. to be processed at the top level
>> >> >> of the threads, where no client-specific information should be.
>> >> >> ---
>> >> >>  migration/multifd.h | 76 +++++++++++++++++++++++++++++++++++++++++----
>> >> >>  1 file changed, 70 insertions(+), 6 deletions(-)
>> >> >> 
>> >> >> diff --git a/migration/multifd.h b/migration/multifd.h
>> >> >> index 13e7a88c01..ebb17bdbcf 100644
>> >> >> --- a/migration/multifd.h
>> >> >> +++ b/migration/multifd.h
>> >> >> @@ -229,17 +229,81 @@ typedef struct {
>> >> >>  } MultiFDRecvParams;
>> >> >>  
>> >> >>  typedef struct {
>> >> >> -    /* Setup for sending side */
>> >> >> +    /*
>> >> >> +     * The send_setup, send_cleanup, send_prepare are only called on
>> >> >> +     * the QEMU instance at the migration source.
>> >> >> +     */
>> >> >> +
>> >> >> +    /*
>> >> >> +     * Setup for sending side. Called once per channel during channel
>> >> >> +     * setup phase.
>> >> >> +     *
>> >> >> +     * Must allocate p->iov. If packets are in use (default), one
>> >> >
>> >> > Pure thoughts: wonder whether we can assert(p->iov) that after the hook
>> >> > returns in code to match this line.
>> >> 
>> >> Not worth the extra instructions in my opinion. It would crash
>> >> immediately once the thread touches p->iov anyway.
>> >
>> > It might still be good IMHO to have that assert(), not only to abort
>> > earlier, but also as a code-styled comment.  Your call when resend.
>> >
>> > PS: feel free to queue existing patches into your own tree without
>> > resending the whole series!
>> >
>> >> 
>> >> >
>> >> >> +     * extra iovec must be allocated for the packet header. Any memory
>> >> >> +     * allocated in this hook must be released at send_cleanup.
>> >> >> +     *
>> >> >> +     * p->write_flags may be used for passing flags to the QIOChannel.
>> >> >> +     *
>> >> >> +     * p->compression_data may be used by compression methods to store
>> >> >> +     * compression data.
>> >> >> +     */
>> >> >>      int (*send_setup)(MultiFDSendParams *p, Error **errp);
>> >> >> -    /* Cleanup for sending side */
>> >> >> +
>> >> >> +    /*
>> >> >> +     * Cleanup for sending side. Called once per channel during
>> >> >> +     * channel cleanup phase. May be empty.
>> >> >
>> >> > Hmm, if we require p->iov allocation per-ops, then they must free it here?
>> >> > I wonder whether we leaked it in most compressors.
>> >> 
>> >> Sorry, this one shouldn't have that text.
>> >
>> > I still want to double check with you: we leaked iov[] in most compressors
>> > here, or did I overlook something?
>> 
>> They have their own send_cleanup function where p->iov is freed.
>
> Oh, so I guess I just accidentally stumbled upon
> multifd_uadk_send_cleanup() when looking..

Yeah, this is a bit worrying. The reason this has not shown up in
valgrind or in the asan run that Peter did recently is that uadk, qpl
and soon qat are never enabled in a regular build. I have myself
introduced compilation errors in those files that I only caught by
accident at a later point (before sending to the ml).

>
> I thought I looked a few more but now when I check most of them are indeed
> there but looks like uadk is missing that.
>
> I think it might still be a good idea to assert(iov==NULL) after the
> cleanup..

Should we maybe just free p->iov at the top level then?
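
To make the idea concrete, here is a minimal standalone sketch of both
suggestions: the assert(p->iov) after the setup hook, and freeing p->iov
at the top level after the cleanup hook. It is not the actual multifd
code. SendParams, SendMethods, channel_setup(), channel_cleanup() and
the leaky_* hooks are hypothetical stand-ins, and it uses plain
calloc()/free() instead of the glib allocators. It also assumes that a
hook either leaves p->iov alone or frees it and sets it to NULL.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/uio.h>

/* Hypothetical, cut-down stand-in for MultiFDSendParams. */
typedef struct {
    struct iovec *iov;
} SendParams;

/* Hypothetical stand-in for the send side of MultiFDMethods. */
typedef struct {
    int (*send_setup)(SendParams *p);
    void (*send_cleanup)(SendParams *p);
} SendMethods;

/* A method whose setup allocates p->iov as the documentation requires... */
static int leaky_send_setup(SendParams *p)
{
    /* Two entries: one for data, one extra for the packet header. */
    p->iov = calloc(2, sizeof(*p->iov));
    return p->iov ? 0 : -1;
}

/* ...but whose cleanup forgets to release it. */
static void leaky_send_cleanup(SendParams *p)
{
    (void)p;
}

/* Top-level setup: enforce the "must allocate p->iov" contract. */
static int channel_setup(const SendMethods *ops, SendParams *p)
{
    int ret = ops->send_setup(p);

    if (ret == 0) {
        assert(p->iov);
    }
    return ret;
}

/*
 * Top-level cleanup: free p->iov here so that a method which forgets
 * to do it cannot leak. Assumes a hook that does free p->iov also
 * sets it to NULL, otherwise this would be a double free.
 */
static void channel_cleanup(const SendMethods *ops, SendParams *p)
{
    ops->send_cleanup(p);
    free(p->iov);
    p->iov = NULL;
}
```

With this shape the per-method send_cleanup hooks would only need to
release what is private to the method, and the assert(iov == NULL) after
cleanup holds by construction.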
