On Tue, Feb 04, 2025 at 08:32:15PM +0100, Maciej S. Szmigiero wrote:
> On 4.02.2025 18:54, Peter Xu wrote:
> > On Thu, Jan 30, 2025 at 11:08:40AM +0100, Maciej S. Szmigiero wrote:
> > > +static int multifd_device_state_save_thread(void *opaque)
> > > +{
> > > +    struct MultiFDDSSaveThreadData *data = opaque;
> > > +    int ret;
> > > +
> > > +    ret = data->hdlr(data->idstr, data->instance_id, &send_threads_abort,
> > > +                     data->handler_opaque);
> >
> > I thought we discussed somewhere and the plan was we could use Error** here
> > to report errors.  Would that still make sense, or maybe I lost some
> > context?
>
> That was about *load* threads, here these are *save* threads.
Ah OK.

> Save handlers do not return an Error value; neither save_live_iterate,
> save_live_complete_precopy, nor save_state does so.

Let's try to make the new APIs work with Error* if possible.

> > Meanwhile, I still feel uneasy on having these globals (send_threads_abort,
> > send_threads_ret).  Can we make MultiFDDSSaveThreadData the only interface
> > between migration and the threads impl?  So I wonder if it can be:
> >
> >     ret = data->hdlr(data);
> >
> > With extended struct like this (I added thread_error and thread_quit):
> >
> > struct MultiFDDSSaveThreadData {
> >     SaveLiveCompletePrecopyThreadHandler hdlr;
> >     char *idstr;
> >     uint32_t instance_id;
> >     void *handler_opaque;
> >     /*
> >      * Should be NULL when struct passed over to thread, the thread should
> >      * set this if the handler would return false.  It must be kept NULL if
> >      * the handler returned true / success.
> >      */
> >     Error *thread_error;
>
> As I mentioned above, these handlers do not generally return an Error type,
> so this would need to be an *int;
>
> >     /*
> >      * Migration core would set this when it wants to notify thread to
> >      * quit, for example, when error occurred in other threads, or
> >      * migration is cancelled by the user.
> >      */
> >     bool thread_quit;
>
> ^ I guess that was supposed to be a pointer too (*thread_quit).

It's my intention to make this a bool, so that everything is managed
per-thread.  It's actually what we do with multifd; these are a bunch of
extra threads, to differentiate them from the "IO threads" / "multifd
threads".

> > };
> >
> > Then if any multifd_device_state_save_thread() failed, for example, it
> > should notify all threads to quit by setting thread_quit, instead of
> > relying on yet another global variable to show migration needs to quit.
>
> multifd_abort_device_state_save_threads() needs to access
> send_threads_abort too.

This may need to become something like:

  QLIST_FOREACH() {
      MultiFDDSSaveThreadData *data = ...;
      data->thread_quit = true;
  }

We may want to double check that qmp 'migrate_cancel' works when save
threads are running, but this can also be done later.

> And multifd_join_device_state_save_threads() needs to access
> send_threads_ret.

Then this one becomes:

  thread_pool_wait(send_threads);

  QLIST_FOREACH() {
      MultiFDDSSaveThreadData *data = ...;
      if (data->thread_error) {
          return false;
      }
  }

  return true;

> These variables ultimately will have to be stored somewhere since
> there can be multiple save threads and so multiple instances of
> MultiFDDSSaveThreadData.
>
> So these need to be stored somewhere where
> multifd_spawn_device_state_save_thread() can reach them to assign
> their addresses to MultiFDDSSaveThreadData members.

Then multifd_spawn_device_state_save_thread() will need to manage the
qlist, making sure the migration core remembers what jobs it submitted.
Now that I think about it, it sounds good to have that bookkeeping,
instead of throwing the job at the thread pool and forgetting about it.

> However, at that point multifd_device_state_save_thread() can
> access them too so it does not need to have them passed via
> MultiFDDSSaveThreadData.
>
> However, nothing prevents putting the send_threads* variables
> into a global struct (with internal linkage - "static", just as
> these separate ones are) if you like such a construct more.

This should be better than the current global vars indeed, but it is less
favoured if the per-thread way above could work.

Thanks,

-- 
Peter Xu
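
[A minimal sketch of the per-thread bookkeeping discussed above, in
QEMU-style C.  Only the struct fields, the two loop bodies and the
thread_pool_wait(send_threads) call follow the sketches in the mail; the
list head name (send_thread_jobs), the QLIST_ENTRY field (next) and the
function bodies are assumptions for illustration.]

  /* Assumed per-thread job bookkeeping; names are illustrative only. */
  typedef struct MultiFDDSSaveThreadData {
      SaveLiveCompletePrecopyThreadHandler hdlr;
      char *idstr;
      uint32_t instance_id;
      void *handler_opaque;
      /* Set by the save thread on failure; must stay NULL on success. */
      Error *thread_error;
      /* Set by the migration core to ask this thread to quit early. */
      bool thread_quit;
      /* Linkage so the migration core remembers the jobs it submitted. */
      QLIST_ENTRY(MultiFDDSSaveThreadData) next;
  } MultiFDDSSaveThreadData;

  static QLIST_HEAD(, MultiFDDSSaveThreadData) send_thread_jobs =
      QLIST_HEAD_INITIALIZER(send_thread_jobs);

  /* Ask every submitted save thread to quit, e.g. on error or cancel. */
  static void multifd_abort_device_state_save_threads(void)
  {
      MultiFDDSSaveThreadData *data;

      QLIST_FOREACH(data, &send_thread_jobs, next) {
          data->thread_quit = true;
      }
  }

  /* Wait for all save threads, then report whether any of them failed. */
  static bool multifd_join_device_state_save_threads(void)
  {
      MultiFDDSSaveThreadData *data;

      thread_pool_wait(send_threads);

      QLIST_FOREACH(data, &send_thread_jobs, next) {
          if (data->thread_error) {
              return false;
          }
      }

      return true;
  }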