On Tue, Oct 08, 2024 at 11:20:03AM -0300, Fabiano Rosas wrote:
> Peter Xu <pet...@redhat.com> writes:
>
> > On Mon, Oct 07, 2024 at 03:44:51PM +0000, Shivam Kumar wrote:
> >> If the client calls the QMP command to reset the migration
> >> capabilities after the migration status is set to failed or cancelled
> >
> > Is cancelled ok?
> >
> > Asked because I think migrate_fd_cleanup() should still be in CANCELLING
> > stage there, so no one can disable multifd capability before that, it
> > should fail the QMP command.
> >
> > But FAILED indeed looks problematic.
> >
> > IIUC it's not only to multifd alone - is it a race condition that
> > migrate_fd_cleanup() can be invoked without migration_is_running() keeps
> > being true?  Then I wonder what happens if a concurrent QMP "migrate"
> > happens together with migrate_fd_cleanup(), even with multifd always off.
> >
> > Do we perhaps need to cleanup everything before the state changes to
> > FAILED?
> >
>
> Should we make CANCELLED the only terminal state aside from COMPLETED?
> So migrate_fd_cleanup would set CANCELLED whenever it sees either
> CANCELLING or FAILED.

I think that may be a major ABI change that can be risky, as we normally
see CANCELLED as the user's choice.  If we really want an ABI change, we
could also introduce FAILING, but I wonder whether what I replied in the
other email could work without any ABI change while still closing the gap
on this race.

-- 
Peter Xu