await_return_path_close_on_source() explicitly shuts down the socket when
migration encounters an error. However, this is racy: the rp_thread could
reset from_dst_file right after we check it but before we pass it to
qemu_file_shutdown().

Fix it by calling shutdown() on the source file (to_dst_file) instead.
Since the two must be a pair of QEMU files, shutting down either of them
has the same effect. While at it, drop the check on from_dst_file
entirely, which makes the behavior more predictable.

Reported-by: Dr. David Alan Gilbert <dgilb...@redhat.com>
Signed-off-by: Peter Xu <pet...@redhat.com>
---
 migration/migration.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 21b94f75a3..4f48cde796 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2882,12 +2882,15 @@ static int await_return_path_close_on_source(MigrationState *ms)
      * rp_thread will exit, however if there's an error we need to cause
      * it to exit.
      */
-    if (qemu_file_get_error(ms->to_dst_file) && ms->rp_state.from_dst_file) {
+    if (qemu_file_get_error(ms->to_dst_file)) {
         /*
          * shutdown(2), if we have it, will cause it to unblock if it's stuck
-         * waiting for the destination.
+         * waiting for the destination. We do shutdown on to_dst_file should
+         * also shutdown the from_dst_file as they're in a pair. We explicitly
+         * don't operate on from_dst_file because it's potentially racy
+         * (rp_thread could have reset it in parallel).
          */
-        qemu_file_shutdown(ms->rp_state.from_dst_file);
+        qemu_file_shutdown(ms->to_dst_file);
         mark_source_rp_bad(ms);
     }
     trace_await_return_path_close_on_source_joining();
-- 
2.31.1
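
For illustration only, the claim that shutting down either file of the pair
has the same effect can be sketched outside QEMU. The snippet below is a
hypothetical stand-alone demo, not QEMU code: it assumes both files wrap the
same underlying socket, and models that with two descriptors obtained via
dup(). The function name and the to_dst/from_dst variables are invented for
this sketch.

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical demo (assumption: both "files" share one socket).
 * Returns what read() on the return-path handle reports after
 * shutdown() was called on the sending handle; 0 means EOF.
 */
static int read_after_shutdown_on_peer_handle(void)
{
    int sv[2];
    char buf[1];

    /* sv[0] is the "source" end, sv[1] plays the destination. */
    int rc = socketpair(AF_UNIX, SOCK_STREAM, 0, sv);
    assert(rc == 0);

    int to_dst = sv[0];         /* handle used for sending */
    int from_dst = dup(sv[0]);  /* second handle on the same socket */
    assert(from_dst >= 0);

    /* Shut down through the sending handle only. */
    rc = shutdown(to_dst, SHUT_RDWR);
    assert(rc == 0);

    /* Nothing is queued, so without the shutdown this read would block. */
    ssize_t n = read(from_dst, buf, sizeof(buf));

    close(from_dst);
    close(sv[0]);
    close(sv[1]);
    return (int)n;
}
```

Under these assumptions the read on from_dst reports end-of-file instead of
blocking, even though shutdown() was only issued through to_dst. That is the
property the patch relies on when it stops touching from_dst_file.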