14.02.2024 16:27, Fabiano Rosas :
Michael Tokarev <m...@tls.msk.ru> writes:
..>>> This change, which is suggested for -stable, while simple by its own,
seems
to depend on the previous changes in this series, which are not for -stable.
In particular, whole "Finally recycle all the threads" loop in
multifd_send_terminate_threads()
(to which the join is being added by this change) is moved from elsewhere by
12808db3b8 "migration/multifd: Cleanup multifd_save_cleanup()" (patch 24 in
this same series).
We can probably add the missing join right into the previous location of this
loop (before 12808db3b8). I did this in the attached variant for 8.2, is
this correct?
I forgot to attach the patch. It just moves the join from
multifd_send_terminate_threads()
back to multifd_save_cleanup. Attached now.
It should work. This was originally developed without the rest of the
changes on this PR.
And this does not pass even the basic tests, so it's not that simple :)
Do you have a log of what failed?
Re-running it again... I haven't even tried to push it somewhere for CI to run,
I run local `ninja test', which painted some migration tests in red. Here:
202/844 qemu:qtest+qtest-aarch64 / qtest-aarch64/migration-test ERROR
70.26s killed by signal 6 SIGABRT
330/844 qemu:qtest+qtest-i386 / qtest-i386/migration-test ERROR
85.33s killed by signal 6 SIGABRT
454/844 qemu:qtest+qtest-x86_64 / qtest-x86_64/migration-test ERROR
101.02s killed by signal 6 SIGABRT
Unfortunately I don't see anything interesting in the log:
# starting QEMU: exec ./qemu-system-x86_64 -qtest unix:/tmp/qtest-463614.sock -qtest-log /dev/null -chardev socket,path=/tmp/qtest-463614.qmp,id=char0
-mon chardev=char0,mode=control -display none -audio none -accel kvm -accel tcg -machine pc-q35-8.2, -name target,debug-threads=on -m 150M -serial
file:/tmp/migration-test-SPJTI2/dest_serial -incoming defer -drive if=none,id=d0,file=/tmp/migration-test-SPJTI2/bootsect,format=raw -device
ide-hd,drive=d0,secs=1,cyls=1,heads=1 2>/dev/null -accel qtest
----------------------------------- stderr -----------------------------------
../../build/qemu/8.2/tests/qtest/libqtest.c:204: kill_qemu() detected QEMU
death from signal 6 (Aborted)
(test program exited with status code -6)
Without the attached patch it works.
Anyway, I could prepare a backport on top of 8.2 for you.
Well, that would definitely be helpful, if you think it's worth to
provide backports for 8.2 for these. As my attempt apparently isn't
very successful :)
The following patch (27/34) is more questionable than this one.
Thank you!
/mjt
From 6d4aae84a06fc7e26dcb1d986a4de3c6d65eb064 Mon Sep 17 00:00:00 2001
From: Fabiano Rosas <faro...@suse.de>
Date: Tue, 6 Feb 2024 18:51:13 -0300
Subject: [PATCH] migration/multifd: Join the TLS thread
We're currently leaking the resources of the TLS thread by not joining
it and also overwriting the p->thread pointer altogether.
Fixes: a1af605bd5 ("migration/multifd: fix hangup with TLS-Multifd due to blocking handshake")
Cc: qemu-stable <qemu-sta...@nongnu.org>
Reviewed-by: Peter Xu <pet...@redhat.com>
Signed-off-by: Fabiano Rosas <faro...@suse.de>
Link: https://lore.kernel.org/r/20240206215118.6171-2-faro...@suse.de
Signed-off-by: Peter Xu <pet...@redhat.com>
(cherry picked from commit e1921f10d9afe651f4887284e85f6789b37e67d3)
Signed-off-by: Michael Tokarev <m...@tls.msk.ru>
(Mjt: fixup for before v8.2.0-1142-g12808db3b8
"migration/multifd: Cleanup multifd_save_cleanup()")
---
migration/multifd.c | 8 +++++++-
migration/multifd.h | 2 ++
2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/migration/multifd.c b/migration/multifd.c
index 409460684f..3183aa9e82 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -525,6 +525,10 @@ void multifd_save_cleanup(void)
for (i = 0; i < migrate_multifd_channels(); i++) {
MultiFDSendParams *p = &multifd_send_state->params[i];
+ if (p->tls_thread_created) {
+ qemu_thread_join(&p->tls_thread);
+ }
+
if (p->running) {
qemu_thread_join(&p->thread);
}
@@ -826,7 +830,9 @@ static bool multifd_tls_channel_connect(MultiFDSendParams *p,
trace_multifd_tls_outgoing_handshake_start(ioc, tioc, hostname);
qio_channel_set_name(QIO_CHANNEL(tioc), "multifd-tls-outgoing");
p->c = QIO_CHANNEL(tioc);
- qemu_thread_create(&p->thread, "multifd-tls-handshake-worker",
+
+ p->tls_thread_created = true;
+ qemu_thread_create(&p->tls_thread, "multifd-tls-handshake-worker",
multifd_tls_handshake_thread, p,
QEMU_THREAD_JOINABLE);
return true;
diff --git a/migration/multifd.h b/migration/multifd.h
index a835643b48..8fbffbaa5a 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -75,6 +75,8 @@ typedef struct {
char *name;
/* channel thread id */
QemuThread thread;
+ QemuThread tls_thread;
+ bool tls_thread_created;
/* communication channel */
QIOChannel *c;
/* is the yank function registered */
--
2.39.2