Based-on: 20240202102857.110210-1-pet...@redhat.com
[PATCH v2 00/23] migration/multifd: Refactor ->send_prepare() and cleanups
https://lore.kernel.org/r/20240202102857.110210-1-pet...@redhat.com

Hi,

For v3 I fixed the refcounting issue spotted by Avihai. The situation
there is a bit clunky due to historical reasons. The gist is that we
have an assumption that channel creation never fails after p->c has
been set, so when 'p->c == NULL' we have to unref and when
'p->c != NULL' the cleanup code will do the unref.

CI run: https://gitlab.com/farosas/qemu/-/pipelines/1166889341

v2:
https://lore.kernel.org/r/20240205194929.28963-1-faro...@suse.de

In this v2 I made sure NO channel is created after the semaphores are
posted. Feel free to call me out if that's not the case.

Not much has changed, except that now both TLS and non-TLS go through
the same code, so there's a centralized place to do error handling and
to release the semaphore.

CI run: https://gitlab.com/farosas/qemu/-/pipelines/1165206107
based on Peter's code: https://gitlab.com/farosas/qemu/-/pipelines/1165303276

v1:
https://lore.kernel.org/r/20240202191128.1901-1-faro...@suse.de

This contains 2 patches from my previous series addressing the
p->running misuse and the TLS thread leak, and 3 new patches to fix the
cleanup-while-creating-threads race.

For the p->running misuse I'm keeping the idea from the other series:
remove p->running and use a narrower p->thread_created flag. This flag
is used only to inform whether the thread has been created, so we know
whether we can join it.

For the cleanup race I have moved some code around and added a
semaphore to make multifd_save_setup() only return once all channel
creation tasks have started. The idea is that after
multifd_save_setup() returns, no new creations are in flight and the
p->thread_created flags will never change again, so they're enough to
cause the cleanup code to wait for the threads to join.

CI run: https://gitlab.com/farosas/qemu/-/pipelines/1162798843

@Peter: I can rebase this on top of your series once we decide about
it.

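To make the synchronization point easier to picture, here is a minimal
standalone sketch of the idea. It is not the code from the patches: it
assumes plain POSIX threads and semaphores on Linux instead of QEMU's
own helpers, and every name in it (Channel, creation_task,
sketch_send_setup, sketch_send_cleanup, NUM_CHANNELS) is made up for
illustration. A detached "creation task" thread stands in for the
asynchronous channel creation.

```c
/* Sketch only: hypothetical names, POSIX threads instead of QEMU's helpers. */
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdbool.h>

#define NUM_CHANNELS 4

typedef struct {
    pthread_t thread;
    bool thread_created;  /* set only by the creation task */
    bool ran;             /* set only by the channel thread */
} Channel;

static Channel channels[NUM_CHANNELS];
static sem_t channels_created;   /* posted once per finished creation task */
static int creations_in_flight;  /* touched only by the setup/cleanup caller */

/* The per-channel thread; here it only records that it ran. */
static void *channel_thread(void *opaque)
{
    Channel *p = opaque;

    p->ran = true;
    return NULL;
}

/* Asynchronous creation task; stands in for the channel creation callback. */
static void *creation_task(void *opaque)
{
    Channel *p = opaque;

    if (pthread_create(&p->thread, NULL, channel_thread, p) == 0) {
        p->thread_created = true;
    }
    /* Post on success and on failure, so setup never waits forever. */
    sem_post(&channels_created);
    return NULL;
}

/* Returns 0 once every started creation task has finished, -1 on error. */
int sketch_send_setup(void)
{
    int ret = 0;

    if (sem_init(&channels_created, 0, 0) != 0) {
        return -1;
    }
    for (int i = 0; i < NUM_CHANNELS; i++) {
        pthread_t creator;

        if (pthread_create(&creator, NULL, creation_task, &channels[i]) != 0) {
            ret = -1;
            continue;
        }
        pthread_detach(creator);
        creations_in_flight++;
    }
    /* Synchronization point: one wait per task that was actually started. */
    while (creations_in_flight > 0) {
        if (sem_wait(&channels_created) == 0) {  /* retried if interrupted */
            creations_in_flight--;
        }
    }
    return ret;
}

/* Joins every thread that was created; returns how many were joined. */
int sketch_send_cleanup(void)
{
    int joined = 0;

    /* Holds because setup only returns after all tasks have posted. */
    assert(creations_in_flight == 0);
    for (int i = 0; i < NUM_CHANNELS; i++) {
        if (channels[i].thread_created) {
            pthread_join(channels[i].thread, NULL);
            channels[i].thread_created = false;
            joined++;
        }
    }
    return joined;
}
```

Since the setup function only returns after the last post, the cleanup
function can read thread_created without extra locking: nothing can
set it any more. Calling sketch_send_setup() followed by
sketch_send_cleanup() joins every thread that was created, and a
second cleanup call finds nothing left to join.
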
Fabiano Rosas (6):
  migration/multifd: Join the TLS thread
  migration/multifd: Remove p->running
  migration/multifd: Move multifd_send_setup error handling in to the
    function
  migration/multifd: Move multifd_send_setup into migration thread
  migration/multifd: Unify multifd and TLS connection paths
  migration/multifd: Add a synchronization point for channel creation

 migration/migration.c |  14 ++--
 migration/multifd.c   | 168 +++++++++++++++++++++++++-----------------
 migration/multifd.h   |  11 ++-
 3 files changed, 109 insertions(+), 84 deletions(-)

-- 
2.35.3