On Thu, Aug 29, 2024 at 02:45:45PM +0530, Prasad Pandit wrote:
> Hello Michael,
>
> On Thu, 29 Aug 2024 at 13:12, Michael S. Tsirkin <m...@redhat.com> wrote:
> > Weird. Seems to indicate some kind of deadlock?
>
> * Such a deadlock should occur across all environments I guess, not
> sure why it happens selectively. It is strange.

Some kind of race?

> > So maybe vhost_user_postcopy_end should take the BQL?
> ===
> diff --git a/migration/savevm.c b/migration/savevm.c
> index e7c1215671..31acda3818 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -2050,7 +2050,9 @@ static void *postcopy_ram_listen_thread(void *opaque)
>           */
>          qemu_event_wait(&mis->main_thread_load_event);
>      }
> +    bql_lock();
>      postcopy_ram_incoming_cleanup(mis);
> +    bql_unlock();
>
>      if (load_res < 0) {
>          /*
> ===
>
> * Actually a BQL patch above was tested and it worked fine. But not
> sure if it is an acceptable solution. Another contention was taking
> BQL could make things more complicated, so a local vhost-user specific
> lock should be better.
>
> ...wdyt?
> ---
>   - Prasad

Keep it simple, is my advice. Not causing regressions is good.

-- 
MST