* Paolo Bonzini (pbonz...@redhat.com) wrote:
> 
> 
> On 07/03/2016 13:49, Dr. David Alan Gilbert wrote:
> > b) The harder problem is that there's a race where qemu_bh_delete
> > segs, and I'm not 100% sure why yet - it only does it sometime
> > (i.e. run virt-test and leave it and it occasionally does it).
> > From the core it looks like qemu->bh is corrupt (0x10101010...)
> > so maybe mis has been freed at that point?
> > I'm suspecting this is the postcopy_ram_listen_thread freeing
> > mis at the end of it, but I don't know yet.
> 
> That should be it.  Maybe the patch can simply be reverted, because
> loadvm_postcopy_handle_run runs from a thread and not a coroutine.  Is
> this correct?
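
(For reference, the race I'm chasing looks roughly like the sketch below.
It only uses the BH and thread APIs; the struct, fields and function names
are made up for illustration, not the real migration code:)

#include "qemu/osdep.h"
#include "qemu/main-loop.h"
#include "qemu/thread.h"

/* Made-up stand-in for MigrationIncomingState; the field names are invented. */
typedef struct IncomingState {
    QEMUBH *bh;
    QemuThread listen_thread;
} IncomingState;

static void run_bh(void *opaque)
{
    IncomingState *is = opaque;

    /* ... start the guest running ... */
    qemu_bh_delete(is->bh);    /* (3) blows up if 'is' was freed in (2) */
}

static void *listen_thread_fn(void *opaque)
{
    IncomingState *is = opaque;

    /* ... keep loading postcopied pages until the source says it's done ... */
    g_free(is);                /* (2) listener frees the state when it finishes */
    return NULL;
}

static void start_incoming(IncomingState *is)
{
    qemu_thread_create(&is->listen_thread, "listen",
                       listen_thread_fn, is, QEMU_THREAD_JOINABLE);

    is->bh = qemu_bh_new(run_bh, is);
    qemu_bh_schedule(is->bh);  /* (1) main loop runs run_bh at some later point */
}

If the listener wins the race between (2) and (3), run_bh dereferences
freed memory, which would fit the corrupt-looking bh in the core.
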
That's still in the main thread; the 'run' comes from the packaged
postcopy state, but it arrives after the 'listener' thread has been started.

I need to understand this anyway; the way it's supposed to work is that
if postcopy is being used then not much cleanup happens in
process_incoming_migration_co - instead it exits and lets
postcopy_ram_listen_thread do the cleanup at the end.  I've not quite
figured out what's going on here, but it almost looks like both of them
are cleaning up - that shouldn't happen.

> However I have a bug or two for you to fix, too:
> 
> 1) as far as I can see, postcopy_ram_listen_thread is not holding the
> mutex during the call to qemu_loadvm_state_main.  Is that a bug?

No; the guest is running, and the only thing that gets loaded by that
listen thread is data that's postcopied - i.e. currently just RAM pages,
which are loaded atomically.

> 2) no one is currently joining mis->listen_thread, I suspect it
> actually should be QEMU_THREAD_DETACHED.

OK, that looks like the easier one (rough sketch of the change below).

Dave

> > :)
> 
> Paolo

-- 
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
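
Sketch for (2), assuming the listen thread is started in
loadvm_postcopy_handle_listen via qemu_thread_create; the exact argument
values here are from memory, not a tested diff:

/* Create the listen thread detached so nothing ever has to join it;
 * a detached thread releases its own resources when it exits. */
qemu_thread_create(&mis->listen_thread, "postcopy/listen",
                   postcopy_ram_listen_thread, NULL,
                   QEMU_THREAD_DETACHED);   /* was QEMU_THREAD_JOINABLE */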