Hi, Michael,

On Wed, Jul 17, 2024 at 04:55:52AM -0400, Michael S. Tsirkin wrote:
> I just want to understand how we managed to have two threads
> talking in parallel. BQL is normally enough, which path
> manages to invoke vhost-user with BQL not taken?
> Just check BQL taken on each vhost user invocation and
> you will figure it out.

Prasad mentioned how the race happened in the cover letter:

https://lore.kernel.org/r/20240711131424.181615-1-ppan...@redhat.com

Thread-1                            Thread-2

vhost_dev_start                     postcopy_ram_incoming_cleanup
vhost_device_iotlb_miss             postcopy_notify
vhost_backend_update_device_iotlb   vhost_user_postcopy_notifier
vhost_user_send_device_iotlb_msg    vhost_user_postcopy_end
process_message_reply               process_message_reply
vhost_user_read                     vhost_user_read
vhost_user_read_header              vhost_user_read_header
"Fail to update device iotlb"       "Failed to receive reply to postcopy_end"

The normal case should be that Thread-2 is postcopy_ram_listen_thread(),
and this happens when postcopy migration is close to the end.

Thanks,

-- 
Peter Xu