On Fri, Jun 06, 2025 at 09:48:41AM -0500, JAEHOON KIM wrote:
>
> On 6/6/2025 8:40 AM, Fabiano Rosas wrote:
> > Jaehoon Kim<jh...@linux.ibm.com> writes:
> >
> > > When the source VM attempts to connect to the destination VM's Unix
> > > domain socket(cpr.sock) during CPR transfer, the socket file might not
> > > yet be exist if the destination side hasn't completed the bind
> > > operation. This can lead to connection failures when running tests with
> > > the qtest framework.
> > >
> > Could you provide us the output of qtest in this case? Are you simply
> > running make check or something else?
>
> Yes, I'm simply running 'make check-qtest-s390x'.
>
> Here's the qtest output from the failure:
> # {
> #     "error": {
> #         "class": "GenericError",
> #         "desc": "Failed to connect to '/tmp/migration-test-ZC7Z72/cpr.sock': No such file or directory"
> #     }
> # }
> not ok /s390x/migration/mode/transfer - ERROR:../tests/qtest/libqtest.c:1453:qtest_vqmp_assert_success_ref: assertion failed: (qdict_haskey(response, "return"))
> Bail out!

So this is showing a failure when using

  $QEMU -incoming cpr:...address...

as opposed to

  $QEMU -incoming cpr:defer

I presume in the former case, the test is spawning QEMU, but the startup
of QEMU & its listening on the UNIX socket is not synchronized with the
parent process.

In the latter case, using 'defer', listening will be synchronized by the
QMP command used to set up the incoming socket.

So why do we see a race with "-incoming cpr:..address", but not with a
traditional "-incoming ...address.." for non-CPR code?

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
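
For illustration only, the unsynchronized bind/connect race discussed in this thread can be reproduced outside QEMU. The sketch below is not QEMU or qtest code: `delayed_listener`, `connect_with_retry` and `demo` are hypothetical names, the 0.3 s delay is an arbitrary stand-in for destination startup time, and polling is just one possible way to synchronize (it is not necessarily what the patch under discussion does).

```python
import os
import socket
import tempfile
import threading
import time


def delayed_listener(path, delay):
    """Stand-in for the destination: binds and listens only after `delay` seconds."""
    time.sleep(delay)
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    srv.bind(path)          # the socket file appears only at this point
    srv.listen(1)
    conn, _ = srv.accept()
    conn.close()
    srv.close()


def connect_with_retry(path, timeout=5.0, interval=0.05):
    """Poll until the UNIX socket accepts a connection or `timeout` expires."""
    deadline = time.monotonic() + timeout
    while True:
        cli = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            cli.connect(path)
            return cli
        except (FileNotFoundError, ConnectionRefusedError):
            # File not created yet, or created but not yet listening.
            cli.close()
            if time.monotonic() >= deadline:
                raise
            time.sleep(interval)


def demo():
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, "cpr.sock")
    listener = threading.Thread(target=delayed_listener, args=(path, 0.3))
    listener.start()

    # Unsynchronized attempt: the peer has not called bind() yet.
    first_error = None
    cli = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        cli.connect(path)
    except FileNotFoundError as exc:
        first_error = type(exc).__name__
    finally:
        cli.close()

    # Synchronized attempt: wait until the listener is actually ready.
    cli = connect_with_retry(path)
    cli.close()
    listener.join()

    os.unlink(path)
    os.rmdir(tmpdir)
    return first_error, True


if __name__ == "__main__":
    print(demo())
```

The first connect fails with the same "No such file or directory" condition seen in the qtest output, because nothing orders the client's connect after the listener's bind. With 'defer', that ordering comes from the QMP command instead of from polling.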