On Thu, Mar 31, 2022, 6:47 AM Li Zhang <lizh...@suse.de> wrote: > On 3/31/22 08:10, Hanna Reitz wrote: > > On 29.03.22 18:49, Li Zhang wrote: > >> Update what I observed. > >> > >> It seems that aqmp is not stable when running test cases. > >> So I revert the patches as the following, iotest40/41 test cases work > >> well. > > > > Thanks for bisecting. I haven’t seen this problem before, so I didn’t > > look into it; CC-ing John, perhaps he figures something. > > > Thanks a lot. > It happens occassionally. I didn't find a way to reproduce it manually. >
Sorry for the instability. A few questions then: - what operating system and version - kernel version - architecture (looks like x86 and others?) - what commit/version of QEMU are you testing? - what python version are you using to run iotests? > > Hanna > > > >> commit 76cd358671e6b8e7c435ec65b1c44200254514a9 > >> > >> Author: John Snow <js...@redhat.com> > >> > >> Date: Tue Oct 26 13:56:12 2021 -0400 > >> > >> > >> > >> python, iotests: replace qmp with aqmp > >> > >> > >> > >> Swap out the synchronous QEMUMonitorProtocol from qemu.qmp with > >> the sync > >> > >> wrapper from qemu.aqmp instead. > >> > >> > >> > >> Add an escape hatch in the form of the environment variable > >> > >> QEMU_PYTHON_LEGACY_QMP which allows you to cajole QEMUMachine into > >> using > >> > >> the old implementation, proving that both implementations work > >> > >> concurrently. > >> > >> > >> > >> Signed-off-by: John Snow <js...@redhat.com> > >> > >> Reviewed-by: Kevin Wolf <kw...@redhat.com> > >> > >> Reviewed-by: Hanna Reitz <hre...@redhat.com> > >> > >> Message-id: 20211026175612.4127598-9-js...@redhat.com > >> > >> Signed-off-by: John Snow <js...@redhat.com> > >> > >> > >> commit 1611e6cf4e7163f6102b37010a8b7e7120f468b5 > >> > >> Author: John Snow <js...@redhat.com> > >> > >> Date: Thu Nov 18 15:46:18 2021 -0500 > >> > >> > >> > >> python/machine: handle "fast" QEMU terminations > >> > >> > >> > >> In the case that the QEMU process actually launches -- but then > >> dies so > >> > >> quickly that we can't establish a QMP connection to it -- > QEMUMachine > >> > >> currently calls _post_shutdown() assuming that it never launched > >> the VM > >> > >> process. > >> > >> > >> > >> This isn't true, though: it "merely" may have failed to establish > >> a QMP > >> > >> connection and the process is in the middle of its own exit path. > >> > >> > >> > >> If we don't wait for the subprocess, the caller may get a bogus > >> `None` > >> > >> return for .exitcode(). This behavior was observed from > >> > >> device-crash-test; after the switch to Async QMP, the timings were > >> > >> changed such that it was now seemingly possible to witness the > >> failure > >> > >> of "vm.launch()" *prior* to the exitcode becoming available. > >> > >> > >> > >> The semantic of the `_launched` property is changed in this > >> > >> patch. Instead of representing the condition "launch() executed > >> > >> successfully", it will now represent "has forked a child process > >> > >> successfully". This way, wait() when called in the exit path won't > >> > >> become a no-op. > >> > >> > >> > >> Signed-off-by: John Snow <js...@redhat.com> > >> > >> Reviewed-by: Willian Rampazzo <willi...@redhat.com> > >> > >> Message-id: 20211118204620.1897674-6-js...@redhat.com > >> > >> Signed-off-by: John Snow <js...@redhat.com> > >> > >> > >> > >> > >> > >> On 3/25/22 11:17, Li Zhang wrote: > >>> Hi, > >>> > >>> I backport some iotests patches to the tree and change timeout. > >>> It doesn't work. > >>> > >>> Sometimes, iotest41 also reports the errors. > >>> [ 1347s] > >>> +====================================================================== > >>> [ 1347s] +ERROR: test_top_node_in_wrong_chain > (__main__.TestSingleDrive) > >>> [ 1347s] > >>> +---------------------------------------------------------------------- > >>> [ 1347s] +Traceback (most recent call last): > >>> [ 1347s] + File > >>> > "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py", > >>> line 399, in launch > >>> [ 1347s] + self._launch() > >>> [ 1347s] + File > >>> > "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py", > >>> line 434, in _launch > >>> [ 1347s] + self._post_launch() > >>> [ 1347s] + File > >>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/qtest.py", > line > >>> 147, in _post_launch > >>> [ 1347s] + super()._post_launch() > >>> [ 1347s] + File > >>> > "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py", > >>> line 340, in _post_launch > >>> [ 1347s] + self._qmp.accept(self._qmp_timer) > >>> [ 1347s] + File > >>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py", > >>> line 69, in accept > >>> [ 1347s] + timeout > >>> [ 1347s] + File > >>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py", > >>> line 42, in _sync > >>> [ 1347s] + asyncio.wait_for(future, timeout=timeout) > >>> [ 1347s] + File "/usr/lib64/python3.6/asyncio/base_events.py", line > >>> 488, in run_until_complete > >>> [ 1347s] + return future.result() > >>> [ 1347s] + File "/usr/lib64/python3.6/asyncio/tasks.py", line 362, > >>> in wait_for > >>> [ 1347s] + raise futures.TimeoutError() > >>> [ 1347s] +concurrent.futures._base.TimeoutError > >>> > >>> > >>> I can see other errors like this, it's the problem of the socket. > >>> > >>> [ 1535s] socket_accept failed: Resource temporarily unavailable > >>> [ 1535s] ** > >>> [ 1535s] > >>> ERROR:../tests/qtest/libqtest.c:321:qtest_init_without_qmp_handshake: > >>> assertion failed: (s->fd >= 0 && s->qmp_fd >= 0) > >>> > >>> > >>> The script is running the command as this: > >>> /usr/bin/make -O -j4 check-block V=1 > >>> > >>> I can see the errors on ppc, arm or x86. > >>> But I couldn't reproduce it when I run it manually. > >>> > >>> It will be appreciated if any suggestions. Thanks. > >>> > >>> > >>> On 3/24/22 14:47, Li Zhang wrote: > >>>> Hi, > >>>> > >>>> When I run the testsuit on our buidling system, it reports a timeout > >>>> sometimes not always as the following. > >>>> It couldn't connect qmp socket. Any ideas about this problem? > >>>> > >>>> > >>>> [ 1989s] --- > >>>> /home/abuild/rpmbuild/BUILD/qemu-6.2.0/tests/qemu-iotests/040.out > >>>> [ 1989s] +++ 040.out.bad > >>>> [ 1989s] @@ -1,5 +1,55 @@ > >>>> [ 1989s] > >>>> -................................................................. > >>>> [ 1989s] +....ERROR:qemu.aqmp.qmp_client.qemu-6471:Failed to > >>>> establish connection: asyncio.exceptions.CancelledError > >>>> [ 1989s] > >>>> > +E..................................ERROR:qemu.aqmp.qmp_client.qemu-6471:Failed > > >>>> to establish connection: asyncio.exceptions.CancelledError > >>>> [ 1989s] +E......................... > >>>> [ 1989s] > >>>> > +====================================================================== > >>>> [ 1989s] +ERROR: test_commit_node (__main__.TestActiveZeroLengthImage) > >>>> [ 1989s] > >>>> > +---------------------------------------------------------------------- > >>>> [ 1989s] +Traceback (most recent call last): > >>>> [ 1989s] + File > >>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/tests/qemu-iotests/040", > >>>> line 94, in setUp > >>>> [ 1989s] + self.vm.launch() > >>>> [ 1989s] + File > >>>> > "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py", > >>>> line 399, in launch > >>>> [ 1989s] + self._launch() > >>>> [ 1989s] + File > >>>> > "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py", > >>>> line 434, in _launch > >>>> [ 1989s] + self._post_launch() > >>>> [ 1989s] + File > >>>> > "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/qtest.py", > >>>> line 147, in _post_launch > >>>> [ 1989s] + super()._post_launch() > >>>> [ 1989s] + File > >>>> > "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/machine/machine.py", > >>>> line 340, in _post_launch > >>>> [ 1989s] + self._qmp.accept(self._qmp_timer) > >>>> [ 1989s] + File > >>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py", > >>>> line 67, in accept > >>>> [ 1989s] + self._sync( > >>>> [ 1989s] + File > >>>> "/home/abuild/rpmbuild/BUILD/qemu-6.2.0/python/qemu/aqmp/legacy.py", > >>>> line 41, in _sync > >>>> [ 1989s] + return self._aloop.run_until_complete( > >>>> [ 1989s] + File "/usr/lib64/python3.8/asyncio/base_events.py", line > >>>> 616, in run_until_complete > >>>> [ 1989s] + return future.result() > >>>> [ 1989s] + File "/usr/lib64/python3.8/asyncio/tasks.py", line 501, > >>>> in wait_for > >>>> [ 1989s] + raise exceptions.TimeoutError() > >>>> [ 1989s] +asyncio.exceptions.TimeoutError > >>>> > >>> > >>> > >> > > > > >