Vladimir Sementsov-Ogievskiy <vsement...@yandex-team.ru> wrote:
> On 26.07.23 16:32, Thomas Huth wrote:
>> On 26/07/2023 15.00, Peter Maydell wrote:
>>> On Wed, 26 Jul 2023 at 13:06, Juan Quintela <quint...@redhat.com> wrote:
>>>> To make things easier, this is the part that show how it breaks (this is
>>>> the gcov test):
>>>>
>>>> 357/423 qemu:block / io-qcow2-copy-before-write
>>>> ERROR 6.38s exit status 1
>>>>>>> PYTHON=/builds/juan.quintela/qemu/build/pyvenv/bin/python3
>>>> MALLOC_PERTURB_=44
>>>> /builds/juan.quintela/qemu/build/pyvenv/bin/python3
>>>> /builds/juan.quintela/qemu/build/../tests/qemu-iotests/check -tap
>>>> -qcow2 copy-before-write --source-dir
>>>> /builds/juan.quintela/qemu/tests/qemu-iotests --build-dir
>>>> /builds/juan.quintela/qemu/build/tests/qemu-iotests
>>>> ――――――――――――――――――――――――――――――――――――― ✀
>>>> ―――――――――――――――――――――――――――――――――――――
>>>> stderr:
>>>> ---
>>>> /builds/juan.quintela/qemu/tests/qemu-iotests/tests/copy-before-write.out
>>>> +++
>>>> /builds/juan.quintela/qemu/build/scratch/qcow2-file-copy-before-write/copy-before-write.out.bad
>>>> @@ -1,5 +1,21 @@
>>>> -....
>>>> +...F
>>>> +======================================================================
>>>> +FAIL: test_timeout_break_snapshot (__main__.TestCbwError)
>>>> +----------------------------------------------------------------------
>>>> +Traceback (most recent call last):
>>>> + File
>>>> "/builds/juan.quintela/qemu/tests/qemu-iotests/tests/copy-before-write",
>>>> line 210, in test_timeout_break_snapshot
>>>> + self.assertEqual(log, """\
>>>> +AssertionError: 'wrot[195 chars]read 1048576/1048576 bytes at
>>>> offset 0\n1 MiB,[46 chars]c)\n' != 'wrot[195 chars]read failed:
>>>> Permission denied\n'
>>>> + wrote 524288/524288 bytes at offset 0
>>>> + 512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>>>> + wrote 524288/524288 bytes at offset 524288
>>>> + 512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>>>> ++ read failed: Permission denied
>>>> +- read 1048576/1048576 bytes at offset 0
>>>> +- 1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>>>> +
>>>
>>> This iotest failing is an intermittent that I've seen running
>>> pullreqs on master. I tend to see it on the s390 host. I
>>> suspect a race condition somewhere where it fails if the host
>>> is heavily loaded.
>> It's obviously a failure in an iotest, so let's CC: the
>> corresponding people (done now).
>>
>
> Sorry for long delay.
>
> Does it still fail?
>
> In the test we expect that copy-before-write operation fails (because
> of throttling and timeout), and therefore snapshot is broken and next
> read from snapshot should fail.
>
> But most probably the copy-before-write operation succeeded in this
> case for some reason.. I don't think that throttling and timeouts in
> block layer can guarantee some determinism.. But usually it works.
>
> we use throttling with bps-write = 300 * 1024, i.e. 300KB per second. and
> cbw-timeout is set to 1 second.
>
> Then we do write 512K,
>
> then the comment say:
> # We need second write to trigger throttling
>
> and we write another 512K.
>
> first 512K are written, and we should wait 512/300 = 1.7 seconds since
> _start_ of that write before issuing the second one.. But if write was
> slow we may have to wait less than a second from finish of the first
> write start the second one. Then timeout will not fire.
>
> ====
>
> I see two possible ways to fix that:
>
> 1. decrease bps-write a bit. For example to 200 BPS.
>
> 2. rework the test to use null-co instead of real images. This way we will
> not suffer from unstable IO duration.
>
>
> So, is the problem still fire sometimes?

For me it is random. When it happens, it keeps happening for a long
time. Then it stops, and it doesn't happen for a while. It is not
happening for me right now.

Later, Juan.
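
For reference, the timing argument quoted above can be written down as a
small Python sketch. It only covers the simplified model from Vladimir's
mail and is not QEMU's actual throttling code. The helper names
throttle_delay and timeout_fires are made up for illustration, "300KB"
and "512K" are taken as multiples of 1024 bytes, and option 1 is assumed
to mean 200 * 1024 bytes per second.

```python
KIB = 1024


def throttle_delay(write_size, bps_limit):
    """Seconds that must pass, counted from the start of the first write,
    before the throttle lets the next write through (simplified model)."""
    return write_size / bps_limit


def timeout_fires(first_write_duration, bps_limit,
                  write_size=512 * KIB, cbw_timeout=1.0):
    """True if the second write would wait longer than cbw_timeout.

    Assumption: the second write is issued right when the first one
    finishes, so it only has to wait for the rest of the throttle delay.
    """
    remaining_wait = throttle_delay(write_size, bps_limit) - first_write_duration
    return remaining_wait > cbw_timeout


if __name__ == "__main__":
    for bps in (300 * KIB, 200 * KIB):
        for duration in (0.1, 0.8, 1.2):
            print(f"bps-write={bps // KIB} KiB/s, first write took "
                  f"{duration}s: timeout fires = "
                  f"{timeout_fires(duration, bps)}")
```

Under this model, with bps-write = 300 * 1024 the timeout fires only if
the first write finishes in less than about 0.7 seconds (512/300 - 1).
With 200 * 1024 that margin grows to about 1.56 seconds (512/200 - 1),
which is why lowering the limit should make the test less sensitive to
slow IO.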