Thomas Huth <th...@redhat.com> writes:

> On 19/07/2023 21.56, Milan Zamazal wrote:
>> Thomas Huth <th...@redhat.com> writes:
>> 
>
>>> On 18/07/2023 14.55, Milan Zamazal wrote:
>>>> Thomas Huth <th...@redhat.com> writes:
>>>>
>>>
>>>>> On 11/07/2023 01.02, Michael S. Tsirkin wrote:
>>>>>> From: Milan Zamazal <mzama...@redhat.com>
>>>>>> We don't have a virtio-scmi implementation in QEMU and only support a
>>>>>> vhost-user backend.  This is very similar to virtio-gpio and we add
>>>>>> the same set of tests, just passing some vhost-user messages over the
>>>>>> control socket.
>>>>>>
>>>>>> Signed-off-by: Milan Zamazal <mzama...@redhat.com>
>>>>>> Acked-by: Thomas Huth <th...@redhat.com>
>>>>>> Message-Id: <20230628100524.342666-4-mzama...@redhat.com>
>>>>>> Reviewed-by: Michael S. Tsirkin <m...@redhat.com>
>>>>>> Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
>>>>>> ---
>>>>>>     tests/qtest/libqos/virtio-scmi.h |  34 ++++++
>>>>>>     tests/qtest/libqos/virtio-scmi.c | 174 +++++++++++++++++++++++++++++++
>>>>>>     tests/qtest/vhost-user-test.c    |  44 ++++++++
>>>>>>     MAINTAINERS                      |   1 +
>>>>>>     tests/qtest/libqos/meson.build   |   1 +
>>>>>>     5 files changed, 254 insertions(+)
>>>>>>     create mode 100644 tests/qtest/libqos/virtio-scmi.h
>>>>>>     create mode 100644 tests/qtest/libqos/virtio-scmi.c
>>>>>
>>>>>    Hi!
>>>>>
>>>>> I'm seeing some random failures with this new scmi test, so far only
>>>>> on non-x86 systems, e.g.:
>>>>>
>>>>>    https://app.travis-ci.com/github/huth/qemu/jobs/606246131#L4774
>>>>>
>>>>> It also reproduces on an s390x host here, but only if I run "make
>>>>> check -j$(nproc)" - if I run the tests single-threaded, the qos-test
>>>>> passes there. Seems like there is a race somewhere in this test?
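>>>>>
>>>>> (For reference, this is roughly what I am comparing; the exact
>>>>> invocation may differ on your setup:
>>>>>
>>>>>   make check -j$(nproc)   # fails intermittently here
>>>>>   make check -j1          # passes reliably
>>>>> )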
>>>> Hmm, it's basically the same as the virtio-gpio.c test, so it should
>>>> be OK.
>>>> Is it possible that the two tests (virtio-gpio.c & virtio-scmi.c)
>>>> interfere with each other in some way?  Is there possibly a way to
>>>> serialize them to check?
>>>
>>> I think within one qos-test, the sub-tests are already run
>>> serialized.
>> I see, OK.
>> 
>>> But there might be multiple qos-tests running in parallel, e.g. one
>>> for the aarch64 target and one for the ppc64 target. And indeed, I can
>>> reproduce the problem on my x86 laptop by running this in one terminal
>>> window:
>>>
>>> for ((x=0;x<1000;x++)); do \
>>>   QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon \
>>>   G_TEST_DBUS_DAEMON=./tests/dbus-vmstate-daemon.sh \
>>>   QTEST_QEMU_BINARY=./qemu-system-ppc64 \
>>>   MALLOC_PERTURB_=188 QTEST_QEMU_IMG=./qemu-img \
>>>   tests/qtest/qos-test -p \
>>>   /ppc64/pseries/spapr-pci-host-bridge/pci-bus-spapr/pci-bus/vhost-user-scmi-pci/vhost-user-scmi/vhost-user-scmi-tests/scmi/read-guest-mem/memfile \
>>>   || break ; \
>>> done
>>>
>>> And this in another terminal window at the same time:
>>>
>>> for ((x=0;x<1000;x++)); do \
>>>   QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon \
>>>   G_TEST_DBUS_DAEMON=./tests/dbus-vmstate-daemon.sh \
>>>   QTEST_QEMU_BINARY=./qemu-system-aarch64 \
>>>   MALLOC_PERTURB_=188 QTEST_QEMU_IMG=./qemu-img \
>>>   tests/qtest/qos-test -p \
>>>   /aarch64/virt/generic-pcihost/pci-bus-generic/pci-bus/vhost-user-scmi-pci/vhost-user-scmi/vhost-user-scmi-tests/scmi/read-guest-mem/memfile \
>>>   || break ; \
>>> done
>>>
>>> After a while, the aarch64 test broke with:
>>>
>>> /aarch64/virt/generic-pcihost/pci-bus-generic/pci-bus/vhost-user-scmi-pci/vhost-user-scmi/vhost-user-scmi-tests/scmi/read-guest-mem/memfile:
>>> qemu-system-aarch64: Failed to set msg fds.
>>> qemu-system-aarch64: Failed to set msg fds.
>>> qemu-system-aarch64: vhost VQ 0 ring restore failed: -22: Invalid argument (22)
>>> qemu-system-aarch64: Failed to set msg fds.
>>> qemu-system-aarch64: vhost VQ 1 ring restore failed: -22: Invalid argument (22)
>>> qemu-system-aarch64: Failed to set msg fds.
>>> qemu-system-aarch64: vhost_set_vring_call failed 22
>>> qemu-system-aarch64: Failed to set msg fds.
>>> qemu-system-aarch64: vhost_set_vring_call failed 22
>>> qemu-system-aarch64: Failed to write msg. Wrote -1 instead of 20.
>>> qemu-system-aarch64: Failed to set msg fds.
>>> qemu-system-aarch64: vhost VQ 0 ring restore failed: -22: Invalid argument (22)
>>> qemu-system-aarch64: Failed to set msg fds.
>>> qemu-system-aarch64: vhost VQ 1 ring restore failed: -22: Invalid argument (22)
>>> qemu-system-aarch64: ../../devel/qemu/hw/pci/msix.c:659: msix_unset_vector_notifiers: Assertion `dev->msix_vector_use_notifier && dev->msix_vector_release_notifier' failed.
>>> ../../devel/qemu/tests/qtest/libqtest.c:200: kill_qemu() detected QEMU death from signal 6 (Aborted) (core dumped)
>>> **
>>> ERROR:../../devel/qemu/tests/qtest/qos-test.c:191:subprocess_run_one_test: child process (/aarch64/virt/generic-pcihost/pci-bus-generic/pci-bus/vhost-user-scmi-pci/vhost-user-scmi/vhost-user-scmi-tests/scmi/read-guest-mem/memfile/subprocess [488457]) failed unexpectedly
>>> Aborted (core dumped)
>> Interesting, good discovery.
>> 
>>> Can you also reproduce it this way?
>> Unfortunately not.  I ran the loops several times and everything passed.
>> I tried to compile and run it in a different distro container and it
>> passed too.  I also haven't been able to come up with any idea of how
>> the processes could influence each other.
>> What OS and what QEMU configure flags did you use to compile and run it?
>
> I'm using RHEL 8 on an older laptop ... and maybe the latter is
> related: I just noticed that I can also reproduce the problem by
> running just one of the above two for-loops while otherwise putting a
> lot of load on the machine, e.g. by running a "make -j$(nproc)" to
> rebuild the whole QEMU sources. So it's definitely a race *within* one
> QEMU process.

Ah, great, now I can easily reproduce it by running a kernel compilation
in the background.  And I could also verify that the suspected fix
remedies the problem.  I'll post the patch soon.
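
For the record, my reproducer was along these lines (the kernel tree
path is illustrative; any heavy parallel build should do):

  # generate sustained CPU load in the background
  ( cd ~/src/linux && make -j"$(nproc)" ) & LOAD_PID=$!
  # ... run one of the qos-test for-loops from above in the foreground ...
  kill "$LOAD_PID"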

Thank you,
Milan

