Thomas Huth <th...@redhat.com> writes:

> On 20/12/2024 17.28, Peter Xu wrote:
>> On Thu, Dec 19, 2024 at 03:53:22PM -0300, Fabiano Rosas wrote:
>>> Stefan Hajnoczi <stefa...@redhat.com> writes:
>>>
>>>> Hi Fabiano,
>>>> Please take a look at this CI failure:
>>>>
>>>>>>> MALLOC_PERTURB_=61 QTEST_QEMU_BINARY=./qemu-system-s390x 
>>>>>>> UBSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1
>>>>>>>  QTEST_QEMU_IMG=./qemu-img MESON_TEST_ITERATION=1 
>>>>>>> MSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1
>>>>>>>  ASAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1 
>>>>>>> PYTHON=/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/pyvenv/bin/python3
>>>>>>>  QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon 
>>>>>>> G_TEST_DBUS_DAEMON=/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/tests/dbus-vmstate-daemon.sh
>>>>>>>  
>>>>>>> /home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/tests/qtest/migration-test
>>>>>>>  --tap -k
>>>> ――――――――――――――――――――――――――――――――――――― ✀  
>>>> ―――――――――――――――――――――――――――――――――――――
>>>> stderr:
>>>> Traceback (most recent call last):
>>>>    File "/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/scripts/analyze-migration.py", line 688, in <module>
>>>>      dump.read(dump_memory = args.memory)
>>>>    File "/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/scripts/analyze-migration.py", line 625, in read
>>>>      section.read()
>>>>    File "/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/scripts/analyze-migration.py", line 461, in read
>>>>      field['data'] = reader(field, self.file)
>>>>    File "/home/gitlab-runner/builds/4S3awx_3/0/qemu-project/qemu/build/scripts/analyze-migration.py", line 434, in __init__
>>>>      for field in self.desc['struct']['fields']:
>>>> KeyError: 'fields'
>>>
>>> This is the command line that runs only this specific test:
>>>
>>> PYTHON=/usr/bin/python3.11 QTEST_QEMU_BINARY=./qemu-system-s390x
>>> ./tests/qtest/migration-test -p /s390x/migration/analyze-script
>>>
>>> I cannot reproduce in migration-next nor in the detached HEAD that the
>>> pipeline ran in (had to download the tarball from gitlab).
>>>
>>> The only s390 patch in this PR is one that I can test just fine with
>>> TCG, so there shouldn't be any difference from KVM (i.e. there should be
>>> no state being migrated with KVM that is not already migrated with TCG).
>>>
>>>> warning: fd: migration to a file is deprecated. Use file: instead.
>>>> warning: fd: migration to a file is deprecated. Use file: instead.
>>>
>>> This is harmless.
>>>
>>>> **
>>>> ERROR:../tests/qtest/migration-test.c:36:main: assertion failed (ret == 0): (1 == 0)
>>>> (test program exited with status code -6)
>>>
>>> This is the assert at the end of the tests, irrelevant.
>>>
>>>>
>>>> https://gitlab.com/qemu-project/qemu/-/jobs/8681858344#L8190
>>>>
>>>> If you find this pull request caused the failure, please send a new
>>>> revision. Otherwise please let me know so we can continue to
>>>> investigate.
>>>
>>> I don't have an s390x host at hand so the only thing I can do is to drop
>>> that patch and hope that resolves the problem. @Peter, @Thomas, any
>>> other ideas? Can you verify this on your end?
>> 
>> Cannot reproduce either here, x86_64 host only.  The report was from s390
>> host, though.  I'm not familiar with the s390 patch, I wonder if any of you
>> could use plain brain power to figure more things out.
>> 
>> We could wait for 1-2 more days to see whether Thomas can figure it out,
>> hopefully easily reproducible on s390.. or we can also leave that for
>> later.  And if the current issue on such fix is s390-host-only, might be
>> easier to be picked up by s390 tree, perhaps?
>
> I tested migration-20241217-pull-request on an s390x (RHEL) host, but I 
> cannot reproduce the issue there - make check-qtest works without any 
> problems. Is it maybe related to that specific Ubuntu installation?
>

Since we cannot reproduce outside of the staging CI, could we run that
job again with a diagnostic patch? Here's the rebased PR with the patch:

https://gitlab.com/farosas/qemu/-/commits/migration-next

(fork CI run: https://gitlab.com/farosas/qemu/-/pipelines/1610691202)

Or should I just send a v2 of this PR with the debug patch?
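
A minimal sketch of the kind of diagnostic that could help with the
KeyError above, assuming the goal is simply to see which description
lacks the 'fields' key. This is not the patch in the branch linked
above; the helper name struct_fields and the sample dictionaries are
hypothetical and only mirror the self.desc['struct']['fields'] access
shown in the traceback.

```python
import json
import sys


def struct_fields(desc):
    """Return desc['struct']['fields'], dumping desc to stderr if it is missing.

    Hypothetical helper: it wraps the lookup that fails in the traceback
    so that the offending description shows up in the CI log.
    """
    struct = desc.get("struct", {})
    if "fields" not in struct:
        print("struct description without 'fields':", file=sys.stderr)
        # default=str keeps the dump from failing on non-JSON values
        print(json.dumps(desc, indent=2, default=str), file=sys.stderr)
        raise KeyError("fields")
    return struct["fields"]


# Made-up descriptions, for illustration only.
good = {"name": "example", "struct": {"fields": [{"name": "a", "size": 4}]}}
bad = {"name": "example-without-fields", "struct": {}}

for field in struct_fields(good):
    print(field["name"])

try:
    struct_fields(bad)
except KeyError:
    print("bad description was reported on stderr")
```

The exception is still raised, so the test keeps failing as before; the
only change is that the log would contain the description that
triggered it.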
