Cédric Le Goater <c...@redhat.com> writes:

> On 1/5/24 19:04, Fabiano Rosas wrote:
>> The migration tests have support for being passed two QEMU binaries to
>> test migration compatibility.
>>
>> Add a CI job that builds the latest release of QEMU and another job
>> that uses that version plus an already present build of the current
>> version and runs the migration tests with the two, both as source and
>> destination. I.e.:
>>
>> old QEMU (n-1) -> current QEMU (development tree)
>> current QEMU (development tree) -> old QEMU (n-1)
>>
>> The purpose of this CI job is to ensure the code we're about to merge
>> will not cause a migration compatibility problem when migrating the
>> next release (which will contain that code) to/from the previous
>> release.
>>
>> I'm leaving the jobs as manual for now because using an older QEMU in
>> tests could hit bugs that were already fixed in the current
>> development tree and we need to handle those case-by-case.
>>
>> Note: for user forks, the version tags need to be pushed to gitlab,
>> otherwise it won't be able to check out a different version.
>>
>> Signed-off-by: Fabiano Rosas <faro...@suse.de>
>> ---
>>  .gitlab-ci.d/buildtest.yml | 53 ++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 53 insertions(+)
>>
>> diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml
>> index 91663946de..81163a3f6a 100644
>> --- a/.gitlab-ci.d/buildtest.yml
>> +++ b/.gitlab-ci.d/buildtest.yml
>> @@ -167,6 +167,59 @@ build-system-centos:
>>        x86_64-softmmu rx-softmmu sh4-softmmu nios2-softmmu
>>      MAKE_CHECK_ARGS: check-build
>>
>> +build-previous-qemu:
>> +  extends: .native_build_job_template
>> +  artifacts:
>> +    when: on_success
>> +    expire_in: 2 days
>> +    paths:
>> +      - build-previous
>> +    exclude:
>> +      - build-previous/**/*.p
>> +      - build-previous/**/*.a.p
>> +      - build-previous/**/*.fa.p
>> +      - build-previous/**/*.c.o
>> +      - build-previous/**/*.c.o.d
>> +      - build-previous/**/*.fa
>> +  needs:
>> +    job: amd64-opensuse-leap-container
>> +  variables:
>> +    QEMU_JOB_OPTIONAL: 1
>> +    IMAGE: opensuse-leap
>> +    TARGETS: x86_64-softmmu aarch64-softmmu
>> +  before_script:
>> +    - export QEMU_PREV_VERSION="$(sed 's/\([0-9.]*\)\.[0-9]*/v\1.0/' VERSION)"
>> +    - git checkout $QEMU_PREV_VERSION
>> +  after_script:
>> +    - mv build build-previous
>> +
>> +.migration-compat-common:
>> +  extends: .common_test_job_template
>> +  needs:
>> +    - job: build-previous-qemu
>> +    - job: build-system-opensuse
>> +  allow_failure: true
>> +  variables:
>> +    QEMU_JOB_OPTIONAL: 1
>> +    IMAGE: opensuse-leap
>> +    MAKE_CHECK_ARGS: check-build
>> +  script:
>> +    - cd build
>> +    - QTEST_QEMU_BINARY_SRC=../build-previous/qemu-system-${TARGET}
>> +        QTEST_QEMU_BINARY=./qemu-system-${TARGET} ./tests/qtest/migration-test
>> +    - QTEST_QEMU_BINARY_DST=../build-previous/qemu-system-${TARGET}
>> +        QTEST_QEMU_BINARY=./qemu-system-${TARGET} ./tests/qtest/migration-test
>> +
>> +migration-compat-aarch64:
>> +  extends: .migration-compat-common
>> +  variables:
>> +    TARGET: aarch64
>> +
>> +migration-compat-x86_64:
>> +  extends: .migration-compat-common
>> +  variables:
>> +    TARGET: x86_64
>
>
> What about the other archs, s390x and ppc? Do you lack the resources
> or are there any problems to address?
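(Side note first, since the version derivation in before_script may look
cryptic: the sed expression maps the development tree's VERSION onto the
tag of the previous release. A quick way to sanity-check it, assuming
VERSION currently contains something like 8.2.50, is:

  $ echo 8.2.50 | sed 's/\([0-9.]*\)\.[0-9]*/v\1.0/'
  v8.2.0

i.e. any x.y.z development version maps back to the vx.y.0 release tag.)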
Currently s390x and ppc are only tested with KVM, which means they are
not tested at all unless someone runs migration-test on a custom runner.
The same is true for this test.

The TCG variants have been disabled in migration-test:

  /*
   * On ppc64, the test only works with kvm-hv, but not with kvm-pr and TCG
   * is touchy due to race conditions on dirty bits (especially on PPC for
   * some reason)
   */

  /*
   * Similar to ppc64, s390x seems to be touchy with TCG, so disable it
   * there until the problems are resolved
   */

It would be great if we could figure out what these issues are and fix
them, so we could at least test with TCG like we do for aarch64.

Doing a TCG run of migration-test with both archs (one binary only, not
this series):

- ppc survived one run, taking 6 minutes longer than x86/aarch64;
- s390x survived one run, taking 40s less than x86/aarch64.

I'll leave them enabled on my machine and do some runs here and there
to see if I spot something. If not, we can consider re-enabling them
once we figure out why ppc takes so long.
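In case anyone wants to experiment locally while these archs stay out of
CI, this is roughly what the compat jobs run, shown here with s390x as
an example (it assumes a previous-release build under ../build-previous;
for a TCG run the guards quoted above would also need to be removed from
tests/qtest/migration-test.c):

  $ cd build
  # old -> new: the previous release is the migration source
  $ QTEST_QEMU_BINARY_SRC=../build-previous/qemu-system-s390x \
      QTEST_QEMU_BINARY=./qemu-system-s390x ./tests/qtest/migration-test
  # new -> old: the previous release is the migration destination
  $ QTEST_QEMU_BINARY_DST=../build-previous/qemu-system-s390x \
      QTEST_QEMU_BINARY=./qemu-system-s390x ./tests/qtest/migration-test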