On Tue, Sep 12, 2023, 12:14 Daniel P. Berrangé <berra...@redhat.com> wrote:
> On Tue, Sep 12, 2023 at 05:01:26PM +0100, Alex Bennée wrote:
> >
> > Daniel P. Berrangé <berra...@redhat.com> writes:
> >
> > > On Tue, Sep 12, 2023 at 11:06:11AM -0400, Stefan Hajnoczi wrote:
> > >> The avocado-system-alpine, avocado-system-fedora, and
> > >> avocado-system-ubuntu jobs are unreliable. I identified them while
> > >> looking over CI failures from the past week:
> > >> https://gitlab.com/qemu-project/qemu/-/jobs/5058610614
> > >> https://gitlab.com/qemu-project/qemu/-/jobs/5058610654
> > >> https://gitlab.com/qemu-project/qemu/-/jobs/5030428571
> > >>
> > >> Thomas Huth suggested on IRC today that there may be a legitimate failure
> > >> in there:
> > >>
> > >> th_huth: f4bug, yes, seems like it does not start at all correctly on
> > >> alpine anymore ... and it's broken since ~ 2 weeks already, so if nobody
> > >> noticed this by now, this is worrying
> > >>
> > >> It crept in because the jobs were already unreliable.
> > >>
> > >> I don't know how to interpret the job output, so all I can do is to
> > >> propose removing these jobs. A useful CI job has two outcomes: pass or
> > >> fail. Timeouts and other in-between states are not useful because they
> > >> require constant triaging by someone who understands the details of the
> > >> tests and they can occur when run against pull requests that have
> > >> nothing to do with the area covered by the test.
> > >>
> > >> Hopefully test owners will be able to identify the root causes and solve
> > >> them so that these jobs can stay. In their current state the jobs are
> > >> not useful since I cannot tell whether job failures are real or
> > >> just intermittent when merging qemu.git pull requests.
> > >>
> > >> If you are a test owner, please take a look.
> > >>
> > >> It is likely that other avocado-system-* CI jobs have similar failures
> > >> from time to time, but I'll leave them as long as they are passing.
> > >>
> > >> Buglink: https://gitlab.com/qemu-project/qemu/-/issues/1884
> > >> Signed-off-by: Stefan Hajnoczi <stefa...@redhat.com>
> > >> ---
> > >>  .gitlab-ci.d/buildtest.yml | 27 ---------------------------
> > >>  1 file changed, 27 deletions(-)
> > >>
> > >> diff --git a/.gitlab-ci.d/buildtest.yml b/.gitlab-ci.d/buildtest.yml
> > >> index aee9101507..83ce448c4d 100644
> > >> --- a/.gitlab-ci.d/buildtest.yml
> > >> +++ b/.gitlab-ci.d/buildtest.yml
> > >> @@ -22,15 +22,6 @@ check-system-alpine:
> > >>      IMAGE: alpine
> > >>      MAKE_CHECK_ARGS: check-unit check-qtest
> > >>
> > >> -avocado-system-alpine:
> > >> -  extends: .avocado_test_job_template
> > >> -  needs:
> > >> -    - job: build-system-alpine
> > >> -      artifacts: true
> > >> -  variables:
> > >> -    IMAGE: alpine
> > >> -    MAKE_CHECK_ARGS: check-avocado
> > >
> > > Instead of entirely deleting, I'd suggest adding
> > >
> > >   # Disabled due to frequent random failures
> > >   # https://gitlab.com/qemu-project/qemu/-/issues/1884
> > >   when: manual
> > >
> > > See example: https://docs.gitlab.com/ee/ci/yaml/#when
> > >
> > > This disables the job from running unless someone explicitly
> > > tells it to run
> >
> > What I don't understand is why we didn't gate the release back when they
> > first tripped. We should have noticed between:
> >
> >   https://gitlab.com/qemu-project/qemu/-/pipelines/956543770
> >
> > and
> >
> >   https://gitlab.com/qemu-project/qemu/-/pipelines/957154381
> >
> > that the system tests were regressing. Yet we merged the changes
> > anyway.
>
> I think that green series is misleading, based on Richard's
> mail on list wrt the TCG pull series:
>
>   https://lists.gnu.org/archive/html/qemu-devel/2023-08/msg04014.html
>
>   "It's some sort of timing issue, which sometimes goes away
>    when re-run. I was re-running tests *a lot* in order to
>    get them to go green while running the 8.1 release."
>
> Essentially I'd put this down to the tests being so non-deterministic
> that we've given up trusting them.

Yes.

Stefan

> With regards,
> Daniel
> --
> |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o- https://fstop138.berrange.com :|
> |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
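
For reference, the `when: manual` alternative quoted above might look roughly like the following when applied to one of the affected jobs. This is only a sketch: the job body is copied from the lines removed in the quoted diff, the indentation is assumed, and the placement of the comment and `when:` key inside the job is a guess at what was intended.

```yaml
# Sketch only: job body taken from the quoted diff, indentation assumed.
avocado-system-alpine:
  extends: .avocado_test_job_template
  needs:
    - job: build-system-alpine
      artifacts: true
  variables:
    IMAGE: alpine
    MAKE_CHECK_ARGS: check-avocado
  # Disabled due to frequent random failures
  # https://gitlab.com/qemu-project/qemu/-/issues/1884
  when: manual
```

With this in place the job would stay in the pipeline definition but only start when someone triggers it by hand. How a job-level `when` interacts with anything inherited from `.avocado_test_job_template` is not covered in the thread and would need checking.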