On Mon, 7 Oct 2024 at 17:41, Thomas Huth <th...@redhat.com> wrote:
>
> On 07/10/2024 16.13, Peter Maydell wrote:
> >> Some of the other qmp-cmd-test
> >> runs in that job also came close to timing out:
> >>
> >> 102/109 qemu:qtest+qtest-m68k   / qtest-m68k/qmp-cmd-test   OK  56.56s  65 subtests passed
> >> 105/109 qemu:qtest+qtest-mips64 / qtest-mips64/qmp-cmd-test OK  53.74s  65 subtests passed
> >> 106/109 qemu:qtest+qtest-s390x  / qtest-s390x/qmp-cmd-test  OK  45.48s  65 subtests passed
> >>
> >> so maybe we should add it to slow_tests with a 120s
> >> timeout...
>
> Ok, m68k and s390x have been touched by this PR ... but still, it's one
> qtest (qmp-cmd-test) that is failing for multiple targets, so it rather
> sounds like we've got a regression in one of the previous PRs?

I think it's more likely that the k8s runners are just
horrifically inconsistent about speed: they have been
the flakiest of our CI jobs in one way or another at
least since I started doing pullreq handling for this
release cycle.

If they reliably ran these jobs in 20s then there would be
no issue: we would have tons of headroom between that and
the 60s timeout. (My local dev box runs them in 13s, and
it's not super high-powered.) If they reliably took 60s
then we'd have fixed up the timeouts already (but that
would imply a very slow CPU).
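
If we do end up just bumping the per-test timeout as suggested
above, I think it's a one-line change in tests/qtest/meson.build,
roughly like this (untested sketch from memory; I'm assuming the
dictionary there is still called slow_qtests and still maps test
name to timeout in seconds):

  # sketch only: give qmp-cmd-test 120s instead of the default 60s
  slow_qtests = {
    'qmp-cmd-test' : 120,
    # ...plus the existing entries...
  }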

Our other option would be to use that meson "multiply
all the timeouts by X" feature for the k8s jobs. Of
course if it does go that slowly for the whole job
then we run into the whole-job timeout...
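
Concretely that's "meson test"'s -t / --timeout-multiplier
option, so for the k8s jobs it would be something like the
following (sketch only; I haven't checked exactly how those job
definitions pass extra arguments through to the test harness):

  # hypothetical: run the check target with every meson test
  # timeout doubled, assuming the MTESTARGS passthrough from
  # make to 'meson test' still works
  make check MTESTARGS="-t 2"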

Paolo: do you have any idea why our k8s runner jobs
have such inconsistent performance?

-- PMM
