On Fri, Sep 13, 2024 at 02:31:34PM +0100, Peter Maydell wrote:
> On Fri, 13 Sept 2024 at 13:24, Peter Maydell <peter.mayd...@linaro.org> wrote:
> >
> > On Thu, 12 Sept 2024 at 16:10, Peter Maydell <peter.mayd...@linaro.org> wrote:
> > >
> > > The cross-i686-tci CI job is persistently flaky with various tests
> > > hitting timeouts. One theory for why this is happening is that we're
> > > running too many tests in parallel and so sometimes a test gets
> > > starved of CPU and isn't able to complete within the timeout.
> > >
> > > (The environment this CI job runs in seems to cause us to default
> > > to a parallelism of 9 in the main CI.)
> > >
> > > Signed-off-by: Peter Maydell <peter.mayd...@linaro.org>
> > > ---
> > > If this works we might be able to wind this up to -j2 or -j3,
> > > and/or consider whether other CI jobs need something similar.
> >
> > I gave this a try, but unfortunately the result seems to be
> > that the whole job times out:
> > https://gitlab.com/qemu-project/qemu/-/jobs/7818441897
>
> ...but then this simple retry passed with a runtime of 47 mins:
> https://gitlab.com/qemu-project/qemu/-/jobs/7819225200
>
> I'm tempted to commit this as-is, and see whether it helps.
> If it doesn't I can always back it off to -j2, and if it does
> generate a lot of full-job-timeouts it's only me it's annoying.
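For illustration, a minimal sketch of the kind of cap being discussed; the
exact invocation is an assumption, not the text of the actual patch. QEMU's
"make check" drives the meson test harness, so the parallelism could be
limited there rather than left at the runner default:

    # Hypothetical sketch, not the actual patch: cap test-harness
    # parallelism instead of letting it default to ~9 on these runners.
    # Assumes a configured build tree in ./build.
    meson test -C build --num-processes 1 --print-errorlogs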
Anyone know how many vCPUs our k8s runners have?

The gitlab runners that contributor forks use will have 2 vCPUs, so our
current make -j$(nproc+1) will effectively be -j3 already in pipelines
for forks. IOW, we intentionally slightly over-commit CPUs right now.

Backing off to just -j$(nproc) may be better than hardcoding -j1/-j2,
so that it takes account of different runner sizes?

With regards,
Daniel

-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org       -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org -o-   https://www.instagram.com/dberrange :|
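A minimal sketch of that alternative, assuming the job's test step boils
down to a plain "make check" in a ./build tree; the invocation is
illustrative, not lifted from the current CI scripts:

    # Size test parallelism from the runner rather than hardcoding it,
    # so 2-vCPU fork runners back off while larger runners keep speed.
    make -C build check -j"$(nproc)"

    # versus the current, slightly over-committed default (roughly):
    make -C build check -j"$(( $(nproc) + 1 ))"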