Hi Valentin,

On 14-03-2020 23:02, Valentin Vidić wrote:
> On Sat, Mar 14, 2020 at 10:55:29PM +0100, Paul Gevers wrote:
>> Can you rephrase your question, I don't understand what you're asking.
>> Everything I can provide you is already available from ci.debian.net.
>
> Right, so when the test timeouts it hangs on something but this is not
> visible in the logs or anywhere. You mentioned that it causes problems
> for the whole host when this happens so I was thinking you chould send
> some more info like the 'ps aufx' so I can get an idea what it is doing
> when it hangs? If not I can just disable the last test and it should be
> fine again.

We are having issues with the infrastructure, and the end of the log
hints that this test may be one of the tests causing them. If I catch
such a failure while it is happening, I'll send you the $(ps auxf)
output, but I'm not inspecting every issue at the moment and regularly
just restart *all* the workers when many hang.

E.g. here [1] you can see the number of running lxc containers per
worker. It should toggle between 0 and 1 for the amd64 workers
(ci-worker-[0-9]*), but the counts are ramping up because the lxc
container doesn't always shut down. Some workers remain longer at one
level, while others jump multiple times per day. All workers are
identical from the provisioning point of view.

[1] https://ci.debian.net/munin/debci-day.html

Paul

