Hi Paul,

As a further reference for upstream analysis, we got another debci
failure on arm64. Note that it took almost 3 hours to complete:

https://ci.debian.net/packages/g/guile-fibers/testing/arm64/63660342/#S14

126s autopkgtest [11:57:22]: test guile-tests-foreign: [-----------------------
127s assert #f equal to #f: ok
127s assert #t terminates: ok
128s assert (sleep 1) terminates: ok
129s assert (perform-operation (sleep-operation 1)) terminates: ok
129s assert (receive-from-fiber 42) equal to 42: ok
10126s assert (send-to-fiber 42) equal to 42: autopkgtest [14:44:02]: ERROR: timed out on ...
10127s autopkgtest [14:44:03]: test guile-tests-foreign: -----------------------]

Another for "tests-channels" on riscv64, which also took several hours:

https://ci.debian.net/packages/g/guile-fibers/testing/riscv64/63660348/

409s autopkgtest [12:02:15]: test guile-tests-channels: [-----------------------
20409s assert run-fibers on (rpc 1) terminates: autopkgtest [17:35:35]: ERROR: timed out on ...
20409s autopkgtest [17:35:35]: test guile-tests-channels: -----------------------]

Another one for "tests-foreign", which also took hours:

https://ci.debian.net/packages/g/guile-fibers/testing/riscv64/63660356/#S14

581s autopkgtest [12:05:48]: test guile-tests-foreign: [-----------------------
584s assert #f equal to #f: ok
584s assert #t terminates: ok
585s assert (sleep 1) terminates: ok
586s assert (perform-operation (sleep-operation 1)) terminates: ok
20581s assert (receive-from-fiber 42) equal to 42: autopkgtest [17:39:08]: ERROR: timed out on ...
20581s autopkgtest [17:39:08]: test guile-tests-foreign: -----------------------]

These tests should be fairly quick, certainly not taking hours. Is
there a way to lower the debci timeout to, say, 30 minutes to avoid
them consuming CPU time?

Paul Gevers <[email protected]> writes:

> On 24-08-2025 14:10, Simon Josefsson wrote:
>> I have triggered a bunch of jobs for some other archs too, but this
>> appears to be amd64-specific:
>
> Interesting. Our amd64 worker is the most powerful host that we
> have. Might it be a race condition, or something related to
> parallelism?

Yes, this is a parallelism-heavy package, and the self-tests stress
this. On old systems I would expect this to trigger libc and kernel
bugs, but I think on any modern system the problem is more likely to be
within guile-fibers.

>> Paul, would the patch below improve the situation for you in Debian, or
>> doesn't it matter until we stop making this test flaky?
>
> Sure it does help on the infrastructure, but it does paper over the
> real problem.

I was thinking about how to lower the severity of this bug report. Is
'Serious' the right criticality for a flaky debci failure? If we mark
the (apparently) flaky tests with 'flaky', would it still be 'Serious'?

My plan is to make another upload and mark all the tests we have seen
to be flaky as 'flaky', so that they hopefully won't disturb any debci
workflow as much. Maybe this allows us to lower the severity to Normal
and consider this an upstream bug? I suspect it will take time to
resolve: I started a similar dance with Shepherd upstream bugs about
flaky tests a couple of months ago, and we are still not finished.

>> I suppose we
>> could remove the test from the debian/tests/ but I believe it actually
>> indicate a serious upstream problem that we want to get resolved.
>
> Are you talking about only this test, or the whole test stanza in
> d/t/control? In my opinion removing an individual test from a suite is
> better than marking a full stanza as flaky.

Each test has its own stanza. I don't think it is possible to separate
each test further in any simple way. If one of the stanzas fails
spuriously, I think the right thing is to mark that one flaky until
upstream resolves it (or we realize it is a Debian-specific problem).

>> Btw, what is the workflow that ends up noticing about flaky test in
>> guile-fibers? I would expect guile-fibers to not have any reverse build
>> dependencies in Debian except for packages I work on.
>
> I'm currently checking all packages where the last "pure" testing run
> in the last 2 months is failing. This includes flaky tests.

I see, thank you for explaining and doing that!

/Simon
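
For reference, marking individual stanzas as flaky in d/t/control could
look roughly like the sketch below. The test names are taken from the
logs above; the exact layout of this package's control file is an
assumption, and any other fields the real stanzas carry are left out.
`Restrictions: flaky` is the autopkgtest restriction for tests that are
expected to fail intermittently.

```
# Sketch only: stanza layout assumed, other fields omitted.
Tests: guile-tests-foreign
Restrictions: flaky

Tests: guile-tests-channels
Restrictions: flaky
```

Because each test has its own stanza, the restriction can be limited to
the tests that have actually been seen to time out.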

