Hi,

On Wed, Apr 13, 2022 at 05:21:03PM +0000, build...@builder.wildebeest.org wrote:
> A new failure has been detected on builder elfutils-centos-x86_64 while
> building elfutils.
>
> Full details are available at:
>     https://builder.wildebeest.org/buildbot/#builders/1/builds/932
>
> Build state: failed test (failure)
> Revision: 399b55a75830f1854c8da9f29282810e82f270b6
> Worker: centos-x86_64
> Build Reason: (unknown)
> Blamelist: Mark Wielaard <m...@klomp.org>
>
> Steps:
> [...]
> - 8: make check ( failure )
>     Logs:
>         - stdio: https://builder.wildebeest.org/buildbot/#builders/1/builds/932/steps/8/logs/stdio
>         - test-suite.log: https://builder.wildebeest.org/buildbot/#builders/1/builds/932/steps/8/logs/test-suite_log

Hmmm, this seems a little random. The change was just adding some
(unused) constants to dwarf.h.

The log says:

command timed out: 1200 seconds without output running ['make', 'check', '-j4'], attempting to kill
process killed by signal 9
program finished with exit code -1
elapsedTime=1925.591587

It looks like run-debuginfod-federation-sqlite.sh is missing.
Looking at the buildbot worker, the end of
run-debuginfod-federation-sqlite.sh.log is:

+ mvalue=2
+ '[' -z 2 ']'
+ echo 'metric thread_work_total{role="groom"}: 2'
metric thread_work_total{role="groom"}: 2
+ '[' 2 -eq 2 ']'
+ break
+ '[' 18 -eq 0 ']'
+ curl -s http://127.0.0.1:9112/buildid/beefbeefbeefd00dd00d/debuginfo
+ curl -s http://127.0.0.1:9112/metrics
+ grep 'error_count.*sqlite'
error_count{sqlite3="database disk image is malformed"} 6
error_count{sqlite3="file is encrypted or is not a database"} 1
+ kill -INT 28184 28371
+ wait 28184 28371

Which seems to correspond to this part in
run-debuginfod-federation-sqlite.sh:

########################################################################
# Corrupt the sqlite database and get debuginfod to trip across its errors
curl -s http://127.0.0.1:$PORT1/metrics | grep 'sqlite3.*reset'
dd if=/dev/zero of=$DB bs=1 count=1

# trigger some random activity that's Sure to get sqlite3 upset
kill -USR1 $PID1
wait_ready $PORT1 'thread_work_total{role="traverse"}' 2
wait_ready $PORT1 'thread_work_pending{role="scan"}' 0
wait_ready $PORT1 'thread_busy{role="scan"}' 0
kill -USR2 $PID1
wait_ready $PORT1 'thread_work_total{role="groom"}' 2
curl -s http://127.0.0.1:$PORT1/buildid/beefbeefbeefd00dd00d/debuginfo > /dev/null || true
curl -s http://127.0.0.1:$PORT1/metrics | grep 'error_count.*sqlite'

# Run the tests again without the servers running. The target file should
# be found in the cache.

kill -INT $PID1 $PID2
wait $PID1 $PID2

So maybe corrupting the sqlite database prevents a proper shutdown of
the debuginfod process?

Cheers,

Mark
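
P.S. If the hang really is in the final "wait", one way to make it show
up as an ordinary test failure (instead of the buildbot killing make
after 1200 seconds) would be to bound the wait. The following is only a
rough sketch: stop_with_timeout is a hypothetical helper that does not
exist in the elfutils test suite, and the 60 second limit in the usage
line is an arbitrary assumption.

```shell
#!/bin/sh
# Hypothetical helper (not part of the elfutils tests).
# Usage: stop_with_timeout SECONDS PID...
# Sends SIGINT to every PID, polls once a second for up to SECONDS,
# then sends SIGKILL to anything still alive.
# Returns 0 if all PIDs went away without SIGKILL, 1 otherwise.
stop_with_timeout () {
  secs_left=$1; shift
  # The processes may already be gone, so ignore kill errors.
  kill -INT "$@" 2>/dev/null || true
  while [ "$secs_left" -gt 0 ]; do
    alive=0
    for pid in "$@"; do
      # kill -0 only checks whether the process still exists.
      kill -0 "$pid" 2>/dev/null && alive=1
    done
    [ "$alive" -eq 0 ] && break
    sleep 1
    secs_left=$((secs_left - 1))
  done
  status=0
  for pid in "$@"; do
    if kill -0 "$pid" 2>/dev/null; then
      echo "process $pid did not exit after SIGINT, sending SIGKILL" >&2
      kill -KILL "$pid" 2>/dev/null || true
      status=1
    fi
  done
  # Collect the exit statuses so no zombies are left behind.
  wait "$@" 2>/dev/null || true
  return $status
}
```

In the test this would replace the last two lines with something like
"stop_with_timeout 60 $PID1 $PID2 || exit 1". It would not fix whatever
keeps debuginfod from shutting down, it would only make the failure
visible in the test log.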