It was suggested I should write down a list of other issues we have on
the autobuilder right now. Some of these are related to the transition
to new infrastructure, some are previously seen issues occurring more
frequently, some are entirely new. I appreciate these should all have
bugs. I'm copying various people including those helping with SWAT so
we can transition those we don't get solved into bugzilla.

a) insane do_package_qa intermittent failures
---------------------------------------------

Two levels of problems. The cachedpath code returns False instead of
raising exceptions, so there were unexpected API changes. The recent insane
changes also stopped handling DEBIAN files correctly, leading to races.
This is perhaps the best backtrace we have:

https://valkyrie.yoctoproject.org//#/builders/43/builds/246

Who/Plan: RP has a partial revert for the first part to fix builds.
Ross is aware and working on the second.
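
To illustrate the first problem, here is a minimal sketch of how a stat
wrapper that returns False on failure differs from one that raises. The
class and function names are hypothetical and this is not the actual
cachedpath or insane code; the point is only that a caller written for
exceptions misbehaves when it is handed False instead.

```python
import os
import tempfile


class CachingStat:
    """Hypothetical cache: remembers lstat results, stores False on failure."""

    def __init__(self):
        self.cache = {}

    def lstat(self, path):
        if path not in self.cache:
            try:
                self.cache[path] = os.lstat(path)
            except OSError:
                # Failure is recorded as False rather than re-raised.
                self.cache[path] = False
        return self.cache[path]


def size_expecting_exception(statfunc, path):
    """Caller written against os.lstat semantics (raises on a missing file)."""
    try:
        return statfunc(path).st_size
    except OSError:
        return None


def size_handling_false(statfunc, path):
    """Caller that copes with both conventions."""
    try:
        st = statfunc(path)
    except OSError:
        return None
    return st.st_size if st else None


# A path that is guaranteed not to exist.
missing = os.path.join(tempfile.mkdtemp(), "does-not-exist")
cached = CachingStat()
```

With os.lstat the first caller quietly returns None for the missing
file. With cached.lstat it dies with an AttributeError, because False has
no st_size. The second caller returns None in both cases.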

b) docs build failing on missing inkscape
-----------------------------------------

e.g. https://valkyrie.yoctoproject.org//#/builders/34/builds/9

We could install inkscape on all the workers but I'm reluctant to do so
as it potentially starts to add large dependency chains and reduces our
host dependency checking. We could do with other tools to generate pdf
and epub docs during release and docs builds too.

Proposal is to add a new buildtools-docs target to the AB which can
pull in the meta-oe layer and build a more fully featured docs
buildtools. This would also be useful for the screenshot QA imagemagick
tests we want to add.

Who/Plan: TBD

c) source mirroring failing on AB
---------------------------------

https://valkyrie.yoctoproject.org//#/builders/82/builds/15
https://valkyrie.yoctoproject.org//#/builders/82/builds/16

Sources are still mirrored off typhoon but we've stopped that cluster
and are running off valkyrie. This could mean sources don't appear
"fast" in the mirror until that mirroring moves to the new NAS.

Who/Plan: Michael to transition mirroring to run off valkyrie NAS

d) CDN artefacts are failing
----------------------------

Same deal as sources, we need to move the mirroring to be based off
valkyrie's NAS.

Who/Plan: Michael to transition CDN to run off valkyrie NAS

e) bitbake server timeout issues
--------------------------------

https://valkyrie.yoctoproject.org//#/builders/48/builds/185/steps/14/logs/stdio

I've saved the failure data at:

https://valkyrie.yocto.io/pub/shared-failure-data/fedora41-vk-1-selftest/

 - the json logs are something we've not had before for that
 - the key message in cookerdaemon is "Idle loop didn't finish queued
   commands after 30s, exiting."
 - that message is there twice, it failed twice
 - can we decode the json logs and get timestamps?
 - looks like it happens in particular in the siggen code?
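
On decoding the json logs, a rough sketch of pulling timestamps out. It
assumes, and this is unverified against the saved files, that each line
is one JSON object with a "created" epoch-seconds field and a "msg"
field, as Python logging records have. The field names and the sample
lines are made up for illustration.

```python
import json
from datetime import datetime, timezone


def decode_log_lines(lines, ts_field="created", msg_field="msg"):
    """Yield (UTC timestamp string, message) for each parseable JSON line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(record, dict) or ts_field not in record:
            continue  # skip records without the assumed timestamp field
        stamp = datetime.fromtimestamp(float(record[ts_field]), tz=timezone.utc)
        yield stamp.strftime("%Y-%m-%d %H:%M:%S.%f"), record.get(msg_field, "")


# Made-up sample input, not taken from the real logs.
sample = [
    '{"created": 0.5, "msg": "first"}',
    'not json',
    '{"created": 30.5, "msg": "second"}',
]
decoded = list(decode_log_lines(sample))
```

If the real logs use different keys, only the ts_field and msg_field
arguments should need changing.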

Who/Plan: TBD

f) CVE database corruption
--------------------------

See list discussion:

https://lists.openembedded.org/g/openembedded-core/message/205715

Has a bug:

https://bugzilla.yoctoproject.org/show_bug.cgi?id=14899

I'm at a loss on this one. Started to wonder if an rsync job is
trampling the file. Check with Michael.

Who/Plan: Ask Michael about rsync job

g) Toaster test issues
----------------------

Toaster testing is more intermittent on the new faster workers. Have
some patches in progress but it has highlighted issues with the tests.

Help in changing "assertTrue(X in Y)" to "assertIn(X, Y)" in the
toaster tests would be welcome. I've made a few patches, more are
needed.

Help in being able to delete a project from the database before
starting tests would also be useful. Having trouble doing it in the
tests themselves due to database locking and can't work out the way to
call the REST DELETE API from selenium yet.

My plan is to lengthen the timeouts and drop all the sleep/poll calls,
clean up the timeouts. This will make the tests much faster too. This
does mean adding "wait for alert to display" code. Have tried doing
this but tests need fixing.
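
The idea behind that plan, as a plain Python sketch that runs without a
browser. wait_until and FakeAlert are hypothetical names for
illustration; in the real tests the equivalent would presumably be
selenium's WebDriverWait with a visibility condition rather than this
helper.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.05):
    """Return condition()'s first truthy value, or raise after timeout seconds."""
    deadline = time.monotonic() + timeout
    while True:
        value = condition()
        if value:
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1fs" % timeout)
        time.sleep(poll)


class FakeAlert:
    """Stand-in for a page element that becomes visible after a delay."""

    def __init__(self, delay):
        self.visible_at = time.monotonic() + delay

    def is_displayed(self):
        return time.monotonic() >= self.visible_at


alert = FakeAlert(delay=0.2)
start = time.monotonic()
wait_until(alert.is_displayed, timeout=5.0)
elapsed = time.monotonic() - start
```

The wait returns as soon as the alert shows up, so a generous timeout
only costs time when something is actually broken. A fixed sleep costs
its full duration on every run and still fails on a slow worker.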

Who/Plan: RP has ideas but would welcome help. RP needs to send a
"debugging toaster tests" email with tips.

h) weird networking issues causing test failures
------------------------------------------------

See original email in this thread:
https://lists.openembedded.org/g/openembedded-core/message/205723

RP is at a loss.

Who/Plan: TBD

i) rust toolchain test failures mips/ppc
----------------------------------------

e.g.: https://valkyrie.yoctoproject.org//#/builders/21/builds/226

Who/Plan: Need to write email asking for mips/ppc help. Can
Adrian/Chuck help?

j) SPDX build warnings
----------------------

RP hasn't merged Joshua's patch. Need to include in next testing run.

Who/Plan: RP to test and merge patch

k) ssh test still causing failures
----------------------------------

RP's fix looks to be incorrect, it changed the wrong number. Correct fix
queued in master-next.

Who/Plan: RP to test and merge patch


I'm sure there are more but I've put the ones I have in my head down
for now. I'm pretty sure people find fixing one issue painful, trying
to keep track of this many in my head is bad enough without trying to
fix them! Help on any of these is welcome.

Cheers,

Richard
