Tim Andersson has proposed merging ~andersson123/autopkgtest-cloud:missing-tests into autopkgtest-cloud:master.

Requested reviews:
  Canonical's Ubuntu QA (canonical-ubuntu-qa)

For more details, see:
https://code.launchpad.net/~andersson123/autopkgtest-cloud/+git/autopkgtest-cloud/+merge/477440

Hopefully a fix for losing all these tests recently :/

--
Your team Canonical's Ubuntu QA is requested to review the proposed merge of ~andersson123/autopkgtest-cloud:missing-tests into autopkgtest-cloud:master.

diff --git a/charms/focal/autopkgtest-cloud-worker/autopkgtest-cloud/worker/worker b/charms/focal/autopkgtest-cloud-worker/autopkgtest-cloud/worker/worker
index c281b0f..9483f8a 100755
--- a/charms/focal/autopkgtest-cloud-worker/autopkgtest-cloud/worker/worker
+++ b/charms/focal/autopkgtest-cloud-worker/autopkgtest-cloud/worker/worker
@@ -1401,16 +1401,28 @@ def request(msg):
                         msg.channel.basic_reject(
                             msg.delivery_tag, requeue=True
                         )
+                        kill_openstack_server(test_uuid)
+                        # return here so the worker can go back to listening for
+                        # test requests
+                        return
                     else:
+                        # Tim Andersson:
+                        # We've recently (as of 28/11/2024) been losing test requests.
+                        # This block was the cause - autopkgtest would exit with an abnormal exit code,
+                        # and this block would assume that an admin had intentionally killed the test,
+                        # causing the message to be removed from the queue. Prior to this logic that I mention here,
+                        # the test request would go back in the queue, causing the test to loop forever. The best option
+                        # here I believe is to count the failure as a "real" failure - then a.u.c admins can much more easily
+                        # investigate the issue, as the result will go into the database, and the log will be available
+                        # in the swift storage.
+                        # Setting retry to 3 causes this whole convoluted block to not execute again.
                         logging.warning(
-                            "autopkgtest failure not requested via systemd, removing message %s from queue",
+                            "autopkgtest has failed with an unknown code (%i), removing message %s from queue and counting as a real failure so admins can more easily debug the issue.",
+                            code,
                             body.encode(),
                         )
                         msg.channel.basic_ack(msg.delivery_tag)
-                    kill_openstack_server(test_uuid)
-                    # return here so the worker can go back to listening for
-                    # test requests
-                    return
+                        retry = 3
                 else:
                     if num_failures >= 3:
                         logging.warning(
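
For reviewers who want the control flow in isolation, here is a minimal sketch of the branch as changed by the diff above. It is not the worker's code. `FakeChannel`, `handle_abnormal_exit` and the `killed_via_systemd` flag are hypothetical names invented for illustration, and the condition that selects the first branch is assumed from the old log message ("not requested via systemd"). The `kill_openstack_server(test_uuid)` call is left out.

```python
from dataclasses import dataclass, field


@dataclass
class FakeChannel:
    """Hypothetical stand-in for the AMQP channel; it only records calls."""

    calls: list = field(default_factory=list)

    def basic_reject(self, delivery_tag, requeue):
        self.calls.append(("reject", delivery_tag, requeue))

    def basic_ack(self, delivery_tag):
        self.calls.append(("ack", delivery_tag))


def handle_abnormal_exit(channel, delivery_tag, killed_via_systemd, retry):
    """Return (stop_processing, retry) after an abnormal autopkgtest exit.

    Assumption: killed_via_systemd models whatever check the worker uses to
    decide that the failure was requested deliberately.
    """
    if killed_via_systemd:
        # Requested kill: requeue the request and stop handling this message
        # (the diff also kills the OpenStack server before returning).
        channel.basic_reject(delivery_tag, requeue=True)
        return True, retry
    # Unknown exit code: remove the message from the queue but keep going, so
    # the run is recorded as a real failure. retry = 3 prevents another pass.
    channel.basic_ack(delivery_tag)
    return False, 3


if __name__ == "__main__":
    channel = FakeChannel()
    print(handle_abnormal_exit(channel, 1, killed_via_systemd=False, retry=0))
    print(channel.calls)
```

The point of the sketch is the asymmetry between the two branches: only the requested-kill path returns early, while the unknown-exit-code path acknowledges the message and falls through so that a result is still recorded.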