[Expired for Auto Package Testing because there has been no activity for 60 days.]

** Changed in: auto-package-testing
       Status: Incomplete => Expired

-- 
You received this bug notification because you are a member of
Canonical's Ubuntu QA, which is subscribed to Auto Package Testing.
https://bugs.launchpad.net/bugs/1988080

Title:
  cloud-worker-maintenance can hang

Status in Auto Package Testing:
  Expired

Bug description:
  The cloud-worker-maintenance job appeared to be stuck with the
  following in journalctl:

  Aug 27 16:58:12 juju-4d1272-prod-proposed-migration-5 cloud-worker-maintenance[3162016]: Error: Stopping the instance failed: websocket: close 1006 (abnormal closure): unexpected EOF
  Aug 27 16:58:12 juju-4d1272-prod-proposed-migration-5 cloud-worker-maintenance[3161610]: lxd-armhf-10.44.124.124:autopkgtest-lxd-cyynbq is old - deleting
  Aug 27 16:58:12 juju-4d1272-prod-proposed-migration-5 cloud-worker-maintenance[3161610]: Traceback (most recent call last):
  Aug 27 16:58:12 juju-4d1272-prod-proposed-migration-5 cloud-worker-maintenance[3161610]:   File "/home/ubuntu/autopkgtest-cloud/tools/cleanup-lxd", line 59, in <module>
  Aug 27 16:58:12 juju-4d1272-prod-proposed-migration-5 cloud-worker-maintenance[3161610]:     main()
  Aug 27 16:58:12 juju-4d1272-prod-proposed-migration-5 cloud-worker-maintenance[3161610]:   File "/home/ubuntu/autopkgtest-cloud/tools/cleanup-lxd", line 55, in main
  Aug 27 16:58:12 juju-4d1272-prod-proposed-migration-5 cloud-worker-maintenance[3161610]:     check_remote(remote)
  Aug 27 16:58:12 juju-4d1272-prod-proposed-migration-5 cloud-worker-maintenance[3161610]:   File "/home/ubuntu/autopkgtest-cloud/tools/cleanup-lxd", line 40, in check_remote
  Aug 27 16:58:12 juju-4d1272-prod-proposed-migration-5 cloud-worker-maintenance[3161610]:     subprocess.check_call(
  Aug 27 16:58:12 juju-4d1272-prod-proposed-migration-5 cloud-worker-maintenance[3161610]:   File "/usr/lib/python3.8/subprocess.py", line 364, in check_call
  Aug 27 16:58:12 juju-4d1272-prod-proposed-migration-5 cloud-worker-maintenance[3161610]:     raise CalledProcessError(retcode, cmd)
  Aug 27 16:58:12 juju-4d1272-prod-proposed-migration-5 cloud-worker-maintenance[3161610]: subprocess.CalledProcessError: Command '['lxc', 'delete', '--force', 'lxd-armhf-10.44.124.124:autopkgtest-lxd-cyynbq']' ret

  To work around the failure, we can restart the service and check
  whether it works again. If that does not help, delete the broken
  container and reboot the host.

  To stop it from happening again, Julian suggested adding
  "TimeoutSec=1h" to cloud-worker-maintenance as a minimum. Ideally the
  delete call would have a 10 minute timeout, with a wrapper for
  subprocess that handles the timeout.
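
The subprocess wrapper with a 10 minute timeout suggested in the bug description could look roughly like the sketch below. It is not taken from cleanup-lxd: the names run_with_timeout and DELETE_TIMEOUT are hypothetical, and returning False instead of raising is an assumption about how the maintenance loop would want to continue after a failed delete.

```python
import subprocess
import sys

# 10 minutes, the value suggested in the bug description (hypothetical name).
DELETE_TIMEOUT = 600


def run_with_timeout(cmd, timeout=DELETE_TIMEOUT):
    """Run cmd and return True on success, False on failure or timeout.

    subprocess.run() kills the direct child process and waits for it
    when the timeout expires, so a hung command cannot block forever.
    """
    try:
        subprocess.run(cmd, check=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        print(f"{cmd!r} timed out after {timeout}s", file=sys.stderr)
        return False
    except subprocess.CalledProcessError as e:
        print(f"{cmd!r} failed with exit status {e.returncode}", file=sys.stderr)
        return False
    return True
```

With something like this, the delete step in check_remote might call run_with_timeout(["lxc", "delete", "--force", name]) in place of subprocess.check_call(...), so that one hung or failed delete is logged and skipped rather than stalling or aborting the whole job. Only the direct child is killed on timeout, so the "TimeoutSec=1h" setting on the service would still be useful as an outer safety net.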