Andreas, I have updated the Impact and Test Plan accordingly. Please let me know if you still have additional questions.
** Description changed:

  [ Impact ]

- Cloud-init handles 503's generically as any other errors even though it
- is a message from the server of when and how to retry later. This is
- issues within AWS.
+ Cloud-init has handled 503s generically, like any other server error.
+ However, a 503 indicates a temporary problem and means that the client
+ should try again later. This lack of try-again behavior causes
+ problems in AWS, where they expect cloud-init to retry when receiving a
+ 503. The code has been updated to retry according to the Retry-After
+ response header, or to retry 1 second later if no Retry-After is
+ included. These retries will happen until cloud-init no longer receives
+ a 503. This change is not datasource specific.

  [ Test Plan ]

  Boot an instance. Observe a 503 from the IMDS when contacting the
  metadata service. Ensure the request is eventually retried and
  succeeds.

  [ Where problems could occur ]

  This required a somewhat hefty refactor, so there is a bit more risk
  to this one. If a bug was introduced here, it could lead to failure to
  obtain metadata and inability for an instance to fully come up. That
  said, the suite of integration tests will detect such a problem for
  the vast majority of use cases.

  [ Other info ]

  Upstream issue: https://github.com/canonical/cloud-init/issues/5577
  Upstream PR: https://github.com/canonical/cloud-init/pull/5938

** Description changed:

  [ Impact ]

  Cloud-init has handled 503s generically, like any other server error.
  However, a 503 indicates a temporary problem and means that the client
  should try again later. This lack of try-again behavior causes
  problems in AWS, where they expect cloud-init to retry when receiving a
  503. The code has been updated to retry according to the Retry-After
  response header, or to retry 1 second later if no Retry-After is
  included. These retries will happen until cloud-init no longer receives
  a 503. This change is not datasource specific.

  [ Test Plan ]

  Boot an instance. Observe a 503 from the IMDS when contacting the
  metadata service. Ensure the request is eventually retried and
  succeeds.

+ Note that this behavior isn't consistently reproducible, so we may need
+ to rely on a mocked metadata service in order to test this
+ functionality.

  [ Where problems could occur ]

  This required a somewhat hefty refactor, so there is a bit more risk
  to this one. If a bug was introduced here, it could lead to failure to
  obtain metadata and inability for an instance to fully come up. That
  said, the suite of integration tests will detect such a problem for
  the vast majority of use cases.

  [ Other info ]

  Upstream issue: https://github.com/canonical/cloud-init/issues/5577
  Upstream PR: https://github.com/canonical/cloud-init/pull/5938

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2094858

Title:
  Cloud-init fails on AWS if IMDSv2 returns a 503 error.
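
For anyone following along, here is a rough sketch of the retry behavior described in [ Impact ], together with the kind of mocked metadata service mentioned in [ Test Plan ]. It is not the code from the upstream PR. The names `retry_after_seconds`, `fetch_with_503_retry` and `FakeIMDS` are hypothetical, and the `send` callable returning a `(status, headers, body)` tuple is an assumption made so the example runs without a network.

```python
import time
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(headers, default=1.0):
    """Return the delay asked for by Retry-After, or `default` if absent/invalid."""
    # Header names are case-insensitive, so normalize before the lookup.
    lowered = {key.lower(): value for key, value in headers.items()}
    value = lowered.get("retry-after")
    if value is None:
        return default
    value = value.strip()
    if value.isdigit():  # delta-seconds form, e.g. "2"
        return float(value)
    try:  # HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    return max(0.0, (when - datetime.now(timezone.utc)).total_seconds())


def fetch_with_503_retry(send, sleep=time.sleep, default_delay=1.0):
    """Call `send()` until it stops returning 503, sleeping between attempts.

    Hypothetical helper: the loop is unbounded, mirroring the description
    ("until cloud-init no longer receives a 503").
    """
    while True:
        status, headers, body = send()
        if status != 503:
            return status, body
        sleep(retry_after_seconds(headers, default_delay))


class FakeIMDS:
    """Mocked metadata service that replays a fixed list of responses."""

    def __init__(self, responses):
        self._responses = iter(responses)

    def __call__(self):
        return next(self._responses)


# Two 503s (one with Retry-After, one without), then a success.
fake = FakeIMDS([
    (503, {"Retry-After": "2"}, b""),
    (503, {}, b""),
    (200, {}, b"i-0123456789abcdef0"),
])
delays = []  # record requested sleeps instead of actually sleeping
status, body = fetch_with_503_retry(fake, sleep=delays.append)
```

Passing `delays.append` as the `sleep` argument lets a test check which delays were requested without waiting for them, which is one way to cover a 503 path that cannot be reproduced reliably against the real IMDS.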