Thank you all for your attention on this bug.

Sorry, my earlier comment on this bug was ill-informed and incorrect. I'm able 
to reproduce this through the server installer qemu/kvm based installs as 
well, so I can confirm that this isn't/wasn't NetworkManager or 
systemd-networkd related.

Also, I am concerned about ordering cloud-init.service specifically
after systemd-resolved.service on all deployments, as that would affect
all boots and delay them on systemd-resolved's setup of DNS when
only specific use cases, such as NoCloudNet with an FQDN in a kernel
cmdline directive, may need that service to be active.

Some other datasources like GCP do rely on DNS resolution of the
instance metadata service (IMDS), but cloud-images inject a config into
/etc/hosts to resolve that locally in the absence of active DNS in early
boot. Ec2 also defines instance-data:8773 as a potential fallback
IMDS definition, but both IPv4 and IPv6 endpoints are defined earlier in
the search order, so in all practical deployments we never fall back to
that DNS lookup.


We may be able to avoid the cost of a strict `After=systemd-resolved.service` 
clause in cloud-init.service if we add sensible retries to the NoCloud 
datasource, along these lines:

 1. Check if the seed URL's `netloc` is an IP address. If it is an IP, do
not retry on failure.

 2a. When the seed URL is non-IP, catch the specific 'network resolution
error' URLError and retry X times for that failure mode

 - or -

 2b. When the seed URL is non-IP, invoke socket.getaddrinfo to validate DNS
resolution prior to attempting to download metadata; if not resolvable,
retry only as long as systemd-resolved.service isn't yet active.
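
To make steps 1 and 2b concrete, here is a rough sketch. This is not 
existing cloud-init code: the helper names `host_is_ip` and `wait_for_dns`, 
the `max_tries`/`delay` parameters and the injectable `resolver`/`sleep` 
arguments are all made up for illustration. It also uses a bounded retry 
count as a stand-in for "until systemd-resolved.service is active", which 
would need a separate check.

```python
import ipaddress
import socket
import time
from urllib.parse import urlsplit


def host_is_ip(seed_url):
    """Return True if the seed URL's host is a literal IPv4/IPv6 address."""
    host = urlsplit(seed_url).hostname
    if host is None:
        return False
    try:
        ipaddress.ip_address(host)
    except ValueError:
        return False
    return True


def wait_for_dns(seed_url, max_tries=5, delay=1.0,
                 resolver=socket.getaddrinfo, sleep=time.sleep):
    """Return True once the seed URL's host resolves, False if it never does.

    Hypothetical helper: IP-literal hosts return True immediately (step 1,
    nothing to wait for). Other hosts are resolved up to max_tries times
    (step 2b), sleeping `delay` seconds between failed attempts.
    """
    if host_is_ip(seed_url):
        return True
    host = urlsplit(seed_url).hostname
    if host is None:
        return False
    for attempt in range(max_tries):
        try:
            resolver(host, None)
            return True
        except socket.gaierror:
            # Name resolution failed; DNS may not be set up yet in early boot.
            if attempt < max_tries - 1:
                sleep(delay)
    return False
```

The datasource would call something like 
`wait_for_dns("http://boot.linuxgroove.com/ubuntu/23.04/")` before fetching 
the seed, and only FQDN-based seeds would ever pay the retry delay.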


These retry approaches should allow us to avoid impacting typical boots on most 
systems, yet still support DNS-based needs for datasource detection in early 
boot if an FQDN is used for the IMDS.

-- 
https://bugs.launchpad.net/bugs/2008952

Title:
  DNS failure while trying to fetch user-data

Status in cloud-init:
  Triaged
Status in netplan:
  Invalid
Status in subiquity:
  New
Status in systemd package in Ubuntu:
  New

Bug description:
  In testing netboot + autoinstall of the new ubuntu desktop subiquity
  based installer for 23.04 I found cloud-init is failing to retrieve
  user-data because it can't resolve the hostname in the URL.  This
  same configuration does work for 22.04 based subiquity, so seems a
  regression.

  From the ipxe config:

  imgargs vmlinuz initrd=initrd \
   ip=dhcp \
   iso-url=http://cdimage.ubuntu.com/daily-live/pending/lunar-desktop-amd64.iso \
   fsck.mode=skip \
   layerfs-path=minimal.standard.live.squashfs \
   autoinstall \
   'ds=nocloud-net;s=http://boot.linuxgroove.com/ubuntu/23.04/' \

  That fails, but if we replace boot.linuxgroove.com with the IP it
  works.
