** Description changed:
+ [Impact]
+
+ cloud-init makes DNS queries early in boot, before systemd-networkd is
+ active. The query causes systemd-resolved to query registered hooks,
+ which then socket-activates systemd-networkd. But, this causes a boot
+ delay, because systemd-networkd cannot start until the relevant cloud-
+ init phase has completed.
+
+ Hence, the user impact is a delay in boot on certain clouds.
+
+ [Test Plan]
+
+ TODO
+
+ [Where problems could occur]
+
+ The fix is to order systemd-networkd-resolve-hook.socket After=network-
+ pre.target. This avoids the situation where the socket for the resolve
+ hook is bound before systemd-networkd could actually be started.
+
+ Hence, problems introduced by this would most likely be related to unit
+ ordering or dependency problems.
+
+ [Other info]
+
+ The fix comes from upstream: systemd-networkd-resolve-hook.socket
+
+ [Original Description]
+
`cloud-init`: `26.1-0ubuntu1`
`systemd`: `259.5-0ubuntu2`
`datasource`: `DataSourceGCELocal`
`interface`: `ens4`
DHCP used by `cloud-init`: `dhcpcd 10.3.0`
Confirmed clouds affected: Google cloud
On dev resolute images in GCE first boot is being delayed by about ~30s
during `cloud-init`'s local datasource discovery. Specifically when
`/etc/resolv.conf` is symlinked to the stub-resolver (`/etc/resolv.conf
-> ../run/systemd/resolve/stub-resolv.conf`) which is the default.
On initial boot `cloud-init analyze blame` shows that `init-
local/search-GCELocal` takes about ~31s and `/var/log/cloud-init.log`
shows that although `DHCP` completes quickly, the `DNS` resolution of
`metadata.google.internal` takes about 30s before the first metadata
request is sent. The same slow boot’s logs also show that `systemd-
resolved` times out on `io.systemd.Network` resolve hook, and that
`systemd-networkd.service` doesn't start until after `cloud-init-
local.service` *finishes*.
To test the theory I manually mounted a vanilla image (I had to do it
like this because `livecd-rootfs` forces a specific /etc/resolv.conf
file on you...) and edited it so that `/etc/resolv.conf` pointed to the
non-stubbed `systemd-resolved` file instead (`/etc/resolv.conf ->
../run/systemd/resolve/resolv.conf`). This reduced first boot `init-
local/search-GCELocal` back down to about ~2s and reduced the metadata
`hostname` resolution from about 30s to ~0.03s.
This indicates to me that the boot delay is caused by `systemd-
resolved`’s stub resolver path interacting badly with the
`io.systemd.Network` resolve hook path in `systemd` v259 (?) during
`cloud-init-local`.
## Longer explanation/moar logs
`cloud-init status` showed that `cloud-init` completed successfully with
no reported errors:
```
$ sudo cloud-init status --long
status: done
extended_status: done
boot_status_code: enabled-by-generator
detail: DataSourceGCELocal
errors: []
recoverable_errors: {}
```
but that most of the time is spent in the `local` GCE `datasource` search:
```
$ sudo cloud-init analyze blame
-- Boot Record 01 --
30.98900s (init-local/search-GCELocal)
06.80700s (modules-config/config-apt_configure)
01.62200s (modules-config/config-ntp)
01.09700s (modules-final/config-keys_to_console)
00.94900s (init-network/config-growpart)
00.91800s (modules-final/config-ssh_authkey_fingerprints)
00.58400s (init-network/config-ssh)
...
...
```
`cloud-init analyze` shows:
```
$ sudo cloud-init analyze show
-- Boot Record 01 --
Starting stage: init-local
|`->no cache found (duration: 0.000s) @00.01300s +00.00000s
|`->found local data from DataSourceGCELocal (duration: 30.989s) @00.02700s
+30.98900s
Finished stage: (init-local) 31.93500 seconds
```
... but an important finding from `/var/log/cloud-init.log` (see below) is
that:
* DHCP was quick
* routes were added immediately
* the large stall comes from `hostname` resolution
* and once resolution finished the metadata requests themselves succeed
straight away
```
2026-04-15 11:16:18,096 - dhcp.py[DEBUG]: Performing a dhcp discovery on ens4
2026-04-15 11:16:18,978 - performance.py[DEBUG]: Running ['dhcpcd',
'--ipv4only', '--waitip', '--persistent', '--noarp', '--debug',
'--script=/bin/true', 'ens4'] took 0.869 seconds
2026-04-15 11:16:18,984 - dhcp.py[DEBUG]: Parsing dhcpcd lease for interface
ens4: 'reason=BOUND
...
...
domain_name_servers=169.254.169.254
...
...
dhcp_server_identifier=169.254.169.254
...
...
2026-04-15 11:16:18,988 - ephemeral.py[DEBUG]: Received dhcp lease on ens4
for 10.164.0.17/255.255.255.255
2026-04-15 11:16:18,992 - subp.py[DEBUG]: Running command ['ip', '-4',
'route', 'append', '10.164.0.1/32', 'dev', 'ens4'] ...
2026-04-15 11:16:18,994 - subp.py[DEBUG]: Running command ['ip', '-4',
'route', 'append', '0.0.0.0/0', 'via', '10.164.0.1', 'dev', 'ens4'] ...
2026-04-15 11:16:49,030 - performance.py[DEBUG]: Resolving URL took 30.033
seconds
2026-04-15 11:16:49,031 - url_helper.py[DEBUG]: [0/6] open
'http://metadata.google.internal/computeMetadata/v1/instance/id' ...
2026-04-15 11:16:49,039 - url_helper.py[DEBUG]: Read from
http://metadata.google.internal/computeMetadata/v1/instance/id (200, 18b) after
1 attempts
2026-04-15 11:16:49,041 - url_helper.py[DEBUG]: Read from
http://metadata.google.internal/computeMetadata/v1/instance/zone (200, 41b)
after 1 attempts
2026-04-15 11:16:49,044 - url_helper.py[DEBUG]: Read from
http://metadata.google.internal/computeMetadata/v1/instance/hostname (200, 55b)
after 1 attempts
...
...
2026-04-15 11:16:49,063 - performance.py[DEBUG]: Getting metadata took 30.988
seconds
```
`journalctl` shows the resolve hook timeout and the ordering of
`cloud-init-local` vs. `systemd-networkd` more clearly:
```
$ sudo journalctl -b 0 -o short-monotonic \
-u cloud-init-local.service \
-u systemd-resolved.service \
-u systemd-networkd.service
[ 3.494233] localhost systemd[1]: Starting systemd-resolved.service -
Network Name Resolution...
[ 3.644166] localhost systemd[1]: Started systemd-resolved.service -
Network Name Resolution.
[ 8.555137] localhost systemd[1]: Starting cloud-init-local.service -
Cloud-init: Local Stage (pre-network)...
[ 39.023176] resolute-quick systemd-resolved[196]: System hostname changed
to 'resolute-quick'.
[ 39.124219] resolute-quick systemd-resolved[196]: Query on hook
'/run/systemd/resolve.hook/io.systemd.Network' failed with error
'io.systemd.TimedOut', ignoring.
[ 39.124478] resolute-quick systemd-resolved[196]: Query on hook
'/run/systemd/resolve.hook/io.systemd.Network' failed with error
'io.systemd.TimedOut', ignoring.
[ 39.658145] resolute-quick systemd[1]: Finished cloud-init-local.service -
Cloud-init: Local Stage (pre-network).
[ 39.693138] resolute-quick systemd[1]: Starting systemd-networkd.service -
Network Management...
[ 39.804399] resolute-quick systemd[1]: Started systemd-networkd.service -
Network Management.
```
In other words:
* `cloud-init-local` starts and needs to resolve `metadata.google.internal`
* with `stub-resolv.conf` that lookup goes through `systemd-resolved`
* `systemd-resolved` attempts to use the io.systemd.Network resolve hook, but
that hook times out after 30 seconds
* and finally `systemd-networkd` only starts after `cloud-init-local` has
finished
The ordering and socket relationships make this seem plausible:
* `cloud-init-local.service` has `Wants=network-pre.target` &&
`Before=network-pre.target`
* `systemd-networkd.service` has `After=network-pre.target` && `...
systemd-networkd-resolve-hook.socket`
* `systemd-networkd-resolve-hook.socket` listens on
`/run/systemd/resolve.hook/io.systemd.Network`
### When using an image with the modified symlink
On a fresh boot of the modified image:
```
$ cloud-init analyze blame
-- Boot Record 01 --
03.94800s (modules-config/config-ntp)
02.19300s (init-local/search-GCELocal)
01.29000s (modules-final/config-ssh_authkey_fingerprints)
00.88100s (init-network/config-growpart)
00.67000s (modules-final/config-keys_to_console)
00.41300s (modules-config/config-apt_configure)
...
...
```
and
```
$ cat /var/log/cloud-init.log
...
...
64:2026-04-15 20:10:24,412 - performance.py[DEBUG]: Resolving URL took 0.031
seconds
...
...
```
--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2148619
Title:
`cloud-init` local datasource discovery takes ~30s to complete
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/2148619/+subscriptions
--
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs