The sequence is: exec growpart exec sgdisk --info # read-only exec sgdisk --pretend # read-only exec sgdisk --backup # read-only copy # modification of disk starts exec sgdisk --move-second-header \ --delete=PART \ --new=PART \ --typecode --partition-guid --change-name # now that sgdisk has *closed* the filehandle on the disk, systemd-udevd will # get an inotify signal and trigger udevd to run udev scripts on the disk. # this includes the *removal* of symlinks due to the --delete portion of sgdisk call # and following the removal, the -new will trigger the add run on the rules which would # recreate the symlinks.
# update kernel partition sizes; this is an ioctl so it does not trigger an udev events exec partx --update # the kernel has the new partition sizes, and udev scripts/events are all queued (and possibly in flight) exit growpart cloud-init invokes get_size() operation which: # this is where the race occurs if the symlink created by udev is *not* present os.open(/dev/disk/by-id/fancy-symlink-with-partuuid-points-to-sdb1) Dan had put a udevadm settle in this spot like so def get_size(filename) util.subp(['udevadm', 'settle']) os.open(....) So, you're suggesting that somehow _not all_ of the uevents triggered by the sgdisk command in growpart *wouldn't* have been queued before we call udevadm settle? If some other events are happening how is cloud-init to know such that it can take action to "handle this race" more robustly? Lastly if there is a *race* in the symlink creation/remove/delay in uevent propigation; why is that a userspace let alone a cloud-init issue. This isn't universally reproducible, rather it's pretty narrow circumstances between certain kernels and udevs all the while the growpart/cloud-init code remains the same. ** Changed in: cloud-init Status: New => Incomplete -- You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to systemd in Ubuntu. https://bugs.launchpad.net/bugs/1834875 Title: cloud-init growpart race with udev Status in cloud-init: Incomplete Status in systemd package in Ubuntu: New Bug description: On Azure, it happens regularly (20-30%), that cloud-init's growpart module fails to extend the partition to full size. Such as in this example: ======================================== 2019-06-28 12:24:18,666 - util.py[DEBUG]: Running command ['growpart', '--dry-run', '/dev/sda', '1'] with allowed return codes [0] (shell=False, capture=True) 2019-06-28 12:24:19,157 - util.py[DEBUG]: Running command ['growpart', '/dev/sda', '1'] with allowed return codes [0] (shell=False, capture=True) 2019-06-28 12:24:19,726 - util.py[DEBUG]: resize_devices took 1.075 seconds 2019-06-28 12:24:19,726 - handlers.py[DEBUG]: finish: init-network/config-growpart: FAIL: running config-growpart with frequency always 2019-06-28 12:24:19,727 - util.py[WARNING]: Running module growpart (<module 'cloudinit.config.cc_growpart' from '/usr/lib/python3/dist-packages/cloudinit/config/cc_growpart.py'>) failed 2019-06-28 12:24:19,727 - util.py[DEBUG]: Running module growpart (<module 'cloudinit.config.cc_growpart' from '/usr/lib/python3/dist-packages/cloudinit/config/cc_growpart.py'>) failed Traceback (most recent call last): File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 812, in _run_modules freq=freq) File "/usr/lib/python3/dist-packages/cloudinit/cloud.py", line 54, in run return self._runners.run(name, functor, args, freq, clear_on_fail) File "/usr/lib/python3/dist-packages/cloudinit/helpers.py", line 187, in run results = functor(*args) File "/usr/lib/python3/dist-packages/cloudinit/config/cc_growpart.py", line 351, in handle func=resize_devices, args=(resizer, devices)) File "/usr/lib/python3/dist-packages/cloudinit/util.py", line 2521, in log_time ret = func(*args, **kwargs) File "/usr/lib/python3/dist-packages/cloudinit/config/cc_growpart.py", line 298, in resize_devices (old, new) = resizer.resize(disk, ptnum, blockdev) File "/usr/lib/python3/dist-packages/cloudinit/config/cc_growpart.py", line 159, in resize return (before, get_size(partdev)) File "/usr/lib/python3/dist-packages/cloudinit/config/cc_growpart.py", line 198, in get_size fd = os.open(filename, os.O_RDONLY) FileNotFoundError: [Errno 2] No such file or directory: '/dev/disk/by-partuuid/a5f2b49f-abd6-427f-bbc4-ba5559235cf3' ======================================== @rcj suggested this is a race with udev. This seems to only happen on Cosmic and later. To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-init/+bug/1834875/+subscriptions -- Mailing list: https://launchpad.net/~touch-packages Post to : touch-packages@lists.launchpad.net Unsubscribe : https://launchpad.net/~touch-packages More help : https://help.launchpad.net/ListHelp