On Tue, Jan 15, 2019 at 04:29:36PM +0000, Steve McIntyre wrote:
>I've let this slip off my radar since, and it's not gone away.
>
>I'm seeing this problem really obviously with live installations now,
>as I've just been testing them with Secure Boot.
>
>Hoping to play with this more in the next few days...

OK, I *think* I have worked out the fundamental problem here. I
uploaded an initial fix for #852323 a few weeks back, but it was only
a quick hack (call "udevadm trigger" instead of "udevadm settle")
until I found the time to dig into the code properly. Now, with a few
hours undisturbed on a long flight, I think I've sussed the root cause
with judicious use of "udevadm monitor", ls and blkid. \o/

It's all down to ordering of events. In /lib/partman/commit.d in d-i,
we have the following ordering at the moment for some of the scripts:

  30parted
  32update-dev    <<----- triggers the udev update
  45format_swap
  50format_*fs    <<----- responsible for running mkfs

It's quite simple. 32update-dev is asking for udev updates, and that
will cause the various /dev/disk/by-*/ symlinks to be updated or
refreshed. But that's the wrong point. In particular for
/dev/disk/by-uuid, the UUID values themselves come from the filesystem
headers. We *then* make the filesystems for each block device, and
anything that changes at this point will never be represented in the
by-uuid symlinks.

We need to run update-dev *after* the filesystem creation scripts, and
this fixes things. I've just copied it to 99update-dev locally while
testing and that made all the difference. It could probably also just
be moved instead of copied - at this point I'm not sure if anything in
the other scripts also depends on the udev updates for its
functionality. Fundamentally, it's a fairly harmless thing to run
repeatedly, so I'm tempted to just run it twice. Thoughts?

So, the question is - how did this *ever* work on older versions of
d-i (Jessie and earlier)? I honestly can't see how it could, so I'm
going to hand-wave and say "timing". If the udev update code took
longer to run, *maybe* the first few filesystem creation steps would
already have run by the time the by-uuid symlinks were made.
After all, it's only looking at the filesystem headers, so it wouldn't
matter if a long, slow mkfs for a big device was still going. I know
that the systemd folks have put a lot of effort into making udev more
streamlined over the last few years and *maybe* this has just exposed
an underlying bug we've had for ages.

[ Hitting send on this mail now on the plane while I have all this
  fresh in my head, even if it's not going to hit network for a few
  hours yet! ]

-- 
Steve McIntyre, Cambridge, UK.                       st...@einval.com
"Arguing that you don't care about the right to privacy because you
 have nothing to hide is no different than saying you don't care about
 free speech because you have nothing to say."
   -- Edward Snowden
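
Appendix (illustration only): the ordering argument in the mail can be
sketched as a small shell script. This is not the actual d-i code. It
works in a scratch directory, assumes that commit.d scripts run in
lexical order of their names (as the numbering in the mail suggests),
and uses 50format_examplefs as a hypothetical stand-in for the
50format_*fs scripts. The helper names commit_order and
update_dev_runs_last are made up for this sketch.

```shell
#!/bin/sh
# Sketch only: models the commit.d ordering with empty files in a
# scratch directory. Nothing here touches /lib/partman/commit.d or udev.

# Print the script names in a directory in lexical (C locale) order,
# which is the execution order this sketch assumes.
commit_order() {
    ls -1 "$1" | LC_ALL=C sort
}

# Succeed only if some update-dev script is ordered after the last
# format_* script, i.e. udev gets triggered once mkfs has run.
update_dev_runs_last() {
    last_format=$(commit_order "$1" | grep -n 'format_' | tail -n 1 | cut -d: -f1)
    last_update=$(commit_order "$1" | grep -n 'update-dev$' | tail -n 1 | cut -d: -f1)
    [ -n "$last_format" ] && [ -n "$last_update" ] &&
        [ "$last_update" -gt "$last_format" ]
}

scratch=$(mktemp -d)

# Ordering as listed in the mail (50format_examplefs is hypothetical).
for s in 30parted 32update-dev 45format_swap 50format_examplefs; do
    : > "$scratch/$s"
done

if update_dev_runs_last "$scratch"; then
    echo "current order: update-dev runs after all format scripts"
else
    echo "current order: update-dev runs before mkfs, by-uuid goes stale"
fi

# The local workaround described in the mail: copy to a later slot.
cp "$scratch/32update-dev" "$scratch/99update-dev"

if update_dev_runs_last "$scratch"; then
    echo "with 99update-dev: update-dev runs after all format scripts"
else
    echo "with 99update-dev: update-dev still runs too early"
fi

rm -rf "$scratch"
```

Copying rather than moving keeps the early 32update-dev run in place,
which matches the "run it twice" option floated above in case other
scripts depend on the earlier udev update.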