On Sun, 2020-02-16 at 11:59:56 +0000, Simon McVittie wrote:
> I would be grateful if people who advocate transitioning individual
> packages, and people who consider the approach taken by usrmerge and
> debootstrap to be sufficient, could refer to their preferred route in a
> way that makes it clear which one they are advocating. Saying we should
> do a transition "properly" is tautologous - of course we should! - but
> when people disagree about what the proper way to do it is, it becomes an
> ambiguous recommendation that doesn't guide anyone to do the right thing.

I've been consistently calling the concept of merging /* into /usr/*
merged-/usr, and the specific approach of using directory symlinks
merged-/usr-via-symlinks. I think that name is confusing, though, as the
other approach does use symlink farms, so either
merged-/usr-via-aliased-dirs or merged-/usr-via-symlink-dirs would be
clearer (I will be renaming the buildinfo tainted tag). The approach I've
been proposing I'd call merged-/usr-via-moves-and-symlink-farms or
something along those lines.

I've dumped this all into <https://wiki.debian.org/Teams/Dpkg/MergedUsr>
and updated the Dpkg FAQ:

  <https://wiki.debian.org/Teams/Dpkg/FAQ#Q:_Does_dpkg_support_merged-.2Fusr-via-aliased-dirs.3F>

> The approach that transitions individual packages solves some bugs
> (notably, if you ask dpkg which package owns /usr/lib/MA/libfoo.so.0,
> you get the right answer, which has a lot of desirable consequences). It
> also appears to *cause* some bugs. The ones I know about are:
>
> * packages that hard-code paths into /lib stop working on systems that
>   have *not* undergone the usrmerge-style /usr merge, because
>   /lib/MA/libfoo.so.0 no longer exists (for example see #950715
>   involving libgcc.so.1 and the gold linker);

For any pathname that has been hardcoded, a symlink can be used for
backwards compatibility; this is no different from /bin or /sbin. This
looks just like a normal bug from a botched transition, nothing special.

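To illustrate the compat symlink point, here is a minimal sketch. It
assumes a scratch directory standing in for the root filesystem and reuses
the placeholder /lib/MA/libfoo.so.0 from the quoted example. The helper
move_with_compat_symlink() is made up for illustration and is not part of
dpkg or debhelper; in a real package the symlink would be shipped inside
the .deb so that dpkg tracks it, rather than being created by a script.

```python
import os
import tempfile


def move_with_compat_symlink(root, old_rel, new_rel):
    """Move root/old_rel to root/new_rel and leave a relative symlink behind.

    Hypothetical helper, for illustration only.
    """
    old_path = os.path.join(root, old_rel)
    new_path = os.path.join(root, new_rel)
    os.makedirs(os.path.dirname(new_path), exist_ok=True)
    os.rename(old_path, new_path)
    # Use a relative target so the link stays valid wherever the tree is mounted.
    target = os.path.relpath(new_path, start=os.path.dirname(old_path))
    os.symlink(target, old_path)
    return target


def demo():
    """Run the move in a throwaway directory and report what the old path is now."""
    with tempfile.TemporaryDirectory() as root:
        old_rel = "lib/MA/libfoo.so.0"      # placeholder pathname, as in the quote
        new_rel = "usr/lib/MA/libfoo.so.0"
        os.makedirs(os.path.join(root, "lib/MA"))
        with open(os.path.join(root, old_rel), "w") as f:
            f.write("placeholder")
        target = move_with_compat_symlink(root, old_rel, new_rel)
        old_path = os.path.join(root, old_rel)
        return {
            "target": target,
            "old_is_symlink": os.path.islink(old_path),
            "same_file": os.path.samefile(old_path, os.path.join(root, new_rel)),
        }


if __name__ == "__main__":
    print(demo())
```

Anything that hardcoded the old pathname keeps working through the link,
while the file itself lives only under /usr.
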
> * there is a class of bugs on systems that have *not* undergone the
>   usrmerge-style /usr merge, involving old libraries lingering in /lib/MA
>   (see #949395 for a summary of the instances that I know about), which
>   are very hard to debug because they are unreproducible, to do with the
>   state of an individual system, and are related to upgrades that happened
>   years in the past and whose logs expired long ago;

merged-/usr (in general) seems irrelevant here. If this is really a bug in
dpkg, then it needs fixing regardless of the approach taken, as dpkg does
not distinguish between / and /usr in any way; it is pathname neutral.

I still have to comment on that bug, but I skimmed over the code involved
and didn't see anything obviously wrong that would make it forget
pathnames. My first suspect would be that we are not doing fsync() on the
parent directories. The second would perhaps be the multiarch refcounting
logic, but ISTR this having happened with old glibc, perhaps before
multiarch was deployed; I would need to check.

> * when paths migrate between package names and between paths at the same
>   time, there can be undetected file conflicts on systems that *have*
>   undergone the usrmerge-style /usr merge (for example see #950624,
>   again in libgcc.so.1)

This is one of the issues inherent in merged-/usr-via-aliased-dirs.

> In Debian we have (as usual) made life more difficult for ourselves by the
> /usr merge being optional,

I fully agree, but not on the root cause. Allowing the
merged-/usr-via-aliased-dirs hack at all has meant that we cannot do fully
automatic migrations via debhelper with no maintainer scripts involved
whatsoever. :(

> which means that in any transition or upgrade
> scenario, maintainers have to consider two cases: one where the system
> has undergone the /usr merge, and one where it has not.
> For example,
> #950624 only happens on systems that have undergone the /usr merge,
> while #950715 and #949395 only happen on systems that have not. Testing
> on any single system cannot detect all of these: to detect all the bugs,
> we have to try both configurations.

The difference between these issues IMO is one between bugs (a botched
migration with no compat symlinks, and a potential dpkg bug) and design
flaws inherent in the merged-/usr-via-aliased-dirs approach.

I'm not sure how anyone could claim that the merged-/usr-via-aliased-dirs
approach is not a hack, even if it might have some possible benefits, when
it circumvents the package manager completely (we might as well ditch it
and go use bare tarballs or something :/…). It also implies action at a
distance, where the filesystem layout is injected by something that is not
packaged at all and should not be packaged (or we'd be forced to hardcode
bootstrapping ordering into each bootstrapper forever), so we also lose
locality from the .deb.

That approach would be acceptable in a distant future once, and only if,
we ship no objects under the top-level root directories, which for now
would imply breaking ABIs/APIs.

Making the proposed merged-/usr-via-aliased-dirs mandatory for a release
cycle, and moving everything en masse, still does not fix the aliasing
problems for all the paths that are hardcoded to non-/usr locations. This
would mean painting ourselves into an even tinier and messier corner than
where we are now…

IMO the technically better and more sound solution would be to devise a
way to revert the mess caused by the merged-/usr-via-aliased-dirs
approach, and then convert everything automatically via debhelper.

Thanks,
Guillem