Adrian Bunk wrote:
> On Fri, Dec 18, 2020 at 04:25:19PM -0800, Josh Triplett wrote:
> >...
> > I'm not suggesting there should be 50 versions of a given
> > library in the archive, but allowing 2-4 versions would greatly simplify
> > packaging, and would allow such unification efforts to take place
> > incrementally, via transitions *in the archive* and *in collaboration
> > with upstream*, rather than *all at once before a new package can be
> > uploaded*.
> >
> > (I also *completely* understand pushing back on having 2-4 versions of
> > something like OpenSSL; that'd be a huge maintenance and security
> > burden. That doesn't mean we couldn't have 2-4 semver-major versions of
> > a library to emit ANSI color codes, and handle reducing that number via
> > incremental porting in the archive rather than via prohibition in
> > advance.)
>
> It is important to always remember that the main product we are
> delivering to our users are our stable releases.

(This is somewhat off-topic, but: I think that Debian stable is *one* of the main products of Debian, but not by any means the only one. Debian testing and unstable/experimental are also incredibly valuable. We need solutions that work for all of those. Those solutions *do* need to work for stable as well, though, and I'll address the rest of your mail in that regard.)

> We do have 4 different versions of autoconf in the archive.
> This works because autoconf does not have CVEs.

There's a great deal of software out there with similar properties, most notably that it doesn't sit at a security boundary. That doesn't just include build-time code. Also, some types of security vulnerabilities are rare-to-nonexistent in other ecosystems. A library, written in a safe language, whose job is to generate ANSI terminal color codes, is not likely to have security vulnerabilities. It's not critical to force all packages to move to the latest version of that library immediately, before they can upload at all.

Bundling *can* make it much more difficult to handle security support, for a variety of reasons (updating distinct embedded copies, dealing with more version skew, etc). But in the absence of bundling, if the *only* issue is that there may be 2-4 semver-major versions in the archive, I'd expect the process to be roughly "upload new versions of those packages, trigger rebuilds of dependencies". On balance, I wouldn't expect substantial scaling issues with the former. The *latter* would be where we may need some tooling improvements, for ecosystems that do the equivalent of static linking or library bundling at build time and ship a compiled artifact in their binary package.

> If a library is so complex that your "unification efforts in
> collaboration with upstream" would apply, chances are there
> will be CVEs if anyone does a security audit of the code.

I'm not talking about complexity of an individual library; that's not the primary issue here. I'm talking about quantity.
If your package has 300 dependencies, most of which are relatively small, focused, self-contained libraries, the "collaboration with upstream" part is about collaboration with the upstream of your package, not the upstreams of the dependencies.

If you want to package abc version 1.2.3, and among many other things, abc depends on xyz version 2.1.4, and xyz has a new version 3.0.1 now, it makes sense to work with the upstream of abc, sending them a patch to migrate to the new version, and waiting for abc 1.2.4 to come out with that update. It *doesn't* make sense to maintain a downstream Debian patch to make abc work with the newer xyz. abc can just build-depend on xyz-2, and a later version of abc can build-depend on xyz-3. That isn't a reflection of complexity in xyz, or in abc.

Also, sometimes those dependencies are indirect through other dependencies, and to transition forward, you may want to move multiple dependencies forward in concert, for compatibility reasons or just to minimize duplication within one application.

> > I think much of our resistance to allowing 2-4 distinct semver-major
> > versions of a given library comes down to ELF shared libraries making it
> > painful to have two versions of a library with distinct SONAMEs loaded
> > at once, and while that can be worked around with symbol versioning,
> > we've collectively experienced enough pain in such cases that we're
> > hesitant to encourage it. Our policies have done a fair bit to mitigate
> > that pain. But much of that pain is specific to ELF shared libraries and
> > similar.
>
> No, the only real pain is providing security support.

Debian has gone through many library transitions that have incurred substantial pain, including those where a lack of symbol versioning resulted in serious issues if two versions of the same library ended up in the same address space.
That's in addition to the normal pain of library transitions, and in addition to all the *infrastructure* that Debian has built up around library versioning (such as shlibs files and symbols files). That has led to guidance such as not versioning most -dev packages, and instead forcing all new package uploads to transition to the new version of the library.

By contrast with that, security support may not be nearly as much of an issue. The *majority* of libraries in Debian don't require any security updates at all.

> >...
> > The
> > dependency and library mechanisms of some other ecosystems, are designed
> > to support having multiple distinct versions of libraries in the same
> > address space, with fully automatic equivalents of symbol versioning.
> >...
>
> How can Debian security support packages from such ecosystems?

By following the security advisories from those ecosystems, uploading new versions, and rebuilding the packages that depend on them. A security upload would be an upload of a semver-compatible version.

Not every package is OpenSSL or libpng. I don't expect Debian to have 3-4 versions of OpenSSL (though I *do* expect it to have OpenSSL 3 and OpenSSL 1.1 in parallel for a while). I think it's reasonable to *allow* 2-4 versions of a small library for emitting ANSI terminal color codes. And I think we should have actual written policy supporting that.

> If there is a CVE in a library that is used by 20 different packages
> in 20 different versions, how does the ecosystem help Debian with
> applying this CVE fix to all 20 versions with reasonable effort?

"20 different versions" doesn't tend to happen; again, note that I'm not talking about bundling here, and that'll need solving another way. In ecosystems that use semantic versioning, the primary issue is providing the distinct *major* versions of packages required to satisfy dependencies. I'm not talking about packaging xyz 1.2.3, 1.2.4, 1.3.1, and 2.0.1.
When xyz 1.3.1 is uploaded, it can safely replace 1.2.4, and packages using xyz 1.2.4 can get rebuilt via binNMU if needed. I'm talking about packaging xyz 1.3.1 and 2.0.1, as separate xyz-1 and xyz-2 packages, and allowing the use of both in build dependencies. Then, a package using xyz-1 can work with upstream to migrate to xyz-2, and when we have no more packages in the archive using xyz-1 we can drop it.

That's different from requiring *exactly one* version of xyz, forcing all packages to transition immediately, and preventing people from uploading packages because they don't fork upstream and port to different versions of dependencies.

- Josh Triplett
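
To make the xyz example concrete, here is a minimal sketch of the grouping rule: keep only the newest release within each semver-major series, and give each series its own package name. The helper names (parse_version, packages_by_major) are hypothetical and not part of any Debian tooling. The sketch also assumes plain MAJOR.MINOR.PATCH version strings and ignores the special treatment some ecosystems give to 0.x releases.

```python
def parse_version(version):
    """Split a 'MAJOR.MINOR.PATCH' string into a tuple of ints."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


def packages_by_major(name, versions):
    """Return {package_name: newest_version}, one entry per semver major.

    Hypothetical helper: within a major series the newest release
    replaces the older ones, so only it is kept.
    """
    newest = {}
    for version in versions:
        major = parse_version(version)[0]
        current = newest.get(major)
        if current is None or parse_version(version) > parse_version(current):
            newest[major] = version
    return {f"{name}-{major}": newest[major] for major in sorted(newest)}


archive = packages_by_major("xyz", ["1.2.3", "1.2.4", "1.3.1", "2.0.1"])
print(archive)  # {'xyz-1': '1.3.1', 'xyz-2': '2.0.1'}
```

Four upstream releases collapse into two archive packages, and xyz-1 can be dropped once nothing build-depends on it any more.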