Hello! I'd like to suggest a change to the Debian Policy around Python packages that will help the world of Python packaging continue to move forward.

First, a little bit of background. At the Python level there are three metadata formats for Python packaging:

* The original, setuptools style .egg-info directories.
* The distutils style .egg-info *file* added to distutils at some point.
* The new and improved, wheel based .dist-info directories.

The presence of any of these will signal to Python tools that a particular distribution has been installed. However, there are two fairly major differences between the distutils style and the other two:

1. The distutils style has no provision to record which files on the system belong to the installed distribution, making it appear to Python tooling that there *are* no files other than the metadata file itself.
2. The distutils style has no provision to include additional files in the metadata, making it impossible to extend the Python level metadata with additional files.

I have a series of improvements that I'd like to make to the packaging toolchain. They build on one another, but they are not going to function correctly with the distutils style metadata, so I'm hoping that I can convince y'all to make it policy to default to generating one of the other two kinds (with varying methods, more on that later).

Concretely, the thing that this is blocking right now is that with the newly released pip 8.0 I tried to make it so that pip will refuse to uninstall a project that is installed with distutils style metadata. This is because we do not have any way to associate the actual .py (and other) files on disk with the installed metadata, so all we have ever done is simply remove the metadata file, making it appear as if the project is uninstalled while leaving behind all of the actual files. However, I'm going to be reverting this in a pip 8.0.1 release because it caused a decent amount of breakage amongst pip's users, almost all of them people who are attempting to upgrade OS provided packages using pip.
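
To make the difference between the three formats concrete, here is a rough sketch (not pip's actual code) of how a tool might tell them apart and look for a record of installed files. The function names are made up for illustration. It assumes the file list lives in RECORD for .dist-info directories and in installed-files.txt for setuptools style .egg-info directories; the distutils style file has nowhere to put such a list, so the sketch returns None for it.

```python
import csv
import os


def classify_metadata(path):
    """Guess which of the three metadata styles `path` is (hypothetical helper)."""
    name = os.path.basename(path.rstrip(os.sep))
    if name.endswith(".dist-info") and os.path.isdir(path):
        return "dist-info directory"
    if name.endswith(".egg-info"):
        if os.path.isdir(path):
            return "setuptools egg-info directory"
        if os.path.isfile(path):
            return "distutils egg-info file"
    return None


def recorded_files(path):
    """Return the recorded file paths, or None when no record is available."""
    kind = classify_metadata(path)
    if kind == "dist-info directory":
        record = os.path.join(path, "RECORD")
        if not os.path.isfile(record):
            return None
        # RECORD is CSV: path, hash, size. Only the path is needed here.
        with open(record, newline="") as f:
            return [row[0] for row in csv.reader(f) if row]
    if kind == "setuptools egg-info directory":
        record = os.path.join(path, "installed-files.txt")
        if not os.path.isfile(record):
            return None
        # Assumed layout: one path per line.
        with open(record) as f:
            return [line.strip() for line in f if line.strip()]
    # A distutils style .egg-info file is a single file with no room
    # for a file list, so there is nothing to uninstall from.
    return None
```

An uninstaller built on something like this can only do a real uninstall when recorded_files() returns a list; for the distutils style file, all it can do is delete the metadata file itself.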

Now, I know that upgrading OS provided packages using pip is less than optimal and that many of you would greatly prefer that people did not do it (and I'm generally in agreement). However, if we don't enable people to do it, they'll just continue to use an old version of pip and file bugs. It's a non starter for pip to make it impossible to do.

In addition to the uninstall bit, it also means that things like pip show -f return junk information for packages installed in this way.

Beyond just (eventually) enabling pip to disable uninstallation of distutils based installs, this will start to allow some other future changes that I think will be more interesting to Debian. The uninstallation of distutils based installs comes hand in hand with pip stomping all over already existing files willy nilly. The way upgrading a project like that works is that pip removes the metadata file that says X is installed, then simply overwrites any files that happen to be in its way when it installs the newer version. If we can remove the need for pip to gleefully overwrite files to support these types of installed packages, then we can make it so pip will hard fail if it attempts to overwrite an already existing file on disk.

An additional benefit here is that by switching to the directory based options, we can add additional metadata files to the installed projects, much like the INSTALLER file from PEP 376 (IIRC). This file will likely be the path to having pip refuse to touch OS owned files altogether without some sort of --force flag to override the safety switch.

As far as compatibility goes, pip has always forced everything to be installed using setuptools and, as far as I am aware, there's no real fallout from doing so. I think in 2016 it's pretty reasonable to assume that a Python project is capable of being installed using setuptools instead of distutils.
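
As a rough illustration of the INSTALLER idea mentioned above, here is a sketch of the kind of check a tool could make before touching an installed project. This is not existing pip behavior. The function names, the policy for a missing file, and the "dpkg" value in the usage note are all assumptions made up for this example.

```python
import os


def read_installer(metadata_dir):
    """Return the contents of the INSTALLER file, or None if it is absent or empty."""
    path = os.path.join(metadata_dir, "INSTALLER")
    try:
        with open(path) as f:
            return f.read().strip() or None
    except OSError:
        return None


def may_modify(metadata_dir, tool="pip", force=False):
    """Decide whether `tool` may touch this project (hypothetical policy)."""
    installer = read_installer(metadata_dir)
    if installer is None or installer == tool:
        # Assumption: with no record of another installer, behave as today.
        return True
    # Someone else owns these files; only proceed if explicitly forced.
    return force
```

With a directory whose INSTALLER file contains, say, "dpkg", may_modify() would return False unless force=True is passed, which is roughly the safety switch described above. A distutils style metadata file has no place to put such a marker at all.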

So, without getting into the actual *method* of doing this (of which there are several different options with different trade offs), does this sound at all like something that Debian would be interested in?

-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA