Naming of python binary packages
Bringing bug 1023512 [0] to the Debian Python list:

[0] https://bugs.debian.org/1023512

> > According to the Debian Python Policy Section 4.3, binary package
> > names should be named after the *import* name of the module, not the
> > PyPI distribution name.
> Unfortunately, I do not agree at all with this policy. The import name has
> no importance, and IMO, we should change that policy so that the package
> name matches the egg-name rather than the import name.

I wouldn't quite say it has no importance. It describes which part of
the filesystem the package owns.

I don't know the history of this policy offhand, but I presume it's also
because not all Python modules come from PyPI, and we needed a standard
way to address them.

Also, we sometimes break PyPI distributions up into separate binary
packages. They are closer to a source package than a Debian binary
package.

FWIW: I am not convinced that Python made the right decision in allowing
distribution names to diverge from import names; it tends to just create
confusion. But that's neither here nor there.

> In many places, that would make our life of package maintainer better. A
> good example is all the oslo libraries in OpenStack, that all have a dot in
> their egg-name, but an underscore in the import path (so that it works
> better under python3). In this specific case, using the dash instead of the
> dot would be really stupid and break many things, like automation for
> dependencies.

Presumably that can be solved with a few automated adjustments (like the
. -> _ transformation you describe).

Having a straightforward distribution name -> package name mapping would
make automating dependencies simpler, I agree. But we have tooling that
handles that already: dh-python and its pydist data.

> In fact, this extend to all of the Debian Python module archive.
>
> If you want to discuss this further, please open a thread in the list.
I don't think the solution here is for your packages to use
distribution-derived names while everyone else's use the policy-defined
names. Can we rather come to a consensus on what we should be using?

My vote would be strongly towards maintaining the status quo of the
policy-defined names.

I don't see any strong argument for changing this.

Stefano

--
Stefano Rivera
  http://tumbleweed.org.za/
  +1 415 683 3272
Re: Naming of python binary packages
Hi debian-python (2023.08.11_14:49:00_+)
> I don't think the solution here is for your packages to use
> distribution-derived names while everyone else's use the policy-defined
> names. Can we rather come to a consensus on what we should be using?

I should say, of course, that we have a history of groups of packages
that diverge from this policy, e.g. the Django app packages, and some
sphinx things (I think).

Sometimes it makes sense to not name things python3-foo, but rather
something more descriptive to the sub-community that the package is a
part of.

But this example was a run-of-the-mill Python module, as far as I can
tell.

Stefano

--
Stefano Rivera
  http://tumbleweed.org.za/
  +1 415 683 3272
Re: Naming of python binary packages
On Friday, August 11, 2023 10:49:00 AM EDT Stefano Rivera wrote:
> My vote would be strongly towards maintaining the status quo of the
> policy-defined names.
>
> I don't see any strong argument for changing this.

Fully agreed. In addition to the reasons you listed, renaming a lot of
packages would require a trip through New. I think we have enough backlog
there without renaming a bunch of packages for a not very good reason.

Scott K
Re: Naming of python binary packages
On Fri, 11 Aug 2023 at 14:49:00 +, Stefano Rivera wrote:
> > > According to the Debian Python Policy Section 4.3, binary package
> > > names should be named after the *import* name of the module, not the
> > > PyPI distribution name.
>
> > Unfortunately, I do not agree at all with this policy. The import name has
> > no importance, and IMO, we should change that policy so that the package
> > name matches the egg-name rather than the import name.
>
> I wouldn't quite say it has no importance. It describes which part of
> the filesystem the package owns.

More important than that, it describes the interface that the package
provides to its reverse-dependencies: changing the name changes the
interface, and vice versa.

Having the package that lets you "import dbus" systematically be
installable as "python3-dbus" is the same design principle as having the
C library with SONAME libgtk-4.so.1 installable as libgtk-4-1 (and not
gtk4-libs as it would be in some distributions), or having the Perl
library that lets you "use File::chdir" installable as
libfile-chdir-perl.

This has been the policy for a while, and I think it's a good policy. In
particular, it forces the necessary conflict resolution to happen at the
distro level if two unrelated upstream projects (perhaps pyfoo-1.egg-info
and Foo-2.egg-info) are both trying to be our implementation of
"import foo".

(disclosure: I wrote some of the text in Python Policy describing the
naming convention under discussion here, but I was clarifying an existing
convention and filling in the details of what to do in corner cases,
rather than originating new policy. See also the thread starting at
https://lists.debian.org/debian-python/2019/11/msg00125.html.)

    smcv
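
[Illustration, not part of the original message.] The convention
discussed in this thread can be sketched roughly as below: derive the
binary package name from the import name, then look for upstream
projects that would claim the same name. The function names and the
sample metadata are hypothetical. The normalisation (lowercase,
underscores replaced by hyphens, python3- prefix) is an assumption about
how the policy-defined names are formed; packages that ship several
modules, or that follow a sub-community convention, will not fit it.

```python
from collections import defaultdict


def binary_package_name(import_name: str) -> str:
    """Map a module's import name to a python3-* binary package name.

    Assumed normalisation: lowercase, underscores become hyphens.
    """
    # Reject anything that could not appear in an "import" statement.
    if not all(part.isidentifier() for part in import_name.split(".")):
        raise ValueError(f"not a valid import name: {import_name!r}")
    return "python3-" + import_name.lower().replace("_", "-")


def import_name_conflicts(dist_to_import: dict[str, str]) -> dict[str, list[str]]:
    """Group upstream distributions that would claim the same package name."""
    claimed = defaultdict(list)
    for dist, import_name in dist_to_import.items():
        claimed[binary_package_name(import_name)].append(dist)
    # Only names claimed by more than one distribution are conflicts.
    return {pkg: sorted(dists) for pkg, dists in claimed.items() if len(dists) > 1}


# Hypothetical metadata: distribution (egg) name -> top-level import name.
upstreams = {
    "dbus-python": "dbus",
    "oslo.config": "oslo_config",
    "pyfoo": "foo",
    "Foo": "foo",
}

for dist, module in upstreams.items():
    print(f"{dist}: import {module} -> {binary_package_name(module)}")

print(import_name_conflicts(upstreams))
```

Keying on the import name is what makes the pyfoo/Foo clash visible in
one place. The names printed are only what this simplified rule yields,
not necessarily the names used in the archive.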
Re: dask.distributed RC bug #1042135
> Thanks so much! I see you've already started on dask :)
>
> I took at quick look at arrow - yikes! There is potentially work
> afoot on this though:
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=970021
>

Dask & dask.distributed 2023.8.0 was easier to update than some of the
other versions they had between 2022.12 and now.

Dask would still benefit from pyarrow, but I added enough
pytest.importorskip calls to avoid triggering the tests that depend on
pyarrow.

It also looks like it builds for me and the Debian builder, so I closed
1042135.

Hopefully that helps. (And it looks like it's got some code for pandas
2.0, so hopefully that'll help Rebecca Palmer.)

Diane