On Tue, Feb 21, 2023 at 02:23:34PM +0200, Adrian Bunk wrote:
> Looking at #1028371, should generated dependencies on python3-protobuf be
> python3-protobuf (>= 3.21), python3-protobuf (<< 3.22)
> to ensure that the binary package is used with the same version
> as the protobuf-compiler used during the build?
I'm not the maintainer, but a drive-by contributor. I looked a bit into
this, given its RC severity.
With my still somewhat limited understanding, a strict version alignment
between protobuf-compiler and python3-protobuf would probably resolve
this particular symptom, but the issues here seem to run deeper.
Specifically:
* The protobuf project provides three different implementations of its
Python bindings: pure Python, C++, and libupb-based[1]. The
implementation is selectable using the
PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION environment variable.
* Debian's python3-protobuf, from src:protobuf, ships the pure Python
version, as well as the C++ bindings. The default implementation in
Debian is "cpp".
* The upb implementation is not included in src:protobuf, but in the
upb upstream source[2], i.e. what is src:upb in Debian, even though
the snapshot we have in Debian does not contain sources to Python
bindings.
* Upstream has switched the default implementation to "upb", and
deprecated the "cpp" implementation. There is, in fact, no way for
one to fetch the "cpp" version from PyPI. This is documented
extensively in their May 2022 release notes[3]. However, Debian
still ships, and defaults to, cpp, a major departure from upstream.
* Relatedly, when they made that switch, they also made changes to
their versioning scheme, disconnecting the Python library's version
from the source version. As a result, the Python API (both upb, as
well as pure Python), is now versioned at "4.21", rather than
"3.21". The Debian binary package python3-protobuf is versioned as
"3.21.12-1", which is not a version that exists, or will ever exist,
upstream. That binary package, in fact, ships egg metadata named
protobuf-4.21.12.egg-info. (This is all also well documented in their
release notes[3].)
* In the same release notes document[3], they also state:
"Python upb requires generated code that has been generated from
protoc 3.19.0 or newer.".
Indeed, if one fetches protobuf 4.21 from PyPI, and runs:
python3 -c 'import bernhard'
or
PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=upb python3 -c 'import bernhard'
...a traceback is still emitted, but with a much more informative message:
> TypeError: Descriptors cannot not be created directly.
>
> If this call came from a _pb2.py file, your generated code is out of
> date and must be regenerated with protoc >= 3.19.0.
>
> If you cannot immediately regenerate your protos, some other possible
> workarounds are:
> 1. Downgrade the protobuf package to 3.20.x or lower.
> 2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will
> use pure-Python parsing and will be much slower).
>
> More information:
> https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates
* The release notes specifically mention "upb" requiring protoc
(protobuf-compiler) >= 3.19, but say nothing about "cpp". However, as
established above, "cpp" is deprecated and not used by anyone (but
Debian). So either they meant "the non-pure-Python implementations"
there, or they did not pay as much attention to forward and backward
compatibility, or to informative error messages, for their deprecated
backend. It's likely, but not entirely clear, that the protoc
requirement is >= 3.19 for "cpp" as well.
* Finally, note that the pure-Python implementation in python3-protobuf
3.21.12-1+b2 still works with python3-bernhard, Built-Using:
protobuf-compiler (= 3.12.4-1+b3):
PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python python3 -c 'import bernhard'
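To compare the three backends on a given system, something like the following rough sketch may help. It starts a fresh interpreter per backend (the choice is made at import time, so the variable has to be set beforehand) and prints the distribution version next to the backend that actually got loaded. Note that api_implementation is an internal protobuf module, so I'm assuming it stays where it is; probe_backend and probe_all are just names I made up. What happens when a requested backend is unavailable seems to vary between versions, so the sketch only reports whatever comes back.

```python
import os
import subprocess
import sys

# Snippet run in a child interpreter. The backend is picked when protobuf is
# first imported, so the environment variable must already be set by then.
PROBE = (
    "from importlib.metadata import version\n"
    "from google.protobuf.internal import api_implementation\n"
    "print(version('protobuf'), api_implementation.Type())\n"
)


def probe_backend(requested):
    """Report what a fresh interpreter loads when `requested` is asked for."""
    env = dict(os.environ, PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=requested)
    result = subprocess.run(
        [sys.executable, "-c", PROBE],
        env=env,
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # protobuf is not installed, or the import itself failed
        lines = result.stderr.strip().splitlines()
        return "error: " + (lines[-1] if lines else "unknown")
    return result.stdout.strip()


def probe_all():
    """Probe the three implementation names mentioned above."""
    return {name: probe_backend(name) for name in ("python", "cpp", "upb")}


if __name__ == "__main__":
    for name, outcome in probe_all().items():
        print(f"{name:>6}: {outcome}")
```

The first field printed is the version from the installed metadata (the egg-info mentioned above), the second is the backend in use, which need not match the one requested.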
All in all: it's almost certainly necessary to make the dependency
tighter, to something like >= 3.19, if not pinned to 3.21.x.
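For illustration, the tight variant from the quoted proposal boils down to deriving a one-minor-wide range from the protoc version seen at build time. The sketch below is only that arithmetic; strict_dependency is a hypothetical helper, not something dh_python3 or any other existing tool provides, and I'm assuming the minor version is the right granularity.

```python
import re


def upstream_minor(version):
    """Extract (major, minor) from e.g. '3.21.12' or '3.21.12-1+b2'."""
    # An optional Debian epoch ("1:") is skipped; the revision is ignored.
    match = re.match(r"(?:\d+:)?(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"cannot parse version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def strict_dependency(protoc_version, package="python3-protobuf"):
    """Build a dependency string pinned to the minor series of protoc."""
    major, minor = upstream_minor(protoc_version)
    return (
        f"{package} (>= {major}.{minor}), "
        f"{package} (<< {major}.{minor + 1})"
    )


if __name__ == "__main__":
    print(strict_dependency("3.21.12-1+b2"))
```

The looser alternative would be a fixed lower bound of 3.19 instead of a computed range.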
I still feel uneasy about Debian shipping a version of python3-protobuf
that includes, and defaults to, an implementation that is deprecated
upstream (and on top of it, is misversioned). I'm not sure what to make
of this so late in the release cycle, though.
For trixie, the path forward is probably something along the lines of
updating src:upb to a newer upstream, building the upb-based extension
as python3-protobuf-upb, and then changing src:protobuf to stop building
the cpp extension, making python3-protobuf Arch: all, and having it
Recommend (or Depend on) python3-protobuf-upb as the native/fast
implementation.
Faidon
1: https://github.com/protocolbuffers/protobuf/tree/main/python
2: https://github.com/protocolbuffers/upb/tree/main/python
3: https://protobuf.dev/news/2022-05-06/