> On Feb 22, 2017, at 10:16 AM, Phil Mayers <p.may...@imperial.ac.uk> wrote:
>
> On 22/02/17 17:42, Hynek Schlawack wrote:
>
>> I have to disagree here: I don’t want build tools of any kind in my
>> final containers; therefore I build my artifacts separately no matter
>> what language. Of course you can just build the venv on your build
>
> Agreed, 100%. Apologies if I gave you the impression I was advocating
> otherwise.
>
>> server without wheeling up a temporary container and then package it
>> using Docker or DEB or whatever. You should be separating building
>> and running anyway so Python – as much as I’d like Go-style single
>> binaries too – is in no way special here. The nice thing about
>> temporary containers though is that I can do all of that on my Mac.
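(For anyone who hasn't seen the pattern Hynek is describing: a minimal
sketch, with image names and paths that are purely illustrative, looks
something like this:

    # Build wheels for the app and all of its dependencies inside a
    # throwaway container; the wheelhouse lands on the host via the volume.
    docker run --rm -v "$PWD:/src" -v "$PWD/wheelhouse:/wheelhouse" \
        python:3.6 pip wheel --wheel-dir /wheelhouse /src
    # The final container or .deb then installs from that wheelhouse only,
    # with no compilers present:
    #     pip install --no-index --find-links /wheelhouse myapp

The build toolchain lives and dies with the temporary container; only the
built artifacts survive.)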
>
> I agree that you need to separate building and installation, and I've got no
> particular beef with using a container, chroot, throwaway VM or whatever
> works for people in doing the build phase.
>
> (What people do with the resultant build output - and in particular whether
> there is a lot of ignoring of the hard-learned lessons of system package
> managers going on now - I will not comment on ;o)
>
> What I was trying to say - badly, apparently - was that the system python
> *could* be attractive to someone because many dependencies may exist in the
> OS package list in suitable form, but conversely may not exist in PyPI in
> binary form for Linux.
Yes, and building these binary artifacts is often harder than some people
(cough, alpine, cough) seem to think. But there are better ways to square this
circle than restricting yourself to the versions of python libraries that
happen to be available in your distro.
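To make the alpine point concrete: manylinux wheels target glibc, not musl,
so on an alpine image every C extension builds from source, and you end up
installing a whole toolchain first - a sketch, with package names from
memory:

    # No binary wheels apply here, so pip needs a compiler and headers:
    apk add --no-cache gcc musl-dev python3-dev libffi-dev \
        openssl-dev postgresql-dev
    pip install cryptography psycopg2   # both compile from sdist

...which is exactly the "build tools in the final container" situation Hynek
is rightly objecting to, unless you do the build in a separate stage.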
> As a very simple example: if you have a traditional (non-container) Linux
> system, and you deploy a Python app into a virtualenv on it, e.g. using
> Puppet or Ansible, you either need to:
>
> 1. Use no C extensions
> 2. Hope there's a manylinux1 binary wheel
> 3. Use the OS package and --system-site-packages
> 4. Compile the C extensions and make them available to pip
>
> #2 seems useful now that I know about it, but - correct me if I'm wrong -
> the list of C libraries manylinux1 permits linking against is super-tiny,
> and would not permit e.g. cryptography or psycopg2?
Cory already pointed this out tangentially, but I should emphasize:
'cryptography' and 'psycopg2' are things that you depend on at the Python
level. The things you depend on at the C level are libssl, libcrypto, and
libpq. If you want to build a manylinux wheel, you need to take this into
account and statically link those C dependencies, which some projects are
beginning to do. (Cryptography _could_ do this today; they already have the
infrastructure for doing it on macOS and Windows. The reason they're not
shipping manylinux1 wheels right now is the political implication of
auto-shipping a second copy of openssl to Linux distros that expect to
manage security upgrades centrally.)
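Mechanically, the way projects produce such wheels today is the pypa
manylinux1 build image plus auditwheel, which grafts copies of the needed
shared libraries into the wheel and retags it - close enough to "statically
link" for this discussion. A sketch, treating the Python version and paths
as assumptions:

    # Build a wheel inside the official manylinux1 container, then let
    # auditwheel bundle the C libraries (libpq, libssl, ...) into it:
    docker run --rm -v "$PWD:/io" quay.io/pypa/manylinux1_x86_64 sh -c \
        '/opt/python/cp27-cp27mu/bin/pip wheel /io -w /tmp/wheels &&
         auditwheel repair /tmp/wheels/*.whl -w /io/wheelhouse'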
> #4 is what you are advocating for, I believe? But can we agree that for
> smaller projects, that might seem like a lot of repeated work if the package
> is already available in the OS?
If you're going to do #4 with dh_virtualenv, your .deb can depend on the
packages that contain the relevant C libraries, and you can build plain
"linux" wheels, which are vendor-specific and can dynamically link against
whatever you want (as opposed to manylinux wheels, which are vendor-neutral
and must statically link everything). Manylinux wheels are required for
uploading to PyPI, where you don't know who may be downloading; on your own
infrastructure, where you are shipping inside an artifact (like a .deb) that
specifically has metadata describing its dependencies, "linux" wheels are
fine. Hanging around alone on PyPI as .whl files, rather than as .debs in
your infrastructure, they'd be mystery meat - but that is not the case when
they have proper dependency metadata.
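A sketch of the workflow I mean, with made-up names: the wheel gets built on
a machine matching the target distro, and the .deb declares the C-level
dependency so dpkg can see it.

    # On a build box running the same distro/release as production:
    pip wheel --wheel-dir ./wheelhouse psycopg2
    # -> psycopg2-<version>-cp27-cp27mu-linux_x86_64.whl, dynamically
    #    linked against the system libpq
    # Then, in debian/control for the dh_virtualenv-built package:
    #     Depends: libpq5
    # so the runtime C dependency is declared in the package metadata.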
It might seem weird to use Python-specific tooling and per-application
vendoring for Python dependencies, and yet use distro-global dynamic linking
for C dependencies. But this is actually a perfectly cromulent strategy, and
I think this bears a more in-depth explanation.
C (and particularly the ecosystem of weird dynamic-linker ceremony around C)
has extremely robust support for side-by-side installation, which distros
leverage to great effect. For example, on the Ubuntu machine sitting next to
me as I write this, I have libasan0 (4.8.5), libasan1 (4.9.4), libasan2
(5.4.1), *and* libasan3 (6.2.0) installed - and this isn't even a computer
with a particularly significant amount of stuff going on! Nothing ever
breaks and loads the wrong libasan.N.
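You can watch this machinery at work on any Debian-ish machine (paths below
are the usual multiarch location, but treat them as illustrative):

    # Each compiler generation ships its own runtime with its own SONAME,
    # so all of them coexist:
    dpkg -l 'libasan*'
    ls /usr/lib/x86_64-linux-gnu/libasan.so.*
    # e.g. libasan.so.0 libasan.so.1 libasan.so.2 libasan.so.3; the
    # dynamic linker loads whichever one each binary was linked against.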
Python, by contrast, tried to do this in a C-ish way, but that attempt
resulted in this mess, which almost nobody uses:
https://packaging.python.org/multi_version_install/
Right at the top of that document it says: "For many use cases, virtual
environments address this need without the complication ...".
Even if you are 100% bought into a distro-style way of life - no containers
at all, everything installed via a system package - virtualenvs still make
more sense than trying to sync up the whole system's Python library
versions.
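Concretely - paths illustrative - two system packages can each ship their
own virtualenv, with conflicting Python dependencies, on top of one system
interpreter:

    virtualenv /opt/app-a/venv
    /opt/app-a/venv/bin/pip install 'Twisted==16.4.0'
    virtualenv /opt/app-b/venv
    /opt/app-b/venv/bin/pip install 'Twisted==16.6.0'

Neither app can break the other, and neither can break the distro's own
Python tools.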
The reason nobody ever went back and tried to do multi-version installs
"right" with Python is that the Python and C library ecosystems are
fundamentally different in a bunch of important ways. For one thing, Python
libraries have no such thing as an enforceable ABI, so the coupling between
libraries and applications is much closer than in C. For another, there is
no SOVERSION (more on this below). Also, many small C utilities (the ones
that would be some of the smaller entries in requirements.txt in a Python
app) are vendored or statically linked into applications, so their
"dependency management" happens before the container build, in the
upstream's source repo, where it is hidden. And Python dependencies often
have a far higher rate of churn than C dependencies, because of the ease of
development; that means both more divergence between the versions required
by different applications, and more benefit to being up-to-date for the
applications that do rev faster.
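If you haven't bumped into SOVERSIONs: that is the mechanism behind the
libasan example above, and the thing wheels have no equivalent of (library
path illustrative):

    # A C shared library declares its binary-compatibility version in its
    # SONAME, and the linker records that in every consumer:
    objdump -p /usr/lib/x86_64-linux-gnu/libssl.so.1.0.0 | grep SONAME
    #   SONAME               libssl.so.1.0.0
    # Nothing in a .whl carries comparable, enforceable ABI metadata.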
Finally, the build process for Python packages is much simpler: they're
usually treated as archives of files that move around, rather than requiring
the elaborate pre-build steps that C libraries often need to make sure
everything is smashed into the .so at build time.
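You can see this for yourself - a wheel really is just a zip file with some
metadata (names below illustrative):

    pip wheel --wheel-dir . ./myapp            # "build" mostly means "zip it up"
    unzip -l myapp-1.0-py2.py3-none-any.whl    # ...and here are the files

For pure-Python packages, at least, nothing resembling configure/make ever
runs.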
So, for these reasons, think of your Python libraries as "vendored into"
your package rather than as dependencies provided by the OS - and then
participate in the broader distro (i.e. "C") ecosystem by building wheels
that dynamically link whatever distro-level dependencies they need.
> Wondering out loud, I guess it would be possible for OS-compiled python
> extensions to be somehow virtualenv or relocation-compatible. One could
> envisage something like:
>
> virtualenv t
> . t/bin/activate
> pip syspkg-install python-psycopg2
>
> ...and this going off and grabbing the OS-provided dependency of that name,
> extracting it, and deploying it into the virtualenv, rather than the system
> Python.
This is sort of what dh_virtualenv is. It doesn't set up the mapping for you
automatically, but you can pretty quickly figure out that python-psycopg2
Build-Depends on libpq-dev.
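The mapping is sitting right there in the distro's own metadata; on a
Debian-ish system, for example:

    # What does the distro build its own psycopg2 package against?
    apt-cache showsrc python-psycopg2 | grep ^Build-Depends
    # ...or just install the same build dependencies in one shot:
    apt-get build-dep python-psycopg2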
> There are doubtless all sorts of reasons that is not practical.
The main thing is just that you have to decide on a standard format for
distro-specific metadata, and then go encode it everywhere. I'm pretty sure
that distutils-sig would be open to codifying such an extension to the list of
standardized metadata fields, so that tools can use it.
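Strictly hypothetical syntax, to illustrate what "encode it everywhere"
might mean (no such metadata field exists today):

    # HYPOTHETICAL fields in a package's metadata, mapping the abstract C
    # dependency onto concrete distro package names:
    #     External-Depends: libpq
    #     External-Depends-Map: debian=libpq-dev; fedora=postgresql-devel
    # A tool like Phil's imagined "pip syspkg-install" could then ask
    # apt or dnf for the right packages before building.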
> Anyway, to be clear - I'm not advocating using the system Python. I'm trying
> to explain why, based on the efforts we expend locally, it could seem
> attractive to smaller sites.
To be even clearer: using the system python is fine - it's using the global
python environment that is the real problem.
(Although, of course, the "system python" is probably CPython, and in most
cases you want to be using PyPy, right? So yeah don't use the system Python.)
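To spell the distinction out, with illustrative paths:

    # Fine: the distro's interpreter, an isolated library environment.
    virtualenv --python=/usr/bin/python3 /srv/myapp/venv
    /srv/myapp/venv/bin/pip install myapp

    # Not fine: "sudo pip install myapp", which dumps your app's
    # dependencies into the global site-packages that the distro's own
    # tools are importing from.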
I hope this explanation was helpful to those of you deploying with distro
tooling!
-glyph
_______________________________________________
Twisted-Python mailing list
Twisted-Python@twistedmatrix.com
http://twistedmatrix.com/cgi-bin/mailman/listinfo/twisted-python