Hi,

Today, a colleague of mine ran into some weird behavior from `python3 -m venv`. He is actually using Ubuntu, but I've tracked it down to changes made by Debian to `ensurepip`, and I've been wondering whether it should be considered a bug and fixed, so I thought I'd ask on this list :)

The behavior itself is admittedly pretty esoteric, as it arises when you attempt to manipulate a virtual environment with different versions of Debian-Python, which you probably shouldn't do in the first place? Or at least I wouldn't do it, I'd just re-create the virtual environment from scratch. This is how I've been able to reproduce it:

1. Create a virtual environment directory with Debian-Python 3.6 -- `python3.6 -m venv test-venv`.
2. Attempt to manipulate it with Debian-Python 3.5 -- `python3.5 -m venv test-venv`.

This results in the following error:

```
The virtual environment was not created successfully because ensurepip is not
available. On Debian/Ubuntu systems, you need to install the python3-venv
package using the following command.

    apt-get install python3-venv

You may need to use sudo with that command. After installing the python3-venv
package, recreate your virtual environment.

Failing command: ['/home/lukes/test-venv/bin/python3.5', '-Im', 'ensurepip', '--upgrade', '--default-pip']
```

This in itself is not entirely helpful, because `ensurepip` *was* actually available, as the `python3-venv` package was already installed. The problem is that this message is displayed whenever the "failing command" shown above results in a `CalledProcessError`, irrespective of the error's cause, which can also be that `ensurepip` did in fact run, but not successfully to completion.

So I would suggest amending the `venv.EnvBuilder._setup_pip` method along the following lines, for more resilient error reporting (this is based on Debian-Python 3.5, but AFAICS, in Debian-Python 3.8, the code is still the same):

```diff
--- a/usr/lib/python3.5/venv/__init__.py 2019-08-20 22:05:10.000000000 +0200
+++ b/usr/lib/python3.5/venv/__init__.py 2019-11-01 17:55:51.425990509 +0100
@@ -257,25 +257,33 @@
         # following command will produce an unhelpful error. Let's make it
         # more user friendly.
         try:
             subprocess.check_output(
                 cmd, stderr=subprocess.STDOUT, universal_newlines=True)
-        except subprocess.CalledProcessError:
-            print("""\
+        except subprocess.CalledProcessError as err:
+            if ': No module named ensurepip' in err.output:
+                print("""\
 The virtual environment was not created successfully because ensurepip is not
 available. On Debian/Ubuntu systems, you need to install the python3-venv
 package using the following command.

     apt-get install python3-venv

 You may need to use sudo with that command. After installing the python3-venv
 package, recreate your virtual environment.

 Failing command: {}
 """.format(cmd))
+            else:
+                print("""\
+Tried to run this command: {}
+But it failed with the following unexpected error:
+
+{}
+""".format(cmd, err.output))
             sys.exit(1)

     def setup_scripts(self, context):
         """
         Set up scripts into the created environment from a directory.
```

Still, even in the current version, it's great that the failing command is at least shown, pinpointing what to investigate further. Which I did, and realized that the problem wasn't that `ensurepip` wasn't available, but that running it resulted in a different error, which `venv` just didn't display. This is the `ensurepip` error (the first line was repeated multiple times, but I'll just paste it once):

```
/home/lukes/test-venv/share/python-wheels/requests-2.18.4-py2.py3-none-any.whl/requests/__init__.py:80: RequestsDependencyWarning: urllib3 (1.13.1) or chardet (2.3.0) doesn't match a supported version!
Traceback (most recent call last):
  File "/usr/lib/python3.5/runpy.py", line 184, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/lib/python3.5/ensurepip/__main__.py", line 4, in <module>
    ensurepip._main()
  File "/usr/lib/python3.5/ensurepip/__init__.py", line 268, in _main
    default_pip=args.default_pip,
  File "/usr/lib/python3.5/ensurepip/__init__.py", line 174, in bootstrap
    _run_pip(args + _PROJECTS, additional_paths)
  File "/usr/lib/python3.5/ensurepip/__init__.py", line 66, in _run_pip
    import pip
  File "/tmp/tmpl2su6y_t/pip-8.1.1-py2.py3-none-any.whl/pip/__init__.py", line 16, in <module>
  File "/tmp/tmpl2su6y_t/pip-8.1.1-py2.py3-none-any.whl/pip/vcs/mercurial.py", line 9, in <module>
  File "/tmp/tmpl2su6y_t/pip-8.1.1-py2.py3-none-any.whl/pip/download.py", line 39, in <module>
ImportError: cannot import name 'requests'
```

So this is what's happening:

1. `ensurepip` under Debian creates a directory under `test-venv/share/python-wheels` where it copies the wheels necessary to bootstrap `pip` from `/usr/share/python-wheels`; cf. the `ensurepip._bootstrap` function.
2. On subsequent runs, the `test-venv/share/python-wheels` directory is not cleaned when present. This means that in our case, it already contains wheels copied over by `ensurepip` from the first run of `venv` under Debian-Python 3.6, plus different versions of (mostly?) the same wheels which have now been copied over by `ensurepip` from Debian-Python 3.5. So we have two different sets of dependencies in wheel format, including among others the following libraries:

   - requests-2.18.4, urllib3-1.22, chardet-3.0.4 (from Debian-Python 3.6)
   - requests-2.9.1, urllib3-1.13.1, chardet-2.3.0 (from Debian-Python 3.5)

3. `ensurepip` only adds to `sys.path` the wheels that it has been manipulating (copying over), cf. the `copy_wheels` function, which is nested within `ensurepip._bootstrap`. So far so good, there's no interference.
4. Unfortunately, when `ensurepip._run_pip` tries to `import pip._internal`, *all* of the wheels under `test-venv/share/python-wheels` are now added to `sys.path` via the mechanism in `pip/_vendor/__init__.py` -- there's a glob which just adds all of the `*.whl` files in that directory.
5. As a consequence, depending on the order in which the different wheel versions of the dependencies get added to `sys.path`, we can end up trying to import an inconsistent set of dependencies, as indicated by the error above when trying to import `requests`: the first versions found on `sys.path` are requests-2.18.4 (from Debian-Python 3.6), but urllib3-1.13.1 and chardet-2.3.0 (from Debian-Python 3.5). This results in `requests` failing to import because the versions don't match, and everything collapses.

Now I'm not entirely sure what the purpose of copying the wheels over into `test-venv/share/python-wheels` even is (there must be a good reason, it's just not obvious to me why not add `/usr/share/python-wheels/*.whl` to `sys.path` directly), but AFAICS they're only used during the bootstrap process anyway -- they don't show up on `sys.path` once I run Python from the virtual environment.

So I guess a possible solution would be to reset this directory each time `ensurepip` runs, to make sure that there's only one set of dependencies in there at a time (the correct one). Something like:

```diff
--- a/usr/lib/python3.5/ensurepip/__init__.py 2019-08-20 22:05:09.000000000 +0200
+++ b/usr/lib/python3.5/ensurepip/__init__.py 2019-11-02 00:10:00.170871478 +0100
@@ -1,9 +1,10 @@
 import glob
 import os
 import os.path
 import pkgutil
+import shutil
 import sys
 import tempfile


 __all__ = ["version", "bootstrap"]
@@ -146,11 +147,13 @@
         # pip to look in when attempting to locate wheels to use to satisfy
         # the dependencies that pip normally bundles but Debian has debundled.
         # This is critically important and if this directory changes then both
         # python-pip and python-virtualenv needs updated to match.
         venv_wheel_dir = os.path.join(sys.prefix, 'share', 'python-wheels')
-        os.makedirs(venv_wheel_dir, exist_ok=True)
+        if os.path.isdir(venv_wheel_dir):
+            shutil.rmtree(venv_wheel_dir)
+        os.makedirs(venv_wheel_dir)
         dependencies = [
             os.path.basename(whl).split('-')[0]
             for whl in glob.glob('/usr/share/python-wheels/*.whl')
             ]
         copy_wheels(dependencies, venv_wheel_dir, sys.path)
```

Alternatively, if creating `test-venv/share/python-wheels` is not a hard requirement, it could be avoided entirely: `ensurepip._bootstrap` could just directly add the wheels in `/usr/share/python-wheels` to `sys.path`, and there wouldn't be any additional, possibly incompatible wheels for the `pip/_vendor/__init__.py` glob mechanism to add. Plus people like me would stop wondering where that `test-venv/share` directory came from, which they don't see when they create virtual environments on other OSs, or using `pyenv` Python. It could look something like this:

```diff
--- a/usr/lib/python3.5/ensurepip/__init__.py 2019-08-20 22:05:09.000000000 +0200
+++ b/usr/lib/python3.5/ensurepip/__init__.py 2019-11-02 01:00:22.743359249 +0100
@@ -128,38 +128,41 @@
     def copy_wheels(wheels, destdir, paths):
         for project in wheels:
             wheel_names = glob.glob(
                 '/usr/share/python-wheels/{}-*.whl'.format(project))
             if len(wheel_names) == 0:
                 raise RuntimeError('missing dependency wheel %s' % project)
             assert len(wheel_names) == 1, wheel_names
             wheel_name = os.path.basename(wheel_names[0])
             path = os.path.join('/usr/share/python-wheels', wheel_name)
-            with open(path, 'rb') as fp:
-                whl = fp.read()
-            dest = os.path.join(destdir, wheel_name)
-            with open(dest, 'wb') as fp:
-                fp.write(whl)
-            paths.append(dest)
+            # Only perform copy if an actual destdir was provided...
+            if destdir is not None:
+                with open(path, 'rb') as fp:
+                    whl = fp.read()
+                dest = os.path.join(destdir, wheel_name)
+                with open(dest, 'wb') as fp:
+                    fp.write(whl)
+                paths.append(dest)
+            # ... otherwise just append the original path to paths:
+            else:
+                paths.append(path)

     with tempfile.TemporaryDirectory() as tmpdir:
         # This directory is a "well known directory" which Debian has patched
         # pip to look in when attempting to locate wheels to use to satisfy
         # the dependencies that pip normally bundles but Debian has debundled.
         # This is critically important and if this directory changes then both
         # python-pip and python-virtualenv needs updated to match.
-        venv_wheel_dir = os.path.join(sys.prefix, 'share', 'python-wheels')
-        os.makedirs(venv_wheel_dir, exist_ok=True)
         dependencies = [
             os.path.basename(whl).split('-')[0]
             for whl in glob.glob('/usr/share/python-wheels/*.whl')
             ]
-        copy_wheels(dependencies, venv_wheel_dir, sys.path)
+        copy_wheels(dependencies, None, sys.path)

         # Put our bundled wheels into a temporary directory and construct the
         # additional paths that need added to sys.path
         additional_paths = []
         copy_wheels(_PROJECTS, tmpdir, additional_paths)

         # Construct the arguments to be passed to the pip command
         args = ["install", "--no-index", "--find-links", tmpdir]
         if root:
```

Both of these approaches get rid of the problem, as in, the series of two commands listed at the beginning...

```sh
$ python3.6 -m venv test-venv
$ python3.5 -m venv test-venv
```

... runs fine with either of these modifications.

Just to be extra clear: with "regular" (non-Debian) Python, this problem doesn't happen because `venv`/`ensurepip` doesn't do any of the magic around the `test-venv/share/python-wheels` directory; this directory isn't even created, it's a Debian-specific modification. So it's not an upstream problem. I even tried those two commands with Python 3.5 and 3.6 installed via `pyenv` to make sure, and it worked fine out of the box.

So what do you think? Is this worth fixing?
Should I report it somewhere else?

And thank you for taking the time to read this. I've probably been more verbose than necessary, as I wasn't sure how much shared context I could assume :)

Best,
David
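
P.S. In case anyone wants to check whether an existing virtual environment is in the mixed state described above, here is a minimal sketch. The helper name `find_conflicting_wheels` is made up for illustration and is not part of `venv`, `ensurepip` or `pip`; it just groups the `*.whl` files in a directory by project name (taken as the part of the filename before the first `-`, the same way the Debian code derives `dependencies`) and reports every project that has more than one wheel.

```python
import glob
import os
from collections import defaultdict


def find_conflicting_wheels(wheel_dir):
    """Return {project: [wheel filenames]} for projects with several wheels.

    Hypothetical helper, for illustration only. The project name is assumed
    to be everything before the first '-' in the wheel's filename.
    """
    by_project = defaultdict(list)
    for whl in glob.glob(os.path.join(wheel_dir, '*.whl')):
        name = os.path.basename(whl)
        by_project[name.split('-')[0]].append(name)
    # Keep only the projects that appear in more than one version.
    return {
        project: sorted(names)
        for project, names in by_project.items()
        if len(names) > 1
    }
```

Calling `find_conflicting_wheels('test-venv/share/python-wheels')` after running the two `venv` commands should then list `requests`, `urllib3` and `chardet` (among others) if the directory really contains both sets of wheels; an empty dict would mean there is only one wheel per project.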