New submission from Riccardo Coccioli <rcocci...@gmail.com>:

It seems that importlib.import_module() is not thread-safe on Python 3.4 and 
3.5 when the module being loaded raises an Exception. I didn't find anything 
about thread-safety limitations in Python's documentation.
The failure occurs intermittently.

This is the setup to reproduce the issue:

#----- FILES STRUCTURE
├── fail.py
└── test.py
#-----

#----- CONTENT OF fail.py
ACCESSIBLE = 'accessible'

import nonexistent  # raise RuntimeError('failed') is basically the same

NOT_ACCESSIBLE = 'not accessible'
#-----

#----- CONTENT OF test.py
import importlib
import concurrent.futures


def f():
    try:
        mod = importlib.import_module('fail')
        # importlib.reload(mod)  # WORKAROUND

        try:
            val = mod.NOT_ACCESSIBLE
        except AttributeError as e:
            val = str(e)

        return (mod.__name__, type(mod), mod.ACCESSIBLE, val)
    except ImportError as e:
        return str(e)


with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
    futures = [executor.submit(f) for i in range(5)]
    for future in concurrent.futures.as_completed(futures):
        print(future.result())
#-----

Expected result:
#-----
No module named 'nonexistent'
No module named 'nonexistent'
No module named 'nonexistent'
No module named 'nonexistent'
No module named 'nonexistent'
#-----

Actual result:
#-----
No module named 'nonexistent'
No module named 'nonexistent'
No module named 'nonexistent'
('fail', <class 'module'>, 'accessible', "'module' object has no attribute 'NOT_ACCESSIBLE'")
('fail', <class 'module'>, 'accessible', "'module' object has no attribute 'NOT_ACCESSIBLE'")
#-----

In the unexpected output lines, the module has been only "partially" imported. 
'mod' holds a module object: accessing an attribute defined before the failing 
import works fine, but accessing an attribute defined after it fails.
It seems that the Exception was not properly raised in that thread at module 
load time, yet the module was loaded only up to the failing import.
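
For comparison, here is a minimal single-threaded sketch of the behaviour I 
would expect in every thread: the import raises ImportError and the 
half-initialised module is not left behind in sys.modules. The module name and 
the temporary directory are made up for this illustration and are not part of 
the reproduction above.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical module name, chosen only for this illustration.
MODULE_NAME = 'fail_baseline_demo'

# Write a module that defines one attribute and then fails while importing.
tmp_dir = tempfile.mkdtemp()
with open(os.path.join(tmp_dir, MODULE_NAME + '.py'), 'w') as fh:
    fh.write("ACCESSIBLE = 'accessible'\n"
             "import nonexistent_module_for_demo\n"
             "NOT_ACCESSIBLE = 'not accessible'\n")
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

try:
    importlib.import_module(MODULE_NAME)
    import_failed = False
except ImportError:
    import_failed = True

# After a failed import the module should no longer be registered.
left_in_sys_modules = MODULE_NAME in sys.modules
print(import_failed, left_in_sys_modules)
```

With a single thread no caller ever gets to see the partially executed module, 
which is what makes the threaded result above surprising.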

The number of half-imported modules varies between runs and with the values 
chosen for max_workers and range(); it can also be zero (normal behaviour).
Using multiprocessing.pool.ThreadPool() and apply_async() instead of 
concurrent.futures.ThreadPoolExecutor has the same effect.

I was able to reproduce the issue with the following Python versions and 
platforms:
- 3.4.2 and 3.5.3 on Linux Debian
- 3.4.9 and 3.5.6 on macOS High Sierra 10.13.6

The issue doesn't show up, to the best of my knowledge, on:
- 3.6.7 and 3.7.2 on macOS High Sierra 10.13.6

Thanks to a colleague's suggestion I also found a hacky workaround: 
uncommenting the line marked 'WORKAROUND' in test.py forces a reload of the 
module. With that modification the actual result is:
#-----
No module named 'nonexistent'
No module named 'nonexistent'
No module named 'nonexistent'
module fail not in sys.modules
module fail not in sys.modules
#-----
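
The "not in sys.modules" message comes from importlib.reload(), which refuses 
to reload a module object that is not the one currently registered in 
sys.modules under its name. A minimal sketch of that check, using a made-up 
module object that was never registered (the name 'orphan_example' is 
hypothetical):

```python
import importlib
import types

# A module object that exists but was never added to sys.modules.
orphan = types.ModuleType('orphan_example')

try:
    importlib.reload(orphan)
    message = None
except ImportError as exc:
    # reload() rejects modules that are not registered in sys.modules.
    message = str(exc)

print(message)
```

This is presumably why the reload turns a half-imported 'fail' module into an 
ImportError: by the time reload() runs, the failed import has already removed 
'fail' from sys.modules.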

While this workaround doesn't solve the issue per se, it does raise the 
ImportError that the module was supposed to raise in the first place, just 
with a different message, allowing the code to continue its normal execution.
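
A possible alternative to the reload, sketched here under the assumption that 
every thread goes through the same helper, is to serialize the imports with a 
lock so that no thread can look the module up while another thread is still 
executing it. The helper name locked_import is made up for this example, and I 
have not verified this on 3.4 or 3.5.

```python
import importlib
import threading

# Hypothetical helper, not part of importlib: one process-wide lock that
# serializes every import performed through locked_import().
_import_lock = threading.Lock()


def locked_import(name):
    """Import `name` while holding a lock shared by all callers."""
    with _import_lock:
        return importlib.import_module(name)
```

In test.py this would replace the importlib.import_module('fail') call inside 
f(). Imports done elsewhere (for example a plain "import fail" statement) do 
not take this lock, so it only helps when all concurrent imports of the module 
use the helper.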

----------
components: Library (Lib)
messages: 337887
nosy: Riccardo Coccioli
priority: normal
severity: normal
status: open
title: importlib.import_module() not thread safe if Exception is raised (3.4, 
3.5)
type: behavior
versions: Python 3.4, Python 3.5

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue36284>
_______________________________________