New submission from Riccardo Coccioli <rcocci...@gmail.com>:

It seems that importlib.import_module() is not thread-safe on Python 3.4 and 3.5 if the loaded module raises an exception. I didn't find anything about this thread-safety limitation in Python's documentation. The failure occurs intermittently, with no apparent pattern.

This is the setup to reproduce the issue:

#----- FILES STRUCTURE
├── fail.py
└── test.py
#-----

#----- CONTENT OF fail.py
ACCESSIBLE = 'accessible'
import nonexistent  # raise RuntimeError('failed') is basically the same
NOT_ACCESSIBLE = 'not accessible'
#-----

#----- CONTENT OF test.py
import importlib
import concurrent.futures


def f():
    try:
        mod = importlib.import_module('fail')
        # importlib.reload(mod)  # WORKAROUND
        try:
            val = mod.NOT_ACCESSIBLE
        except AttributeError as e:
            val = str(e)
        return (mod.__name__, type(mod), mod.ACCESSIBLE, val)
    except ImportError as e:
        return str(e)


with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
    futures = [executor.submit(f) for i in range(5)]
    for future in concurrent.futures.as_completed(futures):
        print(future.result())
#-----

Expected result:
#-----
No module named 'nonexistent'
No module named 'nonexistent'
No module named 'nonexistent'
No module named 'nonexistent'
No module named 'nonexistent'
#-----

Actual result:
#-----
No module named 'nonexistent'
No module named 'nonexistent'
No module named 'nonexistent'
('fail', <class 'module'>, 'accessible', "'module' object has no attribute 'NOT_ACCESSIBLE'")
('fail', <class 'module'>, 'accessible', "'module' object has no attribute 'NOT_ACCESSIBLE'")
#-----

In the unexpected output lines, the module has been "partially" imported. 'mod' holds a module object: accessing an attribute defined before the failing import works fine, but accessing an attribute defined after it fails. It seems that the exception was not properly raised at module load time, while the module was loaded only up to the failing import.

The number of half-imported modules varies between runs and with different values for max_workers and range(), and can also be zero (normal behaviour). The frequency of the issue also varies.

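For convenience, the two files above can be folded into one script that writes the failing module to a temporary directory and then imports it from several threads. This is only a sketch of the same setup: the names FAIL_SOURCE, try_import and run_once are illustrative and not part of the original report, and on versions that do not show the issue every call should simply return the ImportError message.

```python
import concurrent.futures
import importlib
import os
import sys
import tempfile

# Same content as fail.py: one attribute before the failing import, one after.
FAIL_SOURCE = (
    "ACCESSIBLE = 'accessible'\n"
    "import nonexistent\n"
    "NOT_ACCESSIBLE = 'not accessible'\n"
)


def try_import(name):
    """Return the ImportError message, or a tuple describing a half-imported module."""
    try:
        mod = importlib.import_module(name)
    except ImportError as e:
        return str(e)
    # Reaching this point means import_module() returned a module object
    # even though executing the module raised.
    return (mod.__name__, hasattr(mod, 'ACCESSIBLE'), hasattr(mod, 'NOT_ACCESSIBLE'))


def run_once(workers=3, calls=5, name='fail'):
    """Write the failing module to a temp dir and import it from several threads."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, name + '.py'), 'w') as fh:
            fh.write(FAIL_SOURCE)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as executor:
                futures = [executor.submit(try_import, name) for _ in range(calls)]
                return [future.result() for future in futures]
        finally:
            # Leave the interpreter state as it was found.
            sys.path.remove(tmp)
            sys.modules.pop(name, None)


if __name__ == '__main__':
    results = run_once()
    partial = [r for r in results if not isinstance(r, str)]
    print('%d of %d calls returned a half-imported module' % (len(partial), len(results)))
```

Since the failure is intermittent, calling run_once() in a loop and varying workers and calls makes it easier to observe on an affected interpreter.
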
Using multiprocessing.pool.ThreadPool() and apply_async() instead of concurrent.futures.ThreadPoolExecutor has the same effect.

I was able to reproduce the issue with the following Python versions and platforms:
- 3.4.2 and 3.5.3 on Linux Debian
- 3.4.9 and 3.5.6 on macOS High Sierra 10.13.6

To the best of my knowledge, the issue doesn't show up on:
- 3.6.7 and 3.7.2 on macOS High Sierra 10.13.6

Thanks to a colleague's suggestion I also found a hacky workaround. Uncommenting the line marked 'WORKAROUND' in test.py forces a reload of the module. With that modification the actual result is:
#-----
No module named 'nonexistent'
No module named 'nonexistent'
No module named 'nonexistent'
module fail not in sys.modules
module fail not in sys.modules
#-----

While this doesn't solve the issue per se, it does raise an ImportError, as the module was supposed to do in the first place, just with a different message, allowing the code to continue its normal execution.

----------
components: Library (Lib)
messages: 337887
nosy: Riccardo Coccioli
priority: normal
severity: normal
status: open
title: importlib.import_module() not thread safe if Exception is raised (3.4, 3.5)
type: behavior
versions: Python 3.4, Python 3.5

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue36284>
_______________________________________
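
As a possible alternative to the reload workaround described above, the calls to importlib.import_module() could be serialized behind a single lock, so that no thread looks the module up while another thread is still executing it. The sketch below is an assumption, not something tried in this report and not verified on the affected versions; locked_import_module and _IMPORT_LOCK are hypothetical names, not part of importlib.

```python
import importlib
import threading

# Hypothetical helper, not part of importlib: one process-wide lock shared by
# every import that goes through locked_import_module(). An RLock is used so
# that a module which itself calls the helper while being imported does not
# deadlock the importing thread.
_IMPORT_LOCK = threading.RLock()


def locked_import_module(name, package=None):
    """Call importlib.import_module() while holding a process-wide lock.

    Assumption: if every thread imports the module through this helper, a
    failed import has already been removed from sys.modules by the time the
    next thread gets the lock, so that thread retries the import and sees the
    original exception instead of a half-initialized module.
    """
    with _IMPORT_LOCK:
        return importlib.import_module(name, package)
```

In test.py this would replace the importlib.import_module('fail') call inside f(). It only covers imports made through the helper: a plain import statement for the same module elsewhere in the program bypasses the lock.
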