New submission from Itamar Turner-Trauring <ita...@futurefoundries.com>:
By default, multiprocessing uses fork() without exec() on POSIX. For a variety of reasons this can leave subprocesses in an inconsistent state: module-level globals are copied, which can break logging; threads do not survive fork(); and so on. The end results vary, but quite often they are silent lockups.

In real-world usage, this leaves users with mysterious hangs that they do not have the knowledge to debug.

The fix for these people is to use "spawn" by default, which is already the default on Windows.

Just a small sample:

1. Today I talked to a scientist who spent two weeks stuck, until she found my article on the subject (https://codewithoutrules.com/2018/09/04/python-multiprocessing/). Basically, multiprocessing locked up, doing nothing forever. Switching to "spawn" fixed it.
2. https://github.com/dask/dask/issues/3759#issuecomment-476743555 is someone whose issues were fixed by "spawn".
3. https://github.com/numpy/numpy/issues/15973 is a NumPy issue which apparently impacted scikit-learn.

I suggest changing the default on POSIX to match Windows.

----------
messages: 367210
nosy: itamarst
priority: normal
severity: normal
status: open
title: multiprocessing's default start method of fork()-without-exec() is broken
type: behavior
versions: Python 3.5, Python 3.6, Python 3.7, Python 3.8, Python 3.9

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue40379>
_______________________________________
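For reference, a minimal sketch of the workaround mentioned in the report, opting into "spawn" explicitly rather than relying on the platform default. The names square, spawn_context, and run_demo are hypothetical and used only for illustration; multiprocessing.get_context() and its Pool are the standard-library pieces.

```python
import multiprocessing as mp


def square(x):
    """Toy worker; defined at module level so spawned children can import it."""
    return x * x


def spawn_context():
    """Return a context that starts children with "spawn" instead of fork()."""
    # get_context() does not change the process-wide default start method.
    return mp.get_context("spawn")


def run_demo():
    """Run the toy worker in a small pool of spawned processes."""
    ctx = spawn_context()
    with ctx.Pool(processes=2) as pool:
        return pool.map(square, [1, 2, 3])


if __name__ == "__main__":
    # The guard matters with "spawn": each child re-imports this module.
    print(run_demo())
```

Using a context object keeps the choice local to the calling code; multiprocessing.set_start_method("spawn") would instead change the default for the whole program.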