[issue40379] multiprocessing's default start method of fork()-without-exec() is broken

2021-04-30 Thread Itamar Turner-Trauring


Itamar Turner-Trauring  added the comment:

This change was made on macOS at some point, so why not Linux? "spawn" is 
already the default on macOS and Windows.

--

___
Python tracker 
<https://bugs.python.org/issue40379>
___



[issue40379] multiprocessing's default start method of fork()-without-exec() is broken

2021-04-30 Thread Itamar Turner-Trauring


Itamar Turner-Trauring  added the comment:

Given people's general experience, I would not say that "fork" works on Linux 
either. It's more like "99% of the time it works, 1% of the time it randomly 
breaks in mysterious ways".

--

___
Python tracker 
<https://bugs.python.org/issue40379>
___



[issue40379] multiprocessing's default start method of fork()-without-exec() is broken

2020-04-24 Thread Itamar Turner-Trauring


New submission from Itamar Turner-Trauring:

By default, multiprocessing uses fork() without exec() on POSIX. For a variety 
of reasons this can lead to inconsistent state in subprocesses: module-level 
globals are copied, which can mess up logging; threads don't survive fork(); 
and so on.

The end results vary, but quite often they are silent lockups.

In real-world usage, this results in users hitting mysterious hangs they do not 
have the knowledge to debug.

The fix for these people is to use "spawn", which is already the default on 
Windows.

Just a small sample:

1. Today I talked to a scientist who spent two weeks stuck until she found my 
article on the subject 
(https://codewithoutrules.com/2018/09/04/python-multiprocessing/). Basically, 
multiprocessing locked up, doing nothing forever. Switching to "spawn" fixed it.
2. https://github.com/dask/dask/issues/3759#issuecomment-476743555 is someone 
whose issue was fixed by switching to "spawn".
3. https://github.com/numpy/numpy/issues/15973 is a NumPy issue which 
apparently impacted scikit-learn.


I suggest changing the default on POSIX to match Windows.
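
In the meantime, the workaround that fixed things for the people above is to 
opt in to "spawn" explicitly. A minimal sketch of that workaround 
(set_start_method() and get_context() are existing multiprocessing APIs; the 
work() function is just a placeholder):

    import multiprocessing

    def work(x):
        return x * x

    if __name__ == "__main__":
        # Opt in to the "spawn" start method (a fresh interpreter per child),
        # which is what this report proposes as the default on POSIX.
        multiprocessing.set_start_method("spawn")
        # Alternatively, multiprocessing.get_context("spawn") gives a context
        # object without changing the process-wide default.
        with multiprocessing.Pool(4) as pool:
            print(pool.map(work, range(10)))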

--
messages: 367210
nosy: itamarst
priority: normal
severity: normal
status: open
title: multiprocessing's default start method of fork()-without-exec() is broken
type: behavior
versions: Python 3.5, Python 3.6, Python 3.7, Python 3.8, Python 3.9

___
Python tracker 
<https://bugs.python.org/issue40379>
___



[issue40379] multiprocessing's default start method of fork()-without-exec() is broken

2020-04-24 Thread Itamar Turner-Trauring


Itamar Turner-Trauring  added the comment:

It looks like as of 3.8 this only impacts Linux and other non-macOS POSIX 
platforms, so I'll amend the above: this change would also make the default 
consistent with macOS.

--

___
Python tracker 
<https://bugs.python.org/issue40379>
___



[issue40379] multiprocessing's default start method of fork()-without-exec() is broken

2020-05-05 Thread Itamar Turner-Trauring


Itamar Turner-Trauring  added the comment:

Just got an email from someone for whom switching to "spawn" fixed a problem. 
Earlier this week someone tweeted about this fixing things. This keeps hitting 
people in the real world.

--

___
Python tracker 
<https://bugs.python.org/issue40379>
___



[issue40379] multiprocessing's default start method of fork()-without-exec() is broken

2020-11-06 Thread Itamar Turner-Trauring


Itamar Turner-Trauring  added the comment:

Another person with the same issue: 
https://twitter.com/volcan01010/status/1324764531139248128

--
nosy: +itamarst2

___
Python tracker 
<https://bugs.python.org/issue40379>
___



[issue14976] queue.Queue() is not reentrant, so signals and GC can cause deadlocks

2017-04-16 Thread Itamar Turner-Trauring

Itamar Turner-Trauring added the comment:

This bug was closed on the basis that signals + threads don't interact well, 
which is a good point.

Unfortunately, this bug can happen in cases that have nothing to do with 
signals. As the title and some of the comments note, it also happens as a 
result of garbage collection: in `__del__` methods and weakref callbacks.

Specifically, garbage collection can happen on any bytecode boundary and cause 
reentrancy problems with Queue.

The attached file is an attempt to demonstrate this: it runs and runs until GC 
happens and then deadlocks for me (on Python 3.6). That is, it prints "GC!" and 
after that no further output is printed and no CPU usage is reported.
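
In outline, the failure mode looks something like the sketch below (this is 
illustrative only, not the attached queuebug2.py; the Leaker class is made up):

    import queue

    q = queue.Queue()

    class Leaker:
        def __del__(self):
            # If this finalizer fires while the main thread is inside q.put()
            # and already holds the queue's internal, non-reentrant lock, this
            # second put() blocks forever in the same thread.
            q.put("from __del__")

    def make_garbage():
        # Create a reference cycle so the object is only freed by the cyclic
        # garbage collector, at some arbitrary later bytecode boundary,
        # possibly in the middle of another q.put() call.
        obj = Leaker()
        obj.cycle = obj

    while True:
        make_garbage()
        q.put("from the main loop")  # GC may trigger inside this call
        q.get()                      # drain so the queue stays small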

Please re-open this bug: if you don't want to fix the signal case that's fine, 
but the GC case is still an issue.

--
versions: +Python 3.5, Python 3.6
Added file: http://bugs.python.org/file46806/queuebug2.py

___
Python tracker 
<http://bugs.python.org/issue14976>
___



[issue19206] Support disabling file I/O when doing traceback formatting

2013-10-09 Thread Itamar Turner-Trauring

New submission from Itamar Turner-Trauring:

In certain situations it is best to avoid doing file I/O. For example, a 
program that runs in an event loop may wish to avoid any potentially blocking 
operations; reading from a file is usually fast, but can sometimes take 
arbitrarily long. Another example (my specific use case) is a logging library: 
you don't want to block for an arbitrary amount of time when creating log 
messages in an application thread (a separate thread would do the writing).

Unfortunately, the traceback.py module reads from files to load the source 
lines for the traceback (using linecache.py). This means that if you want to 
format a traceback without file I/O, you have to either recreate some of the 
standard library's logic, monkey-patch globally, or do a terrible hack you 
don't want to know about.

It would be better if there were some way to ask the traceback.py module's 
functions not to do file I/O. The actual change would be fairly minor, I 
suspect, since the formatting functions already handle getting None back from 
linecache.
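
For concreteness, here is a rough sketch of the kind of output you can get 
without touching the filesystem (format_tb_no_io is a made-up name, not an 
existing traceback function; it mirrors traceback.format_tb() minus the source 
line):

    import sys

    def format_tb_no_io(tb):
        # Same frame header layout as traceback.format_tb(), but the source
        # line is simply omitted instead of being loaded via linecache.
        lines = []
        while tb is not None:
            code = tb.tb_frame.f_code
            lines.append('  File "%s", line %d, in %s\n'
                         % (code.co_filename, tb.tb_lineno, code.co_name))
            tb = tb.tb_next
        return lines

    try:
        1 / 0
    except ZeroDivisionError:
        print("".join(format_tb_no_io(sys.exc_info()[2])), end="")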

--
components: Library (Lib)
messages: 199294
nosy: itamarst
priority: normal
severity: normal
status: open
title: Support disabling file I/O when doing traceback formatting
type: enhancement

___
Python tracker 
<http://bugs.python.org/issue19206>
___



[issue14976] queue.Queue() is not reentrant, so signals and GC can cause deadlocks

2014-02-17 Thread Itamar Turner-Trauring

Itamar Turner-Trauring added the comment:

This is not specifically a signal issue; it can happen with garbage collection 
as well if you have a Queue.put that runs in __del__ or a weakref callback 
function.

This can happen in real code. In my case, I have a thread that reads log 
messages from a Queue and writes them to disk. The thread putting log messages 
into the Queue can deadlock if GC happens to cause a log message to be written 
right after Queue.put() has acquired the lock 
(see https://github.com/itamarst/crochet/issues/25).

--
nosy: +itamarst
title: Queue.PriorityQueue() is not interrupt safe -> queue.Queue() is not 
reentrant, so signals and GC can cause deadlocks
versions: +Python 3.3

___
Python tracker 
<http://bugs.python.org/issue14976>
___