[issue38501] multiprocessing.Pool hangs atexit (and garbage collection sometimes)

2019-10-16 Thread Eric Larson


New submission from Eric Larson :

The following code hangs on Python 3.8.0.rc0 on Ubuntu 19.10 when exiting the 
interpreter:


from multiprocessing import Pool

class A(object):
    def __init__(self):
        self.pool = Pool(processes=2)

solver = A()


When you eventually do ctrl-C, the traceback is:


^CProcess ForkPoolWorker-2:
Error in atexit._run_exitfuncs:
Process ForkPoolWorker-1:
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/util.py", line 277, in _run_finalizers
    finalizer()
  File "/usr/lib/python3.8/multiprocessing/util.py", line 201, in __call__
    res = self._callback(*self._args, **self._kwargs)
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 689, in _terminate_pool
    cls._help_stuff_finish(inqueue, task_handler, len(pool))
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 674, in _help_stuff_finish
    inqueue._rlock.acquire()
KeyboardInterrupt
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/process.py", line 313, in _bootstrap
    self.run()
  File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 114, in worker
    task = get()
  File "/usr/lib/python3.8/multiprocessing/queues.py", line 355, in get
    with self._rlock:
  File "/usr/lib/python3.8/multiprocessing/synchronize.py", line 95, in __enter__
    return self._semlock.__enter__()
KeyboardInterrupt
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/process.py", line 313, in _bootstrap
    self.run()
  File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python3.8/multiprocessing/pool.py", line 114, in worker
    task = get()
  File "/usr/lib/python3.8/multiprocessing/queues.py", line 356, in get
    res = self._reader.recv_bytes()
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 216, in recv_bytes
    buf = self._recv_bytes(maxlength)
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 414, in _recv_bytes
    buf = self._recv(4)
  File "/usr/lib/python3.8/multiprocessing/connection.py", line 379, in _recv
    chunk = read(handle, remaining)
KeyboardInterrupt


A similar hang seems to occur during garbage collection when the pool is an 
attribute of a class instance and that class has not yet called 
`self.pool.terminate()` and `self.pool.close()`. Hopefully fixing the `atexit` 
behavior will also fix the `gc.collect()` behavior.

Cross-ref in SciPy: https://github.com/scipy/scipy/issues/10927. It also 
appears to cause hangs on Travis 3.8-dev builds: 
https://travis-ci.org/scipy/scipy/jobs/598786785

--
components: Library (Lib)
messages: 354813
nosy: Eric Larson
priority: normal
severity: normal
status: open
title: multiprocessing.Pool hangs atexit (and garbage collection sometimes)
type: behavior
versions: Python 3.8

___
Python tracker 
<https://bugs.python.org/issue38501>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue38501] multiprocessing.Pool hangs atexit (and garbage collection sometimes)

2020-04-10 Thread Eric Larson


Eric Larson  added the comment:

If that's out of contract, perhaps there should be a big, visible warning 
at the top of the multiprocessing docs stating that creating one of these 
objects requires either using a context manager or ensuring a manual 
`.close()`?

1. Either it used to be in contract not to close the pool manually, or the 
contract was wrongly represented: the first Python 2.7 example in the docs 
(https://docs.python.org/2/library/multiprocessing.html#introduction) is:

from multiprocessing import Pool

def f(x):
    return x*x

if __name__ == '__main__':
    p = Pool(5)
    print(p.map(f, [1, 2, 3]))

So I suspect this (difficult-to-track-down) problem might hit users without a 
more adequate warning.


2. I'm surprised it's actually out of contract, given that the 3.8 docs state 
that close will be called automatically:

> close()
>
> Indicate that no more data will be put on this queue by the current process. 
> The background thread will quit once it has flushed all buffered data to the 
> pipe. This is called automatically when the queue is garbage collected.

and

> terminate()
>
> Stops the worker processes immediately without completing outstanding work. 
> When the pool object is garbage collected terminate() will be called 
> immediately.

Or perhaps I misunderstand what this is saying?

--




[issue38501] multiprocessing.Pool hangs atexit (and garbage collection sometimes)

2020-04-10 Thread Eric Larson


Eric Larson  added the comment:

> Why? This is a resource like any other and it requires proper resource 
> management. Would you also put a big warning on "open()" stating that opening 
> a file requires either using a context manager or ensure a manual close()?

One potential reason would be that the consequences of bad resource management 
here are different than in the open() case: the interpreter hangs -- or Travis 
runs for our repo (SciPy) get stuck with over-50-minute errors, which is how we 
started trying to track this down.

> > the first Python 2.7 example in the docs 

> Python 2.7 is not supported and the pool has changed *a lot* since Python 2. 

Indeed, my point is more about potential prevalence: this (now incorrect) 
problematic usage pattern was the first example in the multiprocessing docs for 
a long time, suggesting there may be a lot of code in the wild that still uses 
it.

> Yeah, and CPython does not promise that the __del__ method of any object will 
> be called, so it is not assured that the finalizer will call close():

Ahh, I didn't know that. This might be another good reason to add a warning 
about the risks of not ensuring closure yourself: others might have the same 
gap in knowledge I had, and assume the docs are saying that things will be 
taken care of by garbage collection on exit when they will not.

But you of course cannot document every possible problematic situation users 
could put themselves in, so I leave it up to your judgment whether or not it's 
worth it in this case.

--
