New submission from Daniel Barcay <daniel.bar...@gmail.com>:

I have tracked down the exact cause of a sizable performance issue when using 
concurrent.futures.ProcessPoolExecutor. It is especially visible when large 
amounts of data are being copied back as results.

The line number causing the bad behavior and several remediation paths are 
included below. Since this affects core behavior of the module, I'm reluctant 
to try out a patch myself unless someone chimes in on the approach.

---Bug Symptoms:
  ProcessPoolExecutor.submit() hangs for long periods of time 
non-deterministically (over 20 seconds in my job). See the Bug Cause section 
below for the exact cause. 
   This hanging makes multiprocess job submission impossible from a main 
thread with real-time constraints when the results are large objects.

---Ideal behavior:
   submit() should not block on any results of other jobs, and a non-blocking 
wake signal should be used instead of a blocking put() call.

---Bug Cause:
In ProcessPoolExecutor.submit() line 473, a wake signal is sent to the 
management thread by posting a message to the result queue, which wakes the 
thread if it was blocked in recv().

I'm not even sure that this wake-up is necessary, as removing it seems to work 
just fine for my use-case on OSX. However, let's presume that it is for the 
time being.

The fact that submit() blocks until the result_queue is serviced is 
unnecessary, and it interferes with large results being sent back for 
Future.result().

---Possible remediations:

If a more fully-fledged Queue implementation were used, this signal could be 
replaced by the non-blocking version. Alternatively, the multiprocessing.Queue 
implementation could be extended to implement a non-blocking put().
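
As a rough illustration of what a non-blocking wake signal could look like, 
here is a minimal sketch built on a plain OS pipe. It is not the code in 
concurrent.futures and not a proposed patch. The class name WakeupPipe and its 
methods are hypothetical, and it assumes a POSIX platform where 
os.set_blocking() works on pipe file descriptors.

```python
import os
import select


class WakeupPipe:
    """Hypothetical helper: a wake signal whose sender never blocks."""

    def __init__(self):
        self._reader, self._writer = os.pipe()
        # Both ends are non-blocking so neither side can stall the other.
        os.set_blocking(self._reader, False)
        os.set_blocking(self._writer, False)

    def wakeup(self):
        """Signal the waiting thread; return immediately in every case."""
        try:
            os.write(self._writer, b"\0")
        except BlockingIOError:
            # The pipe buffer is full, so a wake-up is already pending.
            pass

    def wait(self, timeout=None):
        """Return True if a wake-up is pending within `timeout` seconds."""
        ready, _, _ = select.select([self._reader], [], [], timeout)
        return bool(ready)

    def clear(self):
        """Drain all pending wake-up bytes."""
        try:
            while os.read(self._reader, 4096):
                pass
        except BlockingIOError:
            # Nothing left to read.
            pass

    def close(self):
        os.close(self._reader)
        os.close(self._writer)
```

With something along these lines, the submitting thread would call wakeup() 
and the management thread would wait on both the result channel and the 
wake-up pipe, so the signal no longer competes with large results for the 
same queue.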


---Reproduction Details:
  I'm using concurrent.futures.ProcessPoolExecutor for a complicated 
data-processing use-case where the result is a large object to be sent across 
the result() channel. Create any such setup where the results are on the order 
of 50MB strings, submit 5-10 jobs at a time, and watch the time it takes to 
call submit().
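
A setup like the one described could be sketched as follows. This is not my 
actual job, only a minimal stand-in: make_large_result and time_submits are 
hypothetical names, the worker just builds a string of the requested size, and 
the optional mp_context argument is an addition that requires Python 3.7 or 
later.

```python
import time
from concurrent.futures import ProcessPoolExecutor


def make_large_result(size_bytes):
    """Worker: return a large string so the result channel carries a lot of data."""
    return "x" * size_bytes


def time_submits(n_jobs=8, size_bytes=50 * 1024 * 1024, max_workers=4,
                 mp_context=None):
    """Submit n_jobs jobs and return (per-submit latencies, result lengths)."""
    latencies = []
    with ProcessPoolExecutor(max_workers=max_workers,
                             mp_context=mp_context) as pool:
        futures = []
        for _ in range(n_jobs):
            start = time.perf_counter()
            futures.append(pool.submit(make_large_result, size_bytes))
            # Time spent inside submit() alone, not in the worker.
            latencies.append(time.perf_counter() - start)
        lengths = [len(f.result()) for f in futures]
    return latencies, lengths
```

Call time_submits() from under an `if __name__ == "__main__":` guard and print 
the returned latencies. The numbers will vary by platform and Python version, 
so none are quoted here.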

----------
components: Extension Modules
messages: 320257
nosy: dbarcay
priority: normal
severity: normal
status: open
title: concurrent.futures ProcessPoolExecutor submit() blocks on results being 
written
type: performance
versions: Python 3.6

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue33945>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
