On 11/30/2011 06:09 AM, DPalao wrote:
Hello,
I'm trying to use multiprocessing to parallelize some code. There are a number of
tasks (usually 12) that can be run independently. Each task produces a numpy
array, and at the end those arrays must be combined.
I implemented this using Queues (multiprocessing.Queue): one for input and
another for output.
But the code blocks. And it must be related to the size of the item I put on
the Queue: if I put a small array, the code works well; if the array is
realistically large (in my case it can vary from 160kB to 1MB), the code
apparently blocks forever.
I have tried this:
http://www.bryceboe.com/2011/01/28/the-python-multiprocessing-queue-and-large-objects/
but it didn't work (specifically, I put a None sentinel at the end for each
worker).

Before I change the implementation,
is there a way to work around this problem with multiprocessing.Queue?
Should I post the code (or a simplified version of it)?

Transferring data over a multiprocessing.Queue involves pickling the object and copying the whole thing across an inter-process pipe, so each task needs to do a reasonably large amount of work to justify the cost of that copying before you see any benefit from running the tasks in parallel.
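
For what it's worth, here is a minimal sketch of the kind of Queue-based setup I imagine you have (the task ids, array sizes, worker count and combination step are all made up). One thing worth noting: the parent should drain the result queue before join()ing the workers, because a child process cannot exit while its feeder thread is still flushing a large item into the pipe; with small arrays you get away with it, with large ones join() blocks, which sounds like what you are seeing.

    import multiprocessing as mp
    import numpy as np

    def worker(in_q, out_q):
        # Every put()/get() pickles the whole array and copies it through a pipe.
        for task_id in iter(in_q.get, None):        # None is the per-worker sentinel
            result = np.random.rand(200, 200)       # stand-in for the real task
            out_q.put((task_id, result))

    if __name__ == '__main__':
        n_workers, n_tasks = 4, 12
        in_q, out_q = mp.Queue(), mp.Queue()
        for i in range(n_tasks):
            in_q.put(i)
        for _ in range(n_workers):
            in_q.put(None)
        procs = [mp.Process(target=worker, args=(in_q, out_q))
                 for _ in range(n_workers)]
        for p in procs:
            p.start()
        # Drain the results *before* joining; otherwise join() can block
        # while a child is still flushing a large array into the pipe.
        results = dict(out_q.get() for _ in range(n_tasks))
        for p in procs:
            p.join()
        combined = sum(results.values())            # made-up combination step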

You may be able to avoid the copying by using shared memory (http://docs.python.org/library/multiprocessing.html#sharing-state-between-processes): keep using a Queue to signal when new data comes in or when a task is done, but put the large arrays themselves in shared memory. Be careful not to access the same data from multiple processes concurrently.
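
A rough sketch of that idea, assuming one fixed-size slot per task so that each worker writes only its own region (the shapes, dtype and the final combination step are placeholders, not your actual code):

    import ctypes
    import multiprocessing as mp
    import numpy as np

    N_TASKS, ROWS, COLS = 12, 200, 200              # made-up sizes

    def worker(task_id, shared, done_q):
        # Re-wrap the shared buffer as a numpy array; no data is copied.
        arr = np.frombuffer(shared.get_obj()).reshape(N_TASKS, ROWS, COLS)
        arr[task_id] = np.random.rand(ROWS, COLS)   # stand-in for the real task
        done_q.put(task_id)                         # only a tiny token crosses the pipe

    if __name__ == '__main__':
        # One double-precision slot per task; each worker touches a disjoint slot,
        # so no extra locking is needed for the writes themselves.
        shared = mp.Array(ctypes.c_double, N_TASKS * ROWS * COLS)
        done_q = mp.Queue()
        procs = [mp.Process(target=worker, args=(i, shared, done_q))
                 for i in range(N_TASKS)]
        for p in procs:
            p.start()
        for _ in range(N_TASKS):
            done_q.get()                            # wait until every task reports done
        for p in procs:
            p.join()
        result = np.frombuffer(shared.get_obj()).reshape(N_TASKS, ROWS, COLS)
        combined = result.sum(axis=0)               # made-up combination step

The Queue then only carries small task ids, so the pipe never has to move the arrays themselves.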

In any case, have you tried a multithreaded solution? numpy is a C extension, and I believe it releases the GIL during its heavy computations, so the GIL shouldn't stand in the way of parallelism.
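
Something along these lines, again with made-up shapes and a made-up combination step; the threads share the results list directly, so nothing is pickled:

    import threading
    import numpy as np

    def worker(task_id, results):
        # If numpy releases the GIL for the operations you use,
        # these calls can overlap on several cores.
        results[task_id] = np.random.rand(200, 200)  # stand-in for the real task

    if __name__ == '__main__':
        n_tasks = 12
        results = [None] * n_tasks
        threads = [threading.Thread(target=worker, args=(i, results))
                   for i in range(n_tasks)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        combined = np.sum(results, axis=0)           # made-up combination step

Whether this actually runs in parallel depends on your particular numpy operations releasing the GIL, so it's worth a quick benchmark before committing to it.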
