There is a trick that I use when data transfer is the performance killer. Save your big array to disk first (for instance in an .hdf5 file), then send the workers only the indices they need to retrieve their portion of the array, instead of the actual subarray.
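Here is a minimal sketch of that trick. To keep it free of third-party dependencies it swaps the .hdf5 file for a raw binary file written with the standard-library `array` module; with h5py the same pattern should apply, since slicing a dataset reads only the requested portion. The names `save_array`, `make_ranges` and `chunk_sum` are made up for illustration, and summing each chunk is just a stand-in for the real work.

```python
import os
import tempfile
from array import array
from multiprocessing import Pool

ITEMSIZE = array("d").itemsize  # bytes per stored float64 value


def save_array(path, values):
    """Write the values to disk once, as raw float64 (stand-in for an .hdf5 file)."""
    with open(path, "wb") as f:
        array("d", values).tofile(f)


def make_ranges(n_items, n_chunks):
    """Split range(n_items) into at most n_chunks contiguous (start, stop) pairs."""
    step = max(1, -(-n_items // n_chunks))  # ceiling division, at least 1
    return [(i, min(i + step, n_items)) for i in range(0, n_items, step)]


def chunk_sum(task):
    """Worker: receives only (path, start, stop) and reads its own slice from disk."""
    path, start, stop = task
    chunk = array("d")
    with open(path, "rb") as f:
        f.seek(start * ITEMSIZE)          # jump to the first requested element
        chunk.fromfile(f, stop - start)   # read only the requested elements
    return sum(chunk)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "big_array.bin")
    n = 1_000_000
    save_array(path, range(n))

    # Each task is a tiny tuple, so almost nothing is pickled and sent to the workers.
    tasks = [(path, start, stop) for start, stop in make_ranges(n, 8)]
    with Pool(4) as pool:
        total = sum(pool.map(chunk_sum, tasks))
    print(total)
```

What gets pickled per task is a path and two integers, whatever the size of the array; the cost moves to disk reads inside each worker, so whether this wins depends on how fast your storage is compared with the inter-process transfer.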
Anyway, there are cases where multiprocessing will never help, because the operation is too fast compared with the overhead multiprocessing involves (starting the processes, pickling the arguments and the results). In that case just give up and think about ways of changing the original problem.
-- 
https://mail.python.org/mailman/listinfo/python-list