Hi Nick,
How does reduce work? I thought after reducing in the executor, it
will reduce in parallel between multiple executors instead of pulling
everything to driver and reducing there.
Sincerely,
DB Tsai
---
My Blog: https://www.dbtsai.com
Li
Can you key your RDD by some key and use reduceByKey? In fact if you are
merging bunch of maps you can create a set of (k, v) in your mapPartitions and
then reduceByKey using some merge function. The reduce will happen in parallel
on multiple nodes in this case. You'll end up with just a single
I suppose what I want is the memory efficiency of toLocalIterator and the
speed of collect. Is there any such thing?
On Mon, Jun 9, 2014 at 3:19 PM, Sung Hwan Chung
wrote:
> Hello,
>
> I noticed that the final reduce function happens in the driver node with a
> code that looks like the followin