Re: all values for a key must fit in memory

Mridul Muralidharan Sun, 20 Apr 2014 11:38:22 -0700

An iterator does not imply data has to be memory resident.
Think merge sort output as an iterator (disk backed).
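
To make the analogy concrete, here is a minimal sketch in Python (not Spark
code). The names write_sorted_run, read_run and grouped are hypothetical, and
the sketch assumes string keys and values that contain no tabs or newlines.
Each sorted run lives on disk, and the merged output is exposed as an iterator
that holds only one record per run in memory at a time:

```python
import heapq
import itertools
import os
import tempfile


def write_sorted_run(pairs):
    """Spill one run of (key, value) pairs to a temp file, sorted by key."""
    fd, path = tempfile.mkstemp(suffix=".run")
    with os.fdopen(fd, "w") as f:
        for key, value in sorted(pairs, key=lambda kv: kv[0]):
            f.write(f"{key}\t{value}\n")
    return path


def read_run(path):
    """Lazily yield (key, value) pairs from a run file, one line at a time."""
    with open(path) as f:
        for line in f:
            key, value = line.rstrip("\n").split("\t", 1)
            yield key, value


def grouped(paths):
    """Yield (key, value_iterator) by lazily merging the sorted runs."""
    # heapq.merge keeps only the current head of each run in memory.
    merged = heapq.merge(*(read_run(p) for p in paths), key=lambda kv: kv[0])
    for key, group in itertools.groupby(merged, key=lambda kv: kv[0]):
        # The values for a key are streamed from disk, never materialized.
        yield key, (value for _, value in group)
```

In this sketch the caller has to consume each value iterator before moving on
to the next key, since all groups share the one underlying merged stream.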


Tom is actually planning to work with me on something similar,
hopefully this month or next.

Regards,
Mridul


On Sun, Apr 20, 2014 at 11:46 PM, Sandy Ryza <sandy.r...@cloudera.com> wrote:
> Hey all,
>
> After a shuffle / groupByKey, Hadoop MapReduce allows the values for a key
> to not all fit in memory.  The current ShuffleFetcher.fetch API, which
> doesn't distinguish between keys and values, only returning an Iterator[P],
> seems incompatible with this.
>
> Any thoughts on how we could achieve parity here?
>
> -Sandy
