Right. And you can specify the partitioner via "StateSpec.partitioner(partitioner: Partitioner)".
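For completeness, a minimal sketch of wiring that up (Spark 1.6, Scala); the socket source, the partition count of 8, and the checkpoint path are illustrative assumptions, not anything prescribed by the API:

import org.apache.spark.{HashPartitioner, SparkConf}
import org.apache.spark.streaming.{Seconds, State, StateSpec, StreamingContext}

object StatefulWordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("StatefulWordCount").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(1))
    ssc.checkpoint("/tmp/state-checkpoint") // mapWithState requires a checkpoint dir

    // Illustrative mapping function: running count per word.
    def trackCount(word: String, one: Option[Int], state: State[Int]): (String, Int) = {
      val sum = one.getOrElse(0) + state.getOption.getOrElse(0)
      state.update(sum)
      (word, sum)
    }

    val partitioner = new HashPartitioner(8) // assumed partition count

    val lines = ssc.socketTextStream("localhost", 9999)
    val pairs = lines.flatMap(_.split(" ")).map((_, 1))

    // Pin the state partitioner explicitly, instead of letting
    // InternalMapWithStateDStream fall back to
    // HashPartitioner(defaultParallelism).
    val spec = StateSpec.function(trackCount _).partitioner(partitioner)
    val counts = pairs.mapWithState(spec)

    counts.print()
    ssc.start()
    ssc.awaitTermination()
  }
}

Note that pinning the partitioner on the StateSpec only fixes the state side; the parent DStream's RDDs must already carry the same partitioner (e.g., via transform(_.partitionBy(partitioner))) for the partitionBy call in compute to be a no-op rather than create a ShuffledRDD.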
On Tue, Nov 29, 2016 at 1:16 PM, Amit Sela <amitsel...@gmail.com> wrote:
> Hi all,
>
> I've been digging into the MapWithState code (branch 1.6), and I came across
> the compute
> <https://github.com/apache/spark/blob/branch-1.6/streaming/src/main/scala/org/apache/spark/streaming/dstream/MapWithStateDStream.scala#L159>
> implementation in *InternalMapWithStateDStream*.
>
> Looking at the defined partitioner
> <https://github.com/apache/spark/blob/branch-1.6/streaming/src/main/scala/org/apache/spark/streaming/dstream/MapWithStateDStream.scala#L112>
> it looks like it could be different from the parent RDD's partitioner (if
> defaultParallelism() changed, for instance, or the input partitioning was
> smaller to begin with), which will eventually create
> <https://github.com/apache/spark/blob/branch-1.6/core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala#L537>
> a ShuffledRDD.
>
> Am I reading this right?
>
> Thanks,
> Amit