, etc) to execute in a single stage.
Hope this helps,
-adrian
From: Tathagata Das
Date: Wednesday, October 21, 2015 at 10:36 AM
To: swetha
Cc: user
Subject: Re: Job splling to disk and memory in Spark Streaming
Well, reduceByKey needs to shuffle if your intermediate data is not
already partitioned in the same way as reduceByKey's partitioning.
reduceByKey() has other signatures that take in a partitioner, or simply
the number of partitions, so you can set the same partitioner as your
previous stage. Without
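
To illustrate that last point, here is a minimal sketch (hypothetical RDD and
object names, assuming a plain Spark Scala app) of passing the same
HashPartitioner to the upstream stage and to reduceByKey, so the reduce can
reuse the existing partitioning instead of shuffling again:

    import org.apache.spark.{HashPartitioner, SparkConf, SparkContext}

    object ReduceByKeyPartitionerSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(
          new SparkConf().setAppName("reduceByKey-partitioner").setMaster("local[*]"))

        // One partitioner shared by the upstream stage and the reduce.
        val partitioner = new HashPartitioner(8)

        // Pre-partition the upstream pair RDD with that partitioner.
        val events = sc.parallelize(Seq(("a", 1), ("b", 2), ("a", 3)))
          .partitionBy(partitioner)

        // reduceByKey(partitioner, func) sees the data is already partitioned
        // the same way, so it does not need another shuffle here.
        val totals = events.reduceByKey(partitioner, _ + _)

        totals.collect().foreach(println)
        sc.stop()
      }
    }

In a streaming job, DStream's reduceByKey has an analogous overload that
accepts a Partitioner, so the same idea should carry over there as well.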