Re: Spark Streaming- ReduceByKey not removing Duplicates for the same key in a Batch

2016-11-12 Thread dev loper
I am not able to figure out where I am going wrong here. Please help me here to get rid of this weird problem. Previously we were using createStream for listening to Kafka Queue (number of partitions 1), there we

Re: Spark Streaming- ReduceByKey not removing Duplicates for the same key in a Batch

2016-11-12 Thread ayan guha
e (number of partitions 1), there we didn't face this issue. But when we moved to directStream (number of partitions 100) we could easily reproduce this issue on high load.

Re: Spark Streaming- ReduceByKey not removing Duplicates for the same key in a Batch

2016-11-12 Thread Cody Koeninger
when we moved to directStream (number of partitions 100) we could easily reproduce this issue on high load. Note: I even tried reduceByKeyAndWindow with duration of 5 seconds instead of reduceByKey Operation, But even that didn't

Re: Spark Streaming- ReduceByKey not removing Duplicates for the same key in a Batch

2016-11-12 Thread dev loper
produce this issue on high load. *Note:* I even tried reduceByKeyAndWindow with duration of 5 seconds instead of reduceByKey Operation, But even that didn't help. uniqueCampaigns.reduceByKeyAndWindow((c1,c2)=>c1, Durations.Seconds(5), Durations.Seconds(5

Re: Spark Streaming- ReduceByKey not removing Duplicates for the same key in a Batch

2016-11-12 Thread Cody Koeninger
any solutions to this issue. *Stack Overflow Link* https://stackoverflow.com/questions/40559858/spark-streaming-reducebykey-not-removing-duplicates-for-the-same-key-in-a-batch Thanks and Regards Dev

Spark Streaming- ReduceByKey not removing Duplicates for the same key in a Batch

2016-11-12 Thread dev loper
questions/40559858/spark-streaming-reducebykey-not-removing-duplicates-for-the-same-key-in-a-batch Thanks and Regards Dev