subject:"How does MapWithStateRDD distribute the data"

Re: How does MapWithStateRDD distribute the data

2017-06-16 Thread coolcoolkid

http://apache-spark-developers-list.1001551.n3.nabble.com/How-does-MapWithStateRDD-distribute-the-data-tp18544p21770.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.

Re: How does MapWithStateRDD distribute the data

2016-08-03 Thread Cody Koeninger

Are you using KafkaUtils.createDirectStream?

On Wed, Aug 3, 2016 at 9:42 AM, Soumitra Johri wrote:
> Hi,
>
> I am running a streaming job with 4 executors and 16 cores so that each
> executor has two cores to work with. The input Kafka topic has 4 partitions.
> With this given configuration I was

How does MapWithStateRDD distribute the data

2016-08-03 Thread Soumitra Johri

Hi, I am running a streaming job with 4 executors and 16 cores so that each executor has two cores to work with. The input Kafka topic has 4 partitions. With this given configuration I was expecting MapWithStateRDD to be evenly distributed across all executors; however, I see that it uses only two