Chris,
I tried the DStream.repartition mentioned in the document on parallelism in
receiving, and also set "spark.default.parallelism", but it still uses only
2 nodes in my cluster. I noticed there is another email thread on the same
topic:
http://apache-spark-user-list.1001560.n3.nabble.com/DStre
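For reference, here is a minimal sketch of roughly what I tried - the topic
name ("events"), ZooKeeper quorum ("zk:2181"), consumer group, thread count,
and batch interval are placeholders from my setup, not anything prescribed by
the docs:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

val conf = new SparkConf()
  .setAppName("kafka-parallelism-test")
  .set("spark.default.parallelism", "8")  // set explicitly as well

val ssc = new StreamingContext(conf, Seconds(2))

// One receiver-based stream; the map value is the number of consumer
// threads inside that single receiver, not the number of receivers.
val stream = KafkaUtils.createStream(ssc, "zk:2181", "my-group", Map("events" -> 4))

// Redistribute the received blocks across the cluster before processing.
val repartitioned = stream.repartition(8)
repartitioned.count().print()

ssc.start()
ssc.awaitTermination()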
I like this consumer for what it promises - better control over offsets and
recovery from failures. If I understand this right, it still uses a single
worker process to read from Kafka (one thread per partition) - is there a
way to specify multiple worker processes (on different machines) to read
from Kafka?
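To make the question concrete: with the stock receiver, the only way I know
to get ingestion onto several machines is to create several streams and union
them, as in the sketch below (names and counts are again placeholders). What
I'm asking is whether this consumer has an equivalent.

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

val conf = new SparkConf().setAppName("multi-receiver-sketch")
val ssc = new StreamingContext(conf, Seconds(2))

// Each createStream call gets its own receiver, which Spark places on some
// worker, so several calls spread the reading across machines.
val numReceivers = 4
val streams = (1 to numReceivers).map { _ =>
  KafkaUtils.createStream(ssc, "zk:2181", "my-group", Map("events" -> 1))
}

// Union the per-receiver streams back into one DStream for processing.
val unioned = ssc.union(streams)
unioned.count().print()

ssc.start()
ssc.awaitTermination()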