Sure, thanks Tathagata!


bit1...@163.com
 
From: Tathagata Das
Date: 2015-02-26 14:47
To: bit1...@163.com
CC: Akhil Das; user
Subject: Re: Re: Many Receiver vs. Many threads per Receiver
Spark Streaming has a new Kafka direct stream, to be released as an experimental 
feature with 1.3. It uses a low-level consumer. Not sure if it satisfies your 
purpose. 
If you want more control, it's best to create your own Receiver with the 
low-level Kafka API. 
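
For reference, a minimal sketch of the direct stream API as it stands for 1.3 (broker addresses, topic name, and batch interval here are placeholders; assumes spark-streaming-kafka is on the classpath):

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object DirectKafkaSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("DirectKafkaSketch")
    val ssc = new StreamingContext(conf, Seconds(10))

    // Direct stream: no long-running receiver; each Kafka partition maps
    // to one RDD partition, so parallelism follows the topic's partitions.
    val kafkaParams = Map("metadata.broker.list" -> "broker1:9092,broker2:9092")
    val topics = Set("myTopic")

    val stream = KafkaUtils.createDirectStream[
      String, String, StringDecoder, StringDecoder](ssc, kafkaParams, topics)

    stream.map(_._2).print()  // message values only
    ssc.start()
    ssc.awaitTermination()
  }
}
```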

TD

On Tue, Feb 24, 2015 at 12:09 AM, bit1...@163.com <bit1...@163.com> wrote:
Thanks Akhil.
Not sure whether the low-level consumer will be officially supported by Spark 
Streaming. So far, I don't see it mentioned/documented in the Spark Streaming 
programming guide.



bit1...@163.com
 
From: Akhil Das
Date: 2015-02-24 16:21
To: bit1...@163.com
CC: user
Subject: Re: Many Receiver vs. Many threads per Receiver
I believe when you go with 1, it will distribute the consumers across your 
cluster (possibly on 6 machines), but I still don't see a way to tell which 
partition each will consume from, etc. If you are looking for a consumer 
where you can specify the partition details and so on, then you are better off 
with the low-level consumer.



Thanks
Best Regards

On Tue, Feb 24, 2015 at 9:36 AM, bit1...@163.com <bit1...@163.com> wrote:
Hi,
I am experimenting with Spark Streaming and Kafka integration. To read messages 
from Kafka in parallel, there are basically two ways:
1. Create many Receivers, like (1 to 6).map(_ => KafkaUtils.createStream(...)). 
2. Specify many threads when calling KafkaUtils.createStream, e.g. a topic map 
of Map("myTopic" -> 6); this will create one receiver with 6 reading threads.
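
The two options above can be sketched like this (zkQuorum, group id, and topic name are placeholders; `ssc` is an existing StreamingContext):

```scala
import org.apache.spark.streaming.kafka.KafkaUtils

// Option 1: six receivers, each a long-running task occupying one core;
// union them so downstream operators see a single DStream.
val streams = (1 to 6).map(_ =>
  KafkaUtils.createStream(ssc, "zk1:2181", "myGroup", Map("myTopic" -> 1)))
val unioned = ssc.union(streams)

// Option 2: one receiver (one core) running 6 consumer threads internally.
val single = KafkaUtils.createStream(ssc, "zk1:2181", "myGroup",
  Map("myTopic" -> 6))
```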

My question is which option is better. Option 2 sounds better to me because it 
saves a lot of cores (one Receiver takes one core), but I learned from 
somewhere else that option 1 is better, so I would like to hear how you guys 
elaborate on this. Thanks.



bit1...@163.com

