Hi Cody,

First of all, thanks for the note about spark.streaming.concurrentJobs. I
guess that's why it isn't mentioned in the official Spark Streaming doc.
Since those 3 topics contain completely different data, to which I need to
apply different kinds of transformations, I am not sure joining them would
really be efficient, unless you know something that I don't.
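
Just to make sure I understand your suggestion correctly: I guess a single
direct stream covering the 3 topics would look roughly like the sketch
below (assuming the 0.8 direct stream API; broker address, topic names,
batch interval and the final print() calls are placeholders, not my actual
setup), with each topic then routed to its own transformations:

    import kafka.serializer.StringDecoder
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka.{HasOffsetRanges, KafkaUtils}

    val conf = new SparkConf().setAppName("multi-topic-sketch")
    val ssc = new StreamingContext(conf, Seconds(10))

    // Placeholder broker list and topic names.
    val kafkaParams = Map("metadata.broker.list" -> "broker1:9092")
    val topics = Set("topicA", "topicB", "topicC")

    // One direct stream subscribed to all three topics.
    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, topics)

    // Partition i of a direct stream RDD corresponds to offsetRanges(i),
    // so every record can be tagged with its source topic.
    val tagged = stream.transform { rdd =>
      val ranges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
      rdd.mapPartitionsWithIndex { (i, iter) =>
        val topic = ranges(i).topic
        iter.map { case (_, value) => (topic, value) }
      }
    }

    // Route each topic to its own transformations (print() is a stand-in output).
    tagged.filter(_._1 == "topicA").map(_._2).print()
    tagged.filter(_._1 == "topicB").map(_._2).print()
    tagged.filter(_._1 == "topicC").map(_._2).print()

    ssc.start()
    ssc.awaitTermination()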

As I really don't need any interaction between those streams, I think I
might end up running 3 different streaming apps instead of one.

Thanks again!

On Thu, Dec 17, 2015 at 11:43 AM, Cody Koeninger <c...@koeninger.org> wrote:

> Using spark.streaming.concurrentJobs for this probably isn't a good idea,
> as it allows the next batch to start processing before the current one is
> finished, which may have unintended consequences.
>
> Why can't you use a single stream with all the topics you care about, or
> multiple streams if you're e.g. joining them?
>
>
>
> On Wed, Dec 16, 2015 at 3:00 PM, jpocalan <jpoca...@gmail.com> wrote:
>
>> Never mind, I found the answer to my questions.
>> The following Spark configuration property will allow you to process
>> multiple KafkaDirectStreams in parallel:
>> --conf spark.streaming.concurrentJobs=<something greater than 1>
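>>
>> For example, something along these lines (the class name, the jar and
>> the value 3 are only placeholders):
>>
>>     spark-submit \
>>       --class com.example.MyStreamingApp \
>>       --conf spark.streaming.concurrentJobs=3 \
>>       my-streaming-app.jar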
>>
>


-- 
jean-pierre ocalan
jpoca...@gmail.com
