Re: Kafka DStream Parallelism

2015-02-27 Thread Corey Nolet
This was what I was thinking but wanted to verify. Thanks Sean! On Fri, Feb 27, 2015 at 9:56 PM, Sean Owen wrote: > The coarsest level at which you can parallelize is topic. Topics are > all but unrelated to each other so can be consumed independently. But > you can parallelize within the contex

Re: Kafka DStream Parallelism

2015-02-27 Thread Sean Owen
The coarsest level at which you can parallelize is topic. Topics are all but unrelated to each other so can be consumed independently. But you can parallelize within the context of a topic too. A Kafka group ID defines a consumer group. One consumer in a group receives each message to the topic tha
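
To make the consumer-group point above concrete, here is a minimal toy model in plain Python. It is only a sketch: it does not use the Kafka or Spark APIs, and the round-robin assignment, the integer-key partitioner, and the names `assign_partitions`, `deliver`, `receiver-0`, and `receiver-1` are all assumptions made for illustration, not how Kafka actually assigns partitions. It shows the one property discussed in the message: within a single group, each message is handed to exactly one consumer, so several consumers sharing a group ID split a topic's messages between them.

```python
from collections import defaultdict


def assign_partitions(num_partitions, consumers):
    """Spread partition ids round-robin over the consumers of one group (toy rule)."""
    assignment = defaultdict(list)
    for partition in range(num_partitions):
        assignment[consumers[partition % len(consumers)]].append(partition)
    return dict(assignment)


def deliver(messages, num_partitions, consumers):
    """Route each (key, value) message to the single consumer owning its partition."""
    owner = {
        partition: consumer
        for consumer, partitions in assign_partitions(num_partitions, consumers).items()
        for partition in partitions
    }
    received = defaultdict(list)
    for key, value in messages:
        partition = key % num_partitions  # toy partitioner: integer keys only
        received[owner[partition]].append(value)
    return dict(received)


# Hypothetical topic with 4 partitions, read by 2 consumers in the same group.
messages = [(i, f"msg-{i}") for i in range(8)]
received = deliver(messages, num_partitions=4, consumers=["receiver-0", "receiver-1"])
```

In this model the two consumers end up with disjoint halves of the messages, which is why adding consumers to one group can parallelize the reading of a single topic, up to the number of partitions.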

Kafka DStream Parallelism

2015-02-27 Thread Corey Nolet
Looking at [1], it seems to recommend pulling from multiple Kafka topics in order to parallelize data received from Kafka over multiple nodes. I notice in [2], however, that one of the createConsumer() functions takes a groupId. So am I understanding correctly that creating multiple DStreams with the s