Re: createMessageStreams vs createMessageStreamsByFilter

2015-03-13 Thread tao xiao
Fetch data from a leader to consumer. Replication fetcher is configured by another property On Saturday, March 14, 2015, Zakee wrote: > Sorry, but still confused. Maximum number of threads (fetchers) to fetch > from a Leader or maximum number of threads within a follower broker? > > Thanks for

Re: createMessageStreams vs createMessageStreamsByFilter

2015-03-13 Thread Zakee
Sorry, but still confused. Maximum number of threads (fetchers) to fetch from a Leader or maximum number of threads within a follower broker? Thanks for clarifying, -Zakee > On Mar 12, 2015, at 11:11 PM, tao xiao wrote: > > The number of fetchers is configurable via num.replica.fetchers. Th

Re: createMessageStreams vs createMessageStreamsByFilter

2015-03-12 Thread tao xiao
The number of fetchers is configurable via num.replica.fetchers. The description of num.replica.fetchers in Kafka documentation is not quite accurate. num.replica.fetchers actually controls the max number of fetchers per broker. In you case num.replica.fetchers=8 and 5 brokers the means no more 8 f

Re: createMessageStreams vs createMessageStreamsByFilter

2015-03-12 Thread Zakee
Is this always the case that there is only one fetcher per broker, won’t setting num.replica.fetchers greater than number-of-brokers cause more fetchers per broker? Let’s I have 5 brokers, and num of replica fetchers is 8, will there be 2 fetcher threads pulling from each broker? Thanks Zakee

Re: createMessageStreams vs createMessageStreamsByFilter

2015-03-12 Thread James Cheng
Ah, I understand now. I didn't realize that there was one fetcher thread per broker. Thanks Tao & Guozhang! -James On Mar 11, 2015, at 5:00 PM, tao xiao wrote: > Fetcher thread is per broker basis, it ensures that at lease one fetcher > thread per broker. Fetcher thread is sent to broker wit

Re: createMessageStreams vs createMessageStreamsByFilter

2015-03-11 Thread tao xiao
Fetcher thread is per broker basis, it ensures that at lease one fetcher thread per broker. Fetcher thread is sent to broker with a fetch request to ask for all partitions. So if A, B, C are in the same broker fetcher thread is still able to fetch data from A, B, C even though A returns no data. sa

Re: createMessageStreams vs createMessageStreamsByFilter

2015-03-11 Thread James Cheng
On Mar 11, 2015, at 9:12 AM, Guozhang Wang wrote: > Hi James, > > What I meant before is that a single fetcher may be responsible for putting > fetched data to multiple queues according to the construction of the > streams setup, where each queue may be consumed by a different thread. And > the

Re: createMessageStreams vs createMessageStreamsByFilter

2015-03-11 Thread Guozhang Wang
Hi James, What I meant before is that a single fetcher may be responsible for putting fetched data to multiple queues according to the construction of the streams setup, where each queue may be consumed by a different thread. And the queues are actually bounded. Now say if there are two queues tha

Re: createMessageStreams vs createMessageStreamsByFilter

2015-03-11 Thread tao xiao
consumer.timeout.ms only affects how the stream reads data from the internal chunk queue that is used to buffer received data. The actual data fetching is done by another fetcher thread kafka.consumer.ConsumerFetcherThread. The fetcher thread keeps reading data from broker and put them to the queu

Re: createMessageStreams vs createMessageStreamsByFilter

2015-03-10 Thread James Cheng
Hi, Sorry to bring up this old thread, but my question is about this exact thing: Guozhang, you said: > A more concrete example: say you have topic AC: 3 partitions, topic BC: 6 > partitions. > > With createMessageStreams("AC" => 3, "BC" => 2) a total of 5 threads will > be created, and consumin

Re: createMessageStreams vs createMessageStreamsByFilter

2015-02-11 Thread Guozhang Wang
The new consumer will be released in 0.9, which is targeted for end of this quarter. On Tue, Feb 10, 2015 at 7:11 PM, tao xiao wrote: > Do you know when the new consumer API will be publicly available? > > On Wed, Feb 11, 2015 at 10:43 AM, Guozhang Wang > wrote: > > > Yes, it can get stuck. For

Re: createMessageStreams vs createMessageStreamsByFilter

2015-02-10 Thread tao xiao
Do you know when the new consumer API will be publicly available? On Wed, Feb 11, 2015 at 10:43 AM, Guozhang Wang wrote: > Yes, it can get stuck. For example, AC and BC are processed by two > different processes and AC processors gets stuck, hence AC messages will > fill up in the consumer's buf

Re: createMessageStreams vs createMessageStreamsByFilter

2015-02-10 Thread Guozhang Wang
Yes, it can get stuck. For example, AC and BC are processed by two different processes and AC processors gets stuck, hence AC messages will fill up in the consumer's buffer and eventually prevents the fetcher thread to put more data into it; the fetcher thread will be blocked on that and not be abl

Re: createMessageStreams vs createMessageStreamsByFilter

2015-02-10 Thread tao xiao
Thank you Guozhang for your detailed explanation. In your example createMessageStreamsByFilter("*C" => 3) since threads are shared among topics there may be situation where all 3 threads threads get stuck with topic AC e.g. topic is empty which will be holding the connecting threads (setting consu

Re: createMessageStreams vs createMessageStreamsByFilter

2015-02-10 Thread Guozhang Wang
I was not clear before .. for createMessageStreamsByFilter each matched topic will have num-threads, but shared: i.e. there will be totally num-threads created, but each thread will be responsible for fetching all matched topics. A more concrete example: say you have topic AC: 3 partitions, topic

Re: createMessageStreams vs createMessageStreamsByFilter

2015-02-10 Thread tao xiao
Guozhang, Do you mean that each regex matched topic owns number of threads that get passed in to createMessageStreamsByFilter ? For example in below code If I have 3 matched topics each of which has 2 partitions then I should have 3 * 2 = 6 threads in total with each topic owning 2 threads. Topic

Re: createMessageStreams vs createMessageStreamsByFilter

2015-02-10 Thread Guozhang Wang
createMessageStreams is used for consuming from specific topic(s), where you can put a map of [topic-name, num-threads] as its input parameters; createMessageStreamsByFilter is used for consuming from wildcard topics, where you can put a (regex, num-threads) as its input parameters, and for each r

createMessageStreams vs createMessageStreamsByFilter

2015-02-10 Thread tao xiao
Hi team, I am comparing the differences between ConsumerConnector.createMessageStreams and ConsumerConnector.createMessageStreamsByFilter. My understanding is that createMessageStreams creates x number of threads (x is the number of threads passed in to the method) dedicated to the specified topic