Re: Kafka 0.9 consumer API question

2015-12-17 Thread hsy...@gmail.com
> > rself. You can make changes to that set and always pass it to assign(), which would avoid the need to use assignment(). Also, I probably wouldn't be overly concerned about the copying overhead unless profiling shows that it is actually a problem. Are your p

Re: Kafka 0.9 consumer API question

2015-12-15 Thread Rajiv Kurian
> > > way I might try to implement your use case would be to maintain the assignment set yourself. You can make changes to that set and always pass it to assign(), which would avoid the need to use assignment(). Also, I probabl
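The suggestion quoted above (keep the assignment set in application code and always pass it to assign(), so that assignment() is never needed) could be sketched roughly as follows. This is only an illustration under stated assumptions: AssignmentTracker is a hypothetical helper, the type parameter P stands in for Kafka's TopicPartition, and the callback stands in for a call to the consumer's assign() so that the sketch runs without the Kafka client on the classpath.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical helper (not part of Kafka): the application owns the set of
// partitions and pushes the full set to the consumer after every change.
final class AssignmentTracker<P> {
    private final Set<P> partitions = new LinkedHashSet<>();
    // Assumption: stands in for a call such as consumer.assign(list).
    private final Consumer<List<P>> assignCall;

    AssignmentTracker(Consumer<List<P>> assignCall) {
        this.assignCall = assignCall;
    }

    // Adds a partition and re-assigns only if the set actually changed.
    void add(P partition) {
        if (partitions.add(partition)) {
            push();
        }
    }

    // Removes a partition and re-assigns only if the set actually changed.
    void remove(P partition) {
        if (partitions.remove(partition)) {
            push();
        }
    }

    // Snapshot of the current set; the copy keeps callers from mutating our state.
    List<P> current() {
        return new ArrayList<>(partitions);
    }

    private void push() {
        assignCall.accept(current());
    }

    public static void main(String[] args) {
        // Partitions are plain strings here purely for demonstration.
        AssignmentTracker<String> tracker =
            new AssignmentTracker<>(ps -> System.out.println("assign(" + ps + ")"));
        tracker.add("events-0");
        tracker.add("events-1");
        tracker.remove("events-0");
    }
}
```

The copy made in current() is the copying overhead mentioned in the thread; for small partition sets it is unlikely to matter, which is consistent with the advice to profile first.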

Re: Kafka 0.9 consumer API question

2015-12-15 Thread Jason Gustafson
> > assign(), which would avoid the need to use assignment(). Also, I probably wouldn't be overly concerned about the copying overhead unless profiling shows that it is actually a problem. Are your partition assignments generally very large? -Jas

Re: Kafka 0.9 consumer API question

2015-12-15 Thread Rajiv Kurian
> ofiling shows that it is actually a problem. Are your partition assignments generally very large? -Jason > On Tue, Dec 15, 2015 at 1:32 PM, Rajiv Kurian wrote: > > We are trying to use the Kafka 0.9 consumer API to poll specific partitions. We consume

Re: Kafka 0.9 consumer API question

2015-12-15 Thread Jason Gustafson
about the copying overhead unless profiling shows that it is actually a problem. Are your partition assignments generally very large? -Jason On Tue, Dec 15, 2015 at 1:32 PM, Rajiv Kurian wrote: > We are trying to use the Kafka 0.9 consumer API to poll specific partitions. We consume par

Kafka 0.9 consumer API question

2015-12-15 Thread Rajiv Kurian
We are trying to use the Kafka 0.9 consumer API to poll specific partitions. We consume partitions based on our own logic instead of delegating that to Kafka. One of our use cases is handling a change in the partitions that we consume. This means that sometimes we need to consume additional

Re: Kafka 0.9 consumer API

2015-03-26 Thread Rajiv Kurian
Thanks Guozhang, I am currently working on a project at my current company where I process data from Kafka. The data is all tiny Kafka messages (25-35 bytes) and so far we were bottlenecked on our processing speed. Recently we have made significant improvements and our processing speed has gone

Re: Kafka 0.9 consumer API

2015-03-26 Thread Guozhang Wang
Rajiv, Those are good points. As for implementation, we have developed a class in the producer that can probably be re-used for the consumer as well: org.apache.kafka.clients.producer.internals.BufferPool. Please feel free to add more comments on KAFKA-2045. Guozhang On Tue, Mar 24, 2015 at 12:21 P

Re: Kafka 0.9 consumer API

2015-03-24 Thread Rajiv Kurian
Hi Guozhang, Yeah, the main motivation is to not require de-serialization but still allow the consumer to de-serialize into objects if they really want to. Another motivation for iterating over the ByteBuffer on the fly is that we can prevent copies altogether. This has an added implication thoug
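To make the idea in this message concrete, here is a minimal sketch of iterating over a ByteBuffer on the fly with a reusable flyweight, so that no per-message object is allocated unless the caller asks for one. Everything in it is an assumption for illustration: MessageFlyweight is a hypothetical class, and the framing (a 4-byte length followed by the payload) is made up and is not Kafka's wire format.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical flyweight: one reusable cursor that points at successive
// length-prefixed messages inside a buffer without copying them.
final class MessageFlyweight {
    private ByteBuffer buffer;
    private int offset;  // start of the current payload
    private int length;  // payload length, or -1 before the first next()

    // Points the flyweight at a buffer; the same instance can be re-wrapped.
    MessageFlyweight wrap(ByteBuffer buffer) {
        this.buffer = buffer;
        this.offset = buffer.position();
        this.length = -1;
        return this;
    }

    // Advances to the next complete message; returns false when none is left.
    boolean next() {
        int lengthPos = (length < 0) ? offset : offset + length;
        if (buffer.limit() - lengthPos < Integer.BYTES) {
            return false;
        }
        int len = buffer.getInt(lengthPos); // absolute read, position untouched
        if (len < 0 || buffer.limit() - lengthPos - Integer.BYTES < len) {
            return false; // truncated or corrupt frame
        }
        offset = lengthPos + Integer.BYTES;
        length = len;
        return true;
    }

    int length() {
        return length;
    }

    // Reads one payload byte in place, with no allocation.
    byte byteAt(int index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return buffer.get(offset + index);
    }

    // Optional de-serialization: allocates only when the caller wants an object.
    String decodeUtf8() {
        ByteBuffer view = buffer.duplicate();
        view.position(offset);
        view.limit(offset + length);
        return StandardCharsets.UTF_8.decode(view).toString();
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(64);
        byte[] a = "hello".getBytes(StandardCharsets.UTF_8);
        byte[] b = "kafka".getBytes(StandardCharsets.UTF_8);
        buf.putInt(a.length).put(a).putInt(b.length).put(b);
        buf.flip();
        MessageFlyweight fw = new MessageFlyweight().wrap(buf);
        while (fw.next()) {
            System.out.println(fw.length() + " bytes: " + fw.decodeUtf8());
        }
    }
}
```

One consequence of this design is that the flyweight is only valid while the underlying buffer is untouched, so a caller that needs to keep a message past the next fetch has to copy it out explicitly.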

Re: Kafka 0.9 consumer API

2015-03-24 Thread Guozhang Wang
Hi Rajiv, Just want to clarify that the main motivation for iterating over the byte buffer directly instead of iterating over the records is for not enforcing de-serialization, right? I think that can be done by passing the deserializer class info into the consumer record instead of putting the d

Re: Kafka 0.9 consumer API

2015-03-22 Thread Rajiv Kurian
Hi Guozhang, Thanks for the note. So if we do not deserialize until the last moment, as Jay suggested, we would not need extra buffers for deserialization. Unless we need random access to messages, it seems like we can deserialize right at the time of iteration and allocate objects only if the Consu

Re: Kafka 0.9 consumer API

2015-03-22 Thread Rajiv Kurian
Thanks for the insight Jay. That seems like a good plan. I'll take a look at it ASAP. I have no idea how much things would improve in a general application with this. Like you said CRC and decompression could still be the dominant factor. In my experience cutting down allocation to 0 helps with 9

Re: Kafka 0.9 consumer API

2015-03-22 Thread Guozhang Wang
Rajiv, A side note on re-using ByteBuffer: in the new consumer we do plan to add a memory management module that will try to reuse allocated buffers for fetch responses. But as Jay mentioned, for now de-serialization and de-compression are done inside the poll() call, which requires to al

Re: Kafka 0.9 consumer API

2015-03-22 Thread Jay Kreps
Zijing, the new consumer will be in the next release. We don't have a hard date for this yet. Rajiv, I'm game if we can show a >= 20% performance improvement. It certainly could be an improvement, but it might also be that the CRC validation and compression dominate. The first step would be htt

Re: Kafka 0.9 consumer API

2015-03-21 Thread Rajiv Kurian
Just a follow up - I have implemented a pretty hacky prototype. It's too unclean to share right now but I can clean it up if you are interested. I don't think it offers anything that people don't already know about though. My prototype doesn't do any metadata requests yet but I have a flyweight bui

Re: Kafka 0.9 consumer API

2015-03-21 Thread Rajiv Kurian
I had a few more thoughts on the new API. Currently we use Kafka to transfer really compact messages - around 25-35 bytes each. Our use case is a lot of messages but each very small. Will it be possible to do the following to reuse a ConsumerRecord and the ConsumerRecords objects? We employ our own

Re: Kafka 0.9 consumer API

2015-03-21 Thread Zijing Guo
Hi all, The document is very beautiful. Which Kafka release will this be in, and what is the timeline? Thanks, Edwin On Friday, March 20, 2015 4:20 PM, Rajiv Kurian wrote: Awesome - can't wait for this version to be out! On Fri, Mar 20, 2015 at 12:22 PM, Jay Kreps wrote:

Re: Kafka 0.9 consumer API

2015-03-20 Thread Rajiv Kurian
Awesome - can't wait for this version to be out! On Fri, Mar 20, 2015 at 12:22 PM, Jay Kreps wrote: > The timeout in the poll call is more or less the timeout used by the selector. So each call to poll will do socket activity on any ready sockets, waiting for up to that time for a socket to

Re: Kafka 0.9 consumer API

2015-03-20 Thread Jay Kreps
The timeout in the poll call is more or less the timeout used by the selector. So each call to poll will do socket activity on any ready sockets, waiting for up to that time for a socket to be ready. There are no longer any background threads involved in the consumer, all activity is driven by the a
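As a rough illustration of the selector timeout behaviour described in this message, the sketch below uses the plain java.nio Selector rather than Kafka itself. It is an analogy under stated assumptions, not the consumer's actual implementation: SelectorTimeoutSketch and readyAfterWait are hypothetical names, and a Pipe stands in for a broker socket.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

// Hypothetical demo: select(timeout) returns as soon as a channel is ready,
// or after the timeout with nothing ready, all on the calling thread.
final class SelectorTimeoutSketch {

    // Returns how many channels were ready after waiting at most timeoutMs.
    // timeoutMs must be positive: select(0) would block indefinitely.
    static int readyAfterWait(boolean writeFirst, long timeoutMs) {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open(); // stand-in for a network connection
            try {
                pipe.source().configureBlocking(false);
                pipe.source().register(selector, SelectionKey.OP_READ);
                if (writeFirst) {
                    // Make the read side ready before we start waiting.
                    pipe.sink().write(ByteBuffer.wrap(new byte[] {42}));
                }
                return selector.select(timeoutMs);
            } finally {
                pipe.sink().close();
                pipe.source().close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("data pending: ready=" + readyAfterWait(true, 1000L));
        System.out.println("nothing pending: ready=" + readyAfterWait(false, 50L));
    }
}
```

No background thread is involved here either: the wait happens entirely inside the call, which mirrors the point that all activity is driven by the thread that calls poll.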

Re: Kafka 0.9 consumer API

2015-03-20 Thread Rajiv Kurian
I am trying to understand the semantics of the timeout specified in the poll method in http://kafka.apache.org/083/javadoc/index.html?org/apache/kafka/clients/consumer/KafkaConsumer.html. Is this timeout a measure of how long the fetch request will be parked on the broker waiting for a reply or is

Re: Kafka 0.9 consumer API

2015-03-20 Thread Rajiv Kurian
Thanks! On Thursday, March 19, 2015, Jay Kreps wrote: > Err, here: > > http://kafka.apache.org/083/javadoc/index.html?org/apache/kafka/clients/consumer/KafkaConsumer.html > > -Jay > > On Thu, Mar 19, 2015 at 9:40 PM, Jay Kreps > wrote: > > > The current work in progress is documented here: > >

Re: Kafka 0.9 consumer API

2015-03-19 Thread Jay Kreps
:-) On Thursday, March 19, 2015, James Cheng wrote: > Those are pretty much the best javadocs I've ever seen. :) > > Nice job, Kafka team. > > -James > > > On Mar 19, 2015, at 9:40 PM, Jay Kreps > wrote: > > > > Err, here: > > > http://kafka.apache.org/083/javadoc/index.html?org/apache/kafka/cl

Re: Kafka 0.9 consumer API

2015-03-19 Thread James Cheng
Those are pretty much the best javadocs I've ever seen. :) Nice job, Kafka team. -James > On Mar 19, 2015, at 9:40 PM, Jay Kreps wrote: > > Err, here: > http://kafka.apache.org/083/javadoc/index.html?org/apache/kafka/clients/consumer/KafkaConsumer.html > > -Jay > > On Thu, Mar 19, 2015 at 9:

Re: Kafka 0.9 consumer API

2015-03-19 Thread Jay Kreps
The current work in progress is documented here: On Thu, Mar 19, 2015 at 7:18 PM, Rajiv Kurian wrote: > Is there a link to the proposed new consumer non-blocking API? > > Thanks, > Rajiv >

Re: Kafka 0.9 consumer API

2015-03-19 Thread Jay Kreps
Err, here: http://kafka.apache.org/083/javadoc/index.html?org/apache/kafka/clients/consumer/KafkaConsumer.html -Jay On Thu, Mar 19, 2015 at 9:40 PM, Jay Kreps wrote: > The current work in progress is documented here: > > > On Thu, Mar 19, 2015 at 7:18 PM, Rajiv Kurian > wrote: > >> Is there a

Kafka 0.9 consumer API

2015-03-19 Thread Rajiv Kurian
Is there a link to the proposed new consumer non-blocking API? Thanks, Rajiv