Re: New Consumer API discussion

2014-03-27 Thread Neha Narkhede
If people don't have any more thoughts on this, I will go ahead and submit a reviewboard to https://issues.apache.org/jira/browse/KAFKA-1328. Thanks, Neha On Mon, Mar 24, 2014 at 5:39 PM, Neha Narkhede wrote: > I took some time to write some example code using the new consumer APIs to > cover a

Re: New Consumer API discussion

2014-03-24 Thread Neha Narkhede
I took some time to write some example code using the new consumer APIs to cover a range of use cases. This exercise was very useful (thanks for the suggestion, Jay!) since I found several improvements to the APIs to make them more usable. Here are some of the changes

Re: New Consumer API discussion

2014-03-24 Thread Neha Narkhede
Hey Chris, Really sorry for the late reply, wonder how this fell through the cracks. Anyhow, thanks for the great feedback! Here are my comments - 1. Why is the config String->Object instead of String->String? This is probably more of a feedback about the new config management that we adopted in

Re: New Consumer API discussion

2014-03-03 Thread Chris Riccomini
Hey Guys, Also, for reference, we'll be looking to implement new Samza consumers which have these APIs: http://samza.incubator.apache.org/learn/documentation/0.7.0/api/javadocs/or g/apache/samza/system/SystemConsumer.html http://samza.incubator.apache.org/learn/documentation/0.7.0/api/javadocs/o

Re: New Consumer API discussion

2014-03-03 Thread Chris Riccomini
Hey Guys, Sorry for the late follow up. Here are my questions/thoughts on the API: 1. Why is the config String->Object instead of String->String? 2. Are these Java docs correct? KafkaConsumer(java.util.Map configs) A consumer is instantiated by providing a set of key-value pairs as configur

Re: New Consumer API discussion

2014-02-21 Thread Jun Rao
Looks good overall. Some comments below. 1. The using of ellipsis: This may make passing a list of items from a collection to the api a bit harder. Suppose that you have a list of topics stored in ArrayList topics; If you want subscribe to all topics in one call, you will have to do: String[] t

Re: New Consumer API discussion

2014-02-13 Thread Neha Narkhede
I think you are saying both, i.e. if you have committed on a partition it returns you that value but if you haven't it does a remote lookup? Correct. The other argument for making committed batched is that commit() is batched, so there is symmetry. position() and seek() are always in memory chan

Re: New Consumer API discussion

2014-02-13 Thread Tom Brown
Conceptually, do the position methods only apply to topics you've subscribed to, or do they apply to all topics in the cluster? E.g., could I retrieve or set the committed position of any partition? The positive use case for having access to all partition information would be to setup an active m

Re: New Consumer API discussion

2014-02-13 Thread Jay Kreps
Hey Neha, I actually wasn't proposing the name TopicOffsetPosition, that was just a typo. I meant TopicPartitionOffset, and I was just referencing what was in the javadoc. So to restate my proposal without the typo, using just the existing classes (that naming is a separate question): long posi

Re: New Consumer API discussion

2014-02-13 Thread Pradeep Gollakota
Hi Neha, 6. It seems like #4 can be avoided by using Map> Long> or Map as the argument type. > > How? lastCommittedOffsets() is independent of positions(). I'm not sure I > understood your suggestion. I think of subscription as you're subscribing to a Set of TopicPartitions. Because the argume

Re: New Consumer API discussion

2014-02-13 Thread Neha Narkhede
2. It returns a list of results. But how can you use the list? The only way to use the list is to make a map of tp=>offset and then look up results in this map (or do a for loop over the list for the partition you want). I recommend that if this is an in-memory check we just do one at a time. E.g.

Re: New Consumer API discussion

2014-02-13 Thread Jay Kreps
Hey guys, One thing that bugs me is the lack of symmetric for the different position calls. The way I see it there are two positions we maintain: the fetch position and the last commit position. There are two things you can do to these positions: get the current value or change the current value.

Re: New Consumer API discussion

2014-02-11 Thread Guozhang Wang
Hi Pradeep: 1. I think TopicPartition is designed as an internal class and the plan was not to expose it to users just for simplicity. We probably will change the commit APIs not exposing them. 2. We have thought about that before, and finally decide to make it as subscribe(topic, partition) pos

Re: New Consumer API discussion

2014-02-11 Thread Imran Rashid
Hi, thanks for sharing this and getting feedback. Sorry I am probably missing something basic, but I'm not sure how a multi-threaded consumer would work. I can imagine its either: a) I just have one thread poll kafka. If I want to process msgs in multiple threads, than I deal w/ that after pol

Re: New Consumer API discussion

2014-02-11 Thread Pradeep Gollakota
Updated thoughts. 1. subscribe(String topic, int... paritions) and unsubscribe(String topic, int... partitions) should be subscribe(TopicPartition... topicPartitions)and unsubscribe(TopicPartition... topicPartitons) 2. Does it make sense to provide a convenience method to subs

Re: New Consumer API discussion

2014-02-11 Thread Pradeep Gollakota
Hi Jay, I apologize for derailing the conversation about the consumer API. We should start a new discussion about hierarchical topics, if we want to keep talking about it. My final thought on the matter is that, hierarchical topics is still an important feature to have in Kafka, because it gives u

Re: New Consumer API discussion

2014-02-11 Thread Jay Kreps
Hey Pradeep, That wiki is fairly old and it predated more flexible subscription mechanisms. In the high-level consumer you currently have wildcard subscription and in the new proposed interface you can actually subscribe based on any logic you want to create a "union" of streams. Personally I thin

Re: New Consumer API discussion

2014-02-10 Thread Pradeep Gollakota
WRT to hierarchical topics, I'm referring to KAFKA-1175. I would just like to think through the implications for the Consumer API if and when we do implement hierarchical topics. For example, in the proposal

Re: New Consumer API discussion

2014-02-10 Thread Neha Narkhede
Thanks for the feedback. Mattijs - - Constructors link to http://kafka.apache.org/documentation.html#consumerconfigs for valid configurations, which lists zookeeper.connect rather than metadata.broker.list, the value for BROKER_LIST_CONFIG in ConsumerConfig. Fixed it to just point to ConsumerConf

Re: New Consumer API discussion

2014-02-10 Thread Pradeep Gollakota
Couple of very quick thoughts. 1. +1 about renaming commit(...) and commitAsync(...) 2. I'd also like to extend the above for the poll() method as well. poll() and pollWithTimeout(long, TimeUnit)? 3. Have you guys given any thought around how this API would be used with hierarchical topics? 4. Wo

Re: New Consumer API discussion

2014-02-10 Thread Jay Kreps
A few items: 1. ConsumerRebalanceCallback a. onPartitionsRevoked would be a better name. b. We should discuss the possibility of splitting this into two interfaces. The motivation would be that in Java 8 single method interfaces can directly take methods which might be more intuitive. c. I