Hey guys,

One thing that bugs me is the lack of symmetry among the different position calls.

The way I see it, there are two positions we maintain: the fetch position and the last commit position. There are two things you can do to these positions: get the current value or change the current value. But the names somewhat obscure this:

Fetch position:
  - No get
  - set by positions(TopicOffsetPosition...)
Committed position:
  - get by List<TopicOffsetPosition> lastCommittedPosition(TopicPartition...)
  - set by commit or commitAsync
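To make the two-by-two picture concrete, here is a minimal sketch of what a fully symmetric get/set pair for each position could look like. Everything in it is made up for illustration: the class name PositionTracker, the four method names, and the use of a plain string as the partition key are assumptions, not part of the proposed consumer API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model: two positions per partition, one getter and
// one setter for each. None of these names come from the consumer javadoc.
final class PositionTracker {
    // Partition key (e.g. "events-0") -> next offset to fetch.
    private final Map<String, Long> fetchPositions = new HashMap<>();
    // Partition key -> last committed offset.
    private final Map<String, Long> committedPositions = new HashMap<>();

    long getFetchPosition(String partition) {
        return require(fetchPositions, partition, "fetch");
    }

    void setFetchPosition(String partition, long offset) {
        fetchPositions.put(partition, offset);
    }

    long getCommittedPosition(String partition) {
        return require(committedPositions, partition, "committed");
    }

    void setCommittedPosition(String partition, long offset) {
        committedPositions.put(partition, offset);
    }

    // Looks up one partition and fails loudly if no position was ever set.
    private static long require(Map<String, Long> positions, String partition, String kind) {
        Long offset = positions.get(partition);
        if (offset == null) {
            throw new IllegalStateException("no " + kind + " position for " + partition);
        }
        return offset;
    }

    public static void main(String[] args) {
        PositionTracker tracker = new PositionTracker();
        tracker.setFetchPosition("events-0", 100L);
        tracker.setCommittedPosition("events-0", 90L);
        System.out.println("fetch=" + tracker.getFetchPosition("events-0")
                + " committed=" + tracker.getCommittedPosition("events-0"));
    }
}
```

The point of the sketch is only the shape: each position has exactly one way to read it and one way to change it, and each call deals with a single partition.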

The lastCommittedPosition is particularly bothersome because:
1. The name is weird and long
2. It returns a list of results. But how can you use the list? The only way to use the list is to make a map of tp=>offset and then look up results in this map (or do a for loop over the list for the partition you want). I recommend that if this is an in-memory check we just do one at a time. E.g. long committedPosition(TopicPosition).

What if we made it:
   long position(TopicPartition tp)
   void seek(TopicOffsetPosition p)
   long committed(TopicPartition tp)
   void commit(TopicOffsetPosition...);

This still isn't terribly consistent, but I think it is better.

I would also like to shorten the name TopicOffsetPosition. Offset and Position are duplicative of each other. So perhaps we could call it a PartitionOffset or a TopicPosition or something like that. In general, class names that are just a concatenation of the fields (e.g. TopicAndPartitionAndOffset) seem kind of lazy to me, since the name doesn't really describe the thing, it just enumerates its parts. But that is more of a nitpick.

-Jay

On Mon, Feb 10, 2014 at 10:54 AM, Neha Narkhede <neha.narkh...@gmail.com> wrote:

> As mentioned in previous emails, we are also working on a re-implementation
> of the consumer. I would like to use this email thread to discuss the
> details of the public API. I would also like us to be picky about this
> public api now so it is as good as possible and we don't need to break it
> in the future.
>
> The best way to get a feel for the API is actually to take a look at the
> javadoc
> <http://people.apache.org/~nehanarkhede/kafka-0.9-consumer-javadoc/doc/kafka/clients/consumer/KafkaConsumer.html>,
> the hope is to get the api docs good enough so that it is self-explanatory.
> You can also take a look at the configs here
> <http://people.apache.org/~nehanarkhede/kafka-0.9-consumer-javadoc/doc/kafka/clients/consumer/ConsumerConfig.html>
>
> Some background info on implementation:
>
> At a high level the primary difference in this consumer is that it removes
> the distinction between the "high-level" and "low-level" consumer. The new
> consumer API is non blocking and instead of returning a blocking iterator,
> the consumer provides a poll() API that returns a list of records. We think
> this is better compared to the blocking iterators since it effectively
> decouples the threading strategy used for processing messages from the
> consumer. It is worth noting that the consumer is entirely single threaded
> and runs in the user thread. The advantage is that it can be easily
> rewritten in less multi-threading-friendly languages. The consumer batches
> data and multiplexes I/O over TCP connections to each of the brokers it
> communicates with, for high throughput. The consumer also allows long poll
> to reduce the end-to-end message latency for low throughput data.
>
> The consumer provides a group management facility that supports the concept
> of a group with multiple consumer instances (just like the current
> consumer). This is done through a custom heartbeat and group management
> protocol transparent to the user. At the same time, it allows users the
> option to subscribe to a fixed set of partitions and not use group
> management at all. The offset management strategy defaults to Kafka based
> offset management and the API provides a way for the user to use a
> customized offset store to manage the consumer's offsets.
>
> A key difference in this consumer also is the fact that it does not depend
> on zookeeper at all.
>
> More details about the new consumer design are here
> <https://cwiki.apache.org/confluence/display/KAFKA/Kafka+0.9+Consumer+Rewrite+Design>
>
> Please take a look at the new API
> <http://people.apache.org/~nehanarkhede/kafka-0.9-consumer-javadoc/doc/kafka/clients/consumer/KafkaConsumer.html>
> and give us any thoughts you may have.
>
> Thanks,
> Neha
>