This looks great: big improvements for the list offset protocol, which is currently quite odd.
One minor thing. I think the old v0 list offsets request also gave you the high watermark; it kind of shoves it in as the last thing in the array of offsets. This is used internally to implement seekToEnd(), iirc. How would that work once v0 is removed?

Related, the wiki says: "Another related feature missing in KafkaConsumer is the access of partitions' high watermark. Typically, users only need the high watermark in order to get the per partition lag. This seems more suitable to be exposed through the metrics."

The obvious usage is computing lag for sure, and I agree that is really more a metric than anything else, but I think that is not the only usage. Here is a use case I think is quite important that requires knowing the high watermark:

Say you want to implement some kind of batch process that wakes up every 5 minutes or every hour or once a day, processes all the messages, and then goes back to sleep. The naive way to do that would be to poll() until you don't get any more records, but this is broken in two minor ways: first, maybe you didn't get records because you are rebalancing, and second, this might never happen if new records are always getting written. A better approach is for your process, when it begins, to look at the current end of the log and process only up to that offset. This is important for Kafka Streams or anything else that wants to have a kind of batch-like mode.

Technically you can do this by seeking to the end, checking your position, then starting over, as people do today. But I think we can agree that is kind of silly.

An alternative would be to rename TimestampOffset to something like PartitionOffsets and have it carry both the timestamp and offset as well as the beginning offset and high watermark for the partition. The underlying protocol would need these two as well.
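To make the batch-mode idea concrete, here is a minimal sketch of "look at the end of the log once, then process only up to that offset". It deliberately does not use the Kafka client: InMemoryPartition and BoundedBatchRun are hypothetical names invented for illustration, and the in-memory list stands in for a single partition. With a real consumer, the snapshot taken on the first line of run() is exactly the value this thread is asking the API to expose.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical stand-in for one partition's log; NOT the Kafka client API. */
final class InMemoryPartition {
    private final List<String> records = new ArrayList<>();

    void append(String value) {
        records.add(value);
    }

    /** Offset one past the last record, i.e. the current end of the log. */
    long endOffset() {
        return records.size();
    }

    /** Returns a copy of up to maxRecords records starting at offset. */
    List<String> read(long offset, int maxRecords) {
        int from = (int) Math.min(offset, records.size());
        int to = Math.min(from + maxRecords, records.size());
        return new ArrayList<>(records.subList(from, to));
    }
}

/** Hypothetical batch runner: stops at the end offset captured at start. */
final class BoundedBatchRun {
    /** Processes offsets [startOffset, end-at-entry) and returns the next offset to resume from. */
    static long run(InMemoryPartition partition, long startOffset,
                    int maxPollRecords, Consumer<String> handler) {
        if (startOffset < 0 || maxPollRecords <= 0) {
            throw new IllegalArgumentException("startOffset must be >= 0 and maxPollRecords > 0");
        }
        // Snapshot the end of the log once. Records appended later are left for the next run.
        long stopAt = partition.endOffset();
        long position = startOffset;
        // Terminate on reaching the snapshot, not on an empty read.
        while (position < stopAt) {
            int wanted = (int) Math.min(maxPollRecords, stopAt - position);
            List<String> batch = partition.read(position, wanted);
            for (String record : batch) {
                handler.accept(record);
            }
            position += batch.size();
        }
        return position;
    }
}
```

The loop condition is the point of the sketch: the run ends when the position reaches the snapshot, so it finishes even if producers keep writing, and an empty fetch is never mistaken for "done".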
Cheers,

-Jay

On Tue, Aug 30, 2016 at 8:38 PM, Becket Qin <becket....@gmail.com> wrote:

> Hi Kafka devs,
>
> I created KIP-79 to allow consumer to precisely query the offsets based on
> timestamp.
>
> In short we propose to :
> 1. add a ListOffsetRequest/ListOffsetResponse v1, and
> 2. add an offsetForTime() method in new consumer.
>
> The KIP wiki is the following:
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65868090
>
> Comments are welcome.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>