[ 
https://issues.apache.org/jira/browse/KAFKA-2076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14389652#comment-14389652
 ] 

Jiangjie Qin commented on KAFKA-2076:
-------------------------------------

Got it. Thanks for the explanation, [~jkreps]. The strategy looks very good.

So as a first step, I guess we can add the following interface, given that we 
agree it is useful from the user's point of view. We can implement it with the 
screwy protocols we have now:
Map<TopicPartition, Long> latestOffsetsFor(List<TopicPartition> partitions) 
Later on, when we have a fully baked protocol, we can replace the underlying 
implementation. Considering we are already using OffsetRequest in the new 
consumer (ListOffsetRequest), this probably won't add an extra unwanted 
protocol dependency. 
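
To make the shape of the proposed method concrete, here is a minimal sketch. 
Only the latestOffsetsFor signature comes from the proposal above. The 
interface name LatestOffsetLookup, the in-memory implementation, and the 
stand-in TopicPartition class are assumptions made so the example compiles on 
its own; a real implementation would query the broker rather than a local map.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Stand-in for Kafka's TopicPartition so this sketch compiles without Kafka.
final class TopicPartition {
    final String topic;
    final int partition;

    TopicPartition(String topic, int partition) {
        this.topic = topic;
        this.partition = partition;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof TopicPartition)) return false;
        TopicPartition other = (TopicPartition) o;
        return partition == other.partition && topic.equals(other.topic);
    }

    @Override
    public int hashCode() {
        return 31 * topic.hashCode() + partition;
    }
}

// Hypothetical interface name; the method is the one proposed above.
interface LatestOffsetLookup {
    Map<TopicPartition, Long> latestOffsetsFor(List<TopicPartition> partitions);
}

// Hypothetical implementation backed by a fixed map instead of a broker request.
final class InMemoryLatestOffsetLookup implements LatestOffsetLookup {
    private final Map<TopicPartition, Long> logEndOffsets;

    InMemoryLatestOffsetLookup(Map<TopicPartition, Long> logEndOffsets) {
        this.logEndOffsets = new HashMap<>(logEndOffsets);
    }

    @Override
    public Map<TopicPartition, Long> latestOffsetsFor(List<TopicPartition> partitions) {
        Map<TopicPartition, Long> result = new HashMap<>();
        for (TopicPartition tp : partitions) {
            Long offset = logEndOffsets.get(tp);
            // Partitions with no known offset are simply left out of the result.
            if (offset != null) {
                result.put(tp, offset);
            }
        }
        return result;
    }
}
```

Keeping the method behind an interface like this is what would let the 
underlying protocol be swapped later without touching callers.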

As for the corner cases you mentioned:
1. I ask for the log end offset for a partition I am not subscribed to. - We 
probably want to support this; otherwise people will just subscribe first and 
then get the log end offset, which adds unnecessary steps for the user.
2. I ask for the log end offset for a partition I am subscribed to but for 
which a fetch request has not yet been issued. - My implementation proposal 
covers this, but we might want to add a back-off time for fetching the LEO 
from the broker: say, if the LEO in the local map has not been updated for 
more than Log.end.offset.fetch.backoff.ms, we issue another request to the 
broker. For a consumer that is consuming normally, hopefully no additional 
requests will be sent for the LEO.
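
The back-off idea in case 2 could look roughly like the sketch below. This is 
only an illustration under assumptions: the class name LogEndOffsetCache, the 
brokerLookup callback, and passing the clock in as nowMs are all hypothetical, 
and the partition key is left generic so the example does not depend on Kafka 
classes. A partition with no local entry (case 1, or case 2 before the first 
fetch) falls through to the broker lookup as well.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToLongFunction;

// Hypothetical local LEO map with a back-off before re-asking the broker.
final class LogEndOffsetCache<K> {
    private static final class Entry {
        final long offset;
        final long updatedAtMs;

        Entry(long offset, long updatedAtMs) {
            this.offset = offset;
            this.updatedAtMs = updatedAtMs;
        }
    }

    private final Map<K, Entry> entries = new HashMap<>();
    private final long backoffMs;
    private final ToLongFunction<K> brokerLookup;

    // brokerLookup stands in for an explicit offset request to the broker.
    LogEndOffsetCache(long backoffMs, ToLongFunction<K> brokerLookup) {
        this.backoffMs = backoffMs;
        this.brokerLookup = brokerLookup;
    }

    // Called for each partition in a fetch response to refresh the local map.
    void onFetchResponse(K partition, long logEndOffset, long nowMs) {
        entries.put(partition, new Entry(logEndOffset, nowMs));
    }

    // Returns the local value unless it is missing or older than backoffMs.
    long logEndOffset(K partition, long nowMs) {
        Entry entry = entries.get(partition);
        if (entry == null || nowMs - entry.updatedAtMs > backoffMs) {
            long fresh = brokerLookup.applyAsLong(partition);
            entry = new Entry(fresh, nowMs);
            entries.put(partition, entry);
        }
        return entry.offset;
    }
}
```

With this shape, a consumer that keeps fetching refreshes the map through 
onFetchResponse and never triggers brokerLookup, which matches the hope that 
no additional requests are sent in the normal case.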

Any thoughts?

> Add an API to new consumer to allow user get high watermark of partitions.
> --------------------------------------------------------------------------
>
>                 Key: KAFKA-2076
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2076
>             Project: Kafka
>          Issue Type: Improvement
>            Reporter: Jiangjie Qin
>
> We have a use case where the user wants to know how far behind it is on a 
> particular partition at startup. Currently each fetch response carries the 
> high watermark for each partition, but we only keep a global max-lag metric. 
> It would be better to keep a record of the high watermark per partition and 
> update it on each fetch response. We can then add a new API to let the user 
> query the high watermark.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
