[jira] [Commented] (KAFKA-9430) Tighten up lag estimates when source topic optimization is on

Guozhang Wang (Jira) Tue, 04 Feb 2020 16:22:28 -0800


    [ 
https://issues.apache.org/jira/browse/KAFKA-9430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17030241#comment-17030241
 ]


Guozhang Wang commented on KAFKA-9430:
--------------------------------------

Your analysis is correct. I think it is just that the terminology people are 
using is a bit different. Just to clarify:

The lag = log-end-offset - state-current-offset, where log-end-offset is always 
queried via "adminClient.listOffsets", and the state-current-offset is as you 
mentioned above.
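
As a rough sketch of that subtraction (not the actual Streams code): the class 
`LagEstimate`, the method `computeLags`, and the use of plain strings as 
partition keys are all made up for illustration. The end offsets are assumed to 
have been fetched already (in Streams that would be via 
"adminClient.listOffsets"). Clamping at zero and treating a missing current 
offset as 0 are assumptions of this sketch as well.

```java
import java.util.HashMap;
import java.util.Map;

public class LagEstimate {

    // Hypothetical helper: lag = log-end-offset - state-current-offset, per partition.
    // Partitions are plain strings here; real code would key on TopicPartition.
    public static Map<String, Long> computeLags(Map<String, Long> logEndOffsets,
                                                Map<String, Long> stateCurrentOffsets) {
        Map<String, Long> lags = new HashMap<>();
        for (Map.Entry<String, Long> entry : logEndOffsets.entrySet()) {
            // Assumption: a partition with no recorded state offset starts at 0.
            long current = stateCurrentOffsets.getOrDefault(entry.getKey(), 0L);
            // Assumption: never report a negative lag.
            lags.put(entry.getKey(), Math.max(0L, entry.getValue() - current));
        }
        return lags;
    }

    public static void main(String[] args) {
        Map<String, Long> endOffsets = Map.of("store-changelog-0", 100L, "store-changelog-1", 50L);
        Map<String, Long> currentOffsets = Map.of("store-changelog-0", 40L);
        System.out.println(computeLags(endOffsets, currentOffsets));
    }
}
```

If the log-end-offset used is that of the source topic, as in the optimized 
case described in the ticket, the result of this subtraction is an upper bound 
on the lag rather than the exact value.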

In the PR for KAFKA-9113, this has been refactored as follows:

https://github.com/guozhangwang/kafka/blob/k9113-base/streams/src/main/java/org/apache/kafka/streams/KafkaStreams.java#L1220

where the `changelogOffsets` wraps the logic of what you've summarized here.


> Tighten up lag estimates when source topic optimization is on 
> --------------------------------------------------------------
>
>                 Key: KAFKA-9430
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9430
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>    Affects Versions: 2.5.0
>            Reporter: Vinoth Chandar
>            Assignee: Vinoth Chandar
>            Priority: Blocker
>
> Right now, we use _endOffsets_ of the source topic for the computation. For 
> "optimized" changelogs, this will be wrong, strictly speaking, but it's an 
> over-estimate (which seems better than an under-estimate), and it's also 
> still an apples-to-apples comparison, since all replicas would use the same 
> upper bound to compute their lags, so the "pick the freshest" replica is 
> still going to pick the right one.
> The current implementation is technically correct, within the documented 
> behavior that the result is an "estimate", but I marked it as a blocker to be 
> sure that we revisit it after ongoing work to refactor the task management in 
> Streams is complete. If it becomes straightforward to tighten up the 
> estimate, we should go ahead and do it. Otherwise, we can downgrade the 
> priority of the ticket.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
