My approach is to create another thread to regularly request and update the end offset of each partition for the `keySet` in the collection `lastReplicatedSourceOffsets` mentioned by your kip (if there is no update for a long time, it will be removed from `lastReplicatedSourceOffsets`). Obviously, such processing makes the calculation of the partition offset lag less real-time and accurate. But this also meets our needs, because we need the partition offset lag to analyze the replication performance of the task and which task may have performance problems; and if you monitor the overall offset lag of the topic, then using the "kafka_consumer_consumer_fetch_manager_metrics_records_lag" metric will be more real-time and accurate. This is my suggestion. I hope to be able to throw bricks and start jade, we can come up with a better solution.
best, hudeqi "Elxan Eminov" <elxanemino...@gmail.com>写道: > @huqedi replying to your comment on the PR ( > https://github.com/apache/kafka/pull/14077#discussion_r1314592488), quote: > > "I guess we have a disagreement about lag? My understanding of lag is: the > real LEO of the source cluster partition minus the LEO that has been > written to the target cluster. It seems that your definition of lag is: the > lag between the mirror task getting data from consumption and writing it to > the target cluster?" > > Yes, this is the case. I've missed the fact that the consumer itself might > be lagging behind the actual data in the partition. > I believe your definition of the lag is more precise, but: > Implementing it this way will come at the cost of an extra listOffsets > request, introducing the overhead that you mentioned in your initial > comment. > > If you have enough insights about this, what would you say is the chances > of the task consumer lagging behind the LEO of the partition? > Are they big enough to justify the extra call to listOffsets? > @Viktor, any thoughts? > > Thanks, > Elkhan > > On Mon, 4 Sept 2023 at 09:36, Elxan Eminov <elxanemino...@gmail.com> wrote: > > > I already have the PR for this so if it will make it easier to discuss, > > feel free to take a look: https://github.com/apache/kafka/pull/14077 > > > > On Mon, 4 Sept 2023 at 09:17, hudeqi <16120...@bjtu.edu.cn> wrote: > > > >> But does the offset of the last `ConsumerRecord` obtained in poll not > >> only represent the offset of this record in the source cluster? It seems > >> that it cannot represent the LEO of the source cluster for this partition. > >> I understand that the offset lag introduced here should be the LEO of the > >> source cluster minus the offset of the last record to be polled? > >> > >> best, > >> hudeqi > >> > >> > >> > -----原始邮件----- > >> > 发件人: "Elxan Eminov" <elxanemino...@gmail.com> > >> > 发送时间: 2023-09-04 14:52:08 (星期一) > >> > 收件人: dev@kafka.apache.org > >> > 抄送: > >> > 主题: Re: [DISCUSS] KIP-971 Expose replication-offset-lag MirrorMaker2 > >> metric > >> > > >> </elxanemino...@gmail.com> > > > >