Hi Mickael, Thanks for your response and apologies for a huge delay in mine.
My thinking is that any partition could go stale if there are no records being produced into it. If enough of such partitions are present and are owned by a single MM task, an OOM could happen. Regarding the scenario where the TTL value is lower than the refresh interval - I believe that this is an edge that we need to document and prevent against, for example either failing to start on such a combination or resorting to a default value that would satisfy the constraint and logging an error. Thanks, Elkhan On Thu, 8 Feb 2024 at 14:17, Mickael Maison <mickael.mai...@gmail.com> wrote: > Hi, > > Thanks for the updates. > I'm wondering whether we really need the ttl eviction mechanism. The > motivation is to "avoid storing stale LRO entries which can cause an > eventual OOM error". How could it contain stake entries? I would > expect its cache to only contain entries for partitions assigned to > the task that owns it. Also what is the expected behavior if there's > no available LRO in the cache? If we keep this mechanism what happens > if its value is lower than > replication.record.lag.metric.refresh.interval? > > Thanks, > Mickael > > On Mon, Feb 5, 2024 at 5:23 PM Elxan Eminov <elxanemino...@gmail.com> > wrote: > > > > Hi Mickael! > > Any further thoughts on this? > > > > Thanks, > > Elkhan > > > > On Thu, 18 Jan 2024 at 11:53, Mickael Maison <mickael.mai...@gmail.com> > > wrote: > > > > > Hi Elxan, > > > > > > Thanks for the updates. > > > > > > We used dots to separate words in configuration names, so I think > > > replication.offset.lag.metric.last-replicated-offset.ttl should be > > > named replication.offset.lag.metric.last.replicated.offset.ttl > > > instead. > > > > > > About the names of the metrics, fair enough if you prefer keeping the > > > replication prefix. Out of the alternatives you mentioned, I think I > > > prefer replication-record-lag. I think the metrics and configuration > > > names should match too. Let's see what the others think about it. > > > > > > Thanks, > > > Mickael > > > > > > On Mon, Jan 15, 2024 at 9:50 PM Elxan Eminov <elxanemino...@gmail.com> > > > wrote: > > > > > > > > Apologies, forgot to reply on your last comment about the metric > name. > > > > I believe both replication-lag and record-lag are a little too > abstract - > > > > what do you think about either leaving it as replication-offset-lag > or > > > > renaming to replication-record-lag? > > > > > > > > Thanks > > > > > > > > On Wed, 10 Jan 2024 at 15:31, Mickael Maison < > mickael.mai...@gmail.com> > > > > wrote: > > > > > > > > > Hi Elxan, > > > > > > > > > > Thanks for the KIP, it looks like a useful addition. > > > > > > > > > > Can you add to the KIP the default value you propose for > > > > > replication.lag.metric.refresh.interval? In MirrorMaker most > interval > > > > > configs can be set to -1 to disable them, will it be the case for > this > > > > > new feature or will this setting only accept positive values? > > > > > I also wonder if replication-lag, or record-lag would be clearer > names > > > > > instead of replication-offset-lag, WDYT? > > > > > > > > > > Thanks, > > > > > Mickael > > > > > > > > > > On Wed, Jan 3, 2024 at 6:15 PM Elxan Eminov < > elxanemino...@gmail.com> > > > > > wrote: > > > > > > > > > > > > Hi all, > > > > > > Here is the vote thread: > > > > > > https://lists.apache.org/thread/ftlnolcrh858dry89sjg06mdcdj9mrqv > > > > > > > > > > > > Cheers! > > > > > > > > > > > > On Wed, 27 Dec 2023 at 11:23, Elxan Eminov < > elxanemino...@gmail.com> > > > > > wrote: > > > > > > > > > > > > > Hi all, > > > > > > > I've updated the KIP with the details we discussed in this > thread. > > > > > > > I'll call in a vote after the holidays if everything looks > good. > > > > > > > Thanks! > > > > > > > > > > > > > > On Sat, 26 Aug 2023 at 15:49, Elxan Eminov < > > > elxanemino...@gmail.com> > > > > > > > wrote: > > > > > > > > > > > > > >> Relatively minor change with a new metric for MM2 > > > > > > >> > > > > > > >> > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-971%3A+Expose+replication-offset-lag+MirrorMaker2+metric > > > > > > >> > > > > > > > > > > > > > > > >