Hi Viktor,

SVV1. It's not easy to provide a number, but yes, it does scale with the
number of replicated topic partitions. Enabling this feature adds the
overhead of an extra consumer and allocates memory for an offset-sync index
for each partition. The index is limited to 64 entries. I could give an upper
bound on the memory usage as a function of the number of replicated
topic-partitions, but I'm not sure it would be useful for users, or where
exactly to document it. Wdyt?
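
As a rough illustration of what such an upper bound could look like, here is
a back-of-the-envelope sketch. Only the 64-entry limit comes from the above;
the per-entry size (ASSUMED_BYTES_PER_ENTRY) and the class and method names
are hypothetical placeholders, not measured or taken from the implementation.

```java
public class OffsetSyncIndexMemoryEstimate {
    // From the discussion: the per-partition index is limited to 64 entries.
    static final int ENTRIES_PER_PARTITION = 64;

    // Assumption (not measured): one entry holds an upstream and a downstream
    // offset (2 x 8 bytes) plus roughly 24 bytes of JVM object overhead.
    static final int ASSUMED_BYTES_PER_ENTRY = 40;

    // Upper bound grows linearly with the number of replicated partitions.
    static long upperBoundBytes(long replicatedPartitions) {
        long bytesPerPartition = (long) ENTRIES_PER_PARTITION * ASSUMED_BYTES_PER_ENTRY;
        return Math.multiplyExact(replicatedPartitions, bytesPerPartition);
    }

    public static void main(String[] args) {
        long partitions = 10_000; // hypothetical cluster size
        System.out.println(upperBoundBytes(partitions) + " bytes");
    }
}
```

Under these assumptions the bound is linear in the partition count, so it
could be documented as a simple "bytes per replicated partition" figure.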

No worries, thanks for looking at the KIP!
Daniel

Viktor Somogyi-Vass <viktor.somo...@cloudera.com.invalid> wrote
(on Mon, Oct 28, 2024, 17:07):

> Hi Daniel,
>
> SVV1. Fair points about the performance impact. The next question is
> whether we can quantify it somehow, i.e. does it scale with the number of
> topics to reverse checkpoint, the offsets emitted, etc.?
>
> I'll do one more pass on the KIP in the following days but I wanted to
> reply to you with what I have so far to keep this going.
>
> Best,
> Viktor
>
> On Fri, Oct 25, 2024 at 5:32 PM Dániel Urbán <urb.dani...@gmail.com>
> wrote:
>
> > Hi,
> >
> > One more update. As I was working on the PR, I realized that the only way
> > to support IdentityReplicationPolicy is to add an extra topic filter to
> > the checkpointing. I updated the KIP accordingly.
> > I also opened a draft PR to demonstrate the proposed changes:
> > https://github.com/apache/kafka/pull/17593
> >
> > Daniel
> >
> > Dániel Urbán <urb.dani...@gmail.com> wrote (on Thu, Oct 24, 2024,
> > 15:22):
> >
> > > Hi all,
> > > Just a bump/minor update here:
> > > As I've started working on a POC of the proposed solution, I've
> > > realised that the hard requirement related to the ReplicationPolicy
> > > implementation can be eliminated; I updated the KIP accordingly.
> > > Daniel
> > >
> > > Dániel Urbán <urb.dani...@gmail.com> wrote (on Mon, Oct 21, 2024,
> > > 16:18):
> > >
> > >> Hi Mickael,
> > >> Good point, I renamed the KIP and this thread:
> > >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1098%3A+Reverse+Checkpointing+in+MirrorMaker
> > >> Thank you,
> > >> Daniel
> > >>
> > >> Mickael Maison <mickael.mai...@gmail.com> wrote (on Mon, Oct 21, 2024,
> > >> 15:22):
> > >>
> > >>> Hi Daniel,
> > >>>
> > >>> I've not had time to take a close look at the KIP but my initial
> > >>> feedback would be to adjust the name to make it clear it's about
> > >>> MirrorMaker.
> > >>> The word "checkpoint" has several meanings in Kafka and from the
> > >>> current KIP name it's not clear if it's about KRaft, Streams or
> > >>> Connect.
> > >>>
> > >>> Thanks,
> > >>> Mickael
> > >>>
> > >>> On Mon, Oct 21, 2024 at 2:55 PM Dániel Urbán <urb.dani...@gmail.com>
> > >>> wrote:
> > >>> >
> > >>> > Hi Viktor,
> > >>> >
> > >>> > Thank you for the comments!
> > >>> >
> > >>> > SVV1: I think the feature has some performance implications. If
> > >>> > reverse checkpointing is enabled, task startup will possibly be
> > >>> > slower, since it will need to consume from a second offset-syncs
> > >>> > topic; it will also use more memory, to keep the second offset-sync
> > >>> > history. Additionally, it is possible to have an offset-syncs topic
> > >>> > present without an actual, opposite flow being active. I think only
> > >>> > users can tell if reverse checkpointing should be active, and they
> > >>> > should be the ones opting in to the higher resource usage.
> > >>> >
> > >>> > SVV2: I mention the DefaultReplicationPolicy to provide examples. I
> > >>> > don't think it is required. The actual requirement related to the
> > >>> > ReplicationPolicy is that the policy should be able to correctly tell
> > >>> > which topic was replicated from which cluster. Because of this,
> > >>> > IdentityReplicationPolicy would not work, but
> > >>> > DefaultReplicationPolicy, or any other ReplicationPolicy
> > >>> > implementation with a correctly implemented "topicSource" method,
> > >>> > should work. I will make an explicit note of this in the KIP.
> > >>> >
> > >>> > Thank you,
> > >>> > Daniel
> > >>> >
> > >>> > Viktor Somogyi-Vass <viktor.somo...@cloudera.com.invalid> wrote
> > >>> > (on Fri, Oct 18, 2024, 17:28):
> > >>> >
> > >>> > > Hey Dan,
> > >>> > >
> > >>> > > I think this is a very useful idea. Two questions:
> > >>> > >
> > >>> > > SVV1: Do you think we need the feature flag at all? I know that
> > >>> > > not having this flag may technically render the KIP unnecessary
> > >>> > > (however, it may still be useful to discuss this topic and create
> > >>> > > a consensus). As you wrote in the KIP, we may be able to look up
> > >>> > > the target and source topics, and if we can do this, we can
> > >>> > > probably detect if the replication is one-way or prefixless
> > >>> > > (identity). So that means we don't need this flag to control when
> > >>> > > we want to use this. Then it is really just there to act as
> > >>> > > something that can turn the feature on and off if needed, but I'm
> > >>> > > not really sure if there is a great risk in just enabling this by
> > >>> > > default. If we really just turn back the B -> A checkpoints and
> > >>> > > save them in the A -> B, then maybe it's not too risky and users
> > >>> > > would get this immediately by just upgrading.
> > >>> > >
> > >>> > > SVV2: You write that we need DefaultReplicationPolicy to use this
> > >>> > > feature, but most of the functionality is there on interface level
> > >>> > > in ReplicationPolicy. Is there anything that is missing from
> > >>> > > there, and if so, what do you think about pulling it into the
> > >>> > > interface? If this improvement only works with the default
> > >>> > > replication policy, then it's somewhat limiting, as users may have
> > >>> > > their own policy for various reasons, but if we can make it work
> > >>> > > on the interface level, then we could provide this feature to
> > >>> > > everyone. Of course there can be replication policies like the
> > >>> > > identity one that by design disallow this feature, but for that,
> > >>> > > see my previous point.
> > >>> > >
> > >>> > > Best,
> > >>> > > Viktor
> > >>> > >
> > >>> > > On Fri, Oct 18, 2024 at 3:30 PM Dániel Urbán <urb.dani...@gmail.com>
> > >>> > > wrote:
> > >>> > >
> > >>> > > > Hi everyone,
> > >>> > > >
> > >>> > > > I'd like to start the discussion on KIP-1098: Reverse
> > >>> > > > Checkpointing (
> > >>> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1098%3A+Reverse+Checkpointing
> > >>> > > > )
> > >>> > > > which aims to minimize message reprocessing for consumers in
> > >>> > > > failbacks.
> > >>> > > >
> > >>> > > > TIA,
> > >>> > > > Daniel
> > >>> > > >
> > >>> > >
> > >>>
> > >>
> >
>
