Omnia, have we considered just adding methods to ReplicationPolicy? I'm
reluctant to add a new class because, as Mickael points out, we'd need to
carry it around in client code.

Ryanne

On Fri, Feb 19, 2021 at 8:31 AM Mickael Maison <mickael.mai...@gmail.com>
wrote:

> Hi Omnia,
>
> Thanks for the clarifications.
>
> - I'm still a bit uneasy with the overlap between these 2 methods as
> currently `ReplicationPolicy.isInternalTopic` already handles MM2
> internal topics. Should we make it only handle Kafka internal topics
> and `isMM2InternalTopic()` only handle MM2 topics?
>
> - I'm not sure I understand what this method is used for. There are no
> such methods for the other 2 topics (offset-sync and heartbeat). Also
> what happens if there are other MM2 instances using different naming
> schemes in the same cluster. Do all instances have to know about the
> other naming schemes? What are the expected issues if they don't?
>
> - RemoteClusterUtils is a client-side utility so it does not have
> access to the MM2 configuration. Since this new API can affect the
> name of the checkpoint topic, it will need to be used client-side too
> so users can find the checkpoint topic name. I had to realized this
> was the case.
>
> Thanks
>
> On Mon, Feb 15, 2021 at 9:33 PM Omnia Ibrahim <o.g.h.ibra...@gmail.com>
> wrote:
> >
> > Hi Mickael, did you have some time to check my answer?
> >
> > On Thu, Jan 21, 2021 at 10:10 PM Omnia Ibrahim <o.g.h.ibra...@gmail.com>
> wrote:
> >>
> >> Hi Mickael,
> >> Thanks for taking another look into the KIP, regards your questions
> >>
> >> - I believe we need both "isMM2InternalTopic" and
> `ReplicationPolicy.isInternalTopic`  as `ReplicationPolicy.isInternalTopic`
> does check if a topic is Kafka internal topic, while `isMM2InternalTopic`
> is just focusing if a topic is MM2 internal topic or not(which is
> heartbeat/checkpoint/offset-sync). The fact that the default for MM2
> internal topics matches "ReplicationPolicy.isInternalTopic" will not be an
> accurate assumption anymore once we implement this KIP.
> >>
> >> - "isCheckpointTopic" will detect all checkpoint topics for all MM2
> instances this is needed for "MirrorClient.checkpointTopics" which
> originally check if the topic name ends with CHECKPOINTS_TOPIC_SUFFIX. So
> this method just to keep the same functionality that originally exists in
> MM2
> >>
> >> - "checkpointTopic" is used in two places 1. At topic creation in
> "MirrorCheckpointConnector.createInternalTopics" which use
> "sourceClusterAlias() + CHECKPOINTS_TOPIC_SUFFIX" and 2. At
> "MirrorClient.remoteConsumerOffsets" which is called by
> "RemoteClusterUtils.translateOffsets"  the cluster alias here referred to
> as "remoteCluster" where the topic name is "remoteClusterAlias +
> CHECKPOINTS_TOPIC_SUFFIX"  (which is an argument in RemoteClusterUtils, not
> a config) This why I called the variable cluster instead of source and
> instead of using the config to figure out the cluster aliases from config
> as we use checkpoints to keep `RemoteClusterUtils` compatible for existing
> users. I see a benefit of just read the config a find out the cluster
> aliases but on the other side, I'm not sure why "RemoteClusterUtils"
> doesn't get the name of the cluster from the properties instead of an
> argument, so I decided to keep it just for compatibility.
> >>
> >> Hope these answer some of your concerns.
> >> Best
> >> Omnia
> >>
> >> On Thu, Jan 21, 2021 at 3:37 PM Mickael Maison <
> mickael.mai...@gmail.com> wrote:
> >>>
> >>> Hi Omnia,
> >>>
> >>> Thanks for the updates. Sorry for the delay but I have a few more
> >>> small questions about the API:
> >>> - Do we really need "isMM2InternalTopic()"? There's already
> >>> "ReplicationPolicy.isInternalTopic()". If so, we need to explain the
> >>> difference between these 2 methods.
> >>>
> >>> - Is "isCheckpointTopic()" expected to detect all checkpoint topics
> >>> (for all MM2 instances) or only the ones for this connector instance.
> >>> If it's the later, I wonder if we could do without the method. As this
> >>> interface is only called by MM2, we could first call
> >>> "checkpointTopic()" and check if that's equal to the topic we're
> >>> checking. If it's the former, we don't really know topic names other
> >>> MM2 instances may be using!
> >>>
> >>> - The 3 methods returning topic names have different APIs:
> >>> "heartbeatsTopic()" takes no arguments, "offsetSyncTopic()" takes the
> >>> target cluster alias and "checkpointTopic()" takes "clusterAlias"
> >>> (which one is it? source or target?). As the interface extends
> >>> Configurable, maybe we could get rid of all the arguments and use the
> >>> config to find the cluster aliases. WDYT?
> >>>
> >>> These are minor concerns, just making sure I fully understand how the
> >>> API is expected to be used. Once these are cleared, I'll be happy to
> >>> vote for this KIP.
> >>>
> >>> Thanks
> >>>
> >>> On Fri, Jan 8, 2021 at 12:06 PM Omnia Ibrahim <o.g.h.ibra...@gmail.com>
> wrote:
> >>> >
> >>> > Hi Mickael,
> >>> > Did you get time to review the changes to the KIP? If you okay with
> it could you vote for the KIP here ttps://
> www.mail-archive.com/dev@kafka.apache.org/msg113575.html?
> >>> > Thanks
> >>> >
> >>> > On Thu, Dec 10, 2020 at 2:19 PM Omnia Ibrahim <
> o.g.h.ibra...@gmail.com> wrote:
> >>> >>
> >>> >> Hi Mickael,
> >>> >> 1) That's right the interface and default implementation will in
> mirror-connect
> >>> >> 2) Renaming the interface should be fine too especially if you
> planning to move other functionality related to the creation there, I can
> edit this
> >>> >>
> >>> >> if you are okay with that please vote for the KIP here
> https://www.mail-archive.com/dev@kafka.apache.org/msg113575.html
> >>> >>
> >>> >>
> >>> >> Thanks
> >>> >> Omnia
> >>> >> On Thu, Dec 10, 2020 at 12:58 PM Mickael Maison <
> mickael.mai...@gmail.com> wrote:
> >>> >>>
> >>> >>> Hi Omnia,
> >>> >>>
> >>> >>> Thank you for the reply, it makes sense.
> >>> >>>
> >>> >>> A couple more comments:
> >>> >>>
> >>> >>> 1) I'm assuming the new interface and default implementation will
> be
> >>> >>> in the mirror-client project? as the names of some of these topics
> are
> >>> >>> needed by RemoteClusterUtils on the client-side.
> >>> >>>
> >>> >>> 2) I'm about to open a KIP to specify where the offset-syncs topic
> is
> >>> >>> created by MM2. In restricted environments, we'd prefer MM2 to only
> >>> >>> have read access to the source cluster and have the offset-syncs on
> >>> >>> the target cluster. I think allowing to specify the cluster where
> to
> >>> >>> create that topic would be a natural extension of the interface you
> >>> >>> propose here.
> >>> >>>
> >>> >>> So I wonder if your interface could be named InternalTopicsPolicy?
> >>> >>> That's a bit more generic than InternalTopicNamingPolicy. That
> would
> >>> >>> also match the configuration setting, internal.topics.policy.class,
> >>> >>> you're proposing.
> >>> >>>
> >>> >>> Thanks
> >>> >>>
> >>> >>> On Thu, Dec 3, 2020 at 10:15 PM Omnia Ibrahim <
> o.g.h.ibra...@gmail.com> wrote:
> >>> >>> >
> >>> >>> > Hi Mickael,
> >>> >>> > Thanks for your feedback!
> >>> >>> > Regards your question about having more configurations, I
> considered adding
> >>> >>> > configuration per each topic however this meant adding more
> configurations
> >>> >>> > for MM2 which already have so many, also the more complicated
> and advanced
> >>> >>> > replication pattern you have between clusters the more
> configuration lines
> >>> >>> > will be added to your MM2 config which isn't going to be pretty
> if you
> >>> >>> > don't have the same topics names across your clusters.
> >>> >>> >
> >>> >>> > Also, it added more complexity to the implementation as MM2 need
> to
> >>> >>> > 1- identify if a topic is checkpoints so we could list the
> checkpoints
> >>> >>> > topics in MirrorMaker 2 utils as one cluster could have X numbers
> >>> >>> > checkpoints topics if it's connected to X clusters, this is done
> right now
> >>> >>> > by listing any topic with suffix `.checkpoints.internal`. This
> could be
> >>> >>> > done by add `checkpoints.topic.suffix` config but this would
> make an
> >>> >>> > assumption that checkpoints will always have a suffix also
> having a suffix
> >>> >>> > means that we may need a separator as well to concatenate this
> suffix with
> >>> >>> > a prefix to identify source cluster name.
> >>> >>> > 2- identify if a topic is internal, so it shouldn't be
> replicated or track
> >>> >>> > checkpoints for it, right now this is relaying on disallow
> topics with
> >>> >>> > `.internal` suffix to be not replicated and not tracked in
> checkpoints but
> >>> >>> > with making topics configurable we need a way to define what is
> an internal
> >>> >>> > topic. This could be done by making using a list of all internal
> topics
> >>> >>> > have been entered to the configuration.
> >>> >>> >
> >>> >>> > So having an interface seemed easier and also give more
> flexibility for
> >>> >>> > users to define their own topics name, define what is internal
> topic means,
> >>> >>> > how to find checkpoints topics and it will be one line config
> for each
> >>> >>> > herder, also it more consistence with MM2 code as MM2 config have
> >>> >>> > TopicFilter, ReplicationPolicy, GroupFilter, etc as interface
> and they can
> >>> >>> > be overridden by providing a custom implementation for them or
> have some
> >>> >>> > config that change their default implementations.
> >>> >>> >
> >>> >>> > Hope this answer your question. I also updated the KIP to add
> this to the
> >>> >>> > rejected solutions.
> >>> >>> >
> >>> >>> >
> >>> >>> > On Thu, Dec 3, 2020 at 3:19 PM Mickael Maison <
> mickael.mai...@gmail.com>
> >>> >>> > wrote:
> >>> >>> >
> >>> >>> > > Hi Omnia,
> >>> >>> > >
> >>> >>> > > Thanks for the KIP. Indeed being able to configure MM2's
> internal
> >>> >>> > > topic names would be a nice improvement.
> >>> >>> > >
> >>> >>> > > Looking at the KIP, I was surprised you propose an interface
> to allow
> >>> >>> > > users to specify names. Have you considered making names
> changeable
> >>> >>> > > via configurations? If so, we should definitely mention it in
> the
> >>> >>> > > rejected alternatives as it's the first method that comes to
> mind.
> >>> >>> > >
> >>> >>> > > I understand an interface gives a lot of flexibility but I'd
> expect
> >>> >>> > > topic names to be relatively simple and known in advance in
> most
> >>> >>> > > cases.
> >>> >>> > >
> >>> >>> > > I've not checked all use cases but something like below felt
> appropriate:
> >>> >>> > > clusters = primary,backup
> >>> >>> > > primary->backup.offsets-sync.topic=backup.mytopic-offsets-sync
> >>> >>> > >
> >>> >>> > > On Tue, Dec 1, 2020 at 3:36 PM Omnia Ibrahim <
> o.g.h.ibra...@gmail.com>
> >>> >>> > > wrote:
> >>> >>> > > >
> >>> >>> > > > Hey everyone,
> >>> >>> > > > Please take a look at KIP-690:
> >>> >>> > > >
> >>> >>> > > >
> >>> >>> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-690%3A+Add+additional+configuration+to+control+MirrorMaker+2+internal+topics+naming+convention
> >>> >>> > > >
> >>> >>> > > > Thanks for your feedback and support.
> >>> >>> > > >
> >>> >>> > > > Omnia
> >>> >>> > > >
> >>> >>> > >
>

Reply via email to