Hey Ryanne,

Sorry I am late here. Thanks much for all the work! After reading through
the latest KIP and all the previous discussion, I have some questions below:

1. Currently if there is topic created with "." in the topic name, would it
cause correctness issue for this KIP? For example, will consumer reads from
the topic that is not intended, and will API such
as RemoteClusterUtils.upstreamClusters(...) return wrong string as the
upstream cluster? If there is correctness issue which reduces the
usability, we probably should mark it in the Java doc and fix these issues
before considering the feature as ready and safe to use. Some future work
such as "propose a special separator character" has been suggested in the
email thread. Could we document these future work in the KIP if the work is
necessary for MM2 to be used reliably?

2. Suppose topic_1 is replicated from cluster A to cluster B. And we have a
pair of producer and consumer that produces and consumes from topic_1 in
cluster A. Let's say cluster A crashed and we need to migrate this pair of
consumer to use cluster B. According to the discussion in the email thread,
producer will producer to topic_1 in cluster B whereas Consumer will
consume from A.topic_1 in cluster B. It means that the message produced by
this producer will not longer be consumed by this consumer. Would this be
an issue?

3. Given that the "mechanism to migrate producers or consumers between
mirrored clusters" is listed as the motivation of this KIP, do we need to
specify in the KIP the migration procedure which we have discussed in depth
in the email thread? For example, user needs to additionally
call offsetsForTimes(...) after getting the timestamp from the checkpoint
for each.

4. A lot of class names (e.g. MirrorCheckpointConnector) is included in the
Public Interface section. Not sure if it is useful to specify the class
name without specifying the class API and its usage. My understanding is
that Public Interface section should include 1) minimum amount of the
information that user can read in order to use the feature provided by the
KIP; and 2) anything whose change incurs compatibility concern and thus KIP
discussion. So I am wondering whether we should include in the Public
Interface section 1) the class API for those classes listed in the section
(or remove from the KIP if we have not decided the class API); 2) the usage
of the new scripts (e.g. connect-mirror-maker.sh) with its arguments (e.g.
this
<https://cwiki.apache.org/confluence/display/KAFKA/KIP-113%3A+Support+replicas+movement+between+log+directories#KIP-113:Supportreplicasmovementbetweenlogdirectories-Scripts>
example); and 3) schema of the checkpoint topic and heartbeat topic which
is currently in the Proposed Change section.

5. It is said in the motivation section that the design includes new
metrics such as end-to-end replication latency across multiple data
centers/clusters. Usually we treat metrics as public interface. Could we
also specify these metrics (e.g. definition, type) in Public Interface
section similar to this
<https://cwiki.apache.org/confluence/display/KAFKA/KIP-237%3A+More+Controller+Health+Metrics#KIP-237:MoreControllerHealthMetrics-PublicInterfaces>
example.

6. The KIP says that the RemoteClusterUtils is "subject to change". Are we
going to have a future KIP to discuss and finalize these APIs before
committing any code that implements the API? If so, it may be cleaner to
specify this or even remove this class from the Public Interface section,
and specify the future KIP in this KIP.

7. If we are going to determine the API for RemoteClusterUtils in this KIP,
then I have one comment regarding the
RemoteClusterUtils.translateOffsets(...). If I understand the discussion in
the email thread correctly, this API reads the timestamp from the
checkpoint topic and return it to the caller. If so, it seems more
intuitive to call this e.g. checkpointTimestamp(...). And the doc probably
should say "Find the timestamp of the last committed messaged in the source
cluster...". Otherwise, could you briefly explain in the KIP how this API
is implemented.

8. It is said in the email discussion that targetClusterAlias is needed in
order to migrate consumer from the remote topic back to the source topic
(i.e. fallback). When a consumer is migrated from A.topic_1 in cluster B to
topic_1 in cluster A, how does consumer determine the start offset for
topic_1 in cluster A?

9. Does RemoteClusterUtils.upstreamClusters(...) return clusters that is no
longer an upstream cluster? If yes, it seems to reduced the usability of
the API if user only wants to know the current upstream cluster list. If
no, could you explain what is the semantics of the API (e.g. when an
upstream cluster is considered outdated) and how is this API implemented to
exclude outdated clusters. It may be useful to briefly explain the
implementation in the proposed change section for those APIs that are not
straightforward.

10. It is mentioned in the Remote Cluster Utils section that "it will be
possible to find high water marks that consumers". Not sure how is high
watermark used here. Can you explain a bit more, e.g. which API does this
sentence refer to?

11. The APIs for the interface ReplicationPolicy is specified without
explanation or Java doc. It is hard to determine how they can be used and
whether these APIs are necessary. Should we include the Java doc and put
this interface in the Public Interface section?

12. The migration section specifies that this API will be carried in three
separate Kafka release. Not sure if we need three releases. In the first
release, we implement this KIP and deprecate old API. In the subsequent
release, we remove the old API. Does this make sense?

13. Since the newly added checkpoint topic is log-compacted, I am wondering
whether we need to specify the log compaction key for this topic in the
Public Interface section.

Thanks,
Dong

Reply via email to