Jay Kreps created KAFKA-658:
-------------------------------

             Summary: Implement "Exact Mirroring" functionality in mirror maker
                 Key: KAFKA-658
                 URL: https://issues.apache.org/jira/browse/KAFKA-658
             Project: Kafka
          Issue Type: New Feature
            Reporter: Jay Kreps


There are two ways to implement "mirroring" (i.e. replicating a topic from one 
cluster to another):
1. Do a simple read from the source and write to the destination with no 
attempt to maintain the same partitioning or offsets in the destination 
cluster. In this case the destination cluster may have a different number of 
partitions, and you can even read from many clusters to create a merged 
cluster. This flexibility is nice. The downside is that, since the partitioning 
and offsets are not the same, a consumer of the source cluster has no equivalent 
position in the destination cluster. This is the style of mirroring we have 
implemented in the mirror-maker tool and use for datacenter replication today.
2. The second style of replication would only allow creating an exact replica 
of a source cluster (i.e. all partitions and offsets exactly the same). The 
nice thing about this is that a consumer's position in the source cluster is 
also a valid position in the destination cluster. The downside is that it is 
not possible to merge multiple source clusters this way or to have different 
partitioning. We do not currently support this in mirror maker.
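
To make the difference between the two styles concrete, here is a toy 
in-memory sketch. It is not mirror maker code: the Cluster class, the 
hash_partition helper, and both mirror functions are hypothetical names made up 
for illustration, and offsets are modeled as simple list indexes.

```python
import zlib


class Cluster:
    """Toy cluster (hypothetical): each partition is an append-only list."""

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, partition, message):
        log = self.partitions[partition]
        log.append(message)
        return len(log) - 1  # offset assigned to the appended message


def hash_partition(message, num_partitions):
    # Deterministic stand-in for whatever partitioner the producer uses.
    return zlib.crc32(message.encode("utf-8")) % num_partitions


def simple_mirror(source, dest):
    """Style 1: read and re-produce; positions are not preserved."""
    mapping = {}
    for p, log in enumerate(source.partitions):
        for offset, msg in enumerate(log):
            dest_p = hash_partition(msg, len(dest.partitions))
            mapping[(p, offset)] = (dest_p, dest.append(dest_p, msg))
    return mapping


def exact_mirror(source, dest):
    """Style 2: same partition, same order, so offsets line up."""
    if len(dest.partitions) != len(source.partitions):
        raise ValueError("exact mirroring needs identical partition counts")
    mapping = {}
    for p, log in enumerate(source.partitions):
        for offset, msg in enumerate(log):
            mapping[(p, offset)] = (p, dest.append(p, msg))
    return mapping


source = Cluster(2)
source.append(0, "a")
source.append(0, "b")
source.append(1, "c")

merged = Cluster(1)   # different partition count is fine for style 1
replica = Cluster(2)  # style 2 requires the same partition count

print(simple_mirror(source, merged)[(1, 0)])  # (0, 2): position changed
print(exact_mirror(source, replica)[(1, 0)])  # (1, 0): position preserved
```

In the style 1 run, the message at partition 1, offset 0 of the source ends up 
at partition 0, offset 2 of the destination, which is why a source consumer has 
no equivalent position there. In the style 2 run the position is unchanged.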

It would be nice to implement the second style as an option in mirror maker, 
since an exact replica is useful in the case where you are replicating a 
single cluster only.

There are some nuances: in order to maintain the exact offsets, it is important 
to guarantee that the producer never resends a message or loses a message. As a 
result, it would be important to have only a single producer for each 
destination partition, and to check the last produced message on startup (using 
the getOffsets api) so that, in the case of a hard crash, messages that are 
re-consumed are not re-emitted.
