Hello Alex,

This can be done by making some tweaks to the MM (MirrorMaker) code, using the new producer in 0.8.2.
1. Set up your MM so that the total number of producers equals the number of partitions in the source / target cluster.
2. When the MM consumer gets a message, put it on a producer queue chosen by its partition id; i.e. if the partition id is n, put it on producer n's queue.
3. When a producer sends the data, specify the partition id, so that each producer only ever sends to a single partition.

Guozhang

On Tue, Nov 25, 2014 at 8:19 PM, Alex Melville <amelvi...@g.hmc.edu> wrote:

> Howdy friends,
>
> I'd like to mirror the topics on several clusters to a central cluster, and
> I'm looking at using the default Mirrormaker to do so. I've already done
> some basic testing on the Mirrormaker found here:
>
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=27846330
>
> and managed to successfully copy a topic's partitions on a source cluster
> to a topic on a target cluster. So I'm able to mirror correctly. However
> for my particular use case I need to ensure that when I copy a topic's
> partitions from source cluster to target cluster, a partition created on
> the target cluster contains data in the exact same order as the data on the
> corresponding partition on the source cluster.
>
> I'm thinking of writing a Simple Consumer so I can manually compare the
> events in a source cluster's partition with the corresponding partition on
> the target cluster, but I'm not 100% sure if I'll be able to verify my
> guarantee if I do it this way. Can anyone here verify that partitions
> copied over to the target cluster by the default Mirrormaker are an exact
> copy of those on the source cluster?
>
> Thanks in advance,
>
> Alex Melville
>

--
-- Guozhang
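
For reference, the three steps in the reply at the top of this message can be sketched roughly as below. This is only an in-memory simulation in Python, not MirrorMaker or Kafka client code: the names `mirror` and `producer_loop` are hypothetical, plain lists stand in for the target cluster's partitions, and it assumes the source and target topics have the same number of partitions and that a single consumer loop reads the source messages in order.

```python
import queue
import threading

_STOP = object()  # sentinel telling a producer thread to shut down


def mirror(source_messages, num_partitions):
    """Copy (partition_id, payload) pairs, preserving order within each partition.

    source_messages: iterable of (partition_id, payload) in consumption order.
    Returns a dict mapping partition id -> list of payloads "written" to it.
    """
    # Step 1: one producer (and one queue) per partition.
    queues = [queue.Queue() for _ in range(num_partitions)]
    # Stand-in for the target cluster: one list per target partition.
    target = {p: [] for p in range(num_partitions)}

    def producer_loop(partition_id):
        # Step 3: this producer only ever writes to its own partition,
        # and always names that partition explicitly.
        while True:
            payload = queues[partition_id].get()
            if payload is _STOP:
                break
            target[partition_id].append(payload)

    producers = [
        threading.Thread(target=producer_loop, args=(p,))
        for p in range(num_partitions)
    ]
    for t in producers:
        t.start()

    # Step 2: the consumer routes each message to the queue that matches
    # the partition it was read from.
    for partition_id, payload in source_messages:
        queues[partition_id].put(payload)

    # Signal shutdown and wait for every producer to drain its queue.
    for q in queues:
        q.put(_STOP)
    for t in producers:
        t.join()
    return target
```

Calling `mirror([(0, "a0"), (1, "b0"), (0, "a1")], 2)` returns one list per partition, each in the order the messages were consumed. Because every queue is FIFO and has exactly one producer draining it, no two senders can interleave writes to the same partition, which is the point of the one-producer-per-partition setup.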