Howdy Kafka Team,

We are trying to aggregate every topic from several geographically separate
clusters into one central Kafka cluster. We have the guarantee that the number
of partitions for a given topic will be the same on the source and target
clusters. Due to our particular use case, we need the events in any given
partition on a source cluster to appear in exactly the same order on the
corresponding partition in the target cluster.

So far we've used our custom producer to push messages with a String key
and a byte[] message type to the source cluster. But when we use
MirrorMaker to copy from the source to the target cluster with the
same partitioner that our custom producer uses, we get an error saying "[B
cannot be cast to java.lang.String". We understand this to mean that the MM
consumer is trying to partition the source cluster's data using a String
key, but since the data residing on the source cluster is in byte[]
form, the cast to String fails. However, we need the producer that
pushes to the target cluster to use the exact same partitioning scheme our
custom producer used, so that the ordering on the source and target
partitions is exactly the same. How can we ensure this?
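
To make the requirement concrete, here is a minimal sketch of the kind of
partitioner we have in mind. It is plain Java and does not implement Kafka's
partitioner interface. The class name, the method names and the hash formula
are all hypothetical stand-ins for our real scheme, and it assumes our String
keys are encoded as UTF-8 on the wire. The idea is only that a byte[] key is
decoded back to a String before the same String-based logic runs, so both
producers would pick the same partition for the same key.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not Kafka's API: one partitioning function that
// accepts either the original String key or its raw byte[] form.
final class KeyNormalizingPartitioner {

    // Stand-in for the existing String-based scheme (assumption: the real
    // custom partitioner may use a different formula).
    public static int partitionForString(String key, int numPartitions) {
        // Mask the sign bit so the result is never negative.
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    // Accepts String or byte[] keys and routes both through the same logic.
    public static int partition(Object key, int numPartitions) {
        if (key instanceof byte[]) {
            // Assumption: keys were serialized as UTF-8 by the first producer.
            String decoded = new String((byte[]) key, StandardCharsets.UTF_8);
            return partitionForString(decoded, numPartitions);
        }
        if (key instanceof String) {
            return partitionForString((String) key, numPartitions);
        }
        String type = (key == null) ? "null" : key.getClass().getName();
        throw new IllegalArgumentException("Unsupported key type: " + type);
    }

    public static void main(String[] args) {
        String key = "user-42";
        byte[] raw = key.getBytes(StandardCharsets.UTF_8);
        System.out.println("String key -> partition " + partition(key, 8));
        System.out.println("byte[] key -> partition " + partition(raw, 8));
    }
}
```

Whether MM can be configured to call something like this is exactly what we
are unsure about.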


Once we have mirrored the partitions with their ordering intact, what is the
best way to verify that the source and target partitions store messages
in exactly the same order? Right now we are thinking about writing a
SimpleConsumer that iterates through the logs of the source and target
partitions, comparing them message by message, but it'd
be nice if there were an existing tool for doing this, or if we could have some
guarantee that MM retains partition ordering by default.
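
The comparison we are picturing would look roughly like the sketch below. It
is plain Java over two iterators of message payloads, so the fetching side
(SimpleConsumer or otherwise) is left out entirely. The class and method names
are hypothetical. It compares payloads by position rather than by offset, on
the assumption that offsets on the target need not match those on the source.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical sketch: walk two partition logs in lockstep and report the
// first position at which they disagree.
final class PartitionOrderCheck {

    // Returns -1 if both sequences hold the same payloads in the same order,
    // otherwise the zero-based position of the first difference. A length
    // mismatch is reported at the position where the shorter sequence ends.
    public static long firstMismatch(Iterator<byte[]> source, Iterator<byte[]> target) {
        long position = 0;
        while (source.hasNext() && target.hasNext()) {
            if (!Arrays.equals(source.next(), target.next())) {
                return position;
            }
            position++;
        }
        // One side still has messages: the logs differ in length.
        return (source.hasNext() || target.hasNext()) ? position : -1;
    }

    public static void main(String[] args) {
        Iterator<byte[]> src = Arrays.asList("a".getBytes(), "b".getBytes()).iterator();
        Iterator<byte[]> dst = Arrays.asList("b".getBytes(), "a".getBytes()).iterator();
        System.out.println("first mismatch at position " + firstMismatch(src, dst));
    }
}
```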


Cheers,


Alex Melville
