Dear community members,

We are proposing a new feature to improve the performance of Kafka MirrorMaker: https://cwiki.apache.org/confluence/display/KAFKA/KIP-712%3A+Shallow+Mirroring
The current Kafka MirrorMaker process (with the underlying Consumer and Producer libraries) spends significant CPU cycles and memory decompressing and recompressing messages, deserializing and re-serializing them, and copying message bytes multiple times along the mirroring/replication stages. The KIP proposes a *shallow mirror* feature that brings the shallow iterator concept back to the mirroring process and skips the unnecessary message decompression and recompression steps.

We argue that in many cases users just want a simple replication pipeline that copies messages as is from the source cluster to the destination cluster. Often the messages in the source cluster are already compressed and properly batched, so users only need an identical copy of the message bytes, without any transformation or repartitioning.

We have an in-house prototype implementation based on MirrorMaker v1 and observed *CPU usage drop from 50% to 15%* for some mirror pipelines.

We call this feature *shallow mirroring* because it resembles the feature of the same name in the old Kafka 0.7, although the implementations are not quite the same. ‘*Shallow*’ means:

1. We *shallowly* iterate over the RecordBatches inside the MemoryRecords structure instead of deep-iterating over the records inside each RecordBatch.
2. We *shallowly* copy (share) pointers inside the ByteBuffer instead of deep-copying and deserializing bytes into objects.

Please share your comments and feedback on this email thread.
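
P.S. For readers who want a concrete picture of the two senses of "shallow" described above, here is a minimal sketch using only java.nio.ByteBuffer. It is not the prototype code and it does not use Kafka's classes or wire format: the length-prefixed framing, the class name ShallowMirrorSketch, and the methods frame and shallowBatches are all hypothetical, chosen only to show batch-level iteration that hands out views sharing the fetched buffer's memory.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class ShallowMirrorSketch {

    // Toy framing (an assumption for illustration, NOT Kafka's RecordBatch format):
    // each batch is a 4-byte big-endian payload length followed by the payload.
    static ByteBuffer frame(String... payloads) {
        byte[][] encoded = new byte[payloads.length][];
        int size = 0;
        for (int i = 0; i < payloads.length; i++) {
            encoded[i] = payloads[i].getBytes(StandardCharsets.UTF_8);
            size += Integer.BYTES + encoded[i].length;
        }
        ByteBuffer buf = ByteBuffer.allocate(size);
        for (byte[] p : encoded) {
            buf.putInt(p.length).put(p);
        }
        buf.flip();
        return buf;
    }

    // Shallow iteration: walk batch boundaries only and return views that
    // share the source buffer's memory. Payload bytes are never copied,
    // decompressed, or deserialized.
    static List<ByteBuffer> shallowBatches(ByteBuffer source) {
        List<ByteBuffer> batches = new ArrayList<>();
        ByteBuffer cursor = source.duplicate();      // own position, shared bytes
        while (cursor.remaining() >= Integer.BYTES) {
            int start = cursor.position();
            int length = cursor.getInt(start);       // absolute read of the header
            int end = start + Integer.BYTES + length;
            ByteBuffer view = cursor.duplicate();
            view.limit(end);
            batches.add(view.slice());               // header + payload, zero-copy
            cursor.position(end);                    // jump to the next batch
        }
        return batches;
    }

    public static void main(String[] args) {
        ByteBuffer fetched = frame("batch-one", "batch-two");
        List<ByteBuffer> batches = shallowBatches(fetched);
        System.out.println(batches.size());              // 2
        System.out.println(batches.get(0).remaining());  // 13 (4-byte header + 9-byte payload)

        // Writing to the source is visible through the view: memory is shared.
        fetched.put(4, (byte) 'B');
        System.out.println(batches.get(0).get(4) == (byte) 'B');  // true
    }
}
```

A deep copy would instead allocate a new array per batch (or per record) and decode it, which is the work the proposal aims to avoid when the bytes are forwarded unchanged.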