The main motivation for adopting the new producer before its release is that the old producer shows terrible throughput for cross-region Kafka mirroring on EC2.
Let me share some numbers. Measured with iperf, the network bandwidth between an AWS EC2 instance in us-west-2 and one in us-east-1 is more than 40 MB/sec, but the old producer's throughput is less than 3 MB/sec.

start.time               end.time                 compression  message.size  batch.size  total.data.sent.in.MB  MB.sec   total.data.sent.in.nMsg  nMsg.sec
2014-09-16 20:22:25:537  2014-09-16 20:24:13:138  2            3000          200         286.10                 2.6589   100000                   929.3594

We increased the socket send buffer on the producer side and the receive buffer on the broker side, but that did not fix it.

send.buffer.bytes: 8388608

start.time               end.time                 compression  message.size  batch.size  total.data.sent.in.MB  MB.sec   total.data.sent.in.nMsg  nMsg.sec
2014-09-16 20:48:49:588  2014-09-16 20:50:03:006  2            3000          200         286.10                 3.8969   100000                   1362.0638

The new producer, which is not released yet, shows a significant performance improvement: its throughput is more than 30 MB/sec.

start.time               end.time                 compression  message.size  batch.size  total.data.sent.in.MB  MB.sec   total.data.sent.in.nMsg  nMsg.sec
2014-09-16 20:50:31:720  2014-09-16 20:50:41:241  2            3000          200         286.10                 30.0496  100000                   10503.098

I was excited about the new producer's performance, but its partitioning logic is different. When no partition number is given in the ProducerRecord, the new producer chooses the partition from a murmur2 hash of the key, whereas the old partitioner chooses it from key.hashCode. Could you make the two use the same logic? Otherwise, I have to change the implementation of our Kafka producer container.
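To make the mismatch concrete, here is a minimal, self-contained sketch of the two partitioning schemes described above. It is not the actual Kafka code: the class and method names (PartitionSketch, legacyPartition, murmur2Partition) are hypothetical, the murmur2 routine is a plain 32-bit MurmurHash2 with a seed I am assuming, and masking the sign bit to get a non-negative value is also an assumption.

```java
import java.nio.charset.StandardCharsets;

final class PartitionSketch {

    // Old-producer style: derive the partition from the key object's hashCode.
    // Masking the sign bit to avoid negative values is an assumption of this sketch.
    static int legacyPartition(Object key, int numPartitions) {
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    // New-producer style: derive the partition from a murmur2 hash of the
    // serialized key bytes (same non-negative masking assumption as above).
    static int murmur2Partition(byte[] keyBytes, int numPartitions) {
        return (murmur2(keyBytes) & 0x7fffffff) % numPartitions;
    }

    // 32-bit MurmurHash2. The seed value is an assumption, not taken from the report.
    static int murmur2(byte[] data) {
        final int length = data.length;
        final int seed = 0x9747b28c;
        final int m = 0x5bd1e995;
        final int r = 24;

        int h = seed ^ length;

        // Mix the input four bytes at a time (little-endian).
        for (int i = 0; i < length / 4; i++) {
            final int i4 = i * 4;
            int k = (data[i4] & 0xff)
                    | ((data[i4 + 1] & 0xff) << 8)
                    | ((data[i4 + 2] & 0xff) << 16)
                    | ((data[i4 + 3] & 0xff) << 24);
            k *= m;
            k ^= k >>> r;
            k *= m;
            h *= m;
            h ^= k;
        }

        // Mix in the remaining 1 to 3 bytes; the cases fall through on purpose.
        final int tail = length & ~3;
        switch (length % 4) {
            case 3:
                h ^= (data[tail + 2] & 0xff) << 16;
            case 2:
                h ^= (data[tail + 1] & 0xff) << 8;
            case 1:
                h ^= data[tail] & 0xff;
                h *= m;
        }

        // Final avalanche.
        h ^= h >>> 13;
        h *= m;
        h ^= h >>> 15;
        return h;
    }

    public static void main(String[] args) {
        String key = "abc";
        int partitions = 8;
        System.out.println("hashCode-based: " + legacyPartition(key, partitions));
        System.out.println("murmur2-based:  "
                + murmur2Partition(key.getBytes(StandardCharsets.UTF_8), partitions));
    }
}
```

Because the two functions hash different inputs with different algorithms, the same key will generally land on different partitions. One possible workaround on the caller side, if the logic is not unified, would be to compute the partition with the hashCode-based rule and set it explicitly in the ProducerRecord, so the new producer never falls back to its own hashing.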