Hello Theo,

What are the values for your "replica.fetch.max.bytes", "replica.fetch.min.bytes", "replica.fetch.wait.max.ms" and "num.replica.fetchers" configs?
Guozhang

On Mon, Sep 1, 2014 at 2:52 AM, Theo Hultberg <t...@iconara.net> wrote:

> Hi,
>
> We're evaluating Kafka, and have a problem with it using more bandwidth
> than we can explain. From what we can tell the replication uses at least
> twice the bandwidth it should.
>
> We have four producer nodes and three broker nodes. We have enabled 3x
> replication, so each node will get a copy of all data in this setup. The
> producers have Snappy compression enabled and send batches of 200 messages.
> The messages are around 1 KiB each. The cluster runs using mostly default
> configuration, and the Kafka version is 0.8.1.1.
>
> When we run iftop on the broker nodes we see that each Kafka node receives
> around 6-7 Mbit from each producer node (or around 25-30 Mbit in total),
> but then sends around 50 Mbit to each other Kafka node (or 100 Mbit in
> total). This is twice what we expected to see, and it seems to saturate the
> bandwidth on our m1.xlarge machines. In other words, we expected the
> incoming 25 Mbit to be amplified to 50 Mbit, not 100.
>
> One thing that could explain it, and that we don't really know how to
> verify, is that the inter-node communication is not compressed. We aren't
> sure about what compression ratio we get on the incoming data, but 50%
> sounds reasonable. Could this explain what we're seeing? Is there a
> configuration property to enable compression on the replication traffic
> that we've missed?
>
> yours
> Theo
>

--
-- Guozhang
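
For reference, the arithmetic behind the expectation in the quoted message can be written out as a rough sketch. It assumes each broker forwards every byte it receives from producers once to each of the other replicas and ignores protocol overhead. The function name and the rounded 25 and 100 Mbit figures are only illustrative, taken from the iftop numbers above.

```python
def expected_replication_out_mbit(incoming_mbit: float, replication_factor: int) -> float:
    """Outgoing replication traffic if each incoming byte is sent once per follower."""
    followers = replication_factor - 1  # the receiving broker already holds one copy
    return incoming_mbit * followers


incoming_total = 25.0      # Mbit received from the four producers (about 6-7 Mbit each)
replication_factor = 3     # 3x replication on three brokers
observed_out_total = 100.0  # Mbit sent to the other two brokers (about 50 Mbit each)

expected_out_total = expected_replication_out_mbit(incoming_total, replication_factor)
amplification = observed_out_total / expected_out_total

print(f"expected outgoing: {expected_out_total:.0f} Mbit")
print(f"observed outgoing: {observed_out_total:.0f} Mbit")
print(f"observed / expected: {amplification:.1f}x")
```

The resulting factor of 2 matches, numerically, the 50% compression ratio guessed at in the quoted message, but on its own it does not show that compression is the cause.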