I'm also curious: what is the limiting factor for Kafka write throughput?
I've never seen reports higher than 100MB/sec, even though the disks can obviously deliver much more. In my own test with a single broker, a single partition and a single replica:

bin/kafka-producer-perf-test.sh --topics perf --threads 10 --broker-list 10.80.42.154:9092 --messages 5000000 --message-size 3000

it tops out at around 90MB/sec. CPU, disk, memory and network are all far from saturated, but I still can't get higher numbers.
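In case it helps to compare notes, a standalone producer roughly equivalent to that perf-test run would look something like the sketch below. This is only a sketch against the 0.8 Java producer API, untested here; the broker address, topic, message size and count simply mirror the command above, and the tuning values are the ones discussed further down in the thread.

import java.util.Properties;
import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class PerfProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("metadata.broker.list", "10.80.42.154:9092");           // same broker as the perf test above
        props.put("serializer.class", "kafka.serializer.DefaultEncoder"); // raw byte[] payloads
        props.put("producer.type", "async");
        props.put("request.required.acks", "0");            // fire and forget
        props.put("batch.num.messages", "1000");
        props.put("queue.buffering.max.messages", "20000");
        props.put("compression.codec", "snappy");

        Producer<byte[], byte[]> producer =
                new Producer<byte[], byte[]>(new ProducerConfig(props));

        byte[] payload = new byte[3000];                     // matches --message-size 3000
        for (int i = 0; i < 5000000; i++) {                  // matches --messages 5000000
            producer.send(new KeyedMessage<byte[], byte[]>("perf", payload));
        }
        producer.close();
    }
}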
On Fri, Oct 11, 2013 at 11:17 AM, Bruno D. Rodrigues <bruno.rodrig...@litux.org> wrote:

> Producer:
> props.put("batch.num.messages", "1000"); // 200
> props.put("queue.buffering.max.messages", "20000"); // 10000
> props.put("request.required.acks", "0");
> props.put("producer.type", "async"); // sync
>
> // return ++this.count % a_numPartitions; // just round-robin
> props.put("partitioner.class", "main.SimplePartitioner"); // kafka.producer.DefaultPartitioner
>
> // disabled = 70MB source, 70MB network; enabled = 70MB source, ~40-50MB network
> props.put("compression.codec", "Snappy"); // none
>
> The consumer is on default settings, as I test separately without any consumer at all and then measure the extra load of having 1..n consumers. I assume the top speed would be without consumers at all. I'm measuring both the produced messages and the consumer side.
>
> On the Kafka server I've changed the following, expecting fewer disk writes at the cost of losing messages:
>
> #log.flush.interval.messages=10000
> log.flush.interval.messages=10000000
> #log.flush.interval.ms=1000
> log.flush.interval.ms=10000
> #log.segment.bytes=536870912
> # is a signed int32, only up to 2^31-1!
> log.segment.bytes=2000000000
> #log.retention.hours=168
> log.retention.hours=1
>
> Basically I need high throughput of discardable messages. Having them persisted temporarily on disk, in the highly optimised manner Kafka shows, would be great not for reliability (not losing messages), but because it would let me fetch some previous messages even if the client (Kafka client or real consumer client) disconnects, and it would provide a way to go back in time a few seconds if needed.
>
> On 11/10/2013, at 18:56, Magnus Edenhill <mag...@edenhill.se> wrote:
>
> Make sure the fetch batch size and the local consumer queue sizes are large enough; setting them too low will limit your throughput to the broker<->client latency.
>
> This is controlled using the following properties:
> - fetch.message.max.bytes
> - queued.max.message.chunks
>
> On the producer side you would want to play with:
> - queue.buffering.max.ms and .messages
> - batch.num.messages
>
> Memory on the broker should only affect disk cache performance. The more the merrier, of course, but it depends on your use case; with a bit of luck the disk caches are already hot for the data you are reading (e.g., recently produced).
>
> Consuming millions of messages per second on a quad-core i7 with 8 gigs of RAM is possible without breaking a sweat, given that the disk caches are hot.
>
> Regards,
> Magnus
>
> 2013/10/11 Bruno D. Rodrigues <bruno.rodrig...@litux.org>
>
> On Thu, Oct 10, 2013 at 3:57 PM, Bruno D. Rodrigues <bruno.rodrig...@litux.org> wrote:
>
> My personal newbie experience, which is surely completely wrong and misconfigured, got me up to 70MB/sec, either with controlled 1K messages (hence 70K msg/sec) or with more random data (test data from 100 bytes to a couple of MB). First I thought the 70MB were the hard disk limit, but when I got the same result both on a proper Linux server with a 10K disk and on a Mac mini with a 5400rpm disk, I got confused. The mini has 2GB, the Linux server has 8 or 16, can't recall at the moment.
>
> The test was performed both with single and with multiple producers and consumers. One producer = 70MB, two producers = 35MB each, and so forth. Running standalone instances on each server, same value. Running both together in 2-partition, 2-replica crossed mode, same result.
>
> As far as I understood, more memory just means more kernel buffer space to make up for the lack of disk speed, as Kafka does not seem to really depend on memory for the queueing.
>
> On 11/10/2013, at 17:28, Guozhang Wang <wangg...@gmail.com> wrote:
>
> Hello,
>
> In most cases of Kafka, the network bottleneck will be hit before the disk bottleneck. So maybe you want to check your network capacity to see if it has been saturated.
>
> They are all connected to Gbit ethernet cards and proper network routers. I can easily get way above 950Mbps up and down between each machine, and even between multiple machines. Gbit is 128MB/s, and 70MB/s is 560Mbps, so far so good: 56% of network capacity is a goodish value. But then I enable snappy, get the same 70MB on the input and output side and only 20MB/sec on the network, so it surely isn't a network limit. It's also not the input or output side: the input reads a pre-processed mmapped file, which reads at 150MB/sec uncached (SSD) and up to 3GB/sec when loaded into memory, and the output simply counts the messages and their sizes.
>
> One weird thing is that the Kafka process never seems to cross 100% CPU in top or an equivalent command. Top shows 100% for each CPU, so a multi-threaded process should be able to go up to 400% (both the Linux server and the Mac mini have 2 cores with hyperthreading, so "almost" 4 CPUs).
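PS, in case it helps anyone following along: this is how I read Magnus's consumer-side suggestions in terms of the 0.8 high-level consumer. It is only an untested sketch; the ZooKeeper address, group id, topic and the concrete values are placeholders picked for illustration, not tested recommendations.

import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;

public class PerfConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("zookeeper.connect", "10.80.42.154:2181");   // placeholder ZooKeeper address
        props.put("group.id", "perf-consumer");                // placeholder group id
        props.put("fetch.message.max.bytes", "10485760");      // larger fetch batches (10MB, placeholder value)
        props.put("queued.max.message.chunks", "100");         // deeper local consumer queue (placeholder value)

        ConsumerConnector connector =
                Consumer.createJavaConsumerConnector(new ConsumerConfig(props));

        Map<String, List<KafkaStream<byte[], byte[]>>> streams =
                connector.createMessageStreams(Collections.singletonMap("perf", 1));

        // Just count messages and bytes, the same way my current test measures the consumer side.
        long count = 0;
        long bytes = 0;
        ConsumerIterator<byte[], byte[]> it = streams.get("perf").get(0).iterator();
        while (it.hasNext()) {
            byte[] message = it.next().message();
            count++;
            bytes += message.length;
        }
        connector.shutdown();
    }
}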