For some reason the HTML formatting is being dropped from my email, which makes the measurements table harder to read.

On 4/29/15 8:32 PM, "Roshan Naik" <ros...@hortonworks.com> wrote:

> @Jay,
> My bad. I mistook batch.size to be a number of messages instead of
> bytes. Below are revised measurements based on computing batch.size
> in bytes.
>
> @Jun,
> With an explicit flush(), linger should have no impact, should it?
>
> @Wang,
> Larger batches do not necessarily give better numbers, as you can
> see below.
>
> The 2 problems I noted earlier still exist in the batched sync mode
> (using flush()):
>
>  * batch.size still seems to play a factor even when set to a larger
>    value than the bytes generated by the client
>  * 4 and 8 partitions see a big slowdown
>
> Revised measurements for the new Producer API:
>
> - All cases: single threaded, 1k event size
>
> Batched SYNC using flush(), acks=1
>
> 1 partition
>                             Batch=4k   Batch=8k   Batch=16k
> batch.size == clientBatch   140                   124
> batch.size = 10MB           140        123        124
> batch.size = 20MB           31         30         42
>
> 4 partitions
>                             Batch=4k   Batch=8k   Batch=16k
> batch.size == clientBatch   60         8          6
> batch.size = 10MB           7          7          7
> batch.size = 20MB           6          6          5
>
> 8 partitions
>                             Batch=4k   Batch=8k   Batch=16k
> batch.size == clientBatch   7          8          8
> batch.size = 10MB           7          8          7
> batch.size = 20MB           6          6          6
>
> Just for reference, I also took the numbers for the default ASYNC mode
> with acks=1:
>
>                batch.size=default   batch.size=4MB   batch.size=8MB   batch.size=16MB
> 1 partition    53                   130              113              76
> 4 partitions   84                   126              9                7
> 8 partitions   9                    12               10               5
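
For anyone reproducing these runs, here is a minimal sketch of the message-count-to-bytes conversion the quoted mail refers to. It assumes that "Batch=4k" means 4096 messages and that a "1k" event is 1024 bytes; neither is stated explicitly in the thread. The helper names are hypothetical. Only the keys "batch.size" and "acks" are actual producer settings, and the dict is just an illustration, not tied to any particular client library.

```python
# Assumption: 1k event size means 1024 bytes per message.
EVENT_SIZE_BYTES = 1024


def batch_size_bytes(messages_per_batch, event_size_bytes=EVENT_SIZE_BYTES):
    """Convert a client batch given as a message count into bytes.

    batch.size is expressed in bytes, so a batch of N messages of a fixed
    size corresponds to N * event_size_bytes.
    """
    return messages_per_batch * event_size_bytes


def producer_settings(messages_per_batch):
    """Hypothetical helper: settings for the 'batch.size == clientBatch' case."""
    return {
        "batch.size": batch_size_bytes(messages_per_batch),  # bytes, not messages
        "acks": "1",
    }


if __name__ == "__main__":
    for count in (4096, 8192, 16384):
        print(count, "messages ->", batch_size_bytes(count), "bytes")
```

Under these assumptions the three client batches come out at 4 MB, 8 MB and 16 MB, which lines up with the batch.size values used in the ASYNC table.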