- Event size = 1kB.
- Broker and client running on different machines (identical config: 32
cores, 256GB RAM, 6x 1500rpm disks, 10GigE network).
- Don't readily have numbers for the old batched sync API with the same
params, but can get them soon. However... does it matter?

-roshan

On 4/28/15 6:57 PM, "Joel Koshy" <jjkosh...@gmail.com> wrote:

>- What is the record size?
>- Is this a local setup? i.e., producer/broker running local?
>- Any overrides apart from batch size? E.g., linger time.
>- Can you establish a baseline - with the old producer's sync-send?
>
>Thanks,
>
>Joel
>
>On Wed, Apr 29, 2015 at 12:58:43AM +0000, Roshan Naik wrote:
>> Based on a recent suggestion by Joel, I am experimenting with using
>> flush() to simulate batched-sync behavior.
>> The essence of my single-threaded producer code is:
>> 
>>     List<Future<RecordMetadata>> futureList = new ArrayList<>();
>>     for (int i = 0; i < numRecords;) {
>>         // 1 - Send a batch
>>         for (int batchCounter = 0; batchCounter < batchSz; ++batchCounter) {
>>             Future<RecordMetadata> f = producer.send(record);
>>             futureList.add(f);
>>             i++;
>>         }
>>         // 2 - Flush after sending the batch
>>         producer.flush();
>> 
>>         // 3 - Ensure all messages were sent
>>         for (Future<RecordMetadata> f : futureList) {
>>             f.get();
>>         }
>>         futureList.clear();  // reset for the next batch
>>     }
>> 
>> There are actually two batch sizes in play here. One is the number of
>> messages between every flush() call made by the client. The other is the
>> batch.size setting, which controls the batching done internally by the
>> underlying async API.
>> 
>> Intuitively, we either want to:
>>   A) set both batch sizes to be equal, OR
>>   B) set the underlying batch.size to a sufficiently large number so as
>> to effectively disable internal batch management.
>> 
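For reference, options (A) and (B) can be sketched as producer configs. This is an illustrative sketch, not the exact configuration used in the tests: the 1kB record size and the 16k-record client batch come from this thread, while the broker address and the 64MB figure for case (B) are placeholders. Note that the producer's batch.size is a per-partition byte limit.

```java
import java.util.Properties;

public class BatchSizeSketch {
    public static void main(String[] args) {
        int recordSize = 1024;        // 1kB events, per the setup in this thread
        int clientBatch = 16 * 1024;  // records between flush() calls (the 16k case)

        // Case A: internal batch.size matched to the client-side batch, in bytes
        Properties caseA = new Properties();
        caseA.put("bootstrap.servers", "broker:9092");  // placeholder address
        caseA.put("acks", "1");
        caseA.put("batch.size", Integer.toString(clientBatch * recordSize));

        // Case B: batch.size large enough that size-based sends never trigger,
        // so only the explicit flush() drains the accumulator
        Properties caseB = new Properties();
        caseB.put("bootstrap.servers", "broker:9092");  // placeholder address
        caseB.put("acks", "1");
        caseB.put("batch.size", Integer.toString(64 * 1024 * 1024));  // 64MB, arbitrary

        System.out.println("case A batch.size = " + caseA.getProperty("batch.size"));
        System.out.println("case B batch.size = " + caseB.getProperty("batch.size"));
    }
}
```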
>> 
>> The numbers below are in MB/s. The 'Batch' column indicates the number
>> of events between each explicit client flush().
>> Setup is a 1-node broker with acks=1.
>> 
>>   1 partition
>>                           Batch=4k   Batch=8k   Batch=16k
>>   Equal batch sizes (A)         16         32          52
>>   Large batch.size  (B)        140        123         124
>> 
>>   4 partitions
>>                           Batch=4k   Batch=8k   Batch=16k
>>   Equal batch sizes (A)         35         61          82
>>   Large batch.size  (B)          7          7           7
>> 
>>   8 partitions
>>                           Batch=4k   Batch=8k   Batch=16k
>>   Equal batch sizes (A)         49         70          99
>>   Large batch.size  (B)          7          8           7
>> 
>> 
>> Two issues are noticeable in these numbers:
>> 1 - Case A is much faster than case B for 4 and 8 partitions.
>> 2 - Single-partition mode outperforms all others, and there case B is
>> faster than case A.
>> 
>> Side note: I used the client APIs from trunk while the broker is
>> running 0.8.2. (I don't think it matters, but wanted to point it out
>> nevertheless.)
>> 
>
