Re: Perf testing flush() - issues found

2015-05-02 Thread Roshan Naik
Thanks @Jay for suggesting changes to batch.size and linger.ms. I tried them out. It appears one can do better than the default batch.size for this synchronous batch mode with flush(). These new measurements give more "rational" numbers, with which I can reason and infer some rules of thumb (fo

Re: Perf testing flush() - issues found

2015-04-29 Thread Jay Kreps
Roshan, The client allocates a batch per partition and has a hard cap on memory usage (default 32MB). When it hits that cap it waits for in-flight requests to complete to use their memory. Setting the batch size to 20M is not good--that means each partition has a 20MB array allocated for it. This
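
A rough illustration of the memory arithmetic Jay describes, assuming each batch really occupies its full batch.size. This is a sketch, not the client's actual allocator; the class and method names are made up for the example. With the 32MB cap, a 20MB batch leaves room for only one batch at a time.

```java
// Hypothetical helper (not part of the Kafka client): counts how many
// full-size per-partition batches fit under the producer's memory cap.
final class BufferBudget {
    static long batchesThatFit(long bufferMemoryBytes, long batchSizeBytes) {
        // Integer division: a partially fitting batch does not count.
        return bufferMemoryBytes / batchSizeBytes;
    }

    public static void main(String[] args) {
        long cap = 32L * 1024 * 1024; // the 32MB default cap mentioned above
        System.out.println(batchesThatFit(cap, 20L * 1024 * 1024)); // 20MB batches
        System.out.println(batchesThatFit(cap, 16L * 1024));        // 16KB batches
    }
}
```

With 20MB batches, a second partition would have to wait for the first batch's memory to be released.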

Re: Perf testing flush() - issues found

2015-04-29 Thread Roshan Naik
For some reason the HTML formatting is being dropped from my email, making it harder to read the measurements table. On 4/29/15 8:32 PM, "Roshan Naik" wrote: > >@Jay, >My bad. I mistook the batch.size to be number of messages instead of >bytes. Below are revised measurements based on computing

Re: Perf testing flush() - issues found

2015-04-29 Thread Roshan Naik
@Jay, My bad. I mistook the batch.size to be the number of messages instead of bytes. Below are revised measurements based on computing the batch.size in bytes. @Jun, With an explicit flush(), linger should have no impact, should it? @Wang, Larger batches are not necessarily giving better numbe

Re: Perf testing flush() - issues found

2015-04-29 Thread Guozhang Wang
Just to add to Jun's suggestion: 1. Since the batch.size config is per-partition, with for example 4K messages * 1K message size between flush() calls and batch.size set to 4MB, then with 8 partitions, by the time of flush() each partition will get only 0.5MB, meaning you may not be batching sufficiently
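
Guozhang's numbers can be checked with a few lines of arithmetic. The sketch below assumes records are spread evenly across partitions and reads "4K" and "1K" as 4096 and 1024; the class and method names are illustrative only.

```java
// Hypothetical helper: how many bytes each partition accumulates between
// flush() calls, and what fraction of batch.size that represents.
final class FlushWindow {
    static long bytesPerPartition(long records, long recordBytes, int partitions) {
        // Assumes an even spread of records over partitions.
        return records * recordBytes / partitions;
    }

    static double fillRatio(long bytesPerPartition, long batchSizeBytes) {
        return (double) bytesPerPartition / batchSizeBytes;
    }

    public static void main(String[] args) {
        long perPartition = bytesPerPartition(4096, 1024, 8);
        long batchSize = 4L * 1024 * 1024; // batch.size of 4MB
        System.out.println(perPartition);                       // bytes per partition
        System.out.println(fillRatio(perPartition, batchSize)); // fraction of a batch
    }
}
```

Under these assumptions each partition holds 524288 bytes (0.5MB) at flush time, one eighth of the configured batch.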

Re: Perf testing flush() - issues found

2015-04-29 Thread Jay Kreps
Just want to confirm that when you say batch.size and number of records will be equal you don't mean that literally. The batch.size is in bytes so if you wanted a batch size of 16 1k messages for a single partition then you are setting batch.size=16*1024. -Jay On Tue, Apr 28, 2015 at 5:58 PM, Ros
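
For illustration, Jay's point that batch.size is a byte count can be written out as configuration. This sketch only builds a java.util.Properties object with the batch.size key discussed in the thread; the helper name is made up, and it assumes fixed-size records and ignores any per-record overhead.

```java
import java.util.Properties;

// Hypothetical helper: derives batch.size (bytes) from a desired record count.
final class ProducerConfigSketch {
    static Properties batchOf(int records, int recordBytes) {
        Properties props = new Properties();
        // batch.size is in bytes, so multiply record count by record size.
        props.setProperty("batch.size", Integer.toString(records * recordBytes));
        return props;
    }

    public static void main(String[] args) {
        // 16 messages of 1k each for a single partition.
        System.out.println(batchOf(16, 1024).getProperty("batch.size"));
    }
}
```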

Re: Perf testing flush() - issues found

2015-04-29 Thread Jun Rao
Note that the batch size is per partition. The more partitions you have, the longer it will take to fill up all partitions with the same batch size. So, you probably need to increase the linger time so that, independent of the number of partitions, the configured batch size can be reached. There
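
One way to reason about Jun's suggestion is to estimate how long a per-partition batch takes to fill. The sketch below assumes a steady aggregate send rate spread evenly over all partitions; the 10MB/s figure is a made-up input, not a measurement from this thread, and the names are illustrative.

```java
// Hypothetical estimate of the time needed to fill one batch per partition.
final class LingerEstimate {
    static double fillTimeMs(long batchSizeBytes, int partitions, long totalBytesPerSec) {
        // Each partition receives totalBytesPerSec / partitions, so the fill
        // time grows linearly with the partition count.
        return batchSizeBytes * (double) partitions * 1000.0 / totalBytesPerSec;
    }

    public static void main(String[] args) {
        long rate = 10L * 1024 * 1024; // assumed aggregate rate: 10MB/s
        System.out.println(fillTimeMs(16 * 1024, 8, rate));  // 8 partitions
        System.out.println(fillTimeMs(16 * 1024, 16, rate)); // doubling partitions doubles the time
    }
}
```

A linger time shorter than this estimate would tend to send batches before they are full.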

Re: Perf testing flush() - issues found

2015-04-28 Thread Roshan Naik
- Event size = 1kB. - Broker and client running on different machines (identical config, 32 cores, 256GB RAM, 6x 1500rpm disk, 10Gig Ethernet) - Don't readily have numbers for the old batch sync API with the same params, but can get them soon. However, does it matter? -roshan On 4/28/15 6:57 PM,

Re: Perf testing flush() - issues found

2015-04-28 Thread Joel Koshy
- What is the record size? - Is this a local setup? i.e., producer/broker running local? - Any overrides apart from batch size? E.g., linger time. - Can you establish a baseline - with the old producer's sync-send? Thanks, Joel On Wed, Apr 29, 2015 at 12:58:43AM +, Roshan Naik wrote: > Based

Perf testing flush() - issues found

2015-04-28 Thread Roshan Naik
Based on a recent suggestion by Joel, I am experimenting with using flush() to simulate batched-sync behavior. The essence of my single-threaded producer code is: for (int i = 0; i < numRecords;) { // 1- Send a batch for(int batchCounter=0; batchCounter f = producer.send(rec
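
The snippet above is cut off, so what follows is not a reconstruction of Roshan's code. It is a minimal, self-contained sketch of the general "send a batch, then flush" pattern, using a hypothetical in-memory stand-in (SketchProducer, CountingProducer) rather than the real Kafka client.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a producer; NOT the Kafka client API.
interface SketchProducer {
    void send(String record); // buffers the record without blocking
    void flush();             // blocks until everything buffered is sent
}

// In-memory implementation that only counts what happens.
final class CountingProducer implements SketchProducer {
    private final List<String> buffered = new ArrayList<>();
    int flushes = 0;
    int delivered = 0;

    public void send(String record) { buffered.add(record); }

    public void flush() {
        delivered += buffered.size();
        buffered.clear();
        flushes++;
    }
}

final class BatchedSync {
    static void run(SketchProducer producer, int numRecords, int batchSize) {
        for (int i = 0; i < numRecords; ) {
            // 1- Send up to batchSize records asynchronously.
            for (int b = 0; b < batchSize && i < numRecords; b++, i++) {
                producer.send("record-" + i);
            }
            // 2- Block until the whole batch has been handed off.
            producer.flush();
        }
    }
}
```

With 10 records and a batch of 4, this loop flushes three times (4, 4, and 2 records).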