Been evaluating the perf of the old and new Producer APIs for reliable, 
high-volume streaming data movement. I do see one area where the new API 
could be improved for synchronous clients.

AFAICT, the new API does not support batched synchronous transfers. To do a 
synchronous send, one needs to do a future.get() after every Producer.send(). I 
changed the new o.a.k.clients.tools.ProducerPerformance tool to assess the perf 
of this mode of operation. It may not be surprising that it is much slower than 
the async mode... hard to push it beyond 4MB/s.

The 0.8.1 Scala-based producer API supported a batched sync mode via 
Producer.send(List<KeyedMessage>). My measurements show that it was able to 
approach (and sometimes exceed) the old async speeds... 266MB/s


Supporting this batched sync mode is critical for streaming clients (such 
as Flume, for example) that need delivery guarantees. Although it can be done 
with async mode, that requires additional bookkeeping to track which events 
were delivered and which were not. The programming model becomes much simpler 
with the batched sync mode. Having the client deal with a single future.get() 
per batch also helps performance greatly, as I noted.
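
To make the bookkeeping concrete, here is a rough sketch of emulating a 
batched sync send on top of an async send() that returns a Future, which is 
how the new producer's send() behaves. AsyncSender, sendBatchAndWait and the 
fake sender in main() are hypothetical stand-ins I made up for illustration, 
not Kafka types; a real client would call the producer's send() there 
instead. Treat it as a sketch, not a proposed implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

public final class BatchedSyncSketch {

    /** Hypothetical stand-in for an async producer's send(); not a Kafka type. */
    public interface AsyncSender<R, A> {
        Future<A> send(R record);
    }

    /**
     * Fires every send without waiting, then blocks until each one has
     * completed. Returns the records whose send failed; an empty list
     * means the whole batch was acknowledged.
     */
    public static <R, A> List<R> sendBatchAndWait(AsyncSender<R, A> sender, List<R> batch)
            throws InterruptedException {
        // Phase 1: hand all records to the sender so it is free to batch them.
        List<Future<A>> pending = new ArrayList<>(batch.size());
        for (R record : batch) {
            pending.add(sender.send(record));
        }
        // Phase 2: wait for the acks. This index-matched list of futures is
        // the bookkeeping the client has to carry itself.
        List<R> failed = new ArrayList<>();
        for (int i = 0; i < pending.size(); i++) {
            try {
                pending.get(i).get();
            } catch (ExecutionException e) {
                failed.add(batch.get(i));
            }
        }
        return failed;
    }

    public static void main(String[] args) throws InterruptedException {
        // Fake sender (assumption): fails any record starting with "bad".
        AsyncSender<String, Long> fake = record -> {
            CompletableFuture<Long> ack = new CompletableFuture<>();
            if (record.startsWith("bad")) {
                ack.completeExceptionally(new RuntimeException("simulated failure"));
            } else {
                ack.complete((long) record.length());
            }
            return ack;
        };
        List<String> failed = sendBatchAndWait(fake, Arrays.asList("a", "bad-1", "c"));
        System.out.println("undelivered: " + failed);
    }
}
```

With a batched sync call in the API itself, the two phases and the list of 
futures would collapse into one send plus one wait on the client side.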

Wanted to propose adding this as an enhancement to the new Producer API.
