Batch failure is a bit meaningless, since in the same batch, some records can succeed and others may fail. To implement an error handling logic (usually different than retry, since the producer has a configuration controlling retries), we recommend using the callback option of Send().
Gwen P.S Awesome seeing you here, Roshan :) On Mon, Apr 27, 2015 at 1:53 PM, Roshan Naik <ros...@hortonworks.com> wrote: > The important guarantee that is needed for a client producer thread is > that it requires an indication of success/failure of the batch of events > it pushed. Essentially it needs to retry producer.send() on that same > batch in case of failure. My understanding is that flush will simply flush > data from all threads (correct me if I am wrong). > > -roshan > > > > On 4/27/15 1:36 PM, "Joel Koshy" <jjkosh...@gmail.com> wrote: > > >This sounds like flush: > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-8+-+Add+a+flush+meth > >od+to+the+producer+API > > > >which was recently implemented in trunk. > > > >Joel > > > >On Mon, Apr 27, 2015 at 08:19:40PM +0000, Roshan Naik wrote: > >> Been evaluating the perf of old and new Produce APIs for reliable high > >>volume streaming data movement. I do see one area of improvement that > >>the new API could use for synchronous clients. > >> > >> AFAIKT, the new API does not support batched synchronous transfers. To > >>do synchronous send, one needs to do a future.get() after every > >>Producer.send(). I changed the new > >>o.a.k.clients.tools.ProducerPerformance tool to asses the perf of this > >>mode of operation. May not be surprising that it much slower than the > >>async mode... hard t push it beyond 4MB/s. > >> > >> The 0.8.1 Scala based producer API supported a batched sync mode via > >>Producer.send( List<KeyedMessage> ) . My measurements show that it was > >>able to approach (and sometimes exceed) the old async speeds... 266MB/s > >> > >> > >> Supporting this batched sync mode is very critical for streaming > >>clients (such as flume for example) that need delivery guarantees. > >>Although it can be done with Async mode, it requires additional book > >>keeping as to which events are delivered and which ones are not. The > >>programming model becomes much simpler with the batched sync mode. > >>Client having to deal with one single future.get() helps performance > >>greatly too as I noted. > >> > >> Wanted to propose adding this as an enhancement to the new Producer API. > > > >