Hi Gwen, can you clarify: by batch do you mean the protocol MessageSet, or some java client internal construct? If the former I was under the impression that a produced MessageSet either succeeds delivery or errors in its entirety on the broker.
Thanks, Magnus 2015-04-27 23:05 GMT+02:00 Gwen Shapira <gshap...@cloudera.com>: > Batch failure is a bit meaningless, since in the same batch, some records > can succeed and others may fail. > To implement an error handling logic (usually different than retry, since > the producer has a configuration controlling retries), we recommend using > the callback option of Send(). > > Gwen > > P.S > Awesome seeing you here, Roshan :) > > On Mon, Apr 27, 2015 at 1:53 PM, Roshan Naik <ros...@hortonworks.com> > wrote: > > > The important guarantee that is needed for a client producer thread is > > that it requires an indication of success/failure of the batch of events > > it pushed. Essentially it needs to retry producer.send() on that same > > batch in case of failure. My understanding is that flush will simply > flush > > data from all threads (correct me if I am wrong). > > > > -roshan > > > > > > > > On 4/27/15 1:36 PM, "Joel Koshy" <jjkosh...@gmail.com> wrote: > > > > >This sounds like flush: > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-8+-+Add+a+flush+meth > > >od+to+the+producer+API > > > > > >which was recently implemented in trunk. > > > > > >Joel > > > > > >On Mon, Apr 27, 2015 at 08:19:40PM +0000, Roshan Naik wrote: > > >> Been evaluating the perf of old and new Produce APIs for reliable high > > >>volume streaming data movement. I do see one area of improvement that > > >>the new API could use for synchronous clients. > > >> > > >> AFAIKT, the new API does not support batched synchronous transfers. To > > >>do synchronous send, one needs to do a future.get() after every > > >>Producer.send(). I changed the new > > >>o.a.k.clients.tools.ProducerPerformance tool to asses the perf of this > > >>mode of operation. May not be surprising that it much slower than the > > >>async mode... hard t push it beyond 4MB/s. > > >> > > >> The 0.8.1 Scala based producer API supported a batched sync mode via > > >>Producer.send( List<KeyedMessage> ) . My measurements show that it was > > >>able to approach (and sometimes exceed) the old async speeds... 266MB/s > > >> > > >> > > >> Supporting this batched sync mode is very critical for streaming > > >>clients (such as flume for example) that need delivery guarantees. > > >>Although it can be done with Async mode, it requires additional book > > >>keeping as to which events are delivered and which ones are not. The > > >>programming model becomes much simpler with the batched sync mode. > > >>Client having to deal with one single future.get() helps performance > > >>greatly too as I noted. > > >> > > >> Wanted to propose adding this as an enhancement to the new Producer > API. > > > > > > > >