A batch-level failure isn't very meaningful, since within the same batch
some records can succeed while others fail.
To implement error-handling logic (usually something other than retrying,
since the producer already has a configuration controlling retries), we
recommend using the callback option of send().
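
A rough sketch of that pattern, in case it helps: fire off the whole batch
asynchronously, let each record's callback note whether it failed, block
once at the end, and hand back only the failed records. To keep it
self-contained, AsyncSender below is a hypothetical stand-in and not the
Kafka API; with the real Java producer you would pass a callback to
send(), and that callback receives the record metadata and an exception
(null on success). BatchSendSketch and sendBatch are illustrative names
too.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.function.BiConsumer;

class BatchSendSketch {
    /** Hypothetical stand-in for an asynchronous send-with-callback (not the Kafka API). */
    interface AsyncSender<R> {
        // Calls onDone(record, null) on success, or onDone(record, error) on failure.
        void send(R record, BiConsumer<R, Exception> onDone);
    }

    /** Sends every record without waiting per record, waits once, returns the failed records. */
    static <R> List<R> sendBatch(AsyncSender<R> sender, List<R> batch) {
        CountDownLatch pending = new CountDownLatch(batch.size());
        ConcurrentLinkedQueue<R> failed = new ConcurrentLinkedQueue<>();
        for (R record : batch) {
            sender.send(record, (rec, error) -> {
                if (error != null) {
                    failed.add(rec);      // per-record outcome, not per-batch
                }
                pending.countDown();
            });
        }
        try {
            pending.await();              // one blocking wait for the whole batch
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the batch", e);
        }
        return new ArrayList<>(failed);
    }

    public static void main(String[] args) {
        // Fake sender for illustration: rejects records whose value starts with "bad".
        AsyncSender<String> fake = (rec, onDone) ->
            onDone.accept(rec, rec.startsWith("bad") ? new RuntimeException("rejected") : null);
        List<String> failed = sendBatch(fake, Arrays.asList("a", "bad-1", "b"));
        System.out.println("failed records: " + failed);
    }
}
```

What to do with the returned list (log it, divert it elsewhere, alert) is
up to the client; since the producer already retries per its own
configuration, blindly re-sending here is usually not what you want.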

Gwen

P.S.
Awesome seeing you here, Roshan :)

On Mon, Apr 27, 2015 at 1:53 PM, Roshan Naik <ros...@hortonworks.com> wrote:

> The important guarantee a client producer thread needs is an indication of
> success/failure for the batch of events it pushed. Essentially, it needs to
> retry producer.send() on that same batch in case of failure. My
> understanding is that flush will simply flush data from all threads
> (correct me if I am wrong).
>
> -roshan
>
>
>
> On 4/27/15 1:36 PM, "Joel Koshy" <jjkosh...@gmail.com> wrote:
>
> >This sounds like flush:
> >
> >https://cwiki.apache.org/confluence/display/KAFKA/KIP-8+-+Add+a+flush+method+to+the+producer+API
> >
> >which was recently implemented in trunk.
> >
> >Joel
> >
> >On Mon, Apr 27, 2015 at 08:19:40PM +0000, Roshan Naik wrote:
> >> Been evaluating the perf of old and new Produce APIs for reliable high
> >>volume streaming data movement. I do see one area of improvement that
> >>the new API could use for synchronous clients.
> >>
> >> AFAICT, the new API does not support batched synchronous transfers. To
> >>do a synchronous send, one needs to do a future.get() after every
> >>Producer.send(). I changed the new
> >>o.a.k.clients.tools.ProducerPerformance tool to assess the perf of this
> >>mode of operation. It may not be surprising that it is much slower than
> >>the async mode... hard to push it beyond 4MB/s.
> >>
> >> The 0.8.1 Scala based producer API supported a batched sync mode via
> >>Producer.send( List<KeyedMessage> ) . My measurements show that it was
> >>able to approach (and sometimes exceed) the old async speeds... 266MB/s
> >>
> >>
> >> Supporting this batched sync mode is critical for streaming
> >>clients (such as Flume, for example) that need delivery guarantees.
> >>Although it can be done with async mode, it requires additional
> >>bookkeeping as to which events are delivered and which ones are not. The
> >>programming model becomes much simpler with the batched sync mode.
> >>Having the client deal with a single future.get() helps performance
> >>greatly too, as I noted.
> >>
> >> Wanted to propose adding this as an enhancement to the new Producer API.
> >
>
>
