In 0.8.2, we have a new Java producer that allows you to specify a callback for each message sent.
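
To illustrate how a per-message callback could answer the two questions below (what is still pending, and which messages failed), here is a minimal, library-free sketch. The class `SendTracker` and its method names are hypothetical and not part of Kafka; it only assumes that the caller is told, once per message, whether the send succeeded or failed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper (not part of Kafka): records the outcome of each async send. */
final class SendTracker {
    // Messages handed to the producer whose callback has not fired yet.
    private final AtomicLong inFlight = new AtomicLong();
    // Messages whose callback reported an error (i.e. retries were exhausted).
    private final Queue<String> failed = new ConcurrentLinkedQueue<>();

    /** Call just before handing a message to the producer. */
    void beforeSend(String message) {
        inFlight.incrementAndGet();
    }

    /** Call from the per-message callback; error is null when the send succeeded. */
    void onCompletion(String message, Exception error) {
        inFlight.decrementAndGet();
        if (error != null) {
            failed.add(message);
        }
    }

    /** Number of messages sent but not yet acknowledged or failed. */
    long inFlight() {
        return inFlight.get();
    }

    /** Snapshot of failed messages, e.g. to write to a file for the next run. */
    List<String> failedMessages() {
        return new ArrayList<>(failed);
    }
}
```

With the new producer, one would call `beforeSend` before `producer.send(record, callback)` and forward to `onCompletion` from inside the callback's `onCompletion(RecordMetadata metadata, Exception exception)` method, where a non-null exception means the send failed.
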

Thanks,

Jun

On Thu, Dec 18, 2014 at 12:07 PM, Xiaoyu Wang <xw...@rocketfuel.com> wrote:

> @Jun, We can increase the number of resends, but the produce request may
> still fail.
>
> For async producer, at the time when it fails, we have
>
>    - messages that are in queue but has not been sent. From javaapi, we
>    don't know which messages are still in queue.
>       - Is it possible that we expose the blocking queue size so we know
>       what remains in the queue?
>    - messages we have failed retrying. For the last batch, some may have
>    succeeded, but some failed retrying. From javaapi, we don't know what
>    are the messages failed.
>       - Is it possible to dump the failed messages to a file so that the
>       next run can pick them up?
>
> Does this make sense? Is there other way you will recommend to keep track
> of messages that have been sent for async producer?
>
> Thanks
>
> On Wed, Dec 17, 2014 at 10:58 AM, Jun Rao <j...@confluent.io> wrote:
> >
> > You can configure the number of resends on the producer.
> >
> > Thanks,
> >
> > Jun
> >
> > On Wed, Dec 17, 2014 at 10:34 AM, Xiaoyu Wang <xw...@rocketfuel.com>
> > wrote:
> > >
> > > I have tested using "async" producer with "required.ack=-1" and got
> > > really good performance.
> > >
> > > We have not used async producer much previously, any potential
> > > dataloss when a broker goes down? For example, when a broker goes
> > > down, does producer resend all the messages in a batch?
> > >
> > > On Wed, Dec 17, 2014 at 1:16 PM, Xiaoyu Wang <xw...@rocketfuel.com>
> > > wrote:
> > > >
> > > > Thanks Jun.
> > > >
> > > > We have tested our producer with the different required.ack config.
> > > > Even with the required.ack=1, the producer is > 10 times slower
> > > > than with required.ack=0. Does this confirm with your testing?
> > > >
> > > > I saw the presentation of LinkedIn Kafka SRE. Wondering what
> > > > configuration you guys have at LinkedIn to guarantee zero data loss.
> > > >
> > > > Thanks again and really appreciate your help!
> > > >
> > > > On Tue, Dec 16, 2014 at 9:50 PM, Jun Rao <j...@confluent.io> wrote:
> > > >>
> > > >> replica.lag.max.messages only controls when a replica should be
> > > >> dropped out of the in-sync replica set (ISR). For a message to be
> > > >> considered committed, it has to be added to every replica in ISR.
> > > >> When the producer uses ack=-1, the broker waits until the produced
> > > >> message is committed before acknowledging the client. So in the
> > > >> case of a clean leader election (i.e., there is at least one
> > > >> remaining replica in ISR), no committed messages are lost. In the
> > > >> case of an unclean leader election, the number of messages that
> > > >> can be lost depends on the state of the replicas and it's possible
> > > >> to lose more than replica.lag.max.messages messages.
> > > >>
> > > >> We do have the lag jmx metric per replica (see
> > > >> http://kafka.apache.org/documentation.html#monitoring).
> > > >>
> > > >> Thanks,
> > > >>
> > > >> Jun
> > > >>
> > > >> On Sun, Dec 14, 2014 at 7:20 AM, Xiaoyu Wang <xw...@rocketfuel.com>
> > > >> wrote:
> > > >> >
> > > >> > Hello,
> > > >> >
> > > >> > If I understand it correctly, when the number of messages a
> > > >> > replica is behind from the leader is < replica.lag.max.messages,
> > > >> > the replica is considered in sync with the master and are
> > > >> > eligible for leader election.
> > > >> >
> > > >> > This means we can lose at most replica.lag.max.messages messages
> > > >> > during leader election, is it? We can set the
> > > >> > replica.lag.max.messages to be very low, but then we may result
> > > >> > in unclean leader election, so still we can lose data.
> > > >> >
> > > >> > Can you recommend some way to prevent data loss? We have tried
> > > >> > setting require ack from all replicas, but that slows down
> > > >> > producer significantly.
> > > >> >
> > > >> > In addition, do we have metrics about how far each replica is
> > > >> > behind? If not, can we add them.
> > > >> >
> > > >> > Thanks,