Hello Jiang, That is a valid point. The reason we design ack=-1 to be "receive acks from replicas in ISR" is basically trading consistency for availability. I think instead of change it meaning, we could add another ack, -2 for instance, to specify "receive acks from all replicas" as a favor of consistency.
Since you already did this much investigation would you like to file a JIRA and submit a patch for this? Guozhang On Fri, Jul 11, 2014 at 11:49 AM, Jiang Wu (Pricehistory) (BLOOMBERG/ 731 LEX -) <jwu...@bloomberg.net> wrote: > Hi, > I'm doing stress and failover tests on a 3 node 0.8.1.1 kafka cluster and > have the following observations. > > A topic is created with 1 partition and 3 replications. > request.required.acks is set to -1 for a sync producer. When the publishing > speed is high (3M messages, each 2000 bytes, published in lists of size > 2000), the two followers will fail out of sync. Only the leader remains in > ISR. But the producer can keep sending. If the leader is killed with CTR_C, > one follower will become leader, but message loss will happen because of > the unclean leader election. > > In the same test, request.required.acks=3 gives the desired result. > Followers will fail out of sync, but the producer will be blocked untill > all followers back to ISR. No data loss is observed in this case. > > From the code, this turns out to be how it's designed: > if ((requiredAcks < 0 && numAcks >= inSyncReplicas.size) || > (requiredAcks > 0 && numAcks >= requiredAcks)) { > /* > * requiredAcks < 0 means acknowledge after all replicas in ISR > * are fully caught up to the (local) leader's offset > * corresponding to this produce request. > */ > (true, ErrorMapping.NoError) > } > > I'm wondering if it's more reasonable to let request.required.acks=-1 mean > "receive acks from all replicas" instead of "receive acks from replicas in > ISR"? As in the above test, follower will fail out sync under high > publishing volume; that makes request.required.acks=-1 equivalent to > request.required.acks=1. Since the kafka document states > request.required.acks=-1 provides the best durability, one would expect it > is equivalent to request.required.acks=number_of_replications. > > Regards, > Jiang -- -- Guozhang