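To avoid data loss, you will need to (1) use acks=-1 in the producer (request.required.acks=-1 in the 0.8 producer), so that a send is acknowledged only after all in-sync replicas have the message; (2) configure enough retries and backoff time in the producer that a new leader can be elected before the retries run out during a failure.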
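As a rough sketch of the two settings above, the snippet below builds the producer overrides and checks that retries times backoff is longer than the time a leader election takes. It assumes the 0.8 producer property names (request.required.acks, message.send.max.retries, retry.backoff.ms). The helper names and the 5 second election time are made up for illustration, so measure the real value on your own cluster.

```python
# Sketch only: helper names and ASSUMED_ELECTION_MS are illustrative assumptions.

def producer_overrides(retries, backoff_ms):
    """Return producer settings as string key/value pairs (0.8 producer names)."""
    return {
        # -1: the leader waits for all in-sync replicas before acknowledging.
        "request.required.acks": "-1",
        # How many times a failed send is retried.
        "message.send.max.retries": str(retries),
        # How long the producer waits before each retry.
        "retry.backoff.ms": str(backoff_ms),
    }


def retry_window_ms(retries, backoff_ms):
    """Approximate time the producer keeps retrying before giving up."""
    return retries * backoff_ms


def covers_election(retries, backoff_ms, election_ms):
    """True if the retry window is at least as long as a leader election."""
    return retry_window_ms(retries, backoff_ms) >= election_ms


# Hypothetical figure; the original report only says "a couple of seconds".
ASSUMED_ELECTION_MS = 5000

config = producer_overrides(retries=10, backoff_ms=1000)

if __name__ == "__main__":
    for key, value in sorted(config.items()):
        print(f"{key}={value}")
    print("covers election:", covers_election(10, 1000, ASSUMED_ELECTION_MS))
```

With these example values the producer keeps retrying for about 10 seconds, which leaves some headroom over the assumed election time. A window shorter than the election would let retries run out before the new leader is known.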
For controlled failure, you can reduce the unavailability window by using controlled shutdown. See
http://kafka.apache.org/documentation.html#basic_ops_restarting

Thanks,

Jun

On Wed, Sep 3, 2014 at 11:48 AM, Andrew Otto <ao...@wikimedia.org> wrote:

> Hiya,
>
> During leader changes, we see short periods of message loss on some of our
> higher volume producers. I suspect that this is because it takes a couple
> of seconds for Zookeeper to notice and notify the producers of the metadata
> change. During this time, producer buffers can fill up and end up dropping
> some messages.
>
> I’d like to do some troubleshooting. Is it possible to manually change
> the leadership of a single partition? I see here[1] that I can start a
> leadership election for a particular partition, but the JSON doesn’t show a
> way to choose the new leader of the partition.
>
> Thanks!
> -Andrew Otto
>
> [1]
> https://cwiki.apache.org/confluence/display/KAFKA/Replication+tools#Replicationtools-Howtousethetool?.1