Re: Question: Data Loss and Data Duplication in Kafka

2016-09-05 Thread Jayesh Thakrar
below?
From: R Krishna
To: users@kafka.apache.org; Jayesh Thakrar
Sent: Tuesday, August 30, 2016 2:02 AM
Subject: Re: Question: Data Loss and Data Duplication in Kafka
Experimenting with Kafka myself, and found timeouts/batch expiry (valid and invalid configurations), and

Re: Question: Data Loss and Data Duplication in Kafka

2016-08-30 Thread R Krishna
Experimenting with Kafka myself, I found that timeouts/batch expiry (with both valid and invalid configurations) and max retries can also drop messages unless you handle and log them gracefully. There are also a bunch of exceptions in the org.apache.kafka.common.KafkaException hierarchy, some of which are thrown for v

Question: Data Loss and Data Duplication in Kafka

2016-08-28 Thread Jayesh Thakrar
I am looking at ways one might have data loss and duplication in a Kafka cluster and need some help/pointers/discussion. So far, here's what I have come up with:
Loss at producer side
Since the data send call actually adds data to a cache/buffer, a crash of the producer can potentially r