Hi Users, I am facing message loss while using Kafka v0.8.2.2. Please see the
details below and help me if you can.

Issue: Two messages produced to the same partition one after another – the
Kafka producer returns the same offset for both, which means the message
produced earlier is lost.
<http://stackoverflow.com/questions/37732088/2-messages-produced-to-same-partition-one-by-one-message-1-overridden-by-next>

Details:
I have an unusual problem that happens roughly 50-100 times a day on a topic
with a volume of more than 2 million messages per day. I am using the Kafka
producer API 0.8.2.2 and have 12 brokers (v0.8.2.2) running in production with
a replication factor of 4. The topic has 60 partitions, and I calculate the
partition for each message myself and provide it in the ProducerRecord. Now,
the issue -

The application creates a ProducerRecord using -

new ProducerRecord<String, String>(topic, 30, null, message1);
providing the topic, partition 30, a null key, and the value message1. Then the
application calls the send method, and a Future is returned -

// null is for callback
Future<RecordMetadata> future = producer.send(producerRecord, null);
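As a side note, instead of passing null, the 0.8.2 producer API also accepts a per-record Callback, which at least makes send failures visible as they happen. A minimal sketch of that pattern (the topic name and logging here are placeholders of mine, and this assumes an already configured producer, so it is not runnable standalone):

```java
import java.util.concurrent.Future;
import org.apache.kafka.clients.producer.Callback;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.clients.producer.RecordMetadata;

// Sketch only: "my-topic" is a placeholder; producer is assumed configured.
public class SendWithCallback {
    static Future<RecordMetadata> send(KafkaProducer<String, String> producer,
                                       String message) {
        ProducerRecord<String, String> record =
                new ProducerRecord<String, String>("my-topic", 30, null, message);
        // Pass a Callback rather than null so per-record errors surface.
        return producer.send(record, new Callback() {
            public void onCompletion(RecordMetadata metadata, Exception exception) {
                if (exception != null) {
                    System.err.println("Send failed: " + exception);
                } else {
                    System.out.println("partition " + metadata.partition()
                            + ", offset " + metadata.offset());
                }
            }
        });
    }
}
```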
The app then prints the offset and partition by calling get() on the Future
and reading the values from the RecordMetadata - this is what I get -

Kafka Response : partition 30, offset 3416092
Next, the app produces the second message, message2, to the same partition -

new ProducerRecord<String, String>(topic, 30, null, message2);
and the Kafka response is -

Kafka Response : partition 30, offset 3416092
I received the same offset again, and if I pull the message at that offset of
partition 30 using a simple consumer, it turns out to be message2, which
essentially means I lost message1.

Currently, the messages are produced using 10 threads, each with its own
instance of the Kafka producer (earlier the threads shared one Kafka producer,
but it performed slowly and we still had message loss).
I am using all default producer properties except the few listed below; the
message (String payload) size ranges from a few KB to 500 KB. I am using an
acks value of 1.

value.serializer: org.apache.kafka.common.serialization.StringSerializer
key.serializer: org.apache.kafka.common.serialization.StringSerializer
bootstrap.servers: {SERVER VIP ENDPOINT}
acks: 1
batch.size: 204800
linger.ms: 10
send.buffer.bytes: 1048576
max.request.size: 10000000
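For completeness, here is how this configuration might be assembled in code; the bootstrap endpoint is a placeholder, as in the list above:

```java
import java.util.Properties;

// Sketch: the producer configuration listed above, as a Properties object.
public class ProducerProps {
    static Properties build(String bootstrapServers) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("acks", "1");                    // wait for leader ack only
        props.put("batch.size", "204800");         // 200 KB batches
        props.put("linger.ms", "10");              // wait up to 10 ms to batch
        props.put("send.buffer.bytes", "1048576"); // 1 MB socket send buffer
        props.put("max.request.size", "10000000"); // ~10 MB max request
        return props;
    }
}
```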

What am I doing wrong here? Is there anything I can look into, or any producer
or server property I can tweak, to make sure I don't lose any messages? I need
some help soon, as I am losing critical messages in production, which is not
good at all; and since the Kafka producer throws no exception, it is hard to
even detect the loss unless a downstream process reports it.

Thank you,
Vikram Gulia
