I'm looking at message delivery patterns for Kafka consumers and wanted to get people's thoughts on the following problem:
The objective is to process individual messages with as much certainty as possible, i.e. "at least once" guarantees. I'd like a Kafka consumer to pull n messages (say 100 for argument's sake), process them, commit the offset, then grab 100 more. The issue arises when a single message fails: for example, message 30 cannot be deserialized, message 40 failed because some third-party service was down for an instant, etc.

To handle this we're looking at a topic / topic_retry pattern for consumers: on a single-message failure we'd put messages 30 and 40 on the retry topic with a failure count of 1, and once that failure count exceeds 3 the message goes to cold storage for manual analysis. Only after all 100 messages have been handled, either successfully or by confirming they were re-enqueued, do we commit the offset and grab the next batch. (I've put a rough sketch of the loop I have in mind in a P.S. below.)

On top of that, if the percentage of messages landing on the retry topic goes over a threshold, we'd trip a circuit breaker so the consumer stops pulling messages until the issue is resolved, to prevent retry flooding.

What patterns are people currently using to handle message failures at scale with Kafka? Pardon if this is a frequent question, but the http://search-hadoop.com/kafka server is down so I can't search the archives at the moment.

thanks, Jim
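P.S. For concreteness, here is roughly the consume/retry loop I have in mind, using the Java client. The topic names (orders, orders_retry, orders_dead), the MAX_RETRIES constant, and carrying the failure count in a record header are just illustrative assumptions, not a worked-out implementation:

import java.nio.charset.StandardCharsets;
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.header.Header;

public class RetryingBatchConsumer {
    static final int MAX_RETRIES = 3;          // past this, park for manual analysis

    public static void main(String[] args) throws Exception {
        Properties c = new Properties();
        c.put("bootstrap.servers", "localhost:9092");
        c.put("group.id", "orders-processor");
        c.put("enable.auto.commit", "false");  // commit manually, once per batch
        c.put("max.poll.records", "100");      // "pull n messages, assuming 100"
        c.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        // Keep values as raw bytes and deserialize in application code, so one
        // poison message is caught per-record instead of failing the whole poll().
        c.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");

        Properties p = new Properties();
        p.put("bootstrap.servers", "localhost:9092");
        p.put("acks", "all");
        p.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        p.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");

        try (KafkaConsumer<String, byte[]> consumer = new KafkaConsumer<>(c);
             KafkaProducer<String, byte[]> producer = new KafkaProducer<>(p)) {
            consumer.subscribe(Collections.singletonList("orders"));
            while (true) {
                ConsumerRecords<String, byte[]> batch = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, byte[]> rec : batch) {
                    try {
                        process(rec);                  // deserialize + business logic
                    } catch (Exception e) {
                        requeue(producer, rec);        // blocks until the broker acks
                    }
                }
                consumer.commitSync();                 // offset moves only after every message
            }                                          // succeeded or was re-enqueued
        }
    }

    // Bump the failure count and route to the retry topic, or to cold storage
    // once the count would exceed MAX_RETRIES.
    static void requeue(KafkaProducer<String, byte[]> producer,
                        ConsumerRecord<String, byte[]> rec) throws Exception {
        int attempts = retryCount(rec) + 1;
        String target = attempts > MAX_RETRIES ? "orders_dead" : "orders_retry";
        ProducerRecord<String, byte[]> out = new ProducerRecord<>(target, rec.key(), rec.value());
        out.headers().add("retry-count",
                Integer.toString(attempts).getBytes(StandardCharsets.UTF_8));
        producer.send(out).get();                      // synchronous: the at-least-once handoff
    }

    static int retryCount(ConsumerRecord<String, byte[]> rec) {
        Header h = rec.headers().lastHeader("retry-count");
        return h == null ? 0 : Integer.parseInt(new String(h.value(), StandardCharsets.UTF_8));
    }

    static void process(ConsumerRecord<String, byte[]> rec) throws Exception {
        // deserialize rec.value() and do the real work here
    }
}

The idea is that a second instance of the same loop, subscribed to orders_retry, does the re-drive, and the retry-count header survives the round trips.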
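For the circuit-breaker piece, I'm imagining pause()/resume() on the consumer, since a paused consumer's poll() returns nothing but still heartbeats, so it stops pulling without leaving the group. The 10% threshold and the downstreamHealthy() helper below are placeholders of mine, and the snippet assumes it sits inside the poll loop above, with the for-each counting failures:

static final double RETRY_RATE_THRESHOLD = 0.10;   // assumption: trip at 10% failures

// ... inside the poll loop, after processing the batch:
int failed = 0;
for (ConsumerRecord<String, byte[]> rec : batch) {
    try { process(rec); } catch (Exception e) { requeue(producer, rec); failed++; }
}
consumer.commitSync();

if (batch.count() > 0 && (double) failed / batch.count() > RETRY_RATE_THRESHOLD) {
    consumer.pause(consumer.assignment());         // stop fetching, stay in the group
    while (!downstreamHealthy()) {                 // hypothetical health check
        consumer.poll(Duration.ofSeconds(5));      // returns empty while paused, but
    }                                              // keeps us from being rebalanced away
    consumer.resume(consumer.assignment());
}

Does that roughly match what others are doing, or is there a better-established pattern?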