Re: Retry Message Consumption On Database Failure

Jason Gustafson Mon, 14 Mar 2016 10:16:54 -0700

Hey Michael,

I don't think a policy of retrying indefinitely is generally possible with
the new consumer even if you had a heartbeat API. The problem is that the
consumer itself doesn't control when the group needs to rebalance. If
another consumer joins or leaves the group, then all consumers will need to
rebalance, regardless of whether they are in the middle of message processing
or not. Once the rebalance completes, the consumer may or may not get
assigned the same partition that the message came from. That said, if a
rebalance is unlikely because the group is stable, then you could use the
pause() API to move the message processing to a background thread. Roughly,
it would look like this:


1. Receive a message for partition 0 from poll().
2. Pause partition 0 using pause().
3. Send the message to a background thread for processing and continue
calling poll().
4. When the processing finishes, resume() the partition.
5. If the group rebalances before processing finishes, there are two cases:
  a) if partition 0 is reassigned, pause() it again in the
onPartitionsAssigned() callback (and you may also want to verify that the
last committed offset is still what you expect)
  b) otherwise, abort the background processing thread.
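
The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the Kafka client API: FakeConsumer is a hypothetical in-memory stand-in that mimics poll/pause/resume for a single partition, and PauseResumeSketch, pollOnce and hasInFlight are made-up names. With the real client, the same calls would go through KafkaConsumer's pause() and resume() and a ConsumerRebalanceListener, all from the polling thread, since the consumer is not thread-safe.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

// Hypothetical in-memory stand-in for a consumer with one partition (0).
// It only mimics poll/pause/resume and is NOT the Kafka client API.
class FakeConsumer {
    private final Queue<String> partition0 = new ArrayDeque<>();
    private final Set<Integer> paused = new HashSet<>();

    FakeConsumer(Collection<String> messages) { partition0.addAll(messages); }

    // Returns the next message, or null when the partition is paused or empty.
    String poll() { return paused.contains(0) ? null : partition0.poll(); }
    void pause(int partition) { paused.add(partition); }
    void resume(int partition) { paused.remove(partition); }
    boolean isPaused(int partition) { return paused.contains(partition); }
}

class PauseResumeSketch {
    private final FakeConsumer consumer;
    // Daemon worker thread so a stuck task cannot keep the JVM alive.
    private final ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r);
        t.setDaemon(true);
        return t;
    });
    private Future<?> inFlight; // background processing of the current message, if any

    PauseResumeSketch(FakeConsumer consumer) { this.consumer = consumer; }

    // One iteration of the poll loop (steps 1-4). Call from a single thread only.
    void pollOnce(Consumer<String> processor) {
        if (inFlight != null && inFlight.isDone()) {
            inFlight = null;
            consumer.resume(0); // step 4: processing finished, resume the partition
        }
        String message = consumer.poll(); // steps 1 and 3: keep polling either way
        if (message != null) {
            consumer.pause(0); // step 2: stop fetching from partition 0
            inFlight = worker.submit(() -> processor.accept(message)); // step 3
        }
    }

    // Step 5: call after a rebalance with the newly assigned partitions.
    void onPartitionsAssigned(Set<Integer> assigned) {
        if (inFlight == null) {
            return; // nothing was being processed
        }
        if (assigned.contains(0)) {
            consumer.pause(0); // 5a: partition is still ours, keep it paused
        } else {
            inFlight.cancel(true); // 5b: partition lost, abort background processing
            inFlight = null;
        }
    }

    boolean hasInFlight() { return inFlight != null; }
}
```

In this sketch the processor passed to pollOnce is where the database write (with whatever retry policy you choose) would go; the offset commit, omitted here, would happen after the task completes and before resume().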

Would that work for your case? It's also worth mentioning that there's a
proposal to add a sticky partition assignor to Kafka, which would make 5.b
less likely.

-Jason



On Fri, Mar 11, 2016 at 1:03 AM, Michael Freeman <[email protected]>
wrote:

> Thanks Christian,
>
> Sending a heartbeat without having to poll would also be useful when
> using a large max.partition.fetch.bytes.
>
> For now I'm just going to shut the consumer down and restart after x
> period of time.
>
> Thanks for your insights.
>
> Michael
>
> > On 10 Mar 2016, at 18:33, Christian Posta <[email protected]>
> > wrote:
> >
> > Yah that's a good point. That was brought up in another thread.
> >
> > The granularity of what poll() does needs to be addressed. It tries to do too
> > many things at once, including heartbeating. Not so sure that's entirely
> > necessary.
> >
> > On Thu, Mar 10, 2016 at 1:40 AM, Michael Freeman <[email protected]>
> > wrote:
> >
> >> Thanks Christian,
> >>
> >> We would want to retry indefinitely, or at least for, say, x minutes. If
> >> we don't poll, how do we keep the heartbeat to Kafka alive? We never want
> >> to lose this message and only want to commit to Kafka when the message is
> >> in Mongo, either as a successful message in a collection or as an
> >> unsuccessful message in an error collection.
> >>
> >> Right now I let the consumer die and don't create a new one for x minutes.
> >> This causes a lot of rebalancing.
> >>
> >> Michael
> >>
> >>> On 9 Mar 2016, at 21:12, Christian Posta <[email protected]>
> >>> wrote:
> >>>
> >>> So you have to decide how long you're willing to "wait" for the Mongo
> >>> DB to come back, and what you'd like to do with that message. For
> >>> example, do you just retry inserting into Mongo for a predefined period
> >>> of time? Do you try forever? If you try forever, are you okay with the
> >>> consumer threads blocking indefinitely? Or maybe you implement a
> >>> "circuit breaker" to shed load to Mongo? Or are you willing to stash the
> >>> message into a DLQ and move on to the next message?
> >>>
> >>> You don't need to "re-consume" the message, do you? Can you just retry
> >>> and/or backoff-retry with the message you have, and only "commit" the
> >>> offset if the insert succeeds?
> >>>
> >>>
> >>>
> >>> On Wed, Mar 9, 2016 at 2:00 PM, Michael Freeman <[email protected]>
> >>> wrote:
> >>>
> >>>> Hey,
> >>>>
> >>>> My team is new to Kafka and we are using the examples found at
> >>>> http://www.confluent.io/blog/tutorial-getting-started-with-the-new-apache-kafka-0.9-consumer-client
> >>>>
> >>>> We process messages from Kafka and persist them to Mongo.
> >>>> If Mongo is unavailable, we are wondering how we can re-consume the
> >>>> messages while we wait for Mongo to come back up.
> >>>>
> >>>> Right now we commit after the messages for each partition are
> >>>> processed (following the example).
> >>>> I have tried a few approaches.
> >>>>
> >>>> 1. Catch the application exception and skip the Kafka commit. However,
> >>>> the next poll does not re-consume the messages.
> >>>> 2. Allow the consumer to fail and restart the consumer. This works but
> >>>> causes a rebalance.
> >>>>
> >>>> Should I instead store the offset and partition (in memory) and
> >>>> attempt to re-seek in order to re-consume the messages?
> >>>>
> >>>> What's the best-practice approach in this kind of situation? My
> >>>> priority is to never lose a message and to ensure it makes it to Mongo.
> >>>> (Redelivery is OK.)
> >>>>
> >>>> Thanks for any help or pointers in the right direction.
> >>>>
> >>>> Michael
> >>>
> >>>
> >>>
> >>> --
> >>> *Christian Posta*
> >>> twitter: @christianposta
> >>> http://www.christianposta.com/blog
> >>> http://fabric8.io
> >
> >
> >
> > --
> > *Christian Posta*
> > twitter: @christianposta
> > http://www.christianposta.com/blog
> > http://fabric8.io
>
