We had to have one thread use the Consumer to poll and get records. It
would then put them into an in-memory blocking queue, pause our
subscriptions, and have separate threads pull work from the blocking queue.
Meanwhile, the thread with the consumer would keep calling "poll" (getting
no data back because we paused our subscriptions). Once the blocking queue
was empty, the subscriptions would be resumed and "poll" would fetch data again.
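
Roughly, the polling thread looks something like this (just a sketch to
illustrate the pattern, not our actual code; the class and variable names
are made up, error handling is minimal, and it assumes a kafka-clients
version where pause()/resume() take a Collection):

import java.util.Collections;
import java.util.Properties;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class PausingPoller implements Runnable {
    private final KafkaConsumer<String, String> consumer;
    private final BlockingQueue<ConsumerRecord<String, String>> workQueue =
            new LinkedBlockingQueue<>(1000);
    private volatile boolean running = true;

    public PausingPoller(Properties props, String topic) {
        consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Collections.singletonList(topic));
    }

    public BlockingQueue<ConsumerRecord<String, String>> queue() {
        return workQueue;
    }

    public void shutdown() {
        running = false;
    }

    @Override
    public void run() {
        try {
            while (running) {
                // poll() keeps running even while paused, so the group session
                // stays alive; it just comes back with no records
                ConsumerRecords<String, String> records = consumer.poll(100);
                for (ConsumerRecord<String, String> record : records) {
                    workQueue.put(record); // hand off to the worker threads
                }
                if (!records.isEmpty()) {
                    // stop fetching until the workers drain the queue
                    consumer.pause(consumer.assignment());
                } else if (workQueue.isEmpty()) {
                    consumer.resume(consumer.assignment());
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            consumer.close();
        }
    }
}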

We then had to have an object that the worker threads could use to "commit"
their offsets as they completed a task. The consumer thread would check to
see if there were offsets that needed to be committed, and if so it would
commit them.
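
The offset-tracking object can be as simple as a thread-safe map that the
workers write into and the polling thread drains, so commitSync() is only
ever called from the thread that owns the KafkaConsumer (it isn't thread
safe). Again, just a sketch with made-up names:

import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class OffsetTracker {
    // highest completed offset per partition, written by the worker threads
    private final Map<TopicPartition, Long> completed = new ConcurrentHashMap<>();

    // called by a worker thread when it finishes processing a record
    public void markDone(TopicPartition partition, long offset) {
        completed.merge(partition, offset, Math::max);
    }

    // called only from the polling thread, which owns the KafkaConsumer
    public void commitIfNeeded(KafkaConsumer<?, ?> consumer) {
        if (completed.isEmpty()) {
            return;
        }
        Map<TopicPartition, OffsetAndMetadata> toCommit = new HashMap<>();
        for (Map.Entry<TopicPartition, Long> entry : completed.entrySet()) {
            // commit the offset *after* the last completed record
            toCommit.put(entry.getKey(), new OffsetAndMetadata(entry.getValue() + 1));
            // only clear the entry if a worker hasn't bumped it in the meantime
            completed.remove(entry.getKey(), entry.getValue());
        }
        consumer.commitSync(toCommit);
    }
}

A worker would call markDone(new TopicPartition(record.topic(),
record.partition()), record.offset()) when it finishes a record, and the
polling loop would call commitIfNeeded(consumer) once per pass.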

It was a bunch of work to get it working and it's still not perfect, but
it's getting the job done.

-craig

On Tue, Jul 26, 2016 at 3:13 PM, Joey Echeverria <j...@rocana.com> wrote:

> We've been playing around with the new Consumer API and have hit an
> unfortunate bump in the road. When our onPartitionsRevoked() callback
> is called we'd like to be able to commit any data that we were
> processing to stable storage so we can then commit the offsets back to
> Kafka. This way we don't throw away in-progress work. The problem is
> that onPartitionsRevoked() is called from the same thread running
> poll() which means that heartbeats are paused while the callback is
> processing. Our commits sometimes take a few minutes which means that
> we'll lose our Kafka session and the partitions we were trying to
> commit will get reassigned. The end result of that is more duplicates
> in our stored data.
>
> Has anyone else encountered this? We're probably going to just do a
> rollback rather than a commit in the callback to return quickly, but I
> wanted to check to see if there was something we were missing.
>
> -Joey
>



-- 

https://github.com/mindscratch
https://www.google.com/+CraigWickesser
https://twitter.com/mind_scratch
https://twitter.com/craig_links
