[ https://issues.apache.org/jira/browse/FLINK-4702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15529580#comment-15529580 ]
ASF GitHub Bot commented on FLINK-4702: --------------------------------------- Github user tzulitai commented on the issue: https://github.com/apache/flink/pull/2559 @StephanEwen On a second look, I think the `commitSpecificOffsetsToKafka` method was designed to commit synchronously in the first place. `AbstractFetcher` holds a Map of all current pending offsets for committing by checkpointID, and on every `notifyCheckpointComplete` the offsets are removed from the Map before `commitSpecificOffsetsToKafka` is called. So, for async committing, I think we need to remove cleaning up the offsets in `AbstractFetcher#notifyCheckpointComplete()` and instead clean them up in a new separate callback handle method in `AbstractFetcher`. > Kafka consumer must commit offsets asynchronously > ------------------------------------------------- > > Key: FLINK-4702 > URL: https://issues.apache.org/jira/browse/FLINK-4702 > Project: Flink > Issue Type: Bug > Components: Kafka Connector > Affects Versions: 1.1.2 > Reporter: Stephan Ewen > Assignee: Stephan Ewen > Priority: Blocker > Fix For: 1.2.0, 1.1.3 > > > The offset commit calls to Kafka may occasionally take very long. > In that case, the {{notifyCheckpointComplete()}} method blocks for long and > the KafkaConsumer cannot make progress and cannot perform checkpoints. > Kafka 0.9+ have methods to commit asynchronously. > We should use those and make sure no more than one commit is concurrently in > progress, to that commit requests do not pile up. -- This message was sent by Atlassian JIRA (v6.3.4#6332)