[ https://issues.apache.org/jira/browse/KAFKA-2168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14563753#comment-14563753 ]
Guozhang Wang commented on KAFKA-2168:
--------------------------------------

I would also prefer to stick with single-threaded consumer usage, and I agree that KAFKA-2123 would be important to have then.

> New consumer poll() can block other calls like position(), commit(), and close() indefinitely
> ---------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-2168
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2168
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients, consumer
>            Reporter: Ewen Cheslack-Postava
>            Assignee: Jason Gustafson
>
> The new consumer is currently using very coarse-grained synchronization. For
> most methods this isn't a problem since they finish quickly once the lock is
> acquired, but poll() might run for a long time (and commonly will, since
> polling with long timeouts is a normal use case). This means any operations
> invoked from another thread may block until the poll() call completes.
>
> Some example use cases where this can be a problem:
> * A shutdown hook is registered to trigger shutdown and invokes close(). It
>   gets invoked from another thread and blocks indefinitely.
> * A user wants to manage offset commits themselves in a background thread. If
>   the commit policy is not purely time based, it's not currently possible to
>   make sure the call to commit() will be processed promptly.
>
> Two possible solutions to this:
> 1. Make sure a lock is not held during the actual select call. Since we have
>    multiple layers (KafkaConsumer -> NetworkClient -> Selector -> nio Selector)
>    this is probably hard to make work cleanly, since locking is currently only
>    performed at the KafkaConsumer level and we'd want it unlocked around a
>    single line of code in Selector.
> 2. Wake up the selector before synchronizing for certain operations. This
>    would require some additional coordination to make sure the caller of
>    wakeup() is able to acquire the lock promptly (instead of, e.g., the poll()
>    thread being woken up and then promptly reacquiring the lock with a
>    subsequent long poll() call).

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
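
For reference, the second proposed solution in the issue description (wake up the selector before synchronizing) could look roughly like the sketch below. This is only an illustration under stated assumptions, not the Kafka implementation: WakeupSketch is a hypothetical stand-in for the consumer built directly on a java.nio Selector, and the fair lock plus the pendingCallers counter are one assumed way to provide the "additional coordination" so that the poll() thread does not immediately start another long select.

```java
import java.io.IOException;
import java.nio.channels.Selector;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for the consumer; NOT the real KafkaConsumer API.
class WakeupSketch {
    // Fair lock: a caller already queued for the lock is served before a re-entering poll().
    private final ReentrantLock lock = new ReentrantLock(true);
    // Number of non-poll callers that have announced they want the lock (assumed coordination).
    private final AtomicInteger pendingCallers = new AtomicInteger();
    private final Selector selector;
    private boolean closed = false; // guarded by lock

    WakeupSketch() throws IOException {
        selector = Selector.open();
    }

    // Blocks in select() for up to timeoutMs; returns false once the sketch is closed.
    boolean poll(long timeoutMs) throws IOException {
        lock.lock();
        try {
            if (closed) {
                return false;
            }
            if (pendingCallers.get() > 0) {
                selector.selectNow();        // another caller is waiting: do not block
            } else {
                selector.select(timeoutMs);  // returns early if wakeup() is called
            }
            return true;
        } finally {
            lock.unlock();
        }
    }

    void close() throws IOException {
        pendingCallers.incrementAndGet();    // announce intent before taking the lock
        selector.wakeup();                   // cut short the current (or next) select()
        lock.lock();
        try {
            closed = true;
            selector.close();
        } finally {
            pendingCallers.decrementAndGet();
            lock.unlock();
        }
    }

    // Runs a long poll() loop in one thread and measures how long close() takes from another.
    static long demo() {
        try {
            WakeupSketch consumer = new WakeupSketch();
            Thread poller = new Thread(() -> {
                try {
                    while (consumer.poll(60_000)) { /* keep polling with a long timeout */ }
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            });
            poller.setDaemon(true);
            poller.start();
            Thread.sleep(200);               // give the poller time to block in select()
            long start = System.nanoTime();
            consumer.close();
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            poller.join(5_000);
            return elapsedMs;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("close() returned after " + demo() + " ms");
    }
}
```

Without the wakeup() call, close() in this sketch would wait for the full 60-second select to finish. The counter is checked while holding the lock, and Selector.wakeup() also affects the next select if none is in progress, so a poll() that races with close() still returns promptly.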