Ewen Cheslack-Postava created KAFKA-2168:
--------------------------------------------

             Summary: New consumer poll() can block other calls like 
position(), commit(), and close() indefinitely
                 Key: KAFKA-2168
                 URL: https://issues.apache.org/jira/browse/KAFKA-2168
             Project: Kafka
          Issue Type: Bug
          Components: clients, consumer
            Reporter: Ewen Cheslack-Postava
            Assignee: Ewen Cheslack-Postava


The new consumer is currently using very coarse-grained synchronization. For 
most methods this isn't a problem since they finish quickly once the lock is 
acquired, but poll() might run for a long time (and commonly will since polling 
with long timeouts is a normal use case). This means any operations invoked 
from another thread may block until the poll() call completes.

Some example use cases where this can be a problem:

* A shutdown hook is registered to trigger shutdown and invokes close(). It 
gets invoked from another thread and blocks indefinitely.
* User wants to manage offset commit themselves in a background thread. If the 
commit policy is not purely time based, it's not currently possibly to make 
sure the call to commit() will be processed promptly.

Two possible solutions to this:
1. Make sure a lock is not held during the actual select call. Since we have 
multiple layers (KafkaConsumer -> NetworkClient -> Selector -> nio Selector) 
this is probably hard to make work cleanly since locking is currently only 
performed at the KafkaConsumer level and we'd want it unlocked around a single 
line of code in Selector.
2. Wake up the selector before synchronizing for certain operations. This would 
require some additional coordination to make sure the caller of wakeup() is 
able to acquire the lock promptly (instead of, e.g., the poll() thread being 
woken up and then promptly reacquiring the lock with a subsequent long poll() 
call).




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to