[ https://issues.apache.org/jira/browse/KAFKA-3378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201560#comment-15201560 ]
ASF GitHub Bot commented on KAFKA-3378:
---------------------------------------

GitHub user ijuma opened a pull request:

    https://github.com/apache/kafka/pull/1094

    KAFKA-3378; Client blocks forever if SocketChannel connects instantly

    This is a different implementation to the one in #1085 by Larkin Lowrey (@llowrey). The hard part here was actually finding the problem and all credit goes to @llowrey.

    This PR also fixes our handling of `finishConnect` (we now check the return value).

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ijuma/kafka KAFKA-3378-instantly-connecting-socket-channels

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/1094.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1094

----
commit 4071dc454413d74309e01368924951b5530c83e3
Author: Ismael Juma <ism...@juma.me.uk>
Date:   2016-03-18T13:08:08Z

    Use diamond operator

commit 7783c609342e3e71a50a972d7c797e6d12fe9fd4
Author: Ismael Juma <ism...@juma.me.uk>
Date:   2016-03-18T13:15:32Z

    Use `time` instead of `SystemTime`, multi-catch and javadoc clean-ups

    Also removed unnecessary `if`.

commit ae806c559ee68ba696cc7e7885b5569dc233a279
Author: Ismael Juma <ism...@juma.me.uk>
Date:   2016-03-18T13:35:28Z

    Check the return value of `finishConnect`

commit 377bc79379403770e33048cb7ee15bc8ef55ed26
Author: Ismael Juma <ism...@juma.me.uk>
Date:   2016-03-18T14:32:06Z

    Handle immediately connected channels and use `finishConnect` return value

----

> Client blocks forever if SocketChannel connects instantly
> ---------------------------------------------------------
>
>                 Key: KAFKA-3378
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3378
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients
>    Affects Versions: 0.9.0.1
>            Reporter: Larkin Lowrey
>            Assignee: Larkin Lowrey
>            Priority: Blocker
>             Fix For: 0.10.0.0
>
>
> Observed that some consumers were blocked in Fetcher.listOffset() when
> starting many dozens of consumer threads at the same time.
> Selector.connect(...) calls SocketChannel.connect() in non-blocking mode and
> assumes that false is always returned and that the channel will be in the
> Selector's readyKeys once the connection is ready for connect completion due
> to the OP_CONNECT interest op.
> When connect() returns true, the channel is fully connected and will
> not be included in readyKeys since only OP_CONNECT is set.
> I implemented a fix which handles the case when connect(...) returns true and
> verified that I no longer see stuck consumers. A git pull request will be
> forthcoming.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
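
For readers following the issue description, here is a minimal sketch of the general technique it describes: remember the keys for which a non-blocking `connect()` returned true, process them on the next poll without waiting for an `OP_CONNECT` event, and only treat a channel as connected when `finishConnect()` returns true. This is not the code from either pull request. The class name `ImmediateConnectSketch`, its fields and methods, and the choice to switch to `OP_READ` after connecting are all assumptions made for illustration; only the `java.nio` calls are real API.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; names are not taken from the Kafka code base.
final class ImmediateConnectSketch {
    private final Selector nioSelector;
    // Keys whose connect() returned true; the selector will never report them as connectable.
    private final Set<SelectionKey> immediatelyConnected = new HashSet<>();

    ImmediateConnectSketch() throws IOException {
        this.nioSelector = Selector.open();
    }

    SelectionKey connect(InetSocketAddress address) throws IOException {
        SocketChannel channel = SocketChannel.open();
        channel.configureBlocking(false);
        // Even in non-blocking mode, connect() may return true (e.g. for local connections).
        boolean connected = channel.connect(address);
        SelectionKey key = channel.register(nioSelector, SelectionKey.OP_CONNECT);
        if (connected) {
            immediatelyConnected.add(key);
            key.interestOps(0); // no OP_CONNECT event will arrive for this key
        }
        return key;
    }

    // Returns the keys whose connection completed during this call.
    List<SelectionKey> poll(long timeoutMs) throws IOException {
        // Do not block if there is already work to do; note that select(0) would block forever.
        if (immediatelyConnected.isEmpty() && timeoutMs > 0) {
            nioSelector.select(timeoutMs);
        } else {
            nioSelector.selectNow();
        }
        List<SelectionKey> completed = new ArrayList<>();
        Iterator<SelectionKey> it = nioSelector.selectedKeys().iterator();
        while (it.hasNext()) {
            SelectionKey key = it.next();
            it.remove();
            if (key.isConnectable() && finishConnect(key)) {
                completed.add(key);
            }
        }
        for (SelectionKey key : immediatelyConnected) {
            if (finishConnect(key)) {
                completed.add(key);
            }
        }
        immediatelyConnected.clear();
        return completed;
    }

    // Checks the return value of finishConnect() instead of assuming success.
    private static boolean finishConnect(SelectionKey key) throws IOException {
        SocketChannel channel = (SocketChannel) key.channel();
        if (!channel.finishConnect()) {
            return false; // still pending: leave OP_CONNECT set and try again later
        }
        key.interestOps((key.interestOps() & ~SelectionKey.OP_CONNECT) | SelectionKey.OP_READ);
        return true;
    }

    void close() throws IOException {
        for (SelectionKey key : new ArrayList<>(nioSelector.keys())) {
            key.channel().close();
        }
        nioSelector.close();
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress("127.0.0.1", 0)); // ephemeral local port
            ImmediateConnectSketch sketch = new ImmediateConnectSketch();
            sketch.connect((InetSocketAddress) server.getLocalAddress());
            List<SelectionKey> done = new ArrayList<>();
            while (done.isEmpty()) {
                done.addAll(sketch.poll(100));
            }
            System.out.println("connected: " + ((SocketChannel) done.get(0).channel()).isConnected());
            sketch.close();
        }
    }
}
```

The `main` method connects to a listener on the loopback interface, which is one situation where `connect()` may complete immediately. The sketch behaves the same whether `connect()` returns true or false, which is the point: neither outcome leaves the caller waiting on an event that never fires.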