Well spotted, Larkin. Please file an issue, as we definitely want to fix this before the next release.
Ismael

On Wed, Mar 9, 2016 at 10:46 PM, Christian Posta <christian.po...@gmail.com> wrote:

> Open a JIRA here: https://issues.apache.org/jira/browse/KAFKA
> and open a github.com pull request here: https://github.com/apache/kafka
>
> May wish to peek at this too:
> https://github.com/apache/kafka/blob/trunk/CONTRIBUTING.md
>
> I think you need an apache ICLA too
> https://www.apache.org/licenses/icla.txt
>
> HTH
>
> On Wed, Mar 9, 2016 at 3:35 PM, Larkin Lowrey <llow...@gmail.com> wrote:
>
> > There is a bug in the 0.9.0.1 client which causes consumers to get stuck
> > waiting for a connection to be ready to complete.
> >
> > The root cause is in the connect(...) method of
> >
> > clients/src/main/java/org/apache/kafka/common/network/Selector.java
> >
> > Here's the trouble item:
> >
> >     try {
> >         socketChannel.connect(address);
> >     } catch (UnresolvedAddressException e) {
> >
> > The assumption is that socketChannel.connect(address) always returns false
> > when in non-blocking mode. A good assumption... but, sadly, wrong.
> >
> > When spinning up several dozen consumers at the same time we see a small
> > number (one or two) where socketChannel.connect(...) returns true. When
> > that happens the connection is valid and SelectionKey.OP_CONNECT will never
> > be triggered. The poll(long timeout) method in the same class will wait for
> > the channel to become ready with key.isConnectable() but that will never
> > happen since the channel is already fully connected before the select is
> > called.
> >
> > I implemented a sloppy fix which was able to demonstrate that addressing
> > this case solves my stuck consumer problem.
> >
> > How do I submit a bug report for this issue, or does this email constitute
> > a bug report?
> >
> > --Larkin
>
> --
> *Christian Posta*
> twitter: @christianposta
> http://www.christianposta.com/blog
> http://fabric8.io
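
For anyone following the thread, here is a minimal standalone sketch of the case Larkin describes. It is not Kafka's Selector code and not his fix; the class name ImmediateConnectSketch and the helpers connect, awaitConnected and demo are made up for illustration. It relies only on the documented java.nio behaviour that SocketChannel.connect(...) in non-blocking mode may return true when the connection is established immediately (as can happen with a local connection), in which case waiting on OP_CONNECT would never fire.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

public class ImmediateConnectSketch {

    /** Hypothetical helper: starts a non-blocking connect and registers the channel. */
    static SelectionKey connect(Selector selector, InetSocketAddress address) throws IOException {
        SocketChannel channel = SocketChannel.open();
        channel.configureBlocking(false);
        boolean connected;
        try {
            // May return true even in non-blocking mode if the connection completes at once.
            connected = channel.connect(address);
        } catch (IOException | RuntimeException e) {
            channel.close();
            throw e;
        }
        // Only wait for OP_CONNECT when the connection is still pending.
        int ops = connected ? SelectionKey.OP_READ : SelectionKey.OP_CONNECT;
        return channel.register(selector, ops);
    }

    /** Hypothetical helper: polls until the channel is connected or the timeout expires. */
    static boolean awaitConnected(Selector selector, SelectionKey key, long timeoutMs) throws IOException {
        SocketChannel channel = (SocketChannel) key.channel();
        long deadline = System.currentTimeMillis() + timeoutMs;
        // If connect(...) already returned true, this loop is skipped entirely.
        while (!channel.isConnected() && System.currentTimeMillis() < deadline) {
            selector.select(100);
            if (key.isValid() && key.isConnectable() && channel.finishConnect()) {
                key.interestOps(SelectionKey.OP_READ);
            }
            selector.selectedKeys().clear();
        }
        return channel.isConnected();
    }

    /** Connects to a throwaway loopback listener and reports whether the connection came up. */
    static boolean demo() throws IOException {
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            InetSocketAddress address = (InetSocketAddress) server.getLocalAddress();
            SelectionKey key = connect(selector, address);
            try {
                return awaitConnected(selector, key, 5000);
            } finally {
                key.channel().close();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("connected: " + demo());
    }
}
```

The point of the sketch is the branch on the return value of connect(...): code that ignores it and always registers OP_CONNECT would hang in exactly the way described above whenever the connection completes immediately.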