[ https://issues.apache.org/jira/browse/KAFKA-8154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16806059#comment-16806059 ]
Gordon commented on KAFKA-8154:
-------------------------------

> We guarantee that we don't allocate a buffer that is larger than the total of SSL session's buffer sizes

That's the root of the problem. getApplicationBufferSize() indicates that "buffers must be large enough to hold the application data from any inbound network application data packet received". Allocating a buffer of this size is acceptable only as long as it is empty when unwrap() is called.

The design of the read() function does not necessarily empty the applicationReadBuffer between calls to unwrap(). There is a nested loop that calls unwrap() repeatedly while netReadBuffer has data. It also copies data from applicationReadBuffer to dst, but if dst is full after one iteration of the inner loop, then applicationReadBuffer won't be completely empty, and a second call to unwrap() may fail because there isn't enough room left in applicationReadBuffer.

At least three ways to solve this problem are immediately apparent (a rough sketch of each follows after the quoted issue details below):

1: Eliminate the nested loop. I'm pretty sure that if you change the inner loop from "while (netReadBuffer.position() > 0)" to "if (netReadBuffer.position() > 0)", the remaining loop will not continue to iterate when dst is full, which means you won't have a loop iteration where applicationReadBuffer still has data in it. The outer loop already checks whether netReadBuffer has room and does not attempt a network read if it is full, so this should be safe.

2: Break the inner loop in the Status.OK case. This is ugly, but it is already being done in the Status.BUFFER_OVERFLOW case, which ends with a "break" if dst.hasRemaining() is false. (However, the code in the BUFFER_OVERFLOW case may also be somewhat broken, in that it first tests for free space and then copies data into the buffer. It should check hasRemaining() after copying data, since the readFromAppBuffer(dst) call might fill dst.) In the OK case, if the code checked dst.hasRemaining() after the current readFromAppBuffer(dst), it could "break" and avoid the problem.

3: Resize the applicationReadBuffer on BUFFER_OVERFLOW. The fix I proposed in PR 5785 is not an "arbitrary" resize of the applicationReadBuffer. It merely ensures that when data is left over in applicationReadBuffer after moving data to dst, and the inner loop iterates a second time, the buffer is resized so that there are at least getApplicationBufferSize() bytes free in it. This is the recommended behavior when the application read buffer can't be emptied before unwrap() is retried: https://docs.oracle.com/javase/8/docs/api/javax/net/ssl/SSLEngine.html#unwrap(java.nio.ByteBuffer,%20java.nio.ByteBuffer)

> Buffer Overflow exceptions between brokers and with clients
> -----------------------------------------------------------
>
>                 Key: KAFKA-8154
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8154
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients
>    Affects Versions: 2.1.0
>            Reporter: Rajesh Nataraja
>            Priority: Major
>         Attachments: server.properties.txt
>
> https://github.com/apache/kafka/pull/6495
> https://github.com/apache/kafka/pull/5785
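A minimal sketch of option 1, using the loop shape described in the comment rather than the actual SslTransportLayer source; the field and helper names (netReadBuffer, applicationReadBuffer, readFromAppBuffer, readFromSocket) are illustrative assumptions:

{code:java}
import java.io.IOException;
import java.nio.ByteBuffer;
import javax.net.ssl.SSLEngine;
import javax.net.ssl.SSLEngineResult;
import javax.net.ssl.SSLEngineResult.Status;

// Simplified shape of the SSL read path discussed above, not the real
// SslTransportLayer; readFromSocket() is a hypothetical stand-in for the
// SocketChannel read into netReadBuffer.
class SslReadOption1 {
    SSLEngine sslEngine;
    ByteBuffer netReadBuffer;         // encrypted bytes from the socket
    ByteBuffer applicationReadBuffer; // decrypted bytes pending delivery to dst

    int read(ByteBuffer dst) throws IOException {
        int read = readFromAppBuffer(dst); // drain leftovers from a previous call
        boolean readable = true;
        while (readable && dst.hasRemaining()) {
            readable = readFromSocket() > 0;
            // Option 1: "if" instead of "while (netReadBuffer.position() > 0)".
            // When dst fills up, the outer loop exits before another unwrap()
            // is attempted, so applicationReadBuffer is never handed to
            // unwrap() while it still holds data that dst could not take.
            if (netReadBuffer.position() > 0) {
                netReadBuffer.flip();
                SSLEngineResult result = sslEngine.unwrap(netReadBuffer, applicationReadBuffer);
                netReadBuffer.compact();
                if (result.getStatus() == Status.OK)
                    read += readFromAppBuffer(dst);
            }
        }
        return read;
    }

    // Copies as much decrypted data as fits from applicationReadBuffer into
    // dst; data is left behind when dst is smaller than what is buffered.
    int readFromAppBuffer(ByteBuffer dst) {
        applicationReadBuffer.flip();
        int copied = Math.min(applicationReadBuffer.remaining(), dst.remaining());
        for (int i = 0; i < copied; i++)
            dst.put(applicationReadBuffer.get());
        applicationReadBuffer.compact();
        return copied;
    }

    int readFromSocket() throws IOException {
        return 0; // placeholder for channel.read(netReadBuffer)
    }
}
{code}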
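Option 2 sketched as a drop-in inner loop for the same skeleton: the original "while" stays, but the loop breaks as soon as dst is full, in both the OK and the BUFFER_OVERFLOW branches. Again a hedged illustration, not the actual patch:

{code:java}
// Inner loop only, reusing the fields of the sketch above (option 2).
while (netReadBuffer.position() > 0) {
    netReadBuffer.flip();
    SSLEngineResult result = sslEngine.unwrap(netReadBuffer, applicationReadBuffer);
    netReadBuffer.compact();
    if (result.getStatus() == Status.OK) {
        read += readFromAppBuffer(dst);
        // Check AFTER the copy: readFromAppBuffer(dst) may have just filled
        // dst, leaving data in applicationReadBuffer; a further unwrap()
        // would then fail with BUFFER_OVERFLOW.
        if (!dst.hasRemaining())
            break;
    } else if (result.getStatus() == Status.BUFFER_OVERFLOW) {
        // The same post-copy check applies here: testing dst.hasRemaining()
        // before the copy misses the case where the copy itself fills dst.
        read += readFromAppBuffer(dst);
        if (!dst.hasRemaining())
            break;
    }
}
{code}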
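Finally, a sketch of option 3's resize as the comment describes PR 5785's intent; the helper name ensureUnwrapRoom and the allocation policy are assumptions, not the PR's actual code:

{code:java}
// Called in the BUFFER_OVERFLOW branch before retrying unwrap() (option 3).
// Guarantees at least getApplicationBufferSize() bytes of free space beyond
// whatever decrypted data dst could not yet absorb.
void ensureUnwrapRoom() {
    int appBufferSize = sslEngine.getSession().getApplicationBufferSize();
    int free = applicationReadBuffer.capacity() - applicationReadBuffer.position();
    if (free < appBufferSize) {
        ByteBuffer larger = ByteBuffer.allocate(applicationReadBuffer.position() + appBufferSize);
        applicationReadBuffer.flip(); // expose the leftover decrypted bytes
        larger.put(applicationReadBuffer);
        applicationReadBuffer = larger; // positioned past the copied data, ready for unwrap()
    }
}
{code}

With room guaranteed before each retry, the inner loop can keep its current shape, which matches the SSLEngine javadoc's guidance to enlarge the destination buffer on BUFFER_OVERFLOW.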