[ https://issues.apache.org/jira/browse/KAFKA-13418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17464190#comment-17464190 ]
shylaja kokoori commented on KAFKA-13418:
-----------------------------------------

After enabling SSL debug logging (javax.net.debug=ssl,handshake), I see that the unwrap call in SslTransportLayer.read returns handshakeStatus=NEED_WRAP when a TLS key_update takes place (log snippet below).

According to [https://datatracker.ietf.org/doc/html/rfc8446], key_updates normally happen during a read/write, and the connection has to be closed when one arrives during the handshake. Given that the key_updates here happen after the handshake is done, will something like the attached patch work?

I am new to Kafka, and any feedback would be helpful.

Kafka log:
{code:java}
javax.net.ssl|DEBUG|8D|ReplicaFetcherThread-0-2|2021-12-21 06:14:09.574 UTC|KeyUpdate.java:192|Consuming KeyUpdate post-handshake message ( "KeyUpdate": { "request_update": update_requested } )
javax.net.ssl|DEBUG|8D|ReplicaFetcherThread-0-2|2021-12-21 06:14:09.575 UTC|SSLCipher.java:1866|KeyLimit read side: algorithm = AES/GCM/NOPADDING:KEYUPDATE countdown value = 137438953472
javax.net.ssl|DEBUG|8D|ReplicaFetcherThread-0-2|2021-12-21 06:14:09.575 UTC|KeyUpdate.java:236|KeyUpdate: read key updated
javax.net.ssl|DEBUG|8D|ReplicaFetcherThread-0-2|2021-12-21 06:14:09.575 UTC|KeyUpdate.java:271|Produced KeyUpdate post-handshake message ( "KeyUpdate": { "request_update": update_not_requested } )
javax.net.ssl|DEBUG|8D|ReplicaFetcherThread-0-2|2021-12-21 06:14:09.575 UTC|SSLCipher.java:2020|KeyLimit write side: algorithm = AES/GCM/NOPADDING:KEYUPDATE countdown value = 137438953472
javax.net.ssl|DEBUG|8D|ReplicaFetcherThread-0-2|2021-12-21 06:14:09.575 UTC|KeyUpdate.java:323|KeyUpdate: write key updated
[2021-12-21 06:14:09,575] ERROR [SslTransportLayer channelId=2 key=channel=java.nio.channels.SocketChannel[connection-pending remote=/192.168.24.11:9093], selector=sun.nio.ch.EPollSelectorImpl@2eb1a872, interestOps=8, readyOps=0] Renegotiation requested, but it is not supported, channelId 2, appReadBuffer pos 0, netReadBuffer pos 0, netWriteBuffer pos 147 handshakeStatus NEED_WRAP State READY (org.apache.kafka.common.network.SslTransportLayer)
javax.net.ssl|DEBUG|8D|ReplicaFetcherThread-0-2|2021-12-21 06:14:09.578 UTC|Alert.java:238|Received alert message ( "Alert": { "level" : "warning", "description": "close_notify" } )
javax.net.ssl|ALL|8D|ReplicaFetcherThread-0-2|2021-12-21 06:14:09.580 UTC|SSLEngineImpl.java:752|Closing outbound of SSLEngine
{code}

> Brokers disconnect intermittently with TLS1.3
> ---------------------------------------------
>
>                 Key: KAFKA-13418
>                 URL: https://issues.apache.org/jira/browse/KAFKA-13418
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients
>    Affects Versions: 2.8.0
>            Reporter: shylaja kokoori
>            Assignee: shylaja kokoori
>            Priority: Minor
>         Attachments: tls1_3.patch
>
>
> Using TLS1.3 (with JDK11) is causing a regression and an increase in
> inter-broker p99 latency, as mentioned by Yiming in
> [Kafka-9320|https://issues.apache.org/jira/browse/KAFKA-9320?focusedCommentId=17401818&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17401818].
> We tested this with Kafka 2.8.
> The issue seems to be because of a renegotiation exception being thrown by
> {code:java}
> read(ByteBuffer dst)
> {code}
> &
> {code:java}
> write(ByteBuffer src)
> {code}
> in
> _clients/src/main/java/org/apache/kafka/common/network/SslTransportLayer.java_
> This exception is causing the connection to close between the brokers before
> read/write is completed. In our internal experiments we have seen the p99
> latency stabilize when we remove this exception.
> Given that TLS1.3 does not support renegotiation, I would like to make it
> applicable just for TLS1.2.

--
This message was sent by Atlassian Jira
(v8.20.1#820001)