[ https://issues.apache.org/jira/browse/KAFKA-17959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17897259#comment-17897259 ]
Arushi Helms commented on KAFKA-17959:
--------------------------------------

Hi [~gnarula],

I have enabled debug logging for org.apache.kafka.clients. Here are the log snippets from one of the brokers:

{noformat}
[2024-11-11 19:03:58,404] DEBUG [RaftManager id=2] Resolved host 10.87.170.83 to addresses [/10.87.170.83] (org.apache.kafka.clients.ClusterConnectionStates)
[2024-11-11 19:03:58,404] DEBUG [RaftManager id=2] Initiating connection to node 10.87.170.83:9097 (id: 4 rack: null) using address /10.87.170.83 (org.apache.kafka.clients.NetworkClient)
{noformat}

Further in the logs I also see:

{noformat}
2024-11-11 19:03:59,951] DEBUG Resolved host cp-internal-onecloud-kfkc2.node.cp-internal-onecloud.consul as 10.87.170.9 (org.apache.kafka.clients.ClientUtils)
[2024-11-11 19:03:59,951] DEBUG [NodeToControllerChannelManager id=2 name=heartbeat] Resolved host cp-internal-onecloud-kfkc2.node.cp-internal-onecloud.consul to addresses [cp-internal-onecloud-kfkc2.node.cp-internal-onecloud.consul/10.87.170.9] (org.apache.kafka.clients.ClusterConnectionStates)
{noformat}

{noformat}
[2024-11-11 19:03:59,952] INFO [broker-2-to-controller-heartbeat-channel-manager]: Recorded new KRaft controller, from now on will use node cp-internal-onecloud-kfkc1.node.cp-internal-onecloud.consul:9097 (id: 4 rack: null) (kafka.server.NodeToControllerRequestThread)
[2024-11-11 19:03:59,954] DEBUG Resolved host cp-internal-onecloud-kfkc1.node.cp-internal-onecloud.consul as 10.87.170.83 (org.apache.kafka.clients.ClientUtils)
[2024-11-11 19:03:59,954] DEBUG [NodeToControllerChannelManager id=2 name=heartbeat] Resolved host cp-internal-onecloud-kfkc1.node.cp-internal-onecloud.consul to addresses [cp-internal-onecloud-kfkc1.node.cp-internal-onecloud.consul/10.87.170.83] (org.apache.kafka.clients.ClusterConnectionStates)
[2024-11-11 19:03:59,954] DEBUG [NodeToControllerChannelManager id=2 name=heartbeat] Initiating connection to node cp-internal-onecloud-kfkc1.node.cp-internal-onecloud.consul:9097 (id: 4 rack: null) using address cp-internal-onecloud-kfkc1.node.cp-internal-onecloud.consul/10.87.170.83 (org.apache.kafka.clients.NetworkClient)
[2024-11-11 19:03:59,958] DEBUG [NodeToControllerChannelManager id=2 name=heartbeat] Completed connection to node 4. Fetching API versions. (org.apache.kafka.clients.NetworkClient)
{noformat}

*A few things to note about my setup:*
- I have 3 brokers and 3 controllers, each running as a Docker container on a separate host.
- The hostname of the machine does not match the CN or SAN of the certificate.
- As shared in the broker and controller configuration, we provide only IPs for communication; hostnames are not mentioned anywhere.
- For the time being, to make this work, I have set the property *KAFKA_CFG_SSL_ENDPOINT_IDENTIFICATION_ALGORITHM* to an empty string on both brokers and controllers.

Please let me know if you need anything else to troubleshoot this issue.

> Avoid Reverse DNS Lookup for IP-Based SSL Authentication in Kraft Mode
> ----------------------------------------------------------------------
>
> Key: KAFKA-17959
> URL: https://issues.apache.org/jira/browse/KAFKA-17959
> Project: Kafka
> Issue Type: Bug
> Components: kraft
> Affects Versions: 3.6.0, 3.7.0, 3.8.0, 3.7.1
> Reporter: Arushi Helms
> Assignee: Gaurav Narula
> Priority: Blocker
>
> We have encountered an issue with Kafka's Kraft mode where reverse DNS
> lookups are being performed unnecessarily during SSL authentication between
> controllers and between brokers and controllers, despite using IP addresses
> for communication.
> In our Kafka setup, we are using IP addresses for communication and have
> configured certificates with {*}IP addresses in the Subject Alternative Name
> (SAN){*}.
> However, when the controller tries to establish SSL connections
> with other controllers or brokers, it attempts a reverse DNS lookup on the IP
> address (e.g., {{{}10.87.170.83{}}}), which causes SSL handshake failures due
> to the mismatch between the resolved hostname and the IP address in the
> certificate.
> The issue arises even though the certificate contains the IP in the SAN and
> should not require a reverse DNS lookup. This unnecessary lookup introduces
> delays and inconsistencies, especially in environments where DNS resolution
> is not required or reliable (e.g., in private networks).
> h3. Affected Scenarios:
> # {*}Broker-to-Controller Communication{*}: The broker fails to authenticate
> with the controller because the reverse DNS lookup of the controller's IP
> address does not match the expected DNS name in the certificate.
> # {*}Controller-to-Controller Communication{*}: Controllers also fail to
> authenticate with each other due to similar reverse DNS lookup issues.
> h3. Current Behavior:
> * Kafka's SSL handshake fails when using IPs for communication, with errors
> like
> {code:java}No subject alternative DNS name matching <resolved hostname>
> found{code} due to reverse DNS lookup mismatches.
> * The controller attempts reverse DNS lookups even when the connection is
> established using IP addresses directly.
> h3. Expected Behavior:
> * Kafka should use the *IP address directly* for SSL engine creation and
> authentication when IPs are provided for communication, without performing a
> reverse DNS lookup.
> * *SSL hostname verification* should match the IP address in the SAN of the
> certificate, not a resolved DNS name.
> h3. Request:
> * Please address the issue by ensuring that Kafka does *not perform reverse
> DNS lookups* for SSL authentication when IP addresses are explicitly provided
> for communication.
> * This behavior should be consistent across all Kafka components (brokers
> and controllers) in Kraft mode.
>
> Old ticket with similar issue for reference:
> https://issues.apache.org/jira/browse/KAFKA-5051

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
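
For readers following the thread, here is a minimal Java sketch of the behaviour the ticket asks for: taking the peer host from the configured address without a reverse DNS lookup, then passing that string to the SSL engine so that endpoint identification runs against it. This is not Kafka's actual code. The class name PeerHostSketch and both helper methods are hypothetical, and the address is the controller IP from the logs above. Only standard JDK calls are used.

```java
import java.net.InetSocketAddress;
import java.security.NoSuchAlgorithmException;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;
import javax.net.ssl.SSLParameters;

// Hypothetical illustration only; the names below are not part of Kafka.
public class PeerHostSketch {

    // getHostString() returns the hostname, or the literal IP string if the
    // address was created from an IP literal, and never attempts a reverse
    // lookup. getHostName() on the same address may trigger one.
    static String peerHostWithoutReverseLookup(InetSocketAddress address) {
        return address.getHostString();
    }

    // Builds a client-mode SSLEngine whose peer host is the string chosen
    // above. With the "HTTPS" endpoint identification algorithm enabled, the
    // JDK checks the server certificate against this peer host.
    static SSLEngine clientEngineFor(InetSocketAddress address) throws NoSuchAlgorithmException {
        String peerHost = peerHostWithoutReverseLookup(address);
        SSLEngine engine = SSLContext.getDefault().createSSLEngine(peerHost, address.getPort());
        engine.setUseClientMode(true);
        SSLParameters params = engine.getSSLParameters();
        params.setEndpointIdentificationAlgorithm("HTTPS");
        engine.setSSLParameters(params);
        return engine;
    }

    public static void main(String[] args) throws Exception {
        // An IP literal does not require a forward DNS lookup either.
        InetSocketAddress controller = new InetSocketAddress("10.87.170.83", 9097);
        SSLEngine engine = clientEngineFor(controller);
        System.out.println("peer host used for verification: " + engine.getPeerHost());
    }
}
```

No connection is opened here; the sketch only shows which host string reaches the SSL engine. Whether the handshake then succeeds still depends on the certificate carrying a matching IP entry in its SAN.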