Hello, I’m currently working on setting up a Kafka cluster that uses mTLS for client communication and plaintext for inter-broker communication. I’m running into issues that prevent me from creating topics and ACLs from the broker after it has started up. So far I have had success setting up ACLs per client when the configuration ‘allow.everyone.if.no.acl.found’ is set to true. However, that is not desired: I would like that setting to always be ‘false’ and still be able to create ACLs from the broker CLI tools. Before I get into the main issue, I’ll give a little background on the current setup that might help shed some light on what I’m doing wrong.
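For reference, the kind of per-client ACL I have been creating (successfully, while ‘allow.everyone.if.no.acl.found’ was still true) is along these lines; the topic, operations, and the path to the admin client properties file here are just examples:

kafka-acls.sh --bootstrap-server kafka-vulcan-0.kafka-vulcan-headless.kafka-vulcan.svc.cluster.local:9092 \
  --command-config /path/to/admin-client.properties \
  --add --allow-principal User:client1 \
  --operation Read --operation Write \
  --topic client1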
The Kafka cluster is set up using the bitnami/kafka image running Kafka v3.2.0 and the bitnami/zookeeper image running ZooKeeper 3.9.0. I have created all of the Java keystores/truststores for the brokers and clients, ensuring that each certificate has a SAN that properly identifies it for hostname validation, and that each has a unique Common Name entry to uniquely identify it for the purposes of the AclAuthorizer mapping rule I’m using: “RULE:^.*[Cc][Nn]=([a-zA-Z0-9]*),.*$/$1/L”. There is one broker and two clients as a proof of concept. The broker’s certificate has a common name of “kafka-tls.domain.com” and the clients have basic common names of “client1” and “client2”. Each keystore contains the specific service’s signed certificate, its private key, and the CA’s public certificate. Each truststore has the CA’s public certificate loaded into it. The broker is loaded with both its keystore and truststore and has the passwords for the stores + key added to the server.properties configuration file. Both clients are just basic pods that use a Kafka image so I can run CLI tools such as kafka-console-consumer.sh and kafka-console-producer.sh for confirmation purposes.

Some of the pertinent configuration for the broker in the server.properties file:

listeners=INTERNAL://:9093,CLIENT://:9092
advertised.listeners=INTERNAL://kafka-vulcan-0.kafka-vulcan-headless.kafka-vulcan.svc.cluster.local:9093,CLIENT://kafka-vulcan-0.kafka-vulcan-headless.kafka-vulcan.svc.cluster.local:9092
listener.security.protocol.map=INTERNAL:PLAINTEXT,CLIENT:SSL
zookeeper.connect=kafka-vulcan-zookeeper
allow.everyone.if.no.acl.found=false
authorizer.class.name=kafka.security.authorizer.AclAuthorizer
auto.create.topics.enable=true
delete.topic.enable=true
inter.broker.listener.name=INTERNAL
security.protocol=SSL
ssl.client.auth=required
ssl.enabled.protocols=TLSv1.2
ssl.endpoint.identification.algorithm=https
ssl.principal.mapping.rules=RULE:^.*[Cc][Nn]=([a-zA-Z0-9]*),.*$/$1/L
tls.client.auth=required
tls.type=JKS
ssl.keystore.type=JKS
ssl.truststore.type=JKS
ssl.key.password={ Password }
ssl.keystore.location=/opt/bitnami/kafka/config/certs/kafka.keystore.jks
ssl.truststore.location=/opt/bitnami/kafka/config/certs/kafka.truststore.jks
ssl.keystore.password={ Password }
ssl.truststore.password={ Password }

After deploying the Helm chart to my Kubernetes cluster, I can see that the broker starts up, attempts to update the metadata of the cluster, and fails to authorize. The error I see is the following:

Principal = User:ANONYMOUS is Denied Operation = ClusterAction from host = {K8s broker IP} on resource = Cluster:LITERAL:kafka-cluster for request = UpdateMetadata with resourceRefCount = 1 (kafka.authorizer.logger)

This is where my knowledge of Kafka is admittedly not that great. To me, that means the broker is not attempting to identify itself to ZooKeeper and is thus using an ANONYMOUS login to authenticate before attempting to make changes. Because ‘allow.everyone.if.no.acl.found’ is set to false and no ACL was found for the ANONYMOUS principal, the broker failed to do what it was trying to do after starting up.
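For reference, my understanding of how that ssl.principal.mapping.rules rule is meant to behave for the client certificates is the following (the OU/O parts of these DNs are only illustrative, not the real values):

CN=client1,OU=poc,O=domain.com  ->  User:client1
CN=client2,OU=poc,O=domain.com  ->  User:client2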
I then attempt to create topics and run into errors that don’t provide much information:

kafka-topics.sh --bootstrap-server localhost:9092 --create --topic client1

This results in a timeout, but the logs indicate:

INFO [SocketServer listenerType=ZK_BROKER, nodeId=0] Failed authentication with /{K8s broker IP} (channelId={K8s broker IP}:9092-{K8s broker IP}:39816-99) (SSL handshake failed) (org.apache.kafka.common.network.Selector)

Sadly, I don’t get any more information about what could be causing the issue, but I get the feeling it’s similar to the previous error where the ANONYMOUS authentication failed. Do I need to set up ZooKeeper authentication in order to identify the broker to ZooKeeper when using CLI tools such as kafka-topics and kafka-acls?

I’ve attempted to use the ‘--command-config’ flag, supplying the following configuration file, with mixed results, all failing:

ssl.endpoint.identification.algorithm=https
bootstrap.servers=kafka-vulcan-0.kafka-vulcan-headless.kafka-vulcan.svc.cluster.local:9092
security.protocol=SSL
ssl.keystore.location=/opt/bitnami/kafka/config/certs/kafka.keystore.jks
ssl.keystore.password={ Password }
ssl.key.password={ Password }
ssl.truststore.location=/opt/bitnami/kafka/config/certs/kafka.truststore.jks
ssl.truststore.password={ Password }
ssl.protocol=TLSv1.2
ssl.truststore.type=JKS

I’ve tried using this command config with the following two kafka-topics.sh executions (the full form with --command-config is shown in the P.S. below):

// 1
kafka-topics.sh --bootstrap-server localhost:9092 --create --topic client1

// 2
kafka-topics.sh --bootstrap-server kafka-vulcan-0.kafka-vulcan-headless.kafka-vulcan.svc.cluster.local:9092 --create --topic client1

The result from “1” is:

org.apache.kafka.common.errors.SslAuthenticationException: SSL handshake failed
Caused by: javax.net.ssl.SSLHandshakeException: No subject alternative DNS name matching localhost found.

The result from “2” is just a timeout without any logs indicating anything further.

I’m a little confused about how the AdminClient works. Is it communicating with ZooKeeper? Would I use the INTERNAL or CLIENT listener (i.e., 9093 or 9092, respectively) when running these commands?

Any help would be greatly appreciated.

Thank you,
Matthew Rabey
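P.S. For completeness, the shape of the kafka-topics.sh invocation when I supply the ‘--command-config’ flag is the following (the path to the client properties file is just an example of where I mount it):

kafka-topics.sh \
  --bootstrap-server kafka-vulcan-0.kafka-vulcan-headless.kafka-vulcan.svc.cluster.local:9092 \
  --command-config /opt/bitnami/kafka/config/certs/client.properties \
  --create --topic client1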