Re: Migration to KRaft
Hi Sisindri,

It’s definitely a challenge to justify a major architecture change when everything is working just fine. If it’s not broken, don’t fix it, right? In most environments I’ve worked with, Zookeeper is a more than adequate cluster management service. For small/medium-sized use cases (i.e., < 50 MB/s throughput, < 5K partitions, < 1000 ms latencies, etc.) Zookeeper works great right out of the box, and I don’t see a huge push to change those environments, at least from a performance perspective. For large/X-large use cases (i.e., > 1 GB/s throughput, > 10K partitions, > 1000 ms latencies, etc.), that’s where KRaft mode really is a game changer. Every time I’ve had to assist our Kafka customers with large/X-large environments, the root cause has consistently been a Zookeeper issue.

That said, there are still plenty of reasons for small/medium-sized enterprises to migrate to KRaft, if for no other reason than to keep your enterprise software up to date. With Zookeeper fully deprecated in 4.0, everyone will have to migrate eventually or be left with unsupported environments that aren’t receiving security updates. So, with that in mind, I would at the very least upgrade to 3.9 even if you want to stay on Zookeeper. Metadata versions prior to 3.3 are incompatible with KRaft, so getting your metadata version upgraded will position you better for the transition when you finally make the jump. Also, note there are some wire protocol issues with really old versions of Kafka that might necessitate additional interim upgrades to get to 3.9. I wrote a blog post on the topic you might find useful: https://www.openlogic.com/blog/upgrade-kafka-4-planning

As for production readiness, I would say yes, it’s definitely production ready. We actually recommend “green field” deployments use KRaft mode and avoid Zookeeper altogether now.
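If you want to check where the cluster currently stands before planning the jump, something like the following is one way to do it (a sketch, not gospel — the bootstrap address is a placeholder for your environment, and `kafka-features.sh` ships in the `bin/` directory of recent Kafka distributions):

```
# Show the finalized feature/metadata versions the cluster is running
bin/kafka-features.sh --bootstrap-server localhost:9092 describe

# On a Zookeeper-based cluster, the effective version is driven by
# inter.broker.protocol.version in server.properties; it needs to be
# at 3.3 or later before a KRaft migration is possible, e.g.:
#   inter.broker.protocol.version=3.9
```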
The community has done some great work getting to 3.9 and addressing a lot of the limitations and caveats that KRaft had in its early days. I’ve recently published a blog post you might find useful on that topic as well: https://www.openlogic.com/blog/kafka-raft-mode

Also, just FYI: we’ve received a lot of similar questions from our customers, so we have been working on some LTS products around Zookeeper and pre-4.0 Kafka to help folks in similar situations extend the runway to 4.0 with additional support options and training. If any of that sounds useful, definitely reach out!

--
Thanks,
Joe Carder | Enterprise Architect, Open Logic
Perforce Software
P: 866.399.6736
Visit us on: LinkedIn | Twitter | Facebook | YouTube

On 2025/03/21 11:55:01 Manabolu Sisindri wrote:
> Hello everyone,
> Can someone suggest here please.
>
> Regards,
> Sisindri M.
>
> On Fri, Mar 7, 2025 at 7:33 AM Manabolu Sisindri <mailto:ma...@gmail.com> wrote:
>
> > Hi Team,
> >
> > We’re evaluating whether to migrate to *KRaft mode* (Kafka 3.9.0) from
> > our current Zookeeper-based setup. Given the stability of our current
> > system, do you recommend migrating now, or should we continue with
> > Zookeeper for the time being and plan the migration later?
> >
> > If we do migrate, can we consider *Kafka 3.9.0* as stable for production
> > workloads, or are there any known limitations or issues in KRaft at this
> > moment?
> >
> > Looking forward to your thoughts.
> > --
> > Regards,
> > Sisindri,
> > 8317502751.

This e-mail may contain information that is privileged or confidential. If you are not the intended recipient, please delete the e-mail and any attachments and notify us immediately.
Kafka process fails to start when a special character is present in the keystore password (SSL encryption and SASL authentication)
Hi Luke,

We are using Kafka 3.7.0 broker/client in our prod environment with SASL_SSL communication between the Kafka clients and the broker. We start the Kafka process from the shell using the commands below:

`nohup $EXEC_KAFKA_CONFIG --zookeeper 127.0.0.1:2181 --entity-type brokers --entity-name 0 --alter --add-config $zooKeeperConfig >> $KAFKA_HOME/logs/nohup_z.out 2>&1 &`

`nohup $KAFKA_HOME/bin/kafka-server-start.sh $KAFKA_HOME/config/server.properties >> $KAFKA_HOME/logs/nohup_b.out 2>&1 &`

Here, we pass the SSL keystore and truststore password details in $zooKeeperConfig as shown below:

zooKeeperConfig="listener.name.sasl_ssl.ssl.truststore.password=$KAFKA_SSL_KEYSTORE_AND_TRUSTSTORE_PWD,listener.name.sasl_ssl.ssl.keystore.password=$KAFKA_SSL_KEYSTORE_AND_TRUSTSTORE_PWD,listener.name.sasl_ssl.ssl.key.password=$KAFKA_SSL_KEYSTORE_AND_TRUSTSTORE_PWD,$KAFKA_SSL_PASSWORD_ENCODER_SECRET_PROP"

Due to security limitations, we do not put the SSL keystore and truststore passwords in the config/server.properties file. Everything runs fine when the password does not contain any special characters, but when certain special characters are present in the passwords, Kafka fails to start with the exception below in server.log:

Caused by: org.apache.kafka.common.KafkaException: org.apache.kafka.common.KafkaException: Failed to load SSL keystore /xx/xx/xx/kafka/client.truststore.jks of type JKS
    at org.apache.kafka.common.network.SaslChannelBuilder.configure(SaslChannelBuilder.java:184)
    at org.apache.kafka.common.network.ChannelBuilders.create(ChannelBuilders.java:192)
    at org.apache.kafka.common.network.ChannelBuilders.clientChannelBuilder(ChannelBuilders.java:81)
    at org.apache.kafka.clients.ClientUtils.createChannelBuilder(ClientUtils.java:119)
    at org.apache.kafka.clients.ClientUtils.createNetworkClient(ClientUtils.java:223)
    ... 10 more
Caused by: org.apache.kafka.common.KafkaException: Failed to load SSL keystore /xx/xx/xx/kafka/client.truststore.jks of type JKS
    at org.apache.kafka.common.security.ssl.DefaultSslEngineFactory$FileBasedStore.load(DefaultSslEngineFactory.java:382)
    at org.apache.kafka.common.security.ssl.DefaultSslEngineFactory$FileBasedStore.<init>(DefaultSslEngineFactory.java:354)
    at org.apache.kafka.common.security.ssl.DefaultSslEngineFactory.createTruststore(DefaultSslEngineFactory.java:327)
    at org.apache.kafka.common.security.ssl.DefaultSslEngineFactory.configure(DefaultSslEngineFactory.java:171)
    at org.apache.kafka.common.security.ssl.SslFactory.instantiateSslEngineFactory(SslFactory.java:141)
    at org.apache.kafka.common.security.ssl.SslFactory.configure(SslFactory.java:98)
    at org.apache.kafka.common.network.SaslChannelBuilder.configure(SaslChannelBuilder.java:180)
    ... 14 more
Caused by: java.io.IOException: Keystore was tampered with, or password was incorrect
    at java.base/sun.security.provider.JavaKeyStore.engineLoad(JavaKeyStore.java:813)
    at java.base/sun.security.util.KeyStoreDelegator.engineLoad(KeyStoreDelegator.java:221)
    at java.base/java.security.KeyStore.load(KeyStore.java:1473)
    at org.apache.kafka.common.security.ssl.DefaultSslEngineFactory$FileBasedStore.load(DefaultSslEngineFactory.java:379)
    ... 20 more
Caused by: java.security.UnrecoverableKeyException: Password verification failed
    at java.base/sun.security.provider.JavaKeyStore.engineLoad(JavaKeyStore.java:811)
    ... 23 more

We have tested various special characters in passwords, including:

! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ { | } ~ `

Among these, the following characters work fine, and the Kafka service runs without issues:

! @ # % ^ & * _ - . ? / ~ : ; < > | { } $ +

(We tested these by placing them at the end of the password, e.g., abc4!@#%^&*_-.?/~:;<>|{}$+.)

However, we observed that some characters behave differently depending on their position in the password.
$ and + work if used at the end of the password but cause issues if used at the beginning. Certain characters, such as , [ ] ( ) ` = do not work regardless of their position.

Please note that the same password works successfully when passed in the config/server.properties file. We think this behavior occurs because Kafka is started via a shell script, and some special characters have predefined meanings in the shell, leading to unintended interpretation. Since the position of a character affects its behavior, there could be other placements of the characters listed as working above that still cause failures. Since we suspect the shell is interpreting these characters, could you advise on the correct way to quote or escape such passwords?
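For what it’s worth, the position-dependent behavior described above is consistent with ordinary shell expansion rules: inside double quotes, `$`, backticks, and `\` are still special, and an unquoted expansion additionally undergoes word splitting (likely what breaks `,`) and pathname globbing (`[ ]`). A minimal sketch with a hypothetical password, no Kafka involved, showing the double-quote vs. single-quote difference:

```shell
#!/bin/sh
# Hypothetical password containing characters from the problem list.

# Double quotes: the shell still performs $-expansion, so "$4" is read
# as positional parameter 4 (empty in this script) and silently vanishes.
pw_double="abc$4+"

# Single quotes: every character is passed through literally.
pw_single='abc$4+'

printf 'double-quoted: %s\n' "$pw_double"   # prints: double-quoted: abc+
printf 'single-quoted: %s\n' "$pw_single"   # prints: single-quoted: abc$4+
```

The same logic applies when the value is handed to the Kafka tools: double-quoting the final expansion (`--add-config "$zooKeeperConfig"`) prevents word splitting at the commas, but any characters mangled while the variable was being assembled are already gone by then, so the assignment itself needs single quotes around the literal password. Alternatively, if your 3.7.0 install supports it (worth verifying), `kafka-configs.sh --add-config-file` reads the settings from a properties file and sidesteps shell parsing of the password entirely.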