RE: Re: Migration to KRaft

2025-03-26 Thread Joe Carder
Hi Sisindri,

It’s definitely a challenge to justify a major architecture change when 
everything is working just fine.  If it’s not broke, don’t fix it, right?  In 
most environments I’ve worked with, Zookeeper has been a more than adequate 
cluster management service.  For small/medium sized use cases (e.g. < 50MB/s 
throughput, < 5K partitions, with < 1000 ms latencies etc.) Zookeeper works 
pretty great right out of the box, and I don’t see a huge push to change those 
environments, at least from a performance perspective.  For large/X-Large use 
cases (e.g. > 1GB/s throughput, > 10K partitions, > 1000 ms latencies etc.), 
that’s where KRaft mode really is a game changer.  Every time I’ve had to 
assist our Kafka customers with large/X-large environments, the problem has 
consistently been a Zookeeper issue.

That said, there are still plenty of reasons for small/medium sized enterprises 
to migrate to KRaft, if for no other reason than to keep our enterprise 
software up to date.  With Zookeeper support removed entirely in 4.0, everyone 
will have to migrate eventually or get left with unsupported environments that 
aren’t receiving security updates.  So, with that in mind, I would at the very 
least go ahead and upgrade to 3.9 even if you want to stay on Zookeeper.  All 
metadata versions prior to 3.3 will be incompatible with KRaft, so getting your 
metadata version upgraded will position you better for the transition when you 
finally make the jump.  Also, note there are some additional wire protocol 
issues with really old versions of Kafka that might necessitate additional 
interim upgrades to get to 3.9.  I wrote a blog post on the topic that you 
might find useful here: https://www.openlogic.com/blog/upgrade-kafka-4-planning
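
One way to see where a Zookeeper-based broker stands relative to the 3.3 cutoff 
is to read `inter.broker.protocol.version` from its server.properties.  The 
following is a minimal sketch, assuming that property is what pins the metadata 
version on your Zookeeper-mode brokers.  The function names and the sample path 
are hypothetical and are not part of Kafka.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; they are not shipped with Kafka.

# Print the last inter.broker.protocol.version value set in a properties file.
ibp_from_file() {
    sed -n 's/^[[:space:]]*inter\.broker\.protocol\.version[[:space:]]*=[[:space:]]*//p' "$1" | tail -n 1
}

# Succeed if HAVE >= WANT, comparing only MAJOR.MINOR
# (so a value such as "3.3-IV0" is treated as 3.3).
version_at_least() {
    awk -v have="$1" -v want="$2" 'BEGIN {
        split(have, h, "."); split(want, w, ".")
        if (h[1] + 0 != w[1] + 0) exit !(h[1] + 0 > w[1] + 0)
        exit !(h[2] + 0 >= w[2] + 0)
    }'
}

# Example usage (the path is an assumption; adjust to your installation):
#   ibp=$(ibp_from_file "$KAFKA_HOME/config/server.properties")
#   if version_at_least "$ibp" 3.3; then echo "at or above 3.3"
#   else echo "below 3.3, or not set in this file"; fi
```

If the property is not set in the file, the check reports "below 3.3, or not 
set", so an empty result needs a manual look rather than being read as a 
failure.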

As for production readiness, I would say yes, it’s definitely production ready. 
We actually recommend that “green field” deployments use KRaft mode and avoid 
Zookeeper altogether now.  The community has done some great work getting to 
3.9 and addressing a lot of the limitations and caveats that KRaft had in its 
early days.  I’ve recently published a blog post you might find useful on that 
topic as well: https://www.openlogic.com/blog/kafka-raft-mode

Also just fyi… we’ve received a lot of similar questions from our customers, so 
we have been working on some LTS products around Zookeeper and pre-4.0 Kafka to 
help folks in similar situations extend the runway to 4.0 with additional 
support options and training.  If any of that sounds useful, definitely reach 
out!

--
Thanks,
Joe Carder | Enterprise Architect, Open Logic
Perforce Software
P: 866.399.6736


On 2025/03/21 11:55:01 Manabolu Sisindri wrote:
> Hello everyone,
> Could someone please advise on this?
>
> Regards,
> Sisindri M.
>
> On Fri, Mar 7, 2025 at 7:33 AM Manabolu Sisindri 
> mailto:ma...@gmail.com>>
> wrote:
>
> >
> > Hi Team,
> >
> > We’re evaluating whether to migrate to *KRaft mode* (Kafka 3.9.0) from
> > our current Zookeeper-based setup. Given the stability of our current
> > system, do you recommend migrating now, or should we continue with
> > Zookeeper for the time being and plan the migration later?
> >
> > If we do migrate, can we consider *Kafka 3.9.0* as stable for production
> > workloads, or are there any known limitations or issues in KRaft at this
> > moment?
> >
> > Looking forward to your thoughts.
> > --
> > Regards,
> > Sisindri,
> > 8317502751.
> >
>





Kafka process fails to start when special character is present in Keystore password in SSL encryption and SASL authentication

2025-03-26 Thread Deepak Jain
Hi Luke,



We are running a Kafka 3.7.0 broker/client system in our production 
environment, with SASL_SSL communication between the Kafka clients and the 
broker.  We start the Kafka process from the shell using the commands below.


`nohup $EXEC_KAFKA_CONFIG --zookeeper 127.0.0.1:2181 --entity-type brokers 
--entity-name 0 --alter --add-config $zooKeeperConfig >> 
$KAFKA_HOME/logs/nohup_z.out 2>&1 &`
`nohup $KAFKA_HOME/bin/kafka-server-start.sh 
$KAFKA_HOME/config/server.properties >> $KAFKA_HOME/logs/nohup_b.out 2>&1 &`


Here, we are passing the SSL Keystore and truststore password details in 
$zooKeeperConfig as shown below:


zooKeeperConfig="listener.name.sasl_ssl.ssl.truststore.password=$KAFKA_SSL_KEYSTORE_AND_TRUSTSTORE_PWD,listener.name.sasl_ssl.ssl.keystore.password=$KAFKA_SSL_KEYSTORE_AND_TRUSTSTORE_PWD,listener.name.sasl_ssl.ssl.key.password=$KAFKA_SSL_KEYSTORE_AND_TRUSTSTORE_PWD,$KAFKA_SSL_PASSWORD_ENCODER_SECRET_PROP"


Due to security restrictions, we do not put the SSL keystore and truststore 
passwords in the /config/server.properties file.


Everything runs fine when the password does not contain any special characters. 
When certain special characters are present in the password, Kafka fails to 
start and logs the exception below in the Kafka server.log.


--

Caused by: org.apache.kafka.common.KafkaException: org.apache.kafka.common.KafkaException: Failed to load SSL keystore /xx/xx/xx/kafka/client.truststore.jks of type JKS
    at org.apache.kafka.common.network.SaslChannelBuilder.configure(SaslChannelBuilder.java:184)
    at org.apache.kafka.common.network.ChannelBuilders.create(ChannelBuilders.java:192)
    at org.apache.kafka.common.network.ChannelBuilders.clientChannelBuilder(ChannelBuilders.java:81)
    at org.apache.kafka.clients.ClientUtils.createChannelBuilder(ClientUtils.java:119)
    at org.apache.kafka.clients.ClientUtils.createNetworkClient(ClientUtils.java:223)
    ... 10 more
Caused by: org.apache.kafka.common.KafkaException: Failed to load SSL keystore /xx/xx/xx/kafka/client.truststore.jks of type JKS
    at org.apache.kafka.common.security.ssl.DefaultSslEngineFactory$FileBasedStore.load(DefaultSslEngineFactory.java:382)
    at org.apache.kafka.common.security.ssl.DefaultSslEngineFactory$FileBasedStore.<init>(DefaultSslEngineFactory.java:354)
    at org.apache.kafka.common.security.ssl.DefaultSslEngineFactory.createTruststore(DefaultSslEngineFactory.java:327)
    at org.apache.kafka.common.security.ssl.DefaultSslEngineFactory.configure(DefaultSslEngineFactory.java:171)
    at org.apache.kafka.common.security.ssl.SslFactory.instantiateSslEngineFactory(SslFactory.java:141)
    at org.apache.kafka.common.security.ssl.SslFactory.configure(SslFactory.java:98)
    at org.apache.kafka.common.network.SaslChannelBuilder.configure(SaslChannelBuilder.java:180)
    ... 14 more
Caused by: java.io.IOException: Keystore was tampered with, or password was incorrect
    at java.base/sun.security.provider.JavaKeyStore.engineLoad(JavaKeyStore.java:813)
    at java.base/sun.security.util.KeyStoreDelegator.engineLoad(KeyStoreDelegator.java:221)
    at java.base/java.security.KeyStore.load(KeyStore.java:1473)
    at org.apache.kafka.common.security.ssl.DefaultSslEngineFactory$FileBasedStore.load(DefaultSslEngineFactory.java:379)
    ... 20 more
Caused by: java.security.UnrecoverableKeyException: Password verification failed
    at java.base/sun.security.provider.JavaKeyStore.engineLoad(JavaKeyStore.java:811)
    ... 23 more
-


We have tested various special characters in passwords, including:

! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \ ] ^ _ { | } ~`

Among these, the following characters work fine, and the Kafka service runs 
without issues:
! @ # % ^ & * _ - . ? / ~ : ; < > | { } $ +
(We tested these by placing them at the end of the password, e.g., 
abc4!@#%^&*_-.?/~:;<>|{}$+.)

However, we observed that some characters behave differently depending on their 
position in the password.
$ and + work if used at the end of the password but cause issues if used at the 
beginning.
Certain characters, such as , [ ] ( ) ` = do not work regardless of their 
position.
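
The comma and square-bracket failures are consistent with how `--add-config` is 
described in the kafka-configs help text: a comma-separated list of key=value 
pairs, with square brackets reserved for grouping values that contain commas. 
The sketch below is a simplified imitation of the comma splitting only, not 
Kafka's actual parser.  `count_pairs` and the sample property names are 
hypothetical.

```shell
#!/bin/sh
# Simplified imitation of comma-separated "key=value" parsing.
# This is NOT Kafka's parser; count_pairs is a hypothetical helper.

# Print how many comma-separated items the argument splits into.
count_pairs() {
    old_ifs=$IFS
    IFS=,
    set -f          # turn off globbing so characters like * or [ are not expanded
    set -- $1       # deliberate unquoted expansion: split on commas only
    set +f
    IFS=$old_ifs
    echo "$#"
}

# Two properties, password without a comma: splits into two items.
count_pairs 'ssl.keystore.password=abc4x,ssl.key.password=abc4x'

# Same two properties, but the first password contains a comma: three items,
# so the text after the comma is no longer treated as part of the password.
count_pairs 'ssl.keystore.password=abc,4x,ssl.key.password=abc4x'
```

Under this reading, a comma in the password would be taken as a separator 
before the value ever reaches the keystore, independent of any shell quoting.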

Please note that the same password works when it is set directly in the 
/config/server.properties file.

We think that this behavior occurs because Kafka is started via a shell script, 
and some special characters have predefined meanings in the shell, leading to 
unintended interpretation issues. Since the position of a character impacts its 
behavior, there could be other combinations where the allowed characters 
mentioned above are placed differently within the password, which may still 
cause failures.
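
To make the shell-interpretation hypothesis concrete, here is a minimal sketch 
of how a POSIX shell treats `$` at different positions inside double quotes.  It 
assumes the password passes through a double-quoted shell string at some point, 
which this thread does not confirm.  `abc4` and the variable names are made-up 
values used only for illustration.

```shell
#!/bin/sh
# Illustration only: abc4 and the sample passwords are made-up values.
unset abc4   # make sure no shell variable named abc4 exists

end_dq="abc4$"     # trailing $ has no name after it, so it stays a literal $
start_dq="$abc4"   # read as the variable abc4, which is unset, so this is empty
start_sq='$abc4'   # single quotes: no expansion, stored exactly as typed

printf 'end, double quotes:   [%s]\n' "$end_dq"
printf 'start, double quotes: [%s]\n' "$start_dq"
printf 'start, single quotes: [%s]\n' "$start_sq"
```

If this is what is happening, single-quoting the password where it is assigned, 
and writing "$zooKeeperConfig" in double quotes where it is used so the value 
is not word-split or glob-expanded, would be worth testing.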

Since we suspect