Hi all,

I'm having an issue with Kafka 2.3.0 that I'm hoping someone can give me 
some insight into.

For security reasons, I use two-way SSL authentication for inter-broker 
communication. I would like to use short-lived SSL certificates and rotate them 
frequently without needing to restart the brokers.
I'm trying to use the dynamic broker configuration feature to achieve this. 
Specifically, whenever I generate a new certificate, I update the 
"listener.name.interbroker.ssl.keystore.location" property (I change the 
filename every time I rotate certificates).
These are the other relevant parts of the broker config I'm using:

advertised.listeners=PUBLIC_CLIENT\://<redacted>\:9092,PRIVATE_CLIENT\://<redacted>\:9093,INTERBROKER\://<redacted>\:9094
inter.broker.listener.name=INTERBROKER
listener.name.interbroker.ssl.key.password=<redacted>
listener.name.interbroker.ssl.keystore.location=/opt/kafka/config/broker.keystore.jks
listener.name.interbroker.ssl.keystore.password=<redacted>
listener.name.interbroker.ssl.truststore.location=/opt/kafka/config/truststore.jks
listener.name.interbroker.ssl.truststore.password=<redacted>
listener.security.protocol.map=INTERBROKER\:SSL,PUBLIC_CLIENT\:SASL_PLAINTEXT,PRIVATE_CLIENT\:SASL_PLAINTEXT
listeners=PUBLIC_CLIENT\://<redacted>\:9092,PRIVATE_CLIENT\://<redacted>\:9093,INTERBROKER\://0.0.0.0\:9094
ssl.client.auth=required
ssl.enabled.protocols=TLSv1.2,TLSv1.1,TLSv1
ssl.protocol=TLSv1.2
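
For concreteness, here is a minimal sketch of how a dynamic keystore update of this kind could be issued with the stock kafka-configs.sh tool. This is an illustration under assumptions, not necessarily the exact mechanism I use: the helper name, the bootstrap address, the broker id and the new keystore filename are all hypothetical. Since the client listeners here are SASL_PLAINTEXT, a real invocation would also need --command-config pointing at a client properties file.

```python
from typing import List, Optional


def build_keystore_update_command(
    bootstrap_server: str,
    broker_id: int,
    keystore_path: str,
    listener: str = "interbroker",
    command_config: Optional[str] = None,
    kafka_configs: str = "bin/kafka-configs.sh",  # assumed install-relative path
) -> List[str]:
    """Build a kafka-configs.sh argument list for a per-broker dynamic update
    of the listener-scoped ssl.keystore.location property."""
    key = f"listener.name.{listener}.ssl.keystore.location"
    cmd = [
        kafka_configs,
        "--bootstrap-server", bootstrap_server,
        "--entity-type", "brokers",
        "--entity-name", str(broker_id),
        "--alter",
        "--add-config", f"{key}={keystore_path}",
    ]
    if command_config is not None:
        # Client properties (e.g. SASL settings) used to reach the broker.
        cmd += ["--command-config", command_config]
    return cmd


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    cmd = build_keystore_update_command(
        "localhost:9093", 1, "/opt/kafka/config/broker.keystore.2.jks"
    )
    print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run once per broker after each rotation.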

Now, setting this property works fine, and everything appears OK. But if I 
restart a broker after the original certificate has expired (the one the broker 
first started up with, which is no longer configured anywhere), the brokers 
suddenly start failing to communicate with each other. My logs fill up with 
messages like this:

[2019-07-22 03:57:43,605] INFO [SocketServer brokerId=1] Failed authentication 
with <IP address removed> (SSL handshake failed) 
(org.apache.kafka.common.network.Selector)

A little bit of extra logging injected into the code tells me that the failures 
are caused by out-of-date SSL certificates being used (even though the 
properties have been updated to use the new, still valid, certificates). 
Checking the certificates offered by the brokers on the inter-broker listener 
port shows the new, valid certificates being used. So it seems to me that some 
client-side network component inside Kafka is not picking up the new settings.
This sounds like the behaviour described in KAFKA-8336 
(https://issues.apache.org/jira/browse/KAFKA-8336), but that issue is marked as 
fixed in 2.3.0.
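
In case it helps anyone reproduce the check, here is a minimal sketch of how a certificate's expiry can be tested once its notAfter value is known. It assumes the timestamp is in the "Jul 22 03:57:43 2019 GMT" form (what `openssl x509 -noout -enddate` prints after "notAfter=", and what Python's `getpeercert()` returns under the "notAfter" key). The function name is hypothetical, and this is not the logging I added to the broker.

```python
import ssl
import time
from typing import Optional


def is_expired(not_after: str, now: Optional[float] = None) -> bool:
    """Return True if the certificate's notAfter time has been reached.

    not_after is a string such as "Jan  5 09:34:43 2018 GMT".
    now is seconds since the epoch; it defaults to the current time.
    """
    expiry = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return current >= expiry


if __name__ == "__main__":
    # Example timestamp only; substitute the value reported for your broker.
    print(is_expired("Jan  5 09:34:43 2018 GMT"))
```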

Am I configuring something improperly here? Has anyone else encountered this 
situation?

Thanks,
Michael
