I looked at the latest code (Kafka 2.2.0) in NetworkClient.java and don't
see any changes in the related code, so I believe the problem exists there
as well.

Created this JIRA to track the issue:
https://issues.apache.org/jira/browse/KAFKA-8089
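
As a rough illustration of the fail-fast behaviour requested in the original
report quoted below, here is a minimal, self-contained Java sketch. It does
not use the Kafka client library: AuthenticationException, Poller, Outcome,
and runLoop are hypothetical stand-ins chosen for this example, not Kafka's
actual classes and not a proposed patch. It only shows the idea of letting
the authentication failure escape the poll call so the loop can stop at once
instead of logging the error for hours.

```java
class FailFastPollSketch {

    /** Hypothetical stand-in for an authentication failure surfaced by poll(). */
    static class AuthenticationException extends RuntimeException {
        AuthenticationException(String message) {
            super(message);
        }
    }

    /** Hypothetical stand-in for a consumer; poll() returns the number of records fetched. */
    interface Poller {
        int poll();
    }

    /** How the loop ended. */
    enum Outcome { COMPLETED, AUTH_FAILED }

    /**
     * Polls up to maxPolls times. If poll() throws AuthenticationException,
     * the loop stops immediately so the caller can restart the process with
     * a fresh certificate.
     */
    static Outcome runLoop(Poller poller, int maxPolls) {
        for (int i = 0; i < maxPolls; i++) {
            try {
                poller.poll();
            } catch (AuthenticationException e) {
                // Fail fast: do not keep retrying with an expired certificate.
                System.err.println("Authentication failed, stopping: " + e.getMessage());
                return Outcome.AUTH_FAILED;
            }
        }
        return Outcome.COMPLETED;
    }
}
```

In this sketch the caller would treat Outcome.AUTH_FAILED as the signal to
exit and be restarted with the new certificate.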

On Thu, Mar 7, 2019 at 11:39 PM Ismael Juma <ism...@juma.me.uk> wrote:

> Hi,
>
> It would be great to verify that this happens with Kafka 2.2.0 RC1. If it
> does, then please file a JIRA so that this doesn't get lost.
>
> Ismael
>
> On Thu, Mar 7, 2019 at 4:19 PM Henry Cai <h...@pinterest.com.invalid>
> wrote:
>
> > Hi,
> >
> > We have been using Kafka 2.0's mirror maker (which uses the high-level
> > consumer) to do replication.  The topic is SSL-enabled and the
> > certificate expires at a random time within 12 hours.  When the
> > certificate expires, we see many SSL-related exceptions in the log:
> >
> > [2019-03-07 18:02:54,128] ERROR [Consumer
> > clientId=kafkamirror-euw1-use1-m10nkafka03-1,
> > groupId=kafkamirror-euw1-use1-m10nkafka03] Connection to node 3005 failed
> > authentication due to: SSL handshake failed
> > (org.apache.kafka.clients.NetworkClient)
> >
> >
> > This error repeats for several hours.
> >
> >
> > However, even with the SSL error, the preexisting socket connections
> > still work, so the main fetching activity is not actually affected.  The
> > metadata operations from the client and the heartbeats from the
> > heartbeat thread are affected, since they might open new socket
> > connections.  I think the errors most likely originate from those side
> > activities.
> >
> >
> > The situation lasts several hours, until the main fetcher thread tries
> > to open a new connection (usually due to a consumer rebalance).  At that
> > point the SSL authentication exception aborts the operation and mirror
> > maker exits.
> >
> >
> > During those several hours, the client cannot get the latest metadata
> > and heartbeats also falter (we see rebalancing triggered because of
> > this).
> >
> >
> > When NetworkClient.processDisconnection() prints the ERROR message
> > above, could it also throw the AuthenticationException up the stack?
> > That would kill KafkaConsumer.poll() and speed up recycling the
> > certificate (in our case, we would restart mirror maker with the new
> > certificate).
> >
>
