Kafka service failure seen during scaling
Hi Team,

We tried to scale Kafka from a single broker to 3 brokers. During scaling, we get the error below. We are using apache/kafka version 3.8.0.

[2024-11-13 23:00:57,251] ERROR Encountered fatal fault: Unable to apply PartitionChangeRecord record at offset 264919 on standby controller, from the batch with baseOffset 264919 (org.apache.kafka.server.fault.ProcessTerminatingFaultHandler)
java.lang.RuntimeException: Tried to create partition YFqfehupTfah0LfzGbw-wA:1, but no topic with that ID was found.
    at org.apache.kafka.controller.ReplicationControlManager.replay(ReplicationControlManager.java:526)
    at org.apache.kafka.controller.QuorumController.replay(QuorumController.java:1504)
    at org.apache.kafka.controller.QuorumController.access$1700(QuorumController.java:179)
    at org.apache.kafka.controller.QuorumController$QuorumMetaLogListener.lambda$handleCommit$0(QuorumController.java:1083)
    at org.apache.kafka.controller.QuorumController$QuorumMetaLogListener.lambda$appendRaftEvent$3(QuorumController.java:1192)
    at org.apache.kafka.controller.QuorumController$ControllerEvent.run(QuorumController.java:577)
    at org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:131)
    at org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:214)
    at org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:185)
    at java.base/java.lang.Thread.run(Thread.java:840)
[2024-11-13 23:00:57,251] ERROR Encountered fatal fault: Error loading metadata log record from offset 264919 (org.apache.kafka.server.fault.ProcessTerminatingFaultHandler)

Please help us resolve the issue.

Thanks & Regards,
Sravani
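To make it easier to correlate this fault with the metadata log, the topic ID, partition number, and offset can be pulled out of the error text. The following is only a minimal sketch: the function name `parse_fault` and both regular expressions are illustrative assumptions written against the message wording shown above, not part of any Kafka tooling.

```python
import re

# Hypothetical patterns, written against the message text quoted above.
# Kafka topic IDs are 22-character base64url strings.
PARTITION_RE = re.compile(
    r"Tried to create partition (?P<topic_id>[A-Za-z0-9_-]{22}):(?P<partition>\d+)"
)
OFFSET_RE = re.compile(r"record at offset (?P<offset>\d+)")


def parse_fault(log_text):
    """Return (topic_id, partition, offset) from a controller fault log, or None."""
    partition_match = PARTITION_RE.search(log_text)
    offset_match = OFFSET_RE.search(log_text)
    if not partition_match or not offset_match:
        return None
    return (
        partition_match.group("topic_id"),
        int(partition_match.group("partition")),
        int(offset_match.group("offset")),
    )


# Sample text copied from the error reported in this message.
SAMPLE = (
    "ERROR Encountered fatal fault: Unable to apply PartitionChangeRecord record "
    "at offset 264919 on standby controller, from the batch with baseOffset 264919 "
    "java.lang.RuntimeException: Tried to create partition "
    "YFqfehupTfah0LfzGbw-wA:1, but no topic with that ID was found."
)

if __name__ == "__main__":
    print(parse_fault(SAMPLE))
```

The extracted values identify which record (offset 264919) and which topic ID to look for when inspecting the controller's metadata log.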
Intermediate CA communication is not working between kafka and kafka-topic-manager/applications
Hi Team,

We are facing an issue with kafka-topic-manager when an intermediate CA is present. Please let us know how to resolve it. Kafka 3.8.0 is being used.

For communication between Kafka and kafka-topic-manager we use internal and third-party CA certificates. When we connect using a certificate path that contains multiple CAs, communication between Kafka and the applications breaks.

Example 1: the certificate is signed by the internal CA (no intermediate CA). We did not find any issue.
Certificate chain: certificate -> internal CA
Result: SSL handshake completed successfully with peerHost

Nov 5 15:59:49 localhost kafka[128794]: [2024-11-05 13:59:49,380] DEBUG Accepted connection from /172.17.0.1:37520 on /172.17.0.18:9092 and assigned it to processor 1, sendBufferSize [actual|requested]: [102400|102400] recvBufferSize [actual|requested]: [102400|102400] (kafka.network.DataPlaneAcceptor)
Nov 5 15:59:49 localhost kafka[128794]: [2024-11-05 13:59:49,380] DEBUG Processor 1 listening to new connection from /172.17.0.1:37520 (kafka.network.Processor)
Nov 5 15:59:49 localhost kafka[128794]: [2024-11-05 13:59:49,401] DEBUG [SslTransportLayer channelId=172.17.0.18:9092-172.17.0.1:37520-15 key=channel=java.nio.channels.SocketChannel[connected local=/172.17.0.18:9092 remote=/172.17.0.1:37520], selector=sun.nio.ch.EPollSelectorImpl@12a58e5e, interestOps=1, readyOps=0] SSL handshake completed successfully with peerHost '172.17.0.1' peerPort 37520 peerPrincipal 'CN=kafka-topic-manager-localhost' protocol 'TLSv1.3' cipherSuite 'TLS_AES_128_GCM_SHA256' (org.apache.kafka.common.network.SslTransportLayer)

Example 2: the certificate is signed by the internal CA, which is itself signed by the third-party CA (with intermediate CA). The handshake is failing.
Certificate chain: certificate -> internal CA -> third-party CA
Result: SSLHandshake NEED_UNWRAP channelId

Nov 5 16:38:21 localhost kafka[1332937]: [2024-11-05 14:38:21,370] DEBUG Processor 1 listening to new connection from /172.17.0.1:45242 (kafka.network.Processor)
Nov 5 16:38:21 localhost kafka[1332937]: [2024-11-05 14:38:21,370] DEBUG Accepted connection from /172.17.0.1:45242 on /172.17.0.141:9092 and assigned it to processor 1, sendBufferSize [actual|requested]: [102400|102400] recvBufferSize [actual|requested]: [102400|102400] (kafka.network.DataPlaneAcceptor)
Nov 5 16:38:21 localhost kafka[1332937]: [2024-11-05 14:38:21,370] TRACE [SslTransportLayer channelId=172.17.0.141:9092-172.17.0.1:45242-825 key=channel=java.nio.channels.SocketChannel[connected local=/172.17.0.141:9092 remote=/172.17.0.1:45242], selector=sun.nio.ch.EPollSelectorImpl@39027b65, interestOps=1, readyOps=0] SSLHandshake NEED_UNWRAP channelId 172.17.0.141:9092-172.17.0.1:45242-825, appReadBuffer pos 0, netReadBuffer pos 0, netWriteBuffer pos 0 (org.apache.kafka.common.network.SslTransportLayer)

Thanks & Regards,
Sravani
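One thing that may be worth ruling out for Example 2 (this is an assumption, nothing in the logs above confirms it) is whether the PEM file presented by the client actually contains every certificate of the chain, not just the leaf. The sketch below counts the certificate blocks in a PEM bundle; the function names `split_pem_bundle` and `chain_looks_complete` and the sample bundle are hypothetical, and the sample bodies are placeholders, not real certificates.

```python
import re

# Matches one PEM-encoded certificate block, including its BEGIN/END markers.
PEM_BLOCK_RE = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----", re.DOTALL
)


def split_pem_bundle(pem_text):
    """Return the individual certificate blocks found in a PEM bundle, in file order."""
    return PEM_BLOCK_RE.findall(pem_text)


def chain_looks_complete(pem_text, expected_length):
    """True if the bundle holds at least `expected_length` certificate blocks.

    This only counts blocks; it does not validate signatures or ordering.
    """
    return len(split_pem_bundle(pem_text)) >= expected_length


# Placeholder bundle: two blocks standing in for "certificate" and "internal CA".
SAMPLE_BUNDLE = (
    "-----BEGIN CERTIFICATE-----\nTEVBRg==\n-----END CERTIFICATE-----\n"
    "-----BEGIN CERTIFICATE-----\nSU5URVJOQUw=\n-----END CERTIFICATE-----\n"
)

if __name__ == "__main__":
    print(len(split_pem_bundle(SAMPLE_BUNDLE)))
    print(chain_looks_complete(SAMPLE_BUNDLE, 3))
```

For the chain certificate -> internal CA -> third-party CA, a bundle with fewer blocks than the peer needs to build a path to a CA it trusts would be one possible explanation for a handshake that never completes.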
Re: Kafka service failure seen during scaling
Hi Team,

Any update on this?

Regards,
Sravani

On Tue, Nov 19, 2024, 16:56 Sravani wrote:

> Hi Team,
>
> We tried to scale kafka from single broker to 3 brokers. During scaling,
> getting below error.
> We are using apache/kafka 3.8.0 version
>
> [2024-11-13 23:00:57,251] ERROR Encountered fatal fault: Unable to apply PartitionChangeRecord record at offset 264919 on standby controller, from the batch with baseOffset 264919 (org.apache.kafka.server.fault.ProcessTerminatingFaultHandler)
> java.lang.RuntimeException: Tried to create partition YFqfehupTfah0LfzGbw-wA:1, but no topic with that ID was found.
>     at org.apache.kafka.controller.ReplicationControlManager.replay(ReplicationControlManager.java:526)
>     at org.apache.kafka.controller.QuorumController.replay(QuorumController.java:1504)
>     at org.apache.kafka.controller.QuorumController.access$1700(QuorumController.java:179)
>     at org.apache.kafka.controller.QuorumController$QuorumMetaLogListener.lambda$handleCommit$0(QuorumController.java:1083)
>     at org.apache.kafka.controller.QuorumController$QuorumMetaLogListener.lambda$appendRaftEvent$3(QuorumController.java:1192)
>     at org.apache.kafka.controller.QuorumController$ControllerEvent.run(QuorumController.java:577)
>     at org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:131)
>     at org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:214)
>     at org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:185)
>     at java.base/java.lang.Thread.run(Thread.java:840)
> [2024-11-13 23:00:57,251] ERROR Encountered fatal fault: Error loading metadata log record from offset 264919 (org.apache.kafka.server.fault.ProcessTerminatingFaultHandler)
>
> Please help us to resolve the issue.
>
> Thanks & Regards,
> Sravani
Kafka services are unstable in 3 node cluster
a:679)
Feb 27 22:20:29 localhost kafka[835188]:     at kafka.network.SocketServer.$anonfun$stopProcessingRequests$4(Socket
Feb 27 22:22:44 localhost kafka[841117]: [2025-02-27 20:22:44,151] WARN [RaftManager id=2] Graceful shutdown timed out after 5000ms (org.apache.kafka.raft.KafkaRaftClient)
Feb 27 22:22:44 localhost kafka[841117]: [2025-02-27 20:22:44,151] ERROR [RaftManager id=2] Graceful shutdown of RaftClient failed (org.apache.kafka.raft.KafkaRaftClientDriver)

Please prioritise this issue and let us know.

We have already created a ticket: https://issues.apache.org/jira/browse/KAFKA-18958

Thanks,
Sravani