Kafka service failure seen during scaling

2024-11-19 Thread Sravani
Hi Team,

We tried to scale kafka from a single broker to 3 brokers. During scaling,
we are getting the error below.
We are using apache/kafka version 3.8.0.

[2024-11-13 23:00:57,251] ERROR Encountered fatal fault: Unable to apply
PartitionChangeRecord record at offset 264919 on standby controller, from
the batch with baseOffset 264919
(org.apache.kafka.server.fault.ProcessTerminatingFaultHandler)
java.lang.RuntimeException: Tried to create partition
YFqfehupTfah0LfzGbw-wA:1, but no topic with that ID was found.
at
org.apache.kafka.controller.ReplicationControlManager.replay(ReplicationControlManager.java:526)
at
org.apache.kafka.controller.QuorumController.replay(QuorumController.java:1504)
at
org.apache.kafka.controller.QuorumController.access$1700(QuorumController.java:179)
at
org.apache.kafka.controller.QuorumController$QuorumMetaLogListener.lambda$handleCommit$0(QuorumController.java:1083)
at
org.apache.kafka.controller.QuorumController$QuorumMetaLogListener.lambda$appendRaftEvent$3(QuorumController.java:1192)
at
org.apache.kafka.controller.QuorumController$ControllerEvent.run(QuorumController.java:577)
at
org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:131)
at
org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:214)
at
org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:185)
at java.base/java.lang.Thread.run(Thread.java:840)
[2024-11-13 23:00:57,251] ERROR Encountered fatal fault: Error loading
metadata log record from offset 264919
(org.apache.kafka.server.fault.ProcessTerminatingFaultHandler)


Please help us to resolve the issue.


Thanks & Regards,

Sravani


Intermediate CA communication is not working between kafka and kafka-topic-manager/applications

2024-12-02 Thread Sravani
Hi Team,

We are facing an issue with kafka-topic-manager when an intermediate CA is
present. Please let us know how to resolve this issue.
Kafka 3.8.0 is being used.

For communication between kafka and kafka-topic-manager we are using
internal and third-party CA certificates. When we try to connect using a
certificate path with multiple CAs, communication between kafka and the
applications breaks.

Example 1: certificate is signed directly by the internal CA (no
intermediate CA) - we did not find any issue.
Certificate chain: certificate -> internal CA

The SSL handshake completes successfully:

Nov 5 15:59:49 localhost kafka[128794]: [2024-11-05 13:59:49,380] DEBUG Accepted connection from /172.17.0.1:37520 on /172.17.0.18:9092 and assigned it to processor 1, sendBufferSize [actual|requested]: [102400|102400] recvBufferSize [actual|requested]: [102400|102400] (kafka.network.DataPlaneAcceptor)
Nov 5 15:59:49 localhost kafka[128794]: [2024-11-05 13:59:49,380] DEBUG Processor 1 listening to new connection from /172.17.0.1:37520 (kafka.network.Processor)
Nov 5 15:59:49 localhost kafka[128794]: [2024-11-05 13:59:49,401] DEBUG [SslTransportLayer channelId=172.17.0.18:9092-172.17.0.1:37520-15 key=channel=java.nio.channels.SocketChannel[connected local=/172.17.0.18:9092 remote=/172.17.0.1:37520], selector=sun.nio.ch.EPollSelectorImpl@12a58e5e, interestOps=1, readyOps=0] SSL handshake completed successfully with peerHost '172.17.0.1' peerPort 37520 peerPrincipal 'CN=kafka-topic-manager-localhost' protocol 'TLSv1.3' cipherSuite 'TLS_AES_128_GCM_SHA256' (org.apache.kafka.common.network.SslTransportLayer)

Example 2: certificate is signed by the internal CA, which is in turn
signed by a third-party CA (with intermediate CA) - the handshake fails.
Certificate chain: certificate -> internal CA -> third-party CA

The handshake never completes and stays in NEED_UNWRAP:

Nov 5 16:38:21 localhost kafka[1332937]: [2024-11-05 14:38:21,370] DEBUG Processor 1 listening to new connection from /172.17.0.1:45242 (kafka.network.Processor)
Nov 5 16:38:21 localhost kafka[1332937]: [2024-11-05 14:38:21,370] DEBUG Accepted connection from /172.17.0.1:45242 on /172.17.0.141:9092 and assigned it to processor 1, sendBufferSize [actual|requested]: [102400|102400] recvBufferSize [actual|requested]: [102400|102400] (kafka.network.DataPlaneAcceptor)
Nov 5 16:38:21 localhost kafka[1332937]: [2024-11-05 14:38:21,370] TRACE [SslTransportLayer channelId=172.17.0.141:9092-172.17.0.1:45242-825 key=channel=java.nio.channels.SocketChannel[connected local=/172.17.0.141:9092 remote=/172.17.0.1:45242], selector=sun.nio.ch.EPollSelectorImpl@39027b65, interestOps=1, readyOps=0] SSLHandshake NEED_UNWRAP channelId 172.17.0.141:9092-172.17.0.1:45242-825, appReadBuffer pos 0, netReadBuffer pos 0, netWriteBuffer pos 0 (org.apache.kafka.common.network.SslTransportLayer)
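For what it's worth, a handshake that succeeds with a directly-signed certificate but stalls once an intermediate CA appears is often a chain-presentation problem: the server's keystore (or served PEM) must contain the full chain (leaf plus intermediate), since a client that trusts only the root cannot complete the chain from the leaf alone. The sketch below, with entirely hypothetical file names and subjects, reproduces the three-level chain from Example 2 with openssl and shows that the leaf only verifies against the root when the intermediate is supplied alongside it:

```shell
# Hypothetical 3-level chain: leaf -> internal (intermediate) CA -> third-party root CA.
# All file names and subjects are made up for illustration.
set -e

# Self-signed third-party root CA
openssl req -x509 -newkey rsa:2048 -nodes -keyout root.key -out root.crt \
    -subj "/CN=thirdparty-root-ca" -days 1

# Internal CA signed by the root; CA:TRUE is required so it can sign leaf certs
printf 'basicConstraints=CA:TRUE\n' > ca.ext
openssl req -newkey rsa:2048 -nodes -keyout internal.key -out internal.csr \
    -subj "/CN=internal-ca"
openssl x509 -req -in internal.csr -CA root.crt -CAkey root.key \
    -set_serial 1 -days 1 -extfile ca.ext -out internal.crt

# Leaf certificate signed by the internal CA
openssl req -newkey rsa:2048 -nodes -keyout leaf.key -out leaf.csr \
    -subj "/CN=kafka-topic-manager-localhost"
openssl x509 -req -in leaf.csr -CA internal.crt -CAkey internal.key \
    -set_serial 2 -days 1 -out leaf.crt

# The key point: the leaf verifies against the root only when the
# intermediate is available too, e.g. as part of the chain the server sends
cat leaf.crt internal.crt > leaf-chain.crt
openssl verify -CAfile root.crt -untrusted internal.crt leaf.crt
```

If the keystore holds only the leaf, the broker never sends the intermediate, and a peer trusting only the root cannot build the chain, which is consistent with a handshake sitting in NEED_UNWRAP. This is only a sketch of the chain mechanics, not a statement about where the misconfiguration is in this particular setup.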





Thanks & Regards,
Sravani


Re: Kafka service failure seen during scaling

2024-12-02 Thread Sravani
Hi Team,

Any update on this?

Regards,
Sravani

On Tue, Nov 19, 2024, 16:56 Sravani  wrote:

> Hi Team,
>
> We tried to scale kafka from a single broker to 3 brokers. During scaling,
> we are getting the error below.
> We are using apache/kafka version 3.8.0.
>
> [2024-11-13 23:00:57,251] ERROR Encountered fatal fault: Unable to apply
> PartitionChangeRecord record at offset 264919 on standby controller, from
> the batch with baseOffset 264919
> (org.apache.kafka.server.fault.ProcessTerminatingFaultHandler)
> java.lang.RuntimeException: Tried to create partition
> YFqfehupTfah0LfzGbw-wA:1, but no topic with that ID was found.
> at
> org.apache.kafka.controller.ReplicationControlManager.replay(ReplicationControlManager.java:526)
> at
> org.apache.kafka.controller.QuorumController.replay(QuorumController.java:1504)
> at
> org.apache.kafka.controller.QuorumController.access$1700(QuorumController.java:179)
> at
> org.apache.kafka.controller.QuorumController$QuorumMetaLogListener.lambda$handleCommit$0(QuorumController.java:1083)
> at
> org.apache.kafka.controller.QuorumController$QuorumMetaLogListener.lambda$appendRaftEvent$3(QuorumController.java:1192)
> at
> org.apache.kafka.controller.QuorumController$ControllerEvent.run(QuorumController.java:577)
> at
> org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:131)
> at
> org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:214)
> at
> org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:185)
> at java.base/java.lang.Thread.run(Thread.java:840)
> [2024-11-13 23:00:57,251] ERROR Encountered fatal fault: Error loading
> metadata log record from offset 264919
> (org.apache.kafka.server.fault.ProcessTerminatingFaultHandler)
>
>
> Please help us to resolve the issue.
>
>
> Thanks & Regards,
>
> Sravani
>


Kafka services are unstable in 3 node cluster

2025-03-12 Thread Sravani
a:679)
Feb 27 22:20:29 localhost kafka[835188]: #011at kafka.network.SocketServer.$anonfun$stopProcessingRequests$4(Socket
Feb 27 22:22:44 localhost kafka[841117]: [2025-02-27 20:22:44,151] WARN [RaftManager id=2] Graceful shutdown timed out after 5000ms (org.apache.kafka.raft.KafkaRaftClient)
Feb 27 22:22:44 localhost kafka[841117]: [2025-02-27 20:22:44,151] ERROR [RaftManager id=2] Graceful shutdown of RaftClient failed (org.apache.kafka.raft.KafkaRaftClientDriver)

Please prioritise this issue and let us know of any progress.

We have already created a ticket:
https://issues.apache.org/jira/browse/KAFKA-18958


Thanks,

Sravani