[ 
https://issues.apache.org/jira/browse/KAFKA-18386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Harris resolved KAFKA-18386.
---------------------------------
    Resolution: Won't Fix

> Mirror Maker2 Pod CrashLoopBackoff When one DC is powered off
> -------------------------------------------------------------
>
>                 Key: KAFKA-18386
>                 URL: https://issues.apache.org/jira/browse/KAFKA-18386
>             Project: Kafka
>          Issue Type: Bug
>          Components: mirrormaker
>    Affects Versions: 3.7.1
>            Reporter: George Yang
>            Priority: Major
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> In a Kubernetes deployment of MirrorMaker 2 (Kafka 3.7.1) with one Kafka 
> node in each data center (DC1 and DC2), powering off DC1 causes the 
> MirrorMaker pod in DC2 to enter CrashLoopBackOff. This issue is different 
> from the one described in KAFKA-17784. The log from the failing pod is below:
> ```log
> [2025-01-01 08:05:53,432] WARN [AdminClient clientId=dc64->dc88] Connection 
> to node -1 (/192.168.2.88:13399) could not be established. Node may not be 
> available. 
> (org.apache.kafka.clients.NetworkClient:830)[kafka-admin-client-thread | 
> dc64->dc88]
> [2025-01-01 08:05:55,652] INFO [AdminClient clientId=dc64->dc88] Metadata 
> update failed 
> (org.apache.kafka.clients.admin.internals.AdminMetadataManager:267)[kafka-admin-client-thread
>  | dc64->dc88]
> org.apache.kafka.common.errors.TimeoutException: Timed out waiting to send 
> the call. Call: fetchMetadata
> [2025-01-01 08:05:55,653] INFO App info kafka.admin.client for dc64->dc88 
> unregistered 
> (org.apache.kafka.common.utils.AppInfoParser:88)[kafka-admin-client-thread | 
> dc64->dc88]
> [2025-01-01 08:05:55,653] INFO [AdminClient clientId=dc64->dc88] Metadata 
> update failed 
> (org.apache.kafka.clients.admin.internals.AdminMetadataManager:267)[kafka-admin-client-thread
>  | dc64->dc88]
> org.apache.kafka.common.errors.TimeoutException: Timed out waiting to send 
> the call. Call: fetchMetadata
> [2025-01-01 08:05:55,653] INFO [AdminClient clientId=dc64->dc88] Timed out 1 
> remaining operation(s) during close. 
> (org.apache.kafka.clients.admin.KafkaAdminClient:1450)[kafka-admin-client-thread
>  | dc64->dc88]
> [2025-01-01 08:05:55,657] INFO Metrics scheduler closed 
> (org.apache.kafka.common.metrics.Metrics:684)[kafka-admin-client-thread | 
> dc64->dc88]
> [2025-01-01 08:05:55,658] INFO Closing reporter 
> org.apache.kafka.common.metrics.JmxReporter 
> (org.apache.kafka.common.metrics.Metrics:688)[kafka-admin-client-thread | 
> dc64->dc88]
> [2025-01-01 08:05:55,658] INFO Metrics reporters closed 
> (org.apache.kafka.common.metrics.Metrics:694)[kafka-admin-client-thread | 
> dc64->dc88]
> [2025-01-01 08:05:55,658] ERROR Stopping due to error 
> (org.apache.kafka.connect.mirror.MirrorMaker:360)[main]
> org.apache.kafka.connect.errors.ConnectException: Failed to connect to and 
> describe Kafka cluster. Check worker's broker connection and security 
> properties.
>         at 
> org.apache.kafka.connect.runtime.WorkerConfig.lookupKafkaClusterId(WorkerConfig.java:305)
>         at 
> org.apache.kafka.connect.runtime.WorkerConfig.lookupKafkaClusterId(WorkerConfig.java:285)
>         at 
> org.apache.kafka.connect.runtime.WorkerConfig.kafkaClusterId(WorkerConfig.java:415)
>         at 
> org.apache.kafka.connect.mirror.MirrorMaker.addHerder(MirrorMaker.java:252)
>         at java.base/java.lang.Iterable.forEach(Unknown Source)
>         at 
> org.apache.kafka.connect.mirror.MirrorMaker.<init>(MirrorMaker.java:158)
>         at 
> org.apache.kafka.connect.mirror.MirrorMaker.<init>(MirrorMaker.java:170)
>         at 
> org.apache.kafka.connect.mirror.MirrorMaker.<init>(MirrorMaker.java:174)
>         at 
> org.apache.kafka.connect.mirror.MirrorMaker.main(MirrorMaker.java:347)
> Caused by: java.util.concurrent.ExecutionException: 
> org.apache.kafka.common.errors.TimeoutException: Timed out waiting for a node 
> assignment. Call: listNodes
>         at java.base/java.util.concurrent.CompletableFuture.reportGet(Unknown 
> Source)
>         at java.base/java.util.concurrent.CompletableFuture.get(Unknown 
> Source)
>         at 
> org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:165)
>         at 
> org.apache.kafka.connect.runtime.WorkerConfig.lookupKafkaClusterId(WorkerConfig.java:299)
>         ... 8 more
> Caused by: org.apache.kafka.common.errors.TimeoutException: Timed out waiting 
> for a node assignment. Call: listNodes
> [2025-01-01 08:05:55,687] INFO Stopped http_8083@6705fb02{HTTP/1.1, 
> (http/1.1)}{0.0.0.0:8083} 
> (org.eclipse.jetty.server.AbstractConnector:383)[JettyShutdownThread]
> ```
> The MirrorMaker configuration is:
> ```
> clusters = dc64, dc88
> dc64.bootstrap.servers = 192.168.2.64:13399
> dc88.bootstrap.servers = 192.168.2.88:13399
> dc64->dc88.enabled = true
> dc64->dc88.topics = .*
> dc88->dc64.enabled = true
> dc88->dc64.topics = .*
> replication.factor=1
> tasks.max=6
> emit.checkpoints.interval.seconds=5
> dc64.producer.acks=all
> dc64.producer.batch.size=50000
> dc64.consumer.auto.offset.reset=latest
> dc88.consumer.auto.offset.reset=latest
> dc64.consumer.max.poll.interval.ms=20000
> dc88.consumer.max.poll.interval.ms=20000
> refresh.topics.enabled=true
> refresh.topics.interval.seconds=5
> refresh.groups.enabled=true
> refresh.groups.interval.seconds=5
> dedicated.mode.enable.internal.rest = true
> dc64.scheduled.rebalance.max.delay.ms=20000
> dc88.scheduled.rebalance.max.delay.ms=20000
> checkpoints.topic.replication.factor=1
> heartbeats.topic.replication.factor=1
> offset-syncs.topic.replication.factor=1
> offset.storage.replication.factor=1
> status.storage.replication.factor=1
> config.storage.replication.factor=1
> ```
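
The stack trace above shows MirrorMaker exiting during construction, when `lookupKafkaClusterId` times out against the unreachable cluster. As a diagnostic aid only, the following is a minimal sketch of an external pre-flight check that reads a configuration like the one quoted and reports which cluster aliases have no reachable bootstrap endpoint. It is not part of Kafka or MirrorMaker; the function names (`parse_bootstrap_servers`, `is_reachable`, `unreachable_clusters`) are hypothetical, and it assumes every bootstrap entry is written as `host:port`. A successful TCP connect does not prove the broker is healthy, so treat the result as a hint.

```python
import socket


def parse_bootstrap_servers(properties_text):
    """Map each cluster alias to its list of (host, port) bootstrap endpoints."""
    props = {}
    for raw in properties_text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()

    aliases = [a.strip() for a in props.get("clusters", "").split(",") if a.strip()]
    endpoints = {}
    for alias in aliases:
        pairs = []
        for server in props.get(alias + ".bootstrap.servers", "").split(","):
            server = server.strip()
            if not server:
                continue
            # Assumption: each entry has the form host:port.
            host, _, port = server.rpartition(":")
            pairs.append((host, int(port)))
        endpoints[alias] = pairs
    return endpoints


def is_reachable(host, port, timeout_s=3.0):
    """Return True if a TCP connection to host:port opens within timeout_s."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


def unreachable_clusters(endpoints, probe=is_reachable):
    """List aliases for which no bootstrap endpoint accepts a TCP connection."""
    return [
        alias
        for alias, pairs in endpoints.items()
        if not any(probe(host, port) for host, port in pairs)
    ]
```

Calling `unreachable_clusters(parse_bootstrap_servers(text))` on the contents of the properties file returns the aliases that could not be reached from the machine running the check. The `probe` parameter exists so the connectivity test can be replaced, for example with a stub in unit tests.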



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
