[ 
https://issues.apache.org/jira/browse/KAFKA-17101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17868837#comment-17868837
 ] 

Chris Egerton commented on KAFKA-17101:
---------------------------------------

[~kaushik srinivas] Are your Kafka clusters running in KRaft or ZooKeeper mode 
(or have you encountered this bug with both)?

I've seen some interesting test failures recently on a PR to 
migrate our embedded integration testing library for Kafka Connect and 
MirrorMaker 2 from ZooKeeper to KRaft. The failure isn't identical to the issue 
reported in this bug, but it is similar: in our tests, a topic gets created out 
of band with the compact cleanup policy, then it's replicated by MM2 shortly 
after, but the replica topic has the delete cleanup policy. 

I wonder if there's a consistency issue in KRaft mode for topic configs that 
may cause stale or even default data to be returned when describing a topic. 
This could possibly explain both issues, but right now I don't have much more 
than a hunch to support it.
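
One way to probe this hunch would be to describe the topic several times right 
after it is created and record every value that comes back. The sketch below 
is only an illustration: CleanupPolicyProbe, observe, and firstMismatch are 
hypothetical names, not part of Kafka or the Connect test library, and the 
Supplier stands in for whatever call actually reads cleanup.policy (for 
example, a wrapper around Admin.describeConfigs). The main method uses 
simulated responses rather than a real cluster.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical helper for illustration; not part of Kafka. */
final class CleanupPolicyProbe {

    /**
     * Calls describe the given number of times and records each value returned.
     * In a real test, describe would wrap a describe-configs call for the topic.
     */
    static List<String> observe(Supplier<String> describe, int attempts, long delayMs) {
        List<String> seen = new ArrayList<>();
        for (int i = 0; i < attempts; i++) {
            seen.add(describe.get());
            if (delayMs > 0 && i < attempts - 1) {
                try {
                    Thread.sleep(delayMs);
                } catch (InterruptedException e) {
                    // Preserve the interrupt flag and stop polling early.
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        return seen;
    }

    /** Returns the index of the first read that differs from expected, or -1. */
    static int firstMismatch(List<String> seen, String expected) {
        for (int i = 0; i < seen.size(); i++) {
            if (!expected.equals(seen.get(i))) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Simulated responses: a default value first, then the configured one.
        Iterator<String> fake = List.of("delete", "compact", "compact").iterator();
        List<String> seen = observe(fake::next, 3, 0);
        System.out.println(seen + " first mismatch at index "
                + firstMismatch(seen, "compact"));
    }
}
```

A mismatch on an early read followed by correct values on later reads would 
be consistent with a stale or default config being returned; it would not by 
itself prove where the inconsistency comes from.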

> Mirror maker internal topics cleanup policy changes to 'delete' from 
> 'compact' 
> -------------------------------------------------------------------------------
>
>                 Key: KAFKA-17101
>                 URL: https://issues.apache.org/jira/browse/KAFKA-17101
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 3.4.1, 3.5.1, 3.6.1
>            Reporter: kaushik srinivas
>            Priority: Major
>
> Scenario/Setup details
> Kafka cluster 1: 3 replicas
> Kafka cluster 2: 3 replicas
> MM1 moving data from cluster 1 to cluster 2
> MM2 moving data from cluster 2 to cluster 1
> Sometimes, after a reboot of Kafka cluster 1 and the MM1 instance, MM fails 
> to come up with the exception below:
> {code:java}
> {"message":"DistributedHerder-connect-1-1 - 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder - [Worker 
> clientId=connect-1, groupId=site1-mm2] Uncaught exception in herder work 
> thread, exiting: "}}
> org.apache.kafka.common.config.ConfigException: Topic 
> 'mm2-offsets.site1.internal' supplied via the 'offset.storage.topic' property 
> is required to have 'cleanup.policy=compact' to guarantee consistency and 
> durability of source connector offsets, but found the topic currently has 
> 'cleanup.policy=delete'. Continuing would likely result in eventually losing 
> source connector offsets and problems restarting this Connect cluster in the 
> future. Change the 'offset.storage.topic' property in the Connect worker 
> configurations to use a topic with 'cleanup.policy=compact'. {code}
> Once the topic is altered to use the compact cleanup policy, MM works fine.
> This happens sporadically on our setups and across a variety of scenarios. 
> We have not yet been able to identify exact reproduction steps.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
