Can anyone please provide some feedback? I don't have the logs now, but I wanted to confirm how this case can arise and how to ensure that it does not happen again.

On Fri, Nov 26, 2021 at 5:44 PM Lehar Jain <leha...@media.net> wrote:

> Hello community,
>
> Recently my team faced an issue with our Kafka Connect MirrorMaker cluster
> in which 1 partition was getting consumed and produced twice. The duplicate
> consumption and production were also confirmed by checking the BytesIn and
> BytesOut metrics from the brokers.
>
> What happened was that we added a couple of servers to our Connect cluster
> and then faced a network partition affecting the rack of the newly
> allocated servers. After some time the connectivity was restored and the
> system was working fine. But after a couple of hours, we observed that one
> topic was receiving twice the amount of data it usually gets and all the
> messages were getting repeated twice.
> The same topic also had twice the consumption rate from the source
> cluster. At this point, we thought that the issue might be caused by
> MirrorMaker and restarted the connector. Even after the restart, the issue
> was still there and the messages were still getting duplicated.
> At this point, I checked the assignment list and stopped the worker
> where this partition was assigned. After stopping this worker we observed
> that the message rate in the destination topic matched that of the source
> topic and no messages were being duplicated, but after some time the
> coordinator detected the stopped worker and a rebalance was triggered, and
> that again resulted in messages being consumed and produced twice.
>
> At this point, we stopped all the worker instances on the servers that
> faced the network outage and restarted the connector, and everything ran
> fine.
>
> Has anyone faced this issue before, or are there any scenarios where this
> condition can arise? Is it a known issue?
>
> Regards,
> Lehar Jain
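
In case it helps anyone who wants to check their own assignment list for the same symptom, here is a rough sketch in Python. It takes the JSON returned by the Connect REST endpoint `GET /connectors/<name>/tasks` and reports any topic-partition that appears in more than one task's config. It assumes the connector is MirrorSourceConnector and that each task's partitions are stored as a comma-separated list under the `task.assigned.partitions` key; please verify that key against your version. The function name and the sample data are made up for illustration. This only covers overlapping task configs. It would not catch a stale worker that is still running an old copy of a task after the network partition, which may well be what happened here.

```python
from collections import defaultdict

# Assumption: MirrorSourceConnector stores each task's partitions under this
# key as a comma-separated list such as "events-0,events-1".
ASSIGNED_KEY = "task.assigned.partitions"


def find_duplicate_assignments(tasks):
    """Return {topic_partition: [task ids]} for partitions held by >1 task.

    `tasks` is the parsed JSON list from GET /connectors/<name>/tasks, where
    each element looks like {"id": {"connector": ..., "task": N}, "config": {...}}.
    """
    owners = defaultdict(list)
    for task in tasks:
        task_id = task["id"]["task"]
        raw = task["config"].get(ASSIGNED_KEY, "")
        # Split the comma-separated list and skip empty entries.
        for tp in filter(None, (p.strip() for p in raw.split(","))):
            owners[tp].append(task_id)
    return {tp: ids for tp, ids in owners.items() if len(ids) > 1}


# Hypothetical sample data; the connector and topic names are invented.
sample_tasks = [
    {"id": {"connector": "mm2-source", "task": 0},
     "config": {ASSIGNED_KEY: "events-0,events-1"}},
    {"id": {"connector": "mm2-source", "task": 1},
     "config": {ASSIGNED_KEY: "events-1,events-2"}},
]

duplicates = find_duplicate_assignments(sample_tasks)
print(duplicates)
```

If this comes back empty while the duplication is still visible in BytesIn and BytesOut, the task configs themselves are consistent and the second copy is more likely coming from a worker outside the current group generation.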