Hi, 

Hope you are safe and well!

Here is a brief overview of my environment:

OS: Ubuntu 18.04
Kafka Version: Confluent Kafka v5.5.1
ZooKeeper Version: 3.5.8
No. of Kafka brokers: 3
No. of ZooKeeper nodes: 3

I am working on a project where we aim to move from our existing 
infrastructure (let's call it A), where the Kafka and ZooKeeper clusters are 
hosted, to a better infrastructure (let's call it B) with little or no 
downtime. Once the cutover is done, we would like to terminate the old 
infrastructure A.

I was able to use kafka-reassign-partitions.sh, following the steps in 
https://kafka.apache.org/documentation/#basic_ops_cluster_expansion, to move the 
topic partitions to the Kafka brokers I created in B. Please note that I had 
added the 3 ZooKeeper nodes running in B to the ZooKeeper ensemble in A, so 
they were following the ZK leader in A.
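
For illustration, the reassignment file used in that step can be generated 
rather than written by hand. The following is only a minimal sketch: the topic 
name "orders", the partition count, and the broker IDs 4, 5 and 6 for the B 
brokers are hypothetical placeholders, not values from my setup.

```python
import json


def build_reassignment(topic, num_partitions, target_brokers, replication_factor=3):
    """Build a reassignment plan that places every replica on target_brokers."""
    if replication_factor > len(target_brokers):
        raise ValueError("replication factor exceeds number of target brokers")
    partitions = []
    for p in range(num_partitions):
        # Rotate the broker list so the first replica (the preferred leader)
        # is spread across the target brokers instead of always being the same.
        start = p % len(target_brokers)
        rotated = target_brokers[start:] + target_brokers[:start]
        partitions.append(
            {"topic": topic, "partition": p, "replicas": rotated[:replication_factor]}
        )
    return {"version": 1, "partitions": partitions}


# Hypothetical values: topic "orders" with 3 partitions, B brokers 4, 5 and 6.
plan = build_reassignment("orders", 3, [4, 5, 6])
print(json.dumps(plan, indent=2))
```

The printed JSON would be saved to a file and passed to 
kafka-reassign-partitions.sh through --reassignment-json-file together with 
--execute.
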
I was under the impression that, since I had 6 nodes in the ZooKeeper ensemble, 
stopping the ZooKeeper nodes on the A side would not cause an issue, but I was 
wrong. As soon as I stopped the ZK process on the A nodes, the B ZK nodes 
stopped accepting connections from Kafka. I assume this is because ZK 
leadership did not transfer to the B nodes and the ensemble lost quorum.

To recover, I had to remove the version-2 folder on the B ZK nodes and, after 
removing the details of the A ZK nodes from zookeeper.properties, start them 
one by one. This let me run the cluster on infrastructure B.

I know I failed miserably. This was a sandbox where I could afford the 
downtime, but I cannot afford it in a production setup. I request your help and 
guidance to do this the right way. Please help!
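
The quorum arithmetic behind the failure can be sketched as follows. This 
assumes ZooKeeper's default majority quorum and that all 6 nodes were voting 
participants (not observers). The function names are only illustrative.

```python
def quorum_size(ensemble_size):
    """Smallest number of voting nodes that forms a strict majority."""
    return ensemble_size // 2 + 1


def has_quorum(ensemble_size, live_nodes):
    """True if the live voting nodes can still form a majority."""
    return live_nodes >= quorum_size(ensemble_size)


# 6-node ensemble (3 in A + 3 in B): a majority needs 4 nodes.
print(quorum_size(6))    # 4
# Stopping the 3 A nodes leaves 3 live nodes, which is below the majority.
print(has_quorum(6, 3))  # False
# A 3-node ensemble still has quorum with one node down.
print(has_quorum(3, 2))  # True
```

Under these assumptions, the 3 remaining B nodes are exactly half of the 
configured ensemble, which is not a majority. That seems consistent with the B 
nodes refusing connections once the A nodes were stopped.
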

Thanks in advance.

Regards,
Rijo S Roy

