Neha, thanks for the answer. I want to understand what the case is when:

> Also, bringing down 1 node out of a 3 node zookeeper cluster is risky, since
> any subsequent leader election might not reach a quorum.
I was thinking ZooKeeper guarantees quorum if only 1 node out of 3 fails?
Thanks.

On Tue, Jun 24, 2014 at 3:30 PM, Neha Narkhede <neha.narkh...@gmail.com> wrote:

> See the explanation from the zookeeper folks here
> <https://zookeeper.apache.org/doc/r3.3.2/zookeeperAdmin.html> -
>
> "Because Zookeeper requires a majority, it is best to use an odd number of
> machines. For example, with four machines ZooKeeper can only handle the
> failure of a single machine; if two machines fail, the remaining two
> machines do not constitute a majority. However, with five machines
> ZooKeeper can handle the failure of two machines."
>
> Hope that helps.
>
> On Tue, Jun 24, 2014 at 12:36 PM, Kane Kane <kane.ist...@gmail.com> wrote:
>
>> Sorry, I meant 5 nodes in the previous question.
>>
>> On Tue, Jun 24, 2014 at 12:36 PM, Kane Kane <kane.ist...@gmail.com> wrote:
>>
>>> Hello Neha,
>>>
>>>> ZK cluster of 3 nodes will tolerate the loss of 1 node, but if there is a
>>>> subsequent leader election for any reason, there is a chance that the
>>>> cluster does not reach a quorum. It is less likely but still risky to
>>>> some extent.
>>>
>>> Does it mean that if you have to tolerate the loss of 1 node without any
>>> issues, you need *at least* 4 nodes?
>>>
>>> On Tue, Jun 24, 2014 at 11:16 AM, Neha Narkhede <neha.narkh...@gmail.com> wrote:
>>>
>>>> Can you elaborate your notion of "smooth"? I thought if you have
>>>> replication factor=3 in this case, you should be able to tolerate the
>>>> loss of a node?
>>>>
>>>> Yes, you should be able to tolerate the loss of a node, but if controlled
>>>> shutdown is not enabled, the delay between the loss of the old leader and
>>>> the election of the new leader will be longer.
>>>>
>>>> So, you mean a ZK cluster of 3 nodes can't tolerate the loss of 1 node?
>>>> I've seen many recommendations to run a 3-node cluster; does it mean that
>>>> in a cluster of 3 you won't be able to operate after losing 1 node?
>>>>
>>>> ZK cluster of 3 nodes will tolerate the loss of 1 node, but if there is a
>>>> subsequent leader election for any reason, there is a chance that the
>>>> cluster does not reach a quorum. It is less likely but still risky to
>>>> some extent.
>>>>
>>>> On Tue, Jun 24, 2014 at 2:44 AM, Hemath Kumar <hksrckmur...@gmail.com> wrote:
>>>>
>>>>> Yes Kane, I have the replication factor configured as 3.
>>>>>
>>>>> On Tue, Jun 24, 2014 at 2:42 AM, Kane Kane <kane.ist...@gmail.com> wrote:
>>>>>
>>>>>> Hello Neha, can you explain your statements:
>>>>>>
>>>>>>> Bringing one node down in a cluster will go smoothly only if your
>>>>>>> replication factor is 1 and you enabled controlled shutdown on the
>>>>>>> brokers.
>>>>>>
>>>>>> Can you elaborate your notion of "smooth"? I thought if you have
>>>>>> replication factor=3 in this case, you should be able to tolerate the
>>>>>> loss of a node?
>>>>>>
>>>>>>> Also, bringing down 1 node out of a 3 node zookeeper cluster is risky,
>>>>>>> since any subsequent leader election might not reach a quorum.
>>>>>>
>>>>>> So, you mean a ZK cluster of 3 nodes can't tolerate the loss of 1 node?
>>>>>> I've seen many recommendations to run a 3-node cluster; does it mean
>>>>>> that in a cluster of 3 you won't be able to operate after losing 1 node?
>>>>>>
>>>>>> Thanks.
>>>>>> On Mon, Jun 23, 2014 at 9:04 AM, Neha Narkhede <neha.narkh...@gmail.com> wrote:
>>>>>>
>>>>>>> Bringing one node down in a cluster will go smoothly only if your
>>>>>>> replication factor is 1 and you enabled controlled shutdown on the
>>>>>>> brokers. Also, bringing down 1 node out of a 3 node zookeeper cluster
>>>>>>> is risky, since any subsequent leader election might not reach a
>>>>>>> quorum. Having said that, a partition going offline shouldn't cause a
>>>>>>> consumer's offset to reset to an old value. How did you find out what
>>>>>>> the consumer's offset was? Do you have your consumer's logs around?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Neha
>>>>>>>
>>>>>>> On Mon, Jun 23, 2014 at 12:28 AM, Hemath Kumar <hksrckmur...@gmail.com> wrote:
>>>>>>>
>>>>>>>> We have a 3 node cluster (3 kafka + 3 ZK nodes). Recently we came
>>>>>>>> across a strange issue where we wanted to bring one of the nodes down
>>>>>>>> from the cluster (1 kafka + 1 zookeeper) for maintenance. But the
>>>>>>>> moment we brought it down, the consumers' offsets on some of the
>>>>>>>> topics (only some partitions) were reset to some old value.
>>>>>>>>
>>>>>>>> Any reason why this happened? To my knowledge, when one node is
>>>>>>>> brought down, things should work smoothly without any impact.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Murthy Chelankuri
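The majority arithmetic behind the ZooKeeper documentation quoted above works
out as follows; this is a minimal sketch in plain Python, purely to illustrate
the numbers, not any ZooKeeper API:

    # Quorum size for an ensemble of n servers is a strict majority,
    # floor(n/2) + 1, so the ensemble tolerates n - quorum failures.
    for n in (3, 4, 5):
        quorum = n // 2 + 1
        print(f"{n} servers: quorum = {quorum}, tolerates {n - quorum} failure(s)")

    # 3 servers: quorum = 2, tolerates 1 failure(s)
    # 4 servers: quorum = 3, tolerates 1 failure(s)
    # 5 servers: quorum = 3, tolerates 2 failure(s)

So a 3-node ensemble does keep its quorum (2 of 3) with one node down; the risk
described in the thread is that, while running on 2 of 3, one more failure or a
partition between the two survivors leaves no majority, and a subsequent leader
election cannot complete. Only a 5-node ensemble tolerates two simultaneous
failures.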
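For the controlled-shutdown part of the answer, the broker setting involved is
controlled.shutdown.enable in the broker's server.properties. The snippet below
is only a sketch of the usual settings; the retry values shown are examples and
defaults differ between Kafka versions:

    # server.properties (Kafka broker)
    # With controlled shutdown enabled, a broker being stopped first asks the
    # controller to move partition leadership off itself, which shortens the
    # window in which partitions have no leader. Without it, new leaders are
    # elected only after the broker's failure is detected.
    controlled.shutdown.enable=true
    controlled.shutdown.max.retries=3
    controlled.shutdown.retry.backoff.ms=5000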
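As for how a consumer's committed offset can be checked: with the 0.8.x
high-level consumer, ZooKeeper-based offsets are stored under
/consumers/<group>/offsets/<topic>/<partition>, so one way to inspect them is
the ZooKeeper CLI. The group, topic, and partition below are placeholders, not
values from this thread:

    # Connect with the ZooKeeper CLI, then read the committed offset for
    # partition 0 of the topic (placeholders in angle brackets):
    $ bin/zkCli.sh -server localhost:2181
    [zk: localhost:2181(CONNECTED) 0] get /consumers/<group>/offsets/<topic>/0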