See the explanation from the ZooKeeper folks here:
<https://zookeeper.apache.org/doc/r3.3.2/zookeeperAdmin.html>

" Because Zookeeper requires a majority, it is best to use an odd number of
machines. For example, with four machines ZooKeeper can only handle the
failure of a single machine; if two machines fail, the remaining two
machines do not constitute a majority. However, with five machines
ZooKeeper can handle the failure of two machines."
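
In other words, a quorum is floor(n/2) + 1 servers, so an ensemble of n
servers tolerates n minus that many failures. A quick Python sketch of the
arithmetic, just for illustration (this is not taken from the ZooKeeper code):

# Majority-quorum arithmetic from the admin guide quoted above.
def quorum_size(n):
    """Smallest majority of an ensemble of n servers."""
    return n // 2 + 1

def tolerated_failures(n):
    """How many servers can fail while a majority is still up."""
    return n - quorum_size(n)

for n in (3, 4, 5):
    print("%d servers: quorum of %d, tolerates %d failure(s)"
          % (n, quorum_size(n), tolerated_failures(n)))
# 3 servers: quorum of 2, tolerates 1 failure(s)
# 4 servers: quorum of 3, tolerates 1 failure(s)
# 5 servers: quorum of 3, tolerates 2 failure(s)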

Hope that helps.
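
One more note on the controlled-shutdown point that comes up in the thread
below: the broker-side switch is the controlled.shutdown.enable property
(plus its retry knobs) in server.properties. Here is a rough Python sketch of
turning it on before a rolling bounce; the config path and the helper name
are made up for illustration, so adjust them to your install:

# Kafka 0.8.x broker properties for controlled shutdown.
SETTINGS = {
    "controlled.shutdown.enable": "true",           # hand leadership off before stopping
    "controlled.shutdown.max.retries": "3",         # how many times to retry the handoff
    "controlled.shutdown.retry.backoff.ms": "5000", # pause between retries, in ms
}

BROKER_CONFIG = "/etc/kafka/server.properties"  # assumed path, adjust as needed

def enable_controlled_shutdown(path=BROKER_CONFIG):
    """Append any controlled-shutdown settings missing from server.properties."""
    with open(path) as f:
        existing = f.read()
    with open(path, "a") as f:
        for key, value in SETTINGS.items():
            if key not in existing:
                f.write("\n%s=%s" % (key, value))

if __name__ == "__main__":
    enable_controlled_shutdown()

The brokers still need to be restarted one at a time for the change to take
effect, and this only covers the Kafka side; the ZooKeeper quorum math above
is a separate concern.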



On Tue, Jun 24, 2014 at 12:36 PM, Kane Kane <kane.ist...@gmail.com> wrote:

> Sorry, I meant 5 nodes in the previous question.
>
> On Tue, Jun 24, 2014 at 12:36 PM, Kane Kane <kane.ist...@gmail.com> wrote:
> > Hello Neha,
> >
> >>>ZK cluster of 3 nodes will tolerate the loss of 1 node, but if there is a
> > subsequent leader election for any reason, there is a chance that the
> > cluster does not reach a quorum. It is less likely but still risky to some
> > extent.
> >
> > Does it mean that if you have to tolerate the loss of 1 node without any
> > issues, you need *at least* 4 nodes?
> >
> > On Tue, Jun 24, 2014 at 11:16 AM, Neha Narkhede <neha.narkh...@gmail.com> wrote:
> >> Can you elaborate your notion of "smooth"? I thought if you have
> >> replication factor=3 in this case, you should be able to tolerate loss
> >> of a node?
> >>
> >> Yes, you should be able to tolerate the loss of a node but if controlled
> >> shutdown is not enabled, the delay between loss of the old leader and
> >> election of the new leader will be longer.
> >>
> >> So, you mean a ZK cluster of 3 nodes can't tolerate the loss of 1 node?
> >> I've seen many recommendations to run a 3-node cluster; does it mean that
> >> in a cluster of 3 you won't be able to operate after losing 1 node?
> >>
> >> ZK cluster of 3 nodes will tolerate the loss of 1 node, but if there is a
> >> subsequent leader election for any reason, there is a chance that the
> >> cluster does not reach a quorum. It is less likely but still risky to some
> >> extent.
> >>
> >>
> >> On Tue, Jun 24, 2014 at 2:44 AM, Hemath Kumar <hksrckmur...@gmail.com>
> >> wrote:
> >>
> >>> Yes Kane, I have the replication factor configured as 3.
> >>>
> >>>
> >>> On Tue, Jun 24, 2014 at 2:42 AM, Kane Kane <kane.ist...@gmail.com> wrote:
> >>>
> >>> > Hello Neha, can you explain your statements:
> >>> > >>Bringing one node down in a cluster will go smoothly only if your
> >>> > replication factor is 1 and you enabled controlled shutdown on the
> >>> > brokers.
> >>> >
> >>> > Can you elaborate your notion of "smooth"? I thought if you have
> >>> > replication factor=3 in this case, you should be able to tolerate loss
> >>> > of a node?
> >>> >
> >>> > >>Also, bringing down 1 node out of a 3 node zookeeper cluster is risky,
> >>> > since any subsequent leader election might not reach a quorum.
> >>> >
> >>> > So, you mean a ZK cluster of 3 nodes can't tolerate the loss of 1 node?
> >>> > I've seen many recommendations to run a 3-node cluster; does it mean
> >>> > that in a cluster of 3 you won't be able to operate after losing 1 node?
> >>> >
> >>> > Thanks.
> >>> >
> >>> > On Mon, Jun 23, 2014 at 9:04 AM, Neha Narkhede <neha.narkh...@gmail.com>
> >>> > wrote:
> >>> > > Bringing one node down in a cluster will go smoothly only if your
> >>> > > replication factor is 1 and you enabled controlled shutdown on the
> >>> > > brokers.
> >>> > > Also, bringing down 1 node out of a 3 node zookeeper cluster is risky,
> >>> > > since any subsequent leader election might not reach a quorum. Having
> >>> > > said that, a partition going offline shouldn't cause a consumer's
> >>> > > offset to reset to an old value. How did you find out what the
> >>> > > consumer's offset was? Do you have your consumer's logs around?
> >>> > >
> >>> > > Thanks,
> >>> > > Neha
> >>> > >
> >>> > >
> >>> > > On Mon, Jun 23, 2014 at 12:28 AM, Hemath Kumar <hksrckmur...@gmail.com>
> >>> > > wrote:
> >>> > >
> >>> > >> We have a 3 node cluster (3 Kafka + 3 ZK nodes). Recently we came
> >>> > >> across a strange issue when we wanted to bring one of the nodes
> >>> > >> (1 Kafka + 1 ZooKeeper) down from the cluster for maintenance. But
> >>> > >> the moment we brought it down, the consumers' offsets on some of the
> >>> > >> topics (only some partitions) were reset to some old value.
> >>> > >>
> >>> > >> Any reason why this happened? To my knowledge, when one node is
> >>> > >> brought down, things should work smoothly without any impact.
> >>> > >>
> >>> > >> Thanks,
> >>> > >> Murthy Chelankuri
> >>> > >>
> >>> >
> >>>
>
