Michal, as I understand it, it's "risky" only if another (2nd) node fails;
in that case your ZooKeeper would not be operational, right? But if there
are no other issues it shouldn't affect operations.
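
For reference, the quorum arithmetic behind this: an ensemble of N servers
needs a majority (floor(N/2) + 1) of them up to elect a leader. A minimal
Python sketch of that rule (illustration only, not ZooKeeper's actual code):

# Majority rule from the ZooKeeper admin docs quoted below: quorum is
# floor(N/2) + 1, so the ensemble tolerates N - quorum failures.
def quorum_size(ensemble_size: int) -> int:
    return ensemble_size // 2 + 1

def tolerated_failures(ensemble_size: int) -> int:
    return ensemble_size - quorum_size(ensemble_size)

for n in (3, 4, 5):
    print(f"{n} servers: quorum={quorum_size(n)}, "
          f"tolerates {tolerated_failures(n)} failure(s)")

# 3 servers: quorum=2, tolerates 1 failure(s)
# 4 servers: quorum=3, tolerates 1 failure(s)
# 5 servers: quorum=3, tolerates 2 failure(s)

So with 3 servers and one already down for maintenance, both remaining
servers must stay up for any leader election to succeed, which is exactly
the risk discussed below.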

On Wed, Jun 25, 2014 at 2:08 AM, Michal Michalski
<michal.michal...@boxever.com> wrote:
> My understanding is that "bringing down 1 node out of a 3 node zookeeper
> cluster is risky, since any subsequent leader election *might* not reach a
> quorum" and "It is less likely but still risky to some extent" - i.e. it
> *might* not reach a quorum, because you need both of the remaining nodes
> to be up to reach quorum (of course it will still be possible, but it
> *might* fail). In the case of a 5-node cluster, having 1 node down is not
> that risky, because you still have 4 nodes and you need only 3 of them to
> reach quorum.
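
To make the 5-node option concrete: a ZooKeeper ensemble is defined by the
server.N lines in zoo.cfg. A minimal sketch, with zk1..zk5.example.com as
placeholder hostnames:

tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
initLimit=5
syncLimit=2
# one entry per ensemble member: server.<myid>=<host>:<peer port>:<election port>
server.1=zk1.example.com:2888:3888
server.2=zk2.example.com:2888:3888
server.3=zk3.example.com:2888:3888
server.4=zk4.example.com:2888:3888
server.5=zk5.example.com:2888:3888

Each server also needs a myid file in its dataDir containing its id; with
five entries, any three servers up keep the ensemble writable.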
>
> M.
>
> Kind regards,
> MichaƂ Michalski,
> michal.michal...@boxever.com
>
>
> On 25 June 2014 09:59, Kane Kane <kane.ist...@gmail.com> wrote:
>
>> Neha, thanks for the answer. I want to understand what the case is when:
>> >>Also, bringing down 1 node out of a 3 node zookeeper cluster is risky,
>> since any subsequent leader election might not reach a quorum
>>
>> I was thinking ZooKeeper guarantees quorum if only 1 node out of 3 fails?
>>
>> Thanks.
>>
>> On Tue, Jun 24, 2014 at 3:30 PM, Neha Narkhede <neha.narkh...@gmail.com>
>> wrote:
>> > See the explanation from the zookeeper folks here
>> > <https://zookeeper.apache.org/doc/r3.3.2/zookeeperAdmin.html> -
>> >
>> > "Because Zookeeper requires a majority, it is best to use an odd number
>> > of machines. For example, with four machines ZooKeeper can only handle
>> > the failure of a single machine; if two machines fail, the remaining two
>> > machines do not constitute a majority. However, with five machines
>> > ZooKeeper can handle the failure of two machines."
>> >
>> > Hope that helps.
>> >
>> >
>> >
>> > On Tue, Jun 24, 2014 at 12:36 PM, Kane Kane <kane.ist...@gmail.com>
>> wrote:
>> >
>> >> Sorry, I meant 5 nodes in the previous question.
>> >>
>> >> On Tue, Jun 24, 2014 at 12:36 PM, Kane Kane <kane.ist...@gmail.com>
>> wrote:
>> >> > Hello Neha,
>> >> >
>> >> > >>ZK cluster of 3 nodes will tolerate the loss of 1 node, but if
>> >> > there is a subsequent leader election for any reason, there is a
>> >> > chance that the cluster does not reach a quorum. It is less likely
>> >> > but still risky to some extent.
>> >> >
>> >> > Does it mean that if you have to tolerate the loss of 1 node without
>> >> > any issues, you need *at least* 4 nodes?
>> >> >
>> >> > On Tue, Jun 24, 2014 at 11:16 AM, Neha Narkhede
>> >> > <neha.narkh...@gmail.com> wrote:
>> >> >> Can you elaborate on your notion of "smooth"? I thought if you have
>> >> >> replication factor=3 in this case, you should be able to tolerate
>> >> >> the loss of a node?
>> >> >>
>> >> >> Yes, you should be able to tolerate the loss of a node, but if
>> >> >> controlled shutdown is not enabled, the delay between loss of the
>> >> >> old leader and election of the new leader will be longer.
>> >> >>
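
On the controlled-shutdown point above, a minimal sketch of the relevant
broker settings in server.properties (property names are Kafka's standard
ones; the values shown are illustrative and defaults vary by broker version):

# Migrate partition leadership away from this broker before it stops,
# shortening the leaderless window that clients see.
controlled.shutdown.enable=true
# Retry the leadership migration if it does not complete on the first attempt.
controlled.shutdown.max.retries=3
controlled.shutdown.retry.backoff.ms=5000

With this enabled, stopping the broker normally (its stop script / SIGTERM,
not kill -9) gives it a chance to hand off leaders before its ZooKeeper
session drops.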
>> >> >> So, you mean a ZK cluster of 3 nodes can't tolerate 1 node loss?
>> >> >> I've seen many recommendations to run a 3-node cluster, does it mean
>> >> >> in a cluster of 3 you won't be able to operate after losing 1 node?
>> >> >>
>> >> >> A ZK cluster of 3 nodes will tolerate the loss of 1 node, but if
>> >> >> there is a subsequent leader election for any reason, there is a
>> >> >> chance that the cluster does not reach a quorum. It is less likely
>> >> >> but still risky to some extent.
>> >> >>
>> >> >>
>> >> >> On Tue, Jun 24, 2014 at 2:44 AM, Hemath Kumar
>> >> >> <hksrckmur...@gmail.com> wrote:
>> >> >>
>> >> >>> Yes Kane, I have the replication factor configured as 3.
>> >> >>>
>> >> >>>
>> >> >>> On Tue, Jun 24, 2014 at 2:42 AM, Kane Kane <kane.ist...@gmail.com>
>> >> wrote:
>> >> >>>
>> >> >>> > Hello Neha, can you explain your statements:
>> >> >>> > >>Bringing one node down in a cluster will go smoothly only if
>> >> >>> > your replication factor is 1 and you enabled controlled shutdown
>> >> >>> > on the brokers.
>> >> >>> >
>> >> >>> > Can you elaborate on your notion of "smooth"? I thought if you
>> >> >>> > have replication factor=3 in this case, you should be able to
>> >> >>> > tolerate the loss of a node?
>> >> >>> >
>> >> >>> > >>Also, bringing down 1 node out of a 3 node zookeeper cluster is
>> >> >>> > risky, since any subsequent leader election might not reach a
>> >> >>> > quorum.
>> >> >>> >
>> >> >>> > So, you mean a ZK cluster of 3 nodes can't tolerate 1 node loss?
>> >> >>> > I've seen many recommendations to run a 3-node cluster, does it
>> >> >>> > mean in a cluster of 3 you won't be able to operate after losing
>> >> >>> > 1 node?
>> >> >>> >
>> >> >>> > Thanks.
>> >> >>> >
>> >> >>> > On Mon, Jun 23, 2014 at 9:04 AM, Neha Narkhede
>> >> >>> > <neha.narkh...@gmail.com> wrote:
>> >> >>> > > Bringing one node down in a cluster will go smoothly only if
>> >> >>> > > your replication factor is 1 and you enabled controlled shutdown
>> >> >>> > > on the brokers.
>> >> >>> > > Also, bringing down 1 node out of a 3 node zookeeper cluster is
>> >> >>> > > risky, since any subsequent leader election might not reach a
>> >> >>> > > quorum. Having said that, a partition going offline shouldn't
>> >> >>> > > cause a consumer's offset to reset to an old value. How did you
>> >> >>> > > find out what the consumer's offset was? Do you have your
>> >> >>> > > consumer's logs around?
>> >> >>> > >
>> >> >>> > > Thanks,
>> >> >>> > > Neha
>> >> >>> > >
>> >> >>> > >
>> >> >>> > > On Mon, Jun 23, 2014 at 12:28 AM, Hemath Kumar
>> >> >>> > > <hksrckmur...@gmail.com> wrote:
>> >> >>> > >
>> >> >>> > >> We have a 3 node cluster (3 Kafka + 3 ZK nodes). Recently we
>> >> >>> > >> came across a strange issue when we wanted to bring one of the
>> >> >>> > >> nodes down from the cluster (1 Kafka + 1 ZooKeeper) for
>> >> >>> > >> maintenance. The moment we brought it down, on some of the
>> >> >>> > >> topics (only some partitions) the consumers' offset was reset
>> >> >>> > >> to some old value.
>> >> >>> > >>
>> >> >>> > >> Any reason why this happened? To my knowledge, when one node is
>> >> >>> > >> brought down things should work smoothly without any impact.
>> >> >>> > >>
>> >> >>> > >> Thanks,
>> >> >>> > >> Murthy Chelankuri
>> >> >>> > >>
>> >> >>> >
>> >> >>>
>> >>
>>
