Thanks Hans and Alexander for taking the time to respond. I now understand the risks and the possible outcomes of the desired setup.
What would be better, in your opinion, to get failover (active-active) between these two server rooms without having to switch over to the cloned / 3rd ZooKeeper node? Even with 5 nodes, say 3 in one server room and 2 in the other, there would still be a ZooKeeper majority / leader-election problem if the room holding 3 nodes goes down. Is there some way to achieve this?

Thanks again!

Lee
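For reference, the quorum arithmetic behind this concern: a ZooKeeper ensemble stays writable only while floor(N/2) + 1 voters can reach each other. A rough sketch (node counts are just examples):

    nodes (N)   majority needed   failures tolerated
    3           2                 1
    4           3                 1
    5           3                 2

    Two rooms, 5 nodes split 3 + 2:
      room with 2 nodes lost -> 3 voters remain >= 3 -> quorum survives
      room with 3 nodes lost -> 2 voters remain <  3 -> no quorum, ZooKeeper (and Kafka) stops

However the nodes are split across only two rooms, one room always holds the majority, so losing that particular room always costs quorum; that is why the thread below points at a third location as the only way to keep automatic fail-over.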
On Mon, Mar 6, 2017 at 4:16 PM, Alexander Binzberger <alexander.binzber...@wingcon.com> wrote:

> I agree that this is one cluster, but having one additional ZK node per
> site does not help (as far as I understand ZK).
>
> A 3 out of 6 is also not a majority, so I think you mean 3/5 with a cloned
> 3rd one. That would mean manually switching the cloned one over for
> majority, which can cause issues again:
> 1. You actually build a master/slave ZK with manual switch-over.
> 2. While switching the clone from room to room you would have downtime.
> 3. If you switch on both ZK node clones at the same time (by mistake) you
>    are screwed.
> 4. If you "switch" clones instead of moving the one with all its data on
>    disk, you generate a split brain from which you have to recover first.
>
> So if you lose the connection between the rooms / the rooms get separated
> / you lose one room:
> * You (might) need manual intervention.
> * You lose automatic fail-over between the rooms.
> * You might face a complete outage if your "master" room with the active
>   3rd node is hit.
> Actually this is the same scenario as 2/3 nodes spread over two locations.
>
> What you need for real fault tolerance is a third, cross-connected
> location, with your 3 or 5 ZK nodes distributed over those.
> Or live with a possible outage in such a scenario.
>
> Additional hints:
> * You can run any number of Kafka brokers on a ZK cluster. In your case
>   this could be 4 Kafka brokers on 3 ZK nodes.
> * You should set topic replication to 2 (can be done at any time) and some
>   other producer/broker settings to ensure your messages will not get lost
>   in switch-over cases.
> * The ZK service does not react nicely to a full disk.
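The "other producer/broker settings" hinted at above are probably along these lines. A minimal sketch using stock Kafka 0.10-era config names; the values, and the room-A/room-B rack ids, are placeholders, not a recommendation:

    # producer configs
    acks=all          # wait for all in-sync replicas before treating a write as successful
    retries=100       # retry transient errors instead of dropping the message

    # broker / topic configs
    min.insync.replicas=2                  # with replication factor 2 this protects against
                                           # loss, but writes block while one replica is down
    unclean.leader.election.enable=false   # never elect an out-of-sync replica as leader

    # per-broker config (server.properties), matching the rack-id advice quoted below
    broker.rack=room-A                     # use room-B on the brokers in the other room

The min.insync.replicas=2 line is the durability/availability trade-off in one setting: with only two replicas you cannot have both uninterrupted writes and a guarantee that every acknowledged message exists in both rooms.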
>
> On 06.03.2017 at 15:10, Hans Jespersen wrote:
>
>> In that case it's really one cluster. Make sure to set different rack ids
>> for each server room so Kafka will ensure that the replicas always span
>> both floors and you don't lose availability of data if a server room goes
>> down.
>> You will have to configure one additional zookeeper node in each site,
>> which you will only ever start up if a site goes down, because otherwise
>> 2 of 4 zookeeper nodes is not a quorum. Again, you would be better off
>> with 3 nodes, because then you would only have to do this in the site
>> that has the single active node.
>>
>> -hans
>>
>> On Mar 6, 2017, at 5:57 AM, Le Cyberian <lecyber...@gmail.com> wrote:
>>>
>>> Hi Hans,
>>>
>>> Thank you for your reply.
>>>
>>> It's basically two different server rooms on different floors, connected
>>> with fiber, so it's almost like a local connection between them, with no
>>> network latency / lag.
>>>
>>> If I use Mirror Maker / Replicator then I will not be able to use both
>>> sites at the same time for writes / producers, because the consumers /
>>> producers will request from all of them.
>>>
>>> BR,
>>>
>>> Lee
>>>
>>> On Mon, Mar 6, 2017 at 2:50 PM, Hans Jespersen <h...@confluent.io> wrote:
>>>
>>>> What do you mean when you say you have "2 sites not datacenters"? You
>>>> should be very careful configuring a stretch cluster across multiple
>>>> sites.
>>>> What is the RTT between the two sites? Why do you think that Mirror
>>>> Maker (or Confluent Replicator) would not work between the sites and
>>>> yet you think a stretch cluster will work? That seems wrong.
>>>>
>>>> -hans
>>>>
>>>> /**
>>>>  * Hans Jespersen, Principal Systems Engineer, Confluent Inc.
>>>>  * h...@confluent.io (650)924-2670
>>>>  */
>>>>
>>>> On Mon, Mar 6, 2017 at 5:37 AM, Le Cyberian <lecyber...@gmail.com> wrote:
>>>>
>>>>> Hi Guys,
>>>>>
>>>>> Thank you very much for your replies.
>>>>>
>>>>> The scenario I have to implement involves 2 sites, not datacenters, so
>>>>> Mirror Maker would not work here.
>>>>>
>>>>> There will be 4 nodes in total: 2 in Site A and 2 in Site B. The idea
>>>>> is to have an active-active setup along with fault tolerance, so that
>>>>> if one of the sites goes down, operations continue as normal.
>>>>>
>>>>> In this case, if I go ahead with a 4-node cluster of both ZooKeeper and
>>>>> Kafka, it will tolerate the failure of 1 node only.
>>>>>
>>>>> What do you suggest in this case, given that dividing between 2 sites
>>>>> needs an even number, if that makes sense? Also, if possible, some help
>>>>> regarding partitions per topic and replication factor.
>>>>>
>>>>> I already have Kafka running with quite a few topics having replication
>>>>> factor 1 and the 1 default partition. Is there a way to repartition /
>>>>> increase partitions of existing topics when I migrate to the above
>>>>> setup? I think we can increase the replication factor with the Kafka
>>>>> rebalance tool.
>>>>>
>>>>> Thanks a lot for your help and time looking into this.
>>>>>
>>>>> BR,
>>>>>
>>>>> Le
>>>>>
>>>>> On Mon, Mar 6, 2017 at 12:20 PM, Hans Jespersen <h...@confluent.io> wrote:
>>>>>
>>>>>> Jens,
>>>>>>
>>>>>> I think you are correct that a 4 node zookeeper ensemble can be made
>>>>>> to work, but it will be slightly less resilient than a 3 node ensemble
>>>>>> because it can only tolerate 1 failure (same as a 3 node ensemble) and
>>>>>> the likelihood of node failures is higher because there is 1 more node
>>>>>> that could fail.
>>>>>> So it SHOULD be an odd number of zookeeper nodes (not MUST).
>>>>>>
>>>>>> -hans
>>>>>>
>>>>>> On Mar 6, 2017, at 12:20 AM, Jens Rantil <jens.ran...@tink.se> wrote:
>>>>>>
>>>>>>> Hi Hans,
>>>>>>>
>>>>>>> On Mon, Mar 6, 2017 at 12:10 AM, Hans Jespersen <h...@confluent.io>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> A 4 node zookeeper ensemble will not even work. It MUST be an odd
>>>>>>>> number of zookeeper nodes to start.
>>>>>>>
>>>>>>> Are you sure about that? If ZooKeeper doesn't run with four nodes,
>>>>>>> that means a running ensemble of three can't be live-migrated to
>>>>>>> other nodes (because that's done by increasing the ensemble and then
>>>>>>> reducing it, in the case of 3-node ensembles). IIRC, you can run four
>>>>>>> ZooKeeper nodes, but that means quorum will be three nodes, so
>>>>>>> there's no added benefit in terms of availability since you can only
>>>>>>> lose one node, just like with a three node cluster.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Jens
>>>>>>>
>>>>>>> --
>>>>>>> Jens Rantil
>>>>>>> Backend engineer
>>>>>>> Tink AB
>>>>>>>
>>>>>>> Email: jens.ran...@tink.se
>>>>>>> Phone: +46 708 84 18 32
>>>>>>> Web: www.tink.se
>>>>>>>
>>>>>>> Facebook <https://www.facebook.com/#!/tink.se> Linkedin
>>>>>>> <http://www.linkedin.com/company/2735919?trk=vsrp_companies_res_photo&trkInfo=VSRPsearchId%3A1057023381369207406670%2CVSRPtargetId%3A2735919%2CVSRPcmpt%3Aprimary>
>>>>>>> Twitter <https://twitter.com/tink>
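On the question raised earlier in the thread about growing existing topics (replication factor 1, one partition): a rough sketch with the stock tools from that Kafka era; zk1:2181, the broker ids, and the topic name my-topic are placeholders:

    # add partitions to an existing topic
    # (note: this changes the key -> partition mapping for keyed topics)
    bin/kafka-topics.sh --zookeeper zk1:2181 --alter \
      --topic my-topic --partitions 4

    # raise the replication factor by assigning a second replica to each partition
    cat > increase-rf.json <<'EOF'
    {"version": 1,
     "partitions": [
       {"topic": "my-topic", "partition": 0, "replicas": [1, 2]},
       {"topic": "my-topic", "partition": 1, "replicas": [2, 3]},
       {"topic": "my-topic", "partition": 2, "replicas": [3, 4]},
       {"topic": "my-topic", "partition": 3, "replicas": [4, 1]}
     ]}
    EOF
    bin/kafka-reassign-partitions.sh --zookeeper zk1:2181 \
      --reassignment-json-file increase-rf.json --execute

The first broker id in each replica list becomes the preferred leader, so spreading those across both rooms keeps leadership balanced between the sites.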