Shawn,

Understood about redundancy and needing an odd number of nodes (I’ve used quorum in other, non-Solr, types of clusters, so I get it).
So what I’ve done now is installed ZooKeeper on a separate (physical) node, so I’m no longer using the ZooKeeper bundled with Solr, since that was causing some confusion. I’m trying to follow this document on how to set up the “ensemble” for Solr:

https://solr.apache.org/guide/6_6/setting-up-an-external-zookeeper-ensemble.html

It includes the following “example” zoo.cfg:

    dataDir=/var/lib/zookeeperdata/1
    clientPort=2181
    initLimit=5
    syncLimit=2
    server.1=localhost:2888:3888
    server.2=localhost:2889:3889
    server.3=localhost:2890:3890

So assuming that I have three (3x) physical nodes, each running Solr 9.2, named:

    solr1.mydomain.com
    solr2.mydomain.com
    solr3.mydomain.com

I am assuming that in zoo.cfg I will have:

    server.1=solr1.mydomain.com
    server.2=solr2.mydomain.com
    server.3=solr3.mydomain.com

But do I also have three (3x) separate zoo.cfg files, or a single zoo.cfg file? The document seems to imply, I think, that I need to create three separate copies in (assuming I’ve installed ZooKeeper in /opt/zookeeper):

    /opt/zookeeper/conf/zoo1.cfg
    /opt/zookeeper/conf/zoo2.cfg
    /opt/zookeeper/conf/zoo3.cfg

But is there also still just an /opt/zookeeper/conf/zoo.cfg as well? And do all of the configuration files contain the same thing? I’m not sure if this is more of a ZooKeeper question or more of a Solr question, but I’m a bit confused nonetheless as to how it all fits together.

As far as pointing each Solr instance, on each physical Solr node, to ZooKeeper, I am assuming that I just need to start Solr on each one with (assuming my ZooKeeper node is zookeeper.mydomain.com):

    bin/solr start -e cloud -z zookeeper.mydomain.com:2181 -noprompt

Or is there anything else that I need to do on each Solr node?
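Just so I’m asking about something concrete, here is the end state I think that document is describing, assuming the ensemble ends up running on the three Solr machines themselves (rather than only on the dedicated zookeeper.mydomain.com box), with one ZooKeeper instance per machine. Please correct me if I have this wrong. My guess is that each of the three nodes gets the same /opt/zookeeper/conf/zoo.cfg:

    # My assumption: identical file on solr1, solr2, and solr3
    dataDir=/var/lib/zookeeperdata
    clientPort=2181
    initLimit=5
    syncLimit=2
    # I assume the two port numbers are still required with real hostnames
    server.1=solr1.mydomain.com:2888:3888
    server.2=solr2.mydomain.com:2888:3888
    server.3=solr3.mydomain.com:2888:3888

plus a myid file in each dataDir containing only that server’s number, which (if I’m reading the document correctly) is how each node knows which server.N line it is:

    # On solr1 only; solr2 gets 2 and solr3 gets 3
    echo 1 > /var/lib/zookeeperdata/myid

and then each Solr node started with all three ZooKeeper hosts in the connect string. I’m also guessing that for a real install I want plain cloud mode (-c) rather than the “-e cloud” example, so something like:

    bin/solr start -c -z solr1.mydomain.com:2181,solr2.mydomain.com:2181,solr3.mydomain.com:2181

Does that match what the document intends?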
Thanks in advance for any clarification.

Regards,

Dave.

> On Oct 20, 2023, at 2:51 PM, Shawn Heisey <apa...@elyograg.org.INVALID> wrote:
>
> On 10/19/23 17:48, David Filip wrote:
>> I think I am getting confused between differences in Solr versions (most links seem to talk about Solr 6, and I’ve installed Solr 9), and SolrCloud vs. Standalone, when searching the ’Net … so I am hoping that someone can point me towards what I need to do. Apologies in advance for perhaps not using the correct Solr terminology.
>>
>> I will describe what I have, and what I want to accomplish, to the best of my abilities.
>>
>> I have installed Solr 9.2.1 on two separate physical nodes (different physical computers). Both are running SolrCloud, and are running with the same (duplicate) configuration files. Both are running their own local zookeeper, and are separate cores. Let’s call them solr1 and solr2. Right now I can index content on and search each one individually, but they do not know about each other (which is I think the fundamental problem I am trying to solve).
>
> You need three servers minimum. In the minimal fault-tolerant setup, two of those will run Zookeeper and Solr, the third will only need to run Zookeeper. If the third server does not run Solr, it can be a smaller server than the other two.
>
>> My goal is to replicate content from one to the other, so that I can take one down (e.g., solr1) and still search current collections (e.g., on solr2).
>>
>> When I run the Solr Admin web page, I can select Collections => {collection}, click on a Shard, and I see the [+ add replica] button, but I can’t add a new replica on the “other” node, because only the local node appears (e.g., 10.0.x.xxx:8983_solr). What I think I need to do is add the nodes (solr1 and solr2) together (?) so that I can add a new replica on the “other” node.
>
> This is an inherent capability of SolrCloud. One collection consists of one or more shards, and each shard consists of one or more replicas. When there is more than one replica, one of them will be elected leader.
>
> All the Solr servers must talk to the same ZK ensemble in order to form a SolrCloud cluster. Zookeeper should run as its own process, not the embedded ZK server that Solr provides, but dedicated hosts for ZK are not required unless the SolrCloud cluster is really big.
>
>> I’ve found references that tell me I need an odd number of zookeeper nodes (for quorum), so I’m not sure if I want both nodes to share a single zookeeper instance? If I did do that, and let’s say that I pointed solr2 to zookeeper on solr1, could I still search against solr2 if solr1 zookeeper was down? I would think not, but I’m not sure.
>
> Here is the situation with ZK ensemble fault tolerance:
>
> 2 servers can sustain zero failures.
> 3 servers can sustain one failure.
> 4 servers can sustain one failure.
> 5 servers can sustain two failures.
> 6 servers can sustain two failures.
>
> Additional note: In geographically diverse setups, it is not possible to have a fault tolerant ZK install with only two datacenters or availability zones. You need three.
>
> This is why an odd number is recommended -- because adding one more node does not provide any additional fault tolerance.
>
> If ZK has too many failures, SolrCloud will switch to read-only mode and the node you contact will not be aware of other Solr servers going down or coming up.
>
> I would recommend that any new Solr install, especially if you want fault tolerance, should run SolrCloud, not standalone mode.
>
> Thanks,
> Shawn