On 10/20/2023 5:23 PM, David Filip wrote:
> As a matter of perspective, however, I think what I am not clear on is
> having more than one ZK “server”, and when and why I would need more than one?
> Perhaps it is just terminology, but if I have three (3x) Solr instances (cores)
> running on three (3x) separate physical servers (different hardware), and I want to
> replicate shards between those three, do I have all three (3x) Solr instances
> (cores) talking to the same single (1x) ZooKeeper “server”?
> Or if I have three (3x) Solr instances (cores) replicating shards between them,
> do I also need three (3x) ZooKeeper “servers”, e.g., server.1, server.2,
> server.3, each “server” assigned to one specific Solr instance (core)?

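You need three ZK "servers", each running on different physical hardware,
so that ZK has fault tolerance. The three-server minimum is inherent in
ZK's design and cannot be changed: the ensemble only works while a
majority of its servers is running, so three is the smallest number that
can survive the loss of one server.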
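
A quick sketch of that arithmetic, assuming the usual majority-quorum rule.
The function names are mine and purely illustrative, not part of ZK or Solr:

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many servers can be lost while a majority is still running."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    # Print ensemble size, quorum size, and tolerated failures side by side.
    for n in (1, 2, 3, 4, 5):
        print(n, quorum_size(n), tolerated_failures(n))
```

Under that rule, one or two servers tolerate no failures at all, three
tolerate one, and a fourth server adds nothing over three, which is why
ensembles are normally sized at odd numbers.
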
You need two Solr servers minimum so that Solr has fault tolerance.
You can run Solr on the same hardware as you run ZK, but it is STRONGLY
recommended that ZK be a completely separate service from Solr, so that
if you restart Solr, ZK does not go down, and vice versa. For best
performance, it is also recommended that ZK's data directory reside on a
separate physical storage device from other processes like Solr, but if
you have a small SolrCloud cluster and/or fast disks such as SSDs, that
may not be necessary.
ZK servers must all know about each other in order to maintain a
coherent cluster.
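
As a rough illustration of what "know about each other" means in practice,
the snippet below generates the server.N lines that go into zoo.cfg on
every ZK server. The hostnames are hypothetical, and the ports 2888 and 3888
are simply the ones conventionally used in ZK examples for peer traffic and
leader election:

```python
# Hypothetical hostnames, one per physical machine.
ZK_HOSTS = ["zk1.example.com", "zk2.example.com", "zk3.example.com"]


def zoo_cfg_server_lines(hosts, peer_port=2888, election_port=3888):
    """Return the server.N lines that every ensemble member shares."""
    return [
        f"server.{server_id}={host}:{peer_port}:{election_port}"
        for server_id, host in enumerate(hosts, start=1)
    ]


if __name__ == "__main__":
    # The same three lines are placed in zoo.cfg on all three servers.
    print("\n".join(zoo_cfg_server_lines(ZK_HOSTS)))
```

Each server additionally needs a myid file in its data directory containing
only its own number (1, 2, or 3), so it knows which server.N line refers to
itself.
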
Each Solr instance must know about all the ZK servers, which is why the
zkHost string must list them all, followed by an optional chroot. Each
Solr instance holds a session with one ZK server from that list at a time
and reconnects to another one if that server becomes unreachable.
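
A minimal sketch of how such a zkHost string is put together, assuming
hypothetical hostnames, the default ZK client port 2181, and a chroot of
/solr. The helper name is made up for illustration:

```python
def build_zk_host(hosts, client_port=2181, chroot=""):
    """Join all ZK servers into one string; the chroot appears once, at the end."""
    if chroot and not chroot.startswith("/"):
        raise ValueError("chroot must start with '/'")
    servers = ",".join(f"{host}:{client_port}" for host in hosts)
    return servers + chroot


if __name__ == "__main__":
    hosts = ["zk1.example.com", "zk2.example.com", "zk3.example.com"]
    print(build_zk_host(hosts, chroot="/solr"))
```

Every Solr instance gets the identical string, for example through the
ZK_HOST setting in solr.in.sh. Note that the chroot is not repeated after
each host.
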
As I noted before, a SolrCloud collection is composed of one or more
shards. Each shard is composed of one or more replicas, each of which
is a Solr core. One Solr instance can host many cores. I would
recommend NOT running multiple Solr instances on each machine.
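
To make the collection/shard/replica/core relationship concrete, here is a
toy model of one collection with three shards and two replicas per shard,
spread over three Solr instances. The collection name, node names, and
layout are invented for the example:

```python
from collections import Counter

# Hypothetical layout: shard name -> list of (Solr node, core name).
# Every replica of a shard is one Solr core living on some node.
LAYOUT = {
    "shard1": [("solr1", "products_shard1_replica_n1"),
               ("solr2", "products_shard1_replica_n2")],
    "shard2": [("solr2", "products_shard2_replica_n3"),
               ("solr3", "products_shard2_replica_n4")],
    "shard3": [("solr3", "products_shard3_replica_n5"),
               ("solr1", "products_shard3_replica_n6")],
}


def cores_per_node(layout):
    """Count how many cores (replicas) each Solr instance hosts."""
    counts = Counter()
    for replicas in layout.values():
        for node, _core_name in replicas:
            counts[node] += 1
    return dict(counts)


if __name__ == "__main__":
    print(cores_per_node(LAYOUT))
```

In this layout each of the three Solr instances hosts two cores, and losing
any single machine still leaves one replica of every shard available.
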
Thanks,
Shawn