I didn't see the pre-ASF versions, but reading the description and mapping it to current mechanisms, a lot of the places where ZK was used have been moved towards node-local decisions based on data propagated via gossip.
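As a rough sketch of what a "node-local decision" looks like: assume every node has already learned each peer's token and rack through gossip. Any node can then walk the ring from a key's token and pick replicas itself, skipping racks it has already used, with no leader involved. The ring contents, host names and the `natural_endpoints` function below are made up for illustration; this is a simplification, not Cassandra's actual replication strategy code (for instance, it does not fall back to reusing a rack when there are fewer racks than replicas).

```python
from bisect import bisect_left

# Hypothetical gossip-derived state: token -> (host, rack).
# Tokens, hosts and racks are invented for this example.
RING = {
    0: ("n1", "r1"),
    25: ("n2", "r1"),
    50: ("n3", "r2"),
    75: ("n4", "r3"),
}


def natural_endpoints(key_token, ring, rf):
    """Pick up to `rf` replicas for `key_token`, at most one per rack."""
    tokens = sorted(ring)
    # The first node whose token is >= the key's token owns the key;
    # past the last token we wrap around to the start of the ring.
    start = bisect_left(tokens, key_token) % len(tokens)

    replicas, racks_seen = [], set()
    for i in range(len(tokens)):
        host, rack = ring[tokens[(start + i) % len(tokens)]]
        if rack in racks_seen:
            continue  # skip endpoints on a rack that already holds a replica
        replicas.append(host)
        racks_seen.add(rack)
        if len(replicas) == rf:
            break
    return replicas


print(natural_endpoints(10, RING, 3))
print(natural_endpoints(80, RING, 2))
```

Because every node holds the same gossiped map, every node computes the same answer for a given key without asking anyone.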
For example: "All nodes on joining the cluster contact the leader who tells them for what ranges they are replicas for and leader makes a concerted effort to maintain the invariant that no node is responsible for more than N-1 ranges in the ring" is now done using the snitch. The snitch that ships with Cassandra uses gossip to broadcast which nodes are in which rack, and for a rack-aware replication model (now in the schema), each host can calculate exactly which other hosts are natural endpoints based on the key's token (its hashed position on the ring), and then skip endpoints that are on the same rack.

Similarly, for the range-assignment part of that same passage: a node joining the cluster works out which range it should own from either a token specified in the yaml or a specified number of tokens. It then either subdivides the largest range (single token, unspecified), gets tokens assigned randomly (multi-token, before the balancing work), or gets tokens distributed by an algorithm that tries to achieve optimal balance (multi-token, after the 3.0 balancing work).

So, in general, Cassandra has tried to remove the central point of failure in favor of each node being able to do those things on its own, even in the presence of WAN failures, partitions, etc.

On Wed, Sep 1, 2021 at 4:53 PM Han <keepsim...@gmail.com> wrote:

> Hi,
>
> I'm reading an old annotated version of the Cassandra paper
> (https://docs.datastax.com/en/articles/cassandra/cassandrathenandnow.html),
> and am curious about this annotation about "Replication" section:
>
> Zookeeper usage was restricted to Facebook's in-house Cassandra branch;
> Apache Cassandra has always avoided it.
>
> <snip>
>
> Is there any paper or blog or other pointers to understand what Apache
> Cassandra did to avoid Zookeeper?
>
> Thanks!
>
> Han