Hi Jeff,

Thanks for answering most of my points!
From the reloadseeds ticket, I followed the link to
https://issues.apache.org/jira/browse/CASSANDRA-3829, which was very
instructive, although a bit old.


On Mon, 7 Jan 2019 at 17:23, Jeff Jirsa <jji...@gmail.com> wrote:

> > On Jan 7, 2019, at 6:37 AM, Jonathan Ballet <jbal...@edgelab.ch> wrote:
> >
> [...]
>
> >   In essence, in my example that would be:
> >
> >   - decide that #2 and #3 will be the new seed nodes
> >   - update all the configuration files of all the nodes to write the IP
> >     addresses of #2 and #3
> >   - DON'T restart any node - the new seed configuration will be picked
> >     up only if the Cassandra process restarts
> >
> > * If I can manage to sort my Cassandra nodes by their age, could it be a
> >   strategy to have the seeds set to the 2 oldest nodes in the cluster? (This
> >   implies these nodes would change as the cluster's nodes get
> >   upgraded/replaced).
>
> You could do this, seems like a lot of headache for little benefit. Could
> be done with simple seed provider and config management
> (puppet/chef/ansible) laying  down new yaml or with your own seed provider
>

So, just to make it clear: sorting by age isn't a goal in itself, it was
just an example of how I could get a stable list.

Right now, we have a dedicated group of seed nodes and a dedicated group of
non-seed nodes. Doing a rolling upgrade of the nodes in the second group is
relatively painless (although slow), whereas for the first group we are
facing the issue discussed in CASSANDRA-3829: seed nodes do not bootstrap
automatically, so we need to operate them in a more careful way.

What I'm really looking for is a way to simplify adding nodes to and
removing nodes from our (small) cluster: I can easily provide a small list
of nodes from our cluster with our config management tool so that new nodes
can discover the rest of the cluster, but the documentation seems to imply
that seed nodes also have other functions, and I'm not sure what problems we
could face by trying to simplify this approach.

Ideally, what I would like to have would be:

* Considering a stable cluster (no new nodes, no nodes leaving), the N
seeds should always be the same N nodes
* Adding new nodes should not change that list
* Stopping/removing one of these N nodes should "promote" another
(non-seed) node to seed
  - that would not restart the already running Cassandra nodes but would
update their configuration files.
  - if a node restarts for whatever reason, it would pick up this new
configuration

So: no node would start its life as a seed, only a few already existing
nodes would have this status. We would not have to deal with the "a seed
node doesn't bootstrap" problem and it would make our operational process
simpler.
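
To make the rules above more concrete, here is a minimal sketch in Python of
how the list could be computed by a config management tool. The function
name select_seeds, its parameters and the IP addresses are made up for
illustration; it only computes the list and does not touch Cassandra or its
configuration files.

```python
def select_seeds(current_seeds, live_nodes, n):
    """Return up to n seeds, keeping existing seeds whenever possible.

    current_seeds: the seed list currently written in the config (ordered)
    live_nodes:    all nodes currently in the cluster (ordered, e.g. by age)
    n:             the desired number of seeds
    """
    live = list(live_nodes)
    live_set = set(live)

    seeds = []
    # Rule 1 and 2: a seed that is still in the cluster stays a seed,
    # so a stable cluster or a newly added node never changes the list.
    for node in current_seeds:
        if len(seeds) >= n:
            break
        if node in live_set and node not in seeds:
            seeds.append(node)

    # Rule 3: only when a seed has left, promote existing non-seed nodes
    # (in the order given) to fill the vacated slots.
    for node in live:
        if len(seeds) >= n:
            break
        if node not in seeds:
            seeds.append(node)

    return seeds


if __name__ == "__main__":
    seeds = ["10.0.0.2", "10.0.0.3"]
    # A new node joins: the list is expected to stay the same.
    print(select_seeds(seeds, ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"], 2))
    # Seed 10.0.0.2 is removed: another existing node takes its place.
    print(select_seeds(seeds, ["10.0.0.1", "10.0.0.3", "10.0.0.4"], 2))
```

The result could then be joined with commas and written as the seeds value
of each node's configuration file, without restarting any running node.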


> > I also have some more general questions about seed nodes and how they
> > work:
> >
> > * I understand that seed nodes are used when a node starts and needs to
> >   discover the rest of the cluster's nodes. Once the node has joined and the
> >   cluster is stable, are seed nodes still playing a role in day to day
> >   operations?
>
> They’re used probabilistically in gossip to encourage convergence. Mostly
> useful in large clusters.
>

How "large" are we speaking here? At how many nodes would a cluster start
to be considered "large"?
Also, about the convergence: is this related to how fast/often the cluster
topology is changing? (new nodes, leaving nodes, underlying IP addresses
changing, etc.)

Thanks for your answers!

 Jonathan
