Re: Initializing a multiple node cluster (multiple datacenters)

Jeff Jirsa Fri, 23 Feb 2018 10:18:06 -0800

On Fri, Feb 23, 2018 at 10:12 AM, Oleksandr Shulgin <
oleksandr.shul...@zalando.de> wrote:

> On Fri, Feb 23, 2018 at 7:02 PM, Jeff Jirsa <jji...@gmail.com> wrote:
>>
>>> Yes, seeds don't bootstrap.  But why?  I don't think I've ever seen a
>>> comprehensive explanation of this.
>>>
>> The meaning of seed in the most common sense is "connect to this host,
>> and use it as the starting point for adding this node to the cluster".
>>
>> If you specify that a joining node is the seed, the implication is that
>> it's already a member of the cluster (or, alternatively, authoritative on
>> the cluster's state).  Given that implication, why would it make sense to
>> then proceed to bootstrap? By setting it as a seed, you've told it that it
>> already knows what the cluster is.
>>
>
> Well, there is certain logic in that.  However, bootstrap is about
> streaming in the data, isn't it?  And being seed is about knowing the
> topology, i.e. which nodes exist in the cluster.  There is actually 0
> overlap of these two concerns, so I don't really see why a seed node
> shouldn't be able to bootstrap.  Would it break anything if it could, e.g.
> if you're explicit about it and request auto_bootstrap=true?
>
>
I don't *think* it would break anything, but the more obvious answer is just
not to list the node as a seed if it needs to bootstrap.

This comes up a lot, and it's certainly one of those rough operator edges
that we can do better with. There's no strict requirement for every node in
a cluster to have exactly the same seed list, so if you need to bootstrap a
new seed, just join it while it is not listed as a seed, then bounce it with
itself in the seed list once it has joined.
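
As a rough illustration of the two-step procedure above, here is a minimal
Python sketch. It is a simplified model of the "seeds don't bootstrap" rule
discussed in this thread, not Cassandra's actual code; the function name
should_bootstrap, its parameters, and the addresses are all hypothetical.

```python
def should_bootstrap(auto_bootstrap, own_address, seeds, already_joined):
    """Simplified model (an assumption, not Cassandra source): a node streams
    data on join only if bootstrap is enabled, it has not joined before, and
    it does not appear in its own seed list."""
    is_seed = own_address in seeds
    return auto_bootstrap and not already_joined and not is_seed


new_node = "10.0.0.3"                      # hypothetical address
existing_seeds = ["10.0.0.1", "10.0.0.2"]  # hypothetical existing seeds

# Step 1: join with the new node left out of its own seed list, so it bootstraps.
phase1 = should_bootstrap(True, new_node, existing_seeds, already_joined=False)

# Step 2: once joined, add it to the seed list and bounce it.
# It is already a member, so nothing is streamed again.
phase2 = should_bootstrap(True, new_node, existing_seeds + [new_node],
                          already_joined=True)

# The case from the thread: listed as a seed on first start, so no bootstrap.
seed_first = should_bootstrap(True, new_node, existing_seeds + [new_node],
                              already_joined=False)

print(phase1, phase2, seed_first)  # prints: True False False
```

The point of the sketch is only that the seed check and the data-streaming
decision are coupled at first start, which is why the order of the two steps
matters.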

The easier answer is probably "give people a way to change seeds after
they're running", and it sorta exists, but it's hard to invoke
intentionally. We should just make that easier, and the rough edges will
get a little less rough.
