---------- Forwarded message ---------
From: Ivan Daschinsky <ivanda...@gmail.com>
Date: Thu, 19 Nov 2020 at 13:02
Subject: Re: IEP-61 Technical discussion
To: Alexey Goncharuk <alexey.goncha...@gmail.com>


Alexey, let's raise another question: how do nodes initially find each
other (discovery), and how do they detect failures?

I think a gossip protocol is an ideal candidate. For example, Consul [1]
takes this approach, using the Serf [2] library to discover cluster
members. Consul then forms a Raft ensemble (server nodes), and clients use
the Raft ensemble only as a lock service.
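
To make the gossip idea concrete, here is a minimal sketch of a membership
table that nodes could exchange. It is an assumption-laden illustration, not
Serf's or Ignite's API: the class name GossipMembership and its methods are
hypothetical, and a real protocol would also need peer selection, transport
and failure suspicion.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical gossip-style membership table (illustrative only). */
final class GossipMembership {
    /** Member id -> highest heartbeat counter seen for that member. */
    private final Map<String, Long> heartbeats = new HashMap<>();
    private final String localId;

    GossipMembership(String localId) {
        this.localId = localId;
        heartbeats.put(localId, 0L);
    }

    /** Called periodically: bump our own counter so peers can see progress. */
    void tick() {
        heartbeats.merge(localId, 1L, Long::sum);
    }

    /** Snapshot that would be sent to a peer during a gossip round. */
    Map<String, Long> digest() {
        return new HashMap<>(heartbeats);
    }

    /** Merge a peer's digest: learn new members, keep the freshest counter. */
    void merge(Map<String, Long> remote) {
        remote.forEach((id, counter) -> heartbeats.merge(id, counter, Long::max));
    }

    boolean knows(String id) {
        return heartbeats.containsKey(id);
    }

    long counterOf(String id) {
        return heartbeats.getOrDefault(id, -1L);
    }
}
```

With this sketch, node "c" can learn about node "b" without ever talking to
it directly, as long as some node "a" has merged b's digest and later
gossips with c.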

PacificA suggests an internal heartbeat mechanism for failure detection
within a replicated group, but it says nothing about the initial discovery
of nodes.
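
For reference, heartbeat-based failure detection in its simplest form can be
sketched as below. This is a generic timeout-based detector, not PacificA's
actual mechanism; the class name HeartbeatFailureDetector and the timeout
parameter are hypothetical, and timestamps are passed in explicitly to keep
the example deterministic.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical timeout-based failure detector (illustrative only). */
final class HeartbeatFailureDetector {
    private final long timeoutMillis;
    /** Node id -> timestamp of the last heartbeat received from it. */
    private final Map<String, Long> lastSeen = new HashMap<>();

    HeartbeatFailureDetector(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Record a heartbeat received from nodeId at time nowMillis. */
    void onHeartbeat(String nodeId, long nowMillis) {
        lastSeen.put(nodeId, nowMillis);
    }

    /** A node is suspected if it was never seen or has been silent too long. */
    boolean isSuspected(String nodeId, long nowMillis) {
        Long seen = lastSeen.get(nodeId);
        return seen == null || nowMillis - seen > timeoutMillis;
    }
}
```

Note that such a detector only covers nodes that are already known; it does
not say how a node gets into the table in the first place, which is exactly
the discovery gap mentioned above.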

WDYT?

[1] -- https://www.consul.io/docs/architecture/gossip
[2] -- https://www.serf.io/

On Thu, 19 Nov 2020 at 12:46, Alexey Goncharuk <alexey.goncha...@gmail.com> wrote:

> Following up on the Ignite 3.0 scope/development approach threads, this is
> a separate thread to discuss technical aspects of the IEP.
>
> Let's go over the questions raised by Ivan once more and also see
> if there are any other thoughts on the IEP:
>
>    - *Whether to deploy metastorage on a separate subset of the nodes or
>    allow Ignite to choose these nodes automatically.* I think it is
>    feasible to maintain both modes: by default, Ignite will choose
>    metastorage nodes automatically, which will provide essentially the same
>    seamless user experience as the TCP discovery SPI: no separate roles,
>    simple deployment. For deployments where people want more fine-grained
>    control over node assignments, we will provide a runtime configuration
>    that allows pinning the metastorage group to certain nodes, thus
>    eliminating the latency concerns.
>    - *Whether there are any TLA+ specs for the PacificA protocol.* Not to
>    my knowledge, but it is known to be used in production by Microsoft and
>    other projects, e.g. [1]
>
> I would like to collect general feedback on the IEP, as well as feedback
> on specific parts of it, such as:
>
>    - Metastorage API
>    - Is there any existing library that can be used to avoid re-implementing
>    the protocol ourselves? Perhaps we could port an existing implementation
>    to Java (the way TiKV did with etcd-raft [2] [3])? This is a very neat
>    way, in my opinion: I like the finite-automaton-like approach of the
>    replication module and, additionally, we could sync bug fixes and
>    improvements from the upstream project.
>
>
> Thanks,
> --AG
>
> [1] https://cwiki.apache.org/confluence/display/INCUBATOR/PegasusProposal
> [2] https://github.com/etcd-io/etcd/tree/master/raft
> [3] https://github.com/tikv/raft-rs
>


-- 
Sincerely yours, Ivan Daschinskiy


