I'd suggest discussing this IEP and its technical details in an open Zoom meeting.

Thu, Nov 19, 2020 at 13:47, Ivan Daschinsky <ivanda...@gmail.com>:

> ---------- Forwarded message ---------
> From: Ivan Daschinsky <ivanda...@gmail.com>
> Date: Thu, Nov 19, 2020 at 13:02
> Subject: Re: IEP-61 Technical discussion
> To: Alexey Goncharuk <alexey.goncha...@gmail.com>
>
> Alexey, let's raise another question. Specifically, how nodes initially
> find each other (discovery) and how they detect failures.
>
> I suppose that a gossip protocol is an ideal candidate. For example,
> Consul [1] uses this approach, relying on the Serf [2] library to
> discover cluster members. Consul then forms a Raft ensemble (server
> nodes), and clients use the Raft ensemble only as a lock service.
>
> PacificA suggests an internal heartbeat mechanism for failure detection
> within a replicated group, but it says nothing about the initial
> discovery of nodes.
>
> WDYT?
>
> [1] -- https://www.consul.io/docs/architecture/gossip
> [2] -- https://www.serf.io/
>
> Thu, Nov 19, 2020 at 12:46, Alexey Goncharuk <alexey.goncha...@gmail.com>:
>
>> Following up on the Ignite 3.0 scope/development approach threads, this
>> is a separate thread to discuss technical aspects of the IEP.
>>
>> Let's go over the questions raised by Ivan once more and also see if
>> there are any other thoughts on the IEP:
>>
>> - *Whether to deploy metastorage on a separate subset of the nodes or
>> allow Ignite to choose these nodes automatically.* I think it is
>> feasible to maintain both modes: by default, Ignite will choose
>> metastorage nodes automatically, which will essentially provide the
>> same seamless user experience as TCP discovery SPI: no separate roles,
>> simple deployment. For deployments where people want more fine-grained
>> control over the nodes' assignments, we will provide a runtime
>> configuration that allows pinning the metastorage group to certain
>> nodes, thus eliminating the latency concerns.
>> - *Whether there are any TLA+ specs for the PacificA protocol.* Not
>> to my knowledge, but it is known to be used in production by Microsoft
>> and other projects, e.g. [1]
>>
>> I would like to collect general feedback on the IEP, as well as
>> feedback on specific parts of it, such as:
>>
>> - Metastorage API
>> - Is there any existing library that can be used to avoid
>> re-implementing the protocol ourselves? Perhaps we could port an
>> existing implementation to Java, the way TiKV did with etcd-raft
>> [2] [3]? This is a very neat way, btw, in my opinion, because I like
>> the finite automata-like approach of the replication module, and,
>> additionally, we could sync bug fixes and improvements from the
>> upstream project.
>>
>> Thanks,
>> --AG
>>
>> [1] https://cwiki.apache.org/confluence/display/INCUBATOR/PegasusProposal
>> [2] https://github.com/etcd-io/etcd/tree/master/raft
>> [3] https://github.com/tikv/raft-rs
>
> --
> Sincerely yours, Ivan Daschinskiy

--
Sincerely yours, Ivan Daschinskiy