Ivan, agreed, let's have a call to discuss the IEP. I have some more thoughts on how the replication infrastructure works with atomic/transactional caches and will add them to the IEP. Does next Friday, Nov 27th, work for you? If so, let's have an open call then.
As for porting the protocol: we would not have to deal with Go's concurrency model if we go this way, which is what I like about their code structure. Essentially, the raft module is a single-threaded automaton with callbacks to process a message and to process a tick (timeout); it produces the messages that should be sent and the log entries that should be persisted. Judging by the Rust port, it seems fairly straightforward. I will be happy to discuss this and other alternatives on the call as well.

On Thu, Nov 19, 2020 at 14:41, Ivan Daschinsky <ivanda...@gmail.com> wrote:

> > Any existing library that can be used to avoid re-implementing the
> > protocol ourselves? Perhaps, porting the existing implementation to Java
>
> Personally, I like this idea. Go libraries (either the raft module of etcd or
> serf by Hashicorp) are famous for clean code, good design, stability, and
> modest size.
> But, on the other hand, Go has a different concurrency model, and porting
> probably will not be so straightforward.
>
> On Thu, Nov 19, 2020 at 13:48, Ivan Daschinsky <ivanda...@gmail.com> wrote:
>
> > I'd suggest discussing this IEP and the technical details in an open ZOOM
> > meeting.
> >
> > On Thu, Nov 19, 2020 at 13:47, Ivan Daschinsky <ivanda...@gmail.com> wrote:
> >
> >> ---------- Forwarded message ---------
> >> From: Ivan Daschinsky <ivanda...@gmail.com>
> >> Date: Thu, Nov 19, 2020 at 13:02
> >> Subject: Re: IEP-61 Technical discussion
> >> To: Alexey Goncharuk <alexey.goncha...@gmail.com>
> >>
> >> Alexey, let's raise another question. Specifically, how nodes initially
> >> find each other (discovery) and how they detect failures.
> >>
> >> I suppose that a gossip protocol is an ideal candidate. For example,
> >> consul [1] uses this approach, using the serf [2] library to discover
> >> members of the cluster.
> >> Then consul forms a raft ensemble (server nodes), and clients use the raft
> >> ensemble only as a lock service.
> >>
> >> PacificA suggests an internal heartbeat mechanism for failure detection
> >> within a replicated group, but it says nothing about initial discovery of nodes.
> >>
> >> WDYT?
> >>
> >> [1] -- https://www.consul.io/docs/architecture/gossip
> >> [2] -- https://www.serf.io/
> >>
> >> On Thu, Nov 19, 2020 at 12:46, Alexey Goncharuk <alexey.goncha...@gmail.com> wrote:
> >>
> >>> Following up the Ignite 3.0 scope/development approach threads, this is
> >>> a separate thread to discuss technical aspects of the IEP.
> >>>
> >>> Let's reiterate one more time on the questions raised by Ivan and also
> >>> see if there are any other thoughts on the IEP:
> >>>
> >>>    - *Whether to deploy metastorage on a separate subset of the nodes
> >>>    or allow Ignite to choose these nodes automatically.* I think it is
> >>>    feasible to maintain both modes: by default, Ignite will choose
> >>>    metastorage nodes automatically, which essentially will provide the same
> >>>    seamless user experience as TCP discovery SPI - no separate roles,
> >>>    simplistic deployment. For deployments where people want to have more
> >>>    fine-grained control over the nodes' assignments, we will provide a runtime
> >>>    configuration which will allow pinning the metastorage group to certain nodes,
> >>>    thus eliminating the latency concerns.
> >>>    - *Whether there are any TLA+ specs for the PacificA protocol.* Not
> >>>    to my knowledge, but it is known to be used in production by Microsoft and
> >>>    other projects, e.g. [1]
> >>>
> >>> I would like to collect general feedback on the IEP, as well as feedback
> >>> on specific parts of it, such as:
> >>>
> >>>    - Metastorage API
> >>>    - Any existing library that can be used to avoid re-implementing the
> >>>    protocol ourselves? Perhaps, porting the existing implementation to Java
> >>>    (the way TiKV did with etcd-raft [2] [3]?
> >>>    This is a very neat way btw in my
> >>>    opinion because I like the finite automata-like approach of the replication
> >>>    module, and, additionally, we could sync bug fixes and improvements from
> >>>    the upstream project)
> >>>
> >>> Thanks,
> >>> --AG
> >>>
> >>> [1] https://cwiki.apache.org/confluence/display/INCUBATOR/PegasusProposal
> >>> [2] https://github.com/etcd-io/etcd/tree/master/raft
> >>> [3] https://github.com/tikv/raft-rs
> >>
> >> --
> >> Sincerely yours, Ivan Daschinskiy
> >
> > --
> > Sincerely yours, Ivan Daschinskiy
>
> --
> Sincerely yours, Ivan Daschinskiy
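
P.S. To make the "single-threaded automaton" point from the top of this message a bit more concrete, here is a rough Java sketch of the shape such a module could take. It is only an illustration under my own assumptions: the names (ReplicationCore, step, tick, propose, ready) and the message types are hypothetical and are not the etcd-raft, raft-rs, or Ignite API, and the logic deliberately leaves out vote counting, leadership, log matching, and commit tracking. The only thing it is meant to show is that the core performs no I/O and starts no threads: the caller feeds it messages and ticks, then drains what has to be persisted and sent.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a single-threaded replication core; not a real library API. */
final class ReplicationCore {
    /** Message to or from a peer; the fields are illustrative only. */
    static final class Message {
        final String type; final int from; final int to; final long term;
        Message(String type, int from, int to, long term) {
            this.type = type; this.from = from; this.to = to; this.term = term;
        }
    }

    /** Output of the core: what the caller must persist and what it must send. */
    static final class Ready {
        final List<String> entriesToPersist; final List<Message> messagesToSend;
        Ready(List<String> entries, List<Message> messages) {
            this.entriesToPersist = entries; this.messagesToSend = messages;
        }
    }

    private final int id;
    private final int[] peers;
    private final int electionTimeoutTicks;
    private long term;
    private int ticksSinceHeartbeat;
    private List<String> pendingEntries = new ArrayList<>();
    private List<Message> pendingMessages = new ArrayList<>();

    ReplicationCore(int id, int[] peers, int electionTimeoutTicks) {
        this.id = id; this.peers = peers.clone(); this.electionTimeoutTicks = electionTimeoutTicks;
    }

    /** Logical clock: the caller invokes this at a fixed interval of its choosing. */
    void tick() {
        if (++ticksSinceHeartbeat >= electionTimeoutTicks) {
            ticksSinceHeartbeat = 0;
            term++; // a real implementation would also hand the new term out for persistence
            for (int peer : peers) pendingMessages.add(new Message("VOTE_REQUEST", id, peer, term));
        }
    }

    /** Processes one inbound message; never blocks and never performs I/O. */
    void step(Message m) {
        if (m.term < term) return;        // stale message, ignore it
        if (m.term > term) term = m.term; // adopt the newer term
        if (m.type.equals("HEARTBEAT")) {
            ticksSinceHeartbeat = 0;      // the sender is alive, restart the timeout
            pendingMessages.add(new Message("HEARTBEAT_ACK", id, m.from, term));
        }
    }

    /** Queues a new log entry; leadership checks are omitted in this sketch. */
    void propose(String payload) {
        pendingEntries.add(payload);
        for (int peer : peers) pendingMessages.add(new Message("APPEND", id, peer, term));
    }

    /** Drains the accumulated output and leaves the core with empty buffers. */
    Ready ready() {
        Ready r = new Ready(pendingEntries, pendingMessages);
        pendingEntries = new ArrayList<>();
        pendingMessages = new ArrayList<>();
        return r;
    }

    long term() { return term; }

    public static void main(String[] args) {
        ReplicationCore core = new ReplicationCore(1, new int[] {2, 3}, 3);
        for (int i = 0; i < 3; i++) core.tick();
        core.propose("put k=v");
        Ready r = core.ready();
        System.out.println(r.entriesToPersist.size() + " entries, " + r.messagesToSend.size() + " messages");
    }
}
```

With this shape, all threading, networking, and storage decisions stay in the code that drives the core, which is why the Go concurrency model would not need to be ported along with the protocol logic.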