[ 
https://issues.apache.org/jira/browse/IGNITE-13986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17264203#comment-17264203
 ] 

Ivan Bessonov commented on IGNITE-13986:
----------------------------------------

So, the library in question is {{scalecube-cluster}}. The last update was 4 
months ago, and it doesn't look like it's being actively developed right now. 
Maven Central lacks the latest release (2.6.2) and only has {{2.6.0-RC7}}, 
which is weird. I'll assume that the latest release is stable, though.

Usage examples can be found in {{scalecube-cluster-examples}}; they serve as a 
good introduction.

Scalecube forms a cluster from a given list of _seed_ addresses. The list can 
contain a single node or all nodes in the cluster. A new node cannot join the 
cluster if all seed nodes are offline. This means that, for the cluster to work 
reliably, we should list all potential IP addresses and ports in the list of 
seed addresses, which is a lot. As far as I understand, nodes will periodically 
poll these addresses in the background, so the size of the list can affect 
cluster bootstrap time or network load.
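
To get a feel for how quickly such a list grows, here is a minimal sketch that 
enumerates every host/port combination. The class name {{SeedList}}, the hosts 
and the port range are made up for illustration and are not part of the 
scalecube API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of scalecube: builds a full seed address list. */
final class SeedList {
    /** Returns "host:port" for every host and every port in [firstPort, lastPort]. */
    static List<String> build(List<String> hosts, int firstPort, int lastPort) {
        List<String> seeds = new ArrayList<>();
        for (String host : hosts) {
            for (int port = firstPort; port <= lastPort; port++) {
                seeds.add(host + ":" + port);
            }
        }
        return seeds;
    }

    public static void main(String[] args) {
        // Assumed example values: 3 hosts with 10 candidate ports each.
        List<String> seeds = build(List.of("10.0.0.1", "10.0.0.2", "10.0.0.3"), 47500, 47509);
        System.out.println(seeds.size() + " seed addresses");
    }
}
```

With these assumed values, three hosts already produce 30 addresses that every 
node would have to probe.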

Every node has associated _metadata_ that can be modified at any moment. 
Basically, metadata is any Java object. Serialization for these objects can be 
customized either explicitly via configuration or implicitly via the 
{{ServiceLoader}} Java feature. Metadata can be used as the {{JoiningNodeData}} 
object, but everything else in the joining process has to be revisited (more on 
that later).
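
The explicit-then-implicit lookup could look roughly like the sketch below. 
The {{NodeMetadataCodec}} interface, the {{StringCodec}} fallback and the 
{{resolve}} method are hypothetical names used only to illustrate the pattern; 
the actual scalecube interfaces may differ. Only {{java.util.ServiceLoader}} 
is a real API here.

```java
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.ServiceLoader;

final class CodecLookup {
    /** Hypothetical codec contract; the real scalecube interface may look different. */
    interface NodeMetadataCodec {
        byte[] serialize(Object metadata);
        Object deserialize(byte[] bytes);
    }

    /** Fallback used when no provider is registered: stores toString() as UTF-8. */
    static final class StringCodec implements NodeMetadataCodec {
        @Override public byte[] serialize(Object metadata) {
            return String.valueOf(metadata).getBytes(StandardCharsets.UTF_8);
        }
        @Override public Object deserialize(byte[] bytes) {
            return new String(bytes, StandardCharsets.UTF_8);
        }
    }

    /** Explicit configuration wins; then ServiceLoader providers; then the fallback. */
    static NodeMetadataCodec resolve(NodeMetadataCodec explicit) {
        if (explicit != null) {
            return explicit;
        }
        // Providers are discovered from META-INF/services entries on the classpath.
        Iterator<NodeMetadataCodec> found = ServiceLoader.load(NodeMetadataCodec.class).iterator();
        return found.hasNext() ? found.next() : new StringCodec();
    }
}
```

If no provider is registered, {{resolve(null)}} falls back to the string-based 
codec in this sketch.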

Overall, there are 4 types of events that nodes can handle:
 * ADDED - node is joining;
 * REMOVED - node is disconnected unexpectedly;
 * LEAVING - node is being stopped gracefully;
 * UPDATED - node metadata has been updated.

These events don't arrive in any specific order, which means that the current 
discovery event ordering can't be easily replicated. Mutable messages are not 
supported either. There is a built-in way to broadcast custom messages with a 
gossip protocol, but it provides no ordering guarantees either. This means that 
joining the _group membership subsystem_ and joining the _Ignite cluster_ are 
two very distinct processes.
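
A small sketch of what the lack of ordering means in practice: two observers 
receive the same two join events in a different order. The {{MembershipView}} 
class and its {{Type}} enum are hypothetical; they only mirror the four event 
types listed above and are not the library's own classes. Treating LEAVING as 
"still a member until REMOVED arrives" is an assumption made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

final class MembershipView {
    /** Mirrors the four event types from the comment; not the library's own enum. */
    enum Type { ADDED, REMOVED, LEAVING, UPDATED }

    private final TreeSet<String> alive = new TreeSet<>();

    /** Applies one event and returns a snapshot of the local view after it. */
    List<String> apply(Type type, String nodeId) {
        switch (type) {
            case ADDED:   alive.add(nodeId); break;
            case REMOVED: alive.remove(nodeId); break;
            case LEAVING: break; // assumption: node stays in the view until REMOVED
            case UPDATED: break; // metadata change only, membership is unchanged
        }
        return new ArrayList<>(alive);
    }

    public static void main(String[] args) {
        MembershipView n1 = new MembershipView();
        MembershipView n2 = new MembershipView();
        // The same two joins, delivered in a different order to each observer.
        System.out.println(n1.apply(Type.ADDED, "a") + " vs " + n2.apply(Type.ADDED, "b"));
        System.out.println(n1.apply(Type.ADDED, "b") + " vs " + n2.apply(Type.ADDED, "a"));
    }
}
```

The intermediate views differ ({{[a]}} on one observer, {{[b]}} on the other) 
and only the final views agree, which is the eventual consistency mentioned at 
the end of this comment.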

The transport layer can be reconfigured. It consists of two entities: 
{{TransportFactory}} and {{MessageCodec}}. The latter has a weird interface 
that isn't used anywhere publicly (only in {{TransportImpl}}, which is just a 
part of the default transport factory implementation).

The default transport uses Netty. Even though we're going to use Netty as well, 
I expect that we'll need a custom transport implementation. The reasons are 
simple:
 * version incompatibilities will be completely avoided;
 * we could use the same underlying code as in the communication protocol;
 * the log format is messed up in the default implementation, and many 
excessive messages are logged when a node is leaving.

Speaking of logs: it uses slf4j. I'm not aware of what logging library we're 
going to use, but it's clear that we should find/write some adapter.

In short, the problems that I see:
 * somewhat excessive logging and explicit log4j dependency instead of 
{{java.util.logging.Logger}};
 * possible problems with large seed node lists.

Otherwise, looks good if we're ok with eventual consistency.

> Proof of concept - SWIM group membership protocol for discovery
> ---------------------------------------------------------------
>
>                 Key: IGNITE-13986
>                 URL: https://issues.apache.org/jira/browse/IGNITE-13986
>             Project: Ignite
>          Issue Type: New Feature
>            Reporter: Ivan Bessonov
>            Assignee: Ivan Bessonov
>            Priority: Major
>              Labels: iep-61, ignite-3
>
> In IEP-61 it is mentioned that discovery protocol will be updated. We need to 
> play with mentioned options for a little bit to conclude if they match our 
> needs:
> [http://www.cs.cornell.edu/Info/Projects/Spinglass/public_pdfs/SWIM.pdf]
> [https://github.com/scalecube/scalecube-cluster]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
