Hi Ivan,

Sure, TCP connections are lazy: if a connection is not already open, the
node that is trying to send a message will initiate it. It's also possible
that an open connection is spontaneously closed for some reason. Otherwise
you are right, everything is as you described.

There's also a tie-breaker for the case when two nodes connect to each other
at the same time: only one of the connections survives, and which one depends
on internal discovery order, which you basically can't control.
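
Just to illustrate the "lazy" part: nothing in the API opens a connection
explicitly; the first message sent towards a node triggers it. A rough sketch
(the topic name and payload below are made up):

    Ignite ignite = Ignition.ignite();
    ClusterGroup remotes = ignite.cluster().forRemotes();
    // If there is no communication connection to a remote node yet,
    // TcpCommunicationSpi opens one on this first send.
    ignite.message(remotes).send("my-topic", "hello");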

Mon, 29 Jun 2020 at 16:01, Ivan Pavlukhin <vololo...@gmail.com>:

> Hi Ivan,
>
> Sorry for a possibly naive question. As I understand it, we are talking
> about the order of establishing client-server connections. And I suppose
> that in some environments (e.g. cloud) servers cannot directly
> establish connections to clients. But TCP connections are
> bidirectional, so we can still send messages in both directions. Could
> you please provide an example of a case in which servers have to
> initiate new connections to clients?
>
> 2020-06-29 13:08 GMT+03:00, Ivan Bessonov <bessonov...@gmail.com>:
> > Hi igniters, Hi Raymond,
> >
> > That was a really good point. I will try to address it as much as I can.
> >
> > First of all, this new mode will be configurable for now. As Val
> > suggested, "TcpCommunicationSpi#forceClientToServerConnections" will be a
> > new setting to trigger this behavior. It will be disabled by default.
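> >
> > For illustration, enabling it would look roughly like this (a sketch only;
> > the setter name is my assumption based on the proposed property name):
> >
> >     // With the flag on, servers never open connections to clients;
> >     // clients are expected to initiate them instead.
> >     TcpCommunicationSpi commSpi = new TcpCommunicationSpi();
> >     commSpi.setForceClientToServerConnections(true);
> >
> >     IgniteConfiguration cfg = new IgniteConfiguration();
> >     cfg.setCommunicationSpi(commSpi);
> >     Ignition.start(cfg);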
> >
> > About issues with K8S deployments - I'm not an expert, but from what I've
> > heard, sometimes server and client nodes are not in the same environment.
> > For example, there is an Ignite cluster and a user tries to start a client
> > node in an isolated K8S pod. In this case the client cannot properly
> > resolve its own addresses and send them to the servers, making it
> > impossible for the servers to connect to such a client. Or, in other
> > words, clients are used as if they were thin clients.
> >
> > In your case everything is fine: clients and servers share the same
> > network and can resolve each other's addresses.
> >
> > Now, the CQ issue [1]. You can pass a custom event filter when you
> > register a new continuous query. But, depending on the setup, the class of
> > this filter may not be on the classpath of the server node that holds the
> > data and invokes that filter.
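> >
> > For reference, a CQ registration with a remote filter looks roughly like
> > this (a sketch; the cache name and filter logic are made up). The class
> > generated for the filter lambda is exactly what has to be resolvable on
> > the servers:
> >
> >     IgniteCache<Integer, String> cache = ignite.cache("myCache");
> >
> >     ContinuousQuery<Integer, String> qry = new ContinuousQuery<>();
> >     // Runs on the node that registered the query.
> >     qry.setLocalListener(evts -> evts.forEach(
> >         e -> System.out.println(e.getKey() + " -> " + e.getValue())));
> >     // Shipped to and invoked on the server nodes that own the data.
> >     qry.setRemoteFilterFactory(() -> event -> event.getValue() != null);
> >
> >     QueryCursor<Cache.Entry<Integer, String>> cur = cache.query(qry);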
> > There are two possible outcomes:
> > - the server fails to resolve the class name and fails to register the CQ;
> > - or the server can have P2P deployment enabled (see the configuration
> > sketch below). Let's assume that it was a client node that requested the
> > CQ. In this case the server will try to download the "class" file directly
> > from the node that sent the filter object in the first place. Due to a
> > poor design decision this is done synchronously while registering the
> > query, and query registration happens in the "discovery" thread. In normal
> > circumstances the server will load the class and finish query
> > registration; it's just a little bit slow.
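> >
> > For completeness, P2P deployment is just a flag on the node
> > configuration, e.g.:
> >
> >     IgniteConfiguration cfg = new IgniteConfiguration();
> >     // Allows the server to fetch the filter class from the node
> >     // that registered the query instead of failing the registration.
> >     cfg.setPeerClassLoadingEnabled(true);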
> >
> > The second case is not compatible with the new
> > "forceClientToServerConnections" setting. I'm not sure that I need to go
> > into all the technical details, but the result of such a procedure is a
> > cluster that cannot process any discovery messages for the duration of the
> > TCP connection timeout - we're talking about tens of seconds or maybe even
> > several minutes depending on the settings and the environment. All this
> > time the server will be in a "deadlock" state inside the "discovery"
> > thread. It means that some cluster operations will be unavailable during
> > this period, like a new node joining or starting a new cache. Node
> > failures will not be processed properly either. It's hard for me to
> > predict the real behavior until we reproduce the situation in a live
> > environment; I've only seen it in tests.
> >
> > I hope that my message clarifies the situation, or at least doesn't cause
> > more confusion. These changes will not affect your infrastructure or your
> > Ignite installations; they are aimed at adding more flexibility for other
> > ways of using Ignite.
> >
> > [1] https://issues.apache.org/jira/browse/IGNITE-13156
> >
> >
> >
> > Sat, 27 Jun 2020 at 09:54, Raymond Wilson <raymond_wil...@trimble.com>:
> >
> >> I have just caught up with this discussion and wanted to outline a set of
> >> use cases we have that rely on server nodes communicating with client
> >> nodes.
> >>
> >> Firstly, I'd like to confirm my mental model of server & client nodes
> >> within a grid (ignoring thin clients for now):
> >>
> >> A grid contains a set of nodes somewhat arbitrarily labelled 'server' and
> >> 'client', where the distinction of a 'server' node is that it is
> >> responsible for containing data (in-memory only, or also with
> >> persistence). Apart from that distinction, all nodes are essentially
> >> peers in the grid and may use the messaging fabric, compute layer and
> >> other grid features on an equal footing.
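> >>
> >> (By 'client' here I mean a thick client node, i.e. one started with the
> >> client mode flag, roughly:
> >>
> >>     IgniteConfiguration cfg = new IgniteConfiguration();
> >>     cfg.setClientMode(true); // false / omitted for a server node
> >>     Ignite client = Ignition.start(cfg);
> >>
> >> rather than a thin client.)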
> >>
> >> In our solution we leverage these capabilities to build and orchestrate
> >> complex analytics queries that utilise compute functions initiated in
> >> three distinct ways: client -> client, client -> server and server ->
> >> client, where all three styles of initiation are used within a single
> >> analytics request made to the grid itself. I can go into more detail
> >> about the exact sequencing of these activities if you like, but it may be
> >> sufficient to know they are used to reason about the problem statement
> >> and proposals outlined here.
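> >>
> >> To make the server -> client direction concrete, it is essentially just a
> >> compute call targeted at the client nodes, along the lines of (the
> >> closure body is made up):
> >>
> >>     ClusterGroup clients = ignite.cluster().forClients();
> >>     ignite.compute(clients).broadcast(
> >>         () -> System.out.println("Running on a client node"));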
> >>
> >> Our infrastructure is deployed to Kubernetes using EKS on AWS, and all
> >> three relationships between client and server nodes noted above function
> >> well (caveat: we do see odd things, though, such as long pauses on
> >> critical worker threads and occasional empty topology warnings when
> >> locating client nodes to send requests to). We also use continuous
> >> queries in three contexts (all within server nodes).
> >>
> >> If this thread is suggesting changing the functional relationship between
> >> server and client nodes, then this may have impacts on our architecture
> >> and implementation that we will need to consider.
> >>
> >> This thread has highlighted issues with K8s deployments and also CQ
> >> issues. The suggestion is that server-to-client communication just
> >> doesn't work on K8s, which does not agree with our experience of it
> >> working. I'd also like to understand better the bounds of the issue with
> >> CQ: when does it not work, and what are the symptoms we would see if
> >> there was an issue with the way we are using it, or with the K8s
> >> infrastructure we deploy to?
> >>
> >> Thanks,
> >> Raymond.
> >>
> >>
> >>
> >>
> >> --
> >> Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/
> >>
> >
> >
> > --
> > Sincerely yours,
> > Ivan Bessonov
> >
>
>
> --
>
> Best regards,
> Ivan Pavlukhin
>


-- 
Sincerely yours,
Ivan Bessonov
