Re: Thin client: compute support

Pavel Tupitsyn Mon, 25 Nov 2019 11:42:30 -0800

Alex,

> we will mix entities from different layers (transport layer and request
body)
I would not call our message header (which includes the id) "transport
layer".
TCP is our transport layer. And it is fine to use request ID to identify
compute tasks (as we do with query cursors).
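
A minimal sketch of this idea in Java, assuming a hypothetical client-side tracker (the class and method names below are illustrative and do not come from the Ignite code base). The ID of the request that starts a task doubles as the task identifier, and a cancel request gets its own new request ID while carrying the original one in its body.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical client-side bookkeeping; names are illustrative, not Ignite API. */
class ComputeRequestTracker {
    /** Request IDs are generated as usual, one per outgoing request. */
    private final AtomicLong requestIdGen = new AtomicLong();

    /** Pending task results, keyed by the ID of the request that started the task. */
    private final Map<Long, CompletableFuture<Object>> pendingTasks = new ConcurrentHashMap<>();

    /** Registers an "execute task" request; the returned request ID is also the task ID. */
    long startTask() {
        long reqId = requestIdGen.incrementAndGet();
        pendingTasks.put(reqId, new CompletableFuture<>());
        return reqId;
    }

    /** Returns the future of a pending task, or null if the task is not pending. */
    CompletableFuture<Object> taskFuture(long taskReqId) {
        return pendingTasks.get(taskReqId);
    }

    /** Builds a cancel request as {new request ID (header), original request ID (body)}. */
    long[] newCancelRequest(long taskReqId) {
        return new long[] {requestIdGen.incrementAndGet(), taskReqId};
    }

    /** Called when the server's response for the task's request ID arrives. */
    void onTaskResponse(long reqId, Object result, boolean cancelled) {
        CompletableFuture<Object> fut = pendingTasks.remove(reqId);
        if (fut == null)
            return; // Unknown or already completed request.
        if (cancelled)
            fut.cancel(false);
        else
            fut.complete(result);
    }

    public static void main(String[] args) {
        ComputeRequestTracker tracker = new ComputeRequestTracker();
        long taskId = tracker.startTask();
        CompletableFuture<Object> fut = tracker.taskFuture(taskId);
        long[] cancel = tracker.newCancelRequest(taskId);
        // Simulate the task response arriving with a "cancelled" status.
        tracker.onTaskResponse(taskId, null, true);
        System.out.println("cancel request id=" + cancel[0]
            + ", task id in body=" + cancel[1] + ", cancelled=" + fut.isCancelled());
    }
}
```

The protocol stays "one request, one response": the cancel request and the task request each receive their own response, matched by their own request IDs.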


> we still can't be sure that the task is successfully started on a server
If the task cannot be started, the request to start it will fail and we'll
get a response indicating that right away.

> we won't ever know about topology change
Looks like I'm missing something - how is topology change relevant to
executing compute tasks from client?

On Mon, Nov 25, 2019 at 10:17 PM Alex Plehanov <[email protected]>
wrote:

> Pavel, in this case, we will mix entities from different layers (transport
> layer and request body), which is not very good. We can achieve the same
> behavior with a task id generated on the client side, but there will be no
> inter-layer data intersection, and I think it will be easier to implement on
> both the client and server side. But we still can't be sure that the task is
> successfully started on a server. We won't ever know about topology change,
> because the "topology changed" flag will be sent from server to client only
> with the response when the task is completed. Do we accept that?
>
> On Mon, Nov 25, 2019 at 19:07, Pavel Tupitsyn <[email protected]> wrote:
>
> > Alex,
> >
> > I have a simpler idea. We already do request id handling in the protocol,
> > so:
> > - Client sends a normal request to execute compute task. Request ID is
> > generated as usual.
> > - As soon as task is completed, a response is received.
> >
> > As for cancellation - client can send a new request (with new request ID)
> > and (in the body) pass the request ID from above
> > as a task identifier. As a result, there are two responses:
> > - Cancellation response
> > - Task response (with proper cancelled status)
> >
> > That's it, no need to modify the core of the protocol. One request - one
> > response.
> >
> > On Mon, Nov 25, 2019 at 6:20 PM Alex Plehanov <[email protected]>
> > wrote:
> >
> > > Pavel, we need to inform the client when the task is completed, and we
> > > need the ability to cancel the task. I see several ways to implement this:
> > >
> > > 1. Client sends a request to the server to start a task, and the server
> > > returns a task id in the response. The server notifies the client when
> > > the task is completed with a new request (from server to client). The
> > > client can cancel the task by sending a new request with operation type
> > > "cancel" and the task id. In this case, we should implement two-way
> > > requests.
> > > 2. Client generates a unique task id and sends a request to the server to
> > > start a task; the server doesn't reply immediately but waits until the
> > > task is completed. The client can cancel the task by sending a new
> > > request with operation type "cancel" and the task id. In this case, we
> > > should decouple request and response on the server side (currently the
> > > response is sent right after the request is processed). Also, we can't
> > > be sure that the task is successfully started on a server.
> > > 3. Client sends a request to the server to start a task, and the server
> > > returns an id in the response. The client periodically asks the server
> > > about the task status. The client can cancel the task by sending a new
> > > request with operation type "cancel" and the task id. This option adds
> > > some overhead to the communication channel.
> > >
> > > Personally, I think that the option with two-way requests is better, but
> > > I'm open to any other ideas.
> > >
> > > Aleksandr,
> > >
> > > Filtering logic for OP_CLUSTER_GROUP_GET_NODE_IDS looks overcomplicated.
> > > Do we need server-side filtering at all? Wouldn't it be better to send
> > > basic info (ids, order, flags) for all nodes (it is a relatively small
> > > amount of data) and extended info (attributes) for a selected list of
> > > nodes? In this case, we can do basic node filtering on the client side
> > > (forClients(), forServers(), forNodeIds(), forOthers(), etc.).
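
The client-side filtering suggested above could look roughly like the following Java sketch. It assumes the client has already received basic info for all nodes; the `NodeInfo` class, its fields, and the static helper methods are hypothetical and only mirror the "ids, order, flags" wording of the message, not any real Ignite type.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.UUID;
import java.util.stream.Collectors;

/** Hypothetical client-side filters over basic node info (not Ignite API). */
class ClientSideNodeFilter {
    /** Assumed shape of the "basic info" sent for every node. */
    static final class NodeInfo {
        final UUID id;
        final long order;
        final boolean client; // One of the "flags": true for client nodes.

        NodeInfo(UUID id, long order, boolean client) {
            this.id = id;
            this.order = order;
            this.client = client;
        }
    }

    /** Analog of forServers(): keeps only server nodes. */
    static List<NodeInfo> forServers(List<NodeInfo> nodes) {
        return nodes.stream().filter(n -> !n.client).collect(Collectors.toList());
    }

    /** Analog of forClients(): keeps only client nodes. */
    static List<NodeInfo> forClients(List<NodeInfo> nodes) {
        return nodes.stream().filter(n -> n.client).collect(Collectors.toList());
    }

    /** Analog of forNodeIds(): keeps only nodes whose ID is in the given set. */
    static List<NodeInfo> forNodeIds(List<NodeInfo> nodes, Set<UUID> ids) {
        return nodes.stream().filter(n -> ids.contains(n.id)).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<NodeInfo> all = Arrays.asList(
            new NodeInfo(UUID.randomUUID(), 1, false),
            new NodeInfo(UUID.randomUUID(), 2, true));
        System.out.println("servers: " + forServers(all).size()
            + ", clients: " + forClients(all).size());
    }
}
```

Only attribute-based filters would then need the extended info, which can be requested for the already narrowed list of nodes.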
> > >
> > > Do you use standard ClusterNode serialization? Metrics are also
> > > serialized with ClusterNode; do we need them on the thin client? Other
> > > interfaces exist to show metrics, so I think it's redundant to export
> > > metrics to thin clients too.
> > >
> > > What do you think?
> > >
> > >
> > >
> > >
> > > On Fri, Nov 22, 2019 at 20:15, Aleksandr Shapkin <[email protected]> wrote:
> > >
> > > > Alex,
> > > >
> > > >
> > > >
> > > > I think you can create a new IEP page and I will fill it with the
> > > > Cluster API details.
> > > >
> > > >
> > > >
> > > > In short, I’ve introduced several new codes:
> > > >
> > > >
> > > >
> > > > Cluster API is pretty straightforward:
> > > >
> > > >
> > > >
> > > > OP_CLUSTER_IS_ACTIVE = 5000
> > > >
> > > > OP_CLUSTER_CHANGE_STATE = 5001
> > > >
> > > > OP_CLUSTER_CHANGE_WAL_STATE = 5002
> > > >
> > > > OP_CLUSTER_GET_WAL_STATE = 5003
> > > >
> > > >
> > > >
> > > > Cluster group codes:
> > > >
> > > >
> > > >
> > > > OP_CLUSTER_GROUP_GET_NODE_IDS = 5100
> > > >
> > > > OP_CLUSTER_GROUP_GET_NODE_INFO = 5101
> > > >
> > > >
> > > >
> > > > The underlying implementation is based on the thick client logic.
> > > >
> > > >
> > > >
> > > > For every request, we provide a known topology version, and if it has
> > > > changed, the client first updates it and then re-sends the filtering
> > > > request.
> > > >
> > > >
> > > >
> > > > Alongside the topVer, a client sends a serialized nodes projection
> > > > object that could be considered a code-to-value mapping.
> > > >
> > > > Consider: [{Code = 1, Value = ["DotNet", "MyAttribute"]}, {Code = 2, Value = 1}]
> > > >
> > > > Where "1" stands for attribute filtering and "2" for the serverNodesOnly
> > > > flag.
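
To make the code-to-value mapping concrete, here is a small Java sketch of how such a projection might be evaluated against a node. Everything in it is an assumption for illustration: the `Filter` and `Node` classes are hypothetical, the attribute filter value is read as a [name, expected value] pair, and a serverNodesOnly value of 1 is read as "enabled". The message does not specify these details.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Hypothetical evaluation of a nodes projection (not Ignite API). */
class NodeProjection {
    static final int CODE_ATTRIBUTE = 1;         // Attribute filtering.
    static final int CODE_SERVER_NODES_ONLY = 2; // serverNodesOnly flag.

    /** One entry of the projection: a filter code and its value. */
    static final class Filter {
        final int code;
        final Object value;

        Filter(int code, Object value) {
            this.code = code;
            this.value = value;
        }
    }

    /** Assumed minimal view of a cluster node. */
    static final class Node {
        final boolean client;
        final Map<String, Object> attributes;

        Node(boolean client, Map<String, Object> attributes) {
            this.client = client;
            this.attributes = attributes;
        }
    }

    /** Returns true if the node passes every filter in the projection. */
    static boolean matches(Node node, List<Filter> projection) {
        for (Filter f : projection) {
            switch (f.code) {
                case CODE_ATTRIBUTE: {
                    // Assumption: value is [attributeName, expectedValue].
                    List<?> nameAndValue = (List<?>) f.value;
                    Object actual = node.attributes.get(nameAndValue.get(0));
                    if (!nameAndValue.get(1).equals(actual))
                        return false;
                    break;
                }
                case CODE_SERVER_NODES_ONLY:
                    // Assumption: 1 means the flag is set.
                    if (Integer.valueOf(1).equals(f.value) && node.client)
                        return false;
                    break;
                default:
                    throw new IllegalArgumentException("Unknown filter code: " + f.code);
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Filter> projection = Arrays.asList(
            new Filter(CODE_ATTRIBUTE, Arrays.asList("DotNet", "MyAttribute")),
            new Filter(CODE_SERVER_NODES_ONLY, 1));
        Node server = new Node(false,
            Collections.<String, Object>singletonMap("DotNet", "MyAttribute"));
        System.out.println("server matches: " + matches(server, projection));
    }
}
```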
> > > >
> > > >
> > > >
> > > > As a result of request processing, a server sends nodeId UUIDs and a
> > > > current topVer.
> > > >
> > > >
> > > >
> > > > When a client obtains nodeIds, it can perform a NODE_INFO call to get a
> > > > serialized ClusterNode object. In addition, there should be a different
> > > > API method for accessing/updating node metrics.
> > > >
> > > > On Thu, Nov 21, 2019 at 12:32, Sergey Kozlov <[email protected]> wrote:
> > > >
> > > > > Hi Pavel
> > > > >
> > > > > On Thu, Nov 21, 2019 at 11:30 AM Pavel Tupitsyn <[email protected]>
> > > > > wrote:
> > > > >
> > > > > > 1. I believe that Cluster operations for the Thin Client protocol are
> > > > > > already in the works by Alexandr Shapkin. Can't find the ticket though.
> > > > > > Alexandr, can you please confirm and attach the ticket number?
> > > > > >
> > > > > > 2. Proposed changes will work only for Java tasks that are already
> > > > > > deployed on server nodes.
> > > > > > This is mostly useless for other thin clients we have (Python, PHP,
> > > > > > .NET, C++).
> > > > > >
> > > > >
> > > > > I don't think so. A task (its execution) is a way to implement one's
> > > > > own layer for the thin client application.
> > > > >
> > > > >
> > > > > > We should think of a way to make this useful for all clients.
> > > > > > For example, we may allow sending tasks in some scripting language
> > > > > > like JavaScript.
> > > > > > Thoughts?
> > > > > >
> > > > >
> > > > > Arbitrary code execution from a remote client must be protected
> > > > > from malicious code.
> > > > > I don't know how it could be designed, but without that we open a
> > > > > hole that could kill the cluster.
> > > > >
> > > > >
> > > > > >
> > > > > > On Thu, Nov 21, 2019 at 11:21 AM Sergey Kozlov <[email protected]>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi Alex
> > > > > > >
> > > > > > > The idea is great. But I have some concerns that probably should be
> > > > > > > taken into account for design:
> > > > > > >
> > > > > > >    1. We need to have the ability to stop a task execution, something
> > > > > > >    like an OP_COMPUTE_CANCEL_TASK operation (client to server)
> > > > > > >    2. What about a task execution timeout? It may help the cluster
> > > > > > >    survive buggy tasks
> > > > > > >    3. Ignite doesn't have roles/authorization functionality for now,
> > > > > > >    but a task is a risky operation for the cluster (for security
> > > > > > >    reasons). Could we add new options to the Ignite configuration:
> > > > > > >       - Explicitly enabling compute task support for the thin
> > > > > > >       protocol (disabled by default) for the whole cluster
> > > > > > >       - Explicitly enabling compute task support for a node
> > > > > > >       - The list of task names (classes) allowed to be executed by
> > > > > > >       a thin client.
> > > > > > >    4. Support labeling for tasks, which may help to investigate
> > > > > > >    issues on the cluster (the idea from IEP-34 [1])
> > > > > > >
> > > > > > > [1] https://cwiki.apache.org/confluence/display/IGNITE/IEP-34+Thin+client%3A+transactions+support
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Thu, Nov 21, 2019 at 10:58 AM Alex Plehanov <[email protected]>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hello, Igniters!
> > > > > > > >
> > > > > > > > I have plans to start implementation of the Compute interface for the
> > > > > > > > Ignite thin client and want to discuss features that should be
> > > > > > > > implemented.
> > > > > > > >
> > > > > > > > We already have a Compute implementation for binary-rest clients
> > > > > > > > (GridClientCompute), which has the following functionality:
> > > > > > > > - Filtering cluster nodes (projection) for compute
> > > > > > > > - Executing task by the name
> > > > > > > >
> > > > > > > > I think we can implement this functionality in a thin client as well.
> > > > > > > >
> > > > > > > > First of all, we need some operation types to request a list of all
> > > > > > > > available nodes and probably node attributes (by a list of nodes).
> > > > > > > > Node attributes will be helpful if we decide to implement an analog
> > > > > > > > of the ClusterGroup#forAttribute or ClusterGroup#forPredicate methods
> > > > > > > > in the thin client. Perhaps they can be requested lazily.
> > > > > > > >
> > > > > > > > From the protocol point of view, there will be two new operations:
> > > > > > > >
> > > > > > > > OP_CLUSTER_GET_NODES
> > > > > > > > Request: empty
> > > > > > > > Response: long topologyVersion, int minorTopologyVersion, int
> > > > > > > > nodesCount, for each node: set of node fields (UUID nodeId, Object or
> > > > > > > > String consistentId, long order, etc.)
> > > > > > > >
> > > > > > > > OP_CLUSTER_GET_NODE_ATTRIBUTES
> > > > > > > > Request: int nodesCount, for each node: UUID nodeId
> > > > > > > > Response: int nodesCount, for each node: int attributesCount, for
> > > > > > > > each node attribute: String name, Object value
> > > > > > > >
> > > > > > > > To execute tasks we need something like these methods in the client API:
> > > > > > > > Object execute(String task, Object arg)
> > > > > > > > Future<Object> executeAsync(String task, Object arg)
> > > > > > > > Object affinityExecute(String task, String cache, Object key, Object arg)
> > > > > > > > Future<Object> affinityExecuteAsync(String task, String cache, Object key, Object arg)
> > > > > > > >
> > > > > > > > Which can be mapped to protocol operations:
> > > > > > > >
> > > > > > > > OP_COMPUTE_EXECUTE_TASK
> > > > > > > > Request: UUID nodeId, String taskName, Object arg
> > > > > > > > Response: Object result
> > > > > > > >
> > > > > > > > OP_COMPUTE_EXECUTE_TASK_AFFINITY
> > > > > > > > Request: String cacheName, Object key, String taskName, Object arg
> > > > > > > > Response: Object result
> > > > > > > >
> > > > > > > > The second operation is needed because we sometimes can't calculate
> > > > > > > > the affinity node and connect to it on the client side (affinity
> > > > > > > > awareness can be disabled, a custom affinity function can be used, or
> > > > > > > > there can be no connection between the client and the affinity node),
> > > > > > > > but we can make a best effort to send the request to the target node
> > > > > > > > if affinity awareness is enabled.
> > > > > > > >
> > > > > > > > Currently, on the server side, requests are always processed
> > > > > > > > synchronously and responses are sent right after a request is
> > > > > > > > processed. To execute long tasks asynchronously, we should either
> > > > > > > > change this logic or introduce some kind of two-way communication
> > > > > > > > between client and server (now only one-way requests from client to
> > > > > > > > server are allowed).
> > > > > > > >
> > > > > > > > Two-way communication can also be useful in the future if we send
> > > > > > > > some server-side generated events to clients.
> > > > > > > >
> > > > > > > > In case of two-way communication, new operations can be introduced:
> > > > > > > >
> > > > > > > > OP_COMPUTE_EXECUTE_TASK (from client to server)
> > > > > > > > Request: UUID nodeId, String taskName, Object arg
> > > > > > > > Response: long taskId
> > > > > > > >
> > > > > > > > OP_COMPUTE_TASK_FINISHED (from server to client)
> > > > > > > > Request: taskId, Object result
> > > > > > > > Response: empty
> > > > > > > >
> > > > > > > > The same for affinity requests.
> > > > > > > >
> > > > > > > > Also, we can implement not only the execute task operation but also
> > > > > > > > some other operations from IgniteCompute (broadcast, run, call), but
> > > > > > > > it will be useful only for the Java thin client. And even with the
> > > > > > > > Java thin client, we should either implement peer-class-loading for
> > > > > > > > thin clients (this also requires two-way client-server communication)
> > > > > > > > or put classes with executed closures on the server locally.
> > > > > > > >
> > > > > > > > What do you think about proposed protocol changes?
> > > > > > > > Do we need two-way requests between client and server?
> > > > > > > > Do we need support of compute methods other than "execute task"?
> > > > > > > > What do you think about peer-class-loading for thin clients?
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Sergey Kozlov
> > > > > > > GridGain Systems
> > > > > > > www.gridgain.com
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Sergey Kozlov
> > > > > GridGain Systems
> > > > > www.gridgain.com
> > > > >
> > > >
> > > >
> > > > --
> > > > Alex.
> > > >
> > >
> >
>
