Alex, I have a simpler idea. We already do request id handling in the protocol, so:
- Client sends a normal request to execute compute task. Request ID is generated as usual.
- As soon as task is completed, a response is received.

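
To make the request ID part concrete, here is a rough client-side sketch in Python. The PendingRequests class, the transport callback, and the payload shape are made up for illustration and are not existing Ignite client code; only the OP_COMPUTE_EXECUTE_TASK name comes from this thread.

```python
import itertools
from concurrent.futures import Future


class PendingRequests:
    """Hypothetical client-side table: request ID -> future awaiting the response."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self._pending = {}

    def send(self, op, payload, transport):
        # Request ID is generated as usual; the future completes whenever
        # the server eventually answers this particular ID.
        request_id = next(self._next_id)
        future = Future()
        self._pending[request_id] = future
        transport(request_id, op, payload)
        return request_id, future

    def on_response(self, request_id, result):
        # Responses may arrive in any order, so match them by request ID.
        self._pending.pop(request_id).set_result(result)


# Usage: the task response simply arrives later than other responses.
outbox = []
client = PendingRequests()
task_id, task_future = client.send(
    "OP_COMPUTE_EXECUTE_TASK", {"taskName": "MyTask"}, lambda *msg: outbox.append(msg)
)
client.on_response(task_id, "task result")  # sent by the server on task completion
```

Still one request and one response; the response just is not tied to the moment the request was read.
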
As for cancellation - client can send a new request (with new request ID) and (in the body) pass the request ID from above as a task identifier. As a result, there are two responses:
- Cancellation response
- Task response (with proper cancelled status)

That's it, no need to modify the core of the protocol. One request - one response.

On Mon, Nov 25, 2019 at 6:20 PM Alex Plehanov <plehanov.a...@gmail.com> wrote:

> Pavel, we need to inform the client when the task is completed, we need the ability to cancel the task. I see several ways to implement this:
>
> 1. Client sends a request to the server to start a task, server returns task id in response. Server notifies client when task is completed with a new request (from server to client). Client can cancel the task by sending a new request with operation type "cancel" and task id. In this case, we should implement 2-ways requests.
> 2. Client generates unique task id and sends a request to the server to start a task, server doesn't reply immediately but waits until task is completed. Client can cancel task by sending new request with operation type "cancel" and task id. In this case, we should decouple request and response on the server-side (currently response is sent right after request was processed). Also, we can't be sure that task is successfully started on a server.
> 3. Client sends a request to the server to start a task, server returns id in response. Client periodically asks the server about task status. Client can cancel the task by sending new request with operation type "cancel" and task id. This case brings some overhead to the communication channel.
>
> Personally, I think that the case with 2-ways requests is better, but I'm open to any other ideas.
>
> Aleksandr,
>
> Filtering logic for OP_CLUSTER_GROUP_GET_NODE_IDS looks overcomplicated. Do we need server-side filtering at all?
> Wouldn't it be better to send basic info (ids, order, flags) for all nodes (there is a relatively small amount of data) and extended info (attributes) for a selected list of nodes? In this case, we can do basic node filtering on the client side (forClients(), forServers(), forNodeIds(), forOthers(), etc).
>
> Do you use standard ClusterNode serialization? Metrics are also serialized with ClusterNode, do we need them on the thin client? Other interfaces exist to show metrics, I think it's redundant to export metrics to thin clients too.
>
> What do you think?
>
> On Fri, Nov 22, 2019 at 20:15, Aleksandr Shapkin <lexw...@gmail.com> wrote:
>
> > Alex,
> >
> > I think you can create a new IEP page and I will fill it with the Cluster API details.
> >
> > In short, I’ve introduced several new codes:
> >
> > Cluster API is pretty straightforward:
> >
> > OP_CLUSTER_IS_ACTIVE = 5000
> > OP_CLUSTER_CHANGE_STATE = 5001
> > OP_CLUSTER_CHANGE_WAL_STATE = 5002
> > OP_CLUSTER_GET_WAL_STATE = 5003
> >
> > Cluster group codes:
> >
> > OP_CLUSTER_GROUP_GET_NODE_IDS = 5100
> > OP_CLUSTER_GROUP_GET_NODE_INFO = 5101
> >
> > The underlying implementation is based on the thick client logic.
> >
> > For every request, we provide a known topology version and if it has changed, a client updates it first and then re-sends the filtering request.
> >
> > Alongside the topVer a client sends a serialized nodes projection object that could be considered as a code to value mapping.
> > Consider: [{Code = 1, Value = [“DotNet”, “MyAttribute”]}, {Code = 2, Value = 1}]
> > Where “1” stands for Attribute filtering and “2” – serverNodesOnly flag.
> >
> > As a result of request processing, a server sends nodeId UUIDs and a current topVer.
> >
> > When a client obtains nodeIds, it can perform a NODE_INFO call to get a serialized ClusterNode object. In addition there should be a different API method for accessing/updating node metrics.
> >
> > On Thu, Nov 21, 2019 at 12:32, Sergey Kozlov <skoz...@gridgain.com> wrote:
> >
> > > Hi Pavel
> > >
> > > On Thu, Nov 21, 2019 at 11:30 AM Pavel Tupitsyn <ptupit...@apache.org> wrote:
> > >
> > > > 1. I believe that Cluster operations for Thin Client protocol are already in the works by Alexandr Shapkin. Can't find the ticket though.
> > > > Alexandr, can you please confirm and attach the ticket number?
> > > >
> > > > 2. Proposed changes will work only for Java tasks that are already deployed on server nodes.
> > > > This is mostly useless for other thin clients we have (Python, PHP, .NET, C++).
> > >
> > > I don't think so. The task (execution) is a way to implement own layer for the thin client application.
> > >
> > > > We should think of a way to make this useful for all clients.
> > > > For example, we may allow sending tasks in some scripting language like Javascript.
> > > > Thoughts?
> > >
> > > The arbitrary code execution from a remote client must be protected from malicious code.
> > > I don't know how it could be designed but without that we open a hole that can kill the cluster.
> > >
> > > > On Thu, Nov 21, 2019 at 11:21 AM Sergey Kozlov <skoz...@gridgain.com> wrote:
> > > >
> > > > > Hi Alex
> > > > >
> > > > > The idea is great. But I have some concerns that probably should be taken into account for design:
> > > > >
> > > > >    1. We need to have the ability to stop a task execution, smth like OP_COMPUTE_CANCEL_TASK operation (client to server)
> > > > >    2. What about task execution timeout? It may help the cluster survive buggy tasks
> > > > >    3. Ignite doesn't have roles/authorization functionality for now. But a task is a risky operation for the cluster (for security reasons). Could we add new options to the Ignite configuration:
> > > > >       - Explicit turning on of compute task support for thin protocol (disabled by default) for the whole cluster
> > > > >       - Explicit turning on of compute task support for a node
> > > > >       - The list of task names (classes) allowed to execute by thin client.
> > > > >    4. Support labeling of tasks, which may help to investigate issues on the cluster (the idea from IEP-34 [1])
> > > > >
> > > > > 1. https://cwiki.apache.org/confluence/display/IGNITE/IEP-34+Thin+client%3A+transactions+support
> > > > >
> > > > > On Thu, Nov 21, 2019 at 10:58 AM Alex Plehanov <plehanov.a...@gmail.com> wrote:
> > > > >
> > > > > > Hello, Igniters!
> > > > > >
> > > > > > I have plans to start implementation of Compute interface for Ignite thin client and want to discuss features that should be implemented.
> > > > > >
> > > > > > We already have Compute implementation for binary-rest clients (GridClientCompute), which has the following functionality:
> > > > > > - Filtering cluster nodes (projection) for compute
> > > > > > - Executing task by the name
> > > > > >
> > > > > > I think we can implement this functionality in a thin client as well.
> > > > > >
> > > > > > First of all, we need some operation types to request a list of all available nodes and probably node attributes (by a list of nodes). Node attributes will be helpful if we decide to implement an analog of ClusterGroup#forAttribute or ClusterGroup#forPredicate methods in the thin client. Perhaps they can be requested lazily.
> > > > > >
> > > > > > From the protocol point of view there will be two new operations:
> > > > > >
> > > > > > OP_CLUSTER_GET_NODES
> > > > > > Request: empty
> > > > > > Response: long topologyVersion, int minorTopologyVersion, int nodesCount, for each node set of node fields (UUID nodeId, Object or String consistentId, long order, etc)
> > > > > >
> > > > > > OP_CLUSTER_GET_NODE_ATTRIBUTES
> > > > > > Request: int nodesCount, for each node: UUID nodeId
> > > > > > Response: int nodesCount, for each node: int attributesCount, for each node attribute: String name, Object value
> > > > > >
> > > > > > To execute tasks we need something like these methods in the client API:
> > > > > > Object execute(String task, Object arg)
> > > > > > Future<Object> executeAsync(String task, Object arg)
> > > > > > Object affinityExecute(String task, String cache, Object key, Object arg)
> > > > > > Future<Object> affinityExecuteAsync(String task, String cache, Object key, Object arg)
> > > > > >
> > > > > > Which can be mapped to protocol operations:
> > > > > >
> > > > > > OP_COMPUTE_EXECUTE_TASK
> > > > > > Request: UUID nodeId, String taskName, Object arg
> > > > > > Response: Object result
> > > > > >
> > > > > > OP_COMPUTE_EXECUTE_TASK_AFFINITY
> > > > > > Request: String cacheName, Object key, String taskName, Object arg
> > > > > > Response: Object result
> > > > > >
> > > > > > The second operation is needed because we sometimes can't calculate and connect to affinity node on the client-side (affinity awareness can be disabled, custom affinity function can be used or there can be no connection between client and affinity node), but we can make best effort to send request to target node if affinity awareness is enabled.
> > > > > >
> > > > > > Currently, on the server side requests are always processed synchronously and responses are sent right after the request was processed. To execute long tasks async we should either change this logic or introduce some kind of two-way communication between client and server (now only one-way requests from client to server are allowed).
> > > > > >
> > > > > > Two-way communication can also be useful in the future if we send some server-side generated events to clients.
> > > > > >
> > > > > > In case of two-way communication there can be new operations introduced:
> > > > > >
> > > > > > OP_COMPUTE_EXECUTE_TASK (from client to server)
> > > > > > Request: UUID nodeId, String taskName, Object arg
> > > > > > Response: long taskId
> > > > > >
> > > > > > OP_COMPUTE_TASK_FINISHED (from server to client)
> > > > > > Request: taskId, Object result
> > > > > > Response: empty
> > > > > >
> > > > > > The same for affinity requests.
> > > > > >
> > > > > > Also, we can implement not only the execute task operation, but some other operations from IgniteCompute (broadcast, run, call), but it will be useful only for the java thin client. And even with the java thin client we should either implement peer-class-loading for thin clients (this also requires two-way client-server communication) or put classes with executed closures on the server locally.
> > > > > >
> > > > > > What do you think about proposed protocol changes?
> > > > > > Do we need two-way requests between client and server?
> > > > > > Do we need support of compute methods other than "execute task"?
> > > > > > What do you think about peer-class-loading for thin clients?
> > > > >
> > > > > --
> > > > > Sergey Kozlov
> > > > > GridGain Systems
> > > > > www.gridgain.com
> > >
> > > --
> > > Sergey Kozlov
> > > GridGain Systems
> > > www.gridgain.com
> >
> > --
> > Alex.