> I can't see any usage of request id in query cursors

You are right, cursor id is a separate thing.
Anyway, my point stands.
> client sends long term tasks to nodes and wants to do it with load balancing

I still don't get it. Can you please provide equivalent use case with
existing "thick" client?

On Mon, Nov 25, 2019 at 11:59 PM Alex Plehanov <plehanov.a...@gmail.com> wrote:

> > And it is fine to use request ID to identify compute tasks (as we do with
> > query cursors).
> I can't see any usage of request id in query cursors. We send query request
> and get cursor id in response. After that, we only use cursor id (to get
> next pages and to close the resource). Did I miss something?
>
> > Looks like I'm missing something - how is topology change relevant to
> > executing compute tasks from client?
> It's not relevant directly. But there are some cases where it will be
> helpful. For example, if client sends long term tasks to nodes and wants to
> do it with load balancing it will detect topology change only after some
> time in the future with the first response, so load balancing will not work.
> Perhaps we can add optional "topology version" field to the
> OP_COMPUTE_EXECUTE_TASK request to solve this problem.
>
> Mon, Nov 25, 2019 at 22:42, Pavel Tupitsyn <ptupit...@apache.org> wrote:
>
> > Alex,
> >
> > > we will mix entities from different layers (transport layer and request
> > > body)
> > I would not call our message header (which includes the id) "transport
> > layer".
> > TCP is our transport layer. And it is fine to use request ID to identify
> > compute tasks (as we do with query cursors).
> >
> > > we still can't be sure that the task is successfully started on a server
> > The request to start the task will fail and we'll get a response indicating
> > that right away
> >
> > > we won't ever know about topology change
> > Looks like I'm missing something - how is topology change relevant to
> > executing compute tasks from client?
> >
> > On Mon, Nov 25, 2019 at 10:17 PM Alex Plehanov <plehanov.a...@gmail.com>
> > wrote:
> >
> > > Pavel, in this case, we will mix entities from different layers (transport
> > > layer and request body), it's not very good. We can achieve the same
> > > behavior with a task id generated on the client side, but there will be no
> > > inter-layer data intersection and I think it will be easier to implement on
> > > both client and server-side. But we still can't be sure that the task is
> > > successfully started on a server. We won't ever know about topology change,
> > > because topology changed flag will be sent from server to client only with
> > > a response when the task is completed. Do we accept that?
> > >
> > > Mon, Nov 25, 2019 at 19:07, Pavel Tupitsyn <ptupit...@apache.org> wrote:
> > >
> > > > Alex,
> > > >
> > > > I have a simpler idea. We already do request id handling in the protocol,
> > > > so:
> > > > - Client sends a normal request to execute compute task. Request ID is
> > > > generated as usual.
> > > > - As soon as task is completed, a response is received.
> > > >
> > > > As for cancellation - client can send a new request (with new request ID)
> > > > and (in the body) pass the request ID from above
> > > > as a task identifier. As a result, there are two responses:
> > > > - Cancellation response
> > > > - Task response (with proper cancelled status)
> > > >
> > > > That's it, no need to modify the core of the protocol. One request - one
> > > > response.
> > > >
> > > > On Mon, Nov 25, 2019 at 6:20 PM Alex Plehanov <plehanov.a...@gmail.com>
> > > > wrote:
> > > >
> > > > > Pavel, we need to inform the client when the task is completed, we need
> > > > > the ability to cancel the task. I see several ways to implement this:
> > > > >
> > > > > 1. Client sends a request to the server to start a task, server returns
> > > > > task id in response.
> > > > > Server notifies client when task is completed with a new
> > > > > request (from server to client). Client can cancel the task by sending a
> > > > > new request with operation type "cancel" and task id. In this case, we
> > > > > should implement two-way requests.
> > > > > 2. Client generates unique task id and sends a request to the server to
> > > > > start a task, server doesn't reply immediately but waits until task is
> > > > > completed. Client can cancel task by sending new request with operation
> > > > > type "cancel" and task id. In this case, we should decouple request and
> > > > > response on the server-side (currently response is sent right after
> > > > > request was processed). Also, we can't be sure that task is successfully
> > > > > started on a server.
> > > > > 3. Client sends a request to the server to start a task, server returns
> > > > > id in response. Client periodically asks the server about task status.
> > > > > Client can cancel the task by sending new request with operation type
> > > > > "cancel" and task id. This case brings some overhead to the communication
> > > > > channel.
> > > > >
> > > > > Personally, I think that the case with two-way requests is better, but I'm
> > > > > open to any other ideas.
> > > > >
> > > > > Aleksandr,
> > > > >
> > > > > Filtering logic for OP_CLUSTER_GROUP_GET_NODE_IDS looks overcomplicated.
> > > > > Do we need server-side filtering at all? Wouldn't it be better to send
> > > > > basic info (ids, order, flags) for all nodes (there is relatively small
> > > > > amount of data) and extended info (attributes) for selected list of
> > > > > nodes? In this case, we can do basic node filtration on client-side
> > > > > (forClients(), forServers(), forNodeIds(), forOthers(), etc).
> > > > >
> > > > > Do you use standard ClusterNode serialization?
> > > > > There are also metrics
> > > > > serialized with ClusterNode, do we need it on thin client? Other
> > > > > interfaces exist to show metrics, I think it's redundant to export
> > > > > metrics to thin clients too.
> > > > >
> > > > > What do you think?
> > > > >
> > > > > Fri, Nov 22, 2019 at 20:15, Aleksandr Shapkin <lexw...@gmail.com> wrote:
> > > > >
> > > > > > Alex,
> > > > > >
> > > > > > I think you can create a new IEP page and I will fill it with the
> > > > > > Cluster API details.
> > > > > >
> > > > > > In short, I’ve introduced several new codes:
> > > > > >
> > > > > > Cluster API is pretty straightforward:
> > > > > >
> > > > > > OP_CLUSTER_IS_ACTIVE = 5000
> > > > > > OP_CLUSTER_CHANGE_STATE = 5001
> > > > > > OP_CLUSTER_CHANGE_WAL_STATE = 5002
> > > > > > OP_CLUSTER_GET_WAL_STATE = 5003
> > > > > >
> > > > > > Cluster group codes:
> > > > > >
> > > > > > OP_CLUSTER_GROUP_GET_NODE_IDS = 5100
> > > > > > OP_CLUSTER_GROUP_GET_NODE_INFO = 5101
> > > > > >
> > > > > > The underlying implementation is based on the thick client logic.
> > > > > >
> > > > > > For every request, we provide a known topology version and if it has
> > > > > > changed, a client first updates it and then re-sends the filtering
> > > > > > request.
> > > > > >
> > > > > > Alongside the topVer a client sends a serialized nodes projection object
> > > > > > that could be considered as a code to value mapping.
> > > > > >
> > > > > > Consider: [{Code = 1, Value = [“DotNet”, “MyAttribute”]}, {Code = 2, Value = 1}]
> > > > > >
> > > > > > Where “1” stands for Attribute filtering and “2” for the serverNodesOnly
> > > > > > flag.
> > > > > >
> > > > > > As a result of request processing, a server sends nodeId UUIDs and a
> > > > > > current topVer.
> > > > > >
> > > > > > When a client obtains nodeIds, it can perform a NODE_INFO call to get a
> > > > > > serialized ClusterNode object. In addition there should be a different
> > > > > > API method for accessing/updating node metrics.
> > > > > >
> > > > > > Thu, Nov 21, 2019 at 12:32, Sergey Kozlov <skoz...@gridgain.com> wrote:
> > > > > >
> > > > > > > Hi Pavel
> > > > > > >
> > > > > > > On Thu, Nov 21, 2019 at 11:30 AM Pavel Tupitsyn <ptupit...@apache.org>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > 1. I believe that Cluster operations for Thin Client protocol are
> > > > > > > > already in the works
> > > > > > > > by Alexandr Shapkin. Can't find the ticket though.
> > > > > > > > Alexandr, can you please confirm and attach the ticket number?
> > > > > > > >
> > > > > > > > 2. Proposed changes will work only for Java tasks that are already
> > > > > > > > deployed on server nodes.
> > > > > > > > This is mostly useless for other thin clients we have (Python, PHP,
> > > > > > > > .NET, C++).
> > > > > > >
> > > > > > > I don't think so. The task (execution) is a way to implement own layer
> > > > > > > for the thin client application.
> > > > > > >
> > > > > > > > We should think of a way to make this useful for all clients.
> > > > > > > > For example, we may allow sending tasks in some scripting language
> > > > > > > > like Javascript.
> > > > > > > > Thoughts?
> > > > > > >
> > > > > > > The arbitrary code execution from a remote client must be protected
> > > > > > > from malicious code.
> > > > > > > I don't know how it could be designed but without that we open a hole
> > > > > > > that can be used to kill the cluster.
> > > > > > >
> > > > > > > > On Thu, Nov 21, 2019 at 11:21 AM Sergey Kozlov <skoz...@gridgain.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi Alex
> > > > > > > > >
> > > > > > > > > The idea is great. But I have some concerns that probably should be
> > > > > > > > > taken into account for design:
> > > > > > > > >
> > > > > > > > >    1. We need to have the ability to stop a task execution, something
> > > > > > > > >    like OP_COMPUTE_CANCEL_TASK operation (client to server)
> > > > > > > > >    2. What about task execution timeout? It may help the cluster
> > > > > > > > >    survive buggy tasks
> > > > > > > > >    3. Ignite doesn't have roles/authorization functionality for now.
> > > > > > > > >    But a task is a risky operation for the cluster (for security
> > > > > > > > >    reasons). Could we add new options to the Ignite configuration:
> > > > > > > > >       - Explicit turning on for compute task support for thin
> > > > > > > > >       protocol (disabled by default) for whole cluster
> > > > > > > > >       - Explicit turning on for compute task support for a node
> > > > > > > > >       - The list of task names (classes) allowed to execute by thin
> > > > > > > > >       client.
> > > > > > > > >    4. Support the labeling for task that may help to investigate
> > > > > > > > >    issues on cluster (the idea from IEP-34 [1])
> > > > > > > > >
> > > > > > > > > 1. https://cwiki.apache.org/confluence/display/IGNITE/IEP-34+Thin+client%3A+transactions+support
> > > > > > > > >
> > > > > > > > > On Thu, Nov 21, 2019 at 10:58 AM Alex Plehanov <plehanov.a...@gmail.com>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hello, Igniters!
> > > > > > > > > >
> > > > > > > > > > I have plans to start implementation of Compute interface for Ignite
> > > > > > > > > > thin client and want to discuss features that should be implemented.
> > > > > > > > > >
> > > > > > > > > > We already have Compute implementation for binary-rest clients
> > > > > > > > > > (GridClientCompute), which has the following functionality:
> > > > > > > > > > - Filtering cluster nodes (projection) for compute
> > > > > > > > > > - Executing task by the name
> > > > > > > > > >
> > > > > > > > > > I think we can implement this functionality in a thin client as well.
> > > > > > > > > >
> > > > > > > > > > First of all, we need some operation types to request a list of all
> > > > > > > > > > available nodes and probably node attributes (by a list of nodes).
> > > > > > > > > > Node attributes will be helpful if we decide to implement an analog
> > > > > > > > > > of ClusterGroup#forAttribute or ClusterGroup#forPredicate methods in
> > > > > > > > > > the thin client. Perhaps they can be requested lazily.
> > > > > > > > > >
> > > > > > > > > > From the protocol point of view there will be two new operations:
> > > > > > > > > >
> > > > > > > > > > OP_CLUSTER_GET_NODES
> > > > > > > > > > Request: empty
> > > > > > > > > > Response: long topologyVersion, int minorTopologyVersion, int nodesCount,
> > > > > > > > > > for each node set of node fields (UUID nodeId, Object or String
> > > > > > > > > > consistentId, long order, etc)
> > > > > > > > > >
> > > > > > > > > > OP_CLUSTER_GET_NODE_ATTRIBUTES
> > > > > > > > > > Request: int nodesCount, for each node: UUID nodeId
> > > > > > > > > > Response: int nodesCount, for each node: int attributesCount, for each
> > > > > > > > > > node attribute: String name, Object value
> > > > > > > > > >
> > > > > > > > > > To execute tasks we need something like these methods in the client API:
> > > > > > > > > > Object execute(String task, Object arg)
> > > > > > > > > > Future<Object> executeAsync(String task, Object arg)
> > > > > > > > > > Object affinityExecute(String task, String cache, Object key, Object arg)
> > > > > > > > > > Future<Object> affinityExecuteAsync(String task, String cache, Object key,
> > > > > > > > > > Object arg)
> > > > > > > > > >
> > > > > > > > > > Which can be mapped to protocol operations:
> > > > > > > > > >
> > > > > > > > > > OP_COMPUTE_EXECUTE_TASK
> > > > > > > > > > Request: UUID nodeId, String taskName, Object arg
> > > > > > > > > > Response: Object result
> > > > > > > > > >
> > > > > > > > > > OP_COMPUTE_EXECUTE_TASK_AFFINITY
> > > > > > > > > > Request: String cacheName, Object key, String taskName, Object arg
> > > > > > > > > > Response: Object result
> > > > > > > > > >
> > > > > > > > > > The second operation is needed because we sometimes can't calculate and
> > > > > > > > > > connect to affinity node on the client-side (affinity awareness can be
> > > > > > > > > > disabled, custom affinity function can be used or there can be no
> > > > > > > > > > connection between client and affinity node), but we can make best effort
> > > > > > > > > > to send request to target node if affinity awareness is enabled.
> > > > > > > > > >
> > > > > > > > > > Currently, on the server-side requests are always processed synchronously
> > > > > > > > > > and responses are sent right after request was processed. To execute long
> > > > > > > > > > tasks async we should either change this logic or introduce some kind of
> > > > > > > > > > two-way communication between client and server (now only one-way
> > > > > > > > > > requests from client to server are allowed).
> > > > > > > > > >
> > > > > > > > > > Two-way communication can also be useful in the future if we send some
> > > > > > > > > > server-side generated events to clients.
> > > > > > > > > >
> > > > > > > > > > In case of two-way communication there can be new operations introduced:
> > > > > > > > > >
> > > > > > > > > > OP_COMPUTE_EXECUTE_TASK (from client to server)
> > > > > > > > > > Request: UUID nodeId, String taskName, Object arg
> > > > > > > > > > Response: long taskId
> > > > > > > > > >
> > > > > > > > > > OP_COMPUTE_TASK_FINISHED (from server to client)
> > > > > > > > > > Request: taskId, Object result
> > > > > > > > > > Response: empty
> > > > > > > > > >
> > > > > > > > > > The same for affinity requests.
> > > > > > > > > >
> > > > > > > > > > Also, we can implement not only execute task operation, but some other
> > > > > > > > > > operations from IgniteCompute (broadcast, run, call), but it will be
> > > > > > > > > > useful only for java thin client. And even with java thin client we
> > > > > > > > > > should either implement peer-class-loading for thin clients (this also
> > > > > > > > > > requires two-way client-server communication) or put classes with
> > > > > > > > > > executed closures to the server locally.
> > > > > > > > > >
> > > > > > > > > > What do you think about proposed protocol changes?
> > > > > > > > > > Do we need two-way requests between client and server?
> > > > > > > > > > Do we need support of compute methods other than "execute task"?
> > > > > > > > > > What do you think about peer-class-loading for thin clients?
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Sergey Kozlov
> > > > > > > > > GridGain Systems
> > > > > > > > > www.gridgain.com
> > > > > > >
> > > > > > > --
> > > > > > > Sergey Kozlov
> > > > > > > GridGain Systems
> > > > > > > www.gridgain.com
> > > > > >
> > > > > > --
> > > > > > Alex.
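
For reference, the cancellation scheme proposed earlier in the thread (the request ID of the "execute task" request doubles as the task identifier, and a cancel request carries that ID in its body) can be sketched roughly as below. This is only an in-memory illustration under assumptions: the class RequestIdTaskSketch, its Server and Response types, the method names, and the status strings are all hypothetical, nothing here is Ignite API, and no networking or real task execution is involved.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** In-memory sketch only: none of these types are Ignite API. */
public class RequestIdTaskSketch {
    /** Hypothetical response: the ID of the request it answers plus a status string. */
    public static final class Response {
        public final long requestId;
        public final String status;

        public Response(long requestId, String status) {
            this.requestId = requestId;
            this.status = status;
        }

        @Override public String toString() {
            return "Response[requestId=" + requestId + ", status=" + status + "]";
        }
    }

    /** Hypothetical server side: running tasks are keyed by the ID of the request that started them. */
    public static final class Server {
        private final Map<Long, String> running = new HashMap<>();
        private final List<Response> sent = new ArrayList<>();

        /** "Execute task" request: the response is deferred until the task finishes or is cancelled. */
        public void executeTask(long requestId, String taskName) {
            running.put(requestId, taskName);
        }

        /** "Cancel" request: has its own request ID; the body carries the ID of the original request. */
        public void cancelTask(long requestId, long targetRequestId) {
            boolean wasRunning = running.remove(targetRequestId) != null;
            // First response answers the cancel request itself.
            sent.add(new Response(requestId, wasRunning ? "CANCEL_ACCEPTED" : "TASK_NOT_FOUND"));
            // Second response answers the original "execute task" request with a cancelled status.
            if (wasRunning)
                sent.add(new Response(targetRequestId, "CANCELLED"));
        }

        /** Normal completion: a single response to the original request. */
        public void completeTask(long requestId) {
            if (running.remove(requestId) != null)
                sent.add(new Response(requestId, "COMPLETED"));
        }

        /** Responses in the order the server produced them. */
        public List<Response> sentResponses() {
            return sent;
        }
    }

    public static void main(String[] args) {
        Server srv = new Server();
        srv.executeTask(1L, "MyTask");     // request 1 starts a task
        srv.executeTask(2L, "OtherTask");  // request 2 starts another task
        srv.cancelTask(3L, 1L);            // request 3 cancels the task started by request 1
        srv.completeTask(2L);              // the second task finishes normally
        for (Response r : srv.sentResponses())
            System.out.println(r);
    }
}
```

Each request still gets exactly one response in this sketch; the only change relative to the current behavior described in the thread is that the response to the "execute task" request is deferred rather than sent immediately.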