Huge +1 from me for Feature Masks. I think this should be our top priority for the thin client protocol, since it simplifies change management a lot.
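
To make the idea a bit more concrete, here is a rough sketch of how a feature mask could be exchanged during the handshake. This is only an illustration, not the actual protocol: the FeatureMaskSketch class, the feature names and the bit positions are all made up for this email. The client sends the bits it understands, the server intersects them with its own set, and unknown bits are simply ignored, so a client never has to implement features it does not need.

```java
import java.util.BitSet;
import java.util.EnumSet;

public class FeatureMaskSketch {
    /** Hypothetical features; names and bit positions are illustrative only. */
    public enum Feature {
        CLUSTER_API(0),
        COMPUTE_TASKS(1),
        SERVER_NOTIFICATIONS(2);

        final int bit;

        Feature(int bit) {
            this.bit = bit;
        }
    }

    /** Packs a set of features into a byte array (bit i of the mask = feature with bit index i). */
    public static byte[] encode(EnumSet<Feature> features) {
        BitSet bits = new BitSet();
        for (Feature f : features)
            bits.set(f.bit);
        return bits.toByteArray();
    }

    /** Unpacks a mask; bits that do not map to a known feature are ignored. */
    public static EnumSet<Feature> decode(byte[] mask) {
        BitSet bits = BitSet.valueOf(mask);
        EnumSet<Feature> res = EnumSet.noneOf(Feature.class);
        for (Feature f : Feature.values()) {
            if (bits.get(f.bit))
                res.add(f);
        }
        return res;
    }

    /** Features both sides may use on this connection: client mask intersected with server features. */
    public static EnumSet<Feature> negotiate(byte[] clientMask, EnumSet<Feature> serverFeatures) {
        EnumSet<Feature> common = decode(clientMask);
        common.retainAll(serverFeatures);
        return common;
    }

    public static void main(String[] args) {
        // A client that supports the cluster API and compute, but not server notifications.
        byte[] clientMask = encode(EnumSet.of(Feature.CLUSTER_API, Feature.COMPUTE_TASKS));
        EnumSet<Feature> serverFeatures = EnumSet.of(Feature.COMPUTE_TASKS, Feature.SERVER_NOTIFICATIONS);
        System.out.println(negotiate(clientMask, serverFeatures));
    }
}
```

With something like this, a Python or PHP client could skip the compute bit entirely and still pick up later features, which is exactly what plain protocol versioning does not allow.
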
On Mon, Jan 20, 2020 at 8:21 PM Igor Sapego <isap...@apache.org> wrote:

> Sorry for the late reply.
>
> The taskId approach will require a lot of changes in the protocol and is thus "heavier" to implement, but it definitely looks less hacky to me than the reqId approach. Moreover, as was mentioned, a server notification mechanism will most probably be required in the future anyway. So from this point of view I like the taskId approach.
>
> On the other hand, we should also consider performance here. Speaking of latency, it looks like reqId will give better results for small and fast tasks. The only question is whether we want to optimize thin clients for this case.
>
> Also, what you are talking about mostly involves clients on platforms that already have a Compute API for thick clients. Let me mention one more point of view and another concern here.
>
> The changes you propose are going to change the protocol version for sure. With the taskId approach and server notifications, even more so.
>
> But clients such as Python, Node.js, PHP and Go most probably won't support this API, at least for now, or maybe never. Yet the current backward-compatibility mechanism relies on protocol versions, where we imply that a client that supports version 1.5 also supports all the features introduced in all the previous versions of the protocol.
>
> Thus implementing the Compute API in any of the proposed ways *may* force the mentioned clients to support protocol changes they don't necessarily need, just to be able to introduce new features in the future.
>
> So, maybe it's a good time for us to change our backward compatibility mechanism from protocol versioning to feature masks?
>
> WDYT?
>
> Best Regards,
> Igor
>
> On Fri, Jan 17, 2020 at 9:37 AM Alex Plehanov <plehanov.a...@gmail.com> wrote:
>
> > Looks like we didn't reach consensus here.
> >
> > Igor, as thin client maintainer, can you please share your opinion?
> >
> > Everyone else is also welcome, please share your thoughts about the options for implementing compute operations.
> >
> > On Thu, Nov 28, 2019 at 10:02 AM Alex Plehanov <plehanov.a...@gmail.com> wrote:
> >
> > > > Since all thin client operations are inherently async, we should be able to cancel any of them
> > > It's illogical to have such an ability. What should a cancel operation on a cancel operation do? Moreover, sometimes it's dangerous: for example, a create cache operation should never be canceled. There should be an explicit set of processes that we can cancel: queries, transactions, tasks, services. The lifecycle of services is more complex than the lifecycle of tasks. With services, I suppose, we can't use request cancellation, so tasks will be the only process with an exceptional pattern.
> > >
> > > > The request would be "execute task with specified node filter" - simple and efficient.
> > > It's not simple: every compute or service request should contain complex node filtering logic, which duplicates the same logic for the cluster API.
> > > It's not efficient: for example, we can't implement forPredicate() filtering in this case.
> > >
> > > On Wed, Nov 27, 2019 at 7:25 PM Pavel Tupitsyn <ptupit...@apache.org> wrote:
> > >
> > > > > The request is already processed (task is started), we can't cancel the request
> > > > The request is not "start a task". It is "execute task" (and get result). Same as "cache get" - you get a result in the end, we don't "start cache get" then "end cache get".
> > > >
> > > > Since all thin client operations are inherently async, we should be able to cancel any of them by sending another request with the id of the prior request to be cancelled. That's why I'm advocating for this approach - it will work for anything, no special cases.
> > > > And it keeps "happy path" as simple as it is right now.
> > > >
> > > > Queries are different because we retrieve results in pages, we can't do them as one request.
> > > > Transactions are also different because the client controls when they should end.
> > > > There is no reason for task execution to be a special case like queries or transactions.
> > > >
> > > > > we always need to send 2 requests to server to execute the task
> > > > Nope. We don't need to get nodes on the client at all.
> > > > The request would be "execute task with specified node filter" - simple and efficient.
> > > >
> > > > On Wed, Nov 27, 2019 at 4:31 PM Alex Plehanov <plehanov.a...@gmail.com> wrote:
> > > >
> > > > > > We do cancel a request to perform a task. We may and should use this to cancel any other request in future.
> > > > > The request is already processed (task is started), we can't cancel the request. As you mentioned before, we already do almost the same for queries (close the cursor, but not cancel the request to run a query); it's better to do such things in a common way. We have a pattern: start some process (query, transaction), get the id of this process, end the process by this id. The "Execute task" process should match the same pattern. In my opinion, an implementation with two-way requests is the best option to match this pattern (we can even reuse the OP_RESOURCE_CLOSE operation type in this case). Sometime in the future, we will need two-way requests for some other functionality (continuous queries, event listening, etc). But even without two-way requests, introducing some process id (task id in our case) will be closer to the existing pattern than canceling tasks by request id.
> > > > >
> > > > > > So every new request will apply those filters on server side, using the most recent set of nodes.
> > > > > In this case, we always need to send 2 requests to the server to execute the task: first to get nodes by the filter, second to actually execute the task. It seems like overhead. The same will be true for services. The cluster group remains the same if the topology hasn't changed. We can use this fact and bind the "execute task" request to the topology. If the topology has changed, get nodes for the new topology and retry the request.
> > > > >
> > > > > On Tue, Nov 26, 2019 at 5:44 PM Pavel Tupitsyn <ptupit...@apache.org> wrote:
> > > > >
> > > > > > > After all, we don't cancel request
> > > > > > We do cancel a request to perform a task. We may and should use this to cancel any other request in future.
> > > > > >
> > > > > > > Client uses some cluster group filtration (for example forServers() cluster group)
> > > > > > Please see above - Aleksandr Shapkin described how we store filtered cluster groups on the client.
> > > > > > We don't store node IDs, we store actual filters. So every new request will apply those filters on the server side, using the most recent set of nodes.
> > > > > >
> > > > > > var myGrp = cluster.forServers().forAttribute("foo"); // This does not issue any server requests, just builds an object with filters on client
> > > > > > while (true) myGrp.compute().executeTask("bar"); // Every request includes filters, and filters are applied on the server side
> > > > > >
> > > > > > On Tue, Nov 26, 2019 at 1:42 PM Alex Plehanov <plehanov.a...@gmail.com> wrote:
> > > > > >
> > > > > > > > Anyway, my point stands.
> > > > > > > I can't agree. Why don't you want to use the task id for this?
After > > >> all, > > >> > we > > >> > > > don't cancel request (request is already processed), we cancel > the > > >> > task. > > >> > > So > > >> > > > it's more convenient to use task id here. > > >> > > > > > >> > > > > Can you please provide equivalent use case with existing > "thick" > > >> > > client? > > >> > > > For example: > > >> > > > Cluster consists of one server node. > > >> > > > Client uses some cluster group filtration (for example > > forServers() > > >> > > cluster > > >> > > > group). > > >> > > > Client starts to send periodically (for example 1 per minute) > > >> long-term > > >> > > > (for example 1 hour long) tasks to the cluster. > > >> > > > Meanwhile, several server nodes joined the cluster. > > >> > > > > > >> > > > In case of thick client: All server nodes will be used, tasks > will > > >> be > > >> > > load > > >> > > > balanced. > > >> > > > In case of thin client: Only one server node will be used, > client > > >> will > > >> > > > detect topology change after an hour. > > >> > > > > > >> > > > > > >> > > > вт, 26 нояб. 2019 г. в 11:50, Pavel Tupitsyn < > > ptupit...@apache.org > > >> >: > > >> > > > > > >> > > > > > I can't see any usage of request id in query cursors > > >> > > > > You are right, cursor id is a separate thing. > > >> > > > > Anyway, my point stands. > > >> > > > > > > >> > > > > > client sends long term tasks to nodes and wants to do it > with > > >> load > > >> > > > > balancing > > >> > > > > I still don't get it. Can you please provide equivalent use > case > > >> with > > >> > > > > existing "thick" client? > > >> > > > > > > >> > > > > > > >> > > > > On Mon, Nov 25, 2019 at 11:59 PM Alex Plehanov < > > >> > > plehanov.a...@gmail.com> > > >> > > > > wrote: > > >> > > > > > > >> > > > > > > And it is fine to use request ID to identify compute tasks > > >> (as we > > >> > > do > > >> > > > > with > > >> > > > > > query cursors). > > >> > > > > > I can't see any usage of request id in query cursors. 
> > > > > > > > > We send a query request and get a cursor id in response. After that, we only use the cursor id (to get next pages and to close the resource). Did I miss something?
> > > > > > > > >
> > > > > > > > > > Looks like I'm missing something - how is topology change relevant to executing compute tasks from client?
> > > > > > > > > It's not relevant directly. But there are some cases where it will be helpful. For example, if a client sends long-running tasks to nodes and wants to do it with load balancing, it will detect a topology change only some time later, with the first response, so load balancing will not work. Perhaps we can add an optional "topology version" field to the OP_COMPUTE_EXECUTE_TASK request to solve this problem.
> > > > > > > > >
> > > > > > > > > On Mon, Nov 25, 2019 at 10:42 PM Pavel Tupitsyn <ptupit...@apache.org> wrote:
> > > > > > > > >
> > > > > > > > > > Alex,
> > > > > > > > > >
> > > > > > > > > > > we will mix entities from different layers (transport layer and request body)
> > > > > > > > > > I would not call our message header (which includes the id) "transport layer".
> > > > > > > > > > TCP is our transport layer. And it is fine to use request ID to identify compute tasks (as we do with query cursors).
> > > > > > > > > >
> > > > > > > > > > > we still can't be sure that the task is successfully started on a server
> > > > > > > > > > The request to start the task will fail and we'll get a response indicating that right away
> > > > > > > > > >
> > > > > > > > > > > we won't ever know about topology change
> > > > > > > > > > Looks like I'm missing something - how is topology change relevant to executing compute tasks from client?
> > > > > > > > > >
> > > > > > > > > > On Mon, Nov 25, 2019 at 10:17 PM Alex Plehanov <plehanov.a...@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > Pavel, in this case, we will mix entities from different layers (transport layer and request body), which is not very good. We can achieve the same behavior with a task id generated on the client side, but there will be no inter-layer data intersection and I think it will be easier to implement on both the client and the server side. But we still can't be sure that the task is successfully started on a server. We won't ever know about a topology change, because the topology-changed flag will be sent from server to client only with a response when the task is completed. Do we accept that?
> > > > > > > > > > >
> > > > > > > > > > > On Mon, Nov 25, 2019 at 7:07 PM Pavel Tupitsyn <ptupit...@apache.org> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Alex,
> > > > > > > > > > > >
> > > > > > > > > > > > I have a simpler idea.
> > > > > > > > > > > > We already do request id handling in the protocol, so:
> > > > > > > > > > > > - Client sends a normal request to execute compute task. Request ID is generated as usual.
> > > > > > > > > > > > - As soon as task is completed, a response is received.
> > > > > > > > > > > >
> > > > > > > > > > > > As for cancellation - client can send a new request (with new request ID) and (in the body) pass the request ID from above as a task identifier. As a result, there are two responses:
> > > > > > > > > > > > - Cancellation response
> > > > > > > > > > > > - Task response (with proper cancelled status)
> > > > > > > > > > > >
> > > > > > > > > > > > That's it, no need to modify the core of the protocol. One request - one response.
> > > > > > > > > > > >
> > > > > > > > > > > > On Mon, Nov 25, 2019 at 6:20 PM Alex Plehanov <plehanov.a...@gmail.com> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Pavel, we need to inform the client when the task is completed, and we need the ability to cancel the task. I see several ways to implement this:
> > > > > > > > > > > > >
> > > > > > > > > > > > > 1. Client sends a request to the server to start a task, server returns a task id in the response. Server notifies the client when the task is completed with a new request (from server to client). Client can cancel the task by sending a new request with operation type "cancel" and the task id. In this case, we should implement 2-way requests.
> > > > > > > > > > > > > 2. Client generates a unique task id and sends a request to the server to start a task, server doesn't reply immediately but waits until the task is completed. Client can cancel the task by sending a new request with operation type "cancel" and the task id. In this case, we should decouple request and response on the server side (currently the response is sent right after the request is processed). Also, we can't be sure that the task is successfully started on a server.
> > > > > > > > > > > > > 3. Client sends a request to the server to start a task, server returns an id in the response. Client periodically asks the server about the task status. Client can cancel the task by sending a new request with operation type "cancel" and the task id. This case brings some overhead to the communication channel.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Personally, I think that the case with 2-way requests is better, but I'm open to any other ideas.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Aleksandr,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Filtering logic for OP_CLUSTER_GROUP_GET_NODE_IDS looks overcomplicated. Do we need server-side filtering at all? Wouldn't it be better to send basic info (ids, order, flags) for all nodes (it is a relatively small amount of data) and extended info (attributes) for a selected list of nodes? In this case, we can do basic node filtration on the client side (forClients(), forServers(), forNodeIds(), forOthers(), etc).
> > > > > > > > > > > > >
> > > > > > > > > > > > > Do you use standard ClusterNode serialization? Metrics are also serialized with ClusterNode; do we need them on the thin client? There are other interfaces to show metrics, so I think it's redundant to export metrics to thin clients too.
> > > > > > > > > > > > >
> > > > > > > > > > > > > What do you think?
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Fri, Nov 22, 2019 at 8:15 PM Aleksandr Shapkin <lexw...@gmail.com> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Alex,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I think you can create a new IEP page and I will fill it with the Cluster API details.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > In short, I've introduced several new codes:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Cluster API is pretty straightforward:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > OP_CLUSTER_IS_ACTIVE = 5000
> > > > > > > > > > > > > > OP_CLUSTER_CHANGE_STATE = 5001
> > > > > > > > > > > > > > OP_CLUSTER_CHANGE_WAL_STATE = 5002
> > > > > > > > > > > > > > OP_CLUSTER_GET_WAL_STATE = 5003
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Cluster group codes:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > OP_CLUSTER_GROUP_GET_NODE_IDS = 5100
> > > > > > > > > > > > > > OP_CLUSTER_GROUP_GET_NODE_INFO = 5101
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > The underlying implementation is based on the thick client logic.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > For every request, we provide a known topology version, and if it has changed, a client first updates it and then re-sends the filtering request.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Alongside the topVer a client sends a serialized nodes projection object that could be considered as a code to value mapping.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Consider: [{Code = 1, Value = ["DotNet", "MyAttribute"]}, {Code = 2, Value = 1}]
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Where "1" stands for Attribute filtering and "2" for the serverNodesOnly flag.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > As a result of request processing, a server sends nodeId UUIDs and a current topVer.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > When a client obtains nodeIds, it can perform a NODE_INFO call to get a serialized ClusterNode object. In addition there should be a different API method for accessing/updating node metrics.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Thu, Nov 21, 2019 at 12:32 PM Sergey Kozlov <skoz...@gridgain.com> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hi Pavel
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Thu, Nov 21, 2019 at 11:30 AM Pavel Tupitsyn <ptupit...@apache.org> wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 1. I believe that Cluster operations for Thin Client protocol are already in the works by Alexandr Shapkin. Can't find the ticket though. Alexandr, can you please confirm and attach the ticket number?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 2. Proposed changes will work only for Java tasks that are already deployed on server nodes. This is mostly useless for other thin clients we have (Python, PHP, .NET, C++).
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I don't think so. Task execution is a way to implement one's own layer for the thin client application.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > We should think of a way to make this useful for all clients. For example, we may allow sending tasks in some scripting language like Javascript. Thoughts?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Arbitrary code execution from a remote client must be protected from malicious code. I don't know how it could be designed, but without that we open a hole that lets someone kill the cluster.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Thu, Nov 21, 2019 at 11:21 AM Sergey Kozlov <skoz...@gridgain.com> wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Hi Alex
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > The idea is great. But I have some concerns that probably should be taken into account for design:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > 1. We need to have the ability to stop a task execution, something like an OP_COMPUTE_CANCEL_TASK operation (client to server)
> > > > > > > > > > > > > > > > > 2. What about a task execution timeout? It may help the cluster survive buggy tasks
> > > > > > > > > > > > > > > > > 3. Ignite doesn't have roles/authorization functionality for now. But a task is a risky operation for the cluster (for security reasons). Could we add new options to the Ignite configuration:
> > > > > > > > > > > > > > > > >    - Explicitly enabling compute task support for the thin protocol (disabled by default) for the whole cluster
> > > > > > > > > > > > > > > > >    - Explicitly enabling compute task support for a node
> > > > > > > > > > > > > > > > >    - The list of task names (classes) allowed to be executed by a thin client.
> > > > > > > > > > > > > > > > > 4. Support labeling for tasks, which may help to investigate issues on the cluster (the idea from IEP-34 [1])
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > [1] https://cwiki.apache.org/confluence/display/IGNITE/IEP-34+Thin+client%3A+transactions+support
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Thu, Nov 21, 2019 at 10:58 AM Alex Plehanov <plehanov.a...@gmail.com> wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Hello, Igniters!
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > I have plans to start implementation of the Compute interface for the Ignite thin client and want to discuss features that should be implemented.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > We already have a Compute implementation for binary-rest clients (GridClientCompute), which has the following functionality:
> > > > > > > > > > > > > > > > > > - Filtering cluster nodes (projection) for compute
> > > > > > > > > > > > > > > > > > - Executing task by the name
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > I think we can implement this functionality in a thin client as well.
> First of all, we need some operation types to request a list of all
> available nodes and probably node attributes (by a list of nodes). Node
> attributes will be helpful if we decide to implement an analog of the
> ClusterGroup#forAttribute or ClusterGroup#forPredicate methods in the thin
> client. Perhaps they can be requested lazily.
>
> From the protocol point of view there will be two new operations:
>
> OP_CLUSTER_GET_NODES
> Request: empty
> Response: long topologyVersion, int minorTopologyVersion, int nodesCount,
> for each node: set of node fields (UUID nodeId, Object or String
> consistentId, long order, etc)
>
> OP_CLUSTER_GET_NODE_ATTRIBUTES
> Request: int nodesCount, for each node: UUID nodeId
> Response: int nodesCount, for each node: int attributesCount, for each
> node attribute: String name, Object value
>
> To execute tasks we need something like these methods in the client API:
> Object execute(String task, Object arg)
> Future<Object> executeAsync(String task, Object arg)
> Object affinityExecute(String task, String cache, Object key, Object arg)
> Future<Object> affinityExecuteAsync(String task, String cache, Object key,
> Object arg)
>
> Which can be mapped to protocol operations:
>
> OP_COMPUTE_EXECUTE_TASK
> Request: UUID nodeId, String taskName, Object arg
> Response: Object result
>
> OP_COMPUTE_EXECUTE_TASK_AFFINITY
> Request: String cacheName, Object key, String taskName, Object arg
> Response: Object result
>
> The second operation is needed because we sometimes can't calculate and
> connect to the affinity node on the client side (affinity awareness can be
> disabled, a custom affinity function can be used, or there can be no
> connection between the client and the affinity node), but we can make a
> best effort to send the request to the target node if affinity awareness
> is enabled.
>
> Currently, on the server side requests are always processed synchronously
> and responses are sent right after the request was processed. To execute
> long tasks asynchronously we should either change this logic or introduce
> some kind of two-way communication between client and server (now only
> one-way requests from client to server are allowed).
>
> Two-way communication can also be useful in the future if we send some
> server-side generated events to clients.
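
To make the OP_CLUSTER_GET_NODES response layout proposed above more concrete, here is a minimal Java sketch of encoding and decoding it. It is only an illustration, not the actual Ignite binary protocol: the class and method names are hypothetical, the UUID is assumed to travel as two longs, and only nodeId and order are read per node (consistentId and the other fields behind "etc" are left out).

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

/** Hypothetical sketch of the proposed OP_CLUSTER_GET_NODES response; not the real wire format. */
public class ClusterNodesResponseSketch {
    /** Subset of the per-node fields from the proposal (consistentId and others omitted). */
    public static final class NodeInfo {
        public final UUID nodeId;
        public final long order;

        public NodeInfo(UUID nodeId, long order) {
            this.nodeId = nodeId;
            this.order = order;
        }
    }

    /** Decoded response: topology version pair plus the node list. */
    public static final class NodesResponse {
        public final long topologyVersion;
        public final int minorTopologyVersion;
        public final List<NodeInfo> nodes;

        public NodesResponse(long topVer, int minorTopVer, List<NodeInfo> nodes) {
            this.topologyVersion = topVer;
            this.minorTopologyVersion = minorTopVer;
            this.nodes = nodes;
        }
    }

    /** Writes: long topologyVersion, int minorTopologyVersion, int nodesCount, then each node. */
    public static ByteBuffer encode(long topVer, int minorTopVer, List<NodeInfo> nodes) {
        // Header is 8 + 4 + 4 bytes; each node is 16 bytes of UUID (assumed) + 8 bytes of order.
        ByteBuffer buf = ByteBuffer.allocate(16 + nodes.size() * 24);
        buf.putLong(topVer).putInt(minorTopVer).putInt(nodes.size());
        for (NodeInfo n : nodes) {
            buf.putLong(n.nodeId.getMostSignificantBits());
            buf.putLong(n.nodeId.getLeastSignificantBits());
            buf.putLong(n.order);
        }
        buf.flip();
        return buf;
    }

    /** Reads the fields back in the same order they were written. */
    public static NodesResponse decode(ByteBuffer buf) {
        long topVer = buf.getLong();
        int minorTopVer = buf.getInt();
        int nodesCount = buf.getInt();
        List<NodeInfo> nodes = new ArrayList<>(nodesCount);
        for (int i = 0; i < nodesCount; i++) {
            UUID id = new UUID(buf.getLong(), buf.getLong());
            nodes.add(new NodeInfo(id, buf.getLong()));
        }
        return new NodesResponse(topVer, minorTopVer, nodes);
    }
}
```

Sending the topology version first would let a client skip re-reading the node list when the version it already holds has not changed, although the proposal does not say whether that is the intent.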
>
> In case of two-way communication there can be new operations introduced:
>
> OP_COMPUTE_EXECUTE_TASK (from client to server)
> Request: UUID nodeId, String taskName, Object arg
> Response: long taskId
>
> OP_COMPUTE_TASK_FINISHED (from server to client)
> Request: taskId, Object result
> Response: empty
>
> The same for affinity requests.
>
> Also, we can implement not only the execute task operation, but some other
> operations from IgniteCompute (broadcast, run, call), but they will be
> useful only for the Java thin client. And even with the Java thin client
> we should either implement peer-class-loading for thin clients (this also
> requires two-way client-server communication) or put classes with executed
> closures on the server locally.
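
The taskId flow described above could be handled on the client side roughly as follows. This is a hedged sketch, not existing thin client code: the class and method names are hypothetical, and it assumes the server always sends the OP_COMPUTE_EXECUTE_TASK response (carrying the taskId) before the matching OP_COMPUTE_TASK_FINISHED notification on the same connection.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical client-side registry that matches server notifications to pending tasks. */
public class TaskFutureRegistrySketch {
    /** Futures for tasks that were started but have not reported a result yet. */
    private final Map<Long, CompletableFuture<Object>> pending = new ConcurrentHashMap<>();

    /** Call when the OP_COMPUTE_EXECUTE_TASK response with the taskId has been read. */
    public CompletableFuture<Object> onTaskStarted(long taskId) {
        CompletableFuture<Object> fut = new CompletableFuture<>();
        CompletableFuture<Object> prev = pending.putIfAbsent(taskId, fut);
        // If the same taskId was already registered, hand back the existing future.
        return prev != null ? prev : fut;
    }

    /**
     * Call when an OP_COMPUTE_TASK_FINISHED notification arrives from the server.
     * Returns false if no task with this id is pending (unknown or already finished).
     */
    public boolean onTaskFinished(long taskId, Object result) {
        CompletableFuture<Object> fut = pending.remove(taskId);
        if (fut == null)
            return false;
        fut.complete(result);
        return true;
    }

    /** Number of tasks still waiting for a notification. */
    public int pendingCount() {
        return pending.size();
    }
}
```

An executeAsync implementation could return the future from onTaskStarted directly, while a blocking execute could simply wait on it. Removing the entry on completion keeps the map from growing with finished tasks.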
>
> What do you think about the proposed protocol changes?
> Do we need two-way requests between client and server?
> Do we need support of compute methods other than "execute task"?
> What do you think about peer-class-loading for thin clients?
>
> --
> Sergey Kozlov
> GridGain Systems
> www.gridgain.com
>
> --
> Alex.