Pavel, in this case, we will mix entities from different layers (transport layer and request body), which is not very good. We can achieve the same behavior with a task id generated on the client side, but there will be no inter-layer data intersection, and I think it will be easier to implement on both the client and the server side. But we still can't be sure that the task is successfully started on a server. We won't ever know about topology change, because the "topology changed" flag will be sent from server to client only with the response, when the task is completed. Do we accept that?

On Mon, Nov 25, 2019 at 19:07, Pavel Tupitsyn <[hidden email]> wrote:

> Alex,
>
> I have a simpler idea. We already do request id handling in the protocol, so:
> - Client sends a normal request to execute compute task. Request ID is generated as usual.
> - As soon as task is completed, a response is received.
>
> As for cancellation - client can send a new request (with new request ID) and (in the body) pass the request ID from above as a task identifier. As a result, there are two responses:
> - Cancellation response
> - Task response (with proper cancelled status)
>
> That's it, no need to modify the core of the protocol. One request - one response.
>
> On Mon, Nov 25, 2019 at 6:20 PM Alex Plehanov <[hidden email]> wrote:
>
> > Pavel, we need to inform the client when the task is completed, and we need the ability to cancel the task. I see several ways to implement this:
> >
> > 1. Client sends a request to the server to start a task, server returns a task id in the response. Server notifies the client when the task is completed with a new request (from server to client). Client can cancel the task by sending a new request with operation type "cancel" and the task id. In this case, we should implement two-way requests.
> > 2. Client generates a unique task id and sends a request to the server to start a task; server doesn't reply immediately but waits until the task is completed. Client can cancel the task by sending a new request with operation type "cancel" and the task id. In this case, we should decouple request and response on the server side (currently the response is sent right after the request was processed). Also, we can't be sure that the task is successfully started on a server.
> > 3. Client sends a request to the server to start a task, server returns an id in the response. Client periodically asks the server about the task status. Client can cancel the task by sending a new request with operation type "cancel" and the task id. This case brings some overhead to the communication channel.
> >
> > Personally, I think that the case with two-way requests is better, but I'm open to any other ideas.
> >
> > Aleksandr,
> >
> > Filtering logic for OP_CLUSTER_GROUP_GET_NODE_IDS looks overcomplicated. Do we need server-side filtering at all? Wouldn't it be better to send basic info (ids, order, flags) for all nodes (this is a relatively small amount of data) and extended info (attributes) for a selected list of nodes? In this case, we can do basic node filtration on the client side (forClients(), forServers(), forNodeIds(), forOthers(), etc).
> >
> > Do you use standard ClusterNode serialization? There are also metrics serialized with ClusterNode; do we need them on the thin client? Other interfaces exist to show metrics, so I think it's redundant to export metrics to thin clients too.
> >
> > What do you think?
> >
> > On Fri, Nov 22, 2019 at 20:15, Aleksandr Shapkin <[hidden email]> wrote:
> >
> > > Alex,
> > >
> > > I think you can create a new IEP page and I will fill it with the Cluster API details.
> > >
> > > In short, I've introduced several new codes:
> > >
> > > Cluster API is pretty straightforward:
> > >
> > > OP_CLUSTER_IS_ACTIVE = 5000
> > > OP_CLUSTER_CHANGE_STATE = 5001
> > > OP_CLUSTER_CHANGE_WAL_STATE = 5002
> > > OP_CLUSTER_GET_WAL_STATE = 5003
> > >
> > > Cluster group codes:
> > >
> > > OP_CLUSTER_GROUP_GET_NODE_IDS = 5100
> > > OP_CLUSTER_GROUP_GET_NODE_INFO = 5101
> > >
> > > The underlying implementation is based on the thick client logic.
> > >
> > > For every request, we provide a known topology version, and if it has changed, a client first updates it and then re-sends the filtering request.
> > >
> > > Alongside the topVer, a client sends a serialized nodes projection object that could be considered as a code-to-value mapping.
> > > Consider: [{Code = 1, Value = ["DotNet", "MyAttribute"]}, {Code = 2, Value = 1}]
> > > Where "1" stands for attribute filtering and "2" for the serverNodesOnly flag.
> > >
> > > As a result of request processing, a server sends nodeId UUIDs and the current topVer.
> > >
> > > When a client obtains nodeIds, it can perform a NODE_INFO call to get a serialized ClusterNode object. In addition, there should be a different API method for accessing/updating node metrics.
> > >
> > > On Thu, Nov 21, 2019 at 12:32, Sergey Kozlov <[hidden email]> wrote:
> > >
> > > > Hi Pavel
> > > >
> > > > On Thu, Nov 21, 2019 at 11:30 AM Pavel Tupitsyn <[hidden email]> wrote:
> > > >
> > > > > 1. I believe that Cluster operations for Thin Client protocol are already in the works by Alexandr Shapkin. Can't find the ticket though.
> > > > > Alexandr, can you please confirm and attach the ticket number?
> > > > >
> > > > > 2. Proposed changes will work only for Java tasks that are already deployed on server nodes.
> > > > > This is mostly useless for other thin clients we have (Python, PHP, .NET, C++).
> > > >
> > > > I don't think so. The task (execution) is a way to implement your own layer for the thin client application.
> > > >
> > > > > We should think of a way to make this useful for all clients.
> > > > > For example, we may allow sending tasks in some scripting language like Javascript.
> > > > > Thoughts?
> > > >
> > > > Arbitrary code execution from a remote client must be protected from malicious code.
> > > > I don't know how it could be designed, but without that we open a hole that can be used to kill the cluster.
> > > >
> > > > > On Thu, Nov 21, 2019 at 11:21 AM Sergey Kozlov <[hidden email]> wrote:
> > > > >
> > > > > > Hi Alex
> > > > > >
> > > > > > The idea is great. But I have some concerns that probably should be taken into account for the design:
> > > > > >
> > > > > > 1. We need to have the ability to stop a task execution, something like an OP_COMPUTE_CANCEL_TASK operation (client to server)
> > > > > > 2. What about a task execution timeout? It may help the cluster survive buggy tasks
> > > > > > 3. Ignite doesn't have roles/authorization functionality for now. But a task is a risky operation for the cluster (for security reasons). Could we add new options to the Ignite configuration:
> > > > > >    - Explicitly turning on compute task support for the thin protocol (disabled by default) for the whole cluster
> > > > > >    - Explicitly turning on compute task support for a node
> > > > > >    - The list of task names (classes) allowed to be executed by a thin client.
> > > > > > 4. Support labeling for tasks, which may help to investigate issues on the cluster (the idea from IEP-34 [1])
> > > > > >
> > > > > > [1] https://cwiki.apache.org/confluence/display/IGNITE/IEP-34+Thin+client%3A+transactions+support
> > > > > >
> > > > > > On Thu, Nov 21, 2019 at 10:58 AM Alex Plehanov <[hidden email]> wrote:
> > > > > >
> > > > > > > Hello, Igniters!
> > > > > > >
> > > > > > > I have plans to start implementation of the Compute interface for the Ignite thin client and want to discuss features that should be implemented.
> > > > > > >
> > > > > > > We already have a Compute implementation for binary-rest clients (GridClientCompute), which has the following functionality:
> > > > > > > - Filtering cluster nodes (projection) for compute
> > > > > > > - Executing task by the name
> > > > > > >
> > > > > > > I think we can implement this functionality in a thin client as well.
> > > > > > >
> > > > > > > First of all, we need some operation types to request a list of all available nodes and probably node attributes (by a list of nodes). Node attributes will be helpful if we decide to implement an analog of the ClusterGroup#forAttribute or ClusterGroup#forPredicate methods in the thin client. Perhaps they can be requested lazily.
> > > > > > >
> > > > > > > From the protocol point of view there will be two new operations:
> > > > > > >
> > > > > > > OP_CLUSTER_GET_NODES
> > > > > > > Request: empty
> > > > > > > Response: long topologyVersion, int minorTopologyVersion, int nodesCount, for each node set of node fields (UUID nodeId, Object or String consistentId, long order, etc)
> > > > > > >
> > > > > > > OP_CLUSTER_GET_NODE_ATTRIBUTES
> > > > > > > Request: int nodesCount, for each node: UUID nodeId
> > > > > > > Response: int nodesCount, for each node: int attributesCount, for each node attribute: String name, Object value
> > > > > > >
> > > > > > > To execute tasks we need something like these methods in the client API:
> > > > > > > Object execute(String task, Object arg)
> > > > > > > Future<Object> executeAsync(String task, Object arg)
> > > > > > > Object affinityExecute(String task, String cache, Object key, Object arg)
> > > > > > > Future<Object> affinityExecuteAsync(String task, String cache, Object key, Object arg)
> > > > > > >
> > > > > > > Which can be mapped to protocol operations:
> > > > > > >
> > > > > > > OP_COMPUTE_EXECUTE_TASK
> > > > > > > Request: UUID nodeId, String taskName, Object arg
> > > > > > > Response: Object result
> > > > > > >
> > > > > > > OP_COMPUTE_EXECUTE_TASK_AFFINITY
> > > > > > > Request: String cacheName, Object key, String taskName, Object arg
> > > > > > > Response: Object result
> > > > > > >
> > > > > > > The second operation is needed because we sometimes can't calculate and connect to the affinity node on the client side (affinity awareness can be disabled, a custom affinity function can be used, or there can be no connection between the client and the affinity node), but we can make a best effort to send the request to the target node if affinity awareness is enabled.
> > > > > > >
> > > > > > > Currently, on the server side, requests are always processed synchronously and responses are sent right after the request was processed. To execute long tasks asynchronously we should either change this logic or introduce some kind of two-way communication between client and server (now only one-way requests from client to server are allowed).
> > > > > > >
> > > > > > > Two-way communication can also be useful in the future if we send some server-side generated events to clients.
> > > > > > >
> > > > > > > In case of two-way communication there can be new operations introduced:
> > > > > > >
> > > > > > > OP_COMPUTE_EXECUTE_TASK (from client to server)
> > > > > > > Request: UUID nodeId, String taskName, Object arg
> > > > > > > Response: long taskId
> > > > > > >
> > > > > > > OP_COMPUTE_TASK_FINISHED (from server to client)
> > > > > > > Request: taskId, Object result
> > > > > > > Response: empty
> > > > > > >
> > > > > > > The same for affinity requests.
> > > > > > >
> > > > > > > Also, we can implement not only the execute task operation, but some other operations from IgniteCompute (broadcast, run, call), but it will be useful only for the Java thin client. And even with the Java thin client we should either implement peer-class-loading for thin clients (this also requires two-way client-server communication) or put classes with executed closures on the server locally.
> > > > > > >
> > > > > > > What do you think about the proposed protocol changes?
> > > > > > > Do we need two-way requests between client and server?
> > > > > > > Do we need support of compute methods other than "execute task"?
> > > > > > > What do you think about peer-class-loading for thin clients?
> > > > > >
> > > > > > --
> > > > > > Sergey Kozlov
> > > > > > GridGain Systems
> > > > > > www.gridgain.com
> > > >
> > > > --
> > > > Sergey Kozlov
> > > > GridGain Systems
> > > > www.gridgain.com
> > >
> > > --
> > > Alex.

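For illustration only, here is a minimal sketch of the client-side filtering idea raised in the quoted thread: the client receives basic info for all nodes and filters locally. The NodeInfo class, its fields, and all method signatures are hypothetical and are not an existing Ignite thin-client API; only the method names forServers, forClients, forNodeIds and forOthers come from the discussion.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.UUID;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class ClientSideNodeFilter {
    /** Hypothetical "basic info" for one node; the field set is an assumption. */
    public static final class NodeInfo {
        public final UUID id;
        public final long order;
        public final boolean client;

        public NodeInfo(UUID id, long order, boolean client) {
            this.id = id;
            this.order = order;
            this.client = client;
        }
    }

    /** Shared helper: keeps the nodes that match the predicate. */
    private static List<NodeInfo> filter(Collection<NodeInfo> nodes, Predicate<NodeInfo> p) {
        return nodes.stream().filter(p).collect(Collectors.toList());
    }

    /** Server nodes only. */
    public static List<NodeInfo> forServers(Collection<NodeInfo> nodes) {
        return filter(nodes, n -> !n.client);
    }

    /** Client nodes only. */
    public static List<NodeInfo> forClients(Collection<NodeInfo> nodes) {
        return filter(nodes, n -> n.client);
    }

    /** Nodes whose id is in the given set. */
    public static List<NodeInfo> forNodeIds(Collection<NodeInfo> nodes, Set<UUID> ids) {
        return filter(nodes, n -> ids.contains(n.id));
    }

    /** All nodes except the one with the given id. */
    public static List<NodeInfo> forOthers(Collection<NodeInfo> nodes, UUID excluded) {
        return filter(nodes, n -> !n.id.equals(excluded));
    }

    public static void main(String[] args) {
        UUID a = UUID.randomUUID();
        List<NodeInfo> nodes = Arrays.asList(
            new NodeInfo(a, 1, false),
            new NodeInfo(UUID.randomUUID(), 2, true));
        System.out.println("servers: " + forServers(nodes).size());
        System.out.println("others than a: " + forOthers(nodes, a).size());
    }
}
```

With this shape, attribute-based filtering would still need the extended info to be fetched for the selected nodes, which the sketch leaves out.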
Alex,

> we will mix entities from different layers (transport layer and request body)
I would not call our message header (which includes the id) "transport layer". TCP is our transport layer. And it is fine to use request ID to identify compute tasks (as we do with query cursors).

> we still can't be sure that the task is successfully started on a server
The request to start the task will fail and we'll get a response indicating that right away.

> we won't ever know about topology change
Looks like I'm missing something - how is topology change relevant to executing compute tasks from client?

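A rough sketch of the request-ID-based flow proposed by Pavel (one request, one response, with cancellation sent as a second request that carries the first request's ID in its body). The class, method, and payload names are hypothetical, and the network is replaced by direct calls to onResponse; this is not the actual thin-client implementation.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

public class RequestIdTaskTracker {
    /** A request that is waiting for its single response. */
    public static final class Pending {
        public final long requestId;
        /** For cancel requests: the request ID of the task to cancel (sent in the body); otherwise null. */
        public final Long targetRequestId;
        public final CompletableFuture<String> response = new CompletableFuture<>();

        Pending(long requestId, Long targetRequestId) {
            this.requestId = requestId;
            this.targetRequestId = targetRequestId;
        }
    }

    private final AtomicLong nextRequestId = new AtomicLong();
    private final Map<Long, Pending> pending = new ConcurrentHashMap<>();

    private Pending register(Long targetRequestId) {
        Pending p = new Pending(nextRequestId.incrementAndGet(), targetRequestId);
        pending.put(p.requestId, p);
        return p;
    }

    /** Stand-in for sending an "execute task" request; the request ID doubles as the task identifier. */
    public Pending sendExecuteTask(String taskName) {
        return register(null);
    }

    /** Stand-in for sending a "cancel" request: new request ID, original request ID in the body. */
    public Pending sendCancel(long taskRequestId) {
        return register(taskRequestId);
    }

    /** Called when a response arrives; completes the request with the matching ID. */
    public void onResponse(long requestId, String payload) {
        Pending p = pending.remove(requestId);
        if (p != null)
            p.response.complete(payload);
    }

    public int pendingCount() {
        return pending.size();
    }

    public static void main(String[] args) {
        RequestIdTaskTracker tracker = new RequestIdTaskTracker();
        Pending task = tracker.sendExecuteTask("myTask");
        Pending cancel = tracker.sendCancel(task.requestId);
        // Two responses, as described in the message: one per request.
        tracker.onResponse(cancel.requestId, "CANCEL_ACCEPTED");
        tracker.onResponse(task.requestId, "TASK_CANCELLED");
        System.out.println(cancel.response.join() + " / " + task.response.join());
    }
}
```

The point of the sketch is that the client only needs a map from request ID to a pending future; whether the server can hold a response open until the task completes is the part that the thread is still debating.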
> And it is fine to use request ID to identify compute tasks (as we do with query cursors).
I can't see any usage of the request id in query cursors. We send a query request and get a cursor id in the response. After that, we only use the cursor id (to get the next pages and to close the resource). Did I miss something?

> Looks like I'm missing something - how is topology change relevant to executing compute tasks from client?
It's not directly relevant. But there are some cases where it will be helpful. For example, if a client sends long-running tasks to nodes and wants to do load balancing, it will detect a topology change only some time later, with the first response, so load balancing will not work. Perhaps we can add an optional "topology version" field to the OP_COMPUTE_EXECUTE_TASK request to solve this problem.
I believe that Cluster operations for Thin Client protocol > are > > > > > already > > > > > > > in the works > > > > > > > by Alexandr Shapkin. Can't find the ticket though. > > > > > > > Alexandr, can you please confirm and attach the ticket number? > > > > > > > > > > > > > > 2. Proposed changes will work only for Java tasks that are > > already > > > > > > deployed > > > > > > > on server nodes. > > > > > > > This is mostly useless for other thin clients we have (Python, > > PHP, > > > > > .NET, > > > > > > > C++). > > > > > > > > > > > > > > > > > > > I don't guess so. The task (execution) is a way to implement own > > > layer > > > > > for > > > > > > the thin client application. > > > > > > > > > > > > > > > > > > > We should think of a way to make this useful for all clients. > > > > > > > For example, we may allow sending tasks in some scripting > > language > > > > like > > > > > > > Javascript. > > > > > > > Thoughts? > > > > > > > > > > > > > > > > > > > The arbitrary code execution from a remote client must be > protected > > > > > > from malicious code. > > > > > > I don't know how it could be designed but without that we open > the > > > hole > > > > > to > > > > > > kill cluster. > > > > > > > > > > > > > > > > > > > > > > > > > > On Thu, Nov 21, 2019 at 11:21 AM Sergey Kozlov < > > > [hidden email] > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > Hi Alex > > > > > > > > > > > > > > > > The idea is great. But I have some concerns that probably > > should > > > be > > > > > > taken > > > > > > > > into account for design: > > > > > > > > > > > > > > > > 1. We need to have the ability to stop a task execution, > > smth > > > > like > > > > > > > > OP_COMPUTE_CANCEL_TASK operation (client to server) > > > > > > > > 2. What's about task execution timeout? It may help to the > > > > cluster > > > > > > > > survival for buggy tasks > > > > > > > > 3. Ignite doesn't have roles/authorization functionality > for > > > > now. 
> > > > > > But > > > > > > > a > > > > > > > > task is the risky operation for cluster (for security > > > reasons). > > > > > > Could > > > > > > > we > > > > > > > > add for Ignite configuration new options: > > > > > > > > - Explicit turning on for compute task support for thin > > > > > protocol > > > > > > > > (disabled by default) for whole cluster > > > > > > > > - Explicit turning on for compute task support for a > node > > > > > > > > - The list of task names (classes) allowed to execute > by > > > thin > > > > > > > client. > > > > > > > > 4. Support the labeling for task that may help to > > investigate > > > > > issues > > > > > > > on > > > > > > > > cluster (the idea from IEP-34 [1]) > > > > > > > > > > > > > > > > 1. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/IGNITE/IEP-34+Thin+client%3A+transactions+support > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Thu, Nov 21, 2019 at 10:58 AM Alex Plehanov < > > > > > > [hidden email]> > > > > > > > > wrote: > > > > > > > > > > > > > > > > > Hello, Igniters! > > > > > > > > > > > > > > > > > > I have plans to start implementation of Compute interface > for > > > > > Ignite > > > > > > > thin > > > > > > > > > client and want to discuss features that should be > > implemented. > > > > > > > > > > > > > > > > > > We already have Compute implementation for binary-rest > > clients > > > > > > > > > (GridClientCompute), which have the following > functionality: > > > > > > > > > - Filtering cluster nodes (projection) for compute > > > > > > > > > - Executing task by the name > > > > > > > > > > > > > > > > > > I think we can implement this functionality in a thin > client > > as > > > > > well. > > > > > > > > > > > > > > > > > > First of all, we need some operation types to request a > list > > of > > > > all > > > > > > > > > available nodes and probably node attributes (by a list of > > > > nodes). 
> > > > > > Node > > > > > > > > > attributes will be helpful if we will decide to implement > > > analog > > > > of > > > > > > > > > ClusterGroup#forAttribute or ClusterGroup#forePredicate > > methods > > > > in > > > > > > the > > > > > > > > thin > > > > > > > > > client. Perhaps they can be requested lazily. > > > > > > > > > > > > > > > > > > From the protocol point of view there will be two new > > > operations: > > > > > > > > > > > > > > > > > > OP_CLUSTER_GET_NODES > > > > > > > > > Request: empty > > > > > > > > > Response: long topologyVersion, int minorTopologyVersion, > int > > > > > > > nodesCount, > > > > > > > > > for each node set of node fields (UUID nodeId, Object or > > String > > > > > > > > > consistentId, long order, etc) > > > > > > > > > > > > > > > > > > OP_CLUSTER_GET_NODE_ATTRIBUTES > > > > > > > > > Request: int nodesCount, for each node: UUID nodeId > > > > > > > > > Response: int nodesCount, for each node: int > attributesCount, > > > for > > > > > > each > > > > > > > > node > > > > > > > > > attribute: String name, Object value > > > > > > > > > > > > > > > > > > To execute tasks we need something like these methods in > the > > > > client > > > > > > > API: > > > > > > > > > Object execute(String task, Object arg) > > > > > > > > > Future<Object> executeAsync(String task, Object arg) > > > > > > > > > Object affinityExecute(String task, String cache, Object > key, > > > > > Object > > > > > > > arg) > > > > > > > > > Future<Object> affinityExecuteAsync(String task, String > > cache, > > > > > Object > > > > > > > > key, > > > > > > > > > Object arg) > > > > > > > > > > > > > > > > > > Which can be mapped to protocol operations: > > > > > > > > > > > > > > > > > > OP_COMPUTE_EXECUTE_TASK > > > > > > > > > Request: UUID nodeId, String taskName, Object arg > > > > > > > > > Response: Object result > > > > > > > > > > > > > > > > > > OP_COMPUTE_EXECUTE_TASK_AFFINITY > > > > > > > > > Request: String cacheName, Object key, String 
taskName, > > Object > > > > arg > > > > > > > > > Response: Object result > > > > > > > > > > > > > > > > > > The second operation is needed because we sometimes can't > > > > calculate > > > > > > and > > > > > > > > > connect to affinity node on the client-side (affinity > > awareness > > > > can > > > > > > be > > > > > > > > > disabled, custom affinity function can be used or there can > > be > > > no > > > > > > > > > connection between client and affinity node), but we can > make > > > > best > > > > > > > effort > > > > > > > > > to send request to target node if affinity awareness is > > > enabled. > > > > > > > > > > > > > > > > > > Currently, on the server-side requests always processed > > > > > synchronously > > > > > > > and > > > > > > > > > responses are sent right after request was processed. To > > > execute > > > > > long > > > > > > > > tasks > > > > > > > > > async we should whether change this logic or introduce some > > > kind > > > > > > > two-way > > > > > > > > > communication between client and server (now only one-way > > > > requests > > > > > > from > > > > > > > > > client to server are allowed). > > > > > > > > > > > > > > > > > > Two-way communication can also be useful in the future if > we > > > will > > > > > > send > > > > > > > > some > > > > > > > > > server-side generated events to clients. > > > > > > > > > > > > > > > > > > In case of two-way communication there can be new > operations > > > > > > > introduced: > > > > > > > > > > > > > > > > > > OP_COMPUTE_EXECUTE_TASK (from client to server) > > > > > > > > > Request: UUID nodeId, String taskName, Object arg > > > > > > > > > Response: long taskId > > > > > > > > > > > > > > > > > > OP_COMPUTE_TASK_FINISHED (from server to client) > > > > > > > > > Request: taskId, Object result > > > > > > > > > Response: empty > > > > > > > > > > > > > > > > > > The same for affinity requests. 
> > > > > > > > > > > > > > > > > > Also, we can implement not only execute task operation, but > > > some > > > > > > other > > > > > > > > > operations from IgniteCompute (broadcast, run, call), but > it > > > will > > > > > be > > > > > > > > useful > > > > > > > > > only for java thin client. And even with java thin client > we > > > > should > > > > > > > > whether > > > > > > > > > implement peer-class-loading for thin clients (this also > > > requires > > > > > > > two-way > > > > > > > > > client-server communication) or put classes with executed > > > > closures > > > > > to > > > > > > > the > > > > > > > > > server locally. > > > > > > > > > > > > > > > > > > What do you think about proposed protocol changes? > > > > > > > > > Do we need two-way requests between client and server? > > > > > > > > > Do we need support of compute methods other than "execute > > > task"? > > > > > > > > > What do you think about peer-class-loading for thin > clients? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > Sergey Kozlov > > > > > > > > GridGain Systems > > > > > > > > www.gridgain.com > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > Sergey Kozlov > > > > > > GridGain Systems > > > > > > www.gridgain.com > > > > > > > > > > > > > > > > > > > > > -- > > > > > Alex. > > > > > > > > > > > > > > > |
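A rough sketch of how the optional "topology version" field suggested above for OP_COMPUTE_EXECUTE_TASK might be carried in a request body. Everything here is an assumption made for illustration: the class name, the one-byte presence flag, and the field order are hypothetical and are not the actual Ignite thin client wire format. The idea is only that the client may attach the topology version it knows, and the server can tell right away whether that view is stale instead of waiting for the task response.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical request body layout; NOT the real Ignite protocol. */
final class ExecuteTaskRequestSketch {
    static final byte NO_TOP_VER = 0;   // client did not send a topology version
    static final byte HAS_TOP_VER = 1;  // an 8-byte topology version follows

    /** Encodes the task name plus an optional client-known topology version (null = omitted). */
    static byte[] encode(String taskName, Long knownTopVer) {
        byte[] name = taskName.getBytes(StandardCharsets.UTF_8);
        int size = 4 + name.length + 1 + (knownTopVer == null ? 0 : 8);
        ByteBuffer buf = ByteBuffer.allocate(size);
        buf.putInt(name.length).put(name);
        if (knownTopVer == null)
            buf.put(NO_TOP_VER);
        else
            buf.put(HAS_TOP_VER).putLong(knownTopVer);
        return buf.array();
    }

    /** Server side: true only if the client sent a version older than the server's current one. */
    static boolean clientTopologyIsStale(byte[] body, long serverTopVer) {
        ByteBuffer buf = ByteBuffer.wrap(body);
        int nameLen = buf.getInt();
        buf.position(buf.position() + nameLen); // skip the task name
        if (buf.get() == NO_TOP_VER)
            return false; // field omitted, nothing to compare
        return buf.getLong() < serverTopVer;
    }
}
```

With a layout like this, clients that do not care about topology simply omit the field, so the change stays optional.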
> I can't see any usage of request id in query cursors
You are right, the cursor ID is a separate thing. Anyway, my point stands.

> client sends long term tasks to nodes and wants to do it with load balancing

I still don't get it. Can you please provide an equivalent use case with the existing "thick" client?
> Anyway, my point stands.
I can't agree. Why don't you want to use a task id for this? After all, we don't cancel the request (the request has already been processed), we cancel the task. So it's more convenient to use the task id here.

> Can you please provide equivalent use case with existing "thick" client?

For example:
The cluster consists of one server node.
The client uses some cluster group filtering (for example, the forServers() cluster group).
The client starts sending long-running tasks (for example, 1 hour long) to the cluster periodically (for example, one per minute).
Meanwhile, several server nodes join the cluster.

In the case of a thick client: all server nodes will be used and tasks will be load balanced.
In the case of a thin client: only one server node will be used, and the client will detect the topology change only after an hour.
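To make the thin-client half of this scenario concrete, here is a minimal sketch. It assumes the client caches the node list at connect time and refreshes it only when a task response arrives; that is the behavior being argued about, not confirmed Ignite behavior. All names (StaleTopologySketch, ClientView, pickTarget, onTaskResponse) are made up for illustration and are not Ignite API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: every class and method name here is hypothetical, not Ignite API.
class StaleTopologySketch {
    /** Client-side view: a node list cached at connect time, refreshed only on a response. */
    static class ClientView {
        private List<String> cachedNodes;
        private int next;

        ClientView(List<String> initial) {
            cachedNodes = new ArrayList<>(initial);
        }

        /** Round-robin over whatever the client currently believes the topology is. */
        String pickTarget() {
            return cachedNodes.get(next++ % cachedNodes.size());
        }

        /** Models a task response that carries the "topology changed" information. */
        void onTaskResponse(List<String> currentTopology) {
            cachedNodes = new ArrayList<>(currentTopology);
        }
    }

    /** Returns the targets of six task submissions: three before the first response, three after. */
    static List<String> simulate() {
        List<String> cluster = new ArrayList<>(Arrays.asList("node-1"));
        ClientView client = new ClientView(cluster);
        List<String> targets = new ArrayList<>();

        targets.add(client.pickTarget());   // first long task goes to the only known node
        cluster.add("node-2");              // servers join while that task is still running
        cluster.add("node-3");
        targets.add(client.pickTarget());   // no response yet, so the cached view is stale
        targets.add(client.pickTarget());

        client.onTaskResponse(cluster);     // first response arrives (an hour later in the example)
        for (int i = 0; i < 3; i++)
            targets.add(client.pickTarget());

        return targets;
    }

    public static void main(String[] args) {
        System.out.println(simulate());
    }
}
```

In this model, every task submitted before the first response lands on node-1, and the load only spreads out once a response has refreshed the cached view.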
> After all, we don't cancel request
We do cancel a request to perform a task. We may and should use this to cancel any other request in the future.

> Client uses some cluster group filtration (for example forServers() cluster group)

Please see above: Aleksandr Shapkin described how we store filtered cluster groups on the client. We don't store node IDs, we store the actual filters. So every new request will apply those filters on the server side, using the most recent set of nodes.

    // This does not issue any server requests, it just builds an object with filters on the client
    var myGrp = cluster.forServers().forAttribute("foo");

    // Every request includes the filters, and the filters are applied on the server side
    while (true)
        myGrp.compute().executeTask("bar");
operations: > > > > > > > > > > > > > > > > > > > > > > > > OP_COMPUTE_EXECUTE_TASK > > > > > > > > > > > > Request: UUID nodeId, String taskName, Object arg > > > > > > > > > > > > Response: Object result > > > > > > > > > > > > > > > > > > > > > > > > OP_COMPUTE_EXECUTE_TASK_AFFINITY > > > > > > > > > > > > Request: String cacheName, Object key, String > taskName, > > > > > Object > > > > > > > arg > > > > > > > > > > > > Response: Object result > > > > > > > > > > > > > > > > > > > > > > > > The second operation is needed because we sometimes > > can't > > > > > > > calculate > > > > > > > > > and > > > > > > > > > > > > connect to affinity node on the client-side (affinity > > > > > awareness > > > > > > > can > > > > > > > > > be > > > > > > > > > > > > disabled, custom affinity function can be used or > there > > > can > > > > > be > > > > > > no > > > > > > > > > > > > connection between client and affinity node), but we > > can > > > > make > > > > > > > best > > > > > > > > > > effort > > > > > > > > > > > > to send request to target node if affinity awareness > is > > > > > > enabled. > > > > > > > > > > > > > > > > > > > > > > > > Currently, on the server-side requests always > processed > > > > > > > > synchronously > > > > > > > > > > and > > > > > > > > > > > > responses are sent right after request was processed. > > To > > > > > > execute > > > > > > > > long > > > > > > > > > > > tasks > > > > > > > > > > > > async we should whether change this logic or > introduce > > > some > > > > > > kind > > > > > > > > > > two-way > > > > > > > > > > > > communication between client and server (now only > > one-way > > > > > > > requests > > > > > > > > > from > > > > > > > > > > > > client to server are allowed). 
> > > > > > > > > > > > > > > > > > > > > > > > Two-way communication can also be useful in the > future > > if > > > > we > > > > > > will > > > > > > > > > send > > > > > > > > > > > some > > > > > > > > > > > > server-side generated events to clients. > > > > > > > > > > > > > > > > > > > > > > > > In case of two-way communication there can be new > > > > operations > > > > > > > > > > introduced: > > > > > > > > > > > > > > > > > > > > > > > > OP_COMPUTE_EXECUTE_TASK (from client to server) > > > > > > > > > > > > Request: UUID nodeId, String taskName, Object arg > > > > > > > > > > > > Response: long taskId > > > > > > > > > > > > > > > > > > > > > > > > OP_COMPUTE_TASK_FINISHED (from server to client) > > > > > > > > > > > > Request: taskId, Object result > > > > > > > > > > > > Response: empty > > > > > > > > > > > > > > > > > > > > > > > > The same for affinity requests. > > > > > > > > > > > > > > > > > > > > > > > > Also, we can implement not only execute task > operation, > > > but > > > > > > some > > > > > > > > > other > > > > > > > > > > > > operations from IgniteCompute (broadcast, run, call), > > but > > > > it > > > > > > will > > > > > > > > be > > > > > > > > > > > useful > > > > > > > > > > > > only for java thin client. And even with java thin > > client > > > > we > > > > > > > should > > > > > > > > > > > whether > > > > > > > > > > > > implement peer-class-loading for thin clients (this > > also > > > > > > requires > > > > > > > > > > two-way > > > > > > > > > > > > client-server communication) or put classes with > > executed > > > > > > > closures > > > > > > > > to > > > > > > > > > > the > > > > > > > > > > > > server locally. > > > > > > > > > > > > > > > > > > > > > > > > What do you think about proposed protocol changes? > > > > > > > > > > > > Do we need two-way requests between client and > server? > > > > > > > > > > > > Do we need support of compute methods other than > > "execute > > > > > > task"? 
> > > > > > > > > > > > What do you think about peer-class-loading for thin > > > > clients? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > Sergey Kozlov > > > > > > > > > > > GridGain Systems > > > > > > > > > > > www.gridgain.com > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > Sergey Kozlov > > > > > > > > > GridGain Systems > > > > > > > > > www.gridgain.com > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > Alex. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > |
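As an illustration of the OP_CLUSTER_GET_NODE_ATTRIBUTES request layout proposed in the quoted message (int nodesCount, then one UUID per node), here is a minimal Java sketch of encoding and decoding such a request body. The class name `NodeAttributesRequest`, the little-endian byte order and the representation of a UUID as two longs are assumptions made for the example; the thread does not specify them, and no message header or op code is included.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

/** Hypothetical codec for the request body only; not part of any Ignite API. */
final class NodeAttributesRequest {
    private NodeAttributesRequest() {}

    /** Writes "int nodesCount, for each node: UUID nodeId". */
    static byte[] encode(List<UUID> nodeIds) {
        // Assumption: little-endian, UUID written as most/least significant longs.
        ByteBuffer buf = ByteBuffer
            .allocate(Integer.BYTES + nodeIds.size() * 2 * Long.BYTES)
            .order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt(nodeIds.size());
        for (UUID id : nodeIds) {
            buf.putLong(id.getMostSignificantBits());
            buf.putLong(id.getLeastSignificantBits());
        }
        return buf.array();
    }

    /** Reads the node IDs back in the order they were written. */
    static List<UUID> decode(byte[] body) {
        ByteBuffer buf = ByteBuffer.wrap(body).order(ByteOrder.LITTLE_ENDIAN);
        int cnt = buf.getInt();
        List<UUID> ids = new ArrayList<>(cnt);
        for (int i = 0; i < cnt; i++) {
            long msb = buf.getLong();
            long lsb = buf.getLong();
            ids.add(new UUID(msb, lsb));
        }
        return ids;
    }
}
```

The response side (attribute names and Object values) would need the binary object format and is left out of this sketch.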
In reply to this post by Alexey Plekhanov
Alex,
> Filtering logic for OP_CLUSTER_GROUP_GET_NODE_IDS looks overcomplicated. Do
> we need server-side filtering at all? Wouldn't it be better to send basic
> info (ids, order, flags) for all nodes (there is relatively small amount of
> data) and extended info (attributes) for selected list of nodes? In this
> case, we can do basic node filtration on client-side (forClients(),
> forServers(), forNodeIds(), forOthers(), etc).

I think it's OK to have server-side filtering. It gives us a single endpoint for all clients and thus ensures that they all get the same consistent list of nodes in return, regardless of their internal implementations. The only protocol change here in comparison to GetNodes() is an optional filter object that in most cases is represented by a list of key-value attribute pairs.

> Do you use standard ClusterNode serialization? There are also metrics
> serialized with ClusterNode, do we need it on thin client? There are other
> interfaces exist to show metrics, I think it's redundant to export metrics
> to thin clients too.

Along with the node IDs, we could pass a flag indicating whether we are interested in the detailed node representation, say with metrics, or only in a basic format. This flag should be disabled by default. We could implement a GetNodeMetrics(nodeId) method later on if we decide to.

From: Pavel Tupitsyn <[hidden email]>
Sent: Tuesday, November 26, 2019 5:44 PM
To: dev <[hidden email]>
Subject: Re: Thin client: compute support

> After all, we don't cancel request

We do cancel a request to perform a task. We may and should use this to cancel any other request in future.

> Client uses some cluster group filtration (for example forServers() cluster group)

Please see above - Aleksandr Shapkin described how we store filtered cluster groups on client.
We don't store node IDs, we store actual filters. So every new request will apply those filters on server side, using the most recent set of nodes.
var myGrp = cluster.forServers().forAttribute("foo"); // This does not issue any server requests, just builds an object with filters on client
while (true) myGrp.compute().executeTask("bar"); // Every request includes filters, and filters are applied on the server side

On Tue, Nov 26, 2019 at 1:42 PM Alex Plehanov <[hidden email]> wrote:

> > Anyway, my point stands.
> I can't agree. Why don't you want to use the task id for this? After all, we don't cancel the request (the request is already processed), we cancel the task. So it's more convenient to use the task id here.
>
> > Can you please provide equivalent use case with existing "thick" client?
> For example:
> Cluster consists of one server node.
> Client uses some cluster group filtration (for example forServers() cluster group).
> Client starts to periodically send (for example 1 per minute) long-term (for example 1 hour long) tasks to the cluster.
> Meanwhile, several server nodes join the cluster.
>
> In case of thick client: All server nodes will be used, tasks will be load balanced.
> In case of thin client: Only one server node will be used, client will detect the topology change after an hour.
>
> On Tue, Nov 26, 2019 at 11:50, Pavel Tupitsyn <[hidden email]> wrote:
>
> > > I can't see any usage of request id in query cursors
> > You are right, cursor id is a separate thing.
> > Anyway, my point stands.
> >
> > > client sends long term tasks to nodes and wants to do it with load balancing
> > I still don't get it. Can you please provide equivalent use case with existing "thick" client?
> >
> > On Mon, Nov 25, 2019 at 11:59 PM Alex Plehanov <[hidden email]> wrote:
> >
> > > > And it is fine to use request ID to identify compute tasks (as we do with query cursors).
> > > I can't see any usage of request id in query cursors. We send a query request and get a cursor id in response. After that, we only use the cursor id (to get next pages and to close the resource). Did I miss something?
> > >
> > > > Looks like I'm missing something - how is topology change relevant to executing compute tasks from client?
> > > It's not relevant directly. But there are some cases where it will be helpful. For example, if a client sends long-term tasks to nodes and wants to do it with load balancing, it will detect a topology change only some time in the future, with the first response, so load balancing will not work. Perhaps we can add an optional "topology version" field to the OP_COMPUTE_EXECUTE_TASK request to solve this problem.
> > >
> > > On Mon, Nov 25, 2019 at 22:42, Pavel Tupitsyn <[hidden email]> wrote:
> > >
> > > > Alex,
> > > >
> > > > > we will mix entities from different layers (transport layer and request body)
> > > > I would not call our message header (which includes the id) "transport layer".
> > > > TCP is our transport layer. And it is fine to use request ID to identify compute tasks (as we do with query cursors).
> > > >
> > > > > we still can't be sure that the task is successfully started on a server
> > > > The request to start the task will fail and we'll get a response indicating that right away
> > > >
> > > > > we won't ever know about topology change
> > > > Looks like I'm missing something - how is topology change relevant to executing compute tasks from client?
> > > >
> > > > On Mon, Nov 25, 2019 at 10:17 PM Alex Plehanov <[hidden email]> wrote:
> > > >
> > > > > Pavel, in this case, we will mix entities from different layers (transport layer and request body), it's not very good. We can achieve the same behavior with a task id generated on the client side, but there will be no inter-layer data intersection and I think it will be easier to implement on both client and server-side.
> > > > > But we still can't be sure that the task is successfully started on a server. We won't ever know about a topology change, because the topology changed flag will be sent from server to client only with a response when the task is completed. Do we accept that?
> > > > >
> > > > > On Mon, Nov 25, 2019 at 19:07, Pavel Tupitsyn <[hidden email]> wrote:
> > > > >
> > > > > > Alex,
> > > > > >
> > > > > > I have a simpler idea. We already do request id handling in the protocol, so:
> > > > > > - Client sends a normal request to execute compute task. Request ID is generated as usual.
> > > > > > - As soon as task is completed, a response is received.
> > > > > >
> > > > > > As for cancellation - client can send a new request (with new request ID) and (in the body) pass the request ID from above as a task identifier. As a result, there are two responses:
> > > > > > - Cancellation response
> > > > > > - Task response (with proper cancelled status)
> > > > > >
> > > > > > That's it, no need to modify the core of the protocol. One request - one response.
> > > > > >
> > > > > > On Mon, Nov 25, 2019 at 6:20 PM Alex Plehanov <[hidden email]> wrote:
> > > > > >
> > > > > > > Pavel, we need to inform the client when the task is completed, we need the ability to cancel the task. I see several ways to implement this:
> > > > > > >
> > > > > > > 1. Client sends a request to the server to start a task, server returns the task id in response. Server notifies the client when the task is completed with a new request (from server to client). Client can cancel the task by sending a new request with operation type "cancel" and task id. In this case, we should implement 2-ways requests.
> > > > > > > 2. Client generates a unique task id and sends a request to the server to start a task, server doesn't reply immediately but waits until the task is completed. Client can cancel the task by sending a new request with operation type "cancel" and task id. In this case, we should decouple request and response on the server-side (currently response is sent right after request was processed). Also, we can't be sure that the task is successfully started on a server.
> > > > > > > 3. Client sends a request to the server to start a task, server returns an id in response. Client periodically asks the server about task status. Client can cancel the task by sending a new request with operation type "cancel" and task id. This case brings some overhead to the communication channel.
> > > > > > >
> > > > > > > Personally, I think that the case with 2-ways requests is better, but I'm open to any other ideas.
> > > > > > >
> > > > > > > Aleksandr,
> > > > > > >
> > > > > > > Filtering logic for OP_CLUSTER_GROUP_GET_NODE_IDS looks overcomplicated. Do we need server-side filtering at all? Wouldn't it be better to send basic info (ids, order, flags) for all nodes (there is relatively small amount of data) and extended info (attributes) for selected list of nodes? In this case, we can do basic node filtration on client-side (forClients(), forServers(), forNodeIds(), forOthers(), etc).
> > > > > > >
> > > > > > > Do you use standard ClusterNode serialization? There are also metrics serialized with ClusterNode, do we need them on thin client? There are other interfaces to show metrics, I think it's redundant to export metrics to thin clients too.
> > > > > > >
> > > > > > > What do you think?
> > > > > > >
> > > > > > > On Fri, Nov 22, 2019 at 20:15, Aleksandr Shapkin <[hidden email]> wrote:
> > > > > > >
> > > > > > > > Alex,
> > > > > > > >
> > > > > > > > I think you can create a new IEP page and I will fill it with the Cluster API details.
> > > > > > > >
> > > > > > > > In short, I've introduced several new codes:
> > > > > > > >
> > > > > > > > Cluster API is pretty straightforward:
> > > > > > > >
> > > > > > > > OP_CLUSTER_IS_ACTIVE = 5000
> > > > > > > > OP_CLUSTER_CHANGE_STATE = 5001
> > > > > > > > OP_CLUSTER_CHANGE_WAL_STATE = 5002
> > > > > > > > OP_CLUSTER_GET_WAL_STATE = 5003
> > > > > > > >
> > > > > > > > Cluster group codes:
> > > > > > > >
> > > > > > > > OP_CLUSTER_GROUP_GET_NODE_IDS = 5100
> > > > > > > > OP_CLUSTER_GROUP_GET_NODE_INFO = 5101
> > > > > > > >
> > > > > > > > The underlying implementation is based on the thick client logic.
> > > > > > > >
> > > > > > > > For every request, we provide a known topology version and if it has changed, a client updates it first and then re-sends the filtering request.
> > > > > > > >
> > > > > > > > Alongside the topVer a client sends a serialized nodes projection object that could be considered as a code to value mapping.
> > > > > > > > Consider: [{Code = 1, Value = ["DotNet", "MyAttribute"]}, {Code = 2, Value = 1}]
> > > > > > > > Where "1" stands for Attribute filtering and "2" for the serverNodesOnly flag.
> > > > > > > >
> > > > > > > > As a result of request processing, a server sends nodeId UUIDs and a current topVer.
> > > > > > > >
> > > > > > > > When a client obtains the node IDs, it can perform a NODE_INFO call to get a serialized ClusterNode object. In addition, there should be a separate API method for accessing/updating node metrics.
> > > > > > > >
> > > > > > > > On Thu, Nov 21, 2019 at 12:32, Sergey Kozlov <[hidden email]> wrote:
> > > > > > > >
> > > > > > > > > Hi Pavel
> > > > > > > > >
> > > > > > > > > On Thu, Nov 21, 2019 at 11:30 AM Pavel Tupitsyn <[hidden email]> wrote:
> > > > > > > > >
> > > > > > > > > > 1. I believe that Cluster operations for Thin Client protocol are already in the works by Alexandr Shapkin. Can't find the ticket though. Alexandr, can you please confirm and attach the ticket number?
> > > > > > > > > >
> > > > > > > > > > 2. Proposed changes will work only for Java tasks that are already deployed on server nodes. This is mostly useless for other thin clients we have (Python, PHP, .NET, C++).
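To make the "we store actual filters, not node IDs" idea from the message above concrete, here is a minimal Java sketch of a client-side cluster group that only accumulates filters and exposes them as a code-to-value projection. The class `ClusterGroupFilters` and its methods are hypothetical; the codes 1 (attribute filtering) and 2 (serverNodesOnly) come from the quoted Cluster API description, while the shape of the attribute value (a two-element array) is an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical client-side cluster group: holds filters only, never node IDs. */
final class ClusterGroupFilters {
    /** Filter codes as given in the quoted Cluster API message. */
    static final int ATTRIBUTE = 1;
    static final int SERVER_NODES_ONLY = 2;

    /** One (code, value) pair of the projection. */
    static final class Entry {
        final int code;
        final Object value;

        Entry(int code, Object value) {
            this.code = code;
            this.value = value;
        }
    }

    private final List<Entry> entries;

    private ClusterGroupFilters(List<Entry> entries) {
        this.entries = Collections.unmodifiableList(entries);
    }

    /** A group with no filters, meaning all nodes. */
    static ClusterGroupFilters all() {
        return new ClusterGroupFilters(new ArrayList<>());
    }

    /** Adds the serverNodesOnly flag; no server request is issued. */
    ClusterGroupFilters forServers() {
        return with(new Entry(SERVER_NODES_ONLY, 1));
    }

    /** Adds an attribute filter; the pair layout is an assumption. */
    ClusterGroupFilters forAttribute(String name, String value) {
        return with(new Entry(ATTRIBUTE, new String[] {name, value}));
    }

    private ClusterGroupFilters with(Entry e) {
        List<Entry> copy = new ArrayList<>(entries);
        copy.add(e);
        return new ClusterGroupFilters(copy);
    }

    /** The projection that would be attached to every request. */
    List<Entry> projection() {
        return entries;
    }
}
```

Each builder call returns a new immutable object, so a group such as `ClusterGroupFilters.all().forServers().forAttribute("foo", "bar")` can be reused across requests while the server resolves it against its current set of nodes.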
In reply to this post by Pavel Tupitsyn
> We do cancel a request to perform a task. We may and should use this to cancel any other request in future.

The request is already processed (the task is started), so we can't cancel the request. As you mentioned before, we already do almost the same for queries (we close the cursor, we don't cancel the request to run a query), and it's better to do such things in a common way. We have a pattern: start some process (query, transaction), get the id of this process, end the process by this id. The "execute task" process should match the same pattern. In my opinion, an implementation with two-way requests is the best option to match this pattern (we can even reuse the OP_RESOURCE_CLOSE operation type in this case). Sometime in the future we will need two-way requests for other functionality (continuous queries, event listening, etc). But even without two-way requests, introducing a process id (a task id in our case) will be closer to the existing pattern than canceling tasks by request id.

> So every new request will apply those filters on server side, using the most recent set of nodes.

In this case, we always need to send 2 requests to the server to execute the task: the first to get nodes by the filter, the second to actually execute the task. That looks like overhead. The same will apply to services. A cluster group remains the same if the topology hasn't changed. We can use this fact and bind the "execute task" request to the topology: if the topology has changed, get the nodes for the new topology and retry the request.

Tue, Nov 26, 2019 at 17:44, Pavel Tupitsyn <[hidden email]>:

> > After all, we don't cancel request
> We do cancel a request to perform a task. We may and should use this to cancel any other request in future.
>
> > Client uses some cluster group filtration (for example forServers() cluster group)
> Please see above - Aleksandr Shapkin described how we store filtered cluster groups on client.
> We don't store node IDs, we store actual filters. So every new request will apply those filters on server side,
> using the most recent set of nodes.
>
> var myGrp = cluster.forServers().forAttribute("foo"); // This does not issue any server requests, just builds an object with filters on client
> while (true) myGrp.compute().executeTask("bar"); // Every request includes filters, and filters are applied on the server side
>
> On Tue, Nov 26, 2019 at 1:42 PM Alex Plehanov <[hidden email]> wrote:
>
> > > Anyway, my point stands.
> > I can't agree. Why you don't want to use task id for this? After all, we don't cancel request (request is already processed), we cancel the task. So it's more convenient to use task id here.
> >
> > > Can you please provide equivalent use case with existing "thick" client?
> > For example:
> > Cluster consists of one server node.
> > Client uses some cluster group filtration (for example forServers() cluster group).
> > Client starts to send periodically (for example 1 per minute) long-term (for example 1 hour long) tasks to the cluster.
> > Meanwhile, several server nodes joined the cluster.
> >
> > In case of thick client: All server nodes will be used, tasks will be load balanced.
> > In case of thin client: Only one server node will be used, client will detect topology change after an hour.
> >
> >
> > Tue, Nov 26, 2019 at 11:50, Pavel Tupitsyn <[hidden email]>:
> >
> > > > I can't see any usage of request id in query cursors
> > > You are right, cursor id is a separate thing.
> > > Anyway, my point stands.
> > >
> > > > client sends long term tasks to nodes and wants to do it with load balancing
> > > I still don't get it. Can you please provide equivalent use case with existing "thick" client?
> > >
> > >
> > > On Mon, Nov 25, 2019 at 11:59 PM Alex Plehanov <[hidden email]> wrote:
> > >
> > > > > And it is fine to use request ID to identify compute tasks (as we do with query cursors).
> > > > I can't see any usage of request id in query cursors.
We send query > > > request > > > > and get cursor id in response. After that, we only use cursor id (to > > get > > > > next pages and to close the resource). Did I miss something? > > > > > > > > > Looks like I'm missing something - how is topology change relevant > to > > > > executing compute tasks from client? > > > > It's not relevant directly. But there are some cases where it will be > > > > helpful. For example, if client sends long term tasks to nodes and > > wants > > > to > > > > do it with load balancing it will detect topology change only after > > some > > > > time in the future with the first response, so load balancing will no > > > work. > > > > Perhaps we can add optional "topology version" field to the > > > > OP_COMPUTE_EXECUTE_TASK request to solve this problem. > > > > > > > > > > > > пн, 25 нояб. 2019 г. в 22:42, Pavel Tupitsyn <[hidden email]>: > > > > > > > > > Alex, > > > > > > > > > > > we will mix entities from different layers (transport layer and > > > request > > > > > body) > > > > > I would not call our message header (which includes the id) > > "transport > > > > > layer". > > > > > TCP is our transport layer. And it is fine to use request ID to > > > identify > > > > > compute tasks (as we do with query cursors). > > > > > > > > > > > we still can't be sure that the task is successfully started on a > > > > server > > > > > The request to start the task will fail and we'll get a response > > > > indicating > > > > > that right away > > > > > > > > > > > we won't ever know about topology change > > > > > Looks like I'm missing something - how is topology change relevant > to > > > > > executing compute tasks from client? > > > > > > > > > > On Mon, Nov 25, 2019 at 10:17 PM Alex Plehanov < > > > [hidden email]> > > > > > wrote: > > > > > > > > > > > Pavel, in this case, we will mix entities from different layers > > > > > (transport > > > > > > layer and request body), it's not very good. 
The same behavior we > > can > > > > > > achieve with generated on client-side task id, but there will be > no > > > > > > inter-layer data intersection and I think it will be easier to > > > > implement > > > > > on > > > > > > both client and server-side. But we still can't be sure that the > > task > > > > is > > > > > > successfully started on a server. We won't ever know about > topology > > > > > change, > > > > > > because topology changed flag will be sent from server to client > > only > > > > > with > > > > > > a response when the task will be completed. Are we accept that? > > > > > > > > > > > > пн, 25 нояб. 2019 г. в 19:07, Pavel Tupitsyn < > [hidden email] > > >: > > > > > > > > > > > > > Alex, > > > > > > > > > > > > > > I have a simpler idea. We already do request id handling in the > > > > > protocol, > > > > > > > so: > > > > > > > - Client sends a normal request to execute compute task. > Request > > ID > > > > is > > > > > > > generated as usual. > > > > > > > - As soon as task is completed, a response is received. > > > > > > > > > > > > > > As for cancellation - client can send a new request (with new > > > request > > > > > ID) > > > > > > > and (in the body) pass the request ID from above > > > > > > > as a task identifier. As a result, there are two responses: > > > > > > > - Cancellation response > > > > > > > - Task response (with proper cancelled status) > > > > > > > > > > > > > > That's it, no need to modify the core of the protocol. One > > request > > > - > > > > > one > > > > > > > response. > > > > > > > > > > > > > > On Mon, Nov 25, 2019 at 6:20 PM Alex Plehanov < > > > > [hidden email] > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > Pavel, we need to inform the client when the task is > completed, > > > we > > > > > need > > > > > > > the > > > > > > > > ability to cancel the task. I see several ways to implement > > this: > > > > > > > > > > > > > > > > 1. 
Сlient sends a request to the server to start a task, > server > > > > > return > > > > > > > task > > > > > > > > id in response. Server notifies client when task is completed > > > with > > > > a > > > > > > new > > > > > > > > request (from server to client). Client can cancel the task > by > > > > > sending > > > > > > a > > > > > > > > new request with operation type "cancel" and task id. In this > > > case, > > > > > we > > > > > > > > should implement 2-ways requests. > > > > > > > > 2. Client generates unique task id and sends a request to the > > > > server > > > > > to > > > > > > > > start a task, server don't reply immediately but wait until > > task > > > is > > > > > > > > completed. Client can cancel task by sending new request with > > > > > operation > > > > > > > > type "cancel" and task id. In this case, we should decouple > > > request > > > > > and > > > > > > > > response on the server-side (currently response is sent right > > > after > > > > > > > request > > > > > > > > was processed). Also, we can't be sure that task is > > successfully > > > > > > started > > > > > > > on > > > > > > > > a server. > > > > > > > > 3. Client sends a request to the server to start a task, > server > > > > > return > > > > > > id > > > > > > > > in response. Client periodically asks the server about task > > > status. > > > > > > > Client > > > > > > > > can cancel the task by sending new request with operation > type > > > > > "cancel" > > > > > > > and > > > > > > > > task id. This case brings some overhead to the communication > > > > channel. > > > > > > > > > > > > > > > > Personally, I think that the case with 2-ways requests is > > better, > > > > but > > > > > > I'm > > > > > > > > open to any other ideas. > > > > > > > > > > > > > > > > Aleksandr, > > > > > > > > > > > > > > > > Filtering logic for OP_CLUSTER_GROUP_GET_NODE_IDS looks > > > > > > overcomplicated. > > > > > > > Do > > > > > > > > we need server-side filtering at all? 
Wouldn't it be better > to > > > send > > > > > > basic > > > > > > > > info (ids, order, flags) for all nodes (there is relatively > > small > > > > > > amount > > > > > > > of > > > > > > > > data) and extended info (attributes) for selected list of > > nodes? > > > In > > > > > > this > > > > > > > > case, we can do basic node filtration on client-side > > > (forClients(), > > > > > > > > forServers(), forNodeIds(), forOthers(), etc). > > > > > > > > > > > > > > > > Do you use standard ClusterNode serialization? There are also > > > > metrics > > > > > > > > serialized with ClusterNode, do we need it on thin client? > > There > > > > are > > > > > > > other > > > > > > > > interfaces exist to show metrics, I think it's redundant to > > > export > > > > > > > metrics > > > > > > > > to thin clients too. > > > > > > > > > > > > > > > > What do you think? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > пт, 22 нояб. 2019 г. в 20:15, Aleksandr Shapkin < > > > [hidden email] > > > > >: > > > > > > > > > > > > > > > > > Alex, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I think you can create a new IEP page and I will fill it > with > > > the > > > > > > > Cluster > > > > > > > > > API details. 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > In short, I’ve introduced several new codes: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Cluster API is pretty straightforward: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > OP_CLUSTER_IS_ACTIVE = 5000 > > > > > > > > > > > > > > > > > > OP_CLUSTER_CHANGE_STATE = 5001 > > > > > > > > > > > > > > > > > > OP_CLUSTER_CHANGE_WAL_STATE = 5002 > > > > > > > > > > > > > > > > > > OP_CLUSTER_GET_WAL_STATE = 5003 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Cluster group codes: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > OP_CLUSTER_GROUP_GET_NODE_IDS = 5100 > > > > > > > > > > > > > > > > > > OP_CLUSTER_GROUP_GET_NODE_INFO = 5101 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > The underlying implementation is based on the thick client > > > logic. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > For every request, we provide a known topology version and > if > > > it > > > > > has > > > > > > > > > changed, > > > > > > > > > > > > > > > > > > a client updates it firstly and then re-sends the filtering > > > > > request. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Alongside the topVer a client sends a serialized nodes > > > projection > > > > > > > object > > > > > > > > > > > > > > > > > > that could be considered as a code to value mapping. > > > > > > > > > > > > > > > > > > Consider: [{Code = 1, Value= [“DotNet”, “MyAttribute”}, > > > {Code=2, > > > > > > > > Value=1}] > > > > > > > > > > > > > > > > > > Where “1” stands for Attribute filtering and “2” – > > > > serverNodesOnly > > > > > > > flag. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > As a result of request processing, a server sends nodeId > > UUIDs > > > > and > > > > > a > > > > > > > > > current topVer. 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > When a client obtains nodeIds, it can perform a NODE_INFO > > call > > > to > > > > > > get a > > > > > > > > > > > > > > > > > > serialized ClusterNode object. In addition there should be > a > > > > > > different > > > > > > > > API > > > > > > > > > > > > > > > > > > method for accessing/updating node metrics. > > > > > > > > > > > > > > > > > > чт, 21 нояб. 2019 г. в 12:32, Sergey Kozlov < > > > > [hidden email] > > > > > >: > > > > > > > > > > > > > > > > > > > Hi Pavel > > > > > > > > > > > > > > > > > > > > On Thu, Nov 21, 2019 at 11:30 AM Pavel Tupitsyn < > > > > > > > [hidden email]> > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > 1. I believe that Cluster operations for Thin Client > > > protocol > > > > > are > > > > > > > > > already > > > > > > > > > > > in the works > > > > > > > > > > > by Alexandr Shapkin. Can't find the ticket though. > > > > > > > > > > > Alexandr, can you please confirm and attach the ticket > > > > number? > > > > > > > > > > > > > > > > > > > > > > 2. Proposed changes will work only for Java tasks that > > are > > > > > > already > > > > > > > > > > deployed > > > > > > > > > > > on server nodes. > > > > > > > > > > > This is mostly useless for other thin clients we have > > > > (Python, > > > > > > PHP, > > > > > > > > > .NET, > > > > > > > > > > > C++). > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I don't guess so. The task (execution) is a way to > > implement > > > > own > > > > > > > layer > > > > > > > > > for > > > > > > > > > > the thin client application. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > We should think of a way to make this useful for all > > > clients. > > > > > > > > > > > For example, we may allow sending tasks in some > scripting > > > > > > language > > > > > > > > like > > > > > > > > > > > Javascript. > > > > > > > > > > > Thoughts? 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > The arbitrary code execution from a remote client must be > > > > > protected > > > > > > > > > > from malicious code. > > > > > > > > > > I don't know how it could be designed but without that we > > > open > > > > > the > > > > > > > hole > > > > > > > > > to > > > > > > > > > > kill cluster. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Thu, Nov 21, 2019 at 11:21 AM Sergey Kozlov < > > > > > > > [hidden email] > > > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > Hi Alex > > > > > > > > > > > > > > > > > > > > > > > > The idea is great. But I have some concerns that > > probably > > > > > > should > > > > > > > be > > > > > > > > > > taken > > > > > > > > > > > > into account for design: > > > > > > > > > > > > > > > > > > > > > > > > 1. We need to have the ability to stop a task > > > execution, > > > > > > smth > > > > > > > > like > > > > > > > > > > > > OP_COMPUTE_CANCEL_TASK operation (client to > server) > > > > > > > > > > > > 2. What's about task execution timeout? It may > help > > to > > > > the > > > > > > > > cluster > > > > > > > > > > > > survival for buggy tasks > > > > > > > > > > > > 3. Ignite doesn't have roles/authorization > > > functionality > > > > > for > > > > > > > > now. > > > > > > > > > > But > > > > > > > > > > > a > > > > > > > > > > > > task is the risky operation for cluster (for > > security > > > > > > > reasons). 
> > > > > > > > > > Could > > > > > > > > > > > we > > > > > > > > > > > > add for Ignite configuration new options: > > > > > > > > > > > > - Explicit turning on for compute task support > > for > > > > thin > > > > > > > > > protocol > > > > > > > > > > > > (disabled by default) for whole cluster > > > > > > > > > > > > - Explicit turning on for compute task support > > for > > > a > > > > > node > > > > > > > > > > > > - The list of task names (classes) allowed to > > > execute > > > > > by > > > > > > > thin > > > > > > > > > > > client. > > > > > > > > > > > > 4. Support the labeling for task that may help to > > > > > > investigate > > > > > > > > > issues > > > > > > > > > > > on > > > > > > > > > > > > cluster (the idea from IEP-34 [1]) > > > > > > > > > > > > > > > > > > > > > > > > 1. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/IGNITE/IEP-34+Thin+client%3A+transactions+support > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Thu, Nov 21, 2019 at 10:58 AM Alex Plehanov < > > > > > > > > > > [hidden email]> > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > Hello, Igniters! > > > > > > > > > > > > > > > > > > > > > > > > > > I have plans to start implementation of Compute > > > interface > > > > > for > > > > > > > > > Ignite > > > > > > > > > > > thin > > > > > > > > > > > > > client and want to discuss features that should be > > > > > > implemented. 
> > > > > > > > > > > > > > > > > > > > > > > > > > We already have Compute implementation for > > binary-rest > > > > > > clients > > > > > > > > > > > > > (GridClientCompute), which have the following > > > > > functionality: > > > > > > > > > > > > > - Filtering cluster nodes (projection) for compute > > > > > > > > > > > > > - Executing task by the name > > > > > > > > > > > > > > > > > > > > > > > > > > I think we can implement this functionality in a > thin > > > > > client > > > > > > as > > > > > > > > > well. > > > > > > > > > > > > > > > > > > > > > > > > > > First of all, we need some operation types to > > request a > > > > > list > > > > > > of > > > > > > > > all > > > > > > > > > > > > > available nodes and probably node attributes (by a > > list > > > > of > > > > > > > > nodes). > > > > > > > > > > Node > > > > > > > > > > > > > attributes will be helpful if we will decide to > > > implement > > > > > > > analog > > > > > > > > of > > > > > > > > > > > > > ClusterGroup#forAttribute or > > ClusterGroup#forePredicate > > > > > > methods > > > > > > > > in > > > > > > > > > > the > > > > > > > > > > > > thin > > > > > > > > > > > > > client. Perhaps they can be requested lazily. 
> > > > > > > > > > > > > > > > > > > > > > > > > > From the protocol point of view there will be two > new > > > > > > > operations: > > > > > > > > > > > > > > > > > > > > > > > > > > OP_CLUSTER_GET_NODES > > > > > > > > > > > > > Request: empty > > > > > > > > > > > > > Response: long topologyVersion, int > > > minorTopologyVersion, > > > > > int > > > > > > > > > > > nodesCount, > > > > > > > > > > > > > for each node set of node fields (UUID nodeId, > Object > > > or > > > > > > String > > > > > > > > > > > > > consistentId, long order, etc) > > > > > > > > > > > > > > > > > > > > > > > > > > OP_CLUSTER_GET_NODE_ATTRIBUTES > > > > > > > > > > > > > Request: int nodesCount, for each node: UUID nodeId > > > > > > > > > > > > > Response: int nodesCount, for each node: int > > > > > attributesCount, > > > > > > > for > > > > > > > > > > each > > > > > > > > > > > > node > > > > > > > > > > > > > attribute: String name, Object value > > > > > > > > > > > > > > > > > > > > > > > > > > To execute tasks we need something like these > methods > > > in > > > > > the > > > > > > > > client > > > > > > > > > > > API: > > > > > > > > > > > > > Object execute(String task, Object arg) > > > > > > > > > > > > > Future<Object> executeAsync(String task, Object > arg) > > > > > > > > > > > > > Object affinityExecute(String task, String cache, > > > Object > > > > > key, > > > > > > > > > Object > > > > > > > > > > > arg) > > > > > > > > > > > > > Future<Object> affinityExecuteAsync(String task, > > String > > > > > > cache, > > > > > > > > > Object > > > > > > > > > > > > key, > > > > > > > > > > > > > Object arg) > > > > > > > > > > > > > > > > > > > > > > > > > > Which can be mapped to protocol operations: > > > > > > > > > > > > > > > > > > > > > > > > > > OP_COMPUTE_EXECUTE_TASK > > > > > > > > > > > > > Request: UUID nodeId, String taskName, Object arg > > > > > > > > > > > > > Response: Object result > > > > > > > > > > > > > > > > > > > > > > > > > > 
OP_COMPUTE_EXECUTE_TASK_AFFINITY > > > > > > > > > > > > > Request: String cacheName, Object key, String > > taskName, > > > > > > Object > > > > > > > > arg > > > > > > > > > > > > > Response: Object result > > > > > > > > > > > > > > > > > > > > > > > > > > The second operation is needed because we sometimes > > > can't > > > > > > > > calculate > > > > > > > > > > and > > > > > > > > > > > > > connect to affinity node on the client-side > (affinity > > > > > > awareness > > > > > > > > can > > > > > > > > > > be > > > > > > > > > > > > > disabled, custom affinity function can be used or > > there > > > > can > > > > > > be > > > > > > > no > > > > > > > > > > > > > connection between client and affinity node), but > we > > > can > > > > > make > > > > > > > > best > > > > > > > > > > > effort > > > > > > > > > > > > > to send request to target node if affinity > awareness > > is > > > > > > > enabled. > > > > > > > > > > > > > > > > > > > > > > > > > > Currently, on the server-side requests always > > processed > > > > > > > > > synchronously > > > > > > > > > > > and > > > > > > > > > > > > > responses are sent right after request was > processed. > > > To > > > > > > > execute > > > > > > > > > long > > > > > > > > > > > > tasks > > > > > > > > > > > > > async we should whether change this logic or > > introduce > > > > some > > > > > > > kind > > > > > > > > > > > two-way > > > > > > > > > > > > > communication between client and server (now only > > > one-way > > > > > > > > requests > > > > > > > > > > from > > > > > > > > > > > > > client to server are allowed). > > > > > > > > > > > > > > > > > > > > > > > > > > Two-way communication can also be useful in the > > future > > > if > > > > > we > > > > > > > will > > > > > > > > > > send > > > > > > > > > > > > some > > > > > > > > > > > > > server-side generated events to clients. 
> > > > > > > > > > > > > > > > > > > > > > > > > > In case of two-way communication there can be new > > > > > operations > > > > > > > > > > > introduced: > > > > > > > > > > > > > > > > > > > > > > > > > > OP_COMPUTE_EXECUTE_TASK (from client to server) > > > > > > > > > > > > > Request: UUID nodeId, String taskName, Object arg > > > > > > > > > > > > > Response: long taskId > > > > > > > > > > > > > > > > > > > > > > > > > > OP_COMPUTE_TASK_FINISHED (from server to client) > > > > > > > > > > > > > Request: taskId, Object result > > > > > > > > > > > > > Response: empty > > > > > > > > > > > > > > > > > > > > > > > > > > The same for affinity requests. > > > > > > > > > > > > > > > > > > > > > > > > > > Also, we can implement not only execute task > > operation, > > > > but > > > > > > > some > > > > > > > > > > other > > > > > > > > > > > > > operations from IgniteCompute (broadcast, run, > call), > > > but > > > > > it > > > > > > > will > > > > > > > > > be > > > > > > > > > > > > useful > > > > > > > > > > > > > only for java thin client. And even with java thin > > > client > > > > > we > > > > > > > > should > > > > > > > > > > > > whether > > > > > > > > > > > > > implement peer-class-loading for thin clients (this > > > also > > > > > > > requires > > > > > > > > > > > two-way > > > > > > > > > > > > > client-server communication) or put classes with > > > executed > > > > > > > > closures > > > > > > > > > to > > > > > > > > > > > the > > > > > > > > > > > > > server locally. > > > > > > > > > > > > > > > > > > > > > > > > > > What do you think about proposed protocol changes? > > > > > > > > > > > > > Do we need two-way requests between client and > > server? > > > > > > > > > > > > > Do we need support of compute methods other than > > > "execute > > > > > > > task"? > > > > > > > > > > > > > What do you think about peer-class-loading for thin > > > > > clients? 
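For illustration only, the proposal in this message (bind the "execute task" request to a known topology version, and on a mismatch fetch the nodes again and retry) might look roughly like the toy model below. Everything in it is an assumption made for the sketch: `FakeServer`, `Client`, `TopologyChanged` and their methods are hypothetical names, the server is an in-memory stand-in, and none of this is the Ignite thin client protocol or API.

```python
from dataclasses import dataclass, field


class TopologyChanged(Exception):
    """Raised by the fake server when the client's topology version is stale."""


@dataclass
class FakeServer:
    # Hypothetical in-memory stand-in for a cluster; not an Ignite API.
    top_ver: int = 1
    nodes: list = field(default_factory=lambda: ["node-1"])

    def get_nodes(self):
        # Returns the current topology version together with the node ids.
        return self.top_ver, list(self.nodes)

    def join(self, node_id):
        # A node joining the cluster bumps the topology version.
        self.nodes.append(node_id)
        self.top_ver += 1

    def execute_task(self, top_ver, node_id, task_name):
        # The request carries the topology version the client believes in.
        if top_ver != self.top_ver:
            raise TopologyChanged()
        return f"{task_name}@{node_id}"


class Client:
    def __init__(self, server):
        self.server = server
        self.top_ver, self.nodes = server.get_nodes()
        self.requests_sent = 1  # the initial get_nodes call
        self.next_idx = 0       # round-robin position for load balancing

    def execute(self, task_name):
        while True:
            node = self.nodes[self.next_idx % len(self.nodes)]
            self.requests_sent += 1
            try:
                result = self.server.execute_task(self.top_ver, node, task_name)
            except TopologyChanged:
                # Stale topology: refresh the node list and retry the request.
                self.top_ver, self.nodes = self.server.get_nodes()
                self.requests_sent += 1
                continue
            self.next_idx += 1
            return result
```

In this model a task costs one request while the topology is stable, and two extra requests (the rejected attempt and the node refresh) only when the topology has changed, after which newly joined nodes take part in the round-robin.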
> The request is already processed (task is started), we can't cancel the request

The request is not "start a task". It is "execute task" (and get the result). It is the same as "cache get": you get a result in the end; we don't "start cache get" and then "end cache get".

Since all thin client operations are inherently async, we should be able to cancel any of them by sending another request with the id of the prior request to be cancelled. That's why I'm advocating for this approach: it will work for anything, with no special cases. And it keeps the "happy path" as simple as it is right now.

Queries are different because we retrieve results in pages, so we can't do them as one request. Transactions are also different because the client controls when they should end. There is no reason for task execution to be a special case like queries or transactions.

> we always need to send 2 requests to server to execute the task

Nope. We don't need to get nodes on the client at all. The request would be "execute task with specified node filter": simple and efficient.

On Wed, Nov 27, 2019 at 4:31 PM Alex Plehanov <[hidden email]> wrote:

> > We do cancel a request to perform a task. We may and should use this to cancel any other request in future.
> The request is already processed (task is started), we can't cancel the request. As you mentioned before, we already do almost the same for queries (close the cursor, but not cancel the request to run a query), it's better to do such things in a common way. We have a pattern: start some process (query, transaction), get id of this process, end process by this id. The "Execute task" process should match the same pattern. In my opinion, implementation with two-way requests is the best option to match this pattern (we can even reuse OP_RESOURCE_CLOSE operation type in this case).
> Sometime in the future, we will need two-way requests for some other functionality (continuous queries, event listening, etc).
But even without > two-way requests introducing some process id (task id in our case) will be > closer to existing pattern than canceling tasks by request id. > > > So every new request will apply those filters on server side, using the > most recent set of nodes. > In this case, we always need to send 2 requests to server to execute the > task. First - to get nodes by the filter, second - to actually execute the > task. It seems like overhead. The same will be for services. Cluster group > remains the same if the topology hasn't changed. We can use this fact and > bind "execute task" request to topology. If topology has changed - get > nodes for new topology and retry request. > > вт, 26 нояб. 2019 г. в 17:44, Pavel Tupitsyn <[hidden email]>: > > > > After all, we don't cancel request > > We do cancel a request to perform a task. We may and should use this to > > cancel any other request in future. > > > > > Client uses some cluster group filtration (for example forServers() > > cluster group) > > Please see above - Aleksandr Shapkin described how we store > > filtered cluster groups on client. > > We don't store node IDs, we store actual filters. So every new request > will > > apply those filters on server side, > > using the most recent set of nodes. > > > > var myGrp = cluster.forServers().forAttribute("foo"); // This does not > > issue any server requests, just builds an object with filters on client > > while (true) myGrp.compute().executeTask("bar"); // Every request > includes > > filters, and filters are applied on the server side > > > > On Tue, Nov 26, 2019 at 1:42 PM Alex Plehanov <[hidden email]> > > wrote: > > > > > > Anyway, my point stands. > > > I can't agree. Why you don't want to use task id for this? After all, > we > > > don't cancel request (request is already processed), we cancel the > task. > > So > > > it's more convenient to use task id here. > > > > > > > Can you please provide equivalent use case with existing "thick" > > client? 
|
> Since all thin client operations are inherently async, we should be able
> to cancel any of them
It's illogical to have such an ability. What should canceling a cancel
operation do? Moreover, sometimes it's dangerous: for example, a create
cache operation should never be canceled. There should be an explicit set
of processes that we can cancel: queries, transactions, tasks, services.
The lifecycle of services is more complex than the lifecycle of tasks. With
services, I suppose, we can't use request cancellation, so tasks will be
the only process with an exceptional pattern.

> The request would be "execute task with specified node filter" - simple and
> efficient.
It's not simple: every compute or service request would have to contain
complex node filtering logic, which duplicates the same logic in the
cluster API. It's not efficient: for example, we can't implement
forPredicate() filtering in this case.

Wed, Nov 27, 2019 at 19:25, Pavel Tupitsyn <[hidden email]>:

> > The request is already processed (task is started), we can't cancel the
> > request
> The request is not "start a task". It is "execute task" (and get result).
> Same as "cache get" - you get a result in the end, we don't "start cache
> get" then "end cache get".
>
> Since all thin client operations are inherently async, we should be able to
> cancel any of them
> by sending another request with an id of prior request to be cancelled.
> That's why I'm advocating for this approach - it will work for anything, no
> special cases.
> And it keeps "happy path" as simple as it is right now.
>
> Queries are different because we retrieve results in pages, we can't do
> them as one request.
> Transactions are also different because client controls when they should
> end.
> There is no reason for task execution to be a special case like queries or
> transactions.
>
> > we always need to send 2 requests to server to execute the task
> Nope. We don't need to get nodes on client at all.
> The request would be "execute task with specified node filter" - simple and
> efficient.
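
To make the "explicit set of processes that we can cancel" argument a bit
more concrete, here is a rough Java sketch. It is only an illustration:
the class, the enum and its values are hypothetical names invented for
this example, not part of the thin client protocol or of any existing
Ignite code. It assumes the server tracks started processes by id and
only allows cancellation for kinds that are explicitly whitelisted.

```java
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch; none of these names exist in Ignite. */
final class CancelableProcesses {
    /** Illustrative kinds of work a client can start on the server. */
    enum Kind { QUERY, TRANSACTION, TASK, SERVICE, CACHE_CREATE }

    /** Explicit whitelist: only these kinds may ever be canceled. */
    static final Set<Kind> CANCELABLE =
        EnumSet.of(Kind.QUERY, Kind.TRANSACTION, Kind.TASK, Kind.SERVICE);

    /** Processes that were started and have not finished yet, by process id. */
    private final Map<Long, Kind> running = new ConcurrentHashMap<>();

    /** Generator of process ids (assumed to be server-side here). */
    private final AtomicLong nextId = new AtomicLong();

    /** Starts a process and returns the id used to refer to it later. */
    long start(Kind kind) {
        long id = nextId.incrementAndGet();
        running.put(id, kind);
        return id;
    }

    /**
     * Cancels a process by id.
     * Returns false if the id is unknown, already finished or canceled,
     * or if the kind of the process is not in the whitelist.
     */
    boolean cancel(long id) {
        Kind kind = running.get(id);
        if (kind == null || !CANCELABLE.contains(kind))
            return false;
        // remove(key, value) succeeds only once, so a repeated cancel is a no-op.
        return running.remove(id, kind);
    }
}
```

With something like this, a cancel request for a task id succeeds once,
while a cancel request that points at a create cache operation is simply
rejected, so there is no need to define what "cancel any request" means.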
> > > On Wed, Nov 27, 2019 at 4:31 PM Alex Plehanov <[hidden email]> > wrote: > > > > We do cancel a request to perform a task. We may and should use this > to > > cancel any other request in future. > > The request is already processed (task is started), we can't cancel the > > request. As you mentioned before, we already do almost the same for > queries > > (close the cursor, but not cancel the request to run a query), it's > better > > to do such things in a common way. We have a pattern: start some process > > (query, transaction), get id of this process, end process by this id. The > > "Execute task" process should match the same pattern. In my opinion, > > implementation with two-way requests is the best option to match this > > pattern (we can even reuse OP_RESOURCE_CLOSE operation type in this > case). > > Sometime in the future, we will need two-way requests for some other > > functionality (continuous queries, event listening, etc). But even > without > > two-way requests introducing some process id (task id in our case) will > be > > closer to existing pattern than canceling tasks by request id. > > > > > So every new request will apply those filters on server side, using the > > most recent set of nodes. > > In this case, we always need to send 2 requests to server to execute the > > task. First - to get nodes by the filter, second - to actually execute > the > > task. It seems like overhead. The same will be for services. Cluster > group > > remains the same if the topology hasn't changed. We can use this fact and > > bind "execute task" request to topology. If topology has changed - get > > nodes for new topology and retry request. > > > > вт, 26 нояб. 2019 г. в 17:44, Pavel Tupitsyn <[hidden email]>: > > > > > > After all, we don't cancel request > > > We do cancel a request to perform a task. We may and should use this to > > > cancel any other request in future. 
> > > > > > > Client uses some cluster group filtration (for example forServers() > > > cluster group) > > > Please see above - Aleksandr Shapkin described how we store > > > filtered cluster groups on client. > > > We don't store node IDs, we store actual filters. So every new request > > will > > > apply those filters on server side, > > > using the most recent set of nodes. > > > > > > var myGrp = cluster.forServers().forAttribute("foo"); // This does not > > > issue any server requests, just builds an object with filters on client > > > while (true) myGrp.compute().executeTask("bar"); // Every request > > includes > > > filters, and filters are applied on the server side > > > > > > On Tue, Nov 26, 2019 at 1:42 PM Alex Plehanov <[hidden email] > > > > > wrote: > > > > > > > > Anyway, my point stands. > > > > I can't agree. Why you don't want to use task id for this? After all, > > we > > > > don't cancel request (request is already processed), we cancel the > > task. > > > So > > > > it's more convenient to use task id here. > > > > > > > > > Can you please provide equivalent use case with existing "thick" > > > client? > > > > For example: > > > > Cluster consists of one server node. > > > > Client uses some cluster group filtration (for example forServers() > > > cluster > > > > group). > > > > Client starts to send periodically (for example 1 per minute) > long-term > > > > (for example 1 hour long) tasks to the cluster. > > > > Meanwhile, several server nodes joined the cluster. > > > > > > > > In case of thick client: All server nodes will be used, tasks will be > > > load > > > > balanced. > > > > In case of thin client: Only one server node will be used, client > will > > > > detect topology change after an hour. > > > > > > > > > > > > вт, 26 нояб. 2019 г. в 11:50, Pavel Tupitsyn <[hidden email]>: > > > > > > > > > > I can't see any usage of request id in query cursors > > > > > You are right, cursor id is a separate thing. 
> > > > > Anyway, my point stands. > > > > > > > > > > > client sends long term tasks to nodes and wants to do it with > load > > > > > balancing > > > > > I still don't get it. Can you please provide equivalent use case > with > > > > > existing "thick" client? > > > > > > > > > > > > > > > On Mon, Nov 25, 2019 at 11:59 PM Alex Plehanov < > > > [hidden email]> > > > > > wrote: > > > > > > > > > > > > And it is fine to use request ID to identify compute tasks (as > we > > > do > > > > > with > > > > > > query cursors). > > > > > > I can't see any usage of request id in query cursors. We send > query > > > > > request > > > > > > and get cursor id in response. After that, we only use cursor id > > (to > > > > get > > > > > > next pages and to close the resource). Did I miss something? > > > > > > > > > > > > > Looks like I'm missing something - how is topology change > > relevant > > > to > > > > > > executing compute tasks from client? > > > > > > It's not relevant directly. But there are some cases where it > will > > be > > > > > > helpful. For example, if client sends long term tasks to nodes > and > > > > wants > > > > > to > > > > > > do it with load balancing it will detect topology change only > after > > > > some > > > > > > time in the future with the first response, so load balancing > will > > no > > > > > work. > > > > > > Perhaps we can add optional "topology version" field to the > > > > > > OP_COMPUTE_EXECUTE_TASK request to solve this problem. > > > > > > > > > > > > > > > > > > пн, 25 нояб. 2019 г. в 22:42, Pavel Tupitsyn < > [hidden email] > > >: > > > > > > > > > > > > > Alex, > > > > > > > > > > > > > > > we will mix entities from different layers (transport layer > and > > > > > request > > > > > > > body) > > > > > > > I would not call our message header (which includes the id) > > > > "transport > > > > > > > layer". > > > > > > > TCP is our transport layer. 
And it is fine to use request ID to > > > > > identify > > > > > > > compute tasks (as we do with query cursors). > > > > > > > > > > > > > > > we still can't be sure that the task is successfully started > > on a > > > > > > server > > > > > > > The request to start the task will fail and we'll get a > response > > > > > > indicating > > > > > > > that right away > > > > > > > > > > > > > > > we won't ever know about topology change > > > > > > > Looks like I'm missing something - how is topology change > > relevant > > > to > > > > > > > executing compute tasks from client? > > > > > > > > > > > > > > On Mon, Nov 25, 2019 at 10:17 PM Alex Plehanov < > > > > > [hidden email]> > > > > > > > wrote: > > > > > > > > > > > > > > > Pavel, in this case, we will mix entities from different > layers > > > > > > > (transport > > > > > > > > layer and request body), it's not very good. The same > behavior > > we > > > > can > > > > > > > > achieve with generated on client-side task id, but there will > > be > > > no > > > > > > > > inter-layer data intersection and I think it will be easier > to > > > > > > implement > > > > > > > on > > > > > > > > both client and server-side. But we still can't be sure that > > the > > > > task > > > > > > is > > > > > > > > successfully started on a server. We won't ever know about > > > topology > > > > > > > change, > > > > > > > > because topology changed flag will be sent from server to > > client > > > > only > > > > > > > with > > > > > > > > a response when the task will be completed. Are we accept > that? > > > > > > > > > > > > > > > > пн, 25 нояб. 2019 г. в 19:07, Pavel Tupitsyn < > > > [hidden email] > > > > >: > > > > > > > > > > > > > > > > > Alex, > > > > > > > > > > > > > > > > > > I have a simpler idea. We already do request id handling in > > the > > > > > > > protocol, > > > > > > > > > so: > > > > > > > > > - Client sends a normal request to execute compute task. 
> > > Request > > > > ID > > > > > > is > > > > > > > > > generated as usual. > > > > > > > > > - As soon as task is completed, a response is received. > > > > > > > > > > > > > > > > > > As for cancellation - client can send a new request (with > new > > > > > request > > > > > > > ID) > > > > > > > > > and (in the body) pass the request ID from above > > > > > > > > > as a task identifier. As a result, there are two responses: > > > > > > > > > - Cancellation response > > > > > > > > > - Task response (with proper cancelled status) > > > > > > > > > > > > > > > > > > That's it, no need to modify the core of the protocol. One > > > > request > > > > > - > > > > > > > one > > > > > > > > > response. > > > > > > > > > > > > > > > > > > On Mon, Nov 25, 2019 at 6:20 PM Alex Plehanov < > > > > > > [hidden email] > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > Pavel, we need to inform the client when the task is > > > completed, > > > > > we > > > > > > > need > > > > > > > > > the > > > > > > > > > > ability to cancel the task. I see several ways to > implement > > > > this: > > > > > > > > > > > > > > > > > > > > 1. Сlient sends a request to the server to start a task, > > > server > > > > > > > return > > > > > > > > > task > > > > > > > > > > id in response. Server notifies client when task is > > completed > > > > > with > > > > > > a > > > > > > > > new > > > > > > > > > > request (from server to client). Client can cancel the > task > > > by > > > > > > > sending > > > > > > > > a > > > > > > > > > > new request with operation type "cancel" and task id. In > > this > > > > > case, > > > > > > > we > > > > > > > > > > should implement 2-ways requests. > > > > > > > > > > 2. Client generates unique task id and sends a request to > > the > > > > > > server > > > > > > > to > > > > > > > > > > start a task, server don't reply immediately but wait > until > > > > task > > > > > is > > > > > > > > > > completed. 
> > > > > > > > > >    Client can cancel task by sending new request with operation type "cancel" and task id. In this case, we should decouple request and response on the server-side (currently response is sent right after request was processed). Also, we can't be sure that task is successfully started on a server.
> > > > > > > > > > 3. Client sends a request to the server to start a task, server returns id in response. Client periodically asks the server about task status. Client can cancel the task by sending new request with operation type "cancel" and task id. This case brings some overhead to the communication channel.
> > > > > > > > > >
> > > > > > > > > > Personally, I think that the case with two-way requests is better, but I'm open to any other ideas.
> > > > > > > > > >
> > > > > > > > > > Aleksandr,
> > > > > > > > > >
> > > > > > > > > > Filtering logic for OP_CLUSTER_GROUP_GET_NODE_IDS looks overcomplicated. Do we need server-side filtering at all? Wouldn't it be better to send basic info (ids, order, flags) for all nodes (there is relatively small amount of data) and extended info (attributes) for selected list of nodes? In this case, we can do basic node filtration on client-side (forClients(), forServers(), forNodeIds(), forOthers(), etc).
> > > > > > > > > >
> > > > > > > > > > Do you use standard ClusterNode serialization? There are also metrics serialized with ClusterNode, do we need them on thin client? Other interfaces exist to show metrics, so I think it's redundant to export metrics to thin clients too.
> > > > > > > > > >
> > > > > > > > > > What do you think?
> > > > > > > > > >
> > > > > > > > > > Fri, Nov 22, 2019 at 20:15, Aleksandr Shapkin <[hidden email]>:
> > > > > > > > > >
> > > > > > > > > > > Alex,
> > > > > > > > > > >
> > > > > > > > > > > I think you can create a new IEP page and I will fill it with the Cluster API details.
> > > > > > > > > > >
> > > > > > > > > > > In short, I've introduced several new codes:
> > > > > > > > > > >
> > > > > > > > > > > Cluster API is pretty straightforward:
> > > > > > > > > > >
> > > > > > > > > > > OP_CLUSTER_IS_ACTIVE = 5000
> > > > > > > > > > > OP_CLUSTER_CHANGE_STATE = 5001
> > > > > > > > > > > OP_CLUSTER_CHANGE_WAL_STATE = 5002
> > > > > > > > > > > OP_CLUSTER_GET_WAL_STATE = 5003
> > > > > > > > > > >
> > > > > > > > > > > Cluster group codes:
> > > > > > > > > > >
> > > > > > > > > > > OP_CLUSTER_GROUP_GET_NODE_IDS = 5100
> > > > > > > > > > > OP_CLUSTER_GROUP_GET_NODE_INFO = 5101
> > > > > > > > > > >
> > > > > > > > > > > The underlying implementation is based on the thick client logic.
> > > > > > > > > > >
> > > > > > > > > > > For every request, we provide a known topology version and if it has changed,
> > > > > > > > > > > the client first updates it and then re-sends the filtering request.
> > > > > > > > > > >
> > > > > > > > > > > Alongside the topVer a client sends a serialized nodes projection object
> > > > > > > > > > > that could be considered as a code to value mapping.
> > > > > > > > > > >
> > > > > > > > > > > Consider: [{Code = 1, Value = ["DotNet", "MyAttribute"]}, {Code = 2, Value = 1}]
> > > > > > > > > > >
> > > > > > > > > > > Where "1" stands for Attribute filtering and "2" for the serverNodesOnly flag.
> > > > > > > > > > >
> > > > > > > > > > > As a result of request processing, a server sends nodeId UUIDs and a current topVer.
> > > > > > > > > > >
> > > > > > > > > > > When a client obtains nodeIds, it can perform a NODE_INFO call to get a
> > > > > > > > > > > serialized ClusterNode object. In addition there should be a different API
> > > > > > > > > > > method for accessing/updating node metrics.
> > > > > > > > > > >
> > > > > > > > > > > Thu, Nov 21, 2019 at 12:32, Sergey Kozlov <[hidden email]>:
> > > > > > > > > > >
> > > > > > > > > > > > Hi Pavel
> > > > > > > > > > > >
> > > > > > > > > > > > On Thu, Nov 21, 2019 at 11:30 AM Pavel Tupitsyn <[hidden email]> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > 1. I believe that Cluster operations for Thin Client protocol are already in the works
> > > > > > > > > > > > > by Alexandr Shapkin. Can't find the ticket though.
> > > > > > > > > > > > > Alexandr, can you please confirm and attach the ticket number?
> > > > > > > > > > > > >
> > > > > > > > > > > > > 2. Proposed changes will work only for Java tasks that are already deployed on server nodes.
> > > > > > > > > > > > > This is mostly useless for other thin clients we have (Python, PHP, .NET, C++).
> > > > > > > > > > > >
> > > > > > > > > > > > I don't think so. The task (execution) is a way to implement own layer for the thin client application.
> > > > > > > > > > > >
> > > > > > > > > > > > > We should think of a way to make this useful for all clients.
> > > > > > > > > > > > > For example, we may allow sending tasks in some scripting language like Javascript.
> > > > > > > > > > > > > Thoughts?
> > > > > > > > > > > >
> > > > > > > > > > > > The arbitrary code execution from a remote client must be protected from malicious code.
> > > > > > > > > > > > I don't know how it could be designed, but without that we open a hole that can kill the cluster.
> > > > > > > > > > > >
> > > > > > > > > > > > > On Thu, Nov 21, 2019 at 11:21 AM Sergey Kozlov <[hidden email]> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi Alex
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > The idea is great. But I have some concerns that probably should be taken into account for design:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 1. We need to have the ability to stop a task execution, something like an OP_COMPUTE_CANCEL_TASK operation (client to server)
> > > > > > > > > > > > > > 2. What about a task execution timeout? It may help the cluster survive buggy tasks
> > > > > > > > > > > > > > 3. Ignite doesn't have roles/authorization functionality for now. But a task is a risky operation for the cluster (for security reasons).
> > > > > > > > > > > > > >    Could we add new options to the Ignite configuration:
> > > > > > > > > > > > > >    - Explicit turning on for compute task support for thin protocol (disabled by default) for whole cluster
> > > > > > > > > > > > > >    - Explicit turning on for compute task support for a node
> > > > > > > > > > > > > >    - The list of task names (classes) allowed to execute by thin client.
> > > > > > > > > > > > > > 4. Support labeling of tasks, which may help to investigate issues on the cluster (the idea from IEP-34 [1])
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > [1] https://cwiki.apache.org/confluence/display/IGNITE/IEP-34+Thin+client%3A+transactions+support
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Thu, Nov 21, 2019 at 10:58 AM Alex Plehanov <[hidden email]> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hello, Igniters!
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I have plans to start implementation of Compute interface for Ignite thin client and want to discuss features that should be implemented.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > We already have a Compute implementation for binary-rest clients (GridClientCompute), which has the following functionality:
> > > > > > > > > > > > > > > - Filtering cluster nodes (projection) for compute
> > > > > > > > > > > > > > > - Executing task by the name
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I think we can implement this functionality in a thin client as well.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > First of all, we need some operation types to request a list of all available nodes and probably node attributes (by a list of nodes). Node attributes will be helpful if we decide to implement an analog of the ClusterGroup#forAttribute or ClusterGroup#forPredicate methods in the thin client. Perhaps they can be requested lazily.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > From the protocol point of view there will be two new operations:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > OP_CLUSTER_GET_NODES
> > > > > > > > > > > > > > > Request: empty
> > > > > > > > > > > > > > > Response: long topologyVersion, int minorTopologyVersion, int nodesCount, for each node set of node fields (UUID nodeId, Object or String consistentId, long order, etc)
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > OP_CLUSTER_GET_NODE_ATTRIBUTES
> > > > > > > > > > > > > > > Request: int nodesCount, for each node: UUID nodeId
> > > > > > > > > > > > > > > Response: int nodesCount, for each node: int attributesCount, for each node attribute: String name, Object value
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > To execute tasks we need something like these methods in the client API:
> > > > > > > > > > > > > > > Object execute(String task, Object arg)
> > > > > > > > > > > > > > > Future<Object> executeAsync(String task, Object arg)
> > > > > > > > > > > > > > > Object affinityExecute(String task, String cache, Object key, Object arg)
> > > > > > > > > > > > > > > Future<Object> affinityExecuteAsync(String task, String cache, Object key, Object arg)
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Which can be mapped to protocol operations:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > OP_COMPUTE_EXECUTE_TASK
> > > > > > > > > > > > > > > Request: UUID nodeId, String taskName, Object arg
> > > > > > > > > > > > > > > Response: Object result
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > OP_COMPUTE_EXECUTE_TASK_AFFINITY
> > > > > > > > > > > > > > > Request: String cacheName, Object key, String taskName, Object arg
> > > > > > > > > > > > > > > Response: Object result
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > The second operation is needed because we sometimes can't calculate and connect to affinity node on the client-side (affinity awareness can be disabled, custom affinity function can be used or there can be no connection between client and affinity node), but we can make best effort to send request to target node if affinity awareness is enabled.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Currently, on the server-side, requests are always processed synchronously and responses are sent right after the request was processed.
> > > > > > > > > > > > > > > To execute long tasks async we should either change this logic or introduce some kind of two-way communication between client and server (now only one-way requests from client to server are allowed).
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Two-way communication can also be useful in the future if we send some server-side generated events to clients.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > In case of two-way communication there can be new operations introduced:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > OP_COMPUTE_EXECUTE_TASK (from client to server)
> > > > > > > > > > > > > > > Request: UUID nodeId, String taskName, Object arg
> > > > > > > > > > > > > > > Response: long taskId
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > OP_COMPUTE_TASK_FINISHED (from server to client)
> > > > > > > > > > > > > > > Request: taskId, Object result
> > > > > > > > > > > > > > > Response: empty
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > The same for affinity requests.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Also, we can implement not only the execute task operation, but some other operations from IgniteCompute (broadcast, run, call), but it will be useful only for java thin client.
> > > > > > > > > > > > > > > And even with java thin client we should either implement peer-class-loading for thin clients (this also requires two-way client-server communication) or put classes with executed closures to the server locally.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > What do you think about proposed protocol changes?
> > > > > > > > > > > > > > > Do we need two-way requests between client and server?
> > > > > > > > > > > > > > > Do we need support of compute methods other than "execute task"?
> > > > > > > > > > > > > > > What do you think about peer-class-loading for thin clients?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > --
> > > > > > > > > > > > > > Sergey Kozlov
> > > > > > > > > > > > > > GridGain Systems
> > > > > > > > > > > > > > www.gridgain.com
> > > > > > > > > > > >
> > > > > > > > > > > > --
> > > > > > > > > > > > Sergey Kozlov
> > > > > > > > > > > > GridGain Systems
> > > > > > > > > > > > www.gridgain.com
> > > > > > > > > > >
> > > > > > > > > > > --
> > > > > > > > > > > Alex.
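
For reference, the cancel-by-request-ID flow described in the quoted messages (a cancel request carries the ID of the earlier execute request, and the server answers with two responses) can be sketched as a toy in-memory model. This is only an illustration under stated assumptions: the operation names, the status strings and the `ToyServer` class are hypothetical and are not part of the Ignite thin client protocol.

```python
from dataclasses import dataclass
from itertools import count

# Hypothetical operation names and status strings, for illustration only.
OP_EXECUTE_TASK = "execute_task"
OP_CANCEL_TASK = "cancel_task"


@dataclass
class Response:
    request_id: int      # ID of the request this response answers
    status: str          # "ok", "cancelled" or "not_found"
    payload: object = None


class ToyServer:
    """In-memory stand-in for a server; no networking involved."""

    def __init__(self):
        self.running = {}    # request ID of the execute request -> task name
        self.outbox = []     # responses in the order they would be sent

    def handle(self, request_id, op, body):
        if op == OP_EXECUTE_TASK:
            # The response is deferred until the task finishes or is cancelled.
            self.running[request_id] = body
        elif op == OP_CANCEL_TASK:
            target = body    # request ID of the earlier execute request
            if self.running.pop(target, None) is None:
                self.outbox.append(Response(request_id, "not_found"))
            else:
                # Two responses: one for the cancel request, one for the task.
                self.outbox.append(Response(request_id, "ok"))
                self.outbox.append(Response(target, "cancelled"))
        else:
            raise ValueError(f"unknown operation: {op}")

    def finish(self, request_id, result):
        """Simulate normal completion of a running task."""
        if self.running.pop(request_id, None) is not None:
            self.outbox.append(Response(request_id, "ok", result))


server = ToyServer()
request_ids = count(1)               # client-side request ID generator

first = next(request_ids)
server.handle(first, OP_EXECUTE_TASK, "bar")
second = next(request_ids)
server.handle(second, OP_EXECUTE_TASK, "baz")

cancel = next(request_ids)
server.handle(cancel, OP_CANCEL_TASK, first)   # cancel the first task
server.finish(second, 42)                      # second task completes normally

for response in server.outbox:
    print(response)
```

The task-id variant argued for elsewhere in the thread would differ mainly in what the cancel request carries: an identifier returned by (or generated for) the execute request instead of the request ID itself.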
Looks like we didn't reach consensus here.
Igor, as thin client maintainer, can you please share your opinion?

Everyone else is also welcome, please share your thoughts about the options for implementing compute operations.

Thu, Nov 28, 2019 at 10:02, Alex Plehanov <[hidden email]>:

> > Since all thin client operations are inherently async, we should be able to cancel any of them
> It's illogical to have such an ability. What should cancelling a cancel operation do? Moreover, sometimes it's dangerous, for example, a create cache operation should never be canceled. There should be an explicit set of processes that we can cancel: queries, transactions, tasks, services. The lifecycle of services is more complex than the lifecycle of tasks. With services, I suppose, we can't use request cancelation, so tasks will be the only process with an exceptional pattern.
>
> > The request would be "execute task with specified node filter" - simple and efficient.
> It's not simple: every compute or service request should contain complex node filtering logic, which duplicates the same logic for cluster API.
> It's not efficient: for example, we can't implement forPredicate() filtering in this case.
>
> Wed, Nov 27, 2019 at 19:25, Pavel Tupitsyn <[hidden email]>:
>
>> > The request is already processed (task is started), we can't cancel the request
>> The request is not "start a task". It is "execute task" (and get result).
>> Same as "cache get" - you get a result in the end, we don't "start cache get" then "end cache get".
>>
>> Since all thin client operations are inherently async, we should be able to cancel any of them
>> by sending another request with an id of prior request to be cancelled.
>> That's why I'm advocating for this approach - it will work for anything, no special cases.
>> And it keeps "happy path" as simple as it is right now.
>>
>> Queries are different because we retrieve results in pages, we can't do them as one request.
>> Transactions are also different because client controls when they should end.
>> There is no reason for task execution to be a special case like queries or transactions.
>>
>> > we always need to send 2 requests to server to execute the task
>> Nope. We don't need to get nodes on client at all.
>> The request would be "execute task with specified node filter" - simple and efficient.
>>
>> On Wed, Nov 27, 2019 at 4:31 PM Alex Plehanov <[hidden email]> wrote:
>>
>> > > We do cancel a request to perform a task. We may and should use this to cancel any other request in future.
>> > The request is already processed (task is started), we can't cancel the request. As you mentioned before, we already do almost the same for queries (close the cursor, but not cancel the request to run a query), it's better to do such things in a common way. We have a pattern: start some process (query, transaction), get id of this process, end process by this id. The "Execute task" process should match the same pattern. In my opinion, implementation with two-way requests is the best option to match this pattern (we can even reuse OP_RESOURCE_CLOSE operation type in this case). Sometime in the future, we will need two-way requests for some other functionality (continuous queries, event listening, etc). But even without two-way requests introducing some process id (task id in our case) will be closer to existing pattern than canceling tasks by request id.
>> >
>> > > So every new request will apply those filters on server side, using the most recent set of nodes.
>> > In this case, we always need to send 2 requests to server to execute the task. First - to get nodes by the filter, second - to actually execute the task. It seems like overhead. The same will be for services. Cluster group remains the same if the topology hasn't changed.
>> > We can use this fact and bind "execute task" request to topology. If topology has changed - get nodes for new topology and retry request.
>> >
>> > Tue, Nov 26, 2019 at 17:44, Pavel Tupitsyn <[hidden email]>:
>> >
>> > > > After all, we don't cancel request
>> > > We do cancel a request to perform a task. We may and should use this to cancel any other request in future.
>> > >
>> > > > Client uses some cluster group filtration (for example forServers() cluster group)
>> > > Please see above - Aleksandr Shapkin described how we store filtered cluster groups on client.
>> > > We don't store node IDs, we store actual filters. So every new request will apply those filters on server side,
>> > > using the most recent set of nodes.
>> > >
>> > > var myGrp = cluster.forServers().forAttribute("foo"); // This does not issue any server requests, just builds an object with filters on client
>> > > while (true) myGrp.compute().executeTask("bar"); // Every request includes filters, and filters are applied on the server side
>> > >
>> > > On Tue, Nov 26, 2019 at 1:42 PM Alex Plehanov <[hidden email]> wrote:
>> > >
>> > > > > Anyway, my point stands.
>> > > > I can't agree. Why don't you want to use task id for this? After all, we don't cancel request (request is already processed), we cancel the task. So it's more convenient to use task id here.
>> > > >
>> > > > > Can you please provide equivalent use case with existing "thick" client?
>> > > > For example:
>> > > > Cluster consists of one server node.
>> > > > Client uses some cluster group filtration (for example forServers() cluster group).
>> > > > Client starts to send periodically (for example 1 per minute) long-term (for example 1 hour long) tasks to the cluster.
>> > > > Meanwhile, several server nodes joined the cluster.
>> > > >
>> > > > In case of thick client: All server nodes will be used, tasks will be load balanced.
>> > > > In case of thin client: Only one server node will be used, client will detect topology change after an hour.
>> > > >
>> > > > Tue, Nov 26, 2019 at 11:50, Pavel Tupitsyn <[hidden email]>:
>> > > >
>> > > > > > I can't see any usage of request id in query cursors
>> > > > > You are right, cursor id is a separate thing.
>> > > > > Anyway, my point stands.
>> > > > >
>> > > > > > client sends long term tasks to nodes and wants to do it with load balancing
>> > > > > I still don't get it. Can you please provide equivalent use case with existing "thick" client?
>> > > > >
>> > > > > On Mon, Nov 25, 2019 at 11:59 PM Alex Plehanov <[hidden email]> wrote:
>> > > > >
>> > > > > > > And it is fine to use request ID to identify compute tasks (as we do with query cursors).
>> > > > > > I can't see any usage of request id in query cursors. We send query request and get cursor id in response. After that, we only use cursor id (to get next pages and to close the resource). Did I miss something?
>> > > > > >
>> > > > > > > Looks like I'm missing something - how is topology change relevant to executing compute tasks from client?
>> > > > > > It's not relevant directly. But there are some cases where it will be helpful. For example, if client sends long term tasks to nodes and wants to do it with load balancing it will detect topology change only after some time in the future with the first response, so load balancing will not work.
>> > > > > > Perhaps we can add optional "topology version" field to the OP_COMPUTE_EXECUTE_TASK request to solve this problem.
>> > > > > >
>> > > > > > Mon, Nov 25, 2019 at 22:42, Pavel Tupitsyn <[hidden email]>:
>> > > > > >
>> > > > > > > Alex,
>> > > > > > >
>> > > > > > > > we will mix entities from different layers (transport layer and request body)
>> > > > > > > I would not call our message header (which includes the id) "transport layer".
>> > > > > > > TCP is our transport layer. And it is fine to use request ID to identify compute tasks (as we do with query cursors).
>> > > > > > >
>> > > > > > > > we still can't be sure that the task is successfully started on a server
>> > > > > > > The request to start the task will fail and we'll get a response indicating that right away
>> > > > > > >
>> > > > > > > > we won't ever know about topology change
>> > > > > > > Looks like I'm missing something - how is topology change relevant to executing compute tasks from client?
>> > > > > > >
>> > > > > > > On Mon, Nov 25, 2019 at 10:17 PM Alex Plehanov <[hidden email]> wrote:
>> > > > > > >
>> > > > > > > > Pavel, in this case, we will mix entities from different layers (transport layer and request body), it's not very good. We can achieve the same behavior with a task id generated on the client side, but there will be no inter-layer data intersection and I think it will be easier to implement on both client and server-side. But we still can't be sure that the task is successfully started on a server.
>> > > > > > > > We won't ever know about topology change, because topology changed flag will be sent from server to client only with a response when the task will be completed. Do we accept that?
>> > > > > > > >
>> > > > > > > > Mon, Nov 25, 2019 at 19:07, Pavel Tupitsyn <[hidden email]>:
>> > > > > > > >
>> > > > > > > > > Alex,
>> > > > > > > > >
>> > > > > > > > > I have a simpler idea. We already do request id handling in the protocol, so:
>> > > > > > > > > - Client sends a normal request to execute compute task. Request ID is generated as usual.
>> > > > > > > > > - As soon as task is completed, a response is received.
>> > > > > > > > >
>> > > > > > > > > As for cancellation - client can send a new request (with new request ID) and (in the body) pass the request ID from above
>> > > > > > > > > as a task identifier. As a result, there are two responses:
>> > > > > > > > > - Cancellation response
>> > > > > > > > > - Task response (with proper cancelled status)
>> > > > > > > > >
>> > > > > > > > > That's it, no need to modify the core of the protocol. One request - one response.
>> > > > > > > > >
>> > > > > > > > > On Mon, Nov 25, 2019 at 6:20 PM Alex Plehanov <[hidden email]> wrote:
>> > > > > > > > >
>> > > > > > > > > > Pavel, we need to inform the client when the task is completed, we need the ability to cancel the task. I see several ways to implement this:
>> > > > > > > > > >
>> > > > > > > > > > 1. Client sends a request to the server to start a task, server returns task id in response. Server notifies client when task is completed with a new request (from server to client). Client can cancel the task by sending a new request with operation type "cancel" and task id. In this case, we should implement two-way requests.
>> > > > > > > > > > 2. Client generates unique task id and sends a request to the server to start a task, server doesn't reply immediately but waits until task is completed. Client can cancel task by sending new request with operation type "cancel" and task id. In this case, we should decouple request and response on the server-side (currently response is sent right after request was processed). Also, we can't be sure that task is successfully started on a server.
>> > > > > > > > > > 3. Client sends a request to the server to start a task, server returns id in response. Client periodically asks the server about task status. Client can cancel the task by sending new request with operation type "cancel" and task id. This case brings some overhead to the communication channel.
>> > > > > > > > > > >> > > > > > > > > > Personally, I think that the case with 2-ways requests >> is >> > > > better, >> > > > > > but >> > > > > > > > I'm >> > > > > > > > > > open to any other ideas. >> > > > > > > > > > >> > > > > > > > > > Aleksandr, >> > > > > > > > > > >> > > > > > > > > > Filtering logic for OP_CLUSTER_GROUP_GET_NODE_IDS looks >> > > > > > > > overcomplicated. >> > > > > > > > > Do >> > > > > > > > > > we need server-side filtering at all? Wouldn't it be >> better >> > > to >> > > > > send >> > > > > > > > basic >> > > > > > > > > > info (ids, order, flags) for all nodes (there is >> relatively >> > > > small >> > > > > > > > amount >> > > > > > > > > of >> > > > > > > > > > data) and extended info (attributes) for selected list >> of >> > > > nodes? >> > > > > In >> > > > > > > > this >> > > > > > > > > > case, we can do basic node filtration on client-side >> > > > > (forClients(), >> > > > > > > > > > forServers(), forNodeIds(), forOthers(), etc). >> > > > > > > > > > >> > > > > > > > > > Do you use standard ClusterNode serialization? There are >> > also >> > > > > > metrics >> > > > > > > > > > serialized with ClusterNode, do we need it on thin >> client? >> > > > There >> > > > > > are >> > > > > > > > > other >> > > > > > > > > > interfaces exist to show metrics, I think it's >> redundant to >> > > > > export >> > > > > > > > > metrics >> > > > > > > > > > to thin clients too. >> > > > > > > > > > >> > > > > > > > > > What do you think? >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > пт, 22 нояб. 2019 г. в 20:15, Aleksandr Shapkin < >> > > > > [hidden email] >> > > > > > >: >> > > > > > > > > > >> > > > > > > > > > > Alex, >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > I think you can create a new IEP page and I will fill >> it >> > > with >> > > > > the >> > > > > > > > > Cluster >> > > > > > > > > > > API details. 
>> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > In short, I’ve introduced several new codes: >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > Cluster API is pretty straightforward: >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > OP_CLUSTER_IS_ACTIVE = 5000 >> > > > > > > > > > > >> > > > > > > > > > > OP_CLUSTER_CHANGE_STATE = 5001 >> > > > > > > > > > > >> > > > > > > > > > > OP_CLUSTER_CHANGE_WAL_STATE = 5002 >> > > > > > > > > > > >> > > > > > > > > > > OP_CLUSTER_GET_WAL_STATE = 5003 >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > Cluster group codes: >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > OP_CLUSTER_GROUP_GET_NODE_IDS = 5100 >> > > > > > > > > > > >> > > > > > > > > > > OP_CLUSTER_GROUP_GET_NODE_INFO = 5101 >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > The underlying implementation is based on the thick >> > client >> > > > > logic. >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > For every request, we provide a known topology version >> > and >> > > if >> > > > > it >> > > > > > > has >> > > > > > > > > > > changed, >> > > > > > > > > > > >> > > > > > > > > > > a client updates it firstly and then re-sends the >> > filtering >> > > > > > > request. >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > Alongside the topVer a client sends a serialized nodes >> > > > > projection >> > > > > > > > > object >> > > > > > > > > > > >> > > > > > > > > > > that could be considered as a code to value mapping. 
>> > > > > > > > > > > >> > > > > > > > > > > Consider: [{Code = 1, Value= [“DotNet”, >> “MyAttribute”}, >> > > > > {Code=2, >> > > > > > > > > > Value=1}] >> > > > > > > > > > > >> > > > > > > > > > > Where “1” stands for Attribute filtering and “2” – >> > > > > > serverNodesOnly >> > > > > > > > > flag. >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > As a result of request processing, a server sends >> nodeId >> > > > UUIDs >> > > > > > and >> > > > > > > a >> > > > > > > > > > > current topVer. >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > When a client obtains nodeIds, it can perform a >> NODE_INFO >> > > > call >> > > > > to >> > > > > > > > get a >> > > > > > > > > > > >> > > > > > > > > > > serialized ClusterNode object. In addition there >> should >> > be >> > > a >> > > > > > > > different >> > > > > > > > > > API >> > > > > > > > > > > >> > > > > > > > > > > method for accessing/updating node metrics. >> > > > > > > > > > > >> > > > > > > > > > > чт, 21 нояб. 2019 г. в 12:32, Sergey Kozlov < >> > > > > > [hidden email] >> > > > > > > >: >> > > > > > > > > > > >> > > > > > > > > > > > Hi Pavel >> > > > > > > > > > > > >> > > > > > > > > > > > On Thu, Nov 21, 2019 at 11:30 AM Pavel Tupitsyn < >> > > > > > > > > [hidden email]> >> > > > > > > > > > > > wrote: >> > > > > > > > > > > > >> > > > > > > > > > > > > 1. I believe that Cluster operations for Thin >> Client >> > > > > protocol >> > > > > > > are >> > > > > > > > > > > already >> > > > > > > > > > > > > in the works >> > > > > > > > > > > > > by Alexandr Shapkin. Can't find the ticket though. >> > > > > > > > > > > > > Alexandr, can you please confirm and attach the >> > ticket >> > > > > > number? >> > > > > > > > > > > > > >> > > > > > > > > > > > > 2. 
Proposed changes will work only for Java tasks >> > that >> > > > are >> > > > > > > > already >> > > > > > > > > > > > deployed >> > > > > > > > > > > > > on server nodes. >> > > > > > > > > > > > > This is mostly useless for other thin clients we >> have >> > > > > > (Python, >> > > > > > > > PHP, >> > > > > > > > > > > .NET, >> > > > > > > > > > > > > C++). >> > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > > I don't guess so. The task (execution) is a way to >> > > > implement >> > > > > > own >> > > > > > > > > layer >> > > > > > > > > > > for >> > > > > > > > > > > > the thin client application. >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > > > We should think of a way to make this useful for >> all >> > > > > clients. >> > > > > > > > > > > > > For example, we may allow sending tasks in some >> > > scripting >> > > > > > > > language >> > > > > > > > > > like >> > > > > > > > > > > > > Javascript. >> > > > > > > > > > > > > Thoughts? >> > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > > The arbitrary code execution from a remote client >> must >> > be >> > > > > > > protected >> > > > > > > > > > > > from malicious code. >> > > > > > > > > > > > I don't know how it could be designed but without >> that >> > we >> > > > > open >> > > > > > > the >> > > > > > > > > hole >> > > > > > > > > > > to >> > > > > > > > > > > > kill cluster. >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > > On Thu, Nov 21, 2019 at 11:21 AM Sergey Kozlov < >> > > > > > > > > [hidden email] >> > > > > > > > > > > >> > > > > > > > > > > > > wrote: >> > > > > > > > > > > > > >> > > > > > > > > > > > > > Hi Alex >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > The idea is great. 
But I have some concerns that >> > > > probably >> > > > > > > > should >> > > > > > > > > be >> > > > > > > > > > > > taken >> > > > > > > > > > > > > > into account for design: >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > 1. We need to have the ability to stop a task >> > > > > execution, >> > > > > > > > smth >> > > > > > > > > > like >> > > > > > > > > > > > > > OP_COMPUTE_CANCEL_TASK operation (client to >> > > server) >> > > > > > > > > > > > > > 2. What's about task execution timeout? It >> may >> > > help >> > > > to >> > > > > > the >> > > > > > > > > > cluster >> > > > > > > > > > > > > > survival for buggy tasks >> > > > > > > > > > > > > > 3. Ignite doesn't have roles/authorization >> > > > > functionality >> > > > > > > for >> > > > > > > > > > now. >> > > > > > > > > > > > But >> > > > > > > > > > > > > a >> > > > > > > > > > > > > > task is the risky operation for cluster (for >> > > > security >> > > > > > > > > reasons). >> > > > > > > > > > > > Could >> > > > > > > > > > > > > we >> > > > > > > > > > > > > > add for Ignite configuration new options: >> > > > > > > > > > > > > > - Explicit turning on for compute task >> > support >> > > > for >> > > > > > thin >> > > > > > > > > > > protocol >> > > > > > > > > > > > > > (disabled by default) for whole cluster >> > > > > > > > > > > > > > - Explicit turning on for compute task >> > support >> > > > for >> > > > > a >> > > > > > > node >> > > > > > > > > > > > > > - The list of task names (classes) >> allowed to >> > > > > execute >> > > > > > > by >> > > > > > > > > thin >> > > > > > > > > > > > > client. >> > > > > > > > > > > > > > 4. Support the labeling for task that may >> help >> > to >> > > > > > > > investigate >> > > > > > > > > > > issues >> > > > > > > > > > > > > on >> > > > > > > > > > > > > > cluster (the idea from IEP-34 [1]) >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > 1. 
>> > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> https://cwiki.apache.org/confluence/display/IGNITE/IEP-34+Thin+client%3A+transactions+support >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > On Thu, Nov 21, 2019 at 10:58 AM Alex Plehanov < >> > > > > > > > > > > > [hidden email]> >> > > > > > > > > > > > > > wrote: >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > > Hello, Igniters! >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > I have plans to start implementation of >> Compute >> > > > > interface >> > > > > > > for >> > > > > > > > > > > Ignite >> > > > > > > > > > > > > thin >> > > > > > > > > > > > > > > client and want to discuss features that >> should >> > be >> > > > > > > > implemented. >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > We already have Compute implementation for >> > > > binary-rest >> > > > > > > > clients >> > > > > > > > > > > > > > > (GridClientCompute), which have the following >> > > > > > > functionality: >> > > > > > > > > > > > > > > - Filtering cluster nodes (projection) for >> > compute >> > > > > > > > > > > > > > > - Executing task by the name >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > I think we can implement this functionality >> in a >> > > thin >> > > > > > > client >> > > > > > > > as >> > > > > > > > > > > well. >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > First of all, we need some operation types to >> > > > request a >> > > > > > > list >> > > > > > > > of >> > > > > > > > > > all >> > > > > > > > > > > > > > > available nodes and probably node attributes >> (by >> > a >> > > > list >> > > > > > of >> > > > > > > > > > nodes). 
>> > > > > > > > > > > > Node >> > > > > > > > > > > > > > > attributes will be helpful if we will decide >> to >> > > > > implement >> > > > > > > > > analog >> > > > > > > > > > of >> > > > > > > > > > > > > > > ClusterGroup#forAttribute or >> > > > ClusterGroup#forePredicate >> > > > > > > > methods >> > > > > > > > > > in >> > > > > > > > > > > > the >> > > > > > > > > > > > > > thin >> > > > > > > > > > > > > > > client. Perhaps they can be requested lazily. >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > From the protocol point of view there will be >> two >> > > new >> > > > > > > > > operations: >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > OP_CLUSTER_GET_NODES >> > > > > > > > > > > > > > > Request: empty >> > > > > > > > > > > > > > > Response: long topologyVersion, int >> > > > > minorTopologyVersion, >> > > > > > > int >> > > > > > > > > > > > > nodesCount, >> > > > > > > > > > > > > > > for each node set of node fields (UUID nodeId, >> > > Object >> > > > > or >> > > > > > > > String >> > > > > > > > > > > > > > > consistentId, long order, etc) >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > OP_CLUSTER_GET_NODE_ATTRIBUTES >> > > > > > > > > > > > > > > Request: int nodesCount, for each node: UUID >> > nodeId >> > > > > > > > > > > > > > > Response: int nodesCount, for each node: int >> > > > > > > attributesCount, >> > > > > > > > > for >> > > > > > > > > > > > each >> > > > > > > > > > > > > > node >> > > > > > > > > > > > > > > attribute: String name, Object value >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > To execute tasks we need something like these >> > > methods >> > > > > in >> > > > > > > the >> > > > > > > > > > client >> > > > > > > > > > > > > API: >> > > > > > > > > > > > > > > Object execute(String task, Object arg) >> > > > > > > > > > > > > > > Future<Object> executeAsync(String task, >> Object >> > > arg) >> > > > > > > > > > > > > > > Object affinityExecute(String task, 
String >> cache, >> > > > > Object >> > > > > > > key, >> > > > > > > > > > > Object >> > > > > > > > > > > > > arg) >> > > > > > > > > > > > > > > Future<Object> affinityExecuteAsync(String >> task, >> > > > String >> > > > > > > > cache, >> > > > > > > > > > > Object >> > > > > > > > > > > > > > key, >> > > > > > > > > > > > > > > Object arg) >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > Which can be mapped to protocol operations: >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > OP_COMPUTE_EXECUTE_TASK >> > > > > > > > > > > > > > > Request: UUID nodeId, String taskName, Object >> arg >> > > > > > > > > > > > > > > Response: Object result >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > OP_COMPUTE_EXECUTE_TASK_AFFINITY >> > > > > > > > > > > > > > > Request: String cacheName, Object key, String >> > > > taskName, >> > > > > > > > Object >> > > > > > > > > > arg >> > > > > > > > > > > > > > > Response: Object result >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > The second operation is needed because we >> > sometimes >> > > > > can't >> > > > > > > > > > calculate >> > > > > > > > > > > > and >> > > > > > > > > > > > > > > connect to affinity node on the client-side >> > > (affinity >> > > > > > > > awareness >> > > > > > > > > > can >> > > > > > > > > > > > be >> > > > > > > > > > > > > > > disabled, custom affinity function can be >> used or >> > > > there >> > > > > > can >> > > > > > > > be >> > > > > > > > > no >> > > > > > > > > > > > > > > connection between client and affinity node), >> but >> > > we >> > > > > can >> > > > > > > make >> > > > > > > > > > best >> > > > > > > > > > > > > effort >> > > > > > > > > > > > > > > to send request to target node if affinity >> > > awareness >> > > > is >> > > > > > > > > enabled. 
>> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > Currently, on the server-side requests always >> > > > processed >> > > > > > > > > > > synchronously >> > > > > > > > > > > > > and >> > > > > > > > > > > > > > > responses are sent right after request was >> > > processed. >> > > > > To >> > > > > > > > > execute >> > > > > > > > > > > long >> > > > > > > > > > > > > > tasks >> > > > > > > > > > > > > > > async we should whether change this logic or >> > > > introduce >> > > > > > some >> > > > > > > > > kind >> > > > > > > > > > > > > two-way >> > > > > > > > > > > > > > > communication between client and server (now >> only >> > > > > one-way >> > > > > > > > > > requests >> > > > > > > > > > > > from >> > > > > > > > > > > > > > > client to server are allowed). >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > Two-way communication can also be useful in >> the >> > > > future >> > > > > if >> > > > > > > we >> > > > > > > > > will >> > > > > > > > > > > > send >> > > > > > > > > > > > > > some >> > > > > > > > > > > > > > > server-side generated events to clients. >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > In case of two-way communication there can be >> new >> > > > > > > operations >> > > > > > > > > > > > > introduced: >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > OP_COMPUTE_EXECUTE_TASK (from client to >> server) >> > > > > > > > > > > > > > > Request: UUID nodeId, String taskName, Object >> arg >> > > > > > > > > > > > > > > Response: long taskId >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > OP_COMPUTE_TASK_FINISHED (from server to >> client) >> > > > > > > > > > > > > > > Request: taskId, Object result >> > > > > > > > > > > > > > > Response: empty >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > The same for affinity requests. 
>> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > Also, we can implement not only execute task >> > > > operation, >> > > > > > but >> > > > > > > > > some >> > > > > > > > > > > > other >> > > > > > > > > > > > > > > operations from IgniteCompute (broadcast, run, >> > > call), >> > > > > but >> > > > > > > it >> > > > > > > > > will >> > > > > > > > > > > be >> > > > > > > > > > > > > > useful >> > > > > > > > > > > > > > > only for java thin client. And even with java >> > thin >> > > > > client >> > > > > > > we >> > > > > > > > > > should >> > > > > > > > > > > > > > whether >> > > > > > > > > > > > > > > implement peer-class-loading for thin clients >> > (this >> > > > > also >> > > > > > > > > requires >> > > > > > > > > > > > > two-way >> > > > > > > > > > > > > > > client-server communication) or put classes >> with >> > > > > executed >> > > > > > > > > > closures >> > > > > > > > > > > to >> > > > > > > > > > > > > the >> > > > > > > > > > > > > > > server locally. >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > What do you think about proposed protocol >> > changes? >> > > > > > > > > > > > > > > Do we need two-way requests between client and >> > > > server? >> > > > > > > > > > > > > > > Do we need support of compute methods other >> than >> > > > > "execute >> > > > > > > > > task"? >> > > > > > > > > > > > > > > What do you think about peer-class-loading for >> > thin >> > > > > > > clients? 
|
Sorry for the late reply.
The taskId approach will require a lot of protocol changes and is therefore
"heavier" to implement, but to me it definitely looks less hacky than the
reqId approach. Moreover, as was mentioned, a server notification mechanism
will most likely be required in the future anyway. So from this point of
view I like the taskId approach.

On the other hand, we should also consider performance. In terms of
latency, it looks like the reqId approach will give better results for
small and fast tasks. The only question is whether we want to optimize thin
clients for this case.

Also, what you are talking about mostly involves clients on platforms that
already have a Compute API for thick clients. Let me mention one more point
of view and another concern. The changes you propose will change the
protocol version for sure, and with the taskId approach and server
notifications even more so. But clients such as Python, Node.js, PHP and Go
most probably won't support this API, at least for now, and perhaps never.
Our current backward-compatibility mechanism relies on protocol versions: a
client that supports version 1.5 is assumed to also support all the
features introduced in all previous versions of the protocol. Thus,
implementing the Compute API in any of the proposed ways *may* force those
clients to support protocol changes that they do not necessarily need, just
so that they can introduce new features in the future.

So, maybe it's a good time for us to change our backward compatibility
mechanism from protocol versioning to feature masks?

WDYT?

Best Regards,
Igor

On Fri, Jan 17, 2020 at 9:37 AM Alex Plehanov <[hidden email]> wrote:

> Looks like we didn't reach consensus here.
>
> Igor, as thin client maintainer, can you please share your opinion?
>
> Everyone else is also welcome, please share your thoughts about options
> to implement operations for compute.
>
>
> Thu, Nov 28, 2019 at 10:02, Alex Plehanov <[hidden email]>:
>
> > > Since all thin client operations are inherently async, we should be
> > > able to cancel any of them
> > It's illogical to have such an ability. What should cancelling a cancel
> > operation do? Moreover, sometimes it's dangerous, for example, a create
> > cache operation should never be canceled. There should be an explicit
> > set of processes that we can cancel: queries, transactions, tasks,
> > services. The lifecycle of services is more complex than the lifecycle
> > of tasks. With services, I suppose, we can't use request cancelation,
> > so tasks will be the only process with an exceptional pattern.
> >
> > > The request would be "execute task with specified node filter" -
> > > simple and efficient.
> > It's not simple: every compute or service request should contain
> > complex node filtering logic, which duplicates the same logic for
> > cluster API.
> > It's not efficient: for example, we can't implement forPredicate()
> > filtering in this case.
> >
> >
> > Wed, Nov 27, 2019 at 19:25, Pavel Tupitsyn <[hidden email]>:
> >
> >> > The request is already processed (task is started), we can't cancel
> >> > the request
> >> The request is not "start a task". It is "execute task" (and get
> >> result).
> >> Same as "cache get" - you get a result in the end, we don't "start
> >> cache get" then "end cache get".
> >>
> >> Since all thin client operations are inherently async, we should be
> >> able to cancel any of them by sending another request with an id of
> >> prior request to be cancelled.
> >> That's why I'm advocating for this approach - it will work for
> >> anything, no special cases.
> >> And it keeps "happy path" as simple as it is right now.
> >>
> >> Queries are different because we retrieve results in pages, we can't
> >> do them as one request.
> >> Transactions are also different because client controls when they
> >> should end.
> >> There is no reason for task execution to be a special case like
> >> queries or transactions.
> >>
> >> > we always need to send 2 requests to server to execute the task
> >> Nope. We don't need to get nodes on client at all.
> >> The request would be "execute task with specified node filter" -
> >> simple and efficient.
> >>
> >>
> >> On Wed, Nov 27, 2019 at 4:31 PM Alex Plehanov <[hidden email]>
> >> wrote:
> >>
> >> > > We do cancel a request to perform a task. We may and should use
> >> > > this to cancel any other request in future.
> >> > The request is already processed (task is started), we can't cancel
> >> > the request. As you mentioned before, we already do almost the same
> >> > for queries (close the cursor, but not cancel the request to run a
> >> > query), it's better to do such things in a common way. We have a
> >> > pattern: start some process (query, transaction), get id of this
> >> > process, end process by this id. The "Execute task" process should
> >> > match the same pattern. In my opinion, implementation with two-way
> >> > requests is the best option to match this pattern (we can even reuse
> >> > OP_RESOURCE_CLOSE operation type in this case). Sometime in the
> >> > future, we will need two-way requests for some other functionality
> >> > (continuous queries, event listening, etc). But even without two-way
> >> > requests introducing some process id (task id in our case) will be
> >> > closer to existing pattern than canceling tasks by request id.
> >> >
> >> > > So every new request will apply those filters on server side,
> >> > > using the most recent set of nodes.
> >> > In this case, we always need to send 2 requests to server to execute
> >> > the task. First - to get nodes by the filter, second - to actually
> >> > execute the task. It seems like overhead. The same will be for
> >> > services. Cluster group remains the same if the topology hasn't
> >> > changed.
> >> > We can use this fact and bind "execute task" request to topology. If
> >> > topology has changed - get nodes for new topology and retry request.
> >> >
> >> > Tue, Nov 26, 2019 at 17:44, Pavel Tupitsyn <[hidden email]>:
> >> >
> >> > > > After all, we don't cancel request
> >> > > We do cancel a request to perform a task. We may and should use
> >> > > this to cancel any other request in future.
> >> > >
> >> > > > Client uses some cluster group filtration (for example
> >> > > > forServers() cluster group)
> >> > > Please see above - Aleksandr Shapkin described how we store
> >> > > filtered cluster groups on client.
> >> > > We don't store node IDs, we store actual filters. So every new
> >> > > request will apply those filters on server side,
> >> > > using the most recent set of nodes.
> >> > >
> >> > > // This does not issue any server requests, just builds an object
> >> > > // with filters on client
> >> > > var myGrp = cluster.forServers().forAttribute("foo");
> >> > > // Every request includes filters, and filters are applied on the
> >> > > // server side
> >> > > while (true) myGrp.compute().executeTask("bar");
> >> > >
> >> > > On Tue, Nov 26, 2019 at 1:42 PM Alex Plehanov <[hidden email]>
> >> > > wrote:
> >> > >
> >> > > > > Anyway, my point stands.
> >> > > > I can't agree. Why don't you want to use task id for this? After
> >> > > > all, we don't cancel request (request is already processed), we
> >> > > > cancel the task. So it's more convenient to use task id here.
> >> > > >
> >> > > > > Can you please provide equivalent use case with existing
> >> > > > > "thick" client?
> >> > > > For example:
> >> > > > Cluster consists of one server node.
> >> > > > Client uses some cluster group filtration (for example
> >> > > > forServers() cluster group).
> >> > > > Client starts to periodically send (for example, 1 per minute)
> >> > > > long-term (for example, 1 hour long) tasks to the cluster.
> >> > > > Meanwhile, several server nodes joined the cluster.
> >> > > >
> >> > > > In case of thick client: All server nodes will be used, tasks
> >> > > > will be load balanced.
> >> > > > In case of thin client: Only one server node will be used, client
> >> > > > will detect topology change after an hour.
> >> > > >
> >> > > >
> >> > > > Tue, Nov 26, 2019 at 11:50, Pavel Tupitsyn <[hidden email]>:
> >> > > >
> >> > > > > > I can't see any usage of request id in query cursors
> >> > > > > You are right, cursor id is a separate thing.
> >> > > > > Anyway, my point stands.
> >> > > > >
> >> > > > > > client sends long term tasks to nodes and wants to do it
> >> > > > > > with load balancing
> >> > > > > I still don't get it. Can you please provide equivalent use
> >> > > > > case with existing "thick" client?
> >> > > > >
> >> > > > >
> >> > > > > On Mon, Nov 25, 2019 at 11:59 PM Alex Plehanov
> >> > > > > <[hidden email]> wrote:
> >> > > > >
> >> > > > > > > And it is fine to use request ID to identify compute tasks
> >> > > > > > > (as we do with query cursors).
> >> > > > > > I can't see any usage of request id in query cursors. We send
> >> > > > > > query request and get cursor id in response. After that, we
> >> > > > > > only use cursor id (to get next pages and to close the
> >> > > > > > resource). Did I miss something?
> >> > > > > >
> >> > > > > > > Looks like I'm missing something - how is topology change
> >> > > > > > > relevant to executing compute tasks from client?
> >> > > > > > It's not relevant directly. But there are some cases where it
> >> > > > > > will be helpful.
> >> > > > > > For example, if client sends long term tasks to nodes and
> >> > > > > > wants to do it with load balancing it will detect topology
> >> > > > > > change only after some time in the future with the first
> >> > > > > > response, so load balancing will not work.
> >> > > > > > Perhaps we can add optional "topology version" field to the
> >> > > > > > OP_COMPUTE_EXECUTE_TASK request to solve this problem.
> >> > > > > >
> >> > > > > >
> >> > > > > > Mon, Nov 25, 2019 at 22:42, Pavel Tupitsyn <[hidden email]>:
> >> > > > > >
> >> > > > > > > Alex,
> >> > > > > > >
> >> > > > > > > > we will mix entities from different layers (transport
> >> > > > > > > > layer and request body)
> >> > > > > > > I would not call our message header (which includes the id)
> >> > > > > > > "transport layer".
> >> > > > > > > TCP is our transport layer. And it is fine to use request
> >> > > > > > > ID to identify compute tasks (as we do with query cursors).
> >> > > > > > >
> >> > > > > > > > we still can't be sure that the task is successfully
> >> > > > > > > > started on a server
> >> > > > > > > The request to start the task will fail and we'll get a
> >> > > > > > > response indicating that right away
> >> > > > > > >
> >> > > > > > > > we won't ever know about topology change
> >> > > > > > > Looks like I'm missing something - how is topology change
> >> > > > > > > relevant to executing compute tasks from client?
> >> > > > > > >
> >> > > > > > > On Mon, Nov 25, 2019 at 10:17 PM Alex Plehanov
> >> > > > > > > <[hidden email]> wrote:
> >> > > > > > >
> >> > > > > > > > Pavel, in this case, we will mix entities from different
> >> > > > > > > > layers (transport layer and request body), it's not very
> >> > > > > > > > good.
> >> > > > > > > > We can achieve the same behavior with a task id generated
> >> > > > > > > > on the client side, but there will be no inter-layer data
> >> > > > > > > > intersection and I think it will be easier to implement on
> >> > > > > > > > both client and server-side. But we still can't be sure
> >> > > > > > > > that the task is successfully started on a server. We
> >> > > > > > > > won't ever know about topology change, because topology
> >> > > > > > > > changed flag will be sent from server to client only with
> >> > > > > > > > a response when the task is completed. Do we accept that?
> >> > > > > > > >
> >> > > > > > > > Mon, Nov 25, 2019 at 19:07, Pavel Tupitsyn
> >> > > > > > > > <[hidden email]>:
> >> > > > > > > >
> >> > > > > > > > > Alex,
> >> > > > > > > > >
> >> > > > > > > > > I have a simpler idea. We already do request id handling
> >> > > > > > > > > in the protocol, so:
> >> > > > > > > > > - Client sends a normal request to execute compute task.
> >> > > > > > > > > Request ID is generated as usual.
> >> > > > > > > > > - As soon as task is completed, a response is received.
> >> > > > > > > > >
> >> > > > > > > > > As for cancellation - client can send a new request
> >> > > > > > > > > (with new request ID) and (in the body) pass the request
> >> > > > > > > > > ID from above as a task identifier. As a result, there
> >> > > > > > > > > are two responses:
> >> > > > > > > > > - Cancellation response
> >> > > > > > > > > - Task response (with proper cancelled status)
> >> > > > > > > > >
> >> > > > > > > > > That's it, no need to modify the core of the protocol.
> >> > > > > > > > > One request - one response.
> >> > > > > > > > > > >> > > > > > > > > On Mon, Nov 25, 2019 at 6:20 PM Alex Plehanov < > >> > > > > > [hidden email] > >> > > > > > > > > >> > > > > > > > > wrote: > >> > > > > > > > > > >> > > > > > > > > > Pavel, we need to inform the client when the task is > >> > > completed, > >> > > > > we > >> > > > > > > need > >> > > > > > > > > the > >> > > > > > > > > > ability to cancel the task. I see several ways to > >> implement > >> > > > this: > >> > > > > > > > > > > >> > > > > > > > > > 1. Сlient sends a request to the server to start a > task, > >> > > server > >> > > > > > > return > >> > > > > > > > > task > >> > > > > > > > > > id in response. Server notifies client when task is > >> > completed > >> > > > > with > >> > > > > > a > >> > > > > > > > new > >> > > > > > > > > > request (from server to client). Client can cancel the > >> task > >> > > by > >> > > > > > > sending > >> > > > > > > > a > >> > > > > > > > > > new request with operation type "cancel" and task id. > In > >> > this > >> > > > > case, > >> > > > > > > we > >> > > > > > > > > > should implement 2-ways requests. > >> > > > > > > > > > 2. Client generates unique task id and sends a request > >> to > >> > the > >> > > > > > server > >> > > > > > > to > >> > > > > > > > > > start a task, server don't reply immediately but wait > >> until > >> > > > task > >> > > > > is > >> > > > > > > > > > completed. Client can cancel task by sending new > request > >> > with > >> > > > > > > operation > >> > > > > > > > > > type "cancel" and task id. In this case, we should > >> decouple > >> > > > > request > >> > > > > > > and > >> > > > > > > > > > response on the server-side (currently response is > sent > >> > right > >> > > > > after > >> > > > > > > > > request > >> > > > > > > > > > was processed). Also, we can't be sure that task is > >> > > > successfully > >> > > > > > > > started > >> > > > > > > > > on > >> > > > > > > > > > a server. > >> > > > > > > > > > 3. 
Client sends a request to the server to start a > task, > >> > > server > >> > > > > > > return > >> > > > > > > > id > >> > > > > > > > > > in response. Client periodically asks the server about > >> task > >> > > > > status. > >> > > > > > > > > Client > >> > > > > > > > > > can cancel the task by sending new request with > >> operation > >> > > type > >> > > > > > > "cancel" > >> > > > > > > > > and > >> > > > > > > > > > task id. This case brings some overhead to the > >> > communication > >> > > > > > channel. > >> > > > > > > > > > > >> > > > > > > > > > Personally, I think that the case with 2-ways requests > >> is > >> > > > better, > >> > > > > > but > >> > > > > > > > I'm > >> > > > > > > > > > open to any other ideas. > >> > > > > > > > > > > >> > > > > > > > > > Aleksandr, > >> > > > > > > > > > > >> > > > > > > > > > Filtering logic for OP_CLUSTER_GROUP_GET_NODE_IDS > looks > >> > > > > > > > overcomplicated. > >> > > > > > > > > Do > >> > > > > > > > > > we need server-side filtering at all? Wouldn't it be > >> better > >> > > to > >> > > > > send > >> > > > > > > > basic > >> > > > > > > > > > info (ids, order, flags) for all nodes (there is > >> relatively > >> > > > small > >> > > > > > > > amount > >> > > > > > > > > of > >> > > > > > > > > > data) and extended info (attributes) for selected list > >> of > >> > > > nodes? > >> > > > > In > >> > > > > > > > this > >> > > > > > > > > > case, we can do basic node filtration on client-side > >> > > > > (forClients(), > >> > > > > > > > > > forServers(), forNodeIds(), forOthers(), etc). > >> > > > > > > > > > > >> > > > > > > > > > Do you use standard ClusterNode serialization? There > are > >> > also > >> > > > > > metrics > >> > > > > > > > > > serialized with ClusterNode, do we need it on thin > >> client? 
> >> > > > There > >> > > > > > are > >> > > > > > > > > other > >> > > > > > > > > > interfaces exist to show metrics, I think it's > >> redundant to > >> > > > > export > >> > > > > > > > > metrics > >> > > > > > > > > > to thin clients too. > >> > > > > > > > > > > >> > > > > > > > > > What do you think? > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > пт, 22 нояб. 2019 г. в 20:15, Aleksandr Shapkin < > >> > > > > [hidden email] > >> > > > > > >: > >> > > > > > > > > > > >> > > > > > > > > > > Alex, > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > I think you can create a new IEP page and I will > fill > >> it > >> > > with > >> > > > > the > >> > > > > > > > > Cluster > >> > > > > > > > > > > API details. > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > In short, I’ve introduced several new codes: > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > Cluster API is pretty straightforward: > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > OP_CLUSTER_IS_ACTIVE = 5000 > >> > > > > > > > > > > > >> > > > > > > > > > > OP_CLUSTER_CHANGE_STATE = 5001 > >> > > > > > > > > > > > >> > > > > > > > > > > OP_CLUSTER_CHANGE_WAL_STATE = 5002 > >> > > > > > > > > > > > >> > > > > > > > > > > OP_CLUSTER_GET_WAL_STATE = 5003 > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > Cluster group codes: > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > OP_CLUSTER_GROUP_GET_NODE_IDS = 5100 > >> > > > > > > > > > > > >> > > > > > > > > > > OP_CLUSTER_GROUP_GET_NODE_INFO = 5101 > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > The underlying implementation is based on the thick > >> 
> client > >> > > > > logic. > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > For every request, we provide a known topology > version > >> > and > >> > > if > >> > > > > it > >> > > > > > > has > >> > > > > > > > > > > changed, > >> > > > > > > > > > > > >> > > > > > > > > > > a client updates it firstly and then re-sends the > >> > filtering > >> > > > > > > request. > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > Alongside the topVer a client sends a serialized > nodes > >> > > > > projection > >> > > > > > > > > object > >> > > > > > > > > > > > >> > > > > > > > > > > that could be considered as a code to value mapping. > >> > > > > > > > > > > > >> > > > > > > > > > > Consider: [{Code = 1, Value= [“DotNet”, > >> “MyAttribute”}, > >> > > > > {Code=2, > >> > > > > > > > > > Value=1}] > >> > > > > > > > > > > > >> > > > > > > > > > > Where “1” stands for Attribute filtering and “2” – > >> > > > > > serverNodesOnly > >> > > > > > > > > flag. > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > As a result of request processing, a server sends > >> nodeId > >> > > > UUIDs > >> > > > > > and > >> > > > > > > a > >> > > > > > > > > > > current topVer. > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > When a client obtains nodeIds, it can perform a > >> NODE_INFO > >> > > > call > >> > > > > to > >> > > > > > > > get a > >> > > > > > > > > > > > >> > > > > > > > > > > serialized ClusterNode object. In addition there > >> should > >> > be > >> > > a > >> > > > > > > > different > >> > > > > > > > > > API > >> > > > > > > > > > > > >> > > > > > > > > > > method for accessing/updating node metrics. > >> > > > > > > > > > > > >> > > > > > > > > > > чт, 21 нояб. 2019 г. 
в 12:32, Sergey Kozlov < > >> > > > > > [hidden email] > >> > > > > > > >: > >> > > > > > > > > > > > >> > > > > > > > > > > > Hi Pavel > >> > > > > > > > > > > > > >> > > > > > > > > > > > On Thu, Nov 21, 2019 at 11:30 AM Pavel Tupitsyn < > >> > > > > > > > > [hidden email]> > >> > > > > > > > > > > > wrote: > >> > > > > > > > > > > > > >> > > > > > > > > > > > > 1. I believe that Cluster operations for Thin > >> Client > >> > > > > protocol > >> > > > > > > are > >> > > > > > > > > > > already > >> > > > > > > > > > > > > in the works > >> > > > > > > > > > > > > by Alexandr Shapkin. Can't find the ticket > though. > >> > > > > > > > > > > > > Alexandr, can you please confirm and attach the > >> > ticket > >> > > > > > number? > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > 2. Proposed changes will work only for Java > tasks > >> > that > >> > > > are > >> > > > > > > > already > >> > > > > > > > > > > > deployed > >> > > > > > > > > > > > > on server nodes. > >> > > > > > > > > > > > > This is mostly useless for other thin clients we > >> have > >> > > > > > (Python, > >> > > > > > > > PHP, > >> > > > > > > > > > > .NET, > >> > > > > > > > > > > > > C++). > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > I don't guess so. The task (execution) is a way to > >> > > > implement > >> > > > > > own > >> > > > > > > > > layer > >> > > > > > > > > > > for > >> > > > > > > > > > > > the thin client application. > >> > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > > We should think of a way to make this useful for > >> all > >> > > > > clients. > >> > > > > > > > > > > > > For example, we may allow sending tasks in some > >> > > scripting > >> > > > > > > > language > >> > > > > > > > > > like > >> > > > > > > > > > > > > Javascript. > >> > > > > > > > > > > > > Thoughts? 
> >> > > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > The arbitrary code execution from a remote client > >> must > >> > be > >> > > > > > > protected > >> > > > > > > > > > > > from malicious code. > >> > > > > > > > > > > > I don't know how it could be designed but without > >> that > >> > we > >> > > > > open > >> > > > > > > the > >> > > > > > > > > hole > >> > > > > > > > > > > to > >> > > > > > > > > > > > kill cluster. > >> > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > On Thu, Nov 21, 2019 at 11:21 AM Sergey Kozlov < > >> > > > > > > > > [hidden email] > >> > > > > > > > > > > > >> > > > > > > > > > > > > wrote: > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > Hi Alex > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > The idea is great. But I have some concerns > that > >> > > > probably > >> > > > > > > > should > >> > > > > > > > > be > >> > > > > > > > > > > > taken > >> > > > > > > > > > > > > > into account for design: > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > 1. We need to have the ability to stop a > task > >> > > > > execution, > >> > > > > > > > smth > >> > > > > > > > > > like > >> > > > > > > > > > > > > > OP_COMPUTE_CANCEL_TASK operation (client > to > >> > > server) > >> > > > > > > > > > > > > > 2. What's about task execution timeout? It > >> may > >> > > help > >> > > > to > >> > > > > > the > >> > > > > > > > > > cluster > >> > > > > > > > > > > > > > survival for buggy tasks > >> > > > > > > > > > > > > > 3. Ignite doesn't have roles/authorization > >> > > > > functionality > >> > > > > > > for > >> > > > > > > > > > now. > >> > > > > > > > > > > > But > >> > > > > > > > > > > > > a > >> > > > > > > > > > > > > > task is the risky operation for cluster > (for > >> > > > security > >> > > > > > > > > reasons). 
> >> > > > > > > > > > > > Could > >> > > > > > > > > > > > > we > >> > > > > > > > > > > > > > add for Ignite configuration new options: > >> > > > > > > > > > > > > > - Explicit turning on for compute task > >> > support > >> > > > for > >> > > > > > thin > >> > > > > > > > > > > protocol > >> > > > > > > > > > > > > > (disabled by default) for whole cluster > >> > > > > > > > > > > > > > - Explicit turning on for compute task > >> > support > >> > > > for > >> > > > > a > >> > > > > > > node > >> > > > > > > > > > > > > > - The list of task names (classes) > >> allowed to > >> > > > > execute > >> > > > > > > by > >> > > > > > > > > thin > >> > > > > > > > > > > > > client. > >> > > > > > > > > > > > > > 4. Support the labeling for task that may > >> help > >> > to > >> > > > > > > > investigate > >> > > > > > > > > > > issues > >> > > > > > > > > > > > > on > >> > > > > > > > > > > > > > cluster (the idea from IEP-34 [1]) > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > 1. > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > https://cwiki.apache.org/confluence/display/IGNITE/IEP-34+Thin+client%3A+transactions+support > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > On Thu, Nov 21, 2019 at 10:58 AM Alex > Plehanov < > >> > > > > > > > > > > > [hidden email]> > >> > > > > > > > > > > > > > wrote: > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > Hello, Igniters! 
> >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > I have plans to start implementation of > >> Compute > >> > > > > interface > >> > > > > > > for > >> > > > > > > > > > > Ignite > >> > > > > > > > > > > > > thin > >> > > > > > > > > > > > > > > client and want to discuss features that > >> should > >> > be > >> > > > > > > > implemented. > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > We already have Compute implementation for > >> > > > binary-rest > >> > > > > > > > clients > >> > > > > > > > > > > > > > > (GridClientCompute), which have the > following > >> > > > > > > functionality: > >> > > > > > > > > > > > > > > - Filtering cluster nodes (projection) for > >> > compute > >> > > > > > > > > > > > > > > - Executing task by the name > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > I think we can implement this functionality > >> in a > >> > > thin > >> > > > > > > client > >> > > > > > > > as > >> > > > > > > > > > > well. > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > First of all, we need some operation types > to > >> > > > request a > >> > > > > > > list > >> > > > > > > > of > >> > > > > > > > > > all > >> > > > > > > > > > > > > > > available nodes and probably node attributes > >> (by > >> > a > >> > > > list > >> > > > > > of > >> > > > > > > > > > nodes). > >> > > > > > > > > > > > Node > >> > > > > > > > > > > > > > > attributes will be helpful if we will decide > >> to > >> > > > > implement > >> > > > > > > > > analog > >> > > > > > > > > > of > >> > > > > > > > > > > > > > > ClusterGroup#forAttribute or > >> > > > ClusterGroup#forePredicate > >> > > > > > > > methods > >> > > > > > > > > > in > >> > > > > > > > > > > > the > >> > > > > > > > > > > > > > thin > >> > > > > > > > > > > > > > > client. Perhaps they can be requested > lazily. 
> >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > From the protocol point of view there will > be > >> two > >> > > new > >> > > > > > > > > operations: > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > OP_CLUSTER_GET_NODES > >> > > > > > > > > > > > > > > Request: empty > >> > > > > > > > > > > > > > > Response: long topologyVersion, int > >> > > > > minorTopologyVersion, > >> > > > > > > int > >> > > > > > > > > > > > > nodesCount, > >> > > > > > > > > > > > > > > for each node set of node fields (UUID > nodeId, > >> > > Object > >> > > > > or > >> > > > > > > > String > >> > > > > > > > > > > > > > > consistentId, long order, etc) > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > OP_CLUSTER_GET_NODE_ATTRIBUTES > >> > > > > > > > > > > > > > > Request: int nodesCount, for each node: UUID > >> > nodeId > >> > > > > > > > > > > > > > > Response: int nodesCount, for each node: int > >> > > > > > > attributesCount, > >> > > > > > > > > for > >> > > > > > > > > > > > each > >> > > > > > > > > > > > > > node > >> > > > > > > > > > > > > > > attribute: String name, Object value > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > To execute tasks we need something like > these > >> > > methods > >> > > > > in > >> > > > > > > the > >> > > > > > > > > > client > >> > > > > > > > > > > > > API: > >> > > > > > > > > > > > > > > Object execute(String task, Object arg) > >> > > > > > > > > > > > > > > Future<Object> executeAsync(String task, > >> Object > >> > > arg) > >> > > > > > > > > > > > > > > Object affinityExecute(String task, String > >> cache, > >> > > > > Object > >> > > > > > > key, > >> > > > > > > > > > > Object > >> > > > > > > > > > > > > arg) > >> > > > > > > > > > > > > > > Future<Object> affinityExecuteAsync(String > >> task, > >> > > > String > >> > > > > > > > cache, > >> > > > > > > > > > > Object > >> > > > > > > > > > > > > > key, > >> > > > > > > > > > > > > > > Object arg) > >> > > > > > > 
> > > > > > > > > >> > > > > > > > > > > > > > > Which can be mapped to protocol operations: > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > OP_COMPUTE_EXECUTE_TASK > >> > > > > > > > > > > > > > > Request: UUID nodeId, String taskName, > Object > >> arg > >> > > > > > > > > > > > > > > Response: Object result > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > OP_COMPUTE_EXECUTE_TASK_AFFINITY > >> > > > > > > > > > > > > > > Request: String cacheName, Object key, > String > >> > > > taskName, > >> > > > > > > > Object > >> > > > > > > > > > arg > >> > > > > > > > > > > > > > > Response: Object result > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > The second operation is needed because we > >> > sometimes > >> > > > > can't > >> > > > > > > > > > calculate > >> > > > > > > > > > > > and > >> > > > > > > > > > > > > > > connect to affinity node on the client-side > >> > > (affinity > >> > > > > > > > awareness > >> > > > > > > > > > can > >> > > > > > > > > > > > be > >> > > > > > > > > > > > > > > disabled, custom affinity function can be > >> used or > >> > > > there > >> > > > > > can > >> > > > > > > > be > >> > > > > > > > > no > >> > > > > > > > > > > > > > > connection between client and affinity > node), > >> but > >> > > we > >> > > > > can > >> > > > > > > make > >> > > > > > > > > > best > >> > > > > > > > > > > > > effort > >> > > > > > > > > > > > > > > to send request to target node if affinity > >> > > awareness > >> > > > is > >> > > > > > > > > enabled. > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > Currently, on the server-side requests > always > >> > > > processed > >> > > > > > > > > > > synchronously > >> > > > > > > > > > > > > and > >> > > > > > > > > > > > > > > responses are sent right after request was > >> > > processed. 
> >> > > > > To > >> > > > > > > > > execute > >> > > > > > > > > > > long > >> > > > > > > > > > > > > > tasks > >> > > > > > > > > > > > > > > async we should whether change this logic or > >> > > > introduce > >> > > > > > some > >> > > > > > > > > kind > >> > > > > > > > > > > > > two-way > >> > > > > > > > > > > > > > > communication between client and server (now > >> only > >> > > > > one-way > >> > > > > > > > > > requests > >> > > > > > > > > > > > from > >> > > > > > > > > > > > > > > client to server are allowed). > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > Two-way communication can also be useful in > >> the > >> > > > future > >> > > > > if > >> > > > > > > we > >> > > > > > > > > will > >> > > > > > > > > > > > send > >> > > > > > > > > > > > > > some > >> > > > > > > > > > > > > > > server-side generated events to clients. > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > In case of two-way communication there can > be > >> new > >> > > > > > > operations > >> > > > > > > > > > > > > introduced: > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > OP_COMPUTE_EXECUTE_TASK (from client to > >> server) > >> > > > > > > > > > > > > > > Request: UUID nodeId, String taskName, > Object > >> arg > >> > > > > > > > > > > > > > > Response: long taskId > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > OP_COMPUTE_TASK_FINISHED (from server to > >> client) > >> > > > > > > > > > > > > > > Request: taskId, Object result > >> > > > > > > > > > > > > > > Response: empty > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > The same for affinity requests. 
> >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > Also, we can implement not only execute task > >> > > > operation, > >> > > > > > but > >> > > > > > > > > some > >> > > > > > > > > > > > other > >> > > > > > > > > > > > > > > operations from IgniteCompute (broadcast, > run, > >> > > call), > >> > > > > but > >> > > > > > > it > >> > > > > > > > > will > >> > > > > > > > > > > be > >> > > > > > > > > > > > > > useful > >> > > > > > > > > > > > > > > only for java thin client. And even with > java > >> > thin > >> > > > > client > >> > > > > > > we > >> > > > > > > > > > should > >> > > > > > > > > > > > > > whether > >> > > > > > > > > > > > > > > implement peer-class-loading for thin > clients > >> > (this > >> > > > > also > >> > > > > > > > > requires > >> > > > > > > > > > > > > two-way > >> > > > > > > > > > > > > > > client-server communication) or put classes > >> with > >> > > > > executed > >> > > > > > > > > > closures > >> > > > > > > > > > > to > >> > > > > > > > > > > > > the > >> > > > > > > > > > > > > > > server locally. > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > What do you think about proposed protocol > >> > changes? > >> > > > > > > > > > > > > > > Do we need two-way requests between client > and > >> > > > server? > >> > > > > > > > > > > > > > > Do we need support of compute methods other > >> than > >> > > > > "execute > >> > > > > > > > > task"? > >> > > > > > > > > > > > > > > What do you think about peer-class-loading > for > >> > thin > >> > > > > > > clients? 
> >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > -- > >> > > > > > > > > > > > > > Sergey Kozlov > >> > > > > > > > > > > > > > GridGain Systems > >> > > > > > > > > > > > > > www.gridgain.com > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > -- > >> > > > > > > > > > > > Sergey Kozlov > >> > > > > > > > > > > > GridGain Systems > >> > > > > > > > > > > > www.gridgain.com > >> > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > -- > >> > > > > > > > > > > Alex. > >> > > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > > > |
Huge +1 from me for Feature Masks.
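To make the idea a bit more concrete, here is a rough sketch of how feature-mask negotiation could look, as opposed to comparing protocol versions. Everything in it is an assumption made for illustration: the ProtocolFeature names, their bit positions and the FeatureMasks helper are hypothetical and are not part of the existing protocol. The only point is that each side advertises a set of bits and only the intersection is used.

```java
import java.util.BitSet;
import java.util.EnumSet;

/** Hypothetical protocol features; names and bit positions are illustrative only. */
enum ProtocolFeature {
    COMPUTE_TASKS(0),
    SERVER_NOTIFICATIONS(1),
    CLUSTER_API(2);

    final int bit;

    ProtocolFeature(int bit) {
        this.bit = bit;
    }
}

/** Hypothetical helper, not an existing Ignite class. */
final class FeatureMasks {
    private FeatureMasks() {
    }

    /** Encodes a feature set as a bit mask that could be sent in a handshake. */
    static byte[] encode(EnumSet<ProtocolFeature> features) {
        BitSet bits = new BitSet();
        for (ProtocolFeature f : features)
            bits.set(f.bit);
        return bits.toByteArray();
    }

    /** Decodes a mask, ignoring any bits this side does not know about. */
    static EnumSet<ProtocolFeature> decode(byte[] mask) {
        BitSet bits = BitSet.valueOf(mask);
        EnumSet<ProtocolFeature> res = EnumSet.noneOf(ProtocolFeature.class);
        for (ProtocolFeature f : ProtocolFeature.values()) {
            if (bits.get(f.bit))
                res.add(f);
        }
        return res;
    }

    /** Features usable on a connection: only those advertised by both sides. */
    static EnumSet<ProtocolFeature> negotiate(byte[] clientMask, byte[] serverMask) {
        EnumSet<ProtocolFeature> common = decode(clientMask);
        common.retainAll(decode(serverMask));
        return common;
    }
}
```

With something like this, a client that never implements compute would simply not set the corresponding bit, and the server would not expect it to understand compute-related messages.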
I think this should be our top priority for the thin client protocol, since it simplifies change management a lot.

On Mon, Jan 20, 2020 at 8:21 PM Igor Sapego <[hidden email]> wrote:

> Sorry for the late reply.
>
> The approach with taskId will require a lot of changes in the protocol and
> is thus more "heavy" to implement, but it definitely looks less hacky to me
> than the reqId approach. Moreover, as was mentioned, a server notification
> mechanism will most probably be required in the future anyway. So from this
> point of view I like the taskId approach.
>
> On the other hand, we should also consider performance here. Speaking of
> latency, it looks like reqId will give better results for small and fast
> tasks. The only question is whether we want to optimize thin clients for
> this case.
>
> Also, what you are talking about mostly involves clients on platforms that
> already have a Compute API for thick clients. Let me mention one more point
> of view and another concern here.
>
> The changes you propose are going to change the protocol version for sure.
> With the taskId approach and server notifications, even more so.
>
> But clients such as Python, Node.js, PHP and Go most probably won't have
> support for this API, at least for now. Or ever. But the current
> backward-compatibility mechanism relies on protocol versions, where we
> imply that a client that supports version 1.5 also supports all the
> features introduced in all the previous versions of the protocol.
>
> Thus implementing the Compute API in any of the proposed ways *may* force
> the mentioned clients to support protocol changes which they do not
> necessarily need in order to introduce new features in the future.
>
> So, maybe it's a good time for us to change our backward compatibility
> mechanism from protocol versioning to feature masks?
>
> WDYT?
>
> Best Regards,
> Igor
>
> On Fri, Jan 17, 2020 at 9:37 AM Alex Plehanov <[hidden email]> wrote:
>
> > Looks like we didn't reach consensus here.
> >
> > Igor, as thin client maintainer, can you please share your opinion?
> >
> > Everyone else is also welcome, please share your thoughts about options
> > to implement operations for compute.
> >
> > Thu, Nov 28, 2019 at 10:02, Alex Plehanov <[hidden email]>:
> >
> > > > Since all thin client operations are inherently async, we should be
> > > > able to cancel any of them
> > > It's illogical to have such an ability. What should cancelling a cancel
> > > operation do? Moreover, sometimes it's dangerous: for example, a create
> > > cache operation should never be canceled. There should be an explicit
> > > set of processes that we can cancel: queries, transactions, tasks,
> > > services. The lifecycle of services is more complex than the lifecycle
> > > of tasks. With services, I suppose, we can't use request cancellation,
> > > so tasks will be the only process with an exceptional pattern.
> > >
> > > > The request would be "execute task with specified node filter" -
> > > > simple and efficient.
> > > It's not simple: every compute or service request should contain complex
> > > node filtering logic, which duplicates the same logic for the cluster
> > > API. It's not efficient: for example, we can't implement forPredicate()
> > > filtering in this case.
> > >
> > > Wed, Nov 27, 2019 at 19:25, Pavel Tupitsyn <[hidden email]>:
> > >
> > > > > The request is already processed (task is started), we can't cancel
> > > > > the request
> > > > The request is not "start a task". It is "execute task" (and get
> > > > result). Same as "cache get" - you get a result in the end, we don't
> > > > "start cache get" then "end cache get".
> > > >
> > > > Since all thin client operations are inherently async, we should be
> > > > able to cancel any of them by sending another request with the id of
> > > > the prior request to be cancelled.
> > > > That's why I'm advocating for this approach - it will work for
> > > > anything, no special cases. And it keeps the "happy path" as simple as
> > > > it is right now.
> > > >
> > > > Queries are different because we retrieve results in pages, we can't
> > > > do them as one request. Transactions are also different because the
> > > > client controls when they should end. There is no reason for task
> > > > execution to be a special case like queries or transactions.
> > > >
> > > > > we always need to send 2 requests to server to execute the task
> > > > Nope. We don't need to get nodes on the client at all. The request
> > > > would be "execute task with specified node filter" - simple and
> > > > efficient.
> > > >
> > > > On Wed, Nov 27, 2019 at 4:31 PM Alex Plehanov <[hidden email]> wrote:
> > > >
> > > > > > We do cancel a request to perform a task. We may and should use
> > > > > > this to cancel any other request in future.
> > > > > The request is already processed (task is started), we can't cancel
> > > > > the request. As you mentioned before, we already do almost the same
> > > > > for queries (close the cursor, but not cancel the request to run a
> > > > > query), it's better to do such things in a common way. We have a
> > > > > pattern: start some process (query, transaction), get the id of this
> > > > > process, end the process by this id. The "Execute task" process
> > > > > should match the same pattern. In my opinion, an implementation with
> > > > > two-way requests is the best option to match this pattern (we can
> > > > > even reuse the OP_RESOURCE_CLOSE operation type in this case).
> > > > > Sometime in the future, we will need two-way requests for some other
> > > > > functionality (continuous queries, event listening, etc).
> > > > > But even without two-way requests, introducing some process id (task
> > > > > id in our case) will be closer to the existing pattern than
> > > > > canceling tasks by request id.
> > > > >
> > > > > > So every new request will apply those filters on server side,
> > > > > > using the most recent set of nodes.
> > > > > In this case, we always need to send 2 requests to the server to
> > > > > execute the task. First - to get nodes by the filter, second - to
> > > > > actually execute the task. It seems like overhead. The same will be
> > > > > true for services. A cluster group remains the same if the topology
> > > > > hasn't changed. We can use this fact and bind the "execute task"
> > > > > request to the topology. If the topology has changed - get nodes for
> > > > > the new topology and retry the request.
> > > > >
> > > > > Tue, Nov 26, 2019 at 17:44, Pavel Tupitsyn <[hidden email]>:
> > > > >
> > > > > > > After all, we don't cancel request
> > > > > > We do cancel a request to perform a task. We may and should use
> > > > > > this to cancel any other request in future.
> > > > > >
> > > > > > > Client uses some cluster group filtration (for example
> > > > > > > forServers() cluster group)
> > > > > > Please see above - Aleksandr Shapkin described how we store
> > > > > > filtered cluster groups on the client. We don't store node IDs, we
> > > > > > store actual filters. So every new request will apply those
> > > > > > filters on the server side, using the most recent set of nodes.
> > > > > >
> > > > > > // This does not issue any server requests, just builds an object
> > > > > > // with filters on the client
> > > > > > var myGrp = cluster.forServers().forAttribute("foo");
> > > > > > // Every request includes filters, and filters are applied on the
> > > > > > // server side
> > > > > > while (true) myGrp.compute().executeTask("bar");
> > > > > >
> > > > > > On Tue, Nov 26, 2019 at 1:42 PM Alex Plehanov <[hidden email]> wrote:
> > > > > >
> > > > > > > > Anyway, my point stands.
> > > > > > > I can't agree. Why don't you want to use a task id for this?
> > > > > > > After all, we don't cancel the request (the request is already
> > > > > > > processed), we cancel the task. So it's more convenient to use
> > > > > > > the task id here.
> > > > > > >
> > > > > > > > Can you please provide equivalent use case with existing
> > > > > > > > "thick" client?
> > > > > > > For example:
> > > > > > > Cluster consists of one server node.
> > > > > > > Client uses some cluster group filtration (for example the
> > > > > > > forServers() cluster group).
> > > > > > > Client starts to periodically send (for example 1 per minute)
> > > > > > > long-term (for example 1 hour long) tasks to the cluster.
> > > > > > > Meanwhile, several server nodes join the cluster.
> > > > > > >
> > > > > > > In case of thick client: all server nodes will be used, tasks
> > > > > > > will be load balanced.
> > > > > > > In case of thin client: only one server node will be used, the
> > > > > > > client will detect the topology change after an hour.
> > > > > > >
> > > > > > > Tue, Nov 26, 2019 at 11:50, Pavel Tupitsyn <[hidden email]>:
> > > > > > >
> > > > > > > > > I can't see any usage of request id in query cursors
> > > > > > > > You are right, cursor id is a separate thing.
> > > > > > > > Anyway, my point stands.
> > > > > > > > > client sends long term tasks to nodes and wants to do it with load balancing
> > > > > > > > I still don't get it. Can you please provide equivalent use case with existing "thick" client?
> > > > > > > > On Mon, Nov 25, 2019 at 11:59 PM Alex Plehanov <[hidden email]> wrote:
> > > > > > > > > > And it is fine to use request ID to identify compute tasks (as we do with query cursors).
> > > > > > > > > I can't see any usage of request id in query cursors. We send query request and get cursor id in response. After that, we only use cursor id (to get next pages and to close the resource). Did I miss something?
> > > > > > > > > > Looks like I'm missing something - how is topology change relevant to executing compute tasks from client?
> > > > > > > > > It's not relevant directly. But there are some cases where it will be helpful. For example, if client sends long term tasks to nodes and wants to do it with load balancing, it will detect topology change only after some time in the future with the first response, so load balancing will not work. Perhaps we can add optional "topology version" field to the OP_COMPUTE_EXECUTE_TASK request to solve this problem.
> > > > > > > > > On Mon, Nov 25, 2019 at 22:42, Pavel Tupitsyn <[hidden email]> wrote:
> > > > > > > > > > Alex,
> > > > > > > > > > > we will mix entities from different layers (transport layer and request body)
> > > > > > > > > > I would not call our message header (which includes the id) "transport layer". TCP is our transport layer. And it is fine to use request ID to identify compute tasks (as we do with query cursors).
> > > > > > > > > > > we still can't be sure that the task is successfully started on a server
> > > > > > > > > > The request to start the task will fail and we'll get a response indicating that right away
> > > > > > > > > > > we won't ever know about topology change
> > > > > > > > > > Looks like I'm missing something - how is topology change relevant to executing compute tasks from client?
> > > > > > > > > > On Mon, Nov 25, 2019 at 10:17 PM Alex Plehanov <[hidden email]> wrote:
> > > > > > > > > > > Pavel, in this case, we will mix entities from different layers (transport layer and request body), it's not very good. The same behavior we can achieve with generated on client-side task id, but there will be no inter-layer data intersection and I think it will be easier to implement on both client and server-side. But we still can't be sure that the task is successfully started on a server.
> > > > > > > > > > > We won't ever know about topology change, because topology changed flag will be sent from server to client only with a response when the task will be completed. Do we accept that?
> > > > > > > > > > > On Mon, Nov 25, 2019 at 19:07, Pavel Tupitsyn <[hidden email]> wrote:
> > > > > > > > > > > > Alex,
> > > > > > > > > > > > I have a simpler idea. We already do request id handling in the protocol, so:
> > > > > > > > > > > > - Client sends a normal request to execute compute task. Request ID is generated as usual.
> > > > > > > > > > > > - As soon as task is completed, a response is received.
> > > > > > > > > > > > As for cancellation - client can send a new request (with new request ID) and (in the body) pass the request ID from above as a task identifier. As a result, there are two responses:
> > > > > > > > > > > > - Cancellation response
> > > > > > > > > > > > - Task response (with proper cancelled status)
> > > > > > > > > > > > That's it, no need to modify the core of the protocol. One request - one response.
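Editorial sketch of the cancellation-by-request-ID idea quoted just above: a minimal in-memory model, not Ignite code. The class names, operation names ("EXECUTE_TASK", "CANCEL") and status strings are all hypothetical and chosen only for illustration; the one point taken from the thread is that the cancel request has its own request ID and carries the original request ID in its body, which yields two responses.

```python
import itertools


class SketchServer:
    """In-memory stand-in for a server node; every name here is hypothetical."""

    def __init__(self):
        self.running = {}  # request ID of the execute request -> task name

    def handle(self, req_id, op, body):
        """Return the (request ID, status) responses triggered by this request."""
        if op == "EXECUTE_TASK":
            # The response is deferred until the task completes or is cancelled.
            self.running[req_id] = body
            return []
        if op == "CANCEL":
            target = body  # the body carries the request ID of the task to cancel
            if target not in self.running:
                return [(req_id, "unknown task")]
            del self.running[target]
            # Two responses: one for the cancel request, one for the original request.
            return [(req_id, "cancel accepted"), (target, "task cancelled")]
        raise ValueError(f"unsupported operation: {op}")


class SketchClient:
    def __init__(self, server):
        self.server = server
        self.ids = itertools.count(1)  # request IDs are generated as usual
        self.responses = {}            # request ID -> status

    def send(self, op, body):
        req_id = next(self.ids)
        for rid, status in self.server.handle(req_id, op, body):
            self.responses[rid] = status
        return req_id


client = SketchClient(SketchServer())
task_req = client.send("EXECUTE_TASK", "bar")
cancel_req = client.send("CANCEL", task_req)
print(client.responses)
```

In this model each request still gets exactly one response; the only change is that the response to the execute request may arrive after the response to a later request.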
> > > > > > > > > > > > On Mon, Nov 25, 2019 at 6:20 PM Alex Plehanov <[hidden email]> wrote:
> > > > > > > > > > > > > Pavel, we need to inform the client when the task is completed, we need the ability to cancel the task. I see several ways to implement this:
> > > > > > > > > > > > > 1. Client sends a request to the server to start a task, server returns task id in response. Server notifies client when task is completed with a new request (from server to client). Client can cancel the task by sending a new request with operation type "cancel" and task id. In this case, we should implement 2-way requests.
> > > > > > > > > > > > > 2. Client generates unique task id and sends a request to the server to start a task, server doesn't reply immediately but waits until task is completed. Client can cancel task by sending new request with operation type "cancel" and task id. In this case, we should decouple request and response on the server-side (currently response is sent right after request was processed).
> > > > > > > > > > > > >    Also, we can't be sure that task is successfully started on a server.
> > > > > > > > > > > > > 3. Client sends a request to the server to start a task, server returns id in response. Client periodically asks the server about task status. Client can cancel the task by sending new request with operation type "cancel" and task id. This case brings some overhead to the communication channel.
> > > > > > > > > > > > > Personally, I think that the case with 2-way requests is better, but I'm open to any other ideas.
> > > > > > > > > > > > > Aleksandr,
> > > > > > > > > > > > > Filtering logic for OP_CLUSTER_GROUP_GET_NODE_IDS looks overcomplicated. Do we need server-side filtering at all? Wouldn't it be better to send basic info (ids, order, flags) for all nodes (there is relatively small amount of data) and extended info (attributes) for selected list of nodes? In this case, we can do basic node filtration on client-side (forClients(), forServers(), forNodeIds(), forOthers(), etc).
> > > > > > > > > > > > > Do you use standard ClusterNode serialization?
> > > > > > > > > > > > > There are also metrics serialized with ClusterNode, do we need them on thin client? Other interfaces exist to show metrics, I think it's redundant to export metrics to thin clients too.
> > > > > > > > > > > > > What do you think?
> > > > > > > > > > > > > On Fri, Nov 22, 2019 at 20:15, Aleksandr Shapkin <[hidden email]> wrote:
> > > > > > > > > > > > > > Alex,
> > > > > > > > > > > > > > I think you can create a new IEP page and I will fill it with the Cluster API details.
> > > > > > > > > > > > > > In short, I've introduced several new codes:
> > > > > > > > > > > > > > Cluster API is pretty straightforward:
> > > > > > > > > > > > > > OP_CLUSTER_IS_ACTIVE = 5000
> > > > > > > > > > > > > > OP_CLUSTER_CHANGE_STATE = 5001
> > > > > > > > > > > > > > OP_CLUSTER_CHANGE_WAL_STATE = 5002
> > > > > > > > > > > > > > OP_CLUSTER_GET_WAL_STATE = 5003
> > > > > > > > > > > > > > Cluster group codes:
> > > > > > > > > > > > > > OP_CLUSTER_GROUP_GET_NODE_IDS = 5100
> > > > > > > > > > > > > > OP_CLUSTER_GROUP_GET_NODE_INFO = 5101
> > > > > > > > > > > > > > The underlying implementation is based on the thick client logic.
> > > > > > > > > > > > > > For every request, we provide a known topology version and if it has changed, a client updates it first and then re-sends the filtering request.
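Editorial sketch of the "send known topology version, update and re-send on change" flow described just above. This is a toy in-memory model under stated assumptions: the thread does not say how the server signals a stale version, so the `TopologyChanged` exception, the class and function names, and the `servers_only` flag are hypothetical.

```python
class TopologyChanged(Exception):
    """Hypothetical signal that the client's known topology version is stale."""

    def __init__(self, current_version):
        super().__init__(f"topology changed, current version is {current_version}")
        self.current_version = current_version


class SketchServer:
    def __init__(self, nodes):
        self.top_ver = 1
        self.nodes = dict(nodes)  # node ID -> True if it is a server node

    def join(self, node_id, is_server):
        self.nodes[node_id] = is_server
        self.top_ver += 1  # every topology change bumps the version

    def get_node_ids(self, known_top_ver, servers_only):
        if known_top_ver != self.top_ver:
            raise TopologyChanged(self.top_ver)
        return sorted(n for n, srv in self.nodes.items() if srv or not servers_only)


def get_node_ids_with_retry(server, state, servers_only):
    """Send the filtering request; on a stale version, update it and re-send once."""
    try:
        return server.get_node_ids(state["top_ver"], servers_only)
    except TopologyChanged as e:
        state["top_ver"] = e.current_version
        return server.get_node_ids(state["top_ver"], servers_only)


server = SketchServer({"a": True, "b": False})
state = {"top_ver": 1}
first = get_node_ids_with_retry(server, state, servers_only=True)
server.join("c", is_server=True)
second = get_node_ids_with_retry(server, state, servers_only=True)
```

A single retry is enough for this toy model; a real client would have to decide what to do if the topology changes again between the update and the re-send.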
> > > > > > > > > > > > > > Alongside the topVer a client sends a serialized nodes projection object that could be considered as a code to value mapping.
> > > > > > > > > > > > > > Consider: [{Code = 1, Value = ["DotNet", "MyAttribute"]}, {Code = 2, Value = 1}]
> > > > > > > > > > > > > > Where "1" stands for Attribute filtering and "2" for the serverNodesOnly flag.
> > > > > > > > > > > > > > As a result of request processing, a server sends nodeId UUIDs and a current topVer.
> > > > > > > > > > > > > > When a client obtains nodeIds, it can perform a NODE_INFO call to get a serialized ClusterNode object. In addition there should be a different API method for accessing/updating node metrics.
> > > > > > > > > > > > > > On Thu, Nov 21, 2019 at 12:32, Sergey Kozlov <[hidden email]> wrote:
> > > > > > > > > > > > > > > Hi Pavel
> > > > > > > > > > > > > > > On Thu, Nov 21, 2019 at 11:30 AM Pavel Tupitsyn <[hidden email]> wrote:
> > > > > > > > > > > > > > > > 1. I believe that Cluster operations for Thin Client protocol are already in the works by Alexandr Shapkin. Can't find the ticket though. Alexandr, can you please confirm and attach the ticket number?
> > > > > > > > > > > > > > > > 2. Proposed changes will work only for Java tasks that are already deployed on server nodes. This is mostly useless for other thin clients we have (Python, PHP, .NET, C++).
> > > > > > > > > > > > > > > I don't think so. The task (execution) is a way to implement own layer for the thin client application.
> > > > > > > > > > > > > > > > We should think of a way to make this useful for all clients. For example, we may allow sending tasks in some scripting language like Javascript. Thoughts?
> > > > > > > > > > > > > > > The arbitrary code execution from a remote client must be protected from malicious code.
> > > > > > > > > > > > > > > I don't know how it could be designed but without that we open the hole to kill cluster.
> > > > > > > > > > > > > > > > On Thu, Nov 21, 2019 at 11:21 AM Sergey Kozlov <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > Hi Alex
> > > > > > > > > > > > > > > > > The idea is great. But I have some concerns that probably should be taken into account for design:
> > > > > > > > > > > > > > > > > 1. We need to have the ability to stop a task execution, smth like OP_COMPUTE_CANCEL_TASK operation (client to server)
> > > > > > > > > > > > > > > > > 2. What about task execution timeout? It may help the cluster survive buggy tasks
> > > > > > > > > > > > > > > > > 3. Ignite doesn't have roles/authorization functionality for now. But a task is the risky operation for cluster (for security reasons).
> > > > > > > > > > > > > > > > >    Could we add for Ignite configuration new options:
> > > > > > > > > > > > > > > > >    - Explicit turning on for compute task support for thin protocol (disabled by default) for whole cluster
> > > > > > > > > > > > > > > > >    - Explicit turning on for compute task support for a node
> > > > > > > > > > > > > > > > >    - The list of task names (classes) allowed to execute by thin client.
> > > > > > > > > > > > > > > > > 4. Support the labeling for task that may help to investigate issues on cluster (the idea from IEP-34 [1])
> > > > > > > > > > > > > > > > > [1] https://cwiki.apache.org/confluence/display/IGNITE/IEP-34+Thin+client%3A+transactions+support
> > > > > > > > > > > > > > > > > On Thu, Nov 21, 2019 at 10:58 AM Alex Plehanov <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > Hello, Igniters!
> > > > > > > > > > > > > > > > > > I have plans to start implementation of Compute interface for Ignite thin client and want to discuss features that should be implemented.
> > > > > > > > > > > > > > > > > > We already have Compute implementation for binary-rest clients (GridClientCompute), which has the following functionality:
> > > > > > > > > > > > > > > > > > - Filtering cluster nodes (projection) for compute
> > > > > > > > > > > > > > > > > > - Executing task by the name
> > > > > > > > > > > > > > > > > > I think we can implement this functionality in a thin client as well.
> > > > > > > > > > > > > > > > > > First of all, we need some operation types to request a list of all available nodes and probably node attributes (by a list of nodes). Node attributes will be helpful if we decide to implement an analog of ClusterGroup#forAttribute or ClusterGroup#forPredicate methods in the thin client. Perhaps they can be requested lazily.
> > > > > > > > > > > > > > > > > > From the protocol point of view there will be two new operations:
> > > > > > > > > > > > > > > > > > OP_CLUSTER_GET_NODES
> > > > > > > > > > > > > > > > > > Request: empty
> > > > > > > > > > > > > > > > > > Response: long topologyVersion, int minorTopologyVersion, int nodesCount, for each node set of node fields (UUID nodeId, Object or String consistentId, long order, etc)
> > > > > > > > > > > > > > > > > > OP_CLUSTER_GET_NODE_ATTRIBUTES
> > > > > > > > > > > > > > > > > > Request: int nodesCount, for each node: UUID nodeId
> > > > > > > > > > > > > > > > > > Response: int nodesCount, for each node: int attributesCount, for each node attribute: String name, Object value
> > > > > > > > > > > > > > > > > > To execute tasks we need something like these methods in the client API:
> > > > > > > > > > > > > > > > > > Object execute(String task, Object arg)
> > > > > > > > > > > > > > > > > > Future<Object> executeAsync(String task, Object arg)
> > > > > > > > > > > > > > > > > > Object affinityExecute(String task, String cache, Object key, Object arg)
> > > > > > > > > > > > > > > > > > Future<Object> affinityExecuteAsync(String task, String cache, Object key, Object arg)
> > > > > > > > > > > > > > > > > > Which can be mapped to protocol operations:
> > > > > > > > > > > > > > > > > > OP_COMPUTE_EXECUTE_TASK
> > > > > > > > > > > > > > > > > > Request: UUID nodeId, String taskName, Object arg
> > > > > > > > > > > > > > > > > > Response: Object result
> > > > > > > > > > > > > > > > > > OP_COMPUTE_EXECUTE_TASK_AFFINITY
> > > > > > > > > > > > > > > > > > Request: String cacheName, Object key, String taskName, Object arg
> > > > > > > > > > > > > > > > > > Response: Object result
> > > > > > > > > > > > > > > > > > The second operation is needed because we sometimes can't calculate and connect to affinity node on the client-side (affinity awareness can be disabled, custom affinity function can be used or there can be no connection between client and affinity node), but we can make best effort to send request to target node if affinity awareness is enabled.
> > > > > > > > > > > > > > > > > > Currently, on the server-side requests are always processed synchronously and responses are sent right after request was processed. To execute long tasks async we should either change this logic or introduce some kind of two-way communication between client and server (now only one-way requests from client to server are allowed).
> > > > > > > > > > > > > > > > > > Two-way communication can also be useful in the future if we send some server-side generated events to clients.
> > > > > > > > > > > > > > > > > > In case of two-way communication there can be new operations introduced:
> > > > > > > > > > > > > > > > > > OP_COMPUTE_EXECUTE_TASK (from client to server)
> > > > > > > > > > > > > > > > > > Request: UUID nodeId, String taskName, Object arg
> > > > > > > > > > > > > > > > > > Response: long taskId
> > > > > > > > > > > > > > > > > > OP_COMPUTE_TASK_FINISHED (from server to client)
> > > > > > > > > > > > > > > > > > Request: taskId, Object result
> > > > > > > > > > > > > > > > > > Response: empty
> > > > > > > > > > > > > > > > > > The same for affinity requests.
> > > > > > > > > > > > > > > > > > Also, we can implement not only execute task operation, but some other operations from IgniteCompute (broadcast, run, call), but it will be useful only for java thin client. And even with java thin client we should either implement peer-class-loading for thin clients (this also requires two-way client-server communication) or put classes with executed closures to the server locally.
> > > > > > > > > > > > > > > > > > What do you think about proposed protocol changes?
> > > > > > > > > > > > > > > > > > Do we need two-way requests between client and server?
> > > > > > > > > > > > > > > > > > Do we need support of compute methods other than "execute task"?
> > > > > > > > > > > > > > > > > > What do you think about peer-class-loading for thin clients?
> > > > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > > > Sergey Kozlov
> > > > > > > > > > > > > > > > > GridGain Systems
> > > > > > > > > > > > > > > > > www.gridgain.com
> > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > Sergey Kozlov
> > > > > > > > > > > > > > > GridGain Systems
> > > > > > > > > > > > > > > www.gridgain.com
> > > > > > > > > > > > > > --
> > > > > > > > > > > > > > Alex.
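Editorial sketch of the two-way variant from the original proposal quoted above (OP_COMPUTE_EXECUTE_TASK answered with a taskId, later followed by OP_COMPUTE_TASK_FINISHED from server to client). Only the client-side bookkeeping is modeled; the class and method names are hypothetical, and transport, serialization and error handling are left out.

```python
from concurrent.futures import Future


class TwoWayClientSketch:
    """Client-side bookkeeping only; transport and serialization are omitted."""

    def __init__(self):
        self.pending = {}  # taskId -> Future completed by the server notification

    def on_execute_response(self, task_id):
        # Response to OP_COMPUTE_EXECUTE_TASK: the server returns only a taskId.
        future = Future()
        self.pending[task_id] = future
        return future

    def on_task_finished(self, task_id, result):
        # Server-to-client request OP_COMPUTE_TASK_FINISHED: taskId plus result.
        self.pending.pop(task_id).set_result(result)
        return None  # the response sent back to the server is empty


client = TwoWayClientSketch()
future = client.on_execute_response(task_id=7)
pending_before = future.done()  # the task is still running on the server
client.on_task_finished(task_id=7, result="done")
```

The sketch assumes the execute response is always processed before the finished notification; a real client would need to handle the opposite ordering as well.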
Igor, thanks for the reply.
> Approach with taskId will require a lot of changes in protocol and thus more "heavy" for implementation

Do you mean the approach with the server notifications mechanism? Yes, it will require a lot of changes. But in the most recent messages Pavel and I discussed an approach without the server notifications mechanism. This approach has the same complexity and performance as the approach with requestId.

> But such clients as Python, Node.js, PHP, Go most probably won't have support for this API, at least for now.

Without a server notifications mechanism, there will be no breaking changes in the protocol, so a client implementation can just skip this feature and protocol version and implement the next one.

> Or never.

I think it is still useful to execute java compute tasks from non-java thin clients. Also, we can provide some out-of-the-box java tasks, for example ExecutePythonScriptTask with python compute implementation, which can run a python script on a server node.

> So, maybe it's a good time for us to change our backward compatibility mechanism from protocol versioning to feature masks?

I like the idea with feature masks, but it will force us to support both backward compatibility mechanisms, protocol versioning and feature masks.

On Mon, Jan 20, 2020 at 20:34, Pavel Tupitsyn <[hidden email]> wrote:

> Huge +1 from me for Feature Masks.
> I think this should be our top priority for thin client protocol, since it simplifies change management a lot.
>
> On Mon, Jan 20, 2020 at 8:21 PM Igor Sapego <[hidden email]> wrote:
>
> > Sorry for the late reply.
> >
> > Approach with taskId will require a lot of changes in protocol and thus more "heavy" for implementation, but it definitely looks to me less hacky than reqId-approach. Moreover, as was mentioned, server notifications mechanism will be required in a future anyway with high probability. So from this point of view I like taskId-approach.
> >
> > On the other hand, what we should also consider here is performance.
> > Speaking of latency, it looks like reqId will have better results in case of small and fast tasks. The only question here is whether we want to optimize thin clients for this case.
> >
> > Also, what you are talking about mostly involves clients on platforms that already have Compute API for thick clients. Let me mention one more point of view and another concern here.
> >
> > The changes you propose are going to change protocol version for sure. In case with taskId approach and server notifications - even more so.
> >
> > But such clients as Python, Node.js, PHP, Go most probably won't have support for this API, at least for now. Or never. But current backward-compatibility mechanism implies protocol versions where we imply that client that supports version 1.5 also supports all the features introduced in all the previous versions of the protocol.
> >
> > Thus implementing Compute API in any of the proposed ways *may* force mentioned clients to support changes in protocol which they do not necessarily need in order to introduce new features in the future.
> >
> > So, maybe it's a good time for us to change our backward compatibility mechanism from protocol versioning to feature masks?
> >
> > WDYT?
> >
> > Best Regards,
> > Igor
> >
> > On Fri, Jan 17, 2020 at 9:37 AM Alex Plehanov <[hidden email]> wrote:
> >
> > > Looks like we didn't reach consensus here.
> > >
> > > Igor, as thin client maintainer, can you please share your opinion?
> > >
> > > Everyone else is also welcome, please share your thoughts about options to implement operations for compute.
> > >
> > > On Thu, Nov 28, 2019 at 10:02, Alex Plehanov <[hidden email]> wrote:
> > >
> > > > > Since all thin client operations are inherently async, we should be able to cancel any of them
> > > > It's illogical to have such ability. What should a cancel operation of a cancel operation do?
> > > > Moreover, sometimes it's dangerous, for example, create
> > > > cache operation should never be canceled. There should be an explicit set
> > > > of processes that we can cancel: queries, transactions, tasks, services.
> > > > The lifecycle of services is more complex than the lifecycle of tasks. With
> > > > services, I suppose, we can't use request cancelation, so tasks will be the
> > > > only process with an exceptional pattern.
> > > >
> > > > > The request would be "execute task with specified node filter" - simple
> > > > > and efficient.
> > > > It's not simple: every compute or service request should contain complex
> > > > node filtering logic, which duplicates the same logic for cluster API.
> > > > It's not efficient: for example, we can't implement forPredicate()
> > > > filtering in this case.
> > > >
> > > >
> > > > Wed, Nov 27, 2019 at 19:25, Pavel Tupitsyn <[hidden email]>:
> > > >
> > > >> > The request is already processed (task is started), we can't cancel the
> > > >> > request
> > > >> The request is not "start a task". It is "execute task" (and get result).
> > > >> Same as "cache get" - you get a result in the end, we don't "start cache
> > > >> get" then "end cache get".
> > > >>
> > > >> Since all thin client operations are inherently async, we should be able to
> > > >> cancel any of them
> > > >> by sending another request with an id of prior request to be cancelled.
> > > >> That's why I'm advocating for this approach - it will work for anything, no
> > > >> special cases.
> > > >> And it keeps "happy path" as simple as it is right now.
> > > >>
> > > >> Queries are different because we retrieve results in pages, we can't do
> > > >> them as one request.
> > > >> Transactions are also different because client controls when they should
> > > >> end.
> > > >> There is no reason for task execution to be a special case like queries or
> > > >> transactions.
> > > >>
> > > >> > we always need to send 2 requests to server to execute the task
> > > >> Nope. We don't need to get nodes on client at all.
> > > >> The request would be "execute task with specified node filter" - simple and
> > > >> efficient.
> > > >>
> > > >>
> > > >> On Wed, Nov 27, 2019 at 4:31 PM Alex Plehanov <[hidden email]> wrote:
> > > >>
> > > >> > > We do cancel a request to perform a task. We may and should use this to
> > > >> > > cancel any other request in future.
> > > >> > The request is already processed (task is started), we can't cancel the
> > > >> > request. As you mentioned before, we already do almost the same for queries
> > > >> > (close the cursor, but not cancel the request to run a query), it's better
> > > >> > to do such things in a common way. We have a pattern: start some process
> > > >> > (query, transaction), get id of this process, end process by this id. The
> > > >> > "Execute task" process should match the same pattern. In my opinion,
> > > >> > implementation with two-way requests is the best option to match this
> > > >> > pattern (we can even reuse OP_RESOURCE_CLOSE operation type in this case).
> > > >> > Sometime in the future, we will need two-way requests for some other
> > > >> > functionality (continuous queries, event listening, etc). But even without
> > > >> > two-way requests introducing some process id (task id in our case) will be
> > > >> > closer to existing pattern than canceling tasks by request id.
> > > >> >
> > > >> > > So every new request will apply those filters on server side, using the
> > > >> > > most recent set of nodes.
> > > >> > In this case, we always need to send 2 requests to server to execute the
> > > >> > task.
> > > >> > First - to get nodes by the filter, second - to actually execute the
> > > >> > task. It seems like overhead. The same will be for services. Cluster group
> > > >> > remains the same if the topology hasn't changed. We can use this fact and
> > > >> > bind "execute task" request to topology. If topology has changed - get
> > > >> > nodes for new topology and retry request.
> > > >> >
> > > >> > Tue, Nov 26, 2019 at 17:44, Pavel Tupitsyn <[hidden email]>:
> > > >> >
> > > >> > > > After all, we don't cancel request
> > > >> > > We do cancel a request to perform a task. We may and should use this to
> > > >> > > cancel any other request in future.
> > > >> > >
> > > >> > > > Client uses some cluster group filtration (for example forServers()
> > > >> > > > cluster group)
> > > >> > > Please see above - Aleksandr Shapkin described how we store
> > > >> > > filtered cluster groups on client.
> > > >> > > We don't store node IDs, we store actual filters. So every new request will
> > > >> > > apply those filters on server side,
> > > >> > > using the most recent set of nodes.
> > > >> > >
> > > >> > > // This does not issue any server requests, just builds an object with filters on client
> > > >> > > var myGrp = cluster.forServers().forAttribute("foo");
> > > >> > > // Every request includes filters, and filters are applied on the server side
> > > >> > > while (true) myGrp.compute().executeTask("bar");
> > > >> > >
> > > >> > > On Tue, Nov 26, 2019 at 1:42 PM Alex Plehanov <[hidden email]> wrote:
> > > >> > >
> > > >> > > > > Anyway, my point stands.
> > > >> > > > I can't agree. Why you don't want to use task id for this? After all, we
> > > >> > > > don't cancel request (request is already processed), we cancel the task. So
> > > >> > > > it's more convenient to use task id here.
> > > >> > > >
> > > >> > > > > Can you please provide equivalent use case with existing "thick" client?
> > > >> > > > For example:
> > > >> > > > Cluster consists of one server node.
> > > >> > > > Client uses some cluster group filtration (for example forServers() cluster
> > > >> > > > group).
> > > >> > > > Client starts to send periodically (for example 1 per minute) long-term
> > > >> > > > (for example 1 hour long) tasks to the cluster.
> > > >> > > > Meanwhile, several server nodes joined the cluster.
> > > >> > > >
> > > >> > > > In case of thick client: All server nodes will be used, tasks will be load
> > > >> > > > balanced.
> > > >> > > > In case of thin client: Only one server node will be used, client will
> > > >> > > > detect topology change after an hour.
> > > >> > > >
> > > >> > > >
> > > >> > > > Tue, Nov 26, 2019 at 11:50, Pavel Tupitsyn <[hidden email]>:
> > > >> > > >
> > > >> > > > > > I can't see any usage of request id in query cursors
> > > >> > > > > You are right, cursor id is a separate thing.
> > > >> > > > > Anyway, my point stands.
> > > >> > > > >
> > > >> > > > > > client sends long term tasks to nodes and wants to do it with load
> > > >> > > > > > balancing
> > > >> > > > > I still don't get it. Can you please provide equivalent use case with
> > > >> > > > > existing "thick" client?
> > > >> > > > >
> > > >> > > > >
> > > >> > > > > On Mon, Nov 25, 2019 at 11:59 PM Alex Plehanov <[hidden email]> wrote:
> > > >> > > > >
> > > >> > > > > > > And it is fine to use request ID to identify compute tasks (as we do with
> > > >> > > > > > > query cursors).
> > > >> > > > > > I can't see any usage of request id in query cursors. We send query request
> > > >> > > > > > and get cursor id in response.
> > > >> > > > > > After that, we only use cursor id (to get
> > > >> > > > > > next pages and to close the resource). Did I miss something?
> > > >> > > > > >
> > > >> > > > > > > Looks like I'm missing something - how is topology change relevant to
> > > >> > > > > > > executing compute tasks from client?
> > > >> > > > > > It's not relevant directly. But there are some cases where it will be
> > > >> > > > > > helpful. For example, if client sends long term tasks to nodes and wants to
> > > >> > > > > > do it with load balancing it will detect topology change only after some
> > > >> > > > > > time in the future with the first response, so load balancing will not work.
> > > >> > > > > > Perhaps we can add optional "topology version" field to the
> > > >> > > > > > OP_COMPUTE_EXECUTE_TASK request to solve this problem.
> > > >> > > > > >
> > > >> > > > > >
> > > >> > > > > > Mon, Nov 25, 2019 at 22:42, Pavel Tupitsyn <[hidden email]>:
> > > >> > > > > >
> > > >> > > > > > > Alex,
> > > >> > > > > > >
> > > >> > > > > > > > we will mix entities from different layers (transport layer and request
> > > >> > > > > > > > body)
> > > >> > > > > > > I would not call our message header (which includes the id) "transport
> > > >> > > > > > > layer".
> > > >> > > > > > > TCP is our transport layer. And it is fine to use request ID to identify
> > > >> > > > > > > compute tasks (as we do with query cursors).
> > > >> > > > > > > > > > >> > > > > > > > we still can't be sure that the task is successfully > > > started > > > >> > on a > > > >> > > > > > server > > > >> > > > > > > The request to start the task will fail and we'll get a > > > >> response > > > >> > > > > > indicating > > > >> > > > > > > that right away > > > >> > > > > > > > > > >> > > > > > > > we won't ever know about topology change > > > >> > > > > > > Looks like I'm missing something - how is topology > change > > > >> > relevant > > > >> > > to > > > >> > > > > > > executing compute tasks from client? > > > >> > > > > > > > > > >> > > > > > > On Mon, Nov 25, 2019 at 10:17 PM Alex Plehanov < > > > >> > > > > [hidden email]> > > > >> > > > > > > wrote: > > > >> > > > > > > > > > >> > > > > > > > Pavel, in this case, we will mix entities from > different > > > >> layers > > > >> > > > > > > (transport > > > >> > > > > > > > layer and request body), it's not very good. The same > > > >> behavior > > > >> > we > > > >> > > > can > > > >> > > > > > > > achieve with generated on client-side task id, but > there > > > >> will > > > >> > be > > > >> > > no > > > >> > > > > > > > inter-layer data intersection and I think it will be > > > easier > > > >> to > > > >> > > > > > implement > > > >> > > > > > > on > > > >> > > > > > > > both client and server-side. But we still can't be > sure > > > that > > > >> > the > > > >> > > > task > > > >> > > > > > is > > > >> > > > > > > > successfully started on a server. We won't ever know > > about > > > >> > > topology > > > >> > > > > > > change, > > > >> > > > > > > > because topology changed flag will be sent from server > > to > > > >> > client > > > >> > > > only > > > >> > > > > > > with > > > >> > > > > > > > a response when the task will be completed. Are we > > accept > > > >> that? > > > >> > > > > > > > > > > >> > > > > > > > пн, 25 нояб. 2019 г. 
в 19:07, Pavel Tupitsyn < > > > >> > > [hidden email] > > > >> > > > >: > > > >> > > > > > > > > > > >> > > > > > > > > Alex, > > > >> > > > > > > > > > > > >> > > > > > > > > I have a simpler idea. We already do request id > > handling > > > >> in > > > >> > the > > > >> > > > > > > protocol, > > > >> > > > > > > > > so: > > > >> > > > > > > > > - Client sends a normal request to execute compute > > task. > > > >> > > Request > > > >> > > > ID > > > >> > > > > > is > > > >> > > > > > > > > generated as usual. > > > >> > > > > > > > > - As soon as task is completed, a response is > > received. > > > >> > > > > > > > > > > > >> > > > > > > > > As for cancellation - client can send a new request > > > (with > > > >> new > > > >> > > > > request > > > >> > > > > > > ID) > > > >> > > > > > > > > and (in the body) pass the request ID from above > > > >> > > > > > > > > as a task identifier. As a result, there are two > > > >> responses: > > > >> > > > > > > > > - Cancellation response > > > >> > > > > > > > > - Task response (with proper cancelled status) > > > >> > > > > > > > > > > > >> > > > > > > > > That's it, no need to modify the core of the > protocol. > > > One > > > >> > > > request > > > >> > > > > - > > > >> > > > > > > one > > > >> > > > > > > > > response. > > > >> > > > > > > > > > > > >> > > > > > > > > On Mon, Nov 25, 2019 at 6:20 PM Alex Plehanov < > > > >> > > > > > [hidden email] > > > >> > > > > > > > > > > >> > > > > > > > > wrote: > > > >> > > > > > > > > > > > >> > > > > > > > > > Pavel, we need to inform the client when the task > is > > > >> > > completed, > > > >> > > > > we > > > >> > > > > > > need > > > >> > > > > > > > > the > > > >> > > > > > > > > > ability to cancel the task. I see several ways to > > > >> implement > > > >> > > > this: > > > >> > > > > > > > > > > > > >> > > > > > > > > > 1. 
Сlient sends a request to the server to start a > > > task, > > > >> > > server > > > >> > > > > > > return > > > >> > > > > > > > > task > > > >> > > > > > > > > > id in response. Server notifies client when task > is > > > >> > completed > > > >> > > > > with > > > >> > > > > > a > > > >> > > > > > > > new > > > >> > > > > > > > > > request (from server to client). Client can cancel > > the > > > >> task > > > >> > > by > > > >> > > > > > > sending > > > >> > > > > > > > a > > > >> > > > > > > > > > new request with operation type "cancel" and task > > id. > > > In > > > >> > this > > > >> > > > > case, > > > >> > > > > > > we > > > >> > > > > > > > > > should implement 2-ways requests. > > > >> > > > > > > > > > 2. Client generates unique task id and sends a > > request > > > >> to > > > >> > the > > > >> > > > > > server > > > >> > > > > > > to > > > >> > > > > > > > > > start a task, server don't reply immediately but > > wait > > > >> until > > > >> > > > task > > > >> > > > > is > > > >> > > > > > > > > > completed. Client can cancel task by sending new > > > request > > > >> > with > > > >> > > > > > > operation > > > >> > > > > > > > > > type "cancel" and task id. In this case, we should > > > >> decouple > > > >> > > > > request > > > >> > > > > > > and > > > >> > > > > > > > > > response on the server-side (currently response is > > > sent > > > >> > right > > > >> > > > > after > > > >> > > > > > > > > request > > > >> > > > > > > > > > was processed). Also, we can't be sure that task > is > > > >> > > > successfully > > > >> > > > > > > > started > > > >> > > > > > > > > on > > > >> > > > > > > > > > a server. > > > >> > > > > > > > > > 3. Client sends a request to the server to start a > > > task, > > > >> > > server > > > >> > > > > > > return > > > >> > > > > > > > id > > > >> > > > > > > > > > in response. Client periodically asks the server > > about > > > >> task > > > >> > > > > status. 
> > > >> > > > > > > > > Client > > > >> > > > > > > > > > can cancel the task by sending new request with > > > >> operation > > > >> > > type > > > >> > > > > > > "cancel" > > > >> > > > > > > > > and > > > >> > > > > > > > > > task id. This case brings some overhead to the > > > >> > communication > > > >> > > > > > channel. > > > >> > > > > > > > > > > > > >> > > > > > > > > > Personally, I think that the case with 2-ways > > requests > > > >> is > > > >> > > > better, > > > >> > > > > > but > > > >> > > > > > > > I'm > > > >> > > > > > > > > > open to any other ideas. > > > >> > > > > > > > > > > > > >> > > > > > > > > > Aleksandr, > > > >> > > > > > > > > > > > > >> > > > > > > > > > Filtering logic for OP_CLUSTER_GROUP_GET_NODE_IDS > > > looks > > > >> > > > > > > > overcomplicated. > > > >> > > > > > > > > Do > > > >> > > > > > > > > > we need server-side filtering at all? Wouldn't it > be > > > >> better > > > >> > > to > > > >> > > > > send > > > >> > > > > > > > basic > > > >> > > > > > > > > > info (ids, order, flags) for all nodes (there is > > > >> relatively > > > >> > > > small > > > >> > > > > > > > amount > > > >> > > > > > > > > of > > > >> > > > > > > > > > data) and extended info (attributes) for selected > > list > > > >> of > > > >> > > > nodes? > > > >> > > > > In > > > >> > > > > > > > this > > > >> > > > > > > > > > case, we can do basic node filtration on > client-side > > > >> > > > > (forClients(), > > > >> > > > > > > > > > forServers(), forNodeIds(), forOthers(), etc). > > > >> > > > > > > > > > > > > >> > > > > > > > > > Do you use standard ClusterNode serialization? > There > > > are > > > >> > also > > > >> > > > > > metrics > > > >> > > > > > > > > > serialized with ClusterNode, do we need it on thin > > > >> client? 
> > > >> > > > There > > > >> > > > > > are > > > >> > > > > > > > > other > > > >> > > > > > > > > > interfaces exist to show metrics, I think it's > > > >> redundant to > > > >> > > > > export > > > >> > > > > > > > > metrics > > > >> > > > > > > > > > to thin clients too. > > > >> > > > > > > > > > > > > >> > > > > > > > > > What do you think? > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > пт, 22 нояб. 2019 г. в 20:15, Aleksandr Shapkin < > > > >> > > > > [hidden email] > > > >> > > > > > >: > > > >> > > > > > > > > > > > > >> > > > > > > > > > > Alex, > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > I think you can create a new IEP page and I will > > > fill > > > >> it > > > >> > > with > > > >> > > > > the > > > >> > > > > > > > > Cluster > > > >> > > > > > > > > > > API details. > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > In short, I’ve introduced several new codes: > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > Cluster API is pretty straightforward: > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > OP_CLUSTER_IS_ACTIVE = 5000 > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > OP_CLUSTER_CHANGE_STATE = 5001 > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > OP_CLUSTER_CHANGE_WAL_STATE = 5002 > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > OP_CLUSTER_GET_WAL_STATE = 5003 > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > Cluster group codes: > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > OP_CLUSTER_GROUP_GET_NODE_IDS = 5100 > > > >> > > > > > > > > 
> > > > > >> > > > > > > > > > > OP_CLUSTER_GROUP_GET_NODE_INFO = 5101 > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > The underlying implementation is based on the > > thick > > > >> > client > > > >> > > > > logic. > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > For every request, we provide a known topology > > > version > > > >> > and > > > >> > > if > > > >> > > > > it > > > >> > > > > > > has > > > >> > > > > > > > > > > changed, > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > a client updates it firstly and then re-sends > the > > > >> > filtering > > > >> > > > > > > request. > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > Alongside the topVer a client sends a serialized > > > nodes > > > >> > > > > projection > > > >> > > > > > > > > object > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > that could be considered as a code to value > > mapping. > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > Consider: [{Code = 1, Value= [“DotNet”, > > > >> “MyAttribute”}, > > > >> > > > > {Code=2, > > > >> > > > > > > > > > Value=1}] > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > Where “1” stands for Attribute filtering and > “2” – > > > >> > > > > > serverNodesOnly > > > >> > > > > > > > > flag. > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > As a result of request processing, a server > sends > > > >> nodeId > > > >> > > > UUIDs > > > >> > > > > > and > > > >> > > > > > > a > > > >> > > > > > > > > > > current topVer. 
> > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > When a client obtains nodeIds, it can perform a > > > >> NODE_INFO > > > >> > > > call > > > >> > > > > to > > > >> > > > > > > > get a > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > serialized ClusterNode object. In addition there > > > >> should > > > >> > be > > > >> > > a > > > >> > > > > > > > different > > > >> > > > > > > > > > API > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > method for accessing/updating node metrics. > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > чт, 21 нояб. 2019 г. в 12:32, Sergey Kozlov < > > > >> > > > > > [hidden email] > > > >> > > > > > > >: > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > Hi Pavel > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > On Thu, Nov 21, 2019 at 11:30 AM Pavel > Tupitsyn > > < > > > >> > > > > > > > > [hidden email]> > > > >> > > > > > > > > > > > wrote: > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > 1. I believe that Cluster operations for > Thin > > > >> Client > > > >> > > > > protocol > > > >> > > > > > > are > > > >> > > > > > > > > > > already > > > >> > > > > > > > > > > > > in the works > > > >> > > > > > > > > > > > > by Alexandr Shapkin. Can't find the ticket > > > though. > > > >> > > > > > > > > > > > > Alexandr, can you please confirm and attach > > the > > > >> > ticket > > > >> > > > > > number? > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > 2. Proposed changes will work only for Java > > > tasks > > > >> > that > > > >> > > > are > > > >> > > > > > > > already > > > >> > > > > > > > > > > > deployed > > > >> > > > > > > > > > > > > on server nodes. > > > >> > > > > > > > > > > > > This is mostly useless for other thin > clients > > we > > > >> have > > > >> > > > > > (Python, > > > >> > > > > > > > PHP, > > > >> > > > > > > > > > > .NET, > > > >> > > > > > > > > > > > > C++). 
> > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > I don't guess so. The task (execution) is a > way > > to > > > >> > > > implement > > > >> > > > > > own > > > >> > > > > > > > > layer > > > >> > > > > > > > > > > for > > > >> > > > > > > > > > > > the thin client application. > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > We should think of a way to make this useful > > for > > > >> all > > > >> > > > > clients. > > > >> > > > > > > > > > > > > For example, we may allow sending tasks in > > some > > > >> > > scripting > > > >> > > > > > > > language > > > >> > > > > > > > > > like > > > >> > > > > > > > > > > > > Javascript. > > > >> > > > > > > > > > > > > Thoughts? > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > The arbitrary code execution from a remote > > client > > > >> must > > > >> > be > > > >> > > > > > > protected > > > >> > > > > > > > > > > > from malicious code. > > > >> > > > > > > > > > > > I don't know how it could be designed but > > without > > > >> that > > > >> > we > > > >> > > > > open > > > >> > > > > > > the > > > >> > > > > > > > > hole > > > >> > > > > > > > > > > to > > > >> > > > > > > > > > > > kill cluster. > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > On Thu, Nov 21, 2019 at 11:21 AM Sergey > > Kozlov < > > > >> > > > > > > > > [hidden email] > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > wrote: > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > Hi Alex > > > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > The idea is great. 
But I have some > concerns > > > that > > > >> > > > probably > > > >> > > > > > > > should > > > >> > > > > > > > > be > > > >> > > > > > > > > > > > taken > > > >> > > > > > > > > > > > > > into account for design: > > > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > 1. We need to have the ability to stop > a > > > task > > > >> > > > > execution, > > > >> > > > > > > > smth > > > >> > > > > > > > > > like > > > >> > > > > > > > > > > > > > OP_COMPUTE_CANCEL_TASK operation > (client > > > to > > > >> > > server) > > > >> > > > > > > > > > > > > > 2. What's about task execution timeout? > > It > > > >> may > > > >> > > help > > > >> > > > to > > > >> > > > > > the > > > >> > > > > > > > > > cluster > > > >> > > > > > > > > > > > > > survival for buggy tasks > > > >> > > > > > > > > > > > > > 3. Ignite doesn't have > > roles/authorization > > > >> > > > > functionality > > > >> > > > > > > for > > > >> > > > > > > > > > now. > > > >> > > > > > > > > > > > But > > > >> > > > > > > > > > > > > a > > > >> > > > > > > > > > > > > > task is the risky operation for cluster > > > (for > > > >> > > > security > > > >> > > > > > > > > reasons). > > > >> > > > > > > > > > > > Could > > > >> > > > > > > > > > > > > we > > > >> > > > > > > > > > > > > > add for Ignite configuration new > options: > > > >> > > > > > > > > > > > > > - Explicit turning on for compute > task > > > >> > support > > > >> > > > for > > > >> > > > > > thin > > > >> > > > > > > > > > > protocol > > > >> > > > > > > > > > > > > > (disabled by default) for whole > > cluster > > > >> > > > > > > > > > > > > > - Explicit turning on for compute > task > > > >> > support > > > >> > > > for > > > >> > > > > a > > > >> > > > > > > node > > > >> > > > > > > > > > > > > > - The list of task names (classes) > > > >> allowed to > > > >> > > > > execute > > > >> > > > > > > by > > > >> > > > > > > > > thin > > > >> > > > > > > > > > > > > client. > > > >> > > > > > > > > > > > > > 4. 
> > > >> > > > > > > > > > > > > >    Support the labeling for task that may help to investigate issues on
> > > >> > > > > > > > > > > > > >    cluster (the idea from IEP-34 [1])
> > > >> > > > > > > > > > > > > >
> > > >> > > > > > > > > > > > > > 1.
> > > >> > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/IGNITE/IEP-34+Thin+client%3A+transactions+support
> > > >> > > > > > > > > > > > > >
> > > >> > > > > > > > > > > > > >
> > > >> > > > > > > > > > > > > > On Thu, Nov 21, 2019 at 10:58 AM Alex Plehanov <[hidden email]> wrote:
> > > >> > > > > > > > > > > > > >
> > > >> > > > > > > > > > > > > > > Hello, Igniters!
> > > >> > > > > > > > > > > > > > >
> > > >> > > > > > > > > > > > > > > I have plans to start implementation of Compute interface for Ignite thin
> > > >> > > > > > > > > > > > > > > client and want to discuss features that should be implemented.
> > > >> > > > > > > > > > > > > > >
> > > >> > > > > > > > > > > > > > > We already have Compute implementation for binary-rest clients
> > > >> > > > > > > > > > > > > > > (GridClientCompute), which has the following functionality:
> > > >> > > > > > > > > > > > > > > - Filtering cluster nodes (projection) for compute
> > > >> > > > > > > > > > > > > > > - Executing task by the name
> > > >> > > > > > > > > > > > > > >
> > > >> > > > > > > > > > > > > > > I think we can implement this functionality in a thin client as well.
> > > >> > > > > > > > > > > > > > >
> > > >> > > > > > > > > > > > > > > First of all, we need some operation types to request a list of all
> > > >> > > > > > > > > > > > > > > available nodes and probably node attributes (by a list of nodes). Node
> > > >> > > > > > > > > > > > > > > attributes will be helpful if we decide to implement an analog of the
> > > >> > > > > > > > > > > > > > > ClusterGroup#forAttribute or ClusterGroup#forPredicate methods in the thin
> > > >> > > > > > > > > > > > > > > client. Perhaps they can be requested lazily.
> > > >> > > > > > > > > > > > > > >
> > > >> > > > > > > > > > > > > > > From the protocol point of view there will be two new operations:
> > > >> > > > > > > > > > > > > > >
> > > >> > > > > > > > > > > > > > > OP_CLUSTER_GET_NODES
> > > >> > > > > > > > > > > > > > > Request: empty
> > > >> > > > > > > > > > > > > > > Response: long topologyVersion, int minorTopologyVersion, int nodesCount,
> > > >> > > > > > > > > > > > > > > for each node set of node fields (UUID nodeId, Object or String
> > > >> > > > > > > > > > > > > > > consistentId, long order, etc)
> > > >> > > > > > > > > > > > > > >
> > > >> > > > > > > > > > > > > > > OP_CLUSTER_GET_NODE_ATTRIBUTES
> > > >> > > > > > > > > > > > > > > Request: int nodesCount, for each node: UUID nodeId
> > > >> > > > > > > > > > > > > > > Response: int nodesCount, for each node: int attributesCount, for each node
> > > >> > > > > > > > > > > > > > > attribute: String name, Object value
> > > >> > > > > > > > > > > > > > >
> > > >> > > > > > > > > > > > > > > To execute tasks we need something like these methods in the client API:
> > > >> > > > > > > > > > > > > > > Object execute(String task, Object arg)
> > > >> > > > > > > > > > > > > > > Future<Object> executeAsync(String task, Object arg)
> > > >> > > > > > > > > > > > > > > Object affinityExecute(String task, String cache, Object key, Object arg)
> > > >> > > > > > > > > > > > > > > Future<Object> affinityExecuteAsync(String task, String cache, Object key,
> > > >> > > > > > > > > > > > > > > Object arg)
> > > >> > > > > > > > > > > > > > >
> > > >> > > > > > > > > > > > > > > Which can be mapped to protocol operations:
> > > >> > > > > > > > > > > > > > >
> > > >> > > > > > > > > > > > > > > OP_COMPUTE_EXECUTE_TASK
> > > >> > > > > > > > > > > > > > > Request: UUID nodeId, String taskName, Object arg
> > > >> > > > > > > > > > > > > > > Response: Object result
> > > >> > > > > > > > > > > > > > >
> > > >> > > > > > > > > > > > > > > OP_COMPUTE_EXECUTE_TASK_AFFINITY
> > > >> > > > > > > > > > > > > > > Request: String cacheName, Object key, String taskName, Object arg
> > > >> > > > > > > > > > > > > > > Response: Object result
> > > >> > > > > > > > > > > > > > >
> > > >> > > > > > > > > > > > > > > The second operation is needed because we sometimes can't calculate and
> > > >> > > > > > > > > > > > > > > connect to affinity node on the client-side (affinity awareness can be
> > > >> > > > > > > > > > > > > > > disabled, custom affinity function can be used or there can be no
> > > >> > > > > > > > > > > > > > > connection between client and affinity node), but we can make best effort
> > > >> > > > > > > > > > > > > > > to send request to target node if affinity awareness is enabled.
> > > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > Currently, on the server-side requests > > > always > > > >> > > > processed > > > >> > > > > > > > > > > synchronously > > > >> > > > > > > > > > > > > and > > > >> > > > > > > > > > > > > > > responses are sent right after request > was > > > >> > > processed. > > > >> > > > > To > > > >> > > > > > > > > execute > > > >> > > > > > > > > > > long > > > >> > > > > > > > > > > > > > tasks > > > >> > > > > > > > > > > > > > > async we should whether change this > logic > > or > > > >> > > > introduce > > > >> > > > > > some > > > >> > > > > > > > > kind > > > >> > > > > > > > > > > > > two-way > > > >> > > > > > > > > > > > > > > communication between client and server > > (now > > > >> only > > > >> > > > > one-way > > > >> > > > > > > > > > requests > > > >> > > > > > > > > > > > from > > > >> > > > > > > > > > > > > > > client to server are allowed). > > > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > Two-way communication can also be useful > > in > > > >> the > > > >> > > > future > > > >> > > > > if > > > >> > > > > > > we > > > >> > > > > > > > > will > > > >> > > > > > > > > > > > send > > > >> > > > > > > > > > > > > > some > > > >> > > > > > > > > > > > > > > server-side generated events to clients. 
> > > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > In case of two-way communication there > can > > > be > > > >> new > > > >> > > > > > > operations > > > >> > > > > > > > > > > > > introduced: > > > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > OP_COMPUTE_EXECUTE_TASK (from client to > > > >> server) > > > >> > > > > > > > > > > > > > > Request: UUID nodeId, String taskName, > > > Object > > > >> arg > > > >> > > > > > > > > > > > > > > Response: long taskId > > > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > OP_COMPUTE_TASK_FINISHED (from server to > > > >> client) > > > >> > > > > > > > > > > > > > > Request: taskId, Object result > > > >> > > > > > > > > > > > > > > Response: empty > > > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > The same for affinity requests. > > > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > Also, we can implement not only execute > > task > > > >> > > > operation, > > > >> > > > > > but > > > >> > > > > > > > > some > > > >> > > > > > > > > > > > other > > > >> > > > > > > > > > > > > > > operations from IgniteCompute > (broadcast, > > > run, > > > >> > > call), > > > >> > > > > but > > > >> > > > > > > it > > > >> > > > > > > > > will > > > >> > > > > > > > > > > be > > > >> > > > > > > > > > > > > > useful > > > >> > > > > > > > > > > > > > > only for java thin client. 
And even with > > > java > > > >> > thin > > > >> > > > > client > > > >> > > > > > > we > > > >> > > > > > > > > > should > > > >> > > > > > > > > > > > > > whether > > > >> > > > > > > > > > > > > > > implement peer-class-loading for thin > > > clients > > > >> > (this > > > >> > > > > also > > > >> > > > > > > > > requires > > > >> > > > > > > > > > > > > two-way > > > >> > > > > > > > > > > > > > > client-server communication) or put > > classes > > > >> with > > > >> > > > > executed > > > >> > > > > > > > > > closures > > > >> > > > > > > > > > > to > > > >> > > > > > > > > > > > > the > > > >> > > > > > > > > > > > > > > server locally. > > > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > What do you think about proposed > protocol > > > >> > changes? > > > >> > > > > > > > > > > > > > > Do we need two-way requests between > client > > > and > > > >> > > > server? > > > >> > > > > > > > > > > > > > > Do we need support of compute methods > > other > > > >> than > > > >> > > > > "execute > > > >> > > > > > > > > task"? > > > >> > > > > > > > > > > > > > > What do you think about > peer-class-loading > > > for > > > >> > thin > > > >> > > > > > > clients? > > > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > -- > > > >> > > > > > > > > > > > > > Sergey Kozlov > > > >> > > > > > > > > > > > > > GridGain Systems > > > >> > > > > > > > > > > > > > www.gridgain.com > > > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > -- > > > >> > > > > > > > > > > > Sergey Kozlov > > > >> > > > > > > > > > > > GridGain Systems > > > >> > > > > > > > > > > > www.gridgain.com > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > -- > > > >> > > > > > > > > > > Alex. 
> > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > > > > > > > > |
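A minimal sketch of the client-side bookkeeping that the two-way variant quoted above seems to imply: the client remembers the taskId returned by OP_COMPUTE_EXECUTE_TASK and completes a future when OP_COMPUTE_TASK_FINISHED arrives. The class and method names are hypothetical and are not Ignite code; the sketch also assumes that the start response is handled before the finish notification for the same task.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical client-side registry for the two-way variant; not Ignite code. */
final class PendingComputeTasks {
    /** Futures of tasks that were started but whose result has not arrived yet. */
    private final Map<Long, CompletableFuture<Object>> pending = new ConcurrentHashMap<>();

    /** Call when the OP_COMPUTE_EXECUTE_TASK response delivers the taskId. */
    CompletableFuture<Object> onTaskStarted(long taskId) {
        return pending.computeIfAbsent(taskId, id -> new CompletableFuture<>());
    }

    /** Call when the server sends OP_COMPUTE_TASK_FINISHED (taskId, result). */
    boolean onTaskFinished(long taskId, Object result) {
        CompletableFuture<Object> fut = pending.remove(taskId);
        if (fut == null)
            return false; // Unknown taskId or a duplicate notification.
        fut.complete(result);
        return true;
    }

    /** Number of tasks still waiting for a notification. */
    int inFlight() {
        return pending.size();
    }
}
```

An executeAsync-style method could return the future from onTaskStarted directly, and a blocking execute could simply wait on it.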
We've discussed the thin client compute protocol with Pavel Tupitsyn and Igor
Sapego and have come to the conclusion that the approach with two-way requests
should be used: the client generates a taskId and sends a request to the server
to execute a task. The server responds that the request has been accepted.
After the task has finished, the server notifies the client (sends a request
without waiting for a response). The client can cancel the task by sending a
corresponding request to the server. Also, a node list should be passed
(optionally) with a request to limit the nodes that execute the task.

I will create an IEP and file detailed protocol changes shortly.

On Tue, Jan 21, 2020 at 18:46, Alex Plehanov <[hidden email]> wrote:

> Igor, thanks for the reply.
>
> > Approach with taskId will require a lot of changes in protocol and thus
> > more "heavy" for implementation
> Do you mean the approach with the server notifications mechanism? Yes, it
> will require a lot of changes. But in the most recent messages we've
> discussed with Pavel an approach without the server notifications mechanism.
> This approach has the same complexity and performance as the approach with
> requestId.
>
> > But such clients as Python, Node.js, PHP, Go most probably won't have
> > support for this API, at least for now.
> Without a server notifications mechanism, there will be no breaking
> changes in the protocol, so a client implementation can just skip this
> feature and protocol version and implement the next one.
>
> > Or never.
> I think it is still useful to execute Java compute tasks from non-Java thin
> clients. Also, we can provide some out-of-the-box Java tasks, for example
> ExecutePythonScriptTask with a Python compute implementation, which can run
> a Python script on a server node.
>
> > So, maybe it's a good time for us to change our backward compatibility
> > mechanism from protocol versioning to feature masks?
> I like the idea of feature masks, but it will force us to support both
> backward compatibility mechanisms, protocol versioning and feature masks.

On Mon, Jan 20, 2020 at 20:34, Pavel Tupitsyn <[hidden email]> wrote:

> Huge +1 from me for Feature Masks.
> I think this should be our top priority for thin client protocol, since it
> simplifies change management a lot.

On Mon, Jan 20, 2020 at 8:21 PM Igor Sapego <[hidden email]> wrote:

> Sorry for the late reply.
>
> Approach with taskId will require a lot of changes in protocol and thus
> more "heavy" for implementation, but it definitely looks to me less hacky
> than the reqId-approach. Moreover, as was mentioned, a server notifications
> mechanism will be required in the future anyway with high probability. So
> from this point of view I like the taskId-approach.
>
> On the other hand, what we should also consider here is performance.
> Speaking of latency, it looks like reqId will have better results in case
> of small and fast tasks. The only question here is whether we want to
> optimize thin clients for this case.
>
> Also, what you are talking about mostly involves clients on platforms
> that already have a Compute API for thick clients. Let me mention one
> more point of view and another concern here.
>
> The changes you propose are going to change the protocol version for sure.
> In case of the taskId approach and server notifications - even more so.
>
> But such clients as Python, Node.js, PHP, Go most probably won't have
> support for this API, at least for now. Or never. But the current
> backward-compatibility mechanism implies protocol versions where we
> imply that a client that supports version 1.5 also supports all the features
> introduced in all the previous versions of the protocol.
>
> Thus implementing the Compute API in any of the proposed ways *may*
> force the mentioned clients to support changes in the protocol which they do
> not necessarily need in order to introduce new features in the future.
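To illustrate the feature-mask idea raised in this exchange, here is a rough sketch of how a client and a server could agree on a common feature set instead of relying on "version N implies everything before N". The feature names, bit positions and the negotiate helper are assumptions made for illustration only; they are not taken from the Ignite protocol.

```java
import java.util.EnumSet;

/** Hypothetical feature-mask negotiation; names and bit layout are illustrative. */
final class FeatureMasks {
    /** Each feature occupies the bit given by its ordinal. */
    enum Feature { CLUSTER_API, COMPUTE_TASKS, SERVER_NOTIFICATIONS }

    /** Packs a feature set into a bit mask that could be sent in a handshake. */
    static long toMask(EnumSet<Feature> features) {
        long mask = 0L;
        for (Feature f : features)
            mask |= 1L << f.ordinal();
        return mask;
    }

    /** Unpacks a bit mask, ignoring bits that this side does not know about. */
    static EnumSet<Feature> fromMask(long mask) {
        EnumSet<Feature> res = EnumSet.noneOf(Feature.class);
        for (Feature f : Feature.values()) {
            if ((mask & (1L << f.ordinal())) != 0)
                res.add(f);
        }
        return res;
    }

    /** Features usable on a connection: those supported by both sides. */
    static EnumSet<Feature> negotiate(long clientMask, long serverMask) {
        return fromMask(clientMask & serverMask);
    }
}
```

With such a scheme a client that never implements compute can still advertise later features, which is the concern voiced above about Python, Node.js, PHP and Go clients.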
> So, maybe it's a good time for us to change our backward compatibility
> mechanism from protocol versioning to feature masks?
>
> WDYT?
>
> Best Regards,
> Igor

On Fri, Jan 17, 2020 at 9:37 AM Alex Plehanov <[hidden email]> wrote:

> Looks like we didn't reach consensus here.
>
> Igor, as thin client maintainer, can you please share your opinion?
>
> Everyone else is also welcome, please share your thoughts about options to
> implement operations for compute.

On Thu, Nov 28, 2019 at 10:02, Alex Plehanov <[hidden email]> wrote:

> > Since all thin client operations are inherently async, we should be
> > able to cancel any of them
> It's illogical to have such an ability. What should a cancel operation of a
> cancel operation do? Moreover, sometimes it's dangerous, for example, a
> create cache operation should never be canceled. There should be an explicit
> set of processes that we can cancel: queries, transactions, tasks, services.
> The lifecycle of services is more complex than the lifecycle of tasks. With
> services, I suppose, we can't use request cancelation, so tasks will be the
> only process with an exceptional pattern.
>
> > The request would be "execute task with specified node filter" - simple
> > and efficient.
> It's not simple: every compute or service request should contain complex
> node filtering logic, which duplicates the same logic for the cluster API.
> It's not efficient: for example, we can't implement forPredicate()
> filtering in this case.

On Wed, Nov 27, 2019 at 19:25, Pavel Tupitsyn <[hidden email]> wrote:

> > The request is already processed (task is started), we can't cancel the
> > request
> The request is not "start a task". It is "execute task" (and get result).
> Same as "cache get" - you get a result in the end, we don't "start cache
> get" then "end cache get".
>
> Since all thin client operations are inherently async, we should be able to
> cancel any of them by sending another request with the id of the prior
> request to be cancelled.
> That's why I'm advocating for this approach - it will work for anything, no
> special cases.
> And it keeps the "happy path" as simple as it is right now.
>
> Queries are different because we retrieve results in pages, we can't do
> them as one request.
> Transactions are also different because the client controls when they should
> end.
> There is no reason for task execution to be a special case like queries or
> transactions.
>
> > we always need to send 2 requests to server to execute the task
> Nope. We don't need to get nodes on the client at all.
> The request would be "execute task with specified node filter" - simple and
> efficient.

On Wed, Nov 27, 2019 at 4:31 PM Alex Plehanov <[hidden email]> wrote:

> > We do cancel a request to perform a task. We may and should use this to
> > cancel any other request in future.
> The request is already processed (task is started), we can't cancel the
> request.
> As you mentioned before, we already do almost the same for queries
> (close the cursor, but not cancel the request to run a query), it's better
> to do such things in a common way. We have a pattern: start some process
> (query, transaction), get the id of this process, end the process by this id.
> The "Execute task" process should match the same pattern. In my opinion,
> an implementation with two-way requests is the best option to match this
> pattern (we can even reuse the OP_RESOURCE_CLOSE operation type in this case).
> Sometime in the future, we will need two-way requests for some other
> functionality (continuous queries, event listening, etc). But even without
> two-way requests, introducing some process id (task id in our case) will be
> closer to the existing pattern than canceling tasks by request id.
>
> > So every new request will apply those filters on server side, using the
> > most recent set of nodes.
> In this case, we always need to send 2 requests to the server to execute the
> task. First - to get nodes by the filter, second - to actually execute the
> task. It seems like overhead. The same will be true for services. A cluster
> group remains the same if the topology hasn't changed. We can use this fact
> and bind the "execute task" request to the topology. If the topology has
> changed - get nodes for the new topology and retry the request.

On Tue, Nov 26, 2019 at 17:44, Pavel Tupitsyn <[hidden email]> wrote:

> > After all, we don't cancel request
> We do cancel a request to perform a task.
> We may and should use this to cancel any other request in future.
>
> > Client uses some cluster group filtration (for example forServers()
> > cluster group)
> Please see above - Aleksandr Shapkin described how we store
> filtered cluster groups on the client.
> We don't store node IDs, we store actual filters. So every new request will
> apply those filters on the server side, using the most recent set of nodes.
>
> // This does not issue any server requests, just builds an object with
> // filters on the client
> var myGrp = cluster.forServers().forAttribute("foo");
> // Every request includes filters, and filters are applied on the server side
> while (true) myGrp.compute().executeTask("bar");

On Tue, Nov 26, 2019 at 1:42 PM Alex Plehanov <[hidden email]> wrote:

> > Anyway, my point stands.
> I can't agree. Why don't you want to use a task id for this? After all, we
> don't cancel the request (the request is already processed), we cancel the
> task. So it's more convenient to use a task id here.
>
> > Can you please provide equivalent use case with existing "thick" client?
> For example:
> The cluster consists of one server node.
> The client uses some cluster group filtration (for example the forServers()
> cluster group).
> The client starts to periodically send (for example 1 per minute) long-term
> (for example 1 hour long) tasks to the cluster.
> Meanwhile, several server nodes joined the cluster.
>
> In case of thick client: All server nodes will be used, tasks will be load
> balanced.
> In case of thin client: Only one server node will be used, the client will
> detect the topology change after an hour.

On Tue, Nov 26, 2019 at 11:50, Pavel Tupitsyn <[hidden email]> wrote:

> > I can't see any usage of request id in query cursors
> You are right, cursor id is a separate thing.
> Anyway, my point stands.
>
> > client sends long term tasks to nodes and wants to do it with load
> > balancing
> I still don't get it. Can you please provide an equivalent use case with the
> existing "thick" client?

On Mon, Nov 25, 2019 at 11:59 PM Alex Plehanov <[hidden email]> wrote:

> > And it is fine to use request ID to identify compute tasks (as we do
> > with query cursors).
> I can't see any usage of request id in query cursors. We send a query request
> and get a cursor id in response. After that, we only use the cursor id (to get
> next pages and to close the resource). Did I miss something?
>
> > Looks like I'm missing something - how is topology change relevant to
> > executing compute tasks from client?
> It's not relevant directly. But there are some cases where it will be
> helpful.
> For example, if the client sends long-term tasks to nodes and wants to
> do it with load balancing, it will detect a topology change only after some
> time in the future with the first response, so load balancing will not work.
> Perhaps we can add an optional "topology version" field to the
> OP_COMPUTE_EXECUTE_TASK request to solve this problem.

On Mon, Nov 25, 2019 at 22:42, Pavel Tupitsyn <[hidden email]> wrote:

> Alex,
>
> > we will mix entities from different layers (transport layer and request
> > body)
> I would not call our message header (which includes the id) "transport
> layer".
> TCP is our transport layer. And it is fine to use request ID to identify
> compute tasks (as we do with query cursors).
>
> > we still can't be sure that the task is successfully started on a server
> The request to start the task will fail and we'll get a response indicating
> that right away.
>
> > we won't ever know about topology change
> Looks like I'm missing something - how is topology change relevant to
> executing compute tasks from client?
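The suggestion above to bind the "execute task" request to a topology version and retry when the topology has changed could look roughly like the following sketch. The Reply shape, the Server interface and the retry limit are assumptions made for illustration; the thread does not define how a stale topology would actually be reported.

```java
/** Hypothetical retry loop for a topology-bound execute request; not Ignite code. */
final class TopologyBoundExecutor {
    /** Assumed server reply: a result, or a "stale topology" rejection. */
    static final class Reply {
        final long serverTopVer;
        final Object result;
        final boolean stale;

        Reply(long serverTopVer, Object result, boolean stale) {
            this.serverTopVer = serverTopVer;
            this.result = result;
            this.stale = stale;
        }
    }

    /** Stand-in for sending OP_COMPUTE_EXECUTE_TASK with a topology version. */
    interface Server {
        Reply executeTask(String taskName, long clientTopVer);
    }

    private long knownTopVer;
    private int refreshes;

    TopologyBoundExecutor(long initialTopVer) {
        this.knownTopVer = initialTopVer;
    }

    /** Sends the task and retries after adopting a newer topology version. */
    Object execute(Server srv, String taskName, int maxAttempts) {
        for (int i = 0; i < maxAttempts; i++) {
            Reply r = srv.executeTask(taskName, knownTopVer);
            if (!r.stale)
                return r.result;
            // A real client would also re-fetch the node list for the group here.
            knownTopVer = r.serverTopVer;
            refreshes++;
        }
        throw new IllegalStateException("Topology still changing after " + maxAttempts + " attempts");
    }

    long knownTopVer() { return knownTopVer; }

    int refreshes() { return refreshes; }
}
```

The cost is one extra round trip only when the topology actually changed, instead of two requests for every task.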
On Mon, Nov 25, 2019 at 10:17 PM Alex Plehanov <[hidden email]> wrote:

> Pavel, in this case, we will mix entities from different layers (transport
> layer and request body), it's not very good. We can achieve the same
> behavior with a task id generated on the client-side, but there will be no
> inter-layer data intersection and I think it will be easier to implement on
> both client and server-side. But we still can't be sure that the task is
> successfully started on a server. We won't ever know about a topology
> change, because the topology changed flag will be sent from server to client
> only with a response when the task is completed. Do we accept that?

On Mon, Nov 25, 2019 at 19:07, Pavel Tupitsyn <[hidden email]> wrote:

> Alex,
>
> I have a simpler idea. We already do request id handling in the protocol,
> so:
> - Client sends a normal request to execute compute task. Request ID is
> generated as usual.
> - As soon as task is completed, a response is received.
>
> As for cancellation - client can send a new request (with new request ID)
> and (in the body) pass the request ID from above
> as a task identifier. As a result, there are two responses:
> - Cancellation response
> - Task response (with proper cancelled status)
>
> That's it, no need to modify the core of the protocol. One request - one
> response.

On Mon, Nov 25, 2019 at 6:20 PM Alex Plehanov <[hidden email]> wrote:

> Pavel, we need to inform the client when the task is completed, we need the
> ability to cancel the task. I see several ways to implement this:
>
> 1. Client sends a request to the server to start a task, server returns the
> task id in response. Server notifies the client when the task is completed
> with a new request (from server to client). Client can cancel the task by
> sending a new request with operation type "cancel" and the task id. In this
> case, we should implement 2-way requests.
> 2. Client generates a unique task id and sends a request to the server to
> start a task, server doesn't reply immediately but waits until the task is
> completed. Client can cancel the task by sending a new request with
> operation type "cancel" and the task id. In this case, we should decouple
> request and response on the server-side (currently the response is sent
> right after the request was processed). Also, we can't be sure that the task
> is successfully started on a server.
> 3. Client sends a request to the server to start a task, server returns the
> id in response. Client periodically asks the server about the task status.
> Client can cancel the task by sending a new request with operation type
> "cancel" and the task id. This case brings some overhead to the
> communication channel.
>
> Personally, I think that the case with 2-way requests is better, but I'm
> open to any other ideas.
>
> Aleksandr,
>
> Filtering logic for OP_CLUSTER_GROUP_GET_NODE_IDS looks overcomplicated. Do
> we need server-side filtering at all? Wouldn't it be better to send basic
> info (ids, order, flags) for all nodes (this is a relatively small amount of
> data) and extended info (attributes) for a selected list of nodes? In this
> case, we can do basic node filtration on the client-side (forClients(),
> forServers(), forNodeIds(), forOthers(), etc).
>
> Do you use standard ClusterNode serialization? There are also metrics
> serialized with ClusterNode, do we need them on the thin client? Other
> interfaces exist to show metrics, I think it's redundant to export metrics
> to thin clients too.
>
> What do you think?
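The client-side filtration proposed above (basic info for all nodes, filters applied locally) might be sketched as follows. The NodeInfo fields mirror the "ids, order, flags" mentioned in the message, but the class layout and helper names are assumptions and not the actual thin client API.

```java
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.UUID;
import java.util.function.Predicate;
import java.util.stream.Collectors;

/** Hypothetical client-side node filtering over basic node info; not Ignite code. */
final class ClientSideNodeFilter {
    /** Assumed "basic info" for one node: id, order and a client flag. */
    static final class NodeInfo {
        final UUID id;
        final long order;
        final boolean client;

        NodeInfo(UUID id, long order, boolean client) {
            this.id = id;
            this.order = order;
            this.client = client;
        }
    }

    /** Keeps the nodes that match the predicate, preserving input order. */
    static List<NodeInfo> filter(Collection<NodeInfo> nodes, Predicate<NodeInfo> p) {
        return nodes.stream().filter(p).collect(Collectors.toList());
    }

    static List<NodeInfo> forServers(Collection<NodeInfo> nodes) {
        return filter(nodes, n -> !n.client);
    }

    static List<NodeInfo> forClients(Collection<NodeInfo> nodes) {
        return filter(nodes, n -> n.client);
    }

    static List<NodeInfo> forNodeIds(Collection<NodeInfo> nodes, Set<UUID> ids) {
        return filter(nodes, n -> ids.contains(n.id));
    }

    /** All nodes except the given one. */
    static List<NodeInfo> forOthers(Collection<NodeInfo> nodes, UUID excluded) {
        return filter(nodes, n -> !n.id.equals(excluded));
    }
}
```

Attribute-based filters would still need the extended info, which in this proposal is fetched only for the already selected nodes.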
On Fri, Nov 22, 2019 at 20:15, Aleksandr Shapkin <[hidden email]> wrote:

> Alex,
>
> I think you can create a new IEP page and I will fill it with the Cluster
> API details.
>
> In short, I've introduced several new codes:
>
> Cluster API is pretty straightforward:
>
> OP_CLUSTER_IS_ACTIVE = 5000
> OP_CLUSTER_CHANGE_STATE = 5001
> OP_CLUSTER_CHANGE_WAL_STATE = 5002
> OP_CLUSTER_GET_WAL_STATE = 5003
>
> Cluster group codes:
>
> OP_CLUSTER_GROUP_GET_NODE_IDS = 5100
> OP_CLUSTER_GROUP_GET_NODE_INFO = 5101
>
> The underlying implementation is based on the thick client logic.
>
> For every request, we provide a known topology version and if it has
> changed, a client updates it first and then re-sends the filtering request.
>
> Alongside the topVer a client sends a serialized nodes projection object
> that could be considered as a code to value mapping.
>
> Consider: [{Code = 1, Value = ["DotNet", "MyAttribute"]}, {Code = 2, Value = 1}]
>
> Where "1" stands for attribute filtering and "2" for the serverNodesOnly flag.
>
> As a result of request processing, a server sends nodeId UUIDs and the
> current topVer.
>
> When a client obtains nodeIds, it can perform a NODE_INFO call to get a
> serialized ClusterNode object.
> In addition there should be a different API method for accessing/updating
> node metrics.

On Thu, Nov 21, 2019 at 12:32, Sergey Kozlov <[hidden email]> wrote:

> Hi Pavel
>
> On Thu, Nov 21, 2019 at 11:30 AM Pavel Tupitsyn <[hidden email]> wrote:
>
> > 1. I believe that Cluster operations for Thin Client protocol are already
> > in the works by Alexandr Shapkin. Can't find the ticket though.
> > Alexandr, can you please confirm and attach the ticket number?
> >
> > 2. Proposed changes will work only for Java tasks that are already
> > deployed on server nodes.
> > This is mostly useless for other thin clients we have (Python, PHP, .NET,
> > C++).
>
> I don't think so.
> The task (execution) is a way to implement an own layer for the thin client
> application.
>
> > We should think of a way to make this useful for all clients.
> > For example, we may allow sending tasks in some scripting language like
> > Javascript.
> > Thoughts?
>
> Arbitrary code execution from a remote client must be protected from
> malicious code.
> I don't know how it could be designed, but without that we open a hole to
> kill the cluster.
>
> > On Thu, Nov 21, 2019 at 11:21 AM Sergey Kozlov <[hidden email]> wrote:
> >
> > > Hi Alex
> > >
> > > The idea is great. But I have some concerns that probably should be
> > > taken into account for design:
> > >
> > > 1.
> > > We need to have the ability to stop a task execution, something like
> > > an OP_COMPUTE_CANCEL_TASK operation (client to server)
> > > 2. What about a task execution timeout? It may help the cluster
> > > survive buggy tasks
> > > 3. Ignite doesn't have roles/authorization functionality for now. But
> > > a task is a risky operation for the cluster (for security reasons).
> > > Could we add new options to the Ignite configuration:
> > > - Explicit turning on of compute task support for the thin protocol
> > > (disabled by default) for the whole cluster
> > > - Explicit turning on of compute task support for a node
> > > - The list of task names (classes) allowed to be executed by a thin
> > > client.
> > > 4.
Support the labeling for task that >> may >> > > >> help >> > > >> > to >> > > >> > > > > > > > investigate >> > > >> > > > > > > > > > > issues >> > > >> > > > > > > > > > > > > on >> > > >> > > > > > > > > > > > > > cluster (the idea from IEP-34 [1]) >> > > >> > > > > > > > > > > > > > >> > > >> > > > > > > > > > > > > > 1. >> > > >> > > > > > > > > > > > > > >> > > >> > > > > > > > > > > > > > >> > > >> > > > > > > > > > > > > >> > > >> > > > > > > > > > > > >> > > >> > > > > > > > > > > >> > > >> > > > > > > > > > >> > > >> > > > > > > > > >> > > >> > > > > > > > >> > > >> > > > > > > >> > > >> > > > > > >> > > >> > > > > >> > > >> > > > >> > > >> > > >> > > >> > >> > > >> >> > > >> > >> https://cwiki.apache.org/confluence/display/IGNITE/IEP-34+Thin+client%3A+transactions+support >> > > >> > > > > > > > > > > > > > >> > > >> > > > > > > > > > > > > > >> > > >> > > > > > > > > > > > > > >> > > >> > > > > > > > > > > > > > On Thu, Nov 21, 2019 at 10:58 AM Alex >> > > Plehanov < >> > > >> > > > > > > > > > > > [hidden email]> >> > > >> > > > > > > > > > > > > > wrote: >> > > >> > > > > > > > > > > > > > >> > > >> > > > > > > > > > > > > > > Hello, Igniters! >> > > >> > > > > > > > > > > > > > > >> > > >> > > > > > > > > > > > > > > I have plans to start implementation of >> > > >> Compute >> > > >> > > > > interface >> > > >> > > > > > > for >> > > >> > > > > > > > > > > Ignite >> > > >> > > > > > > > > > > > > thin >> > > >> > > > > > > > > > > > > > > client and want to discuss features >> that >> > > >> should >> > > >> > be >> > > >> > > > > > > > implemented. 
>> > > >> > > > > > > > > > > > > > > >> > > >> > > > > > > > > > > > > > > We already have Compute implementation >> for >> > > >> > > > binary-rest >> > > >> > > > > > > > clients >> > > >> > > > > > > > > > > > > > > (GridClientCompute), which have the >> > > following >> > > >> > > > > > > functionality: >> > > >> > > > > > > > > > > > > > > - Filtering cluster nodes (projection) >> for >> > > >> > compute >> > > >> > > > > > > > > > > > > > > - Executing task by the name >> > > >> > > > > > > > > > > > > > > >> > > >> > > > > > > > > > > > > > > I think we can implement this >> > functionality >> > > >> in a >> > > >> > > thin >> > > >> > > > > > > client >> > > >> > > > > > > > as >> > > >> > > > > > > > > > > well. >> > > >> > > > > > > > > > > > > > > >> > > >> > > > > > > > > > > > > > > First of all, we need some operation >> types >> > > to >> > > >> > > > request a >> > > >> > > > > > > list >> > > >> > > > > > > > of >> > > >> > > > > > > > > > all >> > > >> > > > > > > > > > > > > > > available nodes and probably node >> > attributes >> > > >> (by >> > > >> > a >> > > >> > > > list >> > > >> > > > > > of >> > > >> > > > > > > > > > nodes). >> > > >> > > > > > > > > > > > Node >> > > >> > > > > > > > > > > > > > > attributes will be helpful if we will >> > decide >> > > >> to >> > > >> > > > > implement >> > > >> > > > > > > > > analog >> > > >> > > > > > > > > > of >> > > >> > > > > > > > > > > > > > > ClusterGroup#forAttribute or >> > > >> > > > ClusterGroup#forePredicate >> > > >> > > > > > > > methods >> > > >> > > > > > > > > > in >> > > >> > > > > > > > > > > > the >> > > >> > > > > > > > > > > > > > thin >> > > >> > > > > > > > > > > > > > > client. Perhaps they can be requested >> > > lazily. 
>> > > >> > > > > > > > > > > > > > > >> > > >> > > > > > > > > > > > > > > From the protocol point of view there >> will >> > > be >> > > >> two >> > > >> > > new >> > > >> > > > > > > > > operations: >> > > >> > > > > > > > > > > > > > > >> > > >> > > > > > > > > > > > > > > OP_CLUSTER_GET_NODES >> > > >> > > > > > > > > > > > > > > Request: empty >> > > >> > > > > > > > > > > > > > > Response: long topologyVersion, int >> > > >> > > > > minorTopologyVersion, >> > > >> > > > > > > int >> > > >> > > > > > > > > > > > > nodesCount, >> > > >> > > > > > > > > > > > > > > for each node set of node fields (UUID >> > > nodeId, >> > > >> > > Object >> > > >> > > > > or >> > > >> > > > > > > > String >> > > >> > > > > > > > > > > > > > > consistentId, long order, etc) >> > > >> > > > > > > > > > > > > > > >> > > >> > > > > > > > > > > > > > > OP_CLUSTER_GET_NODE_ATTRIBUTES >> > > >> > > > > > > > > > > > > > > Request: int nodesCount, for each node: >> > UUID >> > > >> > nodeId >> > > >> > > > > > > > > > > > > > > Response: int nodesCount, for each >> node: >> > int >> > > >> > > > > > > attributesCount, >> > > >> > > > > > > > > for >> > > >> > > > > > > > > > > > each >> > > >> > > > > > > > > > > > > > node >> > > >> > > > > > > > > > > > > > > attribute: String name, Object value >> > > >> > > > > > > > > > > > > > > >> > > >> > > > > > > > > > > > > > > To execute tasks we need something like >> > > these >> > > >> > > methods >> > > >> > > > > in >> > > >> > > > > > > the >> > > >> > > > > > > > > > client >> > > >> > > > > > > > > > > > > API: >> > > >> > > > > > > > > > > > > > > Object execute(String task, Object arg) >> > > >> > > > > > > > > > > > > > > Future<Object> executeAsync(String >> task, >> > > >> Object >> > > >> > > arg) >> > > >> > > > > > > > > > > > > > > Object affinityExecute(String task, >> String >> > > >> cache, >> > > >> > > > > Object >> > > >> > > > > > > key, >> > > >> > > > > > > > > > > Object >> > > >> > > > > > > > > > > > > arg) >> 
> > >> > > > > > > > > > > > > > > Future<Object> >> affinityExecuteAsync(String >> > > >> task, >> > > >> > > > String >> > > >> > > > > > > > cache, >> > > >> > > > > > > > > > > Object >> > > >> > > > > > > > > > > > > > key, >> > > >> > > > > > > > > > > > > > > Object arg) >> > > >> > > > > > > > > > > > > > > >> > > >> > > > > > > > > > > > > > > Which can be mapped to protocol >> > operations: >> > > >> > > > > > > > > > > > > > > >> > > >> > > > > > > > > > > > > > > OP_COMPUTE_EXECUTE_TASK >> > > >> > > > > > > > > > > > > > > Request: UUID nodeId, String taskName, >> > > Object >> > > >> arg >> > > >> > > > > > > > > > > > > > > Response: Object result >> > > >> > > > > > > > > > > > > > > >> > > >> > > > > > > > > > > > > > > OP_COMPUTE_EXECUTE_TASK_AFFINITY >> > > >> > > > > > > > > > > > > > > Request: String cacheName, Object key, >> > > String >> > > >> > > > taskName, >> > > >> > > > > > > > Object >> > > >> > > > > > > > > > arg >> > > >> > > > > > > > > > > > > > > Response: Object result >> > > >> > > > > > > > > > > > > > > >> > > >> > > > > > > > > > > > > > > The second operation is needed because >> we >> > > >> > sometimes >> > > >> > > > > can't >> > > >> > > > > > > > > > calculate >> > > >> > > > > > > > > > > > and >> > > >> > > > > > > > > > > > > > > connect to affinity node on the >> > client-side >> > > >> > > (affinity >> > > >> > > > > > > > awareness >> > > >> > > > > > > > > > can >> > > >> > > > > > > > > > > > be >> > > >> > > > > > > > > > > > > > > disabled, custom affinity function can >> be >> > > >> used or >> > > >> > > > there >> > > >> > > > > > can >> > > >> > > > > > > > be >> > > >> > > > > > > > > no >> > > >> > > > > > > > > > > > > > > connection between client and affinity >> > > node), >> > > >> but >> > > >> > > we >> > > >> > > > > can >> > > >> > > > > > > make >> > > >> > > > > > > > > > best >> > > >> > > > > > > > > > > > > effort >> > > >> > > > > > > > > > > > > > > to send request to target node if >> 
affinity >> > > >> > > awareness >> > > >> > > > is >> > > >> > > > > > > > > enabled. >> > > >> > > > > > > > > > > > > > > >> > > >> > > > > > > > > > > > > > > Currently, on the server-side requests >> > > always >> > > >> > > > processed >> > > >> > > > > > > > > > > synchronously >> > > >> > > > > > > > > > > > > and >> > > >> > > > > > > > > > > > > > > responses are sent right after request >> was >> > > >> > > processed. >> > > >> > > > > To >> > > >> > > > > > > > > execute >> > > >> > > > > > > > > > > long >> > > >> > > > > > > > > > > > > > tasks >> > > >> > > > > > > > > > > > > > > async we should whether change this >> logic >> > or >> > > >> > > > introduce >> > > >> > > > > > some >> > > >> > > > > > > > > kind >> > > >> > > > > > > > > > > > > two-way >> > > >> > > > > > > > > > > > > > > communication between client and server >> > (now >> > > >> only >> > > >> > > > > one-way >> > > >> > > > > > > > > > requests >> > > >> > > > > > > > > > > > from >> > > >> > > > > > > > > > > > > > > client to server are allowed). >> > > >> > > > > > > > > > > > > > > >> > > >> > > > > > > > > > > > > > > Two-way communication can also be >> useful >> > in >> > > >> the >> > > >> > > > future >> > > >> > > > > if >> > > >> > > > > > > we >> > > >> > > > > > > > > will >> > > >> > > > > > > > > > > > send >> > > >> > > > > > > > > > > > > > some >> > > >> > > > > > > > > > > > > > > server-side generated events to >> clients. 
>> > > >> > > > > > > > > > > > > > > >> > > >> > > > > > > > > > > > > > > In case of two-way communication there >> can >> > > be >> > > >> new >> > > >> > > > > > > operations >> > > >> > > > > > > > > > > > > introduced: >> > > >> > > > > > > > > > > > > > > >> > > >> > > > > > > > > > > > > > > OP_COMPUTE_EXECUTE_TASK (from client to >> > > >> server) >> > > >> > > > > > > > > > > > > > > Request: UUID nodeId, String taskName, >> > > Object >> > > >> arg >> > > >> > > > > > > > > > > > > > > Response: long taskId >> > > >> > > > > > > > > > > > > > > >> > > >> > > > > > > > > > > > > > > OP_COMPUTE_TASK_FINISHED (from server >> to >> > > >> client) >> > > >> > > > > > > > > > > > > > > Request: taskId, Object result >> > > >> > > > > > > > > > > > > > > Response: empty >> > > >> > > > > > > > > > > > > > > >> > > >> > > > > > > > > > > > > > > The same for affinity requests. >> > > >> > > > > > > > > > > > > > > >> > > >> > > > > > > > > > > > > > > Also, we can implement not only execute >> > task >> > > >> > > > operation, >> > > >> > > > > > but >> > > >> > > > > > > > > some >> > > >> > > > > > > > > > > > other >> > > >> > > > > > > > > > > > > > > operations from IgniteCompute >> (broadcast, >> > > run, >> > > >> > > call), >> > > >> > > > > but >> > > >> > > > > > > it >> > > >> > > > > > > > > will >> > > >> > > > > > > > > > > be >> > > >> > > > > > > > > > > > > > useful >> > > >> > > > > > > > > > > > > > > only for java thin client. 
And even >> with >> > > java >> > > >> > thin >> > > >> > > > > client >> > > >> > > > > > > we >> > > >> > > > > > > > > > should >> > > >> > > > > > > > > > > > > > whether >> > > >> > > > > > > > > > > > > > > implement peer-class-loading for thin >> > > clients >> > > >> > (this >> > > >> > > > > also >> > > >> > > > > > > > > requires >> > > >> > > > > > > > > > > > > two-way >> > > >> > > > > > > > > > > > > > > client-server communication) or put >> > classes >> > > >> with >> > > >> > > > > executed >> > > >> > > > > > > > > > closures >> > > >> > > > > > > > > > > to >> > > >> > > > > > > > > > > > > the >> > > >> > > > > > > > > > > > > > > server locally. >> > > >> > > > > > > > > > > > > > > >> > > >> > > > > > > > > > > > > > > What do you think about proposed >> protocol >> > > >> > changes? >> > > >> > > > > > > > > > > > > > > Do we need two-way requests between >> client >> > > and >> > > >> > > > server? >> > > >> > > > > > > > > > > > > > > Do we need support of compute methods >> > other >> > > >> than >> > > >> > > > > "execute >> > > >> > > > > > > > > task"? >> > > >> > > > > > > > > > > > > > > What do you think about >> peer-class-loading >> > > for >> > > >> > thin >> > > >> > > > > > > clients? 
>> > > >> > > > > > > > > > > > > > > >> > > >> > > > > > > > > > > > > > >> > > >> > > > > > > > > > > > > > >> > > >> > > > > > > > > > > > > > -- >> > > >> > > > > > > > > > > > > > Sergey Kozlov >> > > >> > > > > > > > > > > > > > GridGain Systems >> > > >> > > > > > > > > > > > > > www.gridgain.com >> > > >> > > > > > > > > > > > > > >> > > >> > > > > > > > > > > > > >> > > >> > > > > > > > > > > > >> > > >> > > > > > > > > > > > >> > > >> > > > > > > > > > > > -- >> > > >> > > > > > > > > > > > Sergey Kozlov >> > > >> > > > > > > > > > > > GridGain Systems >> > > >> > > > > > > > > > > > www.gridgain.com >> > > >> > > > > > > > > > > > >> > > >> > > > > > > > > > > >> > > >> > > > > > > > > > > >> > > >> > > > > > > > > > > -- >> > > >> > > > > > > > > > > Alex. >> > > >> > > > > > > > > > > >> > > >> > > > > > > > > > >> > > >> > > > > > > > > >> > > >> > > > > > > > >> > > >> > > > > > > >> > > >> > > > > > >> > > >> > > > > >> > > >> > > > >> > > >> > > >> > > >> > >> > > >> >> > > > >> > > >> > >> > |
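Editorial sketch: the two-way variant quoted above (OP_COMPUTE_EXECUTE_TASK answered with a taskId, followed later by OP_COMPUTE_TASK_FINISHED from the server) implies that the client must remember which tasks are still running. The Java fragment below illustrates that bookkeeping only. The class and method names are hypothetical and are not part of Ignite or of the proposed protocol, and wire encoding is left out entirely.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical client-side bookkeeping for running tasks; not an Ignite API. */
class PendingComputeTasks {
    /** Futures of tasks that were started but have not been reported as finished. */
    private final Map<Long, CompletableFuture<Object>> pending = new ConcurrentHashMap<>();

    /** Call with the taskId taken from the OP_COMPUTE_EXECUTE_TASK response. */
    CompletableFuture<Object> onTaskStarted(long taskId) {
        return pending.computeIfAbsent(taskId, id -> new CompletableFuture<>());
    }

    /**
     * Call when an OP_COMPUTE_TASK_FINISHED notification arrives from the server.
     * Returns false if the taskId is unknown or was already completed.
     */
    boolean onTaskFinished(long taskId, Object result) {
        CompletableFuture<Object> fut = pending.remove(taskId);
        return fut != null && fut.complete(result);
    }
}
```

A caller would keep the future returned by onTaskStarted and hand it to the user as the result of executeAsync. This sketch assumes the response carrying the taskId is always processed before the matching notification; a real client would have to decide how to handle the opposite order.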
Hello guys.
I've implemented a PoC and created an IEP [1] for thin client compute grid functionality. Please have a look.

[1]: https://cwiki.apache.org/confluence/display/IGNITE/IEP-42+Thin+client%3A+compute+support

Fri, Jan 24, 2020 at 16:56, Alex Plehanov <[hidden email]>:

> We've discussed the thin client compute protocol with Pavel Tupitsyn and Igor Sapego and come to the conclusion that the approach with two-way requests should be used: the client generates a taskId and sends a request to the server to execute a task. The server responds that the request has been accepted. After the task has finished, the server notifies the client (sends a request without waiting for a response). The client can cancel the task by sending a corresponding request to the server.
>
> Also, a node list should be passed (optionally) with a request to limit the nodes that execute the task.
>
> I will create an IEP and file detailed protocol changes shortly.
>
> Tue, Jan 21, 2020 at 18:46, Alex Plehanov <[hidden email]>:
>
>> Igor, thanks for the reply.
>>
>> > Approach with taskId will require a lot of changes in protocol and thus more "heavy" for implementation
>> Do you mean the approach with the server notifications mechanism? Yes, it will require a lot of changes. But in the most recent messages we've discussed with Pavel an approach without the server notifications mechanism. This approach has the same complexity and performance as the approach with requestId.
>>
>> > But such clients as Python, Node.js, PHP, Go most probably won't have support for this API, at least for now.
>> Without a server notifications mechanism, there will be no breaking changes in the protocol, so a client implementation can just skip this feature and protocol version and implement the next one.
>>
>> > Or never.
>> I think it is still useful to execute java compute tasks from non-java thin clients. Also, we can provide some out-of-the-box java tasks, for example ExecutePythonScriptTask with a python compute implementation, which can run a python script on a server node.
>>
>> > So, maybe it's a good time for us to change our backward compatibility mechanism from protocol versioning to feature masks?
>> I like the idea of feature masks, but it will force us to support both backward compatibility mechanisms, protocol versioning and feature masks.
>>
>> Mon, Jan 20, 2020 at 20:34, Pavel Tupitsyn <[hidden email]>:
>>
>>> Huge +1 from me for Feature Masks.
>>> I think this should be our top priority for thin client protocol, since it simplifies change management a lot.
>>>
>>> On Mon, Jan 20, 2020 at 8:21 PM Igor Sapego <[hidden email]> wrote:
>>>
>>> > Sorry for the late reply.
>>> >
>>> > The approach with taskId will require a lot of changes in the protocol and is thus more "heavy" for implementation, but it definitely looks to me less hacky than the reqId approach. Moreover, as was mentioned, a server notifications mechanism will be required in the future anyway with high probability. So from this point of view I like the taskId approach.
>>> >
>>> > On the other hand, what we should also consider here is performance. Speaking of latency, it looks like reqId will have better results in case of small and fast tasks. The only question here is whether we want to optimize thin clients for this case.
>>> >
>>> > Also, what you are talking about mostly involves clients on platforms that already have a Compute API for thick clients. Let me mention one more point of view and another concern here.
>>> >
>>> > The changes you propose are going to change the protocol version for sure. In the case of the taskId approach and server notifications, even more so.
>>> >
>>> > But such clients as Python, Node.js, PHP, Go most probably won't have support for this API, at least for now. Or never. But the current backward-compatibility mechanism relies on protocol versions, where we imply that a client that supports version 1.5 also supports all the features introduced in all the previous versions of the protocol.
>>> >
>>> > Thus implementing the Compute API in any of the proposed ways *may* force the mentioned clients to support changes in the protocol which they do not necessarily need in order to introduce new features in the future.
>>> >
>>> > So, maybe it's a good time for us to change our backward compatibility mechanism from protocol versioning to feature masks?
>>> >
>>> > WDYT?
>>> >
>>> > Best Regards,
>>> > Igor
>>> >
>>> > On Fri, Jan 17, 2020 at 9:37 AM Alex Plehanov <[hidden email]> wrote:
>>> >
>>> > > Looks like we didn't reach consensus here.
>>> > >
>>> > > Igor, as thin client maintainer, can you please share your opinion?
>>> > >
>>> > > Everyone else is also welcome, please share your thoughts about options to implement operations for compute.
>>> > >
>>> > > Thu, Nov 28, 2019 at 10:02, Alex Plehanov <[hidden email]>:
>>> > >
>>> > > > > Since all thin client operations are inherently async, we should be able to cancel any of them
>>> > > > It's illogical to have such an ability. What should a cancel operation of a cancel operation do? Moreover, sometimes it's dangerous, for example, a create cache operation should never be canceled. There should be an explicit set of processes that we can cancel: queries, transactions, tasks, services. The lifecycle of services is more complex than the lifecycle of tasks. With services, I suppose, we can't use request cancellation, so tasks will be the only process with an exceptional pattern.
>>> > > >
>>> > > > > The request would be "execute task with specified node filter" - simple and efficient.
>>> > > > It's not simple: every compute or service request would have to contain complex node filtering logic, which duplicates the same logic for the cluster API.
>>> > > > It's not efficient: for example, we can't implement forPredicate() filtering in this case.
>>> > > >
>>> > > > Wed, Nov 27, 2019 at 19:25, Pavel Tupitsyn <[hidden email]>:
>>> > > >
>>> > > >> > The request is already processed (task is started), we can't cancel the request
>>> > > >> The request is not "start a task". It is "execute task" (and get result).
>>> > > >> Same as "cache get" - you get a result in the end, we don't "start cache get" then "end cache get".
>>> > > >>
>>> > > >> Since all thin client operations are inherently async, we should be able to cancel any of them by sending another request with the id of the prior request to be cancelled.
>>> > > >> That's why I'm advocating for this approach - it will work for anything, no special cases.
>>> > > >> And it keeps the "happy path" as simple as it is right now.
>>> > > >>
>>> > > >> Queries are different because we retrieve results in pages, we can't do them as one request.
>>> > > >> Transactions are also different because the client controls when they should end.
>>> > > >> There is no reason for task execution to be a special case like queries or transactions.
>>> > > >>
>>> > > >> > we always need to send 2 requests to server to execute the task
>>> > > >> Nope. We don't need to get nodes on the client at all.
>>> > > >> The request would be "execute task with specified node filter" - simple and efficient.
>>> > > >>
>>> > > >> On Wed, Nov 27, 2019 at 4:31 PM Alex Plehanov <[hidden email]> wrote:
>>> > > >>
>>> > > >> > > We do cancel a request to perform a task. We may and should use this to cancel any other request in future.
>>> > > >> > The request is already processed (the task is started), we can't cancel the request. As you mentioned before, we already do almost the same for queries (close the cursor, but not cancel the request to run a query), so it's better to do such things in a common way. We have a pattern: start some process (query, transaction), get the id of this process, end the process by this id. The "Execute task" process should match the same pattern. In my opinion, the implementation with two-way requests is the best option to match this pattern (we can even reuse the OP_RESOURCE_CLOSE operation type in this case). Sometime in the future, we will need two-way requests for some other functionality (continuous queries, event listening, etc). But even without two-way requests, introducing some process id (task id in our case) will be closer to the existing pattern than canceling tasks by request id.
>>> > > >> >
>>> > > >> > > So every new request will apply those filters on server side, using the most recent set of nodes.
>>> > > >> > In this case, we always need to send 2 requests to the server to execute the task. First - to get nodes by the filter, second - to actually execute the task. It seems like overhead. The same will be true for services. A cluster group remains the same if the topology hasn't changed. We can use this fact and bind the "execute task" request to the topology. If the topology has changed - get nodes for the new topology and retry the request.
>>> > > >> >
>>> > > >> > Tue, Nov 26, 2019 at 17:44, Pavel Tupitsyn <[hidden email]>:
>>> > > >> >
>>> > > >> > > > After all, we don't cancel request
>>> > > >> > > We do cancel a request to perform a task. We may and should use this to cancel any other request in future.
>>> > > >> > >
>>> > > >> > > > Client uses some cluster group filtration (for example forServers() cluster group)
>>> > > >> > > Please see above - Aleksandr Shapkin described how we store filtered cluster groups on the client.
>>> > > >> > > We don't store node IDs, we store actual filters. So every new request will apply those filters on the server side, using the most recent set of nodes.
>>> > > >> > >
>>> > > >> > > var myGrp = cluster.forServers().forAttribute("foo"); // This does not issue any server requests, just builds an object with filters on client
>>> > > >> > > while (true) myGrp.compute().executeTask("bar"); // Every request includes filters, and filters are applied on the server side
>>> > > >> > >
>>> > > >> > > On Tue, Nov 26, 2019 at 1:42 PM Alex Plehanov <[hidden email]> wrote:
>>> > > >> > >
>>> > > >> > > > > Anyway, my point stands.
>>> > > >> > > > I can't agree. Why don't you want to use a task id for this? After all, we don't cancel the request (the request is already processed), we cancel the task. So it's more convenient to use a task id here.
>>> > > >> > > >
>>> > > >> > > > > Can you please provide equivalent use case with existing "thick" client?
>>> > > >> > > > For example:
>>> > > >> > > > The cluster consists of one server node.
>>> > > >> > > > The client uses some cluster group filtration (for example the forServers() cluster group).
>>> > > >> > > > The client starts to periodically send (for example 1 per minute) long-term (for example 1 hour long) tasks to the cluster.
>>> > > >> > > > Meanwhile, several server nodes join the cluster.
>>> > > >> > > >
>>> > > >> > > > In case of thick client: All server nodes will be used, tasks will be load balanced.
>>> > > >> > > > In case of thin client: Only one server node will be used, the client will detect the topology change after an hour.
>>> > > >> > > >
>>> > > >> > > > Tue, Nov 26, 2019 at 11:50, Pavel Tupitsyn <[hidden email]>:
>>> > > >> > > >
>>> > > >> > > > > > I can't see any usage of request id in query cursors
>>> > > >> > > > > You are right, cursor id is a separate thing.
>>> > > >> > > > > Anyway, my point stands.
>>> > > >> > > > >
>>> > > >> > > > > > client sends long term tasks to nodes and wants to do it with load balancing
>>> > > >> > > > > I still don't get it. Can you please provide an equivalent use case with the existing "thick" client?
>>> > > >> > > > >
>>> > > >> > > > > On Mon, Nov 25, 2019 at 11:59 PM Alex Plehanov <[hidden email]> wrote:
>>> > > >> > > > >
>>> > > >> > > > > > > And it is fine to use request ID to identify compute tasks (as we do with query cursors).
>>> > > >> > > > > > I can't see any usage of request id in query cursors. We send a query request and get a cursor id in response. After that, we only use the cursor id (to get next pages and to close the resource). Did I miss something?
>>> > > >> > > > > >
>>> > > >> > > > > > > Looks like I'm missing something - how is topology change relevant to executing compute tasks from client?
>>> > > >> > > > > > It's not relevant directly. But there are some cases where it will be helpful. For example, if the client sends long-term tasks to nodes and wants to do it with load balancing, it will detect a topology change only after some time in the future with the first response, so load balancing will not work. Perhaps we can add an optional "topology version" field to the OP_COMPUTE_EXECUTE_TASK request to solve this problem.
>>> > > >> > > > > >
>>> > > >> > > > > > Mon, Nov 25, 2019 at 22:42, Pavel Tupitsyn <[hidden email]>:
>>> > > >> > > > > >
>>> > > >> > > > > > > Alex,
>>> > > >> > > > > > >
>>> > > >> > > > > > > > we will mix entities from different layers (transport layer and request body)
>>> > > >> > > > > > > I would not call our message header (which includes the id) "transport layer".
>>> > > >> > > > > > > TCP is our transport layer. And it is fine to use request ID to identify compute tasks (as we do with query cursors).
>>> > > >> > > > > > >
>>> > > >> > > > > > > > we still can't be sure that the task is successfully started on a server
>>> > > >> > > > > > > The request to start the task will fail and we'll get a response indicating that right away
>>> > > >> > > > > > >
>>> > > >> > > > > > > > we won't ever know about topology change
>>> > > >> > > > > > > Looks like I'm missing something - how is topology change relevant to executing compute tasks from client?
>>> > > >> > > > > > >
>>> > > >> > > > > > > On Mon, Nov 25, 2019 at 10:17 PM Alex Plehanov <[hidden email]> wrote:
>>> > > >> > > > > > >
>>> > > >> > > > > > > > Pavel, in this case, we will mix entities from different layers (transport layer and request body), which is not very good. We can achieve the same behavior with a task id generated on the client side, but there will be no inter-layer data intersection and I think it will be easier to implement on both the client and server side. But we still can't be sure that the task is successfully started on a server. We won't ever know about a topology change, because the topology changed flag will be sent from server to client only with a response when the task is completed. Do we accept that?
>>> > > >> > > > > > > > >>> > > >> > > > > > > > пн, 25 нояб. 2019 г. в 19:07, Pavel Tupitsyn < >>> > > >> > > [hidden email] >>> > > >> > > > >: >>> > > >> > > > > > > > >>> > > >> > > > > > > > > Alex, >>> > > >> > > > > > > > > >>> > > >> > > > > > > > > I have a simpler idea. We already do request id >>> > handling >>> > > >> in >>> > > >> > the >>> > > >> > > > > > > protocol, >>> > > >> > > > > > > > > so: >>> > > >> > > > > > > > > - Client sends a normal request to execute compute >>> > task. >>> > > >> > > Request >>> > > >> > > > ID >>> > > >> > > > > > is >>> > > >> > > > > > > > > generated as usual. >>> > > >> > > > > > > > > - As soon as task is completed, a response is >>> > received. >>> > > >> > > > > > > > > >>> > > >> > > > > > > > > As for cancellation - client can send a new >>> request >>> > > (with >>> > > >> new >>> > > >> > > > > request >>> > > >> > > > > > > ID) >>> > > >> > > > > > > > > and (in the body) pass the request ID from above >>> > > >> > > > > > > > > as a task identifier. As a result, there are two >>> > > >> responses: >>> > > >> > > > > > > > > - Cancellation response >>> > > >> > > > > > > > > - Task response (with proper cancelled status) >>> > > >> > > > > > > > > >>> > > >> > > > > > > > > That's it, no need to modify the core of the >>> protocol. >>> > > One >>> > > >> > > > request >>> > > >> > > > > - >>> > > >> > > > > > > one >>> > > >> > > > > > > > > response. >>> > > >> > > > > > > > > >>> > > >> > > > > > > > > On Mon, Nov 25, 2019 at 6:20 PM Alex Plehanov < >>> > > >> > > > > > [hidden email] >>> > > >> > > > > > > > >>> > > >> > > > > > > > > wrote: >>> > > >> > > > > > > > > >>> > > >> > > > > > > > > > Pavel, we need to inform the client when the >>> task is >>> > > >> > > completed, >>> > > >> > > > > we >>> > > >> > > > > > > need >>> > > >> > > > > > > > > the >>> > > >> > > > > > > > > > ability to cancel the task. 
I see several ways >>> to >>> > > >> implement >>> > > >> > > > this: >>> > > >> > > > > > > > > > >>> > > >> > > > > > > > > > 1. Сlient sends a request to the server to >>> start a >>> > > task, >>> > > >> > > server >>> > > >> > > > > > > return >>> > > >> > > > > > > > > task >>> > > >> > > > > > > > > > id in response. Server notifies client when >>> task is >>> > > >> > completed >>> > > >> > > > > with >>> > > >> > > > > > a >>> > > >> > > > > > > > new >>> > > >> > > > > > > > > > request (from server to client). Client can >>> cancel >>> > the >>> > > >> task >>> > > >> > > by >>> > > >> > > > > > > sending >>> > > >> > > > > > > > a >>> > > >> > > > > > > > > > new request with operation type "cancel" and >>> task >>> > id. >>> > > In >>> > > >> > this >>> > > >> > > > > case, >>> > > >> > > > > > > we >>> > > >> > > > > > > > > > should implement 2-ways requests. >>> > > >> > > > > > > > > > 2. Client generates unique task id and sends a >>> > request >>> > > >> to >>> > > >> > the >>> > > >> > > > > > server >>> > > >> > > > > > > to >>> > > >> > > > > > > > > > start a task, server don't reply immediately but >>> > wait >>> > > >> until >>> > > >> > > > task >>> > > >> > > > > is >>> > > >> > > > > > > > > > completed. Client can cancel task by sending new >>> > > request >>> > > >> > with >>> > > >> > > > > > > operation >>> > > >> > > > > > > > > > type "cancel" and task id. In this case, we >>> should >>> > > >> decouple >>> > > >> > > > > request >>> > > >> > > > > > > and >>> > > >> > > > > > > > > > response on the server-side (currently response >>> is >>> > > sent >>> > > >> > right >>> > > >> > > > > after >>> > > >> > > > > > > > > request >>> > > >> > > > > > > > > > was processed). Also, we can't be sure that >>> task is >>> > > >> > > > successfully >>> > > >> > > > > > > > started >>> > > >> > > > > > > > > on >>> > > >> > > > > > > > > > a server. >>> > > >> > > > > > > > > > 3. 
Client sends a request to the server to >>> start a >>> > > task, >>> > > >> > > server >>> > > >> > > > > > > return >>> > > >> > > > > > > > id >>> > > >> > > > > > > > > > in response. Client periodically asks the server >>> > about >>> > > >> task >>> > > >> > > > > status. >>> > > >> > > > > > > > > Client >>> > > >> > > > > > > > > > can cancel the task by sending new request with >>> > > >> operation >>> > > >> > > type >>> > > >> > > > > > > "cancel" >>> > > >> > > > > > > > > and >>> > > >> > > > > > > > > > task id. This case brings some overhead to the >>> > > >> > communication >>> > > >> > > > > > channel. >>> > > >> > > > > > > > > > >>> > > >> > > > > > > > > > Personally, I think that the case with 2-ways >>> > requests >>> > > >> is >>> > > >> > > > better, >>> > > >> > > > > > but >>> > > >> > > > > > > > I'm >>> > > >> > > > > > > > > > open to any other ideas. >>> > > >> > > > > > > > > > >>> > > >> > > > > > > > > > Aleksandr, >>> > > >> > > > > > > > > > >>> > > >> > > > > > > > > > Filtering logic for >>> OP_CLUSTER_GROUP_GET_NODE_IDS >>> > > looks >>> > > >> > > > > > > > overcomplicated. >>> > > >> > > > > > > > > Do >>> > > >> > > > > > > > > > we need server-side filtering at all? Wouldn't >>> it be >>> > > >> better >>> > > >> > > to >>> > > >> > > > > send >>> > > >> > > > > > > > basic >>> > > >> > > > > > > > > > info (ids, order, flags) for all nodes (there is >>> > > >> relatively >>> > > >> > > > small >>> > > >> > > > > > > > amount >>> > > >> > > > > > > > > of >>> > > >> > > > > > > > > > data) and extended info (attributes) for >>> selected >>> > list >>> > > >> of >>> > > >> > > > nodes? >>> > > >> > > > > In >>> > > >> > > > > > > > this >>> > > >> > > > > > > > > > case, we can do basic node filtration on >>> client-side >>> > > >> > > > > (forClients(), >>> > > >> > > > > > > > > > forServers(), forNodeIds(), forOthers(), etc). 
>>> > > >> > > > > > > > > > >>> > > >> > > > > > > > > > Do you use standard ClusterNode serialization? >>> There >>> > > are >>> > > >> > also >>> > > >> > > > > > metrics >>> > > >> > > > > > > > > > serialized with ClusterNode, do we need it on >>> thin >>> > > >> client? >>> > > >> > > > There >>> > > >> > > > > > are >>> > > >> > > > > > > > > other >>> > > >> > > > > > > > > > interfaces exist to show metrics, I think it's >>> > > >> redundant to >>> > > >> > > > > export >>> > > >> > > > > > > > > metrics >>> > > >> > > > > > > > > > to thin clients too. >>> > > >> > > > > > > > > > >>> > > >> > > > > > > > > > What do you think? >>> > > >> > > > > > > > > > >>> > > >> > > > > > > > > > >>> > > >> > > > > > > > > > >>> > > >> > > > > > > > > > >>> > > >> > > > > > > > > > пт, 22 нояб. 2019 г. в 20:15, Aleksandr Shapkin >>> < >>> > > >> > > > > [hidden email] >>> > > >> > > > > > >: >>> > > >> > > > > > > > > > >>> > > >> > > > > > > > > > > Alex, >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > I think you can create a new IEP page and I >>> will >>> > > fill >>> > > >> it >>> > > >> > > with >>> > > >> > > > > the >>> > > >> > > > > > > > > Cluster >>> > > >> > > > > > > > > > > API details. 
>>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > In short, I’ve introduced several new codes: >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > Cluster API is pretty straightforward: >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > OP_CLUSTER_IS_ACTIVE = 5000 >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > OP_CLUSTER_CHANGE_STATE = 5001 >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > OP_CLUSTER_CHANGE_WAL_STATE = 5002 >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > OP_CLUSTER_GET_WAL_STATE = 5003 >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > Cluster group codes: >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > OP_CLUSTER_GROUP_GET_NODE_IDS = 5100 >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > OP_CLUSTER_GROUP_GET_NODE_INFO = 5101 >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > The underlying implementation is based on the >>> > thick >>> > > >> > client >>> > > >> > > > > logic. >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > For every request, we provide a known topology >>> > > version >>> > > >> > and >>> > > >> > > if >>> > > >> > > > > it >>> > > >> > > > > > > has >>> > > >> > > > > > > > > > > changed, >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > a client updates it firstly and then re-sends >>> the >>> > > >> > filtering >>> > > >> > > > > > > request. 
>>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > Alongside the topVer a client sends a >>> serialized >>> > > nodes >>> > > >> > > > > projection >>> > > >> > > > > > > > > object >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > that could be considered as a code to value >>> > mapping. >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > Consider: [{Code = 1, Value= [“DotNet”, >>> > > >> “MyAttribute”}, >>> > > >> > > > > {Code=2, >>> > > >> > > > > > > > > > Value=1}] >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > Where “1” stands for Attribute filtering and >>> “2” – >>> > > >> > > > > > serverNodesOnly >>> > > >> > > > > > > > > flag. >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > As a result of request processing, a server >>> sends >>> > > >> nodeId >>> > > >> > > > UUIDs >>> > > >> > > > > > and >>> > > >> > > > > > > a >>> > > >> > > > > > > > > > > current topVer. >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > When a client obtains nodeIds, it can perform >>> a >>> > > >> NODE_INFO >>> > > >> > > > call >>> > > >> > > > > to >>> > > >> > > > > > > > get a >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > serialized ClusterNode object. In addition >>> there >>> > > >> should >>> > > >> > be >>> > > >> > > a >>> > > >> > > > > > > > different >>> > > >> > > > > > > > > > API >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > method for accessing/updating node metrics. >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > чт, 21 нояб. 2019 г. 
в 12:32, Sergey Kozlov < >>> > > >> > > > > > [hidden email] >>> > > >> > > > > > > >: >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > > Hi Pavel >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > > On Thu, Nov 21, 2019 at 11:30 AM Pavel >>> Tupitsyn >>> > < >>> > > >> > > > > > > > > [hidden email]> >>> > > >> > > > > > > > > > > > wrote: >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > 1. I believe that Cluster operations for >>> Thin >>> > > >> Client >>> > > >> > > > > protocol >>> > > >> > > > > > > are >>> > > >> > > > > > > > > > > already >>> > > >> > > > > > > > > > > > > in the works >>> > > >> > > > > > > > > > > > > by Alexandr Shapkin. Can't find the ticket >>> > > though. >>> > > >> > > > > > > > > > > > > Alexandr, can you please confirm and >>> attach >>> > the >>> > > >> > ticket >>> > > >> > > > > > number? >>> > > >> > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > 2. Proposed changes will work only for >>> Java >>> > > tasks >>> > > >> > that >>> > > >> > > > are >>> > > >> > > > > > > > already >>> > > >> > > > > > > > > > > > deployed >>> > > >> > > > > > > > > > > > > on server nodes. >>> > > >> > > > > > > > > > > > > This is mostly useless for other thin >>> clients >>> > we >>> > > >> have >>> > > >> > > > > > (Python, >>> > > >> > > > > > > > PHP, >>> > > >> > > > > > > > > > > .NET, >>> > > >> > > > > > > > > > > > > C++). >>> > > >> > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > > I don't guess so. The task (execution) is a >>> way >>> > to >>> > > >> > > > implement >>> > > >> > > > > > own >>> > > >> > > > > > > > > layer >>> > > >> > > > > > > > > > > for >>> > > >> > > > > > > > > > > > the thin client application. >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > We should think of a way to make this >>> useful >>> > for >>> > > >> all >>> > > >> > > > > clients. 
>>> > > >> > > > > > > > > > > > > For example, we may allow sending tasks in >>> > some >>> > > >> > > scripting >>> > > >> > > > > > > > language >>> > > >> > > > > > > > > > like >>> > > >> > > > > > > > > > > > > Javascript. >>> > > >> > > > > > > > > > > > > Thoughts? >>> > > >> > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > > The arbitrary code execution from a remote >>> > client >>> > > >> must >>> > > >> > be >>> > > >> > > > > > > protected >>> > > >> > > > > > > > > > > > from malicious code. >>> > > >> > > > > > > > > > > > I don't know how it could be designed but >>> > without >>> > > >> that >>> > > >> > we >>> > > >> > > > > open >>> > > >> > > > > > > the >>> > > >> > > > > > > > > hole >>> > > >> > > > > > > > > > > to >>> > > >> > > > > > > > > > > > kill cluster. >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > On Thu, Nov 21, 2019 at 11:21 AM Sergey >>> > Kozlov < >>> > > >> > > > > > > > > [hidden email] >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > > > wrote: >>> > > >> > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > Hi Alex >>> > > >> > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > The idea is great. But I have some >>> concerns >>> > > that >>> > > >> > > > probably >>> > > >> > > > > > > > should >>> > > >> > > > > > > > > be >>> > > >> > > > > > > > > > > > taken >>> > > >> > > > > > > > > > > > > > into account for design: >>> > > >> > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > 1. We need to have the ability to >>> stop a >>> > > task >>> > > >> > > > > execution, >>> > > >> > > > > > > > smth >>> > > >> > > > > > > > > > like >>> > > >> > > > > > > > > > > > > > OP_COMPUTE_CANCEL_TASK operation >>> (client >>> > > to >>> > > >> > > server) >>> > > >> > > > > > > > > > > > > > 2. What's about task execution >>> timeout? 
>>> > It >>> > > >> may >>> > > >> > > help >>> > > >> > > > to >>> > > >> > > > > > the >>> > > >> > > > > > > > > > cluster >>> > > >> > > > > > > > > > > > > > survival for buggy tasks >>> > > >> > > > > > > > > > > > > > 3. Ignite doesn't have >>> > roles/authorization >>> > > >> > > > > functionality >>> > > >> > > > > > > for >>> > > >> > > > > > > > > > now. >>> > > >> > > > > > > > > > > > But >>> > > >> > > > > > > > > > > > > a >>> > > >> > > > > > > > > > > > > > task is the risky operation for >>> cluster >>> > > (for >>> > > >> > > > security >>> > > >> > > > > > > > > reasons). >>> > > >> > > > > > > > > > > > Could >>> > > >> > > > > > > > > > > > > we >>> > > >> > > > > > > > > > > > > > add for Ignite configuration new >>> options: >>> > > >> > > > > > > > > > > > > > - Explicit turning on for compute >>> task >>> > > >> > support >>> > > >> > > > for >>> > > >> > > > > > thin >>> > > >> > > > > > > > > > > protocol >>> > > >> > > > > > > > > > > > > > (disabled by default) for whole >>> > cluster >>> > > >> > > > > > > > > > > > > > - Explicit turning on for compute >>> task >>> > > >> > support >>> > > >> > > > for >>> > > >> > > > > a >>> > > >> > > > > > > node >>> > > >> > > > > > > > > > > > > > - The list of task names (classes) >>> > > >> allowed to >>> > > >> > > > > execute >>> > > >> > > > > > > by >>> > > >> > > > > > > > > thin >>> > > >> > > > > > > > > > > > > client. >>> > > >> > > > > > > > > > > > > > 4. Support the labeling for task >>> that may >>> > > >> help >>> > > >> > to >>> > > >> > > > > > > > investigate >>> > > >> > > > > > > > > > > issues >>> > > >> > > > > > > > > > > > > on >>> > > >> > > > > > > > > > > > > > cluster (the idea from IEP-34 [1]) >>> > > >> > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > 1. 
>>> > > >> > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > >>> > > >> > > > > > > > > >>> > > >> > > > > > > > >>> > > >> > > > > > > >>> > > >> > > > > > >>> > > >> > > > > >>> > > >> > > > >>> > > >> > > >>> > > >> > >>> > > >> >>> > > >>> > >>> https://cwiki.apache.org/confluence/display/IGNITE/IEP-34+Thin+client%3A+transactions+support >>> > > >> > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > On Thu, Nov 21, 2019 at 10:58 AM Alex >>> > > Plehanov < >>> > > >> > > > > > > > > > > > [hidden email]> >>> > > >> > > > > > > > > > > > > > wrote: >>> > > >> > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > Hello, Igniters! >>> > > >> > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > I have plans to start implementation >>> of >>> > > >> Compute >>> > > >> > > > > interface >>> > > >> > > > > > > for >>> > > >> > > > > > > > > > > Ignite >>> > > >> > > > > > > > > > > > > thin >>> > > >> > > > > > > > > > > > > > > client and want to discuss features >>> that >>> > > >> should >>> > > >> > be >>> > > >> > > > > > > > implemented. 
>>> > > >> > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > We already have Compute >>> implementation for >>> > > >> > > > binary-rest >>> > > >> > > > > > > > clients >>> > > >> > > > > > > > > > > > > > > (GridClientCompute), which have the >>> > > following >>> > > >> > > > > > > functionality: >>> > > >> > > > > > > > > > > > > > > - Filtering cluster nodes >>> (projection) for >>> > > >> > compute >>> > > >> > > > > > > > > > > > > > > - Executing task by the name >>> > > >> > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > I think we can implement this >>> > functionality >>> > > >> in a >>> > > >> > > thin >>> > > >> > > > > > > client >>> > > >> > > > > > > > as >>> > > >> > > > > > > > > > > well. >>> > > >> > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > First of all, we need some operation >>> types >>> > > to >>> > > >> > > > request a >>> > > >> > > > > > > list >>> > > >> > > > > > > > of >>> > > >> > > > > > > > > > all >>> > > >> > > > > > > > > > > > > > > available nodes and probably node >>> > attributes >>> > > >> (by >>> > > >> > a >>> > > >> > > > list >>> > > >> > > > > > of >>> > > >> > > > > > > > > > nodes). >>> > > >> > > > > > > > > > > > Node >>> > > >> > > > > > > > > > > > > > > attributes will be helpful if we will >>> > decide >>> > > >> to >>> > > >> > > > > implement >>> > > >> > > > > > > > > analog >>> > > >> > > > > > > > > > of >>> > > >> > > > > > > > > > > > > > > ClusterGroup#forAttribute or >>> > > >> > > > ClusterGroup#forePredicate >>> > > >> > > > > > > > methods >>> > > >> > > > > > > > > > in >>> > > >> > > > > > > > > > > > the >>> > > >> > > > > > > > > > > > > > thin >>> > > >> > > > > > > > > > > > > > > client. Perhaps they can be requested >>> > > lazily. 
>>> > > >> > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > From the protocol point of view there >>> will >>> > > be >>> > > >> two >>> > > >> > > new >>> > > >> > > > > > > > > operations: >>> > > >> > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > OP_CLUSTER_GET_NODES >>> > > >> > > > > > > > > > > > > > > Request: empty >>> > > >> > > > > > > > > > > > > > > Response: long topologyVersion, int >>> > > >> > > > > minorTopologyVersion, >>> > > >> > > > > > > int >>> > > >> > > > > > > > > > > > > nodesCount, >>> > > >> > > > > > > > > > > > > > > for each node set of node fields (UUID >>> > > nodeId, >>> > > >> > > Object >>> > > >> > > > > or >>> > > >> > > > > > > > String >>> > > >> > > > > > > > > > > > > > > consistentId, long order, etc) >>> > > >> > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > OP_CLUSTER_GET_NODE_ATTRIBUTES >>> > > >> > > > > > > > > > > > > > > Request: int nodesCount, for each >>> node: >>> > UUID >>> > > >> > nodeId >>> > > >> > > > > > > > > > > > > > > Response: int nodesCount, for each >>> node: >>> > int >>> > > >> > > > > > > attributesCount, >>> > > >> > > > > > > > > for >>> > > >> > > > > > > > > > > > each >>> > > >> > > > > > > > > > > > > > node >>> > > >> > > > > > > > > > > > > > > attribute: String name, Object value >>> > > >> > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > To execute tasks we need something >>> like >>> > > these >>> > > >> > > methods >>> > > >> > > > > in >>> > > >> > > > > > > the >>> > > >> > > > > > > > > > client >>> > > >> > > > > > > > > > > > > API: >>> > > >> > > > > > > > > > > > > > > Object execute(String task, Object >>> arg) >>> > > >> > > > > > > > > > > > > > > Future<Object> executeAsync(String >>> task, >>> > > >> Object >>> > > >> > > arg) >>> > > >> > > > > > > > > > > > > > > Object affinityExecute(String task, >>> String >>> > > >> cache, >>> > > >> > > > > Object >>> > > >> > > > > > > key, >>> > > >> > > > 
> > > > > > > Object >>> > > >> > > > > > > > > > > > > arg) >>> > > >> > > > > > > > > > > > > > > Future<Object> >>> affinityExecuteAsync(String >>> > > >> task, >>> > > >> > > > String >>> > > >> > > > > > > > cache, >>> > > >> > > > > > > > > > > Object >>> > > >> > > > > > > > > > > > > > key, >>> > > >> > > > > > > > > > > > > > > Object arg) >>> > > >> > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > Which can be mapped to protocol >>> > operations: >>> > > >> > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > OP_COMPUTE_EXECUTE_TASK >>> > > >> > > > > > > > > > > > > > > Request: UUID nodeId, String taskName, >>> > > Object >>> > > >> arg >>> > > >> > > > > > > > > > > > > > > Response: Object result >>> > > >> > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > OP_COMPUTE_EXECUTE_TASK_AFFINITY >>> > > >> > > > > > > > > > > > > > > Request: String cacheName, Object key, >>> > > String >>> > > >> > > > taskName, >>> > > >> > > > > > > > Object >>> > > >> > > > > > > > > > arg >>> > > >> > > > > > > > > > > > > > > Response: Object result >>> > > >> > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > The second operation is needed >>> because we >>> > > >> > sometimes >>> > > >> > > > > can't >>> > > >> > > > > > > > > > calculate >>> > > >> > > > > > > > > > > > and >>> > > >> > > > > > > > > > > > > > > connect to affinity node on the >>> > client-side >>> > > >> > > (affinity >>> > > >> > > > > > > > awareness >>> > > >> > > > > > > > > > can >>> > > >> > > > > > > > > > > > be >>> > > >> > > > > > > > > > > > > > > disabled, custom affinity function >>> can be >>> > > >> used or >>> > > >> > > > there >>> > > >> > > > > > can >>> > > >> > > > > > > > be >>> > > >> > > > > > > > > no >>> > > >> > > > > > > > > > > > > > > connection between client and affinity >>> > > node), >>> > > >> but >>> > > >> > > we >>> > > >> > > > > can >>> > > >> > > > > > > make >>> > > >> > > > > > > > > > best 
>>> > > >> > > > > > > > > > > > > effort >>> > > >> > > > > > > > > > > > > > > to send request to target node if >>> affinity >>> > > >> > > awareness >>> > > >> > > > is >>> > > >> > > > > > > > > enabled. >>> > > >> > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > Currently, on the server-side requests >>> > > always >>> > > >> > > > processed >>> > > >> > > > > > > > > > > synchronously >>> > > >> > > > > > > > > > > > > and >>> > > >> > > > > > > > > > > > > > > responses are sent right after >>> request was >>> > > >> > > processed. >>> > > >> > > > > To >>> > > >> > > > > > > > > execute >>> > > >> > > > > > > > > > > long >>> > > >> > > > > > > > > > > > > > tasks >>> > > >> > > > > > > > > > > > > > > async we should whether change this >>> logic >>> > or >>> > > >> > > > introduce >>> > > >> > > > > > some >>> > > >> > > > > > > > > kind >>> > > >> > > > > > > > > > > > > two-way >>> > > >> > > > > > > > > > > > > > > communication between client and >>> server >>> > (now >>> > > >> only >>> > > >> > > > > one-way >>> > > >> > > > > > > > > > requests >>> > > >> > > > > > > > > > > > from >>> > > >> > > > > > > > > > > > > > > client to server are allowed). >>> > > >> > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > Two-way communication can also be >>> useful >>> > in >>> > > >> the >>> > > >> > > > future >>> > > >> > > > > if >>> > > >> > > > > > > we >>> > > >> > > > > > > > > will >>> > > >> > > > > > > > > > > > send >>> > > >> > > > > > > > > > > > > > some >>> > > >> > > > > > > > > > > > > > > server-side generated events to >>> clients. 
>>> > > >> > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > In case of two-way communication >>> there can >>> > > be >>> > > >> new >>> > > >> > > > > > > operations >>> > > >> > > > > > > > > > > > > introduced: >>> > > >> > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > OP_COMPUTE_EXECUTE_TASK (from client >>> to >>> > > >> server) >>> > > >> > > > > > > > > > > > > > > Request: UUID nodeId, String taskName, >>> > > Object >>> > > >> arg >>> > > >> > > > > > > > > > > > > > > Response: long taskId >>> > > >> > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > OP_COMPUTE_TASK_FINISHED (from server >>> to >>> > > >> client) >>> > > >> > > > > > > > > > > > > > > Request: taskId, Object result >>> > > >> > > > > > > > > > > > > > > Response: empty >>> > > >> > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > The same for affinity requests. >>> > > >> > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > Also, we can implement not only >>> execute >>> > task >>> > > >> > > > operation, >>> > > >> > > > > > but >>> > > >> > > > > > > > > some >>> > > >> > > > > > > > > > > > other >>> > > >> > > > > > > > > > > > > > > operations from IgniteCompute >>> (broadcast, >>> > > run, >>> > > >> > > call), >>> > > >> > > > > but >>> > > >> > > > > > > it >>> > > >> > > > > > > > > will >>> > > >> > > > > > > > > > > be >>> > > >> > > > > > > > > > > > > > useful >>> > > >> > > > > > > > > > > > > > > only for java thin client. 
And even >>> with >>> > > java >>> > > >> > thin >>> > > >> > > > > client >>> > > >> > > > > > > we >>> > > >> > > > > > > > > > should >>> > > >> > > > > > > > > > > > > > whether >>> > > >> > > > > > > > > > > > > > > implement peer-class-loading for thin >>> > > clients >>> > > >> > (this >>> > > >> > > > > also >>> > > >> > > > > > > > > requires >>> > > >> > > > > > > > > > > > > two-way >>> > > >> > > > > > > > > > > > > > > client-server communication) or put >>> > classes >>> > > >> with >>> > > >> > > > > executed >>> > > >> > > > > > > > > > closures >>> > > >> > > > > > > > > > > to >>> > > >> > > > > > > > > > > > > the >>> > > >> > > > > > > > > > > > > > > server locally. >>> > > >> > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > What do you think about proposed >>> protocol >>> > > >> > changes? >>> > > >> > > > > > > > > > > > > > > Do we need two-way requests between >>> client >>> > > and >>> > > >> > > > server? >>> > > >> > > > > > > > > > > > > > > Do we need support of compute methods >>> > other >>> > > >> than >>> > > >> > > > > "execute >>> > > >> > > > > > > > > task"? >>> > > >> > > > > > > > > > > > > > > What do you think about >>> peer-class-loading >>> > > for >>> > > >> > thin >>> > > >> > > > > > > clients? 
>>> > > >> > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > -- >>> > > >> > > > > > > > > > > > > > Sergey Kozlov >>> > > >> > > > > > > > > > > > > > GridGain Systems >>> > > >> > > > > > > > > > > > > > www.gridgain.com >>> > > >> > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > > -- >>> > > >> > > > > > > > > > > > Sergey Kozlov >>> > > >> > > > > > > > > > > > GridGain Systems >>> > > >> > > > > > > > > > > > www.gridgain.com >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > -- >>> > > >> > > > > > > > > > > Alex. >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > >>> > > >> > > > > > > > > >>> > > >> > > > > > > > >>> > > >> > > > > > > >>> > > >> > > > > > >>> > > >> > > > > >>> > > >> > > > >>> > > >> > > >>> > > >> > >>> > > >> >>> > > > >>> > > >>> > >>> >> |
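The request-ID scheme discussed in this thread (the execute request gets its response only when the task ends, and a cancel request carries the ID of the original execute request in its body, producing two responses) can be illustrated with a small in-memory sketch. This is not the Ignite thin client protocol or any real client API: `FakeServer`, `ThinClient`, the operation names `"EXECUTE_TASK"` / `"CANCEL_TASK"` and the status strings are all hypothetical and exist only for illustration.

```python
import itertools


class FakeServer:
    """In-memory stand-in for a server (hypothetical, not real Ignite code)."""

    def __init__(self):
        self.running = {}  # request id of the execute request -> task name

    def handle(self, req_id, op, body):
        """Return the list of (request_id, payload) responses this request triggers."""
        if op == "EXECUTE_TASK":
            # The response is deferred until the task completes or is cancelled.
            self.running[req_id] = body
            return []
        if op == "CANCEL_TASK":
            target = body  # request id of the earlier EXECUTE_TASK request
            if self.running.pop(target, None) is None:
                return [(req_id, {"status": "NOT_FOUND"})]
            # Two responses: one for the cancel request, one for the task itself.
            return [(req_id, {"status": "OK"}),
                    (target, {"status": "CANCELLED"})]
        raise ValueError(f"unknown operation: {op}")

    def finish(self, req_id, result):
        """Simulate normal task completion: emit the deferred task response."""
        self.running.pop(req_id)
        return [(req_id, {"status": "OK", "result": result})]


class ThinClient:
    """Toy client that matches responses to requests by request id."""

    def __init__(self, server):
        self.server = server
        self.ids = itertools.count(1)  # request ids are generated as usual
        self.responses = {}            # request id -> response payload

    def send(self, op, body):
        req_id = next(self.ids)
        self.deliver(self.server.handle(req_id, op, body))
        return req_id

    def deliver(self, responses):
        for req_id, payload in responses:
            self.responses[req_id] = payload


server = FakeServer()
client = ThinClient(server)
task_req = client.send("EXECUTE_TASK", "MyTask")    # no response yet
cancel_req = client.send("CANCEL_TASK", task_req)   # triggers both responses
print(client.responses)
```

The sketch keeps the "one request, one response" property: every request ID receives exactly one response, and the task's response is simply delayed until the task ends, normally or by cancellation.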
Hi Alex,
First of all, thank you for preparing this IEP! Protocol changes look good to me. However, I have objections regarding race condition behavior and about Java API. 1. "Due to some races, it's possible that notification for some task will be delivered to the client before the response for this task is delivered. Client implementation should be ready to process such cases" I don't think this is acceptable. Client must never receive OP_COMPUTE_TASK_FINISHED notification before OP_COMPUTE_TASK_EXECUTE response, because there is no way to correlate task result with the task itself this way. 2. affinityExecute Does not make sense to me, and not consistent with Thick Compute API, where we have affinityRun/affinityCall instead. Compute API implies that heavy lifting is inside ComputeJob, so it does not matter where ComputeTask is executed, because jobs are mapped to other nodes. Pavel On Wed, Mar 25, 2020 at 3:47 PM Alex Plehanov <[hidden email]> wrote: > Hello guys. > > I've implemented PoC and created IEP [1] for thin client compute grid > functionality. Please have a look. > > [1]: > > https://cwiki.apache.org/confluence/display/IGNITE/IEP-42+Thin+client%3A+compute+support > > пт, 24 янв. 2020 г. в 16:56, Alex Plehanov <[hidden email]>: > > > We've discussed thin client compute protocol with Pavel Tupitsyn and Igor > > Sapego and come to the conclusion that approach with two-way requests > > should be used: client generates taskId and send a request to the server > to > > execute a task. The server responds that the request has been accepted. > > After task has finished the server notifies the client (send a request > > without waiting for a response). The client can cancel the task by > sending > > a corresponding request to the server. > > > > Also, a node list should be passed (optionally) with a request to limit > > nodes to execute the task. > > > > I will create IEP and file detailed protocol changes shortly. > > > > вт, 21 янв. 2020 г. 
в 18:46, Alex Plehanov <[hidden email]>: > > > >> Igor, thanks for the reply. > >> > >> > Approach with taskId will require a lot of changes in protocol and > thus > >> more "heavy" for implementation > >> Do you mean approach with server notifications mechanism? Yes, it will > >> require a lot of changes. But in most recent messages we've discussed > with > >> Pavel approach without server notifications mechanism. This approach > have > >> the same complexity and performance as an approach with requestId. > >> > >> > But such clients as Python, Node.js, PHP, Go most probably won't have > >> support for this API, at least for now. > >> Without a server notifications mechanism, there will be no breaking > >> changes in the protocol, so client implementation can just skip this > >> feature and protocol version and implement the next one. > >> > >> > Or never. > >> I think it still useful to execute java compute tasks from non-java thin > >> clients. Also, we can provide some out-of-the-box java tasks, for > example > >> ExecutePythonScriptTask with python compute implementation, which can > run > >> python script on server node. > >> > >> > So, maybe it's a good time for us to change our backward compatibility > >> mechanism from protocol versioning to feature masks? > >> I like the idea with feature masks, but it will force us to support both > >> backward compatibility mechanisms, protocol versioning and feature > masks. > >> > >> пн, 20 янв. 2020 г. в 20:34, Pavel Tupitsyn <[hidden email]>: > >> > >>> Huge +1 from me for Feature Masks. > >>> I think this should be our top priority for thin client protocol, since > >>> it > >>> simplifies change management a lot. > >>> > >>> On Mon, Jan 20, 2020 at 8:21 PM Igor Sapego <[hidden email]> > wrote: > >>> > >>> > Sorry for the late reply. 
>
> Approach with taskId will require a lot of changes in protocol and thus more "heavy" for implementation, but it definitely looks to me less hacky than reqId-approach. Moreover, as was mentioned, server notifications mechanism will be required in the future anyway with high probability. So from this point of view I like taskId-approach.
>
> On the other hand, what we should also consider here is performance. Speaking of latency, it looks like reqId will have better results in case of small and fast tasks. The only question here is whether we want to optimize thin clients for this case.
>
> Also, what you are talking about mostly involves clients on platforms that already have Compute API for thick clients. Let me mention one more point of view and another concern here.
>
> The changes you propose are going to change protocol version for sure. In case with taskId approach and server notifications - even more so.
>
> But such clients as Python, Node.js, PHP, Go most probably won't have support for this API, at least for now. Or never. But current backward-compatibility mechanism implies protocol versions where we imply that client that supports version 1.5 also supports all the features introduced in all the previous versions of the protocol.
>
> Thus implementing Compute API in any of the proposed ways *may* force mentioned clients to support changes in protocol which they do not necessarily need in order to introduce new features in the future.
>
> So, maybe it's a good time for us to change our backward compatibility mechanism from protocol versioning to feature masks?
>
> WDYT?
>
> Best Regards,
> Igor
>
> On Fri, Jan 17, 2020 at 9:37 AM Alex Plehanov <[hidden email]> wrote:
>
> Looks like we didn't reach consensus here.
>
> Igor, as thin client maintainer, can you please share your opinion?
>
> Everyone else is also welcome, please share your thoughts about options to implement operations for compute.
>
> On Thu, Nov 28, 2019 at 10:02, Alex Plehanov <[hidden email]> wrote:
>
> > Since all thin client operations are inherently async, we should be able to cancel any of them
> It's illogical to have such ability. What should a cancel operation of a cancel operation do? Moreover, sometimes it's dangerous, for example, create cache operation should never be canceled. There should be an explicit set of processes that we can cancel: queries, transactions, tasks, services. The lifecycle of services is more complex than the lifecycle of tasks. With services, I suppose, we can't use request cancelation, so tasks will be the only process with an exceptional pattern.
>
> > The request would be "execute task with specified node filter" - simple and efficient.
> It's not simple: every compute or service request should contain complex node filtering logic, which duplicates the same logic for cluster API.
> It's not efficient: for example, we can't implement forPredicate() filtering in this case.
>
> On Wed, Nov 27, 2019 at 19:25, Pavel Tupitsyn <[hidden email]> wrote:
>
> > The request is already processed (task is started), we can't cancel the request
> The request is not "start a task". It is "execute task" (and get result). Same as "cache get" - you get a result in the end, we don't "start cache get" then "end cache get".
>
> Since all thin client operations are inherently async, we should be able to cancel any of them by sending another request with an id of prior request to be cancelled.
> That's why I'm advocating for this approach - it will work for anything, no special cases.
> And it keeps "happy path" as simple as it is right now.
>
> Queries are different because we retrieve results in pages, we can't do them as one request.
> Transactions are also different because client controls when they should end.
> There is no reason for task execution to be a special case like queries or transactions.
>
> > we always need to send 2 requests to server to execute the task
> Nope. We don't need to get nodes on client at all.
> The request would be "execute task with specified node filter" - simple and efficient.
>
> On Wed, Nov 27, 2019 at 4:31 PM Alex Plehanov <[hidden email]> wrote:
>
> > We do cancel a request to perform a task. We may and should use this to cancel any other request in future.
> The request is already processed (task is started), we can't cancel the request. As you mentioned before, we already do almost the same for queries (close the cursor, but not cancel the request to run a query), it's better to do such things in a common way. We have a pattern: start some process (query, transaction), get id of this process, end process by this id. The "Execute task" process should match the same pattern. In my opinion, implementation with two-way requests is the best option to match this pattern (we can even reuse OP_RESOURCE_CLOSE operation type in this case). Sometime in the future, we will need two-way requests for some other functionality (continuous queries, event listening, etc). But even without two-way requests introducing some process id (task id in our case) will be closer to existing pattern than canceling tasks by request id.
>
> > So every new request will apply those filters on server side, using the most recent set of nodes.
> In this case, we always need to send 2 requests to server to execute the task. First - to get nodes by the filter, second - to actually execute the task. It seems like overhead. The same will be for services. Cluster group remains the same if the topology hasn't changed. We can use this fact and bind "execute task" request to topology. If topology has changed - get nodes for new topology and retry request.
>
> On Tue, Nov 26, 2019 at 17:44, Pavel Tupitsyn <[hidden email]> wrote:
>
> > After all, we don't cancel request
> We do cancel a request to perform a task. We may and should use this to cancel any other request in future.
>
> > Client uses some cluster group filtration (for example forServers() cluster group)
> Please see above - Aleksandr Shapkin described how we store filtered cluster groups on client.
> We don't store node IDs, we store actual filters. So every new request will apply those filters on server side, using the most recent set of nodes.
>
> var myGrp = cluster.forServers().forAttribute("foo"); // This does not issue any server requests, just builds an object with filters on client
> while (true) myGrp.compute().executeTask("bar"); // Every request includes filters, and filters are applied on the server side
>
> On Tue, Nov 26, 2019 at 1:42 PM Alex Plehanov <[hidden email]> wrote:
>
> > Anyway, my point stands.
> I can't agree. Why don't you want to use task id for this? After all, we don't cancel request (request is already processed), we cancel the task. So it's more convenient to use task id here.
>
> > Can you please provide equivalent use case with existing "thick" client?
> For example:
> Cluster consists of one server node.
> Client uses some cluster group filtration (for example forServers() cluster group).
> Client starts to send periodically (for example 1 per minute) long-term (for example 1 hour long) tasks to the cluster.
> Meanwhile, several server nodes joined the cluster.
>
> In case of thick client: All server nodes will be used, tasks will be load balanced.
> In case of thin client: Only one server node will be used, client will detect topology change after an hour.
>
> On Tue, Nov 26, 2019 at 11:50, Pavel Tupitsyn <[hidden email]> wrote:
>
> > I can't see any usage of request id in query cursors
> You are right, cursor id is a separate thing.
> Anyway, my point stands.
>
> > client sends long term tasks to nodes and wants to do it with load balancing
> I still don't get it. Can you please provide equivalent use case with existing "thick" client?
>
> On Mon, Nov 25, 2019 at 11:59 PM Alex Plehanov <[hidden email]> wrote:
>
> > And it is fine to use request ID to identify compute tasks (as we do with query cursors).
> I can't see any usage of request id in query cursors.
> We send query request and get cursor id in response. After that, we only use cursor id (to get next pages and to close the resource). Did I miss something?
>
> > Looks like I'm missing something - how is topology change relevant to executing compute tasks from client?
> It's not relevant directly. But there are some cases where it will be helpful. For example, if client sends long term tasks to nodes and wants to do it with load balancing it will detect topology change only after some time in the future with the first response, so load balancing will not work. Perhaps we can add optional "topology version" field to the OP_COMPUTE_EXECUTE_TASK request to solve this problem.
>
> On Mon, Nov 25, 2019 at 22:42, Pavel Tupitsyn <[hidden email]> wrote:
>
> Alex,
>
> > we will mix entities from different layers (transport layer and request body)
> I would not call our message header (which includes the id) "transport layer".
> TCP is our transport layer.
> And it is fine to use request ID to identify compute tasks (as we do with query cursors).
>
> > we still can't be sure that the task is successfully started on a server
> The request to start the task will fail and we'll get a response indicating that right away
>
> > we won't ever know about topology change
> Looks like I'm missing something - how is topology change relevant to executing compute tasks from client?
>
> On Mon, Nov 25, 2019 at 10:17 PM Alex Plehanov <[hidden email]> wrote:
>
> Pavel, in this case, we will mix entities from different layers (transport layer and request body), it's not very good. The same behavior we can achieve with generated on client-side task id, but there will be no inter-layer data intersection and I think it will be easier to implement on both client and server-side. But we still can't be sure that the task is successfully started on a server.
> We won't ever know about topology change, because topology changed flag will be sent from server to client only with a response when the task is completed. Do we accept that?
>
> On Mon, Nov 25, 2019 at 19:07, Pavel Tupitsyn <[hidden email]> wrote:
>
> Alex,
>
> I have a simpler idea. We already do request id handling in the protocol, so:
> - Client sends a normal request to execute compute task. Request ID is generated as usual.
> - As soon as task is completed, a response is received.
>
> As for cancellation - client can send a new request (with new request ID) and (in the body) pass the request ID from above as a task identifier. As a result, there are two responses:
> - Cancellation response
> - Task response (with proper cancelled status)
>
> That's it, no need to modify the core of the protocol. One request - one response.
>
> On Mon, Nov 25, 2019 at 6:20 PM Alex Plehanov <[hidden email]> wrote:
>
> Pavel, we need to inform the client when the task is completed, we need the ability to cancel the task. I see several ways to implement this:
>
> 1. Client sends a request to the server to start a task, server returns task id in response. Server notifies client when task is completed with a new request (from server to client). Client can cancel the task by sending a new request with operation type "cancel" and task id. In this case, we should implement 2-ways requests.
> 2. Client generates unique task id and sends a request to the server to start a task, server doesn't reply immediately but waits until task is completed.
>    Client can cancel task by sending new request with operation type "cancel" and task id. In this case, we should decouple request and response on the server-side (currently response is sent right after request was processed). Also, we can't be sure that task is successfully started on a server.
> 3. Client sends a request to the server to start a task, server returns id in response. Client periodically asks the server about task status. Client can cancel the task by sending new request with operation type "cancel" and task id. This case brings some overhead to the communication channel.
>
> Personally, I think that the case with 2-ways requests is better, but I'm open to any other ideas.
>
> Aleksandr,
>
> Filtering logic for OP_CLUSTER_GROUP_GET_NODE_IDS looks overcomplicated.
> Do we need server-side filtering at all? Wouldn't it be better to send basic info (ids, order, flags) for all nodes (there is relatively small amount of data) and extended info (attributes) for selected list of nodes? In this case, we can do basic node filtration on client-side (forClients(), forServers(), forNodeIds(), forOthers(), etc).
>
> Do you use standard ClusterNode serialization? There are also metrics serialized with ClusterNode, do we need them on thin client? There are other interfaces to show metrics, I think it's redundant to export metrics to thin clients too.
>
> What do you think?
>
> On Fri, Nov 22, 2019 at 20:15, Aleksandr Shapkin <[hidden email]> wrote:
>
> Alex,
>
> I think you can create a new IEP page and I will fill it with the Cluster API details.
>
> In short, I've introduced several new codes:
>
> Cluster API is pretty straightforward:
>
> OP_CLUSTER_IS_ACTIVE = 5000
> OP_CLUSTER_CHANGE_STATE = 5001
> OP_CLUSTER_CHANGE_WAL_STATE = 5002
> OP_CLUSTER_GET_WAL_STATE = 5003
>
> Cluster group codes:
>
> OP_CLUSTER_GROUP_GET_NODE_IDS = 5100
> OP_CLUSTER_GROUP_GET_NODE_INFO = 5101
>
> The underlying implementation is based on the thick client logic.
>
> For every request, we provide a known topology version and if it has changed, a client first updates it and then re-sends the filtering request.
>
> Alongside the topVer a client sends a serialized nodes projection object that could be considered as a code to value mapping.
>
> Consider: [{Code = 1, Value = ["DotNet", "MyAttribute"]}, {Code = 2, Value = 1}]
>
> Where "1" stands for Attribute filtering and "2" stands for the serverNodesOnly flag.
>
> As a result of request processing, a server sends nodeId UUIDs and a current topVer.
>
> When a client obtains nodeIds, it can perform a NODE_INFO call to get a serialized ClusterNode object. In addition there should be a different API method for accessing/updating node metrics.
>
> On Thu, Nov 21, 2019 at 12:32, Sergey Kozlov <[hidden email]> wrote:
>
> Hi Pavel
>
> On Thu, Nov 21, 2019 at 11:30 AM Pavel Tupitsyn <[hidden email]> wrote:
>
> > 1. I believe that Cluster operations for Thin Client protocol are already in the works by Alexandr Shapkin. Can't find the ticket though.
> > Alexandr, can you please confirm and attach the ticket number?
> >
> > 2. Proposed changes will work only for Java tasks that are already deployed on server nodes.
> > This is mostly useless for other thin clients we have (Python, PHP, .NET, C++).
>
> I don't think so. The task (execution) is a way to implement own layer for the thin client application.
>
> > We should think of a way to make this useful for all clients.
> > For example, we may allow sending tasks in some scripting language like Javascript.
> > Thoughts?
>
> The arbitrary code execution from a remote client must be protected from malicious code.
> I don't know how it could be designed but without that we open a hole that can be used to kill the cluster.
>
> On Thu, Nov 21, 2019 at 11:21 AM Sergey Kozlov <[hidden email]> wrote:
>
> Hi Alex
>
> The idea is great. But I have some concerns that probably should be taken into account for design:
>
> 1. We need to have the ability to stop a task execution, something like OP_COMPUTE_CANCEL_TASK operation (client to server)
> 2. What about a task execution timeout? It may help the cluster survive buggy tasks
> 3. Ignite doesn't have roles/authorization functionality for now. But a task is a risky operation for the cluster (for security reasons).
>    Could we add new options to the Ignite configuration:
>    - Explicit turning on for compute task support for thin protocol (disabled by default) for whole cluster
>    - Explicit turning on for compute task support for a node
>    - The list of task names (classes) allowed to execute by thin client.
> 4. Support the labeling for task that may help to investigate issues on cluster (the idea from IEP-34 [1])
>
> 1. https://cwiki.apache.org/confluence/display/IGNITE/IEP-34+Thin+client%3A+transactions+support
>
> On Thu, Nov 21, 2019 at 10:58 AM Alex Plehanov <[hidden email]> wrote:
>
> Hello, Igniters!
>
> I have plans to start implementation of Compute interface for Ignite thin client and want to discuss features that should be implemented.
> >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > We already have Compute > >>> implementation for > >>> > > >> > > > binary-rest > >>> > > >> > > > > > > > clients > >>> > > >> > > > > > > > > > > > > > > (GridClientCompute), which have the > >>> > > following > >>> > > >> > > > > > > functionality: > >>> > > >> > > > > > > > > > > > > > > - Filtering cluster nodes > >>> (projection) for > >>> > > >> > compute > >>> > > >> > > > > > > > > > > > > > > - Executing task by the name > >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > I think we can implement this > >>> > functionality > >>> > > >> in a > >>> > > >> > > thin > >>> > > >> > > > > > > client > >>> > > >> > > > > > > > as > >>> > > >> > > > > > > > > > > well. > >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > First of all, we need some operation > >>> types > >>> > > to > >>> > > >> > > > request a > >>> > > >> > > > > > > list > >>> > > >> > > > > > > > of > >>> > > >> > > > > > > > > > all > >>> > > >> > > > > > > > > > > > > > > available nodes and probably node > >>> > attributes > >>> > > >> (by > >>> > > >> > a > >>> > > >> > > > list > >>> > > >> > > > > > of > >>> > > >> > > > > > > > > > nodes). > >>> > > >> > > > > > > > > > > > Node > >>> > > >> > > > > > > > > > > > > > > attributes will be helpful if we > will > >>> > decide > >>> > > >> to > >>> > > >> > > > > implement > >>> > > >> > > > > > > > > analog > >>> > > >> > > > > > > > > > of > >>> > > >> > > > > > > > > > > > > > > ClusterGroup#forAttribute or > >>> > > >> > > > ClusterGroup#forePredicate > >>> > > >> > > > > > > > methods > >>> > > >> > > > > > > > > > in > >>> > > >> > > > > > > > > > > > the > >>> > > >> > > > > > > > > > > > > > thin > >>> > > >> > > > > > > > > > > > > > > client. Perhaps they can be > requested > >>> > > lazily. 
> >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > From the protocol point of view > there > >>> will > >>> > > be > >>> > > >> two > >>> > > >> > > new > >>> > > >> > > > > > > > > operations: > >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > OP_CLUSTER_GET_NODES > >>> > > >> > > > > > > > > > > > > > > Request: empty > >>> > > >> > > > > > > > > > > > > > > Response: long topologyVersion, int > >>> > > >> > > > > minorTopologyVersion, > >>> > > >> > > > > > > int > >>> > > >> > > > > > > > > > > > > nodesCount, > >>> > > >> > > > > > > > > > > > > > > for each node set of node fields > (UUID > >>> > > nodeId, > >>> > > >> > > Object > >>> > > >> > > > > or > >>> > > >> > > > > > > > String > >>> > > >> > > > > > > > > > > > > > > consistentId, long order, etc) > >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > OP_CLUSTER_GET_NODE_ATTRIBUTES > >>> > > >> > > > > > > > > > > > > > > Request: int nodesCount, for each > >>> node: > >>> > UUID > >>> > > >> > nodeId > >>> > > >> > > > > > > > > > > > > > > Response: int nodesCount, for each > >>> node: > >>> > int > >>> > > >> > > > > > > attributesCount, > >>> > > >> > > > > > > > > for > >>> > > >> > > > > > > > > > > > each > >>> > > >> > > > > > > > > > > > > > node > >>> > > >> > > > > > > > > > > > > > > attribute: String name, Object value > >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > To execute tasks we need something > >>> like > >>> > > these > >>> > > >> > > methods > >>> > > >> > > > > in > >>> > > >> > > > > > > the > >>> > > >> > > > > > > > > > client > >>> > > >> > > > > > > > > > > > > API: > >>> > > >> > > > > > > > > > > > > > > Object execute(String task, Object > >>> arg) > >>> > > >> > > > > > > > > > > > > > > Future<Object> executeAsync(String > >>> task, > >>> > > >> Object > >>> > > >> > > arg) > >>> > > >> > > > > > > > > > > > > > > Object affinityExecute(String 
task, > >>> String > >>> > > >> cache, > >>> > > >> > > > > Object > >>> > > >> > > > > > > key, > >>> > > >> > > > > > > > > > > Object > >>> > > >> > > > > > > > > > > > > arg) > >>> > > >> > > > > > > > > > > > > > > Future<Object> > >>> affinityExecuteAsync(String > >>> > > >> task, > >>> > > >> > > > String > >>> > > >> > > > > > > > cache, > >>> > > >> > > > > > > > > > > Object > >>> > > >> > > > > > > > > > > > > > key, > >>> > > >> > > > > > > > > > > > > > > Object arg) > >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > Which can be mapped to protocol > >>> > operations: > >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > OP_COMPUTE_EXECUTE_TASK > >>> > > >> > > > > > > > > > > > > > > Request: UUID nodeId, String > taskName, > >>> > > Object > >>> > > >> arg > >>> > > >> > > > > > > > > > > > > > > Response: Object result > >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > OP_COMPUTE_EXECUTE_TASK_AFFINITY > >>> > > >> > > > > > > > > > > > > > > Request: String cacheName, Object > key, > >>> > > String > >>> > > >> > > > taskName, > >>> > > >> > > > > > > > Object > >>> > > >> > > > > > > > > > arg > >>> > > >> > > > > > > > > > > > > > > Response: Object result > >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > The second operation is needed > >>> because we > >>> > > >> > sometimes > >>> > > >> > > > > can't > >>> > > >> > > > > > > > > > calculate > >>> > > >> > > > > > > > > > > > and > >>> > > >> > > > > > > > > > > > > > > connect to affinity node on the > >>> > client-side > >>> > > >> > > (affinity > >>> > > >> > > > > > > > awareness > >>> > > >> > > > > > > > > > can > >>> > > >> > > > > > > > > > > > be > >>> > > >> > > > > > > > > > > > > > > disabled, custom affinity function > >>> can be > >>> > > >> used or > >>> > > >> > > > there > >>> > > >> > > > > > can > >>> > > >> > > > > > > > be > >>> > > >> > > > > > > > > no 
> >>> > > >> > > > > > > > > > > > > > > connection between client and > affinity > >>> > > node), > >>> > > >> but > >>> > > >> > > we > >>> > > >> > > > > can > >>> > > >> > > > > > > make > >>> > > >> > > > > > > > > > best > >>> > > >> > > > > > > > > > > > > effort > >>> > > >> > > > > > > > > > > > > > > to send request to target node if > >>> affinity > >>> > > >> > > awareness > >>> > > >> > > > is > >>> > > >> > > > > > > > > enabled. > >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > Currently, on the server-side > requests > >>> > > always > >>> > > >> > > > processed > >>> > > >> > > > > > > > > > > synchronously > >>> > > >> > > > > > > > > > > > > and > >>> > > >> > > > > > > > > > > > > > > responses are sent right after > >>> request was > >>> > > >> > > processed. > >>> > > >> > > > > To > >>> > > >> > > > > > > > > execute > >>> > > >> > > > > > > > > > > long > >>> > > >> > > > > > > > > > > > > > tasks > >>> > > >> > > > > > > > > > > > > > > async we should whether change this > >>> logic > >>> > or > >>> > > >> > > > introduce > >>> > > >> > > > > > some > >>> > > >> > > > > > > > > kind > >>> > > >> > > > > > > > > > > > > two-way > >>> > > >> > > > > > > > > > > > > > > communication between client and > >>> server > >>> > (now > >>> > > >> only > >>> > > >> > > > > one-way > >>> > > >> > > > > > > > > > requests > >>> > > >> > > > > > > > > > > > from > >>> > > >> > > > > > > > > > > > > > > client to server are allowed). > >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > Two-way communication can also be > >>> useful > >>> > in > >>> > > >> the > >>> > > >> > > > future > >>> > > >> > > > > if > >>> > > >> > > > > > > we > >>> > > >> > > > > > > > > will > >>> > > >> > > > > > > > > > > > send > >>> > > >> > > > > > > > > > > > > > some > >>> > > >> > > > > > > > > > > > > > > server-side generated events to > >>> clients. 
> >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > In case of two-way communication > >>> there can > >>> > > be > >>> > > >> new > >>> > > >> > > > > > > operations > >>> > > >> > > > > > > > > > > > > introduced: > >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > OP_COMPUTE_EXECUTE_TASK (from client > >>> to > >>> > > >> server) > >>> > > >> > > > > > > > > > > > > > > Request: UUID nodeId, String > taskName, > >>> > > Object > >>> > > >> arg > >>> > > >> > > > > > > > > > > > > > > Response: long taskId > >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > OP_COMPUTE_TASK_FINISHED (from > server > >>> to > >>> > > >> client) > >>> > > >> > > > > > > > > > > > > > > Request: taskId, Object result > >>> > > >> > > > > > > > > > > > > > > Response: empty > >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > The same for affinity requests. > >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > Also, we can implement not only > >>> execute > >>> > task > >>> > > >> > > > operation, > >>> > > >> > > > > > but > >>> > > >> > > > > > > > > some > >>> > > >> > > > > > > > > > > > other > >>> > > >> > > > > > > > > > > > > > > operations from IgniteCompute > >>> (broadcast, > >>> > > run, > >>> > > >> > > call), > >>> > > >> > > > > but > >>> > > >> > > > > > > it > >>> > > >> > > > > > > > > will > >>> > > >> > > > > > > > > > > be > >>> > > >> > > > > > > > > > > > > > useful > >>> > > >> > > > > > > > > > > > > > > only for java thin client. 
And even > >>> with > >>> > > java > >>> > > >> > thin > >>> > > >> > > > > client > >>> > > >> > > > > > > we > >>> > > >> > > > > > > > > > should > >>> > > >> > > > > > > > > > > > > > whether > >>> > > >> > > > > > > > > > > > > > > implement peer-class-loading for > thin > >>> > > clients > >>> > > >> > (this > >>> > > >> > > > > also > >>> > > >> > > > > > > > > requires > >>> > > >> > > > > > > > > > > > > two-way > >>> > > >> > > > > > > > > > > > > > > client-server communication) or put > >>> > classes > >>> > > >> with > >>> > > >> > > > > executed > >>> > > >> > > > > > > > > > closures > >>> > > >> > > > > > > > > > > to > >>> > > >> > > > > > > > > > > > > the > >>> > > >> > > > > > > > > > > > > > > server locally. > >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > What do you think about proposed > >>> protocol > >>> > > >> > changes? > >>> > > >> > > > > > > > > > > > > > > Do we need two-way requests between > >>> client > >>> > > and > >>> > > >> > > > server? > >>> > > >> > > > > > > > > > > > > > > Do we need support of compute > methods > >>> > other > >>> > > >> than > >>> > > >> > > > > "execute > >>> > > >> > > > > > > > > task"? > >>> > > >> > > > > > > > > > > > > > > What do you think about > >>> peer-class-loading > >>> > > for > >>> > > >> > thin > >>> > > >> > > > > > > clients? 
> >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > -- > >>> > > >> > > > > > > > > > > > > > Sergey Kozlov > >>> > > >> > > > > > > > > > > > > > GridGain Systems > >>> > > >> > > > > > > > > > > > > > www.gridgain.com > >>> > > >> > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > -- > >>> > > >> > > > > > > > > > > > Sergey Kozlov > >>> > > >> > > > > > > > > > > > GridGain Systems > >>> > > >> > > > > > > > > > > > www.gridgain.com > >>> > > >> > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > -- > >>> > > >> > > > > > > > > > > Alex. > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > >>> > > >> > > > > > > > > >>> > > >> > > > > > > > >>> > > >> > > > > > > >>> > > >> > > > > > >>> > > >> > > > > >>> > > >> > > > >>> > > >> > > >>> > > >> > >>> > > > > >>> > > > >>> > > >>> > >> > |
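For readers skimming the quoted proposal, here is a minimal sketch of the client-side bookkeeping that the two-way variant (OP_COMPUTE_EXECUTE_TASK returning a taskId, then a server-to-client OP_COMPUTE_TASK_FINISHED) would seem to require. It is not the Ignite API: the class TwoWayComputeSketch, the Channel interface, TaskHandle, and all method names are hypothetical, and the cancel call only follows the OP_COMPUTE_CANCEL_TASK suggestion from the thread. It also assumes the response carrying the taskId is handled before any notification for that task.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical client-side bookkeeping for two-way compute requests; not the Ignite API. */
class TwoWayComputeSketch {
    /** Hypothetical channel; a real client would write protocol messages to a socket. */
    interface Channel {
        /** Sends an OP_COMPUTE_EXECUTE_TASK-style request, returns the taskId from the response. */
        long sendExecute(String taskName, Object arg);

        /** Sends a cancel request for the given task (assumed operation). */
        void sendCancel(long taskId);
    }

    /** Pairs the server-assigned task id with the future that will hold the result. */
    static final class TaskHandle {
        final long taskId;
        final CompletableFuture<Object> result;

        TaskHandle(long taskId, CompletableFuture<Object> result) {
            this.taskId = taskId;
            this.result = result;
        }
    }

    private final Channel channel;

    /** Tasks that were started but have not been reported as finished yet. */
    private final Map<Long, CompletableFuture<Object>> pending = new ConcurrentHashMap<>();

    TwoWayComputeSketch(Channel channel) {
        this.channel = channel;
    }

    /** Client -> server: start the task and remember a future keyed by its id. */
    TaskHandle executeAsync(String taskName, Object arg) {
        long taskId = channel.sendExecute(taskName, arg);
        CompletableFuture<Object> fut = new CompletableFuture<>();
        pending.put(taskId, fut);
        return new TaskHandle(taskId, fut);
    }

    /** Server -> client: called when an OP_COMPUTE_TASK_FINISHED-style notification arrives. */
    void onTaskFinished(long taskId, Object result) {
        CompletableFuture<Object> fut = pending.remove(taskId);
        if (fut != null)
            fut.complete(result);
    }

    /** Client -> server: cancel a still-pending task by its id. */
    void cancel(long taskId) {
        CompletableFuture<Object> fut = pending.remove(taskId);
        if (fut != null) {
            channel.sendCancel(taskId);
            fut.cancel(false);
        }
    }
}
```

The point of keying everything by taskId is that the result no longer has to arrive as the response to the original request, which is exactly the decoupling the proposal asks for.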
In reply to this post by Alexey Plekhanov
Alex, thanks for preparing the outline.
I'd like us to discuss an approach for updating compute tasks with no downtime on the servers' end. For instance, let's assume that a Python/C++/Node.JS developer asks to update a compute task that they call from the app. Should we introduce a system-level API in the binary protocol that can take a jar file (or class) and redeploy it automatically using peer-class-loading?

- Denis

On Wed, Mar 25, 2020 at 5:47 AM Alex Plehanov <[hidden email]> wrote:

> Hello guys.
>
> I've implemented PoC and created IEP [1] for thin client compute grid functionality. Please have a look.
>
> [1]: https://cwiki.apache.org/confluence/display/IGNITE/IEP-42+Thin+client%3A+compute+support
>
> On Fri, Jan 24, 2020 at 16:56, Alex Plehanov <[hidden email]> wrote:
>
> > We've discussed the thin client compute protocol with Pavel Tupitsyn and Igor Sapego and come to the conclusion that the approach with two-way requests should be used: the client generates a taskId and sends a request to the server to execute a task. The server responds that the request has been accepted. After the task has finished the server notifies the client (sends a request without waiting for a response). The client can cancel the task by sending a corresponding request to the server.
> >
> > Also, a node list should be passed (optionally) with a request to limit the nodes that execute the task.
> >
> > I will create IEP and file detailed protocol changes shortly.
> >
> > On Tue, Jan 21, 2020 at 18:46, Alex Plehanov <[hidden email]> wrote:
> >
> >> Igor, thanks for the reply.
> >>
> >> > Approach with taskId will require a lot of changes in protocol and thus more "heavy" for implementation
> >> Do you mean the approach with a server notifications mechanism? Yes, it will require a lot of changes. But in the most recent messages we've discussed with Pavel an approach without a server notifications mechanism. This approach has the same complexity and performance as the approach with requestId.
> >>
> >> > But such clients as Python, Node.js, PHP, Go most probably won't have support for this API, at least for now.
> >> Without a server notifications mechanism, there will be no breaking changes in the protocol, so a client implementation can just skip this feature and protocol version and implement the next one.
> >>
> >> > Or never.
> >> I think it is still useful to execute java compute tasks from non-java thin clients. Also, we can provide some out-of-the-box java tasks, for example ExecutePythonScriptTask with a python compute implementation, which can run a python script on a server node.
> >>
> >> > So, maybe it's a good time for us to change our backward compatibility mechanism from protocol versioning to feature masks?
> >> I like the idea with feature masks, but it will force us to support both backward compatibility mechanisms, protocol versioning and feature masks.
> >>
> >> On Mon, Jan 20, 2020 at 20:34, Pavel Tupitsyn <[hidden email]> wrote:
> >>
> >>> Huge +1 from me for Feature Masks.
> >>> I think this should be our top priority for thin client protocol, since it simplifies change management a lot.
> >>>
> >>> On Mon, Jan 20, 2020 at 8:21 PM Igor Sapego <[hidden email]> wrote:
> >>>
> >>> > Sorry for the late reply.
> >>> >
> >>> > The approach with taskId will require a lot of changes in the protocol and is thus more "heavy" for implementation, but it definitely looks to me less hacky than the reqId approach. Moreover, as was mentioned, a server notifications mechanism will be required in the future anyway with high probability. So from this point of view I like the taskId approach.
> >>> >
> >>> > On the other hand, what we should also consider here is performance. Speaking of latency, it looks like reqId will have better results in case of small and fast tasks. The only question here is whether we want to optimize thin clients for this case.
> >>> >
> >>> > Also, what you are talking about mostly involves clients on platforms that already have a Compute API for thick clients. Let me mention one more point of view and another concern here.
> >>> >
> >>> > The changes you propose are going to change the protocol version for sure. In case of the taskId approach and server notifications - even more so.
> >>> >
> >>> > But such clients as Python, Node.js, PHP, Go most probably won't have support for this API, at least for now. Or never. But the current backward-compatibility mechanism relies on protocol versions, where we imply that a client that supports version 1.5 also supports all the features introduced in all the previous versions of the protocol.
> >>> >
> >>> > Thus implementing the Compute API in any of the proposed ways *may* force the mentioned clients to support changes in the protocol which they do not necessarily need in order to introduce new features in the future.
> >>> >
> >>> > So, maybe it's a good time for us to change our backward compatibility mechanism from protocol versioning to feature masks?
> >>> >
> >>> > WDYT?
> >>> >
> >>> > Best Regards,
> >>> > Igor
> >>> >
> >>> > On Fri, Jan 17, 2020 at 9:37 AM Alex Plehanov <[hidden email]> wrote:
> >>> >
> >>> > > Looks like we didn't reach consensus here.
> >>> > >
> >>> > > Igor, as thin client maintainer, can you please share your opinion?
> >>> > >
> >>> > > Everyone else is also welcome, please share your thoughts about options to implement operations for compute.
> >>> > >
> >>> > > On Thu, Nov 28, 2019 at 10:02, Alex Plehanov <[hidden email]> wrote:
> >>> > >
> >>> > > > > Since all thin client operations are inherently async, we should be able to cancel any of them
> >>> > > > It's illogical to have such an ability. What should a cancel of a cancel operation do? Moreover, sometimes it's dangerous, for example, a create cache operation should never be canceled. There should be an explicit set of processes that we can cancel: queries, transactions, tasks, services. The lifecycle of services is more complex than the lifecycle of tasks. With services, I suppose, we can't use request cancelation, so tasks will be the only process with an exceptional pattern.
> >>> > > >
> >>> > > > > The request would be "execute task with specified node filter" - simple and efficient.
> >>> > > > It's not simple: every compute or service request should contain complex node filtering logic, which duplicates the same logic for cluster API.
> >>> > > > It's not efficient: for example, we can't implement forPredicate() filtering in this case.
> >>> > > >
> >>> > > > On Wed, Nov 27, 2019 at 19:25, Pavel Tupitsyn <[hidden email]> wrote:
> >>> > > >
> >>> > > >> > The request is already processed (task is started), we can't cancel the request
> >>> > > >> The request is not "start a task". It is "execute task" (and get result). Same as "cache get" - you get a result in the end, we don't "start cache get" then "end cache get".
> >>> > > >>
> >>> > > >> Since all thin client operations are inherently async, we should be able to cancel any of them by sending another request with an id of the prior request to be cancelled. That's why I'm advocating for this approach - it will work for anything, no special cases. And it keeps the "happy path" as simple as it is right now.
> >>> > > >>
> >>> > > >> Queries are different because we retrieve results in pages, we can't do them as one request. Transactions are also different because the client controls when they should end. There is no reason for task execution to be a special case like queries or transactions.
> >>> > > >>
> >>> > > >> > we always need to send 2 requests to server to execute the task
> >>> > > >> Nope. We don't need to get nodes on client at all. The request would be "execute task with specified node filter" - simple and efficient.
> >>> > > >>
> >>> > > >> On Wed, Nov 27, 2019 at 4:31 PM Alex Plehanov <[hidden email]> wrote:
> >>> > > >>
> >>> > > >> > > We do cancel a request to perform a task. We may and should use this to cancel any other request in future.
> >>> > > >> > The request is already processed (task is started), we can't cancel the request. As you mentioned before, we already do almost the same for queries (close the cursor, but not cancel the request to run a query), it's better to do such things in a common way. We have a pattern: start some process (query, transaction), get the id of this process, end the process by this id. The "Execute task" process should match the same pattern. In my opinion, implementation with two-way requests is the best option to match this pattern (we can even reuse the OP_RESOURCE_CLOSE operation type in this case). Sometime in the future, we will need two-way requests for some other functionality (continuous queries, event listening, etc). But even without two-way requests, introducing some process id (task id in our case) will be closer to the existing pattern than canceling tasks by request id.
> >>> > > >> >
> >>> > > >> > > So every new request will apply those filters on server side, using the most recent set of nodes.
> >>> > > >> > In this case, we always need to send 2 requests to the server to execute the task. First - to get nodes by the filter, second - to actually execute the task. It seems like overhead. The same will be for services. The cluster group remains the same if the topology hasn't changed. We can use this fact and bind the "execute task" request to the topology. If the topology has changed - get nodes for the new topology and retry the request.
> >>> > > >> >
> >>> > > >> > On Tue, Nov 26, 2019 at 17:44, Pavel Tupitsyn <[hidden email]> wrote:
> >>> > > >> >
> >>> > > >> > > > After all, we don't cancel request
> >>> > > >> > > We do cancel a request to perform a task. We may and should use this to cancel any other request in future.
> >>> > > >> > >
> >>> > > >> > > > Client uses some cluster group filtration (for example forServers() cluster group)
> >>> > > >> > > Please see above - Aleksandr Shapkin described how we store filtered cluster groups on client. We don't store node IDs, we store actual filters. So every new request will apply those filters on server side, using the most recent set of nodes.
> >>> > > >> > >
> >>> > > >> > > var myGrp = cluster.forServers().forAttribute("foo"); // This does not issue any server requests, just builds an object with filters on client
> >>> > > >> > > while (true) myGrp.compute().executeTask("bar"); // Every request includes filters, and filters are applied on the server side
> >>> > > >> > >
> >>> > > >> > > On Tue, Nov 26, 2019 at 1:42 PM Alex Plehanov <[hidden email]> wrote:
> >>> > > >> > >
> >>> > > >> > > > > Anyway, my point stands.
> >>> > > >> > > > I can't agree. Why don't you want to use task id for this? After all, we don't cancel the request (the request is already processed), we cancel the task. So it's more convenient to use task id here.
> >>> > > >> > > >
> >>> > > >> > > > > Can you please provide equivalent use case with existing "thick" client?
> >>> > > >> > > > For example:
> >>> > > >> > > > Cluster consists of one server node.
> >>> > > >> > > > Client uses some cluster group filtration (for example forServers() cluster group).
> >>> > > >> > > > Client starts to send periodically (for example 1 per minute) long-term (for example 1 hour long) tasks to the cluster.
> >>> > > >> > > > Meanwhile, several server nodes joined the cluster.
> >>> > > >> > > >
> >>> > > >> > > > In case of thick client: All server nodes will be used, tasks will be load balanced.
> >>> > > >> > > > In case of thin client: Only one server node will be used, client will detect topology change after an hour.
> >>> > > >> > > >
> >>> > > >> > > > On Tue, Nov 26, 2019 at 11:50, Pavel Tupitsyn <[hidden email]> wrote:
> >>> > > >> > > >
> >>> > > >> > > > > > I can't see any usage of request id in query cursors
> >>> > > >> > > > > You are right, cursor id is a separate thing.
> >>> > > >> > > > > Anyway, my point stands.
> >>> > > >> > > > >
> >>> > > >> > > > > > client sends long term tasks to nodes and wants to do it with load balancing
> >>> > > >> > > > > I still don't get it. Can you please provide equivalent use case with existing "thick" client?
> >>> > > >> > > > >
> >>> > > >> > > > > On Mon, Nov 25, 2019 at 11:59 PM Alex Plehanov <[hidden email]> wrote:
> >>> > > >> > > > >
> >>> > > >> > > > > > > And it is fine to use request ID to identify compute tasks (as we do with query cursors).
> >>> > > >> > > > > > I can't see any usage of request id in query cursors. We send a query request and get a cursor id in response. After that, we only use the cursor id (to get next pages and to close the resource). Did I miss something?
> >>> > > >> > > > > >
> >>> > > >> > > > > > > Looks like I'm missing something - how is topology change relevant to executing compute tasks from client?
> >>> > > >> > > > > > It's not relevant directly. But there are some cases where it will be helpful. For example, if a client sends long-term tasks to nodes and wants to do it with load balancing, it will detect a topology change only after some time in the future with the first response, so load balancing will not work. Perhaps we can add an optional "topology version" field to the OP_COMPUTE_EXECUTE_TASK request to solve this problem.
> >>> > > >> > > > > >
> >>> > > >> > > > > > On Mon, Nov 25, 2019 at 22:42, Pavel Tupitsyn <[hidden email]> wrote:
> >>> > > >> > > > > >
> >>> > > >> > > > > > > Alex,
> >>> > > >> > > > > > >
> >>> > > >> > > > > > > > we will mix entities from different layers (transport layer and request body)
> >>> > > >> > > > > > > I would not call our message header (which includes the id) "transport layer". TCP is our transport layer. And it is fine to use request ID to identify compute tasks (as we do with query cursors).
> >>> > > >> > > > > > >
> >>> > > >> > > > > > > > we still can't be sure that the task is successfully started on a server
> >>> > > >> > > > > > > The request to start the task will fail and we'll get a response indicating that right away
> >>> > > >> > > > > > >
> >>> > > >> > > > > > > > we won't ever know about topology change
> >>> > > >> > > > > > > Looks like I'm missing something - how is topology change relevant to executing compute tasks from client?
> >>> > > >> > > > > > >
> >>> > > >> > > > > > > On Mon, Nov 25, 2019 at 10:17 PM Alex Plehanov <[hidden email]> wrote:
> >>> > > >> > > > > > >
> >>> > > >> > > > > > > > Pavel, in this case, we will mix entities from different layers (transport layer and request body), it's not very good. We can achieve the same behavior with a task id generated on the client side, but there will be no inter-layer data intersection and I think it will be easier to implement on both client and server-side. But we still can't be sure that the task is successfully started on a server. We won't ever know about a topology change, because the topology changed flag will be sent from server to client only with the response when the task is completed. Do we accept that?
> >>> > > >> > > > > > > >
> >>> > > >> > > > > > > > On Mon, Nov 25, 2019 at 19:07, Pavel Tupitsyn <[hidden email]> wrote:
> >>> > > >> > > > > > > >
> >>> > > >> > > > > > > > > Alex,
> >>> > > >> > > > > > > > >
> >>> > > >> > > > > > > > > I have a simpler idea. We already do request id handling in the protocol, so:
> >>> > > >> > > > > > > > > - Client sends a normal request to execute compute task. Request ID is generated as usual.
> >>> > > >> > > > > > > > > - As soon as task is completed, a response is received.
> >>> > > >> > > > > > > > >
> >>> > > >> > > > > > > > > As for cancellation - client can send a new request (with new request ID) and (in the body) pass the request ID from above as a task identifier. As a result, there are two responses:
> >>> > > >> > > > > > > > > - Cancellation response
> >>> > > >> > > > > > > > > - Task response (with proper cancelled status)
> >>> > > >> > > > > > > > >
> >>> > > >> > > > > > > > > That's it, no need to modify the core of the protocol. One request - one response.
> >>> > > >> > > > > > > > > > >>> > > >> > > > > > > > > On Mon, Nov 25, 2019 at 6:20 PM Alex Plehanov < > >>> > > >> > > > > > [hidden email] > >>> > > >> > > > > > > > > >>> > > >> > > > > > > > > wrote: > >>> > > >> > > > > > > > > > >>> > > >> > > > > > > > > > Pavel, we need to inform the client when the > >>> task is > >>> > > >> > > completed, > >>> > > >> > > > > we > >>> > > >> > > > > > > need > >>> > > >> > > > > > > > > the > >>> > > >> > > > > > > > > > ability to cancel the task. I see several ways > >>> to > >>> > > >> implement > >>> > > >> > > > this: > >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > 1. Сlient sends a request to the server to > >>> start a > >>> > > task, > >>> > > >> > > server > >>> > > >> > > > > > > return > >>> > > >> > > > > > > > > task > >>> > > >> > > > > > > > > > id in response. Server notifies client when > >>> task is > >>> > > >> > completed > >>> > > >> > > > > with > >>> > > >> > > > > > a > >>> > > >> > > > > > > > new > >>> > > >> > > > > > > > > > request (from server to client). Client can > >>> cancel > >>> > the > >>> > > >> task > >>> > > >> > > by > >>> > > >> > > > > > > sending > >>> > > >> > > > > > > > a > >>> > > >> > > > > > > > > > new request with operation type "cancel" and > >>> task > >>> > id. > >>> > > In > >>> > > >> > this > >>> > > >> > > > > case, > >>> > > >> > > > > > > we > >>> > > >> > > > > > > > > > should implement 2-ways requests. > >>> > > >> > > > > > > > > > 2. Client generates unique task id and sends a > >>> > request > >>> > > >> to > >>> > > >> > the > >>> > > >> > > > > > server > >>> > > >> > > > > > > to > >>> > > >> > > > > > > > > > start a task, server don't reply immediately > but > >>> > wait > >>> > > >> until > >>> > > >> > > > task > >>> > > >> > > > > is > >>> > > >> > > > > > > > > > completed. 
Client can cancel task by sending > new > >>> > > request > >>> > > >> > with > >>> > > >> > > > > > > operation > >>> > > >> > > > > > > > > > type "cancel" and task id. In this case, we > >>> should > >>> > > >> decouple > >>> > > >> > > > > request > >>> > > >> > > > > > > and > >>> > > >> > > > > > > > > > response on the server-side (currently > response > >>> is > >>> > > sent > >>> > > >> > right > >>> > > >> > > > > after > >>> > > >> > > > > > > > > request > >>> > > >> > > > > > > > > > was processed). Also, we can't be sure that > >>> task is > >>> > > >> > > > successfully > >>> > > >> > > > > > > > started > >>> > > >> > > > > > > > > on > >>> > > >> > > > > > > > > > a server. > >>> > > >> > > > > > > > > > 3. Client sends a request to the server to > >>> start a > >>> > > task, > >>> > > >> > > server > >>> > > >> > > > > > > return > >>> > > >> > > > > > > > id > >>> > > >> > > > > > > > > > in response. Client periodically asks the > server > >>> > about > >>> > > >> task > >>> > > >> > > > > status. > >>> > > >> > > > > > > > > Client > >>> > > >> > > > > > > > > > can cancel the task by sending new request > with > >>> > > >> operation > >>> > > >> > > type > >>> > > >> > > > > > > "cancel" > >>> > > >> > > > > > > > > and > >>> > > >> > > > > > > > > > task id. This case brings some overhead to the > >>> > > >> > communication > >>> > > >> > > > > > channel. > >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > Personally, I think that the case with 2-ways > >>> > requests > >>> > > >> is > >>> > > >> > > > better, > >>> > > >> > > > > > but > >>> > > >> > > > > > > > I'm > >>> > > >> > > > > > > > > > open to any other ideas. > >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > Aleksandr, > >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > Filtering logic for > >>> OP_CLUSTER_GROUP_GET_NODE_IDS > >>> > > looks > >>> > > >> > > > > > > > overcomplicated. 
> >>> > > >> > > > > > > > > Do > >>> > > >> > > > > > > > > > we need server-side filtering at all? Wouldn't > >>> it be > >>> > > >> better > >>> > > >> > > to > >>> > > >> > > > > send > >>> > > >> > > > > > > > basic > >>> > > >> > > > > > > > > > info (ids, order, flags) for all nodes (there > is > >>> > > >> relatively > >>> > > >> > > > small > >>> > > >> > > > > > > > amount > >>> > > >> > > > > > > > > of > >>> > > >> > > > > > > > > > data) and extended info (attributes) for > >>> selected > >>> > list > >>> > > >> of > >>> > > >> > > > nodes? > >>> > > >> > > > > In > >>> > > >> > > > > > > > this > >>> > > >> > > > > > > > > > case, we can do basic node filtration on > >>> client-side > >>> > > >> > > > > (forClients(), > >>> > > >> > > > > > > > > > forServers(), forNodeIds(), forOthers(), etc). > >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > Do you use standard ClusterNode serialization? > >>> There > >>> > > are > >>> > > >> > also > >>> > > >> > > > > > metrics > >>> > > >> > > > > > > > > > serialized with ClusterNode, do we need it on > >>> thin > >>> > > >> client? > >>> > > >> > > > There > >>> > > >> > > > > > are > >>> > > >> > > > > > > > > other > >>> > > >> > > > > > > > > > interfaces exist to show metrics, I think it's > >>> > > >> redundant to > >>> > > >> > > > > export > >>> > > >> > > > > > > > > metrics > >>> > > >> > > > > > > > > > to thin clients too. > >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > What do you think? > >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > пт, 22 нояб. 2019 г. 
в 20:15, Aleksandr > Shapkin > >>> < > >>> > > >> > > > > [hidden email] > >>> > > >> > > > > > >: > >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > > Alex, > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > I think you can create a new IEP page and I > >>> will > >>> > > fill > >>> > > >> it > >>> > > >> > > with > >>> > > >> > > > > the > >>> > > >> > > > > > > > > Cluster > >>> > > >> > > > > > > > > > > API details. > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > In short, I’ve introduced several new codes: > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > Cluster API is pretty straightforward: > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > OP_CLUSTER_IS_ACTIVE = 5000 > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > OP_CLUSTER_CHANGE_STATE = 5001 > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > OP_CLUSTER_CHANGE_WAL_STATE = 5002 > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > OP_CLUSTER_GET_WAL_STATE = 5003 > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > Cluster group codes: > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > OP_CLUSTER_GROUP_GET_NODE_IDS = 5100 > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > OP_CLUSTER_GROUP_GET_NODE_INFO = 5101 > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > The underlying implementation is based on > the > >>> > thick > >>> > > >> > client > >>> > > >> > > > > logic. 
> >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > For every request, we provide a known > topology > >>> > > version > >>> > > >> > and > >>> > > >> > > if > >>> > > >> > > > > it > >>> > > >> > > > > > > has > >>> > > >> > > > > > > > > > > changed, > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > a client updates it firstly and then > re-sends > >>> the > >>> > > >> > filtering > >>> > > >> > > > > > > request. > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > Alongside the topVer a client sends a > >>> serialized > >>> > > nodes > >>> > > >> > > > > projection > >>> > > >> > > > > > > > > object > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > that could be considered as a code to value > >>> > mapping. > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > Consider: [{Code = 1, Value= [“DotNet”, > >>> > > >> “MyAttribute”}, > >>> > > >> > > > > {Code=2, > >>> > > >> > > > > > > > > > Value=1}] > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > Where “1” stands for Attribute filtering and > >>> “2” – > >>> > > >> > > > > > serverNodesOnly > >>> > > >> > > > > > > > > flag. > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > As a result of request processing, a server > >>> sends > >>> > > >> nodeId > >>> > > >> > > > UUIDs > >>> > > >> > > > > > and > >>> > > >> > > > > > > a > >>> > > >> > > > > > > > > > > current topVer. 
> >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > When a client obtains nodeIds, it can > perform > >>> a > >>> > > >> NODE_INFO > >>> > > >> > > > call > >>> > > >> > > > > to > >>> > > >> > > > > > > > get a > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > serialized ClusterNode object. In addition > >>> there > >>> > > >> should > >>> > > >> > be > >>> > > >> > > a > >>> > > >> > > > > > > > different > >>> > > >> > > > > > > > > > API > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > method for accessing/updating node metrics. > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > чт, 21 нояб. 2019 г. в 12:32, Sergey Kozlov > < > >>> > > >> > > > > > [hidden email] > >>> > > >> > > > > > > >: > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > > Hi Pavel > >>> > > >> > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > On Thu, Nov 21, 2019 at 11:30 AM Pavel > >>> Tupitsyn > >>> > < > >>> > > >> > > > > > > > > [hidden email]> > >>> > > >> > > > > > > > > > > > wrote: > >>> > > >> > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > 1. I believe that Cluster operations for > >>> Thin > >>> > > >> Client > >>> > > >> > > > > protocol > >>> > > >> > > > > > > are > >>> > > >> > > > > > > > > > > already > >>> > > >> > > > > > > > > > > > > in the works > >>> > > >> > > > > > > > > > > > > by Alexandr Shapkin. Can't find the > ticket > >>> > > though. > >>> > > >> > > > > > > > > > > > > Alexandr, can you please confirm and > >>> attach > >>> > the > >>> > > >> > ticket > >>> > > >> > > > > > number? > >>> > > >> > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > 2. Proposed changes will work only for > >>> Java > >>> > > tasks > >>> > > >> > that > >>> > > >> > > > are > >>> > > >> > > > > > > > already > >>> > > >> > > > > > > > > > > > deployed > >>> > > >> > > > > > > > > > > > > on server nodes. 
> >>> > > >> > > > > > > > > > > > > This is mostly useless for other thin > >>> clients > >>> > we > >>> > > >> have > >>> > > >> > > > > > (Python, > >>> > > >> > > > > > > > PHP, > >>> > > >> > > > > > > > > > > .NET, > >>> > > >> > > > > > > > > > > > > C++). > >>> > > >> > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > I don't guess so. The task (execution) is > a > >>> way > >>> > to > >>> > > >> > > > implement > >>> > > >> > > > > > own > >>> > > >> > > > > > > > > layer > >>> > > >> > > > > > > > > > > for > >>> > > >> > > > > > > > > > > > the thin client application. > >>> > > >> > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > We should think of a way to make this > >>> useful > >>> > for > >>> > > >> all > >>> > > >> > > > > clients. > >>> > > >> > > > > > > > > > > > > For example, we may allow sending tasks > in > >>> > some > >>> > > >> > > scripting > >>> > > >> > > > > > > > language > >>> > > >> > > > > > > > > > like > >>> > > >> > > > > > > > > > > > > Javascript. > >>> > > >> > > > > > > > > > > > > Thoughts? > >>> > > >> > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > The arbitrary code execution from a remote > >>> > client > >>> > > >> must > >>> > > >> > be > >>> > > >> > > > > > > protected > >>> > > >> > > > > > > > > > > > from malicious code. > >>> > > >> > > > > > > > > > > > I don't know how it could be designed but > >>> > without > >>> > > >> that > >>> > > >> > we > >>> > > >> > > > > open > >>> > > >> > > > > > > the > >>> > > >> > > > > > > > > hole > >>> > > >> > > > > > > > > > > to > >>> > > >> > > > > > > > > > > > kill cluster. 
> >>> > > >> > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > On Thu, Nov 21, 2019 at 11:21 AM Sergey > >>> > Kozlov < > >>> > > >> > > > > > > > > [hidden email] > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > wrote: > >>> > > >> > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > Hi Alex > >>> > > >> > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > The idea is great. But I have some > >>> concerns > >>> > > that > >>> > > >> > > > probably > >>> > > >> > > > > > > > should > >>> > > >> > > > > > > > > be > >>> > > >> > > > > > > > > > > > taken > >>> > > >> > > > > > > > > > > > > > into account for design: > >>> > > >> > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > 1. We need to have the ability to > >>> stop a > >>> > > task > >>> > > >> > > > > execution, > >>> > > >> > > > > > > > smth > >>> > > >> > > > > > > > > > like > >>> > > >> > > > > > > > > > > > > > OP_COMPUTE_CANCEL_TASK operation > >>> (client > >>> > > to > >>> > > >> > > server) > >>> > > >> > > > > > > > > > > > > > 2. What's about task execution > >>> timeout? > >>> > It > >>> > > >> may > >>> > > >> > > help > >>> > > >> > > > to > >>> > > >> > > > > > the > >>> > > >> > > > > > > > > > cluster > >>> > > >> > > > > > > > > > > > > > survival for buggy tasks > >>> > > >> > > > > > > > > > > > > > 3. Ignite doesn't have > >>> > roles/authorization > >>> > > >> > > > > functionality > >>> > > >> > > > > > > for > >>> > > >> > > > > > > > > > now. > >>> > > >> > > > > > > > > > > > But > >>> > > >> > > > > > > > > > > > > a > >>> > > >> > > > > > > > > > > > > > task is the risky operation for > >>> cluster > >>> > > (for > >>> > > >> > > > security > >>> > > >> > > > > > > > > reasons). 
> >>> > > >> > > > > > > > > > > > Could > >>> > > >> > > > > > > > > > > > > we > >>> > > >> > > > > > > > > > > > > > add for Ignite configuration new > >>> options: > >>> > > >> > > > > > > > > > > > > > - Explicit turning on for > compute > >>> task > >>> > > >> > support > >>> > > >> > > > for > >>> > > >> > > > > > thin > >>> > > >> > > > > > > > > > > protocol > >>> > > >> > > > > > > > > > > > > > (disabled by default) for whole > >>> > cluster > >>> > > >> > > > > > > > > > > > > > - Explicit turning on for > compute > >>> task > >>> > > >> > support > >>> > > >> > > > for > >>> > > >> > > > > a > >>> > > >> > > > > > > node > >>> > > >> > > > > > > > > > > > > > - The list of task names > (classes) > >>> > > >> allowed to > >>> > > >> > > > > execute > >>> > > >> > > > > > > by > >>> > > >> > > > > > > > > thin > >>> > > >> > > > > > > > > > > > > client. > >>> > > >> > > > > > > > > > > > > > 4. Support the labeling for task > >>> that may > >>> > > >> help > >>> > > >> > to > >>> > > >> > > > > > > > investigate > >>> > > >> > > > > > > > > > > issues > >>> > > >> > > > > > > > > > > > > on > >>> > > >> > > > > > > > > > > > > > cluster (the idea from IEP-34 [1]) > >>> > > >> > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > 1. 
> >>> > > >> > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > >>> > > >> > > > > > > > > >>> > > >> > > > > > > > >>> > > >> > > > > > > >>> > > >> > > > > > >>> > > >> > > > > >>> > > >> > > > >>> > > >> > > >>> > > >> > >>> > > > >>> > > >>> > https://cwiki.apache.org/confluence/display/IGNITE/IEP-34+Thin+client%3A+transactions+support > >>> > > >> > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > On Thu, Nov 21, 2019 at 10:58 AM Alex > >>> > > Plehanov < > >>> > > >> > > > > > > > > > > > [hidden email]> > >>> > > >> > > > > > > > > > > > > > wrote: > >>> > > >> > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > Hello, Igniters! > >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > I have plans to start implementation > >>> of > >>> > > >> Compute > >>> > > >> > > > > interface > >>> > > >> > > > > > > for > >>> > > >> > > > > > > > > > > Ignite > >>> > > >> > > > > > > > > > > > > thin > >>> > > >> > > > > > > > > > > > > > > client and want to discuss features > >>> that > >>> > > >> should > >>> > > >> > be > >>> > > >> > > > > > > > implemented. 
> >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > We already have Compute > >>> implementation for > >>> > > >> > > > binary-rest > >>> > > >> > > > > > > > clients > >>> > > >> > > > > > > > > > > > > > > (GridClientCompute), which have the > >>> > > following > >>> > > >> > > > > > > functionality: > >>> > > >> > > > > > > > > > > > > > > - Filtering cluster nodes > >>> (projection) for > >>> > > >> > compute > >>> > > >> > > > > > > > > > > > > > > - Executing task by the name > >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > I think we can implement this > >>> > functionality > >>> > > >> in a > >>> > > >> > > thin > >>> > > >> > > > > > > client > >>> > > >> > > > > > > > as > >>> > > >> > > > > > > > > > > well. > >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > First of all, we need some operation > >>> types > >>> > > to > >>> > > >> > > > request a > >>> > > >> > > > > > > list > >>> > > >> > > > > > > > of > >>> > > >> > > > > > > > > > all > >>> > > >> > > > > > > > > > > > > > > available nodes and probably node > >>> > attributes > >>> > > >> (by > >>> > > >> > a > >>> > > >> > > > list > >>> > > >> > > > > > of > >>> > > >> > > > > > > > > > nodes). > >>> > > >> > > > > > > > > > > > Node > >>> > > >> > > > > > > > > > > > > > > attributes will be helpful if we > will > >>> > decide > >>> > > >> to > >>> > > >> > > > > implement > >>> > > >> > > > > > > > > analog > >>> > > >> > > > > > > > > > of > >>> > > >> > > > > > > > > > > > > > > ClusterGroup#forAttribute or > >>> > > >> > > > ClusterGroup#forePredicate > >>> > > >> > > > > > > > methods > >>> > > >> > > > > > > > > > in > >>> > > >> > > > > > > > > > > > the > >>> > > >> > > > > > > > > > > > > > thin > >>> > > >> > > > > > > > > > > > > > > client. Perhaps they can be > requested > >>> > > lazily. 
> >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > From the protocol point of view > there > >>> will > >>> > > be > >>> > > >> two > >>> > > >> > > new > >>> > > >> > > > > > > > > operations: > >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > OP_CLUSTER_GET_NODES > >>> > > >> > > > > > > > > > > > > > > Request: empty > >>> > > >> > > > > > > > > > > > > > > Response: long topologyVersion, int > >>> > > >> > > > > minorTopologyVersion, > >>> > > >> > > > > > > int > >>> > > >> > > > > > > > > > > > > nodesCount, > >>> > > >> > > > > > > > > > > > > > > for each node set of node fields > (UUID > >>> > > nodeId, > >>> > > >> > > Object > >>> > > >> > > > > or > >>> > > >> > > > > > > > String > >>> > > >> > > > > > > > > > > > > > > consistentId, long order, etc) > >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > OP_CLUSTER_GET_NODE_ATTRIBUTES > >>> > > >> > > > > > > > > > > > > > > Request: int nodesCount, for each > >>> node: > >>> > UUID > >>> > > >> > nodeId > >>> > > >> > > > > > > > > > > > > > > Response: int nodesCount, for each > >>> node: > >>> > int > >>> > > >> > > > > > > attributesCount, > >>> > > >> > > > > > > > > for > >>> > > >> > > > > > > > > > > > each > >>> > > >> > > > > > > > > > > > > > node > >>> > > >> > > > > > > > > > > > > > > attribute: String name, Object value > >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > To execute tasks we need something > >>> like > >>> > > these > >>> > > >> > > methods > >>> > > >> > > > > in > >>> > > >> > > > > > > the > >>> > > >> > > > > > > > > > client > >>> > > >> > > > > > > > > > > > > API: > >>> > > >> > > > > > > > > > > > > > > Object execute(String task, Object > >>> arg) > >>> > > >> > > > > > > > > > > > > > > Future<Object> executeAsync(String > >>> task, > >>> > > >> Object > >>> > > >> > > arg) > >>> > > >> > > > > > > > > > > > > > > Object affinityExecute(String 
task, > >>> String > >>> > > >> cache, > >>> > > >> > > > > Object > >>> > > >> > > > > > > key, > >>> > > >> > > > > > > > > > > Object > >>> > > >> > > > > > > > > > > > > arg) > >>> > > >> > > > > > > > > > > > > > > Future<Object> > >>> affinityExecuteAsync(String > >>> > > >> task, > >>> > > >> > > > String > >>> > > >> > > > > > > > cache, > >>> > > >> > > > > > > > > > > Object > >>> > > >> > > > > > > > > > > > > > key, > >>> > > >> > > > > > > > > > > > > > > Object arg) > >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > Which can be mapped to protocol > >>> > operations: > >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > OP_COMPUTE_EXECUTE_TASK > >>> > > >> > > > > > > > > > > > > > > Request: UUID nodeId, String > taskName, > >>> > > Object > >>> > > >> arg > >>> > > >> > > > > > > > > > > > > > > Response: Object result > >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > OP_COMPUTE_EXECUTE_TASK_AFFINITY > >>> > > >> > > > > > > > > > > > > > > Request: String cacheName, Object > key, > >>> > > String > >>> > > >> > > > taskName, > >>> > > >> > > > > > > > Object > >>> > > >> > > > > > > > > > arg > >>> > > >> > > > > > > > > > > > > > > Response: Object result > >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > The second operation is needed > >>> because we > >>> > > >> > sometimes > >>> > > >> > > > > can't > >>> > > >> > > > > > > > > > calculate > >>> > > >> > > > > > > > > > > > and > >>> > > >> > > > > > > > > > > > > > > connect to affinity node on the > >>> > client-side > >>> > > >> > > (affinity > >>> > > >> > > > > > > > awareness > >>> > > >> > > > > > > > > > can > >>> > > >> > > > > > > > > > > > be > >>> > > >> > > > > > > > > > > > > > > disabled, custom affinity function > >>> can be > >>> > > >> used or > >>> > > >> > > > there > >>> > > >> > > > > > can > >>> > > >> > > > > > > > be > >>> > > >> > > > > > > > > no 
> >>> > > >> > > > > > > > > > > > > > > connection between client and > affinity > >>> > > node), > >>> > > >> but > >>> > > >> > > we > >>> > > >> > > > > can > >>> > > >> > > > > > > make > >>> > > >> > > > > > > > > > best > >>> > > >> > > > > > > > > > > > > effort > >>> > > >> > > > > > > > > > > > > > > to send request to target node if > >>> affinity > >>> > > >> > > awareness > >>> > > >> > > > is > >>> > > >> > > > > > > > > enabled. > >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > Currently, on the server-side > requests > >>> > > always > >>> > > >> > > > processed > >>> > > >> > > > > > > > > > > synchronously > >>> > > >> > > > > > > > > > > > > and > >>> > > >> > > > > > > > > > > > > > > responses are sent right after > >>> request was > >>> > > >> > > processed. > >>> > > >> > > > > To > >>> > > >> > > > > > > > > execute > >>> > > >> > > > > > > > > > > long > >>> > > >> > > > > > > > > > > > > > tasks > >>> > > >> > > > > > > > > > > > > > > async we should whether change this > >>> logic > >>> > or > >>> > > >> > > > introduce > >>> > > >> > > > > > some > >>> > > >> > > > > > > > > kind > >>> > > >> > > > > > > > > > > > > two-way > >>> > > >> > > > > > > > > > > > > > > communication between client and > >>> server > >>> > (now > >>> > > >> only > >>> > > >> > > > > one-way > >>> > > >> > > > > > > > > > requests > >>> > > >> > > > > > > > > > > > from > >>> > > >> > > > > > > > > > > > > > > client to server are allowed). > >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > Two-way communication can also be > >>> useful > >>> > in > >>> > > >> the > >>> > > >> > > > future > >>> > > >> > > > > if > >>> > > >> > > > > > > we > >>> > > >> > > > > > > > > will > >>> > > >> > > > > > > > > > > > send > >>> > > >> > > > > > > > > > > > > > some > >>> > > >> > > > > > > > > > > > > > > server-side generated events to > >>> clients. 
> >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > In case of two-way communication > >>> there can > >>> > > be > >>> > > >> new > >>> > > >> > > > > > > operations > >>> > > >> > > > > > > > > > > > > introduced: > >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > OP_COMPUTE_EXECUTE_TASK (from client > >>> to > >>> > > >> server) > >>> > > >> > > > > > > > > > > > > > > Request: UUID nodeId, String > taskName, > >>> > > Object > >>> > > >> arg > >>> > > >> > > > > > > > > > > > > > > Response: long taskId > >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > OP_COMPUTE_TASK_FINISHED (from > server > >>> to > >>> > > >> client) > >>> > > >> > > > > > > > > > > > > > > Request: taskId, Object result > >>> > > >> > > > > > > > > > > > > > > Response: empty > >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > The same for affinity requests. > >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > Also, we can implement not only > >>> execute > >>> > task > >>> > > >> > > > operation, > >>> > > >> > > > > > but > >>> > > >> > > > > > > > > some > >>> > > >> > > > > > > > > > > > other > >>> > > >> > > > > > > > > > > > > > > operations from IgniteCompute > >>> (broadcast, > >>> > > run, > >>> > > >> > > call), > >>> > > >> > > > > but > >>> > > >> > > > > > > it > >>> > > >> > > > > > > > > will > >>> > > >> > > > > > > > > > > be > >>> > > >> > > > > > > > > > > > > > useful > >>> > > >> > > > > > > > > > > > > > > only for java thin client. 
And even > >>> with > >>> > > java > >>> > > >> > thin > >>> > > >> > > > > client > >>> > > >> > > > > > > we > >>> > > >> > > > > > > > > > should > >>> > > >> > > > > > > > > > > > > > whether > >>> > > >> > > > > > > > > > > > > > > implement peer-class-loading for > thin > >>> > > clients > >>> > > >> > (this > >>> > > >> > > > > also > >>> > > >> > > > > > > > > requires > >>> > > >> > > > > > > > > > > > > two-way > >>> > > >> > > > > > > > > > > > > > > client-server communication) or put > >>> > classes > >>> > > >> with > >>> > > >> > > > > executed > >>> > > >> > > > > > > > > > closures > >>> > > >> > > > > > > > > > > to > >>> > > >> > > > > > > > > > > > > the > >>> > > >> > > > > > > > > > > > > > > server locally. > >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > What do you think about proposed > >>> protocol > >>> > > >> > changes? > >>> > > >> > > > > > > > > > > > > > > Do we need two-way requests between > >>> client > >>> > > and > >>> > > >> > > > server? > >>> > > >> > > > > > > > > > > > > > > Do we need support of compute > methods > >>> > other > >>> > > >> than > >>> > > >> > > > > "execute > >>> > > >> > > > > > > > > task"? > >>> > > >> > > > > > > > > > > > > > > What do you think about > >>> peer-class-loading > >>> > > for > >>> > > >> > thin > >>> > > >> > > > > > > clients? 
> >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > -- > >>> > > >> > > > > > > > > > > > > > Sergey Kozlov > >>> > > >> > > > > > > > > > > > > > GridGain Systems > >>> > > >> > > > > > > > > > > > > > www.gridgain.com > >>> > > >> > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > -- > >>> > > >> > > > > > > > > > > > Sergey Kozlov > >>> > > >> > > > > > > > > > > > GridGain Systems > >>> > > >> > > > > > > > > > > > www.gridgain.com > >>> > > >> > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > -- > >>> > > >> > > > > > > > > > > Alex. > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > >>> > > >> > > > > > > > > >>> > > >> > > > > > > > >>> > > >> > > > > > > >>> > > >> > > > > > >>> > > >> > > > > >>> > > >> > > > >>> > > >> > > >>> > > >> > >>> > > > > >>> > > > >>> > > >>> > >> > |
Pavel,
1. Actually, it can be solved on the client side (and is already solved in the PoC implementation). But I agree that it brings extra complexity to the client-side implementation, so I will try to provide such guarantees on the server side.
2. ComputeTask also has a "reduce" step, which is executed on the initiator node. The binary-rest client implementation, for example, has such affinity methods (to execute the task by name). I'm OK with removing it. At least, if someone needs it, we can implement it again at any time in the future without a protocol change.

I've fixed the IEP.

Denis,

A deployment API is definitely needed as one of the next steps. Currently, we are talking only about the first step (execution of already deployed tasks). Also, I'm not sure about automatic redeploy and peer-class-loading for thin clients; I think it's better to have more control here and provide an API to explicitly deploy classes or jar files. WDYT?

Wed, Mar 25, 2020 at 21:17, Denis Magda <[hidden email]>:

> Alex, thanks for preparing the outline.
>
> I'd like us to discuss an approach for updating compute tasks with no
> downtime on the servers' end. For instance, let's assume that a
> Python/C++/Node.JS developer requested to update a compute task he called
> from the app. Should we introduce some system-level API to the binary
> protocol that can take a jar file (or class) and redeploy it automatically
> with the usage of peer-class-loading?
>
> -
> Denis
>
>
> On Wed, Mar 25, 2020 at 5:47 AM Alex Plehanov <[hidden email]>
> wrote:
>
> > Hello guys.
> >
> > I've implemented a PoC and created IEP [1] for thin client compute grid
> > functionality. Please have a look.
> >
> > [1]:
> > https://cwiki.apache.org/confluence/display/IGNITE/IEP-42+Thin+client%3A+compute+support
> >
> > Fri, Jan 24, 2020 at 16:56, Alex Plehanov <[hidden email]>:
> >
> > > We've discussed the thin client compute protocol with Pavel Tupitsyn and
> > > Igor Sapego and come to the conclusion that the approach with two-way
> > > requests should be used: the client generates a taskId and sends a request
> > > to the server to execute a task. The server responds that the request has
> > > been accepted. After the task has finished, the server notifies the client
> > > (sends a request without waiting for a response). The client can cancel
> > > the task by sending a corresponding request to the server.
> > >
> > > Also, a node list should be passed (optionally) with a request to limit
> > > the nodes that execute the task.
> > >
> > > I will create an IEP and file detailed protocol changes shortly.
> > >
> > > Tue, Jan 21, 2020 at 18:46, Alex Plehanov <[hidden email]>:
> > >
> > >> Igor, thanks for the reply.
> > >>
> > >> > Approach with taskId will require a lot of changes in protocol and thus
> > >> > more "heavy" for implementation
> > >> Do you mean the approach with the server notifications mechanism? Yes, it
> > >> will require a lot of changes. But in the most recent messages we've
> > >> discussed with Pavel an approach without the server notifications
> > >> mechanism. This approach has the same complexity and performance as the
> > >> approach with requestId.
> > >>
> > >> > But such clients as Python, Node.js, PHP, Go most probably won't have
> > >> > support for this API, at least for now.
> > >> Without a server notifications mechanism, there will be no breaking
> > >> changes in the protocol, so a client implementation can just skip this
> > >> feature and protocol version and implement the next one.
> > >>
> > >> > Or never.
> > >> I think it is still useful to execute Java compute tasks from non-Java
> > >> thin clients. Also, we can provide some out-of-the-box Java tasks, for
> > >> example ExecutePythonScriptTask with a Python compute implementation,
> > >> which can run a Python script on a server node.
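
A minimal sketch of the two-way flow agreed with Pavel Tupitsyn and Igor Sapego, as quoted above: the client generates the taskId, the server answers only that the request was accepted, the server later notifies the client about completion, and the client may cancel. Everything here is an assumption made for illustration: the names SketchServer, SketchClient, execute, cancel and complete are hypothetical, and the exchange is modelled in memory rather than over the real thin client wire format.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.BiConsumer;

/** Hypothetical in-memory stand-in for the server side of the exchange. */
class SketchServer {
    /** Tasks that were accepted and have neither finished nor been cancelled. */
    private final Map<Long, String> running = new ConcurrentHashMap<>();

    /** Models the server-to-client "task finished" notification. */
    private final BiConsumer<Long, Object> notifyClient;

    SketchServer(BiConsumer<Long, Object> notifyClient) {
        this.notifyClient = notifyClient;
    }

    /** Execute request: the immediate response only says whether the task was accepted. */
    boolean execute(long taskId, String taskName) {
        return running.putIfAbsent(taskId, taskName) == null;
    }

    /** Cancel request: returns true if the task was still running. */
    boolean cancel(long taskId) {
        return running.remove(taskId) != null;
    }

    /** Simulates task completion; the notification does not wait for a reply. */
    void complete(long taskId, Object result) {
        if (running.remove(taskId) != null)
            notifyClient.accept(taskId, result);
    }
}

/** Hypothetical client: owns task id generation and collects notifications. */
class SketchClient {
    private final AtomicLong idGen = new AtomicLong();
    final Map<Long, Object> results = new ConcurrentHashMap<>();
    final SketchServer server = new SketchServer(results::put);

    /** Generates the task id on the client side and submits the task. */
    long submit(String taskName) {
        long taskId = idGen.incrementAndGet();
        if (!server.execute(taskId, taskName))
            throw new IllegalStateException("Task was not accepted: " + taskId);
        return taskId;
    }
}

class TwoWayFlowDemo {
    public static void main(String[] args) {
        SketchClient client = new SketchClient();

        long first = client.submit("MyTask");
        client.server.complete(first, "done");      // client receives a notification

        long second = client.submit("MyTask");
        boolean cancelled = client.server.cancel(second);
        client.server.complete(second, "ignored");  // no notification after cancel

        System.out.println(client.results + " cancelled=" + cancelled);
    }
}
```

Because the immediate response carries only the "accepted" status, the client learns right away whether the task was started, while the result itself arrives later and independently of any request the client sends.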
> > >>
> > >> > So, maybe it's a good time for us to change our backward compatibility mechanism from protocol versioning to feature masks?
> > >> I like the idea of feature masks, but it will force us to support both backward compatibility mechanisms: protocol versioning and feature masks.
> > >>
> > >> On Mon, Jan 20, 2020 at 20:34, Pavel Tupitsyn <[hidden email]> wrote:
> > >>
> > >>> Huge +1 from me for Feature Masks.
> > >>> I think this should be our top priority for the thin client protocol, since it simplifies change management a lot.
> > >>>
> > >>> On Mon, Jan 20, 2020 at 8:21 PM Igor Sapego <[hidden email]> wrote:
> > >>>
> > >>> > Sorry for the late reply.
> > >>> >
> > >>> > The approach with taskId will require a lot of changes in the protocol and is thus more "heavy" to implement, but it definitely looks less hacky to me than the reqId approach. Moreover, as was mentioned, a server notifications mechanism will most probably be required in the future anyway. So from this point of view I like the taskId approach.
> > >>> >
> > >>> > On the other hand, what we should also consider here is performance. Speaking of latency, it looks like reqId will have better results in the case of small and fast tasks. The only question here is whether we want to optimize thin clients for this case.
> > >>> >
> > >>> > Also, what you are talking about mostly involves clients on platforms that already have a Compute API for thick clients. Let me mention one more point of view and another concern here.
> > >>> >
> > >>> > The changes you propose are going to change the protocol version for sure. In the case of the taskId approach and server notifications, even more so.
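A feature-mask handshake of the kind proposed in this thread could look roughly like the sketch below. The feature names, the BitSet encoding, and the `negotiate` helper are illustrative assumptions only; none of them are taken from the Ignite thin client protocol.

```java
import java.util.BitSet;

// Hypothetical sketch: each side advertises a bit mask of supported features,
// and only the intersection is used on the connection.
class FeatureMaskSketch {
    // Assumed feature ids for illustration; not real protocol constants.
    enum Feature { CLUSTER_API, COMPUTE_TASKS, SERVER_NOTIFICATIONS }

    // Builds a mask with one bit per listed feature.
    static BitSet maskOf(Feature... features) {
        BitSet mask = new BitSet();
        for (Feature f : features)
            mask.set(f.ordinal());
        return mask;
    }

    // Features usable on a connection: those advertised by both sides.
    static BitSet negotiate(BitSet client, BitSet server) {
        BitSet common = (BitSet) client.clone();
        common.and(server);
        return common;
    }

    static boolean supports(BitSet negotiated, Feature f) {
        return negotiated.get(f.ordinal());
    }

    public static void main(String[] args) {
        BitSet server = maskOf(Feature.CLUSTER_API, Feature.COMPUTE_TASKS, Feature.SERVER_NOTIFICATIONS);
        // A client that implements the cluster API but skips compute entirely.
        BitSet client = maskOf(Feature.CLUSTER_API);
        BitSet common = negotiate(client, server);
        System.out.println(supports(common, Feature.CLUSTER_API));   // true
        System.out.println(supports(common, Feature.COMPUTE_TASKS)); // false
    }
}
```

With a mask like this, a client that never implements compute could still adopt later features, which is the concern raised about linear protocol versions.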
> > >>> >
> > >>> > But such clients as Python, Node.js, PHP, Go most probably won't have support for this API, at least for now. Or never. But the current backward-compatibility mechanism relies on protocol versions, where we imply that a client that supports version 1.5 also supports all the features introduced in all the previous versions of the protocol.
> > >>> >
> > >>> > Thus implementing Compute API in any of the proposed ways *may* force the mentioned clients to support changes in the protocol which they don't necessarily need in order to introduce new features in the future.
> > >>> >
> > >>> > So, maybe it's a good time for us to change our backward compatibility mechanism from protocol versioning to feature masks?
> > >>> >
> > >>> > WDYT?
> > >>> >
> > >>> > Best Regards,
> > >>> > Igor
> > >>> >
> > >>> >
> > >>> > On Fri, Jan 17, 2020 at 9:37 AM Alex Plehanov <[hidden email]> wrote:
> > >>> >
> > >>> > > Looks like we didn't reach consensus here.
> > >>> > >
> > >>> > > Igor, as thin client maintainer, can you please share your opinion?
> > >>> > >
> > >>> > > Everyone else is also welcome; please share your thoughts about options to implement operations for compute.
> > >>> > >
> > >>> > >
> > >>> > > On Thu, Nov 28, 2019 at 10:02, Alex Plehanov <[hidden email]> wrote:
> > >>> > >
> > >>> > > > > Since all thin client operations are inherently async, we should be able to cancel any of them
> > >>> > > > It's illogical to have such an ability. What should a cancel operation of a cancel operation do? Moreover, sometimes it's dangerous; for example, a create cache operation should never be canceled. There should be an explicit set of processes that we can cancel: queries, transactions, tasks, services.
> > >>> > > > The lifecycle of services is more complex than the lifecycle of tasks. With services, I suppose, we can't use request cancellation, so tasks will be the only process with an exceptional pattern.
> > >>> > > >
> > >>> > > > > The request would be "execute task with specified node filter" - simple and efficient.
> > >>> > > > It's not simple: every compute or service request should contain complex node filtering logic, which duplicates the same logic for the cluster API.
> > >>> > > > It's not efficient: for example, we can't implement forPredicate() filtering in this case.
> > >>> > > >
> > >>> > > >
> > >>> > > > On Wed, Nov 27, 2019 at 19:25, Pavel Tupitsyn <[hidden email]> wrote:
> > >>> > > >
> > >>> > > >> > The request is already processed (task is started), we can't cancel the request
> > >>> > > >> The request is not "start a task". It is "execute task" (and get result). Same as "cache get" - you get a result in the end, we don't "start cache get" then "end cache get".
> > >>> > > >>
> > >>> > > >> Since all thin client operations are inherently async, we should be able to cancel any of them by sending another request with an id of the prior request to be cancelled. That's why I'm advocating for this approach - it will work for anything, no special cases. And it keeps the "happy path" as simple as it is right now.
> > >>> > > >>
> > >>> > > >> Queries are different because we retrieve results in pages, we can't do them as one request.
> > >>> > > >> Transactions are also different because the client controls when they should end.
> > >>> > > >> There is no reason for task execution to be a special case like queries or transactions.
> > >>> > > >>
> > >>> > > >> > we always need to send 2 requests to server to execute the task
> > >>> > > >> Nope. We don't need to get nodes on the client at all.
> > >>> > > >> The request would be "execute task with specified node filter" - simple and efficient.
> > >>> > > >>
> > >>> > > >>
> > >>> > > >> On Wed, Nov 27, 2019 at 4:31 PM Alex Plehanov <[hidden email]> wrote:
> > >>> > > >>
> > >>> > > >> > > We do cancel a request to perform a task. We may and should use this to cancel any other request in future.
> > >>> > > >> > The request is already processed (the task is started), so we can't cancel the request. As you mentioned before, we already do almost the same for queries (close the cursor, but not cancel the request to run a query), so it's better to do such things in a common way. We have a pattern: start some process (query, transaction), get the id of this process, end the process by this id. The "execute task" process should match the same pattern. In my opinion, an implementation with two-way requests is the best option to match this pattern (we can even reuse the OP_RESOURCE_CLOSE operation type in this case). Sometime in the future, we will need two-way requests for some other functionality (continuous queries, event listening, etc).
> > >>> > > >> > But even without two-way requests, introducing some process id (task id in our case) will be closer to the existing pattern than canceling tasks by request id.
> > >>> > > >> >
> > >>> > > >> > > So every new request will apply those filters on server side, using the most recent set of nodes.
> > >>> > > >> > In this case, we always need to send 2 requests to the server to execute the task: first to get nodes by the filter, second to actually execute the task. It seems like overhead. The same will be true for services. The cluster group remains the same if the topology hasn't changed. We can use this fact and bind the "execute task" request to the topology. If the topology has changed, get nodes for the new topology and retry the request.
> > >>> > > >> >
> > >>> > > >> > On Tue, Nov 26, 2019 at 17:44, Pavel Tupitsyn <[hidden email]> wrote:
> > >>> > > >> >
> > >>> > > >> > > > After all, we don't cancel request
> > >>> > > >> > > We do cancel a request to perform a task. We may and should use this to cancel any other request in future.
> > >>> > > >> > >
> > >>> > > >> > > > Client uses some cluster group filtration (for example forServers() cluster group)
> > >>> > > >> > > Please see above - Aleksandr Shapkin described how we store filtered cluster groups on client.
> > >>> > > >> > > We don't store node IDs, we store actual filters. So every new request will apply those filters on server side, using the most recent set of nodes.
> > >>> > > >> > >
> > >>> > > >> > > var myGrp = cluster.forServers().forAttribute("foo"); // This does not issue any server requests, just builds an object with filters on client
> > >>> > > >> > > while (true) myGrp.compute().executeTask("bar"); // Every request includes filters, and filters are applied on the server side
> > >>> > > >> > >
> > >>> > > >> > > On Tue, Nov 26, 2019 at 1:42 PM Alex Plehanov <[hidden email]> wrote:
> > >>> > > >> > >
> > >>> > > >> > > > > Anyway, my point stands.
> > >>> > > >> > > > I can't agree. Why don't you want to use a task id for this? After all, we don't cancel the request (the request is already processed), we cancel the task. So it's more convenient to use a task id here.
> > >>> > > >> > > >
> > >>> > > >> > > > > Can you please provide equivalent use case with existing "thick" client?
> > >>> > > >> > > > For example:
> > >>> > > >> > > > Cluster consists of one server node.
> > >>> > > >> > > > Client uses some cluster group filtration (for example forServers() cluster group).
> > >>> > > >> > > > Client starts to periodically send (for example, one per minute) long-term (for example, one hour long) tasks to the cluster.
> > >>> > > >> > > > Meanwhile, several server nodes joined the cluster.
> > >>> > > >> > > >
> > >>> > > >> > > > In case of thick client: All server nodes will be used, tasks will be load balanced.
> > >>> > > >> > > > In case of thin client: Only one server node will be used, client will detect topology change after an hour.
> > >>> > > >> > > >
> > >>> > > >> > > >
> > >>> > > >> > > > On Tue, Nov 26, 2019 at 11:50, Pavel Tupitsyn <[hidden email]> wrote:
> > >>> > > >> > > >
> > >>> > > >> > > > > > I can't see any usage of request id in query cursors
> > >>> > > >> > > > > You are right, cursor id is a separate thing.
> > >>> > > >> > > > > Anyway, my point stands.
> > >>> > > >> > > > >
> > >>> > > >> > > > > > client sends long term tasks to nodes and wants to do it with load balancing
> > >>> > > >> > > > > I still don't get it. Can you please provide equivalent use case with existing "thick" client?
> > >>> > > >> > > > >
> > >>> > > >> > > > >
> > >>> > > >> > > > > On Mon, Nov 25, 2019 at 11:59 PM Alex Plehanov <[hidden email]> wrote:
> > >>> > > >> > > > >
> > >>> > > >> > > > > > > And it is fine to use request ID to identify compute tasks (as we do with query cursors).
> > >>> > > >> > > > > > I can't see any usage of request id in query cursors. We send a query request and get a cursor id in response. After that, we only use the cursor id (to get next pages and to close the resource). Did I miss something?
> > >>> > > >> > > > > >
> > >>> > > >> > > > > > > Looks like I'm missing something - how is topology change relevant to executing compute tasks from client?
> > >>> > > >> > > > > > It's not relevant directly. But there are some cases where it will be helpful. For example, if the client sends long-term tasks to nodes and wants to do it with load balancing, it will detect the topology change only after some time in the future with the first response, so load balancing will not work. Perhaps we can add an optional "topology version" field to the OP_COMPUTE_EXECUTE_TASK request to solve this problem.
> > >>> > > >> > > > > >
> > >>> > > >> > > > > >
> > >>> > > >> > > > > > On Mon, Nov 25, 2019 at 22:42, Pavel Tupitsyn <[hidden email]> wrote:
> > >>> > > >> > > > > >
> > >>> > > >> > > > > > > Alex,
> > >>> > > >> > > > > > >
> > >>> > > >> > > > > > > > we will mix entities from different layers (transport layer and request body)
> > >>> > > >> > > > > > > I would not call our message header (which includes the id) "transport layer".
> > >>> > > >> > > > > > > TCP is our transport layer. And it is fine to use request ID to identify compute tasks (as we do with query cursors).
> > >>> > > >> > > > > > >
> > >>> > > >> > > > > > > > we still can't be sure that the task is successfully started on a server
> > >>> > > >> > > > > > > The request to start the task will fail and we'll get a response indicating that right away
> > >>> > > >> > > > > > >
> > >>> > > >> > > > > > > > we won't ever know about topology change
> > >>> > > >> > > > > > > Looks like I'm missing something - how is topology change relevant to executing compute tasks from client?
> > >>> > > >> > > > > > >
> > >>> > > >> > > > > > > On Mon, Nov 25, 2019 at 10:17 PM Alex Plehanov <[hidden email]> wrote:
> > >>> > > >> > > > > > >
> > >>> > > >> > > > > > > > Pavel, in this case, we will mix entities from different layers (transport layer and request body), which is not very good. We can achieve the same behavior with a task id generated on the client side, but there will be no inter-layer data intersection, and I think it will be easier to implement on both the client and server side. But we still can't be sure that the task is successfully started on a server.
> > >>> > > >> > > > > > > > We won't ever know about a topology change, because the topology changed flag will be sent from server to client only with a response when the task is completed. Do we accept that?
> > >>> > > >> > > > > > > >
> > >>> > > >> > > > > > > > On Mon, Nov 25, 2019 at 19:07, Pavel Tupitsyn <[hidden email]> wrote:
> > >>> > > >> > > > > > > >
> > >>> > > >> > > > > > > > > Alex,
> > >>> > > >> > > > > > > > >
> > >>> > > >> > > > > > > > > I have a simpler idea. We already do request id handling in the protocol, so:
> > >>> > > >> > > > > > > > > - Client sends a normal request to execute compute task. Request ID is generated as usual.
> > >>> > > >> > > > > > > > > - As soon as task is completed, a response is received.
> > >>> > > >> > > > > > > > >
> > >>> > > >> > > > > > > > > As for cancellation - client can send a new request (with new request ID) and (in the body) pass the request ID from above as a task identifier. As a result, there are two responses:
> > >>> > > >> > > > > > > > > - Cancellation response
> > >>> > > >> > > > > > > > > - Task response (with proper cancelled status)
> > >>> > > >> > > > > > > > >
> > >>> > > >> > > > > > > > > That's it, no need to modify the core of the protocol.
> > >>> > > >> > > > > > > > > One request - one response.
> > >>> > > >> > > > > > > > >
> > >>> > > >> > > > > > > > > On Mon, Nov 25, 2019 at 6:20 PM Alex Plehanov <[hidden email]> wrote:
> > >>> > > >> > > > > > > > >
> > >>> > > >> > > > > > > > > > Pavel, we need to inform the client when the task is completed, and we need the ability to cancel the task. I see several ways to implement this:
> > >>> > > >> > > > > > > > > >
> > >>> > > >> > > > > > > > > > 1. Client sends a request to the server to start a task, server returns the task id in response. Server notifies the client when the task is completed with a new request (from server to client). Client can cancel the task by sending a new request with operation type "cancel" and the task id. In this case, we should implement two-way requests.
> > >>> > > >> > > > > > > > > > 2. Client generates a unique task id and sends a request to the server to start a task; the server doesn't reply immediately but waits until the task is completed. Client can cancel the task by sending a new request with operation type "cancel" and the task id. In this case, we should decouple request and response on the server side (currently the response is sent right after the request is processed). Also, we can't be sure that the task is successfully started on a server.
> > >>> > > >> > > > > > > > > > 3. Client sends a request to the server to start a task, server returns the id in response. Client periodically asks the server about the task status. Client can cancel the task by sending a new request with operation type "cancel" and the task id. This case brings some overhead to the communication channel.
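The client-side bookkeeping shared by the task-id options above (a client-generated id, completion delivered later by the server, cancellation by id) can be sketched as follows. This is a minimal in-memory illustration; `ClientTaskRegistry` and its method names are hypothetical, and no network I/O or real Ignite API is involved.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: tracks tasks by a client-generated id until the server
// reports completion or the client cancels the task.
class ClientTaskRegistry {
    private final Map<UUID, CompletableFuture<String>> pending = new ConcurrentHashMap<>();

    // Generates the task id on the client; the caller would then send the "execute" request.
    UUID register() {
        UUID taskId = UUID.randomUUID();
        pending.put(taskId, new CompletableFuture<>());
        return taskId;
    }

    // Returns the future for a still-pending task, or null once it has finished or been cancelled.
    CompletableFuture<String> future(UUID taskId) {
        return pending.get(taskId);
    }

    // Would be invoked when the server's "task finished" message arrives.
    void onTaskFinished(UUID taskId, String result) {
        CompletableFuture<String> f = pending.remove(taskId);
        if (f != null)
            f.complete(result);
    }

    // Would be invoked alongside sending a "cancel" request carrying the task id.
    boolean cancel(UUID taskId) {
        CompletableFuture<String> f = pending.remove(taskId);
        return f != null && f.cancel(false);
    }

    public static void main(String[] args) {
        ClientTaskRegistry reg = new ClientTaskRegistry();

        UUID first = reg.register();
        CompletableFuture<String> firstResult = reg.future(first);
        reg.onTaskFinished(first, "done");
        System.out.println(firstResult.join()); // done

        UUID second = reg.register();
        CompletableFuture<String> secondResult = reg.future(second);
        System.out.println(reg.cancel(second)); // true
        reg.onTaskFinished(second, "late");     // ignored: the task is no longer pending
        System.out.println(secondResult.isCancelled()); // true
    }
}
```

The same map could be keyed by request id instead of a separate task id; the sketch does not take a side on that question.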
> > >>> > > >> > > > > > > > > >
> > >>> > > >> > > > > > > > > > Personally, I think that the case with two-way requests is better, but I'm open to any other ideas.
> > >>> > > >> > > > > > > > > >
> > >>> > > >> > > > > > > > > > Aleksandr,
> > >>> > > >> > > > > > > > > >
> > >>> > > >> > > > > > > > > > The filtering logic for OP_CLUSTER_GROUP_GET_NODE_IDS looks overcomplicated. Do we need server-side filtering at all? Wouldn't it be better to send basic info (ids, order, flags) for all nodes (a relatively small amount of data) and extended info (attributes) for a selected list of nodes? In this case, we can do basic node filtration on the client side (forClients(), forServers(), forNodeIds(), forOthers(), etc).
> > >>> > > >> > > > > > > > > >
> > >>> > > >> > > > > > > > > > Do you use standard ClusterNode serialization? There are also metrics serialized with ClusterNode; do we need them on the thin client?
> > >>> > > >> > > > > > > > > > There are other interfaces to show metrics, so I think it's redundant to export metrics to thin clients too.
> > >>> > > >> > > > > > > > > >
> > >>> > > >> > > > > > > > > > What do you think?
> > >>> > > >> > > > > > > > > >
> > >>> > > >> > > > > > > > > > On Fri, Nov 22, 2019 at 20:15, Aleksandr Shapkin <[hidden email]> wrote:
> > >>> > > >> > > > > > > > > >
> > >>> > > >> > > > > > > > > > > Alex,
> > >>> > > >> > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > I think you can create a new IEP page and I will fill it with the Cluster API details.
> > >>> > > >> > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > In short, I've introduced several new codes:
> > >>> > > >> > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > Cluster API is pretty straightforward:
> > >>> > > >> > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > OP_CLUSTER_IS_ACTIVE = 5000
> > >>> > > >> > > > > > > > > > > OP_CLUSTER_CHANGE_STATE = 5001
> > >>> > > >> > > > > > > > > > > OP_CLUSTER_CHANGE_WAL_STATE = 5002
> > >>> > > >> > > > > > > > > > > OP_CLUSTER_GET_WAL_STATE = 5003
> > >>> > > >> > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > Cluster group codes:
> > >>> > > >> > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > OP_CLUSTER_GROUP_GET_NODE_IDS = 5100
> > >>> > > >> > > > > > > > > > > OP_CLUSTER_GROUP_GET_NODE_INFO = 5101
> > >>> > > >> > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > The underlying implementation is based on the thick client logic.
> > >>> > > >> > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > For every request, we provide a known topology version, and if it has changed, a client first updates it and then re-sends the filtering request.
> > >>> > > >> > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > Alongside the topVer, a client sends a serialized nodes projection object that could be considered as a code-to-value mapping.
> > >>> > > >> > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > Consider: [{Code = 1, Value = ["DotNet", "MyAttribute"]}, {Code = 2, Value = 1}]
> > >>> > > >> > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > Where "1" stands for attribute filtering and "2" for the serverNodesOnly flag.
> > >>> > > >> > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > As a result of request processing, a server sends nodeId UUIDs and a current topVer.
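The code-to-value projection described above could be evaluated against a node list roughly as in the sketch below. Only the two codes (1 for attribute filtering, 2 for the serverNodesOnly flag) come from the message; the `Node` and `Filter` classes, the reading of the attribute value as a {name, expected value} pair, and the treatment of a zero flag as "no restriction" are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Hypothetical sketch of applying a projection (list of code/value filters) to a topology.
class ProjectionSketch {
    static final int ATTRIBUTE = 1;         // code 1: attribute filtering
    static final int SERVER_NODES_ONLY = 2; // code 2: serverNodesOnly flag

    // Minimal stand-in for node info; not the real ClusterNode.
    static final class Node {
        final UUID id;
        final boolean client;
        final Map<String, String> attrs;

        Node(UUID id, boolean client, Map<String, String> attrs) {
            this.id = id;
            this.client = client;
            this.attrs = attrs;
        }
    }

    static final class Filter {
        final int code;
        final Object value;

        Filter(int code, Object value) {
            this.code = code;
            this.value = value;
        }
    }

    static boolean matches(Node node, Filter filter) {
        switch (filter.code) {
            case ATTRIBUTE: {
                // Assumption: value is {attribute name, expected attribute value}.
                String[] nameAndValue = (String[]) filter.value;
                return nameAndValue[1].equals(node.attrs.get(nameAndValue[0]));
            }
            case SERVER_NODES_ONLY: {
                // Assumption: a non-zero flag excludes client nodes.
                int flag = (Integer) filter.value;
                return flag == 0 || !node.client;
            }
            default:
                throw new IllegalArgumentException("Unknown filter code: " + filter.code);
        }
    }

    // Returns ids of the nodes accepted by every filter in the projection.
    static List<UUID> nodeIds(List<Node> topology, List<Filter> projection) {
        List<UUID> ids = new ArrayList<>();
        for (Node node : topology) {
            boolean accepted = true;
            for (Filter filter : projection)
                accepted &= matches(node, filter);
            if (accepted)
                ids.add(node.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        Map<String, String> attrs = Collections.singletonMap("DotNet", "MyAttribute");
        Node server = new Node(UUID.randomUUID(), false, attrs);
        Node client = new Node(UUID.randomUUID(), true, attrs);
        List<Filter> projection = Arrays.asList(
            new Filter(ATTRIBUTE, new String[] {"DotNet", "MyAttribute"}),
            new Filter(SERVER_NODES_ONLY, 1));
        System.out.println(nodeIds(Arrays.asList(server, client), projection).size()); // 1
    }
}
```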
> > >>> > > >> > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > When a client obtains nodeIds, it can perform a NODE_INFO call to get a serialized ClusterNode object. In addition, there should be a different API method for accessing/updating node metrics.
> > >>> > > >> > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > On Thu, Nov 21, 2019 at 12:32, Sergey Kozlov <[hidden email]> wrote:
> > >>> > > >> > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > > Hi Pavel
> > >>> > > >> > > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > > On Thu, Nov 21, 2019 at 11:30 AM Pavel Tupitsyn <[hidden email]> wrote:
> > >>> > > >> > > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > > > 1. I believe that Cluster operations for Thin Client protocol are already in the works by Alexandr Shapkin. Can't find the ticket though. Alexandr, can you please confirm and attach the ticket number?
> > >>> > > >> > > > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > > > 2. Proposed changes will work only for Java tasks that are already deployed on server nodes. This is mostly useless for other thin clients we have (Python, PHP, .NET, C++).
> > >>> > > >> > > > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > > I don't think so. Task execution is a way to implement a custom layer for the thin client application.
> > >>> > > >> > > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > > > We should think of a way to make this useful for all clients.
> > >>> > > >> > > > > > > > > > > > > For example, we may allow sending tasks in some scripting language like Javascript.
> > >>> > > >> > > > > > > > > > > > > Thoughts?
> > >>> > > >> > > > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > > Arbitrary code execution from a remote client must be protected from malicious code.
> > >>> > > >> > > > > > > > > > > > I don't know how it could be designed > but > > >>> > without > > >>> > > >> that > > >>> > > >> > we > > >>> > > >> > > > > open > > >>> > > >> > > > > > > the > > >>> > > >> > > > > > > > > hole > > >>> > > >> > > > > > > > > > > to > > >>> > > >> > > > > > > > > > > > kill cluster. > > >>> > > >> > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > On Thu, Nov 21, 2019 at 11:21 AM > Sergey > > >>> > Kozlov < > > >>> > > >> > > > > > > > > [hidden email] > > >>> > > >> > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > wrote: > > >>> > > >> > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > Hi Alex > > >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > The idea is great. But I have some > > >>> concerns > > >>> > > that > > >>> > > >> > > > probably > > >>> > > >> > > > > > > > should > > >>> > > >> > > > > > > > > be > > >>> > > >> > > > > > > > > > > > taken > > >>> > > >> > > > > > > > > > > > > > into account for design: > > >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > 1. We need to have the ability to > > >>> stop a > > >>> > > task > > >>> > > >> > > > > execution, > > >>> > > >> > > > > > > > smth > > >>> > > >> > > > > > > > > > like > > >>> > > >> > > > > > > > > > > > > > OP_COMPUTE_CANCEL_TASK operation > > >>> (client > > >>> > > to > > >>> > > >> > > server) > > >>> > > >> > > > > > > > > > > > > > 2. What's about task execution > > >>> timeout? > > >>> > It > > >>> > > >> may > > >>> > > >> > > help > > >>> > > >> > > > to > > >>> > > >> > > > > > the > > >>> > > >> > > > > > > > > > cluster > > >>> > > >> > > > > > > > > > > > > > survival for buggy tasks > > >>> > > >> > > > > > > > > > > > > > 3. 
Ignite doesn't have > > >>> > roles/authorization > > >>> > > >> > > > > functionality > > >>> > > >> > > > > > > for > > >>> > > >> > > > > > > > > > now. > > >>> > > >> > > > > > > > > > > > But > > >>> > > >> > > > > > > > > > > > > a > > >>> > > >> > > > > > > > > > > > > > task is the risky operation for > > >>> cluster > > >>> > > (for > > >>> > > >> > > > security > > >>> > > >> > > > > > > > > reasons). > > >>> > > >> > > > > > > > > > > > Could > > >>> > > >> > > > > > > > > > > > > we > > >>> > > >> > > > > > > > > > > > > > add for Ignite configuration new > > >>> options: > > >>> > > >> > > > > > > > > > > > > > - Explicit turning on for > > compute > > >>> task > > >>> > > >> > support > > >>> > > >> > > > for > > >>> > > >> > > > > > thin > > >>> > > >> > > > > > > > > > > protocol > > >>> > > >> > > > > > > > > > > > > > (disabled by default) for > whole > > >>> > cluster > > >>> > > >> > > > > > > > > > > > > > - Explicit turning on for > > compute > > >>> task > > >>> > > >> > support > > >>> > > >> > > > for > > >>> > > >> > > > > a > > >>> > > >> > > > > > > node > > >>> > > >> > > > > > > > > > > > > > - The list of task names > > (classes) > > >>> > > >> allowed to > > >>> > > >> > > > > execute > > >>> > > >> > > > > > > by > > >>> > > >> > > > > > > > > thin > > >>> > > >> > > > > > > > > > > > > client. > > >>> > > >> > > > > > > > > > > > > > 4. Support the labeling for task > > >>> that may > > >>> > > >> help > > >>> > > >> > to > > >>> > > >> > > > > > > > investigate > > >>> > > >> > > > > > > > > > > issues > > >>> > > >> > > > > > > > > > > > > on > > >>> > > >> > > > > > > > > > > > > > cluster (the idea from IEP-34 > [1]) > > >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > 1. 
> > >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > >>> > > >> > > > > > > > > > > >>> > > >> > > > > > > > > > >>> > > >> > > > > > > > > >>> > > >> > > > > > > > >>> > > >> > > > > > > >>> > > >> > > > > > >>> > > >> > > > > >>> > > >> > > > >>> > > >> > > >>> > > > > >>> > > > >>> > > > https://cwiki.apache.org/confluence/display/IGNITE/IEP-34+Thin+client%3A+transactions+support > > >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > On Thu, Nov 21, 2019 at 10:58 AM > Alex > > >>> > > Plehanov < > > >>> > > >> > > > > > > > > > > > [hidden email]> > > >>> > > >> > > > > > > > > > > > > > wrote: > > >>> > > >> > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > Hello, Igniters! > > >>> > > >> > > > > > > > > > > > > > > > > >>> > > >> > > > > > > > > > > > > > > I have plans to start > implementation > > >>> of > > >>> > > >> Compute > > >>> > > >> > > > > interface > > >>> > > >> > > > > > > for > > >>> > > >> > > > > > > > > > > Ignite > > >>> > > >> > > > > > > > > > > > > thin > > >>> > > >> > > > > > > > > > > > > > > client and want to discuss > features > > >>> that > > >>> > > >> should > > >>> > > >> > be > > >>> > > >> > > > > > > > implemented. 
> > >>> > > >> > > > > > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > > > > > We already have a Compute implementation for binary-rest clients (GridClientCompute), which has the following functionality:
> > >>> > > >> > > > > > > > > > > > > > > - Filtering cluster nodes (projection) for compute
> > >>> > > >> > > > > > > > > > > > > > > - Executing task by the name
> > >>> > > >> > > > > > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > > > > > I think we can implement this functionality in a thin client as well.
> > >>> > > >> > > > > > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > > > > > First of all, we need some operation types to request a list of all available nodes and probably node attributes (by a list of nodes). Node attributes will be helpful if we decide to implement analogs of the ClusterGroup#forAttribute or ClusterGroup#forPredicate methods in the thin client. Perhaps they can be requested lazily.
> > >>> > > >> > > > > > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > > > > > From the protocol point of view there will be two new operations:
> > >>> > > >> > > > > > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > > > > > OP_CLUSTER_GET_NODES
> > >>> > > >> > > > > > > > > > > > > > > Request: empty
> > >>> > > >> > > > > > > > > > > > > > > Response: long topologyVersion, int minorTopologyVersion, int nodesCount, for each node set of node fields (UUID nodeId, Object or String consistentId, long order, etc)
> > >>> > > >> > > > > > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > > > > > OP_CLUSTER_GET_NODE_ATTRIBUTES
> > >>> > > >> > > > > > > > > > > > > > > Request: int nodesCount, for each node: UUID nodeId
> > >>> > > >> > > > > > > > > > > > > > > Response: int nodesCount, for each node: int attributesCount, for each node attribute: String name, Object value
> > >>> > > >> > > > > > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > > > > > To execute tasks we need something like these methods in the client API:
> > >>> > > >> > > > > > > > > > > > > > > Object execute(String task, Object arg)
> > >>> > > >> > > > > > > > > > > > > > > Future<Object> executeAsync(String task, Object arg)
> > >>> > > >> > > > > > > > > > > > > > > Object affinityExecute(String task, String cache, Object key, Object arg)
> > >>> > > >> > > > > > > > > > > > > > > Future<Object> affinityExecuteAsync(String task, String cache, Object key, Object arg)
> > >>> > > >> > > > > > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > > > > > Which can be mapped to protocol operations:
> > >>> > > >> > > > > > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > > > > > OP_COMPUTE_EXECUTE_TASK
> > >>> > > >> > > > > > > > > > > > > > > Request: UUID nodeId, String taskName, Object arg
> > >>> > > >> > > > > > > > > > > > > > > Response: Object result
> > >>> > > >> > > > > > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > > > > > OP_COMPUTE_EXECUTE_TASK_AFFINITY
> > >>> > > >> > > > > > > > > > > > > > > Request: String cacheName, Object key, String taskName, Object arg
> > >>> > > >> > > > > > > > > > > > > > > Response: Object result
> > >>> > > >> > > > > > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > > > > > The second operation is needed because we sometimes can't calculate and connect to the affinity node on the client-side (affinity awareness can be disabled, a custom affinity function can be used, or there can be no connection between client and affinity node), but we can make a best effort to send the request to the target node if affinity awareness is enabled.
> > >>> > > >> > > > > > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > > > > > Currently, on the server-side requests are always processed synchronously and responses are sent right after the request was processed. To execute long tasks async we should either change this logic or introduce some kind of two-way communication between client and server (now only one-way requests from client to server are allowed).
> > >>> > > >> > > > > > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > > > > > Two-way communication can also be useful in the future if we will send some server-side generated events to clients.
> > >>> > > >> > > > > > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > > > > > In case of two-way communication there can be new operations introduced:
> > >>> > > >> > > > > > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > > > > > OP_COMPUTE_EXECUTE_TASK (from client to server)
> > >>> > > >> > > > > > > > > > > > > > > Request: UUID nodeId, String taskName, Object arg
> > >>> > > >> > > > > > > > > > > > > > > Response: long taskId
> > >>> > > >> > > > > > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > > > > > OP_COMPUTE_TASK_FINISHED (from server to client)
> > >>> > > >> > > > > > > > > > > > > > > Request: taskId, Object result
> > >>> > > >> > > > > > > > > > > > > > > Response: empty
> > >>> > > >> > > > > > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > > > > > The same for affinity requests.
> > >>> > > >> > > > > > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > > > > > Also, we can implement not only the execute task operation, but some other operations from IgniteCompute (broadcast, run, call), but it will be useful only for the Java thin client. And even with the Java thin client we should either implement peer-class-loading for thin clients (this also requires two-way client-server communication) or put classes with executed closures to the server locally.
> > >>> > > >> > > > > > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > > > > > What do you think about proposed protocol changes?
> > >>> > > >> > > > > > > > > > > > > > > Do we need two-way requests between client and server?
> > >>> > > >> > > > > > > > > > > > > > > Do we need support of compute methods other than "execute task"?
> > >>> > > >> > > > > > > > > > > > > > > What do you think about peer-class-loading for thin clients?
> > >>> > > >> > > > > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > > > > --
> > >>> > > >> > > > > > > > > > > > > > Sergey Kozlov
> > >>> > > >> > > > > > > > > > > > > > GridGain Systems
> > >>> > > >> > > > > > > > > > > > > > www.gridgain.com
> > >>> > > >> > > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > > --
> > >>> > > >> > > > > > > > > > > > Sergey Kozlov
> > >>> > > >> > > > > > > > > > > > GridGain Systems
> > >>> > > >> > > > > > > > > > > > www.gridgain.com
> > >>> > > >> > > > > > > > > > >
> > >>> > > >> > > > > > > > > > > --
> > >>> > > >> > > > > > > > > > > Alex.
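
For readers following the two-way variant proposed in the original message quoted above (OP_COMPUTE_EXECUTE_TASK answered with a taskId, then OP_COMPUTE_TASK_FINISHED sent by the server), here is a minimal sketch of the bookkeeping a client might need. It is only an illustration: the class ThinClientTaskTracker and its method names are hypothetical and not part of any Ignite API, and it assumes the connection delivers the response carrying the taskId before the corresponding finish notification.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical client-side bookkeeping for the two-way proposal; not an Ignite class. */
final class ThinClientTaskTracker {
    /** Futures of tasks that were started but not yet reported as finished. */
    private final Map<Long, CompletableFuture<Object>> pending = new ConcurrentHashMap<>();

    /** Call when the OP_COMPUTE_EXECUTE_TASK response (long taskId) arrives. */
    CompletableFuture<Object> onTaskStarted(long taskId) {
        return pending.computeIfAbsent(taskId, id -> new CompletableFuture<>());
    }

    /**
     * Call when the server sends OP_COMPUTE_TASK_FINISHED (taskId, result).
     * Returns false if the task is unknown, e.g. it was already finished or cancelled.
     */
    boolean onTaskFinished(long taskId, Object result) {
        CompletableFuture<Object> fut = pending.remove(taskId);
        return fut != null && fut.complete(result);
    }

    /**
     * Local half of a cancellation. Sending the actual cancel request
     * (something like OP_COMPUTE_CANCEL_TASK) is out of scope for this sketch.
     */
    boolean cancel(long taskId) {
        CompletableFuture<Object> fut = pending.remove(taskId);
        return fut != null && fut.cancel(false);
    }

    public static void main(String[] args) {
        ThinClientTaskTracker tracker = new ThinClientTaskTracker();

        // The server accepted the task and replied with taskId = 42.
        CompletableFuture<Object> fut = tracker.onTaskStarted(42L);

        // Later the server pushes the completion notification.
        tracker.onTaskFinished(42L, "done");

        System.out.println(fut.join());
    }
}
```

The point of keying everything by taskId is that the happy path, the completion notification and cancellation all refer to the same identifier, which is the "start a process, get its id, end the process by this id" pattern discussed elsewhere in this thread.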
> will try to provide such guarantees on the server-side
Thanks. I think it is better to do it once in the server code than N times in every client.

On Wed, Mar 25, 2020 at 11:04 PM Alex Plehanov <[hidden email]> wrote:

> Pavel,
>
> 1. Actually it can be solved on the client-side (and already solved in PoC implementation). But I agree it brings extra complexity for client-side implementation, will try to provide such guarantees on the server-side.
> 2. ComputeTask also has a "reduce" step which is executed on the initiator node. Binary-rest client implementation, for example, has such affinity methods (to execute the task by name). I'm ok with removing it. At least if someone will need it we can implement it again at any time in the future without protocol change.
>
> I've fixed IEP.
>
> Denis,
>
> Deployment API is definitely needed as one of the next steps. Currently, we are talking only about the first step (execution of already deployed tasks).
> Also, I'm not sure about automatic redeploy and peer-class-loading for thin clients, I think it's better to have more control here and provide API to explicitly deploy classes or jar files. WDYT?
>
> On Wed, Mar 25, 2020 at 21:17, Denis Magda <[hidden email]> wrote:
>
> > Alex, thanks for preparing the outline.
> >
> > I'd like us to discuss an approach for compute tasks update with no downtimes on the servers' end. For instance, let's assume that a Python/C++/Node.JS developer requested to update a compute task he called from the app. Should we introduce some system level API to the binary protocol that can take a jar file (or class) and redeploy it automatically with the usage of peer-class-loading?
> >
> > -
> > Denis
> >
> > On Wed, Mar 25, 2020 at 5:47 AM Alex Plehanov <[hidden email]> wrote:
> >
> > > Hello guys.
> > >
> > > I've implemented PoC and created IEP [1] for thin client compute grid functionality. Please have a look.
> > >
> > > [1]: https://cwiki.apache.org/confluence/display/IGNITE/IEP-42+Thin+client%3A+compute+support
> > >
> > > On Fri, Jan 24, 2020 at 16:56, Alex Plehanov <[hidden email]> wrote:
> > >
> > > > We've discussed thin client compute protocol with Pavel Tupitsyn and Igor Sapego and come to the conclusion that the approach with two-way requests should be used: the client generates a taskId and sends a request to the server to execute a task. The server responds that the request has been accepted. After the task has finished the server notifies the client (sends a request without waiting for a response). The client can cancel the task by sending a corresponding request to the server.
> > > >
> > > > Also, a node list should be passed (optionally) with a request to limit nodes to execute the task.
> > > >
> > > > I will create IEP and file detailed protocol changes shortly.
> > > >
> > > > On Tue, Jan 21, 2020 at 18:46, Alex Plehanov <[hidden email]> wrote:
> > > >
> > > >> Igor, thanks for the reply.
> > > >>
> > > >> > Approach with taskId will require a lot of changes in protocol and thus more "heavy" for implementation
> > > >> Do you mean the approach with a server notifications mechanism? Yes, it will require a lot of changes. But in the most recent messages we've discussed with Pavel an approach without a server notifications mechanism. This approach has the same complexity and performance as an approach with requestId.
> > > >>
> > > >> > But such clients as Python, Node.js, PHP, Go most probably won't have support for this API, at least for now.
> > > >> Without a server notifications mechanism, there will be no breaking changes in the protocol, so a client implementation can just skip this feature and protocol version and implement the next one.
> > > >>
> > > >> > Or never.
> > > >> I think it is still useful to execute Java compute tasks from non-Java thin clients. Also, we can provide some out-of-the-box Java tasks, for example ExecutePythonScriptTask with python compute implementation, which can run a python script on a server node.
> > > >>
> > > >> > So, maybe it's a good time for us to change our backward compatibility mechanism from protocol versioning to feature masks?
> > > >> I like the idea with feature masks, but it will force us to support both backward compatibility mechanisms, protocol versioning and feature masks.
> > > >>
> > > >> On Mon, Jan 20, 2020 at 20:34, Pavel Tupitsyn <[hidden email]> wrote:
> > > >>
> > > >>> Huge +1 from me for Feature Masks.
> > > >>> I think this should be our top priority for thin client protocol, since it simplifies change management a lot.
> > > >>>
> > > >>> On Mon, Jan 20, 2020 at 8:21 PM Igor Sapego <[hidden email]> wrote:
> > > >>>
> > > >>> > Sorry for the late reply.
> > > >>> >
> > > >>> > Approach with taskId will require a lot of changes in protocol and thus more "heavy" for implementation, but it definitely looks to me less hacky than reqId-approach. Moreover, as was mentioned, server notifications mechanism will be required in the future anyway with high probability. So from this point of view I like taskId-approach.
> > > >>> >
> > > >>> > On the other hand, what we should also consider here is performance. Speaking of latency, it looks like reqId will have better results in case of small and fast tasks. The only question here is whether we want to optimize thin clients for this case.
> > > >>> >
> > > >>> > Also, what you are talking about mostly involves clients on platforms that already have Compute API for thick clients. Let me mention one more point of view here and another concern.
> > > >>> >
> > > >>> > The changes you propose are going to change protocol version for sure. In case with taskId approach and server notifications - even more so.
> > > >>> >
> > > >>> > But such clients as Python, Node.js, PHP, Go most probably won't have support for this API, at least for now. Or never. But the current backward-compatibility mechanism implies protocol versions where we imply that a client that supports version 1.5 also supports all the features introduced in all the previous versions of the protocol.
> > > >>> >
> > > >>> > Thus implementing Compute API in any of the proposed ways *may* force mentioned clients to support changes in protocol which they do not necessarily need in order to introduce new features in the future.
> > > >>> >
> > > >>> > So, maybe it's a good time for us to change our backward compatibility mechanism from protocol versioning to feature masks?
> > > >>> >
> > > >>> > WDYT?
> > > >>> >
> > > >>> > Best Regards,
> > > >>> > Igor
> > > >>> >
> > > >>> > On Fri, Jan 17, 2020 at 9:37 AM Alex Plehanov <[hidden email]> wrote:
> > > >>> >
> > > >>> > > Looks like we didn't reach consensus here.
> > > >>> > >
> > > >>> > > Igor, as thin client maintainer, can you please share your opinion?
> > > >>> > >
> > > >>> > > Everyone else is also welcome, please share your thoughts about options to implement operations for compute.
> > > >>> > >
> > > >>> > > On Thu, Nov 28, 2019 at 10:02, Alex Plehanov <[hidden email]> wrote:
> > > >>> > >
> > > >>> > > > > Since all thin client operations are inherently async, we should be able to cancel any of them
> > > >>> > > > It's illogical to have such an ability. What should a cancel operation of a cancel operation do? Moreover, sometimes it's dangerous, for example, a create cache operation should never be canceled. There should be an explicit set of processes that we can cancel: queries, transactions, tasks, services. The lifecycle of services is more complex than the lifecycle of tasks. With services, I suppose, we can't use request cancelation, so tasks will be the only process with an exceptional pattern.
> > > >>> > > >
> > > >>> > > > > The request would be "execute task with specified node filter" - simple and efficient.
> > > >>> > > > It's not simple: every compute or service request should contain complex node filtering logic, which duplicates the same logic for cluster API.
> > > >>> > > > It's not efficient: for example, we can't implement forPredicate() filtering in this case.
> > > >>> > > >
> > > >>> > > > On Wed, Nov 27, 2019 at 19:25, Pavel Tupitsyn <[hidden email]> wrote:
> > > >>> > > >
> > > >>> > > >> > The request is already processed (task is started), we can't cancel the request
> > > >>> > > >> The request is not "start a task". It is "execute task" (and get result). Same as "cache get" - you get a result in the end, we don't "start cache get" then "end cache get".
> > > >>> > > >> > > > >>> > > >> Since all thin client operations are inherently async, we > > should > > > >>> be > > > >>> > able > > > >>> > > >> to > > > >>> > > >> cancel any of them > > > >>> > > >> by sending another request with an id of prior request to be > > > >>> > cancelled. > > > >>> > > >> That's why I'm advocating for this approach - it will work > for > > > >>> > anything, > > > >>> > > >> no > > > >>> > > >> special cases. > > > >>> > > >> And it keeps "happy path" as simple as it is right now. > > > >>> > > >> > > > >>> > > >> Queries are different because we retrieve results in pages, > we > > > >>> can't > > > >>> > do > > > >>> > > >> them as one request. > > > >>> > > >> Transactions are also different because client controls when > > > they > > > >>> > should > > > >>> > > >> end. > > > >>> > > >> There is no reason for task execution to be a special case > > like > > > >>> > queries > > > >>> > > or > > > >>> > > >> transactions. > > > >>> > > >> > > > >>> > > >> > we always need to send 2 requests to server to execute > the > > > task > > > >>> > > >> Nope. We don't need to get nodes on client at all. > > > >>> > > >> The request would be "execute task with specified node > > filter" - > > > >>> > simple > > > >>> > > >> and > > > >>> > > >> efficient. > > > >>> > > >> > > > >>> > > >> > > > >>> > > >> On Wed, Nov 27, 2019 at 4:31 PM Alex Plehanov < > > > >>> > [hidden email]> > > > >>> > > >> wrote: > > > >>> > > >> > > > >>> > > >> > > We do cancel a request to perform a task. We may and > > should > > > >>> use > > > >>> > > this > > > >>> > > >> to > > > >>> > > >> > cancel any other request in future. > > > >>> > > >> > The request is already processed (task is started), we > can't > > > >>> cancel > > > >>> > > the > > > >>> > > >> > request. 
As you mentioned before, we already do almost the > > > same > > > >>> for > > > >>> > > >> queries > > > >>> > > >> > (close the cursor, but not cancel the request to run a > > query), > > > >>> it's > > > >>> > > >> better > > > >>> > > >> > to do such things in a common way. We have a pattern: > start > > > some > > > >>> > > process > > > >>> > > >> > (query, transaction), get id of this process, end process > by > > > >>> this > > > >>> > id. > > > >>> > > >> The > > > >>> > > >> > "Execute task" process should match the same pattern. In > my > > > >>> opinion, > > > >>> > > >> > implementation with two-way requests is the best option to > > > match > > > >>> > this > > > >>> > > >> > pattern (we can even reuse OP_RESOURCE_CLOSE operation > type > > in > > > >>> this > > > >>> > > >> case). > > > >>> > > >> > Sometime in the future, we will need two-way requests for > > some > > > >>> other > > > >>> > > >> > functionality (continuous queries, event listening, etc). > > But > > > >>> even > > > >>> > > >> without > > > >>> > > >> > two-way requests introducing some process id (task id in > our > > > >>> case) > > > >>> > > will > > > >>> > > >> be > > > >>> > > >> > closer to existing pattern than canceling tasks by request > > id. > > > >>> > > >> > > > > >>> > > >> > > So every new request will apply those filters on server > > > side, > > > >>> > using > > > >>> > > >> the > > > >>> > > >> > most recent set of nodes. > > > >>> > > >> > In this case, we always need to send 2 requests to server > to > > > >>> execute > > > >>> > > the > > > >>> > > >> > task. First - to get nodes by the filter, second - to > > actually > > > >>> > execute > > > >>> > > >> the > > > >>> > > >> > task. It seems like overhead. The same will be for > services. > > > >>> Cluster > > > >>> > > >> group > > > >>> > > >> > remains the same if the topology hasn't changed. 
We can > use > > > this > > > >>> > fact > > > >>> > > >> and > > > >>> > > >> > bind "execute task" request to topology. If topology has > > > >>> changed - > > > >>> > get > > > >>> > > >> > nodes for new topology and retry request. > > > >>> > > >> > > > > >>> > > >> > вт, 26 нояб. 2019 г. в 17:44, Pavel Tupitsyn < > > > >>> [hidden email] > > > >>> > >: > > > >>> > > >> > > > > >>> > > >> > > > After all, we don't cancel request > > > >>> > > >> > > We do cancel a request to perform a task. We may and > > should > > > >>> use > > > >>> > this > > > >>> > > >> to > > > >>> > > >> > > cancel any other request in future. > > > >>> > > >> > > > > > >>> > > >> > > > Client uses some cluster group filtration (for example > > > >>> > > forServers() > > > >>> > > >> > > cluster group) > > > >>> > > >> > > Please see above - Aleksandr Shapkin described how we > > store > > > >>> > > >> > > filtered cluster groups on client. > > > >>> > > >> > > We don't store node IDs, we store actual filters. So > every > > > new > > > >>> > > request > > > >>> > > >> > will > > > >>> > > >> > > apply those filters on server side, > > > >>> > > >> > > using the most recent set of nodes. > > > >>> > > >> > > > > > >>> > > >> > > var myGrp = cluster.forServers().forAttribute("foo"); // > > > This > > > >>> does > > > >>> > > not > > > >>> > > >> > > issue any server requests, just builds an object with > > > filters > > > >>> on > > > >>> > > >> client > > > >>> > > >> > > while (true) myGrp.compute().executeTask("bar"); // > Every > > > >>> request > > > >>> > > >> > includes > > > >>> > > >> > > filters, and filters are applied on the server side > > > >>> > > >> > > > > > >>> > > >> > > On Tue, Nov 26, 2019 at 1:42 PM Alex Plehanov < > > > >>> > > >> [hidden email]> > > > >>> > > >> > > wrote: > > > >>> > > >> > > > > > >>> > > >> > > > > Anyway, my point stands. > > > >>> > > >> > > > I can't agree. Why you don't want to use task id for > > this? 
After all, we don't cancel request (request is already processed), we
cancel the task. So it's more convenient to use task id here.

> Can you please provide equivalent use case with existing "thick"
> client?
For example:
Cluster consists of one server node.
Client uses some cluster group filtration (for example forServers()
cluster group).
Client starts to send periodically (for example 1 per minute) long-term
(for example 1 hour long) tasks to the cluster.
Meanwhile, several server nodes joined the cluster.

In case of thick client: All server nodes will be used, tasks will be
load balanced.
In case of thin client: Only one server node will be used, client will
detect topology change after an hour.

On Tue, Nov 26, 2019 at 11:50, Pavel Tupitsyn <[hidden email]> wrote:

> I can't see any usage of request id in query cursors
You are right, cursor id is a separate thing.
Anyway, my point stands.
> client sends long term tasks to nodes and wants to do it with load
> balancing
I still don't get it. Can you please provide equivalent use case with
existing "thick" client?

On Mon, Nov 25, 2019 at 11:59 PM Alex Plehanov <[hidden email]> wrote:

> And it is fine to use request ID to identify compute tasks (as we do
> with query cursors).
I can't see any usage of request id in query cursors. We send query
request and get cursor id in response. After that, we only use cursor id
(to get next pages and to close the resource). Did I miss something?

> Looks like I'm missing something - how is topology change relevant to
> executing compute tasks from client?
It's not relevant directly. But there are some cases where it will be
helpful.
For example, if client sends long term tasks to nodes and wants to do it
with load balancing, it will detect topology change only after some time
in the future with the first response, so load balancing will not work.
Perhaps we can add optional "topology version" field to the
OP_COMPUTE_EXECUTE_TASK request to solve this problem.

On Mon, Nov 25, 2019 at 22:42, Pavel Tupitsyn <[hidden email]> wrote:

Alex,

> we will mix entities from different layers (transport layer and
> request body)
I would not call our message header (which includes the id) "transport
layer". TCP is our transport layer. And it is fine to use request ID to
identify compute tasks (as we do with query cursors).
> we still can't be sure that the task is successfully started on a
> server
The request to start the task will fail and we'll get a response
indicating that right away

> we won't ever know about topology change
Looks like I'm missing something - how is topology change relevant to
executing compute tasks from client?

On Mon, Nov 25, 2019 at 10:17 PM Alex Plehanov <[hidden email]> wrote:

Pavel, in this case, we will mix entities from different layers
(transport layer and request body), it's not very good. The same
behavior we can achieve with generated on client-side task id, but there
will be no inter-layer data intersection and I think it will be easier
to implement on both client and server-side. But we still can't be sure
that the task is successfully started on a server.
We won't ever know about topology change, because the topology changed
flag will be sent from server to client only with a response when the
task is completed. Do we accept that?

On Mon, Nov 25, 2019 at 19:07, Pavel Tupitsyn <[hidden email]> wrote:

Alex,

I have a simpler idea. We already do request id handling in the
protocol, so:
- Client sends a normal request to execute compute task. Request ID is
  generated as usual.
- As soon as task is completed, a response is received.

As for cancellation - client can send a new request (with new request
ID) and (in the body) pass the request ID from above as a task
identifier. As a result, there are two responses:
- Cancellation response
- Task response (with proper cancelled status)

That's it, no need to modify the core of the protocol.
One request - one response.

On Mon, Nov 25, 2019 at 6:20 PM Alex Plehanov <[hidden email]> wrote:

Pavel, we need to inform the client when the task is completed, we need
the ability to cancel the task. I see several ways to implement this:

1. Client sends a request to the server to start a task, server returns
   task id in response. Server notifies client when task is completed
   with a new request (from server to client). Client can cancel the
   task by sending a new request with operation type "cancel" and task
   id. In this case, we should implement two-way requests.
2. Client generates unique task id and sends a request to the server to
   start a task, server doesn't reply immediately but waits until task
   is completed. Client can cancel task by sending new request with
   operation type "cancel" and task id. In this case, we should decouple
   request and response on the server-side (currently response is sent
   right after request was processed). Also, we can't be sure that task
   is successfully started on a server.
3. Client sends a request to the server to start a task, server returns
   id in response. Client periodically asks the server about task
   status. Client can cancel the task by sending new request with
   operation type "cancel" and task id. This case brings some overhead
   to the communication channel.
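
All three options rely on the same idea: a task gets an id when it is
started, and later "cancel" or "status" requests refer to that id. A
minimal sketch of the server-side bookkeeping this implies is given
here, under stated assumptions: TaskRegistry and its methods are
hypothetical names used only for illustration, they are not part of
Ignite or of the proposed protocol, and the transport is left out
entirely.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical server-side registry of running tasks; not an Ignite class. */
final class TaskRegistry {
    private final Map<UUID, CompletableFuture<Object>> tasks = new ConcurrentHashMap<>();

    /** "Start" request: registers the task and returns its id right away. */
    UUID start(CompletableFuture<Object> task) {
        UUID id = UUID.randomUUID();
        tasks.put(id, task);
        // Drop the entry once the task finishes, whatever the outcome.
        task.whenComplete((res, err) -> tasks.remove(id));
        return id;
    }

    /** "Cancel" request: true only if a running task with this id was cancelled. */
    boolean cancel(UUID id) {
        CompletableFuture<Object> task = tasks.get(id);
        return task != null && task.cancel(true);
    }

    /** "Status" request, as used by the polling variant (option 3). */
    boolean isRunning(UUID id) {
        return tasks.containsKey(id);
    }
}
```

With option 2 the id would come from the client instead of being
generated in start(); the lookup by id in cancel() stays the same.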
Personally, I think that the case with 2-ways requests is better, but
I'm open to any other ideas.

Aleksandr,

Filtering logic for OP_CLUSTER_GROUP_GET_NODE_IDS looks overcomplicated.
Do we need server-side filtering at all? Wouldn't it be better to send
basic info (ids, order, flags) for all nodes (there is relatively small
amount of data) and extended info (attributes) for selected list of
nodes? In this case, we can do basic node filtration on client-side
(forClients(), forServers(), forNodeIds(), forOthers(), etc).

Do you use standard ClusterNode serialization? There are also metrics
serialized with ClusterNode, do we need it on thin client?
There are other interfaces to show metrics, I think it's redundant to
export metrics to thin clients too.

What do you think?

On Fri, Nov 22, 2019 at 20:15, Aleksandr Shapkin <[hidden email]> wrote:

Alex,

I think you can create a new IEP page and I will fill it with the
Cluster API details.
In short, I’ve introduced several new codes:

Cluster API is pretty straightforward:

OP_CLUSTER_IS_ACTIVE = 5000
OP_CLUSTER_CHANGE_STATE = 5001
OP_CLUSTER_CHANGE_WAL_STATE = 5002
OP_CLUSTER_GET_WAL_STATE = 5003

Cluster group codes:

OP_CLUSTER_GROUP_GET_NODE_IDS = 5100
OP_CLUSTER_GROUP_GET_NODE_INFO = 5101

The underlying implementation is based on the thick client logic.
For every request, we provide a known topology version and if it has
changed, a client first updates it and then re-sends the filtering
request.

Alongside the topVer a client sends a serialized nodes projection object
that could be considered as a code to value mapping.

Consider: [{Code=1, Value=[“DotNet”, “MyAttribute”]}, {Code=2, Value=1}]

Where “1” stands for Attribute filtering and “2” stands for the
serverNodesOnly flag.

As a result of request processing, a server sends nodeId UUIDs and a
current topVer.
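
To make the code-to-value mapping concrete, here is a rough sketch of
how a server could apply such a projection to its current node list.
Everything in it is an assumption made for illustration: the
ProjectionSketch, Node and Filter names are hypothetical, Node is a
stand-in rather than Ignite's ClusterNode, and the attribute value is
assumed to be a [name, expectedValue] pair, which the message above does
not spell out. The topVer handling is omitted.

```java
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.stream.Collectors;

/** Illustrative only: applies a code-to-value projection to a node list. */
final class ProjectionSketch {
    static final int ATTRIBUTE = 1;         // value assumed to be [name, expectedValue]
    static final int SERVER_NODES_ONLY = 2; // value 1 means "enabled"

    /** Minimal stand-in for a cluster node; not Ignite's ClusterNode. */
    record Node(UUID id, boolean client, Map<String, String> attrs) {}

    /** One projection entry: a filter code plus its value. */
    record Filter(int code, Object value) {}

    /** Checks a single node against a single projection entry. */
    static boolean matches(Node n, Filter f) {
        switch (f.code()) {
            case ATTRIBUTE: {
                List<?> kv = (List<?>) f.value();
                return kv.get(1).equals(n.attrs().get(kv.get(0)));
            }
            case SERVER_NODES_ONLY:
                // A disabled flag lets every node through.
                return !Integer.valueOf(1).equals(f.value()) || !n.client();
            default:
                throw new IllegalArgumentException("Unknown filter code: " + f.code());
        }
    }

    /** Returns ids of the nodes that pass every entry of the projection. */
    static List<UUID> nodeIds(List<Node> topology, List<Filter> projection) {
        return topology.stream()
            .filter(n -> projection.stream().allMatch(f -> matches(n, f)))
            .map(Node::id)
            .collect(Collectors.toList());
    }
}
```

Because the projection is re-evaluated on each call, a node that joins
later is picked up by the next request without the client tracking node
ids itself.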
When a client obtains nodeIds, it can perform a NODE_INFO call to get a
serialized ClusterNode object. In addition there should be a different
API method for accessing/updating node metrics.

On Thu, Nov 21, 2019 at 12:32, Sergey Kozlov <[hidden email]> wrote:

Hi Pavel

On Thu, Nov 21, 2019 at 11:30 AM Pavel Tupitsyn <[hidden email]> wrote:

> 1. I believe that Cluster operations for Thin Client protocol are
> already in the works by Alexandr Shapkin. Can't find the ticket
> though. Alexandr, can you please confirm and attach the ticket number?
>
> 2. Proposed changes will work only for Java tasks that are already
> deployed on server nodes. This is mostly useless for other thin
> clients we have (Python, PHP, .NET, C++).

I don't think so. The task (execution) is a way to implement own layer
for the thin client application.

> We should think of a way to make this useful for all clients.
> For example, we may allow sending tasks in some scripting language
> like Javascript.
> Thoughts?

The arbitrary code execution from a remote client must be protected
from malicious code.
I don't know how it could be designed but without that we open the hole
to kill cluster.

On Thu, Nov 21, 2019 at 11:21 AM Sergey Kozlov <[hidden email]> wrote:

Hi Alex

The idea is great. But I have some concerns that probably should be
taken into account for design:

1. We need to have the ability to stop a task execution, something like
   an OP_COMPUTE_CANCEL_TASK operation (client to server)
2. What about a task execution timeout? It may help the cluster survive
   buggy tasks
3. Ignite doesn't have roles/authorization functionality for now. But a
   task is a risky operation for the cluster (for security reasons).
   Could we add new options to the Ignite configuration:
   - Explicit turning on for compute task support for thin protocol
     (disabled by default) for whole cluster
   - Explicit turning on for compute task support for a node
   - The list of task names (classes) allowed to execute by thin
     client.
4. Support the labeling for task that may help to investigate issues on
   cluster (the idea from IEP-34 [1])

[1] https://cwiki.apache.org/confluence/display/IGNITE/IEP-34+Thin+client%3A+transactions+support

On Thu, Nov 21, 2019 at 10:58 AM Alex Plehanov <[hidden email]> wrote:

Hello, Igniters!

I have plans to start implementation of Compute interface for Ignite
thin client and want to discuss features that should be implemented.
We already have Compute implementation for binary-rest clients
(GridClientCompute), which have the following functionality:
- Filtering cluster nodes (projection) for compute
- Executing task by the name

I think we can implement this functionality in a thin client as well.

First of all, we need some operation types to request a list of all
available nodes and probably node attributes (by a list of nodes).
> Node attributes will be helpful if we decide to implement an analog of
> the ClusterGroup#forAttribute or ClusterGroup#forPredicate methods in the
> thin client. Perhaps they can be requested lazily.
>
> From the protocol point of view there will be two new operations:
>
> OP_CLUSTER_GET_NODES
> Request: empty
> Response: long topologyVersion, int minorTopologyVersion, int nodesCount,
> for each node a set of node fields (UUID nodeId, Object or String
> consistentId, long order, etc)
>
> OP_CLUSTER_GET_NODE_ATTRIBUTES
> Request: int nodesCount, for each node: UUID nodeId
> Response: int nodesCount, for each node: int attributesCount, for each
> node attribute: String name, Object value
>
> To execute tasks we need something like these methods in the client API:
> Object execute(String task, Object arg)
> Future<Object> executeAsync(String task, Object arg)
> Object affinityExecute(String task, String cache, Object key, Object arg)
> Future<Object> affinityExecuteAsync(String task, String cache, Object key,
> Object arg)
>
> Which can be mapped to protocol operations:
>
> OP_COMPUTE_EXECUTE_TASK
> Request: UUID nodeId, String taskName, Object arg
> Response: Object result
>
> OP_COMPUTE_EXECUTE_TASK_AFFINITY
> Request: String cacheName, Object key, String taskName, Object arg
> Response: Object result
>
> The second operation is needed because we sometimes can't calculate and
> connect to the affinity node on the client side (affinity awareness can be
> disabled, a custom affinity function can be used, or there can be no
> connection between the client and the affinity node), but we can make a
> best effort to send the request to the target node if affinity awareness
> is enabled.
>
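As an illustration of the request layout proposed above for OP_CLUSTER_GET_NODE_ATTRIBUTES (int nodesCount, then a UUID per node), here is a minimal sketch of how a client might pack and unpack that body. The class name NodeAttributesRequestCodec, the encoding of a UUID as two longs, and the default big-endian byte order of ByteBuffer are assumptions made for the example only; none of them is taken from the actual thin client wire format.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

/** Hypothetical codec for the proposed OP_CLUSTER_GET_NODE_ATTRIBUTES request body. */
final class NodeAttributesRequestCodec {
    /** Writes int nodesCount, then each node ID as two longs (assumed layout). */
    public static byte[] encode(List<UUID> nodeIds) {
        ByteBuffer buf = ByteBuffer.allocate(Integer.BYTES + nodeIds.size() * 2 * Long.BYTES);

        buf.putInt(nodeIds.size());

        for (UUID id : nodeIds) {
            buf.putLong(id.getMostSignificantBits());
            buf.putLong(id.getLeastSignificantBits());
        }

        return buf.array();
    }

    /** Reads the body back in the same order it was written. */
    public static List<UUID> decode(byte[] body) {
        ByteBuffer buf = ByteBuffer.wrap(body);

        int cnt = buf.getInt();
        List<UUID> ids = new ArrayList<>(cnt);

        for (int i = 0; i < cnt; i++) {
            long most = buf.getLong();
            long least = buf.getLong();

            ids.add(new UUID(most, least));
        }

        return ids;
    }
}
```

The response side (attribute name and Object value pairs) is left out on purpose, since it depends on how arbitrary objects are marshalled, which the proposal does not spell out.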
> Currently, on the server side requests are always processed synchronously
> and responses are sent right after the request was processed. To execute
> long tasks asynchronously we should either change this logic or introduce
> some kind of two-way communication between client and server (now only
> one-way requests from client to server are allowed).
>
> Two-way communication can also be useful in the future if we send some
> server-side generated events to clients.
>
> In case of two-way communication, new operations can be introduced:
>
> OP_COMPUTE_EXECUTE_TASK (from client to server)
> Request: UUID nodeId, String taskName, Object arg
> Response: long taskId
>
> OP_COMPUTE_TASK_FINISHED (from server to client)
> Request: taskId, Object result
> Response: empty
>
> The same for affinity requests.
>
> Also, we can implement not only the execute task operation, but some
> other operations from IgniteCompute (broadcast, run, call), but it will be
> useful only for the Java thin client. And even with the Java thin client
> we should either implement peer-class-loading for thin clients (this also
> requires two-way client-server communication) or put classes with
> executed closures to the server locally.
>
> What do you think about the proposed protocol changes?
> Do we need two-way requests between client and server?
> Do we need support of compute methods other than "execute task"?
> What do you think about peer-class-loading for thin clients?
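
For the two-way variant quoted above, the client has to remember which tasks it started so that a later OP_COMPUTE_TASK_FINISHED from the server can be matched to the right caller. A minimal sketch of such bookkeeping follows. The class PendingTasks and its method names are hypothetical, and the sketch assumes that the response carrying the taskId is always handled before the matching notification on the same connection.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical client-side registry of tasks awaiting OP_COMPUTE_TASK_FINISHED. */
final class PendingTasks {
    /** Futures of started tasks, keyed by the taskId returned by the server. */
    private final Map<Long, CompletableFuture<Object>> futs = new ConcurrentHashMap<>();

    /** Call when the OP_COMPUTE_EXECUTE_TASK response with the taskId arrives. */
    public CompletableFuture<Object> onTaskStarted(long taskId) {
        CompletableFuture<Object> fut = new CompletableFuture<>();

        futs.put(taskId, fut);

        return fut;
    }

    /**
     * Call when the server sends OP_COMPUTE_TASK_FINISHED.
     *
     * @return true if a pending task with this ID was found and completed.
     */
    public boolean onTaskFinished(long taskId, Object result) {
        CompletableFuture<Object> fut = futs.remove(taskId);

        return fut != null && fut.complete(result);
    }
}
```

A future returned by onTaskStarted could back something like the executeAsync method from the proposal; cancellation and connection loss are not handled here.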
>
> --
> Sergey Kozlov
> GridGain Systems
> www.gridgain.com
>
> --
> Alex.