[
https://issues.apache.org/jira/browse/IGNITE-23458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Konstantin Orlov reassigned IGNITE-23458:
-----------------------------------------
Assignee: Konstantin Orlov
> Sql. Worker node failed to send message to another worker node
> --------------------------------------------------------------
>
> Key: IGNITE-23458
> URL: https://issues.apache.org/jira/browse/IGNITE-23458
> Project: Ignite
> Issue Type: Improvement
> Components: sql
> Affects Versions: 3.0.0-beta1
> Reporter: Evgeny Stanilovsky
> Assignee: Konstantin Orlov
> Priority: Major
> Labels: ignite-3
>
> Case: _Execution of a multistage query involves communication between nodes
> in order to exchange control messages and batches of intermediate results.
> If one node is unable to send a message to another node, it must inform the
> coordinator so that the latter can handle the problem. Recovery is possible
> only if no rows have been returned to the client yet. To recover from such
> a situation, the current execution must be dropped and the query must be
> restarted from the mapping stage. In case of an RW transaction, the tx
> itself must be restarted if it was implicit; otherwise recovery is not
> possible._
> On the other hand, we have the following assumptions:
> 1. Either a message is delivered through the messaging service, or a
> logical topology event is raised.
> 2. Messages must also be delivered during temporary unavailability of the
> recipient.
> Thus, if a message can't be sent, an appropriate error message will be sent
> to the coordinator node, and error handling needs to be implemented for two
> cases:
> 1. In case of a runtime error, the whole execution needs to be cancelled
> with an appropriate, informative exception on the client side.
> 2. In case of a messaging service error, the coordinator needs to reschedule
> query execution according to the logic above.
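
The decision logic described above could be sketched roughly as follows.
This is only an illustration of the rules in the description, not code from
the Ignite code base: the class, enum, and method names are all hypothetical,
and the handling of read-only transactions (restart the query only) is an
assumption, since the description spells out only the RW case.

```java
import java.util.Objects;

/** Hypothetical sketch; none of these names exist in Apache Ignite. */
final class SendFailureHandling {
    /** Kind of failure reported to the coordinator. */
    enum FailureKind { RUNTIME_ERROR, MESSAGING_ERROR }

    /** Transaction context of the failed query (READ_ONLY is an assumption). */
    enum TxKind { READ_ONLY, RW_IMPLICIT, RW_EXPLICIT }

    /** What the coordinator does in response. */
    enum Action { CANCEL_WITH_ERROR, RESTART_FROM_MAPPING, RESTART_TX_AND_QUERY }

    static Action decide(FailureKind failure, TxKind tx, boolean rowsReturnedToClient) {
        Objects.requireNonNull(failure, "failure");
        Objects.requireNonNull(tx, "tx");

        // Runtime errors are never retried: cancel and report to the client.
        if (failure == FailureKind.RUNTIME_ERROR) {
            return Action.CANCEL_WITH_ERROR;
        }
        // Messaging error, but the client has already seen rows: no recovery.
        if (rowsReturnedToClient) {
            return Action.CANCEL_WITH_ERROR;
        }
        switch (tx) {
            case RW_EXPLICIT:
                // An explicit RW transaction cannot be restarted transparently.
                return Action.CANCEL_WITH_ERROR;
            case RW_IMPLICIT:
                // Restart the implicit transaction, then the query itself.
                return Action.RESTART_TX_AND_QUERY;
            default:
                // Assumed: a read-only query is simply remapped and restarted.
                return Action.RESTART_FROM_MAPPING;
        }
    }

    public static void main(String[] args) {
        System.out.println(decide(FailureKind.MESSAGING_ERROR, TxKind.RW_IMPLICIT, false));
    }
}
```

Keeping the decision in one pure function, separate from the messaging code,
would make the "rows already returned" and "explicit RW transaction" cases
easy to cover with unit tests.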
--
This message was sent by Atlassian Jira
(v8.20.10#820010)