[ https://issues.apache.org/jira/browse/FLINK-14606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhu Zhu closed FLINK-14606.
---------------------------
    Resolution: Won't Do

Closed because the values of {{isCallback}} and {{releasePartitions}} are not exactly aligned.
{{isCallback}} only needs to be true if the task is already deployed and it is failed by JM.
However, even if a task is not deployed, {{releasePartitions}} still needs to be true since the partition may have been created in external shuffle services.

{{fromSchedulerNg}} will be removed along with the legacy scheduler removal, so we do not need to change it right now here.

> Simplify params of Execution#processFail
> ----------------------------------------
>
>                 Key: FLINK-14606
>                 URL: https://issues.apache.org/jira/browse/FLINK-14606
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Coordination
>    Affects Versions: 1.10.0
>            Reporter: Zhu Zhu
>            Priority: Major
>
> The 3 params fromSchedulerNg/releasePartitions/isCallback of
> Execution#processFail are quite a mess while they seem to be correlated.
> I'd propose to simplify the params of processFail by using a single
> {{isInternalFailure}} to replace those 3 params. {{isInternalFailure}} is true
> iff the failure is from TM (strictly speaking, notified from SchedulerBase).
> This also hardens the handling of cases where a task is successfully deployed
> but JM does not realize it (see #3 below).
> Here's why these 3 params can be simplified:
> 1. {{fromSchedulerNg}}, true iff the failure is from TM and
> isLegacyScheduling==false.
> It's only used like this: {{if (!fromSchedulerNg &&
> !isLegacyScheduling())}}. So {{!isInternalFailure}} can replace it directly.
> 2. {{releasePartitions}}, true iff the failure is from TM.
> Its value is now exactly the same as {{isInternalFailure}}, so we can drop it
> and use {{isInternalFailure}} instead.
> 3. {{isCallback}}, true iff the failure is from TM or the task is not
> deployed.
> It's only used like this: {{(!isCallback && (current == RUNNING ||
> current == DEPLOYING))}}.
> So using {{!isInternalFailure}} to replace it would be enough. It is a
> bit different for the case where a task deployment to a task manager fails,
> which previously set {{isCallback}} to true. However, it would be safer to
> signal a cancel call, in case the deployment actually succeeded but the
> response was lost on the network.
> cc [~GJL]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
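
The cancel-call guard discussed in item 3 of the quoted description can be sketched as below. This is a minimal, hypothetical model and not the actual Flink code: the class ProcessFailGuardSketch, the trimmed ExecutionState enum, and both helper methods are invented for illustration, and only the two boolean expressions are taken from the issue text. It also assumes the execution is still in DEPLOYING when a failed deployment is handled.

```java
// Hypothetical sketch; none of these names are part of Flink's API.
final class ProcessFailGuardSketch {

    // Trimmed-down stand-in for the execution states named in the issue.
    enum ExecutionState { CREATED, DEPLOYING, RUNNING, FAILED }

    // Existing guard as quoted in the issue:
    // (!isCallback && (current == RUNNING || current == DEPLOYING))
    static boolean cancelWithIsCallback(boolean isCallback, ExecutionState current) {
        return !isCallback
                && (current == ExecutionState.RUNNING || current == ExecutionState.DEPLOYING);
    }

    // Proposed guard: the same expression with !isInternalFailure substituted.
    static boolean cancelWithIsInternalFailure(boolean isInternalFailure, ExecutionState current) {
        return !isInternalFailure
                && (current == ExecutionState.RUNNING || current == ExecutionState.DEPLOYING);
    }

    public static void main(String[] args) {
        // Failure reported by the TM for a running task:
        // isCallback == true and isInternalFailure == true, so neither guard cancels.
        System.out.println("TM failure, old guard: "
                + cancelWithIsCallback(true, ExecutionState.RUNNING));
        System.out.println("TM failure, new guard: "
                + cancelWithIsInternalFailure(true, ExecutionState.RUNNING));

        // Deployment to the task manager fails (assumed state: DEPLOYING):
        // isCallback was true, but the failure is not reported by the TM,
        // so isInternalFailure == false and only the new guard cancels.
        System.out.println("Deploy failure, old guard: "
                + cancelWithIsCallback(true, ExecutionState.DEPLOYING));
        System.out.println("Deploy failure, new guard: "
                + cancelWithIsInternalFailure(false, ExecutionState.DEPLOYING));
    }
}
```

Under these assumptions the two guards differ only in the failed-deployment case: the existing guard skips the cancel call, while the proposed one sends it, which is the behavior change the description calls safer.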