attilapiros commented on code in PR #50630: URL: https://github.com/apache/spark/pull/50630#discussion_r2093658439
##########
core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala:
##########
@@ -1552,6 +1552,20 @@ private[spark] class DAGScheduler(
       // `findMissingPartitions()` returns all partitions every time.
       stage match {
         case sms: ShuffleMapStage if stage.isIndeterminate && !sms.isAvailable =>
+          // already executed atleast once

Review Comment:
@cloud-fan The rollback is not executed when the `FetchFailed` is handled. That code only checked (and still checks) whether a rollback would be needed on that chain of stages (including the result stage) and, if so, whether it is possible. If a rollback is needed but not possible (because for a `ResultStage` some output has already been generated), we abort. See
https://github.com/apache/spark/blob/02f196a4f2c0615826e78fa0f6d99b49b66b81f1/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L2177-L2182

So there is no double rollback, and the abort is protected from a double abort:
https://github.com/apache/spark/blob/02f196a4f2c0615826e78fa0f6d99b49b66b81f1/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L2883-L2886

The advantages of doing this check in `submitMissingTasks`:
- we can handle the case where an executor loss triggers the recalculation of an indeterminate stage
- we are later in time than we were at the `FetchFailed`, so we can make a better decision. What I mean is that if any task finished for the result stage after the `FetchFailed` (there is a race among the events), we can abort the stage instead of continuing with a partial result left over from the previous run.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
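To make the decision described in the review comment concrete, here is a minimal, self-contained sketch of the "rollback needed? rollback possible? otherwise abort" check. It is not the code from the PR or from `DAGScheduler`: the simplified `Stage` model, `RollbackCheck.decide`, `finishedPartitions`, and `AbortGuard` are all hypothetical names introduced for illustration, and the guard is only a stand-in for whatever protection the linked lines actually implement.

```scala
// Hypothetical, simplified model; these are NOT Spark's real classes.
sealed trait Stage {
  def id: Int
  def isIndeterminate: Boolean
}
final case class ShuffleMapStage(id: Int, isIndeterminate: Boolean, isAvailable: Boolean)
  extends Stage
// finishedPartitions: assumed count of result partitions that already produced output.
final case class ResultStage(id: Int, isIndeterminate: Boolean, finishedPartitions: Int)
  extends Stage

sealed trait Decision
case object Proceed extends Decision
case object Rollback extends Decision
final case class Abort(reason: String) extends Decision

object RollbackCheck {
  // Decide what to do when `retried` is about to be (re)submitted.
  // `downstream` is the chain of stages that consume its output,
  // including the result stage.
  def decide(retried: ShuffleMapStage, downstream: Seq[Stage]): Decision =
    if (!retried.isIndeterminate || retried.isAvailable) {
      // Deterministic output, or nothing is missing: no rollback needed.
      Proceed
    } else {
      downstream.collectFirst {
        case r: ResultStage if r.finishedPartitions > 0 => r
      } match {
        // Output already produced by the result stage cannot be taken back.
        case Some(r) =>
          Abort(s"ResultStage ${r.id} already produced output for " +
            s"${r.finishedPartitions} partition(s)")
        case None => Rollback
      }
    }
}

// Stand-in for "abort is protected from double abort":
// only the first abort request for a given stage id has any effect.
final class AbortGuard {
  private val aborted = scala.collection.mutable.Set.empty[Int]
  def abortOnce(stageId: Int): Boolean = aborted.add(stageId)
}
```

Because `decide` is evaluated at submission time in this sketch, a result task that finished after the `FetchFailed` is already reflected in `finishedPartitions`, which is the timing advantage the comment argues for.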
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org