attilapiros commented on code in PR #50630:
URL: https://github.com/apache/spark/pull/50630#discussion_r2093658439


##########
core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala:
##########
@@ -1552,6 +1552,20 @@ private[spark] class DAGScheduler(
     // `findMissingPartitions()` returns all partitions every time.
     stage match {
       case sms: ShuffleMapStage if stage.isIndeterminate && !sms.isAvailable =>
+        // already executed atleast once

Review Comment:
   @cloud-fan The rollback is not executed when the `FetchFailed` is handled. 
That code was (and still is) only checking whether a rollback would be needed 
on that chain of stages (including the result stage) and, if so, whether it is 
possible. If a rollback is needed but not possible (for example, a `ResultStage` 
has already generated some output), we abort.
   
   See
   
https://github.com/apache/spark/blob/02f196a4f2c0615826e78fa0f6d99b49b66b81f1/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L2177-L2182
   
   So there is no double rollback and abort is protected from double abort:
   
https://github.com/apache/spark/blob/02f196a4f2c0615826e78fa0f6d99b49b66b81f1/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L2883-L2886
   
   
   The advantages of doing this check in `submitMissingTasks`:
   - we can handle the case when an executor loss triggers the 
recalculation of an indeterminate stage
   - we are later in time than we were at the `FetchFailed`, so we can make 
a better decision. What I mean is that if any task of the result stage finished 
after the `FetchFailed` (as there is a race among the events), we can abort the 
stage instead of continuing with a partial result left over from the previous run.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

