attilapiros commented on PR #50033:
URL: https://github.com/apache/spark/pull/50033#issuecomment-2785113081

   Yes, they are correct.
   
   That stage has two tasks, one running on `hostC` and the other on `hostD`:
   
https://github.com/apache/spark/blob/00a4aadb8cfce30f2234453c64b9ca46c60fa07f/core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala#L3160
   
   The fetch failure was from the host called `hostC`:
   
https://github.com/apache/spark/blob/00a4aadb8cfce30f2234453c64b9ca46c60fa07f/core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala#L3166
   
   This caused the executor on `hostC` to be lost:
   
   ```
   25/04/07 19:48:08.769 pool-1-thread-1-ScalaTest-running-DAGSchedulerSuite 
INFO DAGSchedulerSuite$MyDAGScheduler: Executor lost: hostC-exec (epoch 4)
   ```
   
   So this removes the output that was produced on `hostC`, which is why the 
assert passes. Later, however, when `ResubmitFailedStages` is handled, 
execution reaches `submitMissingTasks()`, where all the output is removed:
   
https://github.com/apache/spark/blob/2b3fb526c8bd8b486f280756d5282cc84f7473d7/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L1555
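   
   To make the two removals easier to compare, here is a toy model of the 
sequence described above. It is only a sketch: `MapOutputToy`, 
`removeOutputsOnHost` and `removeAllOutputs` are hypothetical names and not 
Spark's API, and the assignment of partition 0 to `hostC` and partition 1 to 
`hostD` is an assumption made for illustration.
   
   ```scala
   // Toy model only: every name here is hypothetical, none of it is Spark code.
   object MapOutputToy {
     // partition id -> host holding that partition's map output
     type Outputs = Map[Int, String]

     // Executor lost: drop only the outputs stored on the lost host.
     def removeOutputsOnHost(outputs: Outputs, host: String): Outputs =
       outputs.filter { case (_, h) => h != host }

     // Resubmission as described above: drop every output of the stage.
     def removeAllOutputs(outputs: Outputs): Outputs = Map.empty

     def main(args: Array[String]): Unit = {
       // Assumed layout: partition 0 on hostC, partition 1 on hostD.
       val registered: Outputs = Map(0 -> "hostC", 1 -> "hostD")

       // Step 1: the fetch failure leads to the executor on hostC being lost.
       val afterExecutorLost = removeOutputsOnHost(registered, "hostC")
       println(afterExecutorLost) // only the hostD entry is left

       // Step 2: the resubmitted stage clears whatever output remained.
       val afterResubmit = removeAllOutputs(afterExecutorLost)
       println(afterResubmit.isEmpty)
     }
   }
   ```
   
   In this model an assert placed after step 1 still sees the `hostD` output, 
while anything checked after step 2 sees none.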
 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

