attilapiros commented on PR #50033:
URL: https://github.com/apache/spark/pull/50033#issuecomment-2784704888

   > And for the second case, where the first result task is successful, but a
subsequent task fails, then the code will follow the existing path of aborting
the query.
   
   @ahshahid Check your test "SPARK-51272: retry all the partitions of result 
stage, if the first result task has failed and failing ShuffleMap stage is 
inDeterminate". 
   
   The title is a bit misleading as fetching from the determinate stage fails 
(`shuffleId1`):
   
   ```
   makeCompletionEvent(
     taskSets.find(_.stageId == resultStage.id).get.tasks(0),
     FetchFailed(makeBlockManagerId("hostA"), shuffleId1, 0L, 0, 0, "ignored"),
   ```
   
   The indeterminate stage is resubmitted because the determinate stage's
failure leads to losing the executors and all the shuffle blocks. Abort is not
called because of the condition:
   
https://github.com/apache/spark/blob/00a4aadb8cfce30f2234453c64b9ca46c60fa07f/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L2156
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

