Re: [PR] [SPARK-51272][CORE]. Fix for the race condition in Scheduler causing failure in retrying all partitions in case of indeterministic shuffle keys [spark]

via GitHub Mon, 14 Apr 2025 21:17:24 -0700


attilapiros commented on PR #50033:
URL: https://github.com/apache/spark/pull/50033#issuecomment-2801616710


   > If yes, we could be more aggressive when handling this case -
   
   > Invalidate all downstream shuffle output
   > Any result stage which has/had started, and not completed - fail that job.
   > Does this align with your observations/analysis @attilapiros ?
   
   We are getting closer.
   
   > Specifically about JDBC - assuming it is not due to the case we discussed above - I am not entirely sure :-)
   > If the commit protocol has been correctly implemented, we will need to understand that better ...
   
   There is transaction management for writing the rows of a partition (that is, per task):
   
https://github.com/apache/spark/blob/1fa05b8cb755bbf2432a37a96bcaf329982b7684/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/JdbcUtils.scala#L804-L807
   
   But as far as I can see, re-executing a task will still duplicate its rows.
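
To make the concern concrete, here is a minimal toy sketch, not Spark code. It assumes a per-partition writer that wraps its inserts in a single transaction, and it uses Python's built-in `sqlite3` as a stand-in for a JDBC target. The names `write_partition`, `count_rows`, `demo` and the table `target` are hypothetical and exist only for illustration. The sketch shows that a transaction protects against a task failing before commit, but not against a task that already committed being run again.

```python
import sqlite3


def write_partition(conn, rows, fail_before_commit=False):
    """Insert all rows of one partition inside a single transaction."""
    try:
        conn.executemany("INSERT INTO target(value) VALUES (?)",
                         [(r,) for r in rows])
        if fail_before_commit:
            # Simulates a task attempt dying before it reaches commit.
            raise RuntimeError("simulated task failure")
        conn.commit()
    except Exception:
        # Nothing from this attempt becomes visible in the table.
        conn.rollback()
        raise


def count_rows(conn):
    return conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE target(value INTEGER)")
    conn.commit()
    partition = [1, 2, 3]

    # Attempt 1 fails before commit: the rollback leaves the table empty.
    try:
        write_partition(conn, partition, fail_before_commit=True)
    except RuntimeError:
        pass
    after_failed = count_rows(conn)

    # Attempt 2 succeeds and commits the partition once.
    write_partition(conn, partition)
    after_first = count_rows(conn)

    # The same partition is executed again after it already committed
    # (for example, because the whole stage is retried): rows are duplicated.
    write_partition(conn, partition)
    after_rerun = count_rows(conn)

    conn.close()
    return after_failed, after_first, after_rerun


if __name__ == "__main__":
    print(demo())
```

In this toy model the row count goes from 0 (failed attempt rolled back) to 3 (first commit) to 6 (re-execution), which is the duplication described above. Whether the real JDBC write path behaves the same way depends on the target table and the save mode, so treat this only as an illustration of the argument.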



