[PR] [SPARK-51638][CORE][3.5] Fix fetching the remote disk stored RDD blocks via the external shuffle service [spark]

via GitHub Tue, 15 Apr 2025 15:29:55 -0700


attilapiros opened a new pull request, #50596:
URL: https://github.com/apache/spark/pull/50596


   ### What changes were proposed in this pull request?
   
   Fix remote fetching of disk stored RDD blocks via the external shuffle 
service when `spark.shuffle.service.fetch.rdd.enabled` is set.
   
   ### Why are the changes needed?
   
   After https://issues.apache.org/jira/browse/SPARK-43221, remote fetching was 
handled in a single place, `BlockManagerMasterEndpoint#getLocationsAndStatus`, 
where all the locations were used together with the `blockManagerInfo` map. 
However, this map only contains information about active executors, not about 
executors that have already been killed (for example after downscaling in 
dynamic allocation, or because of a failure). 
   
   This PR extends the search to all the remote external shuffle services by 
using the `blockStatusByShuffleService` map. That map contains block info 
even for killed executors.
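
   The lookup order described above can be illustrated with a minimal, 
self-contained sketch. It is not the code from this PR: `Location`, 
`BlockStatus`, `firstMatch` and `findStatus` are hypothetical names, and the two 
maps are simplified stand-ins for `blockManagerInfo` and 
`blockStatusByShuffleService`. It only assumes that a block's locations may 
include both executor and external shuffle service addresses.

```scala
// Simplified model for illustration only; these are NOT Spark's real classes.
object RemoteRddBlockLookupSketch {
  // Hypothetical stand-in for a block manager / shuffle service address.
  final case class Location(host: String, port: Int)
  // Hypothetical stand-in for the status of a stored block.
  final case class BlockStatus(onDisk: Boolean, diskSize: Long)

  // Per-location table: block id -> status.
  type StatusTable = Map[Location, Map[String, BlockStatus]]

  // Return the first location in `locations` whose table knows the block.
  private def firstMatch(
      blockId: String,
      locations: Seq[Location],
      table: StatusTable): Option[(Location, BlockStatus)] =
    locations.flatMap { loc =>
      table.get(loc).flatMap(_.get(blockId)).map(status => (loc, status))
    }.headOption

  // Look at active executors first (stand-in for `blockManagerInfo`), then
  // fall back to the statuses kept per external shuffle service (stand-in
  // for `blockStatusByShuffleService`), which survive executor removal.
  def findStatus(
      blockId: String,
      locations: Seq[Location],
      activeExecutors: StatusTable,
      byShuffleService: StatusTable): Option[(Location, BlockStatus)] =
    firstMatch(blockId, locations, activeExecutors)
      .orElse(firstMatch(blockId, locations, byShuffleService))

  def main(args: Array[String]): Unit = {
    val shuffleService = Location("host-a", 7337)
    val status = BlockStatus(onDisk = true, diskSize = 1024L)
    // The executor that wrote the block was killed, so the active map is empty.
    val found = findStatus(
      "rdd_0_0",
      Seq(shuffleService),
      activeExecutors = Map.empty,
      byShuffleService = Map(shuffleService -> Map("rdd_0_0" -> status)))
    println(found)
  }
}
```

   In this sketch a lookup that consults only the active-executor table would 
return `None` once the owning executor is gone, which mirrors the failure mode 
described above.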
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   An existing unit test was extended.
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   No.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


