[ 
https://issues.apache.org/jira/browse/SPARK-56199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-56199:
-----------------------------------
    Labels: pull-request-available  (was: )

> ShuffleBlockFetcherIterator should not read FallbackStorage blocks as local 
> blocks
> ---------------------------------------------------------------------------------
>
>                 Key: SPARK-56199
>                 URL: https://issues.apache.org/jira/browse/SPARK-56199
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 4.2.0
>            Reporter: Enrico Minack
>            Priority: Major
>              Labels: pull-request-available
>
> The ShuffleBlockFetcherIterator treats blocks stored on the FallbackStorage 
> (very likely remote distributed storage such as S3 or HDFS) as local block 
> files.
> The current implementation has the following disadvantages:
> - blocks are read from fallback storage single-threaded and synchronously 
> (one by one)
> - waiting for fallback storage blocks to be transferred over the network 
> delays the reading of local blocks
> - fallback storage blocks are read in the {{catch}} branch after an attempt 
> to read them as local blocks fails (because they cannot be found locally)
> Fallback storage blocks should instead be treated like remote blocks: fetched 
> independently of local blocks, asynchronously and multi-threaded.
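
The proposed behaviour can be illustrated with a minimal, language-neutral sketch in Python. It is not Spark's Scala implementation and does not reflect the actual pull request. Every name in it (`LOCAL_BLOCKS`, `FALLBACK_BLOCKS`, `read_local`, `read_fallback`, `fetch_blocks`, `max_threads`) is hypothetical, and the fallback storage is simulated by a dictionary lookup with an artificial delay. The sketch only shows the idea of handing fallback reads to a thread pool so that local reads proceed without waiting for them.

```python
import queue
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for block stores; none of these names exist in Spark.
LOCAL_BLOCKS = {"shuffle_0_0_0": b"local-a", "shuffle_0_1_0": b"local-b"}
FALLBACK_BLOCKS = {"shuffle_0_2_0": b"remote-c", "shuffle_0_3_0": b"remote-d"}


def read_local(block_id):
    """Read a block from local disk (simulated by a dict lookup)."""
    return LOCAL_BLOCKS[block_id]


def read_fallback(block_id):
    """Read a block from fallback storage (simulated network latency)."""
    time.sleep(0.05)
    return FALLBACK_BLOCKS[block_id]


def fetch_blocks(block_ids, max_threads=4):
    """Yield (block_id, data) pairs; fallback reads run in background threads."""
    results = queue.Queue()
    local_ids = [b for b in block_ids if b in LOCAL_BLOCKS]
    fallback_ids = [b for b in block_ids if b not in LOCAL_BLOCKS]

    def fetch_into_queue(block_id):
        # Hand either the data or the failure to the consumer thread.
        try:
            results.put((block_id, read_fallback(block_id)))
        except Exception as exc:
            results.put((block_id, exc))

    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        # Start all fallback fetches first, without waiting for any of them.
        for block_id in fallback_ids:
            pool.submit(fetch_into_queue, block_id)

        # Local blocks are served while the fallback fetches are in flight.
        for block_id in local_ids:
            yield block_id, read_local(block_id)

        # Then hand out fallback blocks in whatever order they complete.
        for _ in fallback_ids:
            block_id, outcome = results.get()
            if isinstance(outcome, Exception):
                raise outcome
            yield block_id, outcome


if __name__ == "__main__":
    wanted = ["shuffle_0_2_0", "shuffle_0_0_0", "shuffle_0_3_0", "shuffle_0_1_0"]
    for block_id, data in fetch_blocks(wanted):
        print(block_id, data)
```

In this sketch the two simulated fallback reads overlap with each other and with the local reads, instead of running one by one inside an error-handling path. A failed fallback read surfaces as an exception when the consumer reaches that block.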



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

