[ 
https://issues.apache.org/jira/browse/SPARK-52508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-52508:
-----------------------------------
    Labels: pull-request-available  (was: )

> Read from fallback storage should consider replication delay
> ------------------------------------------------------------
>
>                 Key: SPARK-52508
>                 URL: https://issues.apache.org/jira/browse/SPARK-52508
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Kubernetes
>    Affects Versions: 4.1.0
>            Reporter: Enrico Minack
>            Priority: Major
>              Labels: pull-request-available
>
> When the storage decommissioning feature is used on Kubernetes with a 
> distributed filesystem as the fallback storage, an executor may be unable to 
> see shuffle data that the decommissioned executor has just written to that 
> filesystem. The cause is replication delay in the distributed filesystem. 
> Because the dependent executor knows that the shuffle data is located in the 
> fallback storage, it can defer the read and retry when it encounters a 
> {{FileNotFoundException}}.
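
The deferral described above could be sketched roughly as follows. This is a minimal illustration in Python, not Spark's implementation: the function name `read_with_deferral`, its parameters, and the retry limits are all hypothetical, and Python's `FileNotFoundError` stands in for the JVM's `FileNotFoundException`.

```python
import time


def read_with_deferral(open_fn, path, max_attempts=5, wait_seconds=0.5,
                       sleep=time.sleep):
    """Call open_fn(path), retrying while the file is not yet visible.

    Hypothetical helper: it assumes the caller already knows that the data
    lives on the fallback storage, so a missing file is more likely a
    replication delay than a real loss.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return open_fn(path)
        except FileNotFoundError:
            # Give up only after the last attempt; the original error
            # then propagates to the caller unchanged.
            if attempt == max_attempts:
                raise
            # Defer the read to give replication time to catch up.
            sleep(wait_seconds)
```

The `sleep` parameter is injected only so the waiting behaviour can be exercised without real delays. A bounded number of attempts keeps a genuinely missing file from blocking the reader indefinitely.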



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
