In your case you need to externalize the shuffle files to a component that lives outside your Spark executors, so that they survive the death of a Spark worker. The external shuffle service does exactly that on YARN: https://spark.apache.org/docs/latest/running-on-yarn.html#configuring-the-external-shuffle-service
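For reference, wiring that up amounts to roughly the following (a minimal sketch of the settings described in the linked page; the aux-services list and the shuffle jar location will vary with your Hadoop distribution):

    # spark-defaults.conf (or pass via --conf on spark-submit)
    spark.shuffle.service.enabled    true
    spark.dynamicAllocation.enabled  true

    <!-- yarn-site.xml on every NodeManager -->
    <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle,spark_shuffle</value>
    </property>
    <property>
      <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
      <value>org.apache.spark.network.yarn.YarnShuffleService</value>
    </property>

You also need spark-<version>-yarn-shuffle.jar on the NodeManager classpath and a NodeManager restart. Note that the service runs inside the NodeManager, so it keeps shuffle files available across executor deaths on a node that is itself still alive.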
2017-12-20 10:46 GMT+01:00 chopinxb <chopi...@gmail.com>:

> In my use case, I run Spark in yarn-client mode with dynamicAllocation
> enabled. When a node shuts down abnormally, my Spark application fails
> because tasks fail to fetch shuffle blocks from that node four times.
> Why doesn't Spark leverage Alluxio (a distributed in-memory filesystem) to
> write shuffle blocks with replicas? In that situation, when a node shuts
> down, tasks could fetch the shuffle blocks from another replica, and we
> would get higher stability.