I also had a similar problem while joining a dataset. After digging into the worker logs, I figured out it was throwing a CancelledKeyException; not sure of the cause.
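One thing worth checking: ExecutorLostFailure on YARN often means the container was killed by YARN for exceeding its memory limit, and that usually only shows up in the NodeManager / container logs, not in the Spark driver log (if log aggregation is enabled, "yarn logs -applicationId <appId>" will pull them). If that turns out to be the cause, raising the per-executor off-heap overhead sometimes helps. A rough sketch, assuming Spark 1.1 on YARN; the app name is made up and the value is just a starting point, not a confirmed fix for this case:

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("join-job")
      // headroom YARN grants on top of the executor heap, in MB;
      // the 1.1 default (384) is often too small for shuffle-heavy jobs
      .set("spark.yarn.executor.memoryOverhead", "1024")
    val sc = new SparkContext(conf)

The same property can also be passed to spark-submit with --conf instead of setting it in code.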
Thanks
Best Regards

On Tue, Sep 30, 2014 at 5:15 AM, jamborta <jambo...@gmail.com> wrote:

> hi all,
>
> I have a problem with my application when I increase the data size over 5GB
> (the cluster has about 100GB of memory to handle that). First I get this
> warning:
>
> WARN TaskSetManager: Lost task 10.1 in stage 4.1 (TID 408, backend-node1):
> FetchFailed(BlockManagerId(3, backend-node0, 41484, 0), shuffleId=1,
> mapId=0, reduceId=18)
>
> then this one:
>
> 14/09/29 23:26:44 WARN TaskSetManager: Lost task 2.0 in stage 5.2 (TID 418,
> backend-node1): ExecutorLostFailure (executor lost)
>
> a few seconds later, all the executors shut down:
>
> 14/09/29 23:26:53 ERROR YarnClientSchedulerBackend: Yarn application already
> ended: FINISHED
> 14/09/29 23:26:53 INFO SparkUI: Stopped Spark web UI at
> http://backend-node0:4040
> 14/09/29 23:26:53 INFO YarnClientSchedulerBackend: Shutting down all
> executors
>
> and even the SparkContext stops.
>
> Not sure how to debug this; there is nothing in the logs apart from the
> above. I have given enough memory to all the executors.
>
> thanks for the help,