When running a Spark job, some tasks in stage X often fail with OOM, yet the same task for the same stage eventually succeeds when relaunched, and stage X and the job then complete successfully.
One explanation I can think of: say there are 2 cores per executor and an executor memory of 8G. Initially a task got OOM because the 2 tasks running concurrently on that executor together needed more than 8G, but when the task was relaunched on that executor no other task was running, so it could finish successfully. I looked at https://stackoverflow.com/questions/48532836/spark-memory-management-for-oom but did not find a very clear answer there.
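To make the hypothesis concrete, here is a minimal sketch of the kind of setup I mean (the app name and exact values are just placeholders; spark.executor.memory and spark.executor.cores are the actual Spark settings). With 8G and 2 cores, two concurrent tasks share the one executor heap; dropping to 1 core per executor, or raising the memory, gives each task the full heap, which would match the retry behaviour I am seeing:

    import org.apache.spark.sql.SparkSession

    // Hypothetical session mirroring the 2-core / 8G scenario described above.
    val spark = SparkSession.builder()
      .appName("oom-question-sketch")         // placeholder name
      .config("spark.executor.memory", "8g")  // heap shared by all tasks on one executor
      .config("spark.executor.cores", "2")    // 2 tasks can run concurrently and share the 8g
      // .config("spark.executor.cores", "1") // one way to give each task the whole heap
      .getOrCreate()

Is this interpretation of the retry succeeding (because the executor happened to be otherwise idle) correct, or is something else going on?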