Hi everyone,

We are running Spark 3.0.1 with the Kubernetes resource manager and are facing an intermittent issue in which the driver pod gets deleted, and the driver logs contain a message that the SparkContext was shut down.
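For context, our submission looks roughly like the following. The API server address, image name, namespace, class, and resource values here are placeholders rather than our exact settings:

spark-submit \
  --master k8s://https://<k8s-apiserver>:6443 \
  --deploy-mode cluster \
  --name our-etl-job \
  --class com.example.OurJob \
  --conf spark.executor.instances=10 \
  --conf spark.executor.memory=8g \
  --conf spark.kubernetes.namespace=spark-jobs \
  --conf spark.kubernetes.container.image=<our-spark-3.0.1-image> \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
  local:///opt/spark/jars/our-job.jar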
The same job runs fine with the same set of configurations most of the time, but it occasionally fails. The failure mostly occurs while reading or writing Parquet files to HDFS, though we are not sure that is the only use case affected. Any pointers to find the root cause?

Most of the earlier reported issues mention executor OOM as the cause, but we have not seen an OOM error on any of the executors. Also, why would the SparkContext be shut down in this case instead of retrying with new executors? Another doubt: why does the driver pod get deleted? Shouldn't it just end in an error state?

Regards,
Shrikant Prasad
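P.S. For reference, the stage that tends to fail is a plain Parquet read/write against HDFS, something like the sketch below. The paths, column name, and object name are placeholders, not our actual job code:

import org.apache.spark.sql.SparkSession

object ParquetRoundTrip {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("parquet-roundtrip").getOrCreate()

    // Read an existing Parquet dataset from HDFS (placeholder path).
    val df = spark.read.parquet("hdfs:///data/input/events")

    // A simple transformation followed by a Parquet write back to HDFS;
    // the intermittent failure shows up around stages like this one.
    df.filter("event_date is not null")
      .write
      .mode("overwrite")
      .parquet("hdfs:///data/output/events_clean")

    spark.stop()
  }
}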