Maybe enable DEBUG-level logging in your job and follow the processing logic up to the failure?
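One way to raise the level from inside the job is `SparkContext.setLogLevel("DEBUG")`. The driver log gets long at that level, so here is a rough sketch of a helper that pulls out the suspicious lines just before the shutdown message. It is only an illustration: the function name, the regexes, and the 200-line window are my assumptions, not anything from Spark itself, and the message pattern may need adjusting to match your actual log.

```python
import re
from typing import List

# Assumption: the shutdown message may be spelled with or without spaces.
SHUTDOWN = re.compile(r"Spark\s?Context was shut\s?down", re.IGNORECASE)

# Assumption: log4j-style lines carry the level as a standalone word,
# and stack traces contain "Exception" or "Caused by:".
SUSPICIOUS = re.compile(r"\b(WARN|ERROR|FATAL)\b|Exception|Caused by:")


def lines_before_shutdown(log_lines: List[str], window: int = 200) -> List[str]:
    """Return suspicious lines among the `window` lines that precede
    the first shutdown message; empty list if no such message exists."""
    for i, line in enumerate(log_lines):
        if SHUTDOWN.search(line):
            start = max(0, i - window)
            return [l for l in log_lines[start:i] if SUSPICIOUS.search(l)]
    return []  # hypothetical helper: no shutdown message found
```

You could feed it the saved driver pod log, for example `lines_before_shutdown(open(path).read().splitlines())`, where `path` is wherever you stored the log. Whatever it returns is only a starting point for reading the surrounding lines.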
BTW, you need to look at what happens during job processing. `Spark Context was shutdown` is not the root cause, but the result of job failure in most cases.

Dongjoon.

On Fri, Oct 28, 2022 at 12:10 AM Shrikant Prasad <shrikant....@gmail.com> wrote:

> Thanks Dongjoon for replying. I have tried with Spark 3.2 and still facing
> the same issue.
>
> Looking for some pointers which can help in debugging to find the
> root cause.
>
> Regards,
> Shrikant
>
> On Thu, 27 Oct 2022 at 10:36 PM, Dongjoon Hyun <dongjoon.h...@gmail.com>
> wrote:
>
>> Hi, Shrikant.
>>
>> It seems that you are using non-GA features.
>>
>> FYI, since Apache Spark 3.1.1, Kubernetes Support became GA in the
>> community.
>>
>> https://spark.apache.org/releases/spark-release-3-1-1.html
>>
>> In addition, Apache Spark 3.1 reached EOL last month.
>>
>> Could you try the latest distribution like Apache Spark 3.3.1 to see that
>> you are still experiencing the same issue?
>>
>> It will reduce the scope of your issues by excluding many known and fixed
>> bugs at 3.0/3.1/3.2/3.3.0.
>>
>> Thanks,
>> Dongjoon.
>>
>>
>> On Wed, Oct 26, 2022 at 11:16 PM Shrikant Prasad <shrikant....@gmail.com>
>> wrote:
>>
>>> Hi Everyone,
>>>
>>> We are using Spark 3.0.1 with Kubernetes resource manager. Facing an
>>> intermittent issue in which the driver pod gets deleted and the driver logs
>>> have this message that Spark Context was shutdown.
>>>
>>> The same job works fine with given set of configurations most of the
>>> time but sometimes it fails. It mostly occurs while reading or writing
>>> parquet files to hdfs. (but not sure if it's the only usecase affected)
>>>
>>> Any pointers to find the root cause?
>>>
>>> Most of the earlier reported issues mention executors getting OOM as the
>>> cause. But we have not seen an OOM error in any of executors. Also, why the
>>> context will be shutdown in this case instead of retrying with new
>>> executors.
>>> Another doubt is why the driver pod gets deleted. Shouldn't it just
>>> error out?
>>>
>>> Regards,
>>> Shrikant
>>>
>>> --
>>> Regards,
>>> Shrikant Prasad
>>>
>> --
> Regards,
> Shrikant Prasad
>