Maybe enable DEBUG-level logging in your job and follow the processing logic
until the failure?

BTW, you need to look at what happens during job processing.

`Spark Context was shutdown` is usually not the root cause, but the result of
a job failure.
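
As a rough illustration of that approach, here is a minimal Python sketch that
scans a saved driver log and prints the WARN/ERROR/FATAL lines that appear
before the first shutdown message. It is not part of Spark. The function name
`lines_before_shutdown`, the assumed log line layout (date, time, level,
message), the exact shutdown wording, and the sample lines are all
hypothetical and would need adjusting to your actual log4j pattern.

```python
import re

# Assumption: each log line looks like "<date> <time> <LEVEL> <message>".
LEVEL_RE = re.compile(r"^\S+ \S+ (DEBUG|INFO|WARN|ERROR|FATAL) ")
# Assumption: the shutdown message wording; tolerate spacing variants.
SHUTDOWN_RE = re.compile(r"Spark\s?Context was shut\s?down", re.IGNORECASE)


def lines_before_shutdown(log_lines, levels=("WARN", "ERROR", "FATAL")):
    """Return lines at the given levels logged before the first shutdown message."""
    suspicious = []
    for line in log_lines:
        if SHUTDOWN_RE.search(line):
            break  # everything after this point is a consequence, not a cause
        match = LEVEL_RE.match(line)
        if match and match.group(1) in levels:
            suspicious.append(line.rstrip("\n"))
    return suspicious


# Invented sample lines for illustration only; not real Spark output.
sample = [
    "22/10/28 00:10:01 INFO SomeComponent: example progress message",
    "22/10/28 00:10:05 WARN SomeComponent: example warning",
    "22/10/28 00:10:09 ERROR OtherComponent: example error",
    "22/10/28 00:10:10 ERROR OtherComponent: Spark Context was shutdown",
    "22/10/28 00:10:11 ERROR OtherComponent: example follow-up noise",
]

for line in lines_before_shutdown(sample):
    print(line)
```

For a real run, you could pass an open file object for the saved driver log
instead of `sample`. The earliest lines it prints are usually a better
starting point than the shutdown message itself.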

Dongjoon.

On Fri, Oct 28, 2022 at 12:10 AM Shrikant Prasad <shrikant....@gmail.com>
wrote:

> Thanks, Dongjoon, for replying. I have tried with Spark 3.2 and am still
> facing the same issue.
>
> I am looking for pointers that can help with debugging to find the
> root cause.
>
> Regards,
> Shrikant
>
> On Thu, 27 Oct 2022 at 10:36 PM, Dongjoon Hyun <dongjoon.h...@gmail.com>
> wrote:
>
>> Hi, Shrikant.
>>
>> It seems that you are using non-GA features.
>>
>> FYI, Kubernetes support has been GA in the community since Apache
>> Spark 3.1.1.
>>
>>     https://spark.apache.org/releases/spark-release-3-1-1.html
>>
>> In addition, Apache Spark 3.1 reached EOL last month.
>>
>> Could you try the latest distribution, such as Apache Spark 3.3.1, to see
>> whether you are still experiencing the same issue?
>>
>> It will reduce the scope of your issues by excluding many known and fixed
>> bugs in 3.0/3.1/3.2/3.3.0.
>>
>> Thanks,
>> Dongjoon.
>>
>>
>> On Wed, Oct 26, 2022 at 11:16 PM Shrikant Prasad <shrikant....@gmail.com>
>> wrote:
>>
>>> Hi Everyone,
>>>
>>> We are using Spark 3.0.1 with the Kubernetes resource manager. We are
>>> facing an intermittent issue in which the driver pod gets deleted and the
>>> driver logs contain the message that Spark Context was shutdown.
>>>
>>> The same job works fine with the given set of configurations most of the
>>> time, but sometimes it fails. It mostly occurs while reading or writing
>>> Parquet files to HDFS (but I am not sure if it is the only use case
>>> affected).
>>>
>>> Any pointers to find the root cause?
>>>
>>> Most of the earlier reported issues mention executors getting OOM as the
>>> cause, but we have not seen an OOM error in any of the executors. Also, why
>>> would the context be shut down in this case instead of retrying with new
>>> executors?
>>> Another question is why the driver pod gets deleted. Shouldn't it just
>>> error out?
>>>
>>> Regards,
>>> Shrikant
>>>
>>> --
>>> Regards,
>>> Shrikant Prasad
>>>
>> --
> Regards,
> Shrikant Prasad
>
