It's not running on the executor; that's not the issue. See your stack
trace, where it clearly happens in the driver.
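
One thing worth trying is to build the session inside main rather than in a
top-level val of the object. This is a minimal sketch, assuming the failure
originates in the TestMain$.<init> frame (TestMain.scala:7) shown in the trace,
which is the object initializer. It is not a confirmed explanation of the
difference between 2.3 and 3.0.1.

```scala
import org.apache.spark.sql.SparkSession

object TestMain {

  def main(args: Array[String]): Unit = {
    // Assumption: creating the session here keeps getOrCreate() out of the
    // TestMain$ initializer (the <init>/<clinit> frames in the stack trace).
    val session = SparkSession.builder()
      .appName("test")
      .enableHiveSupport()
      .getOrCreate()

    import session.implicits._

    // Same pipeline as the original sample: RDD -> DataFrame -> RDD -> collect.
    val a = session.sparkContext
      .parallelize(Array(("A", 1), ("B", 2)))
      .toDF("_c1", "_c2")
      .rdd
      .map(row => row(0).toString)
      .collect()

    println(a.mkString("|"))

    session.stop()
  }
}
```

If this version runs under Spark 3.0.1 with the same spark-submit settings,
that points at the object-level val rather than at the master configuration.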

On Mon, Jan 2, 2023 at 8:58 AM Shrikant Prasad <shrikant....@gmail.com>
wrote:

> Even if I set the master as yarn, it will not have access to the rest of the
> Spark confs. It will need spark.yarn.app.id.
>
> The main issue is: if it works as is in Spark 2.3, why does it not work in
> Spark 3, i.e. why is the session getting created on the executor?
> Another thing we tried is removing the df to rdd conversion, just for
> debugging, and it works in Spark 3.
>
> So it might have something to do with the df to rdd conversion, or with a
> serialization behavior change from Spark 2.3 to Spark 3.0, if there is any.
> But we couldn't find the root cause.
>
> Regards,
> Shrikant
>
> On Mon, 2 Jan 2023 at 7:54 PM, Sean Owen <sro...@gmail.com> wrote:
>
>> So call .setMaster("yarn"), per the error
>>
>> On Mon, Jan 2, 2023 at 8:20 AM Shrikant Prasad <shrikant....@gmail.com>
>> wrote:
>>
>>> We are running it in cluster deploy mode with yarn.
>>>
>>> Regards,
>>> Shrikant
>>>
>>> On Mon, 2 Jan 2023 at 6:15 PM, Stelios Philippou <stevo...@gmail.com>
>>> wrote:
>>>
>>>> Can we see your Spark Configuration parameters ?
>>>>
>>>> The master URL is set like this in Java:
>>>> new SparkConf()....setMaster("local[*]")
>>>> according to where you want to run this
>>>>
>>>> On Mon, 2 Jan 2023 at 14:38, Shrikant Prasad <shrikant....@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I am trying to migrate one spark application from Spark 2.3 to 3.0.1.
>>>>>
>>>>> The issue can be reproduced using the sample code below:
>>>>>
>>>>> object TestMain {
>>>>>
>>>>>   val session =
>>>>>     SparkSession.builder().appName("test").enableHiveSupport().getOrCreate()
>>>>>
>>>>>   def main(args: Array[String]): Unit = {
>>>>>
>>>>>     import session.implicits._
>>>>>     val a = session.sparkContext.parallelize(Array(("A", 1), ("B", 2)))
>>>>>       .toDF("_c1", "_c2").rdd.map(x => x(0).toString).collect()
>>>>>     println(a.mkString("|"))
>>>>>
>>>>>   }
>>>>> }
>>>>>
>>>>> It runs successfully in Spark 2.3 but fails in Spark 3.0.1 with the
>>>>> exception below:
>>>>>
>>>>> Caused by: org.apache.spark.SparkException: A master URL must be set in your configuration
>>>>>     at org.apache.spark.SparkContext.<init>(SparkContext.scala:394)
>>>>>     at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2690)
>>>>>     at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:949)
>>>>>     at scala.Option.getOrElse(Option.scala:189)
>>>>>     at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:943)
>>>>>     at TestMain$.<init>(TestMain.scala:7)
>>>>>     at TestMain$.<clinit>(TestMain.scala)
>>>>>
>>>>>
>>>>> From the exception it appears that Spark 3 also tries to create the Spark
>>>>> session on the executor, whereas in Spark 2.3 it is not created again on
>>>>> the executor.
>>>>>
>>>>> Can anyone help in identifying why there is this change in behavior?
>>>>>
>>>>> Thanks and Regards,
>>>>>
>>>>> Shrikant
>>>>>
>>>>> --
>>>>> Regards,
>>>>> Shrikant Prasad
>>>>>
