I'm using Scala 2.11 and Spark 2.4.7, which are required by other parts of the
solution. I run the program with `spark-submit`. I am stuck on a problem that
occurs on the second statement below.


    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("MyApp.py").master("local[8]").getOrCreate()

    # Fails here, before the call ever reaches the JVM (traceback below):
    db2graph = spark._jvm.com.company.project.MyScalaClass(spark)

1. First, I construct a SparkSession called `spark`.
2. Then I try to instantiate a Scala class and pass it the SparkSession, which
the Scala class requires.

I'm attempting to do this through Py4J's gateway, which passes the request
from the Python VM to the JVM. But I get the following error before Py4J even
attempts the call to MyScalaClass:

      File "/home/ytian/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1516, in __call__
      File "/home/ytian/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1516, in <listcomp>
      File "/home/ytian/spark/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py", line 298, in get_command_part
    AttributeError: 'SparkSession' object has no attribute '_get_object_id'
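For what it's worth, I can confirm the failure is in Py4J's argument
conversion rather than in locating the class. A minimal sketch run in the same
session (java.util.ArrayList is just a stand-in JVM object):

    # Objects created through spark._jvm are py4j JavaObject instances and
    # define _get_object_id, so Py4J can pass them back over the gateway:
    j_list = spark._jvm.java.util.ArrayList()
    j_list.add("works")                       # fine: str is a Py4J primitive

    # The PySpark SparkSession is a plain Python wrapper, not a JavaObject,
    # so protocol.get_command_part() has nothing to serialize it with:
    print(hasattr(spark, "_get_object_id"))   # False -> the AttributeError above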

I suspect the second line of my code (above) is doing something wrong, or not
doing something it needs to do, but I haven't found anything that explains how
to do it correctly.

- In `py4j/protocol.py`
(https://github.com/bartdag/py4j/blob/master/py4j-python/src/py4j/protocol.py,
line 298), `get_command_part` does indeed call `_get_object_id` on each
argument. Since the Python SparkSession has no such method, the problem must
be that I have to somehow wrap or convert the SparkSession before passing it
(see the sketch after this list).
- I did find references
(https://www.py4j.org/advanced_topics.html#accessing-java-collections-and-arrays-from-python)
on how to wrap/unwrap Python objects to and from Java, but SparkSession isn't
any of those types.
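
The closest thing I've found to a fix is to hand Py4J the Java-side session
that the Python wrapper holds internally, rather than the wrapper itself. A
minimal sketch, assuming MyScalaClass's constructor takes an
org.apache.spark.sql.SparkSession (I haven't seen this documented, so treat
the `_jsparkSession` attribute as internal API):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("MyApp.py").master("local[8]").getOrCreate()

    # spark._jsparkSession is the underlying org.apache.spark.sql.SparkSession,
    # already a py4j JavaObject, so Py4J can pass it across the gateway:
    db2graph = spark._jvm.com.company.project.MyScalaClass(spark._jsparkSession)

Is that the right approach, or is there a supported way to do this without
reaching into the `_j*` internals?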