PySpark .collect() output to Scala Array[Row]

2020-05-25 Thread Nick Ruest
Hi, I've hit a wall trying to implement a couple of Scala methods in a Python version of our project. My Python function looks like this:

    def Write_Graphml(data, graphml_path, sc):
        return sc.getOrCreate()._jvm.io.archivesunleashed.app.WriteGraphML(data, graphml_path).apply

Whe

Re: PySpark .collect() output to Scala Array[Row]

2020-05-25 Thread Sean Owen
(This is better for user@.) You have an object, which can't be instantiated. You could make it a class to make it instantiable, but first try writing ... WriteGraphML.apply(...) in Python instead.

On Mon, May 25, 2020 at 1:23 PM Nick Ruest wrote:
>
> Hi,
>
> I've hit a wall with trying to just imp

Re: Inconsistent schema on Encoders.bean (reported issues from user@)

2020-05-25 Thread Jungtaek Lim
I meant that how Java Beans should be interpreted in Spark is not consistently defined. Contrary to what you guessed, in most code paths Spark uses "read-only" properties (all of the failing existing tests in my experiment involve "read-only" properties). The problematic case is when a Java bean is used for read-write; one case i