Hi,
Thank you for the quick answers!
Ndjido Ardo: actually the Python part is not legacy code. I'm not familiar
with UDFs in Spark, do you have some examples of Python UDFs and how to use
them in Scala code?
Holden Karau: the pipe interface seems like a good solution, but I'm a bit
concerned about
When faced with this issue I followed the approach taken by pyspark and used
py4j. You have to:
- ensure your code is Java compatible
- use py4j to call the Java (Scala) code from Python
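For example, the Python side can look roughly like this (a minimal sketch;
the add() method on the entry point is just a hypothetical placeholder for
your own Java-compatible code):

    from py4j.java_gateway import JavaGateway

    # Connect to a py4j GatewayServer that the Scala/Java side must already
    # be running, e.g. new GatewayServer(entryPoint).start() on the JVM
    gateway = JavaGateway()

    # entry_point is whatever object the JVM side handed to the GatewayServer;
    # add() is a hypothetical Java-compatible method defined on it
    result = gateway.entry_point.add(1, 2)
    print(result)  # computed on the JVM, returned over the gateway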
So if there are just a few Python functions you're interested in accessing,
you can also use the pipe interface (you'll have to manually serialize your
data on both ends in ways that Python and Scala can respectively parse) -
but it's a very generic approach and can work with many different languages.
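As a rough sketch, on the Scala side you'd call something like
rdd.pipe("python my_script.py") (the script name is made up), and
my_script.py just maps lines on stdin to lines on stdout:

    import sys

    # RDD.pipe() feeds each element of a partition to this script as one
    # line on stdin, and every line printed becomes one element of the
    # resulting RDD
    for line in sys.stdin:
        record = line.rstrip("\n")
        # stand-in for the actual Python function you want to call
        result = record.upper()
        sys.stdout.write(result + "\n")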
Hi Didier,
I think with PySpark you can wrap your legacy Python functions into UDFs
and use them in your DataFrames. But you have to use DataFrames instead of
RDDs.
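Something like this minimal sketch (the function and column names are just
made-up examples):

    from pyspark import SparkContext
    from pyspark.sql import SQLContext
    from pyspark.sql.functions import udf
    from pyspark.sql.types import IntegerType

    sc = SparkContext()
    sqlContext = SQLContext(sc)

    # stand-in for the existing Python function you want to reuse
    def word_length(s):
        return len(s)

    word_length_udf = udf(word_length, IntegerType())

    df = sqlContext.createDataFrame([("hello",), ("spark",)], ["word"])
    df.withColumn("length", word_length_udf(df["word"])).show()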
cheers,
Ardo
On Mon, Apr 18, 2016 at 7:13 PM, didmar wrote:
> Hi,
>
> I have a Spark project in Scala and I would like to call some