Thanks for the pointer, Bryan! Sounds like I was on the right track in
terms of what's available for now.

(And Gourav -- I'm certainly interested in migrating to Scala, but our team
is mostly Java, Python, and R based right now!)
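
For now, one way to keep registering by name while sidestepping the name
conflicts is to generate the names instead of choosing them by hand. Below is
a minimal sketch of that idea, not anything Spark provides: UdfNames and its
unique() method are hypothetical names made up for illustration, and the
Spark calls in the trailing comment are just the ones from my original
example.

```java
import java.util.UUID;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper (not part of Spark): hands out registration names that
// do not repeat within one JVM, so UDFs can be registered without picking
// names by hand.
final class UdfNames {
    // Random component fixed once per JVM, so names generated by separate
    // applications are also very unlikely to clash.
    private static final String RUN_ID =
            UUID.randomUUID().toString().replace("-", "");

    // Incremented on every call, so each name within this JVM is distinct.
    private static final AtomicLong COUNTER = new AtomicLong();

    private UdfNames() {}

    // Returns "<hint>_<32 hex chars>_<counter>"; the hint only aids debugging.
    static String unique(String hint) {
        return hint + "_" + RUN_ID + "_" + COUNTER.getAndIncrement();
    }
}

// Intended use with the calls from the original example (assumes the same
// sqlContext, testUdf1, and df):
//
//   String name = UdfNames.unique("testfn");
//   sqlContext.udf().register(name, testUdf1, DataTypes.LongType);
//   DataFrame df2 = df.withColumn("new_col",
//       functions.callUDF(name, df.col("old_col")));
```

This still registers every UDF, so it only hides the naming problem rather
than giving a true registration-free udf() call from Java.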


On Thu, Jul 21, 2016 at 11:00 PM, Bryan Cutler <cutl...@gmail.com> wrote:

> Everett, I had the same question today and came across this old thread.
> Not sure if there has been any more recent work to support this.
> http://apache-spark-developers-list.1001551.n3.nabble.com/Using-UDFs-in-Java-without-registration-td12497.html
>
>
> On Thu, Jul 21, 2016 at 10:10 AM, Everett Anderson <
> ever...@nuna.com.invalid> wrote:
>
>> Hi,
>>
>> In the Java Spark DataFrames API, you can create a UDF, register it, and
>> then access it by string name by using the convenience UDF classes in
>> org.apache.spark.sql.api.java
>> <https://spark.apache.org/docs/latest/api/java/org/apache/spark/sql/api/java/package-summary.html>
>> .
>>
>> Example
>>
>> UDF1<String, Long> testUdf1 = new UDF1<String, Long>() { ... };
>>
>> sqlContext.udf().register("testfn", testUdf1, DataTypes.LongType);
>>
>> DataFrame df2 = df.withColumn("new_col", functions.callUDF("testfn",
>> df.col("old_col")));
>>
>> However, I'd like to avoid registering these by name, if possible, since
>> I have many of them and would need to deal with name conflicts.
>>
>> There are udf() methods like this that seem to be from the Scala API
>> <https://spark.apache.org/docs/latest/api/java/org/apache/spark/sql/functions.html#udf(scala.Function1,%20scala.reflect.api.TypeTags.TypeTag,%20scala.reflect.api.TypeTags.TypeTag)>,
>> where you don't have to register everything by name first.
>>
>> However, using those methods from Java would require interacting with
>> Scala's scala.reflect.api.TypeTags.TypeTag. I'm having a hard time
>> figuring out how to create a TypeTag from Java.
>>
>> Does anyone have an example of using the udf() methods from Java?
>>
>> Thanks!
>>
>> - Everett
>>
>>
>
