Thanks for the pointer, Bryan! Sounds like I was on the right track in terms of what's available for now.
(And Gourav -- I'm certainly interested in migrating to Scala, but our team is mostly Java, Python, and R based right now!)

On Thu, Jul 21, 2016 at 11:00 PM, Bryan Cutler <cutl...@gmail.com> wrote:

> Everett, I had the same question today and came across this old thread.
> Not sure if there has been any more recent work to support this.
> http://apache-spark-developers-list.1001551.n3.nabble.com/Using-UDFs-in-Java-without-registration-td12497.html
>
> On Thu, Jul 21, 2016 at 10:10 AM, Everett Anderson
> <ever...@nuna.com.invalid> wrote:
>
>> Hi,
>>
>> In the Java Spark DataFrames API, you can create a UDF, register it, and
>> then access it by string name by using the convenience UDF classes in
>> org.apache.spark.sql.api.java
>> <https://spark.apache.org/docs/latest/api/java/org/apache/spark/sql/api/java/package-summary.html>.
>>
>> Example
>>
>> UDF1<String, Long> testUdf1 = new UDF1<>() { ... }
>>
>> sqlContext.udf().register("testfn", testUdf1, DataTypes.LongType);
>>
>> DataFrame df2 = df.withColumn("new_col", functions.callUDF("testfn",
>> df.col("old_col")));
>>
>> However, I'd like to avoid registering these by name, if possible, since
>> I have many of them and would need to deal with name conflicts.
>>
>> There are udf() methods like this that seem to be from the Scala API
>> <https://spark.apache.org/docs/latest/api/java/org/apache/spark/sql/functions.html#udf(scala.Function1,%20scala.reflect.api.TypeTags.TypeTag,%20scala.reflect.api.TypeTags.TypeTag)>,
>> where you don't have to register everything by name first.
>>
>> However, using those methods from Java would require interacting with
>> Scala's scala.reflect.api.TypeTags.TypeTag. I'm having a hard time
>> figuring out how to create a TypeTag from Java.
>>
>> Does anyone have an example of using the udf() methods from Java?
>>
>> Thanks!
>>
>> - Everett