I forgot to mention: I can run

    sparkSession.sql("create function z as 'QualityToString'");

prior to starting HiveThriftServer2, and that will register the UDF, but only in the default database. It won't be present in other databases. I can register it again in the other databases as needed (see the sketch just below), but it seems to me that shouldn't be necessary.
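For what it's worth, the per-database re-registration looks something like this. This is a minimal sketch of my workaround; the database names here are illustrative, not from my actual setup:

    // Hypothetical workaround sketch: CREATE FUNCTION is scoped to a single
    // database, so issue it once per database that needs the UDF.
    // Database names below are made up for illustration.
    for (String db : new String[] {"default", "somedb", "anotherdb"}) {
        sparkSession.sql("CREATE FUNCTION " + db + ".z AS 'QualityToString'");
    }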
Further detail on QualityToString: for testing purposes I'm implementing the Spark UDF and the Hive UDF in one class:

    public class QualityToString extends UDF implements UDF1<Integer, String>, Serializable {
        @Override
        public String call(Integer t1) throws Exception {
            return QUALITY.toString(t1);
        }

        public Text evaluate(Integer i) throws Exception {
            return new Text(this.call(i));
        }
    }

(A small sketch showing both call paths of this class appears after the quoted message below.)

Shawn Lavelle
Software Development
4101 Arrowhead Drive
Medina, Minnesota 55340-9457
Phone: 763 551 0559
Fax: 763 551 0750
Email: shawn.lave...@osii.com
Website: www.osii.com

From: Lavelle, Shawn
Sent: Tuesday, February 28, 2017 10:25 AM
To: user@spark.apache.org
Subject: Register Spark UDF for use with Hive Thriftserver/Beeline

Hello all,

I'm trying to make my custom UDFs available from a Beeline session via the Hive ThriftServer. I've been successful in registering them via my DataSource API, as it provides the current sqlContext. However, the UDFs are not accessible at initial connection, meaning a query won't parse because the UDFs aren't yet registered.

What is the right way to register a Spark UDF so that it is available over the HiveThriftServer at connect time? The following haven't worked, but perhaps my timing is off?

    this.sparkSession = SparkSession.builder()...
    SQLContext sqlContext = sparkSession.sqlContext();
    sqlContext.sql("create temporary function z as 'QualityToString'");
    sparkSession.udf().register("v", new QualityToString(), QualityToString.returnType());
    SparkSession.setDefaultSession(this.sparkSession);
    HiveThriftServer2.startWithContext(sqlContext);

Neither z nor v shows up. I've tried registering after starting the HiveThriftServer, also to no avail. I've tried grabbing the Spark or SQL context at user authentication time, but to no avail either. (Registering at authentication time worked in Hive 0.11 and Shark 0.9.2, but I suspect the session is now created differently and/or after authentication.)

I'm sure I'm not the first person to want to use Spark SQL UDFs in Hive via Beeline. How should I be registering them?

Thank you!

~ Shawn
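For completeness, here is the sketch of both call paths of the dual-role class mentioned above. This is a minimal, hypothetical test harness of my own, assuming QUALITY.toString(Integer) resolves as in the class definition; it just exercises the class directly, outside of Spark and Hive:

    import org.apache.hadoop.io.Text;

    public class QualityToStringDemo {
        public static void main(String[] args) throws Exception {
            QualityToString udf = new QualityToString();

            // Spark path: UDF1.call(...) is what Spark invokes for the
            // function registered as "v".
            String viaSpark = udf.call(3);

            // Hive path: evaluate(...) is what Hive's function registry
            // resolves for the function registered as "z".
            Text viaHive = udf.evaluate(3);

            System.out.println(viaSpark + " / " + viaHive.toString());
        }
    }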