I forgot to mention, I can do:
sparkSession.sql("create function z as 'QualityToString'");

prior to starting HiveThriftServer2, and that will register the UDF, but only in 
the default database.  It won’t be present in other databases. I can register 
it again in each other database as needed, but it seems to me that 
shouldn’t be necessary.
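For what it's worth, if per-database registration turns out to be unavoidable, the statement can at least be generated for every database by qualifying the function name (`create function <db>.z as '...'`). A minimal sketch of that workaround — `UdfRegistrar` and the hard-coded database list are hypothetical; in practice the names would presumably come from `sparkSession.catalog().listDatabases()` and each statement would be run through `sparkSession.sql(...)`:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: builds one CREATE FUNCTION statement per database,
// since an unqualified "create function z as '...'" only registers the
// UDF in the current (default) database.
public class UdfRegistrar {

    public static List<String> createFunctionStatements(List<String> dbNames,
                                                        String fnName,
                                                        String className) {
        return dbNames.stream()
                .map(db -> "create function " + db + "." + fnName
                        + " as '" + className + "'")
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Each generated statement would then be passed to sparkSession.sql(...)
        for (String stmt : createFunctionStatements(
                Arrays.asList("default", "otherdb"), "z", "QualityToString")) {
            System.out.println(stmt);
        }
    }
}
```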

Further detail on QualityToString: for testing purposes I’m implementing the Spark 
UDF and Hive UDF interfaces in one class:
import java.io.Serializable;

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;
import org.apache.spark.sql.api.java.UDF1;

public class QualityToString extends UDF implements UDF1<Integer, String>,
        Serializable {

    // Spark UDF1 entry point
    @Override
    public String call(Integer t1) throws Exception {
        return QUALITY.toString(t1);
    }

    // Hive UDF entry point; delegates to the Spark implementation
    public Text evaluate(Integer i) throws Exception {
        return new Text(this.call(i));
    }
}


Shawn Lavelle
Software Development

4101 Arrowhead Drive
Medina, Minnesota 55340-9457
Phone: 763 551 0559
Fax: 763 551 0750
Email: shawn.lave...@osii.com<mailto:shawn.lave...@osii.com>
Website: www.osii.com<http://www.osii.com>

From: Lavelle, Shawn
Sent: Tuesday, February 28, 2017 10:25 AM
To: user@spark.apache.org
Subject: Register Spark UDF for use with Hive Thriftserver/Beeline

Hello all,

   I’m trying to make my custom UDFs available from a Beeline session via the Hive 
ThriftServer.  I’ve been successful in registering them via my DataSource API, as 
it provides the current SQLContext. However, the UDFs are not available at 
initial connection, meaning a query won’t parse because the UDFs aren’t yet 
registered.

   What is the right way to register a Spark UDF so that it is available over the 
HiveThriftServer at connect time?

The following haven’t worked, but perhaps my timing is off?

        this.sparkSession = SparkSession.builder()…
        SQLContext sqlContext = sparkSession.sqlContext();
        sqlContext.sql("create temporary function z as 'QualityToString'");
        sparkSession.udf().register("v", new QualityToString(),
                QualityToString.returnType());
        SparkSession.setDefaultSession(this.sparkSession);
        HiveThriftServer2.startWithContext(sqlContext);

Neither z nor v shows up.  I’ve tried registering after starting the 
HiveThriftServer, to no avail. I’ve also tried grabbing the Spark or SQL 
context at user authentication time, but to no avail either.  (Registering at 
authentication time worked in Hive 0.11 and Shark 0.9.2, but I suspect the 
session is now created differently and/or after authentication.)

I’m sure I’m not the first person to want to use Spark SQL UDFs in Hive via 
Beeline — how should I be registering them?

Thank you!

~ Shawn
