[ https://issues.apache.org/jira/browse/HIVE-12045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946555#comment-14946555 ]
Zsolt Tóth commented on HIVE-12045:
-----------------------------------

[~xuefuz] I attached the jar with both the generic and non-generic functions. To reproduce:
{code}
set hive.execution.engine=spark;
create function genNth as 'org.example.GenericUDFNth' using jar 'hdfs:///...';
create function nth as 'org.example.UDFNth' using jar 'hdfs:///...';
select distinct nth(1,2,1) from tab;
select distinct genNth(1,2,1) from tab;
{code}
(A sketch of what the two UDF classes might look like is appended after the quoted issue below.)

> ClassNotFound for GenericUDF in "select distinct..." query (Hive on Spark)
> --------------------------------------------------------------------------
>
>                 Key: HIVE-12045
>                 URL: https://issues.apache.org/jira/browse/HIVE-12045
>             Project: Hive
>          Issue Type: Bug
>          Components: Spark
>         Environment: Cloudera QuickStart VM, beeline
>            Reporter: Zsolt Tóth
>         Attachments: example.jar
>
>
> If I execute the following query in beeline, I get a ClassNotFoundException for the UDF class.
> {code}
> drop function myGenericUdf;
> create function myGenericUdf as 'org.example.myGenericUdf' using jar 'hdfs:///tmp/myudf.jar';
> select distinct myGenericUdf(1,2,1) from mytable;
> {code}
> In my example, myGenericUdf simply looks for the first argument's value among the remaining arguments and returns its index. I don't think the problem is related to the actual GenericUDF logic.
> Note that:
> "select myGenericUdf(1,2,1) from mytable;" succeeds.
> If I use the non-generic implementation of the same UDF, the select distinct call succeeds.
> Stack trace:
> {code}
> 15/10/06 05:20:25 ERROR exec.Utilities: Failed to load plan: hdfs://quickstart.cloudera:8020/tmp/hive/hive/f9de3f09-c12d-4528-9ee6-1f12932a14ae/hive_2015-10-06_05-20-07_438_6519207588897968406-20/-mr-10003/27cd7226-3e22-46f4-bddd-fb8fd4aa4b8d/map.xml: org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find class: org.example.myGenericUDF
> Serialization trace:
> genericUDF (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
> chidren (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
> colExprMap (org.apache.hadoop.hive.ql.exec.GroupByOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
> aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
> org.apache.hive.com.esotericsoftware.kryo.KryoException: Unable to find class: org.example.myGenericUDF
> Serialization trace:
> genericUDF (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
> chidren (org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc)
> colExprMap (org.apache.hadoop.hive.ql.exec.GroupByOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.SelectOperator)
> childOperators (org.apache.hadoop.hive.ql.exec.TableScanOperator)
> aliasToWork (org.apache.hadoop.hive.ql.plan.MapWork)
> 	at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
> 	at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
> 	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:656)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:99)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> 	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
> 	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> 	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:139)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17)
> 	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> 	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
> 	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> 	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
> 	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> 	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:776)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:139)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17)
> 	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
> 	at org.apache.hive.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:507)
> 	at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:672)
> 	at org.apache.hadoop.hive.ql.exec.Utilities.deserializeObjectByKryo(Utilities.java:1069)
> 	at org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:960)
> 	at org.apache.hadoop.hive.ql.exec.Utilities.deserializePlan(Utilities.java:974)
> 	at org.apache.hadoop.hive.ql.exec.Utilities.getBaseWork(Utilities.java:416)
> 	at org.apache.hadoop.hive.ql.exec.Utilities.getMapWork(Utilities.java:296)
> 	at org.apache.hadoop.hive.ql.io.HiveInputFormat.init(HiveInputFormat.java:268)
> 	at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:505)
> 	at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:203)
> 	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
> 	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
> 	at scala.Option.getOrElse(Option.scala:120)
> 	at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
> 	at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
> 	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
> 	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
> 	at scala.Option.getOrElse(Option.scala:120)
> 	at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
> 	at org.apache.spark.ShuffleDependency.<init>(Dependency.scala:82)
> 	at org.apache.spark.rdd.ShuffledRDD.getDependencies(ShuffledRDD.scala:80)
> 	at org.apache.spark.rdd.RDD$$anonfun$dependencies$2.apply(RDD.scala:206)
> 	at org.apache.spark.rdd.RDD$$anonfun$dependencies$2.apply(RDD.scala:204)
> 	at scala.Option.getOrElse(Option.scala:120)
> 	at org.apache.spark.rdd.RDD.dependencies(RDD.scala:204)
> 	at org.apache.spark.scheduler.DAGScheduler.visit$2(DAGScheduler.scala:338)
> 	at org.apache.spark.scheduler.DAGScheduler.getAncestorShuffleDependencies(DAGScheduler.scala:355)
> 	at org.apache.spark.scheduler.DAGScheduler.registerShuffleDependencies(DAGScheduler.scala:317)
> 	at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$getShuffleMapStage(DAGScheduler.scala:218)
> 	at org.apache.spark.scheduler.DAGScheduler$$anonfun$visit$1$1.apply(DAGScheduler.scala:301)
> 	at org.apache.spark.scheduler.DAGScheduler$$anonfun$visit$1$1.apply(DAGScheduler.scala:298)
> 	at scala.collection.immutable.List.foreach(List.scala:318)
> 	at org.apache.spark.scheduler.DAGScheduler.visit$1(DAGScheduler.scala:298)
> 	at org.apache.spark.scheduler.DAGScheduler.getParentStages(DAGScheduler.scala:310)
> 	at org.apache.spark.scheduler.DAGScheduler.newStage(DAGScheduler.scala:244)
> 	at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:731)
> 	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1362)
> 	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1354)
> 	at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
> Caused by: java.lang.ClassNotFoundException: org.example.myGenericUDF
> 	at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
> 	at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> 	at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> 	at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
> 	at java.lang.Class.forName0(Native Method)
> 	at java.lang.Class.forName(Class.java:270)
> 	at org.apache.hive.com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:136)
> 	... 72 more
> {code}
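For readers without the attachment, here is a rough sketch of what the two attached UDF classes could look like. Only the class names (org.example.GenericUDFNth, org.example.UDFNth) and the behaviour described above (return the position of the first argument's value among the remaining arguments) come from this ticket; the exact signatures, null handling, and return type are assumptions, not the actual contents of example.jar.
{code}
package org.example;

import org.apache.hadoop.hive.ql.exec.UDFArgumentLengthException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.apache.hadoop.io.IntWritable;

// Hypothetical reconstruction of the attached GenericUDFNth: returns the
// 1-based position of the first argument's value among the remaining
// arguments, or -1 if it is not found. genNth(1,2,1) would return 2.
public class GenericUDFNth extends GenericUDF {

  private transient ObjectInspector[] argOIs;

  @Override
  public ObjectInspector initialize(ObjectInspector[] arguments)
      throws UDFArgumentLengthException {
    if (arguments.length < 2) {
      throw new UDFArgumentLengthException("genNth() takes at least two arguments");
    }
    argOIs = arguments;
    return PrimitiveObjectInspectorFactory.writableIntObjectInspector;
  }

  @Override
  public Object evaluate(DeferredObject[] arguments) throws HiveException {
    Object needle = arguments[0].get();
    for (int i = 1; i < arguments.length; i++) {
      // Compare through the object inspectors; the reproduction steps pass
      // int literals, so all arguments share one primitive type here.
      if (ObjectInspectorUtils.compare(needle, argOIs[0],
          arguments[i].get(), argOIs[i]) == 0) {
        return new IntWritable(i);
      }
    }
    return new IntWritable(-1);
  }

  @Override
  public String getDisplayString(String[] children) {
    // Rendered in EXPLAIN output; the registered function name may differ.
    StringBuilder sb = new StringBuilder("genNth(");
    for (int i = 0; i < children.length; i++) {
      if (i > 0) {
        sb.append(", ");
      }
      sb.append(children[i]);
    }
    return sb.append(")").toString();
  }
}
{code}
The non-generic variant, which reportedly survives "select distinct", can be a plain reflection-based UDF; a fixed three-argument evaluate() is enough to match the nth(1,2,1) call from the reproduction steps:
{code}
package org.example;

import org.apache.hadoop.hive.ql.exec.UDF;

// Hypothetical reconstruction of the non-generic UDFNth. Hive picks the
// evaluate() overload by reflection, so this only covers the 3-argument case.
public class UDFNth extends UDF {
  public Integer evaluate(Integer needle, Integer first, Integer second) {
    if (needle == null) {
      return null;
    }
    if (needle.equals(first)) {
      return 1;
    }
    if (needle.equals(second)) {
      return 2;
    }
    return -1;
  }
}
{code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)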