Please file a JIRA: https://issues.apache.org/jira/browse/SPARK/
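Until the array mode works, the workaround described in the quoted message (one percentile(...) call per requested value) can be generated rather than typed by hand. The following is only a minimal sketch: the helper names percentile_columns and workaround_query are hypothetical and are not part of Spark or Hive, and the table and column names are taken from the query in the report. It only builds the SQL string and assumes the identifiers come from a trusted source.

```python
def percentile_columns(column, percentiles):
    """Build one percentile(col, p) expression per requested percentile."""
    return ", ".join(f"percentile({column}, {p})" for p in percentiles)


def workaround_query(table, group_col, value_col, percentiles):
    """Build a grouped query that avoids the array form of percentile()."""
    cols = percentile_columns(value_col, percentiles)
    return f"select {group_col}, {cols} from {table} group by {group_col}"


# Same percentiles as the failing array(0,0.25,0.5,0.75,1) call in the report.
query = workaround_query("exam", "name", "turnaroundtime", [0, 0.25, 0.5, 0.75, 1])
print(query)
```

The resulting string can be submitted through the same JDBC session; each percentile comes back as its own column instead of as one array column.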
On Thu, Oct 9, 2014 at 6:48 PM, Anand Mohan <chinn...@gmail.com> wrote:

> Hi,
>
> I just noticed the Percentile UDAF PR being merged into trunk and decided
> to test it.
> So pulled in today's trunk and tested the percentile queries.
> They work marvelously, Thanks a lot for bringing this into Spark SQL.
>
> However Hive percentile UDAF also supports an array mode where in you can
> give the list of percentiles that you want and it would return an array of
> double values one for each requested percentile.
> This query is failing with the below error. However a query with the
> individual percentiles like
> percentile(turnaroundtime,0.25),percentile(turnaroundtime,0.5),percentile(turnaroundtime,0.75)
> is working. (and so this issue is not of a high priority as there is this
> workaround for us)
>
> Thanks,
> Anand Mohan
>
> 0: jdbc:hive2://dev-uuppala.sfohi.philips.com> select name,
> percentile(turnaroundtime,array(0,0.25,0.5,0.75,1)) from exam group by name;
>
> Error: org.apache.spark.SparkException: Job aborted due to stage failure:
> Task 1 in stage 25.0 failed 4 times, most recent failure: Lost task 1.3 in
> stage 25.0 (TID 305, Dev-uuppala.sfohi.philips.com):
> java.lang.ClassCastException: scala.collection.mutable.ArrayBuffer cannot
> be cast to [Ljava.lang.Object;
>
>     org.apache.hadoop.hive.serde2.objectinspector.StandardListObjectInspector.getListLength(StandardListObjectInspector.java:83)
>     org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$ListConverter.convert(ObjectInspectorConverters.java:259)
>     org.apache.hadoop.hive.ql.udf.generic.GenericUDFUtils$ConversionHelper.convertIfNecessary(GenericUDFUtils.java:349)
>     org.apache.hadoop.hive.ql.udf.generic.GenericUDAFBridge$GenericUDAFBridgeEvaluator.iterate(GenericUDAFBridge.java:170)
>     org.apache.spark.sql.hive.HiveUdafFunction.update(hiveUdfs.scala:342)
>     org.apache.spark.sql.execution.Aggregate$$anonfun$execute$1$$anonfun$7.apply(Aggregate.scala:167)
>     org.apache.spark.sql.execution.Aggregate$$anonfun$execute$1$$anonfun$7.apply(Aggregate.scala:151)
>     org.apache.spark.rdd.RDD$$anonfun$13.apply(RDD.scala:599)
>     org.apache.spark.rdd.RDD$$anonfun$13.apply(RDD.scala:599)
>     org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
>     org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
>     org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
>     org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
>     org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
>     org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
>     org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
>     org.apache.spark.scheduler.Task.run(Task.scala:56)
>     org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:181)
>     java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     java.lang.Thread.run(Thread.java:745)
> Driver stacktrace: (state=,code=0)