Aha, that makes sense. Thanks for the response! I guess error messages are one of the areas where Spark could use some love (:
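For anyone who finds this thread later, here is a minimal sketch of the rule Michael describes below, written against the Spark 1.0-era SchemaRDD API. The table name, columns, and sample data are invented for illustration; this is not the code from the pastebin.

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Hypothetical case class standing in for the real Dokument schema.
case class Dokument(id: Int, category: String)

object GroupByExample {
  def main(args: Array[String]) {
    val sc = new SparkContext(
      new SparkConf().setAppName("groupby-example").setMaster("local"))
    val sqlContext = new SQLContext(sc)
    import sqlContext.createSchemaRDD

    val docs = sc.parallelize(Seq(
      Dokument(1, "a"), Dokument(2, "a"), Dokument(3, "b")))
    docs.registerAsTable("dokumenter")

    // Invalid: id is neither in the GROUP BY clause nor inside an
    // aggregate function, so this should reproduce the
    // TreeNodeException from the thread at execution time.
    // sqlContext.sql("SELECT id, COUNT(1) FROM dokumenter GROUP BY category")

    // Valid: category is grouped by, and id only appears inside an
    // aggregate function.
    sqlContext.sql("SELECT category, COUNT(id) FROM dokumenter GROUP BY category")
      .collect()
      .foreach(println)
  }
}

The rule itself is standard SQL: every attribute in the SELECT clause must either appear in the GROUP BY clause or be wrapped in an aggregate such as COUNT or SUM.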
On Fri, Jul 18, 2014 at 9:41 PM, Michael Armbrust <mich...@databricks.com> wrote:
> Sorry for the non-obvious error message. It is not valid SQL to include
> attributes in the select clause unless they are also in the group by clause
> or are inside of an aggregate function.
>
> On Jul 18, 2014 5:12 AM, "Martin Gammelsæter" <martingammelsae...@gmail.com>
> wrote:
>>
>> Hi again!
>>
>> I am having problems when using GROUP BY on both SQLContext and
>> HiveContext (same problem).
>>
>> My code (simplified as much as possible) can be seen here:
>> http://pastebin.com/33rjW67H
>>
>> In short, I'm getting data from a Cassandra store with Datastax's new
>> driver (which works great by the way, recommended!), and mapping it to
>> a Spark SQL table through a Product class (Dokument in the source).
>> Regular SELECTs and such work fine, but once I try to do a GROUP BY,
>> I get the following error:
>>
>> Exception in thread "main" org.apache.spark.SparkException: Job
>> aborted due to stage failure: Task 0.0:25 failed 4 times, most recent
>> failure: Exception failure in TID 63 on host 192.168.121.132:
>> org.apache.spark.sql.catalyst.errors.package$TreeNodeException: No
>> function to evaluate expression. type: AttributeReference, tree: id#0
>> org.apache.spark.sql.catalyst.expressions.AttributeReference.eval(namedExpressions.scala:158)
>> org.apache.spark.sql.catalyst.expressions.MutableProjection.apply(Projection.scala:64)
>> org.apache.spark.sql.execution.Aggregate$$anonfun$execute$1$$anonfun$7$$anon$1.next(Aggregate.scala:195)
>> org.apache.spark.sql.execution.Aggregate$$anonfun$execute$1$$anonfun$7$$anon$1.next(Aggregate.scala:174)
>> scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>> scala.collection.Iterator$class.foreach(Iterator.scala:727)
>> scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>> scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
>> scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
>> scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
>> scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
>> scala.collection.AbstractIterator.to(Iterator.scala:1157)
>> scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
>> scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
>> scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
>> scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
>> org.apache.spark.rdd.RDD$$anonfun$15.apply(RDD.scala:750)
>> org.apache.spark.rdd.RDD$$anonfun$15.apply(RDD.scala:750)
>> org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1096)
>> org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1096)
>> org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:112)
>> org.apache.spark.scheduler.Task.run(Task.scala:51)
>> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> java.lang.Thread.run(Thread.java:745)
>>
>> What am I doing wrong?
>>
>> --
>> Best regards,
>> Martin Gammelsæter

--
Best regards,
Martin Gammelsæter
92209139