Aha, that makes sense. Thanks for the response! I guess error
messages are one of the areas where Spark could use some love (:

On Fri, Jul 18, 2014 at 9:41 PM, Michael Armbrust
<mich...@databricks.com> wrote:
> Sorry for the non-obvious error message. It is not valid SQL to include
> attributes in the select clause unless they are also in the group by clause
> or inside an aggregate function.
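>
> For example, a minimal sketch (the table and column names here are
> invented, not taken from Martin's code):
>
>     // "name" is selected but neither grouped nor aggregated -- invalid:
>     //   sqlContext.sql("SELECT id, name FROM docs GROUP BY id")
>     // Every selected attribute is grouped or aggregated -- valid:
>     sqlContext.sql("SELECT id, COUNT(name) FROM docs GROUP BY id")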
>
> On Jul 18, 2014 5:12 AM, "Martin Gammelsæter" <martingammelsae...@gmail.com>
> wrote:
>>
>> Hi again!
>>
>> I am having problems when using GROUP BY on both SQLContext and
>> HiveContext (same problem).
>>
>> My code (simplified as much as possible) can be seen here:
>> http://pastebin.com/33rjW67H
>>
>> In short, I'm getting data from a Cassandra store with Datastax's new
>> driver (which works great by the way, recommended!) and mapping it to
>> a Spark SQL table through a Product class (Dokument in the source),
>> roughly as sketched below.
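>>
>> Roughly like this (a sketch only; the field names and the Cassandra
>> access below are placeholders, the real code is in the pastebin):
>>
>>     // Placeholder Product class; the real Dokument fields differ.
>>     case class Dokument(id: String, innhold: String)
>>
>>     import sqlContext.createSchemaRDD  // implicit RDD[Product] -> SchemaRDD
>>     val dokumenter = cassandraRows.map(row =>
>>       Dokument(row.getString("id"), row.getString("innhold")))
>>     dokumenter.registerAsTable("dokumenter")  // now queryable via sql(...)
>>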
>> Regular SELECTs work fine, but once I try to do a GROUP BY, I get
>> the following error:
>>
>> Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0.0:25 failed 4 times, most recent failure: Exception failure in TID 63 on host 192.168.121.132: org.apache.spark.sql.catalyst.errors.package$TreeNodeException: No function to evaluate expression. type: AttributeReference, tree: id#0
>>         org.apache.spark.sql.catalyst.expressions.AttributeReference.eval(namedExpressions.scala:158)
>>         org.apache.spark.sql.catalyst.expressions.MutableProjection.apply(Projection.scala:64)
>>         org.apache.spark.sql.execution.Aggregate$$anonfun$execute$1$$anonfun$7$$anon$1.next(Aggregate.scala:195)
>>         org.apache.spark.sql.execution.Aggregate$$anonfun$execute$1$$anonfun$7$$anon$1.next(Aggregate.scala:174)
>>         scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>>         scala.collection.Iterator$class.foreach(Iterator.scala:727)
>>         scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>>         scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
>>         scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
>>         scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
>>         scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
>>         scala.collection.AbstractIterator.to(Iterator.scala:1157)
>>         scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
>>         scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
>>         scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
>>         scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
>>         org.apache.spark.rdd.RDD$$anonfun$15.apply(RDD.scala:750)
>>         org.apache.spark.rdd.RDD$$anonfun$15.apply(RDD.scala:750)
>>         org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1096)
>>         org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1096)
>>         org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:112)
>>         org.apache.spark.scheduler.Task.run(Task.scala:51)
>>         org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
>>         java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>         java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>         java.lang.Thread.run(Thread.java:745)
>>
>> What am I doing wrong?
>>
>> --
>> Best regards,
>> Martin Gammelsæter



-- 
Best regards,
Martin Gammelsæter
92209139
