Hi, I am using the latest Calliope library from tuplejump.com to create an RDD from a Cassandra table. I am on a 3-node Spark 1.1.0 cluster with YARN.
My Cassandra table is defined as below, with about 2000 rows of data inserted:

    CREATE TABLE top_shows (
      program_id varchar,
      view_minute timestamp,
      view_count counter,
      PRIMARY KEY (view_minute, program_id)  // note that view_minute is the partition key
    );

Here are the simple steps I ran from spark-shell on the master node:

    spark-shell --master yarn-client --jars rna/rna-spark-streaming-assembly-1.0-SNAPSHOT.jar \
      --driver-memory 512m --executor-memory 512m --num-executors 3 --executor-cores 1

    // Import the necessary classes
    import org.apache.spark.rdd.RDD
    import org.apache.spark.SparkContext
    import com.tuplejump.calliope.utils.RichByteBuffer._
    import com.tuplejump.calliope.Implicits._
    import com.tuplejump.calliope.CasBuilder
    import com.tuplejump.calliope.Types.{CQLRowKeyMap, CQLRowMap}

    // Define my case class and the implicit conversion from a row
    case class ProgramViewCount(viewMinute: Long, program: String, viewCount: Long)

    implicit def keyValtoProgramViewCount(key: CQLRowKeyMap, values: CQLRowMap): ProgramViewCount =
      ProgramViewCount(
        key.get("view_minute").get.getLong,
        key.get("program_id").get,  // .get (not .toString) so the RichByteBuffer implicit converts the ByteBuffer to a String
        values.get("view_count").get.getLong)

    // Use the cql3 interface to read from the table with a WHERE predicate
    val cas = CasBuilder.cql3.withColumnFamily("streaming_qa", "top_shows")
      .onHost("23.22.120.96")
      .where("view_minute = 1413861780000")
    val allPrograms = sc.cql3Cassandra[ProgramViewCount](cas)

    // Lazy evaluation up to this point
    val rowCount = allPrograms.count

I hit the following exception. It seems that it does not like my WHERE clause: if I do not have the WHERE clause, it works fine, but with the WHERE clause it fails with the exception below no matter whether the predicate is on the partition key or not. Could anyone else using the Calliope package shed some light? Thanks a lot.

Tian

    scala> val rowCount = allPrograms.count
    ....
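A side note on the mapping above: Cassandra serializes a `timestamp` column as an 8-byte big-endian long holding milliseconds since the epoch, which is why the implicit conversion can call `getLong` on the raw ByteBuffer and why the WHERE literal is written in milliseconds. A minimal, self-contained illustration of that round trip (plain JVM, no Spark or Cassandra needed):

```scala
import java.nio.ByteBuffer
import java.util.Date

// The millisecond literal used in the WHERE clause above.
val viewMinuteMillis = 1413861780000L

// Serialize the value the way Cassandra stores a timestamp column:
// an 8-byte big-endian long (ByteBuffer's default byte order).
val buf = ByteBuffer.allocate(8).putLong(viewMinuteMillis)
buf.flip()  // rewind the buffer for reading

// Decode it back the way the row mapper does.
val decoded = buf.getLong

println(new Date(decoded))  // the minute this partition key denotes
```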
    14/10/21 23:26:07 WARN scheduler.TaskSetManager: Lost task 2.0 in stage 0.0
    (TID 2, ip-10-187-51-136.ec2.internal): java.lang.RuntimeException
        at com.tuplejump.calliope.hadoop.cql3.CqlPagingRecordReader$RowIterator.executeQuery(CqlPagingRecordReader.java:665)
        at com.tuplejump.calliope.hadoop.cql3.CqlPagingRecordReader$RowIterator.<init>(CqlPagingRecordReader.java:301)
        at com.tuplejump.calliope.hadoop.cql3.CqlPagingRecordReader.initialize(CqlPagingRecordReader.java:167)
        at com.tuplejump.calliope.cql3.Cql3CassandraRDD$$anon$1.<init>(Cql3CassandraRDD.scala:75)
        at com.tuplejump.calliope.cql3.Cql3CassandraRDD.compute(Cql3CassandraRDD.scala:64)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
        at org.apache.spark.scheduler.Task.run(Task.scala:54)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/spark-1-1-0-RDD-and-Calliope-1-1-0-CTP-U2-H2-tp16975.html
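For reference, since the read works without the WHERE clause, one workaround (a sketch only, not verified against Calliope) would be to drop `.where(...)` and filter on the Spark side instead. The predicate is plain Scala; it is shown here against a local `Seq` of hypothetical sample rows so it can be checked without a cluster, but on the real RDD it would be the same call, i.e. `allPrograms.filter(inMinute).count`:

```scala
case class ProgramViewCount(viewMinute: Long, program: String, viewCount: Long)

// The predicate the CQL WHERE clause was meant to express.
val targetMinute = 1413861780000L
def inMinute(p: ProgramViewCount): Boolean = p.viewMinute == targetMinute

// Locally, against made-up sample data (on the cluster this would be
// an RDD[ProgramViewCount] read without the .where(...) clause):
val sample = Seq(
  ProgramViewCount(1413861780000L, "show-a", 42L),
  ProgramViewCount(1413861840000L, "show-b", 7L)
)
val rowCount = sample.filter(inMinute).size
println(rowCount)  // 1
```

Note the trade-off: this scans the whole table and filters in Spark rather than pushing the predicate down to Cassandra, which is acceptable here at ~2000 rows but would matter at larger scale.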