Re: How to make a symbol for one column in Spark SQL.

2014-12-04 Thread Tim Chou
... Thank you! I'm so stupid... This is the only thing I missed in the tutorial... orz Thanks, Tim 2014-12-04 16:49 GMT-06:00 Michael Armbrust : > You need to import sqlContext._ > > On Thu, Dec 4, 2014 at 2:26 PM, Tim Chou wrote: > >> I have tried to use function where
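
For reference, a minimal sketch of the fix Michael describes, under the Spark 1.1-era SchemaRDD DSL (the region table and its num/str1/str2 columns come from the original question below):

    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    import sqlContext._  // brings in the implicits that let a symbol like 'num act as a column

    val results = sqlContext.sql("select * from region")
    // With the implicits in scope, symbols work directly in where/select:
    val filtered = results.where('num > 1).select('str1, 'str2)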

How to make a symbol for one column in Spark SQL.

2014-12-04 Thread Tim Chou
I have tried to use the where and filter functions on a SchemaRDD. I have built a case class for the tuples/records in the table, like this: case class Region(num:Int, str1:String, str2:String) I also successfully created a SchemaRDD. scala> val results = sqlContext.sql("select * from region") results: org.apache.spa
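
A sketch of the setup the question describes, assuming the Spark 1.1-era API (the input path and the '|' delimiter are hypothetical):

    case class Region(num: Int, str1: String, str2: String)

    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    import sqlContext._  // also enables the implicit RDD-to-SchemaRDD conversion

    // Hypothetical source file; fields must line up with the case class.
    val regions = sc.textFile("hdfs://host:9000/region.tbl")
      .map(_.split('|'))
      .map(p => Region(p(0).toInt, p(1), p(2)))

    regions.registerTempTable("region")  // registerAsTable in Spark 1.0
    val results = sqlContext.sql("select * from region")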

How to create a new SchemaRDD that is not based on the original SparkPlan?

2014-12-03 Thread Tim Chou
Hi All, my question is about the lazy execution mode of SchemaRDD, I guess. I know lazy evaluation is good; however, I still have this requirement. For example, here is the first SchemaRDD, named results (select * from table where num > 1 and num < 4): results: org.apache.spark.sql.SchemaRDD = SchemaRDD[59] at RDD
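
For reference, one common way to cut a derived SchemaRDD loose from the cost of the original plan, sketched against the Spark 1.1-era API: register the intermediate result as its own table and cache it, so later queries reuse the materialized rows instead of re-running the full plan (everything stays lazy until the first action):

    val results = sqlContext.sql("select * from table where num > 1 and num < 4")

    results.registerTempTable("filtered")
    results.cache()  // or sqlContext.cacheTable("filtered") for columnar caching

    // Later plans are built on top of the cached "filtered" table.
    val narrower = sqlContext.sql("select * from filtered where num = 2")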

How does Spark SQL traverse the physical tree?

2014-11-24 Thread Tim Chou
Hi All, I'm learning the code of Spark SQL and I'm confused about how a SchemaRDD executes each operator. Tracing the code, I found that the toRDD() function in QueryExecution is the entry point for running a query. toRDD() runs the SparkPlan, which is a tree structure. However, I didn't find any iterat
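
What the tracing turns up, for reference: there is no explicit iterator over the tree. Each SparkPlan node's execute() recursively calls its children's execute() and wraps the resulting RDDs, so the traversal is plain recursion performed while the RDD chain is built. A self-contained toy illustration of that pattern (the names mirror Spark SQL, but this is not its source):

    type Row = Seq[Any]  // stand-in for Spark's Row

    abstract class SparkPlan {
      def children: Seq[SparkPlan]
      def execute(): Iterator[Row]  // Spark returns RDD[Row]; Iterator keeps the sketch self-contained
    }

    // A leaf just produces its data; the recursion bottoms out here.
    case class Scan(data: Seq[Row]) extends SparkPlan {
      def children = Nil
      def execute() = data.iterator
    }

    // An inner node "traverses" the tree simply by calling child.execute().
    case class Filter(pred: Row => Boolean, child: SparkPlan) extends SparkPlan {
      def children = Seq(child)
      def execute() = child.execute().filter(pred)
    }

    // Executing the root pulls on the whole tree: Filter -> Scan.
    val plan = Filter(r => r(0).asInstanceOf[Int] > 1, Scan(Seq(Seq(1), Seq(2), Seq(3))))
    plan.execute().foreach(println)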

Spark: How can I run MapReduce on only one partition of an RDD?

2014-11-13 Thread Tim Chou
Hi All, I use textFile to create an RDD. However, I don't want to process all the data in this RDD. For example, maybe I only want to handle the data in the 3rd partition of the RDD. How can I do that? Here are some possible solutions I'm considering: 1. Create multiple RDDs when reading the file 2.
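
One standard way to do this without splitting the input into multiple RDDs, as a sketch (the path and the word-count job are hypothetical): mapPartitionsWithIndex gives each partition its index, so every partition except the target can emit nothing.

    val rdd = sc.textFile("hdfs://host:9000/input.txt")  // hypothetical path

    // Keep only the 3rd partition (index 2); all other partitions emit nothing.
    val thirdOnly = rdd.mapPartitionsWithIndex { (idx, iter) =>
      if (idx == 2) iter else Iterator.empty
    }

    // Run the usual map/reduce on just that slice.
    val counts = thirdOnly.flatMap(_.split(" ")).map(w => (w, 1)).reduceByKey(_ + _)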

Fwd: How to add elements into a map?

2014-11-07 Thread Tim Chou
Here is the code I ran in spark-shell: val table = sc.textFile(args(1)) val histMap = collection.mutable.Map[Int,Int]() for (x <- table) { val tuple = x.split('|') histMap.put(tuple(0).toInt, 1) } Why is histMap still empty? Is there something wrong with my code? Thanks,

How to add elements into a map?

2014-11-07 Thread Tim Chou
Here is the code I ran in spark-shell: val table = sc.textFile(args(1)) val histMap = collection.mutable.Map[Int,Int]() for (x <- table) { val tuple = x.split('|') histMap.put(tuple(0).toInt, 1) } Why is histMap still empty? Is there something wrong with my code? Thanks
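
For reference, the usual explanation: the for loop desugars to table.foreach, which runs on the executors against serialized copies of histMap, so the driver's copy is never updated and stays empty. A sketch of the standard fix, assuming a per-key count was the intent, is to aggregate with RDD operations and collect the (small) result back to the driver:

    val table = sc.textFile(args(1))

    // Aggregate on the cluster, then bring the result back to the driver.
    val histMap = table
      .map(_.split('|'))
      .map(tuple => (tuple(0).toInt, 1))
      .reduceByKey(_ + _)   // counts per key; use (a, _) => a to mirror the original put of 1
      .collectAsMap()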

Re: My task finished successfully; however, I see some exceptions in the web UI.

2014-10-04 Thread Tim Chou
Can anyone help me? I find that if I don't use an HDFS file as the input, there are no exceptions of this kind. I searched online and found nothing. How do I debug a Spark program? Thanks, Tim 2014-10-03 17:46 GMT-05:00 Tim Chou : > Hi All, > > Sorry to disturb you. > > I h

My task finished successfully; however, I see some exceptions in the web UI.

2014-10-03 Thread Tim Chou
Hi All, sorry to disturb you. I have built a Spark cluster on Mesos and ran some tests in the spark-shell; it works. However, I can see some exceptions in the web UI. scala> val textFile = sc.textFile("hdfs://10.1.2.12:9000/README.md") scala> textFile.count() 14/10/03 15:20:54 INFO mapred.FileInpu