[Spark SQL Continue] Sorry, it is not limited to SQL only; it may be due to the network

2014-10-09 Thread Trident
Dear Community, please ignore my last post about Spark SQL. When I run: val file = sc.textFile("./README.md") val count = file.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey(_+_) count.collect() it happens too. Is there any possible reason f
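For readability, here is the snippet quoted in this post (and the one below), laid out as it would be typed into the Spark shell, where sc is the SparkContext the shell predefines:

    // Word count over the local README.md, exactly as quoted in the posts.
    val file = sc.textFile("./README.md")
    val count = file.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey(_ + _)
    count.collect()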

[Spark SQL] Strange NPE in Spark SQL with Hive

2014-10-09 Thread Trident
Hi Community, I use Spark 1.0.2 and use Spark SQL to run Hive SQL. When I run the following code in the Spark shell: val file = sc.textFile("./README.md") val count = file.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey(_+_) count.collect() it is correct and there is no error

Spark SQL on a large Hive table on version 1.0.2 has some strange output

2014-10-05 Thread Trident
Dear Developers, I'm currently limited to using Spark 1.0.2. I use Spark SQL on a Hive table to load the AMPLab benchmark, which is approximately 25.6 GiB. I run: CREATE EXTERNAL TABLE uservisits (sourceIP STRING, destURL STRING, visitDate STRING, adRevenue DOUBLE, userAgent STRING, countryCode STRING,
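In Spark 1.0.x, Hive DDL like this is normally issued through a HiveContext from the shell. A minimal sketch, assuming a Hive-enabled build; the DDL in the post is truncated, so only the columns visible above are reproduced, and the row format and LOCATION path are placeholders rather than the author's values:

    val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
    import hiveContext._

    // Register the external table; the omitted benchmark columns, the row
    // format and the location are assumptions, not the author's values.
    hql("""CREATE EXTERNAL TABLE IF NOT EXISTS uservisits (
             sourceIP STRING, destURL STRING, visitDate STRING,
             adRevenue DOUBLE, userAgent STRING, countryCode STRING)
           ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
           LOCATION '/path/to/uservisits'""")

    // A simple query over the freshly registered table.
    hql("SELECT COUNT(*) FROM uservisits").collect().foreach(println)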

Network Communication - Akka or more?

2014-09-16 Thread Trident
akka? 3. When running ./bin/run-example SparkPi I noticed that the jar file was sent from the server to the client. This worries me because the jar is big. Is this common? Trident
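For context, Spark distributes the application jar to worker nodes so that executors can run the task code, and run-example does the same with the (large) examples assembly jar, which would explain the transfer. A minimal sketch of where a standalone application declares the jars to ship; the app name and jar path are placeholders, not taken from the original post:

    import org.apache.spark.{SparkConf, SparkContext}

    // Jars listed via setJars are served to the executors when tasks run.
    // The path below is a placeholder.
    val conf = new SparkConf()
      .setAppName("JarShippingSketch")
      .setJars(Seq("/path/to/my-app-assembly.jar"))
    val sc = new SparkContext(conf)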