Re: Spark 1.5.1 Build Failure

2015-10-30 Thread Jia Zhan
> t-compile-first) on project spark-core_2.10: Execution scala-test-compile-first of goal net.alchim31.maven:scala-maven-plugin:3.2.2:testCompile failed. CompileFailed -> [Help 1]
>
> --
> Regards,
> Raghuveer Chanda

-- Jia Zhan

Re: In-memory computing and cache() in Spark

2015-10-19 Thread Jia Zhan
5 at 11:32 PM, Igor Berman wrote:
> Do your iterations really submit a job? I don't see any action there.
> On Oct 17, 2015 00:03, "Jia Zhan" wrote:
>
>> Hi all,
>>
>> I am running Spark locally in one node and trying to sweep the memory size for perfor

Re: In-memory computing and cache() in Spark

2015-10-19 Thread Jia Zhan
Hi Jia,
>
> RDDs are cached on the executor, not on the driver. I am assuming you are running locally and haven't changed spark.executor.memory?
>
> Sonal
> On Oct 19, 2015 1:58 AM, "Jia Zhan" wrote:
>
> Does anyone have any clue what's going on? Why would

Re: In-memory computing and cache() in Spark

2015-10-18 Thread Jia Zhan
Does anyone have any clue what's going on? Why would caching with 2g of memory be much faster than with 15g of memory? Thanks very much!

On Fri, Oct 16, 2015 at 2:02 PM, Jia Zhan wrote:
> Hi all,
>
> I am running Spark locally in one node and trying to sweep the memory size for perfor

In-memory computing and cache() in Spark

2015-10-16 Thread Jia Zhan
Hi all,

I am running Spark locally in one node and trying to sweep the memory size for performance tuning. The machine has 8 CPUs and 16G of main memory; the dataset on my local disk is about 10GB. I have several quick questions and would appreciate any comments.

1. Spark performs in-memory computing, but

Can we gracefully kill stragglers in Spark SQL

2015-09-04 Thread Jia Zhan
work? Is it possible to terminate some tasks early without affecting the overall execution of the job, at some cost in accuracy? I appreciate your help!

-- Jia Zhan