Re: Spark Processing Large Data Stuck

2014-06-21 Thread Peng Cheng
The JVM will quit after spending most of its time on GC (about 95%), but usually you have to wait a long time before that happens, particularly if your job is already at massive scale. Since it is hard to run profiling online, maybe it's easier for debugging if you make a lot of partitions (so you can watc
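A rough sketch of the "make a lot of partitions" suggestion, using the 30 GB input size mentioned elsewhere in this thread. The helper name `suggest_num_partitions` and the 128 MiB per-partition target are assumptions chosen for illustration, not values given by anyone in the thread; the right target depends on executor memory and on how much the data expands once deserialized.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def suggest_num_partitions(input_bytes, target_partition_bytes=128 * MIB):
    """Return the smallest partition count that keeps every partition
    at or below target_partition_bytes (hypothetical helper)."""
    if input_bytes <= 0 or target_partition_bytes <= 0:
        raise ValueError("sizes must be positive")
    # Integer ceiling division, so very large sizes do not lose precision.
    return -(-input_bytes // target_partition_bytes)


# The 30 GB file from this thread, treated here as 30 GiB (an assumption).
num_partitions = suggest_num_partitions(30 * GIB)
print(num_partitions)

# With PySpark the count could then be passed along, for example:
#   rdd = sc.textFile(path, minPartitions=num_partitions)
#   rdd = rdd.repartition(num_partitions)
```

More, smaller partitions mean more, shorter tasks, so it is easier to see in the web UI which task a job stalls on before GC takes over.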

Re: Spark Processing Large Data Stuck

2014-06-21 Thread yxzhao
Thanks Krishna, I use a small cluster and each compute node has 16GB of RAM and 8 2.66GHz CPU cores.
On Sat, Jun 21, 2014 at 3:16 PM, Krishna Sankar [via Apache Spark User List] wrote:
> Hi,
>
> - I have seen similar behavior before. As far as I can tell, the root cause is the ou

Re: Spark Processing Large Data Stuck

2014-06-21 Thread Krishna Sankar
Hi,
- I have seen similar behavior before. As far as I can tell, the root cause is the out-of-memory error; I verified this by monitoring the memory.
- I had a 30 GB file and was running on a single machine with 16GB, so I knew it would fail.
- But instead of raising an exce