Re: Timeout errors from Akka in Spark 1.2.1

2015-04-16 Thread N B
Hi Guillaume, Interesting that you brought up shuffle. In fact, we are experiencing this issue of shuffle files being left behind and not cleaned up. Since this is a Spark Streaming application, it is expected to stay up indefinitely, so leftover shuffle files are a big problem right now. Si

Re: Timeout errors from Akka in Spark 1.2.1

2015-04-08 Thread N B
Thanks TD. I believe that might have been the issue. We will try for a few days after passing in the GC option on the java command line when we start the process. Thanks for your timely help. NB On Wed, Apr 8, 2015 at 6:08 PM, Tathagata Das wrote: > Yes, in local mode the driver and executor

Re: Timeout errors from Akka in Spark 1.2.1

2015-04-08 Thread Tathagata Das
Yes, in local mode the driver and executor will be the same process. In that case the Java options set in the SparkConf configuration will not work. On Wed, Apr 8, 2015 at 1:44 PM, N B wrote: > Since we are running in local mode, won't all the executors be in the same > JVM as the driver? > >

Re: Timeout errors from Akka in Spark 1.2.1

2015-04-08 Thread N B
Since we are running in local mode, won't all the executors be in the same JVM as the driver? Thanks NB On Wed, Apr 8, 2015 at 1:29 PM, Tathagata Das wrote: > It does take effect on the executors, not on the driver. Which is okay > because executors have all the data and therefore have GC issu

Re: Timeout errors from Akka in Spark 1.2.1

2015-04-08 Thread Tathagata Das
It does take effect on the executors, not on the driver. That is okay because the executors hold all the data and therefore have GC issues, which is not usually the case for the driver. If you want to be doubly sure, print the JVM flags (e.g. http://stackoverflow.com/questions/10486375/print-all-jvm-flags) However, th
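One way to do the JVM-flag check suggested above from inside the running process is to ask the JVM's own management beans. The following is only a minimal sketch using the standard `java.lang.management` API; the class name `JvmFlagCheck` and its helper methods are hypothetical names chosen for illustration and are not part of Spark. Running it inside the same JVM as the application (for example, calling it from the driver's `main` in local mode) should show whether an option such as `-XX:+UseConcMarkSweepGC` was actually passed to that JVM.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Spark: reports how the current JVM was started.
public class JvmFlagCheck {

    /** Options the JVM was launched with (e.g. -Xmx..., -XX:... flags). */
    static List<String> inputArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    /** Names of the garbage collectors the JVM reports for this process. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    /** True if the given flag appears verbatim in the argument list. */
    static boolean hasFlag(List<String> jvmArgs, String flag) {
        return jvmArgs.contains(flag);
    }

    public static void main(String[] args) {
        List<String> jvmArgs = inputArguments();
        System.out.println("JVM arguments: " + jvmArgs);
        System.out.println("Collectors:    " + collectorNames());
        System.out.println("CMS requested: " + hasFlag(jvmArgs, "-XX:+UseConcMarkSweepGC"));
    }
}
```

If the flag is missing from the printed arguments, the option was not applied to that JVM, regardless of what was set in the configuration object.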

Re: Timeout errors from Akka in Spark 1.2.1

2015-04-08 Thread N B
Hi TD, Thanks for the response. Since you mentioned GC, this got me thinking. Given that we are running in local mode (all in a single JVM) for now, does the option "spark.executor.extraJavaOptions" set to "-XX:+UseConcMarkSweepGC" inside the SparkConf object take effect at all before we use it to cr

Re: Timeout errors from Akka in Spark 1.2.1

2015-04-08 Thread Tathagata Das
There are a couple of options. Increase the timeout (see Spark configuration). Also see past mails in the mailing list. Another option you may try (I have a gut feeling that it may work, but I am not sure) is calling GC on the driver periodically. The cleaning up of stuff is tied to GCing of RDD objects a
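The "call GC on the driver periodically" idea could be sketched roughly as follows. This is an illustration only, not a tested fix for the timeouts: the class `PeriodicDriverGc` and the 10-minute interval are assumptions made for the example, and `System.gc()` is just a request that the JVM may ignore (for instance when it is started with `-XX:+DisableExplicitGC`).

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper, not part of Spark: requests a GC in this JVM at a fixed interval.
public class PeriodicDriverGc {
    private final AtomicLong runs = new AtomicLong();
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "periodic-driver-gc");
            t.setDaemon(true); // do not keep the JVM alive just for this thread
            return t;
        });

    /** Schedules a GC request every `interval` units, starting after one interval. */
    void start(long interval, TimeUnit unit) {
        scheduler.scheduleAtFixedRate(() -> {
            System.gc();            // a hint to the JVM, not a guarantee
            runs.incrementAndGet(); // counts requests issued, useful for logging
        }, interval, interval, unit);
    }

    /** Number of GC requests issued so far. */
    long runs() {
        return runs.get();
    }

    /** Cancels the schedule. */
    void stop() {
        scheduler.shutdownNow();
    }

    public static void main(String[] args) {
        PeriodicDriverGc gc = new PeriodicDriverGc();
        gc.start(10, TimeUnit.MINUTES); // interval is an assumption; tune to the workload
        // ... create and start the streaming application here, then wait for it to end ...
        gc.stop();
    }
}
```

Starting the helper before the streaming application and stopping it on shutdown keeps the GC requests on a separate daemon thread, so they do not block the application's own threads while waiting for the next interval.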