The problem we're seeing is a gradual memory leak in the driver's JVM.

We execute jobs from a long-running Java app that creates relatively
short-lived SparkContexts, so our Spark drivers are created inside this
application's JVM. We're using standalone cluster mode, Spark 1.0.2.

The root cause of the memory leak seems to be Spark's DiskBlockManager,
which registers a JVM shutdown hook responsible for deleting the Spark
local dirs:
    Runtime.getRuntime.addShutdownHook(new Thread("delete Spark local dirs")

(this doesn't seem to have changed in Spark 1.2)

The problem is that the JVM holds on to every registered hook until it
exits, so this keeps the entire Akka actor system of each application in
memory:
Runtime shutdown hooks -> DiskBlockManager -> ShuffleBlockManager ->
BlockManager -> ActorSystemImpl
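
The retention described above can be reproduced without Spark. Below is a
minimal sketch, assuming a stand-in object in place of the real
DiskBlockManager graph: FakeContext, registerCleanupHook and survivesGc are
hypothetical names made up for this illustration, not Spark code. It only
shows the JVM-level behaviour: an object captured by a registered shutdown
hook stays strongly reachable until the hook is removed or the JVM exits.

```java
import java.lang.ref.WeakReference;

class ShutdownHookLeakDemo {

    /** Hypothetical stand-in for a per-context object graph (not a Spark class). */
    static final class FakeContext {
        final byte[] payload = new byte[1024 * 1024];
    }

    /** Registers a hook that captures ctx, the way a per-context cleanup hook would. */
    static Thread registerCleanupHook(final FakeContext ctx) {
        Thread hook = new Thread("delete fake local dirs") {
            @Override
            public void run() {
                // Referencing ctx here is what pins it in memory.
                System.out.println("cleaning up " + ctx.payload.length + " bytes");
            }
        };
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    /** Requests a GC and reports whether the referent is still reachable. */
    static boolean survivesGc(WeakReference<?> ref) {
        System.gc(); // only a hint, but a strongly reachable object is never cleared
        return ref.get() != null;
    }

    public static void main(String[] args) {
        FakeContext ctx = new FakeContext();
        WeakReference<FakeContext> ref = new WeakReference<>(ctx);
        Thread hook = registerCleanupHook(ctx);
        ctx = null; // the application has dropped its own reference

        System.out.println("reachable while hook is registered: " + survivesGc(ref));

        // Removing the hook drops the JVM's reference to it.
        boolean removed = Runtime.getRuntime().removeShutdownHook(hook);
        hook = null;
        System.out.println("hook removed: " + removed);
        System.out.println("reachable after removal: " + survivesGc(ref));
    }
}
```

The first check is deterministic. Whether the object is actually collected
after the hook is removed depends on the garbage collector, since
System.gc() is only a request. This sketch says nothing about whether Spark
itself offers a way to unregister its hook.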

Has anyone else come across this issue?

I would imagine that with YARN, when using yarn-cluster mode, this would
not be an issue, as the JVM running the Spark driver would itself be
short-lived. Is that the case?

Is there then no way to create short-lived SparkContexts within the same
JVM? Is the only alternative to use one long-running SparkContext?

I did see examples of Java applications re-creating SparkContexts, for
example Ooyala's spark-jobserver, so I would imagine this is possible, no?
https://github.com/ooyala/spark-jobserver/blob/master/job-server/src/spark.jobserver/JobManagerActor.scala#L104

Thanks,

*Noam Barcay*
Developer // *Kenshoo*
*Office* +972 3 746-6500 *427 // *Mobile* +972 54 475-3142
__________________________________________
*www.Kenshoo.com* <http://kenshoo.com/>
