Try these:

- Disable shuffle spill: spark.shuffle.spill=false (beware, it might end up
  in OOM, since shuffle data is then kept in memory instead of spilling to
  disk)

- Enable log rotation:

    sparkConf.set("spark.executor.logs.rolling.strategy", "size")
      .set("spark.executor.logs.rolling.size.maxBytes", "1024")
      .set("spark.executor.logs.rolling.maxRetainedFiles", "3")
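If it helps, here is a minimal self-contained sketch of wiring those
rolling-log settings into a SparkConf (the master URL and app name below
are placeholders, not taken from your setup):

    import org.apache.spark.{SparkConf, SparkContext}

    object RollingLogsExample {
      def main(args: Array[String]): Unit = {
        val sparkConf = new SparkConf()
          .setMaster("spark://master-host:7077") // placeholder standalone master URL
          .setAppName("rolling-logs-example")    // placeholder app name
          // Roll executor logs by size; 1024 bytes and 3 retained files
          // are demonstration values, production values would be larger.
          .set("spark.executor.logs.rolling.strategy", "size")
          .set("spark.executor.logs.rolling.size.maxBytes", "1024")
          .set("spark.executor.logs.rolling.maxRetainedFiles", "3")

        val sc = new SparkContext(sparkConf)
        try {
          // ... run your jobs here ...
        } finally {
          sc.stop() // lets the cluster mark the application FINISHED
        }
      }
    }

With size-based rolling and maxRetainedFiles set, old executor log files
are deleted automatically instead of accumulating without bound.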
You can also look into spark.cleaner.ttl
<https://spark.apache.org/docs/latest/configuration.html#execution-behavior>

Thanks
Best Regards

On Tue, May 26, 2015 at 12:23 PM, sayantini <sayantiniba...@gmail.com> wrote:

> Hi All,
>
> Please help me with the below 2 issues:
>
> *Environment:*
>
> I am running my spark cluster in standalone mode.
>
> I am initializing the spark context from inside my tomcat server.
>
> I am setting the below properties in environment.sh in the
> $SPARK_HOME/conf directory:
>
> SPARK_MASTER_OPTS=-Dspark.deploy.retainedApplications=1
> -Dspark.deploy.retainedDrivers=1
>
> SPARK_WORKER_OPTS=-Dspark.worker.cleanup.enabled=true
> -Dspark.worker.cleanup.interval=600 -Dspark.worker.cleanup.appDataTtl=600
>
> SPARK_LOCAL_DIRS=$user.home/tmp
>
> *Issue 1:*
>
> The application folders in my $SPARK_HOME/work folder still continue to
> grow whenever I restart tomcat.
>
> I also tried stopping the spark context (sc.stop()) in tomcat's
> contextDestroyed listener, but I am still not able to remove the
> undesired application folders.
>
> *Issue 2:*
>
> The 'tmp' folder is getting filled up with shuffle data and eating up my
> entire hard disk. Is there any setting to remove the shuffle data of
> 'FINISHED' applications?
>
> Thanks in advance,
>
> Sayantini
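P.S. For the contextDestroyed approach mentioned in the quoted mail, a
minimal sketch of a listener that owns the SparkContext and stops it on
shutdown (class, attribute, and field names here are illustrative, not
from the original code):

    import javax.servlet.{ServletContextEvent, ServletContextListener}
    import org.apache.spark.{SparkConf, SparkContext}

    // Register this listener in web.xml with a <listener> entry.
    class SparkLifecycleListener extends ServletContextListener {

      @volatile private var sc: SparkContext = _

      override def contextInitialized(event: ServletContextEvent): Unit = {
        val conf = new SparkConf()
          .setMaster("spark://master-host:7077") // placeholder master URL
          .setAppName("tomcat-spark-app")        // placeholder app name
        sc = new SparkContext(conf)
        // Share the context with the rest of the webapp.
        event.getServletContext.setAttribute("sparkContext", sc)
      }

      override def contextDestroyed(event: ServletContextEvent): Unit = {
        // Stopping the context marks the application FINISHED on the
        // master, so the spark.worker.cleanup.* settings can eventually
        // remove its folder under $SPARK_HOME/work.
        if (sc != null) sc.stop()
      }
    }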