I agree that the garbage collection PR <https://github.com/apache/spark/pull/126> would make things very convenient in a lot of use cases. However, there are two broad reasons why it is hard for that PR to get into 0.9.1.

1. The PR still needs some amount of work and quite a lot of testing. While it enables RDD and shuffle cleanup based on Java GC, its behavior in real workloads still needs to be understood (especially since it is tied to the Spark driver's garbage collection behavior).

2. It actually changes some of the semantic behavior of Spark and should not be included in a bug-fix release.

The PR will definitely be present for Spark 1.0, which is expected to be released around the end of April (not too far ;) ).
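For reference, here is a minimal sketch of what the explicit cleanup being discussed might look like from user code. RDD.unpersist() already exists in 0.9; the Broadcast.unpersist() call is the kind of API the GC PR is expected to expose and is an assumption here, not something available in 0.9:

    // Minimal sketch. Assumption: Broadcast.unpersist() as proposed in the GC PR
    // (not available in Spark 0.9). RDD.unpersist() already exists today.
    import org.apache.spark.{SparkConf, SparkContext}

    object CleanupSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(
          new SparkConf().setAppName("cleanup-sketch").setMaster("local[2]"))

        val lookup = sc.broadcast(Map("a" -> 1, "b" -> 2)) // driver-side broadcast
        val rdd = sc.parallelize(1 to 1000).map(_ % 2).cache()
        rdd.count()                                        // materialize the cache

        rdd.unpersist()    // explicit RDD cleanup (exists in 0.9)
        lookup.unpersist() // explicit broadcast cleanup (assumed API from the GC PR)

        sc.stop()
      }
    }

Without something like the last two calls, cached blocks and broadcast data hang around until the driver's own GC (or a custom cleaner, as Mridul mentions below) reclaims them.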
TD

On Wed, Mar 19, 2014 at 5:57 PM, Mridul Muralidharan <mri...@gmail.com> wrote:

> Would be great if the garbage collection PR is also committed - if not
> the whole thing, at least the part to unpersist broadcast variables
> explicitly would be great.
> Currently we are running with a custom impl which does something
> similar, and I would like to move to standard distribution for that.
>
> Thanks,
> Mridul
>
> On Wed, Mar 19, 2014 at 5:07 PM, Tathagata Das
> <tathagata.das1...@gmail.com> wrote:
> > Hello everyone,
> >
> > Since the release of Spark 0.9, we have received a number of important bug
> > fixes and we would like to make a bug-fix release of Spark 0.9.1. We are
> > going to cut a release candidate soon and we would love it if people test
> > it out. We have backported several bug fixes into the 0.9 branch and updated
> > JIRA accordingly
> > <https://spark-project.atlassian.net/browse/SPARK-1275?jql=project%20in%20(SPARK%2C%20BLINKDB%2C%20MLI%2C%20MLLIB%2C%20SHARK%2C%20STREAMING%2C%20GRAPH%2C%20TACHYON)%20AND%20fixVersion%20%3D%200.9.1%20AND%20status%20in%20(Resolved%2C%20Closed)>.
> > Please let me know if there are fixes that were not backported but you
> > would like to see them in 0.9.1.
> >
> > Thanks!
> >
> > TD