Oh, I see. Another idea would be to provide something like sc.prune(a, b, c) that traverses the dependency graph of RDDs a, b, c and unpersists any cached RDDs not referenced by any other RDD. In this case you could store the return value of blowUp and call prune on it after line 9.
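
To make the idea concrete, here is a rough sketch of the traversal on a toy graph in plain Python. It does not use the Spark API, and sc.prune is only the proposal above, not an existing method. The names Node, ancestors, prune and the live parameter are all made up for illustration. One assumption worth flagging: lineage only points from child to parent, so deciding that a cached RDD is "not referenced by any other RDD" needs some set of RDDs that are still in use, which this sketch takes as an explicit live argument. It also assumes the RDDs passed to prune are themselves candidates for unpersisting.

```python
class Node:
    """Toy stand-in for an RDD: a name, its parent nodes, and a cached flag."""

    def __init__(self, name, parents=(), cached=False):
        self.name = name
        self.parents = list(parents)
        self.cached = cached


def ancestors(roots):
    """Return every node reachable from roots via parent links, roots included."""
    seen = {}
    stack = list(roots)
    while stack:
        node = stack.pop()
        if id(node) in seen:
            continue
        seen[id(node)] = node
        stack.extend(node.parents)
    return list(seen.values())


def prune(roots, live=()):
    """Uncache nodes in the lineage of roots that no node in live depends on.

    Returns the sorted names of the nodes that were uncached.
    """
    # Anything in the lineage of a live node must stay cached.
    keep = {id(node) for node in ancestors(live)}
    released = []
    for node in ancestors(roots):
        if node.cached and id(node) not in keep:
            node.cached = False  # stands in for calling unpersist on a real RDD
            released.append(node.name)
    return sorted(released)


# base feeds both a chain of cached intermediates and an unrelated result.
base = Node("base", cached=True)
step1 = Node("step1", [base], cached=True)
step2 = Node("step2", [step1], cached=True)
other = Node("other", [base])

released = prune([step2], live=[other])
print(released)
```

Here step1 and step2 are released because only the pruned lineage uses them, while base stays cached because other still depends on it.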
Ankur <http://www.ankurdave.com/>