[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-13 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10590763 --- Diff: core/src/main/scala/org/apache/spark/scheduler/ResultTask.scala --- @@ -21,20 +21,16 @@ import java.io._ import java.util.zip.{GZIPInputStre

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-13 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10590604 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -123,17 +123,17 @@ class DAGScheduler( private val nextStag

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-13 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10590414 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -1025,6 +1025,14 @@ abstract class RDD[T: ClassTag]( checkpointData.flatMap(_.g

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-13 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10589028 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -50,23 +50,26 @@ private[spark] class MapOutputTrackerMasterActor(tracker: Map

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-13 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10588418 --- Diff: core/src/main/scala/org/apache/spark/scheduler/ShuffleMapTask.scala --- @@ -17,28 +17,24 @@ package org.apache.spark.scheduler

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-13 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10586047 --- Diff: core/src/main/scala/org/apache/spark/scheduler/ShuffleMapTask.scala --- @@ -17,28 +17,24 @@ package org.apache.spark.scheduler

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-13 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10585234 --- Diff: core/src/main/scala/org/apache/spark/scheduler/ShuffleMapTask.scala --- @@ -17,28 +17,24 @@ package org.apache.spark.scheduler

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-13 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10582800 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -181,15 +178,50 @@ private[spark] class MapOutputTracker(conf: SparkConf) extends

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-13 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10581648 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -181,15 +178,50 @@ private[spark] class MapOutputTracker(conf: SparkConf) extends

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-13 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10581534 --- Diff: core/src/main/scala/org/apache/spark/scheduler/ShuffleMapTask.scala --- @@ -17,28 +17,24 @@ package org.apache.spark.scheduler

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-13 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10579603 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,135 @@ +/* + * Licensed to the Apache Software Foundation (ASF) u

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-13 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10579167 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -181,15 +178,50 @@ private[spark] class MapOutputTracker(conf: SparkConf) exten

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-13 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10578363 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,135 @@ +/* + * Licensed to the Apache Software Foundation (ASF) u

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-13 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10578300 --- Diff: core/src/main/scala/org/apache/spark/scheduler/ShuffleMapTask.scala --- @@ -17,28 +17,24 @@ package org.apache.spark.scheduler

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-13 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10577891 --- Diff: core/src/main/scala/org/apache/spark/scheduler/ShuffleMapTask.scala --- @@ -17,28 +17,24 @@ package org.apache.spark.scheduler

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-13 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10576368 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -273,8 +310,9 @@ private[spark] class MapOutputTrackerMaster(conf: SparkConf)

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-13 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10576181 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -149,14 +151,9 @@ private[spark] class MapOutputTracker(conf: SparkConf) extend

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10553393 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,135 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under on

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread tdas
Github user tdas commented on the pull request: https://github.com/apache/spark/pull/126#issuecomment-37501973 How do immutable HashMaps help to store metadata? For example, how would you store block ID --> block info in the BlockManager using immutable HashMaps? --- If your proje

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/126#issuecomment-37501863 If you don't need high performance, why not just put a normal immutable hashmap so you don't have to worry about concurrency?

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10553293 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,135 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under on

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread tdas
Github user tdas commented on the pull request: https://github.com/apache/spark/pull/126#issuecomment-37501724 @rxin It overrides stuff to make sure that things like traversing the entire HashMap do not happen. They are meant to be drop-in replacements for Scala HashMaps when applyin

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10553150 --- Diff: core/src/main/scala/org/apache/spark/util/BoundedHashMap.scala --- @@ -0,0 +1,67 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/126#issuecomment-37501259 @tdas I haven't finished looking at this (will probably spend more time after Fri) - but WrappedJavaHashMap is fairly complicated, and it seems like a recipe for complexity a

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10553128 --- Diff: core/src/main/scala/org/apache/spark/util/TimeStampedWeakValueHashMap.scala --- @@ -0,0 +1,112 @@ +/* + * Licensed to the Apache Software Founda

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10553097 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,135 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10553086 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,135 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10553056 --- Diff: core/src/main/scala/org/apache/spark/scheduler/ShuffleMapTask.scala --- @@ -17,28 +17,24 @@ package org.apache.spark.scheduler

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10553044 --- Diff: core/src/main/scala/org/apache/spark/scheduler/ShuffleMapTask.scala --- @@ -17,28 +17,24 @@ package org.apache.spark.scheduler

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10553029 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -181,15 +178,50 @@ private[spark] class MapOutputTracker(conf: SparkConf) extends

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10552981 --- Diff: core/src/main/scala/org/apache/spark/util/BoundedHashMap.scala --- @@ -0,0 +1,67 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10552941 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,135 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under on

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10552946 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,135 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10552915 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,135 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under on

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10552886 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -20,15 +20,15 @@ package org.apache.spark import java.io._ import java.util.

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10552879 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,135 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under on

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10552846 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -20,15 +20,15 @@ package org.apache.spark import java.io._ import java.util.

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10552839 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,135 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under on

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10552834 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,135 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under on

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10552817 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,135 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under on

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10552793 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -181,15 +178,50 @@ private[spark] class MapOutputTracker(conf: SparkConf) extends Log

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10552777 --- Diff: core/src/main/scala/org/apache/spark/scheduler/ShuffleMapTask.scala --- @@ -17,28 +17,24 @@ package org.apache.spark.scheduler +imp

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10552754 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,135 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10552717 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,135 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10552693 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,135 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10552692 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -1025,6 +1025,14 @@ abstract class RDD[T: ClassTag]( checkpointData.flatMap(_.getCheck

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10552645 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -1025,6 +1025,14 @@ abstract class RDD[T: ClassTag]( checkpointData.flatMap(_.getC

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10552638 --- Diff: core/src/main/scala/org/apache/spark/util/TimeStampedWeakValueHashMap.scala --- @@ -0,0 +1,112 @@ +/* + * Licensed to the Apache Software Founda

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10552625 --- Diff: core/src/main/scala/org/apache/spark/util/TimeStampedWeakValueHashMap.scala --- @@ -0,0 +1,112 @@ +/* + * Licensed to the Apache Software Founda

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10552599 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -181,15 +178,50 @@ private[spark] class MapOutputTracker(conf: SparkConf) extends

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10552496 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -181,15 +178,50 @@ private[spark] class MapOutputTracker(conf: SparkConf) extends Log

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10552205 --- Diff: core/src/main/scala/org/apache/spark/util/TimeStampedWeakValueHashMap.scala --- @@ -0,0 +1,112 @@ +/* + * Licensed to the Apache Software Founda

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10552156 --- Diff: core/src/main/scala/org/apache/spark/util/TimeStampedWeakValueHashMap.scala --- @@ -0,0 +1,112 @@ +/* + * Licensed to the Apache Software Founda

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread tdas
Github user tdas commented on the pull request: https://github.com/apache/spark/pull/126#issuecomment-37498313 @yaoshengzhe This is only a safe, best-effort attempt to clean metadata, so no guarantee is being provided here. All we are trying to do for long-running Spark computatio

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/126#issuecomment-37497428 Merged build finished.

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/126#issuecomment-37497429 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13142/

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10551528 --- Diff: core/src/main/scala/org/apache/spark/util/TimeStampedWeakValueHashMap.scala --- @@ -0,0 +1,112 @@ +/* + * Licensed to the Apache Software Fo

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10551206 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -1025,6 +1025,14 @@ abstract class RDD[T: ClassTag]( checkpointData.flatMap(_.getC

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10551122 --- Diff: core/src/main/scala/org/apache/spark/util/TimeStampedWeakValueHashMap.scala --- @@ -0,0 +1,112 @@ +/* + * Licensed to the Apache Software Fo

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10551095 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,126 @@ +/* + * Licensed to the Apache Software Foundation (ASF) un

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/126#issuecomment-37493973 Merged build triggered.

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/126#issuecomment-37493975 Merged build started.

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10551057 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -181,15 +178,50 @@ private[spark] class MapOutputTracker(conf: SparkConf) extends

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10551026 --- Diff: core/src/main/scala/org/apache/spark/scheduler/ShuffleMapTask.scala --- @@ -17,28 +17,24 @@ package org.apache.spark.scheduler

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10550676 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -50,23 +54,26 @@ private[spark] class MapOutputTrackerMasterActor(tracker: MapOutputT

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10550660 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -1025,6 +1025,14 @@ abstract class RDD[T: ClassTag]( checkpointData.flatMap(_.getCheck

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread yaoshengzhe
Github user yaoshengzhe commented on the pull request: https://github.com/apache/spark/pull/126#issuecomment-37492214 @tdas, based on what you said, it's fine to put logic into finalize if we can make sure code in finalize is done in the right way and accept the fact that finalize mig

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10550372 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -181,15 +186,49 @@ private[spark] class MapOutputTracker(conf: SparkConf) extends

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10550266 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -181,15 +186,49 @@ private[spark] class MapOutputTracker(conf: SparkConf) extends

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10549798 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,126 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10549758 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -1025,6 +1025,14 @@ abstract class RDD[T: ClassTag]( checkpointData.flatMap(_.getC

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10549710 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -1025,6 +1025,14 @@ abstract class RDD[T: ClassTag]( checkpointData.flatMap(_.getC

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10549577 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,126 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under on

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10549507 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,126 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under on

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10549481 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,126 @@ +/* + * Licensed to the Apache Software Foundation (ASF) un

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10549475 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,126 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10549460 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,126 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under on

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10549439 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,126 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10549405 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,126 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10549357 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,126 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10549336 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,126 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10549211 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,126 @@ +/* + * Licensed to the Apache Software Foundation (ASF) unde

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10548672 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,126 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under on

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread tdas
Github user tdas commented on the pull request: https://github.com/apache/spark/pull/126#issuecomment-37485786 @yaoshengzhe I agree that using a finalizer is not the most ideal thing in the world. However, the problem that we are dealing with here is that there is no clean and safe way to de

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10548591 --- Diff: core/src/main/scala/org/apache/spark/storage/ShuffleBlockManager.scala --- @@ -169,23 +169,32 @@ class ShuffleBlockManager(blockManager: BlockManage

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/126#issuecomment-37484584 This is not specifically related to your patch, but I think we can remove the MetadataCleaner in SparkContext that cleans up persisted RDDs periodically. In particular,

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread andrewor14
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/126#issuecomment-37484105 It's true that finalizers are not predictable, but what this patch is doing is no worse than before. Cleanup() logic is decoupled from finalize(); it can still be calle

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread yaoshengzhe
Github user yaoshengzhe commented on the pull request: https://github.com/apache/spark/pull/126#issuecomment-37475144 @Tathagata, I strongly disagree with putting cleanup logic in finalize. Finalizers are unpredictable, often dangerous, and generally unnecessary, e.g. there is a severe p

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10543349 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -181,15 +186,49 @@ private[spark] class MapOutputTracker(conf: SparkConf) exten

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10543216 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,126 @@ +/* + * Licensed to the Apache Software Foundation (ASF) un

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10543160 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,126 @@ +/* + * Licensed to the Apache Software Foundation (ASF) un

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10543007 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,126 @@ +/* + * Licensed to the Apache Software Foundation (ASF) un

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10540569 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -181,15 +186,49 @@ private[spark] class MapOutputTracker(conf: SparkConf) exten

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10540432 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,126 @@ +/* + * Licensed to the Apache Software Foundation (ASF) un

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10540325 --- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala --- @@ -0,0 +1,126 @@ +/* + * Licensed to the Apache Software Foundation (ASF) un

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10540258 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -50,23 +54,26 @@ private[spark] class MapOutputTrackerMasterActor(tracker: MapO

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10533790 --- Diff: core/src/main/scala/org/apache/spark/Dependency.scala --- @@ -49,9 +49,28 @@ class ShuffleDependency[K, V]( @transient rdd: RDD[_ <: Product2[K,

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-12 Thread ScrapCodes
Github user ScrapCodes commented on a diff in the pull request: https://github.com/apache/spark/pull/126#discussion_r10514209 --- Diff: core/src/main/scala/org/apache/spark/Dependency.scala --- @@ -49,9 +49,28 @@ class ShuffleDependency[K, V]( @transient rdd: RDD[_ <: Produ

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/126#issuecomment-37369954 Merged build finished. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have

[GitHub] spark pull request: [SPARK-1103] [WIP] Automatic garbage collectio...

2014-03-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/126#issuecomment-37369955 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13119/
