Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/126#discussion_r10553393
--- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala ---
@@ -0,0 +1,135 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under
Github user tdas commented on the pull request:
https://github.com/apache/spark/pull/126#issuecomment-37501973
How do immutable HashMaps help to store metadata? For example, how would
you store block ID --> block info in the BlockManager using immutable HashMaps?
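One common answer to that question is to hold the metadata in an immutable
map behind an atomic reference: every update builds a new map and swaps it
in, so readers never need locks, and persistent structural sharing keeps
each copy cheap. A minimal sketch of the general pattern (not code from
this PR; BlockMetadata and its members are hypothetical names):

    import java.util.concurrent.atomic.AtomicReference

    // Metadata lives in an immutable Map; updates copy-and-swap the reference.
    class BlockMetadata[K, V] {
      private val ref = new AtomicReference(Map.empty[K, V])

      @annotation.tailrec
      final def put(key: K, value: V): Unit = {
        val old = ref.get()
        // Retry if another thread swapped the map in the meantime.
        if (!ref.compareAndSet(old, old + (key -> value))) put(key, value)
      }

      def get(key: K): Option[V] = ref.get().get(key)
    }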
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/126#discussion_r10553293
--- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala ---
@@ -0,0 +1,135 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under
Github user tdas commented on the pull request:
https://github.com/apache/spark/pull/126#issuecomment-37501724
@rxin It overrides methods to make sure that operations like traversing the
entire HashMap do not happen. They are meant to be drop-in replacements for
Scala HashMaps when
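The drop-in-replacement idea can be sketched as follows, assuming the Scala
2.10-era mutable.Map API: a java.util.LinkedHashMap whose removeEldestEntry
hook evicts the oldest entry once a bound is exceeded, wrapped behind the
standard Map interface. BoundedMap is a hypothetical name here, not the
PR's BoundedHashMap verbatim:

    import scala.collection.JavaConverters._
    import scala.collection.mutable

    class BoundedMap[K, V](bound: Int) extends mutable.Map[K, V] {
      // Access-ordered LinkedHashMap: removeEldestEntry evicts the
      // least-recently-used entry once the size exceeds the bound.
      private val internal = new java.util.LinkedHashMap[K, V](32, 0.75f, true) {
        override def removeEldestEntry(e: java.util.Map.Entry[K, V]): Boolean =
          size() > bound
      }
      override def +=(kv: (K, V)): this.type = { internal.put(kv._1, kv._2); this }
      override def -=(key: K): this.type = { internal.remove(key); this }
      override def get(key: K): Option[V] = Option(internal.get(key))
      override def iterator: Iterator[(K, V)] =
        internal.entrySet().iterator().asScala.map(e => (e.getKey, e.getValue))
    }

Callers that already use a mutable.Map need no changes; only the
construction site picks the bounded variant.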
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/126#discussion_r10553150
--- Diff: core/src/main/scala/org/apache/spark/util/BoundedHashMap.scala ---
@@ -0,0 +1,67 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/126#discussion_r10553128
--- Diff:
core/src/main/scala/org/apache/spark/util/TimeStampedWeakValueHashMap.scala ---
@@ -0,0 +1,112 @@
+/*
+ * Licensed to the Apache Software
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/126#discussion_r10552915
--- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala ---
@@ -0,0 +1,135 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/126#discussion_r10552886
--- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala ---
@@ -20,15 +20,15 @@ package org.apache.spark
import java.io._
import
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/126#discussion_r10552879
--- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala ---
@@ -0,0 +1,135 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/126#discussion_r10552817
--- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala ---
@@ -0,0 +1,135 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/126#discussion_r10552793
--- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala ---
@@ -181,15 +178,50 @@ private[spark] class MapOutputTracker(conf:
SparkConf) extends
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/126#discussion_r10552777
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/ShuffleMapTask.scala ---
@@ -17,28 +17,24 @@
package org.apache.spark.scheduler
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/126#discussion_r10552692
--- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala ---
@@ -1025,6 +1025,14 @@ abstract class RDD[T: ClassTag](
checkpointData.flatMap
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/126#discussion_r10552625
--- Diff:
core/src/main/scala/org/apache/spark/util/TimeStampedWeakValueHashMap.scala ---
@@ -0,0 +1,112 @@
+/*
+ * Licensed to the Apache Software
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/126#discussion_r10552496
--- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala ---
@@ -181,15 +178,50 @@ private[spark] class MapOutputTracker(conf:
SparkConf) extends
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/126#discussion_r10552205
--- Diff:
core/src/main/scala/org/apache/spark/util/TimeStampedWeakValueHashMap.scala ---
@@ -0,0 +1,112 @@
+/*
+ * Licensed to the Apache Software
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/126#discussion_r10552156
--- Diff:
core/src/main/scala/org/apache/spark/util/TimeStampedWeakValueHashMap.scala ---
@@ -0,0 +1,112 @@
+/*
+ * Licensed to the Apache Software
Github user tdas commented on the pull request:
https://github.com/apache/spark/pull/126#issuecomment-37498313
@yaoshengzhe
This is only a safe, best-effort attempt to clean metadata, so no guarantee
is being provided here. All we are trying to do for long-running Spark
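The best-effort, time-based flavor of such cleanup can be sketched like
this (illustrative names only, not the PR's TimeStampedHashMap itself):
each entry records its insertion time in a concurrent map, and a periodic
pass drops entries older than a threshold:

    import scala.collection.concurrent.TrieMap

    class TimestampedMap[K, V] {
      private val map = new TrieMap[K, (V, Long)]()

      def put(key: K, value: V): Unit =
        map.put(key, (value, System.currentTimeMillis()))

      def get(key: K): Option[V] = map.get(key).map(_._1)

      // Best effort: entries inserted before threshTime are removed.
      def clearOldValues(threshTime: Long): Unit =
        for ((k, (_, t)) <- map if t < threshTime) map.remove(k)
    }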
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/126#discussion_r10550676
--- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala ---
@@ -50,23 +54,26 @@ private[spark] class
MapOutputTrackerMasterActor(tracker
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/126#discussion_r10550660
--- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala ---
@@ -1025,6 +1025,14 @@ abstract class RDD[T: ClassTag](
checkpointData.flatMap
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/126#discussion_r10549577
--- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala ---
@@ -0,0 +1,126 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/126#discussion_r10549507
--- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala ---
@@ -0,0 +1,126 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/126#discussion_r10549460
--- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala ---
@@ -0,0 +1,126 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/126#discussion_r10548672
--- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala ---
@@ -0,0 +1,126 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under
Github user tdas commented on the pull request:
https://github.com/apache/spark/pull/126#issuecomment-37485786
@yaoshengzhe I agree that using a finalizer is not the most ideal thing in
the world. However, the problem that we are dealing with here is that there is
no clean and safe way to
Github user tdas commented on a diff in the pull request:
https://github.com/apache/spark/pull/126#discussion_r10533790
--- Diff: core/src/main/scala/org/apache/spark/Dependency.scala ---
@@ -49,9 +49,28 @@ class ShuffleDependency[K, V](
@transient rdd: RDD[_ <: Product
Github user tdas commented on the pull request:
https://github.com/apache/spark/pull/126#issuecomment-37358288
HAHA! I was already working on adding that try-catch. Realized that a bit
late after the PR.
And yes, super.finalize() is a good call.
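The pattern being agreed on, as a sketch (CleanedResource is a hypothetical
name): wrap the cleanup work in a try so no exception can escape the
finalizer thread, and chain to super.finalize() in a finally block:

    class CleanedResource(cleanup: () => Unit) {
      // Defensive finalize: never let an exception escape the finalizer
      // thread, and always chain to super.finalize().
      override protected def finalize(): Unit = {
        try {
          cleanup()
        } catch {
          case t: Throwable => () // best effort: swallow, optionally log
        } finally {
          super.finalize()
        }
      }
    }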
GitHub user tdas opened a pull request:
https://github.com/apache/spark/pull/126
[SPARK-1103] [WIP] Automatic garbage collection of RDD, shuffle and
broadcast data
This PR allows Spark to automatically clean up metadata and data related to
persisted RDDs, shuffles and broadcast