Github user markhamstra commented on a diff in the pull request:
https://github.com/apache/spark/pull/152#discussion_r10637484
--- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala ---
@@ -35,13 +35,21 @@ private[spark] case class GetMapOutputStatuses(shuffleId
Github user markhamstra commented on a diff in the pull request:
https://github.com/apache/spark/pull/140#discussion_r10618038
--- Diff: core/src/main/scala/org/apache/spark/rdd/DoubleRDDFunctions.scala ---
@@ -86,14 +92,9 @@ class DoubleRDDFunctions(self: RDD[Double]) extends
Github user markhamstra commented on a diff in the pull request:
https://github.com/apache/spark/pull/126#discussion_r10590763
--- Diff: core/src/main/scala/org/apache/spark/scheduler/ResultTask.scala ---
@@ -21,20 +21,16 @@ import java.io._
import java.util.zip
Github user markhamstra commented on a diff in the pull request:
https://github.com/apache/spark/pull/126#discussion_r10590604
--- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ---
@@ -123,17 +123,17 @@ class DAGScheduler(
private val
Github user markhamstra commented on a diff in the pull request:
https://github.com/apache/spark/pull/126#discussion_r10590414
--- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala ---
@@ -1025,6 +1025,14 @@ abstract class RDD[T: ClassTag](
checkpointData.flatMap
Github user markhamstra commented on a diff in the pull request:
https://github.com/apache/spark/pull/126#discussion_r10589028
--- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala ---
@@ -50,23 +50,26 @@ private[spark] class MapOutputTrackerMasterActor(tracker
Github user markhamstra commented on a diff in the pull request:
https://github.com/apache/spark/pull/126#discussion_r10588418
--- Diff: core/src/main/scala/org/apache/spark/scheduler/ShuffleMapTask.scala ---
@@ -17,28 +17,24 @@
package org.apache.spark.scheduler
Github user markhamstra commented on a diff in the pull request:
https://github.com/apache/spark/pull/126#discussion_r10586047
--- Diff: core/src/main/scala/org/apache/spark/scheduler/ShuffleMapTask.scala ---
@@ -17,28 +17,24 @@
package org.apache.spark.scheduler
Github user markhamstra commented on a diff in the pull request:
https://github.com/apache/spark/pull/126#discussion_r10579603
--- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala ---
@@ -0,0 +1,135 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF
Github user markhamstra commented on a diff in the pull request:
https://github.com/apache/spark/pull/126#discussion_r10578363
--- Diff: core/src/main/scala/org/apache/spark/ContextCleaner.scala ---
@@ -0,0 +1,135 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF
Github user markhamstra commented on the pull request:
https://github.com/apache/spark/pull/125#issuecomment-37325883
@srowen Do you mean something other than
```
<groupId>org.apache</groupId>
<artifactId>apache</artifactId>
<version>13</version>
```
...which is part of the maven build and already
Github user markhamstra commented on the pull request:
https://github.com/apache/spark/pull/77#issuecomment-36659947
The standard maven build procedure should be to run `mvn -DskipTests
package` first (which builds the assembly) and then `mvn test`. The "Building
Spark with
Github user markhamstra commented on a diff in the pull request:
https://github.com/apache/spark/pull/63#discussion_r10262224
--- Diff: core/src/main/scala/org/apache/spark/scheduler/WorkerOffer.scala ---
@@ -21,4 +21,4 @@ package org.apache.spark.scheduler
* Represents free
Github user markhamstra commented on a diff in the pull request:
https://github.com/apache/spark/pull/63#discussion_r1026
--- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala ---
@@ -125,14 +126,17 @@ class
Github user markhamstra commented on a diff in the pull request:
https://github.com/apache/spark/pull/63#discussion_r10232174
--- Diff: core/src/main/scala/org/apache/spark/scheduler/WorkerOffer.scala ---
@@ -21,4 +21,6 @@ package org.apache.spark.scheduler
* Represents free
Github user markhamstra commented on the pull request:
https://github.com/apache/spark/pull/51#issuecomment-36443418
Ah, I see. create-release.sh was handled in another PR.
Github user markhamstra commented on the pull request:
https://github.com/apache/spark/pull/51#issuecomment-36442877
Looks good. The only remaining incubat* I find are in
dev/create-release/create-release.sh, but I'm not sure how you use that script.
Github user markhamstra commented on the pull request:
https://github.com/apache/spark/pull/27#issuecomment-36277558
I see two issues: 1) The deterministic nature of the current scheduler
places tasks on the same small set of machines while leaving others largely
unused; 2) There is
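(For the first issue, a minimal sketch of how randomizing resource offers can spread tasks across machines; `SimpleOffer` and `scheduleRound` are illustrative names, not Spark's actual scheduler API.)
```scala
import scala.util.Random

// Hypothetical stand-in for a resource offer from one executor.
case class SimpleOffer(executorId: String, freeCores: Int)

// Shuffling the offers before each scheduling round breaks the deterministic
// ordering that would otherwise keep placing tasks on the same few machines.
def scheduleRound(offers: Seq[SimpleOffer], tasks: Seq[String]): Seq[(String, String)] = {
  val slots = Random.shuffle(offers).flatMap(o => Seq.fill(o.freeCores)(o.executorId))
  tasks.zip(slots) // pairs each task with a slot; leftover tasks wait for the next round
}
```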
Github user markhamstra commented on the pull request:
https://github.com/apache/spark/pull/6#issuecomment-36267797
This broke the maven build. Also, both SBT and Maven are still building
artifacts with "incubating".
[ERROR] The project org.apache.spark:spark
Github user markhamstra commented on the pull request:
https://github.com/apache/spark/pull/26#issuecomment-36226183
Yes, they definitely have value, but putting them directly into Spark also
has costs and imposes responsibilities on the maintainers. The question is how
to get the
Github user markhamstra commented on the pull request:
https://github.com/apache/spark/pull/26#issuecomment-36224757
I'm bothered by the idea of vagrant, docker, ec2, and potentially other
virtualization and cloud environments (EMR, etc.) all becoming supported and
maintained