Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/159#discussion_r10640471
--- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -533,8 +575,11 @@ private[spark] class TaskSetManager
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/159#discussion_r10640410
--- Diff: core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSuite.scala ---
@@ -298,6 +298,94 @@ class TaskSetManagerSuite extends FunSuite
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/159#discussion_r10640396
--- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -533,8 +575,11 @@ private[spark] class TaskSetManager
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/159#discussion_r10640377
--- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -228,12 +239,18 @@ private[spark] class TaskSetManager
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/147#discussion_r10637306
--- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala ---
@@ -35,13 +35,21 @@ private[spark] case class GetMapOutputStatuses(shuffleId
Github user kayousterhout commented on the pull request:
https://github.com/apache/spark/pull/143#issuecomment-37711356
I know @pwendell has expressed concern about config option bloat so maybe
he has an opinion here...I would be in favor of not adding a config option
because it
Github user kayousterhout commented on the pull request:
https://github.com/apache/spark/pull/146#issuecomment-37711109
The examples that you added are awesome!!!
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/146#discussion_r10632433
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/ExpressionEvaluationSuite.scala ---
@@ -0,0 +1,115 @@
+/*
+ * Licensed to
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/146#discussion_r10632425
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/DistributionSuite.scala ---
@@ -0,0 +1,174 @@
+/*
+ * Licensed to the
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/146#discussion_r10632329
--- Diff: examples/src/main/scala/org/apache/spark/sql/examples/HiveFromSpark.scala ---
@@ -0,0 +1,65 @@
+/*
+ * Licensed to the Apache Software
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/146#discussion_r10632314
--- Diff: bin/compute-classpath.sh ---
@@ -33,23 +33,43 @@ fi
# Build up classpath
CLASSPATH="$SPARK_CLASSPATH:$FWDIR
Github user kayousterhout commented on the pull request:
https://github.com/apache/spark/pull/143#issuecomment-37686977
Ah sorry I didn't see that clean() gets called when the RDD is created and
not just when the job is submitted. I think the check in DAGScheduler should
be re
Github user kayousterhout commented on the pull request:
https://github.com/apache/spark/pull/143#issuecomment-37685303
I'm not sure this fixes the problem Reynold was referring to in his pull
request. If you look in DAGScheduler.scala, on line 773, it does essentially
the
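Below is a hedged sketch of the general idea discussed in the two comments above on proactive closure serialization (a generic helper with hypothetical names, not SparkContext.clean, ClosureCleaner, or the DAGScheduler check): attempting to serialize the closure as soon as the RDD is created makes a NotSerializableException surface immediately rather than at job submission.

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}

// Generic illustration of a proactive serializability check (hypothetical
// helper, not the actual Spark code): try to serialize the closure as soon
// as it is handed to us, so a NotSerializableException shows up at RDD
// creation rather than when the job is submitted.
object ProactiveCheckSketch {
  def ensureSerializable(closure: AnyRef): Unit = {
    val out = new ObjectOutputStream(new ByteArrayOutputStream())
    try out.writeObject(closure)   // throws java.io.NotSerializableException early
    finally out.close()
  }

  def main(args: Array[String]): Unit = {
    val ok = (x: Int) => x + 1     // Scala function literals are serializable
    ensureSerializable(ok)
    println("closure serialized fine")
  }
}
```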
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/143#discussion_r10622197
--- Diff: core/src/test/scala/org/apache/spark/serializer/ProactiveClosureSerializationSuite.scala ---
@@ -0,0 +1,79 @@
+/*
+ * Licensed to the
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/143#discussion_r10622183
--- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/DStream.scala ---
@@ -533,7 +533,7 @@ abstract class DStream[T: ClassTag
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/101#discussion_r10548780
--- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala ---
@@ -165,12 +174,29 @@ class HadoopRDD[K, V](
override def compute
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/42#discussion_r10458444
--- Diff: core/src/main/scala/org/apache/spark/ui/SparkUI.scala ---
@@ -69,19 +100,55 @@ private[spark] class SparkUI(sc: SparkContext) extends Logging
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/42#discussion_r10458236
--- Diff: core/src/main/scala/org/apache/spark/ui/jobs/JobProgressListener.scala ---
@@ -30,16 +32,23 @@ import org.apache.spark.scheduler
Github user kayousterhout commented on the pull request:
https://github.com/apache/spark/pull/42#issuecomment-37057167
Jenkins, retest this please
Github user kayousterhout commented on the pull request:
https://github.com/apache/spark/pull/42#issuecomment-37057138
Maybe try rebasing on master? It looks like the same error has appeared on
other PRs in the past (e.g., https://github.com/apache/spark/pull/85). It's also
possible
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/85#discussion_r10369848
--- Diff: core/src/main/scala/org/apache/spark/util/AkkaUtils.scala ---
@@ -108,6 +108,6 @@ private[spark] object AkkaUtils {
/** Returns the
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/42#discussion_r10369800
--- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ---
@@ -978,6 +971,11 @@ class DAGScheduler(
logDebug("Addit
Github user kayousterhout commented on the pull request:
https://github.com/apache/spark/pull/42#issuecomment-36943803
I think it would be good if the persisted UI noted somewhere that the
associated application is dead -- maybe you could add this in ui/UIUtils.scala,
in the header
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/42#discussion_r10363587
--- Diff: core/src/main/scala/org/apache/spark/ui/UISparkListener.scala ---
@@ -0,0 +1,123 @@
+/*
+ * Licensed to the Apache Software Foundation
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/42#discussion_r10363478
--- Diff: core/src/main/scala/org/apache/spark/executor/TaskMetrics.scala ---
@@ -68,6 +70,11 @@ class TaskMetrics extends Serializable {
* here
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/42#discussion_r10363343
--- Diff: core/src/main/scala/org/apache/spark/ui/jobs/JobProgressListener.scala ---
@@ -30,16 +32,23 @@ import org.apache.spark.scheduler
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/42#discussion_r10363140
--- Diff: core/src/main/scala/org/apache/spark/ui/jobs/JobProgressListener.scala ---
@@ -53,27 +62,28 @@ private[spark] class JobProgressListener(val sc
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/42#discussion_r10362915
--- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ---
@@ -978,6 +971,11 @@ class DAGScheduler(
logDebug("Addit
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/42#discussion_r10362811
--- Diff: core/src/main/scala/org/apache/spark/scheduler/SparkListener.scala ---
@@ -19,33 +19,43 @@ package org.apache.spark.scheduler
import
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/42#discussion_r10361732
--- Diff: core/src/main/scala/org/apache/spark/ui/jobs/JobProgressListener.scala ---
@@ -30,16 +32,23 @@ import org.apache.spark.scheduler
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/42#discussion_r10361689
--- Diff: core/src/main/scala/org/apache/spark/ui/jobs/JobProgressListener.scala ---
@@ -30,16 +32,23 @@ import org.apache.spark.scheduler
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/42#discussion_r10361549
--- Diff: core/src/main/scala/org/apache/spark/ui/UIReloader.scala ---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/42#discussion_r10361480
--- Diff: core/src/main/scala/org/apache/spark/ui/SparkUI.scala ---
@@ -27,28 +28,58 @@ import org.apache.spark.ui.jobs.JobProgressUI
import
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/42#discussion_r10361374
--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala ---
@@ -625,6 +653,30 @@ private[spark] class Master(host: String, port: Int
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/42#discussion_r10361035
--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala ---
@@ -50,6 +51,9 @@ private[spark] class Master(host: String, port: Int
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/42#discussion_r10360999
--- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala ---
@@ -50,6 +51,9 @@ private[spark] class Master(host: String, port: Int
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/42#discussion_r10360902
--- Diff: core/src/main/scala/org/apache/spark/ui/UIReloader.scala ---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/42#discussion_r10360822
--- Diff: core/src/main/scala/org/apache/spark/ui/UISparkListener.scala ---
@@ -0,0 +1,123 @@
+/*
+ * Licensed to the Apache Software Foundation
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/42#discussion_r10360310
--- Diff: core/src/main/scala/org/apache/spark/scheduler/EventBus.scala ---
@@ -0,0 +1,77 @@
+/*
+ * Licensed to the Apache Software Foundation
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/42#discussion_r10360102
--- Diff: core/src/main/scala/org/apache/spark/ui/SparkUI.scala ---
@@ -69,19 +100,55 @@ private[spark] class SparkUI(sc: SparkContext) extends Logging
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/42#discussion_r10357808
--- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ---
@@ -194,10 +148,46 @@ class DAGScheduler
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/42#discussion_r10357702
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManagerRegistrationListener.scala ---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/42#discussion_r10357634
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManagerMasterActor.scala ---
@@ -50,6 +50,8 @@ class BlockManagerMasterActor(val isLocal
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/42#discussion_r10357546
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockManagerRegistrationListener.scala ---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the
Github user kayousterhout commented on the pull request:
https://github.com/apache/spark/pull/42#issuecomment-36929100
Jenkins, retest this please
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/42#discussion_r10356197
--- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ---
@@ -32,7 +32,7 @@ import org.apache.spark.executor.TaskMetrics
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/42#discussion_r10356042
--- Diff: core/src/main/scala/org/apache/spark/SparkEnv.scala ---
@@ -164,9 +167,18 @@ object SparkEnv extends Logging
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/85#discussion_r10355654
--- Diff: core/src/main/scala/org/apache/spark/util/AkkaUtils.scala ---
@@ -108,6 +108,6 @@ private[spark] object AkkaUtils {
/** Returns the
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/85#discussion_r10355530
--- Diff: repl/src/main/scala/org/apache/spark/repl/ExecutorClassLoader.scala ---
@@ -33,7 +33,7 @@ import org.objectweb.asm.Opcodes._
* used to
Github user kayousterhout commented on the pull request:
https://github.com/apache/spark/pull/63#issuecomment-36799644
This looks good -- I've merged this into master.
Github user kayousterhout commented on the pull request:
https://github.com/apache/spark/pull/63#issuecomment-36653243
This new version of the change doesn't look any simpler to me than the
current version of the code, and I think it is a slightly confusing way of using
worker offe
Github user kayousterhout commented on the pull request:
https://github.com/apache/spark/pull/62#issuecomment-36577709
I've merged this into master.
GitHub user kayousterhout opened a pull request:
https://github.com/apache/spark/pull/69
Remove broken/unused Connection.getChunkFIFO method.
This method appears to be broken -- since it never removes
anything from messages, and it adds new messages to it,
the while loop is
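Below is a minimal, runnable sketch of the loop shape the PR description points at (hypothetical names and a deliberately simplified queue, not the removed Connection.getChunkFIFO code): messages are appended but never removed, so the loop's emptiness check can never turn false again.

```scala
import scala.collection.mutable.Queue

// Hypothetical simplification of the bug described above, not the actual
// Connection code: the loop keeps appending to `messages` and never dequeues,
// so `messages.nonEmpty` stays true once anything has been enqueued.
object ChunkFifoSketch {
  def main(args: Array[String]): Unit = {
    val messages = Queue[String]()
    messages += "first"

    var iterations = 0
    // An iteration cap keeps this demo terminating; the original loop shape
    // (without the cap) would spin forever here.
    while (messages.nonEmpty && iterations < 5) {
      messages += s"msg-$iterations" // new messages are added...
      iterations += 1                // ...but nothing is ever removed
    }
    println(s"queue still holds ${messages.size} messages after $iterations iterations")
  }
}
```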
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/63#discussion_r10230444
--- Diff: core/src/main/scala/org/apache/spark/scheduler/WorkerOffer.scala ---
@@ -21,4 +21,6 @@ package org.apache.spark.scheduler
* Represents
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/63#discussion_r10230140
--- Diff: core/src/main/scala/org/apache/spark/scheduler/WorkerOffer.scala ---
@@ -21,4 +21,6 @@ package org.apache.spark.scheduler
* Represents
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/43#discussion_r10227330
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockFetcherIterator.scala ---
@@ -146,6 +146,12 @@ object BlockFetcherIterator
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/43#discussion_r10225873
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockFetcherIterator.scala ---
@@ -146,6 +146,12 @@ object BlockFetcherIterator
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/43#discussion_r10225217
--- Diff: core/src/main/scala/org/apache/spark/storage/DiskStore.scala ---
@@ -84,12 +84,27 @@ private class DiskStore(blockManager: BlockManager
Github user kayousterhout commented on the pull request:
https://github.com/apache/spark/pull/62#issuecomment-36483058
Unfortunately this isn't very useful for getting network bandwidth...if you
consider a simple case where two shuffle reads (for one task) occur
simultane
GitHub user kayousterhout opened a pull request:
https://github.com/apache/spark/pull/62
Remove the remoteFetchTime metric.
This metric is confusing: it adds up all of the time to fetch
shuffle inputs, but fetches often happen in parallel, so
remoteFetchTime can be much
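A small sketch with made-up numbers (illustration only, not Spark's metrics code) shows the point made in this PR and in the comment above: when two shuffle reads for one task run in parallel, summing their durations roughly doubles the apparent time compared to the wall-clock time the task actually waited, so the summed metric cannot be used to infer network bandwidth.

```scala
// Illustrative numbers only: two shuffle reads start at the same time and
// each take 5 seconds. Summing per-fetch durations reports 10 s, while the
// task only waited 5 s of wall-clock time.
object RemoteFetchTimeSketch {
  final case class Fetch(startMs: Long, endMs: Long)

  def main(args: Array[String]): Unit = {
    val fetches = Seq(Fetch(0L, 5000L), Fetch(0L, 5000L))

    val summedMs    = fetches.map(f => f.endMs - f.startMs).sum              // 10000
    val wallClockMs = fetches.map(_.endMs).max - fetches.map(_.startMs).min  // 5000

    println(s"summed fetch time:  $summedMs ms")
    println(s"wall-clock waiting: $wallClockMs ms")
  }
}
```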
GitHub user kayousterhout opened a pull request:
https://github.com/apache/spark/pull/61
Removed accidentally checked in comment
It looks like this comment was added a while ago by @mridulm as part of a
merge and was accidentally checked in. We should remove it.
You can merge
Github user kayousterhout commented on the pull request:
https://github.com/apache/spark/pull/27#issuecomment-36433891
I've merged this into master
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/42#discussion_r10183222
--- Diff: core/src/main/scala/org/apache/spark/scheduler/JobLogger.scala ---
@@ -339,8 +217,8 @@ class JobLogger(val user: String, val logDirName: String
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/42#discussion_r10175006
--- Diff: core/src/main/scala/org/apache/spark/scheduler/JobLogger.scala ---
@@ -339,8 +217,8 @@ class JobLogger(val user: String, val logDirName: String
Github user kayousterhout commented on the pull request:
https://github.com/apache/spark/pull/39#issuecomment-36323841
Thanks @pwendell -- fixed the style!
Github user kayousterhout commented on a diff in the pull request:
https://github.com/apache/spark/pull/39#discussion_r10155351
--- Diff: core/src/main/scala/org/apache/spark/storage/BlockFetcherIterator.scala ---
@@ -233,7 +238,16 @@ object BlockFetcherIterator
GitHub user kayousterhout opened a pull request:
https://github.com/apache/spark/pull/39
Remove BlockFetchTracker trait
This trait seems to have been created a while ago when there
were multiple implementations; now that there's just one, I think it
makes sense to mer
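For illustration only (hypothetical names, not the actual BlockFetchTracker or BlockFetcherIterator code), the kind of refactor proposed here is folding a single-implementation trait's members into the one class that still implements it:

```scala
// Hypothetical names, not the actual Spark code: when a trait is left with a
// single implementation, the indirection can be dropped and its members moved
// into that class directly.

// Before: a trait kept around from when several implementations existed.
trait FetchTracker {
  def totalBlocks: Int
  def remoteBytesRead: Long
}
class BasicFetcherBefore(val totalBlocks: Int, val remoteBytesRead: Long) extends FetchTracker

// After: the lone implementation absorbs the trait's members.
class BasicFetcherAfter(val totalBlocks: Int, val remoteBytesRead: Long)
```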
GitHub user kayousterhout opened a pull request:
https://github.com/apache/spark/pull/27
[SPARK-979] Randomize order of offers.
This commit randomizes the order of resource offers to avoid scheduling
all tasks on the same small set of machines.
This is a much simpler
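A hedged sketch of the idea in SPARK-979 (illustrative names, not the actual scheduler code): shuffling the resource offers before handing out tasks keeps a greedy assignment from always filling the first offers in the list and concentrating work on a few hosts.

```scala
import scala.util.Random

// Illustrative sketch of randomizing resource offers; the Offer case class
// and assign() helper are hypothetical, not Spark's TaskSchedulerImpl.
object RandomizeOffersSketch {
  final case class Offer(executorId: String, host: String, freeCores: Int)

  def assign(offers: Seq[Offer], tasks: Seq[String]): Seq[(String, String)] = {
    val shuffled = Random.shuffle(offers)          // randomize offer order
    tasks.zip(shuffled).map { case (task, offer) => task -> offer.host }
  }

  def main(args: Array[String]): Unit = {
    val offers = Seq(Offer("1", "hostA", 4), Offer("2", "hostB", 4), Offer("3", "hostC", 4))
    println(assign(offers, Seq("task0", "task1")))
  }
}
```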