[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/42#issuecomment-37718878 Build finished. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this fea

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/42#issuecomment-37718877 Build finished.

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/42#issuecomment-37718879 One or more automated tests failed Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13191/

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/42#issuecomment-37718880 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13192/

[GitHub] spark pull request: Spark 1246 add min max to stat counter

2014-03-14 Thread dwmclary
Github user dwmclary commented on the pull request: https://github.com/apache/spark/pull/144#issuecomment-37718637 @mateiz OK, should be good to go now.

[GitHub] spark pull request: [SPARK-1244] Throw exception if map output sta...

2014-03-14 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/147#discussion_r10633644 --- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala --- @@ -35,13 +35,21 @@ private[spark] case class GetMapOutputStatuses(shuffleId: Int)

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/42#issuecomment-37718089 Build triggered.

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/42#issuecomment-37718090 Build started.

[GitHub] spark pull request: [SPARK-1244] Throw exception if map output sta...

2014-03-14 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/147#discussion_r10633600 --- Diff: core/src/test/scala/org/apache/spark/MapOutputTrackerSuite.scala --- @@ -136,4 +123,47 @@ class MapOutputTrackerSuite extends FunSuite with LocalSp

[GitHub] spark pull request: Fix serialization of MutablePair. Also provide...

2014-03-14 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/141#issuecomment-37717834 @marmbrus mind closing this? Somehow GitHub didn't detect the close id correctly.

[GitHub] spark pull request: Fix serialization of MutablePair. Also provide...

2014-03-14 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/141#issuecomment-37717648 I did: https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=e19044cb1048c3755d1ea2cb43879d2225d49b54

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-14 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/42#discussion_r10633506 --- Diff: core/src/main/scala/org/apache/spark/ui/UIReloader.scala --- @@ -0,0 +1,50 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/42#issuecomment-37717261 Build triggered.

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/42#issuecomment-37717262 Build started.

[GitHub] spark pull request: [SPARK-1244] Throw exception if map output sta...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/147#issuecomment-37717211 Merged build finished.

[GitHub] spark pull request: [SPARK-1244] Throw exception if map output sta...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/147#issuecomment-37717212 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13190/

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-14 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/42#discussion_r10633389 --- Diff: core/src/main/scala/org/apache/spark/scheduler/JobLogger.scala --- @@ -80,187 +81,78 @@ class JobLogger(val user: String, val logDirName: String)

[GitHub] spark pull request: [SPARK-1244] Throw exception if map output sta...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/147#issuecomment-37716333 Merged build started.

[GitHub] spark pull request: [SPARK-1244] Throw exception if map output sta...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/147#issuecomment-37716332 Merged build triggered.

[GitHub] spark pull request: Fix serialization of MutablePair. Also provide...

2014-03-14 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/141#issuecomment-37716246 @rxin did you actually merge this?

[GitHub] spark pull request: SPARK-1251 Support for optimizing and executin...

2014-03-14 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/146#issuecomment-37716174 I think the example in the PR description should be `case class Person(name: String, age: Int)`; otherwise there is a casting error.

[GitHub] spark pull request: [SPARK-1244] Throw exception if map output sta...

2014-03-14 Thread andrewor14
GitHub user andrewor14 opened a pull request: https://github.com/apache/spark/pull/147 [SPARK-1244] Throw exception if map output status exceeds frame size In the existing code, this fails silently... You can merge this pull request into a Git repository by running: $ git pull

Re: test cases stuck on "local-cluster mode" of ReplSuite?

2014-03-14 Thread Nan Zhu
Yeah, I tested that. My SPARK_HOME pointed to a very old location; after I fixed that, everything works. Thank you so much for pointing this out. Best, -- Nan Zhu On Friday, March 14, 2014 at 6:41 PM, Michael Armbrust wrote: > Sorry to revive an old thread, but I just ran into thi

[GitHub] spark pull request: Spark 1246 add min max to stat counter

2014-03-14 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/144#issuecomment-37714253 Yeah sorry, I didn't mean leave out max and min from StatCounter, I just meant that the RDD.max() and RDD.min() methods should directly call reduce. If you're calling those
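A rough sketch of the trade-off discussed in this thread, written in plain Python rather than against Spark itself: a max or min computed with a pairwise reduce works for any ordered values, while a one-pass statistics accumulator only makes sense for numbers. The names `max_by_reduce`, `min_by_reduce`, and `RunningStats` are hypothetical and are not part of Spark or of this pull request.

```python
from functools import reduce
import math


def max_by_reduce(values):
    """Largest element of a non-empty iterable, found with a pairwise reduce."""
    return reduce(lambda a, b: a if a >= b else b, values)


def min_by_reduce(values):
    """Smallest element of a non-empty iterable, found with a pairwise reduce."""
    return reduce(lambda a, b: a if a <= b else b, values)


class RunningStats:
    """Hypothetical numeric-only accumulator: count, sum, min, max in one pass."""

    def __init__(self):
        self.count = 0
        self.total = 0.0
        self.minimum = math.inf
        self.maximum = -math.inf

    def add(self, x):
        # Fold one number into the running statistics and return self
        # so the accumulator can be threaded through reduce().
        self.count += 1
        self.total += x
        self.minimum = min(self.minimum, x)
        self.maximum = max(self.maximum, x)
        return self

    @property
    def mean(self):
        return self.total / self.count
```

The reduce-based helpers accept strings or any other comparable values, whereas the accumulator is useful when several numeric statistics are wanted from a single pass over the data.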

[GitHub] spark pull request: Spark 1246 add min max to stat counter

2014-03-14 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/144#discussion_r10633009 --- Diff: python/pyspark/rdd.py --- @@ -534,7 +534,26 @@ def func(iterator): return reduce(op, vals, zeroValue) # TODO: aggregate

[GitHub] spark pull request: Spark 1246 add min max to stat counter

2014-03-14 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/144#discussion_r10633006 --- Diff: core/src/main/scala/org/apache/spark/rdd/RDD.scala --- @@ -958,6 +958,10 @@ abstract class RDD[T: ClassTag]( */ def takeOrdered(num: Int

[GitHub] spark pull request: Spark 1246 add min max to stat counter

2014-03-14 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/144#discussion_r10633001 --- Diff: core/src/test/scala/org/apache/spark/PartitioningSuite.scala --- @@ -171,6 +171,8 @@ class PartitioningSuite extends FunSuite with SharedSparkContext

[GitHub] spark pull request: Spark 1246 add min max to stat counter

2014-03-14 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/144#discussion_r10633002 --- Diff: core/src/main/scala/org/apache/spark/api/java/JavaRDDLike.scala --- @@ -477,6 +477,16 @@ trait JavaRDDLike[T, This <: JavaRDDLike[T, This]] extends S

Re: [DISCUSS] Necessity of Maven *and* SBT Build in Spark

2014-03-14 Thread Matei Zaharia
I like the pom-reader approach as well — in particular, that it lets you add extra stuff in your SBT build after loading the dependencies from the POM. Profiles would be the one thing missing to be able to pass options through. Matei On Mar 14, 2014, at 10:03 AM, Patrick Wendell wrote: > Hey

[GitHub] spark pull request: SPARK-1254. Consolidate, order, and harmonize ...

2014-03-14 Thread witgo
Github user witgo commented on the pull request: https://github.com/apache/spark/pull/145#issuecomment-37713958 Well done, the PR can fix [SPARK-1248](https://spark-project.atlassian.net/browse/SPARK-1248)

[GitHub] spark pull request: SPARK-1246, added min max API to Double RDDs i...

2014-03-14 Thread ScrapCodes
Github user ScrapCodes commented on a diff in the pull request: https://github.com/apache/spark/pull/140#discussion_r10632880 --- Diff: core/src/main/scala/org/apache/spark/rdd/DoubleRDDFunctions.scala --- @@ -86,14 +92,9 @@ class DoubleRDDFunctions(self: RDD[Double]) extends Logg

[GitHub] spark pull request: Spark 1246 add min max to stat counter

2014-03-14 Thread ScrapCodes
Github user ScrapCodes commented on the pull request: https://github.com/apache/spark/pull/144#issuecomment-37712645 Ahh, I understand the downside: that would be just for numbers then. Makes sense. Maybe we can have both?

[GitHub] spark pull request: SPARK-1246, added min max API to Double RDDs i...

2014-03-14 Thread ScrapCodes
Github user ScrapCodes commented on a diff in the pull request: https://github.com/apache/spark/pull/140#discussion_r10632860 --- Diff: project/build.properties --- @@ -14,4 +14,4 @@ # See the License for the specific language governing permissions and # limitations under

[GitHub] spark pull request: Spark 1246 add min max to stat counter

2014-03-14 Thread ScrapCodes
Github user ScrapCodes commented on the pull request: https://github.com/apache/spark/pull/144#issuecomment-37712447 Hey Matei, For a large dataset someone might want to do it once; with the stat counter, all of the numbers are calculated in one go.

[GitHub] spark pull request: SPARK-1251 Support for optimizing and executin...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/146#issuecomment-37712122 Merged build finished.

[GitHub] spark pull request: SPARK-1251 Support for optimizing and executin...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/146#issuecomment-37712123 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13189/

[GitHub] spark pull request: SPARK-897: preemptively serialize closures

2014-03-14 Thread kayousterhout
Github user kayousterhout commented on the pull request: https://github.com/apache/spark/pull/143#issuecomment-37711356 I know @pwendell has expressed concern about config option bloat so maybe he has an opinion here...I would be in favor of not adding a config option because it's a r

[GitHub] spark pull request: SPARK-1251 Support for optimizing and executin...

2014-03-14 Thread kayousterhout
Github user kayousterhout commented on the pull request: https://github.com/apache/spark/pull/146#issuecomment-37711109 The examples that you added are awesome!!!

[GitHub] spark pull request: SPARK-1251 Support for optimizing and executin...

2014-03-14 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/146#discussion_r10632433 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/ExpressionEvaluationSuite.scala --- @@ -0,0 +1,115 @@ +/* + * Licensed to th

[GitHub] spark pull request: SPARK-1251 Support for optimizing and executin...

2014-03-14 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/146#discussion_r10632425 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/DistributionSuite.scala --- @@ -0,0 +1,174 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: SPARK-1251 Support for optimizing and executin...

2014-03-14 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/146#discussion_r10632329 --- Diff: examples/src/main/scala/org/apache/spark/sql/examples/HiveFromSpark.scala --- @@ -0,0 +1,65 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request: SPARK-1251 Support for optimizing and executin...

2014-03-14 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/146#discussion_r10632314 --- Diff: bin/compute-classpath.sh --- @@ -33,23 +33,43 @@ fi # Build up classpath CLASSPATH="$SPARK_CLASSPATH:$FWDIR/conf" +# Support

[GitHub] spark pull request: SPARK-1251 Support for optimizing and executin...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/146#issuecomment-37710497 Merged build triggered.

[GitHub] spark pull request: SPARK-1251 Support for optimizing and executin...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/146#issuecomment-37710500 Merged build started.

[GitHub] spark pull request: SPARK-1242 Add aggregate to python rdd

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/139#issuecomment-37710423 Merged build finished.

[GitHub] spark pull request: SPARK-1251 Support for optimizing and executin...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/146#issuecomment-37710426 One or more automated tests failed Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13187/

[GitHub] spark pull request: SPARK-1251 Support for optimizing and executin...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/146#issuecomment-37710424 Merged build finished.

[GitHub] spark pull request: SPARK-1242 Add aggregate to python rdd

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/139#issuecomment-37710425 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13188/

[GitHub] spark pull request: SPARK-1251 Support for optimizing and executin...

2014-03-14 Thread aarondav
Github user aarondav commented on a diff in the pull request: https://github.com/apache/spark/pull/146#discussion_r10631616 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/SqlParser.scala --- @@ -0,0 +1,328 @@ +/* + * Licensed to the Apache Software Foun

[GitHub] spark pull request: SPARK-1242 Add aggregate to python rdd

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/139#issuecomment-37708361 Merged build triggered.

[GitHub] spark pull request: SPARK-1242 Add aggregate to python rdd

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/139#issuecomment-37708362 Merged build started.

[GitHub] spark pull request: SPARK-1251 Support for optimizing and executin...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/146#issuecomment-37708046 Merged build started.

[GitHub] spark pull request: SPARK-1251 Support for optimizing and executin...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/146#issuecomment-37708044 Merged build triggered.

[GitHub] spark pull request: SPARK-1251 Support for optimizing and executin...

2014-03-14 Thread marmbrus
GitHub user marmbrus opened a pull request: https://github.com/apache/spark/pull/146 SPARK-1251 Support for optimizing and executing structured queries This pull request adds support to Spark for working with structured data using a simple SQL dialect, HiveQL and a Scala Query DSL.

Re: test cases stuck on "local-cluster mode" of ReplSuite?

2014-03-14 Thread Michael Armbrust
Sorry to revive an old thread, but I just ran into this issue myself. It is likely that you do not have the assembly jar built, or that you have SPARK_HOME set incorrectly (it does not need to be set). Michael On Thu, Feb 27, 2014 at 8:13 AM, Nan Zhu wrote: > Hi, all > > Actually this problem

[GitHub] spark pull request: SPARK-1240: handle the case of empty RDD when ...

2014-03-14 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/135#issuecomment-37704961 @mateiz, done~

[GitHub] spark pull request: SPARK-1240: handle the case of empty RDD when ...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/135#issuecomment-37704877 Merged build finished.

[GitHub] spark pull request: SPARK-1240: handle the case of empty RDD when ...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/135#issuecomment-37704879 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13186/

[GitHub] spark pull request: SPARK-1240: handle the case of empty RDD when ...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/135#issuecomment-37703031 Merged build triggered.

[GitHub] spark pull request: SPARK-1240: handle the case of empty RDD when ...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/135#issuecomment-37703032 Merged build started.

[GitHub] spark pull request: SPARK-1254. Consolidate, order, and harmonize ...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/145#issuecomment-37702868 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13185/

[GitHub] spark pull request: SPARK-1254. Consolidate, order, and harmonize ...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/145#issuecomment-37702867 Merged build finished.

[GitHub] spark pull request: SPARK-1254. Consolidate, order, and harmonize ...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/145#issuecomment-37699806 Merged build started.

[GitHub] spark pull request: SPARK-1254. Consolidate, order, and harmonize ...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/145#issuecomment-37699807 Merged build triggered.

[GitHub] spark pull request: SPARK-1254. Consolidate, order, and harmonize ...

2014-03-14 Thread srowen
GitHub user srowen opened a pull request: https://github.com/apache/spark/pull/145 SPARK-1254. Consolidate, order, and harmonize repository declarations in Maven/SBT builds This suggestion addresses a few minor suboptimalities with how repositories are handled. 1) Use HTTP

[GitHub] spark pull request: Spark 1246 add min max to stat counter

2014-03-14 Thread dwmclary
Github user dwmclary commented on the pull request: https://github.com/apache/spark/pull/144#issuecomment-37694293 Matei, I updated the branch to do just that. Thanks for the review!

[GitHub] spark pull request: SPARK-897: preemptively serialize closures

2014-03-14 Thread willb
Github user willb commented on the pull request: https://github.com/apache/spark/pull/143#issuecomment-37693355 A configuration option makes sense to me and I'm happy to add it. Let me know if you have strong feelings about what it should be called.

[GitHub] spark pull request: SPARK-897: preemptively serialize closures

2014-03-14 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/143#issuecomment-37691023 I was thinking maybe we want a config option for this - which is on by default, but can be turned off. What do you guys think?

[GitHub] spark pull request: SPARK-897: preemptively serialize closures

2014-03-14 Thread willb
Github user willb commented on the pull request: https://github.com/apache/spark/pull/143#issuecomment-37690238 Here's what I was thinking about that: I left the check in `DAGScheduler` in place because preemptive checking is optional (and indeed not done everywhere) and it seems lik

[GitHub] spark pull request: Fix serialization of MutablePair. Also provide...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/141#issuecomment-37687238 Merged build finished.

[GitHub] spark pull request: Fix serialization of MutablePair. Also provide...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/141#issuecomment-37687240 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13184/

[GitHub] spark pull request: SPARK-897: preemptively serialize closures

2014-03-14 Thread kayousterhout
Github user kayousterhout commented on the pull request: https://github.com/apache/spark/pull/143#issuecomment-37686977 Ah sorry I didn't see that clean() gets called when the RDD is created and not just when the job is submitted. I think the check in DAGScheduler should be removed n

[GitHub] spark pull request: SPARK-897: preemptively serialize closures

2014-03-14 Thread willb
Github user willb commented on the pull request: https://github.com/apache/spark/pull/143#issuecomment-37686459 Yes, my understanding of SPARK-897 is that the issue is ensuring serializability errors are reported to the user as soon as possible. And essentially what these commits do

[GitHub] spark pull request: SPARK-897: preemptively serialize closures

2014-03-14 Thread willb
Github user willb commented on a diff in the pull request: https://github.com/apache/spark/pull/143#discussion_r10622998 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/DStream.scala --- @@ -533,7 +533,7 @@ abstract class DStream[T: ClassTag] ( * on ea

[GitHub] spark pull request: SPARK-897: preemptively serialize closures

2014-03-14 Thread kayousterhout
Github user kayousterhout commented on the pull request: https://github.com/apache/spark/pull/143#issuecomment-37685303 I'm not sure this fixes the problem Reynold was referring to in his pull request. If you look in DAGScheduler.scala, on line 773, it does essentially the same thing

[GitHub] spark pull request: SPARK-897: preemptively serialize closures

2014-03-14 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/143#discussion_r10622197 --- Diff: core/src/test/scala/org/apache/spark/serializer/ProactiveClosureSerializationSuite.scala --- @@ -0,0 +1,79 @@ +/* + * Licensed to the

[GitHub] spark pull request: SPARK-897: preemptively serialize closures

2014-03-14 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/143#discussion_r10622183 --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/DStream.scala --- @@ -533,7 +533,7 @@ abstract class DStream[T: ClassTag] (

Re: cloudera repo down again - mqtt

2014-03-14 Thread Sean Owen
PS the Cloudera cert issue was cleared up a few hours ago; give it a spin. On Fri, Mar 14, 2014 at 8:22 AM, Sean Owen wrote: > Yes, I'm using Maven 3.2.1. Actually, scratch that, it fails for me too > once it gets down into the MQTT module, with a clearer error: > > sun.security.validator.Valida

[GitHub] spark pull request: Fix serialization of MutablePair. Also provide...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/141#issuecomment-37681668 Merged build triggered.

[GitHub] spark pull request: Fix serialization of MutablePair. Also provide...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/141#issuecomment-37681669 Merged build started.

[GitHub] spark pull request: Spark 1246 add min max to stat counter

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/144#issuecomment-37681660 Can one of the admins verify this patch?

[GitHub] spark pull request: Fix serialization of MutablePair. Also provide...

2014-03-14 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/141#issuecomment-37681568 Merged.

[GitHub] spark pull request: SPARK-897: preemptively serialize closures

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/143#issuecomment-37681460 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13183/

[GitHub] spark pull request: Fix serialization of MutablePair. Also provide...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/141#issuecomment-37681461 Merged build finished.

[GitHub] spark pull request: SPARK-897: preemptively serialize closures

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/143#issuecomment-37681456 Merged build finished.

[GitHub] spark pull request: Fix serialization of MutablePair. Also provide...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/141#issuecomment-37681462 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13182/

[GitHub] spark pull request: Don't swallow all kryo errors, only those that...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/142#issuecomment-37681452 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13181/

[GitHub] spark pull request: Don't swallow all kryo errors, only those that...

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/142#issuecomment-37681451 Merged build finished.

[GitHub] spark pull request: SPARK-1240: handle the case of empty RDD when ...

2014-03-14 Thread CodingCat
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/135#issuecomment-37680371 sure, will do that this evening~

[GitHub] spark pull request: SPARK-1240: handle the case of empty RDD when ...

2014-03-14 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/135#issuecomment-37679595 Can you check whether this is broken in Python too, and fix it there as well?

[GitHub] spark pull request: Fix serialization of MutablePair. Also provide...

2014-03-14 Thread marmbrus
Github user marmbrus commented on the pull request: https://github.com/apache/spark/pull/141#issuecomment-37679480 Changed to update.

[GitHub] spark pull request: Spark 1246 add min max to stat counter

2014-03-14 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/144#issuecomment-37679298 It might be better to implement `RDD.min` and `RDD.max` with `reduce` directly instead of building a whole StatCounter for them. Also, can you add these to the Java/Scala S
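The suggestion above can be sketched in plain Python without a Spark cluster. This is only an illustration, not the code from the pull request: the helper names `rdd_min` and `rdd_max` are hypothetical, and a list of lists stands in for an RDD's partitions. On a real RDD the same idea would presumably be a single call such as `rdd.reduce(min)`.

```python
from functools import reduce


def rdd_min(partitions):
    """Smallest element across all partitions, using only pairwise reduce.

    `partitions` is a hypothetical stand-in for an RDD: a list of lists.
    """
    # Reduce inside each non-empty partition, then across the partial results.
    partials = [reduce(min, part) for part in partitions if part]
    return reduce(min, partials)


def rdd_max(partitions):
    """Largest element across all partitions, using only pairwise reduce."""
    partials = [reduce(max, part) for part in partitions if part]
    return reduce(max, partials)
```

No running count, mean, or variance is kept, which is the saving over building a full StatCounter. As with `reduce` in general, an input with no elements at all raises an error (`TypeError` in this sketch) rather than returning a value.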

[GitHub] spark pull request: Spark 1246 add min max to stat counter

2014-03-14 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/144#discussion_r10620079 --- Diff: python/pyspark/rdd.py --- @@ -24,6 +24,7 @@ import sys import shlex import traceback +from bisect import bisect_right --- End di

[GitHub] spark pull request: Fix serialization of MutablePair. Also provide...

2014-03-14 Thread mateiz
Github user mateiz commented on a diff in the pull request: https://github.com/apache/spark/pull/141#discussion_r10619938 --- Diff: core/src/main/scala/org/apache/spark/util/MutablePair.scala --- @@ -25,10 +25,20 @@ package org.apache.spark.util * @param _2 Element 2 of thi

[GitHub] spark pull request: Don't swallow all kryo errors, only those that...

2014-03-14 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/142#issuecomment-37678749 Or perhaps there's a way to check on the Input object itself whether we're done.

[GitHub] spark pull request: Don't swallow all kryo errors, only those that...

2014-03-14 Thread mateiz
Github user mateiz commented on the pull request: https://github.com/apache/spark/pull/142#issuecomment-37678686 Looks good but maybe make the test `e.getMessage.toLowerCase.contains("buffer underflow")`, in case they change the wording.
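The check proposed above is Scala; a rough Python analogue of the same idea (lower-case the message before matching, so a change in capitalization does not break the test) might look like the sketch below. The function name `is_buffer_underflow` is hypothetical, and plain Python exceptions stand in for the Kryo exception.

```python
def is_buffer_underflow(exc):
    """Return True if the exception message mentions a buffer underflow.

    The comparison is case-insensitive, so "Buffer underflow." and
    "buffer underflow" are treated the same.
    """
    message = str(exc)
    return "buffer underflow" in message.lower()
```

A caller would swallow the exception only when this returns True and re-raise everything else, which matches the intent in the pull request title of not swallowing all Kryo errors.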

[GitHub] spark pull request: Spark 1246 add min max to stat counter

2014-03-14 Thread dwmclary
GitHub user dwmclary opened a pull request: https://github.com/apache/spark/pull/144 Spark 1246 add min max to stat counter Here's the addition of min and max to statscounter.py and min and max methods to rdd.py. You can merge this pull request into a Git repository by running:

[GitHub] spark pull request: Don't swallow all kryo errors, only those that...

2014-03-14 Thread aarondav
Github user aarondav commented on the pull request: https://github.com/apache/spark/pull/142#issuecomment-37676276 Good catch, merging into master. We may want to merge this into branch-0.9 as well, @pwendell any thoughts?

[GitHub] spark pull request: Fix serialization of MutablePair. Also provide...

2014-03-14 Thread aarondav
Github user aarondav commented on the pull request: https://github.com/apache/spark/pull/141#issuecomment-37676157 Looks good to me.

[GitHub] spark pull request: SPARK-897: preemptively serialize closures

2014-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/143#issuecomment-37676118 Merged build started.
