ALS Solve.solvePositive

2014-03-06 Thread Debasish Das
Hi, I am running ALS on a sparse problem (10M x 1M) and I am getting the following error: org.jblas.exceptions.LapackArgumentException: LAPACK DPOSV: Leading minor of order i of A is not positive definite. at org.jblas.SimpleBlas.posv(SimpleBlas.java:373) at org.jblas.Solve.solvePositive(Solve.ja

[GitHub] spark pull request: Spark-1163, Added missing Python RDD functions

2014-03-06 Thread prabinb
GitHub user prabinb opened a pull request: https://github.com/apache/spark/pull/92 Spark-1163, Added missing Python RDD functions You can merge this pull request into a Git repository by running: $ git pull https://github.com/prabinb/spark python-api-rdd Alternatively you can

[GitHub] spark pull request: Spark-1163, Added missing Python RDD functions

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/92#issuecomment-36838038 Can one of the admins verify this patch? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your proje

special case of custom partitioning

2014-03-06 Thread Manoj Awasthi
Hi All, I have a three machine cluster. I have two RDDs each consisting of (K,V) pairs. RDDs have just three keys 'a', 'b' and 'c'. // list1 - List(('a',1), ('b',2), val rdd1 = sc.parallelize(list1).groupByKey(new HashPartitioner(3)) // list2 - List(('a',2), ('b',7), v

[GitHub] spark pull request: SPARK-1162 Added top in python.

2014-03-06 Thread ScrapCodes
GitHub user ScrapCodes opened a pull request: https://github.com/apache/spark/pull/93 SPARK-1162 Added top in python. You can merge this pull request into a Git repository by running: $ git pull https://github.com/ScrapCodes/spark-1 SPARK-1162/pyspark-top-takeOrdered Alterna

[GitHub] spark pull request: SPARK-1162 Added top in python.

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/93#issuecomment-36887773 Merged build started.

[GitHub] spark pull request: SPARK-1162 Added top in python.

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/93#issuecomment-36887770 Merged build triggered.

[GitHub] spark pull request: SPARK-1162 Added top in python.

2014-03-06 Thread ScrapCodes
Github user ScrapCodes commented on the pull request: https://github.com/apache/spark/pull/93#issuecomment-36887864 @mateiz I am learning python while doing this, so not sure if it is going to make sense. + I have not figured how to implement takeOrdered. Will it be fine if I

[GitHub] spark pull request: SPARK-1162 Added top in python.

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/93#issuecomment-36892161 Merged build finished.

[GitHub] spark pull request: SPARK-1162 Added top in python.

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/93#issuecomment-36892162 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13023/

ALS solve.solvePositive

2014-03-06 Thread Debasish Das
Hi, I am running ALS on a sparse problem (10M x 1M) and I am getting the following error: org.jblas.exceptions.LapackArgumentException: LAPACK DPOSV: Leading minor of order i of A is not positive definite. at org.jblas.SimpleBlas.posv(SimpleBlas.java:373) at org.jblas.Solve.solvePositive(Solve.ja

Re: ALS solve.solvePositive

2014-03-06 Thread Sebastian Schelter
I'm not sure about the mathematical details, but I found in some experiments with Mahout that the matrix there was also not positive definite. Therefore, we chose QR decomposition to solve the linear system. --sebastian On 03/06/2014 03:44 PM, Debasish Das wrote: Hi, I am running ALS on a s

QR decomposition in Spark ALS

2014-03-06 Thread Debasish Das
Hi Sebastian, Yes Mahout ALS and Oryx run fine on the same matrix because Sean calls QR decomposition. But the ALS objective should give us a strictly positive definite matrix. I am thinking more on it. There is a random factor assignment step, but that also initializes factors with normal(0,

Re: QR decomposition in Spark ALS

2014-03-06 Thread Sean Owen
Hmm, Will Xt*X be positive definite in all cases? For example it's not if X has linearly independent rows? (I'm not going to guarantee 100% that I haven't missed something there.) Even though your data is huge, if it was generated by some synthetic process, maybe it is very low rank? QR decomposi

Re: QR decomposition in Spark ALS

2014-03-06 Thread Debasish Das
Yes that will be really cool if the data has linearly independent rows ! I have to debug it more but I got it running with jblas Solve.solve.. I will try breeze QR decomposition next. Have you guys tried adding bound constraints in QR decomposition / BLAS posv other than projecting to positive sp

Re: QR decomposition in Spark ALS

2014-03-06 Thread Debasish Das
Bound constraints in QR decomposition / BLAS posv other than projecting to positive space at each iteration ? Common use cases are feature generation from photos/videos etc. I saw a paper on projecting to positive space from the 70s; there are some improvements later using projected gradients but t

Re: QR decomposition in Spark ALS

2014-03-06 Thread Sean Owen
I do not think there is an advantage in projecting into only the nonnegative quadrant (meaning all of X and Y are nonnegative right?) The argument I have seen is simply interpretability but this doesn't matter here. I think it would be a great exercise to see if the QR decomposition is as fast, an

Re: QR decomposition in Spark ALS

2014-03-06 Thread Matei Zaharia
Xt*X should mathematically always be positive semi-definite, so the only way this might be bad is if it’s not invertible due to linearly dependent rows. This might happen due to the initialization or possibly due to numerical issues, though it seems unlikely. Maybe it also happens if some users
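The semi-definiteness claim above can be checked in two lines. This is a sketch under one assumption that the thread does not state: X is taken as a real matrix with one column per latent feature (k columns), so whether "rows" or "columns" is the right word depends on that orientation.

```latex
\[
v^{\top}\left(X^{\top}X\right)v \;=\; (Xv)^{\top}(Xv) \;=\; \lVert Xv\rVert_2^{2} \;\ge\; 0
\qquad\text{for every } v \in \mathbb{R}^{k},
\]
\[
v^{\top}\left(X^{\top}X\right)v = 0 \;\iff\; Xv = 0 .
\]
```

So Xt*X is always positive semi-definite, and it is positive definite exactly when Xv = 0 forces v = 0, that is, when the columns of X (in this orientation) are linearly independent. That is the same condition as Xt*X being invertible.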

Re: special case of custom partitioning

2014-03-06 Thread Evan Chan
I would love to hear the answer to this as well. On Thu, Mar 6, 2014 at 4:09 AM, Manoj Awasthi wrote: > Hi All, > > > I have a three machine cluster. I have two RDDs each consisting of (K,V) > pairs. RDDs have just three keys 'a', 'b' and 'c'. > > // list1 - List(('a',1), ('b',2), >

Re: [DISCUSS] Necessity of Maven *and* SBT Build in Spark

2014-03-06 Thread Konstantin Boudnik
On Tue, Feb 25, 2014 at 03:20PM, Evan Chan wrote: > The correct way to exclude dependencies in SBT is actually to declare > a dependency as "provided". I'm not familiar with Maven or its Yes, I believe this would be equivalent to the maven exclusion of an artifact's transitive deps. Cos > depe

Re: [DISCUSS] Necessity of Maven *and* SBT Build in Spark

2014-03-06 Thread Konstantin Boudnik
With all due respect Patrick - this approach is asking for trouble. Proactively ;) Cos On Tue, Feb 25, 2014 at 04:09PM, Patrick Wendell wrote: > What I mean is this. AFAIK the shader plug-in is primarily designed > for creating uber jars which contain spark and all dependencies. But > since Spar

Re: QR decomposition in Spark ALS

2014-03-06 Thread Sean Owen
Agree. For example you could have a user-feature matrix X = [[1, 0], [1, 0]] and X' * X = [[2, 0], [0, 0]] is not positive definite but is semidefinite. So I think the code should be calling solveSymmetric not solvePositive? the latter requires positive definite. -- Sean Owen | Director, Data Science | London
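To make the 2x2 example above concrete, here is a minimal sketch in plain Python. It is not jblas or Spark code, and the helper names `gram` and `cholesky` are made up for illustration. It forms X' * X and runs a textbook Cholesky factorization (the factorization LAPACK DPOSV is built on), which stops at the first non-positive pivot.

```python
import math


def gram(x):
    """Return X' * X for a matrix given as a list of rows."""
    cols = list(zip(*x))
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]


def cholesky(a):
    """Return lower-triangular L with L * L' == a.

    Raises ValueError when a pivot is not strictly positive, i.e. when
    a leading minor shows that a is not positive definite.
    """
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError(
                        "leading minor of order %d is not positive definite" % (i + 1)
                    )
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return low


# The user-feature matrix from the message above (hypothetical tiny case).
x = [[1.0, 0.0], [1.0, 0.0]]
g = gram(x)

try:
    cholesky(g)
    outcome = "factorized"
except ValueError as err:
    outcome = str(err)

print(g)
print(outcome)
```

The second column of X is all zeros, so the second pivot is exactly zero and the factorization stops there. That is the same kind of failure the DPOSV message at the top of the thread reports.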

Re: [DISCUSS] Necessity of Maven *and* SBT Build in Spark

2014-03-06 Thread Konstantin Boudnik
On Wed, Feb 26, 2014 at 09:22AM, Sean Owen wrote: > Side point -- "provides" scope is not the same as an exclude. > "provides" means, this artifact is used directly by this code (compile > time), but it is not necessary to package it, since it will be > available from a runtime container. Exclusion

Re: QR decomposition in Spark ALS

2014-03-06 Thread Debasish Das
Matei, If the data has linearly dependent rows ALS should have a fallback mechanism. Either remove the rows and then call BLAS posv or call BLAS gesv or Breeze QR decomposition. I can share the analysis over email. Thanks. Deb On Thu, Mar 6, 2014 at 9:39 AM, Matei Zaharia wrote: > Xt*X should
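One way to picture the fallback proposed above is the sketch below, in plain Python rather than jblas or Breeze. All function names are hypothetical. `solve_cholesky` stands in for a posv-style solver and `solve_general` for a gesv-style one (Gaussian elimination with partial pivoting); the 1e-12 pivot threshold is an arbitrary choice for illustration.

```python
import math


def solve_cholesky(a, b):
    """Solve a * x = b for symmetric positive definite a, else raise ValueError."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][i] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    # Forward substitution: low * y = b.
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i]
    # Back substitution: low' * x = y.
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def solve_general(a, b):
    """Gaussian elimination with partial pivoting; raise ValueError if singular."""
    n = len(a)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        if abs(m[p][c]) < 1e-12:
            raise ValueError("matrix is singular")
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))) / m[i][i]
    return x


def solve_with_fallback(a, b):
    """Try the Cholesky path first; fall back to the general solver on failure."""
    try:
        return solve_cholesky(a, b), "cholesky"
    except ValueError:
        return solve_general(a, b), "general"
```

Note the limit of this approach: if the matrix is exactly singular, the general solver raises as well, so a fallback of this kind only helps when the Cholesky step fails for a matrix that is still invertible (for example through round-off). A truly rank-deficient system needs a least-squares method such as the QR route discussed in this thread.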

Re: QR decomposition in Spark ALS

2014-03-06 Thread Matei Zaharia
But Sean, because that matrix is not invertible, you can’t solve it. That’s why I’m saying, as long as it is solvable, it will be positive definite too, and in that case solvePositive is optimized for this use case (I believe it does Cholesky decomposition). Matei On Mar 6, 2014, at 9:58 AM,

Re: QR decomposition in Spark ALS

2014-03-06 Thread Matei Zaharia
Yup, this would definitely be fine. I’d like to understand when this happens though, I imagine it might be if a user / product has no ratings (though we should certainly try to run well in that case). Matei On Mar 6, 2014, at 10:00 AM, Debasish Das wrote: > Matei, > > If the data has linearl

[GitHub] spark pull request: [WIP] SPARK-1192: The document for most of the...

2014-03-06 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/85#discussion_r10355530 --- Diff: repl/src/main/scala/org/apache/spark/repl/ExecutorClassLoader.scala --- @@ -33,7 +33,7 @@ import org.objectweb.asm.Opcodes._ * used to loa

[GitHub] spark pull request: [WIP] SPARK-1192: The document for most of the...

2014-03-06 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/85#discussion_r10355654 --- Diff: core/src/main/scala/org/apache/spark/util/AkkaUtils.scala --- @@ -108,6 +108,6 @@ private[spark] object AkkaUtils { /** Returns the

Re: QR decomposition in Spark ALS

2014-03-06 Thread Sean Owen
Yes in this case you end up with the least-squares solution. I don't see a problem with that; it's a corner case anyway and the best you can do. The QR decomposition will handle it either way, finding the exact solution when it exists. I think it's slower than Cholesky? (yes I am guessing that is

[GitHub] spark pull request: [WIP] SPARK-1192: The document for most of the...

2014-03-06 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/85#discussion_r10355799 --- Diff: core/src/main/scala/org/apache/spark/util/AkkaUtils.scala --- @@ -108,6 +108,6 @@ private[spark] object AkkaUtils { /** Returns the defa

Re: scala.collection.immutable.Nil$ cannot be cast to org.apache.spark.util.BoundedPriorityQueue

2014-03-06 Thread yao
Hi Fabrizio, > Can someone explain me why do I get SparkConf not serializable error ? First, SparkConf is not serializable, and that's what the exception tells you. Why are you stuck in this situation? Well, that must be because some of your classes require a SparkConf. In your case, that's

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-06 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/42#discussion_r10356042 --- Diff: core/src/main/scala/org/apache/spark/SparkEnv.scala --- @@ -164,9 +167,18 @@ object SparkEnv extends Logging { } }

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-06 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/42#discussion_r10356197 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -32,7 +32,7 @@ import org.apache.spark.executor.TaskMetrics import

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-06 Thread kayousterhout
Github user kayousterhout commented on the pull request: https://github.com/apache/spark/pull/42#issuecomment-36929100 Jenkins, retest this please

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-06 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/42#discussion_r10357546 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManagerRegistrationListener.scala --- @@ -0,0 +1,51 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-06 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/42#discussion_r10357634 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManagerMasterActor.scala --- @@ -50,6 +50,8 @@ class BlockManagerMasterActor(val isLocal: Bool

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-06 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/42#discussion_r10357702 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManagerRegistrationListener.scala --- @@ -0,0 +1,51 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-06 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/42#discussion_r10357808 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -194,10 +148,46 @@ class DAGScheduler( } }

[GitHub] spark pull request: SPARK-1164 Deprecated reduceByKeyToDriver as i...

2014-03-06 Thread yaoshengzhe
Github user yaoshengzhe commented on the pull request: https://github.com/apache/spark/pull/72#issuecomment-36929714 I think reduceByKeyToDriver is a better name and developer could easily figure out the behavior of this operation. Instead, reduceByKeyLocally is a little bit confusing

[GitHub] spark pull request: Patch for SPARK-942

2014-03-06 Thread kellrott
Github user kellrott commented on the pull request: https://github.com/apache/spark/pull/50#issuecomment-36931153 I think I've covered all the formatting requests. Any other issues?

[GitHub] spark pull request: SPARK-1195: set map_input_file environment var...

2014-03-06 Thread tgravescs
GitHub user tgravescs opened a pull request: https://github.com/apache/spark/pull/94 SPARK-1195: set map_input_file environment variable in PipedRDD Hadoop uses the config mapreduce.map.input.file to indicate the input filename to the map when the input split is of type FileSplit. S

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/42#issuecomment-36932992 Build triggered.

[GitHub] spark pull request: SPARK-1195: set map_input_file environment var...

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/94#issuecomment-36932976 Merged build started.

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/42#issuecomment-36932993 Build started.

[GitHub] spark pull request: SPARK-1195: set map_input_file environment var...

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/94#issuecomment-36932975 Merged build triggered.

[GitHub] spark pull request: Change yarn-standalone to yarn-cluster and fix...

2014-03-06 Thread sryza
GitHub user sryza opened a pull request: https://github.com/apache/spark/pull/95 Change yarn-standalone to yarn-cluster and fix up running on YARN docs This patch changes "yarn-standalone" to "yarn-cluster" (but still supports the former). It also cleans up the Running on YARN docs

[GitHub] spark pull request: Change yarn-standalone to yarn-cluster and fix...

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/95#issuecomment-36933456 Merged build triggered.

[GitHub] spark pull request: Change yarn-standalone to yarn-cluster and fix...

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/95#issuecomment-36933457 Merged build started.

[GitHub] spark pull request: SPARK-1126. spark-app preliminary

2014-03-06 Thread sryza
Github user sryza commented on a diff in the pull request: https://github.com/apache/spark/pull/86#discussion_r10359483 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkApp.scala --- @@ -0,0 +1,178 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

[GitHub] spark pull request: SPARK-1126. spark-app preliminary

2014-03-06 Thread sryza
Github user sryza commented on a diff in the pull request: https://github.com/apache/spark/pull/86#discussion_r10359569 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkApp.scala --- @@ -0,0 +1,178 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under o

[GitHub] spark pull request: SPARK-1187, Added missing Python APIs

2014-03-06 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/75#issuecomment-36934603 I played around with these and it looks good to me. Thanks!

Re: special case of custom partitioning

2014-03-06 Thread Mayur Rustagi
How about PartitionerAwareUnionRDD? Regards Mayur Mayur Rustagi Ph: +1 (760) 203 3257 http://www.sigmoidanalytics.com @mayur_rustagi On Thu, Mar 6, 2014 at 9:42 AM, Evan Chan wrote: > I would love to hear the answer to this as well. > > On Thu, Mar 6, 2014

Re: scala.collection.immutable.Nil$ cannot be cast to org.apache.spark.util.BoundedPriorityQueue

2014-03-06 Thread Fabrizio Milo aka misto
Thank you for the reply ! that make sense :) On Thu, Mar 6, 2014 at 11:11 AM, yao wrote: > Hi Fabrizio, > > Can someone explain me why do I get SparkConf not serializable error ? >> > > First, SparkConf is not serializable and that's what the exception tells > you. Why you stuck in this situation

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-06 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/42#discussion_r10360102 --- Diff: core/src/main/scala/org/apache/spark/ui/SparkUI.scala --- @@ -69,19 +100,55 @@ private[spark] class SparkUI(sc: SparkContext) extends Logging {

[GitHub] spark pull request: SPARK-1197. Change yarn-standalone to yarn-clu...

2014-03-06 Thread yaoshengzhe
Github user yaoshengzhe commented on the pull request: https://github.com/apache/spark/pull/95#issuecomment-36935885 Hi Sandy, What is the point to give a new name to "yarn-standalone" ? I think it requires people to change their spark code build for yarn and create some confusion

[GitHub] spark pull request: SPARK-1126. spark-app preliminary

2014-03-06 Thread sryza
Github user sryza commented on a diff in the pull request: https://github.com/apache/spark/pull/86#discussion_r10360322 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkAppArguments.scala --- @@ -0,0 +1,155 @@ +/* + * Licensed to the Apache Software Foundation (AS

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-06 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/42#discussion_r10360310 --- Diff: core/src/main/scala/org/apache/spark/scheduler/EventBus.scala --- @@ -0,0 +1,77 @@ +/* + * Licensed to the Apache Software Foundation (AS

[GitHub] spark pull request: MLI-1 Decision Trees

2014-03-06 Thread mengxr
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/79#issuecomment-36936178 @manishamde Do you mind updating the code style first to make it easy for people who want to review the code? I will mark a few examples. We also need a Spark JIRA ticket fo

[GitHub] spark pull request: MLI-1 Decision Trees

2014-03-06 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/79#discussion_r10360358 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/tree/DecisionTree.scala --- @@ -0,0 +1,915 @@ +/* + * Licensed to the Apache Software Foundation (A

[GitHub] spark pull request: MLI-1 Decision Trees

2014-03-06 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/79#discussion_r10360367 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/tree/DecisionTree.scala --- @@ -0,0 +1,915 @@ +/* + * Licensed to the Apache Software Foundation (A

[GitHub] spark pull request: MLI-1 Decision Trees

2014-03-06 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/79#discussion_r10360401 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/tree/DecisionTree.scala --- @@ -0,0 +1,915 @@ +/* + * Licensed to the Apache Software Foundation (A

[GitHub] spark pull request: MLI-1 Decision Trees

2014-03-06 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/79#discussion_r10360441 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/tree/DecisionTree.scala --- @@ -0,0 +1,915 @@ +/* + * Licensed to the Apache Software Foundation (A

[GitHub] spark pull request: MLI-1 Decision Trees

2014-03-06 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/79#discussion_r10360427 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/tree/DecisionTree.scala --- @@ -0,0 +1,915 @@ +/* + * Licensed to the Apache Software Foundation (A

[GitHub] spark pull request: MLI-1 Decision Trees

2014-03-06 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/79#discussion_r10360465 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/tree/DecisionTree.scala --- @@ -0,0 +1,915 @@ +/* + * Licensed to the Apache Software Foundation (A

[GitHub] spark pull request: SPARK-1197. Change yarn-standalone to yarn-clu...

2014-03-06 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/95#discussion_r10360526 --- Diff: docs/running-on-yarn.md --- @@ -82,35 +84,30 @@ For example: ./bin/spark-class org.apache.spark.deploy.yarn.Client \ --jar

[GitHub] spark pull request: MLI-1 Decision Trees

2014-03-06 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/79#discussion_r10360528 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/tree/DecisionTree.scala --- @@ -0,0 +1,915 @@ +/* + * Licensed to the Apache Software Foundation (A

[GitHub] spark pull request: MLI-1 Decision Trees

2014-03-06 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/79#discussion_r10360539 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/tree/DecisionTree.scala --- @@ -0,0 +1,915 @@ +/* + * Licensed to the Apache Software Foundation (A

[GitHub] spark pull request: MLI-1 Decision Trees

2014-03-06 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/79#discussion_r10360574 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/tree/DecisionTreeRunner.scala --- @@ -0,0 +1,143 @@ +/* + * Licensed to the Apache Software Foundat

[GitHub] spark pull request: SPARK-1197. Change yarn-standalone to yarn-clu...

2014-03-06 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/95#issuecomment-36936894 @yaoshengzhe this supports "yarn-standalone" for backwards compatibility so you don't need to change your application. The name "yarn-standalone" is really confusing becau

[GitHub] spark pull request: MLI-1 Decision Trees

2014-03-06 Thread mengxr
Github user mengxr commented on a diff in the pull request: https://github.com/apache/spark/pull/79#discussion_r10360640 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/tree/impurity/Impurity.scala --- @@ -0,0 +1,25 @@ +/* + * Licensed to the Apache Software Foundatio

[GitHub] spark pull request: SPARK-1197. Change yarn-standalone to yarn-clu...

2014-03-06 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/95#issuecomment-36936989 @sryza thanks Sandy this looks good to me. @tgraves want to take a look? If not I can merge this tonight - it's just some doc fixes.

[GitHub] spark pull request: SPARK-1126. spark-app preliminary

2014-03-06 Thread sryza
Github user sryza commented on the pull request: https://github.com/apache/spark/pull/86#issuecomment-36937078 I uploaded a new patch that takes most of the review feedback into account. Includes the following changes: * changes Opt to OptionAssigner and uses default parameters

[GitHub] spark pull request: SPARK-1197. Change yarn-standalone to yarn-clu...

2014-03-06 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/95#discussion_r10360720 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -1243,7 +1245,7 @@ object SparkContext { } scheduler

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-06 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/42#discussion_r10360822 --- Diff: core/src/main/scala/org/apache/spark/ui/UISparkListener.scala --- @@ -0,0 +1,123 @@ +/* + * Licensed to the Apache Software Foundation (A

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-06 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/42#discussion_r10360902 --- Diff: core/src/main/scala/org/apache/spark/ui/UIReloader.scala --- @@ -0,0 +1,51 @@ +/* + * Licensed to the Apache Software Foundation (ASF) un

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-06 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/42#discussion_r10360999 --- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala --- @@ -50,6 +51,9 @@ private[spark] class Master(host: String, port: Int, webU

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-06 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/42#discussion_r10361035 --- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala --- @@ -50,6 +51,9 @@ private[spark] class Master(host: String, port: Int, webU

[GitHub] spark pull request: SPARK-1189: Add Security to Spark - Akka, Http...

2014-03-06 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/33#discussion_r10361204 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -135,6 +135,8 @@ class SparkContext( val isLocal = (master == "local" || m

[GitHub] spark pull request: SPARK-1187, Added missing Python APIs

2014-03-06 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/75

[GitHub] spark pull request: SPARK-1197. Change yarn-standalone to yarn-clu...

2014-03-06 Thread tgravescs
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/95#discussion_r10361302 --- Diff: docs/running-on-yarn.md --- @@ -82,35 +84,30 @@ For example: ./bin/spark-class org.apache.spark.deploy.yarn.Client \ --ja

[GitHub] spark pull request: SPARK-1197. Change yarn-standalone to yarn-clu...

2014-03-06 Thread yaoshengzhe
Github user yaoshengzhe commented on the pull request: https://github.com/apache/spark/pull/95#issuecomment-36938296 @pwendell I agree with what you are saying. One more question: is it possible to move all these string constants into some class?

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-06 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/42#discussion_r10361374 --- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala --- @@ -625,6 +653,30 @@ private[spark] class Master(host: String, port: Int, w

[GitHub] spark pull request: SPARK-1197. Change yarn-standalone to yarn-clu...

2014-03-06 Thread tgravescs
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/95#issuecomment-36938497 Looks good to me (with the doc fixes commented on).

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-06 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/42#discussion_r10361480 --- Diff: core/src/main/scala/org/apache/spark/ui/SparkUI.scala --- @@ -27,28 +28,58 @@ import org.apache.spark.ui.jobs.JobProgressUI import org.apach

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-06 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/42#discussion_r10361549 --- Diff: core/src/main/scala/org/apache/spark/ui/UIReloader.scala --- @@ -0,0 +1,51 @@ +/* + * Licensed to the Apache Software Foundation (ASF) un

[GitHub] spark pull request: SPARK-1197. Change yarn-standalone to yarn-clu...

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/95#issuecomment-36939034 Merged build finished.

[GitHub] spark pull request: SPARK-1195: set map_input_file environment var...

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/94#issuecomment-36939045 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13024/

[GitHub] spark pull request: SPARK-1195: set map_input_file environment var...

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/94#issuecomment-36939044 Merged build finished.

[GitHub] spark pull request: SPARK-1197. Change yarn-standalone to yarn-clu...

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/95#issuecomment-36939035 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13026/

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/42#issuecomment-36939032 One or more automated tests failed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13025/

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/42#issuecomment-36939031 Build finished.

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-06 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/42#discussion_r10361689 --- Diff: core/src/main/scala/org/apache/spark/ui/jobs/JobProgressListener.scala --- @@ -30,16 +32,23 @@ import org.apache.spark.scheduler._ * class

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-06 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/42#discussion_r10361732 --- Diff: core/src/main/scala/org/apache/spark/ui/jobs/JobProgressListener.scala --- @@ -30,16 +32,23 @@ import org.apache.spark.scheduler._ * class

[GitHub] spark pull request: SPARK-1126. spark-app preliminary

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/86#issuecomment-36939213 Merged build triggered.

[GitHub] spark pull request: SPARK-1126. spark-app preliminary

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/86#issuecomment-36939354 Merged build started.

[GitHub] spark pull request: SPARK-1189: Add Security to Spark - Akka, Http...

2014-03-06 Thread tgravescs
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/33#discussion_r10361820 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -135,6 +135,8 @@ class SparkContext( val isLocal = (master == "local" ||

[GitHub] spark pull request: SPARK-1126. spark-app preliminary

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/86#issuecomment-36939470 One or more automated tests failed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13027/

[GitHub] spark pull request: SPARK-1126. spark-app preliminary

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/86#issuecomment-36939469 Merged build finished.

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-06 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/42#discussion_r10362120 --- Diff: core/src/main/scala/org/apache/spark/deploy/master/Master.scala --- @@ -625,6 +653,30 @@ private[spark] class Master(host: String, port: Int, webUiP

[GitHub] spark pull request: SPARK-1162 Added top in python.

2014-03-06 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/93#discussion_r10362394 --- Diff: python/pyspark/rdd.py --- @@ -628,6 +669,26 @@ def mergeMaps(m1, m2): m1[k] += v return m1 return
