[GitHub] spark pull request: MLI-1 Decision Trees

2014-03-06 Thread hsaputra
Github user hsaputra commented on the pull request: https://github.com/apache/spark/pull/79#issuecomment-36973907 @manishamde thank you for the decision tree contribution and the detailed comments/documentation in the PR :) Looking forward to reviewing it and seeing this as part of Spark MLl

[GitHub] spark pull request: SPARK-1162 Added top in python.

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/93#issuecomment-36973825 Merged build started. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have th

[GitHub] spark pull request: SPARK-1162 Added top in python.

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/93#issuecomment-36973824 Merged build triggered.

[GitHub] spark pull request: MLI-1 Decision Trees

2014-03-06 Thread manishamde
Github user manishamde commented on the pull request: https://github.com/apache/spark/pull/79#issuecomment-36973041 @mengxr @hsaputra Thanks for the code style comments. I have made a lot of effort to document the code. I guess I still need to make the code consistent with the Spark s

[GitHub] spark pull request: SPARK-1126. spark-app preliminary

2014-03-06 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/86#discussion_r10374842 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala --- @@ -0,0 +1,160 @@ +/* + * Licensed to the Apache Software Foundation (ASF) u

[GitHub] spark pull request: SPARK-1126. spark-app preliminary

2014-03-06 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/86#discussion_r10374821 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala --- @@ -0,0 +1,160 @@ +/* + * Licensed to the Apache Software Foundation (ASF) un

[GitHub] spark pull request: SPARK-1126. spark-app preliminary

2014-03-06 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/86#discussion_r10374754 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala --- @@ -0,0 +1,160 @@ +/* + * Licensed to the Apache Software Foundation (ASF) u

Please remove me from the mail list.//Re: [GitHub] spark pull request: Fix #SPARK-1149 Bad partitioners can cause Spa...

2014-03-06 Thread Qiuxin (robert)
Please remove me from the mail list. -Original Message- From: CodingCat [mailto:g...@git.apache.org] Sent: March 7, 2014 7:38 To: dev@spark.apache.org Subject: [GitHub] spark pull request: Fix #SPARK-1149 Bad partitioners can cause Spa... Github user CodingCat commented on a diff in the pull request: ht

[GitHub] spark pull request: SPARK-1195: set map_input_file environment var...

2014-03-06 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/94#discussion_r10374683 --- Diff: core/src/test/scala/org/apache/spark/PipedRDDSuite.scala --- @@ -89,4 +97,37 @@ class PipedRDDSuite extends FunSuite with SharedSparkContext {

[GitHub] spark pull request: SPARK-1162 Added top in python.

2014-03-06 Thread ScrapCodes
Github user ScrapCodes commented on the pull request: https://github.com/apache/spark/pull/93#issuecomment-36971911 Jenkins, test this please.

[GitHub] spark pull request: SPARK-1193. Fix indentation in pom.xmls

2014-03-06 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/91#issuecomment-36971593 @sryza mind up merging this now that #33 is in?

[GitHub] spark pull request: [WIP] SPARK-1192: The document for most of the...

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/85#issuecomment-36969145 Build finished.

[GitHub] spark pull request: [WIP] SPARK-1192: The document for most of the...

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/85#issuecomment-36969146 Build finished.

[GitHub] spark pull request: [WIP] SPARK-1192: The document for most of the...

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/85#issuecomment-36969147 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13037/

[GitHub] spark pull request: [WIP] SPARK-1192: The document for most of the...

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/85#issuecomment-36969148 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13038/

[GitHub] spark pull request: SPARK-1126. spark-app preliminary

2014-03-06 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/86#discussion_r10373471 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala --- @@ -0,0 +1,160 @@ +/* + * Licensed to the Apache Software Foundation (ASF) un

[GitHub] spark pull request: SPARK-1126. spark-app preliminary

2014-03-06 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/86#discussion_r10373335 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala --- @@ -0,0 +1,160 @@ +/* + * Licensed to the Apache Software Foundation (ASF) un

[GitHub] spark pull request: [WIP] SPARK-1192: The document for most of the...

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/85#issuecomment-36967527 Build started.

[GitHub] spark pull request: [WIP] SPARK-1192: The document for most of the...

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/85#issuecomment-36967526 Build triggered.

[GitHub] spark pull request: [WIP] SPARK-1192: The document for most of the...

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/85#issuecomment-36967432 Build started.

[GitHub] spark pull request: [WIP] SPARK-1192: The document for most of the...

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/85#issuecomment-36967431 Build triggered.

[GitHub] spark pull request: SPARK-1126. spark-app preliminary

2014-03-06 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/86#discussion_r10373228 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala --- @@ -0,0 +1,160 @@ +/* + * Licensed to the Apache Software Foundation (ASF) un

[GitHub] spark pull request: SPARK-1126. spark-app preliminary

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/86#issuecomment-36967269 One or more automated tests failed Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13036/

[GitHub] spark pull request: SPARK-1126. spark-app preliminary

2014-03-06 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/86#discussion_r10373180 --- Diff: bin/spark-submit --- @@ -0,0 +1,38 @@ +#!/usr/bin/env bash + +# +# Licensed to the Apache Software Foundation (ASF) under one or more

[GitHub] spark pull request: SPARK-1126. spark-app preliminary

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/86#issuecomment-36967267 Merged build finished.

[GitHub] spark pull request: SPARK-1126. spark-app preliminary

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/86#issuecomment-36967210 Merged build started.

[GitHub] spark pull request: SPARK-1126. spark-app preliminary

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/86#issuecomment-36967209 Merged build triggered.

[GitHub] spark pull request: SPARK-1126. spark-app preliminary

2014-03-06 Thread sryza
Github user sryza commented on the pull request: https://github.com/apache/spark/pull/86#issuecomment-36966736 I uploaded a new patch that doesn't start a new JVM and parses --driver-memory in bash. It wasn't as bad as I expected (thanks to some help from @umbrant and @atm).

When running the same job, the time Spark takes is very different from Shark.

2014-03-06 Thread qingyang li
Hi, community, I have set up a 3-node Spark cluster in standalone mode; each machine has 16 GB of memory and 4 cores. When I run " val file = sc.textFile("/user/hive/warehouse/b/test.txt") file.filter(line => line.contains("2013-")).count() " it takes 2.7s, but when

[GitHub] spark pull request: SPARK-1197. Change yarn-standalone to yarn-clu...

2014-03-06 Thread sryza
Github user sryza commented on the pull request: https://github.com/apache/spark/pull/95#issuecomment-36965043 If there's consensus on a different identifier, I'd be happy to post an addendum patch.

[GitHub] spark pull request: Fix #SPARK-1149 Bad partitioners can cause Spa...

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/44#issuecomment-36964801 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13035/

[GitHub] spark pull request: SPARK-1162 Added top in python.

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/93#issuecomment-36964795 Merged build finished.

[GitHub] spark pull request: SPARK-1162 Added top in python.

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/93#issuecomment-36964796 One or more automated tests failed Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13034/

[GitHub] spark pull request: Fix #SPARK-1149 Bad partitioners can cause Spa...

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/44#issuecomment-36964800 Merged build finished.

[GitHub] spark pull request: SPARK-1197. Change yarn-standalone to yarn-clu...

2014-03-06 Thread mridulm
Github user mridulm commented on the pull request: https://github.com/apache/spark/pull/95#issuecomment-36964810 I would have preferred a different identifier (though I don't have good alternatives yet), but that seems moot now since the PR was closed before I could get to it.

Re: ALS solve.solvePositive

2014-03-06 Thread Xiangrui Meng
If the matrix is very ill-conditioned, then A^T A becomes numerically rank deficient. However, if you use a reasonably large positive regularization constant (lambda), "A^T A + lambda I" should still be positive definite. What was the regularization constant (lambda) you set? Could you test whether
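
The point about regularization in the message above can be illustrated with a small standalone sketch. This is not the MLlib ALS code; the helper names (gram, cholesky, is_positive_definite) and the toy matrix A are assumptions made up for illustration. A has two identical columns, so A^T A is exactly rank deficient and a Cholesky factorization fails; adding lambda I restores positive definiteness.

```python
import math


def gram(a, lam=0.0):
    """Return A^T A + lam * I for a matrix given as a list of rows."""
    rows, cols = len(a), len(a[0])
    return [
        [
            sum(a[r][i] * a[r][j] for r in range(rows)) + (lam if i == j else 0.0)
            for j in range(cols)
        ]
        for i in range(cols)
    ]


def cholesky(m):
    """Return lower-triangular L with L L^T = m.

    Raises ValueError when a pivot is not strictly positive, i.e. when m
    is not (numerically) positive definite.
    """
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = m[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return low


def is_positive_definite(m):
    """True if the Cholesky factorization of m succeeds."""
    try:
        cholesky(m)
        return True
    except ValueError:
        return False


# Hypothetical toy input: both columns are identical, so A has rank 1.
A = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
```

Here is_positive_definite(gram(A)) is False, while is_positive_definite(gram(A, 0.1)) is True. The reason is that x^T (A^T A + lambda I) x = |Ax|^2 + lambda |x|^2, which is strictly positive for any nonzero x when lambda > 0.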

[GitHub] spark pull request: SPARK-1126. spark-app preliminary

2014-03-06 Thread mridulm
Github user mridulm commented on a diff in the pull request: https://github.com/apache/spark/pull/86#discussion_r10372142 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkApp.scala --- @@ -0,0 +1,178 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: MLLib - Thoughts about refactoring Updater for LBFGS?

2014-03-06 Thread Xiangrui Meng
Hi DB, Thanks for doing the comparison! What were the running times for fortran/breeze/riso? Best, Xiangrui On Thu, Mar 6, 2014 at 4:21 PM, DB Tsai wrote: > Hi David, > > I can converge to the same result with your breeze LBFGS and Fortran > implementations now. Probably, I made some mistakes w

[GitHub] spark pull request: Fix #SPARK-1149 Bad partitioners can cause Spa...

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/44#issuecomment-36962290 Merged build triggered.

[GitHub] spark pull request: Fix #SPARK-1149 Bad partitioners can cause Spa...

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/44#issuecomment-36962291 Merged build started.

[GitHub] spark pull request: SPARK-1162 Added top in python.

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/93#issuecomment-36962209 Merged build triggered.

[GitHub] spark pull request: SPARK-1162 Added top in python.

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/93#issuecomment-36962210 Merged build started.

[GitHub] spark pull request: [WIP] SPARK-1192: The document for most of the...

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/85#issuecomment-36962138 Build finished.

[GitHub] spark pull request: Spark 1165 rdd.intersection in python and java

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/80#issuecomment-36962141 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13033/

[GitHub] spark pull request: [WIP] SPARK-1192: The document for most of the...

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/85#issuecomment-36962140 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13031/

[GitHub] spark pull request: Spark 1165 rdd.intersection in python and java

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/80#issuecomment-36962139 Merged build finished.

[GitHub] spark pull request: Example for cassandra CQL read/write from spar...

2014-03-06 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/87

rdd.take() failed when input partition is larger than hdfs blocksize

2014-03-06 Thread Chen Jin
Hello Spark Developers, While trying to use rdd.take(numItems), my job just hangs forever. The following are the output messages: 14/03/07 00:52:21 INFO SparkContext: Starting job: take at xx.java:55 14/03/07 00:52:21 INFO DAGScheduler: Got job 1 (take at xx.java:55) with 1 output p

[GitHub] spark pull request: Fix #SPARK-1149 Bad partitioners can cause Spa...

2014-03-06 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/44#discussion_r10370952 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -847,6 +847,8 @@ class SparkContext( partitions: Seq[Int], allowL

[GitHub] spark pull request: Fix #SPARK-1149 Bad partitioners can cause Spa...

2014-03-06 Thread witgo
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/44#discussion_r10370902 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -847,6 +847,8 @@ class SparkContext( partitions: Seq[Int], allowLoca

[GitHub] spark pull request: Example for cassandra CQL read/write from spar...

2014-03-06 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/87#issuecomment-36960009 Thanks, merged!

[GitHub] spark pull request: SPARK-1162 Added top in python.

2014-03-06 Thread ScrapCodes
Github user ScrapCodes commented on a diff in the pull request: https://github.com/apache/spark/pull/93#discussion_r10370555 --- Diff: python/pyspark/rdd.py --- @@ -628,6 +669,26 @@ def mergeMaps(m1, m2): m1[k] += v return m1 return

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/42#issuecomment-36959517 Build finished.

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/42#issuecomment-36959518 One or more automated tests failed Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13032/

[GitHub] spark pull request: Spark 1165 rdd.intersection in python and java

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/80#issuecomment-36959193 Merged build triggered.

[GitHub] spark pull request: Spark 1165 rdd.intersection in python and java

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/80#issuecomment-36959194 Merged build started.

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/42#issuecomment-36958961 Build triggered.

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/42#issuecomment-36958962 Build started.

[GitHub] spark pull request: [WIP] SPARK-1192: The document for most of the...

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/85#issuecomment-36958934 Build triggered.

[GitHub] spark pull request: [WIP] SPARK-1192: The document for most of the...

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/85#issuecomment-36958936 Build started.

[GitHub] spark pull request: SPARK-1004. PySpark on YARN

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/30#issuecomment-36958876 Merged build finished.

[GitHub] spark pull request: SPARK-1004. PySpark on YARN

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/30#issuecomment-36958877 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13030/

[GitHub] spark pull request: SPARK-1197. Change yarn-standalone to yarn-clu...

2014-03-06 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/95

[GitHub] spark pull request: SPARK-1189: Add Security to Spark - Akka, Http...

2014-03-06 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/33

[GitHub] spark pull request: SPARK-1197. Change yarn-standalone to yarn-clu...

2014-03-06 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/95#issuecomment-36958050 @sryza thanks I'll merge this.

[GitHub] spark pull request: [WIP] SPARK-1192: The document for most of the...

2014-03-06 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/85#discussion_r10369848 --- Diff: core/src/main/scala/org/apache/spark/util/AkkaUtils.scala --- @@ -108,6 +108,6 @@ private[spark] object AkkaUtils { /** Returns the

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-06 Thread kayousterhout
Github user kayousterhout commented on a diff in the pull request: https://github.com/apache/spark/pull/42#discussion_r10369800 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -978,6 +971,11 @@ class DAGScheduler( logDebug("Additional e

[GitHub] spark pull request: SPARK-1004. PySpark on YARN

2014-03-06 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/30#issuecomment-36957701 Ah okay works fine when I do that. Sorry about that.

[GitHub] spark pull request: MLI-1 Decision Trees

2014-03-06 Thread hsaputra
Github user hsaputra commented on a diff in the pull request: https://github.com/apache/spark/pull/79#discussion_r10369686 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/tree/DecisionTree.scala --- @@ -0,0 +1,915 @@ +/* + * Licensed to the Apache Software Foundation

[GitHub] spark pull request: SPARK-1197. Change yarn-standalone to yarn-clu...

2014-03-06 Thread sryza
Github user sryza commented on the pull request: https://github.com/apache/spark/pull/95#issuecomment-36956664 Updated patch incorporates review feedback

Re: MLLib - Thoughts about refactoring Updater for LBFGS?

2014-03-06 Thread DB Tsai
On Thu, Mar 6, 2014 at 4:26 PM, David Hall wrote: > I'm not sure why Spark should be serializing LBFGS? Shouldn't it live on > the controller node? Or is this a per-node thing? > > But no problem to make it serializable. It will live in the controller node. Only RDD operations are a per-node thing.

[GitHub] spark pull request: SPARK-1189: Add Security to Spark - Akka, Http...

2014-03-06 Thread tgravescs
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/33#issuecomment-36955323 I committed this, thanks for all the reviews Patrick!

[GitHub] spark pull request: SPARK-1004. PySpark on YARN

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/30#issuecomment-36955199 Merged build started.

[GitHub] spark pull request: SPARK-1004. PySpark on YARN

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/30#issuecomment-36955197 Merged build triggered.

[GitHub] spark pull request: Example for cassandra CQL read/write from spar...

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/87#issuecomment-36955140 Merged build finished.

[GitHub] spark pull request: Example for cassandra CQL read/write from spar...

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/87#issuecomment-36955141 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13029/

Re: MLLib - Thoughts about refactoring Updater for LBFGS?

2014-03-06 Thread David Hall
On Thu, Mar 6, 2014 at 4:21 PM, DB Tsai wrote: > Hi David, > > I can converge to the same result with your breeze LBFGS and Fortran > implementations now. Probably, I made some mistakes when I tried > breeze before. I apologize that I claimed it's not stable. > > See the test case in BreezeLBFGSS

[GitHub] spark pull request: SPARK-1189: Add Security to Spark - Akka, Http...

2014-03-06 Thread tgravescs
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/33#issuecomment-36955086 Nope nothing else to address in this. I'll merge it shortly.

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-06 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/42#discussion_r10368594 --- Diff: core/src/main/scala/org/apache/spark/SparkEnv.scala --- @@ -164,9 +167,18 @@ object SparkEnv extends Logging { } } -

Re: MLLib - Thoughts about refactoring Updater for LBFGS?

2014-03-06 Thread DB Tsai
Hi David, I can converge to the same result with your breeze LBFGS and Fortran implementations now. Probably, I made some mistakes when I tried breeze before. I apologize that I claimed it's not stable. See the test case in BreezeLBFGSSuite.scala https://github.com/AlpineNow/spark/tree/dbtsai-bre

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-06 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/42#discussion_r10368558 --- Diff: core/src/main/scala/org/apache/spark/scheduler/SparkListener.scala --- @@ -19,33 +19,43 @@ package org.apache.spark.scheduler import jav

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-06 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/42#discussion_r10368525 --- Diff: core/src/main/scala/org/apache/spark/ui/jobs/JobProgressListener.scala --- @@ -30,16 +32,23 @@ import org.apache.spark.scheduler._ * class, s

[GitHub] spark pull request: SPARK-1195: set map_input_file environment var...

2014-03-06 Thread tgravescs
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/spark/pull/94#discussion_r10368475 --- Diff: core/src/test/scala/org/apache/spark/PipedRDDSuite.scala --- @@ -89,4 +97,37 @@ class PipedRDDSuite extends FunSuite with SharedSparkContext {

[GitHub] spark pull request: SPARK-1004. PySpark on YARN

2014-03-06 Thread sryza
Github user sryza commented on the pull request: https://github.com/apache/spark/pull/30#issuecomment-36953787 Updated to 1.0.0 and removed incubating

[GitHub] spark pull request: SPARK-1004. PySpark on YARN

2014-03-06 Thread sryza
Github user sryza commented on the pull request: https://github.com/apache/spark/pull/30#issuecomment-36953590 You need to run make inside the python directory first. Did you do that? (This obviously needs to be documented).

[GitHub] spark pull request: SPARK-1004. PySpark on YARN

2014-03-06 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/30#issuecomment-36953290 Hey @sryza I tested this using a local standalone cluster and it didn't seem to work. The executors failed when they were asked to launch pyspark: ``` 14/03/06

[GitHub] spark pull request: SPARK-1004. PySpark on YARN

2014-03-06 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/30#discussion_r10367892 --- Diff: sbin/spark-config.sh --- @@ -34,3 +34,6 @@ this="$config_bin/$script" export SPARK_PREFIX=`dirname "$this"`/.. export SPARK_HOME=${SPARK_PRE

[GitHub] spark pull request: Fix #SPARK-1149 Bad partitioners can cause Spa...

2014-03-06 Thread CodingCat
Github user CodingCat commented on a diff in the pull request: https://github.com/apache/spark/pull/44#discussion_r10367306 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -847,6 +847,8 @@ class SparkContext( partitions: Seq[Int], allow

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-06 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/42#discussion_r10367126 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -978,6 +971,11 @@ class DAGScheduler( logDebug("Additional exec

[GitHub] spark pull request: GRAPH-1: Map side distinct in collect vertex i...

2014-03-06 Thread holdenk
Github user holdenk commented on the pull request: https://github.com/apache/spark/pull/21#issuecomment-36951277 Ok I'll switch it tonight. On Thu, Mar 6, 2014 at 3:09 PM, Reynold Xin wrote: > We should use the primitive hashmap - otherwise it is pretty slow >

[GitHub] spark pull request: SPARK-1004. PySpark on YARN

2014-03-06 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/30#issuecomment-36951182 Hey @JoshRosen, mind taking a look at this? I think @sryza has tested it on YARN, but I personally don't know enough about Python packaging to look it over with confidence.

[GitHub] spark pull request: SPARK-1004. PySpark on YARN

2014-03-06 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/30#discussion_r10366986 --- Diff: python/setup.py --- @@ -0,0 +1,30 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agree

[GitHub] spark pull request: Example for cassandra CQL read/write from spar...

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/87#issuecomment-36950986 Merged build started.

[GitHub] spark pull request: Example for cassandra CQL read/write from spar...

2014-03-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/87#issuecomment-36950985 Merged build triggered.

[GitHub] spark pull request: SPARK-1193. Fix indentation in pom.xmls

2014-03-06 Thread pwendell
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/91#issuecomment-36950404 Thanks, we can merge this. I want #33 to go in first, since I think this will conflict with it.

[GitHub] spark pull request: Patch for SPARK-942

2014-03-06 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/50

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-06 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/42#discussion_r10366581 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -978,6 +971,11 @@ class DAGScheduler( logDebug("Additional execut

[GitHub] spark pull request: Fix #SPARK-1149 Bad partitioners can cause Spa...

2014-03-06 Thread pwendell
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/44#discussion_r10366461 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -847,6 +847,8 @@ class SparkContext( partitions: Seq[Int], allowL

[GitHub] spark pull request: GRAPH-1: Map side distinct in collect vertex i...

2014-03-06 Thread rxin
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/21#issuecomment-36949585 We should use the primitive hashmap - otherwise it is pretty slow

[GitHub] spark pull request: [SPARK-1132] Persisting Web UI through refacto...

2014-03-06 Thread andrewor14
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/42#discussion_r10366355 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -978,6 +971,11 @@ class DAGScheduler( logDebug("Additional exec
