RE: Results of tests

2015-01-09 Thread Tony Reix
Hi Ted, Thanks for the info. However, I'm still unable to understand how the page https://amplab.cs.berkeley.edu/jenkins/view/Spark/job/Spark-1.2-Maven-with-YARN/lastSuccessfulBuild/HADOOP_PROFILE=hadoop-2.4,label=centos/testReport/ has been built. This page contains details I do not find in t

Re: missing document of several messages in actor-based receiver?

2015-01-09 Thread Tathagata Das
It was not really meant to be hidden. So it's essentially a case of the documentation being insufficient. This code has not gotten much attention for a while, so it could have bugs. If you find any and submit a fix for them, I am happy to take a look! TD On Thu, Jan 8, 2015 at 6:33 PM, Nan Zhu

Re: K-Means And Class Tags

2015-01-09 Thread Devl Devel
Hi Joseph, Thanks for the suggestion. However, retag is a private method, and when I call it in Scala: val retaggedInput = parsedData.retag(classOf[Vector]) I get: Symbol retag is inaccessible from this place However I can do this from Java, and it works in Scala: return words.rdd().retag(Vector.cl

Re: Results of tests

2015-01-09 Thread Sean Owen
Hey Tony, the number of tests run could vary depending on how the build is configured. For example, YARN-related tests would only run when the yarn profile is turned on. Java 8 tests would only run under Java 8. Although I don't know that there's any reason to believe the IBM JVM has a problem wit

Re: missing document of several messages in actor-based receiver?

2015-01-09 Thread Nan Zhu
Thanks, TD, I just created 2 JIRAs to track these, https://issues.apache.org/jira/browse/SPARK-5174 https://issues.apache.org/jira/browse/SPARK-5175 Can you help assign these two JIRAs to me? I’d like to submit the PRs. Best, -- Nan Zhu http://codingcat.me On Friday, Januar

Python to Java object conversion of numpy array

2015-01-09 Thread Meethu Mathew
Hi, I am trying to send a numpy array as an argument to a function predict() in a class in spark/python/pyspark/mllib/clustering.py which is passed to the function callMLlibFunc(name, *args) in spark/python/pyspark/mllib/common.py. Now the value is passed to the function _py2java(sc, obj).

Re: PR #3872

2015-01-09 Thread Michael Armbrust
I will look at it this weekend. On Thu, Jan 8, 2015 at 2:43 PM, Bill Bejeck wrote: > Could one of the admins take a look at PR 3872 (JIRA 3299) submitted on 1/1 >

Re: Python to Java object conversion of numpy array

2015-01-09 Thread Davies Liu
Hey Meethu, The Java API accepts only Vector, so you should convert the numpy array into pyspark.mllib.linalg.DenseVector. BTW, which class are you using? KMeansModel.predict() accepts numpy.array and will do the conversion for you. Davies On Fri, Jan 9, 2015 at 4:45 AM, Meethu Mathew wrote

Re: Results of tests

2015-01-09 Thread Ted Yu
For a build which uses JUnit, we would see a summary such as the following ( https://builds.apache.org/job/HBase-TRUNK/6007/console): Tests run: 2199, Failures: 0, Errors: 0, Skipped: 25 In https://amplab.cs.berkeley.edu/jenkins/view/Spark/job/Spark-1.2-Maven-with-YARN/lastSuccessfulBuild/HADOO
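
The summary line quoted in the message above has a fixed shape, so it can be pulled out of a console log mechanically. The following is a minimal sketch in Python, assuming the line looks exactly like the one quoted ("Tests run: ..., Failures: ..., Errors: ..., Skipped: ..."); the function name `parse_summary` is hypothetical and is not part of any Jenkins or Maven tooling.

```python
import re

# Pattern for a summary line of the form quoted in the message above.
# Assumption: the four counters always appear in this order.
SUMMARY_RE = re.compile(
    r"Tests run: (\d+), Failures: (\d+), Errors: (\d+), Skipped: (\d+)"
)


def parse_summary(line):
    """Return the four counters as a dict, or None if the line does not match."""
    match = SUMMARY_RE.search(line)
    if match is None:
        return None
    run, failures, errors, skipped = (int(group) for group in match.groups())
    return {"run": run, "failures": failures, "errors": errors, "skipped": skipped}


if __name__ == "__main__":
    print(parse_summary("Tests run: 2199, Failures: 0, Errors: 0, Skipped: 25"))
```

A console log may contain per-suite lines of the same shape as well as an overall total, so a caller would need to decide which matching line it wants rather than summing all of them.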

Re: Results of tests

2015-01-09 Thread Josh Rosen
The "Test Result" pages for Jenkins builds show some nice statistics for the test run, including individual test times: https://amplab.cs.berkeley.edu/jenkins/view/Spark/job/Spark-1.2-Maven-with-YARN/lastSuccessfulBuild/HADOOP_PROFILE=hadoop-2.4,label=centos/testReport/ Currently this only cover

Re: Results of tests

2015-01-09 Thread Nicholas Chammas
Just created: "Integrate Python unit tests into Jenkins" https://issues.apache.org/jira/browse/SPARK-5178 Nick On Fri Jan 09 2015 at 2:48:48 PM Josh Rosen wrote: > The "Test Result" pages for Jenkins builds show some nice statistics for > the test run, including individual test times: > > ht

Present/Future of monitoring spark jobs, "MetricsSystem" vs. Web UI, etc.

2015-01-09 Thread Ryan Williams
I've long wished the web UI gave me a better sense of how the metrics it reports are changing over time, so I was intrigued to stumble across the MetricsSystem

Re: Results of tests

2015-01-09 Thread Ted Yu
I noticed that org.apache.spark.sql.hive.execution has a lot of tests skipped. Is there a plan to enable these tests on Jenkins (so that there is no regression across releases)? Cheers On Fri, Jan 9, 2015 at 11:46 AM, Josh Rosen wrote: > The "Test Result" pages for Jenkins builds show some nic

Re-use scaling means and variances from StandardScalerModel

2015-01-09 Thread ogeagla
Hello, I would like to re-use the means and variances computed by the fit function in the StandardScaler, as I persist them and my use case requires consistent scaling of data based on some initial data set. The StandardScalerModel's constructor takes means and variances, but is private[mllib]. F

Re: missing document of several messages in actor-based receiver?

2015-01-09 Thread Nan Zhu
Hi, I have created the PR for these two issues Best, -- Nan Zhu http://codingcat.me On Friday, January 9, 2015 at 7:38 AM, Nan Zhu wrote: > Thanks, TD, > > I just created 2 JIRAs to track these, > > https://issues.apache.org/jira/browse/SPARK-5174 > > https://issues.apache.org

Re: Re-use scaling means and variances from StandardScalerModel

2015-01-09 Thread Xiangrui Meng
Feel free to create a JIRA for this issue. We might need to discuss what to put in the public constructors. In the meantime, you can use Java serialization to save/load the model: sc.parallelize(Seq(model), 1).saveAsObjectFile("/tmp/model") val model = sc.objectFile[StandardScalerModel]("/tmp/mod
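
The idea discussed in this thread (compute means and variances once, persist them, and apply the same scaling to later data) can be illustrated without Spark. The following is a minimal sketch in plain Python and is not the MLlib implementation: the functions `fit`, `transform`, `save` and `load` are hypothetical names, JSON is used for persistence purely for illustration, and the sample variance (dividing by n - 1) is an assumption.

```python
import json
import math


def fit(rows):
    """Compute per-column means and sample variances (needs at least 2 rows)."""
    n = len(rows)
    dims = len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(dims)]
    variances = [
        sum((row[j] - means[j]) ** 2 for row in rows) / (n - 1)
        for j in range(dims)
    ]
    return {"means": means, "variances": variances}


def transform(stats, row):
    """Center and scale one row using previously computed statistics."""
    scaled = []
    for value, mean, variance in zip(row, stats["means"], stats["variances"]):
        std = math.sqrt(variance)
        # Assumption: a zero-variance column is mapped to 0.0.
        scaled.append((value - mean) / std if std > 0 else 0.0)
    return scaled


def save(stats):
    """Serialize the statistics so they can be stored and reused later."""
    return json.dumps(stats)


def load(text):
    """Restore statistics produced by save()."""
    return json.loads(text)


if __name__ == "__main__":
    stats = fit([[1.0, 10.0], [3.0, 30.0]])
    restored = load(save(stats))
    print(transform(restored, [3.0, 30.0]))
```

Because the restored statistics are the same numbers that `fit` produced, new data is scaled consistently with the initial data set, which is the behaviour the original question asks for.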