RE: [SparkStreaming] NPE in DStreamCheckPointData.scala:125

2015-06-17 Thread Haopu Wang
Can someone help? Thank you! From: Haopu Wang Sent: Monday, June 15, 2015 3:36 PM To: user; dev@spark.apache.org Subject: [SparkStreaming] NPE in DStreamCheckPointData.scala:125 I use the attached program to test checkpoint. It's quite simple. When I run t

[MLlib] Contributing algorithm for DP means clustering

2015-06-17 Thread Meethu Mathew
Hi all, At present, all the clustering algorithms in MLlib require the number of clusters to be specified in advance. The Dirichlet process (DP) is a popular non-parametric Bayesian mixture model that allows for flexible clustering of data without having to specify a priori the number of clusters.

[mllib] Refactoring some spark.mllib model classes in Python not inheriting JavaModelWrapper

2015-06-17 Thread Yu Ishikawa
Hi all, I think we should refactor some machine learning model classes in Python to improve software maintainability. By inheriting from the JavaModelWrapper class, we can easily and directly call the Scala API for the model without PythonMLlibAPI. In some cases, a machine learning model class in Python has co

RE: [SparkScore] Performance portal for Apache Spark

2015-06-17 Thread Duan, Jiangang
We are looking for more workloads – if you guys have any suggestions, let us know. -jiangang From: Sandy Ryza [mailto:sandy.r...@cloudera.com] Sent: Wednesday, June 17, 2015 5:51 PM To: Huang, Jie Cc: u...@spark.apache.org; dev@spark.apache.org Subject: Re: [SparkScore] Performance portal for Ap

Re: [SparkScore] Performance portal for Apache Spark

2015-06-17 Thread Sandy Ryza
This looks really awesome. On Tue, Jun 16, 2015 at 10:27 AM, Huang, Jie wrote: > Hi All > > We are happy to announce Performance portal for Apache Spark > http://01org.github.io/sparkscore/ ! > > The Performance Portal for Apache Spark provides performance data on the > Spark upstream to the com

Re: [sample code] deeplearning4j for Spark ML (@DeveloperAPI)

2015-06-17 Thread Xiangrui Meng
Hi Eron, Please register your Spark Package on http://spark-packages.org, which helps users find your work. Do you have some performance benchmark to share? Thanks! Best, Xiangrui On Wed, Jun 10, 2015 at 10:48 PM, Nick Pentreath wrote: > Looks very interesting, thanks for sharing this. > > I ha

Hive 0.12 support in 1.4.0 ?

2015-06-17 Thread Thomas Dudziak
So I'm a little confused: has Hive 0.12 support disappeared in 1.4.0? The release notes didn't mention anything, but the documentation doesn't list a way to build for 0.12 anymore ( http://spark.apache.org/docs/latest/building-spark.html#building-with-hive-and-jdbc-support, in fact it doesn't list

Re: [SparkR] Have we already had any lint for SparkR?

2015-06-17 Thread Yu Ishikawa
Hi Shivaram, Thank you for your reply and letting me know that. I will join the discussion on JIRA later. Thanks, Yu - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/SparkR-Have-we-already-had-any-lint-for-SparkR-tp12773p12775.ht

Re: [SparkR] Have we already had any lint for SparkR?

2015-06-17 Thread Shivaram Venkataraman
We do have a JIRA open for this at https://issues.apache.org/jira/browse/SPARK-6813 but I don't think anybody is actively working on it yet. FWIW I think https://github.com/jimhester/lintr looks more recently updated compared to google-rlint, but we can discuss this further on the JIRA. On Wed,

[SparkR] Have we already had any lint for SparkR?

2015-06-17 Thread Yu Ishikawa
Hi all, Do we already have any lint for R? I'm afraid I'm not familiar with the internals of SparkR yet. As you know, we have lint for Scala and Python to check that code, so I think we should also check R code automatically. google-rlint would be a nice fit. google-rlint - A program to lint the

Re: Welcoming some new committers

2015-06-17 Thread Chester Chen
Congratulations to all. DB and Sandy, great work! On Wed, Jun 17, 2015 at 3:12 PM, Matei Zaharia wrote: > Hey all, > > Over the past 1.5 months we added a number of new committers to the > project, and I wanted to welcome them now that all of their respective > forms, accounts, etc are in. J

Welcoming some new committers

2015-06-17 Thread Matei Zaharia
Hey all, Over the past 1.5 months we added a number of new committers to the project, and I wanted to welcome them now that all of their respective forms, accounts, etc are in. Join me in welcoming the following new committers: - Davies Liu - DB Tsai - Kousuke Saruta - Sandy Ryza - Yin Huai Lo

Re: Sidebar: issues targeted for 1.4.0

2015-06-17 Thread Heller, Chris
I appreciate targets having the strong meaning you suggest, as it's useful to get a sense of what will realistically be included in a release. Would it make sense (speaking as a relative outsider here) that we would not enter into the RC phase of a release until all JIRA targeting that release wer

Re: Sidebar: issues targeted for 1.4.0

2015-06-17 Thread Patrick Wendell
Hey Sean, Thanks for bringing this up - I went through and fixed about 10 of them. Unfortunately there isn't a hard and fast way to resolve them. I found all of the following: - Features that missed the release and needed to be retargeted to 1.5. - Bugs that missed the release and needed to be re

Re: Random Forest driver memory

2015-06-17 Thread Isca Harmatz
hello, can anyone help with this issue? Isca On Tue, Jun 16, 2015 at 7:45 AM, Isca Harmatz wrote: > hello, > > I have noticed that the random forest implementation crashes when > too many trees or too large a maxDepth is used. > > I'm guessing that this has something to do with the amount of node

Implementing and Using a Custom Actor-based Receiver

2015-06-17 Thread anshu shukla
Is there any good sample code in Java for *Implementing and Using a Custom Actor-based Receiver*? -- Thanks & Regards, Anshu Shukla